Practical Guide

Statistical Methods for Human Simulation Data

Written by J Radler | Patient Analog
Last updated: January 2025

📊 WHY DATA ANALYSIS MATTERS

Rigorous data analysis transforms raw experimental outputs into actionable insights. Proper statistical methods ensure reproducibility, enable meaningful compound comparisons, and satisfy regulatory requirements for alternative method qualification. Poor analysis leads to false positives, failed replication, and regulatory rejection.

  • Z' > 0.5: excellent assay (screening quality threshold)
  • CV% < 15%: target replicate variability
  • R² > 0.9: minimum dose-response curve fit quality
  • n = 3: independent experiments (biological replicates)

PREREQUISITES

Required Knowledge

  • Basic statistics (mean, SD, SEM, CV%)
  • Hypothesis testing (t-test, ANOVA)
  • Non-linear regression concepts
  • Understanding of biological vs technical replicates
  • Quality control metrics (Z-factor)

Data Requirements

  • Raw data in machine-readable format (CSV, Excel)
  • Plate layouts with sample identifiers
  • Vehicle and positive control data
  • Metadata (dates, operators, lot numbers)
  • At least 3 biological replicates per condition

Software Tools

  • GraphPad Prism (dose-response curves)
  • R or Python (advanced analysis)
  • Excel (basic calculations)
  • Plate reader software (raw export)
  • LIMS or ELN (data management)

DATA TYPES IN MPS STUDIES

Continuous Data

Biomarker concentrations, TEER values, viability percentages, fluorescence intensity. Use parametric statistics (t-test, ANOVA) when normally distributed.

Examples: ALT (U/L), albumin (µg/mL), ATP (RLU)

Functional Readouts

Beat rate, conduction velocity, calcium transients, barrier permeability. Often time-series data requiring specialized analysis approaches.

Examples: BPM, cm/s, Papp values

Imaging Data

Cell counts, morphological measurements, fluorescent areas. Requires image processing and feature extraction before statistical analysis.

Examples: % live cells, area (µm²), intensity

Derived Metrics

IC50, EC50, therapeutic index, safety margins. Calculated from primary data using non-linear regression and propagated error estimation.

Examples: IC50 (µM), TI ratio, LOAEL

NORMALIZATION METHODS

Method | Formula | Use Case | Advantages
% of Vehicle Control | (Sample / Vehicle Mean) × 100 | Toxicity assays, viability | Simple, intuitive interpretation
% of Positive Control | (Sample - Neg) / (Pos - Neg) × 100 | Efficacy screens, inhibition assays | Accounts for assay window
Z-Score | (Sample - Mean) / SD | Hit identification, cross-plate comparison | Standardizes across experiments
Per-Cell Normalization | Value / Cell Number (or DNA) | Secreted biomarkers, variable cell density | Corrects for seeding differences
Fold Change | Treated / Baseline | Time-course studies, pre/post comparison | Intuitive, logarithm-friendly
Robust Z-Score (MAD) | (Sample - Median) / (1.4826 × MAD) | Screens with outliers | Resistant to extreme values
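Two of these normalizations can be sketched in a few lines of Python (function names are illustrative, assuming NumPy):

```python
import numpy as np

def pct_of_vehicle(samples, vehicle):
    """% of Vehicle Control: (Sample / Vehicle Mean) * 100, per the table above."""
    return np.asarray(samples, dtype=float) / np.mean(vehicle) * 100.0

def robust_z(values):
    """Robust Z-Score: (Sample - Median) / (1.4826 * MAD), resistant to outliers."""
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    # MAD = median absolute deviation from the median
    mad = np.median(np.abs(values - med))
    return (values - med) / (1.4826 * mad)
```

The 1.4826 factor scales the MAD to match the standard deviation of normally distributed data, so a robust Z-score is directly comparable to an ordinary Z-score when no outliers are present.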

DOSE-RESPONSE CURVE FITTING

4-Parameter Logistic Model (4PL)

Y = Bottom + (Top - Bottom) / (1 + 10^((LogIC50 - X) × HillSlope))

Bottom

Minimum response (plateau at high drug concentration). For inhibition curves, typically 0% or residual activity.

Top

Maximum response (plateau at low/no drug). For inhibition curves, typically 100% of vehicle control.

IC50/EC50

Concentration producing 50% of maximal effect. Report with 95% CI. Use log scale (LogIC50) for fitting.

Hill Slope

Steepness of curve. Hill = 1 suggests simple binding. Hill > 1 indicates cooperativity; < 1 suggests heterogeneity.

Best Practices for Curve Fitting

  • Use at least 6-8 concentrations spanning the full response range
  • Space concentrations logarithmically (e.g., 3-fold or half-log dilutions)
  • Include at least 2 concentrations on each plateau
  • Fit to individual replicates first to assess variability, then to means
  • Fix Top/Bottom only when biologically justified; prefer unconstrained fits
  • Report R² (aim for > 0.9) and confirm that residuals show no systematic pattern
  • Compare IC50 values using extra-sum-of-squares F-test, not overlapping CIs
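As a sketch of fitting this model with scipy.optimize.curve_fit (the data below are hypothetical; note that with this parameterization a decreasing inhibition curve takes a negative Hill slope, so report its magnitude):

```python
import numpy as np
from scipy.optimize import curve_fit

def four_pl(log_x, bottom, top, log_ic50, hill):
    """4PL: Y = Bottom + (Top - Bottom) / (1 + 10**((LogIC50 - X) * HillSlope))."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ic50 - log_x) * hill))

# Hypothetical inhibition data: log10 molar concentrations, % of vehicle control
log_conc = np.array([-9.0, -8.5, -8.0, -7.5, -7.0, -6.5, -6.0, -5.5])
response = np.array([99.0, 97.0, 90.0, 70.0, 45.0, 20.0, 8.0, 4.0])

# Unconstrained fit of all 4 parameters; p0 gives rough starting estimates
popt, pcov = curve_fit(four_pl, log_conc, response, p0=[0.0, 100.0, -7.0, -1.0])
bottom, top, log_ic50, hill = popt

# Fit quality: R-squared and residuals (check for systematic patterns)
residuals = response - four_pl(log_conc, *popt)
r_squared = 1.0 - np.sum(residuals**2) / np.sum((response - response.mean())**2)
ic50_molar = 10.0 ** log_ic50   # back-transform only for reporting
```

curve_fit also returns the covariance matrix, so standard errors on LogIC50 are available via np.sqrt(np.diag(pcov)); asymmetric profile-likelihood CIs, as reported by Prism, are generally preferable when the fit is poorly constrained.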

ASSAY QUALITY METRICS

Z' Factor (Zhang et al. 1999)

Z' = 1 - 3 × (SD_pos + SD_neg) / |Mean_pos - Mean_neg|
  • Z' > 0.5: Excellent assay
  • 0 < Z' < 0.5: Marginal, may work
  • Z' < 0: Assay not suitable for screening

Signal-to-Background (S/B)

S/B = Mean_signal / Mean_background
  • Minimum S/B = 3 for reliable detection
  • Higher S/B = better dynamic range
  • Report alongside Z' for full picture

Coefficient of Variation (CV%)

CV% = (SD / Mean) × 100
  • CV < 10%: Excellent reproducibility
  • CV 10-20%: Acceptable
  • CV > 20%: Needs optimization

Signal Window (SW)

SW = (Mean_pos - 3 × SD_pos) - (Mean_neg + 3 × SD_neg)
  • SW > 0 required for hit detection
  • Larger SW = more robust separation
  • Alternative to Z' when distributions differ
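All of these metrics can be computed in a few lines (a Python sketch with illustrative function names; sample SD with ddof=1 is assumed):

```python
import numpy as np

def z_prime(pos, neg):
    """Z' = 1 - 3*(SD_pos + SD_neg) / |Mean_pos - Mean_neg| (Zhang et al. 1999)."""
    pos, neg = np.asarray(pos, dtype=float), np.asarray(neg, dtype=float)
    return 1.0 - 3.0 * (pos.std(ddof=1) + neg.std(ddof=1)) / abs(pos.mean() - neg.mean())

def cv_percent(x):
    """CV% = (SD / Mean) * 100."""
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / x.mean() * 100.0

def signal_window(pos, neg):
    """SW = (Mean_pos - 3*SD_pos) - (Mean_neg + 3*SD_neg)."""
    pos, neg = np.asarray(pos, dtype=float), np.asarray(neg, dtype=float)
    return (pos.mean() - 3 * pos.std(ddof=1)) - (neg.mean() + 3 * neg.std(ddof=1))
```

Run these per plate on the vehicle and positive control wells and flag any plate where Z' < 0.5 or control CV% exceeds the target before analyzing sample wells.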

STATISTICAL TEST SELECTION

Comparison | Parametric Test | Non-Parametric Alternative | Post-hoc Test | When to Use
2 Groups | Student's t-test (unpaired) | Mann-Whitney U | N/A | Treated vs vehicle
2 Groups (paired) | Paired t-test | Wilcoxon signed-rank | N/A | Before/after on same samples
3+ Groups | One-way ANOVA | Kruskal-Wallis | Tukey, Dunnett, Bonferroni | Multiple doses vs vehicle
2 Factors | Two-way ANOVA | Friedman test | Šídák, Tukey | Dose × time, drug × genotype
IC50 Comparison | Extra-sum-of-squares F-test | N/A (use bootstrap) | N/A | Comparing potency between compounds

Note: Use Dunnett's test when comparing multiple groups to a single control. Use Tukey's HSD for all pairwise comparisons. Apply Bonferroni correction for multiple testing when appropriate.

DATA ANALYSIS WORKFLOW

1. Data Import & Validation

Export raw data from the plate reader. Verify that all wells are accounted for. Check for import errors (missing values, truncation). Match to the plate layout template.

2. Quality Control Check

Calculate Z' factor using vehicle and positive controls. Verify CV% of controls < 15%. Check for edge effects (plot plate heatmap). Flag plates failing QC criteria.

3. Background Subtraction

Subtract blank wells (media only) from all readings. For plate-based effects, use per-plate blanks. Document blank locations and values.

4. Outlier Assessment

Apply Grubbs' test or the ROUT method to identify statistical outliers. Document criteria prospectively. Never remove outliers without documented justification. Retain all raw data.
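For illustration, Grubbs' test can be written directly from the t-distribution critical value (a sketch assuming SciPy; it tests for a single outlier and is no substitute for prospectively documented criteria):

```python
import numpy as np
from scipy import stats

def grubbs_outlier(x, alpha=0.05):
    """Two-sided Grubbs' test for a single outlier.

    Returns (index, value) of the flagged point, or None if no outlier is
    detected at the given significance level.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Test statistic: largest absolute deviation in units of sample SD
    g = np.abs(x - x.mean()) / x.std(ddof=1)
    idx = int(np.argmax(g))
    # Critical value from the t-distribution with n - 2 degrees of freedom
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = ((n - 1) / np.sqrt(n)) * np.sqrt(t**2 / (n - 2 + t**2))
    return (idx, x[idx]) if g[idx] > g_crit else None
```

Grubbs' test assumes the remaining data are approximately normal and should be applied at most once per dataset; for repeated or screening-scale outlier detection, the robust Z-score described earlier is the safer default.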

5. Normalization

Express data as % of vehicle control (per plate). For cross-plate comparison, use Z-score or normalize to reference compound on each plate.

6. Curve Fitting (if dose-response)

Fit 4-parameter logistic model. Assess fit quality (R², residuals). Calculate IC50/EC50 with 95% CI. Compare curves using F-test if needed.

7. Statistical Testing

Apply appropriate tests (ANOVA, t-test). Correct for multiple comparisons. Report exact p-values and effect sizes. Use n = biological replicates for power.

8. Visualization & Reporting

Create publication-quality figures with all data points shown. Include error bars (SD or SEM with n stated). Document all analysis steps for audit trail.

💡 EXPERT TIPS

Biological vs Technical Replicates

Use n = independent experiments (different cell batches/days) for statistics, not replicate wells within a plate. Technical replicates show precision; biological replicates show reproducibility.

Log-Transform for IC50

Always fit dose-response on log scale. IC50 values are log-normally distributed. Report geometric mean and 95% CI. Compare IC50s as ratio (fold-change) not difference.
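Since IC50s are log-normally distributed, the geometric mean and its CI come from the log10 values (a sketch assuming SciPy; the function name is illustrative):

```python
import numpy as np
from scipy import stats

def geo_mean_ci(ic50s, conf=0.95):
    """Geometric mean and t-based CI of IC50 values, computed on log10 scale.

    Returns (geometric_mean, (ci_low, ci_high)) back-transformed to the
    original concentration units.
    """
    logs = np.log10(np.asarray(ic50s, dtype=float))
    n = len(logs)
    mean = logs.mean()
    sem = logs.std(ddof=1) / np.sqrt(n)
    half = stats.t.ppf(1 - (1 - conf) / 2, n - 1) * sem
    return 10**mean, (10**(mean - half), 10**(mean + half))
```

Because the interval is symmetric on the log scale, it is asymmetric (and always positive) after back-transformation, which matches how IC50 uncertainty actually behaves.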

Don't Confuse Significance with Relevance

With enough replicates, tiny differences become "significant." Always report effect size. A 5% change may be p < 0.01 but biologically meaningless. Define meaningful thresholds upfront.

Show All Data Points

Bar graphs hide distribution. Use scatter plots or box plots with individual data points. Reviewers increasingly require this. Shows outliers, distribution shape, and true n.

SOFTWARE TOOLS

Software | Best For | Key Features | Cost
GraphPad Prism | Dose-response curves, basic stats | 4PL fitting, IC50 comparison, publication figures | $250-500/yr
R (drc, ggplot2) | Advanced analysis, reproducible pipelines | Flexible modeling, scripted workflows, version control | Free
Python (scipy, pandas) | High-throughput, automation | Batch processing, ML integration, custom pipelines | Free
Genedata Screener | Industrial HTS | Plate normalization, QC, curve fitting, database | Enterprise
TIBCO Spotfire | Interactive visualization | Dashboards, exploratory analysis, linked views | Enterprise

TROUBLESHOOTING

Problem | Possible Causes | Solutions
Poor curve fit (low R²) | Insufficient concentration range, biphasic response, wrong model | Extend dose range, try a biphasic model, check for partial agonism
IC50 outside tested range | Dose range too narrow, compound inactive | Extend range; report as "> highest dose tested" or "< lowest dose"
High inter-plate variability | Plate effects, inconsistent seeding, reagent batch | Normalize per plate, include a reference compound on each plate, standardize protocol
Low Z' factor | High control variability, small signal window | Optimize assay conditions, increase control replicates, use a better positive control
Edge effects on plate | Evaporation, temperature gradients | Use perimeter wells for vehicle only, humidify incubator, apply an edge-correction algorithm
Wide confidence intervals | Low n, high variability, poor curve fit | Increase biological replicates (n ≥ 3), reduce technical variability, verify assay performance
Non-monotonic dose-response | U-shaped response, compound aggregation, assay interference | Check solubility, use DLS to detect aggregation, test for assay interference; a biological U-shape can be real
Hill slope deviates from 1 | Cooperativity, multiple binding sites, receptor reserve | Report the actual value (it may indicate mechanism); constrain to 1 only with justification
Results don't replicate | Cell batch variability, protocol drift, reagent lot | Include a reference compound as internal standard, track lot numbers, standardize all steps
Statistical test assumptions violated | Non-normal distribution, unequal variance | Log-transform data, use non-parametric tests, apply Welch's correction for unequal variance

FREQUENTLY ASKED QUESTIONS

Should I use SD or SEM for error bars?

SD (standard deviation) shows data spread/variability - use when you want to show the range of your data. SEM (standard error of mean) shows precision of the mean estimate - use when comparing means between groups. Always state which you're using and include n. For publication, showing individual data points + mean is often best.

What's the minimum n for regulatory submissions?

FDA typically expects n = 3 independent biological replicates (different passages, days, donors). IQ MPS consortium recommends n = 3 with demonstration of reproducibility across laboratories for qualification. More replicates improve confidence intervals and power. Report both number of independent experiments and technical replicates per experiment.

How do I compare IC50 values between compounds?

Don't compare overlapping confidence intervals - this is statistically incorrect. Use extra-sum-of-squares F-test in GraphPad Prism (compare: "Is one curve?"). This tests whether a single IC50 fits both datasets or whether they're significantly different. Report p-value and fold-difference in IC50.

When should I use non-parametric tests?

Use non-parametric tests when: (1) data is clearly non-normal (Shapiro-Wilk p < 0.05), (2) n < 10 per group (hard to assess normality), (3) data is ordinal or ranked, (4) significant outliers present. Common alternatives: Mann-Whitney U (for t-test), Kruskal-Wallis (for one-way ANOVA), Spearman (for Pearson correlation).

How do I handle values below detection limit?

Options: (1) Replace with LOD/2 or LOD/√2 - simple but biased; (2) Use censored-data methods (Tobit regression); (3) If many values fall below the LOD, consider the assay inappropriate for that range. Never substitute zero. Document your approach. For regulatory work, discuss with a statistician.

What's the difference between p-value and effect size?

p-value tells you probability of seeing your result if null hypothesis is true - it's affected by sample size. Effect size (Cohen's d, fold-change, R²) tells you magnitude of the difference - biologically meaningful. Always report both. A tiny difference can be "significant" with large n but meaningless. Define "biologically relevant" thresholds upfront.
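For two groups, Cohen's d is simply the mean difference divided by the pooled standard deviation; a minimal sketch:

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d with pooled SD: an effect size independent of sample size."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    # Pooled SD weights each group's variance by its degrees of freedom
    pooled = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                     / (na + nb - 2))
    return (a.mean() - b.mean()) / pooled
```

Cohen's conventional benchmarks (d ≈ 0.2 small, 0.5 medium, 0.8 large) are a starting point, but assay-specific relevance thresholds defined upfront are more meaningful.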

Should I constrain Top and Bottom in curve fitting?

Default: let software estimate all 4 parameters. Constrain Top=100% only if you have solid vehicle controls demonstrating 100% baseline. Constrain Bottom=0 only if you observe complete inhibition. Improper constraints bias IC50. When reporting, state whether constrained and why. Check that unconstrained estimates are biologically plausible.

How do I correct for multiple comparisons?

More tests = more false positives. Bonferroni (divide α by the number of tests) is conservative. Holm-Šídák is less conservative. For dose-response comparisons to vehicle, use Dunnett's test (designed for this). For all pairwise comparisons, use Tukey's HSD. State the correction method used. FDR (Benjamini-Hochberg) is popular for large screens.
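The Benjamini-Hochberg step-up procedure is short enough to sketch directly (an illustrative implementation; the multipletests function in statsmodels provides a maintained one):

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg FDR control: reject H0_(i) where p_(i) <= (i/m) * q.

    Returns a boolean mask of rejected hypotheses in the input order.
    """
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    # Rank-dependent thresholds for the sorted p-values
    thresholds = (np.arange(1, m + 1) / m) * q
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up rule: reject everything up to the largest passing rank
        k = int(np.max(np.nonzero(below)[0]))
        reject[order[: k + 1]] = True
    return reject
```

Unlike Bonferroni, which controls the chance of any false positive, BH controls the expected fraction of false positives among the hits, which is usually the right trade-off for screening campaigns.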

What data should I include in regulatory submissions?

Include: (1) Raw data in structured format; (2) Analysis methods with software version; (3) QC metrics (Z', CV%); (4) All individual values, not just means; (5) Outlier criteria defined prospectively; (6) Complete statistical output; (7) Metadata (dates, operators, lot numbers). Follow FAIR principles. Audit trail must be complete and uneditable.

How do I calculate therapeutic index from MPS data?

TI = TC50 / EC50, where TC50 is toxicity concentration (50% cell death) and EC50 is efficacy concentration (50% target effect). Higher TI = safer. Compare to clinical Cmax: Safety Margin = TC50 / Cmax. Report with propagated confidence intervals. Values are model-dependent; compare within same platform only.

RELATED CONTENT

  • Biomarker Selection (Guide): Choose the right endpoints for your study
  • Drug Testing Methods (Guide): Compound handling and experimental design
  • Quality Assurance (Guide): QC protocols and documentation standards
  • Regulatory Submission (Guide): Preparing MPS data for FDA/EMA
  • High-Throughput Screening (Science): Scaling MPS for drug discovery
  • FDA ISTAND (Regulatory): Data requirements for qualification

NEXT STEPS

  1. Establish QC criteria (Z' > 0.5, CV% < 15%) before starting experiments
  2. Create standardized templates for data import and plate layouts
  3. Validate normalization and analysis workflow with reference compounds
  4. Document all analysis procedures in SOPs for reproducibility
  5. Implement version control for analysis scripts (R/Python)
  6. Archive raw data with complete metadata for regulatory traceability

Implementation Pathway

Phase | Activities | Timeline
Planning | Define objectives, select platform | 1-2 months
Setup | Installation, training, protocols | 2-3 months
Validation | Testing, regulatory engagement | 6-12 months

Next Steps

  • MPS Technology: platform deep dive
  • Personalized Medicine: patient approaches
  • FDA ISTAND: submission pathways

Frequently Asked Questions

What data types do organ chips generate?

Chips produce continuous sensor data (oxygen, pH, impedance), time-lapse imaging (thousands of images), secreted biomarker measurements (ELISA, mass spec), omics data (RNA-seq, proteomics), and functional readouts (contraction force, barrier permeability, electrical activity).

How do you analyze time-series chip data?

Time-series analysis identifies trends over hours to weeks of culture, detects when toxicity begins, compares treatment versus control kinetics, and models dose-response relationships. Software fits curves to data, calculates area-under-curve, and performs statistical tests for differences.
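The area-under-curve step mentioned above amounts to the trapezoid rule; a minimal sketch (illustrative function name):

```python
import numpy as np

def auc_trapezoid(times, values):
    """Area under a time-series curve by the trapezoid rule.

    Written out explicitly rather than via np.trapz (renamed np.trapezoid
    in NumPy 2.0) so the sketch works across NumPy versions.
    """
    t = np.asarray(times, dtype=float)
    v = np.asarray(values, dtype=float)
    # Sum of trapezoid areas between consecutive time points
    return float(np.sum((v[1:] + v[:-1]) / 2.0 * np.diff(t)))
```

Computing one AUC per chip reduces a whole time course to a single number per replicate, which then feeds into the t-tests or ANOVA described earlier.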

What statistical methods apply to organ chip experiments?

Methods include ANOVA for comparing multiple conditions, t-tests for two-group comparisons, regression for dose-response, principal component analysis for multi-parameter data, and machine learning for pattern recognition. Biological replicates (3-6 chips per condition) enable statistical power.

Can machine learning analyze chip imaging data?

Yes. Deep learning models trained on chip images automatically segment cells, quantify morphology changes, track cell movement, identify dying cells, and predict toxicity. Convolutional neural networks extract thousands of image features humans cannot perceive.

How do you normalize organ chip data?

Normalization accounts for chip-to-chip variability. Methods include normalizing to internal controls (housekeeping gene expression), calculating fold-change relative to vehicle control, using spike-in standards, and Z-score transformation. Choice depends on data type and experimental design.

What is the minimum dataset for regulatory submission?

Regulatory submissions require experimental design documentation, quality control metrics, raw data with metadata, statistical analysis plans, validation against reference compounds (10-20 chemicals with known human outcomes), reproducibility across batches, and comparison to animal or clinical data.

How do you integrate multi-organ chip data?

Integration combines data from linked organs modeling systemic effects. Pharmacokinetic modeling tracks drug concentration through organs. Systems biology approaches integrate omics data revealing network-level responses. Computational models calibrated with chip data predict whole-body outcomes.

What software is available for chip data analysis?

Options include GraphPad Prism (statistics), ImageJ/CellProfiler (image analysis), R and Python (custom analysis), MATLAB (modeling), and specialized platforms such as OrganoAnalytics, Tissue Analytics, and instrument-specific software from chip manufacturers.

How do you manage large chip datasets?

Use data management systems tracking experimental metadata, store raw data in structured formats (CSV, HDF5), implement version control, follow FAIR principles (Findable, Accessible, Interoperable, Reusable), and archive data long-term for regulatory or publication requirements.

What is the future of chip data analysis?

Future includes AI automatically analyzing all data types, real-time analysis during experiments enabling adaptive protocols, integration with electronic lab notebooks, cloud platforms enabling multi-site data sharing, and standardized analysis pipelines facilitating regulatory acceptance.