📊 WHY DATA ANALYSIS MATTERS
Rigorous data analysis transforms raw experimental outputs into actionable insights. Proper statistical methods ensure reproducibility, enable meaningful compound comparisons, and satisfy regulatory requirements for alternative method qualification. Poor analysis leads to false positives, failed replication, and regulatory rejection.
PREREQUISITES
Required Knowledge
- Basic statistics (mean, SD, SEM, CV%)
- Hypothesis testing (t-test, ANOVA)
- Non-linear regression concepts
- Understanding of biological vs technical replicates
- Quality control metrics (Z-factor)
Data Requirements
- Raw data in machine-readable format (CSV, Excel)
- Plate layouts with sample identifiers
- Vehicle and positive control data
- Metadata (dates, operators, lot numbers)
- At least 3 biological replicates per condition
Software Tools
- GraphPad Prism (dose-response curves)
- R or Python (advanced analysis)
- Excel (basic calculations)
- Plate reader software (raw export)
- LIMS or ELN (data management)
DATA TYPES IN MPS STUDIES
Continuous Data
Biomarker concentrations, TEER values, viability percentages, fluorescence intensity. Use parametric statistics (t-test, ANOVA) when normally distributed.
Functional Readouts
Beat rate, conduction velocity, calcium transients, barrier permeability. Often time-series data requiring specialized analysis approaches.
Imaging Data
Cell counts, morphological measurements, fluorescent areas. Requires image processing and feature extraction before statistical analysis.
Derived Metrics
IC50, EC50, therapeutic index, safety margins. Calculated from primary data using non-linear regression and propagated error estimation.
NORMALIZATION METHODS
| Method | Formula | Use Case | Advantages |
|---|---|---|---|
| % of Vehicle Control | (Sample / Vehicle Mean) × 100 | Toxicity assays, viability | Simple, intuitive interpretation |
| % of Positive Control | (Sample - Neg) / (Pos - Neg) × 100 | Efficacy screens, inhibition assays | Accounts for assay window |
| Z-Score | (Sample - Mean) / SD | Hit identification, cross-plate comparison | Standardizes across experiments |
| Per-Cell Normalization | Value / Cell Number (or DNA) | Secreted biomarkers, variable cell density | Corrects for seeding differences |
| Fold Change | Treated / Baseline | Time-course studies, pre/post comparison | Intuitive, logarithm-friendly |
| Robust Z-Score (MAD) | (Sample - Median) / (1.4826 × MAD) | Screens with outliers | Resistant to extreme values |
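The normalization formulas above can be sketched in a few lines of NumPy. The function names and the sample values here are illustrative only; the constant 1.4826 scales the MAD to match the SD for normally distributed data, as in the table.

```python
import numpy as np

def pct_of_vehicle(values, vehicle):
    """Express each sample as % of the vehicle-control mean."""
    return values / np.mean(vehicle) * 100.0

def pct_of_positive(values, neg, pos):
    """Normalize to the assay window defined by negative and positive controls."""
    return (values - np.mean(neg)) / (np.mean(pos) - np.mean(neg)) * 100.0

def robust_z(values):
    """Robust Z-score: median/MAD, resistant to outliers."""
    values = np.asarray(values, float)
    med = np.median(values)
    mad = np.median(np.abs(values - med))
    return (values - med) / (1.4826 * mad)

raw = np.array([95.0, 100.0, 105.0, 50.0])
vehicle = np.array([98.0, 100.0, 102.0])
print(pct_of_vehicle(raw, vehicle))
```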
DOSE-RESPONSE CURVE FITTING
4-Parameter Logistic Model (4PL)
Y = Bottom + (Top - Bottom) / (1 + 10^((LogIC50 - X) × HillSlope))
Bottom
Minimum response (plateau at high drug concentration). For inhibition curves, typically 0% or residual activity.
Top
Maximum response (plateau at low/no drug). For inhibition curves, typically 100% of vehicle control.
IC50/EC50
Concentration producing 50% of maximal effect. Report with 95% CI. Use log scale (LogIC50) for fitting.
Hill Slope
Steepness of curve. Hill = 1 suggests simple binding. Hill > 1 indicates cooperativity; < 1 suggests heterogeneity.
Best Practices for Curve Fitting
- Use at least 6-8 concentrations spanning the full response range
- Space concentrations logarithmically (e.g., 3-fold or half-log dilutions)
- Include at least 2 concentrations on each plateau
- Fit to individual replicates first to assess variability, then to means
- Fix Top/Bottom only when biologically justified; prefer unconstrained fits
- Report R² (aim for > 0.9) and confirm that residuals show no systematic pattern
- Compare IC50 values using extra-sum-of-squares F-test, not overlapping CIs
ASSAY QUALITY METRICS
Z' Factor (Zhang et al. 1999)
Z' = 1 - (3 × SD_pos + 3 × SD_neg) / |Mean_pos - Mean_neg|
- Z' > 0.5: Excellent assay
- 0 < Z' < 0.5: Marginal, may work
- Z' < 0: Assay not suitable for screening
Signal-to-Background (S/B)
S/B = Mean_signal / Mean_background
- Minimum S/B = 3 for reliable detection
- Higher S/B = better dynamic range
- Report alongside Z' for full picture
Coefficient of Variation (CV%)
CV% = (SD / Mean) × 100
- CV < 10%: Excellent reproducibility
- CV 10-20%: Acceptable
- CV > 20%: Needs optimization
Signal Window (SW)
SW = (Mean_pos - 3 × SD_pos) - (Mean_neg + 3 × SD_neg)
- SW > 0 required for hit detection
- Larger SW = more robust separation
- Alternative to Z' when distributions differ
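All four metrics above can be computed from the same control wells. This is a minimal sketch with made-up control values; it uses the sample SD (ddof=1), which is the usual choice for plate QC.

```python
import numpy as np

def assay_qc(pos, neg):
    """Z', S/B, CV%, and signal window from positive/negative control wells."""
    pos, neg = np.asarray(pos, float), np.asarray(neg, float)
    mp, mn = pos.mean(), neg.mean()
    sp, sn = pos.std(ddof=1), neg.std(ddof=1)
    return {
        "z_prime": 1.0 - 3.0 * (sp + sn) / abs(mp - mn),
        "s_b": mp / mn,
        "cv_pos": sp / mp * 100.0,
        "cv_neg": sn / mn * 100.0,
        "signal_window": (mp - 3.0 * sp) - (mn + 3.0 * sn),
    }

# Hypothetical control wells from one plate
qc = assay_qc(pos=[95, 100, 105, 98, 102], neg=[8, 10, 12, 9, 11])
print(qc)
```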
STATISTICAL TEST SELECTION
| Comparison | Parametric Test | Non-Parametric Alternative | Post-hoc Test | When to Use |
|---|---|---|---|---|
| 2 Groups | Student's t-test (unpaired) | Mann-Whitney U | N/A | Treated vs vehicle |
| 2 Groups (paired) | Paired t-test | Wilcoxon signed-rank | N/A | Before/after on same samples |
| 3+ Groups | One-way ANOVA | Kruskal-Wallis | Tukey, Dunnett, Bonferroni | Multiple doses vs vehicle |
| 2 Factors | Two-way ANOVA | Friedman test | Šídák, Tukey | Dose × time, drug × genotype |
| IC50 Comparison | Extra-sum-of-squares F-test | N/A (use bootstrap) | N/A | Comparing potency between compounds |
Note: Use Dunnett's test when comparing multiple groups to a single control. Use Tukey's HSD for all pairwise comparisons. Apply Bonferroni correction for multiple testing when appropriate.
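A minimal sketch of the multiple-dose-vs-vehicle case with SciPy and synthetic data. For simplicity it uses pairwise t-tests with a manual Bonferroni correction; `scipy.stats.dunnett` (available in SciPy ≥ 1.11) is the purpose-built test for comparing several groups to one control and is preferable when available.

```python
import numpy as np
from scipy import stats

# Hypothetical viability data, n = 6 biological replicates per group
rng = np.random.default_rng(1)
vehicle = rng.normal(100, 8, 6)
dose_lo = rng.normal(95, 8, 6)
dose_hi = rng.normal(60, 8, 6)

# Omnibus one-way ANOVA across all groups
f_stat, p_anova = stats.f_oneway(vehicle, dose_lo, dose_hi)

# Each dose vs vehicle, Bonferroni-corrected for the 2 comparisons
p_adj = [min(stats.ttest_ind(d, vehicle).pvalue * 2, 1.0)
         for d in (dose_lo, dose_hi)]
print(f"ANOVA p = {p_anova:.3g}, adjusted p vs vehicle = {p_adj}")
```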
DATA ANALYSIS WORKFLOW
Data Import & Validation
Export raw data from plate reader. Verify all wells accounted for. Check for import errors (missing values, truncation). Match to plate layout template.
Quality Control Check
Calculate Z' factor using vehicle and positive controls. Verify CV% of controls < 15%. Check for edge effects (plot plate heatmap). Flag plates failing QC criteria.
Background Subtraction
Subtract blank wells (media only) from all readings. For plate-based effects, use per-plate blanks. Document blank locations and values.
Outlier Assessment
Apply Grubbs' test or ROUT method to identify statistical outliers. Document criteria prospectively. Never remove outliers without documented justification. Retain all raw data.
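Grubbs' test for a single outlier is short enough to implement directly; this sketch uses the standard critical-value formula based on the t-distribution. The data values are hypothetical.

```python
import numpy as np
from scipy import stats

def grubbs_test(x, alpha=0.05):
    """Two-sided Grubbs' test for a single outlier.

    Returns (index of most extreme value, G statistic, critical value,
    is_outlier). Apply prospectively and document; never silently delete.
    """
    x = np.asarray(x, float)
    n = x.size
    g = np.abs(x - x.mean()) / x.std(ddof=1)
    idx = int(np.argmax(g))
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t**2 / (n - 2 + t**2))
    return idx, g[idx], g_crit, g[idx] > g_crit

idx, g, g_crit, flagged = grubbs_test([9.8, 10.1, 10.0, 9.9, 10.2, 14.5])
print(idx, flagged)
```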
Normalization
Express data as % of vehicle control (per plate). For cross-plate comparison, use Z-score or normalize to reference compound on each plate.
Curve Fitting (if dose-response)
Fit 4-parameter logistic model. Assess fit quality (R², residuals). Calculate IC50/EC50 with 95% CI. Compare curves using F-test if needed.
Statistical Testing
Apply appropriate tests (ANOVA, t-test). Correct for multiple comparisons. Report exact p-values and effect sizes. Use n = biological replicates for power.
Visualization & Reporting
Create publication-quality figures with all data points shown. Include error bars (SD or SEM with n stated). Document all analysis steps for audit trail.
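The normalization step of this workflow is a natural fit for a pandas groupby; this sketch normalizes each well to its own plate's vehicle mean, using a hypothetical long-format export (column names are illustrative).

```python
import pandas as pd

# Hypothetical long-format plate-reader export: one row per well
df = pd.DataFrame({
    "plate": ["P1"] * 4 + ["P2"] * 4,
    "well_type": ["vehicle", "vehicle", "sample", "sample"] * 2,
    "signal": [100.0, 102.0, 55.0, 50.0, 120.0, 118.0, 66.0, 60.0],
})

# Per-plate vehicle mean, joined back onto every well of that plate
veh_mean = (df[df.well_type == "vehicle"]
            .groupby("plate")["signal"].mean()
            .rename("veh_mean"))
df = df.join(veh_mean, on="plate")
df["pct_vehicle"] = df.signal / df.veh_mean * 100
print(df[["plate", "well_type", "pct_vehicle"]])
```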
💡 EXPERT TIPS
Biological vs Technical Replicates
Use n = independent experiments (different cell batches/days) for statistics, not replicate wells within a plate. Technical replicates show precision; biological replicates show reproducibility.
Log-Transform for IC50
Always fit dose-response on log scale. IC50 values are log-normally distributed. Report geometric mean and 95% CI. Compare IC50s as ratio (fold-change) not difference.
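A sketch of the geometric-mean-and-CI calculation: average on the log scale, compute the 95% CI there, then back-transform. The IC50 values are hypothetical; note the back-transformed CI is asymmetric on the linear scale, which is expected.

```python
import numpy as np
from scipy import stats

# Hypothetical IC50s (uM) from n = 4 independent experiments
ic50 = np.array([0.8, 1.2, 1.0, 1.5])
log_ic50 = np.log10(ic50)

geo_mean = 10 ** log_ic50.mean()
sem = log_ic50.std(ddof=1) / np.sqrt(log_ic50.size)
t_crit = stats.t.ppf(0.975, log_ic50.size - 1)
ci = 10 ** (log_ic50.mean() + np.array([-1, 1]) * t_crit * sem)
print(f"geometric mean IC50 = {geo_mean:.2f} uM, "
      f"95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```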
Don't Confuse Significance with Relevance
With enough replicates, tiny differences become "significant." Always report effect size. A 5% change may be p < 0.01 but biologically meaningless. Define meaningful thresholds upfront.
Show All Data Points
Bar graphs hide distribution. Use scatter plots or box plots with individual data points. Reviewers increasingly require this. Shows outliers, distribution shape, and true n.
SOFTWARE TOOLS
| Software | Best For | Key Features | Cost |
|---|---|---|---|
| GraphPad Prism | Dose-response curves, basic stats | 4PL fitting, IC50 comparison, publication figures | $250-500/yr |
| R (drc, ggplot2) | Advanced analysis, reproducible pipelines | Flexible modeling, scripted workflows, version control | Free |
| Python (scipy, pandas) | High-throughput, automation | Batch processing, ML integration, custom pipelines | Free |
| Genedata Screener | Industrial HTS | Plate normalization, QC, curve fitting, database | Enterprise |
| TIBCO Spotfire | Interactive visualization | Dashboards, exploratory analysis, linked views | Enterprise |
TROUBLESHOOTING
| Problem | Possible Causes | Solutions |
|---|---|---|
| Poor curve fit (low R²) | Insufficient concentration range, biphasic response, wrong model | Extend dose range, try biphasic model, check for partial agonism |
| IC50 outside tested range | Dose range too narrow, compound inactive | Extend range, report as "> highest dose tested" or "< lowest dose" |
| High inter-plate variability | Plate effects, inconsistent seeding, reagent batch | Normalize per-plate, include reference compound on each plate, standardize protocol |
| Low Z' factor | High control variability, small signal window | Optimize assay conditions, increase control replicates, use better positive control |
| Edge effects on plate | Evaporation, temperature gradients | Use perimeter wells for vehicle only, humidify incubator, apply edge correction algorithm |
| Wide confidence intervals | Low n, high variability, poor curve fit | Increase biological replicates (to at least n = 3), reduce technical variability, verify assay performance |
| Non-monotonic dose-response | U-shaped response, compound aggregation, assay interference | Check solubility, use DLS for aggregation, test for assay interference; a biological U-shape (hormesis) can be real |
| Hill slope deviates from 1 | Cooperativity, multiple binding sites, receptor reserve | Report actual value, may indicate mechanism. Constrain to 1 only with justification |
| Results don't replicate | Cell batch variability, protocol drift, reagent lot | Include reference compound as internal standard, track lot numbers, standardize all steps |
| Statistical test assumptions violated | Non-normal distribution, unequal variance | Log-transform data, use non-parametric tests, Welch's correction for unequal variance |
FREQUENTLY ASKED QUESTIONS
Should I use SD or SEM for error bars?
SD (standard deviation) shows data spread/variability - use when you want to show the range of your data. SEM (standard error of mean) shows precision of the mean estimate - use when comparing means between groups. Always state which you're using and include n. For publication, showing individual data points + mean is often best.
What's the minimum n for regulatory submissions?
FDA typically expects n = 3 independent biological replicates (different passages, days, donors). IQ MPS consortium recommends n = 3 with demonstration of reproducibility across laboratories for qualification. More replicates improve confidence intervals and power. Report both number of independent experiments and technical replicates per experiment.
How do I compare IC50 values between compounds?
Don't compare overlapping confidence intervals - this is statistically incorrect. Use extra-sum-of-squares F-test in GraphPad Prism (compare: "Is one curve?"). This tests whether a single IC50 fits both datasets or whether they're significantly different. Report p-value and fold-difference in IC50.
When should I use non-parametric tests?
Use non-parametric tests when: (1) data is clearly non-normal (Shapiro-Wilk p < 0.05), (2) n < 10 per group (hard to assess normality), (3) data is ordinal or ranked, (4) significant outliers present. Common alternatives: Mann-Whitney U (for t-test), Kruskal-Wallis (for one-way ANOVA), Spearman (for Pearson correlation).
How do I handle values below detection limit?
Options: (1) Replace with LOD/2 or LOD/√2 - simple but biased; (2) Use censored data methods (Tobit regression); (3) If many values below LOD, consider the assay inappropriate for that range. Never use zero. Document your approach. For regulatory, discuss with statistician.
What's the difference between p-value and effect size?
p-value tells you probability of seeing your result if null hypothesis is true - it's affected by sample size. Effect size (Cohen's d, fold-change, R²) tells you magnitude of the difference - biologically meaningful. Always report both. A tiny difference can be "significant" with large n but meaningless. Define "biologically relevant" thresholds upfront.
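Cohen's d (one of the effect sizes named above) is straightforward to compute with a pooled SD; this sketch uses hypothetical viability values for a treated and a vehicle group.

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d with pooled SD: effect-size magnitude, independent of n."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    na, nb = a.size, b.size
    pooled = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                     / (na + nb - 2))
    return (a.mean() - b.mean()) / pooled

d = cohens_d([60, 62, 58, 61], [100, 98, 102, 99])
print(f"d = {d:.2f}")
```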
Should I constrain Top and Bottom in curve fitting?
Default: let software estimate all 4 parameters. Constrain Top=100% only if you have solid vehicle controls demonstrating 100% baseline. Constrain Bottom=0 only if you observe complete inhibition. Improper constraints bias IC50. When reporting, state whether constrained and why. Check that unconstrained estimates are biologically plausible.
How do I correct for multiple comparisons?
More tests = more false positives. Bonferroni (divide α by number of tests) is conservative. Holm-Šídák is less conservative. For dose-response comparing to vehicle, use Dunnett's test (designed for this). For all pairwise, use Tukey's HSD. State correction method used. FDR (Benjamini-Hochberg) is popular for large screens.
What data should I include in regulatory submissions?
Include: (1) Raw data in structured format; (2) Analysis methods with software version; (3) QC metrics (Z', CV%); (4) All individual values, not just means; (5) Outlier criteria defined prospectively; (6) Complete statistical output; (7) Metadata (dates, operators, lot numbers). Follow FAIR principles. Audit trail must be complete and uneditable.
How do I calculate therapeutic index from MPS data?
TI = TC50 / EC50, where TC50 is toxicity concentration (50% cell death) and EC50 is efficacy concentration (50% target effect). Higher TI = safer. Compare to clinical Cmax: Safety Margin = TC50 / Cmax. Report with propagated confidence intervals. Values are model-dependent; compare within same platform only.
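The arithmetic is simple; the point of this sketch is mainly the bookkeeping. All values are hypothetical and, as noted above, only comparable within the same platform.

```python
# Hypothetical values from a single MPS platform (TI is model-dependent)
tc50 = 30.0   # uM, concentration causing 50% cell death
ec50 = 0.5    # uM, concentration producing 50% target effect
cmax = 2.0    # uM, clinical peak plasma concentration

ti = tc50 / ec50             # therapeutic index
safety_margin = tc50 / cmax  # margin relative to clinical exposure
print(f"TI = {ti:.0f}, safety margin vs Cmax = {safety_margin:.0f}")
```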
NEXT STEPS
- Establish QC criteria (Z' > 0.5, CV% < 15%) before starting experiments
- Create standardized templates for data import and plate layouts
- Validate normalization and analysis workflow with reference compounds
- Document all analysis procedures in SOPs for reproducibility
- Implement version control for analysis scripts (R/Python)
- Archive raw data with complete metadata for regulatory traceability