Workshop Overview: From DoE Data to Decisions

Now that you've designed your experiments and collected data, it's time for the most crucial step: statistical analysis. This workshop will teach you how to extract meaningful insights from your DoE data and validate your models.

Why Statistical Analysis is Critical in DoE
Step 1: Model Building - We need to understand which factors actually affect our response
Step 2: Model Validation - We must ensure our model is reliable and not just fitting noise
Step 3: Optimization - We want to predict optimal conditions with confidence
Step 4: Decision Making - We need statistical evidence to support our formulation choices

Learning Objectives: By the end of this workshop, you'll know how to perform ANOVA on DoE data, check model adequacy through residual analysis, and validate your models for reliable predictions.

Section 1: ANOVA for DoE Analysis

Model Adequacy Checking

Let's Think Step-by-Step: What Makes a Good Model?
Step 1: Significant Effects - The model should include factors that actually matter (p < 0.05)
Step 2: Good Fit - The model should explain a substantial portion of variability (high R²)
Step 3: Valid Assumptions - Residuals should be normally distributed with constant variance
Step 4: No Lack of Fit - The model should capture the true relationship, not miss important patterns
Pharmaceutical Example: Tablet Hardness DoE Analysis

Scenario: We conducted a 2³ factorial design to study tablet hardness. Factors: Compression Force (A), Binder Concentration (B), and Disintegrant Level (C).

Run   A: Force (kN)   B: Binder (%)   C: Disintegrant (%)   Hardness (N)
1     -1 (15)         -1 (2)          -1 (1)                45
2     +1 (25)         -1 (2)          -1 (1)                78
3     -1 (15)         +1 (4)          -1 (1)                52
4     +1 (25)         +1 (4)          -1 (1)                89
5     -1 (15)         -1 (2)          +1 (3)                41
6     +1 (25)         -1 (2)          +1 (3)                71
7     -1 (15)         +1 (4)          +1 (3)                47
8     +1 (25)         +1 (4)          +1 (3)                83
Excel Implementation: ANOVA for DoE
Step 1: Data tab → Data Analysis → Regression (requires the Analysis ToolPak add-in)
Input Y Range: C2:C9 (Hardness values)
Input X Range: D2:F9 (Coded factor levels)
Output: ANOVA table with F-statistics and p-values
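
If you prefer scripting, the same fit can be reproduced outside Excel. Below is a minimal Python sketch (Python and numpy are assumptions on my part; the workshop itself is Excel-based) that fits the main-effects model to the hardness data from the table above:

    import numpy as np

    # Coded factor levels (A, B, C) and hardness responses for runs 1-8
    X = np.array([[-1, -1, -1],
                  [ 1, -1, -1],
                  [-1,  1, -1],
                  [ 1,  1, -1],
                  [-1, -1,  1],
                  [ 1, -1,  1],
                  [-1,  1,  1],
                  [ 1,  1,  1]], dtype=float)
    y = np.array([45, 78, 52, 89, 41, 71, 47, 83], dtype=float)

    # Least-squares fit of the main-effects model: y = b0 + b1*A + b2*B + b3*C
    X1 = np.column_stack([np.ones(len(y)), X])      # add intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    print("Coefficients (b0, A, B, C):", np.round(beta, 2))

    # In a two-level factorial, each main effect = 2 x its coded coefficient
    print("Main effects:", np.round(2 * beta[1:], 2))

Because the factors are coded ±1, the printed effects can be checked against a hand calculation: the effect of A, for example, is the mean hardness at high force minus the mean at low force.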

Lack-of-Fit Test

Understanding Lack-of-Fit: What Are We Testing?
Concept: Lack-of-Fit occurs when our model doesn't capture the true relationship between factors and response
Test Logic: If we have replicate runs (multiple observations at the same factor combinations), we can separate "pure error" from "lack-of-fit error"
Pure Error: Variability between replicates at the same conditions (measurement error)
Lack-of-Fit: Additional error beyond pure error (model inadequacy)
F_LOF = MS(lack-of-fit) / MS(pure error)
⚠️ Critical Decision Rule
If F_LOF > F_critical (p < 0.05): Significant lack-of-fit detected!
Action Required: Add higher-order terms, transform variables, or consider different model forms.
Example: Dissolution Rate Study with Replicates

Scenario: Central Composite Design for dissolution optimization, with 3 replicates at the center point and at one factorial (corner) point.

Condition      Replicate 1   Replicate 2   Replicate 3   Mean    Pure-Error SD
Center Point   85.2%         87.1%         86.3%         86.2%   0.95
Corner Point   78.5%         79.2%         78.9%         78.9%   0.33

Pure Error Calculation: pool the squared deviations within each replicate group, then divide by the pooled degrees of freedom, Σ(nᵢ - 1), to obtain MS(pure error). The last column reports each group's within-replicate standard deviation.
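
As a cross-check, here is a minimal Python sketch (numpy assumed) of the pure-error calculation for the two replicate groups above; the lack-of-fit F-ratio then needs only the residual sum of squares from your fitted model:

    import numpy as np

    center = np.array([85.2, 87.1, 86.3])   # center-point replicates (%)
    corner = np.array([78.5, 79.2, 78.9])   # corner-point replicates (%)

    # Pure error: squared deviations within each replicate group, pooled
    ss_pe = sum(((g - g.mean()) ** 2).sum() for g in (center, corner))
    df_pe = sum(len(g) - 1 for g in (center, corner))
    ms_pe = ss_pe / df_pe
    print(f"SS_pure = {ss_pe:.3f}, df = {df_pe}, MS_pure = {ms_pe:.3f}")

    # With SS_residual and df_residual from your fitted model:
    #   ss_lof = ss_residual - ss_pe;  df_lof = df_residual - df_pe
    #   f_lof  = (ss_lof / df_lof) / ms_pe   -> compare to F_critical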

R² and Adjusted R² Interpretation

Step-by-Step: Understanding Model Fit Measures
R² Formula: R² = 1 - (SS_residual / SS_total)
Interpretation: Percentage of variability explained by the model (0-100%)
Problem: R² always increases when adding terms, even if they're not significant
Solution: Adjusted R² = 1 - [(SS_residual / (n - p - 1)) / (SS_total / (n - 1))], where p = number of model terms (excluding the intercept)
Benefit: Adjusted R² penalizes for adding unnecessary terms
Example: Content Uniformity Model Comparison
Model            Terms                   R²      Adjusted R²   Decision
Linear Only      A, B, C                 0.789   0.731         Good baseline
+ Interactions   A, B, C, AB, AC, BC     0.892   0.834         Improvement!
+ Quadratic      All + A², B², C²        0.903   0.821         Overfitting?

Decision Logic: Choose the model with the highest adjusted R² that includes only significant terms.

Excel R² Calculation
R² Formula: =1-SUMXMY2(actual_range,predicted_range)/DEVSQ(actual_range)
Or simply: Look at "R Square" value in Regression Analysis output
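
The same two quantities are easy to script. Here is a minimal Python sketch mirroring the Excel formulas (numpy assumed; the actual/predicted arrays are placeholders for your own model output):

    import numpy as np

    def r_squared(actual, predicted):
        ss_res = np.sum((actual - predicted) ** 2)        # like SUMXMY2
        ss_tot = np.sum((actual - np.mean(actual)) ** 2)  # like DEVSQ
        return 1 - ss_res / ss_tot

    def adjusted_r_squared(actual, predicted, p):
        # p = number of model terms, excluding the intercept
        n = len(actual)
        r2 = r_squared(actual, predicted)
        return 1 - (1 - r2) * (n - 1) / (n - p - 1)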

Section 2: Residual Analysis - Model Diagnostics

Normal Probability Plots

Why Check Normality of Residuals?
Assumption: ANOVA assumes residuals follow a normal distribution
Consequence: If residuals aren't normal, our p-values and confidence intervals are unreliable
Test Method: Plot residuals vs. normal scores (quantiles)
Interpretation: Points should fall approximately on a straight line
Example: Capsule Fill Weight Variability Study

Problem: Fill weight data showing potential non-normality

Observation   Predicted   Actual   Residual   Normal Score
1             248.5       251.2      2.7       0.97
2             252.1       249.8     -2.3      -1.64
3             250.3       250.1     -0.2      -0.97
4             249.7       250.4      0.7      -0.43
5             251.8       253.1      1.3       0.00
6             253.2       254.7      1.5       0.43
7             247.9       252.1      4.2       1.64

Note: each normal score is assigned by the residual's rank (smallest residual pairs with the most negative score), per Steps 2-3 below.

Analysis: Plot residuals vs. normal scores. If points deviate from straight line, consider data transformation.

Excel Normal Probability Plot
Step 1: Calculate residuals = Actual - Predicted
Step 2: Sort residuals in ascending order
Step 3: Calculate normal scores = NORM.S.INV((RANK-0.5)/COUNT)
Step 4: Create scatter plot: X-axis = Normal Scores, Y-axis = Residuals
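
The same four steps in Python, using the seven residuals from the fill-weight table (numpy, scipy, and matplotlib are assumptions; note that the sorted residuals, not the raw ones, are paired with the normal scores):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import norm

    residuals = np.array([2.7, -2.3, -0.2, 0.7, 1.3, 1.5, 4.2])  # Step 1 output

    sorted_res = np.sort(residuals)                  # Step 2
    n = len(sorted_res)
    ranks = np.arange(1, n + 1)
    scores = norm.ppf((ranks - 0.5) / n)             # Step 3: NORM.S.INV equivalent

    plt.scatter(scores, sorted_res)                  # Step 4
    plt.xlabel("Normal score")
    plt.ylabel("Residual")
    plt.title("Normal probability plot of residuals")
    plt.show()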

Residuals vs Fitted Values Plot

What This Plot Reveals About Your Model
Good Pattern: Random scatter around zero line (no patterns)
Bad Pattern 1: Funnel shape = heteroscedasticity (non-constant variance)
Bad Pattern 2: Curved pattern = non-linearity (need quadratic terms)
Bad Pattern 3: Outliers = unusual observations requiring investigation

Example: API Content Analysis Issues

Scenario: DoE study on API content uniformity showing heteroscedasticity

⚠️ Problem Identified
The residuals-vs-fitted plot shows a funnel pattern: variance increases with API content level.
Solution: Consider a square-root transformation of the response variable.

Before Transformation: Residual variance from 0.1 to 2.4 (24-fold increase)

After √(API Content) Transformation: Residual variance from 0.05 to 0.12 (2.4-fold increase)
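
To see why the transformation works, here is an illustrative Python sketch with synthetic, Poisson-like data (numpy assumed; these are not the study's actual measurements): variance grows with the mean before the transform and is roughly constant afterwards.

    import numpy as np

    rng = np.random.default_rng(0)
    low = rng.poisson(lam=5, size=200).astype(float)     # low-level responses
    high = rng.poisson(lam=120, size=200).astype(float)  # high-level responses

    # Variance scales with the mean before the transform (~24-fold here) ...
    print("Variance ratio, raw:", round(high.var() / low.var(), 1))
    # ... and is roughly constant after the square-root transform
    print("Variance ratio, sqrt:", round(np.sqrt(high).var() / np.sqrt(low).var(), 2))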

Cook's Distance - Influential Point Detection

Understanding Influential Observations
Concept: Some data points have disproportionate influence on the regression line
Cook's Distance: Measures how much the fitted values change when we remove one observation
Threshold: D > 4/n (where n = number of observations) suggests influential point
Action: Investigate high Cook's D points - are they outliers or valid extreme conditions?
Cook's D_i = (r_i² / p) × (h_ii / (1 - h_ii))
where r_i = standardized residual, h_ii = leverage, p = number of model parameters (including the intercept)
Example: Tablet Compression Study Outlier
Run   Force (kN)   Hardness (N)   Residual   Cook's D   Status
1     15           45              -2.1      0.08       Normal
2     20           58               1.3      0.03       Normal
3     25           71              -0.8      0.01       Normal
4     30           45             -25.2      0.89       ⚠️ Influential

Investigation Result: Run 4 was compressed with a defective punch; after this was verified, removing it from the analysis was legitimate.

Leverage - High-Impact Factor Combinations

What is Leverage in DoE Context?
Definition: Leverage measures how far a point is from the center of the factor space
High Leverage: Corner points in factorial designs, star points in CCD
Implication: High leverage points strongly influence the fitted model
Threshold: h > 2p/n (where p = parameters, n = observations)
Excel Leverage Calculation
Note: Excel's Regression tool reports residuals, but it does not output leverage values directly
Manual calculation: h_ii = diagonal elements of the hat matrix H = X(XᵀX)⁻¹Xᵀ (built with TRANSPOSE, MMULT, and MINVERSE array formulas)
Check: the leverages sum to the number of parameters (p)
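
For completeness, here is a minimal Python sketch (numpy assumed) that computes leverages and Cook's distances from the hat matrix for any model matrix X1 that includes an intercept column:

    import numpy as np

    def influence_measures(X1, y):
        # X1: model matrix with intercept column; y: response vector
        H = X1 @ np.linalg.inv(X1.T @ X1) @ X1.T   # hat matrix
        h = np.diag(H)                             # leverages h_ii (sum to p)
        n, p = X1.shape
        resid = y - H @ y
        mse = (resid ** 2).sum() / (n - p)
        r = resid / np.sqrt(mse * (1 - h))         # standardized residuals
        cooks_d = (r ** 2 / p) * (h / (1 - h))
        return h, cooks_d

    # Rules of thumb: flag h_ii > 2p/n (leverage) and D_i > 4/n (influence)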

Section 3: Model Validation - Ensuring Reliability

Cross-Validation Techniques

Why Validate? The Problem of Overfitting
Problem: Models can fit the training data perfectly but predict poorly on new data
Solution: Test model performance on data it hasn't seen before
K-Fold CV: Split data into k groups, train on k-1, test on 1, repeat k times
Leave-One-Out: Special case where k = n (sample size)
Example: Dissolution Model Cross-Validation

Dataset: 16 runs from Central Composite Design

Method: 4-fold cross-validation (4 groups of 4 runs each)

Fold   Training Runs   Test Runs   R² (Training)   R² (Test)
1      5-16            1-4         0.94            0.89
2      1-4, 9-16       5-8         0.93            0.91
3      1-8, 13-16      9-12        0.95            0.87
4      1-12            13-16       0.94            0.92

Average Cross-Validation R²: 0.90 (vs. 0.94 on full data)

Conclusion: Model shows good predictive ability with minimal overfitting.
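
A minimal Python sketch of the k-fold procedure for a linear DoE model (numpy assumed; X1 is the model matrix with an intercept column, y the response vector; fold assignment here is random rather than the run grouping used above):

    import numpy as np

    def kfold_r2(X1, y, k=4, seed=0):
        idx = np.random.default_rng(seed).permutation(len(y))
        scores = []
        for test in np.array_split(idx, k):
            train = np.setdiff1d(idx, test)
            beta, *_ = np.linalg.lstsq(X1[train], y[train], rcond=None)
            pred = X1[test] @ beta
            ss_res = ((y[test] - pred) ** 2).sum()
            ss_tot = ((y[test] - y[test].mean()) ** 2).sum()
            scores.append(1 - ss_res / ss_tot)     # R² on the held-out fold
        return float(np.mean(scores))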

Confirmation Runs - The Ultimate Test

Planning Effective Confirmation Experiments
Purpose: Test model predictions with completely new experiments
Design: Choose factor combinations different from original design points
Replication: Run multiple replicates to assess prediction accuracy
Success Criteria: Observed values fall within prediction intervals
Example: Tablet Formulation Confirmation Study

Original DoE: Optimized binder and disintegrant levels for hardness and disintegration time

Optimal Conditions Found: 3.2% binder, 2.1% disintegrant

Response               Predicted   95% PI Lower   95% PI Upper   Observed (n=6)   Validation
Hardness (N)           75.2        71.8           78.6           74.1 ± 2.1       ✅ Within PI
Disintegration (min)   12.5        10.9           14.1           13.2 ± 0.8       ✅ Within PI
Friability (%)         0.65        0.52           0.78           0.71 ± 0.05      ✅ Within PI

Conclusion: All responses fall within prediction intervals - model validated!

Prediction Intervals vs Confidence Intervals

Critical Distinction for Pharmaceutical Applications
Confidence Interval: Uncertainty about the true mean response at given conditions
Prediction Interval: Uncertainty about a future individual observation
Key Point: Prediction intervals are always wider than confidence intervals
Pharmaceutical Use: Use prediction intervals for specification setting and batch release
Prediction Interval = ŷ ± t(α/2, df) × √(MSE × (1 + h_xx))
where h_xx = leverage at the prediction point
Excel Prediction Interval Calculation
Upper bound: =prediction + T.INV.2T(alpha, df) * SQRT(MSE * (1 + leverage))
Lower bound: =prediction - T.INV.2T(alpha, df) * SQRT(MSE * (1 + leverage))
MSE: Mean Squared Error from ANOVA table
Leverage: Calculate for new factor combination using design matrix
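
The same calculation as a minimal Python sketch (scipy assumed; mse, df, and leverage would come from your own fitted model and ANOVA table, and the example call uses hypothetical values):

    from scipy import stats

    def prediction_interval(y_hat, mse, df, leverage, alpha=0.05):
        t_crit = stats.t.ppf(1 - alpha / 2, df)    # two-sided critical t value
        half_width = t_crit * (mse * (1 + leverage)) ** 0.5
        return y_hat - half_width, y_hat + half_width

    # Hypothetical call: prediction_interval(75.2, mse=2.0, df=10, leverage=0.3)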

Process Capability Indices

From DoE Model to Process Control
Cp: Potential capability = (USL - LSL) / (6σ)
Cpk: Actual capability = min[(USL - μ)/(3σ), (μ - LSL)/(3σ)]
DoE Application: Use model residual standard deviation as estimate of σ
Target: Cpk ≥ 1.33 for pharmaceutical processes
Example: Content Uniformity Process Capability

Specifications: 95.0 - 105.0% of label claim

DoE Model Residual SD: σ = 1.2% (from optimized conditions)

Target Mean: μ = 100.0%

Calculation:
Cp = (105.0 - 95.0) / (6 × 1.2) = 10.0 / 7.2 = 1.39
Cpk, upper side: (105.0 - 100.0) / (3 × 1.2) = 1.39
Cpk, lower side: (100.0 - 95.0) / (3 × 1.2) = 1.39
Cpk = min(1.39, 1.39) = 1.39
✅ Process Assessment
Cpk = 1.39 > 1.33: Process is capable!
Expected Defect Rate: < 64 ppm (parts per million)
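
The capability numbers and the defect-rate estimate can be reproduced with a minimal Python sketch (scipy assumed for the normal tail probabilities; inputs are the example values above):

    from scipy.stats import norm

    usl, lsl, mu, sigma = 105.0, 95.0, 100.0, 1.2

    cp = (usl - lsl) / (6 * sigma)
    cpk = min((usl - mu) / (3 * sigma), (mu - lsl) / (3 * sigma))

    # Expected out-of-spec fraction under the normal model, in ppm
    ppm = (norm.cdf((lsl - mu) / sigma) + norm.sf((usl - mu) / sigma)) * 1e6
    print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}, expected defects = {ppm:.0f} ppm")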

Workshop Summary & Key Takeaways

Essential Statistical Analysis Checklist
  • ANOVA First: Check significance of effects, overall model fit (R²), and lack-of-fit
  • Residual Analysis: Verify normality, constant variance, and identify outliers
  • Influence Diagnostics: Use Cook's distance and leverage to find influential points
  • Model Validation: Perform cross-validation and confirmation runs
  • Practical Application: Calculate prediction intervals and process capability
⚠️ Common Analysis Mistakes to Avoid
  • Using R² instead of adjusted R² for model comparison
  • Ignoring residual patterns indicating model inadequacy
  • Confusing confidence intervals with prediction intervals
  • Removing outliers without proper investigation
  • Over-interpreting models beyond the experimental region

🎯 Ready for the Capstone Project?

You now have all the statistical tools needed to analyze DoE data effectively. In the next part, you'll apply everything you've learned to a complete pharmaceutical formulation optimization project.