Unit 5: Design and Analysis of Experiments (DoE)

Semester 8

BP801T

This unit focuses on advanced experimental planning, a critical component of Quality by Design (QbD) in modern pharmaceutical development. It introduces the mathematical principles of Factorial Design, specifically 2² and 2³ designs, which allow researchers to study multiple variables simultaneously. It also explores Response Surface Methodology (RSM) and Central Composite Design (CCD) as tools for modelling curved relationships between factors and responses and for estimating the optimal formulation or manufacturing conditions within the region studied.

Syllabus & Topics

1. Design of Experiments (DoE) – Introduction: DoE is a systematic, rigorous approach to problem-solving that applies statistical principles at the data collection stage so that the conclusions drawn are valid, defensible, and supportable. Traditional One-Factor-At-A-Time (OFAT) experiments (where one factor is changed while all others are held constant) are inefficient and costly and, critically, they cannot detect ‘Interactions’ between variables, where the effect of one factor depends on the level of another (e.g., the effect of Temperature on a response may differ at low and high Pressure). DoE addresses this by varying multiple factors simultaneously in a planned pattern.
2. Factorial Design (Definition and Concept): An experimental strategy where trials are conducted at all possible combinations of the selected levels of the factors under investigation. Factor: An independent variable (e.g., Concentration of Binder, Compression Force). Level: The specific values chosen for a factor (e.g., High level = 10%, Low level = 5%). Response: The dependent outcome variable being measured (e.g., Tablet Hardness, Disintegration Time). The primary goal is to determine the ‘Main Effect’ of each individual factor and the ‘Interaction Effects’ between two or more factors.
3. 2² and 2³ Factorial Designs: The 2^k designation indicates a design with ‘k’ factors, each tested at TWO levels (usually denoted as +1 for High, and -1 for Low). (1) 2² Design: 2 factors tested at 2 levels. Total experimental trials = 2 x 2 = 4 (represented as the 4 corners of a square). It evaluates 2 main effects (Factor A, Factor B) and 1 interaction effect (AxB). (2) 2³ Design: 3 factors tested at 2 levels. Total experimental trials = 2 x 2 x 2 = 8 (represented as the 8 corners of a cube). It evaluates 3 main effects (A, B, C), 3 two-way interactions (AB, AC, BC), and 1 three-way interaction (ABC).
4. Advantages of Factorial Design: (1) Uncovers Interactions: Factorial designs estimate interaction effects directly, which OFAT experiments cannot do (and variables frequently interact in complex pharmaceutical formulations). (2) Efficiency: Yields more information per experimental run than the OFAT method, because every run contributes to the estimate of every effect, saving significant time, API, and manufacturing resources. (3) Broad Applicability: The conclusions drawn are valid over a wider range of experimental conditions because the factors are varied simultaneously. (4) Foundation for Optimization: Identifies the ‘vital few’ significant factors, serving as the usual first step before moving to Response Surface optimization.
5. Response Surface Methodology (RSM): While 2-level factorial designs are excellent for SCREENING (identifying WHICH factors are important), they test each factor at only two levels, so the fitted model contains only linear and interaction terms and cannot estimate curvature. However, many pharmaceutical relationships are curved (quadratic). RSM is a collection of mathematical and statistical techniques based on fitting a polynomial equation (typically second-order) to the experimental data. It goes beyond screening to actual OPTIMIZATION: estimating the factor settings that yield the ‘best’ predicted response (e.g., maximum yield, target viscosity, desired dissolution profile) within the region studied.
6. Central Composite Design (CCD) & Optimization Techniques: To model curvature in RSM, we need to test factors at more than 2 levels (usually 3 or 5 levels). Central Composite Design (CCD) is one of the most widely used RSM designs. It builds upon a base factorial (e.g., the 8 points of a 2³ cube) by adding: (1) Center points (trials run at the exact middle value of all factors, used to test for curvature and to estimate experimental error) and (2) Axial/Star points (points placed on each factor axis at a coded distance ±α from the center, usually beyond the High/Low levels, so that the quadratic terms can be estimated). Optimization is commonly visualized with 3D Response Surface Plots (the ‘landscape’ of responses) and 2D Contour Plots (like an elevation map). Software then locates the factor coordinates that the fitted model predicts will give the desired ‘peak’ (maximum response) or ‘valley’ (minimum response).
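
As a rough illustration of how the factorial and CCD points described above fit together, the following Python sketch builds the coded design points for k factors. The function names, the default of 4 center replicates, and the choice of the rotatable value α = (2^k)^(1/4) are assumptions made for this example only; other α values (such as α = 1 for a face-centered design) are also used in practice.

```python
from itertools import product

def full_factorial(k):
    """All 2^k combinations of coded levels -1 (low) and +1 (high)."""
    return [tuple(float(x) for x in run) for run in product((-1, 1), repeat=k)]

def central_composite(k, n_center=4, alpha=None):
    """Coded CCD points for k factors, returned as three separate lists.

    alpha defaults to the rotatable value (2^k)^(1/4); n_center=4 is an
    illustrative choice, not a recommendation.
    """
    if alpha is None:
        alpha = (2 ** k) ** 0.25
    factorial = full_factorial(k)
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k          # all other factors stay at the center
            point[i] = sign * alpha    # one factor pushed out to +/- alpha
            axial.append(tuple(point))
    center = [tuple([0.0] * k)] * n_center
    return factorial, axial, center

factorial, axial, center = central_composite(3)
print(len(factorial), len(axial), len(center))  # 8 6 4

# Distinct coded levels seen by the first factor across the whole design
levels = sorted({round(p[0], 3) for p in factorial + axial + center})
print(levels)  # [-1.682, -1.0, 0.0, 1.0, 1.682]
```

For three factors this gives 8 cube points, 6 axial points, and the chosen number of center replicates, and each factor ends up tested at five coded levels.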

Learning Objectives

Understand Factorial Principles: Differentiate between a Factor, a Level, and a Response in experimental design.

Contrast with Traditional Methods: Explain why a 2² factorial design is statistically superior to One-Factor-At-A-Time (OFAT) experimentation.

Calculate Trials: Determine the number of experimental runs and the specific main/interaction effects evaluated in a 2³ factorial design.

Purpose of RSM: Explain why Response Surface Methodology is necessary for true optimization, going beyond simple screening factorial designs.

Describe CCD: Understand the structural components (factorial points, center points, axial points) that make up a Central Composite Design.

Exam Prep Questions

Q1. What is an “Interaction Effect” in a formulation?

An interaction effect occurs when the effect of one variable depends on the level of another variable. For example, increasing Compression Force might increase tablet hardness when the Binder Concentration is LOW, whereas the exact same increase in Compression Force might DECREASE tablet hardness when the Binder Concentration is HIGH (perhaps by causing the tablet to cap or break). OFAT testing, which varies Compression Force at only one Binder Concentration, cannot reveal this relationship; a factorial design estimates it directly.
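
The scenario in this answer can be put into numbers with a small sketch. The hardness values below are invented purely for illustration, and the `effect` helper is a hypothetical name; each effect is computed as the mean response at the +1 level of a contrast minus the mean at the -1 level.

```python
# Hypothetical tablet hardness (N) for a 2x2 factorial; values are invented.
# A = Compression Force, B = Binder Concentration, coded -1 (low) / +1 (high).
runs = [
    (-1, -1, 40.0),
    (+1, -1, 60.0),  # at low binder, more force raises hardness
    (-1, +1, 70.0),
    (+1, +1, 50.0),  # at high binder, more force lowers hardness
]

def effect(contrast):
    """Mean response where the contrast is +1 minus mean where it is -1."""
    high = [y for a, b, y in runs if contrast(a, b) == 1]
    low = [y for a, b, y in runs if contrast(a, b) == -1]
    return sum(high) / len(high) - sum(low) / len(low)

main_a = effect(lambda a, b: a)        # main effect of Compression Force
main_b = effect(lambda a, b: b)        # main effect of Binder Concentration
inter_ab = effect(lambda a, b: a * b)  # A x B interaction
print(main_a, main_b, inter_ab)  # 0.0 10.0 -20.0
```

With these made-up numbers the main effect of Compression Force averages out to zero even though force clearly matters at each binder level; only the interaction term exposes this. An OFAT study starting from the low/low run would never test the high/high combination at all.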

Q2. If I have 5 factors to test in a formulation, a full 2⁵ factorial design requires 32 experiments. What if I can only afford to run 16?

This is a common industrial problem. Researchers use a “Fractional Factorial Design”. Instead of running the full 32 trials, you run a carefully selected half-fraction (2⁵⁻¹ = 16 trials). The tradeoff is “Confounding” (as discussed in Unit 4): each effect is estimated together with an alias and cannot be separated from it. In the standard half-fraction, each of the 5 Main Effects is aliased only with a four-factor interaction (e.g., A with BCDE), and each two-factor interaction is aliased with a three-factor interaction. Because higher-order interactions are usually negligible, the Main Effects can still be estimated reliably. This is usually an acceptable compromise for initial screening.
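
A minimal sketch of one such half-fraction, assuming the commonly used generator E = ABCD (the variable and function names are illustrative): a full 2⁴ design is run in factors A to D, and the level of E in each run is set to the product of the other four. Comparing contrast columns then shows which effects are confounded.

```python
from itertools import product
from math import prod

# Full 2^4 design in A-D; the fifth column is generated as E = A*B*C*D.
design = [(a, b, c, d, a * b * c * d)
          for a, b, c, d in product((-1, 1), repeat=4)]

def column(indices):
    """Contrast column for the effect formed by the factors at `indices`."""
    return [prod(run[i] for i in indices) for run in design]

A, B, C, D, E = range(5)
print(len(design))                          # 16
print(column([A]) == column([B, C, D, E]))  # True: A is aliased with BCDE
print(column([A, B]) == column([C, D, E]))  # True: AB is aliased with CDE
print(column([A]) == column([B]))           # False: main effects stay distinct
```

Identical columns mean the two effects cannot be told apart from these 16 runs, which is exactly what confounding refers to.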

Q3. Why do we add “Center Points” to a Central Composite Design (CCD)?

Center points are experimental runs where all factors are set exactly halfway between their High (+1) and Low (−1) levels (coded as 0). They serve two critical purposes:

(1) They allow the statistical software to test for “curvature”, that is, to check whether the relationship between the factors and the response departs from a straight line.
(2) Because center points are usually replicated (e.g., running the exact same center point 4 times), they provide an estimate of pure experimental “noise” or error, against which the factor effects can be judged.
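
Both purposes can be sketched with a single-degree-of-freedom curvature check. The response values below are invented for illustration and the variable names are assumptions. The curvature sum of squares is n_F·n_C·(ȳ_F − ȳ_C)² / (n_F + n_C), where the subscripts F and C refer to factorial and center points, and it is compared against the pure-error variance of the center replicates.

```python
from statistics import mean, variance

# Hypothetical responses (invented): 4 factorial corners of a 2x2 design
# and 4 replicated center-point runs.
y_factorial = [40.0, 60.0, 70.0, 50.0]
y_center = [62.0, 64.0, 63.0, 63.0]

n_f, n_c = len(y_factorial), len(y_center)

# Purpose 1: if the surface were planar, the center mean would be close to
# the mean of the corners; the difference measures curvature.
curvature = mean(y_factorial) - mean(y_center)
ss_curvature = n_f * n_c * curvature ** 2 / (n_f + n_c)

# Purpose 2: replicated center runs differ only by experimental noise,
# so their sample variance estimates pure error (n_c - 1 degrees of freedom).
ms_pure_error = variance(y_center)

f_stat = ss_curvature / ms_pure_error
print(curvature, round(f_stat, 1))  # -8.0 192.0
```

The F statistic would then be compared with a tabulated critical value for 1 and n_C − 1 degrees of freedom; a large value, as with these made-up numbers, suggests that a purely linear model is inadequate and that a response surface design is worth running.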