I am currently designing modeling software at work. As part of that, I had to review some concepts relating to Design of Experiments (DOE). It can be a potent tool when carrying out real-life experiments in the lab.

I just don't remember any of the experiments I did as an undergrad. Maybe they were just single-run experiments and hence did not require a DOE as such. When I look back, I really feel that the experiments, and the education I received around them, were far removed from real-world scenarios. Maybe the surveying I learnt in engineering was the only thing that came anywhere close to making one understand the real difficulties of experimentation and of drawing conclusions.

The next time I came across DOE was during a training session, when I was working in the Quality department of an MNC. However, I never got a chance to carry out a DOE in a real-life application. Today, when I review the fundas, the concepts are definitely robust and take into consideration a lot of aspects. One thing I am hungry for is to work on a real-life experiment and use the concepts of DOE.

Anyway, here are some basic aspects of DOE:

Experiment
Model a continuous dependent variable Y which depends on a set of continuous and discrete variables (the X's). Uncontrolled factors such as the machine used, the day of the week, etc. can have an effect on the experimental outcome. Hence blocking designs are used so that the blocking factor has minimal impact on the experimental outcome.

  • Y is continuous

  • Objective is to fit a linear or quadratic form of equation between Y and X’s.

  • Linear Model generates main effects and interaction effects graphs

  • Quadratic form generates response surface graphs which can be used to study the dependence

  • Common Designs

    • Comparative Designs

      • Completely Randomized Designs

      • Blocked Randomized Designs

    • Screening Designs

      • Full Factorial

      • Fractional Factorial

      • Plackett-Burman designs

    • Advanced Modeling

      • Response Surface Modeling

      • Regression Modeling
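
To make the "linear or quadratic form" above concrete, here is how the two model forms are commonly written for two factors. The symbols (the beta coefficients and the error term) are my own notation, not something fixed by the notes above.

```latex
% Linear model with main effects and a two-factor interaction
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_{12} X_1 X_2 + \varepsilon

% Quadratic (second-order) model, the form behind response surface graphs
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_{12} X_1 X_2
    + \beta_{11} X_1^2 + \beta_{22} X_2^2 + \varepsilon
```

The squared terms are what allow the fitted surface to curve, which is why a design needs more than two levels per factor to estimate them.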

Main Steps in a DOE

  1. Set objectives
  2. Select process variables
  3. Select an experimental design
  4. Execute the design
  5. Check that the data are consistent with the experimental assumptions
  6. Analyze and interpret the results
  7. Use/present the results (may lead to further runs or DOE’s)

Step 1: Objectives

The first step is to decide on the type of DOE. Is it going to be a sequential / iterative DOE? It is not necessary to complete the experiment in one go. The iterative approach is the one most widely followed.

Assumptions in DOE:

  • Are the measurement systems capable for all of your responses ?

    • Measurement systems in place for Y
  • Is your process stable? 

    • Use run charts to gauge this
  • Are your responses likely to be approximated well by simple polynomial models?

    • If not, it's useful to break up the experiment into separate experiments
  • Are the residuals (the difference between the model predictions and the actual observations) well behaved?

    • Residuals need to be (roughly) normal and (approximately) independently distributed with a mean of 0 and some constant variance.
  • Graphical checks on the residuals (normality, independence, constant variance)

    • Histograms

    • Normal Probability plots

    • Independence of residuals over time

    • Independence of Residuals from Factor Settings

    • Plot of Residuals Versus Corresponding Predicted Values
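
As a rough illustration of the residual checks listed above, here is a minimal sketch that fits a straight line by least squares and looks at the residuals. The data values and the function names (`fit_line`, `residuals`) are made up for illustration. A real analysis would also draw the histogram and normal probability plot.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def residuals(x, y):
    """Observed value minus model prediction for every run."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Hypothetical data: factor setting x and measured response y, in run order.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

res = residuals(x, y)
print("residuals in run order:", [round(r, 3) for r in res])
print("mean residual:", round(mean(res), 10))
```

For a least squares fit with an intercept, the mean residual is zero up to rounding. What matters is whether the residuals show a trend over run order, or a funnel shape against the predicted values.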

Step 2: Select Process Variables

Process variables include both inputs and outputs - i.e., factors and responses. The selection of these variables is best done as a team effort. The team should

  • Include all important factors (based on engineering judgment).
  • Be bold, but not foolish, in choosing the low and high factor levels.
  • Check the factor settings for impractical or impossible combinations, e.g., very low pressure combined with very high gas flows.
  • Include all relevant responses.
  • Avoid using only responses that combine two or more measurements of the process. For example, if interested in selectivity (the ratio of two etch rates), measure both rates, not just the ratio.
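
The check for impractical combinations can be done mechanically before any run is scheduled. This is a small sketch under my own assumptions: the factor names and the rule in `is_impractical` are hypothetical stand-ins for whatever the team's engineering judgment says.

```python
from itertools import product

# Hypothetical factors and levels, chosen only for illustration.
levels = {"pressure": ["low", "high"], "gas_flow": ["low", "high"]}

def is_impractical(run):
    """Assumed rule: low pressure cannot sustain a high gas flow."""
    return run["pressure"] == "low" and run["gas_flow"] == "high"

# Enumerate every combination of factor levels, then flag the bad ones.
runs = [dict(zip(levels, combo)) for combo in product(*levels.values())]
flagged = [run for run in runs if is_impractical(run)]

for run in flagged:
    print("check this setting:", run)
```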

Step 3: Select Experimental Design

There is a whole host of designs that one can choose from. The choice depends on a number of aspects, such as the number of factors and levels, and budget and time constraints.

  • Completely randomized designs

  • Randomized block designs

    • Latin squares

    • Graeco-Latin squares

    • Hyper-Graeco-Latin squares

  • Full factorial designs

    • Two-level full factorial designs

    • Full factorial example

    • Blocking of full factorial designs

  • Fractional factorial designs

    • A 2^(3-1) half-fraction design

    • How to construct a 2^(3-1) design

    • Confounding

    • Design resolution

    • Use of fractional factorial designs

    • Screening designs

    • Fractional factorial designs summary tables

  • Plackett-Burman designs

  • Response surface (second-order) designs

    • Central composite designs

    • Box-Behnken designs

    • Response surface design comparisons

    • Blocking a response surface design

  • Adding center points

  • Improving fractional design resolution

    • Mirror-image foldover designs

    • Alternative foldover designs

  • Three-level full factorial designs

  • Three-level, mixed level and fractional factorial designs

Description:

Completely randomized designs
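As the name suggests, the order of the runs is completely randomized. For a design with one primary factor, the total number of runs is N = k * L * n, where k is the number of factors (1 here), L is the number of levels and n is the number of replications.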
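
Here is a minimal sketch of what randomizing the runs looks like for one primary factor. The function name and the level labels are my own. The seed is only there so that the run sheet can be reproduced.

```python
import random

def completely_randomized_runs(levels, replications, seed=None):
    """Return every level repeated `replications` times, in random order."""
    runs = [level for level in levels for _ in range(replications)]
    rng = random.Random(seed)
    rng.shuffle(runs)
    return runs

# One factor at 3 levels with 4 replications gives 1 * 3 * 4 = 12 runs.
run_order = completely_randomized_runs(["A", "B", "C"], replications=4, seed=1)
print(len(run_order))
print(run_order)
```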

Randomized block designs
When some nuisance factor affects the experiment, one can use a randomized block design to overcome its interference. There is a single factor of primary interest, typically called the treatment factor, and several nuisance factors. For Latin square designs there are 2 nuisance factors, for Graeco-Latin square designs there are 3 nuisance factors, and for Hyper-Graeco-Latin square designs there are 4 nuisance factors.
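
To see how a Latin square handles two nuisance factors, here is a small sketch that builds one by cyclic shifting. Rows and columns stand for the two nuisance factors and the letters are the treatments. This is just one simple construction. In practice the rows, columns and treatment labels would also be randomized.

```python
def latin_square(treatments):
    """Cyclic Latin square: each treatment appears once per row and column."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)]
            for row in range(n)]

square = latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
```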

Full factorial designs
If there are k factors, each at 2 levels, a full factorial design has 2^k runs. In practice the design is usually run with replication, a randomized run order and, often, added center points.
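
A two-level full factorial is easy to enumerate. The sketch below uses the usual -1 / +1 coding for the low and high levels. The function name is my own.

```python
from itertools import product

def full_factorial(k):
    """All 2**k combinations of k two-level factors, coded -1 and +1."""
    return [list(run) for run in product((-1, 1), repeat=k)]

design = full_factorial(3)
print(len(design))  # 8 runs for k = 3
for run in design:
    print(run)
```

The runs come out in a systematic order, so the order should still be randomized before execution.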

Fractional factorial designs
Choose a fraction (for example one half or one quarter) and run only that fraction of the total number of runs in the full factorial. Deciding which runs to include is of prime importance. Whatever strategy one chooses in the definition phase, it is better to choose a design which is balanced and orthogonal.
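
As an illustration, here is a sketch of the 2^(3-1) half fraction, built by taking a full factorial in A and B and setting C = A * B. That generator is a common choice, but it is my assumption here and not the only option. The two helper functions check the "balanced" and "orthogonal" properties mentioned above.

```python
from itertools import product

def half_fraction_2_3():
    """4-run half fraction of a 2^3 design, using the generator C = A * B."""
    return [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

def is_balanced(design):
    """Each factor is run equally often at -1 and at +1."""
    return all(sum(column) == 0 for column in zip(*design))

def is_orthogonal(design):
    """Every pair of factor columns has a zero dot product."""
    columns = list(zip(*design))
    return all(
        sum(x * y for x, y in zip(columns[i], columns[j])) == 0
        for i in range(len(columns))
        for j in range(i + 1, len(columns))
    )

design = half_fraction_2_3()
for run in design:
    print(run)
print(is_balanced(design), is_orthogonal(design))  # True True
```

The price of halving the runs is confounding. Because C was generated as A * B, the main effect of C cannot be separated from the A-B interaction in this design.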

Plackett-Burman designs
A screening design for situations where only the main effects are of interest.
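
For a feel of how economical these designs are, here is a sketch of the 12-run Plackett-Burman design, which can screen up to 11 factors. It uses the standard cyclic construction: shift a generator row 11 times, then add a final row of all -1. The variable and function names are my own.

```python
# Generator row for the 12-run Plackett-Burman design (+1 = high, -1 = low).
PB12_GENERATOR = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]

def plackett_burman_12():
    """12 runs x 11 factors: cyclic shifts of the generator plus an all-low run."""
    n = len(PB12_GENERATOR)
    rows = [PB12_GENERATOR[-shift:] + PB12_GENERATOR[:-shift]
            for shift in range(n)]
    rows.append([-1] * n)
    return rows

design = plackett_burman_12()
for run in design:
    print(" ".join("+" if value == 1 else "-" for value in run))
```

Twelve runs for eleven factors only works because interactions are ignored, so this is a design for finding the few important factors, not for building the final model.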

Step 4: Analysis of DOE Data

  • Look at the data. Examine it for outliers, typos and obvious problems. Construct as many graphs as you can to get the big picture.

    • Response distributions (histograms, box plots, etc.)

    • Responses versus time order scatter plot (a check for possible time effects)

    • Responses versus factor levels (first look at magnitude of factor effects)

    • Typical DOE plots (which assume standard models for effects and errors)

      • Main effects mean plots

      • Block plots

      • Normal or half-normal plots of the effects

      • Interaction plots

  • Sometimes the right graphs and plots of the data lead to obvious answers for your experimental objective questions and you can skip to the last step in this list. In most cases, however, you will want to continue by fitting and validating a model that can be used to answer your questions.

  • Create the theoretical model (the experiment should have been designed with this model in mind!).

  • Create a model from the data. Simplify the model, if possible, using stepwise regression methods and/or parameter p-value significance information.

  • Normal plot of all the effects: the effects thrown out by the regression lie close to the straight line of the normal plot, while the significant effects are clearly seen to fall away from it. This is another way to validate the effects that one is considering in the model.

  • Test the model assumptions using residual graphs.

    • If none of the model assumptions were violated, examine the ANOVA.

    • Simplify the model further, if appropriate. If reduction is appropriate, then return to the model-fitting step with a new model.

    • If model assumptions were violated, try to find a cause.

    • Are necessary terms missing from the model?

    • Will a transformation of the response help? If a transformation is used, return to the model-fitting step with a new model.

  • Use the results to answer the questions in your experimental objectives – finding important factors, finding optimum settings, etc.
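
To show what "finding important factors" boils down to in the simplest case, here is a sketch that computes the main effects and the interaction effect for a 2^2 factorial. The response values are made up purely for illustration, and the function name `effect` is my own.

```python
def effect(contrast, responses):
    """Average response at the +1 level minus average response at the -1 level."""
    high = [y for c, y in zip(contrast, responses) if c == 1]
    low = [y for c, y in zip(contrast, responses) if c == -1]
    return sum(high) / len(high) - sum(low) / len(low)

# Coded settings for a 2^2 full factorial, one run per row.
a = [-1, 1, -1, 1]
b = [-1, -1, 1, 1]
ab = [x * y for x, y in zip(a, b)]  # interaction contrast

# Hypothetical responses, one per run.
y = [20.0, 30.0, 24.0, 38.0]

print("effect of A :", effect(a, y))   # 12.0
print("effect of B :", effect(b, y))   # 6.0
print("effect of AB:", effect(ab, y))  # 2.0
```

With these made-up numbers, A dominates, B matters less, and the interaction is small. A main effects plot and an interaction plot would show the same story graphically.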

Software that can be used to get one’s hands dirty in using and applying DOE: camo.com

I hope that some day I will get to use the knowledge that I have gained over the last week.