[Figure: Design of experiments with full factorial design (left), response surface]

The design of experiments (DOE or DOX), also known as experiment design or experimental design, is the design of any task that aims to describe and explain the variation of information under conditions that are hypothesized to reflect the variation. The term is generally associated with experiments in which the design introduces conditions that directly affect the variation, but may also refer to the design of quasi-experiments, in which natural conditions that influence the variation are selected for observation.

In its simplest form, an experiment aims at predicting the outcome by introducing a change of the preconditions, which is represented by one or more independent variables, also referred to as “input variables” or “predictor variables.” The change in one or more independent variables is generally hypothesized to result in a change in one or more dependent variables, also referred to as “output variables” or “response variables.” The experimental design may also identify control variables that must be held constant to prevent external factors from affecting the results. Experimental design involves not only the selection of suitable independent, dependent, and control variables, but also planning the delivery of the experiment under statistically optimal conditions given the constraints of available resources. There are multiple approaches for determining the set of design points (unique combinations of the settings of the independent variables) to be used in the experiment.

Main concerns in experimental design include the establishment of validity, reliability, and replicability.
For example, these concerns can be partially addressed by carefully choosing the independent variable, reducing the risk of measurement error, and ensuring that the documentation of the method is sufficiently detailed. Related concerns include achieving appropriate levels of statistical power and sensitivity.

Correctly designed experiments advance knowledge in the natural and social sciences and engineering, with design of experiments methodology recognised as a key tool in the successful implementation of a Quality by Design (QbD) framework. [“The Sequential Nature of Classical Design of Experiments | Prism”. prismtc.co.uk. https://prismtc.co.uk/resources/blogs-and-articles/the-sequential-nature-of-classical-design-of-experiments. Retrieved 10 March 2023.] Other applications include marketing and policy making. The study of the design of experiments is an important topic in metascience.

History

Statistical experiments, following Charles S. Peirce

See also: Randomization

A theory of statistical inference was developed by Charles S. Peirce in “Illustrations of the Logic of Science” (1877–1878) [Peirce, Charles Sanders (1887). “Illustrations of the Logic of Science”. Open Court (10 June 2014). ISBN 0812698495.] and “A Theory of Probable Inference” (1883), [Peirce, Charles Sanders (1883). “A Theory of Probable Inference”. In C. S. Peirce (Ed.), Studies in logic by members of the Johns Hopkins University (pp. 126–181). Little, Brown and Co (1883).] two publications that emphasized the importance of randomization-based inference in statistics. [Stigler, Stephen M. (1978). “Mathematical statistics in the early States”. Annals of Statistics 6 (2): 239–65 (p. 248). projecteuclid.org/euclid.aos/1176344123. doi:10.1214/aos/1176344123. Quote: “Indeed, Pierce’s work contains one of the earliest explicit endorsements of mathematical randomization as a basis for inference of which I am aware (Peirce, 1957, pages 216–219)”.]

Randomized experiments

See also: Repeated measures design

Charles S. Peirce randomly assigned volunteers to a blinded, repeated-measures design to evaluate their ability to discriminate weights. [Peirce, Charles Sanders; Jastrow, Joseph (1885). “On Small Differences in Sensation”. Memoirs of the National Academy of Sciences 3: 73–83. psychclassics.yorku.ca/Peirce/small-diffs.htm.] [Hacking, Ian (September 1988). “Telepathy: Origins of Randomization in Experimental Design”. Isis 79 (3): 427–451. doi:10.1086/354775.] [Stigler, Stephen M. (November 1992). “A Historical View of Statistical Concepts in Psychology and Educational Research”. American Journal of Education 101 (1): 60–70. doi:10.1086/444032.] [Dehue, Trudy (December 1997). “Deception, Efficiency, and Random Groups: Psychology and the Gradual Origination of the Random Group Design”. Isis 88 (4): 653–673. doi:10.1086/383850. www.rug.nl/research/portal/en/publications/deception-efficiency-and-random-groups(459e54f0-1e56-4390-876a-46a33e80621d).html.] Peirce’s experiment inspired other researchers in psychology and education, who developed a research tradition of randomized experiments in laboratories and specialized textbooks in the 1800s.

Optimal designs for regression models

See also: Optimal design

Charles S. Peirce also contributed the first English-language publication on an optimal design for regression models in 1876. [Peirce, C. S. (1876). “Note on the Theory of the Economy of Research”. Coast Survey Report: 197–201, actually published 1879. NOAA PDF Eprint (archived 2 March 2017: https://web.archive.org/web/20170302071239docs.lib.noaa.gov/rescue/cgs/001_pdf/CSC-0025.PDF#page=222). Reprinted in Collected Papers 7, paragraphs 139–157, also in Writings 4, pp. 72–78, and in Peirce, C. S. (July–August 1967). “Note on the Theory of the Economy of Research”. Operations Research (4): 643–648. doi:10.1287/opre.15.4.643.] A pioneering optimal design for polynomial regression was suggested by Gergonne in 1815. In 1918, Kirstine Smith published optimal designs for polynomials of degree six (and less). [Guttorp, P.; Lindgren, G. (2009). “Karl Pearson and the Scandinavian school of statistics”. International Statistical Review 77: 64. doi:10.1111/j.1751-5823.2009.00069.x.] [Smith, Kirstine (1918). “On the standard deviations of adjusted and interpolated values of an observed polynomial function and its constants and the guidance they give towards a proper choice of the distribution of observations”. Biometrika 12 (1–2): 1–85. doi:10.1093/biomet/12.1-2.1. books.google.com/books?id=UMNLAAAAYAAJ.]

Sequences of experiments

See also: Multi-armed bandit problem, Gittins index, Optimal design

The use of a sequence of experiments, where the design of each may depend on the results of previous experiments, including the possible decision to stop experimenting, is within the scope of sequential analysis, a field that was pioneered [Johnson, N.L. (1961). “Sequential analysis: a survey.” Journal of the Royal Statistical Society, Series A. Vol. 124 (3), 372–411 (pages 375–376).] by Abraham Wald in the context of sequential tests of statistical hypotheses. [Wald, A. (1945). “Sequential Tests of Statistical Hypotheses”, Annals of Mathematical Statistics, 16 (2), 117–186.] Herman Chernoff wrote an overview of optimal sequential designs, while adaptive designs have been surveyed by S. Zacks. [Zacks, S. (1996). “Adaptive Designs for Parametric Models”. In: Ghosh, S. and Rao, C. R. (Eds.) (1996). “Design and Analysis of Experiments,” Handbook of Statistics, Volume 13. North-Holland. ISBN 0-444-82061-2 (pages 151–180).] One specific type of sequential design is the “two-armed bandit”, generalized to the multi-armed bandit, on which early work was done by Herbert Robbins in 1952. [Robbins, H. (1952). “Some Aspects of the Sequential Design of Experiments”. Bulletin of the American Mathematical Society 58 (5): 527–535. doi:10.1090/S0002-9904-1952-09620-8.]

Fisher’s principles

A methodology for designing experiments was proposed by Ronald Fisher, in his innovative books: The Arrangement of Field Experiments (1926) and The Design of Experiments (1935). Much of his pioneering work dealt with agricultural applications of statistical methods. As a mundane example, he described how to test the lady tasting tea hypothesis, that a certain lady could distinguish by flavour alone whether the milk or the tea was first placed in the cup. These methods have been broadly adapted in biological, psychological, and agricultural research. [Miller, Geoffrey (2000). The Mating Mind: how sexual choice shaped the evolution of human nature. London: Heinemann, ISBN 0-434-00741-2 (also Doubleday, ISBN 0-385-49516-1). p. 54: “To biologists, he was an architect of the ‘modern synthesis’ that used mathematical models to integrate Mendelian genetics with Darwin’s selection theories. To psychologists, Fisher was the inventor of various statistical tests that are still supposed to be used whenever possible in psychology journals. To farmers, Fisher was the founder of experimental agricultural research, saving millions from starvation through rational crop breeding programs.”]

Comparison: In some fields of study it is not possible to have independent measurements to a traceable metrology standard. Comparisons between treatments are much more valuable and are usually preferable. Treatments are often compared against a scientific control or a traditional treatment that acts as a baseline.

Randomization: Random assignment is the process of assigning individuals at random to groups, or to different conditions within a group, in an experiment, so that each individual of the population has the same chance of becoming a participant in the study. The random assignment of individuals to groups (or conditions within a group) distinguishes a rigorous, “true” experiment from an observational study or “quasi-experiment”. [Creswell, J.W. (2008). Educational research: Planning, conducting, and evaluating quantitative and qualitative research (3rd edition). Upper Saddle River, NJ: Prentice Hall. p. 300. ISBN 0-13-613550-1.] There is an extensive body of mathematical theory that explores the consequences of making the allocation of units to treatments by means of some random mechanism (such as tables of random numbers, or the use of randomization devices such as playing cards or dice). Assigning units to treatments at random tends to mitigate confounding, which makes effects due to factors other than the treatment appear to result from the treatment.

The risks associated with random allocation (such as having a serious imbalance in a key characteristic between a treatment group and a control group) are calculable and hence can be managed down to an acceptable level by using enough experimental units. However, if the population is divided into several subpopulations that somehow differ, and the research requires each subpopulation to be equal in size, stratified sampling can be used. In that way, the units in each subpopulation are randomized, but not the whole sample. The results of an experiment can be generalized reliably from the experimental units to a larger statistical population of units only if the experimental units are a random sample from the larger population; the probable error of such an extrapolation depends on the sample size, among other things.
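The two allocation schemes described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `randomize` and `stratified_randomize`, the `stratum_of` callback, and the treatment labels are assumptions made for the example, not part of any standard library or of the sources cited here.

```python
import random
from collections import defaultdict


def randomize(units, treatments, seed=None):
    """Completely randomized allocation: shuffle all units, then deal
    them out to the treatments in turn so group sizes stay balanced."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {u: treatments[i % len(treatments)] for i, u in enumerate(shuffled)}


def stratified_randomize(units, stratum_of, treatments, seed=None):
    """Randomize separately inside each subpopulation (stratum), so that
    every stratum is split across the treatments as evenly as possible."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for u in units:
        strata[stratum_of(u)].append(u)
    assignment = {}
    for key in sorted(strata):          # fixed order keeps runs reproducible
        members = strata[key]
        rng.shuffle(members)            # randomization happens within the stratum only
        for i, u in enumerate(members):
            assignment[u] = treatments[i % len(treatments)]
    return assignment


# Hypothetical example: 12 units, the first six "young", the rest "old".
units = list(range(12))
plan = stratified_randomize(
    units, lambda u: "young" if u < 6 else "old", ["control", "treatment"], seed=1
)
```

With complete randomization a stratum may by chance end up mostly in one group; the stratified version rules that out by construction, at the cost of randomizing only within each subpopulation.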

Statistical replication: Measurements are usually subject to variation and measurement uncertainty; thus they are repeated and full experiments are replicated to help identify the sources of variation, to better estimate the true effects of treatments, to further strengthen the experiment’s reliability and validity, and to add to the existing knowledge of the topic. [Hani (2009). “Replication study”. www.experiment-resources.com/replication-study.html. Archived at web.archive.org/web/20120602061136www.experiment-resources.com/replication-study.html on 2 June 2012. Retrieved 27 October 2011.] However, certain conditions must be met before the replication of the experiment is commenced: the original research question must have been published in a peer-reviewed journal or be widely cited; the researcher must be independent of the original experiment; the researcher must first try to replicate the original findings using the original data; and the write-up should state that the study conducted is a replication study that tried to follow the original study as strictly as possible. [Burman, Leonard E.; Reed, Robert W.; Alm, James (2010). “A call for replication studies”. Public Finance Review 38 (6): 787–793. doi:10.1177/1091142110385210. S2CID 27838472. http://pfr.sagepub.com. Retrieved 27 October 2011.]

[Figure: Blocking (right)]

Blocking: Blocking is the non-random arrangement of experimental units into groups (blocks) consisting of units that are similar to one another. Blocking reduces known but irrelevant sources of variation between units and thus allows greater precision in the estimation of the source of variation under study.
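The gain in precision from blocking can be illustrated with a small simulation. The sketch below is a hedged illustration under assumed conditions, not a general result: it assumes two units per block, a normally distributed block effect shared by both units of a block, independent normal noise, and an additive treatment effect. The function name `simulate` and all parameter values are hypothetical.

```python
import numpy as np


def simulate(n_blocks=10, block_sd=5.0, noise_sd=1.0, effect=2.0, reps=2000, seed=0):
    """Compare the sampling variance of the estimated treatment effect
    under a randomized block design and a completely randomized design."""
    rng = np.random.default_rng(seed)
    n_units = 2 * n_blocks
    blocked, unblocked = [], []
    for _ in range(reps):
        # Both units of a block share the same (nuisance) block effect.
        block_effect = rng.normal(0.0, block_sd, n_blocks)
        base = np.repeat(block_effect, 2) + rng.normal(0.0, noise_sd, n_units)

        # Randomized block design: exactly one randomly chosen unit per block is treated.
        pick = rng.integers(0, 2, n_blocks)
        treat = np.zeros(n_units, dtype=bool)
        treat[2 * np.arange(n_blocks) + pick] = True
        y = base + effect * treat
        blocked.append(y[treat].mean() - y[~treat].mean())

        # Completely randomized design: any half of the units is treated.
        treat = np.zeros(n_units, dtype=bool)
        treat[rng.choice(n_units, n_blocks, replace=False)] = True
        y = base + effect * treat
        unblocked.append(y[treat].mean() - y[~treat].mean())
    return np.var(blocked), np.var(unblocked)


var_blocked, var_unblocked = simulate()
```

In the blocked design every block contributes one treated and one untreated unit, so the block effects cancel in the difference of means; in the unblocked design they do not, which is why its estimate varies much more under these assumptions.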




Orthogonality

[Figure: Example of orthogonal factorial design]

Orthogonality concerns the forms of comparison (contrasts) that can be legitimately and efficiently carried out. Contrasts can be represented by vectors, and sets of orthogonal contrasts are uncorrelated and independently distributed if the data are normal. Because of this independence, each orthogonal treatment provides different information to the others. If there are T treatments and T – 1 orthogonal contrasts, all the information that can be captured from the experiment is obtainable from the set of contrasts.
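As a concrete illustration, one common way to build T – 1 mutually orthogonal contrasts is the Helmert scheme, in which each contrast compares one treatment with the average of the treatments before it. Helmert contrasts are an assumption chosen for this sketch; the text above does not prescribe any particular set, and the function name `helmert_contrasts` is hypothetical.

```python
import numpy as np


def helmert_contrasts(t):
    """Return a (t - 1) x t matrix whose rows are contrast vectors.

    Row k (1-based) puts weight +1 on the first k treatments and -k on
    treatment k + 1, so every row sums to zero (it is a contrast) and
    any two distinct rows have zero dot product (they are orthogonal).
    """
    rows = []
    for k in range(1, t):
        row = np.zeros(t)
        row[:k] = 1.0
        row[k] = -float(k)
        rows.append(row)
    return np.array(rows)


C = helmert_contrasts(4)
gram = C @ C.T  # off-diagonal entries are the pairwise dot products
```

For four treatments the rows are (1, -1, 0, 0), (1, 1, -2, 0) and (1, 1, 1, -3); the matrix `gram` is diagonal, which is the orthogonality condition expressed in matrix form.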

Multifactorial experiments: Use of multifactorial experiments instead of the one-factor-at-a-time method. These are efficient at evaluating the effects and possible interactions of several factors (independent variables). Analysis of experiment design is built on the foundation of the analysis of variance, a collection of models that partition the observed variance into components, according to what factors the experiment must estimate or test.
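A two-level full factorial design and the estimation of main effects and interactions from it can be sketched as follows. The factor names, the coded levels -1 and +1, and the response function `hypothetical_response` are all invented for illustration; a real experiment would replace the response function with measured data.

```python
from itertools import product
from math import prod
from statistics import mean


def full_factorial(levels):
    """Return every combination of factor levels, one dict per run."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*(levels[n] for n in names))]


def effect(design, responses, factors):
    """Effect of one factor (main effect) or of several factors jointly
    (interaction) in a two-level design coded as -1 / +1: the mean response
    where the product of the coded levels is +1 minus the mean where it is -1."""
    signs = [prod(run[f] for f in factors) for run in design]
    high = mean(y for s, y in zip(signs, responses) if s > 0)
    low = mean(y for s, y in zip(signs, responses) if s < 0)
    return high - low


def hypothetical_response(run):
    """Made-up noise-free response with a temperature x pressure interaction."""
    t, p = run["temperature"], run["pressure"]
    return 10 + 2 * t - 1 * p + 1.5 * t * p


design = full_factorial(
    {"temperature": [-1, 1], "pressure": [-1, 1], "catalyst": [-1, 1]}
)
responses = [hypothetical_response(run) for run in design]

temperature_effect = effect(design, responses, ["temperature"])
interaction_effect = effect(design, responses, ["temperature", "pressure"])
```

The same eight runs yield every main effect and every interaction. Varying one factor at a time from a single baseline would not reveal the temperature × pressure interaction, because no pair of its runs changes both factors together.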

Example

This example of designing experiments is attributed to Harold Hotelling, building on examples from Frank Yates. [Hotelling, Harold (1944). “Some Improvements in Weighing and Other Experimental Techniques”. Annals of Mathematical Statistics 15 (3): 297–306. doi:10.1214/aoms/1177731236. projecteuclid.org/euclid.aoms/1177731236.] [Giri, Narayan C.; Das, M. N. (1979). Design and Analysis of Experiments. New York, N.Y: Wiley. pp. 350–359. ISBN 9780852269145. books.google.com/books?id=-vGlnx-ZVvEC.] [Herman Chernoff, Sequential Analysis and Optimal Design, SIAM Monograph, 1972.] The experiments designed in this example involve combinatorial designs. [Sifri, Jack (8 December 2014). “How to Use Design of Experiments to Create Robust Designs With High Yield”. youtube.com. www.youtube.com/watch?v=hfdZabCVwzc. Retrieved 11 February 2015.]

Weights of eight objects are measured using a pan balance and set of standard weights. Each weighing measures the weight difference between objects in the left pan and any objects in the right pan by adding calibrated weights to the lighter pan until the balance is in equilibrium. Each measurement has a random error. The average error is zero; the standard deviation of the probability distribution of the errors is the same number σ on different weighings; errors on different weighings are independent. Denote the true weights by

$$\theta_1, \dots, \theta_8.$$

We consider two different experiments:

1. Weigh each object in one pan, with the other pan empty. Let Xᵢ be the measured weight of the object, for i = 1, ..., 8.
2. Do the eight weighings according to the following schedule (a weighing matrix):

$$
\begin{array}{lcc}
& \text{left pan} & \text{right pan} \\
\hline
\text{1st weighing:} & 1\ 2\ 3\ 4\ 5\ 6\ 7\ 8 & \text{(empty)} \\
\text{2nd:} & 1\ 2\ 3\ 8 & 4\ 5\ 6\ 7 \\
\text{3rd:} & 1\ 4\ 5\ 8 & 2\ 3\ 6\ 7 \\
\text{4th:} & 1\ 6\ 7\ 8 & 2\ 3\ 4\ 5 \\
\text{5th:} & 2\ 4\ 6\ 8 & 1\ 3\ 5\ 7 \\
\text{6th:} & 2\ 5\ 7\ 8 & 1\ 3\ 4\ 6 \\
\text{7th:} & 3\ 4\ 7\ 8 & 1\ 2\ 5\ 6 \\
\text{8th:} & 3\ 5\ 6\ 8 & 1\ 2\ 4\ 7
\end{array}
$$

Let Yᵢ be the measured difference for i = 1, ..., 8. Then the estimated value of the weight θ₁ is

$$\widehat{\theta}_1 = \frac{Y_1 + Y_2 + Y_3 + Y_4 - Y_5 - Y_6 - Y_7 - Y_8}{8}.$$

Similar estimates can be found for the weights of the other items:

$$
\begin{aligned}
\widehat{\theta}_2 & = \frac{Y_1 + Y_2 - Y_3 - Y_4 + Y_5 + Y_6 - Y_7 - Y_8}{8}. \\[5pt]
\widehat{\theta}_3 & = \frac{Y_1 + Y_2 - Y_3 - Y_4 - Y_5 - Y_6 + Y_7 + Y_8}{8}. \\[5pt]
\widehat{\theta}_4 & = \frac{Y_1 - Y_2 + Y_3 - Y_4 + Y_5 - Y_6 + Y_7 - Y_8}{8}. \\[5pt]
\widehat{\theta}_5 & = \frac{Y_1 - Y_2 + Y_3 - Y_4 - Y_5 + Y_6 - Y_7 + Y_8}{8}. \\[5pt]
\widehat{\theta}_6 & = \frac{Y_1 - Y_2 - Y_3 + Y_4 + Y_5 - Y_6 - Y_7 + Y_8}{8}. \\[5pt]
\widehat{\theta}_7 & = \frac{Y_1 - Y_2 - Y_3 + Y_4 - Y_5 + Y_6 + Y_7 - Y_8}{8}. \\[5pt]
\widehat{\theta}_8 & = \frac{Y_1 + Y_2 + Y_3 + Y_4 + Y_5 + Y_6 + Y_7 + Y_8}{8}.
\end{aligned}
$$

The question of design of experiments is: which experiment is better?

The variance of the estimate X₁ of θ₁ is σ² if we use the first experiment. But if we use the second experiment, the variance of the estimate given above is σ²/8. Thus the second experiment gives us 8 times as much precision for the estimate of a single item, and estimates all items simultaneously, with the same precision. What the second experiment achieves with eight weighings would require 64 weighings if the items are weighed separately. Note, however, that each estimate in the second experiment depends on all eight weighings. Because the sign patterns of any two of the estimates are orthogonal, the estimation errors are nonetheless uncorrelated under the stated assumptions of independent errors with a common standard deviation σ.

Many problems of the design of experiments involve combinatorial designs, as in this example and others.
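The variance claim can be checked numerically by writing the weighing schedule as a matrix W with entry +1 when an object is in the left pan and -1 when it is in the right pan, so that the eight estimates are the components of Wᵀ Y / 8. The sketch below assumes that encoding; the names `LEFT_PAN`, `weighing_matrix`, and the example weights are hypothetical choices for this illustration.

```python
import numpy as np

# Objects placed in the left pan for each of the eight weighings;
# every other object goes in the right pan (none in the first weighing).
LEFT_PAN = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [1, 2, 3, 8],
    [1, 4, 5, 8],
    [1, 6, 7, 8],
    [2, 4, 6, 8],
    [2, 5, 7, 8],
    [3, 4, 7, 8],
    [3, 5, 6, 8],
]


def weighing_matrix(left_pan, n_objects=8):
    """Row i, column j is +1 if object j + 1 is in the left pan in weighing i + 1, else -1."""
    W = -np.ones((len(left_pan), n_objects))
    for row, objects in enumerate(left_pan):
        for obj in objects:
            W[row, obj - 1] = 1.0
    return W


W = weighing_matrix(LEFT_PAN)

# The estimator is theta_hat = W.T @ Y / 8. With independent errors of
# variance sigma^2, its covariance matrix is sigma^2 * (W.T @ W) / 64.
cov_in_units_of_sigma2 = (W.T @ W) / 64.0

# Noise-free sanity check with made-up true weights: the estimator recovers them.
theta = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
Y = W @ theta
theta_hat = W.T @ Y / 8.0
```

Here `W.T @ W` equals 8 times the identity matrix, so each diagonal entry of `cov_in_units_of_sigma2` is 1/8, matching the variance σ²/8 quoted above, compared with σ² when each object is weighed once on its own.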

Avoiding false positives

See also: Metascience

False positive conclusions, often resulting from the pressure to publish or the author’s own confirmation bias, are an inherent hazard in many fields. [Forstmeier, Wolfgang; Wagenmakers, Eric-Jan; Parker, Timothy H. (23 November 2016). “Detecting and avoiding likely false-positive findings – a practical guide”. Biological Reviews 92 (4): 1941–1968. doi:10.1111/brv.12315. ISSN 1464-7931.] Use of double-blind designs can prevent biases potentially leading to false positives in the data collection phase. When a double-blind design is used, participants are randomly assigned to experimental groups but the researcher is unaware of which participants belong to which group. Therefore, the researcher cannot affect the participants’ response to the intervention. [David, Sharoon; Khandhar, Paras B. (17 July 2023). “Double-Blind Study”. StatPearls Publishing. www.ncbi.nlm.nih.gov/books/NBK546641/.] Experimental designs with undisclosed degrees of freedom are a problem, [Simmons, Joseph; Nelson, Leif; Simonsohn, Uri (November 2011). “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant”. Psychological Science 22 (11): 1359–1366. doi:10.1177/0956797611417632. ISSN 0956-7976.] in that they can lead to conscious or unconscious “p-hacking”: trying multiple analyses until the desired result is obtained. It typically involves the manipulation, perhaps unconsciously, of the process of statistical analysis and the degrees of freedom until they return a figure below the p

- content above as imported from Wikipedia