In Cooperation with:

American Society for Quality Statistics Division

American Statistical Association

Bernoulli Society for Mathematical Statistics and Probability

Institute of Mathematical Statistics

International Biometric Society

International Chinese Statistical Association

International Society for Bayesian Analysis

International Statistical Institute

Royal Statistical Society

Statistical Society of Canada / Société statistique du Canada
William Sealy Gosset
Summary. Better known by his pseudonym, `Student', Gosset is associated with the discovery of the t-distribution and its use, and he had a profound effect on the practice of statistics in industry and agriculture.

William Sealy Gosset was born in Canterbury, England. He received a degree in Chemistry from Oxford University and went to work as a ``brewer'' in 1899 at Arthur Guinness Son and Co. Ltd. in Dublin, Ireland. He died in Beaconsfield, England, at the age of 61, still in the employ of Guinness.

The circumstances of his work led Gosset, early in his career at Guinness, to examine the relationship between the raw materials for beer and the finished product, and this activity naturally led him to learn the tools of statistical analysis. In 1905, Gosset sought out the advice of Karl Pearson (q.v.) and subsequently spent the better part of a year, in 1906-1907, in Pearson's Biometric Laboratory at University College London, where he worked on small-sample statistics problems. Gosset then produced a pair of papers that were published in Biometrika in 1908 under the nom de plume `Student.' The first of these derived what we now know as `Student's' t-distribution, and the second dealt with the small-sample distribution of Pearson's correlation coefficient. These contributions placed Gosset among the great men of the newly emerging field of statistical methodology. In fact, the t-test based on his 1908 paper is perhaps the single most widely used statistical tool in applications.

In the years that followed, Gosset worked on a variety of statistical problems in agriculture, including experiments. He was in active correspondence with the leading English statisticians of his day, including Karl Pearson, Egon Pearson (q.v.), and R. A. Fisher (q.v.).
Gosset's correspondence with Fisher dealt with highly varied topics and was, as Plackett and Barnard note, ``interspersed with friendly advice on both sides.'' In his later years, he had a number of public disagreements with Fisher over the role of randomisation in experimentation. Gosset was a strong advocate of experimental control, a point that came through quite vividly in his proposal in connection with the Lanarkshire milk experiment in `Student' (1931), although in this paper he was also critical of an evaluation of the study carried out by Bartlett and Fisher (1931). In particular, Gosset was enamoured of systematic experimental plans and opposed the use of randomisation. This controversy led Gosset to prepare his final paper (`Student', 1937), published a few months after his death.

In the next section, we comment on some of the technical details of Gosset's seminal 1908 contributions. For further details on Gosset's life and contributions, see Plackett and Barnard (1990). Gosset's writings are collected in `Student' (1942).

Gosset on the Mean and the Correlation Coefficient

In 1908, Gosset's work at the Guinness brewery led him to publish the results that would become associated with his name in future generations. In an article entitled ``The probable error of a mean'' (`Student', 1908a), he established the sampling distributions of [...]

At the time of publication, the importance of these results was not fully recognised. The focus among most contemporary statisticians was on large-sample theory, and Gosset's emphasis on small samples, arising from his work at the brewery, set him somewhat apart. In fact, it was not until Fisher generalised `Student's' [...]

Aside from the derivations mentioned above, there are a number of interesting features of the 1908 manuscript. First is the break from the tradition of the Biometric School, which used the same symbol for both the population parameter and the sample statistic.
In Gosset's paper, he uses [...]

Another aspect of this paper worthy of note is Gosset's use of a sampling experiment to help solve the problem at hand empirically, instead of finding an analytic solution. The essence of the simulation was the following: using data on the height and left middle finger measurements of 3000 criminals, he generated 750 random samples of size 4. Gosset then calculated the means, standard deviations and correlation coefficient of each sample as well as [...]

This paper implicitly takes an inverse probability approach, although there is no discussion of prior distributions. We encounter, for instance, statements such as ``Thus, to take the tables for samples of 6, the probability of the mean of the population lying between [...]

The Correlation Coefficient

In addition to the famous article establishing the [...]

Gosset on Experimental Design

R. A. Fisher's correspondence with Gosset began in 1912, when Fisher sent Gosset a copy of his paper applying maximum likelihood (as it would later come to be known) to estimate the mean and variance of a normal population. They did not meet until a decade later, however, when Gosset visited Rothamsted and presented Fisher with a copy of his statistical tables. They continued to correspond on a variety of topics and, in 1923, there was an exchange of letters between the two on Fisher's work with Mackenzie on the design of experiments, in which Gosset advocated the use of systematic field arrangements, in essence rejecting Fisher's proposal for randomisation.

Their disagreement on the use of randomisation continued in private correspondence (see various excerpts in Plackett and Barnard, 1990, Chapter 5) and could hardly be read into Gosset's only public criticism of Fisher, in the context of a published comment on the infamous Lanarkshire milk experiment (`Student', 1931).
In 1936, however, the debate became public in a discussion of a paper read before the Royal Statistical Society on `co-operation in large-scale experiments.' Gosset led off the discussion by extolling the virtues of Beaven's half-drill strip systematic design, and Fisher, who spoke next, expressed his opposition to such systematic designs. This was followed by a paper by Fisher and Barbacki criticising `the supposed precision of systematic designs' and an exchange of letters between Fisher and Gosset in Nature.

At the time of his death, Gosset was working on a detailed response to Fisher in which he once again put forth his support of systematic experimentation and expressed doubts about the role of randomisation. After so many years, they had not resolved their differences on this fundamental statistical issue. The paper appeared posthumously in 1938, and when he read it Fisher observed in a letter to Harold Jeffreys: ``So far as I can judge, `Student' and I would have differed quite inappreciably on randomisation if we had seen enough of each other to know exactly what the other meant, and if he had not felt in duty bound, not only to extol the merits, but also to deny the defects of Beaven's half drill strip system.''

References

Bartlett, S. and Fisher, R. A. (1931). Pasteurized and raw milk. Nature, 127, 591-592.
Bennett, J. H. (1990). Statistical Inference and Analysis. Selected Correspondence of R. A. Fisher. Oxford, pp. 271-272.
Fisher, R. A. (1915). Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika, 10, 507-521.
Geary, R. C. (1936). The distribution of `Student's' ratio for non-normal samples. Journal of the Royal Statistical Society Supplement (superseded by Series B), 3, 178-184.
Geary, R. C. (1947). Testing for normality. Biometrika, 34, 209-242.
Jeffreys, H. (1937). On the relation between direct and inverse methods in statistics. Proc. R. Soc. A, 160, 325-348.
Pearson, E. S. (1929). The distribution of frequency constants in small samples from non-normal symmetrical and skew populations. Biometrika, 21, 259-286.
Pearson, E. S. (1939). `Student' as statistician. Biometrika, 30, 210-250.
Pfanzagl, J. and Sheynin, O. (1996). Studies in the history of probability and statistics XLIV: A forerunner of the [...]
Plackett, R. L. and Barnard, G. A. (1990). `Student': A Statistical Biography of William Sealy Gosset. Based on the writings of E. S. Pearson. Oxford: Clarendon Press.
Stigler, S. (1978). Francis Ysidro Edgeworth, Statistician (with Discussion). Journal of the Royal Statistical Society, 141, 287-322.
`Student' (1908a). The probable error of a mean. Biometrika, 6, 1-25.
`Student' (1908b). Probable error of a correlation coefficient. Biometrika, 6, 302-310.
`Student' (1931). The Lanarkshire milk experiment. Biometrika, 23, 398-406.
`Student' (1937). Random and balanced arrangements. Biometrika, 29, 363-379.
`Student' (1942). `Student's' Collected Papers. (ed. by E. S. Pearson and J. Wishart), with a foreword by L. McMullen. Biometrika Office, University College.
Welch, B. L. (1958). `Student' and small sample theory. Journal of the American Statistical Association, 53, 777-788.

Stephen E. Fienberg and Nicole Lazar


