
All models are wrong

'All models are wrong' is a common aphorism in statistics; it is often expanded as 'All models are wrong, but some are useful'. It is usually considered to be applicable to not only statistical models, but to scientific models generally. The aphorism is generally attributed to the statistician George Box, although the underlying concept predates Box's writings.

The first record of Box saying 'all models are wrong' is in a 1976 paper published in the Journal of the American Statistical Association. The 1976 paper contains the aphorism twice. The two sections of the paper that contain the aphorism are copied below.

2.3 Parsimony

Since all models are wrong the scientist cannot obtain a 'correct' one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so overelaboration and overparameterization is often the mark of mediocrity.

Box repeated the aphorism in a paper that was published in the proceedings of a 1978 statistics workshop. The paper contains a section entitled 'All models are wrong but some are useful'. The section is copied below.

Now it would be very remarkable if any system existing in the real world could be exactly represented by any simple model. However, cunningly chosen parsimonious models often do provide remarkably useful approximations. For example, the law PV = RT relating pressure P, volume V and temperature T of an 'ideal' gas via a constant R is not exactly true for any real gas, but it frequently provides a useful approximation and furthermore its structure is informative since it springs from a physical view of the behavior of gas molecules.

Box repeated the aphorism twice more in his 1987 book, Empirical Model-Building and Response Surfaces (which was co-authored with Norman Draper). The first repetition is on p. 74: 'Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful.' The second repetition is on p. 424, which is excerpted below.

... all models are approximations. Essentially, all models are wrong, but some are useful. However, the approximate nature of the model must always be borne in mind....

A second edition of the book was published in 2007, under the title Response Surfaces, Mixtures, and Ridge Analyses. The second edition also repeats the aphorism twice, in contexts identical with those of the first edition (on p. 63 and p. 414).

Box repeated the aphorism two more times in his 1997 book, Statistical Control: By Monitoring and Feedback Adjustment (which was co-authored with Alberto Luceño). The first repetition is on p. 6, which is excerpted below.

It has been said that 'all models are wrong but some models are useful.' In other words, any model is at best a useful fiction—there never was, or ever will be, an exactly normal distribution or an exact linear relationship. Nevertheless, enormous progress has been made by entertaining such fictions and using them as approximations.

All models are approximations. Assumptions, whether implied or clearly stated, are never exactly true. All models are wrong, but some models are useful. So the question you need to ask is not 'Is the model true?' (it never is) but 'Is the model good enough for this particular application?'

Modelling in science remains, partly at least, an art. Some principles do exist, however, to guide the modeller. The first is that all models are wrong; some, though, are better than others and we can search for the better ones. At the same time we must recognize that eternal truth is not within our grasp.

... it does not seem helpful just to say that all models are wrong. The very word model implies simplification and idealization. The idea that complex physical, biological or sociological systems can be exactly described by a few formulae is patently absurd. The construction of idealized representations that capture important stable aspects of such systems is, however, a vital part of general scientific analysis and statistical models, especially substantive ones, do not seem essentially different from other kinds of model.

A model is a simplification or approximation of reality and hence will not reflect all of reality. ... Box noted that 'all models are wrong, but some are useful.' While a model can never be 'truth,' a model might be ranked from very useful, to useful, to somewhat useful to, finally, essentially useless.

... there are wonderful models — like city maps....

I take his general point, which is that a street map could be exactly correct, to the resolution of the map.

... seemingly incompatible models may be used to make predictions about the same phenomenon. ... For each model we may believe that its predictive power is an indication of its being at least approximately true. But if both models are successful in making predictions, and yet mutually inconsistent, how can they both be true? Let us consider a simple illustration. Two observers are looking at a physical object. One may report seeing a circular disc, and the other may report seeing a rectangle. Both will be correct, but one will be looking at the object (a cylindrical can) from above and the other will be observing from the side. The two models represent different aspects of the same reality.

In general, when building statistical models, we must not forget that the aim is to understand something about the real world. Or predict, choose an action, make a decision, summarize evidence, and so on, but always about the real world, not an abstract mathematical world: our models are not the reality—a point well made by George Box in his oft-cited remark that 'all models are wrong, but some are useful'.

… no models are [true]—not even the Newtonian laws. When you construct a model you leave out all the details which you, with the knowledge at your disposal, consider inessential…. Models should not be true, but it is important that they are applicable, and whether they are applicable for any given purpose must of course be investigated. This also means that a model is never accepted finally, only on trial.

Ce qui est simple est toujours faux. Ce qui ne l’est pas est inutilisable.
What is simple is always wrong. What is not is unusable.

… no model can ever be theoretically attainable that will completely and uniquely characterize the indefinitely expansible concept of a state of statistical control. What is perhaps even more important, on the basis of a finite portion of the sequence —and we can never have more than a finite portion—we can not reasonably hope to construct a model that will represent exactly any specific characteristic of a particular state of control even though such a state actually exists. Here the situation is much like that in physical science where we find a model of a molecule; any model is always an incomplete though useful picture of the conceived physical thing called a molecule.

We all know that art is not truth. Art is a lie that makes us realize truth, at least the truth that is given us to understand. The artist must know the manner whereby to convince others of the truthfulness of his lies.
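
The ideal gas example quoted above (PV = RT, 'not exactly true for any real gas' yet 'a useful approximation') can be illustrated numerically. The following is a minimal sketch, not part of any of the quoted sources. It assumes one mole of gas, uses the van der Waals equation as a stand-in for a more elaborate model, and takes approximate textbook van der Waals constants for carbon dioxide. The function names and the chosen molar volumes are hypothetical. The van der Waals equation is itself only another approximation, so the comparison shows how far two models drift apart, not how far either is from the truth.

```python
# Compare the ideal gas law with the van der Waals equation for one mole of gas.
# Assumption: the van der Waals constants below are approximate textbook values
# for carbon dioxide; they are illustrative, not authoritative.

R = 8.314462618        # molar gas constant, J/(mol*K)
A_CO2 = 0.3640         # van der Waals attraction term, Pa*m^6/mol^2 (approximate)
B_CO2 = 4.267e-5       # van der Waals excluded volume, m^3/mol (approximate)


def ideal_gas_pressure(molar_volume, temperature):
    """Pressure in Pa from PV = RT for one mole of gas."""
    return R * temperature / molar_volume


def van_der_waals_pressure(molar_volume, temperature, a=A_CO2, b=B_CO2):
    """Pressure in Pa from the van der Waals equation for one mole of gas."""
    return R * temperature / (molar_volume - b) - a / molar_volume**2


def relative_gap(molar_volume, temperature):
    """Relative difference between the two models, measured against van der Waals."""
    p_ideal = ideal_gas_pressure(molar_volume, temperature)
    p_vdw = van_der_waals_pressure(molar_volume, temperature)
    return (p_ideal - p_vdw) / p_vdw


if __name__ == "__main__":
    temperature = 300.0  # kelvin
    # From a dilute gas (large molar volume) to a dense one (small molar volume).
    for molar_volume in (2.4e-2, 2.4e-3, 2.4e-4):
        gap = relative_gap(molar_volume, temperature)
        print(f"V = {molar_volume:.1e} m^3/mol  relative gap = {gap:.3f}")
```

Under these assumptions the two models agree closely for the dilute gas and diverge substantially for the dense one, which is one way to read the question 'Is the model good enough for this particular application?'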
