
p-value

In statistical hypothesis testing, the p-value or probability value is, for a given statistical model, the probability that, when the null hypothesis is true, the statistical summary (such as the absolute value of the sample mean difference between two compared groups) would be equal to, or more extreme than, the actual observed results. The use of p-values in statistical hypothesis testing is common in many fields of research such as physics, economics, finance, political science, psychology, biology, criminal justice, criminology, and sociology. The misuse of p-values is a controversial topic in metascience.

In the 1770s Laplace considered the statistics of almost half a million births. The statistics showed an excess of boys compared to girls. He concluded by calculation of a p-value that the excess was a real, but unexplained, effect.

It is usual and convenient for experimenters to take 5 per cent as a standard level of significance, in the sense that they are prepared to ignore all results which fail to reach this standard, and, by this means, to eliminate from further discussion the greater part of the fluctuations which chance causes have introduced into their experimental results.

Italicisation, capitalisation and hyphenation of the term vary.
For example, AMA style uses 'P value', APA style uses 'p value', and the American Statistical Association uses 'p-value'.

In statistics, every conjecture concerning the unknown distribution $F$ of a random variable $X$ is called a statistical hypothesis. If we state one hypothesis only and the aim of the statistical test is to verify whether this hypothesis is not false, without at the same time investigating other hypotheses, then such a test is called a significance test. A statistical hypothesis that refers only to the numerical values of unknown parameters of a distribution is called a parametric hypothesis. Methods of verifying statistical hypotheses are called statistical tests. Tests of parametric hypotheses are called parametric tests. Likewise, there are non-parametric hypotheses and non-parametric tests.

The p-value is used in the context of null hypothesis testing to quantify the statistical significance of evidence. Null hypothesis testing is a reductio ad absurdum argument adapted to statistics. In essence, a claim is assumed valid if its counter-claim is improbable. The only hypothesis that needs to be specified in this test, and which embodies the counter-claim, is referred to as the null hypothesis (that is, the hypothesis to be nullified). A result is said to be statistically significant if it allows us to reject the null hypothesis. That is, as per the reductio ad absurdum reasoning, the statistically significant result should be highly improbable if the null hypothesis is assumed to be true. The rejection of the null hypothesis implies that the correct hypothesis lies in the logical complement of the null hypothesis. However, unless there is a single alternative to the null hypothesis, the rejection of the null hypothesis does not tell us which of the alternatives might be the correct one.
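As an illustration of this reasoning, the following is a minimal sketch of an exact permutation test on the absolute difference of two sample means. The permutation test is a technique chosen here for illustration, not one named in the text; the null hypothesis it assumes is that the group labels are exchangeable. The data values and the function name `permutation_p_value` are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for |mean(a) - mean(b)|.

    Assumed null hypothesis: group labels are exchangeable, so every
    way of splitting the pooled data into groups of the same sizes is
    equally likely.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    indices = range(len(pooled))
    as_extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Count splits whose statistic is equal to or more extreme than
        # the observed one (small tolerance guards against rounding).
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


# Hypothetical measurements, invented purely for illustration.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.0, 4.5, 4.8, 4.1]

p = permutation_p_value(group_a, group_b)
print(f"exact permutation p-value: {p:.4f}")
```

With these made-up values, only 2 of the 252 possible splits are as extreme as the observed one, so the p-value is about 0.008. Such a result would be highly improbable if the labels really were exchangeable, which is the sense in which it counts as evidence against the null hypothesis.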
As a general example, if a null hypothesis states that a certain summary statistic follows the standard normal distribution $N(0,1)$, then the rejection of this null hypothesis can either mean (i) the mean is not zero, or (ii) the variance is not unity, or (iii) the distribution is not normal, depending on the type of test performed. However, supposing we manage to reject the zero mean hypothesis, even if we know the distribution is normal and variance is unity, the null hypothesis test does not tell us which non-zero value we should adopt as the new mean.

If $X$ is a random variable representing the observed data and $H$ is the statistical hypothesis under consideration, then the notion of statistical significance can be naively quantified by the conditional probability $\Pr(X \mid H)$, which gives the likelihood of a certain observation event $X$ if the hypothesis is assumed to be correct. However, if $X$ is a continuous random variable, the probability of observing a specific instance $x$ is zero, that is, $\Pr(X = x \mid H) = 0$. Thus, this naive definition is inadequate and needs to be changed so as to accommodate continuous random variables. Nonetheless, it helps to clarify that p-values should not be confused with probabilities on the hypothesis (as used in Bayesian hypothesis testing) such as $\Pr(H \mid X)$, the probability of the hypothesis given the data, or $\Pr(H)$, the probability of the hypothesis being true, or $\Pr(X)$, the probability of observing the given data.
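The point about continuous random variables can be checked numerically. The sketch below assumes a standard normal null distribution and an observed value of 2.0, both chosen only for illustration. It shows that the probability of landing in an ever narrower interval around the observed value shrinks towards zero, while the probability of a tail event stays fixed. The helper name `interval_probability` is hypothetical.

```python
from statistics import NormalDist

# Assumed null distribution: standard normal N(0, 1).
null = NormalDist(mu=0.0, sigma=1.0)
x = 2.0  # illustrative observed value


def interval_probability(dist, x, half_width):
    """Pr(x - half_width <= X <= x + half_width) under dist."""
    return dist.cdf(x + half_width) - dist.cdf(x - half_width)


# Probability of observing "x itself", approximated by shrinking intervals.
widths = [1e-1, 1e-2, 1e-3, 1e-4]
interval_probs = [interval_probability(null, x, w) for w in widths]

# Probability of the tail event {X >= x}, which does not vanish.
right_tail = 1.0 - null.cdf(x)

for w, prob in zip(widths, interval_probs):
    print(f"half-width {w:g}: Pr = {prob:.6g}")
print(f"Pr(X >= {x}) = {right_tail:.5f}")
```

The interval probabilities fall roughly in proportion to the interval width, whereas the tail probability remains about 0.0228. A definition based on tail events therefore stays informative where the probability of a single point does not.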
The p-value is defined as the probability, under the null hypothesis $H$ (at times denoted $H_0$, as opposed to $H_\mathrm{a}$ denoting the alternative hypothesis) about the unknown distribution $F$ of the random variable $X$, for the variate to be observed as a value equal to or more extreme than the value observed. If $x$ is the observed value, then depending on how we interpret it, 'equal to or more extreme than what was actually observed' can mean $\{X \geq x\}$ (right-tail event), $\{X \leq x\}$ (left-tail event) or the event giving the smallest probability among $\{X \leq x\}$ and $\{X \geq x\}$ (double-tailed event). Thus, the p-value is given by

$p = \Pr(X \geq x \mid H)$ for a right-tail event,

$p = \Pr(X \leq x \mid H)$ for a left-tail event,

$p = 2\min\{\Pr(X \geq x \mid H), \Pr(X \leq x \mid H)\}$ for a double-tailed event.
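As a minimal sketch of these three tail events, the function below computes a p-value for an observed value under a continuous null distribution, assumed here to be the standard normal. The function name `p_value` and its `tail` argument are hypothetical. For the double-tailed case it uses the common convention of doubling the smaller tail probability, capped at 1, which is an assumption of this sketch.

```python
from statistics import NormalDist


def p_value(x, tail="two-sided", null=NormalDist(0.0, 1.0)):
    """p-value of an observed value x under a continuous null distribution.

    tail: "right" for {X >= x}, "left" for {X <= x}, or "two-sided",
    which doubles the smaller of the two tail probabilities.
    """
    left = null.cdf(x)          # Pr(X <= x | H)
    right = 1.0 - null.cdf(x)   # Pr(X >= x | H) for a continuous X
    if tail == "right":
        return right
    if tail == "left":
        return left
    if tail == "two-sided":
        return min(1.0, 2.0 * min(left, right))
    raise ValueError(f"unknown tail: {tail!r}")


for tail in ("right", "left", "two-sided"):
    print(f"{tail:>9}: {p_value(1.96, tail):.4f}")
```

For an observed value of 1.96 the right-tail probability is about 0.025 and the double-tailed p-value is about 0.05, while the left-tail probability is about 0.975. The same observation can thus look significant or not depending on which tail event the test treats as extreme.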
