Time to Abandon Null Hypothesis Significance Testing? Moving Beyond the Default Approach to Statistical Analysis and Reporting

Blakeley B. McShane, Eric T. Bradlow, John G. Lynch, Jr. and Robert Meyer

Null hypothesis significance testing (NHST) is the default approach to statistical analysis and reporting in marketing and, more broadly, in the biomedical and social sciences. In a new Journal of Marketing article, we propose abandoning NHST as the default approach to statistical analysis and reporting.

As practiced, NHST involves:

  1. assuming that the intervention under investigation has no effect along with other assumptions,
  2. computing a statistical measure known as a P-value based on these assumptions, and
  3. comparing the computed P-value to the arbitrary threshold value of 0.05.

If the P-value is less than 0.05, the effect is declared “statistically significant,” the assumption of no effect is rejected, and it is concluded that the intervention has an effect in the real world. If the P-value is above 0.05, the effect is declared “statistically nonsignificant,” the assumption of no effect is not rejected, and it is concluded that the intervention has no effect in the real world.
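To make the three steps concrete, the following sketch (in Python, with entirely hypothetical simulated data) computes a P-value under the assumption of no effect and dichotomizes it at 0.05 as NHST prescribes.

```python
# Minimal sketch of NHST as practiced, using hypothetical simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
control = rng.normal(loc=0.0, scale=1.0, size=50)  # hypothetical control group
treated = rng.normal(loc=0.3, scale=1.0, size=50)  # hypothetical treated group

# Steps 1-2: assume the intervention has no effect (plus normality and other
# assumptions) and compute the P-value from a two-sample t-test.
t_stat, p_value = stats.ttest_ind(treated, control)

# Step 3: compare the P-value to the arbitrary 0.05 threshold.
label = "statistically significant" if p_value < 0.05 else "statistically nonsignificant"
print(f"p = {p_value:.3f} -> declared {label}")
```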

Criticisms of NHST

Despite its default role, NHST has long been criticized by both statisticians and applied researchers, including those within marketing. The most prominent criticisms relate to NHST’s dichotomization of results as “statistically significant” versus “statistically nonsignificant.”

For example, authors, editors, and reviewers use “statistical (non)significance” as a filter to select which results to publish. This creates a distorted literature because the effects of published interventions are biased upward in magnitude. It also encourages harmful research practices that yield results that attain “statistical significance.”

Additionally, NHST has no basis because no intervention has precisely zero effect in the real world, and thus small P-values and “statistical significance” are guaranteed given a sufficiently large sample size. Put differently, there is no need to reject a hypothesis of zero effect when it is already known to be false.
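A quick simulation illustrates the point (a sketch with hypothetical numbers): an effect of a mere 0.005 standard deviations, negligible for practical purposes, still tends to produce a P-value far below 0.05 once the sample size is large enough.

```python
# Sketch: a practically negligible effect still yields "statistical
# significance" once the sample size is large enough (hypothetical numbers).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 2_000_000
control = rng.normal(loc=0.000, scale=1.0, size=n)
treated = rng.normal(loc=0.005, scale=1.0, size=n)  # effect of 0.005 standard deviations

t_stat, p_value = stats.ttest_ind(treated, control)
print(f"p = {p_value:.2g}")  # typically far below 0.05 despite the negligible effect
```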

Perhaps the most widespread abuse of statistics is to ascertain where some statistical measure such as a P-value stands relative to 0.05, to declare “statistical (non)significance” on that basis, and then to make general and certain conclusions from a single study. Single studies are never definitive and thus can never demonstrate an effect or no effect. The aim of studies should be to report results in an unfiltered manner so that they can later be used to make more general conclusions based on the cumulative evidence from multiple studies. Nonetheless, NHST leads researchers to wrongly make general and certain conclusions and to wrongly filter results.

P-values naturally vary a great deal from study to study. As an example, a “statistically significant” original study with an observed P-value of p = 0.005 (far below the 0.05 threshold) and a “statistically nonsignificant” replication study with an observed P-value of p = 0.194 (far above the 0.05 threshold) are highly compatible with one another: the P-value for the difference between the two studies, computed assuming no true difference, is p = 0.289. However, when viewed through the lens of “statistical (non)significance,” the two studies are categorized differently and thus appear to contradict one another.
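The arithmetic behind this example can be sketched directly from the underlying z-statistics (here taken to be 2.80 and 1.30, assuming the two studies estimate the same quantity with equal standard errors and normally distributed estimates):

```python
# Sketch of the arithmetic: z-statistics of 2.80 and 1.30 imply the stated
# P-values, yet the P-value for the difference between the studies is ~0.289.
from scipy import stats

z_original, z_replication = 2.80, 1.30

print(2 * stats.norm.sf(z_original))     # ~0.005: "statistically significant"
print(2 * stats.norm.sf(z_replication))  # ~0.194: "statistically nonsignificant"

# z-statistic and two-sided P-value for the difference between the studies,
# assuming equal standard errors.
z_diff = (z_original - z_replication) / 2 ** 0.5
print(2 * stats.norm.sf(z_diff))         # ~0.289: the two studies are compatible
```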

We propose a major transition in statistical analysis and reporting. Specifically, we propose abandoning NHST, and the P-value thresholds intrinsic to it, as the default approach to statistical analysis and reporting. Our recommendations are as follows:

  • “Statistical (non)significance” should never be used as a basis to make general and certain conclusions.
  • “Statistical (non)significance” should also never be used as a filter to select which results to publish.
  • Instead, all studies should be published in some form or another.
  • Reporting should focus on quantifying study results via point and interval estimates (see the sketch following this list). All of the values inside conventional interval estimates are at least reasonably compatible with the data given all of the assumptions used to compute them; therefore, it makes no sense to single out a specific value such as the null value.
  • General conclusions should be made based on the cumulative evidence from multiple studies.
  • Studies need to treat P-values continuously and as just one factor among many (including prior evidence, plausibility of mechanism, study design, data quality, and others that vary by research domain) that require joint consideration and holistic integration.
  • Researchers must also respect the fact that such conclusions are necessarily tentative and subject to revision as new studies are conducted.
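
As a sketch of the estimation-focused reporting suggested above (with hypothetical data and the usual normal approximation), the code below reports a point estimate and a conventional 95% interval estimate for a difference in means rather than a significant-versus-nonsignificant label.

```python
# Sketch: report a point estimate and a 95% interval estimate instead of a
# dichotomous "significance" label (hypothetical data, normal approximation).
import numpy as np

rng = np.random.default_rng(2)
control = rng.normal(loc=0.0, scale=1.0, size=50)  # hypothetical control group
treated = rng.normal(loc=0.3, scale=1.0, size=50)  # hypothetical treated group

estimate = treated.mean() - control.mean()                      # point estimate
se = np.sqrt(treated.var(ddof=1) / treated.size
             + control.var(ddof=1) / control.size)              # standard error
lower, upper = estimate - 1.96 * se, estimate + 1.96 * se       # 95% interval

# Every value inside [lower, upper] is reasonably compatible with the data
# under the model's assumptions; the null value of zero has no special status.
print(f"estimate = {estimate:.2f}, 95% interval = [{lower:.2f}, {upper:.2f}]")
```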

Decisions are seldom necessary in scientific reporting and, when they are necessary, are best left to end users such as managers and clinicians. In such cases, decisions should be made via a decision analysis that integrates the costs, benefits, and probabilities of all possible consequences through a loss function (which typically varies dramatically across stakeholders), not via arbitrary thresholds applied to statistical summaries such as P-values (“statistical (non)significance”), which, outside of certain specialized applications such as industrial quality control, are insufficient for this purpose.
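As an illustration of what such a decision analysis might look like (a sketch with entirely hypothetical probabilities and losses, not a prescription from the article), one computes the expected loss of each available action and chooses the action that minimizes it:

```python
# Sketch of a simple decision analysis: choose the action with the lowest
# expected loss given outcome probabilities and a stakeholder-specific loss
# function, rather than applying a threshold to a P-value. All numbers are
# hypothetical.
outcome_probs = {"effect_meaningful": 0.7, "effect_negligible": 0.3}

# Loss (e.g., in thousands of dollars) for each action under each outcome;
# these would typically vary dramatically across stakeholders.
loss = {
    "launch":        {"effect_meaningful": -100, "effect_negligible": 40},
    "do_not_launch": {"effect_meaningful": 0,    "effect_negligible": 0},
}

expected_loss = {
    action: sum(p * loss[action][outcome] for outcome, p in outcome_probs.items())
    for action in loss
}

best_action = min(expected_loss, key=expected_loss.get)
print(expected_loss, "->", best_action)
```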

Read the Full Article for Complete Details

From: Blakeley B. McShane, Eric T. Bradlow, John G. Lynch, Jr., and Robert J. Meyer, “Time to Abandon Null Hypothesis Significance Testing? Moving Beyond the Default Approach to Statistical Analysis and Reporting,” Journal of Marketing.


Blakeley B. McShane is Professor of Marketing, Northwestern University, USA.

Eric T. Bradlow is K.P. Chao Professor, Professor of Marketing, Statistics, Economics and Education, and Vice-Dean of Analytics, University of Pennsylvania, USA.

John G. Lynch, Jr. is University of Colorado Distinguished Professor, University of Colorado, USA.

Robert Meyer is Frederick H. Ecker/MetLife Insurance Professor of Marketing, University of Pennsylvania, USA.