
Department of Psychology, University of Cambridge, Cambridge, UK.

There is increasing concern about the replicability of studies in psychology and cognitive neuroscience. Hidden data dredging (also called p-hacking) is a major contributor to this crisis because it substantially increases Type I error, resulting in a much larger proportion of false positive findings than the usually expected 5%. In order to build better intuition to avoid, detect, and criticize some typical problems, here I systematically illustrate the large impact of some easy-to-implement, and therefore perhaps frequent, data dredging techniques on boosting false positive findings. I illustrate several forms of two special cases of data dredging. First, researchers may violate the data collection stopping rules of null hypothesis significance testing by repeatedly checking for statistical significance with various numbers of participants. Second, researchers may group participants post hoc along potential but unplanned independent grouping variables. The first approach 'hacks' the number of participants in studies; the second approach 'hacks' the number of variables in the analysis. I demonstrate the high proportion of false positive findings generated by these techniques with data from true null distributions. I also illustrate that it is extremely easy to introduce strong bias into data by very mild selection and re-testing. Similar, usually undocumented data dredging steps can easily lead to 20–50% or more false positives.

It is increasingly acknowledged that psychology, cognitive neuroscience, and biomedical research are in a crisis of producing too many false positive findings which cannot be replicated (Ioannidis, 2005; Ioannidis et al., 2014; Open Science Collaboration, 2015). The crisis wastes research funding, erodes credibility, and slows down scientific progress.
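The inflation of Type I error by the two techniques can be reproduced with a small simulation on true null data. The sketch below is not the paper's own code; the function names and parameter choices (starting sample size, maximum sample size, number of candidate grouping variables) are illustrative assumptions. For simplicity it uses a two-sided z-test with known unit variance rather than a t-test, which is exact here because the simulated data are standard normal.

```python
import math
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the simulation is reproducible
Z_CRIT = 1.96            # two-sided 5% critical value for a z-test

def one_sample_z(sample):
    """Two-sided z-statistic for H0: mu = 0 with known sigma = 1."""
    return mean(sample) * math.sqrt(len(sample))

def fixed_n_rate(n_sims=4000, n=60):
    """Planned single test at a fixed N: false positives stay near 5%."""
    return sum(abs(one_sample_z([rng.gauss(0, 1) for _ in range(n)])) > Z_CRIT
               for _ in range(n_sims)) / n_sims

def optional_stopping_rate(n_sims=4000, n_start=10, n_max=60, step=5):
    """'Hacking' N: test after every batch of new participants and stop
    at the first significant result. Data are pure noise throughout."""
    hits = 0
    for _ in range(n_sims):
        sample = [rng.gauss(0, 1) for _ in range(n_start)]
        while True:
            if abs(one_sample_z(sample)) > Z_CRIT:
                hits += 1
                break
            if len(sample) >= n_max:
                break
            sample.extend(rng.gauss(0, 1) for _ in range(step))
    return hits / n_sims

def posthoc_grouping_rate(n_sims=2000, n=40, n_groupings=10):
    """'Hacking' variables: split the same null sample along several
    unplanned binary grouping variables and test each group difference."""
    hits = 0
    for _ in range(n_sims):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        for _ in range(n_groupings):
            labels = [rng.random() < 0.5 for _ in range(n)]
            g1 = [x for x, g in zip(sample, labels) if g]
            g2 = [x for x, g in zip(sample, labels) if not g]
            if not g1 or not g2:
                continue
            z = (mean(g1) - mean(g2)) / math.sqrt(1 / len(g1) + 1 / len(g2))
            if abs(z) > Z_CRIT:
                hits += 1
                break
    return hits / n_sims

print(f"planned single test      : {fixed_n_rate():.3f}")   # near 0.05
print(f"optional stopping        : {optional_stopping_rate():.3f}")
print(f"post hoc grouping (k=10) : {posthoc_grouping_rate():.3f}")
```

With eleven interim looks at the data, optional stopping roughly triples the nominal 5% false positive rate, and testing ten unplanned groupings pushes the chance of at least one spurious 'effect' well into the range the abstract describes, even though every data point was drawn from a true null distribution.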
