Peer Review of Science is Not Science 9/2/20

Recently I submitted a research paper to a few journals.  The topic was whether psychology journals are living up to their own publishing standards.  A few years back the APA took the time to create Journal Article Reporting Standards (JARS) for quantitative and qualitative research.

Because I have been looking for ways to increase exposure to science for the public and others without reasonable access behind the publishing paywall, I focused on the JARS for research abstracts.  An abstract is a summary of the research, and for those without a subscription to the journal it is the only information they have about the study.  For anyone interested in informing the public outside of academia, then, the abstract seems like the perfect place to investigate how well the scientific community conveys research to the public.

My research assistant and I decided to look at publications from both the APA and APS from 2019.  We chose the flagship journal from APS, Psychological Science, and a prominent journal from APA, the Journal of Personality and Social Psychology.  The first task was to determine the sample size of journal articles needed.  A priori, roughly six months of articles seemed like a minimum.  We quickly discovered that there was actually very little variance in the quality of the abstracts from APS, so the most recent three issues, containing thirty-two articles, were selected.  The same lack of variability was observed for JPSP, so only the most recent issue, with ten manuscripts, was included.

The JARS for abstracts lists nine elements that should be included in abstracts (see below).  Please feel free to investigate for yourself whether journal abstracts are following these guidelines.

JARS ABSTRACT GUIDELINES

The first investigation was to code how well each of the forty-two abstracts conformed to the guidelines.  Two independent raters examined each abstract.  There were few disagreements in coding, but where they occurred the more liberal coding was accepted, favoring compliance with JARS and working against the hypothesis that journals were not adhering to the guidelines.
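To make that liberal resolution rule concrete, here is a minimal sketch, assuming each rater simply marks each of the nine JARS elements as present or absent (the data layout and example values are hypothetical, not the actual coding sheet):

```python
# Hypothetical sketch of the liberal disagreement rule: an element counts as
# present if either rater coded it as present.

JARS_ELEMENTS = 9  # JARS lists nine elements an abstract should include

def resolve_liberal(rater_a, rater_b):
    """Combine two raters' present/absent codes (1/0) for one abstract,
    resolving any disagreement in favor of compliance."""
    assert len(rater_a) == len(rater_b) == JARS_ELEMENTS
    return [a or b for a, b in zip(rater_a, rater_b)]

# Example: the raters disagree on the third element; liberal coding counts it.
rater_a = [1, 1, 0, 0, 0, 0, 0, 0, 0]
rater_b = [1, 1, 1, 0, 0, 0, 0, 0, 0]
resolved = resolve_liberal(rater_a, rater_b)
print(sum(resolved), "of", JARS_ELEMENTS, "JARS elements coded as present")  # 3 of 9
```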

An experiment was also conducted with 61 undergraduates to see how well the current abstracts informed them compared with versions of the same abstracts (3 randomly chosen from the 42) reworded to adhere more closely to the JARS.  Comprehension was measured with 30 questions related to methods and results.  The outcome was dismal for the journals.

Even with liberal coding, more than 50% of the abstracts included two or fewer JARS elements, and only four had more than half the JARS elements.  The scores of students who read the revised abstracts (M = 15.35, 95% CI [13.2, 17.5]) were significantly better than those who read the original abstracts (M = 9.23, 95% CI [8.1, 10.3]), with a large effect size; t(59) = 5.08, p < .001, d = 1.3.  On one question regarding which statistical test was used, correct answers improved from only 4.4% with the original abstracts to 63.4% with the revised abstracts.  It isn’t even close.  Current abstracts are fluff, omitting information necessary to assess validity.  And though very little information in the abstracts allows a validity assessment, nearly without exception the authors were sure to provide applications of the findings.
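As a quick sanity check on the numbers above, the reported effect size can be recovered from the t statistic and the group sizes.  A small sketch, assuming a roughly even 30/31 split of the 61 students (the split is my assumption; only the total was reported):

```python
from math import sqrt

# Reported independent-samples result: t(59) = 5.08, 61 students in total.
# A roughly even 30/31 split between conditions is assumed here.
t = 5.08
n_original, n_revised = 30, 31

# Cohen's d for two independent groups, recovered from t:
#   d = t * sqrt(1/n1 + 1/n2)
d = t * sqrt(1 / n_original + 1 / n_revised)
print(round(d, 2))  # ~1.3, consistent with the reported d = 1.3
```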

The study demonstrated that the most recent published articles from both APA and APS did very poorly in following JARS, and that their abstracts were much less informative regarding the facts needed to assess methodological and statistical validity than abstracts that followed JARS.  It was suggested that journals should not only ensure compliance but also include in abstracts additional elements necessary to assess internal, external and statistical validity.  Increasing abstract word limits to 500 would allow space for reporting essential statistical tests and method information.  This would not only allow academics to filter research more efficiently but would also allow anyone without subscription access to evaluate research at no financial cost.  Interestingly, JAMA has provided free, detailed online overviews of research on COVID-19 that are not restricted to the 150-250 words of most abstracts.

Unfortunately, to date no APS or APA journal has decided to publish the study.  The manuscript hasn’t even made it past the editor gatekeepers.  Of course, journals have their own standards and are free to publish whatever they please.  And it is possible I am biased in thinking my research is good enough to at least allow reviewers a look.  Granted, I am no Bill Shakespeare or Al Einstein, but I haven’t argued that this research is anything more than what it is: decent evidence to consider improving abstracts.

I can’t help but wonder, though, if at least one reason for not considering the research is the perceived black eye for the journals and studies included in the analysis.  I have tried not to let my thoughts wander there, but one response I received from Dan Simons, an editor at the APS journal Advances in Methods and Practices in Psychological Science, inspired me to write about my experience with peer review, with the purpose of informing some while commiserating with others.

I’m sure Dan meant well, but I thought his pedantic criticisms exposed the non-objective approach to peer review that makes it so frustrating.  Dan felt that evaluating 42 of the most current abstracts and testing 61 undergraduates with thirty questions on three randomly selected abstracts was insufficient to draw “firm conclusions”.  I don’t know how many more would be needed for “firm” evidence, and he didn’t say either.  He also could not understand why we would use the APA JARS guidelines for abstracts to evaluate JPSP, let alone any APS journal.  He didn’t provide an alternative, but there really isn’t one, so . . .

It was a given that Dan would be displeased that the study was not preregistered with him to get permission to conduct science, as that was the very purpose of the journal’s creation.  He also couldn’t understand the use of only three randomly selected abstracts from the set.  To be fair, apparently Dan missed the three times we said the three abstracts were selected randomly.  I have witnessed those types of errors in understanding from reviewers many times before.  When a reviewer obviously doesn’t accurately understand your basic methods, it calls into question the validity of their other suggestions.  Maybe he also doesn’t realize you can’t expect subjects to provide reliable data when they are asked to read and be tested over a long period of time.  Despite the fact that the methods, materials and abstracts were transparently available to anyone wishing to falsify our claims, Dan couldn’t accept the results because our measures had no “validity”.  In other words, multiple studies preceding this one have not shown a “valid” way to use JARS to evaluate manuscripts.  We were mavericks going beyond approved methods despite their transparency and replicability.

Dan, on one hand, said the research couldn’t reach “firm conclusions”, but then flipped the script and argued that the results were obvious.  “Could the result have been anything other than what was observed?” he quipped, without offering an alternative.  Yes, we added more pertinent information to the abstracts and readers extracted that information from them.  Not quantum theory, for sure, but that was the hypothesis: more information equals more understanding.  So, if your research hypothesis seems likely to be confirmed, you should not demonstrate it empirically, even though apparently no one else has seen what should have been obvious?  And if it is so obvious, why aren’t these journals doing what they claim they should?

He ended with the pro forma pat on the head and a “try again next time.”  Maybe I’m just jaded.  However, I would rather be told that this doesn’t fit the journal and that space is limited than be given personal opinions devoid of objective critique based on violations of scientific principles.  Journals only have so much space, but perhaps they can allow more for their abstracts so that more people can evaluate scientific claims effectively.  Dan hoped his “feedback will prove helpful”, but it only highlighted the non-scientific constraints peer review places on scientific publishing.