Our prediction for the New Year is that readers of The Brief will appear more intelligent to their colleagues, more informed to their bosses, more attractive to their partners (or prospective partners), and more knowledgeable to their children (excepting teenaged children, of course) as compared with the control group of non-readers (we pity the control group).
In case anyone was still wondering if generative AI platforms intend to disintermediate content delivery by scientific and scholarly publishers, Google has helpfully answered the question explicitly. Earlier in the month, Google launched Gemini, its late-to-the-game answer to ChatGPT. One of the use cases for Gemini that Google points out (via a promotional video) is scanning and summarizing the scientific literature:
“Over the lunch break, Gemini read 200,000 papers for us, filtered it down to 250, and extracted the data.”
The video is careful to mention that these are open access (OA) papers. Google Scholar has spent years comprehensively indexing the scientific and scholarly literature. As part of that indexing process, it tracks which papers are OA and where various OA versions of papers might be found. Google’s index is sophisticated enough to sift through OA papers in hybrid journals. Google Scholar will also note the existence of an OA author accepted manuscript (AAM) version of a paper in PubMed Central. In other words, Google is not going to limit its use of OA papers to only those in OA journals. It is indexing at an article level.
The implications of this granular indexing for licensing arrangements are worth noting. As more and more articles are published on an OA basis, even if in hybrid journals or via green OA, Google Gemini (and other generative AI platforms) will have more free-to-them content to incorporate into their platforms. This raises the question as to whether liberal reuse licenses might need to be revisited by publishers. The CC BY license is looking increasingly like a mechanism to transfer value from scientific and scholarly publishers to the world’s wealthiest tech companies. It is no surprise that many of the leading generative AI companies (Google, Microsoft, Amazon) are supporters of Creative Commons.
At the same time publishers are gifting more and more content to US tech companies, such companies are becoming more cautious about securing rights to the content used in their models. The primary customer base for Gemini and other generative AI platforms is enterprise customers (corporations, universities). Such organizations are concerned about IP infringement because they do not wish to be sued (which is a very real possibility: just as The Brief was going to press, the New York Times announced a copyright infringement lawsuit against OpenAI and Microsoft). Generative AI platforms are therefore reassuring – and indemnifying – their customers against IP infringement lawsuits related to use of their foundational models. Such indemnification would only apply to the “foundational model” and not to additional content that the enterprise customer applied to the AI. So if a company hoovered up all the non-OA research literature, put it in a database, and asked Gemini (or ChatGPT or Amazon Q or what have you) to summarize it, that would presumably not be covered by the indemnification. But the line for “foundational” is not entirely clear (to us at least). When is content “in the model” versus “used in the training set” versus “accessible to the model”? And when Gemini “reads” through the scientific literature, is it using the entire Google Scholar index but only “reading” the full text for the OA papers (noting that, in most cases, Google has pulled the full text even for non-OA content into its database, even if it does not display full text to users)?
Assuming there is value in incorporating scientific and scholarly literature (the portion that has not been given away via CC BY licenses) into generative AI platforms (as Google seems to think), the critical question is, Who pays? Do generative AI platform providers pay to license non-OA content and bake it into their foundational models, or do enterprise customers license the content for use only in their instance of the AI platform? The answer to this question will determine the size of the opportunity, and how to reach it, for publishers.
A New DEAL: Tiered Pricing
In the September issue of The Brief we wrote about the missed opportunity by Elsevier and Germany’s DEAL Consortium to include society titles in the tiered pricing scheme incorporated into the transformative agreement between the two parties. Elsevier and DEAL negotiated two tiers of author fees but reserved the top tier for only Lancet and Cell titles.
And so we were delighted to see that Wiley and DEAL, in announcing a new agreement (a renewal of their landmark 2019 deal), have developed three pricing tiers and have included society titles in the tiers. The new agreement with DEAL places most society journals in Tier 2 (the middle tier). The top tier (Tier 1) is reserved for journals that “deliver high levels of service or impact.” Wiley reserved the lowest tier for the majority of its proprietary titles.
Here is how the new agreement reads:
Predefined Criteria for Tiers. Tier 1 (Baseline PAR Fee €3,150) contains Hybrid Journals that deliver high levels of Service and impact as a result of Investment and alignment with relevant research communities. Tier 1 Journals meet one or more of the following criteria: employed Editors-in-Chief or Editorial Office staff, and/or Journals that deliver objectively high impact for their disciplines. In all cases, Tier 1 Journals also have a list price APC of at least €3,500. All Journals that are not in Tier 1 are allocated to Tier 2 or Tier 3, which are governed by ownership. Tier 2 includes all other society-owned Journals (Baseline PAR Fee €2,700) and Tier 3 includes all other Wiley-owned Journals (Baseline PAR Fee €2,200). Changes in APC alone are insufficient to move Journals between Tiers.
Wiley and DEAL are to be applauded for this agreement, which both provides a better model for quality publications and includes society titles. Our only quibble is with the PAR prices themselves. The majority of society titles (Tier 2) will receive a small decrease in per-article revenue as compared with the current (2019) agreement. The current agreement provides a PAR fee of €2,750 per article for all titles, whereas the new agreement pays Tier 2 titles €2,700. €50 is perhaps a small difference but a painful one once inflation since 2019 is factored in. While we would have liked to see Tier 2 titles priced at parity with the 2019 deal (as, no doubt, would have Wiley), we note that this remains a better PAR fee than the one received by society journals with Elsevier (€2,550) under that publisher’s agreement with the German consortium. We also would have preferred a much higher PAR fee for Tier 1 titles, closer to the €6,450 that the consortium will pay for publication in a Lancet or Cell title. Producing a high-impact, highly selective journal (especially one with staff editors) is expensive. Even the €6,450 PAR fee secured by Elsevier is unlikely to cover the costs for such journals, never mind the €3,150 Tier 1 fee in the Wiley deal. But it is an important step in the right direction.
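The €50 nominal decrease understates the real-terms gap. A back-of-envelope sketch, assuming an illustrative ~16% cumulative inflation since 2019 (our assumption, not a figure from the agreement):

```python
# Back-of-envelope: real-terms change in the Tier 2 PAR fee.
# The 16% cumulative inflation figure is an illustrative assumption,
# not a number from the Wiley-DEAL agreement.
par_2019 = 2750.0            # EUR per article under the 2019 deal
par_tier2 = 2700.0           # EUR per article for Tier 2 under the new deal
cumulative_inflation = 0.16  # assumed cumulative inflation since 2019

# What the 2019 fee would need to be today to hold its value:
real_parity = par_2019 * (1 + cumulative_inflation)
shortfall = real_parity - par_tier2

print(round(real_parity))  # 3190 EUR for real-terms parity
print(round(shortfall))    # 490 EUR real-terms gap per article
```

Under that assumption, the effective per-article cut is closer to €490 than €50.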
These quibbles aside, this marks another landmark deal that (we hope) can be used as a structure (if not specific price points) for other transformative deals. It is through this lens that we look forward to hearing more on the renewal between Springer Nature and DEAL announced earlier this month.
In other Wiley news, the Hindawi saga continues to weigh heavily on the publisher. Wiley noted in its second quarter fiscal report (for the quarter ending October 31, 2023) that revenue in its Research unit “was down 5%, or 7% at constant currency, mainly due to the Hindawi publishing pause.” In the wake of the challenges with Hindawi, Wiley made the extraordinary announcement that the company is officially discontinuing use of the Hindawi brand. In light of the troubles that have come with Hindawi, this decision seems wise but must have been painful given the cost of the acquisition.
Wiley purchased Hindawi to increase its article output to better position the company for output-based OA remuneration models (including both APCs and transformative agreements). Such models require publishing more and more papers, but scholarly publishing remains a prestige market, and so growth must be accompanied by careful management and strong oversight.
While it’s easy to dunk on Wiley for this mess, it’s worth giving Wiley credit for how it is responding. We have our doubts that other publishers equally reliant on guest-edited special issues would be as forthcoming should similar issues be discovered. Wiley has reorganized its management and is putting significant effort into cleanup. The transparency and information sharing coming out of Wiley’s cleanup efforts are admirable, and this month Wiley released a whitepaper on “Tackling publication manipulation at scale.” The paper provides a look at the processes that Wiley has put in place to review the relatively massive corpus of questionable papers from Hindawi journals, and displays an openness toward sharing that information with competitors across the community. One important lesson Wiley shares is that bad actors have infiltrated many of the third-party tools used by publishers, particularly services that provide recommendations for peer reviewers, suggesting that Wiley is not the only company that needs to get its house in order.
With the explosion in the availability of AI tools, maintaining research integrity is going to continue to be a major problem for scholarly publishers. The development of automated tools is always going to lag behind those looking for new ways to commit publication fraud. This is an area where human intervention is likely to remain an essential tool in keeping the research literature as reliable as possible. Cooperation and information sharing will be essential to these efforts, and Wiley is setting a good example for others to follow.
The European Commission released a report that models scenarios for future development of its open access publishing platform – Open Research Europe (ORE) – which is currently supplied by F1000. One of the more salient points of the report is the recognition that the free nature (no publication charges) of this Diamond OA platform has “little or no impact on publication choice for authors with access to the necessary funds.” ORE currently serves as a platform for funded authors, and avoiding APCs appears to be a much lower priority than the other reasons driving author choice (a journal’s perceived quality and reputation, reaching the right audience, and speed of publication). ORE’s lack of a Journal Impact Factor, along with its strict data availability requirements, has resulted in the platform seeing relatively low uptake (277 papers in 2023). The most likely projected scenario for the ORE platform has it roughly following the track of F1000Research, which in its 11th year of existence published around 1,600 articles. Even at this “medium” growth level, the ORE platform will still only account for 0.13% of EU research publications.
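The report's own figures give a sense of scale. A quick sketch deriving the implied size of the EU publication market from the numbers above (1,600 articles at the "medium" scenario said to represent 0.13% of EU research publications):

```python
# Implied annual EU research output, derived from the ORE report's
# own projections: ~1,600 articles/year at the "medium" scenario
# is stated to be 0.13% of EU research publications.
ore_articles_medium = 1_600
ore_share = 0.0013  # 0.13%

implied_eu_output = ore_articles_medium / ore_share
print(round(implied_eu_output))  # ~1,230,769 publications per year
```

In other words, even the optimistic growth path leaves ORE a rounding error against well over a million EU publications annually.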
Even with these modest results, the scenarios presented rely upon significant changes in the market, particularly a shift in the role publications play in researcher assessment. Overall, ORE seems to be undergoing the same struggles as Plan S (as discussed in last month’s The Brief). It is no easy task to implement a top-down policy while still trying to maintain some level of academic freedom and researcher choice. In an era where competition for authors is only increasing and publishers are seeking to differentiate themselves by offering a superior author experience, a slow platform that lacks prestige indicators may continue to struggle to gain traction. One has to question whether, given the projected €2,400 cost per paper, it makes sense for ORE to go through all the trouble of running its own platform when so many less-expensive alternatives are already available to its authors.
EWL Updates (and Lack Thereof)
The Chinese Academy of Sciences National Science Library (CAS NSL) is expected to soon issue the 2024 edition of the International Journal Early Warning List (EWL). Usually released on December 31, the list comprises international STM journals that CAS NSL’s Journal Ranking team has deemed to need additional management from publishers to ensure they provide an appropriate channel for publication of Chinese research. Criteria for inclusion on the list mainly focus on potential research integrity issues but can include journals seen as having disproportionately high APCs or growth in the numbers of Chinese authors. (For more information on the EWL, it’s worth reading this Scholarly Kitchen interview with Dr. Liying Yang, Director of the Scientometrics and Research Assessment Department at CAS NSL. C&E’s report on STM publishing in China provides additional context, including actions for publishers.)
Journals placed on the EWL can expect submissions from Chinese authors to dry up, and editors may receive requests from authors to withdraw accepted manuscripts from imminent publication. For journals dependent on Chinese research, the economic hit can be serious. Publishers can engage with the EWL team and provide data to show how they’ve improved editorial and publishing processes, which will often result in the journal being removed from the EWL after one year.
But even if a journal is removed from the EWL, the negative image persists in the Chinese research community. Compounding this issue is that many Chinese research universities have their own journal warning lists, and in recent years have often taken to simply basing them on the EWL. While CAS NSL will often remove a journal from the EWL after one year following efforts by the publisher to address EWL criteria, universities often don’t update their lists. So once on the EWL, a publisher can find it is also on a near-permanent series of university warning lists.
An example of this is the “2023 List of Negative Journals for Academic Papers” of the influential University of Science and Technology of China (USTC). Released on March 31, 2023, this list was made available on Chinese scholarly social media in early December and has been the subject of much discussion in the Chinese research community. The list includes 77 journals, of which 71 are from the EWL. (The remaining 6 journals include engineering and computer science titles that were likely already on USTC’s radar.) Of the 71 “copy and paste” journals, 64 date from the 2020 EWL. CAS removed all but 8 of these 64 journals from the EWL in 2021, but unfortunately for the publishers concerned, their journals remain on the USTC watchlist three years later.
USTC is one of China’s top universities, a member of the country’s elite C9 league of institutions, and is ranked 57th on the Times Higher Education’s World University Rankings for 2024. (The C9 league was created by the Chinese government in 2009 as a select group tasked with advancing higher education within China and increasing China’s research standing globally.) So how it handles matters like journal warning lists has significant influence in the Chinese research community, and thus on international publishers.
Unfortunately USTC is not alone, as many other Chinese research universities also simply borrow from the EWL and other broad-based lists and fail to update them. There is growing awareness in China of this problem and the negative impact it has on not just international publishers, but the authors, editors, and reviewers who have formed a community around a targeted journal.
Discussions are now beginning among certain Chinese stakeholders about how to address this problem, but resolution will be complex and thus likely take some time.
Once the 4th largest publisher in the market, Taylor & Francis has seen its publication output (as measured by articles) largely eclipsed by its traditional rivals, not to mention MDPI (with Frontiers closing in rapidly). With article output a key success metric in an OA market, Taylor & Francis has no choice but to grow its program and did so this month with the acquisition of Future Science Group (FSG) and its 32 journals and roughly 2,000 articles published per year as of 2022. FSG is no stranger to controversy, having previously sold several of its titles to the predatory publisher OMICS, but has this time chosen a better purchaser that will ensure the legitimacy of its former journals going forward.
We join the scholarly publishing community in sadness at the passing of Bruce Rosenblum. Bruce was a passionate advocate for workflow improvements in scholarly publishing and later for ALS research, and the impact of his work and life will live on.
Sarah Main has joined Elsevier as Vice President for Academic and Government Relations UK.
Andrew Pace has been named Executive Director of the Association of Research Libraries.
Nick Ishmael-Perkins has been named Editor-in-Chief of the American Chemical Society’s Chemical & Engineering News (C&EN). In its centennial year, C&EN has gone through a difficult period after the death of Mohammed Yahia last year, who had previously been named the new Editor-in-Chief.
Though more likely a mid-life crisis than an obituary, the OA movement seems to be going through something of a reckoning in recent months. The Leiden Madtrics blog describes OA as a “band-aid” approach and calls for a radical rethinking of journal prestige. Meanwhile, Richard Poynder offered an in-depth explanation of his abandonment of the OA movement, suggesting that “what had been conceived as a bottom-up movement founded on principles of voluntarism morphed into a top-down system of command and control, and open access evolved into an oppressive bureaucratic process…”
Stanford’s John Ioannidis and colleagues take aim in a preprint at the increase in “extremely productive publishing behavior,” which is defined in the preprint as publishing more than 60 papers per year and is suggested to imply fraudulent misbehavior. The irony of Ioannidis’ pot calling the kettle black was pointed out by Carl Bergstrom, who notes that Ioannidis published 324 papers from 2020 to 2022, placing him directly in the crosshairs of his own study. Another paper published this month by the extremely productive Ioannidis offers a defense of quantitative metrics, rather than qualitative approaches, for researcher assessment.
On the surface, the news that more than 10,000 research papers were retracted in 2023, a leap from just over 4,000 last year, would seem alarming. A deeper read, however, shows that 8,000 of those retractions came from the aforementioned Hindawi debacle, putting non-Hindawi retractions at half the level of the previous year. What that means as far as the trustworthiness of the literature depends on whether one assumes the Hindawi retractions mark a singular, corrected event, or rather a harbinger of more special issue related problems to come.
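The arithmetic behind that claim, using the approximate figures above, checks out:

```python
# Sanity-check the retraction arithmetic reported for 2023.
total_2023 = 10_000   # total retractions in 2023 (approximate)
hindawi_2023 = 8_000  # retractions attributable to the Hindawi cleanup
prior_year = 4_000    # total retractions the previous year (approximate)

non_hindawi_2023 = total_2023 - hindawi_2023
print(non_hindawi_2023)               # 2000 non-Hindawi retractions
print(non_hindawi_2023 / prior_year)  # 0.5, i.e., half the prior-year level
```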
The sustainability challenges for Diamond OA journals were highlighted this month as the independently published Journal of International Students saw a mass editorial resignation when it began charging authors an APC.
Another useful approach to research integrity was announced this month by IOP Publishing, which has pledged to donate any APC revenue from retracted papers to Research4Life. IOP Publishing is making a clear statement here about the importance of research integrity, and creating a financial punishment for itself in cases where it fails. It will be interesting to see if other publishers are so willing to put their money where their mouths are regarding research integrity.
The American Society for Cell Biology (ASCB) has begun offering a novel member benefit. All member submissions to the society’s flagship journal, Molecular Biology of the Cell (MBoC), are guaranteed to be sent out for peer review. We recall when this was standard practice at many societies and are pleased to see this practice making a comeback. That said, with the proliferation of desk rejects in recent years, we wonder whether some members will wish to opt out (and whether there is a mechanism to do so). While some members may value the service, others may prefer to move on quickly if their paper has little chance of being accepted, as opposed to waiting on reviews.
Springer Nature this month touted the milestone of its peer review platform (Snapp – Springer Nature Article Processing Platform, or as we put it back in April, “a name that is simultaneously overly optimistic yet exceedingly dull while conjuring to mind a mildly refreshing apple beverage”) handling its 1 millionth paper. While we assume that things have improved with Snapp since 2022 when a Springer Nature editor declared it to be “not even close to fit-for-purpose,” the fact that a five-year-old platform is still only in use by one-third of Springer Nature’s journals points to the complex nature of submission and peer review systems. From the outside they seem like they would be relatively straightforward to build and run, and yet time and time again they prove a much more difficult challenge than expected.
One of the great promises of AI for scholarly journals is their use in writing summaries for non-specialist readers. bioRxiv has launched a pilot to see if current LLMs (in this case one from ScienceCast) are up to the task. From the relatively small (and potentially skewed) sampling of responses to the announcement on Twitter, scientists seem supportive of the concept but largely unhappy, at least so far, with the LLM’s performance, suggesting there’s still a ways to go before LLM-generated summaries are where they need to be.
BioOne has joined the many organizations looking to pilot Subscribe to Open (S2O) models for OA. As Lisa Janicke Hinchliffe explained this month, S2O agreements generally fly under the radar as far as institutional scrutiny, at least for now, because they present the appearance of a subscription, rather than participation in supporting access more broadly beyond the institution.
The STM Association, DataCite, and Crossref have issued a joint statement of best practices for research data sharing.
STM has also issued a separate and useful whitepaper for publishers featuring “Ethical and Practical Guidelines for the Use of Generative AI in the Publication Process.” On the other side of the equation, Nature warns of the increasing number of misleading research papers being published due to the use of AI in generating results. But if you’re launching a journal covering research on the use of AI in medicine, should you allow AI (specifically LLMs) to be used in writing articles? Absolutely yes, say the editors of NEJM AI.
AI provides a tool that allows researchers to ‘play’ with the data and parameters until the results are aligned with the expectations. – Lior Shamir