Surveillance Publishing

Björn Brembs recently explained how

“massive over-payment of academic publishers has enabled them to buy surveillance technology covering the entire workflow that can be used not only to be combined with our private data and sold, but also to make algorithmic (aka ‘evidence-led’) employment decisions.”

Reading about this led me to this article:

• Jefferson D. Pooley, Surveillance publishing.

It’s all about what publishers are doing to make money by collecting data on the habits of their readers. Let me quote a bunch!

After a general introduction to surveillance capitalism, Pooley turns to “surveillance publishing”. Their prime example: Elsevier. I’ll delete the scholarly footnotes here:

Consider Elsevier. The Dutch publishing house was founded in the late nineteenth century, but it wasn’t until the 1970s that the firm began to launch and acquire journal titles at a frenzied pace. Elsevier’s model was Pergamon, the postwar science-publishing venture established by the brash Czech-born Robert Maxwell. By 1965, around the time that Garfield’s Science Citation Index first appeared, Pergamon was publishing 150 journals. Elsevier followed Maxwell’s lead, growing at a rate of 35 titles a year by the late 1970s. Both firms hiked their subscription prices aggressively, making huge profits off the prestige signaling of Garfield’s Journal Impact Factor. Maxwell sold Pergamon to Elsevier in 1991, months before his lurid death.

Elsevier was just getting started. The firm acquired The Lancet the same year, when the company piloted what would become ScienceDirect, its Web-based journal delivery platform. In 1993 the Dutch publisher merged with Reed International, a UK paper-maker turned media conglomerate. In 2015, the firm changed its name to RELX Group, after two decades of acquisitions, divestitures, and product launches—including Scopus in 2004, Elsevier’s answer to ISI’s Web of Science. The “shorter, more modern name,” RELX explained, is a nod to the company’s “transformation” from publisher to a “technology, content and analytics driven business.” RELX’s strategy? The “organic development of increasingly sophisticated information-based analytics and decisions tools”. Elsevier, in other words, was to become a surveillance publisher.

Since then, by acquisition and product launch, Elsevier has moved to make good on its self-description. By moving up and down the research lifecycle, the company has positioned itself to harvest behavioral surplus at every stage. Tracking lab results? Elsevier has Hivebench, acquired in 2016. Citation and data-sharing software? Mendeley, purchased in 2013. Posting your working paper or preprint? SSRN and Bepress, 2016 and 2017, respectively. Elsevier’s “solutions” for the post-publication phase of the scholarly workflow are anchored by Scopus and its 81 million records.

Curious about impact? Plum Analytics, an altmetrics company, acquired in 2017. Want to track your university’s researchers and their work? There’s the Pure “research information management system,” acquired in 2012. Measure researcher performance? SciVal, spun off from Scopus in 2009, which incorporates media monitoring service Newsflo, acquired in 2015.

Elsevier, to repurpose a computer science phrase, is now a full-stack publisher. Its products span the research lifecycle, from the lab bench through to impact scoring, and even—by way of Pure’s grant-searching tools—back to the bench, to begin anew. Some of its products are, you might say, services with benefits: Mendeley, for example, or even the ScienceDirect journal-delivery platform, provide reference management or journal access for customers and give off behavioral data to Elsevier. Products like SciVal and Pure, up the data chain, sell the processed data back to researchers and their employers, in the form of “research intelligence.”

It’s a good business for Elsevier. Facebook, Google, and Bytedance have to give away their consumer-facing services to attract data-producing users. If you’re not paying for it, the Silicon Valley adage has it, then you’re the product. For Elsevier and its peers, we’re the product and we’re paying (a lot) for it. Indeed, it’s likely that windfall subscription-and-APC profits in Elsevier’s “legacy” publishing business have financed its decade-long acquisition binge in analytics.

As Björn Brembs recently Tweeted:

“massive over-payment of academic publishers has enabled them to buy surveillance technology covering the entire workflow that can be used not only to be combined with our private data and sold, but also to make algorithmic (aka ‘evidence-led’) employment decisions.”

This is insult piled on injury: Fleece us once only to fleece us all over again, first in the library and then in the assessment office.

Elsevier’s prediction products sort and process mined data in a variety of ways. The company touts what it calls its Fingerprint® Engine, which applies machine learning techniques to an ocean’s worth of scholarly texts—article abstracts, yes, but also patents, funding announcements, and proposals. Presumably trained on human-coded examples (scholar-designated article keywords?), the model assigns keywords (e.g., “Drug Resistance”) to documents, together with what amounts to a weighted score (e.g., 73%). The list of terms and scores is, the company says, a “Fingerprint.” The Engine is used in a variety of products, including Expert Lookup (to find reviewers), the company’s Journal Finder, and its Pure university-level research-management software. In the latter case, it’s scholars who get Fingerprinted:

“Pure applies semantic technology and 10 different research-specific keyword vocabularies to analyze a researcher’s publications and grant awards and transform them into a unique Fingerprint—a distinct visual index of concepts and a weighted list of structured terms.”

But it’s not just Elsevier:

The machine learning techniques that Elsevier is using are of a piece with RELX’s other predictive-analytics businesses aimed at corporate and legal customers, including LexisNexis Risk Solutions. Though RELX doesn’t provide specific revenue figures for its academic prediction products, the company’s 2020 SEC disclosures indicate that over a third of Elsevier’s revenue comes from databases and electronic reference products–a business, the company states, in which “we continued to drive good growth through content development and enhanced machine learning and natural language processing based functionality”.

Many of Elsevier’s rivals appear to be rushing into the analytics market, too, with a similar full research-stack data harvesting strategy. Taylor & Francis, for example, is a unit of Informa, a UK-based conglomerate whose roots can be traced to Lloyd’s List, the eighteenth-century maritime-intelligence journal. In its 2020 annual report, the company wrote that it intends to “more deeply use and analyze the first party data” sitting in Taylor & Francis and other divisions, to “develop new services based on hard data and behavioral data insights.”

Last year Informa acquired the Faculty of 1000, together with its OA F1000Research publishing platform. Not to be outdone, Wiley bought Hindawi, a large independent OA publisher, along with its Phenom platform. The Hindawi purchase followed Wiley’s 2016 acquisition of Atypon, a researcher-facing software firm whose online platform, Literatum, Wiley recently adopted across its journal portfolio. “Know thy reader,” Atypon writes of Literatum. “Construct reports on the fly and get visualization of content usage and users’ site behavior in real time.” Springer Nature, to cite a third example, sits under the same Holtzbrinck corporate umbrella as Digital Science, which incubates startups and launches products across the research lifecycle, including the Web of Science/Scopus competitor Dimensions, data repository Figshare, impact tracker Altmetric, and many others.

So, the definition of ‘diamond open access’ should include: no surveillance.

7 Responses to Surveillance Publishing

  1. https://scipost.org/ seems to tick all the right boxes.

    I would be interested in hearing from anyone (here or via email) who has experience with them.

    • John Baez says:

      What’s the idea of SciPost? When you say “tick all the right boxes”, do you mean they’re good in lots of ways, or bad in lots of ways?

      • I mean good in lots of ways. Check them out. High quality, peer-reviewed journals, free to authors and readers. One thing I like about them is that while of course one can put one’s article on arXiv, they actually publish it as well. That is the main problem with some arXiv-overlay journals: they depend on arXiv for publication. While the idea of arXiv is good, in practice it doesn’t work because some papers are blocked and no reasons are given (I’m talking about papers which have appeared in the main journals in the field written by people who have many papers on arXiv already). Like it or not, (too) many people rely (exclusively) on arXiv. Someone who has problems with arXiv suffers enough already. A journal which can’t publish a paper which it has accepted because some anonymous arXiv moderator rejects it without mentioning a reason can’t really call itself a journal.

        The main reason that it is possible to have free (to readers and writers) online journals is because the cost of hosting a PDF on the web is negligible. All online journals should do so, in addition to allowing the paper to be on arXiv.

        I haven’t published anything with SciPost so far. At the moment I have a small technical problem but receive no response from their tech support, even though I have registered with them. If anyone knows a way to get my complaint heard let me know. (I gave up with Google Scholar; there seems to be no way to get mistakes corrected, even though I have my profile set up so that I am asked to confirm updates. Hopefully I won’t have to give up here.) It is depressing to see true open-access journals get almost everything right then fail for a trivial reason, e.g. no response to technical questions (hopefully that is only a temporary problem), or not adding a line of HTML code to link to a PDF they already have in their system, meaning that the journal policies are controlled by anonymous arXiv moderators or artificial “intelligence”.

        • linasv says:

          I can confirm the oddball arXiv experience: If I post math papers, no problem. If I post outside of that, e.g. AI work, the posts are rejected, with comments like “Your work is very important. You should publish it in a journal”. Several AI researchers have experienced this; one of them, Eray Ozkural, claims that the prestige AI/AGI authors are trying to suppress competing work while also plagiarizing results. (FWIW Eray has a hyperbolic and argumentative personality.)

          As an itinerant science researcher, non-university associated, floating between gigs, coughing up the dough for journals is tough. Heck, simply listing an “affiliation” is tough. (I float because I want to work on what I want to work on, but not everyone wants to pay for that over multi-year time-scales.)

          I’m commenting here because my research work will enable even faster, better, stronger and more powerful surveillance. (Sorry!?) In general, I am working on generic ways to extract meaning and structure from complex streams of data, of any origin, in any format. Observing what scientists do? Sure; it’s not really different than observing what genes and proteins do. Or what doctors and patients do (that’s my most highly cited paper).

          Yes, we should strive for Diamond OA in publishing. The ethics of applying big-data surveillance to human activities, and not just galaxies and supernovae, is a distinct issue. And then the elephant in the room: the forces of capitalism are rarely channeled by ethical concerns. We profit from bullets, bombs and cigarettes. Why not profit from surveillance? Dorothy might have said “Toto, we’re not in Kansas any more” except we haven’t gotten to the rainbow part yet; the intellectual, memetic, capitalistic AI tornado is gathering strength.

        • I’ve commented more extensively on my arXiv experience in another thread here on John’s blog (thanks to him for letting me air my frustration). A couple of points:

          To some extent, I can understand the “publish it in a journal” response, as it indicates that respected refereed journals are the generally accepted standard and some sort of filtering is needed (to see what happens without it, look at viXra). I work in astronomical cosmology. Monthly Notices of the Royal Astronomical Society is one of the handful of major journals in that field, but my feud with arXiv started when they didn’t allow a paper published in that journal into the obviously appropriate astro-ph category.

          No reason given. I got some really famous colleagues in the field to try to find out what happened (arXiv refused to talk to a low-life such as myself). It turns out that the situation is even worse, at least in my case, as my paper wasn’t rejected due to its contents per se, but because arXiv has to sacrifice a few good papers to avoid getting sued by crackpots. Or so they think. Sounds incredible? I admit that it surprised me. But that is apparently what is going on. Sadly, I don’t think that that will become well known until arXiv is forced to own up to its policies under oath in court.

          John has offered me the opportunity to write a guest post on my arXiv troubles here. I’ll get back to him relatively soon on that, so stay tuned. I wanted to write things up in an article for a journal first, and point to that in the post. (Obviously, I couldn’t put my rant on arXiv, so have to find some other way of reaching the community, most of whom don’t read John’s blog.) Interestingly, the two obvious venues (if anyone has any other suggestions, please let me know, either here or via email) seem ruled out: one rejected my contribution after almost a year, despite regular prodding as to its status, and I haven’t heard from the other one at all (no submission yet, just a general inquiry as to whether they might be interested in airing this topic). I think that in both cases it is down to fear that publicly criticizing arXiv leads to being banned from arXiv, the modern-day equivalent of excommunication, which most people don’t want to risk.

  2. Jim Stuttard says:

    We can add this blatant fraud from 2009: Merck and Elsevier “collaborated to produce something called ‘The Australasian Journal of Bone and Joint Medicine’. This appears to have looked like a real journal, complete with the Elsevier logo and a board of review editors, but it apparently featured nothing but articles (complimentary articles, needless to say) about Merck products.

    Update: It appears that Merck and Elsevier actually set up a whole publishing division, Excerpta Medica, to handle these things. The news broke about a month ago in The Australian, and the story has been rolling downhill ever since, getting larger all the way. Now Elsevier has issued a public apology for their part in the whole affair, as well they should.” https://www.science.org/content/blog-post/merck-elsevier-and-fakery

    More details here: https://laikaspoetnik.wordpress.com/2009/05/08/mercks-ghostwriters-haunted-papers-and-fake-elsevier-journals/

  3. I don’t see how anyone can take Elsevier seriously at all for any reason after the shenanigans at Chaos, Solitons & Fractals.

    https://en.wikipedia.org/wiki/Mohamed_El_Naschie
