## Unreliable Biomedical Research

13 January, 2014

An American drug company, Amgen, tried to replicate 53 landmark studies in cancer and was able to reproduce the original results in only 6 cases, even though they worked with the original researchers!

That’s not all. Scientists at the pharmaceutical company Bayer were able to reproduce the published results in just a quarter of 67 studies!

How could things be so bad? The picture here shows two reasons:

If most interesting hypotheses are false, a lot of positive results will be ‘false positives’. Negative results may be more reliable. But few people publish negative results, so we miss out on those!
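To get a feel for the sizes involved, here is a rough sketch in Python. All the numbers are assumptions chosen for illustration, not figures from the studies above: 1000 hypotheses are tested, 10% of them are true, a real effect is detected 80% of the time, and the significance threshold is 0.05.

```python
def count_outcomes(n_hypotheses, fraction_true, power, alpha):
    """Expected number of each outcome if every hypothesis is tested once."""
    n_true = n_hypotheses * fraction_true
    n_false = n_hypotheses - n_true
    true_positives = power * n_true        # real effects that are detected
    false_negatives = n_true - true_positives
    false_positives = alpha * n_false      # non-effects that look significant
    true_negatives = n_false - false_positives
    return {
        "true_positives": true_positives,
        "false_negatives": false_negatives,
        "false_positives": false_positives,
        "true_negatives": true_negatives,
    }


# Illustrative assumptions only: 1000 hypotheses, 10% true,
# 80% power, significance threshold 0.05.
counts = count_outcomes(1000, 0.10, 0.80, 0.05)

positives = counts["true_positives"] + counts["false_positives"]
negatives = counts["true_negatives"] + counts["false_negatives"]

wrong_among_positives = counts["false_positives"] / positives
wrong_among_negatives = counts["false_negatives"] / negatives

print(f"share of positive results that are wrong: {wrong_among_positives:.2f}")
print(f"share of negative results that are wrong: {wrong_among_negatives:.2f}")
```

Under these assumptions roughly a third of the positive results are wrong, while only a few percent of the negative results are. Changing the assumed fraction of true hypotheses changes the picture a lot, which is the point.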

And then there’s wishful thinking, sloppiness and downright fraud. Read this Economist article for more on the problems—and how to fix them:

Trouble at the lab, Economist, 18 October 2013.

That’s where I got the picture above.

## Levels of Excellence

29 September, 2013

Over on Google+, a computer scientist at McGill named Artem Kaznatcheev passed on this great description of what it’s like to learn math, written by someone who calls himself ‘man after midnight’:

The way it was described to me when I was in high school was in terms of ‘levels’.

Sometimes, in your mathematics career, you find that your slow progress, and careful accumulation of tools and ideas, has suddenly allowed you to do a bunch of new things that you couldn’t possibly do before. Even though you were learning things that were useless by themselves, when they’ve all become second nature, a whole new world of possibility appears. You have “leveled up”, if you will. Something clicks, but now there are new challenges, and now, things you were barely able to think about before suddenly become critically important.

It’s usually obvious when you’re talking to somebody a level above you, because they see lots of things instantly when those things take considerable work for you to figure out. These are good people to learn from, because they remember what it’s like to struggle in the place where you’re struggling, but the things they do still make sense from your perspective (you just couldn’t do them yourself).

Talking to somebody two levels above you is a different story. They’re barely speaking the same language, and it’s almost impossible to imagine that you could ever know what they know. You can still learn from them, if you don’t get discouraged, but the things they want to teach you seem really philosophical, and you don’t think they’ll help you, but for some reason, they do.

Somebody three levels above is actually speaking a different language. They probably seem less impressive to you than the person two levels above, because most of what they’re thinking about is completely invisible to you. From where you are, it is not possible to imagine what they think about, or why. You might think you can, but this is only because they know how to tell entertaining stories. Any one of these stories probably contains enough wisdom to get you halfway to your next level if you put in enough time thinking about it.

What follows is my rough opinion on how this looks in a typical path towards a Ph.D. in math. Obviously this is rather subjective, and makes math look too linear, but I think it’s a useful thought experiment.

Consider the change that a person undergoes in first mastering elementary algebra. Let’s say that that’s one level. This student is now comfortable with algebraic manipulation and the idea of variables.

The next level may come somewhere during a first calculus course. The student now understands the concept of the infinitely small, of slope at a point, and can reason about areas, physical motion, and optimization.

Many stop here, believing that they have finally learned math. Those who do not stop, might proceed through multivariable calculus and perhaps a basic linear algebra course with the tools they currently possess. Their next level comes when they find themselves suffering through an abstract algebra course, and have to once again reshape their whole thought process just to squeak by with a C.

Once this student masters all of that, the rest of the undergraduate curriculum at their university might be a breeze. But not so with graduate school. They gain a level their first year. They gain another their third year. And they are horrified to discover that they are expected to gain a third level before they graduate. This level is the hardest of them all, because it is the first one that consists in mastering material that has been created largely by the student.

I don’t know how many levels there are after that. At least three.

So, the bad news is, you never do see the whole picture (though you see the old picture shrink down to a tiny point), and you can’t really explain what you do see. But the good news is that the world of mathematics is so rich and exciting and wonderful that even your wildest dreams about it cannot possibly compare. It is not like seeing the Matrix—it is like seeing the Matrix within the Matrix within the Matrix within the Matrix within the Matrix.

As he points out, this talk of ‘levels’ is too linear. You can be much better at algebraic geometry than your friend, but way behind them in probability theory. Or even within a field like algebraic geometry, you might be able to understand sheaf cohomology better than your friend, yet still be way behind in some classical topic like elliptic curves.

To have worthwhile conversations with someone who is not evenly matched with you in some subject, it’s often good for one of you to play ‘student’ while the other plays ‘teacher’. Playing teacher is an ego boost, and it helps organize your thoughts – but playing student is a great way to amass knowledge and practice humility… and a good student can help the teacher think about things in new ways.

Taking turns between who is teacher and who is student helps keep things from becoming unbalanced. And it’s especially fun when some subject can only be understood with the combined knowledge of both players.

I have a feeling good mathematicians spend a lot of time playing these games—we often hear of famous teams like Atiyah, Bott and Singer, or even bigger ones like the French collective called ‘Bourbaki’. For about a decade, I played teacher/student games with James Dolan, and it was really productive. I should probably find a new partner to learn the new kinds of math I’m working on now. Trying to learn things by yourself is a huge disadvantage if you want to quickly rise to higher ‘levels’.

If we took things a bit more seriously and talked about them more, maybe a lot of us could get better at things faster.

Indeed, after I passed on these remarks, T.A. Abinandanan, a professor of materials science in Bangalore, pointed out this study on excellence in swimming:

• Daniel Chambliss, The mundanity of excellence.

Chambliss emphasizes that in swimming there really are discrete levels of excellence, because there are different kinds of swimming competitions, each with their own different ethos. Here are some of his other main points:

1) Excellence comes from qualitative changes in behavior, not just quantitative ones. More time practicing is not good enough. Nor is simply moving your arms faster! A low-level breaststroke swimmer does very different things than a top-ranked one. The low-level swimmer tends to pull her arms far back beneath her, kick the legs out very wide without bringing them together at the finish, lift herself high out of the water on the turn, and fail to go underwater for a long ways after the turn. The top-ranked one sculls her arms out to the side and sweeps back in, kicks narrowly with the feet finishing together, stays low on the turns, and goes underwater for a long distance after the turn. They’re completely different!

2) The different levels of excellence in swimming are like different worlds, with different rules. People can move up or down within a level by putting in more or less effort, but going up a level requires something very different—see point 1).

3) Excellence is not the product of socially deviant personalities. The best swimmers aren’t “oddballs,” nor are they loners—kids who have given up “the normal teenage life”.

4) Excellence does not come from some mystical inner quality of the athlete. Rather, it comes from learning how to do lots of things right.

5) The best swimmers are more disciplined. They’re more likely to be strict with their training, come to workouts on time, watch what they eat, sleep regular hours, do proper warmups before a meet, and the like.

6) Features of the sport that low-level swimmers find unpleasant, excellent swimmers enjoy. What others see as boring – swimming back and forth over a black line for two hours, say – the best swimmers find peaceful, even meditative, or challenging, or therapeutic. They enjoy hard practices, look forward to difficult competitions, and try to set difficult goals.

7) The best swimmers don’t spend a lot of time dreaming about big goals like winning the Olympics. They concentrate on “small wins”: clearly defined minor achievements that can be rather easily done, but produce real effects.

8) The best swimmers don’t “choke”. Faced with what seems to be a tremendous challenge or a strikingly unusual event such as the Olympic Games, they take it as a normal, manageable situation. One way they do this is by sticking to the same routines. Chambliss calls this the “mundanity of excellence”.

I’ve just paraphrased chunks of the paper. The whole thing is worth reading! I can’t help wondering how much these lessons apply to other areas. He gives an example that could easily apply to mathematics—a

more personal example of failing to maintain a sense of mundanity, from the world of academia: the inability to finish the doctoral thesis, the hopeless struggle for the magnum opus. Upon my arrival to graduate school some 12 years ago, I was introduced to an advanced student we will call Michael. Michael was very bright, very well thought of by his professors, and very hard working, claiming (apparently truthfully) to log a minimum of twelve hours a day at his studies. Senior scholars sought out his comments on their manuscripts, and their acknowledgements always mentioned him by name. All the signs pointed to a successful career. Yet seven years later, when I left the university, Michael was still there – still working 12 hours a day, only a bit less well thought of. At last report, there he remains, toiling away: “finishing up,” in the common expression.

In our terms, Michael could not maintain his sense of mundanity. He never accepted that a dissertation is a mundane piece of work, nothing more than some words which one person writes and a few other people read. He hasn’t learned that the real exams, the true tests (such as the dissertation requirement) in graduate school are really designed to discover whether at some point one is willing just to turn the damn thing in.

## Why Most Published Research Findings Are False

11 September, 2013

My title here is the eye-catching (but exaggerated!) title of this well-known paper:

• John P. A. Ioannidis, Why most published research findings are false, PLoS Medicine 2 (2005), e124.

It’s open-access, so go ahead and read it! Here is his bold claim:

Published research findings are sometimes refuted by subsequent evidence, with ensuing confusion and disappointment. Refutation and controversy is seen across the range of research designs, from clinical trials and traditional epidemiological studies to the most modern molecular research. There is increasing concern that in modern research, false findings may be the majority or even the vast majority of published research claims. However, this should not be surprising. It can be proven that most claimed research findings are false. Here I will examine the key factors that influence this problem and some corollaries thereof.

He’s not really talking about all ‘research findings’, just research that uses the

ill-founded strategy of claiming conclusive research findings solely on the basis of a single study assessed by formal statistical significance, typically for a p-value less than 0.05.

His main interests are medicine and biology, but many of the problems he discusses are more general.

His paper is a bit technical—but luckily, one of the main points was nicely explained in the comic strip xkcd:

If you try 20 or more things, you should not be surprised when an event with probability less than 0.05 = 1/20 happens at least once! It’s nothing to write home about… and nothing to write a scientific paper about.
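The arithmetic is short enough to check directly. This sketch assumes the tests are independent, that none of the effects being tested is real, and that each test uses the threshold 0.05. Those are simplifying assumptions, not something the paper requires.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n_tests independent tests of
    nonexistent effects comes out 'significant' at level alpha."""
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 5, 20, 100):
    p = prob_at_least_one_false_positive(n_tests)
    print(f"{n_tests:>3} tests: {p:.2f}")
```

With 20 tests the chance of at least one spurious ‘discovery’ is already well over one half.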

Even researchers who don’t make this mistake deliberately can do it accidentally. Ioannidis draws several conclusions, which he calls corollaries:

Corollary 1: The smaller the studies, the less likely the research findings are to be true. (If you test just a few jelly beans to see which ones ‘cause acne’, you can easily fool yourself.)

Corollary 2: The smaller the effects being measured, the less likely the research findings are to be true. (If you’re studying whether jelly beans cause just a tiny bit of acne, you can easily fool yourself.)

Corollary 3: The more quantities there are to find relationships between, the less likely the research findings are to be true. (If you’re studying whether hundreds of colors of jelly beans cause hundreds of different diseases, you can easily fool yourself.)

Corollary 4: The greater the flexibility in designing studies, the less likely the research findings are to be true. (If you use lots and lots of different tricks to see if different colors of jelly beans ‘cause acne’, you can easily fool yourself.)

Corollary 5: The more financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true. (If there’s huge money to be made selling acne-preventing jelly beans to teenagers, you can easily fool yourself.)

Corollary 6: The hotter a scientific field, and the more scientific teams involved, the less likely the research findings are to be true. (If lots of scientists are eagerly doing experiments to find colors of jelly beans that prevent acne, it’s easy for someone to fool themselves… and everyone else.)

Ioannidis states his corollaries in more detail; I’ve simplified them to make them easy to understand, but if you care about this stuff, you should read what he actually says!
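Several of these corollaries can be seen in a simple calculation of how likely a claimed finding is to be true. The sketch below is a simplified model that ignores bias altogether. It assumes a field where the odds that a tested relationship is real are `prior_odds`, each study has the given `power` and threshold `alpha`, and a finding gets claimed if at least one of `n_teams` independent teams sees a significant result. The function name and the numbers are made up for illustration.

```python
def chance_finding_is_true(prior_odds, power, alpha=0.05, n_teams=1):
    """Probability that a claimed finding is true, in a no-bias model.

    prior_odds: odds (true : false) that a tested relationship is real
    power:      chance that one study detects a real relationship
    alpha:      chance that one study 'detects' a nonexistent one
    n_teams:    independent teams testing the same question; the finding
                is claimed if at least one of them gets a positive result
    """
    beta = 1 - power
    claimed_if_true = 1 - beta ** n_teams
    claimed_if_false = 1 - (1 - alpha) ** n_teams
    return (prior_odds * claimed_if_true) / (
        prior_odds * claimed_if_true + claimed_if_false
    )


# Illustrative assumptions: one real relationship for every four false ones.
baseline = chance_finding_is_true(prior_odds=0.25, power=0.8)

# Corollaries 1 and 2: small studies and small effects both mean low power.
low_power = chance_finding_is_true(prior_odds=0.25, power=0.2)

# Corollary 3: many candidate relationships means low odds for each one.
long_shot = chance_finding_is_true(prior_odds=0.01, power=0.8)

# Corollary 6: many teams chasing the same question.
hot_field = chance_finding_is_true(prior_odds=0.25, power=0.8, n_teams=10)

for label, value in [("baseline", baseline), ("low power", low_power),
                     ("long shot", long_shot), ("hot field", hot_field)]:
    print(f"{label:>10}: {value:.2f}")
```

Corollaries 4 and 5 are about bias, which this model leaves out, so it can only make things look better than they are.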

### The Open Science Framework

Since his paper came out—and many others on this general theme—people have gotten more serious about improving the quality of statistical studies. One effort is the Open Science Framework.

Here’s what their website says:

The Open Science Framework (OSF) is part network of research materials, part version control system, and part collaboration software. The purpose of the software is to support the scientist’s workflow and help increase the alignment between scientific values and scientific practices.

Document and archive studies.

Move the organization and management of study materials from the desktop into the cloud. Labs can organize, share, and archive study materials among team members. Web-based project management reduces the likelihood of losing study materials due to computer malfunction, changing personnel, or just forgetting where you put the damn thing.

Share and find materials.

With a click, make study materials public so that other researchers can find, use and cite them. Find materials by other researchers to avoid reinventing something that already exists.

Detail individual contribution.

Assign citable, contributor credit to any research material – tools, analysis scripts, methods, measures, data.

Increase transparency.

Make as much of the scientific workflow public as desired – as it is developed or after publication of reports. Find public projects here.

Registration.

Registering materials can certify what was done in advance of data analysis, or confirm the exact state of the project at important points of the lifecycle such as manuscript submission or at the onset of data collection. Discover public registrations here.

Manage scientific workflow.

A structured, flexible system can provide efficiency gain to workflow and clarity to project objectives, as pictured.

### CONSORT

Another group trying to improve the quality of scientific research is CONSORT, which stands for Consolidated Standards of Reporting Trials. This is mainly aimed at medicine, but it’s more broadly applicable.

The key here is the “CONSORT Statement”, a 25-point checklist saying what you should have in any paper about a randomized controlled trial, and a flow chart saying a bit about how the experiment should work.

### What else?

What other major efforts are being made to improve the quality of scientific research?

## The Selected Papers Network (Part 4)

29 July, 2013

guest post by Christopher Lee

In my last post, I outlined four aspects of walled gardens that make them very resistant to escape:

• walled gardens make individual choice irrelevant, by transferring control to the owner, and tying your one remaining option (to leave the container) to being locked out of your professional ecosystem;

• all competition is walled-garden competition;

• walled garden competition is winner-take-all;

• even if the “good guys” win (build the biggest walled garden), they become “bad guys” (masters of the walled garden, whose interests become diametrically opposed to those of the people stuck in their walled garden).

To state the obvious: even if someone launched a new site with the perfect interface and features for an alternative system of peer review, it would probably starve to death both for lack of users and lack of impact. Even the rare user who found the site and switched all his activity to it would have little or no impact, because almost no one would see his reviews or papers. Indeed, even if the Open Science community launched dozens of sites exploring various useful new approaches for scientific communication, that might make Open Science’s prospects worse rather than better. Since each of these sites would in effect be a little walled garden (for reasons I outlined last time), their number and diversity would mainly serve to fragment the community (i.e. the membership and activity on each such site might be ten times less than it would have been if there were only a few such sites). When your strengths (diversity; lots of new ideas) act as weaknesses, you need a new strategy.

SelectedPapers.net is an attempt to offer such a new strategy. It represents only about two weeks of development work by one person (me), and has only been up for about a month, so it can hardly be considered the last word in the manifold possibilities of this new strategy. However, this bare bones prototype demonstrates how we can solve the four ‘walled garden dilemmas’:

Enable walled-garden users to ‘levitate’: to be ‘in’ the walled garden but ‘above’ it at the same time. There’s nothing mystical about this. Think about it: that’s what search engines do all the time. A search engine pulls material out of all the world’s walled gardens, and gives it a new life by unifying it based on what it’s about. All selectedpapers.net does is act as a search engine that indexes content by what paper and what topics it’s about, and who wrote it.

This enables isolated posts by different people to come together in a unified conversation about a specific paper (or topic), independent of what walled gardens they came from—while simultaneously carrying on their full, normal life in their original walled garden.

Concretely, rather than telling Google+ users (for example) they should stop posting on Google+ and post only on selectedpapers.net instead (which would make their initial audience plunge to near zero), we tell them to add a few tags to their Google+ post so selectedpapers.net can easily index it. They retain their full Google+ audience, but they acquire a whole new set of potential interactions and audience (trivial example: if they post on a given paper, selectedpapers.net will display their post next to other people’s posts on the same paper, resulting in all sorts of possible crosstalk).

Some people have expressed concern that selectedpapers.net indexes Google+, rightly pointing out that Google+ is yet another walled garden. Doesn’t that undercut our strategy to escape from walled gardens? No. Our strategy is not to try to find a container that is not a walled garden; our strategy is to ‘levitate’ content from walled gardens. Google+ may be a walled garden in some respects, but it allows us to index users’ content, which is all we need.

It should be equally obvious that selectedpapers.net should not limit itself to Google+. Indeed, why should a search engine restrict itself to anything less than the whole world? Of course, there’s a spectrum of different levels of technical challenges for doing this. And this tends to produce an 80-20 rule, where 80% of the value can be attained by only 20% of the work. Social networks like Google+, Twitter etc. provide a large portion of the value (potential users), for very little effort—they provide open APIs that let us search their indexes, very easily. Blogs represent another valuable category for indexing.

More to the point, far more important than technology is building a culture where users expect their content to ‘fly’ unrestricted by walled-garden boundaries, and adopt shared practices that make that happen easily and naturally. Tagging is a simple example of that. By putting the key metadata (paper ID, topic ID) into the user’s public content, in a simple, standard way (as opposed to hidden in the walled garden’s proprietary database), tagging makes it easy for anyone and everyone to index it. And the more users get accustomed to the freedom and benefits this provides, the less willing they’ll be to accept walled gardens’ trying to take ownership (i.e. control) of the users’ own content.
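To make the idea concrete, here is a minimal sketch of how tagged public posts from different places could be pulled into one conversation per paper. It is not the selectedpapers.net code. The tag syntax is an assumption made up for this example (a `#spnetwork` marker, a paper ID written as `arXiv:NNNN.NNNN`, and any other hashtag treated as a topic), and the posts, authors and paper IDs are invented.

```python
import re
from collections import defaultdict

# Hypothetical tag syntax, assumed for illustration only:
#   "#spnetwork"       marks a post as meant for indexing
#   "arXiv:NNNN.NNNN"  names the paper being discussed
#   any other "#word"  is treated as a topic tag
PAPER_ID = re.compile(r"\barXiv:(\d{4}\.\d{4,5})\b")
HASHTAG = re.compile(r"#(\w+)")


def extract_metadata(text):
    """Return (paper_ids, topics) for a tagged post, or None if untagged."""
    tags = HASHTAG.findall(text)
    if "spnetwork" not in tags:
        return None
    paper_ids = PAPER_ID.findall(text)
    topics = [tag for tag in tags if tag != "spnetwork"]
    return paper_ids, topics


def build_index(posts):
    """Group posts by paper ID, no matter which site they came from."""
    by_paper = defaultdict(list)
    for source, author, text in posts:
        metadata = extract_metadata(text)
        if metadata is None:
            continue  # not tagged, so leave it alone
        paper_ids, _topics = metadata
        for paper_id in paper_ids:
            by_paper[paper_id].append((source, author))
    return dict(by_paper)


# Invented example posts from three different places.
posts = [
    ("google+", "alice", "Lovely proof in section 3. #spnetwork arXiv:1306.0001 #topology"),
    ("blog", "bob", "I think lemma 2 has a gap. #spnetwork arXiv:1306.0001"),
    ("twitter", "carol", "Lunch was great #food"),
]

index = build_index(posts)
print(index)
```

The metadata lives in the public text of each post, so anyone could write an indexer like this without access to any site’s internal database.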

Don’t compete; cooperate: if we admit that it will be extremely difficult for a small new site (like selectedpapers.net) to compete with the big walled gardens that surround it, you might rightly ask, what options are left? Obviously, not to compete. But concretely, what would that mean?

☆ enable users in a walled garden to liberate their own content by tagging and indexing it;

☆ add value for those users (e.g. for mathematicians, give them LaTeX equation support);

☆ use the walled garden’s public channel as your network transport—i.e. build your community within and through the walled garden’s community.

This strategy treats the walled garden not as a competitor (to kill or be killed by) but instead as a partner (that provides value to you, and that you in turn add value to). Moreover, since this cooperation is designed to be open and universal rather than an exclusive partnership (concretely, anyone could index selectedpapers.net posts, because they are public), we can best describe this as public data federation.

Any number of sites could cooperate in this way, simply by:

☆ sharing a common culture of standard tagging conventions;

☆ treating public data (i.e. viewable by anybody on the web) as public (i.e. indexable by anybody);

☆ drawing on the shared index of global content (i.e. when the index has content that’s relevant to your site’s users, let them see and interact with it).

To anyone used to the traditional challenges of software interoperability, this might seem like a tall order—it might take years of software development to build such a data federation. But consider: by using Google+’s open API, selectedpapers.net has de facto established such a data federation with Google+, one of the biggest players in the business. Following the checklist:

☆ selectedpapers.net offers a very simple tagging standard, and more and more Google+ users are trying it;

☆ Google+ provides the API that enables public posts to be searched and indexed. Selectedpapers.net in turn ensures that posts made on selectedpapers.net are visible to Google+ by simply posting them on Google+;

☆ Selectedpapers.net users can see posts from (and have discussions with) Google+ users who have never logged into (or even heard of) selectedpapers.net, and vice versa.

Now consider: what if someone set up their own site based on the open source selectedpapers.net code (or even wrote their own implementation of our protocol from scratch)? What would they need to do to ensure 100% interoperability (i.e. our three federation requirements above) with selectedpapers.net? Nothing. That federation interoperability is built into the protocol design itself. And since this is federation, that also means they’d have 100% interoperation with Google+ as well. We can easily do the same with Twitter, WordPress, and other public networks.

There are lots of relevant websites in this space. Which of them can we actually federate with in this way? This divides into two classes: those that have open APIs vs. those that don’t. If a walled garden has an API, you can typically federate with it simply by writing some code to use their API, and encouraging its users to start tagging. Everybody wins: the users gain new capabilities for free, and you’ve added value to that walled garden’s platform. For sites that lack such an API (typically smaller sites), you need more active cooperation to establish a data exchange protocol. For example, we are just starting discussions with arXiv and MathOverflow about such ‘federation’ data exchange.

To my mind, the most crucial aspect of this is sincerity: we truly wish to cooperate with (add value to) all these walled garden sites, not to compete with them (harm them). This isn’t some insidious commie plot to infiltrate and somehow destroy them. The bottom line is that websites will only join a federation if it benefits them, by making their site more useful and more attractive to users. Re-connecting with the rest of the world (in other walled gardens) accomplishes that in a very fundamental way. The only scenario I see where this would not seem advantageous, would be for a site that truly believes that it is going to achieve market dominance across this whole space (‘one walled garden to rule them all’). Looking over the landscape of players (big players like Google, Twitter, LinkedIn, Facebook, vs. little players focused on this space like Mendeley, ResearchGate, etc.), I don’t think any of the latter can claim this is a realistic plan—especially when you consider that any success in that direction will just make all other players federate together in self-defense.

Level the playing field: these considerations lead naturally to our third concern about walled gardens: walled garden competition strongly penalizes new, small players, and makes bigger players assume a winner-takes-all outcome. Concretely, selectedpapers.net (or any other new site) is puny compared with, say, Mendeley. However, the federation strategy allows us to turn that on its head. Mendeley is puny compared with Google+, and selectedpapers.net operates in de facto federation with Google+. How likely is it that Mendeley is going to crush Google+ as a social network where people discuss science? If a selectedpapers.net user could only post to other selectedpapers.net members (a small audience), then Mendeley wins by default. But that’s not how it works: a selectedpapers.net user has all of Google+ as his potential audience. In a federation strategy, the question isn’t how big you are, but rather how big your federation is. And in this day of open APIs, it is really easy to extend that de facto federation across a big fraction of the world’s social networks. And that is a level playing field.

Provide no point of control: our last concern about walled gardens was that they inevitably create a divergence of interests for the winning garden’s owner vs. the users trapped inside. Hence the best of intentions (great ideas for building a wonderful community) can truly become the road to hell—an even better walled garden. After all, that’s how the current walled garden system evolved (from the reasonable and beneficial idea of establishing journals). If any one site ‘wins’, our troubles will just start all over again. Is there any alternative?

Yes: don’t let any one site win; only build a successful federation. Since user data can flow freely throughout the federation, users can move freely within the federation, without losing their content, accumulated contacts and reputation, in short, their professional ecosystem. If a successful site starts making policies that are detrimental to users, they can easily vote with their feet. The data federation re-establishes the basis for a free market, namely unconstrained individual freedom of choice.

The key is that there is no central point of control. No one ‘owns’ (i.e. controls) the data. It will be stored in many places. No one can decide to start denying it to someone else. Anyone can access the public data under the rules of the federation. Even if multiple major players conspired together, anyone else could set up an alternative site and appeal to users: vote with your feet! As we know from history, the problem with senates and other central control mechanisms is that given enough time and resources, they can be corrupted and captured by both elites and dictators. Only a federation system with no central point of control has a basic defense: regardless of what happens at ‘the top’, all individuals in the system have freedom of choice between many alternatives, and anybody can start a new alternative at any time. Indeed, the key red flag in any such system is when the powers-that-be start pushing all sorts of new rules that hinder people from starting new alternatives, or freely migrating to alternatives.

Note that implicit in this is an assertion that a healthy ecosystem should contain many diverse alternative sites that serve different subcommunities, united in a public data federation. I am not advocating that selectedpapers.net should become the ‘one paper index to rule them all’. Instead, I’m saying we need one successful exemplar of a federated system, that can help people see how to move their content beyond the walled garden and start ‘voting with their feet’.

So: how do we get there? In my view, we need to use selectedpapers.net to prove the viability of the federation model in two ways:

☆ we need to develop the selectedpapers.net interface to be a genuinely good way to discuss scientific papers, and subscribe to others’ recommendations. It goes without saying that the current interface needs lots of improvements, e.g. to work past some of Google+’s shortcomings. Given that the current interface took only a couple of weeks of hacking by just one developer (yours truly), this is eminently doable.

☆ we need to show that selectedpapers.net is not just a prisoner of Google+, but actually an open federation system, by adding other systems to the federation, such as Twitter and independent blogs. Again, this is straightforward.

### To Be or Not To Be?

All of which brings us to the real question that will determine our fates. Are you for a public data federation, or not? In my view, if you seriously want reform of the current walled garden system, federation is the only path forward that is actually a path forward (instead of to just another walled garden). It is the only strategy that allows the community to retain control over its own content. That is fundamental.

And if you do want a public data federation, are you willing to work for that outcome? If not, then I think you don’t really want it, because you can contribute very easily. Even just adding #spnetwork tags to your posts (wherever you write them) is a very valuable contribution that enormously increases the value of the federation ecosystem.

One more key question: who will join me in developing the selectedpapers.net platform (both the software, and federation alliances)? As long as selectedpapers.net is a one-man effort, it must fail. We don’t need a big team, but it’s time to turn the project into a real team. The project has solid foundations that will enable rapid development of new federation partnerships (e.g. exciting, open APIs like REST) and of seamless, intuitive user interfaces (such as the MongoDB NoSQL database, and AJAX methods). A small, collaborative team will be able to push this system forward quickly in exciting, useful ways. If you jump in now, you can be one of the very first people on the team.

I want to make one more appeal. Whatever you think about selectedpapers.net as it exists today, forget about it.

Why? Because it’s irrelevant to the decision we need to make today: public data federation, yes or no? First, because the many flaws of the current selectedpapers.net have almost no bearing on that critical question (they mainly reflect the limitations of a version 0.1 alpha product). Second, because the whole point of federation is to ‘let a thousand flowers bloom’— to enable a diverse ecology of different tools and interfaces, made viable because they work together as a federation, rather than starving to death as separate, warring, walled gardens.

Of course, to get to that diverse, federated ecosystem, we first have to prove that one federated system can succeed, and liberate a bunch of minds in the process, starting with our own. We have to assemble a nucleus of users who are committed to making this idea succeed by using it, and a team of developers who are driven to build it. Remember, talking about the federation ideal will not by itself accomplish anything. We have to act, now; specifically, we have to quickly build a system that lets more and more people see the direct benefits of public data federation. If and when that is clearly successful, and growing sustainably, we can consider branching out, but not before.

For better or worse, in a world of walled gardens, selectedpapers.net is the one effort (in my limited knowledge) to do exactly that. It may be ugly, and annoying, and alpha, but it offers people a new and different kind of social contract than the walled gardens. (If someone can point me to an equivalent effort to implement the same public data federation strategy, we will of course be delighted to work with them! That’s what federation means).

The question now for the development of public data federation is whether we are working together to make it happen, or on the contrary whether we are fragmenting and diffusing our effort. I believe that public data federation is the Manhattan Project of the war for Open Science. It really could change the world in a fundamental and enduring way. Right now the world may seem headed the opposite direction (higher and higher walls), but it does not have to be that way. I believe that all of the required ingredients are demonstrably available and ready to go. The only remaining requirement is that we rise as a community and do it.

I am speaking to you, as one person to another. You as an individual do not even have the figleaf of saying “Well, if I do this, what’s the point? One person can’t have any impact.” You as an individual can change this project. You as an individual can change the world around you through what you do on this project.

## The Selected Papers Network (Part 3)

12 July, 2013

guest post by Christopher Lee

A long time ago in a galaxy far, far away, scientists (and mathematicians) simply wrote letters to each other to discuss their findings.

In cultured cities, they formed clubs for the same purpose; at club meetings, particularly juicy letters might be read out in their entirety. Everything was informal (bureaucracy-to-science ratio around zero), individual (each person spoke only for themselves, and made up their own mind), and direct (when Pierre wrote to Johan, or Nikolai to Karl, no one yelled “Stop! It has not yet been blessed by a Journal!”).

To use my nomenclature, it was a selected-papers network. And it worked brilliantly for hundreds of years, despite wars, plagues and severe network latency (ping times of $10^9$ msec).

Even work we consider “modern” was conducted this way, almost to the twentieth century: for example, Darwin’s work on evolution by natural selection was “published” in 1858, by his friends arranging a reading of it at a meeting of the Linnean Society. From this point of view, it’s the current journal system that’s a historical anomaly, and a very recent one at that.

I’ll spare you an essay on the problems of the current system. Instead I want to focus on the practical question of how to change the system. The nub of the question is a conundrum: how is it, that just as the Internet is reducing publication and distribution costs to zero, Elsevier, the Nature group and other companies have been aggressively raising subscription prices (for us to read our own articles!), in many cases to extortionate levels?

That publishing companies would seek to outlaw Open Access rules via cynical legislation like the “Research Works” Act goes without saying; that they could blithely expect the market to buy a total divorce of price vs. value reveals a special kind of economic illogic.

That illogic has a name: the Walled Garden—and it is the immovable object we are up against. Any effort we make must be informed by careful study of what makes its iniquities so robust.

I’ll start by reviewing some obvious but important points.

A walled garden is an empty container that people are encouraged to fill with their precious content—at which point it stops being “theirs”, and becomes the effective property of whoever controls the container. The key word is control. When Pierre wrote a letter to Johan, the idea that they must pay some ignoramus \$40 for the privilege would have been laughable, because there was no practical way for a third party to control that process. But when you put the same text in a journal, it gains control: it can block Pierre’s letter for any reason (or no reason); and it can lock out Johan (or any other reader) unless he pays whatever price it demands.

Some people might say this is just the “free market” at work—but that is a gross misunderstanding of the walled garden concept. Unless you can point to exactly how the “walls” lock people in, you don’t really understand it. For an author, a free market would be multiple journals competing to consider his paper (just as multiple papers compete for acceptance by a journal). This would be perfectly practical (they could all share the same set of 2-3 referee reports), but that’s not how journals decided to do it. For a reader or librarian, a free market would be multiple journals competing to deliver the same content (same articles): you choose the distributor that provides the best price and service.

Journals simply agree not to compete, by inserting a universal “non-compete clause” in their contract; not only are authors forced to give exclusive rights to one journal, they are not even permitted to seek multiple bids (let more than one journal at a time see the paper). The whole purpose of the walled garden is to eliminate the free market.

Do you want to reform some of the problems of the current system? Then you had better come to grips with the following walled garden principles:

• Walled gardens make individual choice irrelevant, by transferring control to the owner, and tying your one remaining option (to leave the container) to being locked out of your professional ecosystem.

• All the competition are walled gardens.

• Walled garden competition is winner-take-all.

• Even if the “good guys” win and become the biggest walled garden, they become “bad guys”: masters of the walled garden, whose interests become diametrically opposed to those of the people stuck in their walled garden.

To make these ideas concrete, let’s see how they apply to any reform effort such as selectedpapers.net.

### Walled gardens make individual choice irrelevant

Say somebody starts a website dedicated to such a reform effort, and you decide to contribute a review of an interesting paper. But such a brand-new site by definition has zero fraction of the relevant audience.

Question: what’s the point of writing a review, if it affects nothing and no one will read it? There is no point. Note that if you still choose to make that effort, this will achieve nothing. Individuals choosing to exile themselves from their professional ecosystem have no effect on the Walled Garden. Only a move of the whole ecosystem (a majority) would affect it.

Note this is dramatically different from a free market: even if I, a tiny flea, buy shares of the biggest, most traded company (AAPL, say), on the world’s biggest stock exchange, I immediately see AAPL’s price rise (a tiny bit) in response; when I sell, the price immediately falls in response. A free market is exquisitely sensitive to an individual’s decisions.

This is not an academic question. Many, many people have already tried to start websites with similar “reform” goals as selectedpapers.net. Unfortunately, none of them are gaining traction, for the same reasons that Diaspora has zero chance to beat Facebook.

(If you want to look at one of the early leaders, “open source”, and backed by none other than the Nature Publishing Group, check out Connotea.org. Or on the flip side, consider the fate of Mendeley.)

For years after writing the Selected-Papers Network paper, I held off from doing anything, because at that time I could not see any path for solving this practical problem.

### All the competition are walled gardens

In the physical world, walls do not build themselves, and they have a distressing (or pleasing!) tendency to fall down. In the digital world, by contrast, walls are not the exception but the rule.

A walled garden is simply any container whose data do not automatically interoperate with and in the outside world. Since it takes very special design to achieve any interoperability at all, nearly all websites are walled gardens by default.

More to the point, if websites A and B are competing with each other, is website A going to give B its crown jewels (its users and data)? No, it’s going to build the walls higher. Note that even if a website is open source (anyone can grab its code and start their own site), it’s still a walled garden because its users and their precious data are only stored in its site, and cannot get out.

The significance of this for us is that essentially every “reform” solution being pushed at us, from Mendeley on out to idealistic open source sites, is unfortunately in practice a walled garden. And that means users won’t own their own content (in the crucial sense of control); the walled garden will.

### Walled garden competition is winner-take-all

All this is made worse by the fact that walled garden competition has a strong tendency towards monopoly. It rewards consolidation and punishes small market players. In social networks, size matters. When a little walled garden tries to compete with a big walled garden, all advantages powerfully aid the big incumbent, even if the little one offers great new features. The whole mechanism of individuals “voting with their feet” can’t operate when the only choice available to them is to jump off a cliff: that is, leave the ecosystem where everyone else is.

### Even if you win the walled garden war, the community will lose

Walled gardens intrinsically create a divergence of interests between their owners vs. their users. By giving the owner control and locking in the users, it gives the owner a powerful incentive to expand and exploit his control, at the expense of users, with very little recourse for them. For example, I think my own motivations for starting selectedpapers.net are reasonably pure, but if—for the purpose of argument—it were to grow to dominate mathematics, I still don’t think you should let me (or anyone else) own it as a walled garden.

First of all, you probably won’t agree with many of my decisions; second, if Elsevier offers me \$100 million, how can you know I won’t just sell you out? That’s what the founders of Mendeley just did. Note this argument applies not just to individuals, but even to the duly elected representatives of your own professional societies. For example, in biology some professional societies have been among the most reactionary in fighting Open Access—because they make most of their money from “their” journals. Because they own a walled garden, their interests align with Elsevier, not with their own members.

Actually that’s the whole story of how we got in this mess in the first place. The journal system was started by good people with good intentions, as the “Proceedings” of their club meetings. But because it introduced a mechanism of control, it became a walled garden, with inevitable consequences. If we devote our efforts to a solution that in practice becomes a walled garden, the consequences will again be inevitable.

Why am I dwelling on all these negatives? Let’s not kid ourselves: this is a hard problem, and we are by no means the first to try to crack it. Most of the doors in this prison have already been tried by smart, hard-working people, and they did not lead out. Obviously I don’t believe there’s no way out, or I wouldn’t have started selectedpapers.net. But I do believe we all need to absorb these lessons, if we’re to have any chance of real success.

Roll these principles over in your mind; wargame the possible pathways for reform and note where they collide with one of these principles. Can you find a reliable way out?

In my next post I’ll offer my own analysis of where I think the weak link is. But I am very curious to hear what you come up with.

## Quantitative Reasoning at Yale-NUS College

27 June, 2013

What mathematics should any well-educated person know? It’s rather rare that people have a chance not just to think about this question, but do something about it. But it’s happening now.

There’s a new college called Yale-NUS College starting up this fall in Singapore, jointly run by Yale College and the National University of Singapore. The buildings aren’t finished yet: the above picture shows how a bit of it should look when they are. Faculty are busily setting up the courses and indeed the whole administrative structure of the university, and I’ve had the privilege of watching some of this and even helping out a bit.

It’s interesting because you usually meet an institution when it’s already formed—and you encounter and learn about only those aspects that matter to you. But in this case, the whole institution is being created, and every aspect discussed. And this is especially interesting because Yale-NUS College is designed to be a ‘liberal arts college for Asia for the 21st century’.

As far as I can tell, there are no liberal arts colleges in Asia. Creating a good one requires rethinking the generally Eurocentric attitudes toward history, philosophy, literature, classics and so on that are built into the traditional idea of the liberal arts. Plus, the whole idea of a liberal arts education needs to be rethought for the 21st century. What should a well-educated person know, and be able to do? Luckily, the faculty of Yale-NUS College are taking a fresh look at this question, and coming up with some new answers.

I’m really excited about the Quantitative Reasoning course that all students will take in the second semester of their first year. It will cover topics like this:

• innumeracy, use of numbers in the media.
• visualizing quantitative data.
• cognitive biases, operationalization.
• qualitative heuristics, cognitive biases, formal logic and mathematical proof.
• formal logic, mathematical proofs.
• probability, conditional probability (Bayes’ rule), gambling and odds.
• decision trees, expected utility, optimal decisions and prospect theory.
• sampling, uncertainty.
• quantifying uncertainty, hypothesis testing, p-values and their limitations.
• statistical power and significance levels, evaluating evidence.
• correlation and causation, regression analysis.

The idea is not to go into vast detail and not to bombard the students with sophisticated mathematical methods, but to help students:

• learn how to criticize and question claims in an informed way;

• learn to think clearly, to understand logical and intuitive reasoning, and to consider appropriate standards of proof in different contexts;

• develop a facility and comfort with a variety of representations of quantitative data, and practical experience in gathering data;

• understand the sources of bias and error in seemingly objective numerical data;

• become familiar with the basic concepts of probability and statistics, with particular emphasis on recognizing when these techniques provide reliable results and when they threaten to mislead us.

They’ll do some easy calculations using R, a programming language optimized for statistics.

Most exciting of all to me is how the course will be taught. There will be about 9 teachers. It will be ‘team-based learning’, where students are divided into (carefully chosen) groups of six. A typical class will start with a multiple choice question designed to test the students’ understanding of the material they’ve just studied. Then the team will discuss their answers, while professors walk around and help out; then they’ll take the quiz again; then one professor will talk about that topic.

This idea is called ‘peer instruction’. Some studies have shown this approach works better than the traditional lecture style. I’ve never seen it in action, though my friend Christopher Lee uses it now in his bioinformatics class, and he says it’s great. You can read about its use in physics here:

• Eric Mazur, Physics Education.

I’ll be interested to see it in action starting in August, and later I hope to teach part-time at Yale-NUS College and see how it works for myself!

At the very least, it’s exciting to see people try new things.

## The Selected Papers Network (Part 2)

14 June, 2013

Last time Christopher Lee and I described some problems with scholarly publishing. The big problems are expensive journals and ineffective peer review. But we argued that solving these problems requires new methods of

selection—assessing papers

and

endorsement—making the quality of papers known, thus giving scholars the prestige they need to get jobs and promotions.

The Selected Papers Network is an infrastructure for doing both these jobs in an open, distributed way. It’s not yet the solution to the big visible problems—just a framework upon which we can build those solutions. It’s just getting started, and it can use your help.

But before I talk about where all this is heading, and how you can help, let me say what exists now.

This is a bit dangerous, because if you’re not sure what a framework is for, and it’s not fully built yet, it can be confusing to see what’s been built so far! But if you’ve thought about the problems of scholarly publishing, you’re probably sick of hearing about dreams and hopes. You probably want to know what we’ve done so far. So let me start there.

### SelectedPapers.net as it stands today

SelectedPapers.net lets you recommend papers, comment on them, discuss them, or simply add them to your reading list.

But instead of “locking up” your comments within its own website—the “walled garden” strategy followed by many other services—it explicitly shares these data in a way that people not on SelectedPapers.net can easily see. Any other service can see and use them too. It does this by using existing social networks—so that users of those social networks can see your recommendations and discuss them, even if they’ve never heard of SelectedPapers.net!

The idea is simple. You add some hashtags to let SelectedPapers.net know you’re talking to it, and to let it know which paper you’re talking about. It notices these hashtags and copies your comments over to its publicly accessible database.

So far Christopher Lee has got it working on Google+. So right now, if you’re a Google+ user, you can post comments on SelectedPapers.net using your usual Google+ identity and posting process, just by including suitable hashtags. Your post will be seen by your usual audience—but also by people visiting the SelectedPapers.net website, who don’t use Google+.

If you want to strip the idea down to one sentence, it’s this:

Given that social networks already exist, all we need for truly open scientific communication is a convention on a consistent set of tags and IDs for discussing papers.

That makes it possible to integrate discussion from all social networks—big and small—as a single unified forum. It’s a federated approach, rather than a single isolated website. And it won’t rely on any one social network: after Google+, we can get it working for Twitter and other networks and forums.

But more about the theory later. How, exactly, do you use it?

### Getting Started

To see how it works, take a look here:

Under ‘Recent activity’ you’ll see comments and recommendations of different papers, so far mostly on the arXiv.

Support for other social networks such as Twitter is coming soon. But here’s how you can use it now, if you’re a member of Google+:

• We suggest that you first create (in your Google+ account) a Google+ Circle specifically for discussing research with (e.g. call it “Research”). If you already have such a circle, or circles, you can just use those.

• Click Sign in with Google on https://selectedpapers.net or on a paper discussion page.

• The usual Google sign-in window will appear (unless you are already signed in). Google will ask if you want to use the Selected Papers network, and specifically for what Circle(s) to let it see the membership list(s) (i.e. the names of people you have added to that Circle). SelectedPapers.net uses this as your initial “subscriptions”, i.e. the list of people whose recommendations you want to receive. We suggest you limit this to your “Research” circle, or whatever Circle(s) of yours fit this purpose.

Note the only information you are giving SelectedPapers.net access to is this list of names; in all other respects SelectedPapers.net is limited by Google+ to the same information that anyone on the internet can see, i.e. your public posts. For example, SelectedPapers.net cannot ever see your private posts within any of your Circles.

• Now you can initiate and join discussions of papers directly on any SelectedPapers.net page.

• Alternatively, without even signing in to SelectedPapers.net, you can just write posts on Google+ containing the hashtag #spnetwork, and they will automatically be included within the SelectedPapers.net discussions (i.e. indexed and displayed so that other people can reply to them etc.). Here’s an example of a Google+ post:

This article by Perelman outlines a proof of the Poincare conjecture!

#spnetwork #mustread #geometry #poincareConjecture arXiv:math/0211159

You need the tag #spnetwork for SelectedPapers.net to notice your post. Tags like #mustread, #recommend, and so on indicate your attitude to a paper. Tags like #geometry, #poincareConjecture and so on indicate a subject area: they let people search for papers by subject. A tag of the form arXiv:math/0211159 is necessary for arXiv papers; note that this does not include a # symbol.

For PubMed papers, include a tag of the form PMID:22291635. Other published papers usually have a DOI (digital object identifier), so for those include a tag of the form doi:10.3389/fncom.2012.00001.

Tags are the backbone of SelectedPapers.net; you can read more about them here.

• You can also post and see comments at https://selectedpapers.net. This page also lets you search for papers in the arXiv and search for published papers via their DOI or Pubmed ID. If you are signed in, the homepage will also show the latest recommendations (from people you’re subscribed to), papers on your reading list, and papers you tagged as interesting for your work.
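To make the tag convention above concrete, here is a minimal Python sketch of how a post could be scanned for the #spnetwork tag, paper IDs and hashtags. It is only an illustration: the function name `parse_post`, the regular expressions and the short list of "attitude" tags are assumptions made for this example, not the actual SelectedPapers.net code.

```python
import re

# Hypothetical patterns for the three paper-ID forms mentioned in the post.
# The real SelectedPapers.net parser may accept more (or different) forms.
PAPER_ID_PATTERNS = {
    # old-style "math/0211159" or new-style "1306.1234", optional version
    "arxiv": re.compile(
        r"\barXiv:((?:[a-z\-]+(?:\.[a-z]{2})?/\d{7}|\d{4}\.\d{4,5})(?:v\d+)?)",
        re.IGNORECASE,
    ),
    "pubmed": re.compile(r"\bPMID:(\d+)", re.IGNORECASE),
    "doi": re.compile(r"\bdoi:(10\.\d{4,9}/\S+)", re.IGNORECASE),
}
HASHTAG = re.compile(r"#(\w+)")

# Assumed, incomplete list: the post only names #mustread and #recommend.
ATTITUDE_TAGS = {"mustread", "recommend"}


def parse_post(text):
    """Return paper IDs and tags for a post, or None if it lacks #spnetwork."""
    hashtags = HASHTAG.findall(text)
    if "spnetwork" not in (tag.lower() for tag in hashtags):
        return None  # not addressed to the selected papers network

    papers = []
    for kind, pattern in PAPER_ID_PATTERNS.items():
        for paper_id in pattern.findall(text):
            # drop punctuation that merely ends the sentence
            papers.append((kind, paper_id.rstrip(".,;)")))

    others = [tag for tag in hashtags if tag.lower() != "spnetwork"]
    return {
        "papers": papers,
        "attitude": [t for t in others if t.lower() in ATTITUDE_TAGS],
        "topics": [t for t in others if t.lower() not in ATTITUDE_TAGS],
    }


post = (
    "This article by Perelman outlines a proof of the Poincare conjecture!\n\n"
    "#spnetwork #mustread #geometry #poincareConjecture arXiv:math/0211159"
)
result = parse_post(post)
print(result)
```

Run on the example post above, this picks out the arXiv ID `math/0211159`, the attitude tag `mustread` and the topic tags `geometry` and `poincareConjecture`. A post without #spnetwork is ignored, which mirrors the rule that the tag is what makes SelectedPapers.net notice a post.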

### Papers

Papers are the center of just about everything on the selected papers network. Here’s what you can currently do with a paper:

• click to see the full text of the paper via the arXiv or the publisher’s website.

• read other people’s recommendations and discussion of the paper.

• add it to your Reading List. This is simply a private list of papers—a convenient way of marking a paper for further attention later. When you are logged in, your Reading list is shown on the homepage. No one else can see your reading list.

• share the paper with others (such as your Google+ Circles or Google+ communities that you are part of).

• tag it as interesting for a specific topic. You do this either by clicking the checkbox of a topic (it shows topics that other readers have tagged the paper), by selecting from a list of topics that you have previously tagged as interesting to you, or by simply typing a tag name. These tags are public; that is, everyone can see what topics the paper has been tagged with, and who tagged them.

• post a question or comment about the paper, or reply to what other people have said about it. This traffic is public. Specifically, clicking the Discuss this Paper button gives you a Google+ window (with appropriate tags already filled in) for writing a post. Note that in order for the spnet to see your post, you must include Public in the list of recipients for your post (this is an inherent limitation of Google+, which limits apps to see only the same posts that any internet user would see – even when you are signed-in to the app as yourself on Google+).

• recommend it to others. Once again, you must include Public in the list of recipients for your post, or the spnet cannot see it.

We strongly suggest that you include a topic hashtag for your research interest area. For example, if there is a hashtag that people in your field commonly use for posting on Twitter, use it. If you have to make up a new hashtag, keep it intuitive and follow “camelCase” capitalization e.g. #openPeerReview.
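If you want to generate such hashtags automatically, the “camelCase” convention is easy to sketch in Python. The helper name `to_hashtag` is made up for this example, and it only handles plain ASCII letters and digits.

```python
import re


def to_hashtag(phrase):
    """Turn a phrase like 'open peer review' into '#openPeerReview'.

    Hypothetical helper: keeps only ASCII letters and digits, lowercases
    the first word and capitalizes the first letter of each later word.
    """
    words = re.findall(r"[A-Za-z0-9]+", phrase)
    if not words:
        raise ValueError("phrase contains no letters or digits")
    first, rest = words[0].lower(), words[1:]
    return "#" + first + "".join(w[:1].upper() + w[1:].lower() for w in rest)


print(to_hashtag("open peer review"))
print(to_hashtag("Poincare conjecture"))
```

The two calls give `#openPeerReview` and `#poincareConjecture`, matching the tags used as examples in this post.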

### Open design

Note that thanks to our open design, you do not even need to create a SelectedPapers.net login. Instead, SelectedPapers.net authenticates with Google (for example) that you are signed in to Google+; you never give SelectedPapers.net your Google password or access to any confidential information.

Moreover, even when you are signed in to SelectedPapers.net using your Google sign-in, it cannot see any of your private posts, only those you posted publicly—in other words, exactly the same as what anybody on the Internet can see.

### What to do next?

We really need some people to start using SelectedPapers.net and start giving us bug reports. The place to do that is here:

or if that’s too difficult for some reason, you can just leave a comment on this blog entry.

We could also use people who can write software to improve and expand the system. I can think of fifty ways the setup could be improved: but as usual with open-source software, what matters most is not what you suggest, but what you’re willing to do.

Next, let me mention three things we could do in the longer term. But I want to emphasize that these are just a few of many things that can be done in the ecosystem created by a selected papers network. We don’t need to all do the same thing, since it’s an open, federated system.

Overlay journals. A journal doesn’t need to do distribution and archiving of papers anymore: the arXiv or PubMed can do that. A journal can focus on the crucial work of selection and endorsement—it can just point to a paper on the arXiv or PubMed, and say “this paper is published”. Such journals, called overlay journals, are already being contemplated—see for example Tim Gowers’ post. But they should work better in the ecosystem created by a selected papers network.

Review boards. Publication doesn’t need to be a monogamous relation between a journal and an author. We could also have prestigious ‘review boards’ like the Harvard Genomics Board or the Institute of Network Science who pick, every so often, what they consider to be best papers in their chosen area. In their CVs, scholars could then say things like “this paper was chosen as one of the Top Ten Papers in Topology in 2015 by the International Topology Review Board”. Of course, boards would become prestigious in the usual recursive way: by having prestigious members, being associated with prestigious institutions, and correctly choosing good papers to bestow prestige upon. But all this could be done quite cheaply.

Open peer review. Last time, we listed lots of problems with how journals referee papers. Open peer review is a way to solve these problems. I’ll say more about it next time. For now, go here:

• Christopher Lee, Open peer review by a selected-papers network, Frontiers in Computational Neuroscience 6 (2012).

### A federated system

After reading this, you may be tempted to ask: “Doesn’t website X already do most of this? Why bother starting another?”

Here’s the answer: our approach is different because it is federated. What does that mean? Here’s the test: if somebody else were to write their own implementation of the SelectedPapers.net protocol and run it on their own website, would data entered by users of that site show up automatically on selectedpapers.net, and vice versa? The answer is yes, because the protocol transports its data on open, public networks, so the same mechanism that allows selectedpapers.net to read its users’ messages would work for anyone else. Note that no special communications between the new site and SelectedPapers.net would be required; it is just federated by design!
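The federation test can be illustrated with a toy model in Python. Everything in it is hypothetical: `public_stream` stands in for public posts on existing social networks, and `IndexSite` for any independent implementation of the protocol. The point of the sketch is only that two sites reading the same public data end up with the same discussions, without ever talking to each other.

```python
# A toy model of federation. All names are made up for illustration;
# this is not how SelectedPapers.net is actually implemented.
public_stream = []  # stand-in for public posts on existing networks


def publish(network, author, text):
    """Post publicly on some existing network (not on any one index site)."""
    public_stream.append({"network": network, "author": author, "text": text})


class IndexSite:
    """An independent site that indexes tagged public posts by paper ID."""

    def __init__(self, name):
        self.name = name
        self.index = {}

    def refresh(self):
        # Rebuild the index from the shared public stream; no site-to-site
        # communication is involved.
        self.index = {}
        for post in public_stream:
            words = post["text"].split()
            if "#spnetwork" not in words:
                continue  # only tagged posts are indexed
            for word in words:
                if word.startswith(("arXiv:", "PMID:", "doi:")):
                    self.index.setdefault(word, []).append(post["author"])


site_a = IndexSite("selectedpapers.net")
site_b = IndexSite("another-index.example")  # hypothetical second site

publish("googleplus", "alice", "#spnetwork #mustread arXiv:math/0211159")
publish("blog", "bob", "A follow-up remark #spnetwork arXiv:math/0211159")
publish("blog", "carol", "No tag here, so nobody indexes this")

site_a.refresh()
site_b.refresh()
print(site_a.index == site_b.index)
print(site_a.index)
```

Both sites list the posts by alice and bob under the same paper, whichever site (if any) the authors think of themselves as using. In a walled garden, by contrast, each site would keep its own private list of posts, and the two indexes would never agree.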

One more little website is not going to solve the problems with journals. The last thing anybody wants is another password to remember! There are already various sites trying to solve different pieces of the problem, but none of them are really getting traction. One reason is that the different sites can’t or won’t talk to each other—that is, federate. They are walled gardens, closed ecosystems. As a result, progress has been stalled for years.

And frankly, even if some walled garden did eventually win out, that wouldn’t solve the problem of expensive journals. If one party became able to control the flow of scholarly information, they’d eventually exploit this just as the journals do now.

So, we need a federated system, to make scholarly communication openly accessible not just for scholars but for everyone—and to keep it that way.

## The Selected Papers Network (Part 1)

7 June, 2013

Christopher Lee has developed some new software called the Selected Papers Network. I want to explain that and invite you all to try using it! But first, in this article, I want to review the problems it’s trying to address.

There are lots of problems with scholarly publishing, and of course even more with academia as a whole. But I think Chris and I are focused on two: expensive journals, and ineffective peer review.

### Expensive Journals

Our current method of publication has some big problems. For one thing, the academic community has allowed middlemen to take over the process of publication. We, the academic community, do most of the really tricky work. In particular, we write the papers and referee them. But they, the publishers, get almost all the money, and charge our libraries more and more for it, thanks to their monopoly power. It’s an amazing business model:

Get smart people to work for free, then sell what they make back to them at high prices.

People outside academia have trouble understanding how this continues! To understand it, we need to think about what scholarly publishing and libraries actually achieve. In short:

1. Distribution. The results of scholarly work get distributed in publicly accessible form.

2. Archiving. The results, once distributed, are safely preserved.

3. Selection. The quality of the results is assessed, e.g. by refereeing.

4. Endorsement. The quality of the results is made known, giving the scholars the prestige they need to get jobs and promotions.

Thanks to the internet, jobs 1 and 2 have become much easier. Anyone can put anything on a website, and work can be safely preserved at sites like the arXiv and PubMed Central. All this is either cheap or already supported by government funds. We don’t need journals for this.

The journals still do jobs 3 and 4. These are the jobs that academia still needs to find new ways to do, to bring down the price of journals or make them entirely obsolete.

The big commercial publishers like to emphasize how they do job 3: selection. The editors contact the referees, remind them to deliver their referee reports, and communicate these reports to the authors, while maintaining the anonymity of the referees. This takes work.

However, this work can be done much more cheaply than you’d think from the prices of journals run by the big commercial publishers. We know this from the existence of good journals that charge much less. And we know it from the shockingly high profit margins of the big publishers, particularly Elsevier.

It’s clear that the big commercial publishers are using their monopoly power to charge outrageous prices for their products. Why do they continue to get away with this? Why don’t academics rebel and publish in cheaper journals?

One reason is a broken feedback loop. The academics don’t pay for journals out of their own pocket. Instead, their university library pays for the journals. Rising journal costs do hurt the academics: money goes into paying for journals that could be spent in other ways. But most of them don’t notice this.

The other reason is item 4: endorsement. This is the part of academic publishing that outsiders don’t understand. Academics want to get jobs and promotions. To do this, we need to prove that we’re ‘good’. But academia is so specialized that our colleagues are unable to tell how good our papers are. Not by actually reading them, anyway! So, they try to tell by indirect methods—and a very important one is the prestige of the journals we publish in.

The big commercial publishers have bought most of the prestigious journals. We can start new journals, and some of us are already doing that, but it takes time for these journals to become prestigious. In the meantime, most scholars prefer to publish in prestigious journals owned by the big publishers, even if this slowly drives their own libraries bankrupt. This is not because these scholars are dumb. It’s because a successful career in academia requires the constant accumulation of prestige.

The Elsevier boycott shows that more and more academics understand this trap and hate it. But hating a trap is not enough to escape the trap.

Boycotting Elsevier and other monopolistic publishers is a good thing. The arXiv and PubMed Central are good things, because they show that we can solve the distribution and archiving problems without the help of big commercial publishers. But we need to develop methods of scholarly publishing that solve the selection and endorsement problems in ways that can’t be captured by the big commercial publishers.

I emphasize ‘can’t be captured’, because these publishers won’t go down without a fight. Anything that works well, they will try to buy—and then they will try to extract a stream of revenue from it.

### Ineffective Peer Review

While I am mostly concerned with how the big commercial publishers are driving libraries bankrupt, my friend Christopher Lee is more concerned with the failures of the current peer review system. He does a lot of innovative work on bioinformatics and genomics. This gives him a different perspective than me. So, let me just quote the list of problems from this paper:

• Christopher Lee, Open peer review by a selected-papers network, Frontiers in Computational Neuroscience 6 (2012).

The rest of this section is a quote:

Expert peer review (EPR) does not work for interdisciplinary peer review (IDPR). EPR means the assumption that the reviewer is expert in all aspects of the paper, and thus can evaluate both its impact and validity, and can evaluate the paper prior to obtaining answers from the authors or other referees. IDPR means the situation where at least one part of the paper lies outside the reviewer’s expertise. Since journals universally assume EPR, this creates artificially high barriers to innovative papers that combine two fields [Lee, 2006]—one of the most valuable sources of new discoveries.

Shoot first and ask questions later means the reviewer is expected to state a REJECT/ACCEPT position before getting answers from the authors or other referees on questions that lie outside the reviewer’s expertise.

No synthesis: if review of a paper requires synthesis—combining the different expertise of the authors and reviewers in order to determine what assumptions and criteria are valid for evaluating it—both of the previous assumptions can fail badly [Lee, 2006].

Journals provide no tools for finding the right audience for an innovative paper. A paper that introduces a new combination of fields or ideas has an audience search problem: it must search multiple fields for people who can appreciate that new combination. Whereas a journal is like a TV channel (a large, pre-defined audience for a standard topic), such a paper needs something more like Google—a way of quickly searching multiple audiences to find the subset of people who can understand its value.

Each paper’s impact is pre-determined rather than post-evaluated: By ‘pre-determination’ I mean that both its impact metric (which for most purposes is simply the title of the journal it was published in) and its actual readership are locked in (by the referees’ decision to publish it in a given journal) before any readers are allowed to see it. By ‘post-evaluation’ I mean that impact should simply be measured by the research community’s long-term response and evaluation of it.

Non-expert PUSH means that a pre-determination decision is made by someone outside the paper’s actual audience, i.e., the reviewer would not ordinarily choose to read it, because it does not seem to contribute sufficiently to his personal research interests. Such a reviewer is forced to guess whether (and how much) the paper will interest other audiences that lie outside his personal interests and expertise. Unfortunately, people are not good at making such guesses; history is littered with examples of rejected papers and grants that later turned out to be of great interest to many researchers. The highly specialized character of scientific research, and the rapid emergence of new subfields, make this a big problem.

In addition to such false-negatives, non-expert PUSH also causes a huge false-positive problem, i.e., reviewers accept many papers that do not personally interest them and which turn out not to interest anybody; a large fraction of published papers subsequently receive zero or only one citation (even including self-citations [Adler et al., 2008]). Note that non-expert PUSH will occur by default unless reviewers are instructed to refuse to review anything that is not of compelling interest for their own work. Unfortunately journals assert an opposite policy.

One man, one nuke means the standard in which a single negative review equals REJECT. Whereas post-evaluation measures a paper’s value over the whole research community (‘one man, one vote’), standard peer review enforces conformity: if one referee does not understand or like it, prevent everyone from seeing it.

PUSH makes refereeing a political minefield: consider the contrast between a conference (where researchers publicly speak up to ask challenging questions or to criticize) vs. journal peer review (where it is reckoned necessary to hide their identities in a ‘referee protection program’). The problem is that each referee is given artificial power over what other people can like—he can either confer a large value on the paper (by giving it the imprimatur and readership of the journal) or consign it zero value (by preventing those readers from seeing it). This artificial power warps many aspects of the review process; even the ‘solution’ to this problem—shrouding the referees in secrecy—causes many pathologies. Fundamentally, current peer review treats the reviewer not as a peer but as one who wields a diktat: prosecutor, jury, and executioner all rolled into one.

Restart at zero means each journal conducts a completely separate review process of a paper, multiplying the costs (in time and effort) for publishing it in proportion to the number of journals it must be submitted to. Note that this particularly impedes innovative papers, which tend to aim for higher-profile journals, and are more likely to suffer from referees’ IDPR errors. When the time cost for publishing such work exceeds by several fold the time required to do the work, it becomes more cost-effective to simply abandon that effort, and switch to a ‘standard’ research topic where repetition of a pattern in many papers has established a clear template for a publishable unit (i.e., a widely agreed checklist of criteria for a paper to be accepted).

The reviews are thrown away: after all the work invested in obtaining reviews, no readers are permitted to see them. Important concerns and contributions are thus denied to the research community, and the referees receive no credit for the vital contribution they have made to validating the paper.

In summary, current peer review is designed to work for large, well-established fields, i.e., where you can easily find a journal with a high probability that every one of your reviewers will be in your paper’s target audience and will be expert in all aspects of your paper. Unfortunately, this is just not the case for a large fraction of researchers, due to the high level of specialization in science, the rapid emergence of new subfields, and the high value of boundary-crossing research (e.g., bioinformatics, which intersects biology, computer science, and math).

### Toward solutions

Next time I’ll talk about the software Christopher Lee has set up. But if you want to get a rough sense of how it works, read the section of Christopher Lee’s paper called The Proposal in Brief.

## Meta-Rationality

15 March, 2013

On his blog, Eli Dourado writes something that’s very relevant to the global warming debate, and indeed most other debates.

He’s talking about Paul Krugman, but I think with small modifications we could substitute the name of almost any intelligent pundit. I don’t care about Krugman here, I care about the general issue:

Nobel laureate, Princeton economics professor, and New York Times columnist Paul Krugman is a brilliant man. I am not so brilliant. So when Krugman makes strident claims about macroeconomics, a complex subject on which he has significantly more expertise than I do, should I just accept them? How should we evaluate the claims of people much smarter than ourselves?

A starting point for thinking about this question is the work of another Nobelist, Robert Aumann. In 1976, Aumann showed that under certain strong assumptions, disagreement on questions of fact is irrational. Suppose that Krugman and I have read all the same papers about macroeconomics, and we have access to all the same macroeconomic data. Suppose further that we agree that Krugman is smarter than I am. All it should take, according to Aumann, for our beliefs to converge is for us to exchange our views. If we have common “priors” and we are mutually aware of each other’s views, then if we do not agree ex post, at least one of us is being irrational.

It seems natural to conclude, given these facts, that if Krugman and I disagree, the fault lies with me. After all, he is much smarter than I am, so shouldn’t I converge much more to his view than he does to mine?

Not necessarily. One problem is that if I change my belief to match Krugman’s, I would still disagree with a lot of really smart people, including many people as smart as or possibly even smarter than Krugman. These people have read the same macroeconomics literature that Krugman and I have, and they have access to the same data. So the fact that they all disagree with each other on some margin suggests that very few of them behave according to the theory of disagreement. There must be some systematic problem with the beliefs of macroeconomists.

In their paper on disagreement, Tyler Cowen and Robin Hanson grapple with the problem of self-deception. Self-favoring priors, they note, can help to serve other functions besides arriving at the truth. People who “irrationally” believe in themselves are often more successful than those who do not. Because pursuit of the truth is often irrelevant in evolutionary competition, humans have an evolved tendency to hold self-favoring priors and self-deceive about the existence of these priors in ourselves, even though we frequently observe them in others.

Self-deception is in some ways a more serious problem than mere lack of intelligence. It is embarrassing to be caught in a logical contradiction, as a stupid person might be, because it is often impossible to deny. But when accused of disagreeing due to a self-favoring prior, such as having an inflated opinion of one’s own judgment, people can and do simply deny the accusation.

How can we best cope with the problem of self-deception? Cowen and Hanson argue that we should be on the lookout for people who are “meta-rational,” honest truth-seekers who choose opinions as if they understand the problem of disagreement and self-deception. According to the theory of disagreement, meta-rational people will not have disagreements among themselves caused by faith in their own superior knowledge or reasoning ability. The fact that disagreement remains widespread suggests that most people are not meta-rational, or—what seems less likely—that meta-rational people cannot distinguish one another.

We can try to identify meta-rational people through their cognitive and conversational styles. Someone who is really seeking the truth should be eager to collect new information through listening rather than speaking, construe opposing perspectives in their most favorable light, and offer information of which the other parties are not aware, instead of simply repeating arguments the other side has already heard.

All this seems obvious to me, but it’s discussed much too rarely. Maybe we can figure out ways to encourage this virtue that Cowen and Hanson call ‘meta-rationality’? There are already too many mechanisms that reward people for aggressively arguing for fixed positions. If Krugman really were ‘meta-rational’, he might still have his Nobel Prize, but he probably wouldn’t be a popular newspaper columnist.

The Azimuth Project, and this blog, are already doing a lot of things to prevent people from getting locked into fixed positions and filtering out evidence that goes against their views. Most crucial seems to be the policy of forbidding insults, bullying, and overly repetitive restatement of the same views. These behaviors increase what I call the ‘heat’ in a discussion, and I’ve decided that, all things considered, it’s best to keep the heat fairly low.

Heat attracts many people, so I’m sure we could get a lot more people to read this blog by turning up the heat. A little heat is a good thing, because it engages people’s energy. But heat also makes it harder for people to change their minds. When the heat gets too high, changing one’s mind is perceived as a defeat, to be avoided at all costs. Even worse, people form ‘tribes’ who back each other up in every argument, regardless of the topic. Rationality goes out the window. And meta-rationality? Forget it!

### Some Questions

Dourado talks about ways to “identify meta-rational people.” This is very attractive, but I think it’s better to talk about “identifying when people are behaving meta-rationally”. I don’t think we should spend too much of our time looking around for paragons of meta-rationality. First of all, nobody is perfect. Second of all, as soon as someone gets a big reputation for rationality, meta-rationality, or any other virtue, it seems they develop a fan club that runs a big risk of turning into a cult. This often makes it harder rather than easier for people to think clearly and change their minds!

I’d rather look for customs and institutions that encourage meta-rationality. So, my big question is:

How can we encourage rationality and meta-rationality, and make them more popular?

Of course science, and academia, are institutions that have been grappling with this question for centuries. Universities, seminars, conferences, journals, and so on—they all put a lot of work into encouraging the search for knowledge and examining the conditions under which it thrives.

And of course these institutions are imperfect: everything humans do is riddled with flaws.

But instead of listing cases where existing institutions failed to do their job optimally, I’d like to think about ways of developing new customs and institutions that encourage meta-rationality… and linking these to the existing ones.

Why? Because I feel the existing institutions don’t reach out enough to the ‘general public’, or ‘laymen’. The mere existence of these terms is a clue. There are a lot of people who consider academia as an ‘ivory tower’, separate from their own lives and largely irrelevant. And there are a lot of good reasons for this.

There’s one you’ve heard me talk about a lot: academia has let its journals get bought by big multimedia conglomerates, who then charge high fees for access. So, we have scientific research on global warming paid for by our tax dollars, and published by prestigious journals such as Science and Nature… which unfortunately aren’t available to the ‘general public’.

That’s like a fire alarm you have to pay to hear.

But there’s another problem: institutions that try to encourage meta-rationality seem to operate by shielding themselves from the broader sphere that favors ‘hot’ discussions. Meanwhile, the hot discussions don’t get enough input from ‘cooler’ forums… and vice versa!

For example: we have researchers in climate science who publish in refereed journals, which mostly academics read. We have conferences, seminars and courses where this research is discussed and criticized. These are again attended mostly by academics. Then we have journalists and bloggers who try to explain and discuss these papers in more easily accessed venues. There are some blogs written by climate scientists, who try to short-circuit the middlemen a bit. Unfortunately the heated atmosphere of some of these blogs makes meta-rationality difficult. There are also blogs by ‘climate skeptics’, many from outside academia. These often criticize the published papers, but—it seems to me—rarely get into discussions with the papers’ authors in conditions that make it easy for either party to change their mind. And on top of all this, we have various think tanks who are more or less pre-committed to fixed positions… and of course, corporations and nonprofits paying for advertisements pushing various agendas.

Of course, it’s not just the global warming problem that suffers from a lack of public forums that encourage meta-rationality. That’s just an example. There have got to be some ways to improve the overall landscape a little. Just a little: I’m not expecting miracles!

### Details

Here’s the paper by Aumann:

• Robert J. Aumann, Agreeing to disagree, The Annals of Statistics 4 (1976), 1236-1239.

and here’s the one by Cowen and Hanson:

• Tyler Cowen and Robin Hanson, Are disagreements honest?, 18 August 2004.

Personally I find Aumann’s paper uninteresting, because he’s discussing agents that are not only rational Bayesians, but rational Bayesians that share the same priors to begin with! It’s unsurprising that such agents would have trouble finding things to argue about.

His abstract summarizes his result quite clearly… except that he calls these idealized agents ‘people’, which is misleading:

Abstract. Two people, 1 and 2, are said to have common knowledge of an event E if both know it, 1 knows that 2 knows it, 2 knows that 1 knows it, 1 knows that 2 knows that 1 knows it, and so on.

Theorem. If two people have the same priors, and their posteriors for an event A are common knowledge, then these posteriors are equal.
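To make the theorem concrete, here is a minimal sketch in Python of two idealized agents who keep announcing their posteriors until nothing changes. The nine-state world, the two information partitions, the event and the back-and-forth procedure are all assumptions chosen for illustration; the abstract above states only the conclusion, not a procedure for reaching it.

```python
from fractions import Fraction


def posterior(partition, event, prior):
    """Map each state to P(event | the partition cell containing that state)."""
    post = {}
    for cell in partition:
        total = sum(prior[s] for s in cell)
        hit = sum(prior[s] for s in cell if s in event)
        for s in cell:
            post[s] = hit / total
    return post


def refine(partition, signal):
    """Split every cell according to the value `signal` takes on its states."""
    new_cells = set()
    for cell in partition:
        groups = {}
        for s in cell:
            groups.setdefault(signal[s], set()).add(s)
        new_cells.update(frozenset(g) for g in groups.values())
    return new_cells


def exchange_posteriors(p1, p2, event, prior, true_state):
    """Both agents announce posteriors each round and learn from what they hear.

    Returns the list of (agent 1, agent 2) announcements at the true state.
    The loop stops once neither agent's information changes any more.
    """
    p1 = {frozenset(c) for c in p1}
    p2 = {frozenset(c) for c in p2}
    history = []
    while True:
        q1 = posterior(p1, event, prior)
        q2 = posterior(p2, event, prior)
        history.append((q1[true_state], q2[true_state]))
        new_p1 = refine(p1, q2)  # agent 1 learns from agent 2's announcement
        new_p2 = refine(p2, q1)  # agent 2 learns from agent 1's announcement
        if new_p1 == p1 and new_p2 == p2:
            return history
        p1, p2 = new_p1, new_p2


# Hypothetical toy world: nine equally likely states, shared by both agents.
states = range(1, 10)
prior = {s: Fraction(1, 9) for s in states}
agent1 = [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}]
agent2 = [{1, 2, 3, 4}, {5, 6, 7, 8}, {9}]
event = {3, 4}

history = exchange_posteriors(agent1, agent2, event, prior, true_state=1)
for round_number, (a, b) in enumerate(history):
    print(round_number, a, b)
```

In this toy setup the agents start at 1/3 and 1/2, repeat those numbers once more, and then both say 1/3. The second round matters even though the numbers do not change: hearing 1/3 again tells agent 2 that the state is not 4. Note how much the sketch relies on the shared prior, which is exactly the assumption that makes the result less interesting for real people.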

Cowen and Hanson’s paper is more interesting to me. Here are some key sections for what we’re talking about here:

### How Few Meta-rationals?

We can call someone a truth-seeker if, given his information and level of effort on a topic, he chooses his beliefs to be as close as possible to the truth. A non-truth seeker will, in contrast, also put substantial weight on other goals when choosing his beliefs. Let us also call someone meta-rational if he is an honest truth-seeker who chooses his opinions as if he understands the basic theory of disagreement, and abides by the rationality standards that most people uphold, which seem to preclude self-favoring priors.

The theory of disagreement says that meta-rational people will not knowingly have self-favoring disagreements among themselves. They might have some honest disagreements, such as on values or on topics of fact where their DNA encodes relevant non-self-favoring attitudes. But they will not have dishonest disagreements, i.e., disagreements directly on their relative ability, or disagreements on other random topics caused by their faith in their own superior knowledge or reasoning ability.

Our working hypothesis for explaining the ubiquity of persistent disagreement is that people are not usually meta-rational. While several factors contribute to this situation, a sufficient cause that usually remains when other causes are removed is that people do not typically seek only truth in their beliefs, not even in a persistent rational core. People tend to be hypocritical in having self-favoring priors, such as priors that violate indexical independence, even though they criticize others for such priors. And they are reluctant to admit this, either publicly or to themselves.

How many meta-rational people can there be? Even if the evidence is not consistent with most people being meta-rational, it seems consistent with there being exactly one meta-rational person. After all, in this case there never appears a pair of meta-rationals to agree with each other. So how many more meta-rationals are possible?

If meta-rational people were common, and able to distinguish one another, then we should see many pairs of people who have almost no dishonest disagreements with each other. In reality, however, it seems very hard to find any pair of people who, if put in contact, could not identify many persistent disagreements. While this is an admittedly difficult empirical determination to make, it suggests that there are either extremely few meta-rational people, or that they have virtually no way to distinguish each other.

Yet it seems that meta-rational people should be discernible via their conversation style. We know that, on a topic where self-favoring opinions would be relevant, the sequence of alternating opinions between a pair of people who are mutually aware of both being meta-rational must follow a random walk. And we know that the opinion sequence between typical non-meta-rational humans is nothing of the sort. If, when responding to the opinions of someone else of uncertain type, a meta-rational person acts differently from an ordinary non-meta-rational person, then two meta-rational people should be able to discern one another via a long enough conversation. And once they discern one another, two meta-rational people should no longer have dishonest disagreements. (Aaronson (2004) has shown that regardless of the topic or their initial opinions, any two Bayesians have less than a 10% chance of disagreeing by more than a 10% after exchanging about a thousand bits, and less than a 1% chance of disagreeing by more than a 1% after exchanging about a million bits.)

Since most people have extensive conversations with hundreds of people, many of whom they know very well, it seems that the fraction of people who are meta-rational must be very small. For example, given $N$ people, a fraction $f$ of whom are meta-rational, let each person participate in $C$ conversations with random others that last long enough for two meta-rational people to discern each other. If so, there should be on average $f^2CN/2$ pairs who no longer disagree. If, across the world, two billion people, one in ten thousand of whom are meta-rational, have one hundred long conversations each, then we should see one thousand pairs of people with only honest disagreements. If, within academia, two million people, one in ten thousand of whom are meta-rational, have one thousand long conversations each, we should see ten agreeing pairs of academics. And if meta-rational people had any other clues to discern one another, and preferred to talk with one another, there should be far more such pairs. Yet, with the possible exception of some cult-like or fan-like relationships, where there is an obvious alternative explanation for their agreement, we know of no such pairs of people who no longer disagree on topics where self-favoring opinions are relevant.

We therefore conclude that unless meta-rationals simply cannot distinguish each other, only a tiny non-descript percentage of the population, or of academics, can be meta-rational. Either few people have truth-seeking rational cores, and those that do cannot be readily distinguished, or most people have such cores but they are in control infrequently and unpredictably. Worse, since it seems unlikely that the only signals of meta-rationality would be purely private signals, we each seem to have little grounds for confidence in our own meta-rationality, however much we would like to believe otherwise.
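The count in the quoted passage is easy to check. There are $CN/2$ conversations in total, since each one involves two people, and if partners are chosen at random then both participants are meta-rational with probability roughly $f^2$, which gives the $f^2CN/2$ quoted above. The short Python sketch below plugs in the two sets of numbers; the function and variable names are made up for illustration.

```python
from fractions import Fraction


def expected_agreeing_pairs(n_people, fraction_meta_rational, conversations_each):
    """Expected number of conversations between two meta-rationals: f^2 * C * N / 2."""
    # Each conversation involves two people, so N people with C each make C*N/2.
    total_conversations = Fraction(n_people * conversations_each, 2)
    # Chance that both randomly chosen participants are meta-rational.
    both_meta_rational = fraction_meta_rational ** 2
    return both_meta_rational * total_conversations


f = Fraction(1, 10_000)  # one in ten thousand, as in the quoted passage
world = expected_agreeing_pairs(2_000_000_000, f, 100)
academia = expected_agreeing_pairs(2_000_000, f, 1_000)
print(world, academia)
```

Both figures in the passage come out as stated: one thousand pairs worldwide and ten pairs within academia. Exact fractions are used here only to avoid floating-point rounding in such a small probability.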

Personally, I think the failure to find ‘ten agreeing pairs of academics’ is not very interesting. Instead of looking for people who are meta-rational in all respects, which seems futile, I’m more interested in looking for contexts and institutions that encourage people to behave meta-rationally when discussing specific issues.

For example, there’s surprisingly little disagreement among mathematicians when they’re discussing mathematics and they’re on their best behavior—for example, talking in a classroom. Disagreements show up, but they’re often dismissed quickly when one or both parties realize their mistake. The same people can argue bitterly and endlessly over politics or other topics. They are not meta-rational people: I doubt such people exist. They are people who have been encouraged by an institution to behave meta-rationally in specific limited ways… because the institution rewards this behavior.

Moving on:

### Personal policy implications

Readers need not be concerned about the above conclusion if they have not accepted our empirical arguments, or if they are willing to embrace the rationality of self-favoring priors, and to forgo criticizing the beliefs of others caused by such priors. Let us assume, however, that you, the reader, are trying to be one of those rare meta-rational souls in the world, if indeed there are any. How guilty should you feel when you disagree on topics where self-favoring opinions are relevant?

If you and the people you disagree with completely ignored each other’s opinions, then you might tend to be right more if you had greater intelligence and information. And if you were sure that you were meta-rational, the fact that most people were not might embolden you to disagree with them. But for a truth-seeker, the key question must be how sure you can be that you, at the moment, are substantially more likely to have a truth-seeking, in-control, rational core than the people you now disagree with. This is because if either of you have some substantial degree of meta-rationality, then your relative intelligence and information are largely irrelevant except as they may indicate which of you is more likely to be self-deceived about being meta-rational.

One approach would be to try to never assume that you are more meta-rational than anyone else. But this cannot mean that you should agree with everyone, because you simply cannot do so when other people disagree among themselves. Alternatively, you could adopt a “middle” opinion. There are, however, many ways to define middle, and people can disagree about which middle is best (Barns 1998). Not only are there disagreements on many topics, but there are also disagreements on how to best correct for one’s limited meta-rationality.

Ideally we would want to construct a model of the process of individual self-deception, consistent with available data on behavior and opinion. We could then use such a model to take the observed distribution of opinion, and infer where lies the weight of evidence, and hence the best estimate of the truth. [Ideally this model would also satisfy a reflexivity constraint: when applied to disputes about self-deception it should select itself as the best model of self-deception. If people reject the claim that most people are self-deceived about their meta-rationality, this approach becomes more difficult, though perhaps not impossible.]

A more limited, but perhaps more feasible, approach to relative meta-rationality is to seek observable signs that indicate when people are self-deceived about their meta-rationality on a particular topic. You might then try to disagree only with those who display such signs more strongly than you do. For example, psychologists have found numerous correlates of self-deception. Self-deception is harder regarding one’s overt behaviors, there is less self-deception in a galvanic skin response (as used in lie detector tests) than in speech, the right brain hemisphere tends to be more honest, evaluations of actions are less honest after those actions are chosen than before (Trivers 2000), self-deceivers have more self-esteem and less psychopathology, especially less depression (Paulhus 1986), and older children are better than younger ones at hiding their self-deception from others (Feldman & Custrini 1988). Each correlate implies a corresponding sign of self-deception.

Other commonly suggested signs of self-deception include idiocy, self-interest, emotional arousal, informality of analysis, an inability to articulate supporting arguments, an unwillingness to consider contrary arguments, and ignorance of standard mental biases. If verified by further research, each of these signs would offer clues for identifying other people as self-deceivers.

Of course, this is easier said than done. It is easy to see how self-deceiving people, seeking to justify their disagreements, might try to favor themselves over their opponents by emphasizing different signs of self-deception in different situations. So looking for signs of self-deception need not be an easier approach than trying to overcome disagreement directly by further discussion on the topic of the disagreement.

We therefore end on a cautionary note. While we have identified some considerations to keep in mind, were one trying to be one of those rare meta-rational souls, we have no general recipe for how to proceed. Perhaps recognizing the difficulty of this problem can at least make us a bit more wary of our own judgments when we disagree.

## The Faculty of 1000

31 January, 2012

As of this minute, 1890 scholars have signed a pledge not to cooperate with the publisher Elsevier. People are starting to notice. According to this Wired article, the open-access movement is “catching fire”:

• David Dobbs, Testify: the open-science movement catches fire, Wired, 30 January 2012.

Now is a good time to take more substantial actions. But what?

Many things are being discussed, but it’s good to spend a bit of time thinking about the root problems and the ultimate solutions.

The world-wide web has made journals obsolete: it would be better to put papers on freely available archives and then let boards of top scholars referee them. But how do we get to this system?

In math and physics we have the arXiv, but nobody referees those papers. In biology and medicine, a board called the Faculty of 1000 chooses and evaluates the best papers, but there’s no archive: they get those papers from traditional journals.

Whoops—never mind! That was yesterday. Now the Faculty of 1000 has started an archive!

• Rebecca Lawrence, F1000 Research – join us and shape the future of scholarly communication, F1000, 30 January 2012.

• Ivan Oransky, An arXiv for all of science? F1000 launches new immediate publication journal, Retraction Watch, 30 January 2012.

This blog article says “an arXiv for all of science”, but it seems the new F1000 Research archive is just for biology and medicine. So now it’s time for the mathematicians and physicists to start catching up.