Welcome to the SPARC Open Access Newsletter, issue #166
December 2, 2012
by Peter Suber
The idea of an open-access evidence rack
The containers of research are evolving beyond articles and books, and OA is facilitating that evolution. Indeed, OA is a precondition for nearly all of that evolution, which is one reason why OA is closer to the minimum than the maximum of what we should expect for research in the digital age. Speaking personally, I have stronger reasons to want OA itself than any particular, proposed new container. Hence, I've generally stuck to the case for OA, confident that the case for better containers will take care of itself as smart, motivated people explore the opportunities OA is creating.
This article is an exception. I want to describe one kind of new container for OA research. In this case, "structure" may be a better term than "container". I want to describe a structure for organizing the evidence in support of the basic propositions in a field, and for making that evidence OA.
I call it an "evidence rack" for reasons that should become clear in a minute. My conception of it is incomplete in the way that makes it flexible, leaving room for many different choices about how to finish the job. But even this much of the idea makes me want to put it out for discussion, trial, testing, and refinement. I'd also like to enlist your help in imagining the rest of the picture.
Here's the idea in three steps.
First, identify the basic propositions in the field or sub-field you want to cover. To start small, identify the basic propositions you want to defend in a given article.
Second, create a separate OA web page for each proposition. For now, don't worry about the file format or other technicalities. What's important is that the pages should (1) be easy to update, (2) carry a time-stamp showing when they were last updated, and (3) give each proposition a unique URL. Let's call them "proposition pages".
Third, start filling in each page with the evidence in support of its proposition. If some evidence has been published in an article or book, then cite the publication. When the work is online (OA or TA), add a link as well. Whenever you can link directly to evidence, rather than merely to publications describing evidence, do that. For example, some propositions can be supported by linkable data in an open dataset. But because citations and data don't always speak for themselves, consider adding some annotations to explain how cited pieces of evidence support the given proposition.
Each supporting study or piece of evidence should have an entry to itself. A proposition page should look more like a list than an article. It should look like a list of citations, annotated citations, or bullet points. It should look like a footnote, perhaps a very long footnote, for the good reason that one intended use of a proposition page is to be available for citation and review as a compendious, perpetually updated, public footnote.
If the proposition has an identifiable author or source, then add an attribution to the page. But many propositions that belong in evidence racks (such as, "The ten warmest years on record have all occurred since the year 2000") will not have identifiable authors.
If the proposition is supported by abundant evidence, take your time to collect it, or at least to collect the major pieces of evidence. The page can be as large as you need it to be. Doing the job well means that it never has to be done over from scratch again. It merely has to be updated with new evidence or revised to reflect new interpretations.
If a page of evidence starts to become unmanageably large, and especially if the proposition is broad, then break the proposition into sub-propositions and give each one a page to itself. If you break up a proposition page after it has been online for a while and acquired incoming links, then create a redirect at the original URL pointing to a menu of the new, narrower propositions.
If a proposition uses technical language, then consider adding a non-technical or less technical paraphrase. If the proposition needs some context to make its full meaning or importance evident, then consider adding a paragraph to provide that context. Also consider posting translations of the proposition, the paraphrase, and annotations into several other major languages.
If a proposition is inconsistent with other propositions in the same rack, then the implicated pages should link to one another. When incompatible propositions have supporting evidence, then intellectually honest evidence racks will want to include them, and intellectually honest users will want to see what the evidence amounts to. The rationale for including incompatible propositions in the same rack is to capture the current state of research, or the current state of the evidence, on proposals that remain disputed or questions that remain unsettled. The rationale for including the mutual links is to help users navigate the debate and evaluate the evidence for different positions.
An evidence rack is a list of propositions in which each proposition anchors a list of supporting studies or pieces of evidence. It's a list of lists. I call it a rack because I picture the series of propositions as a series of hooks, and I picture the chain of evidence under a given proposition as a chain hanging from the hook.
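The list-of-lists structure can be sketched as a small data model. This is only an illustration under stated assumptions, not a proposed file format: the class names, field names, and the example.org URL are all hypothetical, and the evidence entry is a placeholder rather than a real citation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class EvidenceEntry:
    """One supporting study or piece of evidence (one link in the chain)."""
    citation: str                     # citation of the publication or dataset
    link: Optional[str] = None        # link to the work (OA or TA), if online
    annotation: Optional[str] = None  # how the evidence supports the proposition


@dataclass
class PropositionPage:
    """One hook: a proposition with its own URL and a list of evidence."""
    url: str                          # unique URL for this proposition
    proposition: str
    evidence: List[EvidenceEntry] = field(default_factory=list)
    last_updated: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def add_evidence(self, entry: EvidenceEntry) -> None:
        """Append an entry and refresh the page's time-stamp."""
        self.evidence.append(entry)
        self.last_updated = datetime.now(timezone.utc)


@dataclass
class EvidenceRack:
    """A list of propositions, each anchoring a list of evidence."""
    name: str
    pages: List[PropositionPage] = field(default_factory=list)


# Hypothetical usage: one rack, one proposition page, one placeholder entry.
rack = EvidenceRack(name="Example rack")
page = PropositionPage(
    url="https://example.org/rack/p1",  # assumed URL scheme
    proposition="All A's are B's",
)
page.add_evidence(
    EvidenceEntry(citation="(placeholder citation)", annotation="(placeholder note)")
)
rack.pages.append(page)
```

The three requirements for a proposition page map onto the sketch directly: `add_evidence` makes the page easy to update, `last_updated` is the time-stamp, and `url` is the unique address.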
* Authorship variations
If you're writing an article, you might create an evidence rack for the propositions you want to defend in the article. If you want the article to be a solo work, then you might want the evidence rack to be a solo work for the same reason: for credit or attribution, or to reflect your own perspective. Similarly, if your research group is writing an article, it might create an evidence rack for its results and limit contributions to the members of the group. In these cases, users would attribute the evidence rack to the named author or the named group.
An important alternative is to open up an evidence rack to contributions from others. The rack's founders, editors, or project managers could formulate the propositions and crowd-source the job of documenting them, or they could crowd-source both jobs. They could open the rack to new entries from anyone, on the model of Wikipedia. Or they could solicit submissions from anyone, but only post the entries that pass some kind of review or scrutiny. Editors and contributors could be anonymous or attributed. But even if the people are not named, the rack should be named for the purpose of attribution.
* Some benefits
One benefit of an evidence rack is the way it zeroes in on propositions and their supporting evidence. Readers can find the evidence in support of a given claim without having to read a whole article or book devoted to indefinitely many propositions. Readers seeking evidence for a given proposition won't have to wade through stage-setting, explanation, analysis, discussion, and review, which they may not need.
There will always be a place for full-prose exposition, for example, in articles and books, because there will always be a place for stage-setting, explanation, analysis, discussion, and review. Articles and books (as well as post-articles and post-books) can benefit from companion evidence racks, and vice versa. Evidence racks are particularly strong on individuating basic propositions, articulating them, cataloging them, and documenting them. While evidence racks drop most of the jobs tackled by full-prose genres, which is part of their utility, we shouldn't underestimate the remaining prose demands of a good evidence rack. Some of the hardest jobs in creating an authoritative evidence rack will be wording the propositions themselves, wording the evidence entries, and wording any associated annotations.
An evidence rack is living or dynamic. It can grow indefinitely to keep pace with new propositions in a field, new evidence for existing propositions, and new interpretations of existing evidence. Published articles and books are frozen in time. Even if they're occasionally republished in new editions, that kind of updating is episodic and infrequent. An evidence rack can be updated at will, in real time. A well-maintained evidence rack can be more complete and up to date than any publication, print or digital.
An unexpected benefit is that an evidence rack supports forking --or could support forking if properly licensed. If a given rack stops growing, or refuses to acknowledge certain evidence, or tolerates muddled and imprecise propositions, then a rival project could copy the original, keep what is worth keeping, and take the project further or take it in a new direction. The risk of forking should keep an evidence rack careful and consultative, and the actuality of forking should be an adequate remedy to inadequate editing and quality.
Perhaps the greatest benefit is the way that evidence racks can assist and transform downstream scholarship. Suppose you are writing a new article, and a handful of the propositions you'd like to assert are already well-supported on OA proposition pages. You shouldn't have to find your own new evidence for those propositions, or find your own original way to describe evidence already well-described or well-summarized elsewhere. You should be able to cite the proposition pages that have already collected and organized the evidence you need, especially if they have acquired stature in the research community as comprehensive, rigorous, and trustworthy.
For the same reason that an evidence rack can help you document your claims, it can help you decide which claims are worth making in the first place. In that sense it helps both writing and analysis. As you write, you can see what propositions related to your topic are supported by evidence and what the evidence is. You can quickly cite communally gathered evidence for a relevant proposition you planned to assert. You can also make connections with propositions you didn't plan to assert but which are implicated by your thesis and supported by communally gathered evidence. You can find new evidence to strengthen your arguments. You can also find counter-evidence weakening your arguments and leading you to qualify overbroad conclusions. These benefits are not new, of course. They're the benefits of looking at the evidence. But an evidence rack should make it easier for everyone to look at the evidence.
If OA evidence racks organize trustworthy public footnotes for scholars, they can do the same for journalists. Instead of linking to just one source, or citing one source without linking to it, or citing no research sources at all, journalists could link to a trusted compendium of the current state of the evidence. Not all readers will click through or study what they find. But the evidence will be one click away for readers who do care. Knowing that, more journalists should care to provide the link.
I've been pessimistic that providing OA to research on (say) evolution or climate change would do much to dent public ignorance.
But part of that pessimism was about providing OA to journal articles as they exist today. The peer-reviewed research article is better for reporting new results or proposing new hypotheses than for organizing knowledge. In fact, it's very bad at organizing knowledge, for professionals or non-professionals. When a field is fairly mature, then we supplement research articles with review articles, encyclopedias, textbooks, and other genres. But in new fields, or at the frontier of established fields, research articles are just about all we have.
I'm hoping that evidence racks could change that. They work in new fields as well as mature fields, they present evidence in a more intelligible and digestible format than research articles, and they organize knowledge better than articles and better than articles supplemented by datasets. I'm also hoping that as the containers of research evolve beyond articles and books, we'll develop many others that surpass research articles in these respects. I don't know how optimistic to be that an ecosystem of new OA research structures will make a serious dent in public ignorance of evolution or climate change. But I'm optimistic that OA evidence racks could do more than OA research articles.
* Evidence racks and OA
An evidence rack could be toll-access (TA) in the same way that articles, books, dissertations, and datasets could be TA. But an OA evidence rack would be vastly more useful than a TA rack, in at least the ways that other research structures would be more useful when OA. Today, of course, many scholars who know this well nevertheless do not make their work OA. But at least there are countervailing incentives to explain those decisions, and so far there aren't yet any countervailing incentives to compromise on the access policy for an evidence rack. Here, then, I'll acknowledge that an evidence rack could be TA, but I'll only talk about racks that choose OA.
If some evidence is locked away in a TA publication, then at least an OA page could cite that TA publication, usually with a link, and perhaps with an annotation or summary. When the evidence is already OA, then the OA page could link directly to the OA evidence. If the current state of the evidence means that some cannot yet be OA, at least the evidence rack itself can be OA. Even on topics where much or most of the direct evidence is TA, a well-made evidence rack can be an OA dashboard for scanning and reviewing that evidence.
The most useful evidence racks will support real-time updating. A TA business model would probably interfere with that, even if it didn't have to. Imagine an evidence rack hosted by a TA publisher. If you paid for access, would you have to pay again periodically in order to view the periodic updates? Or would the publisher charge once for permanent access? In principle, publishers could go either way on that. But in practice I suspect that few users would pay to view periodic updates, and few TA publishers would accept a one-time payment for permanent access to a continually growing resource. I also suspect that as more and more research literature and evidence becomes OA, fewer and fewer authors would want to cite TA proposition pages in their footnotes. If so, then only OA racks would yield all the benefits of real-time updating.
Forking is a necessary quality control on an evidence rack. But forking presupposes a certain kind of open license. CC-BY is the natural choice for this, but CC-BY-NC would also work. An ND ("no derivatives") restriction would prevent forking by anyone, and an SA ("share-alike") restriction would prevent forking by users who wish to create a rack they could release under CC-BY.
* Doing justice to uncertainty and disagreement
In every field, and on many important propositions, there are conflicting studies. How should an evidence rack deal with them? There are many possibilities compatible with the basic model. Here are four.
1. A proposition page could list all the relevant studies, with annotations on what each has to say about the proposition in question. (Advantage: the page is more complete, and users can find all the conflicting evidence in one place.)
2. A rack could create separate proposition pages for the separate, conflicting propositions. For example, if some studies suggest that "all A's are B's" and some suggest that "some A's are not B's", then a rack could give each proposition its own page. Of course, the two pages should link to one another. (Advantage: Each page remains focused on the evidence in support of a given proposition.)
3. Sometimes it turns out that the conclusions of two apparently conflicting studies are both correct. There is an interpretation that reconciles them, or appears to reconcile them, but the reconciling interpretation didn't arise until some time after both studies were published. In cases like that, a rack could take either of the approaches above (list conflicting studies on the same page, or list them on separate pages), but when the reconciling interpretation appears, it could create a separate page for it. If the editors judge that the new interpretation settles the dispute, then they could take down the earlier pages and point users to the new page. If the question remains open, they could leave the earlier pages up while the debate continues.
4. Or a rack could simply support forking. This is the ultimate way to accommodate disagreement. Even if a given evidence rack tries to show the evidence for different positions in the field, those who find it flawed can fork the project and create a variant they like better. As the scholarly dispute about the disputed question matures, and new evidence emerges, the variant projects could merge again or they could remain separate.
Note that the intellectually honest step of including evidence and counter-evidence in the same rack, rather than artificially tidying up the picture, could actually prevent forking. That's a reason to take the step. Editors who feel impelled to tidy up the evidence in order to spare readers the burden of reading error, or the labor of working through disagreements, may simply shift the disagreements from separate pages in one rack to separate racks. (The same is true of articles and books; any exposition that is unfairly one-sided in order to downplay disagreement will simply elicit disagreement.)
The most stubborn kinds of uncertainty and disagreement arise from well-done studies with incompatible results. But not all studies are well-done, of course, which raises a hard question. When is a study strong enough to list on a proposition page, or weak enough to avoid listing? This requires precisely the kind of judgment call that scholars must make in their role as scholars. An evidence rack offers no shortcut to the hard work of evaluating studies and deciding what they are worth. However, this fact itself provides a kind of answer. If an evidence rack is a way to open up, organize, and perhaps crowd-source footnotes to document basic propositions in a field, then a study is strong enough to list on a proposition page if it's strong enough to cite in a footnote in a published article. Needless to say, we'll continue to disagree on when a study is strong enough to cite in a footnote in a published article.
* Edited racks
I suspect that the most successful evidence racks will be run by editorial boards with named editors. Here are some reasons why.
Each proposition must be stated carefully. It must be accurate, clear, precise, and sufficiently narrow. Wikipedia is very good at many things, such as comprehensiveness and timeliness, even accuracy. But it's not very good at clear and careful writing.
If a given rack wants to open up the job of formulating and revising its propositions, it will need some way to deal with edit wars. Wikipedia is free to allow edit wars to last nearly forever, though it often refuses to do so. However, an evidence rack doesn't have the same freedom. A proposition page cannot grow until the wording of its proposition is fairly precise and fairly settled. Rephrasing a proposition can make some previously relevant evidence suddenly irrelevant.
An evidence rack without editors is vulnerable to spamming by ideologues (selling creationism or Holocaust revisionism) and just plain spammers (selling Viagra and lottery tickets). Serious evidence racks should reflect serious disagreements in the field, but not the pet theories of know-nothing trolls. The line may be hard to draw, but without editors it could be impossible.
An evidence rack could try to deal with weak, ignorant, deceptive, and off-topic contributions as Wikipedia does, by letting responsible users edit the contributions of irresponsible ones --which of course entails that it also allow the reverse. That may work and I welcome the experiment. But even Wikipedia has been forced to introduce editorial layers with the power to trump ordinary contributors.
One of the great benefits of an evidence rack is that its proposition pages may be cited in the footnotes of new articles and books in the field. If the rack is well-run, then its proposition pages will not only be credible and authoritative, but more credible and more authoritative than the footnotes that most scholars could write on their own. But if proposition pages are written the way Wikipedia is written, would serious researchers be skittish about citing them? This is a descriptive question about what scholars would actually do, not a normative question about what they ought to do.
If there are ways to crowd-source high-quality evidence racks, then they are worth trying. They're worth trying if only because a crowd of scholars could notice and add more relevant evidence than any single scholar or hand-picked team of scholars could. However, if the result is high in actual quality but low in reputed quality, causing scholars to distrust it and avoid citing it, then it will fail to serve one of its essential or most promising functions.
A well-run evidence rack can only function as the compendious public footnotes of a field if it passes the scholarly sniff test. If scholars feel they will lose career points if they cite it, then it isn't helping them. We can hope there are good solutions to this problem. But hope isn't a solution, and even today we haven't solved the analogous problem of allowing serious scholars to feel free to cite Wikipedia. It's not enough to create high-quality scholarship. Actual excellence must be accompanied by reputed excellence, or credibility, if we are to persuade scholars that they have nothing to lose, and perhaps something to gain, by citing it in their professional publications. (We can agree on this without taking a position on where Wikipedia stands on either the quality scale or the credibility scale.)
But here's the main point: while we discuss possible solutions other than editors, we can use editors.
If the editors are named, and if they are respected in the field, then their evidence rack can gain authority and stature in the field. It won't merely cite respected works of scholarship in its evidence entries. It can itself become a respected work of scholarship. That will invite scholars to cite its proposition pages in their footnotes. At the same time, for racks that invite contributions, scholars will have an incentive to contribute.
While the case for quality control is strong, so is the case for crowd-sourcing. We should think about the best mix, but we should not start by assuming there must be a trade-off. Many forms of quality control are compatible with many forms of crowd-sourcing.
If we're drawn toward some form of crowd-sourcing in which contributors are unattributed, let's not lose sleep worrying that scholars may not wish to contribute. It's easy enough to test. We can just start. We can put knowledge ahead of career building until we can find a way to yoke them back together. At least that's better than putting career building ahead of knowledge.
* Evidence racks 2.0
We could easily build an evidence rack in a wiki, in Google Docs, in a list-making tool like WorkFlowy, in an online database, or even in hand-crafted HTML pages. But evidence racks will reach a new level of utility when we write dedicated software for managing them. Here are some features that would definitely help the cause.
--We want to be able to revise the formulation of a proposition without changing the URL of the page and breaking incoming links. This is easy with hand-crafted HTML pages and hard with wiki pages.
--Users should be able to view the collection of propositions in many different sequences. The founders or editors of a given rack may have their own idea of the natural thematic sequence for the propositions. That could be the default sequence, displayed on the default table of contents. But readers should have other options, for example, to view the propositions sorted by the date added to the rack, the date revised (revising the language of the proposition), and the date updated (updating the page with new evidence). Users should be able to create their own sequences, save them, and make them available as public or private options for other readers. Each alternate sequence should function like an alternate table of contents, showing the complete set of propositions in the rack, with links to the separate proposition pages. When viewing the default or user-defined table of contents, users should have the choice to click on an icon next to a proposition to hide or show the supporting evidence without leaving the master list.
--Editors should be able to split a broad proposition into two or more narrower ones and automatically create a menu of redirects to the new, narrower propositions. If the software doesn't make this easy, then the difficulty of doing it manually, or the foreseeable inconvenience to readers of not doing it, would deter the splitting of propositions.
--Similarly, editors should be able to join two or more narrow propositions, or two or more synonymous propositions inadvertently added separately, into one page, with automatic redirects to the new page.
--Forking, merging, and version control should be as easy for evidence racks as GitHub makes it for coding projects.
--As standards evolve for expressing canonical propositions as RDF triples, then evidence racks could support machine-readable triples alongside natural-language propositions for human readers, and become interoperable with other knowledge-integration projects. For some developments along these lines, see Nanopublications and OpenPhacts (Open Pharmacological Concepts Triple Store).
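Some of the features above can be sketched in a few lines of code. What follows is a minimal illustration, assuming a rack held in memory as a dictionary keyed by stable page identifiers. The names `Page`, `Rack`, `revise`, `split`, `merge`, `resolve`, and `sorted_by` are hypothetical and do not describe any existing tool.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List


@dataclass
class Page:
    page_id: str   # stable identifier; the page URL would be derived from it
    proposition: str
    added: date    # date added to the rack
    revised: date  # date the wording of the proposition last changed
    updated: date  # date the evidence was last updated
    redirects_to: List[str] = field(default_factory=list)  # set by split/merge


class Rack:
    def __init__(self) -> None:
        self.pages: Dict[str, Page] = {}

    def add(self, page: Page) -> None:
        self.pages[page.page_id] = page

    def revise(self, page_id: str, new_wording: str, when: date) -> None:
        """Reword a proposition without changing its identifier (or URL)."""
        page = self.pages[page_id]
        page.proposition = new_wording
        page.revised = when

    def split(self, page_id: str, narrower: List[Page]) -> None:
        """Split a broad proposition; the old page becomes a redirect menu."""
        for new_page in narrower:
            self.add(new_page)
        self.pages[page_id].redirects_to = [p.page_id for p in narrower]

    def merge(self, page_ids: List[str], combined: Page) -> None:
        """Join several propositions; each old page redirects to the new one."""
        self.add(combined)
        for page_id in page_ids:
            self.pages[page_id].redirects_to = [combined.page_id]

    def resolve(self, page_id: str) -> List[str]:
        """Follow redirects and return the live page ids a link now reaches."""
        page = self.pages[page_id]
        if not page.redirects_to:
            return [page_id]
        live: List[str] = []
        for target in page.redirects_to:
            live.extend(self.resolve(target))
        return live

    def sorted_by(self, key: str) -> List[Page]:
        """Alternate table of contents, newest first.

        key is 'added', 'revised', or 'updated'; redirect pages are omitted.
        """
        live = [p for p in self.pages.values() if not p.redirects_to]
        return sorted(live, key=lambda p: getattr(p, key), reverse=True)


# Hypothetical usage with placeholder propositions.
demo = Rack()
demo.add(Page("p1", "Broad proposition", date(2012, 1, 1), date(2012, 1, 1), date(2012, 1, 1)))
demo.revise("p1", "Broad proposition, reworded", date(2012, 2, 1))
demo.split("p1", [
    Page("p1a", "Narrower proposition A", date(2012, 3, 1), date(2012, 3, 1), date(2012, 3, 1)),
    Page("p1b", "Narrower proposition B", date(2012, 4, 1), date(2012, 4, 1), date(2012, 4, 1)),
])
menu = demo.resolve("p1")          # incoming links to p1 still work
contents = demo.sorted_by("added")  # one of several possible sequences
```

The design choice worth noticing is that the identifier is separate from the wording. Because links point at `page_id` rather than at the text of the proposition, rewording, splitting, and joining never break an incoming link.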
* Autobiographical digression
I started thinking along these lines when I realized that I often wanted to document the same proposition more than once. I might document it well in one article, and then want to assert it again in a later article. But I wouldn't want to assert it without evidence. I could cite the earlier article in the later article, but almost always the earlier article was devoted to many propositions, not just the one I later wanted to document. Hence the reference in the later article would be diffuse, even if relevant. Moreover, in the later article I might want my readers to know about all the sources cited in the earlier article but also add some new ones that had appeared since. I started looking for a way to spin off the documentation for individual propositions from the larger articles in which they were embedded.
I've lived with that problem for more than a decade. But a second thread started just this summer. When my book came out in June (Open Access, MIT Press, 2012), I started an OA page of updates, supplements, and other notes.
The print edition of the book contains more than 40 pages of small-font endnotes. But I have more that I'd like to share with readers. I had to omit some documentation because it came out too late. Obviously scholars didn't stop publishing relevant new studies when I submitted my manuscript in the spring of 2011. I had to omit other documentation because the book was supposed to be short and I was already over my wordcount. I'm grateful to MIT Press for letting me go long. But I still had to perform triage on the available evidence and omit some that was relevant and credible but secondary.
My page of updates and supplements gave me a space --an OA space-- to solve both problems. As the page grew, I realized that it gave me hooks on which to hang all the evidence I knew about. In this case, the hooks were the page numbers of the first edition of the book. But it wasn't hard to see that the same material could be re-organized by proposition rather than by page number.
Having the hooks, unfortunately, is not the same as having the time to make use of them. The page currently has about 50 hooks, or adds updates or supplements to propositions asserted on about 50 different pages. I'm happy with the way it's going, but it still just scratches the surface.
My book home page is not a full-blown evidence rack, and may never become one, for several reasons. (1) I'll want the book page to remain organized by page number even if an evidence rack grows up alongside it with a different organization. And I'll only want a separate rack to grow up alongside if I can avoid duplicating my own labor or the labor of those who might join me in a crowd-sourced version of the project. (2) The book covers a lot of ground. If I broke it into separate propositions for separate documentation, the result would be huge. I realize that this wouldn't be an objection if I started with a subset of the book's major propositions or if I crowd-sourced the project of documenting them, and I'm still thinking about those possibilities. (3) Until the book itself is OA, in about six months, the documentation for many of my assertions will be divided between citations in my published endnotes and citations subsequently added to my OA book home page. (4) I'm still thinking through what an evidence rack could be and ought to be. When I've made more of the decisions I've simply described here, then I'll be in a better position to know whether my book page could morph into an evidence rack.
* Postscript for philosophers. Logical atomists may welcome the idea of an evidence rack. But the idea of an evidence rack does not presuppose logical atomism, and I am very far from being a logical atomist myself. In ordinary articles and books, we individuate propositions for many good reasons, among them to support separate propositions with separate footnotes. Evidence racks individuate propositions for the same reason. They're a format, tool, or convenience, not a philosophical position on the nature of propositions and their relations to other propositions. To use an evidence rack, we needn't believe that they're the only way or the best way to organize knowledge. We needn't believe that the many propositions in the same rack are logically independent, that their referents are metaphysically independent, or that their relationships with other propositions are all external. In fact, we could add links from one proposition to others in the same rack, and these links could specify any relationships we wanted to specify. Any position that could be defended with evidence in footnoted prose, including the rejection of logical atomism, could be represented in an evidence rack. For those who don't know or don't care about logical atomism, please disregard this paragraph and return to useful work. This paragraph is not an argument for evidence racks, merely an argument against one very unlikely misinterpretation of evidence racks.