Open access and the Google book settlement
SPARC Open Access Newsletter, issue #140
December 2, 2009
by Peter Suber
Google and the groups suing it --the Authors Guild and the Association of American Publishers-- released a revised version of their settlement agreement on November 13. Judge Denny Chin gave it preliminary approval six days later. (For the major documents, see the links at the end.)
Many sharp eyes and sharp minds are looking at what the revised agreement says, how it differs from the original agreement of October 2008, how well it answers objections levelled against the original, and whether the preliminary approval ought to become final approval. I won't do any of that here. I want to focus on the settlement's implications for OA.
The questions I'm ignoring here are large, and everyone who cares about OA and the future of research should care about them. I'm putting them to one side because many very competent people are already taking them on and because very little has been written about the OA implications. But please go beyond my narrow analysis and try to digest and evaluate the whole settlement. If approved, it will affect the way books are read and distributed for decades to come. It will also, almost certainly, affect what courts and legislatures have to say about fair use, orphan works, mass digitization, class-action procedures, and anti-competitive practices in the digital universe.
(1) The first point to make is that OA was never an issue in the lawsuit. Google wasn't scanning copyrighted books in order to make them OA, and the plaintiff groups didn't sue Google because they thought it was making them OA or planning to make them OA.
However, Google's wide-ranging book-scanning program did overlap with OA. For example, Google was scanning public-domain books and making them at least gratis OA. But the lawsuit raised no objection to the public-domain scans or their terms of access. When the lawsuit was filed, Google suspended its scanning of copyrighted books, but continued its scanning and posting of public-domain books without objection from any quarter.
The lawsuit focuses on Google scans of copyrighted books and a few of its subsets. Some of the copyrighted books were scanned with opt-in publisher consent under the Google Publisher Project. The plaintiffs were fine with that, and only sued to stop the opt-out scanning of copyrighted books under the Google Library Project.
Google Publisher Project (now called the Partner Program)
Google Library Project
Within the subset of copyrighted books scanned with library permission but not necessarily publisher permission, the settlement focuses even more narrowly on the subset of books that are out of print or "not commercially available" (NCA). Within that set, most of the hubbub is about the smaller subset of unclaimed NCA books and the still-smaller subset of orphan unclaimed NCA books. For none of these sets of copyrighted books did Google or the settlement ever propose OA. The controversy and negotiation are about the non-OA terms under which these books would be digitized and distributed.
Bottom line: if Google had never been sued, or if it had won the suit outright, without having to settle, we still wouldn't have OA to the scanned, copyrighted books which are the subject of the suit. In that sense, the lawsuit did not prevent OA to any class of books and the settlement is not a retreat from an earlier plan to provide OA.
(2) If there's an exception, it's an attenuated sort. Both the original and amended settlement provide for free online access from a small number of terminals in libraries (Sections 1.117 and 4.8.a.i) to at least 85% (Section 7.2.e.i.1-2) of the corpus of otherwise non-OA digital books.
(Section numbers refer to the amended settlement agreement.)
We shouldn't call this OA, however. These provisions don't make any books OA. They merely give users a kind of special access to non-OA books.
This exception has the approval of the plaintiffs, of course, or it would not appear in the settlement. We could say that it's analogous to the accommodation authors and publishers have made to the existence of free lending libraries. But before we get too comfortable with that analogy, we should remember that the Authors Guild has not fully accommodated the existence of free lending libraries. As recently as 1987 it demanded "a government-funded royalty paid to authors of books borrowed from libraries."
Moreover, far more citizens have free access to print books through free lending libraries than will have free online access to digital books through the small number of privileged library terminals. How hard will it be to find a terminal for free online access? A typical college or university is allowed one terminal for every 10,000 full-time students (Section 4.8.a.i.1). Community colleges may have one for every 4,000 students (Section 4.8.a.i.2). Public libraries may have one per building (Section 4.8.a.i.3). The only change from the original agreement is that the settlement-created Book Rights Registry may, at its "sole discretion", authorize more terminals in any library building (Section 4.8.a.i.3). (More on this discretion of the BRR in #8, below.)
(3) Another attenuated sort of exception is that both versions of the settlement by default allow Google to display up to 20% of any copyrighted book it scans under the program (Section 4.3.b.i.1). This is a larger portion than the tiny snippets Google displays today.
When OA people say that a text is OA, they mean that the full text is OA. In that sense, it would be misleading to call the 20% slices "OA texts". But no matter what terms we use to describe them, these slices are gratis OA and larger than the snippets that came before.
(4) Even if the settlement does not itself provide significant OA, two commentators have proposed ways in which the settlement could have done much more.
Charles Nesson of Harvard Law School proposed in April 2009 that a cut of the revenue from Google-scanned orphan works should fund an Open Access Trust.
The Open Access Trust.
Part of Nesson's argument is that the revenue generated by selling access to digitized orphan works does not belong to Google or its partners and should be "dedicated to the public good". The amended settlement almost agrees: it changes the way the revenue from orphan works can be spent and comes very close to dedicating it to the public good. But it doesn't expressly require that any of it be spent on OA. (More on how this money can be spent in #7 below.)
I've supported Nesson's argument from another direction: the settlement makes it legally or financially impossible for competitors to re-digitize the same orphan works, and legally or financially impossible for users to make a fair-use argument for access. An OA Trust would benefit the public by providing OA either to some of the same orphan works or to works from the public domain not yet covered by other initiatives.
In June 2009, Peter Eckersley of the Electronic Frontier Foundation proposed that Google put its raw scans in escrow. After an exclusive period of 14 years in which the settlement partners could profit from them, the scans would become OA. Eckersley proposed this for all the Google scans, but the idea comes closest to a compromise that all parties might accept for orphan works.
Part of Eckersley's argument is that this digitization job is too large to repeat. Making the scans available will make repetition unnecessary, and the delay will allow Google to cover its investment without foreclosing competition.
Elsewhere I've supported Eckersley's argument from another direction: digitization projects funded by public-private partnerships should provide OA to their results, even if they allow some delay, and for this purpose private universities represent public funds, since they enjoy public subsidies through untaxed property and tax deductible contributions. (See Cases 7 and 8 in the appendix.)
(5) In a June 2009 interview with Jennifer Howard, Google's Adam Smith said that the company would be willing to put CC licenses on its scanned copyrighted books, if that's what authors wanted.
Two months later Google announced the CC option without waiting for the settlement to be approved.
Until the settlement is approved, I believe the CC option is only available to Google Partners. Partners also have the option to specify how much of the book will be freely available to users --for example, 100% instead of 20%.
Using a Creative Commons license with your books
Setting a book's browsable percentage
Becoming a Google Partner
Details from the (original) Google Book Settlement FAQ
Google built the CC option into the amended settlement (Section 4.2.a.i) along with the option to set the book's sale price at zero (Section 4.2.b.i.1). These options will be available even to rightsholders who are not Google Partners.
Note that what I've been calling a CC option allows rightsholders to select any license they like; it merely uses CC licenses as an example (Section 4.2.a.i). If we consider this a libre OA option, then the new settlement gives rightsholders separate gratis OA and libre OA options.
Also note that the OA options can only be exercised by rightsholders. Hence they will never be exercised for unclaimed orphan works, at least while those works are unclaimed or orphaned. This is a barrier, but not a new barrier, to OA for orphan works.
If anyone is wondering, it's not the case that digitization projects covering orphan works must hold off on OA until they can track down the unknown rightsholders and obtain permission. For a large counter-example run by responsible, law-abiding organizations in the US and UK --two of the countries where the amended settlement would still operate-- see the Medical Journals Backfiles Digitization Project from the Wellcome Library (UK), Joint Information Systems Committee (UK), and the National Library of Medicine (US). The project includes orphan works, digitizes them, makes them OA, and promises to take them down if the copyright holder steps forward and objects.
Google has repeatedly said in public that it favors a legislative solution to the problem of orphan works and would welcome a solution that provides wider access than the settlement. It supports Congressional action in parallel with the settlement, and we should not regard the settlement --a compromise with the Authors Guild and Association of American Publishers-- as its whole position on the question.
(6) Also in June 2009, Rainer Kuhlen and Germany's Coalition for Action: Copyright for Education and Research (Aktionsbündnis: Urheberrecht für Bildung und Wissenschaft) proposed that Google should offer OA to at least some Google-scanned books by German authors. The proposal was made during the public comment period on the first version of the settlement, but would be implemented independently of the settlement.
The Aktionsbündnis proposal
Because the OA would be limited to books by German authors willing to make them OA, Google is willing in principle. This should not be surprising, since the proposal is a variation on the theme of Google's opt-in Partner Program. But the details have yet to be hammered out.
(7) The revised agreement changes the way the Book Rights Registry (BRR) will use the revenue generated from the sale of orphan works. If the money is not claimed after five years, the BRR may use some of it to search for the rightsholders who deserve to be paid. If it's not claimed after 10 years, the BRR may ask the court to give it to nonprofits benefiting "Rightsholders and the reading public" (Section 6.3.a.2-3).
This solution is better than the original in two ways. First, it's possible that some of the "nonprofits benefiting...the reading public" will be supporters or providers of OA. At least they fit the description. The new provision doesn't expressly use the money to fund an Open Access Trust, but it's compatible with that outcome.
Second, even if the money never supports OA, using it to find the rightsholders who deserve it, or giving it to charity, are far better outcomes than giving it to the plaintiffs. Members of the plaintiff organizations didn't write or publish the orphan works generating this revenue. Their only claim to the money is that they brought the lawsuit, i.e. that they have an aggressively conservative view of fair use. It's astonishing that they gave themselves the money in the first version of the settlement and expected a court to approve it.
(8) Much of the settlement's overall impact on OA will be in the hands of the BRR. To see why, we must look more closely at its governance and the policy questions it will be allowed to decide.
The BRR will be governed by a board to be composed of members of the Authors Guild and Association of American Publishers, the two original plaintiff groups, and new, similar plaintiffs from Australia, Canada, and the UK (Section 6.2.b.ii).
The new, non-US counterparts of the Authors Guild are the UK Society of Authors, the (UK) Authors' Licensing and Collecting Society, and the Australian Society of Authors. Their chairs or presidents will represent them on the BRR board. I haven't yet seen news of the non-US counterparts of the Association of American Publishers.
Here's the key point: the board will *not* be composed of people representing authors in general and publishers in general. It will be composed of authors representing the Authors Guild and kindred organizations and publishers representing the Association of American Publishers and kindred organizations.
This matters because the Authors Guild does not adequately represent academic authors. An open letter signed by 21 faculty members at the University of California (August 13, 2009) made this point forcefully:
http://graphics8.nytimes.com/packages/pdf/business/googlebooksearchsettlement.pdf
[B]ecause most Authors Guild members are not academic authors and academic authors did not directly participate in the negotiations over the terms of the settlement agreement, it would appear that the Authors Guild representatives were inspired by priorities different from those that academic authors would have agreed to, had our input been sought or heeded. Specifically, we are concerned that the Authors Guild negotiators likely prioritized maximizing profits over maximizing public access to knowledge, while academic authors would have reversed those priorities. We note that the scholarly books written by academic authors constitute a much more substantial part of the Book Search corpus than the Authors Guild members' books....
The general absence of academic authors from the negotiation matters for three reasons. First, it helps explain why the original and amended settlements offer so few OA options or texts. The California letter continues:
[T]he agreement does not contemplate or make provision for open access choices that have in recent years become common among academic authorial communities, especially with regard to out of print books.... [T]he agreement does not explicitly acknowledge that academic authors might want to make their books, particularly out-of-print books, freely available by dedicating their books to the public domain or making them available under a Creative Commons or other open access license. We think it is especially likely that academic authors of orphan books would favor public domain or Creative Commons-type licensing if it were possible for them to make such a choice through a convenient mechanism. We are concerned that the BRR will have an institutional bias against facilitating these kinds of unfettered public interest, open access alternatives.... Another issue that bears on open access principles is a provision of the settlement agreement that contemplates that subscribers will be able to annotate their books, but restricts the extent to which annotations can be shared....
Both points --that the settlement doesn't represent academic authors and that academic authors would have wanted more OA options-- were reinforced in a subsequent open letter organized by Pamela Samuelson and signed by 64 law professors from around the country (September 3, 2009).
Second, in a class-action lawsuit like this one, the parties in court must fairly represent the members of the class and in this case the author groups do not fairly represent the class of authors. This should be a ground to dismiss the settlement, renegotiate it with more representative plaintiffs, or limit its application to non-academic books. But it did not stop Judge Chin from granting preliminary approval to the amended terms.
Third, the AG-type authors will sit on the BRR board and academic authors will not. This will affect all the decisions made down the road by the BRR. Here are some of the OA-related decisions where bias on the BRR board could make a difference:
(a) The BRR and Google may choose to allow more library terminals for free online access to the otherwise-non-OA corpus (Section 4.8.a.iii). But until the author seats on the board are more representative, the board is unlikely to approve more terminals for free online access.
(b) As noted, if revenue generated from the sale of orphan works is unclaimed after 10 years, then the BRR may file motions with the court "recommending how Unclaimed Funds...should be distributed to literacy-based charities...that directly or indirectly benefit the Rightsholders and the reading public..." (Section 6.3.a.3). Many groups supporting OA fit this description. But until the author seats on the board are more representative, the board is unlikely to recommend groups that support OA.
(c) When libraries provide books for digitization, the amended settlement allows them to use the digital copies in eight specific ways (Section 7.2.b.i-viii). Any other uses require approval of the BRR in consultation with rightsholders (Section 7.2.b.ix.1). But until the author seats on the board are more representative, the board is unlikely to approve uses that widen access for users, even for orphan works when no rightsholders are standing in the way.
(d) The amended settlement describes three not-yet-implemented revenue models for the digitized books. Google and the BRR may agree in the future to implement any combination of them (Section 4.7). One of the three models, file downloading, would reduce access barriers and increase reuse possibilities, even if it could only accompany consumer purchase of the same books. (The other two models are print-on-demand and consumer subscription.) But until the author seats on the board are more representative, the board is unlikely to implement this option.
(e) Both versions of the settlement allow two institutions at a time to treat the digitized books as a "research corpus" for "non-consumptive research", that is, for text-mining or "analysis...on one or more Books, but not research in which a researcher reads or displays substantial portions of a Book to understand the intellectual content presented within the Book" (Sections 1.93 and 7.2.b.vi). The researchers may publish their results (Section 7.2.d.vii), but must have the permission of Google and the BRR to make commercial use of their results (Section 7.2.d.viii). However, until the author seats on the board are more representative, the board is unlikely to grant permission for commercial reuse.
(9) If the author seats on the BRR board will not adequately represent the whole class of authors --in particular, academic authors and their interests-- the same could be said about the publisher seats on the board. The Association of American Publishers does not represent all publishers. In particular, it does not represent publishers of OA books and journals, even if some OA publishers belong to the AAP. On the contrary, the AAP lobbies hard against OA policies and was one of the key players behind the hiring of Dezenhall Associates ("Public access equals government censorship") and the PRISM fiasco.
Even if we put aside the publishers focusing on OA journals, the BRR board will not represent publishers focusing on OA books, such as Bloomsbury Academic, the Open Humanities Press, the OAPEN project, and the many publishers of OA textbooks and university presses launching OA book series or imprints. One of the most promising new business models for scholarly monographs is to harness the synergy of full-text OA and print-on-demand (POD), a model adopted by a rapidly growing number of university presses. But the interests of those presses, and the interests of their academic authors and readers, will be systematically undervalued by the author and the publisher representatives on the BRR board.
This might be excusable if most of the books covered by the settlement were non-academic books like novels and popular non-fiction. But as the California letter points out, the reverse is true. "We note that the scholarly books written by academic authors constitute a much more substantial part of the Book Search corpus than the Authors Guild members' books...."
(10) In Germany, the backlash against the Google settlement spilled over into backlash against OA. In March some anonymous scholars posted the Heidelberg Appeal, a jeremiad against the Google settlement with an aside against OA. To date, the document has collected over 2,600 signatures. The authors objected to the book settlement on the ground that it would steal intellectual property, distribute pirated copies of protected works, and interfere with the freedom of authors to decide where to publish their work. It criticized OA for the same vices, and pointed out that --unlike the foreign Google menace-- OA has defenders inside Germany itself.
The Appeal made several groundless objections to the settlement, but had a plausible case that the original version was not consistent with German copyright law. All its objections to OA were groundless, as was the suggestion that the settlement and OA were somehow connected.
Heidelberg Appeal (March 2009)
For nearly 100 examples of the Heidelberg-based OA misunderstanding, responses to it, and articles about the ruckus, see the OA Tracking Project tag library for "oa.heidelberg_appeal".
The amended settlement agreement answers the primary objection of the Heidelberg Appeal by excluding most books published in Germany. (The only exceptions will be German publications registered with the US Copyright Office.) It also excludes books published in most other countries in the world, and only includes books published in the US and countries with similar legal systems: UK, Australia and Canada (Section 1.19).
By excluding most foreign books from the settlement, the amended settlement excludes them from the free online access from the special library terminals. This is the only sense in which the amended settlement cuts back on the free online access allowed in the original settlement.
Despite its wild swings, the Heidelberg Appeal was one of several factors leading to the settlement revision excluding German books. Another part of its legacy has been to link the Google settlement, copyright infringement, state coercion, and OA in the eyes of many Germans who hadn't been following the issues. Many of the signatories knew nothing about OA but what the document asserted and simply wanted to express their support for copyright law and author rights.
When Google announced the amended agreement and the exclusion of works from non-Anglophone countries, it added that "Google remains interested in working directly with international rightsholders and organizations that represent them, including those in countries excluded from the settlement, to reach similar agreements to make their works available worldwide."
The Aktionsbündnis proposal (in #6 above) should not be seen as a direct example, since it sought author-permitted OA, not settlement-style TA. But it does show Google's willingness to take up negotiations outside the settlement to cover books from countries excluded from the settlement, and its willingness to extend opt-in OA beyond the formal Partners program.
* Here are the primary documents on the revised agreement:
Amended Settlement Agreement (the 168 pp. document filed by the parties with the US District Court for the Southern District of NY, November 13, 2009).
Redline edition of the amended settlement agreement, marking all changes from the first edition.
Supplemental Notice To Authors, Publishers And Other Book Rightsholders About The Google Book Settlement (the notice to rightsholders which, if approved, the parties hope to send out this month).
The Revised Google Books Settlement Agreement (Google's summary of the revised settlement).
Questions about the Revised Google Books Settlement (Google's FAQ about the revised settlement).
Amended Settlement Filed in Authors Guild v. Google (the Authors Guild summary of the revised settlement).
Preliminary approval of the amended settlement agreement, November 19, 2009.
Motion from Amazon to overturn the preliminary approval, November 20, 2009.
* Here are three sites to help keep track of the filings, analysis, news, and comment:
The Public Index (line-by-line annotation, analysis, and discussion of the documents, and more, from the Institute for Information Law and Policy at New York Law School, a project led by James Grimmelmann).
Google Book Settlement (Google's portal on the settlement)
OA Tracking Project tag library on the Google settlement (more than 500 items tagged to date)
* Here's a selective list of some of the major comments and analyses:
James Grimmelmann, GBS: Midnight Madness, Laboratorium, November 14, 2009.
Danny Sullivan, Revised Google Book Settlement Filed & Live Blogging The Press Call, Search Engine Land, November 14, 2009
Open Book Alliance, Is the Google Books Settlement Worth the Wait? November 14, 2009.
Fred von Lohmann, Google Books Settlement 2.0: Evaluating the Pros and Cons, The Electronic Frontier Foundation, November 16, 2009.
--Also see his subsequent installments in the same series: November 17, 2009.
--November 19, 2009.
--November 23, 2009.
Randall Picker, Assessing Competition Issues in the Amended Google Book Search Settlement, University of Chicago Law & Economics, Olin Working Paper No. 499, November 16, 2009.
Gavin Baker, Revised Google Book settlement: what it means for OA, Open Access News, November 16, 2009.
Pamela Samuelson, New Google Book Settlement Aims Only to Placate Governments, Huffington Post, November 17, 2009.
Kenneth Crews, GBS 2.0: The New Google Books (Proposed) Settlement, November 17, 2009.
Jonathan Band, A Guide for the Perplexed Part III: The Amended Settlement Agreement, from the ALA, ACRL, and ARL, November 23, 2009.
James Grimmelmann, The Google Settlement: what's right, what's wrong, what's left to do, Publishers Weekly, November 23, 2009.
Ben Hallman, Q&A: Open Book Alliance Lawyer Gary Reback on the Google Book Search Settlement, AmLaw Litigation Daily, November 24, 2009.
SOAN is published and sponsored by the Scholarly Publishing and Academic Resources Coalition (SPARC).
Additional support is provided by Data Conversion Laboratory (DCL), experts in converting research documents to XML.
This is the SPARC Open Access Newsletter (ISSN 1546-7821), written by Peter Suber and published by SPARC. The views I express in this newsletter are my own and do not necessarily reflect those of SPARC or other sponsors.
To unsubscribe, send any message from the subscribed address to <SPARC-OANewsemail@example.com>.
Please feel free to forward any issue of the newsletter to interested colleagues. If you're reading a forwarded copy, you can subscribe by sending any message to <SPARC-OANewsfirstname.lastname@example.org>.
SPARC home page for the Open Access Newsletter and Open Access Forum
SPARC Open Access Newsletter, archived back issues
Open Access Overview
Open Access Tracking Project
Open Access News blog
SOAN is licensed under a Creative Commons Attribution 3.0 United States License.