Visibility beyond open access
SPARC Open Access Newsletter, issue #87
July 2, 2005
by Peter Suber
Open access makes literature easier to retrieve for researchers who know it exists and easier to discover for researchers who don't know it exists.  For the second purpose, it's not enough to remove price barriers and permission barriers.  We have to make the literature visible to scholars and their research tools.  Clearly OA by itself will give any literature a huge boost in visibility.  But while OA literature is much more visible than printed or priced literature, OA by itself is closer to the minimum than the maximum of what we should expect in the digital age. 

Generally speaking, there are two ways to improve visibility:  bring more eyeballs to the literature at a certain site, and copy the literature or at least links to the literature to sites where there are already more eyeballs.  The two methods are not separable:  successful methods for attracting eyeballs will also attract literature and links, and successful methods for collecting literature and links will also collect eyeballs. 

While the two methods are inseparable, they are asymmetric.  User eyeballs can only point in one direction at a time, while literature and links can be copied to any number of sites where there might be eyeballs.  Or at least the literature can be copied if the copyright holder has decided to permit copying.  Two lessons follow:  first, even if you try the hard work of shifting eyeballs, don't fail to try the easier work of putting copies of your work or links to your work where people might see them.  Second, don't lock up your work with publishers whose copyright policies prohibit the kind of copying that increases visibility.  For example, avoid the 20% of non-OA journals that do not permit postprint archiving.  Publishing in their journals will give you a certain level of exposure, but you want the vastly greater level of exposure that comes from having at least one copy that is OA.  If certain popular search engines already attract user eyeballs, you want those engines to index an OA version of your work, not just a pay-per-view version.  You want to facilitate retrieval and reading, not just discovery.

Here are some ways to make your OA literature even more visible than it already is.

* Deposit a copy of your work in an OA, OAI-compliant repository.  An article on your personal web site can be fully OA, but an OAI repository will make it visible to a large and growing number of academically focused, cross-archive search engines.

If Google and other mainstream search engines index the OA articles on personal web sites, isn't that enough?  That's the wrong question.  Yes, Google indexing will greatly boost visibility.  But it's better to be indexed by both the generalists and the specialists than by either breed alone.  See "The case for OAI in the age of Google" from SOAN for 5/3/04.
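
To make the specialist side concrete, here is a minimal Python sketch of what a cross-archive harvester does with an OAI-compliant repository.  It is an illustration under stated assumptions, not any particular search engine's implementation:  the repository address and the sample record are hypothetical, while the `ListRecords` verb, the `oai_dc` metadata prefix, and the XML namespaces come from the OAI protocol for metadata harvesting.  A real harvester would also fetch the URL over the network and handle resumption tokens and error responses.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"

# XML namespaces used in OAI-PMH responses carrying Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL a harvester requests to list a repository's records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Extract (identifier, title, creators) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        creators = [c.text for c in record.iterfind(".//dc:creator", NS)]
        results.append((identifier, title, creators))
    return results


# Made-up response containing a single record, for demonstration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:123</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example postprint</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    print(parse_records(SAMPLE_RESPONSE))
```

Because every compliant repository answers the same request in the same format, one harvester can index many archives, which is what a personal web page cannot offer.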

For the same reason, if your work is already on deposit in an OAI-compliant repository, make sure the repository facilitates crawling by Google and other mainstream search engines.  Here are some tips to facilitate search-engine crawling to pass on to your repository maintainer.

Just last month, Google launched Sitemaps to help webmasters make sure that Google --and other search engines-- can find and crawl their content.  OA repositories and journals should definitely try this.
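
As a rough illustration of what a repository maintainer would produce, the following Python sketch writes a sitemap listing article pages.  The article URL and date are hypothetical, and the namespace is the one defined by the later sitemaps.org protocol rather than the schema Google used at launch, so treat it as an assumption about which version a crawler expects.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org protocol (an assumption about the target version).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML listing each (url, last_modified) pair."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical repository page, for illustration only.
    print(build_sitemap([("https://repository.example.org/article/123", "2005-07-02")]))
```

The resulting file is placed on the repository's web server, where crawlers that support the format can read it to find pages they might otherwise miss.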

* If your repository is in the deep web, and you can't move it to the surface web, then deposit a copy of your work in another repository on the surface web.  More search engines index surface-web material than deep-web material, which is why the deep web is sometimes called the "invisible web".

* Every enhancement to search-engine relevancy algorithms, and every increment in user sophistication, makes your work more visible.  As long as some relevancy algorithms are better than others, every consideration nudging users toward the better search engines will make your work more visible.  Since you gain from technical advances in the algorithms and from the education of users, you should support both.  If a search engine is tops in some respects today, it may not be tops in other respects today or in the same respects tomorrow.  Don't encourage your users to settle for just one research tool, no matter how good it is right now.

To see this point from the other side, run a search on Schmoogle, a Google variant that returns hits in random order.  If your work is indexed in Google, then Schmoogle will find it.  But will it be visible to users who search for it?  (To show users the effect, Schmoogle labels its hits with the rank they would have had in Google.)
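
The effect is easy to reproduce.  The Python sketch below is not Schmoogle's actual code, only an assumption about the general idea:  take a ranked list of hits, label each with its original rank, and return them in random order.  The function name and the `seed` parameter are hypothetical.

```python
import random


def shuffle_with_ranks(ranked_hits, seed=None):
    """Return hits in random order, each labeled with its original 1-based rank."""
    labeled = list(enumerate(ranked_hits, start=1))
    rng = random.Random(seed)  # a seed makes the shuffle repeatable for testing
    rng.shuffle(labeled)
    return labeled


if __name__ == "__main__":
    hits = ["first hit", "second hit", "third hit", "fourth hit"]
    for rank, hit in shuffle_with_ranks(hits):
        print(f"{hit} (original rank {rank})")
```

Every hit is still present after the shuffle, but nothing guarantees that the best one appears where a user will see it.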

* If the journal publishing your work offers free current awareness alerts by email or RSS, then your work will be more visible.  If search engines indexing your work let users store searches and sign up for periodic alerts for matching new content, then your work will be more visible.

* If you publish your work in an OA journal, then it's already visible to users who look in the places where OA work can be found.  But if your OA journal is also distributed in a priced aggregation, then without losing the first audience you'll gain the audience of researchers who look first or look only in that aggregation.  Among the priced aggregations that include some OA journals are EBSCO A to Z, SwetsWise Online Content, and WilsonWeb. 

The real advantage here may be small, to judge by librarian complaints that researchers tend to try Google before trying the expensive databases licensed by the library.  And it may be shrinking, as the growing body of OA content justifies users in looking first in the most convenient places.  But the advantage is still real, and authors of articles in OA journals should not complain, or suspect anything sinister, when those OA journals are picked up by priced aggregators.

* Similarly, if your work is indexed in the OA-oriented tools, like search engines, then you lose nothing and gain new visibility if it's also indexed in the more traditional abstracting and indexing services for your field.  This can come for free (to you, the author) if you publish in certain journals --which needn't be toll-access (TA).  For example, _PLoS Biology_ is indexed by Cambridge Abstracts, EBSCO, Index Medicus, all of the ISI indexes (BIOSIS, Current Contents, Science Citation Index, and Web of Science), Lexis-Nexis, and Swets.  But if you do publish in a TA journal, you can supplement the conventional indexing with OA indexing by depositing a copy of your postprint in an OA repository.

* If your OA work is indexed by search engines that generally focus on priced content, then your work will be more visible --e.g. to researchers who look first or look only in such search engines.  For example, just last month Elsevier's Scirus started a program to index OA repositories.  (Google and Yahoo also index these repositories but they don't --yet-- have Scirus' coverage of priced content.)

Note to authors who have not yet deposited their work in an OA repository:  Google, Yahoo, and Elsevier are *competing* to offer superior indexing of OA repositories.  Don't you want to be part of that?

Although it's not within my topic here, the reverse is true as well.  The visibility of priced content will increase when it's indexed alongside free content  --e.g. for researchers focusing on free resources.  Google Scholar will do this, and so will Yahoo's new Search Subscriptions as soon as it's integrated into Yahoo's general search engine.

The point is not merely that free content helps priced content or that priced content helps free content.  Any collection will attract more eyeballs as it grows in size or usefulness.  Adding high-quality free content will help, just as adding high-quality priced content will help.  We can quibble about which helps more, but it doesn't really matter.  The providers of both kinds have an interest in combining their gravitational attraction.

* As search and analysis tools depend more and more on intelligent XML tagging, such as semantic web tagging, either add these tags to OA editions of your work or choose publishers that will do so.  The boost in discoverability and usefulness is potentially huge.  The price is a slight increase in the difficulty of manuscript preparation.  (BTW, the tagging can be retroactive if you don't have the time to do it prior to release.)  You may not want to do this yourself, but neither do you want the non-OA providers to be the only ones doing it.

* If libraries catalog the records of OA journals, then all the contents of those journals will be more visible --e.g. to researchers who look first in the library catalog. 

The DOAJ offers metadata on OA journals free for downloading.

For tips on how to use these records in library catalogs, see Joan Conger's summary of a 2003 discussion thread on the ERIL list.

* If your OA work is also available in formats for hand-helds and PDAs, then it will be more visible --e.g. to users who cannot run searches from their desk or lap.  Because many practicing physicians fall into this category, the NIH offers PubMed for Handhelds and PubMed on Tap.

* If the articles citing your work use reference linking (making references into live links), then your work will be more visible.  Here's a sign that we are acclimating ourselves to the internet age:  linking is no longer a convenience we celebrate but a necessity we presuppose.  We no longer cheer its presence but curse its absence. 

Reference linking is sometimes thought of as a luxury that many OA journals cannot afford to offer.  But in July 2003 Alf Eaton wrote a Perl script to do the job on the downloadable corpus of OA articles from BioMed Central.  His script can be adapted to any XML file or database that uses PubMed ID numbers and he's willing to share it with any publishers who'd like to use it.  I really don't know whether there's a better or cheaper tool today, but for two years this option has been essentially free.
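
To show how little is involved, here is a minimal Python sketch of reference linking by PubMed ID.  It is not Alf Eaton's script, which was written in Perl and works on XML.  It assumes references arrive as plain text containing a marker of the form "PMID: 12345", and it links each ID to its PubMed record page.  The function name and the sample reference are hypothetical.

```python
import re
from html import escape

# Assumed citation format: the literal text "PMID:" followed by digits.
PMID_PATTERN = re.compile(r"PMID:\s*(\d+)")


def link_pmids(reference_text):
    """Return HTML for a reference, with each PubMed ID turned into a live link."""
    escaped = escape(reference_text)  # make the rest of the reference HTML-safe

    def to_link(match):
        pmid = match.group(1)
        return f'<a href="https://pubmed.ncbi.nlm.nih.gov/{pmid}/">{match.group(0)}</a>'

    return PMID_PATTERN.sub(to_link, escaped)


if __name__ == "__main__":
    print(link_pmids("Doe J. An example article. PMID: 12345"))
```

References without a PubMed ID pass through unchanged, so a script like this can be run over a whole corpus without first sorting the references.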

* If you publish your work in a TA journal and deposit a copy of the postprint in an OA repository, then make sure that readers of the OA postprint can tell that it has been peer-reviewed.  Label it with a citation showing the journal in which it was published.  Even if readers have no trouble finding your work, in most fields they are more likely to click through to full-text if they know it has survived peer review at a reputable journal.  The citation does not remove an access barrier but adds an access invitation.

* If you are the copyright holder for your work and consent to waive some of your rights in order to unshackle your users, then make your consent human-readable and machine-readable.  You can make it human-readable through a statement in the file.  You can make it machine-readable through a Creative Commons license or metadata in a standardized rights language. 

For example, Creative Commons and Yahoo both offer search engines that let users limit searches to CC-licensed content.
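
One common way to make the consent readable by both audiences is a sentence in the page whose link carries `rel="license"`.  The Python sketch below generates such a statement and then detects it the way a license-aware crawler might.  The function and class names are hypothetical, the license URL is only an example, and real search engines may rely on other metadata as well.

```python
from html.parser import HTMLParser


def license_statement(license_url, license_name):
    """Return a human-readable sentence whose link is also machine-readable."""
    return (
        "This work is licensed under a "
        f'<a rel="license" href="{license_url}">{license_name}</a>.'
    )


class LicenseFinder(HTMLParser):
    """Collect the targets of links marked rel="license"."""

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").split()
        if tag in ("a", "link") and "license" in rels and attributes.get("href"):
            self.licenses.append(attributes["href"])


def find_licenses(html_text):
    """Return every license URL declared in a page."""
    finder = LicenseFinder()
    finder.feed(html_text)
    return finder.licenses


if __name__ == "__main__":
    statement = license_statement(
        "https://creativecommons.org/licenses/by/4.0/",
        "Creative Commons Attribution 4.0 International License",
    )
    print(statement)
    print(find_licenses("<p>" + statement + "</p>"))
```

The same sentence serves the human reader and the crawler, so there is no separate file to keep in sync.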

* If you're an expert on a certain topic, then make sure that Wikipedia includes the fruits of your expertise.  (If you didn't know, anyone is free to edit Wikipedia articles and that includes you.)  You may not have a high opinion of Wikipedia, but there are two reasons not to let that stop you.  First, it can become a self-fulfilling prophecy.  If experts add or enhance articles to reflect their expertise, then Wikipedia will deserve respect to that extent.  Second, Wikipedia is an increasingly common first stop, and probably last stop, for non-academic users looking for information.  If you want to be visible to non-academic users, then it's an eyeball destination that you can easily join.  (BTW, the consensus view among academics accustomed to peer review seems to be that Wikipedia is better than it has any right to be.  Don't give up your standards, but don't judge this resource from mere presumptions without firsthand knowledge.)

* Every step toward bridging the digital divide will make your work more visible to those who previously lacked infrastructure.  These initiatives are not limited to providing connectivity.  For example, the NIH is funding a program to make its OA genomics resources more accessible to "minority-served institutions" (MSIs) than they already are.

* Every step toward making the online files of your OA work handicap-accessible will make your work more "visible" or more available to everyone who may wish to read your work.

* Every step toward making your OA work available in more than one language will make it more visible to readers who don't read your language.

* Every step to dismantle filtering and censorship regimes will make your work more visible to researchers whose only online access is through a school, ISP, or government determined to limit what they can see.


SOAN is published and sponsored by the Scholarly Publishing and Academic Resources Coalition (SPARC).

Additional support is provided by Data Conversion Laboratory (DCL), experts in converting research documents to XML.


This is the SPARC Open Access Newsletter (ISSN 1546-7821), written by Peter Suber and published by SPARC.  The views I express in this newsletter are my own and do not necessarily reflect those of SPARC or other sponsors.

Please feel free to forward any issue of the newsletter to interested colleagues.

Peter Suber

SOAN is licensed under a Creative Commons Attribution 3.0 United States License.
