Welcome to the Free Online Scholarship (FOS) Newsletter
     March 18, 2002

Article summarizing software

One of the myriad ways that sophisticated software will help researchers is to write short summaries of digital articles.  Imagine succinct, AI-generated summaries accompanying URLs in a search engine.  Imagine bookmarking a hundred relevant-looking articles for a research project and siccing a summarizer on them to see which deserve a full read.  Imagine right-clicking on a paragraph of postmodern discourse, and selecting "cut the crap" from a pop-up menu.

Gerald DeJong pioneered this kind of AI with FRUMP (Fast Reading Understanding and Memory Program), a 1979 adaptation of Roger Schank's script-based AI.  FRUMP could read long newspaper stories and write strikingly accurate short summaries.  To see where this technology is today, visit the Columbia Newsblaster, an AI news portal from Columbia University's NLP (Natural Language Processing) Group.  Newsblaster collects news in real time from a dozen major free online sources, and breaks it into general categories (e.g. U.S., World, Science) and specific topics (e.g. stem cell research).  Then it writes its own summary of the news on each topic, and gives links to full stories for those who want to read more.  Judge for yourself, but I'm sure you'll find the auto-generated summaries to be clear, accurate, and successful in distinguishing what's central to a story from what's peripheral.

Apart from the intelligent software, a summarizing service like Newsblaster depends on the availability of free online content to harvest as data for the software.  Imagine a "Researchblaster" for your discipline, harvesting the growing number of free, online, full-text articles, and offering accurate summaries organized by category and topic.  The Columbia NLP Group is working on such a system for the field of medicine.

Columbia Newsblaster

Columbia Natural Language Processing Group

Papers from the Columbia NLP Group on summarizing medical articles

Stephen Wan's resources on Automatic Text Summarization, including a history of the field, list of projects, glossary, and bibliography

* Postscript.  Does anyone know of free online research sites in any field that already run summarizing software?  How about freely available summarizing software capable of taking web content as data?

Text-summarizing or "gisting" software is just one example of software that will take FOS as data and return services unobtainable or even unimaginable to researchers in the age of print.  Here's another example from this week's news.  In FOSN for 11/2/01, I wondered whether taxonomy or categorization software, which evolved for business, was being used anywhere for academic research.  This week the Institute of Physics announced that it is using the Vivisimo Clustering Engine for searching its online journals.



* The Budapest Open Access Initiative has now been translated into Russian.
(Sign it, persuade your institution to sign it, take steps to implement it, and spread the word.)

* _In Cognito_ is now making its full-text articles freely accessible online.  It will still publish a priced, print edition.

_In Cognito_

Press release on the new open-access edition

* The government of New Zealand has approved a plan to provide free online access to the nation's statutes, regulations, and bills.  Existing online collections are incomplete, unofficial, and do not show amendments incorporated into the rules they amend.  The complete official version will be phased in gradually and should be finished in three years.

* The University of Colorado has decided to receive only the electronic editions of all the 600-800 Elsevier journals to which its four campuses subscribe.
(Thanks to Library Geek.)

* Lawtopic is an interesting new service.  It simply lists topics for legal research papers, organized by branches of law.  The topics are submitted by judges, law professors, and practicing lawyers.  They are intended for use by law students in need of cutting-edge research topics.  Students who write on one of the topics are encouraged to credit Lawtopic.org.  (PS:  Are there similar services in other disciplines?  I'd love to pose questions for bright grad students to research and answer.)
(Thanks to The Filter.)

* DigitalConsumer.org is a new group devoted to protecting fair-use rights over digital content repealed by the DMCA and threatened by the SSSCA.  It aims to enact a digital Bill of Rights and undo the recent changes in copyright law that have harmed readers, consumers, libraries, innovation, and interoperability.  The site includes its draft Bill of Rights, an excellent FAQ, list of suggested readings, and a form for sending letters to Congress.  If you've wondered how to explain the problems with contemporary copyright law to colleagues who haven't been following the news, or how to defend this critique of recent developments without endorsing piracy, this site is exemplary.  (PS:  One of DigitalConsumer's co-founders is Joe Kraus of Excite.  Kraus was one of the anti-SSSCA witnesses at the recent Senate hearings.)

* [Not Relevant But Who Cares Department.]  A radical separatist group in India has shot and wounded seven people for helping students to cheat on university exams.  The shooters are not friends of higher education so much as partisans of purity aiming to end every form of corruption in the Indian state of Manipur.
(Accessible only to CHE subscribers.)


New on the net

* On March 14, the Text-e symposium finished discussing its tenth and last text.  But its forum will remain open for a general discussion of the issues raised during the symposium:  the impact of the web on reading, writing, research, and the diffusion of knowledge.  The moderators have posted their "conclusions" to the site to stimulate further discussion.  However, their conclusions are much more about the nature and advantages of online symposia than the impact of the web on reading, writing, and research.

* The Hybrid Library (HyLiFe) Project ended last year.  Its toolkit and recommendations are now online.
(Thanks to El.pub Weekly.)

* Ohio State University has released version 2.0 of Prospero, an open-source internet document delivery system.  Prospero lets libraries scan, send, and receive documents, and lets users view them on any web browser.  This allows libraries to use the internet as the medium for inter-library loan, both to move documents between libraries and to deliver them to patrons.  The code for 2.0 is now available for downloading.
(Thanks to Library News Daily.)

* Apache has released 1.0rc2 (version 1.0 release candidate 2) of Xindice, its open-source database specifically designed to store large archives of XML.  The code may now be downloaded from the site.  (PS:  Xindice was formerly called dbXML; see FOSN for 10/19/01.)
(Thanks to El.pub Weekly.)


Share your thoughts

* The Internet Archive is looking for librarians, authors, publishers, teachers, and children's advocates to help it build the International Children's Digital Library, a free online archive of children's literature.  It is also looking for funds to digitize 100,000 children's books.
(Thanks to Shelflife.)

* The National Center for State Courts and the Justice Management Institute would like your thoughts on their draft policy on public and private access to court records.  Comments will be welcome until April 15.
(Thanks to C-FIT.)


In other publications

* In the March 18 _TheScientist_, Barry Palevitz explains how the National Library of Medicine (essentially, PubMed) is using its LinkOut service to give users expanded access to other online sources of biomedical research.

* In the March 17 _New York Times_, Kevin Kelly has an excellent essay on how recording and copying technology has changed, and will continue to change, music.  The deep changes to music that Kelly describes are only FOS-related if you can let them suggest to your imagination equally deep changes to scientific and scholarly literature.  One of his conclusions that may transfer to scholarship is that there are many reasons why free and priced versions of the same content may coexist.  For example, there is far more music of the kind you like than you could ever listen to in your lifetime.  One service worth paying for (until it too is free) is a trusted discriminator that brings to your attention the content you most want right now.  Another conclusion that may transfer is that ultimately the pricelessness of free content is less revolutionary than what Kelly calls the "liquidity" of digital content, or its susceptibility to manipulation by software.  (PS:  For an example of how this matters for scholarship, see the story on text summarizing software, above.)

* In the March 14 _Guardian_, Chris Middleton argues that over the next five years publishing on demand (POD) will be much more attractive to consumers than ebooks.

* In a March 13 article posted to _AntiCensorWare_, Seth Finkelstein explains why internet filtering software blocks sites like the Internet Archive's Wayback Machine.  Because these archives are comprehensive, and because access to them is all or nothing, filtering software regards them as "loopholes" or "proxy avoidance systems" and blocks them.  This kind of censorship "is not a program accident or human error.  It will not be fixed-when-exposed, or fixed-in-next-release, or fixed-when-AI-is-developed-someday.  It is a logical outcome of the imperatives of control over all reading material.  This 'pre-slipped slope' results in the deliberate electronic book-burning of a unique, unparalleled, digital library."
(Thanks to LIS News.)

* In the March 11 _Newsbytes_, Kevin Featherly reports on several bills before the Florida legislature which would close records to the public that are now open.  None of the information to be removed from public access is the kind that might be useful to terrorists.  For example, one kind reveals when public officials meet in private with contractors making bids on public projects.  Florida's newspapers are leading a campaign to keep the records open.
(Thanks to Internet Law News.)

A similar controversy is brewing in New Jersey, but in this case the initiative to close public records originates in the governor's office and the opposition to it comes from the legislature.
(Thanks to Freedom News Daily.)

* The opening story in the March 11 _The Filter_ is not only about the Eldred case and the Supreme Court's agreement to hear it next term, but also about the OpenLaw method of developing the legal arguments that got it this far.  OpenLaw is the innovation of Lawrence Lessig, who wanted to bring the "many eyeballs" approach of open source software to litigation.  Harvard's Berkman Center uses OpenLaw to brainstorm in public about the best legal arguments and strategies to use in real cases (see FOSN for 1/16/02, 2/25/02).  The Eldred case is OpenLaw's first case and its first success.


* The March 7 issue of _CLIRinghouse_ is now online.  The anonymous editorial in this issue argues that course management software should include "direct access" to a college's online library catalog and licensed databases.  This would not only help students find relevant literature and assistance, but make better use of two expensive campus investments (licensed online journals and course management software).

* In a March 2 posting to his web site, Henry Gladney gives a short (1.5 pp.) overview of some of his recent work in the long-term preservation of digital documents.

* In a March 15 story in _Planet eBook_, Sam Vaknin debunks the myth that copyright protects authors (rather than publishers) and gives authors an incentive to create (rather than publishers an incentive to publish).  In the process he describes the corporate interests that have driven recent worldwide changes to copyright law, and three powerful tendencies that make IP rules less and less relevant.  These three tendencies are competition with free content (supported by advertising and other dissemination subsidies), disintermediation, and market fragmentation.

* The March issue of _First Monday_ is now online.  It contains the following two FOS-related articles.

A. Dedeke analyzes different ways to price digital information, given that the first copy is expensive to produce while subsequent copies are virtually free.  Dedeke also looks at ways to justify differential pricing based on differences in features and performance.

Johan Soderberg offers a Marxist critique of copyright, and argues that several aspects of Marx's economic philosophy can be construed to support copyleft and the open source movement.

* In a February 24 posting to _Responsible Netizen_, Nancy Willard presents her report on public schools using internet filters created by the Religious Right.

Here are some news and op-ed pieces based on Willard's report


Following up

To see past coverage of these stories in FOSN, use the search engine at the FOSN archive.

* More on the SSSCA

Jonathan Zittrain has an op-ed in the _New York Times_.
("The PC platform and the Internet to which it connects is [sic] the engine of the information revolution --as important to our economy and culture as all the movies in Hollywood.  A shift from open platforms to closed appliances may be inevitable, as our consumerist desire for trustworthy PC's dovetails with information providers' obsession with control. But we should beware the haste with which some would sacrifice flexibility for control.  If we can't at least temper this taming of the chaotic PC, the victims will be competition, innovation and consumer freedom.")

The EFF wants consumers to show their support for Intel, for standing up to Disney at the recent SSSCA hearings.

On March 14, the Senate Judiciary Committee held a hearing on the SSSCA.
(The hearing agenda and witness list.)

Amy Harmon in the _New York Times_ sets the stage for the hearing by summarizing the controversy to date.

Tom Spring gives the background for the readers of _PC World_.

At the hearing, executives from Intel, AOL, and Excite testified against the SSSCA.  Hilary Rosen for the RIAA spoke in favor of it.

Senator Patrick Leahy says the Senate will not pass the SSSCA until Hollywood and Silicon Valley can find a solution that protects IP without violating fair-use rights.  (PS:  And of course if Hollywood and Silicon Valley do find such a solution, then the SSSCA won't be necessary.)

In recent public statements, Senators Feinstein and Specter have supported the SSSCA, while Hatch and Leahy have distanced themselves from it.

* More on the Eldred case.

The _Washington Post_ ran a pro-Eldred editorial on March 5.
("The real-world explanation for these perpetual [copyright] extensions is, of course, the disproportionate financial clout of corporate copyright holders in Congress and the galloping increase in the potential value of their intellectual property, which heightens the holders' reluctance to give it up when the time comes.  There may not be sufficient constitutional basis for the court to right that imbalance.  But if not, perhaps in the wake of campaign finance reform, some idealistic lawmaker should think about addressing it.")

Evan Schultz in _Legal Times_ argues that Lessig's strategy in the Eldred case could fracture the alliance supporting his cause.  Lessig's earlier arguments were narrowly focused and persuasive, but they lost in court.  One of Lessig's recent arguments is borrowed from Phyllis Schlafly's Eagle Forum, and is broader, less persuasive, and more divisive than his earlier arguments.
("Though [Lessig] initially walked the line between camps in the broad war over the limits of congressional power, he was inevitably drawn into the fight. That means that a case with seemingly broad appeal now must be viewed in starkly political terms. It means that the best hope for truly reviving the public domain is probably Congress rather than the courts.")

* More on the Elcomsoft/Sklyarov case

Steven Levy summarized the case and its issues for the general public in MSNBC.
("Can it be illegal to give people the tools to break into their own property?  The U.S. government thinks so.")

* More on the Rosetta Books case

Random House has lost its appeal to stop Rosetta Books from publishing electronic versions of books for which Random House owns the print rights.

Opinion of the Second Circuit Court of Appeals (March 12)

* More on the British Telecom hyperlink patent case

What BT has patented is not closely related to what we know and love as the internet hyperlink, or at least not as closely related as BT claimed.  Some observers think this ruling will be the basis of a summary judgment against BT.

Opinion of District Court for the Southern District of New York (March 13)

* More on anonymous browsing

Last fall, Zero Knowledge laid down its Freedom Network.  But it has now revived it in a new form under the name Freedom WebSecure.  SafeWeb, which also discontinued its free anonymizing service, is looking for a way to bring it back.


Catching up (old news I should have discovered earlier)

* Launched last September, Open Source Schools is a portal for open source software and "open content" in the service of education.  You won't find much about FOS at the site yet, but the mission statement for the organization says it aims "to assist in the movement to broaden 'free' and 'open source' to include more than software".  It appears to be "open" to learning more about FOS, and supporting it, if any readers have an interest in spreading the word.
(Thanks to C-FIT.)



* In FOSN for 3/11/02, I wrote about a petition asking that federally funded software be open source.  I attributed the petition to Openly Informatics, but I should have attributed it to Open Informatics.  Openly Informatics is a software developer for reference linking and related services, but it has no connection to Open Informatics.  Some of its code is released as open source, but for business reasons unrelated to those put forth in the petition.  My apologies for any confusion this may have caused.

Openly Informatics

Open Informatics

The petition, from the latter, not the former

* In FOSN for 3/11/02, I cited this article on the Budapest Open Access Initiative.  Because I couldn't find the author's name, I called it anonymous.  Helene Bosc has discovered that the author's name is Fabrice Node-Langlois.  Thanks, Helene.

La revolte des savants pour la libre publication (for _Figaro_)
(The article is no longer available at this URL.)



If you plan to attend one of the following conferences, please share your observations with us through our discussion forum.

* Digital Resources and International Information Exchange:  East-West
March 18 (Flushing NY), 20 (Stamford CT)

* Internet Librarian International 2002
London, March 18-20

* The New Information Order and the Future of the Archive
Edinburgh, March 20-23

* Institute of Museum and Library Services.  Building Digital Communities
Baltimore, March 20-22

* Advanced Licensing Workshop
Dallas, March 20-22

* Electronic Publishing Strategy
London, March 22

* Association of Information and Dissemination Centers (ASIDIC) Spring 2002 Meeting
St. Augustine, Florida, March 24-26

* OCLC Institute. Steering by Standards.  (A series of satellite videoconferences.)
Cyberspace.  OAI, March 26.  OAIS, April 19.  Metadata standards in the future, May 29.

* WebSearch University
San Francisco, March 25-26; Stamford CT, April 30 - May 1; Washington DC, September 23-24; Chicago, October 22-23; Dallas, November 19-20.

* European Colloquium on Information Retrieval Research
Glasgow, March 25-27

* e-Content:  Discovering and Delivering Value
Toronto, March 25-27

* New Developments in Digital Libraries
Ciudad Real, Spain, April 2-3

* Copyright Management in Higher Education:  Ownership, Access and Control
Adelphi, Maryland, April 4-5

* Global Knowledge Partnership Annual Meeting
Addis Ababa, April 4-5

* What Scholars Need to Know to Publish Today:  Digital Writing and Access for Readers
Albany, New York, April 8

* International Conference on Information Technology: Coding and Computing
Las Vegas, April 8-10

* NetLab and Friends:  10 Years of Digital Library Development
Lund, April 10-12

* E-Content 2002 (on ebooks)
London, April 11

* Censorship and Free Access to Information in Libraries and on the Internet
Copenhagen, April 11

* International Learned Journals Seminar:  We Can't Go On Like This:  The Future of Journals
London, April 12

* SIAM International Conference on Data Mining
Arlington, Virginia, April 11-13

* Creating access to information:  EBLIDA workshop on getting a better deal from your information licences
The Hague, April 12

* Licensing Electronic Resources to Libraries
Philadelphia, April 15

* United Kingdom Serials Group Annual Conference and Exhibition
University of Warwick, April 15-17

* Conference on Computers, Freedom, and Privacy
San Francisco, April 16-19

* EDUCAUSE Networking 2002
Washington, D.C., April 17-18

* Museums and the Web 2002
Boston, April 17-20

* Legal Guidelines for Use of Intellectual Property in Higher Education
Oneonta, NY, April 19

* Information, Knowledges and Society: Challenges of A New Era
Havana, April 22-26

* DAI Institute on The State of Digital Preservation:  An International Perspective
Washington, D.C., April 24-25

* CLIR Sponsors' Symposium:  New Challenges, New Solutions:  Libraries for the Future
Washington, D.C., April 26

* The European Library:  The Gate to Europe's Knowledge:  Milestone Conference
Frankfurt am Main, April 29-30


The Free Online Scholarship Newsletter is supported by a grant from the Open Society Institute.


This is the Free Online Scholarship Newsletter (ISSN 1535-7848).

Please feel free to forward any issue of the newsletter to interested colleagues.  If you are reading a forwarded copy of this issue, you may subscribe by signing up at the FOS home page.

FOS home page, general information, subscriptions, editorial position

FOS Newsletter, subscriptions, back issues

FOS Discussion Forum, subscriptions, postings

Guide to the FOS Movement

Sources for the FOS Newsletter

Peter Suber

Copyright (c) 2002, Peter Suber