Prepared for Conference
on Free Information Ecology
March 31-April 1, 2000
Legal Information - A Strong Case for Free Content, An Illustration of How Difficult "Free" May Be to Define, Realize, and Sustain
Peter W. Martin
Co-Director Legal Information Institute and
Jane M.G. Foster Professor of Law
Cornell Law School
[I. Introduction | II. The LII | III. Why Law Should Be Free | IV. Reasons Some Law-Making Bodies Are Hostile | V. The Difficulty of Defining and Implementing "Free" | VI. A Few Conclusions]
For the past eight years, my colleague Tom Bruce and I, doing business as Cornell's Legal Information Institute (LII) < http://lii.law.cornell.edu/ > have worked at liberating and disseminating content flowing from sources that in theory should support the widest possible distribution. The following observations draw on that experience. Since this work has given us an unusual vantage point and the experience has been rich and is vigorously ongoing, these remarks will cling closely to what we have seen and heard. While hesitant to generalize, I do view what follows as a collection of important realities or data points with which any broad theory concerning "free" access to or use of legal information (and perhaps information of other types) must contend.
II. The LII -- A Unique Vantage Point
Before proceeding to those realities, I should describe the vantage point. Today, Cornell's Legal Information Institute runs the most heavily used non-commercial, comprehensive law site on the Net. We have arrived at this spot, in part, by having been there from the Web's earliest days. We ran the first Net server focusing on a discipline outside the hard sciences (initially a gopher). We created and released as freeware the first Web browser to run under Windows (Cello) -- a necessary step in those pioneering times toward providing effective hypertext access to law via the Net. But we are also in this location, today, because of a strong personal commitment to the value of broad, free public access to law and our keen interest in how digital technology can be used to achieve that end. Both individually and together, Tom Bruce and I have been confronted with and resisted countless opportunities to trade in that commitment for large personal gain. I report that not to seek your approval, but to help you judge our value as a case study. Many of our academic colleagues view us as foolish in the extreme. While other Internet projects that began in the academy have, long since, moved to some commercial form, with great gain to the principals, we have held to our original non-profit path.
From the outset the two of us agreed to put all our creative output and the resulting revenues into the LII. Both of us had pre-existing publishing or consulting arrangements, but since combining forces in the institute, we have done all our consulting, speaking, writing, editing, software creating, and database building as "the Legal Information Institute". We have turned down individual relationships with commercial entities that would have subtracted time and focus from the LII or compromised the institute's commercial neutrality.
The LII has no close counterparts in the U.S. For those seeking to assemble useful U.S. law data around a problem, issue, or point of curiosity, the principal alternatives are a range of commercial providers offering more or less comprehensive information services, public bodies offering fragments, and many niche players of new and interesting kinds. Judging from the daily inquiries we receive from start-ups, many more of the latter are on the way. Similar academically-connected law-not-coms do exist in Canada (at the University of Montreal < http://www.canlii.org/index_en.html >), Australia (the Australasian Legal Information Institute < http://www.austlii.edu.au/ >), Norway (Lovdata < http://www.lovdata.no/info/lawdata.html >), Zambia (the Zambia Legal Information Institute < http://www.zamlii.ac.zm/ >), and South Africa (at the University of Witwatersrand < http://www.law.wits.ac.za/ >). One appears poised close to launch in the U.K. Due to the huge differences in the pre-existing legal information ecology in these places, our discussions with these peer institutions yield as much insight into the peculiarities of the U.S. setting as they do information about how others are approaching common problems.
University platforms for the dissemination of legal information should, in theory, be able to come close to a "free content" ideal. A central element of law, after all, is communication. Whether the law in question commands, enables, or creates framework and process for resolving disputes, whether it emanates from or is implemented by legislature, administrative agency or court, it achieves its intended effect only by reaching an intended audience. And universities have historically sought effective means of disseminating the results of their faculties' research and scholarship. "Land grant" institutions like Cornell and others with substantial public funding are dotted with "free" information distributing activities.
We established our institute in 1992 in the conviction that digital technology should facilitate a quantum shift in the distribution of legal information and also make it possible for a university law school to become a serious electronic publisher of its own research. To explain the venture to colleagues and alumni we analogized its aims to those that prompted Cornell to establish its first law journal in 1915 and two additional ones in later years. Journals like these, we pointed out, are costly. Given the school's purposes in producing them, they would be free if they could be free, since all whose work they contain seek the widest possible readership. But with print, the incremental costs of production and distribution prevent "giving copies away" without limit. The Net, we argued, removed that frustrating constraint.
Our vision represented more than a shift in institutional support for publication to a new, digital form with the potential for increasing readership and reducing cost. We intended this institute to become itself the locus of serious research on how this same medium might be used to improve access to legal information. Our research in this area has, from the start, been applied or experimental. We have built a succession of new products and services designed to be useful to a variety of constituencies, both familiar and new. Since the available technologies and the reachable user base have been changing at unprecedented speeds, our efforts to work effectively with law content at their intersection have been stimulating, influential, and frequently overwhelming.
In 1993 our original web server held a hypertext version of the U.S. Constitution, an HTML front-end to the collection of Supreme Court decisions at the Case Western Reserve ftp site, the Uniform Commercial Code and a few federal statutes -- all created in HTML version 1 by hand mark-up. More a proof-of-concept project than a resource for serious researchers, it responded to a few hundred data requests a week.
By late winter 2000, we were running an array of eight servers that must deal with over a million data requests a day, representing at least 40,000 user sessions. On days when the Supreme Court releases decisions, summaries linked to the opinions in full text are dispatched via e-mail to 20,000 initial recipients of the liibulletin. Since we encourage redistribution, we have no idea of the size of this e-bulletin's ultimate audience. Needless to say, it is much larger than and quite different from that for the Cornell Law Review. While we offer two CD-ROMs for which we charge, all these other services are free.
Being free they have gathered a very different population than subscribes to LEXIS, Westlaw or even Loislaw. Unfortunately, we have only limited direct information about those we serve or what interests and needs draw them into the U.S. Code, decisions of the U.S. Supreme Court, state law materials, the Federal Rules of Evidence, collections of resources around such particular topics as bankruptcy, employment discrimination, child custody or free speech, or any of the other resources that we mount or organize. We simply have not yet had the time, space or research collaborators to pursue this fascinating line of inquiry. There is, however, a fair amount we can deduce. Furthermore, since the Internet is a communications channel, and not simply a broadcast medium, the LII does "hear" from good numbers of its diverse public. Many LII users are "U.S. law people" -- lawyers, legal academics, or law students in the United States. Yet there are many other categories as well, some we had not anticipated, being conditioned by the commercial law publishers and our institutional experience to think of a limited set of professional groups as the "legal information market." To those with seemingly costless, limitless access to the commercial on-line systems (U.S. law faculty and students), or with the revenue base to afford heavy use of those well developed services (large law firms), the initial law offerings on the Net seemed small and largely redundant. By contrast, to individuals and institutions lacking comprehensive and timely access to U.S. law, even modest amounts of important legal material on the Net offered a radical improvement. Many of our users fall in this latter category. 
They include: 1) teachers, librarians, and students involved in both secondary and higher education (other than law schools); 2) lawyers in public offices, public interest, and small firm settings; 3) professionals in fields heavily affected by law -- police officers, business managers, and journalists, to name but a few; 4) ordinary citizens wanting more detail on a high profile decision or issue; and 5) all sorts and conditions of people outside the U.S. Traveling along the Internet to places and people not reached by WESTLAW, LEXIS, or comprehensive law libraries, the Institute uncovered an enormous unsatisfied demand for accurate and timely legal information.
Throughout this extraordinary journey we have been buffeted with experiences bearing on the conference theme and have also had the opportunity to observe the behavior of a rapidly shifting cast of other players.
III. Reasons Why Legal Information Should be Free
A. Law functions through communication
From earliest times "communication" has been central to law. As the technology of communication has changed, the impact of those changes on law and the central actors in the law process (law-makers, law-appliers, lawyers, and citizens) has been profound. The introduction of the technology of writing, then the printing press, then widespread literacy and the growth of organized libraries each transformed the law activity.
Since many legal norms do not operate through citizen self application, the quality of communication within the structure of government is equally important to the law's performance. In areas like tax and social security, law operates through large government agencies, which intersect the lives and activities of huge numbers of citizens. Qualities of performance such as accuracy, timeliness, consistency, and equity (like cases treated like, different cases, with appropriate difference) are strongly influenced by how communication of governing legal norms is accomplished within these agency structures. In areas of the law where judges or judges and law enforcement officials are essential elements of the law application process, the concerns are quite similar even as the means of communication have traditionally been different. Public bodies and those who do their work are among the most important users of legal information.
Better access and improved communication have been consistent targets throughout the history of printed law -- from Sir Edward Coke who translated the classic Littleton's Tenures from "Law French" into English so that it might be understood "seeing that ignorance of the law is no excuse," through the early 19th century statutes that required judges to write out their decisions so that accurate copies might be distributed in print, the late 19th century codification and restatement movements that were premised significantly on a view that law derived from the mosaic of judicial opinions was too inaccessible, to the Administrative Procedure Act and subsequent "plain English regulation" movements of the 20th. In some instances, concern that people be able to know the grounds of their accountability, "ignorance of the law being no excuse," captures the rationale for these reforms, but in others the aims are better understood affirmatively. That is to say whatever goals the law is pursuing and through whatever intermediate means, the prime instrument is communication. Efforts to make law more accessible, more understandable, more clearly expressed are ultimately efforts to make law more effective and in a democracy, more accountable.
New York legislation provides for publication and placement of reported appellate court decisions in all county and public libraries -- as a means of providing free access to the state's law. [E.g., < http://assembly.state.ny.us/cgi-bin/claws?law=53&art=27 >] A similar provision for free distribution of statutes exists in states that publish their own. [E.g., < http://www.revisor.leg.state.mn.us/stats/3C/12.html >]
Liberated by digital technology from the marginal costs of printing, shipping, and storing which force hard choices about how many copies, where, and for whom, law-making bodies might be expected to embrace free distribution of their output by entities like ours and indeed to undertake it themselves. Our experience teaches that there are many reasons they may not do so, at least in any way that counts.
B. Legal texts are so important shouldn't their dissemination be controlled?
As we began placing federal statutes and U.S. Supreme Court decisions on the Net, using them as test bed collections for our earliest experiments in mark-up, hypertext, and full-text search and retrieval, we were blind to a barrier that lay at the threshold of similar experiments undertaken by colleagues in Canada at the University of Montreal and in Australia at our southern hemisphere namesake, and arguably blocked initiatives in the U.K. In the U.K. and many of its former colonies this impediment to the free flow of legal information goes by the name of "Crown copyright." See < http://www.lexum.umontreal.ca/conf/dac/en/ > Stripped of all nuance, this is a legal doctrine that law, including legislation, agency rules, and the opinions of judges, is covered by copyright, with that copyright held by the government. Those who would publish law in any form, print or the new digital alternatives, must secure permission. And that permission can be conditioned on payment of royalty, flatly denied, or deliberated over interminably. This approach cannot honestly be attributed to a single rationale, but it might be represented by the thought that legal texts (and originally others such as the Bible) are too important for the government to allow uncontrolled publication. The revenue and other returns flowing from "official" or "authorized" printers and the vested interests of public printers may be a significant reason for the endurance of this doctrine. On the other hand, it is not a huge leap from the view that not everyone should be allowed to practice law to one that not everyone should be allowed to publish legal texts where typos can have such major consequences.
Because judicial opinions, legislation, and regulations emanating from the U.S. government are unquestionably free from copyright, we had to consult with no official before redistributing decisions of the Supreme Court we had acquired in digital format. For the same reason, we were able to order a CD-ROM compilation of the U.S. Code from the Government Printing Office for under $40, convert the data to HTML, add cross-links, and place the resulting information product on the Net without having to persuade anyone in the U.S. government of the importance of our purpose or the quality of our work. Because the New York courts and legislature do not quarrel with the sound view that state legal texts are also in the public domain (a proposition unfortunately not codified in the U.S. Copyright Act), we had to gain no New York approval before placing decisions of the New York Court of Appeals and the Consolidated Laws of New York on-line.
Our principal copyright problems associated with acquiring law data have come from a sector we naively assumed to be friendly to the cause of broad public dissemination. I cherish the "cease and desist letter" I received from the American Bar Association's general counsel demanding that we remove the Model Rules of Professional Conduct from the LII Web site. (Our immediate response was to substitute the Idaho rules, based closely on the ABA standard.) My blood pressure rises when I recall exchanges over license terms with the American Law Institute and the Permanent Editorial Board of the U.C.C. In order, these bodies assert proprietary claims to the Model Rules, the U.C.C. or at least its official comments, and all Restatements of the Law. More importantly, all depend for their survival on the royalties they receive from commercial distribution.
C. Costs of acquisition and reproduction as barriers limiting distribution of this public domain material
Moving content from print to digital format is costly, running currently at two to three dollars per page. This is a second burden we have not had the need nor would we have had the resources to take on. (Our only experience with such work is in producing an elegant on-line version of an important historical text, Bracton, in collaboration with the Harvard Law Library. In that project, these costs were borne by the Ames Foundation. < http://bracton.law.cornell.edu/bracton/Common/index.html >) Everything our institute has done has begun with digital material -- in most cases digital material acquired from a public source. The Supreme Court of the United States began releasing its decisions in electronic format in May of 1990. The New York Court of Appeals established a dial-up bulletin board at around the same time. By the nineties courts, legislative bodies, and agencies were preparing their output with computers. While print was still their formal or official distribution medium, digital release posed minimal incremental costs. Both the U.S. Supreme Court and New York Court of Appeals financed these incremental costs by charging subscription fees. The former set up a system limited to information brokers or resellers and priced it accordingly. The New York Court set a much lower annual fee of $30. Even with the added long-distance charges for those outside the Albany area, this put the resource directly in the hands of lawyers and small newspapers.
In many European countries, including those unencumbered by doctrines of government copyright in law, free distribution has nonetheless been frustrated by tight control on the terms of access to official systems of digital distribution. Comfortable with uncontrolled private sector print publication and conditioned by Westlaw and LEXIS to view digital law as no less suitable for competitive, multiple source redistribution, U.S. courts and legislatures have been far quicker to release digital take-offs from their law-making activities and to do so without attempting to impose conditions. One of the most dramatic developments in U.S. legal information flow of the past few years has been the explosion of state and local government law sites. < http://www.law.cornell.edu/background/states/ >
IV. Why, Nonetheless, Public Bodies May Be Indifferent, If not Hostile, to Free Digital Distribution
Despite the apparent promise in the number of public law sites, our experience has taught us not to be surprised when courts and legislatures fail to embrace or aid free distribution of their output, let alone implement effective digital distribution themselves. Here are some of the reasons for such a response.
A. Use of the time before print publication for completion and revision
A distribution process that includes substantial time between official act and final official publication may allow some measure of revision during that period. Many appellate courts, for example, have grown comfortable with, indeed, reliant upon the lag between initial release of their opinions and their appearance in "law reports," using that time for cite checking and editorial review.
In some jurisdictions those functions are actually performed by a separate office, the office of court reporter. Judges write opinions which are released in "slip form" but then readied by a court reporter for publication in archival form. When reporters add summaries and key words to decisions that commonly occurs after rather than before initial release. Nearly all courts delay the attachment of full citation information to decisions until their appearance in print.
All of the above features are reflected in the current practice of New York's highest court. Five decisions handed down (and placed on the Web) by the New York Court of Appeals between December 2 and 16 were not published in "official form" until February 9, 2000. They were logged in by the Cornell Law Library staff on February 18.
Since the software used by the LII to pull decisions from the Court of Appeals' new Web server detected reloading of significant numbers of decision files in January, at a time when the court was not in session, I inquired of the court's clerk whether we could assume that no changes of content were involved unless the revision was specifically noted in the file. In the same message I pointed out a date error in one of the decisions. Here is his reply:
As promised, I followed up with our web master concerning files your system picks up as having been revised. He confirms that he does re-format the files on occasion, and that in most cases the revised files your system identifies have not been changed in any material way.
However, on rare occasions, substantive changes are made to decisions that were previously posted. It is not our policy to mark these, as all writings posted bear a disclaimer that they are subject to revision before publication in the official New York Reports. I therefore regret that I cannot assure you, as you request, that you can ignore a fresh file for a decision the court has previously released unless it carries some explicit notation that it is an amended or corrected version. However, I can tell you that the number of times such a fresh file will represent an amended or corrected version are exceedingly rare.
Again, I thank you for your interest in the Court's web site and for your bringing the date errors to my attention.
Clerk, New York Court of Appeals
It is an overstatement to say that the version of a decision the court releases in digital format is a draft, but as the clerk points out each file carries the warning: "This opinion is uncorrected and subject to revision before publication in the New York Reports." Having worked with the court's decisions for five years, I can assure you that is not just a formality. If that is so, why doesn't the court subsequently release the final version at its Web site?
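The clerk's distinction between files that were merely re-formatted and files that were substantively revised is one a redistributor must test for itself, since the court does not mark revisions. The following is a minimal sketch of one way to do so, not a description of the LII's actual software. It assumes that "re-formatting" means only changes in whitespace and line wrapping; the function names are hypothetical, and changes to markup would still be reported as substantive.

```python
import hashlib
import re


def content_fingerprint(text: str) -> str:
    """Hash the text after collapsing runs of whitespace to single spaces.

    Assumption: a pure re-format changes only spacing and line breaks,
    so two versions that differ only in layout share a fingerprint.
    """
    normalized = re.sub(r"\s+", " ", text).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def classify_reload(old_text: str, new_text: str) -> str:
    """Compare a stored copy of a decision with a freshly fetched one."""
    if old_text == new_text:
        return "identical"
    if content_fingerprint(old_text) == content_fingerprint(new_text):
        return "reformatted"
    # Anything else needs human review: it may be a correction.
    return "changed"
```

A file classified as "changed" would then be compared by hand against the stored copy, since, as the clerk's reply makes plain, nothing in the file itself will say whether the difference matters.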
B. Use of "official" status as a form of barter
Courts (and legislatures) in large market jurisdictions like New York are able to and therefore tempted to reap some return from their output. Since these bodies are not only a source of law but also heavy users of legal information the contractual arrangements surrounding "official" reports or a state code can provide a way to finance government operations. The commercial entity undertaking the responsibility of doing official publication in print and now electronic format commonly contracts to furnish the issuing public body and other designated recipients with significant quantities of its output.
The admixture of editorial content by a state court reporter or legislative staff creates a composite that is copyrightable. That allows the public body to assure a measure of exclusivity to any potential private sector partner, or to secure a revenue stream from any competitor, or both.
This recipe has worked in New York and California, though not in small population states. Indeed, historically large states have been able to generate competition over these contracts. The current New York contract, let to Lawyer's Coop while it was still in competition with West, expires at the end of this year. The terms under which the next five year contract will be bid are constrained by both established practice and statute. The successful proposal will commit to furnishing numerous copies of the published reports to state offices ranging from the state library, through all the state judiciary, to each county and public library in the state. It will also commit to a price for the sale of the reports to the public -- both print volumes and other media or formats. The successful bidder will almost certainly agree to provide the hardware, software, and training necessary to enable the staff of the reporter's office to enter decisions into the contractor's data system driving both print and electronic publication.
Several cycles ago local printers vied for this contract. In view of the scale of the undertaking and the current shape of the legal information marketplace that seems unlikely in 2000. Bidding is certainly beyond comprehension for a small non-profit like the Legal Information Institute. My principal point, however, is simply that the attractiveness of this contract to any potential commercial bidder would be dramatically undercut by unrestricted, uncharged release of decisions in final and official form. The existence of free "final" and "citable" copies, whether at a public site or picked up and distributed with or without enhancement by non-profits like ourselves or a state bar association or "no charge" commercial sites like Findlaw, would threaten the private sales on which this contracting practice depends.
In California the publisher of official reports prior to the West-Thomson merger was also a Thomson subsidiary, in its case Bancroft-Whitney. Under the current contract it is West Group. The Court system has a Web site. It holds only "slip opinions" and prior to this January held them for only 100 days. It has now begun to archive "slip opinions" beyond that period and instructs users both that the archive collection is not "provided for purposes of legal research" and that:
Cases beyond the Web site's retention period are available at Westlaw.com in the CA-ORCS database or individually in WestDoc. Westlaw.com is a fee-based online research service of the publisher of the California Official Reports.
Prior to the West-Thomson merger, West published official state reports in eight states, Thomson subsidiaries in six including New York, Illinois, and California. Now, of course, all are under the same tent and competition between the two over those contracts is history. Whether big market states will succeed in drawing new private sector bidders or whether their situation will come to more closely resemble that currently existing in small market states remains to be seen.
C. Small legal information markets -- Some very different behavior
1. North Dakota is not New York
As is often true at points of dramatic change, those least well served by the old regime can more readily see and seize the full advantages of the new. It should come as no surprise that a strong model for what can be done with a judicial Web site has first arisen in a state with fewer than a million residents, North Dakota.
North Dakota is one of many states too small to warrant their own set of printed law reports, let alone generate competition over "official report" status. Decisions of its Supreme Court and an annual handful of selected decisions of its intermediate Court of Appeals were published by the West Publishing Company along with those of six other states in West's North Western Reporter series.
In 1997, the North Dakota Supreme Court adopted its own "media-neutral" case citation system (in full conformance with the recommendations of several national bodies). The day an opinion is released it is assigned its permanent official citation. References to specific portions of that opinion need not await its appearance in print, for the system includes paragraph numbers. Under the court's 1997 citation rule, a citing reference to a particular passage in a 1997 decision would be Falcon v. State, 1997 ND 200, ¶ 12 or Falcon v. State, 1997 ND 200, ¶ 12, 570 N.W. 2d 719. (Falcon being the 200th decision of the North Dakota Supreme Court in 1997 and the referenced passage being in paragraph 12.)
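Because a citation of this kind is built only from the year, a court designation, a sequence number, and a paragraph number, it can be read by software without reference to any publisher's volumes or page breaks. The sketch below illustrates this with the Falcon example. The pattern is an assumption inferred from that single example (a four-digit year, an upper-case court abbreviation, a decision number, and an optional ¶ pinpoint), not a statement of the court's rule, and the names used are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class NeutralCitation(NamedTuple):
    year: int
    court: str
    sequence: int              # nth decision of that court in that year
    paragraph: Optional[int]   # pinpoint paragraph, if one is given


# Assumed shape: "<year> <COURT> <number>[, ¶ <paragraph>]"
_CITE = re.compile(
    r"\b(?P<year>\d{4})\s+(?P<court>[A-Z]{2,})\s+(?P<seq>\d+)"
    r"(?:,\s*¶\s*(?P<para>\d+))?"
)


def parse_neutral_citation(text: str) -> Optional[NeutralCitation]:
    """Return the first media-neutral citation found in text, or None."""
    m = _CITE.search(text)
    if m is None:
        return None
    para = m.group("para")
    return NeutralCitation(
        year=int(m.group("year")),
        court=m.group("court"),
        sequence=int(m.group("seq")),
        paragraph=int(para) if para is not None else None,
    )
```

Applied to either form of the Falcon citation above, the function yields the same four elements; the parallel reference to the North Western Reporter, where present, is simply ignored.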
The Court also established an official web site to which decisions are released in final, official, citable form -- released and archived. See < http://www.court.state.nd.us/court/opinions.htm > The following year, 1998, the site added decisions of the North Dakota Court of Appeals in the same final and official form.
2. The New York Court of Claims
The New York Court of Claims is not the U.S. Supreme Court. It is not the New York Court of Appeals. Most citizens and many lawyers in New York know nothing of it. And so it should be for this court has a narrow, though exclusive role. It has jurisdiction over all civil claims for damages brought against the state of New York. See < http://www.nyscourtofclaims.state.ny.us/ > Before we were approached by the court's staff, neither Tom nor I knew anything of its work.
While the precedents and procedures of this specialized court may become critically important to those with a claim in tort or contract against New York, the size of this legal information market falls far below any reasonable threshold for print dissemination by public or private source.
While the lawyers who bring cases to the court are rarely repeat players, those they oppose, on the other hand, work for a single law firm -- the office of the Attorney General of New York State. The files and experience of that office place it in a position of considerable advantage over the typical claimant and even, perhaps, over most judges serving on the court. Those judges number twenty or so; their chambers are scattered across the state from Long Island to Buffalo and Plattsburgh. In a year's time they dispose of 200 or so decisions on the merits. Before digital distribution became possible no systematic process assured that a judge sitting in Binghamton would know of a decision rendered by a colleague in Utica on a similar case the prior year.
Under a series of contracts the LII has worked with the judges, clerks, and secretaries of the court to create a database system that will begin with the computer in chambers on which opinions are written and move them along an electronic pathway to the Chief Clerk's office in Albany for insertion into a full collection -- accessible via the Net to other members of the court, to those with similar matters before the court, journalists, and anyone else interested in the court's work or the performance of a particular judge. A prototype made up of the court's 1995 decisions is now on-line < http://www.nyscourtofclaims.state.ny.us/decision.htm > and the "in chambers" portion of the new database system is in use as of the first of the year.
Providing "free" access to relevant legal information that previously did not move effectively may, in the end, have more impact than adding a "free" alternative to the already numerous channels distributing decisions of the U.S. Supreme Court. And law-making bodies without effective access to their own output have no excuse for not creating a digital archive for shared internal and public use.
3. Municipal codes
In some respects, municipal codes are more important to issues in the daily lives of many citizens than the U.S. Code, and often they are far less accessible. In countless communities, including surprisingly large ones, the collected ordinances are still poorly maintained and inefficiently distributed. Here is an account of a recent attempt to secure the dog ordinances of the City of Binghamton, New York. The source is a young lawyer in that city; the date, February 2000.
Believe it or not, the city clerk told me that no complete copy of the Binghamton Code is available to the public anywhere, even the public library. The only way to get an up-to-date version is to go to (or call) the clerk's office. I know they sell copies of the zoning and building codes ($10 each) and when I called, the clerk faxed me an up-to-date version of the dog law at my request. (He knows I'm a lawyer and we've spoken before.) I don't know what would happen if a citizen needed to see part of the code that was not available for sale.
I'm pretty sure there's not even a complete copy of the code in the clerk's office -- I think you'd have to tell someone which section you were interested in and they'd print it out....
In a growing number of U.S. cities and counties the Net has enabled officials to do what Binghamton has not. In such places as Rochester, New York < http://www.ci.rochester.ny.us/ >; Cincinnati, Ohio < http://www.ci.cincinnati.oh.us/ >; Boone County, Kentucky < http://nt2.scbbs.com/cgi-bin/om_isapi.dll?clientID=1418&infobase=amlegal-16&softpage=ref_MainView >; Fitchburg, Massachusetts < http://126.96.36.199/folio.pgi/fitchbg.nfo? >; and Yuma City, Arizona < http://nt2.scbbs.com/cgi-bin/om_isapi.dll?clientID=1414&infobase=amlegal-35&softpage=ref_MainView >, citizens troubled by barking dogs, interested in establishing an ambulance service or massage parlor, or curious about the required setback for a contemplated garage can find the pertinent law on-line.
V. The Difficulty of Defining and Implementing a Public Responsibility to Disseminate Law (for Free)
A. Costly "free" data
My early enthusiasm over the growing number of public bodies releasing law in digital form thrust me into a public exchange with Vance Opperman, then President of the West Publishing Company. He dismissed these sources as offering only "raw data," uttering the phrase in a pejorative tone that suggested sewage. It was strictly a rhetorical move, but it suggested an important truth: "not all data are of equal value." Digital data can be delivered "free" yet configured in ways that carry huge implicit costs, for both redistributors and ultimate users.
B. Free information in inhospitable formats
In January 1997, when the Legal Information Institute first undertook programmatic conversion of U.S. Supreme Court decisions to HTML, the Court was releasing its decisions in word-processing format -- WordPerfect 5.1. In the summer of that same year, the Court shifted internally to Microsoft Word. Rather than release opinions as Word documents, the Court began with the October 1997 term to release its decisions in the proprietary PDF format. The change came with little warning and insufficient time to allow us to build and fully test what had to be totally new conversion software. West, LEXIS, and the New York Times had to contend with this same inattention to the needs of subscribers to the Court's electronic distribution service, though with far greater resources. Fortunately, somewhat later in the same term the Court added an SGML-like tagged ASCII format -- a hybrid of structural and presentational markup. Unfortunately, this came too late to save the LII the effort of creating software to convert PDF. < http://supct.law.cornell.edu/supct/ >
Why dig into these technical details? In such technical details lies the difference between effective, free public access and costly, limited public access. Too many public law-making bodies that have undertaken digital distribution of law data have done so without any thought to facilitating redistribution with added value. Distributing only in PDF is a telltale sign. PDF is not friendly to subsequent machine processing. Those who want a court opinion to "look like a court opinion" on the screen or upon being sent to a laser printer are fond of the format. But for those who would link the references within a document to the cited material, add key words and other metadata, create sophisticated full-text indices, and integrate its content with other related law materials, PDF is a major barrier. The commercial subscribers to the Supreme Court's opinion distribution service were every bit as unhappy as the LII with the switch to PDF and as quick to embrace the subsequent SGML-like alternative.
Subtler barriers lie in format changes and inconsistencies produced by simple inattention. Bodies that exercise great care to assure the quality and consistency of their output in print can wreak havoc on the data systems of others that build on their opinions, enactments, or rules, because they release data that will print handsomely on a page but be utterly confusing to text processing software or a search engine. Our work with the opinions of the New York Court of Appeals has given us repeated painful lessons in the many different ways a majority opinion can be joined with a dissent, main headings within an opinion set off, and the date of the decision indicated -- all the while printing quite handsomely. Until public bodies take digital distribution as seriously as they do print, this will remain a problem.
C. Archiving, a public responsibility too?
Legal documents of reasonable age often bear on contemporary legal problems or issues. A source of free law data that provides access to only a year of appellate decisions, or even a decade, or to only the current version of the compiled statutes or regulations will be incomplete and may be misleadingly so. To the extent that recent decisions refer to earlier ones and compilations indicate when and where sections and larger units were added or amended, researchers will at least be alerted to the need to continue their research using other data sources, even though that may entail major discontinuities of time, place and medium. But in cases where that need is less evident, the ease of access to a partial collection may create a hazard.
This issue was addressed, at least in part, by the important 1994 Wisconsin State Bar report on citation. See <http://www.law.cornell.edu/papers/wiscite/wiscite.overview.html > Provoking fierce West Publishing Company opposition, that report drew national attention to the powerful connection between citation norms and access and the consequent need for vendor and media neutral citation. It also identified the need for a public archive. The report recommended:
Maintenance by the Clerk's Office (and the State law Library) in electronic format of an archive of the final copies of all opinions in their final form. This archive would be available to anyone who wished to copy part or all of the archive, for the cost of copying. It would constitute the authoritative master copy of state court opinions. < http://www.law.cornell.edu/papers/wiscite/proposal.html >
Nearly six years later the Wisconsin courts have finally implemented the recommended citation system, but there is, as yet, no archive. The North Dakota court site appears to represent one, but it reaches back less than a decade. Creating an archival collection that goes back beyond the point at which courts began to release and store their decisions in digital format requires resources that no non-profit has to devote and no public body has yet appropriated.
A given year's compiled code (legislative or administrative) is far less problematic a resource than a single year's appellate decisions. Furthermore, a simple form of archiving can consist of retaining each superseded version rather than throwing it away, as limited resources have forced the LII to do with the U.S. Code. Reflecting, perhaps, the culture of its publisher, the National Archives and Records Administration, the federal government's online Code of Federal Regulations takes this approach. See < http://www.access.gpo.gov/nara/about-cfr.html > A far more sophisticated model is provided by Tasmania's Consolidated Legislation Online < http://www.thelaw.tas.gov.au/ > which offers a dynamically generated "point in time" view of legislation reaching back, at present, only three years.
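The simpler archiving approach described above, retaining each superseded version, is enough to support a basic "point in time" lookup. The following Python sketch is only an illustration under stated assumptions: the class and method names are hypothetical, and it is not a description of how the Tasmanian system or any other site is actually built.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class SectionHistory:
    """Hypothetical store of every retained version of one code section."""
    effective_dates: List[date] = field(default_factory=list)
    texts: List[str] = field(default_factory=list)

    def add_version(self, effective: date, text: str) -> None:
        # Insert in date order; superseded text is kept, never overwritten.
        i = bisect_right(self.effective_dates, effective)
        self.effective_dates.insert(i, effective)
        self.texts.insert(i, text)

    def as_of(self, when: date) -> Optional[str]:
        """Return the text in force on `when`, or None if no version applied yet."""
        i = bisect_right(self.effective_dates, when)
        return self.texts[i - 1] if i else None


history = SectionHistory()
history.add_version(date(1999, 1, 1), "text as amended in 1999")
history.add_version(date(1997, 7, 1), "text as originally enacted")
print(history.as_of(date(1998, 3, 1)))
```

The point of the sketch is that the marginal cost lies in storage and discipline rather than in sophisticated software: once old versions are kept with their effective dates, answering "what did this section say on a given day?" is a simple lookup.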
D. U.S. Courts of Appeals -- Procurement mindset and inconsistent data structures
Several years ago, a public spirited group of law school librarians, technology people, and others gathered at the Georgetown Law Center to explore ways of bringing the decisions of the U.S. Court of Appeals to the Net. The University of Texas and Emory had already assumed that task as to some of the circuits and it was not long before all were spoken for, at least once. Observing that this distributed collection would need connective tissue, the LII offered to build and maintain a form of uniform locator service -- a valuable idea that failed to fly in part because of the difficulty of acquiring the proprietary citation information for those decisions in mergeable digital form. To this day none of the official or non-profit sites serving Court of Appeals decisions provides a way to connect a particular opinion with its standard citation. The service the LII did launch and has attempted to sustain is a cross-site full text index -- which enables users to search for decisions dealing with particular topics of federal law without having to visit multiple sites and master the idiosyncrasies of multiple search engines. < http://straylight.law.cornell.edu/usca/search/ >
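The idea behind a cross-site full text index can be shown in a few lines. The Python sketch below is a toy illustration, not the LII's software: the class name, the tokenizing rule, and the example URLs are all assumptions made for the example. It gathers the text of decisions from several sites into one inverted index so that a single query returns matching opinions wherever they are hosted.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class CrossSiteIndex:
    """Toy inverted index mapping each word to the URLs of opinions containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, url, text):
        # The URL identifies the opinion on whichever site hosts it.
        for word in tokenize(text):
            self._postings[word].add(url)

    def search(self, query):
        """Return, in sorted order, the URLs of opinions containing every query word."""
        words = tokenize(query)
        if not words:
            return []
        hits = set.intersection(*(self._postings.get(w, set()) for w in words))
        return sorted(hits)


index = CrossSiteIndex()
# Hypothetical addresses standing in for opinions on separate circuit sites.
index.add("https://fifth.example/op1", "Clean Water Act permit dispute")
index.add("https://ninth.example/op2", "Water rights and the Clean Air Act")
print(index.search("clean water"))
```

Even this toy version depends on being able to fetch plain, consistently structured text from every participating site, which is why format choices made by each court matter so much to anyone attempting integration.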
The good news is that in the years since we undertook this project several of the circuits have established their own servers, including the Fifth, the Sixth, the Seventh, and the Ninth. Decisions at the Sixth Circuit site carry the electronic citation established by court rule in 1994. But regrettably these public sites have not been designed to facilitate cross-site linking or indexing and in that respect they are less useful than their academic precursors. Not surprisingly courts that cannot coordinate their schedules for recruiting law clerks or numerous other details of carrying out their parallel tasks have each contracted for decision database services without much regard to the interests of those seeking to access and read their decisions, let alone those seeking to integrate their work product with that of other circuits.
Primary legal texts are peculiarly fragmentary or recombinant. Though chunks of the U.S. Code are denominated chapters, they are not written to be read from start to finish, one after another. One gathers relevant provisions around a problem, following cross references in a section that link it to others that sharpen or qualify its effect, tracing back to determine if any of the operative words or phrases are defined elsewhere. Individual appellate decisions rarely can be understood without reference to numerous others, including later ones. And since decisions cannot themselves refer to later opinions that overrule, disapprove or qualify their holdings, data systems must do that work. This high degree of textual interconnection is why such large gains can be realized by placing legal materials in a searchable, hypertext environment. Much of our institute's research has concerned techniques, both automated and editorial, that aid the gathering of related legal materials from multiple sources.
Although the LII's on-line U.S. Code was once a Net "exclusive" it has long since become one of many. The House of Representatives itself offers a searchable version. Nonetheless, this LII resource continues to draw over 3 million hits a week. The explanation lies not in unique content but distinctive features of format and functionality. While this collection's content is drawn from the government, it has been reformatted and given navigation and finding aids not available elsewhere on the Net.
We continue to add new features that increase the value of this resource and, significantly, several of them draw together information services provided by different offices of the federal government. We have, for example, created links between the Code and related portions of the Code of Federal Regulations, and built an updating feature that integrates separate services offered by the LII, the House of Representatives, and the Library of Congress.
Public law-making bodies that recognize their obligation to provide effective public access to their law still need a lot of help in coming to understand that a handsome, free, up-to-date collection of PDF files can fail to deliver on that obligation and can actually frustrate it by making it difficult for other public bodies and independent value-adders like the LII to integrate their work with other relevant material. "Open," "modular," and "interoperable" are qualities as important to the value of legal data as they are to computer code. The on-line opinions of the North Dakota Supreme Court can and do link to cited earlier decisions of the court [E.g., < http://www.court.state.nd.us/COURT/OPINIONS/990306.htm >], but references to the North Dakota Century Code, also on-line, are not linked because the legislature's site, built from a database used for bill drafting, has not been structured with such use in mind. < http://www.state.nd.us/lr/index.html > The LII's on-line U.S. Code, by contrast, has from the start been set up to welcome links -- whether from Supreme Court decisions at our own site or the sites of thousands of others, ranging from U.S. government agencies to the COWPIE newsletter.
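What it means for a collection to "welcome links" can be made concrete. If every section of a code has a predictable address, anyone holding an opinion as machine-readable text can turn its citations into hyperlinks automatically. The Python sketch below is a minimal illustration under stated assumptions: the URL pattern is hypothetical (it is not the LII's actual address scheme), and the pattern recognizes only the simplest citation forms.

```python
import re
from html import escape

# Hypothetical, predictable address scheme; any real site's scheme may differ.
BASE_URL = "https://example.org/uscode/{title}/{section}"

# Matches simple citations such as "42 U.S.C. § 1983" or "17 U.S.C. 107".
# Hyphenated section numbers and ranges are deliberately not handled here.
USC_CITATION = re.compile(r"\b(\d+)\s+U\.S\.C\.\s+(?:§\s*)?(\d+[a-z]?)\b")


def link_usc_citations(text: str) -> str:
    """Return HTML in which each recognized U.S. Code citation is a hyperlink."""
    pieces = []
    last = 0
    for match in USC_CITATION.finditer(text):
        # Escape the plain text between citations so it is safe as HTML.
        pieces.append(escape(text[last:match.start()]))
        url = BASE_URL.format(title=match.group(1), section=match.group(2))
        pieces.append(f'<a href="{url}">{escape(match.group(0))}</a>')
        last = match.end()
    pieces.append(escape(text[last:]))
    return "".join(pieces)


print(link_usc_citations("Liability arises under 42 U.S.C. § 1983."))
```

Nothing comparable is possible when the target collection offers no stable, section-level addresses, or when the citing document is available only in a format from which the citation text cannot be reliably extracted.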
E. Public law gatherers -- Agencies that gather law for their own staff and those they relate to
One place in the public sector where these issues may be appreciated is in some administrative agencies. The law governing the operations of the U.S. Department of Agriculture includes both portions of the U.S. Code and the Code of Federal Regulations. It is, thus, not surprising that its web site gathers both, via web links < http://www.usda.gov/news/about.htm > or that the California State Water Resources Control Board site assembles relevant state and federal water quality law. Most of this agency law gathering is simply that without tighter integration of the material, but some agency law collections now surpass some rather costly commercial legal information products in timeliness and comprehensiveness. E.g., < http://www.ssa.gov/ >
F. A continuing need for the kind of work the LII has undertaken
For free law content on the Internet to approach its potential value, new analogs must be developed for some very old devices that make particular texts locatable -- devices for organizing, finding, and sorting whose print predecessors have become so ubiquitous and familiar as to be invisible. The recombinant nature of law data and the very public and decentralized nature of the Net underscore the need for interoperability between collections. Interoperability calls for a set of common approaches that permit cross-referencing between documents in separate collections and that act to create integrated functionality among them.
The LII's aim has been, and continues to be, to be more than a non-commercial distributor of law content. Through example, white papers, workshops, and technical exchanges with peers we have worked to set and spread standards for interoperability, markup, and resource location. This July we are sponsoring an international invitational workshop on these technical matters that can have such important consequences. Participants are coming from all of the major English-speaking jurisdictions, from important U.S. Government web publishers, from the highest quality state sites offering legal information within the United States, as well as from important sites in Norway, South Africa, and elsewhere. We firmly believe that these discussions will be an important step in improving the cooperative relationships and interoperable technologies shared between "law-not-coms" worldwide.
Like these peers and others putting law content on the Net, the LII has encountered a vastly larger and more diverse audience for legal materials than the commercial publishers and on-line providers previously perceived or dealt with. Often, it is an audience that is highly sophisticated in its needs even though it is not an audience of lawyers; professionals of all kinds in many countries make use of legal information. This new and important audience is largely ignorant of the idiosyncrasies of legal research and is, in effect, asking why legal research can't be done in ways that are closer to other forms of on-line research. It is a good question, and while there are doubtless sound reasons why legal research must be different, there is also little doubt that a commercial duopoly serving an all-lawyer audience had little reason to innovate or to make things easier for non-professionals -- and hence it did not. Another target of the LII's research involves building systems that seek to serve these nontraditional audiences more effectively. We do so in the belief that finding and organizing legal information is not all that easy for lawyers either, and that ultimately improvements in the environment for a broader audience will improve things for legal professionals as well.
Our present and planned future work in this area concerns: mark-up standards and document structuring, metadata and metadata description, and the coordination of this standards work with other public legal information providers. We shall continue to maintain and further develop key collections of primary material as testbeds for this work, with the twin goals of determining that contemplated standards actually work in practice and of demonstrating that the work involved in conforming pre-existing collections can result in worthwhile improvements in functionality.
VI. A Few Conclusions
What general conclusions, if any, do I derive from the LII's eight-year journey? They seem, in summary, unremarkable:
First, the costs of content acquisition bear directly on the capacity of any public or non-profit institution to distribute that content without charge.
Second, those costs come in many forms and can be astoundingly, defeatingly high even with "free" data.
Third, effective opposition to free dissemination of content comes, wittingly or not, from many surprising sources. They include important public and non-profit bodies which are, today, disturbingly dependent on historic content distribution arrangements that are inconsistent with free distribution.
Fourth, with content that must, for greatest value, be recombinant "free" is not enough. Shared standards, interoperability, and the existence of third-party integrators are necessary elements of "effective free."
Finally, "free" need not be, often should not be, without limitation or control over any and all conceivable forms of reuse. One of the ways the Legal Information Institute has sought to finance public access activities is by charging commercial entities for the use of our software tools, content, and know-how.