THE INTERNET AND LEGAL INFORMATION:
PROJECTS AND PROSPECTS
TOM R. BRUCE
CORNELL LAW SCHOOL
For the Internet community, the New Yorker cartoon marked a passage into adolescence. The Internet had risen far enough into the popular consciousness to be laughed at by sophisticates, a mark of wider public awareness. Many now divide the course of Net history (informality is an Internet hallmark) into "before the dogs" and "after the dogs". The mutts were just the beginning. Before long, books purporting to be guides to the Internet appeared on the shelves of mass-market booksellers. By New Year's Day they were out of date. All in all, literally thousands of Internet-related stories appeared in the non-technical press in 1993 and early 1994. Many of them made liberal use of the word "cyberspace". Most of them prominently featured the phrases "information superhighway" and "interactive television". Virtually all of them were variations on a theme: five hundred channels of interactive mud-wrestling will be coming to your home soon, courtesy of an unholy alliance of the telephone, cable television, and computer industries. The reaction from most people who had actually been working with Internet-based technologies was one of cautious approval; they liked the attention, but they didn't know whether a future in which all the bandwidth is consumed by Arnold Schwarzenegger movies was exactly what they'd been working toward. Like an adolescent with growing pains, the Internet community is not at all sure whether it wants to be an adult on someone else's terms, or if it might just be wiser to remain a child long enough for the whole thing to blow over. Like other adolescents, it is already well out of childhood with no way back.
The situation today is quite different. Many Federal agencies are now making information available through the Net; perhaps the largest single example is the EDGAR database of securities filings. A dozen or more law schools are mounting information in one format or another; many others have plans to do so. Uniform stylistic standards are the subject of recurrent discussion in the academic legal-technology community, and it seems likely that a "legal hypertext style book" will emerge sooner rather than later. Practitioners have been slower to act; relatively few firms have direct Internet connectivity, and most are worried about security issues which were solved to the satisfaction of the defence industry years ago. Nonetheless, the first WorldWideWeb server operated by a law firm came on-line for test purposes a few hours before this sentence was written, bearing information designed to showcase the intellectual inventory of the firm. There will be many, many more.
These youthful accomplishments invite speculative comparisons with established and presumably more mature commercial legal data services. The comparisons are sometimes fuelled by enthusiasm (the "Wow! we could start our own LEXIS!" response) or skepticism ("Why should we do this at all when we have WESTLAW?"). Those two giants themselves use the Internet as a pipe through which to ship their data inventory. But looking at Internet infrastructure merely as a better or worse way of delivering the same old services to the same old markets is a profound mistake. It is equally wrongheaded to assume that the methods historically used by the computerised legal information industry to exchange money for information are the only ones available to us. To do so ignores the fact that the Internet is a very different medium, and it ignores a long history of software and hardware development in which quantitative improvements in technology, accompanied by reduction in cost, lead to enormous qualitative change.
We are witnessing a revolution of scale and of scope which in many respects will parallel the personal computer's overthrow of the mainframe, and which will occur for many of the same reasons. Perhaps that sweeping statement is simply the legal technologists' transliteration of the media hype mentioned a moment ago; I suppose you could call it the "500 channels of interactive legal information" story. If it is a fantasy, it is one firmly rooted in the reality of present-day production systems, and a look at those roots is in order.
Complete descriptions of the Web standards are outside the scope of this paper, as is detailed explanation of hypertext and what it does. I will instead concentrate on a few critical aspects and implications of the technology, with examples taken largely from our ongoing work at the Legal Information Institute. It seems only logical to begin with the hardware and software needed to support the effort.
Connectivity costs for a Web server vary greatly depending on the type of institution running it. At most university-based sites, the cost of Internet connection for a particular machine is subsumed in some overall budget and is inexpensive on a per-machine basis; ours costs in the vicinity of $36 (US) per month. Costs for a dedicated 14.4K phone link serviced by a commercial Internet access provider, such as the one used by Venable, Baetjer, Howard, & Civiletti to provide access to their Web server, might run $2000 annually. Finally, all the software used at the server is freely distributed and costs nothing. To put these numbers in perspective, the LII's hardware, software, and connectivity cost for startup of its Web, Gopher, and listserv operations, plus one year's operating cost, was less than half the amount of money saved each year by a move to electronic proof generation at the school's two student-edited law journals.
Earlier hypertext systems allowed the user to navigate among documents maintained on standalone computers and on local area networks; they were, in effect, confined to environments where the workstation could be fooled into believing that the disk used to store hypertext documents was physically attached to it. This was and is a serviceable technology, and its utility in the organization and retrieval of legal text has been widely recognized and acted upon. Web technology goes beyond these earlier systems in its ability to provide hypertext links across machines on the Internet, for the most part without regard to specific delivery protocols. It is easy for a Web client to access information made available through the Web itself, through Gopher servers, WAIS databases, anonymous FTP, and essentially any other access system for which gateway software can be written (examples include the HyTelnet, Hyper-G, and TechInfo formats). In cultural and administrative terms, this means that hypertext links can tie together bodies of text which are related in substance but unrelated in sponsorship, text which is mounted and maintained by different organizations, without regard for geographical or institutional proximity. The editorial and maintenance costs of a collection can be spread across many cooperating "sub-providers", each one mounting and maintaining a portion, perhaps one in which the provider has substantial expertise or interest.
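One way to picture this protocol independence is through the uniform link syntax used by Web clients, in which the leading scheme names the access method. The Python sketch below is only an illustration under stated assumptions: the host names and paths are hypothetical placeholders, and the small lookup table stands in for whatever a real client does internally.

```python
from urllib.parse import urlsplit

# Hypothetical links; the hosts and paths are placeholders, not real services.
LINKS = [
    "http://www.example.edu/supct/index.html",
    "gopher://gopher.example.edu/11/opinions",
    "ftp://ftp.example.edu/pub/opinions/92-1168.ZS.filt",
    "wais://wais.example.org/statutes",
]

# Illustrative labels for the access methods named in the text.
ACCESS_METHODS = {
    "http": "Web server",
    "gopher": "Gopher server",
    "ftp": "anonymous FTP",
    "wais": "WAIS database",
}

def describe(link):
    """Return (access method, host) for a link, based only on its scheme."""
    parts = urlsplit(link)
    method = ACCESS_METHODS.get(parts.scheme, "gateway required")
    return method, parts.hostname

for link in LINKS:
    method, host = describe(link)
    print(f"{host}: {method}")
```

The point of the sketch is that the author of a hypertext page writes every link the same way; only the scheme prefix tells the client which delivery protocol to speak.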
The collection of US Supreme Court opinions offered by the Legal Information Institute illustrates this proposition well. The actual opinions are mounted for anonymous FTP retrieval at Case Western Reserve University under Project Hermes, an effort begun by the Court in 1990 to make decisions available electronically to public and private-sector entities. Case Western decided to use anonymous FTP -- the most widely available technology at the time -- as the means of distribution, dividing each text from the court into syllabus, opinion, dissents, concurrences, and so forth, and assigning each portion a unique file name based on the docket number of the case. The user of CWRU's anonymous FTP site is confronted with a directory structure full of files with names like "92-1168.ZS.filt", which is in fact the syllabus of the decision in the Harris v. Forklift Systems sexual-harassment case. CWRU is, of course, adding considerable value by filtering out word-processing codes, dividing the opinions into reasonably-sized chunks, and providing a publicly-accessible distribution point. For the average individual trying to obtain an opinion, however, the combination of cryptic filenames which won't pass unaltered onto a DOS-based file system and the poor navigational capabilities of most FTP clients adds up to an interface best described as user-hostile.
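The file-naming problem can be made concrete with a short Python sketch. It rests on assumptions: only the "ZS" code for a syllabus is given above, so the other part codes in the table are guesses included for illustration, and the pattern assumes that every name has the shape docket number, part code, then "filt". The second function checks the DOS limit of an eight-character base name and a three-character extension.

```python
import re

# Only "ZS" (syllabus) is confirmed by the text; the other codes are
# assumptions added for illustration.
PART_NAMES = {
    "ZS": "syllabus",
    "ZO": "opinion (assumed code)",
    "ZC": "concurrence (assumed code)",
    "ZD": "dissent (assumed code)",
}

# Assumed shape: two-digit term, hyphen, case number, part code, "filt".
NAME_PATTERN = re.compile(r"^(\d{2}-\d+)\.([A-Z0-9]+)\.filt$")

def parse_hermes_name(filename):
    """Split a name like '92-1168.ZS.filt' into (docket number, part)."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized file name: {filename}")
    docket, code = match.groups()
    return docket, PART_NAMES.get(code, f"unknown part {code}")

def fits_dos_8_3(filename):
    """True if the name fits an 8-character base and a 3-character extension."""
    base, _, ext = filename.partition(".")
    return 1 <= len(base) <= 8 and "." not in ext and len(ext) <= 3

print(parse_hermes_name("92-1168.ZS.filt"))
print(fits_dos_8_3("92-1168.ZS.filt"))
```

Because the sample name carries two dots and a four-character final suffix, it fails the DOS check, which is why such files cannot be saved under their original names on a DOS machine.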
The LII's contribution has been the construction of hypertext pages which link case names and related information to the FTP files at Case Western. This provides end users with a hypertextual, annotated table of contents from which they may select cases by actual case name and portion (e.g. syllabus, opinion, concurrence, and so forth) without needing to know or to look up the docket number. These hypertext pages are stored on Cornell's server, and the opinions remain at Case Western. CWRU bears the cost of mounting the actual opinions, and Cornell bears the cost of organizing and maintaining usable access points for the average user. We have gone on to make the collection searchable in a variety of ways, including organization by keyworded topic and full-text search. Again, only the indices and the searching software are at Cornell, with the "base text" at CWRU.
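A minimal sketch of how one entry in such a table of contents might be generated follows. It is not the LII's actual software: the FTP base address is a hypothetical placeholder, and the function name and HTML layout are my own choices for illustration. What it shows is the division of labor described above, in which the page lives on one server and every link points at a file held on another.

```python
from html import escape

# Hypothetical base address; the real FTP host and directory are not given here.
FTP_BASE = "ftp://ftp.example.edu/opinions/"

def contents_entry(case_name, docket, parts):
    """Build one HTML list item linking a case name to its remote files.

    `parts` maps a human-readable label (e.g. "syllabus") to the
    remote file name for that portion of the decision.
    """
    links = ", ".join(
        f'<a href="{FTP_BASE}{escape(filename)}">{escape(label)}</a>'
        for label, filename in parts.items()
    )
    return f"<li>{escape(case_name)} (No. {escape(docket)}): {links}</li>"

entry = contents_entry(
    "Harris v. Forklift Systems",
    "92-1168",
    {"syllabus": "92-1168.ZS.filt"},
)
print(entry)
```

The reader chooses by case name and portion; the docket-based file name appears only inside the link target.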
This process, which we have glibly dubbed "add-on scholarship", is much more than a workaday exercise in the librarian's or indexer's art. The same techniques can be used to unify scattered bodies of related material. We are in the early stages of a grant-funded project in which the Rules of Professional Conduct as adopted in each of the 50 states would be mounted by institutions in each of the states. Participating organizations -- which might be law schools, bar associations, or law firms with a pro bono bent -- need only mount those sections which differ from the two major variants on the Model Rules of Professional Conduct. The remainder of the corpus for each state would be drawn from a central pool, accessed section by section via hypertext links. Thus, the table of contents for any given state would contain a mixture of pointers to generic and localized sections, minimizing the work needed from any one of the participating institutions and, correspondingly, its expense.
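The mixture of generic and localized pointers amounts to a simple fallback rule, sketched below in Python. Everything concrete in it is hypothetical: the section numbers, the addresses, and the state code "XX" are invented, and for simplicity the sketch uses a single central pool rather than the two major variants mentioned above.

```python
# Hypothetical section identifiers and locations, for illustration only.
CENTRAL_POOL = {
    "1.1": "http://central.example.org/model-rules/1.1.html",
    "1.2": "http://central.example.org/model-rules/1.2.html",
    "1.6": "http://central.example.org/model-rules/1.6.html",
}

# A participating state mounts only the sections that differ from the pool.
STATE_OVERRIDES = {
    "XX": {"1.6": "http://state-xx.example.edu/rules/1.6.html"},
}

def table_of_contents(state):
    """Return section -> link, preferring the state's own version of a section."""
    local = STATE_OVERRIDES.get(state, {})
    return {
        section: local.get(section, generic)
        for section, generic in CENTRAL_POOL.items()
    }

for section, link in table_of_contents("XX").items():
    print(section, link)
```

A state that has adopted the generic text unchanged contributes nothing at all, and its table of contents is drawn entirely from the central pool.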
The first idea is illustrated by LIIBULLETIN, a service offered by the LII shortly after we began automatically constructing pointers to new Supreme Court opinions as they were delivered to CWRU by the Court. The same software which discovered the existence of new opinions to be indexed and added to our hypertext pages could, with only minor modifications, send notice by e-mail to persons who wished to know of the existence of a new opinion from the Court, in a message which also contains the syllabus of the opinion and instructions for retrieving its full text by mail. This offers a service to those who have only mail connectivity to the Net, as well as adding value through its timeliness and through its narrowcast characteristics: the alert is transmitted directly to a mailbox which the user presumably scrutinizes frequently, rather than through a broadcast or bulletin-board apparatus.
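The mechanism just described can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the LIIBULLETIN software itself: the addresses are hypothetical, the second file name is invented, and the sketch only composes a message without sending it. New opinions are found by comparing the current directory listing with the previous one.

```python
from email.message import EmailMessage

def find_new_files(previous_listing, current_listing):
    """Return names present in the current listing but not the previous one."""
    return sorted(set(current_listing) - set(previous_listing))

def build_notice(filename, syllabus_text, retrieval_instructions, recipient):
    """Compose (but do not send) an e-mail notice about one new opinion."""
    message = EmailMessage()
    message["From"] = "bulletin@example.edu"  # hypothetical sender address
    message["To"] = recipient
    message["Subject"] = f"New Supreme Court opinion: {filename}"
    message.set_content(f"{syllabus_text}\n\n{retrieval_instructions}\n")
    return message

previous = ["92-1168.ZS.filt"]
current = ["92-1168.ZS.filt", "93-0001.ZS.filt"]  # second name is invented
for name in find_new_files(previous, current):
    notice = build_notice(
        name,
        "(syllabus text would go here)",
        "(instructions for retrieval by mail would go here)",
        "reader@example.org",
    )
    print(notice["Subject"])
```

The same list of new files could feed both the indexing step and the mail step, which is why the notification service required only minor modification.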
The second idea takes its power from the fact that a publicly-mounted text on the Net is visible to large numbers of other experts who presumably have access to mail and can address responses more or less directly to the author. On several occasions -- though I should perhaps blush to mention it -- errata or omissions in an online text have been brought to our attention very quickly by others on the Net. There is an interactive power working in the author's favor here; one need not be trapped for a year or more in an error which has been typeset and distributed to who knows where. Our experience is that the most perceptive and expert readers are the most likely to provide feedback. One anticipates a certain amount of annoyance from non-expert readers, or readers with axes to grind, but in fact this kind of communication has in our case been very rare by contrast with a much greater volume of genuinely useful feedback received from our peers.
Of course, conversations need not be limited to errata and other problems. Substantive discussion of online works can take place, be captured, and then perhaps incorporated into the work. The capture and subsequent public offering of electronic conversation about electronic texts offers the possibility of symposia and colloquia which take place in virtual space. Because of their specific relationship with an electronically published work, such symposia can remain both relatively focussed and timely.
The Net is a communications space common to the public, private, governmental, and academic sectors. Many synergistic efforts can result from this kind of cross-sector proximity. The LII works closely with a number of corporate sponsors who have an interest in legal and quasi-legal information, and examples of such projects exist at many places other than Cornell. The EDGAR securities-filing database was recently mounted on the Internet; it is a large and important example of government information from one agency (the Securities and Exchange Commission) being mounted with funding from another agency (the National Science Foundation) by a collaboration between a private sector entity (Internet Multicast, Inc.) and an academic institution (the business school at New York University). Our own more modest example is the mounting of the NASDAQ Financial Executive Journal, an outreach publication of the NASDAQ Stock Market.
The NASDAQ project is an interesting example of what one might call "corporate presence information", information mounted by a company not for sale or as advertising but as a means of offering access to corporate intellectual property as a new species of customer service. The value of Internet publication to NASDAQ is presence in front of its issuing companies, many of which are high-tech corporations that already have considerable Internet presence themselves. Just as a law firm might open an office in a city where a major client is beginning business operations, NASDAQ has opened a kiosk in cyberspace as a way of showing solidarity with and service to its customers. As it happened, the service was a timely one; the first issue, which was largely concerned with strategies for avoiding shareholder suits when stock prices tumble, was put on the Net roughly a week before the sharp drop in the price of Apple Computer stock last summer. We logged a large number of accesses from individuals at 'apple.com' in the weeks that followed.
The timeliness of NASDAQ's first informational offering was coincidental, but it is not difficult to imagine that corporations -- and even law firms -- might find ways to make equally topical information available by design. Many already do exactly that with newsletters describing the effects of regulatory changes, pending legislation, and so on. Electronic information can, of course, be distributed much more quickly than paper, and can thus be even more timely. We routinely receive the text of the NASDAQ publication at the same time as the printer of the paper edition does, and generally have it on-line ten days prior to the mailing of the paper version.
The other end of the spectrum is not yet well-populated either, primarily because there are as yet no commonly-accepted and reliable authentication mechanisms (passwording systems) which would permit "pay-by-the-drink" services to exist. Counterpoint Publishing, among others, has been experimenting with a subscription system which does seem to work well, although one imagines that the administrative overhead is high. Of course, it is actually possible for a company to make money by giving information away; West Publishing's recent offering of a national directory of lawyers via Gopher server can be seen as this kind of "loss leader", a useful and coherent data resource made freely available, with the implicit promise of much more to those who pay.
In any case, workable authentication schemes are no more than a year away; they are understandably the target of a great deal of interest in both the academic and commercial research and development communities. One should bear in mind that authentication is only the tip of an economic iceberg. Customer-service businesses of various kinds are needed. Imagine, for example, that you routinely access four hundred different information servers each month as you move around the Net; in a hypertext environment you might do this quite easily and naturally. How many invoices do you get at the end of the month? Four hundred? Or would you prefer to deal with a business organization which will consolidate those individual charges into a single itemized invoice, much as charge-card providers do? Some commercial on-line services (Delphi was perhaps the first) are clearly planning to position themselves in this way, but it is not clear that any one of them can do so effectively as a single member of a large and growing pool of access providers.
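The consolidation service imagined here is, at bottom, a grouping and summing of small charges. The Python sketch below illustrates the idea only; the server names and amounts are invented, and no existing billing system is being described.

```python
from collections import defaultdict
from decimal import Decimal

def consolidate(charges):
    """Group (server, amount) charges into one itemized invoice.

    Returns (items, total), where items maps each server to its subtotal.
    """
    items = defaultdict(Decimal)  # a missing server starts at Decimal("0")
    for server, amount in charges:
        items[server] += Decimal(amount)
    total = sum(items.values(), Decimal("0"))
    return dict(items), total

# Invented charges from hypothetical servers.
charges = [
    ("cases.example.edu", "0.25"),
    ("filings.example.gov", "1.10"),
    ("cases.example.edu", "0.50"),
]
items, total = consolidate(charges)
for server, subtotal in sorted(items.items()):
    print(f"{server}: {subtotal}")
print(f"Total due: {total}")
```

Four hundred servers would simply mean four hundred lines on one statement instead of four hundred separate invoices.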
There are those in the academic community for whom any hint of commercialization is anathema, but their days are numbered. Altruism in the mounting of information at no apparent cost has served to jump-start use of the Net, but may not serve the public well in the long run unless the altruists raise their sights. Volunteers have sometimes done a terrible job of meeting information-quality standards we take for granted in print and in the commercial on-line services. In part this is because volunteers have been more interested in the delivery systems than in the content, and in part it is owed to the fact that they can't afford to maintain large collections individually. This is not to say that academic organizations should give up electronic publication in favor of existing commercial operations. Far from it. It is simply that organizations in academia need to find new ways to be compensated for their work; there is much about their labor that has real value in an information economy. Cost sharing and other cross-sector efforts will require an unprecedented degree of administrative cooperation and creativity in constructing relationships between academic, governmental, and private-sector players who do not at this point understand one another very well. The technology makes such cooperation possible but not inevitable; the next major strides to be made in electronic publishing on the Internet need to be administrative rather than technological innovations.