Citation Software

Monday, January 4th, 2016

Citations and Software – A Long and Vexed Relationship

Hat tip to the team responsible for Blueline (http://blueline.blue/), who suggested a post on the love-hate relationship between programmers and The Bluebook.

They have discovered, as others have before, how challenging it is to create software that will identify all the legal citations in a document and do something to or with them. The trail, dotted with patents and patent applications, is a long one, stretching back to the 1980s when a pair of Harvard Law School grads established a software enterprise they called Jurisoft. By 1986 Jurisoft’s offerings included CiteRite, list price $395, very likely the first successful PC program focused on the professional rather than business side of law practice. CiteRite would scan a brief for citations and generate a report enumerating all failures to conform to Bluebook format. In short order, Jurisoft was acquired by the parent company of Lexis. By 1990 the Jurisoft line included a companion program named FullAuthority, which, to quote one reviewer, had the “smarts” to do the following:

All you have to do with FullAuthority is tell it the name of the text file on your computer that contains the legal citations. It will zip through your document, tracking each legal citation like a bloodhound. When it has rounded them all up, it will organize them into groups. These groups may include cases (with separate categories for state and federal cases), statutes (with separate categories for state and federal statutes) and other authorities.

Together CiteRite II and FullAuthority made up Jurisoft’s Citation Toolbox. Their system requirements are a stark reminder of the computer environment of the early 1990s:

IBM PC or compatible, MS-DOS 2.0 or higher, 250 kilobytes available memory, hard disk recommended

In the early 1990s both major online providers were moving toward hyperlinking some of the citations that appeared in their collections of judicial opinions, which, of course, required them (and all subsequent competitors) to have sophisticated in-house tools for identifying and manipulating citations.

In time Word replaced WordPerfect as lawyers’ preferred word processing software and Dakota Legal Software brought out a Word add-on designed to compete with the Jurisoft programs. Lexis acquired its technology as well and folded it into the company’s Lexis for Microsoft Office. Today, that package, like the comparable Drafting Assistant from Westlaw, performs cite-checking, quote-checking, and citation linking in addition to format review and table of authorities compilation.

Both major vendors also include, as part of their latest-generation systems, a copy-with-citation feature that purports to furnish a properly formatted citation (in any one of numerous formats, including the distinctive non-Bluebook variants employed in California, Michigan, and New York). Those features were reviewed in an earlier post.

Citation tools operating outside and apart from Westlaw and Lexis continued to appear. Although maintenance of the CiteIt! software appears to have ended over a decade ago, the product’s features are still on display at: http://www.sidebarsoft.com/. Another product, CiteGenie, held its ground until WestlawNext’s copy-with-citation feature effectively supplanted it. And, for a time, Jureeka! offered those reading citation-filled documents on the open Web a browser add-on that would convert plain text citations into links. Now along comes Blueline.

Some Reasons for Programmers to Love The Bluebook

Whether designed to review a document for citation format compliance, to check a citator for authority undercutting cited decisions, or to compile a table of authorities, verify the accuracy of a quotation, or generate a link, citation software must first identify which of the diverse character strings found as it scans a document constitute citations and not addresses, part numbers, or radio station call letters. If citation format were uniform across the United States, if judges in federal and state courts and the lawyers submitting documents to them conformed their citations of authority to a common standard presented in a consistent format, the job would be an easy one. The Bluebook, with its claim to offer “a uniform system of citation” (a phrase its proprietors have trademarked), purports to be just that. And so it is within the universe of academic law journals. Complex though it may be, to the extent that the citations in U.S. law writing conform to The Bluebook the programmer’s job is relatively straightforward.

To the chagrin of those attempting to construct citation-identifying algorithms, however, courts in the fifty U.S. states have quite diverse ideas about citation norms. Often they are focused narrowly on the legal authorities most frequently cited in cases coming before them. The Bluebook specifies that Indiana Code sections be cited in the format “Ind. Code § x-x-x-x” and those of the Idaho Code as “Idaho Code § x-x”, but when judges and lawyers in Indiana cite code provisions to one another they often cite to I.C. § x-x-x-x; just as those in Idaho cite to I.C. § x-x.

Generally, the federal courts and those practicing before them take a less parochial view when citing state authorities, but they are far from consistent on some very basic points. The Bluebook has it that a provision in the Code of Federal Regulations should be cited: “x C.F.R. § xxx.xx (year)”. The U.S. Supreme Court favors “x CFR § xxx.xx” (no periods, no date) but is not followed on this point by most lower federal courts. (Those at Blueline claim their citation analysis suggests “that Republican appointed judges typically cite the U.S. Code as ‘USC’, whereas Democrat appointees prefer ‘U.S.C.’”) Approaches to compressing party names and citing treatises are all over the place. The same holds for abbreviations of the several sets of federal procedural rules as cited in briefs and court opinions.
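To make the C.F.R. and U.S.C. variations concrete, here is a minimal Python sketch of a recognizer that tolerates both the Bluebook form (periods, optional year) and the Supreme Court form (no periods, no date). The pattern names and the helper `find_cfr` are hypothetical, the section syntax is deliberately simplified, and nothing here is drawn from Blueline or any vendor's actual tooling.

```python
import re

# Assumption: a C.F.R. cite is "<title> C.F.R. § <part>.<section>" with the
# periods and the trailing "(year)" both optional. Real-world section
# numbers are more varied than this simplified pattern allows.
CFR_PATTERN = re.compile(
    r"(?P<title>\d+)\s+C\.?\s?F\.?\s?R\.?\s+§+\s*(?P<section>\d+\.\d+)"
    r"(?:\s+\((?P<year>\d{4})\))?"
)

# Same idea for the U.S. Code: "U.S.C." and "USC" are both accepted.
USC_PATTERN = re.compile(
    r"(?P<title>\d+)\s+U\.?\s?S\.?\s?C\.?\s+§+\s*(?P<section>\d+[a-z]?)"
)


def find_cfr(text):
    """Return (title, section, year) for each C.F.R. cite; year may be None."""
    return [
        (m.group("title"), m.group("section"), m.group("year"))
        for m in CFR_PATTERN.finditer(text)
    ]


def find_usc(text):
    """Return (title, section) for each U.S. Code cite in either style."""
    return [(m.group("title"), m.group("section")) for m in USC_PATTERN.finditer(text)]


sample = "See 29 C.F.R. § 1910.12 (2015); accord 29 CFR § 1910.12."
cfr_hits = find_cfr(sample)
usc_hits = find_usc("Compare 42 U.S.C. § 1983 with 42 USC § 1983.")
```

Both spellings resolve to the same (title, section) pair, which is the normalization a linker or table-of-authorities builder would need before it can treat them as one authority.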

A citation reform movement of the last two decades has called for courts to break away from print-dependent case identifiers through the attachment of vendor- and medium-neutral citations to their decisions prior to release. Building on recommendations of the ABA, the American Association of Law Libraries (AALL) prepared a detailed implementation manual. It carries the title AALL Universal Citation Guide and provides a modern blueprint for uniformity. No surprise, several of the states adopting the new approach have deviated substantially from it. How does The Bluebook address the resulting lack of uniformity? Its Rule 10.3.3 instructs that “the requirements of the jurisdiction’s format should be observed.”

As the folks at Blueline put it, “the approved and unapproved variations in Bluebook style create a huge hurdle for coders who rely on hard and fast rules.” Weak force though it may be, The Bluebook does offer a template for citation recognition on which programmers can begin to build. Deviations from its “uniform system” can then be treated as special cases or alternatives.
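The "template plus special cases" approach can be sketched in Python using the Indiana and Idaho example from earlier in this post. The Bluebook forms serve as the baseline rules, and the local "I.C." shorthand is handled as an alternative that can only be resolved with a hint about the forum. The function name, the `forum` parameter, and the simplified section syntax are all assumptions made for illustration.

```python
import re

# Simplified section number: hyphen-separated digits, e.g. 35-42-1-1 or 18-4001.
SECTION = r"§\s*(?P<section>\d+(?:-\d+(?:\.\d+)?)+)"

# Baseline template: the forms The Bluebook specifies.
BLUEBOOK_RULES = [
    ("Indiana", re.compile(r"Ind\. Code\s+" + SECTION)),
    ("Idaho", re.compile(r"Idaho Code\s+" + SECTION)),
]

# Special case: the local shorthand shared by both states.
LOCAL_IC = re.compile(r"I\.C\.\s+" + SECTION)


def identify_code_cites(text, forum=None):
    """Return (state, section) pairs in document order.

    `forum` is a hypothetical hint ("Indiana" or "Idaho") naming the court
    the document was filed in. Without it, "I.C." cites come back with
    state None, because the abbreviation alone is ambiguous.
    """
    found = []
    for state, pattern in BLUEBOOK_RULES:
        for m in pattern.finditer(text):
            found.append((m.start(), state, m.group("section")))
    for m in LOCAL_IC.finditer(text):
        state = forum if forum in ("Indiana", "Idaho") else None
        found.append((m.start(), state, m.group("section")))
    found.sort(key=lambda item: item[0])
    return [(state, section) for _, state, section in found]


text = "Ind. Code § 35-42-1-1 and I.C. § 18-4001."
with_hint = identify_code_cites(text, forum="Idaho")
without_hint = identify_code_cites(text)
```

Keeping the local alternatives in a separate table from the baseline rules means a new jurisdiction's habits can be added without touching the Bluebook patterns.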

Grounds for Programmer Frustration with The Bluebook

Were all judges and lawyers to follow The Bluebook meticulously, would programmers be satisfied? Not so long as its citation rules remain stuck in print-era conventions. Volume and page number are far less precise than “2015 IL 117090, ¶ 31” which points to a single paragraph (straddling a page-break) in a uniquely identified decision of the Illinois Supreme Court. Decided this past January, the decision only later received volume number and pagination in the National Reporter System. Yet The Bluebook directs the passage in question be cited by the latter formula (unnecessary, delayed, and less exact). Page numbers can even yield ambiguous results. A Blueline communique reports that “a query intended for Peate v. McCann, 294 F.3d 879 (7th Cir. 2002) accidentally pulled McCaskill v. Sci Management Corp., 294 F.3d 879 (7th Cir. 2002) because the latter opinion was only 44 words long.”
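The difference in precision can be illustrated with a short Python sketch. It indexes the two Seventh Circuit decisions named above by volume, reporter, and page, then parses the Illinois neutral citation into its parts. The regular expressions are simplified assumptions (the reporter pattern covers only F.3d, and the neutral pattern assumes an all-capitals court code), and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Assumption: only the F.3d reporter is handled in this sketch.
REPORTER_CITE = re.compile(r"(?P<volume>\d+)\s+(?P<reporter>F\.3d)\s+(?P<page>\d+)")

# Assumption: neutral cites look like "<year> <COURT> <number>[, ¶ <n>]".
NEUTRAL_CITE = re.compile(
    r"(?P<year>\d{4})\s+(?P<court>[A-Z]{2,})\s+(?P<number>\d+)"
    r"(?:,\s*¶\s*(?P<paragraph>\d+))?"
)


def build_reporter_index(cases):
    """Map (volume, reporter, page) to every case name that starts there."""
    index = defaultdict(list)
    for name, cite in cases:
        m = REPORTER_CITE.search(cite)
        if m:
            key = (m.group("volume"), m.group("reporter"), m.group("page"))
            index[key].append(name)
    return index


cases = [
    ("Peate v. McCann", "294 F.3d 879 (7th Cir. 2002)"),
    ("McCaskill v. Sci Management Corp.", "294 F.3d 879 (7th Cir. 2002)"),
]
index = build_reporter_index(cases)

# One volume-and-page key, two decisions: the lookup cannot pick between them.
ambiguous = index[("294", "F.3d", "879")]

# The neutral citation carries a decision number and a paragraph number.
neutral = NEUTRAL_CITE.search("2015 IL 117090, ¶ 31")
pinpoint = (neutral.group("court"), neutral.group("number"), neutral.group("paragraph"))
```

A lookup keyed on volume and page has to fall back on party names or some other signal to separate the two decisions, while the neutral citation identifies both the decision and the paragraph on its own.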

The Bluebook’s deference to the major online services, particularly when dealing with the increasingly large pool of “unpublished” decisions, is another problem. A single decision is known as “2015 BL 377979” on Bloomberg Law, “2015 U.S. Dist. LEXIS 155224” on Lexis, and “2015 WL 7253819” on Westlaw. Google Scholar and other public access sites have the decision but don’t know it by any of those designations. No citation parser can establish the identity of those references or match any of them to a non-proprietary version of the case. Situated as it is in the academy, a domain handsomely served by the major commercial systems, The Bluebook fails to address this problem adequately, and its deference to the commercial sector leads to a strong bias in favor of publisher-specific citations.
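A small Python sketch shows where the parser's reach ends. It can recognize each of the three designations and attribute it to a service (treating "WL" as Westlaw's), but the extracted numbers share nothing that would reveal they name the same decision. The pattern table and the `classify` function are hypothetical, and the Lexis pattern covers only the "U.S. Dist. LEXIS" series quoted above.

```python
import re

# Hypothetical pattern table: one entry per proprietary designation style.
VENDOR_PATTERNS = {
    "Bloomberg Law": re.compile(r"(\d{4})\s+BL\s+(\d+)"),
    "Lexis": re.compile(r"(\d{4})\s+U\.S\.\s+Dist\.\s+LEXIS\s+(\d+)"),
    "Westlaw": re.compile(r"(\d{4})\s+WL\s+(\d+)"),
}


def classify(cite):
    """Return (vendor, year, serial) for a database cite, or None."""
    for vendor, pattern in VENDOR_PATTERNS.items():
        m = pattern.fullmatch(cite.strip())
        if m:
            return vendor, int(m.group(1)), int(m.group(2))
    return None


designations = [
    "2015 BL 377979",
    "2015 U.S. Dist. LEXIS 155224",
    "2015 WL 7253819",
]
parsed = [classify(d) for d in designations]

# Three serial numbers, none derivable from another: matching them up
# requires a crosswalk table that the citations themselves do not supply.
serials = {serial for _, _, serial in parsed}
```

Recognition succeeds for all three strings, yet the only field they have in common is the year, so the matching step still depends on data held by the vendors.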

That same bias combined with The Bluebook’s continuing attachment to print leads to rules for statutory and treatise citations that are not followed uniformly because in the current practice environment they simply cannot be.