{"id":598,"date":"2016-05-06T18:49:42","date_gmt":"2016-05-06T18:49:42","guid":{"rendered":"http:\/\/citeblog.access-to-law.com\/?p=598"},"modified":"2021-12-11T18:11:18","modified_gmt":"2021-12-11T18:11:18","slug":"lessons-the-federal-courts-might-learn-from-westlaws-prolonged-data-processing-error","status":"publish","type":"post","link":"https:\/\/citeblog.access-to-law.com\/?p=598","title":{"rendered":"Lessons the Federal Courts Might Learn from Westlaw\u2019s Prolonged Data Processing Error"},"content":{"rendered":"<h2>The Thomson Reuters\u00a0Errata Notice<\/h2>\n<p>On April 15, 2016 <a href=\"http:\/\/legalsolutions.thomsonreuters.com\/law-products\/cases\">Thomson Reuters notified subscribers<\/a> to its online and print case law services that a significant number of U.S. decisions it had published since November 2014 contained errors.<\/p>\n<p><a href=\"https:\/\/citeblog.access-to-law.com\/wp-content\/uploads\/2016\/05\/email.jpg\" rel=\"attachment wp-att-607\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-607 aligncenter\" src=\"https:\/\/citeblog.access-to-law.com\/wp-content\/uploads\/2016\/05\/email.jpg\" alt=\"email\" width=\"404\" height=\"271\" srcset=\"https:\/\/citeblog.access-to-law.com\/wp-content\/uploads\/2016\/05\/email.jpg 575w, https:\/\/citeblog.access-to-law.com\/wp-content\/uploads\/2016\/05\/email-300x201.jpg 300w\" sizes=\"auto, (max-width: 404px) 100vw, 404px\" \/><\/a><\/p>\n<p>Here and there words had been dropped.\u00a0 The company explained that the errors had been introduced by software run on the electronic texts it collected from the authoring courts.\u00a0 Thomson posted a list of the affected cases.\u00a0 The initial list <a href=\"http:\/\/www.lawsitesblog.com\/2016\/04\/thomson-reuters-says-left-text-600-cases-since-2014.html\">contained some 600 cases<\/a>.\u00a0 <a href=\"http:\/\/static.legalsolutions.thomsonreuters.com\/static\/pdf\/corrected-decisions.pdf\">A week later it had grown to over 2,500<\/a> 
through the addition of cases loaded on Westlaw but not published in the National Reporter Service (NRS).\u00a0 Two weeks out the list included links to corrected versions of the affected cases with the restored language highlighted.\u00a0 The process of making the corrections led Thomson to revise the number of casualties downward (<em>See<\/em> <a href=\"http:\/\/static.legalsolutions.thomsonreuters.com\/static\/pdf\/newpdf\/2015WL3939426e.pdf\">the list\u2019s entry for <em>U.S. v. Ganias<\/em><\/a>, for example.), but only slightly.<\/p>\n<p>Thomson Reuters sought to minimize the importance of this event, asserting that none of the errors \u201c<a href=\"http:\/\/static.legalsolutions.thomsonreuters.com\/static\/pdf\/cases-customer-faq.pdf\">changed the meaning of the law in the case<\/a>.\u201d\u00a0 Commendably, Thomson apologized, acknowledging and detailing the errata.\u00a0 It spun its handling of the processing error&#8217;s discovery\u00a0as a demonstration of the company\u2019s commitment to transparency.\u00a0 On closer analysis the episode reveals major defects in the current system for disseminating federal case law (and the case law of those states that, like the lower federal courts, leave key elements of the process to Thomson Reuters).<\/p>\n<h2>Failure to View\u00a0Case Law Publication as a Public Function<\/h2>\n<p>Neither the U.S. Courts of Appeals nor the U.S. District Courts have an \u201cofficial publisher.\u201d\u00a0 No reporter\u2019s office or similar\u00a0public agency produces and stamps its seal on consistently formatted, final, citable versions of the judicial opinions rendered by those courts in the way the <a href=\"http:\/\/en.wikipedia.org\/wiki\/Reporter_of_Decisions_of_the_Supreme_Court_of_the_United_States\">Reporter of Decisions of the U.S. 
Supreme Court<\/a> does for the nation\u2019s highest court.\u00a0 By default, cemented in by over a century of market dominance and professional practice, that job has fallen to a single commercial firm (originally\u00a0the West Publishing Company, now by acquisition and merger Thomson Reuters) to gather and publish the decisions of those courts in canonical form.\u00a0 Although that situation arose during the years in which print was the sole or principal medium of distribution, it has carried over into the digital era.\u00a0 Failure of the federal judiciary to adopt and implement a system of non-proprietary, medium-neutral citation has allowed it\u00a0to happen.<\/p>\n<p>With varying degrees of effectiveness, individual court web sites do as they were mandated by Congress in the <a href=\"http:\/\/access-to-law.com\/elaw\/readings\/egovernment_act_2002.pdf#page=15\">E-Government Act of 2002<\/a>. \u00a0They provide electronic access to the court\u2019s decisions as they are released.\u00a0 The online decision files, spread across over one hundred\u00a0sites, present opinion texts in a diversity of formats.\u00a0 Crucially, all lack the citation data needed by any legal professional wishing to refer to a particular opinion or passage within it. 
\u00a0Nearly twenty years ago the American Bar Association <a href=\"http:\/\/www.americanbar.org\/content\/dam\/aba\/administrative\/legal_technology_resources\/universal-citation.authcheckdam.pdf\">called upon the nation\u2019s courts to assume the task of assigning citations<\/a>.\u00a0 By now the judiciaries in\u00a0<a href=\"http:\/\/www.law.cornell.edu\/citation\/2-200.htm#2-230\">close to one-third of the states<\/a> have done so.\u00a0 The federal courts have not.<\/p>\n<h2>Major Failings of the Federal Courts\u2019 Existing Approach<\/h2>\n<h3>Delivery of Decisions with PDF Pagination to Systems that Must Remove It<\/h3>\n<p>Several states, including a number that\u00a0produce large volumes of appellate decisions, placed no cases on the Thomson Reuters errata list.\u00a0 Conspicuous by their absence, for example, are decisions from the courts of California and New York.\u00a0 The company\u2019s identification of the software bug combined with inspection of the corrected documents explains why.\u00a0 According to Thomson, it all began with an \u201cupgrade to our PDF conversion process.\u201d<\/p>\n<p>The lower federal courts, like those of many states, release their decisions to Thomson Reuters, other redistributors, and the public as PDF files.\u00a0 The page breaks in these \u201cslip opinion\u201d PDFs have absolutely no enduring value.\u00a0 Thomson (like Lexis, Bloomberg Law, Casemaker, Fastcase, Google Scholar, Ravel Law, and the rest) must remove opinion texts from this electronic delivery package and pull together paragraphs and footnotes that straddle PDF pages.\u00a0 All the words dropped by Thomson\u2019s \u201cPDF conversion process\u201d were proximate to slip opinion page breaks.\u00a0 Why are there no California and New York cases on the list?\u00a0 Those states release appellate decisions in less rigid document formats.\u00a0 California decisions are <a href=\"http:\/\/www.courts.ca.gov\/opinions-slip.htm\">available in Microsoft Word format<\/a> as 
well as PDF.\u00a0 The New York Law Reporting Bureau <a href=\"http:\/\/www.courts.state.ny.us\/reporter\/slipidx\/cidxtable.shtml\">releases decisions in html<\/a>.\u00a0 <a href=\"http:\/\/www.oscn.net\/applications\/oscn\/DeliverDocument.asp?CiteID=477433\">So does Oklahoma<\/a>; no Oklahoma decisions appear on the Thomson errata list.<\/p>\n<h3>Failure to Employ One\u00a0Consistent Format<\/h3>\n<p>The lower federal courts compound the PDF extraction challenge by employing no single consistent format.\u00a0 Leaving individual judges of the ninety-four district courts to one side, the U.S. Courts of Appeals inflict a range of remarkably different styles on those commercial entities and non-profits that must process their decisions so that they will scroll and present text, footnotes, and interior divisions on the screens of computers, tablets, and phones with reasonable efficiency and consistency.\u00a0 The Second Circuit\u2019s format features <a href=\"https:\/\/www.gpo.gov\/fdsys\/pkg\/USCOURTS-ca2-13-03159\/pdf\/USCOURTS-ca2-13-03159-0.pdf\">double-spaced texts, numbered lines, and bifurcated footnotes<\/a>; the Seventh Circuit\u2019s has <a href=\"https:\/\/www.gpo.gov\/fdsys\/pkg\/USCOURTS-ca7-13-02243\/pdf\/USCOURTS-ca7-13-02243-1.pdf\">single-spaced lines, unnumbered, with\u00a0very few footnotes<\/a> (<a href=\"http:\/\/aja.ncsc.dni.us\/courtrv\/cr38-2\/CR38-2Posner.pdf\">none in opinions by Judge Posner<\/a>).<\/p>\n<p>In contrast the <a href=\"http:\/\/courts.mi.gov\/opinions_orders\/Pages\/default.aspx\">decisions released by the Michigan Supreme Court<\/a>, although\u00a0embedded in PDF, reflect a cleanly consistent template.\u00a0 The same is true of those coming from the supreme courts of <a href=\"http:\/\/www.floridasupremecourt.org\/decisions\/index.shtml\">Florida<\/a>, <a href=\"http:\/\/www.txcourts.gov\/supreme\/orders-opinions.aspx\">Texas<\/a>, and <a href=\"http:\/\/www.wicourts.gov\/opinions\/sopinion.htm\">Wisconsin<\/a>. 
\u00a0Decisions from these states do not appear on the Thomson list.<\/p>\n<h3>Lack of a Readily Accessible, Authenticated Archive of the Official Version<\/h3>\n<p>By its own account it took Thomson Reuters over a year to discover this data processing problem.\u00a0 With human proofreaders it would not have taken so long.\u00a0 Patently, they are no longer part of the company\u2019s publication process.\u00a0 Some of the omitted words would have been invisible to anyone or any software not performing a word-for-word comparison between the decision released by the court and the Westlaw\/National Reporter Service version.\u00a0 <a href=\"http:\/\/static.legalsolutions.thomsonreuters.com\/static\/pdf\/newpdf\/771_F3d_59.pdf\">Dropping \u201cSo ordered\u201d from the end of an opinion<\/a> or <a href=\"http:\/\/static.legalsolutions.thomsonreuters.com\/static\/pdf\/newpdf\/779_F3d_437.pdf\">the word \u201cPlaintiff\u201d prior to the party\u2019s name at its beginning<\/a> fall in this category. \u00a0However, the vast majority of the omissions rendered the affected sentence or sentences unintelligible.\u00a0 At least <a href=\"http:\/\/static.legalsolutions.thomsonreuters.com\/static\/pdf\/newpdf\/785_F3d_787.pdf\">one removed part of a web site URL<\/a>.\u00a0 <a href=\"http:\/\/static.legalsolutions.thomsonreuters.com\/static\/pdf\/newpdf\/793F3d277.pdf\">Others dropped citations<\/a>.\u00a0 In the case of a number of\u00a0state courts, a reader perplexed by a commercial service&#8217;s\u00a0version of a decision\u00a0can readily retrieve an official copy\u00a0of the opinion text from a public site and\u00a0compare its language.\u00a0 That is true, for example, in\u00a0Illinois.\u00a0 Anyone reading the 2015 Illinois Supreme Court decision in <a href=\"http:\/\/static.legalsolutions.thomsonreuters.com\/static\/pdf\/newpdf\/26_NE3d_335.pdf\"><em>People v. Smith<\/em> on Westlaw<\/a> puzzled by\u00a0the sentence\u00a0\u201c\u00b6 3 The defendant, Mickey D. 
Smith, was charged in a three-count indictment lawful justification and with intent to cause great bodily harm, shot White in the back with a handgun thereby causing his death.\u201d could have pulled the original, official opinion from the judiciary web site simply by employing\u00a0a Google search and the decision&#8217;s\u00a0court attached citation (<a href=\"http:\/\/www.illinoiscourts.gov\/opinions\/SupremeCourt\/2015\/116572.pdf\">2015 IL 116572<\/a>), scrolled directly to paragraph 3, and discovered the Westlaw error.\u00a0 The same holds for the other six published Illinois decisions on the Thomson list.\u00a0 Since New Mexico also posts final, official versions of its decisions outfitted with public domain citations, it, too, provides a straightforward way for users of Westlaw or any other commercial service to check the accuracy of dubious case data.<\/p>\n<p>The growing digital repository of federal court decisions on the <a href=\"https:\/\/www.gpo.gov\/fdsys\/browse\/collection.action?collectionCode=USCOURTS\">GPO\u2019s FDsys site<\/a> falls short of the standard set by these state examples.\u00a0 To begin, it is seriously incomplete.\u00a0 Over fifty of the entries on the Thomson Reuters list are decisions from the Southern District of New York, a court not yet included in FDsys.\u00a0 Moreover, since the federal courts employ no system of court applied citation, there is no simple way to retrieve a specific decision from FDsys or to move directly to a puzzling passage within it.\u00a0 With an unusual party name or docket number <a href=\"https:\/\/www.gpo.gov\/fdsys\/search\/advanced\/advsearchpage.action\">the FDsys search utility<\/a> may prove effective but with a case name like \u201c<a href=\"http:\/\/static.legalsolutions.thomsonreuters.com\/static\/pdf\/newpdf\/804_F3d_132.pdf\">U.S. v. White<\/a>\u201d retrieval is a challenge.\u00a0 A unique citation would make the process far less cumbersome. 
\u00a0However, since the lower federal courts rely on Thomson Reuters to attach enduring citations to their cases (in the form of volume and page numbers in its commercial publications) the texts flow into FDsys without\u00a0them.<\/p>\n<h3>The Ripple\u00a0of the Thomson Reuters Errors into Other Database Systems<\/h3>\n<p>Because the federal courts have allowed the citation data assigned by Thomson Reuters, including the location of interior page breaks, to remain the de facto citation standard for U.S. lawyers and judges, all other publishers are compelled in some degree to draw upon the National Reporter System.\u00a0 They cannot simply work from the texts released by their deciding courts, but must, once a case has received Thomson editorial treatment and citation assignment, secure at least some of what Thomson has added. \u00a0That introduces\u00a0both unnecessary expense\u00a0and a second point\u00a0of data vulnerability to case law dissemination. \u00a0Possible approaches range from: (a) extracting only the volume and pagination from the Thomson reports (print or electronic) and inserting that data in the version of the decision released by the court to (b) replacing the court\u2019s original version with a full digital copy of the NRS version.\u00a0 Whether the other publisher acquires the Thomson Reuters\u00a0data in electronic form under license or by redigitizing the NRS print reports, the second approach will inevitably pick up errors injected by Thomson Reuters editors and software. \u00a0For that reason the recent episode illuminates\u00a0how the various online research services assemble case data.<\/p>\n<h4>Services Unaffected by the Thomson Reuters\u00a0Glitch<\/h4>\n<p>Lexis was not affected by the Thomson Reuters errors because it does not draw decision texts from the National Reporter System. \u00a0(That is not to say that Lexis\u00a0is not capable of committing similar processing errors of its own. 
\u00a0<em>See<\/em> the first paragraph in\u00a0the Lexis version of <em>U.S. v. Ravensberg<\/em>, 776 F.3d 587 (7th Cir. 2015).) \u00a0 So that\u00a0Lexis\u00a0subscribers\u00a0can cite opinions using the volume and page numbers assigned by Thomson, Lexis extracts them from the NRS reports and inserts them in the original text.\u00a0 In other respects, however, it does not conform decision data to that found in Westlaw.\u00a0 <a href=\"http:\/\/verdict.justia.com\/2014\/01\/20\/citation-dna-whos-datas-daddy\">As explained elsewhere<\/a>,\u00a0its approach is revealed in\u00a0how the service treats cases that contain internal cross-references.\u00a0 In the federal courts and other jurisdictions still using print-based citation, a dissenting judge referring to a portion of the majority opinion must use \u201cslip opinion\u201d pagination.\u00a0 Later, when published by Thomson Reuters, these \u201cante at\u201d references are converted by the company\u2019s editors, software, or some combination of the two to the pagination of the volume in which the case appears.\u00a0 Search recent U.S. Court of Appeals decisions on Lexis for the phrase \u201cante at\u201d and you will discover that in its system they remain in their original \u201cslip opinion\u201d form.\u00a0 For a single example, compare Judge Garza\u2019s dissenting opinion in In re <em>Deepwater Horizon<\/em>, 739 F.3d 790 (5th Cir. 
2014) as it appears on Lexis with the version on Westlaw or in the pages of the <em>Federal Reporter<\/em>.<\/p>\n<p>Bloomberg Law appears to draw more extensively on the NRS version of a decision.\u00a0 Its version of the Garza dissent in In re <em>Deepwater Horizon<\/em> expresses the cross references in <em>Federal Reporter<\/em> pagination.\u00a0 However, like Lexis it does not replace the original \u201cslip opinions\u201d with the versions appearing in the pages of the <em>Federal Reporter<\/em>.\u00a0 Examination of a sample of the cases Thomson Reuters has identified as flawed finds that Bloomberg Law, like Lexis, has the dropped language.\u00a0 Casemaker does as well.<\/p>\n<h4>Services that Copy Directly from Thomson&#8217;s Reports, Errors and All<\/h4>\n<p>In contrast, Fastcase, Google Scholar, and Ravel Law all appear to replace \u201cslip opinions\u201d with digitized texts drawn\u00a0from the National Reporter System.\u00a0 As a consequence when Thomson Reuters drops words or makes other changes in an original opinion text so do they<em>.<\/em>\u00a0 The Westlaw errors are still to be found in\u00a0the case data of these other services.<\/p>\n<h2>Might FDsys Provide a Solution?<\/h2>\n<p><a href=\"https:\/\/citeblog.access-to-law.com\/wp-content\/uploads\/2016\/05\/fdsys-1.jpg\" rel=\"attachment wp-att-609\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-609 aligncenter\" src=\"https:\/\/citeblog.access-to-law.com\/wp-content\/uploads\/2016\/05\/fdsys-1.jpg\" alt=\"fdsys\" width=\"504\" height=\"294\" srcset=\"https:\/\/citeblog.access-to-law.com\/wp-content\/uploads\/2016\/05\/fdsys-1.jpg 761w, https:\/\/citeblog.access-to-law.com\/wp-content\/uploads\/2016\/05\/fdsys-1-300x175.jpg 300w\" sizes=\"auto, (max-width: 504px) 100vw, 504px\" \/><\/a><\/p>\n<p><a href=\"http:\/\/www.geeklawblog.com\/2011\/05\/federal-court-decisions-via-fdsysgov.html\">Since 2011<\/a> decisions from a growing number of federal courts have been collected, 
authenticated, and digitally stored in their original format as part of the <a href=\"http:\/\/www.gpo.gov\/fdsys\/browse\/collection.action?collectionCode=USCOURTS\">GPO\u2019s FDsys program<\/a>.\u00a0 As noted earlier, that data gathering is still seriously incomplete.\u00a0 Furthermore, the GPO\u2019s\u00a0role is currently limited to authenticating decision files and adding a very modest set\u00a0of metadata.\u00a0 Adding decision identifiers\u00a0designed to facilitate retrieval of individual cases, ideally designations consistent with emerging norms of medium-neutral citation, would be an enormously useful extension of that role.\u00a0 So would be the assignment of paragraph numbers throughout decision texts, but regrettably that task properly belongs at the source.\u00a0 It is time for the Judicial Conference of the United States to revisit vendor- and medium-neutral citation.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Thomson Reuters\u00a0Errata Notice On April 15, 2016 Thomson Reuters notified subscribers to its online and print case law services that a significant number of U.S. decisions it had published since November 2014 contained errors. 
Here and there words had been dropped.\u00a0 The company explained that the errors had been introduced by software run on [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12,11,41],"tags":[13,14,27],"class_list":["post-598","post","type-post","status-publish","format-standard","hentry","category-cases","category-neutral-citations","category-official","tag-cases-2","tag-neutral-citations-2","tag-official"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/citeblog.access-to-law.com\/index.php?rest_route=\/wp\/v2\/posts\/598","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/citeblog.access-to-law.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/citeblog.access-to-law.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/citeblog.access-to-law.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/citeblog.access-to-law.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=598"}],"version-history":[{"count":14,"href":"https:\/\/citeblog.access-to-law.com\/index.php?rest_route=\/wp\/v2\/posts\/598\/revisions"}],"predecessor-version":[{"id":616,"href":"https:\/\/citeblog.access-to-law.com\/index.php?rest_route=\/wp\/v2\/posts\/598\/revisions\/616"}],"wp:attachment":[{"href":"https:\/\/citeblog.access-to-law.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=598"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/citeblog.access-to-law.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=598"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/citeblog.access-to-law.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=598"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}