Scanning and Processing of Decisions for the Law Foundation of Newfoundland and Labrador

The Law Foundation of Newfoundland and Labrador provided funding to make historical court decisions (rendered before 2003) from the province freely available on the CanLII website.  Lexum identified the decisions, scanned them, generated word processing versions, converted them to HTML and PDF, and published the resulting files online.

The Challenge

The Law Foundation of Newfoundland and Labrador provides grants to advance public understanding of the law and access to legal services. The Foundation provided a grant to CanLII to improve access to court decisions pre-dating 2003, which were available only in printed case law reports. Creating digital resources from legacy documents on a large scale is complex: for each decision, the source needed to be identified, scanned, processed and disseminated through CanLII. Lexum came up with a pragmatic, economical solution to this challenge.

The Solution

As a preliminary step, the list of decisions to be processed was established by querying CanLII’s citator and isolating decisions that had not yet been made available on CanLII’s website. The next step was to locate the best available source for each decision to serve as the basis for the digital version. Decisions were individually scanned from dozens of volumes borrowed from local law libraries, as well as from paper-based Canadian case law material accumulated by Lexum over the years. A quality control process was then performed on the resulting image files: each one was checked for missing pages.
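The two checks described above (isolating decisions cited but not yet published, and detecting missing pages in a scan) can be sketched as follows. This is only an illustrative sketch, not Lexum's actual tooling: the function names, the plain-string identifiers and the assumption that each scanned page carries a known printed page number are all hypothetical.

```python
def find_missing_decisions(cited, published):
    """Return identifiers that are cited but not yet published, sorted.

    Both arguments are iterables of decision identifiers (hypothetical
    plain strings here); duplicates in either input are ignored.
    """
    return sorted(set(cited) - set(published))


def find_missing_pages(scanned_pages, first_page, last_page):
    """Return the page numbers in [first_page, last_page] absent from a scan.

    Assumes each scanned image has been labelled with the page number
    printed in the source volume.
    """
    expected = range(first_page, last_page + 1)
    present = set(scanned_pages)
    return [page for page in expected if page not in present]


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    print(find_missing_decisions(["A", "B", "C", "B"], ["B"]))
    print(find_missing_pages([1, 2, 4, 5, 7], 1, 7))
```

A set difference keeps the first check simple regardless of how many times a decision is cited, and comparing against the full expected page range catches gaps in the middle of a decision as well as at its end.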

The PDF images were then converted into usable digital text using Optical Character Recognition (OCR). After the resulting text was proofread to eliminate spelling mistakes and typos, each decision’s content was formatted in Microsoft Word using a specifically designed template. Proprietary content found in the printed reports, such as summaries, headnotes, editor’s notes and headers/footers, was omitted from the resulting Word files; only the reasons for decision were retained. The name of the court, case name, neutral citation, decision date and docket numbers were extracted and formatted. A set of Word styles was applied to various portions of the text to better control the documents’ visual appearance. Paragraph numbering was also added where there was none. These files then went through another quality control step, in which metadata and text were checked for errors.
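The step of adding paragraph numbers only where none exist can be illustrated with a short sketch. It assumes, purely for illustration, that a decision is held as a list of paragraph strings and that numbers take the bracketed form "[1]"; the source does not state how the numbering was actually applied in the Word template.

```python
import re

# Matches a paragraph that already begins with a bracketed number, e.g. "[12]".
_NUMBERED = re.compile(r"^\s*\[\d+\]")


def add_paragraph_numbers(paragraphs):
    """Number paragraphs sequentially unless the decision is already numbered.

    If any paragraph already starts with a bracketed number, the text is
    returned unchanged so that existing numbering is never duplicated.
    """
    if any(_NUMBERED.match(p) for p in paragraphs):
        return list(paragraphs)
    return [f"[{i}] {p}" for i, p in enumerate(paragraphs, start=1)]


if __name__ == "__main__":
    # Hypothetical text, for illustration only.
    print(add_paragraph_numbers(["First paragraph.", "Second paragraph."]))
```

Checking the whole decision before numbering, rather than paragraph by paragraph, avoids mixing original and generated numbers within one document.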

Finally, the new content was published on the CanLII website. Over the last few years, more than 40,000 pages, representing 4,600 decisions, have been digitized in this manner. This has extended the comprehensive coverage of Newfoundland and Labrador superior courts on CanLII by 16 years (back to 1987 from 2003). Citation information was updated accordingly, and a judicial history check was performed on the Court of Appeal of Newfoundland and Labrador database so that all possible associations could be made between trial and appeal decisions.