As mentioned in a previous post, the Law via the Internet Conference has returned with renewed energy after a three-year hiatus. We could not pass up this splendid opportunity to meet again with our colleagues from around the world. The conference was hosted in collaboration with the Centre for Computers and Law of the Faculty of Law, University of Vienna, and the Austrian Legal Information Institute (LII-Austria). It ended up being a considerable success despite the last-minute call for proposals, gathering people invested in providing online access to legal information worldwide, both on-site in Vienna and online.
Lexum had its share of the spotlight with no fewer than three presentations:
CatLII Unleashed: AI Case Analysis for CanLII Decisions
Frédéric Pelletier, VP Legal Information at Lexum & Chief Editor for CanLII
Frédéric showcased our CatLII pilot project, which leverages Artificial Intelligence to analyze and summarize legal decisions, and introduced the many challenges we had to address. Beyond the sheer scope of the project, Frédéric emphasized the mandate to protect CanLII’s reputation, a task that can prove challenging when AI is prone to hallucination. CatLII tackled this challenge by constraining its analysis to the body of each decision and by embedding links pointing back to the original paragraphs.
This year has also seen a steady stream of significant innovations in Large Language Models (LLMs), to the point where every month can bring a new game-changing development. For that reason, the tools developed by Lexum have been designed using a flexible and modular approach, allowing us to benchmark and eventually adopt new technologies that prove to perform better or to be more cost-effective.
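The idea of linking a summary back to the paragraphs of the decision can be illustrated with a minimal sketch. This is not CatLII's actual code: the `[para N]` marker format, the `#parN` anchor scheme, and both function names are assumptions made for the example. It checks that every paragraph cited in a generated summary exists in the decision, then turns the markers into links.

```python
import re

# Hypothetical marker format for a paragraph reference in a summary: "[para 12]".
PARA_REF = re.compile(r"\[para (\d+)\]")


def check_paragraph_refs(summary, paragraphs):
    """Return (valid, invalid) paragraph numbers cited in the summary.

    paragraphs: dict mapping paragraph number -> paragraph text of the decision.
    """
    cited = [int(n) for n in PARA_REF.findall(summary)]
    valid = [n for n in cited if n in paragraphs]
    invalid = [n for n in cited if n not in paragraphs]
    return valid, invalid


def link_paragraph_refs(summary, base_url):
    """Replace each "[para N]" marker with an HTML link to an assumed "#parN" anchor."""
    return PARA_REF.sub(
        lambda m: f'<a href="{base_url}#par{m.group(1)}">[para {m.group(1)}]</a>',
        summary,
    )


# Usage with made-up data: paragraph 7 does not exist, so it is flagged.
paragraphs = {1: "Introduction.", 2: "The appeal is dismissed.", 3: "Costs."}
summary = "The court dismissed the appeal [para 2] and awarded costs [para 7]."
valid, invalid = check_paragraph_refs(summary, paragraphs)
linked = link_paragraph_refs(summary, "https://example.org/decision")
```

A summary with any entry in `invalid` could be rejected or regenerated before publication, which is one simple way to keep generated text tied to the source document.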
Exploiting Citation Graphs in Large Corpora to Improve Relevance on Broad Queries
Marc-André Morissette, VP Technology
Marc-André detailed the challenges we faced in exploiting the citation network with a customized LLM to improve search results. “We have created an algorithm that analyzes a corpus’s citation network and identifies the most cited documents in the context of the user’s query. Heavily cited documents are inferred to be more authoritative. This approach can even rescue relevant documents that were initially missed because they do not contain the query’s terms.” The Citation Network project improves the user search experience from three angles: conceptual search, navigational search, and general performance improvements.
Overall, it provides a better ranking for broad, topic-based queries, as well as for navigational queries. Moreover, these improvements are achieved with response times equivalent to previous ones. As it’s based on a learning model, the larger the database, the more accurate the results become.
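To make the general idea concrete, here is a minimal sketch of query-contextual citation counting. It is not Lexum's algorithm: the function name, the `weight` parameter, and the scoring formula (text score plus a log-damped citation count) are all assumptions chosen for illustration. Documents cited by the initial hits gain score, and a heavily cited document can enter the results even if it never matched the query terms.

```python
import math
from collections import Counter


def rerank_with_citations(text_scores, citations, weight=1.0):
    """Re-rank search hits using citations made by the hits themselves.

    text_scores: dict mapping doc id -> text relevance score for the query.
    citations:   dict mapping doc id -> iterable of doc ids it cites.
    weight:      hypothetical knob balancing citation signal against text score.
    """
    # Count how often each document is cited by documents matching the query.
    cited_in_context = Counter()
    for doc in text_scores:
        for target in set(citations.get(doc, ())):
            if target != doc:  # ignore self-citations
                cited_in_context[target] += 1

    # Candidates include cited documents that did not match the query terms.
    candidates = set(text_scores) | set(cited_in_context)
    combined = {
        doc: text_scores.get(doc, 0.0) + weight * math.log1p(cited_in_context[doc])
        for doc in candidates
    }
    # Highest combined score first; ties broken by doc id for determinism.
    return sorted(combined, key=lambda d: (-combined[d], d))


# Usage with made-up data: "L" matches no query term but is cited by every hit.
text_scores = {"A": 1.0, "B": 0.8, "C": 0.5}
citations = {"A": ["L"], "B": ["L", "C"], "C": ["L"]}
ranking = rerank_with_citations(text_scores, citations)
```

In this toy example the leading case `"L"` rises to the top of the ranking although it was absent from the initial hits, which mirrors the "rescue" effect described in the presentation.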
What are we building next?
Ivan Mokanov, President
Finally, Ivan provided the audience with a snapshot of where Lexum currently stands and, from there, gave us a glimpse of the wonderful new features being developed.
To begin with, CanLII can be proud of now being acknowledged as the preferred reference source for Canadian case law according to the Canadian Guide to Uniform Legal Citation, also known as the McGill Guide. CanLII publishes legal decisions from 383 courts and administrative tribunals. It also publishes consolidated statutes for all jurisdictions with point-in-time capability going back 20 years, consolidated regulations for all jurisdictions, and annual statutes for all jurisdictions. CanLII has the best repository of Canadian primary law: we have reached a point where if it’s not on CanLII, it doesn’t exist.
For 30 years now, Lexum’s mission has been to disrupt and democratize legal publishing by automating previously manual processes. Our vision was that case law and legislation could be made more widely accessible at a lower cost through automated processes. Here are a few examples of features that we’ve launched over the years: automated detection and association of citations, keyword and key-phrase extraction, near-duplicate detection, automated case and legislation information extraction, automated document conversion, automated legislative structure recognition, citation graph statistical analysis for search relevance, automatic detection of cases requiring anonymization, sentence and paragraph segmentation…
For the same reason, AI has been central to many of our recent initiatives. In the short term, we plan to focus on a few very down-to-earth projects, such as scaling AI case analysis, introducing legislative summaries, and providing automated translations. In the longer term, we are working toward improving search result ranking through learning and toward developing and implementing neural search, among other things.
Stay tuned! Exciting times are ahead.