Learning from the Citation Network or How to Boost Your Search Results with AI

Lexum is committed to using Artificial Intelligence (AI) to implement new features and improve our clients’ user experience. Lately, we have been working on a search improvement feature for CanLII that learns from the citation network between cases, and it’s a game changer. This new feature dramatically enhances search through Neural Links and Learning Language Models.

The Citation Network project is an AI boost built to refine the search engine and make it more concept-aware. It focuses on three areas: conceptual search, navigational search, and general performance improvements.

Conceptual Search

Conceptual search connects terms with associated concepts that are not explicitly spelled out in the most relevant documents. Even when a document contains no matching keywords, the analysis, discussions, and interpretations found in other documents can tell us which concepts it deals with. The Citation Network project leverages the network of links embedded in case law to learn about these concepts and boost the ranking of documents accordingly.

Let’s look at the “fair dealing” expression, for instance. Legal professionals have a general understanding of the meaning of this expression and its implications. However, many judicial decisions do not include the keywords “fair dealing” even if fair dealing is at the core of the judge’s reasoning. Conceptual search connects the notion of fair dealing with these cases when other decisions addressing the concept cite them in context.
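To make the idea concrete, here is a minimal sketch, not Lexum’s actual implementation (which this post does not describe). It assumes we can extract the text surrounding each citation, and it credits the words in that text to the cited decision. The case identifiers, sample contexts, and function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical citation contexts: (citing case, cited case, text around the citation).
CITATION_CONTEXTS = [
    ("case_A", "case_X", "the fair dealing exception as applied in"),
    ("case_B", "case_X", "fair dealing for the purpose of research, see"),
    ("case_C", "case_Y", "the test for negligence set out in"),
]

def learn_associations(contexts):
    """Map each cited document to counts of the words used when citing it."""
    associations = defaultdict(Counter)
    for _citing, cited, text in contexts:
        for word in text.lower().replace(",", " ").split():
            associations[cited][word] += 1
    return associations

def concept_score(query, doc_id, associations):
    """Sum how often each query word appears in contexts citing doc_id."""
    counts = associations.get(doc_id, Counter())
    return sum(counts[word] for word in query.lower().split())

ASSOCIATIONS = learn_associations(CITATION_CONTEXTS)
```

In this toy data, concept_score("fair dealing", "case_X", ASSOCIATIONS) is positive even if case_X never uses the words “fair dealing” itself, because the decisions citing it do. A real system would presumably use a learned model rather than raw word counts.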

Navigational Search

Navigational search has been improved by associating key terms with corresponding documents. For instance, legal professionals may know that PIPEDA stands for the Personal Information Protection and Electronic Documents Act. The search engine did not. Until recently, there was no match between the two. When searching for PIPEDA, you used to get back only results containing that exact term. The top result was most likely a decision that mentions the acronym, not the statute it refers to.

Since the implementation of the Citation Network project, the first result for PIPEDA is the matching legislative document.

A connection is now made between the PIPEDA acronym and the statute it stands for. Because the feature is based on a learning model, it becomes richer and more precise as additional legal documents are published.
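One way to picture the change in ranking is as a keyword score combined with a learned association score. The sketch below is an assumption for illustration only: the document identifiers, the scores, and the additive formula are all hypothetical and are not taken from Lexum’s system.

```python
# Hypothetical keyword scores from an ordinary full-text match on "PIPEDA".
# The statute scores 0.0 here on the assumption that its text uses the full title.
KEYWORD_SCORES = {"decision_123": 3.0, "pipeda_statute": 0.0}

# Hypothetical learned strength between the query term and each document,
# e.g. derived from how often citing texts say "PIPEDA" when linking to it.
LEARNED_SCORES = {"decision_123": 0.5, "pipeda_statute": 9.0}

def rank(keyword_scores, learned_scores, weight=1.0):
    """Order documents by keyword score plus a weighted learned score."""
    doc_ids = set(keyword_scores) | set(learned_scores)
    combined = {
        doc_id: keyword_scores.get(doc_id, 0.0)
        + weight * learned_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Highest combined score first; ties broken by identifier for stable output.
    return sorted(combined, key=lambda doc_id: (-combined[doc_id], doc_id))
```

With weight=0.0 the ranking falls back to pure keyword matching and the decision comes first. With the default weight, the statute moves to the top, which mirrors the before and after behaviour described above.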

General Performance Improvements

Learning from the citation network also improves the general performance of Lexum’s search engine in various ways.

First, the algorithm is much better at returning highly authoritative documents for broad conceptual queries. For this kind of query, the document presented as the top result is more often the relevant statutes or the highly cited Supreme Court cases on the topic at stake.
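As a simple illustration of how a citation network can signal authority, the sketch below counts how often each document is cited and turns that count into a dampened boost. Using raw citation counts with a logarithm is our assumption for the example, not a description of the production algorithm, and the identifiers are hypothetical.

```python
import math
from collections import Counter

# Hypothetical citation edges: (citing document, cited document).
CITATIONS = [
    ("case_A", "scc_case"),
    ("case_B", "scc_case"),
    ("case_C", "scc_case"),
    ("case_A", "minor_case"),
]

def authority_boost(citations):
    """Return a log-scaled boost per document based on how often it is cited."""
    in_degree = Counter(cited for _citing, cited in citations)
    # log1p dampens the effect so heavily cited documents do not swamp relevance.
    return {doc_id: math.log1p(count) for doc_id, count in in_degree.items()}

BOOSTS = authority_boost(CITATIONS)
```

Here the document cited three times receives a larger boost than the one cited once, so it would tend to rise for a broad query where both are otherwise equally relevant.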

Second, the algorithm is also much better at directing users to the specific sections or concepts tied to a given legislative subunit, such as a single section of a statute.

Third, we have observed that results are consistent between conceptual queries and navigational queries that are essentially identical. For example, “Charter 2b” and “freedom of expression” effectively return the same results.

The Citation Network project dramatically improves the relevance of search results. It provides a better ranking for broad, topic-based queries, as well as for navigational queries. Moreover, these improvements come with no increase in response times. And because the feature is based on a learning model, the larger the database, the more accurate the results will get.