Answers for Searching

Augmenting Intelligence Since 2020

Jan 10, 2020 11:43:12 AM / by Curtis Wadsworth, Founder, CEO

The 19th century’s 20’s brought in the first industrial revolution, with powered looms and steam-powered railroads. The 20th century’s 20’s were roaring, an age of prosperity that ushered in widespread use of automobiles, telephones, movies, radio, electrical appliances, and air travel.


Cliff and Dorothy together would have been unstoppable. 

Will the 21st century’s 20’s be the augmented 20’s?

We are, or have been, living in the “Information Age.” According to Wikipedia, the Information Age is “characterized by the rapid shift from traditional industry that the Industrial Revolution brought through industrialization to an economy primarily based upon information technology.” It’s hard to know what “information technology” means in this context. To my mind, the economy is primarily based on the purchase and sale of electronic gadgets that facilitate the capture and dissemination of information (Apple, Sony, Samsung) and on companies that manage the information (Google, Facebook, Instagram, LinkedIn). People seem to REALLY like this “information technology,” so despite its shortcomings, I suspect it will remain an important part of the economy in the 20’s.

The Information Age has created numerous ways to capture information and a virtually unlimited capacity to store it. The problem: We don’t have reliable means for retrieving the stored information. Google is great, don’t get me wrong, but try to find that corgi video you watched 25 times and sent to all your friends in June of 2012 (after you’ve finished reading this post). Good luck…

This deficiency has made much of the information we’ve worked so hard to capture and store unusable. Neither the user nor the computer has the vocabulary to distinguish the cute corgi video that you remember from 2012 from the millions of cute corgi videos that have been captured and disseminated before and since you spent 2 hours binge-watching a little fluff ball throw a ball to himself 8 years ago.

Wouldn’t it be easier to find your favorite videos if you could describe what happens in the video (a corgi puppy throws a ball in the air and catches it) and your computer/smartphone/tablet could understand your description and return videos that meet it?

That’s exactly what AI technology, natural language processing specifically, promises to do: Make information accessible by allowing you and your device to speak the same language with the goal of making it easier to retrieve specific information quickly.

Intelligence is measured against time. In our society, geniuses know the answer. Really smart people can extrapolate the answer from what they know. Smart people know where to look for the answer. The rest of us hope Wikipedia is right. Nearly every test you’ve ever taken is timed, and clients often judge attorneys by “responsiveness,” i.e., how quickly they respond to their questions. Time separates the Ken Jenningses of the world from the Cliff Clavins. Having the internet’s collective knowledge at your fingertips will undoubtedly turn many Cliffs into Kens by augmenting intelligence.

Boolean searching was a good start, and Google has done an incredible job of leveraging user data to identify what consumers are looking for in a few words. For example, a search for “meatball Pittsburgh” will immediately return Emporio: A Meatball Joint, which is a darn good place for meatballs, as the first result. Emporio is returned first because Google users have viewed the Emporio page more often than Sienna Mercato, the larger restaurant that encompasses Emporio. Weighting by popularity, scoring pages that were visited by users who entered similar queries more highly, works great for consumers and basic searching.
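To make the idea of weighting by popularity concrete, here is a minimal Python sketch. It is an illustration only, not a description of how Google actually ranks pages: the click log, the "Some Pierogi Place" entry, and the `normalize` and `rank_by_popularity` helpers are all made up for this example.

```python
from collections import Counter

# Hypothetical click log: (query a user typed, page that user then visited).
CLICK_LOG = [
    ("meatball pittsburgh", "Emporio: A Meatball Joint"),
    ("meatball pittsburgh", "Emporio: A Meatball Joint"),
    ("meatball pittsburgh", "Sienna Mercato"),
    ("pittsburgh meatballs", "Emporio: A Meatball Joint"),
    ("pittsburgh pierogi", "Some Pierogi Place"),
]


def normalize(query):
    """Lowercase, crudely strip plural 's', and ignore word order."""
    return frozenset(term.rstrip("s") for term in query.lower().split())


def rank_by_popularity(query, click_log):
    """Order pages by how often users who typed a similar query visited them."""
    wanted = normalize(query)
    clicks = Counter(page for q, page in click_log if normalize(q) == wanted)
    return [page for page, _ in clicks.most_common()]


print(rank_by_popularity("Meatball Pittsburgh", CLICK_LOG))
```

In this toy log the most-visited page comes back first. Notice that nothing in the ranking looks at what the pages actually say.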

Unfortunately, weighting by popularity is not helpful when searching for technical information, scholarly articles, or patents. In most cases, search engines designed for technical information score results by the number of times the query terms appear. Our “meatball Pittsburgh” search on Google Scholar returns the book Pickles to Pittsburgh: The Sequel To Cloudy With A Chance Of Meatballs and “New structures in complex formation between DNA and cationic liposomes visualized by freeze-fracture electron microscopy,” a scholarly article by a University of Pittsburgh scientist that describes liposomes as “meatballs.” Notably, the terms “meatball(s)” and “Pittsburgh” are bolded in the description.

Most search engines simply count the number of times the search terms are used. There is no accounting for the relationship between the search terms, the spatial relationship of the terms in the text, or context. “Meatballs,” the food, and “meatballs,” the liposomes, are treated as equivalent, and top references use the term more often than lower-scored references regardless of how it is used. These searches return thousands of results that must be carefully reviewed to find texts in which the terms are used together in the proper context. Term counting is not a great way to search if you are trying to find information quickly.
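Term counting is simple enough to sketch in a few lines of Python. The two documents below are invented for illustration, and `term_count_score` is a hypothetical helper, not the scoring function of any real search engine.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase words with a crude plural strip (meatballs -> meatball)."""
    return [t.rstrip("s") for t in re.findall(r"[a-z]+", text.lower())]


def term_count_score(query, text):
    """Score a text by how many times the query terms occur in it."""
    counts = Counter(tokens(text))
    return sum(counts[term] for term in tokens(query))


# Hypothetical documents: one about food, one about liposomes.
DOCS = {
    "menu": "Our meatball sandwich is a Pittsburgh favorite.",
    "paper": (
        "Pittsburgh researchers observed liposome meatballs; the meatballs "
        "formed complexes with DNA, and the meatballs persisted."
    ),
}

query = "meatball Pittsburgh"
scores = {name: term_count_score(query, text) for name, text in DOCS.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

The liposome text outscores the menu simply because it repeats the word more often, which is exactly the weakness described above.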

Among other things, natural language processing (NLP) applies the whole search query to the search, taking into account the relationships between the terms, their spatial relationship, and how they are used. Rather than simply searching for “meatballs Pittsburgh,” NLP-based search allows users to describe exactly what they are looking for. For example, a search for “a meatball sandwich with bacon aioli on ciabatta” will find menu descriptions and recipes that include each of the components of the query: meatballs, bacon aioli, and ciabatta. NLP-based search engines understand the relationship between words in the query and in the results, ranking a description of a “meatball sandwich” higher than a description of a “bacon sandwich.” They also understand the context of the query, returning descriptions of food before descriptions of liposome meatballs.
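One small ingredient of this, the spatial relationship between terms, can be sketched in Python. This is a toy under stated assumptions, not how Dorothy or any real NLP engine works: `proximity_score` is a hypothetical helper that rewards texts where all the query terms sit close together, and it knows nothing about meaning or context.

```python
import re


def tokens(text):
    """Lowercase words with a crude plural strip (meatballs -> meatball)."""
    return [t.rstrip("s") for t in re.findall(r"[a-z]+", text.lower())]


def proximity_score(query, text):
    """Return (number of query terms) / (smallest window of words holding
    all of them), or 0.0 if any query term is missing from the text."""
    terms = set(tokens(query))
    last_seen = {}  # query term -> index of its most recent occurrence
    best = None
    for i, word in enumerate(tokens(text)):
        if word in terms:
            last_seen[word] = i
            if len(last_seen) == len(terms):
                # Smallest window ending at i that contains every term.
                span = i - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return 0.0 if best is None else len(terms) / best


query = "meatball sandwich"
close = "A meatball sandwich with bacon aioli on ciabatta."
far = "A bacon sandwich on ciabatta. Sides include fries, salad, and meatballs."

print(proximity_score(query, close), proximity_score(query, far))
```

Both texts contain “meatball” and “sandwich,” so term counting alone would tie them. The proximity score prefers the one where the words actually belong together. Telling food from liposomes takes more than this, since context requires a model of meaning rather than distance.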

There is a lot of work to be done before NLP-based search engines truly understand written or spoken language. But by using the relationships between terms, their spatial relationship, and their context, NLP-based search is a huge advance in search technology. Dorothy returns 100 results that include sentences and paragraphs that describe the subject matter of the search query, rather than thousands of results that simply use the terms, dramatically reducing the amount of time required to review search results. As this technology improves, the results will become even more accurate, answering the questions encompassed by your queries more quickly and augmenting your intelligence.



Tags: Insider, lawyer, Natural Language Processing, AI, Legal Tech