Answers for Searching

Introducing DiversiSEARCH

Nov 15, 2019 10:31:32 AM / by Curtis Wadsworth, Founder, CEO

This week we are pushing a revolutionary update that will improve the results Dorothy returns across technologies. The update includes several improvements, among them better relevancy scoring and reduced redundancy. By far the biggest improvement is the introduction of DiversiSEARCH™ technology to the platform.

Power user and professional student John Connor Smythe expressed excitement upon reviewing DiversiSEARCH results.

What is DiversiSEARCH™?

DiversiSEARCH™ allows Dorothy to carry out the same search using multiple search models. We are currently using three algorithms: two proprietary NLP-based models and a keyword algorithm similar to those of traditional search platforms like Google Patents. We apply these models to separately indexed databases and have built in the flexibility to add more models and databases. This reduces bias and allows us to incorporate search models that are better at searching particular technologies. In addition, DiversiSEARCH™ is a big step in creating synergy between the searcher and the platform, allowing users to harness the power of AI-driven patent search.
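To make the idea of pluggable models concrete, here is a minimal sketch, not Dorothy's actual implementation. It assumes each search model can be treated as a function that takes a query and returns (document ID, relevancy score) pairs from its own index. The names `SearchModel`, `keyword_model`, and `run_all`, and the two toy documents, are hypothetical and exist only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: a search model maps a query to (doc_id, score) pairs.
SearchModel = Callable[[str], List[Tuple[str, float]]]


def keyword_model(query: str) -> List[Tuple[str, float]]:
    """Toy keyword model: score is the fraction of query terms found in a document."""
    # Stand-in for this model's own index (illustrative data only).
    docs = {
        "US-A": "battery electrode coating",
        "US-B": "neural network training",
    }
    terms = set(query.lower().split())
    hits = []
    for doc_id, text in docs.items():
        overlap = len(terms & set(text.split()))
        if overlap:
            hits.append((doc_id, overlap / len(terms)))
    return hits


def run_all(models: Dict[str, SearchModel], query: str) -> Dict[str, List[Tuple[str, float]]]:
    """Run the same query through every registered model concurrently."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(model, query) for name, model in models.items()}
        return {name: future.result() for name, future in futures.items()}


# Adding a model is just adding an entry to the registry.
models: Dict[str, SearchModel] = {
    "keyword": keyword_model,
    "stub_nlp": lambda query: [("US-B", 0.5)],  # placeholder for an NLP-based model
}
results = run_all(models, "battery electrode")
```

Because each model is only a dictionary entry here, a new model or a new database can be added without touching the code that runs the search.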

There can be no single “magic bullet” search model that can search all technologies, all document types, and all writing styles. Like humans, some search algorithms are better at searching certain technologies like software or electronics and not so good at searching other technologies such as chemistry. We can exploit these characteristics by incorporating diverse search models into the platform. The platform will provide better results than each of these search models individually, making Dorothy an expert at searching all technologies.

As we discussed last week (see Exploring Relevancy 2), professional searchers typically carry out the same or very similar searches on different platforms to exploit differences in search models and identify all relevant results. Every search engine is built on a model that compares an object in a query to objects in a database. In Dorothy’s case, the query object is a description of an invention and the objects in the database are patent publications. Most models rely on the same basic technologies. Subtle differences in how the query and database are parsed and indexed, and in the weighting factors used to score or rank results, cause them to return different results, or the same or similar results in a different order.

Unfortunately, the vast majority of the returned results across platforms are the same, so the searcher spends a lot of analysis time identifying duplicates. This process is inefficient and largely ineffective.

More importantly, the presence of a particular reference in numerous lists introduces bias. Your natural inclination is to assume that a reference is important if it appears in the returned results of multiple searches on several different platforms. Judged on its own, without the knowledge that it appears in multiple result lists, this same reference may not look relevant at all. The identified subject matter may be described at length only in the background section or boilerplate, for example. I’ve had this conversation with our interns many times: “Where did this reference come from?” I ask. “It was in the top 10 returned results in all of our test platforms,” they respond.

With DiversiSEARCH™, Dorothy simultaneously searches the same database, indexed in different ways, using multiple models. The results from each search are compiled and analyzed based on their relevancy scores. A result that appears in multiple lists but has a lower relevancy score than another reference that is not returned in every list will appear in the returned results (once) below the reference with the higher relevancy score. The fact that it is consistently returned does not produce bias.
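The compile-and-rank step described above can be sketched as follows. This is an illustration under stated assumptions rather than Dorothy's actual code: it assumes the relevancy scores from different models are already on a comparable scale, and that a reference returned by several models keeps its single highest score with no bonus for appearing more than once. The function name `merge_results` and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def merge_results(result_lists: Dict[str, List[Tuple[str, float]]]) -> List[Tuple[str, float]]:
    """Merge per-model result lists into one de-duplicated ranking.

    Assumption: scores are comparable across models, and a duplicate
    keeps its highest score. Appearing in many lists earns no boost.
    """
    best: Dict[str, float] = {}
    for results in result_lists.values():
        for doc_id, score in results:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    # Highest score first; ties broken by document ID for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


# US-A is returned by all three models, US-B by only one.
result_lists = {
    "nlp_a": [("US-A", 0.62), ("US-B", 0.91)],
    "nlp_b": [("US-A", 0.60)],
    "keyword": [("US-A", 0.58), ("US-C", 0.40)],
}
merged = merge_results(result_lists)
```

In this example US-A appears in every list yet is shown once, below US-B, because only the relevancy score decides the order.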

Dorothy with DiversiSEARCH™ also eliminates weighting bias by allowing the user to adjust weighting factors using the “Relevancy Criteria” (“Novel Feature Weight”) slide bar. Weighting factors used in current search models are typically static. All searches performed on platforms like Google Patents are run against the same model with identically set weighting factors, creating bias toward references that meet the weighting requirements.
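As a rough illustration of why an adjustable weight matters, consider the sketch below. It is not Dorothy's scoring formula, which is not described here. It assumes, purely for illustration, that a result's score is a linear blend of a novel-feature similarity and an overall similarity, with the slider supplying a weight between 0 and 1. The names `combined_score` and `rank`, and the sample values, are hypothetical.

```python
from typing import Dict, List, Tuple


def combined_score(novel_feature_sim: float, overall_sim: float, novel_feature_weight: float) -> float:
    """Blend two similarity values using a user-set weight in [0, 1] (assumed formula)."""
    if not 0.0 <= novel_feature_weight <= 1.0:
        raise ValueError("novel_feature_weight must be between 0 and 1")
    return novel_feature_weight * novel_feature_sim + (1.0 - novel_feature_weight) * overall_sim


def rank(candidates: Dict[str, Tuple[float, float]], novel_feature_weight: float) -> List[str]:
    """Return document IDs ordered by blended score, highest first."""
    return sorted(
        candidates,
        key=lambda doc_id: combined_score(*candidates[doc_id], novel_feature_weight),
        reverse=True,
    )


# (novel-feature similarity, overall similarity) per document; illustrative values.
candidates = {
    "US-A": (0.9, 0.3),  # strong on the novel feature, weak overall
    "US-B": (0.4, 0.8),  # weak on the novel feature, strong overall
}
low_weight_order = rank(candidates, 0.2)
high_weight_order = rank(candidates, 0.8)
```

With the weight at 0.2 the overall match dominates and US-B ranks first; at 0.8 the novel feature dominates and US-A moves to the top. A static weight would lock every search into one of these orderings.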

Our goal is to produce a tool that creates synergy between the searcher and technology. Dorothy allows the searcher to control nearly every aspect of the search by targeting specific elements (“Novel Feature”), adjusting various weighting factors (“Relevancy Criteria”), and searching across different databases. A future update will allow the user to adjust the weight of multiple “Novel Features” individually, allowing them to be more heavily weighted or excluded from the returned results. The platform is an extension of the searcher. Dorothy is your partner, using your expertise to produce better results.

DiversiSEARCH™ is a huge step forward for search technology. We are excited to bring it to patent professionals (the most uncompromising users of search technology on the planet).

Oh yeah! The best part: DiversiSEARCH™ does not impact search time. With each search taking less than 2 minutes, you’ll be able to perform numerous searches, identify the most relevant prior art, and grab a cup of coffee in less than 30 minutes (several thousand times faster than your alternative legal service provider).


Tags: Patent Law, Natural Language Processing, AI, Legal Tech, Creative Solutions, Extreme Problem Solving