This week we are pushing a revolutionary update that will improve the results returned by Dorothy across technologies. The update includes a number of improvements, among them better relevancy scoring and reduced redundancy. By far the biggest improvement is the introduction of DiversiSEARCH™ technology to the platform.
Last week we discussed relevance and the advantages that NLP-based search engines have compared to their keyword-searching counterparts. Because NLP understands the elements of a search query in context, NLP-based engines, like Dorothy, have a clear advantage over keyword-based search engines. We used relatively simple examples to illustrate this point. But there’s more.
The single most important determinant of whether a search was successful is whether the documents returned by the search engine are relevant to the user. As you might imagine, relevancy is a daily discussion at DorothyAI, since we want to have happy customers who view our returned results as being highly relevant.
Two-thirds of the almost 150 patent lawyers and agents we talked to over the last year describe duplicates in the returned results as a major problem with current search tools. We have a plan for removing duplicates that appears to be working pretty well in testing. The best part is that we can use what we learned finding duplicates to add even more value to the platform. We’ll talk about our solution in a future post.
At first, multiple Shia LaBeoufs were fun...
Dorothy uses natural language processing to search the patent database. Many search platforms have semantic search capabilities, which seem to vary in their effectiveness. As with Dorothy, the semantic search query is a plain English description of the thing being searched for. You are probably asking yourself, “What’s the difference?”
Patents are REALLY important in the pharmaceutical industry. Taking a new drug to market costs $3-$5 billion and can take up to 16 years thanks to the arduous FDA approval process. Even though many patents that cover new drugs have less than 5 years of pendency after the drug is approved for sale, 80% of the overall revenue pharmaceutical companies make is tied directly to a patent claiming an approved drug. VCs and institutional investors understand this, and rarely invest in drugs that are not covered by at least one patent. Basically, if you are going to raise money for a biotech company and/or you want to recoup the cost of bringing the drug to market, you are going to need a patent.
We are in a transition here at DorothyAI, as we move from a company that creates software solutions to a company that sells software solutions. Actually, we’re a company that sells the software solutions it is still creating. I don’t know if that is a “transition.” In any case, we’ve spent this week looking at feedback from our previewers and setting our goals for the future.
All things considered, the USPTO patent database is well curated. Millions of patent documents (issued patents and application publications) are available for search and download. The documents include the complete application (title, abstract, specification, and claims), along with various important dates (filing, publication, and issue dates) and lists of references submitted or cited during prosecution. Not bad.