Tuesday, October 15, 2019

The sad state of NLP tech in the marketplace today

Update: Sinequa has caught up to us in going straight to the answer, EXCEPT we are FASTER and MORE ACCURATE. Don't believe it? Try the first 3 covid queries on our covid virus search at noonean.com, then try them against Sinequa's covid search. We blow them away. One good thing about so many companies doing covid search: we finally get put-up-or-shut-up comparisons.
-------------------------------------

Very few products use advanced NLP techniques. We've scoured the marketplace and found product after product lacking.

Here is a brief list of things they do:
 
     Phrase recognition
     Simple named entity recognition
     Part-of-speech tagging
     Lemmatization
     User Intent Analysis
     Disambiguation

While it's true that all of these are parts of NLP, none of them are ADVANCED NLP techniques. Do they help improve search? Yes. But not as much as advanced techniques would.

Here is how one competitor describes their tech:

  • Semantic search: uses a variety of signals to understand the user’s intent and handle ambiguity. For example, semantic search understands that when a user searches for “profit,” she would also want to find data sets that reference “net income.” Similarly, if a user searches for “NJ,” they would also return entries for “New Jersey.”
  • Lemmatization uses variations on words such as plurals, tenses, genders, hyphenated forms, and more. For example, a search for “running” would return matches for “runs” or “ran.”
  • Advanced syntax increases precision through techniques such as phrase search, fielded search, Boolean matching, and proximity search.
  • Fuzzy matching increases recall and allows for looser matching. Substring and approximate matches allow users to find data sources when they only have partial information or even incorrect information.
None of this is Advanced NLP. Sadly, it's commonplace for companies to pitch NLP tech in their search when it's all very rudimentary. You can't argue that lemmatization and simple named entity extraction are NOT NLP, because they are. But they are really just the first few steps of a processing chain, with the higher order functions lopped off and ignored. And why is that? Because these companies don't really have backgrounds in doing NLP research.
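To see how little machinery the features in that competitor's description need, here is a toy Python sketch of synonym expansion, lemmatization, and fuzzy matching. The lookup tables and function names are hypothetical and made up for illustration. This is not any vendor's actual code, and real products use far larger dictionaries, but the basic idea is about this simple.

```python
import difflib

# Hypothetical hand-built tables; a real product would ship far larger ones.
SYNONYMS = {"profit": ["net income"], "nj": ["new jersey"]}
LEMMAS = {"running": "run", "runs": "run", "ran": "run"}


def lemmatize(word):
    """Map an inflected form to its base form by plain table lookup."""
    word = word.lower()
    return LEMMAS.get(word, word)


def expand_query(query):
    """Return the lowercased query followed by any synonyms listed for it."""
    key = query.lower()
    return [key] + SYNONYMS.get(key, [])


def fuzzy_match(term, vocabulary, cutoff=0.8):
    """Return vocabulary entries that approximately match a possibly misspelled term."""
    return difflib.get_close_matches(term.lower(), vocabulary, n=3, cutoff=cutoff)
```

A search for "running" and a search for "ran" both reduce to "run", "NJ" expands to include "new jersey", and a misspelled "revenu" can still find "revenue". Useful, yes. Advanced, no.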

It's a successful strategy that gets investment and sales, but really it's a shoddy product. Just like the AI buzz that gets everyone funded, it's quite troubling that such minimal, low-level tech is presented as the state of the art. It isn't.

Shockingly, things are no better at Google, Microsoft, or Amazon regarding their NLP search offerings. Built around User Intent recognition, Sentiment analysis, and disambiguation, these products are aimed at Chat Bots, not Enterprise Search.



Sinequa just recently demoed their tech at KMWorld. First you get a list of documents, and then you have to click on each document to see its analysis. Noonean is more advanced: we show you the answers DIRECTLY, not documents.


Noonean is designed from the beginning for enterprise scale and works with a parallel index, so you can trial the technology without any risk of damaging your current enterprise systems. The index can reside as a Solr core on the same machine or on a completely different machine.
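As a rough illustration of what a parallel index means in practice, the same query can be sent to two separate Solr cores, one existing and one new, and the results compared side by side. The sketch below only builds the request URLs for Solr's standard /select handler. The host names and core names are hypothetical placeholders, and this is an assumption about a generic Solr setup, not a description of how Noonean itself is configured.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, query, rows=10):
    """Build a URL for Solr's standard /select handler on the given core."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/solr/{core}/select?{params}"


# Hypothetical hosts and core names, for illustration only.
# The existing core is never modified; the parallel core lives beside it
# (same machine) or on another host entirely.
existing = solr_select_url("http://localhost:8983", "current_search", "covid symptoms")
parallel = solr_select_url("http://nlp-host:8983", "parallel_nlp", "covid symptoms")
```

Because each core has its own index, trialing the second one leaves the first untouched.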


Noonean.com brings more advanced NLP techniques to the market.

Companies see the value of automating customer service and think their traditional search technologies are sufficient for the enterprise. But moving to Cognitive Search and Insight engines with Advanced NLP and AI learning will take things to another level and have a profound impact on their bottom lines. In reality, enterprise search is a crude, first-line form of knowledge management, and fact bases built on NLP Ontologies are the next level after that.



