By David Schubmehl, Research Vice President, Conversational Artificial Intelligence and Intelligent Knowledge Discovery, IDC
Consumers and knowledge workers have high expectations for search today. Web search engines and prominent social media sites often provide usable and useful search results within a couple of iterations. However, many organizations’ search engines and systems can’t duplicate that success. Why is this happening?
There are several reasons that this occurs:
- The searcher uses terms and phrases in their search query that aren’t mentioned in the relevant topical documents
- There are so many results that the search system can’t determine which ones are important to the search, so it returns all the results in a haphazard fashion
- The information that the searcher is looking for may be spread across several documents and only by piecing this information together can they find the answer that they want
- The searcher isn’t exactly sure what they’re looking for, but they’ll know it when they see it
- The content that knowledge workers and consumers are looking for isn’t always indexed by the search system
When these failures pile up, searchers grow frustrated and become unwilling to use their organization's search system in the future because "our search sucks."
The good news is that most of these problems can be solved, and advanced search solutions are using artificial intelligence, natural language processing, and machine learning to address them.
Let's take the first item: the searcher uses terms and phrases in their query that aren't mentioned in the relevant documents. In the past, this problem was tackled by maintaining a thesaurus of similar and related terms, but keeping a thesaurus up to date is a time-consuming, manual practice. Today, incorporating deep learning language models such as BERT into the search algorithm lets a search system find similar or more relevant documents without a manually maintained thesaurus.
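The core idea is that a language model maps queries and documents into numeric vectors, so a query about "fixing my vehicle" can land near a "car maintenance guide" even though they share no words. Here is a minimal sketch using invented three-dimensional toy vectors standing in for real model output (BERT-style embeddings have hundreds of dimensions, and these document titles are hypothetical):

```python
import math

# Toy vectors standing in for embeddings produced by a model like BERT.
# Titles and numbers are invented for illustration only.
doc_vectors = {
    "car maintenance guide": [0.90, 0.10, 0.20],
    "automobile repair manual": [0.85, 0.15, 0.25],
    "quarterly sales report": [0.10, 0.90, 0.05],
}

def cosine(a, b):
    """Cosine similarity: how closely two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend this is the embedding of the query "fixing my vehicle" --
# no word overlap with any title, but the vector is close to the car docs.
query_vector = [0.88, 0.12, 0.22]

ranked = sorted(doc_vectors.items(),
                key=lambda kv: cosine(query_vector, kv[1]),
                reverse=True)
```

Both automotive documents rank above the sales report despite sharing no terms with the query, which is exactly the gap a manual thesaurus used to fill.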
Make Data Findable
Performing natural language processing (NLP) on documents while indexing them can also be beneficial. Entities such as people, places, and events can be extracted and stored as document metadata, helping searchers understand what a document may be about. Relationships between entities can likewise be extracted with NLP, which also helps the search system recognize what documents are about. Finally, NLP can automatically categorize and classify documents, making it easier for searchers to differentiate between groups of documents when a search term or phrase returns too many results to be useful.
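Entity extraction at index time can be sketched as follows. A production system would use a trained NER model (spaCy, for example); here a tiny hand-made gazetteer stands in for the model, and the names and document text are invented:

```python
# Hypothetical gazetteer standing in for a real NER model's output.
PEOPLE = {"Ada Lovelace", "Alan Turing"}
PLACES = {"London", "Cambridge"}

def extract_entities(text):
    """Return entities found in the text, grouped by type,
    ready to be attached to the document as searchable metadata."""
    found = {"people": [], "places": []}
    for name in PEOPLE:
        if name in text:
            found["people"].append(name)
    for name in PLACES:
        if name in text:
            found["places"].append(name)
    return found

# At index time, enrich each document with extracted metadata.
doc = {"body": "Alan Turing studied at Cambridge before working in London."}
doc["metadata"] = extract_entities(doc["body"])
```

Once stored alongside the document, these fields let the search system filter or facet on "people" and "places" even when the query itself never names an entity type.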
These AI-generated classifications and categorizations can be helpful to a searcher who is not sure what they’re looking for. By browsing the various categories, searchers can more easily get a feel for what information is indexed in that system relevant to the terms that the searcher used in their queries.
Personalize Search Results
Another use for AI and ML is to provide the searcher with more personalized results based on their past search history, job role, and other pertinent information. Fed to machine learning algorithms, this data can help the search system focus on what the searcher was interested in previously, or on what is most relevant to their job role, and return results more closely linked to those items.
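A minimal way to picture personalization is as a re-ranking step: the base relevance score is boosted when a result matches the user's search history or role. The field names, topics, and boost weights below are invented for illustration; a real system would learn such weights from click data:

```python
def personalize(results, profile):
    """Re-rank results, boosting documents whose topic matches the
    user's past searches or job role. Boost weights are hypothetical."""
    def score(doc):
        boost = 0.0
        if doc["topic"] in profile["history"]:
            boost += 0.2   # seen this topic in past searches
        if doc["topic"] == profile["role_topic"]:
            boost += 0.1   # matches the user's job role
        return doc["relevance"] + boost
    return sorted(results, key=score, reverse=True)

results = [
    {"title": "Benefits FAQ", "topic": "hr", "relevance": 0.70},
    {"title": "API style guide", "topic": "engineering", "relevance": 0.65},
]
# An engineer who has searched engineering topics before.
profile = {"history": {"engineering"}, "role_topic": "engineering"}
reranked = personalize(results, profile)
```

For this engineer, the style guide overtakes the generically higher-scoring HR document, while a user with a different profile would see the original order.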
Finally, AI and ML can be used by the search system to create smarter indexing systems, eliminating duplicate or near duplicate documents and relating documents together, even though they may be from completely different document repositories. Creating smarter indexing systems makes searches easier and more efficient because unnecessary duplicates are filtered out.
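One common way to catch near duplicates at index time is shingling with Jaccard similarity: split each document into overlapping word n-grams and drop any document whose shingle set overlaps too heavily with one already indexed. This is a toy sketch with an arbitrary threshold; large-scale systems approximate the same comparison with techniques such as MinHash:

```python
def shingles(text, k=3):
    """Break text into overlapping k-word shingles."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity between two shingle sets (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def dedupe(docs, threshold=0.7):
    """Keep only one copy of near-duplicate documents at index time.
    The 0.7 threshold is an illustrative choice, not a standard value."""
    kept = []
    for doc in docs:
        sig = shingles(doc)
        if all(jaccard(sig, shingles(k)) < threshold for k in kept):
            kept.append(doc)
    return kept

docs = [
    "the quarterly report shows strong growth in all regions",
    "the quarterly report shows strong growth in all regions this year",
    "employee handbook covering vacation and leave policies",
]
deduped = dedupe(docs)
```

The two nearly identical report versions collapse into one entry, so the searcher sees each distinct document once even if copies live in different repositories.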
The bottom line is that artificial intelligence and machine learning can be a tremendous boon to search systems and many advanced search systems are now integrating these technologies into their solutions and applications.
Dave Schubmehl's research covers information access and artificial intelligence technologies around conversational AI, including speech AI and text AI, machine translation, embedded knowledge graph creation, intelligent knowledge discovery, information retrieval, unstructured information representation, knowledge representation, deep learning, machine learning, unified access to structured and unstructured information, chatbots and digital assistants, and rich media search in SaaS, cloud, and installed software environments. This research analyzes the trends and dynamics of the text and audio AI software markets and the costs, benefits, and workflow impact of solutions that use these technologies.
A word from SearchBlox:
As Dave mentions, it's easy to blame the search tool, or even the IT department, when users can't find what they're looking for. Those of us managing the back end know that it takes significant effort to make content findable in the first place, and the cost to manually address the problem typically exceeds budgets and team capacity. By applying AI across all four stages of enterprise search, we're using ML to address the root-cause challenges in practical ways. For example, the ability to fix your content during setup with PreText™ NLP makes it easy to make content like PDFs and Microsoft Office-generated files more findable in the first place. The tool automatically adds titles, descriptions, and metadata, so it's a game changer when dealing with large volumes of data from a variety of sources. Once the content is properly labeled and indexed, the right tools make it easy to adjust relevance by audience, tag and group data from across the enterprise, or discover what's really missing from your website, support portal, or even your product mix.