Auto-Complete Suggestions with OpenSearch

Hemendra Chaudhary
4 min read · Jun 28, 2024

Introduction

Many modern applications offer suggestions as you type, predicting what you are likely to enter. This improves the user experience: users are more likely to find what they are looking for, and they save time that would otherwise be spent searching manually. OpenSearch is a good foundation for auto-complete suggestions because it is a robust open-source search and analytics engine that handles many types of data and queries.

Understanding Index Mappings in OpenSearch

Index mappings in OpenSearch define the structure and behaviour of the data in an index. When creating an index for auto-complete, it is important to choose appropriate data types for the fields used for searching and suggesting.

keyword mapping: suppose you define a text field with a keyword sub-field, like this:

PUT /bookstore
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

When you then query the keyword sub-field (title.keyword), you have to supply the whole value, because a keyword field stores the value as a single, non-analysed term. For example, index this document:

POST /bookstore/_doc
{
  "title": "The Lord of the Rings: The Fellowship of the Ring"
}

When you execute a search like this:

GET /bookstore/_search
{
  "query": {
    "match": {
      "title.keyword": "The Lord"
    }
  }
}

it will not match any documents. You have to search with the whole value, “The Lord of the Rings: The Fellowship of the Ring”.

A text mapping, on the other hand, is analysed, so you can search using individual tokens from the field value. For example, a full-text search on the title field:

GET /bookstore/_search
{
  "query": {
    "match": {
      "title": "The Lord"
    }
  }
}

This will return the matching document.
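To make the difference more concrete, here is a small Python sketch that emulates the two mappings in memory. It does not talk to OpenSearch: analyze_text is only a rough stand-in for the default text analyzer (split on non-alphanumeric characters, then lowercase), and all function names here are hypothetical helpers for illustration.

```python
import re

def analyze_text(value):
    """Rough stand-in for a text analyzer: split into words and lowercase them."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", value)]

def keyword_terms(value):
    """A keyword field keeps the whole value as one unmodified term."""
    return [value]

def match(query, indexed_terms, analyzed):
    """Emulate a match query: analyze the query like the field, then OR the terms."""
    query_terms = analyze_text(query) if analyzed else [query]
    return any(term in indexed_terms for term in query_terms)

title = "The Lord of the Rings: The Fellowship of the Ring"

# Partial value against the keyword term: no match.
print(match("The Lord", keyword_terms(title), analyzed=False))  # False
# Same input against the analysed tokens: match.
print(match("The Lord", analyze_text(title), analyzed=True))    # True
# The exact, complete value against the keyword term: match.
print(match(title, keyword_terms(title), analyzed=False))       # True
```

The real analyzer does more than this, but the sketch shows why title.keyword needs the complete value while title can be searched by any of its tokens.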

By default, keyword fields are both indexed (because index is enabled) and stored on disk (because doc_values is enabled). To save disk space, you can skip indexing a field by setting index to false. If a field needs full-text search, map it as text instead.

For more details, see keyword vs. text.

Implementing Auto-Complete Suggestions with Wildcard Queries

Wildcard queries are a good choice for auto-complete suggestions when using keyword mappings, as they allow for prefix-based searching.

The match_phrase_prefix query can also provide auto-complete suggestions, but in my experience the quality of its results was not up to the mark, and it had issues when the search terms contained spaces.

Similarly, a wildcard query on a text field (or any other analysed mapping) has a problem with spaces: the pattern is compared against individual tokens, so nothing after a space is matched.

So the best option is to use a wildcard query on a keyword mapping if you need auto-complete on non-analysed data.
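The space problem can be illustrated with a short in-memory Python sketch. It is only an emulation under simplifying assumptions: text_tokens approximates the analyser (split into words, lowercase), and fnmatchcase from the standard library stands in for wildcard matching of * and ?. The helper names are hypothetical.

```python
import re
from fnmatch import fnmatchcase

def text_tokens(value):
    """Approximate an analysed text field: lowercase word tokens."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", value)]

def wildcard_hits(pattern, terms):
    """A wildcard pattern is tested against each indexed term on its own."""
    return any(fnmatchcase(term, pattern) for term in terms)

title = "The Lord of the Rings: The Two Towers"

# text field: no single token contains a space, so this pattern cannot match.
on_text_with_space = wildcard_hits("the lord*", text_tokens(title))
# text field: a pattern without spaces can still match one token ("lord").
on_text_single_word = wildcard_hits("lor*", text_tokens(title))
# keyword field: the whole title is one term, so spaces are not a problem.
on_keyword = wildcard_hits("The Lord*", [title])

print(on_text_with_space, on_text_single_word, on_keyword)  # False True True
```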

Let’s assume that we have some documents in the index, such as:

POST /bookstore/_doc
{
  "title": "The Lord of the Rings: The Fellowship of the Ring"
}

POST /bookstore/_doc
{
  "title": "The Lord of the Rings: The Two Towers"
}

POST /bookstore/_doc
{
  "title": "The Lord of the Rings: The Return of the King"
}

Now, using the following query, we can perform the auto-complete search on OpenSearch:

GET /bookstore/_search
{
  "query": {
    "wildcard": {
      "title.keyword": {
        "value": "The Lord of the Rings*"
      }
    }
  }
}

We query the keyword sub-field so that the search runs on the non-analysed value and avoids the issues with spaces described earlier. The query returns all documents whose title starts with “The Lord of the Rings”, which here are all the books in the “The Lord of the Rings” series.
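In an application, the request body is usually built from whatever the user has typed so far. The following is a minimal Python sketch of that step, not the exact code from our product. The helper names and the size limit of 5 are assumptions for illustration, and the escaping assumes that a backslash escapes *, ? and \ in the wildcard pattern, as in the underlying Lucene wildcard syntax.

```python
import json

def escape_wildcards(text):
    """Backslash-escape characters that the wildcard syntax treats specially."""
    return "".join("\\" + ch if ch in "\\*?" else ch for ch in text)

def build_prefix_query(user_input, field="title.keyword", size=5):
    """Build a search body that matches values starting with user_input."""
    return {
        "size": size,  # assumed cap on the number of suggestions returned
        "query": {
            "wildcard": {
                field: {"value": escape_wildcards(user_input) + "*"}
            }
        },
    }

body = build_prefix_query("The Lord of the Rings")
print(json.dumps(body, indent=2))
```

The resulting body can be sent to GET /bookstore/_search with any HTTP client. Escaping the input keeps a user who types a literal ? or * from accidentally widening the pattern.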

It’s important to note that wildcard queries on a keyword field are case-sensitive by default, because the case_insensitive option defaults to false. You can use the case_insensitive option to control this behaviour. For example, to make the case-sensitive search explicit, you can set the case_insensitive option to false:

GET /bookstore/_search
{
  "query": {
    "wildcard": {
      "title.keyword": {
        "value": "The Lord of the Rings*",
        "case_insensitive": false
      }
    }
  }
}

In this case, the query only matches documents whose title starts with “The Lord of the Rings” in the same case as the search term. If you don’t have that requirement, you can set it to true, and then lower-case input such as “the lord of the rings” will match as well.
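The effect of the flag on a prefix search can be sketched in plain Python. This is only an in-memory emulation of the matching rule, with a hypothetical helper name, not a call to OpenSearch.

```python
def prefix_matches(stored, prefix, case_insensitive=False):
    """Emulate a prefix wildcard on a keyword value, with or without case folding."""
    if case_insensitive:
        return stored.lower().startswith(prefix.lower())
    return stored.startswith(prefix)

titles = [
    "The Lord of the Rings: The Fellowship of the Ring",
    "The Lord of the Rings: The Two Towers",
    "The Lord of the Rings: The Return of the King",
]

# Lower-case input with case-sensitive matching: nothing matches.
strict = [t for t in titles if prefix_matches(t, "the lord")]
# Same input with case_insensitive enabled: all three titles match.
relaxed = [t for t in titles if prefix_matches(t, "the lord", case_insensitive=True)]

print(len(strict), len(relaxed))  # 0 3
```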

This is the simplest way to implement auto-complete suggestions. I’ve implemented the same logic in our product, which has a dataset of more than 1 million documents, and it returns results in less than 100 ms, so it has worked well for us.

There are very few resources on this topic, so I had to do a lot of research to find this approach, which is why I wanted to share it with the community.
