Learn Advanced Full-Text Searches With MongoDB Atlas Search | by Lynn Kwong | Apr, 2022

Photo by Daniel Lerman on Unsplash.

As we learned in the previous post, we can now run full-text search queries with MongoDB Atlas Search. If you are implementing a search engine and haven’t decided which tool to use, MongoDB Atlas Search can be a good alternative to traditional tools like Elasticsearch. In this post, we will introduce some advanced settings and search queries for MongoDB Atlas Search. You will find that they are quite similar to their counterparts in Elasticsearch. This is because, under the hood, Apache Lucene is used as the core for both Elasticsearch and MongoDB Atlas Search.

It should be noted that MongoDB Atlas Search is only available if your MongoDB databases are hosted by MongoDB Atlas. It is not available if you manage your MongoDB servers locally.

We will continue to use the laptops collection in the products database we have been working on, because the product names and attributes are good examples for demonstrating full-text searches.

If you haven’t used MongoDB Atlas before, it is recommended to have a quick start with this article. And if you want to follow along, please download this JSON file and use the following command to import the data to MongoDB Atlas. If you don’t have mongoimport installed yet, please follow this link to download and install the MongoDB Database Tools.
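The original command is not reproduced here. As a rough sketch, the snippet below assembles a mongoimport command line in JavaScript so that the placeholders are explicit. The flags `--uri`, `--collection`, `--jsonArray`, and `--file` are standard mongoimport options; the user name, password, cluster host, and the file name `laptops.json` are hypothetical, and the sketch assumes the file holds a single JSON array.

```javascript
// Build a mongoimport command line with explicit placeholders.
// Assumptions: the "products" database and "laptops" collection from this
// post, a file containing one JSON array, and a mongodb+srv connection URI.
function buildImportCommand({ username, password, clusterHost, file }) {
  // Percent-encode the credentials so special characters survive in the URI.
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  const uri = `mongodb+srv://${user}:${pass}@${clusterHost}/products`;
  return [
    "mongoimport",
    `--uri "${uri}"`,
    "--collection laptops",
    "--jsonArray", // the input is a JSON array of documents
    `--file ${file}`,
  ].join(" ");
}

const importCommand = buildImportCommand({
  username: "myUser",                          // hypothetical
  password: "p@ss/word",                       // hypothetical
  clusterHost: "cluster0.example.mongodb.net", // hypothetical
  file: "laptops.json",                        // hypothetical
});
console.log(importCommand);
```

Copy the printed line into a terminal after substituting your own values.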

Note that you need to change the username, password, and cluster name for your own case. If you have network connection issues, remember to check whether your IP address has been added to the IP Access List in the Atlas UI.

When the above command is run, the products database and the laptops collection are created automatically in MongoDB Atlas. The collection contains 200 laptop documents as follows:
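The sample document itself is not reproduced here, so the shape below is only an assumption used by the later sketches. The `_id`, `name`, and `quantity` values come from a result shown later in this post; the `brand` sub-document and the `attribute_name` / `attribute_value` field names and values are hypothetical.

```javascript
// Hypothetical document shape for the laptops collection.
// Only _id, name and quantity are taken from results shown in this post.
const sampleLaptop = {
  _id: 200,
  name: "HP ZBook 14u G6",
  quantity: 8,
  brand: { name: "HP" }, // assumed field
  attributes: [
    // assumed field names and illustrative values
    { attribute_name: "memory", attribute_value: "16GB" },
    { attribute_name: "storage", attribute_value: "512GB" },
  ],
};
console.log(Object.keys(sampleLaptop));
```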

Let’s now create an Atlas Search Index that will be used for full-text searches. You can create an Atlas Search Index with the Atlas UI, the Atlas Search API, or the MongoDB CLI. The simplest way is to use the Atlas UI because you don’t need to specify cluster metadata such as public/private keys, cluster name, group ID, etc.

In the Atlas UI, find your organization, project, and cluster. If you have only one cluster, it is already shown when you open the Atlas UI. Click the cluster name to open the control panels, where you can find the “Search” tab. Click it to open the page for creating Search Indexes.

Now click the “Create Search Index” button to create a Search Index for our laptops collection in the products database. This opens the index creation page.

It is recommended to choose the “JSON Editor” because it allows more advanced configurations. Besides, almost all the documentation of MongoDB Atlas Search uses JSON configurations. Therefore, it is better to get familiar with the JSON settings from the very beginning.

We will use static mapping in this advanced tutorial. In practice, it is also recommended to use static mapping so that you can choose which fields are indexed and how they are indexed. You can have advanced settings like autocomplete and synonyms with static mapping. The mapping document for the laptops collection is:
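The original definition is not reproduced here. The following is a minimal sketch of an index definition with static mapping, written as a JavaScript object so it can be printed as JSON for the JSON Editor. The top-level keys (`analyzer`, `searchAnalyzer`, `mappings`, `synonyms`) follow the Atlas Search index syntax; the synonym mapping name `laptop_synonyms`, the `brand.name` field, and the attribute sub-field names are assumptions.

```javascript
// Sketch of a static Atlas Search index definition for the laptops collection.
// Field names other than "name", "quantity" and "attributes" are assumptions.
const laptopSearchIndex = {
  analyzer: "lucene.standard",       // used at index time
  searchAnalyzer: "lucene.standard", // used for search queries
  mappings: {
    dynamic: false, // only the fields listed below are indexed
    fields: {
      // "name" is indexed twice: as a regular string and for autocomplete.
      name: [{ type: "string" }, { type: "autocomplete" }],
      quantity: { type: "number" },
      brand: { type: "document", fields: { name: { type: "string" } } },
      // For an array of documents, only the element type is declared.
      attributes: {
        type: "document",
        fields: {
          attribute_name: { type: "string" },
          attribute_value: { type: "string" },
        },
      },
    },
  },
  synonyms: [
    {
      name: "laptop_synonyms", // hypothetical mapping name used in queries
      analyzer: "lucene.standard",
      source: { collection: "synonyms" }, // collection in the same database
    },
  ],
};
console.log(JSON.stringify(laptopSearchIndex, null, 2));
```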

Some important notes about the index definition:

  • The analyzer is used when the documents are indexed, and searchAnalyzer is used to analyze the search queries. They are normally the same. The default and also most commonly used analyzer is lucene.standard, which lowercases a string and splits it into tokens on word boundaries such as spaces and punctuation. Note that lucene.standard keeps common stop words such as the, for, and that; language analyzers such as lucene.english are the ones that remove them.
  • We set dynamic to false, so we need to explicitly specify the mapping for each field.
  • The name field is indexed with two types: the regular string type and the autocomplete type, which supports search-as-you-type queries. The autocomplete type has its own settings, but the default ones should be sufficient in most cases.
  • The attributes field is an array of documents. Atlas Search requires only the data type of the array elements.

Importantly, please note that we need to create a separate collection in the same database for the synonym definitions. You can run the following commands with mongosh to create the synonyms collection:
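The original commands are not reproduced here. A minimal sketch, assuming one entry of each mapping type described below: the terms come from the examples in this post, and the `mappingType`, `input`, and `synonyms` keys follow the Atlas Search synonym document format. The guard lets the snippet run in mongosh (where `db` exists) as well as in plain Node.js.

```javascript
// Synonym definitions: one "equivalent" and one "explicit" mapping.
const synonymDocs = [
  // All three terms are interchangeable.
  { mappingType: "equivalent", synonyms: ["Mac", "MacBook", "Macintosh"] },
  // One direction only: a search for "Lenovo" also matches "ThinkPad".
  // "Lenovo" is repeated in synonyms so that it still matches itself.
  { mappingType: "explicit", input: ["Lenovo"], synonyms: ["Lenovo", "ThinkPad"] },
];

// Only runs inside mongosh, where the global `db` is defined.
if (typeof db !== "undefined") {
  db.getSiblingDB("products").synonyms.insertMany(synonymDocs);
}
```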

Note that the synonyms collection should be created in the same database as the target collection to be searched, and should have the same name as the one given in the synonyms field of the index definition.

There are two types of synonyms, namely equivalent and explicit. With the equivalent type, all the synonyms are equal to each other and are interchangeable. With the explicit type, however, the mapping goes in one direction only: the terms in the input field can be replaced by those in the synonyms field, but not the other way around. Therefore, if you search for “Lenovo”, laptops containing only “ThinkPad” but not “Lenovo” will be returned. However, if you search for “ThinkPad”, those containing only “Lenovo” but not “ThinkPad” will not be returned. We will see this with an example later.

After the synonyms collection is created, you can proceed to create the Search Index as shown above.

Now that the Search Index has been created, we can proceed to create and run full-text search queries. With static mappings that include advanced settings like autocomplete and synonyms, we cannot use the “Search Tester” in the Atlas UI to run complex queries; we need to use aggregation pipelines with mongosh or a driver. We will use mongosh in this post because it is not tied to a specific programming language and is thus more generic.

Search 1: Use the autocomplete feature to search as you type:
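The original query is not reproduced here. A minimal sketch of such a pipeline, assuming the index is named `default` and the user has typed the partial term "Len" (both assumptions):

```javascript
// Search-as-you-type on the "name" field with the autocomplete operator.
const autocompletePipeline = [
  {
    $search: {
      index: "default", // assumed index name
      autocomplete: { query: "Len", path: "name" }, // "Len" is an assumed input
    },
  },
  { $limit: 2 },
  // Return the name together with the relevance score.
  { $project: { name: 1, score: { $meta: "searchScore" } } },
];

// Only runs inside mongosh, where the global `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.getSiblingDB("products").laptops.aggregate(autocompletePipeline).toArray());
}
```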

[
  { _id: 2, name: 'Lenovo IdeaPad Y700-15', score: 1 },
  { _id: 8, name: 'Lenovo ThinkPad T470s', score: 1 }
]

We use the autocomplete operator to perform search-as-you-type queries. The path field specifies the field to search against, which must have the autocomplete type defined as shown above. If you want to learn a bit more about the basic syntax of MongoDB Atlas Search aggregation queries, please check this post.

Search 2: Use autocomplete together with the fuzzy search feature.
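The original query is not reproduced here. A minimal sketch, assuming a misspelled input "Lenvo" and the index name `default` (both assumptions). `maxEdits` and `prefixLength` are options of the `fuzzy` setting of the autocomplete operator; the values chosen here are illustrative.

```javascript
// Autocomplete combined with fuzzy matching to tolerate a typo.
const fuzzyAutocompletePipeline = [
  {
    $search: {
      index: "default", // assumed index name
      autocomplete: {
        query: "Lenvo", // assumed misspelling of "Lenovo"
        path: "name",
        fuzzy: {
          maxEdits: 1,     // allow at most one character edit
          prefixLength: 1, // the first character must match exactly
        },
      },
    },
  },
  { $limit: 2 },
  { $project: { name: 1, score: { $meta: "searchScore" } } },
];

// Only runs inside mongosh, where the global `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.getSiblingDB("products").laptops.aggregate(fuzzyAutocompletePipeline).toArray());
}
```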

[
  { _id: 2, name: 'Lenovo IdeaPad Y700-15', score: 1 },
  { _id: 8, name: 'Lenovo ThinkPad T470s', score: 1 }
]

This query has the same result as the one above, which means that the fuzzy search and autocomplete features are working properly. Note that we specify the maxEdits and prefixLength options for fuzzy so that we won’t get too many irrelevant results with this query.

Search 3: Search with synonyms.

Let’s first try to search with equivalent synonyms.
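The original query is not reproduced here. A minimal sketch using the `text` operator with its `synonyms` option, assuming the synonym mapping in the index definition is named `laptop_synonyms` and the index is named `default` (both assumptions):

```javascript
// Build a text search on "name" that applies the synonym mapping.
function equivalentSynonymPipeline(term) {
  return [
    {
      $search: {
        index: "default", // assumed index name
        text: { query: term, path: "name", synonyms: "laptop_synonyms" },
      },
    },
    { $limit: 2 },
    { $project: { name: 1, score: { $meta: "searchScore" } } },
  ];
}

const macPipeline = equivalentSynonymPipeline("Mac");

// Only runs inside mongosh, where the global `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.getSiblingDB("products").laptops.aggregate(macPipeline).toArray());
}
```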

[
  { _id: 134, name: 'Apple MacBook Pro', score: 1.738951921463012 },
  { _id: 184, name: 'Apple MacBook Pro', score: 1.738951921463012 }
]

You can try to search with “MacBook”, “Macintosh” or “Mac” and will always get the same results because they are equivalent synonyms and are interchangeable.

Now let’s try to search with explicit synonyms:
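The original queries are not reproduced here. As a rough sketch, the same `text` search is built once for “Lenovo” and once for “ThinkPad”, again assuming a synonym mapping named `laptop_synonyms` and an index named `default`. The original queries may have included further terms (the results listed are all T480 models), which this sketch does not attempt to reproduce.

```javascript
// Run the same synonym-aware search for both directions of the mapping.
function explicitSynonymPipeline(term) {
  return [
    {
      $search: {
        index: "default", // assumed index name
        text: { query: term, path: "name", synonyms: "laptop_synonyms" },
      },
    },
    { $project: { name: 1 } },
  ];
}

// "Lenovo" is the input of the explicit mapping; "ThinkPad" is not.
const explicitPipelines = ["Lenovo", "ThinkPad"].map(explicitSynonymPipeline);

// Only runs inside mongosh, where the global `db` is defined.
if (typeof db !== "undefined") {
  for (const pipeline of explicitPipelines) {
    printjson(db.getSiblingDB("products").laptops.aggregate(pipeline).toArray());
  }
}
```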

[
  { _id: 97, name: 'ThinkPad T480' },
  { _id: 40, name: 'Lenovo T480' },
  { _id: 117, name: 'Lenovo ThinkPad T480' }
]
[
  { _id: 97, name: 'ThinkPad T480' },
  { _id: 117, name: 'Lenovo ThinkPad T480' }
]

These two examples demonstrate that “ThinkPad” can be found by searching for “Lenovo”, but not vice versa.

Search 4: Search with a phrase.

Sometimes we may want to search for an ordered sequence of terms that must appear exactly as specified in the input query. This can be achieved with the phrase operator. Let’s search for “Lenovo T480” with the text and phrase operators separately and you will see the difference immediately:
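The original queries are not reproduced here. A minimal sketch that builds the same search with either operator, assuming the index is named `default`; both `text` and `phrase` accept `query` and `path`.

```javascript
// Build a search on "name" with the given operator ("text" or "phrase").
function nameSearchPipeline(operator, query) {
  return [
    {
      $search: {
        index: "default", // assumed index name
        [operator]: { query, path: "name" },
      },
    },
    { $project: { name: 1 } },
  ];
}

const textPipeline = nameSearchPipeline("text", "Lenovo T480");     // any term may match
const phrasePipeline = nameSearchPipeline("phrase", "Lenovo T480"); // terms in sequence

// Only runs inside mongosh, where the global `db` is defined.
if (typeof db !== "undefined") {
  const laptops = db.getSiblingDB("products").laptops;
  printjson(laptops.aggregate(textPipeline).toArray());
  printjson(laptops.aggregate(phrasePipeline).toArray());
}
```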

[
  { _id: 40, name: 'Lenovo T480' },
  { _id: 117, name: 'Lenovo ThinkPad T480' },
  { _id: 97, name: 'ThinkPad T480' },
  ...
]
[ { _id: 40, name: 'Lenovo T480' } ]

Only one result is returned with the phrase operator because it requires the tokens in the search string to appear in the same order in the resulting documents, with no other tokens in between by default. That is why “Lenovo ThinkPad T480” is not matched.

Search 5: Combine multiple operators together.

Finally, let’s learn to use the compound operator to combine multiple operators together. If you have some background with Elasticsearch, you will see that the syntax is quite similar. We also use the must, mustNot, should, and filter clauses in MongoDB Atlas Search.

Let’s find all the HP laptops that are still in stock:
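The original query is not reproduced here. A minimal sketch with the `compound` operator, assuming the brand can be matched through the product name and that `quantity` is indexed as a number in an index named `default` (all assumptions):

```javascript
// HP laptops with a quantity greater than zero.
const hpInStockPipeline = [
  {
    $search: {
      index: "default", // assumed index name
      compound: {
        // Scored match on the product name.
        must: [{ text: { query: "HP", path: "name" } }],
        // Unscored condition: still in stock.
        filter: [{ range: { path: "quantity", gt: 0 } }],
      },
    },
  },
  { $limit: 2 },
  { $project: { name: 1, quantity: 1 } },
];

// Only runs inside mongosh, where the global `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.getSiblingDB("products").laptops.aggregate(hpInStockPipeline).toArray());
}
```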

[
  { _id: 200, name: 'HP ZBook 14u G6', quantity: 8 },
  { _id: 196, name: 'HP ZBook 14u G6', quantity: 4 }
]

Oddly, at the time of writing, the equals operator only works with boolean and objectId values. Therefore, we need to use the range operator to check whether the quantity is greater than 0.

Let’s write a more complex query to find laptops meeting the following conditions:

  • brand is not Apple.
  • still in stock.
  • memory is 32GB or storage capacity is 1TB.
Whoa! It can get really complex for such a simple search problem. Similar to Elasticsearch, we can easily run into such issues (“bool/compound hell”) if some field contains an array of nested documents. However, it is actually not that complex once you know the pattern. It is just somewhat cumbersome to write.

Key points for this compound search query:

  • The filter clause has the same effect as must. However, it is not used to calculate the final search score. If you want to boost the score for some field, you need to use the must clause and not filter. It is also worth mentioning that mustNot does not contribute to the search score either. It works like the negation of a filter clause.
  • The should clause, as the name indicates, specifies conditions that should be met and are thus optional. However, we can use the minimumShouldMatch option to specify how many optional conditions must be met to return a result.
  • We can use the compound operator inside a compound operator. This is where things start to get complex. However, we should just keep in mind that the nested compound operator works exactly the same as the top-level one. It is normally used for fields that contain an array of nested documents, like the attributes field in this example.

In this article, we have demonstrated how to create a MongoDB Atlas Search Index with static mapping. Some special settings are introduced, such as autocomplete for search-as-you-type queries and search with synonyms. We have also introduced some common search queries, from basic to advanced, which can be adapted and used directly in your practical work.

For searching nested fields, especially those whose value is an array of documents, don’t get intimidated by the seemingly complex queries. As long as you know how the compound operator works, you can build powerful queries by yourself, using the must, mustNot, should, and filter clauses as the building blocks.
