Azure Cognitive Search — when ‘name LIKE…’ is not enough

iteo
Jul 9, 2020

Do you need to implement an advanced search service in your application? Would you rather not learn, deploy, and maintain yet another database server? Don’t feel like studying AI theory or another tangled library? Azure Cognitive Search might be the answer to your concerns!

Let’s take a closer look!

What is it? And why should I use it?

Azure Cognitive Search is a search-as-a-service cloud solution that lets developers run advanced searches not only over text and geodata but also over files such as Microsoft Office documents and even images. It uses artificial intelligence to extract text, and it all happens behind the scenes, without requiring any knowledge in that area!

But there are already other search solutions. Why should I use this one? First of all, it’s a cloud-based service: no installing and configuring a server, no worrying about disk space. Second, Azure Cognitive Search aims to be a universal solution. You define the criteria that data can be searched by. Today you need regular text search, but tomorrow you may need to search inside documents, or switch to a different database. Azure Cognitive Search is ready for that.

Lots of search operations can be implemented directly in the database, but they can turn out to be computationally intensive. Moving this responsibility to a dedicated service takes load off the database and makes it possible to scale search independently when needed.

It is worth mentioning that Azure provides an SLA at the 99.9% availability level, provided certain requirements are met (for example, running at least two replicas for query workloads).

What does it offer?

Azure Cognitive Search is a humongous tool, and a whole day wouldn’t be enough to describe all of its functionality, so let’s look at the most important parts.

Indexes

Indexes are the main asset of Cognitive Search. Every document used and processed by Azure Search ends up in an index, which is made up of records containing various attributes. The index defines how data can be searched.
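To make this more tangible, here is a minimal sketch of an index definition in the JSON shape that the Azure Cognitive Search REST API accepts, built as a plain Python dictionary. The index name `employees` and every field name are made up for illustration; only the attribute names (`key`, `searchable`, `filterable`, and so on) and the `Edm.*` types come from the service.

```python
import json

# Hypothetical index for illustration; the name and fields are assumptions.
employee_index = {
    "name": "employees",
    "fields": [
        # Exactly one Edm.String field acts as the document key.
        {"name": "id", "type": "Edm.String", "key": True, "searchable": False},
        {"name": "fullName", "type": "Edm.String", "searchable": True, "sortable": True},
        {"name": "email", "type": "Edm.String", "searchable": True},
        {"name": "jobTitle", "type": "Edm.String", "searchable": True,
         "filterable": True, "facetable": True},
        {"name": "skillLevel", "type": "Edm.Int32", "filterable": True, "sortable": True},
    ],
}


def key_fields(index):
    """Return the names of the fields marked as the document key."""
    return [field["name"] for field in index["fields"] if field.get("key")]


def searchable_fields(index):
    """Return the names of the fields that take part in full-text search."""
    return [field["name"] for field in index["fields"] if field.get("searchable")]


# This JSON document is what would be sent to the service when creating the index.
print(json.dumps(employee_index, indent=2))
```

The attributes on each field are what "defines how data can be searched" means in practice: a field can be full-text searchable, usable in filters, sortable, or any combination of these.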

Replicas

Each replica holds a copy of your data, which lets you load balance requests. All load balancing and replication are managed by Azure Search; all you have to do is adjust the number of replicas.

Increase the number of replicas when many concurrent users generate more requests than the service can handle and performance drops.

Partitions

Partitions provide storage and processing for your index operations (for example, index refreshing or rebuilding). Each partition holds a share of the data of all your indexes. For example, if you have 4 partitions on your Azure Search instance, your index data is split into four parts.

When a single query takes too long to complete even under low load, adding more replicas won’t solve the problem. Adding more partitions might. Splitting data into smaller chunks allows operations to run in parallel, increasing performance.
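Replicas and partitions also drive the bill: as far as we know, capacity is charged in search units, where the number of search units is replicas multiplied by partitions. The sketch below shows that arithmetic together with the replica counts the SLA asks for (two for queries, three for queries plus indexing, to our understanding of the documentation). The function names are our own.

```python
def search_units(replicas: int, partitions: int) -> int:
    """Billable search units: replicas multiplied by partitions."""
    if replicas < 1 or partitions < 1:
        raise ValueError("a service needs at least one replica and one partition")
    return replicas * partitions


def sla_coverage(replicas: int) -> str:
    """Describe which workloads the availability SLA covers.

    Assumption based on our reading of the Azure docs: two replicas
    cover queries, three or more also cover indexing.
    """
    if replicas >= 3:
        return "queries and indexing"
    if replicas == 2:
        return "queries"
    return "none"


# Scaling out for many users (replicas) versus slow single queries (partitions).
print(search_units(replicas=3, partitions=1))  # more copies of the same data
print(search_units(replicas=1, partitions=4))  # same data split into four parts
print(sla_coverage(replicas=2))
```

Because the two numbers multiply, scaling both at once gets expensive quickly, which is a good reason to first work out whether the bottleneck is request volume or single-query latency.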

Indexer

An indexer is a crawler that extracts searchable data from an external data source and populates your Azure Search index. All you need to do is connect your data source, such as a database or a storage account, and choose whether the index should be refreshed on demand or periodically.
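The on-demand versus periodic choice boils down to whether the indexer definition carries a schedule. Below is a small sketch that builds such a definition in the shape used by the REST API; the helper `indexer_definition` and all resource names are hypothetical, and the schedule interval is written as an ISO 8601 duration.

```python
def indexer_definition(name, data_source_name, target_index_name, interval_minutes=None):
    """Build an indexer definition; omit the schedule to run it on demand only."""
    definition = {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": target_index_name,
    }
    if interval_minutes is not None:
        # ISO 8601 duration, for example PT30M for "every 30 minutes".
        definition["schedule"] = {"interval": f"PT{interval_minutes}M"}
    return definition


# Hypothetical names: a data source "employees-sql" feeding the index "employees".
on_demand = indexer_definition("employees-indexer", "employees-sql", "employees")
periodic = indexer_definition("employees-indexer", "employees-sql", "employees",
                              interval_minutes=30)

print(on_demand)
print(periodic)
```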

Synonyms

Cognitive Search allows you to create a collection of synonyms. For example, when you have documents related to the word “Dog”, you can associate it with the words “Puppy” and “Canine”; when the user searches for any of those three, they will get the same results. Isn’t that great?
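Here is a rough sketch of what that looks like. The first part is a synonym map definition in the `solr` format the service expects, with a made-up name. The `expand` function is not part of Azure; it is a tiny local simulation of what query-time expansion does with an equivalency rule.

```python
# Synonym map definition as sent to the service; the name is hypothetical.
synonym_map = {
    "name": "animal-synonyms",
    "format": "solr",
    # One rule per line; a comma-separated list means "all of these are equivalent".
    "synonyms": "dog, puppy, canine",
}


def expand(term, rules):
    """Local simulation only: return every term equivalent to the given one."""
    term = term.strip().lower()
    for line in rules.splitlines():
        group = [word.strip().lower() for word in line.split(",") if word.strip()]
        if term in group:
            return set(group)
    return {term}


print(sorted(expand("Puppy", synonym_map["synonyms"])))
print(sorted(expand("cat", synonym_map["synonyms"])))
```

In the real service, the map is attached to individual fields in the index, so the expansion only applies where you want it.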

Language analyzers

Azure Cognitive Search provides us with 35 Lucene and 50 Microsoft text analyzers. You can set an individual analyzer for each field in your index. Language analyzers perform lexical analysis using specific linguistic rules of the target language.
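Setting an analyzer per field is a one-attribute change in the index definition. The sketch below uses invented field names; the analyzer names follow the service's naming pattern of a language code plus `.microsoft` or `.lucene`.

```python
# Hypothetical fields; each one picks its own analyzer.
fields = [
    {"name": "id", "type": "Edm.String", "key": True},
    {"name": "descriptionEn", "type": "Edm.String", "searchable": True,
     "analyzer": "en.microsoft"},
    {"name": "descriptionPl", "type": "Edm.String", "searchable": True,
     "analyzer": "pl.lucene"},
]


def analyzers_by_field(field_definitions):
    """Map each field name to its analyzer, skipping fields that use the default."""
    return {f["name"]: f["analyzer"] for f in field_definitions if "analyzer" in f}


print(analyzers_by_field(fields))
```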

Advanced features

Azure Cognitive Search isn’t just a simple search service; it has much more to offer. Text searching can be customized with many text analyzers. Search results can be sorted and paged with almost no configuration. But what makes it stand out from the competition is AI Enrichment, a set of extensions that use machine learning to extend search capabilities. Examples include built-in OCR that reads text from images, and neural-network-based image analysis that identifies visual features, for instance recognizing faces or describing what an image shows. AI can also be used for entity recognition (email, URL, date) and language detection. If that’s still not enough, we can define our own so-called custom skills, exposed through a REST API, and connect them to our Azure Search.
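A custom skill is essentially a web endpoint that receives a batch of records and returns one result per record. The sketch below shows only the request and response handling, assuming the `values` / `recordId` / `data` envelope of the custom Web API skill contract. The word-counting logic and the `text` and `wordCount` names are invented for the example, and hosting the function behind HTTP is left out.

```python
def handle_skill_request(payload):
    """Process a custom-skill request body and return the response body.

    Each input record is answered with a record carrying the same recordId.
    """
    results = []
    for record in payload.get("values", []):
        text = record.get("data", {}).get("text", "")
        results.append({
            "recordId": record["recordId"],
            # Toy enrichment: count the words in the incoming text.
            "data": {"wordCount": len(text.split())},
            "errors": [],
            "warnings": [],
        })
    return {"values": results}


request_body = {
    "values": [
        {"recordId": "1", "data": {"text": "Azure Cognitive Search"}},
        {"recordId": "2", "data": {"text": ""}},
    ]
}
print(handle_skill_request(request_body))
```

Keeping the handler a pure function like this makes it easy to test locally before wiring it into whatever web framework or serverless host you prefer.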

How do we use it?

Don’t we all love actual, real-life examples? Luckily we happen to have a few. Let us share how we have used it in our projects.

Advanced user searching

We have data about employees stored in the database. They need to be searchable by email, first name, and last name, but also by job title and skill level. Under the hood, this already requires a few table joins. But efficiency is only one problem here.

We want the search to be case- and accent-insensitive (so “Ę” is treated as “E”, and “e” is equal to “E”). Since the database contains employees from all around the world, some names can be hard to spell for users from other countries, so we want to show results even if a small typo has been made in a query (“Brzęczyszczykiewicz” should still be found if a user entered “Brzeczyszczykeiwicz”).

Having those requirements in mind, try guessing what tool we chose to complete this task.

You got it! We chose Azure Cognitive Search, and it fulfilled all the requirements: taking load off the database with a custom index, ignoring accents in a query, and allowing small typos while remaining efficient.
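To show why the example name still matches, here is a local sketch of the two ingredients. `fold` roughly imitates what an accent-folding, lowercasing analyzer does, and `edit_distance` counts the edits (including swapped neighbouring letters) that fuzzy search tolerates. Both are our own stand-ins, not the service's implementation. `fuzzy_query` builds a search request body using the full Lucene syntax, where `~1` after a term allows one edit.

```python
import unicodedata


def fold(text):
    """Lowercase and strip combining accents (a rough stand-in for an analyzer)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def edit_distance(a, b):
    """Edit distance counting insert, delete, substitute, and adjacent swap."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            # Two neighbouring letters typed in the wrong order count as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def fuzzy_query(term, max_edits=1):
    """Build a search request body that tolerates up to max_edits typos."""
    return {"search": f"{term}~{max_edits}", "queryType": "full"}


stored = fold("Brzęczyszczykiewicz")
typed = fold("Brzeczyszczykeiwicz")
print(stored, typed, edit_distance(stored, typed))
print(fuzzy_query("Brzeczyszczykeiwicz"))
```

After folding, the stored name and the mistyped query differ only by the swapped "ie", which is a single edit, so a fuzzy query with `~1` is enough to find the employee.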

Searching documents

Imagine this: the project has already been deployed and running for a few months. The customer uploads lots of documents that are stored as images in our cloud storage. Then comes a new requirement.

“We want to search by text in those scans.”

“But sir… those are images…”

“Is this a problem?”

“As a matter of fact — no!”

We can use Azure Cognitive Search to easily set up OCR (Optical Character Recognition), and if the photos are already stored in Azure (for instance in Azure Blob Storage) we don’t even need any additional configuration; we just select the correct option in Search. No need to extract the text from all existing files manually! At this point, things can get even spicier. You can use AI to extract data from images, like face recognition, categorization, nudity detection, and so on, but that’s a topic for a whole different story.
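For the curious, selecting that option results in definitions roughly like the ones sketched below: a skillset containing the built-in OCR skill, and indexer settings that tell the service to turn each image into a normalized form the skill can read. This is our approximation of the REST payloads, not an export from our project; the skillset name and the target field `scanText` are hypothetical.

```python
# Skillset with the built-in OCR skill, applied to every normalized image.
ocr_skillset = {
    "name": "scan-ocr",  # hypothetical name
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
            "context": "/document/normalized_images/*",
            "inputs": [{"name": "image", "source": "/document/normalized_images/*"}],
            "outputs": [{"name": "text", "targetName": "text"}],
        }
    ],
}

# Indexer settings that make images available to the skill.
indexer_parameters = {
    "configuration": {
        "dataToExtract": "contentAndMetadata",
        "imageAction": "generateNormalizedImages",
    }
}

# Route the recognized text into a (hypothetical) searchable index field.
output_field_mappings = [
    {
        "sourceFieldName": "/document/normalized_images/*/text",
        "targetFieldName": "scanText",
    }
]


def skill_types(skillset):
    """List the skill types used by a skillset, without the '#' prefix."""
    return [skill["@odata.type"].lstrip("#") for skill in skillset["skills"]]


print(skill_types(ocr_skillset))
```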

Conclusion

Thanks for staying with us. If you need a reliable and robust search service that can handle large amounts of data, is highly configurable, and is easy to use, then look no further!

Azure Cognitive Search can be used both for simpler tasks, with basic and even free tiers on offer, and for very complicated scenarios where performance is critical even at a higher cost. Powered by Artificial Intelligence, it can achieve astonishing things!

Do you want to implement this amazing service in your project? Feel free to contact us; it will be our pleasure to cooperate with you. Together we can deliver everything!

iteo

iteo is an international digital product studio founded in Poland that helps businesses get more out of technology. Visit us on www.iteo.com