Lab 1 - Cognitive Search
Last updated
Before we can start, we need to create a couple of resources in the Azure Portal.
1. Sign in to the Azure portal.
2. Click the plus sign ("+ Create Resource") in the top-left corner.
3. Use the search bar to find "Azure Cognitive Search" or navigate to the resource through Web > Azure Cognitive Search.
4. Choose a subscription.
5. Create a new resource group or choose an existing one (location: westeurope).
6. Name the service.
7. Choose a location (westeurope).
8. Choose a Pricing Tier (Free).
9. Click 'Review and Create'.
1. Sign in to the Azure portal.
2. Click the plus sign ("+ Create Resource") in the top-left corner.
3. Use the search bar to find "Azure Cognitive Services" or navigate to the resource through Web > Cognitive Services.
4. Choose a subscription.
5. Create a new resource group or choose an existing one (location: westeurope).
6. Name the service.
7. Choose a location (westeurope).
8. Choose a Pricing Tier (Standard S0).
9. Click 'Review and Create'.
1. Sign in to the Azure portal.
2. Click the plus sign ("+ Create Resource") in the top-left corner.
3. Use the search bar to find "Storage account" or navigate to the resource through Storage > Storage account.
4. Choose a subscription.
5. Create a new resource group or choose an existing one (location: westeurope).
6. Name the storage account.
7. Choose a location (westeurope).
8. Click 'Review and Create'.
1. In the Azure Portal, go to the storage account that you just created.
2. Click on Containers.
3. Click on [+ Container].
4. Give the container a name (e.g. 'data').
5. Click Create.
6. Click on the created container.
7. Click the [Upload] button.
8. Choose some local PDF files (or download a small set from here).
9. Click [Upload].
In the portal, go to the 'Search Service' that you created earlier.
Click [+ Import Data]
You will be guided through some steps in a wizard.
- Data Source: Azure Blob Storage
- Data source name: 'blob'
- Connection String: Choose an existing connection, choose the storage account and container you just made and click [Select]
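The wizard settings above correspond roughly to a data source definition in the Azure Cognitive Search REST API. The following is a minimal sketch that only builds that JSON body and does not send anything. The helper name `build_data_source`, the container name 'data' and the connection string placeholder are assumptions for illustration.

```python
import json

# Placeholder (assumption): copy the real value from your storage account's access keys.
CONNECTION_STRING = "<storage-connection-string>"


def build_data_source(name: str, container: str, connection_string: str) -> dict:
    """Build a data source definition that points at one blob container."""
    return {
        "name": name,
        "type": "azureblob",  # data source type for Azure Blob Storage
        "credentials": {"connectionString": connection_string},
        "container": {"name": container},
    }


# Same values as in the wizard step: data source 'blob', container 'data'.
data_source = build_data_source("blob", "data", CONNECTION_STRING)
print(json.dumps(data_source, indent=2))
```

The wizard creates an equivalent definition for you, so this is only useful if you later want to script the setup.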
Click [Next]
A skillset is a list of skills that will be executed to enrich the data that is already found in the documents.
Expand 'Attach Cognitive Services' and select the cognitive service you created.
Expand 'Add enrichments'
- Check 'Enable OCR'
- Check 'Text Cognitive Skills'
- Check 'Image Cognitive Skills'
In this step, you add prebuilt AI skills to your indexing pipeline.
You can change the target language to another if you want.
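Behind the checkboxes, the wizard generates a skillset definition. The sketch below builds a heavily trimmed version with only three skills (OCR, merge and translation) to show the general shape. The helper name `build_skillset`, the target names `merged_content` and `translated_text`, and the key placeholder are assumptions. The skillset generated by the wizard contains more skills and may use different paths.

```python
import json


def build_skillset(name: str, cognitive_services_key: str, target_language: str = "en") -> dict:
    """Build a trimmed skillset: OCR on images, merge with text, then translate."""
    skills = [
        {   # Extract text from the images found in each document.
            "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
            "context": "/document/normalized_images/*",
            "inputs": [{"name": "image", "source": "/document/normalized_images/*"}],
            "outputs": [{"name": "text", "targetName": "text"}],
        },
        {   # Insert the OCR text back into the document content.
            "@odata.type": "#Microsoft.Skills.Text.MergeSkill",
            "context": "/document",
            "inputs": [
                {"name": "text", "source": "/document/content"},
                {"name": "itemsToInsert", "source": "/document/normalized_images/*/text"},
                {"name": "offsets", "source": "/document/normalized_images/*/contentOffset"},
            ],
            "outputs": [{"name": "mergedText", "targetName": "merged_content"}],
        },
        {   # Translate the merged content to the chosen target language.
            "@odata.type": "#Microsoft.Skills.Text.TranslationSkill",
            "context": "/document/merged_content",
            "defaultToLanguageCode": target_language,
            "inputs": [{"name": "text", "source": "/document/merged_content"}],
            "outputs": [{"name": "translatedText", "targetName": "translated_text"}],
        },
    ]
    return {
        "name": name,
        "skills": skills,
        # Attaches the Cognitive Services resource that is billed for the enrichments.
        "cognitiveServices": {
            "@odata.type": "#Microsoft.Azure.Search.CognitiveServicesByKey",
            "key": cognitive_services_key,
        },
    }


skillset = build_skillset("lab-skillset", "<cognitive-services-key>", target_language="nl")
print(json.dumps(skillset, indent=2))
```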
Click [Next]
In this view, you define which data you want to store in your index and how. An index consists of JSON documents that all have the same structure.
Notice that every field has the following options:
Retrievable
Search API will be able to retrieve this field
Filterable
Search API will be able to filter on this field
Sortable
Search API will be able to sort on this field
Facetable
Search API can generate facets on this field
Searchable
Search API can search through this field
Analyzer
Which analyzer is used to process the text of your field for searching. If you want to understand the differences between the analyzers, try out the following demo
Suggester
Enable this if you want to give your search input box an autosuggest functionality. This will not improve your search results, but only the usability of the frontend that you build.
Leave all the 'Retrievable' checkboxes as they are. Enable 'Filterable' and 'Facetable' on people, organizations, locations, keyphrases and language.
Enable 'Searchable' only on 'translated text' and 'merged content'. The analyzer for 'translated text' needs to be 'Microsoft - {language you chose}', and for 'merged content' it can be 'Microsoft - English'.
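Each checkbox maps to a boolean attribute on a field in the index definition. The sketch below expresses the settings described above as field definitions. The field names (`translated_text`, `merged_content`, `metadata_storage_path` as key) and the helper names are assumptions based on what the wizard typically generates, so check them against your own index. The analyzer name for the translated text depends on the language you chose and is passed in explicitly.

```python
def make_field(name: str, type_: str = "Edm.String", **overrides) -> dict:
    """Build one field definition; everything except 'retrievable' is off by default."""
    field = {
        "name": name,
        "type": type_,
        "retrievable": True,
        "searchable": False,
        "filterable": False,
        "sortable": False,
        "facetable": False,
    }
    field.update(overrides)
    return field


def build_fields(translated_analyzer: str = "en.microsoft") -> list:
    """Build the field list for this lab's index."""
    # Assumed key field: the wizard usually proposes the blob path as document key.
    fields = [make_field("metadata_storage_path", key=True)]
    # Entity and key phrase fields hold several values per document.
    for name in ("people", "organizations", "locations", "keyphrases"):
        fields.append(make_field(name, "Collection(Edm.String)", filterable=True, facetable=True))
    fields.append(make_field("language", filterable=True, facetable=True))
    # Only these two fields take part in full-text search.
    fields.append(make_field("translated_text", searchable=True, analyzer=translated_analyzer))
    fields.append(make_field("merged_content", searchable=True, analyzer="en.microsoft"))
    return fields


# Example: Dutch was chosen as target language in the skillset step.
fields = build_fields(translated_analyzer="nl.microsoft")
by_name = {f["name"]: f for f in fields}
print(by_name["people"])
```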
Click [Next]
In an indexer, you fine-tune how your source is read, when your data needs to be analysed and how errors are handled.
In the advanced options, you can add an extension filter on '.pdf'.
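For reference, the indexer settings can also be written as a REST API indexer definition. The sketch below builds such a body with the '.pdf' extension filter and a strict error policy. The resource names passed in are hypothetical, and a complete indexer for this lab would also need output field mappings to route the enriched fields into the index, which are left out here.

```python
import json


def build_indexer(name: str, data_source: str, index: str, skillset: str) -> dict:
    """Build an indexer definition that only picks up PDF files."""
    return {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": index,
        "skillsetName": skillset,
        "parameters": {
            # Error handling: stop as soon as a single document fails.
            "maxFailedItems": 0,
            "maxFailedItemsPerBatch": 0,
            "configuration": {
                # The extension filter from the advanced options.
                "indexedFileNameExtensions": ".pdf",
                "dataToExtract": "contentAndMetadata",
                # Needed so that OCR receives the images embedded in the PDFs.
                "imageAction": "generateNormalizedImages",
            },
        },
    }


# Hypothetical names; use the ones you entered in the wizard.
indexer = build_indexer("lab-indexer", "blob", "lab-index", "lab-skillset")
print(json.dumps(indexer, indent=2))
```

No schedule is set, which matches a one-off run that starts as soon as the indexer is created.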
Click [Submit] to start the process.
The process of analysing the PDF files has now started. You can follow the progress by clicking on the 'Indexers' tab.
Click 'Refresh' to update the status.
When the indexing has succeeded, you can start exploring the data by clicking on 'Search Explorer'.
Use the 'query string' input field to explore your data.
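The same status information is available from the indexer status endpoint of the REST API. The sketch below only shows how such a response could be summarised. The helper name `summarize_status` is an assumption, and the sample payload is made up for illustration, not real output from this lab.

```python
def summarize_status(payload: dict) -> str:
    """Condense an indexer status response into a single line."""
    # 'lastResult' can be missing or null when the indexer has never run.
    last = payload.get("lastResult") or {}
    return (
        f"indexer={payload.get('status', 'unknown')} "
        f"last_run={last.get('status', 'unknown')} "
        f"processed={last.get('itemsProcessed', 0)} "
        f"failed={last.get('itemsFailed', 0)}"
    )


# Made-up sample payload with the fields the summary reads.
sample = {
    "status": "running",
    "lastResult": {"status": "success", "itemsProcessed": 12, "itemsFailed": 0},
}
summary = summarize_status(sample)
print(summary)
```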
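A query string is a set of parameters joined with '&'. As a small sketch of the syntax, the snippet below composes a query that counts the matches, filters on the language field and asks for facets on organizations. The field names are the ones assumed from the index step and the value 'en' is only an example.

```python
# Parameters for the query string; '*' matches every document.
params = {
    "search": "*",
    "$count": "true",                 # include the total number of matches
    "$filter": "language eq 'en'",    # OData filter on a filterable field
    "facet": "organizations",         # facet counts for a facetable field
}

# Join into the form you can paste into the 'query string' input field.
query_string = "&".join(f"{key}={value}" for key, value in params.items())
print(query_string)
```

Note that collection fields such as people cannot be filtered with a plain `eq`; they need an OData `any()` lambda expression.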
Answer the following questions:
How many documents are in the index?
Return documents that talk about the person 'David Burt' (filter)
Return the facets for location linked to the above results
Some help can be found here and here
If you want to go the extra mile:
Try to build a small web application with an input field that can be used to search through your data, and show the results with highlights in a clean way.
Build a custom Web API skill to manipulate some data (help)
Help can be found here
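As a starting point for the custom Web API skill, the sketch below shows the request and response shape that such a skill exchanges with the indexer: a list of records under 'values', each with a 'recordId' and a 'data' object. The function name `run_skill`, the input name 'text' and the upper-casing logic are assumptions for illustration. A real skill would wrap this function in an HTTP endpoint such as an Azure Function.

```python
def run_skill(request_body: dict) -> dict:
    """Process one skill request: upper-case the 'text' input of every record."""
    results = []
    for record in request_body.get("values", []):
        text = record.get("data", {}).get("text")
        if isinstance(text, str):
            result = {
                "recordId": record["recordId"],  # must echo the incoming id
                "data": {"text": text.upper()},
                "errors": [],
                "warnings": [],
            }
        else:
            # Report a per-record error instead of failing the whole batch.
            result = {
                "recordId": record["recordId"],
                "data": {},
                "errors": [{"message": "Missing 'text' input."}],
                "warnings": [],
            }
        results.append(result)
    return {"values": results}


# Example request with one valid and one invalid record.
request = {
    "values": [
        {"recordId": "1", "data": {"text": "hello"}},
        {"recordId": "2", "data": {}},
    ]
}
response = run_skill(request)
print(response)
```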