Monday, 16 October 2023

Elasticsearch keyword vs text

The main difference between the keyword and text data types in Elasticsearch is that text fields are analyzed, while keyword fields are not. Elasticsearch indexes the value of a keyword field exactly as it is, as a single term, without analysis such as tokenization or lowercasing. For a text field, it runs the value through an analyzer before indexing it. The default standard analyzer splits the text into tokens and lowercases them, and other analyzers can add steps such as stemming or stop word removal.
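To make the difference concrete, here is a minimal Python sketch of what ends up in the index for each field type. It is only a rough stand-in for analysis: the functions analyze_text and analyze_keyword are hypothetical names made up for this illustration, and the regular expression is a simplification of what the standard analyzer does, not Elasticsearch's actual implementation.

```python
import re


def analyze_text(value: str) -> list[str]:
    """Simplified text analysis: split on non-alphanumeric characters, then lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", value)]


def analyze_keyword(value: str) -> list[str]:
    """A keyword field keeps the whole value as one unmodified term."""
    return [value]


description = "Red Running-Shoes, size 42"

text_terms = analyze_text(description)
keyword_terms = analyze_keyword(description)

print(text_terms)     # ['red', 'running', 'shoes', 'size', '42']
print(keyword_terms)  # ['Red Running-Shoes, size 42']
```

The text version can be found by any of its individual words, while the keyword version can only be found by the complete, identically written value.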

Keyword fields are typically used for values that need to be matched exactly, such as product IDs, email addresses, or URLs. They also suit structured values such as hostnames, status codes, ZIP codes, or tags, and they are the usual choice for sorting and aggregations. Text fields are typically used for natural language text, such as product descriptions, blog posts, or tweets.

Here is a table that summarizes the key differences between keyword and text fields:

| Feature | Keyword | Text |
| --- | --- | --- |
| Analyzed? | No | Yes |
| Used for | Exact matches, structured data | Natural language text |
| Examples | Product ID, email address, URL, hostname, status code, ZIP code, tag | Product description, blog post, tweet |

Here are some examples of how to use keyword and text fields:

Keyword field (for example, as the body of a PUT <index>/_mapping request):

JSON
{
  "properties": {
    "product_id": {
      "type": "keyword"
    }
  }
}

Text field:

JSON
{
  "properties": {
    "product_description": {
      "type": "text"
    }
  }
}

To find documents whose keyword field holds an exact value, you can use the term query, which does not analyze the search input. For example, the following search request body will find all products that have the product ID 1234567890:

JSON
{
  "query": {
    "term": {
      "product_id": "1234567890"
    }
  }
}

To search for documents that contain a specific word or phrase in a text field, you can use the match query, which analyzes the search input with the field's analyzer before looking it up. For example, the following search request body will find all products that contain the word shoes in their description:

JSON
{
  "query": {
    "match": {
      "product_description": "shoes"
    }
  }
}
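Why the query type matters is easier to see with a toy model. The Python sketch below builds a tiny inverted index over two made-up product descriptions and compares a term-style lookup with a match-style lookup. Everything in it (the analyze function, the sample documents, term_query and match_query) is hypothetical and greatly simplified. It only mimics the behavior described above and is not how Elasticsearch is implemented.

```python
import re
from collections import defaultdict


def analyze(value: str) -> list[str]:
    """Simplified analyzer: split on non-alphanumeric characters, then lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", value)]


# Made-up sample documents: id -> product_description (a text field).
docs = {1: "Trail running Shoes", 2: "Leather wallet"}

# Inverted index: analyzed term -> ids of the documents containing it.
index = defaultdict(set)
for doc_id, description in docs.items():
    for term in analyze(description):
        index[term].add(doc_id)


def term_query(value: str) -> set[int]:
    """Term-style lookup: the input is used as-is, with no analysis."""
    return set(index.get(value, set()))


def match_query(value: str) -> set[int]:
    """Match-style lookup: analyze the input, then return docs matching any term."""
    hits = set()
    for term in analyze(value):
        hits |= index.get(term, set())
    return hits


term_hits = term_query("Shoes")    # "Shoes" is not in the index, only "shoes" is
match_hits = match_query("Shoes")  # input is lowercased to "shoes" before lookup

print(term_hits)   # set()
print(match_hits)  # {1}
```

This is why a term query against a text field often returns nothing for input that looks correct: the indexed terms were lowercased and split, but the search input was not.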

You can also combine keyword and text fields in a single search. For example, the following bool query will find all products that have the product ID 1234567890 and that also contain the word shoes in their description:

JSON
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "product_id": "1234567890"
          }
        },
        {
          "match": {
            "product_description": "shoes"
          }
        }
      ]
    }
  }
}
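If you build searches from application code, the combined query can be assembled as a plain dictionary and serialized to JSON. This is a small sketch rather than a client integration: build_product_query is a hypothetical helper name, and the field names are the ones used in the examples above.

```python
import json


def build_product_query(product_id: str, word: str) -> dict:
    """Build a search request body with one exact clause and one full-text clause."""
    return {
        "query": {
            "bool": {
                "must": [
                    # Exact match on the keyword field.
                    {"term": {"product_id": product_id}},
                    # Analyzed full-text match on the text field.
                    {"match": {"product_description": word}},
                ]
            }
        }
    }


body = build_product_query("1234567890", "shoes")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a request to an index's _search endpoint with whichever HTTP client or Elasticsearch client you already use.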

Which data type to choose for a particular field depends on how you need to search it. If you need exact matches, use a keyword field. If you need full-text search over natural language, use a text field.
