Mapping Types and Field Data Types in Elasticsearch
Mapping types and field data types are fundamental concepts in Elasticsearch that define how data is indexed, stored and queried within an index. Understanding these concepts is crucial for effectively modeling our data and optimizing search performance.
In this article, we will learn about mapping types and field data types, explain why they matter in Elasticsearch, and walk through examples of each.
Introduction to Mapping Types
- Mapping types were a way to organize data within an index in earlier versions of Elasticsearch (before 6.x).
- With mapping types, we could define multiple types within a single index, each representing a different entity or document structure.
- Mapping types allowed us to define the schema for documents within an index. For example, we could have a "book" type and a "movie" type within the same index.
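- Starting with Elasticsearch 6.0, a newly created index can contain only one mapping type. Mapping types were deprecated in 7.0 and removed entirely in 8.0.
- Instead of mapping types, current versions of Elasticsearch use typeless indices: each index holds one document structure, and documents are addressed through the _doc endpoint.
- The shift away from mapping types reduces mapping conflicts, because fields with the same name in different types of one index shared a single underlying Lucene field and therefore had to be mapped identically. It also avoids the sparse data that resulted from storing unrelated entities together.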
- In newer versions of Elasticsearch, it is recommended to use different indices to represent different entities or document structures.
- Each index has its own mapping, which defines the schema for documents within that index.
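As a rough sketch of the index-per-entity approach described above, the Python snippet below builds the request bodies for two separate indices. The index names books and movies and all field names are hypothetical, chosen only to mirror the "book" and "movie" example; nothing is sent to a cluster.

```python
import json

# Hypothetical mappings: one index per entity instead of two types in one index.
index_mappings = {
    "books": {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "author": {"type": "keyword"},
                "pages": {"type": "integer"},
            }
        }
    },
    "movies": {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "director": {"type": "keyword"},
                "runtime_minutes": {"type": "integer"},
            }
        }
    },
}


def create_index_requests(mappings_by_index):
    """Return (request line, JSON body) pairs, one per index."""
    return [
        (f"PUT /{name}", json.dumps(body, indent=2))
        for name, body in mappings_by_index.items()
    ]


for request_line, body in create_index_requests(index_mappings):
    print(request_line)
    print(body)
```

Because each index carries its own mapping, the title field in books and the title field in movies are independent of each other and could be mapped differently without conflict.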
Field Data Types in Elasticsearch
Field data types define the type of data that can be stored in a field within a document. Elasticsearch provides a wide range of data types to accommodate various types of data, including text, numbers, dates, and more. Let's explore some common field data types in Elasticsearch:
1. Text
- Used for full-text search. Text fields are analyzed, which means their content is broken down into individual terms that are then indexed. This allows for efficient full-text search but can result in larger index sizes.
2. Keyword
- Used for exact matching. Keyword fields are not analyzed and are indexed as-is. They are suitable for fields like IDs, email addresses, and enum values, and are the usual choice for sorting and aggregations.
3. Integer
- Used for whole-number values; integer is a signed 32-bit type. Elasticsearch supports various numeric types, including integer, long, float, and double. Numeric fields are indexed in a way that allows for efficient range queries and aggregations.
4. Long
- The long data type is similar to integer but stores signed 64-bit values, making it suitable for larger numbers.
5. Date
- Used for date and time values. Elasticsearch can parse various date formats and supports date arithmetic and formatting in queries.
6. Boolean
- The boolean data type is used for fields that contain true or false values.
7. Float and Double
- The float and double data types are used for fields that contain floating-point numbers: float is single-precision (32-bit) and double is double-precision (64-bit).
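To make the text versus keyword distinction concrete, here is a simplified Python sketch of what happens at index time. The simple_analyze function is a hypothetical stand-in for an analyzer: it only lowercases the value and splits on non-alphanumeric ASCII characters, whereas the real default analyzer follows Unicode text segmentation rules, so treat this as an approximation.

```python
import re


def simple_analyze(value):
    """Rough stand-in for an analyzer: lowercase, then split into terms."""
    return [term for term in re.split(r"[^0-9a-z]+", value.lower()) if term]


def index_terms(value, field_type):
    """Return the terms that would be indexed for a field of the given type."""
    if field_type == "text":
        return simple_analyze(value)  # analyzed into individual terms
    if field_type == "keyword":
        return [value]  # kept as one exact, unmodified term
    raise ValueError(f"unsupported field type: {field_type}")


print(index_terms("Wireless Noise-Cancelling Headphones", "text"))
print(index_terms("Wireless Noise-Cancelling Headphones", "keyword"))
```

In this sketch, a search for headphones would find the text version through one of its terms, while the keyword version only matches the complete original string.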
Mapping Types and Field Definitions
Let's create a simple index with mappings for various field data types to understand how mappings and field data types work in Elasticsearch.
PUT /my_index
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      },
      "category": {
        "type": "keyword"
      },
      "quantity": {
        "type": "integer"
      },
      "price": {
        "type": "float"
      },
      "is_active": {
        "type": "boolean"
      },
      "created_at": {
        "type": "date"
      }
    }
  }
}
Explanation:
- We create an index named my_index with mappings for fields like title, category, quantity, price, is_active, and created_at.
- Each field is assigned a specific data type (text, keyword, integer, float, boolean, date).
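If you prefer to issue this request from code rather than a console, the following Python sketch prepares the same call with the standard library. The URL http://localhost:9200 is an assumption (an unsecured local cluster), and build_create_index_request is a hypothetical helper name; the request is only constructed here, not sent.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local cluster without authentication

MY_INDEX_BODY = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "category": {"type": "keyword"},
            "quantity": {"type": "integer"},
            "price": {"type": "float"},
            "is_active": {"type": "boolean"},
            "created_at": {"type": "date"},
        }
    }
}


def build_create_index_request(index_name, body, base_url=ES_URL):
    """Prepare (but do not send) a PUT /<index> request with a JSON body."""
    return urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_index_request("my_index", MY_INDEX_BODY)
print(request.get_method(), request.full_url)

# To send it against a running cluster, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```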
Indexing Documents with Field Data Types
Let's index a document into my_index to see how field data types are applied in Elasticsearch.
POST /my_index/_doc/1
{
  "title": "Product A",
  "category": "Electronics",
  "quantity": 100,
  "price": 49.99,
  "is_active": true,
  "created_at": "2022-05-01T12:00:00"
}
Explanation: In this example, we use a POST request to index a document with ID 1 into the my_index index. Each field in the document is handled according to the data type declared in the mapping: text, keyword, integer, float, boolean, or date.
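The created_at value deserves a closer look. Elasticsearch stores dates internally as milliseconds since the epoch in UTC, and a timestamp without a time zone offset is interpreted as UTC. The Python sketch below reproduces that conversion for the sample document; to_epoch_millis is a hypothetical helper written for illustration, not part of any Elasticsearch client.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(iso_timestamp):
    """Convert an ISO 8601 timestamp to milliseconds since the epoch (UTC)."""
    parsed = datetime.fromisoformat(iso_timestamp)
    if parsed.tzinfo is None:
        # No offset given: treat the value as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


print(to_epoch_millis("2022-05-01T12:00:00"))  # 1651406400000
```

The same instant written with an explicit offset, such as 2022-05-01T14:00:00+02:00, converts to the same number, which is why range queries and sorting on date fields behave consistently across time zones.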
Retrieving Mapping Information
Next, let's retrieve the mapping for the my_index index to see how its fields are mapped in Elasticsearch.
GET /my_index/_mapping
Sample Output:
{
  "my_index": {
    "mappings": {
      "properties": {
        "title": {
          "type": "text"
        },
        "category": {
          "type": "keyword"
        },
        "quantity": {
          "type": "integer"
        },
        "price": {
          "type": "float"
        },
        "is_active": {
          "type": "boolean"
        },
        "created_at": {
          "type": "date"
        }
      }
    }
  }
}
Explanation: The GET request to the _mapping endpoint retrieves the mapping information for the my_index index. This information includes the field data types for each field in the index, allowing us to understand how Elasticsearch interprets and indexes our data.
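When a mapping response is consumed from code, it is often handy to reduce it to a simple field-to-type table. The Python sketch below does that for the sample output above; field_types is a hypothetical helper, and it only looks at top-level fields.

```python
sample_response = {
    "my_index": {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "category": {"type": "keyword"},
                "quantity": {"type": "integer"},
                "price": {"type": "float"},
                "is_active": {"type": "boolean"},
                "created_at": {"type": "date"},
            }
        }
    }
}


def field_types(mapping_response, index_name):
    """Return {field name: type} for the top-level fields of one index."""
    properties = mapping_response[index_name]["mappings"].get("properties", {})
    # Object fields are returned without a "type" key, so default to "object".
    return {name: spec.get("type", "object") for name, spec in properties.items()}


for name, field_type in field_types(sample_response, "my_index").items():
    print(f"{name}: {field_type}")
```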
Querying Documents with Field Data Types
Now let's query documents based on a typed field, for example finding products with a price less than 50:
GET /my_index/_search
{
  "query": {
    "range": {
      "price": {
        "lt": 50
      }
    }
  }
}
Explanation: This GET request to the _search endpoint uses a range query to find documents in the my_index index where the price field is less than 50 (lt means less than). The query itself does not state the field's type; because price is mapped as float, Elasticsearch compares the values numerically. The document indexed earlier, with a price of 49.99, matches.
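Range queries accept the bounds gt, gte, lt, and lte, and they can be combined. As a small sketch, the hypothetical Python helper range_query below assembles such a request body and rejects any other bound name; it only builds the JSON and does not contact a cluster.

```python
import json

ALLOWED_BOUNDS = {"gt", "gte", "lt", "lte"}


def range_query(field, **bounds):
    """Build a search body with a range query on one field."""
    unknown = set(bounds) - ALLOWED_BOUNDS
    if unknown:
        raise ValueError(f"unsupported bounds: {sorted(unknown)}")
    if not bounds:
        raise ValueError("at least one bound is required")
    return {"query": {"range": {field: bounds}}}


# Same query as above: price less than 50.
print(json.dumps(range_query("price", lt=50), indent=2))

# Both bounds at once: price from 10 (inclusive) up to 50 (exclusive).
print(json.dumps(range_query("price", gte=10, lt=50), indent=2))
```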
Conclusion
Mapping types and field data types are foundational concepts in Elasticsearch that play a crucial role in organizing and querying data. By understanding how mapping types have given way to one mapping per index and by selecting appropriate field data types, you can optimize search performance and ensure an accurate representation of your data.
In this article, we explored the basics of mapping types, common field data types, and their significance in Elasticsearch. We also provided practical examples and outputs to illustrate how mappings and field data types are defined, indexed, and queried in Elasticsearch. With this knowledge, you'll be better equipped to design and manage Elasticsearch indices effectively for your applications.