Python MongoDB - distinct()
Last Updated : 28 Jul, 2020
MongoDB is a cross-platform, document-oriented NoSQL (non-relational) database built around collections and documents. It stores data as documents made up of key-value pairs. Refer to MongoDB and Python for an in-depth introduction to the topic. Now let's look at how the distinct() method is used in PyMongo.
distinct()
PyMongo includes the distinct() method, which finds the distinct values for a specified field across a single collection and returns them as a Python list.
Syntax : distinct(key, filter=None, session=None, **kwargs)
Parameters :
- key : field name for which the distinct values need to be found.
- filter : (Optional) A query document that specifies the documents from which to retrieve the distinct values.
- session : (Optional) a ClientSession.
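To make the roles of key and filter concrete, here is a minimal pure-Python sketch that mimics them on a list of dictionaries. The helper distinct_values and the qty field are hypothetical and are not part of PyMongo; the sketch assumes a top-level scalar key and an equality-only filter, whereas the real method runs on the server and accepts any query document.

```python
def distinct_values(docs, key, filter=None):
    """Hypothetical stand-in for collection.distinct() on plain dicts.

    Assumes a top-level, scalar `key` and an equality-only `filter`.
    """
    filter = filter or {}
    seen = []
    for doc in docs:
        # skip documents that fail any equality condition in the filter
        if any(doc.get(field) != value for field, value in filter.items()):
            continue
        # record each value once, in first-seen order
        if key in doc and doc[key] not in seen:
            seen.append(doc[key])
    return seen


docs = [
    {"_id": 1, "dept": "A", "qty": 5},
    {"_id": 2, "dept": "A", "qty": 7},
    {"_id": 3, "dept": "B", "qty": 5},
]

# all departments, then quantities restricted to dept "A"
print(distinct_values(docs, "dept"))
print(distinct_values(docs, "qty", {"dept": "A"}))
```

With no filter, every document contributes; with a filter, only the matching documents do, which is the same division of labour the key and filter parameters have in distinct().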
Let's create a sample collection :
# importing the module
from pymongo import MongoClient

# creating a MongoClient object connected to the
# given host and port number
client = MongoClient("mongodb://localhost:27017/")

# accessing the database
mydatabase = client['database']

# accessing a collection of the database
mycollection = mydatabase['myTable']

documents = [{"_id": 1, "dept": "A",
              "item": {"code": "012", "color": "red"},
              "sizes": ["S", "L"]},
             {"_id": 2, "dept": "A",
              "item": {"code": "012", "color": "blue"},
              "sizes": ["M", "S"]},
             {"_id": 3, "dept": "B",
              "item": {"code": "101", "color": "blue"},
              "sizes": "L"},
             {"_id": 4, "dept": "A",
              "item": {"code": "679", "color": "black"},
              "sizes": ["M"]}]

# inserting the sample documents
mycollection.insert_many(documents)

# printing every document in the collection
for doc in mycollection.find({}):
    print(doc)
Output :
{'_id': 1, 'dept': 'A', 'item': {'code': '012', 'color': 'red'}, 'sizes': ['S', 'L']}
{'_id': 2, 'dept': 'A', 'item': {'code': '012', 'color': 'blue'}, 'sizes': ['M', 'S']}
{'_id': 3, 'dept': 'B', 'item': {'code': '101', 'color': 'blue'}, 'sizes': 'L'}
{'_id': 4, 'dept': 'A', 'item': {'code': '679', 'color': 'black'}, 'sizes': ['M']}
Now we will use the distinct() method to :
- Return distinct values for a field
- Return distinct values for an embedded field
- Return distinct values for an array field
- Return distinct values from documents that match a query
# distinct() function returns the distinct values for the
# field dept from all documents in the mycollection collection
print(mycollection.distinct('dept'))
# distinct values for the field color,
# embedded in the field item, from all documents
# in the mycollection collection
print(mycollection.distinct('item.color'))
# returns the distinct values for the field sizes
# from all documents in the mycollection collection
print(mycollection.distinct("sizes"))
# distinct values for the field code,
# embedded in the field item, from the documents
# in mycollection collection whose dept is equal to B.
print(mycollection.distinct("item.code", {"dept" : "B"}))
Output :
['A', 'B']
['red', 'blue', 'black']
['L', 'S', 'M']
['101']
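The embedded-field and array-field results above follow two rules: a dotted key such as item.color is resolved by walking into the nested document, and an array value contributes each of its elements separately. The pure-Python sketch below illustrates both rules on the same sample documents. The helper distinct_values is hypothetical and is not how PyMongo or the server implements distinct(); it skips documents where the path is missing and returns values in first-seen order, which may differ from the order the server returns.

```python
_MISSING = object()  # sentinel for "path not present in this document"


def distinct_values(docs, key):
    """Hypothetical emulation of distinct() semantics on plain dicts."""
    seen = []
    for doc in docs:
        value = doc
        # walk the dotted path, e.g. "item.color" -> doc["item"]["color"]
        for part in key.split("."):
            if not isinstance(value, dict) or part not in value:
                value = _MISSING
                break
            value = value[part]
        if value is _MISSING:
            continue
        # an array field contributes each element as a separate value
        candidates = value if isinstance(value, list) else [value]
        for candidate in candidates:
            if candidate not in seen:
                seen.append(candidate)
    return seen


documents = [
    {"_id": 1, "dept": "A", "item": {"code": "012", "color": "red"}, "sizes": ["S", "L"]},
    {"_id": 2, "dept": "A", "item": {"code": "012", "color": "blue"}, "sizes": ["M", "S"]},
    {"_id": 3, "dept": "B", "item": {"code": "101", "color": "blue"}, "sizes": "L"},
    {"_id": 4, "dept": "A", "item": {"code": "679", "color": "black"}, "sizes": ["M"]},
]

print(distinct_values(documents, "item.color"))
print(distinct_values(documents, "sizes"))
```

Note that document 3 stores sizes as the plain string "L" rather than a list; it is treated as a single value, so it merges with the "L" element found in document 1.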