How to Count Occurrences of Specific Value in Pandas Column?

Last Updated : 19 Nov, 2024

Let's learn how to count occurrences of a specific value in a column of a Pandas DataFrame using the value_counts() method and conditional filtering.

Count Occurrences of Specific Values using value_counts()

To count occurrences of values in a Pandas DataFrame, use the value_counts() method. This function helps analyze the frequency of values within a specific column or the entire Pandas DataFrame.

Let's start by setting up a sample DataFrame to demonstrate these methods. The DataFrame contains four columns: 'name', 'subjects', 'marks' and 'age'.

Python

# Example: Count Values in a DataFrame Column
import pandas as pd

# Create a DataFrame with 8 rows and 4 columns.
# Note: 'ojsawi' and 'ojaswi' are spelled differently, so each is counted once.
data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby', 'rohith',
             'gnanesh', 'sravan', 'sravan', 'ojaswi'],
    'subjects': ['java', 'php', 'java', 'php', 'java',
                 'html/css', 'python', 'R'],
    'marks': [98, 90, 78, 91, 87, 78, 89, 90],
    'age': [11, 23, 23, 21, 21, 21, 23, 21]
})

count_sravan = data['name'].value_counts().get('sravan', 0)
print("Occurrences of 'sravan':", count_sravan)

count_ojaswi = data['name'].value_counts().get('ojaswi', 0)
print("Occurrences of 'ojaswi':", count_ojaswi)

Output:

Occurrences of 'sravan': 3
Occurrences of 'ojaswi': 1

value_counts() returns a Series of counts indexed by each distinct value in the column. Calling .get(value, 0) on that Series looks up the count for one value and returns 0 if the value does not occur.

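Syntax: data['column_name'].value_counts()[value]
where:
data: the input DataFrame.
column_name: the target column in the DataFrame.
value: the specific string or integer value to be counted within the column. If the value does not occur in the column, indexing with [value] raises a KeyError; use .get(value, 0), as in the example above, to return 0 instead.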
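The difference between indexing with [value] and calling .get(value, 0) can be sketched as follows. This is a minimal illustration that rebuilds only the 'name' column from the sample data; the name 'ramya' is a hypothetical value chosen because it does not appear in the column.

```python
import pandas as pd

# Same 'name' values as the sample DataFrame above
names = pd.Series(['sravan', 'ojsawi', 'bobby', 'rohith',
                   'gnanesh', 'sravan', 'sravan', 'ojaswi'])
counts = names.value_counts()

# Indexing works when the value is present
present = counts['sravan']

# .get() returns the default for a value that never occurs
# ('ramya' is a hypothetical name, not part of the data)
via_get = counts.get('ramya', 0)

# Plain indexing raises KeyError for the same missing value
try:
    counts['ramya']
    missing_raises = False
except KeyError:
    missing_raises = True

print(present, via_get, missing_raises)  # 3 0 True
```

Using .get() is usually the safer choice when the value comes from user input or another dataset and may not be present.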

We can also count the occurrences of a specific value in a Pandas column using the following methods.

Using Conditional Filtering with sum()

This method compares values in the column with the specified value and then sums up True values, representing the count.

Python

count_sravan = (data['name'] == 'sravan').sum()
print("Occurrences of 'sravan':", count_sravan)

count_gnanesh = (data['name'] == 'gnanesh').sum()
print("Occurrences of 'gnanesh':", count_gnanesh)

Output:

Occurrences of 'sravan': 3
Occurrences of 'gnanesh': 1
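To make the intermediate step visible, the sketch below prints the boolean mask before summing it, and then combines two conditions with &. It rebuilds only the 'name' and 'subjects' columns of the sample data; the variable names mask and both are illustrative.

```python
import pandas as pd

data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby', 'rohith',
             'gnanesh', 'sravan', 'sravan', 'ojaswi'],
    'subjects': ['java', 'php', 'java', 'php', 'java',
                 'html/css', 'python', 'R'],
})

# The comparison yields one True/False per row
mask = data['name'] == 'sravan'
print(mask.tolist())
# [True, False, False, False, False, True, True, False]

# True counts as 1 and False as 0, so the sum is the number of matches
count_sravan = int(mask.sum())

# Conditions can be combined with & (each side needs parentheses)
both = int(((data['name'] == 'sravan') & (data['subjects'] == 'java')).sum())

print(count_sravan, both)  # 3 1
```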

Using count() after Conditional Filtering

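This approach filters the DataFrame for rows matching the condition, then calls count(), which returns the number of non-null values in each column. Selecting ['name'] from that result gives the count for the 'name' column.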

Python

count_sravan = data[data['name'] == 'sravan'].count()['name']
print("Occurrences of 'sravan':", count_sravan)

count_bobby = data[data['name'] == 'bobby'].count()['name']
print("Occurrences of 'bobby':", count_bobby)

Output:

Occurrences of 'sravan': 3
Occurrences of 'bobby': 1

Using len() after Conditional Filtering

Similar to the previous method, this one uses len() to get the length of the filtered DataFrame directly.

Python

count_sravan = len(data[data['name'] == 'sravan'])
print("Occurrences of 'sravan':", count_sravan)

count_rohith = len(data[data['name'] == 'rohith'])
print("Occurrences of 'rohith':", count_rohith)

Output:

Occurrences of 'sravan': 3
Occurrences of 'rohith': 1
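count() and len() give the same answer on the sample data, but they can differ when the filtered rows contain missing values. The sketch below uses a small hypothetical DataFrame (df, with one NaN in 'marks') to show the difference.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing mark for a 'sravan' row
df = pd.DataFrame({
    'name': ['sravan', 'sravan', 'bobby', 'sravan'],
    'marks': [98, np.nan, 78, 89],
})

filtered = df[df['name'] == 'sravan']

# len() counts rows, regardless of missing values
rows = len(filtered)

# count() counts non-null values per column
non_null_names = int(filtered.count()['name'])
non_null_marks = int(filtered.count()['marks'])

print(rows, non_null_names, non_null_marks)  # 3 3 2
```

If the goal is the number of matching rows, len() or selecting the count of the filtered column itself avoids being affected by gaps in other columns.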

Using apply() with a Lambda Function

You can use apply() to run a custom function that checks each value. This is useful when the match needs custom logic, but the function is called once per element, so it is generally slower than a vectorized comparison.

Python

count_sravan = data['name'].apply(lambda x: x == 'sravan').sum()
print("Occurrences of 'sravan':", count_sravan)

count_ojaswi = data['name'].apply(lambda x: x == 'ojaswi').sum()
print("Occurrences of 'ojaswi':", count_ojaswi)

Output:

Occurrences of 'sravan': 3
Occurrences of 'ojaswi': 1
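As an example of the kind of customization apply() allows, the sketch below counts a name while ignoring case and surrounding whitespace. The data and the helper is_sravan are hypothetical and are not part of the sample DataFrame; the same result is also computed with pandas string methods for comparison.

```python
import pandas as pd

# Hypothetical, inconsistently formatted names
names = pd.Series(['Sravan', ' sravan ', 'bobby', 'SRAVAN', 'ojaswi'])

def is_sravan(value):
    """Return True if value matches 'sravan' ignoring case and whitespace."""
    return value.strip().lower() == 'sravan'

# apply() calls is_sravan once per element and returns a boolean Series
count_apply = int(names.apply(is_sravan).sum())

# Equivalent check using the vectorized .str accessor
count_str = int((names.str.strip().str.lower() == 'sravan').sum())

print(count_apply, count_str)  # 3 3
```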

Using np.sum() for Conditional Counting

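NumPy's np.sum() can also be applied directly to the boolean mask. For a pandas Series it returns the same result as the mask's own .sum() method, so any performance difference on a large DataFrame should be measured rather than assumed.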

Python

import numpy as np

count_sravan = np.sum(data['name'] == 'sravan')
print("Occurrences of 'sravan':", count_sravan)

count_java = np.sum(data['subjects'] == 'java')
print("Occurrences of 'java':", count_java)

Output:

Occurrences of 'sravan': 3
Occurrences of 'java': 3
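The sketch below checks that three ways of counting True values in the mask agree: the Series .sum() method, np.sum() on the Series, and np.count_nonzero() on the underlying NumPy array. It makes no claim about which is fastest; on real data that is worth timing, for example with the timeit module.

```python
import numpy as np
import pandas as pd

subjects = pd.Series(['java', 'php', 'java', 'php', 'java',
                      'html/css', 'python', 'R'])
mask = subjects == 'java'

# pandas method on the boolean Series
via_series_sum = int(mask.sum())

# NumPy function applied to the Series
via_np_sum = int(np.sum(mask))

# NumPy function applied to the raw boolean array
via_count_nonzero = int(np.count_nonzero(mask.to_numpy()))

print(via_series_sum, via_np_sum, via_count_nonzero)  # 3 3 3
```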

Using Grouping and Aggregation

This approach is useful if you need counts for multiple values simultaneously. Group by the column and use .size() to count occurrences.

Python

counts = data.groupby('name').size()
count_sravan = counts.get('sravan', 0)
print("Occurrences of 'sravan':", count_sravan)

subject_counts = data.groupby('subjects').size()
count_php = subject_counts.get('php', 0)
print("Occurrences of 'php':", count_php)

Output:

Occurrences of 'sravan': 3
Occurrences of 'php': 2
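When counts for several values are needed at once, one option is to reindex the grouped result with the list of values of interest, as sketched below. The list wanted and the name 'ramya' (which does not occur in the data) are illustrative assumptions; fill_value=0 supplies a count of 0 for values that are absent.

```python
import pandas as pd

data = pd.DataFrame({
    'name': ['sravan', 'ojsawi', 'bobby', 'rohith',
             'gnanesh', 'sravan', 'sravan', 'ojaswi'],
})

# One pass over the column gives the count for every distinct name
counts = data.groupby('name').size()

# Hypothetical list of names to look up; 'ramya' is not in the data
wanted = ['sravan', 'bobby', 'ramya']

# reindex() selects the requested labels and fills missing ones with 0
result = counts.reindex(wanted, fill_value=0).to_dict()

print(result)  # {'sravan': 3, 'bobby': 1, 'ramya': 0}
```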

When to use each Pandas Method

The table below gives a combined overview of each method for counting occurrences in a Pandas column, along with when to use it and its code syntax: