Count Frequency of Columns in Pandas DataFrame
Let’s learn how to get the frequency counts of a column in a Pandas DataFrame.
Get frequency counts of a column using value_counts()
In Pandas, you can calculate the frequency counts of values in a column using the value_counts() method. It is the most direct way to count the frequencies of a single column.
import pandas as pd
# Sample DataFrame
data = {
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange'],
    'Quantity': [3, 2, 3, 1, 2, 3, 1]
}
df = pd.DataFrame(data)
# Frequency counts of the 'Fruits' column
fruit_counts = df['Fruits'].value_counts()
print(fruit_counts)
Output:
Fruits
Apple 3
Banana 2
Orange 2
Name: count, dtype: int64
The result is a Series with the unique values as the index and their counts as values.
This method also accepts additional options such as normalize (to get relative frequencies) and ascending (to sort the counts in ascending order).
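The two options can be tried on the same sample data. This is a minimal sketch; the variable names relative and ascending_counts are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange']
})

# Relative frequencies: each count divided by the total number of rows
relative = df['Fruits'].value_counts(normalize=True)
print(relative)

# Counts sorted from least frequent to most frequent
ascending_counts = df['Fruits'].value_counts(ascending=True)
print(ascending_counts)
```

With normalize=True the values sum to 1, so 'Apple' appears as 3/7 of the rows.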
More methods to compute frequency counts for a column in a Pandas DataFrame:
Using groupby() with size
The groupby() method, combined with size(), calculates the frequency of each unique value in a column. It is especially useful for counting combinations of values across multiple columns, because groupby() also accepts a list of column names.
fruit_counts = df.groupby('Fruits').size()
print(fruit_counts)
Output:
Fruits
Apple 3
Banana 2
Orange 2
dtype: int64
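Counting across multiple columns could look like the following sketch, which groups on both columns of the sample data. The names pair_counts and pair_counts_df are illustrative, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange'],
    'Quantity': [3, 2, 3, 1, 2, 3, 1],
})

# Count the rows for every (Fruits, Quantity) combination that occurs
pair_counts = df.groupby(['Fruits', 'Quantity']).size()
print(pair_counts)

# reset_index turns the MultiIndex Series into a regular DataFrame
pair_counts_df = pair_counts.reset_index(name='count')
print(pair_counts_df)
```

The result is indexed by both columns, so each row of the output is one observed combination and its count.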
Using Counter from the collections Module
You can pass the column directly to Counter to get frequency counts. It returns a Counter object (a dictionary subclass), which can be converted to a Series if required.
from collections import Counter
fruit_counts = Counter(df['Fruits'])
print(fruit_counts)
Output:
Counter({'Apple': 3, 'Banana': 2, 'Orange': 2})
To convert this result back to a Pandas Series:
fruit_counts_series = pd.Series(fruit_counts)
print(fruit_counts_series)
Output:
Apple 3
Banana 2
Orange 2
dtype: int64
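Because the result is a Counter, its own helpers are available as well. The sketch below uses most_common() to get the top entries and shows that a missing key returns 0 instead of raising an error; 'Mango' is a made-up value used only for illustration.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange']
})

fruit_counts = Counter(df['Fruits'])

# The two most frequent values; ties keep first-seen order
top_two = fruit_counts.most_common(2)
print(top_two)

# A value that never occurs has a count of 0
missing = fruit_counts['Mango']
print(missing)
```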
Using crosstab() to count frequency
crosstab() counts the occurrences of values across categories. It is best suited to counting combinations of values across two or more columns; for a single column, pass a constant label as the columns argument, as shown here.
fruit_counts = pd.crosstab(index=df['Fruits'], columns='count')
print(fruit_counts)
Output:
col_0 count
Fruits
Apple 3
Banana 2
Orange 2
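For counts across two dimensions, a second column can be passed as the columns argument. This is a minimal sketch on the sample data; the names table and table_with_totals are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange'],
    'Quantity': [3, 2, 3, 1, 2, 3, 1],
})

# Rows are fruits, columns are quantities, cells are row counts
table = pd.crosstab(index=df['Fruits'], columns=df['Quantity'])
print(table)

# margins=True adds an 'All' row and column with the totals
table_with_totals = pd.crosstab(
    index=df['Fruits'], columns=df['Quantity'], margins=True
)
print(table_with_totals)
```

Combinations that never occur, such as 'Banana' with a quantity of 1, appear as 0.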
Using a Dictionary Comprehension
A dictionary comprehension can also count frequencies. It filters the DataFrame once for every unique value, so it is best kept for small datasets.
fruit_counts = {fruit: len(df[df['Fruits'] == fruit]) for fruit in df['Fruits'].unique()}
print(fruit_counts)
Output:
{'Apple': 3, 'Banana': 2, 'Orange': 2}
Using numpy.unique()
If you prefer NumPy, np.unique() with return_counts=True can be a fast alternative to the Pandas methods.
import numpy as np
unique, counts = np.unique(df['Fruits'], return_counts=True)
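# tolist() converts the NumPy integers to plain Python ints so the dictionary prints cleanly
fruit_counts = dict(zip(unique, counts.tolist()))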
print(fruit_counts)
Output:
{'Apple': 3, 'Banana': 2, 'Orange': 2}
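The arrays returned by np.unique() can also be wrapped in a Series that is sorted by count, similar to what value_counts() produces. This is a sketch under the assumption that a stable sort on the negated counts is an acceptable way to order the result; the names order and fruit_counts_series are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange']
})

# unique holds the sorted distinct values, counts their frequencies
unique, counts = np.unique(df['Fruits'].to_numpy(), return_counts=True)

# Sort by descending count; a stable sort keeps ties in alphabetical order
order = np.argsort(-counts, kind='stable')
fruit_counts_series = pd.Series(counts[order], index=unique[order], name='count')
print(fruit_counts_series)
```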
Using a Pivot Table
Pivot tables are versatile and can also be used for frequency counts.
fruit_counts = df.pivot_table(index='Fruits', aggfunc='size')
print(fruit_counts)
Output:
Fruits
Apple 3
Banana 2
Orange 2
dtype: int64
Using apply() with lambda function
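apply() with a lambda compares each row's value against the whole column, so the result has one count per row rather than one per unique value. This manual approach is rarely used because it rescans the column for every row, which is inefficient for large datasets. To count only non-null values with the groupby() or pivot table approaches, use count instead of size.
fruit_counts = df['Fruits'].apply(lambda x: (df['Fruits'] == x).sum())
print(fruit_counts)
Output:
0    3
1    2
2    3
3    2
4    2
5    3
6    2
Name: Fruits, dtype: int64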
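A vectorized way to get the same per-row counts that the apply() call computes is to map each value to its entry in value_counts(). This is a minimal sketch, not the method used above; the names per_row and per_row_apply are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange']
})

# Compute the counts once, then look up each row's value in that result
per_row = df['Fruits'].map(df['Fruits'].value_counts())
print(per_row)

# The apply() version, for comparison: one full column scan per row
per_row_apply = df['Fruits'].apply(lambda x: (df['Fruits'] == x).sum())
print(per_row.tolist() == per_row_apply.tolist())
```

Both versions return one count per row, but map() scans the column only once to build the counts.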