
Count Frequency of Columns in Pandas DataFrame

Last Updated : 21 Nov, 2024

Let’s learn how to get the frequency counts of a column in a Pandas DataFrame.

Get frequency counts of a column using value_counts()

In Pandas, you can calculate the frequency counts of values in a column using the value_counts() method. It is the most direct and efficient way to count frequencies in a single column.

import pandas as pd

# Sample DataFrame
data = {
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange'],
    'Quantity': [3, 2, 3, 1, 2, 3, 1]
}
df = pd.DataFrame(data)

# Frequency counts of the 'Fruits' column
fruit_counts = df['Fruits'].value_counts()

print(fruit_counts)

Output:

Fruits
Apple     3
Banana    2
Orange    2
Name: count, dtype: int64

The result is a Series with the unique values as the index and their counts as values.

value_counts() also accepts optional parameters, such as normalize=True (to return relative frequencies instead of counts) or ascending=True (to sort from the least frequent value to the most frequent).
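As a minimal sketch of those two options, the snippet below rebuilds the sample 'Fruits' column and calls value_counts() once with normalize=True and once with ascending=True. The variable names relative and least_first are only illustrative.

```python
import pandas as pd

# Same sample column as in the example above
df = pd.DataFrame({
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange']
})

# normalize=True returns each value's share of the rows instead of a raw count
relative = df['Fruits'].value_counts(normalize=True)

# ascending=True sorts from the least frequent value to the most frequent
least_first = df['Fruits'].value_counts(ascending=True)

print(relative)
print(least_first)
```

With normalize=True the values sum to 1, so 'Apple' appears as 3/7 of the rows.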


More methods to compute frequency counts for a column in a Pandas DataFrame:

Using groupby() with size

The groupby() method, combined with size(), counts the rows for each unique value in a column. It is most useful when you need frequencies for combinations of several columns, since groupby() also accepts a list of column names.

fruit_counts = df.groupby('Fruits').size()
print(fruit_counts)

Output:

Fruits
Apple     3
Banana    2
Orange    2
dtype: int64
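To illustrate counting across more than one column, here is a small sketch that groups the sample data by both 'Fruits' and 'Quantity'. The names pair_counts and pair_counts_df are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange'],
    'Quantity': [3, 2, 3, 1, 2, 3, 1]
})

# Passing a list of columns counts each (Fruits, Quantity) combination
pair_counts = df.groupby(['Fruits', 'Quantity']).size()
print(pair_counts)

# reset_index(name=...) flattens the result into a regular DataFrame
pair_counts_df = pair_counts.reset_index(name='count')
print(pair_counts_df)
```

The first result is a Series with a two-level index; the flattened version is usually easier to merge or export.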

Using Counter from the collections Module

You can pass the column directly to Counter to get frequency counts. It returns a Counter object, which is a dictionary subclass, and this can be converted to a Series if required.

from collections import Counter

fruit_counts = Counter(df['Fruits'])
print(fruit_counts)

Output:

Counter({'Apple': 3, 'Banana': 2, 'Orange': 2})

To convert this result back to a Pandas Series:

fruit_counts_series = pd.Series(fruit_counts)
print(fruit_counts_series)

Output:

Apple     3
Banana    2
Orange    2
dtype: int64
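One reason to reach for Counter is its small set of convenience methods. The sketch below shows most_common() and the default count of 0 for a missing key; 'Mango' is just a hypothetical value that is absent from the sample data.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange']
})

fruit_counts = Counter(df['Fruits'])

# most_common(n) returns the n most frequent values as (value, count) pairs
top = fruit_counts.most_common(1)
print(top)

# A key that never occurred reports 0 instead of raising KeyError
missing = fruit_counts['Mango']
print(missing)
```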

Using crosstab() to count frequency

crosstab() builds a frequency table of one column's values against another. Here a constant column label, 'count', is used to tabulate a single column; the method is most useful when counting combinations across two or more columns.

fruit_counts = pd.crosstab(index=df['Fruits'], columns='count')
print(fruit_counts)

Output:

col_0   count
Fruits
Apple       3
Banana      2
Orange      2
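As a sketch of the two-dimensional case, the example below cross-tabulates 'Fruits' against 'Quantity' from the sample data. The margins=True option, which adds row and column totals labelled 'All', is included as an optional extra.

```python
import pandas as pd

df = pd.DataFrame({
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange'],
    'Quantity': [3, 2, 3, 1, 2, 3, 1]
})

# Rows are fruits, columns are quantities, cells are co-occurrence counts
table = pd.crosstab(df['Fruits'], df['Quantity'])
print(table)

# margins=True appends an 'All' row and column holding the totals
table_with_totals = pd.crosstab(df['Fruits'], df['Quantity'], margins=True)
print(table_with_totals)
```

Combinations that never occur, such as 'Banana' with quantity 3, appear as 0 in the table.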

Using a Dictionary Comprehension

A dictionary comprehension can also count frequencies. It filters the DataFrame once for every unique value, so it is best suited to small datasets.

fruit_counts = {fruit: len(df[df['Fruits'] == fruit]) for fruit in df['Fruits'].unique()}
print(fruit_counts)

Output:

{'Apple': 3, 'Banana': 2, 'Orange': 2}

Using numpy.unique()

If you prefer NumPy, numpy.unique() with return_counts=True can be a fast alternative to Pandas. It returns the sorted unique values together with their counts.

import numpy as np

unique, counts = np.unique(df['Fruits'], return_counts=True)
# tolist() converts NumPy integers to plain Python ints so the dictionary prints cleanly
fruit_counts = dict(zip(unique, counts.tolist()))
print(fruit_counts)

Output:

{'Apple': 3, 'Banana': 2, 'Orange': 2}

Using a Pivot Table

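Pivot tables are versatile and can also be used for frequency counts. With aggfunc='size', every row in each group is counted; aggfunc='count' would instead count only the non-null values in the remaining columns.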

fruit_counts = df.pivot_table(index='Fruits', aggfunc='size')
print(fruit_counts)

Output:

Fruits
Apple     3
Banana    2
Orange    2
dtype: int64

Using apply() with lambda function

This manual approach compares every row against the whole column, so it returns one count per row, aligned with the original index, rather than one count per unique value. It is rarely used, because the repeated comparisons make it inefficient for large datasets.

fruit_counts = df['Fruits'].apply(lambda x: (df['Fruits'] == x).sum())
print(fruit_counts)

Output:

0    3
1    2
2    3
3    2
4    2
5    3
6    2
Name: Fruits, dtype: int64
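If the goal is to attach each value's frequency to its own row, a likely cheaper route is to compute the counts once with value_counts() and look them up with map(). This is a sketch of that alternative, not part of the original example; the name per_row is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Fruits': ['Apple', 'Banana', 'Apple', 'Orange', 'Banana', 'Apple', 'Orange']
})

# Count each unique value once, then map every row to its value's count
counts = df['Fruits'].value_counts()
per_row = df['Fruits'].map(counts)

print(per_row)
```

The result keeps the original row index, so it can be assigned straight back to the DataFrame as a new column.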



