How to sort a Pandas DataFrame by multiple columns?

Last Updated : 05 Apr, 2025

We are given a DataFrame and our task is to sort it by multiple columns. This means ordering the rows by one column first and then, among rows that share the same value in that column, by a second column. For example, if we sort by 'Rank' in ascending order and then by 'Age' in descending order, the output is a DataFrame ordered according to those rules, with NaN values placed at the end by default (or at the start if requested).

Using nlargest()

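The nlargest() method returns the n rows with the largest values in one or more columns, ordered from highest to lowest. It is equivalent to sorting in descending order with sort_values() and then taking the first n rows, but it is generally more performant, which makes it a good fit when you only need the top few rows.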

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Raj', 'Akhil', 'Sonum', 'Tilak', 'Divya', 'Megha'],
    'Age': [20, 22, 21, 19, 17, 23],
    'Rank': [1, np.nan, 8, 9, 4, np.nan]
})

# Selecting top 3 rows with highest 'Rank'
res = df.nlargest(3, ['Rank'])
print(res)

Output

    Name  Age  Rank
3  Tilak   19   9.0
2  Sonum   21   8.0
4  Divya   17   4.0

Explanation: nlargest(n, columns) selects the n rows with the highest values in the specified column. Here, df.nlargest(3, ['Rank']) returns the 3 rows with the highest 'Rank' in descending order; the two rows whose 'Rank' is NaN are not selected.
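Since the topic is sorting by multiple columns, it may help to see nlargest() with more than one column. The sketch below uses hypothetical data (not the DataFrame above) in which some 'Rank' values are tied, so that the second column acts as a tie-breaker.

```python
import pandas as pd

# Hypothetical data with tied 'Rank' values so the tie-breaker is visible
df = pd.DataFrame({
    'Name': ['Raj', 'Akhil', 'Sonum', 'Tilak', 'Divya'],
    'Age': [20, 22, 21, 19, 17],
    'Rank': [9, 8, 9, 4, 8]
})

# Order by 'Rank' first; rows with equal 'Rank' are ordered by 'Age'
top3 = df.nlargest(3, ['Rank', 'Age'])
print(top3)
```

Both columns are ordered from largest to smallest; nlargest() does not take a per-column direction, so mixed ascending and descending orders need sort_values().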

Table of Contents

Using nsmallest()
Using sort_values()
Using sort_index()
Using argsort()

Using nsmallest()

The nsmallest() method works like nlargest() but returns the n rows with the lowest values instead.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Raj', 'Akhil', 'Sonum', 'Tilak', 'Divya', 'Megha'],
    'Age': [20, 22, 21, 19, 17, 23],
    'Rank': [1, np.nan, 8, 9, 4, np.nan]
})

# Selecting bottom 3 rows with lowest 'Rank'
res = df.nsmallest(3, ['Rank'])
print(res)

Output

    Name  Age  Rank
0    Raj   20   1.0
4  Divya   17   4.0
2  Sonum   21   8.0

Explanation: nsmallest(n, columns) selects the n rows with the lowest values in the specified column. Here, df.nsmallest(3, ['Rank']) returns the 3 rows with the lowest 'Rank' in ascending order; the two rows whose 'Rank' is NaN are not selected.
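When several rows share the value at the cut-off, the keep parameter decides which of them are returned. The following is a minimal sketch on hypothetical data with a tied 'Rank', comparing the default keep='first' with keep='all'.

```python
import pandas as pd

# Hypothetical data: two rows share Rank 4
df = pd.DataFrame({
    'Name': ['Raj', 'Divya', 'Sonum', 'Tilak'],
    'Rank': [1, 4, 4, 8]
})

# Default keep='first': exactly n rows, the first tied row wins
first_only = df.nsmallest(2, 'Rank')

# keep='all': every row tied at the cut-off is kept, so more than n rows may come back
with_ties = df.nsmallest(2, 'Rank', keep='all')

print(first_only)
print(with_ties)
```

keep='all' is useful when dropping one of several equally ranked rows would be arbitrary.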

Using sort_values()

The sort_values() method is the most flexible and widely used way to sort a DataFrame by multiple columns. It accepts a separate ascending or descending direction for each column, and its na_position parameter controls whether missing values are placed first or last.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Raj', 'Akhil', 'Sonum', 'Tilak', 'Divya', 'Megha'],
    'Age': [20, 22, 21, 19, 17, 23],
    'Rank': [1, np.nan, 8, 9, 4, np.nan]
})

# Sorting by 'Rank' in ascending order and 'Age' in descending order
res = df.sort_values(by=['Rank', 'Age'], ascending=[True, False], na_position='last')
print(res)

Output

    Name  Age  Rank
0    Raj   20   1.0
4  Divya   17   4.0
2  Sonum   21   8.0
3  Tilak   19   9.0
5  Megha   23   NaN
1  Akhil   22   NaN

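Explanation: sort_values(by, ascending, na_position) sorts a DataFrame by one or more columns. Here, the rows are sorted by 'Rank' in ascending order, rows with equal 'Rank' are sorted by 'Age' in descending order, and na_position='last' places the rows with a NaN 'Rank' at the end. In this data no two non-NaN ranks are equal, so the 'Age' tie-breaker is visible only in the two NaN rows (Megha, 23, comes before Akhil, 22).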
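To make the secondary sort key easier to see, here is a small sketch on hypothetical data where non-NaN 'Rank' values repeat. It also shows na_position='first' and ignore_index=True, which renumbers the result from 0.

```python
import pandas as pd
import numpy as np

# Hypothetical data with repeated 'Rank' values and one missing value
df = pd.DataFrame({
    'Name': ['Raj', 'Akhil', 'Sonum', 'Tilak', 'Divya'],
    'Age': [20, 22, 21, 19, 17],
    'Rank': [1, np.nan, 1, 4, 4]
})

# 'Rank' ascending, then 'Age' descending within each 'Rank';
# rows with a missing 'Rank' are placed first and the index is renumbered
res = df.sort_values(
    by=['Rank', 'Age'],
    ascending=[True, False],
    na_position='first',
    ignore_index=True
)
print(res)
```

Within each 'Rank' group the older person comes first, which is the effect of ascending=[True, False].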

Using sort_index()

The sort_index() method sorts the DataFrame by its index rather than by its column values. It is useful when you want to reorder rows by their index labels, for example after setting a custom index.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Raj', 'Akhil', 'Sonum', 'Tilak', 'Divya', 'Megha'],
    'Age': [20, 22, 21, 19, 17, 23],
    'Rank': [1, np.nan, 8, 9, 4, np.nan]
})

# Sorting the DataFrame by index in descending order
res = df.sort_index(ascending=False)
print(res)

Output

    Name  Age  Rank
5  Megha   23   NaN
4  Divya   17   4.0
3  Tilak   19   9.0
2  Sonum   21   8.0
1  Akhil   22   NaN
0    Raj   20   1.0

Explanation: sort_index(ascending) sorts a DataFrame based on its index. Here, df.sort_index(ascending=False) arranges the rows in descending order of their index values.
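sort_index() can also express a multi-key sort if the sort columns are first moved into a MultiIndex. The sketch below assumes hypothetical data without missing values and sorts by the 'Rank' level ascending and the 'Age' level descending.

```python
import pandas as pd

# Hypothetical data without missing values
df = pd.DataFrame({
    'Name': ['Raj', 'Akhil', 'Sonum', 'Tilak'],
    'Age': [20, 22, 21, 19],
    'Rank': [2, 1, 2, 1]
})

# Move the sort columns into a two-level index, then sort level by level
indexed = df.set_index(['Rank', 'Age'])
res = indexed.sort_index(level=['Rank', 'Age'], ascending=[True, False])
print(res)
```

If the columns are not needed as an index afterwards, sort_values() on the original DataFrame is usually the more direct choice.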

Using argsort()

If you are already working with NumPy arrays, you can use np.argsort() to get the positions that would sort a column and then pass those positions to .iloc to reorder the DataFrame. Note that argsort() sorts by a single key.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Raj', 'Akhil', 'Sonum', 'Tilak', 'Divya', 'Megha'],
    'Age': [20, 22, 21, 19, 17, 23],
    'Rank': [1, np.nan, 8, 9, 4, np.nan]
})
# Sorting DataFrame by 'Rank' using NumPy's argsort
sorted_idx = np.argsort(df['Rank'].values, kind='quicksort')
res = df.iloc[sorted_idx]
print(res)

Output

    Name  Age  Rank
0    Raj   20   1.0
4  Divya   17   4.0
2  Sonum   21   8.0
3  Tilak   19   9.0
1  Akhil   22   NaN
5  Megha   23   NaN

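Explanation: np.argsort(df['Rank'].values, kind='quicksort') returns the positions that would sort the 'Rank' column in ascending order. NaN values are not ignored; NumPy sorts them to the end. Passing these positions to .iloc[sorted_idx] reorders the DataFrame accordingly. Because quicksort is not a stable sort, the relative order of rows with equal 'Rank' is not guaranteed; use kind='stable' if that order matters.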
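For more than one sort key at the NumPy level, np.lexsort() is one option. The sketch below uses hypothetical data without missing values; note that lexsort treats the last key in the tuple as the primary key, and that negating 'Age' is a trick for descending order that only works for numeric columns.

```python
import pandas as pd
import numpy as np

# Hypothetical data without missing values
df = pd.DataFrame({
    'Name': ['Raj', 'Akhil', 'Sonum', 'Tilak', 'Divya'],
    'Age': [20, 22, 21, 19, 17],
    'Rank': [1, 4, 1, 4, 8]
})

# Last key is the primary key: 'Rank' ascending, then 'Age' descending (via negation)
order = np.lexsort((-df['Age'].to_numpy(), df['Rank'].to_numpy()))
res = df.iloc[order]
print(res)

# Same ordering expressed with sort_values, for comparison
expected = df.sort_values(by=['Rank', 'Age'], ascending=[True, False])
```

Both approaches give the same row order here; sort_values() is easier to read, while lexsort() may be convenient when the data is already in arrays.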

neelutiwari
