Python Pandas - Sorting

Sorting is a fundamental operation when working with data in Pandas, whether you're organizing rows, columns, or specific values. It arranges your data in a meaningful order, which makes the data easier to read and analyze.

Pandas provides powerful tools for sorting your data efficiently, which can be done by labels or actual values. In this tutorial, we'll explore various methods for sorting data in Pandas, from basic sorting by index or column labels to more advanced techniques like sorting by multiple columns and choosing specific sorting algorithms.

Types of Sorting in Pandas

There are two kinds of sorting available in Pandas. They are −

Sorting by Label − This involves sorting the data based on the index labels.
Sorting by Value − This involves sorting data based on the actual values in the DataFrame or Series.
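The difference between the two kinds of sorting can be sketched on a small DataFrame. The data and the names df, by_label and by_value are hypothetical, chosen only so that the index labels and the values are in different orders.

```python
import pandas as pd

# Hypothetical data: index labels and values are deliberately in different orders
df = pd.DataFrame({"score": [30, 10, 20]}, index=["b", "c", "a"])

# Sorting by label orders the rows by their index labels: a, b, c
by_label = df.sort_index()

# Sorting by value orders the rows by the 'score' column: 10, 20, 30
by_value = df.sort_values(by="score")

print("Sorted by label:\n", by_label)
print("\nSorted by value:\n", by_value)
```

In both results each row keeps its own label and value together; only the row order changes.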

Sorting by Label

To sort by index labels, use the sort_index() method. Its axis argument selects whether the row labels or the column labels are sorted, and its ascending argument sets the sort order. By default, this method sorts a DataFrame in ascending order based on the row labels.

Example

Let's take a basic example that sorts a DataFrame by its row labels using the sort_index() method.

import pandas as pd
import numpy as np

unsorted_df = pd.DataFrame(np.random.randn(10,2),index=[1,4,6,2,3,5,9,8,0,7],columns = ['col2','col1'])

print("Original DataFrame:\n", unsorted_df)

# Sort the DataFrame by labels
sorted_df=unsorted_df.sort_index()
print("\nOutput Sorted DataFrame:\n", sorted_df)

Its output is as follows −

Original DataFrame:
        col2      col1
1  1.116188  1.631727
4  0.287900 -1.097359
6  0.058885 -0.642273
2 -2.070172  0.148255
3 -1.458229  1.298907
5 -0.723663  2.220048
9 -1.271494  2.001025
8 -0.412954 -0.808688
0  0.922697 -0.429393
7 -0.476054 -0.351621

Output Sorted DataFrame:
        col2      col1
0  0.922697 -0.429393
1  1.116188  1.631727
2 -2.070172  0.148255
3 -1.458229  1.298907
4  0.287900 -1.097359
5 -0.723663  2.220048
6  0.058885 -0.642273
7 -0.476054 -0.351621
8 -0.412954 -0.808688
9 -1.271494  2.001025

Example − Controlling the Order of Sorting

The ascending parameter takes a Boolean value that controls the sort order: True (the default) sorts in ascending order and False sorts in descending order. The following example sorts the row labels in descending order.

import pandas as pd
import numpy as np

unsorted_df = pd.DataFrame(np.random.randn(10,2),index=[1,4,6,2,3,5,9,8,0,7],columns = ['col2','col1'])

print("Original DataFrame:\n", unsorted_df)

# Sort the DataFrame by labels in descending order
sorted_df = unsorted_df.sort_index(ascending=False)
print("\nOutput Sorted DataFrame:\n", sorted_df)

Its output is as follows −

Original DataFrame:
        col2      col1
1 -0.668366  0.576422
4  0.605218 -0.066065
6  1.140478  0.236687
2  0.137617  0.312423
3 -0.055631  0.774057
5  0.108002  1.038820
9 -0.929134 -0.982358
8 -0.207542 -1.283386
0 -0.210571 -0.656371
7 -0.106388  0.672418

Output Sorted DataFrame:
        col2      col1
9 -0.929134 -0.982358
8 -0.207542 -1.283386
7 -0.106388  0.672418
6  1.140478  0.236687
5  0.108002  1.038820
4  0.605218 -0.066065
3 -0.055631  0.774057
2  0.137617  0.312423
1 -0.668366  0.576422
0 -0.210571 -0.656371

Example − Sort the Columns

The axis argument controls which labels are sorted. With axis=0 (the default), the row labels are sorted; with axis=1, the column labels are sorted. The following example sorts the columns by their labels.

import pandas as pd
import numpy as np
 
unsorted_df = pd.DataFrame(np.random.randn(6,4),index=[1,4,2,3,5,0],columns = ['col2','col1', 'col4', 'col3'])

print("Original DataFrame:\n", unsorted_df)

# Sort the DataFrame columns
sorted_df=unsorted_df.sort_index(axis=1)
print("\nOutput Sorted DataFrame:\n", sorted_df)

Its output is as follows −

Original DataFrame:
        col2      col1      col4      col3
1 -0.828951 -0.798286 -1.794752 -0.082656
4  0.440243 -0.693218 -0.218277 -0.790168
2  1.017670  1.443679 -1.939119 -1.887223
3 -0.992471 -1.425046  0.651336 -0.278247
5 -0.103537 -0.879433  0.471838  0.860885
0 -0.222297  1.094805  0.501531 -0.580382

Output Sorted DataFrame:
        col1      col2      col3      col4
1 -0.798286 -0.828951 -0.082656 -1.794752
4 -0.693218  0.440243 -0.790168 -0.218277
2  1.443679  1.017670 -1.887223 -1.939119
3 -1.425046 -0.992471 -0.278247  0.651336
5 -0.879433 -0.103537  0.860885  0.471838
0  1.094805 -0.222297 -0.580382  0.501531
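The axis and ascending arguments can also be combined. The following is a minimal sketch with fixed, hypothetical values (instead of random numbers) that sorts the column labels in descending order.

```python
import pandas as pd

# Hypothetical fixed data so the result is reproducible
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["col2", "col3", "col1"])

# axis=1 sorts the column labels; ascending=False reverses the order
cols_desc = df.sort_index(axis=1, ascending=False)

print(cols_desc)
```

Only the order of the columns changes; the values in each column stay attached to their label.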

Sorting by Actual Values

Sorting by actual values is done with the sort_values() method. For a DataFrame, it accepts a 'by' argument that names the column, or a list of columns, whose values determine the order of the rows. A Series has only one set of values, so it is sorted without the 'by' argument.

Example − Sorting Series Values

The following example demonstrates how to sort a pandas Series object using the sort_values() method.

import pandas as pd

panda_series = pd.Series([18, 95, 66, 12, 55, 0])
print("Unsorted Pandas Series: \n", panda_series)

panda_series_sorted = panda_series.sort_values(ascending=True)
print("\nSorted Pandas Series: \n", panda_series_sorted)

On executing the above code you will get the following output −

Unsorted Pandas Series: 
 0    18
1    95
2    66
3    12
4    55
5     0
dtype: int64

Sorted Pandas Series: 
 5     0
3    12
0    18
4    55
2    66
1    95
dtype: int64

Example − Sorting DataFrame Values

The following example demonstrates how the sort_values() method works on a DataFrame object.

import pandas as pd
import numpy as np

unsorted_df = pd.DataFrame({'col1':[2,9,5,0],'col2':[1,3,2,4]})
print("Original DataFrame:\n", unsorted_df)

# Sort the DataFrame by values
sorted_df = unsorted_df.sort_values(by='col1')
print("\nOutput Sorted DataFrame:\n", sorted_df)

Its output is as follows −

Original DataFrame:
    col1  col2
0     2     1
1     9     3
2     5     2
3     0     4

Output Sorted DataFrame:
    col1  col2
3     0     4
0     2     1
2     5     2
1     9     3

Observe that the rows are ordered by the col1 values, and each row keeps its col2 value and its original index label. As a result, col2 and the index no longer appear in sorted order.
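Because each row keeps its original index label, the index of a value-sorted result is out of order. If a fresh 0, 1, 2, ... index is preferred, one option is the ignore_index parameter of sort_values(). This is a minimal sketch that assumes pandas 1.0 or later, where this parameter is available; the name renumbered is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"col1": [2, 9, 5, 0], "col2": [1, 3, 2, 4]})

# ignore_index=True discards the original labels and renumbers the rows
# (assumes pandas >= 1.0)
renumbered = df.sort_values(by="col1", ignore_index=True)

print(renumbered)
```

On older versions, calling reset_index(drop=True) on the sorted result should have the same effect.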

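Example − Sorting by Multiple Columns

You can also sort by multiple columns by passing a list of column names to the 'by' parameter. The rows are ordered by the first column in the list, and rows with equal values in that column are then ordered by the next column.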

import pandas as pd
import numpy as np

unsorted_df = pd.DataFrame({'col1':[2,1,0,1],'col2':[1,3,4,2]})

print("Original DataFrame:\n", unsorted_df)

# Sort the DataFrame by the values of multiple columns
sorted_df = unsorted_df.sort_values(by=['col1','col2'])
print("\nOutput Sorted DataFrame:\n", sorted_df)

Its output is as follows −

Original DataFrame:
    col1  col2
0     2     1
1     1     3
2     0     4
3     1     2

Output Sorted DataFrame:
    col1  col2
2     0     4
3     1     2
1     1     3
0     2     1
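When sorting by several columns, the ascending parameter can also take a list with one Boolean per column, so each column gets its own direction. The sketch below reuses the data from the example above; the name mixed is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"col1": [2, 1, 0, 1], "col2": [1, 3, 4, 2]})

# col1 ascending; within equal col1 values, col2 descending
mixed = df.sort_values(by=["col1", "col2"], ascending=[True, False])

print(mixed)
```

Compared with the previous output, the two rows with col1 equal to 1 swap places, because col2 is now sorted in descending order within that group.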

Choosing a Sorting Algorithm

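Pandas allows you to specify the sorting algorithm using the kind parameter of the sort_values() method. You can choose between 'quicksort' (the default), 'mergesort', and 'heapsort'; recent versions of Pandas also accept 'stable'. Of the three named algorithms, only 'mergesort' is stable, meaning that rows with equal values keep their original relative order. For a DataFrame, the kind parameter is applied only when sorting by a single column or label.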

Example

The following example sorts a DataFrame using the sort_values() method with a specific algorithm.

import pandas as pd
import numpy as np

unsorted_df = pd.DataFrame({'col1':[2,5,0,1],'col2':[1,3,0,4]})
print("Original DataFrame:\n", unsorted_df)

# Sort the DataFrame 
sorted_df = unsorted_df.sort_values(by='col1' ,kind='mergesort')
print("\nOutput Sorted DataFrame:\n", sorted_df)

Its output is as follows −

Original DataFrame:
    col1  col2
0     2     1
1     5     3
2     0     0
3     1     4

Output Sorted DataFrame:
    col1  col2
2     0     0
3     1     4
0     2     1
1     5     3
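The example above has no repeated values in col1, so the effect of a stable algorithm is not visible in it. The following sketch uses hypothetical data with repeated keys to show what stability means: rows with the same key come out in the order they originally appeared.

```python
import pandas as pd

# Hypothetical data: 'key' has ties, 'tag' records the original order
df = pd.DataFrame({"key": [1, 0, 1, 0, 1],
                   "tag": ["a", "b", "c", "d", "e"]})

# A stable sort keeps tied rows in their original relative order
stable = df.sort_values(by="key", kind="mergesort")

print(stable)
```

Within key 0 the tags stay in the order b, d, and within key 1 they stay in the order a, c, e. An unstable algorithm is allowed to return tied rows in any order, even if it happens to preserve it on small inputs.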
