Python Pandas - Home
Python Pandas - Introduction
Python Pandas - Environment Setup
Python Pandas - Basics
Python Pandas - Introduction to Data Structures
Python Pandas - Index Objects
Python Pandas - Panel
Python Pandas - Basic Functionality
Python Pandas - Indexing & Selecting Data
Python Pandas - Series
Python Pandas - Series
Python Pandas - Slicing a Series Object
Python Pandas - Attributes of a Series Object
Python Pandas - Arithmetic Operations on Series Object
Python Pandas - Converting Series to Other Objects
Python Pandas - DataFrame
Python Pandas - DataFrame
Python Pandas - Accessing DataFrame
Python Pandas - Slicing a DataFrame Object
Python Pandas - Modifying DataFrame
Python Pandas - Removing Rows from a DataFrame
Python Pandas - Arithmetic Operations on DataFrame
Python Pandas - IO Tools
Python Pandas - IO Tools
Python Pandas - Working with CSV Format
Python Pandas - Reading & Writing JSON Files
Python Pandas - Reading Data from an Excel File
Python Pandas - Writing Data to Excel Files
Python Pandas - Working with HTML Data
Python Pandas - Clipboard
Python Pandas - Working with HDF5 Format
Python Pandas - Comparison with SQL
Python Pandas - Data Handling
Python Pandas - Sorting
Python Pandas - Reindexing
Python Pandas - Iteration
Python Pandas - Concatenation
Python Pandas - Statistical Functions
Python Pandas - Descriptive Statistics
Python Pandas - Working with Text Data
Python Pandas - Function Application
Python Pandas - Options & Customization
Python Pandas - Window Functions
Python Pandas - Aggregations
Python Pandas - Merging/Joining
Python Pandas - MultiIndex
Python Pandas - Basics of MultiIndex
Python Pandas - Indexing with MultiIndex
Python Pandas - Advanced Reindexing with MultiIndex
Python Pandas - Renaming MultiIndex Labels
Python Pandas - Sorting a MultiIndex
Python Pandas - Binary Operations
Python Pandas - Binary Comparison Operations
Python Pandas - Boolean Indexing
Python Pandas - Boolean Masking
Python Pandas - Data Reshaping & Pivoting
Python Pandas - Pivoting
Python Pandas - Stacking & Unstacking
Python Pandas - Melting
Python Pandas - Computing Dummy Variables
Python Pandas - Categorical Data
Python Pandas - Categorical Data
Python Pandas - Ordering & Sorting Categorical Data
Python Pandas - Comparing Categorical Data
Python Pandas - Handling Missing Data
Python Pandas - Missing Data
Python Pandas - Filling Missing Data
Python Pandas - Interpolation of Missing Values
Python Pandas - Dropping Missing Data
Python Pandas - Calculations with Missing Data
Python Pandas - Handling Duplicates
Python Pandas - Duplicated Data
Python Pandas - Counting & Retrieving Unique Elements
Python Pandas - Duplicated Labels
Python Pandas - Grouping & Aggregation
Python Pandas - GroupBy
Python Pandas - Time-series Data
Python Pandas - Date Functionality
Python Pandas - Timedelta
Python Pandas - Sparse Data Structures
Python Pandas - Sparse Data
Python Pandas - Visualization
Python Pandas - Visualization
Python Pandas - Additional Concepts
Python Pandas - Caveats & Gotchas

Python Pandas - Pivoting

Quiz

Pivoting in Python Pandas is a powerful data transformation technique that reshapes data for easier analysis and visualization. It changes the data representation from a "long" format to a "wide" format, making it simpler to perform aggregations and comparisons.

This technique is particularly useful when dealing with time series data or datasets with multiple columns. Pandas provides two primary methods for pivoting −

pivot(): Reshapes data according to specified column or index values.
pivot_table(): It is a more flexible method that allows you to create a spreadsheet-style pivot table as a DataFrame.

In this tutorial, we will learn about the pivoting in Pandas using these methods with examples to demonstrate their applications in data manipulation.

Pivoting with pivot()

The Pandas df.pivot() method is used to reshape data when there are unique values for the specified index and column pairs. It is straightforward and useful when your data is well-structured without duplicate entries for the index/column combination.

Example

Here is a basic example demonstrating pivoting a Pandas DataFrame with the Pandas df.pivot() method.

Open Compiler

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({"Col1": range(12),"Col2": ["A", "A", "A", "B", "B","B", "C", "C", "C", "D", "D", "D"],
"date": pd.to_datetime(["2024-01-03", "2024-01-04", "2024-01-05"] * 4)})

# Display the Input DataFrame
print('Original DataFrame:\n', df)

# Pivot the DataFrame
pivoted = df.pivot(index="date", columns="Col2", values="Col1")

# Display the output
print('Pivoted DataFrame:\n', pivoted)

Following is the output of the above code −

Original DataFrame:

	Col1	Col2	date
0	0	A	2024-01-03
1	1	A	2024-01-04
2	2	A	2024-01-05
3	3	B	2024-01-03
4	4	B	2024-01-04
5	5	B	2024-01-05
6	6	C	2024-01-03
7	7	C	2024-01-04
8	8	C	2024-01-05
9	9	D	2024-01-03
10	10	D	2024-01-04
11	11	D	2024-01-05

Pivoted DataFrame:

Col2	A	B	C	D
date
2024-01-03	0	3	6	9
2024-01-04	1	4	7	10
2024-01-05	2	5	8	11

Note: The pivot() method requires that the index and columns specified have unique values. If your data contains duplicates, you should use the pivot_table() method instead.

Pivoting with pivot_table()

The pivot() method is a straightforward way to reshape data, while pivot_table() offers flexibility for aggregation, making it suitable for more complex data manipulation tasks. This is particularly useful for summarizing data when dealing with duplicates and requires aggregation of data.

Example

This example demonstrates pivoting a DataFrame using the df.pivot_table() method.

Open Compiler

import numpy as np
import pandas as pd
import datetime

# Create a DataFrame
df = pd.DataFrame({"A": [1, 1, 2, 3] * 6,
"B": ["A", "B", "C"] * 8,
"C": ["x", "x", "x", "y", "y", "y"] * 4,
"D": np.random.randn(24),
"E": np.random.randn(24),
"F": [datetime.datetime(2013, i, 1) for i in range(1, 13)] +[datetime.datetime(2013, i, 15) for i in range(1, 13)]})

# Display the Input DataFrame
print('Original DataFrame:\n', df)

# Pivot the DataFrame
pivot_table = pd.pivot_table(df, values="D", index=["A", "B"], columns=["C"])

# Display the output
print('Pivoted DataFrame:\n', pivot_table)

Following is the output of the above code −

Original DataFrame:

	A	B	C	D	E	F
0	1	A	x	-0.530395	2.819976	2013-01-01
1	1	B	x	-0.593347	0.639877	2013-02-01
2	2	C	x	0.817982	-0.270499	2013-03-01
3	3	A	y	-1.438048	1.707790	2013-04-01
4	1	B	y	0.207012	-0.387364	2013-05-01
5	1	C	y	0.462550	1.145260	2013-06-01
6	2	A	x	3.032849	-0.027888	2013-07-01
7	3	B	x	-0.972964	-0.546819	2013-08-01
8	1	C	x	0.613289	-0.041706	2013-09-01
9	1	A	y	-1.072169	-0.061843	2013-10-01
10	2	B	y	1.305400	0.463981	2013-11-01
11	3	C	y	-1.265300	-2.020158	2013-12-01
12	1	A	x	0.165842	-0.154173	2013-01-15
13	1	B	x	-1.038789	-0.705948	2013-02-15
14	2	C	x	0.690639	0.843855	2013-03-15
15	3	A	y	-0.592202	0.718281	2013-04-15
16	1	B	y	0.048085	-1.111917	2013-05-15
17	1	C	y	1.111408	-0.125214	2013-06-15
18	2	A	x	0.086012	0.276006	2013-07-15
19	3	B	x	-1.043673	-0.777043	2013-08-15
20	1	C	x	-0.471818	0.736921	2013-09-15
21	1	A	y	-0.138187	-0.849134	2013-10-15
22	2	B	y	-2.286761	-2.859258	2013-11-15
23	3	C	y	0.392779	-0.104737	2013-12-15

Pivoted DataFrame: C x y A B 1 A -0.182276 -0.605178 B -0.816068 0.127548 C 0.070736 0.786979 2 A 1.559430 NaN B NaN -0.490681 C 0.754310 NaN 3 A NaN -1.015125 B -1.008318 NaN C NaN -0.436260

Pivoting with Aggregation

The Pandas pivot_table() method can be used to specify an aggregation function. By default it calculates the mean, but you can also use functions like sum, count, or even custom functions for applying aggregation to the pivoting.

Example

This example demonstrates how to apply aggregation function with pivoting a DataFrame using the df.pivot_table() method.

Open Compiler

import numpy as np
import datetime
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({"A": [1, 1, 2, 3] * 6,
"B": ["A", "B", "C"] * 8,
"C": ["x", "x", "x", "y", "y", "y"] * 4,
"D": np.random.randn(24),
"E": np.random.randn(24),
"F": [datetime.datetime(2013, i, 1) for i in range(1, 13)] +[datetime.datetime(2013, i, 15) for i in range(1, 13)]})

# Display the Input DataFrame
print('Original DataFrame:\n', df)

# Pivot the DataFrame with a aggregate function
pivot_table = pd.pivot_table(df, values=["D", "E"], index=["B"], columns=["A", "C"], aggfunc="sum")

# Display the output
print('Pivoted DataFrame:\n', pivot_table)

Following is the output of the above code −

Original DataFrame:

	A	B	C	D	E	F
0	1	A	x	0.921728	0.807799	2013-01-01
1	1	B	x	0.565152	-0.369947	2013-02-01
2	2	C	x	1.260114	0.352844	2013-03-01
3	3	A	y	-1.369645	1.504198	2013-04-01
4	1	B	y	0.882293	-1.177686	2013-05-01
5	1	C	y	0.560940	1.126121	2013-06-01
6	2	A	x	1.496309	0.131623	2013-07-01
7	3	B	x	0.557194	1.545635	2013-08-01
8	1	C	x	0.518436	-0.581113	2013-09-01
9	1	A	y	-0.607266	-1.032699	2013-10-01
10	2	B	y	-0.665019	-0.628637	2013-11-01
11	3	C	y	2.146093	-0.175748	2013-12-01
12	1	A	x	0.203505	1.402464	2013-01-15
13	1	B	x	0.602630	1.028011	2013-02-15
14	2	C	x	-0.137840	0.099252	2013-03-15
15	3	A	y	1.517678	1.027651	2013-04-15
16	1	B	y	-0.949576	-0.268656	2013-05-15
17	1	C	y	-0.328505	-0.092841	2013-06-15
18	2	A	x	0.089315	0.771653	2013-07-15
19	3	B	x	1.889772	-0.348258	2013-08-15
20	1	C	x	1.081143	-0.006387	2013-09-15
21	1	A	y	0.223791	0.944354	2013-10-15
22	2	B	y	0.111047	-1.634985	2013-11-15
23	3	C	y	-1.475421	0.586825	2013-12-15

Pivoted DataFrame: D ... E A 1 2 ... 2 3 C x y x ... y x y B ... A 1.125233 -0.383475 1.585623 ... NaN NaN 2.531849 B 1.167782 -0.067283 NaN ... -2.263622 1.197377 NaN C 1.599579 0.232435 1.122273 ... NaN NaN 0.411078 [3 rows x 12 columns]

Print Page