Pandas DataFrame corr() Method
Pandas DataFrame.corr() computes the pairwise correlation of all columns in a Pandas DataFrame in Python. NaN values are excluded automatically. To ignore non-numeric columns, pass the parameter numeric_only=True. In this article, we will learn about the DataFrame.corr() method in Python.
Pandas DataFrame corr() Method Syntax
Syntax: DataFrame.corr(method='pearson', min_periods=1, numeric_only=False)
Parameters:
- method: The correlation method to use. One of:
  - pearson: standard correlation coefficient
  - kendall: Kendall Tau correlation coefficient
  - spearman: Spearman rank correlation
- min_periods: Minimum number of observations required per pair of columns to have a valid result. Currently only available for pearson and spearman correlation.
- numeric_only: Whether to include only numeric (float, int, or boolean) columns in the calculation. It is set to False by default.
Returns: DataFrame containing the correlation matrix
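To make the parameters above more concrete, here is a minimal sketch on a small made-up DataFrame. The column names x, y, and label and all of the values are illustrative assumptions, not part of this article's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two numeric columns (one with a missing value)
# and one text column
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.0, np.nan, 8.0, 10.0],
    "label": ["a", "b", "c", "d", "e"],
})

# numeric_only=True leaves the text column out of the result
pearson = df.corr(method="pearson", numeric_only=True)

# Same call with the rank-based Spearman method
spearman = df.corr(method="spearman", numeric_only=True)

# x and y share only 4 complete rows, so requiring 5 observations
# per pair turns their correlation into NaN
strict = df.corr(min_periods=5, numeric_only=True)

print(pearson)
print(spearman)
print(strict)
```

The row containing NaN is skipped only for the pairs that involve y, which is what "automatically excluded" means in practice.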
Pandas Data Correlations corr() Method
What counts as a good correlation depends on the use case, but as a rule of thumb a coefficient of at least 0.6 (or at most -0.6) is often called a good correlation. Here is a simple example showing how correlation works in Python.
import pandas as pd
df = {
"Array_1": [30, 70, 100],
"Array_2": [65.1, 49.50, 30.7]
}
data = pd.DataFrame(df)
print(data.corr())
Output
          Array_1   Array_2
Array_1  1.000000 -0.990773
Array_2 -0.990773  1.000000
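As a rough sketch, the 0.6 rule of thumb mentioned above can be applied by reading one cell of the correlation matrix and comparing its absolute value to a threshold. The threshold variable is an assumption for illustration only; it is not a pandas setting.

```python
import pandas as pd

data = pd.DataFrame({
    "Array_1": [30, 70, 100],
    "Array_2": [65.1, 49.50, 30.7],
})

corr = data.corr()

# Rule-of-thumb cutoff taken from the text above (not part of pandas)
threshold = 0.6

# Read a single coefficient out of the matrix by row and column label
r = corr.loc["Array_1", "Array_2"]

# The sign gives the direction, the absolute value gives the strength
is_strong = abs(r) >= threshold
print(f"r = {r:.6f}, strong: {is_strong}")
```

Here the coefficient is strongly negative: as Array_1 increases, Array_2 decreases.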
Creating Sample Dataframe
The remaining examples load the file nba.csv into a DataFrame. We start by printing the first 10 rows of the DataFrame.
Note: The correlation of a variable with itself is 1.
# importing pandas as pd
import pandas as pd
# Making data frame from the csv file
df = pd.read_csv("nba.csv")
# Printing the first 10 rows of the data frame for visualization
df[:10]
Output

Python Pandas DataFrame corr() Method Examples
Find Correlation Among the Columns Using pearson Method
Here, we use the corr() function to find the correlation among the columns in the DataFrame using the 'pearson' method. The DataFrame has only four numeric columns. Each cell of the output DataFrame holds the correlation between its row variable and its column variable. As mentioned earlier, the correlation of a variable with itself is 1, so all the diagonal values are 1.00.
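# To find the correlation among
# the columns using pearson method.
# numeric_only=True skips the text columns, which
# would otherwise raise an error in pandas 2.0+
df.corr(method='pearson', numeric_only=True)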
Output

Find Correlation Among the Columns Using Kendall Method
Here, we use the Pandas df.corr() function to find the correlation among the columns in the DataFrame using the 'kendall' method. As before, each cell of the output DataFrame holds the correlation between its row variable and its column variable, and all the diagonal values are 1.00 because the correlation of a variable with itself is 1.
# importing pandas as pd
import pandas as pd
# Making data frame from the csv file
df = pd.read_csv("nba.csv")
# To find the correlation among
# the columns using kendall method.
# numeric_only=True skips the text columns
df.corr(method='kendall', numeric_only=True)
Output

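To illustrate what the Kendall coefficient measures, the sketch below counts concordant and discordant pairs by hand and compares the result with the Pearson coefficient. The helper kendall_tau_no_ties and the sample data are hypothetical; the helper only handles data without ties and is not how pandas implements the method.

```python
from itertools import combinations

import pandas as pd

# Made-up data: y grows with x, but not linearly (y = x ** 3)
df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [1, 8, 27, 64, 125]})


def kendall_tau_no_ties(a, b):
    """Kendall tau for data without ties:
    (concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (a1, b1), (a2, b2) in combinations(zip(a, b), 2):
        direction = (a1 - a2) * (b1 - b2)
        if direction > 0:
            concordant += 1   # both values move in the same direction
        elif direction < 0:
            discordant += 1   # values move in opposite directions
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


tau = kendall_tau_no_ties(df["x"], df["y"])
pearson_r = df.corr(method="pearson").loc["x", "y"]

print(f"Kendall tau (manual): {tau}")
print(f"Pearson r:            {pearson_r:.4f}")
```

Because every pair of rows is ordered the same way in both columns, the Kendall value is exactly 1, while the Pearson value stays below 1 because the relationship is not linear. Calling df.corr(method='kendall') should give the same value for this column pair; note that pandas relies on SciPy being installed for the Kendall method.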
Pandas DataFrame corr() Method – FAQs
What does corr() do in Pandas?
The corr() method calculates the correlation between columns in a DataFrame. By default, it computes the Pearson correlation coefficient, but other methods like Spearman and Kendall can also be used depending on the nature of the data.
How does the corr() method work in Python?
The corr() method computes the correlation coefficient between every pair of numeric columns in a DataFrame. The Pearson correlation coefficient measures the linear relationship between two variables, ranging from -1 to +1, where:
- +1 indicates a perfect positive linear relationship,
- -1 indicates a perfect negative linear relationship,
- 0 indicates no linear relationship.
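As a sketch of what happens under the default method, the Pearson coefficient can be computed by hand from the deviations of each column from its mean and compared with the value that corr() returns. The columns A and B below are made-up sample data.

```python
import numpy as np
import pandas as pd

# Made-up sample data
df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [2, 1, 4, 3, 5]})

a = df["A"].to_numpy(dtype=float)
b = df["B"].to_numpy(dtype=float)

# Deviations of each value from its column mean
da = a - a.mean()
db = b - b.mean()

# Pearson r = sum(da * db) / sqrt(sum(da^2) * sum(db^2))
r_manual = (da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum())

# The same coefficient taken from the pandas correlation matrix
r_pandas = df.corr().loc["A", "B"]

print(r_manual, r_pandas)
```

Both values agree up to floating-point rounding, and the positive sign reflects that B tends to increase as A increases.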
How to find the correlation of a DataFrame in Pandas?
To find the correlation matrix of a DataFrame, simply call the corr() method:
import pandas as pd
# Example DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 6, 7, 8, 9],
    'C': [9, 8, 7, 6, 5]
})
# Calculate the correlation matrix
correlation_matrix = df.corr()
print(correlation_matrix)
How to find correlation in a dataset?
Finding correlation in a dataset involves calculating the correlation matrix for all numerical columns in the dataset:
# Assuming 'df' is your DataFrame
print(df.corr())
This will provide you with a matrix showing the correlation coefficients between all pairs of numerical columns in the DataFrame.
How to find correlation between two variables?
To find the correlation between two specific variables or columns in a DataFrame:
# Calculate correlation between column 'A' and 'B'
correlation = df['A'].corr(df['B'])
print(f"The correlation between A and B is {correlation}")
This method isolates the correlation coefficient between just the two specified columns, giving you a clear view of their linear relationship.
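For reference, here is a self-contained sketch showing that the two-column form gives the same number as the matching cell of the full correlation matrix. It reuses the small A, B, C example values from the FAQ above.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [5, 6, 7, 8, 9],
    "C": [9, 8, 7, 6, 5],
})

# Series.corr compares exactly two columns (Pearson by default)
a_vs_b = df["A"].corr(df["B"])
a_vs_c = df["A"].corr(df["C"])

# The same coefficient is available in the full matrix
from_matrix = df.corr().loc["A", "B"]

print(f"A vs B: {a_vs_b:.2f}")
print(f"A vs C: {a_vs_c:.2f}")
print(f"A vs B from the matrix: {from_matrix:.2f}")
```

B rises in step with A, so their coefficient is 1 (up to rounding), while C falls as A rises, giving -1.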