Python | Pandas Index.get_duplicates()
Python is a great language for data analysis, primarily because of its fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier.
Pandas Index.get_duplicates() function extracts duplicated index elements. It returns a sorted list of the index elements that appear more than once in the Index.
Syntax: Index.get_duplicates()
Returns : List of duplicated indexes.
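Note that Index.get_duplicates() was deprecated in pandas 0.23.0 and removed in pandas 1.0; the deprecation message pointed to idx[idx.duplicated()].unique() as the replacement. The following is a minimal sketch of that idiom for newer pandas versions. The helper name get_duplicates_compat and the sample values are hypothetical, chosen only for illustration, and the sorting step assumes the labels are mutually comparable.

```python
import pandas as pd


def get_duplicates_compat(index):
    """Hypothetical helper: return a sorted list of labels that occur more than once."""
    # duplicated() flags every occurrence of a label after its first one
    mask = index.duplicated()
    # unique() collapses labels that were repeated three or more times
    return sorted(index[mask].unique())


# Illustrative data: 3 appears three times, 1 appears twice, 2 appears once
numbers = pd.Index([3, 1, 3, 2, 1, 3])
result = get_duplicates_compat(numbers)
print(result)
```

Sorting is done with Python's built-in sorted(), so this sketch would raise a TypeError on an Index that mixes incomparable values such as strings and None.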
Example #1: Use Index.get_duplicates() function to find all the duplicate values in the Index.
    # importing pandas as pd
    import pandas as pd

    # Creating the Index
    idx = pd.Index(['Labrador', 'Beagle', 'Labrador', 'Lhasa', 'Husky', 'Beagle'])

    # Print the Index
    print(idx)
Output :
Let's find all the duplicate values in the Index.

    # print the duplicated values (get_duplicates() requires pandas < 1.0)
    print(idx.get_duplicates())
Output :
As we can see in the output, the Index.get_duplicates() function has returned all the values that occur more than once in the Index.
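To see how many times each repeated label occurs, rather than only which labels repeat, one option is Index.value_counts(). The sketch below is an alternative technique, not what get_duplicates() does internally; it reuses the labels from Example #1, and the variable names counts and repeated are illustrative.

```python
import pandas as pd

idx = pd.Index(['Labrador', 'Beagle', 'Labrador', 'Lhasa', 'Husky', 'Beagle'])

# Count the occurrences of every label in the Index
counts = idx.value_counts()

# Keep only the labels seen more than once, ordered alphabetically
repeated = counts[counts > 1].sort_index()
print(repeated)
```

The index of repeated holds the duplicated labels, and its values hold the number of occurrences of each.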
Example #2: Use Index.get_duplicates() function to find all the duplicates in the Index. The Index also contains missing (None) values.
    # importing pandas as pd
    import pandas as pd

    # Creating the Index
    idx = pd.Index(['Labrador', 'Beagle', None, 'Labrador', 'Lhasa',
                    'Husky', 'Beagle', None, 'Koala'])

    # Print the Index
    print(idx)
Output :
As we can see in the output, the Index contains some missing values. Let's see how the Index.get_duplicates() function treats them.

    # print the duplicate values in Index (get_duplicates() requires pandas < 1.0)
    print(idx.get_duplicates())
Output :
Missing values that occur more than once are also treated as duplicates.
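The same behaviour can be checked on pandas versions where get_duplicates() is no longer available, using the duplicated() idiom. This is a sketch under that assumption; the result is left unsorted because Python cannot order strings against missing values, and the variable names dups and named_dups are illustrative.

```python
import pandas as pd

idx = pd.Index(['Labrador', 'Beagle', None, 'Labrador', 'Lhasa',
                'Husky', 'Beagle', None, 'Koala'])

# Repeated labels, including a repeated missing value, each reported once
dups = idx[idx.duplicated()].unique()
print(dups)

# Drop the missing entry to keep only the named duplicates
named_dups = sorted(dups.dropna())
print(named_dups)
```

If missing values should not count as duplicates, calling dropna() on the Index before checking for duplicates is one way to exclude them.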