
How to Select Rows & Columns by Name or Index in Pandas Dataframe – Using loc and iloc

Last Updated : 28 Nov, 2024

When you work with labeled data, you often need to pull specific rows and columns out of a Pandas DataFrame. This article focuses on two pandas indexers, loc and iloc, which select rows and columns by their labels (names) or by their integer positions, respectively.

Let’s start with a basic example of both:

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],'Age': [25, 30, 35],'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

# Using loc (label-based)
result_loc = df.loc[0, 'Name']  # Select value at row 0 and column 'Name'

# Using iloc (position-based)
result_iloc = df.iloc[1, 2]  # Select value at row 1 and column 2

print("Using loc:", result_loc)   
print("Using iloc:", result_iloc)

Output:

Using loc: Alice
Using iloc: Los Angeles

Selecting Rows and Columns Using .loc[] (Label-Based Indexing)

The .loc[] indexer selects data by label (row index labels and column names). It accepts a single label, a list of labels, a label slice, or a boolean mask, so it can return single rows or columns, multiple rows or columns, or specific subsets.

Key Features of .loc[]:

  • Label-based indexing.
  • Can select both rows and columns simultaneously.
  • Supports slicing and filtering.
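
Because .loc[] works with labels, a label slice includes its end label. The following is a minimal sketch, assuming the sample data is re-indexed by the Name column; the names df_named and first_two are illustrative only and do not appear elsewhere in this article.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}

# Assumption for illustration: use the 'Name' column as the row labels
df_named = pd.DataFrame(data).set_index('Name')

# Label slice: both 'Alice' and 'Bob' are included in the result
first_two = df_named.loc['Alice':'Bob', 'Age']
print(first_two)
```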

Select a Single Row by Label:

row = df.loc[0]  # Select the row with index label 0 (the first row here)
print(row)

Output:

Name       Alice
Age           25
City    New York
Name: 0, dtype: object

Select Multiple Rows by Labels:

rows = df.loc[[0, 2]]  # Select rows with index labels 0 and 2
print(rows)

Output:

      Name  Age      City
0    Alice   25  New York
2  Charlie   35   Chicago

Select Specific Rows and Columns:

subset = df.loc[0:1, ['Name', 'City']]  # Select rows labeled 0 through 1 (inclusive) and specific columns
print(subset)

Output:

    Name         City
0  Alice     New York
1    Bob  Los Angeles

Filter Rows Based on Conditions:

filtered = df.loc[df['Age'] > 25]  # Select rows where Age > 25
print(filtered)

Output:

      Name  Age         City
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
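
A boolean mask can also be combined with a column selection in the same .loc[] call. This sketch assumes the same sample data; the names mask and filtered_cols are illustrative.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

# Combine two conditions with & (each condition needs its own parentheses)
mask = (df['Age'] > 25) & (df['City'] != 'Chicago')

# Rows where the mask is True, restricted to the 'Name' and 'City' columns
filtered_cols = df.loc[mask, ['Name', 'City']]
print(filtered_cols)
```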

Selecting Rows and Columns Using .iloc[] (Integer-Position Based Indexing)

The .iloc[] indexer selects data by integer position, counting from 0. It is particularly useful when you don’t know the labels but know the positions.

Key Features of .iloc[]:

  • Uses integer positions (0, 1, 2, …) to index rows and columns.
  • Just like .loc[], you can pass a range or a list of indices.
  • Supports slicing, similar to Python lists.
  • Unlike .loc[], slices exclude the end position, so 0:2 selects positions 0 and 1 only.
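
The difference in slice endpoints is easiest to see side by side. This is a small sketch using the same sample data, where the default index labels happen to match the positions; by_position and by_label are illustrative names.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

by_position = df.iloc[0:2]  # positions 0 and 1; position 2 is excluded
by_label = df.loc[0:2]      # labels 0, 1 and 2; the end label is included

print(len(by_position))  # 2
print(len(by_label))     # 3
```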

Select a Single Row by Position:

row = df.iloc[1]  # Select the second row (index position = 1)
print(row)

Output:

Name            Bob
Age              30
City    Los Angeles
Name: 1, dtype: object

Select Multiple Rows by Positions:

rows = df.iloc[[0, 2]]  # Select first and third rows by position
print(rows)

Output:

      Name  Age      City
0    Alice   25  New York
2  Charlie   35   Chicago

Select Specific Rows and Columns:

subset = df.iloc[[0, 2], [1]]  # Select the first and third rows and the second column ('Age')
print(subset)

Output:

   Age
0   25
2   35
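
Positions can also be given as slices or as negative numbers that count from the end, as with Python lists. This is a minimal sketch on the same sample data; last_row and top_left are illustrative names.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

last_row = df.iloc[-1]      # last row, returned as a Series
top_left = df.iloc[:2, :2]  # first two rows and first two columns

print(last_row['Name'])
print(top_left)
```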

Key Differences Between loc and iloc

Although both loc and iloc allow row and column selection, they differ in how they handle indexes:

| Feature | loc | iloc |
|---|---|---|
| Indexing Basis | Label-based (uses row/column labels) | Position-based (uses integer positions) |
| Slice Inclusiveness | Inclusive of both start and end labels | Exclusive of the end position |
| Row Selection | Works with index labels (could be strings) | Works with integer positions (0, 1, 2, …) |
| Column Selection | Works with column labels (can be strings) | Works with integer positions (0, 1, 2, …) |
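
The distinction matters most when the index labels are integers that do not match the row positions. This sketch assumes a hypothetical index of 10, 20, 30 on the sample data; df_custom and the other variable names are illustrative.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}

# Assumption for illustration: integer labels that differ from positions
df_custom = pd.DataFrame(data, index=[10, 20, 30])

name_by_label = df_custom.loc[10, 'Name']  # label 10, column 'Name'
name_by_position = df_custom.iloc[0, 0]    # first row, first column

# There is no row labeled 0, so loc raises KeyError
try:
    df_custom.loc[0]
    label_zero_exists = True
except KeyError:
    label_zero_exists = False

print(name_by_label, name_by_position, label_zero_exists)
```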

Use loc when:

  • You have specific labels for rows or columns and want to work with them directly.
  • Your dataset has non-numeric indexes (e.g., strings, datetime).

Use iloc when:

  • You are working with numeric positions and don’t need to reference row/column names directly.
  • You want to select data based on positions within the structure.
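
As a sketch of both cases on a datetime index, the example below uses hypothetical readings (the dates, the value column, and the variable names are made up for illustration). With loc, a partial date string such as '2024-01' selects every row in that month, while iloc ignores the dates and counts rows.

```python
import pandas as pd

# Hypothetical readings indexed by date
dates = pd.to_datetime(['2024-01-01', '2024-01-15', '2024-02-01'])
readings = pd.DataFrame({'value': [10, 20, 30]}, index=dates)

january = readings.loc['2024-01']  # label-based: all rows in January 2024
first_reading = readings.iloc[0]   # position-based: the first row

print(january)
print(first_reading)
```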

Both loc and iloc select specific data in a Pandas DataFrame. The key difference is whether you select by label (loc) or by integer position (iloc). Knowing when to use each is essential for efficient data manipulation.

Selecting Rows and Columns by Name or Index Using loc and iloc – FAQs

What Are the loc and iloc Methods for Retrieving Rows and Columns?

  • loc: Used for label-based indexing, selecting rows and columns by their names.
  • iloc: Used for positional indexing, selecting rows and columns by their integer positions.

How Do I Combine iloc and loc?

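Neither indexer accepts a mix of labels and positions in a single call, because they use different indexing schemes. To mix them, convert one form to the other first: look up labels from positions through df.index or df.columns, or look up a position from a label with get_loc, and then pass the result to loc or iloc.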
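
One common workaround is to translate positions into labels, or labels into positions, before indexing. This is a minimal sketch on the sample data that selects rows by position and a column by name in two equivalent ways; via_loc and via_iloc are illustrative names.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

# Translate row positions 0 and 2 into labels, then index by label
via_loc = df.loc[df.index[[0, 2]], 'City']

# Translate the column label 'City' into a position, then index by position
via_iloc = df.iloc[[0, 2], df.columns.get_loc('City')]

print(via_loc)
print(via_iloc)
```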

Is iloc for Rows or Columns?

iloc can select both rows and columns by their integer positions. Give the row selection first and the column selection second, separated by a comma; each can be a single integer, a slice, or a list of positions.
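
To select columns only, pass a full slice (:) for the rows. The sketch below, on the sample data, also shows that a single position returns a Series while a list of positions returns a DataFrame; age_series and age_frame are illustrative names.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

age_series = df.iloc[:, 1]   # all rows, single column position -> Series
age_frame = df.iloc[:, [1]]  # all rows, list of positions -> DataFrame

print(type(age_series).__name__)
print(type(age_frame).__name__)
```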

How Do I Select Specific Rows and Columns from a DataFrame in R?

In R, you can select specific rows and columns from a data frame using square brackets [ , ], where the first dimension is rows and the second is columns. You can use numeric indices (which start at 1 in R) or the names of the rows/columns.

Example:

# Sample data frame in R
data <- data.frame(
  Name = c('John', 'Anna', 'Mike', 'Chris'),
  Age = c(28, 24, 35, 42),
  City = c('New York', 'Paris', 'Berlin', 'London')
)

# Select the first two rows and the columns 'Name' and 'Age'
selected <- data[1:2, c('Name', 'Age')]
print(selected)
