
Python Pandas - Boolean Indexing
Boolean indexing is a technique for filtering data based on conditions. It uses a mask of boolean values (True or False) to select the elements of an array, Series, or DataFrame that meet the defined criteria.
Instead of manually iterating through data to find values that meet a condition, Boolean indexing simplifies the process by applying logical expressions.
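To make the contrast concrete, here is a minimal sketch that compares a manual loop with a boolean mask. The sample values and the threshold of 20 are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
values = pd.Series([12, 25, 7, 40, 18])

# Manual approach: loop over the values and keep those above the threshold
kept_by_loop = []
for v in values:
    if v > 20:
        kept_by_loop.append(v)

# Boolean indexing: the comparison builds a True/False mask in one step,
# and indexing with that mask keeps only the positions marked True
mask = values > 20
kept_by_mask = values[mask]

print(kept_by_loop)
print(kept_by_mask.tolist())
```

Both approaches keep the same values; the mask version also preserves the original index labels of the selected elements.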
What is Boolean Indexing in Pandas?
In Pandas, Boolean indexing is used to filter rows or columns of a DataFrame or Series based on conditional statements. A condition produces a boolean mask, which is an array of True and False values. True marks the data to be selected, and False marks the data to be left out.
In this tutorial, we will learn how to access data in a Pandas DataFrame using Boolean indexing with conditional expressions and the .loc[] and .iloc[] indexers. We will also explore how to combine conditions with logical operators for more advanced filtering.
Creating a Boolean Index
A boolean index is created by applying a conditional statement to a DataFrame or Series object. For example, if you check whether the values in a column are greater than a specific number, Pandas returns a Series of True and False values with the same index as the original data. That Series is the boolean index.
Example: Creating a Boolean Index
The following example demonstrates how to create a boolean index based on a condition.
    import pandas as pd

    # Create a Pandas DataFrame
    df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=['A', 'B'])

    # Display the DataFrame
    print("Input DataFrame:\n", df)

    # Create Boolean Index
    result = df > 2
    print('Boolean Index:\n', result)
Following is the output of the above code −
    Input DataFrame:
        A  B
    0  1  2
    1  3  4
    2  5  6
    Boolean Index:
            A      B
    0  False  False
    1   True   True
    2   True   True
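A mask with the same shape as the DataFrame behaves differently from a single-column mask when it is used for selection. The sketch below reuses the same sample data to show the difference; the variable names are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=['A', 'B'])

# Element-wise mask with the same shape as df
cell_mask = df > 2

# Indexing with a DataFrame-shaped mask keeps every row and column;
# cells where the mask is False become NaN
masked_cells = df[cell_mask]
print(masked_cells)

# A mask built from one column is a Series; indexing with it drops
# the rows where the mask is False
row_mask = df['A'] > 2
filtered_rows = df[row_mask]
print(filtered_rows)

# True counts as 1, so sum() reports how many cells matched per column
print(cell_mask.sum())
```

For row filtering, the single-column form is usually the one you want.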
Filtering Data Using Boolean Indexing
Once a boolean index is created, you can use it to filter rows or columns in the DataFrame. This is done by using .loc[] for label-based indexing and .iloc[] for position-based indexing.
Example: Filtering Data using the Boolean Index with .loc
The following example demonstrates filtering data with a boolean index and the .loc indexer. The boolean index selects the rows, and the column is specified by its label.
    import pandas as pd

    # Create a Pandas DataFrame
    df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=['A', 'B'])

    # Display the DataFrame
    print("Input DataFrame:\n", df)

    # Create Boolean Index
    s = (df['A'] > 2)

    # Filter DataFrame using the Boolean Index with .loc
    print('Output Filtered DataFrame:\n', df.loc[s, 'B'])
Following is the output of the above code −
    Input DataFrame:
        A  B
    0  1  2
    1  3  4
    2  5  6
    Output Filtered DataFrame:
     1    4
    2    6
    Name: B, dtype: int64
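The row selector and the column selector in .loc are independent, so the same mask can return a single column, several columns, or whole rows. A brief sketch using the DataFrame from the example above (the result variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=['A', 'B'])
s = df['A'] > 2

# Mask only: all columns of the matching rows
all_columns = df.loc[s]

# Mask plus a list of labels: a DataFrame with the chosen columns
as_frame = df.loc[s, ['B']]

# Mask plus a single label: a Series
as_series = df.loc[s, 'B']

# Plain brackets with a boolean Series also filter rows
shorthand = df[s]

print(all_columns)
print(as_frame)
print(as_series)
```

Passing a list of labels keeps the result two-dimensional, which helps when later code expects a DataFrame.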
Filtering Data using the Boolean Index with .iloc
The .iloc indexer works in a similar way, but it selects by integer position instead of by label.
Example: Using .iloc with a Boolean Index
This example uses the .iloc indexer for positional selection. Because .iloc works by position, it expects a plain boolean array and not a labelled Series, so the boolean index is first converted to an array with the .values attribute. The rows are then filtered in the same way as with .loc, and the column is given by its position.
    import pandas as pd

    # Create a Pandas DataFrame
    df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=['A', 'B'])

    # Display the DataFrame
    print("Input DataFrame:\n", df)

    # Create Boolean Index
    s = (df['A'] > 2)

    # Filter data using .iloc and the Boolean Index
    print('Output Filtered Data:\n', df.iloc[s.values, 1])
Following is the output of the above code −
    Input DataFrame:
        A  B
    0  1  2
    1  3  4
    2  5  6
    Output Filtered Data:
     1    4
    2    6
    Name: B, dtype: int64
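As a small variation on the example above, the sketch below uses Series.to_numpy() in place of .values and also tries passing the Series itself to .iloc. It assumes that .iloc rejects a labelled boolean Series, which is what the conversion step implies; the exact exception type may differ between pandas versions, so the code catches two likely candidates.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=['A', 'B'])
s = df['A'] > 2

# .to_numpy() returns the same boolean array as .values
via_values = df.iloc[s.values, 1]
via_to_numpy = df.iloc[s.to_numpy(), 1]
print(via_values.equals(via_to_numpy))

# Passing the labelled Series itself is expected to fail, because
# .iloc is position-based and does not align on index labels
try:
    df.iloc[s, 1]
    series_accepted = True
except (NotImplementedError, ValueError) as exc:
    series_accepted = False
    print(type(exc).__name__, exc)
```

If the mask and the DataFrame share the same index, .loc with the Series is usually the simpler choice, since no conversion is needed.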
Advanced Boolean Indexing with Multiple Conditions
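Pandas supports more complex boolean indexing by combining multiple conditions with the operators & (and), | (or), and ~ (not). Wrap each condition in parentheses, because these operators bind more tightly than comparison operators such as > and <. The conditions can refer to different columns, which makes it possible to build highly specific filters.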
Example: Using Multiple Conditions Across Columns
The following example demonstrates how to apply boolean indexing with multiple conditions across columns.
    import pandas as pd

    # Create a DataFrame
    df = pd.DataFrame({'A': [1, 3, 5, 7],
                       'B': [5, 2, 8, 4],
                       'C': ['x', 'y', 'x', 'z']})

    # Display the DataFrame
    print("Input DataFrame:\n", df)

    # Apply multiple conditions using boolean indexing
    result = df.loc[(df['A'] > 2) & (df['B'] < 5), 'A':'C']
    print('Output Filtered DataFrame:\n', result)
Following is the output of the above code −
    Input DataFrame:
        A  B  C
    0  1  5  x
    1  3  2  y
    2  5  8  x
    3  7  4  z
    Output Filtered DataFrame:
        A  B  C
    1  3  2  y
    3  7  4  z
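The example above uses only the & operator. The following sketch, built on the same DataFrame, shows | and ~ as well, and stores masks in variables before combining them. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 5, 7],
                   'B': [5, 2, 8, 4],
                   'C': ['x', 'y', 'x', 'z']})

# OR: keep rows where either condition holds
either = df[(df['A'] > 5) | (df['C'] == 'x')]

# NOT: invert a mask to keep the rows it would have excluded
not_x = df[~(df['C'] == 'x')]

# Masks can be named first and combined later,
# which keeps long filters readable
big_a = df['A'] > 2
small_b = df['B'] < 5
combined = df[big_a & small_b]

print(either)
print(not_x)
print(combined)
```

Note that Python's own and, or, and not keywords do not work element-wise on a Series; the symbolic operators are required here.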