Creating Pandas dataframe using list of lists
In this article, we will explore how to create a Pandas DataFrame from a list of lists. A Pandas DataFrame is a 2-dimensional labeled data structure whose columns can hold different data types, and it is one of the most commonly used objects in the Pandas library. There are several ways to create a DataFrame; here we focus on building one from a list of lists.
Create Pandas Dataframe using list of lists
There are several ways to create a Pandas DataFrame from a list of lists. The commonly used approaches discussed here are:
- Using pd.DataFrame() Function
- Handling Missing Values
- DataFrame with Different Data Types
- Defining Column Names using DataFrame.columns
Using pd.DataFrame() function
In this example, we will create a list of lists and pass it to the pd.DataFrame() function. We also pass the columns parameter to supply the column names.
Python3
# Import pandas library
import pandas as pd

# Initialize list of lists
data = [['Geeks', 10], ['for', 15], ['geeks', 20]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Name', 'Age'])

# Print the DataFrame
print(df)
Output:
Name Age
0 Geeks 10
1 for 15
2 geeks 20
Let’s see another example with the same implementation as above.
Python3
# Import pandas library
import pandas as pd

# Initialize list of lists
data = [
    ['DS', 'Linked_list', 10],
    ['DS', 'Stack', 9],
    ['DS', 'Queue', 7],
    ['Algo', 'Greedy', 8],
    ['Algo', 'DP', 6],
    ['Algo', 'BackTrack', 5],
]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Category', 'Name', 'Marks'])

# Print the DataFrame
print(df)
Output:
Category Name Marks
0 DS Linked_list 10
1 DS Stack 9
2 DS Queue 7
3 Algo Greedy 8
4 Algo DP 6
5 Algo BackTrack 5
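To make the mapping from lists to rows and columns concrete, here is a minimal sketch. The variable names `rows` and `frame` are illustrative and not part of the examples above. It checks that each inner list becomes one row and that pandas infers a data type per column.

```python
import pandas as pd

# Each inner list is one row; each position inside it is one column.
rows = [['DS', 'Stack', 9], ['Algo', 'Greedy', 8]]
frame = pd.DataFrame(rows, columns=['Category', 'Name', 'Marks'])

# (number of rows, number of columns)
print(frame.shape)

# The labels passed through columns=
print(list(frame.columns))

# True if 'Marks' was inferred as an integer column
print(pd.api.types.is_integer_dtype(frame['Marks']))
```

The length of the columns list is expected to match the length of the inner lists.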
Handling Missing Values
The code below creates a Pandas DataFrame named df from a list of lists in which missing values are represented as None, and then replaces None with NaN. It prints the resulting DataFrame, which holds information about individuals: names, ages, and occupations.
Python3
import pandas as pd
import numpy as np

# Creating a DataFrame with missing values from a list of lists
data = [['Geek1', 28, 'Engineer'],
        ['Geek2', None, 'Data Scientist'],
        ['Geek3', 32, None]]
columns = ['Name', 'Age', 'Occupation']

df = pd.DataFrame(data, columns=columns)

# Replacing None with NaN for missing values
df = df.replace({None: np.nan})

print(df)
Output :
Name Age Occupation
0 Geek1 28.0 Engineer
1 Geek2 NaN Data Scientist
2 Geek3 32.0 NaN
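Once the DataFrame is built, it is often useful to count how many values are missing in each column. The following is a minimal sketch using `isna()`, which treats both None and NaN as missing. The names `missing_mask` and `missing_per_column` are illustrative.

```python
import pandas as pd

data = [['Geek1', 28, 'Engineer'],
        ['Geek2', None, 'Data Scientist'],
        ['Geek3', 32, None]]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Occupation'])

# isna() returns a boolean DataFrame: True wherever a value is missing.
missing_mask = df.isna()

# Summing the booleans gives the number of missing values per column.
missing_per_column = missing_mask.sum()
print(missing_per_column)

# The 'Age' column is stored as float so that it can hold NaN.
print(pd.api.types.is_float_dtype(df['Age']))
```

This also explains why the ages in the output above appear as 28.0 and 32.0 rather than as integers.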
DataFrame With Different Data Types
The code below creates a Pandas DataFrame from a list of lists and converts the ‘Age’ column to numeric format with pd.to_numeric(). With errors='coerce', a value that cannot be parsed as a number becomes NaN instead of raising an error. The ‘Age’ values, initially a mix of numbers and strings, are converted to numeric format.
Python3
import pandas as pd

# Creating a DataFrame with different data types from a list of lists
data = [['Geek1', 28, 'Engineer'],
        ['Geek2', 25, 'Data Scientist'],
        ['Geek3', '32', 'Manager']]  # Age represented as a string
columns = ['Name', 'Age', 'Occupation']

df = pd.DataFrame(data, columns=columns)

# Convert 'Age' column to numeric, handling errors
df['Age'] = pd.to_numeric(df['Age'], errors='coerce')

print(df)
Output :
Name Age Occupation
0 Geek1 28 Engineer
1 Geek2 25 Data Scientist
2 Geek3 32 Manager
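To show what errors='coerce' changes, here is a minimal sketch on a standalone Series. The entry 'unknown' is a hypothetical bad value added only for illustration; it does not appear in the example above. Numeric strings such as '32' are parsed, and only values that cannot be parsed become NaN.

```python
import pandas as pd

# 'unknown' is a hypothetical bad entry, added only for illustration.
ages = pd.Series([28, '32', 'unknown'])

# Default behaviour (errors='raise'): an unparseable value raises ValueError.
try:
    pd.to_numeric(ages)
    raised = False
except ValueError:
    raised = True

# errors='coerce': parseable strings are converted, the rest become NaN.
converted = pd.to_numeric(ages, errors='coerce')

print(raised)
print(converted.tolist())
```

Because NaN is a floating-point value, a column that contains a coerced entry is stored as float.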
Defining column names using the DataFrame.columns attribute
We can also create the DataFrame without the columns parameter and assign the names afterwards through df.columns, which is an attribute rather than a function. This example also performs an operation on the DataFrame, namely transpose.
In this example, the code below creates a DataFrame from a list of lists, assigns the column names (‘Col_1’, ‘Col_2’, ‘Col_3’), prints the original DataFrame, transposes it, and prints the result. Transposing swaps the rows and columns of the DataFrame.
Python3
# Import pandas library
import pandas as pd

# Initialize list of lists
data = [[1, 5, 10], [2, 6, 9], [3, 7, 8]]

# Create the pandas DataFrame
df = pd.DataFrame(data)

# Specify column names
df.columns = ['Col_1', 'Col_2', 'Col_3']

# Print the DataFrame
print(df, "\n")

# Transpose of the DataFrame
df = df.transpose()
print("Transpose of above dataframe is-\n", df)
Output :
Col_1 Col_2 Col_3
0 1 5 10
1 2 6 9
2 3 7 8
Transpose of above dataframe is-
0 1 2
Col_1 1 2 3
Col_2 5 6 7
Col_3 10 9 8
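As a quick check, the following sketch compares naming the columns at construction time with assigning df.columns afterwards, and compares transpose() with its shorthand `.T`. The variable names `names`, `at_creation`, and `afterwards` are illustrative.

```python
import pandas as pd

data = [[1, 5, 10], [2, 6, 9], [3, 7, 8]]
names = ['Col_1', 'Col_2', 'Col_3']

# Option 1: name the columns when the DataFrame is created.
at_creation = pd.DataFrame(data, columns=names)

# Option 2: create with default integer labels, then assign the attribute.
afterwards = pd.DataFrame(data)
afterwards.columns = names

# Both routes should produce the same DataFrame.
same_frame = at_creation.equals(afterwards)
print(same_frame)

# .T is a shorthand accessor for transpose().
same_transpose = at_creation.T.equals(at_creation.transpose())
print(same_transpose)
```

Assigning df.columns is convenient when the names are only known after the data has been loaded; otherwise passing columns directly keeps the creation in one step.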