Class XII (As Per CBSE Board) : Informatics Practices
Syllabus 2021-22
Chapter 1: Data Handling using Pandas -1
Visit : python.mykvs.in for regular updates
2. DataFrame
A DataFrame is like a two-dimensional array with
heterogeneous data.
SR. No.  Admn No  Student Name            Class  Section  Gender  Date Of Birth
1        001284   NIDHI MANDAL            I      A        Girl    07/08/2010
2        001285   SOUMYADIP BHATTACHARYA  I      A        Boy     24/02/2011
3        001286   SHREYAANG SHANDILYA     I      A        Boy     29/12/2010
Basic features of a DataFrame are:
❖ Heterogeneous data
❖ Size Mutable
❖ Data Mutable
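The three features listed above can be seen in a small sketch. The column names and values are illustrative (modelled on the sample student table), and `pd` is only the conventional alias for pandas.

```python
import pandas as pd

# Illustrative data modelled on the student table above.
df = pd.DataFrame({
    "SrNo": [1, 2, 3],
    "AdmnNo": ["001284", "001285", "001286"],
    "Name": ["NIDHI MANDAL", "SOUMYADIP BHATTACHARYA", "SHREYAANG SHANDILYA"],
})

# Heterogeneous data: each column keeps its own dtype.
print(df.dtypes)

# Size mutable: a new column can be added after creation.
df["Section"] = "A"

# Data mutable: an existing value can be changed.
df.loc[0, "Section"] = "B"
print(df)
```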
Pandas Series
A Series is like a one-dimensional array capable of holding data
of any type (integer, string, float, Python objects, etc.).
A Series can be created using the constructor.
Syntax :- pandas.Series( data, index, dtype, copy)
A Series can be created from
1. Array (ndarray)
2. Dict
3. Scalar value or constant
Pandas Series
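e.g.
Output (empty Series)
Series([], dtype: float64)
Output (Series from an array)
0 a
1 b
2 c
3 d
dtype: object
Note : default index is starting from 0
Output (Series from an array with index 100 to 103)
100 a
101 b
102 c
103 d
dtype: object
Note : index is starting from 100
Output (Series from a dict)
a 0.0
b 1.0
c 2.0
dtype: float64
Output (Series from a dict with index b, c, d, a)
b 1.0
c 2.0
d NaN
a 0.0
dtype: float64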
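The code that produced these outputs is not shown on the slides. The sketch below is one plausible reconstruction, assuming the Series were built from an empty constructor call, an ndarray, a dict and a scalar; all variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Empty Series; dtype is passed explicitly so the result does not
# depend on the pandas version.
s_empty = pd.Series(dtype="float64")

# From an ndarray: the default index starts from 0.
data = np.array(["a", "b", "c", "d"])
s_default = pd.Series(data)

# From an ndarray with an explicit index starting from 100.
s_custom = pd.Series(data, index=[100, 101, 102, 103])

# From a dict: the keys become the index labels.
d = {"a": 0.0, "b": 1.0, "c": 2.0}
s_dict = pd.Series(d)

# From a dict with an explicit index: labels missing from the dict get NaN.
s_reindexed = pd.Series(d, index=["b", "c", "d", "a"])

# From a scalar: the value is repeated for every index label.
s_scalar = pd.Series(5, index=[0, 1, 2, 3])

for s in (s_empty, s_default, s_custom, s_dict, s_reindexed, s_scalar):
    print(s, end="\n\n")
```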
Pandas Series
Head function
e.g
Output
a 1
b 2
c 3
dtype: int64
Return first 3 elements
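A minimal sketch that would give this output, assuming a Series of five integers labelled a to e (the slide's own code is not shown):

```python
import pandas as pd

# Sample Series; the values and labels are assumed from the output above.
s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

# head(n) returns the first n elements; n defaults to 5.
first_three = s.head(3)
print(first_three)
```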
Pandas Series
tail function
e.g
Output
c 3
d 4
e 5
dtype: int64
Return last 3 elements
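The matching sketch for tail(), under the same assumption of a five-element Series labelled a to e:

```python
import pandas as pd

# Sample Series; the values and labels are assumed from the output above.
s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

# tail(n) returns the last n elements; n defaults to 5.
last_three = s.tail(3)
print(last_three)
```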
Pandas Series
Retrieve Data Using Label (Index)
e.g.
Output
c 3
d 4
dtype: int64
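One way to get this output is to pass a list of labels in square brackets. The Series below is an assumed example, since the slide's code is not shown.

```python
import pandas as pd

# Sample Series; the values and labels are assumed from the output above.
s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

# A single label returns one value.
print(s["a"])

# A list of labels returns a Series with just those labels.
selected = s[["c", "d"]]
print(selected)
```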
Pandas Series
Retrieve Data from selection
There are three methods for data selection:
▪ loc gets rows (or columns) with particular labels from
the index.
▪ iloc gets rows (or columns) at particular positions in
the index (so it only takes integers).
▪ ix usually tries to behave like loc but falls back to
behaving like iloc if a label is not present in the index.
ix was deprecated and has been removed from pandas (since version 1.0);
use loc and iloc instead.
Pandas Series
Retrieve Data from selection
e.g.
>>> s = pd.Series(np.nan, index=[49,48,47,46,45, 1, 2, 3, 4, 5])
>>> s.iloc[:3] # slice the first three rows
49 NaN
48 NaN
47 NaN
>>> s.loc[:3] # slice up to and including label 3
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
2 NaN
3 NaN
>>> s.ix[:3] # the integer is in the index so s.ix[:3] works like loc
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
2 NaN
3 NaN
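The same comparison can be run as a script using only loc and iloc, which work in current pandas. The variable names by_position and by_label are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.nan, index=[49, 48, 47, 46, 45, 1, 2, 3, 4, 5])

# iloc works on positions: rows 0, 1 and 2.
by_position = s.iloc[:3]

# loc works on labels: everything up to and including label 3.
by_label = s.loc[:3]

print(list(by_position.index))
print(list(by_label.index))
```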
Pandas DataFrame
It is a two-dimensional data structure, just like any table
(with rows & columns).
Basic Features of DataFrame
Columns may be of different types
Size can be changed (mutable)
Labeled axes (rows / columns)
Arithmetic operations on rows and columns
Structure
[Figure: structure of a DataFrame, with rows labelled]
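Labeled axes and arithmetic on rows and columns can be sketched as follows. The marks table is invented for illustration only.

```python
import pandas as pd

# Illustrative marks table; row and column labels are made up.
df = pd.DataFrame(
    {"maths": [80, 65], "science": [70, 75]},
    index=["freya", "mohak"],
)

# Labeled axes: rows and columns both carry labels.
print(list(df.index), list(df.columns))

# Column totals, computed before any new column is added.
column_totals = df.sum(axis=0)
print(column_totals)

# Arithmetic on columns: element-wise sum of two columns.
df["total"] = df["maths"] + df["science"]
print(df)
```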
Pandas DataFrame
Create a DataFrame from Lists
e.g.1
import pandas as pd1
data1 = [1,2,3,4,5]
df1 = pd1.DataFrame(data1)
print (df1)
output
   0
0  1
1  2
2  3
3  4
4  5
e.g.2
import pandas as pd1
data1 = [['Freya',10],['Mohak',12],['Dwivedi',13]]
df1 = pd1.DataFrame(data1,columns=['Name','Age'])
print (df1)
output
      Name  Age
0    Freya   10
1    Mohak   12
2  Dwivedi   13
Pandas DataFrame
Create a DataFrame from Dict of ndarrays / Lists
e.g.1
import pandas as pd1
data1 = {'Name':['Freya', 'Mohak'],'Age':[9,10]}
df1 = pd1.DataFrame(data1)
print (df1)
Output
    Name  Age
0  Freya    9
1  Mohak   10
To use custom row labels, write the statement below as the 3rd statement in the above program (one label per row):
df1 = pd1.DataFrame(data1, index=['rank1','rank2'])
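The heading also mentions ndarrays. A hedged sketch of the same example with NumPy arrays in place of lists is shown below; every array must have the same length, and the index must supply exactly one label per row.

```python
import numpy as np
import pandas as pd1

# Same data as above, held in ndarrays instead of lists.
data1 = {'Name': np.array(['Freya', 'Mohak']), 'Age': np.array([9, 10])}

# Two rows, so exactly two index labels are needed.
df1 = pd1.DataFrame(data1, index=['rank1', 'rank2'])
print(df1)
```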
Pandas DataFrame
Create a DataFrame from List of Dicts
e.g.1
import pandas as pd1
data1 = [{'x': 1, 'y': 2},{'x': 5, 'y': 4, 'z': 5}]
df1 = pd1.DataFrame(data1)
print (df1)
Output
x y z
0 1 2 NaN
1 5 4 5.0
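The NaN in column z appears because the first dict has no 'z' key. A short check, reusing the same data, shows which cell is missing and how the column types come out:

```python
import pandas as pd1

data1 = [{'x': 1, 'y': 2}, {'x': 5, 'y': 4, 'z': 5}]
df1 = pd1.DataFrame(data1)

# Column z holds a missing value, so it is stored as float64;
# x and y have no gaps and stay integer.
print(df1.dtypes)

# True marks the row where 'z' was missing in the input.
missing_z = df1['z'].isna().tolist()
print(missing_z)
```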
Column Deletion
del df1['one'] # Deleting the column 'one' using the del statement
df1.pop('two') # Deleting another column using the pop() method
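A runnable sketch of both deletions, assuming a small frame with columns 'one', 'two' and 'three' (the values are illustrative). The difference is that pop() hands back the removed column, while del returns nothing.

```python
import pandas as pd1

# Illustrative frame; the column names follow the statements above.
df1 = pd1.DataFrame({'one': [1, 2, 3], 'two': [4, 5, 6], 'three': [7, 8, 9]})

del df1['one']              # removes the column in place
removed = df1.pop('two')    # removes the column and returns it as a Series

print(list(df1.columns))
print(removed.tolist())
```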
Rename columns
>>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
>>> df.rename(columns={"A": "a", "B": "c"})
   a  c
0  1  4
1  2  5
2  3  6
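Note that rename() returns a new DataFrame and leaves the original unchanged, so the result has to be assigned to keep it. A small sketch (the name renamed is illustrative):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# rename() does not modify df; it returns a renamed copy.
renamed = df.rename(columns={"A": "a", "B": "c"})

print(list(df.columns))       # original column names
print(list(renamed.columns))  # new column names
```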
Pandas DataFrame
Row Selection, Addition, and Deletion
#Selection by Label
import pandas as pd1
d1 = {'one' : pd1.Series([1, 2, 3], index=['a', 'b', 'c']),
      'two' : pd1.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df1 = pd1.DataFrame(d1)
print (df1.loc['b'])
Output
one 2.0
two 2.0
Name: b, dtype: float64
Pandas DataFrame
#Selection by integer location
import pandas as pd1
d1 = {'one' : pd1.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd1.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df1 = pd1.DataFrame(d1)
print (df1.iloc[2])
Output
one 3.0
two 3.0
Name: c, dtype: float64
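The values print as 2.0 and 3.0 rather than 2 and 3 because column 'one' has no entry for label 'd'; pandas fills that cell with NaN and stores the column as float. The sketch below shows this, along with a position slice that returns several rows.

```python
import pandas as pd1

d1 = {'one' : pd1.Series([1, 2, 3], index=['a', 'b', 'c']),
      'two' : pd1.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df1 = pd1.DataFrame(d1)

# Row 'd' exists only in 'two', so 'one' is NaN there.
print(df1.loc['d'])

# iloc also accepts slices: positions 1 and 2 are rows 'b' and 'c'.
middle_rows = df1.iloc[1:3]
print(middle_rows)
```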
Pandas DataFrame
Addition of Rows
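import pandas as pd1
# df1 and df2 are existing DataFrames with the same columns.
# DataFrame.append() was removed in pandas 2.0; concat() adds the rows instead.
df1 = pd1.concat([df1, df2])
print (df1)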
Deletion of Rows
# Drop rows with label 0
df1 = df1.drop(0)
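A complete sketch of adding and then deleting rows, assuming pandas 2.x, where rows are added with concat() because DataFrame.append() is no longer available. The two small frames are illustrative.

```python
import pandas as pd1

# Illustrative frames with the same columns.
df1 = pd1.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd1.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])

# Row labels are kept, so labels 0 and 1 each appear twice.
df1 = pd1.concat([df1, df2])
print(df1)

# drop(0) removes every row whose label is 0.
df1 = df1.drop(0)
print(df1)
```

Passing ignore_index=True to concat() would renumber the rows 0 to 3 instead, in which case drop(0) removes only one row.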
Pandas DataFrame
Iterate over rows in a dataframe
e.g.
import pandas as pd1
raw_data1 = {'name': ['freya', 'mohak'],
             'age': [10, 1],
             'favorite_color': ['pink', 'blue'],
             'grade': [88, 92]}
df1 = pd1.DataFrame(raw_data1, columns = ['name', 'age',
                                          'favorite_color', 'grade'])
for index, row in df1.iterrows():
    print (row["name"], row["age"])
Output
freya 10
mohak 1
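As an alternative sketch, itertuples() yields each row as a named tuple, so columns can be read as attributes. The list pairs is only there to collect the results for inspection.

```python
import pandas as pd1

raw_data1 = {'name': ['freya', 'mohak'], 'age': [10, 1]}
df1 = pd1.DataFrame(raw_data1)

pairs = []
# index=False leaves the row label out of each tuple.
for row in df1.itertuples(index=False):
    print(row.name, row.age)
    pairs.append((row.name, row.age))
```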
Pandas DataFrame
Head & Tail
head() returns the first n rows (observe the index values). The default number of
rows returned is five, but you may pass a custom number. tail() returns the
last n rows, e.g.
import pandas as pd
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
     'Age':pd.Series([25,26,25,23,30,29,23]),
     'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print ("Our data frame is:")
print (df)
print ("The first two rows of the data frame are:")
print (df.head(2))
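The example above only calls head(). A matching sketch for tail(), and for head() with its default of five rows, using two of the same columns:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Tom', 'James', 'Ricky', 'Vin', 'Steve', 'Smith', 'Jack'],
    'Age': [25, 26, 25, 23, 30, 29, 23],
})

# tail(2) returns the last two rows; note the index values 5 and 6.
last_two = df.tail(2)
print(last_two)

# With no argument, head() returns the first five rows.
first_five = df.head()
print(first_five)
```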
Pandas DataFrame
Indexing a DataFrame using .loc[ ] :
The .loc[ ] indexer selects data by the labels of the rows and columns.
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(8, 4),
index = ['a','b','c','d','e','f','g','h'], columns = ['A', 'B', 'C', 'D'])
# dictionary of lists
dict = {'name':["Mohak", "Freya", "Roshni"],
        'degree': ["MBA", "BCA", "M.Tech"],
        'score':[90, 40, 80]}