OneCampus - Python Programming Textbook
1.Python Introduction
2.Python Getting Started
3.Python Syntax
4.Python Comments
5.Python Variables
6.Python Operators
7.Python Data Types
8.Python Numbers
9.Python Strings
10.Python Casting
11.Python Booleans
12.Python Lists
13.Python Tuples
14.Python range() Function
15.Python Sets
16.Python frozenset() Function
17.Python Dictionaries
18.Python math Module
19.Python User Input
20.Python eval() Function
21.Python If ... Else
22.Python While Loops
23.Python For Loops
24.Python Arrays
25.Python Functions
26.Python Lambda
27.Python Classes and Objects
28.Python Inheritance
29.Python Iterators
30.Python Scope
31.Python Modules
32.Python Datetime
33.Python Math
34.Python JSON
35.Python RegEx
36.Python PIP
37.Python Try Except
38.Python String Formatting
39.Python File Handling
40.NumPy Tutorial
41.Pandas Tutorial
42.SciPy Tutorial
01.Python Introduction
What is Python?
Python is a popular programming language. It was created by Guido van Rossum, and released in
1991.
It is used for:
Good to know
● The most recent major version of Python is Python 3, which we shall be using in this
tutorial. However, Python 2, although not being updated with anything other than
Syntax
print(object(s), sep=separator, end=end, file=file, flush=flush)
Parameter Values
Parameter         Description
object(s)         Any object, and as many as you like. Will be converted to a string before being printed
sep='separator'   Optional. Specify how to separate the objects, if there is more than one. Default is ' '
end='end'         Optional. Specify what to print at the end. Default is '\n' (line feed)
flush             Optional. A Boolean, specifying if the output is flushed (True) or buffered (False). Default is False
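The parameters above can be tried out with a short sketch. The sample values and the use of an in-memory buffer (io.StringIO) as the file target are illustrative assumptions, not part of the original text:

```python
import io

# Separate the objects with " - " instead of the default single space.
print("apple", "banana", "cherry", sep=" - ")

# Replace the default line feed so the next print continues on the same line.
print("Loading", end="")
print("...done")

# Send the output to an in-memory text buffer through the file parameter.
buffer = io.StringIO()
print(1, 2, 3, sep=",", end="!\n", file=buffer)
captured = buffer.getvalue()
print(repr(captured))
```

Any object with a write() method can be passed as file; when it is omitted, the output goes to the console.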
02.Python Getting Started
Python Install
Many PCs and Macs will have Python already installed.
To check if you have Python installed on a Windows PC, search in the start bar for Python or run
the following on the Command Line (cmd.exe):
C:\Users\Your Name>python --version
To check if you have Python installed on Linux or Mac, open the command line (on Linux)
or the Terminal (on Mac), and type:
python --version
If you find that you do not have Python installed on your computer, you can download it for
free from the following website: https://www.python.org/
Python Quickstart
Python is an interpreted programming language. This means that, as a developer, you write Python
(.py) files in a text editor and then pass those files to the Python interpreter to be executed.
To run a Python file from the command line:
C:\Users\Your Name>python helloworld.py
Where "helloworld.py" is the name of your Python file.
Let's write our first Python file, called helloworld.py, which can be done in any text editor.
print("Hello, World!")
Simple as that. Save your file. Open your command line, navigate to the directory where you
saved your file, and run:
C:\Users\Your Name>python helloworld.py
The output should read: Hello, World!
Congratulations, you have written and executed your first Python program!
The Python Command Line
To test a short piece of code in Python, it is sometimes quickest and easiest not to write the
code in a file. This is possible because Python can be run as an interactive command line itself.
Type the following on the Windows, Mac, or Linux command line:
C:\Users\Your Name>python
Or, if the "python" command did not work, you can try "py":
C:\Users\Your Name>py
From there you can write any Python code, including our hello world example from earlier in the
tutorial:
C:\Users\Your Name>python
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:04:45) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print("Hello, World!")
Which will write "Hello, World!" in the command line:
C:\Users\Your Name>python
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:04:45) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print("Hello, World!")
Hello, World!
Whenever you are done in the Python command line, you can simply type the following to quit
the Python command-line interface: exit()
03.Python Syntax
Execute Python Syntax
As we learned in the previous chapter, Python syntax can be executed by writing directly in the
Command Line:
>>> print("Hello, World!")
Hello, World!
Or by creating a Python file, using the .py file extension, and running it in the
Command Line:
C:\Users\Your Name>python myfile.py
Python Indentation
Indentation refers to the spaces at the beginning of a code line.
Where in other programming languages the indentation in code is for readability only, the
indentation in Python is very important.
Python uses indentation to indicate a block of code.
Example
if 5 > 2:
    print("Five is greater than two!")
Python will give you an error if you skip the indentation!
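To see the error without crashing a script, the sketch below compiles two small source strings with the built-in compile() function. The variable names and the choice of compile() are illustrative assumptions:

```python
# The same if statement, once with and once without the indented body.
source_ok = "if 5 > 2:\n    print('Five is greater than two!')\n"
source_bad = "if 5 > 2:\nprint('Five is greater than two!')\n"

# The correctly indented version compiles without complaint.
compile(source_ok, "<example>", "exec")

# The version without indentation is rejected before it ever runs.
try:
    compile(source_bad, "<example>", "exec")
    error_name = None
except IndentationError as err:
    error_name = type(err).__name__
    print(error_name, "-", err.msg)
```

The number of spaces is up to you (four is the most common choice), but it has to be the same for every line within one block.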
04.Python Comments
● Comments can be used to explain Python code.
● Comments can be used to make the code more readable.
● Comments can be used to prevent execution when testing code.
Creating a Comment
Comments start with a #, and Python will ignore them:
#This is a comment
print("Hello, World!")
Comments can be placed at the end of a line, and Python will ignore the rest of the line:
print("Hello, World!") #This is a comment
A comment does not have to be text that explains the code; it can also be used to prevent
Python from executing code:
#print("Hello, World!")
print("Cheers, Mate!")
Multi-Line Comments
Python does not really have a syntax for multi-line comments.
To add a multiline comment you could insert a # for each line.
#This is a comment
#written in
#more than just one line
print("Hello, World!")
Or, not quite as intended, you can use a multiline string.
Since Python will ignore string literals that are not assigned to a variable, you can add a multiline
string (triple quotes) in your code, and place your comment inside it:
"""
This is a comment
written in
more than just one line
"""
print("Hello, World!")
05.Python Variables
Variables
Variables are containers for storing data values.
Creating Variables
Python has no command for declaring a variable.
A variable is created the moment you first assign a value to it.
x = 5
y = "John"
print(x)
print(y)
Variables do not need to be declared with any particular type, and can even change type
after they have been set.
x = 4 # x is of type int
x = "Sally" # x is now of type str
print(x)
Casting
If you want to specify the data type of a variable, this can be done with casting.
x = str(3) # x will be '3'
y = int(3) # y will be 3
z = float(3) # z will be 3.0
Case-Sensitive
Variable names are case-sensitive.
a = 4
A = "Sally"
#A will not overwrite a
Camel Case
Each word, except the first, starts with a capital letter:
myVariableName = "John"
Pascal Case
Each word starts with a capital letter:
MyVariableName = "John"
Snake Case
Each word is separated by an underscore character:
my_variable_name = "John"
Unpack a Collection
If you have a collection of values in a list, tuple, etc., Python allows you to extract the values into
variables. This is called unpacking.
Learn more about unpacking in the Tuples chapter.
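A minimal sketch of unpacking, using an illustrative list of fruits. It also shows what happens when the number of variables does not match the number of values:

```python
fruits = ["apple", "banana", "cherry"]

# One variable per value: the values are assigned in order.
x, y, z = fruits
print(x)
print(y)
print(z)

# Too few variables for the values raises a ValueError.
try:
    a, b = fruits
    mismatch_error = None
except ValueError as err:
    mismatch_error = type(err).__name__
    print(mismatch_error, "-", err)
```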
06.Python Operators
Python Operators
Operators are used to perform operations on variables and values.
In the example below, we use the + operator to add together two values:
Example
print(10 + 5)
Python divides the operators into the following groups:
● Arithmetic operators
● Assignment operators
● Comparison operators
● Logical operators
● Identity operators
● Membership operators
● Bitwise operators
Operator   Description                                                                         Example
in         Returns True if a sequence with the specified value is present in the object        x in y
not in     Returns True if a sequence with the specified value is not present in the object    x not in y
==         Equal                                                                               x == y
!=         Not equal                                                                           x != y
and        Returns True if both statements are true                                            x < 5 and x < 10
not        Reverse the result, returns False if the result is true                             not(x < 5 and x < 10)
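The operators listed above can be tried with a short sketch. The list of fruits and the value of x are illustrative assumptions:

```python
fruits = ["apple", "banana", "cherry"]
x = 3

# Membership operators test whether a value is present in a collection.
has_banana = "banana" in fruits
no_mango = "mango" not in fruits

# Comparison operators return True or False.
is_three = x == 3
is_not_three = x != 3

# Logical operators combine or reverse conditions.
both_true = x < 5 and x < 10
reversed_result = not (x < 5 and x < 10)

print(has_banana, no_mango, is_three, is_not_three, both_true, reversed_result)
```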
Python Bitwise Operators
Bitwise operators are used to compare (binary) numbers:
Operator   Name                   Description
<<         Zero fill left shift   Shift left by pushing zeros in from the right (Python integers have unlimited length, so no bits are lost on the left)
>>         Signed right shift     Shift right by pushing copies of the leftmost bit in from the left, and let the rightmost bits fall off
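A small sketch of the two shift operators, with illustrative values. Because Python integers are not limited to a fixed number of bits, a left shift simply produces a larger number:

```python
# 5 is 0b101; shifting left by 2 gives 0b10100, which is 20.
left = 5 << 2

# 20 is 0b10100; shifting right by 2 drops the two rightmost bits.
right = 20 >> 2

# The sign is preserved when shifting a negative number to the right.
negative = -8 >> 1

# A large left shift does not overflow; the integer just grows.
big = 1 << 100

print(left, right, negative, big)
```

Shifting left by n is the same as multiplying by 2 to the power n, and shifting right by n is the same as floor division by 2 to the power n.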
07.Python Data Types
Built-in Data Types
In programming, the data type is an important concept.
Variables can store data of different types, and different types can do different things.
Python has the following data types built-in by default, in these categories:
08.Python Numbers
Python Numbers
There are three numeric types in Python:
● int
● float
● complex
Variables of numeric types are created when you assign a value to them:
Example
x = 1 # int
y = 2.8 # float
z = 1j # complex
To verify the type of any object in Python, use the type() function:
Example
print(type(x))
print(type(y))
print(type(z))
Int
Int, or integer, is a whole number, positive or negative, without decimals, of unlimited length.
Example
Integers:
x = 1
y = 35656222554887711
z = -3255522
print(type(x))
print(type(y))
print(type(z))
Float
Float, or "floating-point number", is a number, positive or negative, containing one or more
decimals.
Example
Floats:
x = 1.10
y = 1.0
z = -35.59
print(type(x))
print(type(y))
print(type(z))
Floats can also be written in scientific notation, with an "e" to indicate the power of 10.
Example
Floats:
x = 35e3
y = 12E4
z = -87.7e100
print(type(x))
print(type(y))
print(type(z))
Complex
Complex numbers are written with a "j" as the imaginary part:
Example
Complex:
x = 3+5j
y = 5j
z = -5j
print(type(x))
print(type(y))
print(type(z))
Type Conversion
You can convert from one type to another with the int(), float(), and complex() functions:
Example
Convert from one type to another:
x = 1 # int
y = 2.8 # float
z = 1j # complex
#convert from int to float:
a = float(x)
#convert from float to int:
b = int(y)
#convert from int to complex:
c = complex(x)
print(a)
print(b)
print(c)
print(type(a))
print(type(b))
print(type(c))
Note: You cannot convert complex numbers into another number type.
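The note above can be checked with a short sketch. The sample values are illustrative; the .real and .imag attributes and abs() are standard ways to get ordinary numbers out of a complex value, although they are not covered in the text above:

```python
z = 3 + 4j

# int() and float() refuse a complex argument.
try:
    int(z)
    conversion_error = None
except TypeError as err:
    conversion_error = type(err).__name__
    print(conversion_error, "-", err)

# The parts of a complex number are available as floats.
real_part = z.real
imag_part = z.imag
magnitude = abs(z)
print(real_part, imag_part, magnitude)

# Converting a float to int cuts off the decimals (no rounding).
print(int(2.8), int(-2.8))
```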
09.Python Strings
Strings
Strings in Python are surrounded by either single quotation marks or double quotation marks.
'hello' is the same as "hello".
You can display a string literal with the print() function:
Example
print("Hello")
print('Hello')
Multiline Strings
You can assign a multiline string to a variable by using three quotes:
Example
You can use three double quotes:
a = """Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua."""
print(a)
Or three single quotes:
Example
a = '''Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.'''
print(a)
Note: in the result, the line breaks are inserted at the same position as in the code.
Strings are Arrays
Like many other popular programming languages, strings in Python are sequences of
Unicode characters.
However, Python does not have a character data type; a single character is simply a string with a
length of 1.
Square brackets can be used to access elements of the string.
Example
Get the character at position 1 (remember that the first character has the position 0):
a = "Hello, World!"
print(a[1])
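The following sketch illustrates that indexing works on characters rather than on bytes. The sample word and the use of encode() with UTF-8 are assumptions chosen for illustration:

```python
text = "café"

# Indexing and len() count characters.
print(len(text))
print(text[3])

# A single character is itself a string of length 1.
print(type(text[3]))

# The byte representation can be longer: "é" takes two bytes in UTF-8.
encoded = text.encode("utf-8")
print(len(encoded))
```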
String Length
To get the length of a string, use the len() function.
Example
The len() function returns the length of a string:
a = "Hello, World!"
print(len(a))
Check String
To check if a certain phrase or character is present in a string, we can use the keyword in.
Example
Check if "free" is present in the following text:
txt = "The best things in life are free!"
print("free" in txt)
Use it in an if statement:
Example
Print only if "free" is present:
txt = "The best things in life are free!"
if "free" in txt:
    print("Yes, 'free' is present.")
Learn more about If statements in our Python If...Else chapter.
Check if NOT
To check if a certain phrase or character is NOT present in a string, we can use the keyword not
in.
Example
Check if "expensive" is NOT present in the following text:
txt = "The best things in life are free!"
print("expensive" not in txt)
Use it in an if statement:
Example
Print only if "expensive" is NOT present:
txt = "The best things in life are free!"
if "expensive" not in txt:
    print("Yes, 'expensive' is NOT present.")
Negative Indexing
Use negative indexes to start the slice from the end of the string:
Example
Get the characters:
From: "o" in "World!" (position -5)
To, but not included: "d" in "World!" (position -2):
b = "Hello, World!"
print(b[-5:-2])
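A few more slices of the same string may help to see how positions are counted. This is a sketch; the positive-index slices are added for comparison and are not part of the example above:

```python
b = "Hello, World!"

# Negative positions count from the end: -1 is the last character.
print(b[-1])
print(b[-5:-2])

# The same idea with positive positions: start included, end not included.
print(b[2:5])

# Leaving out the start or the end slices from the beginning or to the end.
print(b[:5])
print(b[7:])
```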
Upper Case
Example
The upper() method returns the string in upper case:
a = "Hello, World!"
print(a.upper())
Lower Case
Example
The lower() method returns the string in lower case:
a = "Hello, World!"
print(a.lower())
Remove Whitespace
Whitespace is the space before and/or after the actual text, and very often you want to remove
this space.
Example
The strip() method removes any whitespace from the beginning or the end:
a = " Hello, World! "
print(a.strip()) # returns "Hello, World!"
Replace String
Example
The replace() method replaces a string with another string:
a = "Hello, World!"
print(a.replace("H", "J"))
Split String
The split() method returns a list where the text between the specified separator becomes the list
items.
Example
The split() method splits the string into substrings if it finds instances of the separator:
a = "Hello, World!"
print(a.split(",")) # returns ['Hello', ' World!']
Learn more about Lists in our Python Lists chapter.
Example
To add a space between them, add a " ":
a = "Hello"
b = "World"
c = a + " " + b
print(c)
Example
quantity = 3
itemno = 567
price = 49.95
myorder = "I want to pay {2} dollars for {0} pieces of item {1}."
print(myorder.format(quantity, itemno, price))
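The same sentence can be built in other ways. The sketch below compares index placeholders with named placeholders and with an f-string (available from Python 3.6); the named and f-string variants are additions for comparison, not part of the example above:

```python
quantity = 3
itemno = 567
price = 49.95

# Index numbers pick the arguments by position.
by_index = "I want to pay {2} dollars for {0} pieces of item {1}.".format(quantity, itemno, price)

# Named placeholders pick the arguments by keyword.
by_name = "I want to pay {price} dollars for {quantity} pieces of item {itemno}.".format(
    quantity=quantity, itemno=itemno, price=price
)

# An f-string reads the variables directly from the surrounding code.
by_fstring = f"I want to pay {price} dollars for {quantity} pieces of item {itemno}."

# A format specification such as :.2f fixes the number of decimals.
total = f"Total: {quantity * price:.2f}"

print(by_index)
print(total)
```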
Escape Characters
Other escape characters used in Python:
Code Result
\' Single Quote
\\ Backslash
\n New Line
\r Carriage Return
\t Tab
\b Backspace
\f Form Feed
\ooo Octal value
\xhh Hex value
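A short sketch that uses several of the escape codes from the table. The sample strings are illustrative:

```python
# \' places a single quote inside a single-quoted string.
quoted = 'It\'s alright.'

# \\ produces one backslash.
path = "C:\\Users"

# \n starts a new line and \t inserts a tab.
two_lines = "Hello\nWorld"
tabbed = "Hello\tWorld"

# \ooo and \xhh give a character by its octal or hex value.
octal_hello = "\110\145\154\154\157"
hex_hello = "\x48\x65\x6c\x6c\x6f"

print(quoted)
print(path)
print(two_lines)
print(tabbed)
print(octal_hello, hex_hello)
```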
Method Description
capitalize() Converts the first character to upper case
casefold() Converts string into lower case
center() Returns a centered string
isalpha() Returns True if all characters in the string are in the alphabet
isdecimal() Returns True if all characters in the string are decimals
isdigit() Returns True if all characters in the string are digits
isidentifier() Returns True if the string is an identifier
islower() Returns True if all characters in the string are lower case
isnumeric() Returns True if all characters in the string are numeric
rfind() Searches the string for a specified value and returns the last position of where it was found
zfill() Fills the string with a specified number of 0 values at the beginning
rindex() Searches the string for a specified value and returns the last position of where it was found
rjust() Returns a right justified version of the string
rpartition() Returns a tuple where the string is parted into three parts
rsplit() Splits the string at the specified separator, and returns a list
rstrip() Returns a right trim version of the string
split() Splits the string at the specified separator, and returns a list
splitlines() Splits the string at line breaks and returns a list
startswith() Returns True if the string starts with the specified value
strip() Returns a trimmed version of the string
swapcase() Swaps cases, lower case becomes upper case and vice versa
title() Converts the first character of each word to upper case
translate() Returns a translated string
upper() Converts a string into upper case
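A few of the methods from the table, applied to illustrative sample strings. String methods return new values and leave the original string unchanged:

```python
# Case conversion.
print("hello world".title())
print("Hello World".swapcase())

# Padding and alignment.
print("42".zfill(5))
print("hello".rjust(8))

# Searching from the right.
print("hello".rfind("l"))
print("key=value=x".rpartition("="))

# Splitting.
print("a,b,c".rsplit(",", 1))
print("one\ntwo".splitlines())

# Tests that return True or False.
print("Python".startswith("Py"))
print("abc".isalpha(), "abc1".isalpha())
```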
10.Python Casting
Specify a Variable Type
There may be times when you want to specify a type on a variable. This can be done with
casting. Python is an object-orientated language, and as such it uses classes to define data types,
including its primitive types.
Casting in python is therefore done using constructor functions:
● int() - constructs an integer number from an integer literal, a float literal (by removing all
decimals), or a string literal (providing the string represents a whole number)
● float() - constructs a float number from an integer literal, a float literal or a string literal
(providing the string represents a float or an integer)
● str() - constructs a string from a wide variety of data types, including strings, integer
literals and float literals
Example
Integers:
x = int(1) # x will be 1
y = int(2.8) # y will be 2
z = int("3") # z will be 3
Floats:
x = float(1) # x will be 1.0
y = float(2.8) # y will be 2.8
z = float("3") # z will be 3.0
w = float("4.2") # w will be 4.2
Strings:
x = str("s1") # x will be 's1'
y = str(2) # y will be '2'
z = str(3.0) # z will be '3.0'
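The conditions in the list above ("providing the string represents a whole number") matter in practice. A minimal sketch with illustrative values:

```python
# A string that represents a whole number converts directly.
whole = int("3")

# A string with decimals is not accepted by int().
try:
    int("3.5")
    cast_error = None
except ValueError as err:
    cast_error = type(err).__name__
    print(cast_error, "-", err)

# Going through float() first works; the decimals are then removed.
via_float = int(float("3.5"))

# Removing decimals means cutting them off, also for negative numbers.
negative = int(-2.8)

print(whole, via_float, negative)
```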
11.Python Booleans
Booleans represent one of two values: True or False.
Boolean Values
In programming you often need to know if an expression is True or False.
You can evaluate any expression in Python, and get one of two answers, True or False.
When you compare two values, the expression is evaluated and Python returns the Boolean
answer:
Example
print(10 > 9)
print(10 == 9)
print(10 < 9)
When you run a condition in an if statement, Python returns True or False:
Example
Print a message based on whether the condition is True or False:
a = 200
b = 33
if b > a:
    print("b is greater than a")
else:
    print("b is not greater than a")
Example
Evaluate two variables:
x = "Hello"
y = 15
print(bool(x))
print(bool(y))
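The example above returns True for both variables. As a sketch of the opposite case, the values below are the common ones for which bool() returns False; this list goes slightly beyond the text above:

```python
# Values with content evaluate to True.
print(bool("Hello"), bool(15), bool(["apple"]))

# Empty values, zero, and None evaluate to False.
empty_values = ["", 0, 0.0, [], (), {}, None]
results = [bool(value) for value in empty_values]
print(results)

# A non-empty string is True even when its text says otherwise.
print(bool("False"))
```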
12.Python Lists
mylist = ["apple", "banana", "cherry"]
List
Lists are used to store multiple items in a single variable.
Lists are one of 4 built-in data types in Python used to store collections of data; the other 3 are
Tuple, Set, and Dictionary, all with different qualities and usage.
Lists are created using square brackets:
Example
Create a List:
thislist = ["apple", "banana", "cherry"]
print(thislist)
List Items
List items are ordered, changeable, and allow duplicate values.
List items are indexed, the first item has index [0], the second item has index [1] etc.
Ordered
When we say that lists are ordered, it means that the items have a defined order, and that order
will not change.
If you add new items to a list, the new items will be placed at the end of the list.
Note: There are some list methods that will change the order, but in general: the order of the
items will not change.
Changeable
The list is changeable, meaning that we can change, add, and remove items in a list after it has
been created.
Allow Duplicates
Since lists are indexed, lists can have items with the same value:
Example
Lists allow duplicate values:
thislist = ["apple", "banana", "cherry", "apple", "cherry"]
print(thislist)
List Length
To determine how many items a list has, use the len() function:
Example
Print the number of items in the list:
thislist = ["apple", "banana", "cherry"]
print(len(thislist))
type()
From Python's perspective, lists are defined as objects with the data type 'list':
<class 'list'>
Example
What is the data type of a list?
mylist = ["apple", "banana", "cherry"]
print(type(mylist))
Insert Items
To insert a new list item, without replacing any of the existing values, we can use the insert()
method.
The insert() method inserts an item at the specified index:
Example
Insert "watermelon" as the third item:
thislist = ["apple", "banana", "cherry"]
thislist.insert(2, "watermelon")
print(thislist)
Note: As a result of the example above, the list will now contain 4 items.
Insert Items
To insert a list item at a specified index, use the insert() method.
The insert() method inserts an item at the specified index:
Example
Insert an item in the second position:
thislist = ["apple", "banana", "cherry"]
thislist.insert(1, "orange")
print(thislist)
Note: As a result of the examples above, the lists will now contain 4 items.
Extend List
To append elements from another list to the current list, use the extend() method.
Example
Add the elements of tropical to thislist:
thislist = ["apple", "banana", "cherry"]
tropical = ["mango", "pineapple", "papaya"]
thislist.extend(tropical)
print(thislist)
The elements will be added to the end of the list.
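The sketch below puts insert() and extend() side by side and contrasts extend() with append(). The item names are illustrative, and passing a tuple to extend() relies on the fact that it accepts any iterable:

```python
thislist = ["apple", "banana", "cherry"]

# insert() places one item at the given index and shifts the rest.
thislist.insert(2, "watermelon")
print(thislist)

# extend() adds every element of another iterable, here a tuple.
thislist.extend(("kiwi", "orange"))
print(thislist)

# An index beyond the end simply adds the item at the end.
thislist.insert(100, "mango")
print(thislist)

# append() adds its argument as ONE item, even when it is a list.
nested = ["a"]
nested.append(["b", "c"])
print(nested)
```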
List Comprehension
List comprehension offers a shorter syntax when you want to create a new list based on the
values of an existing list.
Example:
Based on a list of fruits, you want a new list, containing only the fruits with the letter "a" in the
name.
Without list comprehension you will have to write a for statement with a conditional test inside:
Example
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = []
for x in fruits:
    if "a" in x:
        newlist.append(x)
print(newlist)
With list comprehension you can do all that with only one line of code:
Example
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = [x for x in fruits if "a" in x]
print(newlist)
The Syntax
newlist = [expression for item in iterable if condition == True]
The return value is a new list, leaving the old list unchanged.
Condition
The condition is like a filter that only accepts the items that evaluate to True.
Example
Only accept items that are not "apple":
newlist = [x for x in fruits if x != "apple"]
The condition if x != "apple" will return True for all elements other than "apple", making the
new list contain all fruits except "apple".
The condition is optional and can be omitted:
Example
With no if statement:
newlist = [x for x in fruits]
Iterable
The iterable can be any iterable object, like a list, tuple, set etc.
Example
You can use the range() function to create an iterable:
newlist = [x for x in range(10)]
Same example, but with a condition:
Example
Accept only numbers lower than 5:
newlist = [x for x in range(10) if x < 5]
Expression
The expression is the current item in the iteration, but it is also the outcome, which you can
manipulate before it ends up like a list item in the new list:
Example
Set the values in the new list to upper case:
newlist = [x.upper() for x in fruits]
You can set the outcome to whatever you like:
Example
Set all values in the new list to 'hello':
newlist = ['hello' for x in fruits]
The expression can also contain conditions, not like a filter, but as a way to manipulate the
outcome:
Example
Return "orange" instead of "banana":
newlist = [x if x != "banana" else "orange" for x in fruits]
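The parts of the syntax can also be combined. As a sketch with the same list of fruits, the comprehensions below use a filter condition and a manipulated expression together; the variable names are illustrative:

```python
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]

# Filter and transform in one step: keep fruits with an "a", in upper case.
with_a_upper = [x.upper() for x in fruits if "a" in x]
print(with_a_upper)

# The expression may compute something else entirely, such as the length.
lengths = [len(x) for x in fruits]
print(lengths)

# The original list is left unchanged.
print(fruits)
```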
The expression in the example above says:
"Return the item if it is not banana, if it is banana return orange".
Sort Descending
To sort descending, use the keyword argument reverse = True:
Example
Sort the list descending:
thislist = ["orange", "mango", "kiwi", "pineapple", "banana"]
thislist.sort(reverse = True)
print(thislist)
Example
Sort the list descending:
thislist = [100, 50, 65, 82, 23]
thislist.sort(reverse = True)
print(thislist)
Reverse Order
What if you want to reverse the order of a list, regardless of the alphabet?
The reverse() method reverses the current sorting order of the elements.
Example
Reverse the order of the list items:
thislist = ["banana", "Orange", "Kiwi", "cherry"]
thislist.reverse()
print(thislist)
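The difference between sorting and reversing can be seen with the mixed-case list from the example. This sketch uses the built-in sorted() function and the key argument, which are standard Python but not introduced in the text above:

```python
thislist = ["banana", "Orange", "Kiwi", "cherry"]

# Default sorting is case-sensitive: capital letters come first.
print(sorted(thislist))

# key=str.lower compares the lower-case versions instead.
print(sorted(thislist, key=str.lower))

# reverse=True sorts in descending order.
print(sorted(thislist, key=str.lower, reverse=True))

# reverse() does not sort at all; it only flips the current order.
flipped = thislist.copy()
flipped.reverse()
print(flipped)
```

sorted() returns a new list, while sort() and reverse() change the list they are called on.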
Method Description
append() Adds an element at the end of the list
clear() Removes all the elements from the list
copy() Returns a copy of the list
count() Returns the number of elements with the specified value
extend() Add the elements of a list (or any iterable), to the end of the current list
index() Returns the index of the first element with the specified value
insert() Adds an element at the specified position
pop() Removes the element at the specified position
remove() Removes the item with the specified value
reverse() Reverses the order of the list
sort() Sorts the list
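A short walk through several of the methods in the table, using an illustrative list. The comments describe the state of the list after each step:

```python
fruits = ["apple", "banana", "cherry", "apple"]

# count() and index() only look at the list.
apple_count = fruits.count("apple")
cherry_index = fruits.index("cherry")

# copy() makes an independent list.
backup = fruits.copy()

# pop() without an argument removes and returns the last item.
last = fruits.pop()

# pop(1) removes and returns the item at index 1.
second = fruits.pop(1)

# remove() deletes the first item with the given value.
fruits.remove("apple")
remaining = fruits.copy()

# clear() empties the list; the backup is not affected.
fruits.clear()

print(apple_count, cherry_index, last, second, remaining, fruits, backup)
```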
13.Python Tuples
mytuple = ("apple", "banana", "cherry")
Tuple
Tuples are used to store multiple items in a single variable.
Tuple is one of 4 built-in data types in Python used to store collections of data; the other 3 are
List, Set, and Dictionary, all with different qualities and usage.
A tuple is a collection which is ordered and unchangeable.
Tuples are written with round brackets.
Example
Create a Tuple:
thistuple = ("apple", "banana", "cherry")
print(thistuple)
Tuple Items
Tuple items are ordered, unchangeable, and allow duplicate values.
Tuple items are indexed, the first item has index [0], the second item has index [1] etc.
Ordered
When we say that tuples are ordered, it means that the items have a defined order, and that order
will not change.
Unchangeable
Tuples are unchangeable, meaning that we cannot change, add or remove items after the tuple
has been created.
Allow Duplicates
Since tuples are indexed, they can have items with the same value:
Example
Tuples allow duplicate values:
thistuple = ("apple", "banana", "cherry", "apple", "cherry")
print(thistuple)
Tuple Length
To determine how many items a tuple has, use the len() function:
Example
Print the number of items in the tuple:
thistuple = ("apple", "banana", "cherry")
print(len(thistuple))
Add Items
Since tuples are immutable, they do not have a built-in append() method, but there are other
ways to add items to a tuple.
1. Convert into a list: Just like the workaround for changing a tuple, you can convert it into a
list, add your item(s), and convert it back into a tuple.
Example
Convert the tuple into a list, add "orange", and convert it back into a tuple:
thistuple = ("apple", "banana", "cherry")
y = list(thistuple)
y.append("orange")
thistuple = tuple(y)
2. Add tuple to a tuple: You are allowed to add tuples to tuples, so if you want to add one item
(or many), create a new tuple with the item(s), and add it to the existing tuple:
Example
Create a new tuple with the value "orange", and add that tuple:
thistuple = ("apple", "banana", "cherry")
y = ("orange",)
thistuple += y
print(thistuple)
Note: When creating a tuple with only one item, remember to include a comma after the item,
otherwise it will not be identified as a tuple.
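The note about the comma can be verified with a small sketch. It also shows that adding to a tuple creates a new tuple instead of changing the existing one; the variable names are illustrative:

```python
# Without the comma, the brackets are ignored and the value is a string.
not_a_tuple = ("orange")
one_item = ("orange",)
print(type(not_a_tuple))
print(type(one_item))

# += builds a new tuple; the original object is left untouched.
thistuple = ("apple", "banana", "cherry")
original = thistuple
thistuple += one_item
print(thistuple)
print(original)
```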
Remove Items
Note: You cannot remove items in a tuple.
Tuples are unchangeable, so you cannot remove items from them, but you can use the same
workaround as we used for changing and adding tuple items:
Example
Convert the tuple into a list, remove "apple", and convert it back into a tuple:
thistuple = ("apple", "banana", "cherry")
y = list(thistuple)
y.remove("apple")
thistuple = tuple(y)
Or you can delete the tuple completely:
Example
The del keyword can delete the tuple completely:
thistuple = ("apple", "banana", "cherry")
del thistuple
print(thistuple) #this will raise an error because the tuple no longer exists
Using Asterisk *
If the number of variables is less than the number of values, you can add an * to the variable
name and the values will be assigned to the variable as a list:
Example
Assign the rest of the values as a list called "red":
fruits = ("apple", "banana", "cherry", "strawberry", "raspberry")
(green, yellow, *red) = fruits
print(green)
print(yellow)
print(red)
If the asterisk is added to another variable name than the last, Python will assign values to the
variable until the number of values left matches the number of variables left.
Example
Assign a list of values to the "tropic" variable:
fruits = ("apple", "mango", "papaya", "pineapple", "cherry")
(green, *tropic, red) = fruits
print(green)
print(tropic)
print(red)
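The results of the asterisk can be checked with a sketch that reuses the second tuple. The extra variable names (first, rest, a, middle, b) are illustrative:

```python
fruits = ("apple", "mango", "papaya", "pineapple", "cherry")

# The starred variable in the middle collects what the others leave over.
green, *tropic, red = fruits
print(green, tropic, red)

# With the star last, it collects everything after the first value.
first, *rest = fruits
print(first, rest)

# The starred variable is always a list, even when nothing is left for it.
a, *middle, b = ("x", "y")
print(middle)
```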
Multiply Tuples
If you want to multiply the content of a tuple a given number of times, you can use the *
operator:
Example
Multiply the fruits tuple by 2:
fruits = ("apple", "banana", "cherry")
mytuple = fruits * 2
print(mytuple)
Method Description
count() Returns the number of times a specified value occurs in a tuple
index() Searches the tuple for a specified value and returns the position of where it was found
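The two tuple methods can be tried on the multiplied tuple from the earlier example. This is a sketch; the lookup of a missing value is added to show how index() behaves when nothing is found:

```python
fruits = ("apple", "banana", "cherry")
doubled = fruits * 2

# count() returns how often a value occurs.
print(doubled.count("apple"))

# index() returns the position of the first occurrence.
print(doubled.index("cherry"))

# index() raises a ValueError when the value is not in the tuple.
try:
    fruits.index("mango")
    lookup_error = None
except ValueError as err:
    lookup_error = type(err).__name__
    print(lookup_error, "-", err)
```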
14.Python range() Function
Syntax
range(start, stop, step)
Parameter Values
Parameter Description
stop Required. An integer number specifying at which position to stop (not included).
More Examples
Create a sequence of numbers from 3 to 5, and print each item in the sequence:
x = range(3, 6)
for n in x:
    print(n)
Create a sequence of numbers from 3 to 19, but increment by 2 instead of 1:
x = range(3, 20, 2)
for n in x:
    print(n)
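Converting a range to a list makes it easy to see which numbers it produces. The sketch below repeats the two examples and adds two more cases (a single argument and a negative step) that are standard behaviour of range() but not shown above:

```python
# stop is not included: 3, 4, 5.
print(list(range(3, 6)))

# A step of 2 skips every other number.
odd_numbers = list(range(3, 20, 2))
print(odd_numbers)

# With one argument, counting starts at 0 with a step of 1.
print(list(range(5)))

# A negative step counts downwards.
countdown = list(range(10, 0, -3))
print(countdown)
```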
15.Python Sets
myset = {"apple", "banana", "cherry"}
Set
Sets are used to store multiple items in a single variable.
Set is one of 4 built-in data types in Python used to store collections of data; the other 3 are List,
Tuple, and Dictionary, all with different qualities and usage.
A set is a collection which is both unordered and unindexed.
Sets are written with curly brackets.
Example
Create a Set:
thisset = {"apple", "banana", "cherry"}
print(thisset)
Note: Sets are unordered, so you cannot be sure in which order the items will appear.
Set Items
Set items are unordered, unchangeable, and do not allow duplicate values.
Unordered
Unordered means that the items in a set do not have a defined order.
Set items can appear in a different order every time you use them, and cannot be referred to by
index or key.
Unchangeable
Set items are unchangeable, meaning that we cannot change an item in place after the set has been created.
You can, however, remove items and add new items.
type()
From Python's perspective, sets are defined as objects with the data type 'set':
<class 'set'>
Example
What is the data type of a set?
myset = {"apple", "banana", "cherry"}
print(type(myset))
Change Items
Once a set is created, you cannot change its items, but you can add new items.
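A minimal sketch of what "changing" a set looks like in practice, using add(), discard() and remove() from the methods table. The item names are illustrative:

```python
# Duplicate values are stored only once.
thisset = {"apple", "banana", "cherry", "apple"}
print(len(thisset))

# Items cannot be replaced in place, but they can be added and removed.
thisset.add("orange")
thisset.discard("banana")

# discard() stays silent when the item is missing; remove() raises KeyError.
thisset.discard("mango")
try:
    thisset.remove("mango")
    remove_error = None
except KeyError:
    remove_error = "KeyError"

# sorted() gives a predictable order for printing.
print(sorted(thisset))
```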
Add Sets
To add items from another set into the current set, use the update() method.
Example
Add elements from tropical into thisset:
thisset = {"apple", "banana", "cherry"}
tropical = {"pineapple", "mango", "papaya"}
thisset.update(tropical)
print(thisset)
Method Description
add() Adds an element to the set
clear() Removes all the elements from the set
copy() Returns a copy of the set
difference() Returns a set containing the difference between two or more sets
difference_update() Removes the items in this set that are also included in another, specified set
discard() Removes the specified item
intersection() Returns a set that is the intersection of two or more sets
intersection_update() Removes the items in this set that are not present in other, specified set(s)
isdisjoint() Returns True if the two sets have no items in common
issubset() Returns whether another set contains this set or not
issuperset() Returns whether this set contains another set or not
pop() Removes an element from the set
remove() Removes the specified element
symmetric_difference() Returns a set with the symmetric differences of two sets
symmetric_difference_update() Inserts the symmetric differences from this set and another
union() Returns a set containing the union of sets
update() Updates the set with the union of this set and others
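Several of the methods in the table, applied to two small illustrative sets:

```python
a = {"apple", "banana", "cherry"}
b = {"cherry", "mango"}

# Methods that return a new set and leave a and b unchanged.
print(sorted(a.union(b)))
print(sorted(a.intersection(b)))
print(sorted(a.difference(b)))
print(sorted(a.symmetric_difference(b)))

# Methods that answer a yes/no question.
print(a.isdisjoint(b))
print({"apple"}.issubset(a))
print(a.issuperset({"apple", "banana"}))
```

The methods ending in _update() perform the same operations but store the result in the set they are called on.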
Syntax
frozenset(iterable )
Parameter Values
Parameter Description
iterable An iterable object, like list, set, tuple etc.
More Examples
Example
Try to change the value of a frozenset item.
This will cause an error:
mylist = [ 'apple' , 'banana' , 'cherry' ]
x = frozenset (mylist)
x[ 1 ] = "strawberry"
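To see the error without stopping the program, the assignment can be wrapped in try/except. The sketch below also shows, as an aside, that set operations on a frozenset return a new frozenset and that a frozenset can serve as a dictionary key; the names y and prices are assumptions made for illustration.

```python
mylist = ["apple", "banana", "cherry"]
x = frozenset(mylist)

try:
    x[1] = "strawberry"
except TypeError as err:
    print("TypeError:", err)

# A frozenset has no add() or remove(), but set operations return new frozensets.
y = x | {"strawberry"}
print(type(y).__name__, len(y))

# Because it is hashable, a frozenset can be used as a dictionary key.
prices = {frozenset(["apple", "banana"]): 3.5}
print(prices[frozenset(["banana", "apple"])])
```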
17.Python Dictionaries
thisdict = { "brand" : "Ford" ,
"model" : "Mustang" ,
"year" : 1964
}
Dictionary
Dictionaries are used to store data values in key:value pairs.
A dictionary is a collection which is ordered, changeable, and does not allow duplicate keys.
As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are
unordered.
Dictionaries are written with curly brackets, and have keys and values:
Example
Create and print a dictionary:
thisdict = {
"brand" : "Ford" ,
"model" : "Mustang" ,
"year" : 1964
}
print (thisdict)
Dictionary Items
Dictionary items are ordered, changeable, and do not allow duplicates.
Dictionary items are presented in key:value pairs, and can be referred to by using the key name.
Example
Print the "brand" value of the dictionary:
thisdict = { "brand" : "Ford" , "model" : "Mustang" , "year" : 1964 }
print (thisdict[ "brand" ])
Ordered or Unordered?
As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are
unordered.
When we say that dictionaries are ordered, it means that the items have a defined order (the order
in which they were inserted), and that order will not change.
Unordered means that the items do not have a defined order. In either case, you cannot refer to an
item by using an index, only by its key.
Changeable
Dictionaries are changeable, meaning that we can change, add or remove items after the
dictionary has been created.
Dictionary Length
To determine how many items a dictionary has, use the len() function:
Example
Print the number of items in the dictionary:
print ( len (thisdict))
type()
From Python's perspective, dictionaries are defined as objects with the data type 'dict':
<class 'dict'>
Example
Print the data type of a dictionary:
thisdict = { "brand" : "Ford" , "model" : "Mustang" , "year" : 1964 }
print ( type (thisdict))
Get Keys
The keys() method will return a view object containing all the keys in the dictionary.
Example
Get a list of the keys:
x = thisdict.keys()
The list of the keys is a view of the dictionary, meaning that any changes done to the dictionary
will be reflected in the keys list.
Example
Add a new item to the original dictionary, and see that the keys list gets updated as well:
car = { "brand" : "Ford" , "model" : "Mustang" , "year" : 1964 }
x = car.keys()
print (x) #before the change
car[ "color" ] = "white"
print (x) #after the change
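The difference between a live view and a copied list can be sketched like this. The names view and snapshot are assumptions made for illustration.

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

view = car.keys()            # live view of the keys
snapshot = list(car.keys())  # independent list, copied at this moment

car["color"] = "white"

print(view)      # includes "color"
print(snapshot)  # still the original three keys
```

Use list() when you need a copy that stays fixed while the dictionary keeps changing.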
Get Values
The values() method will return a view object containing all the values in the dictionary.
Example
Get a list of the values:
x = thisdict.values()
The list of the values is a view of the dictionary, meaning that any changes done to the dictionary
will be reflected in the values list.
Example
Make a change in the original dictionary, and see that the values list gets updated as well:
car = { "brand" : "Ford" , "model" : "Mustang" , "year" : 1964 }
x = car.values()
print (x) #before the change
car[ "year" ] = 2020
print (x) #after the change
Example
Add a new item to the original dictionary, and see that the values list gets updated as well:
car = { "brand" : "Ford" , "model" : "Mustang" , "year" : 1964 }
x = car.values()
print (x) #before the change
car[ "color" ] = "red"
print (x) #after the change
Get Items
The items() method will return a view object containing each item in a dictionary, as a (key, value) tuple.
Example
Get a list of the key:value pairs
x = thisdict.items()
The returned list is a view of the items of the dictionary, meaning that any changes done to the
dictionary will be reflected in the items list.
Example
Make a change in the original dictionary, and see that the items list gets updated as well:
car = { "brand" : "Ford" , "model" : "Mustang" , "year" : 1964 }
x = car.items()
print (x) #before the change
car[ "year" ] = 2020
print (x) #after the change
Example
Add a new item to the original dictionary, and see that the items list gets updated as well:
car = { "brand" : "Ford" , "model" : "Mustang" , "year" : 1964 }
x = car.items()
print (x) #before the change
car[ "color" ] = "red"
print (x) #after the change
Update Dictionary
The update() method will update the dictionary with the items from the given argument.
The argument must be a dictionary, or an iterable object with key:value pairs.
Example
Update the "year" of the car by using the update() method:
thisdict = { "brand" : "Ford" , "model" : "Mustang" , "year" : 1964 }
thisdict.update({ "year" : 2020 })
Python - Add Dictionary Items
Adding Items
Adding an item to the dictionary is done by using a new index key and assigning a value to it:
Example
thisdict ={ "brand" : "Ford" , "model" : "Mustang" , "year" : 1964 }
thisdict[ "color" ] = "red"
print (thisdict)
Update Dictionary
The update() method will update the dictionary with the items from a given argument. If the item
does not exist, the item will be added.
The argument must be a dictionary, or an iterable object with key:value pairs.
Example
Add a color item to the dictionary by using the update() method:
thisdict = { "brand" : "Ford" , "model" : "Mustang" , "year" : 1964 }
thisdict.update({ "color" : "red" })
myfamily = {
"child1" : { "name" : "Emil" , "year" : 2004 },
"child2" : { "name" : "Tobias" , "year" : 2007 },
"child3" : { "name" : "Linus" , "year" : 2011 }
}
Or, if you want to add three dictionaries into a new dictionary:
Example
Create three dictionaries, then create one dictionary that will contain the other three dictionaries:
child1 = {"name": "Emil", "year": 2004}
child2 = {"name": "Tobias", "year": 2007}
child3 = {"name": "Linus", "year": 2011}
myfamily = {"child1": child1, "child2": child2, "child3": child3}
Method        Description
copy()        Returns a copy of the dictionary
fromkeys()    Returns a dictionary with the specified keys and value
get()         Returns the value of the specified key
items()       Returns a view containing a tuple for each key-value pair
keys()        Returns a view containing the dictionary's keys
pop()         Removes the element with the specified key
popitem()     Removes the last inserted key-value pair
setdefault()  Returns the value of the specified key. If the key does not exist: insert the key, with the specified value
update()      Updates the dictionary with the specified key-value pairs
values()      Returns a view of all the values in the dictionary
clear()       Removes all the elements from the dictionary
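Several of these methods are easiest to understand next to each other. The following is a small sketch; the dictionary contents and the names year, last and blank are chosen for illustration.

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

print(car.get("color"))             # None, no KeyError for a missing key
print(car.get("color", "unknown"))  # a default can be supplied

car.setdefault("color", "white")    # inserted, because "color" was missing
car.setdefault("brand", "Volvo")    # ignored, because "brand" already exists

year = car.pop("year")              # removes the key and returns its value
last = car.popitem()                # removes and returns the last inserted pair

blank = dict.fromkeys(["brand", "model"], "n/a")
print(car, year, last, blank)
```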
Math Constants
Constant Description
math.e Returns Euler's number (2.7182...)
math.inf Returns a floating-point positive infinity
math.nan Returns a floating-point NaN (Not a Number) value
math.pi Returns PI (3.1415...)
math.tau Returns tau (6.2831...)
Math Methods
Method            Description
math.acos()       Returns the arc cosine of a number
math.comb()       Returns the number of ways to choose k items from n items without repetition and without order
math.copysign()   Returns a float consisting of the value of the first parameter and the sign of the second parameter
math.dist()       Returns the Euclidean distance between two points p and q, each given as a sequence of coordinates
math.expm1()      Returns e**x - 1
math.fsum()       Returns the sum of all items in any iterable (tuples, arrays, lists, etc.)
math.isclose()    Checks whether two values are close to each other, or not
math.ldexp()      Returns x * (2**i) for the given numbers x and i, which is the inverse of math.frexp()
math.log()        Returns the natural logarithm of a number, or the logarithm of the number to a given base
math.perm()       Returns the number of ways to choose k items from n items with order and without repetition
math.remainder()  Returns the IEEE 754-style remainder of x with respect to y: x - n*y, where n is the integer closest to x / y
math.sin()        Returns the sine of a number
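A few of the less familiar functions from the table can be tried directly. This is a sketch only; math.comb(), math.perm() and math.dist() need Python 3.8 or later, and the sample values are chosen for illustration.

```python
import math

print(math.comb(5, 2))            # 10 ways to choose 2 of 5 items, order ignored
print(math.perm(5, 2))            # 20 ways when order matters
print(math.dist((0, 0), (3, 4)))  # 5.0

# Ordinary floating-point sums accumulate rounding error; fsum() compensates for it.
values = [0.1] * 10
print(sum(values) == 1.0, math.fsum(values) == 1.0)

# isclose() is the usual way to compare floats.
print(math.isclose(0.1 + 0.2, 0.3))

# ldexp() reverses frexp(): x == m * 2**e
m, e = math.frexp(8.0)
print(m, e, math.ldexp(m, e))
```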
Syntax
eval(expression, globals, locals)
Parameter Values
Parameter Description
Indentation
Python relies on indentation (whitespace at the beginning of a line) to define scope in the code.
Other programming languages often use curly-brackets for this purpose.
Example
Elif
The elif keyword is Python's way of saying "if the previous conditions were not true, then try this
condition".
Example
a = 33
b = 33
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")
In this example a is equal to b, so the first condition is not true, but the elif condition is true, so
we print to screen that "a and b are equal".
Else
The else keyword catches anything which isn't caught by the preceding conditions.
Example
a = 200
b = 33
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")
else:
    print("a is greater than b")
In this example a is greater than b, so the first condition is not true, also the elif condition is not
true, so we go to the else condition and print to screen that "a is greater than b".
You can also have an else without the elif:
Example
a = 200
b = 33
if b > a:
    print("b is greater than a")
else:
    print("b is not greater than a")
Short Hand If
If you have only one statement to execute, you can put it on the same line as the if statement.
Example
One line if statement:
if a > b: print ( "a is greater than b" )
And
The and keyword is a logical operator, and is used to combine conditional statements:
Example
Test if a is greater than b, AND if c is greater than a:
a = 200
b = 33
c = 500
if a > b and c > a:
    print("Both conditions are True")
Or
The or keyword is a logical operator, and is used to combine conditional statements:
Example
Test if a is greater than b, OR if a is greater than c:
a = 200
b = 33
c = 500
if a > b or a > c:
    print("At least one of the conditions is True")
Nested If
You can have if statements inside if statements, this is called nested if statements.
Example
x = 41
if x > 10:
    print("Above ten,")
    if x > 20:
        print("and also above 20!")
    else:
        print("but not above 20.")
if b > a:
    pass
Print all numbers from 0 to 5, and print a message when the loop has ended:
for x in range(6):
    print(x)
else:
    print("Finally finished!")
Note: The else block will NOT be executed if the loop is stopped by a break statement.
Example
Break the loop when x is 3, and see what happens with the else block:
for x in range(6):
    if x == 3: break
    print(x)
else:
    print("Finally finished!")
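A common use of the else block on a for loop is a search, where else handles the "not found" case. The sketch below is one possible illustration; the function name find_first_negative is hypothetical.

```python
def find_first_negative(numbers):
    for n in numbers:
        if n < 0:
            print("Found a negative number:", n)
            break
    else:
        # Runs only when the loop finished without hitting break.
        print("No negative numbers found")
        return None
    return n

first = find_first_negative([4, 7, -2, 9])
second = find_first_negative([1, 2, 3])
```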
Nested Loops
A nested loop is a loop inside a loop.
The "inner loop" will be executed one time for each iteration of the "outer loop":
Example
Print each adjective for every fruit:
adj = ["red", "big", "tasty"]
fruits = ["apple", "banana", "cherry"]
for x in adj:
    for y in fruits:
        print(x, y)
Arrays
Note: This page shows you how to use LISTS as ARRAYS, however, to work with arrays in
Python you will have to import a library, like the NumPy library.
Arrays are used to store multiple values in one single variable:
Example
Create an array containing car names:
cars = [ "Ford" , "Volvo" , "BMW" ]
What is an Array?
An array is a special variable, which can hold more than one value at a time.
If you have a list of items (a list of car names, for example), storing the cars in single variables
could look like this:
car1 = "Ford"
car2 = "Volvo"
car3 = "BMW"
However, what if you want to loop through the cars and find a specific one? And what if you had
not 3 cars, but 300?
The solution is an array!
An array can hold many values under a single name, and you can access the values by referring
to an index number.
Array Methods
Python has a set of built-in methods that you can use on lists/arrays.
Method     Description
append()   Adds an element at the end of the list
clear()    Removes all the elements from the list
copy()     Returns a copy of the list
count()    Returns the number of elements with the specified value
extend()   Adds the elements of a list (or any iterable) to the end of the current list
index()    Returns the index of the first element with the specified value
insert()   Adds an element at the specified position
pop()      Removes the element at the specified position
remove()   Removes the first item with the specified value
reverse()  Reverses the order of the list
sort()     Sorts the list
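The methods in the table can be chained into a short walk-through. This is a sketch using the cars list from earlier in the chapter; the extra car names are chosen for illustration.

```python
cars = ["Ford", "Volvo", "BMW"]

cars.append("Honda")          # add at the end
cars.insert(1, "Audi")        # add at index 1
cars.extend(["Kia", "Ford"])  # add several items at once

print(cars.count("Ford"))     # how many times "Ford" occurs
print(cars.index("BMW"))      # position of the first "BMW"

removed = cars.pop()          # with no argument, pop() removes the last item
cars.remove("Audi")           # remove the first "Audi"

cars.sort()                   # alphabetical order, in place
cars.reverse()                # reversed, in place
print(cars, removed)
```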
25.Python Functions
A function is a block of code which only runs when it is called.
You can pass data, known as parameters, into a function.
A function can return data as a result.
Creating a Function
In Python a function is defined using the def keyword:
Example
def my_function():
    print("Hello from a function")
Calling a Function
To call a function, use the function name followed by parentheses:
Example
def my_function():
    print("Hello from a function")

my_function()
Arguments
Information can be passed into functions as arguments.
Arguments are specified after the function name, inside the parentheses. You can add as many
arguments as you want, just separate them with a comma.
The following example has a function with one argument (fname). When the function is called,
we pass along a first name, which is used inside the function to print the full name:
Example
def my_function(fname):
    print(fname + " Refsnes")

my_function("Emil")
my_function("Tobias")
my_function("Linus")
Arguments are often shortened to args in Python documentation.
Parameters or Arguments?
The terms parameter and argument can be used for the same thing: information that is passed
into a function.
From a function's perspective:
A parameter is the variable listed inside the parentheses in the function definition.
An argument is the value that is sent to the function when it is called.
Number of Arguments
By default, a function must be called with the correct number of arguments. Meaning that if your
function expects 2 arguments, you have to call the function with 2 arguments, not more, and not
less.
Example
This function expects 2 arguments, and gets 2 arguments:
def my_function(fname, lname):
    print(fname + " " + lname)

my_function("Emil", "Refsnes")
If you try to call the function with 1 or 3 arguments, you will get an error:
Example
This function expects 2 arguments, but gets only 1:
def my_function(fname, lname):
    print(fname + " " + lname)

my_function("Emil")
Keyword Arguments
You can also send arguments with the key = value syntax.
This way the order of the arguments does not matter.
Example
def my_function(child3, child2, child1):
    print("The youngest child is " + child3)

my_function(child1="Emil", child2="Tobias", child3="Linus")
Example
def my_function(country="Norway"):
    print("I am from " + country)

my_function("Sweden")
my_function("India")
my_function()
my_function("Brazil")
Return Values
To let a function return a value, use the return statement:
Example
def my_function(x):
    return 5 * x

print(my_function(3))
print(my_function(5))
print(my_function(9))
Recursion
Python also accepts function recursion, which means a defined function can call itself.
Recursion is a common mathematical and programming concept. Its benefit is that a function can
work through data by repeatedly calling itself on a smaller input until it reaches a result.
The developer should be very careful with recursion, as it is quite easy to write a function that
never terminates, or one that uses excessive amounts of memory or processor power.
However, when written correctly, recursion can be a very efficient and mathematically elegant
approach to programming.
In this example, tri_recursion() is a function that we have defined to call itself ("recurse"). We
use the k variable as the data, which decrements (-1) every time we recurse. The recursion ends
when k is no longer greater than 0 (i.e. when it is 0).
To a new developer it can take some time to work out exactly how this works; the best way to find
out is by testing and modifying it.
Example
Recursion Example
def tri_recursion(k):
    if k > 0:
        result = k + tri_recursion(k - 1)
        print(result)
    else:
        result = 0
    return result

print("\n\nRecursion Example Results")
tri_recursion(6)
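One way to check a recursive function is to compare it with a non-recursive equivalent. The sketch below restates tri_recursion() without the print calls and adds a hypothetical helper, tri_formula(), which is not part of the original example.

```python
import sys

def tri_recursion(k):
    """Return 1 + 2 + ... + k, computed recursively."""
    if k <= 0:          # base case: stops the recursion
        return 0
    return k + tri_recursion(k - 1)

def tri_formula(k):
    """Closed-form equivalent, used here only as a cross-check."""
    return k * (k + 1) // 2

print(tri_recursion(6))   # 21
print(tri_formula(6))     # 21

# Python limits recursion depth; calls nested deeper than this raise RecursionError.
print(sys.getrecursionlimit())
```

The base case is what guarantees termination: every call reduces k by one until it reaches 0.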
26.Python Lambda
A lambda function is a small anonymous function.
A lambda function can take any number of arguments, but can only have one expression.
Syntax
lambda arguments : expression
The expression is executed and the result is returned:
Example
Add 10 to argument a, and return the result:
x = lambda a : a + 10
print (x( 5 ))
Lambda functions can take any number of arguments:
Example
Multiply argument a with argument b and return the result:
x = lambda a, b : a * b
print (x( 5 , 6 ))
Example
Sum arguments a, b, and c and return the result:
x = lambda a, b, c : a + b + c
print (x( 5 , 6 , 2 ))
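Lambda functions are most useful when a short function is needed as an argument to another function. The sketch below assumes a list of (name, year) tuples invented for illustration and passes lambdas to the built-in sorted() and map().

```python
cars = [("Ford", 1964), ("Volvo", 1998), ("BMW", 1975)]

# key= expects a function of one argument; a lambda avoids having to name it.
by_year = sorted(cars, key=lambda car: car[1])
print(by_year)

doubled = list(map(lambda n: n * 2, [1, 2, 3]))
print(doubled)  # [2, 4, 6]
```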
Create a Class
To create a class, use the keyword class:
Example
Create a class named MyClass, with a property named x:
class MyClass:
    x = 5
Create Object
Now we can use the class named MyClass to create objects:
Example
Create an object named p1, and print the value of x:
p1 = MyClass()
print (p1.x)
Object Methods
Objects can also contain methods. Methods in objects are functions that belong to the object.
Let us create a method in the Person class:
Example
Insert a function that prints a greeting, and execute it on the p1 object:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def myfunc(self):
        print("Hello my name is " + self.name)

p1 = Person("John", 36)
p1.myfunc()
Note: The self parameter is a reference to the current instance of the class, and is used to access
variables that belong to the class.
Delete Objects
You can delete objects by using the del keyword:
Example
Delete the p1 object:
del p1
28.Python Inheritance
Python Inheritance
Inheritance allows us to define a class that inherits all the methods and properties from another
class.
Parent class is the class being inherited from, also called base class.
Child class is the class that inherits from another class, also called derived class.
    def printname(self):
        print(self.firstname, self.lastname)

#Use the Person class to create an object, and then execute the printname method:
x = Person("John", "Doe")
x.printname()
Add Properties
Example
Add a property called graduationyear to the Student class:
class Student(Person):
    def __init__(self, fname, lname):
        super().__init__(fname, lname)
        self.graduationyear = 2019
In the example above, the year 2019 should be a variable, and passed into the Student class when
creating student objects. To do so, add another parameter in the __init__() function:
Example
Add a year parameter, and pass the correct year when creating objects:
class Student(Person):
    def __init__(self, fname, lname, year):
        super().__init__(fname, lname)
        self.graduationyear = year

x = Student("Mike", "Olsen", 2019)
Add Methods
Example
Add a method called welcome to the Student class:
class Student(Person):
    def __init__(self, fname, lname, year):
        super().__init__(fname, lname)
        self.graduationyear = year

    def welcome(self):
        print("Welcome", self.firstname, self.lastname, "to the class of",
              self.graduationyear)
If you add a method in the child class with the same name as a function in the parent class, the
inheritance of the parent method will be overridden.
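Overriding can be sketched with a method that exists in both classes. The describe() method below is hypothetical and added only for illustration; the Person and Student classes otherwise follow the examples in this chapter.

```python
class Person:
    def __init__(self, fname, lname):
        self.firstname = fname
        self.lastname = lname

    def printname(self):
        print(self.firstname, self.lastname)

    def describe(self):
        return self.firstname + " " + self.lastname


class Student(Person):
    def __init__(self, fname, lname, year):
        super().__init__(fname, lname)
        self.graduationyear = year

    # Same name as the parent method, so this version is used for Student objects.
    def describe(self):
        base = super().describe()  # the parent version is still reachable
        return base + ", class of " + str(self.graduationyear)


p = Person("John", "Doe")
s = Student("Mike", "Olsen", 2019)
print(p.describe())
print(s.describe())
```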
29.Python Iterators
Python Iterators
An iterator is an object that contains a countable number of values.
An iterator is an object that can be iterated upon, meaning that you can traverse through all the
values.
Technically, in Python, an iterator is an object which implements the iterator protocol, which
consists of the methods __iter__() and __next__().
Iterator vs Iterable
Lists, tuples, dictionaries, and sets are all iterable objects. They are iterable containers which you
can get an iterator from.
All these objects can be passed to the built-in iter() function, which is used to get an iterator:
Example
Return an iterator from a tuple, and print each value:
mytuple = ( "apple" , "banana" , "cherry" )
myit = iter (mytuple)
print ( next (myit))
print ( next (myit))
print ( next (myit))
Even strings are iterable objects, and can return an iterator:
Example
Strings are also iterable objects, containing a sequence of characters:
mystr = "banana"
myit = iter (mystr)
print ( next (myit))
print ( next (myit))
print ( next (myit))
print ( next (myit))
print ( next (myit))
print ( next (myit))
Create an Iterator
To create an object/class as an iterator you have to implement the methods __iter__() and
__next__() to your object.
As you have learned in the Python Classes/Objects chapter, all classes have a function called
__init__(), which allows you to do some initializing when the object is being created.
The __iter__() method acts similarly: you can do operations (initializing etc.), but must always
return the iterator object itself.
The __next__() method also allows you to do operations, and must return the next item in the
sequence.
Example
Create an iterator that returns numbers, starting with 1, and each sequence will increase by one
(returning 1,2,3,4,5 etc.):
class MyNumbers:
    def __iter__(self):
        self.a = 1
        return self

    def __next__(self):
        x = self.a
        self.a += 1
        return x

myclass = MyNumbers()
myiter = iter(myclass)
print(next(myiter))
print(next(myiter))
print(next(myiter))
print(next(myiter))
print(next(myiter))
StopIteration
The example above would continue forever if you had enough next() calls, or if it was used
in a for loop.
To prevent the iteration from going on forever, we can raise the StopIteration exception.
In the __next__() method, we can add a terminating condition that raises StopIteration once the
iteration has been done a specified number of times:
Example
Stop after 20 iterations:
class MyNumbers:
    def __iter__(self):
        self.a = 1
        return self

    def __next__(self):
        if self.a <= 20:
            x = self.a
            self.a += 1
            return x
        else:
            raise StopIteration

myclass = MyNumbers()
myiter = iter(myclass)
for x in myiter:
    print(x)
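What StopIteration does can be seen by calling next() by hand. The sketch below uses a hypothetical class, CountUpTo, which takes the limit as a parameter instead of hard-coding 20.

```python
class CountUpTo:
    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        self.a = 1
        return self

    def __next__(self):
        if self.a > self.limit:
            raise StopIteration
        x = self.a
        self.a += 1
        return x


myiter = iter(CountUpTo(3))
print(next(myiter), next(myiter), next(myiter))

# Calling next() manually past the end raises StopIteration.
try:
    next(myiter)
except StopIteration:
    print("iterator exhausted")

# A for loop (or a comprehension) catches StopIteration internally and simply ends.
collected = [x for x in CountUpTo(5)]
print(collected)
```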
30.Python Scope
A variable is only available from inside the region in which it is created. This is called scope.
Local Scope
A variable created inside a function belongs to the local scope of that function, and can only be
used inside that function.
Example
A variable created inside a function is available inside that function:
def myfunc():
    x = 300
    print(x)

myfunc()
Function Inside Function
As explained in the example above, the variable x is not available outside the function, but it is
available for any function inside the function:
Example
The local variable can be accessed from a function within the function:
def myfunc():
    x = 300
    def myinnerfunc():
        print(x)
    myinnerfunc()

myfunc()
Global Scope
A variable created in the main body of the Python code is a global variable and belongs to the
global scope.
Global variables are available from within any scope, global and local.
Example
A variable created outside of a function is global and can be used by anyone:
x = 300
def myfunc():
    print(x)

myfunc()
print(x)
Naming Variables
If you operate with the same variable name inside and outside of a function, Python will treat
them as two separate variables, one available in the global scope (outside the function) and one
available in the local scope (inside the function):
Example
The function will print the local x, and then the code will print the global x:
x = 300
def myfunc():
    x = 200
    print(x)

myfunc()
print(x)
Global Keyword
If you need to create a global variable, but are stuck in the local scope, you can use the global
keyword.
The global keyword makes the variable global.
Example
If you use the global keyword, the variable belongs to the global scope:
def myfunc():
    global x
    x = 300

myfunc()
print(x)
Also, use the global keyword if you want to make a change to a global variable inside a function.
Example
To change the value of a global variable inside a function, refer to the variable by using the
global keyword:
x = 300
def myfunc():
    global x
    x = 200

myfunc()
print(x)
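A pitfall related to these rules: as soon as a function assigns to a name, Python treats that name as local throughout the function, so reading it before the assignment fails. The sketch below illustrates this; the function names read_only, shadow and broken are hypothetical.

```python
x = 300

def read_only():
    return x          # no assignment here, so this reads the outer x

def shadow():
    x = 200           # creates a separate local x
    return x

def broken():
    try:
        x = x + 1     # x is local because it is assigned, so reading it first fails
    except UnboundLocalError:
        return "UnboundLocalError"
    return x

print(read_only(), shadow(), broken(), x)
```

Declaring the variable with the global keyword inside broken() would make the assignment refer to the global variable instead.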
31.Python Modules
What is a Module?
Consider a module to be the same as a code library:
a file containing a set of functions you want to include in your application.
Create a Module
To create a module just save the code you want in a file with the file extension .py:
Example
Save this code in a file named mymodule.py
def greeting(name):
    print("Hello, " + name)
Use a Module
Now we can use the module we just created, by using the import statement:
Example
Import the module named mymodule, and call the greeting function:
import mymodule
mymodule.greeting( "Jonathan" )
Note: When using a function from a module, use the syntax: module_name.function_name .
Variables in Module
The module can contain functions, as already described, but also variables of all types (arrays,
dictionaries, objects etc):
Example
Save this code in the file mymodule.py
person1 = {
"name" : "John" ,
"age" : 36 ,
"country" : "Norway"
}
Example
Import the module named mymodule, and access the person1 dictionary:
import mymodule
a = mymodule.person1[ "age" ]
print (a)
Naming a Module
You can name the module file whatever you like, but it must have the file extension .py
Re-naming a Module
You can create an alias when you import a module, by using the as keyword:
Example
Create an alias for mymodule called mx:
import mymodule as mx
a = mx.person1[ "age" ]
print (a)
Built-in Modules
There are several built-in modules in Python, which you can import whenever you like.
Example
Import and use the platform module:
import platform
x = platform.system()
print (x)
Example
The module named mymodule has one function and one dictionary:
def greeting(name):
    print("Hello, " + name)

person1 = {
    "name": "John",
    "age": 36,
    "country": "Norway"
}
Example
Import only the person1 dictionary from the module:
from mymodule import person1
print (person1[ "age" ])
Note: When importing using the from keyword, do not use the module name when referring to
elements in the module. Example: person1["age"], not mymodule.person1["age"]
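The three import styles can be compared side by side. Because mymodule.py exists only if you have saved it yourself, this sketch uses the built-in math module instead; the alias m is chosen for illustration.

```python
import math                 # access names as math.<name>
import math as m            # same module under an alias
from math import pi, sqrt   # names imported directly into this file

print(math.pi == m.pi == pi)  # True: all three refer to the same value
print(sqrt(16))               # 4.0

# dir() lists the names defined in a module.
print("sqrt" in dir(math))
```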
32.Python Datetime
Python Dates
A date in Python is not a data type of its own, but we can import a module named datetime to
work with dates as date objects.
Example
Import the datetime module and display the current date:
import datetime
x = datetime.datetime.now()
print (x)
Date Output
When we execute the code from the example above the result will be:
2021-07-04 15:19:53.819856
The date contains year, month, day, hour, minute, second, and microsecond.
The datetime module has many methods to return information about the date object.
Here are a few examples, you will learn more about them later in this chapter:
Example
Return the year and name of weekday:
import datetime
x = datetime.datetime.now()
print (x.year)
print (x.strftime( "%A" ))
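Because datetime.now() changes on every run, a fixed date makes the output reproducible. The sketch below builds a datetime object by hand, using the date and time from the sample output above, and formats it with a few strftime() codes.

```python
import datetime

# A fixed date keeps the output the same on every run.
x = datetime.datetime(2021, 7, 4, 15, 19, 53)

print(x.year)                   # 2021
print(x.strftime("%A"))         # full weekday name
print(x.strftime("%Y-%m-%d"))   # 2021-07-04
print(x.strftime("%H:%M"))      # 15:19
```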
34.Python JSON
JSON is a syntax for storing and exchanging data.
JSON is text, written with JavaScript object notation.
JSON in Python
Python has a built-in package called json, which can be used to work with JSON data.
Example
Import the json module:
import json
You can convert Python objects of the following types, into JSON strings:
● dict
● list
● tuple
● str
● int
● float
● True
● False
● None
Example
Convert Python objects into JSON strings, and print the values:
import json
print (json.dumps({ "name" : "John" , "age" : 30 }))
print (json.dumps([ "apple" , "bananas" ]))
print (json.dumps(( "apple" , "bananas" )))
print (json.dumps( "hello" ))
print (json.dumps( 42 ))
print (json.dumps( 31.76 ))
print (json.dumps( True ))
print (json.dumps( False ))
print (json.dumps(None))
When you convert from Python to JSON, Python objects are converted into the JSON
(JavaScript) equivalent:
Python JSON
dict Object
list Array
tuple Array
str String
int Number
float Number
True true
False false
None null
Example
Convert a Python object containing all the legal data types:
import json
x = {
    "name": "John",
    "age": 30,
    "married": True,
    "divorced": False,
    "children": ("Ann", "Billy"),
    "pets": None,
    "cars": [
        {"model": "BMW 230", "mpg": 27.5},
        {"model": "Ford Edge", "mpg": 24.1}
    ]
}
print(json.dumps(x))
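The conversion table has a practical consequence: a round trip through JSON does not always give back the same Python types. The sketch below uses json.loads(), the counterpart of json.dumps(), on a smaller dictionary chosen for illustration.

```python
import json

x = {"name": "John", "children": ("Ann", "Billy"), "pets": None, "married": True}

text = json.dumps(x)
print(text)

y = json.loads(text)
print(y)

# JSON has no tuple type, so the tuple comes back as a list.
print(type(x["children"]).__name__, "->", type(y["children"]).__name__)

# indent= produces a more readable layout.
print(json.dumps(x, indent=2))
```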
RegEx Module
Python has a built-in package called re, which can be used to work with Regular Expressions.
Import the re module:
import re
RegEx in Python
When you have imported the re module, you can start using regular expressions:
Example
Search the string to see if it starts with "The" and ends with "Spain":
import re
txt = "The rain in Spain"
x = re.search( "^The.*Spain$" , txt)
RegEx Functions
The re module offers a set of functions that allows us to search a string for a match:
Function Description
findall Returns a list containing all matches
search Returns a Match object if there is a match anywhere in the string
split Returns a list where the string has been split at each match
sub Replaces one or many matches with a string
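The four functions can be compared on the same string. This is a minimal sketch; the patterns are chosen for illustration.

```python
import re

txt = "The rain in Spain"

print(re.findall("ai", txt))       # ['ai', 'ai']
print(re.split(r"\s", txt))        # ['The', 'rain', 'in', 'Spain']
print(re.sub(r"\s", "9", txt))     # The9rain9in9Spain
print(re.search("Portugal", txt))  # None, because there is no match
```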
Metacharacters
Metacharacters are characters with a special meaning:
Sets
A set is a set of characters inside a pair of square brackets [] with a special meaning:
Set         Description
[arn]       Returns a match where one of the specified characters (a, r, or n) is present
[a-n]       Returns a match for any lower case character, alphabetically between a and n
[^arn]      Returns a match for any character EXCEPT a, r, and n
[0123]      Returns a match where any of the specified digits (0, 1, 2, or 3) is present
[0-9]       Returns a match for any digit between 0 and 9
[0-5][0-9]  Returns a match for any two-digit number from 00 to 59
[a-zA-Z]    Returns a match for any character alphabetically between a and z, lower case OR upper case
[+]         In sets, +, *, ., |, (), $, {} have no special meaning, so [+] means: return a match for any + character in the string
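A few of the sets from the table can be tested with re.findall(). The sample strings in this sketch are assumptions chosen for illustration.

```python
import re

txt = "The rain in Spain"
print(re.findall("[arn]", txt))      # each a, r, or n
print(re.findall("[^a-z ]", txt))    # anything except lower case letters and spaces

clock = "8 times before 11:45 AM"
print(re.findall("[0-5][0-9]", clock))  # ['11', '45']
print(re.findall("[+]", "1 + 2 + 3"))   # ['+', '+']
```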
Special Sequences
A special sequence is a \ followed by one of the characters in the list below, and has a special
meaning:
Match Object
A Match Object is an object containing information about the search and the result.
Note: If there is no match, the value None will be returned, instead of the Match Object.
Example
Do a search that will return a Match Object:
import re
txt = "The rain in Spain"
x = re.search( "ai" , txt)
print (x) #this will print an object
The Match object has properties and methods used to retrieve information about the search, and
the result:
.span() returns a tuple containing the start-, and end positions of the match.
.string returns the string passed into the function
.group() returns the part of the string where there was a match
Example
Print the position (start- and end-position) of the first match occurrence.
The regular expression looks for any word that starts with an upper case "S":
import re
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print (x.span())
Example
Print the string passed into the function:
import re
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print (x.string)
Example
Print the part of the string where there was a match.
The regular expression looks for any word that starts with an upper case "S":
import re
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print (x.group())
Note: If there is no match, the value None will be returned, instead of the Match Object.
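Because search() can return None, calling .group() or .span() directly on the result may fail. A minimal sketch of checking for None first, assuming a pattern that does not occur in the sample text:

```python
import re

txt = "The rain in Spain"

# No word in txt starts with an upper case "P", so search() returns None
x = re.search(r"\bP\w+", txt)

if x is None:
    result = "No match"
else:
    result = x.group()

print(result)  # No match
```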
36.Python PIP
What is PIP?
PIP is a package manager for Python packages, or modules if you like.
Note: If you have Python version 3.4 or later, PIP is included by default.
What is a Package?
A package contains all the files you need for a module.
Modules are Python code libraries you can include in your project.
Install PIP
If you do not have PIP installed, you can download and install it from this page:
https://pypi.org/project/pip/
Download a Package
Downloading a package is very easy.
Open the command line interface and tell PIP to download the package you want.
Navigate your command line to the location of Python's script directory, and type the following:
Example
Download a package named "camelcase":
C:\Users\Your Name\AppData\Local\Programs\Python\Python36-32\Scripts>pip install camelcase
Now you have downloaded and installed your first package!
Using a Package
Once the package is installed, it is ready to use.
Import the "camelcase" package into your project.
Example
Import and use "camelcase":
import camelcase
c = camelcase.CamelCase()
txt = "hello world"
print (c.hump(txt))
Find Packages
Find more packages at https://pypi.org/ .
Remove a Package
Use the uninstall command to remove a package:
Example
Uninstall the package named "camelcase":
C:\Users\Your Name\AppData\Local\Programs\Python\Python36-32\Scripts>pip uninstall camelcase
The PIP Package Manager will ask you to confirm that you want to remove the camelcase
package:
Uninstalling camelcase-0.2:
Would remove:
c:\users\Your Name\appdata\local\programs\python\python36-32\lib\site-packages\camelcase-0.2-py3.6.egg-info
c:\users\Your Name\appdata\local\programs\python\python36-32\lib\site-packages\camelcase\*
Proceed (y/n)?
Press y and the package will be removed.
List Packages
Use the list command to list all the packages installed on your system:
Example
List installed packages:
C:\Users\Your Name\AppData\Local\Programs\Python\Python36-32\Scripts>pip list
Result:
Package Version
-----------------------
camelcase 0.2
mysql-connector 2.1.6
pip 18.1
pymongo 3.6.1
setuptools 39.0.1
37.Python Try Except
Exception Handling
When an error occurs, or exception as we call it, Python will normally stop and generate an error
message.
These exceptions can be handled using the try statement:
Examples
The try block will generate an exception, because x is not defined:
try:
    print(x)
except:
    print("An exception occurred")
Since the try block raises an error, the except block will be executed.
Without the try block, the program will crash and raise an error:
This statement will raise an error, because x is not defined:
print (x)
Many Exceptions
You can define as many exception blocks as you want, e.g. if you want to execute a special block
of code for a special kind of error:
Example
Print one message if the try block raises a NameError and another for other errors:
try:
    print(x)
except NameError:
    print("Variable x is not defined")
except:
    print("Something else went wrong")
Else
You can use the else keyword to define a block of code to be executed if no errors were raised:
In this example, the try block does not generate any error:
try:
    print("Hello")
except:
    print("Something went wrong")
else:
    print("Nothing went wrong")
Finally
The finally block, if specified, will be executed regardless of whether the try block raises an
error or not.
Example
try:
    print(x)
except:
    print("Something went wrong")
finally:
    print("The 'try except' is finished")
This can be useful to close objects and clean up resources:
Example
Try to open and write to a file that is not writable:
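try:
    f = open("demofile.txt")
    try:
        f.write("Lorum Ipsum")
    except:
        print("Something went wrong when writing to the file")
    finally:
        # Only reached when open() succeeded, so f always exists here
        f.close()
except:
    print("Something went wrong when opening the file")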
The program can continue, without leaving the file object open.
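The same cleanup can also be written with Python's with statement, which closes the file automatically when the block ends, even if an error is raised inside it. The sketch below is one possible variant, not the only approach. It creates a temporary file (a hypothetical stand-in for demofile.txt) so that it is self-contained.

```python
import io
import os
import tempfile

# Create a throwaway file so the sketch does not depend on demofile.txt
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

message = "File written"
try:
    # "r" opens the file read-only, so write() below fails
    with open(path, "r") as f:
        f.write("Lorum Ipsum")
except io.UnsupportedOperation:
    message = "Something went wrong when writing to the file"

# The with block closed the file even though write() raised an error
was_closed = f.closed
os.remove(path)

print(message)
print(was_closed)  # True
```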
Raise an exception
As a Python developer you can choose to throw an exception if a condition occurs.
To throw (or raise) an exception, use the raise keyword.
Example
Raise an error and stop the program if x is lower than 0:
x = -1
if x < 0:
    raise Exception("Sorry, no numbers below zero")
The raise keyword is used to raise an exception.
You can define what kind of error to raise, and the text to print to the user.
Raise a TypeError if x is not an integer:
x = "hello"
if not type(x) is int:
    raise TypeError("Only integers are allowed")
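Raised exceptions can be caught again with try and except. The sketch below combines both checks from this section in a hypothetical helper function, check_non_negative(), and collects either the accepted value or the error text. The helper name and the choice of ValueError for negative numbers are assumptions made for this illustration.

```python
def check_non_negative(x):
    # Hypothetical helper: accept only integers that are zero or larger
    if not isinstance(x, int):
        raise TypeError("Only integers are allowed")
    if x < 0:
        raise ValueError("Sorry, no numbers below zero")
    return x

results = []
for value in [5, -1, "hello"]:
    try:
        results.append(check_non_negative(value))
    except (TypeError, ValueError) as err:
        # err carries the text that was passed to the exception
        results.append(str(err))

print(results)
# [5, 'Sorry, no numbers below zero', 'Only integers are allowed']
```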
38.Python String Formatting
String format()
The format() method allows you to format selected parts of a string.
Sometimes there are parts of a text that you do not control, maybe they come from a database, or
user input?
To control such values, add placeholders (curly brackets {}) in the text, and run the values
through the format() method:
Example
Add a placeholder where you want to display the price:
price = 49
txt = "The price is {} dollars"
print (txt. format (price))
You can add parameters inside the curly brackets to specify how to convert the value:
Example
Format the price to be displayed as a number with two decimals:
txt = "The price is {:.2f} dollars"
print(txt.format(49))
Multiple Values
If you want to use more values, just add more values to the format() method:
print (txt. format (price, itemno, count))
And add more placeholders:
Example
quantity = 3
itemno = 567
price = 49
myorder = "I want {} pieces of item number {} for {:.2f} dollars."
print (myorder. format (quantity, itemno, price))
Index Numbers
You can use index numbers (a number inside the curly brackets {0}) to be sure the values are
placed in the correct placeholders:
Example
quantity = 3
itemno = 567
price = 49
myorder = "I want {0} pieces of item number {1} for {2:.2f} dollars."
print (myorder. format (quantity, itemno, price))
Also, if you want to refer to the same value more than once, use the index number:
Example
age = 36
name = "John"
txt = "His name is {1}. {1} is {0} years old."
print (txt. format (age, name))
Named Indexes
You can also use named indexes by entering a name inside the curly brackets {carname}, but
then you must use names when you pass the parameter values txt.format(carname = "Ford"):
Example
myorder = "I have a {carname}, it is a {model}."
print (myorder. format (carname = "Ford" , model = "Mustang" ))
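The three placeholder styles from this chapter can be compared side by side. This is a small sketch reusing the values from the examples above; the variable names by_position, by_index, and by_name are chosen only for this illustration.

```python
quantity = 3
itemno = 567
price = 49

# Empty placeholders are filled in the order the values are passed
by_position = "I want {} pieces of item number {} for {:.2f} dollars.".format(
    quantity, itemno, price
)

# Index numbers allow a value to be reused or reordered
by_index = "His name is {1}. {1} is {0} years old.".format(36, "John")

# Named placeholders are filled from keyword arguments
by_name = "I have a {carname}, it is a {model}.".format(carname="Ford", model="Mustang")

print(by_position)  # I want 3 pieces of item number 567 for 49.00 dollars.
print(by_index)     # His name is John. John is 36 years old.
print(by_name)      # I have a Ford, it is a Mustang.
```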
39.Python File Handling
File handling is an important part of any web application.
Python has several functions for creating, reading, updating, and deleting files.
File Handling
The key function for working with files in Python is the open() function.
The open() function takes two parameters: filename and mode.
There are four different methods (modes) for opening a file:
"r" - Read - Default value. Opens a file for reading, error if the file does not exist
"a" - Append - Opens a file for appending, creates the file if it does not exist
"w" - Write - Opens a file for writing, creates the file if it does not exist
"x" - Create - Creates the specified file, returns an error if the file exists
In addition, you can specify whether the file should be handled in binary or text mode:
"t" - Text - Default value. Text mode
"b" - Binary - Binary mode (e.g. images)
Syntax
To open a file for reading it is enough to specify the name of the file:
f = open ( "demofile.txt" )
The code above is the same as:
f = open ( "demofile.txt" , "rt" )
Because "r" for read, and "t" for text are the default values, you do not need to specify them.
Note: Make sure the file exists, or else you will get an error.
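The following sketch walks through the four modes on a temporary file, so no real file is touched. The folder and file path are hypothetical, created only for this illustration. It uses the with statement so that each file is closed automatically. Note that "w" replaces any content the file already has.

```python
import os
import tempfile

# Work in a temporary folder (hypothetical location for this sketch)
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demofile.txt")

with open(path, "x") as f:     # "x" creates the file; it must not exist yet
    f.write("first\n")

with open(path, "a") as f:     # "a" appends to the end of the file
    f.write("second\n")

with open(path, "r") as f:     # "r" reads; the file must exist
    after_append = f.read()

with open(path, "w") as f:     # "w" replaces any existing content
    f.write("overwritten\n")

with open(path, "rt") as f:    # same as "r": text mode is the default
    after_write = f.read()

try:
    open(path, "x")            # fails because the file already exists
    create_failed = False
except FileExistsError:
    create_failed = True

os.remove(path)
os.rmdir(folder)

print(repr(after_append))  # 'first\nsecond\n'
print(repr(after_write))   # 'overwritten\n'
print(create_failed)       # True
```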
Read Lines
You can return one line by using the readline() method:
Example
Read one line of the file:
f = open ( "demofile.txt" , "r" )
print (f.readline())
By calling readline() two times, you can read the two first lines:
Example
Read two lines of the file:
f = open ( "demofile.txt" , "r" )
print (f.readline())
print (f.readline())
By looping through the lines of the file, you can read the whole file, line by line:
Example
Loop through the file line by line:
f = open ( "demofile.txt" , "r" )
for x in f:
    print(x)
Close Files
It is a good practice to always close the file when you are done with it.
Example
Close the file when you are finished with it:
f = open ( "demofile.txt" , "r" )
print (f.readline())
f.close()
Note: You should always close your files. In some cases, due to buffering, changes made to a file
may not show until you close the file.
Delete Folder
To delete an entire folder, use the os.rmdir() method:
Example
Remove the folder "myfolder":
import os
os.rmdir( "myfolder" )
Note: You can only remove empty folders.
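A minimal sketch of that restriction, using a temporary location so nothing real is deleted. The folder and file names are hypothetical. The first os.rmdir() call fails with an OSError because the folder still contains a file.

```python
import os
import tempfile

base = tempfile.mkdtemp()
folder = os.path.join(base, "myfolder")
os.mkdir(folder)

file_path = os.path.join(folder, "demofile.txt")
with open(file_path, "w") as f:
    f.write("temporary")

try:
    os.rmdir(folder)           # the folder still contains a file
    removed_while_full = True
except OSError:
    removed_while_full = False

os.remove(file_path)           # empty the folder first
os.rmdir(folder)               # now the call succeeds
folder_exists = os.path.exists(folder)
os.rmdir(base)

print(removed_while_full)  # False
print(folder_exists)       # False
```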
40.NumPy Tutorial
NumPy is a Python library.
NumPy is used for working with arrays.
NumPy is short for "Numerical Python".
Learning by Reading
We have created 43 tutorial pages for you to learn more about NumPy.
The tutorial starts with a basic introduction and ends with creating and plotting random data sets
and working with NumPy functions:
Basic
● Introduction
● Getting Started
● Creating Arrays
● Array Indexing
● Array Slicing
● Data Types
● Copy vs View
● Array Shape
● Array Reshape
● Array Iterating
● Array Join
● Array Split
● Array Search
● Array Sort
● Array Filter
Random
● Random Intro
● Data Distribution
● Random Permutation
● Seaborn Module
● Normal Dist.
● Binomial Dist.
● Poisson Dist.
● Uniform Dist.
● Logistic Dist.
● Multinomial Dist.
● Exponential Dist.
● Chi Square Dist.
● Rayleigh Dist.
● Pareto Dist.
● Zipf Dist.
Ufunc
● ufunc Intro
● Create Function
● Simple Arithmetic
● Rounding Decimals
● Logs
● Summations
● Products
● Differences
● Finding LCM
● Finding GCD
● Trigonometric
● Hyperbolic
● Set Operations
NumPy Introduction
What is NumPy?
NumPy is a Python library used for working with arrays.
It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.
NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it
freely.
NumPy stands for Numerical Python.
Import NumPy
Once NumPy is installed, import it in your applications by adding the import keyword:
import numpy
Now NumPy is imported and ready to use.
Example
import numpy
arr = numpy.array([ 1 , 2 , 3 , 4 , 5 ])
print (arr)
NumPy as np
NumPy is usually imported under the np alias.
alias: In Python, an alias is an alternate name for referring to the same thing.
Create an alias with the as keyword while importing:
import numpy as np
Now the NumPy package can be referred to as np instead of numpy.
Example
import numpy as np
arr = np.array([ 1 , 2 , 3 , 4 , 5 ])
print (arr)
Dimensions in Arrays
A dimension in arrays is one level of array depth (nested arrays).
nested array: an array that has arrays as its elements.
0-D Arrays
0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.
Example
Create a 0-D array with value 42
import numpy as np
arr = np.array( 42 )
print (arr)
1-D Arrays
An array that has 0-D arrays as its elements is called a uni-dimensional or 1-D array.
These are the most common and basic arrays.
Example
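A minimal sketch of a 1-D array, assuming the illustrative values 1 to 5:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)  # [1 2 3 4 5]
```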
2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array.
These are often used to represent matrices or 2nd order tensors.
NumPy has a whole submodule dedicated to linear algebra operations on matrices, called numpy.linalg
Example
Create a 2-D array containing two arrays with the values 1,2,3 and 4,5,6:
import numpy as np
arr = np.array([[ 1 , 2 , 3 ], [ 4 , 5 , 6 ]])
print (arr)
3-D arrays
An array that has 2-D arrays (matrices) as its elements is called a 3-D array.
These are often used to represent a 3rd order tensor.
Example
Create a 3-D array with two 2-D arrays, both containing two arrays with the values 1,2,3 and
4,5,6:
import numpy as np
arr = np.array([[[ 1 , 2 , 3 ], [ 4 , 5 , 6 ]], [[ 1 , 2 , 3 ], [ 4 , 5 , 6 ]]])
print (arr)
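The number of dimensions of any array can be checked with its ndim attribute, and its size along each dimension with shape. A short sketch using the four kinds of array described above (the variable names a to d are chosen only for this illustration):

```python
import numpy as np

a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

dims = [a.ndim, b.ndim, c.ndim, d.ndim]

print(dims)     # [0, 1, 2, 3]
print(d.shape)  # (2, 2, 3)
```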
Get third and fourth elements from the following array and add them.
import numpy as np
arr = np.array([ 1 , 2 , 3 , 4 ])
print (arr[ 2 ] + arr[ 3 ])
Access the third element of the second array of the first array:
import numpy as np
arr = np.array([[[ 1 , 2 , 3 ], [ 4 , 5 , 6 ]], [[ 7 , 8 , 9 ], [ 10 , 11 , 12 ]]])
print (arr[ 0 , 1 , 2 ])
Example Explained
arr[0, 1, 2] prints the value 6.
And this is why:
The first number represents the first dimension, which contains two arrays:
[[1, 2, 3], [4, 5, 6]]
and:
[[7, 8, 9], [10, 11, 12]]
Since we selected 0, we are left with the first array:
[[1, 2, 3], [4, 5, 6]]
The second number represents the second dimension, which also contains two arrays:
[1, 2, 3]
and:
[4, 5, 6]
Since we selected 1, we are left with the second array:
[4, 5, 6]
The third number represents the third dimension, which contains three values:
4
5
6
Since we selected 2, we end up with the third value:
6
Negative Indexing
Use negative indexing to access an array from the end.
Example
Print the last element from the 2nd dim:
import numpy as np
arr = np.array([[ 1 , 2 , 3 , 4 , 5 ], [ 6 , 7 , 8 , 9 , 10 ]])
print ( 'Last element from 2nd dim: ' , arr[ 1 , - 1 ])
Negative Slicing
Use the minus operator to refer to an index from the end:
Example
Slice from the index 3 from the end to index 1 from the end:
import numpy as np
arr = np.array([ 1 , 2 , 3 , 4 , 5 , 6 , 7 ])
print (arr[- 3 :- 1 ])
STEP
Use the step value to determine the step of the slicing:
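A minimal sketch of step slicing, using the form arr[start:end:step]; the array values are illustrative:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

every_other_in_range = arr[1:5:2]  # indexes 1 and 3
every_other = arr[::2]             # indexes 0, 2, 4, 6

print(every_other_in_range)  # [2 4]
print(every_other)           # [1 3 5 7]
```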
Example
Change data type from float to integer by using 'i' as parameter value:
import numpy as np
arr = np.array([ 1.1 , 2.1 , 3.1 ])
newarr = arr.astype( 'i' )
print (newarr)
print (newarr.dtype)
Example
Change data type from float to integer by using int as parameter value:
import numpy as np
arr = np.array([ 1.1 , 2.1 , 3.1 ])
newarr = arr.astype( int )
print (newarr)
print (newarr.dtype)
Example
Change data type from integer to boolean:
import numpy as np
arr = np.array([ 1 , 0 , 3 ])
newarr = arr.astype( bool )
print (newarr)
print (newarr.dtype)
COPY:
Example
Make a copy, change the original array, and display both arrays:
import numpy as np
arr = np.array([ 1 , 2 , 3 , 4 , 5 ])
x = arr.copy()
arr[ 0 ] = 42
print (arr)
print (x)
The copy SHOULD NOT be affected by the changes made to the original array.
VIEW:
Example
Make a view, change the original array, and display both arrays:
import numpy as np
arr = np.array([ 1 , 2 , 3 , 4 , 5 ])
x = arr.view()
arr[ 0 ] = 42
print (arr)
print (x)
The view SHOULD be affected by the changes made to the original array.
Make Changes in the VIEW:
Example
Make a view, change the view, and display both arrays:
import numpy as np
arr = np.array([ 1 , 2 , 3 , 4 , 5 ])
x = arr.view()
x[ 0 ] = 31
print (arr)
print (x)
The original array SHOULD be affected by the changes made to the view.
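Whether an array owns its data can be checked with its base attribute: a copy has base None, while a view refers back to the original array. A short sketch, with variable names chosen only for this illustration:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()
y = arr.view()

print(x.base)         # None: the copy owns its data
print(y.base is arr)  # True: the view uses the data of arr
```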
Unknown Dimension
You are allowed to have one "unknown" dimension.
Meaning that you do not have to specify an exact number for one of the dimensions in the
reshape method.
Pass -1 as the value, and NumPy will calculate this number for you.
Example
Convert 1D array with 8 elements to 3D array with 2x2 elements:
import numpy as np
arr = np.array([ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 ])
newarr = arr.reshape( 2 , 2 , - 1 )
print (newarr)
Note: We can not pass -1 to more than one dimension.
Note: NumPy has many functions for changing the shape of arrays, such as flatten and ravel, and
for rearranging the elements, such as rot90, flip, fliplr, and flipud. These fall under the
Intermediate to Advanced section of NumPy.
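As a brief sketch of a few of these functions, the example below flattens a 2-D array in two ways and reverses its elements with flip(). The sample array is an assumption made for this illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_reshape = arr.reshape(-1)  # -1 lets NumPy work out the length
flat_copy = arr.flatten()       # flatten() always returns a copy
flipped = np.flip(arr)          # reverses the order along every axis

print(flat_reshape)  # [1 2 3 4 5 6]
print(flat_copy)     # [1 2 3 4 5 6]
print(flipped)
```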
Search Sorted
There is a method called searchsorted() which performs a binary search in the array, and returns
the index where the specified value would be inserted to maintain the sort order.
The searchsorted() method is assumed to be used on sorted arrays.
Example
Find the indexes where the value 7 should be inserted:
import numpy as np
arr = np.array([ 6 , 7 , 8 , 9 ])
x = np.searchsorted(arr, 7 )
print (x)
Example explained: The number 7 should be inserted on index 1 to maintain the sort order.
The method starts the search from the left and returns the first index where the number 7 is no
longer larger than the next value.
Search From the Right Side
By default the leftmost index is returned, but we can give side='right' to return the rightmost
index instead.
Example
Find the indexes where the value 7 should be inserted, starting from the right:
import numpy as np
arr = np.array([ 6 , 7 , 8 , 9 ])
x = np.searchsorted(arr, 7 , side= 'right' )
print (x)
Example explained: The number 7 should be inserted on index 2 to maintain the sort order.
The method starts the search from the right and returns the first index where the number 7 is no
longer less than the next value.
Multiple Values
To search for more than one value, use an array with the specified values.
Example
Find the indexes where the values 2, 4, and 6 should be inserted:
import numpy as np
arr = np.array([ 1 , 3 , 5 , 7 ])
x = np.searchsorted(arr, [ 2 , 4 , 6 ])
print (x)
The return value is an array: [1 2 3] containing the three indexes where 2, 4, 6 would be inserted
in the original array to maintain the order.
Random Distribution
A random distribution is a set of random numbers that follow a certain probability density
function .
Probability Density Function: A function that describes a continuous probability. i.e.
probability of all values in an array.
We can generate random numbers based on defined probabilities using the choice() method of
the random module.
The choice() method allows us to specify the probability for each value.
The probability is set by a number between 0 and 1, where 0 means that the value will never
occur and 1 means that the value will always occur.
Example
Generate a 1-D array containing 100 values, where each value has to be 3, 5, 7 or 9.
The probability for the value to be 3 is set to be 0.1
The probability for the value to be 5 is set to be 0.3
The probability for the value to be 7 is set to be 0.6
The probability for the value to be 9 is set to be 0
from numpy import random
x = random.choice([ 3 , 5 , 7 , 9 ], p=[ 0.1 , 0.3 , 0.6 , 0.0 ], size=( 100 ))
print (x)
The sum of all probability numbers should be 1.
Even if you run the example above 100 times, the value 9 will never occur.
You can return arrays of any shape and size by specifying the shape in the size parameter.
Example
Same example as above, but return a 2-D array with 3 rows, each containing 5 values.
from numpy import random
x = random.choice([ 3 , 5 , 7 , 9 ], p=[ 0.1 , 0.3 , 0.6 , 0.0 ], size=( 3 , 5 ))
print (x)
Random Permutations
Random Permutations of Elements
A permutation refers to an arrangement of elements. e.g. [3, 2, 1] is a permutation of [1, 2, 3] and
vice-versa.
The NumPy Random module provides two methods for this: shuffle() and permutation().
Shuffling Arrays
Shuffle means changing the arrangement of elements in-place. i.e. in the array itself.
Example
Randomly shuffle elements of following array:
from numpy import random
import numpy as np
arr = np.array([ 1 , 2 , 3 , 4 , 5 ])
random.shuffle(arr)
print (arr)
The shuffle() method makes changes to the original array.
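The permutation() method, by contrast, returns a rearranged copy and leaves the original array unchanged. A minimal sketch:

```python
from numpy import random
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
shuffled = random.permutation(arr)

print(shuffled)  # a rearranged copy; the order varies from run to run
print(arr)       # [1 2 3 4 5]
```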
Seaborn is a library that uses Matplotlib underneath to plot graphs. It is used here to visualize
random distributions.
Install Seaborn.
If you have Python and PIP already installed on a system, install it using this command:
C:\Users\Your Name>pip install seaborn
If you use Jupyter, install Seaborn using this command:
!pip install seaborn
Distplots
Distplot stands for distribution plot. It takes an array as input and plots a curve corresponding
to the distribution of points in the array.
Import Matplotlib
Import the pyplot object of the Matplotlib module in your code using the following statement:
import matplotlib.pyplot as plt
Import Seaborn
Import the Seaborn module in your code using the following statement:
import seaborn as sns
Plotting a Distplot
Example
import matplotlib.pyplot as plt
import seaborn as sns
sns.distplot([ 0 , 1 , 2 , 3 , 4 , 5 ])
plt.show()
Note: The curve of a Normal Distribution is also known as the Bell Curve because of the bell-
shaped curve.
Binomial Distribution
The binomial distribution is a discrete distribution.
It describes the outcome of binary scenarios, e.g. the toss of a coin, which will be either heads or tails.
It has three parameters:
n - number of trials.
p - probability of success in each trial (e.g. 0.5 for heads in a coin toss).
size - The shape of the returned array.
Discrete Distribution: The distribution is defined at a separate set of events, e.g. a coin toss's
result is discrete as it can be only heads or tails, whereas the height of people is continuous as
it can be 170, 170.1, 170.11 and so on.
Example
Given 10 trials of a coin toss, generate 10 data points:
from numpy import random
x = random.binomial(n= 10 , p= 0.5 , size= 10 )
print (x)
Poisson Distribution
The Poisson distribution is a discrete distribution.
It estimates how many times an event can happen in a specified time, e.g. if someone eats twice a
day, what is the probability that they will eat three times?
It has two parameters:
lam - rate or known number of occurrences, e.g. 2 for the problem above.
size - The shape of the returned array.
Example
Generate a random 1x10 distribution for occurrence 2:
from numpy import random
x = random.poisson(lam= 2 , size= 10 )
print (x)
Difference Between Poisson and Binomial Distribution
The difference is very subtle: the binomial distribution is for discrete trials, whereas the
Poisson distribution is for continuous trials.
But for very large n and near-zero p, the binomial distribution is nearly identical to the Poisson
distribution, with n * p nearly equal to lam.
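This similarity can be checked numerically. The sketch below uses assumed illustrative parameters (n = 1000, p = 0.01, so n * p = 10 = lam) and compares the mean and variance of samples from both distributions. The exact numbers vary from run to run.

```python
from numpy import random

# Assumed illustrative parameters: n * p = 1000 * 0.01 = 10 = lam
binom = random.binomial(n=1000, p=0.01, size=10000)
poiss = random.poisson(lam=10, size=10000)

print(binom.mean(), binom.var())  # both close to 10
print(poiss.mean(), poiss.var())  # both close to 10
```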
Example
from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns
plt.show()
Uniform Distribution
Used to describe probability where every event has an equal chance of occurring.
E.g. generation of random numbers.
It has three parameters:
low - lower bound - default 0.0.
high - upper bound - default 1.0.
size - The shape of the returned array.
Example
Create a 2x3 uniform distribution sample:
from numpy import random
x = random.uniform(size=(2 , 3 ))
print (x)
Logistic Distribution
Logistic Distribution is used to describe growth.
Used extensively in machine learning in logistic regression, neural networks etc.
It has three parameters:
loc - mean, where the peak is. Default 0.
scale - standard deviation, the flatness of distribution. Default 1.
size - The shape of the returned array.
Example
Draw 2x3 samples from a logistic distribution with mean at 1 and stddev 2.0:
from numpy import random
x = random.logistic(loc= 1 , scale= 2 , size=( 2 , 3 ))
print (x)
The logistic and normal distributions are nearly identical, but the logistic distribution has more
area under the tails, i.e. it represents a higher possibility of an event occurring further away
from the mean.
For higher values of scale (standard deviation) the normal and logistic distributions are nearly
identical apart from the peak.
Example
from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns
plt.show()
Multinomial Distribution
Multinomial distribution is a generalization of binomial distribution.
It describes outcomes of multinomial scenarios, unlike binomial where each trial must have one of
only two outcomes, e.g. blood type of a population, dice roll outcome.
It has three parameters:
n - number of trials (e.g. 6 to roll a die six times).
pvals - list of probabilities of outcomes (e.g. [1/6, 1/6, 1/6, 1/6, 1/6, 1/6] for dice roll).
size - The shape of the returned array.
Example
Draw out a sample of six dice rolls:
from numpy import random
x = random.multinomial(n= 6 , pvals=[ 1 / 6 , 1 / 6 , 1 / 6 , 1 / 6 , 1 / 6 , 1 / 6 ])
print (x)
Note: Multinomial samples will NOT produce a single value! They will produce one count for each
entry in pvals.
Note: As they are generalizations of the binomial distribution, their visual representation and
similarity to the normal distribution are the same as those of multiple binomial distributions.
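A short sketch of what the returned values mean: each entry counts how often one face came up, so the entries always add up to n. The exact counts vary from run to run.

```python
from numpy import random

x = random.multinomial(n=6, pvals=[1/6] * 6)

print(x)        # six counts, one per face of the die
print(x.sum())  # 6: the counts add up to the number of rolls
```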
Exponential Distribution
Exponential distribution is used for describing time till the next event e.g. failure/success etc.
It has two parameters:
scale - inverse of rate (see lam in Poisson distribution); defaults to 1.0.
size - The shape of the returned array.
Example
Draw out a sample for exponential distribution with 2.0 scale with 2x3 size:
from numpy import random
x = random.exponential(scale= 2 , size=( 2 , 3 ))
print (x)
Rayleigh Distribution
Rayleigh distribution is used in signal processing.
It has two parameters:
scale - (standard deviation) decides how flat the distribution will be (default 1.0).
size - The shape of the returned array.
Example
Draw out a sample for rayleigh distribution with scale of 2 with size 2x3:
from numpy import random
x = random.rayleigh(scale= 2 , size=( 2 , 3 ))
print (x)
Pareto Distribution
A distribution following Pareto's law, i.e. an 80-20 distribution (20% of factors cause 80% of outcomes).
It has two parameters:
a - shape parameter.
size - The shape of the returned array.
Example
Draw out a sample for pareto distribution with shape of 2 with size 2x3:
from numpy import random
x = random.pareto(a= 2 , size=( 2 , 3 ))
print (x)
Zipf Distribution
Zipf distributions are used to sample data based on zipf's law.
Zipf's Law: In a collection, the nth most common term occurs 1/n times as often as the most common
term. E.g. the 5th most common word in English occurs nearly 1/5 as often as the most used word.
It has two parameters:
a - distribution parameter.
size - The shape of the returned array.
Example
Draw out a sample for zipf distribution with distribution parameter 2 with size 2x3:
from numpy import random
x = random.zipf(a= 2 , size=( 2 , 3 ))
print (x)
NumPy ufuncs
What are ufuncs?
ufunc stands for "Universal Function"; ufuncs are NumPy functions that operate on the
ndarray object.
What is Vectorization?
Converting iterative statements into a vector based operation is called vectorization.
It is faster as modern CPUs are optimized for such operations.
Add the Elements of Two Lists
list 1: [1, 2, 3, 4]
list 2: [4, 5, 6, 7]
One way of doing it is to iterate over both of the lists and then sum each element.
Example
Without ufunc, we can use Python's built-in zip() function:
x = [1, 2, 3, 4]
y = [4, 5, 6, 7]
z = []
for i, j in zip(x, y):
    z.append(i + j)
print(z)
NumPy has a ufunc for this, called add(x, y) that will produce the same result.
Example
With ufunc, we can use the add() function:
import numpy as np
x=[1,2,3,4]
y=[4,5,6,7]
z = np.add(x, y)
print (z)
Example
Create your own ufunc for addition:
import numpy as np
def myadd(x, y):
    return x + y
myadd = np.frompyfunc(myadd, 2 , 1 )
print (myadd([ 1 , 2 , 3 , 4 ], [ 5 , 6 , 7 , 8 ]))
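In the example above, frompyfunc() takes three arguments: the function, the number of input arguments (2), and the number of outputs (1). The sketch below, which uses the hypothetical name myadd_ufunc to keep the plain function and the ufunc apart, checks that the result really is a ufunc. Arrays returned by such ufuncs have the object data type.

```python
import numpy as np

def myadd(x, y):
    return x + y

# Arguments: the function, number of inputs, number of outputs
myadd_ufunc = np.frompyfunc(myadd, 2, 1)

result = myadd_ufunc([1, 2, 3, 4], [5, 6, 7, 8])

print(type(myadd_ufunc))  # <class 'numpy.ufunc'>
print(result)             # [6 8 10 12]
print(result.dtype)       # object
```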
Addition
The add() function sums the content of two arrays, and returns the results in a new array.
Example
Add the values in arr1 to the values in arr2:
import numpy as np
arr1 = np.array([ 10 , 11 , 12 , 13 , 14 , 15 ])
arr2 = np.array([ 20 , 21 , 22 , 23 , 24 , 25 ])
newarr = np.add(arr1, arr2)
print (newarr)
The example above will return [30 32 34 36 38 40], which are the sums 10+20, 11+21, 12+22,
etc.
Subtraction
The subtract() function subtracts the values of one array from the values of another array,
and returns the results in a new array.
Example
Subtract the values in arr2 from the values in arr1:
import numpy as np
arr1 = np.array([ 10 , 20 , 30 , 40 , 50 , 60 ])
arr2 = np.array([ 20 , 21 , 22 , 23 , 24 , 25 ])
newarr = np.subtract(arr1, arr2)
print (newarr)
The example above will return [-10 -1 8 17 26 35] which is the result of 10-20, 20-21, 30-22 etc.
Multiplication
The multiply() function multiplies the values from one array with the values from another array,
and returns the results in a new array.
Example
Multiply the values in arr1 with the values in arr2:
import numpy as np
arr1 = np.array([ 10 , 20 , 30 , 40 , 50 , 60 ])
arr2 = np.array([ 20 , 21 , 22 , 23 , 24 , 25 ])
newarr = np.multiply(arr1, arr2)
print (newarr)
The example above will return [200 420 660 920 1200 1500] which is the result of 10*20,
20*21, 30*22 etc.
Division
The divide() function divides the values from one array by the values from another array, and
returns the results in a new array.
Example
Divide the values in arr1 by the values in arr2:
import numpy as np
arr1 = np.array([ 10 , 20 , 30 , 40 , 50 , 60 ])
arr2 = np.array([ 3 , 5 , 10 , 8 , 2 , 33 ])
newarr = np.divide(arr1, arr2)
print (newarr)
The example above will return [3.33333333 4. 3. 5. 25. 1.81818182] which is the result of 10/3,
20/5, 30/10 etc.
Power
The power() function raises the values from the first array to the power of the values of the
second array, and returns the results in a new array.
Example
Raise the values in arr1 to the power of values in arr2:
import numpy as np
arr1 = np.array([ 10 , 20 , 30 , 40 , 50 , 60 ])
arr2 = np.array([ 3 , 5 , 6 , 8 , 2 , 33 ])
newarr = np.power(arr1, arr2)
print (newarr)
The example above will return [1000 3200000 729000000 6553600000000 2500 0] which is the
result of 10*10*10, 20*20*20*20*20, 30*30*30*30*30*30 etc.
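The 0 at the end of that result is an integer overflow: 60 raised to the power 33 does not fit in a 64-bit integer, so the value wraps around. A minimal sketch, assuming 64-bit integers are requested explicitly, of how converting the base to float avoids the wraparound at the cost of exactness:

```python
import numpy as np

base = np.array([60], dtype=np.int64)
exponent = np.array([33])

as_int = np.power(base, exponent)                  # wraps around in 64-bit integers
as_float = np.power(base.astype(float), exponent)  # floats cover a far larger range

print(as_int)    # [0]
print(as_float)  # roughly 4.8e+58
```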
Remainder
Both the mod() and the remainder() functions return the remainder of the values in the first array
corresponding to the values in the second array, and return the results in a new array.
Example
Return the remainders:
import numpy as np
arr1 = np.array([ 10 , 20 , 30 , 40 , 50 , 60 ])
arr2 = np.array([ 3 , 7 , 9 , 8 , 2 , 33 ])
newarr = np.mod(arr1, arr2)
print (newarr)
The example above will return [1 6 3 0 0 27] which is the remainder when you divide 10 with 3
(10%3), 20 with 7 (20%7) 30 with 9 (30%9) etc.
You get the same result when using the remainder() function:
Example
Return the remainders:
import numpy as np
arr1 = np.array([ 10 , 20 , 30 , 40 , 50 , 60 ])
arr2 = np.array([ 3 , 7 , 9 , 8 , 2 , 33 ])
newarr = np.remainder(arr1, arr2)
print (newarr)
Absolute Values
Both the absolute() and the abs() functions do the same absolute operation element-wise, but we
should use absolute() to avoid confusion with Python's built-in abs() function.
Example
Return the absolute values:
import numpy as np
arr = np.array([- 1 , - 2 , 1 , 2 , 3 , - 4 ])
newarr = np.absolute(arr)
print (newarr)
The example above will return [1 2 1 2 3 4].
Rounding Decimals
There are primarily five ways of rounding off decimals in NumPy:
● truncation
● fix
● rounding
● floor
● ceil
Truncation
Remove the decimals, and return the float number closest to zero. Use the trunc() and fix()
functions.
Example
Truncate elements of following array:
import numpy as np
arr = np.trunc([- 3.1666 , 3.6667 ])
print (arr)
Example
Same example, using fix():
import numpy as np
arr = np.fix([- 3.1666 , 3.6667 ])
print (arr)
Rounding
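The around() function rounds to the given number of decimals: the preceding digit is incremented
by 1 if the following digit is 5 or more, otherwise it is left unchanged (values exactly halfway
are rounded to the nearest even value).
E.g. 3.16666 rounded off to 1 decimal place is 3.2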
Example
Round off 3.1666 to 2 decimal places:
import numpy as np
arr = np.around( 3.1666 , 2 )
print (arr)
Floor
The floor() function rounds off decimal to the nearest lower integer.
E.g. floor of 3.166 is 3.
Example
Floor the elements of following array:
import numpy as np
arr = np.floor([- 3.1666 , 3.6667 ])
print (arr)
Note: The floor() function returns floats, as does the trunc() function.
Ceil
The ceil() function rounds off decimal to the nearest upper integer.
E.g. ceil of 3.166 is 4.
Example
Ceil the elements of following array:
import numpy as np
arr = np.ceil([- 3.1666 , 3.6667 ])
print (arr)
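To compare the five approaches side by side, the sketch below applies each function to the same two values used in the examples above. The variable names are chosen only for this illustration.

```python
import numpy as np

values = [-3.1666, 3.6667]

truncated = np.trunc(values)    # drop the decimals, towards zero
fixed = np.fix(values)          # same result as trunc()
rounded = np.around(values, 1)  # round to 1 decimal place
floored = np.floor(values)      # nearest lower integer
ceiled = np.ceil(values)        # nearest upper integer

print(truncated)  # [-3.  3.]
print(fixed)      # [-3.  3.]
print(rounded)    # [-3.2  3.7]
print(floored)    # [-4.  3.]
print(ceiled)     # [-3.  4.]
```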
NumPy Logs
Logs
NumPy provides functions to perform log at the base 2, e and 10.
We will also explore how we can take log for any base by creating a custom ufunc.
The log functions place -inf in the elements where the input is 0, and nan where the input is
negative, because the log can not be computed there.
Log at Base 2
Use the log2() function to perform log at the base 2.
Example
Find log at base 2 of all elements of following array:
import numpy as np
arr = np.arange( 1 , 10 )
print (np.log2(arr))
Note: The arange(1, 10) function returns an array with integers starting from 1 (included) to 10
(not included).
Log at Base 10
Use the log10() function to perform log at the base 10.
Example
Find log at base 10 of all elements of following array:
import numpy as np
arr = np.arange( 1 , 10 )
print (np.log10(arr))
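The introduction to this chapter mentions taking the log at any base. As a minimal sketch, and not the only way to do it, the block below shows two possible approaches: wrapping Python's math.log() in a custom ufunc with frompyfunc(), and applying the change-of-base rule log_b(x) = log(x) / log(b). The names log_any_base, via_ufunc and via_change_of_base are our own.

```python
from math import log
import numpy as np

# Custom ufunc: math.log(value, base) takes 2 inputs and returns 1 output
log_any_base = np.frompyfunc(log, 2, 1)

arr = np.array([1, 8, 64])

# frompyfunc() returns an object array, so convert it to float
via_ufunc = log_any_base(arr, 8).astype(float)

# Change-of-base rule using NumPy's natural log
via_change_of_base = np.log(arr) / np.log(8)

print(via_ufunc)
print(via_change_of_base)
```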
NumPy Summations
Summations
What is the difference between summation and addition?
Addition is done between two arguments whereas summation happens over n elements.
Example
Add the values in arr1 to the values in arr2:
import numpy as np
arr1 = np.array([ 1 , 2 , 3 ])
arr2 = np.array([ 1 , 2 , 3 ])
newarr = np.add(arr1, arr2)
print (newarr)
Returns: [2 4 6]
Example
Sum the values in arr1 and the values in arr2:
import numpy as np
arr1 = np.array([ 1 , 2 , 3 ])
arr2 = np.array([ 1 , 2 , 3 ])
newarr = np. sum ([arr1, arr2])
print (newarr)
Returns: 12
Summation Over an Axis
If you specify axis=1, NumPy will sum the numbers in each array.
Example
Perform summation in the following array over 1st axis:
import numpy as np
arr1 = np.array([ 1 , 2 , 3 ])
arr2 = np.array([ 1 , 2 , 3 ])
newarr = np. sum ([arr1, arr2], axis= 1 )
print (newarr)
Returns: [6 6]
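To make the effect of the axis argument easier to see, this short sketch sums the same two rows without an axis, over axis 0 and over axis 1. The variable names are illustrative.

```python
import numpy as np

stacked = np.array([[1, 2, 3],
                    [1, 2, 3]])

total = np.sum(stacked)               # every element added together
per_column = np.sum(stacked, axis=0)  # adds the rows together, column by column
per_row = np.sum(stacked, axis=1)     # adds the numbers inside each row

print(total)
print(per_column)
print(per_row)
```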
Cumulative Sum
Cumulative sum means partially adding the elements in an array.
E.g. The partial sum of [1, 2, 3, 4] would be [1, 1+2, 1+2+3, 1+2+3+4] = [1, 3, 6, 10].
Perform partial sum with the cumsum() function.
Example
Perform cumulative summation in the following array:
import numpy as np
arr = np.array([ 1 , 2 , 3 ])
newarr = np.cumsum(arr)
print (newarr)
Returns: [1 3 6]
NumPy Products
Products
To find the product of the elements in an array, use the prod() function.
Example
Find the product of the elements of this array:
import numpy as np
arr = np.array([ 1 , 2 , 3 , 4 ])
x = np.prod(arr)
print (x)
Returns: 24 because 1*2*3*4 = 24
Example
Find the product of the elements of two arrays:
import numpy as np
arr1 = np.array([ 1 , 2 , 3 , 4 ])
arr2 = np.array([ 5 , 6 , 7 , 8 ])
x = np.prod([arr1, arr2])
print (x)
Returns: 40320 because 1*2*3*4*5*6*7*8 = 40320
Cumulative Product
Cumulative product means taking the product partially.
E.g. The partial product of [1, 2, 3, 4] is [1, 1*2, 1*2*3, 1*2*3*4] = [1, 2, 6, 24]
Perform partial product with the cumprod() function.
Example
Take cumulative product of all elements for following array:
import numpy as np
arr = np.array([ 5 , 6 , 7 , 8 ])
newarr = np.cumprod(arr)
print (newarr)
Returns: [5 30 210 1680]
NumPy Differences
Differences
A discrete difference means subtracting two successive elements.
E.g. for [1, 2, 3, 4], the discrete difference would be [2-1, 3-2, 4-3] = [1, 1, 1]
To find the discrete difference, use the diff() function.
Example
Compute discrete difference of the following array:
import numpy as np
arr = np.array([ 10 , 15 , 25 , 5 ])
newarr = np.diff(arr)
print (newarr)
Returns: [5 10 -20] because 15-10=5, 25-15=10, and 5-25=-20
We can perform this operation repeatedly by giving parameter n.
E.g. for [1, 2, 3, 4], the discrete difference with n = 2 would be [2-1, 3-2, 4-3] = [1, 1, 1] , then,
since n=2, we will do it once more, with the new result: [1-1, 1-1] = [0, 0]
Example
Compute discrete difference of the following array twice:
import numpy as np
arr = np.array([ 10 , 15 , 25 , 5 ])
newarr = np.diff(arr, n= 2 )
print (newarr)
Returns: [5 -30] because: 15-10=5, 25-15=10, and 5-25=-20 AND 10-5=5 and -20-10=-30
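As a quick check of what the n parameter does, the sketch below applies diff() twice by hand and compares the outcome with a single call using n=2. The variable names are our own.

```python
import numpy as np

arr = np.array([10, 15, 25, 5])

once = np.diff(arr)           # first difference
twice_manual = np.diff(once)  # difference of the first difference
twice = np.diff(arr, n=2)     # same thing in one call

print(once)
print(twice_manual)
print(twice)
```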
Radians to Degrees
Example
Convert all of the values in following array arr to degrees:
import numpy as np
arr = np.array([np.pi/ 2 , np.pi, 1.5 *np.pi, 2 *np.pi])
x = np.rad2deg(arr)
print (x)
Finding Angles
Finding angles from values of sine, cos, tan. E.g. sin, cos and tan inverse (arcsin, arccos, arctan).
NumPy provides ufuncs arcsin(), arccos() and arctan() that produce radian values for
corresponding sin, cos and tan values given.
Example
Find the angle (in radians) whose sine is 1.0:
import numpy as np
x = np.arcsin( 1.0 )
print (x)
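Because arcsin() returns radians, it is often combined with rad2deg() from the earlier section. The following sketch, using made-up sine values, finds the angles for a whole array and converts them to degrees.

```python
import numpy as np

sines = np.array([-1.0, 0.0, 0.5, 1.0])

angles_rad = np.arcsin(sines)         # angles in radians
angles_deg = np.rad2deg(angles_rad)   # the same angles in degrees

print(angles_rad)
print(angles_deg)
```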
Hypotenuse
Finding the hypotenuse using the Pythagorean theorem in NumPy.
NumPy provides the hypot() function that takes the base and perpendicular values and produces
the hypotenuse based on the Pythagorean theorem.
Example
Find the hypotenuse for 3 base and 4 perpendicular:
import numpy as np
base = 3
perp = 4
x = np.hypot(base, perp)
print (x)
Finding Angles
Finding angles from values of hyperbolic sine, cos, tan. E.g. sinh, cosh and tanh inverse (arcsinh,
arccosh, arctanh).
Numpy provides ufuncs arcsinh(), arccosh() and arctanh() that produce radian values for
corresponding sinh, cosh and tanh values given.
Example
Find the inverse hyperbolic sine of 1.0:
import numpy as np
x = np.arcsinh( 1.0 )
print (x)
Finding Union
To find the unique values of two arrays, use the union1d() method.
Example
Find union of the following two set arrays:
import numpy as np
arr1 = np.array([ 1 , 2 , 3 , 4 ])
arr2 = np.array([ 3 , 4 , 5 , 6 ])
newarr = np.union1d(arr1, arr2)
print (newarr)
Finding Intersection
To find only the values that are present in both arrays, use the intersect1d() method.
Example
Find intersection of the following two set arrays:
import numpy as np
arr1 = np.array([ 1 , 2 , 3 , 4 ])
arr2 = np.array([ 3 , 4 , 5 , 6 ])
newarr = np.intersect1d(arr1, arr2, assume_unique= True )
print (newarr)
Note: the intersect1d() method takes an optional argument assume_unique, which if set to True
can speed up computation. It should always be set to True when dealing with sets.
Finding Difference
To find only the values in the first set that are NOT present in the second set, use the setdiff1d()
method.
Example
Find the difference of set1 from set2:
import numpy as np
set1 = np.array([ 1 , 2 , 3 , 4 ])
set2 = np.array([ 3 , 4 , 5 , 6 ])
newarr = np.setdiff1d(set1, set2, assume_unique= True )
print (newarr)
Note: the setdiff1d() method takes an optional argument assume_unique, which if set to True can
speed up computation. It should always be set to True when dealing with sets.
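The assume_unique argument is only safe when the inputs really contain no repeated values. As a hedged sketch with made-up data, the block below first removes duplicates with unique() and then applies the set operations; the variable names are illustrative.

```python
import numpy as np

raw = np.array([1, 1, 2, 3, 3, 4])   # not a set yet: contains duplicates
set1 = np.unique(raw)                # sorted array of unique values
set2 = np.array([3, 4, 5, 6])

both = np.union1d(set1, set2)
common = np.intersect1d(set1, set2, assume_unique=True)
only_in_first = np.setdiff1d(set1, set2, assume_unique=True)

print(set1)
print(both)
print(common)
print(only_in_first)
```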
41.Pandas Tutorial
Pandas is a Python library.
Pandas is used to analyze data.
Learning by Reading
We have created 14 tutorial pages for you to learn more about Pandas.
They start with a basic introduction and end with cleaning and plotting data:
Basic
● Introduction
● Getting Started
● Pandas Series
● DataFrames
● Read CSV
● Read JSON
● Analyze Data
Cleaning Data
● Clean Data
● Clean Empty Cells
● Clean Wrong Format
● Clean Wrong Data
● Remove Duplicates
Advanced
● Correlations
● Plotting
Pandas Introduction
What is Pandas?
Pandas is a Python library used for working with data sets.
It has functions for analyzing, cleaning, exploring, and manipulating data.
The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library was
created by Wes McKinney in 2008.
Import Pandas
Once Pandas is installed, import it in your applications by adding the import keyword:
import pandas
Now Pandas is imported and ready to use.
Example
import pandas
mydataset = {
  'cars' : [ "BMW" , "Volvo" , "Ford" ],
  'passings' : [ 3 , 7 , 2 ]
}
myvar = pandas.DataFrame(mydataset)
print (myvar)
Pandas as pd
Pandas is usually imported under the pd alias.
alias: In Python aliases are an alternate name for referring to the same thing.
Create an alias with the as keyword while importing:
import pandas as pd
Now the Pandas package can be referred to as pd instead of pandas.
Example
import pandas as pd
mydataset = {
'cars' : [ "BMW" , "Volvo" , "Ford" ],
'passings' : [ 3 , 7 , 2 ]
}
myvar = pd.DataFrame(mydataset)
print (myvar)
Checking Pandas Version
The version string is stored under the __version__ attribute.
Example
import pandas as pd
print (pd.__version__)
Pandas Series
What is a Series?
A Pandas Series is like a column in a table.
It is a one-dimensional array holding data of any type.
Example
Create a simple Pandas Series from a list:
import pandas as pd
a=[1,7,2]
myvar = pd.Series(a)
print (myvar)
Labels
If nothing else is specified, the values are labeled with their index number. First value has index
0, second value has index 1 etc.
This label can be used to access a specified value.
Example
Return the first value of the Series:
print (myvar[ 0 ])
Create Labels
With the index argument, you can name your own labels.
Example
Create your own labels:
import pandas as pd
a=[1,7,2]
myvar = pd.Series(a, index = ["x", "y", "z"])
print (myvar)
When you have created labels, you can access an item by referring to the label.
Example
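Return the value of "y":
print (myvar["y"])
Example
Create a simple Pandas Series from a dictionary:
import pandas as pd
calories = { "day1" : 420 , "day2" : 380 , "day3" : 390 }
myvar = pd.Series(calories)
print (myvar)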
Note: The keys of the dictionary become the labels.
To select only some of the items in the dictionary, use the index argument and specify only the
items you want to include in the Series.
Example
Create a Series using only data from "day1" and "day2":
import pandas as pd
calories = { "day1" : 420 , "day2" : 380 , "day3" : 390 }
myvar = pd.Series(calories, index = ["day1", "day2"])
print (myvar)
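The following sketch puts the two ideas together: it builds a Series from the dictionary, keeps only two of the keys with the index argument, and then reads one value by its label. The names selected and day2_value are our own.

```python
import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

# Keep only the items whose keys are listed in index
selected = pd.Series(calories, index=["day1", "day2"])

# Access a single value by its label
day2_value = selected["day2"]

print(selected)
print(day2_value)
```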
DataFrames
Datasets in Pandas are usually multi-dimensional tables, called DataFrames.
Series is like a column, a DataFrame is the whole table.
Example
Create a DataFrame from two Series:
import pandas as pd
data = {
"calories" : [ 420 , 380 , 390 ],
"duration" : [ 50 , 40 , 45 ]
}
myvar = pd.DataFrame(data)
print (myvar)
Pandas DataFrames
What is a DataFrame?
A Pandas DataFrame is a 2 dimensional data structure, like a 2 dimensional array, or a table with
rows and columns.
Example
Create a simple Pandas DataFrame:
import pandas as pd
data = {
  "calories" : [ 420 , 380 , 390 ],
  "duration" : [ 50 , 40 , 45 ]
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print (df)
Result
   calories  duration
0       420        50
1       380        40
2       390        45
Locate Row
As you can see from the result above, the DataFrame is like a table with rows and columns.
Pandas use the loc attribute to return one or more specified row(s)
Example
Return row 0:
print (df.loc[0])
Result
calories    420
duration     50
Name: 0, dtype: int64
Note: This example returns a Pandas Series.
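Example
Return row 0 and 1:
#use a list of indexes:
print (df.loc[[0, 1]])
Result
   calories  duration
0       420        50
1       380        40
Note: When using [], the result is a Pandas DataFrame.
Example
Add a list of names to give each row a name:
df = pd.DataFrame(data, index = ["day1", "day2", "day3"])
print (df)
Result
      calories  duration
day1       420        50
day2       380        40
day3       390        45
Example
Return the row named "day2":
print (df.loc["day2"])
Result
calories    380
duration     40
Name: day2, dtype: int64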
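The difference between the two forms of loc can be checked directly. This sketch, which reuses the calories and duration values from the examples above, shows that a single label gives a Series while a list of labels gives a DataFrame.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
}
df = pd.DataFrame(data)

row = df.loc[0]         # single label: one row as a Series
rows = df.loc[[0, 1]]   # list of labels: a DataFrame with those rows

print(type(row).__name__)
print(type(rows).__name__)
```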
Example
Load the CSV file into a DataFrame:
import pandas as pd
df = pd.read_csv( 'data.csv' )
print (df.to_string())
Tip: use to_string() to print the entire DataFrame.
If a DataFrame has many rows (more than the display.max_rows setting, which is 60 by default),
printing it shows only the first 5 rows and the last 5 rows:
Example
Print a reduced sample:
import pandas as pd
df = pd.read_csv( 'data.csv' )
print (df)
Dictionary as JSON
JSON = Python Dictionary
JSON objects have nearly the same format as Python dictionaries.
If your JSON code is not in a file, but in a Python Dictionary, you can load it into a DataFrame
directly:
Example
Load a Python Dictionary into a DataFrame:
import pandas as pd
data = { "Duration" :{ "0" : 60, "1" : 60 , "2" : 60 , "3" : 45 , "4" : 45 , "5" : 60 },
"Pulse" :{ "0" : 110 , "1" : 117 , "2" : 103 , "3" : 109 , "4" : 117 , "5" : 102 },
"Maxpulse" :{ "0" : 130 , "1" : 145 , "2" : 135 , "3" : 175 , "4" : 148 , "5" : 127 },
"Calories" :{ "0" : 409 , "1" : 479 , "2" : 340 , "3" : 282 , "4" : 406 , "5" : 300 }
}
df = pd.DataFrame(data)
print (df)
Result Explained
The result of the info() method tells us there are 169 rows and 4 columns:
RangeIndex: 169 entries, 0 to 168
Data columns (total 4 columns):
And the name of each column, with the data type:
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   Duration  169 non-null    int64
 1   Pulse     169 non-null    int64
 2   Maxpulse  169 non-null    int64
 3   Calories  164 non-null    float64
Null Values
The info() method also tells us how many non-null values are present in each column. In our
data set, 164 of the 169 values in the "Calories" column are non-null.
This means that 5 rows have no value at all in the "Calories" column, for whatever
reason.
Empty values, or Null values, can be bad when analyzing data, and you should consider
removing rows with empty values. This is a step towards what is called cleaning data.
The data set contains some empty cells ("Date" in row 22, and "Calories" in row 18 and 28).
The data set contains the wrong format ("Date" in row 26).
The data set contains wrong data ("Duration" in row 7).
The data set contains duplicates (row 11 and 12).
Remove Rows
One way to deal with empty cells is to remove rows that contain empty cells.
This is usually OK, since data sets can be very big, and removing a few rows will not have a big
impact on the result.
Example
Return a new DataFrame with no empty cells:
import pandas as pd
df = pd.read_csv( 'data.csv' )
new_df = df.dropna()
print (new_df.to_string())
In our cleaning examples we will be using a CSV file called 'dirtydata.csv'.
Note: By default, the dropna() method returns a new DataFrame, and will not change the
original.
If you want to change the original DataFrame, use the inplace = True argument:
Example
Remove all rows with NULL values:
import pandas as pd
df = pd.read_csv( 'data.csv' )
df.dropna(inplace = True )
print (df.to_string())
Note: Now, the dropna(inplace = True) will NOT return a new DataFrame, but it will remove all
rows containing NULL values from the original DataFrame.
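The difference between the two forms of dropna() can be seen on a tiny DataFrame. The sketch below uses made-up values instead of the CSV file so that it runs on its own; the variable names are illustrative.

```python
import pandas as pd

# Made-up data with one empty cell in "Calories"
df = pd.DataFrame({
    "Duration": [60, 45, 60],
    "Calories": [409.1, float("nan"), 300.0],
})

new_df = df.dropna()               # returns a new DataFrame
rows_in_original = len(df)         # the original still has all rows

returned = df.dropna(inplace=True)  # changes df itself and returns None
rows_after_inplace = len(df)

print(rows_in_original, len(new_df), rows_after_inplace, returned)
```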
Let's try to convert all cells in the 'Date' column into dates.
Pandas has a to_datetime() method for this:
Example
Convert to date:
import pandas as pd
df = pd.read_csv( 'data.csv' )
df[ 'Date' ] = pd.to_datetime(df[ 'Date' ])
print (df.to_string())
Result:
    Duration          Date  Pulse  Maxpulse  Calories
 0        60  '2020/12/01'    110       130     409.1
 1        60  '2020/12/02'    117       145     479.0
 2        60  '2020/12/03'    103       135     340.0
 3        45  '2020/12/04'    109       175     282.4
 4        45  '2020/12/05'    117       148     406.0
 5        60  '2020/12/06'    102       127     300.0
 6        60  '2020/12/07'    110       136     374.0
 7       450  '2020/12/08'    104       134     253.3
 8        30  '2020/12/09'    109       133     195.1
 9        60  '2020/12/10'     98       124     269.0
10        60  '2020/12/11'    103       147     329.3
11        60  '2020/12/12'    100       120     250.7
12        60  '2020/12/12'    100       120     250.7
13        60  '2020/12/13'    106       128     345.3
14        60  '2020/12/14'    104       132     379.3
15        60  '2020/12/15'     98       123     275.0
16        60  '2020/12/16'     98       120     215.2
17        60  '2020/12/17'    100       120     300.0
18        45  '2020/12/18'     90       112       NaN
19        60  '2020/12/19'    103       123     323.0
20        45  '2020/12/20'     97       125     243.0
21        60  '2020/12/21'    108       131     364.2
22        45           NaT    100       119     282.0
23        60  '2020/12/23'    130       101     300.0
24        45  '2020/12/24'    105       132     246.0
25        60  '2020/12/25'    102       126     334.5
26        60  '2020/12/26'    100       120     250.0
27        60  '2020/12/27'     92       118     241.0
28        60  '2020/12/28'    103       132       NaN
29        60  '2020/12/29'    100       132     280.0
30        60  '2020/12/30'    102       129     380.3
31        60  '2020/12/31'     92       115
As you can see from the result, the date in row 26 was fixed, but the empty date in row 22 got a
NaT (Not a Time) value, in other words an empty value. One way to deal with empty values is
simply removing the entire row.
Removing Rows
The result from the conversion in the example above gave us a NaT value, which can be handled
as a NULL value, and we can remove the row by using the dropna() method.
Example
Remove rows with a NULL value in the "Date" column:
df.dropna(subset=[ 'Date' ], inplace = True )
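As a self-contained sketch of the two steps above, the block below converts a small made-up "Date" column with to_datetime() and then removes the row whose date is missing. It does not use the CSV file, and the values are invented for illustration.

```python
import pandas as pd

# Made-up data: the second row has no date
df = pd.DataFrame({
    "Date": ["2020/12/01", None, "2020/12/03"],
    "Duration": [60, 45, 60],
})

df["Date"] = pd.to_datetime(df["Date"])   # the empty cell becomes NaT
missing_dates = df["Date"].isna().sum()   # NaT counts as a NULL value

df.dropna(subset=["Date"], inplace=True)  # only look at the "Date" column

print(missing_dates)
print(df)
```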
How can we fix wrong values, like the one for "Duration" in row 7?
Replacing Values
One way to fix wrong values is to replace them with something else.
In our example, it is most likely a typo, and the value should be "45" instead of "450", and we
could just insert "45" in row 7:
Example
Set "Duration" = 45 in row 7:
df.loc[ 7 , 'Duration' ] = 45
For small data sets you might be able to replace the wrong data one by one, but not for big data
sets.
To replace wrong data for larger data sets you can create some rules, e.g. set some boundaries for
legal values, and replace any values that are outside of the boundaries.
Example
Loop through all values in the "Duration" column.
If the value is higher than 120, set it to 120:
for x in df.index:
  if df.loc[x, "Duration" ] > 120 :
    df.loc[x, "Duration" ] = 120
Removing Rows
Another way of handling wrong data is to remove the rows that contain wrong data.
This way you do not have to find out what to replace them with, and there is a good chance you
do not need them to do your analyses.
Example
Delete rows where "Duration" is higher than 120:
for x in df.index:
  if df.loc[x, "Duration" ] > 120 :
    df.drop(x, inplace = True )
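The loops above work, but the same rules can also be written without a loop by using a Boolean condition. This is one possible alternative, shown on made-up values; the names capped and filtered are our own.

```python
import pandas as pd

# Made-up data with two values above the boundary of 120
df = pd.DataFrame({"Duration": [60, 450, 45, 130]})

# Replace: set every value higher than 120 to 120
capped = df.copy()
capped.loc[capped["Duration"] > 120, "Duration"] = 120

# Remove: keep only the rows where the value is 120 or lower
filtered = df[df["Duration"] <= 120]

print(capped)
print(filtered)
```

For large data sets this form is usually shorter to write and faster to run than looping over the index.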
By taking a look at our test data set, we can assume that row 11 and 12 are duplicates.
To discover duplicates, we can use the duplicated() method.
The duplicated() method returns a Boolean value for each row:
Example
Returns True for every row that is a duplicate, otherwise False:
print (df.duplicated())
Removing Duplicates
To remove duplicates, use the drop_duplicates() method.
Example
Remove all duplicates:
df.drop_duplicates(inplace = True )
Remember: The (inplace = True) will make sure that the method does NOT return a new
DataFrame, but it will remove all duplicates from the original DataFrame.
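To see both methods at work without the CSV file, this sketch uses a small made-up DataFrame in which the second row repeats the first.

```python
import pandas as pd

# Made-up data: row 1 is an exact copy of row 0
df = pd.DataFrame({
    "Duration": [60, 60, 45],
    "Pulse": [100, 100, 90],
})

flags = df.duplicated()         # True for rows that repeat an earlier row
deduped = df.drop_duplicates()  # new DataFrame without the repeated rows

print(flags)
print(deduped)
```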
Pandas - Plotting
Plotting
Pandas uses the plot() method to create diagrams.
We can use Pyplot, a submodule of the Matplotlib library to visualize the diagram on the screen.
Example
Import pyplot from Matplotlib and visualize our DataFrame:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv( 'data.csv' )
df.plot()
plt.show()
The examples on this page use a CSV file called 'data.csv'.
Scatter Plot
Specify that you want a scatter plot with the kind argument:
kind = 'scatter'
A scatter plot needs an x- and a y-axis.
In the example below we will use "Duration" for the x-axis and "Calories" for the y-axis.
Include the x and y arguments like this:
x = 'Duration', y = 'Calories'
Example
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv( 'data.csv' )
df.plot(kind = 'scatter' , x = 'Duration' , y = 'Calories' )
plt.show()
Result
Remember: In the previous example, we learned that the correlation between "Duration" and
"Calories" was 0.922721, and we concluded that higher duration means more calories burned.
Result
Note: The histogram tells us that there were over 100 workouts that lasted between 50 and 60
minutes.
42.SciPy Tutorial
SciPy is a scientific computation library that uses NumPy underneath.
SciPy stands for Scientific Python.
Learning by Reading
We have created 10 tutorial pages for you to learn the fundamentals of SciPy:
● Getting Started
● Constants
● Optimizers
● Sparse Data
● Graphs
● Spatial Data
● Matlab Arrays
● Interpolation
● Significance Tests
SciPy Introduction
What is SciPy?
SciPy is a scientific computation library that uses NumPy underneath.
SciPy stands for Scientific Python.
It provides more utility functions for optimization, stats and signal processing.
Like NumPy, SciPy is open source so we can use it freely.
SciPy was created by NumPy's creator Travis Oliphant.
Import SciPy
Once SciPy is installed, import the SciPy module(s) you want to use in your applications by
adding the from scipy import module statement:
from scipy import constants
Now we have imported the constants module from SciPy, and the application is ready to use it:
Example
How many cubic meters are in one liter:
from scipy import constants
print (constants.liter)
constants: SciPy offers a set of mathematical constants, one of them is liter which returns 1 liter
as cubic meters.
You will learn more about constants in the next chapter.
SciPy Constants
Constants in SciPy
As SciPy is more focused on scientific implementations, it provides many built-in scientific
constants.
These constants can be helpful when you are working with Data Science.
PI is an example of a scientific constant.
Example
Print the constant value of PI:
from scipy import constants
print (constants.pi)
Constant Units
A list of all units under the constants module can be seen using the dir() function.
Example
List all constants:
from scipy import constants
print ( dir (constants))
Unit Categories
The units are placed under these categories:
● Metric
● Binary
● Mass
● Angle
● Time
● Length
● Pressure
● Volume
● Speed
● Temperature
● Energy
● Power
● Force
Binary Prefixes:
Return the specified unit in bytes (e.g. kibi returns 1024)
Example
from scipy import constants
print (constants.kibi) #1024
Mass:
Return the specified unit in kg (e.g. gram returns 0.001)
Example
from scipy import constants
print (constants.gram) #0.001
Angle:
Return the specified unit in radians (e.g. degree returns 0.017453292519943295)
Example
from scipy import constants
print (constants.degree) #0.017453292519943295
print (constants.arcmin) #0.0002908882086657216
print (constants.arcminute) #0.0002908882086657216
print (constants.arcsec) #4.84813681109536e-06
print (constants.arcsecond) #4.84813681109536e-06
Time:
Return the specified unit in seconds (e.g. hour returns 3600.0)
Example
from scipy import constants
print (constants.hour) #3600.0
Length:
Return the specified unit in meters (e.g. nautical_mile returns 1852.0)
Example
from scipy import constants
print (constants.nautical_mile) #1852.0
Pressure:
Return the specified unit in pascals (e.g. psi returns 6894.757293168361)
Example
from scipy import constants
print (constants.psi) #6894.757293168361
Area:
Return the specified unit in square meters (e.g. hectare returns 10000.0)
Example
from scipy import constants
print (constants.hectare) #10000.0
Volume:
Return the specified unit in cubic meters (e.g. liter returns 0.001)
Example
from scipy import constants
print (constants.liter) #0.001
Speed:
Return the specified unit in meters per second (e.g. speed_of_sound returns 340.5)
Example
from scipy import constants
print (constants.speed_of_sound) #340.5
Temperature:
Return the specified unit in Kelvin (e.g. zero_Celsius returns 273.15)
Example
from scipy import constants
print (constants.zero_Celsius) #273.15
Energy:
Return the specified unit in joules (e.g. calorie returns 4.184)
Example
from scipy import constants
print (constants.calorie) #4.184
Power:
Return the specified unit in watts (e.g. horsepower returns 745.6998715822701)
Example
from scipy import constants
print (constants.horsepower) #745.6998715822701
Force:
Return the specified unit in newtons (e.g. kilogram_force returns 9.80665)
Example
from scipy import constants
print (constants.dyn) #1e-05
print (constants.dyne) #1e-05
print (constants.lbf) #4.4482216152605
print (constants.pound_force) #4.4482216152605
print (constants.kgf) #9.80665
print (constants.kilogram_force) #9.80665
SciPy Optimizers
Optimizers in SciPy
Optimizers are a set of procedures defined in SciPy that either find the minimum value of a
function, or the root of an equation.
Optimizing Functions
Essentially, many of the algorithms in Machine Learning come down to a complex equation
that needs to be minimized with the help of given data.
Roots of an Equation
NumPy is capable of finding roots for polynomials and linear equations, but it can not find roots
for nonlinear equations, like this one:
x + cos(x)
For that you can use SciPy's optimize.root function.
This function takes two required arguments:
fun - a function representing an equation.
x0 - an initial guess for the root.
The function returns an object with information regarding the solution.
The actual solution is given under attribute x of the returned object:
Example
Find root of the equation x + cos(x):
from scipy.optimize import root
import numpy as np
def eqn(x):
  # root() passes x as an array, so use NumPy's cos()
  return x + np.cos(x)
myroot = root(eqn, 0 )
print (myroot.x)
Note: The returned object has much more information about the solution.
Example
Print all information about the solution (not just x which is the root)
print (myroot)
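Two fields of the returned object are often worth checking before trusting the root: success and x. The sketch below, which assumes NumPy's cos() so the equation also works on the array that root() passes in, verifies the solution by plugging it back into the equation. The names solution and residual are our own.

```python
import numpy as np
from scipy.optimize import root

def eqn(x):
    # The equation whose root we are looking for: x + cos(x) = 0
    return x + np.cos(x)

solution = root(eqn, 0)

found = solution.x[0]   # the root itself
residual = eqn(found)   # should be very close to 0

print(solution.success)
print(found)
print(residual)
```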
Minimizing a Function
A function, in this context, represents a curve. Curves have high points and low points.
High points are called maxima.
Low points are called minima.
The highest point in the whole curve is called the global maximum, whereas the rest of them are
called local maxima.
The lowest point in the whole curve is called the global minimum, whereas the rest of them are called
local minima.
Finding Minima
We can use scipy.optimize.minimize() function to minimize the function.
The minimize() function takes the following arguments:
fun - a function representing an equation.
x0 - an initial guess for the minimum.
method - name of the method to use. Legal values:
'CG'
'BFGS'
'Newton-CG'
'L-BFGS-B'
'TNC'
'COBYLA'
'SLSQP'
callback - function called after each iteration of optimization.
options - a dictionary defining extra params:
{
"disp": boolean - print detailed description
"gtol": number - the tolerance of the error
}
Example
Minimize the function x^2 + x + 2 with BFGS:
from scipy.optimize import minimize
def eqn(x):
  return x** 2 + x + 2
mymin = minimize(eqn, 0 , method= 'BFGS' )
print (mymin)
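The result can be checked by hand: x^2 + x + 2 has its minimum where the derivative 2x + 1 is zero, so at x = -0.5, where the function value is 1.75. The sketch below reads the x and fun fields of the returned object to compare. Returning x[0] instead of x is our own choice so that the function gives back a plain number.

```python
from scipy.optimize import minimize

def eqn(x):
    # minimize() passes x as an array with one element
    return x[0] ** 2 + x[0] + 2

mymin = minimize(eqn, 0, method="BFGS")

best_x = mymin.x[0]      # location of the minimum
best_value = mymin.fun   # function value at the minimum

print(best_x)
print(best_value)
```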
CSR Matrix
We can create a CSR matrix by passing an array into the function scipy.sparse.csr_matrix().
Example
Create a CSR matrix from an array:
import numpy as np
from scipy.sparse import csr_matrix
arr = np.array([ 0 , 0 , 0 , 0 , 0 , 1 , 1 , 0 , 2 ])
print (csr_matrix(arr))
The example above returns:
(0, 5) 1
(0, 6) 1
(0, 8) 2
From the result we can see that there are 3 items with value.
The 1. item is in row 0 position 5 and has the value 1.
The 2. item is in row 0 position 6 and has the value 1.
The 3. item is in row 0 position 8 and has the value 2.
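The stored items can also be read from the matrix directly. As a brief sketch on the same array, the block below looks at the data property, counts the non-zero items and converts the matrix back to a regular array; the variable names are illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix

arr = np.array([0, 0, 0, 0, 0, 1, 1, 0, 2])
sparse = csr_matrix(arr)

stored_values = sparse.data         # only the non-zero items are stored
nonzeros = sparse.count_nonzero()   # number of non-zero items
dense = sparse.toarray()            # back to a regular (dense) array

print(sparse.shape)
print(stored_values)
print(nonzeros)
print(dense)
```

Note that the one-dimensional input becomes a matrix with a single row, which is why every item above is reported in row 0.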
SciPy Graphs
Working with Graphs
Graphs are an essential data structure.
SciPy provides us with the module scipy.sparse.csgraph for working with such data structures.
Adjacency Matrix
Adjacency matrix is a nxn matrix where n is the number of elements in a graph.
And the values represent the connection between the elements.
For a graph like this, with elements A, B and C, the connections are:
A & B are connected with weight 1.
A & C are connected with weight 2.
C & B are not connected.
The Adjacency Matrix would look like this:
    A B C
A:[0 1 2]
B:[1 0 0]
C:[2 0 0]
Below follow some of the most used methods for working with adjacency matrices.
Connected Components
Find all of the connected components with the connected_components() method.
Example
import numpy as np
from scipy.sparse.csgraph import connected_components
from scipy.sparse import csr_matrix
arr = np.array([
[ 0 , 1 , 2 ],
[ 1 , 0 , 0 ],
[2,0,0]
])
newarr = csr_matrix(arr)
print (connected_components(newarr))
Dijkstra
Use the dijkstra method to find the shortest path in a graph from one element to another.
It takes following arguments:
1. return_predecessors: boolean (True to return whole path of traversal otherwise
False).
2. indices: index of the element to return all paths from that element only.
3. limit: max weight of path.
Example
Find the shortest path from element 1 to 2:
import numpy as np
from scipy.sparse.csgraph import dijkstra
from scipy.sparse import csr_matrix
arr = np.array([
[ 0 , 1 , 2 ],
[ 1 , 0 , 0 ],
[2,0,0]
])
newarr = csr_matrix(arr)
print (dijkstra(newarr, return_predecessors= True , indices= 0 ))
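With return_predecessors=True the function returns two arrays, which are easier to read when unpacked. In the sketch below, distances holds the shortest distance from element 0 to every element, and predecessors holds the element visited just before each one (-9999 means there is none). The last line asks for the distance from element 1 to element 2, which has to go through element 0. The variable names are our own.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

graph = csr_matrix(np.array([
    [0, 1, 2],
    [1, 0, 0],
    [2, 0, 0],
]))

# Shortest paths starting from element 0
distances, predecessors = dijkstra(graph, return_predecessors=True, indices=0)

# Shortest distance from element 1 to element 2 (path 1 -> 0 -> 2)
dist_1_to_2 = dijkstra(graph, indices=1)[2]

print(distances)
print(predecessors)
print(dist_1_to_2)
```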
Floyd Warshall
Use the floyd_warshall() method to find the shortest path between all pairs of elements.
Example
Find the shortest path between all pairs of elements:
import numpy as np
from scipy.sparse.csgraph import floyd_warshall
from scipy.sparse import csr_matrix
arr = np.array([
[ 0 , 1 , 2 ],
[ 1 , 0 , 0 ],
[2,0,0]
])
newarr = csr_matrix(arr)
print (floyd_warshall(newarr, return_predecessors= True ))
Bellman Ford
The bellman_ford() method can also find the shortest path between all pairs of elements, but this
method can handle negative weights as well.
Example
Find shortest path from element 1 to 2 with given graph with a negative weight:
import numpy as np
from scipy.sparse.csgraph import bellman_ford
from scipy.sparse import csr_matrix
arr = np.array([
[ 0 , - 1 , 2 ],
[ 1 , 0 , 0 ],
[2,0,0]
])
newarr = csr_matrix(arr)
print (bellman_ford(newarr, return_predecessors= True , indices= 0 ))
Example
Traverse the graph depth first for given adjacency matrix:
import numpy as np
from scipy.sparse.csgraph import depth_first_order
from scipy.sparse import csr_matrix
arr = np.array([
[ 0 , 1 , 0 , 1 ],
[ 1 , 1 , 1 , 1 ],
[ 2 , 1 , 1 , 0 ],
[0,1,0,1]
])
newarr = csr_matrix(arr)
print (depth_first_order(newarr, 1 ))
Example
Traverse the graph breadth first for given adjacency matrix:
import numpy as np
from scipy.sparse.csgraph import breadth_first_order
from scipy.sparse import csr_matrix
arr = np.array([
[ 0 , 1 , 0 , 1 ],
[ 1 , 1 , 1 , 1 ],
[ 2 , 1 , 1 , 0 ],
[0,1,0,1]
])
newarr = csr_matrix(arr)
print (breadth_first_order(newarr, 1 ))
Triangulation
A Triangulation of a polygon divides the polygon into multiple triangles, with which we can
compute the area of the polygon.
A Triangulation with points means creating a surface composed of triangles in which every given
point is a vertex of at least one triangle in the surface.
One method to generate these triangulations through points is the Delaunay() Triangulation.
Example
Create a triangulation from following points:
import numpy as np
from scipy.spatial import Delaunay
import matplotlib.pyplot as plt
points = np.array([
[ 2 , 4 ],
[ 3 , 4 ],
[ 3 , 0 ],
[ 2 , 2 ],
[4,1]
])
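simplices = Delaunay(points).simplices
plt.triplot(points[:, 0 ], points[:, 1 ], simplices)
plt.scatter(points[:, 0 ], points[:, 1 ], color= 'r' )
plt.show()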
Result:
Convex Hull
A convex hull is the smallest convex polygon that covers all of the given points.
Use the ConvexHull() method to create a Convex Hull.
Example
Create a convex hull for following points:
import numpy as np
from scipy.spatial import ConvexHull
import matplotlib.pyplot as plt
points = np.array([
[ 2 , 4 ],
[ 3 , 4 ],
[ 3 , 0 ],
[ 2 , 2 ],
[ 4 , 1 ],
[ 1 , 2 ],
[ 5 , 0 ],
[ 3 , 1 ],
[ 1 , 2 ],
[0,2]
])
hull = ConvexHull(points)
hull_points = hull.simplices
plt.scatter(points[:, 0 ], points[:, 1 ])
for simplex in hull_points:
  plt.plot(points[simplex, 0 ], points[simplex, 1 ], 'k-' )
plt.show()
Result:
KDTrees
KDTrees are a data structure optimized for nearest neighbor queries.
E.g. in a set of points using KDTrees we can efficiently ask which points are nearest to a certain
given point.
The KDTree() method returns a KDTree object.
The query() method returns the distance to the nearest neighbor and the index of that
neighbor in the given points.
Example
Find the nearest neighbor to point (1,1):
from scipy.spatial import KDTree
points = [( 1 , - 1 ), ( 2 , 3 ), (- 2 , 3 ), ( 2 , - 3 )]
kdtree = KDTree(points)
res = kdtree.query(( 1 , 1 ))
print (res)
Result:
(2.0, 0)
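The two values in the result can be unpacked and used to look up the neighbor itself. The sketch below does that, and also passes the k argument of query() to ask for the two nearest neighbors. The variable names are illustrative.

```python
from scipy.spatial import KDTree

points = [(1, -1), (2, 3), (-2, 3), (2, -3)]
kdtree = KDTree(points)

# Nearest neighbor: a distance and an index into points
distance, index = kdtree.query((1, 1))
nearest = points[index]

# Two nearest neighbors: arrays of distances and indexes
distances2, indexes2 = kdtree.query((1, 1), k=2)

print(distance, index, nearest)
print(distances2, indexes2)
```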
Distance Matrix
There are many Distance Metrics used to find various types of distances between two points in
data science, Euclidean distance, cosine distance etc.
The distance between two vectors may not only be the length of straight line between them, it
can also be the angle between them from origin, or number of unit steps required etc.
The performance of many Machine Learning algorithms depends greatly on the distance metric used.
E.g. "K Nearest Neighbors", or "K Means" etc.
Let us look at some of the Distance Metrics:
Euclidean Distance
Find the euclidean distance between given points.
Example
from scipy.spatial.distance import euclidean
p1 = ( 1 , 0 )
p2 = ( 10 , 2 )
res = euclidean(p1, p2)
print (res)
Result:
9.21954445729
Cosine Distance
Is one minus the cosine of the angle between the two vectors A and B.
Example
Find the cosine distance between given points:
from scipy.spatial.distance import cosine
p1 = ( 1 , 0 )
p2 = ( 10 , 2 )
res = cosine(p1, p2)
print (res)
Result:
0.019419324309079777
Hamming Distance
Is the proportion of positions at which the two sequences have different bits.
It's a way to measure distance for binary sequences.
Example
Find the hamming distance between given points:
from scipy.spatial.distance import hamming
p1 = ( True , False , True )
p2 = ( False , True , True )
res = hamming(p1, p2)
print (res)
Result:
0.666666666667
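To show what each metric measures, the following sketch recomputes the three distances by hand with NumPy and compares them with the SciPy functions. The manual formulas are written out from the standard definitions, and the variable names are our own.

```python
import numpy as np
from scipy.spatial.distance import cosine, euclidean, hamming

p1 = np.array([1, 0])
p2 = np.array([10, 2])

# Euclidean: length of the straight line between the points
manual_euclidean = np.sqrt(np.sum((p1 - p2) ** 2))

# Cosine distance: 1 minus the cosine of the angle between the vectors
cos_angle = np.dot(p1, p2) / (np.linalg.norm(p1) * np.linalg.norm(p2))
manual_cosine = 1 - cos_angle

# Hamming: share of positions where the sequences differ
b1 = np.array([True, False, True])
b2 = np.array([False, True, True])
manual_hamming = np.mean(b1 != b2)

print(manual_euclidean, euclidean(p1, p2))
print(manual_cosine, cosine(p1, p2))
print(manual_hamming, hamming(b1, b2))
```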
Example
Export the following array as variable name "vec" to a mat file:
from scipy import io
import numpy as np
arr = np.arange( 10 )
io.savemat( 'arr.mat' , { "vec" : arr})
Note: The example above saves a file name "arr.mat" on your computer.
To open the file, check out the "Import Data from Matlab Format" example below:
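A minimal sketch of the round trip, assuming the loadmat() function from the same scipy.io module: it saves the array, loads it back and reads the "vec" variable. The file is written to a temporary folder here, which is our own choice so the sketch does not leave files behind in your project.

```python
import os
import tempfile

import numpy as np
from scipy import io

arr = np.arange(10)

# Save to a temporary folder (hypothetical location, any path works)
path = os.path.join(tempfile.mkdtemp(), "arr.mat")
io.savemat(path, {"vec": arr})

# loadmat() returns a dictionary; arrays come back with 2 dimensions
loaded = io.loadmat(path)
vec = loaded["vec"]

# squeeze_me=True removes the extra dimension again
flat = io.loadmat(path, squeeze_me=True)["vec"]

print(vec.shape)
print(flat)
```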
SciPy Interpolation
What is Interpolation?
Interpolation is a method for generating points between given points.
For example: for points 1 and 2, we may interpolate and find points 1.33 and 1.66.
Interpolation has many uses. In Machine Learning we often deal with missing data in a dataset,
and interpolation is often used to substitute those values.
This method of filling values is called imputation .
Apart from imputation, interpolation is often used where we need to smooth the discrete points in
a dataset.
1D Interpolation
The function interp1d() is used to interpolate a distribution with 1 variable.
It takes x and y points and returns a callable function that can be called with new x and returns
corresponding y.
Example
For given xs and ys interpolate values from 2.1, 2.2... to 2.9:
from scipy.interpolate import interp1d
import numpy as np
xs = np.arange(10)
ys = 2*xs + 1
interp_func = interp1d(xs, ys)
newarr = interp_func(np.arange(2.1, 3, 0.1))
print(newarr)
Result:
[5.2 5.4 5.6 5.8 6. 6.2 6.4 6.6 6.8]
Note: The new xs should be in the same range as the old xs, meaning that we can't call
interp_func() with values higher than 9 or less than 0.
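A short sketch of what happens outside that range: by default interp1d() raises a ValueError, and passing fill_value="extrapolate" asks it to extend the line instead. The names out_of_range_error, extrap_func and extrapolated are illustrative:

```python
import numpy as np
from scipy.interpolate import interp1d

xs = np.arange(10)   # 0, 1, ..., 9
ys = 2 * xs + 1
interp_func = interp1d(xs, ys)

# 9.5 lies above the largest x (9), so the default settings raise an error
try:
    interp_func(9.5)
    out_of_range_error = False
except ValueError:
    out_of_range_error = True
print(out_of_range_error)

# Extrapolation has to be requested explicitly
extrap_func = interp1d(xs, ys, fill_value="extrapolate")
extrapolated = float(extrap_func(9.5))
print(extrapolated)
```

Extrapolated values are only as good as the assumption that the trend continues, so use them with care.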
Spline Interpolation
In 1D interpolation the points are fitted with a single curve, whereas in spline interpolation the
points are fitted against a piecewise function defined with polynomials, called splines.
The UnivariateSpline() function takes xs and ys and produces a callable function that can be
called with new xs.
Piecewise function: A function that has different definitions for different ranges.
Example
Find univariate spline interpolation for 2.1, 2.2... 2.9 for the following non linear points:
from scipy.interpolate import UnivariateSpline
import numpy as np
xs = np.arange(10)
ys = xs**2 + np.sin(xs) + 1
interp_func = UnivariateSpline(xs, ys)
newarr = interp_func(np.arange(2.1, 3, 0.1))
print(newarr)
Result:
[5.62826474 6.03987348 6.47131994 6.92265019 7.3939103 7.88514634
8.39640439 8.92773053 9.47917082]
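With its default settings UnivariateSpline() applies some smoothing, so the curve is not required to pass exactly through every given point. A minimal sketch comparing the default with s=0, which forces the spline through all points (the names smooth and exact are illustrative):

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

xs = np.arange(10)
ys = xs**2 + np.sin(xs) + 1

smooth = UnivariateSpline(xs, ys)        # default smoothing factor
exact = UnivariateSpline(xs, ys, s=0)    # s=0: pass through every point

# Largest gap between each spline and the original ys
print(np.max(np.abs(smooth(xs) - ys)))
print(np.max(np.abs(exact(xs) - ys)))
```

The second number should be essentially zero, while the first may be larger.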
T-Test
T-tests are used to determine whether there is a significant difference between the means of two variables,
which helps us judge whether they could belong to the same distribution.
It is a two-tailed test.
The function ttest_ind() takes two independent samples (they need not be the same size) and produces a
tuple of the t-statistic and the p-value.
Example
Find if the given values v1 and v2 are from same distribution:
import numpy as np
from scipy.stats import ttest_ind
v1 = np.random.normal(size=100)
v2 = np.random.normal(size=100)
res = ttest_ind(v1, v2)
print(res)
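Result (p-value shown; print(res) also displays the t-statistic, and the values differ on every run because the samples are random):
0.68346891833752133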
If you want to return only the p-value, use the pvalue property:
Example
...
res = ttest_ind(v1, v2).pvalue
print (res)
Result:
0.68346891833752133
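The sketch below shows how the result can be unpacked and how the p-value reacts when the two samples really do have different means. The seed, the sample sizes and the shifted mean of 2.0 are assumptions chosen only for illustration:

```python
import numpy as np
from scipy.stats import ttest_ind

# Seeded generator so the numbers are repeatable
rng = np.random.default_rng(0)

v1 = rng.normal(loc=0.0, size=100)
v2 = rng.normal(loc=0.0, size=80)   # same mean, different sample size
v3 = rng.normal(loc=2.0, size=80)   # clearly different mean

# The result can be unpacked into the t-statistic and the p-value
res = ttest_ind(v1, v2)
t_stat, p_value = res
print(t_stat, p_value)

# A very small p-value suggests that the means differ
p_shifted = ttest_ind(v1, v3).pvalue
print(p_shifted)
```

A common convention is to treat a p-value below 0.05 as evidence that the means differ.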
KS-Test
The KS test is used to check whether given values follow a distribution.
The function takes the values to be tested and the CDF as its two parameters.
The CDF can be either a string or a callable function that returns the probability.
It can be used as a one-tailed or a two-tailed test.
By default it is two-tailed. We can pass the parameter alternative as a string, one of two-sided, less,
or greater.
Example
Find if the given value follows the normal distribution:
import numpy as np
from scipy.stats import kstest
v = np.random.normal(size=100)
res = kstest(v, 'norm')
print(res)
Result:
KstestResult(statistic=0.047798701221956841, pvalue=0.97630967161777515)
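To see the test reject a distribution, we can compare it with data that is clearly not normal. The sketch below assumes a seeded generator and a uniform sample, both chosen here for illustration, and also shows the alternative parameter:

```python
import numpy as np
from scipy.stats import kstest

# Seeded generator so the numbers are repeatable
rng = np.random.default_rng(0)

normal_sample = rng.normal(size=1000)
uniform_sample = rng.uniform(size=1000)   # values between 0 and 1

# Two-tailed test (the default) against the standard normal distribution
p_normal = kstest(normal_sample, 'norm').pvalue
p_uniform = kstest(uniform_sample, 'norm').pvalue
print(p_normal)
print(p_uniform)

# One-tailed version of the same test
one_sided = kstest(normal_sample, 'norm', alternative='greater')
print(one_sided.pvalue)
```

The p-value for the uniform sample should be extremely small, meaning the values do not follow the normal distribution.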
Example
Show statistical description of the values in an array:
import numpy as np
from scipy.stats import describe
v = np.random.normal(size=100)
res = describe(v)
print(res)
Result:
DescribeResult(
nobs=100,
minmax=(-2.0991855456740121, 2.1304142707414964),
mean=0.11503747689121079,
variance=0.99418092655064605,
skewness=0.013953400984243667,
kurtosis=-0.671060517912661
)
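The individual values can be read from the result by name. The sketch below, which uses a seeded generator as an assumption for repeatability, compares them with the matching NumPy calculations:

```python
import numpy as np
from scipy.stats import describe

rng = np.random.default_rng(0)
v = rng.normal(size=100)

res = describe(v)

# Each field of the result is available as an attribute
print(res.nobs)
print(res.mean, np.mean(v))

# describe() reports the sample variance, i.e. ddof=1 in NumPy terms
print(res.variance, np.var(v, ddof=1))

# minmax is a tuple of the smallest and largest value
print(res.minmax, (v.min(), v.max()))
```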