Python
Python Identifier
Lines and Indentation
Multi-Line Statements
Quotation in Python
Comments in Python
Multiple Statement Groups as Suites
Assigning Values to Variables
Multiple Assignment
Print
Standard Data Types
Python Strings
Python Lists
Python Tuples
Python Dictionary
Type Conversion in Python
Types of Operators
Python Decision Making
Python Loops
Python Date and Time
Python Functions
Function Arguments
Scope of Variables
Python Modules and Packages
Python Files Input/ Output
Python File Methods
Python Excel Handling
KeyStrokes
Python Exception Handling
Python Object Oriented
Class Inheritance
Python Regular Expressions
Python Decorators
Multithreading in Python
Multiprocessing in Python
Memory Management
Tesseract-OCR Module in Python
Pywinautogui Module
Selenium Module
Request Module
List Comprehension
Dictionary Comprehension
Lambda Function
Filter Function
Reduce Function
Socket Module
Python
Python is a general-purpose, high-level programming language that is interpreted,
interactive, and object-oriented.
Python is Interpreted: Python is processed at runtime by the
interpreter. You do not need to compile your program before executing it.
Python is Interactive: You can actually sit at a Python prompt and
interact with the interpreter directly to write your programs.
Python is Object-Oriented: Python supports Object-Oriented style or
technique of programming that encapsulates code within objects.
Python is a Beginner's Language: Python is a great language for the
beginner-level programmers and supports the development of a wide
range of applications from simple text processing to WWW browsers to
games.
Databases: Python provides interfaces to all major commercial
databases.
GUI Programming: Python supports GUI applications that can be
created and ported to many system calls, libraries and windows systems,
such as Windows MFC, Macintosh, and the X Window system of Unix.
It supports functional and structured programming methods as well as
OOP.
It can be used as a scripting language or can be compiled to byte-code
for building large applications.
Difference between scripting and programming languages:
A scripting language is typically run by an interpreter rather than compiled ahead of time into a standalone executable.
It provides very high-level dynamic data types and supports dynamic
type checking.
It supports automatic garbage collection.
It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.
Running Python
There are several ways to start Python, including:
Interactive Interpreter
Integrated Development Environment: a Graphical User Interface (GUI) environment such as IDLE,
if your system has a GUI application that supports Python.
CPython
CPython is the reference implementation of the Python language, written in C. It
compiles Python source code to bytecode and then executes that bytecode in an
interpreter, so a Python program needs the interpreter at run time. Cython, by
contrast, is a compiled language: Cython code is translated to C and built into
native extension modules, which the CPython interpreter then imports.
Cython
Cython is designed as a C-extension for Python. Developers can use Cython
to speed up Python code execution, but they can still write and run Python
programs without it. Running Cython programs requires both Python and a
C compiler to be installed.
Speed of execution: Cython-compiled code is generally faster; pure Python code is slower.
Python Identifiers
An identifier is a name used to identify a variable, function, class, module or other object. An
identifier starts with a letter A to Z or a to z or an underscore (_) followed by
zero or more letters, underscores and digits (0 to 9).
Python does not allow punctuation characters such as @, $, and % within
identifiers. Python is a case sensitive programming language.
Thus, Manpower and manpower are two different identifiers in Python.
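The identifier rules above can be checked with a short sketch. The helper name is_valid_name and the sample names are assumptions made for illustration. str.isidentifier() also accepts non-ASCII letters, which goes slightly beyond the A to Z rule stated above.

```python
import keyword

def is_valid_name(name: str) -> bool:
    """Return True if name is a legal, non-reserved Python identifier."""
    return name.isidentifier() and not keyword.iskeyword(name)

# Sample names chosen for illustration only.
candidates = ["Manpower", "manpower", "_count1", "1count", "total$", "class"]
results = {name: is_valid_name(name) for name in candidates}
print(results)
```

Manpower and manpower are both accepted and are distinct keys, which reflects case sensitivity.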
Multi-Line Statements
Statements in Python typically end with a new line. Python does, however, allow
the use of the line continuation character (\) to denote that the line should
continue.
total = item_one + \
        item_two
Quotation in Python
Python accepts single ('), double (") and triple (''' or """) quotes to denote string
literals, as long as the same type of quote starts and ends the string.
Comments in Python
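All characters after the # and up to the end of the physical line are part
of the comment and the Python interpreter ignores them.
Python has no dedicated multi-line comment syntax. To comment out several lines, prefix
each line with #, or use a string literal in triple single quotes (''') or triple
double quotes (""") that is not assigned to anything.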
print()
The end parameter in print():
By default, Python's print() function ends its output with a newline; pass end to change that.
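A minimal sketch of the end parameter. Output is written to an in-memory buffer (an assumption made here so the result can be inspected); normally print() writes to the screen.

```python
import io

buffer = io.StringIO()                    # capture output so it can be inspected
print("Hello", end=" ", file=buffer)      # a space replaces the default newline
print("World", end="!\n", file=buffer)    # custom line ending
print(buffer.getvalue(), end="")          # show what was written
```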
Subsets of strings can be taken using the slice operator ([ ] and [:] ) with
indexes starting at 0 in the beginning of the string and working their way from
-1 at the end.
str = 'Hello World!'
print(str) # Prints complete string
print(str[0]) # Prints first character of the string
print(str[2:5]) # Prints characters starting from 3rd to 5th
print(str[2:]) # Prints string starting from 3rd character
print(str * 2) # Prints string two times
print(str + "TEST") # Prints concatenated string
print(str[::-1]) # Prints the string reversed
s = "Geeksforgeeks"
print("".join(reversed(s)))
reversed() returns a reverse iterator over the given string; join() then
concatenates its elements with an empty string as the separator, forming the
reversed string.
Indexing
Slicing
Escape Characters
Escape characters are special or non-printable characters that can be represented with
backslash notation.
In Python strings, the backslash "\" is a special character, also called the
"escape" character. It is used in representing certain whitespace characters:
"\t" is a tab and "\n" is a newline.
Python Lists
A list contains items separated by commas and enclosed within square brackets
([]). Unlike arrays in languages such as C, the items belonging to a list can be
of different data types.
The values stored in a list can be accessed using the slice operator ([ ] and
[:]) with indexes starting at 0 in the beginning of the list and working their way
to end -1.
The plus (+) sign is the list concatenation operator, and the asterisk
(*) is the repetition operator.
list1 = ['physics', 'chemistry', 1997, 2000]
1 len(list)
Returns the number of elements in the list.
2 list.count(element)
Returns the number of times element occurs in the list.
3 list.index(element)
Returns the index of the first occurrence of element in the list.
Python has a set of built-in methods that you can use on lists/arrays.
Method Description
extend() Add the elements of a list (or any iterable), to the end of the
current list
index() Returns the index of the first element with the specified value
Indexing
Slicing
Python Tuples
Tuples are enclosed within parentheses. A tuple is a sequence
of immutable Python objects.
The main differences between lists and tuples are: Lists are enclosed in brackets
( [ ] ) and their elements and size can be changed, while tuples are enclosed in
parentheses ( ( ) ) and cannot be updated. Tuples can be thought of as read-
only lists.
To write a tuple containing a single value you have to include a comma, even
though there is only one value −
tup1 = (50,)
Changing a Tuple
Tuples are immutable, which means that the elements of a tuple cannot be changed once
they have been assigned. However, if an element is itself a mutable data type such as a
list, its nested items can be changed.
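The following sketch illustrates both halves of that rule. The variable names and values are assumptions chosen for illustration.

```python
record = ("alice", [90, 85])   # a tuple holding a string and a list

try:
    record[0] = "bob"          # replacing a tuple element is not allowed
except TypeError as exc:
    error_message = str(exc)

record[1].append(70)           # mutating the nested list is allowed
print(record)
```

The tuple still refers to the same list object; only the contents of that list changed.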
Tuple Uses:
To represent single set of data.
To provide easy access to and manipulation of a data set.
To return multiple values from a function without using out parameters or ByRef
parameters, as some other languages require.
Python Dictionary
Python's dictionaries are a kind of hash table: they map keys to values.
Each key is separated from its value by a colon (:), the items are
separated by commas, and the whole thing is enclosed in curly braces. An
empty dictionary without any items is written with just two curly braces, like
this: {}.
Keys are unique within a dictionary while values may not be. The values of a
dictionary can be of any type, but the keys must be of an immutable data
type such as strings, numbers, or tuples.
Dictionaries are enclosed by curly braces ({ }) and values can be assigned and
accessed using square braces ([]).
1 cmp(dict1, dict2)
Compares elements of both dictionaries. Available in Python 2 only; it was removed in Python 3.
2 len(dict)
Gives the total length of the dictionary. This would be equal to the
number of items in the dictionary.
3 str(dict)
Produces a printable string representation of a dictionary
4 type(variable)
Returns the type of the passed variable. If passed variable is dictionary,
then it would return a dictionary type.
1 dict.clear()
Removes all elements of dictionary dict
2 dict.copy()
Returns a shallow copy of dictionary dict
3 dict.fromkeys(seq[, value])
Creates a new dictionary with keys from seq and values set to value.
4 dict.get(key, default=None)
For key key, returns its value, or default if key is not in the dictionary
5 dict.has_key(key)
Returns true if key is in dictionary dict, false otherwise. Python 2 only; in Python 3 use key in dict
6 dict.items()
Returns dict's (key, value) tuple pairs (a list in Python 2, a view object in Python 3)
7 dict.keys()
Returns dictionary dict's keys (a list in Python 2, a view object in Python 3)
8 dict.setdefault(key, default=None)
Similar to get(), but will set dict[key]=default if key is not already in dict
9 dict.update(dict2)
Adds dictionary dict2's key-value pairs to dict
10 dict.values()
Returns dictionary dict's values (a list in Python 2, a view object in Python 3)
Keys must be immutable, which means you can use strings, numbers or
tuples as dictionary keys, but something like ['key'] is not allowed.
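A short sketch of several of the dictionary methods listed above, written for Python 3. The inventory data is hypothetical.

```python
inventory = dict.fromkeys(["apples", "pears"], 0)   # both keys start at 0
inventory.update({"apples": 5, "plums": 2})         # overwrite one key, add another

missing = inventory.get("kiwis", 0)        # default returned, dict unchanged
inventory.setdefault("kiwis", 1)           # key added because it was absent
has_pears = "pears" in inventory           # Python 3 replacement for has_key()

try:
    bad = {["key"]: "value"}               # lists are mutable, so not hashable
except TypeError as exc:
    key_error = str(exc)

print(sorted(inventory.items()))
```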
Nested Dictionary
nested_dict = {'dictA': {'key_1': 'value_1'},
               'dictB': {'key_2': 'value_2'}}
Here, nested_dict is a nested dictionary containing the dictionaries dictA and dictB. Each is a
dictionary with its own key and value.
people = {1: {'name': 'John', 'age': '27', 'sex': 'Male'}}
print(people[1]['age'])
print(people[1]['sex'])
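Implicit
Python converts between compatible types automatically, without an explicit call.
Example:
a = 5
a = a + 0.9 # the int 5 is converted to float; a is now 5.9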
Explicit
1. int(a, base): Converts a number or string to an integer. base specifies
the base of the number when a is a string.
2. float(): Converts a number or numeric string to a floating-point number.
3. ord(): Converts a single character to its integer code point.
4. hex(): Converts an integer to a hexadecimal string.
5. oct(): Converts an integer to an octal string.
6. tuple(): Converts an iterable to a tuple.
7. set(): Converts an iterable to a set.
8. list(): Converts an iterable to a list.
9. dict(): Converts a sequence of (key, value) pairs into a dictionary.
10. str(): Converts an object, such as an integer, into a string.
11. complex(real, imag): Converts real numbers to the complex number
real + imag*j.
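The conversions listed above can be tried out directly. This is a small sketch in Python 3; the variable names and input values are chosen only for illustration.

```python
total = 5 + 0.9                       # implicit: the int is promoted to float

from_binary = int("1010", 2)          # explicit: string in base 2 to int
as_float = float("3.5")
code_point = ord("A")
hex_text = hex(255)
oct_text = oct(8)
as_tuple = tuple("abc")
as_set = set([1, 1, 2])               # duplicates collapse
as_list = list((1, 2))
as_dict = dict([("a", 1), ("b", 2)])  # sequence of (key, value) pairs
as_text = str(42)
as_complex = complex(1, 2)
```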
Types of Operator
Python language supports the following types of operators.
Assignment Operators: used to assign values to variables
=, +=, -=, *=, /=, %=, **=, //=
o Simple Assignment:
x = 10
print(x)
o Multiple Assignment:
Raj = Sid = Pari = '48'
print(Raj)
print(Sid)
print(Pari)
o Compound Assignment:
a = 10
a += 5
print(a)
Python Loops
Loop Type     Description
while loop    Repeats a statement or group of statements while a
              given condition is true. It tests the condition before
              executing the loop body.
nested loops  You can use one or more loops inside another while or
              for loop. Python has no do-while loop.
for Statements
words = ['cat', 'window', 'defenestrate']
for w in words:
    print(w, len(w))
Output:
cat 3
window 6
defenestrate 12
list(range(0, 10, 3))
[0, 3, 6, 9]
list(range(10, 0, -1))
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
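xrange() function
Used in Python 2 only; in Python 3, range() itself is lazy and xrange() no longer exists.
Produces a lazy sequence object that generates values on demand instead of building a list
>>xrange(1,6) # Python 2: a lazy xrange object, not a list
xrange(1, 6)
Iterating over xrange(1, 6) yields 1, 2, 3, 4, 5, one value at a time.
import sys
print(sys.getsizeof(range(100))) # Python 2: size in bytes of the full list
>>864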
Enumerate() function
If you want access to the index of each element within the body of a loop, use
the built-in enumerate function:
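A minimal sketch of enumerate(), reusing the word list from the for example. The start=1 argument is an optional choice made here; without it, counting begins at 0.

```python
words = ["cat", "window", "defenestrate"]

indexed = []
for index, word in enumerate(words, start=1):
    indexed.append((index, word))   # keep the pairs so they can be inspected
    print(index, word)
```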
Generator
There is a lot of overhead in building an iterator in Python: we have to
implement a class with __iter__() and __next__() methods, keep track of
internal state, raise StopIteration when there are no more values to return,
and so on.
This is both lengthy and counterintuitive. Generators come to the rescue in
such situations.
A generator is a function that returns an object (iterator) which we can
iterate over (one value at a time).
If a function contains at least one yield statement, it becomes a generator
function. Both yield and return will return some value from a function.
The difference is that, while a return statement terminates a function
entirely, yield statement pauses the function saving all its states and later
continues from there on successive calls.
def my_gen():
    n = 1
    print('This is printed first')
    # Generator function contains yield statements
    yield n
    n += 1
    print('This is printed second')
    yield n
    n += 1
    print('This is printed last')
    yield n
To restart the process we need to create another generator object using something like
a = my_gen().
Here a is a generator object.
2. Memory Efficient
A normal function to return a sequence will create the entire sequence in memory
before returning the result. This is an overkill if the number of items in the sequence is
very large.
Generator implementation of such sequence is memory friendly and is preferred since it
only produces one item at a time.
4. Pipelining Generators
Suppose we have a log file from a famous fast food chain. The log file has a column
(4th column) that keeps track of the number of pizza sold every hour and we want to
sum it to find the total pizzas sold in 5 years.
Assume everything is in string and numbers that are not available are marked as 'N/A'.
A generator implementation of this could be as follows.
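One possible sketch of such a pipeline follows. The log layout (whitespace-separated columns) and the sample lines are assumptions made for illustration, and an in-memory io.StringIO object stands in for the real log file.

```python
import io

# Hypothetical log contents: the 4th whitespace-separated column is pizzas sold.
log_file = io.StringIO(
    "2019-01-01 10:00 storeA 12\n"
    "2019-01-01 11:00 storeA N/A\n"
    "2019-01-01 12:00 storeA 30\n"
)

# Stage 1: lazily pick the 4th column from each line.
pizza_col = (line.split()[3] for line in log_file)
# Stage 2: lazily drop unavailable entries and convert the rest to int.
per_hour = (int(value) for value in pizza_col if value != "N/A")
# Stage 3: sum() pulls items through both stages one at a time.
total_pizzas = sum(per_hour)
print(total_pizzas)
```

No intermediate list is built at any stage, so memory use stays flat even for a very large log.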
Python Closure
def print_msg(msg):
    # This is the outer enclosing function
    def printer():
        # This is the nested function
        print(msg)
    return printer # returns the nested function
# Now let's try calling this function.
# Output: Hello
another = print_msg("Hello")
another()
The print_msg() function was called with the string "Hello" and the returned
function was bound to the name another. On calling another(), the message was
still remembered although we had already finished executing
the print_msg() function.
This technique by which some data ("Hello" in this case) gets attached to the code is
called closure in Python.
This value in the enclosing scope is remembered even when the variable goes out
of scope or the function itself is removed from the current namespace.
# setter (a method of the Celsius class)
def set_temperature(self, value):
    print("Setting value...")
    if value < -273.15:
        raise ValueError("Temperature below -273.15 is not possible")
    self._temperature = value

human = Celsius(37)
print(human.temperature)
print(human.to_fahrenheit())
human.temperature = -300 # raises ValueError
When an object is created, the __init__() method gets called. This method has the
line self.temperature = temperature. This assignment automatically
calls set_temperature().
Similarly, any access like human.temperature automatically calls get_temperature().
This is what property does.
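Since only the setter is shown, here is one complete sketch of how such a Celsius class could look. It uses the @property decorator form rather than separate get_temperature() and set_temperature() methods, which is an assumption about style and not necessarily the original class.

```python
class Celsius:
    def __init__(self, temperature=0):
        self.temperature = temperature      # goes through the setter below

    def to_fahrenheit(self):
        return (self.temperature * 1.8) + 32

    @property
    def temperature(self):                  # getter
        return self._temperature

    @temperature.setter
    def temperature(self, value):           # setter with validation
        if value < -273.15:
            raise ValueError("Temperature below -273.15 is not possible")
        self._temperature = value

human = Celsius(37)
fahrenheit = human.to_fahrenheit()
try:
    human.temperature = -300                # rejected by the setter
except ValueError as exc:
    rejected = str(exc)
```

Because the rejected assignment raises before storing anything, human.temperature is still 37 afterwards.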
import time
localtime = time.localtime(time.time())
print("Local current time :", localtime)
This would produce the following result, which could be formatted in any other
presentable form −
Local current time : time.struct_time(tm_year=2013, tm_mon=7,
tm_mday=17, tm_hour=21, tm_min=26, tm_sec=3, tm_wday=2, tm_yday=198, tm_isdst=0)
Python Functions
Defining a Function
Function blocks begin with the keyword def followed by the function name and
parentheses ( ( ) ).
Any input parameters or arguments should be placed within these parentheses. You
can also define parameters inside these parentheses.
The code block within every function starts with a colon (:) and is indented.
The statement return [expression] exits a function, optionally passing back an
expression to the caller. A return statement with no arguments is the same as return
None.
Syntax
def functionname( parameters ):
    "function_docstring"
    function_suite
    return [expression]
Calling a Function
Once the basic structure of a function is finalized, you can execute it by calling
it from another function or directly from the Python prompt.
Pass by reference vs value
All parameters (arguments) in the Python language are passed by object reference. This
means that if you mutate the object a parameter refers to within a function, the change
is also visible in the calling function; rebinding the parameter name to a new object is not.
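The difference between mutating an argument and rebinding it can be seen in a short sketch. The function names are hypothetical.

```python
def append_item(items):
    items.append(4)        # mutates the caller's list in place

def rebind(items):
    items = [0]            # rebinds the local name only
    return items

numbers = [1, 2, 3]
append_item(numbers)       # visible to the caller
rebound = rebind(numbers)  # caller's list is untouched
print(numbers)
```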
Function Arguments
Required arguments
Keyword arguments - When you use keyword arguments in a function call,
the caller identifies the arguments by the parameter name.
def printme( str ):
    print(str)
printme( str = "My string")
Variable-length arguments –
Arbitrary Arguments
You may need to process a function for more arguments than you specified
while defining the function. These arguments are called variable-
length arguments and are not named in the function definition, unlike required
and default arguments.
def functionname([formal_args,] *var_args_tuple ):
    "function_docstring"
    function_suite
    return [expression]
An asterisk (*) is placed before the variable name that holds the values of all
non-keyword variable arguments. This tuple remains empty if no additional
arguments are specified during the function call.
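# mySum is not defined in these notes; this definition is an assumed one
def mySum(*args):
    return sum(args) # args is a tuple holding all positional arguments
# Driver code
my_list = [1, 2, 3, 4]
print(mySum(*my_list)) # * unpacks the list into separate arguments
print(mySum(1, 2, 3, 4, 5))
print(mySum(10, 20))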
** is used for dictionaries:
In a function call, ** unpacks the dictionary it is applied to and passes its items
as keyword arguments to the function.
In a function definition, it lets us accept any number of named (keyword)
arguments.
def fun(**kwargs):
    # kwargs is a dict
    print(type(kwargs))
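A compact sketch that combines * and ** on both the definition side and the call side. The function describe and the options dictionary are hypothetical names used for illustration.

```python
def describe(*args, **kwargs):
    """Return the positional arguments as a tuple and the keyword ones as a dict."""
    return args, kwargs

positional, named = describe(1, 2, color="red")
empty = describe()                         # both collections are empty

options = {"sep": "-", "end": "!"}
joined = "{sep}{end}".format(**options)    # ** unpacks the dict into keyword arguments
```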
Syntax
lambda [arg1 [,arg2,.....argn]]:expression
d = lambda x: x * 2
print(d(4))
Output: 8
Scope of Variables
Global variables
Local variables
#!/usr/bin/python
total = 0 # This is a global variable.
# Function definition is here
def sum( arg1, arg2 ):
    # Add both the parameters and return them.
    total = arg1 + arg2 # Here total is a local variable.
    return total
Global Keyword
Global keyword is a keyword that allows a user to modify a variable outside of the
current scope. It is used to create global variables from a non-global scope i.e
inside a function. Global keyword is used inside a function only when we want to do
assignments or when we want to change a variable. Global is not needed for
printing and accessing.
x = 15
def change():
    global x
    # increment value of x by 5
    x = x + 5
    print("Value of x inside a function:", x)
change()
print("Value of x outside a function:", x)
Output:
Value of x inside a function: 20
Value of x outside a function: 20
The statement from fib import fibonacci does not import the entire module fib into the
current namespace; it just introduces the item fibonacci from the module fib into the
global symbol table of the importing module.
The statement from modname import * provides an easy way to import all the items from a
module into the current namespace; however, this statement should be used sparingly.
Packages in Python
A package is a collection of Python modules, i.e., a package is a directory of Python modules
containing an additional __init__.py file. The __init__.py distinguishes a package from a
directory that just happens to contain a bunch of Python scripts. Packages can be nested to any
depth, provided that the corresponding directories contain their own __init__.py file.
This file can be empty, and it indicates that the directory it contains is a
Python package, so it can be imported the same way a module can be
imported.
The __init__.py file can also decide which modules the package exports as
the API, while keeping other modules internal, by overriding
the __all__ variable.
When you import a module or a package, the corresponding object created by Python is always
of type module. This means that the distinction between module and package is just at the file
system level. Note, however, when you import a package, only variables/functions/classes in
the __init__.py file of that package are directly visible, not sub-packages or modules.
For example, the datetime module defines a class called date. When you import
datetime, the name date is not bound in your namespace. You need to import it
explicitly, or refer to it as datetime.date.
>>> import datetime
>>> date.today()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'date' is not defined
>>> from datetime import date
>>> date.today()
datetime.date(2017, 9, 1)
raw_input
input
In Python 2, raw_input() prompts for a line and returns it as a string; in Python 3 this
function is named input(). When "Hello Python" is typed, the output is like this −
Enter your input: Hello Python
Received input is : Hello Python
In Python 2, input() instead evaluates what is typed as a Python expression, which would
produce the following result against the entered input −
Enter your input: [x*5 for x in range(2,10,2)]
Received input is : [10, 20, 30, 40]
Attribute Description
file.softspace Returns false if space explicitly required with print, true otherwise.
Syntax:
fileObject.close()
with Statement
A concise and safe way to open a text file is the with statement, which closes the file
automatically when the block ends:
with open("file-name", "r") as fp:
    fileData = fp.read()
# to print the contents of the file
print(fileData)
File Positions
The tell() method tells you the current position within the file; in other
words, the next read or write will occur at that many bytes from the
beginning of the file.
The seek(offset[, from]) method changes the current file position.
The offset argument indicates the number of bytes to be moved.
The from argument specifies the reference position from where the bytes
are to be moved.
If from is set to 0, it means use the beginning of the file as the reference
position and 1 means use the current position as the reference position and
if it is set to 2 then the end of the file would be taken as the reference
position.
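The three reference positions can be exercised with a small sketch. It writes a temporary file (the file name and contents are assumptions) and opens it in binary mode, because Python 3 text files do not allow nonzero seeks relative to the current position or the end.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.bin")   # hypothetical file name
    with open(path, "wb") as fp:
        fp.write(b"0123456789")

    with open(path, "rb") as fp:
        first = fp.read(3)                # reads bytes 0..2
        position_after_read = fp.tell()   # next read starts at byte 3
        fp.seek(2, 1)                     # move 2 bytes on from the current position
        middle = fp.read(1)
        fp.seek(-2, 2)                    # 2 bytes before the end of the file
        tail = fp.read()
        fp.seek(0, 0)                     # back to the beginning
        start = fp.tell()
```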
Directories in Python
OS Module
The mkdir() Method
The mkdir() method of the os module creates a directory in the current
directory.
#!/usr/bin/python
import os
# Create a directory "test"
os.mkdir("test")
Creating Subdirectories
import os
os.makedirs("Temp/ temp1/ temp
Renaming a directory/folder
The os.rename() method can rename a folder from an old name to a new
one.
Example
import os
os.rename("Temp","Temp11")
If the file "my_file.txt" exists in the current path, the check returns True; otherwise it returns False.
Syntax:
os.remove(file_name)
1 file.close()
Close the file. A closed file cannot be read or written any more.
2 file.flush()
Flush the internal buffer, like stdio's fflush. This may be a no-op on some
file-like objects.
3 file.fileno()
Returns the integer file descriptor that is used by the underlying
implementation to request I/O operations from the operating system.
4 file.isatty()
Returns True if the file is connected to a tty(-like) device, else False.
5 file.next()
Returns the next line from the file each time it is called. Python 2 only; in
Python 3 use the built-in next(file).
6 file.read([size])
Reads at most size bytes from the file (less if the read hits EOF before
obtaining size bytes).
file.read() #reads all the content of file
file.read(5) # reads 5 characters from the file
7 file.readline([size])
Reads one entire line from the file. A trailing newline character is kept in the
string.
file.readline(5) # reads at most 5 characters of the current line
8 file.readlines([sizehint])
Reads until EOF using readline() and return a list containing the lines. If the
optional sizehint argument is present, instead of reading up to EOF, whole
lines totalling approximately sizehint bytes (possibly after rounding up to an
internal buffer size) are read.
file.readlines() # reads up to EOF
file.readlines(5) # reads whole lines until about 5 characters have been read
9 file.seek(offset[, whence])
Sets the file's current position
10 file.tell()
Returns the file's current position
11 file.truncate([size])
Truncates the file's size. If the optional size argument is present, the file is
truncated to (at most) that size.
12 file.write(str)
Writes a string to the file. In Python 3 it returns the number of characters written.
file.write("hello world") # writes the string to the file
13 file.writelines(sequence)
Writes a sequence of strings to the file. The sequence can be any iterable
object producing strings, typically a list of strings.
List_of_lines = ["One line of text here", "and another line here",
                 "and yet another here", "and so on and so forth"]
file.writelines(List_of_lines) # writes the strings in order; no newlines are added
Important Statements
When you open a file for reading, if the file does not exist, an error occurs
When you open a file for writing, if the file does not exist, a new file is
created
When you open a file for writing, if the file exists, the existing file is
overwritten with the new file.
Pywinauto.application
KeyStrokes
Send_Keys
import logging
import sys
import time

from pywinauto.keyboard import send_keys

log = logging.getLogger(__name__) # assumed logger; the notes do not show how log is defined

def func_execute_keystrokes(keys, repetitions, time_wait=False, delay=1):
    try:
        repetitions = round(float(repetitions))
        keys = keys.lower()
        key_shorthand = {'ctrl': '^', 'alt': '%', 'shift': '+'}
        hot_keys = {'esc': 'vk_escape'}
        keys_array = keys.split('+')
        for index, key in enumerate(keys_array):
            if key in hot_keys:
                keys_array[index] = hot_keys[key]
                key = keys_array[index]
            # wrap multi-character key names in braces unless they are modifiers
            if len(key) > 1 and key not in key_shorthand:
                keys_array[index] = '{' + key.upper() + '}'
            if key in key_shorthand:
                keys_array[index] = key_shorthand[key]
        final_keys = ''.join(keys_array)
        for i in range(repetitions):
            if time_wait:
                time.sleep(float(delay))
            send_keys(final_keys)
        log.info('Entered keystroke ' + keys)
    except Exception as e:
        log.error('Error sending key strokes: ' + str(sys.exc_info()[1]))
        return False, e
Handling an exception
try:
    You do your operations here;
    ......................
except ExceptionI:
    If there is ExceptionI, then execute this block.
except ExceptionII:
    If there is ExceptionII, then execute this block.
    ......................
else:
    If there is no exception then execute this block.
try:
    You do your operations here;
    ......................
except:
    If there is any exception, then execute this block.
    ......................
else:
    If there is no exception, then execute this block.
This kind of a try-except statement catches all the exceptions that occur.
try:
    You do your operations here;
    ......................
except(Exception1[, Exception2[,...ExceptionN]]]):
    If there is any exception from the given exception list,
    then execute this block.
    ......................
else:
    If there is no exception, then execute this block.
try:
    fh = open("testfile", "w")
    try:
        fh.write("This is my test file for exception handling!!")
    finally:
        print("Going to close the file")
        fh.close()
except IOError:
    print("Error: can't find file or read data")
Argument of an Exception
An exception can have an argument, which is a value that gives additional information
about the problem. The contents of the argument vary by exception. You capture an
exception's argument by supplying a variable in the except clause as follows −
try:
    You do your operations here;
    ......................
except ExceptionType as Argument:
    You can print value of Argument here...
Raising an Exception
Syntax (Python 3; the Python 2 form raise Exception, args, traceback is no longer valid):
raise [Exception[(args)] [from cause]]
User-Defined Exceptions
In the try block, the user-defined exception is raised and caught in the
except block. The variable e is bound to the raised instance of the
class Networkerror.
class Networkerror(RuntimeError):
    def __init__(self, arg):
        self.args = (arg,)
So once you have defined the class above, you can raise the exception as follows −
try:
    raise Networkerror("Bad hostname")
except Networkerror as e:
    print(e.args)
Creating Classes
The class statement creates a new class definition. The name of the class
immediately follows the keyword class followed by a colon as follows −
class Employee:
    'Common base class for all employees'
    empCount = 0
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary
        Employee.empCount += 1
    def displayCount(self):
        print("Total Employee %d" % Employee.empCount)
    def displayEmployee(self):
        print("Name : ", self.name, ", Salary: ", self.salary)
Elements defined in the class body outside any method, such as empCount = 0, are
class (static) attributes; they belong to the class and are shared by all instances.
The first method __init__() is a special method, which is called class constructor
or initialization method that Python calls when you create a new instance of this
class.
We declare other class methods like normal functions with the exception that
the first argument to each method is self. Python passes the self argument
for us when the method is called on an instance.
self is a reference variable that always points to the current object. Within
a Python class, we use the self variable to refer to the current object.
The first argument to the constructor and to instance methods should be self.
The Python virtual machine (PVM) is responsible for providing the value of the self
argument, so it does not need to be passed explicitly.
By using self we can declare and access instance variables.
Constructor
The name should be always: __init__().
Whenever an object is created, constructor is called automatically, it is not required
to call it explicitly.
The main objective: to declare and initialize variables. For every object,
constructor will be executed once.
__new__ method
Whenever a class is instantiated, the __new__ and __init__ methods are
called. The __new__ method is called to create the object
and the __init__ method is called to initialize it. In the base
class object, the __new__ method is defined as a static method which
requires a parameter cls. cls represents the class that is to
be instantiated, and the interpreter automatically provides this parameter at
the time of instantiation.
Accessing Attributes
You access the object's attributes using the dot operator with the object. A class
variable is accessed using the class name as follows −
emp1.displayEmployee()
print("Total Employee %d" % Employee.empCount)
Instead of using the normal statements to access attributes, you can use the
following functions −
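The built-in functions usually meant here are getattr(), hasattr(), setattr() and delattr(). The sketch below uses a simplified stand-in for the Employee class, and the name, salary and age values are assumptions made for illustration.

```python
class Employee:
    """Hypothetical stand-in for the Employee class discussed above."""
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

emp1 = Employee("Zara", 2000)          # sample values, assumed for illustration

had_age = hasattr(emp1, "age")         # False: no such attribute yet
setattr(emp1, "age", 8)                # same as emp1.age = 8
age = getattr(emp1, "age")             # same as emp1.age
delattr(emp1, "age")                   # same as del emp1.age
fallback = getattr(emp1, "age", None)  # default returned once the attribute is gone
```

These functions are mainly useful when the attribute name is only known at run time, as a string.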
Types of Methods
1. Instance method
2. Class method
3. Static method
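The three kinds of method differ in what they receive as their first argument. A minimal sketch, using a hypothetical Circle class:

```python
class Circle:
    pi = 3.14159                      # class attribute shared by all instances

    def __init__(self, radius):
        self.radius = radius

    def area(self):                   # instance method: receives the instance as self
        return Circle.pi * self.radius ** 2

    @classmethod
    def unit(cls):                    # class method: receives the class as cls
        return cls(1)

    @staticmethod
    def is_valid_radius(value):       # static method: receives neither
        return value > 0

unit_circle = Circle.unit()           # called on the class, no instance needed
```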
Class Inheritance
Different Types of Inheritance
Single inheritance.
Multi-level inheritance.
Multiple inheritance.
Multipath inheritance.
Hierarchical Inheritance.
Hybrid Inheritance.
Inheritance
class SubClassName (ParentClass1[, ParentClass2, ...]):
    'Optional class documentation string'
    class_suite
Example:
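#!/usr/bin/python
class Parent: # define parent class
    parentAttr = 100 # assumed value for illustration
    def parentMethod(self):
        print('Calling parent method')
    def getAttr(self):
        print("Parent attribute :", Parent.parentAttr)
class Child(Parent): # define child class
    def childMethod(self):
        print('Calling child method')
class Base1:
    pass
class Base2:
    pass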
Polymorphism-Overriding Methods
Polymorphism means the same function name (but different signatures) being used for
different types.
In Python, Polymorphism lets us define methods in the child class that have
the same name as the methods in the parent class. In inheritance, the child
class inherits the methods from the parent class. However, it is possible to
modify a method in a child class that it has inherited from the parent class.
This is particularly useful in cases where the method inherited from the
parent class doesn’t quite fit the child class. In such cases, we re-implement
the method in the child class. This process of re-implementing a method in
the child class is known as Method Overriding.
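Method overriding as described above can be sketched as follows. The class names are hypothetical; the Puppy class additionally shows that an overriding method may still reuse the inherited one through super().

```python
class Animal:
    def speak(self):
        return "..."

class Dog(Animal):
    def speak(self):                 # overrides Animal.speak
        return "Woof"

class Puppy(Dog):
    def speak(self):                 # extends the inherited behaviour via super()
        return super().speak() + "!"

# The same call resolves to a different method depending on the object's class.
sounds = [animal.speak() for animal in (Animal(), Dog(), Puppy())]
```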
Encapsulation-Data Hiding
An object's attributes may or may not be visible outside the class definition. You
can name attributes with a double underscore prefix, and those
attributes are then not directly visible to outsiders.
class JustCounter:
    __secretCount = 0
    def count(self):
        self.__secretCount += 1
        print(self.__secretCount)
counter = JustCounter()
counter.count()
counter.count()
print(counter.__secretCount) # raises AttributeError: the name is mangled to _JustCounter__secretCount
Single Underscore:
In Interpreter
_ holds the value of the last expression evaluated at the Python
prompt/interpreter.
After a name
To avoid a conflict between a Python keyword and a variable name, we add an
underscore after the name (for example, class_).
Before a name
A leading underscore before a variable, function, or method name indicates to
the programmer that it is for internal use only and may be changed
whenever the class author wants.
A name prefixed with an underscore is treated as non-public. With from module
import *, names starting with _ are not imported unless they are listed in
__all__. Python has no truly private names, so these can still be accessed
directly from other modules; this is also called weak private.
Python Regular Expressions
A regular expression is a special sequence of characters that helps you match or
find other strings or sets of strings, using a specialized syntax held in a pattern.
The module re provides full support for Perl-like regular expressions in Python.
The re module raises the exception re.error if an error occurs while compiling or
using a regular expression.
findall()
Returns all non-overlapping matches of pattern in string, as a list of strings.
The string is scanned left-to-right, and matches are returned in the order in which they are
found.
import re
l = re.findall('[a-z]', 'ab3ffds7djfh')
print(l)   # ['a', 'b', 'f', 'f', 'd', 's', 'd', 'j', 'f', 'h']
fullmatch()
The complete string must match the given pattern; if it does, a match object is returned,
otherwise None.
import re
match = re.fullmatch(pattern, string)
group(num) or groups()
We use the group(num) or groups() method of a match object to get the matched expression.
group(num=0)
This method returns the entire match (or a specific subgroup num).
groups()
This method returns all matching subgroups in a tuple (empty if there
weren't any).
groupdict([default])
Returns a dictionary containing all the named subgroups of the match, keyed by the
subgroup name. The default argument is used for groups that did not participate in the
match; it defaults to None.
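As a rough illustration of group(), groups() and groupdict() on one match object, here is a small sketch; the pattern, the group names first and last, and the sample text are made up for this example:

```python
import re

# Two named groups separated by a single space
match = re.search(r"(?P<first>\w+) (?P<last>\w+)", "Jane Doe")

whole = match.group(0)     # the entire match
first = match.group(1)     # first subgroup only
parts = match.groups()     # all subgroups as a tuple
named = match.groupdict()  # named subgroups as a dictionary

print(whole, first, parts, named)
```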
search Function
This function searches for the first occurrence of the RE pattern within string, with
optional flags. If no match is found, it returns None.
Syntax:
import re
re.search(pattern, string, flags=0)
Parameter Description
pattern This is the regular expression to be matched.
substitution or replacement
This method replaces all occurrences of the RE pattern in string with repl,
substituting all occurrences unless count is provided. This method returns the modified
string.
re.sub(regex, replacement, targetstring)
s = re.sub(r'\d', '#', 'ab3f6fds7djf0h')
print(s)   # ab#f#fds#djf#h
subn()
Performs the same substitution as sub() and also reports the number of substitutions made.
The return type is a tuple: (resultstring, number of replacements).
t = re.subn(regex, replacement, targetstring)
t = re.subn(r'\d', '#', 'ab3f6fds7djf0h')
print(t)   # ('ab#f#fds#djf#h', 4)
split()
Returns a list of the pieces of the string that lie between matches of the pattern.
l = re.split('[.]', 'www.google.com')
print(l)   # ['www', 'google', 'com']
Application Areas of Regular Expression:
Validations
Pattern Matching application
Translators like compilers, interpreters, assemblers
To develop digital circuits
To develop communication Protocols like TCP/IP
MetaCharacters
Metacharacters are characters that are interpreted in a special way by a RegEx engine.
[] . ^ $ * + ? {} () \ |
. - Period
A period matches any single character (except newline '\n').
^ - Caret
The caret symbol ^ is used to check if a string starts with a certain character.
$ - Dollar
The dollar symbol $ is used to check if a string ends with a certain character.
* - Star
The star symbol * matches zero or more occurrences of the pattern left to it.
+ - Plus
The plus symbol + matches one or more occurrences of the pattern left to it.
? - Question Mark
The question mark symbol ? matches zero or one occurrence of the pattern left to it.
{} - Braces
Consider this code: {n,m}. This means at least n, and at most m repetitions of the pattern
left to it.
| - Alternation
Vertical bar | is used for alternation (or operator).
() - Group
Parentheses () is used to group sub-patterns. For example, (a|b|c)xz match any string
that matches either a or b or c followed by xz
\ - Backslash
Backslash \ is used to escape various characters including all metacharacters.
For example,
\$a match if a string contains $ followed by a. Here, $ is not interpreted by a RegEx
engine in a special way.
Character classes:
[abc]= either a or b or c
[^abc] = except a, b and c
[a-z] = any lowercase alphabet symbol
[A-Z] = any uppercase alphabet symbol
[a-zA-Z] = any alphabet symbol
[0-9] = any numeric symbol
[a-zA-Z0-9] = any alpha numeric symbol
[^a-zA-Z0-9] = special character
Quantifiers
Used to specify the number of occurrences to match
a = exactly one 'a'
a+ = at least one 'a'
a* = any number of a's, including zero
a? = at most one 'a'
(either one or zero a's)
a{n} = exactly n a's
a{m,n} = minimum m and maximum n a's
^a = checks whether the given target string starts with a or not
a$ = checks whether the given target string ends with a or not
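The quantifiers and anchors listed above can be checked with a short sketch. The sample strings and the helper matching() are assumptions made for this illustration only:

```python
import re

samples = ["", "a", "aa", "aaa", "aaaa"]

def matching(pattern):
    # fullmatch() requires the whole string to match the pattern
    return [s for s in samples if re.fullmatch(pattern, s)]

plus = matching(r"a+")            # at least one 'a'
star = matching(r"a*")            # zero or more
optional = matching(r"a?")        # zero or one
exactly_two = matching(r"a{2}")   # exactly two
two_to_three = matching(r"a{2,3}")

# Anchors: ^ ties the match to the start, $ to the end
starts_with_a = bool(re.search(r"^a", "apple"))
ends_with_a = bool(re.search(r"a$", "apple"))

print(plus, star, optional, exactly_two, two_to_three)
print(starts_with_a, ends_with_a)
```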
Python Decorators
We have two different kinds of decorators in Python:
Function decorators
Class decorators
A decorator in Python is any callable Python object that is used to modify a function
or a class. A reference to a function "func" or a class "C" is passed to a decorator
and the decorator returns a modified function or class. The modified functions or
classes usually contain calls to the original function "func" or class "C".
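A minimal sketch of a function decorator follows. The names count_calls and add are hypothetical; the decorator receives the original function, wraps it, and returns the wrapper, which still calls the original:

```python
import functools

def count_calls(func):
    @functools.wraps(func)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)  # call the original function
    wrapper.calls = 0
    return wrapper

@count_calls
def add(x, y):
    return x + y

result = add(2, 3)
add(1, 1)
print(result, add.calls, add.__name__)
```

Writing @count_calls above the definition is equivalent to add = count_calls(add).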
Monkey Patching
Multithreading in Python
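import threading
t = threading.Thread(target=function, args=(arg1, arg2), kwargs={'key': value})
t.start()
target: the function to be executed by the thread
args: a tuple of positional arguments to be passed to the target function
kwargs: a dictionary of keyword arguments to be passed to the target function
To start a thread, we use the start method of the Thread class.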
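The following is a small runnable sketch of the Thread constructor described above. The worker function, the thread names and the lock are assumptions for this example, not part of the original notes:

```python
import threading

results = []
lock = threading.Lock()  # protects the shared list

def worker(name, count):
    total = sum(range(count))
    with lock:
        results.append((name, total))

threads = [
    threading.Thread(target=worker, args=(f"t{i}",), kwargs={"count": 1000})
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every thread to finish

print(sorted(results))
```

The results are sorted before printing because the order in which threads finish is not guaranteed.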
A thread is a lightweight unit of execution within a process, and multithreading allows us to run multiple
threads at once. Python supports multithreading through the threading module in its
standard library.
The GIL (Global Interpreter Lock) in CPython ensures that only a single thread executes Python bytecode at a
time. A thread holds the GIL and does a little work before passing it on to the next
thread. This makes for an illusion of parallel execution. But in reality, it is just
threads taking turns at the CPU. Of course, all the passing around adds overhead
to the execution.
Pros
Lightweight - low memory footprint
Shared memory - makes access to state from another context easier
Allows you to easily make responsive UIs
CPython C extension modules that properly release the GIL will run in parallel
Great option for I/O-bound applications
Cons
CPython - subject to the GIL
Not interruptible/killable
If not following a command queue/message pump model (using
the queue module), then manual use of synchronization primitives becomes a
necessity (decisions are needed for the granularity of locking)
Code is usually harder to understand and to get right - the potential for race
conditions increases dramatically
Multiprocessing in Python
Multiprocessing refers to the ability of a system to support more than one processor
at the same time. The operating system allocates processes to the processors,
improving the performance of the system.
# importing the multiprocessing module
import multiprocessing
def print_cube(num):
    print("Cube: {}".format(num * num * num))
def print_square(num):
    print("Square: {}".format(num * num))
if __name__ == "__main__":
    # creating processes
    p1 = multiprocessing.Process(target=print_square, args=(10, ))
    p2 = multiprocessing.Process(target=print_cube, args=(10, ))
    # starting process 1
    p1.start()
    # starting process 2
    p2.start()
    # wait until both processes have finished
    p1.join()
    p2.join()
Advantages
In multiprocessing, any newly created process will:
run independently
have its own memory space.
The threading module uses threads, the multiprocessing module uses
processes. The difference is that threads run in the same memory space,
while processes have separate memory. This makes it a bit harder to
share objects between processes with multiprocessing. Since threads use
the same memory, precautions have to be taken or two threads will write
to the same memory at the same time. In CPython, the global interpreter
lock protects the interpreter's own state in this way; your own shared data
still needs locks.
Spawning processes is a bit slower than spawning threads. Once they are
running, there is not much difference.
Memory Management
Every object in CPython has a reference count and a pointer to its type.
ob_refcnt: reference count
ob_type: pointer to the object's type
Python has a private heap space to hold all objects and data structures. Being
programmers, we cannot access it; it is the interpreter that manages it. But with
the core API, we can access some tools. The Python memory manager controls the
allocation.
Garbage Collection
Additionally, an inbuilt garbage collector recycles all unused memory so it can make
it available to the heap space.
Python allows you to inspect the current reference count of an object with
the sys module. You can use sys.getrefcount(numbers), but keep in mind that
passing in the object to getrefcount() increases the reference count by 1.
In any case, if the object is still required to hang around in your code, its reference
count is greater than 0. Once it drops to 0, the object has a specific deallocation
function that is called which “frees” the memory so that other objects can use it.
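A small sketch of inspecting reference counts with sys.getrefcount() is shown below. It assumes CPython; the absolute numbers vary between versions, so the example only compares counts relative to each other. The names numbers and holder are illustrative:

```python
import sys

numbers = [1, 2, 3]
before = sys.getrefcount(numbers)

holder = {"ref": numbers}          # one additional reference to the list
during = sys.getrefcount(numbers)

del holder                         # drop that reference again
after = sys.getrefcount(numbers)

print(during - before, after - before)
```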
Coroutine
What is the difference between a "generator" and a "coroutine" in Python?
A generator is an iterator that generates values on the fly as needed. It is defined using the yield
keyword and is consumed with a "for" loop or by calling the next() function. Generators are useful for
generating large sequences of values that may be too large to store in memory.
On the other hand, a coroutine is a special kind of function that can be paused and resumed at
specific points. It is defined using the async def syntax and is run by awaiting it with the await
keyword or by scheduling it on an event loop. Coroutines help perform asynchronous operations, such as
network or database I/O, without blocking the main thread of execution.
Asyncio
The "asyncio" library is a built-in library in Python that provides an infrastructure for writing
asynchronous, concurrent code that runs on a single thread. It is designed to help developers write highly
efficient and scalable network servers and clients. Asyncio enables you to write code that can
perform I/O operations without blocking the main thread of execution, which can significantly
improve the performance and responsiveness of your applications.
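A minimal asyncio sketch follows. The coroutine fetch() and its delays are invented stand-ins for real I/O; asyncio.sleep() only simulates waiting:

```python
import asyncio

async def fetch(name, delay):
    # Pauses this coroutine without blocking the event loop
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    # Run both coroutines concurrently; results keep the argument order
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))

results = asyncio.run(main())
print(results)
```

Although "b" finishes first, gather() returns the results in the order the coroutines were passed in.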
# my_module.py
__all__ = ['foo', 'bar']
def foo():
    pass
def bar():
    pass
def baz():
    pass
If someone writes from my_module import *, only foo and bar will be imported, because they are the
only names listed in __all__. baz will not be imported, because it is not in __all__.
img = cv2.imread('image.jpg')
To avoid all the ways your tesseract output accuracy can drop, you need to make
sure the image is appropriately pre-processed.
This includes rescaling, binarization, noise removal, deskewing, etc.
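Preprocessing can be done with code like the following (get_grayscale, thresholding,
opening and canny are helper functions that must be defined separately):
import cv2
image = cv2.imread('aurebesh.jpg')
gray = get_grayscale(image)
thresh = thresholding(gray)
opened = opening(gray)   # renamed so the helper function is not overwritten
edges = canny(gray)      # renamed for the same reason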
digitize invoices, PDFs or number plates
Limitations of Tesseract
The OCR is not as accurate as some commercial solutions available to us.
Doesn't do well with images affected by artifacts including partial occlusion,
distorted perspective, and complex background.
It is not capable of recognizing handwriting.
It may find gibberish and report this as OCR output.
If a document contains languages outside of those given in the -l LANG
arguments, results may be poor.
It is not always good at analyzing the natural reading order of documents. For
example, it may fail to recognize that a document contains two columns, and
may try to join text across columns.
Poor quality scans may produce poor quality OCR.
It does not expose information about what font family text belongs to.
Pywinautogui Module
PyAutoGUI is a cross-platform GUI automation Python module for human
beings. Used to programmatically control the mouse & keyboard.
Screenshot Functions
>>> import pyautogui
>>> im2 = pyautogui.screenshot('my_screenshot2.png')
Selenium Module
1. Usage
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Firefox()
driver.get("http://www.python.org")
2. Locating Elements
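Older Selenium releases provide the following methods to locate elements in a page; they were
deprecated in Selenium 4 and later removed in favour of driver.find_element(By.ID, ...) and
driver.find_elements(By.NAME, ...):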
find_element_by_id
find_element_by_name
find_element_by_xpath
find_element_by_link_text
find_element_by_partial_link_text
find_element_by_tag_name
find_element_by_class_name
find_element_by_css_selector
find_elements_by_name
find_elements_by_xpath
find_elements_by_link_text
find_elements_by_partial_link_text
find_elements_by_tag_name
find_elements_by_class_name
find_elements_by_css_selector
3. Waits
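Most user actions require some kind of wait before they can be performed.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox()
driver.get("http://somedomain/url_that_delays_loading")
try:
    # wait up to 10 seconds for the element to appear in the DOM
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "myDynamicElement"))
    )
finally:
    driver.quit()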
Implicit Wait
An implicit wait directs the WebDriver to poll the DOM for a certain
amount of time (as mentioned in the command) when trying to locate an
element that is not immediately available. The default implicit wait
is zero, and its unit is seconds. Once set, the implicit wait
remains in effect for the life of the WebDriver object.
driver.implicitly_wait(100)
The timeout in this example is 100 seconds which will trigger if the
target element is not available during that period.
Explicit Wait
The following two Selenium Python components are needed to implement
explicit waits:
the WebDriverWait class, and
the expected_conditions module.
4. Action Chains
ActionChains are a way to automate low level interactions such as mouse
movements, mouse button actions, key press, and context menu interactions.
This is useful for doing more complex actions like hover over and drag and drop.
Generate user actions.
When you call methods for actions on the ActionChains object, the actions
are stored in a queue in the ActionChains object. When you call perform(),
the events are fired in the order they are queued up.
ActionChains can be used in a chain pattern:
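from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
# find_element(By.CSS_SELECTOR, ...) replaces the removed find_element_by_css_selector
menu = driver.find_element(By.CSS_SELECTOR, ".nav")
hidden_submenu = driver.find_element(By.CSS_SELECTOR, ".nav #submenu1")
ActionChains(driver).move_to_element(menu).click(hidden_submenu).perform()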
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
driver = webdriver.Firefox()
driver.get('http://www.python.org/')
driver.save_screenshot('screenshot.png')
driver.quit()
Exceptions in Selenium
NoSuchElementException
StaleElementReferenceException
WebDriverException
We usually use Requests, the beloved Python HTTP library, for simple sites,
and Selenium, the popular browser automation tool, for sites that make heavy
use of JavaScript. Using Requests generally results in faster and more
concise code, while using Selenium makes development faster on JavaScript-heavy
sites.
Request Module
Asking a website or portal for information is called making a request.
import requests
r = requests.get('https://github.com/timeline.json')
r.status_code
>>200
r.headers
{
'status': '200 OK',
'content-encoding': 'gzip',
'transfer-encoding': 'chunked',
'connection': 'close',
'server': 'nginx/1.0.4',
'x-runtime': '148ms',
'etag': '"e1ca502697e5c9317743dc078f67693f"',
'content-type': 'application/json; charset=utf-8'
}
Encoding
Requests will automatically decode any content pulled from a server, and most
Unicode character sets are seamlessly decoded.
print(r.encoding)
>> utf-8
Custom Headers
To add custom HTTP headers to a request, you must pass them through a
dictionary to the headers parameter.
import json
url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}
headers = {'content-type': 'application/json'}
r = requests.post(url, data=json.dumps(payload),
headers=headers)
r.history
r = requests.put("http://httpbin.org/put")
r = requests.delete("http://httpbin.org/delete")
r = requests.head("http://httpbin.org/get")
r = requests.options("http://httpbin.org/get")
What are the commands that are used to copy an object in Python?
newdict = olddict.copy()
The assignment statement doesn't copy any object; it only creates a binding between the
target name and the object, which matters for mutable items. To keep an independent copy,
use the copy module, which provides generic shallow and deep copy operations.
To implement a queue, use collections.deque which was designed to have fast appends
and pops from both ends.
List Comprehensions
It can be used to construct lists in a very natural, easy way, like a mathematician is
used to do.
S = {x² : x in {0 ... 9}}
V = (1, 2, 4, 8, ..., 2¹²)
M = {x | x in S and x even}
>>> S = [x**2 for x in range(10)]
>>> V = [2**i for i in range(13)]
>>> M = [x for x in S if x % 2 == 0]
>>>
>>> print(S); print(V); print(M)
Note: Lines beginning with ">>>" and "..." indicate input to Python (these are the default prompts of
the interactive interpreter). Everything else is output from Python.
The interesting thing is that we first build a list of non-prime numbers, using a single
list comprehension, then use another list comprehension to get the "inverse" of the
list, which are prime numbers.
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
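The code that this paragraph describes is not shown in the text. One possible sketch that reproduces the list above is given here; the bounds 8 and 50 and the names noprimes and primes are assumptions chosen to match that output:

```python
# Every composite number below 50 has a factor between 2 and 7
noprimes = [j for i in range(2, 8) for j in range(i * 2, 50, i)]

# The "inverse": numbers from 2 to 49 that are not in noprimes
primes = [x for x in range(2, 50) if x not in noprimes]

print(primes)
```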
Lists can contain any type of elements, including strings, nested lists and functions.
You can even mix different types within a list.
The following works on a list of strings and produces a list of lists. Each of the sublists
contains two strings and an integer.
>>> words = 'The quick brown fox jumps over the lazy dog'.split()
>>> print(words)
>>>
>>> stuff = [[w.upper(), w.lower(), len(w)] for w in words]
>>> for i in stuff:
...     print(i)
...
['THE', 'the', 3]
['QUICK', 'quick', 5]
['BROWN', 'brown', 5]
['FOX', 'fox', 3]
['JUMPS', 'jumps', 5]
['OVER', 'over', 4]
['THE', 'the', 3]
['LAZY', 'lazy', 4]
['DOG', 'dog', 3]
>>>
>>> stuff = map(lambda w: [w.upper(), w.lower(), len(w)], words)
>>> for i in stuff:
...     print(i)
...
['THE', 'the', 3]
['QUICK', 'quick', 5]
['BROWN', 'brown', 5]
['FOX', 'fox', 3]
['JUMPS', 'jumps', 5]
['OVER', 'over', 4]
['THE', 'the', 3]
['LAZY', 'lazy', 4]
['DOG', 'dog', 3]
Dictionary Comprehensions
Zip
#dictionary comprehension
names = ['Bruce','Clark','Peter','Logan']
heros = ['Batman','Superman','Spiderman','Wolverine']
#print(list(zip(names,heros)))
print(dict(zip(names,heros)))
...
{'Peter': 'Spiderman', 'Bruce': 'Batman', 'Logan': 'Wolverine',
'Clark': 'Superman'}
Now it's time to use zip. First, we zip the lists and loop
through them in parallel like this:
>>> list(zip(keys,values))
[('a', 1), ('b', 2), ('c', 3)]
>>> D2 = {}
>>> for (k,v) in zip(keys, values):
... D2[k] = v
...
>>> D2
{'a': 1, 'b': 2, 'c': 3}
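The loop above can also be written as a dictionary comprehension. In this sketch the keys and values lists are assumed to be the ones implied by the output shown, and the names D3, squares and even_only are illustrative:

```python
keys = ['a', 'b', 'c']
values = [1, 2, 3]

# Same result as the explicit loop, in one expression
D3 = {k: v for k, v in zip(keys, values)}

# A comprehension can also compute values or filter items
squares = {x: x ** 2 for x in range(5)}
even_only = {k: v for k, v in D3.items() if v % 2 == 0}

print(D3, squares, even_only)
```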
Lambda Function
Anonymous functions that are constructed at runtime.
A simple one-line function that does not use the def or return keywords. These are
implicit.
#double
def double(x):
    return x*2
lambda x: 2*x
x is the parameter and (2*x) is the return value.
#add x and y
def add(x,y):
    return x+y
lambda x,y: x+y
#max of x,y (note: this name shadows the built-in max())
def max(x,y):
    if x>y:
        return x
    else:
        return y
lambda x,y: x if x>y else y
print(max(8,5))
Map Function
Applies the same function to each element of a sequence and returns the results
(in Python 3, as an iterator that can be passed to list()).
map(f, [m, n, p]) -> [f(m), f(n), f(p)]
The above example also demonstrates that you can do exactly the same thing
with map() and a lambda function. However, there are cases when you cannot
use map()and have to use a list comprehension instead, or vice versa. When you
can use both, then it is often preferable to use a list comprehension, because this is
more efficient and easier to read, most of the time.
You cannot use list comprehensions when the construction rule is too complicated to be
expressed with "for" and "if" statements, or if the construction rule can change
dynamically at runtime. In this case, you are better off using map() and/or filter() with an
appropriate function. Of course, you can combine that with list comprehensions.
#print [16,9,4,1]
def square(lst1):
    lst2 = []
    for num in lst1:
        lst2.append(num**2)
    return lst2
print(square([4,3,2,1]))
n = [4,3,2,1]
print(list(map(lambda x : x**2, n)))
or
print(square(n))   # square() takes the whole list, so it is called directly, not through map()
Function, f() = lambda x : x**2
List = n
#list comprehension
print([x**2 for x in n])
Filter Function
Filters items out of a sequence and returns the items that satisfy the condition
(in Python 3, as an iterator that can be passed to list()).
filter(c, [m, n, p]) -> [m, n]   (the items for which c(item) is true)
#print [4,3]
def over_two(lst1):
    lst2 = [x for x in lst1 if x>2]
    return lst2
print(over_two([4,3,2,1]))
n = [4,3,2,1]
print(list(filter(lambda x: x>2,n)))
Reduce Function
Applies the same operation to the items of a sequence, using the result of each operation as
the first parameter of the next operation, and returns a single item, not a list.
reduce(f, [m, n, p]) -> f(f(m,n),p)
#print 24
def mult(lst):
    prod = lst[0]
    for i in range(1,len(lst)):
        prod = prod * lst[i]
    return prod
print(mult([4,3,2,1]))
#reduce function
from functools import reduce   # reduce lives in functools in Python 3
n = [4,3,2,1]
print(reduce(lambda x,y : x*y,n))   # reduce returns a single value, so no list() is needed
Function Annotations
Function annotations are completely optional metadata information about the types used by
user-defined functions.
Annotations are stored in the __annotations__ attribute of the function as a dictionary and have
no effect on any other part of the function. Parameter annotations are defined by a colon after
the parameter name, followed by an expression evaluating to the value of the annotation.
Return annotations are defined by a literal ->, followed by an expression, between the
parameter list and the colon denoting the end of the def statement.
>>> def f(ham: str, eggs: str = 'eggs') -> str:
...     print("Annotations:", f.__annotations__)
...     print("Arguments:", ham, eggs)
...     return ham + ' and ' + eggs
...
>>> f('spam')
Annotations: {'ham': <class 'str'>, 'return': <class 'str'>,
'eggs': <class 'str'>}
Arguments: spam eggs
'spam and eggs'
NotImplementedError
Raised if an abstract method isn't available.
Socket Module
Socket programming is a way of connecting two nodes on a network to
communicate with each other. One socket(node) listens on a particular port at an
IP, while other socket reaches out to the other to form a connection. Server forms
the listener socket while client reaches out to the server.
import socket
import sys
Here we made a socket instance and passed it two parameters. The first
parameter is AF_INET and the second one is SOCK_STREAM. AF_INET
refers to the address family ipv4. The SOCK_STREAM means connection
oriented TCP protocol.
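The line that creates the socket is not shown in the text above. A minimal sketch of it, with the variable name s assumed, could look like this:

```python
import socket

# AF_INET selects IPv4 addressing, SOCK_STREAM selects TCP
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    family = s.family
    kind = s.type
    print(family, kind)
# Leaving the with block closes the socket automatically
```

The sketch only creates and closes the socket; connecting, binding and sending are separate steps.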
try:
    # Send data
    message = 'This is the message. It will be repeated.'
    print('sending "%s"' % message, file=sys.stderr)
    sock.sendall(message.encode())   # sockets send bytes, not str
finally:
    print('closing socket', file=sys.stderr)
    sock.close()
When the entire message is sent and a copy received, the socket is closed to free up the port.
For sending data the socket library has a sendall function. This function allows you to send data
to a server to which the socket is connected and server can also send data to the client using
this function.
# Next, bind to the port. We have not typed any IP in the IP field;
# instead we have passed an empty string, which makes the server listen to
# requests coming from other computers on the network
s.bind(('', port))
print("socket bound to %s" %(port))
Client
# Import socket module
import socket
Paramiko
import paramiko
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
conn = ssh.connect(hostname=server, username=username, password=password)
channel = ssh.invoke_shell()
channel.send('1\n')
channel.recv(9999)
ssh.close()
What is PEP 8?
PEP 8 is a coding convention, a set of recommendations about how to write your Python
code so that it is more readable. It is the Python style guide.
https://pep8.org
What is pickling and unpickling?
The pickle module accepts almost any Python object and converts it into a byte stream,
and writes it to a file by using the dump function; this process is called pickling.
Python object -> string representation (byte stream)
dumps() – This function is called to serialize an object hierarchy.
The process of retrieving the original Python objects from the stored byte stream
is called unpickling.
String representation (byte stream) -> Python object
loads() – This function is called to de-serialize a data stream.
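A short round-trip sketch of pickling and unpickling is shown below. The record dictionary is made-up sample data, and an in-memory io.BytesIO buffer stands in for a real file:

```python
import io
import pickle

record = {"name": "Ada", "scores": [90, 85], "active": True}

# dumps()/loads() work with bytes in memory
blob = pickle.dumps(record)
restored = pickle.loads(blob)

# dump()/load() work with a file-like object opened in binary mode
buffer = io.BytesIO()
pickle.dump(record, buffer)
buffer.seek(0)
from_file = pickle.load(buffer)

print(type(blob).__name__, restored == record, from_file == record)
```

Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.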
>>> a, b, c, d = [3, 4, 5]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: not enough values to unpack (expected 4, got 3)
>>> myVar = 5
>>> list = []
>>> list.remove(myVar)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: list.remove(x): x not in list
What are the tools that help to find bugs or perform static analysis?
PyChecker is a static analysis tool that detects the bugs in Python source code and warns
about the style and complexity of the bug. Pylint is another tool that verifies whether
the module meets the coding standard.
Random Module
Explain how can you generate random numbers in Python?
To generate random numbers in Python, you need to import the random module:
import random
random.randint(a,b)
This returns a random integer N such that a <= N <= b.
random() - float from 0 up to, but not including, 1
randint(x,y) - integer, both ends inclusive
uniform(x,y) – float between x and y (the end value y may or may not be included, depending on rounding)
randrange()-
randrange(start,stop,step)
randrange(10) generates a number from 0 to 9
randrange(1,11) generates a number from 1 to 10
randrange(1,11,2) generates one of the numbers 1,3,5,7,9 randomly
shuffle(list) - shuffles in place and doesn't return anything
sample(population,k)- Return a k length list of unique elements chosen from the
population sequence. Used for random sampling without replacement.
from random import sample
words = ['red', 'adventure', 'cat', 'cat']
newwords = sample(words, len(words)) # Copy and shuffle
print(newwords) # Possible Output: ['cat', 'cat', 'red', 'adventure']
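The functions listed above can be tried together in one sketch. A separate random.Random instance with a fixed seed is used here only so that repeated runs behave the same; the seed value and variable names are arbitrary choices for this example:

```python
import random

rng = random.Random(42)  # seeded only to make runs repeatable

r = rng.random()               # float in [0, 1)
i = rng.randint(1, 6)          # integer, both ends inclusive
u = rng.uniform(1.5, 2.5)      # float between the two bounds
odd = rng.randrange(1, 11, 2)  # one of 1, 3, 5, 7, 9

deck = list(range(10))
rng.shuffle(deck)              # shuffles in place, returns None
hand = rng.sample(deck, 3)     # 3 distinct positions, no replacement
pick = rng.choice(['apple', 'banana', 'grapes'])

print(r, i, u, odd, deck, hand, pick)
```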
What is the command to shuffle a list and to randomly select an element in a list?
fruits = ['apple', 'banana', 'grapes']
random.shuffle(fruits)
random.choice(fruits)
output:
<class 'int'>
<class 'int'>
Define class?
Class is a specific object type created when class statement is executed.
How do we share global variable across modules in Python?
We can create a config file (module) and store all the global variables to be shared across
modules or scripts in it.
Describe how to implement cookies for web python?
A cookie is an arbitrary string of characters that uniquely identify a session. Each cookie
is specific to one website and one user.
What are uses of lambda?
It is used to create small anonymous functions at run time.
How do I copy an object in Python?
Try copy.copy or copy.deepcopy for the general case. Not all objects can be copied, but
most can.
import copy
newobj = copy.copy(oldobj) # shallow copy
newobj = copy.deepcopy(oldobj) # deep (recursive) copy
Some objects can be copied more easily. Dictionaries have a copy method:
newdict = olddict.copy()
Sequences can be copied by slicing:
new_list = L[:]
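The difference between assignment, a shallow copy and a deep copy shows up once nested objects are modified. This is a small sketch with invented sample data:

```python
import copy

original = [[1, 2], [3, 4]]
alias = original                  # no copy: same object
shallow = copy.copy(original)     # new outer list, shared inner lists
deep = copy.deepcopy(original)    # fully independent copy

original[0].append(99)   # mutates an inner list
original.append([5, 6])  # mutates the outer list

print(alias)
print(shallow)
print(deep)
```

The shallow copy sees the change to the shared inner list but not the new outer element; the deep copy sees neither.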
You can also use the list, tuple, dict, and set functions to copy the corresponding objects,
and to convert between different sequence types:
It allows you to literally step through your list and select only those elements that your
step value includes.
L = list(range(10))
L[::2]
[0, 2, 4, 6, 8]
Negative values also work to make a copy of the same list in reverse order:
L[::-1]
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
When To Use Python Lists And When To Use Tuples, Dictionaries Or Sets?
Sets
Unordered collection of unique objects.
set_one={1,2,3,4,5}
set_two = {4,5,6,7,8}
Functions:
Add(): set_one.add(6)
Union(): print(set_one.union(set_two))
Intersection(): print(set_one.intersection(set_two))
Difference(): print(set_one.difference(set_two))
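The set functions above can be run as follows. This sketch assumes set_two was meant to contain the integers 4 to 8:

```python
set_one = {1, 2, 3, 4, 5}
set_two = {4, 5, 6, 7, 8}

set_one.add(6)  # set_one is now {1, 2, 3, 4, 5, 6}

union = set_one.union(set_two)                # in either set
intersection = set_one.intersection(set_two)  # in both sets
difference = set_one.difference(set_two)      # only in set_one

print(union, intersection, difference)
```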
Lists Versus Tuples
Tuples are used to collect an immutable ordered list of elements. This means that:
You can’t add elements to a tuple. There’s no append() or extend() method
for tuples,
You can’t remove elements from a tuple. Tuples have
no remove() or pop() method,
You can find elements in a tuple since this doesn’t change the tuple.
You can also use the in operator to check if an element exists in the tuple.
So, if you’re defining a constant set of values and all you’re going to do with it is iterate
through it, use a tuple instead of a list. It will be faster than working with lists and also
safer, as the tuples contain “write-protect” data.
Note that, because you have keys and values that link to each other, the performance
will be better than lists in cases where you’re checking membership of an element.
Output:
'VimalVimalVimalVimalVimal'
The valid range for the argument is from 0 through 1,114,111 (0x10FFFF in base 16).
ValueError will be raised if i is outside that range.
ord(c)
Given a string representing one Unicode character, return an integer representing the
Unicode code point of that character. For example, ord('a') returns the
integer 97 and ord('€') (Euro sign) returns 8364. This is the inverse of chr().
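A quick sketch of ord() and chr() as inverses, including the ValueError for a code point just past the valid range (variable names are illustrative):

```python
code_a = ord('a')          # 97
code_euro = ord('€')       # 8364
round_trip = chr(ord('€')) # back to the same character

try:
    chr(0x110000)          # one past the largest valid code point
    out_of_range_raised = False
except ValueError:
    out_of_range_raised = True

print(code_a, code_euro, round_trip, out_of_range_raised)
```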
s = "hello"
print(s.capitalize())
output: Hello
del()-
sort()-To sort the list in ascending order.
count ()- This method counts the number of occurrences of one string within
another string. The simplest form is this one: s.count(substring). Only non-
overlapping occurrences are taken into account:
find()-The function returns the index of the first occurrence of the substring. If
the substring is not found, the method returns -1.
Parameters
endswith()- It returns True if the string ends with the specified suffix, otherwise return
False optionally restricting the matching with the given indices start and end.
str.endswith(suffix[, start[, end]])
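The string methods count(), find() and endswith() described above can be sketched like this; the sample strings are made up for the example:

```python
text = "abababa"
count_aba = text.count("aba")          # non-overlapping matches only

found = "hello world".find("world")    # index of first occurrence
missing = "hello world".find("python") # -1 when not found

ends_txt = "report.txt".endswith(".txt")
# With start and end, only the slice "report.txt"[0:6] is checked
ends_port = "report.txt".endswith("port", 0, 6)

print(count_aba, found, missing, ends_txt, ends_port)
```

"aba" could be found three times in "abababa" if overlaps were allowed, but count() reports 2.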
Tuple
A tuple needn't be enclosed in parentheses.
>>> a,b,c = 1,2,3
>>> a,b,c
(1, 2, 3)
Dictionary
The method setdefault() is similar to get(), but will set dict[key]=default if key
is not already in dict.
dict.setdefault(key, default=None)
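The difference between get() and setdefault() can be seen in a short sketch. The inventory and groups dictionaries are sample data invented here:

```python
inventory = {"apples": 3}

existing = inventory.setdefault("apples", 0)  # key present: value unchanged
added = inventory.setdefault("pears", 0)      # key missing: inserted
looked_up = inventory.get("plums", 0)         # get() never inserts

# A common use: grouping items under a key
groups = {}
for word in ["apple", "avocado", "banana"]:
    groups.setdefault(word[0], []).append(word)

print(inventory, groups)
```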
itertools.count(start=0, step=1)
Make an iterator that returns evenly spaced values starting with number start. Often used as an
argument to map() to generate consecutive data points. Also, used with zip() to add
sequence numbers. Equivalent to:
def count(start=0, step=1):
    # count(10) --> 10 11 12 13 14 ...
    # count(2.5, 0.5) --> 2.5 3.0 3.5 ...
    n = start
    while True:
        yield n
        n += step
When counting with floating point numbers, better accuracy can sometimes be achieved
by substituting multiplicative code such
as: (start + step * i for i in count()).
itertools.cycle(iterable)
Make an iterator returning elements from the iterable and saving a copy of each. When
the iterable is exhausted, return elements from the saved copy. Repeats indefinitely.
Equivalent to:
def cycle(iterable):
    # cycle('ABCD') --> A B C D A B C D A B C D ...
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element
Note, this member of the toolkit may require significant auxiliary storage (depending on
the length of the iterable).
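Because count() and cycle() never stop on their own, they are usually combined with something that limits them, such as islice() or zip(). A small sketch with arbitrary sample values:

```python
from itertools import count, cycle, islice

first_evens = list(islice(count(0, 2), 5))      # take 5 values from count
numbered = list(zip(count(1), ["a", "b", "c"])) # zip stops at the shorter input
colours = list(islice(cycle("ABC"), 7))         # take 7 values from cycle

# Multiplicative form for floats, as suggested above
steps = [2.5 + 0.5 * i for i in islice(count(), 4)]

print(first_evens, numbered, colours, steps)
```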
pytest
Pytest is a testing framework which allows us to write test codes using python.
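A minimal sketch of what a pytest test file can look like is given below. The file name test_calc.py and the add() function are hypothetical; pytest collects functions whose names start with test_ and reports a failure when a plain assert is false:

```python
# contents of a hypothetical file named test_calc.py

def add(x, y):
    return x + y

def test_add_integers():
    assert add(2, 3) == 5

def test_add_strings():
    assert add("py", "test") == "pytest"
```

Running the pytest command in the directory containing the file would discover and run both test functions.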
Arrays
A numpy array is a grid of values, all of the same type, and
is indexed by a tuple of nonnegative integers.
The number of dimensions is the rank of the array;
the shape of an array is a tuple of integers giving the size of the array along each
dimension.
We can initialize numpy arrays from nested Python lists, and access elements using square
brackets:
import numpy as np
arange() will create arrays with regularly incrementing values. Check the docstring for complete
information on the various ways it can be used. A few examples will be given here:
>>> np.arange(10)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.arange(2, 3, 0.1)
array([ 2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9])
linspace() will create arrays with a specified number of elements, and spaced equally
between the specified beginning and end values. For example:
The advantage of this creation function is that one can guarantee the number of elements
and the starting and end point, which arange() generally will not do for arbitrary start, stop,
and step values.
indices() will create a set of arrays (stacked as a one-higher dimensioned array), one per
dimension with each representing variation in that dimension. An example illustrates much
better than a verbal description:
>>> np.indices((3,3))
array([[[0, 0, 0], [1, 1, 1], [2, 2, 2]], [[0, 1, 2], [0, 1, 2], [0, 1, 2]]])
This is particularly useful for evaluating functions of multiple dimensions on a regular grid.
Array indexing
Slicing:
Similar to Python lists, numpy arrays can be sliced. Since arrays may be multidimensional, you
must specify a slice for each dimension of the array:
import numpy as np
# When using integer array indexing, you can reuse the same
# element from the source array:
print(a[[0, 0], [1, 1]]) # Prints "[2 2]"
One useful trick with integer array indexing is selecting or mutating one element
from each row of a matrix:
import numpy as np
# Create a new array from which we will select elements
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
Array math
Basic mathematical functions operate elementwise on arrays, and are available both as
operator overloads and as functions in the numpy module:
import numpy as np
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)
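Using the same x and y, the pairing between operators and module functions can be sketched like this:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)

# Each operator has an equivalent numpy function
print(x + y)         # same as np.add(x, y)
print(x - y)         # same as np.subtract(x, y)
print(x * y)         # elementwise product, same as np.multiply(x, y)
print(x / y)         # same as np.divide(x, y)
print(np.sqrt(x))    # elementwise square root
```

Note that * multiplies element by element; it is not a matrix product.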
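import numpy as np
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])
v = np.array([9,10])
w = np.array([11, 12])
# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))
# Matrix / vector product; both produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))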
Numpy provides many useful functions for performing computations on arrays; one of
the most useful is sum :
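import numpy as np
x = np.array([[1,2],[3,4]])
print(np.sum(x)) # Compute sum of all elements; prints "10"
print(np.sum(x, axis=0)) # Compute sum of each column; prints "[4 6]"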
To transpose a matrix, use the T attribute of an array object:
import numpy as np
x = np.array([[1,2], [3,4]])
print(x) # Prints "[[1 2]
# [3 4]]"
print(x.T) # Prints "[[1 3]
# [2 4]]"
Broadcasting
Broadcasting allows numpy to work with arrays of different shapes when performing arithmetic operations.
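As a baseline, suppose we want to add a constant vector v to each row of a matrix x. A minimal sketch using an explicit Python loop follows. The values of x and v are illustrative assumptions.

```python
import numpy as np

# Illustrative values: add the vector v to each row of the matrix x
x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x)   # output array with the same shape and dtype as x

# Explicit Python loop over the rows
for i in range(x.shape[0]):
    y[i, :] = x[i, :] + v

print(y)
# [[ 2  2  4]
#  [ 5  5  7]
#  [ 8  8 10]
#  [11 11 13]]
```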
This works; however when the matrix x is very large, computing an explicit loop in Python
could be slow. Note that adding the vector v to each row of the matrix x is equivalent to
forming a matrix vv by stacking multiple copies of v vertically, then performing elementwise
summation of x and vv . We could implement this approach like this:
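A minimal sketch of the stacking approach, with illustrative values for x and v. It uses np.tile to build vv and then compares the result with plain broadcasting.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
v = np.array([1, 0, 1])

# Stack 4 copies of v on top of each other; vv has shape (4, 3)
vv = np.tile(v, (4, 1))
y_stacked = x + vv          # elementwise sum of x and vv

# Broadcasting gives the same result without building vv
y_broadcast = x + v

print(np.array_equal(y_stacked, y_broadcast))   # True
```

Broadcasting avoids materializing the stacked copies, which matters when x has many rows.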
Series
Example Series of integers: 10 23 56 17 52 61 73 90 26 72
Key Points
Homogeneous data
Size Immutable
Values of Data Mutable
A pandas Series can be created using the following constructor −
pandas.Series( data, index, dtype, copy)
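A short sketch of how the four constructor parameters are used. The values and index labels are chosen for illustration only.

```python
import numpy as np
import pandas as pd

# data: the values; index: row labels; dtype: element type;
# copy: whether to copy the input data
values = np.array([10, 23, 56, 17])
s = pd.Series(data=values, index=['a', 'b', 'c', 'd'], dtype='float64', copy=True)

print(s)
print(s['b'])      # label-based access -> 23.0

# The Series holds its own copy, so changing it leaves the source array untouched
s['a'] = 0.0
print(values[0])   # 10
```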
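Empty Series
import pandas as pd
# An explicit dtype avoids the default-dtype warning that some pandas versions emit
s = pd.Series(dtype='float64')
print(s)
Example 1
Output for a Series created from the array ['a','b','c','d'] with the default index: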
0 a
1 b
2 c
3 d
dtype: object
Example 2
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data,index=[100,101,102,103])
print(s)
Its output is as follows −
100 a
101 b
102 c
103 d
dtype: object
DataFrame
A DataFrame is a two-dimensional, tabular data structure with heterogeneous data: each column can hold a different data type.
Key Points
Heterogeneous data
Size Mutable
Data Mutable
Labeled axes (rows and columns)
Can Perform Arithmetic operations on rows and columns
pandas.DataFrame
pandas.DataFrame( data, index, columns, dtype, copy)
Create DataFrame
A pandas DataFrame can be created using various inputs like −
Lists
dict
Series
Numpy ndarrays
Another DataFrame
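The sketch below builds a small DataFrame from each of these input types. All names and values are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# From a list of lists, with explicit column labels
df_list = pd.DataFrame([['Alex', 10], ['Bob', 12]], columns=['Name', 'Age'])

# From a dict of equal-length lists; the keys become column labels
df_dict = pd.DataFrame({'Name': ['Alex', 'Bob'], 'Age': [10, 12]})

# From a dict of Series; the rows are aligned on the Series index
df_series = pd.DataFrame({'one': pd.Series([1, 2], index=['a', 'b']),
                          'two': pd.Series([3, 4], index=['a', 'b'])})

# From a numpy ndarray
df_array = pd.DataFrame(np.arange(6).reshape(2, 3), columns=['x', 'y', 'z'])

# From another DataFrame (copy=True makes the new frame independent)
df_copy = pd.DataFrame(df_dict, copy=True)

for frame in (df_list, df_dict, df_series, df_array, df_copy):
    print(frame.shape)
```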
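import pandas as pd
data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10}]
# With two column indices, values same as dictionary keys
df1 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b'])
# With two column indices, one of which has a different name
df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])
print(df1)
print(df2)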
Its output is as follows −
#df1 output
        a   b
first   1   2
second  5  10
#df2 output
        a  b1
first   1 NaN
second  5 NaN
Note − Observe that df2 is created with a column label (b1) that is not a dictionary key,
so that column is filled with NaN. df1 is created with column labels that match the
dictionary keys, so no NaN values appear.
Panel
Panel is a three-dimensional data structure with heterogeneous data. A panel is hard
to represent graphically, but it can be thought of as a container of DataFrames.
Note − Panel was deprecated in pandas 0.20 and removed in pandas 0.25, so it is not
available in current releases.
Key Points
Heterogeneous data
Size Mutable
Data Mutable
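Because pd.Panel is not available in current pandas releases (it was removed in 0.25), the "container of DataFrames" idea can only be approximated there. One possible sketch keeps the DataFrames in a dict and stacks them with pd.concat. The item names and data below are hypothetical.

```python
import numpy as np
import pandas as pd

# Two DataFrames with the same row and column labels (illustrative data)
items = {
    'Item1': pd.DataFrame(np.zeros((2, 3)), columns=['a', 'b', 'c']),
    'Item2': pd.DataFrame(np.ones((2, 3)), columns=['a', 'b', 'c']),
}

# Stack them into one DataFrame; the dict keys become the outer index level
stacked = pd.concat(items)

print(stacked)
print(stacked.loc['Item2'])   # recover one of the contained DataFrames
```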
Pandas head()
The head() method gives a quick preview of a DataFrame. It returns the column
headers and a specified number of rows from the beginning.
import pandas as pd
# create a dataframe
data = {'Name': ['John', 'Alice', 'Bob', 'Emma', 'Mike', 'Sarah', 'David', 'Linda',
'Tom', 'Emily'],
'Age': [25, 30, 35, 28, 32, 27, 40, 33, 29, 31],
'City': ['New York', 'Paris', 'London', 'Sydney', 'Tokyo', 'Berlin', 'Rome',
'Madrid', 'Toronto', 'Moscow']}
df = pd.DataFrame(data)
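# display the first three rows
print('First Three Rows:')
print(df.head(3))
print()
# display the first five rows (the default when no argument is passed)
print('First Five Rows:')
print(df.head())
Output
First Three Rows:
    Name  Age      City
0   John   25  New York
1  Alice   30     Paris
2    Bob   35    London

First Five Rows:
    Name  Age      City
0   John   25  New York
1  Alice   30     Paris
2    Bob   35    London
3   Emma   28    Sydney
4   Mike   32     Tokyo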
In this example, we displayed rows of the df DataFrame starting from the top using head().
Notice that the first five rows are returned by default when no argument is passed to the
head() method.
Pandas tail()
The tail() method is similar to head() but it returns data starting from the end
of the DataFrame.
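A minimal sketch of tail() on a small DataFrame. The data is made up for illustration.

```python
import pandas as pd

# Small illustrative DataFrame
df = pd.DataFrame({'Name': ['John', 'Alice', 'Bob', 'Emma', 'Mike', 'Sarah'],
                   'Age': [25, 30, 35, 28, 32, 27]})

# Last three rows; the original row labels (3, 4, 5) are kept
last_three = df.tail(3)
print(last_three)

# With no argument, tail() returns the last five rows
print(df.tail())
```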