Python Unit-2 Notes

UNIT-2
STRINGS:
A string can be defined as a sequence of characters enclosed within a pair of
single quotes ' ', double quotes " " or triple quotes ''' '''.
Any symbol which is used while writing a program is called a
character. For example, the English alphabet has 26 letters, and each letter is a character.
Computers do not deal with characters; they deal with numbers
(binary). Even though you may see characters on your screen,
internally they are stored and manipulated as combinations of 0s and 1s.
The conversion of a character to a number is called encoding, and
the reverse process is called decoding. ASCII and Unicode are some of the
popular encodings used.
In Python, a string is a sequence of Unicode characters. Unicode was
introduced to include every character in all languages and bring
uniformity in encoding.
Strings are immutable. This means that once defined, they cannot
be changed.
About Unicode
Today’s programs need to be able to handle a wide variety of
characters. Applications are often internationalized to display
messages and output in a variety of user-selectable languages; the
same program might need to output an error message in English,
French, Japanese, Hebrew, or Russian. Web content can be written in
any of these languages and can also include a variety of emoji
symbols. Python’s string type uses the Unicode Standard for
representing characters, which lets Python programs work with all
these different possible characters.
Unicode is a specification that aims to list every character used by
human languages and give each character its own unique code. The
Unicode specifications are continually revised and updated to add
new languages and symbols.
A character is the smallest possible component of a text. ‘A’, ‘B’, ‘C’,
etc., are all different characters. So are ‘È’ and ‘Í’. Characters vary
depending on the language or context you’re talking about. For
example, there’s a character for “Roman Numeral One”, ‘Ⅰ’, that’s
separate from the uppercase letter ‘I’. They’ll usually look the same,
but these are two different characters that have different meanings.
The Unicode standard describes how characters are represented
by code points. A code point value is an integer in the range 0 to
0x10FFFF (about 1.1 million values, the actual number assigned is
less than that).
A Unicode string is a sequence of code points, which
are numbers from 0 through 0x10FFFF (1,114,111
decimal). This sequence of code points needs to be represented in
memory as a set of code units, and code units are then mapped to 8-bit
bytes.
UTF-8 is one of the most commonly used encodings, and Python
often defaults to using it. UTF stands for “Unicode Transformation
Format”, and the ‘8’ means that 8-bit values are used in the
encoding. (There are also UTF-16 and UTF-32 encodings, but they are
less frequently used than UTF-8.)
UTF-8 is recommended practice for encoding data to be exchanged
between systems.
UTF-8 is backward compatible with ASCII: every ASCII character is encoded as the same single byte.
In Python 3, all strings are Unicode strings.
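The relationship between code points and UTF-8 bytes can be seen with the str.encode() and bytes.decode() methods. The following is a minimal sketch; the sample characters and the name byte_lengths are chosen only for illustration and are not part of the notes above.

```python
# Encode sample characters to UTF-8 and record how many bytes each one needs.
# The samples are: A, È, Ⅰ (Roman Numeral One) and an emoji.
samples = ['A', '\u00c8', '\u2160', '\U0001f600']

byte_lengths = {}
for ch in samples:
    encoded = ch.encode('utf-8')            # str -> bytes (encoding)
    assert encoded.decode('utf-8') == ch    # bytes -> str (decoding) gives the character back
    byte_lengths[hex(ord(ch))] = len(encoded)

print(byte_lengths)  # {'0x41': 1, '0xc8': 2, '0x2160': 3, '0x1f600': 4}

# ASCII text produces exactly the same bytes in ASCII and in UTF-8.
print('REVA'.encode('ascii') == 'REVA'.encode('utf-8'))  # True
```

Characters with larger code points need more bytes in UTF-8, while plain ASCII characters still take one byte each.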
chr() function
The chr() function returns a character (a string of length one) for an integer
that represents the Unicode code point of the character.
The syntax of chr() is:
chr(i)
chr() Parameters
chr() takes a single parameter, an integer i.
The valid range of the integer is from 0 through 1,114,111 (0x10FFFF).
Return Value from chr()
chr() returns:
• a character (a string) whose Unicode code point is the integer i
If the integer i is outside the range, ValueError is raised.
Example 1: How chr() works?
print(chr(97))
print(chr(65))
print(chr(1200))
Output
a
A
Ұ
Example 2: Integer passed to chr() is out of the range
print(chr(-1))
Output
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: chr() arg not in range(0x110000)
When you run the program, ValueError is raised
because the argument passed to chr() is outside the valid
range.
The reverse operation of chr() is performed
by the ord() function.
ord() function
The ord() function returns an integer representing the Unicode
code point of a character.
The syntax of ord() is:
ord(ch)
ord() Parameters
The ord() function takes a single parameter:
• ch - a Unicode character (a string of length one)
Return value from ord()
The ord() function returns an integer representing the Unicode
code point of the given character.
Example: How ord() works in Python?
print(ord('5')) # 53
print(ord('A')) # 65
print(ord('$')) # 36
Output
53
65
36
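Since ord() and chr() are inverses of each other, a string can be turned into a list of code points and rebuilt again. The sketch below illustrates this; the helper names code_points, from_code_points and safe_chr are hypothetical and exist only for this example.

```python
def code_points(text):
    """Return the Unicode code point of every character in text."""
    return [ord(ch) for ch in text]

def from_code_points(points):
    """Rebuild a string from a list of code points."""
    return ''.join(chr(p) for p in points)

def safe_chr(i):
    """Return chr(i), or None when i is outside the valid range."""
    try:
        return chr(i)
    except ValueError:
        return None

points = code_points('REVA')
print(points)                    # [82, 69, 86, 65]
print(from_code_points(points))  # REVA
print(safe_chr(-1))              # None
```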
Creating strings in Python

Strings can be created by enclosing characters inside single quotes
or double quotes. Triple quotes can also be used in Python, but they are
generally used to represent multiline strings and docstrings.
# defining strings in Python

# all of the following are equivalent
my_string = 'Hello'
print(my_string)
my_string = "Hello"
print(my_string)
my_string = '''Hello'''
print(my_string)
# triple quotes string can extend multiple lines
my_string = '''Hello, welcome to
the world of Python'''
print(my_string)
Reading and Converting strings

• We prefer to read data in as strings and then parse and
convert the data as we need.
• This gives us more control over error situations and bad user
input.
• Numbers read as input must be converted from strings.
Let us consider an example:
n=input("Enter the number")
In the above example, we have not specified any data type. The input()
function always returns the entered data as a string, so n is a string even
when the user types digits. To use it as a number, convert it explicitly,
for example with int(n) or float(n).
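Reading the data as a string first and converting it afterwards makes it possible to handle bad input gracefully. A minimal sketch, assuming a hypothetical helper named to_int (not part of the notes) and using fixed sample strings in place of input() so that it runs unattended:

```python
def to_int(text):
    """Return text converted to int, or None if it is not a valid integer."""
    try:
        return int(text.strip())
    except ValueError:
        return None

# Sample strings stand in for what a user might type at an input() prompt.
for raw in ['42', ' 7 ', 'abc', '3.5']:
    print(repr(raw), '->', to_int(raw))
```

In a real program the value passed to to_int() would come from input(), and a None result could be used to ask the user to try again.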
Representation or Accessing the string contents
We can get at any single character in a string using an index specified
in square brackets.
The index value must be an integer and starts at zero.
The index value can be an expression that is computed.
Example:
Let us take the string as
str1='REVA' # Creating a string
Representation of string:
R E V A
0 1 2 3
Write a python program to read, print a string and print a
character present at specific index
str1=str(input("Enter a string"))
print("String entered=", str1)
print("Character present at index 1 is", str1[1])
Note that index 1 refers to the second character, because indexing starts at zero.
You will get a python error (IndexError) if you attempt to index beyond
the end of a string.
So be careful when constructing index values and slices.
There is a built-in function len() that gives us the length of a string.
Example:
str1='REVA'
print(len(str1))
Output:
4
Slicing a string
Slicing in Python is a feature that enables accessing parts of
sequences like strings, tuples, and lists. You can also use them to
modify or delete the items of mutable sequences such as lists.
• We can also look at any continuous section of a string using a
colon operator, written as str1[start:end].
• The second number is one beyond the end of the slice: "up to
but not including".
• If the second number is beyond the end of the string, it stops at
the end.
• If we leave off the first number or the last number of the slice,
it is assumed to be the beginning or end of the string
respectively.
• The indexes can also be negative numbers, which count backwards from the end of the string (-1 is the last character).
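The slicing rules listed above can be checked with a short example. This is only an illustrative sketch; the variable names are chosen for this example.

```python
str1 = 'REVA UNIVERSITY'

first_word = str1[0:4]      # indexes 0..3, index 4 is not included
from_five = str1[5:]        # last number left off: runs to the end
up_to_four = str1[:4]       # first number left off: starts at the beginning
past_end = str1[5:100]      # end beyond the string: stops at the end
last_three = str1[-3:]      # negative index counts from the end
middle = str1[-10:-6]       # same as str1[5:9]

print(first_word, from_five, up_to_four, past_end, last_three, middle)
# REVA UNIVERSITY REVA UNIVERSITY ITY UNIV
```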
Write a python program to extract the substrings 'Hi',
'Welcome', 'REVA' and 'UNIVERSITY' from the following string
str1='Hi, Welcome to REVA UNIVERSITY'
Program:
str1='Hi, Welcome to REVA UNIVERSITY'
print("Sub string 1 is", str1[0:2])
print("Sub string 2 is", str1[4:11])
print("Sub string 3 is", str1[15:19])
print("Sub string 4 is", str1[20:])
A substring "substr" between index1 and index2 is to be
extracted from the given input string "str1", which is read
using input(), and the substring "substr" is to be displayed using a user
defined function.
str1=str(input("Enter the string"))
def slicing():
    m=int(input("Enter the starting index value or index1"))
    n=int(input("Enter the ending index value or index2"))
    substr=str1[m:n]
    print("String extracted between index1 and index2 is",substr)
slicing()
A string containing multiple words is to be read from the user
one at a time and
i) converted to uppercase
str1=str(input('Enter the string'))
print('String in Upper case is',str1.upper())
ii) split into words using space as the separation
character.
str1=str(input('Enter the string'))
print("string with space as the separation character is",str1.split(' '))
Changing the string contents

In Python, strings are immutable: once created, we cannot alter the
contents of a string.
Example:
my_string = 'REVA'
my_string[3] = 'a'
TypeError: 'str' object does not support item assignment
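Deleting the content of a string
Because strings are immutable, we cannot delete individual characters from a string.
Example:
my_string = 'REVA'
del my_string[1]
TypeError: 'str' object doesn't support item deletion
We can, however, delete the whole string variable using del. Accessing the variable afterwards raises an error.
Example:
del my_string
my_string
NameError: name 'my_string' is not defined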
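Although a string cannot be changed in place, a new string with the desired contents can be built from slices of the old one. A minimal sketch (the variable names changed and removed are chosen only for this example):

```python
my_string = 'REVA'

# "Change" the character at index 3 by building a new string.
changed = my_string[:3] + 'a'

# "Delete" the character at index 1 by joining the parts around it.
removed = my_string[:1] + my_string[2:]

print(changed)    # REVa
print(removed)    # RVA
print(my_string)  # REVA (the original string is untouched)
```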
Concatenation of two or more strings

We can concatenate 2 strings by using the '+' operator.
Write a python program to read and concatenate
two strings
s1=str(input("Enter first string"))
s2=str(input("Enter second string"))
s3=s1+s2
print("Concatenated string is",s3)
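Concatenation using parentheses
Adjacent string literals are joined automatically, so we can use round brackets (parentheses)
to spread 2 or more string literals over several lines and have them concatenated.
Example:
s=('Hi'
   ' REVA'
   ' UNIVERSITY')
print(s)
Output:
Hi REVA UNIVERSITY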
Iterating through string
Example:
str1='SHIVA KUMAR'
for i in str1:
    print(i)
Using in as an operator
The in keyword can also be used to check whether one string
is "in" another string.
The in expression is a logical expression that returns True or False
and can be used in an if statement.
Example:
str1='REVA'
print('E' in str1)    # True
print('e' in str1)    # False, the check is case-sensitive
print('REV' in str1)  # True
str2='ECE'
if 'EC' in str2:
    print('Found the String')
else:
    print('Not Found')
String Comparison
To compare two strings, Python compares them character by
character from left to right, using the Unicode code points of
the characters. If all the characters match and the strings have
the same length, the strings are equal. At the first mismatching
position, the string whose character has the larger code point
is the greater string. If one string is a prefix of the other,
the shorter string is the smaller one.
Write a python program to compare two strings
str1='CSE'
str2='ECE'
if str1 > str2:
    print('str1 is greater than str2')
elif str1 < str2:
    print('str2 is greater than str1')
else:
    print('str1 is equal to str2')
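The comparison rule described above can be written out by hand with ord(). The function compare below is a hypothetical helper written only to illustrate the rule; in practice the built-in operators <, > and == already do this.

```python
def compare(s1, s2):
    """Return -1, 0 or 1 when s1 is smaller than, equal to or greater than s2."""
    for c1, c2 in zip(s1, s2):
        if c1 != c2:
            # The first mismatching pair decides the result.
            return -1 if ord(c1) < ord(c2) else 1
    # All shared positions are equal, so the shorter string is the smaller one.
    if len(s1) == len(s2):
        return 0
    return -1 if len(s1) < len(s2) else 1

print(compare('CSE', 'ECE'))   # -1, because ord('C') is 67 and ord('E') is 69
print(compare('REVA', 'REV'))  # 1, 'REV' is a prefix of 'REVA'
print(compare('ECE', 'ECE'))   # 0
```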
String in-built or built-in Functions/Methods:
capitalize()
The function converts the first character of the string
to uppercase and the remaining characters to lowercase.
Example:
str1='reva University'
str1.capitalize()
Output:
'Reva university'
lower()
The function is used to convert the entire string into lower
case.
Example:
str1='ELECTRONICS'
str1.lower()
Output:
'electronics'
upper()
The function is used to convert the entire string into Upper
case.
Example:
str1='electronics'
str1.upper()
Output:
'ELECTRONICS'
center()
The center() method takes two arguments:
• width - length of the returned string including the padding characters
• fillchar (optional) - padding character
The fillchar argument is optional. If it's not provided, a space is
taken as the default argument.
The center() method returns a string padded with the specified
fillchar. It doesn't modify the original string.
Example:
str1='REVA'
str1.center(10)
Output:
'   REVA   '
Example:
str1='REVA'
str1.center(10,'*')
Output:
'***REVA***'
count()
The function is used to count the number of non-overlapping occurrences of a
character or substring.
Example:
str1='REVA UNIVERSITY'
str1.count('E')
Output:
2
Example:
str1.count('REVA')
Output:
1
find()
The function returns an integer value:
• If the substring exists inside the string, it returns the index of
the first occurrence of the substring.
• If the substring doesn't exist inside the string, it returns -1.
Example:
str1='REVA UNIVERSITY'
str1.find('A')
Output:
3
Example:
str1.find('UNI')
Output:
5
rfind()
The function returns the index of the last occurrence of the substring, or -1 if it is not found.
Example:
str1.rfind('E')
Output:
9
strip()
The function returns a copy of the string with both
the leading and the trailing whitespace removed. (If a string of characters is passed as an argument, those characters are removed instead.)
Example:
str1=' REVA UNIVERSITY '
str1.strip()
Output:
'REVA UNIVERSITY'
lstrip()
The function returns a copy of the string with leading
whitespace removed.
Example:
str1.lstrip()
Output:
'REVA UNIVERSITY '
rstrip()
The function returns a copy of the string with trailing
whitespace removed.
Example:
str1.rstrip()
Output:
' REVA UNIVERSITY'
replace()
The replace() method returns a copy of the string where all
occurrences of a substring are replaced with another substring.
Example:
str1='reva UNIVERSITY'
str1.replace('reva','REVA')
Output:
'REVA UNIVERSITY'
title()
The title() method returns a string with first letter of each
word capitalized; a title cased string.
Example:
str1='reva university'
str1.title()
Output:
'Reva University'
split()
The split() method breaks up a string at the specified
separator and returns a list of strings.
If the separator is not specified, any run of whitespace (space,
newline etc.) acts as the separator.
Example:
str1='REVA UNIVERSITY'
str1.split()
Output:
['REVA', 'UNIVERSITY']
Example:
str1='COMPUTER SCIENCE'
str1.split('E')
Output:
['COMPUT', 'R SCI', 'NC', '']
isalpha()
The isalpha() method returns True if all characters in the
string are letters. If not (for example, if the string contains a space or a digit), it returns False.
Example:
str1='REVA'
str1.isalpha()
Output:
True
isalnum()
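The isalnum() method returns True if all characters in the
string are alphanumeric (either letters or digits). If not,
it returns False.
Example:
str1='CS 1 ECE 2'
str1.isalnum()
Output:
False
The result is False because the string contains spaces, which are not alphanumeric. For 'CS1ECE2' the result is True.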
islower()
The islower() method returns True if all letters in a string
are lowercase. If the string contains at least one
uppercase letter, it returns False.
Example:
str1='cse'
str1.islower()
Output:
True
isupper()
The isupper() method returns True if all letters in a string
are uppercase. If the string contains at least one
lowercase letter, it returns False.
Example:
str1='CSE'
str1.isupper()
Output:
True
isdigit()
The isdigit() method returns True if all characters in a string
are digits. If not, it returns False.
Example:
str1='2021'
str1.isdigit()
Output:
True
startswith()
The startswith() method returns True if a string starts with
the specified prefix (string). If not, it returns False.
Example:
str1='REVA UNIVERSITY'
str1.startswith('R')
Output:
True
endswith()
The endswith() method returns True if a string ends with the
specified suffix (string). If not, it returns False.
Example:
str1.endswith('Y')
Output:
True
casefold()
The casefold() method is an aggressive lower() method which
converts strings to case folded strings for caseless matching.
The casefold() method removes all case distinctions present
in a string. It is used for caseless matching, i.e. ignores cases
when comparing.
Example:
str1='REVA'
str1.casefold()
Output:
'reva'
index()
The index() method returns the index of a substring inside
the string (if found). If the substring is not found, it raises a
ValueError exception.
The index() method is similar to the find() method for strings. The
only difference is that find() returns -1 if the
substring is not found, whereas index() raises an exception.
Example:
str1='REVA'
str1.index('R')
Output:
0
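The difference between find() and index() matters when the substring may be missing. A minimal sketch, using a hypothetical helper named locate that wraps index() in try/except so that it behaves like find():

```python
str1 = 'REVA UNIVERSITY'

def locate(text, sub):
    """Return the index of sub in text using index(), or -1 if it is absent."""
    try:
        return text.index(sub)
    except ValueError:
        # index() raises ValueError where find() would return -1
        return -1

print(str1.find('UNI'), locate(str1, 'UNI'))  # 5 5
print(str1.find('xyz'), locate(str1, 'xyz'))  # -1 -1
```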
Regular Expressions
A Regular Expression (RegEx) is a sequence of characters that defines
a search pattern. For example,
^a...s$
The above code defines a RegEx pattern. The pattern is: any five
letter string starting with a and ending with s.
A pattern defined using RegEx can be used to match against a string.
Expression    String       Matched?
^a...s$       abs          No match
              alias        Match
              abyss        Match
              Alias        No match
              An abacus    No match
Specify Pattern Using RegEx

To specify regular expressions, metacharacters are used. In the
above example, ^ and $ are metacharacters.
MetaCharacters
Metacharacters are characters that are interpreted in a special way
by a RegEx engine. Here's a list of metacharacters:
[] . ^ $ * + ? {} () \ |
[] - Square brackets
Square brackets specify a set of characters you wish to match.
Expression    String       Matched?
[abc]         a            1 match
              ac           2 matches
              Hey Jude     No match
              abc de ca    5 matches
Here, [abc] will match if the string you are trying to match contains
any of the characters a, b or c.
You can also specify a range of characters using - inside square
brackets.
• [a-e] is the same as [abcde].
• [1-4] is the same as [1234].
• [0-39] is the same as [01239].
You can complement (invert) the character set by using the
caret ^ symbol at the start of a square bracket.
• [^abc] means any character except a or b or c.
• [^0-9] means any non-digit character.
. - Period
A period matches any single character (except newline '\n').
Expression    String    Matched?
..            a         No match
              ac        1 match
              acd       1 match
              acde      2 matches (contains 4 characters)
^ - Caret
The caret symbol ^ is used to check if a string starts with a certain
character.
Expression    String    Matched?
^a            a         1 match
              abc       1 match
              bac       No match
^ab           abc       1 match
              acb       No match (starts with a but not followed by b)
$ - Dollar
The dollar symbol $ is used to check if a string ends with a certain
character.
Expression    String     Matched?
a$            a          1 match
              formula    1 match
              cab        No match
* - Star
The star symbol * matches zero or more occurrences of the pattern
to its left.
Expression    String    Matched?
ma*n          mn        1 match
              man       1 match
              maaan     1 match
              main      No match (a is not followed by n)
              woman     1 match
+ - Plus
The plus symbol + matches one or more occurrences of the pattern
to its left.
Expression    String    Matched?
ma+n          mn        No match (no a character)
              man       1 match
              maaan     1 match
              main      No match (a is not followed by n)
              woman     1 match
? - Question Mark
The question mark symbol ? matches zero or one occurrence of the
pattern to its left.
Expression    String    Matched?
ma?n          mn        1 match
              man       1 match
              maaan     No match (more than one a character)
              main      No match (a is not followed by n)
              woman     1 match
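The three quantifier tables above can be reproduced with re.search(), which is described later in these notes. The following is only a sketch; the list strings and the dictionary results are names chosen for this example.

```python
import re

strings = ['mn', 'man', 'maaan', 'main', 'woman']

# For each pattern, collect the strings in which the pattern is found.
results = {}
for pattern in ['ma*n', 'ma+n', 'ma?n']:
    results[pattern] = [s for s in strings if re.search(pattern, s)]
    print(pattern, results[pattern])

# ma*n ['mn', 'man', 'maaan', 'woman']
# ma+n ['man', 'maaan', 'woman']
# ma?n ['mn', 'man', 'woman']
```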
{} - Braces
Consider this code: {n,m}. This means at least n, and at
most m repetitions of the pattern to its left.
Expression    String         Matched?
a{2,3}        abc dat        No match
              abc daat       1 match (at daat)
              aabc daaat     2 matches (at aabc and daaat)
              aabc daaaat    2 matches (at aabc and daaaat)
Let's try one more example. The RegEx [0-9]{2,4} (written without a space inside the braces) matches at least 2
digits but not more than 4 digits.
Expression    String           Matched?
[0-9]{2,4}    ab123csde        1 match (at 123)
              12 and 345673    3 matches (12, 3456, 73)
              1 and 2          No match
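The {n,m} examples above can be verified with re.findall(), which is described later in these notes and returns every non-overlapping match as a list. A minimal sketch (the variable names are chosen only for this example):

```python
import re

# a{2,3}: two or three consecutive 'a' characters
a_runs = re.findall('a{2,3}', 'aabc daaaat')

# [0-9]{2,4}: runs of two to four digits, taken greedily from left to right
digit_runs = re.findall('[0-9]{2,4}', '12 and 345673')
no_runs = re.findall('[0-9]{2,4}', '1 and 2')

print(a_runs)      # ['aa', 'aaa']
print(digit_runs)  # ['12', '3456', '73']
print(no_runs)     # []
```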
| - Alternation
Vertical bar | is used for alternation (or operator).
Expression    String    Matched?
a|b           cde       No match
              ade       1 match (at a)
              acdbea    3 matches (at a, b and a)
Here, a|b matches any string that contains either a or b.
() - Group
Parentheses () are used to group sub-patterns. For example, (a|b|c)xz
matches any string that contains either a or b or c followed by xz.
Expression    String       Matched?
(a|b|c)xz     ab xz        No match
              abxz         1 match (at bxz)
              axz cabxz    2 matches (at axz and bxz)

\ - Backslash
Backslash \ is used to escape various characters including all
metacharacters. For example,
\$a matches if a string contains $ followed by a. Here, $ is not
interpreted by a RegEx engine in a special way.
If you are unsure if a character has special meaning or not, you can
put \ in front of it. This makes sure the character is not treated in a
special way.
Special Sequences
Special sequences make commonly used patterns easier to write.
Here's a list of special sequences:
\A - Matches if the specified characters are at the start of a string.
Expression    String        Matched?
\Athe         the sun       Match
              In the sun    No match
\b - Matches if the specified characters are at the beginning or end
of a word.
Expression    String           Matched?
\bfoo         football         Match
              a football       Match
              afootball        No match
foo\b         the foo          Match
              the afoo test    Match
              the afootest     No match
\B - Opposite of \b. Matches if the specified characters are not at
the beginning or end of a word.
Expression    String           Matched?
\Bfoo         football         No match
              a football       No match
              afootball        Match
foo\B         the foo          No match
              the afoo test    No match
              the afootest     Match

\d - Matches any decimal digit. Equivalent to [0-9]
Expression    String      Matched?
\d            12abc3      3 matches (at 1, 2 and 3)
              Python      No match
\D - Matches any non-digit character. Equivalent to [^0-9]
Expression    String      Matched?
\D            1ab34"50    3 matches (at a, b and ")
              1345        No match
\s - Matches where a string contains any whitespace character.
Equivalent to [ \t\n\r\f\v].
Expression    String          Matched?
\s            Python RegEx    1 match
              PythonRegEx     No match
\S - Matches where a string contains any non-whitespace character.
Equivalent to [^ \t\n\r\f\v].
Expression    String                 Matched?
\S            a b                    2 matches (at a and b)
              (only whitespace)      No match
\w - Matches any alphanumeric character (digits and letters).
Equivalent to [a-zA-Z0-9_]. By the way, underscore _ is also
considered an alphanumeric character.
Expression    String      Matched?
\w            12&": ;c    3 matches (at 1, 2 and c)
              %"> !       No match
\W - Matches any non-alphanumeric character. Equivalent to [^a-zA-Z0-9_]
Expression    String    Matched?
\W            1a2%c     1 match (at %)
              Python    No match
\Z - Matches if the specified characters are at the end of a string.
Expression    String                       Matched?
Python\Z      I like Python                1 match
              I like Python Programming    No match
              Python is fun.               No match
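A few rows of the special-sequence tables above can be checked in Python with the re module, which is introduced in the next section. The sketch below uses raw strings (the r prefix) so that the backslashes reach the RegEx engine unchanged; the variable names are chosen only for this example.

```python
import re

digits = re.findall(r'\d', '12abc3')          # every decimal digit
non_digits = re.findall(r'\D', '1ab34"50')    # every non-digit character
word_chars = re.findall(r'\w', '12&": ;c')    # letters, digits and underscore

# \b needs a word boundary just before 'foo'
starts_word = bool(re.search(r'\bfoo', 'a football'))
inside_word = bool(re.search(r'\bfoo', 'afootball'))

# \Z only matches at the very end of the string
at_end = bool(re.search(r'Python\Z', 'I like Python'))
not_at_end = bool(re.search(r'Python\Z', 'Python is fun.'))

print(digits, non_digits, word_chars)
print(starts_word, inside_word, at_end, not_at_end)  # True False True False
```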
Now that we understand the basics of RegEx, let's discuss how to use
RegEx in your Python code.
Python RegEx
Python has a module named re to work with regular expressions. To
use it, we need to import the module.
import re
The module defines several functions and constants to work with
RegEx.
re.search()
The re.search() method takes two arguments: a pattern and a string.
The method looks for the first location where the RegEx pattern
produces a match with the string.
If the search is successful, re.search() returns a match object; if not,
it returns None.
Syntax of the function:
s = re.search(pattern, str)
Write a python program to perform the searching process
or pattern matching using search() function.
import re
string = "Python is fun"
s = re.search('Python', string)
if s:
    print("pattern found inside the string")
else:
    print("pattern not found")
Here, s contains a match object.

s.start(), s.end(), s.span() and s.group()
The start() function returns the index of the start of the matched
substring. Similarly, end() returns the end index of the matched
substring. The span() function returns a tuple containing start and
end index of the matched part. The group() function returns the matched substring itself.
>>> s.start()
0
>>> s.end()
6
>>> s.span()
(0, 6)
>>> s.group()
'Python'
re.match()
The re.match() method takes two arguments: a pattern and a string.
If the pattern is found at the start of the string, then the method
returns a match object. If not, it returns None.
Write a python program to perform the searching process
or pattern matching using match() function.
import re
pattern = '^a...s$'
test_string = 'abyss'
result = re.match(pattern, test_string)
if result:
    print("Search successful.")
else:
    print("Search unsuccessful.")
Here, we used the re.match() function to match the pattern against the start of
the test_string.
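The difference between re.match() and re.search() shows up when the pattern occurs somewhere other than the start of the string. The sketch below also uses re.fullmatch(), a function of the re module that these notes do not cover, which succeeds only when the whole string matches the pattern.

```python
import re

text = 'I like Python'

at_start = re.match('Python', text)    # None: 'Python' is not at the start
anywhere = re.search('Python', text)   # match object: found at index 7

# match() accepts a match that covers only the beginning of the string,
# fullmatch() requires the pattern to cover the entire string.
prefix_only = re.match('a...s', 'abysses')
whole = re.fullmatch('a...s', 'abysses')

print(at_start)           # None
print(anywhere.span())    # (7, 13)
print(prefix_only.group())  # abyss
print(whole)              # None
```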
re.sub()
The syntax of re.sub() is:
re.sub(pattern, replace, string)
The method returns a string where matched occurrences are
replaced with the content of the replace variable.
If the pattern is not found, re.sub() returns the original string.
You can pass count as a fourth argument to the re.sub() method; passing it by keyword (count=2) is the clearest form. If
omitted, it defaults to 0, which replaces all occurrences.
Example1:
re.sub('^a','b','aaa')
Output:
'baa'
Example2:
s=re.sub('a','b','aaa')
print(s)
Output:
bbb
Example3:
s=re.sub('a','b','aaa',count=2)
print(s)
Output:
bba
re.subn()
re.subn() is similar to re.sub() except that it returns a tuple of 2 items
containing the new string and the number of substitutions made.
Example1:
s=re.subn('a','b','aaa')
print(s)
Output:
('bbb', 3)
re.findall()
The re.findall() method returns a list of strings containing all
matches.
If the pattern is not found, re.findall() returns an empty list.
Syntax:
re.findall(pattern, string)
Example1:
s=re.findall('a','abab')
print(s)
Output:
['a', 'a']
re.split()
The re.split() method splits the string where there is a match and
returns a list of the resulting pieces.
If the pattern is not found, re.split() returns a list containing the
original string.
You can pass the maxsplit argument to the re.split() method. It's the
maximum number of splits that will occur.
By the way, the default value of maxsplit is 0, meaning all possible
splits.
Syntax:
re.split(pattern, string)
Example1:
s=re.split('a','abab')
print(s)
Output:
['', 'b', 'b']
Example2:
s=re.split('a','aababa',maxsplit=3)
print(s)
Output:
['', '', 'b', 'ba']
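Python program to check that a string contains only a
certain set of characters (in this case a-z, A-Z and 0-9).
import re
# ^ and $ make sure the whole string consists of the allowed characters
pattern=r'^[a-zA-Z0-9]+$'
s1='shiva'
s2='sachin1'
s3='virat2'
a=re.search(pattern,s1)
b=re.search(pattern,s2)
c=re.search(pattern,s3)
# a match object is printed for a valid string, None otherwise
print(a)
print(b)
print(c)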
Python program to verify the Phone number using Regular
Expressions.
import re
# optional prefix 0 or 91, then a 10-digit number starting with 6 to 9
pattern=r'^(0|91)?[6-9][0-9]{9}$'
p1='9731822325'
a=re.search(pattern,p1)
if a:
    print("Search is successful")
else:
    print("Search is unsuccessful")
Python program to extract email addresses using regular
expressions in Python (in this case john_123@gmail.com).
import re
pattern=r'(\w)+@(\w)+\.(com)'
email='john_123@gmail.com'
s1=re.search(pattern,email)
if s1:
    # group() returns the matched email address
    print("Search is successful:", s1.group())
else:
    print("Unsuccessful")
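To pull several addresses out of a longer piece of text, re.findall() can be used with a pattern that has no groups, so that each whole address is returned. This is a minimal sketch: the sample text and the second address are made up for illustration, and the pattern is deliberately simple (it only recognises .com addresses whose name contains letters, digits, underscores or dots).

```python
import re

# Made-up sample text containing two .com addresses.
text = 'Contact john_123@gmail.com or mary@yahoo.com today.'

# No capturing groups here, so findall() returns the full matches.
pattern = r'[\w.]+@\w+\.com'
emails = re.findall(pattern, text)

print(emails)  # ['john_123@gmail.com', 'mary@yahoo.com']
```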
CASE STUDY
Street Addresses: In this case study, we will take one street address
as input and try to perform some operations on the input by making
use of library functions.
Example:
str1='100 NORTH MAIN ROAD'
str1.replace('ROAD','RD')
Output:
'100 NORTH MAIN RD'
str1.replace('NORTH','NRTH')
Output:
'100 NRTH MAIN ROAD'
re.sub('ROAD','RD',str1)
Output:
'100 NORTH MAIN RD'
re.sub('NORTH','NRTH',str1)
Output:
'100 NRTH MAIN ROAD'
re.split('A',str1)
Output:
['100 NORTH M', 'IN RO', 'D']
re.findall('O',str1)
Output:
['O', 'O']
re.sub('^1','2',str1)
Output:
'200 NORTH MAIN ROAD'
Roman Numerals
I=1
V=5
X = 10
L = 50
C = 100
D = 500
M = 1000
For writing 4, we write the Roman numeral IV.
For 9, we write IX. For 40, we write XL. For 90, we
write XC. For 400, we write CD. For 900, we write CM.
Let us write the Roman numeral representation for a few numbers.
Ex1:
1940
MCMXL
Ex2:
1946
MCMXLVI
Ex3:
1888
MDCCCLXXXVIII
Checking for thousands:
1000=M
2000=MM
3000=MMM
So the thousands place is written with zero to three M characters, and a possible pattern is three optional Ms.
Example:
pattern = '^M?M?M?$'
re.search(pattern, 'M')
Output:
<re.Match object; span=(0, 1), match='M'>
re.search(pattern, 'MM')
Output:
<re.Match object; span=(0, 2), match='MM'>
re.search(pattern, 'MMM')
Output:
<re.Match object; span=(0, 3), match='MMM'>
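re.search(pattern, 'ML')
re.search(pattern, 'MX')
re.search(pattern, 'MI')
re.search(pattern, 'MMMM')
Each of these four calls returns None, so the interpreter prints nothing: the first three strings contain characters other than M, and the last one has more than three M characters.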
Checking for Hundreds:
100=C
200=CC
300=CCC
400=CD
500=D
600=DC
700=DCC
800=DCCC
900=CM
Example:
pattern = '^M?M?M?(CM|CD|D?C?C?C?)$'
re.search(pattern,'MCM')
Output:
<re.Match object; span=(0, 3), match='MCM'>
re.search(pattern,'MD')
Output:
<re.Match object; span=(0, 2), match='MD'>
re.search(pattern,'MMMCCC')
Output:
<re.Match object; span=(0, 6), match='MMMCCC'>

re.search(pattern,'MCMLXX')
This call returns None, because the pattern does not yet handle the tens and ones places.
Using the {n,m} syntax
The {n,m} syntax checks that the pattern to its left occurs at least
n times and at most m times.
Example:
pattern='^M{0,3}$'
re.search(pattern,'MM')
Output:
<re.Match object; span=(0, 2), match='MM'>

re.search(pattern,'M')
Output:
<re.Match object; span=(0, 1), match='M'>
re.search(pattern,'MMM')
Output:
<re.Match object; span=(0, 3), match='MMM'>
Checking for Tens and Ones:
1=I
2=II
3=III
4=IV
5=V
6=VI
7=VII
8=VIII
9=IX
10=X
20=XX
30=XXX
40=XL
50=L
60=LX
70=LXX
80=LXXX
90=XC
Example:
pattern='^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$'
re.search(pattern,'MDLVI')
Output:
<re.Match object; span=(0, 5), match='MDLVI'>
re.search(pattern,'MCMXLVI')
Output:
<re.Match object; span=(0, 7), match='MCMXLVI'>

re.search(pattern,'MMMCCCXLV')
Output:
<re.Match object; span=(0, 9), match='MMMCCCXLV'>
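The final pattern can be shortened with the {n,m} syntax and wrapped in a small function. The sketch below is one possible way to do this; the names ROMAN and is_roman are hypothetical and not part of the notes. Note that the pattern on its own also matches the empty string (every part is optional), so the function rejects empty input explicitly.

```python
import re

# Same structure as the pattern above, with M?M?M? written as M{0,3} and so on.
ROMAN = re.compile(r'^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$')

def is_roman(text):
    """Return True if text is a well-formed Roman numeral from I to MMMCMXCIX."""
    return bool(text) and ROMAN.search(text) is not None

for numeral in ['MCMXLVI', 'MDCCCLXXXVIII', 'MMMM', 'IIII', '']:
    print(repr(numeral), is_roman(numeral))
```

Here 'MCMXLVI' and 'MDCCCLXXXVIII' are accepted, while 'MMMM', 'IIII' and the empty string are rejected.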
