Unit 4 Final
NumPy: Introduction, ndarray object, Data Types, Array Attributes, Indexing and
Slicing, Array manipulation, mathematical functions, Matplotlib; Pandas: Introduction
to pandas data structures (Series, DataFrame, Panel), basic functions, descriptive statistics
functions, iterating DataFrames, statistical functions, aggregations, visualization, plotting
graphs using the Plotly library.
NumPy:
What is NumPy?
NumPy is a Python library used for working with arrays.
It also has functions for working in the domains of linear algebra, Fourier transforms, and
matrices.
NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can
use it freely.
In Python, lists serve the purpose of arrays, but they are slow to process.
NumPy aims to provide an array object that is up to 50x faster than traditional Python lists.
The array object in NumPy is called ndarray. It comes with many supporting functions that make
working with ndarray very easy.
Arrays are very frequently used in data science, where speed and resources are very important.
Data Science is a branch of computer science where we study how to store, use and analyze
data to derive information from it.
NumPy arrays are stored in one contiguous block of memory, unlike lists, so processes can
access and manipulate them very efficiently.
This is the main reason why NumPy is faster than lists. It is also optimized to work with the latest
CPU architectures.
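The speed difference can be checked with a rough timing sketch like the one below. The array size and the doubling operation are arbitrary choices for illustration, and the measured ratio will vary by machine, so no particular speed-up is claimed here.

```python
import time
import numpy as np

n = 1_000_000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)  # explicit int64 avoids overflow on any platform

# double every element and add the results, using a plain Python list
start = time.perf_counter()
list_total = sum(x * 2 for x in py_list)
list_seconds = time.perf_counter() - start

# the same computation as one vectorized NumPy expression
start = time.perf_counter()
array_total = int((np_array * 2).sum())
array_seconds = time.perf_counter() - start

print("list :", list_total, f"{list_seconds:.4f} s")
print("array:", array_total, f"{array_seconds:.4f} s")
```

Both approaches produce the same total; only the time taken differs.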
Installation of NumPy
If you already have Python and pip installed on a system, installing NumPy is very
easy. Install it using this command:
pip install numpy
If this command fails, use a Python distribution that already has NumPy installed, such as
Anaconda or Spyder.
Import NumPy
Once NumPy is installed, import it in your applications by adding the import keyword:
import numpy
arr = numpy.array([1, 2, 3, 4, 5])
print(arr)
NumPy as np
NumPy is usually imported under the np alias.
alias: In Python, an alias is an alternate name for referring to the same thing.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
Checking the NumPy version:
import numpy as np
print(np.__version__)
ndarray.shape
This array attribute returns a tuple consisting of array dimensions. It can also be used to resize
the array.
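Example 1
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.shape)
Output:
(2, 3)
Example 2
# assigning to shape resizes the array in place
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
a.shape = (3,2)
print(a)
Output:
[[1 2]
 [3 4]
 [5 6]]
Example 3
# reshape() returns a resized array and leaves a unchanged
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
b = a.reshape(3,2)
print(b)
Output:
[[1 2]
 [3 4]
 [5 6]]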
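ndarray.ndim
This array attribute returns the number of array dimensions.
Example 1
import numpy as np
# an array of evenly spaced numbers
a = np.arange(24)
print(a)
Output:
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
Example 2
import numpy as np
a = np.arange(24)
print(a.ndim)   # a is one-dimensional
# now reshape it
b = a.reshape(2,4,3)
print(b)        # b has three dimensions
Output:
1
[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]
  [ 9 10 11]]

 [[12 13 14]
  [15 16 17]
  [18 19 20]
  [21 22 23]]]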
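ndarray.itemsize
This array attribute returns the length of each element of the array in bytes.
Example 1
import numpy as np
# dtype of the array is int8 (1 byte per element)
x = np.array([1,2,3,4,5], dtype = np.int8)
print(x.itemsize)
Output:
1
Example 2
import numpy as np
# dtype of the array is float32 (4 bytes per element)
x = np.array([1,2,3,4,5], dtype = np.float32)
print(x.itemsize)
Output:
4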
ndarray.flags
The ndarray object has the following flags. Their current values are returned by the flags attribute.
C_CONTIGUOUS (C) - The data is in a single, C-style contiguous segment
F_CONTIGUOUS (F) - The data is in a single, Fortran-style contiguous segment
OWNDATA (O) - The array owns the memory it uses or borrows it from another object
WRITEABLE (W) - The data area can be written to. Setting this to False locks the data, making it read-only
ALIGNED (A) - The data and all elements are aligned appropriately for the hardware
UPDATEIFCOPY (U) - This array is a copy of some other array. When this array is deallocated, the base array will be
updated with the contents of this array. Recent NumPy releases have dropped this flag and report
WRITEBACKIFCOPY in its place.
Example
import numpy as np
x = np.array([1,2,3,4,5])
print(x.flags)
Output:
C_CONTIGUOUS : True
F_CONTIGUOUS : True
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
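The effect of the WRITEABLE flag can be seen in a small sketch like the following. The variable names are illustrative; the sketch assumes an array that owns its data, so the flag can be switched off and on again.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
x.flags.writeable = False      # lock the data, making the array read-only

try:
    x[0] = 10                  # writing to a read-only array raises ValueError
    write_failed = False
except ValueError as err:
    write_failed = True
    print("Write rejected:", err)

x.flags.writeable = True       # unlock the data again
x[0] = 10
print(x)
```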
We can create a NumPy ndarray object by using the array() function.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))
type(): This built-in Python function tells us the type of the object passed to it. In the code
above, it shows that arr is of type numpy.ndarray.
To create an ndarray, we can pass a list, tuple or any array-like object into the array() function,
and it will be converted into an ndarray:
import numpy as np
# use a tuple to create a NumPy array
arr = np.array((1, 2, 3, 4, 5))
print(arr)
Dimensions in Arrays
0-D Arrays
0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.
import numpy as np
arr = np.array(42)
print(arr)
1-D Arrays
An array that has 0-D arrays as its elements is called uni-dimensional or 1-D array.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array.
NumPy has a whole submodule dedicated to matrix operations called numpy.matlib; linear algebra routines for ordinary 2-D arrays are in numpy.linalg.
Create a 2-D array containing two arrays with the values 1,2,3 and 4,5,6:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)
3-D arrays
An array that has 2-D arrays (matrices) as its elements is called 3-D array.
Create a 3-D array with two 2-D arrays, both containing two arrays with the values 1,2,3 and
4,5,6:
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(arr)
NumPy arrays provide the ndim attribute, which returns an integer that tells us how many
dimensions the array has.
import numpy as np
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)
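When the array is created, you can define the minimum number of dimensions by using the ndmin
argument.
import numpy as np
# create an array with 5 dimensions and verify that it has 5 dimensions
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr)
print('number of dimensions :', arr.ndim)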
The indexes in NumPy arrays start with 0, meaning that the first element has index 0, the
second has index 1, and so on.
Get the first element from the following array:
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr[0])
Get the second element from the following array:
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr[1])
Get third and fourth elements from the following array and add them.
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr[2] + arr[3])
To access elements from 2-D arrays we can use comma separated integers representing the
dimension and the index of the element.
Think of 2-D arrays like a table with rows and columns, where the dimension represents the
row and the index represents the column.
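A minimal sketch of 2-D element access, using made-up sample data (two rows of five values), might look like this:

```python
import numpy as np

# hypothetical sample data: 2 rows, 5 columns
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

# row index first, then column index
second_on_first_row = arr[0, 1]
fifth_on_second_row = arr[1, 4]

print("2nd element on 1st row:", second_on_first_row)
print("5th element on 2nd row:", fifth_on_second_row)
```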
To access elements from 3-D arrays we can use comma separated integers representing the
dimensions and the index of the element.
Access the third element of the second array of the first array:
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr[0, 1, 2])
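Example Explained
arr[0, 1, 2] prints the value 6.
The first number represents the first dimension, which contains two arrays:
[[1, 2, 3], [4, 5, 6]]
and:
[[7, 8, 9], [10, 11, 12]]
Since we selected 0, we are left with the first array:
[[1, 2, 3], [4, 5, 6]]
The second number represents the second dimension, which also contains two arrays:
[1, 2, 3]
and:
[4, 5, 6]
Since we selected 1, we are left with the second array:
[4, 5, 6]
The third number represents the third dimension, which contains three values:
4
5
6
Since we selected 2, we end up with the third value:
6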
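Negative Indexing
Use negative indexing to access an array from the end: index -1 is the last element, -2 the
second-to-last element, and so on.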
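As an illustration, negative indexes can be mixed with ordinary ones in a 2-D array. The sample data below is made up for this sketch.

```python
import numpy as np

# hypothetical sample data: 2 rows, 5 columns
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

last_of_second_row = arr[1, -1]    # row 1, last column
last_of_last_row = arr[-1, -1]     # last row, last column (same element here)

print("Last element of the 2nd row:", last_of_second_row)
```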
Slicing arrays
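Slicing in Python means taking elements from one given index to another given index.
We pass a slice instead of an index, like this: [start:end]. We can also define the step, like
this: [start:end:step]. If start is omitted it is taken as 0, if end is omitted it is taken as the
length of the array in that dimension, and if step is omitted it is taken as 1.
Slice elements from index 1 to index 5 from the following array:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5])
Note: The result includes the start index, but excludes the end index.
Slice elements from index 4 to the end of the array:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[4:])
Slice elements from the beginning to index 4 (not included):
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[:4])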
Negative Slicing
Slice from the index 3 from the end to index 1 from the end:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[-3:-1])
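STEP
Use the step value to determine the step of the slicing.
Return every other element from index 1 to index 5:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5:2])
Return every other element from the entire array:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[::2])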
From the second element, slice elements from index 1 to index 4 (not included):
import numpy as np
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[1, 1:4])
From both elements, return index 2:
import numpy as np
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[0:2, 2])
From both elements, slice index 1 to index 4 (not included). This returns a 2-D array:
import numpy as np
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[0:2, 1:4])
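By default Python has these data types:
strings - used to represent text data; the text is given in quote marks. e.g. "ABCD"
integer - used to represent whole numbers. e.g. -1, -2, -3
float - used to represent real numbers. e.g. 1.2, 42.42
boolean - used to represent True or False.
complex - used to represent complex numbers. e.g. 1.0 + 2.0j, 1.5 + 2.5j
NumPy has some extra data types, and refers to data types with one character, like i for integers,
u for unsigned integers etc.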
Below is a list of all data types in NumPy and the characters used to represent them.
i - integer
b - boolean
u - unsigned integer
f - float
c - complex float
m - timedelta
M - datetime
O - object
S - string
U - unicode string
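The characters listed above can be read back from any array through dtype.kind. The sketch below builds one small array per kind (the sample values are arbitrary) and prints the character NumPy reports for each.

```python
import numpy as np

# one small example array per kind character; the values are arbitrary
samples = {
    "i": np.array([1, 2, 3]),
    "b": np.array([True, False]),
    "u": np.array([1, 2, 3], dtype=np.uint8),
    "f": np.array([1.5, 2.5]),
    "c": np.array([1 + 2j]),
    "m": np.array([np.timedelta64(1, "D")]),
    "M": np.array(["2024-01-01"], dtype="datetime64[D]"),
    "O": np.array([{"a": 1}, None], dtype=object),
    "S": np.array([b"abc"]),
    "U": np.array(["abc"]),
}

kinds = {expected: arr.dtype.kind for expected, arr in samples.items()}
for expected, actual in kinds.items():
    print(expected, "->", actual, samples[expected].dtype)
```

One caveat worth knowing: when used as a dtype string, 'b' means a signed byte (int8) rather than boolean, so dtype=bool is the safer way to request a boolean array.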
The NumPy array object has a property called dtype that returns the data type of the array:
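import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr.dtype)
Get the data type of an array containing strings:
import numpy as np
arr = np.array(['apple', 'banana', 'cherry'])
print(arr.dtype)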
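We use the array() function to create arrays. This function can take an optional argument, dtype,
that allows us to define the expected data type of the array elements.
Create an array with data type string:
import numpy as np
arr = np.array([1, 2, 3, 4], dtype='S')
print(arr)
print(arr.dtype)
For i, u, f, S and U we can define the size as well. Create an array with data type 4-byte integer:
import numpy as np
arr = np.array([1, 2, 3, 4], dtype='i4')
print(arr)
print(arr.dtype)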
If a type is given to which the elements cannot be cast, NumPy will raise a ValueError.
ValueError: In Python, ValueError is raised when an argument passed to a function has the right
type but an inappropriate value.
A non-integer string like 'a' cannot be converted to an integer (this will raise an error):
import numpy as np
arr = np.array(['a', '2', '3'], dtype='i')
The best way to change the data type of an existing array, is to make a copy of the array with
the astype() method.
The astype() function creates a copy of the array, and allows you to specify the data type as a
parameter.
The data type can be specified using a string, like 'f' for float, 'i' for integer etc. or you can use
the data type directly like float for float and int for integer.
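A short sketch of the copy behaviour described above. The variable names are illustrative.

```python
import numpy as np

arr = np.array([1.1, 2.1, 3.1])

as_int = arr.astype('i')          # data type given as a string code
as_bool = arr.astype(bool)        # data type given directly

print(as_int, as_int.dtype)
print(as_bool, as_bool.dtype)

# astype() returned new arrays, so the original is unchanged
print(arr, arr.dtype)
```

Converting floats to integers truncates the fractional part, so 1.1, 2.1 and 3.1 become 1, 2 and 3.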
Change data type from float to integer by using 'i' as parameter value:
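import numpy as np
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype('i')
print(newarr)
print(newarr.dtype)
Change data type from float to integer by using int as parameter value:
import numpy as np
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype(int)
print(newarr)
print(newarr.dtype)
Change data type from integer to boolean:
import numpy as np
arr = np.array([1, 0, 3])
newarr = arr.astype(bool)
print(newarr)
print(newarr.dtype)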
Trigonometric Functions
NumPy has standard trigonometric functions which return trigonometric ratios for a given
angle in radians.
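import numpy as np
a = np.array([0,30,45,60,90])
# convert degrees to radians by multiplying by pi/180
print('Sine of different angles:')
print(np.sin(a*np.pi/180))
print('Cosine values for angles in array:')
print(np.cos(a*np.pi/180))
print('Tangent values for given angles:')
print(np.tan(a*np.pi/180))
Output:
Sine of different angles:
[0. 0.5 0.70710678 0.8660254 1. ]
Cosine values for angles in array:
[1.00000000e+00 8.66025404e-01 7.07106781e-01 5.00000000e-01
6.12323400e-17]
Tangent values for given angles:
[0.00000000e+00 5.77350269e-01 1.00000000e+00 1.73205081e+00
1.63312394e+16]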
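The arcsin, arccos, and arctan functions return the trigonometric inverse of sin, cos, and tan of the
given angle. The result of these functions can be verified with the numpy.degrees() function by
converting radians to degrees.
import numpy as np
a = np.array([0,30,45,60,90])
print('Array containing sine values:')
sin = np.sin(a*np.pi/180)
print(sin)
print('Inverse of sin (in radians):')
inv = np.arcsin(sin)
print(inv)
print('In degrees:')
print(np.degrees(inv))
print('Cos function:')
cos = np.cos(a*np.pi/180)
print(cos)
print('Inverse of cos:')
inv = np.arccos(cos)
print(inv)
print('In degrees:')
print(np.degrees(inv))
print('Tan function:')
tan = np.tan(a*np.pi/180)
print(tan)
print('Inverse of tan:')
inv = np.arctan(tan)
print(inv)
print('In degrees:')
print(np.degrees(inv))
Output:
Array containing sine values:
[0. 0.5 0.70710678 0.8660254 1. ]
Inverse of sin (in radians):
[0. 0.52359878 0.78539816 1.04719755 1.57079633]
In degrees:
[ 0. 30. 45. 60. 90.]
Cos function:
[1.00000000e+00 8.66025404e-01 7.07106781e-01 5.00000000e-01
6.12323400e-17]
Inverse of cos:
[0. 0.52359878 0.78539816 1.04719755 1.57079633]
In degrees:
[ 0. 30. 45. 60. 90.]
Tan function:
[0.00000000e+00 5.77350269e-01 1.00000000e+00 1.73205081e+00
1.63312394e+16]
Inverse of tan:
[0. 0.52359878 0.78539816 1.04719755 1.57079633]
In degrees:
[ 0. 30. 45. 60. 90.]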
numpy.around()
This function returns the value rounded to the desired precision. It takes the
following parameters.
numpy.around(a, decimals)
Where,
a - Input data
decimals - The number of decimals to round to. Default is 0. If negative, the integer is rounded to a position
to the left of the decimal point
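The effect of a positive and a negative decimals value can be sketched as follows. The input values are arbitrary examples.

```python
import numpy as np

a = np.array([1.234, 25.532, 123.0])

one_decimal = np.around(a, decimals=1)    # keep one digit after the decimal point
nearest_ten = np.around(a, decimals=-1)   # round to the nearest multiple of 10

print(one_decimal)
print(nearest_ten)
```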
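import numpy as np
a = np.array([1.0, 5.55, 123, 0.567, 25.532])
print('Original array:')
print(a)
print('After rounding:')
print(np.around(a))
Output:
Original array:
[ 1. 5.55 123. 0.567 25.532]
After rounding:
[ 1. 6. 123. 1. 26.]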
numpy.floor()
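This function returns the largest integer not greater than the input parameter. The floor of the
scalar x is the largest integer i, such that i <= x. Note that flooring always rounds toward
negative infinity, so negative values move away from 0 (for example, the floor of -1.7 is -2).
import numpy as np
a = np.array([-1.7, 1.5, -0.2, 0.6, 10])
print(a)
print(np.floor(a))
Output:
[-1.7 1.5 -0.2 0.6 10. ]
[-2. 1. -1. 0. 10.]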
numpy.ceil()
The ceil() function returns the ceiling of an input value, i.e. the ceil of the scalar x is the smallest
integer i, such that i >= x.
import numpy as np
a = np.array([-1.7, 1.5, -0.2, 0.6, 10])
print(a)
print(np.ceil(a))
Output:
[-1.7 1.5 -0.2 0.6 10. ]
[-1. 2. -0. 1. 10.]
The contents of an ndarray object can be accessed and modified by indexing or slicing, just like
Python's built-in container objects.
As mentioned earlier, items in an ndarray object follow zero-based indexing. Three types of indexing
methods are available: field access, basic slicing and advanced indexing.
Basic slicing is an extension of Python's basic concept of slicing to n dimensions. A Python slice
object is constructed by giving start, stop, and step parameters to the built-in slice function. This
slice object is passed to the array to extract a part of array.
import numpy as np
a = np.arange(10)
s = slice(2,7,2)
print(a[s])
[2 4 6]
In the above example, an ndarray object is prepared by arange() function. Then a slice object is
defined with start, stop, and step values 2, 7, and 2 respectively. When this slice object is passed
to the ndarray, a part of it starting with index 2 up to 7 with a step of 2 is sliced.
The same result can also be obtained by giving the slicing parameters separated by a colon :
(start:stop:step) directly to the ndarray object.
import numpy as np
a = np.arange(10)
b = a[2:7:2]
print(b)
[2 4 6]
If only one parameter is given, a single item corresponding to the index will be returned. If a : is
placed after it, all items from that index onwards will be extracted. If two parameters
(with : between them) are used, items between the two indexes (not including the stop index)
are sliced with a default step of one.
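import numpy as np
a = np.arange(10)
b = a[5]
print(b)
Output:
5
import numpy as np
a = np.arange(10)
print(a[2:])
Output:
[2 3 4 5 6 7 8 9]
import numpy as np
a = np.arange(10)
print(a[2:5])
Output:
[2 3 4]
The above description applies to multi-dimensional ndarrays too.
import numpy as np
a = np.array([[1,2,3],[3,4,5],[4,5,6]])
print(a)
# slice items starting from index 1
print('Now we will slice the array from the index a[1:]')
print(a[1:])
Output:
[[1 2 3]
 [3 4 5]
 [4 5 6]]
Now we will slice the array from the index a[1:]
[[3 4 5]
 [4 5 6]]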
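Slicing can also include an ellipsis (...) to make a selection tuple of the same length as the
number of dimensions of the array. The ellipsis stands for all the remaining dimensions: a[...,1]
selects the second column, and a[1,...] selects the second row.
import numpy as np
a = np.array([[1,2,3],[3,4,5],[4,5,6]])
print('Our array is:')
print(a)
print('The items in the second column are:')
print(a[...,1])
print('The items in the second row are:')
print(a[1,...])
print('The items from column 1 onwards are:')
print(a[...,1:])
Output:
Our array is:
[[1 2 3]
 [3 4 5]
 [4 5 6]]
The items in the second column are:
[2 4 5]
The items in the second row are:
[3 4 5]
The items from column 1 onwards are:
[[2 3]
 [4 5]
 [5 6]]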
NumPy Ndarray
The ndarray is the n-dimensional array object defined in numpy. It stores a collection of
elements of the same type. In other words, an ndarray is a collection of objects that share
one data type (dtype).
The elements of an ndarray are accessed using 0-based indexing. Every element of the array
occupies the same amount of memory.
Creating a ndarray object
The ndarray object can be created by using the array routine of the numpy module. For this
purpose, we need to import numpy.
>>> import numpy
>>> a = numpy.array([1, 2, 3])
We can also pass a collection object into the array routine to create the equivalent n-dimensional
array. The syntax is given below.
>>> numpy.array(object, dtype = None, copy = True, order = None, subok = False, ndmin = 0)
SN  Parameter  Description
1   object     The collection object to convert. It can be a list, tuple, or any other array-like object.
2   dtype      We can change the data type of the array elements by setting this option
               to the specified type. The default is None.
3   copy       It is optional. By default, it is True, which means the object is copied.
4   order      The memory layout of the array. It can be C (row-major order), F (column-major
               order), A (any), or K (keep the layout of the input).
5   subok      The returned array will be a base class array by default. We can change this
               to let subclasses pass through by setting this option to True.
6   ndmin      The minimum number of dimensions of the resulting array.
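A brief sketch of a few of these parameters in use. The sample data and variable names are illustrative only.

```python
import numpy as np

data = [[1, 2, 3], [4, 5, 6]]

as_float = np.array(data, dtype=float)      # dtype: force the element type
column_major = np.array(data, order='F')    # order: Fortran (column-major) memory layout
at_least_3d = np.array(data, ndmin=3)       # ndmin: pad the shape up to 3 dimensions

print(as_float.dtype)
print(column_major.flags['F_CONTIGUOUS'])
print(at_least_3d.shape)
```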
To change the data type of the array elements, pass the name of the data type along with
the collection.
>>> a = numpy.array([1, 3, 5, 7], complex)
The ndim attribute can be used to find the number of dimensions of the array.
The itemsize attribute is used to get the size of each array item. It returns the number of bytes
taken by each array element.
Example
#finding the size of each item in the array
import numpy as np
a = np.array([[1,2,3]])
print("Each item contains",a.itemsize,"bytes")
Output:
Each item contains 8 bytes
(The value is 8 where the default integer type is int64. It can be 4 on platforms whose default is int32.)
To check the data type of each array item, the dtype attribute is used.
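A minimal sketch of checking dtype, with arbitrary sample arrays. The exact integer type printed depends on the platform default.

```python
import numpy as np

a = np.array([[1, 2, 3]])      # integers
b = np.array([1.5, 2.5])       # floating-point numbers

print("Each item of a is of the type", a.dtype)
print("Each item of b is of the type", b.dtype)
```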
To get the shape and size of the array, the shape and size attributes of the numpy
array are used.
Example
import numpy as np
a = np.array([[1,2,3,4,5,6,7]])
print("Array Size:",a.size)
print("Shape:",a.shape)
Output:
Array Size: 7
Shape: (1, 7)
Reshaping the array objects
By the shape of the array, we mean the number of rows and columns of a multi-dimensional
array. The numpy module also lets us reshape the array by changing the
number of rows and columns, as long as the total number of elements stays the same.
The reshape() function associated with the ndarray object is used to reshape the array. For a
2-D result, it accepts two parameters giving the rows and columns of the new shape.
Example
import numpy as np
a = np.array([[1,2],[3,4],[5,6]])
print("printing the original array..")
print(a)
a=a.reshape(2,3)
print("printing the reshaped array..")
print(a)
Output:
printing the original array..
[[1 2]
 [3 4]
 [5 6]]
printing the reshaped array..
[[1 2 3]
 [4 5 6]]
Slicing in a NumPy array is the way to extract a range of elements from an array. It is
performed in the same way as on a Python list. Individual elements are accessed by index, as in
the following example.
import numpy as np
a = np.array([[1,2],[3,4],[5,6]])
print(a[0,1])
print(a[2,0])
Output:
2
5
The above program prints the element at row 0, column 1 (which is 2) and the element at row 2,
column 0 (which is 5).
Linspace
The linspace() function returns evenly spaced values over the given interval. The following
example returns 10 evenly spaced values over the interval 5-15 (both endpoints included).
Example
import numpy as np
# 10 values which are evenly spaced over the given interval 5-15
a=np.linspace(5,15,10)
print(a)
Output:
[ 5. 6.11111111 7.22222222 8.33333333 9.44444444 10.55555556
11.66666667 12.77777778 13.88888889 15. ]
NumPy provides the max(), min(), and sum() functions, which are used to find the
maximum, minimum, and sum of the array elements respectively.
Example
import numpy as np
a = np.array([1,2,3,10,15,4])
print("The array:",a)
print("The maximum element:",a.max())
print("The minimum element:",a.min())
print("The sum of the elements:",a.sum())
Output:
The array: [ 1 2 3 10 15 4]
The maximum element: 15
The minimum element: 1
The sum of the elements: 35
A NumPy multi-dimensional array is organized along axes. In a 2-D array, axis 0 runs down the
rows and axis 1 runs across the columns. An operation along axis 0 therefore gives one result per
column, and an operation along axis 1 gives one result per row.
To calculate the maximum element of each column, the minimum element of each
row, and the sum of each row, consider the following example.
Example
import numpy as np
a = np.array([[1,2,30],[10,15,4]])
print("The array:",a)
print("The maximum elements of columns:",a.max(axis = 0))
print("The minimum element of rows",a.min(axis = 1))
print("The sum of all rows",a.sum(axis = 1))
Output:
The array: [[ 1  2 30]
 [10 15  4]]
The maximum elements of columns: [10 15 30]
The minimum element of rows [1 4]
The sum of all rows [33 29]
The sqrt() and std() functions of numpy are used to find the element-wise square root
and the standard deviation of the array elements respectively.
The standard deviation measures how much the elements of the array vary, on average, from the
mean value of the array.
Example
import numpy as np
a = np.array([[1,2,30],[10,15,4]])
print(np.sqrt(a))
print(np.std(a))
Output:
[[1. 1.41421356 5.47722558]
 [3.16227766 3.87298335 2. ]]
followed by the standard deviation, approximately 10.0443.
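To make the definition concrete, the standard deviation can be recomputed by hand and compared with np.std(). This sketch uses the population formula (dividing by the number of elements), which is the np.std() default.

```python
import numpy as np

a = np.array([[1, 2, 30], [10, 15, 4]])

mean = a.mean()                          # average of all six elements
variance = ((a - mean) ** 2).mean()      # mean squared distance from the average
manual_std = np.sqrt(variance)

print(manual_std)
print(np.std(a))                         # same value from the built-in function
```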
The numpy module allows us to perform arithmetic operations on multi-dimensional arrays
directly. The operations are applied element by element.
In the following example, the arithmetic operations are performed on the two multi-dimensional
arrays a and b.
Example
import numpy as np
a = np.array([[1,2,30],[10,15,4]])
b = np.array([[1,2,3],[12, 19, 29]])
print("Sum of array a and b\n",a+b)
print("Product of array a and b\n",a*b)
print("Division of array a and b\n",a/b)
Array Concatenation
NumPy provides vertical stacking and horizontal stacking, which allow us to
concatenate two multi-dimensional arrays vertically or horizontally.
Example
import numpy as np
a = np.array([[1,2,30],[10,15,4]])
b = np.array([[1,2,3],[12, 19, 29]])
print("Arrays vertically concatenated\n",np.vstack((a,b)))
print("Arrays horizontally concatenated\n",np.hstack((a,b)))
Output:
Arrays vertically concatenated
 [[ 1  2 30]
 [10 15  4]
 [ 1  2  3]
 [12 19 29]]
Arrays horizontally concatenated
 [[ 1  2 30  1  2  3]
 [10 15  4 12 19 29]]
A data type is a way to specify the type of data that will be stored in an array. For example,
an array created from three integers, such as np.array([2, 4, 6]), has an integer data type
(int64 by default on most platforms).
NumPy provides us with several built-in data types to efficiently represent numerical data.
NumPy Data Types
NumPy offers a wider range of numerical data types than what is available in Python. Here's
the list of most commonly used numeric data types in NumPy:
1. int8, int16, int32, int64 - signed integer types with different bit sizes
2. uint8, uint16, uint32, uint64 - unsigned integer types with different bit sizes
3. float32, float64 - floating-point types with different precision levels
4. complex64, complex128 - complex number types with different precision levels
To check the data type of a NumPy array, we can use the dtype attribute. For example,
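import numpy as np
# create an array of integers
array1 = np.array([2, 4, 6])
# check the data type of array1
print(array1.dtype)
# Output: int64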
In the above example, we have used the dtype attribute to check the data type of
the array1 array.
Since array1 is an array of integers, the data type of array1 is inferred as int64 by default.
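Example: Check Data Type of NumPy Array
import numpy as np
# example arrays; the values are illustrative
int_array = np.array([-3, -1, 0, 1])
float_array = np.array([0.1, 0.2, 0.3])
complex_array = np.array([1+2j, 2+3j, 3+4j])
# check the data type of each array
print(int_array.dtype)
print(float_array.dtype)
print(complex_array.dtype)
Output
int64
float64
complex128
Here, we have created three types of arrays and checked their default data types using
the dtype attribute.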
int_array - contains four integer elements whose default data type is int64
float_array - contains three floating-point numbers whose default data type is float64
complex_array - contains three complex numbers whose default data type
is complex128
In NumPy, we can create an array with a defined data type by passing the dtype parameter
while calling the np.array() function. For example,
import numpy as np
# create an array of 32-bit integers
array1 = np.array([1, 3, 7], dtype='int32')
print(array1, array1.dtype)
Output
[1 3 7] int32
In the above example, we have created a NumPy array named array1 with a defined data type.
Here, inside np.array(), we have passed an array [1, 3, 7] and set the dtype parameter to int32.
Since we have set the data type of the array to int32, each element of the array is represented
as a 32-bit integer.
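The same approach works for other data types:
import numpy as np
array1 = np.array([1, 3, 7], dtype='int8')
array2 = np.array([2, 4, 6], dtype='uint16')
array3 = np.array([1.2, 2.3, 3.4], dtype='float32')
array4 = np.array([1+2j, 2+3j, 3+4j], dtype='complex64')
print(array1, array1.dtype)
print(array2, array2.dtype)
print(array3, array3.dtype)
print(array4, array4.dtype)
Output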
[1 3 7] int8
[2 4 6] uint16
[1.2 2.3 3.4] float32
[1.+2.j 2.+3.j 3.+4.j] complex64
In NumPy, we can convert the data type of an array using the astype() method. For example,
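import numpy as np
# create an array of integers
int_array = np.array([1, 3, 5, 7])
# convert the data type of int_array to float
float_array = int_array.astype('float')
print(int_array, int_array.dtype)
print(float_array, float_array.dtype)
Output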
[1 3 5 7] int64
[1. 3. 5. 7.] float64
In NumPy, attributes are properties of NumPy arrays that provide information about the array's
shape, size, data type, dimension, and so on.
For example, to get the dimension of an array, we can use the ndim attribute.
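Attributes  Description
ndim        returns the number of dimensions of the array
size        returns the total number of elements in the array
dtype       returns the data type of the elements in the array
shape       returns the size of the array in each dimension
itemsize    returns the size (in bytes) of each element in the array
data        returns the buffer containing the actual elements of the array in memory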
NumPy Array ndim Attribute
The ndim attribute returns the number of dimensions in the numpy array. For example,
import numpy as np
# create a 2-D array
array1 = np.array([[2, 4, 6],
                   [1, 3, 5]])
# check the dimension of array1
print(array1.ndim)
# Output: 2
In this example, array1.ndim returns the number of dimensions present in array1. As array1 is
a 2D array, we get 2 as the output.
The size attribute returns the total number of elements in the given array.
Let's see an example.
import numpy as np
array1 = np.array([[1, 2, 3],
                   [6, 7, 8]])
# return the total number of elements in array1
print(array1.size)
# Output: 6
In this example, array1.size returns the total number of elements in the array1 array, regardless
of the number of dimensions.
Since there are a total of 6 elements in array1, the size attribute returns 6.
In NumPy, the shape attribute returns a tuple of integers that gives the size of the array in each
dimension. For example,
import numpy as np
array1 = np.array([[1, 2, 3],
                   [6, 7, 8]])
# return a tuple that gives the size of the array in each dimension
print(array1.shape)
# Output: (2, 3)
Here, array1 is a 2-D array that has 2 rows and 3 columns. So array1.shape returns the
tuple (2, 3) as the output.
We can use the dtype attribute to check the datatype of a NumPy array. For example,
import numpy as np
# create an array of integers
array1 = np.array([6, 7, 8])
# check the data type of array1
print(array1.dtype)
# Output: int64
In the above example, the dtype attribute returns the data type of array1.
Since array1 is an array of integers, the data type of array1 is inferred as int64 by default.
In NumPy, the itemsize attribute determines size (in bytes) of each element in the array. For
example,
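import numpy as np
# create a default 1-D array of integers
array1 = np.array([6, 7, 8, 10, 13])
# create a 1-D array of 32-bit integers
array2 = np.array([6, 7, 8, 10, 13], dtype=np.int32)
# use of itemsize to determine size of each array element of array1 and array2
print(array1.itemsize) # prints 8
print(array2.itemsize) # prints 4
Output
8
4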
Here,
array1 is an array containing 64-bit integers by default, which uses 8 bytes of memory
per element. So, itemsize returns 8 as the size of each element.
array2 is an array of 32-bit integers, so each element in this array uses only 4 bytes of
memory. So, itemsize returns 4 as the size of each element.
In NumPy, we can get a buffer containing actual elements of the array in memory using
the data attribute.
In simpler terms, the data attribute is like a pointer to the memory location where the array's
data is stored in the computer's memory.
Let's see an example.
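import numpy as np
array1 = np.array([6, 7, 8])
array2 = np.array([[1, 2, 3],
                   [6, 7, 8]])
# print the buffer objects that expose the array data
print("Data of array1 is:", array1.data)
print("Data of array2 is:", array2.data)
Output
Each line shows a memoryview object such as <memory at ...>, with an address that differs from run to run.
Here, the data attribute returns buffer objects that reference the memory where the data
for array1 and array2 is stored.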
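Array Indexing in NumPy
In NumPy, each element in an array is associated with a number called its index. Consider the array
array1 = np.array([1, 3, 5, 7, 9])
In the above array, 5 is the 3rd element. However, its index is 2.
This is because the array indexing starts from 0, that is, the first element of the array has index
0, the second element has index 1, and so on.
Now, we'll see how we can access individual items from the array using the index number.
import numpy as np
array1 = np.array([1, 3, 5, 7, 9])
# access elements using their index
print(array1[0])    # prints 1
print(array1[2])    # prints 5
print(array1[4])    # prints 9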
Note: Since the last element of array1 is at index 4, if we try to access the element beyond that,
say index 5, we will get an index error: IndexError: index 5 is out of bounds for axis 0 with
size 5
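The out-of-bounds case in the note can be handled with an ordinary try/except block. In this sketch, safe_get is a hypothetical helper written for illustration and is not part of NumPy.

```python
import numpy as np

array1 = np.array([1, 3, 5, 7, 9])

def safe_get(arr, index, default=None):
    """Return arr[index], or default when the index is out of bounds."""
    try:
        return arr[index]
    except IndexError as err:
        print("IndexError:", err)
        return default

last_item = safe_get(array1, 4)      # valid: the last element
missing_item = safe_get(array1, 5)   # invalid: one past the end

print(last_item, missing_item)
```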
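We can use indices to change the value of an element in a NumPy array. For example,
import numpy as np
# create a numpy array (example values)
numbers = np.array([2, 4, 6, 8, 10])
# change the value of the first element
numbers[0] = 12
print("After modifying first element:", numbers)    # [12 4 6 8 10]
# change the value of the third element
numbers[2] = 14
print("After modifying third element:", numbers)    # [12 4 14 8 10]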
In the above example, we have modified elements of the numbers array using array indexing.
numbers[0] = 12 - modifies the first element of numbers and sets its value to 12
numbers[2] = 14 - modifies the third element of numbers and sets its value to 14
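NumPy allows negative indexing for its arrays. The index -1 refers to the last item, -2 to the
second-to-last item, and so on.
Let's see an example.
import numpy as np
# create a numpy array
numbers = np.array([1, 3, 5, 7, 9])
# access the last element
print(numbers[-1])
# access the second-to-last element
print(numbers[-2])
Output
9
7
Similar to regular indexing, we can also modify array elements using negative indexing. For
example,
import numpy as np
# create a numpy array (example values)
numbers = np.array([2, 3, 5, 7, 11])
# modify the last element
numbers[-1] = 13
# modify the second-to-last element
numbers[-2] = 17
print(numbers)    # [ 2 3 5 17 13]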
Here, numbers[-1] = 13 modifies the last element to 13 and numbers[-2] = 17 modifies the
second-to-last element to 17.
Note: Unlike regular indexing, negative indexing starts from -1 (not 0) and it starts counting
from the end of the array.
2-D NumPy Array Indexing
Array indexing in NumPy allows us to access and manipulate elements in a 2-D array.
To access an element of array1, we need to specify the row index and column index of the
element. Suppose we have the following 2-D array,
array1 = np.array([[1, 3, 5],
                   [7, 9, 2],
                   [4, 6, 8]])
Now, say we want to access the element in the third row and second column. We specify the
index as:
array1[2, 1] # returns 6
Since indexing starts from 0, to access the element in the third row and second
column we use index 2 for the third row and index 1 for the second column.
import numpy as np
# create a 2D array
array1 = np.array([[1, 3, 5, 7],
                   [9, 11, 13, 15],
                   [2, 4, 6, 8]])
# access the element at the second row and fourth column
print(array1[1, 3])    # prints 15
# access the element at the first row and first column
print(array1[0, 0])    # prints 1
In NumPy, we can access specific rows or columns of a 2-D array using array indexing.
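import numpy as np
# create a 2D array
array1 = np.array([[1, 3, 5],
                   [7, 9, 2],
                   [4, 6, 8]])
# access the second row of the array
second_row = array1[1, :]
print("Second Row:", second_row)
# access the third column of the array
third_col = array1[:, 2]
print("Third Column:", third_col)
Output
Second Row: [7 9 2]
Third Column: [5 2 8]
Here,
array1[1, :] - accesses the second row of array1
array1[:, 2] - accesses the third column of array1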
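We learned how to access elements in a 2D array. We can also access elements in higher
dimensional arrays.
To access an element of a 3D array, we use three indices separated by commas: the first selects
the 2D slice, the second the row within that slice, and the third the column.
Note: In 3D arrays, a slice is a 2D array obtained by fixing the index along one
of the dimensions.
import numpy as np
# create a 3D array with shape (2, 3, 4); the values are chosen to match the output below
array1 = np.array([[[1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12]],
                   [[13, 14, 15, 16],
                    [17, 18, 19, 20],
                    [21, 22, 23, 24]]])
# access a specific element of the array
element = array1[1, 2, 1]
print(element)
# Output: 22
Here, we created a 3D array called array1 with shape (2, 3, 4). This array contains 2 2D arrays,
each with 3 rows and 4 columns.
Then, we used indexing to access a specific element of array1. Notice the code,
array1[1, 2, 1]
Here,
array1[1, , ] - access the second 2D array, i.e.
[[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]
array1[ ,2, ] - access the third row of that 2D array, i.e.
[21, 22, 23, 24]
array1[ , ,1] - access the second element of the third row, i.e.
[22]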
With slicing, we can easily access elements in the array. It can be done on one or more
dimensions of a NumPy array.
array[start:stop:step]
Here,
start - index of the first element to be included in the slice
stop - index of the last element (exclusive)
step - step size between each element in the slice
Note: When we slice arrays, the start index is inclusive but the stop index is exclusive.
If we omit start, slicing starts from the first element
If we omit stop, slicing continues up to the last element
If we omit step, default step size is 1
In NumPy, it's possible to access the portion of an array using the slicing operator :. For
example,
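import numpy as np
# create a 1D array
array1 = np.array([1, 3, 5, 7, 8, 9, 2, 4, 6])
# slice array1 from index 2 to index 6 (exclusive)
print(array1[2:6])    # [5 7 8 9]
# slice array1 from index 0 to index 8 (exclusive) with a step size of 2
print(array1[0:8:2])    # [1 5 8 2]
# slice array1 from index 3 up to the last element
print(array1[3:])    # [7 8 9 2 4 6]
# items from start to end
print(array1[:])    # [1 3 5 7 8 9 2 4 6]
In the above example, we have created the array named array1 with 9 elements.
Then, we used the slicing operator : to slice array elements.
array1[2:6] - slices array1 from index 2 to index 6, not including index 6
array1[0:8:2] - slices array1 from index 0 to index 8, not including index 8, with a step size of 2
array1[3:] - slices array1 from index 3 up to the last element
array1[:] - returns all items from beginning to end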
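With slicing, we can also modify array elements using the:
start parameter
stop parameter
start and stop parameter
start, stop, and step parameter
1. Using start Parameter
import numpy as np
# create a numpy array
numbers = np.array([2, 4, 6, 8, 10, 12])
# modify elements from index 3 onwards
numbers[3:] = 20
print(numbers)
# Output: [ 2 4 6 20 20 20]
Here, numbers[3:] = 20 replaces all the elements from index 3 onwards with new value 20.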
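2. Using stop Parameter
import numpy as np
numbers = np.array([2, 4, 6, 8, 10, 12])
# modify the first 3 elements (40 is an arbitrary example value)
numbers[:3] = 40
print(numbers)
# Output: [40 40 40 8 10 12]
Here, numbers[:3] = 40 replaces the first 3 elements with the new value 40.
3. Using start and stop parameter
import numpy as np
numbers = np.array([2, 4, 6, 8, 10, 12])
# modify elements from index 2 to index 4
numbers[2:5] = 22
print(numbers)
# Output: [2 4 22 22 22 12]
Here, numbers[2:5] = 22 selects elements from index 2 to index 4 and replaces them with new
value 22.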
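4. Using start, stop, and step parameter
import numpy as np
numbers = np.array([2, 4, 6, 8, 10, 12])
# modify every second element from index 1 to index 5 (exclusive)
numbers[1:5:2] = 16
print(numbers)
# Output: [ 2 16 6 16 10 12]
Here, numbers[1:5:2] = 16 replaces every second element from index 1 up to, but not including,
index 5 with the new value 16.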
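Slices also accept negative indices, which count from the end of the array. The sketch below uses a sample array chosen for illustration.

```python
import numpy as np

numbers = np.array([2, 4, 6, 8, 10, 12])

last_three = numbers[-3:]                  # the last 3 elements
middle = numbers[-5:-2]                    # 5th-to-last up to, not including, 2nd-to-last
every_other_reversed = numbers[-1::-2]     # from the last element backwards, step 2

print(last_three)             # [ 8 10 12]
print(middle)                 # [4 6 8]
print(every_other_reversed)   # [12  8  4]
```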
In NumPy, we can also reverse array elements using the negative slicing. For example,
import numpy as np
# create a numpy array
numbers = np.array([2, 4, 6, 8, 10, 12])
# generate the reversed array
reversed_numbers = numbers[::-1]
print(reversed_numbers)
# Output: [12 10 8 6 4 2]
Here, the slice numbers[::-1] selects all the elements of the array with a step size of -1, which
reverses the order of the elements.
A 2D NumPy array can be thought of as a matrix, where each element has two indices, row
index and column index.
To slice a 2D NumPy array, we can use the same syntax as for slicing a 1D NumPy array. The
only difference is that we need to specify a slice for each dimension of the array.
array[row_start:row_stop:row_step, col_start:col_stop:col_step]
Here,
row_start, row_stop, row_step - specify the starting index, stopping index, and step size for
the rows respectively
col_start, col_stop, col_step - specify the starting index, stopping index, and step size for
the columns respectively
Let's understand this with an example.
import numpy as np
# create a 2D array
array1 = np.array([[1, 3, 5, 7],
                   [9, 11, 13, 15]])
print(array1[:2, :2])
# Output
[[ 1 3]
[ 9 11]]
Here, the comma in [:2, :2] separates the row slice from the column slice.
The first :2 returns the first 2 rows, which here is the entire array1. The second :2 returns the
first 2 columns of those rows. This results in [1 3] from the first row and [9 11] from the second.
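import numpy as np
# create a 2D array
array1 = np.array([[1, 3, 5, 7],
                   [9, 11, 13, 15],
                   [2, 4, 6, 8]])
# slice the array to get the first two rows and columns
subarray1 = array1[:2, :2]
# slice the array to get the last two rows and columns
subarray2 = array1[1:3, 2:4]
# print the subarrays
print("First Two Rows and Columns:\n", subarray1)
print("Last two Rows and Columns:\n", subarray2)
Output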
First Two Rows and Columns:
[[ 1 3]
[ 9 11]]
Last two Rows and Columns:
[[13 15]
[ 6 8]]
Here,
array1[:2, :2] - slices array1 that starts at the first row and first column (default values),
and ends at the second row and second column (exclusive)
array1[1:3, 2:4] - slices array1 that starts at the second row and third column
(index 1 and 2), and ends at the third row and fourth column (index 2 and 3)
NumPy array functions are the built-in functions provided by NumPy that allow us to create
and manipulate arrays, and perform different operations on them.
There are many NumPy array functions available but here are some of the most commonly
used ones.
Array creation functions allow us to create new NumPy arrays. For example,
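import numpy as np
# create an array using np.array()
array1 = np.array([1, 3, 5])
print("np.array():\n", array1)
# create an array filled with zeros using np.zeros()
array2 = np.zeros((3, 3))
print("\nnp.zeros():\n", array2)
# create an array filled with ones using np.ones()
array3 = np.ones((2, 4))
print("\nnp.ones():\n", array3)
Output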
np.array():
[1 3 5]
np.zeros():
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
np.ones():
[[1. 1. 1. 1.]
[1. 1. 1. 1.]]
Here,
np.array() - creates an array from a Python list
np.zeros() - creates an array of the specified shape filled with zeros
np.ones() - creates an array of the specified shape filled with ones
NumPy Array Manipulation Functions
NumPy array manipulation functions allow us to modify or rearrange NumPy arrays. For
example,
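import numpy as np
# create a 1D array
array1 = np.array([1, 3, 5, 7, 9, 11])
# reshape the 1D array into a 2D array
array2 = np.reshape(array1, (2, 3))
# transpose the 2D array
array3 = np.transpose(array2)
print("Original array:\n", array1)
print("\nReshaped array:\n", array2)
print("\nTransposed array:\n", array3)
Output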
Original array:
[ 1  3  5  7  9 11]
Reshaped array:
[[ 1  3  5]
 [ 7  9 11]]
Transposed array:
[[ 1  7]
 [ 3  9]
 [ 5 11]]
In this example,
np.reshape(array1, (2, 3)) - reshapes array1 into 2D array with shape (2,3)
np.transpose(array2) - transposes 2D array array2
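The reshape and transpose calls themselves are missing from the listing above. A minimal sketch that
is consistent with the explanation and the printed output (the name array3 is an assumption) could
look like this:

```python
import numpy as np

# create a 1D array
array1 = np.array([1, 3, 5, 7, 9, 11])

# reshape the 1D array into a 2D array with 2 rows and 3 columns
array2 = np.reshape(array1, (2, 3))

# transpose the 2D array, turning rows into columns
array3 = np.transpose(array2)

print("Original array:")
print(array1)
print("Reshaped array:")
print(array2)
print("Transposed array:")
print(array3)
```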
NumPy Array Mathematical Functions
NumPy provides many mathematical functions that operate on arrays. For example,
import numpy as np
Output
Sum of arrays:
[ 5 11 19 29 41]
Difference of arrays:
[ -3  -7 -13 -21 -31]
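The input arrays for this example are not shown in these notes. The values below are inferred from
the printed sums and differences, and the use of np.add() and np.subtract() is an assumption (the +
and - operators give the same element-wise result):

```python
import numpy as np

# input arrays inferred from the output above
array1 = np.array([1, 2, 3, 4, 5])
array2 = np.array([4, 9, 16, 25, 36])

# element-wise sum and difference
sum_array = np.add(array1, array2)
diff_array = np.subtract(array1, array2)

print("Sum of arrays:")
print(sum_array)
print("Difference of arrays:")
print(diff_array)
```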
Matplotlib
Before working with Matplotlib and its plotting functions, it needs to be installed. The
installation method depends on the Python distribution that is installed on your computer. The
installation methods are as follows:
The easiest way to install Matplotlib is to download the Anaconda distribution of Python.
Matplotlib is pre-installed in the Anaconda distribution, so no further installation steps are
necessary.
o Visit the official site of Anaconda and click on the Download button.
o Choose the download according to your Python interpreter configuration.
Install Matplotlib using the Anaconda Prompt
Matplotlib can also be installed from the Anaconda Prompt. Open the Anaconda Prompt and type the
following command:
conda install matplotlib
The Python package manager pip can also be used to install Matplotlib. Open the command prompt
window and type the following command:
pip install matplotlib
To verify that Matplotlib is installed properly, print its version number from Python:
import matplotlib
print(matplotlib.__version__)
Output (the version number depends on your installation):
3.1.1
Pyplot
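Most of the Matplotlib utilities lie under the pyplot submodule, which is usually imported under
the plt alias:
import matplotlib.pyplot as plt
The plot() function then draws the points passed to it, and show() displays the figure (here
xpoints and ypoints stand for arrays of x and y coordinates):
plt.plot(xpoints, ypoints)
plt.show()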
Plotting x and y points
If we need to plot a line from (1, 3) to (8, 10), we have to pass two arrays [1, 8] and [3, 10] to
the plot function.
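As a minimal sketch of the line described above (the return value is stored in a variable named
lines only so that it can be inspected; that name is not part of the original notes):

```python
import matplotlib.pyplot as plt
import numpy as np

# x-coordinates and y-coordinates of the two end points (1, 3) and (8, 10)
xpoints = np.array([1, 8])
ypoints = np.array([3, 10])

# plot() returns a list with one Line2D object per plotted line
lines = plt.plot(xpoints, ypoints)
plt.show()
```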
To plot only the markers, without a connecting line, pass the shortcut string notation 'o' (which
means 'rings') as the third argument of plot().
Example
Draw two points in the diagram, one at position (1, 3) and one in position (8, 10):
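A short sketch of this example, assuming the same two points as before; the 'o' format string
draws a ring at each point and no line between them:

```python
import matplotlib.pyplot as plt
import numpy as np

xpoints = np.array([1, 8])
ypoints = np.array([3, 10])

# the third argument 'o' selects ring markers and suppresses the line
lines = plt.plot(xpoints, ypoints, 'o')
plt.show()
```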
Multiple Points
You can plot as many points as you like; just make sure you have the same number of points
on both axes.
Example
Draw a line in a diagram from position (1, 3) to (2, 8) then to (6, 1) and finally to position (8,
10):
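The four points named above can be sketched as two arrays of equal length, one for the
x-coordinates and one for the y-coordinates:

```python
import matplotlib.pyplot as plt
import numpy as np

# (1, 3) -> (2, 8) -> (6, 1) -> (8, 10)
xpoints = np.array([1, 2, 6, 8])
ypoints = np.array([3, 8, 1, 10])

lines = plt.plot(xpoints, ypoints)
plt.show()
```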
Default X-Points
If we do not specify the points on the x-axis, they will get the default values 0, 1, 2, 3 etc.,
depending on the length of the y-points.
So, if we take the same example as above, and leave out the x-points, the diagram will look
like this:
Example
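A minimal sketch of plotting without x-points, assuming the same four y-values as in the previous
example. Matplotlib then uses the positions 0, 1, 2, 3 on the x-axis:

```python
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

# only y-values are given, so the x-values default to 0, 1, 2, 3
lines = plt.plot(ypoints)
plt.show()
```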
Markers
You can use the keyword argument marker to emphasize each point with a specified marker:
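For instance, a sketch that marks each point with a circle while keeping the connecting line (the
y-values are the ones used in the earlier examples):

```python
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

# marker='o' draws a circle at every point; the line itself is still drawn
lines = plt.plot(ypoints, marker='o')
plt.show()
```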
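Shorter Syntax
The keyword argument linestyle can be shortened to ls, and each style name has a short form;
for example, 'dotted' can be written as ':'.
Example
Shorter syntax:
plt.plot(ypoints, ls = ':')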
Line Styles
Style        Short form
'dotted'     ':'
'dashed'     '--'
'dashdot'    '-.'
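As a sketch of how the style names and their short forms are used (the y-values are illustrative),
the first line below uses the full keyword and name, the second the short forms:

```python
import matplotlib.pyplot as plt
import numpy as np

y1 = np.array([3, 8, 1, 10])
y2 = np.array([6, 2, 7, 11])

# full keyword and style name
dotted_line = plt.plot(y1, linestyle='dotted')[0]

# short keyword and short form of 'dashed'
dashed_line = plt.plot(y2, ls='--')[0]

plt.show()
```

Matplotlib stores the style in its short form, so 'dotted' and ':' describe the same line style.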
Line Color
You can use the keyword argument color or the shorter c to set the color of the line:
Example
Set the line color with a hexadecimal color value:
...
plt.plot(ypoints, c = '#4CAF50')
...
Example
Set the line color with a color name:
...
plt.plot(ypoints, c = 'hotpink')
...
Line Width
You can use the keyword argument linewidth or the shorter lw to change the width of the line.
Example
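A minimal sketch of both spellings, using illustrative y-values; the width is given in points:

```python
import matplotlib.pyplot as plt
import numpy as np

y1 = np.array([3, 8, 1, 10])
y2 = np.array([6, 2, 7, 11])

# a very wide line, using the full keyword
wide_line = plt.plot(y1, linewidth=20.5)[0]

# a thinner line, using the shorter lw keyword
thin_line = plt.plot(y2, lw=2)[0]

plt.show()
```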
Bar Width
The bar() function takes the keyword argument width to set the width of the bars:
Example
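A short sketch with very thin vertical bars. The category names and values are hypothetical; the
default width is 0.8:

```python
import matplotlib.pyplot as plt
import numpy as np

# hypothetical categories and values
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

# width=0.1 makes the bars much thinner than the default of 0.8
bars = plt.bar(x, y, width=0.1)
plt.show()
```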
Bar Height
The barh() function draws horizontal bars and takes the keyword argument height to set the height
(thickness) of the bars:
Example
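The horizontal counterpart can be sketched the same way, again with hypothetical categories and
values. For barh() the thickness of a bar is its height, and the value is shown as its width:

```python
import matplotlib.pyplot as plt
import numpy as np

# hypothetical categories and values
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

# height=0.1 makes the horizontal bars much thinner than the default of 0.8
bars = plt.barh(x, y, height=0.1)
plt.show()
```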
Histogram
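A histogram shows a frequency distribution: the number of observations that fall within each
interval.
Example: Say you ask for the height of 250 people. A histogram of the answers lets you read off
approximately how many people fall into each height range.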
Create Histogram
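The hist() function uses an array of numbers to create a histogram; the array is passed to the
function as an argument.
Example
A normal data distribution generated with NumPy (the mean of 170 and the standard deviation of 10
are illustrative values):
import numpy as np
# 250 random values from a normal distribution (assumed mean 170, standard deviation 10)
x = np.random.normal(170, 10, 250)
print(x)
This generates a random result, so the printed values differ from run to run.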
The hist() function will read the array and produce a histogram:
Example
A simple histogram:
plt.hist(x)
plt.show()
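Put together, a self-contained sketch of the histogram example could look like this. The mean of
170 and standard deviation of 10 are assumed values for the 250 heights, and the data is random,
so the exact shape changes on every run:

```python
import matplotlib.pyplot as plt
import numpy as np

# 250 random heights (assumed mean 170, standard deviation 10)
x = np.random.normal(170, 10, 250)

# hist() returns the count per bin, the bin edges, and the drawn bars
counts, bin_edges, bars = plt.hist(x)
plt.show()

# every value falls into exactly one of the default 10 bins
print(counts.sum())
```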
Creating Pie Charts
With Pyplot, you can use the pie() function to draw pie charts:
import matplotlib.pyplot as plt
y = [35, 25, 25, 15]
plt.pie(y)
plt.show()
As you can see the pie chart draws one piece (called a wedge) for each value in the array (in
this case [35, 25, 25, 15]).
By default the plotting of the first wedge starts from the x-axis and moves counterclockwise.
Note: The size of each wedge is determined by comparing the value with all the other values:
each value is divided by the sum of all values, x/sum(x).
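As a worked sketch of how the wedge sizes follow from the values, the share of each wedge can be
computed by hand and compared with the angles of the wedges that pie() draws (the variable names
fractions and angles are only illustrative):

```python
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])

# share of each wedge: the value divided by the sum of all values
fractions = y / y.sum()
print(fractions)

# pie() returns the wedge patches first; each wedge spans theta1..theta2 in degrees
wedges = plt.pie(y)[0]
angles = [w.theta2 - w.theta1 for w in wedges]
print(angles)

plt.show()
```

Each angle is the wedge's share of the full 360 degrees, up to floating-point rounding.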
Labels
The labels parameter must be an array with one label for each wedge:
Example
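A minimal sketch of a labeled pie chart. The label "Apples" appears later in these notes; the
other three labels are hypothetical:

```python
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
# one label per wedge (only "Apples" is taken from the notes)
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

wedges, texts = plt.pie(y, labels=mylabels)
plt.show()
```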
Start Angle
As mentioned the default start angle is at the x-axis, but you can change the start angle by
specifying a startangle parameter.
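For example, a sketch that starts the first wedge at 90 degrees, which is the top of the circle
instead of the x-axis (the value 90 is chosen only for illustration):

```python
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])

# startangle is given in degrees, measured counterclockwise from the x-axis
wedges = plt.pie(y, startangle=90)[0]
plt.show()
```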
Explode
Maybe you want one of the wedges to stand out? The explode parameter allows you to do that.
The explode parameter, if specified, and not None, must be an array with one value for each
wedge.
Each value represents how far from the center each wedge is displayed:
Example
Pull the "Apples" wedge 0.2 from the center of the pie:
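A sketch of this example, assuming "Apples" is the first wedge; the remaining labels are
hypothetical. The explode values are fractions of the pie radius:

```python
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
# one value per wedge: only the first wedge is moved away from the center
myexplode = [0.2, 0, 0, 0]

wedges, texts = plt.pie(y, labels=mylabels, explode=myexplode)
plt.show()
```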
Shadow
Add a shadow to the pie chart by setting the shadow parameter to True:
Example
Add a shadow:
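A short sketch, reusing the hypothetical fruit labels; note that the keyword is spelled shadow:

```python
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

# shadow=True draws a shadow beneath each wedge
wedges, texts = plt.pie(y, labels=mylabels, shadow=True)
plt.show()
```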
Colors
You can set the color of each wedge with the colors parameter.
The colors parameter, if specified, must be an array with one value for each wedge:
Each color can be given as a hexadecimal color value, a supported color name, or one of these
shortcuts:
'r' - Red
'g' - Green
'b' - Blue
'c' - Cyan
'm' - Magenta
'y' - Yellow
'k' - Black
'w' - White
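As an illustration, the sketch below mixes the three ways of naming a color. The particular colors
are arbitrary choices:

```python
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
# one color per wedge: a color name, another color name, a shortcut, a hexadecimal value
mycolors = ["black", "hotpink", "b", "#4CAF50"]

wedges = plt.pie(y, colors=mycolors)[0]
plt.show()
```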
Legend
To add a list of explanations for each wedge, use the legend() function:
Example
Add a legend:
To add a header to the legend, add the title parameter to the legend function.
Example
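Both steps can be sketched together: legend() picks up the wedge labels, and its title parameter
adds a header. The labels and the title text are hypothetical:

```python
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.pie(y, labels=mylabels)

# the legend lists one entry per labeled wedge; title adds a header above them
legend = plt.legend(title="Four Fruits:")
plt.show()
```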
Introduction to Pandas
Pandas is a powerful Python library for data manipulation and analysis. It provides flexible
data structures like Series and DataFrame, which are essential for data handling.
1. Series
A Series is a one-dimensional labeled array that can hold any data type.
Example:
import pandas as pd
# Creating a Series
data = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(data)
2. DataFrame
A DataFrame is a two-dimensional labeled data structure whose columns can hold different data
types.
Example:
# Creating a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [24, 27, 22],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
print(df)
3. Panel
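A Panel was a three-dimensional container for data. It was deprecated in pandas 0.20.0 and removed
in pandas 0.25.0, so pd.Panel is not available in current versions.
Example: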
# Creating a Panel (not recommended for new code)
import numpy as np
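The Panel listing above is incomplete, and pd.Panel no longer exists in current pandas, so it
cannot be run on a modern installation. As a substitute (not the original example), three-dimensional
data can be sketched as a DataFrame with a two-level row index. The item names, column names, and
values below are all hypothetical:

```python
import numpy as np
import pandas as pd

# hypothetical 3D data: 2 items, each with 3 rows and 2 columns
values = np.arange(12).reshape(2, 3, 2)
items = ["item1", "item2"]

# one DataFrame per item, stacked under a two-level (item, row) index
frames = {name: pd.DataFrame(values[i], columns=["A", "B"])
          for i, name in enumerate(items)}
stacked = pd.concat(frames, names=["item", "row"])
print(stacked)

# select one item, much like selecting one item of a Panel
print(stacked.loc["item2"])
```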
4. Basic Functions
Example:
# Basic DataFrame functions
print(df.head())      # First 5 rows
print(df.describe())  # Descriptive statistics of the numeric columns
df.info()             # Prints a summary itself and returns None, so no print() is needed
5. Descriptive Statistics
Example:
# Descriptive statistics
print(df['Age'].mean())  # Mean of Age
print(df['Age'].std())   # Standard deviation of Age
print(df.describe())     # Summary statistics for the numeric columns
6. Iterating DataFrames
Example:
# Iterating through rows
for index, row in df.iterrows():
    print(f"Index: {index}, Name: {row['Name']}, Age: {row['Age']}")
7. Statistical Functions
Example:
# Statistical functions
print(df['Age'].min()) # Minimum Age
print(df['Age'].max()) # Maximum Age
print(df['Age'].sum()) # Sum of Ages
8. Aggregations
Example:
# Aggregation using groupby
grouped = df.groupby('City').agg({'Age': ['mean', 'max', 'min']})
print(grouped)
9. Visualization
Pandas integrates with libraries like Matplotlib and Plotly for data visualization.
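A minimal sketch of the Matplotlib integration, reusing the DataFrame from the earlier examples.
DataFrame.plot() draws with Matplotlib by default; the choice of a bar chart of Age per Name is an
assumption made for illustration:

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [24, 27, 22],
    'City': ['New York', 'Los Angeles', 'Chicago']
})

# plot() returns the Matplotlib Axes, which can be customized further
ax = df.plot(x='Name', y='Age', kind='bar', legend=False)
ax.set_ylabel('Age')
plt.show()
```

With the Plotly library installed, plotly.express.bar(df, x='Name', y='Age') should give a
comparable interactive chart.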