WS 1.3 Python Data Science Toolbox

The document is a worksheet containing coding problems and outputs related to Python data science tools and analysis of various datasets. The problems cover topics like creating NumPy arrays from data, analyzing and visualizing datasets on CO2 emissions, human population growth, and more. The student is asked to write Python code to perform tasks like data wrangling, plotting, aggregation, and downsampling of time series data.

Name: Rodenas, Hannah Mae P.

APPLIED DATA SCIENCE

WORKSHEET 1.3: PYTHON DATA SCIENCE TOOLBOX

Write code in Jupyter as required by each problem. Copy the code and its output and paste them here.

1 Date:
Refer to the following line of code: np.array([True,1,2]) + np.array([3,4,False])
Which code chunk builds the exact same Python object?
A. np.array([True,1,2,3,4,False])
B. np.array([4,3,0]) + np.array([0,2,2])
C. np.array([1,1,2]) + np.array([3,4,-1])
D. np.array([0,1,2,3,4,5])
Answer: B
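One way to check the choice is to evaluate both expressions and compare them. This is a minimal sketch; the variable names `original` and `option_b` are introduced here for illustration only.

```python
import numpy as np

# Mixing booleans with integers upcasts the booleans (True -> 1, False -> 0),
# so both operands become integer arrays before the element-wise addition.
original = np.array([True, 1, 2]) + np.array([3, 4, False])
option_b = np.array([4, 3, 0]) + np.array([0, 2, 2])

print(original)
print(option_b)

# True when both arrays have the same shape and the same elements
print(np.array_equal(original, option_b))
```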

2 Date:
Create a list of lists. The individual lists should contain, in the correct order, the height (in inches), the weight (in pounds) and the age of the baseball players.

Heights: 74 74 72 72 73 69 69 71 76 71 73 73 74 74 69 70 73 75 78 79
Weights: 180 215 210 210 188 176 209 200 231 180 188 180 185 160 180 185 189 185 219 230
Ages: 23 35 31 36 36 30 31 36 31 28 24 27 24 27 28 35 28 23 23 26

Convert the list of lists into a NumPy array named np_baseball. Using NumPy functionality, convert the heights to meters and the weights to kilograms. Print the resulting array.
Code
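A possible approach is sketched below, not the worksheet's official solution. It assumes the conversion factors 0.0254 m per inch and 0.453592 kg per pound; the helper names `baseball` and `conversion` are illustrative.

```python
import numpy as np

heights = [74, 74, 72, 72, 73, 69, 69, 71, 76, 71,
           73, 73, 74, 74, 69, 70, 73, 75, 78, 79]
weights = [180, 215, 210, 210, 188, 176, 209, 200, 231, 180,
           188, 180, 185, 160, 180, 185, 189, 185, 219, 230]
ages = [23, 35, 31, 36, 36, 30, 31, 36, 31, 28,
        24, 27, 24, 27, 28, 35, 28, 23, 23, 26]

# One inner list per player, in the order [height_in, weight_lb, age]
baseball = [[h, w, a] for h, w, a in zip(heights, weights, ages)]
np_baseball = np.array(baseball, dtype=float)

# Assumed factors: inches -> meters, pounds -> kilograms, age left unchanged.
# The 1-D factor array is broadcast across every row of the 2-D array.
conversion = np.array([0.0254, 0.453592, 1.0])
np_baseball = np_baseball * conversion

print(np_baseball)
```

Multiplying by a three-element array relies on NumPy broadcasting, so each column is scaled by its own factor without an explicit loop.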

Output

3 Date:
Refer to the code in #2. Write code that determines the age of the 8th player. The output should be in the following form:
The 8th player is <age> years old.
Code
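A minimal sketch follows. It rebuilds the array from the data in #2 so the snippet runs on its own, and it skips the unit conversion because that step does not change the age column.

```python
import numpy as np

heights = [74, 74, 72, 72, 73, 69, 69, 71, 76, 71,
           73, 73, 74, 74, 69, 70, 73, 75, 78, 79]
weights = [180, 215, 210, 210, 188, 176, 209, 200, 231, 180,
           188, 180, 185, 160, 180, 185, 189, 185, 219, 230]
ages = [23, 35, 31, 36, 36, 30, 31, 36, 31, 28,
        24, 27, 24, 27, 28, 35, 28, 23, 23, 26]

# Columns: height, weight, age
np_baseball = np.column_stack((heights, weights, ages))

# Indexing is zero-based: row 7 is the 8th player, column 2 is the age.
# int() keeps the output free of a trailing ".0" if the array holds floats.
age_8th = int(np_baseball[7, 2])
print(f"The 8th player is {age_8th} years old.")
```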

Output

4 Date:
Refer to the code in #2. Print out the ages of the young players (those who are 25 years old and below).
Code
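Boolean indexing is one way to do this. The sketch below rebuilds the array from the data in #2 so it is self-contained; `ages_col` and `young_ages` are illustrative names.

```python
import numpy as np

heights = [74, 74, 72, 72, 73, 69, 69, 71, 76, 71,
           73, 73, 74, 74, 69, 70, 73, 75, 78, 79]
weights = [180, 215, 210, 210, 188, 176, 209, 200, 231, 180,
           188, 180, 185, 160, 180, 185, 189, 185, 219, 230]
ages = [23, 35, 31, 36, 36, 30, 31, 36, 31, 28,
        24, 27, 24, 27, 28, 35, 28, 23, 23, 26]

# Columns: height, weight, age
np_baseball = np.column_stack((heights, weights, ages))

# The comparison yields a boolean mask; indexing with it keeps only
# the ages where the mask is True (25 years old and below).
ages_col = np_baseball[:, 2]
young_ages = ages_col[ages_col <= 25]
print(young_ages)
```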

Output

5 Date:
Refer to the code in #2. Print out the average weight, median height and median age of the players. The output should be in the
following form:
Average Weight: <average weight>
Median Height: <median height>
Median Age: <median age>

Code
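A sketch using `np.mean` and `np.median` is given below. It assumes the statistics are reported in the original units (inches and pounds); if the converted array from #2 is used instead, the same calls apply and only the units of the results change.

```python
import numpy as np

heights = [74, 74, 72, 72, 73, 69, 69, 71, 76, 71,
           73, 73, 74, 74, 69, 70, 73, 75, 78, 79]
weights = [180, 215, 210, 210, 188, 176, 209, 200, 231, 180,
           188, 180, 185, 160, 180, 185, 189, 185, 219, 230]
ages = [23, 35, 31, 36, 36, 30, 31, 36, 31, 28,
        24, 27, 24, 27, 28, 35, 28, 23, 23, 26]

# Columns: height (in), weight (lb), age (assumption: original units kept)
np_baseball = np.column_stack((heights, weights, ages))

avg_weight = np.mean(np_baseball[:, 1])
med_height = np.median(np_baseball[:, 0])
med_age = np.median(np_baseball[:, 2])

print(f"Average Weight: {avg_weight}")
print(f"Median Height: {med_height}")
print(f"Median Age: {med_age}")
```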

Output

6 Date:

Create a line plot of CO2 emissions per person in the Philippines as a function of year. Make sure to add labels and a title to your
plot.
CO2 Emissions per country per year (tons per person)
country      2004   2005   2006   2007   2008   2009   2010   2011   2012   2013   2014
Brunei       13.9   13.7   13.1   22.5   24     20.5   21.1   24.6   24.2   19.2   22.1
Cambodia     0.187  0.209  0.223  0.253  0.281  0.33   0.35   0.358  0.369  0.373  0.438
Indonesia    1.51   1.51   1.5    1.61   1.76   1.87   1.77   2.46   2.56   1.95   1.82
Lao          0.246  0.244  0.265  0.153  0.156  0.204  0.262  0.256  0.265  0.243  0.297
Malaysia     6.51   6.8    6.41   6.94   7.53   7.2    7.77   7.7    7.5    7.96   8.03
Myanmar      0.259  0.239  0.263  0.262  0.198  0.205  0.25   0.283  0.217  0.25   0.417
Philippines  0.875  0.867  0.771  0.808  0.869  0.841  0.905  0.897  0.942  0.996  1.06
Singapore    6.52   6.76   6.68   4.21   7.45   11.3   11     8.74   6.9    10.4   10.3
Thailand     3.74   3.78   3.83   3.81   3.79   4      4.19   4.12   4.37   4.4    4.62
Vietnam      1.08   1.16   1.21   1.22   1.36   1.47   1.61   1.7    1.57   1.61   1.8

Code
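The sketch below plots the Philippines row of the table with Matplotlib. The label and title wording is a suggestion, and the marker style is an arbitrary choice.

```python
import matplotlib.pyplot as plt

# Years and the Philippines row, copied from the table above
years = list(range(2004, 2015))
co2_ph = [0.875, 0.867, 0.771, 0.808, 0.869, 0.841,
          0.905, 0.897, 0.942, 0.996, 1.06]

fig, ax = plt.subplots()
ax.plot(years, co2_ph, marker="o")

# Labels and title, as the problem requires
ax.set_xlabel("Year")
ax.set_ylabel("CO2 emissions (tons per person)")
ax.set_title("CO2 Emissions per Person in the Philippines, 2004-2014")

plt.show()
```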

Output

7 Date:
You're a professor in Data Analytics with Python, and you want to visually assess if longer answers on exam questions lead to
higher grades. Which plot should you use?
A. Box Plot
B. Histogram
C. Scatter Plot
Answer: C
What if you want to visually assess if the grades on your exam follow a particular distribution? Which of the choices above should you use?
Answer: B

8 Date:
Based on the plot, in approximately what year will there be more
than ten billion human beings on this planet?

2060

9 Date:

Which of the following conclusions can you derive from the plot?
A. The countries in blue, corresponding to Africa, have both low life expectancy and a low GDP per capita.
B. There is a negative correlation between GDP per capita and life expectancy.
C. China has both a lower GDP per capita and lower life expectancy compared to India.
Answer: A

10 Date:
Visualize Child Mortality as a function of GDP per Capita for some Southeast Asian countries. Use population as an additional argument. Do not forget to label the axes and to add a title.
Country      Fertility  Life Expectancy  Population  Child Mortality  GDP Per Capita
Philippines  3.151      68.207           93.2        31.9             5614
Thailand     1.443      73.852           69.1        14.5             12822
Singapore    1.261      81.788           50.9        2.8              72056
Vietnam      1.82       75.49            87.8        24.8             4486
Indonesia    2.434      70.185           239.9       33.1             8498
Malaysia     2.001      74.479           48.0        8.3              20398

Code
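One reading of "population as an additional argument" is to pass it as the marker size `s` of a scatter plot, which is what this sketch assumes. The size multiplier, the transparency, the logarithmic x-axis and the country annotations are illustrative choices, not requirements of the problem.

```python
import numpy as np
import matplotlib.pyplot as plt

# Values copied from the table above
countries = ["Philippines", "Thailand", "Singapore",
             "Vietnam", "Indonesia", "Malaysia"]
population = np.array([93.2, 69.1, 50.9, 87.8, 239.9, 48.0])
child_mortality = [31.9, 14.5, 2.8, 24.8, 33.1, 8.3]
gdp_per_capita = [5614, 12822, 72056, 4486, 8498, 20398]

fig, ax = plt.subplots()

# Marker area scales with population; the factor 5 is arbitrary
ax.scatter(gdp_per_capita, child_mortality, s=population * 5, alpha=0.6)

# Name each point so the bubbles can be told apart
for name, x, y in zip(countries, gdp_per_capita, child_mortality):
    ax.annotate(name, (x, y))

# GDP spans more than an order of magnitude, so a log scale spreads the points
ax.set_xscale("log")
ax.set_xlabel("GDP per Capita")
ax.set_ylabel("Child Mortality")
ax.set_title("Child Mortality vs. GDP per Capita in Southeast Asia")

plt.show()
```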

Output

11 Date:
Import cars.csv. Use the country abbreviations as index. Print the first three lines.
Code
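The worksheet does not show the contents of cars.csv, so the sketch below reads a small in-memory stand-in instead. The column names (`cars_per_cap`, `country`, `drives_right`) and all values in it are assumptions for illustration; with the real file, the filename would be passed to `pd.read_csv` in place of the `StringIO` object.

```python
import io
import pandas as pd

# Hypothetical stand-in for cars.csv: the first, unnamed column holds the
# country abbreviations. Column names and values are assumed, not sourced.
csv_text = """,cars_per_cap,country,drives_right
US,809,United States,True
AUS,731,Australia,False
JPN,588,Japan,False
IN,18,India,False
RU,200,Russia,True
MOR,70,Morocco,True
EG,45,Egypt,True
"""

# index_col=0 turns the first column (the abbreviations) into the row index
cars = pd.read_csv(io.StringIO(csv_text), index_col=0)

# head(3) returns the first three rows
print(cars.head(3))
```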

Output

12 Date:
Refer to the cars dataset. Create a code that prints out the country name and per capita value of cars in Japan, India and Russia.
Code
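Label-based selection with `.loc` is one way to pick rows and columns together. The DataFrame below is a hypothetical stand-in for the imported cars dataset; its column names, abbreviations and values are assumed for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the cars dataset, indexed by country abbreviation
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
        "country": ["United States", "Australia", "Japan", "India",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, False, True, True, True],
    },
    index=["US", "AUS", "JPN", "IN", "RU", "MOR", "EG"],
)

# First list selects rows by index label, second list selects columns by name
selection = cars.loc[["JPN", "IN", "RU"], ["country", "cars_per_cap"]]
print(selection)
```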

Output

13 Date:
Refer to the cars dataset. Create a code that prints out the observations for the countries with few cars (cars per capita less than
500).
Code
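Filtering with a boolean Series is a common way to select such observations. The DataFrame below is a hypothetical stand-in for the imported cars dataset, and the column name `cars_per_cap` is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for the cars dataset, indexed by country abbreviation
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
        "country": ["United States", "Australia", "Japan", "India",
                    "Russia", "Morocco", "Egypt"],
        "drives_right": [True, False, False, False, True, True, True],
    },
    index=["US", "AUS", "JPN", "IN", "RU", "MOR", "EG"],
)

# The comparison produces one True/False per row; indexing the DataFrame
# with it keeps only the rows where the condition holds
few_cars = cars[cars["cars_per_cap"] < 500]
print(few_cars)
```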

Output

14 Date:
Import weather_data_austin_2010.csv. Make sure to use a DatetimeIndex. Extract the Temperature column and save the result to temp0. Extract data from temp0 for a single hour – the hour from 9 pm to 10 pm on October 11, 2010. Assign the data to temp1.
Code
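The file itself is not reproduced in the worksheet, so this sketch parses a few hypothetical rows from memory. The `Date` column name, the date format and every value are assumptions; with the real file, the filename would replace the `StringIO` object.

```python
import io
import pandas as pd

# Hypothetical stand-in for weather_data_austin_2010.csv; the real file's
# column names, date format and values may differ.
csv_text = """Temperature,DewPoint,Pressure,Date
71.2,62.0,1.0,2010-10-11 20:00:00
69.8,61.5,1.0,2010-10-11 21:00:00
68.9,61.1,1.0,2010-10-11 22:00:00
68.1,60.8,1.0,2010-10-11 23:00:00
"""

# index_col + parse_dates=True parses the Date column into a DatetimeIndex
df = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)
temp0 = df["Temperature"]

# String slicing on a DatetimeIndex includes both endpoints
temp1 = temp0.loc["2010-10-11 21:00:00":"2010-10-11 22:00:00"]
print(temp1)
```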

Extract data from temp0 for a single day – August 27, 2010 and assign it to temp2.
Code
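Partial string indexing selects every row that falls on a given date. The sketch uses a synthetic hourly Series as a stand-in for temp0; its values are made up and serve only to make the snippet runnable.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for temp0: one made-up reading per hour of 2010
index = pd.date_range("2010-01-01 00:00", "2010-12-31 23:00", freq="h")
temp0 = pd.Series(np.linspace(40.0, 90.0, len(index)),
                  index=index, name="Temperature")

# A date-only string matches all timestamps on that day
temp2 = temp0.loc["2010-08-27"]
print(temp2)
```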

15 Date:
Resample temp0 from the previous problem to a 6-hour frequency, aggregating with the mean.
Code
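A sketch with `Series.resample` follows. It runs on a synthetic hourly stand-in for temp0 with made-up values, and `temp_6h` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for temp0: one made-up reading per hour of 2010
index = pd.date_range("2010-01-01 00:00", "2010-12-31 23:00", freq="h")
temp0 = pd.Series(np.linspace(40.0, 90.0, len(index)),
                  index=index, name="Temperature")

# Group the hourly readings into 6-hour bins and average each bin
temp_6h = temp0.resample("6h").mean()
print(temp_6h.head())
```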

Downsample temp0 to daily data and count the number of data points.
Code
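Downsampling to daily bins and counting can be sketched as below, again on a synthetic hourly stand-in for temp0 with made-up values. `daily_counts` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for temp0: one made-up reading per hour of 2010
index = pd.date_range("2010-01-01 00:00", "2010-12-31 23:00", freq="h")
temp0 = pd.Series(np.linspace(40.0, 90.0, len(index)),
                  index=index, name="Temperature")

# One bin per calendar day; count() reports the non-missing readings in each
daily_counts = temp0.resample("D").count()
print(daily_counts.head())
```

With complete hourly data, every day should report 24 readings, so a smaller count would point to missing observations.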
