Quiz Python Effective Programming Week 2


1. What is a feature (in the context of data)?

a. A feature is the style of the table
b. A feature is a column of the data
c. A feature is the number of rows in the data
d. A feature is a synonym for “observation”

2. What is an observation (in the context of data)?

a. An observation is the index of the data
b. An observation is the evidence we see in the data
c. An observation is a row of the data
d. An observation is the number of columns in the data

3. Why do we need to call “pd.to_datetime”?

a. To convert Pandas datetime format to Excel datetime format
b. To modify the date so that the day comes after the month, and the month after the year
c. To remove and renew the datetime
d. To convert Excel datetime format to Pandas datetime format

4. What is required to select the data for a specific well name in the dataframe?
a. Creating a mask from the well name, then applying it to the dataframe
b. Creating a mask by deleting unwanted columns, then filtering by well name
c. Manually deleting the unwanted rows, then applying the result to the dataframe
d. It can only be done in Excel
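
For reference, a minimal sketch of what a boolean mask looks like in Pandas. The column name WELL_NAME, the placeholder wells "WELL-A" and "WELL-B", and all volumes are assumptions made up for illustration; they are not taken from the course dataset.

```python
import pandas as pd

# Toy data: column name WELL_NAME and all values are hypothetical.
df = pd.DataFrame({
    "WELL_NAME": ["15/9-F-14", "WELL-A", "15/9-F-14", "WELL-B"],
    "BORE_GAS_VOL": [100.0, 250.0, 120.0, 80.0],
})

# Boolean mask: True where the row belongs to the chosen well.
mask = df["WELL_NAME"] == "15/9-F-14"

# Indexing the dataframe with the mask keeps only the True rows.
selected = df[mask]
print(selected)
```

The mask is an ordinary boolean Series, so it can be inspected or combined with other conditions before it is used for indexing.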

5. What is the Seaborn library used for?

a. To plot time series data, for example, the production data
b. To plot a “pairplot” of features in data and identify correlation
c. To plot a “pairplot” of more than one dataset
d. To visualize how many rows have non-numeric values

6. How do you control the width of the bars in a histogram?

a. Adjusting the “barwidth” parameter in the histogram function
b. Increasing the bandwidth of the data
c. Adjusting the “bins” parameter in the histogram function
d. Dividing data into several bins

7. What is the Missingno (“msno”) library used for?

a. To visualize how many observations have non-numeric values
b. To visualize time series data, for example, production data
c. To calculate how many observations have numeric values
d. To separate non-numeric from numeric values in data

8. What is NOT a method of handling missing values?


a. Interpolation
b. Drop observations with NaN values
c. Replace NaN values with a specific value
d. Curve fitting

9. In our example of well 15/9-F-14 production data, which feature contains the most missing values?
a. Average annulus pressure (AVG_ANNULUS_PRESS)
b. Average downhole pressure (AVG_DOWNHOLE_PRESSURE)
c. Borehole gas volume (BORE_GAS_VOL)
d. Borehole water injected volume (BORE_WI_VOL)

10. If the date from Excel is written as “19-Mar-97”, what is the appropriate Pandas datetime format?

a. “%m-%d-%y”
b. “%M-%d-%y”
c. “%d-%B-%Y”
d. “%d-%b-%y”
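
One way to check the options is to try each candidate format on the sample string. The sketch below relies on pd.to_datetime accepting the same strftime-style directives as Python's datetime.strptime, so it uses only the standard library. The helper name try_formats is hypothetical, and month names are assumed to be parsed in an English locale.

```python
from datetime import datetime

# The four candidate formats from the question.
candidates = ["%m-%d-%y", "%M-%d-%y", "%d-%B-%Y", "%d-%b-%y"]


def try_formats(text, formats):
    """Return {format: parsed datetime, or None if parsing fails}."""
    results = {}
    for fmt in formats:
        try:
            results[fmt] = datetime.strptime(text, fmt)
        except ValueError:
            # The string does not match this format.
            results[fmt] = None
    return results


results = try_formats("19-Mar-97", candidates)
for fmt, parsed in results.items():
    print(fmt, "->", parsed)
```

A format that parses the string without raising ValueError is a candidate for the format argument of pd.to_datetime.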

11. What information is NOT contained in summary statistics? (More than one answer)
a. Minimum and maximum value
b. Mean
c. Median
d. 25th, 50th, and 75th percentiles
e. 15th, 50th, and 95th percentiles
f. Variance
g. Standard deviation
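
To see what a summary contains, the entries can be listed directly. This sketch assumes that “summary statistics” here means the output of the Pandas describe() method with default arguments, which the quiz does not state; the values are made up.

```python
import pandas as pd

# Made-up numeric values, for illustration only.
values = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])

# describe() with default arguments returns a Series of summary statistics.
summary = values.describe()
print(summary)

# The index labels name each statistic that is reported.
print(list(summary.index))
```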

12. These are activities done in data analysis, EXCEPT … (More than one answer)
a. Data pre-processing that includes handling missing values
b. Statistical analysis using summary statistics
c. Interpolation to see trends in our data
d. Loading data from a source (website, database, etc.)
e. Analyzing data distribution using histogram
f. Machine learning
