Exam C1000-059 IBM AI Enterprise Workflow V1 Data Scientist Specialist

(Sample Questions)
1. To reduce the overall time to complete a data ingestion job, what
two actions should be taken?
A. Assemble the data pipeline into a series of immutable
transformations, which can be combined after the processing.
B. Partition the data within each pipeline to take advantage of parallel
processing (multiple server cores, processors, etc.).
C. Look for outliers in the data, missing values, and skewness of the
data.
D. Build a dedicated pipeline for each dataset to ensure that all of them
can be processed independently and concurrently.
E. Apply a chi-squared statistical test to rank the impact of each feature
on the concept label and discard the less impactful features before
model training.
2. A design thinking project at a large corporation is in progress
and most of the project activities involve conducting interviews
and the creation and review of photo journals. Which phase of
the design thinking process is currently being executed?
A. Empathize
B. Define
C. Ideate
D. Prototype
3. A client requests a general artificial intelligence (AI) tool that
they can plug into their data warehouse. What is the best
response to this request?
A. There is no general AI tool currently that works universally.
B. Apply neural networks to your data.
C. IBM Watson is the tool you are looking for.
D. AI can create value without any human-intervention.
4. What is a key advantage of a machine learning system over a
rule-based system for making business decisions?
A. Machine learning systems can be implemented by business users.
B. Machine learning systems generalize better than a rule-based
system.
C. Machine learning systems are always more accurate than
rule-based systems.
D. Rule-based systems can only deal with nominal and ordinal
categorical data, whereas machine learning systems can deal with all
types of data.
5. What is a class of machine learning problems where the
algorithm builds a mathematical model from a small amount of
labeled data with a large amount of unlabeled data?
A. semi-supervised learning
B. partially labeled learning
C. nearest-neighbor clustering
D. imperfect knowledge clustering
6. What should be the first step to begin the task of collecting
initial data?
A. Copy data from several sources to a central repository to review the
data
B. Determine if a poll is required to collect data
C. Verify the technical skills that are required to collect data
D. Understand the business requirement to find out what would be the
relevant data needed
7. What are two common ways to handle missing values when
cleaning data?
A. delete records
B. replace with '1'
C. replace with mean
D. replace with '100'
E. replace with standard deviation
8. A client, a tomato grower, provides a dataset of measurements
of tomato plants and environmental data. A data scientist thinks
the features probably have a significant amount of redundancy.
The data scientist decides to apply dimensionality reduction to
the data features.
Which three techniques are examples of dimensionality
reduction?
A. k-means clustering
B. batch normalization
C. combinatorial optimization
D. autoencoder neural network
E. principal component analysis (PCA)
F. t-distributed stochastic neighbor embedding (t-SNE)
9. Which is an accurate statement regarding logistic regression?
A. Logistic regression is a non-linear classifier.
B. Logistic regression can be used for unsupervised learning.
C. Logistic regression can be used for binary classification.
D. The logistic function f(x) = 1/(1 + exp(-(wx + b))) can take values
between [0, inf].
10. What are three hyperparameters that are used when building a
simple decision tree model?
A. kernel
B. learning rate
C. maximum depth
D. split criterion
E. number of nearest neighbors
F. minimum number of samples in a leaf node
11. What is used to update coefficients in logistic regression?
A. number of features
B. kernel
C. slope
D. gradient descent
12. Which two statements are true in the context of evaluating
machine learning models?
A. Accuracy of 95% is always a good result.
B. Random guessing can be used as a baseline.
C. The F2-score puts equal weight on precision and recall.
D. F-score is the harmonic mean between precision and recall.
E. Evaluation metrics on training data are more important than on test
data.
13. What is the main benefit of adjusted R-squared compared to
R-squared?
A. all samples are considered in the formula
B. the number of features is considered in the formula
C. the average R-squared is calculated
D. train and test split is respected
14. Which model evaluation metric is best suited for imbalanced
data sets?
A. precision-recall curve
B. ROC curve
C. misclassification curve
D. lift curve
15. Which IBM offering enables data scientists to deploy their
trained machine learning models to production in a scalable
environment?
A. Watson Machine Learning
B. Watson Studio
C. Watson Knowledge Catalog
D. Watson OpenScale
16. Which Python function would allow a data analyst to convert
strings of dates (such as "10 June 1964") into struct_time
objects to be used for further data cleansing?
A. import datetime.strptime()
B. import timobj.str2obj()
C. import calendar.object()
D. import time.toString()
17. How is the "aperture problem" in machine vision best defined?
A. Identifying a whole object or scene based on seeing only a small
part of that object or scene
B. generating "snakes" of active contours based on boundary curves
C. pattern matching based on an undertrained model
D. over-fitting a model based on close-up images
18. What is an example of a relation type that can be detected with
Watson Natural Language Understanding?
A. partOf
B. describedBy
C. assistant
D. during
Answer Key:
1. BD
2. A
3. A
4. B
5. A
6. D
7. AC
8. DEF
9. C
10. CDF
11. D
12. BD
13. B
14. A
15. A
16. A
17. A
18. A
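
Note on question 16: the sketch below parses the date string quoted in the question using only the Python standard library. In the standard library, `time.strptime` returns a `struct_time` directly, while `datetime.datetime.strptime` returns a `datetime` object whose `timetuple()` method yields a `struct_time`. The format string `"%d %B %Y"` is an assumption chosen to match "10 June 1964", and it relies on an English locale for the month name.

```python
import time
from datetime import datetime

DATE_TEXT = "10 June 1964"
DATE_FORMAT = "%d %B %Y"  # assumed format: day, full month name, 4-digit year

# time.strptime parses the string directly into a struct_time.
parsed_struct = time.strptime(DATE_TEXT, DATE_FORMAT)

# datetime.strptime parses into a datetime object;
# timetuple() then converts that datetime into a struct_time.
parsed_datetime = datetime.strptime(DATE_TEXT, DATE_FORMAT)
converted_struct = parsed_datetime.timetuple()

print(type(parsed_struct).__name__, parsed_struct.tm_year,
      parsed_struct.tm_mon, parsed_struct.tm_mday)
print(type(parsed_datetime).__name__, parsed_datetime.date())
```

Either route gives fields such as `tm_year`, `tm_mon`, and `tm_mday` that can be used in later cleansing steps.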
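
Note on question 12: the following is a minimal sketch of the F-beta score, showing why the F-score (beta = 1) is the harmonic mean of precision and recall while F2 weights recall more heavily. The helper name `f_beta` and the precision and recall values are illustrative assumptions, not taken from the exam.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta score)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    beta_sq = beta ** 2
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Illustrative values only (hypothetical classifier results).
precision, recall = 0.5, 1.0

f1 = f_beta(precision, recall)          # beta=1: plain harmonic mean, 2PR / (P + R)
f2 = f_beta(precision, recall, beta=2)  # beta=2: recall counts more than precision

print(f"F1={f1:.3f}  F2={f2:.3f}")
```

With recall higher than precision, F2 comes out above F1, which is the sense in which F2 does not weight the two equally.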
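
Note on question 9: the logistic function quoted in option D can be evaluated directly to check its range. This is a small sketch under assumed values: the defaults `w = 1.0` and `b = 0.0` and the sample inputs are chosen only for illustration.

```python
import math


def logistic(x: float, w: float = 1.0, b: float = 0.0) -> float:
    """Logistic function f(x) = 1 / (1 + exp(-(w*x + b)))."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


# Sample inputs (illustrative); outputs stay strictly between 0 and 1.
sample_inputs = [-10.0, -1.0, 0.0, 1.0, 10.0]
sample_outputs = [logistic(x) for x in sample_inputs]

for x, y in zip(sample_inputs, sample_outputs):
    print(f"f({x:+.1f}) = {y:.5f}")
```

The outputs approach 0 and 1 at the extremes without leaving that interval, which is why the function suits binary classification probabilities.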
