Sample Questions For Beginners in Data Science
5𝑥 + 2𝑦 + 𝑧 = 5
2𝑥 − 4𝑦 − 3𝑧 = 6
𝑥 + 𝑦 − 2𝑧 = −5
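A hand solution of the three equations above can be checked numerically. The sketch below is one possible check, not a prescribed method: it applies Cramer's rule with exact fractions, and the helper names `det3` and `solve_cramer` are illustrative choices rather than anything defined in the question paper.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve_cramer(A, rhs):
    """Solve the 3x3 system A x = rhs by Cramer's rule (needs det(A) != 0)."""
    d = det3(A)
    if d == 0:
        raise ValueError("coefficient matrix is singular")
    solution = []
    for col in range(3):
        # Replace column `col` of A with the right-hand side.
        Ai = [[rhs[r] if c == col else A[r][c] for c in range(3)]
              for r in range(3)]
        solution.append(Fraction(det3(Ai), d))
    return solution


# Coefficients and right-hand side copied from the three equations.
A = [[5, 2, 1], [2, -4, -3], [1, 1, -2]]
rhs = [5, 6, -5]

x, y, z = solve_cramer(A, rhs)
print(x, y, z)

# Substitute back into each equation to confirm the residuals are zero.
residuals = [sum(a * v for a, v in zip(row, (x, y, z))) - b
             for row, b in zip(A, rhs)]
print(residuals)
```

Exact fractions avoid rounding noise, so the substituted residuals come out as exact zeros when the solution is right.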
2. Prove that real symmetric matrices have real eigenvalues.
5. Assume 𝑅² to be the vector space that spans the xy-plane. A line 𝑙 ⊂ 𝑅² with slope 𝑚 and
y-intercept 𝑏 is defined as:
𝑙 = {(𝑥, 𝑦) ∈ 𝑅² | 𝑦 = 𝑚𝑥 + 𝑏}
Prove that 𝑙 is a subspace of 𝑅² if and only if 𝑏 = 0.
𝑣1 = [1, 1]ᵀ and 𝑣2 = [1, −1]ᵀ
1. Given that the variable 𝑦[𝑘] follows a Poisson distribution, estimate the mean of the variable,
assumed to be 𝑐, using the maximum likelihood estimator.
$p(y[k] \mid \theta) = \dfrac{e^{-\theta}\, \theta^{y[k]}}{y[k]!}$
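The estimator asked for above can be sketched as a short derivation. It assumes N independent observations y[1], ..., y[N] (the sample size N is not stated in the question) and writes the mean as θ, as in the distribution above.

```latex
\begin{align*}
L(\theta) &= \prod_{k=1}^{N} \frac{e^{-\theta}\,\theta^{y[k]}}{y[k]!}, \\
\ln L(\theta) &= -N\theta + \ln\theta \sum_{k=1}^{N} y[k]
                 - \sum_{k=1}^{N} \ln\bigl(y[k]!\bigr), \\
\frac{d \ln L}{d\theta} &= -N + \frac{1}{\theta}\sum_{k=1}^{N} y[k] = 0
\;\Longrightarrow\;
\hat{\theta} = \frac{1}{N}\sum_{k=1}^{N} y[k].
\end{align*}
```

Under these assumptions the maximum likelihood estimate of the mean is the sample average.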
2. Formulate a model predictive controller with the following objective function. The process
model is also given below.
$\mathcal{O} = \sum_{i=1}^{P} (y[k+i] - y_{sp})^2 + \lambda \sum_{i=0}^{M-1} u^2[k+i]$
$y[k] = 0.5\,y[k-1] + 3\,u[k-1] - 2\,u[k-2] \quad \forall\, k > 0$
3. Use the following tuning parameters and the previous measurements. The set point for the
system is $y_{sp} = 5$.
Parameter   Value     Measurement   Value
𝑃           3         𝑦[𝑘 − 1]      2
𝑀           2         𝑢[𝑘 − 1]      2
𝜆           2         𝑢[𝑘 − 2]      1.8
4. Assume that IIT Madras graduates 200 PG students every year. One of the TAs of this
course has conducted a survey regarding two events, and the data is given below. The first
event is whether or not a student enrolled in the CH5019 course. The second event is whether
or not a student was placed in a data science based company.
                          Placement detail
                      Placed   Not placed   Total
Enrolled in CH5019    100      30           130
Not enrolled          20       50           70
Total                 120      80           200
a) If a student has enrolled in the CH5019 course, what is the probability that the student
gets placed in a data science based company?
b) If a student has been placed in a data science based company, what is the probability that
the student did not enroll in the CH5019 course?
c) Comment on the independence of the two events.
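The conditional probabilities in parts a) to c) can be read straight off the survey counts. The following is a minimal sketch of that arithmetic with exact fractions; the dictionary layout and the helper `count_where` are illustrative assumptions, not part of the question.

```python
from fractions import Fraction

# Cell counts copied from the survey table (200 students in total).
counts = {
    ("enrolled", "placed"): 100,
    ("enrolled", "not placed"): 30,
    ("not enrolled", "placed"): 20,
    ("not enrolled", "not placed"): 50,
}
total = sum(counts.values())


def count_where(enrolment=None, placement=None):
    """Sum the cells that match the given row and/or column label."""
    return sum(
        n for (e, p), n in counts.items()
        if (enrolment is None or e == enrolment)
        and (placement is None or p == placement)
    )


# a) P(placed | enrolled)
p_placed_given_enrolled = Fraction(
    count_where("enrolled", "placed"), count_where(enrolment="enrolled"))

# b) P(not enrolled | placed)
p_not_enrolled_given_placed = Fraction(
    count_where("not enrolled", "placed"), count_where(placement="placed"))

# c) The events are independent only if P(A and B) == P(A) * P(B).
p_enrolled = Fraction(count_where(enrolment="enrolled"), total)
p_placed = Fraction(count_where(placement="placed"), total)
p_joint = Fraction(count_where("enrolled", "placed"), total)
independent = p_joint == p_enrolled * p_placed

print(p_placed_given_enrolled, p_not_enrolled_given_placed)
print(p_joint, p_enrolled * p_placed, independent)
```

Comparing the joint probability with the product of the marginals is one standard way to comment on independence; comparing P(placed | enrolled) with P(placed) would serve equally well.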
5. Let X be a continuous random variable that denotes the scores of students attempting this
question paper. It can take values between 0 and 100, and the probability density function of
X is given by f(x) = 0.000003x². Determine the mean, variance and standard deviation of X.
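Because the density is a single power of x, every moment has a closed form, which makes a hand calculation easy to check. The sketch below assumes the density is zero outside [0, 100] and uses the antiderivative of x^(n+2); the names `COEFF` and `moment` are illustrative.

```python
from fractions import Fraction
from math import sqrt

COEFF = Fraction(3, 1_000_000)  # 0.000003, the coefficient of x**2
LOW, HIGH = 0, 100              # support of X, assumed zero elsewhere


def moment(n):
    """E[X**n]: integral of x**n * COEFF * x**2 over [LOW, HIGH]."""
    p = n + 3  # exponent after integrating x**(n + 2)
    return COEFF * Fraction(HIGH**p - LOW**p, p)


total_probability = moment(0)        # should be 1 for a valid density
mean = moment(1)
variance = moment(2) - mean**2       # Var(X) = E[X^2] - (E[X])^2
std_dev = sqrt(variance)

print(total_probability, mean, variance, std_dev)
```

Checking that the zeroth moment equals 1 confirms that the stated function is a valid density on the stated range before the other moments are trusted.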
6. A box contains toys of 2 different shapes, spheres and cuboids, in 3 different colours:
red, white and black. There are 3 red, 4 white and 5 black spheres, and 8 red, 6 white and 3
black cuboids. A toy is chosen at random.
a) If the toy is found to be a cuboid, what is the probability that it is black?
b) If the toy is found to be red, what is the probability that it is a cuboid?
c) If the toy is found to be a sphere, what is the probability that it is not red?
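Each part of this question conditions on one attribute and asks about the other, so the answers are ratios of counts. A minimal sketch of that bookkeeping follows; the nested dictionary and variable names are illustrative assumptions.

```python
from fractions import Fraction

# Counts copied from the question: shape -> colour -> number of toys.
toys = {
    "sphere": {"red": 3, "white": 4, "black": 5},
    "cuboid": {"red": 8, "white": 6, "black": 3},
}

# a) P(black | cuboid): restrict to cuboids, then count the black ones.
n_cuboids = sum(toys["cuboid"].values())
p_black_given_cuboid = Fraction(toys["cuboid"]["black"], n_cuboids)

# b) P(cuboid | red): restrict to red toys, then count the cuboids.
n_red = sum(colours["red"] for colours in toys.values())
p_cuboid_given_red = Fraction(toys["cuboid"]["red"], n_red)

# c) P(not red | sphere): restrict to spheres, then count the non-red ones.
n_spheres = sum(toys["sphere"].values())
p_not_red_given_sphere = Fraction(n_spheres - toys["sphere"]["red"], n_spheres)

print(p_black_given_cuboid, p_cuboid_given_red, p_not_red_given_sphere)
```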
7. Let 𝑋1, 𝑋2, 𝑋3, …, 𝑋𝑛 be a random sample from a normal distribution with mean 𝜇
and variance 𝜎². What are the maximum likelihood estimators of 𝜇 and 𝜎²?
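For a normal sample, the well-known maximum likelihood estimators are the sample mean and the average squared deviation from it (dividing by n, not n − 1). The sketch below evaluates them on a small sample; the data values are hypothetical and chosen only for illustration, and the function name `normal_mle` is an assumption.

```python
def normal_mle(sample):
    """Return (mu_hat, sigma2_hat), the ML estimates for a normal sample."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    mu_hat = sum(sample) / n
    # The ML estimator divides by n, so it differs from the unbiased
    # sample variance, which divides by n - 1.
    sigma2_hat = sum((x - mu_hat) ** 2 for x in sample) / n
    return mu_hat, sigma2_hat


# Hypothetical data, not taken from the question paper.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, sigma2_hat = normal_mle(sample)
print(mu_hat, sigma2_hat)
```

The question itself asks for the derivation; the code only illustrates what the resulting formulas compute.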
1. The demand for coffee and tea, in kilograms, in each of the last four weeks is shown
below.
Week      1    2    3    4    5
Coffee    23   27   34   40   --
Tea       11   13   15   14   --
a) The coffee and tea are processed using machines A and B. One kilogram of coffee requires
15 minutes of processing on machine A and 25 minutes on machine B. One kilogram of tea
requires 7 minutes of processing on machine A and 45 minutes on machine B. The time
available in week 5 is forecast to be 20 hours on machine A and 15 hours on machine B.
Coffee and tea contribute 10 dollars and 4 dollars, respectively, in week 5. It may not be
possible to process enough to meet the forecast demand for coffee and tea in week 5; each
kilogram of unsatisfied demand costs 3 dollars for coffee and 1 dollar for tea.
b) Formulate the problem of deciding how much coffee and tea to process in week 5 as a
linear program.
c) Solve graphically.
Subject to
𝑥1 + 2𝑥2 ≥ 3
2𝑥1 + 𝑥2 ≤ 5
5𝑥1 − 4𝑥2 ≥ −10
𝑥1 ≤ 3
𝑥2 ≥ 0
4. Obtain the stationary points for the following optimization problem and comment on the
functional behaviour at each stationary point.
5. A data scientist wants to fit a neural network with one hidden layer as shown below.
Formulate a least squares problem and obtain the weights and bias of the optimal network
for the data given below.
S. no   𝑥1   𝑥2   𝑦       S. no   𝑥1   𝑥2   𝑦
1       1    2    16      6       5    1    32
2       2    3    21      7       6    2    41
3       1    3    17      8       4    3    32
4       3    5    30      9       2    4    25
5       4    2    29      10      5    4    35