Dmbi
___________
MARKS
Q.1 (a) Explain the KDD process using a figure. 03
(b) Give a feature-wise comparison between BI and DW. 04
(c) Explain research issues in Data Mining. 07
Q.2 (a) Explain the star, snowflake, and fact constellation schemas using figures. 03
(b) Give a feature-wise comparison between ROLAP and MOLAP. 04
(c) List the preprocessing steps with examples. Explain the procedure of any one preprocessing technique. 07
OR
(c) What is concept description? Explain data generalization and summarization-based characterization using an example. 07
Q.3 (a) Give a feature-wise comparison between classification and prediction. 03
(b) Write a note on incremental association rule mining. 04
(c) Generate the frequent itemsets and derive association rules from them using the Apriori algorithm. Minimum support is 50% and minimum confidence is 70%. 07
TID Items
100 1, 3, 4
200 2, 3, 5
300 1, 2, 3, 5
400 2, 5
OR
Q.3 (a) Differentiate between overfitting and tree pruning with respect to the following parameters. 03
i) Definition, with figure
ii) Use in a particular situation
iii) Limitation
(b) Explain mining of multiple-level association rules using an example. 04
(c) Generate a decision tree using the CART algorithm for the following dataset. 07
Sr. no.  Outlook   Temperature  Humidity  Wind   Play
1        Sunny     hot          high      FALSE  No
2        Sunny     hot          high      TRUE   No
3        Overcast  hot          high      FALSE  Yes
4        Rain      mild         high      FALSE  Yes
5        Rain      cool         normal    FALSE  Yes
6        Rain      cool         normal    TRUE   No
7        Overcast  cool         normal    TRUE   Yes
8        Sunny     mild         high      FALSE  No
9        Sunny     cool         normal    FALSE  Yes
10       Rain      mild         normal    FALSE  Yes
11       Sunny     mild         normal    TRUE   Yes
12       Overcast  mild         high      TRUE   Yes
13       Overcast  hot          normal    FALSE  Yes
14       Rain      mild         high      TRUE   No
OR
Q.4 (a) Explain spatial mining using an example. 03
(b) Calculate the weights using a single-layer perceptron neural network model. The three inputs are x0, x1, x2; the bias and weights are as follows: 04
w1(0) = 30, w2(0) = 300
b(0) = 50, η = 0.01, x0 = +1
The activation function is:
sgn(x) = +1, if x >= 0
sgn(x) = -1, if x < 0
(a) Calculate x2 for x1 = 100 and 200.
(b) For bias b(0) = -1230, recalculate the weights w1 and w2.
(c) How is data mining useful for Business Intelligence applications, viz. Balanced Scorecard, Fraud Detection, Clickstream Mining, Market Segmentation, retail industry, telecommunications industry, banking & finance, and CRM? 07
Q.5 (a) Explain text mining using an example. 03
(b) Explain big data and big data analytics. Explain the key roles and their responsibilities for a successful analytics project. 04
(c) Compute 2 clusters using the k-means clustering algorithm. Use Euclidean distance to measure distances. 07
Subject A B
1 1.0 1.0
2 1.5 2.0
3 3.0 4.0
4 5.0 7.0
5 3.5 5.0
6 4.5 5.0
7 3.5 4.5
Take subject 1 as the initial mean1 and subject 4 as the initial mean2.
OR
Q.5 (a) Explain web mining using an example. 03
(b) Explain the Hadoop architecture using a figure. 04
(c) Explain MapReduce. Explain any one example using MapReduce. 07