Data Warehousing and Data Mining (CSC-451)
B.Sc. CSIT 8th Semester
Model Questions
Candidates are required to give their answers in their own words as
far as practicable. The figures in the margin indicate full marks.
Group A
Long Answer Questions (Attempt any TWO) [2×10=20]
1. Suppose that a data warehouse for Big University consists of the following four dimensions: student, course, semester, and instructor, and two measures: count and avg-grade. At the lowest conceptual level (e.g. for a given student, course, semester, and instructor combination), the avg-grade measure stores the actual course grade of the student. At higher conceptual levels, avg-grade stores the average grade for the given combination.
a.) Draw a snowflake schema diagram for the data warehouse.
b.) Starting with the base cuboid [student, course, semester, instructor], what specific OLAP operations (e.g. roll-up from semester to year) should one perform in order to list the average grade of CS courses for each Big University student?
c.) If each dimension has five levels (including all), such as “student < major < status < university < all”, how many cuboids will this cube contain (including the base and apex cuboids)?
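The counting in the last part of question 1 can be sketched as follows. This is a minimal illustration, assuming the usual textbook rule that the number of cuboids is the product, over all dimensions, of the number of hierarchy levels per dimension with the "all" level counted. The variable names are hypothetical.

```python
from math import prod

# Four dimensions, each with five levels; the five already include "all".
levels_per_dimension = {
    "student": 5,
    "course": 5,
    "semester": 5,
    "instructor": 5,
}

# A cuboid picks one level per dimension, so the total is the product
# of the per-dimension level counts.
total_cuboids = prod(levels_per_dimension.values())
print(total_cuboids)
```

Under that assumption the count is 5 × 5 × 5 × 5, which covers both the base cuboid (lowest level everywhere) and the apex cuboid ("all" everywhere).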
2. Let A = {A1, A2, A3, A4, A5, A6} and assume σ = 35%. Use the Apriori algorithm to get the desired solution for the following data.
A1 | A2 | A3 | A4 | A5 | A6
0 | 0 | 0 | 1 | 1 | 1
0 | 1 | 1 | 1 | 0 | 0
1 | 0 | 0 | 1 | 1 | 1
1 | 1 | 0 | 1 | 0 | 0
1 | 0 | 1 | 0 | 1 | 1
0 | 1 | 1 | 1 | 0 | 1
0 | 0 | 0 | 1 | 1 | 0
0 | 1 | 0 | 1 | 0 | 1
1 | 0 | 0 | 1 | 0 | 0
1 | 1 | 1 | 1 | 1 | 1
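One way to check a hand-worked answer to question 2 is a short Apriori sketch over the ten rows of the table. It assumes that σ is the minimum support as a percentage of transactions and that each row is one transaction in which a 1 means the item is present. The question does not state either point explicitly, and the function and variable names are hypothetical.

```python
from itertools import combinations

ITEMS = ["A1", "A2", "A3", "A4", "A5", "A6"]
# One string per table row; column order follows ITEMS.
ROWS = [
    "000111", "011100", "100111", "110100", "101011",
    "011101", "000110", "010101", "100100", "111111",
]
TRANSACTIONS = [
    frozenset(item for item, bit in zip(ITEMS, row) if bit == "1")
    for row in ROWS
]


def apriori(transactions, min_support_percent):
    """Return {itemset: support count} for all frequent itemsets."""
    n = len(transactions)
    # Smallest integer count that reaches the support threshold (ceiling).
    min_count = -(-min_support_percent * n // 100)

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    frequent = {}
    # Level 1: frequent single items.
    level = []
    for item in sorted({i for t in transactions for i in t}):
        single = frozenset([item])
        c = count(single)
        if c >= min_count:
            frequent[single] = c
            level.append(single)

    k = 2
    while level:
        previous = set(level)
        # Join step: unions of frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in previous for s in combinations(c, k - 1))
        }
        level = []
        for candidate in candidates:
            c = count(candidate)
            if c >= min_count:
                frequent[candidate] = c
                level.append(candidate)
        k += 1
    return frequent


frequent = apriori(TRANSACTIONS, 35)
for itemset, support in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

With ten transactions, a 35% threshold means an itemset must appear in at least four rows. The sketch stops as soon as a level produces no frequent itemsets.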
3. What kind of data preprocessing do we need before applying a data mining algorithm to a data set? Explain the data binning method for handling noisy data, with an example.
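The binning idea in question 3 can be illustrated with a small sketch. It assumes equal-frequency (equal-depth) bins of three values each, followed by smoothing by bin means and by bin boundaries. The sample values and function names are illustrative and not part of the question.

```python
def equal_frequency_bins(values, bin_size):
    """Sort the values and split them into consecutive bins of bin_size."""
    ordered = sorted(values)
    return [ordered[i:i + bin_size] for i in range(0, len(ordered), bin_size)]


def smooth_by_means(bins):
    """Replace every value in a bin with the mean of that bin."""
    return [[sum(b) / len(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value with the closer of its bin's min or max."""
    smoothed = []
    for b in bins:
        low, high = b[0], b[-1]  # bins are sorted, so these are the boundaries
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed


# Illustrative sample values (an assumption, not taken from the question).
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
bins = equal_frequency_bins(prices, 3)
by_means = smooth_by_means(bins)
by_boundaries = smooth_by_boundaries(bins)
print(bins)
print(by_means)
print(by_boundaries)
```

Smoothing by means flattens each bin to a single value, while smoothing by boundaries keeps the extremes of each bin and snaps interior values to the nearer one.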
Group B
Short Answer Questions (Attempt any EIGHT) [8×5=40]
Question number 13 is compulsory.
4. Explain the use of the frequent itemset generation process.
5. Differentiate between data marts and data cubes.
6. Explain OLAP operations with examples.
7. List the drawbacks of the ID3 algorithm, including over-fitting, and the techniques used to remedy it.
8. Write the algorithm for k-means clustering. Compare it with the k-nearest neighbor algorithm.
9. What is text mining? Explain the text indexing techniques.
10. Describe the genetic algorithm as a problem-solving technique in data mining.
11. What do you mean by WWW mining? Explain WWW mining techniques.
12. What is DMQL? How do you define a star schema using DMQL?
13. Write short notes on (any TWO):
a.) Text database mining
b.) Backpropagation algorithm
c.) Regression
d.) HOLAP