Data mining is the branch of computing concerned with extracting information from large databases. In GTU exams, some questions are asked repeatedly, so past papers need to be analysed to identify which questions are important. The important questions are given below:
1. Differentiate between data, information and knowledge.
2. Briefly explain methods of concept hierarchy generation for categorical data.
3. Explain sampling methods for data reduction.
4. What do you mean by text mining? Explain various issues involved in it.
5. What is the difference between OLTP and OLAP?
6. Explain various data cleaning methods in brief.
7. What do you mean by numerosity reduction? Explain various methods to achieve it.
8. Define the following terms w.r.t. text mining: precision, recall, document ranking, stop list, term-frequency matrix, inverse document frequency and tokenization.
9. Explain 1) data mining architecture 2) data warehouse architecture.
10. What are the major issues in data mining?
11. Write a short note on the KDD process.
12. In real-world data, tuples with missing values for some attributes are a common occurrence. Describe various methods for handling this problem.
13. Define data warehouse, data mart and virtual data warehouse.
14. Consider the data set shown in the table. a. Estimate the conditional probabilities P(A|+), P(B|+), P(C|+), P(A|-), P(B|-), and P(C|-). b. Predict the class label for a test sample (A=0, B=1, C=0) using the naïve Bayes approach.
15. Describe a multilayer feed-forward neural network.
16. Describe the ID3 algorithm for decision tree construction. Why is it unsuitable for data mining applications?
17. What is associative classification? Why is associative classification able to achieve higher classification accuracy than a classical decision tree method? Explain how associative classification can be used for text document classification.
18. Explain constraint-based association mining.
19. Compare various attribute selection measures for decision trees with a suitable example.
20. What is prediction? Explain various regression methods for it.
21. Explain cross-validation and bootstrap methods for evaluating the accuracy of a classifier/predictor. Also explain accuracy and error measures for classification and prediction respectively.
22. What is the difference between association and correlation? For a given contingency table, find the type of correlation between items using any three methods.
23. Explain, with examples, the Apriori, FP-Growth and Eclat algorithms for frequent itemset mining.
24. Briefly outline how to compute the similarity/dissimilarity between objects described by the following types of variables: 1) interval-scaled variables 2) binary variables 3) categorical variables 4) ordinal variables 5) ratio-scaled variables 6) mixed types 7) vector objects.
25. Explain the K-means and K-medoids clustering techniques with an example.