Showing posts with label DMDW. Show all posts

Important Data Mining questions



1.     What is Data Mining? What are the goals and applications of Data Mining?

2.     On what kinds of data can Data Mining be performed?

3.     Classification of Data Mining.

4.     Give some alternative terms for Data Mining.

5.     Define each of the following Data Mining functionalities: characterization, discrimination, association and correlation analysis, classification, prediction, clustering, and evolution analysis.

6.     List the major issues in Data Mining systems.

7.     Describe the differences between the following approaches for integrating a data mining system with a database or data warehouse system: no coupling, loose coupling, semitight coupling, and tight coupling.

8.     What are the steps involved in the KDD process?

9.     Describe the challenges to Data Mining regarding mining methodology and user interaction issues.

10.  Describe the challenges to Data Mining regarding performance issues.

11.  How is a data warehouse different from a database? How are they similar?

12.  What are the query-driven and update-driven approaches? Which one is used by a data warehouse?

13.  What makes a pattern interesting?

14.  Explain the architecture of a Data Mining system.

15.  How is the derived model presented?

16.  What is Cluster Analysis?

17.  What are Descriptive Mining and Predictive Mining?

18.  What is the difference between OLAP and OLTP?

19.  Describe the evolution of database system technology.

20.  Discuss the role of data mining in data warehousing.

21.  What are Spatial Mining and Sequence Mining?

22.  What are Text Mining and Web Mining?

23.  What is the difference between discrimination and classification?

24.  What is the difference between classification and prediction?

25.  What is meant by a pattern?

Clustering in DMDW


I introduced the concept of data mining and the free and open source software Waikato Environment for Knowledge Analysis (WEKA), which allows you to mine your own data for trends and patterns. I also talked about the first method of data mining, regression, which allows you to predict a numerical value for a given set of input values. This method of analysis is the easiest to perform and the least powerful method of data mining, but it served a good purpose as an introduction to WEKA and provided a good example of how raw data can be transformed into meaningful information.
In this article, I will take you through two additional data mining methods that are slightly more complex than a regression model, but more powerful in their respective goals. Where a regression model could only give you a numerical output with specific inputs, these additional models allow you to interpret your data differently. As I said in Part 1, data mining is about applying the right model to your data. You could have the best data about your customers (whatever that even means), but if you don't apply the right models to it, it will just be garbage. Think of this another way: If you only used regression models, which produce a numerical output, how would Amazon be able to tell you "Other Customers Who Bought X Also Bought Y?" There's no numerical function that could give you this type of information. So let's delve into the two additional models you can use with your data.
In this article, I will also make repeated references to the data mining method called "nearest neighbour," though I won't actually delve into the details until later. I have included it in the comparisons and descriptions in this article to make the discussions complete.

Load the data file labor.arff into WEKA using the same steps we used before to load data in the Preprocess tab. Take a few minutes to look around the data in this tab. Look at the columns, the attribute data, the distribution of the columns, etc. Your screen should look like Figure 5 after loading the data.

With this data set, we are looking to create clusters, so instead of clicking on the Classify tab, click on the Cluster tab. Click Choose and select SimpleKMeans from the choices that appear (this will be our preferred method of clustering for this article). Your WEKA Explorer window should look like Figure 6 at this point.
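
To give a feel for what a k-means clusterer does with the loaded rows, here is a minimal sketch of the basic algorithm in plain Python. It is not WEKA's SimpleKMeans implementation, which has more options; the function name k_means, its parameters, and the sample points are all assumptions made up for illustration.

```python
import math
import random


def k_means(points, k, iterations=100, seed=0):
    """Group points (tuples of numbers) into k clusters.

    Returns (centroids, assignments), where assignments[i] is the
    index of the cluster that points[i] belongs to.
    """
    rng = random.Random(seed)
    # Start from k randomly chosen rows as the initial centroids.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach each point to its closest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing moved, so the clusters are stable
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(column) / len(members) for column in zip(*members)
                )
    return centroids, assignments


# Hypothetical two-attribute rows, invented for this sketch.
sample = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, assignments = k_means(sample, k=2)
print(centroids)
print(assignments)
```

The number of clusters k has to be chosen up front, which is the main design decision when using this kind of clusterer.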

How Data Mining Works, with Requirements

SOFTWARE REQUIRED:

SQL Server Setup requires Microsoft Windows Installer 3.1 or later and Microsoft Data Access Components (MDAC) 2.8 SP1 or later.

SQL Server Setup installs the following software components required by the product:
  • Microsoft .NET Framework 2.0
  • Microsoft SQL Server Native Client
  • Microsoft SQL Server Setup support files

DATA MINING:

Data mining is a key member of the Business Intelligence (BI) product family in SQL Server 2005. Data mining is about analyzing data and finding hidden patterns using automatic or semiautomatic means, patterns that can then be explored for valuable information. It is about learning characteristics of a data set that cannot be discovered by simple inspection.
There have been several attempts to define the learning task as applied to software systems, such as: "Learning is any process that enables a system to achieve a better performance when working on the same task" or "Learning consists of constructing or modifying representations of past experience".
Large volumes of data from information systems have been accumulated and stored in databases. Organizations have become data-rich and knowledge-poor. The information found in the patterns can be used for reporting and, most importantly, for prediction.

WORKING WITH DATA MINING:

The data mining approach in Analysis Services is rather simple: all you need to do is select the right data mining algorithm and specify the input columns and the predictable columns (which are the targets for the analysis).
Data mining can be used to solve several problems, such as:
  • Classification: Classification refers to assigning cases into categories based on a predictable attribute.
  • Clustering: It is used to identify natural groupings (self-similarity groups) of cases based on a set of attributes.
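
The split between input columns and a predictable column can be illustrated with a small classification sketch in plain Python. It uses a one-nearest-neighbour rule purely because it is short. This is not the algorithm Analysis Services uses, and the function name, column names, and customer rows are all hypothetical.

```python
import math


def nearest_neighbour_predict(training_rows, input_columns, predictable_column, case):
    """Predict predictable_column for case by copying the value from the
    training row whose input columns are closest (Euclidean distance)."""

    def distance(row):
        return math.dist(
            [row[col] for col in input_columns],
            [case[col] for col in input_columns],
        )

    closest = min(training_rows, key=distance)
    return closest[predictable_column]


# Hypothetical training cases: two input columns and one predictable column.
customers = [
    {"age": 22, "income": 25, "buys_bike": "no"},
    {"age": 25, "income": 30, "buys_bike": "no"},
    {"age": 40, "income": 85, "buys_bike": "yes"},
    {"age": 45, "income": 90, "buys_bike": "yes"},
]

prediction = nearest_neighbour_predict(
    customers,
    input_columns=["age", "income"],
    predictable_column="buys_bike",
    case={"age": 42, "income": 80},
)
print(prediction)
```

Whatever algorithm is chosen, the shape of the task is the same: the input columns describe each case, and the predictable column is the category the model learns to assign.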