Difference Between Data Mining and Query Tools
Query tools help users analyze the data in a database. They provide query building, query editing,
searching, reporting, and summarizing functionality. Data mining, on the other hand, is a field of
computer science that deals with extracting previously unknown and interesting information from raw
data. The data used as input to the data mining process is usually stored in databases. Data mining is
typically used by people with a statistical background, who apply statistical models to look for hidden
patterns in data. Data miners are interested in finding useful relationships between different data
elements, which can ultimately be profitable for businesses.
Data mining
Data mining is also known as Knowledge Discovery in Databases (KDD). As mentioned above, it is a field
of computer science that deals with extracting previously unknown and interesting information from raw
data. Because of the rapid growth of data, especially in areas such as business, data mining has become
a very important tool for converting this wealth of data into business intelligence, since manual
extraction of patterns has become practically impossible over the past few decades. It is currently
used in applications such as social network analysis, fraud detection, and marketing. Data mining
usually deals with the following four tasks: clustering, classification, regression, and association.
Clustering is identifying groups of similar items in unstructured data. Classification is learning rules
that can be applied to new data; it typically includes the following steps: data preprocessing, model
design, learning/feature selection, and evaluation/validation. Regression is finding functions that
model the data with minimal error. Association is looking for relationships between variables. Data
mining is usually used to answer questions such as: which products are most likely to yield high profit
at Wal-Mart next year?
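As a rough illustration of the regression task described above, the following sketch fits a straight line by ordinary least squares, which is one common way of "finding a function with minimal error". The function name `fit_line` and the spend/units figures are made up for this example and do not come from any real dataset.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend vs. units sold.
spend = [1.0, 2.0, 3.0, 4.0]
units = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(spend, units)
predicted = slope * 5.0 + intercept  # estimate for an unseen spend value
```

Once the line is fitted, it can be applied to values that were not in the original data, which is what makes regression useful for forward-looking questions like the one above.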
Query Tools
Query tools help users analyze the data in a database. They usually have a GUI front end that offers
convenient ways to enter a query as a set of attributes. Once these inputs are provided, the tool
generates the actual query in the underlying query language used by the database. SQL, T-SQL, and
PL/SQL are examples of query languages used in many popular databases today. The generated queries are
then executed against the database, and the results are presented or reported to the user in an
organized and clear manner. Typically, the user does not need to know a database-specific query
language to use a query tool. Key features of query tools are an integrated query builder and editor,
summary reports and figures, import and export features, and advanced find/search capabilities.
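The sketch below shows, under simplifying assumptions, how a query tool might turn attribute-style inputs into an actual SQL query and execute it. The `build_query` helper, the `sales` table, and its rows are hypothetical and serve only as illustration; real query tools are considerably more elaborate. Python's standard `sqlite3` module stands in for the database.

```python
import sqlite3


def build_query(table, columns, filters):
    """Build a parameterized SELECT statement from GUI-style inputs."""
    # Identifiers cannot be bound as parameters, so accept only simple names.
    for name in [table, *columns, *filters]:
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    return sql, list(filters.values())


# Hypothetical in-memory database standing in for a real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("tea", "north", 12.5), ("coffee", "north", 30.0), ("tea", "south", 8.0)],
)

# The user picks a table, the columns to show, and a filter value.
sql, params = build_query("sales", ["product", "amount"], {"region": "north"})
rows = conn.execute(sql, params).fetchall()
```

The user only supplies a table, a list of columns, and filter values; the SQL text is generated for them, which is why no knowledge of the query language is needed.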
Query tools make it easy to build queries and run them against a database, without having to learn a
database-specific query language. Data mining, on the other hand, is a technique or concept in computer
science that deals with extracting useful and previously unknown information from raw data. Most of the
time, this raw data is stored in very large databases. Data miners can therefore use the existing
functionality of query tools to preprocess raw data before the data mining process. However, the main
difference between data mining techniques and query tools is that to use a query tool, users need to
know exactly what they are looking for, while data mining is mostly used when the user has only a vague
idea of what they are looking for.
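To make this contrast concrete, here is a small sketch on a made-up list of shopping baskets. The first part answers a question that is fixed in advance, as a query would. The second part names no item at all and simply counts every pair of items that occurs together, which is a very simplified stand-in for association analysis. All data and variable names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transaction data: each basket is a set of purchased items.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "tea"},
    {"bread", "butter", "tea"},
]

# Query-style: the user already knows the question ("how many baskets contain bread?").
bread_count = sum(1 for basket in baskets if "bread" in basket)

# Mining-style: no item is named up front; count every pair that co-occurs
# and let the most frequent pair emerge from the data.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

top_pair, top_count = pair_counts.most_common(1)[0]
```

In the second part the bread and butter relationship is found without anyone asking about bread or butter specifically.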
Data Mining and Machine Learning
Data Mining: Process of discovering patterns in data
Machine Learning: Analysis
Limited number of observations
Theory: "All swans are white"
Reality: infinite number of swans
Theory formation
Machine Learning: Prediction
Single observation
Theory: "All swans are white"
Reality: infinite number of swans
Theory falsification
Mining data involves several steps, as shown in the
picture.
1. Data Integration: First, the data is collected and integrated
from all the different sources.
2. Data Selection: We may not need all the data collected in the
first step, so in this step we select only the data we think is
useful for data mining.
3. Data Cleaning: The collected data is not clean and may
contain errors, missing values, and noisy or inconsistent data, so we need
to apply different techniques to get rid of such anomalies.
4. Data Transformation: Even after cleaning, the data is not ready
for mining; it must be transformed into forms appropriate for
mining. Techniques used for this include smoothing,
aggregation, and normalization.
5. Data Mining: Now we are ready to apply data mining techniques to
the data to discover interesting patterns. Clustering and
association analysis are among the many
techniques used for data mining.
6. Pattern Evaluation and Knowledge Presentation: This step
involves visualizing and transforming the patterns we generated,
removing redundant patterns, and so on.
7. Decisions / Use of Discovered Knowledge: This step helps the user
apply the acquired knowledge to make better decisions.
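The first five steps above can be sketched in a few lines of Python. This is a toy illustration only: the function names, the two "store" sources, and the spend values are all hypothetical, cleaning is reduced to dropping records with missing values, transformation to min-max normalization, and the mining step to a one-dimensional two-cluster k-means.

```python
def integrate(*sources):
    """Step 1: combine records from all sources into one list."""
    return [record for source in sources for record in source]


def select(records, fields):
    """Step 2: keep only the fields considered useful for mining."""
    return [{f: r.get(f) for f in fields} for r in records]


def clean(records):
    """Step 3: drop records that have missing values."""
    return [r for r in records if all(v is not None for v in r.values())]


def normalize(records, field):
    """Step 4: min-max normalize one numeric field into the range [0, 1]."""
    values = [r[field] for r in records]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values are equal
    return [{**r, field: (r[field] - lo) / span} for r in records]


def two_means(values, rounds=10):
    """Step 5: 1-D k-means with k=2, centers initialized at min and max."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(rounds):
        labels = [
            0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1 for v in values
        ]
        for k in (0, 1):
            members = [v for v, label in zip(values, labels) if label == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels


# Hypothetical sources with slightly different shapes and one missing value.
store_a = [
    {"customer": "c1", "spend": 10.0, "note": "walk-in"},
    {"customer": "c2", "spend": None, "note": "online"},
]
store_b = [
    {"customer": "c3", "spend": 12.0},
    {"customer": "c4", "spend": 95.0},
    {"customer": "c5", "spend": 100.0},
]

records = integrate(store_a, store_b)
records = select(records, ["customer", "spend"])
records = clean(records)
records = normalize(records, "spend")
labels = two_means([r["spend"] for r in records])
groups = {r["customer"]: label for r, label in zip(records, labels)}
```

Here `groups` separates low-spending from high-spending customers. Evaluating whether that split is meaningful and acting on it correspond to steps 6 and 7, which depend on the business context and are not sketched here.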