Data mining
This Data Mining tutorial covers basic and advanced topics and is designed for both beginners and experienced working professionals. It helps you learn the fundamentals of data mining and explore a wide range of techniques.

What is Data Mining?
Data mining is the process of extracting knowledge or insights from large amounts of data using various statistical and computational techniques. The data can be structured, semi-structured or unstructured, and can be stored in various forms such as databases, data warehouses, and data lakes.
The primary goal of data mining is to discover hidden patterns and relationships in the data that can be used to make informed decisions or predictions. This involves exploring the data using various techniques such as clustering, classification, regression analysis, association rule mining, and anomaly detection.
Data mining has a wide range of applications across various industries, including marketing, finance, healthcare, and telecommunications. For example, in marketing, data mining can be used to identify customer segments and target marketing campaigns, while in healthcare, it can be used to identify risk factors for diseases and develop personalized treatment plans.
However, data mining also raises ethical and privacy concerns, particularly when it involves personal or sensitive data. It’s important to ensure that data mining is conducted ethically and with appropriate safeguards in place to protect the privacy of individuals and prevent misuse of their data.
Basic approaches for Data generalization (DWDM)
Data generalization is the process of summarizing data by replacing relatively low-level values with higher-level concepts. It is a form of descriptive data mining.
There are two basic approaches to data generalization:
1. Data cube approach :
- It is also known as OLAP approach.
- It is an efficient approach and is useful for tasks such as charting past sales.
- In this approach, computation and results are stored in the Data cube.
- It uses Roll-up and Drill-down operations on a data cube.
- These operations typically involve aggregate functions, such as count(), sum(), average(), and max().
- The precomputed results (materialized views) can then be used for decision support, knowledge discovery, and many other applications.
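The roll-up and drill-down operations described above can be illustrated with a minimal Python sketch. The sales records, the dimension names (country, city, quarter) and the aggregate() helper are all assumptions made up for this example; a real OLAP system would precompute and store these aggregates in a data cube rather than recompute them on each call.

```python
from collections import defaultdict

# Hypothetical sales facts: (country, city, quarter, amount).
SALES = [
    ("India", "Delhi", "Q1", 100),
    ("India", "Delhi", "Q2", 150),
    ("India", "Mumbai", "Q1", 200),
    ("USA", "Chicago", "Q1", 300),
    ("USA", "Chicago", "Q2", 250),
]

# Names of the dimension columns, in the same order as in each fact.
DIMENSIONS = ("country", "city", "quarter")


def aggregate(facts, group_by):
    """Apply sum() to the amount measure, grouped by the given dimensions."""
    positions = [DIMENSIONS.index(name) for name in group_by]
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[p] for p in positions)
        totals[key] += fact[-1]  # the last column is the amount
    return dict(totals)


# Most detailed level of this toy cube.
by_city_quarter = aggregate(SALES, ("country", "city", "quarter"))

# Roll-up: drop the quarter dimension to get totals per city.
by_city = aggregate(SALES, ("country", "city"))

# Roll-up again: climb the location hierarchy from city to country.
by_country = aggregate(SALES, ("country",))

print(by_country)
```

Drill-down is the reverse direction: moving from by_country back to by_city or by_city_quarter shows the finer-grained values that make up each total.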
2. Attribute oriented induction :
- It is an online, query-oriented, generalization-based approach to data analysis.
- In this approach, generalization is performed on the basis of the number of distinct values of each attribute in the relevant data set. Identical tuples are then merged and their respective counts are accumulated to perform aggregation.
- The data cube approach, by contrast, performs off-line aggregation before an OLAP or data mining query is submitted for processing.
- Attribute oriented induction, at least in its initial proposal, is a relational database query-oriented, generalization-based, on-line data analysis technique.
- It is not limited to particular measures or to categorical data.
- The attribute oriented induction approach uses two methods:
(i). Attribute removal.
(ii). Attribute generalization.
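As a rough illustration of the two methods, the sketch below applies attribute removal and attribute generalization to a small table and then merges identical tuples while accumulating their counts. The student records, the city-to-country hierarchy and the age ranges are hypothetical values chosen for this example, not part of any standard data set.

```python
from collections import Counter

# Hypothetical task-relevant tuples: (name, city, age).
STUDENTS = [
    ("Asha", "Delhi", 19),
    ("Ravi", "Mumbai", 23),
    ("Meera", "Delhi", 24),
    ("John", "Chicago", 18),
    ("Emma", "Boston", 22),
]

# Assumed concept hierarchy for the city attribute: city -> country.
CITY_TO_COUNTRY = {
    "Delhi": "India",
    "Mumbai": "India",
    "Chicago": "USA",
    "Boston": "USA",
}


def age_range(age):
    """Assumed generalization of a numeric age into a coarse range."""
    return "under 21" if age < 21 else "21 and over"


def generalize(row):
    name, city, age = row
    # Attribute removal: name has many distinct values and no
    # higher-level concept to generalize to, so it is dropped.
    # Attribute generalization: city and age are replaced by
    # higher-level concepts.
    return (CITY_TO_COUNTRY[city], age_range(age))


# Merge identical generalized tuples and accumulate their counts.
generalized = Counter(generalize(row) for row in STUDENTS)

for (country, ages), count in generalized.items():
    print(country, ages, count)
```

The five original tuples collapse into four generalized tuples, each carrying a count of how many original rows it summarizes.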