Yogesh Waghode Data-Mining-ppt seminar report

yogeshvw56 71 views 22 slides Jun 22, 2024

About This Presentation

B.Tech second-year, third-semester seminar report on data mining technology in computer engineering.


Slide Content

Seminar on DATA MINING
Presented by YOGESH WAGHODE

J.T.MAHAJAN COLLEGE OF ENGINEERING, FAIZPUR (2022-2023)
Guided by Prof. K. S. PATIL
DEPARTMENT OF SECOND YEAR (COMPUTER) ENGINEERING

Content
- Data Mining
- Data Mining Definition
- Data Mining – Two Main Components
- Data Mining vs. Data Analysis
- What is (not) Data Mining?
- Related Fields
- Data Mining Process
- Major Data Mining Tasks
- Uses of Data Mining
- Sources of Data for Mining
- Challenges of Data Mining
- Advantages
- Conclusion
- Reference

Data Mining
- New buzzword, old idea.
- Inferring new information from already collected data.
- Traditionally the job of data analysts.
- Computers have changed this: it is far more efficient to comb through data with a machine than to eyeball statistical data.

Data Mining Definition
Data mining is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data.

Data Mining vs. Data Analysis
- In terms of software and the marketing thereof, Data Mining = Data Analysis.
- Data mining implies that the software applies some intelligence, beyond simple grouping and partitioning of data, to infer new information.
- Data analysis is more in line with standard statistical software (e.g. web stats). Such tools usually present information about subsets and relations within the recorded data set (browser/search engine usage, average visit time, etc.).

What is (not) Data Mining?
What is not data mining?
- Looking up a phone number in a phone directory
- Querying a Web search engine for information about “Amazon”
What is data mining?
- Discovering that certain names are more prevalent in certain US locations (O’Brien, O’Rourke, O’Reilly… in the Boston area)
- Grouping together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, Amazon.com)

Data Mining Techniques
- Classification
- Clustering
- Regression
- Association Rules
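To make one of these techniques concrete, here is a minimal clustering sketch using plain k-means. It is not taken from the slides: the sample points, the function name `k_means`, and the choice to seed the centroids with the first k points are all assumptions made for illustration.

```python
from math import dist

def k_means(points, k, iterations=10):
    """Cluster points into k groups with plain k-means (Lloyd's algorithm)."""
    # Assumption: seed centroids with the first k points so runs are repeatable.
    centroids = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centroids.append(
                    tuple(sum(axis) / len(cluster) for axis in zip(*cluster)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-D data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

Real projects would normally rely on a library implementation with better initialisation; the sketch only shows the assign-then-update loop that defines the technique.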

Why Mine Data? Scientific Viewpoint
- Data is collected and stored at enormous speeds (GB/hour):
  - remote sensors on a satellite
  - telescopes scanning the skies
  - microarrays generating gene expression data
  - scientific simulations generating terabytes of data
- Traditional techniques are infeasible for raw data
- Data mining may help scientists:
  - in classifying and segmenting data
  - in hypothesis formation

Data Mining Architecture

Related Fields
- Statistics
- Machine Learning
- Databases
- Visualization
- Data Mining and Knowledge Discovery

Data Mining Process
[Figure: flow from the data warehouse to knowledge. Stages: Integration, Selection & Cleaning, Transformation, Data Mining, Interpretation & Evaluation, Understanding. Intermediate products: Raw Data, Target Data, Transformed Data, Patterns and Rules, Knowledge.]
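The stages of the process can be sketched as a chain of small functions. This is only an illustration under stated assumptions: the records in `RAW`, the field names, the age and spend thresholds, and all function names are hypothetical and do not come from the slides.

```python
from collections import Counter

# Hypothetical raw records; None marks a missing value.
RAW = [
    {"customer": "a", "age": 23, "spend": 120.0},
    {"customer": "b", "age": None, "spend": 80.0},
    {"customer": "c", "age": 41, "spend": 310.0},
    {"customer": "d", "age": 37, "spend": 290.0},
    {"customer": "e", "age": 19, "spend": 95.0},
]

def select_and_clean(records):
    """Selection & cleaning: keep the fields of interest, drop incomplete rows."""
    return [(r["age"], r["spend"]) for r in records
            if r["age"] is not None and r["spend"] is not None]

def transform(rows):
    """Transformation: turn numeric values into coarse categories."""
    return [("young" if age < 30 else "older",
             "high" if spend >= 200 else "low") for age, spend in rows]

def mine(rows):
    """Data mining: count how often each (age group, spend level) pair occurs."""
    return Counter(rows)

def evaluate(patterns, min_count=2):
    """Interpretation & evaluation: keep patterns seen at least min_count times."""
    return {pattern: n for pattern, n in patterns.items() if n >= min_count}

knowledge = evaluate(mine(transform(select_and_clean(RAW))))
print(knowledge)
```

Each function takes the previous stage's output as its input, which mirrors how the figure passes target data, transformed data, and patterns from one stage to the next.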

Major Data Mining Tasks
- Classification: predicting an item class
- Associations: e.g. A & B & C occur frequently
- Visualization: to facilitate human discovery
- Estimation: predicting a continuous value
- Deviation Detection: finding changes
- Link Analysis: finding relationships
- ...
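The difference between classification (predicting a class) and estimation (predicting a continuous value) can be shown with two tiny functions. This is a sketch under assumptions: the nearest-neighbour rule, the least-squares line, the sample data, and the names `classify`, `fit_line`, and `estimate` are illustrative choices, not methods named in the slides.

```python
from math import dist

def classify(sample, labelled):
    """Classification: return the label of the nearest labelled example."""
    _, label = min(labelled, key=lambda item: dist(sample, item[0]))
    return label

def fit_line(xs, ys):
    """Estimation: fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate(x, slope, intercept):
    """Predict a continuous value from the fitted line."""
    return slope * x + intercept

# Hypothetical labelled examples: (features, class).
labelled = [((1.0, 1.0), "small"), ((1.2, 0.8), "small"),
            ((5.0, 5.5), "large"), ((6.0, 5.0), "large")]
print(classify((5.5, 5.0), labelled))      # a discrete class label

# Hypothetical measurements that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(estimate(5, slope, intercept))       # a continuous value
```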

Uses of Data Mining
AI/Machine Learning
- Combinatorial/Game Data Mining: good for analyzing winning strategies to games, and thus for developing intelligent AI opponents (e.g. chess).
Business Strategies
- Market Basket Analysis: identify customer demographics, preferences, and purchasing patterns.
Risk Analysis
- Product Defect Analysis: analyze product defect rates for given plants and predict possible complications (read: lawsuits) down the line.
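Market basket analysis is usually expressed through two measures: support (how often items appear together) and confidence (how often one item appears given another). The sketch below computes both; the baskets and the function names `pair_support` and `confidence` are made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def pair_support(baskets):
    """Support: fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

def confidence(baskets, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

support = pair_support(baskets)
print(support[("bread", "butter")])            # 0.6: three of five baskets
print(confidence(baskets, "butter", "bread"))  # 1.0: butter always comes with bread
print(confidence(baskets, "bread", "butter"))  # 0.75: three of four bread baskets
```

Counting every pair works for a handful of items; algorithms such as Apriori exist because this brute-force count grows quickly as the catalogue gets larger.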

Uses of Data Mining (Cont.)
User Behavior Validation
- Fraud Detection: in the realm of cell phones, comparing phone activity to calling records can help detect calls made on cloned phones. Similarly, with credit cards, comparing purchases with historical purchases can detect activity with stolen cards.
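Comparing a new purchase with historical purchases can be sketched as a simple deviation check. The z-score rule, the threshold of 3 standard deviations, the sample amounts, and the function name `is_suspicious` are assumptions for illustration; production fraud systems use far richer features.

```python
from statistics import mean, stdev

def is_suspicious(history, amount, threshold=3.0):
    """Flag an amount that deviates strongly from a customer's past purchases."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least 2 values
    if sigma == 0:
        # Every past purchase was identical, so any different amount stands out.
        return amount != mu
    return abs(amount - mu) / sigma > threshold

# Hypothetical purchase history for one card, in some currency unit.
history = [42.0, 38.5, 51.0, 45.25, 40.0, 47.5]
print(is_suspicious(history, 46.0))   # close to past behaviour
print(is_suspicious(history, 900.0))  # far outside past behaviour
```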

Uses of Data Mining (Cont.)
Health and Science
- Protein Folding: predicting protein interactions and functionality within biological cells. Applications of this research include determining causes and possible cures for Alzheimer's, Parkinson's, and some cancers (caused by protein "misfolds").
- Extra-Terrestrial Intelligence: scanning satellite receptions for possible transmissions from other planets.
For more information, see Stanford’s Folding@home and UC Berkeley’s SETI@home projects. Both involve participation in a widely distributed computer application.

Sources of Data for Mining
- Databases (most obvious)
- Text Documents
- Computer Simulations
- Social Networks

Advantages of Data Mining
- Marketing / Retail
- Finance / Banking
- Manufacturing
- Governments

Challenges of Data Mining
- Scalability
- Dimensionality
- Complex and Heterogeneous Data
- Data Quality
- Data Ownership and Distribution
- Privacy Preservation
- Streaming Data

Conclusion
Comprehensive data warehouses that integrate operational data with customer, supplier, and market information have resulted in an explosion of information. Competition requires timely and sophisticated analysis on an integrated view of the data. However, there is a growing gap between more powerful storage and retrieval systems and the users’ ability to effectively analyze and act on the information they contain.

Reference www.google.com www.wikipedia.com www.studymafia.org

Thank You For Your Attention