Data v/s Information
Data -> always in raw form; storage is in the form of 0's and 1's.
Information -> processed form of data.

Raw data:
A 66 66 99
B 55 88 98
C 66 87 89

-> Process ->

Roll no  mks1  mks2  mks3
A        66    66    99
B        55    88    98
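The step from raw marks to a labelled table can be sketched in a few lines of Python. This is only an illustration: the names `raw_records` and `process`, and the choice of total and average as the "processing", are assumptions and not part of the slide.

```python
# Raw data: unlabelled marks, as they might sit in storage.
raw_records = ["A 66 66 99", "B 55 88 98", "C 66 87 89"]

def process(records):
    """Turn raw mark strings into labelled rows with a total and an average."""
    table = []
    for line in records:
        roll_no, *marks = line.split()
        marks = [int(m) for m in marks]
        table.append({
            "roll_no": roll_no,
            "marks": marks,
            "total": sum(marks),
            "average": round(sum(marks) / len(marks), 2),
        })
    return table

report = process(raw_records)

# Information supports a decision: who has the highest total?
best = max(report, key=lambda row: row["total"])

for row in report:
    print(row["roll_no"], row["marks"], row["total"], row["average"])
print("Highest total:", best["roll_no"])
```

The raw strings alone do not say who performed best; the processed rows answer that question directly.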
Data v/s Information

Aspect                   Data                              Information
Meaning                  Raw facts                         Processed facts
Method of collection     Random collection                 Specific collection
Format of collection     Unorganized form of collection    Systematic form of processed data
Consists of              Text and numbers                  Refined form of data
Can we take a decision?  Decision making is difficult      Decisions are easy to take
Dependency               Does not depend on information    Depends on data
Based on                 Records and observations          Analysis

Examples…
Data Shape -> how data is represented in its business form and in its storage form
Types of data
Further classification of data
- Demographic data (this customer is a woman, 35 years old, has two children, etc.).
- Transactional data (the products she buys each time, the time of purchases, etc.).
- Web behavior data (the products she puts into her basket when she shops online).
- Data from customer-created texts (comments about the retailer that this woman leaves on the internet).
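One way to picture these four categories is as four fields on a single customer record. The sketch below is an assumption for illustration only: the class name `CustomerRecord`, its field names, and the sample products, timestamp, and comment are hypothetical; only the demographic values come from the slide.

```python
from dataclasses import dataclass, field

@dataclass
class CustomerRecord:
    """Hypothetical container: one field per category of customer data."""
    demographic: dict = field(default_factory=dict)
    transactions: list = field(default_factory=list)
    web_behavior: list = field(default_factory=list)
    customer_texts: list = field(default_factory=list)

    def counts(self):
        """Return how many entries are held in each category."""
        return {
            "demographic": len(self.demographic),
            "transactions": len(self.transactions),
            "web_behavior": len(self.web_behavior),
            "customer_texts": len(self.customer_texts),
        }

customer = CustomerRecord(
    # Demographic values taken from the slide's example customer.
    demographic={"gender": "woman", "age": 35, "children": 2},
    # Everything below is made-up sample data.
    transactions=[{"product": "notebook", "time": "2024-01-05 10:30"}],
    web_behavior=[{"action": "add_to_basket", "product": "pen"}],
    customer_texts=["Delivery was quick."],
)

print(customer.counts())
```

Keeping the categories separate makes it clear which source each piece of data came from before the data is combined for analysis.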
DBMS… way of data extraction
Problems faced by current DBMS
- Large quantities of data are generated and processed.
- The volume of data may double every 3 months or so.
- Extracting knowledge from this massive data is what is most needed.
- Rapid developments in computer science and engineering create new demands; to fulfil those demands we need to analyze the data.
- Data rich, information poor: raw data by itself does not provide much information.
- Today we need only the significant data from which we can judge customers' likings and strategies.
What is data mining?
Data Mining is…
- A powerful tool with great potential.
- Focuses on the most important information in the data.
- Gives detailed information about potential customers and their behavior.
- Extraction of useful information.
- Finding useful, valid, and understandable patterns in data.
- Also defined as finding hidden information in a database.
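As one small illustration of "finding patterns in data", the sketch below counts which pairs of products are bought together often. This is only one simple mining task among many, not the definition of data mining; the baskets, the function name `frequent_pairs`, and the `min_support` threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (one set of products per purchase).
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]

def frequent_pairs(baskets, min_support):
    """Count product pairs and keep those seen in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

pairs = frequent_pairs(baskets, min_support=2)
print(pairs)
```

With this sample data the only pair that appears in at least two baskets is bread and milk, a pattern that is not obvious from any single purchase.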
Why Big Data
Old Model: a few companies generate data; all others consume data.
New Model: all of us generate data, and all of us consume data.
Why Big Data??
Data Preprocessing in Data Mining
Examples of Big Data and ML
Data sources..
What is Data Science?
- Uses various tools, algorithms, and machine learning principles.
- Involves obtaining meaningful information from data.
- Involves elements like mathematics, statistics, and computer science.

How does Data Science work?
1. Problem Statement
2. Data Collection
3. Data Cleaning
4. Data Analysis and Exploration
5. Data Modelling
6. Optimization and Deployment
The Data Science Lifecycle
- Capture: Data Acquisition, Data Entry, Signal Reception, Data Extraction. This stage involves gathering raw structured and unstructured data.
- Maintain: Data Warehousing, Data Cleansing, Data Staging, Data Processing, Data Architecture. This stage covers taking the raw data and putting it in a form that can be used.
- Process: Data Mining, Clustering/Classification, Data Modeling, Data Summarization. Data scientists take the prepared data and examine its patterns, ranges, and biases to determine how useful it will be in predictive analysis.
- Analyze: Exploratory/Confirmatory, Predictive Analysis, Regression, Text Mining, Qualitative Analysis. Here is the real meat of the lifecycle. This stage involves performing the various analyses on the data.
- Communicate: Data Reporting, Data Visualization, Business Intelligence, Decision Making. In this final step, analysts prepare the analyses in easily readable forms such as charts, graphs, and reports.
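The five lifecycle stages can be pictured as five small functions chained together. The sketch below is a toy under stated assumptions: the readings are made up, and each function is a deliberately simple stand-in for its stage (for example, "analyze" just takes a mean), not a recommended implementation.

```python
from statistics import mean

def capture():
    """Capture: gather raw data. Hypothetical readings; None and 'bad' are defects."""
    return ["12", "15", None, "11", "bad", "14"]

def maintain(raw):
    """Maintain: clean the raw data into a usable form, dropping unusable entries."""
    cleaned = []
    for value in raw:
        try:
            cleaned.append(float(value))
        except (TypeError, ValueError):
            continue  # skip missing or malformed entries
    return cleaned

def process(values):
    """Process: summarize the prepared data (count and range)."""
    return {"count": len(values), "min": min(values), "max": max(values)}

def analyze(values):
    """Analyze: a stand-in analysis, here simply the mean."""
    return mean(values)

def communicate(summary, average):
    """Communicate: present the result in a readable form."""
    return (f"{summary['count']} valid readings, "
            f"range {summary['min']}-{summary['max']}, mean {average:.1f}")

values = maintain(capture())
report = communicate(process(values), analyze(values))
print(report)
```

Each stage consumes the output of the one before it, which is why defects removed in the Maintain stage never reach the analysis.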
Class Activity 1
1. Justify the role of a data scientist.
2. What are the prerequisites for data science?
3. How can one observe different types of data in "identifying a particular type of disease"?
4. What are the responsibilities of a Data Scientist, a Data Analyst, and a Data Engineer?