Introduction to Data Warehouse
Data warehousing (DW) is the process of collecting and managing data from varied sources to provide meaningful business insights. A data warehouse is typically used to connect and analyze business data from heterogeneous sources. It is a database kept separate from the organization's operational database, and it is not updated frequently. It holds consolidated historical data, which helps the organization analyze its business and helps executives organize, understand, and use their data to make strategic decisions. Data warehouse systems also help integrate diverse application systems.
Importance of Data Warehouse
- Subject-oriented
- Integrated
- Time-variant
- Non-volatile
Characteristics/Advantages of Data Warehouse
- Clean data
- Multiple index types
- Secured data and access
- Query processing with multiple options
- Enhanced business intelligence
- Increased system and query performance
- Timely access to data
- Enhanced data consistency and quality
- High return on investment
- Increased revenue
Functions of Data Warehouse
- Data consolidation
- Data cleaning
- Data integration
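The three functions listed above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two source systems (`crm_rows`, `shop_rows`), their field names, and the sample values are hypothetical, and a real warehouse would do this work in an ETL tool rather than in plain Python.

```python
from collections import defaultdict

# Hypothetical extracts from two operational systems with different conventions.
crm_rows = [
    {"customer": " Alice ", "amount": "120.50"},
    {"customer": "bob", "amount": "80"},
    {"customer": "bob", "amount": None},  # missing value, dropped during cleaning
]
shop_rows = [
    {"cust_name": "ALICE", "total_cents": 3000},
    {"cust_name": "Carol", "total_cents": 4550},
]


def clean(rows):
    """Data cleaning: drop incomplete rows and normalise names and types."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:
            continue
        cleaned.append({"customer": row["customer"].strip().lower(),
                        "amount": float(row["amount"])})
    return cleaned


def integrate(crm, shop):
    """Data integration: map both sources onto one shared schema."""
    unified = clean(crm)
    for row in shop:
        unified.append({"customer": row["cust_name"].strip().lower(),
                        "amount": row["total_cents"] / 100})
    return unified


def consolidate(rows):
    """Data consolidation: combine all records into one total per customer."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


totals = consolidate(integrate(crm_rows, shop_rows))
print(totals)
```

The point of the sketch is the separation of concerns: cleaning fixes each source on its own terms, integration reconciles the schemas, and consolidation produces the single view that analysts query.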
Data Warehouse Architecture
Data warehouses usually have a three-tier architecture:
- Bottom tier (data warehouse server): the data warehouse server, which is almost always an RDBMS.
- Middle tier (OLAP server): an OLAP server for fast querying of the data warehouse.
- Top tier (front-end tools): the front-end client layer, which holds the query, reporting, analysis, and data mining tools.
[Figure: Data Warehouse Architecture]
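The three-tier split can be sketched in a few lines, under loud assumptions: an in-memory SQLite database stands in for the warehouse RDBMS, a single aggregation function stands in for an OLAP server, and a text formatter stands in for the front-end tools. The `sales` table, its columns, and the function names `rollup` and `report` are hypothetical.

```python
import sqlite3

# Bottom tier: a relational store. In-memory SQLite stands in for the warehouse RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("north", 2022, 100.0), ("north", 2023, 150.0),
     ("south", 2022, 80.0), ("south", 2023, 120.0)],
)

# Only these column names may be interpolated into the query text.
ALLOWED_DIMENSIONS = {"region", "year"}


def rollup(dimension):
    """Middle tier: an OLAP-style aggregation of sales along one dimension."""
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    query = (f"SELECT {dimension}, SUM(amount) FROM sales "
             f"GROUP BY {dimension} ORDER BY {dimension}")
    return conn.execute(query).fetchall()


def report(dimension):
    """Top tier: a front-end tool that formats query results for the user."""
    return [f"{key}: {total:.2f}" for key, total in rollup(dimension)]


for line in report("region"):
    print(line)
```

Each tier only talks to the one directly below it: `report` never issues SQL itself, and `rollup` never formats output, which mirrors how the tiers are kept separate in the architecture.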
Introduction to Data Mining
Data mining is one of the most useful techniques for helping entrepreneurs, researchers, and individuals extract valuable information from huge sets of data. The term is often used interchangeably with Knowledge Discovery in Databases (KDD), although strictly speaking data mining is one step of the KDD process. The knowledge discovery process includes data cleaning, data integration, data selection, data transformation, data mining, pattern evaluation, and knowledge presentation.
[Figure: Knowledge Discovery in Databases (KDD)]
Steps of KDD
1. Building an understanding of the application domain
2. Choosing and creating a data set on which discovery will be performed
3. Preprocessing and cleansing
4. Data transformation
5. Choosing the data mining task (prediction or description)
6. Selecting the data mining algorithm
7. Applying the data mining algorithm
8. Evaluation
9. Using the discovered knowledge
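The middle steps of this process (preprocessing, transformation, mining, evaluation) can be sketched as a small pipeline. This is only an illustration under assumptions: the shopping-basket data is made up, the mining step is a simple co-occurrence count rather than any particular published algorithm, and the 0.5 support threshold is arbitrary.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw transactions; None marks a corrupt record.
raw = [
    ["Bread", "milk"], ["bread", "Milk", "eggs"], None,
    ["milk", "eggs"], ["bread", "milk"], [],
]


def preprocess(records):
    """Preprocessing and cleansing: drop missing or empty records."""
    return [r for r in records if r]


def transform(records):
    """Data transformation: normalise case and represent each basket as a set."""
    return [frozenset(item.lower() for item in r) for r in records]


def mine(baskets):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    return counts


def evaluate(counts, n_baskets, min_support=0.5):
    """Evaluation: keep only pairs whose support reaches the threshold."""
    return {pair: c / n_baskets for pair, c in counts.items()
            if c / n_baskets >= min_support}


baskets = transform(preprocess(raw))
patterns = evaluate(mine(baskets), len(baskets))
print(patterns)
```

The first two steps (domain understanding, choosing the data set) and the last (using the knowledge) are human decisions and are deliberately left out of the code.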
Data Mining Scope
- Automated prediction of trends and behaviors
- Automated discovery of previously unknown patterns
- Databases can be larger in both depth and breadth:
  - More columns (breadth)
  - More rows (depth)
Data Mining Techniques
- Artificial neural networks
- Decision trees
- Genetic algorithms
- Classification
- Clustering
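To make one of these techniques concrete, here is a minimal sketch of clustering, assuming k-means on one-dimensional data. The choice of k-means, the function name `kmeans_1d`, the customer spend figures, and the starting centroids are all assumptions for illustration; the list above does not name a specific clustering algorithm.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Cluster numbers around k centroids by repeated assign-and-update steps."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assignment step: each value joins its nearest centroid.
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical monthly spend per customer, with two visible groups.
spend = [10, 12, 11, 50, 52, 49]
centroids, clusters = kmeans_1d(spend, centroids=[10, 50])
print(centroids, clusters)
```

In a business setting the resulting groups would be read as segments, for example low and high spenders. The outcome of k-means depends on the starting centroids, so production code usually tries several initialisations.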
Business Applications of Data Mining and Data Warehouse
- Government
- Finance
- Direct marketing
- Medicine
- Manufacturing
- Churn analysis
- Market segmentation
- Trend analysis
- Fraud detection
- Web marketing