Data warehousing & Multidimensional model MSc Computer Science Dr. RIYA SIL DEPT OF COMPUTER SCIENCE, PG
Introduction to Data Warehouse Modelling The process of developing the schemas for the data warehouse's detailed and summarized information is known as data warehouse modeling [ a schema refers to the organization or structure of data within the data warehouse]. The purpose of data warehouse modeling is to create a schema that describes the reality, or at least a portion of the reality, that the data warehouse must support. Data warehouse modeling is essential in building a data warehouse for two essential reasons. The first reason is to visualize the relationships among the warehouse data, and the second reason is to optimize the schema to make a well-structured data warehouse.
Data Modeling The process of developing a visual representation of an entire information system or sections to express connections between data points and structures is known as data modeling. Data modeling ensures that all data objects required by the Database are correctly represented.
The life cycle of Data Modeling Gathering Business Requirements Conceptual Data Modeling Logical Data Modeling Physical Data Modeling Development of Schema / the database Maintenance of data Model time to time as per requirement
Types of the data model Data modeling facilitates the creation of a conceptual model and the establishment of relationships between items. Conceptual Data Model Logical Data Model Physical Data Model
Conceptual Data Model Conceptual models are usually built as part of gathering early project requirements. The conceptual model defines what the system contains. This data model focuses on finding the data used in a business instead of the processing flow. The main objective of this data model is to organize and establish business rules and concepts. It includes entity classes, properties and constraints, relationships, and the necessary security and data integrity requirements. Its primary purpose is to establish entities, attributes, and relationships between two entities. Business stakeholders or data architects create it Its purpose is to create various business rules.
Logical Data Model The logical Data Model map of rules and Data Structures includes the required data, such as tables, columns, etc. A logical data model consists of tables, documents, descriptions, etc. The document structures are defined in this model. This data model is always present in the root package object. This type of data model helps create the physical model base. There is no secondary or primary key defined in this model. It defines the structure of data elements and their relationships also. Business analysts and data architects create it.
Physical Data Model In a physical data model, we care about how the system can store the actual data. It manages the replication, shards, etc., physically. It defines the components and services which are required to build a database. It is created by using the database language and queries. The physical data model provides database column keys, constraints, and RDBMS features. We create various schemas, abstraction of schemas, and different mapping types in these data models. Database administrators and developers create it. It is the actual implementation of the Database.
Types of Data Warehouses Models M ainly there are three different types of data warehouse models Enterprise warehouse Data Mart Virtual warehouse
Enterprise Warehouse An Enterprise database brings together various functional areas of an organization and brings them together in a unified manner. An enterprise data warehouse structures and stores all company's business data for analytics querying and reporting. It collects all of the information about subjects spanning the entire organization. The goal of the Enterprise data warehouse is to provide a complete overview of any particular object in the data model. It mainly contains detailed summarized information and can range from a few gigabytes to hundreds of gigabytes, terabytes, or maybe beyond.
Enterprise Data Warehouse (EDW) An Enterprise Data Warehouse (EDW) is a centralized database that stores large volumes of structured data from various sources within an organization. It serves as a single, comprehensive repository for business-related information. The EDW is designed to support business intelligence and analytics by providing a unified and consistent view of the organization's data. This enables decision-makers to analyze trends, gain insights, and make informed decisions based on a holistic understanding of the company's information. An Enterprise Data Warehouse acts as a powerful tool for managing, organizing, and extracting valuable knowledge from diverse data sources across an entire enterprise.
Data Mart It is a data store designed for a particular department of an organization or company. Data Mart is a subset of the data warehouse usually oriented to a specific task. Data that we use for a particular department or purpose is called data mart.
Data Mart A Data Mart is like a smaller, focused version of an Enterprise Data Warehouse. It's a specialized database that stores and organizes specific types of data relevant to a particular department or business unit within an organization. Unlike a comprehensive Enterprise Data Warehouse, a Data Mart is designed for a specific purpose, making it easier for teams to access and analyze the data they need without dealing with the complexity of the entire organization's data. Think of it as a targeted and accessible data store tailored to meet the unique requirements of a specific team or project.
Reason for creating a data mart Easy access of frequently used data It improves end-user response time It can be easily creation of data mart Less cost in building a data mart.
Virtual warehouse A virtual data warehouse gives you a quick overview of your data. It has metadata in it. It connects to several data sources with the use of middleware. They are quick because they allow users to filter the most critical data from various older applications. A virtual warehouse is easy to set up, but it requires more database server capacity.
Virtual Warehouse A Virtual Warehouse is a cloud-based, on-demand computing environment that allows users to access and analyze data without owning or maintaining physical hardware. It's like renting computing power and storage in the cloud for temporary use, enabling organizations to scale up or down based on their specific needs. This flexibility makes it easier to perform data analytics and run applications without the hassle of managing physical servers, making computing resources more agile and cost-effective.
Dr. Riya Sil Assistant Professor DEPT OF COMPUTER SCIENCE, PG