cosmodb ppt personal.pptx

BiharDarshan 15 views 29 slides Mar 08, 2025


Slide Content

CENTRAL UNIVERSITY OF SOUTH BIHAR
Microsoft Uses Cosmos DB
SUBMITTED BY:
ABHINEET KUMAR (CUSB2302312001)
MOHIT KUMAR (CUSB2302312010)
SUBMITTED TO:
DR. PRABHAT RANJAN, ASSOCIATE PROFESSOR, CUSB

CONTENTS
- Introduction
- Microsoft Azure Cosmos DB
- Features of Azure Cosmos DB
- Architecture of Azure Cosmos DB
- Storage Mechanism
- Some Examples
- Comparing Cosmos DB and MongoDB
- Advantages & Disadvantages
- Conclusion

Microsoft Corporation
- Founded: April 4, 1975
- Founders: Bill Gates and Paul Allen
- Headquarters: Redmond, Washington, USA
- CEO: Satya Nadella
- Industry: Technology
- Revenue: $211.9 billion (FY 2023)
- Employees: ~221,000 (2023)

Microsoft is a global leader in technology, known for software products such as the Windows operating system and the Microsoft Office suite. It also provides a range of cloud services through Azure, develops hardware such as Surface devices and Xbox gaming consoles, and is a key player in AI development, with investments in OpenAI and generative AI products such as Copilot in Office applications. Microsoft’s cloud segment, Azure, is a major revenue driver, positioning the company as a leader in cloud computing.

AZURE COSMOS DB
Azure Cosmos DB is a globally distributed, fully managed NoSQL database service by Microsoft, designed for mission-critical applications that need high scalability and low latency.

Use Cases: Ideal for applications requiring real-time data access, such as IoT, gaming, retail, and social media platforms.

Some companies that use Cosmos DB:
- Uber: Leverages Cosmos DB for global consistency and low-latency needs.
- Toyota: Uses Cosmos DB for connected car services and managing data in real time.

History of Azure Cosmos DB
- 2010: Origin as "Project Florence". Developed internally to store large-scale unstructured data for Microsoft services.
- 2014: Launch as DocumentDB. Initially released as DocumentDB, focusing on document storage with limited functionality.
- 2017: Rebranded and publicly released as Azure Cosmos DB. Evolved into a globally distributed, multi-model database with horizontal partitioning for scalability.
- 2018: Multi-master feature announced. Introduced multi-master capabilities, allowing multiple write regions and boosting scalability and reliability.

Why Do We Need It?
- **Rapid Scaling & Low Latency**: Azure Cosmos DB guarantees low latency at the 99th percentile, with read and write latencies typically within 10 milliseconds.
- **Multi-Model Support**: Supports key-value, document, graph, and column models via APIs such as NoSQL, MongoDB, Cassandra, Gremlin, and Table.
- **High Performance & Availability**: SSD-backed storage, millisecond latencies, 99.99% availability, and multi-region replication with failover options.
- **Flexible Consistency Models**: Offers 5 models (Strong, Bounded Staleness, Session, Consistent Prefix, Eventual) to balance cost, performance, and consistency.
- **Elastic Scaling**: Throughput and storage scale independently; throughput is measured in Request Units (RUs), and the service handles trillions of requests per day.
- **Automatic Partitioning**: Uses partition keys to optimize performance and scalability for large-scale applications.
- **Change Data Capture (CDC)**: Real-time data monitoring via the Change Feed, supporting event-driven workflows.
- **Cost-Effective for Serverless Apps**: Ideal for web, mobile, IoT, and gaming applications with high throughput and unlimited storage.
- **SLAs for Mission-Critical Workloads**: Financially backed SLAs for throughput, latency, availability, and consistency.
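The five consistency models listed above form a spectrum from strongest to weakest. The following is only an illustrative sketch of that ordering in plain Python: the `ConsistencyLevel` enum and the `is_no_stronger_than` helper are hypothetical names made up for this example and are not part of the Cosmos DB SDK.

```python
from enum import IntEnum


class ConsistencyLevel(IntEnum):
    """Hypothetical enum: the five levels, ordered weakest (0) to strongest (4)."""
    EVENTUAL = 0
    CONSISTENT_PREFIX = 1
    SESSION = 2
    BOUNDED_STALENESS = 3
    STRONG = 4


def is_no_stronger_than(requested, baseline):
    """Return True if `requested` is the same as or weaker than `baseline`."""
    return requested <= baseline


# List the levels from strongest to weakest.
for level in sorted(ConsistencyLevel, reverse=True):
    print(level.name)

# Compare two levels against a Session baseline.
baseline = ConsistencyLevel.SESSION
print(is_no_stronger_than(ConsistencyLevel.EVENTUAL, baseline))  # True
print(is_no_stronger_than(ConsistencyLevel.STRONG, baseline))    # False
```

Stronger levels generally cost more in latency and throughput, which is the trade-off the slide refers to.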

Key Features of Azure Cosmos DB
- **Globally Distributed**: One-click data replication across multiple Azure regions.
- **Linearly Scalable**: Horizontally scales to handle millions of transactions per second.
- **Schema-Agnostic Indexing**: Automatically indexes all data without schema or index management.
- **Multi-Model Database**: Supports key-value, document, graph, and column-family data models with consistent features.
- **Multi-API & Multi-Language Support**: Compatible with the SQL, MongoDB, Cassandra, and Gremlin APIs, with SDKs for Java, .NET, Python, and more.
- **Multi-Consistency Support**: Offers 5 consistency levels: Eventual, Consistent Prefix, Session, Bounded Staleness, and Strong.
- **Automatic Indexing**: Every property is automatically indexed, with custom indexing options available.
- **High Availability**: 99.999% for multi-region accounts with multi-region writes; 99.99% for single-region accounts. Supports automatic failover.
- **Guaranteed Low Latency**: 10 ms read/write latency at the 99th percentile.
- **Multi-Master Support**: Multi-master writes across regions, enabling elastic scaling for reads and writes.
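To make the availability figures above more concrete, the percentages can be converted into a maximum downtime per year. This is plain arithmetic, not an Azure API; the function name `max_downtime_minutes` is made up for this sketch, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def max_downtime_minutes(availability_percent):
    """Minutes per year a service may be down while still meeting the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for sla in (99.99, 99.999):
    print(f"{sla}% -> {max_downtime_minutes(sla):.2f} minutes/year")
# 99.99% -> 52.56 minutes/year
# 99.999% -> 5.26 minutes/year
```

Going from 99.99% to 99.999% cuts the allowed downtime by a factor of ten, from under an hour to about five minutes per year.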

Architecture of Azure Cosmos DB

This diagram illustrates the architecture of an **e-commerce platform** utilizing **Azure Cosmos DB** for both recommendation services and order transaction management.

### Key Components:
1. **Shoppers**: The end-users who interact with the e-commerce platform.
2. **E-commerce Store**: The interface for customers to browse products and make purchases.

### **Online Recommendations Service**:
- **Azure Container Service (Recommendations API)**: Provides product recommendations to shoppers based on their interactions and preferences.
- **Azure Cosmos DB (Product + User Vectors)**: Stores product and user data vectors used for recommendations.
- **Apache Spark on Azure Databricks**: Processes large datasets (likely related to product and user behavior) to generate recommendations.

### **Order Transactions**:
- **Azure API Apps (Customer Order)**: Manages customer order transactions when users make purchases.
- **Azure Cosmos DB (Customer Order)**: Stores customer order details.
- **Change Feed**: Tracks changes to customer order data in Cosmos DB, likely for real-time updates or further data processing.
- **Apache Spark on Azure Databricks**: Uses the change feed data to analyze orders and transactions in real time or in batches.

This architecture leverages Azure Cosmos DB for both real-time product recommendations and managing customer orders, with additional processing powered by Apache Spark on Azure Databricks for analytics.
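The Change Feed step in the order-transactions flow can be pictured with a toy in-memory model. This is a sketch of the idea only: `ToyContainer`, `upsert`, and `read_changes` are hypothetical names, not the Cosmos DB SDK, and the toy keeps every write in its log, which is a simplification of how the real service exposes changes.

```python
class ToyContainer:
    """In-memory stand-in for a container that records an ordered log of changes."""

    def __init__(self):
        self.items = {}       # current state, keyed by item id
        self.change_log = []  # append-only list of item snapshots

    def upsert(self, item):
        """Store the item and append a snapshot of it to the change log."""
        self.items[item["id"]] = item
        self.change_log.append(dict(item))

    def read_changes(self, continuation=0):
        """Return changes recorded after `continuation`, plus a new continuation value."""
        changes = self.change_log[continuation:]
        return changes, len(self.change_log)


orders = ToyContainer()
orders.upsert({"id": "o1", "status": "placed"})
orders.upsert({"id": "o2", "status": "placed"})

# A downstream consumer (for example an analytics job) reads the first batch.
batch, token = orders.read_changes()
print(len(batch))  # 2

# Later writes are picked up from where the consumer left off.
orders.upsert({"id": "o1", "status": "shipped"})
later, token = orders.read_changes(token)
print(later)  # [{'id': 'o1', 'status': 'shipped'}]
```

The continuation value is what lets a consumer resume without reprocessing earlier orders.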

DATA STORED

Retail Marketing

This diagram shows an Azure-based architecture for a web app:
- **Browser**: User interface for interacting with the app.
- **Azure App Services**: Manages web requests and connects the app to backend services.
- **Azure Cosmos DB**: Stores structured data in containers.
- **Azure Blob Storage**: Holds unstructured data like files.
- **Azure Search**: Indexes Cosmos DB data to provide search functionality.

Data flows from the browser through App Services, interacting with Cosmos DB and Blob Storage, with Azure Search enabling fast search capabilities.

STORAGE MECHANISM

Partitioning & Data Distribution
- **Logical vs. Physical Partitions**: Data is split into logical partitions (based on partition keys), which are stored in physical partitions.
- **Automatic Scaling**: As data grows, Cosmos DB automatically splits physical partitions to handle more data without downtime.

Global Distribution & Multi-Region Replication
- Replicates data across multiple Azure regions, ensuring low-latency access and high availability.
- Offers multiple consistency models (e.g., Strong, Eventual) to balance performance and consistency.

Request Units (RUs) & Throughput Scaling
- Measures database operations (reads, writes, queries) in RUs for unified pricing.
- Evenly distributes throughput across partitions to avoid throttling and high latency.

Encryption at Rest & in Transit
- **Encryption at Rest**: Uses AES-256 to encrypt data on SSDs.
- **Encryption in Transit**: Uses secure HTTPS connections for data transmission.

Indexing & Storage Optimization
- Automatically indexes data with customizable policies for performance and cost efficiency.
- Supports Time-to-Live (TTL) policies to purge old data and optimize storage.

Backup & Disaster Recovery
- Continuous backups stored in Azure Blob Storage for recovery in case of accidental deletion or corruption.
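The relationship between partition keys, physical partitions, and throughput can be sketched in a few lines. This is an assumption-laden illustration, not how the service is implemented: Cosmos DB's actual hash function and range assignment are internal, so SHA-256 with a modulo stands in here, and the function names `physical_partition` and `rus_per_partition` are hypothetical.

```python
import hashlib


def physical_partition(partition_key, partition_count):
    """Map a partition key value to a physical partition index (illustrative hashing only)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def rus_per_partition(provisioned_rus, partition_count):
    """Throughput available to each physical partition under an even split."""
    return provisioned_rus / partition_count


# Every item sharing a partition key value lands in the same physical partition.
for key in ("in-progress", "completed", "pending"):
    print(key, "->", physical_partition(key, 4))

# 10,000 RU/s spread evenly over 4 physical partitions.
print(rus_per_partition(10_000, 4))  # 2500.0
```

Because throughput is split evenly, a partition key that concentrates traffic on one value can exhaust that partition's share while the others sit idle, which is why the choice of partition key matters.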

Storage Mechanism

from azure.cosmos import CosmosClient, PartitionKey

# Account endpoint and key. The key is shown as a placeholder; never publish a real account key.
url = 'https://cosmosrgeastus467823cc-9237-4d5c-b38edb.documents.azure.com:443/'
key = '<your-cosmos-db-account-key>'

# Connect to the account, then to the existing database and container
client = CosmosClient(url, credential=key)
database_name = 'bigdata24'
container_name = 'bogdata'
database = client.get_database_client(database_name)
container = database.get_container_client(container_name)

# Ten sample documents; each needs a unique 'id'
items_to_insert = [
    {'id': '1', 'ItemID': '101', 'name': 'Mohit Kumar',
     'description': 'Inserting first data item programmatically into Cosmos DB',
     'createdDate': '2024-10-24', 'status': 'in-progress'},
    {'id': '2', 'ItemID': '102', 'name': 'Abhineet Kumar',
     'description': 'Inserting second data item programmatically into Cosmos DB',
     'createdDate': '2024-10-23', 'status': 'completed'},
    {'id': '3', 'ItemID': '103', 'name': 'Saurav Yadav',
     'description': 'Inserting third data item programmatically into Cosmos DB',
     'createdDate': '2024-10-22', 'status': 'pending'},
    {'id': '4', 'ItemID': '104', 'name': 'Sonu Kumar',
     'description': 'Inserting fourth data item programmatically into Cosmos DB',
     'createdDate': '2024-10-21', 'status': 'in-progress'},
    {'id': '5', 'ItemID': '105', 'name': 'Saquib Sabir',
     'description': 'Inserting fifth data item programmatically into Cosmos DB',
     'createdDate': '2024-10-20', 'status': 'completed'},
    {'id': '6', 'ItemID': '106', 'name': 'Pooja Kumari',
     'description': 'Inserting sixth data item programmatically into Cosmos DB',
     'createdDate': '2024-10-19', 'status': 'pending'},
    {'id': '7', 'ItemID': '107', 'name': 'Pritty Kumari',
     'description': 'Inserting seventh data item programmatically into Cosmos DB',
     'createdDate': '2024-10-18', 'status': 'in-progress'},
    {'id': '8', 'ItemID': '108', 'name': 'Rishi Yadav',
     'description': 'Inserting eighth data item programmatically into Cosmos DB',
     'createdDate': '2024-10-17', 'status': 'completed'},
    {'id': '9', 'ItemID': '109', 'name': 'Deepali',
     'description': 'Inserting ninth data item programmatically into Cosmos DB',
     'createdDate': '2024-10-16', 'status': 'pending'},
    {'id': '10', 'ItemID': '110', 'name': 'Sunny Kumari',
     'description': 'Inserting tenth data item programmatically into Cosmos DB',
     'createdDate': '2024-10-16', 'status': 'in-progress'},
]

# Insert each document into the container
for bigdata in items_to_insert:
    container.create_item(body=bigdata)
print("10 unique items inserted successfully.")
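One practical wrinkle with the insert loop above: `create_item` is expected to fail with a conflict error if an item with the same `id` already exists, so running the script a second time would stop at the first item. A hedged alternative is to use the SDK's `upsert_item`, which inserts or replaces. The sketch below wraps that in a hypothetical helper, `load_items`, and exercises it against a made-up `FakeContainer` so it runs without an Azure account; with a real container object the same helper should apply.

```python
def load_items(container, items):
    """Upsert each item so the load can be re-run without ID conflicts."""
    written = 0
    for item in items:
        container.upsert_item(body=item)  # insert if new, replace if the id exists
        written += 1
    return written


class FakeContainer:
    """Hypothetical stand-in for a Cosmos DB container, used only for this demo."""

    def __init__(self):
        self.store = {}

    def upsert_item(self, body):
        self.store[body["id"]] = body
        return body


sample = [
    {"id": "1", "name": "Mohit Kumar", "status": "in-progress"},
    {"id": "2", "name": "Abhineet Kumar", "status": "completed"},
]
fake = FakeContainer()
print(load_items(fake, sample))  # 2
print(load_items(fake, sample))  # 2 again; the second run overwrites instead of failing
print(len(fake.store))           # 2
```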

# Query 1: return every item in the container
query = "SELECT * FROM bigdata"
items = list(container.query_items(query=query, enable_cross_partition_query=True))
for item in items:
    print(item)

# Query 2: items whose status is 'in-progress'
query = "SELECT * FROM bigdata WHERE bigdata.status = 'in-progress'"
items = list(container.query_items(query=query, enable_cross_partition_query=True))
for item in items:
    print(item)

# Query 3: items created after 2024-10-20 (ISO dates compare correctly as strings)
query = "SELECT * FROM bigdata WHERE bigdata.createdDate > '2024-10-20'"
items = list(container.query_items(query=query, enable_cross_partition_query=True))
for item in items:
    print(item)

# Query 4: items with a specific name
query = "SELECT * FROM bigdata WHERE bigdata.name = 'Mohit Kumar'"
items = list(container.query_items(query=query, enable_cross_partition_query=True))
for item in items:
    print(item)

# Query 5: count all items
query = "SELECT VALUE COUNT(1) FROM bigdata"
count = list(container.query_items(query=query, enable_cross_partition_query=True))
print(f"Total number of items: {count[0]}")
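The queries above embed their filter values directly in the query string. When the value comes from user input, a parameterized query is usually the safer pattern; `query_items` accepts a `parameters` list of name/value pairs for this. The sketch below is one possible shape: `find_by_status` and `RecordingContainer` are hypothetical names, and the fake container exists only so the example runs without an Azure account.

```python
def find_by_status(container, status):
    """Run the status filter as a parameterized query instead of string concatenation."""
    query = "SELECT * FROM bigdata WHERE bigdata.status = @status"
    parameters = [{"name": "@status", "value": status}]
    return list(container.query_items(
        query=query,
        parameters=parameters,
        enable_cross_partition_query=True,
    ))


class RecordingContainer:
    """Hypothetical stand-in that filters an in-memory list and records the call."""

    def __init__(self, items):
        self.items = items
        self.last_call = None

    def query_items(self, query, parameters=None, enable_cross_partition_query=False):
        self.last_call = (query, parameters, enable_cross_partition_query)
        wanted = parameters[0]["value"]
        return (item for item in self.items if item["status"] == wanted)


fake = RecordingContainer([
    {"id": "1", "status": "in-progress"},
    {"id": "2", "status": "completed"},
    {"id": "4", "status": "in-progress"},
])
print([item["id"] for item in find_by_status(fake, "in-progress")])  # ['1', '4']
```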

| Category | Azure Cosmos DB | MongoDB |
|---|---|---|
| Provider | Managed service by Microsoft Azure. | Source-available database developed by MongoDB Inc. |
| Database Type | Globally distributed, multi-model NoSQL database. | Document-oriented NoSQL database. |
| API Compatibility | Supports multiple APIs: SQL, MongoDB, Cassandra, Gremlin (graph), and Table. | Primarily supports its own MongoDB API (document model). |
| Scalability | Elastic scaling of throughput and storage globally, with auto-scaling options. | Scales through sharding, which requires manual setup and configuration. MongoDB Atlas offers auto-scaling. |
| Performance | Sub-10 ms latency with automatic indexing of all data and low-latency reads/writes globally. | Can achieve low latency with proper indexing and architecture but may require optimization for large datasets. |
| Data Models | Multi-model: supports key-value, document, graph, and columnar data models. | Document-based model using BSON (Binary JSON) format. |
| Pricing Model | Pay-as-you-go based on throughput (RU/s) and storage; also offers per-request pricing. | Pricing depends on the number of instances and resources in use. MongoDB Atlas offers pay-as-you-go pricing. |
| Replication | Automatic multi-master replication across regions, with failover support. | Replica sets are available, but global replication requires additional setup. |
| Security | Offers encryption at rest, role-based access control, VNET, and Private Link support. | Encryption, access control, and security features are available, but configuration depends on the environment (e.g., Atlas vs. self-managed). |
| Open-Source | Proprietary, but compatible with open-source APIs such as MongoDB and Cassandra. | Source code is publicly available (under the Server Side Public License since 2018); MongoDB Inc.'s cloud offering, Atlas, provides managed services. |
| Query Language | SQL-like query language (with the SQL API); supports MongoDB queries with the MongoDB API. | MongoDB Query Language (MQL). |

Advantages of Cosmos DB:
- Flexible Consistency Models: Offers 5 consistency levels to balance performance and consistency.
- Global Distribution: Replicates data across regions for low latency and high availability.
- Multi-Model Support: Supports document, key-value, graph, and column-family models.
- Automatic Scaling: Scales throughput automatically based on demand.
- Low Latency: Millisecond read and write latencies.
- Elastic Throughput: Scales resources efficiently.
- High Availability: 99.99% uptime SLA.
- Fully Managed: No need to handle infrastructure, scaling, or patches.
- Azure Integration: Works seamlessly with other Azure services.

Disadvantages of Cosmos DB:
- Cost: Can become expensive at scale.
- Learning Curve: Complexity in consistency models and scaling.
- Complex Querying: Limited compared to traditional SQL databases.
- Vendor Lock-in: Difficult to migrate to other cloud providers.
- Limited Relational Support: Not ideal for complex relational data.
- Manual Indexing: May require custom indexing in some cases.
- Smaller Community: Less support compared to popular databases like MySQL.

Conclusion
Azure Cosmos DB is a powerful, globally distributed, and fully managed NoSQL database service ideal for applications requiring low-latency access, automatic scaling, and multi-model flexibility. Its strengths lie in global distribution, high availability, and seamless integration with the Azure ecosystem. However, it comes with challenges such as a steeper learning curve, higher costs at scale, and limited relational data handling. Cosmos DB is a great choice for large-scale, globally distributed applications that need high performance and flexibility, but it may not be the best fit for smaller projects or those deeply reliant on traditional relational data models.