CENTRAL UNIVERSITY OF SOUTH BIHAR
Microsoft Uses Cosmos DB
SUBMITTED BY: ABHINEET KUMAR (CUSB2302312001), MOHIT KUMAR (CUSB2302312010)
SUBMITTED TO: DR. PRABHAT RANJAN, ASSOCIATE PROFESSOR, CUSB
Contents Introduction Microsoft Azure Cosmos DB Features of Azure Cosmos DB Architecture of Cosmos DB Storage Mechanism Some Examples Comparing Cosmos DB and MongoDB Advantages & Disadvantages Conclusion
Microsoft
Microsoft Corporation
Founded: April 4, 1975
Founders: Bill Gates and Paul Allen
Headquarters: Redmond, Washington, USA
CEO: Satya Nadella
Industry: Technology
Revenue: $211.9 billion (FY 2023)
Employees: ~221,000 (2023)

Microsoft is a global leader in technology, known for software products such as the Windows operating system and the Microsoft Office suite. It also provides a range of cloud services through Azure, develops hardware such as Surface devices and Xbox gaming consoles, and is a key player in AI development, with investments in OpenAI and generative AI products such as Copilot in Office applications. Azure, Microsoft's cloud segment, is a major revenue driver and positions the company as a leader in cloud computing.
AZURE COSMOS DB
Azure Cosmos DB is a globally distributed, fully managed NoSQL database service from Microsoft, designed for mission-critical applications that need high scalability and low latency.
Use Cases: Ideal for applications requiring real-time data access, such as IoT, gaming, retail, and social media platforms.
History of Azure Cosmos DB
2010 – Origin as "Project Florence": Developed internally to store large-scale unstructured data for Microsoft services.
2014 – Launch as DocumentDB: Initially released as DocumentDB, focusing on document storage with limited functionality.
2017 – Rebranded and publicly released as Azure Cosmos DB: Evolved into a globally distributed, multi-model database with horizontal partitioning for scalability.
2018 – Multi-master feature announced: Introduced multi-master capabilities, allowing multiple write regions and boosting scalability and reliability.

Key Features & Innovations
Elastic Scalability: Automatic scaling of throughput and storage.
Global Distribution: Low-latency access across multiple Azure regions.
Multiple Data Models: Supports document, key-value, graph, and more.
99.99% Availability: High availability with comprehensive SLAs.
Why We Need Azure Cosmos DB
Rapid Scaling & Low Latency: Ideal for apps that need fast scaling and low-latency responses.
Multi-Model Support: Supports key-value, document, graph, and columnar models via APIs such as MongoDB, Gremlin, NoSQL, and Table.
High Performance & Availability: SSD-backed storage, millisecond latencies, 99.99% availability, and multi-region replication with failover options.
Flexible Consistency Models: Offers 5 models (Strong, Bounded Staleness, Session, Consistent Prefix, Eventual) to balance cost and performance.
Elastic Scaling: Throughput and storage scale independently; the service handles trillions of requests per day, with throughput measured in Request Units (RUs).
Automatic Partitioning: Uses partition keys to optimize performance and scalability for large-scale applications.
Change Data Capture (CDC): Real-time data monitoring via the Change Feed, supporting event-driven workflows.
Cost-Effective for Serverless Apps: Ideal for web, mobile, IoT, and gaming applications with high throughput and unlimited storage.
SLAs for Mission-Critical Workloads: Financially backed SLAs for throughput, latency, availability, and consistency.
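To make the Change Feed idea concrete, here is a minimal in-memory sketch. It is not the Azure Cosmos DB SDK: `ToyContainer`, `upsert`, and `read_changes` are hypothetical names, and the integer continuation position only stands in for whatever continuation mechanism the real service uses. It shows the general pattern of an ordered change log that a consumer reads incrementally.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyContainer:
    """In-memory stand-in for a container (hypothetical, not the real SDK)."""

    items: dict[str, dict[str, Any]] = field(default_factory=dict)
    change_log: list[dict[str, Any]] = field(default_factory=list)

    def upsert(self, item: dict[str, Any]) -> None:
        # Keep the latest version by id and append a copy to the ordered log.
        self.items[item["id"]] = item
        self.change_log.append(dict(item))

    def read_changes(self, continuation: int = 0) -> tuple[list[dict[str, Any]], int]:
        # Return the changes recorded after `continuation` and the new position.
        return self.change_log[continuation:], len(self.change_log)


container = ToyContainer()
container.upsert({"id": "order-1", "qty": 2})
container.upsert({"id": "order-2", "qty": 5})

# First poll: the consumer sees both writes and remembers where it stopped.
first_batch, token = container.read_changes()

# A later update shows up only in the next poll.
container.upsert({"id": "order-1", "qty": 3})
second_batch, token = container.read_changes(token)
```

An event-driven consumer would run the polling step in a loop and react to each batch, for example by updating a cache or sending a notification.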
Birth of Azure Cosmos DB
Scalability: Seamlessly scales storage and throughput to meet growing demands.
Low Latency: Optimized for fast read/write operations with low-latency responses.
High Availability: Provides 99.99% availability with multi-region replication.
Flexible Consistency Models: Offers 5 consistency models (Strong, Bounded Staleness, Session, Consistent Prefix, Eventual) to balance performance and correctness.
Multi-Model Support: Accommodates key-value, graph, document, and column-family data models.
API Compatibility: Supports SQL and open-source APIs, enabling cloud-agnostic development.
No Schema Management: Frees developers from managing schemas and indexes.
Cost Efficiency: Designed for low-cost operation, ideal for high-volume applications.
Use Cases: Perfect for handling unstructured data, such as user-generated content for social media.
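The five consistency models form an ordering from strongest to weakest, which can be sketched as an enum. The `Consistency` enum and the `effective_level` helper are hypothetical names for illustration. The rule that a single request may relax the account default but not strengthen it is an assumption based on our understanding of the service and should be checked against the official documentation.

```python
from __future__ import annotations

from enum import IntEnum


class Consistency(IntEnum):
    # Higher value = stronger guarantee.
    EVENTUAL = 1
    CONSISTENT_PREFIX = 2
    SESSION = 3
    BOUNDED_STALENESS = 4
    STRONG = 5


def effective_level(account_default: Consistency,
                    requested: Consistency | None = None) -> Consistency:
    """Return the level a request runs at in this toy model.

    Assumption: a request may only ask for a level that is equal to or
    weaker than the account default.
    """
    if requested is None:
        return account_default
    if requested > account_default:
        raise ValueError(
            f"{requested.name} is stronger than the default {account_default.name}"
        )
    return requested


default = Consistency.SESSION
relaxed = effective_level(default, Consistency.EVENTUAL)
```

Weaker levels generally trade correctness guarantees for lower latency and cost, which is the balance the slide refers to.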
Key Features of Azure Cosmos DB
Globally Distributed: One-click data replication across multiple Azure regions.
Linearly Scalable: Scales horizontally to handle millions of transactions per second.
Schema-Agnostic Indexing: Automatically indexes all data without schema or index management.
Multi-Model Database: Supports key-value, document, graph, and column-family data models with consistent features.
Multi-API & Multi-Language Support: Compatible with the SQL, MongoDB, Cassandra, and Gremlin APIs, with SDKs for Java, .NET, Python, and more.
Multi-Consistency Support: Offers 5 consistency levels: Eventual, Consistent Prefix, Session, Bounded Staleness, and Strong.
Automatic Indexing: Every property is indexed automatically, with custom indexing options available.
High Availability: 99.999% for multi-region accounts with multi-region writes; 99.99% for single-region accounts. Supports automatic failover.
Guaranteed Low Latency: Reads and writes complete in under 10 ms at the 99th percentile.
Multi-Master Support: Multi-master writes across regions, enabling elastic scaling for reads and writes.
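The gap between 99.99% and 99.999% availability is easier to judge as allowed downtime per year. The short calculation below is plain arithmetic on the percentages quoted above. It assumes a 365-day year, and `allowed_downtime_minutes` is a helper name introduced here for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes per year a service may be unavailable at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


single_region = allowed_downtime_minutes(99.99)   # about 52.6 minutes per year
multi_region = allowed_downtime_minutes(99.999)   # about 5.3 minutes per year
```

Each extra "nine" cuts the permitted downtime by a factor of ten, from under an hour per year to a little over five minutes.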
Architecture of Azure Cosmos DB
Data Stored
Retail Marketing
Storage Mechanism
Partitioning & Data Distribution
Logical vs. Physical Partitions: Data is split into logical partitions (based on partition keys), which are stored in physical partitions.
Automatic Scaling: As data grows, Cosmos DB automatically splits physical partitions to handle more data without downtime.

Global Distribution & Multi-Region Replication
Replicates data across multiple Azure regions, ensuring low-latency access and high availability.
Offers multiple consistency models (e.g., Strong, Eventual) to balance performance and consistency.

Request Units (RUs) & Throughput Scaling
Measures database operations (reads, writes, queries) in RUs for unified pricing.
Distributes throughput evenly across partitions to avoid throttling and high latency.

Encryption at Rest & in Transit
Encryption at Rest: Uses AES-256 to encrypt data on SSDs.
Encryption in Transit: Uses secure HTTPS connections for data transmission.

Indexing & Storage Optimization
Automatically indexes data, with customizable policies for performance and cost efficiency.
Supports Time-to-Live (TTL) policies to purge old data and optimize storage.

Backup & Disaster Recovery
Continuous backups are stored in Azure Blob Storage for recovery in case of accidental deletion or corruption.
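The partitioning and throughput points above can be sketched with a toy model. This is an illustration under stated assumptions, not how the service is implemented: the SHA-256-modulo placement and the function names `physical_partition`, `per_partition_throughput`, and `throttled_partitions` are all hypothetical. The model shows why a partition key that concentrates traffic can exceed its partition's share of the provisioned RUs even when total usage is below the provisioned amount.

```python
from __future__ import annotations

import hashlib
from collections import Counter
from typing import Iterable


def physical_partition(partition_key: str, partition_count: int) -> int:
    """Map a partition key value to a physical partition (toy hash-modulo scheme)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def per_partition_throughput(provisioned_rus: int, partition_count: int) -> float:
    """Provisioned RU/s divided evenly across the physical partitions."""
    return provisioned_rus / partition_count


def throttled_partitions(requests: Iterable[tuple[str, float]],
                         provisioned_rus: int,
                         partition_count: int) -> set[int]:
    """Return the partitions whose RU usage in one second exceeds their share.

    `requests` holds (partition_key, ru_cost) pairs issued within that second.
    """
    budget = per_partition_throughput(provisioned_rus, partition_count)
    used: Counter = Counter()
    for key, cost in requests:
        used[physical_partition(key, partition_count)] += cost
    return {p for p, total in used.items() if total > budget}


# 10,000 RU/s over 4 partitions leaves 2,500 RU/s per partition.
share = per_partition_throughput(10_000, 4)

# 300 requests of 10 RU each on one key use 3,000 RU on a single partition,
# which is over that partition's share although total usage is well below 10,000.
hot = throttled_partitions([("user-1", 10)] * 300, 10_000, 4)
```

Choosing a partition key with many distinct, evenly accessed values spreads the same load over all partitions and keeps each one under its share.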