DATA LAKE Cloud computing
Cloud computing is the on-demand delivery of IT resources such as computing power, storage, and software over the internet on a pay-as-you-go basis.
victordujohn
21 slides
Oct 12, 2025
Cloud computing
Definition
Cloud computing refers to the use of hosted services, such as data storage, servers, databases, networking, and software, over the internet. The data is stored on physical servers, which are maintained by a cloud service provider. Computer system resources, especially data storage and computing power, are available on demand, without direct management by the user.
TYPES
CLOUD DEPLOYMENT MODELS
The cloud deployment models are summarised below:
Private Cloud: the cloud services are used by a single organization and are not exposed to the public. A private cloud resides inside the organization and must be behind a firewall, so only the organization has access to it and can manage it.
Public Cloud: the cloud services are exposed to the public and can be used by anyone. Virtualization is typically used to build the cloud services that are offered to the public. An example of a public cloud is Amazon Web Services (AWS).
Hybrid Cloud: the cloud services are distributed among public and private clouds. Sensitive applications are kept inside the organization’s network (by using a private cloud), whereas other services can be hosted outside the organization’s network (by using a public cloud). Users can then use private and public cloud services interchangeably in everyday operations.
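The hybrid split described above can be pictured as a simple placement rule. The following is only a sketch: the `Workload` record, its `sensitive` flag, and the workload names are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    sensitive: bool  # hypothetical flag: True if the data must stay in-house


def place(workload: Workload) -> str:
    """Return which cloud a workload runs in under the hybrid model."""
    # Sensitive applications stay on the private cloud; the rest go public.
    return "private" if workload.sensitive else "public"


workloads = [Workload("payroll", True), Workload("marketing-site", False)]
placement = {w.name: place(w) for w in workloads}
print(placement)
```

A real organization would decide placement from compliance and cost rules rather than a single flag; the flag just makes the split visible.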
CLOUD SERVICE MODELS
IaaS: cloud-based, pay-as-you-go services such as storage, networking, and virtualization. Infrastructure as a service (IaaS) is a type of cloud computing in which a service provider is responsible for providing servers, storage, and networking over a virtual interface. In this service, the user doesn’t need to manage the cloud infrastructure but has control over the storage, operating systems, and deployed applications.
PaaS: hardware and software tools available over the internet. Platform as a service (PaaS) is a type of cloud computing that provides a development and deployment environment in the cloud, allowing users to develop and run applications without the complexity of building or maintaining the infrastructure.
SaaS: software that’s available via a third party over the internet. Software as a service (SaaS) allows users to access a vendor’s software in the cloud on a subscription basis. In this type of cloud computing, users don’t need to install or download applications on their local devices. Instead, the applications are located on a remote cloud network that can be accessed directly through the web or an API.
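One way to compare the three service models is to ask which layers of the stack the user still manages. The sketch below assumes a simplified six-layer stack; the layer names and the cut-off points are illustrative assumptions, not an official definition.

```python
# Simplified stack, ordered from the lowest layer to the highest (assumed).
LAYERS = [
    "networking",
    "servers",
    "virtualization",
    "operating system",
    "runtime",
    "application",
]

# Index of the first layer the user manages; everything below it is the
# provider's responsibility. These cut-offs are a simplification.
USER_MANAGES_FROM = {"IaaS": 3, "PaaS": 5, "SaaS": 6}


def user_managed(model: str) -> list:
    """Return the layers the user manages under the given service model."""
    return LAYERS[USER_MANAGES_FROM[model]:]


for model in ("IaaS", "PaaS", "SaaS"):
    print(model, "->", user_managed(model))
```

Moving from IaaS to SaaS shortens the user's list, which is the trade of control for convenience that the definitions above describe.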
CASE STUDY: CLOUD COMPUTING IN LARGE-SCALE APPLICATIONS - AIRBNB
AIRBNB
Airbnb is a community marketplace that allows property owners and travelers to connect with each other for the purpose of renting unique vacation spaces around the world. The San Francisco-based company began operation in 2008 and currently has hundreds of employees across the globe supporting property rentals in nearly 25,000 cities in 192 countries. Since its launch, over 80 million guests have stayed on Airbnb in over 2 million homes in over 190 countries. The company recently opened 4,000 homes in Cuba to travelers around the globe.
AWS & AIRBNB
A year after Airbnb launched, the company decided to migrate nearly all of its cloud computing functions to Amazon Web Services (AWS) because of service administration challenges experienced with its original provider.
“Initially, the appeal of AWS was the ease of managing and customizing the stack. It was great to be able to ramp up more servers without having to contact anyone and without having minimum usage commitments. As our company continued to grow, so did our reliance on the AWS cloud and now, we’ve adopted almost all of the features AWS provides. AWS is the easy answer for any Internet business that wants to scale to the next level.”
CHALLENGES
Cloud services
To support demand, the company uses 200 Amazon Elastic Compute Cloud (Amazon EC2) instances for its application, Memcache, and search servers. Within Amazon EC2, Airbnb is using Elastic Load Balancing, which automatically distributes incoming traffic between multiple Amazon EC2 instances. To easily process and analyze 50 gigabytes of data daily, Airbnb uses Amazon Elastic MapReduce (Amazon EMR). Airbnb is also using Amazon Simple Storage Service (Amazon S3) to house backups and static files, including 10 terabytes of user pictures. To monitor all of its server resources, Airbnb uses Amazon CloudWatch, which allows the company to easily supervise all of its Amazon EC2 assets through the AWS Management Console, Command Line Tools, or a Web services API. In addition, Airbnb moved its main MySQL database to Amazon Relational Database Service (Amazon RDS).
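The idea of distributing incoming traffic between multiple instances can be illustrated with a toy balancer. This sketch assumes a plain round-robin policy, which is an illustrative choice and not a statement of how Elastic Load Balancing routes requests; the instance IDs are made up.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer that hands requests to instances in turn."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._next_instance = cycle(instances)

    def route(self, request: str) -> str:
        # The request content is ignored; only the rotation matters here.
        return next(self._next_instance)


# Hypothetical instance IDs, for illustration only.
balancer = RoundRobinBalancer(["i-app-1", "i-app-2", "i-app-3"])
assignments = [balancer.route(f"req-{n}") for n in range(6)]
counts = Counter(assignments)
print(counts)
```

With six requests and three instances, each instance receives two requests, which is the even spread a balancer aims for.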
CLOUD INSTANCES – EC2
Elastic Compute Cloud – AWS Instances – Virtual Servers
An instance in cloud computing is a server resource provided by third-party cloud services. While you can manage and maintain physical server resources on premises, it is costly and inefficient to do so. Cloud providers maintain hardware in their data centers and give you virtual access to compute resources in the form of an instance. You can use the cloud instance for running compute-intensive workloads like containers, databases, microservices, and virtual machines.
Configuring factors: CPU, memory, storage, and network capacity
Uses: hosting websites, processing large amounts of data, running compute-intensive workloads
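Choosing an instance comes down to matching the configuring factors to the workload. The sketch below picks the cheapest option that satisfies a CPU and memory requirement. The catalog entries, sizes, and prices are invented for illustration and are not real EC2 instance types; storage and network capacity are left out to keep it short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: int
    hourly_cost: float


# Hypothetical catalog; names, sizes, and prices are made up.
CATALOG = [
    InstanceType("small", 2, 4, 0.05),
    InstanceType("medium", 4, 16, 0.20),
    InstanceType("large", 16, 64, 0.80),
]


def cheapest_fit(vcpus: int, memory_gib: int) -> Optional[InstanceType]:
    """Return the cheapest instance type meeting both requirements."""
    fits = [
        t for t in CATALOG
        if t.vcpus >= vcpus and t.memory_gib >= memory_gib
    ]
    return min(fits, key=lambda t: t.hourly_cost) if fits else None


choice = cheapest_fit(vcpus=3, memory_gib=8)
print(choice.name if choice else "no instance type fits")
```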
CLOUD WATCH
Amazon CloudWatch is a service that monitors applications, responds to performance changes, optimizes resource use, and provides insights into operational health. By collecting data across AWS resources, CloudWatch gives visibility into system-wide performance and allows users to set alarms, automatically react to changes, and gain a unified view of operational health. CloudWatch uses Metrics, Alarms, CloudWatch Logs, CloudWatch Events, and CloudWatch Dashboards to collect, access, correlate, and visualize data from across your AWS resources, applications, and services on a single platform.
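To make the idea of setting an alarm on a metric concrete, here is a small sketch of threshold evaluation. The state names mirror those CloudWatch uses, but the rule itself (every one of the last few datapoints must exceed the threshold) is a simplified assumption, and the function is not part of any AWS SDK.

```python
def alarm_state(datapoints, threshold, periods):
    """Evaluate a toy alarm over the most recent `periods` datapoints."""
    if len(datapoints) < periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-periods:]
    # Alarm only when every recent datapoint breaches the threshold.
    return "ALARM" if all(value > threshold for value in recent) else "OK"


# Hypothetical CPU utilization samples, in percent.
cpu_percent = [40, 85, 90, 95]
state = alarm_state(cpu_percent, threshold=80, periods=3)
print(state)
```

Requiring several consecutive breaches avoids raising an alarm on a single short spike.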
CLOUD TRAIL
Everything we do in the AWS environment, such as creating or terminating EC2 instances, creating subnets, etc., is done through an API call. AWS CloudTrail is a service that records AWS API calls for your AWS environment in the form of logs and saves those logs to S3 buckets. You can use AWS CloudTrail to see the following:
The identity of the user (who deleted the instance)
The start time of the AWS API call (when the instance got deleted)
The source IP address
The request parameters (e.g., instance ID)
The response parameters returned by the service
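The fields listed above are enough to answer "who deleted the instance, when, and from where". The sketch below runs that query over in-memory records. The record layout and field names are simplified assumptions for illustration and do not reproduce the actual CloudTrail log format; the users, addresses, and IDs are invented.

```python
# Hypothetical, simplified audit records (not the real CloudTrail schema).
records = [
    {
        "event_name": "CreateSubnet",
        "user": "bob",
        "event_time": "2025-01-05T09:15:00Z",
        "source_ip": "198.51.100.7",
        "request_parameters": {"cidr_block": "10.0.1.0/24"},
        "response_elements": {"subnet_id": "subnet-0001"},
    },
    {
        "event_name": "TerminateInstances",
        "user": "alice",
        "event_time": "2025-01-06T14:30:00Z",
        "source_ip": "203.0.113.10",
        "request_parameters": {"instance_id": "i-0001"},
        "response_elements": {"state": "shutting-down"},
    },
]


def who_terminated(records, instance_id):
    """Return (user, time, source IP) for each termination of the instance."""
    return [
        (r["user"], r["event_time"], r["source_ip"])
        for r in records
        if r["event_name"] == "TerminateInstances"
        and r["request_parameters"].get("instance_id") == instance_id
    ]


findings = who_terminated(records, "i-0001")
print(findings)
```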
Bucket
Buckets are the basic containers that hold your data. Everything that you store in Cloud Storage must be contained in a bucket. You can use buckets to organize your data and control access to your data, but unlike directories and folders, you cannot nest buckets.
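Because buckets cannot be nested, folder-like structure is usually expressed through object key prefixes inside one flat bucket. The following in-memory sketch shows that idea; the `Storage` class and its methods are hypothetical and are not a real storage client.

```python
class Storage:
    """Toy object store: a flat set of buckets, each mapping keys to data."""

    def __init__(self):
        self._buckets = {}

    def create_bucket(self, name: str) -> None:
        if "/" in name:
            # Buckets are top-level containers and do not nest.
            raise ValueError("bucket names cannot contain '/'")
        self._buckets.setdefault(name, {})

    def put(self, bucket: str, key: str, data: bytes) -> None:
        self._buckets[bucket][key] = data

    def list_keys(self, bucket: str, prefix: str = "") -> list:
        # A prefix filter gives a folder-like view over a flat key space.
        return sorted(k for k in self._buckets[bucket] if k.startswith(prefix))


store = Storage()
store.create_bucket("media")
store.put("media", "photos/2025/a.jpg", b"...")
store.put("media", "photos/2024/b.jpg", b"...")
store.put("media", "backups/db.sql", b"...")
photo_keys = store.list_keys("media", prefix="photos/")
print(photo_keys)
```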
S3 – Simple storage service
Amazon Simple Storage Service (Amazon S3) is an object storage service offering industry-leading scalability, data availability, security, and performance. Millions of customers of all sizes and industries store, manage, analyze, and protect any amount of data for virtually any use case, such as data lakes, cloud-native applications, and mobile apps. With cost-effective storage classes and easy-to-use management features, you can optimize costs, organize and analyze data, and configure fine-tuned access controls to meet specific business and compliance requirements. Amazon S3 provides features for auditing and managing access to your buckets and objects. By default, S3 buckets and the objects in them are private. You have access only to the S3 resources that you create.
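The private-by-default behaviour can be pictured as an access check that denies everyone except the creator until access is granted explicitly. This is a toy model with hypothetical names; real S3 access control relies on policies and other mechanisms that are not modelled here.

```python
class PrivateBucket:
    """Toy bucket that is readable only by its owner unless access is granted."""

    def __init__(self, owner: str):
        self.owner = owner
        self._readers = set()  # principals with an explicit read grant

    def grant_read(self, principal: str) -> None:
        self._readers.add(principal)

    def can_read(self, principal: str) -> bool:
        # Default deny: only the owner or an explicitly granted principal.
        return principal == self.owner or principal in self._readers


bucket = PrivateBucket(owner="alice")
before_grant = bucket.can_read("bob")
bucket.grant_read("bob")
after_grant = bucket.can_read("bob")
print(before_grant, after_grant)
```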
DATA LAKE
A data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. You can run data analytics, artificial intelligence (AI), machine learning (ML), and high-performance computing (HPC) applications to unlock the value of your data.
Raw data landing zone: Data sourced from multiple sources comes here.
Data ingestion zone: Data stored in its original format.
Staging and processing zone: Data transformed and enriched for use.
Exploration zone: Data used by scientists for research.
Data governance zone: Data quality and auditing, metadata management.
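A common way to realize zones like these is to give each one its own key prefix inside the lake's storage. The sketch below assumes that layout; the short zone names, the key format, and the dataset name are illustrative choices, not something the slides prescribe.

```python
# Short prefixes standing in for the five zones listed above (assumed names).
ZONES = ("raw", "ingestion", "staging", "exploration", "governance")


def object_key(zone: str, dataset: str, filename: str) -> str:
    """Build the storage key for a file in a given zone."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone}")
    return f"{zone}/{dataset}/{filename}"


def promote(key: str, target_zone: str) -> str:
    """Return the key the same file would have in another zone."""
    _, dataset, filename = key.split("/", 2)
    return object_key(target_zone, dataset, filename)


raw_key = object_key("raw", "bookings", "2025-10-12.json")
staged_key = promote(raw_key, "staging")
print(raw_key, "->", staged_key)
```

Keeping the dataset and file name stable across zones makes it easy to trace a processed file back to its raw source.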
INTEGRATION
SNS Simple Notification Service
SNS is a distributed publish/subscribe solution used for application-to-application (A2A) and application-to-person (A2P) communication. SNS topics are used to enable communication: producers publish messages to topics, and consumers subscribe to these topics to receive messages. You can deliver messages to various types of subscribers, such as Amazon SQS queues, AWS Lambda functions, and HTTP endpoints. You can also use SNS to send SMS messages, email, and push notifications to end-user devices.
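The publish/subscribe flow described above can be sketched in a few lines: a topic keeps a list of subscribers and fans each published message out to all of them. This is an in-memory illustration with hypothetical names, not the SNS API.

```python
class Topic:
    """Toy pub/sub topic that delivers each message to every subscriber."""

    def __init__(self, name: str):
        self.name = name
        self._subscribers = []

    def subscribe(self, handler) -> None:
        self._subscribers.append(handler)

    def publish(self, message: str) -> int:
        # Fan out: every subscriber receives its own copy of the message.
        for handler in self._subscribers:
            handler(message)
        return len(self._subscribers)


queue_inbox = []   # stands in for a queue subscriber (A2A)
email_outbox = []  # stands in for an email subscriber (A2P)

topic = Topic("booking-events")
topic.subscribe(queue_inbox.append)
topic.subscribe(lambda m: email_outbox.append(f"EMAIL: {m}"))
delivered = topic.publish("booking confirmed")
print(delivered, queue_inbox, email_outbox)
```

The producer only knows the topic, so subscribers can be added or removed without changing the publishing code.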
SNS MODEL
SQS Simple Queue Service
Amazon SQS is a distributed, managed queueing service used for communication between applications, microservices, and distributed systems. As with most messaging middleware, SQS consists of three major components:
Producers (components that send messages to the queue).
Queue (which stores messages).
Consumers (other components that receive messages from the queue).
There are two types of queues:
Standard queues: they offer maximum throughput, best-effort ordering, and at-least-once delivery.
FIFO queues: designed to guarantee that messages are processed exactly once, in the same order that they are sent.
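The FIFO guarantees (ordering and no duplicate processing) can be illustrated with a small in-memory queue. The class and the deduplication ID parameter are hypothetical, and this sketch remembers every ID forever, which a real service would not do.

```python
from collections import deque


class FifoQueue:
    """Toy FIFO queue that preserves order and drops duplicate sends."""

    def __init__(self):
        self._messages = deque()
        self._seen_ids = set()

    def send(self, dedup_id: str, body: str) -> bool:
        """Enqueue a message; return False if the ID was already sent."""
        if dedup_id in self._seen_ids:
            return False
        self._seen_ids.add(dedup_id)
        self._messages.append(body)
        return True

    def receive(self):
        """Return the oldest message, or None if the queue is empty."""
        return self._messages.popleft() if self._messages else None


queue = FifoQueue()
results = [
    queue.send("a", "first"),
    queue.send("b", "second"),
    queue.send("a", "first"),  # producer retry: same ID, so it is dropped
]
received = [queue.receive(), queue.receive(), queue.receive()]
print(results, received)
```

A standard queue, by contrast, would accept the retry and might hand messages out in a different order, so its consumers need to tolerate duplicates.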