Unit-5_1 PPT on Distributed Web based System.pdf

rameshwarchintamani 0 views 22 slides Oct 08, 2025

About This Presentation

Distributed Files: Introduction, File System Architecture, Sun
Network File System and HDFS.
Distributed Multimedia Systems: Characteristics of Multimedia Data,
Quality of Service Management, Resource Management.
Distributed Web Based Systems: Architecture of Traditional Web-
Based Systems, Apache W...


Slide Content

Distributed Computing Systems






PPT on Unit-5




Distributed Web-Based Systems



Essence: The WWW is a huge client-server system
with millions of servers, each hosting
thousands of hyperlinked documents:



[Figure: a client machine (browser on top of the OS) and a server machine (Web server).
Steps: 1. Get document request (HTTP); 2. Server fetches document from local file; 3. Response]




• Documents are generally represented in text
(plain text, HTML, XML)

• Alternative types: images, audio, video, but also
applications (PDF, PS)

• Documents may contain scripts that are
executed by the client-side software


12 – 1 Distributed Web-Based Systems/12.1 Architecture

Multi-tiered Architectures



Observation: Very early on, Web sites were
organized into three tiers:



[Figure: three-tiered Web site consisting of a Web server (HTTP request handler),
a CGI process (CGI program), and a database server.
Steps shown: 1. Get request; 3. Start process to fetch document; 4. Database interaction;
5. HTML document created; 6. Return result]
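The three tiers can be illustrated with a minimal sketch. It is not a real CGI setup: the "CGI program" is an ordinary function rather than a separate process, the database is an in-memory SQLite table, and the names `handle_request`, `cgi_program`, `make_demo_db`, and the `documents` table are all hypothetical.

```python
import sqlite3
from html import escape


def make_demo_db():
    """Third tier: a tiny in-memory database (illustrative data only)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE documents (title TEXT)")
    db.executemany(
        "INSERT INTO documents VALUES (?)",
        [("Apache notes",), ("HDFS notes",), ("Apache modules",)],
    )
    return db


def cgi_program(db, query):
    """Second tier: query the database, then create an HTML document."""
    rows = db.execute(
        "SELECT title FROM documents WHERE title LIKE ? ORDER BY title",
        (f"%{query}%",),
    ).fetchall()
    items = "".join(f"<li>{escape(title)}</li>" for (title,) in rows)
    return f"<html><body><ul>{items}</ul></body></html>"


def handle_request(db, path):
    """First tier: accept the request, start the program, return its result."""
    query = path.partition("?q=")[2]
    return cgi_program(db, query)


page = handle_request(make_demo_db(), "/search?q=Apache")
```

In a real deployment the Web server would start the program as a separate process and the database would typically run on its own machine; the sketch only shows the order of the interactions.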





























Web Services



Observation: At a certain point, people started recognizing
that the Web was about more than just user ↔ site interaction:
sites could offer services to other sites ⇒
standardization is then badly needed.




[Figure: Web service architecture. The server application publishes its service
description (WSDL) in a directory service (UDDI); the client application looks up
a service there. On both the client machine and the server machine a stub is
generated from the WSDL description, and the two communication subsystems
exchange SOAP messages.]









Clients: Web browsers



Observation: Browsers form the Web's most important
client-side software. They used to be simple, but
that was long ago.






[Figure: logical components of a Web browser: user interface, browser engine,
rendering engine, network communication, client-side script interpreter,
HTML/XML parser, display back end]





















Apache Web Server



Observation: More than 70% of all Web sites are
based on Apache. The server is internally organized
more or less according to the steps needed to
process an HTTP request:



[Figure: general organization of the Apache Web server. Modules provide functions;
each function is linked to a hook. The Apache core takes a request, calls the
functions registered per hook, and produces the response.]
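The hook mechanism can be sketched in a few lines. This is a toy model, not Apache's actual API: the class `Core`, the hook names `parse`, `authorize`, `respond`, and `log`, and the module functions are all assumptions made for illustration.

```python
from collections import defaultdict


class Core:
    """Toy stand-in for the server core: a fixed, ordered list of hooks."""

    HOOKS = ["parse", "authorize", "respond", "log"]  # hypothetical hook names

    def __init__(self):
        self.functions = defaultdict(list)

    def register(self, hook, function):
        """Link a module's function to a hook."""
        if hook not in self.HOOKS:
            raise ValueError(f"unknown hook: {hook}")
        self.functions[hook].append(function)

    def process(self, request):
        """Call the functions registered for each hook, in hook order."""
        for hook in self.HOOKS:
            for function in self.functions[hook]:
                function(request)
        return request


# Hypothetical module functions.
def url_to_file(request):
    request["file"] = "/var/www" + request["url"]


def write_body(request):
    request["response"] = f"contents of {request['file']}"


def log_access(request):
    request.setdefault("log", []).append(request["url"])


core = Core()
core.register("log", log_access)       # registration order does not matter;
core.register("respond", write_body)   # the hook order decides when a function runs
core.register("parse", url_to_file)
result = core.process({"url": "/index.html"})
```

The point of the design is that new behavior is added by registering functions with hooks, without changing the core's request-processing loop.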








Server Clusters (1/2)


Essence: To improve performance and availability,
WWW servers are often clustered in a way that is
transparent to clients:


[Figure: a cluster of four Web servers on a LAN behind a single front end.
The front end handles all incoming requests and outgoing responses.]




Problem: The front end may easily get overloaded,
so that special measures need to be taken.

Transport-layer switching: Front end simply passes
the TCP request to one of the servers, taking
some performance metric into account.

Content-aware distribution: Front end reads the
content of the HTTP request and then selects
the best server.
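The difference between the two strategies can be sketched as follows. Both classes are hypothetical simplifications: the transport-layer switch uses plain round-robin in place of a real performance metric, and the content-aware distributor keeps a simple path-to-server table.

```python
import itertools


class TransportLayerSwitch:
    """Picks a server without looking at the request (round-robin here)."""

    def __init__(self, servers):
        self._next = itertools.cycle(servers)

    def select(self, request_path=None):
        return next(self._next)


class ContentAwareDistributor:
    """Sends requests for the same document to the same server."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.assigned = {}  # request path -> server

    def select(self, request_path):
        if request_path not in self.assigned:
            # Assign a new document to the server holding the fewest documents.
            load = {server: 0 for server in self.servers}
            for server in self.assigned.values():
                load[server] += 1
            self.assigned[request_path] = min(self.servers, key=lambda s: load[s])
        return self.assigned[request_path]


switch = TransportLayerSwitch(["s1", "s2"])
distributor = ContentAwareDistributor(["s1", "s2"])
switch_choices = [switch.select("/a") for _ in range(3)]
distributor_choices = [distributor.select(p) for p in ["/a", "/b", "/a"]]
```

Because the distributor always routes `/a` to the same server, that server can keep the document in its own cache, which is one reason content-aware distribution can pay off.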



Server Clusters (2/2)



Question: Why can content-aware distribution be so
much better?




[Figure: content-aware request distribution with TCP handoff.
Components: client, switch, dispatcher, distributors, Web servers.
Steps: 1. Pass setup request to a distributor; 2. Dispatcher selects server;
3. Hand off TCP connection; 4. Inform switch; 5. Forward other messages;
6. Server responses]




























Communication (1/2)



Essence: Communication in the Web is generally based
on HTTP, a relatively simple client-server transfer
protocol with the following request messages:

Operation   Description

Head        Request to return the header of a document
Get         Request to return a document to the client
Put         Request to store a document
Post        Provide data that are to be added to a document (collection)
Delete      Request to delete a document
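To make the difference between Head and Get concrete, the following sketch starts a throwaway HTTP server on the loopback interface and issues both requests with Python's standard library. The document contents, the path `/index.html`, and the helper name `fetch` are assumptions for illustration only.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

DOCUMENT = b"<html><body>hello</body></html>"  # illustrative document


class Handler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(DOCUMENT)))
        self.end_headers()

    def do_HEAD(self):
        # Head: return only the header of the document.
        self._send_headers()

    def do_GET(self):
        # Get: return the header followed by the document itself.
        self._send_headers()
        self.wfile.write(DOCUMENT)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(method, port):
    """Send one request and return (status, Content-Length header, body)."""
    connection = http.client.HTTPConnection("127.0.0.1", port)
    connection.request(method, "/index.html")
    response = connection.getresponse()
    body = response.read()
    connection.close()
    return response.status, response.getheader("Content-Length"), body


server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: let the OS pick a port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
head_result = fetch("HEAD", port)
get_result = fetch("GET", port)
server.shutdown()
server.server_close()
```

Both responses carry the same headers; only the Get response carries the document body.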



























Communication (2/2)

Header               Source   Contents
(Source: C = sent by the client, S = sent by the server)

Accept               C        The type of documents the client can handle
Accept-Charset       C        The character sets that are acceptable for the client
Accept-Encoding      C        The document encodings the client can handle
Accept-Language      C        The natural language the client can handle
Authorization        C        A list of the client's credentials
WWW-Authenticate     S        Security challenge the client should respond to
Date                 C+S      Date and time the message was sent
ETag                 S        The tags associated with the returned document
Expires              S        The time for how long the response remains valid
From                 C        The client's e-mail address
Host                 C        The TCP address of the document's server
If-Match             C        The tags the document should have
If-None-Match        C        The tags the document should not have
If-Modified-Since    C        Tells the server to return a document only if it has
                              been modified since the specified time
If-Unmodified-Since  C        Tells the server to return a document only if it has
                              not been modified since the specified time
Last-Modified        S        The time the returned document was last modified
Location             S        A document reference to which the client should
                              redirect its request
Referer              C        Refers to the client's most recently requested document
Upgrade              C+S      The application protocol the sender wants to switch to
Warning              C+S      Information about the status of the data in the message
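As an illustration of how ETag and If-None-Match work together, the sketch below models a server that answers "not modified" when the client already holds the current version. It is a simplification: the functions `make_etag` and `respond` are hypothetical, the tag is just a truncated hash, and a real If-None-Match header may also carry a list of tags or `*`.

```python
import hashlib


def make_etag(document: bytes) -> str:
    """Derive a tag from the document contents (one possible choice)."""
    return '"' + hashlib.sha256(document).hexdigest()[:16] + '"'


def respond(document: bytes, request_headers: dict):
    """Return (status, response headers, body) for a Get request."""
    etag = make_etag(document)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, document


first = respond(b"version 1", {})
revalidation = respond(b"version 1", {"If-None-Match": first[1]["ETag"]})
after_update = respond(b"version 2", {"If-None-Match": first[1]["ETag"]})
```

The client stores the ETag from the first response and presents it on later requests, so an unchanged document costs only a header exchange.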



SOAP



Simple Object Access Protocol: Based on XML,
this is the standard protocol for communication
between Web services.




• SOAP is bound to an underlying protocol (i.e., it
is not independent from its carrier)

• Conversational exchange style: Send a document
one way, get a filled-in response back.

• RPC-style exchange: Used to invoke a Web
service.





















A Note on XML



Observation: XML has the advantage of allowing
self-describing documents. Full stop (i.e., it
introduces performance problems and is not meant
to be read by human beings)


<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Header>
    <n:alertcontrol xmlns:n="http://example.org/alertcontrol">
      <n:priority>1</n:priority>
      <n:expires>2001-06-22T14:00:00-05:00</n:expires>
    </n:alertcontrol>
  </env:Header>
  <env:Body>
    <m:alert xmlns:m="http://example.org/alert">
      <m:msg>Pick up Mary at school at 2pm</m:msg>
    </m:alert>
  </env:Body>
</env:Envelope>
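A receiver could pull the header and body fields out of such a message with a standard XML parser. The following sketch uses Python's `xml.etree.ElementTree`; the variable names and the prefix-to-namespace map `NS` are choices made here, not part of SOAP.

```python
import xml.etree.ElementTree as ET

SOAP_MESSAGE = """\
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Header>
    <n:alertcontrol xmlns:n="http://example.org/alertcontrol">
      <n:priority>1</n:priority>
      <n:expires>2001-06-22T14:00:00-05:00</n:expires>
    </n:alertcontrol>
  </env:Header>
  <env:Body>
    <m:alert xmlns:m="http://example.org/alert">
      <m:msg>Pick up Mary at school at 2pm</m:msg>
    </m:alert>
  </env:Body>
</env:Envelope>
"""

# Map the prefixes used in the search paths to their namespace URIs.
NS = {
    "env": "http://www.w3.org/2003/05/soap-envelope",
    "n": "http://example.org/alertcontrol",
    "m": "http://example.org/alert",
}

envelope = ET.fromstring(SOAP_MESSAGE)
priority = envelope.find("env:Header/n:alertcontrol/n:priority", NS).text
expires = envelope.find("env:Header/n:alertcontrol/n:expires", NS).text
message = envelope.find("env:Body/m:alert/m:msg", NS).text
```

The namespaces are what make the document self-describing: the same element name could mean something else under a different namespace URI.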





















Naming: URL


URL: Uniform Resource Locator tells how and
where to access a resource.


       Scheme    Host name       Port    Pathname

(a)    http://   www.cs.vu.nl            /home/steen/mbox
(b)    http://   www.cs.vu.nl    :80     /home/steen/mbox
(c)    http://   130.37.24.11    :80     /home/steen/mbox



Examples:

Name     Used for       Example

http     HTTP           http://www.cs.vu.nl:80/globe
mailto   Mail           mailto:[email protected]
ftp      FTP            ftp://ftp.cs.vu.nl/pub/minix/README
file     Local file     file:/edu/book/work/chp/11/11
data     Inline data    data:text/plain;charset=iso-8859-7,%e1%e2%e3
telnet   Remote login   telnet://flits.cs.vu.nl
tel      Telephone      tel:+31201234567
modem    Modem          modem:+31201234567;type=v32
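The decomposition into scheme, host name, port, and pathname can be reproduced with Python's standard `urllib.parse` module. The helper name `describe` is an assumption; the URLs are the ones from the slide.

```python
from urllib.parse import urlsplit


def describe(url):
    """Split a URL into the parts shown in forms (a), (b) and (c)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL does not name a port
        "path": parts.path,
    }


form_a = describe("http://www.cs.vu.nl/home/steen/mbox")
form_b = describe("http://www.cs.vu.nl:80/home/steen/mbox")
form_c = describe("http://130.37.24.11:80/home/steen/mbox")
```

Forms (b) and (c) differ only in whether the host is given as a DNS name or as an IP address; form (a) leaves the port to the scheme's default.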



Synchronization: WebDAV



Problem: There is a growing need for collaborative
authoring of Web documents, but bare-bones HTTP
can't help here. Solution: Web Distributed Authoring
and Versioning (WebDAV).




• Supports exclusive and shared write locks,
which operate on entire documents

• A lock is passed by means of a lock token; the
server registers the client(s) holding the lock

• Clients modify the document locally and post it
back to the server along with the lock token




Note: There is no specific support for crashed
clients holding a lock.
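The server-side bookkeeping behind lock tokens can be sketched as follows. This models only the idea described in the bullets above, not the WebDAV wire protocol: the class `LockManager`, its methods, and the use of a random UUID as the token are assumptions for illustration.

```python
import uuid


class LockManager:
    """Tracks which clients hold a write lock on each document."""

    def __init__(self):
        self.locks = {}  # document -> (mode, {token: client})

    def lock(self, document, client, mode):
        """Return a lock token, or None if the lock cannot be granted."""
        held = self.locks.get(document)
        if held is None:
            holders = {}
            self.locks[document] = (mode, holders)
        else:
            held_mode, holders = held
            # An exclusive lock is incompatible with any other lock.
            if mode == "exclusive" or held_mode == "exclusive":
                return None
        token = uuid.uuid4().hex
        holders[token] = client
        return token

    def put(self, document, token, store, new_contents):
        """Accept a modified document only if it comes with a valid token."""
        held = self.locks.get(document)
        if held is None or token not in held[1]:
            return False
        store[document] = new_contents
        return True


manager = LockManager()
store = {"report.html": "old"}
alice_token = manager.lock("report.html", "alice", "exclusive")
bob_token = manager.lock("report.html", "bob", "shared")
accepted = manager.put("report.html", alice_token, store, "new")
rejected = manager.put("report.html", "bogus-token", store, "other")
```

Note that nothing in the sketch ever removes a lock, which mirrors the problem of crashed clients: some extra mechanism, such as a timeout, would be needed to reclaim it.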








Web Proxy Caching



Basic idea: Sites install a separate proxy server
that handles all outgoing requests. Proxies
subsequently cache incoming documents.
Cache-consistency protocols:

• Always verify validity by contacting server

• Age-based consistency:

$T_{expire} = \alpha \cdot (T_{cached} - T_{last\_modified}) + T_{cached}$

• Cooperative caching, by which you first check
your neighbors on a cache miss:


[Figure: cooperative caching. Groups of clients each share a Web proxy with a cache.
A client sends an HTTP Get request to its proxy, which then proceeds as follows:
1. Look in local cache; 2. Ask neighboring proxy caches; 3. Forward request to Web server]
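The age-based expiry rule and the cooperative lookup order can be sketched together. This is an illustration under assumptions: the class `Proxy`, the dictionary standing in for the Web server, and the default value of `alpha` are all hypothetical.

```python
def expiry_time(t_cached, t_last_modified, alpha=0.2):
    """T_expire = alpha * (T_cached - T_last_modified) + T_cached.

    The older a document was when it was cached, the longer it is trusted.
    The default alpha is an arbitrary illustrative value.
    """
    return alpha * (t_cached - t_last_modified) + t_cached


class Proxy:
    """A proxy cache that checks its neighbors before the Web server."""

    def __init__(self, web_server, neighbors=()):
        self.cache = {}
        self.web_server = web_server  # dict standing in for the real server
        self.neighbors = list(neighbors)

    def get(self, url):
        # 1. Look in local cache.
        if url in self.cache:
            return self.cache[url], "local cache"
        # 2. Ask neighboring proxy caches.
        for neighbor in self.neighbors:
            if url in neighbor.cache:
                self.cache[url] = neighbor.cache[url]
                return self.cache[url], "neighbor"
        # 3. Forward request to Web server.
        self.cache[url] = self.web_server[url]
        return self.cache[url], "web server"


web_server = {"/index.html": "<html>home</html>"}
proxy_a = Proxy(web_server)
proxy_b = Proxy(web_server, neighbors=[proxy_a])
first = proxy_a.get("/index.html")
second = proxy_b.get("/index.html")
third = proxy_b.get("/index.html")
```

With the times measured in seconds, a document cached at time 1000 that was last modified at time 500 would, for `alpha=0.5`, be considered valid until time 1250.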



Replication in Web Hosting
Systems



Observation: By and large, Web hosting systems
are adopting replication to increase performance.
Much research is being done to improve their organization,
which follows the lines of self-managing systems:



[Figure: feedback control loop for a Web hosting system.
Inputs: initial configuration, corrections (+/-), reference input, and
uncontrollable parameters (disturbance / noise).
Outputs: observed output, measured output.
Feedback path: metric estimation, analysis, and adjustment triggers, which drive
replica placement, consistency enforcement, and request routing.]
















Handling Flash Crowds



Observation: We need dynamic adjustment to
balance resource usage. Flash crowds introduce a
serious problem:















[Figure: four plots, panels (a) to (d), covering periods of 2 days, 2 days,
6 days, and 2.5 days, respectively]










Server Replication



Content Delivery Network: CDNs act as Web hosting
services that replicate documents across the Internet,
providing their customers guarantees on high
availability and performance (example: Akamai).



[Figure: request redirection in a CDN.
Components: client, origin server, CDN server with cache, CDN DNS server,
regular DNS system.
Steps: 1. Get base document; 2. Document with refs to embedded documents;
3. DNS lookups; 4. Return IP address client-best server;
5. Get embedded documents; 6. Get embedded documents (if not already cached);
7. Embedded documents]






Question: How would consistency be maintained in
this system?
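The DNS-based redirection (steps 3 and 4 in the figure) amounts to the CDN's DNS server choosing a replica per client. The sketch below shows one possible selection policy, lowest estimated latency; the function name `pick_replica`, the region labels, the addresses, and the latency figures are all made up for illustration.

```python
def pick_replica(client_region, replicas, latency_ms):
    """Return the replica address with the lowest estimated latency."""
    return min(replicas, key=lambda address: latency_ms[(client_region, address)])


# Hypothetical replica servers (documentation-only IP addresses).
REPLICAS = ["192.0.2.10", "198.51.100.20"]

# Hypothetical latency estimates, in milliseconds, per (region, replica).
LATENCY_MS = {
    ("eu", "192.0.2.10"): 12,
    ("eu", "198.51.100.20"): 95,
    ("us", "192.0.2.10"): 90,
    ("us", "198.51.100.20"): 15,
}

eu_answer = pick_replica("eu", REPLICAS, LATENCY_MS)
us_answer = pick_replica("us", REPLICAS, LATENCY_MS)
```

The client never sees this decision: it simply receives a different IP address for the same host name depending on where it is.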



Replication of Web Apps. (1/3)



Observation: Replication becomes more difficult
when dealing with databases and such. There is no
single best solution.





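[Figure: alternatives for caching and replication with Web applications.
Edge-server side: a server answers client queries (query/response) using a
content-blind cache, a database copy (full/partial data replication), or a
content-aware cache (full schema replication/query templates).
Origin-server side: a server with the authoritative database and its schema.]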






Assumption: Updates are carried out at the origin server
and propagated to edge servers.







Replication of Web Apps. (2/3)











• Full replication: high read/write ratio, often in
combination with complex queries. Note:
replication may actually slow performance down
when the R/W ratio goes down.

• Partial replication: high read/write ratio, but in
combination with simple queries





Replication of Web Apps. (3/3)











• Content-aware caching: Check for queries at the
local database, and subscribe for invalidations
at the server. Works well with range queries
and complex queries.

• Content-blind caching: Simply cache the result of
previous queries. Works well with simple queries
that address unique results (e.g., no range queries).
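Content-blind caching is the simpler of the two and can be sketched in a few lines. The class `ContentBlindCache`, the `users` table, and the in-memory SQLite database standing in for the origin server are assumptions for illustration.

```python
import sqlite3


class ContentBlindCache:
    """Edge-side cache keyed on the exact query text and parameters."""

    def __init__(self, origin):
        self.origin = origin  # connection standing in for the origin database
        self.results = {}
        self.hits = 0

    def query(self, sql, params=()):
        key = (sql, tuple(params))
        if key in self.results:
            self.hits += 1
        else:
            # Cache miss: forward the query to the origin and keep the result.
            self.results[key] = self.origin.execute(sql, params).fetchall()
        return self.results[key]

    def invalidate(self):
        """Drop everything, e.g., after the origin reports an update."""
        self.results.clear()


origin = sqlite3.connect(":memory:")
origin.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
origin.execute("INSERT INTO users VALUES (1, 'ann')")

cache = ContentBlindCache(origin)
first = cache.query("SELECT name FROM users WHERE id = ?", (1,))
second = cache.query("SELECT name FROM users WHERE id = ?", (1,))
```

The cache knows nothing about what a query means, so two overlapping range queries would be stored separately; that is why this approach fits simple lookups that return unique results.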



Security: TLS (SSL)



Transport Layer Security: Modern version of the
Secure Sockets Layer (SSL), which “sits” between the
transport layer and application protocols. Relatively
simple protocol that can support mutual authentication
using certificates:



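[Figure: TLS handshake with mutual authentication between client and server:
1. Client to server: possibilities
2. Server to client: choices
3. Server to client: $[K_S^+]_{CA}$ (server's public key, signed by a CA)
4. Client to server: $[K_C^+]_{CA}$ (client's public key, signed by a CA)
5. Client to server: $K_S^+([R]_C)$]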
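In Python's standard `ssl` module, mutual authentication corresponds to both sides requiring a certificate from the peer. The sketch below only configures the two contexts; the function names are assumptions, and the file paths in the comments are hypothetical placeholders for real certificates and keys.

```python
import ssl


def make_server_context():
    """Server side: also demand a certificate from the client."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.verify_mode = ssl.CERT_REQUIRED
    # With real files (hypothetical paths), one would additionally call:
    #   context.load_cert_chain("server.crt", "server.key")
    #   context.load_verify_locations("ca.crt")
    return context


def make_client_context():
    """Client side: verifies the server's certificate and host name by default."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # With real files (hypothetical paths), one would additionally call:
    #   context.load_cert_chain("client.crt", "client.key")
    #   context.load_verify_locations("ca.crt")
    return context


server_context = make_server_context()
client_context = make_client_context()
```

A server context does not ask for a client certificate unless told to, which is why the sketch sets `verify_mode` explicitly on the server side.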
















