Coda file system

23 slides, Nov 01, 2013

About This Presentation

Coda (Constant Data Availability) is a distributed file system developed at Carnegie Mellon University. This presentation explains how it works and covers its main design aspects.


Slide Content

•Introduction to Coda File System
•Naming and Location
•Architecture
•Caching and Replication
•Synchronization
•Communication
•Fault Tolerance
•Security
•Summary

•Coda (constant data availability) is a distributed file system that was
developed as a research project at Carnegie Mellon University, starting in
1987, under the direction of Mahadev Satyanarayanan.
•Coda’s design goals:
•Scalability
•Constant data availability
•Transparency
•Security
•Consistency

•The name space in Coda is hierarchically structured as in UNIX and is
partitioned into disjoint volumes.
•A volume consists of a set of files and directories located on one
server, and is the unit of replication in Coda.
•Each file and directory is identified by a 96-bit-long unique file
identifier (FID). Replicas of a file have the same FID.

•An FID has two components:
1. A 32-bit RVID (Replicated Volume Identifier) of the logical volume
that the file is part of.
2. A 64-bit file handle (vnode) that uniquely identifies the file
within a volume.
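
The two-part FID can be sketched as a small value type. This is a minimal illustration, not Coda's actual on-disk or wire format: the class name `Fid`, the `pack`/`unpack` helpers, and the choice to place the RVID in the high-order bits are assumptions made for the example.

```python
from dataclasses import dataclass

RVID_BITS = 32     # width of the replicated volume identifier
HANDLE_BITS = 64   # width of the file handle (vnode)


@dataclass(frozen=True)
class Fid:
    """Hypothetical 96-bit file identifier: 32-bit RVID + 64-bit handle."""
    rvid: int
    handle: int

    def __post_init__(self):
        if not 0 <= self.rvid < (1 << RVID_BITS):
            raise ValueError("rvid must fit in 32 bits")
        if not 0 <= self.handle < (1 << HANDLE_BITS):
            raise ValueError("handle must fit in 64 bits")

    def pack(self) -> int:
        """Return the FID as one integer, RVID in the high bits (assumed layout)."""
        return (self.rvid << HANDLE_BITS) | self.handle

    @classmethod
    def unpack(cls, value: int) -> "Fid":
        """Split a packed integer back into RVID and handle."""
        return cls(value >> HANDLE_BITS, value & ((1 << HANDLE_BITS) - 1))
```

Because replicas of a file share one FID, a client can use the same `Fid` value no matter which server holding the volume it ends up contacting.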

•Each file in Coda belongs to exactly one volume.
•A volume may be replicated across several servers.
•A logical (replicated) volume maps to one or more physical volumes,
one per replica.

Coda works by implementing the following functionalities:
1. Availability of files, by replicating a file volume across many servers
2. Disconnected mode of operation, by caching files at the client machine

Coda File System is divided into two types of nodes:
1. Vice nodes: dedicated file servers
2. Virtue nodes: client machines

The internal organization of a Virtue workstation:
•is designed to allow access to files even if the server is unavailable, and
•uses the Virtual File System (VFS) layer to intercept calls from client
applications.

Coda uses RPC2, a sophisticated reliable RPC system:
•The server starts a new thread for each request.
•The server periodically informs the client that it is still working on
the request.

•Coda servers allow clients to cache whole files.
•Clients are notified of modifications made by other clients through
invalidation messages, which require multicast RPC:
a) Sending invalidation messages one at a time
b) Sending invalidation messages in parallel
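
Whole-file caching with invalidation can be sketched in a few lines. This is a toy in-memory model, not Coda's Venus/Vice code: the `Server` and `Client` classes and their method names are hypothetical, a file is reduced to a version number, and invalidations are sent one at a time (variant a) rather than in parallel.

```python
from collections import defaultdict


class Server:
    def __init__(self):
        self.version = defaultdict(int)    # file name -> current version
        self.callbacks = defaultdict(set)  # file name -> clients caching it

    def fetch(self, client, name):
        """Ship the whole file (here: its version) and remember who caches it."""
        self.callbacks[name].add(client)
        return self.version[name]

    def store(self, writer, name):
        """Accept an update at close time and invalidate other clients' copies."""
        self.version[name] += 1
        for client in self.callbacks[name] - {writer}:
            client.invalidate(name)        # sent one at a time in this sketch
        self.callbacks[name] = {writer}
        return self.version[name]


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}     # file name -> cached version
        self.fetches = 0    # number of whole-file transfers

    def open(self, name):
        if name not in self.cache:         # no valid copy: transfer the file
            self.cache[name] = self.server.fetch(self, name)
            self.fetches += 1
        return self.cache[name]

    def close_after_write(self, name):
        self.cache[name] = self.server.store(self, name)

    def invalidate(self, name):
        self.cache.pop(name, None)         # drop the stale copy


server = Server()
a, b = Client(server), Client(server)
a.open("f")                # A fetches f
b.open("f")                # B fetches f
b.close_after_write("f")   # server invalidates A's copy
a.open("f")                # A has to fetch f again
b.open("f")                # B's copy is still valid: no transfer
```

After this run, `a.fetches` is 2 and `b.fetches` is 1. With many caching clients, the loop in `store` is where sending the invalidations in parallel would pay off.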

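[Figure: timeline of two clients sharing file f through a server. Session A
opens f for reading and session B opens f for writing; the server transfers
f to each client and sends an invalidate when the file is closed.]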

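[Figure: timeline of Client A, the server, and Client B. In sessions A and B,
Client A opens f for reading and Client B opens f for writing; when B closes
f, the server sends A an invalidate (callback break). In sessions C and D the
clients reopen f: one receives file f again, and the other gets OK with no
file transfer because its cached copy is still valid.]
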
•Scalability
•Fault Tolerance

Data structures:
•VSG (Volume Storage Group): the set of servers storing replicas of a
volume.
•AVSG (Accessible Volume Storage Group): the subset of a volume's VSG that
a client can currently contact; a client tracks one AVSG for every volume
it has cached.
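
The relationship between the two groups is a set intersection. A minimal sketch, assuming servers are named by plain strings and that the client already knows which servers it can reach (both function names are hypothetical):

```python
def accessible_group(vsg, reachable):
    """Return the AVSG: members of the VSG the client can currently contact."""
    return set(vsg) & set(reachable)


def is_disconnected(vsg, reachable):
    """A client is disconnected for a volume when its AVSG is empty."""
    return not accessible_group(vsg, reachable)


vsg = {"s1", "s2", "s3"}
print(accessible_group(vsg, {"s1", "s2"}))  # AVSG during a partition
print(is_disconnected(vsg, set()))          # True: no replica reachable
```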

Version vector (Coda Version Vector) when the partition happens: [1,1,1]
Client A updates the file → version vector in its partition: [2,2,1]
Client B updates the file → version vector in its partition: [1,1,2]
Partition repaired → compare version vectors: conflict!
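
The comparison made when the partition is repaired can be sketched as follows. This is the standard version-vector dominance check, written as an illustration; the function name `compare` and its return strings are hypothetical, not taken from Coda's code.

```python
def compare(a, b):
    """Compare two version vectors that cover the same servers.

    Returns "equal", "a_dominates", "b_dominates", or "conflict".
    """
    if len(a) != len(b):
        raise ValueError("vectors must cover the same set of servers")
    a_ahead = any(x > y for x, y in zip(a, b))
    b_ahead = any(y > x for x, y in zip(a, b))
    if a_ahead and b_ahead:
        return "conflict"      # each side saw an update the other missed
    if a_ahead:
        return "a_dominates"   # a is strictly newer: propagate a
    if b_ahead:
        return "b_dominates"
    return "equal"


print(compare([2, 2, 1], [1, 1, 2]))  # conflict
print(compare([2, 2, 1], [1, 1, 1]))  # a_dominates
```

In the slide's example neither vector dominates the other, so the updates made by A and B cannot be ordered and the conflict has to be resolved.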

HOARDING: fill the file cache in advance with all files that will be accessed
when disconnected
EMULATION: when disconnected, the client emulates the behavior of the server
REINTEGRATION: transfer updates to the server and resolve conflicts
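
The three states form a small state machine. The sketch below is an assumption-laden illustration: the slide names only the states, so the event names ("disconnect", "reconnect", "done") and the transition table are filled in from the usual description of disconnected operation.

```python
from enum import Enum


class State(Enum):
    HOARDING = "hoarding"
    EMULATION = "emulation"
    REINTEGRATION = "reintegration"


# (current state, event) -> next state; event names are hypothetical.
TRANSITIONS = {
    (State.HOARDING, "disconnect"): State.EMULATION,
    (State.EMULATION, "reconnect"): State.REINTEGRATION,
    (State.REINTEGRATION, "done"): State.HOARDING,
    (State.REINTEGRATION, "disconnect"): State.EMULATION,
}


def step(state, event):
    """Return the next state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


state = State.HOARDING
for event in ["disconnect", "reconnect", "done"]:
    state = step(state, event)
print(state)  # State.HOARDING
```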

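•Hoard database: the list of files the user wants kept in the cache, each
with a priority.
•Cache equilibrium holds when:
•There is no uncached file with a higher priority than any cached
file.
•The cache is full, or no uncached file has nonzero priority.
•Each cached file is a copy of the one maintained in the client’s
AVSG.
•Hoard walk: the periodic pass that brings the cache back to equilibrium.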
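
The three equilibrium conditions can be checked directly. A minimal sketch under simplifying assumptions: cache capacity is counted in files rather than bytes, priorities are plain integers, and the set `fresh` (cached files known to match the AVSG copy) is supplied by the caller. All names are hypothetical.

```python
def in_equilibrium(priorities, cached, capacity, fresh):
    """Check the three cache-equilibrium conditions.

    priorities: dict mapping file name -> hoard priority (0 = not wanted)
    cached:     set of file names currently in the cache
    capacity:   maximum number of files the cache can hold (simplification)
    fresh:      set of cached file names that match the copy in the AVSG
    """
    uncached = [p for name, p in priorities.items() if name not in cached]
    cached_p = [priorities.get(name, 0) for name in cached]

    # 1. No uncached file has a higher priority than any cached file.
    no_inversion = (not uncached or not cached_p
                    or max(uncached) <= min(cached_p))
    # 2. The cache is full, or no uncached file has nonzero priority.
    full_or_done = len(cached) >= capacity or all(p == 0 for p in uncached)
    # 3. Each cached file is a copy of the one maintained in the AVSG.
    all_fresh = set(cached) <= set(fresh)

    return no_inversion and full_or_done and all_fresh


priorities = {"a": 5, "b": 3, "c": 1}
print(in_equilibrium(priorities, {"a", "b"}, 2, {"a", "b"}))  # True
print(in_equilibrium(priorities, {"a", "c"}, 2, {"a", "c"}))  # False
```

In the second call the uncached file "b" outranks the cached file "c", so a hoard walk would swap them.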

Coda’s security architecture consists of two parts:
•The first part deals with setting up a secure channel between a client
and a server using secure RPC and system-level authentication.
•The second part deals with controlling access to files.

[Figure: Client (Venus) and Vice server]

Operation    Description
Read         Read any file in the directory
Write        Modify any file in the directory
Lookup       Look up the status of any file
Insert       Add a new file to the directory
Delete       Delete an existing file
Administer   Modify the ACL of the directory
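
The operations in the table map naturally onto a set of flags attached to each directory. The sketch below is illustrative only: the `Right` enum, the `allowed` helper, and the representation of an ACL as a dictionary from user or group name to rights are assumptions, not Coda's actual data structures.

```python
from enum import Flag, auto


class Right(Flag):
    READ = auto()
    WRITE = auto()
    LOOKUP = auto()
    INSERT = auto()
    DELETE = auto()
    ADMINISTER = auto()


def allowed(acl, user, groups, wanted):
    """Return True if the user, directly or through a group, holds every wanted right."""
    granted = acl.get(user, Right(0))
    for group in groups:
        granted |= acl.get(group, Right(0))
    return (wanted & granted) == wanted


# Hypothetical ACL for one directory.
acl = {
    "alice": Right.READ | Right.LOOKUP,
    "staff": Right.INSERT,
}
print(allowed(acl, "alice", ["staff"], Right.READ | Right.INSERT))  # True
print(allowed(acl, "bob", [], Right.READ))                          # False
```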

•Peter J. Braam, The Coda File System, www.coda.cs.cmu.edu.