File organization

RituBhargava7 9,328 views 24 slides Sep 25, 2018

About This Presentation

The topic includes the definition of file organization, the types of file organization, and their advantages and disadvantages.


Slide Content

FILE ORGANIZATION
Presented by Dr. Ritu Bhargava, Sophia Girls’ College, Ajmer (Autonomous)

Definition of File Organization
File organization means the way data is stored so that it can be retrieved when needed. It includes the physical order and layout of records on storage devices. The techniques used to find and retrieve stored records are called access methods.

GOALS OF FILE ORGANIZATION
To make the database easy to create and maintain in terms of file organization.
To provide an efficient way of storing and retrieving information from the file system.

OVERVIEW
A logical file is a complete set of records for a specific purpose or designated for a specific application. In file organization, the database is stored as a collection of files. Each file is organized logically as a sequence of records. A record is a sequence of fields in a relation. Records are mapped onto disk blocks for storage. The size of such records on the file system may vary.

OVERVIEW
One approach to mapping the database to files is to store records of only one length in a given file; these are called fixed-length records. An alternative approach is to use variable-length records.

RECORDS IN FILES: FIXED LENGTH RECORD
Let us consider the following example:

    type student = record
        sname : char(20);
        sid : char(4);
        fees : real;
    end

If each character occupies one byte, an integer occupies 4 bytes, and a real occupies 8 bytes, then the student record is 20 + 4 + 8 = 32 bytes long.
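The 32-byte figure can be checked with a small sketch. It assumes Python's standard `struct` module, maps `real` to an 8-byte double, and uses the `<` prefix to switch off alignment padding; the helper names are illustrative and not part of the slides.

```python
import struct

# Assumed layout: 20-byte name, 4-byte id, 8-byte real; "<" means no padding.
STUDENT_FORMAT = "<20s4sd"
RECORD_SIZE = struct.calcsize(STUDENT_FORMAT)  # 20 + 4 + 8


def pack_student(sname, sid, fees):
    """Encode one student as a fixed-length record (short strings are null-padded)."""
    return struct.pack(STUDENT_FORMAT, sname.encode("ascii"), sid.encode("ascii"), fees)


def unpack_student(raw):
    """Decode a fixed-length record back into (sname, sid, fees)."""
    name, sid, fees = struct.unpack(STUDENT_FORMAT, raw)
    return name.rstrip(b"\x00").decode("ascii"), sid.rstrip(b"\x00").decode("ascii"), fees


def record_offset(i):
    """Byte offset of record i (0-based) in a file of fixed-length records."""
    return i * RECORD_SIZE
```

Because every record has the same size, the position of record i is a simple multiplication, which is the main attraction of fixed-length records.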

Disadvantage
It is difficult to delete a record from such a fixed structure. Unless the block size is a multiple of 32, some records will cross block boundaries, and reading or writing such a record requires two block accesses.
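As a rough illustration of the block-boundary problem, the sketch below lists which blocks a given record occupies. The function name and the example block sizes are assumptions chosen for illustration; records are assumed to be stored back to back from byte 0.

```python
def blocks_touched(record_index, record_size=32, block_size=512):
    """Return the block numbers that hold any byte of the given record (0-based)."""
    first_byte = record_index * record_size
    last_byte = first_byte + record_size - 1
    return list(range(first_byte // block_size, last_byte // block_size + 1))
```

With a block size of 100 bytes, record 3 occupies bytes 96 to 127 and so spans blocks 0 and 1. With a block size of 128 bytes (a multiple of 32), the same record fits entirely in block 0.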

VARIABLE LENGTH RECORDS
Variable length records arise in database systems in several ways:
Storage of multiple record types in a file.
Record types that allow variable lengths for one or more fields.
Record types that allow repeating fields.

VARIABLE LENGTH RECORDS

    type student = record
        class_name : char(20);
        student_info : array [1..∞] of record
            sid : char(4);
            fees : real;
        end
    end

We define student_info as an array with an arbitrary number of elements, so that there is no limit on how large a record can be.
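One common way to store such a record is to write a count before the repeating part. The sketch below is only one possible encoding, not the layout the slides prescribe: it assumes Python's `struct` module, a 2-byte element count after the class name, and illustrative helper names.

```python
import struct

HEADER = "<20sH"  # assumed: 20-byte class name + 2-byte count of student_info entries
ENTRY = "<4sd"    # assumed: 4-byte sid + 8-byte fees


def pack_class(class_name, students):
    """Encode a class record whose length depends on the number of students."""
    header = struct.pack(HEADER, class_name.encode("ascii"), len(students))
    body = b"".join(struct.pack(ENTRY, sid.encode("ascii"), fees) for sid, fees in students)
    return header + body


def unpack_class(raw):
    """Decode a record produced by pack_class."""
    name, count = struct.unpack_from(HEADER, raw, 0)
    offset = struct.calcsize(HEADER)
    students = []
    for _ in range(count):
        sid, fees = struct.unpack_from(ENTRY, raw, offset)
        students.append((sid.rstrip(b"\x00").decode("ascii"), fees))
        offset += struct.calcsize(ENTRY)
    return name.rstrip(b"\x00").decode("ascii"), students
```

Under these assumptions a record takes 22 + 12n bytes for n students, so record positions can no longer be computed by multiplying a fixed size.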

TYPES OF FILE ORGANIZATION
Sequential file organization
Indexed sequential file organization
Direct or random file organization

SEQUENTIAL FILE ORGANIZATION
In sequential file organization, records are arranged in physical sequence by the value of some field, called the sequence field. The field chosen is usually the key field, whose unique values are used to identify records. The records are laid out on the storage device, often magnetic tape, in increasing or decreasing order of the value of the sequence field. Example: IBM’s SAM (sequential access method).

SEQUENTIAL FILE ORGANIZATION
It is the oldest method of file organization.
This organization is simple, easy to understand, and easy to manage.
It is best suited for sequential access, that is, retrieving records one after another in the same order in which they are stored.
With this organization, insertion, updating, and deletion are done by rewriting the entire file.
It is suitable for applications such as a payroll system.
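The "rewrite the entire file" behaviour can be sketched as a single copy pass. In this illustration Python lists stand in for the old and new files, each record is a `(key, data)` tuple, and the function name is hypothetical.

```python
def insert_by_rewrite(old_file, new_record):
    """Copy every record to a new file, slotting new_record in at its key position."""
    new_file = []
    placed = False
    for record in old_file:
        # Write the new record just before the first record with a larger key.
        if not placed and new_record[0] < record[0]:
            new_file.append(new_record)
            placed = True
        new_file.append(record)
    if not placed:
        # The new key is the largest so far, so it goes at the end.
        new_file.append(new_record)
    return new_file
```

Every existing record is read and written once even for a single insertion, which is why updates are usually collected and applied in batches.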

ADVANTAGES & DISADVANTAGES
Advantages:
Simplicity
Less overhead
A sequential file makes the best use of storage space.
Disadvantages:
Difficulty in searching
Lack of support for queries
Problem with record deletion
Processing a sequential file is time consuming.
It has high data redundancy.

INDEXED SEQUENTIAL ACCESS METHOD
The records in this type of file are organized in sequence, and an index table is used to speed up access to the records without requiring a search of the entire file. The records of the file can be stored in random sequence, but the index table is kept in sorted sequence on the key value. The file can be accessed both randomly and sequentially. Records can be updated, deleted, and inserted in indexed file organization because we can limit the amount of reorganizing we need to perform. This technique is referred to as ISAM (indexed sequential access method).
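A minimal in-memory sketch of the idea follows. It is not IBM's ISAM implementation: the sample records are invented, list positions stand in for disk addresses, and the index is a sorted list of `(key, position)` pairs searched with Python's `bisect` module.

```python
from bisect import bisect_left

# Records stored in arrival order (hypothetical sample data).
records = [(30, "Meena"), (10, "Asha"), (20, "Kiran")]

# Index table: (key, position) pairs kept sorted on the key value.
index = sorted((key, pos) for pos, (key, _) in enumerate(records))
keys = [key for key, _ in index]


def lookup(key):
    """Random access: binary-search the index, then fetch the record directly."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return records[index[i][1]]
    return None


def scan_in_key_order():
    """Sequential access: walk the index from the lowest key to the highest."""
    return [records[pos] for _, pos in index]
```

The same index serves both access styles, which is what lets the file be read randomly as well as sequentially.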

ADVANTAGES
In an indexed sequential access file, both sequential and random access are possible.
It accesses records very fast if the index table is properly organized.
Records can be inserted in the middle of the file.
It provides quick access for sequential and direct processing.
It reduces the degree of sequential search.

DISADVANTAGES
An indexed sequential access file requires unique keys and periodic reorganization.
It takes longer to search the index for data access or retrieval.
It requires more storage space.
It is expensive because it requires special software.
It is less efficient in the use of storage space compared to other file organizations.

DIRECT FILE ORGANIZATION
Direct file organization is designed to provide rapid, direct, non-sequential (random) access to records. IBM’s BDAM (basic direct access method) uses this technique. Using this organization, records are inserted in random order. Direct access organization is most often used with databases. A hashing technique such as division/remainder or splitting/folding is used to convert the value of some field into a target address.

DIRECT FILE ORGANIZATION
Collisions can be minimized by choosing a better hashing scheme, increasing the bucket size so that each page holds more records, or reducing the packing density. Overflow is handled by searching forward a predetermined number of slots or by using an overflow area. Synonym pointers connect overflow records.
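The two overflow strategies can be combined in one small sketch: search forward a fixed number of slots, then fall back to a separate overflow area. The class name, table size, probe limit, and the use of `key % size` as the home address are all assumptions for illustration. Deletion and duplicate keys are not handled, and a plain list stands in for the chain of synonym pointers.

```python
class DirectFile:
    """Toy direct-access file with forward probing and an overflow area."""

    def __init__(self, size=7, max_probe=2):
        self.slots = [None] * size   # primary area, one record per slot
        self.overflow = []           # overflow area for records that found no slot
        self.max_probe = max_probe   # how many slots to search forward

    def _home(self, key):
        return key % len(self.slots)

    def insert(self, key, value):
        """Store the record; return its slot number, or -1 if it went to overflow."""
        home = self._home(key)
        for step in range(self.max_probe + 1):
            pos = (home + step) % len(self.slots)
            if self.slots[pos] is None:
                self.slots[pos] = (key, value)
                return pos
        self.overflow.append((key, value))
        return -1

    def find(self, key):
        """Look in the home slot, then the probed slots, then the overflow area."""
        home = self._home(key)
        for step in range(self.max_probe + 1):
            entry = self.slots[(home + step) % len(self.slots)]
            if entry is not None and entry[0] == key:
                return entry[1]
        for k, v in self.overflow:
            if k == key:
                return v
        return None
```

Keys 7, 14, and 21 all hash to slot 0 in a table of size 7, so with `max_probe=1` the third one lands in the overflow area.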

TYPES OF HASHING SCHEME: DIVISION METHOD
In this method, we choose a number M such that M > N, preferably a prime number. The hash function is then defined as
H(K) = K mod M
where N = number of records and K = the key. That is, divide K by M and take the remainder of the division.
For example, if K = 9875, N = 58, and M = 97, then H(K) = 9875 mod 97 = 78.
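The division method is short enough to check directly. In this sketch the function names are illustrative, and the primality test is a simple trial division added only to confirm that the slide's choice of M = 97 is prime.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small table sizes."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def division_hash(key, m):
    """Division method: the remainder of key divided by the table size m."""
    return key % m
```

With the slide's values, `division_hash(9875, 97)` gives 78, since 97 × 101 = 9797 and 9875 − 9797 = 78.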

TYPES OF HASHING SCHEME: MID-SQUARE METHOD
In this method, we take the square of K, i.e. K². We chop off digits from both ends of K²; the remaining value is called L. The hash function is defined as
H(K) = L
For example, if K = 9875, N = 58, and M = 97, then K² = 97515625 and H(K) = middle 2 digits of K² = 15.
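A small sketch of the mid-square method follows. It works on the decimal digits of K² and takes the middle `digits` characters; the function name and the convention of leaning left when the middle is not exactly centred are assumptions.

```python
def mid_square_hash(key, digits=2):
    """Mid-square method: square the key and keep its middle decimal digits."""
    squared = str(key * key)
    start = (len(squared) - digits) // 2  # chop equally from both ends
    return int(squared[start:start + digits])
```

For K = 9875, K² = 97515625 has eight digits, so three are chopped from each end and the middle two digits, 15, remain.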

TYPES OF HASHING SCHEME: FOLDING METHOD
Here K is partitioned into a number of parts K1, K2, K3, …, Kn. The parts are then added together, ignoring the final carry. The hash function is defined as
H(K) = K1 + K2 + … + Kn
For example, if K = 9875, N = 58, and M = 97, then H(K) = 98 + 75 = 173; ignoring the carry, we have H(K) = 73.
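The folding method can be sketched the same way. This version splits the decimal digits of the key into parts of `part_width` digits from the left (the last part may be shorter) and drops the carry by taking the sum modulo 10^part_width; the function name and these conventions are assumptions.

```python
def folding_hash(key, part_width=2):
    """Folding method: split the key into digit groups, add them, drop the carry."""
    digits = str(key)
    parts = [int(digits[i:i + part_width]) for i in range(0, len(digits), part_width)]
    return sum(parts) % (10 ** part_width)
```

For K = 9875 the parts are 98 and 75, their sum is 173, and dropping the carry leaves 73.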

ADVANTAGES
A direct access file helps in online transaction processing (OLTP) systems such as an online railway reservation system.
In a direct access file, sorting of the records is not required.
It accesses the desired records immediately.
It updates several files quickly.
It has better control over record allocation.

DISADVANTAGES
A direct access file does not provide a backup facility.
It is expensive.
It uses storage space less efficiently than a sequential file.

THANK YOU