File Structures: An Introduction
Somchai Prasitjutrakul (สมชาย ประสิทธิ์จูตระกูล)
rasitjutrakul
Outline
- Introduction
- Basic Concepts
- Secondary Storage
- Sequential Files
- Direct Files
- Indexed Files
- Tree-Based Files
- Multilist & Inverted Files
Managing Large Quantities of Data
- Accessed by multiple people and programs
- Kept on external storage devices
- Always reliably available for processing
- Rapidly accessible when information is needed
Speed & Capacity
- Disks are slow.
  - RAM ≈ 100 ns
  - Disk ≈ 10 ms
- Disks provide enormous capacity.
  - RAM ≈ 10 MB (volatile)
  - Disk ≈ 1000 MB (nonvolatile)
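As a rough check on the Speed & Capacity figures, the ratio of the two access times quoted on the slide works out as follows (the symbols t_disk and t_RAM are introduced here only for illustration):

```latex
\frac{t_{\text{disk}}}{t_{\text{RAM}}}
  \approx \frac{10\ \text{ms}}{100\ \text{ns}}
  = \frac{10^{-2}\ \text{s}}{10^{-7}\ \text{s}}
  = 10^{5}
```

On these numbers one disk access costs about as much as 100,000 memory accesses, while the capacity ratio is only about 1000 MB / 10 MB = 100.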
Design Goal
Minimizing disk accesses for files that keep changing in content and size.
A Short History
- 1950s: Sequential access + indexes
- 1960s: Tree structures
- 1970s: B-tree
- 1980s: Extendible hashing
Basic Concept : Outline
- Files
- Records, Fields
- Keys
- Users
- File Processing
- File Design
Filing System
- Persistence
- Sharability
- Size
Files
Savings Account File
Checking Accounts File
Loan Applications File
Employee File
Records
Checking Accounts File

| Account | Name | Address | Balance |
|---|---|---|---|
| 018-745-96 | Thongdee | 36 Sathon, 10600 | 25,250.93 |
| 108-964-09 | Dundee | 488 Rama 4, 10330 | 2,252.00 |
| 116-057-43 | Yudee | 56 Chareonkrung, 10210 | 99,768.25 |
| 248-922-88 | Wangdee | 102 Bantadthong, 10330 | 125,899.29 |
| 741-673-76 | Dundee | 77 Saphanluang, 10330 | 232.48 |
Fields
Checking Accounts File

| Account | Name | Address | Balance |
|---|---|---|---|
| 018-745-96 | Thongdee | 36 Sathon, 10600 | 25,250.93 |
| 108-964-09 | Dundee | 488 Rama 4, 10330 | 2,252.00 |
| 116-057-43 | Yudee | 56 Chareonkrung, 10210 | 99,768.25 |
| 248-922-88 | Wangdee | 102 Bantadthong, 10330 | 125,899.29 |
| 741-673-76 | Dundee | 77 Saphanluang, 10330 | 232.48 |
Files & Records
- A file is a collection of records of the same type.
- A record is a collection of related fields.
Keys
Find the Balance of [ Account = 116-057-43 ]
- Locate the Checking Accounts file.
- Access the record whose Account field contains 116-057-43.
- Retrieve the record from the file.
- Examine the contents of the Balance field.
Keys
A key is a field of a record whose contents identify the record.
Find the Balance of [ Account = 116-057-43 ]
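The lookup "Find the Balance of [ Account = 116-057-43 ]" can be sketched in C as a plain sequential scan on the key field. This is only an illustration under stated assumptions: the `Account` struct, the in-memory `CHECKING` array, and the function name `find_by_account` are hypothetical and not part of the slides, and a real file system would read the records from disk rather than from an array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory record: only the key and the balance are kept. */
typedef struct {
    char   account[11];   /* key field, e.g. "116-057-43" plus NUL */
    double balance;
} Account;

/* Rows copied from the Checking Accounts File shown on the slides. */
static const Account CHECKING[] = {
    {"018-745-96",  25250.93},
    {"108-964-09",   2252.00},
    {"116-057-43",  99768.25},
    {"248-922-88", 125899.29},
    {"741-673-76",    232.48},
};
#define N_CHECKING (sizeof CHECKING / sizeof CHECKING[0])

/* Scan the records in order and return the one whose key matches,
 * or NULL when no record carries that key. */
const Account *find_by_account(const Account *file, size_t n, const char *key)
{
    assert(file != NULL && key != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(file[i].account, key) == 0)
            return &file[i];
    }
    return NULL;
}
```

Calling `find_by_account(CHECKING, N_CHECKING, "116-057-43")` returns a pointer to the matching record, from which the `balance` field can then be examined.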
Primary Keys
| Account | Name | Address | Balance |
|---|---|---|---|
| 018-745-96 | Thongdee | 36 Sathon, 10600 | 25,250.93 |
| 108-964-09 | Dundee | 488 Rama 4, 10330 | 2,252.00 |
| 116-057-43 | Yudee | 56 Chareonkrung, 10210 | 99,768.25 |
| 248-922-88 | Wangdee | 102 Bantadthong, 10330 | 125,899.29 |
| 741-673-76 | Rakdee | 77 Saphanluang, 10330 | 232.48 |

Primary key: a primary key is a field that uniquely identifies the record.
Secondary Keys
| Account | Name | Address | Balance |
|---|---|---|---|
| 018-745-96 | Thongdee | 36 Sathon, 10600 | 25,250.93 |
| 108-964-09 | Dundee | 488 Rama 4, 10330 | 2,252.00 |
| 116-057-43 | Yudee | 56 Chareonkrung, 10210 | 99,768.25 |
| 248-922-88 | Wangdee | 102 Bantadthong, 10330 | 125,899.29 |
| 741-673-76 | Rakdee | 77 Saphanluang, 10330 | 232.48 |

Secondary key: a secondary key is a field that identifies the record, but not uniquely.
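Because a secondary key is not unique, a lookup on it may return several records. The sketch below assumes, for illustration only, that Name serves as the secondary key and uses the rows of the Records slide, where the name Dundee appears on two accounts. The `Row` type, the `ROWS` array, and `find_by_name` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced record: primary key plus the assumed secondary key. */
typedef struct {
    const char *account;
    const char *name;
} Row;

/* Rows as listed on the Records slide. */
static const Row ROWS[] = {
    {"018-745-96", "Thongdee"},
    {"108-964-09", "Dundee"},
    {"116-057-43", "Yudee"},
    {"248-922-88", "Wangdee"},
    {"741-673-76", "Dundee"},
};
#define N_ROWS (sizeof ROWS / sizeof ROWS[0])

/* Store the indices of the rows whose name matches (at most max_out of
 * them) in out[] and return the total number of matching rows. */
size_t find_by_name(const Row *rows, size_t n, const char *name,
                    size_t *out, size_t max_out)
{
    size_t count = 0;
    assert(rows != NULL && name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(rows[i].name, name) == 0) {
            if (count < max_out)
                out[count] = i;
            count++;
        }
    }
    return count;
}
```

Unlike a primary-key lookup, the caller has to be prepared for zero, one, or many matches.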
Users
- End-users
- Application programmers
- System programmers
File Processing Systems
Figure (layered diagram, top to bottom):
- End-users send the request "Retrieve Balance of Account = 116-057-43" and receive the response 99,768.25.
- Checking Accounts File Processing System (application programmers)
- File System (system programmers)
- Checking Accounts File
Users' Concerns
- End-users
  - receive accurate information.
- Application programmers
  - are aware of the file organization, record structure, and access mechanisms.
- System programmers
  - are aware of the available tools and resources to enhance file system efficiency.
Data Transfer
- Logical record: the application programmers' view of the records
- Physical block: the system programmers' view of the records

Figure: the Application Program issues a logical READ to the File System, which issues a physical READ.
Logical Records
    typedef struct customerTag {
        int   iAccount;        /* account number */
        char  szName[20];      /* customer name, NUL-terminated string */
        char  szAddress[50];   /* customer address, NUL-terminated string */
        float fBalance;        /* account balance */
    } recCustomer;

    recCustomer CustomerRecord;

Record layout: iAccount | szName | szAddress | fBalance
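A logical READ or WRITE of one `recCustomer` could look like the following sketch, which uses only standard C stream I/O. The helper names `write_record` and `read_record`, the 0-based record numbering, and the choice to store the struct as raw bytes are assumptions made for illustration. Raw struct dumps depend on the compiler's padding and the machine's byte order, so this is not a portable file format.

```c
#include <assert.h>
#include <stdio.h>

typedef struct customerTag {
    int   iAccount;
    char  szName[20];
    char  szAddress[50];
    float fBalance;
} recCustomer;

/* Write logical record number `index` (0-based). Returns 0 on success. */
int write_record(FILE *fp, long index, const recCustomer *rec)
{
    assert(fp != NULL && rec != NULL && index >= 0);
    if (fseek(fp, index * (long)sizeof *rec, SEEK_SET) != 0)
        return -1;
    return fwrite(rec, sizeof *rec, 1, fp) == 1 ? 0 : -1;
}

/* Read logical record number `index` (0-based). Returns 0 on success. */
int read_record(FILE *fp, long index, recCustomer *rec)
{
    assert(fp != NULL && rec != NULL && index >= 0);
    if (fseek(fp, index * (long)sizeof *rec, SEEK_SET) != 0)
        return -1;
    return fread(rec, sizeof *rec, 1, fp) == 1 ? 0 : -1;
}
```

The application works in terms of record numbers only. How the bytes are grouped into physical blocks is left to the layers below.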
Physical Blocks
Figure: a physical block consists of system data followed by a logical block that holds logical records #1, #2, and #3.

Blocking factor: the number of logical records stored in one physical block (three in this figure).
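For fixed-length records that are not split across blocks, the blocking factor follows from the block size by integer division. The sketch below assumes exactly that case. The function name `blocking_factor` and its parameters are hypothetical, and the byte counts used when calling it are example values, not figures from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Number of whole fixed-length logical records that fit in one physical
 * block once system_bytes have been reserved for system data. */
size_t blocking_factor(size_t block_bytes, size_t system_bytes,
                       size_t record_bytes)
{
    assert(record_bytes > 0 && system_bytes <= block_bytes);
    return (block_bytes - system_bytes) / record_bytes;
}
```

For example, a 256-byte block with 16 bytes of system data and 80-byte records gives a blocking factor of 3. Any leftover bytes are unused space in the block.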
Blocking & Deblocking
- Deblocking: physical block → physical read → input buffer → logical record
- Blocking: logical record → output buffer → physical write → physical block
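Blocking and deblocking amount to copying fixed-size slots into and out of a block-sized buffer. The following is a minimal sketch under assumed sizes: `RECORD_SIZE`, `BLOCKING_FACTOR`, and both function names are hypothetical, and the physical read and write themselves are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RECORD_SIZE     80   /* assumed fixed logical record length (bytes) */
#define BLOCKING_FACTOR 3    /* assumed logical records per physical block  */
#define BLOCK_SIZE      (RECORD_SIZE * BLOCKING_FACTOR)

/* Blocking: copy one logical record into slot `slot` of the output
 * buffer, which is later written out as one physical block. */
void block_record(unsigned char *out_buf, size_t slot,
                  const unsigned char *record)
{
    assert(out_buf != NULL && record != NULL && slot < BLOCKING_FACTOR);
    memcpy(out_buf + slot * RECORD_SIZE, record, RECORD_SIZE);
}

/* Deblocking: copy the logical record in slot `slot` out of the input
 * buffer, which was filled by one physical read. */
void deblock_record(const unsigned char *in_buf, size_t slot,
                    unsigned char *record)
{
    assert(in_buf != NULL && record != NULL && slot < BLOCKING_FACTOR);
    memcpy(record, in_buf + slot * RECORD_SIZE, RECORD_SIZE);
}
```

One physical transfer therefore serves up to `BLOCKING_FACTOR` logical reads or writes.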
Disk Caching
Figure: records move between user space and the buffer; blocks move between the buffer, the disk cache, and the disk.
Blocking Factor
- Blocking factor vs. number of block transfers
- Blocking factor vs. buffer size
- Optimal blocking factor

If the blocking factor were equal to the number of logical records in the file, one could argue that only one data transfer would be needed.
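The trade-off can be made concrete with two small formulas: reading the whole file takes ceil(N / bf) block transfers, while the buffer must hold bf records. The sketch below assumes fixed-length records; the function names and the record size used when calling them are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Block transfers needed to read n_records sequentially when each
 * physical block holds bf logical records: ceil(n_records / bf). */
size_t block_transfers(size_t n_records, size_t bf)
{
    assert(bf > 0);
    return (n_records + bf - 1) / bf;
}

/* Buffer space needed to hold one block of bf fixed-length records. */
size_t buffer_bytes(size_t bf, size_t record_bytes)
{
    return bf * record_bytes;
}
```

With 50 records, a blocking factor of 2 needs 25 transfers and a blocking factor of 50 needs a single transfer, but the buffer grows in proportion, which is why an optimal value has to balance the two.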
Logical & Physical File Structure
- Logical file structure
  - The organization of all logical records in the file.
- Physical file structure
  - The organization of all the physical blocks stored in secondary storage.
Logical & Physical File Structure
Figure: the same 50 logical records (record 1 ... record 50, with keys key 1 ... key 50) shown under two physical organizations with two records per block: a sequential file and a physical linked sequential file.
Access Path
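Figure: a two-level index above the data blocks.
- Top-level index: 31, 70, 90, 130, 162, 200, 250
- Second-level index nodes: [5, 9, 15, 19, 23, 27, 31] and [39, 42, 49, 53, 60, 65, 70]
- Data blocks:
  - 2 Somchai P. ..., 3 Somboon T. ..., 5 Chukiat V. ...
  - 7 Samruay W. ..., 8 Supat R. ..., 9 Chatchart S. ...
  - 12 Kukiat R. ..., 14 Wiwat W. ..., 15 Boonchai S. ...
  - 34 Yingyong E. ..., 35 Rangsan S. ..., 39 Kriengkai F. ...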
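An access path through this index can be traced in code. The sketch below assumes that each index entry holds the highest key found beneath it, which is how the numbers in the figure appear to be arranged. Only the two second-level nodes legible in the figure are included, and the names `ROOT`, `LEVEL2`, `first_at_least`, and `access_path` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define FANOUT 7

/* Top-level index entries from the figure. */
static const int ROOT[FANOUT] = {31, 70, 90, 130, 162, 200, 250};

/* The two second-level nodes that are legible in the figure. */
static const int LEVEL2[2][FANOUT] = {
    { 5,  9, 15, 19, 23, 27, 31},
    {39, 42, 49, 53, 60, 65, 70},
};

/* Index of the first entry >= key, or -1 if key exceeds every entry. */
static int first_at_least(const int *keys, int n, int key)
{
    assert(keys != NULL);
    for (int i = 0; i < n; i++) {
        if (key <= keys[i])
            return i;
    }
    return -1;
}

/* Follow the access path for `key`. On success returns 0 and stores the
 * second-level node and the data block (entry) within that node.
 * Returns -1 when the key falls outside the part of the index above. */
int access_path(int key, int *node, int *block)
{
    int r = first_at_least(ROOT, FANOUT, key);
    if (r < 0 || r >= 2)
        return -1;
    int b = first_at_least(LEVEL2[r], FANOUT, key);
    if (b < 0)
        return -1;
    *node = r;
    *block = b;
    return 0;
}
```

For key 14 the path goes through the root entry 31, then the entry 15 in the first second-level node, which leads to the data block containing keys 12, 14, and 15. The path only identifies the block where the record would be; the block still has to be searched for the key.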
Access Methods
Figure: Physical File Structure → Access Method → Target Record
Classification of Access Methods
- Access methods
  - Primary access methods
    - Sequential access methods
      - Sequential
    - Random access methods
      - Direct
      - Hash
      - Indexed sequential
      - Binary search
      - AVL-tree
      - Paged tree
      - B-tree
      - B+-tree
      - Trie
  - Secondary access methods
    - Inverted file
    - Cellular inverted
    - Multilist
    - Cellular multilist
File Design
- Logical file design
  - select one of the available file organizations
  - design a new file organization
- Physical file design
  - design the physical file
File Design
- Selection of blocking factor
- Allocation of the I/O buffers
- Size of the physical file
- Organization of the physical blocks
- Design or selection of the access method
- Selection of the primary key
- File growth
- Reorganization point
File Operations
- RetrieveAll
- Batch
- RetrieveOne
- RetrieveNext
- RetrievePrevious
- InsertOne
- DeleteOne
- UpdateOne
- RetrieveFew
Performance
- Response time depends on:
  - the type of allowable operations
  - the frequency of each type of operation

Ex. 95% Retrieve_One, 5% Batch: random or sequential?

- Search length
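One way to reason about the 95% Retrieve_One / 5% Batch example is to weight the cost of each operation by its frequency and compare the totals for each candidate organization. The sketch below does only that arithmetic. The function name `expected_cost` is hypothetical, and any per-operation costs passed to it are placeholders the designer must estimate; the slides give none.

```c
#include <assert.h>
#include <stddef.h>

/* Expected cost per operation: the sum over all operation types of
 * (relative frequency) * (cost of that operation, e.g. in block accesses). */
double expected_cost(const double *freq, const double *cost, size_t n)
{
    double total = 0.0;
    assert(freq != NULL && cost != NULL);
    for (size_t i = 0; i < n; i++)
        total += freq[i] * cost[i];
    return total;
}
```

As a purely illustrative case, suppose a random organization costs 1 access per Retrieve_One and 50 per Batch, while a sequential one costs 13 and 25. With frequencies 0.95 and 0.05 the weighted costs are 3.45 and 13.6, so the frequent single-record retrievals dominate the choice.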