OS - Chapter - 5 - File System


Operating Systems and System Programming

Meti Dejene
Haramaya University
Chapter 5 – File System
Introduction
 While a process is running, it can store only a limited amount of information within its own address space.

 However, many applications need to retain large amounts of information persistently.

 The problem is that when the process terminates, the information kept within its address space is lost, which is unacceptable for many applications.

 The operating system solves this problem with an abstraction called a file, which provides a way to store information on disk and read it back later.
File
 A file is a named collection of related information that is recorded on secondary storage.

 It is a logical storage unit of information as a sequence of bits, bytes, lines, or records.

 The information in a file is defined by its creator.

 Many different types of information may be stored in a file: source or executable programs, numeric or text data, photos, music, video, and so on.

 Files are managed by the operating system.

 The part of the operating system that deals with files is known as the file system. It determines how files are structured, named, accessed, used, protected, and implemented.


File Naming

 When a file is created, it is given a file name, which serves as an identifier for the
convenience of human users and lets other processes refer to and access the file by
its name.

 The exact rules for file naming vary somewhat from system to system.

 Many file systems support names as long as 255 characters.

 Case sensitivity - Some file systems distinguish between upper and lowercase letters,
whereas others do not.

 Avoid special characters such as * ? \ / : < > | " because some operating systems do not allow them in file names.

 Avoid reserved words that have special meanings in specific operating systems or
file systems (for example, CON and NUL on Windows).
 Many operating systems support two-part file names, with the two parts separated by a
period.

 The part before the period is the file name.

 The part following the period is called the file extension.

 The file extension usually indicates additional information about the file, such as

 the type of the file and the operations that can be performed on it

 the internal structure of the file

 E.g. Chapter-1.pdf, BubbleSort.cpp
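As a small illustration of two-part names, the sketch below splits the example names at the last period. It relies on Python's standard `os.path.splitext`; the helper name `split_name` is an assumption made only for this example.

```python
import os.path

def split_name(filename):
    """Split a two-part file name into (name, extension without the period)."""
    name, ext = os.path.splitext(filename)   # ext keeps the leading period, e.g. ".pdf"
    return name, ext.lstrip(".")

for f in ["Chapter-1.pdf", "BubbleSort.cpp"]:
    print(split_name(f))
```

A name with no period, such as README, simply yields an empty extension.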


File Types
 The system uses the extension to indicate the type of the file and the type of operations
that can be done on that file.

 A file has a certain defined structure, which depends on its type.

 A text file is a sequence of characters organized into lines (and possibly pages).

 A source file is a sequence of functions, each of which is further organized as declarations followed by executable statements.

 An executable file is a binary file that contains machine code generated by compiling the source code.
Common File Types
File Access
 Files store information and the information in the file can be accessed in several ways.

1. Sequential-access

 A process reads all the bytes or records in a file in order, one after the other, starting
at the beginning; it cannot skip around and read them out of order.

 Example: magnetic tape.

2. Random-access (Direct-access)

 Bytes or records of a file can be read in any order, so a process can go directly to any
block of the file.

 There are no restrictions on the order of reading or writing for a direct-access file.

 We may read block 14, then read block 53, and then write block 7.
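The difference between the two access methods can be sketched with an ordinary file and `seek`. This is only an illustration: the block size of 512 bytes, the 64-block scratch file, and the helper names `read_block` and `write_block` are assumptions made for the example.

```python
import tempfile

BLOCK_SIZE = 512  # assumed block size in bytes, chosen only for this example

def read_block(f, n):
    """Direct access: jump straight to block n and read it."""
    f.seek(n * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)

def write_block(f, n, data):
    """Direct access: jump straight to block n and overwrite it."""
    f.seek(n * BLOCK_SIZE)
    f.write(data)

# Build a scratch file of 64 blocks; block i is filled with the byte value i.
with tempfile.TemporaryFile() as f:
    for i in range(64):
        f.write(bytes([i]) * BLOCK_SIZE)

    # Sequential access: start at the beginning and read block after block.
    f.seek(0)
    first_three = [f.read(BLOCK_SIZE)[0] for _ in range(3)]

    # Direct access: read block 14, then block 53, then write block 7.
    b14 = read_block(f, 14)[0]
    b53 = read_block(f, 53)[0]
    write_block(f, 7, b"\xff" * BLOCK_SIZE)
    b7 = read_block(f, 7)[0]

print(first_three, b14, b53, b7)  # [0, 1, 2] 14 53 255
```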
File Attributes

 Besides the file name, operating systems associate additional information with
each file, called file attributes or metadata.
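A few typical attributes can be inspected from a program. The sketch below uses Python's standard `os.stat` on a scratch file; which attributes exist, and what they are called, varies between operating systems, so the three shown here are just a sample.

```python
import os
import stat
import tempfile
import time

# Create a scratch file so there is something to inspect.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

info = os.stat(path)                    # ask the OS for the file's metadata
size = info.st_size                     # current size in bytes
mode = stat.filemode(info.st_mode)      # file type and permission bits as text
modified = time.ctime(info.st_mtime)    # time of last modification
print(size, mode, modified)

os.remove(path)
```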
File Operations

 Create

 Delete

 Open

 Close

 Read

 Write

 Append

 Seek

 Get attributes

 Set attributes

 Rename
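Most of the file operations listed above map onto calls that a program makes through its language library. The following is a minimal sketch in Python; the file names `notes.txt` and `renamed.txt` are hypothetical, and the comments mark which listed operation each call roughly corresponds to.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
old_path = os.path.join(workdir, "notes.txt")     # hypothetical file names
new_path = os.path.join(workdir, "renamed.txt")

with open(old_path, "wb") as f:       # Create + Open (Close happens on leaving 'with')
    f.write(b"first line\n")          # Write

with open(old_path, "ab") as f:       # Open for Append
    f.write(b"second line\n")

with open(old_path, "rb") as f:
    f.seek(6)                         # Seek to byte offset 6
    tail = f.read()                   # Read from there to the end of the file

os.rename(old_path, new_path)         # Rename
exists_after_rename = os.path.exists(old_path)
os.remove(new_path)                   # Delete
os.rmdir(workdir)
print(tail, exists_after_rename)
```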
Directories (aka Folders)
 A directory is a container used to organize and store files and other directories.

 Directories group files together so that users can keep track of them and conveniently
search for and locate them.

1. Single level directory system

 The simplest form of directory system, in which one directory, the root directory, contains
all the files.

2. Hierarchical directory systems

 A tree of directories where there can be as many directories and sub-directories as needed to group related files together.
Path Names
 A path name specifies a file by its location in the directory tree.

 Two different methods are commonly used.

1. Absolute path name

 Gives the path from the root directory down to the specified file.

 Examples: /usr/ast/mailbox and C:\users\Bucky\OS\Lecture\chapter1.pdf

 In UNIX the components of the path are separated by / and in Windows the
separator is \.

2. Relative path name

 Instead of beginning at the root directory, the path is specified relative to
the current working directory. For example, if the working directory is /usr/ast, the file /usr/ast/mailbox can be referred to simply as mailbox.
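The two kinds of path name can be examined with Python's `pathlib`. This is a sketch only: it uses the pure path classes so that it runs on any system, writes the Windows example with a backslash after the drive letter, and assumes /usr/ast as the current working directory.

```python
from pathlib import PurePosixPath, PureWindowsPath

unix_abs = PurePosixPath("/usr/ast/mailbox")
win_abs = PureWindowsPath(r"C:\users\Bucky\OS\Lecture\chapter1.pdf")

cwd = PurePosixPath("/usr/ast")        # assumed current working directory
relative = PurePosixPath("mailbox")    # relative path name

resolved = cwd / relative              # interpret the relative name against cwd

print(unix_abs.is_absolute(), win_abs.is_absolute(), relative.is_absolute())
print(resolved == unix_abs)            # both names refer to the same file
print(unix_abs.parts)                  # components separated by /
print(win_abs.parts)                   # components separated by \
```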
Directory Operations

 Create

 Delete

 Opendir

 Closedir

 Readdir

 Rename

 Link

 Unlink
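The directory operations above can be tried out with Python's `os` module, as in the rough sketch below. The directory and file names are hypothetical, the mapping in the comments is approximate, and `os.link` assumes the underlying file system supports hard links.

```python
import os
import tempfile

root = tempfile.mkdtemp()                        # scratch area for the example
docs = os.path.join(root, "docs")                # hypothetical directory name

os.mkdir(docs)                                   # Create
with open(os.path.join(docs, "a.txt"), "w") as f:
    f.write("data")

os.link(os.path.join(docs, "a.txt"),             # Link: a second name for the same file
        os.path.join(docs, "b.txt"))

with os.scandir(docs) as entries:                # Opendir ... Closedir
    names = sorted(e.name for e in entries)      # Readdir

os.rename(docs, os.path.join(root, "papers"))    # Rename
docs = os.path.join(root, "papers")

os.unlink(os.path.join(docs, "a.txt"))           # Unlink: remove one name
with open(os.path.join(docs, "b.txt")) as f:
    still_there = f.read()                       # the data survives under the other name

os.unlink(os.path.join(docs, "b.txt"))
os.rmdir(docs)                                   # Delete (the directory must be empty)
os.rmdir(root)
print(names, still_there)
```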
Implementing File Storage

 The most important issues in implementing file storage are

 how to allocate space to files so that storage space is used effectively and
files can be accessed quickly

 how to keep track of which disk blocks belong to which file.

 Various methods are used in different operating systems.


1. Contiguous Allocation

 Each file occupies a set of contiguous (physically adjacent) disk blocks in linear order.

 Contiguous allocation of a file is defined by the address of the first block and the length
of the file (in block units).

 If the file is n blocks long and starts at location b, then it occupies blocks b, b + 1, b + 2, ...,
b + n - 1.

 E.g. On a disk with 1-KB blocks, a 50-KB file would be allocated 50 consecutive blocks.
With 2-KB blocks, it would be allocated 25 consecutive blocks.
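The arithmetic above can be checked with a few lines of code. The function names and the starting block 14 are assumptions made for this sketch.

```python
import math

def blocks_needed(file_size_kb, block_size_kb):
    """Number of whole blocks needed to hold the file."""
    return math.ceil(file_size_kb / block_size_kb)

def contiguous_blocks(b, n):
    """Blocks occupied by an n-block file starting at block b: b .. b + n - 1."""
    return list(range(b, b + n))

print(blocks_needed(50, 1))       # 50
print(blocks_needed(50, 2))       # 25
print(contiguous_blocks(14, 3))   # [14, 15, 16]
```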

 Contiguous disk-space allocation has two significant advantages.


Cont.
A. It is simple to implement, because keeping track of where a file’s blocks are is reduced to remembering two things:

 the disk address of the first block and the length of the file (number of blocks).

B. The read performance is excellent, because only one seek is needed (to the first block); after that, no more seeks or rotational delays are needed.

 Unfortunately, contiguous allocation also has a significant drawback:

 It is difficult to find contiguous disk space for a file.

 Over time, as files are created and deleted, the disk becomes fragmented. Unless the disk is compacted frequently, a large enough run of contiguous free blocks may not exist even when the total free space is sufficient.
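The fragmentation problem can be seen in a small sketch. It searches a hypothetical 10-block disk for the first run of n free blocks (a first-fit search, chosen here only for illustration); the disk layout and the function name `first_fit` are assumptions.

```python
def first_fit(free_map, n):
    """Return the start of the first run of n free blocks, or None.

    free_map[i] is True when block i is free.
    """
    run_start, run_len = None, 0
    for i, free in enumerate(free_map):
        if free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == n:
                return run_start
        else:
            run_len = 0
    return None

# Hypothetical disk: U = used, F = free. Four blocks are free in total,
# but they are split into holes of 2, 1, and 1 blocks.
disk = [c == "F" for c in "UFFUUFUUFU"]
print(sum(disk), first_fit(disk, 2), first_fit(disk, 3))  # 4 1 None
```

A 3-block file cannot be placed even though four blocks are free, which is exactly the situation compaction is meant to repair.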


2. Linked-List Allocation
 Store each file as a linked list of disk blocks.

 The contents of a file can be scattered and stored anywhere on a disk.

 The directory stores the disk address of (that is, a pointer to) the first block of each file.

 Each block of data belonging to a given file contains a pointer, which points to the
next block of data belonging to that file.

 The last block contains a special End of File (EOF) indicator, so that the operating
system will know that it has retrieved all of the data for the file.

 To read a file, we simply read blocks by following the pointers from block to block.

 No space is lost to external fragmentation, since every free disk block can be used by any file.

 The size of a file need not be declared when the file is created, and a file can continue to
grow as long as free blocks are available.
Cont.
 However, linked allocation also has disadvantages.

 The major problem is that, due to the nature of linked lists, it can be used effectively only
for sequential-access files. Random access is extremely slow.

 To find the i-th block of a file, we must start at the beginning of that file and follow the pointers one at a time until we reach the i-th block.

 Another problem of linked allocation is reliability.

 Recall that the blocks of a file are linked together by pointers scattered all over the device, and
consider what would happen if a pointer were lost or damaged due to a bug in the
operating-system software or a hardware failure.

 Such an error could result in picking up a wrong pointer and linking into the free-space
list or into another file.
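Linked allocation can be sketched as follows. The block numbers, the file name, and the EOF marker value are hypothetical; the point is only that reading the whole file is natural, while reaching the i-th block costs one disk read per pointer followed.

```python
EOF = -1  # assumed end-of-file marker for this example

# Hypothetical disk: block number -> (data, pointer to the next block).
disk = {
    9:  ("A0", 16),
    16: ("A1", 1),
    1:  ("A2", 10),
    10: ("A3", 25),
    25: ("A4", EOF),
}
directory = {"fileA": 9}   # the directory stores only the first block

def read_file(name):
    """Sequential read: follow the pointers from block to block."""
    data, block = [], directory[name]
    while block != EOF:
        payload, nxt = disk[block]
        data.append(payload)
        block = nxt
    return data

def find_ith_block(name, i):
    """Locate block i (0-based) of the file; returns (block number, disk reads)."""
    block, reads = directory[name], 0
    for _ in range(i):
        block = disk[block][1]   # the pointer lives inside the block, so each hop is a read
        reads += 1
        if block == EOF:
            raise IndexError("file has fewer blocks")
    return block, reads

print(read_file("fileA"))           # ['A0', 'A1', 'A2', 'A3', 'A4']
print(find_ith_block("fileA", 3))   # (10, 3)
```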
3. Allocation using a File Allocation Table (FAT) in Memory
 This scheme improves linked-list allocation by taking the pointer out of each disk block and putting it in
a table in memory, called a File Allocation Table (FAT), so that the entire block is available for data.

 The table has one entry for each block and is indexed by block number.

 The directory entry contains the block number of the first block of the file.

 Then, the table entry indexed by that block number contains the block number of the
next block in the file.

 This chain continues until it reaches the last block, which has a special EOF value as the
table entry.

 Figure 4-12 shows what the table looks like for two files A and B.

 File A uses disk blocks 4, 7, 2, 10, and 12, in that order, and file B uses disk blocks 6, 3, 11,
and 14, in that order.
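The table for files A and B can be written out and traversed as below. The block numbers come from the example above; the table size of 16 entries and the EOF marker value are assumptions made for the sketch.

```python
EOF = -1  # assumed end-of-chain marker

# FAT indexed by block number; fat[b] is the block that follows b in its file.
fat = [None] * 16          # 16 disk blocks assumed for this example
fat[4], fat[7], fat[2], fat[10], fat[12] = 7, 2, 10, 12, EOF   # file A
fat[6], fat[3], fat[11], fat[14] = 3, 11, 14, EOF              # file B

directory = {"A": 4, "B": 6}   # directory entry: first block of each file

def chain(name):
    """Follow the table entries from the file's first block to EOF."""
    blocks, b = [], directory[name]
    while b != EOF:
        blocks.append(b)
        b = fat[b]
    return blocks

print(chain("A"))  # [4, 7, 2, 10, 12]
print(chain("B"))  # [6, 3, 11, 14]
```

The chain still has to be followed entry by entry, but every step is a table lookup rather than a read of a data block.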
4. Indexed allocation
 Indexed allocation brings all the pointers together into one block: the index block.

 Each file has its own index block, which stores addresses of disk blocks occupied by the
file.

 Given the index block, it is then possible to find all the blocks of the file.

 The directory contains the address of each file's index block.

 The advantage of this scheme is that the index block (called an i-node in UNIX) needs to be in memory only when the
corresponding file is open.

 So, if each index block occupies n bytes and a maximum of k files may be open at once,
the total memory occupied by the array holding the index blocks for the open files is only
kn bytes.
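A minimal sketch of indexed allocation follows. The block numbers, the file name, and the figures n = 512 bytes and k = 100 open files are assumptions chosen only to make the example concrete.

```python
# Hypothetical disk contents, keyed by block number.
disk = {9: "d0", 16: "d1", 1: "d2", 10: "d3", 25: "d4"}

# Block 19 is assumed to be the file's index block: it lists the data blocks in order.
index_blocks = {19: [9, 16, 1, 10, 25]}
directory = {"fileA": 19}      # directory entry -> address of the index block

def read_ith_block(name, i):
    """Direct access: one lookup in the index block, no chain to follow."""
    index = index_blocks[directory[name]]
    return disk[index[i]]

def open_table_bytes(n, k):
    """Memory for the index blocks of k open files, n bytes each: k * n."""
    return k * n

print(read_ith_block("fileA", 3))    # d3
print(open_table_bytes(512, 100))    # 51200
```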
Indexed allocation of disk space
How to Keep Track of Free Disk Blocks
1. The first approach is a linked list of free disk blocks: the free blocks are linked together, i.e. each free block contains a pointer to the next free block.

2. The second approach is grouping: store the addresses of n free blocks in the first free block. The last of these blocks in turn contains the addresses of another n free blocks, and so on.

3. The third technique is the bitmap.

 It is a sequence of bits in which each disk block is assigned one bit, and that bit represents the state of the block.

 Free blocks are represented by 1s in the map, allocated blocks by 0s (or vice versa).

 A disk with n blocks requires a bitmap with n bits.
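The bitmap technique can be sketched as a small class. This version uses the "1 means free" convention; the class name `FreeBitmap`, its methods, and the lowest-block-first allocation policy are assumptions made for the example.

```python
class FreeBitmap:
    """Free-space bitmap: bit i is 1 when block i is free (one possible convention)."""

    def __init__(self, n_blocks):
        self.n = n_blocks
        self.bits = bytearray((n_blocks + 7) // 8)   # n bits, rounded up to whole bytes
        for i in range(n_blocks):                    # start with every block free
            self.release(i)

    def is_free(self, i):
        return bool(self.bits[i // 8] >> (i % 8) & 1)

    def release(self, i):
        """Mark block i as free."""
        self.bits[i // 8] |= 1 << (i % 8)

    def allocate(self):
        """Mark the lowest-numbered free block as allocated and return its number."""
        for i in range(self.n):
            if self.is_free(i):
                self.bits[i // 8] &= ~(1 << (i % 8)) & 0xFF
                return i
        raise RuntimeError("disk full")

bm = FreeBitmap(20)
a, b = bm.allocate(), bm.allocate()   # takes blocks 0 and 1
bm.release(a)                         # block 0 becomes free again
c = bm.allocate()                     # block 0 is handed out again
print(a, b, c, len(bm.bits))          # 0 1 0 3
```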


File Protection

 When information is stored in a computer system, it must be kept safe from damage and
improper access (the issue of protection).

 Protection can be provided in many ways.

 One way is access control: providing controlled access by limiting the types of file access that can be made.
Access Control
 Access is permitted or denied depending on several factors.

 The most common approach is to make access dependent on the identity of the user.

 Different users may need different types of access to a file or directory.

 Identity-dependent access is commonly implemented by associating with each file and directory an access-control
list (ACL) that specifies user names and the types of access allowed for each user.

 When a user requests access to a particular file, the operating system checks the access list associated with that file.

 If that user is listed for the requested access, the access is allowed.

 Otherwise, a protection violation occurs, and access to the file is denied.
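The check described above can be sketched in a few lines. The ACL contents, the user names, and the names `check_access` and `ProtectionViolation` are all hypothetical; a real operating system stores and enforces ACLs inside the kernel.

```python
# Hypothetical access-control lists: file -> {user name: set of allowed access types}.
acl = {
    "/usr/ast/mailbox": {"ast": {"read", "write"}, "bucky": {"read"}},
}

class ProtectionViolation(Exception):
    """Raised when a user requests an access the ACL does not grant."""

def check_access(user, path, access):
    """Allow the request only if the user is listed for that access type."""
    allowed = acl.get(path, {}).get(user, set())
    if access not in allowed:
        raise ProtectionViolation(f"{user} may not {access} {path}")
    return True

print(check_access("ast", "/usr/ast/mailbox", "write"))   # True
try:
    check_access("bucky", "/usr/ast/mailbox", "write")
except ProtectionViolation as err:
    print("denied:", err)
```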
