File Systems
Introduction
At the core of a computer, everything is 1s and 0s, but the organization of
that data is not quite so simple. A bit is a 1 or a 0, a byte is composed of 8
bits, a kilobyte is 1024 (i.e. 2^10) bytes, a megabyte is 1024 kilobytes, and
so on. All these bits and bytes are stored persistently on a hard drive.
Any time you save a file, you are writing thousands of 1s and 0s to a
metallic disc, changing magnetic properties that can later be read back as
1 or 0. There is so much data on a hard drive that there has to be some way
to organize it. Think of a library of books and the old card drawers that
indexed all of them: without that index, we would be lost. Libraries, for
the most part, use the Dewey Decimal System to organize their books. Other
systems exist, but none has attained the same fame as Mr. Dewey's
invention.
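The unit arithmetic above can be sketched in a few lines of Python. This is only an illustration of the 1024-per-step convention used in this text; the names UNITS, to_bytes and human_readable are made up for the example.

```python
# Binary size units as used in the text: each step up is a factor of 1024 (2**10).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def to_bytes(value, unit):
    """Convert a value expressed in `unit` to a number of bytes."""
    return value * 1024 ** UNITS.index(unit)

def human_readable(num_bytes):
    """Format a byte count using the largest unit that keeps the value below 1024."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1024

print(to_bytes(1, "KB"))           # 1024
print(to_bytes(1, "MB"))           # 1048576
print(human_readable(8 * 2**30))   # 8 GB
```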
File systems are the same way. The ones most users are aware of are
the ones Windows uses: vFat and NTFS are the
Windows default file systems.
Several attributes are needed to characterize a file system. These
include:
maximum file size,
maximum partition size,
whether it journals or not.
Journaling
A journaling file system keeps a dedicated area (the journal)
where all changes are tracked. If the system crashes,
file system corruption is less likely because the journal
records which changes were in progress.
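To make the idea concrete, here is a minimal in-memory sketch of a journal. It is a toy model, not how any real file system lays out its journal: the class ToyJournaledStore and its methods are hypothetical, and the rule that changes are recorded in the journal before the main area is touched is an assumption of the sketch.

```python
class ToyJournaledStore:
    """In-memory stand-in for a disk with a dedicated journal area."""

    def __init__(self):
        self.blocks = {}    # "main" file system area: block number -> data
        self.journal = []   # dedicated journal area: list of pending transactions

    def begin(self, changes):
        """Record the intended changes in the journal before touching self.blocks."""
        txn = {"changes": dict(changes), "committed": False}
        self.journal.append(txn)
        return txn

    def commit(self, txn):
        """Mark the journal entry complete; only committed entries are replayed."""
        txn["committed"] = True

    def apply(self, txn):
        """Copy the journaled changes into the main area, then retire the entry."""
        self.blocks.update(txn["changes"])
        self.journal.remove(txn)

    def recover(self):
        """After a crash: replay committed entries and discard incomplete ones."""
        for txn in list(self.journal):
            if txn["committed"]:
                self.blocks.update(txn["changes"])
        self.journal.clear()


store = ToyJournaledStore()
t1 = store.begin({1: "inode table v2", 7: "directory entry"})
store.commit(t1)
# Simulated crash: t1 was committed but never applied to the main area.
t2 = store.begin({9: "half-written update"})
# Simulated crash: t2 was never committed.
store.recover()
print(store.blocks)   # {1: 'inode table v2', 7: 'directory entry'}
```

The committed change survives the simulated crash, while the half-finished one is dropped instead of leaving the main area in an inconsistent state.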
File system   Max file size   Max partition size   Journaling   Notes
Fat16         2 GB            2 GB                 No           Legacy
Fat32         4 GB            8 TB                 No           Legacy
ext2          2 TB            32 TB                No           Legacy
ext2, ext3 and ext4 are all filesystems created for Linux.
This article explains the following:
High-level differences between these filesystems.
How to create these filesystems.
How to convert from one filesystem type to another.
Ext2
Ext2 stands for second extended file system.
It was introduced in 1993 and developed by Rémy Card.
It was developed to overcome the limitations of the
original ext file system.
Ext2 does not have a journaling feature.
Ext2 is recommended on flash drives and USB drives, as it
avoids the overhead of journaling.
Maximum individual file size can be from 16 GB to 2 TB,
depending on the block size.
Overall ext2 file system size can be from 2 TB to 32 TB.
Ext3
Ext3 stands for third extended file system.
It was introduced in 2001 and developed by Stephen
Tweedie.
Ext3 has been available since Linux kernel 2.4.15.
The main benefit of ext3 is that it allows journaling.
Journaling uses a dedicated area in the file system
where all changes are tracked. If the system
crashes, file system corruption is less likely
because of journaling.
Maximum individual file size can be from 16 GB to 2 TB.
Overall ext3 file system size can be from 2 TB to 32 TB.
There are three types of journaling available in the ext3 file
system.
Journal – Metadata and content are saved in the
journal.
Ordered – Only metadata is saved in the journal.
Metadata are journaled only after writing the content
to disk. This is the default.
Writeback – Only metadata is saved in the journal.
Metadata might be journaled either before or after
the content is written to the disk.
You can convert an ext2 file system to an ext3 file system
directly (without backup/restore).
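The difference between the three journaling modes can be written down as data. The sketch below only encodes what the list above states; the names JOURNALED, ORDERINGS and metadata_may_precede_content are made up for illustration, and the model ignores everything else a real journal does.

```python
# What each ext3 journaling mode saves in the journal (as described above).
JOURNALED = {
    "journal":   {"metadata", "content"},
    "ordered":   {"metadata"},
    "writeback": {"metadata"},
}

# For the modes that journal only metadata: the permitted orderings of
# "content written to disk" and "metadata written to the journal".
ORDERINGS = {
    "ordered":   [("content_written", "metadata_journaled")],
    "writeback": [("content_written", "metadata_journaled"),
                  ("metadata_journaled", "content_written")],
}

def metadata_may_precede_content(mode):
    """True if the mode allows metadata to be journaled before the content is on disk."""
    return any(order[0] == "metadata_journaled" for order in ORDERINGS.get(mode, []))

for mode in ("journal", "ordered", "writeback"):
    print(mode, sorted(JOURNALED[mode]), metadata_may_precede_content(mode))
```

Only writeback permits metadata to be journaled ahead of the content it describes, which is why it is the least strict of the three modes.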
Ext4
Ext4 stands for fourth extended file system.
It was introduced in 2008.
Ext4 has been available since Linux kernel 2.6.19.
Supports huge individual file sizes and overall file system
sizes.
Maximum individual file size can be from 16 GB to 16 TB.
Overall maximum ext4 file system size is 1 EB
(exabyte). 1 EB = 1024 PB (petabyte). 1 PB = 1024 TB
(terabyte).
A directory can contain a maximum of 64,000
subdirectories (as opposed to 32,000 in ext3).
You can also mount an existing ext3 file system as ext4
(without having to upgrade it).
Several other new features are introduced in ext4:
multiblock allocation, delayed allocation, journal
checksums, fast fsck, etc. All you need to know is that
these new features have improved the performance and
reliability of the filesystem when compared to ext3.
In ext4, you also have the option of turning the
journaling feature “off”.
Use the method we discussed earlier to identify whether
you have ext2 or ext3 or ext4 file system.
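One possible way to check the file system type on Linux is to read the mount table. The sketch below is an assumption about how this could be done, not necessarily the method referred to above: it parses text in the /proc/mounts layout (device, mount point, type, options, ...), and the function name filesystem_types and the sample devices are hypothetical.

```python
def filesystem_types(mounts_text):
    """Map each mount point to its file system type.

    Expects text in the /proc/mounts layout:
    device, mount point, type, options, dump, pass.
    """
    types = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            types[fields[1]] = fields[2]
    return types


# Made-up sample entries for illustration only.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /mnt/usb ext2 rw 0 0
/dev/sdc1 /data ext3 rw,data=ordered 0 0
"""

print(filesystem_types(SAMPLE))
# {'/': 'ext4', '/mnt/usb': 'ext2', '/data': 'ext3'}

# On a Linux machine, the live table could be read with:
#   with open("/proc/mounts") as f:
#       print(filesystem_types(f.read()))
```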
JFS
The JFS file system is a 64-bit file system created by IBM and ported to Linux in 1999. A
stable version was released in 2001, and JFS was first included in Linux kernel
2.4.18.
JFS was originally released in 1990 with AIX version 3.1. That original version is
sometimes referred to as JFS1.
File names are limited to 255 characters. The file system is 64-bit so that it can
address large files and large partitions. The size limits are as
follows:
File size: 4 PB
File system: 32 PB
To recover from an improper shutdown, the file system uses a journal to track file
metadata and performs a recovery when the system is restarted.
Metadata can then be restored instead of lost. This journaling is
where the file system gets its name.
A B+ Tree is used to track directories, files and extent locations. The B+ Tree allows
searches to be performed much faster than with most other ways of storing this data on disk.
The JFS file system supports Dynamic Inode Allocation. Inodes are 512
bytes each, so 32 Inodes fit in one 16 KB extent. A file system normally has a fixed number of
Inodes, but with Dynamic Inode Allocation, more can be created beyond the standard
limit. On a file system with a fixed Inode count, once the Inodes are used up, no files can be
added until files have been deleted and Inodes freed.
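The numbers above can be checked directly, and the idea of allocating Inodes on demand can be sketched as a toy model. The class DynamicInodeTable is hypothetical and does not reflect the real JFS on-disk structures; it only shows a new extent of 32 Inodes being added when the existing ones are full.

```python
INODE_SIZE = 512          # bytes per Inode (from the text)
INODES_PER_EXTENT = 32    # Inodes per Inode extent (from the text)
EXTENT_SIZE = INODE_SIZE * INODES_PER_EXTENT   # 16384 bytes = 16 KB


class DynamicInodeTable:
    """Toy model: Inode extents are added on demand rather than fixed up front."""

    def __init__(self):
        self.extents = 0   # number of 16 KB Inode extents allocated so far
        self.used = 0      # number of Inodes handed out

    def allocate_inode(self):
        """Return a new Inode number, adding another extent if all are in use."""
        if self.used == self.extents * INODES_PER_EXTENT:
            self.extents += 1
        self.used += 1
        return self.used - 1


table = DynamicInodeTable()
for _ in range(33):
    table.allocate_inode()

print(EXTENT_SIZE // 1024)   # 16
print(table.extents)         # 2 (the 33rd Inode needed a second extent)
```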
Extents are used to help prevent fragmentation. When Extents are used, “reserved” free
space is kept after each file. These contiguous free blocks allow the file to grow without
causing fragmentation, that is, without parts of the file being placed in non-contiguous
locations. When files are spread out, or fragmented, system performance can suffer.
Allocating space on the file system is accomplished by using Extents. To manage the free
space on the file system, B+ Trees are used to track these spaces. Other file systems use a
bitmap to track free and used space. Two B+ Trees are used to track free space on a JFS
file system. One tree is used to store the starting block of the free extents, while the
second B+ Tree indexes the number of free extents for each starting block. To write a file,
the file system can check for a free space with enough contiguous extents, and then find
the starting block to begin writing.
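The two-index lookup described above can be sketched as follows. This is a simplified illustration, not the JFS implementation: sorted Python lists searched with bisect stand in for the two B+ Trees, the class name FreeSpaceIndex is hypothetical, and the best-fit policy is an assumption of the sketch.

```python
import bisect


class FreeSpaceIndex:
    """Free extents indexed two ways: by starting block and by length."""

    def __init__(self, free_extents):
        # free_extents: iterable of (start_block, length_in_blocks)
        self.by_start = sorted(free_extents)
        self.by_size = sorted((length, start) for start, length in self.by_start)

    def allocate(self, blocks_needed):
        """Take the smallest free extent that fits; return its start block or None."""
        # (blocks_needed, -1) sorts before any real entry of the same length.
        i = bisect.bisect_left(self.by_size, (blocks_needed, -1))
        if i == len(self.by_size):
            return None   # no contiguous run is large enough
        length, start = self.by_size.pop(i)
        self.by_start.pop(bisect.bisect_left(self.by_start, (start, length)))
        leftover = length - blocks_needed
        if leftover:
            # Return the unused tail of the extent to both indexes.
            tail_start = start + blocks_needed
            bisect.insort(self.by_start, (tail_start, leftover))
            bisect.insort(self.by_size, (leftover, tail_start))
        return start


index = FreeSpaceIndex([(0, 4), (10, 16), (40, 8)])
print(index.allocate(6))     # 40
print(index.by_start)        # [(0, 4), (10, 16), (46, 2)]
print(index.allocate(100))   # None
```

The size index answers "is there a run that is long enough?", and the entry found there gives the starting block at which writing can begin.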
NOTE: Bitmaps are used to track used and unused space. These bitmaps are not images,
but a file where each bit represents an addressable block. Each bit is either on (1) or off
(0) to represent if it is used or free.
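A block bitmap of this kind can be sketched in a few lines. The class BlockBitmap is a hypothetical illustration, not the format any particular file system uses on disk; each bit of the bytearray represents one addressable block.

```python
class BlockBitmap:
    """One bit per addressable block: 1 = used, 0 = free."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)   # 8 blocks per byte

    def mark_used(self, block):
        self.bits[block // 8] |= 1 << (block % 8)

    def mark_free(self, block):
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def is_used(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def first_free(self):
        """Return the lowest-numbered free block, or None if all are used."""
        for block in range(self.num_blocks):
            if not self.is_used(block):
                return block
        return None


bitmap = BlockBitmap(16)
for block in (0, 1, 9):
    bitmap.mark_used(block)

print(len(bitmap.bits))      # 2 (16 blocks fit in 2 bytes)
print(bitmap.first_free())   # 2
bitmap.mark_free(0)
print(bitmap.first_free())   # 0
```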
To provide more storage, Compression can be used (on AIX only) so that
more data fits in the same amount of space.
JFS also allows for Concurrent I/O (CIO), which gives shared access for reads and writes to a file.
Normally, when a file is read or written, the file is in a "lock" mode to prevent other
processes from performing any I/O on it. With CIO, the lock is shared, which means that
other I/O can be performed at the same time. Reads and writes are normally done serially:
when applications send read or write requests, the requests are fulfilled in the order
they arrive, first come, first served, or first in, first out (FIFO). To improve
performance, reads and writes are performed using Direct I/O.
JFS obtains faster throughput by using Allocation Groups. These are sections of a disk
volume in which a read or write can occur at the same time as I/O in other Allocation Groups
of the same disk volume. This works better when the volume spans multiple disks.
Allocation Groups may store related files within the same group. The files may be related
because they are in the same directory or belong to the same application. When a file is
opened, the Allocation Group as a whole is locked to prevent the files within the group
from being allocated elsewhere. JFS also supports Sparse Files, in which regions of a file
that have never been written do not occupy blocks on disk.