
“OS ASSIGNMENT”

NAME :- Lakhbir Singh
CLASS :- B.E. CSE (4th Semester)
Roll No. :- SG-18329

UNIX

Introduction
Design Principles
Structural, Files, Directory Hierarchy

Filesystem
Files, Directories, Links, On-Disk Structures
Mounting Filesystems, In-Memory Tables, Consistency

IO
Implementation, The Buffer Cache
Processes
Unix Process Dynamics, Start of Day, Scheduling and States
The Shell
Examples, Standard IO
Summary
INTRODUCTION
Introduction
Design Principles
Filesystem
IO
Processes
The Shell
Summary

HISTORY (I)
First developed in 1969 at Bell Labs (Thompson & Ritchie) as a reaction to bloated
Multics. Originally written in PDP-7 asm, but then (1973) rewritten in the "new"
high-level language C so it was easy to port, alter, read, etc. — an unusual choice
given the need for performance
6th edition ("V6") was widely available (1976), including source, meaning people
could write new tools, and nice features of other OSes were promptly rolled in
V6 was mainly used by universities who could afford a minicomputer, but not
necessarily all the software required. The first really portable OS, as the same source
could be built for three different machines (with minor asm changes)
Bell Labs continued with V8, V9 and V10 (1989), but these were never really widely
available because V7 had been pushed to the Unix Support Group (USG) within AT&T
AT&T did System III first (1982), and in 1983 (after the US government split up the
Bell System), System V. There was no System IV.

HISTORY (II)
By 1978, V7 available (for both the 16-bit PDP-11 and the new 32-bit VAX-11).
Subsequently, two main families: AT&T "System V", currently SVR4, and Berkeley:
"BSD", currently 4.4BSD
Later standardisation efforts (e.g. POSIX, X/OPEN) to homogenise
USDL did SVR2 in 1984; SVR3 released in 1987; SVR4 in 1989 which supported the
POSIX.1 standard
In parallel with AT&T story, people at University of California at Berkeley (UCB)
added virtual memory support to "32V" [32-bit V7 for VAX] and created 3BSD.

HISTORY (III)
4BSD development supported by DARPA who wanted (among other things) OS
support for TCP/IP.
By 1983, 4.2BSD released at end of original DARPA project
1986 saw 4.3BSD released — very similar to 4.2BSD, but lots of minor tweaks. 1988
had 4.3BSD Tahoe (sometimes 4.3.1) which included improved TCP/IP congestion
control. 1990 saw 4.3BSD Reno (sometimes 4.3.2) with further improved
congestion control. A large rewrite gave 4.4BSD in 1993; very different structure,
includes LFS, Mach VM stuff, stackable FS, NFS, etc.
Best known Unix today is probably Linux, but also get FreeBSD, NetBSD, and
(commercially) Solaris, OSF/1, IRIX, and Tru64.
SIMPLIFIED UNIX FAMILY TREE
DESIGN PRINCIPLES
Introduction
Design Principles
Structural, Files, Directory Hierarchy
Filesystem
IO
Processes
The Shell
Summary
DESIGN FEATURES
Ritchie & Thompson (CACM, July 74), identified the (new) features of Unix:
A hierarchical file system incorporating demountable volumes
Compatible file, device and inter-process IO (naming schemes, access control)
Ability to initiate asynchronous processes (i.e., address-spaces = heavyweight)
System command language selectable on a per-user basis
Completely novel at the time: prior to this, everything was "inside" the OS. In Unix
there is a separation between the essential things (the kernel) and everything else
Among other things: allows user wider choice without increasing size of core OS;
allows easy replacement of functionality — resulted in over 100 subsystems
including a dozen languages
Highly portable due to use of high-level language
Features which were not included: real time, multiprocessor support.
STRUCTURAL OVERVIEW
Clear separation between user and kernel portions was the big difference
between Unix and contemporary systems — only the essential features inside the
OS, not the editors, command interpreters, compilers, etc.
Processes are the unit of scheduling and protection: the command interpreter
("shell") is just a process
No concurrency within kernel
All IO looks like operations on files: in Unix, everything is a file.
FILESYSTEM
Introduction
Design Principles
Filesystem
Files, Directories, Links, On-Disk Structures
Mounting Filesystems, In-Memory Tables, Consistency
IO
Processes
The Shell
Summary

FILE ABSTRACTION
A file is an unstructured sequence of bytes, which was relatively unusual at the time:
most systems leaned towards files being composed of records
Cons: don't get nice type information; programmer must worry about the format of
things inside the file
Pros: less stuff to worry about in the kernel; and the programmer has flexibility to
choose the format within the file!
Represented in user-space by a file descriptor (fd); this is just an opaque identifier
— a good technique for ensuring protection between user and kernel.

FILE OPERATIONS
Operations on files are:
fd = open (pathname, mode)
fd = create (pathname, mode)
bytes = read (fd, buffer, nbytes)
count = write (fd, buffer, nbytes)
reply = seek(fd, offset, whence)
reply = close(fd)
The kernel keeps track of the current position within the file. Devices are
represented by special files:
Support the above operations, although perhaps with bizarre semantics
Also have ioctl for access to device-specific functionality
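As a minimal sketch (not from the original notes), the following C program copies one file to another using the modern POSIX flavour of these calls; the filenames are purely illustrative.

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        char buf[4096];
        ssize_t n;

        /* Open an existing file read-only; create/truncate the destination. */
        int in = open("input.txt", O_RDONLY);                     /* illustrative name */
        int out = open("copy.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (in < 0 || out < 0) { perror("open"); return 1; }

        /* The kernel tracks the current offset in each file for us. */
        while ((n = read(in, buf, sizeof buf)) > 0)
            if (write(out, buf, n) != n)
                break;                       /* short write: give up in this sketch */

        close(in);
        close(out);
        return 0;
    }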

DIRECTORY HIERARCHY
Directories map names to files (and directories), starting from a distinguished root
directory called /
Fully qualified pathnames mean performing a traversal from the root
Every directory has entries . and .. which refer to itself and its parent respectively.
Also have the shortcut of the current working directory (cwd) which allows relative
path names; and the shell provides access to home directories as ~username
(e.g. ~mort/). Note that the kernel knows about the former but not the latter
Structure is a tree in general, though this is slightly relaxed.
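A small sketch of absolute versus relative naming, assuming the illustrative directories below exist; "now in" simply echoes the current working directory after each change.

    #include <unistd.h>
    #include <stdio.h>
    #include <limits.h>

    int main(void)
    {
        char cwd[PATH_MAX];

        /* Absolute path: name resolution starts from the root, /. */
        if (chdir("/usr/share") == 0)                 /* illustrative directory */
            printf("now in %s\n", getcwd(cwd, sizeof cwd) ? cwd : "?");

        /* Relative path: resolution starts from the current working directory;
           ".." is the parent entry that every directory contains. */
        if (chdir("../local") == 0)
            printf("now in %s\n", getcwd(cwd, sizeof cwd) ? cwd : "?");
        return 0;
    }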

ASIDE: PASSWORD FILE

/etc/passwd holds a list of password entries of the form
user-name:encrypted-passwd:home-directory:shell
Also contains user-id, group-id (default), and friendly name.
Use a one-way function to encrypt passwords, i.e. a function which is easy to
compute in one direction, but has a hard-to-compute inverse. To login:
Get user name
Get password
Encrypt password
Check against version in /etc/passwd
If ok, instantiate login shell
Otherwise delay and retry, with an upper bound on retries

Publicly readable since lots of useful info there, but permits offline attack
Solution: shadow passwords (/etc/shadow)
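A minimal sketch of the classic check, assuming the hash is still readable from /etc/passwd; modern systems keep it in /etc/shadow instead (see getspnam()). Compile with -lcrypt.

    #define _XOPEN_SOURCE
    #include <unistd.h>
    #include <pwd.h>
    #include <string.h>

    int check_password(const char *user, const char *cleartext)
    {
        struct passwd *pw = getpwnam(user);
        if (pw == NULL)
            return 0;
        /* crypt() is the one-way function: easy to compute, hard to invert.
           The stored hash doubles as the salt argument. */
        const char *hashed = crypt(cleartext, pw->pw_passwd);
        return hashed != NULL && strcmp(hashed, pw->pw_passwd) == 0;
    }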

FILE SYSTEM IMPLEMENTATION


Inside the kernel, a file is represented by a data structure called an index-node or
inode which holds file meta-data: owner, permissions, reference count, etc., and the
location on disk of the actual data (file contents).
I-NODES
Why don't we have all blocks in a simple table?
Why have first few in inode at all?
How many references to access blocks at different places in the file?
If block can hold 512 block-addresses (e.g. blocks are 4kB, block addresses are 8
bytes), what is max size of file (in blocks)?
Where is the filename kept?
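A small sketch of the arithmetic behind the maximum-file-size question, assuming a classic layout of 12 direct pointers plus single-, double- and triple-indirect blocks (the exact number of direct pointers varies between Unix filesystems).

    #include <stdio.h>

    int main(void)
    {
        /* Assumed layout: 12 direct pointers, then one single-, one double-
           and one triple-indirect block; 512 block addresses fit per block. */
        unsigned long long per_block = 512, direct = 12;
        unsigned long long max_blocks = direct
            + per_block                             /* single indirect */
            + per_block * per_block                 /* double indirect */
            + per_block * per_block * per_block;    /* triple indirect */
        printf("max file size: %llu blocks (~%llu GB with 4kB blocks)\n",
               max_blocks, max_blocks * 4096 / (1ULL << 30));
        return 0;
    }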
DIRECTORIES AND LINKS
Directory is (just) a file which maps filenames to i-nodes — that is, it has its own
i-node pointing to its contents
An instance of a file in a directory is a (hard) link, hence the reference count in the
inode. Directories can have at most 1 (real) link. Why?
Also get soft or symbolic links: a 'normal' file which contains a filename.
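A hedged sketch of the two kinds of link (file names are illustrative): a hard link adds a second directory entry for the same inode, visible in the reference count, while a symbolic link is just a small file holding the name "data".

    #include <sys/stat.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <stdio.h>

    int main(void)
    {
        struct stat st;
        unlink("alias"); unlink("softalias"); unlink("data");   /* clean up reruns */

        /* Create a file, then a second (hard) link to the same inode. */
        close(open("data", O_CREAT | O_WRONLY, 0644));
        link("data", "alias");
        stat("data", &st);
        printf("hard links to inode: %lu\n", (unsigned long)st.st_nlink);   /* 2 */

        /* A symbolic link is just a 'normal' file containing the name "data". */
        symlink("data", "softalias");
        return 0;
    }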
ON-DISK STRUCTURES

A disk consists of a boot block followed by one or more partitions. Very old disks
would have just a single partition. Nowadays the boot block contains a
partition table allowing the OS to determine where the filesystems are
Figure shows two completely independent filesystems; this is not replication for
redundancy.
ON-DISK STRUCTURES
A partition is just a contiguous range of N fixed-size blocks of size k, for some N and
k, and a Unix filesystem resides within a partition
Common block sizes: 512B, 1kB, 2kB, 4kB, 8kB
Superblock contains info such as:
Number of blocks and free blocks in filesystem
Start of the free-block and free-inode list
Various bookkeeping information
Free blocks and inodes intermingle with allocated ones
On-disk have a chain of tables (with head in superblock) for each of these.
Unfortunately, this leaves the superblock and inode table vulnerable to head crashes,
so we must replicate them in practice. In fact, there is now a wide range of Unix
filesystems that are completely different; e.g., log-structured.
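Purely as an illustrative sketch (the field names below are invented and do not match any real filesystem's on-disk layout), a superblock along the lines described above might look like:

    #include <stdint.h>

    /* Illustrative only: invented fields mirroring the items listed above. */
    struct superblock {
        uint32_t block_size;        /* k: size of each block in bytes         */
        uint32_t total_blocks;      /* N: number of blocks in the filesystem  */
        uint32_t free_blocks;       /* current number of free blocks          */
        uint32_t total_inodes;      /* size of the inode table                */
        uint32_t free_inodes;       /* current number of free inodes          */
        uint32_t free_block_head;   /* head of the chained free-block list    */
        uint32_t free_inode_head;   /* head of the chained free-inode list    */
        uint32_t mount_state;       /* bookkeeping, e.g. "cleanly unmounted"  */
    };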
MOUNTING FILESYSTEMS
Entire filesystems can be mounted on an existing directory in an already mounted
filesystem
At the very start, only / exists, so we must mount a root filesystem
Subsequently can mount other filesystems, e.g.
mount ("/dev/hda2", "/home", options)
Provides a unified name-space: e.g. access /home/mort/ directly
(contrast with Windows 9x or NT).
IN-MEMORY TABLES
Recall a process sees files as file descriptors
In implementation these are just indices into a process-specific open file table.
Entries point to the system-wide open file table. Why?
These in turn point to the (in-memory) inode table.
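One reason for the extra level of indirection is that the system-wide entry holds the current file offset, so related processes can share it. A minimal sketch (any readable file would do here):

    #include <sys/wait.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <stdio.h>

    int main(void)
    {
        char c;
        int fd = open("/etc/passwd", O_RDONLY);
        if (fd < 0) return 1;

        if (fork() == 0) {                 /* child shares the open-file-table entry */
            (void)read(fd, &c, 1);         /* advances the shared offset to 1        */
            _exit(0);
        }
        wait(NULL);
        (void)read(fd, &c, 1);             /* parent continues from offset 1, not 0  */
        printf("offset is now %ld\n", (long)lseek(fd, 0, SEEK_CUR));   /* prints 2   */
        return 0;
    }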
ACCESS CONTROL

Access control information is held in each inode
Three bits for each of owner, group and world: read, write and execute
What do these mean for directories? Read entry, write entry, traverse directory
In addition have set-uid and set-gid bits:
Normally processes inherit the permissions of the invoking user
Setuid/setgid allow a user to "become" someone else when running a given
program
E.g. prof owns both executable test (0711 and setuid), and score file (0600)
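A hedged sketch of that example: if the program below is owned by prof and installed with chmod 4711 (i.e. 0711 plus setuid), it runs with prof's effective uid, so it can append to the 0600 score file even though the invoking student cannot open that file directly. File names mirror the slide's example.

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    /* Install as prof with: chmod 4711 test   (setuid + 0711) */
    int main(void)
    {
        printf("real uid %d, effective uid %d\n", (int)getuid(), (int)geteuid());

        /* Opened with prof's effective uid, so the 0600 score file is writable
           here even though the invoking user cannot open it directly. */
        int fd = open("score", O_WRONLY | O_APPEND);
        if (fd < 0) { perror("open score"); return 1; }
        if (write(fd, "1 mark\n", 7) != 7)
            perror("write");
        close(fd);
        return 0;
    }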
CONSISTENCY ISSUES
To delete a file, use the unlink system call — from the shell, this is rm <filename>
Procedure is:
Check if user has sufficient permissions on the file (must have write access)
Check if user has sufficient permissions on the directory (must have write access)
If ok, remove entry from directory
Decrement reference count on inode
If now zero: free data blocks and free inode
If crash: must check entire filesystem for any block unreferenced and any block
double-referenced
A crash is detected because the OS knows the root filesystem was not unmounted
cleanly.
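A small sketch of the reference counting in action (file name illustrative): the data blocks are only freed once both the directory entry and the last open file descriptor are gone, so a file unlinked while still open remains readable until it is closed.

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        int fd = open("scratch", O_RDWR | O_CREAT | O_TRUNC, 0600);
        if (fd < 0) return 1;
        write(fd, "still here\n", 11);

        unlink("scratch");                 /* removes the directory entry ...      */

        char buf[16];
        lseek(fd, 0, SEEK_SET);
        ssize_t n = read(fd, buf, sizeof buf);   /* ... but the data persists      */
        printf("read %zd bytes after unlink\n", n);

        close(fd);                         /* last reference gone: blocks are freed */
        return 0;
    }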
UNIX FILESYSTEM: SUMMARY
Files are unstructured byte streams
Everything is a file: "normal" files, directories, symbolic links, special files
Hierarchy built from root (/)
Unified name-space (multiple filesystems may be mounted on any leaf directory)
Low-level implementation based around inodes
Disk contains list of inodes (along with, of course, actual data blocks)
Processes see file descriptors: small integers which map to system file table
Permissions for owner, group and everyone else
Setuid/setgid allow for more flexible control
Care needed to ensure consistency
IO
Introduction
Design Principles
Filesystem
IO
Implementation, The Buffer Cache
Processes
The Shell
Summary
IO IMPLEMENTATION
Everything accessed via the file system
Two broad categories: block and character; ignoring low-level gore:
Character IO low rate but complex — most functionality is in the "cooked"
interface
Block IO simpler but performance matters — emphasis on the buffer cache
THE BUFFER CACHE
Basic idea: keep a copy of some parts of the disk in memory for speed.
On read do:
Locate relevant blocks (from inode)
Check if in buffer cache
If not, read from disk into memory
Return data from buffer cache
On write do same first three, and then update version in cache, not on disk
"Typically" prevents 85% of implied disk transfers
But when does data actually hit disk?
Call sync every 30 seconds to flush dirty buffers to disk
Can cache metadata too — what problems can that cause?
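A very simplified sketch of the read path described above, assuming a tiny direct-mapped cache; a real buffer cache adds hashing, LRU replacement, dirty flags and real device IO (the disk_read stub below just stands in for that).

    #include <string.h>
    #include <stdbool.h>

    #define BLOCK_SIZE  4096
    #define CACHE_SLOTS 64

    struct buffer {
        long blockno;                 /* which disk block this slot holds */
        bool valid;                   /* does the slot hold anything yet? */
        char data[BLOCK_SIZE];
    };

    static struct buffer cache[CACHE_SLOTS];

    /* Stand-in for real device IO: a real kernel would issue a disk request here. */
    static void disk_read(long blockno, char *out)
    {
        memset(out, (int)(blockno & 0xff), BLOCK_SIZE);
    }

    /* Read a block via the cache: check for an in-memory copy first, and only
       go to the disk on a miss (the "locate, check, fetch, return" steps above). */
    void bread(long blockno, char *out)
    {
        struct buffer *b = &cache[blockno % CACHE_SLOTS];

        if (!(b->valid && b->blockno == blockno)) {   /* miss: fetch from disk  */
            disk_read(blockno, b->data);
            b->blockno = blockno;
            b->valid = true;
        }
        memcpy(out, b->data, BLOCK_SIZE);             /* hit, or freshly filled */
    }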
PROCESSES
Introduction
Design Principles
Filesystem
IO
Processes
Unix Process Dynamics, Start of Day, Scheduling and States
The Shell
Summary
UNIX PROCESSES
Recall: a process is a program in execution
Processes have three segments: text, data and stack. Unix processes are heavyweight
Text: holds the machine instructions for the program
Data: contains variables and their values
Stack: used for activation records (i.e. storing local variables, parameters, etc.)
UNIX PROCESS DYNAMICS
A process is represented by an opaque process id (pid), organised hierarchically
with parents creating children. Four basic operations:
pid = fork()
reply = execve(pathname, argv, envp)
exit(status)
pid = wait(status)
fork() makes a copy of the entire address space, which is not terribly efficient, and
is nearly always followed by exec() — leading to vfork() and/or copy-on-write (COW).
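A minimal sketch of the usual fork-then-exec pattern (the command run is illustrative):

    #include <sys/wait.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        int status;
        char *argv[] = { "ls", "-l", NULL };      /* illustrative command        */
        char *envp[] = { NULL };

        pid_t pid = fork();                       /* copy the address space      */
        if (pid == 0) {
            execve("/bin/ls", argv, envp);        /* replace the copy with ls    */
            perror("execve");                     /* only reached on failure     */
            _exit(1);
        }
        wait(&status);                            /* parent waits for the child  */
        printf("child %d exited with status %d\n", (int)pid, WEXITSTATUS(status));
        return 0;
    }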
UNIX PROCESS SCHEDULING (I)
The CPU usage recorded against each process decays over time, at a rate that
depends on the system load; thus if e.g. the load is 1, roughly 90% of one second's
CPU usage is "forgotten" within 5 seconds
Base priority divides processes into bands; the CPU and nice components prevent
processes moving out of their bands. The bands are:
Swapper; Block IO device control; File manipulation; Character IO device control;
User processes
Within the user-process band the execution history tends to penalize CPU bound
processes in favour of IO bound processes.
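A small sketch of the arithmetic, assuming the classic 4.3BSD-style decay factor of 2·load/(2·load + 1) applied to the recorded CPU usage once per second:

    #include <stdio.h>

    int main(void)
    {
        double load = 1.0;
        double decay = (2.0 * load) / (2.0 * load + 1.0);   /* = 2/3 at load 1 */
        double cpu = 1.0;               /* one second of CPU usage just recorded */

        for (int t = 1; t <= 5; t++) {
            cpu *= decay;
            printf("after %ds: %.0f%% of that usage still counted\n", t, cpu * 100);
        }
        /* After 5 seconds only ~13% remains, i.e. roughly 90% is "forgotten". */
        return 0;
    }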
UNIX PROCESS STATES

ru = running (user mode)
rk = running (kernel mode)
z = zombie
p = preempted
sl = sleeping
rb = runnable
c = created
THE SHELL
Introduction
Design Principles
Filesystem
IO
Processes
The Shell
Examples, Standard IO
Summary

THE SHELL
The shell is just a process like everything else. It needn't understand commands,
just files
Uses the path for convenience, to avoid needing fully qualified pathnames
Conventionally & specifies background
Parsing stage (omitted) can do lots: wildcard expansion ("globbing"), "tilde"
processing
Prompt is $. Use man to find out about commands.

STANDARD IO
Every process has three fds on creation:
stdin: where to read input from
stdout: where to send output
stderr: where to send diagnostics
Normally inherited from parent, but shell allows redirection to/from a file, e.g.,
ls >listing.txt
ls >&listing.txt
sh <commands.sh
Consider: ls >temp.txt; wc <temp.txt >results
Pipeline is better (e.g. ls | wc >results)
Unix commands are often filters, used to build very complex command lines
Redirection can cause some buffering subtleties.
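As a hedged sketch of what a shell does for a pipeline like ls | wc, using pipe(), fork() and dup2() (error handling omitted):

    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        int p[2];
        pipe(p);                               /* p[0] = read end, p[1] = write end */

        if (fork() == 0) {                     /* first child: ls writes the pipe   */
            dup2(p[1], STDOUT_FILENO);         /* stdout -> pipe                    */
            close(p[0]); close(p[1]);
            execlp("ls", "ls", (char *)NULL);
            _exit(1);
        }
        if (fork() == 0) {                     /* second child: wc reads the pipe   */
            dup2(p[0], STDIN_FILENO);          /* stdin <- pipe                     */
            close(p[0]); close(p[1]);
            execlp("wc", "wc", (char *)NULL);
            _exit(1);
        }
        close(p[0]); close(p[1]);              /* parent closes both ends and waits */
        wait(NULL); wait(NULL);
        return 0;
    }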
SUMMARY
Introduction
Design Principles
Filesystem
IO
Processes
The Shell
Summary


MAIN UNIX FEATURES


File abstraction
file is an unstructured sequence of bytes
(Not really true for device and directory files)
Hierarchical namespace
Directed acyclic graph (if we exclude soft links)
Thus, can recursively mount filesystems
Heavy-weight processes
IO: block and character
Dynamic priority scheduling
Base priority level for all processes
Priority is lowered if process gets to run
Over time, the past is forgotten
But V7 had inflexible IPC, inefficient memory management, and poor kernel
concurrency
Later versions address these issues.

SUMMARY
Introduction
Design Principles
Structural, Files, Directory Hierarchy
Filesystem
Files, Directories, Links, On-Disk Structures
Mounting Filesystems, In-Memory Tables, Consistency
IO
Implementation, The Buffer Cache
Processes
Unix Process Dynamics, Start of Day, Scheduling and States
The Shell
Examples, Standard IO
Summary
MS-DOS

Nowadays, it’s impossible to introduce desktop computers without seeing the name of Microsoft around sooner or
later. This company totally managed to take over this world. If they tell your computer to commit suicide, unless
you’re some kind of bearded geek not using their software, it will.

The story of Microsoft begins with the emergence of personal computers, and with two clever guys called Bill
Gates and Paul Allen who loved to mess around with computers. They were popular at the time in the computing
world, because they wrote useful software (things that made writing other software much easier, a spreadsheet, a
word processor…) on a low-cost computer called the Altair 8800 (that didn’t even include a keyboard and a
monitor).

When people at IBM – who ruled a big part of the computer world at the time – started having an interest in
micro-computers, building what was going to be a big hit called the IBM PC, they were attracted by this popularity
and licensed some of Gates’s software. And when they failed to get the major operating system of those days,
CP/M, they went back to Gates and asked him if he could provide them with an OS for it. Gates found one, bought
it, and hired the person who wrote it to modify it so that it fit IBM’s needs. MS-DOS was born.

The first releases of MS-DOS essentially provided the file abstraction (i.e. the ability to consider that disk space is
divided into chunks of data called files, allowing one to ignore the actual structure of the floppy disk he works on),
some primitive tools to manipulate internal memory and text files (i.e. files containing text and numbers), and basic
configuration routines, everything being stolen from the dominant OS of those days, CP/M. One used it by typing
commands on a keyboard, pressing Return, and reading a resulting dump of text on the screen, an interface that’s
called a command line. However, it introduced a new commercial idea that was here to last: the idea of selling an
operating system bundled with a computer, betting that users wouldn’t bother using another one if there was already
something working there.

The Command Line interface

Subsequent releases included support for higher-capacity data storage, directories (a kind of box which may
contain files and other directories, and hence a hierarchical way to organize files), then for more languages, then
for clocks (a chip whose goal is to measure time, useful not only for calendar/watch-like applications but also for
real-time applications) and even higher-capacity data storage, then for computer networks. After that, DOS 4, 10 times
heavier than the first release of DOS with 110 KB memory usage (which is 1/30 the size of a common MP3 file),
introduced “dosshell”, a new optional way to explore files that was a little less linear than the command line and a
direct ancestor of Windows’ file explorer.

The Shell interface

Subsequent releases of MS-DOS, besides adding support for more storage space (again…) through management of
multiple hard drives and a new file management system, and support for more RAM, introduced three new
tools: one to compress files in order to save disk space (at the cost of slower data access), one to check the
hard drive for errors (due, for example, to a bad machine shutdown) and try to fix them, and a simple antivirus,
MSAV.
As one may figure out, at this stage DOS didn’t really evolve anymore. Those last things really were secondary,
questionable features that were added just to sell new releases of the operating system and make money from it,
plus maintenance improvements to keep up with new hardware. MS-DOS had reached a mature stage of
evolution, and was a bit left behind while Microsoft was now working on its new product, Windows, initially
running on top of DOS, which we’re going to describe in a later article.

Now that we have described the features and evolution of MS-DOS, we can discuss them. A few points stand out,
especially:

● Most of the development effort on the hardware abstraction side was spent on file and storage space
management: Some releases of DOS took months or years of development just to add support for newer,
higher-capacity diskettes. Improvement in OS support for other kinds of hardware was very limited. This may
be partly explained by the fact that, at the time, everyone was okay with letting programs deal with the bare
hardware. Hardware manufacturers made the thing a lot easier by letting everyone know how to interact with
their hardware, and by introducing hardware that was dead easy to manipulate for someone used to
assembly language. Standards were the rule rather than the exception, so if you could make your program
work on one computer, you were 95% sure that it would work on other hardware without modifying it in any way.
There wasn’t that much viral software at the time to harm a machine given direct access to it, especially because
people didn’t download and run random software from the internet, for the internet didn’t
exist for most people. On the other hand, performance was a critical issue, and there’s no program faster on
specific hardware than one written specifically for it, using it directly.
Part of that wasn’t true for storage hardware. First, there wasn’t any kind of dominant storage medium; there
was a jungle of incompatible technologies (multiple kinds of tapes, various flavours of diskettes, the first hard
disk drives…), and they shared the common characteristic of being awful to manipulate. Then, performance
wasn’t such an issue for file storage: if you store something, it’s not for using it right away, and you don’t
spend your time reading and storing files on a diskette in your programs when you’ve got a main memory
that’s a lot faster. Last, file storage and manipulation was almost the sole thing that an average unskilled user,
trying to use software rather than write it, was forced to get into, in order to find and run the software he used,
or to copy the text files he wrote to disk, so the process had to be as simple as possible. The Shell, whose
only purpose was to better visualize the hierarchical structure of directories and find files quicker, perfectly
illustrates that.

● Little to no planning, features come as needed: Let’s see… As DOS aged, two different file systems – ways
to manage the file abstraction – were used, one after another, called FAT12 and FAT16. FAT16 was
introduced to address the maximum disk capacity limit of FAT12. They are extremely close to each other,
for maximum compatibility with older programs, so that differentiating FAT12 and FAT16 is extremely
difficult, but at the same time they are structurally incompatible. This is a typical example of a hack, a
modification in a program that a developer introduces when he understands that he messed up, but doesn’t
want to write a new design doc and do other silly rigorous conception work, and just wants to fix the sole thing
that doesn’t work. Perhaps the most common source of bugs in software is when people forget about the hack
(or don’t even know about it) and push it to its limits.
At the time of DOS, they couldn’t plan things, since they didn’t have a clue how the computer business
would move on, so they introduced gradual changes, leading to hack accumulation, and hence bug
multiplication. Let’s anticipate the following articles by saying that this is one of the reasons that led to the
abandonment of DOS later: managing, correcting, and more generally modifying it had become too complicated,
since no one could ever understand exactly how it worked. The operating system needed a complete rewrite.

From the history of DOS, we may extract the following keys to its success and subsequent fall:

● Always focus on what’s important first: Well, the reasons for that are pretty obvious.
● Don’t neglect the OS’s theoretical foundation and design: Or, sooner or later, hacks will occur.
● Make the doc simple, complete, and easily accessible: And you save people a lot of trouble, preventing effort
duplication on the way. Hardware manufacturers knew that, at the time.
● Don’t neglect the average user: After all, there are a lot more users than there are developers, so you know which
ones make lots of sales.
● Cleanness is the friend of reliability: Hacking things together always seems fine at the time, but it is the source of
most of computers’ lack of reliability nowadays.
● Compatibility may be the enemy of cleanness if the software is poorly designed to start with: This will be even
more obvious after we study the story of Windows.
● The search for maximal performance may be an enemy too: There are times when only a hack will make a
program faster. Though if I were you, I’d choose reliability and simplicity of design over performance, as
long as said performance is sufficient.
