Operating Systems Notes


Inter Process Communication (IPC)

- Processes within a system may be independent or cooperating.


- Reasons for cooperating processes
o Information sharing
o Computation speedup
o Modularity
- Cooperating processes need inter-process communication.
- 2 Models of IPC
o Shared Memory
o Message passing.
- Shared Memory
o An area of memory shared among the processes that wish to communicate.
o The communication is under the control of the users’ processes, not the operating system.
o Advantage: fast
o Major issues: error-prone, requires explicit synchronization, less protection
o System calls (see the sketch after this list):
 int shmget(key, size, flags);
 Creates a shared memory segment.
 Returns the ID of the segment.
 key: access value associated with the segment.
 size: size of the memory segment
 flags: IPC_CREAT or IPC_EXCL and user permissions
 void *shmat(shmid, addr, flags);
 attaches the shared memory segment shmid to the address space of the calling
process
 addr: preferred attach address (NULL lets the kernel choose); returns a pointer to the attached segment
 int shmdt(addr);
 detaches the shared memory segment attached at addr
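Example: a minimal sketch of how these calls can be combined between a parent and child (assuming a POSIX/System V environment; shmctl with IPC_RMID, not listed above, is used only to clean up the segment):

#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0666); // create a 4KB segment

    if (fork() == 0) {                       // child: writer
        char *buf = shmat(shmid, NULL, 0);   // attach
        strcpy(buf, "hello from child");
        shmdt(buf);                          // detach
        return 0;
    }

    wait(NULL);                              // parent: reader
    char *buf = shmat(shmid, NULL, 0);
    printf("parent read: %s\n", buf);
    shmdt(buf);
    shmctl(shmid, IPC_RMID, NULL);           // remove the segment
    return 0;
}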
- Message Passing
o IPC facility provides two operations: send(message) and receive(message)
o the message size is either fixed or variable
o OS maintains a message queue
o Advantage: less error
o Major issue: OS gets involved so it’s slow
- Producer-consumer problem
o paradigm for cooperating processes – producer process produces information that is
consumed by a consumer process
 unbounded-buffer: places no practical limit on the size of the buffer
 bounded buffer: assumes that there is a fixed buffer size
o suppose that we wanted to provide a solution to the consumer-producer problem that fills
the buffer

 we can do so by having an integer counter that keeps track of the num of full
buffers
 initially counter is set to 0
 the int counter is incremented by the producer after it inserts a new item to the
buffer
 the int counter is decremented by the consumer after it consumes an item from
the buffer
pseudocode (producer):
while(true){
while(counter == BUFFER_SIZE) {
/* do nothing */
}

buffer[in] = next_produced;
in = (in +1)% BUFFER_SIZE;
counter++;
}

pseudocode (consumer)
while(true){
while(counter == 0) {
/* do nothing */
}

next_consumed = buffer[out];
out = (out + 1) % BUFFER_SIZE;

counter--;
/* consume the item in next consumed */
}
o Race condition is created
 counter++ could be implemented as

register1 = counter;
register1++;
counter = register1;

 counter-- could be implemented as


register2 = counter;
register2--;
counter = register2;

- Message Passing

o Mechanism for processes to communicate and to synchronize their actions
o Message system – processes communicate with each other without resorting to shared
variables
o IPC facility provides send and receive operations
o message size is either fixed or variable
- Logical and communication links
o for P and Q to communicate, they need to establish a communication link between them
and exchange messages via send and receive functions
o logical implementation of links and send/receive
 direct or indirect
 synchronous or asynchronous
 automatic or explicit buffering
o Direct Communication
 processes must name each other explicitly
 send(P, message) – send a message to process P
 receive(Q, message) – receive a message from Q
 properties of communication link:
 links are established automatically
 a link is associated with exactly one pair of communicating processes
 between each pair there exists exactly one link
 the link may be unidirectional, but is usually bi-directional
o indirect communication
 messages are sent to and received from mailboxes, or ports
 each port has its own id. processes can only communicate if they have a shared
port
 properties of communication link:
 link established only if processes share a common port
 a link may be associated with many processes
 each pair of processes may share several communication links
 link may be bi-directional or unidirectional
 operations when OS owns port
 create a new port, send and receive message through port, destroy a port
 primitives are defined as:
 send(A, message) - send a message to port A
 receive(A, message) – receive a message from mailbox A
 mailbox sharing may mean that processes who are not meant to receive a
message get it anyway. Solutions:
 Allow a link to only be associated with at most 2 processes
 Allow only one process at a time to execute a receive operation
 allow the system to select arbitrarily the receiver. Sender is notified who
the receiver was.

- Synchronization

o message passing may be either blocking or non-blocking
o blocking is considered synchronous
 blocking send – the sender is blocked until the message is received
 blocking receive – the receiver is blocked until a message is available
 producer-consumer becomes trivial
o non-blocking is considered asynchronous
 non-blocking send – the sender sends the message and continues
 non-blocking receive – the receiver receives either a valid message or a null
message.
- Buffering
o Queue of messages is attached to the link
o Implemented in one of three ways
1. Zero-capacity – no messages are queued on a link. Sender must wait for receiver
(rendezvous)
2. Bounded capacity – finite length of n messages. Sender must wait if link full
3. Unbounded capacity – infinite length, sender never waits
- Examples: Sockets
o Sockets are bidirectional.
o Unix sockets (IPC), network sockets (over network)
o 2 types:
 Datagram socket (UDP, connectionless)
 Stream socket (TCP, connection oriented)
- Examples: Signals
o Signals permit asynchronous one-way communication
 from process to process, or to a group of processes, via the kernel
 some signals have fixed meaning (e.g. to terminate a process)
 some signals can be user defined
o Signal handler: every process has default code to execute for each signal. Some signal
handlers can be overridden to do other things (see the sketch below).
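Example: a minimal POSIX sketch of overriding the default handler for SIGINT using sigaction (the handler name handle_sigint is illustrative):

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void handle_sigint(int signum) {
    (void)signum;
    write(STDOUT_FILENO, "caught SIGINT\n", 14);   /* async-signal-safe output */
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_sigint;     /* override the default action for SIGINT */
    sigaction(SIGINT, &sa, NULL);

    for (;;)
        pause();                       /* wait for signals */
}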

Pipes
- File operations
o fd = open("filename", flags);
 fd = file descriptor, a non-negative integer value
o read(fd, buf, size);
o write(fd, buf, size);
o close(fd);
- Each process has a file descriptor table (see image page)
- Pipe system call
o pipe(fd)
 OS creates a pipe and searches for unused file descriptor numbers to assign to
each end of the pipe.

o Call returns 2 file descriptors


 read handle and write handle
 a pipe is a half-duplex communication – meaning there is two-way
communication, but only from one side at a time. Switching is required.
 Data written in one file descriptor can be read through another.
o Both fds are initially in the same process, but this doesn't mean they're useless. Parent and child
processes share the fds after fork: the parent can use one end and the child the other (see the sketch below).
o Named pipes: two endpoints of a pipe can be in different processes.
o Pipe data buffered in OS buffers between read and write.
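Example: a minimal sketch of a parent writing to a child over one pipe (error handling omitted):

#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd[2];
    char buf[64];

    pipe(fd);                          /* OS allocates read and write descriptors */

    if (fork() == 0) {                 /* child: reader */
        close(fd[1]);                  /* close the unused write end */
        ssize_t n = read(fd[0], buf, sizeof(buf) - 1);
        if (n > 0) {
            buf[n] = '\0';
            printf("child read: %s\n", buf);
        }
        close(fd[0]);
        return 0;
    }

    close(fd[0]);                      /* parent: writer */
    write(fd[1], "hello", 5);
    close(fd[1]);
    return 0;
}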
- 2-way communication via pipes
o say there are two pipes opened, pipe1 and pipe2. It would be unnecessary for the parent
and child to both read and write to/from both pipes.
 instead, have one pipe for the child to write to and parent to read from, and one
for the parent to write to and child to read from. This eliminates unnecessary
pipes.

- Project 1 IPC - kth smallest finding
o Parent spawns 3 child processes and assigns IDs
o After reading inputs, each child sends ready
o Parent sends REQUEST to a random child.
o Child sends a random number from its input
o Parent sends PIVOT to all the child processes
o Parent broadcasts the PIVOT element (the number sent from the child in the previous
step)
o Children send out the number of elements that are smaller than the pivot.
o Parent sends LARGE to children; children then drop all elements larger than PIVOT (set
to -1)
 m = sum(# of larger elements), if k > m, k = k-m, if m == k-1, kth element found.
o Procedure repeats: parent sends request to a random child
o child sends a random number from its input.
o Parent sends PIVOT to all the child processes
o parent broadcasts the pivot element
o children send the no. of elements smaller than the pivot
o Parent sends SMALL to children
o Elements smaller than PIVOT are dropped (set to -1)
 m = sum(# of smaller elements), if k > m, k = k-m, if m == k-1, element found.
o Procedure repeats: parent sends REQUEST to a random child
o Child sends a random number from its input.
o Parent sends PIVOT to all the child processes
o Parent broadcasts the pivot element
o Children send number of elements smaller than pivot
 m = sum(# of smaller elements), if k > m, k = k-m, if m == k-1, element found.
- Parent process: N pairs of pipes, spawns N child processes. Assigns IDs to children. Sends
commands to child processes. Child process reads the commands. Child process sends their
responses via the second pipe. Parent process receives the values.
- fd[0] = read end, fd[1] = write end

Threads
- Threads are separate streams of execution within a single process. Threads in a process are not
isolated from each other. Each thread state (thread control block) contains registers (including EIP
and ESP) and stack.

- Benefits of threads:
o Responsiveness – interactive applications
o Resource sharing is easy – threads share memory and other resources of a process
o Economical – thread creation and context switching incur lower cost than for processes.
o Scalability – better utilization of multi-core systems.
- Processes versus threads:
o Process:
 A process has code, heap, stack, other segments
 Process has at least one thread
 Threads within a process share the same I/O, code, and files.
 If a process dies, all threads die.

o Thread:
 A thread has no data segment or heap
 A thread cannot live on its own. It needs to be attached to a process
 There can be more than one thread in a process. Each thread has its own stack
 If a thread dies, its stack is reclaimed.
- Multithreading models
o Two strategies for managing threads:
 User threads – thread management is done by user level thread library. Kernel
knows nothing about the threads.
 Kernel threads – threads directly supported by the kernel; also known as lightweight
processes.
o User level threads:
 Advantages:
 Fast (really lightweight) – no system calls to manage threads. The thread
library does everything.
 Can be implemented on an OS that does not support threads

 Switching is fast. No switch from user to protected (kernel) mode.
 Disadvantages:
 Scheduling can be an issue. (Consider, one thread that is blocked on an
IO and another runnable)
 Lack of coordination between kernel and threads. (A process with 1000
threads competes for a time slice with a process having just one thread)
 Requires non-blocking system calls. (If one thread invokes a system call,
all threads need to wait).
o Kernel Level Threads
 Advantages:
 Scheduler can decide to give more time to a process having large number
of threads than process having small number of threads
 Kernel-level threads are good for applications that frequently block.
 Disadvantages:
 The kernel-level threads are slow (they involve kernel invocations)
 Overheads in the kernel (Since kernel must manage and schedule thread
as well as processes. It requires a full thread control block (TCB) for
each thread to maintain information about threads.)
o many-to-one model
 many user-level threads map to a single kernel thread
 Pros:
 Fast. No system calls to manage threads
 No mode change for switching threads
 Cons:
 No parallel execution of threads. All threads block when one has a
system call
 Not suited for multi-processor systems
o one-to-one model
 each user thread associated with one kernel thread
 pros:
 better suited for multiprocessor environments
 when one thread blocks, the other threads can continue to execute
 cons:
 Expensive. Kernel is involved.
o many-to-many model
 Many user threads mapped to many kernel threads
 Supported by some Unix and windows versions
 pros:
 flexible
 OS creates kernel threads as required
 process creates user threads as needed
 cons
 complex. double management

- pthread library
o Creating a thread: int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void
*(*start_routine)(void*), void *arg);
 pthread_t *thread is a thread identifier, much like a pid
 *start_routine is a pointer to the function at which the new thread starts
execution
 void *arg is the argument passed to that function
o Destroying a thread: void pthread_exit(void *retval);
 void *retval is the exit value of the thread
o join (meaning, wait for a specific thread to complete): int pthread_join(pthread_t thread,
void **retval); (see the sketch after this list)
 pthread_t thread is the TID of the thread to wait for
 void **retval is the exit status of the thread
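Example: a minimal sketch of these calls (compile with -pthread; the worker function and its argument are illustrative):

#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg) {
    int id = *(int *)arg;
    printf("thread %d running\n", id);
    pthread_exit(NULL);                        /* exit value of the thread */
}

int main(void) {
    pthread_t tid;
    int id = 1;

    pthread_create(&tid, NULL, worker, &id);   /* start worker in a new thread */
    pthread_join(tid, NULL);                   /* wait for it to finish */
    return 0;
}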
- Threading issues
o What happens when a thread invokes fork?
 Duplicating all threads is not easily done. Other threads may be running or
blocked in a system call.
 Duplicating only the caller thread is more feasible.
o Segmentation fault in a thread: should only that thread terminate, or the entire
process?
- Typical usage of threads is as follows:
o Event occurs?
 If no: Check again
 If yes: Create a thread, service the event, terminate the thread.
o Using thread pools, threads don’t have to be created every time an event occurs, they can
be waiting and assigned once they’re needed. Less delay, less overhead, but limited
number of threads
- Implicit threading: thread creation and management done by run-time libraries and compilers

Priority Scheduling
- MLQs
o Processes assigned to a priority class. Each class has its own ready queue.
 The scheduler picks the highest priority class (queue) which has at least one
ready process.
 Selection of a process within the class may have its own policy. (Typically, RR
but not always.) High priority classes can implement FCFS in order to ensure
quick response time for critical tasks.
o Scheduler can adjust the time slice based on the queue class picked.
 IO bound processes can be assigned to higher priority classes with shorter time
slices.
 CPU bound processes can be assigned to lower priority classes with longer time
slices.
o Disadvantage: the class of a process must be assigned a priori – not all that efficient.
o Process dynamically moves between priority classes based on its CPU/IO activity.
o Basic observation:
 CPU bound processes are likely to complete the entire time slice
 IO bound processes may not need the entire time slice.
o All processes start in the highest priority class.
 If it finishes its time slice, it’s likely CPU bound: move to the next lower priority
class.
 If it does not finish its time slice, likely IO bound: stay in current priority class.
o As with any priority based scheduling scheme, starvation needs to be dealt with.
o A compute intensive process can trick the scheduler and remain in the high priority queue
class.
 By sleeping for the last small portion of the time slice, it tricks the scheduler into
thinking that it's IO bound and stays in the same priority queue instead of
moving down to the lower class as it should.
- Completely fair scheduling (CFS) – Linux
o Each runnable process has an associated virtual runtime (vruntime)
 At every scheduling point, if a process has run for t ms, then vruntime += t.
 Vruntime for a process therefore monotonically increases.
o The idea is that when a timer interrupt occurs, the task with the lowest vruntime is
chosen, its dynamic time slice is computed, and the high-resolution timer is programmed
with its time slice.
o The process begins to execute in the CPU.
o When the interrupt occurs again, context switch happens only if there is another task with
a lower vruntime.
o Uses a red-black tree to pick the next process.
 Each node in the tree represents a runnable task.
 Nodes are ordered according to their vruntime.

 Nodes on the left have a lower vruntime compared to nodes on the right of the
tree. The leftmost node is the task with the least vruntime (this is cached as
min_vruntime)
 At a context switch, pick the leftmost node of the tree. It’s cached, so it’s
accessed in O(1)
 If the previous process is runnable, it is inserted into the tree depending on its
new vruntime. Done in O(log(n)).
 Tasks move from left to right of the tree as they execute, so
starvation is avoided.
 Red-black trees are self-balancing. No path in the tree will be twice as long as
any other path.
 All operations are O(log(n))
 See red-black tree example (figure not included in this text version)

o Priority, due to nice values (priority rating), used to weigh the vruntime.
 If a process has run for t ms, vruntime += t*(weight based on nice of process)
 A lower priority implies time moves at a faster rate compared to that of a high
priority task.
- Low priority = high nice value = high weight
- High priority = low nice value = low weight
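A toy sketch of the weighted update described above (this is not the kernel's actual code; the weight() mapping is purely illustrative, chosen so that a higher nice value, i.e. lower priority, makes vruntime grow faster, following the convention in these notes):

#include <stdio.h>

/* hypothetical weight: nice 0 -> 1.0, each +1 nice step raises the weight ~10% */
static double weight(int nice) {
    double w = 1.0;
    for (int i = 0; i < (nice >= 0 ? nice : -nice); i++)
        w = (nice >= 0) ? w * 1.10 : w / 1.10;
    return w;
}

int main(void) {
    double vruntime = 0.0;
    int nice = 5;        /* low priority task */
    int t = 10;          /* ran for t = 10 ms since the last scheduling point */

    vruntime += t * weight(nice);   /* vruntime += t * (weight based on nice) */
    printf("vruntime after one slice: %.2f\n", vruntime);
    return 0;
}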
- IO bound processes
o Should get higher priority and a longer time to execute compared to CPU bound
o CFS achieves this efficiently.
 IO bound processes have small CPU bursts and therefore a low vruntime.
They appear toward the left of the tree (so they effectively get high priority)
 IO bound processes will typically get larger time slices due to their smaller vruntime.
- New process
o Added to the RB tree
o Starts with an initial value of min_vruntime
 this ensures that it gets to execute quickly.

Process Synchronization
- Race conditions – situations where several processes access and manipulate the same data. The
outcome depends on the order in which the access takes place.
o Prevent race conditions by synchronization, meaning we ensure that only one process is
manipulating the critical section at a time
o Example:
o int counter = 5

o program 1: o program 2:
{ {
… …
counter ++; counter--;
… …
} }

o the shared variable is accessed by both. In a multicore system, these two programs could
be running simultaneously on different cores, meaning that the counter is not accurate.
- Critical Section: a section of code in which shared data is accessed by 2 or more processes
o Requirements:
 Mutual Exclusion: no more than one process in the critical section at a given
time
 Progress: when no process is in the critical section, any process that requests
access must be permitted without any delay.
 No starvation: There is an upper bound on the number of times a process enters
the critical section while another one is waiting.
- Locks and Unlocks
o lock(L) – acquire lock L exclusively.
 only the process with L can access the critical section
o unlock(L) – release exclusive access to lock L
 permitting other processes to access the critical section
- Single instructions by themselves are atomic
o multiple instructions need to be explicitly made atomic.

Lock implementation:

- Disabling interrupts:
o Simple: when interrupts are disabled, context switches cannot happen
o Requires privileges: user processes generally cannot disable interrupts
o Not suited for multicore systems

Code implementation:

Process 1 Process 2

while(1) { while(1)
disable interrupts () disable interrupts ()
critical section critical section
enable interrupts () enable interrupts ()
… …
} }

- Turn based
o Achieves mutual exclusion
o Busy waiting – waste of power and time
o Needs to alternate execution in critical section
o Has a common turn variable that is modified by both processes
 this forces the processes to alternate
 solution: have 2 flags, 1 for each process

Code implementation:

shared: int turn = 1;

Process 1 Process 2

while(1) { while(1) {
while(turn == 2); // lock while(turn == 1); // lock
critical section critical section
turn = 2; //unlock turn = 1; //unlock
… …
} }

- Two flags
o Does not need to alternate execution in the critical section.
o Does not guarantee Mutual exclusion

Code implementation:

Process 1 Process 2

while(1) { while(1) {
while(p2_inside == true); while(p1_inside == true);
p1_inside = true; p2_inside = true;
critical section critical section
p1_inside = false; p2_inside = false;
… …
} }

o Both p1 and p2 can enter the critical section at the same time:

o The problem is that the flag is set after we break from the while loop.

- Modified Two Flags:


o Achieves mutual exclusion
o Does not achieve progress – could deadlock
Code Implementation:

Shared: p1_wants_to_enter, p2_wants_to_enter

Program 1 Program 2

while(1) { while(1) {
p1_wants_to_enter = true; p2_wants_to_enter = true;
while(p2_wants_to_enter == true); while(p1_wants_to_enter == true);
critical section critical section
p1_wants_to_enter = false; p2_wants_to_enter = false;
… …
} }

- Peterson’s solution solves this deadlock


o Breaks the tie with a favoured process.
o Mutual exclusion when p1_wants_to_enter = true and p2_wants_to_enter = true.
Favoured will decide which one executes.
- wait = down (P), post/signal = up (V).

Code implementation:

Shared: p1_wants_to_enter, p2_wants_to_enter, favoured

Process 1 Process 2

while(1) { while(1) {
p1_wants_to_enter = true; p2_wants_to_enter = true;
favoured = 2; favoured = 1;

while(p2_wants_to_enter && while(p1_wants_to_enter &&


favoured == 2); favoured == 1);
critical section critical section
p1_wants_to_enter = false; p2_wants_to_enter = false;
… …
} }

Hardware Support for Synchronization
- Motivating example:

Shared: lock = 0;

Process 1 Process 2

while(1) { while(1) {
while(lock != 0); while(lock != 0);
lock = 1; //lock lock = 1; //lock
critical section critical section
lock = 0; //unlock lock = 0; //unlock
… …
} }

This does not achieve mutual exclusion:


lock = 0
P1: while (lock != 0)
CONTEXT SWITCH
P2: while (lock !=0)
P2: lock = 1;
CONTEXT SWITCH
P1: lock = 1;

…both processes in critical section.

- We need to make: while(lock!=0); lock = 1; atomic.

- Hardware support
o Test and set: write to a memory location and return its old value
 The entire function is executed atomically
 If two CPUs execute testAndSet at the same time, the hardware ensures
that only one of them does both its steps before the other starts.
 Atomizing the process:
while(1) {
while(testAndSet(&lock) == 1) ;
critical section
lock = 0; //unlock;

}

 The way it works is that the first invocation of testAndSet will read a 0 then set
the lock to 1 and return. The second testAndSet will see the lock as 1 and loop
continuously until lock becomes 0.

o xchg instruction (Intel x86)
 write to a memory location and return its previous value.
 example of implementation:

int xchg(addr, value){


%eax = value
xchg %eax, (addr)
//eax is returned
}

void acquire(int *locked){


while(1) {
if(xchg(locked,1) == 0)
break;
}
}

void release(int *locked){


*locked = 0;
}
o Compare and Swap
 It compares the contents of a memory location with a given value and, only if
they are the same, modifies the contents of that memory location to a new given
value.
 Entirely atomic

int compareAndSwap(int *L, int exp, int v){


int prev = *L;
if(*L == exp)
*L = v;
return prev;
}
 Using this in a lock implementation:
while(true){
while(compareAndSwap(&lock, 0, 1) == 1)
; //spin until the lock is acquired
/* critical section */
lock = 0; //unlock
}

- Spin Locks
o One process will acquire the lock, the other will wait in a loop, repeatedly checking if the
lock is available.
o The lock becomes available when the former process releases it.
o Issues:
 No compiler optimizations should be allowed.
 Should not reorder memory loads and stores
 No caching of the lock variable is possible. All xchg operations are bus transactions.
 CPU asserts the LOCK signal to inform that there is a locked memory access.
 Acquire function in spinlock invokes xchg in a loop… each operation is a bus
transaction which means huge performance hits.
o acquire code:
void acquire(int *locked){
while(1) {
if(xchg(locked,1) == 0)
break;
}
}
o Improved acquire code:
void acquire(int *locked){
reg = 1;
while(xchg(locked, reg) == 1)
while(*locked ==1);
}
o Original had a loop with xchg, bus transactions and huge overhead. The new way has an
inner loop that allows caching of locked. Access cache instead of memory and avoid the
bus transactions.
o Busy waiting – spin locks are useful for short critical sections, where not much CPU time is
wasted waiting. They're not useful when the period of wait is unpredictable or will take a
long time – use mutex locks in that case.
- Mutex Locks
o If the critical section is locked, then yield CPU, go to a sleep state. While unlocking,
wake up the sleeping process.

int xchg(addr, value){


%eax = value
xchg %eax, (addr)
//eax is returned
}

void lock(int *locked){
while(1) {
if(xchg(locked,1) == 0)
break;
else
sleep();
}
}

void release(int *locked){


*locked = 0;
wakeup();
}
o Many processes wake up when the event occurs
 all waiting processes wake up, leading to several context switches. All processes
will go back to sleep except for one, which gets the critical section. Large
number of context switches could lead to starvation.
 SOLUTION: When entering the critical section, push into a queue before
blocking. When exiting the critical section, wake up only the first process in the
queue. Updated code:
void lock(int *locked){
while(1) {
if(xchg(locked,1) == 0)
break;
else
//add this process to Queue
sleep();
}
}

void release(int *locked){


*locked = 0;
//remove process P from queue
wakeup(P);
}

Semaphores
- Semaphore’s purpose
o In the producer-consumer problem (aka bounded buffer problem – see page 1), producer
produced and stores in the buffer, consumer consumes from the buffer. There’s trouble
when the producer produces but the buffer is full OR when the consumer consumes but
the buffer is empty.

o Consider the classic sleep/wakeup version of this problem (the instruction sequence figure is
not reproduced here): the consumer sees an empty buffer, but before it actually sleeps it is
context-switched out; the producer inserts an item and calls wakeup.
o Note that the wakeup is lost, because the consumer was not yet asleep. The consumer will wait even
though the buffer is not empty. Eventually the producer and consumer will wait indefinitely.
o Semaphores were introduced by Dijkstra to fix this issue.

- Semaphore Functionality
o Functions up and down must be atomic.
o down, also called P
o up, also called V
o can have different variants such as blocking, non-blocking
o if S (the semaphore) is initially set to 1, Blocking semaphore would be similar to a
mutex, non-blocking semaphore would be similar to a spinlock.
void down(int *S) {
while(*S <= 0); //if zero, busy wait
(*S)--; //otherwise, decrement by 1
}

void up(int *S){
(*S)++; //increment by 1
}
o example of the producer consumer with semaphores:
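Since the original example figure is not reproduced here, the following is a standard compilable POSIX sketch of the same idea (compile with -pthread). empty counts free slots, full counts filled slots, and mutex protects the buffer indices:

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define BUFFER_SIZE 8
#define ITEMS 32

static int buffer[BUFFER_SIZE];
static int in = 0, out = 0;
static sem_t empty, full, mutex;

static void *producer(void *arg) {
    for (int i = 0; i < ITEMS; i++) {
        sem_wait(&empty);              /* down(empty): wait for a free slot */
        sem_wait(&mutex);              /* down(mutex): enter critical section */
        buffer[in] = i;
        in = (in + 1) % BUFFER_SIZE;
        sem_post(&mutex);              /* up(mutex) */
        sem_post(&full);               /* up(full): one more filled slot */
    }
    return NULL;
}

static void *consumer(void *arg) {
    for (int i = 0; i < ITEMS; i++) {
        sem_wait(&full);               /* down(full): wait for an item */
        sem_wait(&mutex);
        int item = buffer[out];
        out = (out + 1) % BUFFER_SIZE;
        sem_post(&mutex);
        sem_post(&empty);              /* up(empty): one more free slot */
        printf("consumed %d\n", item);
    }
    return NULL;
}

int main(void) {
    sem_init(&empty, 0, BUFFER_SIZE);  /* BUFFER_SIZE free slots initially */
    sem_init(&full, 0, 0);
    sem_init(&mutex, 0, 1);

    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}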

- Deadlock: a situation where a set of processes wait indefinitely without making any progress.
Each process is waiting for an event that only another process in the set can cause.

- Dining Philosopher Problem (DPP)


o Philosophers can either think or eat.
o To eat, a philosopher needs to hold both forks (the one on his left and the one on his
right)
o If the philosopher is not eating, he is thinking
o We want to create an algorithm where no philosopher starves.
o Want to avoid Deadlocks
- Solution to DPP using mutex locks:
o protect critical sections with a mutex to prevent deadlock
o has performance issues – only one philosopher can eat at a time

void philosopher(int i) {
while(true) {
think(); //for some time
lock(mutex);
takeFork(i);
takeFork((i+1) % N);
eat();
putFork(i);
putFork((i+1) % N);
unlock(mutex);
}
}
- Best solution with semaphores
o Use N semaphores (s[0], s[1], … s[N-1]), all initialized to 0, and a mutex. Each philosopher
has 3 states: HUNGRY, EATING, THINKING.
 A philosopher can only move to the EATING state if neither of his neighbours is
eating (see the sketch below)
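A sketch of this solution in the same pseudocode style used elsewhere in these notes (the semaphore type is treated as the integer type used by down/up above; think() and eat() are assumed helpers):

#define N 5
#define LEFT  ((i + N - 1) % N)
#define RIGHT ((i + 1) % N)
#define THINKING 0
#define HUNGRY   1
#define EATING   2

int state[N];              /* state of each philosopher */
semaphore mutex = 1;       /* protects the state array */
semaphore s[N];            /* one per philosopher, all initialized to 0 */

void test(int i) {
    /* i may eat only if hungry and neither neighbour is eating */
    if (state[i] == HUNGRY && state[LEFT] != EATING && state[RIGHT] != EATING) {
        state[i] = EATING;
        up(&s[i]);         /* let philosopher i proceed */
    }
}

void take_forks(int i) {
    down(&mutex);
    state[i] = HUNGRY;
    test(i);               /* try to acquire both forks */
    up(&mutex);
    down(&s[i]);           /* block if the forks were not acquired */
}

void put_forks(int i) {
    down(&mutex);
    state[i] = THINKING;
    test(LEFT);            /* a neighbour may now be able to eat */
    test(RIGHT);
    up(&mutex);
}

void philosopher(int i) {
    while (1) {
        think();
        take_forks(i);
        eat();
        put_forks(i);
    }
}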

- DP Variant: consider the version of the DPP in which the forks are placed at the centre of the
table and any two of them can be used by a philosopher.
o Assume the requests for forks are made one at a time. The rule to prevent deadlocks in
this situation is:
 When a philosopher wants to eat, he checks if both required forks are available.
 If both forks are available, the philosopher picks them up and starts eating.
 If one or both forks are not available, the philosopher waits until they're both
available

Deadlock Detection
- Conditions for a deadlock:
o Mutual exclusion
 Each resource is either available or currently assigned to exactly 1 process.
o Hold and wait
 A process holding a resource can request another resource
o No pre-emption
 resources previously granted cannot be forcibly taken away from a process
o Circular wait
 There must be a circular chain of two or more processes, each of which is waiting
for a resource held by the next member of the chain.
o All four of these must be met for a resource deadlock to occur.
o Having multiple resources can potentially reduce the chance of having a deadlock.

- Detection and Handling Strategies


o OS needs to keep track of :
 current resource allocation
 current requests
o Uses this information to detect deadlocks

- When there’s just one of each resource available, any cycles in the graph will be deadlock
situations
o If there is one possible order that the processes could execute in without encountering a
deadlock, the system is not deadlocked.
- When there are multiple instances of a resource:
o use a current allocation matrix (who currently has what) and a request matrix (who needs
what) to figure out if there are any possible execution orders that can be implemented.
Example:

- Compare each row of the request matrix with the vector of currently available resources to determine
which requests can be met. When a process's request can be satisfied, assume it runs to completion:
its entire current allocation is then released back into the available vector, and the procedure repeats.
Any process whose request can never be satisfied is deadlocked.
- When a deadlock is detected:
o Alarm is raised to alert users/administrator
o Pre-emption – take away a resource from a process temporarily (frequently impossible)
o Rollback – checkpoint states and then roll back.
o Kill low priority process – keep killing processes until the deadlock is broken (or reset
system).

Deadlock Avoidance

- State – state of the system reflects the current allocation of resources to processes
- Safe state – state in which there is at least one sequence of resource allocations to processes that
does not result in a deadlock. (No cycles in the resource allocation graph (RAG))
- Unsafe state – state in which no safe sequence of resource allocations exists (Cycles will occur in
the RAG)
- Deadlock state – state in which a set of processes is already deadlocked (cycles do occur in the
RAG)

Banker’s Algorithm
- Resource trajectories – the system decides in advance if allocating a resource to a process will lead
to a deadlock.
- RAG:
o Process Vertices: represent processes currently running in the system
o Resource Vertices: represent resources available in the system
o Request Edges (dashed lines): an arrow points from a process to a resource vertex if
the process is requesting that resource but hasn't been granted it yet.
o Allocation Edges (solid lines): an arrow points from a resource vertex to a process if that
resource has been allocated to the process and is currently in use
- In banker's algorithm, each process declares the maximum number of instances it may need of each
resource type.
o System decides whether the allocation of resources will leave the system in a safe state. If
yes, the resource is allocated, otherwise the process must wait.
- Data structures:
o number of processes = n
o number of resource types = m
o Available = vector length m. Available[j] = k means that there are k instances of resource
type Rj available
o Max = n x m matrix. Maximum requests of each process. Max[i,j] = k means that process
Pi may request, at most, k instances of resource type Rj
o Allocation = n x m matrix representing the current resource allocation to each process
o Need = n x m matrix representing the remaining resources needed by each process.
 Need[i,j] = Max[i,j] – Allocation[i,j]
- Resource-request
1. If Request_i <= Need_i go to step 2; otherwise raise an error condition, since the process has
exceeded its maximum claim
2. If Request_i <= Available go to step 3; otherwise this process must wait, as the resources
required are not available
3. Have the system pretend to have allocated the requested resources to the process by
modifying the state: Available = Available – Request_i, Allocation_i = Allocation_i +
Request_i, Need_i = Need_i – Request_i. If the resulting state is safe the allocation is made;
otherwise the process waits and the old state is restored (see the safety-check sketch below).
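A sketch of the safety check behind this decision: after pretending to grant the request, look for an order in which every process can obtain its remaining Need and finish (the matrix sizes and sample values are illustrative, not from the original notes):

#include <stdbool.h>
#include <stdio.h>

#define N 3   /* processes (illustrative) */
#define M 3   /* resource types (illustrative) */

bool is_safe(int avail[M], int alloc[N][M], int need[N][M]) {
    int work[M];
    bool finished[N] = {false};
    for (int j = 0; j < M; j++) work[j] = avail[j];

    for (int pass = 0; pass < N; pass++) {
        for (int i = 0; i < N; i++) {
            if (finished[i]) continue;
            bool ok = true;
            for (int j = 0; j < M; j++)
                if (need[i][j] > work[j]) { ok = false; break; }
            if (ok) {                                /* Pi can finish and release everything */
                for (int j = 0; j < M; j++) work[j] += alloc[i][j];
                finished[i] = true;
            }
        }
    }
    for (int i = 0; i < N; i++)
        if (!finished[i]) return false;              /* no safe sequence exists */
    return true;
}

int main(void) {
    int avail[M]    = {3, 3, 2};
    int alloc[N][M] = {{0, 1, 0}, {2, 0, 0}, {3, 0, 2}};
    int need[N][M]  = {{2, 2, 2}, {1, 2, 2}, {3, 0, 0}};
    printf("safe? %s\n", is_safe(avail, alloc, need) ? "yes" : "no");
    return 0;
}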

- Deadlock avoidance
o Advantages:
 it is not necessary to pre-empt and rollback processes as in deadlock detection.
o Restrictions
 Maximum resource requirement for each process must be stated in advance
 Process under consideration must be independent and with no synchronization
requirements
 Fixed number of resources to allocate
 Process holding resources may not exit

Deadlock Prevention

- Idea: Prevent one of the four necessary conditions for a deadlock (Mutual exclusion, hold and
wait, no pre-emption, circular wait).
- Mutual Exclusion
o It’s practically impossible to provide a method to break the mutual exclusion condition
since most resources are intrinsically non-sharable
 e.g. two philosophers cannot share the same fork at the same time.
- Hold and wait
o a process acquires all the needed resources simultaneously before its execution, therefore
breaking the hold and wait condition
 e.g. in the DPP, each philosopher is required to pick up both forks at the same
time. If the philosopher fails, they have to release the fork(s) (if any) they have
acquired.
o Disadvantages:
 Starvation: a process may be held up for a long time waiting for all its resource
request to be filled
 Low resource utilization: resources allocated to a process may remain unused for
a considerable period.
 Practical problem: a process may not know in advance all its resource
requirements.
- No pre-emption
o Idea: allow resources to be pre-empted if they are being held while the process is waiting
 if a process holding some resources requests another resource that cannot be
immediately allocated to it, resources currently being held by that process are
released
o Drawback: some resources cannot/should not be pre-empted (e.g. mutex locks)
- Breaking circular wait
o Defining a linear ordering of resource types
o requiring that resources must be acquired in ascending order
- Summary
o ME – sharable resources
o Hold and wait – request all resources initially
o No pre-emption – take resources away
o Circular wait – order resources numerically

Memory Management
- Process in memory:
o A program in execution is present in the RAM and consists of:
 Executable instructions
 Stack
 Heap
 State in the OS (in kernel)
o The state contains:
 Registers
 List of open files
 Processes
 Etc.
o Executable is stored on the hard disk, but the process executes from RAM
- Memory manager: keeps track of which parts of memory are in use, allocates memory to
processes when they need it, and deallocates it when they are done
- With static relocation, no two programs can safely coexist in memory
o No abstraction: programs address physical memory directly

- Memory abstraction
o Address space: set of addresses that a process can use to address memory
o Dynamic relocation: map each process' address space onto a different part of physical
memory
- Tackling memory overloads
o Swapping – bringing in each process in its entirety, running it for a while, then putting it
back on the disk.
o Virtual Memory – allows programs to run even when they are only partially in main
memory
- Single Contiguous Model
o No sharing: one process occupies ram at a time. When one process completes another
process is allocated RAM. Process memory size is restricted by RAM size.

- Partition model: if sufficient contiguous space is available, new processes are allocated memory.
o Partition table example:

Memory Address   Size   Process   Usage
0x0              120K   4         In Use
120K             60K    1         In Use
180K             20K    5         In Use
200K             10K    -         Free

o If there is sufficient space, a new process is allocated. Once a process is complete, its
space is deallocated from RAM (that amount of space becomes available).
 This can lead to fragmentation if there is enough space, but it isn’t contiguous
o Compaction: Shuffle memory contents to place all free memory together in one large
block
 change the base value for each of them as they are moved. Possible only if
relocation is dynamic, and done at execution time
 This is expensive (O(n)) in size of physical memory
o Dynamically Growing processes: may grow into adjacent holes or shuffle memory
location. Extra memory space is allocated for such processes
- Methods of choosing free memory (see the comparison sketch after this list):
o First Fit: Allocate the process to the first available free space that is big enough
 May make fragmentation worse
o Best Fit: Allocate the process to the smallest free space that is big enough
 May affect performance
o Worst Fit: Allocate the process to the largest free space
 May not be able to fit as many processes (large fragments)
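A small comparison sketch of the three placement policies over a list of free holes (the hole sizes and the request are illustrative):

#include <stdio.h>

#define HOLES 4

int main(void) {
    int hole[HOLES] = {120, 60, 20, 10};   /* free partition sizes in KB */
    int request = 25;                      /* new process needs 25 KB */
    int first = -1, best = -1, worst = -1;

    for (int i = 0; i < HOLES; i++) {
        if (hole[i] < request) continue;                        /* hole too small */
        if (first == -1) first = i;                             /* first fit  */
        if (best == -1 || hole[i] < hole[best]) best = i;       /* best fit   */
        if (worst == -1 || hole[i] > hole[worst]) worst = i;    /* worst fit  */
    }
    printf("first=%d best=%d worst=%d\n", first, best, worst);
    return 0;
}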
- Deallocation of memory can have overheads of merging partitions
- Limitations:
o Entire process needs to be in RAM
o Allocation needed to be in contiguous memory
o These led to
 Fragmentation
 Limit the size of process by RAM size
 Performance degradation due to bookkeeping and managing partitions
- Modern day systems use virtual memory and segmentation techniques

Paging
- Divide physical memory into fixed-size blocks called frames (size is a power of 2, e.g. between 512
bytes and 4KB)
- Divide logical/virtual memory into blocks of the same size called pages
- Keep track of all free frames
- To run a program of n pages, need to find n free frames and load program

- To keep track, set up a page table to translate logical to physical addresses


o Role of the memory management unit (MMU)
o There’s a page table for each process, managed and held by the OS.
o With each entry in the page table, associate a validation bit (if = 0, illegal page) and
read/write/execute privileges
o Keep track of how main memory is used (free/allocated) using a frame table. Usually, one
frame table is managed by the OS

- Address translation scheme: the address generated by the CPU is divided into:


o Page number (p) – used as an index into a page table which contains base address of each
page in physical memory
o Page offset (d) – combined with base address to define the physical memory address that
is sent to the memory unit.
o For a given logical address space of size 2^m and page size 2^n (see the sketch at the end of this list):
 Page number, p = m - n bits
 Page offset, d = n bits
o Small page size = large page table
o Large page size, if processes are small, leads to internal fragmentation
 External fragmentation is when there is wasted memory due to unused small
gaps

 Internal Fragmentation is when the last page allocated to a process is not
utilized (fully). This memory is allocated to the process but remains unused
because it’s not enough to hold another complete page of the process’ data.
 This leads to reduced memory efficiency and increasing paging
overhead.
o Typical page size is 4KB (getpagesize() on Linux)
o Page table is kept in main memory
o Page table base-register (PTBR) points to the page table and page table length register
(PTLR) indicates the size of the page table
o Every data/instruction access requires two memory accesses (one for the page table entry
and one for the data/instruction)
o Two memory access problem can be solved using a special fast-lookup hardware cache
called Translation look-aside buffers (TLBs).
o Some TLBs store address space identifiers (ASIDs) in each TLB entry which uniquely
identify each process and provide address-space protection for the process.
o On a TLB miss, the value is loaded into the TLB for faster access next time.
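A small sketch making the page-number/offset split above concrete for 32-bit addresses and 4KB pages (the page_table contents and the example address are made up):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE   4096u      /* 2^12 bytes */
#define OFFSET_BITS 12

static uint32_t page_table[1u << 20];   /* 2^20 entries, one per virtual page */

int main(void) {
    uint32_t vaddr = 0x00403ABC;            /* example virtual address */
    uint32_t p = vaddr >> OFFSET_BITS;      /* page number: top 20 bits */
    uint32_t d = vaddr & (PAGE_SIZE - 1);   /* offset: bottom 12 bits */

    page_table[p] = 7;                      /* pretend page p maps to frame 7 */
    uint32_t paddr = page_table[p] * PAGE_SIZE + d;   /* physical address */

    printf("p=0x%x d=0x%x paddr=0x%x\n", (unsigned)p, (unsigned)d, (unsigned)paddr);
    return 0;
}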

- Effective access time


o Associative lookup – extremely fast
o Hit Ratio = α
 Hit ratio is the percentage of times that a page number is found in the associative
memory (TLB). Consider α = 80% and 100ns per memory access.
 EAT = 0.80*100 + 0.20*200 = 120ns, where 200ns is the cost on a TLB miss (one
access for the page table plus one for the data), assuming negligible TLB lookup time.

Demand Paging
- Page table structure
o Address Translation (32 bit)
 If each page frame is 4KB in size, then 12 bits are required to address within a page (4KB
= 2^2 * 2^10 = 2^12, i.e. a 12-bit offset)
 If there are 32 bits in the address, then 32 – 12 = 20 of them are for the table
index. This means that the number of entries in the page table is 2^20
o Hierarchical Page Tables
 Each level in the hierarchy contains entries that map virtual page numbers to
physical frame numbers.
 There are multiple levels: Instead of a single large table, there are multiple
smaller tables. They’re typically organized in a tree like structure with 2+ levels.
 Entries in a higher-level table act as pointers to lower-level tables. This allows for
efficient translation without keeping page-table entries for the unused parts of a
sparse address space.

- Inverted Page Tables:
o One entry for each frame of memory
o Entry: Virtual address of the page stored in that real memory location, with information
about the process that owns the page.
o Mapping is inverted from regular page tables.

Feature            Page Table                            Inverted Page Table

Structure          Per-process table, mapping            Single system-wide table, mapping
                   virtual to physical                   physical to virtual
Mapping            Virtual to physical                   Physical to virtual
Size               Grows with the process's virtual      Fixed size, based on
                   address space                         physical memory
Memory Overhead    Can be high for large                 Potentially lower for many
                   address spaces                        processes

- Demand Paging
o Not all parts of the program are accessed simultaneously. Some code may not be executed
at all. Virtual memory takes advantage of this using demand paging
o Pages are loaded from disk to RAM only when needed. A ‘present bit’ in the page table
indicates if the block is in RAM or not.
 if (presentBit == 1) { page is in RAM }
else { page is not in RAM }
o If a page that is not present in RAM is accessed, the processor issues a page fault
interrupt, triggering the OS to load the page into RAM and set its present bit to 1.
o If there are no free frames for the new page to be loaded into, the OS decides to remove another
page from RAM.
o This is based on a replacement policy implemented in the OS.
o Some replacement policies are:
 FIFO
 Least recently used
 Least frequently used
o The replaced block may need to be written back to the swap (swap out)
o The dirty bit in the page table indicates whether a page has been modified.
o If the dirty bit is 1, the page needs to be written back to disk when it is replaced
o Protection bits, in the page table, determine if the page is executable, read only, and
accessible by a user process.
- On a page fault:
o trap to the operating system
o call the page fault interrupt service
o read in the page (device IO)
o update the page table
o restart the process
- Effective access time = (1-p)*ma + p*page fault time
o p = probability of page fault. ma = memory access time.
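o A worked example with illustrative numbers: if ma = 200ns, the page-fault service time is 8ms, and p = 0.001, then EAT = 0.999*200 + 0.001*8,000,000 ≈ 8,200ns – even a tiny fault rate dominates the average access time.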

Images: xv6 system calls (reference images not included in this text version).