OS-module 2 Notes
Multithreading Models
Support for threads may be provided either at the user level, for user threads, or by the kernel, for
kernel threads.
User Threads :
• Thread management done by user-level threads library
• Three primary thread libraries: POSIX Pthreads, Win32 threads and Java threads
Kernel Threads:
• Supported by the Kernel
• Examples: Windows XP/2000, Solaris, Linux, Tru64 UNIX, Mac OS X
Ultimately, a relationship must exist between user threads and kernel threads. Now we look at three
common ways of establishing such a relationship.
Different types of thread models: Many-to-One, One-to-One, Many-to-Many
1) Many-to-one model:
• Maps many user-level threads to one kernel thread.
• Because only one thread can access the kernel at a time, multiple threads are unable to run in
parallel on multiprocessors.
• Thread management is done by the thread library in user space, so it is efficient; but the entire
process will block if a thread makes a blocking system call.
• Examples: Solaris Green Threads, GNU Portable Threads
2) One-to-One
• Maps each user thread to a kernel thread.
• It provides more concurrency than the many-to-one model by allowing another thread to run
when a thread makes a blocking system call
• It also allows multiple threads to run in parallel on multiprocessors.
• Examples : Windows NT/XP/2000, Linux, Solaris 9 and later
• Drawback : creating a user thread requires creating the corresponding kernel thread which can
burden the performance of an application
3) Many-to-many model
• Allows many user level threads to be mapped to many kernel threads.
• Allows the operating system to create a sufficient number of kernel threads
• The number of kernel threads may be specific to either a particular application or a particular
machine. When a thread performs a blocking system call, the kernel can schedule another thread for
execution.
• Ex: Solaris prior to version 9, Windows NT/2000 with the ThreadFiber package
Comparison of 3 models:
• The many-to-one model lets the developer create as many user threads as they wish, but true
concurrency is not gained because the kernel can schedule only one kernel thread at a time.
• The one-to-one model allows for greater concurrency, but the developer has to be careful not to
create too many threads within an application.
• The many-to-many model has neither of these shortcomings: Developers can create as many
user threads as necessary, and corresponding kernel threads can run in parallel on a
multiprocessor.
Two-level model
• Maps many user-level threads to a smaller or equal number of kernel threads.
• Similar to the many-to-many model, except that it also allows a user-level thread to be bound to a
kernel thread.
• Examples : IRIX, HP-UX, Tru64 UNIX, Solaris 8 and earlier
Thread Libraries
• Thread library provides programmer with API for creating and managing threads
• Two primary ways of implementing
• Library entirely in user space
• Kernel-level library supported by the OS
POSIX Pthreads
• Pthreads refers to the POSIX standard (IEEE 1003.1c) API for thread creation and synchronization.
The standard specifies the behavior of the thread library; the implementation is left to the library
developers, and Pthreads is common on UNIX-like operating systems.
Pthreads Example
The statement pthread_t tid declares the identifier for the thread we will create. Each thread has a set
of attributes, including stack size and scheduling information. The pthread_attr_t attr declaration
represents the attributes for the thread. We set the attributes in the function call pthread_attr_init
(&attr).
A separate thread is created with the pthread_create() function call. In addition to passing the thread
identifier and the attributes for the thread, we also pass the name of the function where the new thread
will begin execution; in this case, the runner() function. Last, we pass the integer parameter that was
provided on the command line, argv[1].
At this point, the program has two threads: the initial (or parent) thread in main() and the summation
(or child) thread performing the summation operation in the runner() function. After creating the
summation thread, the parent thread will wait for it to complete by calling the pthread_join() function.
The summation thread will complete when it calls the function pthread_exit(). Once the summation
thread has returned, the parent thread will output the value of the shared data sum.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

int sum; /* this data is shared by the thread(s) */

/* The thread will begin control in this function */
void *runner(void *param)
{
    int i, upper = atoi(param);

    sum = 0;
    for (i = 1; i <= upper; i++)
        sum += i;

    pthread_exit(0);
}
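Only the runner() function appears above. The following sketch of the main() function, which together with runner() forms a complete program, follows the pattern described in the paragraph above; the usage message is an assumption, not part of the notes.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

extern int sum;              /* shared with runner() in the listing above */
void *runner(void *param);

int main(int argc, char *argv[])
{
    pthread_t tid;           /* the thread identifier */
    pthread_attr_t attr;     /* set of thread attributes */

    if (argc != 2) {
        fprintf(stderr, "usage: a.out <integer value>\n");
        return -1;
    }

    pthread_attr_init(&attr);                     /* get the default attributes */
    pthread_create(&tid, &attr, runner, argv[1]); /* create the summation thread */
    pthread_join(tid, NULL);                      /* wait for it to exit */
    printf("sum = %d\n", sum);
    return 0;
}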
Win32 Threads:
The technique for creating threads using the Win32 thread library is similar to the Pthreads technique
in several ways. We illustrate the Win32 thread API in the C program shown in Figure below. We must
include the windows.h header file when using the Win32 API.
Program:
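The listing itself is not reproduced in these notes; the following is a minimal sketch of the same pattern. It creates one thread with CreateThread(), waits for it with WaitForSingleObject(), and then prints a shared sum. The Summation() function name and the parameter value are illustrative assumptions.

#include <windows.h>
#include <stdio.h>

static DWORD Sum;   /* data shared by the threads */

/* the thread runs in this function */
DWORD WINAPI Summation(LPVOID Param)
{
    DWORD Upper = *(DWORD *)Param;
    for (DWORD i = 1; i <= Upper; i++)
        Sum += i;
    return 0;
}

int main(void)
{
    DWORD ThreadId;
    DWORD Param = 5;
    HANDLE ThreadHandle = CreateThread(
        NULL,        /* default security attributes */
        0,           /* default stack size */
        Summation,   /* thread function */
        &Param,      /* parameter to the thread function */
        0,           /* default creation flags */
        &ThreadId);  /* returns the thread identifier */

    if (ThreadHandle != NULL) {
        WaitForSingleObject(ThreadHandle, INFINITE);  /* wait for the thread to finish */
        CloseHandle(ThreadHandle);
        printf("sum = %lu\n", Sum);
    }
    return 0;
}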
Java Threads
• Java threads are managed by the JVM.
• Typically implemented using the threads model provided by the underlying OS.
• Java threads may be created by:
o Extending the Thread class
o Implementing the Runnable interface
Java Multithreaded Program
Threading Issues
Thread Cancellation
• Thread cancellation is the task of terminating a thread (the target thread) before it has completed;
the cancellation request may come from other threads.
• Canceling a thread asynchronously may not free a necessary system-wide resource.
Signal Handling
• A signal is used in UNIX systems to notify a process that a particular event has occurred.
• A signal is generated by the occurrence of a particular event. A generated signal is delivered to
a process. Once delivered, signal must be handled.
• Synchronous signals include illegal memory access and division by 0.
• Synchronous signals are delivered to the same process that performed the operation that caused
the signal.
• Asynchronous signal is generated by an event external to a running process, for example
terminating a process with specific keystrokes (such as <control><C>) and having a timer expire.
• An asynchronous signal is sent to another process.
• Delivering signals is more complicated in multithreaded programs, where a process may have
multiple threads.
• The following delivery options exist:
◦ Deliver the signal to the thread to which the signal applies.
◦ Deliver the signal to every thread in the process.
◦ Deliver the signal to certain threads in the process.
◦ Assign a specific thread to receive all signals for the process.
• Synchronous signals need to be delivered to the thread causing the signal and not to other
threads in the process. Some asynchronous signals such as a signal that terminates a process
(<control><C>) should be sent to all threads.
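As an illustration of the last option (a sketch, not from the notes): with the POSIX API, a program can block a signal in every thread with pthread_sigmask() and designate one thread to receive it synchronously with sigwait().

#include <pthread.h>
#include <signal.h>
#include <stdio.h>

/* the designated thread waits synchronously for a pending signal */
static void *signal_thread(void *arg)
{
    sigset_t *set = arg;
    int sig;
    sigwait(set, &sig);
    printf("signal %d delivered to the designated thread\n", sig);
    return NULL;
}

int main(void)
{
    sigset_t set;
    pthread_t tid;

    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    /* block SIGINT in the main thread; threads created afterwards inherit the mask */
    pthread_sigmask(SIG_BLOCK, &set, NULL);

    pthread_create(&tid, NULL, signal_thread, &set);
    pthread_join(tid, NULL);
    return 0;
}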
Thread Pools
• Whenever a web server receives a request, it creates a separate thread to service the request.
• The first issue with this approach is the amount of time required to create the thread prior to
servicing the request.
• The second issue is that if every concurrent request is serviced in a new thread, there is no bound
on the number of threads created; unlimited threads could exhaust system resources, such as CPU time
or memory.
• One solution is to use a thread pool: create a number of threads at process startup and place them
into a pool, where they sit and wait for work.
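A minimal sketch of the idea using Pthreads (not from the notes): a fixed set of worker threads is created once at startup, and each request is placed on a small bounded queue protected by a mutex and condition variables instead of spawning a new thread. The names submit(), worker(), handle_request() and the sizes are illustrative assumptions, not a standard API.

#include <pthread.h>
#include <stdio.h>
#include <stdint.h>

#define POOL_SIZE  4
#define QUEUE_SIZE 16

typedef void (*task_fn)(void *);
struct task { task_fn fn; void *arg; };

static struct task queue[QUEUE_SIZE];
static int head = 0, tail = 0, count = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_full  = PTHREAD_COND_INITIALIZER;

/* called whenever a request arrives: no thread is created here */
static void submit(task_fn fn, void *arg)
{
    pthread_mutex_lock(&lock);
    while (count == QUEUE_SIZE)
        pthread_cond_wait(&not_full, &lock);
    queue[tail].fn = fn;
    queue[tail].arg = arg;
    tail = (tail + 1) % QUEUE_SIZE;
    count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&lock);
}

/* each pool thread loops waiting for work instead of being created per request */
static void *worker(void *unused)
{
    (void)unused;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (count == 0)
            pthread_cond_wait(&not_empty, &lock);
        struct task t = queue[head];
        head = (head + 1) % QUEUE_SIZE;
        count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&lock);
        t.fn(t.arg);                /* service the request */
    }
    return NULL;
}

static void handle_request(void *arg)
{
    printf("request %d serviced\n", (int)(intptr_t)arg);
}

int main(void)
{
    pthread_t workers[POOL_SIZE];
    for (int i = 0; i < POOL_SIZE; i++)
        pthread_create(&workers[i], NULL, worker, NULL);
    for (intptr_t r = 0; r < 8; r++)
        submit(handle_request, (void *)r);
    pthread_exit(NULL);   /* keep the worker threads alive, as a server would */
}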
Scheduler Activations
• Both the many-to-many and two-level models use an intermediate data structure between user and
kernel threads – the lightweight process (LWP).
• To the user-thread library, an LWP appears to be a virtual processor on which the process can
schedule user threads to run.
• Each LWP is attached to a kernel thread, and it is kernel threads that the operating system
schedules to run on physical processors. If a kernel thread blocks (such as while waiting for an I/O
operation to complete), the LWP blocks as well; up the chain, the user-level thread attached to the
LWP also blocks.
• The scheme for communication between the user-thread library and the kernel is known as
scheduler activation.
• The kernel provides an application with a set of virtual processors, and the application schedules
user threads onto available virtual processors.
• The operating system schedules kernel threads on physical processors.
• The kernel must inform an application about certain events; this is known as an upcall. Upcalls are
handled by an upcall handler in the thread library.
CPU Scheduler
• Whenever the CPU becomes idle, the operating system must select one of the processes in the
ready queue to be executed. The selection is carried out by the short-term scheduler.
• CPU scheduling decisions may take place when a process:
1. Switches from running state to waiting state.
2. Switches from running to ready state.
3. Switches from waiting to ready.
4. Terminates.
• Scheduling under circumstances 1 and 4 is nonpreemptive or cooperative; otherwise it is
preemptive.
• Under nonpreemptive scheduling, once the CPU has been allocated to a process, the process
keeps the CPU until it releases the CPU either by terminating or by switching to the waiting state.
• Consider the case of two processes that share data. While one is updating the data, it is
preempted. The second process then tries to read the data, which are in an inconsistent state. In such
cases we need mechanisms to coordinate access to shared data.
Dispatcher
• Dispatcher module gives control of the CPU to the process selected by the short-term
scheduler; this involves:
• switching context
• switching to user mode
• jumping to the proper location in the user program to restart that program
• Dispatch latency – time it takes for the dispatcher to stop one process and start another
running.
Scheduling Criteria
• Different CPU-scheduling algorithms have different properties, and the choice of a particular
algorithm may favor one class of processes over another.
• Following criteria have been suggested for comparing CPU-scheduling algorithms:
◦ CPU utilization – keep the CPU as busy as possible.
◦ Throughput – number of processes that complete their execution per time unit.
◦ Turnaround time – Interval from the time of submission of a process to the time of
completion.
◦ Waiting time – The sum of periods spent waiting in the ready queue.
◦ Response time – amount of time it takes from the submission of a request until the first
response is produced, not output.
Scheduling Algorithms
First-Come, First-Served (FCFS) Scheduling
• The process that requests the CPU first is allocated the CPU first.
• When a process enters the ready queue, its PCB is linked onto the tail of the queue. When the
CPU is free, it is allocated to the process at the head of the queue.
• Consider the following set of processes that arrive at time 0 in the order P1, P2, P3, with the length
of each CPU burst given in milliseconds (a worked illustration with assumed numbers appears at the
end of this list).
• FCFS scheduling result is shown in the following Gantt Chart (bar chart that illustrates a
particular schedule including start and finish time of each participating process).
• Thus average waiting time under FCFS policy is generally not optimal.
• FCFS scheduling algorithm is nonpreemptive. Once the CPU has been allocated to a process,
that process keeps the CPU until it releases the CPU, either by terminating or by requesting I/O.
◦ Thus FCFS algorithm is troublesome for time-sharing systems.
◦ Convoy effect – Many short processes wait for one big process to get off the CPU. Consider
one CPU-bound and many I/O-bound processes.
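As a worked illustration of the waiting-time calculation above (the burst lengths are assumed here, since the notes do not include the process table): with bursts P1 = 24 ms, P2 = 3 ms and P3 = 3 ms arriving in that order, FCFS runs P1 from 0 to 24, P2 from 24 to 27 and P3 from 27 to 30, so the waiting times are 0, 24 and 27 ms and the average is (0 + 24 + 27)/3 = 17 ms. If the same processes arrived in the order P2, P3, P1, the average waiting time would drop to (0 + 3 + 6)/3 = 3 ms, which is why FCFS is generally not optimal.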
Shortest-Job-First (SJF) Scheduling
• SJF associates with each process the length of its next CPU burst and allocates the CPU to the
process with the smallest next burst (FCFS breaks ties).
• SJF is optimal – it gives the minimum average waiting time for a given set of processes.
• Although SJF is optimal, it cannot be implemented at the level of the short-term scheduler: there is
no way for the short-term scheduler to know the length of the next CPU burst.
◦ One approach is to predict the length of the next CPU burst, using the lengths of previous CPU
bursts and exponential averaging.
• The prediction is computed by exponential averaging:
◦ t_n = actual length of the nth CPU burst
◦ τ_(n+1) = predicted value for the next CPU burst
◦ α, 0 ≤ α ≤ 1
◦ Define: τ_(n+1) = α·t_n + (1 − α)·τ_n
• α controls the relative weight given to the most recent burst and to past history in the prediction.
Commonly, α is set to 1/2. Expanding the formula gives
τ_(n+1) = α·t_n + (1 − α)·α·t_(n−1) + … + (1 − α)^j·α·t_(n−j) + … + (1 − α)^(n+1)·τ_0.
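A tiny illustration of the averaging formula; the burst history and the initial guess τ_0 below are assumed values, not from the notes.

#include <stdio.h>

/* tau_next = alpha * t_actual + (1 - alpha) * tau_prev */
static double predict(double alpha, double t_actual, double tau_prev)
{
    return alpha * t_actual + (1.0 - alpha) * tau_prev;
}

int main(void)
{
    double tau = 10.0;                      /* assumed initial guess tau_0 */
    double bursts[] = { 6, 4, 6, 4, 13 };   /* assumed burst history, in ms */
    for (int i = 0; i < 5; i++) {
        tau = predict(0.5, bursts[i], tau);
        printf("after burst %d: predicted next burst = %.1f ms\n", i + 1, tau);
    }
    return 0;
}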
• When next CPU burst of the newly arrived process is shorter than what is left of the currently
executing process,
◦ Preemptive SJF algorithm will preempt the currently executing process.
◦ Nonpreemptive SJF algorithm will allow the currently running process to finish its CPU burst.
• Example (preemptive SJF): average waiting time = [(10-1)+(1-1)+(17-2)+(5-3)]/4 = 26/4 = 6.5 msec.
Priority Scheduling
• A priority is associated with each process and CPU is allocated to the process with highest
priority.
• SJF is priority scheduling where priority is the inverse of predicted next CPU burst time.
• We assume that low numbers represent high priority.
• Consider the following set of processes, assumed to have arrived at time 0 in the order P1, P2, . .
• When priority of the newly arrived process is higher than the priority of the currently executing
process, Preemptive priority scheduling algorithm will preempt the currently executing process.
Nonpreemptive priority scheduling algorithm will put the new process at the head of the ready queue.
• A major problem with priority scheduling algorithm is indefinite blocking, or starvation.
• A priority scheduling algorithm can leave some low priority processes waiting indefinitely. A
steady stream of high priority processes can prevent a low-priority process from ever getting the CPU.
• A solution to the problem of indefinite blockage of low-priority processes is aging.
• Aging is a technique of gradually increasing the priority of processes that wait in the system for
long time.
Round-Robin (RR) Scheduling
• The round-robin (RR) scheduling algorithm is designed especially for time-sharing systems.
• Each process gets a small unit of CPU time (1 time quantum q), usually 10-100 milliseconds.
After this time has elapsed, the process is preempted and added to the end of the ready queue.
• Timer interrupts every quantum to schedule next process.
• The process may have a CPU burst of less than 1 time quantum. In this case process itself will
release the CPU voluntarily.
• Consider the following set of processes that arrive at time 0, with a time quantum of 4 msec (a
small simulation reproducing these numbers appears below, after this list).
• P1 waits for 6 milliseconds (10 - 4), P2 waits for 4 milliseconds, and P3 waits for 7 milliseconds.
Thus the average waiting time is 17/3 = 5.66 milliseconds.
• The performance of the RR algorithm depends heavily on the size of the time quantum q.
• We also need to consider the effect of context switching on the performance of RR scheduling.
If the time quantum is 1 time unit and if we have only one process of 10 time units, then 9 context
switches will occur slowing the execution of the process.
• Time quantum should be large with respect to context switch time. If context switch time is 10
percent of the time quantum, then about 10 percent of the CPU time will be spent on context
switching.
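A short simulation reproducing the RR arithmetic above. The burst lengths (24, 3 and 3 ms) are assumed, since the notes do not include the process table; because all processes arrive at time 0, cycling through them in index order matches the RR queue order.

#include <stdio.h>

#define N 3

int main(void)
{
    int burst[N]  = { 24, 3, 3 };   /* assumed CPU bursts, in ms */
    int remain[N] = { 24, 3, 3 };
    int finish[N] = { 0, 0, 0 };
    int quantum = 4, clock = 0, left = N;

    while (left > 0) {
        for (int i = 0; i < N; i++) {          /* all processes arrive at time 0 */
            if (remain[i] == 0) continue;
            int slice = remain[i] < quantum ? remain[i] : quantum;
            clock += slice;
            remain[i] -= slice;
            if (remain[i] == 0) { finish[i] = clock; left--; }
        }
    }

    int total_wait = 0;
    for (int i = 0; i < N; i++) {
        int wait = finish[i] - burst[i];       /* arrival time is 0 for all */
        printf("P%d waits %d ms\n", i + 1, wait);
        total_wait += wait;
    }
    printf("average waiting time = %d/%d = %.2f ms\n",
           total_wait, N, (double)total_wait / N);
    return 0;
}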
Multilevel Queue Scheduling
• A multilevel queue scheduling algorithm partitions the ready queue into several separate queues.
• The processes are permanently assigned to one queue, based on some property of the process,
such as process priority or process type.
• For example, separate queues might be used for foreground (interactive) processes and
background (batch) processes.
• Each queue has its own scheduling algorithm: the foreground queue might be scheduled by an RR
algorithm, while the background queue is scheduled by an FCFS algorithm.
• In multilevel queue scheduling processes are permanently assigned to a queue when they enter
the system.
Multilevel Feedback Queue Scheduling
• The multilevel feedback queue scheduling algorithm allows a process to move between queues.
• If a process uses too much CPU time, it will be moved to a lower priority queue.
• A process that waits too long in a lower-priority queue may be moved to a higher-priority queue;
this prevents starvation.
• A multilevel feedback queue scheduler is defined by the following parameters:
• The number of queues
• The scheduling algorithm for each queue
• The method used to determine when to upgrade a process
• The method used to determine when to demote a process
• The method used to determine which queue a process will enter when that process needs service
• Three queues:
◦ Q0 –RR, time quantum 8 milliseconds
◦ Q1 –RR, time quantum 16 milliseconds
◦ Q2 –FCFS
• Scheduling
◦ A process entering ready queue is put in queue Q0
◦ A process in Q0 is given a time quantum of 8 milliseconds.
◦ If it does not finish in 8 milliseconds, it is moved to the tail of queue Q1
◦ If queue Q0 is empty, the process at the head of Q1 is given a time quantum of 16 milliseconds.
◦ If it still does not complete, it is preempted and moved to queue Q2
◦ Q2 is serviced when Q0 and Q1 are empty.
Thread Scheduling
• On operating systems that support them, it is kernel-level threads – not processes – that are
scheduled by the operating system.
• To run on a CPU, user-level threads must be mapped to an associated kernel-level thread.
• This mapping may be indirect and may use a lightweight process (LWP).
• In the many-to-one and many-to-many models, the thread library schedules user-level threads to
run on an available LWP.
• This scheme is known as process-contention scope (PCS), since competition for the LWP takes
place among threads belonging to the same process.
• Next, the kernel thread is scheduled onto an available physical CPU using system-contention scope
(SCS) – competition for the CPU takes place among all kernel threads in the system.
Multiple-Processor Scheduling
• CPU scheduling becomes more complex when multiple CPUs are available.
• Homogeneous processors – We can use any available processor to run any process in the
queue.
• Asymmetric multiprocessing:
◦ All scheduling decisions, I/O processing and other system activities handled by a single
processor.
◦ only one processor accesses the system data structures, reducing the need for data sharing.
• Symmetric multiprocessing (SMP):
◦ Each processor is self-scheduling, all processes may be in a common ready queue, or each
processor may have its own private queue of ready processes.
◦ We must ensure that two processors do not choose the same process.
Processor affinity
• The data most recently accessed by the process populate the cache for the processor.
• If the process migrates to another processor, the contents of cache must be invalidated for the
first processor and the cache for the second processor must be repopulated.
• Because the cost of invalidating and repopulating caches is high, most SMP systems try to avoid
migrating processes from one processor to another.
• Processor affinity – a process has an affinity for the processor on which it is currently running.
• Soft affinity – the operating system attempts to keep a process on the same processor but does not
guarantee it.
• Hard affinity – the process can specify that it must not migrate to other processors, so avoiding
migration is guaranteed (e.g., via a call such as Linux's sched_setaffinity(), sketched below).
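A small Linux-specific sketch of hard affinity, shown only as an illustration (the notes do not name a particular API): sched_setaffinity() pins the calling process to a set of CPUs, here CPU 0 only.

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);                       /* allow only CPU 0 */
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {  /* pid 0 = calling process */
        perror("sched_setaffinity");
        return 1;
    }
    printf("pinned to CPU 0; the kernel will not migrate this process\n");
    return 0;
}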
Load Balancing
• Load balancing attempts to keep the workload evenly distributed across all processors in an
SMP system.
• Load balancing is necessary only on systems where each processor has its own private queue.
• With common ready queue load balancing is unnecessary, because once a processor becomes
idle, it extracts a runnable process from the queue.
• Two approaches for load balancing:
Push migration – a specific task periodically checks the load on each processor, and if an imbalance
is found pushes task from overloaded CPU to other CPUs.
Pull migration – idle processor pulls a waiting task from busy processor.
• Pulling or pushing a process from one processor to another loses the benefit of the data already in
that processor's cache, so load balancing often counteracts processor affinity.
Pthread Scheduling
Pthread API allows specifying either PCS or SCS during thread creation.
PTHREAD_SCOPE_PROCESS schedules threads using PCS scheduling.
PTHREAD_SCOPE_SYSTEM schedules threads using SCS scheduling.
The Pthread API provides two functions for getting and setting the contention scope policy:
pthread_attr_setscope(pthread_attr_t *attr, int scope)
pthread_attr_getscope(pthread_attr_t *attr, int *scope)
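A minimal sketch of using these two calls; note that some systems (Linux, for example) support only PTHREAD_SCOPE_SYSTEM, so the set call may fail there.

#include <pthread.h>
#include <stdio.h>

void *runner(void *param) { return NULL; }

int main(void)
{
    pthread_t tid;
    pthread_attr_t attr;
    int scope;

    pthread_attr_init(&attr);
    if (pthread_attr_getscope(&attr, &scope) == 0)
        printf("default scope: %s\n",
               scope == PTHREAD_SCOPE_PROCESS ? "PTHREAD_SCOPE_PROCESS"
                                              : "PTHREAD_SCOPE_SYSTEM");

    /* request SCS scheduling; this may fail where only one scope is supported */
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);

    pthread_create(&tid, &attr, runner, NULL);
    pthread_join(tid, NULL);
    return 0;
}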
********************************************************************************
Background:
Concurrent access to shared data may result in data inconsistency. Maintaining data consistency
requires mechanisms to ensure the orderly execution of cooperating processes. Suppose that we
wanted to provide a solution to the consumer-producer problem that fills all the buffers. We can do so
by having an integer counter that keeps track of the number of full buffers. Initially, counter is set to 0.
It is incremented by the producer after it produces a new buffer and is decremented by the consumer
after it consumes a buffer.
/* shared data for the bounded buffer (assumed declarations) */
#define BUFFER_SIZE 10
item buffer[BUFFER_SIZE];
int in = 0, out = 0;
int counter = 0;          /* number of full buffers */

/* producer */
while (true)
{
    /* produce an item and put it in nextProduced */
    while (counter == BUFFER_SIZE)
        ;   /* do nothing: buffer is full */
    buffer[in] = nextProduced;
    in = (in + 1) % BUFFER_SIZE;
    counter++;
}
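The matching consumer loop, sketched to mirror the producer above (nextConsumed is assumed to be declared like nextProduced):

/* consumer */
while (true)
{
    while (counter == 0)
        ;   /* do nothing: buffer is empty */
    nextConsumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    counter--;
    /* consume the item in nextConsumed */
}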
Race Condition
Race Condition: a situation in which several processes access and manipulate the same data
concurrently and the outcome of the execution depends on the particular order in which the accesses
take place. To avoid a race condition, only one process at a time may manipulate the variable counter.
Critical Section
Each process has a segment of code, called a critical section, in which it may be changing shared data.
The general structure of a typical process Pi consists of an entry section (requesting permission to enter
the critical section), the critical section itself, an exit section, and a remainder section, repeated in a
loop (see the sketch after the three requirements below). A solution to the critical-section problem must
satisfy the following three requirements:
1. Mutual Exclusion - If process Pi is executing in its critical section, then no other processes can be
executing in their critical sections
2. Progress - If no process is executing in its critical section and there exist some processes that wish
to enter their critical section, then the selection of the processes that will enter the critical section next
cannot be postponed indefinitely
3. Bounded Waiting - A bound must exist on the number of times that other processes are allowed to
enter their critical sections after a process has made a request to enter its critical section and before that
request is granted
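The general structure referred to above, written as the usual loop skeleton (a sketch, not the notes' own figure):

do {
    /* entry section: request permission to enter the critical section */

    /* critical section */

    /* exit section: release the permission */

    /* remainder section */
} while (true);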
Peterson’s Solution
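The notes do not reproduce the algorithm, so the following is a sketch of the standard two-process version for processes Pi and Pj (with j = 1 - i). It uses two shared variables: int turn (whose turn it is to enter) and boolean flag[2] (whether a process is ready to enter).

/* structure of process Pi */
do {
    flag[i] = TRUE;
    turn = j;
    while (flag[j] && turn == j)
        ;   /* busy wait */

    /* critical section */

    flag[i] = FALSE;

    /* remainder section */
} while (TRUE);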
Synchronization Hardware
• Many systems provide hardware support for the critical-section problem in the form of special
atomic instructions, such as TestAndSet and Swap, that execute as one uninterruptible unit.
TestAndSet Instruction
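The classic definition of the TestAndSet() instruction and the mutual-exclusion loop built on it, sketched from the standard algorithm: lock is a shared boolean initialized to FALSE, and the whole TestAndSet() body executes atomically in hardware.

boolean TestAndSet(boolean *target)
{
    boolean rv = *target;
    *target = TRUE;
    return rv;
}

do {
    while (TestAndSet(&lock))
        ;   /* do nothing */

    /* critical section */

    lock = FALSE;

    /* remainder section */
} while (TRUE);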
Swap Instruction
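The classic definition of the Swap() instruction used in the solution below; the hardware executes it atomically (a sketch, not the notes' own figure).

void Swap(boolean *a, boolean *b)
{
    boolean temp = *a;
    *a = *b;
    *b = temp;
}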
• Shared Boolean variable lock initialized to FALSE; Each process has a local Boolean variable
key
• Solution:
do {
    key = TRUE;
    while (key == TRUE)
        Swap(&lock, &key);

    /* critical section */

    lock = FALSE;

    /* remainder section */
} while (TRUE);
Semaphores
• Synchronization tool that does not require busy waiting
• Semaphore S – integer variable
• Two standard operations modify S: wait() and signal()
▪ Originally called P() and V()
• Less complicated than hardware-based solutions.
• Can only be accessed via two indivisible (atomic) operations, sketched below.
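The classic busy-waiting definitions of the two operations, sketched here for reference (a blocking implementation is given under Semaphore Implementation below):

wait(S) {
    while (S <= 0)
        ;   /* busy wait */
    S--;
}

signal(S) {
    S++;
}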
Semaphore Implementation
• Must guarantee that no two processes can execute wait () and signal () on the same semaphore
at the same time
• Thus, implementation becomes the critical section problem where the wait and signal code are
placed in the critical section
▪ Could now have busy waiting in critical section implementation
▪ But implementation code is short
▪ Little busy waiting if critical section rarely occupied
• Note that applications may spend lots of time in critical sections and therefore this is not a good
solution
Implementation of signal:

signal(semaphore *S)
{
    S->value++;
    if (S->value <= 0) {
        remove a process P from S->list;
        wakeup(P);
    }
}
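For completeness, a sketch of the matching semaphore structure and wait() operation in the same pseudocode style; block() and wakeup() stand for the kernel primitives that suspend and resume a process.

typedef struct {
    int value;
    struct process *list;   /* queue of processes waiting on this semaphore */
} semaphore;

wait(semaphore *S)
{
    S->value--;
    if (S->value < 0) {
        add this process to S->list;
        block();
    }
}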
Deadlock and Starvation
• Deadlock – two or more processes are waiting indefinitely for an event that can be caused by
only one of the waiting processes
• Let S and Q be two semaphores initialized to 1, and suppose P0 and P1 issue the operations in
opposite orders:
P0: wait(S); wait(Q); … signal(S); signal(Q);
P1: wait(Q); wait(S); … signal(Q); signal(S);
• If P0 executes wait(S) and then P1 executes wait(Q), each process is left waiting for a signal() that
only the other (now blocked) process can issue, so P0 and P1 are deadlocked.
• Starvation (indefinite blocking) – a process may never be removed from the semaphore queue in
which it is suspended, for example if waiting processes are resumed in LIFO order.
Bounded-Buffer Problem
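The notes do not include the listing; below is a sketch of the classic semaphore solution with n buffers, using semaphores mutex = 1, full = 0 and empty = n.

/* producer */
do {
    /* produce an item in nextProduced */
    wait(empty);
    wait(mutex);
    /* add nextProduced to the buffer */
    signal(mutex);
    signal(full);
} while (true);

/* consumer */
do {
    wait(full);
    wait(mutex);
    /* remove an item from the buffer into nextConsumed */
    signal(mutex);
    signal(empty);
    /* consume the item in nextConsumed */
} while (true);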
Readers-Writers Problem
• First variation – no reader is kept waiting unless a writer has already obtained permission to use the
shared object.
• Second variation – once a writer is ready, it performs its write as soon as possible.
• Both variations may lead to starvation, leading to even more variations.
• The problem is solved on some systems by the kernel providing reader-writer locks.
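A sketch of the classic solution to the first variation (not the notes' own listing), using semaphores rw_mutex = 1 and mutex = 1 and an integer read_count = 0:

/* writer */
do {
    wait(rw_mutex);
    /* writing is performed */
    signal(rw_mutex);
} while (true);

/* reader */
do {
    wait(mutex);
    read_count++;
    if (read_count == 1)
        wait(rw_mutex);     /* first reader locks out writers */
    signal(mutex);
    /* reading is performed */
    wait(mutex);
    read_count--;
    if (read_count == 0)
        signal(rw_mutex);   /* last reader lets writers back in */
    signal(mutex);
} while (true);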
Dining-Philosophers Problem
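The notes give only the problem name here; a sketch of the straightforward semaphore attempt follows, with semaphore chopstick[5], each element initialized to 1. (This version can deadlock if every philosopher picks up the left chopstick at the same time.)

/* philosopher i */
do {
    wait(chopstick[i]);
    wait(chopstick[(i + 1) % 5]);
    /* eat */
    signal(chopstick[i]);
    signal(chopstick[(i + 1) % 5]);
    /* think */
} while (true);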
Monitors
• A high-level abstraction that provides a convenient and effective mechanism for process
synchronization
• Abstract data type whose internal variables are accessible only by code within the monitor's
procedures.
• Only one process may be active within the monitor at a time.
• But a monitor is not powerful enough to model some synchronization schemes.

monitor monitor-name
{
    // shared variable declarations
    procedure P1 (…) { …. }
    …
    procedure Pn (…) { …… }

    initialization code (…) { … }
}
Condition Variables
• condition x, y;
• Two operations on a condition variable:
▪ x.wait () – a process that invokes the operation is suspended until x.signal ()
▪ x.signal () – resumes one of processes (if any) that invoked x.wait ()
▪ If no process is suspended in x.wait() on the variable, then x.signal() has no effect on the variable
• If process P invokes x.signal (), with Q in x.wait () state, what should happen next?
▪ If Q is resumed, then P must wait
• Options include
▪ Signal and wait – P waits until Q leaves monitor or waits for another condition
▪ Signal and continue – Q waits until P leaves the monitor or waits for another condition
▪ Both have pros and cons – language implementer can decide
▪ Monitors implemented in Concurrent Pascal take a compromise approach:
➢ P executing signal immediately leaves the monitor, and Q is resumed
▪ Implemented in other languages including Mesa, C#, Java
/* from the DiningPhilosophers monitor solution: every philosopher starts out thinking */
initialization_code()
{
    for (int i = 0; i < 5; i++)
        state[i] = THINKING;
}
}
A monitor to allocate a single resource; a process calling acquire() may be suspended on the condition
variable x, with waits ordered by the time argument:

monitor ResourceAllocator
{
    boolean busy;
    condition x;

    void acquire(int time)
    {
        if (busy)
            x.wait(time);
        busy = TRUE;
    }

    void release()
    {
        busy = FALSE;
        x.signal();
    }

    initialization_code()
    {
        busy = FALSE;
    }
}