One Stop GATE Forum: Gate Study Material OS-2

Message

             Deadlock

 You may need to write code that acquires more than one lock.
			This opens up the possibility of deadlock. Consider the following
			piece of code:
						Lock *l1, *l2;
			void p() {
			  l1->Acquire();
			  l2->Acquire();
			  code that manipulates data that l1 and l2 protect
			  l2->Release();
			  l1->Release();
			}
			void q() {
			  l2->Acquire();
			  l1->Acquire();
			  code that manipulates data that l1 and l2 protect
			  l1->Release();
			  l2->Release();
			}

			If p and q execute concurrently, consider what may
			happen. First, p acquires l1 and q acquires
			l2. Then, p waits to acquire l2 and
			q waits to acquire l1. How long will they wait? Forever.

 This case is called deadlock. What are conditions for deadlock?
			 Mutual Exclusion: Only one thread can hold lock at a time.

 Hold and Wait: At least one thread holds a lock and is waiting
			for another process to release a lock.

 No preemption: Only the process holding the lock can release it.

 Circular Wait: There is a set t₁, ..., t_n such that
			t₁ is waiting for a lock held by t₂, ..., t_n is
			waiting for a lock held by t₁.

 How can p and q avoid deadlock? Order the locks,
			and always acquire the locks in that order. Eliminates the circular
			wait condition.

 Occasionally you may need to write code that needs to acquire
			locks in different orders. Here is what to do in this situation.
			 First, most locking abstractions offer an operation that tries to
			acquire the lock but returns if it cannot. We will call this
			operation TryAcquire.
			Use this operation to
			try to acquire the lock that you need to acquire out of order.

 If the operation succeeds, fine. Once you've got the lock,
			there is no problem.

 If the operation fails, your code will need to release all
			locks that come after the lock you are trying to acquire. Make
			sure the associated data structures are in a state where you can
			release the locks without crashing the system.

 Release all of the locks that come after the lock you are
			trying to acquire, then reacquire all of the locks in the right
			order. When the code resumes, bear in mind that the data structures
			might have changed between the time when you released and reacquired
			the lock.

 Here is an example.
						int d1, d2;
			// The standard acquisition order for these two locks
			// is l1, l2.
			Lock *l1, // protects d1
			     *l2; // protects d2
			// Decrements d2, and if the result is 0, increments d1
			void increment() {
			  l2->Acquire();
			  int t = d2;
			  t--;
			  if (t == 0) {
			    if (l1->TryAcquire()) {
			      d1++;
			    } else {
			      // Any modifications to d2 go here - in this case none
			      l2->Release();
			      l1->Acquire();
			      l2->Acquire();
			      t = d2;
			      t--;
			      // some other thread may have changed d2 - must recheck it
			      if (t == 0) {
			        d1++;
			      }
			    }
			    l1->Release();
			  }
			  d2 = t;
			  l2->Release();
			}

			This example is somewhat contrived, but you will recognize the
			situation when it occurs.

 There is a generalization of the deadlock problem to situations
			in which processes need multiple resources, and the hardware may
			have multiple kinds of each resource - two printers, etc.
			Seems kind of like a batch model - processes request
			resources, then system schedules process to run when resources
			are available.

 In this model, processes issue requests to OS for resources, and
			OS decides who gets which resource when. A lot of theory built up
			to handle this situation.

 Process first requests a resource, the OS issues it and the
			process uses the resource, then the process releases the resource
			back to the OS.

 Reason about resource allocation using resource allocation
			graph. Each resource is represented with a box, each process with
			a circle, and the individual instances of the resources with
			dots in the boxes. Arrows go from processes to resource boxes
			if the process is waiting for the resource. Arrows go from
			dots in resource box to processes if the process holds that instance
			of the resource. See Fig. 7.1.

 If graph contains no cycles, is no deadlock. If has a cycle,
			may or may not have deadlock. See Fig. 7.2, 7.3.

 System can either
			 Restrict the way in which processes will request resources
			to prevent deadlock.

 Require processes to give advance information about which
			resources they will require, then use algorithms that schedule
			the processes in a way that avoids deadlock.

 Detect and eliminate deadlock when it occurs.

 First consider prevention. Look at the deadlock conditions
			listed above.
			 Mutual Exclusion -
			To eliminate mutual exclusion, allow everybody to use the resource immediately
			if they want to. Unrealistic in general - do you want your
			printer output
			interleaved with someone elses?

 Hold and Wait. To prevent hold and wait, ensure that when a
			process requests resources, does not hold
			any other resources. Either asks
			for all resources before executes, or dynamically asks for resources
			in chunks as needed,
			then releases all resources before asking for more.
			Two problems - processes may hold but not use
			resources for a long time because they will eventually hold them.
			Also, may have starvation. If a process asks for lots of resources,
			may never run because other processes always hold some subset of
			the resources.

 Circular Wait. To prevent circular wait, order resources and
			require processes to request resources in that order.

 Deadlock avoidance. Simplest algorithm - each process
			tells max number of resources it will ever need. As process
			runs, it requests resources but never exceeds max number
			of resources. System schedules processes and allocates
			resoures in a way that ensures that no

			deadlock results.

 Example: system has 12 tape drives.
			System currently running
			P0 needs max 10 has 5, P1 needs max 4 has 2, P2 needs max 9 has
			2.

 Can system prevent deadlock even if all processes request
			the max? Well, right now system has 3 free tape drives. If
			P1 runs first and completes, it will have 5 free tape drives.
			P0 can run to completion with those 5 free tape drives even if it
			requests max. Then P2 can complete. So, this schedule will
			execute without deadlock.

 If P2 requests two more tape drives, can system give it the
			drives? No, because cannot be sure it can run all jobs to completion
			with only 1 free drive. So, system must not give P2 2 more tape
			drives until P1 finishes. If P2 asks for 2 tape drives, system
			suspends P2 until P1 finishes.

 Concept: Safe Sequence. Is an ordering of processes such that
			all processes can execute to completion in that order even if all
			request maximum resources. Concept: Safe State - a state in which there
			exists a safe sequence. Deadlock avoidance algorithms always ensure
			that system stays in a safe state.

 How can you figure out if a system is in a safe state?
			Given the current and maximum allocation, find a safe sequence.
			System must maintain some information about the resources and
			how they are used. See OSC 7.5.3.
						Avail[j] = number of resource j available
			Max[i,j] = max number of resource j that process i will use
			Alloc[i,j] = number of resource j that process i currently has
			Need[i,j] = Max[i,j] - Alloc[i,j]

 Notation: A<=B if for all processes i,
			A<=B.

 Safety Algorithm: will try to find a safe sequence.
			Simulate evolution of system over time under worst case
			assumptions of resource demands.
						1: Work = Avail;
			   Finish = False for all i;
			2: Find i such that Finish = False and Need <= Work
			   If no such i exists, goto 4
			3: Work = Work + Alloc; Finish = True; goto 2
			4: If Finish = True for all i, system is in a safe state

 Now, can use safety algorithm to determine if we can
			satisfy a given resource demand. When a process demands
			additional resources, see if can give them to process and
			remain in a safe state. If not, suspend process until system
			can allocate resources and remain in a safe state. Need
			an additional data structure:
						Request[i,j] = number of j resources that process i requests

 Here is algorithm. Assume process i has
			just requested additional resources.
						1: If Request <= Need goto 2. Otherwise, process has
			   violated its maximum resource claim.
			2: If Request <= Avail goto 3. Otherwise, i must wait
			   because resources are not available.
			3: Pretend to allocate resources as follows:
			   Avail = Avail - Request
			   Alloc = Alloc + Request
			   Need = Need - Request
			   If this is a safe state, give the process the resources. Otherwise,
			   suspend the process and restore the old state.

 When to check if a suspended process should be given the
			resources and resumed? Obvious choice - when some other process
			relinquishes its resources. Obvious problem - process starves
			because other processes with lower resource requirements are
			always taking freed resources.

 See Example in Section 7.5.3.3.

 Third alternative: deadlock detection and elimination. Just
			let deadlock happen. Detect when it does, and eliminate the deadlock
			by preempting resources.

 Here is deadlock detection algorithm. Is very similar to
			safe state detection algorithm.
						1: Work = Avail;
			   Finish = False for all i;
			2: Find i such that Finish = False and Request <= Work
			   If no such i exists, goto 4
			3: Work = Work + Alloc; Finish = True; goto 2
			4: If Finish = False for some i, system is deadlocked.
			   Moreover, Finish = False implies that process i is deadlocked.

 When to run deadlock detection algorithm? Obvious time: whenever
			a process requests more resources and suspends. If deadlock detection
			takes too much time, maybe run it less frequently.

 OK, now you've found a deadlock. What do you do? Must free
			up some resources so that some processes can run. So, preempt
			resources - take them away from processes. Several different
			preemption cases:
			 Can preempt some
			resources without killing job - for example, main memory. Can
			just swap out to disk and resume job later.

 If job provides rollback points, can roll job back to
			point before acquired resources. At a later time, restart
			job from rollback point. Default rollback point - start of job.

 For some resources must just kill job. All resources
			are then free. Can either kill  processes one by one until your
			system is no longer deadlocked. Or, just go ahead and kill all
			deadlocked processes.

 In a real system, typically use different deadlock strategies
			for different situations based on resource characteristics.

 This whole topic has a sort of 60's and 70's batch mainframe feel
			to it. How come these topics never seem to arise in modern Unix
			systems?

             Synchronization

 How do we implement synchronization operations like locks?
		   Can build synchronization operations out of atomic reads and writes.
		   There is a lot of literature on how to do this, one algorithm
		   is called the bakery algorithm. But, this
		   is slow and cumbersome to use. So, most machines have hardware
		   support for synchronization - they provide synchronization
		   instructions.

 On a uniprocessor, the only thing that will make
		   multiple instruction sequences not atomic is interrupts.
		   So, if want to do a critical section, turn off interrupts before the
		   critical section and turn on interrupts after the critical
		   section. Guaranteed atomicity. It is also fairly efficient.
		   Early versions of Unix did this.

 Why not just use turning off interrupts? Two main
		   disadvantages: can't use in a multiprocessor, and can't use
		   directly from user program for synchronization.

 Test-And-Set. The test and set instruction atomically
		   checks if a memory location is zero, and if so, sets the
		   memory location to 1. If the memory location is 1, it does
		   nothing.  It returns the old value of the memory location.
		   You can use test and set to implement
		   locks as follows:
		    The lock state is implemented by a memory location.
		   The location is 0 if the lock is unlocked and 1 if the
		   lock is locked.

 The lock operation is implemented as:
		   		   while (test-and-set(l) == 1);

 The unlock operation is implemented as:
		   *l = 0;

		   The problem with this implementation is busy-waiting. What if
		   one thread already has the lock, and another thread wants to
		   acquire the lock? The acquiring thread will spin until the
		   thread that already has the lock unlocks it.

 What if the threads are running on a uniprocessor?
		   How long will the acquiring thread spin? Until it expires
		   its quantum and thread that will unlock the lock runs.
		   So on a uniprocessor,
		   if can't get the thread the first time,
		   should just suspend. So, lock acquisition looks like this:
		   		   while (test-and-set(l) == 1) {
		     currentThread->Yield();
		   }

		   Can make it even better by having a queue lock that queues up the
		   waiting threads and gives the lock to the first thread in the queue.
		   So, threads never try to acquire lock more than once.

 On a multiprocessor, it is less clear. Process that
		   will unlock the lock may be running on another processor.
		   Maybe should spin just a little while, in hopes that other process
		   will release lock.  To evaluate spinning and suspending strategies, need to
		   come up with a cost for each suspension algorithm. The cost
		   is the amount of CPU time the algorithm uses to acquire a lock.

 There are three components of the cost: spinning,
		   suspending and resuming. What is the cost
		   of spinning? Waste the CPU for the spin time. What is cost
		   of suspending and resuming? Amount of CPU time it takes to suspend the
		   thread and restart it when the thread acquires the lock.

 Each lock acquisition algorithm spins for a while, then suspends
		   if it didn't get the lock. The optimal algorithm is as follows:
		    If the lock will be free in less than the suspend and resume time,
		   spin until acquire the lock.

 If the lock will be free in more than the suspend and resume time,
		   suspend immediately.

		   Obviously, cannot implement this algorithm - it requires knowledge of the
		   future, which we do not in general have.

 How do we evaluate practical algorithms - algorithms that spin for
		   a while, then suspend. Well, we compare them with the optimal algorithm
		   in the worst case for the practical algorithm. What is the worst case
		   for any practical algorithm relative to the optimal algorithm? When
		   the lock become free just after the practical algorithm stops spinning.

 What is worst-case cost of algorithm that spins for the
		   suspend and resume time, then suspends? (Will call this the SR algorithm).
		   Two times the suspend and resume time.
		   The worst case is when the lock is unlocked just after the thread starts
		   the suspend. The optimal algorithm just spins until the lock
		   is unlocked, taking the suspend and resume time to acquire the lock.
		   The SR algorithm costs twice the suspend and resume time -it first
		   spins for the suspend and resume time, then suspends, then gets the lock,
		   then resumes.

 What about other algorithms that
		   spin for a different fixed amount of time then block? Are all worse than
		   the SR algorithm.
		    If spin for less than suspend and resume time then suspend
		   (call this the LT-SR algorithm),
		   worst case is when lock becomes free just after start the suspend.
		   In this case the the algorithm will cost spinning time plus suspend
		   and resume time. The SR algorithm will just cost the spinning time.

 If spin for greater than suspend and resume time then suspend
		   (call this the GR-SR algorithm), worst case is again when lock
		   becomes free just after start the suspend. In this case the SR algorithm
		   will also suspend and resume, but it will spin for less time than the
		   GT-SR algorithm

		   Of course, in practice locks may not exhibit
		   worst case behavior, so best algorithm depends on locking and
		   unlocking patterns actually observed.

 Here is the SR algorithm.
		   Again, can be improved with use of queueing locks.
		   		   notDone = test-and-set(l);
		   if (!notDone) return;
		   start = readClock();
		   while (notDone) {
		     stop = readClock();
		     if (stop - start >= suspendAndResumeTime) {
		       currentThread->Yield();
		       start = readClock();
		     }
		     notDone = test-and-set(l);
		   }

 There is an orthogonal issue. test-and-set instruction
		   typically consumes
		   bus resources every time. But a load instruction caches
		   the data. Subsequent loads come out of cache and never hit the
		   bus. So, can do something like this for inital algorithm:
		   		   while (1) {
		     if !test-and-set(l) break;
		     while (*l == 1);
		   }

 Are other instructions that can be used to implement spin
		   locks - swap instruction, for example.

 On modern RISC machines, test-and-set and swap may cause
		   implementation headaches. Would rather do something that fits
		   into load/store nature of architecture. So, have a non-blocking
		   abstraction: Load Linked(LL)/Store Conditional(SC).

 Semantics of LL: Load memory location into register and mark
		   it as loaded by this processor. A memory location can be marked
		   as loaded by more than one processor.

 Semantics of SC: if the memory location is marked
		   as loaded by this processor, store
		   the new value and remove all marks from the memory location.
		   Otherwise, don't perform the store. Return whether
		   or not the store succeeded.

 Here is how to use LL/SC to implement the lock operation:
		   		     while (1) {
		       LL  r1, lock
		       if (r1 == 0) {
		         LI r2, 1
		         if (SC r2, lock) break;
		       }
		     }

		   Unlock operation is the same as before.

 Can also use LL/SC to implement some operations (like
		   increment) directly. People have built up a whole bunch of theory
		   dealing with the difference in power between stuff like LL/SC
		   and test-and-set.
		   		     while (1) {
		       LL   r1, lock
		       ADDI r1, 1, r1
		       if (SC r2, lock) break;
		     }
 Note that the increment operation is non-blocking. If two threads
		   start to perform the increment at the same time, neither will block -
		   both will complete the add and only one will successfully perform
		   the SC. The other will retry. So, it eliminates problems with locking
		   like: one thread acquires locks and dies, or one thread acquires locks
		   and is suspended for a long time, preventing other threads that need
		   to acquire the lock from proceeding.

             CPU Scheduling

 What is CPU scheduling? Determining which processes run when
		  there are multiple runnable processes. Why is it important?
		  Because it can can have a big effect on resource utilization and the overall
		  performance of the system.

 By the way, the world went through a long period (late 80's, early
		  90's) in which the most popular operating systems (DOS, Mac) had
		  NO sophisticated CPU scheduling algorithms.
		  They were single threaded and ran one process at a time until the
		  user directs them to run another process. Why was this true?
		  More recent systems (Windows NT) are back to having sophisticated
		  CPU scheduling algorithms. What drove the change, and what will
		  happen in the future?

 Basic assumptions behind most scheduling algorithms:
		   There is a pool of runnable processes contending for the CPU.

 The processes are independent and compete for resources.

 The job of the scheduler is to distribute the scarce resource of
		  the CPU to the different processes ``fairly'' (according to some
		  definition of fairness) and in a way that optimizes some performance
		  criteria.

		  In general, these assumptions are starting to break down. First of
		  all, CPUs are not really that scarce - almost everybody has
		  several, and pretty soon people will be able to afford lots. Second, many
		  applications are starting to be structured as multiple cooperating
		  processes. So, a view of the scheduler as mediating between competing
		  entities may be partially obsolete.

 How do processes behave? First, CPU/IO burst cycle.
		  A process will run for a while (the CPU burst),
		  perform some IO (the IO burst), then run for
		  a while more (the next CPU burst).
		  How long between IO operations? Depends on the process.
		   IO Bound processes: processes that perform lots of IO
		  operations. Each IO operation is followed by a short CPU burst to
		  process the IO, then more IO happens.

 CPU bound processes: processes that perform lots of computation
		  and do little IO. Tend to have a few long CPU bursts.

		  One of the things a scheduler will typically do is switch the CPU to
		  another process when one process does IO. Why? The IO will take a long
		  time, and don't want to leave the CPU idle while wait for the IO to finish.

 When look at CPU burst times across the whole system, have the
		  exponential or hyperexponential distribution in Fig. 5.2.

 What are possible process states?
		   Running - process is running on CPU.

 Ready - ready to run, but not actually running on the CPU.

 Waiting - waiting for some event like IO to happen.

 When do scheduling decisions take place? When does CPU choose
		  which process to run? Are a variety of possibilities:
		   When process switches from running to waiting. Could be because
		  of IO request, because wait for child to terminate, or wait for
		  synchronization operation (like lock acquisition) to complete.

 When process switches from running to ready - on completion of
		  interrupt handler, for example. Common example of interrupt handler -
		  timer interrupt in interactive systems. If scheduler switches
		  processes in this case, it has preempted the running process.
		  Another common case interrupt handler is the IO completion handler.

 When process switches from waiting to ready state (on completion
		  of IO or acquisition of a lock, for example).

 When a process terminates.

 How to evaluate scheduling algorithm? There are many possible
		  criteria:
		   CPU Utilization: Keep CPU utilization as high as possible.
		  (What is utilization, by the way?).

 Throughput: number of processes completed per unit time.

 Turnaround Time: mean time from submission to completion of
		  process.

 Waiting Time: Amount of time spent ready to run but not running.

 Response Time: Time between submission of requests and first response
		  to the request.

 Scheduler Efficiency: The scheduler doesn't perform any useful
		  work, so any time it takes is pure overhead. So, need to make the
		  scheduler very efficient.

 Big difference: Batch and Interactive systems. In batch systems,
		  typically want good throughput or turnaround time. In interactive
		  systems, both of these are still usually important (after all, want
		  some computation to happen), but response time is usually a primary
		  consideration. And, for some systems, throughput or turnaround time
		  is not really relevant - some processes conceptually run forever.

 Difference between long and short term scheduling. Long term
		  scheduler is given a set of processes and decides which ones should
		  start to run. Once they start running, they may suspend because of IO
		  or because of preemption. Short term scheduler decides which of the
		  available jobs that long term scheduler has decided are runnable
		  to actually run.

 Let's start looking at several vanilla scheduling algorithms.

 First-Come, First-Served. One ready queue, OS runs the process
		  at head of queue, new processes come in at the end of the queue.
		  A process does not give up CPU until it either terminates or performs IO.

 Consider performance of FCFS algorithm for three compute-bound processes.
		  What if have 4 processes P1 (takes 24 seconds), P2 (takes 3 seconds)
		  and P3 (takes 3 seconds). If arrive in order P1, P2, P3, what is
		   Waiting Time? (24 + 27) / 3 = 17

 Turnaround Time? (24 + 27 + 30) = 27.

 Throughput? 30 / 3 = 10.

		  What about if processes come in order P2, P3, P1? What is
		   Waiting Time? (3 + 3) / 2 = 6

 Turnaround Time? (3 + 6 + 30) = 13.

 Throughput? 30 / 3 = 10.

 Shortest-Job-First (SJF) can eliminate some of the variance
		  in Waiting and Turnaround time. In fact, it is optimal with
		  respect to average waiting time. Big problem: how
		  does scheduler figure out how long will it take
		  the process to run?

 For long term scheduler
		  running on a batch system, user will give an estimate. Usually
		  pretty good - if it is too short, system will cancel job before it
		  finishes. If too long, system will hold off on running the process.
		  So, users give pretty good estimates of overall running time.

 For short-term scheduler, must use the past to predict the
		  future. Standard way: use a time-decayed
		  exponentially weighted average of previous CPU bursts for each process.
		  Let T_n be the measured burst time of the nth burst, s_n
		  be the predicted size of next CPU burst. Then, choose a weighting
		  factor w, where 0 <= w <= 1 and compute
		  s_n+1 = w T_n + (1 - w)s_n.
		  s₀ is defined as some default constant or system average.

 w tells how to weight the past relative to future.
		  If choose w = .5, last observation has as much weight as
		  entire rest of the history. If choose w = 1, only last
		  observation has any weight.  Do a quick example.

 Preemptive vs. Non-preemptive SJF scheduler. Preemptive scheduler
		  reruns scheduling decision when process becomes ready.
		  If the new process has priority over running process, the CPU
		  preempts the running process and executes the new process.
		  Non-preemptive scheduler only does scheduling decision when
		  running process voluntarily gives up CPU. In effect, it allows every
		  running process to finish its CPU burst.

 Consider 4 processes P1 (burst time 8), P2 (burst time 4),
		  P3 (burst time 9) P4 (burst time 5) that arrive one time unit apart
		  in order P1, P2, P3, P4. Assume that after burst happens, process
		  is not reenabled for a long time (at least 100, for example).
		  What does a preemptive SJF scheduler do?
		  What about a non-preemptive scheduler?

 Priority Scheduling. Each process is given a priority, then
		  CPU executes process with highest priority. If multiple processes
		  with same priority are runnable, use some other criteria - typically
		  FCFS. SJF is an example of a priority-based scheduling algorithm.
		  With the exponential decay algorithm above, the priorities of a
		  given process change over time.

 Assume we have 5 processes P1 (burst time 10, priority 3),
		  P2 (burst time 1, priority 1),
		  P3 (burst time 2, priority 3),
		  P4 (burst time 1, priority 4),
		  P5 (burst time 5, priority 2). Lower numbers represent higher priorities.
		  What would a standard priority scheduler do?

 Big problem with priority scheduling algorithms: starvation or
		  blocking of low-priority processes. Can use aging to prevent this -
		  make the priority of a process go up the longer it stays runnable but
		  isn't run.

 What about interactive systems? Cannot just let any process run
		  on the CPU until it gives it up - must give response to users in a
		  reasonable time. So, use an algorithm called round-robin
		  scheduling. Similar to FCFS but with preemption.
		  Have a time quantum or time slice. Let the first process
		  in the queue run until
		  it expires its quantum (i.e. runs for as long as the time quantum),
		  then run the next process in the queue.

 Implementing round-robin requires timer interrupts. When
		  schedule a process, set the timer to go off after the time quantum
		  amount of time expires. If process does IO before timer goes off, no
		  problem - just run next process. But if process expires its quantum,
		  do a context switch. Save the state of the running process and
		  run the next process.

 How well does RR work? Well, it gives good response time,
		  but can give bad waiting time. Consider the waiting times under
		  round robin for 3 processes P1 (burst time 24), P2 (burst time 3),
		  and P3 (burst time 4) with time quantum 4. What happens, and what is
		  average waiting time? What gives best waiting time?

 What happens with really a really small quantum? It looks like
		  you've got a CPU that is 1/n as powerful as the real CPU, where
		  n is the number of processes. Problem with a small quantum -
		  context switch overhead.

 What about having a really small quantum supported in hardware?
		  Then, you have something called multithreading. Give the CPU a bunch
		  of registers and heavily pipeline the execution. Feed the processes
		  into the pipe one by one. Treat memory access like IO - suspend the
		  thread until the data comes back from the memory. In the meantime,
		  execute other threads. Use computation to hide the latency of
		  accessing memory.

 What about a really big quantum? It turns into FCFS. Rule of
		  thumb - want 80 percent of CPU bursts to be shorter than time quantum.

 Multilevel Queue Scheduling - like RR, except have multiple
		  queues. Typically, classify processes into separate categories and
		  give a queue to each category.  So, might have system, interactive and
		  batch processes, with the priorities in that order. Could also
		  allocate a percentage of the CPU to each queue.

 Multilevel Feedback Queue Scheduling - Like multilevel
		  scheduling, except processes can move between queues as their
		  priority changes. Can be used to give IO bound and interactive
		  processes CPU priority over CPU bound processes. Can also
		  prevent starvation by increasing the priority of processes that
		  have been idle for a long time.

 A simple example of a multilevel feedback queue scheduling
		  algorithm. Have 3 queues, numbered 0, 1, 2 with corresponding
		  priority. So, for example, execute a task in queue 2 only when
		  queues 0 and 1 are empty.

 A process goes into queue 0 when it becomes ready.
		  When run a process from queue 0, give it a quantum of 8 ms.
		  If it expires its quantum, move to queue 1. When execute
		  a process from queue 1, give it a quantum of 16. If it expires
		  its quantum, move to queue 2. In queue 2, run a RR scheduler with
		  a large quantum if in an interactive system or an FCFS scheduler
		  if in a batch system. Of course, preempt queue 2 processes when a
		  new process becomes ready.

 Another example of a multilevel feedback queue scheduling algorithm:
		  the Unix scheduler. We will go over a simplified version that does not
		  include kernel priorities. The point of the algorithm is to fairly
		  allocate the CPU between processes, with processes that have not
		  recently used a lot of CPU resources given priority over processes
		  that have.

 Processes are given a base priority of 60, with lower numbers
		  representing higher priorities. The system clock generates an
		  interrupt between 50 and 100 times a second, so we will assume a
		  value of 60 clock interrupts per second. The clock interrupt handler
		  increments a CPU usage field in the PCB of the interrupted
		  process every time it runs.

 The system always runs the highest priority process.
		  If there is a tie, it runs the process that has been ready longest.
		  Every second, it recalculates the priority and CPU usage field
		  for every process according
		  to the following formulas.
		   CPU usage field = CPU usage field / 2

 Priority = CPU usage field / 2 + base priority

 So, when a process does not use much CPU recently, its priority
		  rises. The priorities of IO bound processes and interactive processes
		  therefore tend to be high and the priorities of CPU bound processes
		  tend to be low (which is what you want).

 Unix also allows users to provide a ``nice'' value for each
		  process. Nice values modify the priority calculation as follows:
		   Priority = CPU usage field / 2 + base priority + nice value

		  So, you can reduce the priority of your process to be ``nice'' to
		  other processes (which may include your own).

 In general, multilevel feedback queue schedulers are complex
		  pieces of software that must be tuned to meet requirements.

 Anomalies and system effects associated with schedulers.

 Priority interacts with synchronization to create a really
		  nasty effect called priority inversion. A priority inversion happens
		  when a low-priority thread acquires a lock, then a high-priority
		  thread tries to acquire the lock and blocks. Any middle-priority
		  threads will prevent the low-priority thread from running and
		  unlocking the lock. In effect, the middle-priority threads block the
		  high-priority thread.

 How to prevent priority inversions? Use priority inheritance.
		  Any time a thread holds a lock that other threads are waiting on, give
		  the thread the priority of the highest-priority thread waiting to
		  get the lock. Problem is that priority inheritance makes the
		  scheduling algorithm less efficient and increases the overhead.

 Preemption can interact with synchronization
		  in a multiprocessor context to create another
		  nasty effect - the convoy effect. One thread acquires the lock, then
		  suspends. Other threads come along, and need to acquire the lock
		  to perform their operations. Everybody suspends until the lock that
		  has the thread wakes up. At this point the threads are synchronized,
		  and will convoy their way through the lock, serializing the
		  computation. So, drives down the processor utilization.

 If have non-blocking synchronization via operations like LL/SC,
		  don't get convoy effects caused by suspending a thread competing
		  for access to a resource. Why not? Because threads don't hold resources
		  and prevent other threads from accessing them.

 Similar effect when scheduling CPU and IO bound processes.
		  Consider a FCFS algorithm with several IO bound and one CPU bound
		  process. All of the IO bound processes execute their bursts
		  quickly and queue up for access to the IO device. The CPU bound
		  process then executes for a long time. During this time all of the IO
		  bound processes have their IO requests satisfied and move back into
		  the run queue. But they don't run - the CPU bound process is running
		  instead - so the IO device idles. Finally, the CPU bound process
		  gets off the CPU, and all of the IO bound processes run for a short
		  time then queue up again for the IO devices. Result is poor
		  utilization of IO device - it is busy for a time while it processes
		  the IO requests, then idle while the IO bound processes wait in the
		  run queues for their short CPU bursts. In this case an easy solution
		  is to give IO bound processes priority over CPU bound processes.

 In general, a convoy effect happens when a set of processes
		  need to use a resource for a short time, and one process holds the
		  resource for a long time, blocking all of the other processes.
		  Causes poor utilization of the other resources in the system.


One Stop GATE Forum : GATE AT A GLANCE : Other Discussions on GATE