Overview of Operating System
What is an operating system? Hard to define precisely, because
operating systems arose historically as people needed to solve
problems associated with using computers.
Much of operating system history driven by relative cost factors
of hardware and people. Hardware started out fantastically expensive
relative to people and the relative cost has been decreasing ever
since.
Relative costs drive the goals of the operating system.
- In the beginning: Expensive Hardware, Cheap People
Goal: maximize hardware utilization.
- Now: Cheap Hardware, Expensive People
Goal: make it easy for people to use computer.
In the early days of computer use, computers were
huge machines that were expensive to buy, run and maintain. Computers
were used in single-user, interactive mode. Programmers interacted with the
machine at a very low level - flicking console switches, dumping cards into
the card reader, etc. The interface was basically the raw hardware.
- Problem: Code to manipulate external I/O devices is very complex, and
is a major source of programming difficulty.
- Solution: Build a subroutine library (device
drivers) to manage the interaction with the I/O devices. The
library is loaded into the top of memory and stays there.
This is the first example of something that would grow into
an operating system.
Because the machine is so expensive, it is important to
keep it busy.
- Problem: computer idles while programmer sets things up.
Poor utilization of huge investment.
- Solution: Hire a specialized person to do setup. Faster
than programmer, but still a lot slower than the machine.
- Solution: Build a batch monitor. Store jobs on a disk (spooling), have
computer read them in one at a time and execute them.
Big change in computer usage: debugging
now done offline from print outs and memory dumps.
No more instant feedback.
- Problem: At any given time, job is actively using either
the CPU or an I/O device, and the rest of the machine is idle and
therefore unutilized.
- Solution: Allow the job to overlap computation and I/O. Buffering
and interrupt handling added to subroutine library.
- Problem: one job can't keep both CPU and I/O devices busy.
(Have compute-bound jobs that tend to use only the CPU and
I/O-bound jobs that tend to use only the I/O devices.)
Get poor utilization either of CPU or I/O devices.
- Solution: multiprogramming - several jobs share system.
Dynamically switch from one job to another when the running job
does I/O.
Big issue: protection. Don't want one job to affect the results of
another. Memory protection and relocation added to hardware, OS must
manage new hardware functionality. OS starts to become a significant
software system. OS also starts to take up significant resources on
its own.
Phase shift: Computers become much cheaper. People costs become
significant.
- Issue:
It becomes important to make computers easier to use and
to improve the productivity of the people. One big productivity
sink: having to wait for batch output (but is this really true?).
So, it is important to run interactively. But computers are still
so expensive that you can't buy one for every person. Solution:
interactive timesharing.
- Problem: Old batch schedulers were designed to run a job
for as long as it was utilizing the CPU effectively (in practice,
until it tried to do some I/O). But now, people need reasonable
response time from the computer.
- Solution: Preemptive scheduling.
- Problem: People need to have their data and programs around while
they use the computer.
- Solution: Add file systems for quick access to data.
Computer becomes a repository for data, and people don't have to
use card decks or tapes to store their data.
- Problem: The boss logs in and gets terrible response time
because the machine is overloaded.
- Solution: Prioritized scheduling. The boss gets more of the
machine than the peons. But, CPU scheduling is just an example
of resource allocation problems. The timeshared machine was full
of limited resources (CPU time, disk space, physical memory space, etc.)
and it became the responsibility of the OS to mediate the allocation
of the resources. So, developed things like disk and physical memory
quotas, etc.
Overall, time sharing was a success. However, it was a limited success.
In practical terms, every timeshared computer became overloaded and
the response time dropped to annoying or unacceptable levels.
Hard-core hackers compensated by working at night, and we developed
a generation of pasty-looking, unhealthy insomniacs addicted to caffeine.
Computers become even cheaper. It becomes
practical to give one computer to each user.
Initial cost is very important in this market.
Minimal hardware (no networking or hard disk, very slow microprocessors
and almost no memory)
shipped with minimal OS (MS-DOS). Protection, security
less of an issue. OS resource consumption becomes a big issue (computer
only has 640K of memory).
OS back to a shared subroutine library.
Hardware becomes cheaper and users more sophisticated.
People need to share data and information with other people.
Computers become information transfer, manipulation and
storage devices rather than machines that perform arithmetic
operations. Networking becomes very important, and as sharing
becomes an important part of the experience so does
security. Operating systems become more sophisticated.
Start putting back features present in the old
time sharing systems (OS/2, Windows NT, even Unix).
Rise of network. Internet is a huge popular phenomenon and drives
new ways of thinking about computing. Operating system is no
longer interface to the lower level machine - people structure
systems to contain layers of middleware. So, a Java API
or something similar may be
the primary thing people need, not a set of system calls.
In fact, what the operating system is may become irrelevant as long
as it supports the right set of middleware.
Network computer. Concept of a box that gets all of its
resources over the network. No local file system, just network
interfaces to acquire all outside data. So have a slimmer version of
OS.
In the future, computers will become physically small and portable.
Operating systems will have to deal with issues like disconnected
operation and mobility. People will also start using information
with a pseudo-real-time component like voice and video.
Operating systems will have to adjust to deliver acceptable
performance for these new forms of data.
What does a modern operating system do?
- Provides Abstractions: Hardware has low-level physical
resources with complicated, idiosyncratic interfaces. OS provides
abstractions that present clean interfaces. Goal: make computer
easier to use.
Examples: Processes, Unbounded Memory, Files, Synchronization and Communication Mechanisms.
- Provides Standard Interface: Goal: portability. Unix runs on
many very different computer systems. To a first approximation can
port programs across systems with little effort.
- Mediates Resource Usage: Goal: allow multiple users to
share resources fairly, efficiently, safely and securely. Examples:
- Multiple processes share one processor. (preemptable resource)
- Multiple programs share one physical memory (preemptable resource).
- Multiple users and files share one disk. (non-preemptable resource)
- Multiple programs share a given amount of disk and
network bandwidth (preemptable resource).
- Consumes Resources:
Solaris takes up about 8 Mbytes of physical memory (or about $400).
Abstractions often work well - for example,
timesharing, virtual memory and hierarchical and networked file systems.
But they may break down if stressed. Timesharing gives poor performance
if too many users run compute-intensive jobs. Virtual memory breaks
down if working set is too large (thrashing), or if there are too many large
processes (machine runs out of swap space).
Abstractions often fail for
performance reasons.
Abstractions also fail because they prevent programmer from
controlling machine at desired level. Example: database systems often
want to control movement of information between disk and physical
memory, and the paging system can get in the way. More recently,
existing OS schedulers fail to adequately support
multimedia and parallel processing needs, causing poor performance.
Concurrency and asynchrony make operating systems
very complicated pieces of software. Operating systems are fundamentally
non-deterministic and event driven. Can be difficult to construct
(hundreds of person-years of effort) and impossible to completely
debug. Examples of concurrency and asynchrony:
- I/O devices run concurrently with CPU, interrupting CPU when done.
- On a multiprocessor multiple user processes execute in parallel.
- Multiple workstations execute concurrently and communicate by
sending messages over a network. Protocol processing takes place
asynchronously.
Operating systems are so large that no one person understands the whole system.
An operating system outlives any of its original builders.
The major problem facing computer science today is how to build
large, reliable software systems. Operating systems are one of
very few examples of existing large software systems, and by
studying operating systems we may learn lessons applicable to the
construction of larger systems.
Process
A process is an execution stream in the context of a particular
process state.
- An execution stream is a sequence of instructions.
- Process state determines the effect of the instructions. It
usually includes (but is not restricted to):
- Registers
- Stack
- Memory (global variables and dynamically allocated memory)
- Open file tables
- Signal management information
Key concept: processes are separated: no process can directly affect the state
of another process.
Process is a key OS abstraction that users see - the environment
you interact with when you use a computer is built up out of processes.
- The shell you type stuff into is a process.
- When you execute a program you have just compiled, the OS
generates a process to run the program.
- Your WWW browser is a process.
Organizing system activities around processes has proved to be
a useful way of separating out different activities into coherent
units.
Two concepts: uniprogramming and multiprogramming.
- Uniprogramming: only one process at a time. Typical example:
DOS. Problem: users often wish to perform more than one activity
at a time (load a remote file while editing a program, for example),
and uniprogramming does not allow this. So DOS and other uniprogrammed
systems put in things like memory-resident programs that are invoked
asynchronously, but these still have separation problems. One key problem with
DOS is that there is no memory protection - one program may write
the memory of another program, causing weird bugs.
- Multiprogramming: multiple processes at a time. Typical of Unix
plus all currently envisioned new operating systems. Allows system
to separate out activities cleanly.
Multiprogramming introduces the
resource sharing problem - which processes get to use the physical
resources of the machine when? One crucial resource: CPU. Standard
solution is to use preemptive multitasking - OS runs one process
for a while, then takes the CPU away from that process and lets
another process run. Must save and restore process state. Key issue:
fairness. Must ensure that all processes get their fair share of the
CPU.
How does the OS implement the process abstraction? Uses a
context switch to switch from running one process to running
another process.
How does machine implement context switch? A processor has a
limited amount of physical resources. For example, it has only
one register set. But every process on the machine has its own
set of registers. Solution: save and restore hardware state on
a context switch. Save the state in Process Control Block (PCB).
What is in PCB? Depends on the hardware.
- Registers - almost all machines save registers in PCB.
- Processor Status Word.
- What about memory? Most machines allow memory from multiple
processes to coexist in the physical memory of the machine. Some
may require Memory Management Unit (MMU) changes on a context switch.
But, some early personal computers switched all of
process's memory out to disk (!!!).
Operating Systems are fundamentally event-driven systems - they
wait for an event to happen, respond appropriately to the event, then
wait for the next event. Examples:
- User hits a key. The keystroke is echoed on the screen.
- A user program issues a system call to read a file. The operating system
figures out which disk blocks to bring in, and generates a request to the
disk controller to read the disk blocks into memory.
- The disk controller finishes reading in the disk block and
generates an interrupt. The OS moves the read data into the user program
and restarts the user program.
- A Mosaic or Netscape user asks for a URL to be retrieved.
This eventually generates requests to the OS to send request packets
out over the network to a remote WWW server. The OS sends the packets.
- The response packets come back from the WWW server, interrupting the
processor. The OS figures out which process should get the packets,
then routes the packets to that process.
- Time-slice timer goes off. The OS must save the state of the
current process, choose another process to run, then give the CPU to
that process.
When building an event-driven system with several distinct
serial activities, threads are a key structuring mechanism of the
OS.
A thread is again an execution stream in the context of a
thread state. Key difference between processes and threads is that
multiple threads share parts of their state. Typically, allow
multiple threads to read and write same memory. (Recall that
no processes could directly access memory of another process).
But, each thread still has its own registers. Also has its own stack,
but other threads can read and write the stack memory.
What is in a thread control block? Typically just registers.
Don't need to do anything to the MMU when switching threads, because
all threads can access same memory.
Typically, an OS will have a separate thread for each distinct
activity. In particular, the OS will have a separate thread for each process,
and that thread will perform OS activities on behalf of the process.
In this case we say that each user process is backed by a kernel
thread.
- When process issues a system call to read a file, the process's
thread will take over, figure out which disk accesses to
generate, and issue the low level instructions required to start
the transfer. It then suspends until the disk finishes reading
in the data.
- When process starts up a remote TCP connection, its
thread handles the low-level details of sending out network packets.
Having a separate thread for each activity allows the programmer
to program the actions associated with that activity as a single
serial stream of actions and events. Programmer does not have to
deal with the complexity of interleaving multiple activities on the
same thread.
Why allow threads to access same memory? Because inside OS,
threads must coordinate their activities very closely.
- If two
processes issue read file system calls at close to the same time,
must make sure that the OS serializes the disk requests
appropriately.
- When one process allocates memory, its thread must find some free
memory and give it to the process. Must ensure that multiple
threads allocate disjoint pieces of memory.
Having threads share the same address space makes it much easier to
coordinate activities - can build data structures that represent
system state and have threads read and write data structures to
figure out what to do when they need to process a request.
One complication that threads must deal with: asynchrony.
Asynchronous events happen arbitrarily as the thread is executing,
and may interfere with the thread's activities unless the programmer
does something to limit the asynchrony. Examples:
- An interrupt occurs, transferring control away from one
thread to an interrupt handler.
- A time-slice switch occurs, transferring control from
one thread to another.
- Two threads running on different processors read and write
the same memory.
Asynchronous events, if not properly controlled, can lead to
incorrect behavior. Examples:
- Two threads need to issue disk requests. First thread starts to
program disk controller (assume it is memory-mapped, and must issue
multiple writes to specify a disk operation). In the meantime, the
second thread runs on a different processor and also issues the
memory-mapped writes to program the disk controller. The disk
controller gets horribly confused and reads the wrong disk block.
- Two threads need to write to the display. The first thread
starts to build its request, but before it
finishes a time-slice switch occurs and the
second thread starts its request. The combination of the two
threads issues a forbidden request sequence, and smoke starts pouring
out of the display.
- For accounting reasons the operating system keeps track of how much
time is spent in each user program. It also keeps a running sum of
the total amount of time spent in all user programs. Two threads
increment their local counters for their processes, then
concurrently increment the global counter. Their increments interfere,
and the recorded total time spent in all user processes is less than
the sum of the local times.
So, programmers need to coordinate the activities of the
multiple threads so that these bad things don't happen. Key mechanism:
synchronization operations. These operations allow threads to
control the timing of their events relative to events in other
threads. Appropriate use allows programmers to avoid problems like
the ones outlined above.
Thread
We first must postulate a thread creation and manipulation
interface. Will use the one in Nachos:
class Thread {
  public:
    Thread(char* debugName);
    ~Thread();
    void Fork(void (*func)(int), int arg);
    void Yield();
    void Finish();
};
The Thread constructor creates a new thread.
It allocates a data structure with space for the TCB.
To actually
start the thread running, must tell it what function to start
running. The Fork method gives it the function
and a parameter to the function.
What does Fork do? It first allocates a stack for the
thread. It then sets up the TCB so that when the thread starts
running, it will invoke the function and pass it the correct
parameter. It then puts the thread on a run queue someplace.
Fork then returns, and the thread that called Fork
continues.
How does OS set up TCB so that the thread starts running
at the function? First, it sets the stack pointer in the
TCB to the stack. Then, it sets the PC in the TCB to be the first
instruction in the function. Then, it sets the register in
the TCB holding the first parameter to the parameter. When the
thread system restores the state from the TCB, the function will magically
start to run.
The system maintains a queue of runnable threads. Whenever
a processor becomes idle, the thread scheduler grabs a thread
off of the run queue and runs the thread.
Conceptually, threads execute concurrently. This is the best
way to reason about the behavior of threads. But in practice, the
OS only has a finite number of processors, and it can't run
all of the runnable threads at once. So, must multiplex the
runnable threads on the finite number of processors.
Let's do a few thread examples. First example: two threads
that increment a variable.
int a = 0;

void sum(int p) {
    a++;
    printf("%d : a = %d\n", p, a);
}

void main() {
    Thread *t = new Thread("child");
    t->Fork(sum, 1);
    sum(0);
}
The two calls to
sum run concurrently. What are the possible results
of the program? To understand this fully, we must break the
sum subroutine up into its primitive components.
sum first reads the value of a into a register.
It then increments the register, then stores the contents
of the register back into a . It then reads the values of
the control string,
p and a into the registers that it uses to
pass arguments to the printf routine. It then calls
printf , which prints out the data.
The best way to understand the instruction sequence
is to look at the generated assembly language (cleaned up just a bit).
You can have the compiler generate assembly code instead of
object code by giving it the -S flag. It will put the generated
assembly in the same file name as the .c or .cc file, but with
a .s suffix.
la a, %r0
ld [%r0], %r1
add %r1, 1, %r1
st %r1, [%r0]

ld [%r0], %o3       ! parameters are passed starting with %o0
mov %o0, %o1
la .L17, %o0
call printf
So when execute concurrently, the result depends on how
the instructions interleave. What are possible results?
(Each column of two lines is one possible output of the program.)

0 : 1      0 : 1      1 : 2      1 : 1
1 : 2      1 : 1      0 : 1      0 : 1

1 : 1      0 : 2      0 : 2      1 : 2
0 : 2      1 : 2      1 : 1      0 : 2
So the results are nondeterministic - you may get different
results when you run the program more than once. So, it
can be very difficult to reproduce bugs. Nondeterministic
execution is one of the things that makes writing parallel
programs much more difficult than writing serial programs.
Chances are, the programmer is not happy with all
of the possible results listed above. Probably wanted the
value of a to be 2 after both threads finish.
To achieve this, must make the increment operation
atomic. That is, must prevent the interleaving of the
instructions in a way that would interfere with the
additions.
Concept of atomic operation. An atomic operation is one
that executes without any interference from other operations -
in other words, it executes as one unit. Typically build
complex atomic operations up out of sequences of
primitive operations. In our case the primitive operations
are the individual machine instructions.
More formally, if
several atomic operations execute, the final result is guaranteed to
be the same as if the operations executed in some serial order.
In our case above, build an increment operation up out of
loads, stores and add machine instructions. Want the increment
operation to be atomic.
Use synchronization operations to make code sequences atomic.
First synchronization abstraction: semaphores. A semaphore is,
conceptually, a counter that supports two atomic operations, P and
V. Here is the Semaphore interface from Nachos:
class Semaphore {
  public:
    Semaphore(char* debugName, int initialValue);
    ~Semaphore();
    void P();
    void V();
};
Here is what the operations do:
- Semaphore(name, count) : creates a semaphore and initializes
the counter to count.
- P() : Atomically waits until the counter is greater than 0, then
decrements the counter and returns.
- V() : Atomically increments the counter.
Here is how we can use the semaphore to make the sum
example work:
int a = 0;
Semaphore *s;

void sum(int p) {
    int t;
    s->P();
    a++;
    t = a;
    s->V();
    printf("%d : a = %d\n", p, t);
}

void main() {
    Thread *t = new Thread("child");
    s = new Semaphore("s", 1);
    t->Fork(sum, 1);
    sum(0);
}
We are using semaphores here to implement a mutual exclusion
mechanism. The idea behind mutual exclusion is that only one
thread at a time should be allowed to do something. In this
case, only one thread should access a . Use mutual
exclusion to make operations atomic.
The code that
performs the atomic operation is called a critical section.
Semaphores do much more than mutual exclusion. They can also be
used to synchronize producer/consumer programs. The idea is that
the producer is generating data and the consumer is consuming data.
A Unix pipe, for example, has a producer and a consumer. You can also think
of a person typing at a keyboard as a producer and the shell
program reading the characters as a consumer.
Here is the synchronization problem: make sure that the
consumer does not get ahead of the producer. But, we would like
the producer to be able to produce without waiting for the
consumer to consume. Can use semaphores
to do this. Here is how it works:
Semaphore *s;

void consumer(int dummy) {
    while (1) {
        s->P();
        consume the next unit of data
    }
}

void producer(int dummy) {
    while (1) {
        produce the next unit of data
        s->V();
    }
}

void main() {
    s = new Semaphore("s", 0);
    Thread *t = new Thread("consumer");
    t->Fork(consumer, 1);
    t = new Thread("producer");
    t->Fork(producer, 1);
}
In some sense the semaphore is an abstraction of the collection of
produced-but-not-yet-consumed data.
In the real world, pragmatics intrude. If we let the producer
run forever and never run the consumer, we have to store all of the
produced data somewhere. But no machine has an infinite amount of
storage. So, we want to let the producer get ahead of the consumer
if it can, but only a given amount ahead. We need to implement
a bounded buffer which can hold only N items. If the bounded
buffer is full, the producer must wait before it can put any
more data in.
Semaphore *full;
Semaphore *empty;

void consumer(int dummy) {
    while (1) {
        full->P();
        consume the next unit of data
        empty->V();
    }
}

void producer(int dummy) {
    while (1) {
        empty->P();
        produce the next unit of data
        full->V();
    }
}

void main() {
    empty = new Semaphore("empty", N);
    full = new Semaphore("full", 0);
    Thread *t = new Thread("consumer");
    t->Fork(consumer, 1);
    t = new Thread("producer");
    t->Fork(producer, 1);
}
An example of where you might use a producer and consumer
in an operating system is the console (a device that reads and
writes characters from and to
the system console). You would probably use semaphores to
make sure you don't try to read a character before it is typed.
Semaphores are one synchronization abstraction. There is
another called locks and condition variables.
Locks are an abstraction specifically for mutual exclusion
only. Here is the Nachos
lock interface:
class Lock {
  public:
    Lock(char* debugName);  // initialize lock to be FREE
    ~Lock();                // deallocate lock
    void Acquire();         // these are the only operations on a lock
    void Release();         // they are both *atomic*
};
A lock can be in one of two states: locked and unlocked.
Semantics of lock operations:
- Lock(name) : creates a lock that starts out in the unlocked state.
- Acquire() : Atomically waits until the lock state is unlocked, then
sets the lock state to locked.
- Release() : Atomically changes the lock state to unlocked from
locked.
In assignment 1 you will implement locks in Nachos on top of semaphores.
What are requirements for a locking implementation?
- Only one thread can acquire lock at a time. (safety)
- If multiple threads try to acquire an unlocked lock, one of the
threads will get it. (liveness)
- All unlocks complete in finite time. (liveness)
What are desirable properties for a locking implementation?
- Efficiency: take up as little resources as possible.
- Fairness: threads acquire lock in the order they ask
for it. There are also weaker forms of fairness.
- Simple to use.
When using locks, typically associate a lock with pieces of
data that multiple threads access. When one thread wants to
access a piece of data, it first acquires the lock. It then
performs the access, then unlocks the lock. So, the lock allows
threads to perform complicated atomic operations on each
piece of data.
Can you implement an unbounded buffer using only locks? There is
a problem - if the consumer wants to consume a piece of data
before the producer produces the data, it must wait. But locks
do not allow the consumer to wait until the producer produces
the data. So, consumer must loop until the data is ready. This
is bad because it wastes CPU resources.
There is another synchronization abstraction called
condition variables just for this kind of situation. Here is
the Nachos interface:
class Condition {
  public:
    Condition(char* debugName);
    ~Condition();
    void Wait(Lock *conditionLock);
    void Signal(Lock *conditionLock);
    void Broadcast(Lock *conditionLock);
};
Semantics of condition variable operations:
- Condition(name) : creates a condition variable.
- Wait(Lock *l) : Atomically releases the lock and waits.
When Wait returns the lock will have been reacquired.
- Signal(Lock *l) : Enables one of the waiting threads to run.
When Signal returns the lock is still acquired.
- Broadcast(Lock *l) : Enables all of the waiting threads
to run. When Broadcast returns the lock is still acquired.
The lock passed to Wait, Signal and Broadcast on a given condition
variable must always be the same lock.
In assignment 1 you will implement condition
variables in Nachos on top of semaphores.
Typically, you associate a lock and a condition variable with
a data structure. Before the program performs an operation on the
data structure, it acquires the lock. If it has to wait before
it can perform the operation, it uses the condition variable
to wait for another operation to bring the data structure into
a state where it can perform the operation. In some cases
you need more than one condition variable.
Let's say that we want to implement an unbounded buffer
using locks and condition variables. In this case we have
2 consumers.
Lock *l;
Condition *c;
int avail = 0;

void consumer(int dummy) {
    while (1) {
        l->Acquire();
        if (avail == 0) {
            c->Wait(l);
        }
        consume the next unit of data
        avail--;
        l->Release();
    }
}

void producer(int dummy) {
    while (1) {
        l->Acquire();
        produce the next unit of data
        avail++;
        c->Signal(l);
        l->Release();
    }
}

void main() {
    l = new Lock("l");
    c = new Condition("c");
    Thread *t = new Thread("consumer");
    t->Fork(consumer, 1);
    t = new Thread("consumer");
    t->Fork(consumer, 2);
    t = new Thread("producer");
    t->Fork(producer, 1);
}
There are two variants of condition variables: Hoare
condition variables and Mesa condition variables. For Hoare
condition variables, when one thread performs a
Signal , the very next thread to run is the waiting
thread. For Mesa condition variables, there are no
guarantees when the signalled thread will run. Other
threads that acquire the lock can execute between
the signaller and the waiter. The
example above will work with Hoare condition variables but
not with Mesa condition variables.
What is the problem with Mesa condition variables?
Consider the following scenario: Three threads,
thread 1 producing data, threads 2 and 3 consuming data.
- Thread 2 calls consumer, and suspends.
- Thread 1 calls producer, and signals thread 2.
- Instead of thread 2 running next, thread 3 runs next,
calls consumer, and consumes the element. (Note: with
Hoare monitors, thread 2 would always run next, so this
would not happen.)
- Thread 2 runs, and tries to consume an item that is not
there. Depending on the data structure used to store produced
items, may get some kind of illegal access error.
How can we fix this problem?
Replace the if with a while .
void consumer(int dummy) {
    while (1) {
        l->Acquire();
        while (avail == 0) {
            c->Wait(l);
        }
        consume the next unit of data
        avail--;
        l->Release();
    }
}
In general, this is a crucial point. Always put while 's around your
condition variable code. If you don't, you can get really
obscure bugs that show up very infrequently.
In this example, what is the data that the lock and
condition variable are associated with? The
avail variable.
People have developed a programming abstraction that
automatically associates locks and condition variables with
data. This abstraction is called a monitor. A monitor is a
data structure plus a set of operations (sort of like an
abstract data type). The monitor also has a lock and, optionally,
one or more condition variables. See notes for Lecture 14.
The compiler for the monitor language automatically inserts
a lock operation at the beginning of each routine and an unlock
operation at the end of the routine. So, programmer does not have
to put in the lock operations.
Monitor languages were popular in the middle 80's - they are in some
sense safer because they eliminate one possible programming error.
But more recent languages have tended not to support monitors explicitly,
and expose the locking operations to the programmer. So the programmer
has to insert the lock and unlock operations by hand.
Java takes a middle ground: it supports monitors, but also allows
programmers to exert finer-grained control over the locked sections
by supporting synchronized blocks within methods. Synchronized
blocks still present a structured model of synchronization, so it
is not possible to mismatch the lock acquire and release.
Laundromat Example: A local laundromat has switched to a
computerized machine allocation scheme. There are N machines,
numbered 1 to N. By the front door there are P allocation stations. When
you want to wash your clothes, you go to an allocation station
and put in your coins. The allocation station gives you a number,
and you use that machine. There are also P deallocation stations.
When your clothes finish, you give the number back to one
of the deallocation stations, and someone else can use the machine.
Here is the alpha release of the machine allocation software:
allocate(int dummy) {
  while (1) {
    wait for coins from user
    n = get();
    give number n to user
  }
}
deallocate(int dummy) {
  while (1) {
    wait for number n from user
    put(n);
  }
}
main() {
  for (i = 0; i < P; i++) {
    t = new Thread("allocate");
    t->Fork(allocate, 0);
    t = new Thread("deallocate");
    t->Fork(deallocate, 0);
  }
}
The key parts of the scheduling are done in the two routines
get and put, which use an array a to keep track of which
machines are in use and which are free.
int a[N];
int get() {
  for (i = 0; i < N; i++) {
    if (a[i] == 0) {
      a[i] = 1;
      return(i+1);
    }
  }
}
void put(int i) {
  a[i-1] = 0;
}
It seems that the alpha software isn't doing all that well.
Just looking at the software, you can see that there are several
synchronization problems.
The first problem is that sometimes two people are assigned
to the same machine. Why does this happen? Two allocate threads
can examine the array at the same time, both find the same machine
free, and both hand out its number.
We can fix this with a lock:
int a[N];
Lock *l;
int get() {
  l->Acquire();
  for (i = 0; i < N; i++) {
    if (a[i] == 0) {
      a[i] = 1;
      l->Release();
      return(i+1);
    }
  }
  l->Release();
}
void put(int i) {
  l->Acquire();
  a[i-1] = 0;
  l->Release();
}
So now we have fixed the multiple assignment problem. But what
happens if someone comes in to the laundry when all of the
machines are already taken? What does get return?
We must fix it so that the system waits until there is a machine
free before it returns a number. The situation calls for
condition variables.
int a[N];
Lock *l;
Condition *c;
int get() {
  l->Acquire();
  while (1) {
    for (i = 0; i < N; i++) {
      if (a[i] == 0) {
        a[i] = 1;
        l->Release();
        return(i+1);
      }
    }
    c->Wait(l);
  }
}
void put(int i) {
  l->Acquire();
  a[i-1] = 0;
  c->Signal(l);
  l->Release();
}
What data is the lock protecting? The a array.
When would you use a broadcast operation? Whenever you want to
wake up all waiting threads, not just one; for example, for an event
that happens only once. A bunch of threads may
wait until a file is deleted.
The thread that actually deleted the file could use a broadcast to wake
up all of the threads.
Also use a broadcast for allocation/deallocation of variable
sized units. Example: concurrent
malloc/free.
Lock *l;
Condition *c;
char *malloc(int s) {
  l->Acquire();
  while (cannot allocate a chunk of size s) {
    c->Wait(l);
  }
  allocate chunk of size s;
  l->Release();
  return pointer to allocated chunk;
}
void free(char *m) {
  l->Acquire();
  deallocate m;
  c->Broadcast(l);
  l->Release();
}
Here is an example with malloc/free, starting out with
10 bytes free.
Time | Process 1                | Process 2                    | Process 3
-----+--------------------------+------------------------------+-----------------------------
     | malloc(10) - succeeds    | malloc(5) - suspends on lock | malloc(5) - suspends on lock
  1  |                          | gets lock - waits            |
  2  |                          |                              | gets lock - waits
  3  | free(10) - broadcast     |                              |
  4  |                          | resume malloc(5) - succeeds  |
  5  |                          |                              | resume malloc(5) - succeeds
  6  | malloc(7) - waits        |                              |
  7  |                          |                              | malloc(3) - waits
  8  |                          | free(5) - broadcast          |
  9  | resume malloc(7) - waits |                              |
 10  |                          |                              | resume malloc(3) - succeeds
What would happen if we changed
c->Broadcast(l) to c->Signal(l)?
At step 10, process 3 would not wake up, and it would
not get the chance to allocate available memory.
What would happen if we changed the
while loop to an if?
You will be asked to implement condition variables as part
of assignment 1. The following implementation is INCORRECT.
Please do not turn this implementation in.
class Condition {
  private:
    int waiting;
    Semaphore *sema;
};
void Condition::Wait(Lock* l) {
  waiting++;
  l->Release();
  sema->P();
  l->Acquire();
}
void Condition::Signal(Lock* l) {
  if (waiting > 0) {
    sema->V();
    waiting--;
  }
}
Why is this solution incorrect? Because in some cases the signalling
thread may wake up a waiting thread that called Wait after the
signalling thread called Signal.