Before we begin, let us agree on the terms and conventions used in the rest of the explanation.
- Execution priority is A > B > C: A is the highest and C is the lowest.
- Added means the process was added to the queue.
- Exit means the process terminated.
- A and C share a resource.
- In our case, C acquires the resource first and A waits for C to release it.
- A dark circle represents a process acquiring the shared resource.
- A hollow circle represents a process wanting to acquire the resource.
- The scheduler is called every time a process is added.
Look at what happens when A is added (@ t=7) while B is executing. When a semaphore is used between A and C, there is priority inversion!
A has a higher priority than B, but A cannot run because C holds the resource that A depends on. B gets to execute because it has a higher priority than C, which means C cannot run to release the resource, so A keeps waiting.
Priority inversion is when a lower priority process ends up running ahead of a higher priority process, here because the higher priority process is blocked on a resource held by an even lower priority one.
Note how B gets to run while A is waiting! This is undesirable, and the consequences can be severe in a real-time operating system, where a high priority process may miss its deadline.
How do we keep a higher priority process from being starved because the resource it needs is held by a much lower priority process?
One way is to temporarily raise the priority of the process that holds the resource!
Now consider the case where a mutex is used. The scheduler is aware of the dependency of A on C, so C gets scheduled. Priority inversion is minimized.
With priority inheritance, the process holding the mutex temporarily inherits the priority of the highest priority process blocked on it, so the scheduler can tell when a higher priority process is blocked on a resource held by a lower priority process. Note in the figure above that as soon as A tries to acquire the resource held by C (@ t=9), the scheduler lets C run and then schedules A immediately after C has released the resource (@ t=13).
Notice the difference in the wait time compared to figure #1 where a semaphore is used.
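
The two behaviours can be compared with a small tick-based simulation. This is only a sketch under stated assumptions: the `Task` class, the `simulate` function, the `scenario` helper and all arrival times and run lengths are hypothetical and do not reproduce the exact numbers in the figures. It models one shared resource and a scheduler that always runs the highest priority process that is not blocked, with priority inheritance switched on or off.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Task:
    name: str
    priority: int        # larger number = higher priority
    arrival: int         # tick at which the task is added to the queue
    total: int           # ticks of CPU time the task needs
    lock_from: int = 0   # the task must hold the resource while executing
    lock_to: int = 0     # its own ticks lock_from .. lock_to - 1
    done: int = 0        # ticks executed so far

    def needs_resource(self):
        return self.lock_from <= self.done < self.lock_to


def simulate(tasks, inherit):
    """Return (timeline, finish_times) for one shared resource."""
    holder = None
    timeline = []
    finish = {}
    t = 0
    while len(finish) < len(tasks):
        ready = [x for x in tasks if x.arrival <= t and x.done < x.total]
        blocked = [x for x in ready
                   if x.needs_resource()
                   and holder is not None and holder is not x]
        runnable = [x for x in ready if x not in blocked]

        def effective(task):
            # With inheritance, the holder runs at the priority of the
            # highest priority task blocked on the resource it holds.
            if inherit and task is holder and blocked:
                return max(task.priority, max(b.priority for b in blocked))
            return task.priority

        if not runnable:
            timeline.append(".")      # idle tick
            t += 1
            continue
        current = max(runnable, key=effective)
        if current.needs_resource():
            holder = current          # acquire (or keep) the resource
        current.done += 1
        if holder is current and not current.needs_resource():
            holder = None             # release after the last locked tick
        timeline.append(current.name)
        t += 1
        if current.done == current.total:
            finish[current.name] = t
    return "".join(timeline), finish


def scenario():
    # Hypothetical workload: C takes the resource first, then B and A arrive.
    return [
        Task("C", priority=1, arrival=0, total=6, lock_from=1, lock_to=5),
        Task("B", priority=2, arrival=2, total=5),
        Task("A", priority=3, arrival=3, total=3, lock_from=1, lock_to=3),
    ]


for inherit in (False, True):
    timeline, finish = simulate(scenario(), inherit)
    print(f"inherit={inherit}: {timeline}  A finishes at t={finish['A']}")
```

Under these assumed numbers, the run without inheritance gives the timeline `CCBABBBBCCCAAC` and A finishes at t=13, because B runs to completion while A is blocked. With inheritance the timeline is `CCBACCCAABBBBC` and A finishes at t=9, because C is boosted above B until it releases the resource.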
Mechanism: the key difference
There are many differences between a semaphore and a mutex, but the most important one is this:
a mutex has a notion of priority inheritance, which minimizes priority inversion, while a semaphore does not!
Be aware that priority inheritance on a mutex is usually a configurable option. You need to enable it explicitly to get the behavior described here.
On the usage
A mutex is usually used where access to a critical section or shared memory must be synchronized.
For example, A and B both need to work on the same memory region. If A holds the mutex on that shared memory, B waits. If B holds the mutex, A waits!
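
A minimal sketch of this usage, assuming Python threads stand in for processes A and B and a plain integer stands in for the shared memory region. The names `counter`, `counter_lock` and `worker` are made up for illustration. Note that Python's `threading.Lock` only provides mutual exclusion; it does not implement priority inheritance.

```python
import threading

counter = 0                       # the shared memory region
counter_lock = threading.Lock()   # the mutex guarding it


def worker(iterations):
    global counter
    for _ in range(iterations):
        with counter_lock:        # whoever holds the lock works, the other waits
            counter += 1


a = threading.Thread(target=worker, args=(10_000,))
b = threading.Thread(target=worker, args=(10_000,))
a.start()
b.start()
a.join()
b.join()
print(counter)  # 20000: no increment is lost because each update is guarded
```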
A semaphore is meant for signalling.
For example, process A cannot proceed until B has completed an action, so A waits for a signal from B, and the two use a semaphore between them. An embedded systems example: A wants to turn on an LED only when B signals that a button was pressed. A waits on a semaphore that B gives when it detects a button press.
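
The LED example can be sketched as follows, again assuming Python threads in place of processes. The names `button_pressed`, `events`, `task_a` and `task_b` are hypothetical, and appending strings to a list stands in for real button and LED hardware.

```python
import threading

button_pressed = threading.Semaphore(0)   # starts at 0: nothing signalled yet
events = []


def task_a():
    button_pressed.acquire()      # A blocks here until B gives the semaphore
    events.append("LED on")


def task_b():
    events.append("button press detected")
    button_pressed.release()      # B signals A


a = threading.Thread(target=task_a)
b = threading.Thread(target=task_b)
a.start()
b.start()
a.join()
b.join()
print(events)  # ['button press detected', 'LED on']
```

Unlike the mutex case, the thread that gives the semaphore is not the one that takes it. The semaphore carries an event from B to A rather than guarding a region that both enter.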