Before we even begin, let us be on the same page with some terms and conventions to be used in the rest of the explanation.

  1. A > B > C in terms of execution priority!
  2. Added means the process was added to the queue.
  3. Exit would mean the process terminated.
  4. A and C share a resource.
  5. In our case, C acquires the resource first and A waits for C to release it!
  6. A dark circle represents a shared resource being acquired.
  7. The Hollow circle represents wanting to acquire the resource.
  8. The scheduler is called every time a Process is added.

binary semaphore

Look at what happens when A is added (@ t=7) and B is executing. In the case of a semaphore being used between A and C - there is priority inversion!

A is a higher priority than B, but B gets to execute because B is a higher priority than C, and note that C has the resource that A depends on! Hence, A cannot be run anyway!

figure #1: Semaphore being used between A and C to access a shared resource.
Priority inversion is when a lower priority process is scheduled over a higher priority process.

Note how B gets to run while A is waiting! This is bad and the consequences can be really bad if we are talking about a real-time operating system.

How do we avoid a higher priority process not getting scheduled because the resource it needs is acquired and held by a much lower priority process?

One way is to increase the priority of the process that has the resource!

mutex

If a mutex is used. Note that the scheduler is aware of the dependency of A on C and C gets scheduled! Priority inversion is minimized.

figure #2: Mutex is being used between A and C to access a shared resource.

A mutex will inherit the process priority and the scheduler can figure out if a higher priority process is blocked on a resource acquired and held by a lower priority process. Note in the figure above, as soon as A tries to acquire the resource held by C (@ t=9), the scheduler lets C run and then immediately schedules A after C has released the resource (@ t=13).

Notice the difference in the wait time compared to figure #1 where a semaphore is used.

mechanism: the key difference

There are many differences between semaphore and mutex, but the absolute major one is -

mutex has a notion of inheritance of the priority of the process which minimizes priority inversion while semaphore doesn't have that!

Note:

Be aware that the mutex being able to inherit the priority of a process is usually a configurable parameter. One needs to specifically program it to behave in ways I described.

What really happened on Mars? -- Authoritative Account
The mars rover would reset itself over and over. Turned out there was priority inversion resultant of the priority inherence parameter in the mutex left un-configured.

On the usage

A mutex is usually considered for cases where the access to a critical section/memory is to be synchronized.

A and B both need to work on the same memory region. If A has the mutex on that shared memory, B waits. If B has the mutex A waits!

A semaphore is to be used for signalling purposes only.

Process A cannot proceed until B has completed an action. A waits for a signal from B. A and B use a semaphore.

An embedded systems example for this can be - A wants to turn on an LED only when B signals that a button was pressed. A can then wait on a semaphore that B can give when B detects a button press.

References

Hands-On RTOS with Microcontrollers | Packt
Build a strong foundation in designing and implementing real-time systems with the help of practical examples
Refer to "4. Task Signaling and Communication Mechanisms"
Difference between binary semaphore and mutex
Is there any difference between a binary semaphore and mutex or are they essentially the same?