Synchronization

Accessing a resource while another task is still modifying it can yield invalid or incomplete data. To prevent such race conditions, access to shared resources must be synchronized. Sirius RTOS provides a suite of synchronization mechanisms designed to simplify application development: critical section management with priority inversion avoidance, event signaling, counting semaphores, and system timers.

Waiting for object signalization

A system object exists in one of two states: signaled or non-signaled. A task may wait on one or more objects to monitor their state. When a task calls a system wait function on a non-signaled object, its execution is suspended until the object transitions to the signaled state. If a task waits for multiple objects, it will be resumed as soon as at least one of the objects becomes signaled. When multiple tasks wait for the same object, they are queued according to their priority. If a task performs a wait operation on an already signaled object, the function returns success immediately and the task continues execution.

A waiting task is released only when two conditions are met: the object is signaled and the task is eligible to run (i.e., no higher-priority tasks are currently preempting it). If a higher-priority task is running when an object becomes signaled, the waiting task will only resume once all higher-priority tasks become blocked (see scheduling). If the object returns to a non-signaled state before the task can be scheduled, the wait function will continue to block until both conditions are satisfied again.

To wait for a single object, call the osWaitForObject function. To wait for multiple objects, call the osWaitForObjects function. The latter is available only when the maximum number of objects a task can wait on, defined by the OS_MAX_WAIT_FOR_OBJECTS constant, is configured to be greater than one.

Wait functions take a timeout parameter that specifies the maximum time a task may block. If the timeout is 0 (or AR_TIME_IGNORE), the function polls the object state and returns immediately: if the object is signaled, it returns success; otherwise, it fails and sets the last error code to ERR_WAIT_TIMEOUT. A timeout of AR_TIME_INFINITE causes the task to block until the object becomes signaled. Any other value is a timeout in system ticks, and the task waits until the object is signaled or the timeout elapses. Such finite timeouts may only be used when OS_USE_WAITING_WITH_TIME_OUT is set to 1. For further details on time units, refer to the "Time and timeouts" section.
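The three timeout cases can be summarized in a small stand-alone model. This is only a sketch: the names prefixed with MODEL_ and model_ are hypothetical, the numeric values chosen for the two special constants are assumptions (the real ones come from the Sirius RTOS headers), and the "unsupported" category merely marks a request that the text says is not allowed, not what the kernel actually does in that case.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for AR_TIME_IGNORE and AR_TIME_INFINITE.
   The real values are defined by the Sirius RTOS headers. */
#define MODEL_TIME_IGNORE   0u
#define MODEL_TIME_INFINITE UINT32_MAX

typedef enum {
    MODEL_WAIT_POLL,        /* check the state once, never block */
    MODEL_WAIT_FOREVER,     /* block until the object is signaled */
    MODEL_WAIT_TIMED,       /* block until signaled or timeout elapses */
    MODEL_WAIT_UNSUPPORTED  /* finite timeout, but timed waits disabled */
} model_wait_mode_t;

/* timed_waits_enabled mirrors OS_USE_WAITING_WITH_TIME_OUT (0 or 1). */
static model_wait_mode_t model_classify_timeout(uint32_t timeout,
                                                int timed_waits_enabled)
{
    assert(timed_waits_enabled == 0 || timed_waits_enabled == 1);

    if (timeout == MODEL_TIME_IGNORE)
        return MODEL_WAIT_POLL;
    if (timeout == MODEL_TIME_INFINITE)
        return MODEL_WAIT_FOREVER;
    return timed_waits_enabled ? MODEL_WAIT_TIMED : MODEL_WAIT_UNSUPPORTED;
}
```

Only the third category depends on the build configuration; polling and infinite waits are always available.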

Sleeping

A task can suspend its execution for a specific duration by calling the osSleep function. When a task enters a sleep state, the scheduler is invoked to determine which task should run next. If the specified sleep time is 0, the task voluntarily yields its current time slice. Sleep durations other than 0 and AR_TIME_IGNORE are only supported when OS_USE_WAITING_WITH_TIME_OUT is enabled.

For more information regarding time management, please refer to the "Time and timeouts" section below.

Critical sections

Mutexes and semaphores are specialized objects used to manage access to critical sections. They provide priority inversion avoidance, deadlock detection, and abandoned critical section control. These semaphores are designed specifically for critical section management; for general synchronization, use the higher-performance counting semaphores instead. The primary difference between a mutex and a semaphore is that a mutex is owned by a single task, whereas a semaphore can be acquired multiple times. If a task waits on a mutex it already owns, the wait succeeds without blocking. In contrast, a task that waits on a non-signaled semaphore blocks, regardless of how many times that same task has already acquired it.
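The difference in re-acquisition behavior can be illustrated with a toy model. This is not the Sirius RTOS API: the cs_ types and functions are hypothetical, tasks are represented by plain integer ids, and the semaphore is assumed to be non-signaled when its remaining count reaches zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CS_NO_OWNER (-1)

typedef struct { int owner; } cs_mutex_t;  /* task id or CS_NO_OWNER */
typedef struct { int count; } cs_sem_t;    /* remaining acquisitions */

/* Returns true if `task` gets the mutex without blocking.
   The owner always succeeds when it waits on its own mutex. */
static bool cs_mutex_try(cs_mutex_t *m, int task)
{
    assert(m != NULL);
    if (m->owner == CS_NO_OWNER || m->owner == task) {
        m->owner = task;
        return true;
    }
    return false;  /* another task owns it: the caller would block */
}

/* Returns true if the semaphore is acquired without blocking.
   The caller's identity is irrelevant: only the count matters. */
static bool cs_sem_try(cs_sem_t *s)
{
    assert(s != NULL);
    if (s->count > 0) {
        s->count--;
        return true;
    }
    return false;  /* non-signaled: even a previous holder would block */
}
```

With a semaphore created for two acquisitions, a third wait from the same task fails in this model, whereas a second wait on an owned mutex succeeds.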

A critical failure occurs if a task owning a critical section is terminated or if osCloseHandle is called on an active critical section. This implies the protected resource may be in an inconsistent state. In such cases, the kernel automatically releases the object as if osReleaseMutex or osReleaseSemaphore were called. If another task subsequently acquires an abandoned mutex, it will gain ownership, but the wait operation will return a failure with the error code ERR_WAIT_ABANDON. While this may be ignored if the state is known to be safe, it is treated as an error to allow for recovery procedures.
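The following sketch models the abandonment rule from the acquiring task's point of view. All ab_ and AB_ names are hypothetical, and one detail is an assumption rather than documented behavior: the model reports the abandoned state only to the first task that acquires the mutex afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define AB_NO_OWNER (-1)

typedef enum { AB_WAIT_OK, AB_WAIT_ABANDONED, AB_WAIT_BUSY } ab_result_t;

typedef struct {
    int  owner;      /* task id or AB_NO_OWNER */
    bool abandoned;  /* set when the owner died while holding the mutex */
} ab_mutex_t;

/* Models the kernel releasing the mutex on behalf of a terminated owner. */
static void ab_owner_terminated(ab_mutex_t *m)
{
    assert(m != NULL);
    m->owner = AB_NO_OWNER;
    m->abandoned = true;
}

/* Normal release by the current owner. */
static void ab_release(ab_mutex_t *m)
{
    assert(m != NULL);
    m->owner = AB_NO_OWNER;
}

/* Non-blocking acquire. An abandoned mutex is still handed to the caller,
   but the result tells it that the protected data may be inconsistent. */
static ab_result_t ab_acquire(ab_mutex_t *m, int task)
{
    assert(m != NULL);
    if (m->owner != AB_NO_OWNER)
        return AB_WAIT_BUSY;
    m->owner = task;
    if (m->abandoned) {
        m->abandoned = false;  /* assumption: reported once */
        return AB_WAIT_ABANDONED;
    }
    return AB_WAIT_OK;
}
```

The point to take away is that the abandoned result still leaves the caller as owner, so the caller should validate or repair the protected resource and then release the mutex as usual.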

Priority inversion is another significant risk. If a low-priority task (L) owns a critical section and a medium-priority task (M) becomes ready, task L is preempted by task M. If a high-priority task (H) then becomes ready and attempts to acquire the same critical section, it is blocked by task L. Without intervention, task M would continue to run, indirectly blocking task H. When the kernel detects this, it employs a priority inheritance algorithm to temporarily raise the priority of task L to match the highest-priority task waiting for the resource. This allows task L to finish its operation and release the critical section quickly. Once released, task L's original priority is restored, and task H can proceed.
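The L/M/H scenario can be traced with a minimal model of priority inheritance. This is a sketch of the general technique, not the kernel's implementation: the pi_ names are hypothetical, a larger number is assumed to mean a higher priority, and the model ignores the case where the owner holds several critical sections at once.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int base_priority;       /* priority assigned by the application */
    int effective_priority;  /* priority the scheduler actually uses */
} pi_task_t;

typedef struct {
    pi_task_t *owner;        /* NULL when the critical section is free */
} pi_mutex_t;

/* Called when `waiter` blocks on an owned mutex: if the owner runs at a
   lower priority, it inherits the waiter's priority. */
static void pi_block_on(pi_mutex_t *m, const pi_task_t *waiter)
{
    assert(m != NULL && m->owner != NULL && waiter != NULL);
    if (waiter->effective_priority > m->owner->effective_priority)
        m->owner->effective_priority = waiter->effective_priority;
}

/* Called when the owner releases the mutex: its original priority is
   restored and the critical section becomes free. */
static void pi_release(pi_mutex_t *m)
{
    assert(m != NULL && m->owner != NULL);
    m->owner->effective_priority = m->owner->base_priority;
    m->owner = NULL;
}
```

With L at priority 1, M at 2, and H at 3, blocking H on the mutex owned by L raises L to 3, so M can no longer preempt it; after the release, L drops back to 1.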

In poorly designed systems, priority inversion can occur across a chain of multiple tasks and critical sections. The kernel needs time linear in the length of such a chain to update the priorities along it. This overhead can jeopardize real-time behavior and should be avoided through proper system design.

Deadlocks can also occur during critical section acquisition in improperly architected systems. Sirius RTOS can detect deadlocks within regions controlled by mutexes and semaphores. When a deadlock is detected, the wait operation is terminated and the error code ERR_WAIT_DEADLOCK is set.
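The text does not describe how Sirius RTOS detects deadlocks. One common approach, sketched here purely as an illustration, is to follow the chain "owner of the requested mutex, mutex that owner is blocked on, owner of that mutex" and check whether it leads back to the requesting task. All dl_ and DL_ names, the fixed table sizes, and the integer task and mutex ids are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DL_TASKS   4
#define DL_MUTEXES 4
#define DL_NONE    (-1)

typedef struct {
    int owner_of[DL_MUTEXES];  /* task holding each mutex, or DL_NONE */
    int waiting_on[DL_TASKS];  /* mutex each task is blocked on, or DL_NONE */
} dl_state_t;

/* Would blocking `task` on `mutex` close a cycle of waiting tasks? */
static bool dl_would_deadlock(const dl_state_t *s, int task, int mutex)
{
    assert(s != NULL && mutex >= 0 && mutex < DL_MUTEXES);

    int owner = s->owner_of[mutex];
    if (owner == DL_NONE || owner == task)
        return false;                     /* the wait would not block */

    for (int hops = 0; hops < DL_TASKS; hops++) {
        int next = s->waiting_on[owner];  /* what is the owner stuck on? */
        if (next == DL_NONE)
            return false;                 /* owner can run and release */
        owner = s->owner_of[next];
        if (owner == DL_NONE)
            return false;
        if (owner == task)
            return true;                  /* the chain leads back to us */
    }
    return false;  /* bound reached: no cycle through `task` */
}
```

The walk visits each task at most once, so its cost grows linearly with the length of the waiting chain.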

When multiple tasks wait for a mutex or semaphore, the highest-priority task is the first to acquire it upon release. If all waiting tasks have lower priority than the task that just released the critical section, they will be granted access only when they are ready to run. If a higher-priority task (including the one that just released the mutex) starts waiting for it again, it will re-acquire the mutex immediately due to its priority. Tasks with equal or lower priority are appended to the end of the pending queue.
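The release order can be modeled with a small pending queue. This is an illustrative sketch with hypothetical wq_ names: waiters are appended in arrival order, the highest-priority waiter is released first, and ties are assumed to go to the task that has waited longest. A larger number is assumed to mean a higher priority.

```c
#include <assert.h>
#include <stddef.h>

#define WQ_MAX_WAITERS 8

typedef struct {
    int    task_id[WQ_MAX_WAITERS];
    int    priority[WQ_MAX_WAITERS];
    size_t count;
} wq_queue_t;

/* A new waiter is appended to the end of the pending queue. */
static void wq_enqueue(wq_queue_t *q, int task_id, int priority)
{
    assert(q != NULL && q->count < WQ_MAX_WAITERS);
    q->task_id[q->count] = task_id;
    q->priority[q->count] = priority;
    q->count++;
}

/* Removes and returns the id of the highest-priority waiter. The earliest
   arrival wins a tie. Returns -1 if nobody is waiting. */
static int wq_release_next(wq_queue_t *q)
{
    assert(q != NULL);
    if (q->count == 0)
        return -1;

    size_t best = 0;
    for (size_t i = 1; i < q->count; i++)
        if (q->priority[i] > q->priority[best])
            best = i;

    int id = q->task_id[best];
    for (size_t i = best + 1; i < q->count; i++) {  /* close the gap */
        q->task_id[i - 1] = q->task_id[i];
        q->priority[i - 1] = q->priority[i];
    }
    q->count--;
    return id;
}
```

For waiters arriving as (task 10, priority 2), (11, 5), (12, 5), (13, 1), this model releases 11, 12, 10, 13 in that order.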

Mutexes and semaphores should be utilized where priority inversion protection is required. However, well-designed applications should minimize the risk of priority inversion entirely. If priority inversion is not a concern, it is more efficient to use auto-reset events (which behave similarly to mutexes) or counting semaphores instead of standard semaphores.

Auto-reset feature

Certain objects, such as events or timers, feature an auto-reset mechanism. If multiple tasks are waiting for an auto-reset object and it transitions to the signaled state, only the first task is released. Immediately following this, the object returns to a non-signaled state. Any remaining tasks must wait for the object to be signaled again to be released one by one.

As with mutexes and semaphores, when multiple tasks wait for an object that becomes signaled, the task with the highest priority is released first. If a new task with a priority higher than all currently waiting tasks calls a wait function, it does not block and immediately returns the object to the non-signaled state. If the new task has an equal or lower priority, it is added to the end of the pending queue, and the current head of the queue is the next to run.
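The one-release-per-signal behavior can be captured in a few lines. The sketch below uses hypothetical ar_ names and only counts waiters instead of tracking their priorities; it also assumes that a signal arriving while nobody waits leaves the object signaled until the next wait consumes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool signaled;
    int  waiting;  /* number of tasks currently blocked on the object */
} ar_event_t;

/* A task waits on the object. Returns true if it proceeds immediately,
   in which case the signal is consumed (auto-reset). */
static bool ar_wait(ar_event_t *ev)
{
    assert(ev != NULL);
    if (ev->signaled) {
        ev->signaled = false;
        return true;
    }
    ev->waiting++;
    return false;
}

/* The object is signaled. At most one waiter is released, after which
   the object is non-signaled again. Returns the number released. */
static int ar_signal(ar_event_t *ev)
{
    assert(ev != NULL);
    if (ev->waiting > 0) {
        ev->waiting--;
        return 1;
    }
    ev->signaled = true;  /* nobody waiting: stay signaled */
    return 0;
}
```

Two blocked tasks therefore need two separate signals in this model, and a third signal is kept for whichever task waits next.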

Timers

Timers are system objects that transition to the signaled state after a specified duration. When a timer is reset, it enters the non-signaled state and remains there until the period elapses. Timers can be configured to reset a specific number of times or to function as periodic timers that never expire. The auto-reset feature automatically resets the timer once a waiting task is released. For further details, please refer to the timers section and the "Time and timeouts" section below.

Time and timeouts

Time values passed to osWaitForObject, osWaitForObjects, and osSleep, or configured in timer objects, are specified in system ticks. The duration of a tick depends on the AR_TICKS_PER_SECOND constant. When using the port files provided by SpaceShadow, this is typically set to 1000, representing 1 millisecond intervals. A waiting task can only be released when the scheduler runs. For instance, if the scheduler executes every 10 milliseconds, calling osSleep(1) may result in the task being released after nearly 10 milliseconds. In high-performance real-time applications, the scheduler should generally be configured to run at 1 millisecond intervals or faster.
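The arithmetic behind these numbers can be written out as two helper functions. Both are hypothetical illustrations, not part of Sirius RTOS: the first converts milliseconds to ticks for a given tick rate (rounding up so a delay is never shortened), and the second models a scheduler that runs at fixed intervals starting from time 0 and releases a sleeping task at the first pass at or after its expiry.

```c
#include <assert.h>
#include <stdint.h>

/* Milliseconds to ticks, rounded up. ticks_per_second plays the role of
   AR_TICKS_PER_SECOND. */
static uint32_t tm_ms_to_ticks(uint32_t ms, uint32_t ticks_per_second)
{
    assert(ticks_per_second > 0u);
    return (uint32_t)(((uint64_t)ms * ticks_per_second + 999u) / 1000u);
}

/* Time (in ms) at which a task that starts sleeping at start_ms for
   sleep_ms is actually released, if the scheduler only runs at multiples
   of sched_period_ms. */
static uint32_t tm_wake_time_ms(uint32_t start_ms, uint32_t sleep_ms,
                                uint32_t sched_period_ms)
{
    assert(sched_period_ms > 0u);
    uint32_t expiry = start_ms + sleep_ms;
    uint32_t passes = (expiry + sched_period_ms - 1u) / sched_period_ms;
    return passes * sched_period_ms;
}
```

For example, a 1 ms sleep that starts at t = 1 ms expires at t = 2 ms, but with a 10 ms scheduler period the task is released at t = 10 ms, 9 ms after it went to sleep. With a 1 ms period the same task is released at t = 2 ms.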

SpaceShadow documentation