Add necessary checks when sending data to the stream/message buffer in order to avoid a task deadlock when attempting to write a longer stream/message than the underlying buffer can write.
* Update port.c
I discovered a very snicky and tricky race condition scenario when integrating tracealyzer code into our project.
A little background on CortexR5 : When the IRQ line (comming from the interrupt controller, to which every peripheral IRQ lines connect) of the processor rises and the IRQ Enable bit in the status register of the CPU permits it, the CPU traps into interrupt mode. Several things happen. First, the CPU finishes the instruction it was performing. Second, it places the content of the CPSR register into the SPSR_irq register. And third, the mode of the CPU is changed to IRQ_Mode and /!\ THE IRQ ENABLE BIT IN CPSR_irq IS AUTOMATICALLY CLEARED /!\. The reason is to ensure that upon landing into IRQ code, we find ourselves automatically in a critical section because we cannot be interrupted again because the bit is cleared. The programmer can, if he wants, re-enable IRQs inside IRQ code itself to allow interrupt nesting. But it has to be wanted and meditated.
Now, inside portASM.S, at the end of 'FreeRTOS_IRQ_Handler' assembly function, a call to 'vTaskSwitchContextConst' is made if the variable 'ulPortYieldRequired' was set by someone while executing the interrupt. Before branching to that function, a 'CPSID i' instruction was placed to ensure that interrupts are disabled in case someone re-enabled it. Inside 'vTaskSwitchContext', there is the macro 'traceTASK_SWITCHED_OUT' that gets populated when tracing is enabled.
The bug is right there.. If the macro is populated and inside that macro there is a matching call to 'ulPortSetInterruptMask' and 'vPortClearInterruptMask', a race condition can occure is there is a OS Tick timer interrupt waiting at the interrupt controller's door. Upon calling 'vTaskSwitchContext', the interrupts are not masked in the interrupt controller, the only barrier against the CPU servicing that tick interrupt while already performing the function is that the IRQ Enable bit cleared. 'ulPortSetInterruptMask'
does what's its supposed to do, but doesn't take into account the IRQ Enable bit in CPSR. Wheter or not the bit was cleared, the function sets it at the end. When calling the matching 'vPortClearInterruptMask', the function clears the interrupt mask in the interrupt controller. Because the IRQ Enable bit (that was cleared) has been set no matter what in 'ulPortSetInterruptMask', the CPU services the OS Tick Interrupt right away.
The bug is there : instead of completing the 'vTaskSwitchContext' function, the CPU re-enters the switch context path right after 'traceTASK_SWITCHED_OUT' thus corrupting the CPU state and eventually triggering either an undefined instruction, data or instruction abort.
* Update port.c
Error on my part, this is the right inline asm code to retreive CPSR register
* Update port.c
Forgot an * while writing comment..
ulTaskGenericNotifyValueClear() returned the notification value of the
currently running task, not the target task. Now it returns the
notification value of the target task.
Some users expected xTaskCheckForTimeOut() to clear the 'last wake time'
value each time a timeout occurred, whereas it only did that in one path.
It now clears the last wake time in all paths that return that a timeout
occurred.
The TEX, Shareable (S), Cacheable (C) and Bufferable (B) bits define
the memory type, and where necessary the cacheable and shareable
properties of the memory region.
The default values for these bits, as configured in our MPU ports, are
sometimes not suitable for application. One such example is when the MCU
has a cache, the application writer may not want to mark the memory as
shareable to avoid disabling the cache. This change allows the
application writer to override default vales for TEX, S C and B bits for
Flash and RAM in their FreeRTOSConfig.h. The following two new
configurations are introduced:
- configTEX_S_C_B_FLASH
- configTEX_S_C_B_SRAM
If undefined, the default values for the above configurations are
TEX=000, S=1, C=1, B=1. This ensures backward compatibility.
Signed-off-by: Gaurav Aggarwal <aggarg@amazon.com>
* Removed TICK_stop() macro from portable/GCC/{AVR_AVRDx, AVR_Mega0}/porthardware.h because it is not used anywhere.
* Updated indentation in portable/GCC/{AVR_AVRDx, AVR_Mega0}/* files.
* Added portable/IAR/{AVR_AVRDx, AVR_Mega0 folders.
When I added portNVIC_ICSR_REG I didn't realize there was already a
portNVIC_INT_CTRL_REG, which identifies the same register. Not good
to have both. Note that portNVIC_INT_CTRL_REG is defined in portmacro.h
and is already used in this file (port.c).
The expected behaviour of portIS_PRIVILEGED is:
- return 0 if the processor is not running privileged.
- return 1 if the processor is running privileged.
Some TI ports do not return 1 when the processor is running privileged
causing the following check to fail: if( xRunningPrivileged != pdTRUE )
This commit change the check to: if( xRunningPrivileged == pdFALSE ). It
ensures that the check is successful even on the ports which return incorrect
value from portIS_PRIVILEGED when the processor is running privileged.
See https://forums.freertos.org/t/kernel-bug-nested-mpu-wrapper-calls-generate-an-exception/10391
Signed-off-by: Gaurav Aggarwal <aggarg@amazon.com>
configSYSTICK_CLOCK_HZ should be used to configure SysTick to support
the use case when the clock for SysTick timer is scaled from the main
CPU clock.
configSYSTICK_CLOCK_HZ is defined to configCPU_CLOCK_HZ when it is not
defined in FreeRTOSConfig.h.
Signed-off-by: Gaurav Aggarwal <aggarg@amazon.com>
ARMv7-M supports 8 or 16 MPU regions. FreeRTOS Cortex-M4 MPU ports so
far assumed 8 regions. This change adds support for 16 MPU regions. The
hardware with 16 MPU regions must define configTOTAL_MPU_REGIONS to 16
in their FreeRTOSConfig.h.
If left undefined, it defaults to 8 for backward compatibility.
Signed-off-by: Gaurav Aggarwal <aggarg@amazon.com>
Update BSP APIs to latest version
Remove unused macro which could have caused warnings
(Code Style) Manually align some macros
Signed-off-by: Yuguo Zou <yuguo.zou@synopsys.com>
These definitions were not useful because the corresponding mapping was
removed from mpu_wrappers.h earlier.
Signed-off-by: Gaurav Aggarwal <aggarg@amazon.com>
The reason for the change is that the register is called System Handler
Priority Register 3 (SHPR3).
Signed-off-by: Gaurav Aggarwal <aggarg@amazon.com>
Some of the privileged symbols were not being placed in their respective
sections. This commit addresses those and places them in
privileged_functions or privileged_data section.
Signed-off-by: Gaurav Aggarwal <aggarg@amazon.com>
If xTaskCreate API is used to create a task, the task's stack is
allocated on heap using pvPortMalloc. This places the task's stack
in the privileged data section, if the heap is placed in the
privileged data section.
We use a separate MPU region to grant a task access to its stack.
If the task's stack is in the privileged data section, this results in
overlapping MPU regions as privileged data section is already protected
using a separate MPU region. ARMv8-M does not allow overlapping MPU
regions and this results in a fault. This commit ensures to not use a
separate MPU region for the task's stack if it lies within the
privileged data section.
Note that if the heap memory is placed in the privileged data section,
the xTaskCreate API cannot be used to create an unprivileged task as
the task's stack will be in the privileged data section and the task
won't have access to it. xTaskCreateRestricted and
xTaskCreateRestrictedStatic API should be used to create unprivileged
tasks.
Signed-off-by: Gaurav Aggarwal <aggarg@amazon.com>
Add a note in the code comments that SysTick requests an interrupt when
decrementing from 1 to 0, so that's why stopping SysTick on zero is a
special case. Readers might unknowingly assume that SysTick requests
an interrupt when wrapping from 0 back to the load-register value.
Reconsider new "_SETTING" suffix since "_CONFIG" suffix seems more
descriptive. The code relies on *both* of these preprocessor symbols:
portNVIC_SYSTICK_CLK_BIT
portNVIC_SYSTICK_CLK_BIT_CONFIG **new**
A meaningful suffix is really helpful to distinguish the two symbols.
* Added protection for xQueueGenericCreate
* prevent eventual invalid state change from int8 overflow
* Append period at end of comment. To be consistent with file.
* check operand, not destination
* parantheses -- to not show assumptive precendence
* Per request, less dependence on stdint by defining and checking against queueINT8_MAX instead
When Link Time Optimization (LTO) is enabled, some of the LDR
instructions result in out of range access. The reason is that the
default generated literal pool is too far and not within the permissible
range of 4K.
This commit adds LTORG assembly instructions at required places to
ensure that access to literals remain in the permissible range of 4K.
Signed-off-by: Gaurav Aggarwal <aggarg@amazon.com>
* * Renamed trace point in prvNotifyQueueSetContainer() so it isn't a duplicate of the trace point in xQueueGenericSend()
* * Fixed backwards compatibility.
Co-authored-by: Erik Tamlin <erik.tamlin@percepio.com>
Refer to https://www.freertos.org/a00133.html.
The issue with the implementation is that, if only stop kernel tick the program will keep executing current task.
The desired behavior is to at least return/jump to the next instruction after vTaskStartScheduler().
Signed-off-by: Yuhui Zheng <10982575+yuhui-zheng@users.noreply.github.com>
Description
Before this change each task had a single direct to task notification value and state as described here: https://www.FreeRTOS.org/RTOS-task-notifications.html. After this change each task has an array of task notifications, so more than one task notification value and state per task. The new FreeRTOSConfig.h compile time constant configTASK_NOTIFICATION_ARRAY_ENTRIES sets the number of indexes in the array.
Each notification within the array operates independently - a task can only block on one notification within the array at a time and will not be unblocked by a notification sent to any other array index.
Task notifications were introduced as a light weight method for peripheral drivers to pass data and events to tasks without the need for an intermediary object such as a semaphore - for example, to unblock a task from an ISR when the operation of a peripheral completed. That use case only requires a single notification value. Their popularity and resultant expanded use cases have since made the single value a limitation - especially as FreeRTOS stream and message buffers themselves use the notification mechanism. This change resolves that limitation. Stream and message buffers still use the task notification at array index 0, but now application writers can avoid any conflict that might have with their own use of task notifications by using notifications at array indexes other than 0.
The pre-existing task notification API functions work in a backward compatible way by always using the task notification at array index 0. For each such function there is now an equivalent that is postfixed "Indexed" and takes an additional parameter to specify which index within the array it should operate upon. For example, xTaskNotify() is the original that only operates on array index 0. xTaskNotifyIndexed() is the new function that can operate on any array index.
Test Steps
The update is tested using the Win32 demo (PR to be created in the FreeRTOS/FreeRTOS github repo), which has been updated to build and run a new test file FreeRTOS/Demo/Common/Minimal/TaskNotifyArray.c. The tests in that file are in addition to, not a replacement for those in FreeRTOS/Demo/Common/Minimal/TaskNotify.c.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Prior to this commit, in configurations using the alternate SysTick
clocking, vPortSuppressTicksAndSleep() might cause xTickCount to jump
ahead as much as the entire expected idle time or fall behind as much
as one full tick compared to time as measured by the SysTick.
SysTick
-------
The SysTick is the hardware timer that provides the OS tick interrupt
in the official ports for Cortex M. SysTick starts counting down from
the value stored in its reload register. When SysTick reaches zero, it
requests an interrupt. On the next SysTick clock cycle, it loads the
counter again from the reload register. The SysTick has a configuration
option to be clocked by an alternate clock besides the core clock.
This alternate clock is MCU dependent.
Scenarios Fixed
---------------
The new code in this commit handles the following scenarios that were
not handled correctly prior to this commit.
1. Before the sleep, vPortSuppressTicksAndSleep() stops the SysTick on
zero, long after SysTick reached zero. Prior to this commit, this
scenario caused xTickCount to jump ahead one full tick for the same
reason documented here: https://github.com/FreeRTOS/FreeRTOS-Kernel/pull/59/commits/0c7b04bd3a745c52151abebc882eed3f811c4c81
2. After the sleep, vPortSuppressTicksAndSleep() stops the SysTick
before it loads the counter from the reload register. Prior to this
commit, this scenario caused xTickCount to jump ahead by the entire
expected idle time (xExpectedIdleTime) because the current-count
register is zero before it loads from the reload register.
3. Prior to return, vPortSuppressTicksAndSleep() attempts to start a
short SysTick period when the current SysTick clock cycle has a lot of
time remaining. Prior to this commit, this scenario could cause
xTickCount to fall behind by as much as nearly one full tick because the
short SysTick cycle never started.
Note that #3 is partially fixed by https://github.com/FreeRTOS/FreeRTOS-Kernel/pull/59/commits/967acc9b200d3d4beeb289d9da9e88798074b431
even though that commit addresses a different issue. So this commit
completes the partial fix.
Prior to this commit, if something other than systick wakes the CPU from
tickless idle, vPortSuppressTicksAndSleep() might cause xTickCount to
increment once too many times. See "bug 2" in this forum post:
https://forums.freertos.org/t/ultasknotifytake-timeout-accuracy/9629/40
SysTick
-------
The SysTick is the hardware timer that provides the OS tick interrupt
in the official ports for Cortex M. SysTick starts counting down from
the value stored in its reload register. When SysTick reaches zero, it
requests an interrupt. On the next SysTick clock cycle, it loads the
counter again from the reload register. To get periodic interrupts
every N SysTick clock cycles, the reload register must be N - 1.
Bug Example
-----------
- CPU is sleeping in vPortSuppressTicksAndSleep()
- Something other than the SysTick wakes the CPU.
- vPortSuppressTicksAndSleep() calculates the number of SysTick counts
until the next tick. The bug occurs only if this number is small.
- vPortSuppressTicksAndSleep() puts this small number into the SysTick
reload register, and starts SysTick.
- vPortSuppressTicksAndSleep() calls vTaskStepTick()
- While vTaskStepTick() executes, the SysTick expires. The ISR pends
because interrupts are masked, and SysTick starts a 2nd period still
based on the small number of counts in its reload register. This 2nd
period is undesirable and is likely to cause the error noted below.
- vPortSuppressTicksAndSleep() puts the normal tick duration into the
SysTick's reload register.
- vPortSuppressTicksAndSleep() unmasks interrupts before the SysTick
starts a new period based on the new value in the reload register.
[This is a race condition that can go either way, but for the bug
to occur, the race must play out this way.]
- The pending SysTick ISR executes and increments xPendedTicks.
- The SysTick expires again, finishing the second very small period, and
starts a new period this time based on the full tick duration.
- The SysTick ISR increments xPendedTicks (or xTickCount) even though
only a tiny fraction of a tick period has elapsed since the previous
tick.
The bug occurs when *two* consecutive small periods of the SysTick are
both counted as ticks. The root cause is a race caused by the small
SysTick period. If vPortSuppressTicksAndSleep() unmasks interrupts
*after* the small period expires but *before* the SysTick starts a
period based on the full tick period, then two small periods are
counted as ticks when only one should be counted.
The end result is xTickCount advancing nearly one full tick more than
time actually elapsed as measured by the SysTick. This is not the kind
of time slippage normally associated with tickless idle.
After this commit the code starts the SysTick and then immediately
modifies the reload register to ensure the very short cycle (if any) is
conducted only once. This strategy requires special consideration for
the build option that configures SysTick to use a divided clock. To
avoid waiting around for the SysTick to load value from the reload
register, the new code temporarily configures the SysTick to use the
undivided clock. The resulting timing error is typical for tickless
idle. The error (commonly known as drift or slippage in kernel time)
caused by this strategy is equivalent to one or two counts in
ulStoppedTimerCompensation.
This commit also updates comments and #define symbols related to the
SysTick clock option. The SysTick can optionally be clocked by a
divided version of the CPU clock (commonly divide-by-8). The new code
in this commit adjusts these comments and symbols to make them clearer
and more useful in configurations that use the divided clock. The fix
made in this commit requires the use of these symbols, as noted in the
code comments.
...and don't stop SysTick at all in the eAbortSleep case.
Prior to this commit, if vPortSuppressTicksAndSleep() happens to stop
the SysTick on zero, then after tickless idle ends, xTickCount advances
one full tick more than the time that actually elapsed as measured by
the SysTick. See "bug 1" in this forum post:
https://forums.freertos.org/t/ultasknotifytake-timeout-accuracy/9629/40
SysTick
-------
The SysTick is the hardware timer that provides the OS tick interrupt
in the official ports for Cortex M. SysTick starts counting down from
the value stored in its reload register. When SysTick reaches zero, it
requests an interrupt. On the next SysTick clock cycle, it loads the
counter again from the reload register. To get periodic interrupts
every N SysTick clock cycles, the reload register must be N - 1.
Bug Example
-----------
- Idle task calls vPortSuppressTicksAndSleep(xExpectedIdleTime = 2).
[Doesn't have to be "2" -- could be any number.]
- vPortSuppressTicksAndSleep() stops SysTick, and the current-count
register happens to stop on zero.
- SysTick ISR executes, setting xPendedTicks = 1
- vPortSuppressTicksAndSleep() masks interrupts and calls
eTaskConfirmSleepModeStatus() which confirms the sleep operation. ***
- vPortSuppressTicksAndSleep() configures SysTick for 1 full tick
(xExpectedIdleTime - 1) plus the current-count register (which is 0)
- One tick period elapses in sleep.
- SysTick wakes CPU, ISR executes and increments xPendedTicks to 2.
- vPortSuppressTicksAndSleep() calls vTaskStepTick(1), then returns.
- Idle task resumes scheduler, which increments xTickCount twice (for
xPendedTicks = 2)
In the end, two ticks elapsed as measured by SysTick, but the code
increments xTickCount three times. The root cause is that the code
assumes the SysTick current-count register always contains the number of
SysTick counts remaining in the current tick period. However, when the
current-count register is zero, there are ulTimerCountsForOneTick
counts remaining, not zero. This error is not the kind of time slippage
normally associated with tickless idle.
*** Note that a recent commit e1b98f0
results in eAbortSleep in this case, due to xPendedTicks != 0. That
commit does mostly resolve this bug without specifically mentioning
it, and without this commit. But that resolution allows the code in
port.c not to directly address the special case of stopping SysTick on
zero in any code or comments. That commit also generates additional
instances of eAbortSleep, and a second purpose of this commit is to
optimize how vPortSuppressTicksAndSleep() behaves for eAbortSleep, as
noted below.
This commit also includes an optimization to avoid stopping the SysTick
when eTaskConfirmSleepModeStatus() returns eAbortSleep. This
optimization belongs with this fix because the method of handling the
SysTick being stopped on zero changes with this optimization.
* fix: CLEAR MIE BIT IN INITIAL RISC-V MSTATUS VALUE
The MIE bit in the RISC-V MSTATUS register is used to globally enable
or disable interrupts. It is copied into the MPIE bit and cleared
on entry to an interrupt, and then copied back from the MPIE bit on
exit from an interrupt.
When a task is created it is given an initial MSTATUS value that is
derived from the current MSTATUS value with the MPIE bit force to 1,
but the MIE bit is not forced into any state. This change forces
the MIE bit to 0 (interrupts disabled).
Why:
If a task is created before the scheduler is started the MIE bit
will happen to be 0 (interrupts disabled), which is fine. If a
task is created after the scheduler has been started the MIE bit
is set (interrupts enabled), causing interrupts to unintentionally
become enabled inside the interrupt in which the task is first
moved to the running state - effectively breaking a critical
section which in turn could cause a crash if enabling interrupts
causes interrupts to nest. It is only an issue when starting a
newly created task that was created after the scheduler was started.
Related Issues:
https://forums.freertos.org/t/risc-v-port-pxportinitialisestack-issue-about-mstatus-value-onto-the-stack/9622
Co-authored-by: Cobus van Eeden <35851496+cobusve@users.noreply.github.com>
Enabling Link Time Optimization (LTO) causes some of the functions used
in assembly to be incorrectly stripped off, resulting in linker error.
To avoid this, these functions are marked with portDONT_DISCARD macro,
definition of which is port specific. This commit adds the definition
of portDONT_DISCARD for ARMv7-M ports.
Signed-off-by: Gaurav Aggarwal
Problem Description
-------------------
The current flash organization in ARMv7-M MPU ports looks as follows:
__FLASH_segment_start__ ------->+-----------+<----- __FLASH_segment_start__
| Vector |
| Table |
| + |
| Kernel |
| Code |
+-----------+<----- __privileged_functions_end__
| |
| |
| |
| Other |
| Code |
| |
| |
| |
__FLASH_segment_end__ ------>+-----------+
The FreeRTOS kernel sets up the following MPU regions:
* Unprivileged Code - __FLASH_segment_start__ to __FLASH_segment_end__.
* Privileged Code - __FLASH_segment_start__ to __privileged_functions_end__.
The above setup assumes that the FreeRTOS kernel code
(i.e. privileged_functions) is placed at the beginning of the flash and,
therefore, uses __FLASH_segment_start__ as the starting location of the
privileged code. This prevents a user from placing the FreeRTOS kernel
code outside of flash (say to an external RAM) and still have vector
table at the beginning of flash (which is many times a hardware
requirement).
Solution
--------
This commit addresses the above limitation by using a new variable
__privileged_functions_start__ as the starting location of the
privileged code. This enables users to place the FreeRTOS kernel code
wherever they choose.
The FreeRTOS kernel now sets up the following MPU regions:
* Unprivileged Code - __FLASH_segment_start__ to __FLASH_segment_end__.
* Privileged Code - __privileged_functions_start__ to __privileged_functions_end__.
As a result, a user can now place the kernel code to an external RAM. A
possible organization is:
Flash External RAM
+------------+ +-----------+<------ __privileged_functions_start__
| Vector | | |
| Table | | |
| | | |
__FLASH_segment_start__ ----->+------------+ | Kernel |
| | | Code |
| | | |
| | | |
| | | |
| Other | | |
| Code | +-----------+<------ __privileged_functions_end__
| |
| |
| |
__FLASH_segment_end__ ----->+------------+
Note that the above configuration places the vector table in an unmapped
region. This is okay because we enable the background region, and so the
vector table will still be accessible to the privileged code and not
accessible to the unprivileged code (vector table is only needed by the
privileged code).
Backward Compatibility
----------------------
The FreeRTOS kernel code now uses a new variable, namely
__privileged_functions_start__, which needs to be exported from linker
script to indicate the starting location of the privileged code. All of
our existing demos already export this variable and therefore, they will
continue to work.
If a user has created a project which does not export this variable,
they will get a linker error for unresolved symbol
__privileged_functions_start__. They need to export a variable
__privileged_functions_start__ with the value equal to
__FLASH_segment_start__.
Issue
-----
https://sourceforge.net/p/freertos/feature-requests/56/
Signed-off-by: Gaurav Aggarwal <aggarg@amazon.com>
This is similar to the Windows port, allowing FreeRTOS kernel
applications to run as regular applications on Posix (Linux) systems.
You can use this in a 32-bit or 64-bit application (although there are
dynamic memory allocation trace points that do not support 64-bit
addresses).
Many of the same caveats of running an RTOS on a non-real-time system
apply, but this is still very useful for easy debugging/testing
applications in a simulated environment. In particular, it allows easy
use of tools such as valgrind.
You can call standard library functions from tasks but care must be
taken with any that internally take mutexes or block. This includes
malloc()/free() and many stdio functions (e.g., printf()).
Replacement malloc(), free(), realloc(), and calloc() functions are
provided which are safe. printf() needs to be called with a FreeRTOS
mutex help (or called from only a single task).
Each task is run in its own pthread, which makes debugging with
standard tools (such as GDB) easier backtraces for individual tasks
are available. Threads for non-running tasks are blocked in sigwait().
The stack for each task (thread) is allocated when the thread is
created, and the stack provided during task creation is not used. This
is so the stack has guard pages, to help with detecting stack
overflows.
Task switch is done by resuming the thread for the next task by
sending it the resume signal (SIGUSR1) and then suspending the current
thread.
The timer interrupt uses SIGALRM and care is taken to ensure that the
signal handler runs only on the thread for the current task.
The additional data needed per-thread is stored at the top on the
task's stack.
When a running task is being deleted, its thread is marked it as dying
so when we switch away from it it exits instead of suspending. This
ensures that even if the idle task doesn't run, threads are deleted
which allows for more threads to be created (if many tasks are being
created and deleted in rapid succession).
To further aid debugging, SIGINT (^C) is not blocked inside critical
sections. This allows it to be used break into GDB while in a critical
section. This means that care must be taken with any custom SIGINT
handlers as these are like NMIs.
This is somewhat inspired by an existing port by William Davy
(https://www.freertos.org/FreeRTOS-simulator-for-Linux.html) but it
takes a number of different approaches to make it switch tasks
reliableand there's little similarly with the original implementation.
- Critical sections block scheduling/"interrupts" by blocking signals
using pthread_sigmask(). This is more expensive than attempting to
use flags but works reliably and is analogous to the interrupt
enable/disable on real hardware.
- Care is take to ensure that the SIGALRM handler (for the timer tick)
is runnable only on the pthread for the running task. This makes
tasks switches more straight-forward and reliable as we can suspend
the thread while in the signal handler.
- Task switches save/restore the critical nesting on the stack.
- Only uses a single (SIGUSR1) signal which is ignored and thus GDB's
default signal handling options won't trap/print on this signal.
- Extra per-thread data is stored on the task's stack, making it
accessible in O(1) instead of performing a O(n) lookup of the array.
- Uses the task create/delete hooks in a similar way to the Windows
port, rather than overloading trace points.