CrashCatcher
Catch Hard Faults on Cortex-M devices and save out a crash dump to be used by CrashDebug.
==Table of Contents
[[https://github.com/adamgreen/CrashCatcher#overview | Overview]] \\
[[https://github.com/adamgreen/CrashCatcher#how-to-use-crash-dumps | How To Use Crash Dumps]] \\
[[https://github.com/adamgreen/CrashCatcher#how-it-works | How it Works]] \\
[[https://github.com/adamgreen/CrashCatcher#crashcatcher-libraries | CrashCatcher Libraries]] \\
[[https://github.com/adamgreen/CrashCatcher#dump-format | Dump Format]] \\
[[https://github.com/adamgreen/CrashCatcher#how-to-clone | How to Clone]] \\
[[https://github.com/adamgreen/CrashCatcher#how-to-build | How to Build]] \\
==Overview
CrashCatcher is code which can be included in Cortex-M microcontroller firmware to catch Hard Faults and generate crash dumps to be used by [[https://github.com/adamgreen/CrashDebug | CrashDebug]] for post-mortem debugging. This can be useful:
- for diagnosing crashes when it isn't convenient to have a debugger attached.
- for capturing faults in the field to be sent back to developers for further investigation.
- for capturing faults during testing to be further investigated at a later point in time.
- for attaching to bug reports to provide more information about an issue being encountered.
The core code in CrashCatcher knows how to capture the register and RAM state at the time of a crash. It then calls developer-provided functions to dump the data in a convenient format. This could mean dumping the data to a serial port or generating a file on an SD card or LocalFileSystem (for mbed devices which have such built-in storage).
== How To Use Crash Dumps
The crash dumps generated by CrashCatcher are to be loaded and used with the [[https://github.com/adamgreen/CrashDebug#readme | CrashDebug]] utility. More information on how CrashCatcher dumps are to be used with CrashDebug can be found [[https://github.com/adamgreen/CrashDebug#crashcatcher-hexdump | here]].
== How it Works
In addition to the documentation below, you can also check out [[http://www.cyrilfougeray.com/2020/07/27/firmware-logs-with-stack-trace.html | Cyril Fougeray's great blog post]] where he describes using the CrashCatcher library as a part of his logging solution.
=== Catching the Hard Fault
A typical CMSIS-based build system for Cortex-M parts will have a hard fault interrupt vector which calls a function with the following prototype:
{{{
extern "C" void HardFault_Handler(void)
}}}

The default CMSIS HardFault_Handler() is a weak function that typically does nothing more than loop forever. However, the developer can link in a HardFault_Handler() function that does more. The CrashCatcher library provides two assembly language versions of this HardFault_Handler() function, one for ARMv6-M cores and another for ARMv7-M cores.

| [[https://github.com/adamgreen/CrashCatcher/blob/master/Core/src/CrashCatcher_armv6m.S]] |
| [[https://github.com/adamgreen/CrashCatcher/blob/master/Core/src/CrashCatcher_armv7m.S]] |

These assembly language routines switch to a dedicated g_crashCatcherStack stack (in case the crash was caused by stack corruption) and then push registers to this stack before calling CrashCatcher_Entry(). Cortex-M processors automatically push r0-r3, r12, lr, pc, and xpsr to the stack before running the HardFault_Handler() code. The assembly language routines therefore only need to push the rest of the registers.
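
The automatically stacked registers can be pictured as a small C structure. The following is a minimal sketch for illustration only, not code from CrashCatcher: the names AutoStackedFrame, readAutoStackedFrame() and preFaultSp() are hypothetical, and the sketch assumes the basic 8-word frame (it ignores the extended frame that cores with a floating point unit can push). It also shows how bit 9 of the stacked xpsr, which the core sets when it inserted a padding word to keep the stack 8-byte aligned, can be used to recover the stack pointer value from just before the fault.

```c
#include <stdint.h>

/* The eight words a Cortex-M core stacks on exception entry, lowest address
   first. The type and field names are illustrative, not CrashCatcher's own. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;
    uint32_t pc;
    uint32_t xpsr;
} AutoStackedFrame;

/* Copy the frame out of raw stack memory, one word at a time. */
AutoStackedFrame readAutoStackedFrame(const uint32_t* pStack)
{
    AutoStackedFrame frame;
    frame.r0   = pStack[0];
    frame.r1   = pStack[1];
    frame.r2   = pStack[2];
    frame.r3   = pStack[3];
    frame.r12  = pStack[4];
    frame.lr   = pStack[5];
    frame.pc   = pStack[6];
    frame.xpsr = pStack[7];
    return frame;
}

/* Recover the stack pointer value from before the exception was taken.
   frameAddress is the address at which r0 was stacked. Assumes the basic
   frame with no floating point registers. */
uint32_t preFaultSp(uint32_t frameAddress, const AutoStackedFrame* pFrame)
{
    uint32_t sp = frameAddress + 8u * sizeof(uint32_t);
    if (pFrame->xpsr & (1u << 9)) {
        /* The core pushed one extra word to align the stack to 8 bytes. */
        sp += 4u;
    }
    return sp;
}
```

The stacked pc field is usually the most interesting value when triaging by hand, since it points at or near the instruction that faulted.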
===CrashCatcher_Entry()
The CrashCatcher_Entry() routine is the core crash dumping routine and can be found [[https://github.com/adamgreen/CrashCatcher/blob/master/Core/src/CrashCatcher.c | here]]. It is written in C and knows how to dump the state of Cortex-M0/M3/M4 processors using the registers that were stacked before it was called. The HardFault_Handler() assembly language code passes in a pointer to where it stacked these registers of interest.

How CrashCatcher_Entry() presents the crash dump will depend on a developer's particular hardware setup. For example, the developer may want it to be dumped to a serial interface or stored on some nonvolatile storage medium. As CrashCatcher gathers the crash state, it will call routines provided by the developer to actually dump this state.
===Developer Provided Routines
This section describes the routines that the CrashCatcher core expects to be provided by the developer to facilitate crash dumping on their particular hardware setup. There is more information in the [[https://github.com/adamgreen/CrashCatcher/blob/master/include/CrashCatcher.h | main CrashCatcher header file]].
====Core Routines
The following table lists the routines that are required by the Core CrashCatcher module to generate a crash dump. Implementing these functions provides the developer with the most flexibility for where the crash dump will be recorded.
| CrashCatcher_DumpStart() | Called at the beginning of a crash dump. The developer should provide an implementation which prepares for the dump by opening a dump file, prompting the user, or whatever makes sense for their scenario. A pointer to a CrashCatcherInfo structure is passed into this function. This lets the developer know things about the current fault such as whether it was caused by a hard-coded breakpoint instruction and the current value of the Stack Pointer. |
| CrashCatcher_GetMemoryRegions() | Called to obtain an array of regions in memory that should be dumped as part of the crash. This will typically be all RAM regions that contain volatile data. For some crash scenarios, a developer may decide to also add peripheral registers of interest (i.e. dump some ethernet registers when encountering crashes in the network stack). If NULL is returned from this function, the core will only dump the registers. A developer might want to take advantage of the Stack Pointer value passed into CrashCatcher_DumpStart() to only return a region for the currently used stack. This would result in a mini dump where only the call stack and local variables can be accessed by GDB (globals wouldn't be accessible). |
| CrashCatcher_DumpMemory() | Called to dump the next chunk of memory (this memory may contain the contents of registers which have already been copied to memory by CrashCatcher). The element size will be 8 bits, 16 bits, or 32 bits. The implementation should use reads of the specified size since some memory locations may only support reads of the indicated size. |
| CrashCatcher_DumpEnd() | Called at the end of a crash dump. The developer should provide an implementation which cleans up at the end of the dump. This could include closing a dump file, blinking LEDs, and/or infinite looping. It is typically only safe to return CRASH_CATCHER_EXIT if the call to CrashCatcher_DumpStart() indicates that the cause of the fault is a hard-coded breakpoint. |
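
To make the table above more concrete, here is a rough sketch of the four routines. It is written so that it compiles on its own, so the types are declared locally as stand-ins: their names, fields, the function signatures, the CRASH_CATCHER_TRY_AGAIN value and the 0xFFFFFFFF array terminator are all assumptions modeled on the main CrashCatcher header file and should be checked against it. STACK_TOP is a hypothetical address, and the text buffer stands in for whatever file or serial port a real port would write to. The region routine follows the mini dump idea from the table by returning only the area between the faulting Stack Pointer and the top of the stack.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins (assumptions); the real definitions are in CrashCatcher.h. */
typedef enum { CRASH_CATCHER_BYTE = 1, CRASH_CATCHER_HALFWORD = 2,
               CRASH_CATCHER_WORD = 4 } CrashCatcherElementSizes;
typedef enum { CRASH_CATCHER_TRY_AGAIN = 0,
               CRASH_CATCHER_EXIT = 1 } CrashCatcherReturnCodes;
typedef struct { uint32_t sp; int isBKPT; } CrashCatcherInfo;
typedef struct {
    uint32_t                 startAddress;
    uint32_t                 endAddress;
    CrashCatcherElementSizes elementSize;
} CrashCatcherMemoryRegion;

#define STACK_TOP 0x10008000u   /* hypothetical end of the stack area */

static char                     g_dumpText[256]; /* stands in for file/UART */
static size_t                   g_dumpLength;
static CrashCatcherInfo         g_info;
static CrashCatcherMemoryRegion g_regions[2];

static void emitChar(char c)
{
    /* Always leave room for the terminating '\0'. */
    if (g_dumpLength + 1 < sizeof(g_dumpText))
        g_dumpText[g_dumpLength++] = c;
}

static void emitHexByte(uint8_t byte)
{
    static const char digits[] = "0123456789ABCDEF";
    emitChar(digits[byte >> 4]);
    emitChar(digits[byte & 0xF]);
}

void CrashCatcher_DumpStart(const CrashCatcherInfo* pInfo)
{
    memset(g_dumpText, 0, sizeof(g_dumpText));
    g_dumpLength = 0;
    g_info = *pInfo;    /* remember sp and breakpoint flag for later calls */
}

const CrashCatcherMemoryRegion* CrashCatcher_GetMemoryRegions(void)
{
    /* Mini dump: only the stack that was in use when the fault occurred. */
    g_regions[0].startAddress = g_info.sp;
    g_regions[0].endAddress   = STACK_TOP;
    g_regions[0].elementSize  = CRASH_CATCHER_BYTE;
    /* Assumed end-of-array marker. */
    g_regions[1].startAddress = 0xFFFFFFFFu;
    g_regions[1].endAddress   = 0xFFFFFFFFu;
    g_regions[1].elementSize  = CRASH_CATCHER_BYTE;
    return g_regions;
}

void CrashCatcher_DumpMemory(const void* pvMemory,
                             CrashCatcherElementSizes elementSize,
                             size_t elementCount)
{
    size_t i;
    int    shift;
    for (i = 0; i < elementCount; i++) {
        uint32_t value;
        /* Read with the requested access width, as the table requires. */
        if (elementSize == CRASH_CATCHER_WORD)
            value = ((const volatile uint32_t*)pvMemory)[i];
        else if (elementSize == CRASH_CATCHER_HALFWORD)
            value = ((const volatile uint16_t*)pvMemory)[i];
        else
            value = ((const volatile uint8_t*)pvMemory)[i];
        /* Emit least significant byte first (little-endian memory order). */
        for (shift = 0; shift < 8 * (int)elementSize; shift += 8)
            emitHexByte((uint8_t)(value >> shift));
    }
    emitChar('\n');
}

CrashCatcherReturnCodes CrashCatcher_DumpEnd(void)
{
    /* Only hand control back when the fault was a hard-coded breakpoint. */
    return g_info.isBKPT ? CRASH_CATCHER_EXIT : CRASH_CATCHER_TRY_AGAIN;
}
```

On real hardware, CrashCatcher_DumpEnd() would more likely close the dump file or loop forever than return a retry code; the return value is used here only so that the sketch can run on a host machine.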
====HexDump Routines
Often one of the first peripherals that developers get up and running on their hardware is the UART. The above list of functions may seem like a lot for a developer to implement when the only mechanism they have for communicating the crash information to the user is the UART. CrashCatcher provides a HexDump module which makes it easier for a developer to utilize their existing UART driver software for capturing the crash dump. The following table shows the simpler list of routines that have to be provided by a developer if they are using the HexDump module:
| CrashCatcher_GetMemoryRegions() | Called to obtain an array of regions in memory that should be dumped as part of the crash. This will typically be all RAM regions that contain volatile data. For some crash scenarios, a developer may decide to also add peripheral registers of interest (i.e. dump some ethernet registers when encountering crashes in the network stack). If NULL is returned from this function, the core will only dump the registers. |
| CrashCatcher_getc() | Called to receive a character of data from the user. Typically this is in response to a "Press any key" type of prompt to the user. This function should be blocking. |
| CrashCatcher_putc() | Called to send a character of hex dump data to the user. |
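
As a rough illustration of how little is needed for the two character routines, the sketch below wraps a hypothetical UART driver. The helpers uartRxReady(), uartReadData() and uartWriteData() are invented for this example and are simulated with RAM buffers so that the code can run on a host; on real hardware they would poll and access the UART's status and data registers. The int based signatures of CrashCatcher_getc() and CrashCatcher_putc() are an assumption and should be checked against the CrashCatcher header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical UART driver, simulated with RAM buffers for host testing. */
static const char* g_uartRx = "";      /* characters "typed" by the user   */
static char        g_uartTx[128];      /* characters sent to the user      */
static size_t      g_uartTxLength;

static int uartRxReady(void)
{
    return *g_uartRx != '\0';
}

static uint8_t uartReadData(void)
{
    return (uint8_t)*g_uartRx++;
}

static void uartWriteData(uint8_t byte)
{
    if (g_uartTxLength < sizeof(g_uartTx))
        g_uartTx[g_uartTxLength++] = (char)byte;
}

/* Blocks until the user sends a character, as the table requires. */
int CrashCatcher_getc(void)
{
    while (!uartRxReady()) {
        /* Busy wait; interrupts may not be usable inside a fault handler. */
    }
    return uartReadData();
}

/* Sends one character of hex dump text to the user. */
void CrashCatcher_putc(int c)
{
    uartWriteData((uint8_t)c);
}
```

Polling rather than interrupt driven I/O is a deliberate choice in this sketch: the code runs from the Hard Fault handler, where relying on other interrupts being serviced is generally not safe.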
The following is an excerpt of what the HexDump module outputs when a crash is encountered. It first notifies the user that a crash has been encountered and then prompts them to press any key to start the dumping process. Once the user sends any keystroke to the device, the hexadecimal text dump begins. At the end it loops and prompts the user again, in case they missed some data from the previous dump.
{{{
CRASH ENCOUNTERED
Enable logging and then press any key to start dump.
63430300
00000000
00000000010000000200000003000000
04000000050000000600000007000000
08000000090000000A0000000B000000
0C000000
687F0010
FFFFFFFF0003000000000001
487F00107809001003000000
0000001000800010
00BE0AE00D782D0668400824400000D3
...
28ED00E03CED00E0
00000100000000400800000034ED00E0
38ED00E0
End of dump
CRASH ENCOUNTERED
Enable logging and then press any key to start dump.
}}}
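
When inspecting a captured log by hand, it can help to turn a line of hex text back into bytes. The helper below is a small host-side sketch, not part of CrashCatcher or CrashDebug (CrashDebug reads the dump itself); decodeHexLine() and littleEndianWord() are hypothetical names. It assumes, as the excerpt above suggests, that 32-bit values appear in little-endian memory order, so a line such as 687F0010 would correspond to the word 0x10007F68.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one hex digit to its value, or return -1 if it is not a digit. */
static int hexNibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode one line of hex text into pOut. Stops at the end of the string or
   at a line ending. Returns the number of bytes decoded, or -1 if the line
   contains a non-hex character, has an odd number of digits, or does not
   fit in outSize bytes. */
int decodeHexLine(const char* pLine, uint8_t* pOut, size_t outSize)
{
    size_t count = 0;
    while (pLine[0] != '\0' && pLine[0] != '\r' && pLine[0] != '\n') {
        int hi = hexNibble(pLine[0]);
        /* pLine[1] is safe to read: pLine[0] was not the terminator. */
        int lo = (hi < 0) ? -1 : hexNibble(pLine[1]);
        if (hi < 0 || lo < 0 || count >= outSize)
            return -1;
        pOut[count++] = (uint8_t)((hi << 4) | lo);
        pLine += 2;
    }
    return (int)count;
}

/* Assemble four bytes in little-endian order into a 32-bit value. */
uint32_t littleEndianWord(const uint8_t* pBytes)
{
    return (uint32_t)pBytes[0] |
           ((uint32_t)pBytes[1] << 8) |
           ((uint32_t)pBytes[2] << 16) |
           ((uint32_t)pBytes[3] << 24);
}
```

Lines that are not hex data, such as the prompt text, are rejected with -1, which gives a simple way to skip them while scanning a log.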
===CrashCatcher Stack
When dumping the information about a crash, CrashCatcher sets the stack pointer to an area of memory reserved for this purpose. It uses its own stack because stack corruption may have been what led to the crash in the first place. This reserved stack area is named {{{g_crashCatcherStack}}} and can be found in the Core/src/CrashCatcher.c source file. Its size is determined by the {{{CRASH_CATCHER_STACK_WORD_COUNT}}} macro in the Core/src/CrashCatcherPriv.h header.
The CrashCatcher stack must be large enough to run CrashCatcher's code, any routines you provide, and any code that your routines end up calling. The current size specified by {{{CRASH_CATCHER_STACK_WORD_COUNT}}} may not be enough depending on the nature of your routines. If you need a larger stack, you can modify the {{{CRASH_CATCHER_STACK_WORD_COUNT}}} macro in Core/src/CrashCatcherPriv.h or provide it on the command line of your C compiler and assembler (using the {{{-DCRASH_CATCHER_STACK_WORD_COUNT=###}}}
