I've implemented a bootloader for a Kinetis ARM Cortex-M4 microcontroller.
The main application (starting at 0x10000) is re-programmed via the bootloader over a custom RS232 interface. I've implemented jumpToApplication and jumpToBootloader functions from the bootloader and application perspectives, and all works fine so far.
One thing I'm keen to understand is what to do in the event of a corrupt main application.
The bootloader currently checks the stack pointer and program counter of the main application before deciding whether to jump. However, if the main application is corrupt then one of two issues will occur:
- The main application will hang and make it difficult to re-program
- The microcontroller will reboot and will be stuck in a bootloader > application > bootloader (etc.) loop
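For reference, the stack-pointer and program-counter check described above can be sketched roughly as follows. This is only a minimal sketch: APP_START matches the 0x10000 address given earlier, but APP_END, RAM_START, RAM_END and the function name appVectorsLookValid are hypothetical placeholders, and the real bounds have to come from the datasheet of the specific Kinetis part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Only APP_START comes from the post. The other bounds are hypothetical
 * placeholders and must be replaced with the values for the actual part. */
#define APP_START 0x00010000u
#define APP_END   0x00080000u /* hypothetical end of application flash */
#define RAM_START 0x1FFF0000u /* hypothetical start of SRAM */
#define RAM_END   0x20010000u /* hypothetical end of SRAM (exclusive) */

/* vectors[0] is the initial stack pointer and vectors[1] is the reset
 * handler address, as laid out at the start of a Cortex-M vector table. */
bool appVectorsLookValid(const uint32_t vectors[2])
{
    uint32_t sp = vectors[0];
    uint32_t pc = vectors[1];

    /* The initial stack pointer must be word aligned and point into RAM.
     * A full-descending stack may start exactly at RAM_END. */
    if ((sp & 0x3u) != 0u || sp <= RAM_START || sp > RAM_END) {
        return false;
    }

    /* Cortex-M only executes Thumb code, so bit 0 of the reset vector
     * must be set. */
    if ((pc & 0x1u) == 0u) {
        return false;
    }

    /* The handler itself must live inside the application flash region.
     * An all-ones word (blank flash) fails this range check as well. */
    uint32_t handler = pc & ~0x1u;
    return handler >= APP_START && handler < APP_END;
}
```

On the target this would be called as appVectorsLookValid((const uint32_t *)APP_START). Note that it only inspects the first two words of the image, so an application whose first flash sector was written correctly but whose remainder is missing would still pass.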
I have a SharedData structure which allows me to share data (via a fixed RAM location) between both the bootloader and application. I have considered adding a rebootCounter to this structure which would be incremented upon the HardFaultInterrupt being triggered in the main application.
This value could be tested in the bootloader and, depending on the counter value, a decision could be made as to whether to stay in the bootloader or try to launch the application.
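A minimal sketch of how such a counter might be tested in the bootloader is shown below. The magic field, SHARED_MAGIC, MAX_BOOT_ATTEMPTS and both function names are hypothetical, and the real SharedData structure presumably has other members. Note that this sketch is a variant of the idea above: it increments the counter in the bootloader before each launch attempt and expects the application to clear it once it is running, rather than incrementing it in HardFaultInterrupt. That way a hang or a reset that never reaches the fault handler is counted too.

```c
#include <stdbool.h>
#include <stdint.h>

#define SHARED_MAGIC      0xB007C0DEu /* hypothetical "RAM contents valid" marker */
#define MAX_BOOT_ATTEMPTS 3u          /* hypothetical threshold */

/* Hypothetical layout; the real SharedData may hold more fields. */
typedef struct {
    uint32_t magic;         /* distinguishes valid data from power-on garbage */
    uint32_t rebootCounter; /* launch attempts since the last healthy run */
} SharedData;

/* Called by the bootloader after every reset.
 * Returns true if the application should be launched. */
bool shouldLaunchApplication(SharedData *shared)
{
    if (shared->magic != SHARED_MAGIC) {
        /* RAM is undefined after power-on, so start from a clean state. */
        shared->magic = SHARED_MAGIC;
        shared->rebootCounter = 0u;
    }

    if (shared->rebootCounter >= MAX_BOOT_ATTEMPTS) {
        return false; /* too many failed attempts: stay in the bootloader */
    }

    shared->rebootCounter++;
    return true;
}

/* Called by the application once it has started up successfully. */
void markApplicationHealthy(SharedData *shared)
{
    shared->rebootCounter = 0u;
}
```

With these assumptions, three consecutive launches that never reach markApplicationHealthy leave the device waiting in the bootloader. The bootloader would also need to reset the counter after a successful re-program, and the counter is lost on a power cycle because it lives in RAM.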
Are there more "industry standard" ways of dealing with this?
UPDATE
To clarify, the ultimate reason for asking this question is to cover the following scenario:
- Bootloader is programmed into the device during production phase via JTAG
- Main application (latest build) is loaded during testing phase
- During the testing phase, there is a power-cut or connection issue and the device is only partially programmed
- When power is applied again, the bootloader will "assume" that there is a valid program in the main part of flash and will "jump" to this application
- The microcontroller is now stuck in no man's land with no way of re-loading flash via the bootloader again without opening up the product's enclosure and re-flashing the chip via JTAG - not something we can do when the product is in the field.
During the bootloader programming phase, the firmware is programmed and validated byte-by-byte to ensure that there is no corruption during the data transfer. If corruption occurs during this phase (bad packet due to USB hub issue, for example) then the bootloader will continue to accept re-programming commands.
UPDATE #2
The following post seems to be thinking along similar lines:
https://interrupt.memfault.com/blog/how-to-write-a-bootloader-from-scratch