[SOLVED] J-Link won't reprogram a page with all 1's on STM32L4

  • Hi,

    I have an issue with JLinkExe/J-Flash on an STM32L433CC, which I suspect is due to some JLink flash access optimization.
    TL;DR: if you programmatically and explicitly write a flash page (2 KB) with all 1's (from within your firmware code), J-Link won't be able to reprogram it unless you erase the whole flash first!

    Long story (after several hours of debugging):

    I am implementing a simple bootloader to allow for on-the-field firmware upgrade. I have two separate binaries: a small bootloader firmware (flashed at the reset address location, 0x08000000) and the actual application firmware (stored at 0x08020000).
    I use J-Link/GDB under Eclipse with the appropriate settings so that I can quickly program and debug both the bootloader and the application with a single mouse click, which I find very convenient.

    For the on-the-field upgrade, it goes like this: you flash the new application code starting at 0x08004000 and set a flag in the flash and then reboot.
    The bootloader then checks for the flag and if it is set, it erases the flash area containing the current firmware and overwrites it with the new one before jumping to it. Pretty simple.
    To test the bootloader code, I tried downloading an empty firmware (all 0xFF's) into the special area and setting the flag. The bootloader then just copies it over to the application area at 0x08020000 (remember, it's all FF's!).
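
A minimal sketch of the update flow described above, written for a host rather than the target. The `flash_ops_t` hooks, `apply_pending_update`, and the RAM-backed `sim_*` stand-ins are assumptions made for illustration and are not taken from the poster's bootloader; on the real device the hooks would wrap the STM32L4 flash controller, and the staging and application addresses would be the ones given in the post.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   2048u   /* flash page size quoted in the post */
#define DWORD_SIZE  8u      /* flash is written as 64-bit double words */

/* Hypothetical driver hooks; on target these would wrap the flash controller. */
typedef struct {
    bool     (*erase_page)(uint32_t addr);
    bool     (*program_dword)(uint32_t addr, uint64_t value);
    uint64_t (*read_dword)(uint32_t addr);
} flash_ops_t;

/*
 * If the update flag is set, erase the application area and copy the staged
 * image over it. `len` is expected to be a multiple of PAGE_SIZE.
 * Returns true only if an update was requested and fully written.
 */
bool apply_pending_update(const flash_ops_t *ops, bool update_flag,
                          uint32_t staging_addr, uint32_t app_addr,
                          uint32_t len)
{
    if (!update_flag) {
        return false;               /* nothing staged, keep the current app */
    }
    for (uint32_t off = 0; off < len; off += PAGE_SIZE) {
        if (!ops->erase_page(app_addr + off)) {
            return false;
        }
    }
    for (uint32_t off = 0; off < len; off += DWORD_SIZE) {
        uint64_t value = ops->read_dword(staging_addr + off);
        if (!ops->program_dword(app_addr + off, value)) {
            return false;
        }
    }
    return true;
}

/* RAM-backed stand-in for flash so the sketch can be exercised on a host. */
#define SIM_BASE 0x08000000u
#define SIM_SIZE (4u * PAGE_SIZE)
static uint8_t sim_flash[SIM_SIZE];

bool sim_erase_page(uint32_t addr)
{
    /* Round down to the start of the page, then fill it with 0xFF. */
    memset(&sim_flash[(addr - SIM_BASE) & ~(PAGE_SIZE - 1u)], 0xFF, PAGE_SIZE);
    return true;
}

bool sim_program_dword(uint32_t addr, uint64_t value)
{
    memcpy(&sim_flash[addr - SIM_BASE], &value, sizeof value);
    return true;
}

uint64_t sim_read_dword(uint32_t addr)
{
    uint64_t value;
    memcpy(&value, &sim_flash[addr - SIM_BASE], sizeof value);
    return value;
}

const flash_ops_t sim_ops = { sim_erase_page, sim_program_dword, sim_read_dword };
```

Note that this copy loop programs every double word it reads from the staging area, including ones that are all 1's, which is exactly the situation the rest of this post is about.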

    After that, I can't reprogram the flash pages starting at 0x08020000 using J-Link/GDB anymore. The error message (or lack thereof) varies depending on the tool/OS I use.

    What's worse, GDB will just behave as if everything had been OK!
    See the output here (GDB/Linux):


    However, when reading back the area at 0x08020000 (where the application should have been), I read all FF's.
    If I do the same operation on the CLI tool JLinkExe/Linux, I at least get some hint that something went wrong:

    Code
    Comparing flash   [100%] Done.
    Erasing flash     [100%] Done.
    Programming flash [100%] Done.
    Verifying flash   [100%] Done.
    J-Link: Flash download: Restarting flash programming due to program error (possibly skipped erasure of half-way erased sector).
    J-Link: Flash download: Skip optimizations disabled for second try.
    Error while programming flash: Programming failed.


    Almost in despair, I also tried using J-Flash (under Windows):

    After performing a mass erase, everything actually goes back to normal.
    Programmatically erasing the affected pages also seems to fix the situation.

    So while I can't be 100% sure it's J-Link's fault, it looks like its optimization logic gets somehow deceived by this use case (albeit a pretty unique one, I admit).
    The root cause is perhaps to be found within the STM32L4 internal flash implementation with its out-of-band ECC making an empty/erased flash page look indistinguishable from a page programmed with all 1's, even though the former can be reprogrammed at will, while the latter must be erased first.
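
To make this hypothesis concrete, here is a toy model in C. It is not a description of the real STM32L4 flash controller: it simply assumes that each 64-bit double word carries hidden ECC state that any program operation sets and only an erase clears. The names `toy_dword_t`, `toy_erase`, `toy_program` and `toy_reads_blank` are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one 64-bit flash double word with out-of-band ECC. */
typedef struct {
    uint64_t data;         /* what a read returns */
    bool     ecc_written;  /* assumed: set by any program, cleared by erase */
} toy_dword_t;

void toy_erase(toy_dword_t *dw)
{
    dw->data = UINT64_MAX;
    dw->ecc_written = false;
}

/* Programming succeeds only on an erased double word, whatever the data is. */
bool toy_program(toy_dword_t *dw, uint64_t value)
{
    if (dw->ecc_written) {
        return false;          /* models the programming error */
    }
    dw->data &= value;         /* programming can only clear bits */
    dw->ecc_written = true;
    return true;
}

/* A read-back based blank check only sees the data bits, not the ECC state. */
bool toy_reads_blank(const toy_dword_t *dw)
{
    return dw->data == UINT64_MAX;
}
```

In this model a double word programmed with all 1's still reads as blank, yet a second `toy_program` call on it fails until `toy_erase` is called, which matches the behaviour described in this thread.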

    For now I worked around the issue by adding an explicit check in the flash programming routine which refrains from explicitly writing a 64-bit double word if it's all 1's.
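
A minimal sketch of such a check, assuming a hypothetical low-level hook of type `program_dword_fn` that programs one 64-bit double word and returns false on error. The function name `program_skipping_blank` and the recording stub are illustrative only and do not come from the actual firmware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical low-level hook that programs one 64-bit double word. */
typedef bool (*program_dword_fn)(uint32_t addr, uint64_t value);

/*
 * Program `count` double words starting at `addr`, leaving out any that are
 * all 1's so the corresponding flash locations stay in their erased state.
 * Returns the number of double words actually programmed, or -1 on error.
 */
long program_skipping_blank(program_dword_fn program, uint32_t addr,
                            const uint64_t *src, size_t count)
{
    long written = 0;
    for (size_t i = 0; i < count; i++) {
        if (src[i] == UINT64_MAX) {
            continue;               /* same value an erased location reads as */
        }
        if (!program(addr + (uint32_t)(i * 8u), src[i])) {
            return -1;
        }
        written++;
    }
    return written;
}

/* Recording stub so the routine can be tried on a host. */
unsigned stub_calls;
uint32_t stub_last_addr;

bool stub_program(uint32_t addr, uint64_t value)
{
    (void)value;
    stub_calls++;
    stub_last_addr = addr;
    return true;
}
```

The destination must already be erased for this to be equivalent to writing the full image, since skipped locations keep whatever they held before.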

    Nevertheless, it's quite unnerving how:
    1) GDB incorrectly reports that programming was successful even though that was absolutely not the case!
    2) JLinkExe tries to disable its internal optimization, without much success though
    3) J-Flash reports "Erase operation completed successfully" (while apparently nothing was erased)
    4) I found no way to explicitly and unconditionally erase a single page!

    Could anyone please confirm whether my suspicion makes any sense at all, or whether I'm doing something wrong somewhere else?
    Is there any better way to deal with this?
    Any chance this could be fixed so that real errors are reported?

    Thank you!

  • Hello,

    Thank you for your inquiry.

    We were able to reproduce the issue with an STM32L4 board and J-Flash.
    For this particular device family, it seems that the flash can only be programmed once after an erase.
    Programming all 0xFF also counts as programming once on the STM32L4.
    If you then try to program on top of that, J-Link reads all 0xFF and, because of its optimization, assumes the area is already erased. It then tries to program the same flash cells a second time without erasing them first, which leaves the target device in a "confused" state. As a result, no subsequent operation works properly until a software or power-on reset.
    As it is not feasible for us to remove the optimizations for such a special case, we can only offer workarounds:

    - Use test data other than 0xFF to avoid the erase optimization. For example, in J-Flash, Target->Test->Generate Test data... can generate data that forms a simple branch-to-self loop, which is harmless on Cortex-M devices.
    - Skip the automatic flash compare. The affected area is then always erased, so that particular sector gets erased every time. To do this, use the exec command "SetCompareMode", which sets the compare mode for the current session. Set it to "Skip" and you should no longer see any of the described issues.
    For more information, consult the J-Link User Manual.

    Quote

    I found no way to explicitly and unconditionally erase a single page!


    To do this specifically, only J-Flash can be used; in all other cases the J-Link software automatically takes care of erasing only the affected sectors, to increase the longevity of the user's flash memory.

    Best regards,
    Nino

  • Thank you for your reply; see my comments inline.

    Quote

    Hello,

    Thank you for your inquiry.

    We were able to reproduce the issue with an STM32L4 board and J-Flash.
    For this particular device family, it seems that the flash can only be programmed once after an erase.
    Programming all 0xFF also counts as programming once on the STM32L4.
    If you then try to program on top of that, J-Link reads all 0xFF and, because of its optimization, assumes the area is already erased. It then tries to program the same flash cells a second time without erasing them first, which leaves the target device in a "confused" state. As a result, no subsequent operation works properly until a software or power-on reset.


    Actually, a power-on reset, or even a power cycle, does not seem to help either.

    Quote


    As it is not feasible for us to remove the optimizations for such a special case, we can only offer workarounds:

    - Use test data other than 0xFF to avoid the erase optimization. For example, in J-Flash, Target->Test->Generate Test data... can generate data that forms a simple branch-to-self loop, which is harmless on Cortex-M devices.
    - Skip the automatic flash compare. The affected area is then always erased, so that particular sector gets erased every time. To do this, use the exec command "SetCompareMode", which sets the compare mode for the current session. Set it to "Skip" and you should no longer see any of the described issues.
    For more information, consult the J-Link User Manual.


    OK, I'll give it a try.

    Quote


    To do this specifically, only J-Flash can be used; in all other cases the J-Link software automatically takes care of erasing only the affected sectors, to increase the longevity of the user's flash memory.


    Well, the above workaround should be enough.
    The last thing I don't understand, though, is why the J-Link GDB Server does not report any error at all. I see a pretty reassuring line:

    Code
    Verifying flash   [....................] Done.

    which is somewhat misleading. It suggests that programming was successful, since a later comparison would have to produce a match. Yet this is certainly not the case.
    Could you please elaborate? What does that line refer to?

    Thank you!
