SES 3.10e trouble identifying hard fault


  • SES 3.10e trouble identifying hard fault

    I am using SES v2.20 on Ubuntu 16.04 and running into issues with a project for the STM32F429ZI (Cortex-M4). I copied a working set of files from an existing project into a new project so I could make some modifications. After the modifications, the project compiles but does not execute correctly. On SES v2.20 the debugger simply halts with a message about stopping due to a vector catch, with no specifics. So I upgraded to SES 3.10. Now, instead of the debugger halting completely, execution ends up in the hard fault handler.

    I've tried commenting out code to see what is causing the error, but this hasn't helped; it only changes where the error occurs. If I comment out function A, the error happens in function B, and vice versa. Also, the code executes correctly if I step through it in the disassembly window, but as soon as I stop going line by line and hit `run`, I get an error. As an example, with all of my project code in place, I get an error on this line:

    Source Code

    newTaskID = 0x00; // global of type unsigned char

    If I put a breakpoint there and try to continue execution with `run`, `step over`, or `step into`, I get a hard fault, unless I switch to the disassembly window, in which case it executes fine. If I comment out this line of code, the error simply happens at the next statement. I suspect something is wrong with my project settings rather than my code, but I have no idea how to investigate this. The only information I could find about the vector catch came from people whose code was being placed in the wrong section of memory. If that is the case here, I'm not sure how to identify or fix the issue. I've attached source files if anyone cares to try to reproduce the issue. As stated, this code is for the STM32F429ZI Discovery Board with the default project settings.
    Files
    • Scheduling.7z


    The post was edited 1 time, last by Eqqman ().

  • Commenting and uncommenting code is likely to confuse the issue when you are chasing something like a hard fault. I'm actually battling one myself right now on an STM32F415. The System Control Block has fault status registers that you can inspect, and they can be helpful. Check your stacks for overflow if you are using an RTOS (the plugins for embOS and FreeRTOS will display this information). You can also find hard fault handler code out there that pulls useful information off the stack.
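As a rough illustration of what "pulling information off the stack" means, here is a minimal sketch. On exception entry a Cortex-M core pushes eight words (r0-r3, r12, lr, pc, xPSR) onto the active stack, and bit 2 of the EXC_RETURN value in LR tells which stack was in use. The names `StackedFrame`, `decode_stacked_frame`, and `fault_used_psp` are made up for this example, and the sketch assumes the basic frame without floating-point state. It is written so the decoding logic can also be tried on a host; on the target, the pointer would come from MSP or PSP inside the fault handler.

```c
#include <stdint.h>

/* Basic exception frame pushed by a Cortex-M core on exception entry
 * (assumes no floating-point state was stacked). The order is fixed
 * by the architecture. StackedFrame is a hypothetical name. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, psr;
} StackedFrame;

/* Copy the eight stacked words, starting at the stack pointer that was
 * active when the fault was taken. */
static StackedFrame decode_stacked_frame(const uint32_t *sp)
{
    StackedFrame f = { sp[0], sp[1], sp[2], sp[3],
                       sp[4], sp[5], sp[6], sp[7] };
    return f;
}

/* Bit 2 of EXC_RETURN (the LR value seen inside the handler) selects
 * the stack that holds the frame: 0 = MSP, 1 = PSP. */
static int fault_used_psp(uint32_t exc_return)
{
    return (exc_return & (1u << 2)) != 0u;
}
```

The stacked `pc` is usually the most useful field, since it points at or near the instruction that was executing when the fault was taken.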

    The real tool to use for this is trace. Unfortunately, the STM32F parts do not have an ETB (Embedded Trace Buffer, an on-chip trace buffer that the debugger can read). Instead, you need to run the trace lines to your debug connector and purchase something like a J-Trace. I am in the process of ordering one myself for an STM32F7 project.

    Good luck.
  • Hi,

    You might want to have a look at our Application Note for more analysis possibilities of Hard Faults: segger.com/downloads/appnotes
    There might be different reasons for hard faults, especially in "multi-tasking" applications. The hardfault handler might help you to find it.

    Best regards
    Johannes
    Please read the forum rules before posting.

    Keep in mind, this is *not* a support forum.
    Our engineers will try to answer your questions between their projects if possible but this can be delayed by longer periods of time.
    Should you be entitled to support you can contact us via our support system: segger.com/ticket/

    Or you can contact us via e-mail.
  • I might be closer to an answer, although the exact nature of the problem eludes me. My project has several versions of the code base. In versions n and n-1, I can compile and run the code just fine, and even pause and step into it. However, if I place a breakpoint anywhere in the system, I get a hard fault the moment I move past the breakpoint. After some trial and error, I isolated the issue to this line of code:

    Source Code

    NVIC_EnableIRQ(TIM6_IRQ); // TIM6_IRQ defined as 0x36

    This function is defined in one of the default files auto-generated by SES (core_cm4.h). If I comment out this line of code, everything works perfectly fine. If I allow the code to free-run, everything also works fine. However, placing a breakpoint in the system gives a hard fault the second I attempt to advance through the code, provided that NVIC_EnableIRQ has already been called (so I can step through code just fine until I reach this function). This still puzzles me, since in version n-2 of my code base I can step through that function call just fine in debug mode. And when I was developing version n-1 of my code, I don't recall having any issues with the debugger, so I don't understand what has happened.
  • Hello-

    I have downloaded the recommended document and followed its advice. When I do the simple hard fault handler:

    C Source Code

    static volatile unsigned int _Continue;
    void HardFault_Handler(void) {
      _Continue = 0u;
      //
      // When stuck here, change the variable value to != 0 in order to step out
      //
      while (_Continue == 0u);
    }

    the document says "If you step out of the Hard Fault handler, you will reach the first instruction after the instruction which caused the hard fault." However, this is not what happens to me. Instead, execution immediately goes back to the hard fault handler, even when I step through in disassembly mode.

    When I add in the more detailed code, these are the non-zero values in the `HardFaultRegs` structure:

    Source Code

    bfar = 0xe000ed38 // Bus Fault Address Register
    ufsr:INVPC = 0x01 // Attempt to do an exception return with a bad value in EXC_RETURN
    hfsr:FORCED = 0x01 // Indicates hard fault is taken because of bus fault/memory management fault/usage fault

    Everything else is zero, even `SavedRegs`. I'm not sure how to proceed with this information. As a reminder, the code appears to execute correctly provided I never pause on a breakpoint. This is the section giving the issue:

    C Source Code

    void
    OSp_InitTIM6 (void) {
      volatile unsigned int wait = MAX_WAIT;
      // Enable the system clock for the TIM6 peripheral
      // [1] p.183
      RCC->APB1ENR |= RCC_APB1ENR_TIM6EN;
      // Wait for the peripheral clock to stabilize
      for (wait = 0x00; wait < MAX_WAIT; ) {
        ++wait;
      }
      // Enable auto-reload preload (ARPE)
      // [1] p.704
      TIM6->CR1 BON(BIT_07);
      // Ensure counter is free-running (one-pulse mode off, does not stop counting)
      // [1] p.705
      TIM6->CR1 BOFF(BIT_03);
      // Only over/underflow causes interrupts (URS)
      // [1] p.705
      TIM6->CR1 BON(BIT_02);
      // Update events on over/underflow enabled (UDIS cleared)
      // [1] p.705
      TIM6->CR1 BOFF(BIT_01);
      // Update interrupt enabled (UIE)
      // [1] p.706
      TIM6->DIER BON(BIT_00);
      // No prescaler used
      // [1] pp.699, 708
      TIM6->PSC = 0x00;
      // Value for auto-reload register (sets the timer duration)
      // [1] pp.699, 701, 708
      TIM6->ARR = ONE_MS;
      // Set the timer interrupt to priority value 0 (lowest number, most urgent)
      // [4] pp.208, 214, core_cm4.h in (Proj. Dir.)/CMSIS_4/CMSIS/Include
      NVIC_SetPriority(TIM6_IRQ, 0x00);
      // Timer is ON
      // [1] p.705
      TIM6->CR1 BON(BIT_00);
      // Enable IRQ for TIM6 in the NVIC
      // [4] pp.208, 214, core_cm4.h in (Proj. Dir.)/CMSIS_4/CMSIS/Include
      NVIC_EnableIRQ(TIM6_IRQ);
    } // end OSp_InitTIM6

    If I comment out the line to enable the IRQ, everything functions normally with the debugger. I've also had no problems with this code as-is in other projects.
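For readers trying to interpret values like the ones reported above, here is a small sketch of how the relevant bits are laid out. In the Cortex-M4 System Control Block, CFSR packs MMFSR in bits 0-7, BFSR in bits 8-15, and UFSR in bits 16-31; INVPC is bit 2 of UFSR, BFARVALID is bit 15 of CFSR, and FORCED is bit 30 of HFSR. The names `FaultSummary` and `summarize_fault` are made up for this example. Note that BFAR only holds a meaningful address while BFARVALID is set, so with "everything else zero" the reported `bfar` value probably should not be read as a faulting address.

```c
#include <stdint.h>

/* Bit positions within SCB->CFSR and SCB->HFSR on a Cortex-M4. */
#define CFSR_UFSR_SHIFT  16u          /* UFSR occupies CFSR[31:16]      */
#define UFSR_INVPC       (1u << 2)    /* invalid EXC_RETURN / PC load   */
#define CFSR_BFARVALID   (1u << 15)   /* BFAR holds a valid address     */
#define HFSR_FORCED      (1u << 30)   /* escalated configurable fault   */

/* FaultSummary and summarize_fault are hypothetical names. */
typedef struct {
    int forced;      /* hard fault was escalated from another fault */
    int invpc;       /* usage fault: bad exception return           */
    int bfar_valid;  /* BFAR contents can be trusted                */
} FaultSummary;

/* Extract the three flags from raw CFSR and HFSR register values. */
static FaultSummary summarize_fault(uint32_t cfsr, uint32_t hfsr)
{
    FaultSummary s;
    uint32_t ufsr = cfsr >> CFSR_UFSR_SHIFT;

    s.forced     = (hfsr & HFSR_FORCED) != 0u;
    s.invpc      = (ufsr & UFSR_INVPC) != 0u;
    s.bfar_valid = (cfsr & CFSR_BFARVALID) != 0u;
    return s;
}
```

On the target, the inputs would be read from `SCB->CFSR` and `SCB->HFSR` (CMSIS names) inside the fault handler.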
  • Hi,

    NVIC_EnableIRQ(TIM6_IRQ); enables the TIM6 interrupt, which seems to be your system/kernel timer.
    If you do not call this function your OS probably won't run with multiple tasks.

    When you step (on source or instruction level), interrupts are usually disabled.
    When you let your application run, interrupts are enabled.

    So it might be possible that there is a problem in your TIM6 ISR.
    Set a breakpoint in the ISR and check if the hard fault happens there.

    Best regards
    Johannes
  • Thanks to everyone that helped.

    As Johannes guessed, the problem is Timer 6. On the STM32F429 board this timer keeps running once enabled, even while the processor is halted in the debugger. So if a breakpoint is hit too early in the code, the Timer 6 ISR fires as soon as execution resumes, before key system variables have been properly set up. When I used this same code on the TI LaunchPad with the TM4C123G chip, the timer I was using halted whenever the debugger halted, so these problems went undetected for 2 years.
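For anyone who lands here with the same symptom, one possible mitigation is sketched below; it was not part of the fix described in this thread. The STM32F4 debug unit has a freeze register, DBGMCU_APB1_FZ, whose DBG_TIM6_STOP bit (bit 4) is documented to stop the TIM6 counter while the core is halted. The address 0xE0042008 and the bit position are taken from memory of the STM32F4 reference manual and should be checked against it. The helper name `freeze_tim6_on_halt` is made up, and it takes the register pointer as a parameter so the logic can be exercised off-target.

```c
#include <stdint.h>

/* Assumed location of DBGMCU_APB1_FZ on STM32F42x parts; verify
 * against the reference manual before relying on it. */
#define DBGMCU_APB1_FZ_ADDR  0xE0042008u
#define DBG_TIM6_STOP        (1u << 4)   /* freeze TIM6 while core is halted */

/* Set the TIM6 freeze bit, leaving the other freeze bits untouched.
 * freeze_tim6_on_halt is a hypothetical helper name. */
static void freeze_tim6_on_halt(volatile uint32_t *apb1_fz)
{
    *apb1_fz |= DBG_TIM6_STOP;
}
```

On the target this would be called as `freeze_tim6_on_halt((volatile uint32_t *)DBGMCU_APB1_FZ_ADDR);` before TIM6 is started. Enabling the IRQ only after the variables used by the ISR are initialized would address the underlying ordering problem as well.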