[SOLVED] GDBServer: Call Stack reconstruction in fault condition

pruesch · Jun 7th 2016, 4:01pm

Hi,

might it be possible to integrate the ability to get a stack unwinding when a fault triggers?

someone has done it before, in a java based gdb extension:
codeconfidence.com/freertos-tools.shtml

SEGGER - Arne · Jun 15th 2016, 12:06pm

Hi,

the page you linked says it "works in conjunction with a tiny hard fault exception handler". I'm sure most of the work of building up the call stack correctly is done in this handler, as the call stack is generated entirely by the GDB client, which uses the GDB Server only for reading memory.

We already have tools that can build up the call stack even from within an exception handler.

Maybe you want to give Ozone - The J-Link Debugger a try?

Our IDE Embedded Studio is also able to do this and is available free of charge for evaluation and non-profit educational purposes.

Best regards,

Arne

pruesch · Jun 15th 2016, 5:14pm

Hi,

I wrote a short article about it and how far I got:
element14.com/community/thread…-debugging-of-hard-faults

I haven't tried your debugging solutions for a long time.
maybe I should try again.

SEGGER - Arne · Jun 15th 2016, 6:46pm

Hi,

this confirms my suspicion that the GDB client uses the wrong stack for its backtrace.
You could try to replace the LR with the SP before the exception (saved on the stack).

Best regards,
Arne

pruesch · Jun 16th 2016, 9:55am

I'm not 100% sure what you mean.

should I write the stacked sp into lr?
or write the current msp/psp into lr?

pruesch · Jun 16th 2016, 10:24am

I tried to use the stacked lr

C Source Code

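__asm volatile(" tst lr, #4 \n" );       // EXC_RETURN bit 2: which stack holds the frame?
__asm volatile(" ite eq \n" );
__asm volatile(" mrseq r0, msp \n" );    // bit clear: frame is on the main stack
__asm volatile(" mrsne r0, psp \n" );    // bit set: frame is on the process stack
__asm volatile(" ldr lr, [r0,#20] \n" ); // use the stacked lr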

but this did not produce good results.

for now, I'm getting the best results when I write the actual msp/psp to the sp.

what is your impression?
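
For reference, the "tst lr, #4" test above checks bit 2 of the EXC_RETURN value that the core places in lr on exception entry: when the bit is clear, the exception frame was pushed on the main stack (msp); when it is set, on the process stack (psp). A minimal host-side C sketch of the same decision follows. The helper name frame_base and the macro name are hypothetical and not taken from any SEGGER or GDB source.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 2 of EXC_RETURN: 0 = frame was pushed on MSP, 1 = frame was pushed on PSP. */
#define EXC_RETURN_STACK_BIT (1u << 2)

/* Hypothetical helper: return the stack pointer that holds the exception frame. */
uint32_t frame_base(uint32_t exc_return, uint32_t msp, uint32_t psp)
{
    /* Precondition: the top four bits of an EXC_RETURN value are all ones. */
    assert((exc_return & 0xF0000000u) == 0xF0000000u);

    return (exc_return & EXC_RETURN_STACK_BIT) ? psp : msp;
}
```

With lr = 0xFFFFFFFD (return to thread mode using psp) this selects psp; with 0xFFFFFFF9 or 0xFFFFFFF1 it selects msp.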

SEGGER - Arne · Jun 16th 2016, 10:42am

Hi,

take the stacked PC (offset 24).
This is the address that caused the exception.

Best regards,
Arne

pruesch · Jun 16th 2016, 10:49am

ooookayy...

it seems like I used the wrong offset to get the stacked lr from the stack frame.

I was confused because this hardfault handler uses offset 5 as lr...

if I take the stacked pc at an offset of 24 bytes and write it to the lr, the call stack is even more beautiful once the breakpoint instruction is hit.
see the attached screenshot.

how can I make the debug view focus on that lr address? like manipulating the pc and triggering another breakpoint after one instruction, is that possible?

maybe setting up some breakpoint register and making a branch to lr?

edit:
I think we got a bit confused here about the lr offsets. the stacked lr is indeed at offset 20(!). but in the current scenario, we are more interested in restoring the stacked pc, and this is at offset 24.
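
To make the offsets concrete, here is a small sketch of the basic Cortex-M exception frame (eight words, assuming no floating-point context is stacked). The type and function names are made up for illustration. Word index 5 and byte offset 20 are the same slot, the stacked lr; the stacked pc is word index 6, byte offset 24.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Basic exception frame as pushed by the core on exception entry
 * (assumption: no floating-point context is stacked). */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;   /* byte offset 20, word index 5 */
    uint32_t pc;   /* byte offset 24, word index 6 */
    uint32_t xpsr; /* byte offset 28, word index 7 */
} exc_frame_t;

/* Hypothetical helper: read the stacked lr from a frame given as raw words. */
uint32_t stacked_lr(const uint32_t *frame)
{
    assert(frame != NULL);
    return frame[offsetof(exc_frame_t, lr) / sizeof(uint32_t)];
}

/* Hypothetical helper: read the stacked pc from a frame given as raw words. */
uint32_t stacked_pc(const uint32_t *frame)
{
    assert(frame != NULL);
    return frame[offsetof(exc_frame_t, pc) / sizeof(uint32_t)];
}
```

A handler that indexes the frame as an array of 32-bit words therefore uses [5] for lr and [6] for pc, while assembly with byte offsets uses #20 and #24.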

SEGGER - Arne · Jun 16th 2016, 12:36pm

Hi,

this might not be possible, you cannot access the debug/breakpoint registers from within your application.

We could maybe implement an option in the GDB Server to do two single steps after a software BP. If your hardfault handler immediately does a branch to LR after the software BP, the PC should be exactly at the exception's cause. As a side effect, you would no longer need to do anything else, since the exception return is executed by the CPU and every register is restored to its state before the exception. Additionally, we could pass the software BP's immediate value as the signal number to the GDB client.

I will discuss that later internally.

Best regards,
Arne

SEGGER - Arne · Jun 16th 2016, 3:05pm

Hi,

I have created a beta version with this feature: download.segger.com/Arne/JLinkGDBServer
The GDB server must be started with the command line parameter "-excdbg".
The hardfault handler must have a "bx lr" directly after the "bkpt", and it must be declared with the naked attribute.

The simplest implementation of a working hardfault handler:

Source Code

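__attribute__((naked)) void HardFaultHandler(void)
{
  __asm volatile (
    " bkpt #10 \n"  // immediate value is passed to the GDB client as signal number
    " bx lr \n"     // must directly follow the bkpt: performs the exception return
  );
}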

The immediate value of the bkpt instruction is directly passed as signal number to the GDB client.
The above sample (signal number 10 is "SIGBUS") should look like this when triggered:

Best regards,
Arne

pruesch · Jun 17th 2016, 8:35am

Hi Arne,

thank you very much for your effort! I'm very impressed how fast our conversation led to this :)
and you even built a linux binary! great!

I will try it soon.

would it be possible to have the gdb server evaluate the core's fault register to determine which kind of fault occurred and set the immediate value dynamically to display it in the debug view?

best regards
Peter

unfortunately, I'm having some trouble getting the GDBServer running:

Difference-File

[developer@localhost JLink_Linux_V541g_x86_64]$ sudo strace ./JLinkGDBServer_beta
execve("./JLinkGDBServer_beta", ["./JLinkGDBServer_beta"], [/* 22 vars */]) = 0
brk(0) = 0x1007000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f70cbbe5000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f70cbbe4000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f70cbbe3000
arch_prctl(ARCH_SET_FS, 0x7f70cbbe4680) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x8} ---
+++ killed by SIGSEGV (core dumped) +++
[developer@localhost JLink_Linux_V541g_x86_64]$


I tried LD_PRELOAD=libjlinkarm.so.5, but the result is still the same.

maybe this causes my error:

Source Code

[developer@localhost JLink_Linux_V541g_x86_64]$ file JLinkGDBServer
JLinkGDBServer: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
[developer@localhost JLink_Linux_V541g_x86_64]$ file JLinkGDBServer_beta
JLinkGDBServer_beta: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), stripped

I'm trying this on CentOS 6.7

SEGGER - Arne · Jun 17th 2016, 10:00am

Hi Peter,

you can do any evaluation yourself in the hardfault handler if you branch to different "bkpt n & bx lr" instruction blocks depending on the fault register.
The only requirement for the fault handler is to have the "bx lr" directly after the "bkpt n", everything else is up to you.
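
A minimal sketch of the decision logic, assuming the fault register in question is the Cortex-M Configurable Fault Status Register (CFSR at 0xE000ED28). The function name and the mapping from fault class to signal number are illustrative assumptions: 8 (SIGFPE) and 10 (SIGBUS) are the numbers used in this thread, 4 and 11 are GDB's numbers for SIGILL and SIGSEGV. Because the bkpt immediate is encoded in the instruction, a handler would use the result to branch to the matching "bkpt n & bx lr" block.

```c
#include <stdint.h>

/* CFSR layout (ARMv7-M): MMFSR = bits 7:0, BFSR = bits 15:8, UFSR = bits 31:16. */
#define CFSR_MMFSR_MASK 0x000000FFu
#define CFSR_BFSR_MASK  0x0000FF00u
#define CFSR_UFSR_MASK  0xFFFF0000u
#define CFSR_DIVBYZERO  (1u << 25) /* UFSR: divide by zero */

/* GDB signal numbers (the mapping to fault classes below is an assumption). */
enum { SIG_ILL = 4, SIG_FPE = 8, SIG_BUS = 10, SIG_SEGV = 11 };

/* Hypothetical helper: choose the signal number to report for a CFSR value. */
int fault_signal(uint32_t cfsr)
{
    if (cfsr & CFSR_DIVBYZERO) {
        return SIG_FPE;  /* divide by zero */
    }
    if (cfsr & CFSR_UFSR_MASK) {
        return SIG_ILL;  /* any other usage fault */
    }
    if (cfsr & CFSR_BFSR_MASK) {
        return SIG_BUS;  /* bus fault */
    }
    if (cfsr & CFSR_MMFSR_MASK) {
        return SIG_SEGV; /* memory management fault */
    }
    return SIG_BUS;      /* no status bit set: fall back to the thread's example value */
}
```

On the target, the handler would read CFSR itself and keep one "bkpt n & bx lr" pair per returned value.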

Best regards,
Arne

SEGGER - Arne · Jun 17th 2016, 10:29am

Hi Peter,

please try this: download.segger.com/Arne/JLinkGDBServer.tar.gz

This archive includes the libjlinkarm.so from the same build. It works for me on a Debian jessie live system (hopefully on CentOS too).

If it does not work for you, please let me know.

Best regards,
Arne

pruesch · Jun 17th 2016, 11:18am

Hi Arne,

I tested the binary you supplied. It's perfect!

For a test, I hardcoded the SIGFPE (#8) signal, which is shown in the debugger view in eclipse. perfect!
after the exception, the pc points at the exact instruction which caused it. perfect!

it is to be discussed what amount of exception identification can be done on the target / gdb server.
I find your suggestion with different "bkpt #X && bx lr" combinations great.

during the last weeks, you raised the quality of embedded debugging by a huge leap, congratulations on that! I really like the results!
your interaction with this forum is remarkable and very welcome at our side :)

hope this proof of concept can find its way into one of the next stable versions.

Best Regards
Peter

SEGGER - Arne · Jun 17th 2016, 12:04pm

Hi Peter,

thank you very much for your positive feedback.

Of course this feature will find its way into one of the next stable versions, at least it will be in the official beta builds from now on.
Future builds will accept a parameter after -excdbg, specifying the number of steps to do after the breakpoint. Some IDEs with automatic code generation
might have their default handlers that only call custom code as a subroutine. This would require additional steps to get out of the exception handler.

Best regards,
Arne

pruesch · Jun 28th 2016, 3:12pm

would it make sense to use this kind of stack unwinding also for normal interrupts?

I would like to see in the unwound call stack where I will resume execution when I return from the ISR.

did you think about that before?

to clarify my thoughts, take a look at the attached picture. it shows the callstack before and after stepping out of the ISR.

best regards

SEGGER - Arne · Jun 29th 2016, 9:27am

Hi Peter,

as I said before, backtracing generally is the GDB client's job. It has to switch the stack on an exception return (to 0xfffffffd) if this is necessary.

I do not see any chance to work around this lack of functionality from GDB server side.
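
To illustrate what switching the stack involves on the client side, here is a rough sketch of two checks an unwinder has to make when it meets an exception return value in lr. It assumes ARMv7-M with the basic eight-word frame (no floating-point context). The function names are hypothetical, and this is not a description of how GDB itself is implemented.

```c
#include <stdbool.h>
#include <stdint.h>

#define BASIC_FRAME_BYTES  32u        /* r0-r3, r12, lr, pc, xpsr */
#define XPSR_STKALIGN_BIT  (1u << 9)  /* set if the core inserted 4 bytes of padding */

/* Hypothetical helper: does this lr value look like an ARMv7-M EXC_RETURN
 * (for example 0xFFFFFFF1, 0xFFFFFFF9 or 0xFFFFFFFD)? */
bool is_exc_return(uint32_t lr)
{
    return (lr & 0xFFFFFFE0u) == 0xFFFFFFE0u;
}

/* Hypothetical helper: stack pointer of the interrupted code, given the
 * address of the exception frame and the xpsr value stored in that frame. */
uint32_t sp_before_exception(uint32_t frame_base, uint32_t stacked_xpsr)
{
    uint32_t sp = frame_base + BASIC_FRAME_BYTES;

    if (stacked_xpsr & XPSR_STKALIGN_BIT) {
        sp += 4u; /* undo the alignment padding added on exception entry */
    }
    return sp;
}
```

Once is_exc_return() matches, the unwinder would continue from the stacked pc with the stack pointer computed by sp_before_exception() instead of following lr as an ordinary return address.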

Best regards,
Arne

pruesch · Jun 29th 2016, 3:31pm

this is a bug in the gdb client. it has been reported multiple times, but a fix has not found its way into gdb.

recent discussion:
bugs.launchpad.net/gcc-arm-embedded/+bug/1566054
