Toit
#Figuring out where some toit code is locking up
Thread channel in help
addshore 05/14/2025 01:31 PM
Someone has written some code, and I suspect that it is locking up somewhere blocking on some task / channel etc, and then a watchdog is triggering.
I have a core dump, stack memory, and also the code that is being run.
Is there any way to go from this to figuring out roughly where it might have been blocked?
bitphlipphar 05/14/2025 01:52 PM
Good question. I don't think we've done much post-mortem debugging of this sort up until now.
bitphlipphar 05/14/2025 01:53 PM
I assume it is likely hard to reproduce?
addshore 05/14/2025 01:53 PM
I believe it takes about an hour 😛
addshore 05/14/2025 01:53 PM
and yes involves custom hardware, talking to custom services etc
bitphlipphar 05/14/2025 01:53 PM
That doesn't sound too bad.
bitphlipphar 05/14/2025 01:53 PM
Oh, wait. Now it sounds bad.
bitphlipphar 05/14/2025 01:53 PM
🙂
addshore 05/14/2025 01:54 PM
I remember that my early "deadlocks" always seemed to trace back to some loop that would never yield.
I always managed to find them by adding more verbose logging inside the loops and figuring it out from that.
addshore 05/14/2025 01:54 PM
so perhaps that is still the best approach for now
bitphlipphar 05/14/2025 01:55 PM
Probably. I wonder if we could make the missed watchdog start with a soft signal that dumps a stack trace for all tasks or some such.
bitphlipphar 05/14/2025 01:59 PM
You could possibly also run with automatic logging of all task block/resume operations.
bitphlipphar 05/14/2025 02:00 PM
That would not really help if it is just forgotten yields, but if you're blocked on something, you should be able to tell.
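To illustrate the idea of logging block/resume operations, here is a rough, language-neutral sketch in Python. It is not Toit code and not how the Toit runtime does it. `BlockTracer`, the task names, and the reason strings are all made up for illustration. The point is only that if every suspend and resume is recorded, whatever was suspended last without a matching resume is a good candidate for the stuck task.

```python
class BlockTracer:
    """Hypothetical helper: records suspend/resume events per task."""

    def __init__(self):
        self.events = []      # Full ordered log of (kind, task, reason).
        self._pending = {}    # Tasks suspended and not yet resumed.

    def suspended(self, task, reason):
        # Call just before a task blocks.
        self.events.append(("suspend", task, reason))
        self._pending[task] = reason

    def resumed(self, task):
        # Call right after a task continues.
        self.events.append(("resume", task, None))
        self._pending.pop(task, None)

    def still_blocked(self):
        # Tasks that blocked and never came back.
        return dict(self._pending)


tracer = BlockTracer()
tracer.suspended("sensor", "waiting on channel")
tracer.resumed("sensor")
tracer.suspended("uploader", "waiting on mutex")

# At watchdog time, dump whatever never resumed.
blocked = tracer.still_blocked()
print(blocked)
```

As noted, this only helps when a task is really blocked. A loop that never yields would not show up as a suspend at all.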
bitphlipphar 05/14/2025 02:02 PM
This is where a task is suspended: https://github.com/toitlang/toit/blob/eb2f42303fad36ccd54da56b83ad9e0b2f564322/lib/core/task.toit#L340

This is where it is resumed later:
https://github.com/toitlang/toit/blob/eb2f42303fad36ccd54da56b83ad9e0b2f564322/lib/core/task.toit#L351

Adding something like print_ this around those lines might give you more insight (here, this is the task being suspended and resumed).
bitphlipphar 05/14/2025 02:04 PM
You may also be able to find the problematic task by having more than one watchdog.
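A rough sketch of the "more than one watchdog" idea, again in Python rather than Toit and purely as an illustration. `TaskWatchdogs`, its methods, and the task names are hypothetical and not part of any Toit or ESP-IDF API. Each task feeds its own named watchdog, and a checker reports which names have gone quiet for longer than the timeout, so you learn which task stopped making progress before the real watchdog resets the device.

```python
class TaskWatchdogs:
    """Hypothetical per-task software watchdogs, one per task name."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_fed = {}    # Task name -> time of last feed.

    def feed(self, name, now):
        # Each task calls this from its own main loop.
        self.last_fed[name] = now

    def stale(self, now):
        # Names whose last feed is older than the timeout.
        return sorted(
            name for name, fed_at in self.last_fed.items()
            if now - fed_at > self.timeout_s
        )


dogs = TaskWatchdogs(timeout_s=5.0)
dogs.feed("network", now=0.0)
dogs.feed("sensor", now=0.0)
dogs.feed("network", now=4.0)   # The network task keeps feeding.
dogs.feed("network", now=8.0)   # The sensor task has gone quiet.

stale_tasks = dogs.stale(now=8.0)
print(stale_tasks)
```

The timestamps are passed in explicitly to keep the sketch deterministic. On a device they would come from a monotonic clock, and the software timeout would need to be shorter than the hardware watchdog so the report gets printed first.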
addshore 05/14/2025 02:04 PM
Thanks for all the pointers. I like the print_ one and might take a look at that if I go ahead and reproduce it tomorrow.
bitphlipphar 05/14/2025 02:09 PM
That may only tell you that a specific task is blocked. The next step might involve getting some kind of stack trace dumped, but that might be too verbose.
17 messages in total