Toit
#Looping MALLOC_FAILED in jaguar
Thread channel in help
addshore 09/05/2025 04:58 PM
I got this looping MALLOC_FAILED in jaguar today, while not really running anything else on the device

[jaguar] WARN: running Jaguar failed due to 'MALLOC_FAILED' (2/3)
[jaguar.http] INFO: running Jaguar device 'long-expert' (id: '736b8804-dcdf-4d96-890a-8785c1bfa31d') on 'http://192.168.68.50:9000'
Heap report @ out of memory in primitive 3:4:
┌───────────┬──────────┬─────────────────────────────────────────────────────┐
│   Bytes   │  Count   │  Type                                               │
├───────────┼──────────┼─────────────────────────────────────────────────────┤
│      7400 │      728 │  heap overhead                                      │
│    258064 │      691 │  untagged                                           │
└───────────┴──────────┴─────────────────────────────────────────────────────┘
Total: 265464 bytes in 691 allocations (81%), largest free 44k, total free 62k
[jaguar] WARN: running Jaguar failed due to 'MALLOC_FAILED' (3/3)
[jaguar] INFO: backing off for 5s
addshore 09/05/2025 04:58 PM
This ultimately led to a device crash
addshore 09/05/2025 04:59 PM
I possibly could have been trying to do a wifi scan, however I imagine that would have immediately failed and shouldn't be an issue
floitsch 09/06/2025 11:40 AM
Primitive 3:4 is UDP-Send.
It's likely the Jaguar broadcast failed.
Given that you have sufficient memory, my guess is that something internal went bad and reported a memory-failure even though that's not exactly what happened.
For now there is little to go on. If we see this more frequently we need to investigate.
👍1
addshore 09/09/2025 09:30 AM
The UDP broadcast is pretty frequent, from memory?
Is it limited to one at a time, and is there a timeout for each attempt?
floitsch 09/09/2025 09:31 AM
There is a loop that broadcasts every few ms. That should be sequential.
The underlying network should be able to handle multiple requests (potentially by blocking, but I don't know exactly).
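A minimal sketch of what such a sequential broadcast loop could look like in Toit. This is an illustration, not Jaguar's actual code: the port number, the payload, and the 200 ms interval are assumptions chosen for the example.

```toit
import net
import net.udp

// Hypothetical values, for illustration only.
BROADCAST-PORT ::= 12345
BROADCAST-INTERVAL-MS ::= 200

main:
  network := net.open
  socket := network.udp-open
  try:
    socket.broadcast = true
    datagram := udp.Datagram
        "hello".to-byte-array
        net.SocketAddress
            net.IpAddress.parse "255.255.255.255"
            BROADCAST-PORT
    while true:
      // Each send returns (or throws) before the next one starts, so
      // this loop never has more than one broadcast in flight.
      socket.send datagram
      sleep --ms=BROADCAST-INTERVAL-MS
  finally:
    socket.close
    network.close
```

If socket.send throws, as with the MALLOC_FAILED reported above, the exception propagates out of the loop and the finally clause closes the socket and the network.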
addshore 09/11/2025 10:18 AM
Getting more things like this while running different code on my side

[jaguar] WARN: running Jaguar failed due to 'MALLOC_FAILED' (2/3)
[jaguar.http] INFO: running Jaguar device 'national-diamond' (id: '94879073-44c2-49d1-b1d3-650a4d37c067') on 'http://192.168.1.107:9000'
Heap report @ out of memory in primitive 3:4:
┌───────────┬──────────┬─────────────────────────────────────────────────────┐
│   Bytes   │  Count   │  Type                                               │
├───────────┼──────────┼─────────────────────────────────────────────────────┤
│      7808 │      778 │  heap overhead                                      │
│    251640 │      737 │  untagged                                           │
└───────────┴──────────┴─────────────────────────────────────────────────────┘
Total: 259448 bytes in 737 allocations (79%), largest free 28k, total free 68k
[wifi] DEBUG: closing
fatal: esp_event_handler_unregister(base, id, on_event)
addshore 09/11/2025 10:19 AM
Which ultimately led to a crash
floitsch 09/11/2025 10:22 AM
I will have a look at the code and see if there is something that sticks out.
addshore 09/11/2025 11:32 AM
I think it likely happens while doing a BLE scan and/or a wifi scan at the same time as Jaguar is trying to do its thing
floitsch 09/11/2025 11:32 AM
good to know.
floitsch 09/11/2025 11:32 AM
if you have something that is (mostly) reproducible, that could help.
addshore 09/11/2025 11:37 AM
I wonder if https://github.com/lightbug-io/toit-lightbug/blob/94f6469f588ef1f7f14ae2c131ca457f1cda536b/examples/modules/ble/advanced.toit works out of the box without one of our devices
I think it should / might just work on any ESP
But we are specifically on a C6
addshore 09/11/2025 11:38 AM
I don't have "just" an esp32c6 here to try it on right now, but if it fails I can likely hack it to work
addshore 09/11/2025 11:39 AM
In fact, if it fails, I think changing line 5 to be device := devices.Fake will mean it will work
addshore 09/11/2025 11:40 AM
Though that has gotten to scan #24 with no issue
With the real device it died at #11
Which implies that it is the combination of this example code doing BLE scans while, simultaneously in the background, other code is doing wifi and BLE scans
addshore 09/11/2025 11:43 AM
Yes, so using Fake and also adding this before the loop in that code

task::
  while true:
    print "=== Background scans ==="
    sleep --ms=15000
    device.ble.scan --async --duration=3000
    sleep --ms=500
    device.wifi.scan --async --duration=3000

Makes it happen a bit
addshore 09/11/2025 11:44 AM
My main surprise is how it affects Jaguar
However, I'm also now acutely aware I should add some extra handling around my scans to make sure I'm not trying to do too much at once
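A minimal sketch of that kind of extra handling, assuming each scan can be wrapped in a block. It uses the Mutex class from Toit's monitor library; run-exclusive is a hypothetical helper, and the sleep calls are stand-ins for the real BLE and wifi scans. A scan started with --async would return immediately, so the guard only helps if the block stays inside the mutex until the scan has actually finished.

```toit
import monitor

// Shared guard so that only one radio scan runs at a time.
scan-mutex ::= monitor.Mutex

// Hypothetical helper: runs the given scan block while holding the mutex.
run-exclusive name/string [scan] -> none:
  scan-mutex.do:
    print "start $name scan"
    scan.call
    print "done $name scan"

main:
  task::
    while true:
      run-exclusive "ble": sleep --ms=3000   // Stand-in for a BLE scan.
      sleep --ms=500
  task::
    while true:
      run-exclusive "wifi": sleep --ms=3000  // Stand-in for a wifi scan.
      sleep --ms=500
```

With this in place the second task blocks in scan-mutex.do until the first has left its block, so the two stand-in scans never overlap. It does not stop Jaguar's own networking from running at the same time.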
floitsch 09/11/2025 11:46 AM
You could also remove the broadcasts
floitsch 09/11/2025 11:47 AM
You would then need to give the address to scan but that's not a big deal.
addshore 09/11/2025 11:48 AM
I was thinking that this is likely only an issue during development
And wondering if I could easily add a check: if running under jag, disable the jag networking stuff for a second, do the scans, and turn it back on?
addshore 09/11/2025 11:48 AM
And tbh, I don't use the broadcasts from jag right now, always providing my specific --device
floitsch 09/11/2025 11:49 AM
Might as well disable them entirely. That's a flag I would accept for Jaguar
👍1
floitsch 09/11/2025 11:49 AM
--device uses scan in the background. I wonder if that still works nicely.
addshore 09/11/2025 11:49 AM
Well, I'm normally passing an IP to --device anyway 🙂
floitsch 09/11/2025 11:50 AM
Then definitely 🙂
addshore 09/11/2025 12:40 PM
I wonder if I should write these on GitHub so I don't forget!
addshore 09/11/2025 12:46 PM
Per discussion in https://discord.com/channels/918498540232253480/1413568907985420339, doing wifi scans and/or bluetooth scans can result in errors relating to Jaguar UDP broadcasts. [jaguar] WARN: ...