Toit
#Jaguar wifi closes (ESP32S3)
Thread channel in help
MikkelD 03/10/2023 12:00 PM
I am having an issue where a device flashed with Jaguar and just rebooted closes the wifi after about 40 seconds and does not reestablish the wifi connection.
A log from the console (with more wifi info than normal):
[wifi] DEBUG: connecting
I (6206) wifi:mode : sta (f4:12:fa:d4:9a:20)
I (6206) wifi:enable tsf
I (7416) wifi:new:<11,2>, old:<1,0>, ap:<255,255>, sta:<11,2>, prof:1
I (8096) wifi:state: init -> auth (b0)
I (8096) wifi:state: auth -> assoc (0)
I (8156) wifi:state: assoc -> run (10)
W (8166) wifi:<ba-add>idx:0 (ifx:0, 70:a7:41:ba:7e:46), tid:5, ssn:1, winSize:64
I (8166) wifi:connected with Spottune, aid = 1, channel 11, 40D, bssid = 70:a7:41:ba:7e:46
I (8166) wifi:security: WPA2-PSK, phy: bgn, rssi: -40
I (8166) wifi:pm start, type: 1
I (8166) wifi:set rx beacon pti, rx_bcn_pti: 0, bcn_timeout: 0, mt_pti: 25000, mt_time: 10000
[wifi] DEBUG: connected
I (8176) wifi:BcnInt:102400, DTIM:1
W (8246) wifi:<ba-add>idx:1 (ifx:0, 70:a7:41:ba:7e:46), tid:6, ssn:1, winSize:64
I (9246) esp_netif_handlers: sta ip: 192.168.1.152, mask: 255.255.255.0, gw: 192.168.1.1
[wifi] INFO: network address dynamically assigned through dhcp {ip: 192.168.1.152}
[wifi] INFO: dns server address dynamically assigned through dhcp {ip: [192.168.1.1]}
[jaguar] INFO: running Jaguar device 'stex_jig' (id: 'dc6b35c1-f193-45bb-a7d7-7926b54d3fbc') on 'http://192.168.1.152:9000'
I (38866) wifi:state: run -> init (6c0)
I (38866) wifi:pm stop, total sleep time: 16065060 us / 30696067 us
W (38866) wifi:<ba-del>idx
W (38866) wifi:<ba-del>idx
I (38866) wifi:new:<11,0>, old:<11,2>, ap:<255,255>, sta:<11,2>, prof:1
[wifi] DEBUG: closing
I (38866) wifi:flush txq
I (38866) wifi:stop sw txq
I (38866) wifi:lmac stop hw txq
I (38866) wifi:Deinit lldesc rx mblock:10
After this, the device needs to be rebooted before it starts Jaguar again.
Rikke 03/10/2023 01:50 PM
Is this with the latest version of Toit? I did see my device with alpha 63 close its wifi and never try to reconnect.
MikkelD 03/10/2023 02:19 PM
Yes, this is a very current toit/jaguar.
bitphlipphar 03/11/2023 05:33 AM
I am looking into this. I hope to have an update (and a fix) early next week.
bitphlipphar 03/11/2023 05:56 AM
What is the latest version where you didn't experience this problem? I assume you've been running on 47 for a while, @Rikke.
bitphlipphar 03/11/2023 06:31 AM
I'm thinking that this might be a problem exposed by https://github.com/toitlang/toit/commit/214ab6d5dfd4ae13138dc9adb3cfd12f44c85a40.
bitphlipphar 03/11/2023 06:42 AM
The close dance is rather complicated, so my first step will be to make this more robust. It feels like we forget to notify Jaguar that the network is closed, and I'm thinking that it might happen because we have concurrent modifications of some state in the wifi service. Are you running other code that uses the network on top of Jaguar when you encounter this?
bitphlipphar 03/11/2023 06:54 AM
https://github.com/toitlang/toit/blob/master/system/extensions/esp32/wifi.toit#L161 <-- this is where we start notifying other processes that the network is down, but we wait for them to call us back (close), which is a little bit shaky. They do this asynchronously, so we may be manipulating the resources map while we're notifying, which is clearly bad. Also, I think I'll close the resource eagerly and then notify.
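(Editor's sketch, not from the thread.) The hazard described above, a map being mutated by callbacks while the service is still iterating over it, and the proposed "close eagerly, then notify" ordering can be illustrated with a small Python analogy. This is not the Toit wifi service: the names `WifiService`, `Resource`, `release`, `close_fragile` and `close_robust` are all hypothetical, and the callbacks here run synchronously rather than from other processes.

```python
class Resource:
    """Hypothetical stand-in for a network resource handed to a client."""
    def __init__(self, client):
        self.client = client
        self.closed = False

    def close(self):
        self.closed = True


class WifiService:
    """Hypothetical service that tracks one resource per client."""
    def __init__(self):
        self.resources = {}  # client name -> Resource

    def open(self, client):
        self.resources[client] = Resource(client)

    def release(self, client):
        # Called back by a client once it learns the network is down.
        # pop(..., None) keeps this idempotent if the entry is already gone.
        self.resources.pop(client, None)

    def close_fragile(self, notify):
        # Notifies while iterating the live map. If a callback calls
        # release(), the map changes under the iterator.
        for client in self.resources:
            notify(client)

    def close_robust(self, notify):
        # 1. Snapshot and detach the map so callbacks cannot disturb it.
        snapshot = list(self.resources.values())
        self.resources.clear()
        # 2. Close every resource eagerly, without waiting for callbacks.
        for resource in snapshot:
            resource.close()
        # 3. Only then tell the clients.
        for resource in snapshot:
            notify(resource.client)
        return snapshot
```

In this sketch `close_fragile` fails as soon as a notified client calls `release`, while `close_robust` reaches every client because it iterates over a private snapshot and no longer depends on the clients calling back.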
bitphlipphar 03/11/2023 06:54 AM
But first step is to try to find a good repro.
bitphlipphar 03/11/2023 08:10 AM
I think I have a repro.
(In reply to bitphlipphar: "What is the latest version where you didn't experience this problem? I assume you've been running on 47 for a while, @Rikke.")
Rikke 03/11/2023 08:42 AM
47, yes, and I'm not sure if it is exactly the same problem. I was unable to reproduce it on a second device, but the first device ended up closing wifi and never recovering, like 5 times in a row.
bitphlipphar 03/11/2023 08:50 AM
I have a repro that shows that we're losing close notifications, so Jaguar is never informed.
bitphlipphar 03/11/2023 08:50 AM
Easy to fix. Should be ready for testing Monday morning.
bitphlipphar 03/11/2023 09:26 AM
Got a fix ready. Will test it a bit more.
bitphlipphar 03/11/2023 09:27 AM
If I am right, it's not a Jaguar issue, but a general resource notification problem.
(In reply to bitphlipphar, 03/11/2023 06:42 AM: "The close dance is rather complicated, so my first step will be to make this more robust. It feels like we forget to notify Jaguar that the network is closed and I'm thinking that ...")
MikkelD 03/11/2023 10:08 AM
no
bitphlipphar 03/11/2023 03:00 PM
Okay. In that case, I think there are two problems. One simple one in Jaguar. When I started using Task.group I got it wrong.
bitphlipphar 03/11/2023 03:00 PM
(February 6)
bitphlipphar 03/11/2023 03:05 PM
Before my change, we would use the UDP broadcasting to discover that the network was closed and then take down the http server.
bitphlipphar 03/11/2023 03:06 PM
After my change, we only take it down and restart if the broadcasting throws.
bitphlipphar 03/11/2023 03:06 PM
Woops.
bitphlipphar 03/11/2023 03:07 PM
(we just need to use Task.group --required=1)
bitphlipphar 03/11/2023 03:17 PM
Without this change, the UDP broadcasting would terminate nicely when the network closed and we would keep the HTTP server in a listening state (not getting anything through the closed network).
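(Editor's sketch, not from the thread.) The behaviour being fixed here, stopping the whole group as soon as one task finishes instead of leaving the HTTP server listening on a dead network, can be shown with a Python asyncio analogy. It assumes that `Task.group --required=1` means "finish once one task has completed and cancel the rest", which is how the messages above describe it. The functions `broadcast`, `serve_http` and `run_until_first_done` are hypothetical stand-ins, not Jaguar code.

```python
import asyncio


async def broadcast(network_closed):
    # Hypothetical UDP broadcaster: returns normally (no exception)
    # once the network has been closed.
    await network_closed.wait()


async def serve_http(stopped):
    # Hypothetical HTTP server: listens forever unless it is cancelled.
    try:
        await asyncio.Event().wait()
    except asyncio.CancelledError:
        stopped.append("http")
        raise


async def run_until_first_done(coroutines):
    # Run all coroutines concurrently; when the first one completes,
    # cancel the others so the caller can tear down and restart.
    tasks = [asyncio.ensure_future(c) for c in coroutines]
    done, pending = await asyncio.wait(
        tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # Wait for the cancellations to take effect.
    await asyncio.gather(*pending, return_exceptions=True)
    for task in done:
        task.result()  # re-raise if the finished task failed
    return len(done)


async def demo():
    network_closed = asyncio.Event()
    stopped = []
    # Simulate the network closing shortly after startup.
    asyncio.get_running_loop().call_later(0.02, network_closed.set)
    finished = await run_until_first_done(
        [broadcast(network_closed), serve_http(stopped)])
    return finished, stopped
```

Waiting for every task instead (for example with a plain `asyncio.gather`) would reproduce the bug in this analogy: the broadcaster ends quietly, the server never does, and nothing triggers a restart.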
bitphlipphar 03/11/2023 03:17 PM
The old code (before starting to use Task.group) was correct: https://github.com/toitlang/jaguar/commit/0170761a87c43ad025ddbec0e77b36ff66d9f31f.
bitphlipphar 03/11/2023 03:17 PM
There is still a bug in resource notifications, but it requires more than one net.open call.
bitphlipphar 03/13/2023 09:02 AM
Both fixes have landed in SDK v2.0.0-alpha.64 and Jaguar v1.9.10.
26 messages in total