guild icon
Toit
#Unable to OTA after upgrading to version 35
Thread channel in help
tplux
tplux 10/25/2022 06:00 AM
Hi
After upgrading to version 35, we are not able to do OTA. Our device retrieves the image via http, and the image is written in the same way as install_firmware in jaguar.toit. But after a few writes, it looks like the write method gets stuck, and the app halts without any indication. The only way to get it out of this is a hard reset. The device is already running with 35, so there shouldn't be an issue with the partitions.
tplux
tplux 10/25/2022 06:30 AM
I have added a 2 ms sleep after each write, and now the ota works sometimes. Sometimes after committing and resetting, I get a panic:
tplux
tplux 10/25/2022 06:32 AM
Firmware updated; doing chip reset
ets Jun 8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:184
load:0x40078000,len:12700
ho 0 tail 12 room 4
load:0x40080400,len:2916
entry 0x400805c4
E (633) psram: PSRAM ID read error: 0xffffffff
E (633) spiram: SPI RAM enabled but initialization failed. Bailing out.
[toit] INFO: starting <v2.0.0-alpha.35>
[wifi] DEBUG: connecting
Guru Meditation Error: Core 1 panic'ed (IllegalInstruction). Exception was unhandled.
Memory dump at 0x4018d45c: f03df03d 00000000 ffffbf00
Core 1 register dump:
PC : 0x4018d460 PS : 0x00060c30 A0 : 0x8018a7cd A1 : 0x3ffd8840
A2 : 0x00000001 A3 : 0x3ffc8d80 A4 : 0x00100000 A5 : 0x09ee0087
A6 : 0x008d2018 A7 : 0x00000003 A8 : 0x3ffc8df2 A9 : 0x00000006
A10 : 0x00000011 A11 : 0x0000003b A12 : 0x3ffc8d91 A13 : 0x3ffc07fc
A14 : 0x0000003c A15 : 0x00000000 SAR : 0x00000007 EXCCAUSE: 0x00000000
EXCVADDR: 0x00000000 LBEG : 0x4018d449 LEND : 0x4018d45c LCOUNT : 0x00000000
tplux
tplux 10/25/2022 06:33 AM
**
Decoding by jag, device has version <2.0.0-alpha.35>
**
WARNING: extracting a specific part is deprecated
Crash in native code:
Backtrace:0x4018d43d:0x3ffd88400x4018a7ca:0x3ffd8860 0x4018b23a:0x3ffd8880 0x40145659:0x3ffd88b0 0x401456b9:0x3ffd88d0 0x4016e60a:0x3ffd88f0 0x4016ed75:0x3ffd8910 0x4016d6b4:0x3ffd8930 0x401adb81:0x3ffd8950
0x4018d43d: tx_pwctrl_init + 0x31
0x4018b23a: register_chipv7_phy + 0x11e
0x40145659: esp_phy_load_cal_and_init + 0x3d
0x401456b9: esp_phy_enable + 0x51
0x4016e60a: wifi_hw_start + 0x76
0x4016ed75: wifi_start_process + 0x49
0x4016d6b4: ieee80211_ioctl_process + 0x5c
0x401adb81: ppTask + 0xb1
**





ELF file SHA256: 799463178f1588c9
bitphlipphar
bitphlipphar 10/25/2022 06:36 AM
So the writes you are referring to are FirmwareWriter.write, right?
bitphlipphar
bitphlipphar 10/25/2022 06:37 AM
How much other stuff is running while you do the OTA?
tplux
tplux 10/25/2022 06:41 AM
Nevermind :馃槄: . It was a power issue. It works just as before :馃憤:
tplux
tplux 10/25/2022 06:42 AM
I think it's safe to delete this topic.
tplux
tplux 10/26/2022 08:56 AM
So it seems it wasn't a power issue after all. When the mqtt traffic is low, it works fine, but with just a few messages each second, the writer stops everything after a while. Even if I spawn the firmware writing.
bitphlipphar
bitphlipphar 10/26/2022 09:06 AM
So lots of WiFi traffic kills the firmware updating?
tplux
tplux 10/26/2022 09:07 AM
It seems so.
tplux
tplux 10/26/2022 10:55 AM
Let me know, if I should try someting specific. I've tried different things but nothing seems to make it better. My biggest concern is that the app stops completely - I haven't found a way to avoid that.
bitphlipphar
bitphlipphar 10/26/2022 02:59 PM
Is it an option to stop the MQTT client when you're doing firmware updates or is that super unattractive?
tplux
tplux 10/27/2022 05:39 AM
It could be possible.
tplux
tplux 10/27/2022 10:00 AM
If I stop all other tasks, the ota succeeds most of the times. Sometimes it results in a panic, but other times it just stops executing, which is quite critical, since a hard reset is required. Any idea what happens when it stops completely?
bitphlipphar
bitphlipphar 10/27/2022 10:16 AM
I'm looking into it! There's an odd case in the firmware writer that we might need to talk about, but I doubt this fully explains what you're seeing.
tplux
tplux 10/27/2022 10:22 AM
:馃憤: :馃檪:
bitphlipphar
bitphlipphar 10/27/2022 10:57 AM
Found a pretty bad bug in our RPC system that we introduced lately (it happens when we're temporarily low on memory).
bitphlipphar
bitphlipphar 10/27/2022 10:59 AM
Introduced in alpha.34, I think.
tplux
tplux 10/27/2022 11:07 AM
Interesting. That could explain some of it...
tplux
tplux 10/27/2022 11:39 AM
I tried it out, but unfortunately I still have the same behaviour.
bitphlipphar
bitphlipphar 10/27/2022 05:22 PM
I'll keep looking. Did you just patch in the change locally and forced a rebuild? The PR just landed.
tplux
tplux 10/28/2022 05:33 AM
Yes, I just copied it in :馃檪:
tplux
tplux 10/28/2022 10:05 AM
I am trying to debug the problem a bit by placing ets_printf in different locations, and so far I have come to the conclusion that it gets stuck in PRIMITIVE(ota_write) in the call to esp_partition_erase_range - at least it doesn't get to the ets_printf I placed just after that.
bitphlipphar
bitphlipphar 10/28/2022 10:26 AM
That is super useful. Thanks :馃檹:
MikkelD
MikkelD 10/28/2022 01:56 PM
@tplux When you are doing erase_range, any code that is not in iram can not run. We modified the iram setup in 35. I wonder if that is the cause.
MikkelD
MikkelD 10/28/2022 01:58 PM
Although the ESP32 should disable any code that tries to run during a flash exclusive write, I have seen this fail.
MikkelD
MikkelD 10/28/2022 01:58 PM
(Mostly when trying to access memory in SPIRAM, this is not relevant here though)
tplux
tplux 10/29/2022 08:59 AM
Ok. That's worth looking into.
tplux
tplux 10/31/2022 06:07 AM
Which files should I look into to see what changed in the iram setup?
MikkelD
MikkelD 10/31/2022 08:37 AM
sdkconfig basicly. There are some IRAM config parameters.
馃憤1
tplux
tplux 10/31/2022 02:28 PM
I'm still strugling with this. Regarding the IRAM config, I'm wary of messing with it, since I guess there's a good reason for the current setup. I did try to set CONFIG_SPIRAM_CACHE_LIBNUMPARSER_IN_IRAM=y, but I was not able to link because there was not enough space in the iram section.
Instead I'm stopping all tasks not handling the ota to minimize any unwanted activity - this seemed to work for a while.
I tested by updating between 3 bin files:
test1.bin updates to test2.bin
test2.bin updates to test3.bin
test3.bin updates to test1.bin
and so on.
I don't know if it's a coincidence, but it started failing again when I accidentally messed up the order, so test3.bin updated to test2.bin, and then test2.bin was not able to update to test1.bin.
BTW, the "failing" is still everything stopping leaving only a hard reset to save the day - that's not viable.
bitphlipphar
bitphlipphar 10/31/2022 02:34 PM
We're here to help. Do you have time for a chat tomorrow?
tplux
tplux 10/31/2022 02:34 PM
Absolutely :馃檪:
bitphlipphar
bitphlipphar 11/01/2022 12:28 PM
We just had a brief discussions about this and @floitsch remembered that we ended up moving FreeRTOS out of IRAM as part of the upgrade to alpha.35. This way due to PSRAM support, so if you don't need that maybe we can find a way of going back to having FreeRTOS in IRAM by disabling the PSRAM compiler workarounds that make generated code (and thus IRAM requirements) grow.
floitsch
floitsch 11/01/2022 12:33 PM
Fwiw, the documentation states that it should be safe to not have them in iram:
Provided these functions are not (incorrectly) used from ISRs, this option is safe to enable in all configurations.

I will still try to get a version with the Freertos in IRAM to work again.
tplux
tplux 11/01/2022 12:50 PM
Thanks :馃檪:
MikkelD
MikkelD 11/01/2022 02:57 PM
@tplux Are you trying to access the sdcard while flashing? Assuming you use the sdcard code.
floitsch
floitsch 11/01/2022 04:00 PM
@tplux: I wasn't able to put freertos back into iram. Wasn't just the flag for the external RAM...
However. If you don't use Bluetooth, you could disable that and put freertos into iram instead.
diff --git a/toolchains/esp32/sdkconfig.defaults b/toolchains/esp32/sdkconfig.defaults index 3947a427..7f18132a 100644 --- a/toolchains/esp32/sdkconfig.defaults +++ b/toolchains/esp32/sdkconfig.defaults @@ -6,7 +6,7 @@ CONFIG_ESPTOOLPY_FLASHSIZE_4MB=y CONFIG_PARTITION_TABLE_CUSTOM=y CONFIG_COMPILER_OPTIMIZATION_SIZE=y CONFIG_COMPILER_OPTIMIZATION_ASSERTIONS_DISABLE=y -CONFIG_BT_ENABLED=y +#CONFIG_BT_ENABLED=y CONFIG_BTDM_CTRL_BLE_MAX_CONN=2 CONFIG_BTDM_CTRL_PINNED_TO_CORE_1=y # CONFIG_BTDM_CTRL_MODEM_SLEEP is not set @@ -57,7 +57,7 @@ CONFIG_FREERTOS_THREAD_LOCAL_STORAGE_POINTERS=3 CONFIG_FREERTOS_IDLE_TASK_STACKSIZE=1024 CONFIG_FREERTOS_ISR_STACKSIZE=2096 CONFIG_FREERTOS_USE_TRACE_FACILITY=y -CONFIG_FREERTOS_PLACE_FUNCTIONS_INTO_FLASH=y +#CONFIG_FREERTOS_PLACE_FUNCTIONS_INTO_FLASH=y CONFIG_LOG_DEFAULT_LEVEL_ERROR=y # CONFIG_LOG_COLORS is not set CONFIG_LWIP_LOCAL_HOSTNAME="toit"
Then delete the toolchains/esp32/sdkconfig file and build the image again.

You can also just run idf.py -C toolchains/esp32 -B build/esp32 size-components (assuming you have sourced the export.sh: source third_party/esp-idf/export.sh).
MikkelDMikkelD
@tplux Are you trying to access the sdcard while flashing? Assuming you use the sdcard code.
tplux
tplux 11/02/2022 07:05 AM
I am using the sdcard code, but to rule that out, I tried without it, and unfortunately that didn't help.
floitschfloitsch
@tplux: I wasn't able to put freertos back into iram. Wasn't just the flag for the external RAM... However. If you don't use Bluetooth, you could disable that and pu...
tplux
tplux 11/02/2022 07:06 AM
I will try this out. For now at least we don't need BT.
tplux
tplux 11/02/2022 07:34 AM
It seems disabling BT and putting freertos back into iram did the trick. I will keep trying to see if it fails again.
bitphlipphar
bitphlipphar 11/02/2022 07:42 AM
Very interesting. @dongheng might know why we need to keep FreeRTOS in IRAM.
MikkelD
MikkelD 11/02/2022 09:12 AM
Wow. A bit read eared here. I just checked our sdkconfig on the ESP32 with working flash, and we have indeed done now what is descibed here:
# CONFIG_BT_ENABLED is not set # CONFIG_FREERTOS_PLACE_FUNCTIONS_INTO_FLASH is not set # CONFIG_FREERTOS_PLACE_SNAPSHOT_FUNS_INTO_FLASH is not set
I wonder if there is a bigger problem with OTA in toit on ESP32 now....
bitphlippharbitphlipphar
Very interesting. @dongheng might know why we need to keep FreeRTOS in IRAM.
dongheng
dongheng 11/02/2022 10:00 AM
Hi You can see the information I shared in general channel: Flash writing/reading/erasing has much usage limitation, as I known mainly about:

1. Task stack mustn't be PSRAM, becasue the cache for PSRAM will be disable
2. Interrupt(flags=ESP_INTR_FLAG_IRAM) process functions(include calling function one by one layer) should be put into IRAM, because cache for flash will be disable
3. All CPUs will switch to special tasks(IPC), and other tasks can't run and it looks like being blocked
dongheng
dongheng 11/02/2022 10:01 AM
Some FreeRTOS functions are called in interrupt service, especially semaphore, queue and some task related.
donghengdongheng
Hi You can see the information I shared in general channel: Flash writing/reading/erasing has much usage limitation, as I known mainly about: 1. Task stack mustn't be PSRAM, bec...
floitsch
floitsch 11/02/2022 10:07 AM
@erikcorry_arbat and @bitphlipphar might correct me, but for Toit we should have:
1. the stack is not in PSRAM. We barely started using PSRAM, as we had difficulties with the ESP32 errata (and our compile flags weren't working around the issue).
2. I don't think we wrote our own interrupt handlers. They should all be from the esp-idf, and should therefore in in iram.
3. Not sure about that one. There shouldn't be any dependency on other tasks. Also wouldn't explain why things start to work when we put freertos back into iram.

@dongheng: do you know if there is an easy way to see where the issue comes from? Which function is trying to access something that it shouldn't do?
MikkelD
MikkelD 11/02/2022 10:12 AM
I think I placed panic handlers in IRAM at some point, that helped.
MikkelD
MikkelD 11/02/2022 10:12 AM
CONFIG_ESP_PANIC_HANDLER_IRAM+
floitsch
floitsch 11/02/2022 10:14 AM
@tplux if you have a few minutes at some point, it would be interesting to see what you get with the CONFIG_ESP_PANIC_HANDLER_IRAM config.
MikkelD
MikkelD 11/02/2022 10:14 AM
And ofc moving the FreeRtos back to flash and enable BT :馃檪:
MikkelD
MikkelD 11/02/2022 10:16 AM
Also, another hack I used when fiddling with this, is to just kick the toitvm out of IRAM to have plenty to work with. It is the interpreter.h line 27
tplux
tplux 11/02/2022 10:54 AM
Sure, I'll let you know how it goes.
tplux
tplux 11/02/2022 11:01 AM
With freertos back in flash, BT enabled and CONFIG_ESP_PANIC_HANDLER_IRAM=y the iram section overflows.
tpluxOPtplux
With freertos back in flash, BT enabled and CONFIG_ESP_PANIC_HANDLER_IRAM=y the iram section overflows.
floitsch
floitsch 11/02/2022 11:10 AM
Try without bluetooth.
floitsch
floitsch 11/02/2022 11:11 AM
The current hypothesis is that freertos needs to be in iram to make the flashing work.
According to the docs that shouldn't be the case.
If we put freertos back into flash, and have the panic handler enabled, we might get an error message telling us which function tries to access a freertos function.
tplux
tplux 11/02/2022 11:16 AM
Got it. On it.
floitschfloitsch
@erikcorry_arbat and @bitphlipphar might correct me, but for Toit we should have: 1. the stack is not in PSRAM. We barely started using PSRAM, as we had difficulties w...
dongheng
dongheng 11/02/2022 11:18 AM
Sometimes the panic can show the backtrace, so we can know where casue this issue, but to see more details we should de-assembly this ELF file and check the space which is the address value of backtrace.
tplux
tplux 11/02/2022 11:21 AM
**
Decoding by jag, device has version <2.0.0-alpha.35>
**
WARNING: extracting a specific part is deprecated
Crash in native code:
Backtrace:0x401910c1:0x3ffb12700x40086b03:0x3ffb1290 0x4008be95:0x3ffb1300 0x4008849a:0x3ffaf430 0x4008c3bf:0x3ffaf450
0x401910c1: prvAcquireItemNoSplit + 0x51
0x4008be95: _xt_lowint1 + 0x89
0x4008849a: spi_flash_op_block_func + 0x22
0x4008c3bf: ipc_task + 0x6f
**
dongheng
dongheng 11/02/2022 11:23 AM
Could you send the ELF and bin to me ?
donghengdongheng
Could you send the ELF and bin to me ?
tplux
tplux 11/02/2022 11:28 AM
The ELF file is too big for discord... :馃槵:
tplux
tplux 11/02/2022 11:29 AM
How do I check it?
dongheng
dongheng 11/02/2022 11:36 AM
I see the backtrace, but it looks like not all corrent. Maybe spi_flash_op_block_func triggers issue, but also looks like a interrupt triggers by _xt_lowint1, so I can't check directly.
dongheng
dongheng 11/02/2022 11:37 AM
Could you push your branch to github ? Then I can pull and run it
tplux
tplux 11/02/2022 11:49 AM
If you can settle for the elf file, I think my colleague has nitro, so she can send bigger files...
floitsch
floitsch 11/02/2022 11:52 AM
@tplux
it looks like Jaguar is decoding the native stack-trace the wrong way.
Try to set the JAG_TOIT_REPO_PATH environment variable to your Toit repository.
If you already did that, then use the esp-idf monitor instead.
Let me find the command line for the esp-idf monitor line.
floitsch
floitsch 11/02/2022 11:53 AM
Make sure that you have sourced third_party/esp-idf/export.sh.
Then run idf.py -C toolchains/esp32 -B build/esp32 monitor
floitsch
floitsch 11/02/2022 11:54 AM
That should, hopefully, print the correct trace.
floitsch
floitsch 11/02/2022 11:54 AM
The reason for the mix-up, is that Jaguar assumes that the firmware that is running, is the one that Jaguar knows about.
I ran into the same thing when I was looking at the ESP32-S2.
We just discussed this, and will try to make this more stable.
tplux
tplux 11/02/2022 12:13 PM
Guru Meditation Error: Core 0 panic'ed (Cache disabled but cached memory region accessed).

Core 0 register dump:
PC : 0x401910c4 PS : 0x00060034 A0 : 0x80086b06 A1 : 0x3ffb1270
0x401910c4: prvCheckItemFitsByteBuffer at /home/tommy/remoni_repos/toit/third_party/esp-idf/components/esp_ringbuf/ringbuf.c:325

A2 : 0x3ffe0ed0 A3 : 0x3ffe0b24 A4 : 0x0000001c A5 : 0x3ffb12b8
A6 : 0x401910c4 A7 : 0x3ffe0fc4 A8 : 0x8009261d A9 : 0x3ffb1230
0x401910c4: prvCheckItemFitsByteBuffer at /home/tommy/remoni_repos/toit/third_party/esp-idf/components/esp_ringbuf/ringbuf.c:325

A10 : 0x3ffe0ed0 A11 : 0x0000001c A12 : 0x3ffb5080 A13 : 0x0000cdcd
A14 : 0x00060023 A15 : 0x00060021 SAR : 0x00000000 EXCCAUSE: 0x00000007
EXCVADDR: 0x00000000 LBEG : 0x4008fc94 LEND : 0x4008fcb0 LCOUNT : 0xffffffff
0x4008fc94: memcpy at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/machine/xtensa/memcpy.S:175

0x4008fcb0: memcpy at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/machine/xtensa/memcpy.S:197



Backtrace:0x401910c1:0x3ffb12700x40086b03:0x3ffb1290 0x4008be95:0x3ffb1300 0x40008544:0x3ffc64a0 0x400882c1:0x3ffc64c0 0x4009c7bd:0x3ffc64e0 0x4009c5c1:0x3ffc6510 0x40087d87:0x3ffc6530 0x4010176b:0x3ffc6580 0x400ecfa3:0x3ffc65a0 0x40083c1e:0x3ffc6630 0x400f1f20:0x3ffc66c0 0x400f21c6:0x3ffc6710 0x400f21f5:0x3ffc6740 0x400e4e16:0x3ffc6760 0x400e4e3d:0x3ffc6780
tplux
tplux 11/02/2022 12:13 PM
0x401910c1: prvAcquireItemNoSplit at ringbuf.c:?

0x40086b03: uart_rx_intr_handler_default at /home/tommy/remoni_repos/toit/third_party/esp-idf/components/driver/uart.c:978

0x4008be95: _xt_lowint1 at /home/tommy/remoni_repos/toit/third_party/esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1111

0x400882c1: delay_us at /home/tommy/remoni_repos/toit/third_party/esp-idf/components/spi_flash/spi_flash_os_func_app.c:148

0x4009c7bd: spi_flash_chip_generic_wait_idle at /home/tommy/remoni_repos/toit/third_party/esp-idf/components/spi_flash/spi_flash_chip_generic.c:443

0x4009c5c1: spi_flash_chip_generic_erase_sector at /home/tommy/remoni_repos/toit/third_party/esp-idf/components/spi_flash/spi_flash_chip_generic.c:162

0x40087d87: esp_flash_erase_region at /home/tommy/remoni_repos/toit/third_party/esp-idf/components/spi_flash/esp_flash_api.c:616

0x4010176b: esp_partition_erase_range at /home/tommy/remoni_repos/toit/third_party/esp-idf/components/spi_flash/partition.c:545

0x400ecfa3: toit::primitive_ota_write(toit::Process, toit::Object**) at ./build/esp32/./src/primitive_esp32.cc:162
(inlined by) primitive_ota_write at ./build/esp32/./src/primitive_esp32.cc:116

0x40083c1e: toit::Interpreter::run() at ./build/esp32/./src/interpreter_run.cc:966

0x400f1f20: toit::Scheduler::run_process(toit::Locker&, toit::Process
, toit::SchedulerThread) at ./build/esp32/./src/scheduler.cc:627

0x400f21c6: toit::Scheduler::run(toit::SchedulerThread
) at ./build/esp32/./src/scheduler.cc:404

0x400f21f5: toit::SchedulerThread::entry() at ./build/esp32/./src/scheduler.cc:38

0x400e4e16: toit::Thread::_boot() at ./build/esp32/./src/os_esp32.cc:308

0x400e4e3d: toit::thread_start(void*) at ./build/esp32/./src/os_esp32.cc:295
(inlined by) esp_thread_start at ./build/esp32/./src/os_esp32.cc:300
tplux
tplux 11/02/2022 12:15 PM
dongheng
dongheng 11/02/2022 12:26 PM
Fisrtly we can see that: Cache disabled but cached memory region accessed, it means cache for external memory(flash or psram) is disable, but software try to access it,. Then 0x401910c4 show where it triggers issues, if there its ASM code, it will be better...
dongheng
dongheng 11/02/2022 12:27 PM
The process I check is:
1. erase flash -> close cache and wait for it is done
2. a UART IRQ triggers and send data to ring_buffer
3. crash in ring_buffer's code which access flash or psram
dongheng
dongheng 11/02/2022 12:33 PM
why the address is 0x401910c4, I think this out of range of IRAM...
MikkelD
MikkelD 11/02/2022 12:33 PM
In this case there is no PSRAM attached.
MikkelD
MikkelD 11/02/2022 12:33 PM
ringbuf.c is probably not in iram?
dongheng
dongheng 11/02/2022 12:34 PM
IRAM space is from 0x40080000 to 0x400a0000.
MikkelD
MikkelD 11/02/2022 12:35 PM
Ah !
MikkelD
MikkelD 11/02/2022 12:35 PM
The esp_ringbuf is placed in IRAM if the FREERTOS is in IRAM.
MikkelD
MikkelD 11/02/2022 12:36 PM
esp-idf/components/esp_ringbug/linker.lf line 5(edited)
dongheng
dongheng 11/02/2022 12:36 PM
Flash space is from 0x400d0020, so 0x401910c4 may be the flash space...
MikkelD
MikkelD 11/02/2022 12:36 PM
That is the reason.
floitsch
floitsch 11/02/2022 12:38 PM
I think Mikkel is right. (And Kasper came to the same conclusion):
While the chache is disabled, the UART rx interrupt handler is called which triggers a call to the esp_ringbuf.
Two options:
- disable uart interrupt
- move esp_ringbuf into iram
(edited)
tplux
tplux 11/02/2022 12:39 PM
So if I close down the uart before starting ota, the uart interrupt should be disabled?
floitsch
floitsch 11/02/2022 12:39 PM
@tplux do you have something connected to a serial port that is feeding data to the ESP32?
tplux
tplux 11/02/2022 12:39 PM
Yes
MikkelD
MikkelD 11/02/2022 12:39 PM
@tplux Worth a shot
tplux
tplux 11/02/2022 12:40 PM
I'll try
tpluxOPtplux
So if I close down the uart before starting ota, the uart interrupt should be disabled?
floitsch
floitsch 11/02/2022 12:40 PM
That would be an option, but fundamentally, this looks like a bug in the esp-idf.
floitsch
floitsch 11/02/2022 12:40 PM
And for now, you can just disable bluetooth and have freertos in iram.
floitsch
floitsch 11/02/2022 12:40 PM
While you don't use bluetooth, having freertos in iram is just better.
tpluxOPtplux
I'll try
floitsch
floitsch 11/02/2022 12:41 PM
It would still be interesting to see if that solves the issue. Confirming our hypothesis.
floitschfloitsch
It would still be interesting to see if that solves the issue. Confirming our hypothesis.
tplux
tplux 11/02/2022 12:48 PM
Confirmed :馃檪:
馃憤1
dongheng
dongheng 11/03/2022 01:59 AM
@floitsch @tplux Actually this maybe not a bug of the esp-idf, it is a usage limitation of esp-idf's UART driver, and you can disable option UART_ISR_IN_IRAM( https://github.com/toitware/esp-idf/blob/wip-allocate-toit-backwards/components/driver/Kconfig#L150) , then UART interrupt will not work when erase/read/write flash.
Espressif IoT Development Framework. Official development framework for ESP32. - esp-idf/Kconfig at wip-allocate-toit-backwards 路 toitware/esp-idf
bitphlipphar
bitphlipphar 11/03/2022 05:07 AM
@dongheng The "bug" in esp-idf is that it feels like the ringbuffer code should be forced into IRAM if UART_ISR_IN_IRAM is set -- even if all of FreeRTOS isn't put in IRAM. Alternatively, the UART interrupt should be disabled when reading/writing/erasing flash if the ringbuffer code isn't in IRAM.
bitphlippharbitphlipphar
@dongheng The "bug" in esp-idf is that it feels like the ringbuffer code should be forced into IRAM if UART_ISR_IN_IRAM is set -- even if all of FreeRTOS isn't put in ...
dongheng
dongheng 11/03/2022 06:14 AM
Yes, it is. The issue is about option dependence in Kconfig.
MikkelD
MikkelD 11/03/2022 07:32 AM
I believe it is unfortunate that the documentation for the CONFIG_FREERTOS_PLACE_FUNCTIONS_INTO_FLASH states
Provided these functions are not (incorrectly) used from ISRs, this option is safe to enable in all configurations.
and then the esp-idt UART interrupt driver is dependent on that flag..
MikkelD
MikkelD 11/03/2022 07:41 AM
For now, I believe that toit should probably kick the UART driver out of IRAM to avoid this particular nasty little side effect.
MikkelD
MikkelD 11/03/2022 07:43 AM
Side note: In our application, we get the OTA data on the UART, so for us there is no way around making space in IRAM for the UART and FreeRTOS, otherwise the UART will drop data packages and the OTA will fail.
tplux
tplux 11/03/2022 08:56 AM
BTW, while testing all this I sometimes experienced that after flashing a new image to the ESP, it started an old app. I assume this happens because of ota, where the bootloader at some point wants to start from the ota1 partition, but the flash tool only flashes the ota0 partition. Does that make sense?
MikkelD
MikkelD 11/03/2022 08:59 AM
How do you flash? esptool/jag?
MikkelD
MikkelD 11/03/2022 09:00 AM
There is an ota data partition, if that is not cleared, then the boot loader might indeed start the "wrong" app.
tplux
tplux 11/03/2022 09:07 AM
I use esptool.
bitphlipphar
bitphlipphar 11/03/2022 09:07 AM
Which version of esptool are you using, @tplux? I seem to remember that this has been fixed in newer versions.
tplux
tplux 11/03/2022 09:08 AM
v3.3.2-dev
bitphlipphar
bitphlipphar 11/03/2022 09:09 AM
Yeah, I think I got that wrong. What was fixed was the default arguments provided by the "build environment". You might be able to flash using tools/firmware instead (if you're not on Windows).
tplux
tplux 11/03/2022 09:12 AM
Interesting. I will look into that :馃憤:
bitphlipphar
bitphlipphar 11/03/2022 09:13 AM
(it invokes tools/esptool behind the scenes)
tplux
tplux 11/03/2022 09:14 AM
Is there a how-to somewhere? Might have missed it :馃檪:
bitphlipphar
bitphlipphar 11/03/2022 09:18 AM
I think we may not have written much about since it doesn't yet work on Windows.
tplux
tplux 11/03/2022 09:27 AM
No worries. I'll try to deduce from the code.
MikkelD
MikkelD 11/03/2022 09:29 AM
idf.py flash will add the ota data partition by itself, so might be worth considering. Atleast the output of idf.py build will include the ota data partition piece that is needed to make esptool work.
馃憤1
bitphlipphar
bitphlipphar 11/03/2022 10:19 AM
~/.cache/jaguar/sdk/tools/firmware -e ~/.cache/jaguar/assets/firmware-esp32.envelope flash -p /dev/tty.usbserial-11330
bitphlipphar
bitphlipphar 11/03/2022 10:19 AM
(this is just an example of using this from the stuff installed with Jaguar by default)
bitphlipphar
bitphlipphar 11/03/2022 10:20 AM
esptool.py v3.3.2-dev Serial port /dev/tty.usbserial-11330 Connecting.............. Chip is ESP32-D0WD (revision 1) Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None Crystal is 40MHz MAC: 98馃捒ac:63:76:cc Uploading stub... Running stub... Stub running... Changing baud rate to 921600 Changed. Configuring flash size... Auto-detected Flash size: 4MB Flash will be erased from 0x00001000 to 0x00004fff... Flash will be erased from 0x00008000 to 0x00008fff... Flash will be erased from 0x0000d000 to 0x0000efff... Flash will be erased from 0x00010000 to 0x00144fff... Compressed 15888 bytes to 11151... Wrote 15888 bytes (11151 compressed) at 0x00001000 in 0.4 seconds (effective 319.6 kbit/s)... Hash of data verified. Compressed 3072 bytes to 165... Wrote 3072 bytes (165 compressed) at 0x00008000 in 0.1 seconds (effective 328.1 kbit/s)... Hash of data verified. Compressed 8192 bytes to 31... Wrote 8192 bytes (31 compressed) at 0x0000d000 in 0.1 seconds (effective 511.9 kbit/s)... Hash of data verified. Compressed 1263200 bytes to 812457... Wrote 1263200 bytes (812457 compressed) at 0x00010000 in 13.9 seconds (effective 725.9 kbit/s)... Hash of data verified.
馃捒2
bitphlipphar
bitphlipphar 11/03/2022 10:20 AM
The output shows that this is also writing the otadata partition.
MikkelD
MikkelD 11/03/2022 10:23 AM
Because, of course every one knows that the otadata partition starts at 0x00d000 :馃檪:
馃槃2
bitphlippharbitphlipphar
`esptool.py v3.3.2-dev Serial port /dev/tty.usbserial-11330 Connecting.............. Chip is ESP32-D0WD (revision 1) Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efus...
tplux
tplux 11/03/2022 10:27 AM
Seems straight forward.
tplux
tplux 11/03/2022 10:42 AM
I guess adding "0x00d000 toit/build/esp32/ota_data_initial.bin" to the call to esptool should also do the trick?
bitphlipphar
bitphlipphar 11/03/2022 10:42 AM
Yup. That's the theory :馃槂:
馃憤1
122 messages in total