chore(debug): Implement core dump handling #168

Merged
merged 4 commits into from
Oct 11, 2024
76 changes: 61 additions & 15 deletions code/README.md
@@ -1,52 +1,62 @@
# Build
## Build

## Preparations
### Checkout Github Repository
```
git clone https://github.com/Slider0007/AI-on-the-edge-device.git
cd AI-on-the-edge-device
git checkout develop
git submodule update --init
```

### Optional Step: Update Submodules
```
cd code/components/{submodule} (e.g. esp32-camera)
git checkout VERSION (e.g. HASH of latest build)
cd ../../ (go back to code level)
git submodule update --init
```

## Build and Flash within terminal
See further down to build it within an IDE.
---
### Build and Flash with console

### Compile (firmware only)
#### Compile (firmware only)
```
cd code
Github project root directory --> cd code
platformio run --environment esp32cam
```

### Compile (with HTML parameter tooltips, API docs and file hashes)
Check `platformio.ini` to find out which environments are available.

#### Compile (with HTML parameter tooltips, API docs and file hashes)
```
cd code
platformio run --environment esp32cam-localbuild
```
Check `platformio.ini` to find out which environments are available.

### Upload
#### Upload
```
pio run --target upload --upload-port /dev/ttyUSB0
```

Alternatively you also can set the UART device in `platformio.ini`, eg. `upload_port = /dev/ttyUSB0`
Alternatively, the UART device can be defined in `platformio.ini`, e.g. `upload_port = /dev/ttyUSB0`

### Monitor UART Log
#### Monitor UART Log
```
pio device monitor -p /dev/ttyUSB0 -b 115200
```

## Build and Flash with Visual Code IDE
---
### Build and Flash with Visual Code IDE

- Download and install VS Code
- https://code.visualstudio.com/Download
- Install the VS Code platform io plugin
- Install the VS Code platformIO IDE plugin
- <img src="https://github.com/raw/Slider0007/ai-on-the-edge-device/develop/images/platformio_plugin.jpg" width="200" align="middle">
- Check for error messages, maybe you need to manually add some python libraries
- Check for error messages; you may need to add some Python libraries or other dependencies manually
- e.g. on Ubuntu, `python3-venv` was missing: `sudo apt-get install python3-venv`
- git clone this project
- in Linux:

```
git clone https://github.com/Slider0007/AI-on-the-edge-device.git
cd AI-on-the-edge-device
```
@@ -65,7 +75,43 @@ pio device monitor -p /dev/ttyUSB0 -b 115200
- the build artifacts are stored in `code/.pio/build/`
- Connect the device and type `pio device monitor`. There you will see your device and can copy the name to the next instruction
- Add `upload_port = your_device_port` to the `platformio.ini` file
- make sure an sd card with the contents of the `sd_card` folder is inserted and you have changed the wifi details
- Make sure an SD card with the proper contents is inserted and you have adapted the WLAN configuration in `config.json`
- `pio run --target erase` to erase the flash
- `pio run --target upload` uploads `bootloader.bin`, `partitions.bin` and `firmware.bin` from the `code/.pio/build/esp32cam/` folder.
- `pio device monitor` to observe the logs via uart

---
## Debugging

### UART/Serial Log
```
pio device monitor -p /dev/ttyUSB0 -b 115200
```
### Application Log File
The device logs many actions to the SD card (`log/messages`). This log can be viewed in the WebUI (`System > Log Viewer`) or directly by browsing the files on the SD card. Verbosity depends on the log level, which can be adjusted in the WebUI.

### Dump File
After a software exception, a core dump is written to flash. Further details on the core dump functionality can be found [here](https://docs.espressif.com/projects/esp-idf/en/v5.3.1/esp32/api-guides/core_dump.html).

Configuration:
- Location: partition `coredump` (compare `partitions.csv`)
- Log Format: ELF
- Integrity Check: CRC32


You can view the backtrace summary of the dump directly in the WebUI, or download the complete dump file for further analysis (`System > System Info > Section 'Build'`). The downloaded dump file name has the following syntax: `{firmware version}__{board_type}_coredump-elf.bin`
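The file name construction can be sketched in portable C++ as follows. This is a minimal sketch and not the project's implementation: `replaceAllSketch` and `buildCoreDumpFilename` are hypothetical names standing in for the `replaceAll` helper and the inline code in `coredump_handler`, and the firmware version string in the usage note is an assumed example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the project's replaceAll helper:
// replaces every occurrence of `from` in `s` with `to`.
static void replaceAllSketch(std::string &s, const std::string &from, const std::string &to)
{
    if (from.empty()) {
        return; // Nothing to search for
    }
    std::size_t pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos) {
        s.replace(pos, from.length(), to);
        pos += to.length(); // Continue behind the inserted text
    }
}

// Hypothetical helper: builds the download file name from a firmware
// version string and a board type, mirroring the steps in the handler.
static std::string buildCoreDumpFilename(std::string firmware, const std::string &boardType)
{
    // Characters that are awkward in file names are mapped to '_'
    const char *unsafe[] = {":", " ", "(", ")"};
    for (const char *c : unsafe) {
        replaceAllSketch(firmware, c, "_");
    }
    return firmware + "_" + boardType + "_coredump-elf.bin";
}
```

With an assumed version string such as `v1.2.3 (abc)` and board type `ESP32CAM`, this yields `v1.2.3__abc__ESP32CAM_coredump-elf.bin`. Runs of underscores appear wherever several replaced characters are adjacent.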

#### ESP-IDF provides a special tool to help analyze the downloaded core dump file
- Install [esp-coredump](https://github.com/espressif/esp-coredump) --> e.g. installation using the VSCode PlatformIO console: `pip install esp-coredump`
- Download the SoC-specific [ROM ELF files](https://github.com/espressif/esp-rom-elfs) and extract the hardware-specific ELF file for further usage
- Make sure to use the matching version of `tool-xtensa-esp-elf-gdb`. If you are using VSCode with the PlatformIO IDE, this package is already installed in `<path>/.platformio/packages`.
- Generic usage:
```
esp-coredump info_corefile --gdb <path_to_gdb_bin> --rom-elf <soc_specific_rom_elf_file> --core-format raw --core <downloaded coredump file> <elf file of actual firmware>
```
- Example:
```
esp-coredump info_corefile --gdb <path to tool-xtensa-esp-elf-gdb/bin/xtensa-esp32-elf-gdb.exe> --rom-elf esp32_rev0_rom.elf --core-format raw --core firmware_ESP32CAM_coredump-elf.bin firmware.elf
```

2 changes: 1 addition & 1 deletion code/components/fileserver_ota/CMakeLists.txt
@@ -2,6 +2,6 @@ FILE(GLOB_RECURSE app_sources ${CMAKE_CURRENT_SOURCE_DIR}/*.*)

idf_component_register(SRCS ${app_sources}
INCLUDE_DIRS "." "../../include" "miniz"
REQUIRES vfs spiffs esp_http_server webserver_softap app_update mainprocess_ctrl misc_helper)
REQUIRES vfs spiffs esp_http_server espcoredump webserver_softap app_update mainprocess_ctrl misc_helper)


158 changes: 151 additions & 7 deletions code/components/fileserver_ota/server_file.cpp
@@ -18,19 +18,22 @@ extern "C" {
}
#endif

#include "esp_err.h"
#include <esp_partition.h>
#include <esp_core_dump.h>
#include <esp_err.h>
#include <esp_log.h>

#include "esp_vfs.h"
#include <esp_vfs.h>
#include <esp_spiffs.h>
#include "esp_http_server.h"
#include <esp_http_server.h>
#include <cJSON.h>

#include "webserver.h"
#include "server_help.h"
#include "ClassLogFile.h"
#include "MainFlowControl.h"
#include "gpioControl.h"
#include "helper.h"
#include "system.h"
#include "psram.h"

#ifdef ENABLE_MQTT
@@ -778,7 +781,6 @@ static esp_err_t download_get_handler(httpd_req_t *req)
// Handler to upload a file to server (sd card)
static esp_err_t upload_post_handler(httpd_req_t *req)
{
//LogFile.writeToFile(ESP_LOG_DEBUG, TAG, "upload_post_handler");
char filepath[FILE_PATH_MAX];
FILE *fd = NULL;
struct stat file_stat;
@@ -930,8 +932,6 @@ static esp_err_t delete_post_handler(httpd_req_t *req)
httpd_resp_set_hdr(req, "Access-Control-Allow-Origin", "*");

if (httpd_req_get_url_query_str(req, query, sizeof(query)) == ESP_OK) {
ESP_LOGD(TAG, "Query: %s", query);

if (httpd_query_key_value(query, "task", valuechar, sizeof(valuechar)) == ESP_OK) {
LogFile.writeToFile(ESP_LOG_DEBUG, TAG, "delete_post_handler: Task: " + std::string(valuechar));
task = std::string(valuechar);
@@ -1010,6 +1010,141 @@ static esp_err_t delete_post_handler(httpd_req_t *req)
}


static std::string printCoreDumpBacktraceInfo(const esp_core_dump_summary_t *summary)
{
if (summary == NULL) {
return "No core dump available";
}

char results[256] = ""; // Backtrace string buffer (max. 255 characters), zero-initialized so that depth 0 yields an empty string
int offset = 0;

for (int i = 0; i < summary->exc_bt_info.depth; i++) {
uintptr_t pc = summary->exc_bt_info.bt[i]; // Program Counter (PC)
int len = snprintf(results + offset, sizeof(results) - offset, " 0x%08X", pc);
if (len >= 0 && offset + len < sizeof(results)) {
offset += len;
}
else {
break; // Reached the limit of the results buffer
}
}

return std::string("Backtrace: " + std::string(results) +
"\nDepth: " + std::to_string((int)summary->exc_bt_info.depth) +
"\nCorrupted: " + std::to_string(summary->exc_bt_info.corrupted) +
"\nPC: " + std::to_string((int)summary->exc_pc) +
"\nFirmware version: " + getFwVersion());
}
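The bounded formatting loop above can be exercised off-target with a portable sketch. This is a minimal sketch under assumptions, not the merged code: `formatBacktrace` is a hypothetical name, the addresses are passed as a plain `uint32_t` array instead of `esp_core_dump_summary_t`, and the `PRIX32` format macro is used so that the format specifier matches the argument type on every platform.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats `depth` program counter values as
// " 0xXXXXXXXX" tokens. If the next token does not fit completely,
// it is dropped and formatting stops.
static std::string formatBacktrace(const uint32_t *pcs, std::size_t depth)
{
    char results[256] = ""; // Zero-initialized: depth 0 yields an empty string
    std::size_t offset = 0;

    for (std::size_t i = 0; i < depth; i++) {
        int len = std::snprintf(results + offset, sizeof(results) - offset,
                                " 0x%08" PRIX32, pcs[i]);
        if (len < 0 || offset + static_cast<std::size_t>(len) >= sizeof(results)) {
            results[offset] = '\0'; // Remove the partially written token
            break;
        }
        offset += static_cast<std::size_t>(len);
    }

    return std::string(results);
}
```

Each token is 11 characters long, so a 256 byte buffer holds at most 23 complete addresses. Terminating the buffer at `offset` before leaving the loop keeps a truncated address out of the output.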


static esp_err_t coredump_handler(httpd_req_t *req)
{
const char* APIName = "coredump:v1"; // API name and version
char query[200];
char valuechar[30];
std::string task;

httpd_resp_set_hdr(req, "Access-Control-Allow-Origin", "*");
httpd_resp_set_type(req, "text/plain");

if (httpd_req_get_url_query_str(req, query, sizeof(query)) == ESP_OK) {
if (httpd_query_key_value(query, "task", valuechar, sizeof(valuechar)) == ESP_OK) {
task = std::string(valuechar);
}
}

// Check if coredump partition is available
const esp_partition_t *partition = esp_partition_find_first(ESP_PARTITION_TYPE_DATA,
ESP_PARTITION_SUBTYPE_DATA_COREDUMP, "coredump");
if (partition == NULL) {
httpd_resp_send_err(req, HTTPD_500_INTERNAL_SERVER_ERROR, "Partition 'coredump' not found");
return ESP_FAIL;
}

// Get core dump summary to check if core dump is available
esp_core_dump_summary_t summary;
esp_err_t coreDumpGetSummaryRetVal = esp_core_dump_get_summary(&summary);

// Save core dump file
// Debug with esp-coredump (https://github.com/espressif/esp-coredump) --> e.g. install with "pip install esp-coredump"
// Generic: esp-coredump info_corefile --gdb <path_to_gdb_bin> --rom-elf <soc_specific_rom_elf_file> --core-format raw
// --core <downloaded coredump file (FIRMWARE_BOARDTYPE-coredump-elf.bin)> firmware.elf (firmware debug zip --> firmware.elf)
// Example: esp-coredump info_corefile --gdb <tool-xtensa-esp-elf-gdb/bin/xtensa-esp32-elf-gdb.exe>
// --rom-elf esp32_rev0_rom.elf --core-format raw --core firmware_ESP32CAM_coredump-elf.bin firmware.elf
if (task.compare("save") == 0) {
if (coreDumpGetSummaryRetVal != ESP_OK) { // Skip save request if no core dump is available (empty partition)
httpd_resp_sendstr(req, "Skip request, no core dump available");
return ESP_OK;
}

// Get firmware and cleanup name to have proper filename
std::string firmware = getFwVersion();
replaceAll(firmware, ":", "_");
replaceAll(firmware, " ", "_");
replaceAll(firmware, "(", "_");
replaceAll(firmware, ")", "_");

std::string attachmentFile = "attachment;filename=" + firmware + "_" + getBoardType() + "_coredump-elf.bin";
httpd_resp_set_type(req, "application/octet-stream");
httpd_resp_set_hdr(req, "Content-Disposition", attachmentFile.c_str());

/* Retrieve the pointer to scratch buffer for temporary storage */
char *buf = ((struct HttpServerData *)req->user_ctx)->scratch;
if (buf == NULL) {
httpd_resp_send_err(req, HTTPD_500_INTERNAL_SERVER_ERROR, "No scratch buffer available");
return ESP_FAIL;
}

int i = 0;
for (i = 0; i < (partition->size / WEBSERVER_SCRATCH_BUFSIZE); i++) {
esp_partition_read(partition, i * WEBSERVER_SCRATCH_BUFSIZE, buf, WEBSERVER_SCRATCH_BUFSIZE);
httpd_resp_send_chunk(req, buf, WEBSERVER_SCRATCH_BUFSIZE);
}

int pendingSize = partition->size - (i * WEBSERVER_SCRATCH_BUFSIZE);
if (pendingSize > 0) {
ESP_ERROR_CHECK(esp_partition_read(partition, i * WEBSERVER_SCRATCH_BUFSIZE, buf, pendingSize));
httpd_resp_send_chunk(req, buf, pendingSize);
}
httpd_resp_send_chunk(req, NULL, 0);
return ESP_OK;
}
else if (task.compare("clear") == 0) { // Format partition 'coredump'
esp_err_t err = esp_partition_erase_range(partition, 0, partition->size);
if (err == ESP_OK) {
httpd_resp_sendstr(req, "Partition 'coredump' cleared");
return ESP_OK;
}
else {
std::string errMsg = "Failed to format partition 'coredump'. Error: " + intToHexString(err);
httpd_resp_send_err(req, HTTPD_500_INTERNAL_SERVER_ERROR, errMsg.c_str());
return ESP_FAIL;
}
}
else if (task.compare("force_exception") == 0) { // For testing purposes only: force an ESP exception crash
LogFile.writeToFile(ESP_LOG_ERROR, TAG, "coredump_handler: Software exception triggered manually");
httpd_resp_send_chunk(req, NULL, 0);
assert(0);
return ESP_OK;
}
else if (task.compare("api_name") == 0) {
httpd_resp_sendstr(req, APIName);
return ESP_OK;
}

// Default action: Print backtrace summary
if (coreDumpGetSummaryRetVal == ESP_OK) {
httpd_resp_sendstr(req, printCoreDumpBacktraceInfo(&summary).c_str());
}
else {
httpd_resp_sendstr(req, "No core dump available");
}

return ESP_OK;
}
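The `save` branch reads the partition in full scratch-buffer-sized pieces and then handles the remainder separately. The same traversal can be written as a single loop, sketched below in portable C++. This is a sketch under assumptions: `forEachChunk` is a hypothetical helper, and the callback stands in for the pair of `esp_partition_read` and `httpd_resp_send_chunk` calls, returning `false` when either step fails.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>

// Hypothetical helper: walks a region of `totalSize` bytes in pieces of at
// most `chunkSize` bytes and calls `sendChunk(offset, length)` for each piece.
// Stops as soon as the callback returns false.
// Returns the number of bytes handled successfully.
static std::size_t forEachChunk(std::size_t totalSize, std::size_t chunkSize,
                                const std::function<bool(std::size_t, std::size_t)> &sendChunk)
{
    if (chunkSize == 0) {
        return 0; // Avoid an endless loop
    }

    std::size_t offset = 0;
    while (offset < totalSize) {
        // The last piece may be shorter than chunkSize
        const std::size_t length = std::min(chunkSize, totalSize - offset);
        if (!sendChunk(offset, length)) {
            break;
        }
        offset += length;
    }

    return offset;
}
```

Comparing the return value with `totalSize` tells the caller whether the transfer completed, so a failed read or send can be reported as an error instead of sending a buffer with stale content.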


void registerFileserverUri(httpd_handle_t server, const char *basePath)
{
ESP_LOGI(TAG, "Registering URI handlers");
@@ -1049,6 +1184,15 @@ void registerFileserverUri(httpd_handle_t server, const char *basePath)
};
httpd_register_uri_handler(server, &file_delete);

/* URI handler for core dump access (backtrace summary, download, clear) */
httpd_uri_t coredump = {
.uri = "/coredump",
.method = HTTP_GET,
.handler = coredump_handler,
.user_ctx = httpServerData // Pass server data as context
};
httpd_register_uri_handler(server, &coredump);

httpd_uri_t handler_logfile = {
.uri = "/log",
.method = HTTP_GET,
6 changes: 6 additions & 0 deletions code/components/fileserver_ota/server_ota.cpp
@@ -178,6 +178,12 @@ static bool ota_update_firmware(std::string fn)
return false;
}

// Clear core dump partition content after successful firmware update (clean start)
const esp_partition_t *partition = esp_partition_find_first(ESP_PARTITION_TYPE_DATA, ESP_PARTITION_SUBTYPE_DATA_COREDUMP, "coredump");
if (partition != NULL) {
esp_partition_erase_range(partition, 0, partition->size);
}

return true;
}

2 changes: 1 addition & 1 deletion code/components/webserver_softap/webserver.cpp
@@ -599,7 +599,7 @@ httpd_handle_t startWebserver(void)
config.stack_size = 10240;
config.core_id = 1;
config.max_open_sockets = 5; // With default value 7: Error "httpd_accept_conn: error in accept"
config.max_uri_handlers = 22; // Max number of URI handler
config.max_uri_handlers = 23; // Max number of URI handler
config.lru_purge_enable = true; // Cut old connections if new ones are needed
config.uri_match_fn = httpd_uri_match_wildcard;

2 changes: 2 additions & 0 deletions code/components/webserver_softap/webserver.h
@@ -3,6 +3,7 @@

#include <esp_http_server.h>
#include <esp_vfs.h>
#include <string>

#include "../../include/defines.h"

@@ -14,6 +15,7 @@ struct HttpServerData {
extern struct HttpServerData *httpServerData;

extern httpd_handle_t server;
extern std::string getFwVersion(void);

void allocateWebserverHelperMemory(void);
httpd_handle_t startWebserver(void);
7 changes: 4 additions & 3 deletions code/main/main.cpp
@@ -169,9 +169,10 @@ extern "C" void app_main(void)
checkIsPlannedReboot();
if (!getIsPlannedReboot() && (esp_reset_reason() == ESP_RST_PANIC)) { // If system reboot was not triggered by user and reboot was caused by exception
LogFile.writeToFile(ESP_LOG_WARN, TAG, "Reset reason: " + getResetReason());
LogFile.writeToFile(ESP_LOG_WARN, TAG, "Device was rebooted due to a software exception! Log level is set to DEBUG until the next reboot. "
"Flow init is delayed by 5 minutes to check the logs or do an OTA update");
LogFile.writeToFile(ESP_LOG_WARN, TAG, "Keep device running until crash occurs again and check logs after device is up again");
LogFile.writeToFile(ESP_LOG_WARN, TAG, "The device was restarted due to a software exception. The log level is set to DEBUG "
"until the next reboot. Process init is delayed by 5 minutes to allow checking logs, "
"downloading the dump file or performing an OTA update. Keep the device running until "
"another crash happens and review once the device is back online");
LogFile.setLogLevel(ESP_LOG_DEBUG);
setTaskAutoFlowState(FLOW_TASK_STATE_INIT_DELAYED);
}