Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Debugging: can't get LMIC_PRINTF_TO working on SAMD21 #270

Open
JeroenvIS opened this issue Feb 28, 2020 · 12 comments
Open

Debugging: can't get LMIC_PRINTF_TO working on SAMD21 #270

JeroenvIS opened this issue Feb 28, 2020 · 12 comments

Comments

@JeroenvIS
Copy link

Base question/issue: I can't get LMIC_PRINTF_TO working on SAMD21. Tried with Arduino 1.8.6 and 1.8.12 on Windows, both with current LMIC. When I uncomment #define LMIC_PRINTF_TO Serial and set LMIC_DEBUG_LEVEL to 2, compilation of the sketch fails with the error error: 'cookie_io_functions_t' does not name a type

I'm lost how to fix this, it's way more low-level than I'm used to. Any pointers?

Background / reason for the debug: I'm running into an issue with the "Sensebox MCU", which is a SAMD21 based board. Node can join TTN successfully (OTAA) and usually transmits a couple of packets just fine (eg with a TX_INTERVAL of 60 seconds, rescheduling the os_job 60 secs after TX_COMPLETE as in common examples).

However, after a while, I don't see any data being received at TTN. The loop() is still running (observed by blinking LED) and with serial debug output for the sketch itself, I can see that the send_job has prepared the data just fine and emitted the "Packet queued" message. I never see a TX_COMPLETE event though. If I add some code to reschedule the os_send job if no TX_COMPLETE has been seen withing 15 secs after queueing the packet (only after join of course), it appears that LMIC.opmode & OP_TXRXPEND yields true.

I haven't had issues like these on AVR, but I see this pattern on 4 out of 5 Sensebox MCU platforms and I believe that a working LMIC_PRINTF_TO might help me dig deeper.

@matthijskooijman
Copy link
Owner

Hmm, I have no problems compiling the ttn-otaa example for the Arduino Zero with #define LMIC_PRINTF_TO Serial.

Using board 'arduino_zero_edbg' from platform in folder: /home/matthijs/.arduino15/packages/arduino/hardware/samd/1.6.17
Using core 'arduino' from platform in folder: /home/matthijs/.arduino15/packages/arduino/hardware/samd/1.6.17

Are you using the latest version (1.5.0+arduino3)? It was recently fixed to support non-AVR printf as well (though your error suggests you already have the fix, or maybe you're usign the old non-avr-printf (or something like that) branch? Note that the IDE library manager seems to mis-sort the versions and tends to install an older version.

What board and core are you using exactly? This is printed by the IDE on (verbose) compilation as I pasted above.

(as for the underlying issue - I have no suggestions from the top of my head, other than indeed enabling debug output :-p)

@JeroenvIS
Copy link
Author

The board that I'm using is the "SenseBox MCU" from SenseBox , with this board support package.

Yes, I'm using LMIC version 1.5.0+arduino3. There is an older LMIC library included with the SenseBox packages, so I emptied that directory and copied the files for 1.5.0+arduino3 to that location. Probably not the cleanest approach, but (also looking at the verbose output) I'm 100% sure that it picks up the latest LMIC code this way.

Could it be that the BSP needs some kind of patch/update to make this work?

(as for the underlying issue: I've never seen this before, neither with AVR nor with ESP32 platforms. However, a faculty member who is involved in this project e-mailed me that he also observed this issue on ESP32+RFM95 several times. Anecdotical, but I would like to investigate further).

@JeroenvIS
Copy link
Author

Using board 'sb' from platform in folder: C:\Users\ingenj\AppData\Local\Arduino15\packages\sensebox\hardware\samd\1.3.1

Using core 'arduino' from platform in folder: C:\Users\ingenj\AppData\Local\Arduino15\packages\arduino\hardware\samd\1.8.5

@matthijskooijman
Copy link
Owner

No time to reproduce right now, but looking at the install instructions for this board, it looks like it just uses the default Aduino SAMD core with some added boards in its boards.txt (confirmed by the lack of a cores directory in their repo). Also, just realized, confirmed by the output in your last comment which indeed says it uses the Arduino SAMD core.

I think these printf cookie functions are supplied by gcc/libc, so that should just work.

I do notice that I have an older Arduino SAMD version (1.6.17, whereas you're using 1.8.5). I'll try upgrading later, to see if that works (or you could try downgrading just the Arduino SAMD core, see if that fixes it)?

@JeroenvIS
Copy link
Author

I do notice that I have an older Arduino SAMD version (1.6.17, whereas you're using 1.8.5). I'll try upgrading later, to see if that works (or you could try downgrading just the Arduino SAMD core, see if that fixes it)?

That's a good pointer, thanks! Yes, it does compile correctly when I downgrade Arduino SAMD support to 1.6.17. Will try to bisect where the breakage is introduced.

In the mean time I compiled with the 1.6.17 so I can run a node with extensive debugging.

@matthijskooijman
Copy link
Owner

That's a good pointer, thanks! Yes, it does compile correctly when I downgrade Arduino SAMD support to 1.6.17. Will try to bisect where the breakage is introduced.

Nice, thanks for testing. I can imagine two underlying causes then:

  • The gcc / libc / toolchain was upgraded, which changed something to break this.
  • The Arduino core was changed to somewhere actually include stdio.h, causing the later include to not work.

I'll explain the latter a bit more. In the hal.cpp file:

#include <Arduino.h>
#include <SPI.h>
#include "../lmic.h"
#include "hal.h"
#define _GNU_SOURCE 1 // For fopencookie
#include <stdio.h>
#undef _GNU_SOURCE

We include stdio.h with _GNU_SOURCE defined to enable some GNU-specific extension. However, we include Arduino.h (and some other files) first, so that means if Arduino.h (or the other files) would (indirectly) include stdio.h, that file would be included without _GNU_SOURCE defined, so it would not expose the cookie functions we need (and the later include in hal.cpp would end up being a no-op because stdio.h was previously included).

To test this, you could try moving these three lines to above the Arduino.h include:

#define _GNU_SOURCE 1 // For fopencookie
#include <stdio.h>
#undef _GNU_SOURCE

That should ensure that stdio.h is always included properly, regardless of what other headers do. If this is indeed the problem, that should be fixed by making this change (and even if it does not, I think I will make this change since it should prevent this problem from being introduced later).

@JeroenvIS
Copy link
Author

Great! I'll try that in a bit. First update: without any other code changes, Arduino SAMD version 1.6.21 (last 1.6) works fine too; 1.8.3 and 1.8.5 don't. However, with your proposed fix, 1.8.5 works fine. So sounds like a good solution!

And from the debug output to narrow down the TX issues: everything is OK when "Packet queued" is followed by "irq: dio: 0x0 flags: 0x8", whereas the cases where "Packet queued" is never followed by a "EV_TXCOMPLETE" event (and instead my workaround code has to kick in) are missing that irq.

OK:

26378997: Updating info for TX at 26377875, airtime will be 3856. Setting available time for band 0 to 30233875
26379221: TXMODE, freq=867300000, len=23, SF=7, BW=125, CR=4/5, IH=0
Packet queued
26382967: irq: dio: 0x0 flags: 0x8
26383030: Scheduled job 0x20001470, cb 0x370d ASAP

Not OK:

25160109: Updating info for TX at 25158684, airtime will be 3856. Setting available time for band 1 to 25544284
25160349: TXMODE, freq=868300000, len=23, SF=7, BW=125, CR=4/5, IH=0
Packet queued
No TX_COMPLETE event received after scheduling transmission, rescheduling
26125011: engineUpdate, opmode=0x800

@matthijskooijman
Copy link
Owner

And from the debug output to narrow down the TX issues: everything is OK when "Packet queued" is followed by "irq: dio: 0x0 flags: 0x8", whereas the cases where "Packet queued" is never followed by a "EV_TXCOMPLETE" event

Sounds like somehow the IRQ is never fired (not that there is no actually interrupt line involved, the code just polls the DIO "interrupt" lines using digitalRead()). Could it be that you stop calling os_runloop_once() in the broken case? If not, maybe add some debug printing to the GPIO polling (or, if have it available, use a scope or logic analyzer to see if the DIO pin actually becomes active). From the debug output dio: 0x0, I suspect you should be looking at DIO0, though I'm not entirely sure.

@JeroenvIS
Copy link
Author

The code is still calling os_runloop_once() in the loop, that's not different from the non-broken case. There is no complex code in the loop (just blinking a LED twice a second, by driving an output based on int(millis() / 500) % 2. Just to make sure I'll remove that bit and see if I can still reproduce then.
I have a small logic analyzer, so if more debugging doesn't help, I'll consider monitoring DIO0 and possibly a few other signals to correlate.

JeroenvIS added a commit to JeroenvIS/arduino-lmic that referenced this issue Mar 3, 2020
With debugging enabled, LMIC wouldn't compile on SAMD targets when used with Arduino core greater than 1.6.21 (eg 1.8.x versions). Changing the order of includes fixes this. Suggested by matthijskooiman in upstream matthijskooijman#270.
@JeroenvIS
Copy link
Author

@matthijskooijman I committed your suggested change in my fork (JeroenvIS@42f4eb0); it compiles cleanly both with and without #define LMIC_PRINTF_TO Serial, even with Arduino core 1.8.5.

So I think this is a perfect fix and can be merged or otherwise adopted in your branch. When that's done, I'll urge Sensebox to update the version that they are shipping with their package.

@JeroenvIS
Copy link
Author

There is no complex code in the loop (just blinking a LED twice a second, by driving an output based on int(millis() / 500) % 2. Just to make sure I'll remove that bit and see if I can still reproduce then.

FWIW: this was indeed the issue, introducing a certain probability that EV_TXCOMPLETE didn't come after queueing a packet. If I remove this line from the loop, LMIC os_runloop_once gets the 'interrupt' every time.

Can't explain from a hardware and/or timing perspective yet, but I guess that's not too relevant for this PR anyway.

@Commifreak
Copy link

Commifreak commented Oct 23, 2020

@matthijskooijman I committed your suggested change in my fork (JeroenvIS@42f4eb0); it compiles cleanly both with and without #define LMIC_PRINTF_TO Serial, even with Arduino core 1.8.5.

So I think this is a perfect fix and can be merged or otherwise adopted in your branch. When that's done, I'll urge Sensebox to update the version that they are shipping with their package.

This fix worked! I had to add these #define _GNU_SOURCE 1 // For fopencookie since they're not there in my hal.cpp?

When will this fix being merged into https://github.com/mcci-catena/arduino-lmic ?

EDIT: Maybe never, since it is not a wanted fix: mcci-catena#549

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants