Question about CUSBEndpoint::SkipPID(unsigned int, bool) #519
Rene, I may have a possible work-around, but with the loss of some functionality. I removed the following code from my keyboard input loop and added it to execute once at the top of kernel.cpp:
I believe this may have solved my keyboard problem. I believe I got this code fragment from one of your examples, and it hasn't changed since the summer. I don't believe the input loop can ever be executed simultaneously from multiple Tasks, but I will check the code to be sure. Of course, doing this means that the keyboard has to be plugged in when my Circle-based OS boots, and it can't be removed and re-inserted. But that is not something that I need to do often. I tried this on a hunch, given that the crashes always seemed to be in the USB management functions (you may remember I had a similar earlier problem with RPi 4 + Keyboard). Regards, Dave. |
Dave, SkipPID() is a method down in the USB stack, which moves the protocol
identifier (PID), which is sent in USB data packets, to the next value. This is
necessary, because the PID normally changes between DATA0 and DATA1. The
receiver can check this way, if the expected PID has been received, and if a
data packet has been lost (e.g. because of a CRC error).
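The DATA0/DATA1 alternation described above can be sketched as follows. This is only an illustration of the toggle idea, not Circle's actual SkipPID() implementation; the names `DataPid`, `NextPid` and `ToggleReceiver` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical names for illustration; not taken from Circle's USB stack.
enum class DataPid : std::uint8_t { Data0, Data1 };

// Advance the data toggle to the next value (DATA0 -> DATA1 -> DATA0 ...).
constexpr DataPid NextPid(DataPid current) {
    return current == DataPid::Data0 ? DataPid::Data1 : DataPid::Data0;
}

// A receiver that accepts a packet only if it carries the expected PID.
struct ToggleReceiver {
    DataPid expected = DataPid::Data0;

    // Returns true and advances the toggle when the PID matches;
    // returns false (toggle unchanged) when the sequence is out of step.
    bool Accept(DataPid received) {
        if (received != expected) {
            return false;
        }
        expected = NextPid(expected);
        return true;
    }
};
```

A receiver built this way rejects a second DATA0 in a row, which is how a break in the packet sequence becomes visible.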
This method is called on every completion of a transfer from the interrupt
endpoint of a USB keyboard. This endpoint transfers the keyboard data, when a
key is pressed. Some keyboards also transfer data from time to time (e.g. each
500 ms), when there is no change in the pressed key status.
I'm trying to find a reason, why this happens. I guess it does not have
directly to do with SkipPID(), but the data, which is used inside this method
may be overwritten from somewhere. I could also imagine, that this is caused
by the same problem, that we had earlier in issue #490 for the RPi 4, which I
was not able to reproduce.
I could try to debug this here again with your kernel8.elf file. Perhaps I can
reproduce this problem on the RPi 3?
Rene
|
As per a follow-up, I was able to prevent the error by not checking for new USB devices in my input loop.
As an alternative approach, all of the keyboard activity is contained in a couple of functions in one file: although I know you are not generally involved in debugging other people’s code, with your knowledge of Circle, could I ask you to review my keyboard input functions and tell me if I am doing anything egregious?
I wrote this code back in June (?), although it was my first C code in three decades, and my first C++ code ever. But I have not changed it since, and it seems to be causing problems as I add more tasks (perhaps slowing things down?).
|
Your keyboard handling code is equivalent to the one in sample/08-usbkeyboard
and should be all right. I have built a version for the RPi 3 from the source,
you have sent earlier to me. It's working well. I'm probably not using the
features, which are triggering the problem, but it cannot be the keyboard
handling alone, because I'm using the keyboard and also attached and removed
it several times.
|
Rene,
Yes, it seems to be triggered by using network sockets, but always seems to crash somewhere in the CUSBEndpoint class.
I don’t see the problem with the version that I had previously sent you.
So if I send you my latest kernel8.elf and instructions on how to reproduce the problem, would you have time to try to replicate on your end with debugging?
Cheers,
Dave.
|
Dave, yes, if you could send your kernel8.elf, which shows the problem, with
instructions, I would take the time for debugging.
Rene
|
Rene,
Replying here, so that it is properly documented, and available to others who might run into a similar problem.
Thank you so much for the detailed debugging, and a full explanation. This all makes sense now, and would also explain my problem on the RPi 4 as well. I believe the difference between then and now is that I recently ran into a problem with the (previous) 1 MB buffer being too small, so I took the easy way out and just increased it to 2 MB. My thinking was that my operating system is using so little memory (< 100 MB with everything compiled in and running, compared to a full Linux system which minimally boots in at more than 300 MB, even before running any user programs), that allocating a 2 MB data buffer should not be a problem. My bad, of course, is that I was not thinking about the stack size.
Just curious, though, should something else have detected exceeding the stack size? Without running through a debugger (although that may always be the right answer), is there any other way that this should have been caught?
I will admit that I am still getting my brain to think properly about things like memory management, which is handled differently in C/C++ than Java, which I had been programming almost exclusively for the last 25+ years (as you may already know, Java has a garbage collector that recovers memory from any object that is no longer referenced). So with C/C++ I am always worried about allocating memory with 'new' or malloc(), and then having to remember to deallocate it later, otherwise I run the risk of a memory leak (which is the biggest problem with most operating systems). Easy enough when everything works, but I have to make sure that every possible error condition also frees any allocated buffers before returning the error to the user.
Nonetheless, thanks for showing me the problem, and suggesting two possible ways to fix it. I will try to keep in mind the stack size for local variables going forward.
Cheers,
Dave.
============================================================
David,
I was able to reproduce the problem and I think I have found the reason.
My setup was an RPi 3B and a 3A+ connected to the WLAN and HDMI display/USB [...]
Then I started everything again with the JTAG debugger connected to the 3A+ [...]
It turned out that the method CSocket::Receive() overwrites some parts of the [...]
How is this possible? The kernel stack by default is only 128 KBytes in size. [...]
#define REMOTE_MAX_RESPONSE_SIZE 2097152
the Reponse buffer, which is allocated on the stack as a local variable, is [...]
How to solve this? You could increase the kernel stack size by defining
DEFINE += -DKERNEL_STACK_SIZE=0x400000
in Config.mk (for a 4 MBytes stack for example, must be a multiple of [...]
Regards,
Rene
|
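The sizes quoted in this explanation can be checked with a little compile-time arithmetic. The sketch below is not part of Circle or of Dave's code: the constant names, the helper `FitsOnStack` and the 16 KByte headroom are assumptions made purely for illustration; only the three numeric values come from the thread.

```cpp
#include <cstddef>

// Values taken from the discussion; the constant names are hypothetical.
constexpr std::size_t kDefaultKernelStackSize  = 128 * 1024;  // 128 KBytes default
constexpr std::size_t kEnlargedKernelStackSize = 0x400000;    // -DKERNEL_STACK_SIZE=0x400000
constexpr std::size_t kRemoteMaxResponseSize   = 2097152;     // REMOTE_MAX_RESPONSE_SIZE

// Hypothetical helper: does a local buffer fit on a stack of the given size
// while leaving some headroom for call frames? The 16 KByte default headroom
// is an arbitrary assumption, not a Circle requirement.
constexpr bool FitsOnStack(std::size_t bufferBytes, std::size_t stackBytes,
                           std::size_t headroomBytes = 16 * 1024) {
    return stackBytes >= headroomBytes && bufferBytes <= stackBytes - headroomBytes;
}

static_assert(kRemoteMaxResponseSize == 2 * 1024 * 1024, "the buffer is 2 MBytes");
static_assert(kEnlargedKernelStackSize == 4 * 1024 * 1024, "0x400000 is 4 MBytes");
static_assert(!FitsOnStack(kRemoteMaxResponseSize, kDefaultKernelStackSize),
              "a 2 MByte local buffer cannot fit on a 128 KByte stack");
static_assert(FitsOnStack(kRemoteMaxResponseSize, kEnlargedKernelStackSize),
              "the same buffer fits once the stack is 4 MBytes");
```

A guard like this only covers one known buffer. It says nothing about the total stack depth reached through nested calls, so it is a partial safety net at best.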
Rene,
Sorry, for clarification: if I am going to go the 'easy' route of increasing the stack size to 4 MB (yes, the 2 MB buffer in the REMOTE command is by far the biggest anywhere in the code I've written), and I am going to add:
DEFINE += -DKERNEL_STACK_SIZE=0x400000
to Config.mk, and I am building and linking against Circle + Stephan's stdlib/newlib, do I need to increase the stack size in both:
./circle-stdlib/libs/circle/Config.mk
as well as in my own personal build:
./colorOS/Config.mk
Or only in my own personal build (compile and link using the standard Circle build scripts)?
I'm thinking that it is only necessary in my own compile and link, but before I get myself into any more trouble, do I also need the increased stack size when building the Circle and stdlib/newlib libraries? Or is that not necessary?
Cheers,
Dave.
|
Fixed by adding: DEFINE += -DKERNEL_STACK_SIZE=0x400000 to all instances of Config.mk and rebuilding. I will consider refactoring the code to use new / malloc for large buffers at a later date (which I currently do for GET URL (http/https) and GET FILE (tftp), but not REMOTE). |
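For readers hitting the same problem, the refactoring mentioned here (moving a large buffer from the stack to the heap) might look roughly like the sketch below. It is not Dave's REMOTE code: the function `HandleRemoteResponse` and its parameters are hypothetical, and it assumes a build where the C++ standard library is available (for example via circle-stdlib).

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>

// Size quoted in the thread; the constant name is hypothetical.
constexpr std::size_t kRemoteMaxResponseSize = 2097152;

// Hypothetical handler: copies a response into a heap buffer instead of a
// 2 MByte local array, so the kernel stack is no longer involved.
bool HandleRemoteResponse(const char* data, std::size_t length) {
    // Heap allocation; std::nothrow yields nullptr instead of throwing.
    std::unique_ptr<char[]> response(new (std::nothrow) char[kRemoteMaxResponseSize]);
    if (!response) {
        return false;  // out of memory
    }
    if (length >= kRemoteMaxResponseSize) {
        return false;  // would not fit, including the terminating zero
    }
    std::memcpy(response.get(), data, length);
    response[length] = '\0';
    // ... process the response here ...
    return true;
    // The buffer is released automatically on every return path.
}
```

Using a smart pointer also addresses the concern raised earlier in the thread about freeing buffers on every error path, since the release happens automatically when the function returns.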
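Dave, you are welcome. I had a look, if some stack checking could be implemented, but this would require support from the GNU compiler. There is the option -fstack-check, but it can only be applied with a fixed address of the stack bottom, which is not sufficient, because Circle uses multiple stacks for the different cores and exception levels. So this cannot be easily implemented.
The normal way to apply the KERNEL_STACK_SIZE= option is by using the -o option, when calling configure in circle-stdlib. But if you want to add it manually, adding it to all three config files cannot be wrong. ;)
Rene
|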
Thanks Rene! I understand. I’ve closed this ticket, but was just wondering if there was a better way to detect this in the future.
I have made the change to all 3x Config.mk files, rebuilt everything, and it seems to have solved the problem.
Cheers,
Dave.
On Dec 15, 2024, at 4:27 AM, Rene Stange wrote:
Dave,
you are welcome. I had a look, if some stack checking could be implemented, but this would require support from the GNU compiler. There is the option -fstack-check, but it can only be applied with a fixed address of the stack bottom, which is not sufficient, because Circle uses multiple stacks for the different cores and exception levels. So this cannot be easily implemented.
The normal way to apply the KERNEL_STACK_SIZE= option is by using the -o option, when calling configure in circle-stdlib. But if you want to add it manually, adding it to all three config files cannot be wrong. ;)
Rene
|
Thanks for info!
|
Rene,
More of a question than necessarily a problem with Circle.
I have spent a few days trying to track down a problem I'm having with a Synchronous exception. The PC seems to always be in this function: CUSBEndpoint::SkipPID(unsigned int, bool)
And only when using the keyboard on a RPi3. As I add and remove code, the PC value changes a bit, but it always seems to be in this function (usually first 1 - 5 instructions).
Can you please advise as to what that function does, which might (or might not) help me track down the problem?
It seems to be crashing when opening a socket while using the USB keyboard. If I perform the same functions from the network (telnet server), then I can't replicate the problem no matter how hard I try.
Thanks for any information you can provide that might possibly be helpful. I know this is a long shot, but I am running out of ideas.
Cheers,
Dave.