intermittent hanging
Posted: Thu Oct 09, 2014 7:39 am
We're seeing an intermittent problem where one of our boards seems to hang after power-up. We can't reproduce it reliably, but it shows up fairly often when we power on 20 boards at once: roughly 1 in every 4 power cycles leaves a single board out of the 20 non-responsive, and which board fails appears to be random.
On power-up, our app should spawn our highest-priority TCP thread, which listens on a port, accepts a connection, and reads from it. It also spawns a lower-priority thread that sends a UDP heartbeat every second. While testing this problem, all we did was power the boards up, check in Wireshark for all 20 UDP heartbeats, then power them back down and repeat (no TCP connections were attempted from the PC side). We noticed a couple of behaviors when we got one to fail:
We were not able to open a TCP connection to the failed board from the PC side. In a separate incident, we looked at the failed NetBurner's Ethernet port and noticed that none of its LEDs were lit. We unplugged the Ethernet cable and plugged it back in, and everything came back up and worked fine (LEDs lit/blinked, heartbeats were sent, and we were able to establish a TCP connection).
So I'm just looking for any suggestions/ideas on what I can do or try. I can't say for sure where the code is hanging, or whether it is hanging at all, since the board recovered after the cable was re-plugged. Our board doesn't even have a serial port for spitting out debug info or LEDs for...blinking info (worst design decision ever). I have an eval kit that I can run the code on, but it seems nearly impossible to replicate this problem on a single board.
I compile on NNDK 2.6.3 and target the MOD5272.
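For anyone who wants to repeat the power-cycle test without eyeballing Wireshark each time, the heartbeat check could be scripted on the PC side along these lines. This is only a rough sketch, not what we actually ran: the port number, the list of board IP addresses, and the function names are all placeholders, and it assumes each board sends its heartbeat from its own IP to a UDP port the PC can bind.

```python
import socket
import time


def collect_heartbeats(port, window_s, bufsize=1500):
    """Listen on a UDP port for window_s seconds and return the set of
    source IP addresses that sent at least one datagram."""
    seen = set()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        deadline = time.monotonic() + window_s
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            # Never wait past the end of the listening window.
            sock.settimeout(remaining)
            try:
                _, (addr, _) = sock.recvfrom(bufsize)
            except socket.timeout:
                break
            seen.add(addr)
    return seen


def missing_boards(expected, seen):
    """Return the expected board IPs that were never heard from."""
    return sorted(set(expected) - set(seen))
```

After each power-up, something like `missing_boards(EXPECTED_IPS, collect_heartbeats(HEARTBEAT_PORT, 10))` would list any silent board, where `EXPECTED_IPS` and `HEARTBEAT_PORT` are hypothetical values for the 20 board addresses and the heartbeat port. Logging that result per cycle would at least show whether the same boards keep failing.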