TCP Buffers issue on MOD 5272 with NNDK 2.8.1

ephogy
Posts: 30
Joined: Fri Aug 29, 2008 12:53 pm

Post by ephogy »

Hi All,

I've been struggling with some networking issues for a few days now. I've never seen this problem before, but I suspect that all of our devices suffer from it and that it only manifests when they are under heavy load.

To give some background: the devices produce relatively small data streams of <5kB every 250 ms.
For (very) historical reasons, the data that is produced is polled (not pushed) over an HTTP connection.
Multiple computers request the packets every 250 ms, occasionally requesting several of these sub-5kB packets.
If the computers DO NOT receive a response within 250 ms, they close the connection and issue a new request for whatever data is available; the missed packets are lost.

The issue appears in the tcp.cpp code (I'm not saying that it is definitely the culprit), in the accept6/accept4 code:

Code:

PSOCKET ps = (PSOCKET)0xDEADBEEF;

pL->RxBuffer.ReadData( (PBYTE)&ps, sizeof( ps ) );


ASSERT( ( ps >= sockets ) && ( ps < ( sockets + TCP_SOCKET_STRUCTS ) ) );

Occasionally, ps comes back holding a value that is obviously invalid. With the ASSERT statements enabled, the system halts because the value of ps is garbage.

I've discovered an obtuse workaround that *appears* to be working, but I won't know for sure until it has been running for a very long time.
The pointer does actually seem to be available; it just isn't the first value read (or sometimes the second).

The workaround is as follows:

Code:

int count = 0;
PSOCKET ps;
while (true)
{
    ps = (PSOCKET)0xDEADBEEF;

    pL->RxBuffer.ReadData( (PBYTE)&ps, sizeof( ps ) );

    //ASSERT( ( ps >= sockets ) && ( ps < ( sockets + TCP_SOCKET_STRUCTS ) ) );

    if ( ( DWORD ) ps == 0xDEADBEEF )
    {
        return TCP_ERR_NOSUCH_SOCKET;
    }

    if ( ( ps >= sockets ) && ( ps < ( sockets + TCP_SOCKET_STRUCTS ) ) )
    {
        break;
    }

    iprintf ("%d: Invalid socket returned by buffer ReadData %p\r\n", ++count, ps);
}
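
For readers without the NNDK sources to hand, the skip-until-valid idea in the workaround can be sketched in standalone C++. This is only an illustration under assumptions: Socket, g_sockets, kSocketCount, IsValidSocketPtr and PopValidSocket are hypothetical names, and a std::deque stands in for pL->RxBuffer. The range test mirrors the ASSERT condition; the extra alignment test is an addition that is not in the original check.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-ins; none of these are the NetBurner definitions.
struct Socket { int id; };
constexpr std::size_t kSocketCount = 8;   // plays the role of TCP_SOCKET_STRUCTS
static Socket g_sockets[kSocketCount];    // plays the role of `sockets`

// True if `value` is the address of an element of g_sockets.
// Compared as integers so that arbitrary garbage stays well defined.
inline bool IsValidSocketPtr(std::uintptr_t value)
{
    const std::uintptr_t first = reinterpret_cast<std::uintptr_t>(g_sockets);
    const std::uintptr_t last  = reinterpret_cast<std::uintptr_t>(g_sockets + kSocketCount);
    if (value < first || value >= last)
    {
        return false;                     // same range test as the ASSERT
    }
    // Extra check (an assumption, not in the post): must sit on an element boundary.
    return (value - first) % sizeof(Socket) == 0;
}

// Pops queued values until one passes the check.
// Returns nullptr when the queue runs dry, which corresponds to the
// 0xDEADBEEF sentinel surviving the read in the workaround.
// `skipped` reports how many entries were discarded on the way.
inline Socket* PopValidSocket(std::deque<std::uintptr_t>& pending, int& skipped)
{
    skipped = 0;
    while (!pending.empty())
    {
        const std::uintptr_t value = pending.front();
        pending.pop_front();
        if (IsValidSocketPtr(value))
        {
            return reinterpret_cast<Socket*>(value);
        }
        ++skipped;
    }
    return nullptr;
}
```

Like the workaround itself, this only hides the symptom: discarded entries are counted but never explained, so a non-zero skip count still points at a bug upstream of the read.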
At first I thought there was a stack overflow somewhere, but calls to OSDumpTCBStacks() showed tasks with over 2kB of stack space still available. Not believing this, I doubled the stack sizes, and the reported free stack space increased correspondingly.

Everything in the user code is statically allocated at compile time, so I don't think there should be any memory allocation issues. There could be access issues, but after analyzing the code I haven't been able to find any yet.

Any insight as to what I should look into would be very welcome.

Regards,
Keith
pbreed
Posts: 1080
Joined: Thu Apr 24, 2008 3:58 pm

Re: TCP Buffers issue on MOD 5272 with NNDK 2.8.1

Post by pbreed »

I believe there are changes to this code in the latest 2.8.4/ 2.8.5 betas to fix a race condition....
You should be using that...
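
The thread does not say what the race was, so the following is only a generic sketch of the kind of guard that keeps a shared FIFO consistent: each fixed-size record is written and read under one lock, so a reader never sees half a record. GuardedFifo is a hypothetical class, not NetBurner's buffer, and std::mutex stands in for whatever critical-section mechanism the RTOS provides.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical byte FIFO shared between a producer and a consumer context.
class GuardedFifo
{
public:
    // Copies one whole record in, or nothing if there is not enough room.
    bool Write(const void* data, std::size_t len)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (len > kCapacity - count_)
        {
            return false;
        }
        const std::uint8_t* src = static_cast<const std::uint8_t*>(data);
        for (std::size_t i = 0; i < len; ++i)
        {
            bytes_[(head_ + i) % kCapacity] = src[i];
        }
        head_ = (head_ + len) % kCapacity;
        count_ += len;
        return true;
    }

    // Copies one whole record out, or leaves `out` untouched if fewer
    // than `len` bytes are queued.
    bool Read(void* out, std::size_t len)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (len > count_)
        {
            return false;
        }
        std::uint8_t* dst = static_cast<std::uint8_t*>(out);
        for (std::size_t i = 0; i < len; ++i)
        {
            dst[i] = bytes_[(tail_ + i) % kCapacity];
        }
        tail_ = (tail_ + len) % kCapacity;
        count_ -= len;
        return true;
    }

private:
    static constexpr std::size_t kCapacity = 64;
    std::mutex mutex_;                     // guards every field below
    std::uint8_t bytes_[kCapacity] = {};
    std::size_t head_ = 0;                 // next write position
    std::size_t tail_ = 0;                 // next read position
    std::size_t count_ = 0;                // bytes currently queued
};
```

Because Read reports failure instead of returning a partial record, the caller does not need a sentinel such as 0xDEADBEEF to detect an empty buffer.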
ephogy
Posts: 30
Joined: Fri Aug 29, 2008 12:53 pm

Re: TCP Buffers issue on MOD 5272 with NNDK 2.8.1

Post by ephogy »

pbreed wrote: I believe there are changes to this code in the latest 2.8.4/ 2.8.5 betas to fix a race condition....
You should be using that...
Thanks, I spent the morning modifying the 2.8.5 kernel with the changes we've made to it, and it appears to be working without the strange buffer errors.

Cheers