Problem with network occasionally failing (MOD 5270)


Post by greengene »

kewl! it looks like you check for that status every time you process
ethernet packets and find that nothing was received.
that should remove the need for any of the checking i was going to add.
just need to get all of our releases using 2.4...

Post by matthew.hutchins »

Posting an update to this problem.

We have updated to the latest NNDK (2.4) and the problem still exists.

The symptom is that the Ethernet receive interrupt function stops being called (as indicated by the counters RxIsr and frames_rx no longer incrementing), so no Ethernet frames are received. Free buffers and other counters look OK. The problem is at the Ethernet level and affects all IP and ARP packets (no Ethernet frames are received at all).
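A stall like this can be flagged from application code by polling a receive counter and noting when it stops changing. The following is only a minimal sketch: the `rx_watchdog` type and its functions are hypothetical names, not part of the NNDK, and the counter value (for example frames_rx) is passed in by the caller because how it is read depends on the platform.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not an NNDK API: tracks whether a receive
 * counter has stopped changing across periodic polls. */
typedef struct {
    uint32_t last_count;   /* counter value seen at the last change */
    uint32_t stalled_for;  /* consecutive polls with no change */
    uint32_t limit;        /* polls without change that count as a stall */
} rx_watchdog;

void rx_watchdog_init(rx_watchdog *w, uint32_t initial_count, uint32_t limit)
{
    w->last_count = initial_count;
    w->stalled_for = 0;
    w->limit = limit;
}

/* Call periodically with the current counter value.
 * Returns true once the counter has been unchanged for `limit` polls. */
bool rx_watchdog_poll(rx_watchdog *w, uint32_t current_count)
{
    if (current_count != w->last_count) {
        w->last_count = current_count;
        w->stalled_for = 0;
        return false;
    }
    if (w->stalled_for < w->limit) {
        w->stalled_for++;
    }
    return w->stalled_for >= w->limit;
}
```

A quiet network also produces an unchanging counter, so a check like this is only a hint. It is more meaningful when combined with evidence that replies are expected, such as outstanding ARP requests.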

Transmit seems to be OK, although once all the entries have timed out of the ARP table, the only transmissions are ARP requests (because the replies are never received, no unicast IP packets get sent).

The testing I have done indicates that the Ethernet PHY is not in isolation mode when the problem occurs.
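For anyone wanting to repeat that check: in the standard IEEE 802.3 Clause 22 MII register set, the Isolate flag is bit 10 of the Basic Mode Control Register (register 0). The sketch below only decodes a register value that has already been read. How the register is read over MDIO is platform-specific and not shown, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* IEEE 802.3 Clause 22 Basic Mode Control Register (register 0). */
#define PHY_REG_BMCR     0u
#define BMCR_POWER_DOWN  0x0800u  /* bit 11 */
#define BMCR_ISOLATE     0x0400u  /* bit 10 */

/* Hypothetical helpers: `bmcr` is the 16-bit value read from
 * register 0 of the PHY by whatever MDIO routine the platform has. */
bool phy_is_isolated(uint16_t bmcr)
{
    return (bmcr & BMCR_ISOLATE) != 0;
}

bool phy_is_powered_down(uint16_t bmcr)
{
    return (bmcr & BMCR_POWER_DOWN) != 0;
}
```

If either function returns true while the receive counters are stuck, the PHY rather than the MAC or the stack would be the first suspect.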

The problem is rare and intermittent, and I have yet to find a reliable trigger for it (so getting more information is hard).

One test I did seemed to indicate that executing a WarmEnetReset() did not fix the problem permanently, but it did seem to allow an ARP reply through so one debug message got out.


Matthew

Post by matthew.hutchins »

On reflection, when the problem is occurring, the NetBurner sends a LOT of ARP requests.

Would it be normal to send out a continual stream of ARP requests if no response had been received?

If not, is it possible that somehow an ARP request is getting stuck and being continually transmitted, and this prevents other frames being received?

Matthew

Post by matthew.hutchins »

A final update on this problem.

It seems to have been caused by a bug in the ARP code, which I am told will be fixed in the next release.

It will only bite you if your code occasionally initiates sending a stream of packets to a unicast address. In our case we were using SysLog to send log messages. If the address you are sending to has timed out of the ARP cache, and you are very unlucky with the timing of ARP replies, then there can be a buffer problem.
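The thread does not say what the bug or the fix actually was. Purely as an illustration of the general area, one common defensive pattern is to cap how many outgoing packets may wait for a single unresolved ARP entry, so that a burst to an expired address cannot tie up the buffer pool. The sketch below assumes a hypothetical `pending_queue` with a made-up limit `PENDING_MAX`; none of these names come from the NNDK.

```c
#include <stddef.h>

/* Assumed limit on packets held per unresolved address (illustrative). */
#define PENDING_MAX 4u

/* Hypothetical ring of buffers waiting for one ARP reply. */
typedef struct {
    void  *slots[PENDING_MAX];
    size_t head;   /* index of the oldest queued buffer */
    size_t count;  /* number of buffers currently queued */
} pending_queue;

void pending_init(pending_queue *q)
{
    q->head = 0;
    q->count = 0;
}

/* Queue a buffer. If the queue is full, the oldest buffer is evicted
 * and returned so the caller can release it; otherwise returns NULL. */
void *pending_push(pending_queue *q, void *buf)
{
    void *evicted = NULL;
    if (q->count == PENDING_MAX) {
        evicted = q->slots[q->head];
        q->head = (q->head + 1) % PENDING_MAX;
        q->count--;
    }
    q->slots[(q->head + q->count) % PENDING_MAX] = buf;
    q->count++;
    return evicted;
}

/* Remove and return the oldest buffer, or NULL if the queue is empty.
 * Intended to be drained when the ARP reply arrives or resolution
 * times out. */
void *pending_pop(pending_queue *q)
{
    void *buf;
    if (q->count == 0) {
        return NULL;
    }
    buf = q->slots[q->head];
    q->head = (q->head + 1) % PENDING_MAX;
    q->count--;
    return buf;
}
```

With a cap like this, a stream of SysLog messages to an address that has dropped out of the ARP cache costs at most `PENDING_MAX` buffers, and older messages are discarded first.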

Thanks,

Matthew