Handling improper Client Disconnects

tod
Posts: 587
Joined: Sat Apr 26, 2008 8:27 am
Location: Southern California

Post by tod »

I've been meaning to ask this question for a while and someone posted a question that is in some ways just the opposite of the problem I'm trying to deal with so...

The NB board acts as the server and only ever wants to allow one client to connect and control it at a time. The first problem I have is that you can't specify 0 to mean no one can queue on the listen port. The minimum value is 1, which means one client can connect and one can queue. The second and more important problem is the strategy for dealing with clients that don't properly close their connection. That is, they crash or otherwise disappear, so the NB TCP server has no way of knowing they are gone. Here are two things I have tried.

1. Do an intermittent ping to make sure the client is still there. PROBLEM: Some clients may not respond to pings, it puts additional noise traffic on the network, and worst of all, if the client is another embedded system that can crash and reboot within a couple of seconds, then the ping interval has to be way too fast. No network admin wants a device on their network that sends out a ping every second.

2. Set up a second listen socket on a separate port. When any connection comes in on that port, it tells the NB to force-close the existing connection with the client. This seems to work OK, but it requires a little more work for the client than I would like and has the potential for abuse. In reality I haven't seen the abuse, but most of the systems are on closed networks with well-behaved engineers. The client software just needs to know about the possibility of getting locked out, and know to connect to the special port first to clear the old connection and then connect to the normal port. Of course, on the NB side it also requires an extra task to monitor the special port.
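
For anyone who wants to try the second approach on a desktop first, here is a minimal sketch using Python's standard `socket` and `select` modules. It is not NetBurner code. The names `make_listener`, `serve`, and `start_server` are made up for illustration, and a single `select()` loop stands in for the extra task that watches the special port.

```python
import select
import socket
import threading


def make_listener(port=0):
    """Open a TCP listener on loopback; port 0 lets the OS pick a free port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("127.0.0.1", port))
    s.listen(1)
    return s


def serve(main_listener, kick_listener, stop):
    """Echo for one client at a time; any connect on kick_listener drops it."""
    client = None
    while not stop.is_set():
        # Watch the main port only while nobody is connected.
        active = client if client is not None else main_listener
        readable, _, _ = select.select([kick_listener, active], [], [], 0.1)
        for s in readable:
            if s is kick_listener:
                kicker, _ = s.accept()
                kicker.close()
                if client is not None:
                    client.close()      # force-close the (possibly dead) client
                    client = None
            elif s is main_listener:
                client, _ = s.accept()
            elif s is client:
                try:
                    data = s.recv(1024)
                    if data:
                        s.sendall(data)
                except OSError:
                    data = b""
                if not data:            # client closed or reset the connection
                    s.close()
                    client = None
    if client is not None:
        client.close()


def start_server(main_listener, kick_listener):
    """Run serve() in a background thread; set the returned event to stop it."""
    stop = threading.Event()
    t = threading.Thread(target=serve,
                         args=(main_listener, kick_listener, stop),
                         daemon=True)
    t.start()
    return t, stop
```

A locked-out client would connect to the kick port, disconnect, and then connect to the main port as usual. The sketch does nothing to authenticate the kick, so the abuse concern above applies to it unchanged.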

I noticed from answers to the other post that several folks seem to know a great deal more about this situation than I do and I'm hoping someone might have a better solution.

Tod
rnixon
Posts: 833
Joined: Thu Apr 24, 2008 3:59 pm

Post by rnixon »

Hi Tod,

I've got a few ideas, but I'm not sure if they are the best possible solution.

- For the listen issue, after you accept a connection, close the listen port. That way no more incoming connections can be accepted. Once you decide you want to listen again, call listen() again to open it up.

- If there is no communication flowing on the TCP connection, you can't tell if it's down. So you need to do something, such as the ping, some type of heartbeat that the client and server agree on, or use keep-alive. But no matter how you look at it, you can't tell if the client is there unless you send something that requires an answer. When the server checks for a heartbeat, the TCP connection will time out if nothing comes back, and then you will know. Maybe you do this once a minute instead of once per second.

- Use an override. So you open a listen socket and accept a connection. After some amount of time you open the listen socket again, and if someone connects it bumps the old connection. That would solve the half-open socket issue, but could also bump a good connection. But maybe that is still OK, because how do you handle the situation in which a computer is connected and everything is good, but that computer is in some other location and you want to take control?
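
The first idea (close the listen socket once a client is accepted, reopen it later) can be sketched with desktop BSD-style sockets as follows. This uses Python's standard `socket` module, not the NetBurner API, and `open_listener` and `accept_single_client` are hypothetical helper names.

```python
import socket


def open_listener(port=0, host="127.0.0.1"):
    """Create, bind, and listen; port 0 lets the OS pick a free port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEADDR lets us listen on the same port again later, even while
    # the previously accepted connection is still open.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))
    s.listen(1)
    return s


def accept_single_client(listener):
    """Accept one client, then close the listener so nobody else can queue."""
    conn, peer = listener.accept()
    listener.close()          # the accepted connection is unaffected
    return conn, peer
```

The accepted socket keeps working after the listener is closed, and on a typical desktop stack later connect attempts to that port are refused until `open_listener` is called again with the same port number. Whether the NB stack refuses or silently drops such attempts is not something this sketch can show.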
pbreed
Posts: 1081
Joined: Thu Apr 24, 2008 3:58 pm

Post by pbreed »

A few revs ago we added TCP keep alive functions.
Take a look at the example nburn\examples\tcp\tcp_simple_keepalive.

The knowledge that you seek is there... :-)
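
For readers without the NetBurner tree at hand, this is roughly what TCP keep-alive looks like on a desktop BSD-style stack, using Python's standard `socket` module. It is only an illustration of the idea. The function names in the NetBurner example may be entirely different, and `enable_keepalive` with its default values is a made-up helper.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=3):
    """Ask the TCP stack to probe an idle peer and drop it if it stays silent."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options are platform-specific, so set only those that exist.
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    )
    for name, value in tuning:
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
```

With these settings a dead peer would be noticed after about `idle_s + interval_s * probes` seconds of silence, at which point reads and writes on the socket start failing.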

Paul
tod
Posts: 587
Joined: Sat Apr 26, 2008 8:27 am
Location: Southern California

Post by tod »

Hi rnixon (and Paul)

Thanks for the ideas. I never realized you could close the listen socket and leave the connection socket open! I just assumed the connection socket depended on the listen socket. That alone made it worth asking the question.

BTW, discovered a cool forum feature. I originally had written a paragraph about being told keep alive wasn't supported on the NB. When I went to post I got a message telling me there was a new reply - it was Paul informing me about the keepalive examples. (I must have been off on my C# binge when this happened). Time to go get smarter.

Tod
thomastaranowski
Posts: 82
Joined: Sun May 11, 2008 2:17 pm
Location: Los Angeles, CA

Post by thomastaranowski »

The concept of the listen/accept pattern isn't really documented anywhere, and all the examples just kind of assume it's known, which makes me think people are just copy/pasting stuff around without knowing what's going on. It helps me to think in OOP terminology, where the accept() call is a socket factory. I never really grokked it until I wrote a socket layer for an embedded stack.

The typical listen/accept pattern is as follows:

* create listen socket
* bind listen socket to local port
* listen and wait for a return
* Main loop
    * listen will tell us someone wants to connect
    * accept the new connection and get a shiny new socket descriptor (keep listening on the original listen socket)
    * Use this new descriptor in all the client code
    * Close the descriptor once done
* main loop ends, close the listen socket
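
The steps above map onto BSD-style calls fairly directly. Here is a minimal sketch in Python's standard `socket` module (not NetBurner code), with a trivial echo standing in for "the client code". `open_listener` and `run_echo_server` are illustrative names, and the `max_clients` limit exists only so the loop terminates in a demo.

```python
import socket


def open_listener(port=0, backlog=5):
    """Create the listen socket, bind it to a local port, and start listening."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", port))   # port 0 = let the OS choose
    listener.listen(backlog)
    return listener


def run_echo_server(listener, max_clients):
    """Main loop: accept() hands back a new socket for each client."""
    for _ in range(max_clients):
        conn, peer = listener.accept()   # listener itself keeps listening
        with conn:                       # conn is closed when this block exits
            while True:
                data = conn.recv(1024)
                if not data:             # peer closed its side
                    break
                conn.sendall(data)
    listener.close()                     # main loop ended
```

The "socket factory" view is visible here: `listener` is never used for data, it only produces `conn` objects.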

The keep-alive is a good method for monitoring connection status, but you still have keep-alive message overhead. This isn't a big deal unless you're going over a slow modem link or something. An application-level ping/pong scheme could also be used, with once-a-minute pings/pongs to minimize overhead. However, I think you will need to turn off the SO_LINGER option on the socket, so that when closing the socket on a ping/pong timeout you don't get stuck in one of the FIN wait states for some minutes waiting for the connection's send buffer to empty.
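
A rough sketch of the application-level ping/pong idea, again in Python's standard `socket` module rather than NetBurner code. The `PING`/`PONG` wire format and both function names are invented for illustration. One substitution to flag: instead of turning SO_LINGER off, `abort_connection` enables it with a zero timeout, which on desktop BSD-style stacks makes `close()` reset the connection rather than go through the FIN handshake. Whether the NB stack behaves the same way is an assumption, and the `struct` layout used here is the Linux/macOS one.

```python
import socket
import struct


def peer_alive(conn, timeout_s=5.0):
    """Send one PING and report whether a PONG arrives within timeout_s."""
    conn.settimeout(timeout_s)
    try:
        conn.sendall(b"PING\n")
        # Simplification: assumes the short reply arrives in a single recv().
        return conn.recv(16) == b"PONG\n"
    except OSError:                      # covers timeouts and resets
        return False


def abort_connection(conn):
    """Close at once with a reset instead of waiting for buffers to drain."""
    # l_onoff=1, l_linger=0: discard unsent data and reset on close.
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))
    conn.close()
```

The server would call `peer_alive` on whatever schedule is acceptable (once a minute, say) and call `abort_connection` when it returns False, freeing the slot for the next client.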