With LTSP running on Ubuntu 14.04, some clients report “Error: Failed to connect to NBD server” followed by “Socket failed: Connection timed out” or a similar “Socket failed” message. This happens mostly after the client was not shut down cleanly, but it occasionally seems to happen otherwise.

The root problem seems to be that nbd-server does not release its established connection to a client when that client shuts down. When the client then tries to reconnect using the same port, the new connection fails because the old one is still being held.

nbd-server seems to have always held on to the port like this, but it was not a problem until a change in recent Linux kernel versions made the second connection use the same port as the first. It sounds like there is a patch to fix this, but I don’t know how long it will be until it is available in the repositories.

If you don’t want to switch to NFS or keep restarting nbd-server all the time, the following seems to work as a band-aid (run all commands as root):

  1. Get the client’s IP address from the user (it is shown during PXE boot).
  2. Have the user shut the terminal down completely.
  3. Run netstat -pn | grep nbd | grep [IP]
  4. Note the PID of each process serving the client (shown as [PID]/nbd-server). If there are more than two or three, you probably ran the command wrong. Do NOT take the PID from the line containing DGRAM, if one shows up.
  5. Run “kill [PID]” for each of those PIDs.
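
Steps 3 to 5 can be sketched as a small shell helper. This is only a rough sketch, not something I have run in production: the function name nbd_pids_for_client is made up for illustration, and it assumes IPv4 clients and the usual netstat -pn column layout (foreign address in the fifth column, [PID]/nbd-server in the last). It matches the IP exactly, so 192.168.0.2 will not also catch 192.168.0.23, and it skips the DGRAM line.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not part of LTSP or nbd).
# Reads `netstat -pn` output on stdin and prints the PIDs of the
# nbd-server processes connected to the client IP given as $1.
nbd_pids_for_client() {
    awk -v ip="$1" '
        /DGRAM/ { next }                      # never take the PID from the DGRAM line
        $1 ~ /^tcp/ &&                        # TCP connections only
        index($5, ip ":") == 1 &&             # foreign address is exactly IP:port
        $NF ~ /\/nbd-server$/ {               # last column looks like PID/nbd-server
            split($NF, parts, "/")
            print parts[1]
        }
    ' | sort -u                               # one line per PID
}

# Example use, as root, after the client is fully shut down
# (192.168.0.23 is a placeholder IP):
#   netstat -pn | nbd_pids_for_client 192.168.0.23
# Check the list, then run kill on each PID it printed.
```

As in step 4, if this prints more than two or three PIDs, something is probably wrong, so look at the list before killing anything.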


2 Comments

Don Gurú · June 18, 2014 at 14:22

Hi…other permanent solution ??? without nfs??

    erunaheru · July 14, 2014 at 02:03

    I’ve thought of setting a timeout on nbd-server, but unfortunately I haven’t had time to see if this causes any problems. I’m especially concerned about clients that use local apps (fat clients). I’ll make a new post once I have a chance to test this.
