Ambushing problems with TCP/IP sockets on Linux
Communication programming is fertile ground for ambushing bugs, because many things in communication are random or outside the programmer's control:
  • Input data are sent by the remote side. The receiver has no control over their content or volume, nor over when they are sent.
  • The delay between sending data and their arrival at the destination depends on the network: its congestion, changes in topology, hardware changes, the physical state of wires and connectors, and a myriad of other factors.
  • Some socket‑related behavior is random by design.
This typically makes such bugs practically irreproducible.
The following simple recipes may help avoid some of these bugs in the first place:
  1. select
    do not scale well and may crash programs. They should not be used in enterprise‑strength code. Use instead
    , or
  2. Bind listener sockets to proper port numbers, which usually should be in the range 61001–65535, above the default range Linux uses for ephemeral ports.
  3. Use
    and similar facilities only with non‑blocking sockets, and in particular only with non‑blocking listener sockets.
  4. Set option
    for listener sockets.
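
The recipes above can be sketched in Python, whose socket and select modules are thin wrappers over the Linux calls. This is a minimal sketch under stated assumptions, not a definitive implementation: poll() is used here as the replacement for select, SO_REUSEADDR is assumed as the listener option in recipe 4, and the helper names make_listener and accept_ready are hypothetical.

```python
import errno
import select
import socket

# Port range taken from recipe 2.
PORT_RANGE = range(61001, 65536)


def make_listener(host="127.0.0.1", ports=PORT_RANGE, backlog=16):
    """Return a non-blocking listener bound to the first free port in `ports`."""
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # Assumption for recipe 4: SO_REUSEADDR, so that a restarted
            # server can rebind while old connections linger in TIME_WAIT.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.bind((host, port))
            sock.listen(backlog)
            sock.setblocking(False)  # recipe 3: non-blocking listener
            return sock
        except OSError as exc:
            sock.close()
            if exc.errno != errno.EADDRINUSE:
                raise  # only "port already in use" means "try the next port"
    raise OSError(errno.EADDRINUSE, "no free port in the requested range")


def accept_ready(listener, timeout_ms=1000):
    """Wait with poll(), then accept all pending connections without blocking."""
    poller = select.poll()  # recipe 1: poll() instead of select()
    poller.register(listener, select.POLLIN)
    accepted = []
    if poller.poll(timeout_ms):
        while True:
            try:
                conn, _addr = listener.accept()
            except BlockingIOError:
                # Queue drained, or the connection disappeared after
                # poll() reported it; either way we return at once.
                break
            conn.setblocking(False)
            accepted.append(conn)
    return accepted
```

A caller would create the listener once with make_listener() and call accept_ready(listener) from its event loop. Because the listener is non-blocking, accept() raises BlockingIOError immediately if a connection reported by poll() has already gone away, instead of stalling the whole loop until the next client arrives.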

The recipes usually work right out of the box, though sometimes you need to tinker with them.
Yuriy Koblents-Mishke
November 13, 2013
