NAME
tcp - Internet Transmission Control Protocol
SYNOPSIS
#include <sys/socket.h>
#include <netinet/in.h>
int
socket(AF_INET, SOCK_STREAM, 0);
int
socket(AF_INET6, SOCK_STREAM, 0);
DESCRIPTION
TCP provides reliable, flow-controlled, two-way transmission of data. It is
a byte-stream protocol used to support the SOCK_STREAM abstraction. TCP uses
the standard Internet address format and, in addition, provides a per-host
collection of “port addresses”. Thus, each address is composed of an Internet
address specifying the host and network, with a specific TCP port on the host
identifying the peer entity.
Sockets using TCP are either “active” or “passive”.
Active sockets initiate connections to passive sockets. By default TCP sockets
are created active; to create a passive socket, the listen(2) system call
must be used after binding the socket with the bind(2) system call. Only
passive sockets may use the accept(2) call to accept incoming connections.
Only active sockets may use the connect(2) call to initiate connections.
Passive sockets may “underspecify” their location to match incoming
connection requests from multiple networks. This technique, termed
“wildcard addressing”, allows a single server to provide service
to clients on multiple networks. To create a socket which listens on all
networks, the Internet address INADDR_ANY must be bound. The TCP port may
still be specified at this time; if the port is not specified, the system
will assign one. Once a connection has been established, the socket's address
is fixed by the peer entity's location. The address assigned to the socket is
the address associated with the network interface through which packets are
being transmitted and received. Normally this address corresponds to the peer
entity's network.
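The following is a minimal sketch of wildcard addressing, not a complete
server. The function names wildcard_listener() and accepted_local_addr() are
hypothetical, and the backlog of 5 is an arbitrary choice. The first function
binds INADDR_ANY with port 0 so that the system assigns a port; the second
connects over the loopback interface and reports the local address that the
accepted socket ends up with.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a passive socket bound to the wildcard address. Port 0 asks the
 * system to assign a port, which is returned through port_out in host
 * byte order. Returns the socket, or -1 on failure.
 */
int
wildcard_listener(in_port_t *port_out)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s;

	assert(port_out != NULL);
	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == -1)
		return -1;
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);
	sin.sin_port = htons(0);
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    listen(s, 5) == -1 ||
	    getsockname(s, (struct sockaddr *)&sin, &len) == -1) {
		close(s);
		return -1;
	}
	*port_out = ntohs(sin.sin_port);
	return s;
}

/*
 * Connect an active socket to the listener ls over loopback, accept the
 * connection, and return the accepted socket's local address in network
 * byte order. Returns htonl(INADDR_NONE) on failure.
 */
in_addr_t
accepted_local_addr(int ls, in_port_t port)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	in_addr_t result = htonl(INADDR_NONE);
	int c, a = -1;

	if ((c = socket(AF_INET, SOCK_STREAM, 0)) == -1)
		return result;
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = htons(port);
	if (connect(c, (struct sockaddr *)&sin, sizeof(sin)) == 0 &&
	    (a = accept(ls, NULL, NULL)) != -1 &&
	    getsockname(a, (struct sockaddr *)&sin, &len) == 0)
		result = sin.sin_addr.s_addr;
	if (a != -1)
		close(a);
	close(c);
	return result;
}
```

Although the listening socket is bound to INADDR_ANY, the socket returned by
accept(2) in this sketch should report the loopback address, since that is
the interface the connection arrived on.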
TCP supports a number of socket options which can be set with setsockopt(2)
and tested with getsockopt(2):
TCP_NODELAY
- Under most circumstances, TCP sends data when it is
presented; when outstanding data has not yet been acknowledged, it gathers
small amounts of output to be sent in a single packet once an
acknowledgment is received. For a small number of clients, such as window
systems that send a stream of mouse events which receive no replies, this
packetization may cause significant delays. Therefore, TCP provides a
boolean option, TCP_NODELAY (from <netinet/tcp.h>), to defeat this
algorithm.
TCP_MAXSEG
- By default, the sending and receiving TCP negotiate between
themselves to determine the maximum segment size to be used for each
connection. The TCP_MAXSEG option allows the user to determine the result
of this negotiation, and to reduce it if desired.
TCP_MD5SIG
- This option enables the use of MD5 digests (also known as
TCP-MD5) on writes to the specified socket. In the current release, only
outgoing traffic is digested; digests on incoming traffic are not
verified. The current default behavior for the system is to respond to a
system advertising this option with TCP-MD5; this may change.
One common use for this in a NetBSD router deployment is to enable
NetBSD-based routers to interwork with Cisco equipment at peering points.
Support for this feature conforms to RFC 2385. Only IPv4 (AF_INET) sessions
are supported.
In order for this option to function correctly, it is necessary for the
administrator to add a tcp-md5 key entry to the system's security
associations database (SADB) using the setkey(8) utility. This entry must
have an SPI of 0x1000 and can therefore only be specified on a per-host
basis at this time.
If an SADB entry cannot be found for the destination, the outgoing traffic
will have an invalid digest option prepended, and the following error
message will be visible on the system console:
tcp_signature_compute: SADB lookup failed for %d.%d.%d.%d.
TCP_KEEPIDLE
- When the SO_KEEPALIVE option is enabled, TCP probes a connection that has
been idle for some amount of time. The default value for this idle period
is 4 hours. The TCP_KEEPIDLE option can be used to affect this value for a
given socket, and specifies the number of seconds of idle time between
keepalive probes. This option takes an unsigned int value greater than 0.
TCP_KEEPINTVL
- When the SO_KEEPALIVE option is enabled, TCP probes a connection that has
been idle for some amount of time. If the remote system does not respond to
a keepalive probe, TCP retransmits the probe after some amount of time. The
default value for this retransmit interval is 150 seconds. The
TCP_KEEPINTVL option can be used to affect this value for a given socket,
and specifies the number of seconds to wait before retransmitting a
keepalive probe. This option takes an unsigned int value greater than 0.
TCP_KEEPCNT
- When the SO_KEEPALIVE option is enabled, TCP probes a connection that has
been idle for some amount of time. If the remote system does not respond to
a keepalive probe, TCP retransmits the probe a certain number of times
before a connection is considered to be broken. The default value for this
keepalive probe retransmit limit is 8. The TCP_KEEPCNT option can be used
to affect this value for a given socket, and specifies the maximum number
of keepalive probes to be sent. This option takes an unsigned int value
greater than 0.
TCP_KEEPINIT
- If a TCP connection cannot be established within some
amount of time, TCP will time out the connect attempt. The default value
for this initial connection establishment timeout is 150 seconds. The
TCP_KEEPINIT option can be used to affect this initial timeout period for a
given socket, and specifies the number of seconds to wait before the
connect attempt is timed out. For passive connections, the TCP_KEEPINIT
option value is inherited from the listening socket. This option takes an
unsigned int value greater than 0.
TCP_INFO
- Information about a socket's underlying TCP session may be
retrieved by passing the read-only option TCP_INFO to getsockopt(2). It
accepts a single argument: a pointer to an instance of struct tcp_info.
This API is subject to change; consult the source to determine which fields
are currently filled out by this option. NetBSD-specific additions include
send window size, receive window size, and bandwidth-controlled window
space.
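As an illustration of the keepalive options above, the sketch below enables
SO_KEEPALIVE and then sets the idle time, retransmit interval, and probe
count for one socket. The function names tune_keepalive() and get_keepidle()
are hypothetical, and IPPROTO_TCP is assumed as the option level.
TCP_KEEPINIT is left out because not every system that provides the other
three options provides it.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Enable keepalive probing on socket s and set its timers, in seconds.
 * All three values must be greater than 0.
 * Returns 0 on success, -1 on failure with errno set by setsockopt(2).
 */
int
tune_keepalive(int s, unsigned int idle, unsigned int intvl, unsigned int cnt)
{
	int on = 1;

	assert(idle > 0 && intvl > 0 && cnt > 0);
	if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
		return -1;
	if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) == -1)
		return -1;
	if (setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) == -1)
		return -1;
	if (setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) == -1)
		return -1;
	return 0;
}

/* Read back the idle time currently in effect for socket s. */
int
get_keepidle(int s, unsigned int *idle)
{
	socklen_t len = sizeof(*idle);

	return getsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, idle, &len);
}
```

A call such as tune_keepalive(s, 600, 30, 4) would ask for the first probe
after ten idle minutes, a retransmission every 30 seconds, and at most four
probes; these numbers are examples only, not recommended settings.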
The option level for the setsockopt(2) call is the protocol number for TCP,
available from getprotobyname(3).
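A short sketch of obtaining the option level and using it with TCP_NODELAY
follows. The helper names tcp_option_level() and set_nodelay() are
hypothetical. The fallback to IPPROTO_TCP when the protocols database cannot
be consulted is an assumption of this sketch, not something this page
prescribes.

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Return the option level for TCP socket options. The protocol number is
 * looked up with getprotobyname(3); if the lookup fails, IPPROTO_TCP is
 * used instead.
 */
int
tcp_option_level(void)
{
	struct protoent *pe = getprotobyname("tcp");

	return pe != NULL ? pe->p_proto : IPPROTO_TCP;
}

/*
 * Set or clear TCP_NODELAY on socket s, then read the option back.
 * Returns 1 if the option is now set, 0 if it is clear, -1 on error.
 */
int
set_nodelay(int s, int enable)
{
	int level = tcp_option_level();
	int val = enable ? 1 : 0;
	socklen_t len = sizeof(val);

	assert(s >= 0);
	if (setsockopt(s, level, TCP_NODELAY, &val, sizeof(val)) == -1)
		return -1;
	if (getsockopt(s, level, TCP_NODELAY, &val, &len) == -1)
		return -1;
	return val != 0;
}
```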
In the historical BSD TCP implementation, if the TCP_NODELAY option was set
on a passive socket, the sockets returned by accept(2) erroneously did not
have the TCP_NODELAY option set; the behavior was corrected to inherit
TCP_NODELAY in NetBSD 1.6.
Options at the IP network level may be used with TCP; see
ip(4) or
ip6(4). Incoming connection
requests that are source-routed are noted, and the reverse source route is
used in responding.
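As one example of an IP-level option on a TCP socket, the sketch below sets
the IP_TOS option at level IPPROTO_IP and reads it back. The function name
set_tos() is hypothetical, and IP_TOS is only one of the options described in
ip(4); it applies to AF_INET sockets.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Set the IP type-of-service byte on TCP socket s and read it back.
 * Returns the value now in effect, or -1 on error.
 */
int
set_tos(int s, int tos)
{
	socklen_t len = sizeof(tos);

	assert(tos >= 0 && tos <= 255);
	/* The level is IPPROTO_IP, not the TCP protocol number. */
	if (setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)) == -1)
		return -1;
	if (getsockopt(s, IPPROTO_IP, IP_TOS, &tos, &len) == -1)
		return -1;
	return tos;
}
```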
There are many adjustable parameters that control various aspects of the
NetBSD TCP behavior; these parameters are documented in sysctl(7), and they
include:
- RFC 1323 extensions for high performance
- Send/receive buffer sizes
- Default maximum segment size (MSS)
- SYN cache parameters
- Hughes/Touch/Heidemann Congestion Window Monitoring algorithm
- Keepalive parameters
- NewReno algorithm for congestion control
- Logging of connection refusals
- RST packet rate limits
- SACK (Selective Acknowledgment)
- ECN (Explicit Congestion Notification)
- Congestion window increase methods; the traditional packet counting or
RFC 3465 Appropriate Byte Counting
- RFC 3390: Increased initial window size
DIAGNOSTICS
A socket operation may fail with one of the following errors returned:
[EISCONN]
- when trying to establish a connection on a socket which already has one;
[ENOBUFS]
- when the system runs out of memory for an internal data structure;
[ETIMEDOUT]
- when a connection was dropped due to excessive retransmissions;
[ECONNRESET]
- when the remote peer forces the connection to be closed;
[ECONNREFUSED]
- when the remote peer actively refuses connection establishment (usually
because no process is listening to the port);
[EADDRINUSE]
- when an attempt is made to create a socket with a port which has already
been allocated;
[EADDRNOTAVAIL]
- when an attempt is made to create a socket with a network address for
which no network interface exists.
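Two of these errors can be provoked on the loopback interface, as the sketch
below suggests. All function names are hypothetical. The sketch assumes that
SO_REUSEADDR is not set and that nothing else on the host interferes with the
chosen port.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill sin with the loopback address and the given port (host order). */
void
loopback_addr(struct sockaddr_in *sin, in_port_t port)
{
	assert(sin != NULL);
	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin->sin_port = htons(port);
}

/*
 * Bind a new TCP socket to the loopback address and the given port; port 0
 * lets the system choose. The port actually bound is stored through bound
 * when it is not NULL. Returns the socket, or -1 with errno set.
 */
int
bind_loopback(in_port_t port, in_port_t *bound)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s, saved;

	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == -1)
		return -1;
	loopback_addr(&sin, port);
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    getsockname(s, (struct sockaddr *)&sin, &len) == -1) {
		saved = errno;		/* close(2) may overwrite errno */
		close(s);
		errno = saved;
		return -1;
	}
	if (bound != NULL)
		*bound = ntohs(sin.sin_port);
	return s;
}

/*
 * Attempt a connection to the loopback address and the given port.
 * Returns 0 on success, otherwise the errno value of the failed call.
 */
int
connect_errno(in_port_t port)
{
	struct sockaddr_in sin;
	int s, err = 0;

	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == -1)
		return errno;
	loopback_addr(&sin, port);
	if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		err = errno;
	close(s);
	return err;
}
```

With one socket already bound to a port, a second bind_loopback() call for
the same port is expected to fail with EADDRINUSE. Because the first socket
is bound but never passed to listen(2), connect_errno() for that port is
expected to return ECONNREFUSED.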
SEE ALSO
getsockopt(2),
socket(2),
inet(4),
inet6(4),
intro(4),
ip(4),
ip6(4),
sysctl(7)
Transmission Control Protocol, RFC 793, September 1981.
Requirements for Internet Hosts -- Communication Layers, RFC 1122,
October 1989.
HISTORY
The tcp protocol stack appeared in 4.2BSD.