Network Working Group                    Internet Engineering Task Force
Request for Comments: 1122                             R. Braden, Editor
                                                            October 1989
Requirements for Internet Hosts -- Communication Layers
Status of This Memo
This RFC is an official specification for the Internet community. It incorporates by reference, amends, corrects, and supplements the primary protocol standards documents relating to hosts. Distribution of this document is unlimited.
Summary
This is one RFC of a pair that defines and discusses the requirements for Internet host software. This RFC covers the communications protocol layers: link layer, IP layer, and transport layer; its companion RFC-1123 covers the application and support protocols.
Table of Contents
1. INTRODUCTION ... 5
1.1 The Internet Architecture ... 6
1.1.1 Internet Hosts ... 6
1.1.2 Architectural Assumptions ... 7
1.1.3 Internet Protocol Suite ... 8
1.1.4 Embedded Gateway Code ... 10
1.2 General Considerations ... 12
1.2.1 Continuing Internet Evolution ... 12
1.2.2 Robustness Principle ... 12
1.2.3 Error Logging ... 13
1.2.4 Configuration ... 14
1.3 Reading this Document ... 15
1.3.1 Organization ... 15
1.3.2 Requirements ... 16
1.3.3 Terminology ... 17
1.4 Acknowledgments ... 20
2. LINK LAYER ... 21
2.1 INTRODUCTION ... 21
Internet Engineering Task Force [Page 1]
RFC1122 INTRODUCTION October 1989
2.2 PROTOCOL WALK-THROUGH ... 21
2.3 SPECIFIC ISSUES ... 21
2.3.1 Trailer Protocol Negotiation ... 21
2.3.2 Address Resolution Protocol -- ARP ... 22
2.3.2.1 ARP Cache Validation ... 22
2.3.2.2 ARP Packet Queue ... 24
2.3.3 Ethernet and IEEE 802 Encapsulation ... 24
2.4 LINK/INTERNET LAYER INTERFACE ... 25
2.5 LINK LAYER REQUIREMENTS SUMMARY ... 26
3. INTERNET LAYER PROTOCOLS ... 27
3.1 INTRODUCTION ... 27
3.2 PROTOCOL WALK-THROUGH ... 29
3.2.1 Internet Protocol -- IP ... 29
3.2.1.1 Version Number ... 29
3.2.1.2 Checksum ... 29
3.2.1.3 Addressing ... 29
3.2.1.4 Fragmentation and Reassembly ... 32
3.2.1.5 Identification ... 32
3.2.1.6 Type-of-Service ... 33
3.2.1.7 Time-to-Live ... 34
3.2.1.8 Options ... 35
3.2.2 Internet Control Message Protocol -- ICMP ... 38
3.2.2.1 Destination Unreachable ... 39
3.2.2.2 Redirect ... 40
3.2.2.3 Source Quench ... 41
3.2.2.4 Time Exceeded ... 41
3.2.2.5 Parameter Problem ... 42
3.2.2.6 Echo Request/Reply ... 42
3.2.2.7 Information Request/Reply ... 43
3.2.2.8 Timestamp and Timestamp Reply ... 43
3.2.2.9 Address Mask Request/Reply ... 45
3.2.3 Internet Group Management Protocol IGMP ... 47
3.3 SPECIFIC ISSUES ... 47
3.3.1 Routing Outbound Datagrams ... 47
3.3.1.1 Local/Remote Decision ... 47
3.3.1.2 Gateway Selection ... 48
3.3.1.3 Route Cache ... 49
3.3.1.4 Dead Gateway Detection ... 51
3.3.1.5 New Gateway Selection ... 55
3.3.1.6 Initialization ... 56
3.3.2 Reassembly ... 56
3.3.3 Fragmentation ... 58
3.3.4 Local Multihoming ... 60
3.3.4.1 Introduction ... 60
3.3.4.2 Multihoming Requirements ... 61
3.3.4.3 Choosing a Source Address ... 64
3.3.5 Source Route Forwarding ... 65
3.3.6 Broadcasts ... 66
3.3.7 IP Multicasting ... 67
3.3.8 Error Reporting ... 69
3.4 INTERNET/TRANSPORT LAYER INTERFACE ... 69
3.5 INTERNET LAYER REQUIREMENTS SUMMARY ... 72
4. TRANSPORT PROTOCOLS ... 77
4.1 USER DATAGRAM PROTOCOL -- UDP ... 77
4.1.1 INTRODUCTION ... 77
4.1.2 PROTOCOL WALK-THROUGH ... 77
4.1.3 SPECIFIC ISSUES ... 77
4.1.3.1 Ports ... 77
4.1.3.2 IP Options ... 77
4.1.3.3 ICMP Messages ... 78
4.1.3.4 UDP Checksums ... 78
4.1.3.5 UDP Multihoming ... 79
4.1.3.6 Invalid Addresses ... 79
4.1.4 UDP/APPLICATION LAYER INTERFACE ... 79
4.1.5 UDP REQUIREMENTS SUMMARY ... 80
4.2 TRANSMISSION CONTROL PROTOCOL -- TCP ... 82
4.2.1 INTRODUCTION ... 82
4.2.2 PROTOCOL WALK-THROUGH ... 82
4.2.2.1 Well-Known Ports ... 82
4.2.2.2 Use of Push ... 82
4.2.2.3 Window Size ... 83
4.2.2.4 Urgent Pointer ... 84
4.2.2.5 TCP Options ... 85
4.2.2.6 Maximum Segment Size Option ... 85
4.2.2.7 TCP Checksum ... 86
4.2.2.8 TCP Connection State Diagram ... 86
4.2.2.9 Initial Sequence Number Selection ... 87
4.2.2.10 Simultaneous Open Attempts ... 87
4.2.2.11 Recovery from Old Duplicate SYN ... 87
4.2.2.12 RST Segment ... 87
4.2.2.13 Closing a Connection ... 87
4.2.2.14 Data Communication ... 89
4.2.2.15 Retransmission Timeout ... 90
4.2.2.16 Managing the Window ... 91
4.2.2.17 Probing Zero Windows ... 92
4.2.2.18 Passive OPEN Calls ... 92
4.2.2.19 Time to Live ... 93
4.2.2.20 Event Processing ... 93
4.2.2.21 Acknowledging Queued Segments ... 94
4.2.3 SPECIFIC ISSUES ... 95
4.2.3.1 Retransmission Timeout Calculation ... 95
4.2.3.2 When to Send an ACK Segment ... 96
4.2.3.3 When to Send a Window Update ... 97
4.2.3.4 When to Send Data ... 98
4.2.3.5 TCP Connection Failures ...................... 100
4.2.3.6 TCP Keep-Alives .............................. 101
4.2.3.7 TCP Multihoming .............................. 103
4.2.3.8 IP Options ................................... 103
4.2.3.9 ICMP Messages ................................ 103
4.2.3.10 Remote Address Validation ................... 104
4.2.3.11 TCP Traffic Patterns ........................ 104
4.2.3.12 Efficiency .................................. 105
4.2.4 TCP/APPLICATION LAYER INTERFACE ................... 106
4.2.4.1 Asynchronous Reports ......................... 106
4.2.4.2 Type-of-Service .............................. 107
4.2.4.3 Flush Call ................................... 107
4.2.4.4 Multihoming .................................. 108
4.2.5 TCP REQUIREMENT SUMMARY ........................... 108
5. REFERENCES ... 112
1. INTRODUCTION
This document is one of a pair that defines and discusses the requirements for host system implementations of the Internet protocol suite. This RFC covers the communication protocol layers: link layer, IP layer, and transport layer. Its companion RFC, "Requirements for Internet Hosts -- Application and Support" [INTRO:1], covers the application layer protocols. This document should also be read in conjunction with "Requirements for Internet Gateways" [INTRO:2].
These documents are intended to provide guidance for vendors, implementors, and users of Internet communication software. They represent the consensus of a large body of technical experience and wisdom, contributed by the members of the Internet research and vendor communities.
This RFC enumerates standard protocols that a host connected to the Internet must use, and it incorporates by reference the RFCs and other documents describing the current specifications for these protocols. It corrects errors in the referenced documents and adds additional discussion and guidance for an implementor.
For each protocol, this document also contains an explicit set of requirements, recommendations, and options. The reader must understand that the list of requirements in this document is incomplete by itself; the complete set of requirements for an Internet host is primarily defined in the standard protocol specification documents, with the corrections, amendments, and supplements contained in this RFC.
A good-faith implementation of the protocols that was produced after careful reading of the RFCs and with some interaction with the Internet technical community, and that followed good communications software engineering practices, should differ from the requirements of this document in only minor ways. Thus, in many cases, the "requirements" in this RFC are already stated or implied in the standard protocol documents, so that their inclusion here is, in a sense, redundant. However, they were included because some past implementation has made the wrong choice, causing problems of interoperability, performance, and/or robustness.
This document includes discussion and explanation of many of the requirements and recommendations. A simple list of requirements would be dangerous, because:
o Some required features are more important than others, and some features are optional.
o There may be valid reasons why particular vendor products that are designed for restricted contexts might choose to use different specifications.
However, the specifications of this document must be followed to meet the general goal of arbitrary host interoperation across the diversity and complexity of the Internet system. Although most current implementations fail to meet these requirements in various ways, some minor and some major, this specification is the ideal towards which we need to move.
These requirements are based on the current level of Internet architecture. This document will be updated as required to provide additional clarifications or to include additional information in those areas in which specifications are still evolving.
This introductory section begins with a brief overview of the Internet architecture as it relates to hosts, and then gives some general advice to host software vendors. Finally, there is some guidance on reading the rest of the document and some terminology.
1.1 The Internet Architecture
General background and discussion on the Internet architecture and
supporting protocol suite can be found in the DDN Protocol
Handbook [INTRO:3]; for further background see, for example,
[INTRO:9], [INTRO:10], and [INTRO:11]. Reference [INTRO:5]
describes the
procedure for obtaining Internet protocol documents, while
[INTRO:6] contains a list of the numbers assigned within Internet
protocols.
1.1.1 Internet Hosts
A host computer, or simply "host," is the ultimate consumer of
communication services. A host generally executes application
programs on behalf of user(s), employing network and/or
Internet communication services in support of this function.
An Internet host corresponds to the concept of an "End-System"
used in the OSI protocol suite [INTRO:13].
An Internet communication system consists of interconnected
packet networks supporting communication among host computers
using the Internet protocols. The networks are interconnected
using packet-switching computers called "gateways" or "IP
routers" by the Internet community, and "Intermediate Systems"
by the OSI world [INTRO:13]. The RFC "Requirements for
Internet Gateways" [INTRO:2] contains the official
specifications for Internet gateways. That RFC together with
the present document and its companion [INTRO:1] define the
rules for the current realization of the Internet architecture.
Internet hosts span a wide range of size, speed, and function.
They range in size from small microprocessors through
workstations to mainframes and supercomputers. In function,
they range from single-purpose hosts (such as terminal servers)
to full-service hosts that support a variety of online network
services, typically including remote login, file transfer, and
electronic mail.
A host is generally said to be multihomed if it has more than
one interface to the same or to different networks. See
Section 1.3.3 on "Terminology".
1.1.2 Architectural Assumptions
The current Internet architecture is based on a set of
assumptions about the communication system. The assumptions
most relevant to hosts are as follows:
(a) The Internet is a network of networks.
Each host is directly connected to some particular
network(s); its connection to the Internet is only
conceptual. Two hosts on the same network communicate
with each other using the same set of protocols that they
would use to communicate with hosts on distant networks.
(b) Gateways don't keep connection state information.
To improve robustness of the communication system,
gateways are designed to be stateless, forwarding each IP
datagram independently of other datagrams. As a result,
redundant paths can be exploited to provide robust service
in spite of failures of intervening gateways and networks.
All state information required for end-to-end flow control
and reliability is implemented in the hosts, in the
transport layer or in application programs. All
connection control information is thus co-located with the
end points of the communication, so it will be lost only
if an end point fails.
(c) Routing complexity should be in the gateways.
Routing is a complex and difficult problem, and ought to
be performed by the gateways, not the hosts. An important
objective is to insulate host software from changes caused
by the inevitable evolution of the Internet routing
architecture.
(d) The system must tolerate wide network variation.
A basic objective of the Internet design is to tolerate a
wide range of network characteristics -- e.g., bandwidth,
delay, packet loss, packet reordering, and maximum packet
size. Another objective is robustness against failure of
individual networks, gateways, and hosts, using whatever
bandwidth is still available. Finally, the goal is full
"open system interconnection": an Internet host must be
able to interoperate robustly and effectively with any
other Internet host, across diverse Internet paths.
Sometimes host implementors have designed for less
ambitious goals. For example, the LAN environment is
typically much more benign than the Internet as a whole;
LANs have low packet loss and delay and do not reorder
packets. Some vendors have fielded host implementations
that are adequate for a simple LAN environment, but work
badly for general interoperation. The vendor justifies
such a product as being economical within the restricted
LAN market. However, isolated LANs seldom stay isolated
for long; they are soon gatewayed to each other, to
organization-wide internets, and eventually to the global
Internet system. In the end, neither the customer nor the
vendor is served by incomplete or substandard Internet
host software.
The requirements spelled out in this document are designed
for a full-function Internet host, capable of full
interoperation over an arbitrary Internet path.
1.1.3 Internet Protocol Suite
To communicate using the Internet system, a host must implement
the layered set of protocols comprising the Internet protocol
suite. A host typically must implement at least one protocol
from each layer.
The protocol layers used in the Internet architecture are as
follows [INTRO:4]:
o Application Layer
The application layer is the top layer of the Internet
protocol suite. The Internet suite does not further
subdivide the application layer, although some of the
Internet application layer protocols do contain some
internal sub-layering. The application layer of the
Internet suite essentially combines the functions of the
top two layers -- Presentation and Application -- of the
OSI reference model.
We distinguish two categories of application layer
protocols: user protocols that provide service directly
to users, and support protocols that provide common system
functions. Requirements for user and support protocols
will be found in the companion RFC [INTRO:1].
The most common Internet user protocols are:
o Telnet (remote login)
o FTP (file transfer)
o SMTP (electronic mail delivery)
There are a number of other standardized user protocols
[INTRO:4] and many private user protocols.
Support protocols, used for host name mapping, booting,
and management, include SNMP, BOOTP, RARP, and the Domain
Name System (DNS) protocols.
o Transport Layer
The transport layer provides end-to-end communication
services for applications. There are two primary
transport layer protocols at present:
o Transmission Control Protocol (TCP)
o User Datagram Protocol (UDP)
TCP is a reliable connection-oriented transport service
that provides end-to-end reliability, resequencing, and
flow control. UDP is a connectionless ("datagram")
transport service.
Other transport protocols have been developed by the
research community, and the set of official Internet
transport protocols may be expanded in the future.
Transport layer protocols are discussed in Chapter 4.
o Internet Layer
All Internet transport protocols use the Internet Protocol
(IP) to carry data from source host to destination host.
IP is a connectionless or datagram internetwork service,
providing no end-to-end delivery guarantees. Thus, IP
datagrams may arrive at the destination host damaged,
duplicated, out of order, or not at all. The layers above
IP are responsible for reliable delivery service when it
is required. The IP protocol includes provision for
addressing, type-of-service specification, fragmentation
and reassembly, and security information.
The datagram or connectionless nature of the IP protocol
is a fundamental and characteristic feature of the
Internet architecture. Internet IP was the model for the
OSI Connectionless Network Protocol [INTRO:12].
ICMP is a control protocol that is considered to be an
integral part of IP, although it is architecturally
layered upon IP, i.e., it uses IP to carry its data end-
to-end just as a transport protocol like TCP or UDP does.
ICMP provides error reporting, congestion reporting, and
first-hop gateway redirection.
IGMP is an Internet layer protocol used for establishing
dynamic host groups for IP multicasting.
The Internet layer protocols IP, ICMP, and IGMP are
discussed in Chapter 3.
o Link Layer
To communicate on its directly-connected network, a host
must implement the communication protocol used to
interface to that network. We call this a link layer or
media-access layer protocol.
There is a wide variety of link layer protocols,
corresponding to the many different types of networks.
See Chapter 2.
1.1.4 Embedded Gateway Code
Some Internet host software includes embedded gateway
functionality, so that these hosts can forward packets as a
gateway would, while still performing the application layer
functions of a host.
Such dual-purpose systems must follow the Gateway Requirements
RFC [INTRO:2] with respect to their gateway functions, and
must follow the present document with respect to their host
functions. In all overlapping cases, the two specifications
should be in agreement.
There are varying opinions in the Internet community about
embedded gateway functionality. The main arguments are as
follows:
o Pro: in a local network environment where networking is
informal, or in isolated internets, it may be convenient
and economical to use existing host systems as gateways.
There is also an architectural argument for embedded
gateway functionality: multihoming is much more common
than originally foreseen, and multihoming forces a host to
make routing decisions as if it were a gateway. If the
multihomed host contains an embedded gateway, it will
have full routing knowledge and as a result will be able
to make more optimal routing decisions.
o Con: Gateway algorithms and protocols are still changing,
and they will continue to change as the Internet system
grows larger. Attempting to include a general gateway
function within the host IP layer will force host system
maintainers to track these (more frequent) changes. Also,
a larger pool of gateway implementations will make
coordinating the changes more difficult. Finally, the
complexity of a gateway IP layer is somewhat greater than
that of a host, making the implementation and operation
tasks more complex.
In addition, the style of operation of some hosts is not
appropriate for providing stable and robust gateway
service.
There is considerable merit in both of these viewpoints. One
conclusion can be drawn: a host administrator must have
conscious control over whether or not a given host acts as a
gateway. See Section 3.1 for the detailed requirements.
1.2 General Considerations
There are two important lessons that vendors of Internet host
software have learned and which a new vendor should consider
seriously.
1.2.1 Continuing Internet Evolution
The enormous growth of the Internet has revealed problems of
management and scaling in a large datagram-based packet
communication system. These problems are being addressed, and
as a result there will be continuing evolution of the
specifications described in this document. These changes will
be carefully planned and controlled, since there is extensive
participation in this planning by the vendors and by the
organizations responsible for operations of the networks.
Development, evolution, and revision are characteristic of
computer network protocols today, and this situation will
persist for some years. A vendor who develops computer
communication software for the Internet protocol suite (or any
other protocol suite!) and then fails to maintain and update
that software for changing specifications is going to leave a
trail of unhappy customers. The Internet is a large
communication network, and the users are in constant contact
through it. Experience has shown that knowledge of
deficiencies in vendor software propagates quickly through the
Internet technical community.
1.2.2 Robustness Principle
At every layer of the protocols, there is a general rule whose
application can lead to enormous benefits in robustness and
interoperability [IP:1]:
"Be liberal in what you accept, and
conservative in what you send"
Software should be written to deal with every conceivable
error, no matter how unlikely; sooner or later a packet will
come in with that particular combination of errors and
attributes, and unless the software is prepared, chaos can
ensue. In general, it is best to assume that the network is
filled with malevolent entities that will send in packets
designed to have the worst possible effect. This assumption
will lead to suitable protective design, although the most
serious problems in the Internet have been caused by
unenvisaged mechanisms triggered by low-probability events;
mere human malice would never have taken so devious a course!
Adaptability to change must be designed into all levels of
Internet host software. As a simple example, consider a
protocol specification that contains an enumeration of values
for a particular header field -- e.g., a type field, a port
number, or an error code; this enumeration must be assumed to
be incomplete. Thus, if a protocol specification defines four
possible error codes, the software must not break when a fifth
code shows up. An undefined code might be logged (see below),
but it must not cause a failure.
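As a minimal sketch of the error-code example above, the following Python fragment looks up a code in a table and falls back to logging when the code is not defined. The table contents, the logger name, and the function name describe_error_code are illustrative assumptions and are not taken from any protocol specification.

```python
import logging

logger = logging.getLogger("proto")  # hypothetical logger name

# Hypothetical table of the four error codes that some specification defines.
KNOWN_ERROR_CODES = {
    0: "network unreachable",
    1: "host unreachable",
    2: "protocol unreachable",
    3: "port unreachable",
}


def describe_error_code(code: int) -> str:
    """Return a description for an error code without failing on unknown codes."""
    name = KNOWN_ERROR_CODES.get(code)
    if name is None:
        # The enumeration is assumed to be incomplete: log the event and continue.
        logger.warning("undefined error code %d received", code)
        return f"unknown error code {code}"
    return name
```

With this arrangement a fifth code yields a log entry and a generic description rather than an exception.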
The second part of the principle is almost as important:
software on other hosts may contain deficiencies that make it
unwise to exploit legal but obscure protocol features. It is
unwise to stray far from the obvious and simple, lest untoward
effects result elsewhere. A corollary of this is "watch out
for misbehaving hosts"; host software should be prepared, not
just to survive other misbehaving hosts, but also to cooperate
to limit the amount of disruption such hosts can cause to the
shared communication facility.
1.2.3 Error Logging
The Internet includes a great variety of host and gateway
systems, each implementing many protocols and protocol layers,
and some of these contain bugs and mis-features in their
Internet protocol software. As a result of complexity,
diversity, and distribution of function, the diagnosis of
Internet problems is often very difficult.
Problem diagnosis will be aided if host implementations include
a carefully designed facility for logging erroneous or
"strange" protocol events. It is important to include as much
diagnostic information as possible when an error is logged. In
particular, it is often useful to record the header(s) of a
packet that caused an error. However, care must be taken to
ensure that error logging does not consume prohibitive amounts
of resources or otherwise interfere with the operation of the
host.
There is a tendency for abnormal but harmless protocol events
to overflow error logging files; this can be avoided by using a
"circular" log, or by enabling logging only while diagnosing a
known failure. It may be useful to filter and count duplicate
successive messages. One strategy that seems to work well is:
(1) always count abnormalities and make such counts accessible
through the management protocol (see [INTRO:1]); and (2) allow
the logging of a great variety of events to be selectively
enabled. For example, it might be useful to be able to "log
everything" or to "log everything for host X".
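The strategy described above might be sketched as follows, assuming an in-memory log. The class EventLog and all of its attribute names are hypothetical; a real host would also expose the counters through its management protocol, which is not shown here.

```python
from collections import Counter, deque


class EventLog:
    """Illustrative error log: always counts events, records them only when enabled."""

    def __init__(self, capacity=100):
        self.counts = Counter()                # (1) counters are always maintained
        self.records = deque(maxlen=capacity)  # "circular" log: oldest records drop off
        self.enabled_hosts = set()             # (2) "log everything for host X"
        self.log_everything = False            # (2) "log everything"

    def report(self, event, host, header=b""):
        """Count an abnormal event and record it if logging is enabled for the host."""
        self.counts[event] += 1
        if not (self.log_everything or host in self.enabled_hosts):
            return
        if self.records and self.records[-1][:3] == (event, host, header):
            # Duplicate of the previous record: bump its repeat count instead.
            last = self.records.pop()
            self.records.append((event, host, header, last[3] + 1))
        else:
            self.records.append((event, host, header, 1))
```

The bounded deque keeps harmless but frequent events from consuming unbounded storage, and the counters remain accurate even while detailed logging is switched off.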
Note that different managements may have differing policies
about the amount of error logging that they want normally
enabled in a host. Some will say, "if it doesn't hurt me, I
don't want to know about it", while others will want to take a
more watchful and aggressive attitude about detecting and
removing protocol abnormalities.
1.2.4 Configuration
It would be ideal if a host implementation of the Internet
protocol suite could be entirely self-configuring. This would
allow the whole suite to be implemented in ROM or cast into
silicon, it would simplify diskless workstations, and it would
be an immense boon to harried LAN administrators as well as
system vendors. We have not reached this ideal; in fact, we
are not even close.
At many points in this document, you will find a requirement
that a parameter be a configurable option. There are several
different reasons behind such requirements. In a few cases,
there is current uncertainty or disagreement about the best
value, and it may be necessary to update the recommended value
in the future. In other cases, the value really depends on
external factors -- e.g., the size of the host and the
distribution of its communication load, or the speeds and
topology of nearby networks -- and self-tuning algorithms are
unavailable and may be insufficient. In some cases,
configurability is needed because of administrative
requirements.
Finally, some configuration options are required to communicate
with obsolete or incorrect implementations of the protocols,
distributed without sources, that unfortunately persist in many
parts of the Internet. To make correct systems coexist with
these faulty systems, administrators often have to "mis-
configure" the correct systems. This problem will correct
itself gradually as the faulty systems are retired, but it
cannot be ignored by vendors.
When we say that a parameter must be configurable, we do not
intend to require that its value be explicitly read from a
configuration file at every boot time. We recommend that
implementors set up a default for each parameter, so a
configuration file is only necessary to override those defaults
that are inappropriate in a particular installation. Thus, the
configurability requirement is an assurance that it will be
POSSIBLE to override the default when necessary, even in a
binary-only or ROM-based product.
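One possible reading of this recommendation is sketched below: built-in defaults that apply with no configuration file at all, plus an optional set of overrides. The parameter names and values are placeholders chosen for illustration and are not values required by this document.

```python
# Hypothetical built-in defaults; names and values are placeholders only.
DEFAULTS = {
    "default_ttl": 64,
    "reassembly_timeout_s": 60,
}


def load_config(overrides=None):
    """Return the effective configuration: defaults, then any explicit overrides."""
    config = dict(DEFAULTS)  # copy so the built-in defaults are never modified
    for name, value in (overrides or {}).items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown configuration parameter: {name}")
        config[name] = value
    return config
```

Calling load_config() with no argument returns the defaults unchanged, so an override source is needed only where a default is inappropriate for a particular installation.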
This document requires a particular value for such defaults in
some cases. The choice of default is a sensitive issue when
the configuration item controls the accommodation to existing
faulty systems. If the Internet is to converge successfully to
complete interoperability, the default values built into
implementations must implement the official protocol, not
"mis-configurations" to accommodate faulty implementations.
Although marketing considerations have led some vendors to
choose mis-configuration defaults, we urge vendors to choose
defaults that will conform to the standard.
Finally, we note that a vendor needs to provide adequate
documentation on all configuration parameters, their limits and
effects.
1.3 Reading this Document
1.3.1 Organization
Protocol layering, which is generally used as an organizing
principle in implementing network software, has also been used
to organize this document. In describing the rules, we assume
that an implementation does strictly mirror the layering of the
protocols. Thus, the following three major sections specify
the requirements for the link layer, the internet layer, and
the transport layer, respectively. A companion RFC [INTRO:1]
covers application level software. This layerist organization
was chosen for simplicity and clarity.
However, strict layering is an imperfect model, both for the
protocol suite and for recommended implementation approaches.
Protocols in different layers interact in complex and sometimes
subtle ways, and particular functions often involve multiple
layers. There are many design choices in an implementation,
many of which involve creative "breaking" of strict layering.
Every implementor is urged to read references [INTRO:7] and
[INTRO:8].
This document describes the conceptual service interface
between layers using a functional ("procedure call") notation,
like that used in the TCP specification [TCP:1]. A host
implementation must support the logical information flow
implied by these calls, but need not literally implement the
calls themselves. For example, many implementations reflect
the coupling between the transport layer and the IP layer by
giving them shared access to common data structures. These
data structures, rather than explicit procedure calls, are then
the agency for passing much of the information that is
required.
In general, each major section of this document is organized
into the following subsections:
(1) Introduction
(2) Protocol Walk-Through -- considers the protocol
specification documents section-by-section, correcting
errors, stating requirements that may be ambiguous or
ill-defined, and providing further clarification or
explanation.
(3) Specific Issues -- discusses protocol design and
implementation issues that were not included in the walk-
through.
(4) Interfaces -- discusses the service interface to the next
higher layer.
(5) Summary -- contains a summary of the requirements of the
section.
Under many of the individual topics in this document, there is
parenthetical material labeled "DISCUSSION" or
"IMPLEMENTATION". This material is intended to give
clarification and explanation of the preceding requirements
text. It also includes some suggestions on possible future
directions or developments. The implementation material
contains suggested approaches that an implementor may want to
consider.
The summary sections are intended to be guides and indexes to
the text, but are necessarily cryptic and incomplete. The
summaries should never be used or referenced separately from
the complete RFC.
1.3.2 Requirements
In this document, the words that are used to define the
significance of each particular requirement are capitalized.
These words are:
* "MUST"
This word or the adjective "REQUIRED" means that the item
is an absolute requirement of the specification.
* "SHOULD"
This word or the adjective "RECOMMENDED" means that there
may exist valid reasons in particular circumstances to
ignore this item, but the full implications should be
understood and the case carefully weighed before choosing
a different course.
* "MAY"
This word or the adjective "OPTIONAL" means that this item
is truly optional. One vendor may choose to include the
item because a particular marketplace requires it or
because it enhances the product, for example; another
vendor may omit the same item.
An implementation is not compliant if it fails to satisfy one
or more of the MUST requirements for the protocols it
implements. An implementation that satisfies all the MUST and
all the SHOULD requirements for its protocols is said to be
"unconditionally compliant"; one that satisfies all the MUST
requirements but not all the SHOULD requirements for its
protocols is said to be "conditionally compliant".
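The three outcomes defined above can be expressed as a small decision function. This is only a sketch: the function name compliance_level and the representation of requirements as lists of booleans (True meaning "satisfied") are assumptions made for illustration.

```python
def compliance_level(must_results, should_results):
    """Classify an implementation from its MUST and SHOULD results.

    Each argument is an iterable of booleans, one per requirement,
    where True means the requirement is satisfied.
    """
    if not all(must_results):
        # Failing one or more MUST requirements means not compliant.
        return "not compliant"
    if all(should_results):
        return "unconditionally compliant"
    return "conditionally compliant"
```

MAY items do not appear in the function because they do not affect compliance.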
1.3.3 Terminology
This document uses the following technical terms:
Segment
A segment is the unit of end-to-end transmission in the
TCP protocol. A segment consists of a TCP header followed
by application data. A segment is transmitted by
encapsulation inside an IP datagram.
Message
In this description of the lower-layer protocols, a
message is the unit of transmission in a transport layer
protocol. In particular, a TCP segment is a message. A
message consists of a transport protocol header followed
by application protocol data. To be transmitted end-to-
end through the Internet, a message must be encapsulated
inside a datagram.
IP Datagram
An IP datagram is the unit of end-to-end transmission in
the IP protocol. An IP datagram consists of an IP header
followed by transport layer data, i.e., of an IP header
followed by a message.
In the description of the internet layer (Section 3), the
unqualified term "datagram" should be understood to refer
to an IP datagram.
Packet
A packet is the unit of data passed across the interface
between the internet layer and the link layer. It
includes an IP header and data. A packet may be a
complete IP datagram or a fragment of an IP datagram.
Frame
A frame is the unit of transmission in a link layer
protocol, and consists of a link-layer header followed by
a packet.
Connected Network
A network to which a host is interfaced is often known as
the "local network" or the "subnetwork" relative to that
host. However, these terms can cause confusion, and
therefore we use the term "connected network" in this
document.
Multihomed
A host is said to be multihomed if it has multiple IP
addresses. For a discussion of multihoming, see Section
3.3.4 below.
Physical network interface
This is a physical interface to a connected network and
has a (possibly unique) link-layer address. Multiple
physical network interfaces on a single host may share the
same link-layer address, but the address must be unique
for different hosts on the same physical network.
Logical [network] interface
We define a logical [network] interface to be a logical
path, distinguished by a unique IP address, to a connected
network. See Section 3.3.4.
Specific-destination address
This is the effective destination address of a datagram,
even if it is broadcast or multicast; see Section 3.2.1.3.
Path
At a given moment, all the IP datagrams from a particular
source host to a particular destination host will
typically traverse the same sequence of gateways. We use
the term "path" for this sequence. Note that a path is
uni-directional; it is not unusual to have different paths
in the two directions between a given host pair.
MTU
The maximum transmission unit, i.e., the size of the
largest packet that can be transmitted.
The terms frame, packet, datagram, message, and segment are
illustrated by the following schematic diagrams:
A. Transmission on connected network:
  _______________________________________________
 | LL hdr | IP hdr |         (data)              |
 |________|________|_____________________________|
  <---------- Frame ----------------------------->
           <----------Packet -------------------->
B. Before IP fragmentation or after IP reassembly:
  ______________________________________
 | IP hdr | transport| Application Data |
 |________|____hdr___|__________________|
  <-------- Datagram ------------------>
           <-------- Message ----------->
or, for TCP:
  ______________________________________
 | IP hdr | TCP hdr  | Application Data |
 |________|__________|__________________|
  <-------- Datagram ------------------>
           <-------- Segment ----------->
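The nesting of these units can also be sketched in Python. This is a minimal illustration, not a protocol implementation: the header contents are opaque placeholders, and the header sizes (14 octets for the link layer, 20 for IP and for TCP without options) and the name `encapsulate` are assumptions chosen for the example.

```python
LL_HDR = b"L" * 14    # placeholder link-layer header (assumed Ethernet-sized)
IP_HDR = b"I" * 20    # placeholder IP header, no options
TCP_HDR = b"T" * 20   # placeholder TCP header, no options


def encapsulate(app_data: bytes) -> dict:
    """Build each unit named in the terminology list, innermost first."""
    segment = TCP_HDR + app_data   # message (here, a TCP segment)
    datagram = IP_HDR + segment    # IP datagram: IP header + message
    packet = datagram              # unfragmented case: packet == whole datagram
    frame = LL_HDR + packet        # frame: link-layer header + packet
    return {
        "segment": segment,
        "datagram": datagram,
        "packet": packet,
        "frame": frame,
    }
```

When fragmentation occurs, each packet would instead carry an IP header and one fragment of the datagram; that case is not modeled here.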
1.4 Acknowledgments
This document incorporates contributions and comments from a large
group of Internet protocol experts, including representatives of
university and research labs, vendors, and government agencies.
It was assembled primarily by the Host Requirements Working Group
of the Internet Engineering Task Force (IETF).
The Editor would especially like to acknowledge the tireless
dedication of the following people, who attended many long
meetings and generated 3 million bytes of electronic mail over the
past 18 months in pursuit of this document: Philip Almquist, Dave
Borman (Cray Research), Noel Chiappa, Dave Crocker (DEC), Steve
Deering (Stanford), Mike Karels (Berkeley), Phil Karn (Bellcore),
John Lekashman (NASA), Charles Lynn (BBN), Keith McCloghrie (TWG),
Paul Mockapetris (ISI), Thomas Narten (Purdue), Craig Partridge
(BBN), Drew Perkins (CMU), and James Van Bokkelen (FTP Software).
In addition, the following people made major contributions to the
effort: Bill Barns (Mitre), Steve Bellovin (AT&T), Mike Brescia
(BBN), Ed Cain (DCA), Annette DeSchon (ISI), Martin Gross (DCA),
Phill Gross (NRI), Charles Hedrick (Rutgers), Van Jacobson (LBL),
John Klensin (MIT), Mark Lottor (SRI), Milo Medin (NASA), Bill
Melohn (Sun Microsystems), Greg Minshall (Kinetics), Jeff Mogul
(DEC), John Mullen (CMC), Jon Postel (ISI), John Romkey (Epilogue
Technology), and Mike StJohns (DCA). The following also made
significant contributions to particular areas: Eric Allman
(Berkeley), Rob Austein (MIT), Art Berggreen (ACC), Keith Bostic
(Berkeley), Vint Cerf (NRI), Wayne Hathaway (NASA), Matt Korn
(IBM), Erik Naggum (Naggum Software, Norway), Robert Ullmann
(Prime Computer), David Waitzman (BBN), Frank Wancho (USA), Arun
Welch (Ohio State), Bill Westfield (Cisco), and Rayan Zachariassen
(Toronto).
We are grateful to all, including any contributors who may have
been inadvertently omitted from this list.
2. LINK LAYER
2.1 INTRODUCTION
All Internet systems, both hosts and gateways, have the same requirements for link layer protocols. These requirements are given in Chapter 3 of “Requirements for Internet Gateways” [INTRO:2], augmented with the material in this section.
2.2 PROTOCOL WALK-THROUGH
None.
2.3 SPECIFIC ISSUES
2.3.1 Trailer Protocol Negotiation
The trailer protocol [LINK:1] for link-layer encapsulation MAY be
used, but only when it has been verified that both systems (host
or gateway) involved in the link-layer communication implement
trailers. If the system does not dynamically negotiate use of the
trailer protocol on a per-destination basis, the default
configuration MUST disable the protocol.
DISCUSSION:
The trailer protocol is a link-layer encapsulation
technique that rearranges the data contents of packets
sent on the physical network. In some cases, trailers
improve the throughput of higher layer protocols by
reducing the amount of data copying within the operating
system. Higher layer protocols are unaware of trailer
use, but both the sending and receiving host MUST
understand the protocol if it is used.
Improper use of trailers can result in very confusing
symptoms. Only packets with specific size attributes are
encapsulated using trailers, and typically only a small
fraction of the packets being exchanged have these
attributes. Thus, if a system using trailers exchanges
packets with a system that does not, some packets
disappear into a black hole while others are delivered
successfully.
IMPLEMENTATION:
On an Ethernet, packets encapsulated with trailers use a
distinct Ethernet type [LINK:1], and trailer negotiation
is performed at the time that ARP is used to discover the
link-layer address of a destination system.
Specifically, the ARP exchange is completed in the usual
manner using the normal IP protocol type, but a host that
wants to speak trailers will send an additional "trailer
ARP reply" packet, i.e., an ARP reply that specifies the
trailer encapsulation protocol type but otherwise has the
format of a normal ARP reply. If a host configured to use
trailers receives a trailer ARP reply message from a
remote machine, it can add that machine to the list of
machines that understand trailers, e.g., by marking the
corresponding entry in the ARP cache.
Hosts wishing to receive trailer encapsulations send
trailer ARP replies whenever they complete exchanges of
normal ARP messages for IP. Thus, a host that received an
ARP request for its IP protocol address would send a
trailer ARP reply in addition to the normal IP ARP reply;
a host that sent the IP ARP request would send a trailer
ARP reply when it received the corresponding IP ARP reply.
In this way, either the requesting or responding host in
an IP ARP exchange may request that it receive trailer
encapsulations.
This scheme, using extra trailer ARP reply packets rather
than sending an ARP request for the trailer protocol type,
was designed to avoid a continuous exchange of ARP packets
with a misbehaving host that, contrary to any
specification or common sense, responded to an ARP reply
for trailers with another ARP reply for IP. This problem
is avoided by sending a trailer ARP reply in response to
an IP ARP reply only when the IP ARP reply answers an
outstanding request; this is true when the hardware
address for the host is still unknown when the IP ARP
reply is received. A trailer ARP reply may always be sent
along with an IP ARP reply responding to an IP ARP
request.
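The negotiation rules described above can be sketched as a small state machine in Python. This is a hedged illustration: the class name `TrailerNegotiator`, its method names, and the string labels for outgoing packets are hypothetical, and the sketch assumes that the sender of an IP ARP request is recorded in the ARP cache when the request is answered.

```python
class TrailerNegotiator:
    """Decides which ARP replies to emit; packets are modeled as labels."""

    def __init__(self, want_trailers=True):
        self.want_trailers = want_trailers
        self.arp_cache = {}          # IP address -> hardware address
        self.trailer_peers = set()   # peers that asked to receive trailers

    def on_ip_arp_request(self, sender_ip, sender_hw):
        # A trailer ARP reply may always accompany the normal IP ARP reply.
        self.arp_cache[sender_ip] = sender_hw
        replies = ["ip-arp-reply"]
        if self.want_trailers:
            replies.append("trailer-arp-reply")
        return replies

    def on_ip_arp_reply(self, sender_ip, sender_hw):
        # Respond with a trailer ARP reply only if this reply answers an
        # outstanding request, i.e. the hardware address was still unknown.
        was_unknown = sender_ip not in self.arp_cache
        self.arp_cache[sender_ip] = sender_hw
        if self.want_trailers and was_unknown:
            return ["trailer-arp-reply"]
        return []

    def on_trailer_arp_reply(self, sender_ip, sender_hw):
        # Record the peer; never answer a reply, which prevents reply loops.
        if self.want_trailers:
            self.trailer_peers.add(sender_ip)
        return []
```

Because a second IP ARP reply from an already-known host produces no output, a misbehaving peer that answers trailer replies with IP replies cannot sustain a continuous exchange.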
2.3.2 Address Resolution Protocol -- ARP
2.3.2.1 ARP Cache Validation
An implementation of the Address Resolution Protocol (ARP)
[LINK:2] MUST provide a mechanism to flush out-of-date cache
entries. If this mechanism involves a timeout, it SHOULD be
possible to configure the timeout value.
A mechanism to prevent ARP flooding (repeatedly sending an
ARP Request for the same IP address, at a high rate) MUST be
included. The recommended maximum rate is 1 per second per
destination.
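One possible way to enforce the recommended rate is a per-destination timestamp check, sketched below in Python. The class name `ArpRequestLimiter` and the convention of passing the current time explicitly (in seconds) are assumptions made for the example.

```python
class ArpRequestLimiter:
    """Allows at most one ARP Request per destination per interval."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval   # seconds between requests
        self.last_sent = {}                # target IP -> time of last request

    def may_send(self, target_ip, now):
        last = self.last_sent.get(target_ip)
        if last is not None and now - last < self.min_interval:
            return False                   # too soon: suppress this request
        self.last_sent[target_ip] = now
        return True
```

The limit is tracked per destination, so a burst of requests for one unresolved address does not delay resolution of a different address.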
DISCUSSION:
The ARP specification [LINK:2] suggests but does not
require a timeout mechanism to invalidate cache entries
when hosts change their Ethernet addresses. The
prevalence of proxy ARP (see Section 2.4 of [INTRO:2])
has significantly increased the likelihood that cache
entries in hosts will become invalid, and therefore
some ARP-cache invalidation mechanism is now required
for hosts. Even in the absence of proxy ARP, a long-
period cache timeout is useful in order to
automatically correct any bad ARP data that might have
been cached.
IMPLEMENTATION:
Four mechanisms have been used, sometimes in
combination, to flush out-of-date cache entries.
(1) Timeout -- Periodically time out cache entries,
even if they are in use. Note that this timeout
should be restarted when the cache entry is
"refreshed" (by observing the source fields,
regardless of target address, of an ARP broadcast
from the system in question). For proxy ARP
situations, the timeout needs to be on the order
of a minute.
(2) Unicast Poll -- Actively poll the remote host by
periodically sending a point-to-point ARP Request
to it, and delete the entry if no ARP Reply is
received from N successive polls. Again, the
timeout should be on the order of a minute, and
typically N is 2.
(3) Link-Layer Advice -- If the link-layer driver
detects a delivery problem, flush the
corresponding ARP cache entry.
(4) Higher-layer Advice -- Provide a call from the
Internet layer to the link layer to indicate a
delivery problem. The effect of this call would
be to invalidate the corresponding cache entry.
This call would be analogous to the
"ADVISE_DELIVPROB()" call from the transport layer
to the Internet layer (see Section 3.4), and in
fact the ADVISE_DELIVPROB routine might in turn
call the link-layer advice routine to invalidate
the ARP cache entry.
Approaches (1) and (2) involve ARP cache timeouts on
the order of a minute or less. In the absence of proxy
ARP, a timeout this short could create noticeable
overhead traffic on a very large Ethernet. Therefore,
it may be necessary to configure a host to lengthen the
ARP cache timeout.
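A minimal Python sketch of mechanisms (1), (3), and (4) follows. It is illustrative only: the class `ArpCache`, its method names, and the 60-second default are assumptions for the example, and the unicast poll of mechanism (2) is not modeled.

```python
class ArpCache:
    def __init__(self, timeout=60.0):
        self.timeout = timeout   # configurable timeout, in seconds
        self.entries = {}        # IP address -> (hardware address, expiry time)

    def refresh(self, ip, hw_addr, now):
        """Install or refresh an entry, restarting its timeout."""
        self.entries[ip] = (hw_addr, now + self.timeout)

    def lookup(self, ip, now):
        entry = self.entries.get(ip)
        if entry is None:
            return None
        hw_addr, expiry = entry
        if now >= expiry:
            # Mechanism (1): expire on time alone; use does not extend life.
            del self.entries[ip]
            return None
        return hw_addr

    def advise_delivery_problem(self, ip):
        """Mechanisms (3) and (4): flush on advice from another layer."""
        self.entries.pop(ip, None)
```

Here `lookup` deliberately does not restart the timer; only `refresh`, which would be driven by observed ARP traffic from the system in question, extends the life of an entry.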
2.3.2.2 ARP Packet Queue
The link layer SHOULD save (rather than discard) at least
one (the latest) packet of each set of packets destined to
the same unresolved IP address, and transmit the saved
packet when the address has been resolved.
DISCUSSION:
Failure to follow this recommendation causes the first
packet of every exchange to be lost. Although higher-
layer protocols can generally cope with packet loss by
retransmission, packet loss does impact performance.
For example, loss of a TCP open request causes the
initial round-trip time estimate to be inflated. UDP-
based applications such as the Domain Name System are
more seriously affected.
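The recommendation can be met with a one-slot queue per unresolved address, as in the following Python sketch. The names `PendingQueue`, `enqueue`, and `resolved`, and the `send` callback signature, are hypothetical.

```python
class PendingQueue:
    """Holds the latest packet for each unresolved IP address."""

    def __init__(self):
        self.pending = {}   # unresolved IP address -> latest packet

    def enqueue(self, ip, packet):
        # A newer packet replaces any older one for the same address.
        self.pending[ip] = packet

    def resolved(self, ip, send):
        # Transmit the saved packet, if any, once the address is known.
        packet = self.pending.pop(ip, None)
        if packet is not None:
            send(ip, packet)
```

Keeping more than one packet per address is permitted by the "at least one" wording; a single slot is simply the smallest design that satisfies it.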
2.3.3 Ethernet and IEEE 802 Encapsulation
The IP encapsulation for Ethernets is described in RFC-894
[LINK:3], while RFC-1042 [LINK:4] describes the IP
encapsulation for IEEE 802 networks. RFC-1042 elaborates and
replaces the discussion in Section 3.4 of [INTRO:2].
Every Internet host connected to a 10Mbps Ethernet cable:
o MUST be able to send and receive packets using RFC-894
encapsulation;
o SHOULD be able to receive RFC-1042 packets, intermixed
with RFC-894 packets; and
o MAY be able to send packets using RFC-1042 encapsulation.
An Internet host that implements sending both the RFC-894 and
the RFC-1042 encapsulations MUST provide a configuration switch
to select which is sent, and this switch MUST default to RFC-
894.
Note that the standard IP encapsulation in RFC-1042 does not
use the protocol id value (K1=6) that IEEE reserved for IP;
instead, it uses a value (K1=170) that implies an extension
(the "SNAP") which can be used to hold the Ether-Type field.
An Internet system MUST NOT send 802 packets using K1=6.
Address translation from Internet addresses to link-layer
addresses on Ethernet and IEEE 802 networks MUST be managed by
the Address Resolution Protocol (ARP).
The MTU for an Ethernet is 1500 and for 802.3 is 1492.
DISCUSSION:
The IEEE 802.3 specification provides for operation over a
10Mbps Ethernet cable, in which case Ethernet and IEEE
802.3 frames can be physically intermixed. A receiver can
distinguish Ethernet and 802.3 frames by the value of the
802.3 Length field; this two-octet field coincides in the
header with the Ether-Type field of an Ethernet frame. In
particular, the 802.3 Length field must be less than or
equal to 1500, while all valid Ether-Type values are
greater than 1500.
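The receiver's test can be sketched in Python as follows. The function name `classify_frame` is hypothetical; the sketch assumes the frame starts with the two six-octet hardware addresses, so that the two-octet Length/Ether-Type field occupies octets 12 and 13.

```python
import struct


def classify_frame(frame: bytes) -> str:
    """Return "802.3" or "ethernet" based on the Length/Ether-Type field."""
    if len(frame) < 14:
        raise ValueError("frame shorter than a 14-octet header")
    # Two-octet field in network byte order, after the two 6-octet addresses.
    (field,) = struct.unpack("!H", frame[12:14])
    # Values up to 1500 are an 802.3 length; larger values are an Ether-Type.
    return "802.3" if field <= 1500 else "ethernet"
```

For example, a frame carrying the IP Ether-Type 0x0800 (2048) is classified as Ethernet, while a frame whose field holds 1492 is classified as 802.3.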
Another compatibility problem arises with link-layer
broadcasts. A broadcast sent with one framing will not be
seen by hosts that can receive only the other framing.
The provisions of this section were designed to provide
direct interoperation between 894-capable and 1042-capable
systems on the same cable, to the maximum extent possible.
It is intended to support the present situation where
894-only systems predominate, while providing an easy
transition to a possible future in which 1042-capable
systems become common.
Note that 894-only systems cannot interoperate directly
with 1042-only systems. If the two system types are set
up as two different logical networks on the same cable,
they can communicate only through an IP gateway.
Furthermore, it is not useful or even possible for a
dual-format host to discover automatically which format to
send, because of the problem of link-layer broadcasts.
2.4 LINK/INTERNET LAYER INTERFACE
The packet receive interface between the IP layer and the link
layer MUST include a flag to indicate whether the incoming packet
was addressed to a link-layer broadcast address.
DISCUSSION:
Although the IP layer does not generally know link layer
addresses (since every different network medium typically has
a different address format), the broadcast address on a
broadcast-capable medium is an important special case. See
Section 3.2.2, especially the DISCUSSION concerning broadcast
storms.
The packet send interface between the IP and link layers MUST
include the 5-bit TOS field (see Section 3.2.1.6).
The link layer MUST NOT report a Destination Unreachable error to
IP solely because there is no ARP cache entry for a destination.
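The three interface requirements of this section might be reflected in code along the following lines. This Python sketch is an assumption-laden illustration: the names `ReceivedPacket`, `SendStatus`, `link_receive`, and `link_send` are hypothetical, an Ethernet medium is assumed for the broadcast address, and actual transmission and ARP resolution are omitted.

```python
from dataclasses import dataclass
from enum import Enum

ETHERNET_BROADCAST = b"\xff" * 6   # assumed medium: Ethernet


@dataclass
class ReceivedPacket:
    data: bytes
    link_broadcast: bool   # True if sent to a link-layer broadcast address


class SendStatus(Enum):
    SENT = 1
    QUEUED_FOR_ARP = 2     # resolution pending; not a Destination Unreachable


def link_receive(dest_hw: bytes, data: bytes) -> ReceivedPacket:
    """Receive side: pass the broadcast indication up to the IP layer."""
    return ReceivedPacket(data, dest_hw == ETHERNET_BROADCAST)


def link_send(arp_cache: dict, next_hop_ip: str, packet: bytes, tos: int) -> SendStatus:
    """Send side: accepts the TOS value from IP (transmission not modeled)."""
    if not 0 <= tos < 32:
        raise ValueError("TOS is a 5-bit field")
    if next_hop_ip not in arp_cache:
        # A missing ARP entry alone is not reported as an error to IP.
        return SendStatus.QUEUED_FOR_ARP
    return SendStatus.SENT
```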
2.5 LINK LAYER REQUIREMENTS SUMMARY
                                                |       | | | |S| |
                                                |       | | | |H| |F
                                                |       | | | |O|M|o
                                                |       | |S| |U|U|o
                                                |       | |H| |L|S|t
                                                |       |M|O| |D|T|n
                                                |       |U|U|M| | |o
                                                |       |S|L|A|N|N|t
                                                |       |T|D|Y|O|O|t
FEATURE                                         |SECTION| | | |T|T|e
------------------------------------------------|-------|-|-|-|-|-|--
                                                |       | | | | | |
Trailer encapsulation                           |2.3.1  | | |x| | |
Send Trailers by default without negotiation    |2.3.1  | | | | |x|
ARP                                             |2.3.2  | | | | | |
  Flush out-of-date ARP cache entries           |2.3.2.1|x| | | | |
  Prevent ARP floods                            |2.3.2.1|x| | | | |
  Cache timeout configurable                    |2.3.2.1| |x| | | |
  Save at least one (latest) unresolved pkt     |2.3.2.2| |x| | | |
Ethernet and IEEE 802 Encapsulation             |2.3.3  | | | | | |
  Host able to:                                 |2.3.3  | | | | | |
    Send & receive RFC-894 encapsulation        |2.3.3  |x| | | | |
    Receive RFC-1042 encapsulation              |2.3.3  | |x| | | |
    Send RFC-1042 encapsulation                 |2.3.3  | | |x| | |
      Then config. sw. to select, RFC-894 dflt  |2.3.3  |x| | | | |
  Send K1=6 encapsulation                       |2.3.3  | | | | |x|
  Use ARP on Ethernet and IEEE 802 nets         |2.3.3  |x| | | | |
Link layer report b'casts to IP layer           |2.4    |x| | | | |
IP layer pass TOS to link layer                 |2.4    |x| | | | |
No ARP cache entry treated as Dest. Unreach.    |2.4    | | | | |x|
3. INTERNET LAYER PROTOCOLS
3.1 INTRODUCTION
The Robustness Principle: “Be liberal in what you accept, and conservative in what you send” is particularly important in the Internet layer, where one misbehaving host can deny Internet service to many other hosts.
The protocol standards used in the Internet layer are:
o RFC-791 [IP:1] defines the IP protocol and gives an introduction to the architecture of the Internet.
o RFC-792 [IP:2] defines ICMP, which provides routing, diagnostic and error functionality for IP. Although ICMP messages are encapsulated within IP datagrams, ICMP processing is considered to be (and is typically implemented as) part of the IP layer. See Section 3.2.2.
o RFC-950 [IP:3] defines the mandatory subnet extension to the addressing architecture.
o RFC-1112 [IP:4] defines the Internet Group Management Protocol IGMP, as part of a recommended extension to hosts and to the host-gateway interface to support Internet-wide multicasting at the IP level. See Section 3.2.3.
The target of an IP multicast may be an arbitrary group of Internet hosts. IP multicasting is designed as a natural extension of the link-layer multicasting facilities of some networks, and it provides a standard means for local access to such link-layer multicasting facilities.
Other important references are listed in Section 5 of this document.
The Internet layer of host software MUST implement both IP and ICMP. See Section 3.3.7 for the requirements on support of IGMP.
The host IP layer has two basic functions: (1) choose the “next hop” gateway or host for outgoing IP datagrams and (2) reassemble incoming IP datagrams. The IP layer may also (3) implement intentional fragmentation of outgoing datagrams. Finally, the IP layer must (4) provide diagnostic and error functionality. We expect that IP layer functions may increase somewhat in the future, as further Internet control and management facilities are developed.
For normal datagrams, the processing is straightforward. For
incoming datagrams, the IP layer:
(1) verifies that the datagram is correctly formatted;
(2) verifies that it is destined to the local host;
(3) processes options;
(4) reassembles the datagram if necessary; and
(5) passes the encapsulated message to the appropriate
transport-layer protocol module.
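The five input steps can be outlined in Python as below. This is only a structural sketch under several assumptions: a datagram is modeled as a dictionary with hypothetical keys (`well_formed`, `dst`, `protocol`, `payload`), reassembly is delegated to a caller-supplied function that returns None until a datagram is complete, and option processing and error reporting are omitted.

```python
def ip_input(dgram, local_addrs, reassembler, transports):
    """Walk one incoming datagram through input steps (1) to (5)."""
    if not dgram.get("well_formed", False):      # (1) format check
        return "discard"
    if dgram["dst"] not in local_addrs:          # (2) destined to this host?
        return "discard"
    # (3) option processing would occur here; omitted in this sketch.
    whole = reassembler(dgram)                   # (4) reassemble if necessary
    if whole is None:
        return "held"                            # still waiting for fragments
    handler = transports.get(whole["protocol"])  # (5) select transport module
    if handler is None:
        return "no-transport"                    # handling not modeled here
    handler(whole["payload"])
    return "delivered"
```

The string return values exist only to make the control flow visible; a real implementation would follow the detailed requirements given later in this section for each failure case.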
For outgoing datagrams, the IP layer:
(1) sets any fields not set by the transport layer;
(2) selects the correct first hop on the connected network (a
process called "routing");
(3) fragments the datagram if necessary and if intentional
fragmentation is implemented (see Section 3.3.3); and
(4) passes the packet(s) to the appropriate link-layer driver.
A host is said to be multihomed if it has multiple IP addresses.
Multihoming introduces considerable confusion and complexity into
the protocol suite, and it is an area in which the Internet
architecture falls seriously short of solving all problems. There
are two distinct problem areas in multihoming:
(1) Local multihoming -- the host itself is multihomed; or
(2) Remote multihoming -- the local host needs to communicate
with a remote multihomed host.
At present, remote multihoming MUST be handled at the application
layer, as discussed in the companion RFC [INTRO:1]. A host MAY
support local multihoming, which is discussed in this document,
and in particular in Section 3.3.4.
Any host that forwards datagrams generated by another host is
acting as a gateway and MUST also meet the specifications laid out
in the gateway requirements RFC [INTRO:2]. An Internet host that
includes embedded gateway code MUST have a configuration switch to
disable the gateway function, and this switch MUST default to the
non-gateway mode. In this mode, a datagram arriving through one
interface will not be forwarded to another host or gateway (unless
it is source-routed), regardless of whether the host is single-
homed or multihomed. The host software MUST NOT automatically
move into gateway mode if the host has more than one interface, as
the operator of the machine may neither want to provide that
service nor be competent to do so.
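The configuration rule can be illustrated with a short Python sketch. The names `HostConfig` and `may_forward` are hypothetical, and the treatment of source-routed datagrams is reduced to a single flag; their actual handling is governed by Section 3.3.5.

```python
class HostConfig:
    def __init__(self, interface_count, gateway_enabled=False):
        self.interface_count = interface_count
        # Explicit operator switch, defaulting to non-gateway mode.
        # It is never derived from interface_count.
        self.gateway_enabled = gateway_enabled


def may_forward(config, source_routed=False):
    """Whether a datagram not addressed to this host may be forwarded."""
    # Source-routed datagrams are the stated exception to non-gateway mode.
    return config.gateway_enabled or source_routed
```

A multihomed host built with `HostConfig(2)` therefore does not forward ordinary datagrams until the operator sets the switch explicitly.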
In the following, the action specified in certain cases is to
"silently discard" a received datagram. This means that the
datagram will be discarded without further processing and that the
host will not send any ICMP error message (see Section 3.2.2) as a
result. However, for diagnosis of problems a host SHOULD provide
the capability of logging the error (see Section 1.2.3), including
the contents of the silently-discarded datagram, and SHOULD record
the event in a statistics counter.
DISCUSSION:
Silent discard of erroneous datagrams is generally intended
to prevent "broadcast storms".
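A silent-discard routine that still supports diagnosis might look like the following Python sketch. The logger name "ip", the class `IpStats`, and the function `silently_discard` are assumptions for the example; the essential points are that a counter is incremented, the datagram contents can be logged, and no ICMP error is generated.

```python
import logging

log = logging.getLogger("ip")   # hypothetical logger name


class IpStats:
    def __init__(self):
        self.silently_discarded = 0   # statistics counter for this event


def silently_discard(dgram: bytes, reason: str, stats: IpStats) -> None:
    """Drop a datagram without sending any ICMP error message."""
    stats.silently_discarded += 1
    # Log the reason and the datagram contents for later diagnosis.
    log.debug("silently discarded datagram (%s): %s", reason, dgram.hex())
    # Intentionally no further processing and no ICMP message.
```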
3.2 PROTOCOL WALK-THROUGH
3.2.1 Internet Protocol -- IP
3.2.1.1 Version Number: RFC-791 Section 3.1
A datagram whose version number is not 4 MUST be silently
discarded.
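As an illustration, the version check can be written in Python as below. The function names are hypothetical; the sketch relies on the RFC-791 header layout, in which the version number occupies the high-order four bits of the first octet.

```python
def ip_version(dgram: bytes) -> int:
    """Extract the version number from the first octet of the header."""
    if not dgram:
        raise ValueError("empty datagram")
    return dgram[0] >> 4   # high-order four bits


def accept_version(dgram: bytes) -> bool:
    """True if the datagram passes the version check; otherwise the
    caller is expected to silently discard it."""
    return ip_version(dgram) == 4
```

For example, a header whose first octet is 0x45 (version 4, header length 5 words) is accepted, while one beginning with 0x60 is not.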
3.2.1.2