IEN 188
                     ISSUES IN INTERNETTING
                       PART 3:  ADDRESSING
                          Eric C. Rosen
                  Bolt Beranek and Newman Inc.
                            June 1981
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
                     ISSUES IN INTERNETTING
                       PART 3:  ADDRESSING
3.  Addressing
     This is the third in a series of  papers  that  discuss  the
issues  involved in the design of an internet.  The initial paper
was IEN 184, familiarity with which is presupposed.
     In this paper, we will deal  with  two  basic  issues.   The
first  has  to  do  with  the  Network  Access  Protocol.   It is
concerned with the sort of addressing information which a  source
Host  has  to  supply,  along  with  its data, to a source Switch
(gateway, in the Catenet context), in order to enable the  Switch
to  get  the  data delivered to the proper destination Host.  The
second issue has to do with the  question  of  how  the  Switches
(both  source  Switch  and  the  intermediate  Switches)  are  to
interpret and act upon the addressing information supplied by the
source  Host.   We  begin  by  stating  generally  the  sort   of
addressing  scheme  we  envision (which is by no means original),
and by comparing it to the  very  different  sort  of  addressing
currently  in  use  in the Catenet.  Next we will discuss some of
the issues and details that arise in considering how to make such
a scheme work reliably.  We will then show how this scheme  lends
itself  quite naturally to the solution of certain problems which
are very difficult to handle in the current Catenet architecture.
Although addressing and routing are rather intimately  bound  up,
                              - 1 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
we  will  avoid  routing  considerations  here whenever possible.
Routing in the internet will be the topic of a longer paper which
will be the next to appear in this series.
3.1  Logical Addressing / Flat Addressing
     For maximum  flexibility  and  robustness  of  operation,  a
source  Host should be able to simply "name" the destination Host
it wants to reach, where a "name" is just an arbitrary identifier
for a Host.  That is, the source Host should  not  need  to  know
anything about the physical location of the destination Host, NOT
EVEN  WHAT NETWORK IT IS ON.  In other words, the internet should
have logical addressing.  The advantages  of  logical  addressing
are  thoroughly  discussed  in IEN 183, and that discussion shall
not be repeated here.  IEN  183  presents  a  logical  addressing
scheme  which  was  designed  with the ARPANET in mind.  However,
since we  regard  the  internet  as  a  Network  Structure  whose
Switches  are  gateways and whose Hosts are generally multi-homed
to the gateways, most of the ideas presented in IEN  183  can  be
carried  over  directly to the internet environment.  The present
IEN will emphasize those aspects of the logical addressing scheme
which are specific to the internet environment, but the  proposed
scheme  is  basically  the  same as the one discussed in IEN 183.
Anyone with a real interest in these issues will want  to  become
familiar with that document.
                              - 2 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
     The  basic  idea of logical addressing is that a source Host
should name the destination Host, and  the  Switches  should  map
that  name  into a physical address that is meaningful within the
Network Structure of the Switches.  The mapping between names and
(physical) addresses will, in general, be  many-many.   That  is,
one  name  may refer indeterminately to several distinct physical
addresses,  either  because  some   one   physical   machine   is
multi-homed,  or  because the user does not care which of several
physical machines he reaches.  Similarly,  one  physical  machine
may  have  several names, which may either be synonyms, or may be
used for further multiplexing within the destination Host.  (This
may be particularly important when  a  Host  within  one  Network
Structure  is  really  a  Switch,  e.g., a port expander or local
network, within another.)
     Logical addressing tends to  result  in  a  flat  addressing
space,  rather than a hierarchical one.  This may seem surprising
in  the  context  of  the  internet,  since  an  internet  is   a
hierarchical  structure, and internet routing is almost certainly
going to be some  form  of  hierarchical  routing.   However,  it
simply  does  not  follow  that  the addressing space used in the
internet  Network  Access  Protocol  must   be   a   hierarchical
addressing  space.   In  fact,  since  the form of the addressing
space has an effect on the Network Access Protocol, and hence  on
Host-level  software,  whereas  the routing algorithm is a purely
internal  matter  to  the  Network  Structure,  proper   protocol
                              - 3 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
layering  would  seem  to require that the form of the addressing
and the form of the routing be independent.  We would like to  be
able  to  change  the  internal  routing algorithm of the Network
Structure  without  requiring  corresponding  changes   in   Host
software, i.e., without changing the form of the addressing.
     What  we  are  proposing  is  quite  different  from the way
addressing  is  done  in  the  current  Catenet  Network   Access
Protocol,  IP.  IP uses both physical addressing and hierarchical
addressing.  (Note that physical addressing within a hierarchical
Network  Structure  will   almost   certainly   be   hierarchical
addressing,   whereas  logical  addressing  allows  the  internal
structure of the Network Structure to be better hidden  from  the
users.  This is one of its main advantages.)  The first component
of the address is a network number, and the second component is a
physical address which is meaningful within that network.  In IEN
183,  we  discuss  a  number  of  reasons  for the superiority of
logical  over  physical  addressing.   Other  criticisms  of  the
Catenet's  current  addressing  scheme  have been voiced by other
authors.  For example, the way in which  hierarchical  addressing
is  incorporated  into Catenet addressing mechanisms has recently
come under criticism in IEN 177 by Danny Cohen, who  focuses  his
criticism  on  the  particular  case  of  the  ARPANET.  His main
criticism is that it does not allow enough  hierarchical  levels.
That  is, with the presence of local nets or port expanders which
appear to the ARPANET as Hosts, there is really another level  of
                              - 4 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
hierarchy  after  the  ARPANET.   He  suggests,  therefore,  that
ARPANET  addressing  (1822-level)  be  changed  to  provide  this
additional  hierarchical  level,  and that end-users (or at least
Host software modules) fill in this additional level.
     It is not obvious, though, that a single additional level of
addressing will do for all applications.  If we are sending  data
not  just to a local net, but to an internet of local nets, maybe
several additional levels of hierarchy are needed.  We  may  also
need  more  hierarchy  on  the  "front  end"  of  the address.  A
protocol which begins the internet address with a field which  is
supposed  to  identify the destination network (e.g., IP) assumes
that there is no need to establish a hierarchy among the networks
themselves.  (This is equivalent to assuming  that  all  Switches
can  "know about" all networks.)  As long as we have only a small
number of networks, it may be reasonable enough  to  assume  that
destination    network   addresses   need   not   themselves   be
hierarchical.  However, it is not difficult  to  imagine  a  very
large  internet  composed  of thousands of networks, where before
specifying a network, we must first specify,  say,  a  continent.
So  maybe  our  protocol  for  hierarchical  addressing  needs  a
"continent address" field before the network address  field.   It
begins  to  look  as  if  the  addressing  structure  needs to be
INFINITELY EXTENSIBLE in both directions.  In fact,  in  IEN  179
Cohen proposes a scheme which seems intended to provide this sort
of   infinite  extensibility.   That  seems  both  an  inevitable
                              - 5 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
consequence  of  hierarchical  addressing,  and  a  reductio   ad
absurdum of it.
     It  is  also  worth  noting that a given number of Hosts can
generally be addressed with  fewer  bits  in  a  flat  addressing
scheme  than in a hierarchical addressing scheme.  Given, say, 32
bits of addressing, flat addressing can  represent  2**32  Hosts.
However,  if  these  32  bits  are broken into four 8-bit fields,
hierarchically, fewer Hosts can be represented, since in general,
not every one of the four fields will actually take on  the  full
256  values.   Inevitably, one finds that at least one field must
take on 257 values, while at least one other turns out to have  a
smaller  number  of  values than expected.  This tends to lead to
the feeling that the address field needs "just one more level" of
hierarchy.  It also tends to lead to  the  use  of  funny  escape
values and multiplexing protocols so that different fields can be
divided up in different ways by different applications.  The same
problems  usually  reappear, however, in a few years, as the need
for "just one more level"  is  proclaimed  yet  again.   Yet  the
alternative  of making the address fields arbitrarily long, hence
infinitely  extensible,  is  rather  infeasible,   if   bandwidth
considerations are taken into account.
     The  need  for  infinite extensibility at the Host interface
can be avoided by using logical addressing (although this is only
one of its many advantages).  We can then identify a single  Host
                              - 6 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
by   using   a  single,  structure-less,  unique  name  which  is
meaningful at each level of internet  hierarchy.   That  is,  the
Switches  at  each  level  of  the  hierarchy  would  be  able to
recognize the name, and to map it into a physical address that is
meaningful at that level of hierarchy.  Neither the end-user  nor
the source Host would be responsible for determining the physical
addresses  at each level of a never-ending hierarchy.  Of course,
neither these arguments, nor those of IEN 183, can be regarded as
finally  settling  the  "flat  vs.   hierarchical"   issue.    In
networking,  no  one  issue can ever be settled in isolation, and
attempts to  do  so  result  only  in  endless  and  unproductive
arguments.   A network (or internet) is a whole whose performance
and functionality result from the combination of  its  protocols,
addressing  schemes,  routing  algorithm,  hardware  and software
architecture, etc.  Particular addressing  schemes  can  only  be
judged  when  it  is  seen  how they actually fit into particular
designs.  The  only  real  argument  in  favor  of  a  particular
addressing  scheme  is  that  it  fits  naturally  into a network
architecture  which  provides  the   needed   functionality   and
performance.   It  is hoped that the addressing scheme we propose
will be judged as part of the architecture we are  developing  in
this series of papers, rather than in isolation.
                              - 7 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
3.2  Model of Operation:  An Overview
     The  model  of  operation we are proposing is as follows.  A
source Host submits a packet to  a  source  Switch,  naming  (not
addressing)   the  destination  Host.   THE  SOURCE  SWITCH  THEN
TRANSLATES (OR MAPS) THAT NAME INTO  A  PHYSICAL  SWITCH  ADDRESS
WHICH  IS MEANINGFUL WITHIN ITS OWN NETWORK STRUCTURE;  THAT WILL
BE THE ADDRESS OF THE  DESTINATION  SWITCH  WITHIN  THAT  NETWORK
STRUCTURE.  The data is then routed through the Network Structure
to  the  destination  Switch  so  addressed.   The  name (logical
address) of the destination Host  is  also  carried  through  the
Network Structure along with the data and the physical address of
the destination Switch.  When the destination Switch receives the
data,  it  forwards  it to the destination Host over (one of) its
Pathway(s) to that Host.  If the Pathway is itself a  network  or
internet  configuration  with logical addressing, the name of the
destination Host is passed on via the  Pathway  Access  Protocol.
If logical addresses or names are not unique across all component
networks  of  an  internet, translation from the internet logical
address to the Pathway logical address would have to be  done  at
this  point.   If  the network or internet underlying the Pathway
does not even have logical addressing, the Host name will have to
be translated into a Pathway physical address by the  destination
Switch.
                              - 8 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
     Note  that,  at  any  particular  hierarchical  level (i.e.,
within  any  particular  Network  Structure),   the   ADDRESSABLE
ENTITIES  are  the  Switches  at that level (which are physically
addressed), and all the Hosts (which are logically addressed,  or
named).   Component  networks  of  the  internet  are  treated as
structure-less  Pathways,  AND  NEITHER  THE  COMPONENT  NETWORKS
THEMSELVES  NOR  THE  SWITCHES  OF  THE  COMPONENT  NETWORKS  ARE
INDEPENDENTLY ADDRESSABLE.  Furthermore, a name (logical address)
which adequately identifies the destination Host  is  present  at
each  level  of the hierarchy.  Of course, a particular name only
needs to be unique at a single level of the  internet  hierarchy,
within  a  particular Network Structure.  The names can change as
we travel up and down the hierarchy of  Network  Structures  that
make up the internet.
3.3  Some Issues in Address Translation
     In  order  to  do  the  sort  of translation from logical to
physical address that we have been discussing above, the Switches
must have translation tables.  Many of the issues involved in the
design of a robust translation table mechanism are  discussed  in
IEN  183,  and  much of that discussion applies without change to
the internet.  We will confine our discussion here, therefore, to
issues which are not considered in that note, or which  are  more
specific to the internet environment.
                              - 9 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
     The  main  problem  with  the  model  of  operation  we have
proposed  is  a  very  mundane  one,  but  unfortunately  a  very
important  one.   If  there  may  be  thousands  of  Hosts  on an
internet, each one with an unlimited number of  different  names,
and  if  a  source  Switch  must  be  able to map any name to the
address of a destination Switch, then each Switch  will  have  to
have  a  very  large  table  of  names  to drive this translation
function.  By itself, this is not much of a problem.  To be sure,
in the past,  it  has  been  considered  important  to  keep  the
gateways as small as possible.  It now seems to be more generally
accepted  that  the  current  Catenet gateways provide inadequate
performance, and that  building  a  robust  operational  internet
system  requires  us  to  build Switches that are large enough to
handle the required functionality at a reasonably high  level  of
performance.   We would expect Switches built in the future to be
much larger than the current gateways are.  However,  it  is  one
thing to require large tables, and quite another thing to require
tables  which  may grow without bound.  Since the number of Hosts
on the internet may grow without bound, it does not seem feasible
to require the Switches to have tables with one or  more  entries
for each and every Host in the internet.
     If we cannot fit the complete set of translation tables into
each  Switch,  a natural alternative is to turn the tables into a
DISTRIBUTED DATA BASE, with each Switch having only a  subset  of
the  complete  set  of tables.  For each Switch, there would be a
                             - 10 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
subset of logical addresses  for  which  the  Switch  would  have
complete   physical   addressing   information.    These  logical
addresses would fall into one of two classes:
     1) Those logical addresses which refer to  Hosts  which  are
        homed  (in  some  Network  Structure)  directly  to  that
        Switch.
     2) Those logical addresses  which  refer  to  distant  Hosts
        which  are in FREQUENT communication with the Hosts which
        are directly homed to that Switch.
The logical addresses in these two classes are the ones for which
the   Switch   will   be   most   often   called   upon   to   do
logical-to-physical address translation, and for best efficiency,
the  information needed to do the translation ought to be present
in the Switches.  For other logical  addresses,  which  are  less
often  seen,  all  that is needed is for the Switch to know where
the address translation information can  be  found.   Then  if  a
packet  with an infrequently-seen logical address is encountered,
it can be forwarded to a place where the  proper  information  is
known  to  reside,  or  else  the  packet  can  be held while the
information is obtained.  (We may want to have a scheme which  is
a  hybrid  of  these two alternatives.  For example, packets with
logical addresses that are not contained in the  resident  tables
can be forwarded to a place with more addressing information, and
this  can  in  turn cause the needed addressing information to be
                             - 11 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
sent back to the source Switch, so that additional  packets  with
the  same  address  can be handled directly by the source Switch.
That is, the source Switch might maintain,  in  addition  to  its
permanently  resident tables, a cache of the most recently needed
addressing information.)
     It is important to note that the two classes  defined  above
may  vary  dynamically,  and we may want a procedure for altering
the members of those classes in some  specific  Switch  depending
upon the traffic that the Switch is actually seeing in real time.
     Unfortunately,  any  such  scheme  would seem to require the
inclusion of at least one additional level of  hierarchy  in  the
addressing  structure, since when a Switch sees a logical address
for which it does not have complete information, it must be  able
to  determine  how  to get that complete information.  The scheme
would be self-defeating if it meant that we had to have  a  table
of  all the logical addresses, with an indication for each one of
which other Switch has the complete information.  Rather, we need
to be able to group the logical addresses into "areas", of  which
there will be a bounded number.  Then each Switch will be able to
keep a table indicating which other Switches contain the complete
translation information for each area.  This table of areas would
then  be  the only part of the complete set of translation tables
that had to be resident at ALL Switches.  While this is much more
feasible than requiring each Switch to keep  a  table  containing
                             - 12 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
all  the  logical  addresses,  it does means that the destination
address provided by the source  Host  must  include  not  only  a
destination  Host  identifier,  but  also an "area code" for that
logical address.
     If we are going to organize the  logical  addresses  of  all
internet  Hosts  into a relatively small set of "areas", we would
like to find some means of organization which is fairly  optimal.
Unfortunately, there are a number of fairly subtle considerations
which   make  this  quite  tricky  to  do.   Certain  intuitively
attractive ways of organizing the internet into these areas  will
result  in  various  sorts  of  significant  and  quite  annoying
sub-optimalities.  Suppose, for example,  we  treated  "area"  as
meaning  "home network", much as in the present Catenet IP (where
network number is  part  of  the  address  that  the  Hosts  must
specify.) Then we would require all and only the ARPANET gateways
to contain the logical-to-physical addressing information for the
ARPANET  Hosts,  all  and only the SATNET gateways to contain the
tables for the logical addresses of the SATNET Hosts,  etc.   The
user,  in  addressing  a particular Host, would not only name it,
but also name its "home network", and  the  source  Switch  would
choose  some Switch which interfaces directly to the home network
of the destination Host from  which  to  obtain  the  translation
information.   This  method of organization, however, has several
unsatisfactory consequences.  One problem is that if any Host  is
on  two  "home networks", we want the Switches, not the Hosts, to
                             - 13 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
choose which "destination network" to use.  This is necessary  if
we  want  the  routing  algorithm to be able to choose the "best"
path to some destination Host, and is  really  the  only  way  of
ensuring  that packets can be delivered to a Host over some path,
if one of the Host's home networks is down but the other  is  up.
(This  is  jumping  ahead  a  bit, since a full discussion of the
"partitioned net" problem will not appear until section 3.4.  The
point, though, is that the choice of "home network" to  use  when
sending  traffic  to  a  particular destination Host is a ROUTING
PROBLEM, NOT AN ADDRESSING PROBLEM.  Therefore  it  ought  to  be
totally  in  the  province of the Switches, which are responsible
for routing, and not at all in the province of the  Hosts,  which
must participate in the addressing, but not the routing.)
     Another  problem arises as follows.  Suppose we have adopted
the scheme of sending packets for a certain area to a  Switch  in
that   area,   depending   on  that  Switch  to  do  the  further
logical-to-physical translation.  It is possible that  when  this
further  translation  is  done, we will find that the route which
the packet travels from that Switch takes  it  back  through  the
source   Switch.    This   could   mean   a   very   lengthy  and
delay-producing "detour" for  the  packet.   It  might  at  first
appear  that  this  is  not very likely.  If a packet is going to
some ARPANET Host, and  we  send  it  to  some  Switch  which  is
directly  connected to the ARPANET, surely we have sent it closer
to its final destination, not further away.  Unfortunately,  that
                             - 14 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
just  is  not  necessarily true.  Network partition or congestion
may force a packet for an ARPANET Host to travel from an  ARPANET
gateway to a gateway (or series of gateways) outside the ARPANET,
back around (through a potentially long route) to another ARPANET
gateway.   (Consider  the  partitioned  net  and  the  expressway
problems.)  In such cases, the Network Structure may  already  be
in  a  condition of stress which is likely to result in below par
performance.  We do not want to make things even worse by  adding
any  further  unnecessary  but  lengthy  detours  just because we
cannot keep all the addressing information at the source  Switch.
     One  way  of  helping to avoid these sorts of problems is to
separate the notion of "area" from  any  physical  meaning.   The
purpose  of  adding  the notion of area to the logical addressing
scheme is just to enable us to distribute the data base needed to
do logical-to-physical address translation.  There is  no  reason
to  suppose  that  the  addressing  information  needed  for some
particular Host ought to be contained only in Switches  that  are
"near"  that  Host.   That  would  be  a  mistake.   Rather,  the
addressing information ought to be somewhere which is "near"  the
SOURCE  Host,  not  somewhere which is near the destination Host.
This maximizes the chances that the necessary address translation
will be done as soon as possible  after  the  packet  enters  the
Network Structure.  The sooner we do the address translation, the
more  information we have which we can make use of to improve the
routing of the  packet,  and  the  less  likely  any  unnecessary
detours will be.
                             - 15 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
     One  might  think  that at least Hosts which are on the same
home network should be grouped into the  same  area.   This  will
work  until  the  first  time a Host is moved from one network to
another.  Since the area codes are given by the  individual  Host
or  user as part of the address in the Network Access Protocol of
the internet, changing a Host's area code would involve  changing
Host-level   software   or  tables,  which  has  to  be  avoided.
(Avoiding  the  need  to  make  such  changes  when  Hosts   move
physically   is  one  of  the  main  reasons  for  using  logical
addressing.)  So we really have to think  of  "areas"  as  random
collections of Hosts.
     What we are proposing is a truly distributed logical address
translation  table,  rather  than  a  scheme  where  each  Switch
maintains only local information.  To make  this  more  concrete,
consider  how  this  might  be  done  in  the  Catenet.   All the
information about logical addresses which refer to Hosts  on  the
ARPANET would be contained not only in all the gateways which are
directly  connected  to  the  ARPANET,  but  also  in  a  set  of
additional gateways which  are  uniformly  scattered  around  the
internet.  Then, although the addressing information would not be
in  every potential source Switch, it would be somewhere close to
every potential source Switch, and  packets  would  not  have  to
travel  a  long  distance only to find out that they are going in
the wrong direction.
                             - 16 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
3.4  Model of Operation:  More Detail
     Let's assume that a source Host has given  a  message  to  a
source  Switch,  with  a  logical  address  and  an  "area  code"
indicating the destination Host.  If the source Switch  does  not
have  the complete address translation information in its tables,
it will look in its table of area codes.   The  given  area  code
will  be associated in the latter table with some set of Switches
(within the same Network Structure).  The sequence of  operations
that we envisage is the following:
     1) The source Switch picks one of these Switches, and  sends
        the message to it.  There must be enough protocol between
        these  two  Switches so that the chosen Switch knows that
        it is not the  final  destination  Switch,  but  only  an
        intermediate  Switch, and that it is expected to complete
        the address translation and then to forward  the  message
        further.
     2) The chosen Switch must be able to recognize  the  logical
        address  of  the  destination Host, and associate it with
        one or more possible destination Switches.   The  message
        will be forwarded to one of these Switches.  Furthermore,
        the addressing information can be sent back to the source
        Switch  where  it  can  be  held  in  a cache in case the
        message is followed by a flood of additional messages for
        the same logical address.
                             - 17 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
     In the case where the source Switch  does  contain  complete
address  translation  information  for  the  destination  logical
address, that logical address will be associated with some set of
potential destination Switches.  The source  Switch  will  choose
one, and send the message directly to it.
     Logical-to-physical  address  translation  should be done by
only one Switch; either the source Switch or the Switch chosen by
the source Switch on the basis of the area  code.   There  is  no
need to allow intermediate Switches to do any logical-to-physical
address  translation.   (There  is  only  one  exception to this,
namely the case where a message arrives at an intermediate Switch
only to discover that the destination Switch chosen by the source
Switch is no longer accessible.  In this case, re-translation  is
the alternative to dropping the message entirely.)  Remember that
many  Hosts will be multi-homed (in the internet, virtually every
Host is multi-homed, since most networks will have at  least  two
internet  gateways  connected  to  them),  so  that there will in
general  be  more  than  one  possible  destination  Switch.   By
prohibiting re-translation at intermediate Switches, we avoid the
problems  of  looping  that might arise if different intermediate
Switches make different choices of  destination  Switch.   As  we
shall  see,  this also simplifies our approach to the partitioned
net problem, and at any rate, there  is  no  great  advantage  to
allowing intermediate Switch translation (cf. IEN 183).
                             - 18 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
     We  suggested  above  that  if  a  source  Switch  does  not
recognize a particular logical address, and  hence  must  send  a
message  to  another Switch (as determined by the area code), the
latter Switch should send the addressing information back to  the
source  Switch,  to  be  kept temporarily in a cache.  We have to
emphasize "temporarily." The source Switch should  time  out  the
addressing  information  which  it  keeps  in the cache, and then
discard it.  If it later receives from any of  its  source  Hosts
any subsequent messages for the same destination logical address,
it will have to reobtain the information.  The reason for this is
that  it  will  be  necessary,  from  time to time, to change the
translation tables.  It is not that hard to develop  an  updating
procedure which ensures consistent updating of all Switches where
the information about a logical address normally resides.  But it
might  be  more  difficult  to  develop a procedure which ensures
consistent updating of all the temporary (cached) copies of  that
information.   Timing  out the temporary copies of the addressing
information  will  prevent  out-of-date  information  from  being
preserved  in  inappropriate  places.   (Though  the  use  of  an
out-of-date translation is not so terrible, since it would elicit
a DNA message, rather than causing mis-delivery of data.  See IEN
183 for details.   In  this  sense,  out-of-date  information  is
self-correcting.)
     When  either a destination Host name (logical address) or an
area code maps into several  Switches,  the  source  Switch  must
                             - 19 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
apply  some  criterion  to  choose  one from among them, since in
general we will want to send only one copy of the message to  its
destination.   (Though there may indeed be cases in which we want
to send a copy  of  the  message  to  each  possible  destination
Switch, in order to increase the reliability of the system, or to
be  sure  that we get the message to its destination Host as fast
as possible.)  There are several possible criteria that we  might
consider using:
     a) We might always choose the "closest" Switch, according to
        some particular distance metric (which might or might not
        be   the   same  distance  metric  used  by  the  routing
        algorithm).
     b) The list of potential destination Switches might  have  a
        "built-in" ordering, so that the first one is always used
        unless it is down, in which case the second one is always
        used,  unless  it is down, in which case the third one is
        used, etc.
     c) If the set of  potential  destination  Switches  has  the
        right  sort  of topological distribution, we might try to
        round-robin  them  in  order  to  achieve  some  sort  of
        load-splitting.
     d) If we can obtain  some  information  about  the  relative
        loadings  of  the  various Switches, we can try to choose
                             - 20 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
        the one with the smallest load (to try to  avoid  causing
        congestion  within the destination Switches), or we might
        try to trade off the increase in load that we will  cause
        at  the  destination  Switch with the distance we have to
        travel to get there.
     e) Certain possible destination Switches  might  be  favored
        for  certain  classes  of  traffic  (as determined by the
        "type  of  service"   field,   or   by   access   control
        considerations).   That  is, certain destination Switches
        might be favored for  interactive  traffic,  and  certain
        others  (with more capacity?) for bulk traffic.  Or there
        might be administrative access control restrictions which
        prohibit certain classes of traffic from  being  sent  to
        certain  Switches.   (This may be particularly applicable
        in an internet context where different Switches are under
        the  control  of  different   administrations.    It   is
        possible, though, to imagine applications of this sort of
        access  control  even  in a single-administration Network
        Structure.   For  example,  we  might  want  to  prohibit
        military traffic from entering certain Switches, in order
        to  preserve  capacity for important university traffic.)
     f) It is possible to combine some  of  the  above  criteria,
        e.g.,  choose  the  closest (i.e., shortest delay) Switch
        for interactive traffic and the most lightly  loaded  one
        for bulk traffic.
                             - 21 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
Remember that in the internet case, all the Hosts on some network
are  considered  to be homed to all the gateways on that network,
so that in general most Hosts will be multi-homed, and the way we
select the destination Switch could have a significant effect  on
internet performance.
     Of  course,  a  destination  Switch might itself have two or
more Pathways to a  particular  destination  Host.   Perhaps  the
Switch  is  a  gateway  on  two networks, and the Host is also on
those two networks.  Or perhaps the Switch  is  multi-homed  onto
the  network  of  the  Host.   In  such  cases,  a further choice
remains -- the destination Switch must choose  which  of  several
possible  Pathways  to  the  destination  Host  it should use for
sending  some  particular  packet.   Each  (destination)  Switch,
therefore, will have to have a second logical-to-physical address
translation  table,  which  it  accesses  in  order to choose the
proper Pathway to a destination Host.   This  second  translation
table,   however,  contains  information  which  is  only  useful
locally.  In addition to containing information needed to map the
logical address onto one of the Switch's access  lines,  it  must
also  contain  any  information  needed  in  order to specify the
address of the destination Host in the Pathway  Access  Protocol.
In  some  cases,  the  logical  address  of the Host in its "home
network" may be the same as its logical address in the  internet,
in  which  case  no additional information is needed.  If this is
not the case, or if the "home  network"  does  not  have  logical
                             - 22 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
addressing, the local translation tables must contain information
for  mapping  the internet logical address to an address (logical
or physical) which is meaningful  in  the  "home  network."   The
issues  of  choosing  one  from  among a set of possible Pathways
according to some criteria are basically the  same  as  those  we
have  been  discussing from the perspective of the source Switch,
however.
     An interesting little issue: suppose that traffic for Host H
can be sent to either Switch A or B, but that the route to Switch
B contains Switch A as an intermediate Switch.   Does  this  mean
that  the traffic should always be sent to A, rather than B?  Not
necessarily.  Perhaps A has plenty  of  bandwidth  available  for
forwarding traffic to other Switches, but only a little available
for  sending  traffic  directly  to  a Host.  Or the Pathway from
Switch A to Host H may itself have such a long delay that  it  is
quicker  to  send  the  traffic  through  A  to B and then on B's
Pathway to H.  While  it may turn out to  be  very  difficult  to
take  account of such factors, we ought not to rule them out by a
priori considerations, and we ought not to  design  a  system  in
which such factors cannot be considered.
     A  variant on this issue can arise as follows.  Suppose Host
H1 wants to send some data to Host H2, and H1 puts this data into
the internet by submitting it to source Switch  S.   Now  S  will
look  in  its  address  translation  table  to  find the possible
                             - 23 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
destination Switches for H2.  Let's suppose that  there  are  two
such  possible  destination  Switches, one of which is D, and the
other of which is S itself.  That is, S has a choice  of  sending
the  data  directly  to  H2  (over a Pathway with no intermediate
Switches), or of sending it to D, so D can transmit  it  directly
to  H2.   Nothing  in  the proposed scheme constrains S to choose
itself as the destination Switch.  If we want, we can have S make
the choice of  destination  Switch  without  taking  any  special
cognizance  of  the fact that it itself is a possible destination
Switch.  Or we might even require that S not choose itself as the
destination Switch.  That is, when a gateway on the ARPANET,  for
example,  gets  some  data from an ARPANET Host which is destined
for another ARPANET Host, maybe we  want  the  data  to  be  sent
through  another  gateway, rather than just sending it right back
into the ARPANET.  This possibility might be crucial  to  solving
the "expressway" problem.  While we are not at present making any
proposals for allowing the internet to be used as an "expressway"
between  two  Hosts  on  a common, but very slow, network, we are
trying to ensure that nothing in our proposed  addressing  scheme
will  make  this impossible.  This is a very important difference
between our proposed scheme and the scheme presently  implemented
in  the  Catenet, where a source Switch which is also a potential
destination Switch is highly constrained to pick  itself  as  the
actual  destination  Switch.   Of course, for this to work, there
must be enough protocol so that a Switch which receives some data
                             - 24 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
can know whether it is getting it directly from a source Host, or
whether it is getting it from another Switch.
     When we say that a particular Host name maps onto a  set  of
possible  Switches, what we are really saying is that each member
of that set of Switches has a Pathway to the Host.  Remember  the
definition  of  "Pathway"  --  a  Pathway  in Network Structure N
between two Switches of Network Structure N or between  a  Switch
and  a  Host  of  Network  Structure  N  is a communications path
between the two entities which does not contain any  Switches  of
Network Structure N.  The logical-to-physical address translation
tables  will  not  map  a  Host  name  to  a  particular  set  of
destination Switches unless each of those Switches has a  Pathway
to  that Host.  But we must remember that at any particular time,
one or more of these Pathways may be down.  Before we  apply  the
above  criteria  (or  others)  to the set of possible destination
Switches in order to choose  a  particular  one,  we  must  first
eliminate  from  the  set  any  Switches  whose  Pathway  to  the
destination Host is down.   This  is  a  non-trivial  task  which
breaks down naturally into two sub-tasks.  First, the destination
Switch  must  be  able  to  determine which of the Hosts that are
normally homed to  it  is  reachable  at  some  particular  time.
Second,  this  information must be fed back to the source Switch.
Each of these sub-tasks raises a number of interesting issues.
                             - 25 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
     In IEN 187, we discussed the importance of having a  Pathway
up/down  protocol  run between each Host and each Switch to which
it is homed, so that a source Host can know which source Switches
it has a currently operational Pathway to.  Now we see the  other
side  of  the  coin  --  each  destination Switch must be able to
determine which Hosts it currently has an operational Pathway to.
Many of the considerations discussed in IEN 187 apply  here  too,
and need not be mentioned again.  Basically, the Switch will have
to  run  a low-level up/down protocol which relies on the network
which underlies the Pathway to tell it whether a particular  Host
is reachable (e.g., the ARPANET returns an 1822 DEAD Reply to any
ARPANET source Host which attempts to send a non-datagram message
to  an  unreachable  destination  Host), and the Switch will also
have to run a higher-level up/down protocol  whereby  it  queries
the Host and infers that the Host is unreachable if no replies to
the queries are received.  Of course, if some Pathway consists of
a  simple  datagram-oriented network that provides no feedback to
the source, then a higher-level protocol will  have  to  be  used
alone.
     Assuming  that  the  Switches  have  some way of determining
whether their Pathways to particular Hosts  are  operational,  we
have   the   following   subsidiary   issue   --   should   these
determinations be made on a regular basis,  for  all  Hosts  that
might be reachable, or should they be made on an exception basis,
with the information obtained only as needed?  Let's consider the
                             - 26 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
analogous  operation in the ARPANET.  In the ARPANET, the up/down
status of each Host is maintained continuously, as  a  matter  of
course,   by   the  IMP  to  which  that  Host  is  homed.   This
information, however, is not generally maintained at other  IMPs.
If  a packet for a dead Host (on a live IMP) is submitted to some
source IMP, the packet will always be  sent  to  the  destination
IMP,  which will (unless the packet is a datagram) return an 1822
DEAD reply.  The source IMP receives the DEAD reply,  signals  it
to  the  source Host, and then discards the information.  IMPs do
not maintain status  information  about  remote  Hosts,  but  the
information  is  available  to  them as they need it (i.e., on an
exception basis).  On the other hand, each IMP  always  maintains
complete,   accurate,   and   up-to-date  information  about  the
reachability of each other IMP.  Whenever any IMP  goes  down  or
comes  up,  this information is broadcast to all other IMPs in an
extremely quick and reliable manner.  If a source  Host  attempts
to send a packet to a Host on an unreachable IMP, no data is sent
across  the network at all; the source IMP already knows that the
destination IMP cannot be reached,  and  tells  the  source  Host
immediately.
     Why don't IMPs maintain regular status information about all
ARPANET Hosts?  It's not as if this is against the law, and under
certain  conditions, it might be advantageous to do so.  However,
the more entities  about  which  regular  status  information  is
maintained, the more bandwidth (trunk and CPU) and memory must be
                             - 27 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
devoted   to   handling  the  information.   With  a  potentially
unbounded number of Hosts being able to connect to  the  ARPANET,
it  does  not  seem feasible for all IMPs to maintain this status
information for every Host.   Fortunately,  it  just  is  not  as
important  to  maintain status information for Hosts as it is for
IMPs.  Status information about the IMPs is necessary in order to
do routing, so failure to  maintain  this  information  regularly
would  degrade  the  routing capability, with a consequent global
degradation in network service.  Since Hosts, on the other  hand,
are not used for storing-and-forwarding packets, routing does not
have  to  be so aware of Host status, and global degradations due
to incorrect assumptions about Host status are less likely.
     If we can't expect ARPANET IMPs to maintain  regular  status
information  for  each  Host,  we certainly can't expect internet
gateways to maintain regular  status  information  for  each  and
every  Host  in  the  internet.   In  fact,  in the internet, the
situation is even worse.  In  the  ARPANET,  each  IMP  at  least
maintains regular status information about the few Hosts to which
it is directly connected.  This is simple enough to do, since the
number of Hosts on an IMP is bounded (barring the introduction of
local  nets or port expanders) and there are machine instructions
to detect the state of the Ready Line.  However,  we  can  hardly
expect a gateway to maintain regular status information about all
the  Hosts  on  all the networks to which the gateway is directly
connected.   So  we  will  suppose  that   in   general,   status
                             - 28 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
information  about  the  Hosts  which  are  homed to a particular
Switch will be obtained by that Switch on an exception basis,  as
needed.  Of course, saying that this will be true in general does
not  mean  that  it must be universally true.  If there are a few
Hosts somewhere that are major servers with many  many  important
users  scattered  around the internet, there is no reason why the
Switches to which those servers are homed cannot maintain regular
status information about those few Hosts.  If the number of  such
special  Hosts  is  kept  small,  this would not be prohibitively
expensive, and if these Hosts really do handle a large portion of
the internet traffic,  this  might  be  an  important  efficiency
savings.
     If  a source Switch knows that a particular destination Host
logical address can be mapped to any of a number  of  destination
Switches,  then,  as we have pointed out, it must be able to tell
when, due to some sort  of  failure  or  network  partition,  the
destination Host is (temporarily) unreachable via some particular
Switch.   It  must  have  that information in order to be able to
avoid choosing a destination Switch whose Pathway to the Host  is
non-operational.   If  we  agree  that the Pathway up/down status
between  a  particular  destination  Switch  and   a   particular
destination  Host  which  is  ordinarily  homed to it can only be
obtained, on an  exception  basis,  by  that  destination  Switch
itself,  it  follows  that  this  information  can  also  only be
obtained by the source Switch on an exception  basis.   That  is,
                             - 29 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
the  only  way  for a source Switch to find out that a particular
Host  can  temporarily  not  be  reached  through  a   particular
destination  Switch  is  to  send a message for that Host to that
Switch.  The destination Switch must then determine that  it  has
no  operational  Pathway  to  that  Host, and it must send back a
control message to the source Switch informing it of  this  fact.
(In  IEN  183,  we  christened these messages "DNA messages", for
"Destination Not Accessible.) The source Switch will  store  this
information  in its address translation tables, so that from then
on it does not choose a destination Switch whose Pathway  to  the
Host  is  down.   (Of course, in addition to sending this control
information back to the source Switch, the  putative  destination
Switch  should also try to forward the message it received to one
of the other Switches to which the destination Host is homed.)
     This should  work  well,  unless  the  Pathway  between  the
original  destination  Switch and the destination Host comes back
up.  We must develop some way of informing the source Switch that
that destination Switch is now once again usable as a destination
Switch for that Host.  A simple and robust way to handle this  is
as  follows.   When a source Switch is informed, according to the
mechanism  of  the  previous   paragraph,   that   a   particular
destination  Switch  cannot  reach  a particular destination Host
(without  forwarding  traffic  through  additional   intermediate
Switches),  it  marks  (in  its  address translation tables) that
Switch as UNUSABLE as a destination for that Host.  However, this
                             - 30 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
information is reset periodically, say, every  few  minutes.   In
effect,  this  approach  would  cause  a  source  Switch which is
handling  traffic  for  that  destination  Host  to   query   the
destination  Switch  periodically  to see if it has become usable
again.  Note that no special control message is  needed  for  the
querying.   The querying is done simply by sending data addressed
to the destination  Host  to  the  destination  Switch.   If  the
destination  Switch is still unusable, no data is lost, since the
data can be readdressed by the destination  Switch  and  sent  to
some  other  destination  Switch  which  does have an operational
Pathway to that destination  Host.   Note  also  that  with  this
scheme,  not all source Switches will be in agreement as to which
destination Switches can be used to reach which destination Hosts
at some particular time.  But this is not much of a  problem,  as
long as address translation is done only once, and not re-done at
each intermediate Switch.  Further, any source Switch which tries
to  use  the  wrong  destination  Switch  will be told, via a DNA
message, to use another one.
     Lest there be any misunderstanding, we should emphasize that
we are not proposing this as a general mechanism for  determining
which Hosts are homed to which Switches.  That information is not
to  be obtained dynamically at all, but rather is to be installed
in the translation tables at each Switch by the  Network  Control
Center  (or  whatever equivalent of the Network Control Center we
devise for  the  internet.)   This  mechanism  is  only  used  to
                             - 31 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
determine  that  a  Pathway  which ORDINARILY exists between some
Switch and some Host is TEMPORARILY out of operation.
     If a destination Host happens to be  unreachable  from  EACH
potential  destination  Switch  (which will happen if the Host is
down), this procedure will eventually result in the source Switch
marking all potential destination Switches unusable.   Once  this
happens,  the  source  Switch should discard any data it receives
which is destined for that destination Host,  and  should  return
some  sort  of  negative  acknowledgment to the source Host.  The
source Host can then try again, every few minutes, to  send  more
data  to  the  destination Host.  Since the information marking a
destination Switch as  unusable  (for  a  particular  destination
Host) is reset every few minutes, the source Host will be able to
establish  communication  with the destination Host soon after it
becomes  reachable  again.    Strictly   speaking,   a   negative
acknowledgment  from  the  source Switch is not required, and the
current IP  makes  no  provision  for  such  a  thing.   Yet  the
information  contained  in the negative acknowledgment might well
help  the  source  Host  to  choose  a  suitable   retransmission
interval.   If  a destination Host is unreachable, it makes sense
for a TCP to retransmit more infrequently than if the TCP has  no
information   at   all   about   why   it   is  not  getting  any
acknowledgments  from   the   destination   Host.    Also,   this
information  would  be  useful  to  the  end-user (if the various
protocol layers in his Host succeed in passing it back  to  him.)
                             - 32 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
A  user  who is not getting any response from the system may want
to take a different action if he knows his destination cannot  be
reached  than if he thinks that the network (or internet) is just
slow.
     This procedure, which is basically the same as  the  one  we
recommended  (in  IEN  183)  for  use  with  logically  addressed
multi-homed Hosts on the ARPANET, should resolve the  partitioned
net  problem.   Our approach is not dissimilar to one proposed by
Sunshine and Postel in IEN 135.  To quote them:
     A simpler solution to the partitioning problem  follows  the
     spirit of querying a database when things go wrong.  Suppose
     there  were  another  database  listing networks and all the
     gateways attached to each net (whether up  or  down).   This
     database would change slowly only as new equipment was added
     to  the  internet system.  Further suppose that the gateways
     and  internet  routing  are  totally  unaware   of   network
     partitions,  except  that  gateways to partitioned nets find
     out when they cannot reach some Host on their own  net.   In
     this  case,  the  gateway  would  return  a Host Unreachable
     (through me) advisory message to  the  source.   The  source
     could  then  query  the global database to get a list of all
     gateways to the  destination  net,  and  construct  explicit
     source routes to the destination going through each of these
     gateways, trying each one in turn until it succeeded.
     Note, however, that our proposal does not require any source
routing, because it is Switches (i.e., gateways) themselves which
are  the addressable entities in our scheme, rather than networks
(though the authors quoted above were considering how  to  handle
the  problem  in the current Catenet environment, rather than how
to design a new environment).  The database they propose  can  be
identified  with the translation tables we have spoken of.  Also,
                             - 33 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
our proposal handles the situation where a Pathway that was  down
becomes usable again, a case they don't seem to mention.
     It   is   sometimes  claimed  that  hierarchical  addressing
requires less table space than flat addressing, since there is no
need to have an entry in a translation table  for  each  address.
We  can  see now that this is not true.  If we wish to be able to
handle multi-homing, and in particular to handle the "partitioned
net" problem, we need to maintain table space for the Hosts  with
which  we are in communication.  This is true no matter what kind
of addressing scheme we adopt.
     Let's look now at how our scheme would handle the problem of
mobile Hosts, i.e., Hosts which move from one network to another.
We distinguish the case of "rapidly mobile" Hosts from  the  case
of  "slowly  mobile"  Hosts.  A Host is slowly mobile if its move
from one net to another can be made  with  enough  lead  time  to
allow  manual  intervention  to  update  the  logical-to-physical
address translation tables.  This case is handled simply  by  the
presence  of  the  logical  addressing.   When  the Host moves to
another network, it can still be addressed by the same name,  but
the translation tables are changed so that the logical address is
now  mapped  to  a  different set of Switches.  This creates some
work for the internet administration and control center,  but  is
completely  transparent  to  higher  level  protocols,  since the
logical address does not change.  On the other hand, we  consider
                             - 34 -
IEN 188                              Bolt Beranek and Newman Inc.
                                                    Eric C. Rosen
a  Host  to be rapidly mobile if it moves from one net to another
too quickly or too frequently to allow the procedure of modifying
the address translation tables to be feasible.  If we can know in
advance that there is some limited set of networks to which  that
Host  might  connect, we can map the logical address of that Host
onto the set of all  gateways  which  connect  to  any  of  those
networks.   Our  procedure for choosing one gateway to use as the
destination gateway might be as follows.  Try the  first  gateway
on the list.  If a DNA message is received, try the second, etc.,
etc.   Once  a source gateway begins sending traffic for a mobile
Host to  a  particular  destination  gateway,  it  should  always
continue to use that gateway, until it receives a DNA message, in
which  case  it should try the next one.  You will note that this
procedure is very similar to that used for non-mobile Hosts.   In
fact,   it  might  be  entirely  identical.   The  only  possible
difference is that we might want to be  much  more  reluctant  to
switch  from  one  destination  gateway to another in the case of
mobile Hosts than in the  case  of  non-mobile  Hosts,  since  we
expect that a mobile Host will not generally be reachable through
all of the potential destination gateways at every time.
                             - 35 -
-------