InterNetNews... Salz
wait until the original name shows up again. The
INN approach is more efficient and conceptually
cleaner.

Innd Structure

When innd starts up it reads the active file into
memory. An array of NEWSGROUP structures is
created, one for each newsgroup, that contains the
following elements:
 char *Name; /* "comp.sources.unix" */ 
 char *Dir; /* "comp/sources/unix/" */ 
 long Last; /* 0211 */ 
 int LastWidth; /* 5 */ 
 char *LastString; /* "00211..." */ 
 char *Rest; /* "m\n..." */ 
 int SiteCount; /* 1 */ 
 SITE **Sites; /* defined below */ 
 
The C comments above show the data that would be
generated for the following line in the active file:
comp.sources.unix 00211 00202 m
The Last field specifies the name to be given to the
next article in the group. The LastString element
points into the in-memory copy of the file. This
number is carefully formatted so that the file can be
memory-mapped, or updated with a single write.
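
The fixed-width field is what makes an in-place
update possible. The following is a minimal sketch
of that idea, not the innd code: the NGFIELDS type
and the BumpLast function are hypothetical names,
and whether innd increments Last before or after
using it is an assumption made here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, trimmed-down view of the NEWSGROUP fields used here. */
typedef struct {
    long  Last;        /* highest article number written to the file */
    int   LastWidth;   /* width of the number field in the active file */
    char *LastString;  /* points at that field in the in-memory copy */
} NGFIELDS;

/*
 * Increment the article number and overwrite its digits in place.
 * Returns 0 on success, or -1 if the new number does not fit the
 * existing field (a case this sketch does not handle).
 */
int BumpLast(NGFIELDS *ng)
{
    char buff[32];
    int  len;

    /* Zero-pad to the field width, e.g. width 5 gives "00212". */
    len = snprintf(buff, sizeof buff, "%0*ld", ng->LastWidth, ng->Last + 1);
    if (len < 0 || len >= (int)sizeof buff || len != ng->LastWidth)
        return -1;
    ng->Last++;
    /* Copy the digits only, without the NUL, so the rest of the
     * line in the in-memory file is left untouched. */
    memcpy(ng->LastString, buff, (size_t)len);
    return 0;
}
```

Because the field never changes size, the bytes
around it never move, which is why the file can be
mapped or rewritten with one write.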

A hash table into the structure array is built,
using a function provided by Chris Torek [Torek91].
The hash calculation is very simple, yet empirically
it gives near-uniform distribution. The secondary
key is the highest article number, so groups with the
most traffic tend to be at the top of the bucket.
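
The hash function itself is not reproduced in this
section. As an illustration only, the sketch below
assumes the multiply-by-33 string hash that is
often attributed to Chris Torek, and keeps each
bucket ordered by descending article number. The
NGENTRY type, the Next chain field, the table size,
and the function names are all hypothetical; innd
may well organize its buckets differently.

```c
#include <stddef.h>
#include <string.h>

#define NGH_SIZE 128            /* hypothetical size, a power of two */

typedef struct ngentry {
    const char     *Name;       /* newsgroup name, the primary key */
    long            Last;       /* highest article number */
    struct ngentry *Next;       /* bucket chain (assumed layout) */
} NGENTRY;

static NGENTRY *Table[NGH_SIZE];

/* Multiply-by-33 hash: h = h * 33 + c for each character. */
static unsigned int NgHash(const char *name)
{
    unsigned int h = 0;

    while (*name)
        h = (h << 5) + h + (unsigned char)*name++;
    return h & (NGH_SIZE - 1);
}

/* Insert so that entries with larger Last values come first. */
void NgInsert(NGENTRY *ng)
{
    NGENTRY **pp = &Table[NgHash(ng->Name)];

    while (*pp != NULL && (*pp)->Last >= ng->Last)
        pp = &(*pp)->Next;
    ng->Next = *pp;
    *pp = ng;
}

/* Look up a group by name; returns NULL if it is not in the table. */
NGENTRY *NgFind(const char *name)
{
    NGENTRY *ng;

    for (ng = Table[NgHash(name)]; ng != NULL; ng = ng->Next)
        if (strcmp(ng->Name, name) == 0)
            return ng;
    return NULL;
}
```

With this ordering, a lookup for a busy group
usually ends after the first comparison or two.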

The INN equivalent of the sys file is read next.
An array of SITE structures is created, one for each
site, that contains the following elements:

 BOOL Sendit; 
 char FileFlags[10]; 
 
The FileFlags array specifies what information
should be written to the site's data stream when it
receives an article. The subscription list for the site
is then parsed, and for each newsgroup that it
receives, the matching NEWSGROUP structure will
contain a pointer to the SITE structure.

Using these two structures it is easy to step
through how an article is propagated:

 extern ARTDATA *art; 
 extern SITE *Sites, *LastSite; 
 extern int nSites; 
 char **pp; /* walks the article's newsgroups */ 
 char *p; 
 SITE *sp; 
 NEWSGROUP *ng; 
 int i; 
 while (*pp) { 
   ng = HashNewsgroup(*pp++); 
   if (ng == NULL) 
     continue; 
   AssignName(ng); 
   for (i = 0; i < ng->SiteCount; i++) { 
     if (MeetsSiteCriteria(ng->Sites[i], art)) 
       ng->Sites[i]->Sendit = TRUE; 
   } 
 } 
 for (sp = Sites; sp < LastSite; sp++) { 
   if (!sp->Sendit) 
     continue; 
   for (p = sp->FileFlags; *p; p++) 
     switch (*p) { 
     case 'm': 
       /* Write Message-ID */ 
       break; 
     case 'n': 
       /* Write filename */ 
       break; 
     ... 
     } 
 } 
 
The ARTDATA structure contains information about
the current article such as its size, the host that sent
it, and so on. The MeetsSiteCriteria function is an
abstraction for the in-line tests that are done to see if
an article really should be propagated to a site (e.g.,
checking the Path header as described above).
AssignName is described below.

At its core, innd is an I/O scheduler that makes
callbacks when select(2) has determined that there is
activity on a descriptor. This is encapsulated in the
CHANNEL structure, which has the following
elements:

 enum TYPE Type; 
 enum STATE State; 
 int fd; 
 FUNCPTR Reader; 
 FUNCPTR WriteDone; 
 BUFFER In; 
 BUFFER Out; 
 
The Type field is used for internal consistency
checks. There are four different types of channels:
local-accept, remote-accept, local-control (used by
ctlinnd), and NNTP connection. Each type is
implemented in anywhere from 100 to 1200 lines of code.
The Reader and WriteDone function pointers and
the State enumeration are used for protocol-specific
data. For example, the State field is used by the NNTP
channel code to determine whether the site is
sending an NNTP command or an article. The BUFFER
datatype contains sized reusable I/O buffers that
grow as needed.
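
The layout of BUFFER is not given here. A minimal
sketch of a sized, reusable buffer that grows on
demand might look like the following; the IOBUF
type, its field names, the 64-byte starting size,
and the doubling policy are all assumptions, not
the innd definition.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for BUFFER. */
typedef struct {
    size_t Size;   /* bytes allocated */
    size_t Used;   /* bytes currently holding data */
    char  *Data;   /* the storage itself, or NULL if none yet */
} IOBUF;

/* Append len bytes, growing the storage if needed.
 * Returns 0 on success, -1 if memory could not be obtained. */
int BufAppend(IOBUF *bp, const char *p, size_t len)
{
    if (len > bp->Size - bp->Used) {
        size_t newsize = bp->Size ? bp->Size : 64;
        char  *newdata;

        /* Double until the new data fits. */
        while (newsize - bp->Used < len)
            newsize *= 2;
        newdata = realloc(bp->Data, newsize);
        if (newdata == NULL)
            return -1;
        bp->Data = newdata;
        bp->Size = newsize;
    }
    memcpy(bp->Data + bp->Used, p, len);
    bp->Used += len;
    return 0;
}

/* Reuse the buffer for the next message without freeing storage. */
void BufReset(IOBUF *bp)
{
    bp->Used = 0;
}
```

Keeping the allocation across resets means a
long-lived channel settles at the size of its
largest message and stops calling the allocator.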

At start time innd calls getdtablesize(2) to
create an array of channels that can be directly
indexed by descriptor.

The code to listen on the NNTP port is shown in
Figure 3. When a host connects to the NNTP port,
select(2) will report activity on the descriptor and
innd will call RemoteReader, which will accept the
connection and possibly fill in a new CHANNEL for
the resultant descriptor.
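
The descriptor-indexed table and the callback
dispatch can be sketched as follows. This is an
illustration under stated assumptions, not the
innd loop: the CHANSLOT type, its Calls counter,
and the ChanSetup, ChanCreate, ChanPoll, and
CountingReader names are hypothetical, and
sysconf(_SC_OPEN_MAX) is used in place of
getdtablesize(2). The scan runs in ascending
descriptor order and makes no attempt at fairness.

```c
#include <stdlib.h>
#include <sys/select.h>
#include <unistd.h>

typedef struct chanslot CHANSLOT;
typedef void (*FUNCPTR)(CHANSLOT *cp);

struct chanslot {
    int     fd;       /* -1 while the slot is unused */
    FUNCPTR Reader;   /* called when fd is readable */
    int     Calls;    /* hypothetical counter, for illustration */
};

static CHANSLOT *Channels;      /* indexed directly by descriptor */
static int       ChannelCount;

/* Allocate one slot per possible descriptor. Returns 0 or -1. */
int ChanSetup(void)
{
    long n = sysconf(_SC_OPEN_MAX);
    int  i;

    if (n <= 0 || n > FD_SETSIZE)
        n = FD_SETSIZE;         /* select() cannot watch more */
    Channels = malloc((size_t)n * sizeof *Channels);
    if (Channels == NULL)
        return -1;
    ChannelCount = (int)n;
    for (i = 0; i < ChannelCount; i++) {
        Channels[i].fd = -1;
        Channels[i].Reader = NULL;
        Channels[i].Calls = 0;
    }
    return 0;
}

/* Fill in the slot for fd; returns NULL if fd is out of range. */
CHANSLOT *ChanCreate(int fd, FUNCPTR reader)
{
    if (fd < 0 || fd >= ChannelCount)
        return NULL;
    Channels[fd].fd = fd;
    Channels[fd].Reader = reader;
    return &Channels[fd];
}

/* One pass of the loop: wait, then call the Reader of every
 * readable channel. Returns the number of callbacks made, 0 on
 * timeout, or -1 on error. */
int ChanPoll(struct timeval *timeout)
{
    fd_set rd;
    int    i, n, maxfd = -1, made = 0;

    FD_ZERO(&rd);
    for (i = 0; i < ChannelCount; i++)
        if (Channels[i].fd >= 0) {
            FD_SET(i, &rd);
            maxfd = i;
        }
    n = select(maxfd + 1, &rd, NULL, NULL, timeout);
    if (n <= 0)
        return n;
    for (i = 0; i <= maxfd; i++)
        if (FD_ISSET(i, &rd) && Channels[i].Reader != NULL) {
            Channels[i].Reader(&Channels[i]);
            made++;
        }
    return made;
}

/* Example callback: consume the pending input and count the call. */
void CountingReader(CHANSLOT *cp)
{
    char buff[256];

    if (read(cp->fd, buff, sizeof buff) > 0)
        cp->Calls++;
}
```

A listening socket would be registered the same way,
with a Reader that calls accept(2) and then
ChanCreate on the descriptor it returns.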

It took a bit of effort to write the callback loop
so that it was fair, i.e., so that the lowest
descriptors did not get priority treatment. The problem was

Summer '92 USENIX June 8 -June 12, 1992 San Antonio, TX