INN FAQ Part 2

NOTE: The maintainers of the INN FAQ stopped publishing in December 1997.
An important update to this topic is provided by Mib Software in the Usenet RKT for Subscribers.

Subject: (2.14) Solaris 2.x special needs


Solaris 2.5:
Sun claims that Solaris 2.5 no longer has the socket bug
(see Fix #7 below), but Dave Zavatson <dhzavatson@ucdavis.edu> writes
that the bug still exists. So if you see "Resource temporarily
unavailable" errors, you have to apply that fix.

Joe St Sauver <JOE@OREGON.UOREGON.EDU> submitted the following:
| Symptom: One of the topologically distant sites notices far lower than normal 
| article throughput. Further investigation by the remote site (using netstat) 
| identifies a large number of "completely duplicated packets" originating 
| with the Solaris feed host. 
 
| Resolution: The local Solaris 2.5 host had not applied Sun patches 103169-05 
| ("ip driver and ifconfig fixes") and 103447-03  ("tcp patch") as can be 
| obtained from ftp://sunsolve1.Sun.COM/pub/patches/patches.html 
| (Solaris 2.5.1 users, see 103582-01 and 103630-01).
 
| Without these patches, when working with hosts that are topologically 
| remote, TCP/IP throughput reportedly can drop to as little as 5% of 
| what it should be. 
| For further information, see: <199607140422.VAA04495@yorick.cygnus.com>
| quoting a 7 June 1996 article posted to comp.unix.solaris by Cathe A. Ray 
| (Manager of Internet Engineering for Sun).
 
| Thanks to Howard Goldstein <hgoldste@bbs.mpcs.com> for the detective work in
| isolating and resolving this problem!

SOLARIS 2.4: Install the Recommended cluster patch from Sun.
The Recommended cluster patch is:
ftp://sunsolve1.sun.com/pub/patches/2.4_Recommended.tar.Z
The README is:
ftp://sunsolve1.sun.com/pub/patches/2.4_Recommended.README
Then follow the directions in
ftp://ftp.isc.org/isc/inn/unoff-patches/OLD/solaris-2.4.patch.
The patch needs to be applied BY HAND; it is not in the correct format
to work with Larry Wall's patch program. Also, do *not* link with the
/usr/ucblib stuff, and HAVE_WAITPID should be set to "DO".
On 3/25/95 Sun introduced patch 101945-23 which fixes bug #1178506 titled
"INN wounded after upgrade to SunOS 5.4". This fixes the
"cant read Resource temporarily unavailable" bug that some
have reported.
But even though the Sun patch mentions
"1186224 socket select hangs in NON-BLOCKED mode", this seems not to
be totally fixed. Ian Dickinson <idickins@fore.com>
doesn't notice it on his lightly loaded
server, but on heavily loaded machines it occurs occasionally
(<5 times a day). See below for a patch (Solaris Fix #7).
It seems that the latest version of the kernel patch for Sparc is 101945-36;
101945-29 is known to work. For x86 the latest version is 101946-29,
which has problems with Unix domain sockets, so 101946-12 seems to be the
last usable one here.

Include /opt/SUNWspro/bin and /usr/bin in your path before /usr/ucb as
/usr/ucb/sed does not work well.

SOLARIS 2.3: If you install the "Recommended cluster patch" I *think*
you will only need to pay attention to Fix #5 listed below. It would
be helpful if people sent an update about this.
The Recommended cluster patch is:
ftp://sunsolve1.sun.com/pub/patches/2.3_Recommended.tar.Z
The README is:
ftp://sunsolve1.sun.com/pub/patches/2.3_Recommended.README

(note: If you trust other people to compile programs for you
[especially ones that run as root] you can get inn1.4sec pre-compiled
w/gcc at ccnews.ke.sanet.sk:/pub/solaris/inn1.4sec-src+bin.tar.gz)

INN works with Solaris 2.[0123]. It's not easy, but it will work.
The problem is that depending on which Solaris patches you have
installed, you have to install various INN patches. There are too
many combinations of Sun patches and INN patches to be able to say
what is required and what isn't. (See the "SOLARIS 2.3" tip above
for one tried and tested configuration).

Here is the general guide:

Step 1: Use the info for config.data for Solaris 2.x that is included
in Install.ms.
Step 2: As you go, if you get any of the problems listed below, try
the fix listed.

Eventually you will be up and running with only the fixes you need. If
you try to install ALL the fixes at once, things will definitely not
work.

COMPILER TIPS: Use gcc or /opt/SUNWspro/bin/cc. Do *not* use
/usr/ucb/cc. In fact, remove /usr/ucb from your path when you compile.
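
One way to follow this advice without touching your login files is to
strip /usr/ucb from the path of the build shell only. The following is
just a sketch for a Bourne-style shell; path_without is a made-up helper
name, not something shipped with INN or Solaris:

```shell
# Hypothetical helper: print the colon-separated path in $2 with every
# occurrence of the directory in $1 removed.
path_without() {
    drop=$1
    old_ifs=$IFS
    IFS=:
    result=
    for dir in $2; do
        # Skip the directory we want to get rid of.
        [ "$dir" = "$drop" ] && continue
        # Append, adding a colon only when result is non-empty.
        result=${result:+$result:}$dir
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}

# Show what the build path would look like (nothing is changed here).
path_without /usr/ucb "/opt/SUNWspro/bin:/usr/bin:/usr/ucb:/usr/sbin"
```

To use it for a build you would assign the result back, for example
PATH=`path_without /usr/ucb "$PATH"`, in the shell you run make from.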

For directory structure: be careful about /var/news, as the news(1)
tool also writes in this area and might damage your files. (Need more
input on this).

The patch program supplied with Solaris 2.5 appears not to understand
the "new-style" context diffs which virtually everyone uses these
days, so you have to fetch GNU patch as described in part8 of this
FAQ. Also, it doesn't know the -p0 option (it wants -p 0), and the file
to patch has to be writable.

---------- Solaris Fix #1

Under Solaris 2.[012] (SunOS 5.0, 5.1, 5.2) you must add the following
at the beginning of each file using gethostbyname():

#define gethostbyname __switch_gethostbyname

Under Solaris 2.3 gethostbyname() might work without changes depending
on your configuration. We haven't figured out when it works and when
it doesn't. If you run into problems, try to change "gethostbyname()"
to "solaris_gethostbyname()" and then use the gethostbyname() listed in
the Solaris Porting FAQ. This isn't a perfect solution, because you
now need a different binary for Solaris 2.[012] systems.

It also seems to be a good idea to put dns in front of nis in
/etc/nsswitch.conf:

hosts: dns nis files
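
If you want to check the ordering on a running system, something like
the following may do. It is only a sketch: dns_before_nis is a made-up
helper name, and it assumes the usual "hosts: source source ..." layout
of the file with no bracketed action criteria between dns and nis.

```shell
# Hypothetical helper: succeed if the hosts line of the nsswitch file
# named in $1 lists "dns", and lists it before "nis" (or has no "nis").
dns_before_nis() {
    awk '
        $1 == "hosts:" {
            for (i = 2; i <= NF; i++) {
                if ($i == "dns" && !dns) dns = i   # first position of dns
                if ($i == "nis" && !nis) nis = i   # first position of nis
            }
        }
        END { exit (dns && (!nis || dns < nis)) ? 0 : 1 }
    ' "$1"
}

# Example call; prints a hint only when the check fails.
if [ -f /etc/nsswitch.conf ]; then
    dns_before_nis /etc/nsswitch.conf || echo "hosts: dns is not ahead of nis"
fi
```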

It would be great if someone were to submit a solaris_gethostbyname()
function whose binary works under all Solaris revs and gives all the
semantics of BSD gethostbyname(). In particular, one that doesn't have
the problems discussed in Sun bugid #1126573 or #1135988. It would be
amazing if this were submitted by one of the many Sun employees that
flame the INN FAQ maintainer in comp.sys.sun.admin every time he bitches
about how much he hates Solaris 2.x. :-)

---------- Solaris Fix #2

Under all Solaris 2.* versions there is a problem with innwatch.ctl.
It expects to use "df -i" to find out how many inodes are free on your
disk. /usr/{sbin,5bin,bin}/df doesn't support the "-i" option; it has
a "-e" option that outputs the info you want, but in a different
format. You should use "/usr/ucb/df -i" instead, since this version of
df includes the "-i" option.

If you have too much space left on your disks (;-)) you will see the
following:

Filesystem             iused   ifree  %iused  Mounted on
/dev/md/dsk/d10      103495213433720     7%   /var/spool/news

The iused and ifree columns have run together into one number, so awk
picks up "7%" as the number of free inodes.
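
The failure can be reproduced, and guarded against, with a few lines of
shell. This is only a sketch: ifree_from_df_line is a made-up helper
name, and it assumes the inode check reads the third field of the df -i
data line. The split of the fused number into 1034952 used and 13433720
free is inferred from the 7% figure, not taken from a real system.

```shell
# Hypothetical helper: print the ifree column of one "df -i" data line,
# and fail if the line does not have five fields with a numeric third
# field (which is what happens when iused and ifree run together).
ifree_from_df_line() {
    printf '%s\n' "$1" | awk '
        NF == 5 && $3 ~ /^[0-9]+$/ { print $3; ok = 1 }
        END { exit !ok }'
}

# Well-formed line: prints the free inode count.
ifree_from_df_line '/dev/md/dsk/d10 1034952 13433720 7% /var/spool/news'

# Fused line from above: only four fields, so the helper refuses it.
ifree_from_df_line '/dev/md/dsk/d10      103495213433720     7%   /var/spool/news' \
    || echo "df -i columns ran together, cannot trust field 3"
```

A wrapper like this lets the check fail loudly instead of comparing
"7%" against the inode limit.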

Ian Dickinson <idickins@fore.com> wrote an inndf which can be found at
the usual place. This inndf, compiled with gcc and -DHAVE_STATVFS,
seems to work (according to Nash E. Foster <nef10958@usln1b.glaxo.com>).
A new version which works with large filesystems
is available from ftp://ftp.csv.warwick.ac.uk/pub/usenet/inn/inndf.tar.gz

If you have your news spool NFS mounted from another box, which is
absolutely not recommended (see INN FAQ #5.15, ME cant nonblock), then the
following might help: rsh other_box /usr/ucb/df -u /var/spool/news

/usr/ucb/df is part of the BSD Compatibility stuff. If you loaded
Solaris 2.x without that, you can replace innwatch.ctl's disk checks
with these lines:

##  If load is OK, check space (and inodes) on various filesystems
##  =()<!!! /usr/bin/df -k . | awk 'NR == 2 { print $4 }' ! lt ! @<INNWATCH_SPOOLSPACE>@ ! throttle ! No space (spool)>()=
!!! /usr/bin/df -k . | awk 'NR == 2 { print $4 }' ! lt ! 8000 ! throttle ! No space (spool)
##  =()<!!! /usr/bin/df -k @<_PATH_BATCHDIR>@ | awk 'NR == 2 { print $4 }' ! lt ! @<INNWATCH_BATCHSPACE>@ ! throttle ! No space (newsq)>()=
!!! /usr/bin/df -k /news2/spool/out.going | awk 'NR == 2 { print $4 }' ! lt ! 800 ! throttle ! No space (newsq)
##  =()<!!! /usr/bin/df -k @<_PATH_NEWSLIB>@ | awk 'NR == 2 { print $4 }' ! lt ! @<INNWATCH_LIBSPACE>@ ! throttle ! No space (newslib)>()=
!!! /usr/bin/df -k /news2/privcontrol | awk 'NR == 2 { print $4 }' ! lt ! 40000 ! throttle ! No space (newslib)
##  =()<!!! /usr/bin/df -k @<_PATH_OVERVIEWDIR>@ | awk 'NR == 2 { print $4 }' ! lt ! @<INNWATCH_OVERVIEWSPACE>@ ! throttle ! No space (overview)>()=
!!! /usr/bin/df -k /news3/overview | awk 'NR == 2 { print $4 }' ! lt ! 6000 ! throttle ! No space (overview)
##  =()<!!! /usr/bin/df -e . | awk 'NR == 2 { print $2 }' ! lt ! @<INNWATCH_SPOOLNODES>@ ! throttle ! No space (spool inodes)>()=
!!! /usr/bin/df -e . | awk 'NR == 2 { print $2 }' ! lt ! 200 ! throttle ! No space (spool inodes)
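
To see what one of these checks does before you put it into
innwatch.ctl, you can run the same awk expression by hand. The sketch
below mimics a single "lt" test; check_space is a made-up helper name
and the df -k output fed to it is an invented sample, not real data.

```shell
# Hypothetical helper mirroring one disk check: read "df -k" output on
# stdin and print "throttle" if the avail column (field 4 of line 2)
# is below the limit given in $1, otherwise print "ok".
check_space() {
    awk -v limit="$1" 'NR == 2 {
        result = ($4 + 0 < limit + 0) ? "throttle" : "ok"
        print result
    }'
}

# Invented sample: 5000 KB available against a limit of 8000 KB.
check_space 8000 <<'EOF'
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c0t3d0s5    2000000 1995000    5000   100%    /var/spool/news
EOF
```

On a live system you would pipe the real output in instead, for
example: /usr/bin/df -k . | check_space 8000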

---------- Solaris Fix #3

Don't run the "lint" step if you use Solaris. In fact, nobody needs to
execute this step except Rich, when he's writing new code. If you have
a Solaris machine without "lint", just make "lint" a symlink to
"/bin/echo".

---------- Solaris Fix #4

People running Solaris 2.3 have built INN with HAVE_UNIX_DOMAIN set to
TRUE and everything seems to be ok. I guess Sun has fixed enough
bugs in 2.3 to make it usable. I recommend the latest "recommended
patches" if you run any version of Solaris 2.x. To install all of
the "Recommended Patches" in one command, refer to:
ftp://sunsolve1.sun.com/pub/patches/patches.html

---------- Solaris Fix #5

If "inews" outputs "Bad Message-ID" when posting under Solaris 2.x
(where x = 0, 1, 2 or 3) you need to change the file "getfqdn.c". Find
the lines that read:

        if (strchr(hp->h_name, '.') == NULL) {
                /* Try to force DNS lookup if NIS/whatever gets in the way. */
                (void)strncpy(temp, buff, sizeof buff);
                (void)strcat(temp, ".");
                hp = gethostbyname(temp);
        }

and delete them.

---------- Solaris Fix #6

If posting gets you "441 Can't generate Message-ID, Error 0" and you
are running with DNS, then the problem is with Solaris 2.3's
gethostbyname() when it resolves through DNS. If you ask for a host as
"hostname." it returns "hostname." instead of "hostname.yourdomain.com"
as expected by nn. The workaround is to define "domain" in your inn.conf
and apply the following patch to getfqdn.c:

*** getfqdn.c.~1~       Sun Sep  4 09:02:37 1994
--- getfqdn.c   Sun Sep  4 09:53:11 1994
***************
*** 35,45 ****
      if ((hp = gethostbyname(buff)) == NULL)
        return NULL;
!     if (strchr(hp->h_name, '.') == NULL) {
!       /* Try to force DNS lookup if NIS/whatever gets in the way. */
!       (void)strncpy(temp, buff, sizeof buff);
!       (void)strcat(temp, ".");
!       hp = gethostbyname(temp);
!     }
!     if (hp != NULL && strchr(hp->h_name, '.') != NULL) {
        if (strlen(hp->h_name) < sizeof buff - 1)
            return strcpy(buff, hp->h_name);
--- 35,39 ----
      if ((hp = gethostbyname(buff)) == NULL)
        return NULL;
!     if (strchr(hp->h_name, '.') != NULL) {
        if (strlen(hp->h_name) < sizeof buff - 1)
            return strcpy(buff, hp->h_name);

---------- Solaris Fix #7

From Ian Dickinson <ian@fore.com>:
Sun appears to have reduced the frequency of the problem, but not fixed
the bug itself. I still need this under SunOS5.4 101945-29. You should
already have -DSUNOS5 in your DEFS setting in config.data anyway.
(Note that in 1.5.x this workaround is already in the source. You can
enable it by specifying -DPOLL_BUG in the DEFS settings in
config.data. Thanks to rhaskins@shiva.com who pointed that out).

This should apply - maybe with a bit of fuzz:

*** innd/chan.c.ORIG    Wed Dec 14 11:03:16 1994
--- innd/chan.c Thu Dec 15 17:00:54 1994
***************
*** 497,502 ****
--- 497,508 ----
      bp->Left = bp->Size - bp->Used;
      i = read(cp->fd, &bp->Data[bp->Used], bp->Left - 1);
      if (i < 0) {
+ #ifdef SUNOS5
+     /* return of -2 indicates EAGAIN, for SUNOS5.4 poll() bug workaround */
+       if (errno == EAGAIN) {
+           return -2;
+       }
+ #endif
        syslog(L_ERROR, "%s cant read %m", p);
        return -1;
      }
*** innd/nc.c.ORIG      Thu Mar 18 21:04:28 1993
--- innd/nc.c   Thu Dec 15 17:00:41 1994
***************
*** 783,788 ****
--- 783,794 ----
      /* Read any data that's there; ignore errors (retry next time it's our
       * turn) and if we got nothing, then it's EOF so mark it closed. */
      if ((i = CHANreadtext(cp)) < 0) {
+ #ifdef SUNOS5
+     /* return of -2 indicates EAGAIN, for SUNOS5.4 poll() bug workaround */
+       if (i == -2) {
+           return;
+       }
+ #endif
        if (cp->BadReads++ >= BAD_IO_COUNT) {
            if (NCcount > 0)
                NCcount--;

---------- Solaris Fix #8

From: Joe St Sauver <joe@decoy.uoregon.edu>

We recently upgraded some machines in our news farm to fast ethernet, and
after doing so we noticed poor performance (ping times of 30msec between
two machines each connected to dedicated switch ports on the same switch...).

Poking around a little, we noticed that under Solaris 2.5, tcp_conn_req_max
is set to 32 by default, which is a little low if you are working with a fair
number of peers or have a lot of readers. We bumped that value to 1000 or
so (1024 max under Solaris 2.5), using:

# ndd -set /dev/tcp tcp_conn_req_max 1000

and now ping times are back into the 0 or 1 msec reported range you'd hope to
see from that sort of topology. :-)
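
If you script this change, it may be worth keeping the value inside the
limit quoted above. The following is only a sketch: conn_req_max_cmd is
a made-up helper name, the 1024 ceiling is the Solaris 2.5 figure from
the text, and the helper just prints the ndd command so that you can
inspect it before running it as root.

```shell
# Hypothetical helper: print the ndd command for the requested
# tcp_conn_req_max, clamped to the 1024 maximum mentioned above.
conn_req_max_cmd() {
    want=$1
    max=1024
    if [ "$want" -gt "$max" ]; then
        want=$max
    fi
    printf 'ndd -set /dev/tcp tcp_conn_req_max %s\n' "$want"
}

# Print the command for the value used in the text.
conn_req_max_cmd 1000
```

Settings made with ndd are typically lost at reboot, so if the change
helps you will probably want to run the printed command from a startup
script as well.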

------------------------------

[Source: INN FAQ Part 2 Archive-name: usenet/software/inn-faq/part2]
[Last Changed: $Date: 1997/09/23 01:25:52 $ $Revision: 2.34 $]
[Copyright: 1997 Heiko Rupp, portions by Tom Limoncelli, Rich Salz, et al.]