usefor-article-03 February 2000

5.5.  Newsgroups

   The Newsgroups header's content specifies which newsgroup(s) the
   article is posted to. It is an inheritable header (4.2.2.2) which
   SHOULD then become the default Newsgroups header of any followup,
   unless a Followup-To header is present to prescribe otherwise.

      Newsgroups-content  = newsgroup-name
                            *( *FWS ng-delim *FWS newsgroup-name )
                            *FWS
      newsgroup-name      = component *( "." component )
      component           = component-start
                            *( component-start / component-other )
      component-start     = Un-lowercase / Un-digit
      Un-lowercase        = <Unicode Letter, Lowercase> /
                            <Unicode Letter, Other>
      Un-digit            = <Unicode Number, Decimal Digit> /
                            <Unicode Number, Other>
      component-other     = "+" / "-" / "_"
      ng-delim            = ","
   where the <Unicode ...> items are as described in [UNICODE].
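
   As an illustration only, the grammar above might be checked along
   the following lines. This is a minimal sketch, not part of the
   standard: the function names are hypothetical, and it assumes that
   the general categories reported by Python's unicodedata module (Ll,
   Lo, Nd, No) correspond to the <Unicode ...> items in the syntax. It
   is also more lenient than the grammar in that it tolerates white
   space before the first newsgroup-name.

```python
import unicodedata

# General categories permitted by component-start in the grammar:
# Ll (Letter, Lowercase), Lo (Letter, Other),
# Nd (Number, Decimal Digit), No (Number, Other).
START_CATEGORIES = {"Ll", "Lo", "Nd", "No"}
OTHER_CHARS = {"+", "-", "_"}


def is_component(text: str) -> bool:
    """Return True if text matches the 'component' production."""
    if not text:
        return False
    if unicodedata.category(text[0]) not in START_CATEGORIES:
        return False
    return all(
        ch in OTHER_CHARS or unicodedata.category(ch) in START_CATEGORIES
        for ch in text[1:]
    )


def is_newsgroup_name(text: str) -> bool:
    """Return True if text is one or more components joined by '.'."""
    return all(is_component(part) for part in text.split("."))


def parse_newsgroups_content(content: str) -> list[str]:
    """Split a Newsgroups-content on ',' and drop white space around names.

    Raises ValueError if any name fails the syntax check.
    """
    names = [part.strip(" \t\r\n") for part in content.split(",")]
    for name in names:
        if not is_newsgroup_name(name):
            raise ValueError(f"not a valid newsgroup-name: {name!r}")
    return names
```

   Note that this checks the syntax only; a name such as "alt.123"
   passes here even though the text of this section imposes further
   requirements on components.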

   The inclusion of folding white space within a Newsgroups-content is a
   newly introduced feature in this standard. It MUST be accepted by all
   conforming implementations (relaying agents, serving agents and
   reading agents).  Posting agents should be aware that such postings
   may be rejected by overly-critical old-style relaying agents. When a
   sufficient number of relaying agents are in conformance, posting
   agents SHOULD generate such whitespace in the form of <CRLF WS> so as
   to keep the length of lines in the relevant headers (notably
   Newsgroups and Followup-To) to no more than 79 characters (or
   other agreed policy limit - see 4.5).  Before such critical mass
   occurs, injecting agents MAY reformat such headers by removing
   whitespace inserted by the posting agent, but relaying agents MUST
   NOT do so.
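
   The folding described above could be generated by a posting agent
   roughly as follows. This is a sketch under stated assumptions: the
   function name is hypothetical, the limit is counted in characters
   rather than octets, and folds are placed only after an ng-delim
   (the grammar does not permit white space inside a newsgroup-name,
   so a single over-long name cannot be folded).

```python
def fold_newsgroups(names, header="Newsgroups", limit=79):
    """Build a header line, folding with CRLF + space after commas.

    A line holding a single name longer than the limit is emitted
    unfolded, since no white space may occur within a newsgroup-name.
    """
    lines = []
    current = f"{header}: "
    names_on_line = 0
    last = len(names) - 1
    for index, name in enumerate(names):
        piece = name if index == last else name + ","
        # Fold before this name if it would push the line past the limit
        # and the current line already carries at least one name.
        if names_on_line and len(current) + len(piece) > limit:
            lines.append(current)
            current = " "  # a continuation line begins with white space
            names_on_line = 0
        current += piece
        names_on_line += 1
    lines.append(current)
    return "\r\n".join(lines)
```

   An injecting agent wishing to undo such folding before critical
   mass occurs would simply remove each <CRLF WS> sequence again.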

   A newsgroup-name consists of one or more components. Components MAY
   contain non-ASCII letters, but these MUST be encoded in UTF-8 and not
   according to [RFC 2047].  A component MUST contain at least one
   letter (and MUST, according to the syntax, begin with a letter or
   digit). Components SHOULD begin with a letter.  Composite characters
   (made by overlaying one character with another) and format
   characters, as allowed in certain parts of Unicode and needed by
   certain languages, must use whatever canonical conventions apply to
   those parts of Unicode (such conventions are not defined in this
   Standard). The use of "_" in a component is deprecated. Serving
   agents MAY refuse to accept newsgroups using such a component.
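
   The requirements in the preceding paragraph go beyond the bare
   syntax. A tool used when creating newsgroups might report them as
   in the following sketch. The function name and the wording of the
   diagnostics are hypothetical, and a "letter" is assumed to mean any
   character whose Unicode general category begins with "L"; the
   canonical conventions for composite characters are not addressed.

```python
import unicodedata


def check_component(component: str) -> list[str]:
    """Return diagnostics for one component that already passed the syntax."""
    problems = []
    # "A component MUST contain at least one letter."
    if not any(unicodedata.category(ch).startswith("L") for ch in component):
        problems.append("error: no letter in component")
    # "Components SHOULD begin with a letter."
    if component and not unicodedata.category(component[0]).startswith("L"):
        problems.append("warning: does not begin with a letter")
    # 'The use of "_" in a component is deprecated.'
    if "_" in component:
        problems.append("warning: '_' is deprecated")
    return problems
```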

        NOTE: Components composed entirely of digits would cause
        problems for the commonly used implementation technique of using
        the component as the name of a directory, whilst also using
        sequential numbers to distinguish the articles within a group.
        Components containing other non-permitted characters could cause
        problems when newsgroup-names appear in URLs [RFC 1738] (for
        example an '@' character would prevent distinguishing between
        newsgroup-names and message identifiers).

        NOTE: According to the syntax, uppercase letters cannot occur in
        newsgroup-names, but this standard imposes no requirement on
        software to check this condition, since it would be unreasonable
        to expect it to do so in parts of Unicode for which it was not
        configured (in general, a table lookup is required). Rather, it
        is the responsibility of those creating new newsgroups (7.1) not
        to violate it. It is, moreover, to be expected that a newsgroup
        created in violation of this condition will not be propagated
        particularly well.

   Whilst there is no longer any technical reason to limit the length of
   a component (formerly, it was limited to 14 characters) nor to limit
   the total length of a newsgroup-name, it should be noted that these
   names are also used in the newsgroups line (7.1.2) where an overall
   policy limit applies, and moreover excessively long names can be
   exceedingly inconvenient in practical use.  Agencies responsible for
   individual hierarchies SHOULD therefore, as a matter of policy, set
   reasonable limits for the length of a component and of a newsgroup-
   name. In the absence of such explicit policies, the default figures
   are 30 characters and 71 characters respectively.
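
   A check against the default figures might look like the sketch
   below. It assumes that lengths are counted in characters (not in
   octets of the UTF-8 encoding) and that the separating "." characters
   count towards the length of the newsgroup-name; neither point is
   settled by the text above, and the names used are hypothetical.

```python
DEFAULT_COMPONENT_LIMIT = 30  # characters, default when no policy is set
DEFAULT_NAME_LIMIT = 71       # characters, default when no policy is set


def within_length_policy(name,
                         component_limit=DEFAULT_COMPONENT_LIMIT,
                         name_limit=DEFAULT_NAME_LIMIT):
    """Return True if the name and each of its components fit the limits."""
    if len(name) > name_limit:
        return False
    return all(len(part) <= component_limit for part in name.split("."))
```

   A hierarchy with its own policy would pass its own limits in place
   of the defaults.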

[If the checkpolicies proposal is included in the Standard, there should
be a reference to it here.]

        NOTE: The newsgroup-name as encoded in UTF-8 should be regarded
        as the canonical form. Reading agents may convert it to whatever
        character set they are able to display (see 4.4.1) and serving
        agents may need to convert it to some form more
        suitable as a filename. Simple algorithms for both kinds of
        conversion are readily available.  Observe that the syntax does
        not allow comments within the Newsgroups header; this is to
        simplify processing by relaying and serving agents which have a
        requirement to process this header extremely rapidly.
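
   One possible conversion to a form suitable as a filename is
   sketched below. It is purely an assumption made for illustration
   (this standard prescribes no such algorithm): each component
   becomes one directory level, and octets of the UTF-8 form outside
   the ASCII range are percent-escaped by means of Python's
   urllib.parse.quote. The function names are hypothetical.

```python
from urllib.parse import quote, unquote


def name_to_path(name: str) -> str:
    """Map a newsgroup-name to an ASCII-only relative directory path."""
    # "+" is kept literally; "-" and "_" are never escaped by quote().
    return "/".join(quote(part, safe="+") for part in name.split("."))


def path_to_name(path: str) -> str:
    """Invert name_to_path, recovering the canonical form of the name."""
    return ".".join(unquote(part) for part in path.split("/"))
```

   Because the mapping is reversible, the canonical UTF-8 form can
   always be recovered from the stored path.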

   Posters SHOULD use only the names of existing newsgroups in the
   Newsgroups header. However, it is legitimate to cross-post to
   newsgroup(s) which do not exist on the posting agent's host, provided
   that at least one of the newsgroups DOES exist there, and followup
   agents SHOULD accept this (posting agents MAY accept it, but SHOULD
   at least alert the poster to the situation and request confirmation).
   Relaying agents MUST NOT rewrite Newsgroups headers in any way, even
   if some or all of the newsgroups do not exist on the relaying agent's
   host. Serving agents MUST NOT create new newsgroups simply because an
   unrecognised newsgroup-name occurs in a Newsgroups header (see 7.1
   for the correct method of newsgroup creation).
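
   The cross-posting rule for posting agents could be applied as in
   the following sketch, in which the function name and the notion of
   a set of locally known newsgroups are assumptions made for
   illustration. The posting is acceptable if at least one listed
   newsgroup exists locally, and the unknown names are returned so
   that the poster can be alerted and asked for confirmation.

```python
def review_crosspost(newsgroups, local_groups):
    """Classify a proposed Newsgroups list against locally known groups.

    Returns (acceptable, unknown): 'acceptable' is True when at least
    one listed newsgroup exists locally; 'unknown' lists the names that
    do not, for which the poster should be asked to confirm.
    """
    unknown = [name for name in newsgroups if name not in local_groups]
    acceptable = len(unknown) < len(newsgroups)
    return acceptable, unknown
```

   Nothing in this check alters the Newsgroups header itself or
   creates a newsgroup; it only decides whether to prompt the poster.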

   The Newsgroups header is intended for use in Netnews articles rather
   than in mail messages. It MAY be used in a mail message to indicate
   that it is a copy also posted to the listed newsgroups, but it SHOULD
   NOT be used in a mail-only reply to a Netnews article (thus the
   "inheritable" property of this header applies only to followups to a
   newsgroup, and not to followups to the poster). Moreover, if a
   newsgroup-name contains any non-ASCII character, it MAY be encoded
   using the mechanism defined in [RFC 2047] when sent by mail but, if
   it is subsequently returned to the Netnews environment, it MUST then
   be re-encoded into UTF-8.
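
   A gateway between Netnews and mail might perform the two
   conversions roughly as follows, using the email.header module of
   the Python standard library. This is a sketch only: the function
   names are hypothetical, it handles a single newsgroup-name rather
   than a whole header, and it leaves the choice between the "b" and
   "q" encodings of [RFC 2047] to the library.

```python
from email.header import Header, decode_header


def name_for_mail(name: str) -> str:
    """Encode a newsgroup-name per RFC 2047 if it is not pure ASCII."""
    if name.isascii():
        return name
    return Header(name, charset="utf-8").encode()


def name_for_news(value: str) -> str:
    """Recover the name from a mail header value for use in Netnews."""
    parts = []
    for chunk, charset in decode_header(value):
        # Encoded words come back as bytes together with their charset;
        # a value with no encoded words comes back as a plain str.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")
        parts.append(chunk)
    return "".join(parts)
```

   The string returned by name_for_news would then be written out in
   UTF-8 when the article re-enters the Netnews environment.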
#Diff to first older

--- ../s-o-1036/Newsgroups.out          June 1994
+++ ../usefor-article-03/Newsgroups.out          February 2000
@@ -1,272 +1,112 @@
 5.5. Newsgroups
 
-The Newsgroups header's content specifies which newsgroup(s)
-the article is posted to:
-
-     Newsgroups-content  = newsgroup-name *( ng-delim newsgroup-name )
-     newsgroup-name      = plain-component *( "." component )
-     component           = plain-component / encoded-word
-     plain-component     = component-start *13component-rest
-     component-start     = lowercase / digit
-     lowercase           = <letter a-z>
-     component-rest      = component-start / "+" / "-" / "_"
+   The Newsgroups header's content specifies which newsgroup(s) the
+   article is posted to. It is an inheritable header (4.2.2.2) which
+   SHOULD then become the default Newsgroups header of any followup,
+   unless a Followup-To header is present to prescribe otherwise.
+
+      Newsgroups-content  = newsgroup-name
+                     *( *FWS ng-delim *FWS newsgroup-name )
+                     *FWS
+      newsgroup-name      = component *( "." component )
+      component           = component-start
+                     *( component-start / component-other )
+      component-start     = Un-lowercase / Un-digit
+      Un-lowercase        = <Unicode Letter, Lowercase> /
+                  <Unicode Letter, Other>
+      Un-digit            = <Unicode Number, Decimal Digit> /
+                  <Unicode Number, Other>
+      component-other     = "+" / "-" / "_"
      ng-delim            = ","
+   where the <Unicode ...> items are as described in [UNICODE].
 
-Encoded words used in newsgroup names MUST not contain char-
-acters other than letters, digits, "+", "-", "/", "_",  "=",
-and "?"  (although they may encode them).
-
-A  newsgroup  name consists of one or more components, which
-may be plain components or (except for  the  first)  encoded
-words.   A plain component MUST contain at least one letter,
-MUST begin with a letter or digit, and MUST  not  be  longer
-than  14  characters.  The first component MUST begin with a
-letter; subsequent components SHOULD begin  with  a  letter.
-Newsgroup  names  MUST not contain uppercase letters, except
-where required by encodings in encoded words.  The sequences
-"all" and "ctl" MUST not be used as components.
-
-     NOTE:  The  alphabet  and  syntax specified encom-
-     passes all  existing  names  of  widespread  news-
-     groups,  while  avoiding  various  forms  that are
-     known to cause problems.  Important existing soft-
-     ware  uses  various non-alphanumeric characters as
-     punctuation  adjacent  to  newsgroup  names.   (It
-     would,  in  fact,  be  preferable  to ban "+" from
-     newsgroup  names,  were  it   not   that   several
-     widespread  newsgroups related to the C++ program-
-     ming language already use it.)
-
-     NOTE: Much existing software  converts  the  news-
-     group  name  into  a directory path and stores the
-     articles themselves using  numeric  filenames,  so
-
-INTERNET DRAFT to be        NEWS                    sec. 5.5
-
-
-     all-digit  name components can be troublesome; the
-     "Great Renaming" early in the  history  of  Usenet
-     included  revisions  of several newsgroup names to
-     eliminate such components.
-
-     NOTE: The same storage technique is the reason for
-     the  14-character limit.  The limit is now largely
-     historical, since most modern  systems  have  much
-     larger limits on the length of a directory entry's
-     name, but many old systems are still in use.  Sys-
-     tems  with  shorter  limits  also  exist, but news
-     software on such systems has had to deal with  the
-     problem   already,   since   there   are   several
-     widespread newsgroups with 14-character components
-     in  their  names.  Implementors are warned that it
-     is intended that the successor to this Draft  will
-     increase  the 14-character limit, and are urged to
-     fix their software to handle longer  names  grace-
-     fully  (if  such  fixes  are  necessary, given the
-     intended domain of application of  the  particular
-     software).
-
-     NOTE:  The requirement that the first character of
-     a name be a letter accommodates existing  software
-     which assumes it can tell the difference between a
-     newsgroup name and other possible syntactic  enti-
-     ties  by  inspecting the first character.  Similar
-     considerations motivate excluding  "+",  "-",  and
-     "_"  from  coming  first  in  a component, and the
-     preference for components that do not  begin  with
-     digits.   The "all" sequence is used as a wildcard
-     symbol in much existing software,  and  the  "ctl"
-     sequence  was  involved  in an obsolete historical
-     mechanism for marking control  messages,  so  they
-     are best avoided.
-
-     NOTE:  Possibly  newsgroup  names should have been
-     case-insensitive, but all existing software treats
-     them  as  case-sensitive.   (RFC  977 [rrr] claims
-     that they are case-insensitive in NNTP, but exist-
-     ing  implementations are believed to ignore this.)
-     The simplest solution is just to ban use of upper-
-     case  letters,  since no widespread newsgroup name
-     uses them anyway; this avoids any  possibility  of
-     confusion.
-
-     NOTE:  The syntax has the disadvantage of contain-
-     ing no white space, making it impossible  to  con-
-     tinue  a  Newsgroups  header across several lines.
-     Implementors of relayers and  reading  agents  are
-     warned  that  it is intended that the successor to
-     this Draft will change the definition of  ng-delim
-     to:
-
-INTERNET DRAFT to be        NEWS                    sec. 5.5
-
-
-          ng-delim = "," [ space ]
-
-     and  are  urged  to  fix  their software to handle
-     (i.e., ignore) white space following  the  commas.
-     Meanwhile, posters must avoid inserting such space
-     (despite  the  natural-language  convention  which
-     permits  it)  and  posting  agents should strip it
-     out.
-
-     NOTE: Encoded words  as  components  are  somewhat
-     problematic,  but are clearly desirable for use in
-     non-English-speaking nations.  They are  not  sub-
-     ject to the 14-character limit, and this (plus the
-     possibility of "/" within them) may  require  spe-
-     cial handling in news software.
-
-Encoded words are allowed in newsgroup names ONLY where non-
-ASCII characters are necessary to the name, and must use the
-"b"  encoding  [rrr] and the first suitable character set in
-the MIME order of preferred character sets [rrr].
-
-     NOTE: Since the  newsgroup  name  is  the  encoded
-     form,  NOT the underlying non-ASCII form, there is
-     room for terrible confusion here if the choice  of
-     encoding  for a particular name is not fully stan-
-     dardized.
-
-Posters SHOULD use only the names of existing newsgroups  in
-the  Newsgroups  header,  because newsgroups are NOT created
-simply by being posted to.  However,  it  is  legitimate  to
-cross-post to newsgroup(s) which do not exist on the posting
-agent's host, provided that at least one of  the  newsgroups
-DOES  exist  there,  and  followup  agents  MUST accept this
-(posting agents MAY accept it, but SHOULD at least alert the
-poster to the situation and request confirmation).  Relayers
-MUST not rewrite Newsgroups headers in any way, even if some
-or all of the newsgroups do not exist on the relayer's host.
-
-     NOTE: Early experience  with  news  software  that
-     created  newsgroups  when they were mentioned in a
-     Newsgroups header was thoroughly negative: posters
-     frequently mistype newsgroup names.
-
-     NOTE:  While it is legitimate for some of an arti-
-     cle's newsgroups not to exist on the host where it
-     is  posted,  this  IS  a  rather unusual situation
-     except in followups (which should go to all  news-
-     groups  the  precursor  was posted to, even if not
-     all of them reach the site where the  followup  is
-     being posted).
-
-     NOTE:   Rewriting   Newsgroups  headers  to  strip
-     locally-unknown   newsgroups   is    superficially
-     attractive.    However,   early   experience  with
-
-INTERNET DRAFT to be        NEWS                    sec. 5.5
-
-
-     exactly that policy was thoroughly negative:  news
-     propagation   is  more  redundant  and  much  less
-     orderly than many people imagine, and in  particu-
-     lar  it  is  not  unheard-of  for  the (sometimes)
-     fastest path between two (say) U of Toronto  sites
-     to  pass  outside  U  of  Toronto... in which case
-     newsgroup stripping can cause incomplete  propaga-
-     tion.   Having  an  article's  set  of  newsgroups
-     change as it propagates can also  result  in  fol-
-     lowups  not  achieving the same propagation as the
-     original.  It's been tried; it's more trouble than
-     it's worth; don't do it.
-
-     NOTE:  In particular, newsgroup stripping superfi-
-     cially looks like a solution  to  the  problem  of
-     duplicate  regional newsgroup names.  For example,
-     both University of Toronto and University of Texas
-     have  "ut.general" newsgroups, and material cross-
-     posted to that name and a global newsgroup appears
-     in  both universities' local newsgroups.  However,
-     the side effects  of  stripping  are  sufficiently
-     unacceptable  to  disqualify  it for this purpose.
-     Don't do it.
-
-Cross-posting an article to several relevant  newsgroups  is
-far  superior  to  posting separate articles with duplicated
-content to each newsgroup, because reading agents can detect
-the  situation  and  show the article to a reader only once.
-Posters SHOULD cross-post rather than duplicate-post.
-
-     NOTE: On the other hand, cross-posting to a  large
-     number  of  newsgroups  usually indicates that the
-     poster has not thought about his  audience;  arti-
-     cles  are rarely pertinent to more than (say) half
-     a dozen newsgroups.  Posting agents might wish  to
-     request confirmation when the number of newsgroups
-     exceeds (say) five in the presence of a  Followup-
-     To  header,  or (say) two in the absence of such a
-     header.
-
-     NOTE: One problem with cross-postings is  what  to
-     do  with an article cross-posted to a set of news-
-     groups including both  moderated  and  unmoderated
-     ones.   Posters  tend to expect such an article to
-     show up immediately in the unmoderated newsgroups,
-     especially if they do not realize that one or more
-     of the newsgroups is moderated.  However, since it
-     is  not  possible for a moderator to retroactively
-     add an already-posted article to a moderated news-
-     group,  the only correct action is to mail such an
-     article to one (and only one)  of  the  moderators
-     for  action.   It is probably best for the posting
-     agent to detect this situation and ask the  poster
-     what  action is preferred.  The acceptable choices
-
-INTERNET DRAFT to be        NEWS                    sec. 5.5
-
-
-     are to alter the newsgroup list or to  mail  to  a
-     moderator  of  the  poster's  choice;  the posting
-     agent should NOT  offer  duplicate-posting  as  an
-     easy-to-request  option (if only because many mod-
-     erators will reject a submission that has  already
-     been posted to unmoderated newsgroups).
-
-     NOTE:  An  article cross-posted to multiple moder-
-     ated newsgroups really should have  approval  from
-     all  the  moderators  involved.   In practice, the
-     only straightforward way to do this is to send the
-     article  to  one  of them and have him consult the
-     others.
-
-A newsgroup SHOULD not appear more than once  in  the  News-
-groups header.
-
-Newsgroup  names  having only one component are reserved for
-newsgroups whose propagation is restricted to a single  host
-(or  the  administrative  equivalent).  It is inadvisable to
-name a newsgroup "poster"  because  that  word  has  special
-meaning  in  the  Followup-To header (see section 6.1).  The
-names "control" and "junk" are frequently used  for  pseudo-
-newsgroups  internal  to  relayer implementations, and hence
-are also best avoided.
-
-     NOTE: Beware of the  duplicate-regional-newsgroup-
-     names  problem  mentioned  above.   In particular,
-     there are many, many hosts with a newsgroup  named
-     "general",  and  some surprising things show up in
-     such newsgroups when  people  cross-post.   It  is
-     probably  better  to  use  multi-component  names,
-     which are less likely to  be  duplicated.   Fred's
-     Widget  House should use "fwh.general" rather than
-     just  "general"  as  its  in-house  general-topics
-     newsgroup.
-
-It is conventional to reserve newsgroup names beginning with
-"to." for test messages sent  on  an  essentially  point-to-
-point basis (see also the ihave/sendme protocol described in
-section 7.2); newsgroup names beginning  with  "to."  SHOULD
-not be used for any other purpose.  The second (and possibly
-later) components of such a name should, together,  comprise
-the  relayer name (see section 5.6) of a relayer.  The news-
-group exists only at the named relayer  and  its  neighbors.
-The  neighbors all pass that newsgroup to the named relayer,
-while the named relayer does not pass it to anyone.
-
-The order of newsgroup names in the Newsgroups header is not
-significant.
-
-INTERNET DRAFT to be        NEWS                    sec. 5.6
+   The inclusion of folding white space within a Newsgroups-content is a
+   newly introduced feature in this standard. It MUST be accepted by all
+   conforming implementations (relaying agents, serving agents and
+   reading agents).  Posting agents should be aware that such postings
+   may be rejected by overly-critical old-style relaying agents. When a
+   sufficient number of relaying agents are in conformance, posting
+   agents SHOULD generate such whitespace in the form of <CRLF WS> so as
+   to keep the length of lines in the relevant headers (notably
+   Newsgroups and Followup-To) to no more than than 79 characters (or
+   other agreed policy limit - see 4.5).  Before such critical mass
+   occurs, injecting agents MAY reformat such headers by removing
+   whitespace inserted by the posting agent, but relaying agents MUST
+   NOT do so.
+
+   A newsgroup-name consists of one or more components. Components MAY
+   contain non-ASCII letters, but these MUST be encoded in UTF-8 and not
+   according to [RFC 2047].  A component MUST contain at least one
+   letter (and MUST, according to the syntax, begin with a letter or
+   digit). Components SHOULD begin with a letter.  Composite characters
+   (made by overlaying one character with another) and format
+   characters, as allowed in certain parts of Unicode and needed by
+   certain languages, must use whatever canonical conventions apply to
+   those parts of Unicode (such conventions are not defined in this
+   Standard). The use of "_" in a component is deprecated. Serving
+   agents MAY refuse to accept newsgroups using such a component.
+
+        NOTE: Components composed entirely of digits would cause
+        problems for the commonly used implementation technique of using
+        the component as the name of a directory, whilst also using
+        sequential numbers to distinguish the articles within a group.
+        Components containing other non-permitted characters could cause
+        problems when newsgroup-names appear in URLs [RFC 1738] (for
+        example an '@' character would prevent distinguishing between
+        newsgroup-names and message identifiers).
+
+        NOTE: According to the syntax, uppercase letters cannot occur in
+        newsgroup-names, but this standard imposes no requirement on
+        software to check this condition, since it would be unreasonable
+        to expect it to do so in parts of Unicode for which it was not
+        configured (in general, a table lookup is required). Rather, it
+        is the responsibility of those creating new newsgroups (7.1) not
+        to violate it. It is, moreover, to be expected that a newsgroup
+        created in violation of this condition will not be propagated
+        particularly well.
+
+   Whilst there is no longer any technical reason to limit the length of
+   a component (formerly, it was limited to 14 characters) nor to limit
+   the total length of a newsgroup-name, it should be noted that these
+   names are also used in the newsgroups line (7.1.2) where an overall
+   policy limit applies, and moreover excessively long names can be
+   exceedingly inconvenient in practical use.  Agencies responsible for
+   individual hierarchies SHOULD therefore, as a matter of policy, set
+   reasonable limits for the length of a component and of a newsgroup-
+   name. In the absence of such explicit policies, the default figures
+   are 30 characters and 71 characters respectively.
+[If the checkpolicies proposal is included in the Standard, there should
+be a reference to it here.]
+        NOTE: The newsgroup-name as encoded in UTF-8 should be regarded
+        as the canonical form. Reading agents may convert it to whatever
+        character set they are able to display (see 4.4.1) and serving
+        agents may possibly need to convert it to some form more
+        suitable as a filename. Simple algorithms for both kinds of
+        conversion are readily available.  Observe that the syntax does
+        not allow comments within the Newsgroups header; this is to
+        simplify processing by relaying and serving agents which have a
+        requirement to process this header extremely rapidly.
+
+   Posters SHOULD use only the names of existing newsgroups in the
+   Newsgroups header. However, it is legitimate to cross-post to
+   newsgroup(s) which do not exist on the posting agent's host, provided
+   that at least one of the newsgroups DOES exist there, and followup
+   agents SHOULD accept this (posting agents MAY accept it, but SHOULD
+   at least alert the poster to the situation and request confirmation).
+   Relaying agents MUST NOT rewrite Newsgroups headers in any way, even
+   if some or all of the newsgroups do not exist on the relaying agent's
+   host. Serving agents MUST NOT create new newsgroups simply because an
+   unrecognised newsgroup-name occurs in a Newsgroups header (see 7.1
+   for the correct method of newsgroup creation).
+
+   The Newsgroups header is intended for use in Netnews articles rather
+   than in mail messages. It MAY be used in a mail message to indicate
+   that it is a copy also posted to the listed newsgroups, but it SHOULD
+   NOT be used in a mail-only reply to a Netnews article (thus the
+   "inheritable" property of this header applies only to followups to a
+   newsgroup, and not to followups to the poster). Moreover, if a
+   newsgroup-name contains any non-ASCII character, it MAY be encoded
+   using the mechanism defined in [RFC 2047] when sent by mail but, if
+   it is subsequently returned to the Netnews environment, it MUST then
+   be re-encoded into UTF-8.
 

Documents were processed to this format by Forrest J. Cavalier III