usefor-article-04 April 2001

2.4.  Syntax Notation

   This standard uses the Augmented Backus Naur Form described in [RFC
   2234].  A discussion of this is outside the bounds of this standard,
   but it is expected that implementors will be able quickly to
   understand it with reference to that defining document.

   Much of the syntax of News Articles is based on the corresponding
   syntax defined in [MESSFOR] or in the MIME specifications [RFC 2045]
   et seq, which is deemed to have been incorporated into this standard
   as required. However, there are some important differences arising
   from the fact that [MESSFOR] does not recognise anything other than
   US-ASCII characters, that it does not recognise the MIME headers [RFC
   2045], and that it includes much syntax described as "obsolete".

        NOTE: News parsers historically have been much less permissive
        than Mail parsers, and this is reflected in the modifications
        referred to, and in some further specific rules.

   The following syntactic forms therefore supersede the corresponding
   rules given in [MESSFOR] and [RFC 2045], thus allowing UTF-8
   characters [RFC 2044] to appear in certain contexts (the four rules
   beginning with "strict-" reflect the corresponding original rules from
   [MESSFOR]).

      UTF8-xtra-head  = %d192-253
      UTF8-xtra-tail  = %d128-191
      UTF8-xtra-char  = UTF8-xtra-head 1*UTF8-xtra-tail
      text            = %d1-9 /            ; all UTF-8 characters except
              %d11-12 /          ; US-ASCII NUL, CR and LF
              %d14-127 /
              UTF8-xtra-char
      ctext           = NO-WS-CTL /        ; all of <text> except
              %d33-39 /          ; SP, HTAB, "(", ")"
              %d42-91 /          ; and "\"
              %d93-126 /
              UTF8-xtra-char
      qtext           = NO-WS-CTL /        ; all of <text> except
              %d33 /             ; SP, HTAB, "\" and DQUOTE
              %d35-91 /
              %d93-126 /
              UTF8-xtra-char
      utext           = NO-WS-CTL /        ; Non white space controls
              %d33-126 /         ; The rest of US-ASCII
              UTF8-xtra-char
      strict-text     = %d1-9 /            ; text restricted to
              %d11-12 /          ; US-ASCII
              %d14-127
      strict-qtext    = NO-WS-CTL /        ; qtext restricted to
              %d33 /             ; US-ASCII
              %d35-91 /
              %d93-127
      strict-quoted-pair
            = "\" strict-text
      strict-quoted-string
            = [CFWS] DQUOTE
                 *([FWS] (strict-qtext / strict-quoted-pair))
                 [FWS] DQUOTE [CFWS]

        NOTE: There are sequences of octets which cannot legitimately
        occur in UTF-8, even a few permitted by the above syntax. These
        SHOULD NOT be generated by posting agents but, where they occur
        inadvertently, they SHOULD be passed on untouched by other
        agents.
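
   The point of the preceding NOTE can be illustrated with a small
   sketch. It transcribes the <text> and <UTF8-xtra-char> rules into
   byte-level regular expressions and compares them with a strict UTF-8
   decoder. The Python names are illustrative assumptions, not part of
   this standard, and Python's built-in decoder is assumed here as the
   judge of validity; it implements a later and narrower definition of
   UTF-8 than [RFC 2044].

```python
import re

# Byte-level transcription of the rules above. All names are
# illustrative assumptions, not part of the standard.
UTF8_XTRA_CHAR = rb"[\xC0-\xFD][\x80-\xBF]+"      # UTF8-xtra-head 1*UTF8-xtra-tail
TEXT = rb"(?:[\x01-\x09\x0B\x0C\x0E-\x7F]|" + UTF8_XTRA_CHAR + rb")"
TEXT_RUN = re.compile(rb"(?:" + TEXT + rb")*")


def matches_text(octets: bytes) -> bool:
    """True if the whole octet string is derivable from *text."""
    return TEXT_RUN.fullmatch(octets) is not None


def is_valid_utf8(octets: bytes) -> bool:
    """True if Python's strict UTF-8 decoder accepts the octets."""
    try:
        octets.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # Each line shows: sample, accepted by the syntax?, valid UTF-8?
    for sample in (b"caf\xc3\xa9", b"\xc0\x80", b"\xc3", b"a\r\nb"):
        print(sample, matches_text(sample), is_valid_utf8(sample))
```

   In this sketch the octet pair 0xC0 0x80 satisfies the syntax but is
   rejected by the decoder, which is the kind of sequence the NOTE asks
   posting agents not to generate and other agents to pass on untouched.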

   Wherever in this standard the syntax is stated to be taken from
   [MESSFOR], it is to be understood as the syntax defined by [MESSFOR]
   after making the above changes, but NOT including any syntax defined
   in section 4 ("Obsolete syntax") of [MESSFOR].  Software compliant
   with this standard MUST NOT generate any of the syntactic forms
   defined in that Obsolete Syntax, although it MAY accept such
   syntactic forms. Certain syntax from the MIME specifications [RFC
   2045] et seq is also considered a part of this standard (see 6.21).
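
   As one illustration of what "after making the above changes" means in
   practice, the following sketch builds <quoted-string> twice: once
   from the UTF-8-aware <qtext> and <text>, and once from the "strict-"
   rules. It is a simplified reading only: the surrounding [CFWS] is
   not handled, and every Python name is an assumption made for the
   example.

```python
import re

# Illustrative transcription; names are assumptions, and the [CFWS]
# around the string is deliberately not handled.
NO_WS_CTL = rb"\x01-\x08\x0B\x0C\x0E-\x1F\x7F"    # character-class body only
UTF8_XTRA_CHAR = rb"[\xC0-\xFD][\x80-\xBF]+"
FWS = rb"(?:(?:[ \t]*\r\n)?[ \t]+)"

STRICT_TEXT = rb"[\x01-\x09\x0B\x0C\x0E-\x7F]"
STRICT_QTEXT = rb"[" + NO_WS_CTL + rb"\x21\x23-\x5B\x5D-\x7F]"
TEXT = rb"(?:" + STRICT_TEXT + rb"|" + UTF8_XTRA_CHAR + rb")"
QTEXT = (rb"(?:[" + NO_WS_CTL + rb"\x21\x23-\x5B\x5D-\x7E]|"
         + UTF8_XTRA_CHAR + rb")")


def quoted_string(qtext: bytes, text: bytes):
    """Compile DQUOTE *([FWS] (qtext / quoted-pair)) [FWS] DQUOTE."""
    pair = rb"\\" + text                           # quoted-pair = "\" text
    body = rb"(?:" + FWS + rb"?(?:" + qtext + rb"|" + pair + rb"))*"
    return re.compile(rb'"' + body + FWS + rb'?"')


QUOTED_STRING = quoted_string(QTEXT, TEXT)
STRICT_QUOTED_STRING = quoted_string(STRICT_QTEXT, STRICT_TEXT)
```

   Under these assumptions a quoted string containing the two octets
   0xC3 0xA9 matches QUOTED_STRING but not STRICT_QUOTED_STRING, while a
   pure US-ASCII quoted string matches both.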

   The following syntactic forms, taken from [RFC 2234] or from
   [MESSFOR], are repeated here for convenience only:

      ALPHA           = %x41-5A /          ; A-Z
              %x61-7A            ; a-z
      CR              = %x0D               ; carriage return
      CRLF            = CR LF
      DIGIT           = %x30-39            ; 0-9
      HTAB            = %x09               ; horizontal tab
      LF              = %x0A               ; line feed
      SP              = %x20               ; space
      NO-WS-CTL       = %d1-8 /            ; US-ASCII control characters
              %d11 /             ; which do not include the
              %d12 /             ; carriage return, line feed,
              %d14-31 /          ; and whitespace characters
              %d127
      WSP             = SP / HTAB          ; Whitespace characters
      FWS             = ([*WSP CRLF] 1*WSP); Folding whitespace
      atext           = ALPHA / DIGIT /
              "!" / "#" /        ; Any character except
              "$" / "%" /        ; controls, SP, and specials.
              "&" / "'" /        ; Used for atoms
              "*" / "+" /
              "-" / "/" /
              "=" / "?" /
              "^" / "_" /
              "`" / "{" /
              "|" / "}" /
              "~"
      atom            = [CFWS] 1*atext [CFWS]
      dot-atom        = [CFWS] dot-atom-text [CFWS]
      dot-atom-text   = 1*atext *( "." 1*atext )
      comment         = "(" *([FWS]
                 (ctext / quoted-pair / comment)) [FWS] ")"
      CFWS            = *([FWS] comment) (([FWS] comment) / FWS )
      DQUOTE          = %d34              ; quote mark
      quoted-pair     = "\" text
      quoted-string   = [CFWS] DQUOTE
                 *([FWS] (qtext / quoted-pair))
                 [FWS] DQUOTE [CFWS]
      unstructured    = *( [FWS] utext ) [FWS]
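
   For readers who find a worked form helpful, the <atext> and
   <dot-atom-text> rules can be sketched as a regular expression. The
   character class assumes the full <atext> punctuation set of
   [MESSFOR], including "{", and the function name is hypothetical. The
   optional [CFWS] that <dot-atom> allows on either side is not
   handled.

```python
import re

# atext as a character class: ALPHA / DIGIT / the listed punctuation.
ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"

# dot-atom-text = 1*atext *( "." 1*atext )
DOT_ATOM_TEXT = re.compile(ATEXT + r"+(?:\." + ATEXT + r"+)*")


def is_dot_atom_text(s: str) -> bool:
    """True if the whole string is derivable from dot-atom-text."""
    return DOT_ATOM_TEXT.fullmatch(s) is not None


if __name__ == "__main__":
    for sample in ("comp.lang.c", "a..b", ".a", "a b"):
        print(repr(sample), is_dot_atom_text(sample))
```

   Note that a leading, trailing or doubled "." is rejected, because
   each "." must be followed by at least one <atext> character.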

        NOTE: CFWS occurs at many places in the syntax in order to allow
        comments and extra whitespace to be inserted almost anywhere.
        The syntax is in fact ambiguous insofar as it may be impossible
        to tell in which of several possible ways a given comment or WS
        was produced. However, this does not lead to semantic ambiguity
        because, unless specifically stated otherwise, the presence or
        absence of a comment or additional WS has no semantic meaning
        and, in particular, it is a matter of indifference whether it
        forms a part of the syntactic construct preceding it or the one
        following it.
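
   The syntactic ambiguity described in the preceding NOTE can be made
   concrete for the simplest case, two adjacent atoms separated only by
   whitespace. The sketch below counts the ways in which the gap can be
   divided between the trailing [CFWS] of the first atom and the leading
   [CFWS] of the second. It assumes the gap contains no comments, so
   that each side is either empty or a single FWS; the function names
   are hypothetical.

```python
import re

# FWS = ([*WSP CRLF] 1*WSP), transcribed as a regular expression.
FWS = re.compile(r"(?:[ \t]*\r\n)?[ \t]+")


def optional_fws(s: str) -> bool:
    """True if s is empty or a single FWS, i.e. derivable from [FWS]."""
    return s == "" or FWS.fullmatch(s) is not None


def attachments(gap: str):
    """All ways to split the gap between two atoms into
    (trailing part of the first atom, leading part of the second)."""
    return [(gap[:i], gap[i:])
            for i in range(len(gap) + 1)
            if optional_fws(gap[:i]) and optional_fws(gap[i:])]


if __name__ == "__main__":
    for gap in (" ", "  ", "\r\n "):
        print(repr(gap), attachments(gap))
```

   Every split returned describes the same sequence of octets, which is
   why the choice between them carries no semantic meaning.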

        NOTE: Following [RFC 2234], literal text included in the syntax
        is to be regarded as case-insensitive.  However, in
        contradistinction to [MESSFOR], the Netnews protocols are
        sensitive to case in some instances (as in newsgroup names, some
        header parameters, etc.). Care has been taken to indicate this
        explicitly where required.

   The complete syntax defined in this standard is repeated, for
   convenience, in Appendix B.
Documents were processed to this format by Forrest J. Cavalier III