
          All  octets found in headers MUST be ASCII characters.  How-
          ever, it is desirable to have a way  of  encoding  non-ASCII
          characters,  especially  in "human-readable" headers such as
          Subject.  MIME [rrr]  provides  a  way  to  do  this.   Full
          details  may be found in the MIME specifications; herewith a
          quick summary to alert software authors to the issues...

               encoded-word  = "=?" charset "?" encoding "?" codes "?="
               charset       = 1*tag-char
               encoding      = 1*tag-char
               tag-char      = <ASCII printable character except !()<>@,;:\"[]/?=>
               codes         = 1*code-char
               code-char     = <ASCII printable character except ?>
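
          As an illustration only, the grammar above can be turned into a
          regular expression.  The sketch below is not part of this Draft.
          It assumes that "ASCII printable" means octets 0x21 through 0x7E
          (space excluded) and that the \" in the tag-char exclusion list
          stands for both the backslash and the double quote.  The name
          parse_encoded_word is hypothetical.

```python
import re

# Characters the grammar excludes from tag-char. Assumption: the \" in the
# grammar is read as excluding both the backslash and the double quote.
EXCLUDED = '!()<>@,;:\\"[]/?='

# Assumption: "ASCII printable" means octets 0x21 through 0x7E (no space).
TAG_CHARS = "".join(chr(c) for c in range(0x21, 0x7F) if chr(c) not in EXCLUDED)
TAG = "[" + re.escape(TAG_CHARS) + "]+"
CODES = r"[\x21-\x3E\x40-\x7E]+"  # printable ASCII except "?" (0x3F)

ENCODED_WORD = re.compile(
    r"=\?(" + TAG + r")\?(" + TAG + r")\?(" + CODES + r")\?="
)


def parse_encoded_word(text):
    """Return (charset, encoding, codes), or None if text is not an encoded word."""
    m = ENCODED_WORD.fullmatch(text)
    return m.groups() if m else None
```

          Because code-char excludes only "?", the codes field may itself
          contain "=", so a string such as =?ISO-8859-1?Q?Andr=E9?= splits
          cleanly at the "?" delimiters.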

          An encoded word is a sequence of ASCII printable  characters
          that  specifies the character set, encoding method, and bits
          of (potentially) non-ASCII characters.   Encoded  words  are
          allowed  only in certain positions in certain headers.  Spe-
          cific headers impose restrictions on the content of  encoded
          words beyond that specified in this section.  Posting agents
          MUST ensure that any material  resembling  an  encoded  word
          (complete  with  all delimiters), in a context where encoded
          words may appear, really is an encoded word.

               NOTE: The  syntax  is  a  bit  ugly,  but  it  was
               designed  to  minimize  chances  of confusion with
               legitimate header contents, and to satisfy  diffi-
               cult constraints on use within existing headers.

          An  encoded word MUST NOT be more than 75 octets long.  Each
          line of a header containing encoded word(s) MUST be at  most
          76 octets long, not counting the EOL.
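
          A posting agent might check both limits along the following
          lines.  This is a minimal sketch, not a definitive implementa-
          tion: the pattern LOOSE_ENCODED_WORD is a simplification that
          does not enforce the tag-char exclusions, the function name
          length_problems is hypothetical, and the line is assumed to be
          pure ASCII so that characters and octets coincide.

```python
import re

# Simplified pattern (an assumption for this sketch): it does not enforce
# the full tag-char exclusions from the grammar.
LOOSE_ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[^?\s]+\?[^?\s]+\?=")

MAX_WORD_OCTETS = 75  # limit on a single encoded word
MAX_LINE_OCTETS = 76  # limit on a line containing encoded words, EOL excluded


def length_problems(line):
    """Return a list of limit violations for one (physical) header line."""
    line = line.rstrip("\r\n")  # the EOL is not counted
    words = LOOSE_ENCODED_WORD.findall(line)
    problems = []
    for word in words:
        # Headers are ASCII, so len() of the string equals the octet count.
        if len(word) > MAX_WORD_OCTETS:
            problems.append("encoded word is %d octets long" % len(word))
    # The line limit applies only to lines that contain encoded words.
    if words and len(line) > MAX_LINE_OCTETS:
        problems.append("line is %d octets long" % len(line))
    return problems
```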

               NOTE:  These  limits are meant to bound the looka-
               head needed to determine whether text that  begins
               "=?" is really an encoded word.

          The  details  of  charsets and encodings are defined by MIME
          [rrr]; the sequence of preferred character sets is the  same
          as  MIME's.   Encoded  words  SHOULD NOT be used for content
          expressible in ASCII.
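
          One way a posting agent could honor this is to encode only
          when the text contains non-ASCII characters.  The sketch below
          is an assumption-laden illustration: the function name
          encode_if_needed is hypothetical, the charset is left to the
          caller rather than chosen by MIME's preference order, the "B"
          (base64) encoding is taken from MIME, and the 75-octet limit
          is not enforced.

```python
import base64


def encode_if_needed(text, charset="utf-8"):
    """Return text unchanged if it is pure ASCII, else one "B" encoded word.

    Assumption: the caller splits long text so that the result stays
    within the 75-octet limit; this sketch does not check it.
    """
    try:
        text.encode("ascii")
        return text  # expressible in ASCII: do not encode
    except UnicodeEncodeError:
        codes = base64.b64encode(text.encode(charset)).decode("ascii")
        return "=?%s?B?%s?=" % (charset, codes)
```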

          When an encoded word is used, other than in a newsgroup name
          (see  section  5.5),  it MUST be separated from any adjacent
          non-space characters  (including  other  encoded  words)  by
          white  space.   Reading  agents  displaying  the contents of
          encoded words (as opposed  to  their  encoded  form)  should
          ignore white space adjacent to encoded words.

               UNRESOLVED  ISSUE:  Should this section be deleted
               entirely, or made much more terse?   The  material
               is relevant, but too complex to discuss fully.

               NOTE: The deletion of intervening white space per-
               mits using multiple encoded words, implicitly con-
               catenated  by  the  deletion,  to encode text that
               will not fit within a single 75-character  encoded
               word.
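
          The implicit concatenation can be sketched as follows.  This
          is an illustration under stated assumptions, not a complete
          reading agent: it deletes only the white space lying between
          two encoded words (the case the NOTE describes) and leaves
          white space next to ordinary text alone; it uses a simplified
          pattern for encoded words; and it relies on the "Q" and "B"
          encodings as defined by MIME, which this section does not
          itself specify.  The names display_form and decode_word are
          hypothetical.

```python
import base64
import re

# Simplified encoded-word pattern (assumption: tag-char rules not enforced).
WORD = re.compile(r"=\?([^?\s]+)\?([^?\s]+)\?([^?\s]+)\?=")
# White space separating the end of one encoded word from the start of the next.
GAP = re.compile(r"(\?=)[ \t\r\n]+(=\?)")


def _decode_q(codes):
    # MIME "Q" encoding: "_" stands for a space, "=XX" for a hex-coded octet.
    data = codes.replace("_", " ").encode("ascii")
    return re.sub(rb"=([0-9A-Fa-f]{2})",
                  lambda h: bytes([int(h.group(1), 16)]), data)


def decode_word(m):
    charset, encoding, codes = m.group(1), m.group(2).upper(), m.group(3)
    if encoding == "B":
        raw = base64.b64decode(codes)
    elif encoding == "Q":
        raw = _decode_q(codes)
    else:
        return m.group(0)  # unknown encoding: show the encoded form as-is
    return raw.decode(charset)


def display_form(value):
    """Decode encoded words, first deleting white space between adjacent ones."""
    joined = GAP.sub(r"\1\2", value)
    return WORD.sub(decode_word, joined)
```

          With this sketch, a header value carrying two adjacent encoded
          words separated by a space is displayed as a single run of
          text, while a space that the author intends to show must be
          carried inside an encoded word (for example as "_" in the "Q"
          encoding).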

          Reading-agent  implementors  are  warned  that although this
          Draft completely specifies where encoded words may appear in
          the  headers  it  defines, there are other headers (e.g. the
          MIME Content-Description header) that MAY contain them.