* Latin 1 in non-MIME news postings?
From: Kai Grossjohann @ 1998-09-02 7:33 UTC

I just read a news article with Latin 1 characters in it with pGnus
0.13 (Emacs 20.3). Latin 1 characters were displayed as \888 octal
escapes. The news article didn't have any MIME headers.

Is this the correct behavior? I have (set-language-environment
"Latin-1") in my init files.

kai
-- 
OOP: object oriented programming; OOPS: object oriented mistakes
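For readers unfamiliar with the symptom: a display layer that treats bytes above 0x7F as raw, undecoded data commonly shows them as a backslash followed by three octal digits, while decoding the same bytes as Latin 1 yields the intended characters. A minimal Python sketch of the difference follows. The helper name `show_undecoded` and the sample word are assumptions made for illustration; this only mimics the escape style and is not Emacs or Gnus code.

```python
def show_undecoded(raw: bytes) -> str:
    """Mimic an undecoded display: bytes below 0x80 are shown as-is,
    anything from 0x80 up as a backslash plus three octal digits."""
    return "".join(chr(b) if b < 0x80 else "\\%03o" % b for b in raw)


# Sample body bytes, assumed to be Latin 1 with no charset label attached.
raw = "Grüße".encode("latin-1")

print(show_undecoded(raw))    # what an undecoded display would show
print(raw.decode("latin-1"))  # what the reader expected to see
```

The bytes on the wire are identical in both cases; only the reader's assumption about the charset changes.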
* Re: Latin 1 in non-MIME news postings?
From: Jost Krieger @ 1998-09-02 9:49 UTC

>>>>> "Kai" == Kai Grossjohann <grossjohann@amaunet.cs.uni-dortmund.de> writes:

> I just read a news article with Latin 1 characters in it with pGnus
> 0.13 (Emacs 20.3). Latin 1 characters were displayed as \888 octal
> escapes. The news article didn't have any MIME headers.

So how should gnus know they are Latin1 characters ?

> Is this the correct behavior? I have (set-language-environment
> "Latin-1") in my init files.

That might be a hint. On the other hand, those just-send-8-bit people
should stand out like a sore thumb.

Jost
-- 
| Jost.Krieger@ruhr-uni-bochum.de Please help stamp out spam! |
| Postmaster, JAPH, resident answer machine am RZ der RUB |
* Re: Latin 1 in non-MIME news postings?
From: Kai Grossjohann @ 1998-09-02 11:24 UTC
Cc: ding

>>>>> On 02 Sep 1998, Jost Krieger said:

Jost> So how should gnus know they are Latin1 characters ?

Hm. Maybe I confused HTML with news, here. In the HTML standard, it
says the document is Latin 1. I thought the news RFC also specified
Latin 1 as the default charset? Who knows more?

kai
-- 
OOP: object oriented programming; OOPS: object oriented mistakes
* Re: Latin 1 in non-MIME news postings?
From: Russ Allbery @ 1998-09-03 10:15 UTC

Kai Grossjohann <grossjohann@amaunet.cs.uni-dortmund.de> writes:
>>>>>> On 02 Sep 1998, Jost Krieger said:

> Jost> So how should gnus know they are Latin1 characters ?

> Hm. Maybe I confused HTML with news, here. In the HTML standard, it
> says the document is Latin 1. I thought the news RFC also specified
> Latin 1 as the default charset? Who knows more?

News specifies article body format follows RFC 822, which specifies 7bit
ASCII. Technically, MIME isn't even legal in news. In practice, most
people use MIME or just send 8bit.

It looks likely that the new news RFC will specify UTF-7 as a default but
strongly encourage use of MIME charset tagging.

-- 
Russ Allbery (rra@stanford.edu) <URL:http://www.eyrie.org/~eagle/>
[parent not found: <x7soi9yx8s.fsf@peorth.gweep.net>]
* Re: Latin 1 in non-MIME news postings?
From: Russ Allbery @ 1998-09-03 16:47 UTC

Stainless Steel Rat <ratinox@peorth.gweep.net> writes:
> "RA" == Russ Allbery <rra@stanford.edu> writes:

> RA> News specifies article body format follows RFC 822, which specifies
> RA> 7bit ASCII. Technically, MIME isn't even legal in news.

> Wait.

> MIME in and of itself sits on top of RFC 822. MIME specifies that 8-bit
> data be encoded into a 7-bit format, usually base64.

> Recent incarnations of SMTP allow for 8-bit data over 8-bit clean
> networks between 8-bit clean MTAs, (ab)using aspects of MIME to
> accomplish this.

> Please do not confuse the two.

How am I confusing the two?

News specifies, by proxy, 7bit ASCII in article bodies. MIME, whether 7bit
or 8bit or what have you, technically does not apply to news and means
nothing in news, since the news standards (although somewhat obscure on
this point) seem to indicate that they do not adopt 822 extensions.

Therefore there is technically no standards-compliant way to send 8bit
data of any sort, even ISO 8859-1 characters, across Usenet, even in an
encoded form, since news says that base64 is just a stream of characters
like any other 7bit ASCII body and that you cannot reliably apply any
particular interpretation to the headers that claim otherwise.

Obviously this is widely ignored in practice.

-- 
Russ Allbery (rra@stanford.edu) <URL:http://www.eyrie.org/~eagle/>
[parent not found: <x7k93lw2lm.fsf@peorth.gweep.net>]
* Re: Latin 1 in non-MIME news postings?
From: Russ Allbery @ 1998-09-03 17:22 UTC

Stainless Steel Rat <ratinox@peorth.gweep.net> writes:
> "RA" == Russ Allbery <rra@stanford.edu> writes:

> RA> How am I confusing the two?

> Vanilla MIME is 100% compliant with RFC 822. 8-bit data is *NOT*
> allowed in a MIME message; it must be encoded into a 7-bit format. MIME
> is completely legal in news.

Yes, but legal is not what I'm talking about. It's completely legal to
send MIME-encoded data in news. However, the headers that tell you that
it's MIME-encoded data don't mean anything in news, and therefore it's
technically not legal to make assumptions based on their content. Like,
say, decoding articles.

It's a minor pedantic point, yes, but it's one of the things that annoys
me about RFC 1036. Probably one of the more minor ones.

-- 
Russ Allbery (rra@stanford.edu) <URL:http://www.eyrie.org/~eagle/>
* Re: Latin 1 in non-MIME news postings?
From: Richard Coleman @ 1998-09-03 20:48 UTC

> Vanilla MIME is 100% compliant with RFC 822. 8-bit data is *NOT* allowed
> in a MIME message; it must be encoded into a 7-bit format. MIME is
> completely legal in news.
>
> SMTP added the '8bit' transfer type. This is not a MIME standard type...
> that is, it is a standard MIME type for SMTP, not for MIME. That being the
> case, 8bit is valid *ONLY* for SMTP traffic.

This is not correct. RFC2045 clearly defines the type "8bit" as a valid
Content-Transfer-Encoding. Here is one of the relevant paragraphs from
RFC2045:

    The Content-Transfer-Encoding values "7bit", "8bit", and "binary"
    all mean that the identity (i.e. NO) encoding transformation has
    been performed. As such, they serve simply as indicators of the
    domain of the body data, and provide useful information about the
    sort of encoding that might be needed for transmission in a given
    transport system. The terms "7bit data", "8bit data", and "binary
    data" are all defined in Section 2.

The MIME standard does not force 7bit transport.

-- 
Richard Coleman
coleman@math.gatech.edu
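The three domains named in that paragraph can be told apart mechanically. The following is a rough Python sketch, assuming the Section 2 definitions of RFC 2045: "7bit" and "8bit" data both consist of lines of at most 998 octets separated by CRLF with no NULs and no bare CR or LF, and differ only in whether octets above 127 appear; everything else is "binary". The function name `cte_domain` is hypothetical, not part of any mail library.

```python
def cte_domain(body: bytes) -> str:
    """Classify a body as '7bit', '8bit' or 'binary' data,
    following the RFC 2045 Section 2 definitions (sketch only)."""
    lines = body.split(b"\r\n")
    # Lines may hold at most 998 octets between CRLF separators.
    short_lines = all(len(line) <= 998 for line in lines)
    # CR and LF may only occur together as a CRLF pair.
    bare_cr_or_lf = any(b"\r" in line or b"\n" in line for line in lines)
    if b"\x00" in body or not short_lines or bare_cr_or_lf:
        return "binary"
    if any(octet > 127 for octet in body):
        return "8bit"
    return "7bit"


print(cte_domain(b"plain ASCII text\r\n"))
print(cte_domain("Grüße\r\n".encode("latin-1")))
```

Note that the label only describes the body; whether a given transport will carry "8bit" data unharmed is a separate question, which is the point under dispute in this thread.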
* Re: Latin 1 in non-MIME news postings?
From: Karl Kleinpaste @ 1998-09-03 17:35 UTC

Stainless Steel Rat <ratinox@peorth.gweep.net> writes:
> MIME in and of itself sits on top of RFC 822. MIME specifies that 8-bit
> data be encoded into a 7-bit format, usually base64.
> Recent incarnations of SMTP allow for 8-bit data over 8-bit clean networks
> between 8-bit clean MTAs, (ab)using aspects of MIME to accomplish this.

And later writes:
> Vanilla MIME is 100% compliant with RFC 822. 8-bit data is *NOT* allowed
> in a MIME message; it must be encoded into a 7-bit format.

Nonsense, as even a minimal review of the RFCs shows.

MIME specifies no such thing as a 7bit encoding requirement. "Vanilla
MIME" is explicitly an extension of RFC822 beyond its original intended
domain. MIME specifies that 8bit data has the identity transformation
when "Content-Transfer-Encoding: 8bit" is present. MIME sits comfortably
atop both RFC822 format and RFC821 transport, as RFC-modified for 8bit
data passage (e.g., RFC1652, RFC2045) -- and the RFCs specifically state
that such formats and transports have been redefined and extended -- so
that it is by no means "(ab)using aspects of MIME" to do so.

For the pedantic, relevant RFC citations follow.

--karl

RFC 2045, _MIME Part 1_, Format of Internet Message Bodies, page 1, Abstract:

    ...This set of documents, collectively called the Multipurpose
    Internet Mail Extensions, or MIME, redefines the format of messages
                                       ^^^^^^^^^
    to allow for (1) textual message bodies in character sets other
    than US-ASCII...

    ...Because RFC 822 said so little about message bodies, these
    documents are largely orthogonal to (rather than a revision of)
    RFC 822.

Page 3, Introduction:

    One of the notable limitations of RFC 821/822 based mail systems is
    the fact that they limit the contents of electronic mail messages
    to relatively short lines (e.g. 1000 characters or less [RFC-821])
    of 7bit US-ASCII. This forces users to convert any non-textual data
    that they may wish to send into seven-bit bytes representable as
    printable US-ASCII characters...

Page 4:

    This document describes several mechanisms that combine to solve
    most of these problems...

    (3) A Content-Transfer-Encoding header field, which can be used to
    specify both the encoding transformation that was applied to the
    body and the domain of the result. Encoding transformations other
    than the identity transformation are usually applied to data in
    order to allow it to pass through mail transport mechanisms which
    may have data or character set limitations.

Page 14, Content-Transfer-Encoding Header Field:

    Many media types which could be usefully transported via email are
    represented, in their "natural" format, as 8bit character or binary
    data. Such data cannot be transmitted over some transfer
    protocols...

    ...Proper labelling of unencoded material in less restrictive
    formats for direct use over less restrictive transports is also
    desireable. This document specifies that such encodings will be
    indicated by a new "Content-Transfer-Encoding" header field. This
    field has not been defined by any previous standard.

Pages 15-16, Content-Transfer-Encoding Semantics:

    Three transformations are currently defined: identity, the
    "quoted-printable" encoding, and the "base64" encoding. The domains
    are "binary", "8bit" and "7bit".

    The Content-Transfer-Encoding values "7bit", "8bit", and "binary"
    all mean that the identity (i.e. NO) encoding transformation has
    been performed. As such, they serve simply as indicators of the
    domain of the body data...

    ...[E]stablishing only a single transformation into the "7bit"
    domain does not seem possible.

8bit data is perfectly legal, as-is, in a MIME context (i.e. identified as
such), when riding through an RFC1652 8bit transport.
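The three transformations cited from pages 15-16 (identity, quoted-printable, base64) can be compared side by side. Here is a short Python sketch using the standard `quopri` and `base64` modules; the sample word and the variable names are assumptions chosen for illustration.

```python
import base64
import quopri

# Latin 1 body bytes; two of the five octets are above 127.
raw = "Grüße".encode("latin-1")

identity = raw                    # "8bit": no transformation applied
qp = quopri.encodestring(raw)     # quoted-printable: =XX for 8bit octets
b64 = base64.b64encode(raw)       # base64: 3 octets -> 4 ASCII characters

for label, data in (("8bit", identity), ("quoted-printable", qp), ("base64", b64)):
    in_7bit_domain = all(octet < 128 for octet in data)
    print(f"{label:17} {data!r} fits 7bit domain: {in_7bit_domain}")

# Both non-identity transformations are reversible.
assert quopri.decodestring(qp) == raw
assert base64.b64decode(b64) == raw
```

Only the identity case leaves 8bit octets in the body, which is why it needs either an 8bit-clean transport or one of the other two transformations.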
* Re: Latin 1 in non-MIME news postings?
From: Hrvoje Niksic @ 1998-09-03 10:27 UTC

Jost Krieger <Jost.Krieger@ruhr-uni-bochum.de> writes:

> >>>>> "Kai" == Kai Grossjohann <grossjohann@amaunet.cs.uni-dortmund.de> writes:
>
> > I just read a news article with Latin 1 characters in it with pGnus
> > 0.13 (Emacs 20.3). Latin 1 characters were displayed as \888 octal
> > escapes. The news article didn't have any MIME headers.
>
> So how should gnus know they are Latin1 characters ?

It's a good idea to assume Latin1 when nothing is specified. The
assumption does no harm, and gets things right in most of the cases.

-- 
Hrvoje Niksic <hniksic@srce.hr> | Student at FER Zagreb, Croatia
--------------------------------+--------------------------------
Those who like sausages, laws, and standards are well advised not to
learn how they are made.
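The fallback suggested here can be sketched in a few lines of Python with the standard `email` package: honour a MIME charset label when one is present, otherwise assume Latin 1. The function name `decode_body` and its `default_charset` parameter are hypothetical; the sketch handles single-part articles only and says nothing about how Gnus itself implements this.

```python
from email import message_from_bytes


def decode_body(article: bytes, default_charset: str = "latin-1") -> str:
    """Decode a single-part article body, falling back to
    default_charset when no MIME charset is declared."""
    msg = message_from_bytes(article)
    # get_content_charset() returns None when no charset parameter exists.
    charset = msg.get_content_charset() or default_charset
    # decode=True undoes base64 / quoted-printable and returns raw bytes.
    payload = msg.get_payload(decode=True)
    return payload.decode(charset, errors="replace")


# An unlabelled 8bit article: no MIME headers at all.
unlabelled = b"Subject: test\n\nGr\xfc\xdfe\n"
print(decode_body(unlabelled))
```

Since every byte value is a valid Latin 1 character, the fallback never fails outright; the cost of a wrong guess is mis-rendered text rather than an error.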
* Re: Latin 1 in non-MIME news postings?
From: jean-luc cassel @ 1998-09-02 10:42 UTC

/ Kai Grossjohann <grossjohann@amaunet.cs.uni-dortmund.de> :

> I just read a news article with Latin 1 characters in it with pGnus
> 0.13 (Emacs 20.3). Latin 1 characters were displayed as \888 octal
> escapes. The news article didn't have any MIME headers.
>
> Is this the correct behavior? I have (set-language-environment
> "Latin-1") in my init files.

[I'm french]
with pgnus 0.7-emacs 20.2, no problem even if no MIME headers, with only
in .emacs :
(standard-display-european 1)
[and (gnus-strict-mime t)]
* Re: Latin 1 in non-MIME news postings?
From: Lars Magne Ingebrigtsen @ 1998-09-02 12:25 UTC

Kai Grossjohann <grossjohann@amaunet.cs.uni-dortmund.de> writes:

> I just read a news article with Latin 1 characters in it with pGnus
> 0.13 (Emacs 20.3). Latin 1 characters were displayed as \888 octal
> escapes. The news article didn't have any MIME headers.

Do you get that even if you `C-u g' the article to avoid any decoding
on Gnus' part?

-- 
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: Latin 1 in non-MIME news postings?
From: Kai Grossjohann @ 1998-09-07 20:15 UTC

>>>>> Kai Grossjohann <grossjohann@amaunet.cs.uni-dortmund.de> writes:

Kai> I just read a news article with Latin 1 characters in it with
Kai> pGnus 0.13 (Emacs 20.3). Latin 1 characters were displayed as
Kai> \888 octal escapes. The news article didn't have any MIME
Kai> headers.

>>>>> On 02 Sep 1998, Lars Magne Ingebrigtsen said:

Lars> Do you get that even if you `C-u g' the article to avoid any
Lars> decoding on Gnus' part?

Dunno. Seems to have disappeared between 0.13 and 0.17. Was traveling
the past few days, so the point is moot now, I guess.

kai
-- 
OOP: object oriented programming; OOPS: object oriented mistakes