9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] char encoding problem
From: Mathieu Lonjaret @ 2008-07-11 15:55 UTC (permalink / raw)
  To: 9fans

Hello 9fans,

I have a problem with one specific e-mail in p9p acme Mail: only some
part of it is displayed (~1/2) when I open it (right click) in acme
Mail. According to mutt, the encoding is iso-8859-15 and I can read it
fine there.
There does not seem to be anything special in the first line which is
"not there" in acme Mail; some accented characters but not different
from other ones which appear sooner in the mail.

As turjo advised me on #plan9, I tried
'9p read mail/mbox/n/raw'
in an acme win. That output the whole message; however, the accented
chars were replaced by runes like this one: �
I also fetched the latest p9p (20080711); same behaviour.

Any idea about what's going on please?

Cheers,
Mathieu.





* Re: [9fans] char encoding problem
From: andrey mirtchovski @ 2008-07-11 15:56 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

you can try using tcs to convert the email to utf. the man page
doesn't say whether tcs speaks 8859-15, but the p9p program says it
does (tcs -lv):

http://swtch.com/plan9port/man/man1/tcs.html


* Re: [9fans] char encoding problem
From: Mathieu Lonjaret @ 2008-07-11 19:35 UTC (permalink / raw)
  To: 9fans

Yes,
9p read mail/mbox/40/raw | tcs -f 8859-15
seems to be working fine, thanks.

Then any idea why acme Mail has a problem with this message?

Cheers,
Mathieu.
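
For readers following along, here is a minimal Python sketch of why the raw
bytes showed up as the replacement rune and why naming the source charset
fixed it. The sample string is made up for illustration (the real message
body is not shown in this thread), and Python's codecs only stand in for
what tcs does; this is not how tcs is implemented.

```python
# Hypothetical sample text; the real message body is not shown in the thread.
sample = "déjà vu, 10 €"
raw = sample.encode("iso-8859-15")  # bytes as they would sit in the mailbox

# Treating the bytes as UTF-8: each accented byte is invalid and is
# replaced by U+FFFD, the rune that showed up in the acme win.
as_utf8 = raw.decode("utf-8", errors="replace")

# Decoding with the charset mutt reported recovers the text; this is
# roughly the job that `tcs -f 8859-15` did in the pipeline above.
as_latin9 = raw.decode("iso-8859-15")
```

In this sample, three bytes (for é, à and €) are not valid UTF-8, so
as_utf8 contains three replacement runes while as_latin9 matches the
original string.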

* Re: [9fans] char encoding problem
From: erik quanstrom @ 2008-07-11 19:54 UTC (permalink / raw)
  To: 9fans

> Yes,
> 9p read mail/mbox/40/raw | tcs -f 8859-15
> seems to be working fine, thanks.
>
> Then any idea why acme Mail has a problem with this message?

likely the mime character set string is different
from what tcs expects.

- erik




* Re: [9fans] char encoding problem
From: Russ Cox @ 2008-07-14 16:31 UTC (permalink / raw)
  To: 9fans

> I have a problem with one specific e-mail in p9p acme Mail: only some
> part of it is displayed (~1/2) when I open it (right click) in acme
> Mail. According to mutt, the encoding is iso-8859-15 and I can read it
> fine there.
> There does not seem to be anything special in the first line which is
> "not there" in acme Mail; some accented characters but not different
> from other ones which appear sooner in the mail.

acme couldn't care less.
mailfs is converting the message to utf-8
(or not) according to the content headers.

9p read mail/mbox/40/raw |grep -i '^content'
might tell something useful,
like what character set the message
is claiming to be in.

russ
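
The idea of converting "according to the content headers" can be sketched
in Python as below. This is only an illustration of the general technique,
not mailfs's actual code: the function name body_as_text and the choice of
iso-8859-15 as a fallback are assumptions made for this example.

```python
from email import message_from_bytes


def body_as_text(raw_message: bytes, fallback: str = "iso-8859-15") -> str:
    """Decode a single-part message body using its declared charset.

    `fallback` is an assumption for this sketch; it is used when the
    header declares no charset, an empty one, or one that does not fit.
    """
    msg = message_from_bytes(raw_message)
    # None when there is no charset parameter, "" when it is empty.
    charset = msg.get_content_charset() or fallback
    # decode=True undoes the transfer encoding and returns raw bytes.
    payload = msg.get_payload(decode=True)
    try:
        return payload.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset name, or bytes that do not fit the declared one.
        return payload.decode(fallback, errors="replace")
```

A converter without such a fallback has to either pass the bytes through
untouched or give up part-way, which is one possible (unconfirmed) reading
of the symptom described above.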



* Re: [9fans] char encoding problem
From: Mathieu Lonjaret @ 2008-07-14 17:17 UTC (permalink / raw)
  To: 9fans

% 9p read mail/mbox/40/raw |grep -i '^content'
Content-Type: text/plain; charset=
Content-Transfer-Encoding: 8bit

Does it mean that the character set is in iso-8859-15 but the sender
of the message somehow set it wrongly in the headers to be iso-8859-1?
Hence mailfs is failing at trying to convert it from iso-8859-1 to utf-8
because it trusts what's written in the headers?

Thanks,
Mathieu.
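
On the 8859-1 versus 8859-15 question: the two charsets differ at only
eight byte values, and every byte value is valid in both, so a label that
is wrong in that particular way would normally swap a few characters
rather than make a conversion fail. A small Python check, illustrative
only and unrelated to how mailfs is implemented:

```python
# Byte values that decode differently under ISO-8859-1 and ISO-8859-15.
differing = [
    b for b in range(256)
    if bytes([b]).decode("iso-8859-1") != bytes([b]).decode("iso-8859-15")
]

# Every byte value is valid ISO-8859-1, so converting from it cannot
# fail outright; a wrong label only yields the wrong character.
euro_byte = b"\xa4"
mislabelled = euro_byte.decode("iso-8859-1")   # generic currency sign
correct = euro_byte.decode("iso-8859-15")      # euro sign
```

Whether this applies to the message in question is not established here,
since the grep output above shows no charset value at all.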

* Re: [9fans] char encoding problem
From: erik quanstrom @ 2008-07-14 18:38 UTC (permalink / raw)
  To: 9fans

> % 9p read mail/mbox/40/raw |grep -i '^content'
> Content-Type: text/plain; charset=
> Content-Transfer-Encoding: 8bit
>
> Does it mean that the character set is in iso-8859-15 but the sender
> of the message somehow set it wrongly in the headers to be iso-8859-1?
> Hence mailfs is failing at trying to convert it from iso-8859-1 to utf-8
> because it trusts what's written in the headers?

it's hard to tell from this.  the charset appears missing
entirely.  it may be that upas has misparsed the header.

you may need to read all of mail/mbox/40/rawheader
to find if the charset was really set to something.

if it is misparsed, i'd like to fix the problem.  you could
send me the raw header offline.

- erik
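
One way to check whether the charset "was really set to something" is to
parse the whole raw header rather than grep single lines, because a
header may be folded across lines and a line-oriented grep shows only the
first one. The sketch below uses Python's standard email package; the
function name declared_charset and its three-way result are assumptions
made for this example and say nothing about how upas parses headers.

```python
from email import message_from_bytes


def declared_charset(raw_header: bytes) -> str:
    """Return 'absent', 'empty', or the lower-cased charset name."""
    msg = message_from_bytes(raw_header)
    value = msg.get_param("charset")  # None when the parameter is missing
    if value is None:
        return "absent"
    if isinstance(value, tuple):      # RFC 2231 form: (charset, lang, value)
        value = value[2]
    return value.strip().lower() or "empty"
```

Fed the two Content lines quoted above, this reports "empty"; fed a
header whose charset sits on a continuation line, it reports the name.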



