* [9fans] pipeto.lib spool and encoding
From: Matthias Teege @ 2007-10-16 16:01 UTC
To: 9fans

Moin,

I am trying to sort incoming email with pipeto. My pipeto looks like this:

	#!/bin/rc
	USER=m
	. /mail/lib/pipeto.lib $*
	if(~ `{cat $D/from | tr A-Z a-z} admin@example.com) {
		spool admin
		exit 0
	}
	spool
	exit 0

It works, but the encoding is broken after delivery. The German
umlauts work if I remove pipeto.

What may be wrong?

Matthias

* Re: [9fans] pipeto.lib spool and encoding
From: Russ Cox @ 2007-10-18 20:10 UTC
To: 9fans

> It works but the encoding is broken after delivery. The "german
> umlauts" are working if I remove pipeto.

in /mail/lib/pipeto.lib, the line

	sed '/^$/,$ s/^From / From /' >$TMP.msg

needs to be replaced with a c program that does this
conversion without coercing its input text into utf-8.

russ

* Re: [9fans] pipeto.lib spool and encoding
From: erik quanstrom @ 2007-10-18 21:07 UTC
To: 9fans

>> It works but the encoding is broken after delivery. The "german
>> umlauts" are working if I remove pipeto.
>
> in /mail/lib/pipeto.lib, the line
>
> 	sed '/^$/,$ s/^From / From /' >$TMP.msg
>
> needs to be replaced with a c program that does this
> conversion without coercing its input text into utf-8.
>
> russ

i submitted a patch.

unfortunately, i think the patch is on the wrong track.
sed isn't coercing its input to utf-8. there's no active
conversion going on. plan 9 programs assume utf-8 input,
since plan 9 uses utf-8. the alternative is to start using
multiple character sets.

i think a better solution to this is to convert the incoming
message to utf-8 first. there are likely more problems similar
to this one, as plan 9 tools make valid assumptions that upas doesn't
honour.

- erik

* Re: [9fans] pipeto.lib spool and encoding
From: Russ Cox @ 2007-10-19 18:07 UTC
To: Fans of the OS Plan 9 from Bell Labs

> > in /mail/lib/pipeto.lib, the line
> >
> > 	sed '/^$/,$ s/^From / From /' >$TMP.msg
> >
> > needs to be replaced with a c program that does this
> > conversion without coercing its input text into utf-8.
> >
> > russ
>
> unfortunately, i think the patch is on the wrong track.
> sed isn't coercing its input to utf-8. there's no active
> conversion going on. plan 9 programs assume utf-8 input,
> since plan 9 uses utf-8.

i said coerce, not convert. sed is treating its input
as utf-8, like most plan 9 programs, but raw mail messages
might be in some other 8-bit ascii-compatible encoding.
so the bytes that are not valid utf-8 sequences are getting
mangled by the coercion into a Rune buffer.

> i think a better solution to this is to convert the incoming
> message to utf-8 first. there are likely more problems similar
> to this one as plan 9 tools make valid assumptions that upas doesn't
> honour.

most plan 9 tools are used on the upas presentation of a mailbox,
which *is* in utf-8. very few tools operate directly on the 8-bit
mail message. pipeto.lib is one of the few, and even there it
just works to get its input into an mbox and then invokes upas/fs.

attempting to perform any conversion of the raw message is a mistake.
you're almost guaranteed to lose some information, and with little
to no benefit (thanks to everything using upas/fs to access mail).

russ
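
[Editorial sketch] The mangling Russ describes can be reproduced outside Plan 9. The portable C below is not sed's code; it is a minimal illustration that assumes the tool decodes its input into code points, substitutes U+FFFD for any byte that does not start a well-formed UTF-8 sequence, and re-encodes on output. The names `decode_rune`, `encode_rune`, `coerce_utf8` and `RUNE_ERROR` are hypothetical, and the replacement value used by a given Plan 9 release is an assumption here.

```c
#include <assert.h>
#include <stddef.h>

enum { RUNE_ERROR = 0xFFFD };	/* assumed replacement code point */

/*
 * Decode one code point from s[0..n) into *r; return bytes consumed.
 * A malformed or truncated sequence consumes one byte and yields RUNE_ERROR.
 * Overlong forms and surrogates are not rejected; this is only a sketch.
 */
static size_t
decode_rune(const unsigned char *s, size_t n, unsigned long *r)
{
	unsigned long c;
	size_t need, i;

	assert(n > 0);
	c = s[0];
	if (c < 0x80) { *r = c; return 1; }
	else if ((c & 0xE0) == 0xC0) { need = 1; c &= 0x1F; }
	else if ((c & 0xF0) == 0xE0) { need = 2; c &= 0x0F; }
	else if ((c & 0xF8) == 0xF0) { need = 3; c &= 0x07; }
	else { *r = RUNE_ERROR; return 1; }

	if (n < need + 1) { *r = RUNE_ERROR; return 1; }
	for (i = 1; i <= need; i++) {
		if ((s[i] & 0xC0) != 0x80) { *r = RUNE_ERROR; return 1; }
		c = (c << 6) | (s[i] & 0x3F);
	}
	*r = c;
	return need + 1;
}

/* Encode code point r as UTF-8 into out; return bytes written (1 to 4). */
static size_t
encode_rune(unsigned long r, unsigned char *out)
{
	if (r < 0x80) {
		out[0] = (unsigned char)r;
		return 1;
	}
	if (r < 0x800) {
		out[0] = (unsigned char)(0xC0 | (r >> 6));
		out[1] = (unsigned char)(0x80 | (r & 0x3F));
		return 2;
	}
	if (r < 0x10000) {
		out[0] = (unsigned char)(0xE0 | (r >> 12));
		out[1] = (unsigned char)(0x80 | ((r >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (r & 0x3F));
		return 3;
	}
	out[0] = (unsigned char)(0xF0 | (r >> 18));
	out[1] = (unsigned char)(0x80 | ((r >> 12) & 0x3F));
	out[2] = (unsigned char)(0x80 | ((r >> 6) & 0x3F));
	out[3] = (unsigned char)(0x80 | (r & 0x3F));
	return 4;
}

/*
 * Round-trip in[0..n) through code points, as a rune-based filter would.
 * out must have room for 3*n bytes. Returns the number of bytes written.
 */
size_t
coerce_utf8(const unsigned char *in, size_t n, unsigned char *out)
{
	size_t i = 0, o = 0;
	unsigned long r;

	while (i < n) {
		i += decode_rune(in + i, n - i, &r);
		o += encode_rune(r, out + o);
	}
	return o;
}
```

Under these assumptions, valid UTF-8 passes through unchanged, while a Latin-1 "ü" (the single byte 0xFC) comes out as the three bytes EF BF BD. The original byte cannot be recovered afterwards, which is the information loss the message above warns about.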

* Re: [9fans] pipeto.lib spool and encoding
From: erik quanstrom @ 2007-10-19 18:22 UTC
To: 9fans

> most plan 9 tools are used on the upas presentation of a mailbox,
> which *is* in utf-8. very few tools operate directly on the 8-bit
> mail message. pipeto.lib is one of the few, and even there it
> just works to get its input into an mbox and then invokes upas/fs.
>
> attempting to perform any conversion of the raw message is a mistake.
> you're almost guaranteed to lose some information, and with little
> to no benefit (thanks to everything using upas/fs to access mail).

is there anything in the incoming charset that is useful?
what distinguishes it from noise?

- erik

* Re: [9fans] pipeto.lib spool and encoding
From: Steve Simon @ 2007-10-18 22:22 UTC
To: 9fans

> 	sed '/^$/,$ s/^From / From /' >$TMP.msg
>
> needs to be replaced with a c program that does this
> conversion without coercing its input text into utf-8.

I had one to hand, called /sys/src/cmd/upas/padfrom.c.
I think I wrote it, though it may have come from someone else
on the list; if so, I apologise.

	#include <u.h>
	#include <libc.h>
	#include <bio.h>

	void
	main(void)
	{
		char *p;
		Biobuf bi, bo;

		Binit(&bi, 0, OREAD);
		Binit(&bo, 1, OWRITE);

		/* copy the header unchanged, up to the first blank line */
		while ((p = Brdline(&bi, '\n')) != nil && Blinelen(&bi) > 1)
			Bwrite(&bo, p, Blinelen(&bi));
		Bputc(&bo, '\n');

		/* in the body, pad lines that start with "From " with a space */
		while ((p = Brdline(&bi, '\n')) != nil){
			if (Blinelen(&bi) >= 5 && strncmp(p, "From ", 5) == 0)
				Bputc(&bo, ' ');
			Bwrite(&bo, p, Blinelen(&bi));
		}
		Bterm(&bo);
		exits(nil);
	}
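
[Editorial sketch] For readers without a Plan 9 system, the same byte-wise approach can be tried in portable C. The function below is not padfrom.c; it is a minimal sketch that follows the sed expression quoted in the thread (from the first empty line onward, lines beginning with "From " gain a leading space) and works on a memory buffer so that no byte is ever interpreted as a character. The name `pad_from` and the buffer-based interface are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy in[0..n) to out, adding one space in front of every line that
 * starts with "From " once the first empty line has been seen.
 * Bytes are copied verbatim, so non-UTF-8 text is left untouched.
 * out must have room for 2*n bytes. Returns the number of bytes written.
 */
size_t
pad_from(const unsigned char *in, size_t n, unsigned char *out)
{
	size_t i = 0, o = 0;
	int in_body = 0;

	assert(in != NULL && out != NULL);
	while (i < n) {
		/* length of the current line, including '\n' when present */
		const unsigned char *nl = memchr(in + i, '\n', n - i);
		size_t len = nl ? (size_t)(nl - (in + i)) + 1 : n - i;

		if (in_body && len >= 5 && memcmp(in + i, "From ", 5) == 0)
			out[o++] = ' ';
		memcpy(out + o, in + i, len);
		o += len;

		/* the first empty line ends the header */
		if (!in_body && len == 1 && in[i] == '\n')
			in_body = 1;
		i += len;
	}
	return o;
}
```

Because the function only compares and copies bytes, a Latin-1 body line such as "f\xFCr" is written out exactly as it was read, and a "From " line in the header is left alone.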

* Re: [9fans] pipeto.lib spool and encoding
From: geoff @ 2007-10-18 22:34 UTC
To: 9fans

This logic is already in too many places in upas, but upas/deliver,
minus the logging, should do the job.