Gnus development mailing list
From: Mark Thomas <swoon@bellatlantic.net>
Subject: Re: nnml splitting on encoded headers
Date: Tue, 28 May 2002 18:17:54 -0400
Message-ID: <wf18z64ar08.fsf@svelte.home>
In-Reply-To: <87off09f2s.fsf@nwalsh.com>


On Tue, 28 May 2002, ndw@nwalsh.com wrote:

> Looking at some representative asian spam on my machine, C-u g
> doesn't display the encoding in the subject, instead I see things
> like this:
> 
>   Message-Id: <200205281859.OAA05629@nexus.berkshire.net>
>   Reply-To: no@kojein.com
>   From: ¿¹½º¸Ç<yes@kojein.com>
>   To: ndw@nwalsh.com
>   Subject: (±¸ÀÎ,±¤°í)ÀçÅþ˹٠ÇϽǺР
>   Mime-Version: 1.0
>   Content-Type: text/html; charset="ks_c_5601-1987"

For this particular message, I would use rules:
    ("mail.spam.asian"     "^content-type:.*\\beuc-kr\\b")
    ("mail.spam.asian"     "^content-type:.*\\bks_c_5601-1987\\b")
Edit as necessary to turn those into fancy split rules.  Luckily, the
headers contained a charset for those split rules to match.
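
As a rough sketch of what the fancy-split version of those two rules
might look like, assuming the (FIELD VALUE SPLIT) form documented for
nnmail-split-fancy: the fallback group "mail.misc" is a placeholder,
not something from my setup, and the explicit \\b anchors are dropped
on the assumption that fancy splitting already matches VALUE as a
complete word.

```elisp
;; Sketch only: "mail.misc" is a hypothetical catch-all group.
(setq nnmail-split-methods 'nnmail-split-fancy
      nnmail-split-fancy
      '(| ;; FIELD is the header name, VALUE a regexp matched in its body.
          ("content-type" "euc-kr" "mail.spam.asian")
          ("content-type" "ks_c_5601-1987" "mail.spam.asian")
          ;; Everything else falls through to the default group.
          "mail.misc"))
```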

Sometimes I get spam where the Content-Type is multipart/alternative
and there is no charset listed in the headers.  For these, I use the
following rule to catch un-encoded spam:
    ("mail.spam.asian"     "^subject:.*[¡-ÿ]\\{4,\\}")
I figure any mail with four or more high-bit characters in a row in
the subject is probably not one I'm going to be able to read.
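
To illustrate what the \\{4,\\} quantifier does, here is a small check
that can be evaluated in *scratch*.  It assumes the high-bit bytes show
up as characters in the U+00A1..U+00FF range; how raw 8-bit bytes are
represented in the split buffer depends on the Emacs version, so treat
this as a demonstration of the regexp, not of Gnus itself.

```elisp
;; Sketch only: the sample subjects are made up for illustration.
(let ((case-fold-search t)   ; header names are matched case-insensitively
      (rule "^subject:.*[\u00a1-\u00ff]\\{4,\\}"))
  (list
   ;; Four high-bit characters in a row: the rule matches (returns 0).
   (string-match rule "Subject: \u00c0\u00e7\u00c5\u00fe")
   ;; Only three in a row: no match (returns nil).
   (string-match rule "Subject: \u00c0\u00e7\u00c5")))
```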

I've tried to use the rules
    ("mail.spam.asian"     "^subject:.*=\\?euc-kr\\?")
    ("mail.spam.asian"     "^subject:.*=\\?ks_c_5601-1987\\?")
to catch properly encoded headers, but Gnus decodes the message's
headers before it looks at the split rules (at least for back ends that
use nnmail-article-group) so these rules will never match.
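
The effect can be reproduced outside of splitting.  The following is
only a sketch: it uses rfc2047-decode-string from rfc2047.el on a
made-up iso-8859-1 subject (chosen because that charset is always
available, not because it appears in the spam) to show that the
"=?charset?" marker is gone once the header has been decoded.

```elisp
(require 'rfc2047)

;; Sketch only: the header below is a made-up example.
(let ((raw "Subject: =?iso-8859-1?Q?caf=E9?=")
      (rule "=\\?iso-8859-1\\?"))
  (list
   ;; Against the raw header the rule finds the encoded-word marker.
   (string-match rule raw)
   ;; After decoding there is no "=?...?" left, so this is nil.
   (string-match rule (rfc2047-decode-string raw))))
```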

Re-encoding the headers with
   (add-hook 'nnmail-split-hook 'rfc2047-encode-message-header)
lets those rules work.  However, this hook also encodes the previously
unencoded headers, so my match on high-bit-characters no longer works.
Sigh.

The unencoded Subject headers I receive far outnumber the encoded
ones, so I removed the function from the nnmail-split-hook.
This will work until I get too many properly encoded spams, in which
case I'll just yank the decoding call out of nnmail-article-group.
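
For reference, taking the function back off the hook is the mirror
image of the add-hook call; a minimal sketch, assuming it was added
exactly as shown earlier:

```elisp
;; Undo the earlier add-hook so headers are no longer re-encoded
;; before the split rules are consulted.
(remove-hook 'nnmail-split-hook 'rfc2047-encode-message-header)
```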

-Mark



