Gnus development mailing list
From: Eric Abrahamsen <eric@ericabrahamsen.net>
To: ding@gnus.org
Subject: Re: sometime splits
Date: Fri, 02 Nov 2012 15:50:24 +0800
Message-ID: <87wqy4qw8v.fsf@ericabrahamsen.net>
In-Reply-To: <877gy5n356.fsf@ericabrahamsen.net>

So, months after first having this problem, I think I've finally figured
out what's going on. To recap, I have this split in
`nnmail-split-fancy':

("from" "info@paper-republic.org"
 (| ("subject" "New Comment"
     (| ("subject" ,(rx "MARKED SPAM" eol) "mail.PRSpam")
        "mail.PRham"))))

When messages come in with "MARKED SPAM" at the end of the subject
header, this _sometimes_ matches, and sometimes doesn't.

These messages are sent by a Django website, through the Google Apps
email service.

I figured out that if there are non-ASCII characters in the subject
header, something (probably Google's mail service) messes with the
header. Using "C-u g" in the summary buffer shows that a pure-ASCII
subject header looks just like you'd expect it to, while a header
containing non-ASCII characters ends up actually looking like this:

--8<---------------cut here---------------start------------->8---
Subject: =?utf-8?q?=5BPaper_Republic=5D_New_Comment_on_French_Rendition_of_Fan_Wen?=
	=?utf-8?b?4oCZcyDigJxIYXJtb25pb3VzIExhbmTigJ0gdG8gTGF1bmNoIGJ5IGVhcmx5?=
	=?utf-8?q?_2013_MARKED_SPAM?=
--8<---------------cut here---------------end--------------->8---

Not surprisingly, the regexp produced by (rx "MARKED SPAM" eol) fails to
match this: the header now ends in an extra "?=", and there is an
underscore rather than a space between MARKED and SPAM.
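
To make the mismatch concrete outside of Gnus, here is a minimal sketch
in Python (picked only because the sending site is Django; this is not
how Gnus itself handles headers). It assumes the raw header is exactly
the one quoted above, which is in RFC 2047 encoded-word form, and uses
the standard library's email.header module to decode it. The variable
names are made up for the example, and the pattern is only a rough
stand-in for the rx form.

```python
import re
from email.header import decode_header, make_header

# Raw Subject value as quoted above: three RFC 2047 encoded-words
# spread over folded header lines.
raw_subject = (
    "=?utf-8?q?=5BPaper_Republic=5D_New_Comment_on_French_Rendition_of_Fan_Wen?=\n"
    "\t=?utf-8?b?4oCZcyDigJxIYXJtb25pb3VzIExhbmTigJ0gdG8gTGF1bmNoIGJ5IGVhcmx5?=\n"
    "\t=?utf-8?q?_2013_MARKED_SPAM?="
)

# Rough Python stand-in for (rx "MARKED SPAM" eol).
pattern = re.compile(r"MARKED SPAM$")

# Decode the encoded-words into an ordinary Unicode string.
decoded_subject = str(make_header(decode_header(raw_subject)))

# Try the same pattern against the literal and the decoded header.
raw_matches = pattern.search(raw_subject) is not None
decoded_matches = pattern.search(decoded_subject) is not None
print(raw_matches, decoded_matches)
```

With the header above, the literal string should not match and the
decoded one should, which is the same difference I'm seeing between the
two kinds of message.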

That underscore means I would need two different rules for the
differently-encoded headers. Is there anything built into Gnus that
might allow me to somehow translate this header into a "real" UTF-8
string, instead of what Google gives me? Or have the split performed on
the decoded string, rather than the literal string?

At any rate, I'm pleased to know that I'm not actually crazy.

E



Eric Abrahamsen <eric@ericabrahamsen.net> writes:

> On Tue, Mar 27 2012, Russ Allbery wrote:
>
>> Eric Abrahamsen <eric@ericabrahamsen.net> writes:
>>
>>> I'm having an irritating issue where one type of common email message
>>> gets split incorrectly. I run a website that emails me automatically
>>> with spam notifications, so I can catch false positives before they're
>>> automatically deleted. The top of my `nnmail-split-fancy' looks like
>>> this:
>>
>>> '(|
>>>   ("From" "info@paper-republic.org"
>>>     (| ("Subject" "\\[Paper Republic\\]"
>>
>> This kept catching me too.  You have to be careful about regexes; Gnus
>> adds an implicit word boundary on either end of the regex, but Emacs
>> doesn't consider the transition from a non-alphanumeric to another
>> non-alphanumeric to be a word boundary.  So if your regex begins or ends
>> with some non-alphanumeric characters, the regex won't match the way you
>> expect.
>>
>> Short version: change that to ".*\\[Paper Republic\\].*" and I bet it will
>> start working.
>
> Ooh, I'll give that a shot, thank you!



