From: Hobbit
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Attach file improvement
Date: Mon, 28 Feb 2011 16:59:05 +0200
Message-ID: <87oc5wb3ye.fsf@myhost.localdomain>
In-Reply-To: <87k4goeqw3.fsf@gnus.org> (Lars Ingebrigtsen's message of "Thu, 24 Feb 2011 19:26:20 -0800")
To: ding@gnus.org

Lars Ingebrigtsen writes:

> The FSF needs copyright assignment
papers in this case.  Would you be
> willing to sign such papers?

Yes, but not earlier than December 2011. Sorry. :(

Would you write a solution for the described problem if it is quick
enough to implement (unfortunately, that's not the #1 problem in the
TODO list)?

>> So how could we know what charset to use without asking a user?

> Right.  But if you just load that file into an Emacs buffer, does it
> interpret that file correctly according to your settings, or do you
> have to tell Emacs manually what encoding it is?  I guess the latter,
> if you normally handle files of both encodings.

Yes, I use the latter approach (telling Emacs manually).

> And in that case, perhaps the MIME code should only ask when it's not
> obvious by looking at the file what encoding it is?

Users are usually used to typing things like this

| C-c C-a
| Attach file: ~/file.txt
| Content type (default text/plain):
| Charset (default nil): cp855
| One line description: descr
| Disposition (default inline): attachment

automatically, and if Gnus did not ask for the charset each time, it
could be uncomfortable (because it breaks the reflex). So maybe the
best way is just to add another customization variable,
gnus-ask-for-file-charset, and set it to nil for you and to t for
people who need the Cyrillic alphabet.

I'll extend the aforementioned example. A file with the contents (in
hex codes)

0xCE, 0xC5, 0xD4

could be read not only using the koi8-r charset as

CYRILLIC SMALL LETTER EN
CYRILLIC SMALL LETTER IE
CYRILLIC SMALL LETTER TE

or using the windows-1251 charset as

CYRILLIC CAPITAL LETTER O
CYRILLIC CAPITAL LETTER IE
CYRILLIC CAPITAL LETTER EF

but also using iso-8859-1, where it is

LATIN CAPITAL LETTER I WITH CIRCUMFLEX
LATIN CAPITAL LETTER A WITH RING ABOVE
LATIN CAPITAL LETTER O WITH CIRCUMFLEX

We can't determine its real contents without some sophisticated
heuristics. At least for 8-bit encodings.
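
The ambiguity is easy to reproduce outside Emacs. Below is a minimal
sketch in Python (used here only for illustration, it is not part of
Gnus). It decodes the same three bytes with the standard codecs
koi8_r, cp1251 and latin_1 and lists the Unicode name of each resulting
character. The helper name decode_names is made up for this example.

```python
import unicodedata

# The three bytes from the example: 0xCE, 0xC5, 0xD4.
DATA = bytes([0xCE, 0xC5, 0xD4])


def decode_names(data, charset):
    """Decode data with charset and return the Unicode name of each character."""
    return [unicodedata.name(ch) for ch in data.decode(charset)]


if __name__ == "__main__":
    # Every one of these decodings succeeds, so the bytes alone do not
    # tell us which charset the author of the file had in mind.
    for charset in ("koi8_r", "cp1251", "latin_1"):
        print(charset, "->", DATA.decode(charset))
        for name in decode_names(DATA, charset):
            print("   ", name)
```

All three decodings succeed without error, which is exactly why no
decoder can pick the right one without extra knowledge.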
Only programs such as http://freshmeat.net/projects/enca/ can give a
general solution to this problem (excerpt from
http://linux.die.net/man/1/enca):

  enca ... uses knowledge about their language (must be supported by
  you) and a mixture of parsing, statistical analysis, guessing and
  black magic to determine their encodings, which it then prints to
  standard output (or it confesses it doesn't have any idea what the
  encoding could be).

How could a file's encoding be obvious from merely looking at the file
(without some clue from the user, at least via a customization
variable)?
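
For what it's worth, the "only ask when it's not obvious" idea can be
sketched for the narrow cases where the bytes really are
self-describing. The Python sketch below (illustration only, not Gnus
code) assumes that pure ASCII and valid UTF-8 count as "obvious" and
that everything else needs a prompt. The function name
needs_charset_prompt is hypothetical, and the UTF-8 assumption is not
airtight: a short 8-bit file can be valid UTF-8 by coincidence.

```python
def needs_charset_prompt(data: bytes) -> bool:
    """Return True when the bytes alone do not reveal the encoding.

    Hypothetical helper for illustration; not a Gnus or Emacs function.
    Assumption: ASCII and valid UTF-8 are treated as unambiguous, any
    other 8-bit content is treated as ambiguous.
    """
    try:
        # ASCII is a subset of UTF-8, so one strict decode covers both.
        data.decode("utf-8")
        return False
    except UnicodeDecodeError:
        # Bytes such as 0xCE 0xC5 0xD4 are not valid UTF-8 and could be
        # koi8-r, windows-1251, iso-8859-1, ...: the user has to say.
        return True


if __name__ == "__main__":
    print(needs_charset_prompt(b"plain ascii text"))
    print(needs_charset_prompt(bytes([0xCE, 0xC5, 0xD4])))
```

For the three bytes from my example this still answers "ask the user",
so it does not remove the need for a prompt or a customization variable
when 8-bit Cyrillic files are involved.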