Gnus development mailing list
* Attach file improvement
@ 2011-02-22 12:08 Hobbit
  2011-02-22 12:49 ` Hobbit
  0 siblings, 1 reply; 10+ messages in thread
From: Hobbit @ 2011-02-22 12:08 UTC (permalink / raw)
  To: ding

[-- Attachment #1: Type: text/plain, Size: 647 bytes --]

Gnus is a wonderful program. However, when you need to attach files in
many different charsets to your mail, it quickly becomes tiring to type
charset=<charset name> into

| type="text/plain" filename="~/file.txt" disposition=attachment description=description

by hand. So I wrote a patch for Gnus (lisp/mml.el), with which the
attachment process looks like:

| C-c C-a
| Attach file: ~/file.txt
| Content type (default text/plain): <RET>
| Charset (default nil): cp855
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| One line description: descr
| Disposition (default inline): attachment

Could you merge it into the Gnus repository?

Patch for lisp/mml.el:

[-- Attachment #2: patch contents --]
[-- Type: application/octet-stream, Size: 1041 bytes --]

1207a1208,1216
> (defun mml-minibuffer-read-charset (&optional default)
>   (let ((charset (completing-read
>                   (format "Charset (default %s): " default)
>                   (mapcar 'symbol-name charset-list)
>                   nil t nil nil default)))
>     (if (not (equal charset ""))
>         charset
>       default)))
> 
1280c1289,1290
< (defun mml-attach-file (file &optional type description disposition)
---
> (defun mml-attach-file (file &optional type description
>                              disposition charset)
1290c1300,1301
< body) or \"attachment\" (separate from the body)."
---
> body) or \"attachment\" (separate from the body). CHARSET is file
> charset."
1293a1305,1307
>       (charset (when (member (car (split-string type "/"))
>                              '("text" "message"))
>                  (mml-minibuffer-read-charset)))
1296c1310
<      (list file type description disposition)))
---
>      (list file type description disposition charset)))
1303a1318
>               'charset charset


* Re: Attach file improvement
  2011-02-22 12:08 Attach file improvement Hobbit
@ 2011-02-22 12:49 ` Hobbit
  2011-02-22 21:03   ` Hobbit
  0 siblings, 1 reply; 10+ messages in thread
From: Hobbit @ 2011-02-22 12:49 UTC (permalink / raw)
  To: ding

Hobbit <werehobbit@yandex.ru> writes:

> Could you merge it into the Gnus repository?
>
> Patch for lisp/mml.el:

Sorry for the inconvenience. Here is the patch in the recommended style:

diff --git a/lisp/mml.el b/lisp/mml.el
index 8b196fa..5403aab 100644
--- a/lisp/mml.el
+++ b/lisp/mml.el
@@ -1205,6 +1205,15 @@ If not set, `default-directory' will be used."
 	string
       default)))
 
+(defun mml-minibuffer-read-charset (&optional default)
+  (let ((charset (completing-read
+                  (format "Charset (default %s): " default)
+                  (mapcar 'symbol-name charset-list)
+                  nil t nil nil default)))
+    (if (not (equal charset ""))
+        charset
+      default)))
+
 (defun mml-minibuffer-read-description ()
   (let ((description (read-string "One line description: ")))
     (when (string-match "\\`[ \t]*\\'" description)
@@ -1294,7 +1303,8 @@ to specify options."
   :version "22.1" ;; Gnus 5.10.9
   :group 'message)
 
-(defun mml-attach-file (file &optional type description disposition)
+(defun mml-attach-file (file &optional type description
+                             disposition charset)
   "Attach a file to the outgoing MIME message.
 The file is not inserted or encoded until you send the message with
 `\\[message-send-and-exit]' or `\\[message-send]'.
@@ -1304,13 +1314,17 @@ content-type, a string of the form \"type/subtype\".  DESCRIPTION
 is a one-line description of the attachment.  The DISPOSITION
 specifies how the attachment is intended to be displayed.  It can
 be either \"inline\" (displayed automatically within the message
-body) or \"attachment\" (separate from the body)."
+body) or \"attachment\" (separate from the body). CHARSET is file
+charset."
   (interactive
    (let* ((file (mml-minibuffer-read-file "Attach file: "))
 	  (type (mml-minibuffer-read-type file))
+      (charset (when (member (car (split-string type "/"))
+                             '("text" "message"))
+                 (mml-minibuffer-read-charset)))
 	  (description (mml-minibuffer-read-description))
 	  (disposition (mml-minibuffer-read-disposition type nil file)))
-     (list file type description disposition)))
+     (list file type description disposition charset)))
   ;; Don't move point if this command is invoked inside the message header.
   (let ((head (unless (message-in-body-p)
 		(prog1
@@ -1318,6 +1332,7 @@ body) or \"attachment\" (separate from the body)."
 		  (goto-char (point-max))))))
     (mml-insert-empty-tag 'part
 			  'type type
+              'charset charset
 			  ;; icicles redefines read-file-name and returns a
 			  ;; string w/ text properties :-/
 			  'filename (mm-substring-no-properties file)




* Re: Attach file improvement
  2011-02-22 12:49 ` Hobbit
@ 2011-02-22 21:03   ` Hobbit
  2011-02-23  9:55     ` Lars Ingebrigtsen
  0 siblings, 1 reply; 10+ messages in thread
From: Hobbit @ 2011-02-22 21:03 UTC (permalink / raw)
  To: ding

Hobbit <werehobbit@yandex.ru> writes:

> Hobbit <werehobbit@yandex.ru> writes:
>
>> Could you merge it into the Gnus repository?
>>
>> Patch for lisp/mml.el:
>
> Sorry for the inconvenience. Here is the patch in the recommended style:
>

I've got some good advice, so I applied it and updated my patch.

By the way, this is how it works now:

| C-c C-a
| Attach file: ~/file.txt
| Content type (default text/plain): <RET>
| Charset: cp855
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| One line description: descr
| Disposition (default inline): attachment

If the charset prompt is left empty

| C-c C-a
| Attach file: ~/file.txt
| Content type (default text/plain): <RET>
| Charset:
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| One line description: descr
| Disposition (default inline): attachment

it simply doesn't insert a charset tag (and Gnus works just as it does
without this patch):

part type="text/plain" filename="~/file.txt" 
                          disposition=attachment description=descr

I.e., this patch affects only the user interface.

Patch:

diff --git a/lisp/mml.el b/lisp/mml.el
index 8b196fa..81b8256 100644
--- a/lisp/mml.el
+++ b/lisp/mml.el
@@ -1205,6 +1205,15 @@ If not set, `default-directory' will be used."
 	string
       default)))
 
+(defun mml-minibuffer-read-charset (&optional default)
+  (let ((charset (completing-read
+                  (format "Charset: " default)
+                  mm-mime-mule-charset-alist
+                  nil t nil nil default)))
+    (if (not (equal charset ""))
+        charset
+      default)))
+
 (defun mml-minibuffer-read-description ()
   (let ((description (read-string "One line description: ")))
     (when (string-match "\\`[ \t]*\\'" description)
@@ -1294,7 +1303,8 @@ to specify options."
   :version "22.1" ;; Gnus 5.10.9
   :group 'message)
 
-(defun mml-attach-file (file &optional type description disposition)
+(defun mml-attach-file (file &optional type description
+                             disposition charset)
   "Attach a file to the outgoing MIME message.
 The file is not inserted or encoded until you send the message with
 `\\[message-send-and-exit]' or `\\[message-send]'.
@@ -1304,13 +1314,17 @@ content-type, a string of the form \"type/subtype\".  DESCRIPTION
 is a one-line description of the attachment.  The DISPOSITION
 specifies how the attachment is intended to be displayed.  It can
 be either \"inline\" (displayed automatically within the message
-body) or \"attachment\" (separate from the body)."
+body) or \"attachment\" (separate from the body). CHARSET is file
+charset."
   (interactive
    (let* ((file (mml-minibuffer-read-file "Attach file: "))
 	  (type (mml-minibuffer-read-type file))
+      (charset (when (member (car (split-string type "/"))
+                             '("text" "message"))
+                 (mml-minibuffer-read-charset)))
 	  (description (mml-minibuffer-read-description))
 	  (disposition (mml-minibuffer-read-disposition type nil file)))
-     (list file type description disposition)))
+     (list file type description disposition charset)))
   ;; Don't move point if this command is invoked inside the message header.
   (let ((head (unless (message-in-body-p)
 		(prog1
@@ -1318,6 +1332,7 @@ body) or \"attachment\" (separate from the body)."
 		  (goto-char (point-max))))))
     (mml-insert-empty-tag 'part
 			  'type type
+              'charset charset
 			  ;; icicles redefines read-file-name and returns a
 			  ;; string w/ text properties :-/
 			  'filename (mm-substring-no-properties file)




* Re: Attach file improvement
  2011-02-22 21:03   ` Hobbit
@ 2011-02-23  9:55     ` Lars Ingebrigtsen
  2011-02-23 21:14       ` Hobbit
  0 siblings, 1 reply; 10+ messages in thread
From: Lars Ingebrigtsen @ 2011-02-23  9:55 UTC (permalink / raw)
  To: ding

Hobbit <werehobbit@yandex.ru> writes:

> I've got some good advice, so I applied it and updated my patch.

This patch is more than 10 lines long, so we'll need FSF copyright
assignment papers.  Do you have assignment papers on file with the FSF?

> | C-c C-a
> | Attach file: ~/file.txt
> | Content type (default text/plain): <RET>
> | Charset: cp855

The other thing I don't quite understand is that Emacs is usually able
to tell what the charset is automatically, so why does the user need to
be prompted here?  

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Attach file improvement
  2011-02-23  9:55     ` Lars Ingebrigtsen
@ 2011-02-23 21:14       ` Hobbit
  2011-02-25  3:26         ` Lars Ingebrigtsen
  0 siblings, 1 reply; 10+ messages in thread
From: Hobbit @ 2011-02-23 21:14 UTC (permalink / raw)
  To: ding

Lars Ingebrigtsen <larsi@gnus.org> writes:

> This patch is more than 10 lines long, so we'll need FSF copyright
> assignment papers.  Do you have assignment papers on file with the
> FSF?

Well, if you compare my patch with the original mml.el, you'll see
that it mostly consists of copies of existing Gnus code.

For example, fragments from mml.el:

;; mml.el line #1197
(defun mml-minibuffer-read-disposition (type &optional default filename)
  (unless default
    (setq default (mml-content-disposition type filename)))
  (let ((disposition (completing-read                      ;; <-- look
		      (format "Disposition (default %s): " default)
		      '(("attachment") ("inline") (""))
		      nil t nil nil default)))
    (if (not (equal disposition ""))
	disposition
      default)))

;;mml.el line #1280
(defun mml-attach-file (file &optional type description disposition)

;;mml.el line #1296
     (list file type description disposition)))

;; mml.el line #491
(if (and (not raw)
      (member (car (split-string type "/")) '("text" "message")))

And this is the patch:

+(defun mml-minibuffer-read-charset (&optional default);; copy&rename
+  (let ((charset (completing-read     ;;<----- variable renamed
+      (format "Charset: " default)    ;; <-- prompt changed
+                  mm-mime-mule-charset-alist
+                  nil t nil nil default))) ;; just copied
+    (if (not (equal charset ""))           ;; just copied
+        charset                            ;; just copied
+      default)))                           ;; just copied


+(defun mml-attach-file (file &optional type description ;; copied
+                             disposition charset);; <- param added

+body) or \"attachment\" (separate from the body). CHARSET is file
+charset."                          

+      (charset (when (member (car (split-string type "/"))
+                             '("text" "message")) ;; partial copy
+                 (mml-minibuffer-read-charset)));; from line #491


+  (list file type description disposition charset)));; var added
+            'charset charset                        ;; var added

So my authorship is much smaller than it might seem. The "real" weight
of the patch is about 10 lines (if not fewer).

>> | C-c C-a
>> | Attach file: ~/file.txt
>> | Content type (default text/plain): <RET>
>> | Charset: cp855
>

> The other thing I don't quite understand is that Emacs is usually
> able to tell what the charset is automatically, so why does the user
> need to be prompted here?

Unfortunately, in the Cyrillic world things are much more difficult,
because there are several popular charsets which use the same codes
for different Cyrillic letters.

For example, a file with contents (in hex codes)

0xCE, 0xC5, 0xD4

could be read using charset koi8-r as

CYRILLIC SMALL LETTER EN
CYRILLIC SMALL LETTER IE
CYRILLIC SMALL LETTER TE

whereas in charset windows-1251 it's

CYRILLIC CAPITAL LETTER O
CYRILLIC CAPITAL LETTER IE
CYRILLIC CAPITAL LETTER EF

So how could we know what charset to use without asking a user?
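
The ambiguity described above can be reproduced in Emacs itself. The
following is only a minimal sketch: `my-decode-names' is a hypothetical
helper (not part of Gnus or of the patch), and it assumes an Emacs build
that provides the `koi8-r' and `windows-1251' coding systems and the
Unicode character-name data.

```elisp
(defun my-decode-names (bytes coding)
  "Decode the unibyte string BYTES with CODING and return Unicode names.
Hypothetical helper for illustration; not part of Gnus."
  (mapcar (lambda (ch) (get-char-code-property ch 'name))
          (decode-coding-string bytes coding)))

;; The same three bytes, read under two different Cyrillic charsets.
(let ((bytes (unibyte-string #xCE #xC5 #xD4)))
  (list (my-decode-names bytes 'koi8-r)
        (my-decode-names bytes 'windows-1251)))
```

Evaluating the last form should give two different lists of character
names for the same byte sequence, matching the two readings listed in
the message.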




* Re: Attach file improvement
  2011-02-23 21:14       ` Hobbit
@ 2011-02-25  3:26         ` Lars Ingebrigtsen
  2011-02-28 14:59           ` Hobbit
  2011-02-28 18:59           ` Hobbit
  0 siblings, 2 replies; 10+ messages in thread
From: Lars Ingebrigtsen @ 2011-02-25  3:26 UTC (permalink / raw)
  To: ding

Hobbit <werehobbit@yandex.ru> writes:

> So my authorship is much smaller than it might seem. The "real" weight
> of the patch is about 10 lines (if not fewer).

The FSF needs copyright assignment papers in this case.  Would you be
willing to sign such papers?

> Unfortunately, in the Cyrillic world things are much more difficult,
> because there are several popular charsets which use the same codes
> for different Cyrillic letters.

[...]

> So how could we know what charset to use without asking a user?

Right.  But if you just load that file into an Emacs buffer, does it
interpret that file correctly according to your settings, or do you have
to tell Emacs manually what encoding it is?  I guess the latter, if you
normally handle files of both encodings.  And in that case, perhaps the
MIME code should only ask when it's not obvious by looking at the file
what encoding it is?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Attach file improvement
  2011-02-25  3:26         ` Lars Ingebrigtsen
@ 2011-02-28 14:59           ` Hobbit
  2011-03-05 12:27             ` Lars Magne Ingebrigtsen
  2011-02-28 18:59           ` Hobbit
  1 sibling, 1 reply; 10+ messages in thread
From: Hobbit @ 2011-02-28 14:59 UTC (permalink / raw)
  To: ding

Lars Ingebrigtsen <larsi@gnus.org> writes:

> The FSF needs copyright assignment papers in this case.  Would you be
> willing to sign such papers?

Yes, but not earlier than December 2011. Sorry. :( Would you write a
solution for the described problem yourself if it's quick enough to
implement (it's not the #1 problem in the TODO list, unfortunately)?

>> So how could we know what charset to use without asking a user?
> Right.  But if you just load that file into an Emacs buffer, does it
> interpret that file correctly according to your settings, or do you
> have to tell Emacs manually what encoding it is?  I guess the latter,
> if you normally handle files of both encodings.

Yes, I use the latter approach (telling manually).

> And in that case, perhaps the MIME code should only ask when it's not
> obvious by looking at the file what encoding it is?

Users usually get used to typing things like this

| C-c C-a
| Attach file: ~/file.txt
| Content type (default text/plain): <RET>
| Charset (default nil): cp855
| One line description: descr
| Disposition (default inline): attachment

automatically, and if Gnus did not ask for the charset every time, it
could be uncomfortable (because it breaks the reflex).

So maybe the best way is just to add another customization variable

gnus-ask-for-file-charset

and set it to nil for you and to t for people who need the Cyrillic
alphabet.
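
A rough sketch of what such an option could look like, under stated
assumptions: `gnus-ask-for-file-charset' is only the name proposed in
this message and does not exist in Gnus, `my-mml-maybe-read-charset' is
a hypothetical helper, and a plain `read-string' prompt stands in for
the completion used in the patch.

```elisp
(defcustom gnus-ask-for-file-charset nil
  "Non-nil means prompt for a charset when attaching a text file.
Hypothetical option taken from the proposal in this thread; it is not
part of Gnus."
  :type 'boolean
  :group 'message)

(defun my-mml-maybe-read-charset (type)
  "Return a charset name for content TYPE, or nil when none is wanted.
Prompt only when `gnus-ask-for-file-charset' is non-nil and TYPE is a
text or message type.  Hypothetical helper, not part of Gnus."
  (when (and gnus-ask-for-file-charset
             (member (car (split-string type "/")) '("text" "message")))
    (let ((charset (read-string "Charset: ")))
      ;; An empty answer means "no charset tag", as in the patch.
      (unless (string= charset "")
        charset))))
```

With the option left at nil, the helper never prompts, so users who do
not need the question keep the current attachment flow.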

I'll extend the aforementioned example. A file with contents (in hex
codes)

0xCE, 0xC5, 0xD4

could be read not only using the koi8-r charset as

CYRILLIC SMALL LETTER EN
CYRILLIC SMALL LETTER IE
CYRILLIC SMALL LETTER TE

or using the windows-1251 charset as

CYRILLIC CAPITAL LETTER O
CYRILLIC CAPITAL LETTER IE
CYRILLIC CAPITAL LETTER EF

but also using iso-8859-1 as

LATIN CAPITAL LETTER I WITH CIRCUMFLEX
LATIN CAPITAL LETTER A WITH RING ABOVE
LATIN CAPITAL LETTER O WITH CIRCUMFLEX

We can't understand its real contents without some sophisticated
heuristics. At least for 8-bit encodings.
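
To see what Emacs's own detection makes of such bytes, one can ask it
for every coding system it considers plausible and decode the bytes with
each. This is only an illustrative sketch: `my-candidate-readings' is a
hypothetical helper, and the candidates returned depend on the local
language environment and coding-system priorities, so no particular
result is claimed here.

```elisp
(defun my-candidate-readings (bytes)
  "Return an alist of (CODING . TEXT) readings of the unibyte string BYTES.
One entry is produced for each coding system that `detect-coding-string'
considers plausible, in priority order.  Hypothetical helper."
  (mapcar (lambda (coding)
            (cons coding (decode-coding-string bytes coding)))
          (detect-coding-string bytes)))

;; The three bytes from the example above.
(my-candidate-readings (unibyte-string #xCE #xC5 #xD4))
```

When several entries come back, each one is a syntactically valid
reading of the same bytes, and picking among them needs language
knowledge or a hint from the user.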

Only programs such as http://freshmeat.net/projects/enca/ can give a
general solution to this problem (excerpt from
http://linux.die.net/man/1/enca):

enca ... uses knowledge about their language (must be supplied by you)
and a mixture of parsing, statistical analysis, guessing and black magic
to determine their encodings, which it then prints to standard output
(or it confesses it doesn't have any idea what the encoding could be).

How could a file's encoding be obvious from merely looking at the file
(without some clue from the user, at least via a customization variable)?






* Re: Attach file improvement
  2011-02-25  3:26         ` Lars Ingebrigtsen
  2011-02-28 14:59           ` Hobbit
@ 2011-02-28 18:59           ` Hobbit
  2011-03-05 12:25             ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 10+ messages in thread
From: Hobbit @ 2011-02-28 18:59 UTC (permalink / raw)
  To: ding

Lars Ingebrigtsen <larsi@gnus.org> writes:

> The FSF needs copyright assignment papers in this case.  Would you be
> willing to sign such papers?

Where can I get them, and what exactly do I need to write? Anyway, I
need to finish my current project first.




* Re: Attach file improvement
  2011-02-28 18:59           ` Hobbit
@ 2011-03-05 12:25             ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 10+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-03-05 12:25 UTC (permalink / raw)
  To: ding

Hobbit <werehobbit@yandex.ru> writes:

> Where can I get them, and what exactly do I need to write?

Write to copyright-clerk@fsf.org and say that you want to get the
paperwork for copyright assignments for Emacs.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Attach file improvement
  2011-02-28 14:59           ` Hobbit
@ 2011-03-05 12:27             ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 10+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-03-05 12:27 UTC (permalink / raw)
  To: ding

Hobbit <werehobbit@yandex.ru> writes:

> | C-c C-a
> | Attach file: ~/file.txt
> | Content type (default text/plain): <RET>
> | Charset (default nil): cp855
> | One line description: descr
> | Disposition (default inline): attachment
>
> automatically, and if Gnus did not ask for the charset every time, it
> could be uncomfortable (because it breaks the reflex).

True, so adding the prompt unconditionally sounds like a good idea.

> We can't understand its real contents without some sophisticated
> heuristics. At least for 8-bit encodings.

Right.  I didn't know the situation in Russia was that complex.  :-)

> How could a file's encoding be obvious from merely looking at the file
> (without some clue from the user, at least via a customization variable)?

Some charsets are very easy to guess, like utf-8.
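
One way to see why UTF-8 is easy to recognize: its byte sequences follow
strict rules, so arbitrary 8-bit text rarely decodes cleanly. The sketch
below relies on Emacs keeping undecodable bytes as raw-byte characters
(code points #x3FFF80 and above); `my-valid-utf-8-p' is a hypothetical
helper, not a Gnus function, and it is not a full charset detector.

```elisp
(require 'seq)

(defun my-valid-utf-8-p (bytes)
  "Return non-nil if the unibyte string BYTES decodes cleanly as UTF-8.
Hypothetical helper for illustration; not part of Gnus."
  (let ((decoded (decode-coding-string bytes 'utf-8)))
    ;; Bytes that are not valid UTF-8 survive decoding as raw-byte
    ;; characters, which occupy the range #x3FFF80..#x3FFFFF.
    (not (seq-some (lambda (ch) (>= ch #x3FFF80)) decoded))))

;; The koi8-r bytes from this thread are not well-formed UTF-8.
(my-valid-utf-8-p (unibyte-string #xCE #xC5 #xD4))
```

A clean decode does not prove that the file is UTF-8, but a failed one
rules it out, which is what makes this charset comparatively easy to
guess.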

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




