* [PATCH] cvs 2005-09-11: html2text.el::html2text-replace-list updates
From: Jari Aalto @ 2005-09-11 8:19 UTC
Here is a small patch to update the list. I left out the "real" Euro and
Cent signs so that the text would be as close to pure ASCII as
possible. The patch is against CVS.
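For readers unfamiliar with how an entity alist of this shape is typically consumed, here is a minimal sketch. It is not the code in html2text.el; the names `my-html2text-demo-replace-list` and `my-html2text-demo-replace` are hypothetical and exist only for illustration. It assumes a plain literal search-and-replace pass per entry.

```elisp
;; Hypothetical demo table, a small subset in the same (ENTITY . TEXT) shape.
;; "&amp;" is deliberately last so that "&amp;lt;" is not decoded twice.
(defvar my-html2text-demo-replace-list
  '(("&nbsp;" . " ")
    ("&gt;" . ">")
    ("&lt;" . "<")
    ("&quot;" . "\"")
    ("&amp;" . "&"))
  "Illustrative alist mapping HTML entities to replacement text.")

(defun my-html2text-demo-replace (string alist)
  "Return STRING with every entity key in ALIST replaced by its text.
Entries are applied in list order, one full pass per entry."
  (with-temp-buffer
    (insert string)
    (dolist (pair alist)
      (goto-char (point-min))
      ;; Literal search; FIXEDCASE and LITERAL are both non-nil so the
      ;; replacement text is inserted exactly as written in the alist.
      (while (search-forward (car pair) nil t)
        (replace-match (cdr pair) t t)))
    (buffer-string)))
```

Evaluating `(my-html2text-demo-replace "a&nbsp;&lt;b&gt; &amp; &quot;c&quot;" my-html2text-demo-replace-list)` returns `a <b> & "c"`. Note that with sequential passes the position of "&amp;" in the list matters; that is a property of this sketch, not a claim about html2text.el.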
2005-09-11 Jari Aalto <jari dot aalto A T cante dot net>
* html2text.el: (html2text-replace-list): Added new entities.
Index: lisp/html2text.el
===================================================================
RCS file: /usr/local/cvsroot/gnus/lisp/html2text.el,v
retrieving revision 7.13
diff -u -IId: -b -w -u -r7.13 html2text.el
--- lisp/html2text.el 26 Aug 2005 00:05:02 -0000 7.13
+++ lisp/html2text.el 11 Sep 2005 07:36:20 -0000
@@ -43,8 +43,42 @@
(defvar html2text-format-single-element-list '(("hr" . html2text-clean-hr)))
(defvar html2text-replace-list
- '(("&nbsp;" . " ") ("&gt;" . ">") ("&lt;" . "<") ("&quot;" . "\"")
- ("&amp;" . "&") ("&apos;" . "'"))
+ '(("&acute;" . "`")
+ ("&amp;" . "&")
+ ("&apos;" . "'")
+ ("&brvbar;" . "|")
+ ("&cent;" . "c")
+ ("&circ;" . "^")
+ ("&copy;" . "(C)")
+ ("&curren;" . "(#)")
+ ("&deg;" . "degree")
+ ("&divide;" . "/")
+ ("&euro;" . "e")
+ ("&frac12;" . "1/2")
+ ("&gt;" . ">")
+ ("&iquest;" . "?")
+ ("&laquo;" . "<<")
+ ("&ldquo" . "\"")
+ ("&lsaquo;" . "(")
+ ("&lsquo;" . "`")
+ ("&lt;" . "<")
+ ("&mdash;" . "--")
+ ("&nbsp;" . " ")
+ ("&ndash;" . "-")
+ ("&permil;" . "%%")
+ ("&plusmn;" . "+-")
+ ("&pound;" . "£")
+ ("&quot;" . "\"")
+ ("&raquo;" . ">>")
+ ("&rdquo" . "\"")
+ ("&reg;" . "(R)")
+ ("&rsaquo;" . ")")
+ ("&rsquo;" . "'")
+ ("&sect;" . "§")
+ ("&sup1;" . "^1")
+ ("&sup2;" . "^2")
+ ("&sup3;" . "^3")
+ ("&tilde;" . "~"))
"The map of entity to text.
This is an alist were each element is a dotted pair consisting of an
* Re: [PATCH] cvs 2005-09-11: html2text.el::html2text-replace-list updates
From: Romain Francoise @ 2005-09-11 18:41 UTC
Cc: ding
Jari Aalto <jari.aalto@cante.net> writes:
> 2005-09-11 Jari Aalto <jari dot aalto A T cante dot net>
>
> * html2text.el: (html2text-replace-list): Added new entities.
Applied; thanks!
--
Romain Francoise <romain@orebokech.com> | The world is a fine place,
it's a miracle -- http://orebokech.com/ | and worth fighting for.
| --Ernest Hemingway
* Re: [PATCH] cvs 2005-09-11: html2text.el::html2text-replace-list updates
From: Reiner Steib @ 2005-09-19 16:21 UTC
On Sun, Sep 11 2005, Romain Francoise wrote:
> Jari Aalto <jari.aalto@cante.net> writes:
>
>> 2005-09-11 Jari Aalto <jari dot aalto A T cante dot net>
>>
>> * html2text.el: (html2text-replace-list): Added new entities.
>
> Applied; thanks!
I still doubt that maintaining yet another list of entities is a good
idea...
,----[ <news:v9oei6d4mv.fsf@marauder.physik.uni-ulm.de> ]
| From: Reiner Steib <reinersteib+gmane@imap.cc>
| Subject: Re: html2text
| To: <emacs-devel@gnu.org>
| Cc: [...] (Jari Aalto+mail.emacs)
| Date: Tue, 09 Nov 2004 23:44:24 +0100
| Message-ID: <v9oei6d4mv.fsf@marauder.physik.uni-ulm.de>
|
| [...]
| > (defvar html2text-replace-list
| > - '(("&nbsp;" . " ") ("&gt;" . ">") ("&lt;" . "<") ("&quot;" . "\"")
| > - ("&amp;" . "&") ("&apos;" . "'"))
| > + '(("&acute;" . "`")
|
| This should be "´".
|
| > + ("&amp;" . "&")
| > + ("&apos;" . "'")
| > + ("&brvbar;" . "|")
| > + ("&cent;" . "c")
| > + ("&circ;" . "^")
| > + ("&copy;" . "(C)")
| > + ("&curren;" . "¤")
| > + ("&deg;" . "degree")
| > + ("&divide;" . "/")
| > + ("&euro;" . "e")
| > + ("&frac12;" . "½")
| [...]
|
| It seems strange to use Latin-1 characters for some entities, but not
| for all encodable by Latin-1.
|
| On second thought, it looks like there are already more or less
| complete lists[1] e.g. in `mm-url-html-entities' (from Gnus),
| `sgml-char-names', `sgml-char-names-table', `iso-iso2sgml-trans-tab'
| (Emacs) or `w3m-entity-alist' (emacs-w3m).
|
| Probably one of these could be used. Hm, maybe the function
| `iso-sgml2iso' could be used in `html2text.el'?
|
| Bye, Reiner.
|
| [1] Might be checked with
| http://www.w3.org/TR/REC-html40/sgml/entities.html or other
| tables.
`----
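To make the idea of reusing an existing table concrete, here is a rough sketch that derives an (ENTITY . TEXT) alist from a table of (NAME . CODEPOINT) pairs instead of maintaining a second hand-written list. The table `my-entity-codepoints` and the function `my-entity-alist-from-codepoints` are hypothetical stand-ins defined here for illustration; the sketch does not call any of the lists named above, and whether their layout matches this one is an assumption that would need checking.

```elisp
;; Hypothetical stand-in for a shared entity table: (NAME . CODEPOINT).
;; nbsp is mapped to a plain space (32) here to keep the output ASCII.
(defvar my-entity-codepoints
  '((nbsp . 32) (amp . 38) (lt . 60) (gt . 62) (quot . 34))
  "Illustrative table of entity names and character codes.")

(defun my-entity-alist-from-codepoints (table)
  "Build an (\"&NAME;\" . \"CHAR\") alist from TABLE.
TABLE is a list of (NAME . CODEPOINT) pairs, NAME being a symbol."
  (mapcar (lambda (entry)
            (cons (format "&%s;" (car entry))        ; symbol -> "&name;"
                  (char-to-string (cdr entry))))     ; code   -> one-char string
          table))
```

With this, `(my-entity-alist-from-codepoints my-entity-codepoints)` yields an alist beginning `(("&nbsp;" . " ") ("&amp;" . "&") ...)`, so the entity names live in one place. Multi-character ASCII fallbacks such as "(C)" would still need a separate override list.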
Bye, Reiner.
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/
* Re: [PATCH] cvs 2005-09-11: html2text.el::html2text-replace-list updates
From: Romain Francoise @ 2005-09-19 16:44 UTC
Reiner Steib <reinersteib+gmane@imap.cc> writes:
> I still doubt that maintaining yet another list of entities is a good
> idea...
Sure, but in the meantime updating the list doesn't cost anything.
--
Romain Francoise <romain@orebokech.com> | I like the streets when
it's a miracle -- http://orebokech.com/ | they're empty, I can make the
| rest up.