Gnus development mailing list
* A washing function to turn Unicode punctuation into ASCII
@ 2010-11-08 23:27 Lars Magne Ingebrigtsen
  2010-11-09  8:55 ` Steinar Bang
  2010-11-12 23:24 ` Kevin Ryde
  0 siblings, 2 replies; 12+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-11-08 23:27 UTC
  To: ding

I was reading Hacker News on my E72 yesterday, and many of the articles
were kinda, er, not very readable, because characters like “ were
rendered as 0x+201c in the article buffer, or something as awful.
Probably because the font wasn't available?  Or something?

So my question is: Does Emacs have a function to, er, translate these
characters into their nearest ASCII equivalents?  Or should I just write
a Gnus washing function for that?  I'm guessing that there aren't more
than 20 commonly used characters that it would make sense to wash this
way... 
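
A minimal sketch of the kind of washing asked about here, written in
Python rather than Emacs Lisp purely for illustration.  The table
contents and the name wash_unicode_punctuation are assumptions, not
anything taken from Gnus or Emacs; the point is only that a small fixed
mapping covers the common cases.

```python
# Hypothetical table: a few common Unicode punctuation marks and the
# ASCII text each one is replaced with.  Not an exhaustive list.
PUNCTUATION_TO_ASCII = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2010": "-",    # hyphen
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # no-break space
}

# str.maketrans accepts single-character keys and string values of any length.
_WASH_TABLE = str.maketrans(PUNCTUATION_TO_ASCII)


def wash_unicode_punctuation(text: str) -> str:
    """Return TEXT with the characters in the table replaced by ASCII."""
    return text.translate(_WASH_TABLE)
```

Characters that are not in the table pass through unchanged, so the
function is safe to run on text that is already plain ASCII.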

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: A washing function to turn Unicode punctuation into ASCII
  2010-11-08 23:27 A washing function to turn Unicode punctuation into ASCII Lars Magne Ingebrigtsen
@ 2010-11-09  8:55 ` Steinar Bang
  2010-11-09 17:52   ` Lars Magne Ingebrigtsen
  2010-11-12 23:24 ` Kevin Ryde
  1 sibling, 1 reply; 12+ messages in thread
From: Steinar Bang @ 2010-11-09  8:55 UTC
  To: ding

>>>>> Lars Magne Ingebrigtsen <larsi@gnus.org>:

> So my question is: Does Emacs have a function to, er, translate these
> characters into their nearest ASCII equivalents?  Or should I just
> write a Gnus washing function for that?  I'm guessing that there
> aren't more than 20 commonly used characters that it would make sense
> to wash this way...

Doesn't that dumbquotes thing already do something like this?

I.e. deuglify.el...?







* Re: A washing function to turn Unicode punctuation into ASCII
  2010-11-09  8:55 ` Steinar Bang
@ 2010-11-09 17:52   ` Lars Magne Ingebrigtsen
  2010-11-09 18:01     ` Lawrence Mitchell
  0 siblings, 1 reply; 12+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-11-09 17:52 UTC
  To: ding

Steinar Bang <sb@dod.no> writes:

> Doesn't that dumbquotes thing already do something like this?

Yes, but only for dumbquotes.  :-)  The unicode->ascii equivalence thing
doesn't seem to exist in Gnus.  Or Emacs.  At least, I can't find
anything... 

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: A washing function to turn Unicode punctuation into ASCII
  2010-11-09 17:52   ` Lars Magne Ingebrigtsen
@ 2010-11-09 18:01     ` Lawrence Mitchell
  2010-11-09 18:16       ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 12+ messages in thread
From: Lawrence Mitchell @ 2010-11-09 18:01 UTC
  To: ding

Lars Magne Ingebrigtsen wrote:
> Steinar Bang <sb@dod.no> writes:

>> Doesn't that dumbquotes thing already do something like this?

> Yes, but only for dumbquotes.  :-)  The unicode->ascii equivalence thing
> doesn't seem to exist in Gnus.  Or Emacs.  At least, I can't find
> anything...


org-entities.el maybe?

Lawrence
-- 
Lawrence Mitchell <wence@gmx.li>





* Re: A washing function to turn Unicode punctuation into ASCII
  2010-11-09 18:01     ` Lawrence Mitchell
@ 2010-11-09 18:16       ` Lars Magne Ingebrigtsen
  2010-11-09 18:48         ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 12+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-11-09 18:16 UTC
  To: ding

Lawrence Mitchell <wence@gmx.li> writes:

> org-entities.el maybe?

Yes, that looks exactly like what I need; thanks.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: A washing function to turn Unicode punctuation into ASCII
  2010-11-09 18:16       ` Lars Magne Ingebrigtsen
@ 2010-11-09 18:48         ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 12+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-11-09 18:48 UTC
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Yes, that looks exactly like what I need; thanks.

This is now implemented.  New command `W U'.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: A washing function to turn Unicode punctuation into ASCII
  2010-11-08 23:27 A washing function to turn Unicode punctuation into ASCII Lars Magne Ingebrigtsen
  2010-11-09  8:55 ` Steinar Bang
@ 2010-11-12 23:24 ` Kevin Ryde
  2010-11-14 16:01   ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 12+ messages in thread
From: Kevin Ryde @ 2010-11-12 23:24 UTC
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
> wash this way...

A bit of display table can do a similar thing, and helps all modes.
I made my own few lines for hyphens and quotes which seem to be the most
common nonsense.  Is there more or better code doing the same?

http://user42.tuxfamily.org/unicode-disp/index.html
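
The difference between the two approaches can be modelled with a short
Python sketch (an illustration only, not how Emacs implements display
tables).  The names DISPLAY_TABLE and displayed are hypothetical.  A
washing function rewrites the stored text; a display table leaves the
text alone and only changes what is drawn for each character.

```python
# Hypothetical display table: character -> what to draw instead.
DISPLAY_TABLE = {
    "\u2010": "-",   # hyphen
    "\u2013": "-",   # en dash
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
}


def displayed(text: str, table: dict = DISPLAY_TABLE) -> str:
    """Return what would be shown for TEXT; TEXT itself is not modified."""
    return "".join(table.get(ch, ch) for ch in text)


stored = "\u201cquoted\u201d"   # the underlying text keeps its Unicode
shown = displayed(stored)       # only the rendering is ASCII
```

Because the stored text is untouched, saving or copying it still yields
the original characters, and the same table applies to any text that is
displayed, not just articles.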




* Re: A washing function to turn Unicode punctuation into ASCII
  2010-11-12 23:24 ` Kevin Ryde
@ 2010-11-14 16:01   ` Lars Magne Ingebrigtsen
  2010-11-14 17:56     ` Andreas Schwab
  0 siblings, 1 reply; 12+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-11-14 16:01 UTC
  To: ding

Kevin Ryde <user42@zip.com.au> writes:

> A bit of display table can do a similar thing, and helps all modes.
> I made my own few lines for hyphens and quotes which seem to be the most
> common nonsense.  Is there more or better code doing the same?
>
> http://user42.tuxfamily.org/unicode-disp/index.html

This makes a lot of sense.  But I'm beginning to wonder whether Emacs
has any built-in functionality for doing this stuff?  It would seem like
a very natural thing to do -- declare that your Emacs is running in an
ASCII environment, and then Emacs sets up display tables to
automatically do this transformation...

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: A washing function to turn Unicode punctuation into ASCII
  2010-11-14 16:01   ` Lars Magne Ingebrigtsen
@ 2010-11-14 17:56     ` Andreas Schwab
  2010-11-14 19:16       ` Lars Magne Ingebrigtsen
  2010-11-15 23:33       ` Kevin Ryde
  0 siblings, 2 replies; 12+ messages in thread
From: Andreas Schwab @ 2010-11-14 17:56 UTC
  To: ding

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> This makes a lot of sense.  But I'm beginning to wonder whether Emacs
> has any built-in functionality for doing this stuff?  It would seem like
> a very natural thing to do -- declare that your Emacs is running in an
> ASCII environment, and then Emacs sets up display tables to
> automatically do this transformation...

See international/iso-ascii.el (only for latin1) or
international/latin1-disp.el.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




* Re: A washing function to turn Unicode punctuation into ASCII
  2010-11-14 17:56     ` Andreas Schwab
@ 2010-11-14 19:16       ` Lars Magne Ingebrigtsen
  2010-11-15 23:33       ` Kevin Ryde
  1 sibling, 0 replies; 12+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-11-14 19:16 UTC
  To: ding; +Cc: emacs-devel

(We're discussing how to read utf-8 messages on a terminal that doesn't
support utf-8, e.g. Nokia E72.)

Andreas Schwab <schwab@linux-m68k.org> writes:

> See international/iso-ascii.el (only for latin1) or
> international/latin1-disp.el.

Hm...  so the function to call would be `latin1-display-ucs-per-lynx',
perhaps?  That's a huge mapping in there.  I'll give it a try.

Yes, that did the trick.  Perhaps that function should be
;;;###autoloaded and given a more understandable name?  It seems really
useful. 

Or are there other methods to do the same thing?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: A washing function to turn Unicode punctuation into ASCII
  2010-11-14 17:56     ` Andreas Schwab
  2010-11-14 19:16       ` Lars Magne Ingebrigtsen
@ 2010-11-15 23:33       ` Kevin Ryde
  2010-11-16 18:18         ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 12+ messages in thread
From: Kevin Ryde @ 2010-11-15 23:33 UTC
  To: ding

Andreas Schwab <schwab@linux-m68k.org> writes:
>
> international/latin1-disp.el.

Oh, yes, of course that was the inspiration for the name unicode-disp.el
:-)  I ended up mangling any window-display-table too, but for far fewer
chars.

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
> latin1-display-ucs-per-lynx

I had a lot of doubts about multi-char replacements, as they can make a
mess of lined-up text tables, or make lines very long.  As long as it
can be turned on and off I suppose you can look at it both ways when
needed.  (If it's coming out of HTML then of course the renderer can do
the replacements itself, so the text is reflowed at that level.)
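
The alignment concern can be seen in a small Python sketch (for
illustration only; the sample rows and the names ONE_TO_ONE,
ONE_TO_MANY and separator_columns are made up).  Two rows whose "|"
separators line up stay aligned under one-character replacements but
drift apart when a replacement is longer than the character it stands
for.

```python
# Both tables are hypothetical; they differ only in replacement length.
ONE_TO_ONE = str.maketrans({"\u2014": "-", "\u2026": "."})
ONE_TO_MANY = str.maketrans({"\u2014": "--", "\u2026": "..."})

# Two rows of a lined-up text table; the "|" sits at index 6 in both.
ROWS = [
    "a\u2014b   | 1",
    "wait\u2026 | 2",
]


def separator_columns(rows, table):
    """Return the index of the "|" in each row after applying TABLE."""
    return [row.translate(table).index("|") for row in rows]


aligned = separator_columns(ROWS, ONE_TO_ONE)    # same index for both rows
drifted = separator_columns(ROWS, ONE_TO_MANY)   # indexes now differ
```

Each row grows by a different amount, depending on which characters it
contains, so the columns no longer line up.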

I toyed with face colours as a visual cue, but didn't much like it,
even something modest like `escape-face'.  With many replacements the
screen breaks out in freckles when in all honesty you don't care if some
drongo thought variant hyphens or quotes were smart.




* Re: A washing function to turn Unicode punctuation into ASCII
  2010-11-15 23:33       ` Kevin Ryde
@ 2010-11-16 18:18         ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 12+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-11-16 18:18 UTC
  To: ding

Kevin Ryde <user42@zip.com.au> writes:

>> latin1-display-ucs-per-lynx
>
> I had a lot of doubts about multi-char replacements, as they can make a
> mess of lined-up text tables, or make lines very long.  As long as it
> can be turned on and off I suppose you can look at it both ways when
> needed.

It's quite useful when reading RSS feeds on my E72...

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





Thread overview: 12+ messages
2010-11-08 23:27 A washing function to turn Unicode punctuation into ASCII Lars Magne Ingebrigtsen
2010-11-09  8:55 ` Steinar Bang
2010-11-09 17:52   ` Lars Magne Ingebrigtsen
2010-11-09 18:01     ` Lawrence Mitchell
2010-11-09 18:16       ` Lars Magne Ingebrigtsen
2010-11-09 18:48         ` Lars Magne Ingebrigtsen
2010-11-12 23:24 ` Kevin Ryde
2010-11-14 16:01   ` Lars Magne Ingebrigtsen
2010-11-14 17:56     ` Andreas Schwab
2010-11-14 19:16       ` Lars Magne Ingebrigtsen
2010-11-15 23:33       ` Kevin Ryde
2010-11-16 18:18         ` Lars Magne Ingebrigtsen
