edbrowse-dev - development list for edbrowse
* [Edbrowse-dev] html unicode translations in edbrowse
@ 2013-12-18 15:59 Karl Dahlke
  2013-12-18 17:06 ` Adam Thompson
From: Karl Dahlke @ 2013-12-18 15:59 UTC
  To: Edbrowse-dev, acsint

This is a heads up of where we are headed, quite soon I hope.

My Jupiter adapter will pronounce unicode characters, stored as utf8 in the tty buffer,
according to pronunciations that you can set in the config file.
Here is an example, the start of the Greek alphabet.

u945	alpha
u946	beta
u947	gamma

So when this code appears as 2 bytes in utf8 it is read alpha,
no matter how it got there.
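
For anyone who wants to see the idea in code, here is a minimal sketch in Python.
It is not the Jupiter adapter's actual code; the function names parse_pronunciations
and speak are made up for illustration, and I am assuming the config format is just
u<decimal code point>, whitespace, then the word, as in the three lines above.

```python
# Hypothetical sketch of a pronunciation table; not Jupiter's real implementation.

def parse_pronunciations(config_text):
    """Build {code point: word} from lines such as 'u945<TAB>alpha'."""
    table = {}
    for line in config_text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2 or not parts[0].startswith("u"):
            continue  # skip blank or unrelated lines
        digits = parts[0][1:]
        if not (digits.isascii() and digits.isdigit()):
            continue  # assumed format: decimal code point after the 'u'
        table[int(digits)] = parts[1].strip()
    return table


def speak(text, table):
    """Replace each character found in the table with its word.

    Words are padded with spaces, then runs of whitespace are collapsed.
    """
    out = []
    for ch in text:
        word = table.get(ord(ch))
        out.append(f" {word} " if word else ch)
    return " ".join("".join(out).split())


CONFIG = "u945\talpha\nu946\tbeta\nu947\tgamma\n"
TABLE = parse_pronunciations(CONFIG)

# Code point 945 (U+03B1) takes two bytes in utf8: 0xCE 0xB1.
ALPHA_BYTES = chr(945).encode("utf-8")
```

The lookup is keyed on the decoded code point, so it does not matter whether
the character came from html, a pdf, or a plain text file.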

How did I use to do it?
The html browser would turn the html code for α, such as &alpha;,
into the word alpha when rendering html.
See format.c line 1330.
That works fine as long as I am browsing files from the web,
or html files that I wrote myself,
but if alpha, beta, gamma are in a plain document, or come from a pdf or some other
source, well, I am just out of luck.
You can see at a glance that such things are better handled in the adapter.
It's a more general and flexible approach.

Once the latest version of Jupiter is pushed,
I may request of Chris that most or all
of those hard-coded translations in format.c go away,
and instead you just crank out the unicode that is implied by the html tag.
It's up to the adapter then to read it properly.
It's mostly deleting code that I'm happy to get rid of,
so should be no trouble.
The real test will be reading my math pages,
which are full of Greek letters etc.
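
To make the proposed change concrete, here is a small sketch of "crank out the unicode
that is implied by the html tag", written in Python rather than edbrowse's C.
It uses the standard library's html.unescape; the wrapper name entity_to_unicode
is only for illustration and says nothing about how format.c would actually do it.

```python
import html


def entity_to_unicode(markup):
    """Decode html character references to the characters they stand for.

    The old approach would emit the English word 'alpha' here;
    this emits the character and leaves pronunciation to the adapter.
    """
    return html.unescape(markup)


# Named and numeric references decode to the same character, code point 945.
NAMED = entity_to_unicode("&alpha;")
NUMERIC = entity_to_unicode("&#945;")
```

Both forms end up as the same two utf8 bytes in the buffer, which is what the adapter would then look up.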

Thanks.

Karl Dahlke

* [Edbrowse-dev] html unicode translations in edbrowse
@ 2013-12-18 18:45 Karl Dahlke
  2013-12-19 12:20 ` Adam Thompson
From: Karl Dahlke @ 2013-12-18 18:45 UTC
  To: Edbrowse-dev

> I use speakup with espeak which seems to handle most things,

As I understand it, it works well with 8859-1,
which covers many western languages,
but that would not include the high unicodes,
so yes that would leave you out in the cold regarding
alpha beta gamma and my other math symbols.
And I do appreciate this feedback; that's why I posted.
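
A quick way to see the 8859-1 limit, as a sketch in Python: that character set
only has code points 0 through 255, so alpha at 945 simply has no encoding in it.
The helper name fits_latin1 is made up for this example, and this says nothing
about what speakup or espeak do internally.

```python
def fits_latin1(ch):
    """Return True if the character can be encoded in ISO 8859-1."""
    try:
        ch.encode("latin-1")  # Python's name for the ISO 8859-1 codec
        return True
    except UnicodeEncodeError:
        return False


E_ACUTE_OK = fits_latin1("\u00e9")  # e with acute accent, code point 233
ALPHA_OK = fits_latin1("\u03b1")    # Greek alpha, code point 945
```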

On the other side, edbrowse renders these according to my taste,
and in English, hard coded,
so some of my French edbrowse users may not be thrilled with the word alpha.
Who knows how that sounds on a French synthesizer.
So there's no clear right answer here;
maybe we'll just leave edbrowse be for a while until we have
a clear plan, or maybe a switch to turn these on or off.

Karl Dahlke
