From: Adam Thompson
To: Karl Dahlke
Cc: acsint@lists.the-brannons.com, Edbrowse-dev@lists.the-brannons.com
Date: Wed, 18 Dec 2013 17:06:05 +0000
Subject: Re: [Edbrowse-dev] html unicode translations in edbrowse
Message-ID: <20131218170605.GD5812@toaster.adamthompson.me.uk>
In-Reply-To: <20131118105931.eklhad@comcast.net>

On Wed, Dec 18, 2013 at 10:59:31AM -0500, Karl Dahlke wrote:
> My jupiter adapter will pronounce unicodes in utf8 in the tty buffer
> according to pronunciations that you can set in the config file.
> Here is an example, the start of Greek.
>
> u945 alpha
> u946 beta
> u947 gamma
>
> So when this code appears as 2 bytes in utf8 it is read alpha,
> no matter how it got there.

That sounds like a good idea.

> How did I use to do it?
> The html browser would turn the html code
> α into the word alpha when rendering html.
> See format.c line 1330
> That works fine as long as I am browsing files from the web,
> or html files that I wrote myself,
> but if alpha beta gamma are in a document or from pdf or some other
> source well I am just out of luck.
> You can see at a glance that such things are better handled in the adapter.
> It's a more general and flexible approach.

Again agreed.
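For anyone following along, here is a rough Python sketch of the scheme quoted above, not Jupiter's actual code. It assumes config lines of the form "uNNN word" with NNN a decimal code point, as in the example; the function names parse_pronunciations and speakable are made up for illustration.

```python
# Hypothetical sketch: these names are illustrative only and do not come
# from Jupiter or edbrowse.

def parse_pronunciations(config_text):
    """Map code points to spoken words from lines such as 'u945 alpha'."""
    table = {}
    for line in config_text.splitlines():
        fields = line.split(None, 1)
        if len(fields) != 2 or not fields[0].startswith("u"):
            continue  # skip blank or unrelated lines
        try:
            code_point = int(fields[0][1:])  # assumed to be decimal
        except ValueError:
            continue  # not a code point entry
        table[code_point] = fields[1].strip()
    return table


def speakable(raw_bytes, table):
    """Decode UTF-8 from the buffer and substitute configured words."""
    text = raw_bytes.decode("utf-8", errors="replace")
    return "".join(
        " " + table[ord(ch)] + " " if ord(ch) in table else ch
        for ch in text
    )


CONFIG = """\
u945 alpha
u946 beta
u947 gamma
"""

table = parse_pronunciations(CONFIG)
alpha_bytes = "\u03b1".encode("utf-8")  # code point 945 is U+03B1
print(len(alpha_bytes))                 # 2 bytes in UTF-8
print(speakable(alpha_bytes, table).split())
```

Because the substitution works on the decoded text, it does not matter whether the character came from a web page, a PDF conversion or a plain text file.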
>
> Once the latest version of Jupiter is pushed,
> I may request of Chris that most or all
> of those hard-coded translations in format.c go away,
> and instead you just crank out the unicode that is implied by the html tag.
> It's up to the adapter then to read it properly.

This makes sense as long as the user's adapter does handle utf8. I use speakup
with espeak, which seems to handle most things but probably not everything, and
I've got no idea what those characters would do to my braille display.
I'm not against the idea, but it may be worth remembering that edbrowse has a
wider user community than those using Jupiter, particularly as there's a Debian
package for edbrowse and not for Jupiter (at least not in the main repos).

Also, are you planning to ship an example list of these characters, or do users
have to go through the utf8 charset to work out what's what?

Cheers,
Adam.
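
On the question of an example list: one possible way to bootstrap one, sketched here in Python under the assumption that the "uNNN word" format quoted earlier is what the adapter reads. It uses the standard unicodedata module to turn official character names into short words; starter_list is a made-up helper name, and the output would still need checking by hand.

```python
import unicodedata


def starter_list(first, last):
    """Build 'uNNN word' lines for the code points first..last inclusive."""
    lines = []
    for code_point in range(first, last + 1):
        name = unicodedata.name(chr(code_point), "")
        if not name:
            continue  # unassigned or unnamed code point
        # "GREEK SMALL LETTER ALPHA" becomes "alpha"; names without
        # " LETTER " are kept whole, in lower case.
        word = name.lower().rsplit(" letter ", 1)[-1]
        lines.append("u%d %s" % (code_point, word))
    return lines


for line in starter_list(945, 947):
    print(line)
```

A list built this way only covers what the Unicode names say, so anything with a long or awkward name would probably want a shorter word chosen by hand.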