From: Karl Dahlke
To: Edbrowse-dev@lists.the-brannons.com
Date: Wed, 26 Feb 2014 08:06:27 -0500
Subject: [Edbrowse-dev] andTranslate
User-Agent: edbrowse/3.5.1
Message-ID: <20140126080627.eklhad@comcast.net>

There is a function in format.c called andTranslate(). It takes meta-characters like &whatever; in HTML and turns each one into the symbol whatever.
A common example is &lt; for the less-than sign, because a bare less-than sign is the beginning of an HTML tag. Every literal less-than sign has to be encoded in this way. Thus &lt; becomes <

I turn it into the character <, not the words "less than" or some such thing, because every screen reader and every adapter will read the less-than sign as you want it read, in your language. I don't want to mess with that.

But the higher unicodes I sometimes turn into words, English words, unfortunately hard coded in format.c, because screen readers may not know what to do with those unicodes. On the other hand, more and more readers are configurable, to render these high unicodes as you wish, and I take that power away from the user by translating them into my own words in format.c.

I propose that andTranslate turn every &whatever; symbol into its utf8 equivalent, and that's all. Beyond this, however, you could have in your .ebrc config file lines like

&gamma; gamma

This would override the simple utf8 translation. It would let you put in your own words if your screen reader or system simply doesn't handle those unicodes well, or if you are dumping formatted HTML to text and would rather have it in words.

What do you think?

Of course this qualifies as a new feature, and I need not jump into it now. We should probably continue with bug fixes and the Debian confusion. I am very disappointed that they aren't helping us out here. We're doing 95% of the work, and they can't come forward with some information on how they build their libraries etc?? Well that's another story I guess.

Karl Dahlke
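
For illustration only, here is a rough sketch of the lookup order proposed above: user overrides first, then plain utf8. It is not code from format.c. The names translateEntity, entityOverride, entityCode, and the tiny built-in table are assumptions made up for this example, and it handles one entity name at a time (the text between & and ;) rather than a whole buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical override entry, e.g. one "&gamma; gamma" line from .ebrc. */
struct entityOverride {
	const char *name;	/* entity name without & and ; */
	const char *words;	/* replacement text chosen by the user */
};

/* Hypothetical built-in entry: entity name and its Unicode code point. */
struct entityCode {
	const char *name;
	unsigned long code;
};

/* A tiny sample table; a real one would cover all named entities. */
static const struct entityCode builtinEntities[] = {
	{"lt", 0x3c}, {"gt", 0x3e}, {"amp", 0x26},
	{"copy", 0xa9}, {"gamma", 0x3b3},
	{NULL, 0}
};

/* Encode code point c as a NUL-terminated utf8 string in buf (5 bytes).
 * Returns the number of bytes written, or 0 if c is not encodable. */
static int utf8Encode(unsigned long c, char *buf)
{
	if (c == 0)
		return 0;
	if (c < 0x80) {
		buf[0] = (char)c;
		buf[1] = 0;
		return 1;
	}
	if (c < 0x800) {
		buf[0] = (char)(0xc0 | (c >> 6));
		buf[1] = (char)(0x80 | (c & 0x3f));
		buf[2] = 0;
		return 2;
	}
	if (c < 0x10000) {
		if (c >= 0xd800 && c <= 0xdfff)
			return 0;	/* surrogates are not valid code points */
		buf[0] = (char)(0xe0 | (c >> 12));
		buf[1] = (char)(0x80 | ((c >> 6) & 0x3f));
		buf[2] = (char)(0x80 | (c & 0x3f));
		buf[3] = 0;
		return 3;
	}
	if (c < 0x110000) {
		buf[0] = (char)(0xf0 | (c >> 18));
		buf[1] = (char)(0x80 | ((c >> 12) & 0x3f));
		buf[2] = (char)(0x80 | ((c >> 6) & 0x3f));
		buf[3] = (char)(0x80 | (c & 0x3f));
		buf[4] = 0;
		return 4;
	}
	return 0;
}

/* Translate one entity name into out. The overrides list ends with a
 * NULL name. Returns 1 on success, 0 if the entity is unknown. */
int translateEntity(const char *name, const struct entityOverride *overrides,
		    char *out, size_t outlen)
{
	const struct entityOverride *ov;
	const struct entityCode *e;

	if (outlen < 5)
		return 0;

	/* 1. The user's own words win. */
	for (ov = overrides; ov && ov->name; ++ov) {
		if (strcmp(ov->name, name) == 0) {
			snprintf(out, outlen, "%s", ov->words);
			return 1;
		}
	}

	/* 2. Numeric forms: #947 or #x3b3. */
	if (name[0] == '#') {
		int hex = (name[1] == 'x' || name[1] == 'X');
		const char *digits = name + (hex ? 2 : 1);
		char *end;
		unsigned long c = strtoul(digits, &end, hex ? 16 : 10);
		if (end == digits || *end != '\0')
			return 0;
		return utf8Encode(c, out) > 0;
	}

	/* 3. Named entities become plain utf8, nothing more. */
	for (e = builtinEntities; e->name; ++e)
		if (strcmp(e->name, name) == 0)
			return utf8Encode(e->code, out) > 0;
	return 0;
}
```

With an override list containing {"gamma", "gamma"}, translateEntity("gamma", ...) produces the word gamma. With an empty list it produces the two utf8 bytes 0xCE 0xB3, and the screen reader decides how to speak them.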