From: Sebastian Humenda
To: edbrowse-dev
Date: Fri, 22 Jan 2016 18:13:05 +0100
Subject: Re: [Edbrowse-dev] Edbrowse recognizing site as binary data
Message-ID: <20160122171305.GE2555@Kraftkrust>
In-Reply-To: <20160022093057.eklhad@comcast.net>
References: <20160122130158.GA2555@Kraftkrust> <20160022093057.eklhad@comcast.net>

Hi,

Karl Dahlke wrote on 22.01.2016, 9:30 -0500:
>> edbrowse https://portal.slm.tu-dresden.de
>> cannot browse a binary file
>> I presume that's a bug? Is there anything I can do to help resolving this issue?
>
>Not a bug technically, it's a feature not yet implemented.
>Save the data to a file and run /bin/file on it and get this
>
>HTML document, Little-endian UTF-16 Unicode text

Ah, right.

>The problem is utf-16, which edbrowse does not recognize.
>This has been discussed on the group.
>It is very rare on the internet and becoming rarer still,

This is only partially true. Especially for Asian languages, UTF-16 can be a
better choice than UTF-8, since most CJK characters take two bytes in UTF-16
but three in UTF-8. That does not apply to the site above, though.

Anyway, there's ConvertUTF.h in LLVM:
http://llvm.org/svn/llvm-project/llvm/trunk/include/llvm/Support/ConvertUTF.h
That should provide conversion functions for both UTF-32->UTF-8 and
UTF-16->UTF-8 (together with the appropriate C file, of course). Wouldn't it
be easy to just detect UTF-16 and convert it to UTF-8 before doing anything
else?
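To make the "detect and convert first" idea a bit more concrete, here is a
rough sketch in plain C. It is neither the LLVM ConvertUTF code nor anything
taken from edbrowse; the names detect_utf16_bom() and utf16_to_utf8() are made
up for illustration. It assumes the document starts with a byte order mark
(FF FE or FE FF), so BOM-less UTF-16 and charset information from the HTTP
headers would need separate handling.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper names; nothing here exists in edbrowse or LLVM. */
enum utf16_order { UTF16_NONE, UTF16_LE, UTF16_BE };

/* Detect UTF-16 by its byte order mark: FF FE = little-endian,
 * FE FF = big-endian. Limitation: a UTF-32LE BOM (FF FE 00 00) also
 * starts with FF FE and is not told apart here. */
enum utf16_order detect_utf16_bom(const unsigned char *buf, size_t len)
{
    if (len >= 2) {
        if (buf[0] == 0xFF && buf[1] == 0xFE) return UTF16_LE;
        if (buf[0] == 0xFE && buf[1] == 0xFF) return UTF16_BE;
    }
    return UTF16_NONE;
}

/* Read one 16-bit code unit in the given byte order. */
static uint32_t read_unit(const unsigned char *p, enum utf16_order order)
{
    return order == UTF16_LE ? (uint32_t)(p[0] | (p[1] << 8))
                             : (uint32_t)((p[0] << 8) | p[1]);
}

/* Convert UTF-16 data (BOM already skipped) to a malloc'd, NUL-terminated
 * UTF-8 string. Unpaired surrogates become U+FFFD; a trailing odd byte is
 * ignored. Returns NULL if allocation fails. */
char *utf16_to_utf8(const unsigned char *buf, size_t len,
                    enum utf16_order order, size_t *out_len)
{
    size_t units = len / 2, o = 0;
    /* Worst case is 3 output bytes per code unit. */
    unsigned char *out = malloc(units * 3 + 1);
    if (!out) return NULL;

    for (size_t i = 0; i < units; i++) {
        uint32_t cp = read_unit(buf + 2 * i, order);
        if (cp >= 0xD800 && cp <= 0xDBFF) {            /* high surrogate */
            uint32_t lo = (i + 1 < units)
                        ? read_unit(buf + 2 * (i + 1), order) : 0;
            if (lo >= 0xDC00 && lo <= 0xDFFF) {        /* valid pair */
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                i++;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {     /* stray low surrogate */
            cp = 0xFFFD;
        }

        if (cp < 0x80) {
            out[o++] = (unsigned char)cp;
        } else if (cp < 0x800) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    out[o] = '\0';
    if (out_len) *out_len = o;
    return (char *)out;
}
```

The caller would run detect_utf16_bom() on the freshly fetched buffer and, if
it reports UTF16_LE or UTF16_BE, pass everything after the two BOM bytes to
utf16_to_utf8() and hand the result to the rest of the code, which then only
ever sees UTF-8.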

Cheers
Sebastian
-- 
Web: http://www.crustulus.de (English|Deutsch) | Blog: http://www.crustulus.de/blog
FreeDict: Free multilingual dictionaries - http://www.freedict.org
Free Latin-German dictionary (Freies Latein-Deutsch-Wörterbuch): http://www.crustulus.de/freedict.de.html