From mboxrd@z Thu Jan 1 00:00:00 1970
From: Latchesar Ionkov
To: 9fans@cse.psu.edu
Subject: Re: [9fans] web-based bulgarian-english and english-bulgarian dictionary
Message-ID: <20031106235712.GA2271@ionkov.net>
References: <6e08851011f045080e1945623fb62232@plan9.ucalgary.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <6e08851011f045080e1945623fb62232@plan9.ucalgary.ca>
User-Agent: Mutt/1.4.1i
Date: Thu, 6 Nov 2003 18:57:12 -0500
Content-Transfer-Encoding: quoted-printable
Topicbox-Message-UUID: 80bea73a-eacc-11e9-9e20-41e7f4b1d025

You can check the dictionaries (and the library) at
http://sourceforge.net/projects/bedic.

The library is written in C++, but is pretty simple and can be ported to C
easily. Writing a command-line (or CGI) app is also pretty easy.

The dictionaries are in UTF-8 format :)

	Lucho

PS. The library contains a couple of functions from 9libs :)

On Thu, Nov 06, 2003 at 04:29:21PM -0700, mirtchov@cpsc.ucalgary.ca said:
> Inspired by the little thesaurus script today's "pull" brought home,
> this little script queries a relatively complete en/bg and bg/en
> dictionary on the http://sa.dir.bg web site.
>
> I've sent them an email to tell them about the wonders of utf-8, but I
> doubt they'll switch from the venerable Windows-1251 cyrillic
> encoding, so there's a key in the script for transliterating all
> cyrillic queries, or if you wish you can type in cyrillic and it'll do
> the transliteration for you.
>
> also available at:
>
>     http://pages.cpsc.ucalgary.ca/~mirtchov/p9/sadict/
>
> I don't think this should make it in the default Plan 9 distribution,
> unless Plan 9 becomes the OS of choice in Bulgaria in the next 30
> minutes.
>
> andrey
>
> PS: I used it just now to check the meaning of 'venerable', it works :)
> #!/bin/rc
>
> # english-bulgarian translation:
> #     sadict test
>
> # bulgarian-english translation:
> #     sadict TEST
>
> # the web site switches between bulgarian and english translations
> # based on whether it receives uppercase or lowercase letters
> # to translate from bulgarian into english use the following transliteration
> # key:
>
> # key for translating from bulgarian to english (all capital letters
> # are in english), as they would be written if you were typing on
> # a phonetic english-american keyboard.
> #
> # а = A
> # б = B
> # в = W
> # г = G
> # д = D
> # е = E
> # ж = V
> # з = Z
> # и = I
> # й = J
> # к = K
> # л = L
> # м = M
> # н = N
> # о = O
> # п = P
> # р = R
> # с = S
> # т = T
> # у = U
> # ф = F
> # х = H
> # ц = C
> # ч = ` (or %60)
> # ш = [
> # щ = ]
> # ъ = Y
> # ь = X
> # ю = |
> # я = Q
>
> # sa.dir.bg outputs everything in Windows-1251 encoding, so we
> # use tr -s to substitute for proper unicode cyrillic
>
> # sa.dir.bg gives 20 (sometimes less) similar words in a separate
> # frame. I've left it there because I often find it useful and it just
> # scrolls off the top of the screen when it isn't.
>
> # don't try capital letters, won't work!
>
> query=`{echo $1 |
>     tr -s 'абвгдежзийклмнопрстуфхцшщъьюя' 'ABWGDEVZIJKLMNOPRSTUFHC\[\]YX\|Q' |
>     sed 's/ч/%60/g'
> }
>
>
>
> hget 'http://sa.dir.bg/cgi-bin/sabig.cgi?word='^$query |
>     htmlfmt -l 1000 |
>     tr -s 'à-ÿÀ-ß' 'а-яА-Я'
>
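
For readers without a Plan 9 system at hand, the two conversions in the quoted script can be sketched in Python rather than rc. This is only an illustration under stated assumptions: the names CYRILLIC, LATIN, to_query and decode_reply are hypothetical and not part of the original script, the key is assumed to map the lowercase Bulgarian alphabet in order, and no request to sa.dir.bg is made.

```python
# A minimal sketch of the script's two conversions; all names here are
# hypothetical and chosen for illustration only.

# Lowercase Bulgarian alphabet and the phonetic-keyboard keys from the
# transliteration key in the script's comments (assumed to be in order).
CYRILLIC = "абвгдежзийклмнопрстуфхцчшщъьюя"
LATIN = "ABWGDEVZIJKLMNOPRSTUFHC`[]YX|Q"

TRANSLIT = str.maketrans(CYRILLIC, LATIN)


def to_query(word: str) -> str:
    """Transliterate lowercase Cyrillic and percent-encode the backtick.

    Like the script, only the backtick is escaped (as %60); Latin input
    passes through unchanged.
    """
    return word.translate(TRANSLIT).replace("`", "%60")


def decode_reply(raw: bytes) -> str:
    """Decode Windows-1251 bytes, as sent by the site, into Unicode text."""
    return raw.decode("cp1251")


if __name__ == "__main__":
    print(to_query("чай"))
    print(decode_reply(bytes([0xE0, 0xFF, 0xC0, 0xDF])))
```

Unlike tr -s, this sketch does not squeeze repeated characters; it only maps them one to one.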