From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <6e08851011f045080e1945623fb62232@plan9.ucalgary.ca>
To: 9fans@cse.psu.edu
From: mirtchov@cpsc.ucalgary.ca
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="upas-tuzeuckkkjtmiipkbjcrqmmxxi"
Subject: [9fans] web-based bulgarian-english and english-bulgarian dictionary
Date: Thu, 6 Nov 2003 16:29:21 -0700
Topicbox-Message-UUID: 80a9e2d2-eacc-11e9-9e20-41e7f4b1d025

This is a multi-part message in MIME format.
--upas-tuzeuckkkjtmiipkbjcrqmmxxi
Content-Disposition: inline
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit

Inspired by the little thesaurus script today's "pull" brought home,
this little script queries a relatively complete en/bg and bg/en
dictionary on the http://sa.dir.bg web site.

I've sent them an email to tell them about the wonders of utf-8, but I
doubt they'll switch from the venerable Windows-1251 cyrillic
encoding, so there's a key in the script for transliterating all
cyrillic queries, or if you wish you can type in cyrillic and it'll do
the transliteration for you.

also available at:

	http://pages.cpsc.ucalgary.ca/~mirtchov/p9/sadict/

I don't think this should make it in the default Plan 9 distribution,
unless Plan 9 becomes the OS of choice in Bulgaria in the next 30
minutes.
andrey

PS: I used it just now to check the meaning of 'venerable', it works :)
--upas-tuzeuckkkjtmiipkbjcrqmmxxi
Content-Disposition: attachment; filename=sadict
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

#!/bin/rc

# english-bulgarian translation:
#	sadict test
# bulgarian-english translation:
#	sadict TEST

# the web site switches between bulgarian and english translations
# based on whether it receives uppercase or lowercase letters

# to translate from bulgarian into english use the following transliterat=
ion
# key:

# key for translating from bulgarian to english (all capital letters
# are in english), as they would be written if you were typing on=20
# a phonetic english-american keyboard.
#=20
# =D0=B0 =3D A
# =D0=B1 =3D B
# =D0=B2 =3D W
# =D0=B3 =3D G
# =D0=B4 =3D D
# =D0=B5 =3D E
# =D0=B6 =3D V
# =D0=B7 =3D Z
# =D0=B8 =3D I
# =D0=B9 =3D J
# =D0=BA =3D K
# =D0=BB =3D L
# =D0=BC =3D M
# =D0=BD =3D N
# =D0=BE =3D O
# =D0=BF =3D P
# =D1=80 =3D R
# =D1=81 =3D S
# =D1=82 =3D T
# =D1=83 =3D U
# =D1=84 =3D F
# =D1=85 =3D H
# =D1=86 =3D C
# =D1=87 =3D ` (or %60)
# =D1=88 =3D [
# =D1=89 =3D ]
# =D1=8A =3D Y
# =D1=8C =3D X
# =D1=8E =3D |
# =D1=8F =3D Q

# sa.dir.bg outputs everything in Windows-1251 encoding, so we=20
# use tr to substitute proper unicode cyrillic

# sa.dir.bg gives 20 (sometimes fewer) similar words in a separate
# frame. I've left it there because I often find it useful and it just
# scrolls off the top of the screen when it isn't.

# don't try capital cyrillic letters, they won't work! only the
# lowercase ones are transliterated.
# plain tr, not tr -s: -s would also squeeze doubled letters
query=3D`{echo $1 |=20
	tr '=D0=B0=D0=B1=D0=B2=D0=B3=D0=B4=D0=B5=D0=B6=D0=B7=D0=B8=D0=B9=D0=BA=
=D0=BB=D0=BC=D0=BD=D0=BE=D0=BF=D1=80=D1=81=D1=82=D1=83=D1=84=D1=85=D1=86=D1=
=88=D1=89=D1=8A=D1=8C=D1=8E=D1=8F' 'ABWGDEVZIJKLMNOPRSTUFHC\[\]YX\|Q' |
	sed 's/=D1=87/%60/g'
}

hget 'http://sa.dir.bg/cgi-bin/sabig.cgi?word=3D'^$query |
	htmlfmt -l 1000 |
	tr '=C3=A0-=C3=BF=C3=80-=C3=9F' '=D0=B0-=D1=8F=D0=90-=D0=AF'
--upas-tuzeuckkkjtmiipkbjcrqmmxxi--