9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] web-based bulgarian-english and english-bulgarian dictionary
@ 2003-11-06 23:29 mirtchov
  2003-11-06 23:57 ` Latchesar Ionkov
  0 siblings, 1 reply; 4+ messages in thread
From: mirtchov @ 2003-11-06 23:29 UTC
  To: 9fans

[-- Attachment #1: Type: text/plain, Size: 799 bytes --]

Inspired by the little thesaurus script today's "pull" brought home,
this little script queries a relatively complete en/bg and bg/en
dictionary on the http://sa.dir.bg web site.

I've sent them an email to tell them about the wonders of utf-8, but I
doubt they'll switch from the venerable Windows-1251 cyrillic
encoding, so there's a key in the script for transliterating all
cyrillic queries, or if you wish you can type in cyrillic and it'll do
the transliteration for you.

also available at:

	http://pages.cpsc.ucalgary.ca/~mirtchov/p9/sadict/

I don't think this should make it in the default Plan 9 distribution,
unless Plan 9 becomes the OS of choice in Bulgaria in the next 30
minutes.

andrey

PS: I used it just now to check the meaning of 'venerable', it works :)

[-- Attachment #2: sadict --]
[-- Type: text/plain, Size: 1453 bytes --]

#!/bin/rc

# english-bulgarian translation:
#	sadict test

# bulgarian-english translation:
#	sadict TEST

# the web site switches between bulgarian and english translations
# based on whether it receives uppercase or lowercase letters.
# to translate from bulgarian into english, use the following
# transliteration key:

# key for translating from bulgarian to english (all capital letters
# are in english), as they would be written if you were typing on 
# a phonetic english-american keyboard.
# 
#	а	=	A
#	б	=	B
#	в	=	W
#	г	=	G
#	д	=	D
#	е	=	E
#	ж	=	V
#	з	=	Z
#	и	=	I
#	й	=	J
#	к	=	K
#	л	=	L
#	м	=	M
#	н	=	N
#	о	=	O
#	п	=	P
#	р	=	R
#	с	=	S
#	т	=	T
#	у	=	U
#	ф	=	F
#	х	=	H
#	ц	=	C
#	ч	=	` (or %60)
#	ш	=	[
#	щ	=	]
#	ъ	=	Y
#	ь	=	X
#	ю	=	|
#	я	=	Q

# sa.dir.bg outputs everything in windows-1251 encoding, which shows
# up here as latin-1 accented letters, so we use tr to substitute
# proper unicode cyrillic

# sa.dir.bg gives 20 (sometimes fewer) similar words in a separate
# frame. I've left it there because I often find it useful and it just
# scrolls off the top of the screen when it isn't.

# don't try capital cyrillic letters, they won't work!

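# note: no -s here; in tr(1) -s squeezes repeated output characters,
# which would mangle words with doubled letters
query=`{echo $1 |
	tr 'абвгдежзийклмнопрстуфхцшщъьюя' 'ABWGDEVZIJKLMNOPRSTUFHC\[\]YX\|Q' |
	sed 's/ч/%60/g'
}

hget 'http://sa.dir.bg/cgi-bin/sabig.cgi?word='^$query |
	htmlfmt -l 1000 |
	tr 'à-ÿÀ-ß' 'а-яА-Я'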
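
The query half of the script can also be sketched outside rc. What follows is a minimal Python illustration, not part of sadict: the name `translit` is made up for this example, and the table is copied from the key in the script's comments. It maps lowercase cyrillic to the phonetic-keyboard keys, sends ч as %60, and leaves everything else (such as an english query) untouched.

```python
# Hypothetical helper for illustration only; the table mirrors the
# transliteration key given in the script's comments.
CYRILLIC = "абвгдежзийклмнопрстуфхцшщъьюя"
KEYS = "ABWGDEVZIJKLMNOPRSTUFHC[]YX|Q"

# One-to-one character table; both strings have the same length.
TABLE = str.maketrans(CYRILLIC, KEYS)
# ч maps to a backquote, which the script sends URL-encoded as %60.
TABLE[ord("ч")] = "%60"


def translit(word: str) -> str:
    """Map lowercase cyrillic to phonetic-keyboard keys; leave the rest alone."""
    return word.translate(TABLE)


if __name__ == "__main__":
    print(translit("тест"))  # TEST
```

Doubled letters are kept as typed, so анна becomes ANNA.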

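The last tr in the pipeline can be sketched in Python for illustration. This sketch assumes, as the script's character ranges suggest, that the Windows-1251 bytes arrive misread as Latin-1, so each cyrillic letter appears as a character in U+00C0..U+00FF. Shifting those code points by 0x350 lands on U+0410..U+044F, which is the same mapping as 'à-ÿÀ-ß' to 'а-яА-Я'. The function name `latin1_to_cyrillic` is hypothetical.

```python
# Illustration only: mirrors the 'à-ÿÀ-ß' -> 'а-яА-Я' mapping.
OFFSET = 0x0410 - 0x00C0  # 0x350: distance from À to cyrillic А


def latin1_to_cyrillic(text: str) -> str:
    """Shift characters in U+00C0..U+00FF up to the cyrillic block."""
    return "".join(
        chr(ord(c) + OFFSET) if 0xC0 <= ord(c) <= 0xFF else c
        for c in text
    )


if __name__ == "__main__":
    # "тест" in Windows-1251 is the bytes F2 E5 F1 F2, i.e. "òåñò" in Latin-1.
    print(latin1_to_cyrillic("òåñò"))  # тест
```

For text made up only of Latin-1 characters, `text.encode("latin-1").decode("cp1251")` gives the same result for these letters.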


* Re: [9fans] web-based bulgarian-english and english-bulgarian dictionary
  2003-11-06 23:29 [9fans] web-based bulgarian-english and english-bulgarian dictionary mirtchov
@ 2003-11-06 23:57 ` Latchesar Ionkov
  2003-11-07  0:08   ` mirtchov
  0 siblings, 1 reply; 4+ messages in thread
From: Latchesar Ionkov @ 2003-11-06 23:57 UTC
  To: 9fans

You can check the dictionaries (and the library) at
http://sourceforge.net/projects/bedic. 

The library is written in C++, but is pretty simple and can be ported to C
easily. Writing a command-line (or CGI) app is also pretty easy.

The dictionaries are in UTF-8 format :)

	Lucho

PS. The library contains a couple of functions from 9libs :)


* Re: [9fans] web-based bulgarian-english and english-bulgarian dictionary
  2003-11-06 23:57 ` Latchesar Ionkov
@ 2003-11-07  0:08   ` mirtchov
  2003-11-07  2:03     ` Dan Cross
  0 siblings, 1 reply; 4+ messages in thread
From: mirtchov @ 2003-11-07  0:08 UTC
  To: 9fans

> You can check the dictionaries (and the library) at
> http://sourceforge.net/projects/bedic.
>

yes, but mine is simpler :P (including the modified version i just put
on the web, which fixes a bug or two and allows the script to do
searches on queries typed in cyrillic, so you can copy/paste any of
the suggestions a query gives you back into sadict).

if you have reasonably complete (~500 000 words) dictionaries you can
throw them my way -- there's a standard dict(7) already in Plan 9,
modifying it to use different dictionary type files is a breeze:

	http://pages.cpsc.ucalgary.ca/~mirtchov/p9/dict/

andrey





* Re: [9fans] web-based bulgarian-english and english-bulgarian dictionary
  2003-11-07  0:08   ` mirtchov
@ 2003-11-07  2:03     ` Dan Cross
  0 siblings, 0 replies; 4+ messages in thread
From: Dan Cross @ 2003-11-07  2:03 UTC
  To: 9fans

mirtchov@cpsc.ucalgary.ca writes:
> if you have reasonably complete (~500 000 words) dictionaries you can
> throw them my way -- there's a standard dict(7) already in Plan 9,
> modifying it to use different dictionary type files is a breeze:

How about adding support for the Internet dictionary protocol?  I
think that'd be genuinely useful.

	- Dan C.


