9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] International Ispell in Plan9
@ 2013-04-10 22:01 trebol
  2013-04-11  6:57 ` Mark van Atten
  2013-04-11 17:57 ` Nemo
  0 siblings, 2 replies; 10+ messages in thread
From: trebol @ 2013-04-10 22:01 UTC
  To: 9fans

Hello everyone,

First of all, I'm just starting to learn programming, and I'm a complete
newbie to Plan 9, so please be patient...  I was unhappy with the
English-only spell checker, so I compiled International Ispell with APE:

-Installed pdcurses.
-Patched term.c for termios.h, using a Linux patch (I don't really understand it).
-Modified correct.c:827 to add a cast (I had type conflicts):
		(void) fputs ((const char*)tok, logfile);
-Edited local.h.generic to make local.h:

	#define MINIMENU	/* Display a mini-menu at the bottom of the screen */
	#define USG		/* Define on System V or if term.c won't compile */
	#undef NO_FCNTL_H	/* Define if you get compile errors on fcntl.h */
	#define NO_MKSTEMP	/* Define if you get compile or link errors */
	#define CFLAGS	"-O -D_POSIX_SOURCE -D_BSD_EXTENSION"
	#define TERMLIB	"-lcurses"
	#define REGLIB	""
	#undef NO8BIT
	#define WORDS	"/usr/trebol/local/share/dict/words"
	
	#define LANGUAGES "{american,MASTERDICTS=american.med,HASHFILES=americanmed.hash,EXTRADICT=} {español}"
	/*
	 * Important directory paths.  If you change MAN45DIR from man5 to
	 * something else, you probably also want to set MAN45SECT and
	 * MAN45EXT (but not if you keep the man pages in section 5 and just
	 * store them in a different place).
	 */
	#define BINDIR	"/usr/trebol/local/bin"
	#define LIBDIR	"/usr/trebol/local/lib"
	#define MAN1DIR	"/usr/trebol/local/man/man1"
	#define MAN45DIR "/usr/trebol/local/man/man5"

For Spanish spell checking I used
http://www.datsi.fi.upm.es/~coes/espa~nol-1.7.tar.gz, untarred it in
ispell-3.3.02/languages/español, and made some changes:

-Renamed all files and directories to UTF-8 names (acme doesn't work well with the ~).
-Added a utf8 formatter to the aff file:

	altstringtype "utf8" "tex" ".txt"
	
	altstringchar   á    \'a
	altstringchar   Á    \'A
	altstringchar   é    \'e
	altstringchar   É    \'E
	altstringchar   í    \'i
	altstringchar   Í    \'I
	altstringchar   ñ    \'n
	altstringchar   Ñ    \'N
	altstringchar   ó    \'o
	altstringchar   Ó    \'O
	altstringchar   ú    \'u
	altstringchar   Ú    \'U
	altstringchar   ü    \"u
	altstringchar   Ü    \"U

-Edited Makefile, changed LANGUAGE to español and corrected paths:
	
		...
	PATHADDER	=	../../
	BUILDHASH	=	../../buildhash
	UNSQ		=	../../unsq
	FIX8BIT		=	../../fix8bit
		...
	LANGUAGE	=	español
		...
	eñe:
		sh eñes
		...
	../../munchlist -v -l ...
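
The utf8 formatter added to the aff file can be pictured as a plain substitution table.  The Python sketch below is only an illustration, not ispell's implementation: it assumes the backslash in each altstringchar entry is just an escape, so that the right-hand column stands for the two-character spellings 'a, "u and so on.  The names UTF8_TO_INTERNAL and to_internal are made up for this example.

```python
# Hypothetical table mirroring the altstringchar entries of the utf8 formatter.
# Assumption: \'a in the aff file denotes the two characters 'a.
UTF8_TO_INTERNAL = {
    "á": "'a", "Á": "'A",
    "é": "'e", "É": "'E",
    "í": "'i", "Í": "'I",
    "ñ": "'n", "Ñ": "'N",
    "ó": "'o", "Ó": "'O",
    "ú": "'u", "Ú": "'U",
    "ü": '"u', "Ü": '"U',
}


def to_internal(word):
    """Replace each accented character with its two-character spelling."""
    return "".join(UTF8_TO_INTERNAL.get(ch, ch) for ch in word)


if __name__ == "__main__":
    for w in ("camión", "cariño", "pingüino"):
        print(w, "->", to_internal(w))
```

Characters that are not in the table pass through unchanged, so plain ASCII words are not affected.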

'make' works fine, but the deformatters must be compiled separately:
'make install' expects executables, but 'make all' only creates .o files.
After 'make install', the Spanish .aff and .hash files must be moved to
the correct directory.

Well, ispell's normal mode works, but the suggestions and the line
containing the misspelled word aren't shown (a curses problem?).  The
interactive, -a and -l modes work fine.

The next problem was acme.  I had to change spout.c:63 to handle non-ASCII characters:
	if(isalpharune(c))
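
Why a rune-aware test matters can be shown outside of spout.  This is a minimal Python sketch, assuming the original test accepted only ASCII letters; str.isalpha() stands in for Plan 9's isalpharune(), and split_words is a hypothetical helper, not spout's code.

```python
def split_words(text, is_letter):
    """Return (offset, word) pairs for each maximal run of letters in text."""
    words = []
    start = None
    for i, ch in enumerate(text):
        if is_letter(ch):
            if start is None:
                start = i          # a new word begins here
        elif start is not None:
            words.append((start, text[start:i]))
            start = None
    if start is not None:          # word running to the end of the text
        words.append((start, text[start:]))
    return words


def ascii_letter(ch):
    # ASCII-only test: every non-ASCII character acts as a separator.
    return ch.isascii() and ch.isalpha()


def rune_letter(ch):
    # Unicode-aware test, playing the role of isalpharune() here.
    return ch.isalpha()


if __name__ == "__main__":
    line = "camiño camion"
    print(split_words(line, ascii_letter))
    print(split_words(line, rune_letter))
```

With the ASCII-only test the ñ splits the first word into two fragments, so the spell checker never sees the real word; with the rune-aware test it stays whole.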

Then aspell.  I made a version for ispell...  well, this was a nightmare
for me, but I've learned a lot about rc.

#!/bin/rc

args=()
spellflags=()
for(x){
	switch($x){
	case -d*
		spellflags=($spellflags $x)
	case -p*
		spellflags=($spellflags $x)
	case -T*
		spellflags=($spellflags $x)
	case *
		args = ($args $x)
	}
}

dir = /mnt/wsys
if(! test -f $dir/cons)
	dir = /mnt/term/$dir
id=`{cat $dir/new/ctl}
id=$id(1)

if(~ $#args 1 && ~ $args /*){
	adir = `{basename -d $args}
	args = `{basename $args}
	echo 'name '^$adir^/-spell > $dir/$id/ctl
	cd $adir
}
if not {
	echo 'name '^`{pwd}^/-spell > $dir/$id/ctl
}

{
	echo noscroll
	if(~ $#args 0)
		for(j in `{$home/local/bin/acme/spout | sort -t: -u +2 | sort -t: +1.1n}){if(test `{echo -n $j | $home/local/bin/ispell -l $spellflags}){echo -n $j; echo -n $j | $home/local/bin/ispell -a $spellflags | awk -F: '/^&/{ORS=""; print $2}'; echo}} > $dir/$id/body
	if not for(i in $args)
		cat $i | for(j in `{$home/local/bin/acme/spout | sort -t: -u +2 | sort -t: +1.1n}){if(test `{echo -n $j | $home/local/bin/ispell -l $spellflags}){echo -n $i; echo -n $j; echo -n $j | $home/local/bin/ispell -a $spellflags | awk -F: '/^&/{ORS=""; print $2}'; echo}} > $dir/$id/body
	echo clean
}> $dir/$id/ctl


This works, and the output looks like this:

	test:#0,#7:centeya centena
	test:#8,#14:camiño camilo, camino, cariño
	test:#15,#21:camion camino, camión

So you can click to jump to the misspelled word in the file, and also see the suggestions.
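
The #n addresses in this output count characters, not bytes, which is why the word after camiño starts at #15 even though ñ takes two bytes in UTF-8.  The Python sketch below rebuilds the address part of each line; it assumes the checked file held the three words separated by single spaces (reconstructed from the offsets shown), and the function names are hypothetical.

```python
def acme_address(name, start, word):
    """Build an acme address selecting word at character offset start."""
    return "%s:#%d,#%d" % (name, start, start + len(word))


def word_addresses(name, text):
    """Return 'address:word' for every whitespace-separated word in text."""
    out = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)   # character offset of this word
        out.append(acme_address(name, start, word) + ":" + word)
        pos = start + len(word)
    return out


if __name__ == "__main__":
    for line in word_addresses("test", "centeya camiño camion"):
        print(line)
    # Byte length differs from character length for non-ASCII words.
    print(len("camiño"), len("camiño".encode("utf-8")))
```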

With -dlanguage, -ppersonaldictionary and -Tformatter you can use any of
International Ispell's dictionaries.  I put functions in my profile
for easy use:

fn aispelles {$home/local/bin/acme/aispell -p$home/lib/pdict -Tutf8 -despañol $*}
fn aispellen {$home/local/bin/acme/aispell -p$home/lib/pdict_en -damerican $*}

And you can add '>> /personal/dictionary/path' to acme's tag (or to a
window holding a commands file) to add words to your personal dictionary quickly.

The problem is the 'for' statement.  When the file grows a little,
the script runs VERY slowly.  And sometimes I get 'test: unexpected
operator/operand:' from this test expression.

I'm sure there is a faster (and more proper) way to do this,
so I'd appreciate any help.
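
One common way to avoid starting ispell once per word is to send all the words to a single 'ispell -a' process and parse its replies afterwards.  The Python sketch below covers only the parsing step, following the reply format documented in the ispell manual: '&' lines carry near misses, '?' lines carry guesses, and '#' lines mean no suggestion was found.  SAMPLE is hand-written for illustration, not captured output, and parse_ispell_a is a hypothetical name.

```python
# Hand-written sample in the shape of `ispell -a` replies (an assumption,
# not real captured output): a banner, then one reply line per word.
SAMPLE = """\
@(#) International Ispell (illustrative banner)
& centeya 1 0: centena
& camiño 3 8: camilo, camino, cariño
*
# xyzzy 30
"""


def parse_ispell_a(output):
    """Return (word, offset, suggestions) for every misspelling reported."""
    misses = []
    for line in output.splitlines():
        if line.startswith(("&", "?")):
            # "& word count offset: sugg1, sugg2, ..."
            head, _, tail = line.partition(": ")
            _, word, _count, offset = head.split()
            misses.append((word, int(offset), tail.split(", ")))
        elif line.startswith("#"):
            # "# word offset" (misspelled, nothing to suggest)
            _, word, offset = line.split()
            misses.append((word, int(offset), []))
        # Banner, '*', '+', '-' and blank lines report no misspelling.
    return misses


if __name__ == "__main__":
    for word, offset, suggestions in parse_ispell_a(SAMPLE):
        print(word, offset, ", ".join(suggestions))
```

The design point is that the cost of starting ispell and loading its hash file is paid once per run instead of twice per word.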

Regards,
trebol.



* [9fans] International Ispell in Plan9
@ 2013-04-11 12:34 trebol
  0 siblings, 0 replies; 10+ messages in thread
From: trebol @ 2013-04-11 12:34 UTC
  To: 9fans

> And you can have '>> /personal/dictionary/path' [...]

Sorry, this should be 'echo >> /personal/dictionary/path', used with a
2-1 mouse chord.

Regards,
trebol.



* [9fans] International Ispell in Plan9
@ 2013-04-13  6:48 trebol
  0 siblings, 0 replies; 10+ messages in thread
From: trebol @ 2013-04-13  6:48 UTC
  To: 9fans

The script doesn't work, and has serious mistakes in its approach.  I will
fix it soon.

Regards,
trebol.



* [9fans] International Ispell in Plan9
@ 2013-04-14  2:30 trebol
  0 siblings, 0 replies; 10+ messages in thread
From: trebol @ 2013-04-14  2:30 UTC
  To: 9fans

Hello everyone.

This script works very well.  I will take a look at sources, and learn
the proper way to share this.

There are a lot of dictionaries you can use with ispell:

http://fmg-www.cs.ucla.edu/geoff/ispell-dictionaries.html


I will make a package with ispell and the American and British
dictionaries, and a separate package with the Spanish dictionary.
I'm starting to learn Portuguese, so I will try to make a package for
that language too, but I think it would be great if a native speaker
of each language made and tested the dictionaries.  The work is easy:
just make a utf8 formatter like the one I showed in the first mail I
sent with this subject.

With dot's address I could fix the output to make it work with
arbitrary selections.  I'm still looking for a way to get the
address of dot in a window from within an rc script, but I think the
proper way is going to be to keep learning C (just the 3rd chapter of
K&R for now...) and write a simple C program to do the work, put it in
/acme/bin for future use in scripts, and maybe replace the script
with a faster program.

Regards,
trebol.

//////////////////////////
//////////////////////////

#!/bin/rc

rm -f /tmp/$pid^'.'aispell

args=()
spellflags=()
for(x){
	switch($x){
	case -d*
		spellflags=($spellflags $x)
	case -p*
		spellflags=($spellflags $x)
	case -T*
		spellflags=($spellflags $x)
	case *
		args = ($args $x)
	}
}

dir = /mnt/wsys
if(! test -f $dir/cons)
	dir = /mnt/term/$dir
id=`{cat $dir/new/ctl}
id=$id(1)

#if(~ $#args 1 && ~ $args /*){
#	adir = `{basename -d $args}
#	args = `{basename $args}
#	echo 'name '^$adir^/-spell > $dir/$id/ctl
#	cd $adir
#}
#if not {
	echo 'name '^`{pwd}^/-spell > $dir/$id/ctl
#}

{
	echo noscroll
	if(~ $#args 0){
		cat > /tmp/$pid^'.'aispell
		args = /tmp/$pid^'.'aispell
		pipe = 1
	}
	for(i in $args){
		name = $i
		if(~ $pipe 1){
			name = `{sed 's/ .*//g' < /mnt/acme/$winid/tag}
			if(~ $#name 0) name = nonamedwindow
		}
		for(j in `{{cat $i; echo} | $home/local/bin/acme/spout | sort -t: -u +2 | sed 's/$/\!/g' | $home/local/bin/ispell -a $spellflags | grep '^[&#]' | sed 's/ /_/g'}){
		# {cat $i; echo} is for spout, which needs a trailing \n.  I want a list of lines, so j can't contain spaces
			miss = `{ echo $j | awk -F_ '{print $2}'}
			sugg = `{ echo $j | sed 's/^.*://g'} # I can't put grep -v '^#' here...
			{cat $i; echo} |
			$home/local/bin/acme/spout |
			grep '.*:'$miss'$' |
			sed 's/$/ '$sugg'/g' | # If I put grep -v '^#' above, this sed cuts output, I don't know why ...
			sed 's/#_.*$//g' |
			sed 's/_/ /g' |  # If I put sed 's/_/ /g' above, variables don't work in sed.  Again, I don't know why...
			sed s',^,'$name',g' > $dir/$id/body
		}

	}
	rm -f /tmp/$pid^'.'aispell
	echo clean
}> $dir/$id/ctl



Thread overview: 10+ messages
2013-04-10 22:01 [9fans] International Ispell in Plan9 trebol
2013-04-11  6:57 ` Mark van Atten
2013-04-11 17:57 ` Nemo
2013-04-12 12:21   ` trebol
2013-04-12 12:39     ` Francisco J Ballesteros
2013-04-12 12:56     ` erik quanstrom
2013-04-13  3:36       ` trebol
2013-04-11 12:34 trebol
2013-04-13  6:48 trebol
2013-04-14  2:30 trebol
