9fans - fans of the OS Plan 9 from Bell Labs
From: trebol <trebol55555@aol.com>
To: 9fans@9fans.net
Subject: [9fans] Spell checking with acme in p9p
Date: Sun, 15 Dec 2013 11:55:54 +0000
Message-ID: <52ad98ca.fWxfWqxL4mycUn65%trebol55555@aol.com>

When I recently discovered Plan 9, the first things I missed were a
spell checker for languages other than English, support for other
languages in troff (mostly hyphenation), and more dictionaries for
dict.  I've ported "international ispell" to ape and written aispell,
a modified version of the aspell script that works with ispell.  I've
formatted GCIDE, "Chambers's Twentieth Century Dictionary",
"Diccionario de la Real Academia de la Lengua Española", "Moby
Thesaurus" and "OpenThesaurus-es", and I was hoping to work on troff
once I learn programming.  The lack of a web browser capable of
dealing with today's madness and the portability limitations of ape
(at least for an ignoramus like me) force me to use another OS that I
have to install and maintain, so the simplicity and cleanness I like
so much in Plan 9 become useless.  Thanks to Russ Cox for P9P!

I've written a script for p9p too.  You only need to install GNU
aspell (or another spell checker that supports "ispell -a") and compile
a slightly modified spout.c with rune support (I've called it uspout);
then you can use the script like acme's native aspell script.  This is
from my README.PLAN9 file:

[...]
In the acme directory are aispell, a script equivalent to aspell, and
uspout.c, a slightly modified spout.c for UTF-8 runes, which aispell
needs to support non-English languages.  You can pass ispell options as
arguments to aispell; for example, to use the Spanish dictionary you
can put in the tag:
>aispell -despañol -Tutf8

Or put 'aispell -despañol -Tutf8 $*' in a script and call
it aispelles, for example.  Defining a function in lib/profile didn't
work for me...  You can use it on any selected text, but for now, if
the selection doesn't start at the beginning of the buffer, the
addresses in the output will be wrong.  This package installs American
and British dictionaries.  If you are interested, look at the
Spanish_ispell package I've ported from http://www.datsi.fi.upm.es/~coes.
[...]

In P9P you don't need "-Tutf8".  I was going to ask for a directory in
sources, but I haven't seen any interest in these things on the list.
I hope this helps you.
The script and uspout's source are small, so I'm going to paste both here.
trebol.

#!/usr/local/plan9/bin/rc
# Don't forget to check the path!
# aispell_p9p

rm -f /tmp/$pid^'.'aispell

spellpgr=aspell
args=()
spellflags=()
for(x){
	switch($x){
	case -d*
		spellflags=($spellflags $x)
	case -p*
		spellflags=($spellflags $x)
	case -T*
		spellflags=($spellflags $x)
	case *
		args=($args $x)
	}
}

id=`{9p read acme/new/ctl}
id=$id(1)

echo 'name '^`{pwd}^/-spell | 9p write acme/$id/ctl

{
	if(~ $#args 0){
		cat > /tmp/$pid^'.'aispell
		args=/tmp/$pid^'.'aispell
		pipe=1
	}
	for(i in $args){
		name=$i
		if(~ $pipe 1){
			name=`{9p read acme/$winid/tag | 9 sed 's/ .*//g'}
			# an empty tag yields an empty list, so test the element count
			if(~ $#name 0) name=nonamedwindow
		}
		for(j in `{{cat $i; echo} | uspout | 9 sort -t: -u +2 | 9 sed 's/$/\!/g' | $spellpgr -a $spellflags | 9 grep '^[&#]' | 9 sed 's/ /_/g'}){
		# {cat $i; echo} is for uspout, which needs a trailing \n.  Also, I want a list of lines, so j can't contain spaces.

			miss=`{ echo $j | awk -F_ '{print $2}'}
			sug=`{ echo $j | 9 sed 's/^.*://g'} # Can't put 9 grep -v '^#' here...
			{cat $i; echo} |
			uspout |
			9 grep '.*:'$miss'$' |
			9 sed 's/$/ '$sug'/g' | # If I put 9 grep -v '^#' above, this 9 sed cuts output, I don't know why ...
			9 sed 's/#_.*$//g' |
			9 sed 's/_/ /g' |  # If I put 9 sed 's/_/ /g' above, variables don't work in 9 sed.  Again, I don't know why...
			9 sed s',^,'$name',g' | 9p write acme/$id/body
		}

	}
	rm -f /tmp/$pid^'.'aispell
	echo clean | 9p write acme/$id/ctl
}
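
For readers unfamiliar with the "ispell -a" pipe format that the script
filters with grep '^[&#]': a line of the form "& word count offset: guess,
guess, ..." reports a misspelling with suggestions, and "# word offset"
reports one with no suggestions.  The following is only an illustrative
sketch in portable C, not part of aispell; the names Miss and parse_miss
are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type, not part of aispell: one parsed result line. */
struct Miss {
	char word[128];   /* the misspelled word */
	char sugg[512];   /* suggestions as printed, "" when there are none */
};

/*
 * Parse one line of "ispell -a" output.
 *   "& word count offset: s1, s2"  -> word plus suggestions
 *   "# word offset"                -> word, no suggestions
 * Returns 1 when the line reports a misspelling, 0 otherwise.
 */
int
parse_miss(const char *line, struct Miss *m)
{
	const char *colon;

	m->word[0] = '\0';
	m->sugg[0] = '\0';
	if(line[0] != '&' && line[0] != '#')
		return 0;
	/* the word is the first blank-delimited field after the marker */
	if(sscanf(line + 1, " %127s", m->word) != 1)
		return 0;
	if(line[0] == '&' && (colon = strchr(line, ':')) != NULL){
		colon++;
		while(*colon == ' ')
			colon++;
		snprintf(m->sugg, sizeof m->sugg, "%s", colon);
		m->sugg[strcspn(m->sugg, "\n")] = '\0';   /* drop the newline */
	}
	return 1;
}
```

The script gets the same two pieces with awk and sed after replacing
spaces with underscores, so that each result line stays a single rc
list element.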


------------------------
uspout.c's source:
#include <u.h>
#include <libc.h>
#include <ctype.h>
#include <bio.h>

void	spout(int, char*);

Biobuf bout;

void
main(int argc, char *argv[])
{
	int i, fd;

	Binit(&bout, 1, OWRITE);
	if(argc == 1)
		spout(0, "");
	else
		for(i=1; i<argc; i++){
			fd = open(argv[i], OREAD);
			if(fd < 0){
				fprint(2, "uspout: can't open %s: %r\n", argv[i]);
				continue;
			}
			spout(fd, argv[i]);
			close(fd);
		}
	exits(nil);
}

Biobuf b;

void
spout(int fd, char *name)
{
	char *s, *t, *w;
	Rune r;
	int inword, wordchar;
	int n, wn, wid, c, m;
	char buf[1024];

	Binit(&b, fd, OREAD);
	n = 0;
	wn = 0;
	while((s = Brdline(&b, '\n')) != nil){
		if(s[0] == '.')
			for(c=0; c<3 && *s>' '; c++){
				n++;
				s++;
			}
		inword = 0;
		w = s;
		t = s;
		do{
			c = *(uchar*)t;
			if(c < Runeself)
				wid = 1;
			else{
				wid = chartorune(&r, t);
				c = r;
			}
			wordchar = 0;
			if(isalpharune(c))
				wordchar = 1;
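			if(inword && !wordchar){
				/* keep an apostrophe inside a word, as in "don't" */
				if(c=='\'' && isalpha((uchar)t[1]))
					goto Continue;
				m = t-w;
				/* skip one-byte words and words too long for buf */
				if(m > 1 && m < (int)sizeof buf){
					memmove(buf, w, m);
					buf[m] = 0;
					Bprint(&bout, "%s:#%d,#%d:%s\n", name, wn, n, buf);
				}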
				inword = 0;
			}else if(!inword && wordchar){
				wn = n;
				w = t;
				inword = 1;
			}
			if(c=='\\' && (isalpha((uchar)t[1]) || t[1]=='(')){
				switch(t[1]){
				case '(':
					m = 4;
					break;
				case 'f':
					if(t[2] == '(')
						m = 5;
					else
						m = 3;
					break;
				case 's':
					if(t[2] == '+' || t[2]=='-'){
						if(t[3] == '(')
							m = 6;
						else
							m = 4;
					}else{
						if(t[2] == '(')
							m = 5;
						else if(t[2]=='1' || t[2]=='2' || t[2]=='3')
							m = 4;
						else
							m = 3;
					}
					break;
				default:
					m = 2;
				}
				while(m-- > 0){
					if(*t == '\n')
						break;
					n++;
					t++;
				}
				continue;
			}
	Continue:
			n++;
			t += wid;
		}while(c != '\n');
	}
	Bterm(&b);
}
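
A note on the addresses: in the loop above, n advances once per rune
while t advances by wid bytes, because acme's #n addresses count
characters, not bytes.  The following portable C sketch only illustrates
that difference; it is not part of uspout, it assumes UTF-8 input, and
the names utf8_width and char_offset are made up for this example.

```c
#include <stddef.h>

/* Bytes in the UTF-8 sequence starting with lead byte c (1 for stray bytes). */
static int
utf8_width(unsigned char c)
{
	if(c < 0x80)
		return 1;
	if((c & 0xE0) == 0xC0)
		return 2;
	if((c & 0xF0) == 0xE0)
		return 3;
	if((c & 0xF8) == 0xF0)
		return 4;
	return 1;
}

/*
 * Hypothetical helper: character offset of byte position p inside the
 * NUL-terminated UTF-8 string s.  This is the kind of count an acme
 * address such as #4 refers to.
 */
int
char_offset(const char *s, const char *p)
{
	int n, w;

	n = 0;
	while(s < p && *s != '\0'){
		w = utf8_width((unsigned char)*s);
		/* never step past the terminating NUL on a truncated sequence */
		while(w-- > 0 && *s != '\0')
			s++;
		n++;
	}
	return n;
}
```

In "año nuevo", for example, the word "nuevo" starts at byte 5 but at
character 4, since ñ takes two bytes in UTF-8.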


