From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <f631016df731e553421e6079dd1da0d4@quintile.net>
References: <f631016df731e553421e6079dd1da0d4@quintile.net>
Date: Fri, 30 Oct 2009 21:53:36 +0100
Message-ID: <56a297000910301353p29baf584g3305d9548215e1f7@mail.gmail.com>
From: Noah Evans <noah.evans@gmail.com>
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 94da013e-ead5-11e9-9d60-3106f5b1d025

This kind of problem is character processing, which I would argue is
C's domain. You can massage awk and sed to do the job for you, but at
least for me it's conceptually simpler to just bang out the following
C program:

#include <u.h>
#include <libc.h>
#include <bio.h>

#define	isupper(r)	(L'A' <= (r) && (r) <= L'Z')
#define	islower(r)	(L'a' <= (r) && (r) <= L'z')
#define	isalpha(r)	(isupper(r) || islower(r))
#define	isspace(r)	((r) == L' ' || (r) == L'\t' \
			|| (0x0A <= (r) && (r) <= 0x0D))
#define	toupper(r)	((r)-'a'+'A')

void
usage(char *me)
{
	fprint(2, "%s: usage\n", me);
	exits("usage");
}

void
main(int argc, char **argv)
{
	Biobuf in, out;
	int c, waswhite, nwords;

	ARGBEGIN{
	default:
		usage(argv0);	/* argv[0] here is the bad flag, not the program name */
	}ARGEND;
	Binit(&in, 0, OREAD);
	Binit(&out, 1, OWRITE);

	waswhite = 1;	/* start of input counts as following white space */
	nwords = 0;
	while((c = Bgetc(&in)) != Beof){
		if(isalpha(c) && waswhite && nwords < 2){
			if(islower(c))
				c = toupper(c);
			nwords++;
		}
		if(isspace(c))
			waswhite = 1;
		else
			waswhite = 0;
		if(c == '\n')
			nwords = 0;
		Bputc(&out, c);
	}
	exits(0);
}
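
For anyone without u.h and bio.h to hand, here is a rough sketch of the
same state machine in portable ISO C. The name capwords and its maxwords
parameter are made up for illustration and are not part of any library.
It works on a buffer in place rather than on a stream, and it assumes the
default "C" locale, so like the program above it only knows about ASCII
letters.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * capwords is a hypothetical helper, not a library routine.
 * It upper-cases the first letter of the first maxwords words on each
 * line of the NUL-terminated string s, in place.  As in the program
 * above, a "word" only counts if it starts with a letter.
 */
void
capwords(char *s, int maxwords)
{
	int waswhite = 1;	/* start of input counts as following white space */
	int nwords = 0;		/* words capitalised so far on this line */

	assert(s != NULL);
	for(; *s != '\0'; s++){
		unsigned char c = (unsigned char)*s;

		if(isalpha(c) && waswhite && nwords < maxwords){
			*s = (char)toupper(c);
			nwords++;
		}
		waswhite = isspace(c) != 0;
		if(c == '\n')
			nwords = 0;	/* new line, start counting again */
	}
}
```

To use it as a filter, read each line with fgets, call capwords(line, 2),
and write the result back out with fputs.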

Noah


On Thu, Oct 29, 2009 at 4:41 PM, Steve Simon <steve@quintile.net> wrote:
> Sorry, not really the place for such questions but...
>
> I always struggle with sed, awk is easy but sed makes my head hurt.
>
> I am trying to capitalise the first two words on each line (I could use awk
> as well but I have to use sed so it seems churlish to start another process).
>
> capitalising the first word on the line is easy enough:
>
> 	h
> 	s/^(.).*/\1/
> 	y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
> 	x
> 	s/^.(.*)/\1/
> 	x
> 	G
> 	s/\n//
>
> Though there may be a much easier/more elegant way to do this,
> but for the 2nd word it gets much harder.
>
> What I really want is sam's ability to select a letter and operate on it
> rather than everything being line based as sed seems to be.
>
> any neat solutions? (extra points awarded for use of the branch operator :-)
>
> -Steve
>
>