From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID:
From: "Steve Simon"
Date: Thu, 29 Oct 2009 15:41:48 +0000
To: 9fans@9fans.net
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: [9fans] sed question (OT)
Topicbox-Message-UUID: 9353ea50-ead5-11e9-9d60-3106f5b1d025

Sorry, not really the place for such questions but...

I always struggle with sed, awk is easy but sed makes my head hurt.

I am trying to capitalise the first two words on each line (I could use awk
as well but I have to use sed so it seems churlish to start another process).

capitalising the first word on the line is easy enough:

        h
        s/^(.).*/\1/
        y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
        x
        s/^.(.*)/\1/
        x
        G
        s/\n//

Though there may be a much easier/more elegant way to do this,
but for the 2nd word it gets much harder.

What I really want is sam's ability to select a letter and operate on it
rather than everything being line based as sed seems to be.

any neat solutions? (extra points awarded for use of the branch operator :-)

-Steve

From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To:
References:
Date: Thu, 29 Oct 2009 16:06:40 +0000
Message-ID: <80c99e790910290906t36766978kcd38c9583392e038@mail.gmail.com>
From: Lorenzo Bolla
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: multipart/alternative; boundary=000e0cd242b0c9f7f204771518cc
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 93602108-ead5-11e9-9d60-3106f5b1d025

--000e0cd242b0c9f7f204771518cc
Content-Type: text/plain; charset=ISO-8859-1

To capitalize the first letter of each line wouldn't this be enough?

s/^./\u&/

L.

On Thu, Oct 29, 2009 at 3:41 PM, Steve Simon wrote:
> Sorry, not really the place for such questions but...
>
> I always struggle with sed, awk is easy but sed makes my head hurt.
>
> I am trying to capitalise the first two words on each line (I could use awk
> as well but I have to use sed so it seems churlish to start another
> process).
>
> capitalising the first word on the line is easy enough:
>
>         h
>         s/^(.).*/\1/
>         y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
>         x
>         s/^.(.*)/\1/
>         x
>         G
>         s/\n//
>
> Though there may be a much easier/more elegant way to do this,
> but for the 2nd word it gets much harder.
>
> What I really want is sam's ability to select a letter and operate on it
> rather than everything being line based as sed seems to be.
>
> any neat solutions? (extra points awarded for use of the branch operator
> :-)
>
> -Steve
>
>

--000e0cd242b0c9f7f204771518cc
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

To capitalize the first letter of each line wouldn't this be enough?
s/^./\u&/

L.


--000e0cd242b0c9f7f204771518cc--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Thu, 29 Oct 2009 12:08:18 -0400
To: 9fans@9fans.net
Message-ID: <5f7c2191ead171fd9096d059fdba6879@brasstown.quanstro.net>
In-Reply-To: <<80c99e790910290906t36766978kcd38c9583392e038@mail.gmail.com>>
References: <<80c99e790910290906t36766978kcd38c9583392e038@mail.gmail.com>>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 93643e96-ead5-11e9-9d60-3106f5b1d025

> To capitalize the first letter of each line wouldn't this be enough?
>
> s/^./\u&/

; echo abc def | sed 's/^.\u&/'
sed: s command garbled: s/^.\u&/

- erik

From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4AE9BE25.1090001@conducive.org>
Date: Fri, 30 Oct 2009 00:09:09 +0800
From: W B Hacker
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1.23) Gecko/20090823 SeaMonkey/1.1.18
MIME-Version: 1.0
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
References:
In-Reply-To:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 935921aa-ead5-11e9-9d60-3106f5b1d025

Steve Simon wrote:
> Sorry, not really the place for such questions but...
>
> I always struggle with sed, awk is easy but sed makes my head hurt.
>
> I am trying to capitalise the first two words on each line (I could use awk
> as well but I have to use sed so it seems churlish to start another process).
>
> capitalising the first word on the line is easy enough:
>
>         h
>         s/^(.).*/\1/
>         y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
>         x
>         s/^.(.*)/\1/
>         x
>         G
>         s/\n//
>
> Though there may be a much easier/more elegant way to do this,
> but for the 2nd word it gets much harder.
>
> What I really want is sam's ability to select a letter and operate on it
> rather than everything being line based as sed seems to be.
>
> any neat solutions? (extra points awarded for use of the branch operator :-)
>
> -Steve
>
>

I'd be sore tempted to move the needful files into an environment where I
could use multiple passes of 'rpl' (or 'back in the day' BRIEF).

BFBI .. far less capable tools, perhaps - BUT by the time you've figured out
how to even *tell* awk or sed what to do, I'm working on some other task...

'If at first you don't succeed - cheat'

YMMV,

Bill

From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <5f7c2191ead171fd9096d059fdba6879@brasstown.quanstro.net>
References: <5f7c2191ead171fd9096d059fdba6879@brasstown.quanstro.net>
Date: Thu, 29 Oct 2009 14:29:41 -0200
Message-ID:
From: Iruata Souza
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: text/plain; charset=UTF-8
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 936845a4-ead5-11e9-9d60-3106f5b1d025

On Thu, Oct 29, 2009 at 2:08 PM, erik quanstrom wrote:
>> To capitalize the first letter of each line wouldn't this be enough?
>>
>> s/^./\u&/
>
> ; echo abc def | sed 's/^.\u&/'
> sed: s command garbled: s/^.\u&/
>

i guess you missed the second slash

From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Thu, 29 Oct 2009 12:31:41 -0400
To: 9fans@9fans.net
Message-ID: <98f84e4cfbb56eb6e93bdd18a3360e9a@brasstown.quanstro.net>
In-Reply-To: <>
References: <>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 936fdb84-ead5-11e9-9d60-3106f5b1d025

On Thu Oct 29 12:31:23 EDT 2009, iru.muzgo@gmail.com wrote:
> On Thu, Oct 29, 2009 at 2:08 PM, erik quanstrom wrote:
> >> To capitalize the first letter of each line wouldn't this be enough?
> >>
> >> s/^./\u&/
> >
> > ; echo abc def | sed 's/^.\u&/'
> > sed: s command garbled: s/^.\u&/
> >
>
> i guess you missed the second slash
>

now it is less helpful:

; echo abc def | sed 's/^./\u&/'
uabc def

- erik

From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <80c99e790910290906t36766978kcd38c9583392e038@mail.gmail.com>
References: <80c99e790910290906t36766978kcd38c9583392e038@mail.gmail.com>
Date: Thu, 29 Oct 2009 14:33:29 -0200
Message-ID:
From: Iruata Souza
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: text/plain; charset=UTF-8
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 936c09f0-ead5-11e9-9d60-3106f5b1d025

On Thu, Oct 29, 2009 at 2:06 PM, Lorenzo Bolla wrote:
> To capitalize the first letter of each line wouldn't this be enough?
> s/^./\u&/
>
> L.

% echo rwrong | sed 's/^./\u&/'
urwrong

From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To:
References: <80c99e790910290906t36766978kcd38c9583392e038@mail.gmail.com>
Date: Thu, 29 Oct 2009 16:42:54 +0000
Message-ID: <80c99e790910290942n5e0dd61aq6ae1b78395946110@mail.gmail.com>
From: Lorenzo Bolla
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: multipart/alternative; boundary=000e0cd2965e59de570477159a2f
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 9373e986-ead5-11e9-9d60-3106f5b1d025

--000e0cd2965e59de570477159a2f
Content-Type: text/plain; charset=ISO-8859-1

I forgot the "9".
This works for GNU sed version 4.2.1

L.

On Thu, Oct 29, 2009 at 4:33 PM, Iruata Souza wrote:
> On Thu, Oct 29, 2009 at 2:06 PM, Lorenzo Bolla wrote:
> > To capitalize the first letter of each line wouldn't this be enough?
> > s/^./\u&/
> >
> > L.
>
> % echo rwrong | sed 's/^./\u&/'
> urwrong
>
>

--000e0cd2965e59de570477159a2f
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

I forgot the "9".
This works for GNU sed version 4.2.1
L.
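
For reference, a sketch of why the one-liner gave different results for different posters. The \u escape in the replacement is a GNU sed extension; the Plan 9 sed transcripts in this thread show it coming out as a literal "u". The wrapper below probes the local sed and falls back to a hold-space method when \u is not understood. The name cap1 is made up for illustration, and the fallback is written for a POSIX sed, so groups are \(...\) rather than the bare (...) that Plan 9 sed uses.

```sh
#!/bin/sh
# cap1: hypothetical helper, not from the thread.
# Upper-cases the first character of every line read from stdin.
cap1() {
    # Probe: a sed that understands the GNU \u escape turns "a" into "A".
    if [ "$(printf 'a\n' | sed 's/a/\u&/' 2>/dev/null)" = "A" ]; then
        sed 's/^./\u&/'
    else
        # Portable fallback: save the line, reduce the pattern space to
        # its first character, upper-case it with y, then glue the saved
        # line back on minus its original first character.
        sed '
/^$/b
h
s/^\(.\).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/\n.//
'
    fi
}

printf 'abc def\n' | cap1
```

Either branch should print "Abc def" for the sample line.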

--000e0cd2965e59de570477159a2f--

From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To:
References:
Date: Thu, 29 Oct 2009 13:52:55 -0500
Message-ID:
From: Jason Catena
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: text/plain; charset=ISO-8859-1
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 9387f34a-ead5-11e9-9d60-3106f5b1d025

> Sorry, not really the place for such questions but...

Try stackoverflow.com. They delight in problems such as these.

> I am trying to capitalise the first two words on each line

I store the original line with h, and then pull it back out repeatedly
with G to mangle it. I got far enough to translate "first second ..."
to "First s" with this:

h
s/^(.).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/^.([^ ]+ ).*/\1/
s/^.([^ ]+)$/\1/
G
s/^.[^ ]+ (.).*/\1/
#y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
#3y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
s/\n//g

There's a couple problems. (1) It doesn't handle the case with only
one word on a line, because it's hard to tell, later on, that I pulled
out the single word once already. (2) I'd like to put in one of the
commented-out y commands, but (2a) the first uppercases the entire
pattern space, and (2b) the second refers to line 3 of the entire
file, not line 3 of the pattern space.
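
A sketch of one way round both problems, under some assumptions: a POSIX sed (so groups are written \(...\) rather than the bare (...) used above), words separated by spaces, and no leading blanks. The idea is to cut the pattern space down to the single letter before running y, so y has nothing else to touch, and to test for a second word before going after it. The name cap2 is made up for illustration.

```sh
#!/bin/sh
# cap2: hypothetical helper, not from the thread.
# Upper-cases the first letter of the first two space-separated words.
cap2() {
    sed '
# leave empty lines alone
/^$/b
# first word: keep the line in the hold space and cut the pattern space
# down to its first character, so y only ever sees that one character
h
s/^\(.\).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
# append the saved line, then drop the newline and the old first character
G
s/\n.//
# one-word lines are finished: branch to the end of the script
/^[^ ][^ ]*  *[^ ]/!b
# second word: same trick, this time isolating its first character
h
s/^[^ ][^ ]*  *\(.\).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
# pattern space is now LETTER, newline, saved line: splice LETTER in
s/^\(.\)\n\([^ ][^ ]*  *\).\(.*\)/\2\1\3/
'
}

printf 'first second third\nonly\n' | cap2
```

For the two sample lines this prints "First Second third" and "Only". Words that are already capitalised pass through y unchanged, so they still count as one of the two words.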
> -Steve

Jason Catena

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 30 Oct 2009 13:35:32 +0000
From: Eris Discordia
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Message-ID:
In-Reply-To:
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 94429e66-ead5-11e9-9d60-3106f5b1d025

Listing of file 'sedscr:'

> s/^/ /;
> s/$/aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ/;
> s/ \([a-z]\)\(.*\1\)\(.\)/ \3\2\3/;
> s/ \([a-z]\)\(.*\1\)\(.\)/ \3\2\3/;
> s/.\{52\}$//;
> s/ //;

$ echo This is a test | sed -f sedscr
This Is a test

$ echo someone forgot to capitalize | sed -f sedscr
Someone Forgot to capitalize

This works with '/usr/bin/sed' from a FreeBSD 6.2-RELEASE installation.

Above sed script stolen from:

With a minor change: first three words to first two words.

--On Thursday, October 29, 2009 15:41 +0000 Steve Simon wrote:

> Sorry, not really the place for such questions but...
>
> I always struggle with sed, awk is easy but sed makes my head hurt.
>
> I am trying to capitalise the first two words on each line (I could use
> awk as well but I have to use sed so it seems churlish to start another
> process).
>
> capitalising the first word on the line is easy enough:
>
>         h
>         s/^(.).*/\1/
>         y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
>         x
>         s/^.(.*)/\1/
>         x
>         G
>         s/\n//
>
> Though there may be a much easier/more elegant way to do this,
> but for the 2nd word it gets much harder.
>
> What I really want is sam's ability to select a letter and operate on it
> rather than everything being line based as sed seems to be.
>
> any neat solutions?
> (extra points awarded for use of the branch operator
> :-)
>
> -Steve
>

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 30 Oct 2009 13:39:27 +0000
From: Eris Discordia
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Message-ID: <2AC4018ADBC89F5A056B73E2@[192.168.1.2]>
In-Reply-To:
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 944ade64-ead5-11e9-9d60-3106f5b1d025

The script has a small "bug" one might say: it capitalizes the first two
words on a line that are _not_ already capitalized. If one of the first two
words is capitalized then the third will get capitalized.

--On Thursday, October 29, 2009 15:41 +0000 Steve Simon wrote:

> Sorry, not really the place for such questions but...
>
> I always struggle with sed, awk is easy but sed makes my head hurt.
>
> I am trying to capitalise the first two words on each line (I could use
> awk as well but I have to use sed so it seems churlish to start another
> process).
>
> capitalising the first word on the line is easy enough:
>
>         h
>         s/^(.).*/\1/
>         y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
>         x
>         s/^.(.*)/\1/
>         x
>         G
>         s/\n//
>
> Though there may be a much easier/more elegant way to do this,
> but for the 2nd word it gets much harder.
>
> What I really want is sam's ability to select a letter and operate on it
> rather than everything being line based as sed seems to be.
>
> any neat solutions?
> (extra points awarded for use of the branch operator
> :-)
>
> -Steve
>

From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-version: 1.0
Content-transfer-encoding: 7BIT
Content-type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Message-id:
From: dave.l@mac.com
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
In-reply-to:
Date: Fri, 30 Oct 2009 15:29:29 +0000
References:
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 947073d6-ead5-11e9-9d60-3106f5b1d025

You can do it, definitely.

Caveat: I'm in bed with a virus and the brain's on impulse power so
these are untested and may be highly suboptimal.

Is the input guaranteed to have 2 words on each line?
What are your definitions of words and blanks?
I know from your snippet that there's no leading blanks and no empty lines.

Assuming there are 2 words on every line, something like:

h
s/[A-Za-z0-9_-]+(.).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/(.)\n([A-Za-z0-9_-]+).(.*)/\2\1\3/

ought to roughly work after your fragment.

If >= 2 words per line isn't assumed:

h
t urnofflag
: urnofflag
s/[A-Za-z0-9_-]+[^ A-Za-z0-9_-]*(.).*/\1/
t for2
b cosnot2wds
: for2
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/(.)\n([A-Za-z0-9_-]+[^ A-Za-z0-9_-]*).(.*)/\2\1\3/
b
: cosnot2wds
g

Bizarrely, within its limitations (\n, \0, size limits), sed is, in some
sense, complete, since you can store any number of things in the spaces
(using /(.* \n)/ etc.) and branch conditionally.

Another insane possibility, since there are only 26 variations, is to do:

s/^a/A/
s/^([A-Z][A-Za-z0-9]+[^ A-Za-z0-9_-]*)a/\1A/
s/^b/B/
s/^([A-Z][A-Za-z0-9]+[^ A-Za-z0-9_-]*)b/\1B/

You can of course, use sed to create the above script like so:

echo abcdefghijklmnopqrstuvwxyz | sed ...

Filling in the ellipses is left as an exercise for the already addled reader.
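
One possible way to fill in that exercise, offered as a sketch rather than as what was intended above. It assumes a POSIX sed, so the generated commands use \(...\) groups, and it simplifies the second-word pattern to "first word, then spaces" instead of the character classes shown above. The name gen_cap2_script is invented for illustration. The first sed puts each letter on its own line; the second turns each letter into the pair of s commands for it.

```sh
#!/bin/sh
# gen_cap2_script: hypothetical helper, not from the thread.
# Prints 52 s commands: for each letter, one for the start of the line
# and one for the start of the second space-separated word.
gen_cap2_script() {
    printf '%s\n' abcdefghijklmnopqrstuvwxyz |
    sed 's/./&\
/g' |
    sed -n '/./{
h
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s|^\(.\)\n\(.\)$|s/^\2/\1/\
s/^\\([^ ][^ ]*  *\\)\2/\\1\1/|
p
}'
}

# Generate the script once, then run it over a sample line.
script=$(gen_cap2_script)
printf 'someone forgot to capitalize\n' | sed "$script"
```

The first two generated lines are s/^a/A/ and s/^\([^ ][^ ]*  *\)a/\1A/, and the sample line comes out as "Someone Forgot to capitalize". Each generated command only matches a lowercase letter, so a word that is already capitalised is left alone and still counts as one of the two.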

BTW: if you're shovelling a lot of this kind of muck, it may,
paradoxically, be easier to do it on the command line and use your
shell's variables for the repeated bits of regexps, commands etc.
The only caveats are that this technique will curdle your brain even
more than sed already does and it may, oddly, be the exception to the
rule that rc is more elegant than sh, due to caret vs. double-quotes.

Apologies for grandstanding, but I used to do this sort of stuff for a
living. I wrote a piece of training courseware for sed once which had
far worse excesses than the above as examples.
RFC-822 header-reassembly anyone?
I also used to get my intellectual rocks off on stuff like this until I
finally grew up (in my late 40s).

Dave.

SEE ALSO
teco, assembler, qed.

On 29 Oct 2009, at 15:41, Steve Simon wrote:

> Sorry, not really the place for such questions but...
>
> I always struggle with sed, awk is easy but sed makes my head hurt.
>
> I am trying to capitalise the first two words on each line (I could
> use awk as well but I have to use sed so it seems churlish to start
> another process).
>
> capitalising the first word on the line is easy enough:
>
>         h
>         s/^(.).*/\1/
>         y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
>         x
>         s/^.(.*)/\1/
>         x
>         G
>         s/\n//
>
> Though there may be a much easier/more elegant way to do this,
> but for the 2nd word it gets much harder.
>
> What I really want is sam's ability to select a letter and operate
> on it rather than everything being line based as sed seems to be.
>
> any neat solutions?
> (extra points awarded for use of the branch
> operator :-)
>
> -Steve
>

From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Fri, 30 Oct 2009 12:16:40 -0400
To: 9fans@9fans.net
Message-ID: <22825c9783dff37055ec2a815d0b79e3@brasstown.quanstro.net>
In-Reply-To: <>
References: <>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 94998dc0-ead5-11e9-9d60-3106f5b1d025

On Fri Oct 30 11:31:24 EDT 2009, dave.l@mac.com wrote:
> You can do it, definitely.
>

well played!

- erik

From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4AEB22A7.7010109@conducive.org>
Date: Sat, 31 Oct 2009 01:30:15 +0800
From: W B Hacker
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1.23) Gecko/20090823 SeaMonkey/1.1.18
MIME-Version: 1.0
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
References: <2AC4018ADBC89F5A056B73E2@[192.168.1.2]>
In-Reply-To: <2AC4018ADBC89F5A056B73E2@[192.168.1.2]>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 94a99b8e-ead5-11e9-9d60-3106f5b1d025

Eris Discordia wrote:
> The script has a small "bug" one might say: it capitalizes the first two
> words on a line that are _not_ already capitalized. If one of the first
> two words is capitalized then the third will get capitalized.

Call me a Dinosaur, but - so long as it is ASCII or EBCDIC it is relatively
trivial to implement that in hardware AND NOT have the issue of altering any
but the first two words AND NOT have issues where there is only one word or
a numeral or punctuation or hidden/control character rather than alpha.

Hint: Among other simple stuff, needs XOR capability.

'Dinosaur' 'coz the last time I did one of the key portions of it was
converting a Data Printer CT-1064 chaintrain from HP-3000 MKIII use to work
with an S-100 Z-80.
That capitalized *every* alpha character, but took just two 74-series IC's
to replace a pair of lookup-table PROMS.

One would need to add logic to detect space or newline, set/unset a few
latches - not a lot more.

Could have built it in less time than this thread has been running...

;-)

Bill

>
> --On Thursday, October 29, 2009 15:41 +0000 Steve Simon wrote:
>
>> Sorry, not really the place for such questions but...
>>
>> I always struggle with sed, awk is easy but sed makes my head hurt.
>>
>> I am trying to capitalise the first two words on each line (I could use
>> awk as well but I have to use sed so it seems churlish to start another
>> process).
>>
>> capitalising the first word on the line is easy enough:
>>
>>         h
>>         s/^(.).*/\1/
>>         y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
>>         x
>>         s/^.(.*)/\1/
>>         x
>>         G
>>         s/\n//
>>
>> Though there may be a much easier/more elegant way to do this,
>> but for the 2nd word it gets much harder.
>>
>> What I really want is sam's ability to select a letter and operate on it
>> rather than everything being line based as sed seems to be.
>>
>> any neat solutions? (extra points awarded for use of the branch operator
>> :-)
>>
>> -Steve
>>
>
>

From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To:
References:
Date: Fri, 30 Oct 2009 21:53:36 +0100
Message-ID: <56a297000910301353p29baf584g3305d9548215e1f7@mail.gmail.com>
From: Noah Evans
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 94da013e-ead5-11e9-9d60-3106f5b1d025

This kind of problem is character processing, which I would argue is C's
domain.
You can massage awk and sed to do the job for you, but at least for me
it's conceptually simpler to just bang out the following C program:

#include <u.h>
#include <libc.h>
#include <bio.h>

#define isupper(r)      (L'A' <= (r) && (r) <= L'Z')
#define islower(r)      (L'a' <= (r) && (r) <= L'z')
#define isalpha(r)      (isupper(r) || islower(r))
#define isspace(r)      ((r) == L' ' || (r) == L'\t' \
                        || (0x0A <= (r) && (r) <= 0x0D))
#define toupper(r)      ((r)-'a'+'A')

void
usage(char *me)
{
        fprint(2, "%s: usage\n", me);
}

void
main(int argc, char **argv)
{
        Biobuf in, out;
        int c, waswhite, nwords;

        ARGBEGIN{
        default:
                usage(argv[0]);
        }ARGEND;
        Binit(&in, 0, OREAD);
        Binit(&out, 1, OWRITE);

        waswhite = 0;
        nwords = 0;
        while((c = Bgetc(&in)) != Beof){
                if(isalpha(c))
                if(waswhite)
                if(nwords < 2){
                        if(islower(c))
                                c = toupper(c);
                        nwords++;
                }
                if(isspace(c))
                        waswhite = 1;
                else
                        waswhite = 0;
                if(c == '\n')
                        nwords = 0;
                Bputc(&out, c);
        }
        exits(0);
}

Noah

On Thu, Oct 29, 2009 at 4:41 PM, Steve Simon wrote:
> Sorry, not really the place for such questions but...
>
> I always struggle with sed, awk is easy but sed makes my head hurt.
>
> I am trying to capitalise the first two words on each line (I could use awk
> as well but I have to use sed so it seems churlish to start another process).
>
> capitalising the first word on the line is easy enough:
>
>         h
>         s/^(.).*/\1/
>         y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
>         x
>         s/^.(.*)/\1/
>         x
>         G
>         s/\n//
>
> Though there may be a much easier/more elegant way to do this,
> but for the 2nd word it gets much harder.
>
> What I really want is sam's ability to select a letter and operate on it
> rather than everything being line based as sed seems to be.
>
> any neat solutions? (extra points awarded for use of the branch operator :-)
>
> -Steve
>
>

From mboxrd@z Thu Jan 1 00:00:00 1970
To: 9fans@9fans.net
Date: Wed, 11 Nov 2009 12:32:41 +0000
From: frankg
Message-ID: <33e7e0a4-5181-4f3f-a830-65a9d553f92e@g22g2000prf.googlegroups.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
References:
Subject: Re: [9fans] sed question (OT)
Topicbox-Message-UUID: 9967b9d0-ead5-11e9-9d60-3106f5b1d025

On Oct 30, 12:58 pm, noah.ev...@gmail.com (Noah Evans) wrote:
> This kind of problem is character processing, which I would argue is
> C's domain. You can massage awk and sed to do the job for you, but at
> least for me it's conceptually simpler to just bang out the following
> C program:
>
> #include <u.h>
> #include <libc.h>
> #include <bio.h>
>
> #define isupper(r)      (L'A' <= (r) && (r) <= L'Z')
> #define islower(r)      (L'a' <= (r) && (r) <= L'z')
> #define isalpha(r)      (isupper(r) || islower(r))
> #define isspace(r)      ((r) == L' ' || (r) == L'\t' \
>                         || (0x0A <= (r) && (r) <= 0x0D))
> #define toupper(r)      ((r)-'a'+'A')
>
> void
> usage(char *me)
> {
>         fprint(2, "%s: usage\n", me);
> }
>
> void
> main(int argc, char **argv)
> {
>         Biobuf in, out;
>         int c, waswhite, nwords;
>
>         ARGBEGIN{
>         default:
>                 usage(argv[0]);
>         }ARGEND;
>         Binit(&in, 0, OREAD);
>         Binit(&out, 1, OWRITE);
>
>         waswhite = 0;
>         nwords = 0;
>         while((c = Bgetc(&in)) != Beof){
>                 if(isalpha(c))
>                 if(waswhite)
>                 if(nwords < 2){
>                         if(islower(c))
>                                 c = toupper(c);
>                         nwords++;
>                 }
>                 if(isspace(c))
>                         waswhite = 1;
>                 else
>                         waswhite = 0;
>                 if(c == '\n')
>                         nwords = 0;
>                 Bputc(&out, c);
>         }
>         exits(0);
> }
>
> Noah
>

Simple, and wrong. You need to initialize waswhite to 1, not 0.