9fans - fans of the OS Plan 9 from Bell Labs
* Re: [9fans] lan9 rc-shell level regex vs utf values questions
       [not found] <CAEDadrw7iswfd2JQrxSypSBp0H+3fAP8k7AtiJn150a71=kPKA@mail.gmail.c>
@ 2012-06-07 19:05 ` erik quanstrom
  2012-06-08  9:03   ` Ethan Grammatikidis
  0 siblings, 1 reply; 5+ messages in thread
From: erik quanstrom @ 2012-06-07 19:05 UTC
  To: rhoyerboat, 9fans

On Thu Jun  7 14:50:38 EDT 2012, rhoyerboat@gmail.com wrote:

> Access utf values and echo from indexes on rc?
> 
> Before I write any code over the issue, does anyone have a grep, sed, or
> split application which matches by utf character values like \U2424 instead
> of whatever built-in token like \n ?

do you mean that instead of matching lines you want to match records delimited
by \u2424?  if so, you can use sam or acme structured regular expressions (sres), or
you can just tr ␤ '\n'.  if you want to save the original newlines, you can first
change those to an unused character in your text.
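
The newline-preserving swap described above can be sketched in a POSIX shell. This is only an illustration under assumptions: \037 stands in for the "unused character" (pick anything absent from your text), the sample input is made up, and awk's gsub does the ␤ replacement because some tr implementations outside Plan 9 (GNU tr, for one) work on bytes rather than runes.

```shell
# made-up sample: two records separated by ␤ (U+2424);
# the first record contains a real newline we want to keep.
input=$(printf 'alpha\nbeta␤gamma')

# 1. hide the original newlines behind \037 (assumed absent from the text)
# 2. turn each ␤ into a newline, giving one record per line
records=$(printf '%s' "$input" | tr '\n' '\037' |
	awk '{ gsub(/␤/, "\n"); print }')

# line-oriented tools now see records; here we just count them
count=$(printf '%s\n' "$records" | awk 'END { print NR }')

# pull out the first record and restore its original newline
first=$(printf '%s\n' "$records" | sed -n 1p | tr '\037' '\n')

echo "$count records; first record:"
echo "$first"
```

Any grep or sed step that should work per record would go between the two conversions, while the text is still one record per line.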

as cinap mentions, there's no need to escape codepoints >= 0x80.  rc treats
'em all as the same class of symbol as a-z.

- erik




* Re: [9fans] lan9 rc-shell level regex vs utf values questions
  2012-06-07 19:05 ` [9fans] lan9 rc-shell level regex vs utf values questions erik quanstrom
@ 2012-06-08  9:03   ` Ethan Grammatikidis
  2012-06-08 11:12     ` Ethan Grammatikidis
  0 siblings, 1 reply; 5+ messages in thread
From: Ethan Grammatikidis @ 2012-06-08  9:03 UTC
  To: 9fans

On Thu, 7 Jun 2012 15:05:13 -0400
erik quanstrom <quanstro@quanstro.net> wrote:

> do you mean that instead of matching lines you want to match records delimited
> by \u2424?


Also awk 'RS=␤', or awk -F ␤ where -F is field separator.




* Re: [9fans] lan9 rc-shell level regex vs utf values questions
  2012-06-08  9:03   ` Ethan Grammatikidis
@ 2012-06-08 11:12     ` Ethan Grammatikidis
  0 siblings, 0 replies; 5+ messages in thread
From: Ethan Grammatikidis @ 2012-06-08 11:12 UTC
  To: 9fans

On Fri, 8 Jun 2012 10:03:09 +0100
Ethan Grammatikidis <eekee57@fastmail.fm> wrote:

> On Thu, 7 Jun 2012 15:05:13 -0400
> erik quanstrom <quanstro@quanstro.net> wrote:
> 
> > do you mean that instead of matching lines you want to match records delimited
> > by \u2424?
> 
> 
> Also awk 'RS=␤', or awk -F ␤ where -F is field separator.
> 


my bad, awk -v 'RS=␤' or put RS="␤" in the BEGIN block. You can also set FS instead of using -F if you like.
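
A short sketch of the corrected invocations, for a POSIX shell. Assumptions: the sample data is invented, and the awk in use accepts a multi-byte RS (gawk and mawk do; POSIX leaves an RS of more than one character unspecified, so an awk that works byte-wise may behave differently). The bare awk 'RS=␤' form fails because the first non-option argument is taken as the program text, not as a variable assignment.

```shell
# invented sample: three records separated by ␤; the second spans two lines
data=$(printf 'one␤two\nlines␤three')

# -v sets RS before any input is read
nrec=$(printf '%s' "$data" | awk -v 'RS=␤' 'END { print NR }')

# same thing with the assignment in a BEGIN block (note the quotes around ␤)
second=$(printf '%s' "$data" | awk 'BEGIN { RS = "␤" } NR == 2 { print }')

# -F (or FS) splits fields instead of records
field=$(printf 'a␤b␤c\n' | awk -F '␤' '{ print $2 }')

echo "$nrec records; second record is:"
echo "$second"
echo "second field: $field"
```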




* Re: [9fans] lan9 rc-shell level regex vs utf values questions
  2012-06-07 18:49 andrew zerger
@ 2012-06-07 18:59 ` cinap_lenrek
  0 siblings, 0 replies; 5+ messages in thread
From: cinap_lenrek @ 2012-06-07 18:59 UTC
  To: rhoyerboat, 9fans

in plan9, you hit [Alt] then type X2424

echo '␤'

alternatively, you can run:

unicode 2424

also, no \n needed:

echo '
This
Is
a

Test'

there's no need to escape anything other than the quotes.
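
For readers trying this outside Plan 9, the same points seem to hold in a POSIX shell, with one fallback for when the character cannot be typed. A hedged sketch: the variable names are illustrative, and the octal bytes are simply U+2424 encoded as UTF-8 (E2 90 A4), so a UTF-8 environment is assumed.

```shell
# the literal character needs no escaping inside single quotes
lit='␤'

# fallback when it cannot be typed: emit the UTF-8 bytes of U+2424
# (0xE2 0x90 0xA4) as octal escapes, which POSIX printf understands
esc=$(printf '\342\220\244')

# newlines inside single quotes are kept as-is, so no \n is needed
text='This
Is
a

Test'
lines=$(printf '%s\n' "$text" | awk 'END { print NR }')

echo "$lit $esc"
echo "$lines lines"
```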

--
cinap




* [9fans] lan9 rc-shell level regex vs utf values questions
@ 2012-06-07 18:49 andrew zerger
  2012-06-07 18:59 ` cinap_lenrek
  0 siblings, 1 reply; 5+ messages in thread
From: andrew zerger @ 2012-06-07 18:49 UTC
  To: Fans of the OS Plan 9 from Bell Labs


Access utf values and echo from indexes on rc?

Before I write any code over the issue, does anyone have a grep, sed, or
split application which matches by utf character values like \U2424 instead
of whatever built-in token like \n ?

Or another basic question I cannot get the manuals to answer yet, how to
'echo \U2424' value?

Just seems like this would have been done already, so asking,
tyvm




-- 
⎼⎺⎺├@┼␊├├≤-␍⎼␊▒␍:/␤⎺└␊/⎼␤⎺#

