9fans - fans of the OS Plan 9 from Bell Labs
* Re: [9fans] plan 9 regexp
@ 2009-06-03 23:41 Francisco J Ballesteros
  2009-06-03 23:56 ` erik quanstrom
  0 siblings, 1 reply; 30+ messages in thread
From: Francisco J Ballesteros @ 2009-06-03 23:41 UTC (permalink / raw)
  To: 9fans

I have a ssam script that does the work. But it's not really streaming.

On 04/06/2009, at 1:36, jrm8005@gmail.com wrote:

> Speaking of regexes in Plan 9, did the "structural awk" or "stream
> sam" Rob dreamed of in the SE paper ever get realized?
>
> [/mail/box/nemo/msgs/200906/41493]



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [9fans] plan 9 regexp
  2009-06-03 23:41 [9fans] plan 9 regexp Francisco J Ballesteros
@ 2009-06-03 23:56 ` erik quanstrom
  2009-06-04  1:22   ` J.R. Mauro
  2009-06-04 21:35   ` Dan Cross
  0 siblings, 2 replies; 30+ messages in thread
From: erik quanstrom @ 2009-06-03 23:56 UTC (permalink / raw)
  To: 9fans

On Wed Jun  3 19:41:39 EDT 2009, nemo@lsub.org wrote:
> I have a ssam script that does the work. But it's not really streaming.
>
> On 04/06/2009, at 1:36, jrm8005@gmail.com wrote:
>
> > Speaking of regexes in Plan 9, did the "structural awk" or "stream
> > sam" Rob dreamed of in the SE paper ever get realized?
> >
> > [/mail/box/nemo/msgs/200906/41493]
>

here's a pointer to the previous discussion of ssam, which
does exist for unix:

http://9fans.net/archive/2003/10/309

"structural awk" is still a tempting idea.  but why not
just go wild and implement a shell with sres?

- erik




* Re: [9fans] plan 9 regexp
  2009-06-03 23:56 ` erik quanstrom
@ 2009-06-04  1:22   ` J.R. Mauro
  2009-06-04 11:43     ` Martin Neubauer
  2009-06-04 21:35   ` Dan Cross
  1 sibling, 1 reply; 30+ messages in thread
From: J.R. Mauro @ 2009-06-04  1:22 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Jun 3, 2009 at 7:56 PM, erik quanstrom <quanstro@quanstro.net> wrote:
> On Wed Jun  3 19:41:39 EDT 2009, nemo@lsub.org wrote:
>> I have a ssam script that does the work. But it's not really streaming.
>>
>> On 04/06/2009, at 1:36, jrm8005@gmail.com wrote:
>>
>> > Speaking of regexes in Plan 9, did the "structural awk" or "stream
>> > sam" Rob dreamed of in the SE paper ever get realized?
>> >
>> > [/mail/box/nemo/msgs/200906/41493]
>>
>
> here's a pointer to the previous discussion of ssam, which
> does exist for unix:
>
> http://9fans.net/archive/2003/10/309

Unfortunately, I can't get it to build. It looks long un(der)maintained.

>
> "structural awk" is still a tempting idea.  but why not
> just go wild and implement a shell with sres?

One step at a time. Although if I woke up tomorrow and everything from
sed to lex and yacc and rc magically had sres, I would be very /very/
happy.

You can kind of get something not entirely unlike sres in sed, but it
is quite hard and unintuitive.




* Re: [9fans] plan 9 regexp
  2009-06-04  1:22   ` J.R. Mauro
@ 2009-06-04 11:43     ` Martin Neubauer
  2009-06-04 19:23       ` J. R. Mauro
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Neubauer @ 2009-06-04 11:43 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

* J.R. Mauro (jrm8005@gmail.com) wrote:
> > here's a pointer to the previous discussion of ssam, which
> > does exist for unix:
> >
> > http://9fans.net/archive/2003/10/309
>
> Unfortunately, I can't get it to build. It looks long un(der)maintained.
I haven't dug too deep, but it might require the libutf from the same website.

Good luck,
	Martin




* Re: [9fans] plan 9 regexp
  2009-06-04 11:43     ` Martin Neubauer
@ 2009-06-04 19:23       ` J. R. Mauro
  2009-06-04 19:27         ` erik quanstrom
  0 siblings, 1 reply; 30+ messages in thread
From: J. R. Mauro @ 2009-06-04 19:23 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs; +Cc: Fans of the OS Plan 9 from Bell Labs





On Jun 4, 2009, at 7:43, Martin Neubauer <m.ne@gmx.net> wrote:

> * J.R. Mauro (jrm8005@gmail.com) wrote:
>>> here's a pointer to the previous discussion of ssam, which
>>> does exist for unix:
>>>
>>> http://9fans.net/archive/2003/10/309
>>
>> Unfortunately, I can't get it to build. It looks long
>> un(der)maintained.
> I haven't dug too deep, but it might require the libutf from the
> same website.
>
> Good luck,
>    Martin
>

It does. That doesn't build either :(




* Re: [9fans] plan 9 regexp
  2009-06-04 19:23       ` J. R. Mauro
@ 2009-06-04 19:27         ` erik quanstrom
  2009-06-04 21:54           ` J.R. Mauro
  2009-06-13 18:29           ` J.R. Mauro
  0 siblings, 2 replies; 30+ messages in thread
From: erik quanstrom @ 2009-06-04 19:27 UTC (permalink / raw)
  To: 9fans

>
> It does. That doesn't build either :(
>

there is very little source code there.  why not dump the configure
goo and use p9p instead?

- erik




* Re: [9fans] plan 9 regexp
  2009-06-03 23:56 ` erik quanstrom
  2009-06-04  1:22   ` J.R. Mauro
@ 2009-06-04 21:35   ` Dan Cross
  2009-06-04 22:36     ` Charles Forsyth
  2009-06-05  0:07     ` Dan Cross
  1 sibling, 2 replies; 30+ messages in thread
From: Dan Cross @ 2009-06-04 21:35 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Jun 3, 2009 at 7:56 PM, erik quanstrom<quanstro@quanstro.net> wrote:
> "structural awk" is still a tempting idea.  but why not
> just go wild and implement a shell with sres?

For my final project in the compilers course I took, I had this idea
to take structural regular expressions and use them to match patterns
against the parse trees generated by a compiler frontend.  The intent
was to be able to detect "unsafe" code at compile time via pattern
matching; I then implemented a small awk-like language that contained
syntax for doing the matching and a few primitives for doing things
like printing compiler errors or warnings (hooking into the facilities
provided by the compiler).  We linked it into lcc; it was kind of
neat.  You could define a 'program' that the compiler interpreted as
it compiled a source file that could contain a library of additional
errors and warnings that were specific to your problem domain (for
instance, I wrote code to detect calls to 'gets' and error out).

Of course, it broke down somewhat because the language of strings
isn't necessarily all that well suited to describing trees whose elements
come from a completely different domain, but I still think the idea
has some merit.  This was in 2003; I gather things like that are now
beginning to become somewhat common elsewhere.

        - Dan C.
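
The idea of flagging calls to 'gets' by matching against a parse tree can be sketched roughly as follows. This is only an analogy, assuming Python's standard `ast` module in place of the lcc front end and Python source in place of C; `find_calls` and the sample `program` are hypothetical names, not anything from the project described above.

```python
import ast


def find_calls(source, banned):
    """Return the sorted line numbers of direct calls to `banned` in `source`."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        # The "pattern": a Call node whose callee is a plain name equal to `banned`.
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == banned):
            hits.append(node.lineno)
    return sorted(hits)


# Hypothetical input program; `gets` is just a name here, it is never run.
program = "buf = gets()\nprint(buf)\nline = gets()\n"
warnings = ["line %d: call to 'gets'" % n for n in find_calls(program, "gets")]
print("\n".join(warnings))
```

The match here is a fixed node shape rather than a regular expression over trees, which is about as far as string-style patterns stretch before something tree-specific is needed.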




* Re: [9fans] plan 9 regexp
  2009-06-04 19:27         ` erik quanstrom
@ 2009-06-04 21:54           ` J.R. Mauro
  2009-06-05 18:12             ` erik quanstrom
  2009-06-13 18:29           ` J.R. Mauro
  1 sibling, 1 reply; 30+ messages in thread
From: J.R. Mauro @ 2009-06-04 21:54 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Thu, Jun 4, 2009 at 3:27 PM, erik quanstrom <quanstro@quanstro.net> wrote:
>>
>> It does. That doesn't build either :(
>>
>
> there is very little source code there.  why not dump the configure
> goo and use p9p instead?
>
> - erik
>
>

I want to, but as usual, time is a problem.




* Re: [9fans] plan 9 regexp
  2009-06-04 21:35   ` Dan Cross
@ 2009-06-04 22:36     ` Charles Forsyth
  2009-06-05  0:05       ` Dan Cross
  2009-06-05  0:07     ` Dan Cross
  1 sibling, 1 reply; 30+ messages in thread
From: Charles Forsyth @ 2009-06-04 22:36 UTC (permalink / raw)
  To: 9fans

>Of course, it broke down somewhat because the language of strings
>isn't necessarily all that well suited to describing trees whose elements
>come from a completely different domain, but I still think the idea
>has some merit.  This was in 2003; I gather things like that are now
>beginning to become somewhat common elsewhere.

there are several varieties of tree automata that are better
suited to working with trees.




* Re: [9fans] plan 9 regexp
  2009-06-04 22:36     ` Charles Forsyth
@ 2009-06-05  0:05       ` Dan Cross
  2009-06-06 17:55         ` Ori Bernstein
  2009-06-07 21:31         ` Russ Cox
  0 siblings, 2 replies; 30+ messages in thread
From: Dan Cross @ 2009-06-05  0:05 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Thu, Jun 4, 2009 at 6:36 PM, Charles Forsyth<forsyth@terzarima.net> wrote:
>>Of course, it broke down somewhat because the language of strings
>>isn't necessarily all that well suited to describing trees whose elements
>>come from a completely different domain, but I still think the idea
>>has some merit.  This was in 2003; I gather things like that are now
>>beginning to become somewhat common elsewhere.
>
> there are several varieties of tree automata that are better
> suited to working with trees.

I'm sure.  This is something that I would be interested in revisiting;
do you have any pointers to particularly relevant information?  I
wonder how nicely these tree automata could be packaged into an
awk-like form.

        - Dan C.




* Re: [9fans] plan 9 regexp
  2009-06-04 21:35   ` Dan Cross
  2009-06-04 22:36     ` Charles Forsyth
@ 2009-06-05  0:07     ` Dan Cross
  1 sibling, 0 replies; 30+ messages in thread
From: Dan Cross @ 2009-06-05  0:07 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Thu, Jun 4, 2009 at 5:35 PM, Dan Cross<crossd@gmail.com> wrote:
> For my final project in the compilers course I took, I had this idea
> to take structural regular expressions and use them to match patterns
> against the parse trees generated by a compiler frontend.  The intent
> was to be able to detect "unsafe" code at compile time via pattern
> matching; I then implemented a small awk-like language that contained
> syntax for doing the matching and a few primitives for doing things
> like printing compiler errors or warnings (hooking into the facilities
> provided by the compiler).  We linked it into lcc; it was kind of
> neat.  You could define a 'program' that the compiler interpreted as
> it compiled a source file that could contain a library of additional
> errors and warnings that were specific to your problem domain (for
> instance, I wrote code to detect calls to 'gets' and error out).

Oh, in the interest of being honest, not that it's relevant to 9fans
but just to show that I do have a conscience, I should mention that I
did this project with two other students, though I feel it fair to say
that I did the bulk of the technical work.

        - Dan C.




* Re: [9fans] plan 9 regexp
  2009-06-04 21:54           ` J.R. Mauro
@ 2009-06-05 18:12             ` erik quanstrom
  0 siblings, 0 replies; 30+ messages in thread
From: erik quanstrom @ 2009-06-05 18:12 UTC (permalink / raw)
  To: 9fans

On Thu Jun  4 17:58:15 EDT 2009, jrm8005@gmail.com wrote:
> On Thu, Jun 4, 2009 at 3:27 PM, erik quanstrom <quanstro@quanstro.net> wrote:
> >>
> >> It does. That doesn't build either :(
> >>
> >
> > there is very little source code there.  why not dump the configure
> > goo and use p9p instead?
> >
> > - erik
>
> I want to, but as usual, time is a problem.

it turns out that ssam reads the entire input before
doing any processing.  sam doesn't have this limitation
and can handle files bigger than available memory.

so i'm not sure that ssam is worth investing any time
into.

- erik




* Re: [9fans] plan 9 regexp
  2009-06-05  0:05       ` Dan Cross
@ 2009-06-06 17:55         ` Ori Bernstein
  2009-06-07 17:11           ` erik quanstrom
  2009-06-07 21:31         ` Russ Cox
  1 sibling, 1 reply; 30+ messages in thread
From: Ori Bernstein @ 2009-06-06 17:55 UTC (permalink / raw)
  To: 9fans

On Thu, 4 Jun 2009 20:05:01 -0400
Dan Cross <crossd@gmail.com> wrote:


> I'm sure.  This is something that I would be interested in revisiting;
> do you have any pointers to particularly relevant information?  I
> wonder how nicely these tree automata could be packaged into an
> awk-like form.
>
>         - Dan C.

Look up iburg and twig. They're tree pattern matchers that could be
adapted to do what you want, although they're more like yacc than awk.

--
    Ori Bernstein




* Re: [9fans] plan 9 regexp
  2009-06-06 17:55         ` Ori Bernstein
@ 2009-06-07 17:11           ` erik quanstrom
  0 siblings, 0 replies; 30+ messages in thread
From: erik quanstrom @ 2009-06-07 17:11 UTC (permalink / raw)
  To: 9fans

On Sun Jun  7 11:02:51 EDT 2009, ori@eigenstate.org wrote:
> On Thu, 4 Jun 2009 20:05:01 -0400
> Dan Cross <crossd@gmail.com> wrote:
>
>
> > I'm sure.  This is something that I would be interested in revisiting;
> > do you have any pointers to particularly relevant information?  I
> > wonder how nicely these tree automata could be packaged into an
> > awk-like form.
> >
> >         - Dan C.
>
> Look up iburg and twig. They're tree pattern matchers that could be
> adapted to do what you want, although they're more like yacc than awk.

i'm surprised that no one has mentioned http://pdos.csail.mit.edu/xoc/
by russ et al.

- erik




* Re: [9fans] plan 9 regexp
  2009-06-05  0:05       ` Dan Cross
  2009-06-06 17:55         ` Ori Bernstein
@ 2009-06-07 21:31         ` Russ Cox
  1 sibling, 0 replies; 30+ messages in thread
From: Russ Cox @ 2009-06-07 21:31 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> I'm sure.  This is something that I would be interested in revisiting;
> do you have any pointers to particularly relevant information?  I
> wonder how nicely these tree automata could be packaged into an
> awk-like form.

In addition to what others have suggested, look up
[tree regular expressions] and [regular tree expressions].

Regarding ssam, I've always found it equally workable,
if slightly clunky, to cat >/tmp/a; echo script | sam -d /tmp/a; cat /tmp/a.
In fact, most of the time I want to be able to iterate over the
whole input multiple times, so it's hard to imagine doing better.

Russ
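
The point about wanting to iterate over the whole input several times can be sketched like this. It is a loose illustration only: Python's `re` stands in for Plan 9 regexps, the helpers `x` and `g` are hypothetical names modelled on sam's `x/regexp/` and `g/regexp/` commands, and they work on plain strings rather than on addresses in a file buffer as sam does.

```python
import re


def x(selections, pattern):
    """For each selection, yield every match found inside it (like x/pattern/)."""
    rx = re.compile(pattern)
    for text in selections:
        for m in rx.finditer(text):
            yield m.group(0)


def g(selections, pattern):
    """Keep a selection only if it contains a match (like g/pattern/)."""
    rx = re.compile(pattern)
    return (text for text in selections if rx.search(text))


def run(whole_input):
    # The whole input is already in memory, so a second pass costs nothing extra.
    words = list(x([whole_input], r"[a-z]+"))
    with_a = list(g(words, r"a"))
    return words, with_a


print(run("foo bar baz\nqux"))
```

A truly streaming version would have to decide how much input to buffer before a pattern can be known to have finished matching, which is where the whole-file approach is simpler.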



* Re: [9fans] plan 9 regexp
  2009-06-04 19:27         ` erik quanstrom
  2009-06-04 21:54           ` J.R. Mauro
@ 2009-06-13 18:29           ` J.R. Mauro
  1 sibling, 0 replies; 30+ messages in thread
From: J.R. Mauro @ 2009-06-13 18:29 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

I got it to build for linux with some modifications, if you or anyone
is interested. Now I just need a sawk and syacc.

On Thu, Jun 4, 2009 at 3:27 PM, erik quanstrom<quanstro@quanstro.net> wrote:
>>
>> It does. That doesn't build either :(
>>
>
> there is very little source code there.  why not dump the configure
> goo and use p9p instead?
>
> - erik
>
>




* Re: [9fans] plan 9 regexp
  2009-06-03 14:56 hugo rivera
                   ` (4 preceding siblings ...)
  2009-06-03 20:01 ` Wu JIANG
@ 2009-06-03 23:32 ` J.R. Mauro
  5 siblings, 0 replies; 30+ messages in thread
From: J.R. Mauro @ 2009-06-03 23:32 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Speaking of regexes in Plan 9, did the "structural awk" or "stream
sam" Rob dreamed of in the SE paper ever get realized?




* Re: [9fans] plan 9 regexp
  2009-06-03 20:44     ` Wu JIANG
@ 2009-06-03 20:49       ` hugo rivera
  0 siblings, 0 replies; 30+ messages in thread
From: hugo rivera @ 2009-06-03 20:49 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

great, thanks for the answer ;-)

--
Hugo




* Re: [9fans] plan 9 regexp
  2009-06-03 20:11   ` hugo rivera
@ 2009-06-03 20:44     ` Wu JIANG
  2009-06-03 20:49       ` hugo rivera
  0 siblings, 1 reply; 30+ messages in thread
From: Wu JIANG @ 2009-06-03 20:44 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Sorry, I misunderstood your question at first. I think an example
can show how ``?'' is useful in grep.

Suppose I have a file in which I want to find the keyword ``produce'', but I know
that the word ``produced'' might also be of interest (as in stemming
in information retrieval or NLP). So I use the pattern "produced?"
to find both forms.

I hope this can be helpful at least a little bit. :-)



* Re: [9fans] plan 9 regexp
  2009-06-03 20:01 ` Wu JIANG
  2009-06-03 20:05   ` Wu JIANG
@ 2009-06-03 20:11   ` hugo rivera
  2009-06-03 20:44     ` Wu JIANG
  1 sibling, 1 reply; 30+ messages in thread
From: hugo rivera @ 2009-06-03 20:11 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

you are right, but the original post read

> grep 'a+bb?'

so you get at least one 'a' and one or two 'b'.

--
Hugo




* Re: [9fans] plan 9 regexp
  2009-06-03 20:01 ` Wu JIANG
@ 2009-06-03 20:05   ` Wu JIANG
  2009-06-03 20:11   ` hugo rivera
  1 sibling, 0 replies; 30+ messages in thread
From: Wu JIANG @ 2009-06-03 20:05 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

btw, here is a site for playing with regexps:
http://erik.eae.net/playground/regexp/regexp.html. Have fun.

A brief introduction can be found here:
http://en.wikipedia.org/wiki/Regular_expression.





* Re: [9fans] plan 9 regexp
  2009-06-03 14:56 hugo rivera
                   ` (3 preceding siblings ...)
  2009-06-03 17:19 ` Enrique Soriano
@ 2009-06-03 20:01 ` Wu JIANG
  2009-06-03 20:05   ` Wu JIANG
  2009-06-03 20:11   ` hugo rivera
  2009-06-03 23:32 ` J.R. Mauro
  5 siblings, 2 replies; 30+ messages in thread
From: Wu JIANG @ 2009-06-03 20:01 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

actually, a+ means at least one 'a', b? means zero or one 'b'.



* Re: [9fans] plan 9 regexp
  2009-06-03 17:46     ` Rob Pike
@ 2009-06-03 18:05       ` hugo rivera
  0 siblings, 0 replies; 30+ messages in thread
From: hugo rivera @ 2009-06-03 18:05 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

OK, thanks for the answers. This shows my lack of imagination.
Regards

--
Hugo




* Re: [9fans] plan 9 regexp
  2009-06-03 17:34   ` erik quanstrom
@ 2009-06-03 17:46     ` Rob Pike
  2009-06-03 18:05       ` hugo rivera
  0 siblings, 1 reply; 30+ messages in thread
From: Rob Pike @ 2009-06-03 17:46 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Wed, Jun 3, 2009 at 10:34 AM, erik quanstrom <quanstro@quanstro.net> wrote:
>> For example, for lines with 8 characters and the ? operator:
>>
>> ^????????$
>
> just to clarify (sorry if i'm being pedantic):
> that will match lines with *up to* 8 characters (including
> blank lines). "^........$" will match lines with exactly 8
> characters.

No.  You mean ^.?.?.?.?.?.?.?.?$

Without the dots, it's just asking 8 times whether we're at the
beginning of the line or not, and ignoring the answer.

-rob
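
The difference between the dotted patterns can be checked with a small sketch. It assumes Python's `re` module as a stand-in for Plan 9 grep (the two agree on `.`, `?`, `^` and `$` for this purpose); the dotless pattern is left out because not every engine accepts a repetition applied to an anchor. `lengths_matched` is a hypothetical helper.

```python
import re

# Eight optional characters: matches lines of length 0 through 8.
up_to_eight = re.compile(r"^.?.?.?.?.?.?.?.?$")
# Eight mandatory characters: matches lines of length exactly 8.
exactly_eight = re.compile(r"^........$")


def lengths_matched(rx, max_len=10):
    """Return the line lengths (0..max_len) that the pattern accepts."""
    return [n for n in range(max_len + 1) if rx.search("x" * n)]


print(lengths_matched(up_to_eight))
print(lengths_matched(exactly_eight))
```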




* Re: [9fans] plan 9 regexp
  2009-06-03 16:21 ` Rodolfo (kix)
@ 2009-06-03 17:34   ` erik quanstrom
  2009-06-03 17:46     ` Rob Pike
  0 siblings, 1 reply; 30+ messages in thread
From: erik quanstrom @ 2009-06-03 17:34 UTC (permalink / raw)
  To: 9fans

> For example, for lines with 8 characters and the ? operator:
>
> ^????????$

just to clarify (sorry if i'm being pedantic):
that will match lines with *up to* 8 characters (including
blank lines). "^........$" will match lines with exactly 8
characters.

- erik




* Re: [9fans] plan 9 regexp
  2009-06-03 14:56 hugo rivera
                   ` (2 preceding siblings ...)
  2009-06-03 16:51 ` Russ Cox
@ 2009-06-03 17:19 ` Enrique Soriano
  2009-06-03 20:01 ` Wu JIANG
  2009-06-03 23:32 ` J.R. Mauro
  5 siblings, 0 replies; 30+ messages in thread
From: Enrique Soriano @ 2009-06-03 17:19 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> But then, how exactly is the '?' operator useful for grep?

Suppose we want to filter file names ending
in '.jpg' or '.jpeg'. We would use this regexp:

; ls | grep '\.jpe?g$'

The 'e' is optional (zero or one 'e').

Q
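
A quick way to check which names that regexp keeps, sketched with Python's `re` module standing in for grep; the file names are made up for illustration.

```python
import re

# Hypothetical directory listing.
names = ["a.jpg", "b.jpeg", "c.jpeeg", "d.png", "jpg.txt"]

# Same pattern as in the grep command: a literal dot, "jp", an optional "e", "g", end of name.
jpeg_name = re.compile(r"\.jpe?g$")

matches = [n for n in names if jpeg_name.search(n)]
print(matches)
```

Because of the `$` anchor, a name such as `jpg.txt` is not selected, and `e?` allows at most one 'e', so `c.jpeeg` is rejected as well.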









* Re: [9fans] plan 9 regexp
  2009-06-03 14:56 hugo rivera
  2009-06-03 16:21 ` Rodolfo (kix)
  2009-06-03 16:50 ` yy
@ 2009-06-03 16:51 ` Russ Cox
  2009-06-03 17:19 ` Enrique Soriano
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 30+ messages in thread
From: Russ Cox @ 2009-06-03 16:51 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

? is useful when it's not at the end of the pattern.
grep 'utf-?8' is shorter than grep 'utf8|utf-8'.

russ
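
That the two spellings select the same lines can be checked on a few samples. This is a sketch using Python's `re` rather than Plan 9 grep, and the sample strings are invented.

```python
import re

short_form = re.compile(r"utf-?8")
spelled_out = re.compile(r"utf8|utf-8")

# Hypothetical input lines.
samples = ["utf8", "utf-8", "utf--8", "utf 8", "charset=utf-8;", "utf16"]

# Both patterns should accept and reject exactly the same samples.
agree = all(bool(short_form.search(s)) == bool(spelled_out.search(s)) for s in samples)
hits = [s for s in samples if short_form.search(s)]
print(agree, hits)
```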



* Re: [9fans] plan 9 regexp
  2009-06-03 14:56 hugo rivera
  2009-06-03 16:21 ` Rodolfo (kix)
@ 2009-06-03 16:50 ` yy
  2009-06-03 16:51 ` Russ Cox
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 30+ messages in thread
From: yy @ 2009-06-03 16:50 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

2009/6/3 hugo rivera <uair00@gmail.com>:
> But then, how exactly is the '?' operator useful for grep? I was
> thinking that it was good for filtering out lines that contain more
> characters than desired, but it is not.
> Regards

Some common use cases are https? and plurals?


--
- yiyus || JGL .




* Re: [9fans] plan 9 regexp
  2009-06-03 14:56 hugo rivera
@ 2009-06-03 16:21 ` Rodolfo (kix)
  2009-06-03 17:34   ` erik quanstrom
  2009-06-03 16:50 ` yy
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 30+ messages in thread
From: Rodolfo (kix) @ 2009-06-03 16:21 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Hi Hugo,

I do not use regex in plan9, but about the question:

> But then, how exactly is the '?' operator useful for grep? I was
> thinking that it was good for filtering out lines that contain more
> characters than desired, but it is not.

In Perl, for example, you can do something like:

^a$

where ^ is start of line
and $ end of line.

For example, for lines with 8 characters and the ? operator:

^????????$

Regards.

-- 
Rodolfo García "kix"
EA4ERH - IN80ER




* [9fans] plan 9 regexp
@ 2009-06-03 14:56 hugo rivera
  2009-06-03 16:21 ` Rodolfo (kix)
                   ` (5 more replies)
  0 siblings, 6 replies; 30+ messages in thread
From: hugo rivera @ 2009-06-03 14:56 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Hello,
I am experimenting with some regexp implementations (namely the one
from "The Practice of Programming") and I am a little confused by
the use of the '?' operator in Plan 9's grep.
Say I have the following input:

aaaabbb
ab
aaaab
bb
b
aaabb
aaaa

which I feed into grep with

grep 'a+bb?'

which should match at least one 'a' followed by one or two 'b'. So,
grep's output is

aaaabbb
ab
aaaab
aaabb

which really surprised me at first, since I wasn't expecting the first
line. After some thought, I realized that the 'aaaab' and the 'aaaabb'
patterns, contained in the first line of input, match the regexp, so
grep prints the line.
But then, how exactly is the '?' operator useful for grep? I was
thinking that it was good for filtering out lines that contain more
characters than desired, but it is not.
Regards
--
Hugo
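
The behaviour described above, where a line is printed if the pattern matches anywhere in it, can be reproduced with a short sketch. It assumes Python's `re` module in place of Plan 9 grep; the `grep` function below is a hypothetical helper, not the real command.

```python
import re

lines = ["aaaabbb", "ab", "aaaab", "bb", "b", "aaabb", "aaaa"]


def grep(pattern, lines):
    """Return the lines that contain a match anywhere, as grep does."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]


# Unanchored: "aaaabbb" is kept because "aaaabb" inside it matches.
unanchored = grep(r"a+bb?", lines)
# Anchored: the whole line must be some 'a's followed by one or two 'b's.
anchored = grep(r"^a+bb?$", lines)

print(unanchored)
print(anchored)
```

Anchoring with `^` and `$` is what excludes lines with extra characters; `?` on its own only makes the preceding character optional.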



