Steffen Nurpmeso writes:
> Tony Finch wrote in
> <alpine.DEB.2.20.2005142316170.3374@grey.csi.cam.ac.uk>:
> |Larry McVoy <lm@mcvoy.com> wrote:
> |>
> |> It's got some perl goodness, regexps are part of the syntax, ....
> |
> |I got into Unix after perl and I've used it a lot. Back in the 1990s I saw
> |Henry Spencer's joke that perl was the Swiss Army Chainsaw of Unix, as a
> |riff on lex being its Swiss Army Knife. I came to appreciate lex
> |regrettably late: lex makes it remarkably easy to chew through a huge pile
> |of text and feed the pieces to some library code written in C. I've been
> |using re2c recently (http://re2c.org/), which is differently weird than
> |lex, though it still uses YY in all its variable names. It's remarkable
> |how much newer lexer/parser generators can't escape from the user
> |interface of lex/yacc. Another YY example: http://www.hwaci.com/sw/lemon/
>
> P.S.: i really hate automated lexers. I never ever got used to
> use them. For learning i once tried to use flex/bison, but
> i failed really hard. I like that blood, sweat and tears thing,
> and using a lexer seems so shattered, all the pieces. And i find
> them really hard to read.
>
> If you can deal with them they are surely a relief, especially in
> rapidly moving syntax situations. But if i look at settled source
> code which uses it, for example usr.sbin/ospfd/parse.y, or
> usr.sbin/smtpd/parse.y, both of OpenBSD, then i feel lost and am
> happy that i do not need to maintain that code.
>
> --steffen
Wow, I've had the opposite experience. I find lex/yacc/flex/bison really
easy to use. The issue, which I believe was covered in the early docs,
is that some languages are not designed with regularity in mind which makes
for ugly code. But to be fair, that code is at least as ugly with hand-crafted
code.
I believe that the original wisecrack was directed towards FORTRAN. My ancient
experience was that using lex/yacc for HSPICE was not going to work, so I
had to hand-craft code for that.
Jon
“The asteroid to kill this dinosaur is still in orbit.”

    -- Plan 9 lex man page

I always hand craft my lexers and use yacc to parse. Most code on plan 9
does that as well.

Brantley
On Sat, May 16, 2020, 6:05 PM Brantley Coile <brantley@coraid.com> wrote:

> “The asteroid to kill this dinosaur is still in orbit.”
>
>     -- Plan 9 lex man page
>
> I always hand craft my lexers and use yacc to parse. Most code on plan 9
> does that as well.

Wow! That is the most awesome thing I've seen in a while....

Warner
It looks like only grap and pic have mkfiles that invoke lex.
> On May 16, 2020, at 9:23 PM, Warner Losh <imp@bsdimp.com> wrote:
>
>
>
> On Sat, May 16, 2020, 6:05 PM Brantley Coile <brantley@coraid.com> wrote:
> “The asteroid to kill this dinosaur is still in orbit.“
> —- Plan 9 lex man page
>
> I always hand craft my lexers and use yacc to parse. Most code on plan 9 does that as well.
>
> Wow! That is the most awesome thing I've seen in a while....
>
> Warner
>
>
> Brantley
>
>
>> On May 16, 2020, at 8:00 PM, Jon Steinhart <jon@fourwinds.com> wrote:
>>
>> Steffen Nurpmeso writes:
>>> Tony Finch wrote in
>>> <alpine.DEB.2.20.2005142316170.3374@grey.csi.cam.ac.uk>:
>>> |Larry McVoy <lm@mcvoy.com> wrote:
>>> |>
>>> |> It's got some perl goodness, regexps are part of the syntax, ....
>>> |
>>> |I got into Unix after perl and I've used it a lot. Back in the 1990s I saw
>>> |Henry Spencer's joke that perl was the Swiss Army Chainsaw of Unix, as a
>>> |riff on lex being its Swiss Army Knife. I came to appreciate lex
>>> |regrettably late: lex makes it remarkably easy to chew through a huge pile
>>> |of text and feed the pieces to some library code written in C. I've been
>>> |using re2c recently (http://re2c.org/), which is differently weird than
>>> |lex, though it still uses YY in all its variable names. It's remarkable
>>> |how much newer lexer/parser generators can't escape from the user
>>> |interface of lex/yacc. Another YY example: http://www.hwaci.com/sw/lemon/
>>>
>>> P.S.: i really hate automated lexers. I never ever got used to
>>> use them. For learning i once tried to use flex/bison, but
>>> i failed really hard. I like that blood, sweat and tears thing,
>>> and using a lexer seems so shattered, all the pieces. And i find
>>> them really hard to read.
>>>
>>> If you can deal with them they are surely a relief, especially in
>>> rapidly moving syntax situations. But if i look at settled source
>>> code which uses it, for example usr.sbin/ospfd/parse.y, or
>>> usr.sbin/smtpd/parse.y, both of OpenBSD, then i feel lost and am
>>> happy that i do not need to maintain that code.
>>>
>>> --steffen
>>
>> Wow, I've had the opposite experience. I find lex/yacc/flex/bison really
>> easy to use. The issue, which I believe was covered in the early docs,
>> is that some languages are not designed with regularity in mind which makes
>> for ugly code. But to be fair, that code is at least as ugly with hand-crafted
>> code.
>>
>> I believe that the original wisecrack was directed towards FORTRAN. My ancient
>> experience was that it was using lex/yacc for HSPICE was not going to work so I
>> had to hand-craft code for that.
>>
>> Jon
Brantley Coile <brantley@coraid.com> wrote on Sun, 17 May 2020 01:36:16 +0000:

>> It looks like only grap and pic have mkfiles that invoke lex.

Both of those are Brian Kernighan's work, and from the FIXES file in his
nawk, I can offer this quote:

>> ...
>> Aug 9, 1997:
>>     somewhat regretfully, replaced the ancient lex-based lexical
>>     analyzer with one written in C.  it's longer, generates less code,
>>     and more portable; the old one depended too much on mysterious
>>     properties of lex that were not preserved in other environments.
>>     in theory these recognize the same language.
>> ...

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                  Tel: +1 801 581 5254                    -
- University of Utah                  FAX: +1 801 581 4148                    -
- Department of Mathematics, 110 LCB  Internet e-mail: beebe@math.utah.edu    -
- 155 S 1400 E RM 233                 beebe@acm.org  beebe@computer.org       -
- Salt Lake City, UT 84112-0090, USA  URL: http://www.math.utah.edu/~beebe/   -
-------------------------------------------------------------------------------
Regarding lex/yacc/flex/bison, I remember (ca. 1980) when DEC's compiler
group first got their hands on lex and yacc. For yucks they put the BLISS
grammar through yacc. It came back with an error message that the grammar
was ambiguous. And it turned out that, yes, Wulf's grammar for BLISS had an
obscure corner case that *was* ambiguous. That caused quite a stir.

-Paul W.
I also gave up on lex for parsing fairly early. The problem was reserved
words. These looked like identifiers, but the state machine to pick out a
couple of dozen reserved words out of all identifiers was too big for the
PDP-11.

When I wrote spell, I ran into the same problem. I had some rules that
wanted to convert plurals to singular forms that would be found in the
dictionary. Writing a rule to recognize .*ies and convert the "ies" to "y"
blew out the memory after only a handful of patterns. My solution was to
pick up words and reverse them before passing them through lex, so I looked
for the pattern "sei.*", converted it to "y" and then reversed the word
again.

As it turned out, I only owned spell for a few weeks because Doug and
others grabbed it and ran with it...

Steve
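The reversal trick described above can be demonstrated at the shell level; here rev(1) and sed stand in for the lex rules (a toy illustration, not the original code):

```shell
# Match the plural suffix at the *front* of a reversed word, where a
# left-anchored pattern is cheap: "sei" is "ies" reversed, "y" stays "y".
echo 'ponies' | rev | sed 's/^sei/y/' | rev
# -> pony
```

The point of the trick is that a left-anchored literal prefix is cheap for a state machine, while a pattern like `.*ies` anchored at the end of arbitrary identifiers blows up the automaton.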
scj@yaccman.com wrote:
> As it turned out, I only owned spell for a few weeks because Doug and
> others grabbed it and ran with it...
>
> Steve
Who else besides Doug worked on spell?
Do I understand correctly that you invented the original pipeline based
spell:
tr ... | # split words to one line
tr ... | # lower case
sort -u | # sort and uniq
comm -23 - dict # find words not in dictionary
I didn't know that you'd worked on a rewrite in C.
Thanks,
Arnold
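The pipeline sketched above can be made runnable as follows; the tr and comm flags here are modern approximations, not necessarily the historical ones, and a toy sorted word list stands in for the real dictionary:

```shell
# Toy sorted dictionary standing in for the real word list.
printf 'brown\ndog\nfox\njumps\nlazy\nover\nquick\nthe\n' > dict

printf 'The quick brown foxx jumps over the lazy dogg\n' |
tr -cs 'A-Za-z' '\n' |   # split words to one per line
tr 'A-Z' 'a-z' |         # fold to lower case
sort -u |                # sort and drop duplicates
comm -23 - dict          # keep words absent from the dictionary
# -> dogg
#    foxx
```

comm -23 suppresses lines unique to the dictionary and lines common to both, leaving only the input words that the dictionary lacks; both inputs must be sorted in the same collation order.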
Does anyone have the Usenix paper (80's timeframe I think) about making lex
go fast? It was by Vern Paxson or Van Jacobson IIRC. Maybe only an abstract
was published. The talk ended with a chart showing some CPU times, and the
modified lex was only slightly slower than cat. Maybe Vern since he
contributed to what became flex.
Interesting. My "speak" program had a trivial lexer that recognized literal
tokens, many of which were prefixes of others, by maximum-munch binary
search in a list of 1600 entries. Entries gave token+translation+rewrite.
The whole thing fit in 15K.

Many years later I wrote a regex recognizer that special-cased alternations
of lots of literals. I believe Gnu's regex.c does that, too. (My regex also
supported conjunction and negation--legitimate regular-language
operations--implemented by continuation-passing to avoid huge finite-state
machines.)

We have here a case of imperfect communication in 1127. Had I been
conscious of the lex-explosion problem, I might have thought of speak and
put support for speak-like tables into lex. As it happened, I only used
yacc/lex once, quite successfully, for a small domain-specific language.

Doug
Hi Rich,
> Does anyone have the Usenix paper (80's timeframe I think) about making lex
> go fast? It was by Vern Paxson or Van Jacobson IIRC. Maybe only an
> abstract was published. The talk ended with a chart showing some CPU
> times, and the modified lex was only slightly slower than cat. Maybe Vern
> since he contributed to what became flex.
Google Scholar's ‘Vancouver’ style citation:
Jacobson V. Tuning UNIX Lex or it's NOT true what they say
about Lex. InUSENIX Conference Proceedings (Washington,
DC, Winter 1987) 1987 (pp. 163-164).
Lack of space after ‘In’ is Google, not me.
I didn't find the paper itself, just citations of it.
--
Cheers, Ralph.
Ralph Corderoy <ralph@inputplus.co.uk> wrote:
> Hi Rich,
>
> > Does anyone have the Usenix paper (80's timeframe I think) about making lex
> > go fast? It was by Vern Paxson or Van Jacobson IIRC. Maybe only an
> > abstract was published. The talk ended with a chart showing some CPU
> > times, and the modified lex was only slightly slower than cat. Maybe Vern
> > since he contributed to what became flex.
>
> Google Scholar's ‘Vancouver’ style citation:
>
> Jacobson V. Tuning UNIX Lex or it's NOT true what they say
> about Lex. InUSENIX Conference Proceedings (Washington,
> DC, Winter 1987) 1987 (pp. 163-164).
>
> Lack of space after ‘In’ is Google, not me.
>
> I didn't find the paper itself, just citations of it.
I'm pretty sure I have those proceedings. Given that it's 2 pages,
it's probably just an abstract. If there's interest, I can try to scan
the pages.
Arnold
Hi Arnold,

> I'm pretty sure I have those proceedings. Given that it's 2 pages,
> it's probably just an abstract.

Yes, it's just the abstract, says
https://compilers.iecc.com/comparch/article/87-05-012

--
Cheers, Ralph.
On Sun, Jun 14, 2020 at 03:48:19PM +0100, Ralph Corderoy wrote:
> Hi Arnold,
>
> > I'm pretty sure I have those proceedings. Given that it's 2 pages,
> > it's probably just an abstract.
>
> Yes, it's just the abstract, says
> https://compilers.iecc.com/comparch/article/87-05-012

Clem was kind enough to send the scan to me, and it's now here:
https://www.tuhs.org/Archive/Documentation/Papers/TuningUnixLex_Jacobson_USENIX_Winter_1987_pp163_164.pdf

Cheers, Warren
On Mon, Jun 15, 2020 at 11:12:30AM +1000, Warren Toomey wrote:
> Clem was kind enough to send the scan to me, and it's now here:
> https://www.tuhs.org/Archive/Documentation/Papers/TuningUnixLex_Jacobson_USENIX_Winter_1987_pp163_164.pdf
John R. Levine might be a person to ask to see if he has a copy of
the paper.
Warren
Warren Toomey <wkt@tuhs.org> wrote:
> On Mon, Jun 15, 2020 at 11:12:30AM +1000, Warren Toomey wrote:
> > Clem was kind enough to send the scan to me, and it's now here:
> > https://www.tuhs.org/Archive/Documentation/Papers/TuningUnixLex_Jacobson_USENIX_Winter_1987_pp163_164.pdf
>
> John R. Levine might be a person to ask to see if he has a copy of
> the paper.
> Warren
I've just done so, by posting to comp.compilers.
Arnold