9fans - fans of the OS Plan 9 from Bell Labs
From: tlaronde@polynum.com
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] Octets regexp
Date: Thu,  2 May 2013 16:43:21 +0200
Message-ID: <20130502144321.GA438@polynum.com>
In-Reply-To: <37e6edf11c49568bd52ff0f3a9bdbb71@brasstown.quanstro.net>

On Thu, May 02, 2013 at 09:44:38AM -0400, erik quanstrom wrote:
> > This is a remark made to me by a developer who can, when needed,
> > use regexps (ed(1) or sed(1)) on a Unix where they still deal
> > with "char" (bytes) to search for a string of bytes in a binary.
>
> i have never needed to do this.  could you provide some motivation
> for grepping for a weird byte in an executable?  surely the debugger
> is better suited for this.
>

Because not everything is a program? Some of it is data. For example, the
TeX (or METAFONT etc.) predigested dumps are binary, but they are not
programs.

> > And after some thought, I don't see an obvious reason why the regexp
> > could not be used with byte strings (so UTF-8 is OK) without trying to
> > match runes (since not every byte string is a valid UTF-8 sequence).
>
> because it makes things more complicated and probably worse for the
> common case, while not providing any new functionality beyond what is
> already in other tools.
>

Ah? I thought the purpose was to avoid duplicating tools... And I'm
not sure it would be more complicated for the common case, since the
already defined functions could be wrappers calling lower-level functions
that take the size of the "entity" as a parameter: byte, wyde, tetra,
octa (and, while I'm at it, endianness too) or UTF-8.
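
To make the wrapper idea concrete, here is a minimal sketch in Python
(not Plan 9 C, and not the regexp library's actual interface). The names
UNIT_SIZES, find_units and find_bytes are hypothetical, chosen only for
illustration: one low-level function takes the unit size and the
endianness, and the byte-oriented entry point is a thin wrapper over it.

```python
import re

# Hypothetical table: widths in bytes for the byte/wyde/tetra/octa naming.
UNIT_SIZES = {"byte": 1, "wyde": 2, "tetra": 4, "octa": 8}


def find_units(data, values, unit="byte", byteorder="big"):
    """Return the byte offsets in `data` where the integer sequence
    `values` occurs, each value encoded on UNIT_SIZES[unit] bytes.
    Matches are non-overlapping and not constrained to aligned offsets."""
    size = UNIT_SIZES[unit]
    needle = b"".join(v.to_bytes(size, byteorder) for v in values)
    # re.escape keeps bytes such as b"." or b"*" from acting as operators.
    pattern = re.compile(re.escape(needle))
    return [m.start() for m in pattern.finditer(data)]


def find_bytes(data, values):
    """Wrapper for the common case: one value per byte."""
    return find_units(data, values, unit="byte")


if __name__ == "__main__":
    dump = bytes([0x00, 0x01, 0x02, 0x00, 0x01, 0x00, 0x00, 0x00, 0xFF])
    print(find_units(dump, [1], unit="wyde", byteorder="big"))
    print(find_units(dump, [1], unit="tetra", byteorder="little"))
    print(find_bytes(dump, [0xFF]))
```

The sketch only searches for literal sequences; a real implementation
would have to carry the unit size through the whole matching engine.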

>
> i think you've missed the point of making utf-8 *the* character set.
> it's not sometimes the character set.  or only on tuesday.  it's always
> the character set.
>
No: I have understood this. What I'm not totally sure about is this: the
system deals with octet strings (as it does), and UTF-8, i.e. Unicode,
lives at the user interface. But is there a means of keeping the
interface from interpreting the strings as UTF-8? Because not everything
is text.
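
As an illustration of the distinction, and only as an analogy from
another environment, the following Python sketch shows an octet string
that is not valid UTF-8 yet can still be searched with a pattern over
raw octets. The helper names is_valid_utf8 and search_octets and the
sample blob are assumptions made up for the example.

```python
import re


def is_valid_utf8(data):
    """True if `data` decodes as UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def search_octets(pattern, data):
    """Offset of the first match of a bytes pattern in `data`, or -1.
    Both arguments are bytes, so no UTF-8 decoding takes place."""
    m = re.search(pattern, data)
    return m.start() if m else -1


if __name__ == "__main__":
    # Made-up sample: 0xFF and 0xFE never appear in valid UTF-8.
    blob = b"TeX\xff\xfe\x00dump"
    print(is_valid_utf8(blob))
    print(search_octets(rb"\xff[\xf0-\xfe]\x00", blob))
```

Here the pattern is matched octet by octet, including a range over
octet values, and the data is never interpreted as text.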

--
        Thierry Laronde <tlaronde +AT+ polynum +dot+ com>
                      http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


