List for cgit developers and users
 help / color / mirror / Atom feed
From: john at keeping.me.uk (John Keeping)
Subject: RFE: .so filters
Date: Thu, 9 Jan 2014 22:58:02 +0000	[thread overview]
Message-ID: <20140109225802.GM7608@serenity.lan> (raw)
In-Reply-To: <CAHmME9p6nE36-4PWxZTvSdZ0=mUu0CGBh4aRkr3KBaeiNH8reg@mail.gmail.com>

On Thu, Jan 09, 2014 at 10:34:26PM +0100, Jason A. Donenfeld wrote:
> I'm thinking about this filtering situation w.r.t. gravatar and
> potentially running multiple filters on one page. Something I've been
> considering is implementing a simple dlopen() mechanism for filters,
> if the filter filename starts with "soname:" or "lib:" or similar, so
> as to avoid the fork()ing and exec()ing we currently have, for high
> frequency filters. The idea is that first use of the filter would be
> dlopen()'d, but wouldn't be dlclose()'d until the end of the
> processing. This way the same function could be used over and over
> again without significant penalty.
> 
> In my first thinking of this, the method of action would be the same
> as the current system -- "int filter_run(int argc, char *argv[])" is
> dlopen()'d, executed, and it reads and writes to the dup2()'d file
> descriptor. Unfortunately, the piping in this introduces a cost that
> I'd rather avoid. In the case of gravatar (or more generally, email
> author filters), we'd be better off with a "char *filter_run(int argc,
> char *argv[])", that can just return the string that the html
> functions will then print. This, however, breaks the current filtering
> paradigm, and might not be ideal for filters that enjoy a stream of
> data (such as source code filters). This distinction more or less
> points toward coming up with a library API of sorts, but I really
> really really don't want to add a full fledged plugin system. So this
> has me leaning toward the simpler first idea.
> 
> But I'm undecided at the moment. Comments and suggestions are most
> welcome.

That interface doesn't really match the way the current filters work.
Currently when we open a filter we replace cgit's stdout with a pipe
into the filter process, so none of the existing CGit code will work
with this interface.  We could swap out write with a function pointer
into the filter, but I don't think we guarantee that all of the data is
written in one go which makes life harder for filter writers (although
for simple cases like author info we probably could guarantee to write
it all at once).

If we allow filters to act incrementally, then we can just leave the
filter running and swap it in or out when required.  That would require
a single dup2 to make it work the same way that the filters currently
work.  Interestingly, there is an "htmlfd" variable in html.c but it is
never changed from STDOUT_FILENO; I wonder if that can be used or are
there other places (possibly in libgit.a code) that just use stdout, in
which case we should remove that variable.  But there is the problem of
terminating the response; Lukas' suggestion of using NUL for that may be
the best, it's not that hard to printf '\0' in shell.

OTOH, the particular case of author details the input is more clearly
defined than items for which we currently provide filters, so maybe it
could use a different interface.

One final point (although I don't think you're suggesting this) is that
we shouldn't require shared objects; I think scripts using stdin+stdout
are a much simpler interface and provides a much lower barrier to entry,
not least because the range of languages that can be used to implement
the filters is so much greater.


  parent reply	other threads:[~2014-01-09 22:58 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-09 21:34 Jason
2014-01-09 22:29 ` mailings
2014-01-09 22:58 ` john [this message]
2014-01-10  1:41   ` Jason
2014-01-10  2:11     ` Jason
2014-01-10  4:26       ` Jason
2014-01-10  9:06       ` john
2014-01-10 15:57         ` Jason
2014-01-10 17:12           ` bluewind
2014-01-10 17:20             ` john
2014-01-10 17:43               ` mricon
2014-01-10 18:00                 ` Jason
2014-01-10 18:00               ` Jason
2014-01-10 17:57             ` Jason
2014-01-10 20:03               ` bluewind
2014-01-10 20:11                 ` john
2014-01-10 20:25                   ` bluewind
2014-01-10 20:36                     ` john
2014-01-10 20:56                       ` bluewind
2014-01-11  2:37                         ` Jason
2014-01-11  2:34                 ` Jason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140109225802.GM7608@serenity.lan \
    --to=cgit@lists.zx2c4.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).