Gnus development mailing list
From: Ted Zlatanov <tzz@lifelogs.com>
To: ding@gnus.org
Subject: Re: Built-in HTML parsing and rendering library
Date: Mon, 06 Sep 2010 08:09:35 -0500
Message-ID: <878w3fj9ww.fsf@lifelogs.com>
In-Reply-To: <m3r5h7ys2t.fsf@quimbies.gnus.org>

On Mon, 06 Sep 2010 14:28:10 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: 

LMI> Ted Zlatanov <tzz@lifelogs.com> writes:
>> It's what Gnome uses so it's pretty good.  Because of the Gnome link it
>> would probably be the easiest one to bring into the Emacs core.

LMI> Yes.  I had a quick peek at the interface, and it seemed to return a
LMI> nice DOM that could probably be exported to Emacs pretty easily as an
LMI> elisp list tree like (:html (:head ...) (:body ...)) etc.

HTML and XML are SGML which is a crappy Lisp, so yeah :)  Parsing them
with libxml2 would improve many corners of Emacs.

LMI> And since libxml2 is already installed on 99% of Linux machines, linking
LMI> Emacs to it should be no big deal.

Yes.  The patch would be small.  I don't know if the Emacs maintainers
will have objections but it's kind of weird no one has proposed it yet.

LMI> So the question is: If we have the parse tree in Emacs Lisp, would we be
LMI> able to render it quickly enough for it to make sense to use?  I haven't
LMI> really thought about it much, but it strikes me that rendering heavily
LMI> nested tables and the like might be a time-consuming task in a language
LMI> that's as slow as Emacs Lisp.  But it might be fine; I'm not sure at all.

LMI> Is there a component of libxml2 (or some other handy library) that does
LMI> HTML rendering, too?  :-)

These days Mozilla's Gecko is getting less popular.  http://webkit.org/
is really popular and it's LGPL.  I know it's been proposed for Emacs
inclusion before and I think it's just been general laziness not to
include it.

IMO this is a really deep hole that is measured in man-years of work.
HTML parsing is easy; rendering it is a nightmare compounded by years of
legacy crap.  So I am pessimistic this is a good use of your time.  If
the Emacs project took interest in this, there would be many more
hackers and users available and it could happen.  Or it could all
devolve into endless arguments about keyboard stickers and DVCS supremacy.

Ted




Thread overview: 12+ messages
2010-09-05 22:58 Lars Magne Ingebrigtsen
2010-09-06  4:27 ` Daniel Pittman
2010-09-06  7:53   ` Steinar Bang
2010-09-06 11:33   ` Lars Magne Ingebrigtsen
2010-09-06 12:20     ` Ted Zlatanov
2010-09-06 12:28       ` Lars Magne Ingebrigtsen
2010-09-06 12:40         ` Julien Danjou
2010-09-06 13:09         ` Ted Zlatanov [this message]
2010-09-06 18:26         ` Sivaram Neelakantan
2010-09-06 19:58           ` Lars Magne Ingebrigtsen
2010-09-06  7:32 ` Steinar Bang
2010-09-06  8:29   ` David Engster
