Gnus development mailing list
From: Steinar Bang <sb@dod.no>
To: ding@gnus.org
Subject: Re: Built-in HTML parsing and rendering library
Date: Mon, 06 Sep 2010 09:53:01 +0200
Message-ID: <yb0eid78g0y.fsf@dod.no>
In-Reply-To: <87mxrvo5sk.fsf@rimspace.net>

>>>>> Daniel Pittman <daniel@rimspace.net>:

> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>> For parsing xml, there's libxml2, but are there any C/C++ libraries for
>> parsing HTML in use out there that could be compiled into Emacs, by any
>> chance?  Then Gnus wouldn't have to rely on the external w3m library...
>> 
>> I mean, something that parses real-world HTML as well as w3m does, and
>> generates a parse tree based on that.  I guess if it returned a
>> convenient parse tree back to elisp, the HTML could be rendered fast
>> enough from elisp.

> http://www.xmlsoft.org/html/libxml-HTMLparser.html

Another one is "tidylib":
	http://tidy.sourceforge.net/
(the parser from HTML Tidy, repackaged as a library).

There is also the HTML parser in the W3C's libwww, but that is awfully
outdated:
	http://www.w3.org/Library/src/HTML.html
(though the version checked into CVS is said to support HTML 4).






