To: Edbrowse-dev@lists.the-brannons.com
From: Karl Dahlke
Date: Wed, 28 Jan 2015 15:55:05 -0500
Subject: [Edbrowse-dev] html parser and whitespace in tag names

>
> link text goes here.
>
Yes, this did not work because I was trying to be clever. I thought some tags in the text should close the open anchor, if there was an open anchor. I was really thinking about this.

link text

Should the

close the anchor? Does it really matter? There's so much improper html out there it makes my head spin. I tried to anticipate some of this, and I think I did more harm than good.

The latest push just comments out some code, in html.c from line 1810 to 1828.

#if 0
#endif

That code is no longer being used. I didn't delete it because it still might be used in some fashion. If I think it's worthless in a couple of months, I'll delete it and some other code that supports it.

All this makes me wonder again if I should be parsing html at all, or if there isn't some code out there that would do it for me and turn it into a tree of nodes, so I could just work with that. Let somebody else worry about all this "is it nested properly" html crap. I'm trying to leverage more open source libraries. I was going to play with xidel but haven't gotten round to it.

Karl Dahlke
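
For readers following along, the kind of heuristic described above (a later tag implicitly closing an open anchor) might look roughly like the sketch below. This is not the code that was commented out in html.c; the function names and the list of "anchor breaking" tags are assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical set of tags that implicitly end an open anchor.
 * This list is an assumption, not edbrowse's actual list. */
static const char *const anchor_breakers[] = { "p", "div", "table", NULL };

/* Return true if the given tag name is in the hypothetical breaker list. */
static bool breaks_anchor(const char *tag)
{
    for (size_t i = 0; anchor_breakers[i] != NULL; ++i)
        if (strcmp(tag, anchor_breakers[i]) == 0)
            return true;
    return false;
}

/* Walk a sequence of tag names ("a", "/a", "p", ...) and count how many
 * times an open anchor gets closed implicitly by a later tag rather than
 * by an explicit "/a". */
int count_implicit_anchor_closes(const char *const *tags, size_t n)
{
    bool anchor_open = false;
    int implicit = 0;

    for (size_t i = 0; i < n; ++i) {
        const char *t = tags[i];
        if (strcmp(t, "a") == 0) {
            anchor_open = true;
        } else if (strcmp(t, "/a") == 0) {
            anchor_open = false;   /* explicit close; a stray one is ignored */
        } else if (anchor_open && breaks_anchor(t)) {
            ++implicit;            /* the "clever" step: close the anchor early */
            anchor_open = false;
        }
    }
    return implicit;
}
```

With the tag sequence a, p, /a the anchor is closed at the p, so any link text after the p falls outside the anchor. That is the sort of surprise that makes guessing at improper html risky.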