To: Edbrowse-dev@lists.the-brannons.com
From: Karl Dahlke
Reply-to: Karl Dahlke
User-Agent: edbrowse/3.5.4.1+
Date: Thu, 13 Aug 2015 23:45:37 -0400
Message-ID: <20150713234537.eklhad@comcast.net>
Subject: [Edbrowse-dev] tidy5

> Am I on the right track in thinking, well tidy has a central "switch-case"
> section over various tag types, and we have a central "switch-case" in

Forgive me, I haven't looked at the code at all, but I would guess
there's a tidy5 encodeTags() that takes the html text and makes the tree.
We would just call that instead of our encodeTags(), thus slicing out all that home-grown html parsing code that I wrote; I don't want to be in that business any more.
We would then follow up with software to traverse their node tree and build our node tree.

The new tree will have more nodes than ours does today: a node for every tag, not just some tags, a node for each block of text, a node for each html comment.
So a lot more nodes, but perhaps somewhat backward compatible with what we have today, at least for the first pass, at least to get us going.
Then we improve and improve and improve.

Karl Dahlke
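
A rough sketch of that second step, walking a parser's tree and building our own, might look like this in C. Everything here is an assumption made for illustration: SrcNode is a hypothetical stand-in for whatever node type tidy5 really exposes, and OurNode, buildTree() and the other names are invented, not taken from tidy or from edbrowse. It only shows the shape of the traversal, with one node per tag, per block of text, and per comment.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node kinds: one node per tag, text block, or comment. */
enum NodeKind { NODE_TAG, NODE_TEXT, NODE_COMMENT };

/* Hypothetical stand-in for a node in the parser's tree (not tidy's type). */
struct SrcNode {
	enum NodeKind kind;
	const char *name;	/* tag name, or the text / comment content */
	struct SrcNode *child;	/* first child */
	struct SrcNode *next;	/* next sibling */
};

/* Hypothetical node in our own tree. */
struct OurNode {
	enum NodeKind kind;
	char *name;		/* our own copy of the name or content */
	struct OurNode *parent, *firstChild, *sibling;
};

/* Duplicate a string, exiting on allocation failure. */
char *copyString(const char *s)
{
	char *t = malloc(strlen(s) + 1);
	if (!t) {
		perror("malloc");
		exit(EXIT_FAILURE);
	}
	strcpy(t, s);
	return t;
}

/* Copy a chain of sibling source nodes, and everything below them,
 * into freshly allocated nodes of our own. Returns the first sibling. */
struct OurNode *buildTree(const struct SrcNode *src, struct OurNode *parent)
{
	struct OurNode *first = NULL, *last = NULL;
	for (; src; src = src->next) {
		struct OurNode *n = calloc(1, sizeof *n);
		if (!n) {
			perror("calloc");
			exit(EXIT_FAILURE);
		}
		assert(src->name != NULL);
		n->kind = src->kind;
		n->name = copyString(src->name);
		n->parent = parent;
		n->firstChild = buildTree(src->child, n);
		if (last)
			last->sibling = n;
		else
			first = n;
		last = n;
	}
	return first;
}

/* Count every node reachable from a chain of siblings. */
int countNodes(const struct OurNode *n)
{
	int total = 0;
	for (; n; n = n->sibling)
		total += 1 + countNodes(n->firstChild);
	return total;
}

/* Release a chain of siblings and all of their descendants. */
void freeTree(struct OurNode *n)
{
	while (n) {
		struct OurNode *next = n->sibling;
		freeTree(n->firstChild);
		free(n->name);
		free(n);
		n = next;
	}
}

/* Build a small sample: html -> (comment, body -> p -> text). */
int sampleNodeCount(void)
{
	struct SrcNode text = { NODE_TEXT, "hello", NULL, NULL };
	struct SrcNode p = { NODE_TAG, "p", &text, NULL };
	struct SrcNode body = { NODE_TAG, "body", &p, NULL };
	struct SrcNode comment = { NODE_COMMENT, " note ", NULL, &body };
	struct SrcNode html = { NODE_TAG, "html", &comment, NULL };
	struct OurNode *ours = buildTree(&html, NULL);
	int total = countNodes(ours);
	freeTree(ours);
	return total;
}
```

Calling sampleNodeCount() from a main() builds the five-node sample, copies it, and returns the number of nodes in the copy. In a real first pass, buildTree() is where we would decide which of their nodes map onto the tags we already handle, which is what would keep it somewhat backward compatible.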