From: Karl Dahlke
To: Edbrowse-dev@lists.the-brannons.com
Date: Sun, 30 Aug 2015 07:31:07 -0400
Message-ID: <20150730073107.eklhad@comcast.net>
Subject: [Edbrowse-dev] parser separation

I was thinking about the history of our js interactions. We felt it was wise to separate and encapsulate, writing jseng-moz.cpp as a separate source file running a separate process.
Among other benefits, this makes it easier to switch engines, or to conform to Mozilla upgrades and changing APIs. We've talked about v8 and duktape, for instance. Those are still possibilities.

So ... if there is any uncertainty at all about the html parser, or if we just want to keep the door open, should we do some encapsulation, and should we start now?

The connection is far simpler than js. Pass html text, get back a tree of nodes. It's conceptually a function call, and doesn't need to be a separate process or thread. Nothing asynchronous, and no ongoing dialog with states. So it's very simple, but it still might be worth putting in another source file.

html-tidy.c - use tidy to parse html
html-hub.c - use hubbub to parse html
...

Let the makefile link in whichever one we want, just as the makefile determines mozilla or v8 or duktape etc. Then html-tidy.c is the only file that needs tidy.h and its structures and API, and it returns to us a tree of our nodes, as converted from the tidy nodes, which we then use to build our DOM.

It's a small bit of administrative overhead that might pay dividends.
What do you think?

Karl Dahlke
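
P.S. A minimal sketch of what that boundary might look like, to make "pass html text, get back a tree of nodes" concrete. Every name here (struct htmlNode, parseHtml, freeHtmlTree) is hypothetical and not something in edbrowse today. The backend shown is only a stand-in that does no real parsing: it wraps the whole input in one text node under a root. A real html-tidy.c or html-hub.c would replace the body of parseHtml with calls into its library and convert that library's nodes into ours.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical engine-neutral node; NOT edbrowse's actual structure. */
struct htmlNode {
	char *name;			/* tag name, or "#text" for character data */
	char *text;			/* character data; NULL for elements */
	struct htmlNode *firstChild;
	struct htmlNode *nextSibling;
};

/* Copy a string into freshly allocated memory; NULL on failure. */
static char *dupString(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);
	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Allocate a node with its own copies of name and optional text. */
static struct htmlNode *newNode(const char *name, const char *text)
{
	struct htmlNode *n = malloc(sizeof *n);
	if (!n)
		return NULL;
	n->name = dupString(name);
	n->text = text ? dupString(text) : NULL;
	n->firstChild = NULL;
	n->nextSibling = NULL;
	if (!n->name || (text && !n->text)) {
		free(n->name);
		free(n->text);
		free(n);
		return NULL;
	}
	return n;
}

/* Free a node, its later siblings, and all of their descendants. */
void freeHtmlTree(struct htmlNode *t)
{
	while (t) {
		struct htmlNode *next = t->nextSibling;
		freeHtmlTree(t->firstChild);
		free(t->name);
		free(t->text);
		free(t);
		t = next;
	}
}

/*
 * The whole interface: html text in, tree of our nodes out.
 * Stand-in backend (an assumption for illustration): no parsing at all,
 * just a root "html" element holding the input as a single text node.
 */
struct htmlNode *parseHtml(const char *html)
{
	struct htmlNode *root;

	assert(html != NULL);
	root = newNode("html", NULL);
	if (!root)
		return NULL;
	root->firstChild = newNode("#text", html);
	if (!root->firstChild) {
		freeHtmlTree(root);
		return NULL;
	}
	return root;
}
```

The rest of the program would see only the struct and the two functions, so swapping tidy for hubbub would come down to which backend file the makefile links.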