From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Mojca Miklavec"
Newsgroups: gmane.comp.tex.context
Subject: Re: How to process simple HTML files with LuaTeX
Date: Fri, 14 Sep 2007 15:46:09 +0200
Message-ID: <6faad9f00709140646w29f06cd1m5859305eb5635e45@mail.gmail.com>
References: <6faad9f00709130604j6d28699didefe1c18a3a90ec@mail.gmail.com> <46E9B816.3070102@wxs.nl>
Reply-To: mailing list for ConTeXt users
To: "mailing list for ConTeXt users"
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_Part_2444_2035107.1189777569648"
In-Reply-To: <46E9B816.3070102@wxs.nl>
List-Id: mailing list for ConTeXt users

------=_Part_2444_2035107.1189777569648
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

On 9/14/07, Hans Hagen wrote:
> Mojca Miklavec wrote:
> > Hello,
> >
> > I was trying to figure out how to process simple HTML files with the
> > new code, but I fail to understand the details. Here's a simple file I
> > would like to process:
> >
> keep in mind that this is still somewhat experimental

Sure :) That's why I'm sending files for testing :) :) :)

> % best define mappings before loading the file
>
> \startxmlsetups all:html
>     \xmlsetsetup{main}{head|h1|h2}{*}
> \stopxmlsetups
>
> \xmlregistersetup{all:html}
>
> % register this so that it's done for each load
>
> \startxmlsetups h1
>     \subject{\xmlflush{#1}}
> \stopxmlsetups
>
> \startxmlsetups h2
>     \subsubject{\xmlflush{#1}}
> \stopxmlsetups
>
> \startxmlsetups head
>     \startstandardmakeup
>         THIS IS ABOUT: \xmlfilter{main}{/head/title/text()}
>     \stopstandardmakeup
> \stopxmlsetups
>
> % that's it
>
>
> \setupcolors[state=start]
> \setuphead[subject][style=\bfd,color=blue]
> \setuphead[subsubject][style=\bfc,color=blue]
>
> \starttext
>
> \xmlprocess{main}{test.html}{}
>
> \stoptext

Great! This works perfectly and seems much easier to write than the old
code, though I still have no idea how to implement some parts of it:
- where to plug in the entities such as &nbsp;, &le;, ...
- how to catch classes: how to differentiate between

<h2>title</h2>
and
<h2 class="filename">title</h2>

- and some more - there are some simple examples in the attachment (too long to copy-paste)

Thanks again,
    Mojca

------=_Part_2444_2035107.1189777569648
Content-Type: application/x-tex; name="frogs.tex"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="frogs.tex"
X-Attachment-Id: f_f6kq8qd0

JSBlbmdpbmU9bHVhdGV4Cgpcc3RhcnRsdWFjb2RlCgl4bWwuZW50aXRpZXMgPSB7IFsnICddID0g JyZuYnNwOycgfQoJCglmdW5jdGlvbiBseG1sLmZsdXNoKGlkKQoJCXhtbC5zcHJpbnQoeG1sLm15 c3R1cGlkZnVuY3Rpb24obHhtbC5pZChpZCkpKQoJZW5kCgkKCWZ1bmN0aW9uIHhtbC5teXN0dXBp ZGZ1bmN0aW9uKHJvb3QpCgkJbG9jYWwgZCA9IHJvb3QuZHQKCQkJZm9yIGs9MSwjZCBkbwoJCQkJ bG9jYWwgZGsgPSBkW2tdCgkJCQlpZiB0eXBlKGRrKSA9PSAic3RyaW5nIiB0aGVuCgkJCQkJZFtr XSA9IGRrOmdzdWIoIiZuYnNwOyIsJyAnKQoJCQkJCWRrID0gZFtrXQoJCQkJCWRba10gPSBkazpn c3ViKCImbGU7IiwgJ1xcbWF0aGVtYXRpY3N7XFxsZX0nKQoJCQkJZW5kCgkJCWVuZAoJCXJldHVy biBkCgllbmQKXHN0b3BsdWFjb2RlCgpcc3RhcnR4bWxzZXR1cHMgYWxsOmh0bWwKICAgIFx4bWxz ZXRzZXR1cHttYWlufXtoZWFkfGgxfGgyfHB8cHJlfHNwYW58c3VifHRhYmxlfGJ9eyp9ClxzdG9w eG1sc2V0dXBzCgolIHJlZ2lzdGVyIHRoaXMgc28gdGhhdCBpdCdzIGRvbmUgZm9yIGVhY2ggbG9h ZApceG1scmVnaXN0ZXJzZXR1cHthbGw6aHRtbH0KCiUgdGl0bGUgLSBhbG1vc3QgT0ssIHNob3Vs ZCBiZSBwcmludGVkIG9uIGV2ZXJ5IHBhZ2UsIEkgd2lsbCBmaWd1cmUgb3V0IChJIGFscmVhZHkg YXNrZWQgdGhlIHNhbWUgcXVlc3Rpb24gb25jZSBhbHJlYWR5KQpcc3RhcnR4bWxzZXR1cHMgaGVh ZAoJXHNldHVwaGVhZGVydGV4dHNbXHhtbGZpbHRlcnttYWlufXsvaGVhZC90aXRsZS90ZXh0KCl9 XVtwYWdlbnVtYmVyXQpcc3RvcHhtbHNldHVwcwoKJSBzdWJqZWN0IC0gT0sKXHN0YXJ0eG1sc2V0 dXBzIGgxCiAgICBcc3ViamVjdHtceG1sZmx1c2h7IzF9fQpcc3RvcHhtbHNldHVwcwoKJSBzdWJz dWJqZWN0IC0gT0ssIGJ1dCBzaG91bGQgYmUgdHJlYXRlZCBkaWZmZXJlbnQgZGVwZW5kaW5nIG9u IGNsYXNzClxzdGFydHhtbHNldHVwcyBoMgoJJSB0aGlzIGRvZXNuJ3Qgd29yawoJJSBcZG9pZmVs c2V7XHhtbGF0dHsjMX17Y2xhc3N9fXtmaWxlbmFtZX17Li4ufXsuLi59Cglcc3Vic3ViamVjdHtc eG1sZmx1c2h7IzF9fQpcc3RvcHhtbHNldHVwcwoKJSBwYXJhZ3JhcGhzIC0gT0sKXHN0YXJ0eG1s c2V0dXBzIHAKCVx4bWxmbHVzaHsjMX1ccGFyClxzdG9weG1sc2V0dXBzCgolIGNvZGUgc2FtcGxl
cyBzaG91bGQgYmUgdHlwZXNldCBhcyB0aGV5IGFyZSAoYWxzbzogbm8gZW50aXRpZXMgc3Vic3Rp dHV0aW9uKQolIGN1cnJlbnRseSBubyBibGFuayBsaW5lcyBhcmUgcHJpbnRlZCwgYnV0IEkgd2ls bCB0cnkgdG8gZmlndXJlIG91dCBob3cgdG8gc29sdmUgdGhhdApcc3RhcnR4bWxzZXR1cHMgcHJl CglcYmdyb3VwXG9iZXlsaW5lc1x0dAoJXHhtbGZsdXNoeyMxfQoJXGVncm91cApcc3RvcHhtbHNl dHVwcwoKJSB0YWJsZXMgLSByZWFsbHkgdWdseSBpbXBsZW1lbnRhdGlvbiwgb25seSBjb2xzcGFu IG5vdCB3b3JraW5nLCBidXQgSSBkb24ndCBuZWVkIGl0IHJpZ2h0IG5vdwpcc3RhcnR4bWxzZXR1 cHMgdGFibGUKCVx4bWxzZXRzZXR1cHttYWlufXt0cn17Kn0KCVxiVEFCTEUgXHhtbGZsdXNoeyMx fSBcZVRBQkxFClxzdG9weG1sc2V0dXBzCgpcc3RhcnR4bWxzZXR1cHMgdHIKCVx4bWxzZXRzZXR1 cHttYWlufXt0ZH17Kn0KCVxiVFIgXHhtbGZsdXNoeyMxfSBcZVRSClxzdG9weG1sc2V0dXBzCgpc c3RhcnR4bWxzZXR1cHMgdGQKCVxiVEQgXHhtbGZsdXNoeyMxfSBcZVREClxzdG9weG1sc2V0dXBz CgolIHNob3VsZCBiZSBvbmx5IGRlZmluZWQgZm9yIGNsYXNzPXNpbXBsZW1hdGgKXHN0YXJ0eG1s c2V0dXBzIHNwYW4KCVxtYXRoZW1hdGljc3tceG1sZmx1c2h7IzF9fQpcc3RvcHhtbHNldHVwcwoK JSBPSyBpbiBteSBjYXNlClxzdGFydHhtbHNldHVwcyBzdWIKCVxsb3d7XHhtbGZsdXNoeyMxfX0K XHN0b3B4bWxzZXR1cHMKCiUgYm9sZCAtIE9LLCB1bmxlc3Mgd2hlbiBuZXN0ZWQgd2l0aCA8aT4K XHN0YXJ0eG1sc2V0dXBzIGIKCXtcYmZceG1sZmx1c2h7IzF9fQpcc3RvcHhtbHNldHVwcwoKXHNl dHVwY29sb3JzCglbc3RhdGU9c3RhcnRdClxzZXR1cGhlYWQKCVtzdWJqZWN0XQoJW3N0eWxlPVxi ZmMsCgkgY29sb3I9Ymx1ZSwKCSBwYWdlPXllcywKCSBhZnRlcj1dClxzZXR1cGhlYWQKCVtzdWJz dWJqZWN0XQoJW3N0eWxlPVxiZmEsCgkgY29sb3I9Ymx1ZV0KXHNldHVwYmxhbmsKCVtiaWddClxz ZXR1cHBhZ2VudW1iZXJpbmcKCVtsb2NhdGlvbj1dCgpcc3RhcnR0ZXh0CgpceG1scHJvY2Vzc3tt YWlufXtmcm9ncy5odG1sfXt9Cgpcc3RvcHRleHQK ------=_Part_2444_2035107.1189777569648 Content-Type: text/html; name="frogs.html" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="frogs.html" X-Attachment-Id: f_f6kq90gj PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPCFET0NUWVBFIGh0bWwgUFVC TElDICItLy9XM0MvL0RURCBYSFRNTCAxLjAgU3RyaWN0Ly9FTiIgImh0dHA6Ly93d3cudzMub3Jn L1RSL3hodG1sMS9EVEQveGh0bWwxLXN0cmljdC5kdGQiPgo8aHRtbCB4bWxucz0iaHR0cDovL3d3 
dy53My5vcmcvMTk5OS94aHRtbCI+Cgo8aGVhZD4KCTxtZXRhIGNvbnRlbnQ9InRleHQvaHRtbDsg Y2hhcnNldD1VVEYtOCIgaHR0cC1lcXVpdj0iY29udGVudC10eXBlIiAvPgoJPHRpdGxlPkZyb2dz PC90aXRsZT4KCTxsaW5rIGhyZWY9ImRlZmF1bHQuY3NzIiB0eXBlPSJ0ZXh0L2NzcyIgcmVsPSJT dHlsZVNoZWV0IiAvPgoJPGxpbmsgcmVsPSJzdHlsZXNoZWV0IiB0eXBlPSJ0ZXh0L2NzcyIgaHJl Zj0icHJpbnQuY3NzIiBtZWRpYT0icHJpbnQiIC8+Cgk8c3R5bGUgdHlwZT0idGV4dC9jc3MiPgoJ aDEsIGgyIHsKCQljb2xvcjogYmx1ZSA7Cgl9CgloMi5maWxlbmFtZSB7CgkJZm9udC1mYW1pbHk6 IG1vbm9zcGFjZSA7Cgl9Cgkuc2ltcGxlbWF0aCB7CgkJZm9udC1zdHlsZTogaXRhbGljIDsKCX0K CTwvc3R5bGU+CjwvaGVhZD4KCjxib2R5PgoKPGgxPkZyb2dzPC9oMT4KPGgyIGNsYXNzPSJmaWxl bmFtZSI+ZnJvZy5jLCBmcm9nLmNwcCwgZnJvZy5wYXMsIGZyb2cuamF2YTwvaDI+Cgo8cD5JZiBj bGFzcyBlcXVhbHMgZmlsZW5hbWUsIEgyIHNob3VsZCBiZSB0cmVhdGVkIHNsaWdodGx5IGRpZmZl cmVudC48L3A+Cgo8cD5Tb21lIGVudGl0aWVzIGNhdXNlIHByb2JsZW1zOiAmbmJzcDssICZsZTss ICZnZTssIC4uLjwvcD4KCjxkaXYgYWxpZ249ImNlbnRlciI+Cgk8dGFibGUgYm9yZGVyPSIxIj4K CQk8dHI+PHRkIGNvbHNwYW49IjIiIGFsaWduPSJjZW50ZXIiPjxiPlNpbXBsZSB0YWJsZTwvYj48 L3RkPjwvdHI+CgkJPHRyPjx0ZD4xPC90ZD48dGQ+MjwvdGQ+PC90cj4KCQk8dHI+PHRkPnRocmVl PC90ZD48dGQ+Zm91cjwvdGQ+PC90cj4KCTwvdGFibGU+CjwvZGl2PgoKCjxoMj5Db25zdHJhaW50 czwvaDI+Cgo8cD5XZSBoYXZlIDxzcGFuIGNsYXNzPSJzaW1wbGVtYXRoIj5YPC9zcGFuPiBhbmQg PHNwYW4gY2xhc3M9InNpbXBsZW1hdGgiPlkgKDEgJmxlOyBYLCBZICZsZTsgMTAwMCk8L3NwYW4+ IC4uLjwvcD4KCjxwPiBGb3IKPHNwYW4gY2xhc3M9InNpbXBsZW1hdGgiPng8c3ViPmZyb2c8L3N1 Yj48L3NwYW4+CjxzcGFuIGNsYXNzPSJzaW1wbGVtYXRoIj55PHN1Yj5mcm9nPC9zdWI+PC9zcGFu Pgo8c3BhbiBjbGFzcz0ic2ltcGxlbWF0aCI+eDxzdWI+ZnJvZ2dpZTwvc3ViPjwvc3Bhbj4gYW5k CjxzcGFuIGNsYXNzPSJzaW1wbGVtYXRoIj55PHN1Yj5mcm9nZ2llPC9zdWI+PC9zcGFuPgp0aGVy ZSBpcyBhIHdlaXJkIGxpbmUgYnJlYWtpbmcgLi4uCjwvcD4KCjxoMj5TYW1wbGUgaW5wdXQ8L2gy PgoKPHByZT40IDQKMSAxIDQgMgoKMgoyIDEgMyAzCjQgMyA0IDQKNCA0CjEgMSA0IDIKCjEKMiAx IDMgNAo3IDYKNCAyIDcgNgoKNQo0IDEgNyAxCjUgMSA1IDUKMiA0IDMgNAo3IDUgNyA1CjYgNiA2 IDYKMCAwCjwvcHJlPgoKPGgyPlNhbXBsZSBvdXRwdXQ8L2gyPgoKPHByZT4xNApubyBwYXRoIGZv dW5kCjEyCjwvcHJlPgoKPC9ib2R5Pgo8L2h0bWw+Cgo= 
------=_Part_2444_2035107.1189777569648--
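A possible direction for the class question raised in this message, offered as a sketch rather than a confirmed answer. The attached frogs.tex notes that testing `\doifelse{\xmlatt{#1}{class}}{filename}` inside the h2 setup did not work at the time. An alternative is to select the element by its attribute when the setups are assigned. This assumes a ConTeXt version whose path expressions accept an attribute test such as `h2[@class='filename']`, and that a later `\xmlsetsetup` assignment overrides an earlier one for the same element. Neither point is confirmed in this thread for the experimental 2007 code. The setup names `h2:plain` and `h2:filename` are made up for the example.

```tex
% Sketch only, not tested against the 2007 experimental code.
% Assumption: the path expression h2[@class='filename'] is supported.
% Assumption: the later \xmlsetsetup wins for elements matched twice.
% h2:plain and h2:filename are hypothetical setup names.

\startxmlsetups all:html
    % default handling for every h2
    \xmlsetsetup{main}{h2}{h2:plain}
    % more specific handling for h2 elements with class="filename"
    \xmlsetsetup{main}{h2[@class='filename']}{h2:filename}
\stopxmlsetups

\xmlregistersetup{all:html}

% ordinary h2: plain subsubject
\startxmlsetups h2:plain
    \subsubject{\xmlflush{#1}}
\stopxmlsetups

% h2 with class="filename": same head, typeset in monospace
\startxmlsetups h2:filename
    \subsubject{\tt \xmlflush{#1}}
\stopxmlsetups
```

The point of doing the selection in the assignment is that each setup body stays free of conditionals, so a further class would only need one more `\xmlsetsetup` line and one more setup.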