From: Lars Huttar
Newsgroups: gmane.comp.tex.context
Subject: Re: modifying URL wrapping rules
Date: Wed, 19 Nov 2008 14:35:16 -0600
To: Mailing list for ConTeXt users

On 11/18/2008 3:44 PM, Arthur Reutenauer wrote:
>> From what I can tell, the .tex file loads one of the other three:
>>   \loadmarkfile{lang-url}
>
>   \loadmarkfile loads either lang-url.mkii or lang-url.mkiv, depending
> on the ConTeXt version you're running (MkII / MkIV).  In Mark IV, the
> Lua code is then put in lang-url.lua, which is input by lang-url.mkiv
> (you can see "\registerctxluafile{lang-url}{1.001}" near the beginning
> of the latter).  This architecture enables you to reuse the Lua code in
> completely different environments (for example, in a pure Lua script).
>
>> Our project has a requirement of using Xetex, so I have to stick with
>> that.  Does that mean lang-url doesn't work at all?
>
>   ConTeXt on XeTeX is considered Mark II as far as the mark business
> goes (it doesn't know about Lua), so you have access to the exact same
> code as with pdfTeX; in this case, lang-url.mkii will be loaded.

OK, I've taken a stab at it. Here is the main code now in the modified
lang-url.mkii. For brevity in this email I've just omitted the lines
that I actually commented out in the file, namely characters that
Chicago style does not say you can line-break URLs on.

\def\sethyphenatedurlnormal#1{\expandafter\chardef\csname url @ #1\endcsname\zerocount}
\def\sethyphenatedurlbefore#1{\expandafter\chardef\csname url @ #1\endcsname\plusone  }
\def\sethyphenatedurlafter #1{\expandafter\chardef\csname url @ #1\endcsname\plustwo  }

% Chicago manual of style rules:
% Break URLs after: / or //  (I don't know how to implement // so will be content with / for now.
%   To do: prevent breaking in middle of double slash //.)
% Break URLs before: ~ . , - _ ? # %
% Break URLs before or after: = &  (I don't know how to implement 'before or after' so will
%   be content with breaking 'before' these characters for now).

\sethyphenatedurlbefore \letterhash
\sethyphenatedurlbefore \letterpercent
\sethyphenatedurlbefore \letterampersand
\sethyphenatedurlbefore ,
\sethyphenatedurlbefore -
\sethyphenatedurlbefore .
\sethyphenatedurlbefore =
\sethyphenatedurlbefore ?
\sethyphenatedurlbefore _
\sethyphenatedurlbefore \lettertilde
\sethyphenatedurlafter  /   % was \sethyphenatedurlbefore /

However, I have a few unsolved problems here.

1) I don't see a way, with the '\sethyphenatedurlbefore' or 'after'
mechanism, to tell it not to break a URL between two slashes, as in
"http://". At first I thought that since our text only had a few URLs,
we'd likely never care. But ... you guessed it. One URL got broken
between the slashes:
  "http:/
  /www.sil.org/..."
So I tried using the base TeX hyphenation mechanism to inhibit breaking
there: I changed the document from
  \hyphenatedurl{http://www.sil.org/...}
to
  \hyphenatedurl{\hyphenation{http://}www.sil.org/...}
but that gave a stack overflow. Then I tried
  \hyphenation{http://}\hyphenatedurl{www.sil.org/...}
but got this error:

! Not a letter.
http:
     //
\hyphenation ...malhyphenation {\the \scratchtoks }\endgroup
... Linguistics. \hyphenation {http://}
                 \hyphenatedurl {www.sil.or...
\BE #1->\startmainexdent {#1
                            }\stopmainexdent
l.317 ...l.org/silesr/abstract.asp?ref=2007-015}.}

I'm kind of shooting in the dark there, so maybe somebody who knows TeX
can help me out.

2) Even though I have "\sethyphenatedurlafter /" instead of
"\sethyphenatedurlbefore /", there are four cases where a URL is broken
before a slash, e.g.:
  http://www.sil.org/.../009
  /YAMBASSA.html.
and no cases where a URL is broken after a slash (except when it's also
before a slash -- see 1).
I wonder if my modifications are actually taking effect? Do I need to
compile the changes to the .mkii file or something? I tried
texexec.bat --make --all, but that didn't seem to change the outcome.

3) Conversely, even though I have "\sethyphenatedurlbefore -" and not
"\sethyphenatedurlafter -", there is a case where a URL is broken after
a hyphen (a hyphen that was already present in the URL):
  http://www..../Niger-
  Congo/...
and no case where a URL is broken before a hyphen.
Note that the "\sethyphenatedurlbefore -" setting is unchanged from the
original lang-url.mkii, so this is not an issue of needing to recompile.
Maybe the general TeX hyphenation mechanism is operating here, in spite
of the URL breaking settings. How do I override that (only for the URL)?

4) In one case, a URL is broken over the end of a column. That's ok, but
it would be nice to be able to strongly discourage that from happening
at the end of a page. I'm told that's a difficult problem to solve.
It's not mandatory for us at this point, but if anyone has a solution
I'd like to hear about it.

Thanks,
Lars
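
For reference, the intended behavior from the Chicago-style comments in
the code above can be modeled outside TeX. The following is a minimal
sketch in Python, not ConTeXt code and not how lang-url.mkii works
internally. It covers the two cases the \sethyphenatedurlbefore and
\sethyphenatedurlafter settings do not express (breaking before or after
= and &, and never breaking between the two slashes of //). The names
break_points and show_breaks, the three constants, and the sample URL
are hypothetical, chosen only for illustration.

```python
# Model of the Chicago-style URL break rules listed in the email.
# All names here are illustrative assumptions, not part of ConTeXt.

BREAK_AFTER = set("/")          # break allowed after these characters
BREAK_BEFORE = set("~.,-_?#%")  # break allowed before these characters
BREAK_EITHER = set("=&")        # break allowed before or after these


def break_points(url):
    """Return sorted indices p where a break may fall between url[p-1] and url[p]."""
    points = set()
    n = len(url)
    for i, ch in enumerate(url):
        # A break before position i only makes sense if something precedes it.
        if (ch in BREAK_BEFORE or ch in BREAK_EITHER) and i > 0:
            points.add(i)
        # A break after position i only makes sense if something follows it.
        if (ch in BREAK_AFTER or ch in BREAK_EITHER) and i + 1 < n:
            points.add(i + 1)
    # Never break in the middle of a double slash.
    return sorted(p for p in points if not (url[p - 1] == "/" and url[p] == "/"))


def show_breaks(url, marker="|"):
    """Return the URL with a marker inserted at every allowed break point."""
    pieces = []
    start = 0
    for p in break_points(url):
        pieces.append(url[start:p])
        start = p
    pieces.append(url[start:])
    return marker.join(pieces)


if __name__ == "__main__":
    # prints: http://|www|.sil|.org/|a|-b
    print(show_breaks("http://www.sil.org/a-b"))
```

A table of allowed positions like this can be compared against the
breaks that actually appear in the typeset output, which may help tell
apart the cases in 2) and 3) where a break falls on the wrong side of a
character.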