From: Mojca Miklavec
Newsgroups: gmane.comp.tex.context
Subject: Re: transliteration russian
Date: Fri, 29 Oct 2010 23:25:20 +0200
To: Steffen Wolfrum
Cc: mailing list for ConTeXt users

On Fri, Oct 29, 2010 at 13:18, Steffen Wolfrum wrote:
> Hi all,
>
> I am just about to typeset a book by a Russian author written in English, but with a lot of Russian literature listed in the bibliography:
> The titles of these sources are Russian but in Latin transliteration, like this ...
> O koordinacii mezhdunarodnyh i vneshnejekonomicheskih svjazej subjektov Rossijskoj Federacii
>
> But even though I assigned "\language[ru]", the word "vneshnejekonomicheskih", e.g., does not get hyphenated.
> And there are some dozen more titles that show the same problem ...
>
> Is this (failing to hyphenate) because of the transliteration?
> Do I have to choose another \language key?

Dear Steffen,

The Russian patterns only cover the Cyrillic script. Serbian patterns are the only ones that cover both scripts, but even then TeX sees the two sets of patterns as two different languages.

The best thing to do would be to transliterate the Russian patterns into Latin script (under one condition: the transliteration needs to be one-to-one; if one Cyrillic glyph transliterates into two Latin characters, that doesn't help you). If you use LuaTeX you may then load the patterns on the fly.

Another "easy" option would be to load any other Slavic patterns, as Jano suggested, and then add exceptions where needed.

I'm not sure whether transliterated patterns belong in hyph-utf8. (If nothing else, Russian is transliterated differently into Slovenian, for example, so one would formally then need "transliteration from Russian to any other given language written in Cyrillic script").

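The pattern transliteration mentioned above could be sketched roughly as follows. This is only an illustration in Python, not part of hyph-utf8 or ConTeXt: the mapping `CYR_TO_LAT`, the helper names, and the sample patterns are all made up, and the real transliteration rules would have to be supplied. The sketch rejects any mapping that is not one-to-one, since that is the condition under which transliterated patterns remain usable.

```python
# Hypothetical, partial one-to-one mapping (illustration only; not an
# official transliteration scheme).
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d",
    "к": "k", "о": "o", "р": "r", "т": "t",
}


def check_one_to_one(mapping):
    """Raise ValueError unless each letter maps to exactly one distinct letter."""
    for src, dst in mapping.items():
        if len(src) != 1 or len(dst) != 1:
            raise ValueError(f"{src!r} -> {dst!r} is not a single-character mapping")
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("two source letters share the same target letter")


def transliterate_pattern(pattern, mapping):
    """Map the letters of one pattern; digits and word-boundary dots stay as they are."""
    out = []
    for ch in pattern:
        if ch.isdigit() or ch == ".":
            out.append(ch)
        elif ch in mapping:
            out.append(mapping[ch])
        else:
            raise KeyError(f"no transliteration for {ch!r}")
    return "".join(out)


check_one_to_one(CYR_TO_LAT)

# Made-up patterns in the usual TeX notation (not taken from the Russian pattern file).
made_up_patterns = [".ко1р", "а1та", "2д1в"]
latin_patterns = [transliterate_pattern(p, CYR_TO_LAT) for p in made_up_patterns]
print(latin_patterns)  # ['.ko1r', 'a1ta', '2d1v']
```

A scheme that writes one Cyrillic letter as two Latin letters (such as "zh") fails the check, which is exactly the case where this approach does not help.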
[Still under the assumption that you use LuaTeX and that the transliteration is one-to-one.]

By far the easiest and most portable solution would be if you could convince Taco to implement something like "Latin a is equivalent to Cyrillic a as far as hyphenation is concerned" (which could also solve many other problems that we have). Actually, you can already do that by redefining the \lccode of Latin a to point to Cyrillic a (and doing the same for the whole alphabet), but then you need to make sure that you don't use any commands for lowercasing/uppercasing words.

If you need details, I can help you out, but first the exact transliteration rules are needed.

Mojca