From: kaddour kardio
Newsgroups: gmane.comp.tex.context
Subject: Re: Best way to create a large number of documents from database
Date: Thu, 16 Apr 2020 17:39:57 +0100
In-Reply-To: <780bff75-0ea2-f150-5b6e-5aed60085c49@xs4all.nl>
To: mailing list for ConTeXt users

A relatively simple way is to use a templating system such as jinja2 and iterate over a MkIV template.
Call context with subprocess and you get the result.
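
A minimal sketch of that idea, with the assumptions stated up front: it uses Python's standard-library string.Template as a stand-in for jinja2 so that it runs without third-party packages, the CSV columns `name` and `amount` and the `document-NNNN` file names are made up for illustration, and the context options are the ones quoted further down in this thread. It has not been run against a real ConTeXt installation.

```python
import csv
import io
import subprocess
from pathlib import Path
from string import Template

# Hypothetical template. string.Template stands in for jinja2 here;
# a literal dollar sign in the TeX source would have to be written as $$.
TEMPLATE = Template(r"""\starttext
Dear $name, your amount is $amount.
\stoptext
""")


def render_records(csv_text):
    """Return a list of (file_stem, tex_source), one entry per CSV record."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (f"document-{number:04d}", TEMPLATE.substitute(row))
        for number, row in enumerate(reader, start=1)
    ]


def compile_records(csv_text, workdir="."):
    """Write one .tex file per record and run ConTeXt on each of them."""
    for stem, source in render_records(csv_text):
        tex_path = Path(workdir) / f"{stem}.tex"
        tex_path.write_text(source, encoding="utf-8")
        # --batch and --once are the options mentioned in this thread.
        subprocess.run(
            ["context", "--batch", "--once", tex_path.name],
            cwd=workdir,
            check=True,
        )
```

Passing the contents of the CSV file to `compile_records` would then produce one PDF per record next to the generated .tex files. Values that contain TeX special characters would need escaping before substitution, which this sketch does not do.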

On Thu, 16 Apr 2020 at 15:52, Hans Hagen <j.hagen@xs4all.nl> wrote:
> On 4/16/2020 4:38 PM, Mojca Miklavec wrote:
> > On Thu, 16 Apr 2020 at 11:29, Taco Hoekwater wrote:
> >>> On 16 Apr 2020, at 11:12, Mojca Miklavec wrote:
> >>>
> >>> I have been asked to create a few thousand PDF documents from a CSV
> >>> "database" today
> >>
> >> In CPU cycles, the fastest way is to do a single context --once
> >> run generating all the pages as a single document, then using
> >> mutool merge to split it into separate documents using a (shell)
> >> loop.
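
The split step could be scripted roughly as follows. This is a sketch rather than a tested recipe: it assumes the combined PDF is called all.pdf, that page N belongs to record N, and that `mutool merge -o OUT IN PAGES` selects pages the way the mutool usage text describes. The output names are hypothetical.

```python
import subprocess


def split_commands(combined_pdf, names):
    """Build one `mutool merge` command per page; page N goes to names[N-1]."""
    return [
        ["mutool", "merge", "-o", f"{name}.pdf", combined_pdf, str(page)]
        for page, name in enumerate(names, start=1)
    ]


def split_pdf(combined_pdf, names):
    """Run the commands; needs mutool on PATH and an existing combined PDF."""
    for command in split_commands(combined_pdf, names):
        subprocess.run(command, check=True)
```

Building the command list separately from running it makes the page-to-name mapping easy to inspect before anything is written to disk.
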
> >
> > Just to make it clear: I don't really need to optimize on the CPU end,
> > as the bottleneck is on the other side of the keyboard, so as long as
> > the CPU can process 5k pages today, I'm fine with it :) :) :)
>
> 5K is nothing ... so that will work
>
> >>> One option is that I quickly draft a python script that creates a few
> >>> thousand TeX documents and compiles them individually, but it might be
> >>> easier if there was a way to just create a single template document
> >>> and then run something like
> >>>     context --some-params --N=42 --output=document-0042.pdf template.tex
> >>> or something along those lines.
> >>
> >> If you want to go this route (and you may have to if not each record
> >> fits exactly within a single page),
> >
> > I do have one page per document. The more annoying part is having
> > strange document names that need more attention when mapping page
> > number -> name (I'm not saying this is not doable).
>
> so, don't make files:
>
> - write a tex file foo.tex
> - process it: context --batch --result=1 --once foo
>
> etc ... so, use --result for the target name and use the same input name
> (I won't bother you with the template system in context that no one knows of.)
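
Those two steps might look like this in Python. It is only a sketch: the record names, the `safe_name` helper, and its cleaning rule are hypothetical, while `--batch`, `--once` and `--result` are the options named above.

```python
import re
import subprocess
from pathlib import Path


def safe_name(raw):
    """Map a record name to a file-system friendly result name (hypothetical rule)."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-")
    return cleaned or "unnamed"


def result_command(result, tex_file="foo.tex"):
    """Same input file every time; only the --result name changes."""
    return ["context", "--batch", f"--result={result}", "--once", tex_file]


def build(records, tex_file="foo.tex"):
    """records: iterable of (name, tex_source) pairs."""
    for name, source in records:
        # Overwrite the single input file, then let --result pick the PDF name.
        Path(tex_file).write_text(source, encoding="utf-8")
        subprocess.run(result_command(safe_name(name), tex_file), check=True)
```

Two different record names can collapse to the same cleaned name under this rule, so a real script would probably want to append the record number as well.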

>   Hans
>
> -----------------------------------------------------------------
>                                           Hans Hagen | PRAGMA ADE
>               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
>        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
> -----------------------------------------------------------------
> ___________________________________________________________________________________
> If your question is of interest to others as well, please add an entry to the Wiki!
>
> maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
> webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
> archive  : https://bitbucket.org/phg/context-mirror/commits/
> wiki     : http://contextgarden.net
> ___________________________________________________________________________________