From: Xan
To: luigi scarso
Cc: mailing list for ConTeXt users
Newsgroups: gmane.comp.tex.context
Subject: Re: Off-topic: Convert PDF to (Con/La)TeX
Date: Wed, 15 May 2013 18:14:34 +0200

On 13/05/13 09:55, luigi scarso wrote:



On Sun, May 12, 2013 at 4:32 PM, Xan <dxpublica@telefonica.net> wrote:
Hi,

I just want to know if there is any tool to convert a PDF (generated by LaTeX or ConTeXt) back to a LaTeX or ConTeXt source file. Does anyone have any experience with that?
I'm thinking about two alternatives:
* a PDF-reading library such as PoDoFo, plus a custom script that maps the PDF content (text) to ConTeXt commands
* convert the PDF to JPG and apply http://detexify.kirelabs.org/classify.html to map the glyphs to TeX symbols.

For me it's vital that mathematical symbols such as the integral sign come out as TeX commands (\int) and not as UTF-8 characters.

Thanks a lot,
Xan.


Have you seen the mudraw program of mupdf (http://www.mupdf.com/)?
It has a -t switch that outputs txt, and -tt and -ttt switches that output xml.

--
luigi
Thank you for answering, and sorry for the delay. I will check it, but I suspect that if I have

$$\int_{i=1}^{\infty} x^2$$

in a LaTeX document and generate a PDF from it, then running mudraw -t on that PDF will not give back the formula, but rather something like "S i=1 x²".
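
To make the concern concrete, here is a minimal Python sketch of the "custom script" idea. It is an illustration only, not something mudraw or MuPDF provides: the table UNICODE_TO_TEX and the function unicode_to_tex are hypothetical names, and the table holds just a few entries. It assumes the extractor emits real Unicode characters such as U+222B for the integral sign.

```python
# Hypothetical lookup table (illustration only): Unicode characters that a
# PDF text extractor might emit, mapped to TeX replacements. Control words
# carry a trailing space so they do not merge with a following letter.
UNICODE_TO_TEX = {
    "\u222b": "\\int ",    # integral sign
    "\u221e": "\\infty ",  # infinity
    "\u2211": "\\sum ",    # summation sign
    "\u00b2": "^{2}",      # superscript two
    "\u00b3": "^{3}",      # superscript three
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(UNICODE_TO_TEX)


def unicode_to_tex(text: str) -> str:
    """Replace known Unicode math characters with TeX; leave the rest as-is."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    # Best case: the extractor kept the Unicode symbols.
    print(unicode_to_tex("\u222b i=1 \u221e x\u00b2"))
    # Worst case: the integral sign came out as a plain "S".
    print(unicode_to_tex("S i=1 x\u00b2"))
```

Even in the best case the result is only a flat sequence of symbols: nothing in the extracted text says that "i=1" was a lower limit and the infinity sign an upper limit, so \int_{i=1}^{\infty} cannot be rebuilt by a character table alone. In the worst case, where the integral sign is extracted as "S", the table has nothing to match at all.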

Thanks,
Xan.