From: "D.J."
Newsgroups: gmane.text.pandoc
Subject: Re: Converting all PDF in a folder to Markdown quick question
Date: Tue, 10 May 2022 18:37:54 -0700 (PDT)
Message-ID: <2440b99b-01f0-48be-bd5c-6cfb7885bb9an@googlegroups.com>
References: <846e351a-a762-4c7d-8026-6fb700893f44n@googlegroups.com>
To: pandoc-discuss

Aha, I missed the part about Pandoc not converting from PDF. I will try one of the other tools. Thank you all for your responses!

On Tuesday, May 10, 2022 at 1:10:15 AM UTC-7 Bastien Dumont wrote:
You may be interested in https://github.com/jzillmann/pdf-to-markdown

On Monday 09 May 2022 at 07:10:02PM, D.J. wrote:
> I've been able to convert all HTML in a folder to Markdown using:
>
> find ./ -iname "*.html" -type f -exec sh -c 'pandoc "${0}" -o "${0%.html}.md"' {} \;
>
> I'm now trying to convert all PDF files in a folder to Markdown using a similar
> approach:
>
> find ./ -iname "*.pdf" -type f -exec sh -c 'pandoc "${0}" -o "${0%.html}.md"' {} \;
>
> All that happens when I run the same command for PDF conversion is that the
> terminal then says:
>
> dquote>
>
> The PDF isn't converted to Markdown. I'm a terminal/Pandoc newb, so maybe it's
> different for PDF? I have installed MacTeX using brew install librsvg python
> homebrew/cask/basictex
>
> Do I have to do a full pdflatex install, or is there a better formula here?
> Thanks!
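
As a rough sketch of how the batch step could look once a tool that actually reads PDF is chosen: the function below keeps the find-based approach from the question but makes the converter a parameter, since Pandoc itself does not take PDF as input. The name convert_all and its calling convention (converter, then input file, then output file) are hypothetical, not part of Pandoc or any tool mentioned in this thread. It also avoids two quirks of the quoted PDF command: the file name is passed as $1 rather than $0, and the extension is stripped generically instead of with the hard-coded .html suffix.

```shell
# convert_all DIR EXT CMD [ARGS...]   (hypothetical helper, for illustration only)
# For every file under DIR whose name ends in .EXT (case-insensitive), run:
#   CMD [ARGS...] <input-file> <same-path-with-.md-extension>
# CMD stands for whichever PDF-reading converter is installed.
convert_all() {
  if [ "$#" -lt 3 ]; then
    echo "usage: convert_all DIR EXT CMD [ARGS...]" >&2
    return 2
  fi
  dir=$1
  ext=$2
  shift 2
  find "$dir" -type f -iname "*.$ext" -exec sh -c '
    in=$1               # file name handed over by find
    shift               # what remains is the converter command line
    out="${in%.*}.md"   # drop the last extension, whatever its case
    "$@" "$in" "$out"
  ' sh {} "$@" \;
}
```

For example, assuming poppler's pdftotext is installed, `convert_all . pdf pdftotext -layout` would write one file per PDF. Note that pdftotext emits plain text, so the resulting .md files would contain no Markdown structure; a dedicated tool such as the pdf-to-markdown project linked above may give better results.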
