From: CR
Newsgroups: gmane.text.pandoc
Subject: Re: A way to convert PDF to Markdown or other (Solution!)
Date: Thu, 21 Sep 2017 04:17:41 -0700 (PDT)
Message-ID: <24ee679b-3b0a-4ffe-ba28-8f8f25c56052@googlegroups.com>
To: pandoc-discuss

I use <https://document.online-convert.com/convert-to-txt> to extract the text from a PDF, then manually edit the text to turn it into Markdown. It works pretty well, except that the footnotes in the PDF are sometimes mis-formatted in the output text. This is one of the better PDF-to-text convertors I've used, and I tested 5 or 6 of the free online convertors.

I may have to try your method though.
