From: Jesse Rosenthal
Newsgroups: gmane.text.pandoc
Subject: Re: UTF-8 error when converting Docx to Markdown
Date: Wed, 31 Dec 2014 07:05:10 -0500

Hi,

Farhan writes:

> Sorry to answer my own question, but a few hours of Googling got me the
> answer. You can use the tool unoconv to accomplish this task:
>
> unoconv --stdout -f html test.docx | pandoc -f html -t markdown -o test.md
>
> Hope this helps the next guy!

That still seems weird -- are you sure you're using a pandoc version that
actually supports reading docx? What's the output of `pandoc -v`?
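
If it helps, here is a rough sketch of how to check this from the shell. It rests on two assumptions that do not come from this thread: that the docx reader first shipped in pandoc 1.13, and that the first line of `pandoc -v` looks like `pandoc 1.13.2`. The helper name `supports_docx_reader` is made up for illustration.

```shell
# Hypothetical helper: succeeds if the given version string is >= 1.13,
# the release assumed here to have introduced the docx reader.
supports_docx_reader() {
  version="$1"
  major=${version%%.*}   # text before the first dot
  rest=${version#*.}     # text after the first dot
  minor=${rest%%.*}      # second version component
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 13 ]; }
}

# Only query pandoc if it is actually installed.
if command -v pandoc >/dev/null 2>&1; then
  # Assumes the first line of `pandoc -v` is "pandoc <version>".
  installed=$(pandoc -v | head -n 1 | awk '{print $2}')
  if supports_docx_reader "$installed"; then
    echo "pandoc $installed should be able to read docx directly, e.g.:"
    echo "  pandoc -f docx -t markdown test.docx -o test.md"
  else
    echo "pandoc $installed looks older than 1.13; upgrade, or keep the unoconv workaround"
  fi
fi
```

If the installed version passes the check, the direct `pandoc -f docx` conversion should make the detour through unoconv and HTML unnecessary.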