From: Kolen Cheung
Newsgroups: gmane.text.pandoc
Subject: Re: Writing custom filter in python to remove non-breaking spaces
Date: Fri, 4 Aug 2017 15:09:34 -0700 (PDT)
Message-ID: <4abb2571-34b8-4e49-a189-05632083aab9@googlegroups.com>
References: <3be5ee09-90dc-41ad-a368-9298b965dfaa@googlegroups.com>
To: pandoc-discuss

But I still want to reiterate my original point: one wants to remove or replace non-breaking spaces only if the original source is problematic. I recall that when I processed some doc files, Word seemed to be trying to be too smart, and there were many non-breaking spaces that shouldn't have been there (if my memory serves me well).

Otherwise, replacing a non-breaking space with a regular space is basically wrong typographically (another example is "1-2" vs "1–2"). In other words, you're destroying the information the original author has carefully put in there.
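
To make the point concrete, here is a minimal sketch in plain Python of what "replace only when the source is problematic" could look like. The names `replace_nbsp`, `count_nbsp` and the `enabled` flag are made up for illustration; this is not wired into pandoc's filter interface, it only shows the string-level operation and why it should be opt-in.

```python
NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

def count_nbsp(text):
    """Count non-breaking spaces, e.g. to judge whether a source looks suspicious."""
    return text.count(NBSP)

def replace_nbsp(text, enabled=True):
    """Replace U+00A0 with a plain space, but only when explicitly enabled.

    Hypothetical helper: 'enabled' stands in for whatever decision the user
    makes about the source (e.g. a Word export with stray non-breaking spaces).
    """
    return text.replace(NBSP, " ") if enabled else text

# An intentional non-breaking space keeps "Fig." and "1" on the same line.
sample = "Fig." + NBSP + "1"
cleaned = replace_nbsp(sample)                    # the author's intent is lost here
untouched = replace_nbsp(sample, enabled=False)   # the intent is preserved

# The same applies to dashes: a hyphen-minus and an en dash are different characters.
hyphen_range = "1-2"
en_dash_range = "1\u20132"
```

Checking `count_nbsp` on a document before deciding may help tell a deliberate non-breaking space from a pile of them left behind by a converter, though that judgment is still up to the user.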