From: Melroch
Newsgroups: gmane.text.pandoc
Subject: Re: Writing custom filter in python to remove non-breaking spaces
Date: Sat, 5 Aug 2017 10:45:54 +0200
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org

With some fonts LibreOffice renders nbspaces way too narrow or too wide, and
maybe Word does the same, to the extent that they use the same algorithms.
It is arguably a bug to let the width of the \xa0 character in the font
determine how an nbspace is rendered, but there you have it. The best
solution is often to double all nbspaces, as too-wide spaces may look better
than too-narrow ones.

On 5 Aug 2017 at 00:10, "Kolen Cheung" wrote:

> But I still want to reiterate my original point: only if the original
> source is problematic does one want to remove/replace the non-breaking
> space. I recall that when I processed some doc files, Word seemed to have
> been trying to be too smart: there were many non-breaking spaces that
> shouldn't be there (if my memory serves me well).
>
> Otherwise, replacing a non-breaking space with a space is basically wrong
> in typography (another example is "1-2" vs "1–2"). In other words, you're
> destroying the information the original author has carefully put in there.
>
> --
> You received this message because you are subscribed to the Google Groups
> "pandoc-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/4abb2571-34b8-4e49-a189-05632083aab9%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CADAJKhCedoDRvxGke5Ymnn8Ompj28EMf1ufiAG1JsDzL5B1y8w%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
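
For what it's worth, the "double all nbspaces" idea from the message above could be sketched roughly like this in Python. This is only an illustration, not a tested filter from the thread: it assumes pandoc's JSON AST carries non-breaking spaces as U+00A0 characters inside "Str" elements, and the names double_nbsp and run_filter are made up for the example. It uses only the standard library rather than a filter helper package.

```python
import json

NBSP = "\u00a0"  # the \xa0 character discussed above


def double_nbsp(node):
    """Return a copy of a pandoc JSON AST with every NBSP in Str elements doubled.

    Assumption: text runs are dicts of the form {"t": "Str", "c": "..."}.
    Other elements (Code, CodeBlock, ...) are copied without changes to
    their text, because their payload is not a Str dict.
    """
    if isinstance(node, list):
        return [double_nbsp(item) for item in node]
    if isinstance(node, dict):
        if node.get("t") == "Str" and isinstance(node.get("c"), str):
            return {"t": "Str", "c": node["c"].replace(NBSP, NBSP * 2)}
        return {key: double_nbsp(value) for key, value in node.items()}
    return node  # numbers, plain strings, booleans, None


def run_filter(instream, outstream):
    """Read a JSON document from instream, transform it, write it to outstream."""
    json.dump(double_nbsp(json.load(instream)), outstream)


# Small in-memory demonstration on a hand-written AST fragment.
sample = {"t": "Para", "c": [{"t": "Str", "c": "10" + NBSP + "km"}]}
print(json.dumps(double_nbsp(sample)))
```

To try it as a real filter, one would call run_filter(sys.stdin, sys.stdout) at the end of the script (after importing sys) and pass the script to pandoc with --filter. Note that the transformation is not idempotent: running it twice quadruples the spaces.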