From: Jürgen Schulze <1manfactory-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Newsgroups: gmane.text.pandoc
Subject: Re: Hyphenation
Date: Mon, 21 Nov 2016 05:35:38 -0800 (PST)
To: pandoc-discuss

Hello Albert Krewinkel,
thanks for providing this solution.
However, I decided to go ahead with a PHP "compiler" to speed up a PHP hyphenation script instead.
Juergen

On Friday, 18 November 2016 at 20:20:51 UTC+1, Albert Krewinkel wrote:
Jürgen Schulze <1manf...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Hello, I would like to use the hyphenation abilities pandoc has when generating PDF
> documents for simple text/HTML files, with the hyphenation inserted
> as the entity "&shy;".
>
> Something like
>
> pandoc --smart --wrap=none text.txt -o text2.txt
>
> How can this be done?

One method would be to rely on the CSS `hyphens` property. Unfortunately,
that property is not supported by Chrome/Webkit, so an additional
polyfill library like [Hyphenator](https://github.com/mnater/Hyphenator)
would be required.

Alternatively, a pandoc filter can insert soft hyphens directly.
You'll need to install the Python libraries `panflute` and `pyphen`.
Put the following code into a file and call it as a pandoc filter.


    #!/usr/bin/env python3
    # Pandoc filter: insert soft hyphens into every text string.
    from panflute import Str, toJSONFilter
    import pyphen

    # Hyphenation dictionary; change `lang` for other languages.
    dic = pyphen.Pyphen(lang='en_US')

    def hyphenate(inline, doc):
        # Only plain strings are changed; other elements pass through.
        if isinstance(inline, Str):
            # '\u00ad' is the Unicode soft hyphen (U+00AD).
            hyphenated = dic.inserted(inline.text, hyphen='\u00ad')
            return Str(hyphenated)

    if __name__ == "__main__":
        toJSONFilter(hyphenate)


The above code uses the Unicode soft hyphen instead of the HTML entity,
which helps keep the file size low. Just change the `hyphen`
parameter if that's not what you want.
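
To make the size trade-off concrete, here is a small standard-library sketch. It does not use `pyphen`; the sample word and its break points are hypothetical and only stand in for text the filter has already processed. It converts the Unicode soft hyphen to the `&shy;` entity after the fact and compares the UTF-8 byte counts.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode soft hyphen, U+00AD
ENTITY = "&shy;"        # HTML entity for the same character


def to_entity(text: str) -> str:
    """Replace each Unicode soft hyphen with the HTML entity."""
    return text.replace(SOFT_HYPHEN, ENTITY)


# Hypothetical sample: a word with two soft hyphens already inserted.
sample = f"hy{SOFT_HYPHEN}phen{SOFT_HYPHEN}ation"
converted = to_entity(sample)

# In UTF-8 the soft hyphen takes 2 bytes, the entity takes 5.
raw_size = len(sample.encode("utf-8"))
entity_size = len(converted.encode("utf-8"))
print(converted, raw_size, entity_size)
```

Each break point costs three extra bytes with the entity, which adds up in long documents. Passing `hyphen='&shy;'` to `dic.inserted` in the filter should avoid the separate conversion step, though the output writer may then escape the ampersand, so that is worth checking for your target format.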

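For calling the script as a pandoc filter, the sketch below builds the command line in Python. The file name `hyphenate.py` is an assumption (use whatever name you saved the filter under); `text.txt` and `text2.txt` are taken from the original question. It only prints the command and leaves running it to you.

```python
import shlex


def pandoc_command(source, target, filter_path="./hyphenate.py"):
    """Build the argument list for running pandoc with one filter.

    `filter_path` is a hypothetical file name for the saved filter script.
    """
    return ["pandoc", "--filter", filter_path, source, "-o", target]


cmd = pandoc_command("text.txt", "text2.txt")

# Print the shell-quoted command; to execute it, pass `cmd` to
# subprocess.run(cmd, check=True) on a machine with pandoc installed.
print(shlex.join(cmd))
```
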
-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124
