From: Vin Cent
Newsgroups: gmane.text.pandoc
Subject: Re: epub : footnote backlink character not rendered by Kobo fonts
Date: Wed, 22 Sep 2021 13:08:48 -0700 (PDT)
To: pandoc-discuss

    Thanks. Shell script written; works like a charm.
    If that counts, I vote for the #3149 change/configure character approach :)

    If that can ease anybody's life, here is my shell script (more than 3 lines, but should be safe and leave no trace in case of failure):


    #!/bin/sh

    FN="ebook.epub"
    ZIPNAME="ebook.zip"
    CWD=`pwd`

    if [ ! -f "$FN" ]; then
      exit 1
    fi

    # Create the scratch directory first, then check that it really exists.
    MYDIR=`mktemp -d`
    if [ ! -d "$MYDIR" ]; then
      exit 2
    fi

    # Return to the start directory, remove the scratch directory, and exit
    # with the status given as first argument.
    cleanup()
    {
      cd "$CWD"
      rm -rf "$MYDIR"
      exit "$1"
    }

    # Work on a copy, so that a failure leaves the original epub untouched.
    cp "$FN" "${MYDIR}/${ZIPNAME}" || cleanup 3
    cd "$MYDIR" || cleanup 4
    unzip "$ZIPNAME" || cleanup 5
    rm -f "$ZIPNAME" || cleanup 6
    cd EPUB/text || cleanup 7
    # Replace the backlink arrow (U+21A9 U+FE0E) with a double dagger (U+2021).
    # Note: -i without an argument is GNU sed syntax.
    sed -i -e 's/>↩︎<\/a>/>‡<\/a>/g' *html || cleanup 8
    cd ../.. || cleanup 10
    zip -8 -r "${CWD}/${ZIPNAME}" * || cleanup 11
    cd "$CWD" || cleanup 12
    mv "$ZIPNAME" "$FN" || cleanup 13
    rm -rf "$MYDIR"
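
For anyone adapting the script, the substitution can be tried on a sample string before it touches a real epub. What follows is only a minimal sketch, not part of the script above: `replace_backlink` is a hypothetical helper name, and it assumes the backlink is the two-code-point sequence U+21A9 U+FE0E reported in this thread (a bare U+21A9 is handled too, in case a file carries the arrow without the variation selector). The characters are built from their UTF-8 bytes with printf, so nothing depends on how this message gets re-encoded in transit.

```shell
#!/bin/sh
# Build the characters from their UTF-8 bytes (octal escapes).
ARROW=$(printf '\342\206\251')    # U+21A9 leftwards arrow with hook
VS15=$(printf '\357\270\216')     # U+FE0E variation selector-15 (text style)
DAGGER=$(printf '\342\200\241')   # U+2021 double dagger

# Hypothetical helper: read XHTML on stdin and write it to stdout with the
# backlink arrow replaced by a double dagger. The first expression matches
# the arrow followed by the variation selector, the second the bare arrow.
replace_backlink() {
  sed -e "s/>${ARROW}${VS15}<\/a>/>${DAGGER}<\/a>/g" \
      -e "s/>${ARROW}<\/a>/>${DAGGER}<\/a>/g"
}
```

For example, `replace_backlink < ch001.xhtml > ch001.new` (file names are hypothetical) rewrites one chapter without relying on `sed -i`, which behaves differently in GNU and BSD sed.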

    On Wednesday, September 22, 2021 at 6:40:14 PM UTC+2 John MacFarlane wrote:

    A custom writer is probably overkill. It would be easier to write
    a small script that unzips the epub, does search and replace
    on the xhtml files, and then zips it back up again. This could
    be 3 lines of shell script.

    That said, this is a long-standing issue and we might consider
    using a different character or making it configurable:

    https://github.com/jgm/pandoc/issues/3149


    William Lupton <wlu...@broadband-forum.org> writes:

    > This probably won't be the answer that you wanted, but you could use a
    > custom writer, e.g., based on the provided sample.lua. See
    > https://pandoc.org/MANUAL.html#custom-writers.
    >
    > Here's the relevant code (this isn't all the code relating to footnotes,
    > but it's the bit that has the special character!):
    >
    > function Note(s)
    >   local num = #notes + 1
    >   -- insert the back reference right before the final closing tag.
    >   s = string.gsub(
    >     s, '(.*)</', '%1 <a href="#fnref' .. num .. '">&#8617;</a></')
    >   -- add a list item with the note to the note table.
    >   table.insert(notes, '<li id="fn' .. num .. '">' .. s .. '</li>')
    >   -- return the footnote reference, linked to the note.
    >   return '<a id="fnref' .. num .. '" href="#fn' .. num ..
    >     '"><sup>' .. num .. '</sup></a>'
    > end
    >
    > On Tue, 21 Sept 2021 at 13:23, Vin Cent <irak...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
    >
    >> OK, after some more reading, I'm still unsure how to proceed, and I
    >> actually don't see how I could do that with a filter.
    >> I have dumped pandoc's native output format, and the character I need to
    >> replace is not part of it. I guess it is added when pandoc
    >> writes the output document.
    >> I don't know how I can replace that character. Hopefully there is a
    >> pandoc setting I can tune so that I won't need to replace it at some late
    >> stage.
    >>
    >>
    >>
    >> On Tuesday, September 21, 2021 at 10:42:14 AM UTC+2 Vin Cent wrote:
    >>
    >>> Sorry for answering myself. I have RTFMed a bit in the meantime.
    >>> I now suspect this can simply be done with a Lua filter.
    >>> I will try to implement it as an exercise and report the result here.
    >>>
    >>> On Tuesday, September 21, 2021 at 9:34:26 AM UTC+2 Vin Cent wrote:
    >>>
    >>>> Hi,
    >>>>
    >>>> I am generating EPUB 3 from LaTeX source.
    >>>> I was wondering why footnotes show a backlink to the original text when
    >>>> I display the document in calibre, but not when I display the
    >>>> document on a Kobo reader...
    >>>>
    >>>> ... until I found that the backlink is in fact present. It is "just" not
    >>>> displayed by the Kobo because the character chosen (by pandoc, I think) has no
    >>>> glyph in its fonts.
    >>>>
    >>>> The backlink character, copied and pasted from the .epub document
    >>>> itself, seems to be "↩︎".
    >>>>
    >>>> I see two possible tricks to make the link appear on the Kobo:
    >>>> 1. Change the character to another one that the Kobo does render.
    >>>> 2. Change the default font of the document to one that has a glyph for
    >>>> that character.
    >>>>
    >>>> I have tried all the fonts available on my device. They actually fall into three
    >>>> classes.
    >>>> (I list them all below, as this might be of interest to somebody in the
    >>>> future.)
    >>>>
    >>>> Avenir Next; Georgia; Kobo Nickel: these fonts render absolutely
    >>>> nothing for that character, leading the reader to believe the backlink does
    >>>> not exist. It is present and clickable, just not rendered.
    >>>>
    >>>> Amasis; Caecilia; Gill Sans; Malabar; OpenDyslexic: these fonts
    >>>> render "__" for that character. I find it "better" than the first family,
    >>>> but still not great. It is not obvious to a non-technical reader that this
    >>>> is a backlink to the text.
    >>>>
    >>>> AR UDJingxihei; Kobo UD Kakugo; Kobo Tsukishi Mincho: these display
    >>>> an East Asian character (Chinese or Japanese, I think, depending on the font)
    >>>> for the backlink character.
    >>>>
    >>>>
    >>>> Therefore, I tend to favor the first solution. Do you guys know of a way
    >>>> to customize the backlink character "↩︎" set by pandoc in EPUB output?
    >>>> Or is there a third approach?
    >>>>
    >>>> Thanks,
    >>>>
    >>>> Vincent
    >>>>
    >>> --
    >> You received this message because you are subscribed to the Google Groups
    >> "pandoc-discuss" group.
    >> To unsubscribe from this group and stop receiving emails from it, send an
    >> email to pandoc-discus...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
    >> To view this discussion on the web visit
    >> https://groups.google.com/d/msgid/pandoc-discuss/68257f93-0fd1-46a1-9e99-46d6045dc4b9n%40googlegroups.com.
    >>
    >

    --
    You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
    To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/e216efa7-009b-469a-add1-ae8c93d0ffc3n%40googlegroups.com.