public inbox archive for pandoc-discuss@googlegroups.com
* Lua filter: Move punctuation after a following citation
@ 2020-04-23 14:08 Denis Maier
From: Denis Maier @ 2020-04-23 14:08 UTC (permalink / raw)
  To: pandoc-discuss



Hi,

I've written a Lua filter (see below) that moves punctuation from a 
quotation to after the citation that follows it.

So, if you have this:

```
"This is a citation." [@doe]
"This is a citation". [@doe]
```

the filter will produce:

```
"This is a citation" [@doe].
"This is a citation" [@doe].
```

The reason for this: pandoc already provides a similar mechanism. 
Suppose you have this:

```
"This is a citation" [@doe].
```

Now, with a note-based citation style, the result is:

```
"This is a citation." [@doe]

```

(Of course, the citation will be properly processed with pandoc-citeproc.)

That's fine in American English, but in many other conventions (e.g., 
German or British English) the punctuation goes inside the quotation 
marks only if it was already part of the quoted text. But that's 
impossible to know with the current syntax.

So, the idea is to enter quotations with punctuation in its original place, 
and move it after a following citation only if you're using a 
parenthetical citation style. So, in effect, it's the exact opposite of the 
current default.

Anyway, the filter is below and on GitHub 
(https://github.com/denismaier/pandoc-lua-move-punctuation-after-citations). 
My Lua knowledge is rather limited and this was mostly trial and error. So: 
suggestions for improvements or other comments are welcome.

Best,
Denis


```
local function ends_with_punctuation(str)
  return str:sub(-1) == '.'
    or str:sub(-1) == ','
    or str:sub(-1) == ';'
    or str:sub(-1) == '!'
    or str:sub(-1) == '?'
end

local function is_punct_last_in_quote (doublequote)
  return doublequote and doublequote.t == 'Quoted'
    and doublequote.quotetype == 'DoubleQuote'
    -- guard against an empty quotation, which has no last element
    and #doublequote.content > 0
    and doublequote.content[#doublequote.content].t == 'Str'
    and ends_with_punctuation(doublequote.content[#doublequote.content].text)
end

local function is_quote_space_before_normal_citation (doublequote, spc, cite)
  return doublequote and doublequote.t == 'Quoted'
    and doublequote.quotetype == 'DoubleQuote'
    and spc and spc.t == 'Space'
    and cite and cite.t == 'Cite'
    -- citationMode must be NormalCitation
    and cite.citations[1].mode == 'NormalCitation'
end

local function is_quote_punctuation_space_before_normal_citation (doublequote, punct, spc, cite)
  return doublequote and doublequote.t == 'Quoted'
    and doublequote.quotetype == 'DoubleQuote'
    and punct and punct.t == 'Str'
    and (punct.text == '.' or punct.text == ',' or punct.text == ';')
    and spc and spc.t == 'Space'
    and cite and cite.t == 'Cite'
    -- citationMode must be NormalCitation
    and cite.citations[1].mode == 'NormalCitation'
end

  
function Inlines (inlines)
  -- both loops go from end to start to avoid problems with shifting indices.

  -- doublequote punct space citation => doublequote space citation punct
  for i = #inlines-3, 1, -1 do
    if is_quote_punctuation_space_before_normal_citation(
         inlines[i], inlines[i+1], inlines[i+2], inlines[i+3]) then
      -- save current inline elements
      local doublequote = inlines[i]
      local punctuation = inlines[i+1]
      local space = inlines[i+2]
      local cite = inlines[i+3]
      -- swap inline element order
      inlines[i] = doublequote
      inlines[i+1] = space
      inlines[i+2] = cite
      inlines[i+3] = punctuation
    end
  end

  -- punct doublequote space citation => doublequote space citation punct
  for i = #inlines-2, 1, -1 do
    if is_punct_last_in_quote(inlines[i])
       and is_quote_space_before_normal_citation(inlines[i], inlines[i+1], inlines[i+2]) then
      -- get punctuation (declared local so it does not leak into the global scope)
      local punctuation = pandoc.Str(inlines[i].content[#inlines[i].content].text:sub(-1))
      -- remove punctuation from quotation
      inlines[i].content[#inlines[i].content].text = inlines[i].content[#inlines[i].content].text:sub(1,-2)
      -- reinsert punctuation after cite element
      inlines:insert(i+3, punctuation)
    end
  end
  return inlines
end
```

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/ea11a459-6dfe-4632-9d38-aee8ece66172%40googlegroups.com.


* Re: Lua filter: Move punctuation after a following citation
@ 2020-04-23 20:10   ` Albert Krewinkel
From: Albert Krewinkel @ 2020-04-23 20:10 UTC (permalink / raw)
  To: pandoc-discuss

Looks good, thanks for sharing here! All suggestions which I can give
are based on taste. The filter is fine the way it is.

Denis Maier writes:

> local function ends_with_punctuation(str)
>   return str:sub(-1) == '.'
>     or str:sub(-1) == ','
>     or str:sub(-1) == ';'
>     or str:sub(-1) == '!'
>     or str:sub(-1) == '?'
> end

One could also use pattern matching, like so:

    local punctuation_chars = '.,;!?'
    local function ends_with_punctuation_char(str)
        return str:match('[' .. punctuation_chars .. ']$')
    end

or the shorter but less-precise

    local function ends_with_punctuation_char(str)
        return str:match('%p$')
    end

See the "Patterns" section of the Lua manual for a good explanation of
this method: https://www.lua.org/manual/5.3/manual.html#6.4.1
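
To make the difference between the two variants concrete, here is a
small sketch in plain Lua (no pandoc needed). The function names are
only illustrative and not part of the filter:

```lua
-- Variant 1: explicit set of accepted characters.
local punctuation_chars = '.,;!?'

local function ends_with_listed_punctuation(str)
  -- Inside [...] these characters are matched literally; $ anchors the end.
  return str:match('[' .. punctuation_chars .. ']$')
end

-- Variant 2: the %p class, which matches any punctuation character.
local function ends_with_any_punctuation(str)
  return str:match('%p$')
end

print(ends_with_listed_punctuation('citation.'))  -- .
print(ends_with_listed_punctuation('citation)'))  -- nil
print(ends_with_any_punctuation('citation)'))     -- )
```

So `%p` also accepts closing parentheses, quotes and the like, which is
why it is the less precise of the two.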

> local function is_punct_last_in_quote (doublequote)
>   return doublequote and doublequote.t == 'Quoted' and
> doublequote.quotetype == 'DoubleQuote'
>     and doublequote.content[#doublequote.content].t == 'Str'
>     and
> ends_with_punctuation(doublequote.content[#doublequote.content].text)
> end

It might be slightly cleaner to split this function in two:

    function is_doublequote (el)
      return el and el.t == 'Quoted' and el.quotetype == 'DoubleQuote'
    end

    -- this returns nil or false if the inlines do not end in a
    -- punctuation char, so it can be used in a boolean context as well
    -- as to get the actual punctuation char.
    function final_punctuation_char (inlines)
      local last = inlines[#inlines]
      return last and last.t == 'Str' and last.text:match('%p$')
    end
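
As a rough illustration of using the return value both as a test and as
the character itself, here is a sketch that runs in plain Lua. The
element tables are hand-written stand-ins for pandoc's inline elements
(an assumption made only so that this runs without pandoc):

```lua
local function final_punctuation_char(inlines)
  local last = inlines[#inlines]
  return last and last.t == 'Str' and last.text:match('%p$')
end

-- Stand-in for the content of a Quoted element.
local content = {
  {t = 'Str', text = 'This'},
  {t = 'Space'},
  {t = 'Str', text = 'works.'},
}

local punct = final_punctuation_char(content)
if punct then
  -- Strip the final character from the last Str element.
  local last = content[#content]
  last.text = last.text:sub(1, -2)
end

print(punct)                   -- .
print(content[#content].text)  -- works
```

In the filter itself, `punct` would then be wrapped in `pandoc.Str` and
inserted after the citation.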

> local function is_quote_space_before_normal_citation (doublequote, spc,
> cite)
>   return
>   doublequote and doublequote.t == 'Quoted' and doublequote.quotetype ==
> 'DoubleQuote'

The `is_doublequote` function from above could be reused here.

>   and spc and spc.t == 'Space'
>   and cite and cite.t == 'Cite'
>   -- citationMode must be NormalCitation
>   and cite.citations[1].mode == 'NormalCitation'
> end
>
> local function is_quote_punctuation_space_before_normal_citation
> (doublequote, punct, spc, cite)
>     return
>       doublequote and doublequote.t == 'Quoted' and doublequote.quotetype
> == 'DoubleQuote'
>       and punct and punct.t == 'Str' and (punct.text == '.'  or punct.text
> == ',' or punct.text == ';')

Would exclamation and question marks also be acceptable punctuation
here? If so, could we use
`punct.text:match('^[' .. punctuation_chars .. ']$')`?
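
For what it's worth, a quick plain-Lua check of how that anchored
pattern behaves. The helper name `is_single_punctuation` is made up for
this sketch:

```lua
local punctuation_chars = '.,;!?'

-- True only if the text is exactly one of the listed characters:
-- ^ and $ anchor the match to the whole string.
local function is_single_punctuation(text)
  return text:match('^[' .. punctuation_chars .. ']$') ~= nil
end

print(is_single_punctuation('.'))     -- true
print(is_single_punctuation('!'))     -- true
print(is_single_punctuation('e.g.'))  -- false
print(is_single_punctuation('..'))    -- false
```

Because of the anchors, a `Str` such as `e.g.` that merely ends in a
period is not treated as a punctuation token.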

>       and spc and spc.t == 'Space'
>       and cite and cite.t == 'Cite'
>       -- citationMode must be NormalCitation
>       and cite.citations[1].mode == 'NormalCitation'
> end

Factoring out an `is_normal_citation` function could help to clean
things up.
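
Something along these lines, perhaps. This is only a sketch: `is_space`
is an extra helper I'm assuming for symmetry, and the element tables at
the end are hand-written stand-ins so that it runs in plain Lua without
pandoc:

```lua
local function is_doublequote(el)
  return el and el.t == 'Quoted' and el.quotetype == 'DoubleQuote'
end

local function is_space(el)
  return el and el.t == 'Space'
end

local function is_normal_citation(el)
  -- citationMode of the first citation must be NormalCitation
  return el and el.t == 'Cite'
    and el.citations and el.citations[1]
    and el.citations[1].mode == 'NormalCitation'
end

local function is_quote_space_before_normal_citation(quote, spc, cite)
  return is_doublequote(quote) and is_space(spc) and is_normal_citation(cite)
end

-- Stand-in elements for a quick check.
local quote = {t = 'Quoted', quotetype = 'DoubleQuote', content = {}}
local space = {t = 'Space'}
local normal = {t = 'Cite', citations = {{mode = 'NormalCitation'}}}
local in_text = {t = 'Cite', citations = {{mode = 'AuthorInText'}}}

print(is_quote_space_before_normal_citation(quote, space, normal) and true or false)   -- true
print(is_quote_space_before_normal_citation(quote, space, in_text) and true or false)  -- false
```

The four-argument variant would then only need one additional check for
the punctuation element.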

> function Inlines (inlines)
>   -- both loops go from end to start to avoid problems with shifting
> indices.
>
>   -- doublequote punct space citation => doublequote space citation punct
>   for i = #inlines-3, 1, -1 do
>       if is_quote_punctuation_space_before_normal_citation(inlines[i],
> inlines[i+1], inlines[i+2], inlines[i+3]) then
>         -- save current inline elements
>         local doublequote = inlines[i]
>         local punctuation = inlines[i+1]
>         local space = inlines[i+2]
>         local cite = inlines[i+3]
>         -- swap inline element order
>         inlines[i] = doublequote
>         inlines[i+1] = space
>         inlines[i+2] = cite
>         inlines[i+3] = punctuation
>       end
>     end
>
>   -- punct doublequote space citation => doublequote space citation punct
>   for i = #inlines-2, 1, -1 do
>     if is_punct_last_in_quote(inlines[i]) and
> is_quote_space_before_normal_citation (inlines[i], inlines[i+1],
> inlines[i+2]) then
>       -- get punctuation
>       punctuation =
> pandoc.Str(inlines[i].content[#inlines[i].content].text:sub(-1))
>       -- remove punctuation from quotation
>       inlines[i].content[#inlines[i].content].text =
> inlines[i].content[#inlines[i].content].text:sub(1,-2)
>       -- reinsert punctuation after cite element
>       inlines:insert(i+3, punctuation)
>     end
>   end
>   return inlines
> end

Cheers!

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124



* Re: Lua filter: Move punctuation after a following citation
@ 2020-04-28 14:33       ` Denis Maier
From: Denis Maier @ 2020-04-28 14:33 UTC (permalink / raw)
  To: pandoc-discuss



Thanks for your feedback. That's much appreciated. I'll need to look into 
the details, but I guess some things can be simplified.
Best,
Denis


On Thursday, 23 April 2020 22:10:17 UTC+2, Albert Krewinkel wrote:
>
> Looks good, thanks for sharing here! All suggestions which I can give 
> are based on taste. The filter is fine the way it is. 
>
 

> [ ...]

-- 
> Albert Krewinkel 
> GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124 
>

