Fw: Re: pandoc/citeproc issues: recognizing citations

public inbox archive for pandoc-discuss@googlegroups.com

From: John MacFarlane @ 2010-11-21 22:05 UTC
  To: pandoc-discuss@googlegroups.com

+++ John MacFarlane [Nov 21 10 11:13 ]:
> Suppose pandoc sees
> 
>     @foo
> 
> It cannot assume that this is a citation, since @foo could also
> be a reference to an example list item, as in the following:
> 
>     (@foo)  my list item
>     (@bar)  another list item
> 
>     The advantage (@foo) has over (@bar) is that...
> 
> (See http://johnmacfarlane.net/pandoc/README.html#numbered-examples.)
> 
> So currently, the parser determines whether '@foo' is a citation
> by looking up 'foo' in the list of citation identifiers defined
> in the bibliography file.  If 'foo' is found, '@foo' is treated
> as a citation.  Otherwise, it is left as it is (and at the end
> of the markdown reader, it will be transformed into a reference
> to the appropriate list item).
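
To make the current behavior concrete, here is a minimal sketch of the lookup in Python. It is not pandoc's actual (Haskell) code: the function name `classify_labels`, the label regex, and the key `smith2009` are all made up for illustration. It only shows that the same source text is classified differently depending on which keys the bibliography happens to contain.

```python
import re

# Hypothetical helper, not pandoc's real parser: classify each @label in a
# text by looking the label up in the set of bibliography keys.
AT_LABEL = re.compile(r"@([A-Za-z0-9_]+)")  # assumed label syntax

def classify_labels(text, bibliography_keys):
    """Return (label, kind) pairs, where kind is 'citation' or 'example'."""
    result = []
    for match in AT_LABEL.finditer(text):
        label = match.group(1)
        # A label found in the bibliography is treated as a citation;
        # anything else is left to become an example list reference.
        kind = "citation" if label in bibliography_keys else "example"
        result.append((label, kind))
    return result

text = "The advantage (@foo) has over (@bar) is that..."
print(classify_labels(text, {"smith2009"}))
print(classify_labels(text, {"smith2009", "foo"}))
```

With the first bibliography both labels come out as example references; once a key named `foo` is present, `@foo` comes out as a citation even though the document has not changed.
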
> 
> There are a few problems with this:
> 
> (1)  Users could accidentally use a label for an example list
> that corresponds to an item in the bibliography; this would cause
> the parser to treat the label as a citation, unexpectedly. Worse,
> the behavior could change if you added a new item to the bibliography
> or used a different bibliography, without any change in the source
> document.  It would seem much better if citations and example list
> labels each had their own unambiguous syntax.
> 
> (2)  Pandoc needs to read the whole bibliography before parsing.
> This means that we can't have a "default bibliography" in ~/.pandoc
> without slowing down *every* invocation of pandoc with a read + parse
> of the bibliography.  Maybe this is okay -- I don't think we need
> the default bibliography feature.
> 
> (3)  There's an awkward inconsistency in the way pandoc treats
> textual citations and bracketed citations.  It checks textual citations
> against the bibliography, but it doesn't do the same for bracketed ones.
> Why not?  Because the parser for a bracketed list of citations needs
> to return a single Cite inline.  If we require the individual citations
> to exist in the bibliography, then a single missing citation will
> cause the whole list to be parsed as regular text rather than as a
> citation. Again, maybe this is okay -- but a few people have already
> said that it seems weird to treat
> 
>     [@missing p. 3]
> 
> differently from
> 
>     @missing [p. 3] says...
> 
> Here are some possible solutions:
> 
> A.  I think the best solution, looking forward, would be to change
> the syntax for numbered example lists, using ! instead of @.
> Then there would be no possibility of conflict with citation keys,
> and we wouldn't have to look up keys in the bibliography database
> as we were parsing.  An example list would look like this:
> 
>     (!)  First example
>     (!foo)  Second example, labeled 'foo'.
>     (!bar)  Third example, labeled 'bar'.
> 
>     (!bar) follows from (!foo), because ...
> 
> This would solve all of the problems above.  However, it has a serious
> drawback:  it would break existing documents, something I have tried
> very hard to avoid doing in updates of pandoc.  I might consider it,
> because numbered example lists have only been in pandoc for a little while,
> and may not yet be in widespread use.
> 
> B.  An alternative would be to use a different symbol, say !, for
> citations, reserving @ for the existing example lists.  Thus:
> 
>     !item1 [p. 99] says that blah [see also !item2 p. 33-34; !item3].
> 
> The problem is that @ is very natural for citations, and looks much
> better in my view.  # would be another possibility:
> 
>     #item1 [p. 99] says that blah [see also #item2 p. 33-34; #item3].
> 
> However, # is much more likely to occur at the beginning of a word
> in normal writing, so I think I'd avoid this.  ~, *, _, ^, $, <, > should be
> avoided because they already have pandoc meanings.  & might work:
> 
>     &item1 [p. 99] says that blah [see also &item2 p. 33-34; &item3].
> 
> But it is less natural and tends to read as "and" -- also, you'd
> capture things like "&c."
> 
> Other possibilities would include + and =.
> 
>     +item1 [p. 99] says that blah [see also +item2 p. 33-34; +item3].
> 
>     =item1 [p. 99] says that blah [see also =item2 p. 33-34; =item3].
> 
> C.  Or, we could live with problems (1-2) and solve problem (3)
> by checking ALL citations, not just textual citations, to make
> sure they're in the bibliography.
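
The consequence of option C for bracketed lists can be sketched as follows. Again this is Python for illustration, not pandoc's parser: `parse_bracketed`, the key regex, and the tuple results are assumptions of the sketch. The point is the all-or-nothing behavior: since the bracketed list must yield a single Cite, one unknown key turns the whole thing back into plain text.

```python
import re

# Hypothetical sketch of option C, not pandoc's real parser: a bracketed list
# becomes a single citation node only if every key is in the bibliography.
KEY = re.compile(r"@([A-Za-z0-9_]+)")  # assumed citation key syntax

def parse_bracketed(bracket_text, bibliography_keys):
    """Return ('cite', keys) if all keys are known, else ('text', original)."""
    keys = KEY.findall(bracket_text)
    if keys and all(key in bibliography_keys for key in keys):
        return ("cite", keys)
    # A single missing key makes the whole bracketed list regular text.
    return ("text", bracket_text)

known = {"item2", "item3"}
print(parse_bracketed("[see also @item2 p. 33-34; @item3]", known))
print(parse_bracketed("[see also @item2 p. 33-34; @missing]", known))
```

The first call yields one citation with both keys; the second yields the untouched text, even though `@item2` on its own is a valid key.
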
> 
> Thoughts?

One more note:  if we don't check all citations for presence in the
bibliography file, we need to figure out how nonexistent citations
should be rendered. Currently citeproc seems to render them as
"Anon." (in English locales), which doesn't exactly signal "citation
not found".

John


