public inbox archive for pandoc-discuss@googlegroups.com
 help / color / mirror / Atom feed
* citeproc updates
@ 2010-10-27 13:20 Andrea Rossato
       [not found] ` <20101027132010.GB6998-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-10-27 13:20 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Hi,

I just wanted to briefly report on the recent citeproc-hs updates.

As the ones of you interested in the project remember, last spring I
announced I was working on updating my CSL implementation to support
CSL-1.0 and ot add a new extended API to allow a richer citation
markup for pandoc.

I was not able to finish quickly but I kept slowly working on it and
now I've just pushed a few patches that bring us closer to the goal:

  * the new pandoc API is ready and I've also some code for the pandoc
    side to implement citation prefixes, and modifiers to print the
    author only or to suppress it in the citation. Some other
    modifiers could be included (something like a \nocite would be
    very useful for me);

  * CSL-1.0 implementation is not complete (319 out of 498 tests of
    the standard test-suite are passed, but some of the failures are
    due to the fact that only a limited portion of the test-suite
    data-set is implemented), but the core features are there.
    Important missing features are the advanced sorting and the new
    collapsing options, together with rich text formatting.


I think citeproc-hs should now be quite usable now - probably already
better then the 0.2 version.

For the brave ones, here you can find the pandoc patch to be applied
to the git tree, together with some data to test it:

http://gorgias.mine.nu/citeproc/

Obviously you need the darcs version of citeproc-hs

 darcs get http://code.haskell.org/citeproc-hs

The proposed markdown markup is ... a proposal.

If you have the time to give it a try and report back your impressions
I'd be delighted.

Cheers,
Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found] ` <20101027132010.GB6998-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-10-28  1:36   ` John MacFarlane
       [not found]     ` <20101028013616.GA13075-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-11-03 23:39   ` Nathan Gass
  1 sibling, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-10-28  1:36 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Oct 27 10 15:20 ]:
> Hi,
> 
> I just wanted to briefly report on the recent citeproc-hs updates.
> 
> As the ones of you interested in the project remember, last spring I
> announced I was working on updating my CSL implementation to support
> CSL-1.0 and ot add a new extended API to allow a richer citation
> markup for pandoc.
> 
> I was not able to finish quickly but I kept slowly working on it and
> now I've just pushed a few patches that bring us closer to the goal:
> 
>   * the new pandoc API is ready and I've also some code for the pandoc
>     side to implement citation prefixes, and modifiers to print the
>     author only or to suppress it in the citation. Some other
>     modifiers could be included (something like a \nocite would be
>     very useful for me);
> 
>   * CSL-1.0 implementation is not complete (319 out of 498 tests of
>     the standard test-suite are passed, but some of the failures are
>     due to the fact that only a limited portion of the test-suite
>     data-set is implemented), but the core features are there.
>     Important missing features are the advanced sorting and the new
>     collapsing options, together with rich text formatting.
> 
> 
> I think citeproc-hs should now be quite usable now - probably already
> better then the 0.2 version.
> 
> For the brave ones, here you can find the pandoc patch to be applied
> to the git tree, together with some data to test it:
> 
> http://gorgias.mine.nu/citeproc/
> 
> Obviously you need the darcs version of citeproc-hs
> 
>  darcs get http://code.haskell.org/citeproc-hs
> 
> The proposed markdown markup is ... a proposal.
> 
> If you have the time to give it a try and report back your impressions
> I'd be delighted.

Andrea,

This is great news!  I was able to compile the code without difficulty,
and it worked on your test case. To make it easier for others, I added a
'citeproc' branch on github that includes your patches to the pandoc code
base.

One question: How do you get it to include a list of works cited
(bibliography)?  If I recall, it used to automatically include this
at the end of the document, but it doesn't seem to do that any more,
at least with the style.csl in your test directory.

As for the markdown syntax:  Nathan Gass and I hashed out a couple of
proposals on http://gitit.net/Pandoc%20Citations.  If you want to
read it over and add your comments, that will be helpful in deciding
on a final syntax.  You might add a comment about wanting a "nocite"
variant, for example -- and anything else that would be useful.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]     ` <20101028013616.GA13075-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-10-28 19:56       ` BP Jonsson
       [not found]         ` <4CC9D55B.8090004-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2010-10-28 20:26       ` Re: citeproc updates Andrea Rossato
  1 sibling, 1 reply; 107+ messages in thread
From: BP Jonsson @ 2010-10-28 19:56 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

2010-10-28 03:36, John MacFarlane skrev:
> As for the markdown syntax:  Nathan Gass and I hashed out a couple of
> proposals onhttp://gitit.net/Pandoc%20Citations.  If you want to
> read it over and add your comments, that will be helpful in deciding
> on a final syntax.  You might add a comment about wanting a "nocite"
> variant, for example -- and anything else that would be useful.
>

The @'s in the suggested syntax are really iffy.  They make
the thing look nothing like a 'natural' citation.  I'd prefer
using a colon to separate the key from the locator.  It's not 
quite natural, but better.  The locator should go inside the
bracket, since abbreviations like 'p.' are language-specific
and cannot be relied on.  I like the idea of distinguishing
between an escaped and unescaped ( to identify a (non)prefix.
Surely it could be combined with the moderate proposal?

    (see, for example, [doe99: 34-55, 67-89]; also [smith07: chap. 
6]; jones49)

becomes

    (see, for example,  Doe 1999, 34-55, 67-89; also Smith 2007, 
chap. 6; Jones 1949)

while

    \(see, for example, [doe99: 34-55, 67-89])

becomes

    (see, for example, Doe (1999), 34-55, 67-89)

Also, what about a period rather than a hyphen for the
omit-author 'pretag'?  It looks less like if someone by
accident deleted the firs part of a hyphenated name:

    As pointed out by Smith [.smith07]

becomes

     As pointed out by Smith (2007)

----

If there must be standard markers for "page, chapter,
section" etc. then I suggest, since there will probably too
many of them to co-opt punctuation characters, they be of
the form period + lowercase letter + period and always
surrounded by whitespace or parentheses, and with doubled
letters for plural. While they would still be based on a
specific language they would then look more like markup and
less (just) like random English words embedded in a text in
another language. (I'm tempted to suggest picking the
letters from the Latin terms, but that would be the same as
for English for almost all terms, since most of the English
terms are derived from the Latin ones).

Hopefully one could then instruct pandoc/citeproc what
language's terms to expand those sigils into, rather than
relying on the OS locale.  I write more formal stuff in
English than in Swedish, and the only thing which really
bugs me in Zotero is that you have to muck around in
Firefox's configs and then quit/reopen FF to change the
'locale' Zotero uses.  Needless to say there are millions of
people who have an OS in their native language but regularly
or more often than not write in some other language (usually
English...)

Soo for example:

volume(s)       .v.     (.vv.)
book(s)         .b.     (.bb.)
part(s)         .d.     (.dd.)      Mnemonic: "divisio(n)"
chapter(s)      .c.     (.cc.)
section(s)      .s.     (.ss.)
page(s)         .p.     (.pp.)
issue(s)        .i.     (.ii.)


/bpj


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]     ` <20101028013616.GA13075-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-10-28 19:56       ` BP Jonsson
@ 2010-10-28 20:26       ` Andrea Rossato
       [not found]         ` <20101028202646.GB6256-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-10-28 20:26 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Wed, Oct 27, 2010 at 06:36:18PM -0700, John MacFarlane wrote:
> One question: How do you get it to include a list of works cited
> (bibliography)?  If I recall, it used to automatically include this
> at the end of the document, but it doesn't seem to do that any more,
> at least with the style.csl in your test directory.

The lack of a bibliography is due to the fact that the style does not
include it, indeed. If you try the styles which come with the
test-suite you'll see that the bibliography is added, as it used to,
at the end of the pandoc document:

http://bitbucket.org/bdarcus/citeproc-test/src/tip/styles/

we agreed that, in order to have a "Reference" section indexed, the
text should end with something like:

# References

You should grab the latest patches from the darcs tree, though, since
there were some regressions (some testing beyond the test-suite is
needed!) in the implementation of the bibliography.

> As for the markdown syntax:  Nathan Gass and I hashed out a couple of
> proposals on http://gitit.net/Pandoc%20Citations.  If you want to
> read it over and add your comments, that will be helpful in deciding
> on a final syntax.  You might add a comment about wanting a "nocite"
> variant, for example -- and anything else that would be useful.

I'll try to be helpful but I have little confidence with things like
these. I think that the local modifiers should, at least, provide a
way to:

 - suppress the author (in a author-date style, for instance), and
   this is implemented;

 - print the author only (implemented, the syntax is '+@doe1999');

 - suppress parenthesis(?);

 - prevent the citation to show up inn the reference list
   (bibliography).

Maybe Bruce is reading and could comment on this, maybe with respect
to the features a CSL implementation is supposed to provide.

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]         ` <4CC9D55B.8090004-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2010-10-28 20:33           ` John MacFarlane
       [not found]             ` <20101028203320.GA1581-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-10-28 20:33 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ BP Jonsson [Oct 28 10 21:56 ]:
> 2010-10-28 03:36, John MacFarlane skrev:
> >As for the markdown syntax:  Nathan Gass and I hashed out a couple of
> >proposals onhttp://gitit.net/Pandoc%20Citations.  If you want to
> >read it over and add your comments, that will be helpful in deciding
> >on a final syntax.  You might add a comment about wanting a "nocite"
> >variant, for example -- and anything else that would be useful.
> >
> 
> The @'s in the suggested syntax are really iffy.  They make
> the thing look nothing like a 'natural' citation.  I'd prefer
> using a colon to separate the key from the locator.

The problem is that a lot of people use colons inside the bibliographic
keys themselves.  Though maybe it would be safe to use a colon followed
by whitespace as a separator.

> It's not quite
> natural, but better.  The locator should go inside the
> bracket, since abbreviations like 'p.' are language-specific
> and cannot be relied on.

None of the proposals relied on anything language specific.

> I like the idea of distinguishing
> between an escaped and unescaped ( to identify a (non)prefix.
> Surely it could be combined with the moderate proposal?
> 
>    (see, for example, [doe99: 34-55, 67-89]; also [smith07: chap.
> 6]; jones49)
> 
> becomes
> 
>    (see, for example,  Doe 1999, 34-55, 67-89; also Smith 2007,
> chap. 6; Jones 1949)
> 
> while
> 
>    \(see, for example, [doe99: 34-55, 67-89])
> 
> becomes
> 
>    (see, for example, Doe (1999), 34-55, 67-89)

Yes, maybe that's right. The idea would be to store information in
state about whether you're already inside parentheses -- escaping
the ( could turn this off.  And then somehow this information would
be passed to citeproc to activate a parenthesized or non-parenthesized
variant.  I'm not sure whether citeproc has support for that yet.

> Also, what about a period rather than a hyphen for the
> omit-author 'pretag'?  It looks less like if someone by
> accident deleted the firs part of a hyphenated name:
> 
>    As pointed out by Smith [.smith07]
> 
> becomes
> 
>     As pointed out by Smith (2007)

I still like the '-', partly because it's easy to remember -
you're subtracting something.

> If there must be standard markers for "page, chapter,
> section" etc. then I suggest, since there will probably too
> many of them to co-opt punctuation characters, they be of
> the form period + lowercase letter + period and always
> surrounded by whitespace or parentheses, and with doubled
> letters for plural. While they would still be based on a
> specific language they would then look more like markup and
> less (just) like random English words embedded in a text in
> another language. (I'm tempted to suggest picking the
> letters from the Latin terms, but that would be the same as
> for English for almost all terms, since most of the English
> terms are derived from the Latin ones).

Again, none of the proposals currently involve parsing these sorts of markers.

> Hopefully one could then instruct pandoc/citeproc what
> language's terms to expand those sigils into, rather than
> relying on the OS locale.  I write more formal stuff in
> English than in Swedish, and the only thing which really
> bugs me in Zotero is that you have to muck around in
> Firefox's configs and then quit/reopen FF to change the
> 'locale' Zotero uses.  Needless to say there are millions of
> people who have an OS in their native language but regularly
> or more often than not write in some other language (usually
> English...)
> 
> Soo for example:
> 
> volume(s)       .v.     (.vv.)
> book(s)         .b.     (.bb.)
> part(s)         .d.     (.dd.)      Mnemonic: "divisio(n)"
> chapter(s)      .c.     (.cc.)
> section(s)      .s.     (.ss.)
> page(s)         .p.     (.pp.)
> issue(s)        .i.     (.ii.)
> 
> 
> /bpj

The current proposals involve an unstructured locator -- as in
bibtex.  We don't try to parse it out into sections, pages, etc.
That would be nice, in principle, but a bit too complex for now.

You might want to briefly summarize your comments on the wiki
page, so we don't lose track.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]         ` <20101028202646.GB6256-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-10-29 20:27           ` John MacFarlane
       [not found]             ` <20101029202716.GA26844-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-10-30  2:54           ` Bruce
  2010-10-30 11:17           ` Nathan Gass
  2 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-10-29 20:27 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 1292 bytes --]

+++ Andrea Rossato [Oct 28 10 22:26 ]:
> On Wed, Oct 27, 2010 at 06:36:18PM -0700, John MacFarlane wrote:
> > One question: How do you get it to include a list of works cited
> > (bibliography)?  If I recall, it used to automatically include this
> > at the end of the document, but it doesn't seem to do that any more,
> > at least with the style.csl in your test directory.
> 
> The lack of a bibliography is due to the fact that the style does not
> include it, indeed. If you try the styles which come with the
> test-suite you'll see that the bibliography is added, as it used to,
> at the end of the pandoc document:
> 
> http://bitbucket.org/bdarcus/citeproc-test/src/tip/styles/

I tried it with the chicago-author-date.csl style, and a few others,
and noticed that the suppress-author citations all have an extra space
before the data.  (Example attached.)

John

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.


[-- Attachment #2: test.html --]
[-- Type: text/html, Size: 1802 bytes --]

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]         ` <20101028202646.GB6256-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2010-10-29 20:27           ` John MacFarlane
@ 2010-10-30  2:54           ` Bruce
       [not found]             ` <507b277f-218d-494f-88ae-67c84c9e5cec-TDjcMQyrprHatUqWdiOD/mB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
  2010-10-30 11:17           ` Nathan Gass
  2 siblings, 1 reply; 107+ messages in thread
From: Bruce @ 2010-10-30  2:54 UTC (permalink / raw)
  To: pandoc-discuss



On Oct 28, 4:26 pm, Andrea Rossato <andrea.ross...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Wed, Oct 27, 2010 at 06:36:18PM -0700, John MacFarlane wrote:
> > One question: How do you get it to include a list of works cited
> > (bibliography)?  If I recall, it used to automatically include this
> > at the end of the document, but it doesn't seem to do that any more,
> > at least with the style.csl in your test directory.
>
> The lack of a bibliography is due to the fact that the style does not
> include it, indeed. If you try the styles which come with the
> test-suite you'll see that the bibliography is added, as it used to,
> at the end of the pandoc document:
>
> http://bitbucket.org/bdarcus/citeproc-test/src/tip/styles/
>
> we agreed that, in order to have a "Reference" section indexed, the
> text should end with something like:
>
> # References
>
> You should grab the latest patches from the darcs tree, though, since
> there were some regressions (some testing beyond the test-suite is
> needed!) in the implementation of the bibliography.
>
> > As for the markdown syntax:  Nathan Gass and I hashed out a couple of
> > proposals onhttp://gitit.net/Pandoc%20Citations.  If you want to
> > read it over and add your comments, that will be helpful in deciding
> > on a final syntax.  You might add a comment about wanting a "nocite"
> > variant, for example -- and anything else that would be useful.
>
> I'll try to be helpful but I have little confidence with things like
> these. I think that the local modifiers should, at least, provide a
> way to:
>
>  - suppress the author (in a author-date style, for instance), and
>    this is implemented;
>
>  - print the author only (implemented, the syntax is '+@doe1999');
>
>  - suppress parenthesis(?);
>
>  - prevent the citation to show up inn the reference list
>    (bibliography).
>
> Maybe Bruce is reading and could comment on this, maybe with respect
> to the features a CSL implementation is supposed to provide.

I've only just checked in. I'm pretty busy these days, and it's quite
hard to follow all this. Can someone update me on where this
conversation stands, and what sorts of details you need feedback on?

Also, if I want to test this out with some real world work, what's the
easiest way to get it running?

Bruce

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]             ` <507b277f-218d-494f-88ae-67c84c9e5cec-TDjcMQyrprHatUqWdiOD/mB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
@ 2010-10-30  4:13               ` fiddlosopher
       [not found]                 ` <1d77490f-4c76-4571-8f53-6902d1604ba5-PQeItPOgslmbvKjc6lLfglYGCWtFR9XvQQ4Iyu8u01E@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: fiddlosopher @ 2010-10-30  4:13 UTC (permalink / raw)
  To: pandoc-discuss

> I've only just checked in. I'm pretty busy these days, and it's quite
> hard to follow all this. Can someone update me on where this
> conversation stands, and what sorts of details you need feedback on?
>
> Also, if I want to test this out with some real world work, what's the
> easiest way to get it running?

Bruce,

Here's how to get it running.

cabal update
darcs get http://code.haskell.org/citeproc-hs
cd citeproc-hs
cabal install
cd ..
git clone -b citeproc git://github.com/jgm/pandoc.git pandoc-citeproc
cd pandoc-citeproc
cabal install

You should now have a version of pandoc compiled against Andrea's
latest citeproc library.  (Let me know if there are any difficulties
here.)

To test, do something like

pandoc -s -S --biblio biblio.bib --csl style.csl test.markdown >
test.html

You can download a sample biblio.bib, style.csl, and test.markdown
from Andrea's site:

http://gorgias.mine.nu/citeproc/test.markdown

Of course, pandoc can also use mods bibliographies.  (I'm not sure,
you might have to use the --biblio-format option to specify the
format, but it seemed to recognize the bibtex file without this.)

Andrea still has some work to do to get his implementation fully
conformant, but he is getting close.  On the pandoc side, the main
issue is how to design the markdown citation format.  There's an
extensive discussion of this on a wiki page:

http://gitit.net/Pandoc%20Citations

Andera's current implementation uses a somewhat different system than
we'll probably end up with.  Anyway, feedback on this would be
helpful.  One thing you could help with is by clarifying whether
citeproc has variants for suppressing parentheses, as you have in
natbib for latex.  We were a bit uncertain about that.

Once we pick a citation format and start including citeproc by default
(which I'd like to do), there will be a strong presumption against
backward-compatible changes going forward, since these would break
existing documents.  So it's pretty important to get these decisions
right at the outset.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]             ` <20101029202716.GA26844-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-10-30 10:57               ` Andrea Rossato
  0 siblings, 0 replies; 107+ messages in thread
From: Andrea Rossato @ 2010-10-30 10:57 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Fri, Oct 29, 2010 at 01:27:16PM -0700, John MacFarlane wrote:
> I tried it with the chicago-author-date.csl style, and a few others,
> and noticed that the suppress-author citations all have an extra space
> before the data.  (Example attached.)

I'm aware of the problem and working on a fix. Thanks for the example,
though.

I think that, in addition to the standard test-suite, a few pandoc
specific tests would be very helpful: I'll try to put together some
data.

Thanks,
Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]         ` <20101028202646.GB6256-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2010-10-29 20:27           ` John MacFarlane
  2010-10-30  2:54           ` Bruce
@ 2010-10-30 11:17           ` Nathan Gass
  2 siblings, 0 replies; 107+ messages in thread
From: Nathan Gass @ 2010-10-30 11:17 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Just when I got fed up using latex commands for citation in my documents 
again, this topic sees some great progress. Thanks for all the work you do.

On 28.10.10 22:26, Andrea Rossato wrote:

>> As for the markdown syntax:  Nathan Gass and I hashed out a couple of
>> proposals on http://gitit.net/Pandoc%20Citations.  If you want to
>> read it over and add your comments, that will be helpful in deciding
>> on a final syntax.  You might add a comment about wanting a "nocite"
>> variant, for example -- and anything else that would be useful.
>
> I'll try to be helpful but I have little confidence with things like
> these. I think that the local modifiers should, at least, provide a
> way to:
>
>   - suppress the author (in a author-date style, for instance), and
>     this is implemented;
>
>   - print the author only (implemented, the syntax is '+@doe1999');

As primitive in pandoc itself, author only citations are very useful and 
I plan to use it in a --natbib option I want to implement, which lets 
pandoc parse and write natbib citations in latex files.

That said, I disagree about the need for markdown syntax for this. By 
far the most cases which would use this syntax would look something like:

    As argued by [+doe1999] [-doe1999@p.10]

We can't cover all cases of mentioning the authors in text, as this is 
just to complex and language specific. I therefore would only add a 
syntax for the most common case (so `[+doe1999]` becomes "Doe (1999)" 
and not only "Doe"). An alternative is of course to have no syntax for 
author names in text at all and require to always write them out by hand.

Again this needs no changes to citeproc at all. From the pandoc 
perspective it is useful to have citeproc as flexible as possible, so 
that pandoc can always choose to render citation variants if the target 
format does not support them. This is only about the markdown reader 
producing different native documents than you currently propose.

So lets assume John's last proposal without any support for authors in 
text gets implemented. I use my new --natbib option to convert a latex 
document using natbib citations. Given the current native citation 
structure, this would very well be possible, as the latex reader could 
just translate `\citet[p.10]{doe1999}` to two native citations (first 
author-only, then suppress-author). The markdown writer can then write 
out the suppress-author citation as `[-doe1999: p.10]` and render the 
author-only citation as `Doe` because there is no markdown syntax for 
this citation variant in markdown. So the latex `As argued by 
\citet[p.10]{doe1999}` gets translated to `As argued by Doe [-doe1999: 
p.10]`, using citeprocs author-only variant to render the "Doe". Keeping 
that possibility would be very cool.

>
>   - suppress parenthesis(?);

As said before, this is useful but imho not a high priority, as this 
feature can later be added without serious break of backwards compatibility.

So thanks again.

Nathan Gass


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]             ` <20101028203320.GA1581-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-10-30 12:51               ` Andrea Rossato
       [not found]                 ` <20101030125104.GC14156-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2010-10-30 16:06               ` supporting both parenthetical and footnote citations (was: citeproc updates) John MacFarlane
  1 sibling, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-10-30 12:51 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Thu, Oct 28, 2010 at 01:33:20PM -0700, John MacFarlane wrote:
> +++ BP Jonsson [Oct 28 10 21:56 ]:
> > I like the idea of distinguishing
> > between an escaped and unescaped ( to identify a (non)prefix.
> > Surely it could be combined with the moderate proposal?
> > 
> >    (see, for example, [doe99: 34-55, 67-89]; also [smith07: chap.
> > 6]; jones49)
> > 
> > becomes
> > 
> >    (see, for example,  Doe 1999, 34-55, 67-89; also Smith 2007,
> > chap. 6; Jones 1949)
> > 
> > while
> > 
> >    \(see, for example, [doe99: 34-55, 67-89])
> > 
> > becomes
> > 
> >    (see, for example, Doe (1999), 34-55, 67-89)
> 
> Yes, maybe that's right. The idea would be to store information in
> state about whether you're already inside parentheses -- escaping
> the ( could turn this off.  And then somehow this information would
> be passed to citeproc to activate a parenthesized or non-parenthesized
> variant.  I'm not sure whether citeproc has support for that yet.

the problem is that citeproc entirely relays on the style for
producing the output and so it is not predictable, now, whether the
style encloses citations in parenthesis or not. This kind of local
modifications would break the possibility to easily change citation
styles, if I understand the problem correctly.

> > Hopefully one could then instruct pandoc/citeproc what
> > language's terms to expand those sigils into, rather than
> > relying on the OS locale.  I write more formal stuff in
> > English than in Swedish, and the only thing which really
> > bugs me in Zotero is that you have to muck around in
> > Firefox's configs and then quit/reopen FF to change the
> > 'locale' Zotero uses.  Needless to say there are millions of
> > people who have an OS in their native language but regularly
> > or more often than not write in some other language (usually
> > English...)
> > 
> > Soo for example:
> > 
> > volume(s)       .v.     (.vv.)
> > book(s)         .b.     (.bb.)
> > part(s)         .d.     (.dd.)      Mnemonic: "divisio(n)"
> > chapter(s)      .c.     (.cc.)
> > section(s)      .s.     (.ss.)
> > page(s)         .p.     (.pp.)
> > issue(s)        .i.     (.ii.)
> > 
> > 
> > /bpj
> 
> The current proposals involve an unstructured locator -- as in
> bibtex.  We don't try to parse it out into sections, pages, etc.
> That would be nice, in principle, but a bit too complex for now.
> 
> You might want to briefly summarize your comments on the wiki
> page, so we don't lose track.

citeproc parses the locator (this is required by the specification,
since the locator label and the numeric value may be used
independently by the style.

That's the code that does the job (it checks the initial part of the
locator up to an unambiguous amount of characters).

parseLocator :: String -> (String, String)
parseLocator s
    | "b"    `isPrefixOf` formatField s = mk "book"
    | "ch"   `isPrefixOf` formatField s = mk "chapter"
    | "co"   `isPrefixOf` formatField s = mk "column"
    | "fi"   `isPrefixOf` formatField s = mk "figure"
    | "fo"   `isPrefixOf` formatField s = mk "folio"
    | "i"    `isPrefixOf` formatField s = mk "issue"
    | "l"    `isPrefixOf` formatField s = mk "line"
    | "n"    `isPrefixOf` formatField s = mk "note"
    | "o"    `isPrefixOf` formatField s = mk "opus"
    | "para" `isPrefixOf` formatField s = mk "paragraph"
    | "part" `isPrefixOf` formatField s = mk "part"
    | "p"    `isPrefixOf` formatField s = mk "page"
    | "sec"  `isPrefixOf` formatField s = mk "section"
    | "sub"  `isPrefixOf` formatField s = mk "sub verbo"
    | "ve"   `isPrefixOf` formatField s = mk "verse"
    | "v"    `isPrefixOf` formatField s = mk "volume"
    | otherwise                         =    ([], [])
    where
      mk c = if null s then ([], []) else (,) c . concat . tail . words $ s

that is to say, 'p', 'p.', 'page', 'penguins' would all became a
"page" label.

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                 ` <1d77490f-4c76-4571-8f53-6902d1604ba5-PQeItPOgslmbvKjc6lLfglYGCWtFR9XvQQ4Iyu8u01E@public.gmane.org>
@ 2010-10-30 15:18                   ` Bruce
  2010-11-13  9:46                   ` lukshuntim-Re5JQEeQqe8AvxtiuMwx3w
  2010-11-13 16:16                   ` fiddlosopher
  2 siblings, 0 replies; 107+ messages in thread
From: Bruce @ 2010-10-30 15:18 UTC (permalink / raw)
  To: pandoc-discuss



On Oct 30, 12:13 am, fiddlosopher <fiddlosop...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > I've only just checked in. I'm pretty busy these days, and it's quite
> > hard to follow all this. Can someone update me on where this
> > conversation stands, and what sorts of details you need feedback on?
>
> > Also, if I want to test this out with some real world work, what's the
> > easiest way to get it running?
>
> Bruce,
>
> Here's how to get it running.
>
> cabal update
> darcs gethttp://code.haskell.org/citeproc-hs
> cd citeproc-hs
> cabal install
> cd ..
> git clone -b citeproc git://github.com/jgm/pandoc.git pandoc-citeproc
> cd pandoc-citeproc
> cabal install
>
> You should now have a version of pandoc compiled against Andrea's
> latest citeproc library.  (Let me know if there are any difficulties
> here.)
>
> To test, do something like
>
> pandoc -s -S --biblio biblio.bib --csl style.csl test.markdown >
> test.html

OK, thanks; will try that out later and let you know if I run into any
problems.

> You can download a sample biblio.bib, style.csl, and test.markdown
> from Andrea's site:
>
> http://gorgias.mine.nu/citeproc/test.markdown
>
> Of course, pandoc can also use mods bibliographies.  (I'm not sure,
> you might have to use the --biblio-format option to specify the
> format, but it seemed to recognize the bibtex file without this.)
>
> Andrea still has some work to do to get his implementation fully
> conformant, but he is getting close.  On the pandoc side, the main
> issue is how to design the markdown citation format.  There's an
> extensive discussion of this on a wiki page:
>
> http://gitit.net/Pandoc%20Citations

Yeah, that's the page I was referring when I said I was having a hard
time following the discussion. Is there some particular section(s) I
should focus on there? I get the feeling that some of it is more
background, and that you've moved on.

> Andera's current implementation uses a somewhat different system than
> we'll probably end up with.  Anyway, feedback on this would be
> helpful.  One thing you could help with is by clarifying whether
> citeproc has variants for suppressing parentheses, as you have in
> natbib for latex.  We were a bit uncertain about that.

The only variant behavior CSL explicitly supports is what I'd call
positional variants: first/subsequent, ibid, etc. So no, there is no
support for suppressing parenthesis.

But I'm not sure we explicitly need to support it in CSL, since the
prefix for the citation is easily discerned. In that sense, it could
work sort of like suppressing author does now: it's not mentioned
anywhere in the CSL spec, but it's easy enough to implement, and so it
usually included in implementations.

In any case, once we sort it all out, we can look to feed back some of
this discussion to CSL development if necessary.

> Once we pick a citation format and start including citeproc by default
> (which I'd like to do), there will be a strong presumption against
> backward-compatible changes going forward, since these would break
> existing documents.  So it's pretty important to get these decisions
> right at the outset.

Absolutely agreed.

Bruce

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: supporting both parenthetical and footnote citations (was: citeproc updates)
       [not found]             ` <20101028203320.GA1581-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-10-30 12:51               ` Andrea Rossato
@ 2010-10-30 16:06               ` John MacFarlane
       [not found]                 ` <20101030160608.GA4075-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-10-30 16:06 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

It's difficult to see how to design the system so that the
author can switch freely between a style using parenthetical
citations and a style using footnote citations. (CSL seems
to offer both.)

Here's an example from Andrea's test document:

> Then a citation group: [See @item1, p. 34-35; also @item3, chap. 3].

This actually doesn't come out right for either style.  With
a parenthetical style, you'll get something like:

> Then a citation group:  (See Doe 2000, p. 34-35; also Doe 2002, chap. 3).

Here the 'See' should be 'see'.

However, with a footnote style, you'll get something like:

> Then a citation group: 1.

> 1. See Doe 2000, p. 34-35; also Doe 2002, chap. 3

Here there are three problems:

- The footnote is missing a period, since the period was outside
  of the citation
- The period occurs after the footnote.
- There is a space before the footnote.

It's a bit hard to see how to resolve these issues, given that, when
pandoc is actually parsing the citation, it won't know whether the
output is to be rendered using a footnote style or an author/date
style.

I noticed another issue with Andrea's test:  the note says simply
"at 3" instead of "chap. 3" -- apparently the number is being stored
and the rest ignored.  This seems undesirable.

My own inclination, if there's no obvious solution, is to optimize pandoc for
the author-date format. After all, if people want citations in footnotes, they
can always manually insert footnotes.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: supporting both parenthetical and footnote citations (was: citeproc updates)
       [not found]                 ` <20101030160608.GA4075-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-10-30 20:18                   ` Bruce
       [not found]                     ` <d7beeaa7-d578-47db-bf92-aab5415d341e-QB3fWVJXTS+bvKjc6lLfglYGCWtFR9XvQQ4Iyu8u01E@public.gmane.org>
  2010-10-31 15:58                   ` Bruce
  2010-11-02 21:39                   ` Andrea Rossato
  2 siblings, 1 reply; 107+ messages in thread
From: Bruce @ 2010-10-30 20:18 UTC (permalink / raw)
  To: pandoc-discuss



On Oct 30, 12:06 pm, John MacFarlane <fiddlosop...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> It's difficult to see how to design the system so that the
> author can switch freely between a style using parenthetical
> citations and a style using footnote citations. (CSL seems
> to offer both.)
>
> Here's an example from Andrea's test document:
>
> > Then a citation group: [See @item1, p. 34-35; also @item3, chap. 3].
>
> This actually doesn't come out right for either style.  With
> a parenthetical style, you'll get something like:
>
> > Then a citation group:  (See Doe 2000, p. 34-35; also Doe 2002, chap. 3).
>
> Here the 'See' should be 'see'.
>
> However, with a footnote style, you'll get something like:
>
> > Then a citation group: 1.
> > 1. See Doe 2000, p. 34-35; also Doe 2002, chap. 3
>
> Here there are three problems:
>
> - The footnote is missing a period, since the period was outside
>   of the citation
> - The period occurs after the footnote.
> - There is a space before the footnote.
>
> It's a bit hard to see how to resolve these issues, given that, when
> pandoc is actually parsing the citation, it won't know whether the
> output is to be rendered using a footnote style or an author/date
> style.

I can check with the people that worked on the Zotero code to handle
this. No doubt, it does require some cleanup work around punctuation
though.

> I noticed another issue with Andrea's test:  the note says simply
> "at 3" instead of "chap. 3" -- apparently the number is being stored
> and the rest ignored.  This seems undesirable.
>
> My own inclination, if there's no obvious solution, is to optimize pandoc for
> the author-date format. After all, if people want citations in footnotes, they
> can always manually insert footnotes.

Believe it or not, there are some fields (including mine) where it's
entirely feasible to author a manuscript in an author-date format, but
ultimately publish in a journal (or book) format that requires notes.
Or vice versa. It would be a large PITA for an author to have to
manually switch between them.

Obviously if it's not realistically possible to achieve this design
goal in pandoc, then the decision is made. But it would be great if we
could figure out a way to solve these issues.

Bruce

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: supporting both parenthetical and footnote citations (was: citeproc updates)
       [not found]                     ` <d7beeaa7-d578-47db-bf92-aab5415d341e-QB3fWVJXTS+bvKjc6lLfglYGCWtFR9XvQQ4Iyu8u01E@public.gmane.org>
@ 2010-10-30 23:46                       ` John MacFarlane
  0 siblings, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-10-30 23:46 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Bruce [Oct 30 10 13:18 ]:
> 
> 
> On Oct 30, 12:06 pm, John MacFarlane <fiddlosop...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > It's difficult to see how to design the system so that the
> > author can switch freely between a style using parenthetical
> > citations and a style using footnote citations. (CSL seems
> > to offer both.)
> >
> > Here's an example from Andrea's test document:
> >
> > > Then a citation group: [See @item1, p. 34-35; also @item3, chap. 3].
> >
> > This actually doesn't come out right for either style.  With
> > a parenthetical style, you'll get something like:
> >
> > > Then a citation group:  (See Doe 2000, p. 34-35; also Doe 2002, chap. 3).
> >
> > Here the 'See' should be 'see'.
> >
> > However, with a footnote style, you'll get something like:
> >
> > > Then a citation group: 1.
> > > 1. See Doe 2000, p. 34-35; also Doe 2002, chap. 3
> >
> > Here there are three problems:
> >
> > - The footnote is missing a period, since the period was outside
> >   of the citation
> > - The period occurs after the footnote.
> > - There is a space before the footnote.
> >
> > It's a bit hard to see how to resolve these issues, given that, when
> > pandoc is actually parsing the citation, it won't know whether the
> > output is to be rendered using a footnote style or an author/date
> > style.
> 
> I can check with the people that worked on the Zotero code to handle
> this. No doubt, it does require some cleanup work around punctuation
> though.
> 
> > I noticed another issue with Andrea's test:  the note says simply
> > "at 3" instead of "chap. 3" -- apparently the number is being stored
> > and the rest ignored.  This seems undesirable.
> >
> > My own inclination, if there's no obvious solution, is to optimize pandoc for
> > the author-date format. After all, if people want citations in footnotes, they
> > can always manually insert footnotes.
> 
> Believe it or not, there are some fields (including mine) where it's
> entirely feasible to author a manuscript in an author-date format, but
> ultimately publish in a journal (or book) format that requires notes.
> Or vice versa. It would be a large PITA for an author to have to
> manually switch between them.

Yes, I've had to do this too, so I'd love to have a fully automatic
solution.  It's just hard to see exactly how to do it.  But I'll
think about it.

John

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: supporting both parenthetical and footnote citations (was: citeproc updates)
       [not found]                 ` <20101030160608.GA4075-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-10-30 20:18                   ` Bruce
@ 2010-10-31 15:58                   ` Bruce
  2010-11-02 21:39                   ` Andrea Rossato
  2 siblings, 0 replies; 107+ messages in thread
From: Bruce @ 2010-10-31 15:58 UTC (permalink / raw)
  To: pandoc-discuss

So not a complete answer (still gathering info on that), but we can
cross one issue off the list.

On Oct 30, 12:06 pm, John MacFarlane <fiddlosop...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

...

> Here there are three problems:
>
> - The footnote is missing a period, since the period was outside
>   of the citation
> - The period occurs after the footnote.
> - There is a space before the footnote.

The first is a non-issue, because the period would come from the
formatted citation (note-style CSLs include the period in the citation
suffix).

Bruce

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax (was: citeproc updates)
       [not found]                 ` <20101030125104.GC14156-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-10-31 22:06                   ` John MacFarlane
       [not found]                     ` <20101031220602.GA21760-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-10-31 22:06 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Here are my latest thoughts on citation syntax.

I'm beginning to wonder whether we really want an omit-author
form.  Here's why.  In general, we've set ourselves the goal
of making it possible to change between inline and footnote
citations, and more generally to change citation styles, without
modifying the document itself. I think this is really a good
goal.

The main use that people have had in mind for the omit-author
form is for use in generating the following kind of output:

Doe (1999, 33) says that ...

You'd generate this with something like:

[+@doe99] [-@doe99, p. 33] says that ...

But now, what happens when you switch to a footnote citation style?
You end up with a footnote reference as the subject of your sentence --
not good!

Could we do without omit-author?  I think we could.  You can always
just write:

Doe [-@doe99, p. 33] says that ...

This will look fine in footnote styles (provided we can solve the
spacing and punctuation issues mentioned elsewhere in this thread).

Nathan's plans to add a --natbib option to the latex reader shouldn't
be affected by this.  natbib's omit-author forms could be translated
to literal renderings of the author's name, which I assume we could
get out of citeproc-hs.

That would leave us with a fairly simple syntax for citations -- pretty
close to what Andrea is using.  (I have to say that I like the @s,
myself; they make it clear that we've got something special here --
a code for a bibliographic entry, and they seem fairly unobtrusive
to my eye.)

CITATION = '[' '-'? whitespace PREFIX? CITE (';' whitespace CITE)* ']'
PREFIX = string of characters up to an unescaped '@' or ']'
PLAIN = "\@" | "\]" | [^@\]]
CITE = CITEKEY LOCATOR?
LOCATOR = ',' whitespace (string of characters up to unescaped ']' or ';')
CITEKEY = '@' CITEKEYCHAR+
CITEKEYCHAR = any non-whitespace character besides ',', ';', ']', '@'

What would be missing? In an author-date parenthetical style, you'd still end
up with undesirable parens-inside-parens. For example:

(Doe [-doe99, p. 30] contradicts James [-james44, p. 10])

would become

(Doe (1999, 30) contradicts James (1944, 10))

where I think what is often desired is

(Doe 1999, 30 contradicts James 1944, 10)

Perhaps citeproc-hs could implement a strip-parens option that
modifies the result of applying the citeproc style.  Instead of having
a separate syntax for that variant, we could try to apply it
automatically, as discussed elsewhere, whenever a citation occurs
inside unescaped parentheses.  The problem with this is that
some authors might actually want the parens within parens.
Unfortunately, this choice can't be represented in a citeproc
style, since citeproc doesn't have a suppress-parens variant.
So maybe it would be better to make it an explicit option, triggered
by something like '~'.  Then we'd have three forms:

[@doe99, p. 30]   =>  (Doe 1999, 30)
[~@doe99, p. 30]  =>  Doe 1999, 30
[-@doe99, p. 30]  =>  (1999, 30)

Andrea also suggested that we needed a nocite variant.  I assume
he means something like bibtex's \nocite, which causes the entry
to be included in the list of references at the end, without
generating a citation in the text.

One could just use another symbol:

[&@doe99]

Or one could use a special locator or prefix:

[nocite @doe99; @smith04]

I prefer the latter, just because its meaning is more obvious,
but it may be too English-centric for pandoc.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax (was: citeproc updates)
       [not found]                     ` <20101031220602.GA21760-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-01  0:48                       ` Bruce
       [not found]                         ` <3b5e1631-d1c4-4d49-bc07-fb43544faf4e-kXDyx5gwD+DerssAVCGTfmB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
  2010-11-01 10:01                       ` citation syntax Nathan Gass
  1 sibling, 1 reply; 107+ messages in thread
From: Bruce @ 2010-11-01  0:48 UTC (permalink / raw)
  To: pandoc-discuss



On Oct 31, 6:06 pm, John MacFarlane <fiddlosop...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Here are my latest thoughts on citation syntax.
>
> I'm beginning to wonder whether we really want an omit-author
> form.  Here's why.  In general, we've set ourselves the goal
> of making it possible to change between inline and footnote
> citations, and more generally to change citation styles, without
> modifying the document itself. I think this is really a good
> goal.
>
> The main use that people have had in mind for the omit-author
> form is for use in generating the following kind of output:
>
> Doe (1999, 33) says that ...

Correct.

> You'd generate this with something like:
>
> [+@doe99] [-@doe99, p. 33] says that ...

This is the BibTeX way, but I'd be fine with simply:

Doe [-@doe99, p. 33] says that ...

Indeed, that's how it works in Zotero. People, in particular with
BibTeX background, will complain about treating the author name as
plain text, but it's never bothered me.

> But now, what happens when you switch to a footnote citation style?
> You end up with a footnote reference as the subject of your sentence --
> not good!

Not following here. You'd get "Doe^1 says that ..."; what's wrong with
that?

> Could we do without omit-author?  I think we could.  You can always
> just write:
>
> Doe [-@doe99, p. 33] says that ...
>
> This will look fine in footnote styles (provided we can solve the
> spacing and punctuation issues mentioned elsewhere in this thread).

Right; I see we arrived at the same place ;-)

But are you using the right terminology here? I thought this IS omit-
author (the full citation minus the author)?

> Nathan's plans to add a --natbib option to the latex reader shouldn't
> be affected by this.  natbib's omit-author forms could be translated
> to literal renderings of the author's name, which I assume we could
> get out of citeproc-hs.
>
> That would leave us with a fairly simple syntax for citations -- pretty
> close to what Andrea is using.  (I have to say that I like the @s,
> myself; they make it clear that we've got something special here --
> a code for a bibliographic entry, and they seem fairly unobtrusive
> to my eye.)

Well, and the idea is that the symbol literally means "at" and so is
appropriate for the purpose.

> CITATION = '[' '-'? whitespace PREFIX? CITE (';' whitespace CITE)* ']'
> PREFIX = string of characters up to an unescaped '@' or ']'
> PLAIN = "\@" | "\]" | [^@\]]
> CITE = CITEKEY LOCATOR?
> LOCATOR = ',' whitespace (string of characters up to unescaped ']' or ';')
> CITEKEY = '@' CITEKEYCHAR+
> CITEKEYCHAR = any non-whitespace character besides ',', ';', ']', '@'

So what is the value of the locator? It's just a string you pass on to
citeproc?

The most common case by far is a page number. But what if you needed
something like (Doe v. Smith, 1978, p34, l4)?

> What would be missing? In an author-date parenthetical style, you'd still end
> up with undesirable parens-inside-parens. For example:
>
> (Doe [-doe99, p. 30] contradicts James [-james44, p. 10])
>
> would become
>
> (Doe (1999, 30) contradicts James (1944, 10))
>
> where I think what is often desired is
>
> (Doe 1999, 30 contradicts James 1944, 10)

Right. Another one of those funky details that needs to get cleaned
up, much like the distinction between a regular citation in a
footnote, and what we've called "footnoted citations".

> Perhaps citeproc-hs could implement a strip-parens option that
> modifies the result of applying the citeproc style.  Instead of having
> a separate syntax for that variant, we could try to apply it
> automatically, as discussed elsewhere, whenever a citation occurs
> inside unescaped parentheses.  The problem with this is that
> some authors might actually want the parens within parens.

Can you think of a case where one would want this? I can't.

> Unfortunately, this choice can't be represented in a citeproc
> style, since citeproc doesn't have a suppress-parens variant.
> So maybe it would be better to make it an explicit option, triggered
> by something like '~'.  Then we'd have three forms:
>
> [@doe99, p. 30]   =>  (Doe 1999, 30)
> [~@doe99, p. 30]  =>  Doe 1999, 30
> [-@doe99, p. 30]  =>  (1999, 30)
>
> Andrea also suggested that we needed a nocite variant.  I assume
> he means something like bibtex's \nocite, which causes the entry
> to be included in the list of references at the end, without
> generating a citation in the text.
>
> One could just use another symbol:
>
> [&@doe99]
>
> Or one could use a special locator or prefix:
>
> [nocite @doe99; @smith04]
>
> I prefer the latter, just because its meaning is more obvious,
> but it may be too English-centric for pandoc.

Another option (not thought of this before, so just off-the-cuff); a
double '-':

[--doe99]

... or if we don't need the earlier '+' behavior, use that?

Finally, I'd like to add a use case for consideration that's come up a
fair bit in the zotero fora:

Multiple bibliography sections. For example, you need different
sections for primary and secondary documents, and legal citations.

Whether you add support for it now or note, it'd at least be good to
consider how it might be added. It might be so simple as some sort of
configuration for citeproc, where the markdown document need not
include any specific syntax.

Bruce

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax (was: citeproc updates)
       [not found]                         ` <3b5e1631-d1c4-4d49-bc07-fb43544faf4e-kXDyx5gwD+DerssAVCGTfmB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
@ 2010-11-01  1:34                           ` John MacFarlane
  0 siblings, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-01  1:34 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Bruce [Oct 31 10 17:48 ]:
> 
> 
> On Oct 31, 6:06 pm, John MacFarlane <fiddlosop...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > Here are my latest thoughts on citation syntax.
> >
> > I'm beginning to wonder whether we really want an omit-author
> > form.  Here's why.  In general, we've set ourselves the goal
> > of making it possible to change between inline and footnote
> > citations, and more generally to change citation styles, without
> > modifying the document itself. I think this is really a good
> > goal.
> >
> > The main use that people have had in mind for the omit-author
> > form is for use in generating the following kind of output:
> >
> > Doe (1999, 33) says that ...
> 
> Correct.
> 
> > You'd generate this with something like:
> >
> > [+@doe99] [-@doe99, p. 33] says that ...
> 
> This is the BibTeX way, but I'd be fine with simply:
> 
> Doe [-@doe99, p. 33] says that ...
> 
> Indeed, that's how it works in Zotero. People, in particular with
> BibTeX background, will complain about treating the author name as
> plain text, but it's never bothered me.
> 
> > But now, what happens when you switch to a footnote citation style?
> > You end up with a footnote reference as the subject of your sentence --
> > not good!
> 
> Not following here. You'd get "Doe^1 says that ..."; what's wrong with
> that?

Sorry, I spoke above of "omit-author" when I really meant "author-only."
Andrea's syntax has a variant

[+@doe99]

that omits everything *but* the author.  It's this that I think
we can do without, and should do without, because

[+@doe99] says that...

wouldn't make sense in a footnote citation style. We certainly still need
omit-author.

> > Could we do without omit-author?  I think we could.  You can always
> > just write:
> >
> > Doe [-@doe99, p. 33] says that ...
> >
> > This will look fine in footnote styles (provided we can solve the
> > spacing and punctuation issues mentioned elsewhere in this thread).
> 
> Right; I see we arrived at the same place ;-)
> 
> But are you using the right terminology here? I thought this IS omit-
> author (the full citation minus the author)?

Yes, sorry - I used the wrong term.

> > Nathan's plans to add a --natbib option to the latex reader shouldn't
> > be affected by this.  natbib's omit-author forms could be translated
> > to literal renderings of the author's name, which I assume we could
> > get out of citeproc-hs.
> >
> > That would leave us with a fairly simple syntax for citations -- pretty
> > close to what Andrea is using.  (I have to say that I like the @s,
> > myself; they make it clear that we've got something special here --
> > a code for a bibliographic entry, and they seem fairly unobtrusive
> > to my eye.)
> 
> Well, and the idea is that the symbol literally means "at" and so is
> appropriate for the purpose.
> 
> > CITATION = '[' '-'? whitespace PREFIX? CITE (';' whitespace CITE)* ']'
> > PREFIX = string of characters up to an unescaped '@' or ']'
> > PLAIN = "\@" | "\]" | [^@\]]
> > CITE = CITEKEY LOCATOR?
> > LOCATOR = ',' whitespace (string of characters up to unescaped ']' or ';')
> > CITEKEY = '@' CITEKEYCHAR+
> > CITEKEYCHAR = any non-whitespace character besides ',', ';', ']', '@'
> 
> So what is the value of the locator? It's just a string you pass on to
> citeproc?
> 
> The most common case by far is a page number. But what if you needed
> something like (Doe v. Smith, 1978, p34, l4)?

My thought was that we'd pass the locator (as an unstructured string)
to citeproc-hs, which would parse it in a locale-sensitive way (as
Andrea explained in his last email).  Pandoc doesn't need to parse
the locator, or be aware of the locale, since citeproc-hs is doing
that.

> > What would be missing? In an author-date parenthetical style, you'd still end
> > up with undesirable parens-inside-parens. For example:
> >
> > (Doe [-doe99, p. 30] contradicts James [-james44, p. 10])
> >
> > would become
> >
> > (Doe (1999, 30) contradicts James (1944, 10))
> >
> > where I think what is often desired is
> >
> > (Doe 1999, 30 contradicts James 1944, 10)
> 
> Right. Another one of those funky details that needs to get cleaned
> up, much like the distinction between a regular citation in a
> footnote, and what we've called "footnoted citations".
> 
> > Perhaps citeproc-hs could implement a strip-parens option that
> > modifies the result of applying the citeproc style.  Instead of having
> > a separate syntax for that variant, we could try to apply it
> > automatically, as discussed elsewhere, whenever a citation occurs
> > inside unescaped parentheses.  The problem with this is that
> > some authors might actually want the parens within parens.
> 
> Can you think of a case where one would want this? I can't.

I wouldn't want it myself, but there are so many different standards
for citations, that I couldn't be confident that none of them would
want this.

You probably know much more about this than I, though. If there
isn't a real world need, then maybe the automatic detection would
be best.

> > Unfortunately, this choice can't be represented in a citeproc
> > style, since citeproc doesn't have a suppress-parens variant.
> > So maybe it would be better to make it an explicit option, triggered
> > by something like '~'.  Then we'd have three forms:
> >
> > [@doe99, p. 30]   =>  (Doe 1999, 30)
> > [~@doe99, p. 30]  =>  Doe 1999, 30
> > [-@doe99, p. 30]  =>  (1999, 30)
> >
> > Andrea also suggested that we needed a nocite variant.  I assume
> > he means something like bibtex's \nocite, which causes the entry
> > to be included in the list of references at the end, without
> > generating a citation in the text.
> >
> > One could just use another symbol:
> >
> > [&@doe99]
> >
> > Or one could use a special locator or prefix:
> >
> > [nocite @doe99; @smith04]
> >
> > I prefer the latter, just because its meaning is more obvious,
> > but it may be too English-centric for pandoc.
> 
> Another option (not thought of this before, so just off-the-cuff); a
> double '-':
> 
> [--doe99]
> 
> ... or if we don't need the earlier '+' behavior, use that?
> 
> Finally, I'd like to add a use case for consideration that's come up a
> fair bit in the zotero fora:
> 
> Multiple bibliography sections. For example, you need different
> sections for primary and secondary documents, and legal citations.
> 
> Whether you add support for it now or note, it'd at least be good to
> consider how it might be added. It might be so simple as some sort of
> configuration for citeproc, where the markdown document need not
> include any specific syntax.

Yes, I'm not too sure about this.  Currently citeproc returns a
list of works cited, which is just inserted at the end of the pandoc
document.  The suggestion is that authors will finish their document
with an appropriate section heading, say

# References

and the bibliography would be inserted after this.

If you wanted two separate bibliographies, you'd need two section
headings, presumably, and then we'd need syntax to tell pandoc
which bibliographies should be inserted where.  This is quite
a bit of additional complexity.

As for creating two sorted lists (one of primary and one of
secondary sources), that would have to be done by citeproc
itself, presumably on the basis of information in the bibliography
entries (maybe a boolean 'primary' field). So I don't think
special syntax would be needed in the citations themselves.

John

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                     ` <20101031220602.GA21760-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-11-01  0:48                       ` Bruce
@ 2010-11-01 10:01                       ` Nathan Gass
       [not found]                         ` <4CCE900E.6060106-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-01 10:01 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 31.10.10 23:06, John MacFarlane wrote:
> Here are my latest thoughts on citation syntax.
>
> I'm beginning to wonder whether we really want an omit-author
> form.  Here's why.  In general, we've set ourselves the goal
> of making it possible to change between inline and footnote
> citations, and more generally to change citation styles, without
> modifying the document itself. I think this is really a good
> goal.

I think you meen the author-only form [+@doe99] and not the omit-author 
form [-@doe99]?

>
> The main use that people have had in mind for the omit-author
> form is for use in generating the following kind of output:
>
> Doe (1999, 33) says that ...
>
> You'd generate this with something like:
>
> [+@doe99] [-@doe99, p. 33] says that ...
>
> But now, what happens when you switch to a footnote citation style?
> You end up with a footnote reference as the subject of your sentence --
> not good!

In my opinion [+@doe99] should always give you something suitable as a 
subject in any style. That is kind of the hole point calling them 
textual citations. So [+@doe99] would render as "Doe" in the text itself 
and not as a footnote, even in footnote style. I'd rather have it 
rendered as  `Doe ^[1999]` so, because this is by far the most common 
use-case for this feature. Still it should always give you something 
suitable as subject in any style.

 >
> Could we do without omit-author?  I think we could.  You can always
> just write:
>
> Doe [-@doe99, p. 33] says that ...
>
> This will look fine in footnote styles (provided we can solve the
> spacing and punctuation issues mentioned elsewhere in this thread).
>
> Nathan's plans to add a --natbib option to the latex reader shouldn't
> be affected by this.  natbib's omit-author forms could be translated
> to literal renderings of the author's name, which I assume we could
> get out of citeproc-hs.

That is true. We still need the author-only primitive in the native 
pandoc document though, in case we want to be able to convert natbib to 
some other citation format with supports textual citations.

> The problem with this is that
> some authors might actually want the parens within parens.
> Unfortunately, this choice can't be represented in a citeproc
> style, since citeproc doesn't have a suppress-parens variant.
> So maybe it would be better to make it an explicit option, triggered
> by something like '~'.  Then we'd have three forms:
>
> [@doe99, p. 30]   =>   (Doe 1999, 30)
> [~@doe99, p. 30]  =>   Doe 1999, 30
> [-@doe99, p. 30]  =>   (1999, 30)

I don't think we should add special syntax to markdown which differs 
only per citation style. Either we want to support citation styles 
avoiding double parentheses and add the needed features to citeproc-hs, 
or we don't support it.

>
> Andrea also suggested that we needed a nocite variant.  I assume
> he means something like bibtex's \nocite, which causes the entry
> to be included in the list of references at the end, without
> generating a citation in the text.
>
> One could just use another symbol:
>
> [&@doe99]
>
> Or one could use a special locator or prefix:
>
> [nocite @doe99; @smith04]
>
> I prefer the latter, just because its meaning is more obvious,
> but it may be too English-centric for pandoc.

I prefer the latter too. I argued before that markdown syntax gets away 
with having special syntax for everything, because much is left to do 
with html directly and because of the rather limited scope. In my view, 
the target for pandoc should be to work with different output formats 
and support most/all features needed in academic writing. This removes 
html as fallback and widens the scope to a point where it is unfeasible 
to add special syntax for everything.

Elsewhere we discussed adding arbitrary metadata to the header. We could 
support additional cites with a metadata key cites and avoid the special 
syntax. Any comments to this idea?

Nathan

>
> John
>


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                         ` <4CCE900E.6060106-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-01 13:11                           ` Bruce
       [not found]                             ` <a97a886c-6ae0-428c-a962-b8e3258e798c-a16pFvPtgY3HdqrNY7FC6GB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
  2010-11-01 15:24                           ` John MacFarlane
  1 sibling, 1 reply; 107+ messages in thread
From: Bruce @ 2010-11-01 13:11 UTC (permalink / raw)
  To: pandoc-discuss

On Nov 1, 6:01 am, Nathan Gass <xa...-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org> wrote:
> On 31.10.10 23:06, John MacFarlane wrote:
>
> > Here are my latest thoughts on citation syntax.
>
> > I'm beginning to wonder whether we really want an omit-author
> > form.  Here's why.  In general, we've set ourselves the goal
> > of making it possible to change between inline and footnote
> > citations, and more generally to change citation styles, without
> > modifying the document itself. I think this is really a good
> > goal.
>
> I think you meen the author-only form [+@doe99] and not the omit-author
> form [-@doe99]?
>
>
>
> > The main use that people have had in mind for the omit-author
> > form is for use in generating the following kind of output:
>
> > Doe (1999, 33) says that ...
>
> > You'd generate this with something like:
>
> > [+@doe99] [-@doe99, p. 33] says that ...
>
> > But now, what happens when you switch to a footnote citation style?
> > You end up with a footnote reference as the subject of your sentence --
> > not good!
>
> In my opinion [+@doe99] should always give you something suitable as a
> subject in any style. That is kind of the hole point calling them
> textual citations. So [+@doe99] would render as "Doe" in the text itself
> and not as a footnote, even in footnote style. I'd rather have it
> rendered as  `Doe ^[1999]` so, because this is by far the most common
> use-case for this feature. Still it should always give you something
> suitable as subject in any style.

To me a "textual citation" as you seem to be using the term (printing
only the author) is not a citation, and so is out of scope.
Notwithstanding natbib tradition, can you give me an example of where
one would find such a citation in a document and have the entry it
referred to in the bibliography?

I searched for the term "textual citation" in Google, but it wasn't
much help; people seem to use the term to refer to standard in-text
citations.

> > Could we do without omit-author?  I think we could.  You can always
> > just write:
>
> > Doe [-@doe99, p. 33] says that ...
>
> > This will look fine in footnote styles (provided we can solve the
> > spacing and punctuation issues mentioned elsewhere in this thread).
>
> > Nathan's plans to add a --natbib option to the latex reader shouldn't
> > be affected by this.  natbib's omit-author forms could be translated
> > to literal renderings of the author's name, which I assume we could
> > get out of citeproc-hs.
>
> That is true. We still need the author-only primitive in the native
> pandoc document though, in case we want to be able to convert natbib to
> some other citation format with supports textual citations.
>
> > The problem with this is that
> > some authors might actually want the parens within parens.
> > Unfortunately, this choice can't be represented in a citeproc
> > style, since citeproc doesn't have a suppress-parens variant.
> > So maybe it would be better to make it an explicit option, triggered
> > by something like '~'.  Then we'd have three forms:
>
> > [@doe99, p. 30]   =>   (Doe 1999, 30)
> > [~@doe99, p. 30]  =>   Doe 1999, 30
> > [-@doe99, p. 30]  =>   (1999, 30)
>
> I don't think we should add special syntax to markdown which differs
> only per citation style. Either we want to support citation styles
> avoiding double parentheses and add the needed features to citeproc-hs,
> or we don't support it.
>
> > Andrea also suggested that we needed a nocite variant.  I assume
> > he means something like bibtex's \nocite, which causes the entry
> > to be included in the list of references at the end, without
> > generating a citation in the text.
>
> > One could just use another symbol:
>
> > [&@doe99]
>
> > Or one could use a special locator or prefix:
>
> > [nocite @doe99; @smith04]
>
> > I prefer the latter, just because its meaning is more obvious,
> > but it may be too English-centric for pandoc.
>
> I prefer the latter too. I argued before that markdown syntax gets away
> with having special syntax for everything, because much is left to do
> with html directly and because of the rather limited scope. In my view,
> the target for pandoc should be to work with different output formats
> and support most/all features needed in academic writing. This removes
> html as fallback and widens the scope to a point where it is unfeasible
> to add special syntax for everything.

But pandoc should NOT be privileging Bib/LaTeX. I think it's a big
negative to be using natural language-specific instructions.

> Elsewhere we discussed adding arbitrary metadata to the header. We could
> support additional cites with a metadata key cites and avoid the special
> syntax. Any comments to this idea?

I'd say in any case that if we can't on agree on a precise proposal
for nocite, we leave it for later.

Bruce

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                             ` <a97a886c-6ae0-428c-a962-b8e3258e798c-a16pFvPtgY3HdqrNY7FC6GB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
@ 2010-11-01 14:48                               ` Nathan Gass
       [not found]                                 ` <4CCED340.9010304-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-01 14:48 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 01.11.10 14:11, Bruce wrote:
> On Nov 1, 6:01 am, Nathan Gass<xa...-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>  wrote:
>> On 31.10.10 23:06, John MacFarlane wrote:
>>
>>> The main use that people have had in mind for the omit-author
>>> form is for use in generating the following kind of output:
>>
>>> Doe (1999, 33) says that ...
>>
>>> You'd generate this with something like:
>>
>>> [+@doe99] [-@doe99, p. 33] says that ...
>>
>>> But now, what happens when you switch to a footnote citation style?
>>> You end up with a footnote reference as the subject of your sentence --
>>> not good!
>>
>> In my opinion [+@doe99] should always give you something suitable as a
>> subject in any style. That is kind of the hole point calling them
>> textual citations. So [+@doe99] would render as "Doe" in the text itself
>> and not as a footnote, even in footnote style. I'd rather have it
>> rendered as  `Doe ^[1999]` so, because this is by far the most common
>> use-case for this feature. Still it should always give you something
>> suitable as subject in any style.
>
> To me a "textual citation" as you seem to be using the term (printing
> only the author) is not a citation, and so is out of scope.
> Notwithstanding natbib tradition, can you give me an example of where
> one would find such a citation in a document and have the entry it
> referred to in the bibliography?

I was a bit sloppy in my usage here. Sorry about the confusion this caused.

For me, a textual citation is a citation which fulfills some grammatical 
role in a sentence. For example "Doe (1999: p.10) argues, that ...". I 
borrowed this term from http://merkel.zoneo.net/Latex/natbib.php, where 
it is opposed to parenthetical citation. I think this is what you mean 
by "standard in-text citations" too.

  Andreas author-only citation rendering [+doe1999@] as "Doe" would 
allow to construct such textual citations with `[+doe1999@] 
[-doe1999@p.10]`. I'd rather have the syntax `[+doe1999@p.10]` for this 
and not any syntax for author-only citations, as I said here and in 
another email in the same thread. My main point though was, that textual 
citations can be made to work with any citation style (including 
footnote style).

The natbib command for the given textual citation is 
\citet[p.10]{doe1999} and produces a real citation with corresponding 
entry in the bibliography. I don't know if there is something equivalent 
to the author-only citations in natbib. I'm therefore a bit at a loss 
why author-only citations are in any way a "natbib tradition".

>
> I searched for the term "textual citation" in Google, but it wasn't
> much help; people seem to use the term to refer to standard in-text
> citations.
>>
>>> Andrea also suggested that we needed a nocite variant.  I assume
>>> he means something like bibtex's \nocite, which causes the entry
>>> to be included in the list of references at the end, without
>>> generating a citation in the text.
>>
>>> One could just use another symbol:
>>
>>> [&@doe99]
>>
>>> Or one could use a special locator or prefix:
>>
>>> [nocite @doe99; @smith04]
>>
>>> I prefer the latter, just because its meaning is more obvious,
>>> but it may be too English-centric for pandoc.
>>
>> I prefer the latter too. I argued before that markdown syntax gets away
>> with having special syntax for everything, because much is left to do
>> with html directly and because of the rather limited scope. In my view,
>> the target for pandoc should be to work with different output formats
>> and support most/all features needed in academic writing. This removes
>> html as fallback and widens the scope to a point where it is unfeasible
>> to add special syntax for everything.
>
> But pandoc should NOT be privileging Bib/LaTeX. I think it's a big
> negative to be using natural language-specific instructions.

I don't understand. What has my argumentation to do with bibtex or latex?

About the second point: I think it is a bigger negative to have a huge 
list of seldom used special chars in citations, which nobody can 
remember ([-@doe1999], [~@doe1999], [@doe1999], [+@doe1999], [&@doe1999] 
etc).

>
>> Elsewhere we discussed adding arbitrary metadata to the header. We could
>> support additional cites with a metadata key cites and avoid the special
>> syntax. Any comments to this idea?
>
> I'd say in any case that if we can't on agree on a precise proposal
> for nocite, we leave it for later.

I personally don't need a nocite feature, so I'm happy with this solution.

Nathan

>
> Bruce
>


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                         ` <4CCE900E.6060106-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  2010-11-01 13:11                           ` Bruce
@ 2010-11-01 15:24                           ` John MacFarlane
  1 sibling, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-01 15:24 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Nathan Gass [Nov 01 10 11:01 ]:
> >Could we do without omit-author?  I think we could.  You can always
> >just write:
> >
> >Doe [-@doe99, p. 33] says that ...
> >
> >This will look fine in footnote styles (provided we can solve the
> >spacing and punctuation issues mentioned elsewhere in this thread).
> >
> >Nathan's plans to add a --natbib option to the latex reader shouldn't
> >be affected by this.  natbib's omit-author forms could be translated
> >to literal renderings of the author's name, which I assume we could
> >get out of citeproc-hs.
> 
> That is true. We still need the author-only primitive in the native
> pandoc document though, in case we want to be able to convert natbib
> to some other citation format with supports textual citations.

Yes, I guess that's true.

> >The problem with this is that
> >some authors might actually want the parens within parens.
> >Unfortunately, this choice can't be represented in a citeproc
> >style, since citeproc doesn't have a suppress-parens variant.
> >So maybe it would be better to make it an explicit option, triggered
> >by something like '~'.  Then we'd have three forms:
> >
> >[@doe99, p. 30]   =>   (Doe 1999, 30)
> >[~@doe99, p. 30]  =>   Doe 1999, 30
> >[-@doe99, p. 30]  =>   (1999, 30)
> 
> I don't think we should add special syntax to markdown which differs
> only per citation style. Either we want to support citation styles
> avoiding double parentheses and add the needed features to
> citeproc-hs, or we don't support it.

I'd be okay with not supporting it, at the beginning.  Maybe
Bruce or Andrea could comment on the feasibility of adding
the needed features to CSL or citeproc-hs.

> >Andrea also suggested that we needed a nocite variant.  I assume
> >he means something like bibtex's \nocite, which causes the entry
> >to be included in the list of references at the end, without
> >generating a citation in the text.
> >
> >One could just use another symbol:
> >
> >[&@doe99]
> >
> >Or one could use a special locator or prefix:
> >
> >[nocite @doe99; @smith04]
> >
> >I prefer the latter, just because its meaning is more obvious,
> >but it may be too English-centric for pandoc.
> 
> I prefer the latter too. I argued before that markdown syntax gets
> away with having special syntax for everything, because much is left
> to do with html directly and because of the rather limited scope. In
> my view, the target for pandoc should be to work with different
> output formats and support most/all features needed in academic
> writing. This removes html as fallback and widens the scope to a
> point where it is unfeasible to add special syntax for everything.

Another option would be this:

<nocite>
  [@doe99]
  [@smith04]
</nocite>

or

<nocite keys="doe99;smith04"/>

This has some advantages. First, we need to consider how our documents will
look when processed with non-pandoc markdown processors. Citations like
[@doe99, p. 30] will still be readable, but we want to make sure that the
"nocite" tags can be hidden. If we used a <nocite> block like this, people
could use css to hide the "nocite" section.

> Elsewhere we discussed adding arbitrary metadata to the header. We
> could support additional cites with a metadata key cites and avoid
> the special syntax. Any comments to this idea?

It might seem obtrusive to put a big list of citations right at
the beginning, in the header.  Ideally you want the document to
start at the beginning, without the kind of long preamble you
get in html or latex.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                                 ` <4CCED340.9010304-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-01 15:28                                   ` John MacFarlane
  2010-11-01 16:24                                   ` dsanson
  2010-11-01 19:21                                   ` John MacFarlane
  2 siblings, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-01 15:28 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Nathan Gass [Nov 01 10 15:48 ]:
> On 01.11.10 14:11, Bruce wrote:
> >On Nov 1, 6:01 am, Nathan Gass<xa...-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>  wrote:
> >>On 31.10.10 23:06, John MacFarlane wrote:
> >>
> >>>The main use that people have had in mind for the omit-author
> >>>form is for use in generating the following kind of output:
> >>
> >>>Doe (1999, 33) says that ...
> >>
> >>>You'd generate this with something like:
> >>
> >>>[+@doe99] [-@doe99, p. 33] says that ...
> >>
> >>>But now, what happens when you switch to a footnote citation style?
> >>>You end up with a footnote reference as the subject of your sentence --
> >>>not good!
> >>
> >>In my opinion [+@doe99] should always give you something suitable as a
> >>subject in any style. That is kind of the hole point calling them
> >>textual citations. So [+@doe99] would render as "Doe" in the text itself
> >>and not as a footnote, even in footnote style. I'd rather have it
> >>rendered as  `Doe ^[1999]` so, because this is by far the most common
> >>use-case for this feature. Still it should always give you something
> >>suitable as subject in any style.
> >
> >To me a "textual citation" as you seem to be using the term (printing
> >only the author) is not a citation, and so is out of scope.
> >Notwithstanding natbib tradition, can you give me an example of where
> >one would find such a citation in a document and have the entry it
> >referred to in the bibliography?
> 
> I was a bit sloppy in my usage here. Sorry about the confusion this caused.
> 
> For me, a textual citation is a citation which fulfills some
> grammatical role in a sentence. For example "Doe (1999: p.10)
> argues, that ...". I borrowed this term from
> http://merkel.zoneo.net/Latex/natbib.php, where it is opposed to
> parenthetical citation. I think this is what you mean by "standard
> in-text citations" too.
> 
>  Andreas author-only citation rendering [+doe1999@] as "Doe" would
> allow to construct such textual citations with `[+doe1999@]
> [-doe1999@p.10]`. I'd rather have the syntax `[+doe1999@p.10]` for
> this and not any syntax for author-only citations, as I said here
> and in another email in the same thread. My main point though was,
> that textual citations can be made to work with any citation style
> (including footnote style).
> 
> The natbib command for the given textual citation is
> \citet[p.10]{doe1999} and produces a real citation with
> corresponding entry in the bibliography. I don't know if there is
> something equivalent to the author-only citations in natbib. I'm
> therefore a bit at a loss why author-only citations are in any way a
> "natbib tradition".

natbib has \citeauthor

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                                 ` <4CCED340.9010304-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  2010-11-01 15:28                                   ` John MacFarlane
@ 2010-11-01 16:24                                   ` dsanson
       [not found]                                     ` <cacc908c-bd4b-435d-901a-7a66fb9cb4b5-f5wI9GJRwsKaNOhjBGSpuVYGCWtFR9XvQQ4Iyu8u01E@public.gmane.org>
  2010-11-01 19:21                                   ` John MacFarlane
  2 siblings, 1 reply; 107+ messages in thread
From: dsanson @ 2010-11-01 16:24 UTC (permalink / raw)
  To: pandoc-discuss

As a writer, it is rare that the name of an author I refer to
throughout the text changes (though it can happen), so it seems easy
enough to "hard-code" that data rather than relying on citeproc to
generate it.

So it seems to me that the important question is instead whether or
not different citation styles ever need to be aware of the occurrence
of the author's name in the text for any reason that goes beyond
omitting the author. If not, that information is, as Bruce says, out
of scope. (But perhaps the correct semantic interpretation of the
markup is not, 'citation processor: you should omit author', but
'citation processor: be aware that the author name is included in the
text and proceed as you see fit.')

So is a citation that would be rendered in author-date style as:

+   According to Doe (1999), "blah blah blah"

ever rendered by some other citation style in a way that depends on
knowing that the "Doe" is related to the "(1999)"? I could imagine a
guideline that insisted on something like

+   According to [Author 1] (1999), "blah blah blah"

but I've never come across any such thing. There is the medieval
practice of referring to authors indirectly by nickname, e.g.,
"Aristotle" becomes "The Philosopher". But I don't know of any
publishers who insist on a similar practice today, if only for lack of
sufficient appropriate nicknames. More plausibly, there might be some
journals that insist on "this author" when referring to one's own
publications, as some insist on "this journal" when referring to
themselves.

If that information matters, it is worth pointing out that neither of
these proposals encode it:

1.   According to Doe [-@1999], "blah blah blah"
2.   According to [+@doe1999] [-@doe1999], "blah blah blah",

Note that the plan to use (2) to generate natbib \citet citations
seems to assume incorrectly that (2) does encode this sort of
information; (2) also wrongly suggests, as Bruce points out, that
[+doe1999] could be a freestanding citation. So I think it is right to
reject (2), and the question is whether or not we need something
semantically richer than (1). If we do, I think it is going to be hard
to get (we'd need some system to bind the different citations together
(indices?), as well as some system indicate which parts to display).
Yuck.

David


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                                     ` <cacc908c-bd4b-435d-901a-7a66fb9cb4b5-f5wI9GJRwsKaNOhjBGSpuVYGCWtFR9XvQQ4Iyu8u01E@public.gmane.org>
@ 2010-11-01 17:16                                       ` John MacFarlane
       [not found]                                         ` <20101101171648.GA10110-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-11-01 17:46                                       ` Tillmann Rendel
  1 sibling, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-01 17:16 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ dsanson [Nov 01 10 09:24 ]:
> As a writer, it is rare that the name of an author I refer to
> throughout the text changes (though it can happen), so it seems easy
> enough to "hard-code" that data rather than relying on citeproc to
> generate it.
> 
> So it seems to me that the important question is instead whether or
> not different citation styles ever need to be aware of the occurrence
> of the author's name in the text for any reason that goes beyond
> omitting the author. If not, that information is, as Bruce says, out
> of scope. (But perhaps the correct semantic interpretation of the
> markup is not, 'citation processor: you should omit author', but
> 'citation processor: be aware that the author name is included in the
> text and proceed as you see fit.')
> 
> So is a citation that would be rendered in author-date style as:
> 
> +   According to Doe (1999), "blah blah blah"
> 
> ever rendered by some other citation style in a way that depends on
> knowing that the "Doe" is related to the "(1999)"? I could imagine a
> guideline that insisted on something like
> 
> +   According to [Author 1] (1999), "blah blah blah"
> 
> but I've never come across any such thing. There is the medieval
> practice of referring to authors indirectly by nickname, e.g.,
> "Aristotle" becomes "The Philosopher". But I don't know of any
> publishers who insist on a similar practice today, if only for lack of
> sufficient appropriate nicknames. More plausibly, there might be some
> journals that insist on "this author" when referring to one's own
> publications, as some insist on "this journal" when referring to
> themselves.
> 
> If that information matters, it is worth pointing out that neither of
> these proposals encode it:
> 
> 1.   According to Doe [-@1999], "blah blah blah"
> 2.   According to [+@doe1999] [-@doe1999], "blah blah blah",
> 
> Note that the plan to use (2) to generate natbib \citet citations
> seems to assume incorrectly that (2) does encode this sort of
> information;

Just to clarify: Nobody has proposed using (2) to generate \citet citations.
The LaTeX writer would use citeproc, too, bypassing bibtex entirely (this
way, you could be sure that the document would look the same in all output
formats). What Nathan wants is a way to go FROM latex+natbib to pandoc.

> (2) also wrongly suggests, as Bruce points out, that
> [+doe1999] could be a freestanding citation. So I think it is right to
> reject (2), and the question is whether or not we need something
> semantically richer than (1). If we do, I think it is going to be hard
> to get (we'd need some system to bind the different citations together
> (indices?), as well as some system indicate which parts to display).
> Yuck.

Right - this needs to be simple. My own view is that (1) would suffice.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                                         ` <20101101171648.GA10110-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-01 17:30                                           ` Bruce
  0 siblings, 0 replies; 107+ messages in thread
From: Bruce @ 2010-11-01 17:30 UTC (permalink / raw)
  To: pandoc-discuss



On Nov 1, 1:16 pm, John MacFarlane <fiddlosop...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> +++ dsanson [Nov 01 10 09:24 ]:
>
>
>
> > As a writer, it is rare that the name of an author I refer to
> > throughout the text changes (though it can happen), so it seems easy
> > enough to "hard-code" that data rather than relying on citeproc to
> > generate it.
>
> > So it seems to me that the important question is instead whether or
> > not different citation styles ever need to be aware of the occurrence
> > of the author's name in the text for any reason that goes beyond
> > omitting the author. If not, that information is, as Bruce says, out
> > of scope. (But perhaps the correct semantic interpretation of the
> > markup is not, 'citation processor: you should omit author', but
> > 'citation processor: be aware that the author name is included in the
> > text and proceed as you see fit.')
>
> > So is a citation that would be rendered in author-date style as:
>
> > +   According to Doe (1999), "blah blah blah"
>
> > ever rendered by some other citation style in a way that depends on
> > knowing that the "Doe" is related to the "(1999)"? I could imagine a
> > guideline that insisted on something like
>
> > +   According to [Author 1] (1999), "blah blah blah"
>
> > but I've never come across any such thing. There is the medieval
> > practice of referring to authors indirectly by nickname, e.g.,
> > "Aristotle" becomes "The Philosopher". But I don't know of any
> > publishers who insist on a similar practice today, if only for lack of
> > sufficient appropriate nicknames. More plausibly, there might be some
> > journals that insist on "this author" when referring to one's own
> > publications, as some insist on "this journal" when referring to
> > themselves.
>
> > If that information matters, it is worth pointing out that neither of
> > these proposals encode it:
>
> > 1.   According to Doe [-@1999], "blah blah blah"
> > 2.   According to [+@doe1999] [-@doe1999], "blah blah blah",
>
> > Note that the plan to use (2) to generate natbib \citet citations
> > seems to assume incorrectly that (2) does encode this sort of
> > information;
>
> Just to clarify: Nobody has proposed using (2) to generate \citet citations.
> The LaTeX writer would use citeproc, too, bypassing bibtex entirely (this
> way, you could be sure that the document would look the same in all output
> formats). What Nathan wants is a way to go FROM latex+natbib to pandoc.
>
> > (2) also wrongly suggests, as Bruce points out, that
> > [+doe1999] could be a freestanding citation. So I think it is right to
> > reject (2), and the question is whether or not we need something
> > semantically richer than (1). If we do, I think it is going to be hard
> > to get (we'd need some system to bind the different citations together
> > (indices?), as well as some system indicate which parts to display).
> > Yuck.
>
> Right - this needs to be simple. My own view is that (1) would suffice.

So do I need to consult with the xbib group on the citet variant
(where output is "Doe (1999)", or are you all taking this off the
table?

Bruce

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                                     ` <cacc908c-bd4b-435d-901a-7a66fb9cb4b5-f5wI9GJRwsKaNOhjBGSpuVYGCWtFR9XvQQ4Iyu8u01E@public.gmane.org>
  2010-11-01 17:16                                       ` John MacFarlane
@ 2010-11-01 17:46                                       ` Tillmann Rendel
       [not found]                                         ` <4CCEFCF8.4070805-jNDFPZUTrfTbB13WlS47k8u21/r88PR+s0AfqQuZ5sE@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: Tillmann Rendel @ 2010-11-01 17:46 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Hi,

dsanson wrote:
> As a writer, it is rare that the name of an author I refer to
> throughout the text changes (though it can happen), so it seems easy
> enough to "hard-code" that data rather than relying on citeproc to
> generate it.

With latex & natbib, I use \citeauthor for two purposes, both related to 
references with many authors.

First, with \citeauthor{...}, I do not have to remember the names of all 
coauthors.

And second, some citation styles propose the following handling of cited 
works with many authors: The first reference should read "A, B, C, and D 
proposed ... in their seminal paper (1976)", and a subsequent reference 
to the same thing should be "Unlike A et al.'s approach (1976), this 
work ...". I want the citation processor to figure out which mentioning 
of the author's names is the first in my work, and format the names 
accordingly.

> So it seems to me that the important question is instead whether or
> not different citation styles ever need to be aware of the occurrence
> of the author's name in the text for any reason that goes beyond
> omitting the author.

In output formats whichs support hyperlinks, the name of the author 
could be linked to the full reference.

In a later email, John McFarlane wrote:
> The LaTeX writer would use citeproc, too, bypassing bibtex entirely (this
> way, you could be sure that the document would look the same in all output
> formats).

I see that this would be the default. However, it would be great if 
pandoc would optionally support producing latex+bibtex output, too. At 
least in computer science, many conferences and journals provide their 
citation guidelines as bibtex configuration files. I want to use these 
configuration files together with pandoc.

   Tillmann


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                                         ` <4CCEFCF8.4070805-jNDFPZUTrfTbB13WlS47k8u21/r88PR+s0AfqQuZ5sE@public.gmane.org>
@ 2010-11-01 18:33                                           ` Bruce
  2010-11-01 19:02                                           ` John MacFarlane
  1 sibling, 0 replies; 107+ messages in thread
From: Bruce @ 2010-11-01 18:33 UTC (permalink / raw)
  To: pandoc-discuss



On Nov 1, 1:46 pm, Tillmann Rendel <ren...-jNDFPZUTrfTbB13WlS47k8u21/r88PR+s0AfqQuZ5sE@public.gmane.org>
wrote:

...

> I see that this would be the default. However, it would be great if
> pandoc would optionally support producing latex+bibtex output, too. At
> least in computer science, many conferences and journals provide their
> citation guidelines as bibtex configuration files. I want to use these
> configuration files together with pandoc.

You could always see if there are CSL equivalents, and if they're not,
help create them ;-)

Seriously, though, the zotero community is typically happy to help
with new styles.

Bruce

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                                         ` <4CCEFCF8.4070805-jNDFPZUTrfTbB13WlS47k8u21/r88PR+s0AfqQuZ5sE@public.gmane.org>
  2010-11-01 18:33                                           ` Bruce
@ 2010-11-01 19:02                                           ` John MacFarlane
       [not found]                                             ` <20101101190204.GA11857-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-01 19:02 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Tillmann Rendel [Nov 01 10 18:46 ]:
> Hi,
> 
> dsanson wrote:
> >As a writer, it is rare that the name of an author I refer to
> >throughout the text changes (though it can happen), so it seems easy
> >enough to "hard-code" that data rather than relying on citeproc to
> >generate it.
> 
> With latex & natbib, I use \citeauthor for two purposes, both
> related to references with many authors.
> 
> First, with \citeauthor{...}, I do not have to remember the names of
> all coauthors.
> 
> And second, some citation styles propose the following handling of
> cited works with many authors: The first reference should read "A,
> B, C, and D proposed ... in their seminal paper (1976)", and a
> subsequent reference to the same thing should be "Unlike A et al.'s
> approach (1976), this work ...". I want the citation processor to
> figure out which mentioning of the author's names is the first in my
> work, and format the names accordingly.

I can see how this might be useful. On the other hand, we have to
balance flexibility against simplicity. In this case, I'm not sure
the utility of this feature is worth the added complexity.

> >So it seems to me that the important question is instead whether or
> >not different citation styles ever need to be aware of the occurrence
> >of the author's name in the text for any reason that goes beyond
> >omitting the author.
> 
> In output formats whichs support hyperlinks, the name of the author
> could be linked to the full reference.
> 
> In a later email, John McFarlane wrote:
> >The LaTeX writer would use citeproc, too, bypassing bibtex entirely (this
> >way, you could be sure that the document would look the same in all output
> >formats).
> 
> I see that this would be the default. However, it would be great if
> pandoc would optionally support producing latex+bibtex output, too.
> At least in computer science, many conferences and journals provide
> their citation guidelines as bibtex configuration files. I want to
> use these configuration files together with pandoc.

Yes, that makes sense.  Well, I'm happy to have an option to make
the LaTeX writer output some kind of bibtex output. (But should
it be natbib?  I've moved on to using biblatex myself.)

I had been thinking that this could be done even without \citet or \citeauthor
equivalents in the pandoc citation syntax:

Doe [-@doe99, p. 30]  => Doe \citeyear[p. 30]{doe99}

But this wouldn't work for non-author-date styles. (It doesn't work with a
plain latex numerical style, for example.) Adding a \citet variant doesn't
help, because \citet also seems to fail for non-author-date styles.

Perhaps this is okay; if you're using the bibtex option with the
latex writer, you shouldn't use the suppress-author citations
unless you're going to use an author-date style.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                                 ` <4CCED340.9010304-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  2010-11-01 15:28                                   ` John MacFarlane
  2010-11-01 16:24                                   ` dsanson
@ 2010-11-01 19:21                                   ` John MacFarlane
       [not found]                                     ` <20101101192157.GB11857-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-01 19:21 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

> About the second point: I think it is a bigger negative to have a
> huge list of seldom used special chars in citations, which nobody
> can remember ([-@doe1999], [~@doe1999], [@doe1999], [+@doe1999],
> [&@doe1999] etc).

Here's a thought.  What if we had pandoc treat a citation as
a "textual citation" if it is followed by a space, tab, or newline,
and otherwise as a regular citation?

So:

[@doe99, p. 20] says that P.  But the evidence shows that Q [@smith04].

would turn into

Doe (1999, 20) says that P. But the evidence shows that Q (Smith 2004).

If this would suffice, then we could do without the special modifiers
that make citations look like perl variables.

Unfortunately there seem to be a few cases where this default wouldn't
be right:

I'm not sure whether the pests are rats [@smith04] or mice [@jones05].

P, says [@jones99].

We could include special characters for the rare cases when it's
necessary to defeat the default, so you could do:

I'm not sure whether the pests are rats [* @smith04] or mice [@jones05].

P, says [+ @jones99].

In general, you wouldn't have to use them.  Many people could get by
without even knowing them. But maybe this default is too complex.

This decision is independent of whether the "textual citation" support
is to be added to citeproc or "faked" by pandoc and citeproc-hs.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citation syntax
       [not found]                                             ` <20101101190204.GA11857-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-01 20:31                                               ` Andrea Rossato
  2010-11-01 23:46                                                 ` Bruce
  2010-11-02  1:08                                                 ` John MacFarlane
  0 siblings, 2 replies; 107+ messages in thread
From: Andrea Rossato @ 2010-11-01 20:31 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Mon, Nov 01, 2010 at 12:02:04PM -0700, John MacFarlane wrote:
> +++ Tillmann Rendel [Nov 01 10 18:46 ]:
> > And second, some citation styles propose the following handling of
> > cited works with many authors: The first reference should read "A,
> > B, C, and D proposed ... in their seminal paper (1976)", and a
> > subsequent reference to the same thing should be "Unlike A et al.'s
> > approach (1976), this work ...". I want the citation processor to
> > figure out which mentioning of the author's names is the first in my
> > work, and format the names accordingly.
> 
> I can see how this might be useful. On the other hand, we have to
> balance flexibility against simplicity. In this case, I'm not sure
> the utility of this feature is worth the added complexity.
> 

citeproc-hs already does that as CSL requires the processor to set a
"position" variable which can either be set to "first", "subsequent",
"ibid", etc. if I understand the problem correctly.

Andrea

ps: sorry if I'm a bit behind all this discussion about the syntax:
I'm working to provide the API, but I'm afraid I'm not useful in
providing meaningful comments. I'm presently fixing the (three)
footnote generation probelms reported by John. I already have some
code but I will come back to that. I also seem to have an issue with
google marking my post as spam (I checked my SMTP server and it
apparenlty isn't listed in any black-lists I'm aware of).


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
  2010-11-01 20:31                                               ` Andrea Rossato
@ 2010-11-01 23:46                                                 ` Bruce
  2010-11-02  1:08                                                 ` John MacFarlane
  1 sibling, 0 replies; 107+ messages in thread
From: Bruce @ 2010-11-01 23:46 UTC (permalink / raw)
  To: pandoc-discuss



On Nov 1, 4:31 pm, Andrea Rossato <andrea.ross...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Mon, Nov 01, 2010 at 12:02:04PM -0700, John MacFarlane wrote:
> > +++ Tillmann Rendel [Nov 01 10 18:46 ]:
> > > And second, some citation styles propose the following handling of
> > > cited works with many authors: The first reference should read "A,
> > > B, C, and D proposed ... in their seminal paper (1976)", and a
> > > subsequent reference to the same thing should be "Unlike A et al.'s
> > > approach (1976), this work ...". I want the citation processor to
> > > figure out which mentioning of the author's names is the first in my
> > > work, and format the names accordingly.
>
> > I can see how this might be useful. On the other hand, we have to
> > balance flexibility against simplicity. In this case, I'm not sure
> > the utility of this feature is worth the added complexity.
>
> citeproc-hs already does that as CSL requires the processor to set a
> "position" variable which can either be set to "first", "subsequent",
> "ibid", etc. if I understand the problem correctly.

Yes, CSL does support pretty complex positional behavior like this.
But it doesn't support the "active" form of citation he used above
(where the author name is placed outside the parentheses).

Bruce

> Andrea
>
> ps: sorry if I'm a bit behind all this discussion about the syntax:
> I'm working to provide the API, but I'm afraid I'm not useful in
> providing meaningful comments. I'm presently fixing the (three)
> footnote generation probelms reported by John. I already have some
> code but I will come back to that. I also seem to have an issue with
> google marking my post as spam (I checked my SMTP server and it
> apparenlty isn't listed in any black-lists I'm aware of).

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
  2010-11-01 20:31                                               ` Andrea Rossato
  2010-11-01 23:46                                                 ` Bruce
@ 2010-11-02  1:08                                                 ` John MacFarlane
  1 sibling, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-02  1:08 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 01 10 21:31 ]:
> On Mon, Nov 01, 2010 at 12:02:04PM -0700, John MacFarlane wrote:
> > +++ Tillmann Rendel [Nov 01 10 18:46 ]:
> > > And second, some citation styles propose the following handling of
> > > cited works with many authors: The first reference should read "A,
> > > B, C, and D proposed ... in their seminal paper (1976)", and a
> > > subsequent reference to the same thing should be "Unlike A et al.'s
> > > approach (1976), this work ...". I want the citation processor to
> > > figure out which mentioning of the author's names is the first in my
> > > work, and format the names accordingly.
> > 
> > I can see how this might be useful. On the other hand, we have to
> > balance flexibility against simplicity. In this case, I'm not sure
> > the utility of this feature is worth the added complexity.
> > 
> 
> citeproc-hs already does that as CSL requires the processor to set a
> "position" variable which can either be set to "first", "subsequent",
> "ibid", etc. if I understand the problem correctly.

Yes, that's true.  I meant to be commenting specifically on author-only,
but you're right, CSL styles can be sensitive to position.

> 
> ps: sorry if I'm a bit behind all this discussion about the syntax:
> I'm working to provide the API, but I'm afraid I'm not useful in
> providing meaningful comments. I'm presently fixing the (three)
> footnote generation probelms reported by John. I already have some
> code but I will come back to that. I also seem to have an issue with
> google marking my post as spam (I checked my SMTP server and it
> apparenlty isn't listed in any black-lists I'm aware of).

This one got through, so maybe it has finally learned that you're OK.


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citation syntax
       [not found]                                     ` <20101101192157.GB11857-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-02  7:28                                       ` Nathan Gass
  0 siblings, 0 replies; 107+ messages in thread
From: Nathan Gass @ 2010-11-02  7:28 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 01.11.10 20:21, John MacFarlane wrote:
>> About the second point: I think it is a bigger negative to have a
>> huge list of seldom used special chars in citations, which nobody
>> can remember ([-@doe1999], [~@doe1999], [@doe1999], [+@doe1999],
>> [&@doe1999] etc).
>
> Here's a thought.  What if we had pandoc treat a citation as
> a "textual citation" if it is followed by a space, tab, or newline,
> and otherwise as a regular citation?
>
> So:
>
> [@doe99, p. 20] says that P.  But the evidence shows that Q [@smith04].
>
> would turn into
>
> Doe (1999, 20) says that P. But the evidence shows that Q (Smith 2004).
>
> If this would suffice, then we could do without the special modifiers
> that make citations look like perl variables.

My preferred syntax currently looks like this:

     Some argument [see doe99: p.10].
     Doe's [-doe99: p.10] argument implies ...
     As said by [+doe99: p.10], ...
     [nocite doe99:]

Not very perlish ;-).

>
> Unfortunately there seem to be a few cases where this default wouldn't
> be right:
>
> I'm not sure whether the pests are rats [@smith04] or mice [@jones05].
>
> P, says [@jones99].
>
> We could include special characters for the rare cases when it's
> necessary to defeat the default, so you could do:
>
> I'm not sure whether the pests are rats [* @smith04] or mice [@jones05].
>
> P, says [+ @jones99].
>
> In general, you wouldn't have to use them.  Many people could get by
> without even knowing them. But maybe this default is too complex.

Yes I think this is too complex, and moreover believe most people using 
citations would need to know the special forms anyway. I do think 
textual citations are common enough to warrant a special char (that is 
*not* an argument for a \citet-like construct in markdown, as I consider 
the author-less version [-@doe99] specially made for textual citations too).

I was arguing against [&@doe99] for nocite. As this is definitely 
pushing it too far. I can see of course that the same argument can be 
made against [+@doe99] (working like \citet, not as it is currently 
implemented). For me, the benefit of the latter is worth the special 
syntax and the + is a little bit more mnemonic than & for nocite. But I 
can understand if you decide to simplify even more and have only one 
special char, just don't make it 3-4 special char variants. YMMV

Nathan

>
> This decision is independent of whether the "textual citation" support
> is to be added to citeproc or "faked" by pandoc and citeproc-hs.
>
> John
>


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: supporting both parenthetical and footnote citations (was: citeproc updates)
       [not found]                 ` <20101030160608.GA4075-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-10-30 20:18                   ` Bruce
  2010-10-31 15:58                   ` Bruce
@ 2010-11-02 21:39                   ` Andrea Rossato
       [not found]                     ` <20101102213938.GE18764-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-02 21:39 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 3498 bytes --]

On Sat, Oct 30, 2010 at 09:06:08AM -0700, John MacFarlane wrote:
> > Then a citation group: 1.
> 
> > 1. See Doe 2000, p. 34-35; also Doe 2002, chap. 3
> 
> Here there are three problems:
> 
> - The footnote is missing a period, since the period was outside
>   of the citation
> - The period occurs after the footnote.
> - There is a space before the footnote.
> 
> It's a bit hard to see how to resolve these issues, given that, when
> pandoc is actually parsing the citation, it won't know whether the
> output is to be rendered using a footnote style or an author/date
> style.
> 
> I noticed another issue with Andrea's test:  the note says simply
> "at 3" instead of "chap. 3" -- apparently the number is being stored
> and the rest ignored.  This seems undesirable.
> 
> My own inclination, if there's no obvious solution, is to optimize pandoc for
> the author-date format. After all, if people want citations in footnotes, they
> can always manually insert footnotes.
> 


I think these issues can be solved (I didn't notice them since I was
in a hurry and mostly focused on the citeproc side of the problems).
I'm attaching a patch to the citeproc branch which deals with them.

 * The first: some of the styles coming with the test-suite usually
   put a final "." suffix in the citation layout. I don't know if this
   is a good style coding practice (I tend to think it is not), but
   while this usually solves the problem you reported it usually
   messes up normal footnotes ending with a citation followed by a
   period (which makes sense in markdown). I updated my test to check
   this. This issue requires looking for duplicate punctuations at the
   end of a footnote paragraph;

 * the second: this could be seen as a more general problem. Sometimes
   I'm required to provide documents where footnote marks follow a
   punctuation, while other times I'm require to do the opposite
   (footnote mark *before* punctuation). It would be nice if pandoc
   could take care of that (it wouldn't be difficult for the markdown
   parser I think) with a command line option. Otherwise the problem
   can be solved the same way I solved the third one;

 * the third: the fix of this problem, and the first one, could
   possibly be simpler (there must be something I do not get about
   generics since I'm not able to match a [Inline] without matching
   every single Inline). I don't know how much all this processing
   impacts the performance (beyond the burden of the csl processor):
   some benchmarking would be nice. I have the feeling a better
   approach could be studied.

I updated the test.markdown file to test the new code with two note
styles comming with the test-suite (mhra.csl and
chicago-fullnote-bibliography-bb.csl) and an in-text style
(chicago-author-date.csl). the outcomes are prova1a.html, prova1b,html
and prova2.html respectively.

Everything can be found here:
http://gorgias.mine.nu/citeproc/

The outcome seems to be consistent, at least at a first glance.

The second problem remains unsolved and I'm waiting for opinions.

Andrea

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.


[-- Attachment #2: 0001-Changes-to-use-citeproc-0.3.patch --]
[-- Type: text/plain, Size: 9520 bytes --]

From ac06ca2b00f1c0b25b02b1e25aca8dd28961240d Mon Sep 17 00:00:00 2001
From: John MacFarlane <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org>
Date: Wed, 27 Oct 2010 18:22:45 -0700
Subject: [PATCH 1/2] Changes to use citeproc 0.3.

Patch from Andrea Rossato.
Note: the markdown syntax is preliminary and will probably change.
---
 src/Text/Pandoc/Biblio.hs           |   75 ++++++++++++++++++++++++++++++-----
 src/Text/Pandoc/Definition.hs       |   16 +++++++-
 src/Text/Pandoc/Readers/Markdown.hs |   32 +++++++++------
 src/pandoc.hs                       |    2 +-
 4 files changed, 101 insertions(+), 24 deletions(-)

diff --git a/src/Text/Pandoc/Biblio.hs b/src/Text/Pandoc/Biblio.hs
index 436eadd..cbf6191 100644
--- a/src/Text/Pandoc/Biblio.hs
+++ b/src/Text/Pandoc/Biblio.hs
@@ -22,7 +22,7 @@ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
    Copyright   : Copyright (C) 2008 Andrea Rossato
    License     : GNU GPL, version 2 or above
 
-   Maintainer  : Andrea Rossato <andrea.rossato-/Q1r7N5in3P/wltNWqQaag@public.gmane.org>
+   Maintainer  : Andrea Rossato <andrea.rossato-3IIOeSMMxS4@public.gmane.org>
    Stability   : alpha
    Portability : portable
 -}
@@ -31,7 +31,9 @@ module Text.Pandoc.Biblio ( processBiblio ) where
 
 import Control.Monad ( when )
 import Data.List
-import Text.CSL
+import Data.Unique
+import Text.CSL hiding ( Cite(..), Citation(..) )
+import qualified Text.CSL as CSL ( Cite(..) )
 import Text.Pandoc.Definition
 
 -- | Process a 'Pandoc' document by adding citations formatted
@@ -42,25 +44,78 @@ processBiblio cf r p
       else do
         when (null cf) $ error "Missing the needed citation style file"
         csl  <- readCSLFile cf
-        let groups     = queryWith getCite p
-            result     = citeproc csl r groups
+        p'   <- if styleClass csl == "note"
+                then processNote p
+                else processWithM setHash p
+        let groups     = if styleClass csl /= "note"
+                         then queryWith getCitation p'
+                         else getNoteCitations p'
+            result     = citeproc' csl r (setNearNote csl $ map (map toCslCite) groups)
             cits_map   = zip groups (citations result)
             biblioList = map (read . renderPandoc' csl) (bibliography result)
-            Pandoc m b = processWith (processCite csl cits_map) p
+            Pandoc m b = processWith (processCite csl cits_map) p'
         return $ Pandoc m $ b ++ biblioList
 
 -- | Substitute 'Cite' elements with formatted citations.
-processCite :: Style -> [([Target],[FormattedOutput])] -> Inline -> Inline
+processCite :: Style -> [([Citation],[FormattedOutput])] -> Inline -> Inline
 processCite s cs il
     | Cite t _ <- il = Cite t (process t)
     | otherwise      = il
     where
-      process t = case elemIndex t (map fst cs) of
-                    Just i -> read . renderPandoc s $ snd (cs !! i)
+      process t = case lookup t cs of
+                    Just  i -> read $ renderPandoc s i
                     Nothing -> [Str ("Error processing " ++ show t)]
 
 -- | Retrieve all citations from a 'Pandoc' docuument. To be used with
 -- 'queryWith'.
-getCite :: Inline -> [[(String,String)]]
-getCite i | Cite t _ <- i = [t]
+getCitation :: Inline -> [[Citation]]
+getCitation i | Cite t _ <- i = [t]
+          | otherwise         = []
+
+getNote :: Inline -> [Inline]
+getNote i | Note _ <- i = [i]
+          | otherwise   = []
+
+getCite :: Inline -> [Inline]
+getCite i | Cite _ _ <- i = [i]
           | otherwise     = []
+
+getNoteCitations :: Pandoc -> [[Citation]]
+getNoteCitations
+    = let cits = concat . flip (zipWith $ setCiteNoteNum) [1..] .
+                 map (queryWith getCite) . queryWith getNote
+      in  queryWith getCitation . cits
+
+setHash :: Citation -> IO Citation
+setHash (Citation i p l nn ao na _)
+    = hashUnique `fmap` newUnique >>= return . Citation i p l nn ao na
+
+processNote :: Pandoc  -> IO Pandoc
+processNote p = do
+  p' <- processWithM setHash p
+  let cits     = queryWith getCite p'
+      ncits    = map (queryWith getCite) $ queryWith getNote p'
+      needNote = cits \\ concat ncits
+  return $ processWith (mvCiteInNote needNote) p'
+
+mvCiteInNote :: [Inline] -> Inline -> Inline
+mvCiteInNote is i = if i `elem` is then Note [Para [i]] else i
+
+setCiteNoteNum :: [Inline] -> Int -> [Inline]
+setCiteNoteNum ((Cite cs o):xs) n = Cite (setCitationNoteNum n cs) o : setCiteNoteNum xs n
+setCiteNoteNum               _  _ = []
+
+setCitationNoteNum :: Int -> [Citation] -> [Citation]
+setCitationNoteNum i = map $ \c -> c { citationNoteNum = i}
+
+toCslCite :: Citation -> CSL.Cite
+toCslCite (Citation i p l nn ao na _)
+    = let (la,lo) = parseLocator l
+      in   emptyCite { CSL.citeId         = i
+                     , CSL.citePrefix     = p
+                     , CSL.citeLabel      = la
+                     , CSL.citeLocator    = lo
+                     , CSL.citeNoteNumber = show nn
+                     , CSL.authorOnly     = ao
+                     , CSL.suppressAuthor = na
+                     }
diff --git a/src/Text/Pandoc/Definition.hs b/src/Text/Pandoc/Definition.hs
index fffca3b..bec216b 100644
--- a/src/Text/Pandoc/Definition.hs
+++ b/src/Text/Pandoc/Definition.hs
@@ -112,7 +112,7 @@ data Inline
     | Subscript [Inline]    -- ^ Subscripted text (list of inlines)
     | SmallCaps [Inline]    -- ^ Small caps text (list of inlines)
     | Quoted QuoteType [Inline] -- ^ Quoted text (list of inlines)
-    | Cite [Target] [Inline] -- ^ Citation (list of inlines)
+    | Cite [Citation]  [Inline] -- ^ Citation (list of inlines)
     | Code String           -- ^ Inline code (literal)
     | Space                 -- ^ Inter-word space
     | EmDash                -- ^ Em dash
@@ -129,6 +129,20 @@ data Inline
     | Note [Block]          -- ^ Footnote or endnote 
     deriving (Show, Eq, Ord, Read, Typeable, Data)
 
+data Citation = Citation { citationId      :: String
+                         , citationPrefix  :: String
+                         , citationLocator :: String
+                         , citationNoteNum :: Int
+                         , citationAutOnly :: Bool
+                         , citationNoAut   :: Bool
+                         , citationHash    :: Int
+                         }
+                deriving (Show, Ord, Read, Typeable, Data)
+
+instance Eq Citation where
+    (==) (Citation _ _ _ _ _ _ ha)
+         (Citation _ _ _ _ _ _ hb) = ha == hb
+
 -- | Applies a transformation on @a@s to matching elements in a @b@.
 processWith :: (Data a, Data b) => (a -> a) -> b -> b
 processWith f = everywhere (mkT f)
diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs
index 8c6a90e..030da91 100644
--- a/src/Text/Pandoc/Readers/Markdown.hs
+++ b/src/Text/Pandoc/Readers/Markdown.hs
@@ -1316,27 +1316,35 @@ inlineCitation = try $ do
      then return $ Cite citations []
      else fail "no citation found"
 
-chkCit :: Target -> GenParser Char ParserState (Maybe Target)
+chkCit :: Citation -> GenParser Char ParserState (Maybe Citation)
 chkCit t = do
   st <- getState
-  case lookupKeySrc (stateKeys st) (Key [Str $ fst t]) of
+  case lookupKeySrc (stateKeys st) (Key [Str $ citationId t]) of
      Just  _ -> fail "This is a link"
-     Nothing -> if elem (fst t) $ stateCitations st
+     Nothing -> if elem (citationId t) $ stateCitations st
                    then return $ Just t
                    else return $ Nothing
 
 citeMarker :: GenParser Char ParserState String
 citeMarker = char '[' >> manyTill ( noneOf "\n" <|> (newline >>~ notFollowedBy blankline) ) (char ']')
 
-parseCitation :: GenParser Char ParserState [(String,String)]
-parseCitation = try $ sepBy (parseLabel) (oneOf ";")
+parseCitation :: GenParser Char ParserState [Citation]
+parseCitation = try $ sepBy (parseLabel) (skipMany1 $ char ';')
 
-parseLabel :: GenParser Char ParserState (String,String)
+parseLabel :: GenParser Char ParserState Citation
 parseLabel = try $ do
-  res <- sepBy (skipSpaces >> optional newline >> skipSpaces >> many1 (noneOf "@;")) (oneOf "@")
-  case res of
-    [lab,loc] -> return (lab, loc)
-    [lab]     -> return (lab, "" )
-    _         -> return ("" , "" )
-
+  r <- many (noneOf ";")
+  let t' s = if s /= [] then tail s else []
+      trim = unwords . words
+      pref =      takeWhile (/= '@') r
+      rest = t' $ dropWhile (/= '@') r
+      cit  =      takeWhile (/= ',') rest
+      loc  = t' $ dropWhile (/= ',') rest
+      (p,na) = if pref /= [] && last pref == '-'
+               then (init pref, True )
+               else (pref     , False)
+      (p',o) = if p /= [] && last p == '+'
+               then (init p   , True )
+               else (p        , False)
+  return $ Citation cit (trim p') (trim loc) 0 o na 0
 #endif
diff --git a/src/pandoc.hs b/src/pandoc.hs
index 082e337..c8c414a 100644
--- a/src/pandoc.hs
+++ b/src/pandoc.hs
@@ -789,7 +789,7 @@ main = do
                                                      lhsExtension sources,
                               stateStandalone      = standalone',
 #ifdef _CITEPROC
-                              stateCitations       = map citeKey refs,
+                              stateCitations       = map refId refs,
 #endif
                               stateSmart           = smart || writerName' `elem`
                                                               ["latex", "context", "latex+lhs", "man"],
-- 
1.7.1


[-- Attachment #3: 0002-improve-the-footnote-generation-of-in-text-citations.patch --]
[-- Type: text/plain, Size: 7903 bytes --]

From 5f270557b9fe26689b7b5df2c959860f34d07c85 Mon Sep 17 00:00:00 2001
From: Andrea Rossato <andrea.rossato-/Q1r7N5in3P/wltNWqQaag@public.gmane.org>
Date: Tue, 2 Nov 2010 21:58:58 +0100
Subject: [PATCH 2/2] improve the footnote generation of in-text citations withe note csl styles

---
 src/Text/Pandoc/Biblio.hs |  135 +++++++++++++++++++++++++++++++++++++--------
 1 files changed, 111 insertions(+), 24 deletions(-)

diff --git a/src/Text/Pandoc/Biblio.hs b/src/Text/Pandoc/Biblio.hs
index cbf6191..2f773c7 100644
--- a/src/Text/Pandoc/Biblio.hs
+++ b/src/Text/Pandoc/Biblio.hs
@@ -30,6 +30,7 @@ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 module Text.Pandoc.Biblio ( processBiblio ) where
 
 import Control.Monad ( when )
+import Data.Char ( toUpper, isPunctuation )
 import Data.List
 import Data.Unique
 import Text.CSL hiding ( Cite(..), Citation(..) )
@@ -44,17 +45,18 @@ processBiblio cf r p
       else do
         when (null cf) $ error "Missing the needed citation style file"
         csl  <- readCSLFile cf
-        p'   <- if styleClass csl == "note"
-                then processNote p
-                else processWithM setHash p
-        let groups     = if styleClass csl /= "note"
-                         then queryWith getCitation p'
-                         else getNoteCitations p'
-            result     = citeproc' csl r (setNearNote csl $ map (map toCslCite) groups)
-            cits_map   = zip groups (citations result)
+        p'   <- processWithM setHash p
+        let (nts,grps) = if styleClass csl /= "note"
+                         then (,) [] $ queryWith getCitation p'
+                         else let cits   = queryWith getCite p'
+                                  ncits  = map (queryWith getCite) $ queryWith getNote p'
+                                  needNt = cits \\ concat ncits
+                              in (,) needNt $ getNoteCitations needNt p'
+            result     = citeproc' csl r (setNearNote csl $ map (map toCslCite) grps)
+            cits_map   = zip grps (citations result)
             biblioList = map (read . renderPandoc' csl) (bibliography result)
             Pandoc m b = processWith (processCite csl cits_map) p'
-        return $ Pandoc m $ b ++ biblioList
+        return . generateNotes nts . Pandoc m $ b ++ biblioList
 
 -- | Substitute 'Cite' elements with formatted citations.
 processCite :: Style -> [([Citation],[FormattedOutput])] -> Inline -> Inline
@@ -70,7 +72,7 @@ processCite s cs il
 -- 'queryWith'.
 getCitation :: Inline -> [[Citation]]
 getCitation i | Cite t _ <- i = [t]
-          | otherwise         = []
+              | otherwise     = []
 
 getNote :: Inline -> [Inline]
 getNote i | Note _ <- i = [i]
@@ -80,26 +82,111 @@ getCite :: Inline -> [Inline]
 getCite i | Cite _ _ <- i = [i]
           | otherwise     = []
 
-getNoteCitations :: Pandoc -> [[Citation]]
-getNoteCitations
-    = let cits = concat . flip (zipWith $ setCiteNoteNum) [1..] .
-                 map (queryWith getCite) . queryWith getNote
-      in  queryWith getCitation . cits
+getNoteCitations :: [Inline] -> Pandoc -> [[Citation]]
+getNoteCitations needNote
+    = let mvCite i = if i `elem` needNote then Note [Para [i]] else i
+          setNote  = processWith mvCite
+          getCits  = concat . flip (zipWith $ setCiteNoteNum) [1..] .
+                     map (queryWith getCite) . queryWith getNote . setNote
+      in  queryWith getCitation . getCits
 
 setHash :: Citation -> IO Citation
 setHash (Citation i p l nn ao na _)
     = hashUnique `fmap` newUnique >>= return . Citation i p l nn ao na
 
-processNote :: Pandoc  -> IO Pandoc
-processNote p = do
-  p' <- processWithM setHash p
-  let cits     = queryWith getCite p'
-      ncits    = map (queryWith getCite) $ queryWith getNote p'
-      needNote = cits \\ concat ncits
-  return $ processWith (mvCiteInNote needNote) p'
+generateNotes :: [Inline] -> Pandoc -> Pandoc
+generateNotes needNote = processWith (mvCiteInNote needNote)
 
-mvCiteInNote :: [Inline] -> Inline -> Inline
-mvCiteInNote is i = if i `elem` is then Note [Para [i]] else i
+procInlines :: ([Inline] -> [Inline]) -> Block -> Block
+procInlines f b
+    | Plain    inls <- b = Plain    $ f inls
+    | Para     inls <- b = Para     $ f inls
+    | Header i inls <- b = Header i $ f inls
+    | otherwise          = b
+
+mvCiteInNote :: [Inline] -> Block -> Block
+mvCiteInNote is = procInlines mvCite
+    where
+      elem_ x xs = case x of Cite cs _ -> (Cite cs []) `elem` xs; _ -> False
+      mvCite :: [Inline] -> [Inline]
+      mvCite inls
+          | x:i:xs <- inls
+          , x == Space,   i `elem_` is = mvInNote i : mvCite xs
+          | i:xs <- inls, i `elem_` is = mvInNote i : mvCite xs
+          | i:xs <- inls, Note _ <- i  = checkNt  i : mvCite xs
+          | i:xs <- inls               = i          : mvCite xs
+          | otherwise                  = []
+      mvInNote i
+          | Cite t o <- i = Note [Para [Cite t $ toCapital o]]
+          | otherwise     = Note [Para [i                   ]]
+      checkPt i = if and (map isPunctuation $ lastInline i) &&
+                     lastInline i == lastInline (tailInline i)
+                  then initInline i
+                  else i
+      checkNt   = processWith $ procInlines checkPt
+
+lastInline :: [Inline] -> String
+lastInline [] = []
+lastInline (i:[])
+    | Str s <- i = last' s
+    | Space <- i = " "
+    | otherwise  = lastInline $ getInline i
+    where
+      last' s = if s /= [] then [last s] else []
+lastInline (_:xs) = lastInline xs
+
+initInline :: [Inline] -> [Inline]
+initInline [] = []
+initInline (i:[])
+    | Str          s <- i = return $ Str (init' s)
+    | Emph        is <- i = return $ Emph        (initInline is)
+    | Strong      is <- i = return $ Strong      (initInline is)
+    | Strikeout   is <- i = return $ Strikeout   (initInline is)
+    | Superscript is <- i = return $ Superscript (initInline is)
+    | Subscript   is <- i = return $ Subscript   (initInline is)
+    | Quoted q    is <- i = return $ Quoted q    (initInline is)
+    | SmallCaps   is <- i = return $ SmallCaps   (initInline is)
+    | Link      is t <- i = return $ Link        (initInline is) t
+    | otherwise           = []
+    where
+      init' s = if s /= [] then init s else []
+initInline (i:xs) = i : initInline xs
+
+toCapital :: [Inline] -> [Inline]
+toCapital = mapHeadInline toCap
+    where
+      toCap s = if s /= [] then toUpper (head s) : tail s else []
+
+tailInline :: [Inline] -> [Inline]
+tailInline = mapHeadInline tail'
+    where
+      tail' s = if s /= [] then tail s else []
+
+mapHeadInline :: (String -> String) -> [Inline] -> [Inline]
+mapHeadInline _ [] = []
+mapHeadInline f (i:xs)
+    | Str          s <- i = Str         (f                s)   : xs
+    | Emph        is <- i = Emph        (mapHeadInline f is)   : xs
+    | Strong      is <- i = Strong      (mapHeadInline f is)   : xs
+    | Strikeout   is <- i = Strikeout   (mapHeadInline f is)   : xs
+    | Superscript is <- i = Superscript (mapHeadInline f is)   : xs
+    | Subscript   is <- i = Subscript   (mapHeadInline f is)   : xs
+    | Quoted q    is <- i = Quoted q    (mapHeadInline f is)   : xs
+    | SmallCaps   is <- i = SmallCaps   (mapHeadInline f is)   : xs
+    | Link      is t <- i = Link        (mapHeadInline f is) t : xs
+    | otherwise           = []
+
+getInline :: Inline -> [Inline]
+getInline i
+    | Emph        is <- i = is
+    | Strong      is <- i = is
+    | Strikeout   is <- i = is
+    | Superscript is <- i = is
+    | Subscript   is <- i = is
+    | Quoted _    is <- i = is
+    | SmallCaps   is <- i = is
+    | Link      is _ <- i = is
+    | otherwise           = []
 
 setCiteNoteNum :: [Inline] -> Int -> [Inline]
 setCiteNoteNum ((Cite cs o):xs) n = Cite (setCitationNoteNum n cs) o : setCiteNoteNum xs n

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: Re: supporting both parenthetical and footnote citations (was: citeproc updates)
       [not found]                     ` <20101102213938.GE18764-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-02 23:58                       ` Andrea Rossato
  2010-11-03  4:10                       ` John MacFarlane
  1 sibling, 0 replies; 107+ messages in thread
From: Andrea Rossato @ 2010-11-02 23:58 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 666 bytes --]

On Tue, Nov 02, 2010 at 10:39:39PM +0100, Andrea Rossato wrote:
> I'm attaching a patch to the citeproc branch which deals with them.

forget the previous 0002-etc patch as it was not general enough.
please test instead the attached one.

andrea

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.


[-- Attachment #2: 0002-improve-footnote-generation-of-in-text-citations-wit.patch --]
[-- Type: text/plain, Size: 8098 bytes --]

From 3765807eddbd7f63c7156b49c93c2e618b560f2a Mon Sep 17 00:00:00 2001
From: Andrea Rossato <andrea.rossato-/Q1r7N5in3P/wltNWqQaag@public.gmane.org>
Date: Wed, 3 Nov 2010 00:53:06 +0100
Subject: [PATCH 2/2] improve footnote generation of in-text citations with note csl styles

---
 src/Text/Pandoc/Biblio.hs |  142 +++++++++++++++++++++++++++++++++++++--------
 1 files changed, 118 insertions(+), 24 deletions(-)

diff --git a/src/Text/Pandoc/Biblio.hs b/src/Text/Pandoc/Biblio.hs
index cbf6191..d4b72c9 100644
--- a/src/Text/Pandoc/Biblio.hs
+++ b/src/Text/Pandoc/Biblio.hs
@@ -30,6 +30,7 @@ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 module Text.Pandoc.Biblio ( processBiblio ) where
 
 import Control.Monad ( when )
+import Data.Char ( toUpper, isPunctuation )
 import Data.List
 import Data.Unique
 import Text.CSL hiding ( Cite(..), Citation(..) )
@@ -44,17 +45,18 @@ processBiblio cf r p
       else do
         when (null cf) $ error "Missing the needed citation style file"
         csl  <- readCSLFile cf
-        p'   <- if styleClass csl == "note"
-                then processNote p
-                else processWithM setHash p
-        let groups     = if styleClass csl /= "note"
-                         then queryWith getCitation p'
-                         else getNoteCitations p'
-            result     = citeproc' csl r (setNearNote csl $ map (map toCslCite) groups)
-            cits_map   = zip groups (citations result)
+        p'   <- processWithM setHash p
+        let (nts,grps) = if styleClass csl /= "note"
+                         then (,) [] $ queryWith getCitation p'
+                         else let cits   = queryWith getCite p'
+                                  ncits  = map (queryWith getCite) $ queryWith getNote p'
+                                  needNt = cits \\ concat ncits
+                              in (,) needNt $ getNoteCitations needNt p'
+            result     = citeproc' csl r (setNearNote csl $ map (map toCslCite) grps)
+            cits_map   = zip grps (citations result)
             biblioList = map (read . renderPandoc' csl) (bibliography result)
             Pandoc m b = processWith (processCite csl cits_map) p'
-        return $ Pandoc m $ b ++ biblioList
+        return . generateNotes nts . Pandoc m $ b ++ biblioList
 
 -- | Substitute 'Cite' elements with formatted citations.
 processCite :: Style -> [([Citation],[FormattedOutput])] -> Inline -> Inline
@@ -70,7 +72,7 @@ processCite s cs il
 -- 'queryWith'.
 getCitation :: Inline -> [[Citation]]
 getCitation i | Cite t _ <- i = [t]
-          | otherwise         = []
+              | otherwise     = []
 
 getNote :: Inline -> [Inline]
 getNote i | Note _ <- i = [i]
@@ -80,26 +82,118 @@ getCite :: Inline -> [Inline]
 getCite i | Cite _ _ <- i = [i]
           | otherwise     = []
 
-getNoteCitations :: Pandoc -> [[Citation]]
-getNoteCitations
-    = let cits = concat . flip (zipWith $ setCiteNoteNum) [1..] .
-                 map (queryWith getCite) . queryWith getNote
-      in  queryWith getCitation . cits
+getNoteCitations :: [Inline] -> Pandoc -> [[Citation]]
+getNoteCitations needNote
+    = let mvCite i = if i `elem` needNote then Note [Para [i]] else i
+          setNote  = processWith mvCite
+          getCits  = concat . flip (zipWith $ setCiteNoteNum) [1..] .
+                     map (queryWith getCite) . queryWith getNote . setNote
+      in  queryWith getCitation . getCits
 
 setHash :: Citation -> IO Citation
 setHash (Citation i p l nn ao na _)
     = hashUnique `fmap` newUnique >>= return . Citation i p l nn ao na
 
-processNote :: Pandoc  -> IO Pandoc
-processNote p = do
-  p' <- processWithM setHash p
-  let cits     = queryWith getCite p'
-      ncits    = map (queryWith getCite) $ queryWith getNote p'
-      needNote = cits \\ concat ncits
-  return $ processWith (mvCiteInNote needNote) p'
+generateNotes :: [Inline] -> Pandoc -> Pandoc
+generateNotes needNote = processWith (mvCiteInNote needNote)
 
-mvCiteInNote :: [Inline] -> Inline -> Inline
-mvCiteInNote is i = if i `elem` is then Note [Para [i]] else i
+procInlines :: ([Inline] -> [Inline]) -> Block -> Block
+procInlines f b
+    | Plain    inls <- b = Plain    $ f inls
+    | Para     inls <- b = Para     $ f inls
+    | Header i inls <- b = Header i $ f inls
+    | otherwise          = b
+
+mvCiteInNote :: [Inline] -> Block -> Block
+mvCiteInNote is = procInlines mvCite
+    where
+      elem_ x xs = case x of Cite cs _ -> (Cite cs []) `elem` xs; _ -> False
+      mvCite :: [Inline] -> [Inline]
+      mvCite inls
+          | x:i:xs <- inls
+          , x == Space,   i `elem_` is = mvInNote i : mvCite xs
+          | i:xs <- inls, i `elem_` is = mvInNote i : mvCite xs
+          | i:xs <- inls, Note _ <- i  = checkNt  i : mvCite xs
+          | i:xs <- inls               = i          : mvCite xs
+          | otherwise                  = []
+      mvInNote i
+          | Cite t o <- i = Note [Para [Cite t $ toCapital o]]
+          | otherwise     = Note [Para [i                   ]]
+      checkPt i
+          | Cite c o : xs <- i
+          , headInline xs == lastInline o
+          , isPunct o = Cite c (initInline o) : checkPt xs
+          | x:xs <- i = x : checkPt xs
+          | otherwise = []
+      isPunct   = and . map isPunctuation . lastInline
+      checkNt   = processWith $ procInlines checkPt
+
+headInline :: [Inline] -> String
+headInline [] = []
+headInline (i:_)
+    | Str s <- i = head' s
+    | Space <- i = " "
+    | otherwise  = headInline $ getInline i
+    where
+      head' s = if s /= [] then [head s] else []
+
+lastInline :: [Inline] -> String
+lastInline [] = []
+lastInline (i:[])
+    | Str s <- i = last' s
+    | Space <- i = " "
+    | otherwise  = lastInline $ getInline i
+    where
+      last' s = if s /= [] then [last s] else []
+lastInline (_:xs) = lastInline xs
+
+initInline :: [Inline] -> [Inline]
+initInline [] = []
+initInline (i:[])
+    | Str          s <- i = return $ Str         (init'       s)
+    | Emph        is <- i = return $ Emph        (initInline is)
+    | Strong      is <- i = return $ Strong      (initInline is)
+    | Strikeout   is <- i = return $ Strikeout   (initInline is)
+    | Superscript is <- i = return $ Superscript (initInline is)
+    | Subscript   is <- i = return $ Subscript   (initInline is)
+    | Quoted q    is <- i = return $ Quoted q    (initInline is)
+    | SmallCaps   is <- i = return $ SmallCaps   (initInline is)
+    | Link      is t <- i = return $ Link        (initInline is) t
+    | otherwise           = []
+    where
+      init' s = if s /= [] then init s else []
+initInline (i:xs) = i : initInline xs
+
+toCapital :: [Inline] -> [Inline]
+toCapital = mapHeadInline toCap
+    where
+      toCap s = if s /= [] then toUpper (head s) : tail s else []
+
+mapHeadInline :: (String -> String) -> [Inline] -> [Inline]
+mapHeadInline _ [] = []
+mapHeadInline f (i:xs)
+    | Str          s <- i = Str         (f                s)   : xs
+    | Emph        is <- i = Emph        (mapHeadInline f is)   : xs
+    | Strong      is <- i = Strong      (mapHeadInline f is)   : xs
+    | Strikeout   is <- i = Strikeout   (mapHeadInline f is)   : xs
+    | Superscript is <- i = Superscript (mapHeadInline f is)   : xs
+    | Subscript   is <- i = Subscript   (mapHeadInline f is)   : xs
+    | Quoted q    is <- i = Quoted q    (mapHeadInline f is)   : xs
+    | SmallCaps   is <- i = SmallCaps   (mapHeadInline f is)   : xs
+    | Link      is t <- i = Link        (mapHeadInline f is) t : xs
+    | otherwise           = []
+
+getInline :: Inline -> [Inline]
+getInline i
+    | Emph        is <- i = is
+    | Strong      is <- i = is
+    | Strikeout   is <- i = is
+    | Superscript is <- i = is
+    | Subscript   is <- i = is
+    | Quoted _    is <- i = is
+    | SmallCaps   is <- i = is
+    | Link      is _ <- i = is
+    | otherwise           = []
 
 setCiteNoteNum :: [Inline] -> Int -> [Inline]
 setCiteNoteNum ((Cite cs o):xs) n = Cite (setCitationNoteNum n cs) o : setCiteNoteNum xs n

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: supporting both parenthetical and footnote citations (was: citeproc updates)
       [not found]                     ` <20101102213938.GE18764-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2010-11-02 23:58                       ` Andrea Rossato
@ 2010-11-03  4:10                       ` John MacFarlane
       [not found]                         ` <20101103041014.GA19840-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-03  4:10 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 02 10 22:39 ]:
> On Sat, Oct 30, 2010 at 09:06:08AM -0700, John MacFarlane wrote:
> > > Then a citation group: 1.
> > 
> > > 1. See Doe 2000, p. 34-35; also Doe 2002, chap. 3
> > 
> > Here there are three problems:
> > 
> > - The footnote is missing a period, since the period was outside
> >   of the citation
> > - The period occurs after the footnote.
> > - There is a space before the footnote.
> > 
> > It's a bit hard to see how to resolve these issues, given that, when
> > pandoc is actually parsing the citation, it won't know whether the
> > output is to be rendered using a footnote style or an author/date
> > style.
> > 
> > I noticed another issue with Andrea's test:  the note says simply
> > "at 3" instead of "chap. 3" -- apparently the number is being stored
> > and the rest ignored.  This seems undesirable.
> > 
> > My own inclination, if there's no obvious solution, is to optimize pandoc for
> > the author-date format. After all, if people want citations in footnotes, they
> > can always manually insert footnotes.
> > 
> 
> 
> I think these issues can be solved (I didn't notice them since I was
> in a hurry and mostly focused on the citeproc side of the problems).
> I'm attaching a patch to the citeproc branch which deals with them.
> 
>  * The first: some of the styles coming with the test-suite usually
>    put a final "." suffix in the citation layout. I don't know if this
>    is a good style coding practice (I tend to think it is not), but
>    while this usually solves the problem you reported it usually
>    messes up normal footnotes ending with a citation followed by a
>    period (which makes sense in markdown). I updated my test to check
>    this. This issue requires looking for duplicate punctuations at the
>    end of a footnote paragraph;
> 
>  * the second: this could be seen as a more general problem. Sometimes
>    I'm required to provide documents where footnote marks follow a
>    punctuation, while other times I'm require to do the opposite
>    (footnote mark *before* punctuation). It would be nice if pandoc
>    could take care of that (it wouldn't be difficult for the markdown
>    parser I think) with a command line option. Otherwise the problem
>    can be solved the same way I solved the third one;

I've personally never seen a style in which the footnote mark comes
before punctuation.  Is this information somehow encoding in the CSL
style, so that pandoc could extract it and do the right thing
automatically? A command-line option for this seems like the wrong
approach.

>  * the third: the fix of this problem, and the first one, could
>    possibly be simpler (there must be something I do not get about
>    generics since I'm not able to match a [Inline] without matching
>    every single Inline). I don't know how much all this processing
>    impacts the performance (beyond the burden of the csl processor):
>    some benchmarking would be nice. I have the feeling a better
>    approach could be studied.

From a quick glance at the code, there seems to be a lot of boilerplate
of the type that could be eliminated with generics.  I'm not sure
I understand the problem you were describing here:

>    generics since I'm not able to match a [Inline] without matching
>    every single Inline). I don't know how much all this processing

Your test in http://gorgias.mine.nu/citeproc/prova1a.html looks pretty
good, but what's happening in n. 5 doesn't look right. Maybe the
+ and - forms don't work inside a note?

Anyway, I'll push your changes to the citeproc branch.  Good to see
all this progress!

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: supporting both parenthetical and footnote citations (was: citeproc updates)
       [not found]                         ` <20101103041014.GA19840-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-03 12:10                           ` Andrea Rossato
       [not found]                             ` <20101103121027.GG18764-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-03 12:10 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 3855 bytes --]

On Tue, Nov 02, 2010 at 09:10:14PM -0700, John MacFarlane wrote:
> +++ Andrea Rossato [Nov 02 10 22:39 ]:
> >  * the second: this could be seen as a more general problem. Sometimes
> >    I'm required to provide documents where footnote marks follow a
> >    punctuation, while other times I'm require to do the opposite
> >    (footnote mark *before* punctuation). It would be nice if pandoc
> >    could take care of that (it wouldn't be difficult for the markdown
> >    parser I think) with a command line option. Otherwise the problem
> >    can be solved the same way I solved the third one;
> 
> I've personally never seen a style in which the footnote mark comes
> before punctuation.  Is this information somehow encoding in the CSL
> style, so that pandoc could extract it and do the right thing
> automatically? A command-line option for this seems like the wrong
> approach.

I'm not talking about citation styles, here. This is why I think this
could be thought as a more general issue. I happen to be required,
when publishing with my University, to format the manuscript so that
footnote marks are placed before punctuation.[1] It would be a nice
feature to be able to switch, with a command line option, to a
footnote-mark-before-punctuation mode.

It is like whether punctuation at the end of a quotation must be
placed within or outside quotes. While this later problem is handled
by a CSL style, CSL says nothing on where the footnote should be
placed with regards to punctuation.

Anyway, the attached patch will always generate the Note inline
*after* punctuation.

> 
> >  * the third: the fix of this problem, and the first one, could
> >    possibly be simpler (there must be something I do not get about
> >    generics since I'm not able to match a [Inline] without matching
> >    every single Inline). I don't know how much all this processing
> >    impacts the performance (beyond the burden of the csl processor):
> >    some benchmarking would be nice. I have the feeling a better
> >    approach could be studied.
> 
> From a quick glance at the code, there seems to be a lot of boilerplate
> of the type that could be eliminated with generics.  I'm not sure
> I understand the problem you were describing here:
> 
> >    generics since I'm not able to match a [Inline] without matching
> >    every single Inline). I don't know how much all this processing
> 

if you change:
mvCiteInNote :: [Inline] -> Block -> Block
mvCiteInNote is = procInlines mvCite

to:
mvCiteInNote :: [Inline] -> [Inline] -> [Inline]
mvCiteInNote is = mvCite

mvCite will not match the Space. I don't understand why.

Anyway, some boilerplate code is necessary when it comes to
manipulating the document: I do not see any other way of implementing
tailInline, initInline, etc.

> Your test in http://gorgias.mine.nu/citeproc/prova1a.html looks pretty
> good, but what's happening in n. 5 doesn't look right. Maybe the
> + and - forms don't work inside a note?

This is something you already reported. It is part of what I'm
presently fixing on the citeproc side.

The attached patch also includes something I left out: the code to
check whether the generated footnote needs a final period or not.

Andrea


[1] A couple of books as examples:
    http://eprints.biblio.unitn.it/archive/00001336/02/quaderno_70_roberto_caso_eprints.pdf
    http://eprints.biblio.unitn.it/archive/00001136/01/quaderno_59_caso_file_unico_DEF.pdf

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.


[-- Attachment #2: 0003-improve-footnote-generation-a-bit-more.patch --]
[-- Type: text/plain, Size: 3166 bytes --]

From ef7bea4d218dae5dceca14a81931d67ea87bd2d9 Mon Sep 17 00:00:00 2001
From: Andrea Rossato <andrea.rossato-/Q1r7N5in3P/wltNWqQaag@public.gmane.org>
Date: Wed, 3 Nov 2010 12:00:23 +0100
Subject: [PATCH 3/3] improve footnote generation a bit more

 - when generating a footnote check if we need to add a final period;
 - always place the footnote mark after a punctuation character
---
 src/Text/Pandoc/Biblio.hs |   34 ++++++++++++++++++++++++----------
 1 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/src/Text/Pandoc/Biblio.hs b/src/Text/Pandoc/Biblio.hs
index d4b72c9..8a9b21b 100644
--- a/src/Text/Pandoc/Biblio.hs
+++ b/src/Text/Pandoc/Biblio.hs
@@ -107,26 +107,35 @@ procInlines f b
 mvCiteInNote :: [Inline] -> Block -> Block
 mvCiteInNote is = procInlines mvCite
     where
-      elem_ x xs = case x of Cite cs _ -> (Cite cs []) `elem` xs; _ -> False
       mvCite :: [Inline] -> [Inline]
       mvCite inls
+          | x:i:xs <- inls, startWPt xs
+          , x == Space,   i `elem_` is = split i xs ++ mvCite (tailInline xs)
           | x:i:xs <- inls
-          , x == Space,   i `elem_` is = mvInNote i : mvCite xs
-          | i:xs <- inls, i `elem_` is = mvInNote i : mvCite xs
-          | i:xs <- inls, Note _ <- i  = checkNt  i : mvCite xs
-          | i:xs <- inls               = i          : mvCite xs
+          , x == Space,   i `elem_` is = mvInNote i :  mvCite xs
+          | i:xs <- inls, i `elem_` is
+          , startWPt xs                = split i xs ++ mvCite (tailInline xs)
+          | i:xs <- inls, Note _ <- i  = checkNt  i :  mvCite xs
+          | i:xs <- inls               = i          :  mvCite xs
           | otherwise                  = []
+      elem_ x xs = case x of Cite cs _ -> (Cite cs []) `elem` xs; _ -> False
+      split i xs = Str (headInline xs) : mvInNote i : []
       mvInNote i
-          | Cite t o <- i = Note [Para [Cite t $ toCapital o]]
-          | otherwise     = Note [Para [i                   ]]
+          | Cite t o <- i = Note [Para [Cite t $ sanitize o]]
+          | otherwise     = Note [Para [i                  ]]
+      sanitize i
+          | endWPt  i = toCapital i
+          | otherwise = toCapital (i ++ [Str "."])
+
       checkPt i
           | Cite c o : xs <- i
           , headInline xs == lastInline o
-          , isPunct o = Cite c (initInline o) : checkPt xs
+          , endWPt  o = Cite c (initInline o) : checkPt xs
           | x:xs <- i = x : checkPt xs
           | otherwise = []
-      isPunct   = and . map isPunctuation . lastInline
-      checkNt   = processWith $ procInlines checkPt
+      endWPt   = and . map isPunctuation . lastInline
+      startWPt = and . map isPunctuation . headInline
+      checkNt  = processWith $ procInlines checkPt
 
 headInline :: [Inline] -> String
 headInline [] = []
@@ -164,6 +173,11 @@ initInline (i:[])
       init' s = if s /= [] then init s else []
 initInline (i:xs) = i : initInline xs
 
+tailInline :: [Inline] -> [Inline]
+tailInline = mapHeadInline tail'
+    where
+      tail' s = if s /= [] then tail s else []
+
 toCapital :: [Inline] -> [Inline]
 toCapital = mapHeadInline toCap
     where

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: supporting both parenthetical and footnote citations (was: citeproc updates)
       [not found]                             ` <20101103121027.GG18764-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-03 15:03                               ` Bruce
       [not found]                                 ` <88f053e9-c948-4b5d-b92a-46caba340c09-sqEsDViiE88wq+9lhVFwFGB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
  2010-11-03 15:36                               ` John MacFarlane
  1 sibling, 1 reply; 107+ messages in thread
From: Bruce @ 2010-11-03 15:03 UTC (permalink / raw)
  To: pandoc-discuss

On Nov 3, 8:10 am, Andrea Rossato <andrea.ross...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

...

> I'm not talking about citation styles, here. This is why I think this
> could be thought as a more general issue. I happen to be required,
> when publishing with my University, to format the manuscript so that
> footnote marks are placed before punctuation.[1] It would be a nice
> feature to be able to switch, with a command line option, to a
> footnote-mark-before-punctuation mode.
>
> It is like whether punctuation at the end of a quotation must be
> placed within or outside quotes. While this later problem is handled
> by a CSL style, CSL says nothing on where the footnote should be
> placed with regards to punctuation.

So do you see these rules as locale, or style, specific?

In any case, feel free to suggest adding a parameter for this to CSL
if you think we need it.

Bruce

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: supporting both parenthetical and footnote citations (was: citeproc updates)
       [not found]                             ` <20101103121027.GG18764-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2010-11-03 15:03                               ` Bruce
@ 2010-11-03 15:36                               ` John MacFarlane
  1 sibling, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-03 15:36 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 03 10 13:10 ]:
> On Tue, Nov 02, 2010 at 09:10:14PM -0700, John MacFarlane wrote:
> > +++ Andrea Rossato [Nov 02 10 22:39 ]:
> > >  * the second: this could be seen as a more general problem. Sometimes
> > >    I'm required to provide documents where footnote marks follow a
> > >    punctuation, while other times I'm require to do the opposite
> > >    (footnote mark *before* punctuation). It would be nice if pandoc
> > >    could take care of that (it wouldn't be difficult for the markdown
> > >    parser I think) with a command line option. Otherwise the problem
> > >    can be solved the same way I solved the third one;
> > 
> > I've personally never seen a style in which the footnote mark comes
> > before punctuation.  Is this information somehow encoding in the CSL
> > style, so that pandoc could extract it and do the right thing
> > automatically? A command-line option for this seems like the wrong
> > approach.
> 
> I'm not talking about citation styles, here. This is why I think this
> could be thought as a more general issue. I happen to be required,
> when publishing with my University, to format the manuscript so that
> footnote marks are placed before punctuation.[1] It would be a nice
> feature to be able to switch, with a command line option, to a
> footnote-mark-before-punctuation mode.

Let me make sure I understand: You're saying it's a more general issue because
you want this to affect all footnotes, not just citations?  So that you
can write a document with all footnotes after punctuation, and use
this switch to convert?

I'm certainly willing to consider this. My only reservation is that this
seems very special-purpose for a command-line option (especially since
footnotes-before-punctuation is fairly unusual). Another way to handle this
(sacrificing some performance) would be to write a small script that reads
Native pandoc, does the transformation, and writes Native. You could then use
this in a pipe between two pandoc invocations. Would that suffice?

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found] ` <20101027132010.GB6998-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2010-10-28  1:36   ` John MacFarlane
@ 2010-11-03 23:39   ` Nathan Gass
       [not found]     ` <4CD1F297.2020609-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-03 23:39 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 27.10.10 15:20, Andrea Rossato wrote:
> Hi,
>
> I just wanted to briefly report on the recent citeproc-hs updates.
>
> As the ones of you interested in the project remember, last spring I
> announced I was working on updating my CSL implementation to support
> CSL-1.0 and ot add a new extended API to allow a richer citation
> markup for pandoc.
>
> I was not able to finish quickly but I kept slowly working on it and
> now I've just pushed a few patches that bring us closer to the goal:
>
>    * the new pandoc API is ready and I've also some code for the pandoc
>      side to implement citation prefixes, and modifiers to print the
>      author only or to suppress it in the citation. Some other
>      modifiers could be included (something like a \nocite would be
>      very useful for me);

These modifieres currently are parsed and stored by citation key, so 
`[+@doe99; -@jones90]` gets parsed as one Cite inline having two 
citations, one author-only and one author-less. Do we really want to 
support this? If not, I can make a patch to move the options out of the 
Citation datatype into the Cite inline and adapt the readers and 
writers. Comments?

I would like to add a textual citation primitive to the native pandoc 
format to make the conversion code easier and less error-prone. With 
citeproc-hs, we could render the primitive using two citations 
(author-only and then author-less). Converting citations from one format 
to another would be a lot easier by this addition. That does not mean we 
have to add a textual citation primitive in markdown too. Is this ok?

Greetings
Nathan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]     ` <4CD1F297.2020609-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-04  7:36       ` John MacFarlane
       [not found]         ` <20101104073606.GA17293-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-04  7:36 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Nathan Gass [Nov 04 10 00:39 ]:
> On 27.10.10 15:20, Andrea Rossato wrote:
> >Hi,
> >
> >I just wanted to briefly report on the recent citeproc-hs updates.
> >
> >As the ones of you interested in the project remember, last spring I
> >announced I was working on updating my CSL implementation to support
> >CSL-1.0 and ot add a new extended API to allow a richer citation
> >markup for pandoc.
> >
> >I was not able to finish quickly but I kept slowly working on it and
> >now I've just pushed a few patches that bring us closer to the goal:
> >
> >   * the new pandoc API is ready and I've also some code for the pandoc
> >     side to implement citation prefixes, and modifiers to print the
> >     author only or to suppress it in the citation. Some other
> >     modifiers could be included (something like a \nocite would be
> >     very useful for me);
> 
> These modifieres currently are parsed and stored by citation key, so
> `[+@doe99; -@jones90]` gets parsed as one Cite inline having two
> citations, one author-only and one author-less. Do we really want to
> support this?

My own view is that

(a) we can get by with just one modifier (the -), and

(b) this should modify the whole list of citations (the Cite inline),
rather than individual citations.  (See my sample grammar from earlier
in this thread.)

Do others agree?

> If not, I can make a patch to move the options out of
> the Citation datatype into the Cite inline and adapt the readers and
> writers. Comments?
> 
> I would like to add a textual citation primitive to the native
> pandoc format to make the conversion code easier and less
> error-prone. With citeproc-hs, we could render the primitive using
> two citations (author-only and then author-less). Converting
> citations from one format to another would be a lot easier by this
> addition. That does not mean we have to add a textual citation
> primitive in markdown too. Is this ok?

I'm tempted to say yes, but I'd also like to hear what Andrea and
others think.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: supporting both parenthetical and footnote citations (was: citeproc updates)
       [not found]                                 ` <88f053e9-c948-4b5d-b92a-46caba340c09-sqEsDViiE88wq+9lhVFwFGB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
@ 2010-11-04 11:09                                   ` Andrea Rossato
  0 siblings, 0 replies; 107+ messages in thread
From: Andrea Rossato @ 2010-11-04 11:09 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Wed, Nov 03, 2010 at 08:03:40AM -0700, Bruce wrote:
> On Nov 3, 8:10 am, Andrea Rossato <andrea.ross...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> 
> ...
> 
> > I'm not talking about citation styles, here. This is why I think this
> > could be thought as a more general issue. I happen to be required,
> > when publishing with my University, to format the manuscript so that
> > footnote marks are placed before punctuation.[1] It would be a nice
> > feature to be able to switch, with a command line option, to a
> > footnote-mark-before-punctuation mode.
> >
> > It is like whether punctuation at the end of a quotation must be
> > placed within or outside quotes. While this later problem is handled
> > by a CSL style, CSL says nothing on where the footnote should be
> > placed with regards to punctuation.
> 
> So do you see these rules as locale, or style, specific?
> 
> In any case, feel free to suggest adding a parameter for this to CSL
> if you think we need it.

No, I don't think they are related to locale o citation styles.

The problem is probably too specific to require a command-line option:
two runs of pandoc with some processing of the native format, as John
suggested, is probably the best way to take care of such kind of
issues.

Andrea

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]         ` <20101104073606.GA17293-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-04 11:48           ` Andrea Rossato
       [not found]             ` <20101104114801.GF10392-u31zCTIHpvLVI6Gt0zCidg@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-04 11:48 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Thu, Nov 04, 2010 at 12:36:07AM -0700, John MacFarlane wrote:
> +++ Nathan Gass [Nov 04 10 00:39 ]:
> > On 27.10.10 15:20, Andrea Rossato wrote:
> > >Hi,
> > >
> > >I just wanted to briefly report on the recent citeproc-hs updates.
> > >
> > >As the ones of you interested in the project remember, last spring I
> > >announced I was working on updating my CSL implementation to support
> > >CSL-1.0 and ot add a new extended API to allow a richer citation
> > >markup for pandoc.
> > >
> > >I was not able to finish quickly but I kept slowly working on it and
> > >now I've just pushed a few patches that bring us closer to the goal:
> > >
> > >   * the new pandoc API is ready and I've also some code for the pandoc
> > >     side to implement citation prefixes, and modifiers to print the
> > >     author only or to suppress it in the citation. Some other
> > >     modifiers could be included (something like a \nocite would be
> > >     very useful for me);
> > 
> > These modifieres currently are parsed and stored by citation key, so
> > `[+@doe99; -@jones90]` gets parsed as one Cite inline having two
> > citations, one author-only and one author-less. Do we really want to
> > support this?
> 
> My own view is that
> 
> (a) we can get by with just one modifier (the -), and
> 
> (b) this should modify the whole list of citations (the Cite inline),
> rather than individual citations.  (See my sample grammar from earlier
> in this thread.)
> 
> Do others agree?

I agree. This is the way the "suppress-author" modifier is supposed to
work: in a cite with multiple citations, if one has the
suppress-author bit set, all citations in the cite will be rendered
without the author. See:
http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_SuppressMultipleAuthors.txt

So, if I understand it correctly, as Nathan suggested, we have the
citeproc API to support both suppress-author and author-only (which
can be used by the natbib reader Nathant is working to) without the
markdown syntax for it. This is perfectly fine, for me.

> 
> > If not, I can make a patch to move the options out of
> > the Citation datatype into the Cite inline and adapt the readers and
> > writers. Comments?
> > 
> > I would like to add a textual citation primitive to the native
> > pandoc format to make the conversion code easier and less
> > error-prone. With citeproc-hs, we could render the primitive using
> > two citations (author-only and then author-less). Converting
> > citations from one format to another would be a lot easier by this
> > addition. That does not mean we have to add a textual citation
> > primitive in markdown too. Is this ok?
> 
> I'm tempted to say yes, but I'd also like to hear what Andrea and
> others think.

I do not understand why we should have a new Inline constructor. Isn't
Cite enough? Do we need to patch it to take extra parameters - to
support the natbib reader? Why do we need them? I'm just asking
because perhaps I'm missing something.

The API supports also a citation suffix, in the case we need it.

About the citation syntax in markdown: I'm reviewing the threads, so
sorry if I'm proposing something which has already been rejected. More
then the nocite you have been talking about, I need a command to
render a citation without having that citation showing up in the list
of references (bibliography). This is especially useful for court
decisions, which may be cited but not included in the bibliography.
Maybe that could be achieved with a modifier to be placed before the
'@', (like the '-' to set the suppress-author bit).

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: Re: citeproc updates
       [not found]             ` <20101104114801.GF10392-u31zCTIHpvLVI6Gt0zCidg@public.gmane.org>
@ 2010-11-04 14:54               ` Andrea Rossato
       [not found]                 ` <20101104145457.GA6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2010-11-04 15:23               ` Nathan Gass
  2010-11-04 16:06               ` John MacFarlane
  2 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-04 14:54 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 1823 bytes --]

On Thu, Nov 04, 2010 at 12:48:01PM +0100, Andrea Rossato wrote:
> On Thu, Nov 04, 2010 at 12:36:07AM -0700, John MacFarlane wrote:
> > (b) this should modify the whole list of citations (the Cite inline),
> > rather than individual citations.  (See my sample grammar from earlier
> > in this thread.)
> > 
> > Do others agree?
> 
> I agree. This is the way the "suppress-author" modifier is supposed to
> work: in a cite with multiple citations, if one has the
> suppress-author bit set, all citations in the cite will be rendered
> without the author. See:
> http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_SuppressMultipleAuthors.txt

I was wrong on this point: suppress-author refers to individual
citations (the cited test is to check whether the collapsing is
handled correctly).

Anyway, I don't see the relation with your grammar: if I understand it
correctly, suppress-author works on a per citation basis too. am I
right?

With regards to the problem of generating footnotes and thus switching
between in-text and note style I think we are close to a solution, I
think. I'm attaching a small patch for pandoc. In a few hours I'm also
going to commit the fixes to the citeproc-hs side.

Here I updated the test.markdown file and the produced output with the
usual 3 different styles (2 note styles and 1 in-text):
http://gorgias.mine.nu/citeproc/

Andrea

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.


[-- Attachment #2: 0004-check-for-punctuation-only.patch --]
[-- Type: text/plain, Size: 1563 bytes --]

From a1a805c4bc68ec053c9e7bc7fe70cce1fb22fd67 Mon Sep 17 00:00:00 2001
From: Andrea Rossato <andrea.rossato-/Q1r7N5in3P/wltNWqQaag@public.gmane.org>
Date: Thu, 4 Nov 2010 15:39:49 +0100
Subject: [PATCH 4/4] check for punctuation only
 Data.Char.isPuctuation matches "various kinds of connectors, brackets
 and quotes", too much...

---
 src/Text/Pandoc/Biblio.hs |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/Text/Pandoc/Biblio.hs b/src/Text/Pandoc/Biblio.hs
index 8a9b21b..1621550 100644
--- a/src/Text/Pandoc/Biblio.hs
+++ b/src/Text/Pandoc/Biblio.hs
@@ -30,7 +30,7 @@ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 module Text.Pandoc.Biblio ( processBiblio ) where
 
 import Control.Monad ( when )
-import Data.Char ( toUpper, isPunctuation )
+import Data.Char ( toUpper )
 import Data.List
 import Data.Unique
 import Text.CSL hiding ( Cite(..), Citation(..) )
@@ -129,12 +129,12 @@ mvCiteInNote is = procInlines mvCite
 
       checkPt i
           | Cite c o : xs <- i
-          , headInline xs == lastInline o
+          , endWPt o, startWPt xs
           , endWPt  o = Cite c (initInline o) : checkPt xs
           | x:xs <- i = x : checkPt xs
           | otherwise = []
-      endWPt   = and . map isPunctuation . lastInline
-      startWPt = and . map isPunctuation . headInline
+      endWPt   = and . map (`elem` ".,;:!?") . lastInline
+      startWPt = and . map (`elem` ".,;:!?") . headInline
       checkNt  = processWith $ procInlines checkPt
 
 headInline :: [Inline] -> String

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]             ` <20101104114801.GF10392-u31zCTIHpvLVI6Gt0zCidg@public.gmane.org>
  2010-11-04 14:54               ` Andrea Rossato
@ 2010-11-04 15:23               ` Nathan Gass
       [not found]                 ` <4CD2CFFE.8030503-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  2010-11-04 16:06               ` John MacFarlane
  2 siblings, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-04 15:23 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 04.11.10 12:48, Andrea Rossato wrote:
> On Thu, Nov 04, 2010 at 12:36:07AM -0700, John MacFarlane wrote:
>> +++ Nathan Gass [Nov 04 10 00:39 ]:
>>> On 27.10.10 15:20, Andrea Rossato wrote:
>>>> Hi,
>>>>
>>>> I just wanted to briefly report on the recent citeproc-hs updates.
>>>>
>>>> As the ones of you interested in the project remember, last spring I
>>>> announced I was working on updating my CSL implementation to support
>>>> CSL-1.0 and ot add a new extended API to allow a richer citation
>>>> markup for pandoc.
>>>>
>>>> I was not able to finish quickly but I kept slowly working on it and
>>>> now I've just pushed a few patches that bring us closer to the goal:
>>>>
>>>>    * the new pandoc API is ready and I've also some code for the pandoc
>>>>      side to implement citation prefixes, and modifiers to print the
>>>>      author only or to suppress it in the citation. Some other
>>>>      modifiers could be included (something like a \nocite would be
>>>>      very useful for me);
>>>
>>> These modifieres currently are parsed and stored by citation key, so
>>> `[+@doe99; -@jones90]` gets parsed as one Cite inline having two
>>> citations, one author-only and one author-less. Do we really want to
>>> support this?
>>
>> My own view is that
>>
>> (a) we can get by with just one modifier (the -), and
>>
>> (b) this should modify the whole list of citations (the Cite inline),
>> rather than individual citations.  (See my sample grammar from earlier
>> in this thread.)
>>
>> Do others agree?
>
> I agree. This is the way the "suppress-author" modifier is supposed to
> work: in a cite with multiple citations, if one has the
> suppress-author bit set, all citations in the cite will be rendered
> without the author. See:
> http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_SuppressMultipleAuthors.txt
>

In that case I'll continue to work on a patch to reflect this in the 
Cite inline, which currently can express different options per Citation 
in a single Cite. If we don't want to support this, we should not give 
any reader the possibility to generate such a native form. So I will 
provide a patch changing

     Cite [Citations] [Inline]

to

     Cite CiteOptions [Citations] [Inline]

and move the variants from the Citation type to the CiteOptions type.


> So, if I understand it correctly, as Nathan suggested, we have the
> citeproc API to support both suppress-author and author-only (which
> can be used by the natbib reader Nathant is working to) without the
> markdown syntax for it. This is perfectly fine, for me.
>
>>
>>> If not, I can make a patch to move the options out of
>>> the Citation datatype into the Cite inline and adapt the readers and
>>> writers. Comments?
>>>
>>> I would like to add a textual citation primitive to the native
>>> pandoc format to make the conversion code easier and less
>>> error-prone. With citeproc-hs, we could render the primitive using
>>> two citations (author-only and then author-less). Converting
>>> citations from one format to another would be a lot easier by this
>>> addition. That does not mean we have to add a textual citation
>>> primitive in markdown too. Is this ok?
>>
>> I'm tempted to say yes, but I'd also like to hear what Andrea and
>> others think.
>
> I do not understand why we should have a new Inline constructor. Isn't
> Cite enough? Do we need to patch it to take extra parameters - to
> support the natbib reader? Why do we need them? I'm just asking
> because perhaps I'm missing something.

The Cite inline is enough, but it should support an option "textual 
citation". Without that primitive, I have to add two citations in the 
reader for a textual citation and handle this special case in the writer 
too (at least to get as standard latex as possible). This is harder to 
implement and more error-prone. No extra parameters are needed. The 
markdown writer could handle textual citations, by rendering the author 
with citeproc-hs and then write our syntax for an omit-author citation.

Nathan



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]             ` <20101104114801.GF10392-u31zCTIHpvLVI6Gt0zCidg@public.gmane.org>
  2010-11-04 14:54               ` Andrea Rossato
  2010-11-04 15:23               ` Nathan Gass
@ 2010-11-04 16:06               ` John MacFarlane
       [not found]                 ` <20101104160627.GE25944-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-04 16:06 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 04 10 12:48 ]:
> On Thu, Nov 04, 2010 at 12:36:07AM -0700, John MacFarlane wrote:
> > +++ Nathan Gass [Nov 04 10 00:39 ]:
> > > On 27.10.10 15:20, Andrea Rossato wrote:
> > > >Hi,
> > > >
> > > >I just wanted to briefly report on the recent citeproc-hs updates.
> > > >
> > > >As the ones of you interested in the project remember, last spring I
> > > >announced I was working on updating my CSL implementation to support
> > > >CSL-1.0 and ot add a new extended API to allow a richer citation
> > > >markup for pandoc.
> > > >
> > > >I was not able to finish quickly but I kept slowly working on it and
> > > >now I've just pushed a few patches that bring us closer to the goal:
> > > >
> > > >   * the new pandoc API is ready and I've also some code for the pandoc
> > > >     side to implement citation prefixes, and modifiers to print the
> > > >     author only or to suppress it in the citation. Some other
> > > >     modifiers could be included (something like a \nocite would be
> > > >     very useful for me);
> > > 
> > > These modifieres currently are parsed and stored by citation key, so
> > > `[+@doe99; -@jones90]` gets parsed as one Cite inline having two
> > > citations, one author-only and one author-less. Do we really want to
> > > support this?
> > 
> > My own view is that
> > 
> > (a) we can get by with just one modifier (the -), and
> > 
> > (b) this should modify the whole list of citations (the Cite inline),
> > rather than individual citations.  (See my sample grammar from earlier
> > in this thread.)
> > 
> > Do others agree?
> 
> I agree. This is the way the "suppress-author" modifier is supposed to
> work: in a cite with multiple citations, if one has the
> suppress-author bit set, all citations in the cite will be rendered
> without the author. See:
> http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_SuppressMultipleAuthors.txt
> 
> So, if I understand it correctly, as Nathan suggested, we have the
> citeproc API to support both suppress-author and author-only (which
> can be used by the natbib reader Nathant is working to) without the
> markdown syntax for it. This is perfectly fine, for me.

I think than Nathan was suggesting the following:

(a)  In the markdown syntax, the suppress-author modifier should go
at the beginning of the whole citation, not before individual reference
keys.  This was also how I had it in the suggested grammar.  So,
[-@doe99; @smith00] is legal, as is [- @doe99; @smith00],
but [-@doe99; -@smith00] is not.

(b)  This modifier (and any other modifiers we use internally,
like suppress-author and textual-citation) would go in the Cite
inline, rather than in Citation.  This is because it modifies a
whole list of citations, not particular references.  So Cite
would have some added field that could be CiteFull or CiteAuthorOnly
or CiteNoAuthor or CiteTextual.  I thought that Nathan's idea
was that the Citation structure would no longer need to contain this
information, but that may not be compatible with what you're
doing internally; if not, perhaps things could be set up so that
the relevant fields in Citation were always filled in automatically
from the fields in the containing Cite.

> > > If not, I can make a patch to move the options out of
> > > the Citation datatype into the Cite inline and adapt the readers and
> > > writers. Comments?
> > > 
> > > I would like to add a textual citation primitive to the native
> > > pandoc format to make the conversion code easier and less
> > > error-prone. With citeproc-hs, we could render the primitive using
> > > two citations (author-only and then author-less). Converting
> > > citations from one format to another would be a lot easier by this
> > > addition. That does not mean we have to add a textual citation
> > > primitive in markdown too. Is this ok?
> > 
> > I'm tempted to say yes, but I'd also like to hear what Andrea and
> > others think.
> 
> I do not understand why we should have a new Inline constructor. Isn't
> Cite enough? Do we need to patch it to take extra parameters - to
> support the natbib reader? Why do we need them? I'm just asking
> because perhaps I'm missing something.

I thought that what Nathan was proposing was not an entirely new inline
element, but rather a variant (similar to suppress-author or
author-only) that would be a field on Cite. So you could have
for example

Cite CiteTextual [Cite ..., Cite ...] [(inlines)]

or

Cite CiteNoAuthor [Cite ..., Cite ...] [(inlines)]

Nathan, is that what you intended?

I also think that if we had CiteTextual, we really wouldn't
need CiteAuthorOnly.  AS people have pointed out, you'd really
never want mere mention of an author to trigger a reference
being added to the list of works cited.

> The API supports also a citation suffix, in the case we need it.
> 
> About the citation syntax in markdown: I'm reviewing the threads, so
> sorry if I'm proposing something which has already been rejected. More
> then the nocite you have been talking about, I need a command to
> render a citation without having that citation showing up in the list
> of references (bibliography). This is especially useful for court
> decisions, which may be cited but not included in the bibliography.
> Maybe that could be achieved with a modifier to be placed before the
> '@', (like the '-' to set the suppress-author bit).

So, this is the converse of nocite, which includes the work in the list of
works cited without actually producing a citation. I don't currently have a
better idea than using another special character, though I don't like that
much.

(In fact, I'm still somewhat attached to my proposal to make pandoc figure out
automatically when to use a textual citation and when not ... does anybody
else like that idea?)

We need to figure out how to represent the nocite option and its converse.
Presumably this information, too, should be part of a Cite? It's a bit
complicated, because a flag for nocite (say, CiteOmitFromReferences) could be
an alternative to CiteTextual, CiteNoAuthor, etc. (You're not printing the
thing, so you don't need to specify these options.) But the converse of nocite
is orthogonal to these other options, so would need to go elsewhere.

Here's another idea, which would not require putting this information in Cite
or adding more symbols to the citation syntax. At the end of a document, you
could specify where the bibliography goes using:

<references />

and if you wanted to get fancy:

<references>
+doe99
+smith04
-smith09
</references>

Or, perhaps

<references include="doe99, smith04" omit="smith09"/>

This would tell pandoc:  include doe99 and smith04 in the references,
even if they weren't cited, and don't include smith09, even if it
was cited.

Having something like <references /> would also allow you to omit
the list of references if you wanted to -- something that isn't possible
now -- or to include it elsewhere.

In fact, we could get even fancier and allow multiple <references />
tags in a document.  Each one would trigger a list of references based
on the citations since the beginning of the document OR the last
<references /> tag.  This would allow you to do per-chapter or
per-section bibliographies, a common need.

Thoughts?

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                 ` <20101104145457.GA6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-04 16:09                   ` John MacFarlane
       [not found]                     ` <20101104160942.GF25944-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-04 16:09 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 04 10 15:54 ]:
> On Thu, Nov 04, 2010 at 12:48:01PM +0100, Andrea Rossato wrote:
> > On Thu, Nov 04, 2010 at 12:36:07AM -0700, John MacFarlane wrote:
> > > (b) this should modify the whole list of citations (the Cite inline),
> > > rather than individual citations.  (See my sample grammar from earlier
> > > in this thread.)
> > > 
> > > Do others agree?
> > 
> > I agree. This is the way the "suppress-author" modifier is supposed to
> > work: in a cite with multiple citations, if one has the
> > suppress-author bit set, all citations in the cite will be rendered
> > without the author. See:
> > http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_SuppressMultipleAuthors.txt
> 
> I was wrong on this point: suppress-author refers to individual
> citations (the cited test is to check whether the collapsing is
> handled correctly).
> 
> Anyway, I don't see the relation with your grammar: if I understand it
> correctly, suppress-author works on a per citation basis too. am I
> right?

I just can't think of a case where you'd want a list of citations
with some suppress-author and others not, e.g.:

(Doe 1999, 30; Smith 2004; 2008, 30)

So, even if in principle citeproc can do this, it seems better for
our representation to put the flag on the Cite rather than on
individual Citations.

> With regards to the problem of generating footnotes and thus switching
> between in-text and note style I think we are close to a solution, I
> think. I'm attaching a small patch for pandoc. In a few hours I'm also
> going to commit the fixes to the citeproc-hs side.
> 
> Here I updated the test.markdown file and the produced output with the
> usual 3 different styles (2 note styles and 1 in-text):
> http://gorgias.mine.nu/citeproc/

Great! I'll apply the pandoc patch and push to github.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                 ` <4CD2CFFE.8030503-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-04 23:03                   ` Nathan Gass
       [not found]                     ` <4CD33BD6.6050703-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-04 23:03 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 04.11.10 16:23, Nathan Gass wrote:
> On 04.11.10 12:48, Andrea Rossato wrote:
>> On Thu, Nov 04, 2010 at 12:36:07AM -0700, John MacFarlane wrote:
>>> +++ Nathan Gass [Nov 04 10 00:39 ]:
>>>> On 27.10.10 15:20, Andrea Rossato wrote:
>>>>> Hi,
>>>>>
>>>>> I just wanted to briefly report on the recent citeproc-hs updates.
>>>>>
>>>>> As the ones of you interested in the project remember, last spring I
>>>>> announced I was working on updating my CSL implementation to support
>>>>> CSL-1.0 and ot add a new extended API to allow a richer citation
>>>>> markup for pandoc.
>>>>>
>>>>> I was not able to finish quickly but I kept slowly working on it and
>>>>> now I've just pushed a few patches that bring us closer to the goal:
>>>>>
>>>>> * the new pandoc API is ready and I've also some code for the pandoc
>>>>> side to implement citation prefixes, and modifiers to print the
>>>>> author only or to suppress it in the citation. Some other
>>>>> modifiers could be included (something like a \nocite would be
>>>>> very useful for me);
>>>>
>>>> These modifieres currently are parsed and stored by citation key, so
>>>> `[+@doe99; -@jones90]` gets parsed as one Cite inline having two
>>>> citations, one author-only and one author-less. Do we really want to
>>>> support this?
>>>
>>> My own view is that
>>>
>>> (a) we can get by with just one modifier (the -), and
>>>
>>> (b) this should modify the whole list of citations (the Cite inline),
>>> rather than individual citations. (See my sample grammar from earlier
>>> in this thread.)
>>>
>>> Do others agree?
>>
>> I agree. This is the way the "suppress-author" modifier is supposed to
>> work: in a cite with multiple citations, if one has the
>> suppress-author bit set, all citations in the cite will be rendered
>> without the author. See:
>> http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_SuppressMultipleAuthors.txt
>>
>>
>
> In that case I'll continue to work on a patch to reflect this in the
> Cite inline, which currently can express different options per Citation
> in a single Cite. If we don't want to support this, we should not give
> any reader the possibility to generate such a native form. So I will
> provide a patch changing
>
> Cite [Citations] [Inline]
>
> to
>
> Cite CiteOptions [Citations] [Inline]
>
> and move the variants from the Citation type to the CiteOptions type.
>
>

I just send a pull request on github with my changes. My citeproc branch 
is at http://github.com/xabbu42/pandoc/tree/citeproc, in case others 
want to see what I'm doing. Andrea, do you have your citeproc branch 
somewhere accessible?

I need this changes for the natbib and biblatex writer and reader, which 
are my next top priority.

If I get around to it, I'd like to simplify the Cite inline a bit, as it 
has some data which I consider an implementation detail and which should 
not be exposed in the native pandoc format (namely citationNoteNum and 
citationHash). Is this a sensible feeling or something stupid to try?


Greetings
Nathan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                 ` <20101104160627.GE25944-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-05  0:41                   ` Frank Bennett
       [not found]                     ` <1df2d027-9220-45b1-8126-1b0965bd7836-s+NOhRKKP/7FX/zIJQasLWB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Frank Bennett @ 2010-11-05  0:41 UTC (permalink / raw)
  To: pandoc-discuss

Hi, all.  I'm a colleague of Andrea and Bruce on the CSL/citeproc side
(as the author of citeproc-js), and am just now joining the
discussion.

On Nov 5, 1:06 am, John MacFarlane <fiddlosop...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> (In fact, I'm still somewhat attached to my proposal to make pandoc figure out
> automatically when to use a textual citation and when not ... does anybody
> else like that idea?)

I can offer a further use case for "textual citations" that might be
relevant.

A common citation style in China is purely numeric, using numbers for
all references, including "textual citations".  As a base example,
suppose a text is set as follows in an author-date style:

  Jones (2000) says that water is wet, which has been confirmed by
others. (Smith 1999)

In the pure numeric style, this would be rendered like this (^
signifying superscript -- note that the "ordinary" type of reference
is superscripted, while the "author-only" type of reference is not):

  Reference [1] establishes that water is wet, a finding that has been
refined by others.^[2]^

  Bibliography
  [1] John Jones, "Wetness of Water" (2000).
  [2] Samuel Smith, "Dampness of Water" (1999).

In a note style, the same text would be rendered something like this:

  Jones^1^ establishes that water is wet, a finding that has been
refined by others.^2^

  Footnotes
  1. "Wetness of Water" (2000).
  2. Samuel Smith, "Dampness of Water" (1999).

I won't intrude on the syntax proposals made upthread, but it is
possible to handle these various transformations by tracking the use
of author-only and suppress-author toggles in citeproc, and applying
rules like the following:

(1) Cites with neither author-only nor suppress-author render in the
normal way.

(2) An author-only cite always renders in the main text.  Style-
supplied formatting (such as italics or superscripting) are always
omitted.  If the style provides only the citation number as the cite,
then a style-provided term is set before the cite in the text.

(3) If an author-only cite is followed by a suppress-author cite,
then:
  (a) If the style provides only the citation number as the cite, the
suppress-author cite renders nothing.
  (b) Otherwise, the cite is rendered in the normal way, with the
author suppressed.

This puts a significant additional burden on citeproc, so (thinking of
Andrea's position) it's probably not something to target in the short
term. But as the use case is lurking out there, preserving a pathway
in the document syntax that provides the necessary hints for
implementation might save a bit of pain down the road.

There are a couple of processor tests that exercise the relevant
conditions here:

http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_AuthorDateAuthorOnlyThenSuppressAuthor.txt

http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_CitationNumberAuthorOnlyThenSuppressAuthor.txt

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                     ` <4CD33BD6.6050703-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-05  7:00                       ` John MacFarlane
  2010-11-05 11:36                       ` Andrea Rossato
  1 sibling, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-05  7:00 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Nathan Gass [Nov 05 10 00:03 ]:
> On 04.11.10 16:23, Nathan Gass wrote:
> >On 04.11.10 12:48, Andrea Rossato wrote:
> >>On Thu, Nov 04, 2010 at 12:36:07AM -0700, John MacFarlane wrote:
> >>>+++ Nathan Gass [Nov 04 10 00:39 ]:
> >>>>On 27.10.10 15:20, Andrea Rossato wrote:
> >>>>>Hi,
> >>>>>
> >>>>>I just wanted to briefly report on the recent citeproc-hs updates.
> >>>>>
> >>>>>As the ones of you interested in the project remember, last spring I
> >>>>>announced I was working on updating my CSL implementation to support
> >>>>>CSL-1.0 and ot add a new extended API to allow a richer citation
> >>>>>markup for pandoc.
> >>>>>
> >>>>>I was not able to finish quickly but I kept slowly working on it and
> >>>>>now I've just pushed a few patches that bring us closer to the goal:
> >>>>>
> >>>>>* the new pandoc API is ready and I've also some code for the pandoc
> >>>>>side to implement citation prefixes, and modifiers to print the
> >>>>>author only or to suppress it in the citation. Some other
> >>>>>modifiers could be included (something like a \nocite would be
> >>>>>very useful for me);
> >>>>
> >>>>These modifieres currently are parsed and stored by citation key, so
> >>>>`[+@doe99; -@jones90]` gets parsed as one Cite inline having two
> >>>>citations, one author-only and one author-less. Do we really want to
> >>>>support this?
> >>>
> >>>My own view is that
> >>>
> >>>(a) we can get by with just one modifier (the -), and
> >>>
> >>>(b) this should modify the whole list of citations (the Cite inline),
> >>>rather than individual citations. (See my sample grammar from earlier
> >>>in this thread.)
> >>>
> >>>Do others agree?
> >>
> >>I agree. This is the way the "suppress-author" modifier is supposed to
> >>work: in a cite with multiple citations, if one has the
> >>suppress-author bit set, all citations in the cite will be rendered
> >>without the author. See:
> >>http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_SuppressMultipleAuthors.txt
> >>
> >>
> >
> >In that case I'll continue to work on a patch to reflect this in the
> >Cite inline, which currently can express different options per Citation
> >in a single Cite. If we don't want to support this, we should not give
> >any reader the possibility to generate such a native form. So I will
> >provide a patch changing
> >
> >Cite [Citations] [Inline]
> >
> >to
> >
> >Cite CiteOptions [Citations] [Inline]
> >
> >and move the variants from the Citation type to the CiteOptions type.
> >
> >
> 
> I just send a pull request on github with my changes. My citeproc
> branch is at http://github.com/xabbu42/pandoc/tree/citeproc, in case
> others want to see what I'm doing. Andrea, do you have your citeproc
> branch somewhere accessible?
> 
> I need this changes for the natbib and biblatex writer and reader,
> which are my next top priority.
> 
> If I get around to it, I'd like to simplify the Cite inline a bit,
> as it has some data which I consider an implementation detail and
> which should not be exposed in the native pandoc format (namely
> citationNoteNum and citationHash). Is this a sensible feeling or
> something stupid to try?

Thanks.  Let's get Andrea's feedback about your proposed
changes first, and also about the idea to simplify Cite.
The idea seems reasonable to me, but Andrea has a much better
grip on what is needed in the code than I do.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                     ` <20101104160942.GF25944-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-05 10:33                       ` Andrea Rossato
       [not found]                         ` <20101105103345.GD6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-05 10:33 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Thu, Nov 04, 2010 at 09:09:42AM -0700, John MacFarlane wrote:
> +++ Andrea Rossato [Nov 04 10 15:54 ]:
> > I was wrong on this point: suppress-author refers to individual
> > citations (the cited test is to check whether the collapsing is
> > handled correctly).
> > 
> > Anyway, I don't see the relation with your grammar: if I understand it
> > correctly, suppress-author works on a per citation basis too. am I
> > right?
> 
> I just can't think of a case where you'd want a list of citations
> with some suppress-author and others not, e.g.:
> 
> (Doe 1999, 30; Smith 2004; 2008, 30)
> 
> So, even if in principle citeproc can do this, it seems better for
> our representation to put the flag on the Cite rather than on
> individual Citations.

This is a simple example:

  This is a citation with the suppress-author bit set as proposed by
  Doe [-@item1, p. 4; but see also @item3, chap. 2].

In a in-text style, suppose chicago-author-date, that would be
rendered as:

  This is a citation with the suppress-author bit set as proposed by
  Doe (2005a, 4; but see also Doe and Roe 2007, chap. 2).

But, and this is the cool part, in a note style, like
chicago-fullnote-bibliography-bb, that would become:

  This is a citation with the suppress-author bit set as proposed by
  Doe.(5)

  (5) First Book, 4; but see also Doe and Roe, Third Book, chap. 2.

This makes perfectly sense to me. But there would be no way to achieve
such a result with the syntax you are proposing (a single '-' right
after the '[').

So, if I understand it correctly, you think that flexibility should be
dropped for the sake of the syntax, because a single per-Cite modifier
is better.

I disagree: I would prefer a per-citation modifier, to make the
citeproc expressiveness fully available (and I have the feeling that
this kind of flexibility would be asked for by users). I understand
that rendering this citation model with natbib would be troublesome,
but this is due to the fact that CSL is far more powerful and
expressive than natbib (I'll elaborate a bit when commenting Nathan's
code).

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                     ` <4CD33BD6.6050703-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  2010-11-05  7:00                       ` John MacFarlane
@ 2010-11-05 11:36                       ` Andrea Rossato
       [not found]                         ` <20101105113647.GE6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-05 11:36 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Fri, Nov 05, 2010 at 12:03:50AM +0100, Nathan Gass wrote:
> On 04.11.10 16:23, Nathan Gass wrote:
> >On 04.11.10 12:48, Andrea Rossato wrote:
> >>On Thu, Nov 04, 2010 at 12:36:07AM -0700, John MacFarlane wrote:
> >>>+++ Nathan Gass [Nov 04 10 00:39 ]:
> >>>>On 27.10.10 15:20, Andrea Rossato wrote:
> >>>>>Hi,
> >>>>>
> >>>>>I just wanted to briefly report on the recent citeproc-hs updates.
> >>>>>
> >>>>>As the ones of you interested in the project remember, last spring I
> >>>>>announced I was working on updating my CSL implementation to support
> >>>>>CSL-1.0 and ot add a new extended API to allow a richer citation
> >>>>>markup for pandoc.
> >>>>>
> >>>>>I was not able to finish quickly but I kept slowly working on it and
> >>>>>now I've just pushed a few patches that bring us closer to the goal:
> >>>>>
> >>>>>* the new pandoc API is ready and I've also some code for the pandoc
> >>>>>side to implement citation prefixes, and modifiers to print the
> >>>>>author only or to suppress it in the citation. Some other
> >>>>>modifiers could be included (something like a \nocite would be
> >>>>>very useful for me);
> >>>>
> >>>>These modifieres currently are parsed and stored by citation key, so
> >>>>`[+@doe99; -@jones90]` gets parsed as one Cite inline having two
> >>>>citations, one author-only and one author-less. Do we really want to
> >>>>support this?
> >>>
> >>>My own view is that
> >>>
> >>>(a) we can get by with just one modifier (the -), and
> >>>
> >>>(b) this should modify the whole list of citations (the Cite inline),
> >>>rather than individual citations. (See my sample grammar from earlier
> >>>in this thread.)
> >>>
> >>>Do others agree?
> >>
> >>I agree. This is the way the "suppress-author" modifier is supposed to
> >>work: in a cite with multiple citations, if one has the
> >>suppress-author bit set, all citations in the cite will be rendered
> >>without the author. See:
> >>http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_SuppressMultipleAuthors.txt

As I said I was wrong on that.

> >In that case I'll continue to work on a patch to reflect this in the
> >Cite inline, which currently can express different options per Citation
> >in a single Cite. If we don't want to support this, we should not give
> >any reader the possibility to generate such a native form. So I will
> >provide a patch changing
> >
> >Cite [Citations] [Inline]
> >
> >to
> >
> >Cite CiteOptions [Citations] [Inline]
> >
> >and move the variants from the Citation type to the CiteOptions type.
> >
> >
> 
> I just send a pull request on github with my changes. My citeproc
> branch is at http://github.com/xabbu42/pandoc/tree/citeproc, in case
> others want to see what I'm doing. Andrea, do you have your citeproc
> branch somewhere accessible?
> 
> I need this changes for the natbib and biblatex writer and reader,
> which are my next top priority.

I had a look at the code: you are reducing the information of the Cite
constructor and introducing a redundant CitationVariant type. That
would make sense if we want to stick with the inflexible, natbib-like,
citation model which let you modify only the citation group as a
whole. The problem is that, if you then want to add back some
flexibility at a latter time, you'll need to change the Cite
constructor again - and modify all the writers.

Moreover, when feeding the citeproc, the lost information (individual
citation modifiers) will be taken from the citation group. Indeed I'm
not going to modify the citeproc API: even if not formally defined in
a document, the API has been discussed by the CSL guys and included in
the test-suite (and Frank Bennett's citepro-js being the reference
implementation even with regards to the API). We agreed to discuss API
changes and additions so that to preserve some unity among different
implementations (this was the topic I was asking Bruce to specifically
comment, if possible, in a message at the beginning of this thread).

I think that the present configuration of the Cite constructor gives
you all you need:

CitationVariant can be entirely derived from [Citations]:

  - authorOnlyCitation = or . map citationAutOnly
  - noAuthorCitation   = or . map citationNoAut
  - normalCitation  cs = not (authorOnlyCitation cs || noAuthorCitation cs)

Moving in the Cite the note number makes sense, but I do not think
worth the effort (also because it must be set in each citation again
before feeding the citeproc).

When reading an author-only natbib cite you just set every citation's
citationAutOnly bit, and so on.

The problem comes when you need to write natbib from a Cite. I'm not
familiar with natbib enough, but I think you could be using the extra
information to provide multiple cites, dividing the citation group
(the Cite inline) opportunely when needed.

Anyway I hardly find this a good argument for reducing the markdown
syntax expressiveness. CSL has been develop and design - in the course
of many years of discussion I should add - to be a flexible and
powerful citation style language usable for formatting a wide range of
sources and for the use in any possible discipline. I don't know
whether it is successful or not, but its aim is to be more powerful
and flexible than bibtex. I think that the features you are proposing
to drop will be requested by pandoc users who have already been
exposed to CSL - this is indeed the reason why I coded them... ;-)

> If I get around to it, I'd like to simplify the Cite inline a bit,
> as it has some data which I consider an implementation detail and
> which should not be exposed in the native pandoc format (namely
> citationNoteNum and citationHash). Is this a sensible feeling or
> something stupid to try?
> 

The citationHash is used in the Eq instance for the Citation type and
is needed to uniquely identify citations to manipulate them
(positioning is sensitive in CSL and so two identical citations may be
rendered differently if located in different (relative) positions).
For the note-number, I have the feeling that getting rid of it would
require some more processing - and affect the performance (I remember
taking that into account when coding it, but I could be plainly
wrong). Anyway, if you want to give it a try and come up with
something cleaner I'd have no objections, far from it!

You were asking where to find my code: since I have little familiarity
with git, I have no branched/forked code on github: I just send
patches here and John pushes them in the citeproc branch that comes
with his tree.

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                     ` <1df2d027-9220-45b1-8126-1b0965bd7836-s+NOhRKKP/7FX/zIJQasLWB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
@ 2010-11-05 13:36                       ` Andrea Rossato
       [not found]                         ` <20101105133608.GG6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2010-11-08  9:07                       ` Nathan Gass
  1 sibling, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-05 13:36 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Hi Frank,

thanks for stopping by.

On Thu, Nov 04, 2010 at 05:41:24PM -0700, Frank Bennett wrote:
> Hi, all.  I'm a colleague of Andrea and Bruce on the CSL/citeproc side
> (as the author of citeproc-js), and am just now joining the
> discussion.
> 
> On Nov 5, 1:06 am, John MacFarlane <fiddlosop...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> 
> > (In fact, I'm still somewhat attached to my proposal to make pandoc figure out
> > automatically when to use a textual citation and when not ... does anybody
> > else like that idea?)
> 
> I can offer a further use case for "textual citations" that might be
> relevant.
> 
> A common citation style in China is purely numeric, using numbers for
> all references, including "textual citations".  As a base example,
> suppose a text is set as follows in an author-date style:
> 
>   Jones (2000) says that water is wet, which has been confirmed by
> others. (Smith 1999)
> 
> In the pure numeric style, this would be rendered like this (^
> signifying superscript -- note that the "ordinary" type of reference
> is superscripted, while the "author-only" type of reference is not):
> 
>   Reference [1] establishes that water is wet, a finding that has been
> refined by others.^[2]^
> 
>   Bibliography
>   [1] John Jones, "Wetness of Water" (2000).
>   [2] Samuel Smith, "Dampness of Water" (1999).
> 
> In a note style, the same text would be rendered something like this:
> 
>   Jones^1^ establishes that water is wet, a finding that has been
> refined by others.^2^
> 
>   Footnotes
>   1. "Wetness of Water" (2000).
>   2. Samuel Smith, "Dampness of Water" (1999).
> 
> I won't intrude on the syntax proposals made upthread, but it is
> possible to handle these various transformations by tracking the use
> of author-only and suppress-author toggles in citeproc, and applying
> rules like the following:

This is another example of the utility of a per-citation modifier and
the level of expressiveness you can get out of it.
 
> (1) Cites with neither author-only nor suppress-author render in the
> normal way.
> 
> (2) An author-only cite always renders in the main text.  Style-
> supplied formatting (such as italics or superscripting) are always
> omitted.  If the style provides only the citation number as the cite,
> then a style-provided term is set before the cite in the text.
> 
> (3) If an author-only cite is followed by a suppress-author cite,
> then:
>   (a) If the style provides only the citation number as the cite, the
> suppress-author cite renders nothing.
>   (b) Otherwise, the cite is rendered in the normal way, with the
> author suppressed.
> 
> This puts a significant additional burden on citeproc, so (thinking of
> Andrea's position) it's probably not something to target in the short
> term. But as the use case is lurking out there, preserving a pathway
> in the document syntax that provides the necessary hints for
> implementation might save a bit of pain down the road.
> 
> There are a couple of processor tests that exercise the relevant
> conditions here:
> 
> http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_AuthorDateAuthorOnlyThenSuppressAuthor.txt
> 
> http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_CitationNumberAuthorOnlyThenSuppressAuthor.txt

Well, part of the job has already been done. Actually citeproc-hs
passes 5 of the 6 tests of the "discretionary" group. Only this last
one was left out since I didn't get it at first sight (I was planning
to come back to it at a latter time since I know it is documented in
the citeproc-js manual).

Andrea

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                         ` <20101105113647.GE6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-05 15:12                           ` Nathan Gass
       [not found]                             ` <4CD41EC1.7090100-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-05 15:12 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 05.11.10 12:36, Andrea Rossato wrote:
> On Fri, Nov 05, 2010 at 12:03:50AM +0100, Nathan Gass wrote:
>> On 04.11.10 16:23, Nathan Gass wrote:
>>> On 04.11.10 12:48, Andrea Rossato wrote:
>>>> On Thu, Nov 04, 2010 at 12:36:07AM -0700, John MacFarlane wrote:
>>>>> +++ Nathan Gass [Nov 04 10 00:39 ]:
>>>>>> On 27.10.10 15:20, Andrea Rossato wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I just wanted to briefly report on the recent citeproc-hs updates.
>>>>>>>
>>>>>>> As the ones of you interested in the project remember, last spring I
>>>>>>> announced I was working on updating my CSL implementation to support
>>>>>>> CSL-1.0 and ot add a new extended API to allow a richer citation
>>>>>>> markup for pandoc.
>>>>>>>
>>>>>>> I was not able to finish quickly but I kept slowly working on it and
>>>>>>> now I've just pushed a few patches that bring us closer to the goal:
>>>>>>>
>>>>>>> * the new pandoc API is ready and I've also some code for the pandoc
>>>>>>> side to implement citation prefixes, and modifiers to print the
>>>>>>> author only or to suppress it in the citation. Some other
>>>>>>> modifiers could be included (something like a \nocite would be
>>>>>>> very useful for me);
>>>>>>
>>>>>> These modifieres currently are parsed and stored by citation key, so
>>>>>> `[+@doe99; -@jones90]` gets parsed as one Cite inline having two
>>>>>> citations, one author-only and one author-less. Do we really want to
>>>>>> support this?
>>>>>
>>>>> My own view is that
>>>>>
>>>>> (a) we can get by with just one modifier (the -), and
>>>>>
>>>>> (b) this should modify the whole list of citations (the Cite inline),
>>>>> rather than individual citations. (See my sample grammar from earlier
>>>>> in this thread.)
>>>>>
>>>>> Do others agree?
>>>>
>>>> I agree. This is the way the "suppress-author" modifier is supposed to
>>>> work: in a cite with multiple citations, if one has the
>>>> suppress-author bit set, all citations in the cite will be rendered
>>>> without the author. See:
>>>> http://bitbucket.org/bdarcus/citeproc-test/src/tip/processor-tests/humans/discretionary_SuppressMultipleAuthors.txt
>
> As I said I was wrong on that.
>
>>> In that case I'll continue to work on a patch to reflect this in the
>>> Cite inline, which currently can express different options per Citation
>>> in a single Cite. If we don't want to support this, we should not give
>>> any reader the possibility to generate such a native form. So I will
>>> provide a patch changing
>>>
>>> Cite [Citations] [Inline]
>>>
>>> to
>>>
>>> Cite CiteOptions [Citations] [Inline]
>>>
>>> and move the variants from the Citation type to the CiteOptions type.
>>>
>>>
>>
>> I just send a pull request on github with my changes. My citeproc
>> branch is at http://github.com/xabbu42/pandoc/tree/citeproc, in case
>> others want to see what I'm doing. Andrea, do you have your citeproc
>> branch somewhere accessible?
>>
>> I need this changes for the natbib and biblatex writer and reader,
>> which are my next top priority.
>
> I had a look at the code: you are reducing the information of the Cite
> constructor and introducing a redundant CitationVariant type. That
> would make sense if we want to stick with the inflexible, natbib-like,
> citation model which let you modify only the citation group as a
> whole. The problem is that, if you then want to add back some
> flexibility at a latter time, you'll need to change the Cite
> constructor again - and modify all the writers.

Yeah, I want to get it right at the start too. I was under the 
impression that my implementation is equivalent to yours based on your 
earlier assertion that setting a modifier on one Citation enables it for 
the hole group. As you later corrected that assertion, I was of course 
wrong.

You have given a valid use-case for making only the first citation 
author-less. Are there use-cases for multiple author-less citations in 
one cite group? And for cite groups with start with a normal citations 
but contains author-less citations in the middle or at the end? If the 
answer to this question is no, I'd propose to use my changes, with one 
more change I'll push shortly, such that only the first citation is 
author-less in any citation group.

>
> Moreover, when feeding the citeproc, the lost information (individual
> citation modifiers) will be taken from the citation group. Indeed I'm
> not going to modify the citeproc API: even if not formally defined in
> a document, the API has been discussed by the CSL guys and included in
> the test-suite (and Frank Bennett's citepro-js being the reference
> implementation even with regards to the API). We agreed to discuss API
> changes and additions so that to preserve some unity among different
> implementations (this was the topic I was asking Bruce to specifically
> comment, if possible, in a message at the beginning of this thread).
>
> I think that the present configuration of the Cite constructor gives
> you all you need:

My problem was that it gives you too much. I'd like as much as possible 
that sensible Cites are enforced by the type system. How does a citation 
render with citationAutOnly *and* citationNoAut both set? And do we 
actually want to have this strange possibility? If not my Cite data type 
enforces that they can not be set together, where as yours does not. 
That means any code using Cites (so any tool using pandoc as a library 
to manipulate citations) has to consider this possibility.

>
> CitationVariant can be entirely derived from [Citations]:
>
>    - authorOnlyCitation = or . map citationAutOnly
>    - noAuthorCitation   = or . map citationNoAut
>    - normalCitation  cs = not (authorOnlyCitation cs || noAuthorCitation cs)
>
> Moving in the Cite the note number makes sense, but I do not think
> worth the effort (also because it must be set in each citation again
> before feeding the citeproc).

Ideally, the note number would not be tracked in the Cite at all, for 
similar reasons as above. The native pandoc format is an api of some 
sort. Currently you have to explain to anyone using pandoc to manipulate 
citations that they can just ignore the note number (and the hash), as 
they get overwritten later anyway. It would be cleaner if this stuff is 
moved outside of the Cite inline (same is true for the rendered 
citations I believe).

>
> When reading an author-only natbib cite you just set every citation's
> citationAutOnly bit, and so on.
>
> The problem comes when you need to write natbib from a Cite. I'm not
> familiar with natbib enough, but I think you could be using the extra
> information to provide multiple cites, dividing the citation group
> (the Cite inline) opportunely when needed.
>
> Anyway I hardly find this a good argument for reducing the markdown
> syntax expressiveness. CSL has been develop and design - in the course
> of many years of discussion I should add - to be a flexible and
> powerful citation style language usable for formatting a wide range of
> sources and for the use in any possible discipline. I don't know
> whether it is successful or not, but its aim is to be more powerful
> and flexible than bibtex. I think that the features you are proposing
> to drop will be requested by pandoc users who have already been
> exposed to CSL - this is indeed the reason why I coded them... ;-)

I completely agree with this sentiment and did not want to imply that we 
should consider natbib or biblatex limitations in the markdown syntax or 
the citeproc implementation. The reason I can't start with the 
natbib/biblatex code before this stuff is ironed out and stable, is 
simply that the expressiveness of the native pandoc format, and how this 
expressiveness is achieved, is important for my code and I did not want 
to write something I have to significantly change later.

For example, with your current code, I have to test how citeproc-hs 
renders a Citation with both modifiers set and try to imitate this in 
natbib/biblatex. With my current code there is no such possibility as 
enforced by my Cite definition and I therefore don't have to consider it 
in my natbib or biblatex writer.

I personally take citeproc-hs itself, API and results, as given and are 
not implying in any way this should change. I actually don't care that 
much what citeproc-hs does with a author-less-author-only citation as 
long as I don't have to consider this possibility in my code.

>
>> If I get around to it, I'd like to simplify the Cite inline a bit,
>> as it has some data which I consider an implementation detail and
>> which should not be exposed in the native pandoc format (namely
>> citationNoteNum and citationHash). Is this a sensible feeling or
>> something stupid to try?
>>
>
> The citationHash is used in the Eq instance for the Citation type and
> is needed to uniquely identify citations to manipulate them
> (positioning is sensitive in CSL and so two identical citations may be
> rendered differently if located in different (relative) positions).
> For the note-number, I have the feeling that getting rid of it would
> require some more processing - and affect the performance (I remember
> taking that into account when coding it, but I could be plainly
> wrong). Anyway, if you want to give it a try and come up with
> something cleaner I'd have no objections, far from it!

Ok, I'll see if I get something to work. After all, I want to learn 
haskell for real ;-).

>
> You were asking where to find my code: since I have little familiarity
> with git, I have no branched/forked code on github: I just send
> patches here and John pushes them in the citeproc branch that comes
> with his tree.

Ok, this works for me.

Nathan

>
> Andrea
>


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                         ` <20101105103345.GD6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-05 16:41                           ` John MacFarlane
       [not found]                             ` <20101105164134.GB582-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-05 16:41 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 05 10 11:33 ]:
> On Thu, Nov 04, 2010 at 09:09:42AM -0700, John MacFarlane wrote:
> > +++ Andrea Rossato [Nov 04 10 15:54 ]:
> > > I was wrong on this point: suppress-author refers to individual
> > > citations (the cited test is to check whether the collapsing is
> > > handled correctly).
> > > 
> > > Anyway, I don't see the relation with your grammar: if I understand it
> > > correctly, suppress-author works on a per citation basis too. am I
> > > right?
> > 
> > I just can't think of a case where you'd want a list of citations
> > with some suppress-author and others not, e.g.:
> > 
> > (Doe 1999, 30; Smith 2004; 2008, 30)
> > 
> > So, even if in principle citeproc can do this, it seems better for
> > our representation to put the flag on the Cite rather than on
> > individual Citations.
> 
> This is a simple example:
> 
>   This is a citation with the suppress-author bit set as proposed by
>   Doe [-@item1, p. 4; but see also @item3, chap. 2].
> 
> In a in-text style, suppose chicago-author-date, that would be
> rendered as:
> 
>   This is a citation with the suppress-author bit set as proposed by
>   Doe (2005a, 4; but see also Doe and Roe 2007, chap. 2).
> 
> But, and this is the cool part, in a note style, like
> chicago-fullnote-bibliography-bb, that would become:
> 
>   This is a citation with the suppress-author bit set as proposed by
>   Doe.(5)
> 
>   (5) First Book, 4; but see also Doe and Roe, Third Book, chap. 2.
> 
> This makes perfectly sense to me. But there would be no way to achieve
> such a result with the syntax you are proposing (a single '-' right
> after the '[').
> 
> So, if I understand it correctly, you think that flexibility should be
> dropped for the sake of the syntax, because a single per-Cite modifier
> is better.
> 
> I disagree: I would prefer a per-citation modifier, to make the
> citeproc expressiveness fully available (and I have the feeling that
> this kind of flexibility would be asked for by users). I understand
> that rendering this citation model with natbib would be troublesome,
> but this is due to the fact that CSL is far more powerful and
> expressive than natbib (I'll elaborate a bit when commenting Nathan's
> code).

OK, you've persuaded me.  Let's keep the suppress-author bit on the
individual citations.

The example also shows that prefixes need to be per-citation --
another problem with my proposed grammar. I've put a revised
version here: http://gitit.net/PandocCitationGrammar

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                             ` <4CD41EC1.7090100-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-05 16:59                               ` Andrea Rossato
       [not found]                                 ` <20101105165947.GH6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-05 16:59 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Fri, Nov 05, 2010 at 04:12:01PM +0100, Nathan Gass wrote:
> On 05.11.10 12:36, Andrea Rossato wrote:
> >I had a look at the code: you are reducing the information of the Cite
> >constructor and introducing a redundant CitationVariant type. That
> >would make sense if we want to stick with the inflexible, natbib-like,
> >citation model which let you modify only the citation group as a
> >whole. The problem is that, if you then want to add back some
> >flexibility at a latter time, you'll need to change the Cite
> >constructor again - and modify all the writers.
> 
> Yeah, I want to get it right at the start too. I was under the
> impression that my implementation is equivalent to yours based on
> your earlier assertion that setting a modifier on one Citation
> enables it for the hole group. As you later corrected that
> assertion, I was of course wrong.
> 
> You have given a valid use-case for making only the first citation
> author-less. Are there use-cases for multiple author-less citations
> in one cite group? And for cite groups with start with a normal
> citations but contains author-less citations in the middle or at the
> end? If the answer to this question is no, I'd propose to use my
> changes, with one more change I'll push shortly, such that only the
> first citation is author-less in any citation group.

Why only the first citation?

    This is a citation with the suppress-author bit set as proposed by
    Doe [-@item1, p. 4; but see also @item3, chap. 2; cfr. also what
    Doe said in -@item2; see also @item4].

With chicago-author-date his is rendered as:

    This is a citation with the suppress-author bit set as proposed by
    Doe (2005a, 4; but see also Doe and Roe 2007, chap. 2; cfr. also
    what Doe said in 2006b; see also Red 2007).

With chicago-fullnote:

    This is a citation with the suppress-author bit set as proposed by
    Doe.(5)

    (5) First Book, 4; but see also Doe and Roe, Third Book, chap. 2;
    cfr. also what Doe said in Second Book; see also Margaret Red,
    Third Book, 2007.

Legal scholars like me and Frank may be writing very very long
citation groups... are you going to hard-code every possible
combination?
;-)

> >Moreover, when feeding the citeproc, the lost information (individual
> >citation modifiers) will be taken from the citation group. Indeed I'm
> >not going to modify the citeproc API: even if not formally defined in
> >a document, the API has been discussed by the CSL guys and included in
> >the test-suite (and Frank Bennett's citepro-js being the reference
> >implementation even with regards to the API). We agreed to discuss API
> >changes and additions so that to preserve some unity among different
> >implementations (this was the topic I was asking Bruce to specifically
> >comment, if possible, in a message at the beginning of this thread).
> >
> >I think that the present configuration of the Cite constructor gives
> >you all you need:
> 
> My problem was that it gives you too much. I'd like as much as
> possible that sensible Cites are enforced by the type system. How
> does a citation render with citationAutOnly *and* citationNoAut both
> set? And do we actually want to have this strange possibility? If
> not my Cite data type enforces that they can not be set together,
> where as yours does not. That means any code using Cites (so any
> tool using pandoc as a library to manipulate citations) has to
> consider this possibility.

Data consistency cannot always be achieved by the type system, but can
also be enforced by the parsers, or by the citeproc. Anyway, your
specific example is indeed sounded and so, within a citation, we could
have a CiteMode = AuthorOnly | SuppressAuthor

It is good style, even though it doesn't add very much to the code
clarity. Data consistency is already enforced by the parser.

> >
> >CitationVariant can be entirely derived from [Citations]:
> >
> >   - authorOnlyCitation = or . map citationAutOnly
> >   - noAuthorCitation   = or . map citationNoAut
> >   - normalCitation  cs = not (authorOnlyCitation cs || noAuthorCitation cs)
> >
> >Moving in the Cite the note number makes sense, but I do not think
> >worth the effort (also because it must be set in each citation again
> >before feeding the citeproc).
> 
> Ideally, the note number would not be tracked in the Cite at all,
> for similar reasons as above. The native pandoc format is an api of
> some sort. Currently you have to explain to anyone using pandoc to
> manipulate citations that they can just ignore the note number (and
> the hash), as they get overwritten later anyway. It would be cleaner
> if this stuff is moved outside of the Cite inline (same is true for
> the rendered citations I believe).

processing the citations and generating the footnotes is quite tricky
and we should try to avoid to many generic traversal, which can have
quite an impact when dealing with large documents (a book?).

Footnote generation must be done when the citations have been
rendered, but rendering a citation requires knowing the footnote
number the citation occurs in. So, with the first traversal we
generate to footnotes to set the number in each Citation. We render
the citation with the citeproc and we update each Cite with the
generated output (hence the need of the hash to find the right Cite).
Then we make the last traversal to generate the footnotes where
needed, fixing the output (adding a final period, capitalizing the
first letter, etc.).

> 
> >
> >When reading an author-only natbib cite you just set every citation's
> >citationAutOnly bit, and so on.
> >
> >The problem comes when you need to write natbib from a Cite. I'm not
> >familiar with natbib enough, but I think you could be using the extra
> >information to provide multiple cites, dividing the citation group
> >(the Cite inline) opportunely when needed.
> >
> >Anyway I hardly find this a good argument for reducing the markdown
> >syntax expressiveness. CSL has been develop and design - in the course
> >of many years of discussion I should add - to be a flexible and
> >powerful citation style language usable for formatting a wide range of
> >sources and for the use in any possible discipline. I don't know
> >whether it is successful or not, but its aim is to be more powerful
> >and flexible than bibtex. I think that the features you are proposing
> >to drop will be requested by pandoc users who have already been
> >exposed to CSL - this is indeed the reason why I coded them... ;-)
> 
> I completely agree with this sentiment and did not want to imply
> that we should consider natbib or biblatex limitations in the
> markdown syntax or the citeproc implementation. The reason I can't
> start with the natbib/biblatex code before this stuff is ironed out
> and stable, is simply that the expressiveness of the native pandoc
> format, and how this expressiveness is achieved, is important for my
> code and I did not want to write something I have to significantly
> change later.

I agree with that. But my code is almost done: I may need some
addition in the Citation record, some fields may be cleaned up, etc.,
but this is it.

> For example, with your current code, I have to test how citeproc-hs
> renders a Citation with both modifiers set and try to imitate this
> in natbib/biblatex. With my current code there is no such
> possibility as enforced by my Cite definition and I therefore don't
> have to consider it in my natbib or biblatex writer.
>
> I personally take citeproc-hs itself, API and results, as given and
> are not implying in any way this should change. I actually don't
> care that much what citeproc-hs does with a author-less-author-only
> citation as long as I don't have to consider this possibility in my
> code.

I'm not following you here. Why do you need to run the processor for
rendering your stuff? I'm totally lost here: I just supposed you were
translating the markdown syntax into natbib \cite* commands (and
reading them into Cite). What am I missing?

> >The citationHash is used in the Eq instance for the Citation type and
> >is needed to uniquely identify citations to manipulate them
> >(positioning is sensitive in CSL and so two identical citations may be
> >rendered differently if located in different (relative) positions).
> >For the note-number, I have the feeling that getting rid of it would
> >require some more processing - and affect the performance (I remember
> >taking that into account when coding it, but I could be plainly
> >wrong). Anyway, if you want to give it a try and come up with
> >something cleaner I'd have no objections, far from it!
> 
> Ok, I'll see if I get something to work. After all, I want to learn
> haskell for real ;-).

Cool! Be careful, though: Haskell may be highly addictive...

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                         ` <20101105133608.GG6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-05 17:01                           ` John MacFarlane
       [not found]                             ` <20101105170119.GC582-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-05 17:01 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 05 10 14:36 ]:
> Hi Frank,
> 
> thanks for stopping by.
> 
> On Thu, Nov 04, 2010 at 05:41:24PM -0700, Frank Bennett wrote:
> > Hi, all.  I'm a colleague of Andrea and Bruce on the CSL/citeproc side
> > (as the author of citeproc-js), and am just now joining the
> > discussion.
> > 
> > On Nov 5, 1:06 am, John MacFarlane <fiddlosop...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > 
> > > (In fact, I'm still somewhat attached to my proposal to make pandoc figure out
> > > automatically when to use a textual citation and when not ... does anybody
> > > else like that idea?)
> > 
> > I can offer a further use case for "textual citations" that might be
> > relevant.
> > 
> > A common citation style in China is purely numeric, using numbers for
> > all references, including "textual citations".  As a base example,
> > suppose a text is set as follows in an author-date style:
> > 
> >   Jones (2000) says that water is wet, which has been confirmed by
> > others. (Smith 1999)
> > 
> > In the pure numeric style, this would be rendered like this (^
> > signifying superscript -- note that the "ordinary" type of reference
> > is superscripted, while the "author-only" type of reference is not):
> > 
> >   Reference [1] establishes that water is wet, a finding that has been
> > refined by others.^[2]^
> > 
> >   Bibliography
> >   [1] John Jones, "Wetness of Water" (2000).
> >   [2] Samuel Smith, "Dampness of Water" (1999).
> > 
> > In a note style, the same text would be rendered something like this:
> > 
> >   Jones^1^ establishes that water is wet, a finding that has been
> > refined by others.^2^
> > 
> >   Footnotes
> >   1. "Wetness of Water" (2000).
> >   2. Samuel Smith, "Dampness of Water" (1999).
> > 
> > I won't intrude on the syntax proposals made upthread, but it is
> > possible to handle these various transformations by tracking the use
> > of author-only and suppress-author toggles in citeproc, and applying
> > rules like the following:
> 
> This is another example of the utility of a per-citation modifier and
> the level of expressiveness you can get out of it.

But another thing Frank's example shows is that there may be an independent
need for a "textual citation" variant.  We were previously thinking that
we could fake this by just writing

Jones [@jones99]

in the markdown, but that wouldn't give the desired result in the style
Frank describes above.

However, since textual citations aren't yet supported in CSL, I'm
not sure what to do about this.  As Frank suggests, it might be
a reason to have a pandoc syntax contsruct that could later be
hooked up to textual citation variants.  And, as Nathan has suggested,
it would be convenient to represent textual citations as such in
our pandoc structure, even if they need to be converted to something
else for processing by citeproc-hs.

As for syntax, I did have a thought about this. First, it seems to me that
when you're doing a textual citation, you won't have a list of citations. It
would be quite awkward to say:

Jones (1999); Smith (2000); Xu (2001) say that blah.

So what about the following syntax for textual citations?
(No funny characters needed!)

@jones1999 says that water is wet.  @smith2000 [p. 30] confirms this,
as does @xu2001.  Others have disagreed [@taylor1999, p. 30; @rhodes00].

This would render in inline author-date as:

Jones (1999) says that water is wet. Smith (2000, 30) confirms this,
as does Xu 2001.  Others have disagreed (Taylor 1999, 30; Rhodes 2000).

And in footnote style as:

Jones^1 says that water is wet. Smith^2 confirms this,
as does Xu^3.  Others have disagreed^4.

John

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                 ` <20101105165947.GH6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-05 17:06                                   ` John MacFarlane
  2010-11-05 17:36                                   ` Nathan Gass
  1 sibling, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-05 17:06 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

> > My problem was that it gives you too much. I'd like as much as
> > possible that sensible Cites are enforced by the type system. How
> > does a citation render with citationAutOnly *and* citationNoAut both
> > set? And do we actually want to have this strange possibility? If
> > not my Cite data type enforces that they can not be set together,
> > where as yours does not. That means any code using Cites (so any
> > tool using pandoc as a library to manipulate citations) has to
> > consider this possibility.
> 
> Data consistency cannot always be achieved by the type system, but can
> also be enforced by the parsers, or by the citeproc. Anyway, your
> specific example is indeed sounded and so, within a citation, we could
> have a CiteMode = AuthorOnly | SuppressAuthor
> 
> It is good style, even though it doesn't add very much to the code
> clarity. Data consistency is already enforced by the parser.

I agree that it makes good sense to have it enforced as much
as possible by the types, even if it is enforced by the parser.
(After all, some people might want to create a native pandoc
structure themselves, without parsing from markdown.)

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                 ` <20101105165947.GH6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2010-11-05 17:06                                   ` John MacFarlane
@ 2010-11-05 17:36                                   ` Nathan Gass
       [not found]                                     ` <4CD44082.8000205-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-05 17:36 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 05.11.10 17:59, Andrea Rossato wrote:
> On Fri, Nov 05, 2010 at 04:12:01PM +0100, Nathan Gass wrote:
>> On 05.11.10 12:36, Andrea Rossato wrote:
>>> I had a look at the code: you are reducing the information of the Cite
>>> constructor and introducing a redundant CitationVariant type. That
>>> would make sense if we want to stick with the inflexible, natbib-like,
>>> citation model which let you modify only the citation group as a
>>> whole. The problem is that, if you then want to add back some
>>> flexibility at a latter time, you'll need to change the Cite
>>> constructor again - and modify all the writers.
>>
>> Yeah, I want to get it right at the start too. I was under the
>> impression that my implementation is equivalent to yours based on
>> your earlier assertion that setting a modifier on one Citation
>> enables it for the hole group. As you later corrected that
>> assertion, I was of course wrong.
>>
>> You have given a valid use-case for making only the first citation
>> author-less. Are there use-cases for multiple author-less citations
>> in one cite group? And for cite groups with start with a normal
>> citations but contains author-less citations in the middle or at the
>> end? If the answer to this question is no, I'd propose to use my
>> changes, with one more change I'll push shortly, such that only the
>> first citation is author-less in any citation group.
>
> Why only the first citation?
>
>      This is a citation with the suppress-author bit set as proposed by
>      Doe [-@item1, p. 4; but see also @item3, chap. 2; cfr. also what
>      Doe said in -@item2; see also @item4].
>
> With chicago-author-date his is rendered as:
>
>      This is a citation with the suppress-author bit set as proposed by
>      Doe (2005a, 4; but see also Doe and Roe 2007, chap. 2; cfr. also
>      what Doe said in 2006b; see also Red 2007).
>
> With chicago-fullnote:
>
>      This is a citation with the suppress-author bit set as proposed by
>      Doe.(5)
>
>      (5) First Book, 4; but see also Doe and Roe, Third Book, chap. 2;
>      cfr. also what Doe said in Second Book; see also Margaret Red,
>      Third Book, 2007.
>
> Legal scholars like me and Frank may be writing very very long
> citation groups... are you going to hard-code every possible
> combination?
> ;-)

No, thats why I asked if this one combination is all that is needed. As 
it is not enough, I agree that it best to handle the variants per citation.

>
>>> Moreover, when feeding the citeproc, the lost information (individual
>>> citation modifiers) will be taken from the citation group. Indeed I'm
>>> not going to modify the citeproc API: even if not formally defined in
>>> a document, the API has been discussed by the CSL guys and included in
>>> the test-suite (and Frank Bennett's citepro-js being the reference
>>> implementation even with regards to the API). We agreed to discuss API
>>> changes and additions so that to preserve some unity among different
>>> implementations (this was the topic I was asking Bruce to specifically
>>> comment, if possible, in a message at the beginning of this thread).
>>>
>>> I think that the present configuration of the Cite constructor gives
>>> you all you need:
>>
>> My problem was that it gives you too much. I'd like as much as
>> possible that sensible Cites are enforced by the type system. How
>> does a citation render with citationAutOnly *and* citationNoAut both
>> set? And do we actually want to have this strange possibility? If
>> not my Cite data type enforces that they can not be set together,
>> where as yours does not. That means any code using Cites (so any
>> tool using pandoc as a library to manipulate citations) has to
>> consider this possibility.
>
> Data consistency cannot always be achieved by the type system, but can
> also be enforced by the parsers, or by the citeproc. Anyway, your
> specific example is indeed sounded and so, within a citation, we could
> have a CiteMode = AuthorOnly | SuppressAuthor
>
> It is good style, even though it doesn't add very much to the code
> clarity. Data consistency is already enforced by the parser.

But there are many parsers, and additionally, somebody could want to 
generate or process native pandoc with some tool. All of them could 
produce inconsistent data if they are buggy. As it is possible to 
enforce this particular consistency, I think it is worth it.

>
>>>
>>> CitationVariant can be entirely derived from [Citations]:
>>>
>>>    - authorOnlyCitation = or . map citationAutOnly
>>>    - noAuthorCitation   = or . map citationNoAut
>>>    - normalCitation  cs = not (authorOnlyCitation cs || noAuthorCitation cs)
>>>
>>> Moving in the Cite the note number makes sense, but I do not think
>>> worth the effort (also because it must be set in each citation again
>>> before feeding the citeproc).
>>
>> Ideally, the note number would not be tracked in the Cite at all,
>> for similar reasons as above. The native pandoc format is an api of
>> some sort. Currently you have to explain to anyone using pandoc to
>> manipulate citations that they can just ignore the note number (and
>> the hash), as they get overwritten later anyway. It would be cleaner
>> if this stuff is moved outside of the Cite inline (same is true for
>> the rendered citations I believe).
>
> processing the citations and generating the footnotes is quite tricky
> and we should try to avoid to many generic traversal, which can have
> quite an impact when dealing with large documents (a book?).
>
> Footnote generation must be done when the citations have been
> rendered, but rendering a citation requires knowing the footnote
> number the citation occurs in. So, with the first traversal we
> generate to footnotes to set the number in each Citation. We render
> the citation with the citeproc and we update each Cite with the
> generated output (hence the need of the hash to find the right Cite).
> Then we make the last traversal to generate the footnotes where
> needed, fixing the output (adding a final period, capitalizing the
> first letter, etc.).
>

Yeah I see the reasons why it is handled as it is.

>> For example, with your current code, I have to test how citeproc-hs
>> renders a Citation with both modifiers set and try to imitate this
>> in natbib/biblatex. With my current code there is no such
>> possibility as enforced by my Cite definition and I therefore don't
>> have to consider it in my natbib or biblatex writer.
>>
>> I personally take citeproc-hs itself, API and results, as given and
>> are not implying in any way this should change. I actually don't
>> care that much what citeproc-hs does with a author-less-author-only
>> citation as long as I don't have to consider this possibility in my
>> code.
>
> I'm not following you here. Why do you need to run the processor for
> rendering your stuff? I'm totally lost here: I just supposed you were
> translating the markdown syntax into natbib \cite* commands (and
> reading them into Cite). What am I missing?

First, I need citeproc as a fallback whenever some citation feature is 
only supported in one format (for example, if we have no textual 
citation in markdown, I need to partially render a \citet found in latex 
to translate it to `Doe [-@doe99]`).

Second the Cite constructor determines what sort of data I need to 
handle, as I'd like my code to work with any pandoc reader and writer 
implementing some yet unknown citation format. They all are confined 
only by the Cite constructor in what citations they generate.

Nathan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                             ` <20101105164134.GB582-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-05 19:50                               ` Andrea Rossato
       [not found]                                 ` <20101105195055.GI6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-05 19:50 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Fri, Nov 05, 2010 at 09:41:34AM -0700, John MacFarlane wrote:
> +++ Andrea Rossato [Nov 05 10 11:33 ]:
> > This is a simple example:
> > 
> >   This is a citation with the suppress-author bit set as proposed by
> >   Doe [-@item1, p. 4; but see also @item3, chap. 2].
> > 
> 
> The example also shows that prefixes need to be per-citation --
> another problem with my proposed grammar. I've put a revised
> version here: http://gitit.net/PandocCitationGrammar

Since we seem to agree to take this path, I think that supporting a
citation suffix would very useful too:

   This is a citation with the suppress-author bit set as proposed by
   Doe [-@item1, p. 4; but see also -@item3, chap. 2, where he seemed
   to suggest the opposite; for another example see @item4].

Do we need multiple locators?

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                 ` <20101105195055.GI6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-05 20:29                                   ` John MacFarlane
       [not found]                                     ` <20101105202938.GA3968-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-05 20:29 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 05 10 20:50 ]:
> On Fri, Nov 05, 2010 at 09:41:34AM -0700, John MacFarlane wrote:
> > +++ Andrea Rossato [Nov 05 10 11:33 ]:
> > > This is a simple example:
> > > 
> > >   This is a citation with the suppress-author bit set as proposed by
> > >   Doe [-@item1, p. 4; but see also @item3, chap. 2].
> > > 
> > 
> > The example also shows that prefixes need to be per-citation --
> > another problem with my proposed grammar. I've put a revised
> > version here: http://gitit.net/PandocCitationGrammar
> 
> Since we seem to agree to take this path, I think that supporting a
> citation suffix would very useful too:
> 
>    This is a citation with the suppress-author bit set as proposed by
>    Doe [-@item1, p. 4; but see also -@item3, chap. 2, where he seemed
>    to suggest the opposite; for another example see @item4].
> 
> Do we need multiple locators?

I'm still a bit confused about how locators are handled.
The citeLocator field is just a string, but I take it that citeproc
parses this somewhere into something more structured (recognizing
page, chapter, etc. according to locale).  Shouldn't citeLocator
be this more structured thing?  If so, it seems that we should
probably allow multiple locators, if this is allowed by CSL.

[@doe99, p. 13, sec. 15]

And I think we should also allow a suffix -- whatever remains after
we've parsed a sequence of locators, and before the next citation.

There's a tricky issue here, though.  If we allow verbose prefixes
and suffixes in the citations, people are going to expect to be
able to format them.  That is, they would most naturally be [Inline],
not String.  They'll even expect to be able to do things like

[See @doe99 for a discussion of the `;` character]

but if we just parse the suffix as a string, we'll take the ';'
as the end of one citation and expect another.  A mess.

I can think of two reasons you might resist using [Inline] for
prefixes and suffixes in citeproc-hs:

1. You want the library to be more general, not pandoc-specific
2. You don't want to inherit pandoc's dependencies
3. We'd have circular dependencies if pandoc requires citeproc
   and citeproc requires pandoc

If (1) is the issue, then one solution would be to use a
type class for prefixes and suffixes, allowing them to be String
or [Inline].

As for (2) and (3), we could perhaps address them by splitting
Text.Pandoc.Definition into its own package.  citeproc would
just have to depend on that, which would not depend on (much)
else.

Any thoughts on this?

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                                     ` <20101105202938.GA3968-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-05 20:52                                       ` Andrea Rossato
       [not found]                                         ` <20101105205212.GJ6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-05 20:52 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Fri, Nov 05, 2010 at 01:29:39PM -0700, John MacFarlane wrote:
> +++ Andrea Rossato [Nov 05 10 20:50 ]:
> > Do we need multiple locators?
> 
> I'm still a bit confused about how locators are handled.
> The citeLocator field is just a string, but I take it that citeproc
> parses this somewhere into something more structured (recognizing
> page, chapter, etc. according to locale).  Shouldn't citeLocator
> be this more structured thing?  If so, it seems that we should
> probably allow multiple locators, if this is allowed by CSL.
> 
> [@doe99, p. 13, sec. 15]

As far as I remember multiple locators are not allowed in CSL, but I
need to check.

> And I think we should also allow a suffix -- whatever remains after
> we've parsed a sequence of locators, and before the next citation.
> 
> There's a tricky issue here, though.  If we allow verbose prefixes
> and suffixes in the citations, people are going to expect to be
> able to format them.  That is, they would most naturally be [Inline],
> not String.  They'll even expect to be able to do things like
> 
> [See @doe99 for a discussion of the `;` character]
> 
> but if we just parse the suffix as a string, we'll take the ';'
> as the end of one citation and expect another.  A mess.

CSL have some rich text formatting which should be usable in affixes.
Parsing them as [Inline] would be fine for me.

> 
> I can think of two reasons you might resist using [Inline] for
> prefixes and suffixes in citeproc-hs:
> 
> 1. You want the library to be more general, not pandoc-specific
> 2. You don't want to inherit pandoc's dependencies
> 3. We'd have circular dependencies if pandoc requires citeproc
>    and citeproc requires pandoc
> 
> If (1) is the issue, then one solution would be to use a
> type class for prefixes and suffixes, allowing them to be String
> or [Inline].
> 
> As for (2) and (3), we could perhaps address them by splitting
> Text.Pandoc.Definition into its own package.  citeproc would
> just have to depend on that, which would not depend on (much)
> else.
> 
> Any thoughts on this?

Text.Pandoc.Definition as a separate package would be *great* for me!
Pandoc is the main output for citeproc. To avoid circular dependencies
I'm force to replicate everything in Text.CSL.Output.Pandoc and then
again in Text.Pandoc.Biblio. I was tempted to suggest such a move. We
would have the Pandoc type and the interface to manipulate it as a
single common package without any duplicate code.

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                             ` <20101105170119.GC582-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-05 21:32                               ` Andrea Rossato
       [not found]                                 ` <20101105213239.GK6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-05 21:32 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Fri, Nov 05, 2010 at 10:01:20AM -0700, John MacFarlane wrote:
> +++ Andrea Rossato [Nov 05 10 14:36 ]:
> > On Thu, Nov 04, 2010 at 05:41:24PM -0700, Frank Bennett wrote:
> > > A common citation style in China is purely numeric, using numbers for
> > > all references, including "textual citations".  As a base example,
> > > suppose a text is set as follows in an author-date style:
> > > 
> > >   Jones (2000) says that water is wet, which has been confirmed by
> > > others. (Smith 1999)
> > > 
> > > In the pure numeric style, this would be rendered like this (^
> > > signifying superscript -- note that the "ordinary" type of reference
> > > is superscripted, while the "author-only" type of reference is not):
> > > 
> > >   Reference [1] establishes that water is wet, a finding that has been
> > > refined by others.^[2]^
> > > 
> > >   Bibliography
> > >   [1] John Jones, "Wetness of Water" (2000).
> > >   [2] Samuel Smith, "Dampness of Water" (1999).
> > > 
> > > In a note style, the same text would be rendered something like this:
> > > 
> > >   Jones^1^ establishes that water is wet, a finding that has been
> > > refined by others.^2^
> > > 
> > >   Footnotes
> > >   1. "Wetness of Water" (2000).
> > >   2. Samuel Smith, "Dampness of Water" (1999).
> > > 
> > > I won't intrude on the syntax proposals made upthread, but it is
> > > possible to handle these various transformations by tracking the use
> > > of author-only and suppress-author toggles in citeproc, and applying
> > > rules like the following:
> > 
> > This is another example of the utility of a per-citation modifier and
> > the level of expressiveness you can get out of it.
> 
> But another thing Frank's example shows is that there may be an independent
> need for a "textual citation" variant.  We were previously thinking that
> we could fake this by just writing
> 
> Jones [@jones99]

(Jones [-@jones99], right?)
> 
> in the markdown, but that wouldn't give the desired result in the style
> Frank describes above.
>
> However, since textual citations aren't yet supported in CSL, I'm
> not sure what to do about this.  As Frank suggests, it might be
> a reason to have a pandoc syntax contsruct that could later be
> hooked up to textual citation variants.  And, as Nathan has suggested,
> it would be convenient to represent textual citations as such in
> our pandoc structure, even if they need to be converted to something
> else for processing by citeproc-hs.
> 
> As for syntax, I did have a thought about this. First, it seems to me that
> when you're doing a textual citation, you won't have a list of citations. It
> would be quite awkward to say:
> 
> Jones (1999); Smith (2000); Xu (2001) say that blah.
> 
> So what about the following syntax for textual citations?
> (No funny characters needed!)
> 
> @jones1999 says that water is wet.  @smith2000 [p. 30] confirms this,
> as does @xu2001.  Others have disagreed [@taylor1999, p. 30; @rhodes00].
> 
> This would render in inline author-date as:
> 
> Jones (1999) says that water is wet. Smith (2000, 30) confirms this,
> as does Xu 2001.  Others have disagreed (Taylor 1999, 30; Rhodes 2000).
> 
> And in footnote style as:
> 
> Jones^1 says that water is wet. Smith^2 confirms this,
> as does Xu^3.  Others have disagreed^4.

Just to clarify, "textual citation" in the example made by Frank means
a citation with the "author-only" bit set. The 3 examples work because
they all use, to render the name of the author, an "author-only"
citation followed by a "suppress-author" citation. In the second
example, only the "author-only" citation produces an output,
'Reference [1]' while the "suppress-author" does not produce anything:
http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#automating-text-insertions

So, if I understand it, you are suggesting to use '@item1' for a
citation with the "author-only" bit set, right?

Frank's example would become:

    @jones2000 [-@jones2000) says that water is wet, which has been
    confirmed by others [@smith1999].

Or '@jones2000' should be translated into the meaning of "@jones2000
[-@jones2000)" (which means, "author-only" within citation groups are
not allowed)?

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                         ` <20101105205212.GJ6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-06  0:12                                           ` John MacFarlane
       [not found]                                             ` <20101106001214.GA6579-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-06  0:12 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 05 10 21:52 ]:
 
> Text.Pandoc.Definition as a separate package would be *great* for me!
> Pandoc is the main output for citeproc. To avoid circular dependencies
> I'm force to replicate everything in Text.CSL.Output.Pandoc and then
> again in Text.Pandoc.Biblio. I was tempted to suggest such a move. We
> would have the Pandoc type and the interface to manipulate it as a
> single common package without any duplicate code.

This is a sensible move, I think.  I've just split off
Text.Pandoc.Definitions into a new package,

https://github.com/jgm/pandoc-types

The citeproc branch at

https://github.com/jgm/pandoc/tree/citeproc

now depends on this package instead of supplying its own
Text.Pandoc.Definitions.  This should allow you to simplify your
code considerably!

It will also be useful more generally, I think -- any package
that wants to produce structured text can emit it as Pandoc
without bringing in lots of dependencies.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                 ` <20101105213239.GK6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-06  0:28                                   ` John MacFarlane
  0 siblings, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-06  0:28 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 05 10 22:32 ]:
> On Fri, Nov 05, 2010 at 10:01:20AM -0700, John MacFarlane wrote:
> > +++ Andrea Rossato [Nov 05 10 14:36 ]:
> > > On Thu, Nov 04, 2010 at 05:41:24PM -0700, Frank Bennett wrote:
> > > > A common citation style in China is purely numeric, using numbers for
> > > > all references, including "textual citations".  As a base example,
> > > > suppose a text is set as follows in an author-date style:
> > > > 
> > > >   Jones (2000) says that water is wet, which has been confirmed by
> > > > others. (Smith 1999)
> > > > 
> > > > In the pure numeric style, this would be rendered like this (^
> > > > signifying superscript -- note that the "ordinary" type of reference
> > > > is superscripted, while the "author-only" type of reference is not):
> > > > 
> > > >   Reference [1] establishes that water is wet, a finding that has been
> > > > refined by others.^[2]^
> > > > 
> > > >   Bibliography
> > > >   [1] John Jones, "Wetness of Water" (2000).
> > > >   [2] Samuel Smith, "Dampness of Water" (1999).
> > > > 
> > > > In a note style, the same text would be rendered something like this:
> > > > 
> > > >   Jones^1^ establishes that water is wet, a finding that has been
> > > > refined by others.^2^
> > > > 
> > > >   Footnotes
> > > >   1. "Wetness of Water" (2000).
> > > >   2. Samuel Smith, "Dampness of Water" (1999).
> > > > 
> > > > I won't intrude on the syntax proposals made upthread, but it is
> > > > possible to handle these various transformations by tracking the use
> > > > of author-only and suppress-author toggles in citeproc, and applying
> > > > rules like the following:
> > > 
> > > This is another example of the utility of a per-citation modifier and
> > > the level of expressiveness you can get out of it.
> > 
> > But another thing Frank's example shows is that there may be an independent
> > need for a "textual citation" variant.  We were previously thinking that
> > we could fake this by just writing
> > 
> > Jones [@jones99]
> 
> (Jones [-@jones99], right?)

Yes.

> > 
> > in the markdown, but that wouldn't give the desired result in the style
> > Frank describes above.
> >
> > However, since textual citations aren't yet supported in CSL, I'm
> > not sure what to do about this.  As Frank suggests, it might be
> > a reason to have a pandoc syntax contsruct that could later be
> > hooked up to textual citation variants.  And, as Nathan has suggested,
> > it would be convenient to represent textual citations as such in
> > our pandoc structure, even if they need to be converted to something
> > else for processing by citeproc-hs.
> > 
> > As for syntax, I did have a thought about this. First, it seems to me that
> > when you're doing a textual citation, you won't have a list of citations. It
> > would be quite awkward to say:
> > 
> > Jones (1999); Smith (2000); Xu (2001) say that blah.
> > 
> > So what about the following syntax for textual citations?
> > (No funny characters needed!)
> > 
> > @jones1999 says that water is wet.  @smith2000 [p. 30] confirms this,
> > as does @xu2001.  Others have disagreed [@taylor1999, p. 30; @rhodes00].
> > 
> > This would render in inline author-date as:
> > 
> > Jones (1999) says that water is wet. Smith (2000, 30) confirms this,
> > as does Xu 2001.  Others have disagreed (Taylor 1999, 30; Rhodes 2000).
> > 
> > And in footnote style as:
> > 
> > Jones^1 says that water is wet. Smith^2 confirms this,
> > as does Xu^3.  Others have disagreed^4.
> 
> Just to clarify, "textual citation" in the example made by Frank means
> a citation with the "author-only" bit set. The 3 examples work because
> they all use, to render the name of the author, an "author-only"
> citation followed by a "suppress-author" citation. In the second
> example, only the "author-only" citation produces an output,
> 'Reference [1]' while the "suppress-author" does not produce anything:
> http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#automating-text-insertions

Oh, I see -- so, the style for the Chinese format tells it to render an
author-only citation, not by giving the name of the author, but by printing
"Source [n]"? In that case, then, maybe we don't need a separate form for that
in citeproc. (Assuming we can get the author-only form to omit the parentheses
and footnotization -- this should presumably always go in the main text
without adornment, as it seems to in the examples you linked to above.)

Anyway, my syntax suggestion is relatively independent of whether
we have a separate underlying representation for textual citations.

> So, if I understand it, you are suggesting to use '@item1' for a
> citation with the "author-only" bit set, right?

Not quite. The idea is that

@item1 [p. 3]

would be equivalent to

[+@item1] [-@item1, p. 3]

where I'm pretending that + designates the author-only form.

Actually, I'd prefer not to have '+' at all.  It seems to me that the
author-only form is only needed in conjunction with the suppress-author
form, and this combined form could be handled with the syntax above.

> Frank's example would become:
> 
>     @jones2000 [-@jones2000) says that water is wet, which has been
>     confirmed by others [@smith1999].
> 
> Or '@jones2000' should be translated into the meaning of "@jones2000
> [-@jones2000)" (which means, "author-only" within citation groups are
> not allowed)?

Yes, that's right.  So, Frank's example would be produced by:

@jones2000 says that water is wet, which has been confirmed
by others [@smith1999].

This seems very natural to me, and minimizes the use of special
characters.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                             ` <20101106001214.GA6579-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-06 14:58                                               ` Nathan Gass
       [not found]                                                 ` <4CD56D33.2010004-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  2010-11-06 15:04                                               ` Andrea Rossato
  1 sibling, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-06 14:58 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 06.11.10 01:12, John MacFarlane wrote:
> +++ Andrea Rossato [Nov 05 10 21:52 ]:
>
>> Text.Pandoc.Definition as a separate package would be *great* for me!
>> Pandoc is the main output for citeproc. To avoid circular dependencies
>> I'm force to replicate everything in Text.CSL.Output.Pandoc and then
>> again in Text.Pandoc.Biblio. I was tempted to suggest such a move. We
>> would have the Pandoc type and the interface to manipulate it as a
>> single common package without any duplicate code.
>
> This is a sensible move, I think.  I've just split off
> Text.Pandoc.Definitions into a new package,
>
> https://github.com/jgm/pandoc-types
>
> The citeproc branch at
>
> https://github.com/jgm/pandoc/tree/citeproc
>
> now depends on this package instead of supplying its own
> Text.Pandoc.Definitions.  This should allow you to simplify your
> code considerably!
>
> It will also be useful more generally, I think -- any package
> that wants to produce structured text can emit it as Pandoc
> without bringing in lots of dependencies.

I'd vote for keeping the two haskell packages in one git repository, if 
that is possible. I fear there are a lot of dependencies between the two 
and manually splitting and syncing patches,branches... between the two 
repos could become cumbersome.

I just now created a new branch [^1] with only the acceptable changes 
from my old branch left. Namely the new automated tests and the 
CitationVariant data type to enforce consistency. The changes depend on 
my changes on pandoc-types [^2], which is a first example why one git 
repo with two haskell packages would be easier to handle.

I hope this changes now are acceptable.

[^1]: https://github.com/xabbu42/pandoc/tree/citeproc2

[^2]: https://github.com/xabbu42/pandoc-types

Nathan


>
> John
>


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                                             ` <20101106001214.GA6579-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-11-06 14:58                                               ` Nathan Gass
@ 2010-11-06 15:04                                               ` Andrea Rossato
       [not found]                                                 ` <20101106150410.GL6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-06 15:04 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 2309 bytes --]

On Fri, Nov 05, 2010 at 05:12:15PM -0700, John MacFarlane wrote:
> +++ Andrea Rossato [Nov 05 10 21:52 ]:
>  
> > Text.Pandoc.Definition as a separate package would be *great* for me!
> > Pandoc is the main output for citeproc. To avoid circular dependencies
> > I'm force to replicate everything in Text.CSL.Output.Pandoc and then
> > again in Text.Pandoc.Biblio. I was tempted to suggest such a move. We
> > would have the Pandoc type and the interface to manipulate it as a
> > single common package without any duplicate code.
> 
> This is a sensible move, I think.  I've just split off
> Text.Pandoc.Definitions into a new package,
> 
> https://github.com/jgm/pandoc-types
> 
> The citeproc branch at
> 
> https://github.com/jgm/pandoc/tree/citeproc
> 
> now depends on this package instead of supplying its own
> Text.Pandoc.Definitions.  This should allow you to simplify your
> code considerably!
> 
> It will also be useful more generally, I think -- any package
> that wants to produce structured text can emit it as Pandoc
> without bringing in lots of dependencies.


I pushed the latest patches to the darcs repository of citeproc-hs:
they include the switch to the new pandoc-types package and the latest
fixes.

Here the updated test material:
http://gorgias.mine.nu/citeproc/

I'm including a couple of patches to pandoc-types: the first one
removes trailing white spaces, the second one adds a datatype to make
suppress-author and author-only mutually exclusive.

I'm also attaching a patch for the pandoc citeproc branch:
0006-update-to-latest-APII-changes-and-remove-duplicate-c.patch

It removes duplicate code and updates everything to work with the
latest citeproc code I've just pushed.

I did not change the markdown parser yet (John, are you going to do
the job yourself or do you want me to do it? I'm not very happy when
it comes to writing parsers...).

Andrea

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.


[-- Attachment #2: 0001-remove-trailing-white-spaces.patch --]
[-- Type: text/plain, Size: 4442 bytes --]

From f84caa04f6131a6361687e23f6575f40172901fb Mon Sep 17 00:00:00 2001
From: Andrea Rossato <andrea.rossato-/Q1r7N5in3P/wltNWqQaag@public.gmane.org>
Date: Sat, 6 Nov 2010 10:51:02 +0100
Subject: [PATCH 1/2] remove trailing white-spaces

---
 Text/Pandoc/Definition.hs |   28 ++++++++++++++--------------
 1 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/Text/Pandoc/Definition.hs b/Text/Pandoc/Definition.hs
index bec216b..4adb08c 100644
--- a/Text/Pandoc/Definition.hs
+++ b/Text/Pandoc/Definition.hs
@@ -20,7 +20,7 @@ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 {- |
    Module      : Text.Pandoc.Definition
    Copyright   : Copyright (C) 2006-2010 John MacFarlane
-   License     : GNU GPL, version 2 or above 
+   License     : GNU GPL, version 2 or above
 
    Maintainer  : John MacFarlane <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org>
    Stability   : alpha
@@ -42,9 +42,9 @@ data Meta = Meta { docTitle   :: [Inline]
             deriving (Eq, Ord, Show, Read, Typeable, Data)
 
 -- | Alignment of a table column.
-data Alignment = AlignLeft 
-               | AlignRight 
-               | AlignCenter 
+data Alignment = AlignLeft
+               | AlignRight
+               | AlignCenter
                | AlignDefault deriving (Eq, Ord, Show, Read, Typeable, Data)
 
 -- | List attributes.
@@ -53,37 +53,37 @@ type ListAttributes = (Int, ListNumberStyle, ListNumberDelim)
 -- | Style of list numbers.
 data ListNumberStyle = DefaultStyle
                      | Example
-                     | Decimal 
-                     | LowerRoman 
+                     | Decimal
+                     | LowerRoman
                      | UpperRoman
-                     | LowerAlpha 
+                     | LowerAlpha
                      | UpperAlpha deriving (Eq, Ord, Show, Read, Typeable, Data)
 
 -- | Delimiter of list numbers.
 data ListNumberDelim = DefaultDelim
                      | Period
-                     | OneParen 
+                     | OneParen
                      | TwoParens deriving (Eq, Ord, Show, Read, Typeable, Data)
 
 -- | Attributes: identifier, classes, key-value pairs
 type Attr = (String, [String], [(String, String)])
 
 -- | Block element.
-data Block  
+data Block
     = Plain [Inline]        -- ^ Plain text, not a paragraph
     | Para [Inline]         -- ^ Paragraph
-    | CodeBlock Attr String -- ^ Code block (literal) with attributes 
+    | CodeBlock Attr String -- ^ Code block (literal) with attributes
     | RawHtml String        -- ^ Raw HTML block (literal)
     | BlockQuote [Block]    -- ^ Block quote (list of blocks)
     | OrderedList ListAttributes [[Block]] -- ^ Ordered list (attributes
                             -- and a list of items, each a list of blocks)
     | BulletList [[Block]]  -- ^ Bullet list (list of items, each
                             -- a list of blocks)
-    | DefinitionList [([Inline],[[Block]])]  -- ^ Definition list 
+    | DefinitionList [([Inline],[[Block]])]  -- ^ Definition list
                             -- Each list item is a pair consisting of a
                             -- term (a list of inlines) and one or more
                             -- definitions (each a list of blocks)
-    | Header Int [Inline]   -- ^ Header - level (integer) and text (inlines) 
+    | Header Int [Inline]   -- ^ Header - level (integer) and text (inlines)
     | HorizontalRule        -- ^ Horizontal rule
     | Table [Inline] [Alignment] [Double] [[Block]] [[[Block]]]  -- ^ Table,
                             -- with caption, column alignments,
@@ -103,7 +103,7 @@ type Target = (String, String)
 data MathType = DisplayMath | InlineMath deriving (Show, Eq, Ord, Read, Typeable, Data)
 
 -- | Inline elements.
-data Inline 
+data Inline
     = Str String            -- ^ Text (string)
     | Emph [Inline]         -- ^ Emphasized text (list of inlines)
     | Strong [Inline]       -- ^ Strongly emphasized text (list of inlines)
@@ -126,7 +126,7 @@ data Inline
     | Link [Inline] Target  -- ^ Hyperlink: text (list of inlines), target
     | Image [Inline] Target -- ^ Image:  alt text (list of inlines), target
                             -- and target
-    | Note [Block]          -- ^ Footnote or endnote 
+    | Note [Block]          -- ^ Footnote or endnote
     deriving (Show, Eq, Ord, Read, Typeable, Data)
 
 data Citation = Citation { citationId      :: String
-- 
1.7.1


[-- Attachment #3: 0002-add-CitationMode-to-make-suppress-author-and-author-.patch --]
[-- Type: text/plain, Size: 1747 bytes --]

From 0299fe60e2bd3c30515a00ef25fd70869e455585 Mon Sep 17 00:00:00 2001
From: Andrea Rossato <andrea.rossato-/Q1r7N5in3P/wltNWqQaag@public.gmane.org>
Date: Sat, 6 Nov 2010 15:40:55 +0100
Subject: [PATCH 2/2] add CitationMode to make suppress-author and author-only mutually exclusive

---
 Text/Pandoc/Definition.hs |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/Text/Pandoc/Definition.hs b/Text/Pandoc/Definition.hs
index 4adb08c..7ad3fdf 100644
--- a/Text/Pandoc/Definition.hs
+++ b/Text/Pandoc/Definition.hs
@@ -132,16 +132,18 @@ data Inline
 data Citation = Citation { citationId      :: String
                          , citationPrefix  :: String
                          , citationLocator :: String
+                         , citationMode    :: CitationMode
                          , citationNoteNum :: Int
-                         , citationAutOnly :: Bool
-                         , citationNoAut   :: Bool
                          , citationHash    :: Int
                          }
                 deriving (Show, Ord, Read, Typeable, Data)
 
 instance Eq Citation where
-    (==) (Citation _ _ _ _ _ _ ha)
-         (Citation _ _ _ _ _ _ hb) = ha == hb
+    (==) (Citation _ _ _ _ _ ha)
+         (Citation _ _ _ _ _ hb) = ha == hb
+
+data CitationMode = AuthorOnly | SuppressAuthor | NormalCitation
+                    deriving (Show, Eq, Ord, Read, Typeable, Data)
 
 -- | Applies a transformation on @a@s to matching elements in a @b@.
 processWith :: (Data a, Data b) => (a -> a) -> b -> b
@@ -162,4 +164,3 @@ processPandoc = processWith
 {-# DEPRECATED queryPandoc "Use queryWith instead" #-}
 queryPandoc :: Data a => (a -> [b]) -> Pandoc -> [b]
 queryPandoc = queryWith
-
-- 
1.7.1


[-- Attachment #4: 0006-update-to-latest-APII-changes-and-remove-duplicate-c.patch --]
[-- Type: text/plain, Size: 9289 bytes --]

From e0fe94fd01c9e163eafeedfe97883bcd3e22639f Mon Sep 17 00:00:00 2001
From: Andrea Rossato <andrea.rossato-/Q1r7N5in3P/wltNWqQaag@public.gmane.org>
Date: Sat, 6 Nov 2010 15:51:43 +0100
Subject: [PATCH 6/6] update to latest APII changes and remove duplicate code

---
 pandoc.cabal                        |    2 +-
 src/Text/Pandoc/Biblio.hs           |  113 +++++++----------------------------
 src/Text/Pandoc/Readers/Markdown.hs |    6 ++-
 3 files changed, 27 insertions(+), 94 deletions(-)

diff --git a/pandoc.cabal b/pandoc.cabal
index aad149c..0583781 100644
--- a/pandoc.cabal
+++ b/pandoc.cabal
@@ -168,7 +168,7 @@ Library
     Build-depends: highlighting-kate >= 0.2.7.1
     cpp-options:   -D_HIGHLIGHTING
   if flag(citeproc)
-    Build-depends: citeproc-hs >= 0.2
+    Build-depends: citeproc-hs >= 0.3 && < 0.4
     cpp-options:   -D_CITEPROC
   if impl(ghc >= 6.12)
     Ghc-Options:   -O2 -Wall -fno-warn-unused-do-bind
diff --git a/src/Text/Pandoc/Biblio.hs b/src/Text/Pandoc/Biblio.hs
index 1621550..d8a4659 100644
--- a/src/Text/Pandoc/Biblio.hs
+++ b/src/Text/Pandoc/Biblio.hs
@@ -30,7 +30,6 @@ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 module Text.Pandoc.Biblio ( processBiblio ) where
 
 import Control.Monad ( when )
-import Data.Char ( toUpper )
 import Data.List
 import Data.Unique
 import Text.CSL hiding ( Cite(..), Citation(..) )
@@ -52,9 +51,9 @@ processBiblio cf r p
                                   ncits  = map (queryWith getCite) $ queryWith getNote p'
                                   needNt = cits \\ concat ncits
                               in (,) needNt $ getNoteCitations needNt p'
-            result     = citeproc' csl r (setNearNote csl $ map (map toCslCite) grps)
+            result     = citeproc csl r (setNearNote csl $ map (map toCslCite) grps)
             cits_map   = zip grps (citations result)
-            biblioList = map (read . renderPandoc' csl) (bibliography result)
+            biblioList = map (renderPandoc' csl) (bibliography result)
             Pandoc m b = processWith (processCite csl cits_map) p'
         return . generateNotes nts . Pandoc m $ b ++ biblioList
 
@@ -65,7 +64,7 @@ processCite s cs il
     | otherwise      = il
     where
       process t = case lookup t cs of
-                    Just  i -> read $ renderPandoc s i
+                    Just  i -> renderPandoc s i
                     Nothing -> [Str ("Error processing " ++ show t)]
 
 -- | Retrieve all citations from a 'Pandoc' docuument. To be used with
@@ -91,8 +90,8 @@ getNoteCitations needNote
       in  queryWith getCitation . getCits
 
 setHash :: Citation -> IO Citation
-setHash (Citation i p l nn ao na _)
-    = hashUnique `fmap` newUnique >>= return . Citation i p l nn ao na
+setHash (Citation i p l cm nn _)
+    = hashUnique `fmap` newUnique >>= return . Citation i p l cm nn
 
 generateNotes :: [Inline] -> Pandoc -> Pandoc
 generateNotes needNote = processWith (mvCiteInNote needNote)
@@ -109,12 +108,12 @@ mvCiteInNote is = procInlines mvCite
     where
       mvCite :: [Inline] -> [Inline]
       mvCite inls
-          | x:i:xs <- inls, startWPt xs
-          , x == Space,   i `elem_` is = split i xs ++ mvCite (tailInline xs)
+          | x:i:xs <- inls, startWithPunct xs
+          , x == Space,   i `elem_` is = split i xs ++ mvCite (tailFirstInlineStr xs)
           | x:i:xs <- inls
           , x == Space,   i `elem_` is = mvInNote i :  mvCite xs
           | i:xs <- inls, i `elem_` is
-          , startWPt xs                = split i xs ++ mvCite (tailInline xs)
+          , startWithPunct xs          = split i xs ++ mvCite (tailFirstInlineStr xs)
           | i:xs <- inls, Note _ <- i  = checkNt  i :  mvCite xs
           | i:xs <- inls               = i          :  mvCite xs
           | otherwise                  = []
@@ -124,91 +123,17 @@ mvCiteInNote is = procInlines mvCite
           | Cite t o <- i = Note [Para [Cite t $ sanitize o]]
           | otherwise     = Note [Para [i                  ]]
       sanitize i
-          | endWPt  i = toCapital i
-          | otherwise = toCapital (i ++ [Str "."])
+          | endWithPunct i = toCapital i
+          | otherwise      = toCapital (i ++ [Str "."])
 
       checkPt i
           | Cite c o : xs <- i
-          , endWPt o, startWPt xs
-          , endWPt  o = Cite c (initInline o) : checkPt xs
-          | x:xs <- i = x : checkPt xs
-          | otherwise = []
-      endWPt   = and . map (`elem` ".,;:!?") . lastInline
-      startWPt = and . map (`elem` ".,;:!?") . headInline
+          , endWithPunct o, startWithPunct xs
+          , endWithPunct o = Cite c (initInline o) : checkPt xs
+          | x:xs <- i      = x : checkPt xs
+          | otherwise      = []
       checkNt  = processWith $ procInlines checkPt
 
-headInline :: [Inline] -> String
-headInline [] = []
-headInline (i:_)
-    | Str s <- i = head' s
-    | Space <- i = " "
-    | otherwise  = headInline $ getInline i
-    where
-      head' s = if s /= [] then [head s] else []
-
-lastInline :: [Inline] -> String
-lastInline [] = []
-lastInline (i:[])
-    | Str s <- i = last' s
-    | Space <- i = " "
-    | otherwise  = lastInline $ getInline i
-    where
-      last' s = if s /= [] then [last s] else []
-lastInline (_:xs) = lastInline xs
-
-initInline :: [Inline] -> [Inline]
-initInline [] = []
-initInline (i:[])
-    | Str          s <- i = return $ Str         (init'       s)
-    | Emph        is <- i = return $ Emph        (initInline is)
-    | Strong      is <- i = return $ Strong      (initInline is)
-    | Strikeout   is <- i = return $ Strikeout   (initInline is)
-    | Superscript is <- i = return $ Superscript (initInline is)
-    | Subscript   is <- i = return $ Subscript   (initInline is)
-    | Quoted q    is <- i = return $ Quoted q    (initInline is)
-    | SmallCaps   is <- i = return $ SmallCaps   (initInline is)
-    | Link      is t <- i = return $ Link        (initInline is) t
-    | otherwise           = []
-    where
-      init' s = if s /= [] then init s else []
-initInline (i:xs) = i : initInline xs
-
-tailInline :: [Inline] -> [Inline]
-tailInline = mapHeadInline tail'
-    where
-      tail' s = if s /= [] then tail s else []
-
-toCapital :: [Inline] -> [Inline]
-toCapital = mapHeadInline toCap
-    where
-      toCap s = if s /= [] then toUpper (head s) : tail s else []
-
-mapHeadInline :: (String -> String) -> [Inline] -> [Inline]
-mapHeadInline _ [] = []
-mapHeadInline f (i:xs)
-    | Str          s <- i = Str         (f                s)   : xs
-    | Emph        is <- i = Emph        (mapHeadInline f is)   : xs
-    | Strong      is <- i = Strong      (mapHeadInline f is)   : xs
-    | Strikeout   is <- i = Strikeout   (mapHeadInline f is)   : xs
-    | Superscript is <- i = Superscript (mapHeadInline f is)   : xs
-    | Subscript   is <- i = Subscript   (mapHeadInline f is)   : xs
-    | Quoted q    is <- i = Quoted q    (mapHeadInline f is)   : xs
-    | SmallCaps   is <- i = SmallCaps   (mapHeadInline f is)   : xs
-    | Link      is t <- i = Link        (mapHeadInline f is) t : xs
-    | otherwise           = []
-
-getInline :: Inline -> [Inline]
-getInline i
-    | Emph        is <- i = is
-    | Strong      is <- i = is
-    | Strikeout   is <- i = is
-    | Superscript is <- i = is
-    | Subscript   is <- i = is
-    | Quoted _    is <- i = is
-    | SmallCaps   is <- i = is
-    | Link      is _ <- i = is
-    | otherwise           = []
-
 setCiteNoteNum :: [Inline] -> Int -> [Inline]
 setCiteNoteNum ((Cite cs o):xs) n = Cite (setCitationNoteNum n cs) o : setCiteNoteNum xs n
 setCiteNoteNum               _  _ = []
@@ -217,13 +142,17 @@ setCitationNoteNum :: Int -> [Citation] -> [Citation]
 setCitationNoteNum i = map $ \c -> c { citationNoteNum = i}
 
 toCslCite :: Citation -> CSL.Cite
-toCslCite (Citation i p l nn ao na _)
+toCslCite (Citation i p l cm nn _)
     = let (la,lo) = parseLocator l
+          citMode = case cm of
+                      AuthorOnly     -> (True, False)
+                      SuppressAuthor -> (False,True )
+                      NormalCitation -> (False,False)
       in   emptyCite { CSL.citeId         = i
                      , CSL.citePrefix     = p
                      , CSL.citeLabel      = la
                      , CSL.citeLocator    = lo
                      , CSL.citeNoteNumber = show nn
-                     , CSL.authorOnly     = ao
-                     , CSL.suppressAuthor = na
+                     , CSL.authorOnly     = fst citMode
+                     , CSL.suppressAuthor = snd citMode
                      }
diff --git a/src/Text/Pandoc/Readers/Markdown.hs b/src/Text/Pandoc/Readers/Markdown.hs
index 030da91..0256184 100644
--- a/src/Text/Pandoc/Readers/Markdown.hs
+++ b/src/Text/Pandoc/Readers/Markdown.hs
@@ -1346,5 +1346,9 @@ parseLabel = try $ do
       (p',o) = if p /= [] && last p == '+'
                then (init p   , True )
                else (p        , False)
-  return $ Citation cit (trim p') (trim loc) 0 o na 0
+      mode = case (na,o) of
+               (True, False) -> SuppressAuthor
+               (False,True ) -> AuthorOnly
+               _             -> NormalCitation
+  return $ Citation cit (trim p') (trim loc) mode 0 0
 #endif

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                 ` <4CD56D33.2010004-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-06 18:56                                                   ` John MacFarlane
       [not found]                                                     ` <20101106185638.GB21524-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-06 18:56 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Nathan Gass [Nov 06 10 15:58 ]:
> On 06.11.10 01:12, John MacFarlane wrote:
> >+++ Andrea Rossato [Nov 05 10 21:52 ]:
> >
> >>Text.Pandoc.Definition as a separate package would be *great* for me!
> >>Pandoc is the main output for citeproc. To avoid circular dependencies
> >>I'm force to replicate everything in Text.CSL.Output.Pandoc and then
> >>again in Text.Pandoc.Biblio. I was tempted to suggest such a move. We
> >>would have the Pandoc type and the interface to manipulate it as a
> >>single common package without any duplicate code.
> >
> >This is a sensible move, I think.  I've just split off
> >Text.Pandoc.Definitions into a new package,
> >
> >https://github.com/jgm/pandoc-types
> >
> >The citeproc branch at
> >
> >https://github.com/jgm/pandoc/tree/citeproc
> >
> >now depends on this package instead of supplying its own
> >Text.Pandoc.Definitions.  This should allow you to simplify your
> >code considerably!
> >
> >It will also be useful more generally, I think -- any package
> >that wants to produce structured text can emit it as Pandoc
> >without bringing in lots of dependencies.
> 
> I'd vote for keeping the two haskell packages in one git repository,
> if that is possible. I fear there are a lot of dependencies between
> the two and manually splitting and syncing patches,branches...
> between the two repos could become cumbersome.
> 
> I just now created a new branch [^1] with only the acceptable
> changes from my old branch left. Namely the new automated tests and
> the CitationVariant data type to enforce consistency. The changes
> depend on my changes on pandoc-types [^2], which is a first example
> why one git repo with two haskell packages would be easier to
> handle.
> 
> I hope this changes now are acceptable.
> 
> [^1]: https://github.com/xabbu42/pandoc/tree/citeproc2
> 
> [^2]: https://github.com/xabbu42/pandoc-types

I think it makes good sense to prevent a citation from having "author-only"
and "no-author" set simultaneously. Andrea, are the citeproc changes
proposed here acceptable to you?

https://github.com/xabbu42/pandoc/commit/03bc1c8ec26dbf1e521ffd60b5424ac9df63e53e

https://github.com/xabbu42/pandoc/commit/ebe8cbb61bf98ed9d160aa1b81e550c234215b32

https://github.com/xabbu42/pandoc-types/commit/5de65502a52b5146a7c23f5a378ecc52d0fdb076

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                                                     ` <20101106185638.GB21524-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-06 19:24                                                       ` Andrea Rossato
       [not found]                                                         ` <20101106192448.GA24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2010-11-06 19:36                                                       ` John MacFarlane
  1 sibling, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-06 19:24 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Sat, Nov 06, 2010 at 11:56:38AM -0700, John MacFarlane wrote:
> I think it makes good sense to prevent a citation from having "author-only"
> and "no-author" set simultaneously. Andrea, are the citeproc changes
> proposed here acceptable to you?
> 
> https://github.com/xabbu42/pandoc/commit/03bc1c8ec26dbf1e521ffd60b5424ac9df63e53e
> 
> https://github.com/xabbu42/pandoc/commit/ebe8cbb61bf98ed9d160aa1b81e550c234215b32
> 
> https://github.com/xabbu42/pandoc-types/commit/5de65502a52b5146a7c23f5a378ecc52d0fdb076

yes, they are. only, I've sent you some patches that do basically the
same. if you can figure out how to fix the conflicts I'll be delighted
to see Nathan's code used instead of mine (I copied his, indeed!).

andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                 ` <20101106150410.GL6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-06 19:35                                                   ` John MacFarlane
       [not found]                                                     ` <20101106193523.GB22106-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-06 19:35 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 06 10 16:04 ]:
> On Fri, Nov 05, 2010 at 05:12:15PM -0700, John MacFarlane wrote:
> > +++ Andrea Rossato [Nov 05 10 21:52 ]:
> >  
> > > Text.Pandoc.Definition as a separate package would be *great* for me!
> > > Pandoc is the main output for citeproc. To avoid circular dependencies
> > > I'm force to replicate everything in Text.CSL.Output.Pandoc and then
> > > again in Text.Pandoc.Biblio. I was tempted to suggest such a move. We
> > > would have the Pandoc type and the interface to manipulate it as a
> > > single common package without any duplicate code.
> > 
> > This is a sensible move, I think.  I've just split off
> > Text.Pandoc.Definitions into a new package,
> > 
> > https://github.com/jgm/pandoc-types
> > 
> > The citeproc branch at
> > 
> > https://github.com/jgm/pandoc/tree/citeproc
> > 
> > now depends on this package instead of supplying its own
> > Text.Pandoc.Definitions.  This should allow you to simplify your
> > code considerably!
> > 
> > It will also be useful more generally, I think -- any package
> > that wants to produce structured text can emit it as Pandoc
> > without bringing in lots of dependencies.
> 
> 
> I pushed the latest patches to the darcs repository of citeproc-hs:
> they include the switch to the new pandoc-types package and the latest
> fixes.
> 
> Here the updated test material:
> http://gorgias.mine.nu/citeproc/
> 
> I'm including a couple of patches to pandoc-types: the first one
> removes trailing white spaces, the second one adds a datatype to make
> suppress-author and author-only mutually exclusive.

I've pushed these.

> I'm also attaching a patch for the pandoc citeproc branch:
> 0006-update-to-latest-APII-changes-and-remove-duplicate-c.patch

I'm still waiting on this -- code.haskell.org seems to be unreachable
(as it often is), so I can't install your latest citeproc-hs code.

> It removes duplicate code and updates everything to work with the
> latest citeproc code I've just pushed.
> 
> I did not change the markdown parser yet (John, are you going to do
> the job yourself or do you want me to do it? I'm not very happy when
> it comes to writing parsers...).

I'll be glad to do this part, once we have the rest figured out.

We still haven't resolved how to deal with locators.  Here's
what I suggest.  Pandoc's parser will just give you a big
unstructured string, which might include a locator and a suffix:
e.g., "pp. 13-15, especially the diagram on p. 14".  citeproc-hs
will be in charge of parsing this into a structured locator and
suffix (here the localization is needed).  This is pretty much how
it works now. Does that seem right?

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                     ` <20101106185638.GB21524-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-11-06 19:24                                                       ` Andrea Rossato
@ 2010-11-06 19:36                                                       ` John MacFarlane
       [not found]                                                         ` <20101106193634.GC22106-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-06 19:36 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ John MacFarlane [Nov 06 10 11:56 ]:
> +++ Nathan Gass [Nov 06 10 15:58 ]:
> > On 06.11.10 01:12, John MacFarlane wrote:
> > >+++ Andrea Rossato [Nov 05 10 21:52 ]:
> > >
> > >>Text.Pandoc.Definition as a separate package would be *great* for me!
> > >>Pandoc is the main output for citeproc. To avoid circular dependencies
> > >>I'm force to replicate everything in Text.CSL.Output.Pandoc and then
> > >>again in Text.Pandoc.Biblio. I was tempted to suggest such a move. We
> > >>would have the Pandoc type and the interface to manipulate it as a
> > >>single common package without any duplicate code.
> > >
> > >This is a sensible move, I think.  I've just split off
> > >Text.Pandoc.Definitions into a new package,
> > >
> > >https://github.com/jgm/pandoc-types
> > >
> > >The citeproc branch at
> > >
> > >https://github.com/jgm/pandoc/tree/citeproc
> > >
> > >now depends on this package instead of supplying its own
> > >Text.Pandoc.Definitions.  This should allow you to simplify your
> > >code considerably!
> > >
> > >It will also be useful more generally, I think -- any package
> > >that wants to produce structured text can emit it as Pandoc
> > >without bringing in lots of dependencies.
> > 
> > I'd vote for keeping the two haskell packages in one git repository,
> > if that is possible. I fear there are a lot of dependencies between
> > the two and manually splitting and syncing patches,branches...
> > between the two repos could become cumbersome.
> > 
> > I just now created a new branch [^1] with only the acceptable
> > changes from my old branch left. Namely the new automated tests and
> > the CitationVariant data type to enforce consistency. The changes
> > depend on my changes on pandoc-types [^2], which is a first example
> > why one git repo with two haskell packages would be easier to
> > handle.
> > 
> > I hope this changes now are acceptable.
> > 
> > [^1]: https://github.com/xabbu42/pandoc/tree/citeproc2
> > 
> > [^2]: https://github.com/xabbu42/pandoc-types
> 
> I think it makes good sense to prevent a citation from having "author-only"
> and "no-author" set simultaneously. Andrea, are the citeproc changes
> proposed here acceptable to you?
> 
> https://github.com/xabbu42/pandoc/commit/03bc1c8ec26dbf1e521ffd60b5424ac9df63e53e
> 
> https://github.com/xabbu42/pandoc/commit/ebe8cbb61bf98ed9d160aa1b81e550c234215b32
> 
> https://github.com/xabbu42/pandoc-types/commit/5de65502a52b5146a7c23f5a378ecc52d0fdb076

I see that Andrea has already made changes to the Citation type.
(I credited you for the idea in the log.)

Would still be good to get his feedback on the other proposed changes.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                                                     ` <20101106193523.GB22106-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-06 20:15                                                       ` Andrea Rossato
  0 siblings, 0 replies; 107+ messages in thread
From: Andrea Rossato @ 2010-11-06 20:15 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Sat, Nov 06, 2010 at 12:35:23PM -0700, John MacFarlane wrote:
> +++ Andrea Rossato [Nov 06 10 16:04 ]:
> 
> > I'm also attaching a patch for the pandoc citeproc branch:
> > 0006-update-to-latest-APII-changes-and-remove-duplicate-c.patch
> 
> I'm still waiting on this -- code.haskell.org seems to be unreachable
> (as it often is), so I can't install your latest citeproc-hs code.


I know the problems of code.haskell.org: before pushing there I always
push my patches here:

http://gorgias.mine.nu/repos/citeproc-hs/

> > It removes duplicate code and updates everything to work with the
> > latest citeproc code I've just pushed.
> > 
> > I did not change the markdown parser yet (John, are you going to do
> > the job yourself or do you want me to do it? I'm not very happy when
> > it comes to writing parsers...).
> 
> I'll be glad to do this part, once we have the rest figured out.
> 
> We still haven't resolved how to deal with locators.  Here's
> what I suggest.  Pandoc's parser will just give you a big
> unstructured string, which might include a locator and a suffix:
> e.g., "pp. 13-15, especially the diagram on p. 14".  citeproc-hs
> will be in charge of parsing this into a structured locator and
> suffix (here the localization is needed).  This is pretty much how
> it works now. Does that seem right?

That should be fine, I think.

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                         ` <20101106192448.GA24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-07  1:32                                                           ` John MacFarlane
       [not found]                                                             ` <20101107013243.GC25887-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-07  1:32 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Before I write the parser, we need to settle whether there
should be a separate syntax for "textual citations",
and if so, how they are going to be represented.

I myself think there should be a separate syntax, along
the lines I suggested earlier.  I also think we do *not*
need a separate syntax for "author-only."

So, assuming that we are going to parse

@jones99 [p. 30]

as a textual citation -- which will ultimately be handled
as if it were two citations, one author-only, the next
suppress-author -- how should it be represented in the
AST?

One possibility is just to use two Cite inlines, one with
author-only, the next with suppress-author.

Advantages:

* No changes needed to citeproc or pandoc-types

Disadvantages:

* Complicates parsing.  The pandoc inline parsers are set up
  to emit a single Inline each.  To handle this case, I'd need to
  add a Group Inline (simply a container for a group of
  Inlines, like HTML span) and emit that, or rewrite all
  of the Inline parsers to emit lists of Inlines, or
  do something fancy with parser state.

* Complicates translation to-from bibtex.  Suppose we want
  an option to use bibtex instead of citeproc in the LaTeX writer.
  (Several people have requested this.)   The writer would
  have to be trained to recognize the two-inline sequence.
  This can be done, but as above, it complicates things.

The alternative would be to add a textual citation inline,
perhaps by adding a boolean flag ("isTextual") to
the Cite inline element.  citeproc-hs would then have
to recognize the textual citation and use convert
to a CSL author-only followed by a suppress-author
in producing the formatted citation.  (We'd have to
settle what to do if more than one Citation is
included; this doesn't really make sense with a
textual citation.) This would require
further modifications to citeproc-hs and pandoc-types,
but would make things easier for the markdown parser
and the writers.

Thoughts?

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                             ` <20101107013243.GC25887-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-07  2:41                                                               ` Frank Bennett
  2010-11-07  2:52                                                               ` Frank Bennett
                                                                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 107+ messages in thread
From: Frank Bennett @ 2010-11-07  2:41 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Sun, Nov 7, 2010 at 10:32 AM, John MacFarlane <fiddlosopher-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Before I write the parser, we need to settle whether there
> should be a separate syntax for "textual citations",
> and if so, how they are going to be represented.
>
> I myself think there should be a separate syntax, along
> the lines I suggested earlier.  I also think we do *not*
> need a separate syntax for "author-only."
>
> So, assuming that we are going to parse
>
> @jones99 [p. 30]
>
> as a textual citation -- which will ultimately be handled
> as if it were two citations, one author-only, the next
> suppress-author -- how should it be represented in the
> AST?
>
> One possibility is just to use two Cite inlines, one with
> author-only, the next with suppress-author.
>
> Advantages:
>
> * No changes needed to citeproc or pandoc-types
>
> Disadvantages:
>
> * Complicates parsing.  The pandoc inline parsers are set up
>  to emit a single Inline each.  To handle this case, I'd need to
>  add a Group Inline (simply a container for a group of
>  Inlines, like HTML span) and emit that, or rewrite all
>  of the Inline parsers to emit lists of Inlines, or
>  do something fancy with parser state.
>
> * Complicates translation to-from bibtex.  Suppose we want
>  an option to use bibtex instead of citeproc in the LaTeX writer.
>  (Several people have requested this.)   The writer would
>  have to be trained to recognize the two-inline sequence.
>  This can be done, but as above, it complicates things.
>
> The alternative would be to add a textual citation inline,
> perhaps by adding a boolean flag ("isTextual") to
> the Cite inline element.  citeproc-hs would then have
> to recognize the textual citation and use convert
> to a CSL author-only followed by a suppress-author
> in producing the formatted citation.  (We'd have to
> settle what to do if more than one Citation is
> included; this doesn't really make sense with a
> textual citation.) This would require
> further modifications to citeproc-hs and pandoc-types,
> but would make things easier for the markdown parser
> and the writers.
>
> Thoughts?

At present, the "author-only" + "suppress-author" behavior of the
citeproc-js processor is not being used anywhere outside of the
project test suite.  So while I think Bruce has a desire on CSL side
to have processors share a common API, there's no reason to take what
citeproc-js does as controlling; I'd be very happy to line its
behavior up with whatever works best in pandoc for this use case.

(In fact, after the initial tests passed, I considered bundling this
behavior into a single operation, just as John suggests, but never got
back to it.)

Re what to do with multiple citations in an "author-only" cite, it
might make sense for the processors to just ignore "author-only" and
"suppress-author" completely on clusters with multiple items.  Some
styles sort the cite items before rendering, which could result in
citation content and formatting that looks right on the page, but
which in fact does not reflect the (immediate) author's intention.

Those are just some random thoughts from me.  I'll leave it to Bruce
and Andrea to speak to the result of any discussion on the CSL list.

Frank

>
> John
>
> --
> You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
> To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.
>
>

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                             ` <20101107013243.GC25887-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-11-07  2:41                                                               ` Frank Bennett
@ 2010-11-07  2:52                                                               ` Frank Bennett
  2010-11-07 14:11                                                               ` Andrea Rossato
  2010-11-07 15:29                                                               ` Nathan Gass
  3 siblings, 0 replies; 107+ messages in thread
From: Frank Bennett @ 2010-11-07  2:52 UTC (permalink / raw)
  To: pandoc-discuss

(After posting my last, I looked at the processor test cases and
realized I'd missed something -- multiple citations with "author-
only" [extended to the two-step operation] or "suppress-author" _do_
make sense when the author of all citations is identical.  I think
that, if anything, that probably strengthens the argument for passing
through a bundle of options in the "author-only" case, and letting the
processor figure out what to with them, but we'll see what emerges
from discussion chez CSL.)


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                                                         ` <20101106193634.GC22106-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-07 10:33                                                           ` Andrea Rossato
       [not found]                                                             ` <20101107103327.GC24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-07 10:33 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Sat, Nov 06, 2010 at 12:36:34PM -0700, John MacFarlane wrote:
> +++ John MacFarlane [Nov 06 10 11:56 ]:
> > 
> > https://github.com/xabbu42/pandoc/commit/03bc1c8ec26dbf1e521ffd60b5424ac9df63e53e
> 
> Would still be good to get his feedback on the other proposed changes.


If I am right the remaining code relates to the test-suite. I'm sorry
but I'm not familiar with it but I certainly favor the presence of
comprehensive tests.

Maybe the included one is a bit too simple: the bibliography is
missing and some more advanced features should be probably included
(disambiguation, citation collapsing, and so), but we can add what's
missing at a latter time.

Andrea

 


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                                     ` <4CD44082.8000205-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-07 11:11                                       ` Andrea Rossato
  0 siblings, 0 replies; 107+ messages in thread
From: Andrea Rossato @ 2010-11-07 11:11 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Fri, Nov 05, 2010 at 06:36:02PM +0100, Nathan Gass wrote:
> On 05.11.10 17:59, Andrea Rossato wrote:
> >I'm not following you here. Why do you need to run the processor for
> >rendering your stuff? I'm totally lost here: I just supposed you were
> >translating the markdown syntax into natbib \cite* commands (and
> >reading them into Cite). What am I missing?
> 
> First, I need citeproc as a fallback whenever some citation feature
> is only supported in one format (for example, if we have no textual
> citation in markdown, I need to partially render a \citet found in
> latex to translate it to `Doe [-@doe99]`).
> 
> Second the Cite constructor determines what sort of data I need to
> handle, as I'd like my code to work with any pandoc reader and
> writer implementing some yet unknown citation format. They all are
> confined only by the Cite constructor in what citations they
> generate.

I would have to read some code to understand what you are trying to
do, because of my little knowledge of natbib, basically.

We regards to the Cite constructor: the first list, [Citation], may be
seen as the global state for the citeproc: it is instantiated by the
parser, but also used in Text.Pandoc.Biblio to store some intermediate
processing information (maybe the two parts could be more clearly
divided) on a per-citation basis. The second list, [Inline], the
output to be used by the writes, is generated in the processing.

Now, my understanding is that you should be using the first list, or
an added parameter, for your natbib processing: I have no objection
for both solutions. Moreover, we are probably going to have a
constructor for textual citation (or a modified Cite for the same
purpose).

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                                                             ` <20101107013243.GC25887-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-11-07  2:41                                                               ` Frank Bennett
  2010-11-07  2:52                                                               ` Frank Bennett
@ 2010-11-07 14:11                                                               ` Andrea Rossato
  2010-11-07 15:29                                                               ` Nathan Gass
  3 siblings, 0 replies; 107+ messages in thread
From: Andrea Rossato @ 2010-11-07 14:11 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Sat, Nov 06, 2010 at 06:32:43PM -0700, John MacFarlane wrote:
> Before I write the parser, we need to settle whether there
> should be a separate syntax for "textual citations",
> and if so, how they are going to be represented.
> 
> I myself think there should be a separate syntax, along
> the lines I suggested earlier.  I also think we do *not*
> need a separate syntax for "author-only."
> 
> So, assuming that we are going to parse
> 
> @jones99 [p. 30]

That will be translated, for the citeproc-hs side, into a Cite with a
single Citation with the author-only bit set, followed by a Space and
a Cite with a single Citation with the "suppress-author" bit and the
locator both set, right? Still we could use 

> 
> as a textual citation -- which will ultimately be handled
> as if it were two citations, one author-only, the next
> suppress-author -- how should it be represented in the
> AST?
> 
> One possibility is just to use two Cite inlines, one with
> author-only, the next with suppress-author.
> 
> Advantages:
> 
> * No changes needed to citeproc or pandoc-types
> 
> Disadvantages:
> 
> * Complicates parsing.  The pandoc inline parsers are set up
>   to emit a single Inline each.  To handle this case, I'd need to
>   add a Group Inline (simply a container for a group of
>   Inlines, like HTML span) and emit that, or rewrite all
>   of the Inline parsers to emit lists of Inlines, or
>   do something fancy with parser state.
> 
> * Complicates translation to-from bibtex.  Suppose we want
>   an option to use bibtex instead of citeproc in the LaTeX writer.
>   (Several people have requested this.)   The writer would
>   have to be trained to recognize the two-inline sequence.
>   This can be done, but as above, it complicates things.
> 
> The alternative would be to add a textual citation inline,
> perhaps by adding a boolean flag ("isTextual") to
> the Cite inline element.  citeproc-hs would then have
> to recognize the textual citation and use convert
> to a CSL author-only followed by a suppress-author
> in producing the formatted citation.  (We'd have to
> settle what to do if more than one Citation is
> included; this doesn't really make sense with a
> textual citation.) This would require
> further modifications to citeproc-hs and pandoc-types,
> but would make things easier for the markdown parser
> and the writers.
> 
> Thoughts?


The simplest solution could to have a Text.Pandoc.Definition.Cite thus
modified:

    | Cite Bool [Citation] [Inline]


Text.Pandoc.Biblio.toCslCite would become:

toCslCite :: [Cite] -> [CSL.Cite]

that requires some other changes to Text.Pandoc.Biblio, but not too
difficult.

Another possibility is to have a specific constructor:

    | IntextCite Citation [Inline]

(note there's just one citation)

Whatever you prefer. In any case it must be converted into something
citeproc-hs can read (a list with a single CSL.Cite with the
"author-only" bit set, followed by a list with the first Cite with the
"suppress-author" only set).

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                             ` <20101107013243.GC25887-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
                                                                                 ` (2 preceding siblings ...)
  2010-11-07 14:11                                                               ` Andrea Rossato
@ 2010-11-07 15:29                                                               ` Nathan Gass
       [not found]                                                                 ` <4CD6C5E6.3040300-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  3 siblings, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-07 15:29 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 07.11.10 02:32, John MacFarlane wrote:
> Before I write the parser, we need to settle whether there
> should be a separate syntax for "textual citations",
> and if so, how they are going to be represented.
>
> I myself think there should be a separate syntax, along
> the lines I suggested earlier.  I also think we do *not*
> need a separate syntax for "author-only."
>
> So, assuming that we are going to parse
>
> @jones99 [p. 30]
>
> as a textual citation -- which will ultimately be handled
> as if it were two citations, one author-only, the next
> suppress-author -- how should it be represented in the
> AST?
>
> One possibility is just to use two Cite inlines, one with
> author-only, the next with suppress-author.
>
> Advantages:
>
> * No changes needed to citeproc or pandoc-types
>
> Disadvantages:
>
> * Complicates parsing.  The pandoc inline parsers are set up
>    to emit a single Inline each.  To handle this case, I'd need to
>    add a Group Inline (simply a container for a group of
>    Inlines, like HTML span) and emit that, or rewrite all
>    of the Inline parsers to emit lists of Inlines, or
>    do something fancy with parser state.
>
> * Complicates translation to-from bibtex.  Suppose we want
>    an option to use bibtex instead of citeproc in the LaTeX writer.
>    (Several people have requested this.)   The writer would
>    have to be trained to recognize the two-inline sequence.
>    This can be done, but as above, it complicates things.
>
> The alternative would be to add a textual citation inline,
> perhaps by adding a boolean flag ("isTextual") to
> the Cite inline element.  citeproc-hs would then have
> to recognize the textual citation and use convert
> to a CSL author-only followed by a suppress-author
> in producing the formatted citation.  (We'd have to
> settle what to do if more than one Citation is
> included; this doesn't really make sense with a
> textual citation.) This would require
> further modifications to citeproc-hs and pandoc-types,
> but would make things easier for the markdown parser
> and the writers.

I don't think this would require modifications to citeproc-hs, we just 
need to adapt the code in Text.Pandoc.Biblio. I can do this if you want, 
if we want a textual primitive in pandoc.

My intention was to implement the textual citation as another 
CitationMode (or CitationVariant in my code). I think we don't want to 
combine textual citations with suppress-author or author-only, so adding 
another CitationMode is imho best.

I'm unsure what code is our current base by the way. Andrea stated he 
copied my code, you merged the code of Andrea. I don't actually care one 
way or the other, but to continue we should use the same changes ;-).

So I'm in favor of a textual citation primitive and volunteer to 
implement it.

Nathan


>
> Thoughts?
>
> John
>


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                                 ` <4CD6C5E6.3040300-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-07 23:05                                                                   ` John MacFarlane
       [not found]                                                                     ` <20101107230549.GA7894-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-11-07 23:12                                                                   ` John MacFarlane
  1 sibling, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-07 23:05 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Nathan Gass [Nov 07 10 16:29 ]:
> On 07.11.10 02:32, John MacFarlane wrote:
> >Before I write the parser, we need to settle whether there
> >should be a separate syntax for "textual citations",
> >and if so, how they are going to be represented.
> >
> >I myself think there should be a separate syntax, along
> >the lines I suggested earlier.  I also think we do *not*
> >need a separate syntax for "author-only."
> >
> >So, assuming that we are going to parse
> >
> >@jones99 [p. 30]
> >
> >as a textual citation -- which will ultimately be handled
> >as if it were two citations, one author-only, the next
> >suppress-author -- how should it be represented in the
> >AST?
> >
> >One possibility is just to use two Cite inlines, one with
> >author-only, the next with suppress-author.
> >
> >Advantages:
> >
> >* No changes needed to citeproc or pandoc-types
> >
> >Disadvantages:
> >
> >* Complicates parsing.  The pandoc inline parsers are set up
> >   to emit a single Inline each.  To handle this case, I'd need to
> >   add a Group Inline (simply a container for a group of
> >   Inlines, like HTML span) and emit that, or rewrite all
> >   of the Inline parsers to emit lists of Inlines, or
> >   do something fancy with parser state.
> >
> >* Complicates translation to-from bibtex.  Suppose we want
> >   an option to use bibtex instead of citeproc in the LaTeX writer.
> >   (Several people have requested this.)   The writer would
> >   have to be trained to recognize the two-inline sequence.
> >   This can be done, but as above, it complicates things.
> >
> >The alternative would be to add a textual citation inline,
> >perhaps by adding a boolean flag ("isTextual") to
> >the Cite inline element.  citeproc-hs would then have
> >to recognize the textual citation and use convert
> >to a CSL author-only followed by a suppress-author
> >in producing the formatted citation.  (We'd have to
> >settle what to do if more than one Citation is
> >included; this doesn't really make sense with a
> >textual citation.) This would require
> >further modifications to citeproc-hs and pandoc-types,
> >but would make things easier for the markdown parser
> >and the writers.
> 
> I don't think this would require modifications to citeproc-hs, we
> just need to adapt the code in Text.Pandoc.Biblio. I can do this if
> you want, if we want a textual primitive in pandoc.
> 
> My intention was to implement the textual citation as another
> CitationMode (or CitationVariant in my code). I think we don't want
> to combine textual citations with suppress-author or author-only, so
> adding another CitationMode is imho best.
> 
> I'm unsure what code is our current base by the way. Andrea stated
> he copied my code, you merged the code of Andrea. I don't actually
> care one way or the other, but to continue we should use the same
> changes ;-).
> 
> So I'm in favor of a textual citation primitive and volunteer to
> implement it.

I just had another thought about how this could be done.  It would
require some changes on the citeproc side (or perhaps just in
pandoc's Biblio module), but they'd be pretty minor.

Here's the idea:  there would be no flag in Cite for textual citations,
nor would there be a separate CitationMode for textual citations.
Instead, pandoc/citeproc would treat a Cite as a textual cite
if the list of Citations it contains meets the following conditions:

1.  the first citation in the list is an author-only
2.  all of the subsequent citations (if any) are to works by the
same author, and they are all marked suppress-author

This makes sense to me, because I don't think we'd ever use
an author-only flag *within* a regular (non-textual) citation.
This also gives us the flexibility to have textual citations
with several works by the same author, which we do want:

  Doe (1999, 33; 2000, 24; 2004) discusses the issue...

So, we can have what we want without changing the Cite or Citation
definitions at all.  We just need to change the code in
Text.Pandoc.Biblio that produce the Inlines.

I think this solution is actually better than the others that
have been discussed.  If you have

  TextCite Citation [Inline]

then you lose the ability to cite multiple works by the same author
in one textual citation.  If you have

  Cite Bool [Citation] [Inline]

then you can have incoherent citations where "textual" flag is set
but you have a list of citations with different authors.
If you make TextualCitation an CitationMode (or whatever the
terminology is), then you could get incoherent Cites containing
several Citations marked Textual, or a mix.

Does this seem a good solution?

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                                 ` <4CD6C5E6.3040300-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  2010-11-07 23:05                                                                   ` John MacFarlane
@ 2010-11-07 23:12                                                                   ` John MacFarlane
       [not found]                                                                     ` <20101107231252.GA8043-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-07 23:12 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

 
> I'm unsure what code is our current base by the way. Andrea stated
> he copied my code, you merged the code of Andrea. I don't actually
> care one way or the other, but to continue we should use the same
> changes ;-).

The code base is my citeproc branch of pandoc + Andrea's darcs
repository of citeproc-hs.  I'll try to push your test-related
patch soon.  Andrea's patches have already been pushed.

> So I'm in favor of a textual citation primitive and volunteer to
> implement it.

Let's first figure out how we want to go about this.  I just sent
a suggestion.  If that meets with approval, then Andrea should
have the first crack at implementing it, but he may want to take
you up on your offer!

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                             ` <20101107103327.GC24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-07 23:18                                                               ` John MacFarlane
       [not found]                                                                 ` <20101107231810.GB8043-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-07 23:18 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 07 10 11:33 ]:
> On Sat, Nov 06, 2010 at 12:36:34PM -0700, John MacFarlane wrote:
> > +++ John MacFarlane [Nov 06 10 11:56 ]:
> > > 
> > > https://github.com/xabbu42/pandoc/commit/03bc1c8ec26dbf1e521ffd60b5424ac9df63e53e
> > 
> > Would still be good to get his feedback on the other proposed changes.
> 
> 
> If I am right the remaining code relates to the test-suite. I'm sorry
> but I'm not familiar with it but I certainly favor the presence of
> comprehensive tests.

Having looked this over, I favor waiting a bit before adding tests.
We should have something more comprehensive, but it will be easier
to design once we've got the syntax done etc.

JOhn


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                     ` <1df2d027-9220-45b1-8126-1b0965bd7836-s+NOhRKKP/7FX/zIJQasLWB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
  2010-11-05 13:36                       ` Andrea Rossato
@ 2010-11-08  9:07                       ` Nathan Gass
       [not found]                         ` <4CD7BDD0.1040408-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-08  9:07 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 05.11.10 01:41, Frank Bennett wrote:
> Hi, all.  I'm a colleague of Andrea and Bruce on the CSL/citeproc side
> (as the author of citeproc-js), and am just now joining the
> discussion.
>
> On Nov 5, 1:06 am, John MacFarlane<fiddlosop...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>  wrote:
>
>> (In fact, I'm still somewhat attached to my proposal to make pandoc figure out
>> automatically when to use a textual citation and when not ... does anybody
>> else like that idea?)
>
> I can offer a further use case for "textual citations" that might be
> relevant.
>
> A common citation style in China is purely numeric, using numbers for
> all references, including "textual citations".  As a base example,
> suppose a text is set as follows in an author-date style:
>
>    Jones (2000) says that water is wet, which has been confirmed by
> others. (Smith 1999)
>
> In the pure numeric style, this would be rendered like this (^
> signifying superscript -- note that the "ordinary" type of reference
> is superscripted, while the "author-only" type of reference is not):
>
>    Reference [1] establishes that water is wet, a finding that has been
> refined by others.^[2]^
>
>    Bibliography
>    [1] John Jones, "Wetness of Water" (2000).
>    [2] Samuel Smith, "Dampness of Water" (1999).
>
> In a note style, the same text would be rendered something like this:
>
>    Jones^1^ establishes that water is wet, a finding that has been
> refined by others.^2^
>
>    Footnotes
>    1. "Wetness of Water" (2000).
>    2. Samuel Smith, "Dampness of Water" (1999).

Is it possible to solve the Chinese style completely, meaning we can 
generate the correct text to include in any case? The reason I ask this, 
is that it changes the game a bit.

For English, German and many other languages you can't solve textual 
citations for all cases, so our support should be mostly guided by the 
common case, and not strive to be the most flexible solution (there will 
always be some special textual citations which have to be written by hand).

If the common Chinese style does not suffer from the problems of textual 
citations in for example English, and can be completely solved, then we 
have a strong argument for a flexible solution handling the Chinese 
style in all cases.

Nathan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                                                                     ` <20101107230549.GA7894-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-08 11:00                                                                       ` Andrea Rossato
       [not found]                                                                         ` <20101108110025.GI24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2010-11-08 18:28                                                                       ` Nathan Gass
  1 sibling, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-08 11:00 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Sun, Nov 07, 2010 at 03:05:49PM -0800, John MacFarlane wrote:
> +++ Nathan Gass [Nov 07 10 16:29 ]:
> > So I'm in favor of a textual citation primitive and volunteer to
> > implement it.
> 
> I just had another thought about how this could be done.  It would
> require some changes on the citeproc side (or perhaps just in
> pandoc's Biblio module), but they'd be pretty minor.
> 
> Here's the idea:  there would be no flag in Cite for textual citations,
> nor would there be a separate CitationMode for textual citations.
> Instead, pandoc/citeproc would treat a Cite as a textual cite
> if the list of Citations it contains meets the following conditions:
> 
> 1.  the first citation in the list is an author-only
> 2.  all of the subsequent citations (if any) are to works by the
> same author, and they are all marked suppress-author
> 
> This makes sense to me, because I don't think we'd ever use
> an author-only flag *within* a regular (non-textual) citation.
> This also gives us the flexibility to have textual citations
> with several works by the same author, which we do want:
> 
>   Doe (1999, 33; 2000, 24; 2004) discusses the issue...

We do not need to be so strict with the second rule and, at the same
time, we need to require at least one citation. That is to say, the
rules should be:

 1. the first citation in the list is an author-only;
 2. a subsequent identical citation marked suppress-author and
     possibly other citations.

We could so permit:

   Doe (1999, 33; 2000, 24; 2004) discusses the issue...

but also

   Doe (1999, 33; 2000, 24; 2004; but see also Brown 2009) discusses
   the issue...

One problem: what is the syntax going to look like?

Normal Cite will be: 

    a citation [see @item1, p. 4, for an opinion; see Roe's objection
    in -@item2, chap. 3]

The "textual citation" you proposed was:

   a textual citation of @item1[p. 1-6].

That would become a Cite with a two element list [Citation]. The
Citation would have the same citationId, but the first would be
AuthorOnly and the second SuppressAuthor. The first citation goes
in-text, the second wold be placed in accordance with the style class
(in-text or footnote).

How would you express multiple textual citations? As you see we can
add any kind of citations to the Cite as long as it can be identified
by its first two elements.

> So, we can have what we want without changing the Cite or Citation
> definitions at all.  We just need to change the code in
> Text.Pandoc.Biblio that produce the Inlines.
> 
> I think this solution is actually better than the others that
> have been discussed.  If you have
> 
>   TextCite Citation [Inline]
> 
> then you lose the ability to cite multiple works by the same author
> in one textual citation.  If you have
> 
>   Cite Bool [Citation] [Inline]
> 
> then you can have incoherent citations where "textual" flag is set
> but you have a list of citations with different authors.
> If you make TextualCitation an CitationMode (or whatever the
> terminology is), then you could get incoherent Cites containing
> several Citations marked Textual, or a mix.
> 
> Does this seem a good solution?

It seems the best to me.

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                                                                     ` <20101107231252.GA8043-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-08 11:09                                                                       ` Andrea Rossato
       [not found]                                                                         ` <20101108110950.GJ24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-08 11:09 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Sun, Nov 07, 2010 at 03:12:52PM -0800, John MacFarlane wrote:
>  
> > I'm unsure what code is our current base by the way. Andrea stated
> > he copied my code, you merged the code of Andrea. I don't actually
> > care one way or the other, but to continue we should use the same
> > changes ;-).
> 
> The code base is my citeproc branch of pandoc + Andrea's darcs
> repository of citeproc-hs.  I'll try to push your test-related
> patch soon.  Andrea's patches have already been pushed.
> 
> > So I'm in favor of a textual citation primitive and volunteer to
> > implement it.
> 
> Let's first figure out how we want to go about this.  I just sent
> a suggestion.  If that meets with approval, then Andrea should
> have the first crack at implementing it, but he may want to take
> you up on your offer!

If Nathan wants to implement the pandoc side, it is fine for me. I'd
need to go back to citeproc-hs and code my way to something
releasable.

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                                                                         ` <20101108110025.GI24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-08 11:12                                                                           ` Frank Bennett
  2010-11-08 15:53                                                                           ` John MacFarlane
  1 sibling, 0 replies; 107+ messages in thread
From: Frank Bennett @ 2010-11-08 11:12 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Mon, Nov 8, 2010 at 8:00 PM, Andrea Rossato <andrea.rossato-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Sun, Nov 07, 2010 at 03:05:49PM -0800, John MacFarlane wrote:
>> +++ Nathan Gass [Nov 07 10 16:29 ]:
>> > So I'm in favor of a textual citation primitive and volunteer to
>> > implement it.
>>
>> I just had another thought about how this could be done.  It would
>> require some changes on the citeproc side (or perhaps just in
>> pandoc's Biblio module), but they'd be pretty minor.
>>
>> Here's the idea:  there would be no flag in Cite for textual citations,
>> nor would there be a separate CitationMode for textual citations.
>> Instead, pandoc/citeproc would treat a Cite as a textual cite
>> if the list of Citations it contains meets the following conditions:
>>
>> 1.  the first citation in the list is an author-only
>> 2.  all of the subsequent citations (if any) are to works by the
>> same author, and they are all marked suppress-author
>>
>> This makes sense to me, because I don't think we'd ever use
>> an author-only flag *within* a regular (non-textual) citation.
>> This also gives us the flexibility to have textual citations
>> with several works by the same author, which we do want:
>>
>>   Doe (1999, 33; 2000, 24; 2004) discusses the issue...
>
> We do not need to be so strict with the second rule

Yes, I logged in tonight to post to the same effect.  But I think it
goes further.  It shouldn't be necessary to use suppress-author at all
(see below).

> and, at the same
> time, we need to require at least one citation. That is to say, the
> rules should be:
>
>  1. the first citation in the list is an author-only;
>  2. a subsequent identical citation marked suppress-author and
>     possibly other citations.
>
> We could so permit:
>
>   Doe (1999, 33; 2000, 24; 2004) discusses the issue...

In an author-date style, these cites will collapse anyway, without an
explicit toggle (i.e. it would have been (Doe 1999, 33; 2000, 24; 2004
normally).

>
> but also
>
>   Doe (1999, 33; 2000, 24; 2004; but see also Brown 2009) discusses
>   the issue...
>
> One problem: what is the syntax going to look like?
>
> Normal Cite will be:
>
>    a citation [see @item1, p. 4, for an opinion; see Roe's objection
>    in -@item2, chap. 3]
>
> The "textual citation" you proposed was:
>
>   a textual citation of @item1[p. 1-6].
>
> That would become a Cite with a two element list [Citation]. The
> Citation would have the same citationId, but the first would be
> AuthorOnly and the second SuppressAuthor. The first citation goes
> in-text, the second wold be placed in accordance with the style class
> (in-text or footnote).
>
> How would you express multiple textual citations? As you see we can
> add any kind of citations to the Cite as long as it can be identified
> by its first two elements.
>
>> So, we can have what we want without changing the Cite or Citation
>> definitions at all.  We just need to change the code in
>> Text.Pandoc.Biblio that produce the Inlines.
>>
>> I think this solution is actually better than the others that
>> have been discussed.  If you have
>>
>>   TextCite Citation [Inline]
>>
>> then you lose the ability to cite multiple works by the same author
>> in one textual citation.  If you have
>>
>>   Cite Bool [Citation] [Inline]
>>
>> then you can have incoherent citations where "textual" flag is set
>> but you have a list of citations with different authors.
>> If you make TextualCitation an CitationMode (or whatever the
>> terminology is), then you could get incoherent Cites containing
>> several Citations marked Textual, or a mix.
>>
>> Does this seem a good solution?
>
> It seems the best to me.
>
> Andrea
>
> --
> You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
> To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.
>
>

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                                         ` <20101108110025.GI24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  2010-11-08 11:12                                                                           ` Frank Bennett
@ 2010-11-08 15:53                                                                           ` John MacFarlane
       [not found]                                                                             ` <20101108155306.GB15777-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-08 15:53 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 08 10 12:00 ]:
> On Sun, Nov 07, 2010 at 03:05:49PM -0800, John MacFarlane wrote:
> > +++ Nathan Gass [Nov 07 10 16:29 ]:
> > > So I'm in favor of a textual citation primitive and volunteer to
> > > implement it.
> > 
> > I just had another thought about how this could be done.  It would
> > require some changes on the citeproc side (or perhaps just in
> > pandoc's Biblio module), but they'd be pretty minor.
> > 
> > Here's the idea:  there would be no flag in Cite for textual citations,
> > nor would there be a separate CitationMode for textual citations.
> > Instead, pandoc/citeproc would treat a Cite as a textual cite
> > if the list of Citations it contains meets the following conditions:
> > 
> > 1.  the first citation in the list is an author-only
> > 2.  all of the subsequent citations (if any) are to works by the
> > same author, and they are all marked suppress-author
> > 
> > This makes sense to me, because I don't think we'd ever use
> > an author-only flag *within* a regular (non-textual) citation.
> > This also gives us the flexibility to have textual citations
> > with several works by the same author, which we do want:
> > 
> >   Doe (1999, 33; 2000, 24; 2004) discusses the issue...
> 
> We do not need to be so strict with the second rule and, at the same
> time, we need to require at least one citation. That is to say, the
> rules should be:
> 
>  1. the first citation in the list is an author-only;
>  2. a subsequent identical citation marked suppress-author and
>      possibly other citations.

Good, I agree.  So, something like this?

isTextualCite :: Inline -> Bool
isTextualCite (Cite (c1:c2:cs) inls) =
  citationMode c1 == AuthorOnly && citationMode c2 == SuppressAuthor &&
  citationId c1 == citationId c2
isTextualCite _ = False

> We could so permit:
> 
>    Doe (1999, 33; 2000, 24; 2004) discusses the issue...
> 
> but also
> 
>    Doe (1999, 33; 2000, 24; 2004; but see also Brown 2009) discusses
>    the issue...
> 
> One problem: what is the syntax going to look like?
> 
> Normal Cite will be: 
> 
>     a citation [see @item1, p. 4, for an opinion; see Roe's objection
>     in -@item2, chap. 3]
> 
> The "textual citation" you proposed was:
> 
>    a textual citation of @item1[p. 1-6].
> 
> That would become a Cite with a two element list [Citation]. The
> Citation would have the same citationId, but the first would be
> AuthorOnly and the second SuppressAuthor. The first citation goes
> in-text, the second wold be placed in accordance with the style class
> (in-text or footnote).
> 
> How would you express multiple textual citations? As you see we can
> add any kind of citations to the Cite as long as it can be identified
> by its first two elements.

Your example above

>    Doe (1999, 33; 2000, 24; 2004; but see also Brown 2009) discusses

would be:

  @doe99 [33; @doe00, 24; @doe04; but see also @brown09] discusses

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                                         ` <20101108110950.GJ24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-08 16:03                                                                           ` John MacFarlane
  0 siblings, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-08 16:03 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 08 10 12:09 ]:
> On Sun, Nov 07, 2010 at 03:12:52PM -0800, John MacFarlane wrote:
> >  
> > > I'm unsure what code is our current base by the way. Andrea stated
> > > he copied my code, you merged the code of Andrea. I don't actually
> > > care one way or the other, but to continue we should use the same
> > > changes ;-).
> > 
> > The code base is my citeproc branch of pandoc + Andrea's darcs
> > repository of citeproc-hs.  I'll try to push your test-related
> > patch soon.  Andrea's patches have already been pushed.
> > 
> > > So I'm in favor of a textual citation primitive and volunteer to
> > > implement it.
> > 
> > Let's first figure out how we want to go about this.  I just sent
> > a suggestion.  If that meets with approval, then Andrea should
> > have the first crack at implementing it, but he may want to take
> > you up on your offer!
> 
> If Nathan wants to implement the pandoc side, it is fine for me. I'd
> need to go back to citeproc-hs and code my way to something
> releasable.

Looking back at the code, it seems to me that most of the work
is going to be in citeproc -- since that's what generates an
[FormattedOutput] corresponding to each [Citation].
Is this right?  If so, maybe it would be best for you to make
those changes first, so we'll know what needs to be done on
the Pandoc side...

A suggestion for Text.Pandoc.Biblio:  in processCite, we should
use a M.Map [Citation] [FormattedOutput] instead of an association
list (if possible), for performance.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                                                                             ` <20101108155306.GB15777-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-08 17:14                                                                               ` Andrea Rossato
       [not found]                                                                                 ` <20101108171429.GM24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Andrea Rossato @ 2010-11-08 17:14 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Mon, Nov 08, 2010 at 07:53:06AM -0800, John MacFarlane wrote:
> Good, I agree.  So, something like this?
> 
> isTextualCite :: Inline -> Bool
> isTextualCite (Cite (c1:c2:cs) inls) =
>   citationMode c1 == AuthorOnly && citationMode c2 == SuppressAuthor &&
>   citationId c1 == citationId c2
> isTextualCite _ = False


I'm thinking about getting rid of c2. We change the name of the
citation mode constructor to AuthorInText and:

isTextualCite :: Inline -> Bool
isTextualCite (Cite (c:_) _) = citationMode c == AuthorInText
isTextualCite _ = False

and that's it.

The answer will not be a Map (there is no guarantee that after running
the citeproc the order of [Citation] is preserved in the order of
[FormattedOutput]). But in the case of a textual citation citeproc
will return a [FormattedOutput] whose head is the label to be placed
in-text. The rest is the formatted citation group and may be placed in
a footnote or stay in-text depending on the style class.

That should work, right?

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                                                 ` <20101108171429.GM24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
@ 2010-11-08 17:38                                                                                   ` John MacFarlane
  0 siblings, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-08 17:38 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Andrea Rossato [Nov 08 10 18:14 ]:
> On Mon, Nov 08, 2010 at 07:53:06AM -0800, John MacFarlane wrote:
> > Good, I agree.  So, something like this?
> > 
> > isTextualCite :: Inline -> Bool
> > isTextualCite (Cite (c1:c2:cs) inls) =
> >   citationMode c1 == AuthorOnly && citationMode c2 == SuppressAuthor &&
> >   citationId c1 == citationId c2
> > isTextualCite _ = False
> 
> 
> I'm thinking about getting rid of c2. We change the name of the
> citation mode constructor to AuthorInText and:
> 
> isTextualCite :: Inline -> Bool
> isTextualCite (Cite (c:_) _) = citationMode c == AuthorInText
> isTextualCite _ = False
> 
> and that's it.

Sounds good to me!  Do you want me to change AuthorOnly to
AuthorInText in pandoc-types?

Also, how will citeproc handle cases where an AuthorInText
citation occurs outside of the first position in the Citation
list?

> The answer will not be a Map (there is no guarantee that after running
> the citeproc the order of [Citation] is preserved in the order of
> [FormattedOutput]). But in the case of a textual citation citeproc
> will return a [FormattedOutput] whose head is the label to be placed
> in-text. The rest is the formatted citation group and may be placed in
> a footnote or stay in-text depending on the style class.
> 
> That should work, right?

When I talked about a Map, I just meant the code for processCite in the Biblio
module. Here's the suggestion:

in processBiblio,

+import qualified Data.Map as M

-           cits_map   = zip grps (citations result)
+           cits_map   = M.fromList $ zip grps (citations result)

in processCite,

-processCite :: Style -> [([Citation],[FormattedOutput])] -> Inline -> Inline
+processCite :: Style -> M.Map [Citation] [FormattedOutput] -> Inline -> Inline
processCite s cs il
    | Cite t _ <- il = Cite t (process t)
    | otherwise      = il
    where
-     process t = case lookup t cs of
+     process t = case M.lookup t cs of
                    Just  i -> renderPandoc s i
                    Nothing -> [Str ("Error processing " ++ show t)]

M.lookup should be a lot faster than lookup when you have lots of entries.

John



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                         ` <4CD7BDD0.1040408-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-08 18:02                           ` John MacFarlane
  0 siblings, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-08 18:02 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Nathan Gass [Nov 08 10 10:07 ]:
> On 05.11.10 01:41, Frank Bennett wrote:
> >Hi, all.  I'm a colleague of Andrea and Bruce on the CSL/citeproc side
> >(as the author of citeproc-js), and am just now joining the
> >discussion.
> >
> >On Nov 5, 1:06 am, John MacFarlane<fiddlosop...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>  wrote:
> >
> >>(In fact, I'm still somewhat attached to my proposal to make pandoc figure out
> >>automatically when to use a textual citation and when not ... does anybody
> >>else like that idea?)
> >
> >I can offer a further use case for "textual citations" that might be
> >relevant.
> >
> >A common citation style in China is purely numeric, using numbers for
> >all references, including "textual citations".  As a base example,
> >suppose a text is set as follows in an author-date style:
> >
> >   Jones (2000) says that water is wet, which has been confirmed by
> >others. (Smith 1999)
> >
> >In the pure numeric style, this would be rendered like this (^
> >signifying superscript -- note that the "ordinary" type of reference
> >is superscripted, while the "author-only" type of reference is not):
> >
> >   Reference [1] establishes that water is wet, a finding that has been
> >refined by others.^[2]^
> >
> >   Bibliography
> >   [1] John Jones, "Wetness of Water" (2000).
> >   [2] Samuel Smith, "Dampness of Water" (1999).
> >
> >In a note style, the same text would be rendered something like this:
> >
> >   Jones^1^ establishes that water is wet, a finding that has been
> >refined by others.^2^
> >
> >   Footnotes
> >   1. "Wetness of Water" (2000).
> >   2. Samuel Smith, "Dampness of Water" (1999).
> 
> Is it possible to solve the Chinese style completely, meaning we can
> generate the correct text to include in any case? The reason I ask
> this, is that it changes the game a bit.
> 
> For English, German and many other languages you can't solve textual
> citations for all cases, so our support should be mostly guided by
> the common case, and not strive to be the most flexible solution
> (there will always be some special textual citations which have to
> be written by hand).
> 
> If the common Chinese style does not suffer from the problems of
> textual citations in for example English, and can be completely
> solved, then we have a strong argument for a flexible solution
> handling the Chinese style in all cases.

If I understood the earlier exchange with Frank and Andrea properly,
we should be able to handle the Chinese style.  The CSL will tell
citeproc to format an "author-only" citation (now, "AuthorInText")
as "Reference [n]" rather than as the author's name.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                                     ` <20101107230549.GA7894-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-11-08 11:00                                                                       ` Andrea Rossato
@ 2010-11-08 18:28                                                                       ` Nathan Gass
       [not found]                                                                         ` <4CD8415F.8030703-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-08 18:28 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 08.11.10 00:05, John MacFarlane wrote:
> +++ Nathan Gass [Nov 07 10 16:29 ]:
>> On 07.11.10 02:32, John MacFarlane wrote:
>>> Before I write the parser, we need to settle whether there
>>> should be a separate syntax for "textual citations",
>>> and if so, how they are going to be represented.
>>>
>>> I myself think there should be a separate syntax, along
>>> the lines I suggested earlier.  I also think we do *not*
>>> need a separate syntax for "author-only."
>>>
>>> So, assuming that we are going to parse
>>>
>>> @jones99 [p. 30]
>>>
>>> as a textual citation -- which will ultimately be handled
>>> as if it were two citations, one author-only, the next
>>> suppress-author -- how should it be represented in the
>>> AST?
>>>
>>> One possibility is just to use two Cite inlines, one with
>>> author-only, the next with suppress-author.
>>>
>>> Advantages:
>>>
>>> * No changes needed to citeproc or pandoc-types
>>>
>>> Disadvantages:
>>>
>>> * Complicates parsing.  The pandoc inline parsers are set up
>>>    to emit a single Inline each.  To handle this case, I'd need to
>>>    add a Group Inline (simply a container for a group of
>>>    Inlines, like HTML span) and emit that, or rewrite all
>>>    of the Inline parsers to emit lists of Inlines, or
>>>    do something fancy with parser state.
>>>
>>> * Complicates translation to-from bibtex.  Suppose we want
>>>    an option to use bibtex instead of citeproc in the LaTeX writer.
>>>    (Several people have requested this.)   The writer would
>>>    have to be trained to recognize the two-inline sequence.
>>>    This can be done, but as above, it complicates things.
>>>
>>> The alternative would be to add a textual citation inline,
>>> perhaps by adding a boolean flag ("isTextual") to
>>> the Cite inline element.  citeproc-hs would then have
>>> to recognize the textual citation and use convert
>>> to a CSL author-only followed by a suppress-author
>>> in producing the formatted citation.  (We'd have to
>>> settle what to do if more than one Citation is
>>> included; this doesn't really make sense with a
>>> textual citation.) This would require
>>> further modifications to citeproc-hs and pandoc-types,
>>> but would make things easier for the markdown parser
>>> and the writers.
>>
>> I don't think this would require modifications to citeproc-hs, we
>> just need to adapt the code in Text.Pandoc.Biblio. I can do this if
>> you want, if we want a textual primitive in pandoc.
>>
>> My intention was to implement the textual citation as another
>> CitationMode (or CitationVariant in my code). I think we don't want
>> to combine textual citations with suppress-author or author-only, so
>> adding another CitationMode is imho best.
>>
>> I'm unsure what code is our current base by the way. Andrea stated
>> he copied my code, you merged the code of Andrea. I don't actually
>> care one way or the other, but to continue we should use the same
>> changes ;-).
>>
>> So I'm in favor of a textual citation primitive and volunteer to
>> implement it.
>
> I just had another thought about how this could be done.  It would
> require some changes on the citeproc side (or perhaps just in
> pandoc's Biblio module), but they'd be pretty minor.
>
> Here's the idea:  there would be no flag in Cite for textual citations,
> nor would there be a separate CitationMode for textual citations.
> Instead, pandoc/citeproc would treat a Cite as a textual cite
> if the list of Citations it contains meets the following conditions:
>
> 1.  the first citation in the list is an author-only
> 2.  all of the subsequent citations (if any) are to works by the
> same author, and they are all marked suppress-author
>
> This makes sense to me, because I don't think we'd ever use
> an author-only flag *within* a regular (non-textual) citation.
> This also gives us the flexibility to have textual citations
> with several works by the same author, which we do want:
>
>    Doe (1999, 33; 2000, 24; 2004) discusses the issue...
>
> So, we can have what we want without changing the Cite or Citation
> definitions at all.  We just need to change the code in
> Text.Pandoc.Biblio that produce the Inlines.

Do we replace the +@doe99 syntax with this one or do we have both?
In the first case I'm in favor (with the caveat that I like Andrea's 
idea to use only one Citation marked as textual, instead of one 
author-only and one suppress-author). As long as we can't handle every 
textual citation, we should handle the most common one and require the 
rest to be written by hand.

In the second case I see some problems, at least if we don't go with the 
specially marked textual citation. On one hand the following two 
examples would parse exactly the same and both give you a textual citation:

1. As @doe99 [p. 10; but see @brown p. 20]

2. As [+@doe99 -@doe99 p.10; but see @brown p. 20]

On the other hand, author-only can be used for citations like

3. some text [+@doe99 said this in -@doe99 p. 10; but see @brown p. 20]

I think the rules on how our markdown gets interpreted gets too complex 
and intelligent at this point. So if we have the author-only version and 
a textual cite, the second example in my list should be rendered just as 
you would expect from the third, and needs to be disambiguated from the 
first, even if actually writing the second example is probably senseless.

>
> I think this solution is actually better than the others that
> have been discussed.  If you have
>
>    TextCite Citation [Inline]
>
> then you lose the ability to cite multiple works by the same author
> in one textual citation.  If you have
>
>    Cite Bool [Citation] [Inline]
>
> then you can have incoherent citations where "textual" flag is set
> but you have a list of citations with different authors.
> If you make TextualCitation an CitationMode (or whatever the
> terminology is), then you could get incoherent Cites containing
> several Citations marked Textual, or a mix.

You're right, that was a bad idea. As others mentioned, the flag in the 
Cite inline can not actually lead to incoherent citations, as you may 
want to mention other authors in a textual citation, and it works 
correctly (did not test this myself) for multiple citations of the same 
author. So this is imho the best solution.

Nathan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                                         ` <4CD8415F.8030703-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-08 19:16                                                                           ` John MacFarlane
  0 siblings, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-08 19:16 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Nathan Gass [Nov 08 10 19:28 ]:
> On 08.11.10 00:05, John MacFarlane wrote:
> >+++ Nathan Gass [Nov 07 10 16:29 ]:
> >>On 07.11.10 02:32, John MacFarlane wrote:
> >>>Before I write the parser, we need to settle whether there
> >>>should be a separate syntax for "textual citations",
> >>>and if so, how they are going to be represented.
> >>>
> >>>I myself think there should be a separate syntax, along
> >>>the lines I suggested earlier.  I also think we do *not*
> >>>need a separate syntax for "author-only."
> >>>
> >>>So, assuming that we are going to parse
> >>>
> >>>@jones99 [p. 30]
> >>>
> >>>as a textual citation -- which will ultimately be handled
> >>>as if it were two citations, one author-only, the next
> >>>suppress-author -- how should it be represented in the
> >>>AST?
> >>>
> >>>One possibility is just to use two Cite inlines, one with
> >>>author-only, the next with suppress-author.
> >>>
> >>>Advantages:
> >>>
> >>>* No changes needed to citeproc or pandoc-types
> >>>
> >>>Disadvantages:
> >>>
> >>>* Complicates parsing.  The pandoc inline parsers are set up
> >>>   to emit a single Inline each.  To handle this case, I'd need to
> >>>   add a Group Inline (simply a container for a group of
> >>>   Inlines, like HTML span) and emit that, or rewrite all
> >>>   of the Inline parsers to emit lists of Inlines, or
> >>>   do something fancy with parser state.
> >>>
> >>>* Complicates translation to-from bibtex.  Suppose we want
> >>>   an option to use bibtex instead of citeproc in the LaTeX writer.
> >>>   (Several people have requested this.)   The writer would
> >>>   have to be trained to recognize the two-inline sequence.
> >>>   This can be done, but as above, it complicates things.
> >>>
> >>>The alternative would be to add a textual citation inline,
> >>>perhaps by adding a boolean flag ("isTextual") to
> >>>the Cite inline element.  citeproc-hs would then have
> >>>to recognize the textual citation and use convert
> >>>to a CSL author-only followed by a suppress-author
> >>>in producing the formatted citation.  (We'd have to
> >>>settle what to do if more than one Citation is
> >>>included; this doesn't really make sense with a
> >>>textual citation.) This would require
> >>>further modifications to citeproc-hs and pandoc-types,
> >>>but would make things easier for the markdown parser
> >>>and the writers.
> >>
> >>I don't think this would require modifications to citeproc-hs, we
> >>just need to adapt the code in Text.Pandoc.Biblio. I can do this if
> >>you want, if we want a textual primitive in pandoc.
> >>
> >>My intention was to implement the textual citation as another
> >>CitationMode (or CitationVariant in my code). I think we don't want
> >>to combine textual citations with suppress-author or author-only, so
> >>adding another CitationMode is imho best.
> >>
> >>I'm unsure what code is our current base by the way. Andrea stated
> >>he copied my code, you merged the code of Andrea. I don't actually
> >>care one way or the other, but to continue we should use the same
> >>changes ;-).
> >>
> >>So I'm in favor of a textual citation primitive and volunteer to
> >>implement it.
> >
> >I just had another thought about how this could be done.  It would
> >require some changes on the citeproc side (or perhaps just in
> >pandoc's Biblio module), but they'd be pretty minor.
> >
> >Here's the idea:  there would be no flag in Cite for textual citations,
> >nor would there be a separate CitationMode for textual citations.
> >Instead, pandoc/citeproc would treat a Cite as a textual cite
> >if the list of Citations it contains meets the following conditions:
> >
> >1.  the first citation in the list is an author-only
> >2.  all of the subsequent citations (if any) are to works by the
> >same author, and they are all marked suppress-author
> >
> >This makes sense to me, because I don't think we'd ever use
> >an author-only flag *within* a regular (non-textual) citation.
> >This also gives us the flexibility to have textual citations
> >with several works by the same author, which we do want:
> >
> >   Doe (1999, 33; 2000, 24; 2004) discusses the issue...
> >
> >So, we can have what we want without changing the Cite or Citation
> >definitions at all.  We just need to change the code in
> >Text.Pandoc.Biblio that produce the Inlines.
> 
> Do we replace the +@doe99 syntax with this one or do we have both?
> In the first case I'm in favor (with the caveat that I like Andrea's
> idea to use only one Citation marked as textual, instead of one
> author-only and one suppress-author). As long as we can't handle
> every textual citation, we should handle the most common one and
> require the rest to be written by hand.

I don't think we need a special syntax for author-only, so
I'm in favor of dropping the + variant.  (We still need
the - variant for suppress-author.)

> In the second case I see some problems, at least if we don't go with
> the specially marked textual citation. On one hand the following two
> examples would parse exactly the same and both give you a textual
> citation:
> 
> 1. As @doe99 [p. 10; but see @brown p. 20]
> 
> 2. As [+@doe99 -@doe99 p.10; but see @brown p. 20]
> 
> On the other hand, author-only can be used for citations like
> 
> 3. some text [+@doe99 said this in -@doe99 p. 10; but see @brown p. 20]
> 
> I think the rules on how our markdown gets interpreted gets too
> complex and intelligent at this point. So if we have the author-only
> version and a textual cite, the second example in my list should be
> rendered just as you would expect from the third, and needs to be
> disambiguated from the first, even if actually writing the second
> example is probably senseless.

Yes, I think it's better to simplify and have writers spell out
"Doe" if they want (3) to come out as a regular citation.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                                 ` <20101107231810.GB8043-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-10  0:02                                                                   ` Nathan Gass
       [not found]                                                                     ` <4CD9E118.1020801-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-10  0:02 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 08.11.10 00:18, John MacFarlane wrote:
> +++ Andrea Rossato [Nov 07 10 11:33 ]:
>> On Sat, Nov 06, 2010 at 12:36:34PM -0700, John MacFarlane wrote:
>>> +++ John MacFarlane [Nov 06 10 11:56 ]:
>>>>
>>>> https://github.com/xabbu42/pandoc/commit/03bc1c8ec26dbf1e521ffd60b5424ac9df63e53e
>>>
>>> Would still be good to get his feedback on the other proposed changes.
>>
>>
>> If I am right the remaining code relates to the test-suite. I'm sorry
>> but I'm not familiar with it but I certainly favor the presence of
>> comprehensive tests.
>
> Having looked this over, I favor waiting a bit before adding tests.
> We should have something more comprehensive, but it will be easier
> to design once we've got the syntax done etc.

So rather than having some minimal testing for the currently implemented 
syntax we rather have none at all? I can't understand the rational here.

Anyway, I'm currently blocked on not having *any* automated tests for my 
latex writer and reader. My normal approach would be to use the same 
test file to test the latex writer and reader, and then expand the file 
with more cases whenever everything already in there works. I'm at a 
loss on how to come up with a comprehensive (for some definition of 
comprehensive) test-suite in one go, so I have to wait on you guys 
producing something acceptable.

Nathan

>
> JOhn
>


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                                     ` <4CD9E118.1020801-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-10  2:14                                                                       ` John MacFarlane
       [not found]                                                                         ` <20101110021409.GA22040-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: John MacFarlane @ 2010-11-10  2:14 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Nathan Gass [Nov 10 10 01:02 ]:
> On 08.11.10 00:18, John MacFarlane wrote:
> >+++ Andrea Rossato [Nov 07 10 11:33 ]:
> >>On Sat, Nov 06, 2010 at 12:36:34PM -0700, John MacFarlane wrote:
> >>>+++ John MacFarlane [Nov 06 10 11:56 ]:
> >>>>
> >>>>https://github.com/xabbu42/pandoc/commit/03bc1c8ec26dbf1e521ffd60b5424ac9df63e53e
> >>>
> >>>Would still be good to get his feedback on the other proposed changes.
> >>
> >>
> >>If I am right the remaining code relates to the test-suite. I'm sorry
> >>but I'm not familiar with it but I certainly favor the presence of
> >>comprehensive tests.
> >
> >Having looked this over, I favor waiting a bit before adding tests.
> >We should have something more comprehensive, but it will be easier
> >to design once we've got the syntax done etc.
> 
> So rather than having some minimal testing for the currently
> implemented syntax we rather have none at all? I can't understand
> the rational here.
> 
> Anyway, I'm currently blocked on not having *any* automated tests
> for my latex writer and reader. My normal approach would be to use
> the same test file to test the latex writer and reader, and then
> expand the file with more cases whenever everything already in there
> works. I'm at a loss on how to come up with a comprehensive (for
> some definition of comprehensive) test-suite in one go, so I have to
> wait on you guys producing something acceptable.

Why not wait until we have things settled in citeproc and pandoc?
It shouldn't be too much longer.  Otherwise you're just going to
have to keep starting over with your own coding project, which
is frustrating, I'm sure.

John



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                                         ` <20101110021409.GA22040-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-11-10  9:55                                                                           ` Nathan Gass
       [not found]                                                                             ` <4CDA6C2F.3080301-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  0 siblings, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-10  9:55 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 10.11.10 03:14, John MacFarlane wrote:
> +++ Nathan Gass [Nov 10 10 01:02 ]:
>> On 08.11.10 00:18, John MacFarlane wrote:
>>> +++ Andrea Rossato [Nov 07 10 11:33 ]:
>>>> On Sat, Nov 06, 2010 at 12:36:34PM -0700, John MacFarlane wrote:
>>>>> +++ John MacFarlane [Nov 06 10 11:56 ]:
>>>>>>
>>>>>> https://github.com/xabbu42/pandoc/commit/03bc1c8ec26dbf1e521ffd60b5424ac9df63e53e
>>>>>
>>>>> Would still be good to get his feedback on the other proposed changes.
>>>>
>>>>
>>>> If I am right the remaining code relates to the test-suite. I'm sorry
>>>> but I'm not familiar with it but I certainly favor the presence of
>>>> comprehensive tests.
>>>
>>> Having looked this over, I favor waiting a bit before adding tests.
>>> We should have something more comprehensive, but it will be easier
>>> to design once we've got the syntax done etc.
>>
>> So rather than having some minimal testing for the currently
>> implemented syntax we rather have none at all? I can't understand
>> the rational here.
>>
>> Anyway, I'm currently blocked on not having *any* automated tests
>> for my latex writer and reader. My normal approach would be to use
>> the same test file to test the latex writer and reader, and then
>> expand the file with more cases whenever everything already in there
>> works. I'm at a loss on how to come up with a comprehensive (for
>> some definition of comprehensive) test-suite in one go, so I have to
>> wait on you guys producing something acceptable.
>
> Why not wait until we have things settled in citeproc and pandoc?
> It shouldn't be too much longer.  Otherwise you're just going to
> have to keep starting over with your own coding project, which
> is frustrating, I'm sure.

Why should I have to keep starting over? That is, as long as Andrea and 
I have a common base on which we can work.

Nathan

>
> John
>
>


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                                                                             ` <4CDA6C2F.3080301-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-10 20:12                                                                               ` John MacFarlane
  0 siblings, 0 replies; 107+ messages in thread
From: John MacFarlane @ 2010-11-10 20:12 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Nathan Gass [Nov 10 10 10:55 ]:
> On 10.11.10 03:14, John MacFarlane wrote:
> >+++ Nathan Gass [Nov 10 10 01:02 ]:
> >>On 08.11.10 00:18, John MacFarlane wrote:
> >>>+++ Andrea Rossato [Nov 07 10 11:33 ]:
> >>>>On Sat, Nov 06, 2010 at 12:36:34PM -0700, John MacFarlane wrote:
> >>>>>+++ John MacFarlane [Nov 06 10 11:56 ]:
> >>>>>>
> >>>>>>https://github.com/xabbu42/pandoc/commit/03bc1c8ec26dbf1e521ffd60b5424ac9df63e53e
> >>>>>
> >>>>>Would still be good to get his feedback on the other proposed changes.
> >>>>
> >>>>
> >>>>If I am right the remaining code relates to the test-suite. I'm sorry
> >>>>but I'm not familiar with it but I certainly favor the presence of
> >>>>comprehensive tests.
> >>>
> >>>Having looked this over, I favor waiting a bit before adding tests.
> >>>We should have something more comprehensive, but it will be easier
> >>>to design once we've got the syntax done etc.
> >>
> >>So rather than having some minimal testing for the currently
> >>implemented syntax we rather have none at all? I can't understand
> >>the rational here.
> >>
> >>Anyway, I'm currently blocked on not having *any* automated tests
> >>for my latex writer and reader. My normal approach would be to use
> >>the same test file to test the latex writer and reader, and then
> >>expand the file with more cases whenever everything already in there
> >>works. I'm at a loss on how to come up with a comprehensive (for
> >>some definition of comprehensive) test-suite in one go, so I have to
> >>wait on you guys producing something acceptable.
> >
> >Why not wait until we have things settled in citeproc and pandoc?
> >It shouldn't be too much longer.  Otherwise you're just going to
> >have to keep starting over with your own coding project, which
> >is frustrating, I'm sure.
> 
> Why should I have to keep starting over? That is, as long as Andrea
> and I have a common base on which we can work.

Well, Andrea's code has changed quite a bit in the last two weeks -
including the interface - and it is going to change some more.

I don't want to discourage you from working, if you're happy to
code against a moving target.  But for the sake of efficiency,
I'm not going to review and integrate patches until the basics
have been settled.  I hope this won't be much longer.

John


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                 ` <1d77490f-4c76-4571-8f53-6902d1604ba5-PQeItPOgslmbvKjc6lLfglYGCWtFR9XvQQ4Iyu8u01E@public.gmane.org>
  2010-10-30 15:18                   ` Bruce
@ 2010-11-13  9:46                   ` lukshuntim-Re5JQEeQqe8AvxtiuMwx3w
       [not found]                     ` <4CDE5E5A.2000707-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2010-11-13 16:16                   ` fiddlosopher
  2 siblings, 1 reply; 107+ messages in thread
From: lukshuntim-Re5JQEeQqe8AvxtiuMwx3w @ 2010-11-13  9:46 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw; +Cc: fiddlosopher

On 10/30/2010 12:13 PM, fiddlosopher wrote:
>> I've only just checked in. I'm pretty busy these days, and it's quite
>> hard to follow all this. Can someone update me on where this
>> conversation stands, and what sorts of details you need feedback on?
>>
>> Also, if I want to test this out with some real world work, what's the
>> easiest way to get it running?
> 
> Bruce,
> 
> Here's how to get it running.
> 
> cabal update
> darcs get http://code.haskell.org/citeproc-hs
> cd citeproc-hs
> cabal install
> cd ..
> git clone -b citeproc git://github.com/jgm/pandoc.git pandoc-citeproc
> cd pandoc-citeproc
> cabal install
> 
> You should now have a version of pandoc compiled against Andrea's
> latest citeproc library.  (Let me know if there are any difficulties
> here.)
> 
> To test, do something like
> 
> pandoc -s -S --biblio biblio.bib --csl style.csl test.markdown >
> test.html

It doesn't work for me. :-( I got

"$ pandoc -s -S --biblio biblio.bib --csl style.csl test.markdown >test.html
pandoc: unrecognized option `--biblio'
pandoc: unrecognized option `--csl'
Try pandoc --help for more information."

Here's what pandoc --help" gives.

"$ pandoc --help
pandoc [OPTIONS] [FILES]
Input formats:  native, markdown, markdown+lhs, rst, rst+lhs, html,
latex, latex+lhs
Output formats:  native, html, html+lhs, s5, slidy, docbook,
opendocument, odt, epub, latex, latex+lhs, context, texinfo, man,
markdown, markdown+lhs, plain, rst, rst+lhs, mediawiki, rtf
Options:
  -f FORMAT, -r FORMAT  --from=FORMAT, --read=FORMAT
  -t FORMAT, -w FORMAT  --to=FORMAT, --write=FORMAT
  -s                    --standalone
  -o FILENAME           --output=FILENAME
  -p                    --preserve-tabs
                        --tab-stop=TABSTOP
                        --strict
                        --reference-links
  -R                    --parse-raw
  -S                    --smart
  -m[URL]               --latexmathml[=URL], --asciimathml[=URL]
                        --mathml[=URL]
                        --mimetex[=URL]
                        --webtex[=URL]
                        --jsmath[=URL]
                        --mathjax=URL
                        --gladtex
  -i                    --incremental
                        --offline
                        --xetex
  -N                    --number-sections
                        --section-divs
                        --no-wrap
                        --sanitize-html
                        --email-obfuscation=none|javascript|references
                        --id-prefix=STRING
                        --indented-code-classes=STRING
                        --toc, --table-of-contents
                        --base-header-level=LEVEL
                        --template=FILENAME
  -V KEY:VALUE          --variable=KEY:VALUE
  -c URL                --css=URL
  -H FILENAME           --include-in-header=FILENAME
  -B FILENAME           --include-before-body=FILENAME
  -A FILENAME           --include-after-body=FILENAME
  -C FILENAME           --custom-header=FILENAME
  -T STRING             --title-prefix=STRING
                        --reference-odt=FILENAME
                        --epub-stylesheet=FILENAME
                        --epub-metadata=FILENAME
  -D FORMAT             --print-default-template=FORMAT
                        --data-dir=DIRECTORY
                        --dump-args
                        --ignore-args
  -v                    --version
  -h                   --help"

Pandoc version is

"$ pandoc --version
pandoc 1.7"

Hope I've not missed anything.

Regards.
ST
--


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                     ` <4CDE5E5A.2000707-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2010-11-13 11:44                       ` Nathan Gass
       [not found]                         ` <4CDE7A10.7060100-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  2010-11-13 11:48                       ` Andrea Rossato
  1 sibling, 1 reply; 107+ messages in thread
From: Nathan Gass @ 2010-11-13 11:44 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 13.11.10 10:46, lukshuntim-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> On 10/30/2010 12:13 PM, fiddlosopher wrote:
>>> I've only just checked in. I'm pretty busy these days, and it's quite
>>> hard to follow all this. Can someone update me on where this
>>> conversation stands, and what sorts of details you need feedback on?
>>>
>>> Also, if I want to test this out with some real world work, what's the
>>> easiest way to get it running?
>>
>> Bruce,
>>
>> Here's how to get it running.
>>
>> cabal update
>> darcs get http://code.haskell.org/citeproc-hs
>> cd citeproc-hs
>> cabal install
>> cd ..
>> git clone -b citeproc git://github.com/jgm/pandoc.git pandoc-citeproc
>> cd pandoc-citeproc
>> cabal install
>>
>> You should now have a version of pandoc compiled against Andrea's
>> latest citeproc library.  (Let me know if there are any difficulties
>> here.)
>>
>> To test, do something like
>>
>> pandoc -s -S --biblio biblio.bib --csl style.csl test.markdown>
>> test.html
>
> It doesn't work for me. :-( I got
>
> "$ pandoc -s -S --biblio biblio.bib --csl style.csl test.markdown>test.html
> pandoc: unrecognized option `--biblio'
> pandoc: unrecognized option `--csl'
> Try pandoc --help for more information."

The instructions are missing a -fciteproc, so the correct way is:

cd pandoc-citeproc
cabal install -fciteproc

instead of

cd pandoc-citeproc
cabal install

I'm currently in the process of adapting gitit to use the citation 
support and will make one available online shortly. This will hopefully 
enable a lot more people to try out what we have and add to the discussion.

Greetings
Nathan


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                     ` <4CDE5E5A.2000707-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2010-11-13 11:44                       ` Nathan Gass
@ 2010-11-13 11:48                       ` Andrea Rossato
  1 sibling, 0 replies; 107+ messages in thread
From: Andrea Rossato @ 2010-11-13 11:48 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Sat, Nov 13, 2010 at 05:46:02PM +0800, lukshuntim-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> On 10/30/2010 12:13 PM, fiddlosopher wrote:
> >> I've only just checked in. I'm pretty busy these days, and it's quite
> >> hard to follow all this. Can someone update me on where this
> >> conversation stands, and what sorts of details you need feedback on?
> >>
> >> Also, if I want to test this out with some real world work, what's the
> >> easiest way to get it running?
> > 
> > Bruce,
> > 
> > Here's how to get it running.
> > 
> > cabal update
> > darcs get http://code.haskell.org/citeproc-hs
> > cd citeproc-hs
> > cabal install
> > cd ..
> > git clone -b citeproc git://github.com/jgm/pandoc.git pandoc-citeproc
> > cd pandoc-citeproc
> > cabal install
> > 
> > You should now have a version of pandoc compiled against Andrea's
> > latest citeproc library.  (Let me know if there are any difficulties
> > here.)
> > 
> > To test, do something like
> > 
> > pandoc -s -S --biblio biblio.bib --csl style.csl test.markdown >
> > test.html
> 
> It doesn't work for me. :-( I got
[...]
> 
> Hope I've not missed anything.

It seems you did, since pandoc was compiled without the proper support
for citeproc.

I suppose you were able to get pandoc-types before installing
citeproc-hs.

Then, update you citeproc-hs tree by running 'darcs pull' and compile
as described above.

Then, after pulling the pandoc-citeproc branch, instead of 'cabal
install' I think you should run:

runhaskell Setup.hs configure -fciteproc
runhaskell Setup.hs build

That is to say, you need to use the citeproc flag (it should be
possible to pass that flag to cabal, but I'm not sure since I do not
use it).

Please report back any problem you may find. Thanks.

Hope this helps.

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                         ` <4CDE7A10.7060100-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
@ 2010-11-13 13:22                           ` Andrea Rossato
  2010-11-13 13:46                           ` lukshuntim-Re5JQEeQqe8AvxtiuMwx3w
  1 sibling, 0 replies; 107+ messages in thread
From: Andrea Rossato @ 2010-11-13 13:22 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Sat, Nov 13, 2010 at 12:44:16PM +0100, Nathan Gass wrote:
> I'm currently in the process of adapting gitit to use the citation
> support and will make one available online shortly. This will
> hopefully enable a lot more people to try out what we have and add
> to the discussion.

This is cool| I would like to eventually write a pandoc parser for
UniWakka, a wiki I used to develop, derived from wakkawiki which was a
popular wiki engine about a couple of centuries ago.

UniWakka had/has (I still run it locally) bibliographic and mathml
support. Maybe a conversion tool to port the database to gitit would
be possible... but, given my present time constraint, this is probably
just a dream...:-)

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                         ` <4CDE7A10.7060100-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
  2010-11-13 13:22                           ` Andrea Rossato
@ 2010-11-13 13:46                           ` lukshuntim-Re5JQEeQqe8AvxtiuMwx3w
       [not found]                             ` <4CDE96CE.8030000-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 107+ messages in thread
From: lukshuntim-Re5JQEeQqe8AvxtiuMwx3w @ 2010-11-13 13:46 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw; +Cc: Nathan Gass

On 11/13/2010 07:44 PM, Nathan Gass wrote:
> On 13.11.10 10:46, lukshuntim-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
>> On 10/30/2010 12:13 PM, fiddlosopher wrote:
>>>> I've only just checked in. I'm pretty busy these days, and it's quite
>>>> hard to follow all this. Can someone update me on where this
>>>> conversation stands, and what sorts of details you need feedback on?
>>>>
>>>> Also, if I want to test this out with some real world work, what's the
>>>> easiest way to get it running?
>>>
>>> Bruce,
>>>
>>> Here's how to get it running.
>>>
>>> cabal update
>>> darcs get http://code.haskell.org/citeproc-hs
>>> cd citeproc-hs
>>> cabal install
>>> cd ..
>>> git clone -b citeproc git://github.com/jgm/pandoc.git pandoc-citeproc
>>> cd pandoc-citeproc
>>> cabal install
>>>
>>> You should now have a version of pandoc compiled against Andrea's
>>> latest citeproc library.  (Let me know if there are any difficulties
>>> here.)
>>>
>>> To test, do something like
>>>
>>> pandoc -s -S --biblio biblio.bib --csl style.csl test.markdown>
>>> test.html
>>
>> It doesn't work for me. :-( I got
>>
>> "$ pandoc -s -S --biblio biblio.bib --csl style.csl
>> test.markdown>test.html
>> pandoc: unrecognized option `--biblio'
>> pandoc: unrecognized option `--csl'
>> Try pandoc --help for more information."
> 
> The instructions are missing a -fciteproc, so the correct way is:
> 
> cd pandoc-citeproc
> cabal install -fciteproc
> 
> instead of
> 
> cd pandoc-citeproc
> cabal install

This is what I got:

$ cabal install -fciteproc
Resolving dependencies...
cabal: cannot configure pandoc-1.7. It requires citeproc-hs ==0.3.*
There is no available version of citeproc-hs that satisfies ==0.3.*

I'll try another suggestion by Andreas and report back.

Thanks very for the reply,
ST
--


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: Re: citeproc updates
       [not found]                             ` <4CDE96CE.8030000-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2010-11-13 13:59                               ` Andrea Rossato
  2010-11-13 14:06                               ` lukshuntim-Re5JQEeQqe8AvxtiuMwx3w
  1 sibling, 0 replies; 107+ messages in thread
From: Andrea Rossato @ 2010-11-13 13:59 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On Sat, Nov 13, 2010 at 09:46:54PM +0800, lukshuntim-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> >> On 10/30/2010 12:13 PM, fiddlosopher wrote:
> >>> darcs get http://code.haskell.org/citeproc-hs
> >>> cd citeproc-hs
> >>> cabal install
> 
> This is what I got:
> 
> $ cabal install -fciteproc
> Resolving dependencies...
> cabal: cannot configure pandoc-1.7. It requires citeproc-hs ==0.3.*
> There is no available version of citeproc-hs that satisfies ==0.3.*
> 
> I'll try another suggestion by Andreas and report back.

you need to manually install the darcs version of citeproc before
trying to compile pandoc.

Instructions above.

Andrea


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                             ` <4CDE96CE.8030000-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2010-11-13 13:59                               ` Andrea Rossato
@ 2010-11-13 14:06                               ` lukshuntim-Re5JQEeQqe8AvxtiuMwx3w
  1 sibling, 0 replies; 107+ messages in thread
From: lukshuntim-Re5JQEeQqe8AvxtiuMwx3w @ 2010-11-13 14:06 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw; +Cc: Nathan Gass

On 11/13/2010 09:46 PM, lukshuntim-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
> On 11/13/2010 07:44 PM, Nathan Gass wrote:
>> On 13.11.10 10:46, lukshuntim-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
>>> On 10/30/2010 12:13 PM, fiddlosopher wrote:
>>>>> I've only just checked in. I'm pretty busy these days, and it's quite
>>>>> hard to follow all this. Can someone update me on where this
>>>>> conversation stands, and what sorts of details you need feedback on?
>>>>>
>>>>> Also, if I want to test this out with some real world work, what's the
>>>>> easiest way to get it running?
>>>>
>>>> Bruce,
>>>>
>>>> Here's how to get it running.
>>>>
>>>> cabal update
>>>> darcs get http://code.haskell.org/citeproc-hs
>>>> cd citeproc-hs
>>>> cabal install
>>>> cd ..
>>>> git clone -b citeproc git://github.com/jgm/pandoc.git pandoc-citeproc
>>>> cd pandoc-citeproc
>>>> cabal install
>>>>
>>>> You should now have a version of pandoc compiled against Andrea's
>>>> latest citeproc library.  (Let me know if there are any difficulties
>>>> here.)
>>>>
>>>> To test, do something like
>>>>
>>>> pandoc -s -S --biblio biblio.bib --csl style.csl test.markdown>
>>>> test.html
>>>
>>> It doesn't work for me. :-( I got
>>>
>>> "$ pandoc -s -S --biblio biblio.bib --csl style.csl
>>> test.markdown>test.html
>>> pandoc: unrecognized option `--biblio'
>>> pandoc: unrecognized option `--csl'
>>> Try pandoc --help for more information."
>>
>> The instructions are missing a -fciteproc, so the correct way is:
>>
>> cd pandoc-citeproc
>> cabal install -fciteproc
>>
>> instead of
>>
>> cd pandoc-citeproc
>> cabal install
> 
> This is what I got:
> 
> $ cabal install -fciteproc
> Resolving dependencies...
> cabal: cannot configure pandoc-1.7. It requires citeproc-hs ==0.3.*
> There is no available version of citeproc-hs that satisfies ==0.3.*
> 
> I'll try another suggestion by Andreas and report back.
> 
> Thanks very for the reply,
> ST
> --

Sorry for the noise. Now I got it working.

I didn't recompile citeproc-hs after I saw "No remote changes to pull
in!" when "darcs pull" was run. After recompiling citeproc-hs, "cabal
install -fciteproc" went through and the test "pandoc -s -S --biblio
biblio.bib --csl style.csl test.markdown" succeeded.

Thanks very much to all for the help and these great tools.

Regards,
ST
--




^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: citeproc updates
       [not found]                 ` <1d77490f-4c76-4571-8f53-6902d1604ba5-PQeItPOgslmbvKjc6lLfglYGCWtFR9XvQQ4Iyu8u01E@public.gmane.org>
  2010-10-30 15:18                   ` Bruce
  2010-11-13  9:46                   ` lukshuntim-Re5JQEeQqe8AvxtiuMwx3w
@ 2010-11-13 16:16                   ` fiddlosopher
  2 siblings, 0 replies; 107+ messages in thread
From: fiddlosopher @ 2010-11-13 16:16 UTC (permalink / raw)
  To: pandoc-discuss

> Bruce,
>
> Here's how to get it running.
>
> cabal update
> darcs get http://code.haskell.org/citeproc-hs
> cd citeproc-hs
> cabal install

Sorry, I left out one import instruction:  use the citeproc flag when
you install pandoc:

cabal install -fciteproc



> cd ..
> git clone -bciteprocgit://github.com/jgm/pandoc.git pandoc-citeproc
> cd pandoc-citeproc
> cabalinstall


^ permalink raw reply	[flat|nested] 107+ messages in thread

end of thread, other threads:[~2010-11-13 16:16 UTC | newest]

Thread overview: 107+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-10-27 13:20 citeproc updates Andrea Rossato
     [not found] ` <20101027132010.GB6998-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-10-28  1:36   ` John MacFarlane
     [not found]     ` <20101028013616.GA13075-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-10-28 19:56       ` BP Jonsson
     [not found]         ` <4CC9D55B.8090004-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2010-10-28 20:33           ` John MacFarlane
     [not found]             ` <20101028203320.GA1581-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-10-30 12:51               ` Andrea Rossato
     [not found]                 ` <20101030125104.GC14156-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-10-31 22:06                   ` citation syntax (was: citeproc updates) John MacFarlane
     [not found]                     ` <20101031220602.GA21760-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-01  0:48                       ` Bruce
     [not found]                         ` <3b5e1631-d1c4-4d49-bc07-fb43544faf4e-kXDyx5gwD+DerssAVCGTfmB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
2010-11-01  1:34                           ` John MacFarlane
2010-11-01 10:01                       ` citation syntax Nathan Gass
     [not found]                         ` <4CCE900E.6060106-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-01 13:11                           ` Bruce
     [not found]                             ` <a97a886c-6ae0-428c-a962-b8e3258e798c-a16pFvPtgY3HdqrNY7FC6GB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
2010-11-01 14:48                               ` Nathan Gass
     [not found]                                 ` <4CCED340.9010304-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-01 15:28                                   ` John MacFarlane
2010-11-01 16:24                                   ` dsanson
     [not found]                                     ` <cacc908c-bd4b-435d-901a-7a66fb9cb4b5-f5wI9GJRwsKaNOhjBGSpuVYGCWtFR9XvQQ4Iyu8u01E@public.gmane.org>
2010-11-01 17:16                                       ` John MacFarlane
     [not found]                                         ` <20101101171648.GA10110-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-01 17:30                                           ` Bruce
2010-11-01 17:46                                       ` Tillmann Rendel
     [not found]                                         ` <4CCEFCF8.4070805-jNDFPZUTrfTbB13WlS47k8u21/r88PR+s0AfqQuZ5sE@public.gmane.org>
2010-11-01 18:33                                           ` Bruce
2010-11-01 19:02                                           ` John MacFarlane
     [not found]                                             ` <20101101190204.GA11857-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-01 20:31                                               ` Andrea Rossato
2010-11-01 23:46                                                 ` Bruce
2010-11-02  1:08                                                 ` John MacFarlane
2010-11-01 19:21                                   ` John MacFarlane
     [not found]                                     ` <20101101192157.GB11857-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-02  7:28                                       ` Nathan Gass
2010-11-01 15:24                           ` John MacFarlane
2010-10-30 16:06               ` supporting both parenthetical and footnote citations (was: citeproc updates) John MacFarlane
     [not found]                 ` <20101030160608.GA4075-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-10-30 20:18                   ` Bruce
     [not found]                     ` <d7beeaa7-d578-47db-bf92-aab5415d341e-QB3fWVJXTS+bvKjc6lLfglYGCWtFR9XvQQ4Iyu8u01E@public.gmane.org>
2010-10-30 23:46                       ` John MacFarlane
2010-10-31 15:58                   ` Bruce
2010-11-02 21:39                   ` Andrea Rossato
     [not found]                     ` <20101102213938.GE18764-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-02 23:58                       ` Andrea Rossato
2010-11-03  4:10                       ` John MacFarlane
     [not found]                         ` <20101103041014.GA19840-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-03 12:10                           ` Andrea Rossato
     [not found]                             ` <20101103121027.GG18764-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-03 15:03                               ` Bruce
     [not found]                                 ` <88f053e9-c948-4b5d-b92a-46caba340c09-sqEsDViiE88wq+9lhVFwFGB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
2010-11-04 11:09                                   ` Andrea Rossato
2010-11-03 15:36                               ` John MacFarlane
2010-10-28 20:26       ` Re: citeproc updates Andrea Rossato
     [not found]         ` <20101028202646.GB6256-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-10-29 20:27           ` John MacFarlane
     [not found]             ` <20101029202716.GA26844-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-10-30 10:57               ` Andrea Rossato
2010-10-30  2:54           ` Bruce
     [not found]             ` <507b277f-218d-494f-88ae-67c84c9e5cec-TDjcMQyrprHatUqWdiOD/mB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
2010-10-30  4:13               ` fiddlosopher
     [not found]                 ` <1d77490f-4c76-4571-8f53-6902d1604ba5-PQeItPOgslmbvKjc6lLfglYGCWtFR9XvQQ4Iyu8u01E@public.gmane.org>
2010-10-30 15:18                   ` Bruce
2010-11-13  9:46                   ` lukshuntim-Re5JQEeQqe8AvxtiuMwx3w
     [not found]                     ` <4CDE5E5A.2000707-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2010-11-13 11:44                       ` Nathan Gass
     [not found]                         ` <4CDE7A10.7060100-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-13 13:22                           ` Andrea Rossato
2010-11-13 13:46                           ` lukshuntim-Re5JQEeQqe8AvxtiuMwx3w
     [not found]                             ` <4CDE96CE.8030000-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2010-11-13 13:59                               ` Andrea Rossato
2010-11-13 14:06                               ` lukshuntim-Re5JQEeQqe8AvxtiuMwx3w
2010-11-13 11:48                       ` Andrea Rossato
2010-11-13 16:16                   ` fiddlosopher
2010-10-30 11:17           ` Nathan Gass
2010-11-03 23:39   ` Nathan Gass
     [not found]     ` <4CD1F297.2020609-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-04  7:36       ` John MacFarlane
     [not found]         ` <20101104073606.GA17293-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-04 11:48           ` Andrea Rossato
     [not found]             ` <20101104114801.GF10392-u31zCTIHpvLVI6Gt0zCidg@public.gmane.org>
2010-11-04 14:54               ` Andrea Rossato
     [not found]                 ` <20101104145457.GA6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-04 16:09                   ` John MacFarlane
     [not found]                     ` <20101104160942.GF25944-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-05 10:33                       ` Andrea Rossato
     [not found]                         ` <20101105103345.GD6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-05 16:41                           ` John MacFarlane
     [not found]                             ` <20101105164134.GB582-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-05 19:50                               ` Andrea Rossato
     [not found]                                 ` <20101105195055.GI6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-05 20:29                                   ` John MacFarlane
     [not found]                                     ` <20101105202938.GA3968-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-05 20:52                                       ` Andrea Rossato
     [not found]                                         ` <20101105205212.GJ6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-06  0:12                                           ` John MacFarlane
     [not found]                                             ` <20101106001214.GA6579-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-06 14:58                                               ` Nathan Gass
     [not found]                                                 ` <4CD56D33.2010004-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-06 18:56                                                   ` John MacFarlane
     [not found]                                                     ` <20101106185638.GB21524-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-06 19:24                                                       ` Andrea Rossato
     [not found]                                                         ` <20101106192448.GA24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-07  1:32                                                           ` John MacFarlane
     [not found]                                                             ` <20101107013243.GC25887-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-07  2:41                                                               ` Frank Bennett
2010-11-07  2:52                                                               ` Frank Bennett
2010-11-07 14:11                                                               ` Andrea Rossato
2010-11-07 15:29                                                               ` Nathan Gass
     [not found]                                                                 ` <4CD6C5E6.3040300-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-07 23:05                                                                   ` John MacFarlane
     [not found]                                                                     ` <20101107230549.GA7894-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-08 11:00                                                                       ` Andrea Rossato
     [not found]                                                                         ` <20101108110025.GI24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-08 11:12                                                                           ` Frank Bennett
2010-11-08 15:53                                                                           ` John MacFarlane
     [not found]                                                                             ` <20101108155306.GB15777-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-08 17:14                                                                               ` Andrea Rossato
     [not found]                                                                                 ` <20101108171429.GM24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-08 17:38                                                                                   ` John MacFarlane
2010-11-08 18:28                                                                       ` Nathan Gass
     [not found]                                                                         ` <4CD8415F.8030703-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-08 19:16                                                                           ` John MacFarlane
2010-11-07 23:12                                                                   ` John MacFarlane
     [not found]                                                                     ` <20101107231252.GA8043-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-08 11:09                                                                       ` Andrea Rossato
     [not found]                                                                         ` <20101108110950.GJ24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-08 16:03                                                                           ` John MacFarlane
2010-11-06 19:36                                                       ` John MacFarlane
     [not found]                                                         ` <20101106193634.GC22106-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-07 10:33                                                           ` Andrea Rossato
     [not found]                                                             ` <20101107103327.GC24988-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-07 23:18                                                               ` John MacFarlane
     [not found]                                                                 ` <20101107231810.GB8043-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-10  0:02                                                                   ` Nathan Gass
     [not found]                                                                     ` <4CD9E118.1020801-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-10  2:14                                                                       ` John MacFarlane
     [not found]                                                                         ` <20101110021409.GA22040-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-10  9:55                                                                           ` Nathan Gass
     [not found]                                                                             ` <4CDA6C2F.3080301-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-10 20:12                                                                               ` John MacFarlane
2010-11-06 15:04                                               ` Andrea Rossato
     [not found]                                                 ` <20101106150410.GL6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-06 19:35                                                   ` John MacFarlane
     [not found]                                                     ` <20101106193523.GB22106-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-06 20:15                                                       ` Andrea Rossato
2010-11-04 15:23               ` Nathan Gass
     [not found]                 ` <4CD2CFFE.8030503-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-04 23:03                   ` Nathan Gass
     [not found]                     ` <4CD33BD6.6050703-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-05  7:00                       ` John MacFarlane
2010-11-05 11:36                       ` Andrea Rossato
     [not found]                         ` <20101105113647.GE6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-05 15:12                           ` Nathan Gass
     [not found]                             ` <4CD41EC1.7090100-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-05 16:59                               ` Andrea Rossato
     [not found]                                 ` <20101105165947.GH6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-05 17:06                                   ` John MacFarlane
2010-11-05 17:36                                   ` Nathan Gass
     [not found]                                     ` <4CD44082.8000205-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-07 11:11                                       ` Andrea Rossato
2010-11-04 16:06               ` John MacFarlane
     [not found]                 ` <20101104160627.GE25944-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-05  0:41                   ` Frank Bennett
     [not found]                     ` <1df2d027-9220-45b1-8126-1b0965bd7836-s+NOhRKKP/7FX/zIJQasLWB/v6IoIuQBVpNB7YpNyf8@public.gmane.org>
2010-11-05 13:36                       ` Andrea Rossato
     [not found]                         ` <20101105133608.GG6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-05 17:01                           ` John MacFarlane
     [not found]                             ` <20101105170119.GC582-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
2010-11-05 21:32                               ` Andrea Rossato
     [not found]                                 ` <20101105213239.GK6834-j4W6CDmL7uNdAaE8spi6tJZpQXiuRcL9@public.gmane.org>
2010-11-06  0:28                                   ` John MacFarlane
2010-11-08  9:07                       ` Nathan Gass
     [not found]                         ` <4CD7BDD0.1040408-8UOIJiGH10pyDzI6CaY1VQ@public.gmane.org>
2010-11-08 18:02                           ` John MacFarlane

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).