From: BPJ <bpj-J3H7GcXPSITLoDKTGw+V6w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: Way to have markdown ==highlighting== show up as highlighting in .docx or .odt files?
Date: Thu, 23 Jun 2022 21:06:45 +0200
Message-ID: <CADAJKhA3F-VC--BMe2mpERZr=LmXZFNE61EwvHmfk0dwYp_ALw@mail.gmail.com>
In-Reply-To: <3316a007-a142-4d3d-a2f8-40befafb4249n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
It would be possible, but rather fragile and finicky, because you would
have to
1. traverse lists of inline elements,
2. locate string elements which contain "==",
3. split that string into the bits before and after "==",
4. insert the right raw markup for the output format in place of "==",
5. collect elements up to the next string element which contains "==",
6. redo #3 and #4 with that string,
7. throw an error if #5 fails!
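As a rough illustration of the steps above, here is a minimal Python sketch. It is a model of the algorithm, not a pandoc filter: inline elements are represented as plain `(kind, text)` tuples, and the names `Token`, `MARKUP` and `highlight_inlines` as well as the raw markup strings are assumptions made up for this example. A real Lua filter would work on pandoc's own `Str` and `RawInline` elements instead.

```python
from typing import List, Tuple

# Hypothetical stand-in for a pandoc inline element: (kind, text)
Token = Tuple[str, str]

# Assumed raw markup per output format, for illustration only
MARKUP = {
    "html": ("<mark>", "</mark>"),
    "latex": ("\\colorbox{yellow}{", "}"),
}


def highlight_inlines(inlines: List[Token], fmt: str) -> List[Token]:
    """Replace '==' delimiters found in Str tokens with raw markup."""
    start, stop = MARKUP[fmt]
    out: List[Token] = []
    is_open = False
    for kind, text in inlines:                    # step 1: traverse the list
        if kind != "Str" or "==" not in text:     # step 2: look for "=="
            out.append((kind, text))              # step 5: pass others through
            continue
        pieces = text.split("==")                 # step 3: split around "=="
        for i, piece in enumerate(pieces):
            if i > 0:                             # step 4: one delimiter per gap
                out.append(("Raw", stop if is_open else start))
                is_open = not is_open
            if piece:
                out.append(("Str", piece))
    if is_open:                                   # step 7: no closing "=="
        raise ValueError("unclosed '==' highlight")
    return out


if __name__ == "__main__":
    tokens = [("Str", "a"), ("Space", " "), ("Str", "==very"),
              ("Space", " "), ("Str", "important==.")]
    for token in highlight_inlines(tokens, "html"):
        print(token)
```

The sketch also shows why the approach is fragile: any string element containing "==", for example the operator in `a == b`, is treated as a delimiter.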
You are probably better off replacing the `==...==` in your existing files
using the attached Perl script. It is a modification of a script which I
have used to convert `_..._` and the like to spans. It uses regexes, but is
smart enough to leave alone block and inline code and math, as well as "==" in
contexts where it probably isn't a delimiter. Make sure to check out
the -h and -m options for documentation.
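The general idea of such a regex pass (match code, math and escapes first so they are passed through untouched, and only then look for `==...==`) can be sketched in Python as below. This is not the attached script and covers fewer cases: the name `eq_to_span`, the pattern and its simplifications (display math may span blank lines here, and "word character" is approximated as letters and digits) are assumptions for illustration.

```python
import re

# One alternation: anything matched by "skip" is copied through unchanged;
# only the "hl" branch is rewritten.
PATTERN = re.compile(
    r"""
      (?P<skip>
          \\.                                  # backslash escape, e.g. \=
        | (?P<tilde>~{3,}) .+? (?P=tilde)      # tilde-fenced code block
        | (?P<ticks>`+) .+? (?P=ticks)         # backtick code, inline or block
        | \$\$ .+? \$\$                        # display math
        | \$ (?!\s) .+? (?<!\s) \$ (?!\d)      # inline math
      )
    | (?<![^\W_]) == (?!\s)                    # opener: no letter/digit before, no space after
      (?P<hl> .+? )
      (?<!\s) == (?![^\W_])                    # closer: no space before, no letter/digit after
    """,
    re.VERBOSE | re.DOTALL,
)


def eq_to_span(text: str, attrs: str = ".hl") -> str:
    """Rewrite ==...== as a Pandoc span, leaving code and math alone."""
    def repl(match: "re.Match[str]") -> str:
        if match.group("hl") is not None:
            return "[" + match.group("hl") + "]{" + attrs + "}"
        return match.group(0)  # skipped chunk: return it unchanged
    return PATTERN.sub(repl, text)


if __name__ == "__main__":
    print(eq_to_span("a ==b== and `x==y==z` and $p==q$"))
```

Putting the skip patterns before the highlight pattern in a single alternation means a code or math chunk is consumed as a whole before any `==` inside it can be seen.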
On Thu, 23 June 2022 at 13:15, Emiliano <gattulli.emiliano-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> BPJ, is it possible to create a Lua filter that does the same thing but
> converts Obsidian's '== ==' syntax into highlighted text? I have tons of
> notes written in Obsidian syntax and it would be an enormous task to modify
> all of them to use the 'new' syntax. By the way, your Lua filter works
> perfectly!
>
> On Wednesday, 22 June 2022 at 19:45:07 UTC+2, BPJ wrote:
>
>> According to the principle that it's better to find out what you can do
>> with the tools you have, you can use a span with a class, like `[text]{.hl}`,
>> and use a simple filter to convert that to Obsidian's syntax when
>> processing with Obsidian, by choosing `markdown` as output format, or
>> insert the necessary LaTeX markup when producing PDF (or arrange for the
>> necessary CSS to be loaded if producing PDF via HTML).
>>
>> ``````lua
>> -- Raw `==` delimiter used when writing Markdown for Obsidian
>> local eq_hl = pandoc.RawInline('markdown', '==')
>>
>> -- Start/stop markup per output format
>> local highlight = {
>>   markdown = { start = eq_hl, stop = eq_hl },
>>   latex = {
>>     start = pandoc.RawInline('latex', '\\colorbox[named]{yellow}{'),
>>     stop = pandoc.RawInline('latex', '}'),
>>   },
>> }
>>
>> local hl = highlight[FORMAT]
>>
>> function Span (s)
>>   if s.classes:includes('hl') then
>>     if hl then
>>       -- Replace the span with its content wrapped in the raw markup
>>       local rv = s.content
>>       rv:insert(1, hl.start)
>>       rv:insert(hl.stop)
>>       return rv
>>     end
>>   end
>>   return nil
>> end
>> ``````
>>
>> I'm not sure that the default LaTeX template always loads the xcolor
>> package. You may need a modified template.
>>
>> I can imagine you lose some in-editor preview, but you get reasonable
>> output.
>>
>> HTH,
>>
>> /bpj
>>
>> On Wed, 22 June 2022 at 16:11, Emiliano <gattulli...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>
>>> Well, if you export to PDF through Obsidian the highlighted text is
>>> rendered correctly, but not if you use Pandoc. I do not export to PDF
>>> through Obsidian because then I would be bound to the style of the active
>>> theme: I would get a PDF file with the black background (I use
>>> Dark Mode), font size, spacing, margins, etc. of Obsidian's active theme.
>>>
>>> On Tuesday, 21 June 2022 at 18:44:42 UTC+2,
>>> paulschi...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
>>>
>>>> Good question! Thanks for reminding me of this. But exporting to PDF in
>>>> Obsidian with highlights should work automatically, no?
>>>>
>>>> On Tuesday, June 21, 2022 at 3:21:03 p.m. UTC+2 Emiliano wrote:
>>>>
>>>>> Any news about this feature for Pandoc? I use the highlight
>>>>> syntax ('== ==') a lot in Obsidian and it would be great if I could render my
>>>>> highlighted text in PDF (also in DOCX and ODT).
>>>>>
>>>>> On Sunday, 2 January 2022 at 17:52:44 UTC+1, Alx Nbl wrote:
>>>>>
>>>>>> My use case is different from paulschi's: I am trying to
>>>>>> convert docx into markdown, generating '== ==' syntax when there is
>>>>>> highlighted text in the docx file.
>>>>>>
>>>>>> On Sunday, January 2, 2022 at 3:09:42 PM UTC+1 Alx Nbl wrote:
>>>>>>
>>>>>>> Hi all. The '== ==' syntax is also used by the Joplin app. I would also
>>>>>>> be very interested in such a feature.
>>>>>>>
>>>>>>> On Thursday, December 9, 2021 at 6:29:51 PM UTC+1 John MacFarlane
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On CriticMarkup, see
>>>>>>>>
>>>>>>>> https://github.com/jgm/pandoc/issues/2873
>>>>>>>> https://github.com/jgm/pandoc/issues/5430
>>>>>>>>
>>>>>>>>
>>>>>>>> Joseph Reagle <josep...-T1oY19WcHSwdnm+yROfE0A@public.gmane.org> writes:
>>>>>>>>
>>>>>>>> > BTW: If CommonMark or pandoc were to support highlight, I would
>>>>>>>> > then wonder why not support all of CriticMarkup, which supports highlight
>>>>>>>> > as `{== ==}` or `{>> <<}`. (It's a shame that we have two different
>>>>>>>> > syntaxes emerging for highlight.)
>>>>>>>> >
>>>>>>>> > On 21-12-09 11:10, John MacFarlane wrote:
>>>>>>>> >>
>>>>>>>> >> If this is a syntax that is becoming common, we could consider
>>>>>>>> >> adding a markdown extension for it. You could open an issue on
>>>>>>>> >> our issue tracker.
>>>>>>>> >>
>>>>>>>> >> Joseph Reagle <josep...-T1oY19WcHSwdnm+yROfE0A@public.gmane.org> writes:
>>>>>>>> >>
>>>>>>>> >>> This is the first time I've encountered [this syntax][1] and it
>>>>>>>> >>> is not natively supported by pandoc. Or am I wrong and you are saying
>>>>>>>> >>> pandoc handles it when using the latex/PDF writer? (Or, are you saying
>>>>>>>> >>> Obsidian can export to PDF, but not Word?)
>>>>>>>> >>>
>>>>>>>> >>> I see there's been some discussion on the [CommonMark
>>>>>>>> >>> forum][2], but it doesn't look like you'd find an immediate solution.
>>>>>>>> >>>
>>>>>>>> >>> Using a filter or hacking something that converts `==foo==` to
>>>>>>>> >>> [foo]{.highlight} that is properly rendered in Word might be options.
>>>>>>>> >>>
>>>>>>>> >>> [1]: https://www.markdownguide.org/extended-syntax/#highlight
>>>>>>>> >>> [2]: https://talk.commonmark.org/t/highlighting-text-with-the-mark-element/840
>>>>>>>> >>>
>>>>>>>> >>> On 21-12-09 08:29, Paul wrote:
>>>>>>>> >>>> I use a lot of highlighting in my markdown editor Obsidian,
>>>>>>>> >>>> but I was wondering if there's a way to have that highlighting show up in
>>>>>>>> >>>> the Word or Libreoffice Writer files?
>>>>>>>> >>>>
>>>>>>>> >>>> Bold and italics work fine, as far as I can tell, and when
>>>>>>>> >>>> converting to a pdf the highlighting transfers great. I gather, however,
>>>>>>>> >>>> that the ==highlighting== is not standard in all markdown so is that the
>>>>>>>> >>>> issue?
>>>>>>>> >>>
>>>>>>>> >>> --
>>>>>>>> >>> You received this message because you are subscribed to the
>>>>>>>> Google Groups "pandoc-discuss" group.
>>>>>>>> >>> To unsubscribe from this group and stop receiving emails from
>>>>>>>> it, send an email to pandoc-discus...-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>>>>>>>> >>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/pandoc-discuss/9995ee8a-295e-1836-5645-9bb5ff76445d%40reagle.org.
>>>>>>>>
>>>>>>>> >>
>>>>>>>> >
>>>>>>>>
>>>>>>>>
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CADAJKhA3F-VC--BMe2mpERZr%3DLmXZFNE61EwvHmfk0dwYp_ALw%40mail.gmail.com.
[-- Attachment #2: highlight-eq2span.pl --]
[-- Type: text/x-perl, Size: 6371 bytes --]
#!/usr/bin/env perl
use 5.010001;
use utf8;
# use utf8::all;
use strict;
use warnings;
use warnings FATAL => 'utf8';
use autodie;
use open qw[ :utf8 :std ];
use Getopt::Long qw[GetOptions
    :config bundling no_auto_abbrev no_ignore_case];
use Pod::Usage qw[pod2usage];
use Text::Balanced qw[extract_multiple];
# Default settings; every key containing an underscore is an on/off switch
my %opt = (
    attributes                => '.hl',
    check_word_chars          => 1,
    check_whitespace          => 1,
    backslash_escapes         => 1,
    backticks_code            => 1,
    tilde_code_blocks         => 1,
    tex_math_dollars          => 1,
    tex_math_double_backslash => 0,
    tex_math_single_backslash => 0,
);
my @opts = grep { /_/ } keys %opt;
# Enable all switches
sub all {
    $opt{$_} = 1 for @opts;
}
# Disable all switches
sub none {
    $opt{$_} = 0 for @opts;
}
# Handler for the no_* options: clear the corresponding switch
sub neg_opt {
    my($name) = @_;
    $name =~ s/^no_//;
    $opt{$name} = 0;
}
GetOptions(
    \%opt,
    'attributes|a=s',
    'check_whitespace|check-whitespace|s',
    'no_check_whitespace|no-check-whitespace|S' => \&neg_opt,
    'check_word_chars|check-word-chars|w',
    'no_check_word_chars|no-check-word-chars|W' => \&neg_opt,
    'backslash_escapes|backslash-escapes|b',
    'no_backslash_escapes|no-backslash-escapes|B' => \&neg_opt,
    'backticks_code|backticks-code|c',
    'no_backticks_code|no-backticks-code|C' => \&neg_opt,
    'tilde_code_blocks|tilde-code-blocks|t',
    'no_tilde_code_blocks|no-tilde-code-blocks|T' => \&neg_opt,
    'tex_math_dollars|tex-math-dollars|d',
    'no_tex_math_dollars|no-tex-math-dollars|D' => \&neg_opt,
    'tex_math_double_backslash|tex-math-double-backslash|db',
    'no_tex_math_double_backslash|no-tex-math-double-backslash|DB' => \&neg_opt,
    'tex_math_single_backslash|tex-math-single-backslash|sb',
    'no_tex_math_single_backslash|no-tex-math-single-backslash|SB' => \&neg_opt,
    'none|n' => \&none,
    'all|N|A' => \&all,
    'help|h' => sub { pod2usage(1) },
    'man|m' => sub { pod2usage( -verbose => 2) },
);
my $span_start = '[';
my $span_stop = "]{$opt{attributes}}";
# Patterns for chunks of text to pass through unchanged
my @extractors;
if ( $opt{tex_math_double_backslash} ) {
    push @extractors, (
        qr{ \\\\ \( .+? \\\\ \) }msx,
        qr{ \\\\ \[ .+? \\\\ \] }msx,
    );
}
if ( $opt{tex_math_single_backslash} ) {
    push @extractors, (
        qr{ \\ \( .+? \\ \) }msx,
        qr{ \\ \[ .+? \\ \] }msx,
    );
}
push @extractors, qr{ \\. }msx if $opt{backslash_escapes};
push @extractors, qr[ ( ( \~{3,} ) .+? \g{-1} ) ]msx if $opt{tilde_code_blocks};
push @extractors, qr[ ( ( \`+ ) .+? \g{-1} ) ]msx if $opt{backticks_code};
if ( $opt{tex_math_dollars} ) {
    push @extractors, (
        qr{ \$\$ (?: [^\n] | (?<! \n ) \n (?! \n ) )+? \$\$ }msx,
        qr{ \$ (?! \s ) .+? (?<! \s ) \$ (?! \d ) }msx,
    );
}
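{
    # The lookarounds marked #w and #s start out as /x comments;
    # deleting a marker below turns the corresponding check on.
    my $highlight = qr{
        #w (?<! [\pL\pN\p{Mn}] )
        \=\=
        #s (?! \s )
        ( .+? )
        #s (?<! \s )
        \=\=
        #w (?! [\pL\pN\p{Mn}] )
    }msx;
    if ( $opt{check_whitespace} ) {
        $highlight =~ s/#s//g;
    }
    if ( $opt{check_word_chars} ) {
        $highlight =~ s/#w//g;
    }
    # The hash wrapper makes extract_multiple return matches as references
    push @extractors, +{ highlight => qr/$highlight/msx };
}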
# Slurp stdin
my $text = do { local $/; <>; };
# Process the text
my @chunks = extract_multiple $text, \@extractors;
for my $chunk ( @chunks ) {
    # Only highlight matches come back as references; wrap them in a span
    if ( ref $chunk ) {
        $chunk = $span_start . $$chunk . $span_stop;
    }
}
print join "", @chunks;
__END__
=encoding UTF-8

=head1 NAME

highlight-eq2span.pl -- Replace Obsidian highlight runs with Pandoc spans

=head1 VERSION

This documentation describes version 0.001 of highlight-eq2span.pl

=head1 SYNOPSIS

    perl highlight-eq2span.pl [OPTIONS] <input.md >output.md

=head1 DESCRIPTION

highlight-eq2span.pl replaces C<==HIGHLIGHTED==> as understood
by Obsidian with Pandoc spans like C<[HIGHLIGHTED]{.hl}>.

This script is a regex-based text filter, with far simpler parsing
capabilities than Pandoc.
However, by default it tries to leave alone B<==> sequences which are
unlikely to be highlighting markup. There are some command line
options to control this.

=head1 OPTIONS

=over

=item -a, --attributes STR

Use STR as attributes for Pandoc spans.

Default value: C<.hl>

=item -s, --check-whitespace

Assume that opening C<==> delimiters are not followed by whitespace,
and that closing C<==> delimiters are not preceded by whitespace.

Default value: true

=item -S, --no-check-whitespace

Set the -s option just above to false.

=item -w, --check-word-chars

Assume that opening C<==> delimiters are not preceded by word-chars,
and that closing C<==> delimiters are not followed by word-chars.

Default value: true

=item -W, --no-check-word-chars

Set the -w option just above to false.

=item -b, --backslash-escapes

Skip characters preceded by a backslash.
This notably includes C<\=>.

Default value: true

Note that the B<--db> and B<--sb> options below affect this option!

=item -B, --no-backslash-escapes

Set the -b option just above to false.

=item -c, --backticks-code

Skip chunks of text which look like block or inline
backticks-delimited code.

Default value: true

=item -C, --no-backticks-code

Set the -c option just above to false.

=item -t, --tilde-code-blocks

Skip chunks of text which look like tilde-delimited code blocks.

Default value: true

=item -T, --no-tilde-code-blocks

Set the -t option just above to false.

=item -d, --tex-math-dollars

Skip chunks of text which look like block or inline $ delimited math.

Default value: true

=item -D, --no-tex-math-dollars

Set the -d option just above to false.

=item --db, --tex-math-double-backslash

Skip chunks of text which look like C<\\(...\\)> or C<\\[...\\]>
delimited math.

Default value: false

=item --DB, --no-tex-math-double-backslash

Set the --db option just above to false.

=item --sb, --tex-math-single-backslash

Skip chunks of text which look like C<\(...\)> or C<\[...\]>
delimited math.

Default value: false

=item --SB, --no-tex-math-single-backslash

Set the --sb option just above to false.

=item -n, --none

Disable all switches.

=item -A, -N, --all

Enable all switches.

=item -h, --help

Print usage help and exit.

=item -m, --man

Print full documentation and exit.

=back

=head1 LICENSE

This software is copyright (c) 2022 by Benct Philip Jonsson.

This is free software; you can redistribute it and/or modify it under
the same terms as the Perl 5 programming language system itself.

http://dev.perl.org/licenses/

=head1 AUTHOR

Benct Philip Jonsson E<lt>bpjonsson@gmail.comE<gt>

=cut
# Vim: set ft=pod et ts=4 sts=4 sw=4 tw=72 cc=72: