From: Mikhail Ramendik
Newsgroups: gmane.text.pandoc
Subject: Re: reading html, header ignored
Date: Tue, 27 Aug 2019 15:54:03 -0700 (PDT)
Message-ID: <684df614-496b-455f-aa2d-e602b19c96b0@googlegroups.com>
References: <8a9e115c-2983-47d7-a7df-82af5d73822c@googlegroups.com>
To: pandoc-discuss

Hello,

Thank you very much for your response!

On Tuesday, August 27, 2019 at 5:33:24 PM UTC+1, John MacFarlane wrote:
> One possibility would be to change pandoc's HTML reader so that
> <h1 class="title"> is normally parsed as a regular level-1
> heading, UNLESS <meta generator="pandoc"> is present in the
> head section. That would allow nice round tripping from pandoc
> but not get in the way of other HTML-producers.
>
> However, it may be that pandoc's current behavior is actually
> better in many cases, even when processing HTML produced by
> other sources. So it's quite possible that making this change
> would lead to a surge of complaints. (Comments welcome on this.)

I would suggest that this behaviour become the default, BUT you add a command line option to invoke the present behaviour. So:

- with <meta generator="pandoc">, process <h1 class="title"> as metadata
- with --title-metadata (or similar), process <h1 class="title"> as metadata
- otherwise process <h1 class="title"> as a header

> Another, probably better approach would be to parse
> <h1 class="title"> as a metadata title when pandoc is run
> with --standalone, but not when pandoc is run in fragment mode.

But I want to get a complete ODT document as output. Don't I need to use --standalone? If I do then this fix would do nothing for me.

> A workaround for you would be to preprocess the input, or
> run in --standalone mode and use a lua filter that extracts
> the metadata title and inserts a level 1 header with its content
> at the beginning of the document.

Preprocessing the input with a mere search and replace, changing class="title" to class="meow", is a simple approach that works. But it is a mandatory extra step.

Yours, Mikhail Ramendik
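
The search-and-replace preprocessing mentioned above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `neutralize_title_class`, the regular expression, and the sample input are hypothetical and not part of pandoc, and the sketch only handles an <h1> whose class attribute is exactly "title" (not a multi-class value such as class="title main").

```python
import re

# Matches an opening <h1 ...> tag whose class attribute is exactly "title",
# written with either single or double quotes. Other elements are left alone.
H1_TITLE = re.compile(r'(<h1\b[^>]*\bclass=)(["\'])title\2', re.IGNORECASE)


def neutralize_title_class(html: str, replacement: str = "meow") -> str:
    """Rename class="title" on <h1> tags to another class name.

    The replacement name is arbitrary (hypothetical default: "meow");
    any value other than "title" serves the same purpose.
    """
    def swap(match: re.Match) -> str:
        prefix, quote = match.group(1), match.group(2)
        return f"{prefix}{quote}{replacement}{quote}"

    return H1_TITLE.sub(swap, html)


sample = '<h1 class="title">My document</h1>'
print(neutralize_title_class(sample))  # <h1 class="meow">My document</h1>
```

The rewritten HTML would then be handed to pandoc as usual, so this remains the same mandatory extra step, just scripted.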