From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: reading html, header ignored
Date: Tue, 27 Aug 2019 09:33:07 -0700
To: Mikhail Ramendik, pandoc-discuss
References: <8a9e115c-2983-47d7-a7df-82af5d73822c@googlegroups.com>

This is because pandoc uses <h1 class="title">
for metadata titles when rendering HTML. So, for better round-trip
consistency, we parse these as metadata when reading HTML.

Every once in a while someone runs into this issue, using HTML created
elsewhere that uses the same class. It would probably have been better,
in retrospect, to use a class like "pandoc-title". I'd be reluctant to
change that now, though, since it would affect lots of customized
templates.

One possibility would be to change pandoc's HTML reader so that
<h1 class="title">
is normally parsed as a regular level-1 heading, UNLESS is present in
the head section. That would allow nice round tripping from pandoc but
not get in the way of other HTML-producers. However, it may be that
pandoc's current behavior is actually better in many cases, even when
processing HTML produced by other sources. So it's quite possible that
making this change would lead to a surge of complaints. (Comments
welcome on this.)

Another, probably better approach would be to parse
<h1 class="title">
as a metadata title when pandoc is run with --standalone, but not when
pandoc is run in fragment mode. (Currently, in fragment mode, the h1
just disappears, since no metadata is created.) Feel free to add an
issue to the tracker suggesting this (and comments are welcome from
anyone).

A workaround for you would be to preprocess the input, or run in
--standalone mode and use a lua filter that extracts the metadata title
and inserts a level 1 header with its content at the beginning of the
document.

Mikhail Ramendik writes:

> Hello,
>
> I am converting an HTML file to ODT. (The problem is with the reader,
> not the writer, as it also reproduces when converting to MediaWiki.)
>
> My HTML generator uses the <h1 class="title">
> markup for chapter titles.
> And these titles end up entirely missing from the pandoc output.
>
> If I do a replace in the file so the tag looks like
>
> instead and then convert, the titles are in place.
>
> How can I make pandoc process chapter titles that are marked up with
> <h1 class="title"> and include them in the output? Or do I need to
> create a bug/issue somewhere?
>
> $ pandoc --version
> pandoc 2.1.2
> Compiled with pandoc-types 1.17.3.1, texmath 0.10.1.2, skylighting 0.6
>
> (Installed from Fedora 29 repository).
>
> Yours, Mikhail Ramendik
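
For anyone who wants to try the "preprocess the input" workaround mentioned in the reply, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function name strip_title_class is made up for this example, it uses a regular expression rather than a real HTML parser (so unusual markup may slip through), and it assumes that an h1 without the "title" class is read by pandoc as an ordinary level-1 heading, which matches what the original poster observed after renaming the tag by hand.

```python
import re

# Opening <h1 ...> tag: group 1 is the tag name (case preserved),
# group 2 is the raw attribute text.
H1_OPEN = re.compile(r"<(h1)\b([^>]*)>", re.IGNORECASE)

# A class attribute, including the whitespace that precedes it.
CLASS_ATTR = re.compile(r"""\s+class\s*=\s*(["'])(.*?)\1""",
                        re.IGNORECASE | re.DOTALL)


def strip_title_class(html: str) -> str:
    """Remove the "title" class from every <h1> opening tag.

    Hypothetical helper for illustration; other classes and
    attributes on the tag are kept.
    """
    def fix_class(match: re.Match) -> str:
        kept = [c for c in match.group(2).split() if c != "title"]
        if not kept:
            return ""  # drop the now-empty class attribute
        return ' class="' + " ".join(kept) + '"'

    def fix_tag(match: re.Match) -> str:
        attrs = CLASS_ATTR.sub(fix_class, match.group(2))
        return "<" + match.group(1) + attrs + ">"

    return H1_OPEN.sub(fix_tag, html)
```

The cleaned HTML can then be written to a file and converted as usual, for example with pandoc -f html -t odt -o out.odt cleaned.html. Headings at other levels, and h1 tags without the "title" class, pass through unchanged.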