From mboxrd@z Thu Jan 1 00:00:00 1970
From: Frank Bergmann
Newsgroups: gmane.text.pandoc
Subject: "double emphasis" bug when converting to asciidoc?
Date: Tue, 4 Jan 2022 19:02:24 +0100
Message-ID: <3f7b920b-c982-5be5-fa04-9025e008e518@tuxad.com>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="------------rLRBiVok6P4Sqhc0RQqLVxie"

--------------rLRBiVok6P4Sqhc0RQqLVxie
Content-Type: text/plain; charset="UTF-8"

Hi,

I found a strange behaviour when converting some HTML files to asciidoc.

Versions used:
asciidoc 9.1.0
pandoc 2.16.2

Example input:

<!DOCTYPE HTML>
<html>
<head>
<title>Xx</title>
</head>
<body>
<a href="x.htm"><i>Xx</i></a><i>,</i>
</body>
</html>


With "pandoc --wrap=none -f html -t asciidoc" I get this asciidoc output:

link:x.htm[_Xx_]__,__

The double underscores look "suspicious" and with "asciidoc -b docbook;xmllint" I get:

z.xml:10: parser error : Unescaped '<' not allowed in attributes values
<simpara>link:x.htm<emphasis><phrase role="<emphasis>Xx</emphasis>">,</phrase></


The related docbook line which was created by asciidoc:

<simpara>link:x.htm<emphasis><phrase role="<emphasis>Xx</emphasis>">,</phrase></emphasis></simpara>

Is this a known bug?


If I add a space before the comma...

<a href="x.htm"><i>Xx</i></a><i> ,</i>

then I get

link:x.htm[_Xx_] _,_

which causes no issue. Also adding a space before the emphasis...

<a href="x.htm"><i>Xx</i></a> <i>,</i>


creates an asciidoc file which can be rendered:

link:x.htm[_Xx_] _,_
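
To compare the three variants side by side, a small Python script along these lines might help. It is only a sketch: it assumes pandoc is on the PATH, feeds each HTML fragment on standard input with the same options as above, and the names CASES, PANDOC_ARGS and convert are made up for this example.

```python
import shutil
import subprocess

# The three HTML fragments from this report.
CASES = {
    "adjacent": '<a href="x.htm"><i>Xx</i></a><i>,</i>',
    "space inside emphasis": '<a href="x.htm"><i>Xx</i></a><i> ,</i>',
    "space before emphasis": '<a href="x.htm"><i>Xx</i></a> <i>,</i>',
}

# Same options as the command line quoted in the report.
PANDOC_ARGS = ["pandoc", "--wrap=none", "-f", "html", "-t", "asciidoc"]


def convert(html):
    """Pipe one HTML fragment through pandoc and return the AsciiDoc text."""
    result = subprocess.run(
        PANDOC_ARGS, input=html, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    if shutil.which("pandoc") is None:
        print("pandoc not found on PATH; nothing to run")
    else:
        for name, html in CASES.items():
            print(f"{name}: {convert(html)}")
```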



Does anyone know about this? Is there already a fix?
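
As a stopgap until that is answered, the observation above that a space between the link and the emphasis avoids the problem could be applied to the HTML before conversion. The Python sketch below does this with a regular expression; separate_link_and_emphasis is a hypothetical helper, it only handles <i> and <em> directly after </a>, and it changes the rendered text by adding a visible space.

```python
import re

# Matches a closing </a> directly followed by an opening <i> or <em> tag.
# The \b keeps tags such as <img> from matching.
_ADJACENT = re.compile(r"(</a>)(<(?:i|em)\b[^>]*>)", re.IGNORECASE)


def separate_link_and_emphasis(html):
    """Insert a single space between </a> and an immediately following <i>/<em>."""
    return _ADJACENT.sub(r"\1 \2", html)


if __name__ == "__main__":
    print(separate_link_and_emphasis('<a href="x.htm"><i>Xx</i></a><i>,</i>'))
```

A regular expression is a blunt tool for HTML; for real documents a proper HTML parser or a pandoc filter would probably be the safer place for this.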


cheers,
Frank


--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/3f7b920b-c982-5be5-fa04-9025e008e518%40tuxad.com.
--------------rLRBiVok6P4Sqhc0RQqLVxie--