Hi,
I found some strange behaviour when converting some HTML files to
asciidoc.
Versions used:
asciidoc 9.1.0
pandoc 2.16.2
Example input:
<!DOCTYPE HTML>
<html>
<head>
<title>Xx</title>
</head>
<body>
<a href="x.htm"><i>Xx</i></a><i>,</i>
</body>
</html>
With "pandoc --wrap=none -f html -t asciidoc" I get this asciidoc
output:
link:x.htm[_Xx_]__,__
The double underscores look "suspicious", and running "asciidoc -b
docbook" followed by xmllint gives:
z.xml:10: parser error : Unescaped '<' not allowed in attributes values
<simpara>link:x.htm<emphasis><phrase role="<emphasis>Xx</emphasis>">,</phrase></
The corresponding DocBook line, as generated by asciidoc:
<simpara>link:x.htm<emphasis><phrase role="<emphasis>Xx</emphasis>">,</phrase></emphasis></simpara>
Is this a known bug?
If I add a space before the comma...
<a href="x.htm"><i>Xx</i></a><i>
,</i>
then I get
link:x.htm[_Xx_] _,_
which causes no problem. Also, adding a space before the emphasis...
<a href="x.htm"><i>Xx</i></a>
<i>,</i>
creates an asciidoc file that can be rendered:
link:x.htm[_Xx_] _,_
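To check which lines of a larger converted file are affected, a small script like the following might help. It is only a sketch: it assumes the problem is triggered whenever a closing square bracket is directly followed by a double-underscore emphasis, which is what the output above suggests but which I have not confirmed. The function name find_suspicious_lines is made up for this example.

```python
import re

# Assumption: the failing pattern is a closing bracket immediately followed
# by double-underscore emphasis, e.g. "]__,__" as in the output above.
SUSPICIOUS = re.compile(r"\]__[^_]+?__")


def find_suspicious_lines(asciidoc_text):
    """Return (line_number, line) pairs that contain the suspicious pattern."""
    hits = []
    for number, line in enumerate(asciidoc_text.splitlines(), start=1):
        if SUSPICIOUS.search(line):
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    # The failing and the working output from the examples above.
    sample = "link:x.htm[_Xx_]__,__\nlink:x.htm[_Xx_] _,_\n"
    for number, line in find_suspicious_lines(sample):
        print(f"{number}: {line}")
```

This only reports candidate lines; it does not change the file, since I do not know what the correct escaping should be.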
Does anyone know about this? Does a fix already exist?
cheers,
Frank