Gnus development mailing list
From: David Hansen <david.hansen@gmx.net>
Cc: ding@gnus.org
Subject: Broken XML, xml.el and nnrss.el
Date: Thu, 21 Apr 2005 17:13:23 +0200
Message-ID: <87hdi0rwmk.fsf@robotron.ath.cx>

Hello,

I tried to read the RSS feed at

http://xxx.pogogeil.de/podcast.php

using Gnus and nnrss.  A copy of the feed, used to produce the
backtrace below, is at

http://www.physik.fu-berlin.de/~dhansen/stuff/podcast.rss

It is obviously broken XML, as the umlauts are encoded using plain
HTML entities (e.g. &uuml; instead of &amp;uuml;).  Still, I find
the behavior of the XML parser a bit weird:

Debugger entered--Lisp error: (wrong-type-argument integerp nil)
  mapconcat(identity ((nil #("chte Andre Rieu die MIch m" 0 21 ... 21 26 ...)) #("bel gerade r" 0 12 (fontified nil)) nil) "")
  (concat (mapconcat (quote identity) (nreverse children) "") (substring string point))
  (cond ((stringp children) (concat children ...)) ((stringp ...) (concat ... ...)) ((null children) string) (t (concat ... ...)))
  (let ((point 0) children end-point) (while (string-match "&\\([^;]*\\);" string point) (setq end-point ...) (let* ... ... ...)) (cond (... ...) (... ...) (... string) (t ...)))
  xml-substitute-special(#("Ich m&ouml;chte Andre Rieu die M&ouml;bel gerade r&uuml;cken" 0 60 (fontified nil)))
  (let* ((pos ...) (string ...)) (setq pos 0) (while (string-match "\r\n?" string pos) (setq string ...) (setq pos ...)) (xml-substitute-special string))
  xml-parse-string()
  (let ((expansion ...)) (setq children (if ... ... ...)))
  (cond ((looking-at "</") (error "XML: (Not Well-Formed) Invalid end tag (expecting %s) at pos %d" node-name ...)) ((= ... 60) (let ... ...)) (t (let ... ...)))
  (while (not (looking-at end)) (cond (... ...) (... ...) (t ...)))
  (let ((end ...)) (while (not ...) (cond ... ... ...)) (goto-char (match-end 0)) (nreverse children))
  (progn (forward-char 1) (let (...) (while ... ...) (goto-char ...) (nreverse children)))
  (if (eq (char-after) 62) (progn (forward-char 1) (let ... ... ... ...)) (error "XML: (Well-Formed) Couldn't parse tag: %s" (buffer-substring ... ...)))
  (if (looking-at "/>") (progn (forward-char 2) (nreverse children)) (if (eq ... 62) (progn ... ...) (error "XML: (Well-Formed) Couldn't parse tag: %s" ...)))
  (let* ((node-name ...) (attrs ...) children pos) (when (consp xml-ns) (dolist ... ...)) (setq children (list attrs ...)) (if (looking-at "/>") (progn ... ...) (if ... ... ...)))
  (cond ((looking-at "<\\?") (search-forward "?>") (skip-syntax-forward " ") (xml-parse-tag parse-dtd xml-ns)) ((looking-at "<!\\[CDATA\\[") (let ... ... ...)) ((looking-at "<!DOCTYPE") (let ... ... ...)) ((looking-at "<!--") (search-forward "-->") nil) ((looking-at "</") (quote nil)) ((looking-at "<\\([^/>[:space:]]+\\)") (goto-char ...) (let* ... ... ... ...)) (t (unless xml-sub-parser ...) (xml-parse-string)))
  (let ((xml-validating-parser ...) (xml-ns ...)) (cond (... ... ... ...) (... ...) (... ...) (... ... nil) (... ...) (... ... ...) (t ... ...)))
  xml-parse-tag(nil nil)
  (let ((tag ...)) (when tag (push tag children)))
  (cond ((looking-at "</") (error "XML: (Not Well-Formed) Invalid end tag (expecting %s) at pos %d" node-name ...)) ((= ... 60) (let ... ...)) (t (let ... ...)))
  (while (not (looking-at end)) (cond (... ...) (... ...) (t ...)))
  (let ((end ...)) (while (not ...) (cond ... ... ...)) (goto-char (match-end 0)) (nreverse children))
  (progn (forward-char 1) (let (...) (while ... ...) (goto-char ...) (nreverse children)))
  (if (eq (char-after) 62) (progn (forward-char 1) (let ... ... ... ...)) (error "XML: (Well-Formed) Couldn't parse tag: %s" (buffer-substring ... ...)))
  (if (looking-at "/>") (progn (forward-char 2) (nreverse children)) (if (eq ... 62) (progn ... ...) (error "XML: (Well-Formed) Couldn't parse tag: %s" ...)))
  (let* ((node-name ...) (attrs ...) children pos) (when (consp xml-ns) (dolist ... ...)) (setq children (list attrs ...)) (if (looking-at "/>") (progn ... ...) (if ... ... ...)))
  (cond ((looking-at "<\\?") (search-forward "?>") (skip-syntax-forward " ") (xml-parse-tag parse-dtd xml-ns)) ((looking-at "<!\\[CDATA\\[") (let ... ... ...)) ((looking-at "<!DOCTYPE") (let ... ... ...)) ((looking-at "<!--") (search-forward "-->") nil) ((looking-at "</") (quote nil)) ((looking-at "<\\([^/>[:space:]]+\\)") (goto-char ...) (let* ... ... ... ...)) (t (unless xml-sub-parser ...) (xml-parse-string)))
  (let ((xml-validating-parser ...) (xml-ns ...)) (cond (... ... ... ...) (... ...) (... ...) (... ... nil) (... ...) (... ... ...) (t ... ...)))
  xml-parse-tag(nil nil)
  (let ((tag ...)) (when tag (push tag children)))
  (cond ((looking-at "</") (error "XML: (Not Well-Formed) Invalid end tag (expecting %s) at pos %d" node-name ...)) ((= ... 60) (let ... ...)) (t (let ... ...)))
  (while (not (looking-at end)) (cond (... ...) (... ...) (t ...)))
  (let ((end ...)) (while (not ...) (cond ... ... ...)) (goto-char (match-end 0)) (nreverse children))
  (progn (forward-char 1) (let (...) (while ... ...) (goto-char ...) (nreverse children)))
  (if (eq (char-after) 62) (progn (forward-char 1) (let ... ... ... ...)) (error "XML: (Well-Formed) Couldn't parse tag: %s" (buffer-substring ... ...)))
  (if (looking-at "/>") (progn (forward-char 2) (nreverse children)) (if (eq ... 62) (progn ... ...) (error "XML: (Well-Formed) Couldn't parse tag: %s" ...)))
  (let* ((node-name ...) (attrs ...) children pos) (when (consp xml-ns) (dolist ... ...)) (setq children (list attrs ...)) (if (looking-at "/>") (progn ... ...) (if ... ... ...)))
  (cond ((looking-at "<\\?") (search-forward "?>") (skip-syntax-forward " ") (xml-parse-tag parse-dtd xml-ns)) ((looking-at "<!\\[CDATA\\[") (let ... ... ...)) ((looking-at "<!DOCTYPE") (let ... ... ...)) ((looking-at "<!--") (search-forward "-->") nil) ((looking-at "</") (quote nil)) ((looking-at "<\\([^/>[:space:]]+\\)") (goto-char ...) (let* ... ... ... ...)) (t (unless xml-sub-parser ...) (xml-parse-string)))
  (let ((xml-validating-parser ...) (xml-ns ...)) (cond (... ... ... ...) (... ...) (... ...) (... ... nil) (... ...) (... ... ...) (t ... ...)))
  xml-parse-tag(nil nil)
  (let ((tag ...)) (when tag (push tag children)))
  (cond ((looking-at "</") (error "XML: (Not Well-Formed) Invalid end tag (expecting %s) at pos %d" node-name ...)) ((= ... 60) (let ... ...)) (t (let ... ...)))
  (while (not (looking-at end)) (cond (... ...) (... ...) (t ...)))
  (let ((end ...)) (while (not ...) (cond ... ... ...)) (goto-char (match-end 0)) (nreverse children))
  (progn (forward-char 1) (let (...) (while ... ...) (goto-char ...) (nreverse children)))
  (if (eq (char-after) 62) (progn (forward-char 1) (let ... ... ... ...)) (error "XML: (Well-Formed) Couldn't parse tag: %s" (buffer-substring ... ...)))
  (if (looking-at "/>") (progn (forward-char 2) (nreverse children)) (if (eq ... 62) (progn ... ...) (error "XML: (Well-Formed) Couldn't parse tag: %s" ...)))
  (let* ((node-name ...) (attrs ...) children pos) (when (consp xml-ns) (dolist ... ...)) (setq children (list attrs ...)) (if (looking-at "/>") (progn ... ...) (if ... ... ...)))
  (cond ((looking-at "<\\?") (search-forward "?>") (skip-syntax-forward " ") (xml-parse-tag parse-dtd xml-ns)) ((looking-at "<!\\[CDATA\\[") (let ... ... ...)) ((looking-at "<!DOCTYPE") (let ... ... ...)) ((looking-at "<!--") (search-forward "-->") nil) ((looking-at "</") (quote nil)) ((looking-at "<\\([^/>[:space:]]+\\)") (goto-char ...) (let* ... ... ... ...)) (t (unless xml-sub-parser ...) (xml-parse-string)))
  (let ((xml-validating-parser ...) (xml-ns ...)) (cond (... ... ... ...) (... ...) (... ...) (... ... nil) (... ...) (... ... ...) (t ... ...)))
  xml-parse-tag(nil nil)
  (cond ((looking-at "<\\?") (search-forward "?>") (skip-syntax-forward " ") (xml-parse-tag parse-dtd xml-ns)) ((looking-at "<!\\[CDATA\\[") (let ... ... ...)) ((looking-at "<!DOCTYPE") (let ... ... ...)) ((looking-at "<!--") (search-forward "-->") nil) ((looking-at "</") (quote nil)) ((looking-at "<\\([^/>[:space:]]+\\)") (goto-char ...) (let* ... ... ... ...)) (t (unless xml-sub-parser ...) (xml-parse-string)))
  (let ((xml-validating-parser ...) (xml-ns ...)) (cond (... ... ... ...) (... ...) (... ...) (... ... nil) (... ...) (... ... ...) (t ... ...)))
  xml-parse-tag(nil nil)
  (setq result (xml-parse-tag parse-dtd parse-ns))
  (progn (forward-char -1) (setq result (xml-parse-tag parse-dtd parse-ns)) (if (and xml result ...) (error "XML: (Not Well-Formed) Only one root tag allowed") (cond ... ... ...)))
  (if (search-forward "<" nil t) (progn (forward-char -1) (setq result ...) (if ... ... ...)) (goto-char (point-max)))
  (while (not (eobp)) (if (search-forward "<" nil t) (progn ... ... ...) (goto-char ...)))
  (save-excursion (if buffer (set-buffer buffer)) (goto-char (point-min)) (while (not ...) (if ... ... ...)) (if parse-dtd (cons dtd ...) (nreverse xml)))
  (let ((case-fold-search nil) xml result dtd) (save-excursion (if buffer ...) (goto-char ...) (while ... ...) (if parse-dtd ... ...)))
  (progn (set-syntax-table (standard-syntax-table)) (let (... xml result dtd) (save-excursion ... ... ... ...)))
  (unwind-protect (progn (set-syntax-table ...) (let ... ...)) (save-current-buffer (set-buffer buffer) (set-syntax-table table)))
  (let ((table ...) (buffer ...)) (unwind-protect (progn ... ...) (save-current-buffer ... ...)))
  (with-syntax-table (standard-syntax-table) (let (... xml result dtd) (save-excursion ... ... ... ...)))
  (save-restriction (narrow-to-region beg end) (with-syntax-table (standard-syntax-table) (let ... ...)))
  xml-parse-region(1 48808)
  eval((xml-parse-region (point-min) (point-max)))
  eval-expression((xml-parse-region (point-min) (point-max)) nil)
  call-interactively(eval-expression)

`xml-substitute-special' signals an error only if
`xml-validating-parser' is non-nil (line 748); otherwise it sets
the expansion of an unknown entity to nil:

     (when xml-validating-parser
	 (error "XML: (Validity) Undefined entity `%s'"
		this-part))

Because of this, it builds a list of lists at line 770:

   (setq children (list expansion
                        prev-part
                        children))

(should it be (append (list expansion prev-part) children) ?)

which `mapconcat' can't handle.
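
The failure can be reproduced without xml.el.  The following is only a
sketch: `demo-join' is a made-up helper, and the list literals merely
imitate the shape seen in the backtrace.  It joins a list the same way
the failing frame does and catches the resulting error.

```elisp
;; Hypothetical helper, not part of xml.el: join CHILDREN exactly as the
;; failing frame does, but report the error instead of entering the debugger.
(defun demo-join (children)
  "Concatenate CHILDREN with `mapconcat'; return (failed ERROR-SYMBOL) on error."
  (condition-case err
      (mapconcat #'identity children "")
    (wrong-type-argument (list 'failed (car err)))))

;; Nested shape, as in the backtrace: the first element is itself a list,
;; so `concat' tries to read its elements as characters and fails on nil.
(demo-join '((nil "chte ") "bel gerade r" nil))
;; => (failed wrong-type-argument)

;; Flat shape, as produced by `append': nil elements count as empty
;; sequences, so the strings are simply joined.
(demo-join (append (list nil "chte ") '("bel gerade r")))
;; => "chte bel gerade r"
```

With a flat list the nil expansion is silently dropped, so the entity
vanishes from the text instead of triggering an error.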

I'm not familiar with XML, but I doubt that the current behavior
is intended.  I think it should either signal an error when it
detects the unknown entity (I think that's what the XML standard
says, but it seems the real world doesn't conform to it that
strictly) or produce some (more or less useful) result.

What about not expanding unknown entities at all?

(if xml-validating-parser
    (error "XML: (Validity) Undefined entity `%s'"
           this-part)
  (concat "&" this-part ";"))
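
To show the effect of that idea in isolation, here is a standalone
sketch.  It is not a patch against xml.el: `demo-xml-entities' and
`demo-substitute-entities' are made-up names, the entity table holds
only the five predefined XML entities, and the VALIDATING argument
stands in for `xml-validating-parser'.

```elisp
;; Hypothetical table; only the five entities predefined by XML.
(defvar demo-xml-entities
  '(("lt" . "<") ("gt" . ">") ("amp" . "&") ("apos" . "'") ("quot" . "\""))
  "Alist mapping entity names to their expansions (illustration only).")

(defun demo-substitute-entities (string &optional validating)
  "Return STRING with known entity references expanded.
Unknown references are left untouched, unless VALIDATING is non-nil,
in which case an error is signaled."
  (replace-regexp-in-string
   "&\\([^;]*\\);"
   (lambda (match)
     ;; While this function runs, the match data refer to MATCH itself.
     (let* ((name (match-string 1 match))
            (expansion (cdr (assoc name demo-xml-entities))))
       (cond (expansion expansion)
             (validating
              (error "XML: (Validity) Undefined entity `%s'" name))
             (t match))))               ; keep e.g. "&ouml;" as it is
   string t t))

(demo-substitute-entities "Ich m&ouml;chte &lt;Andre Rieu&gt; h&ouml;ren")
;; => "Ich m&ouml;chte <Andre Rieu> h&ouml;ren"
```

A reader such as nnrss would then still see the raw &ouml; in the
text, which at least makes the broken feed readable.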

David
