edbrowse-dev - development list for edbrowse
From: Geoff McLane <ubuntu@geoffair.info>
To: Karl Dahlke <eklhad@comcast.net>, edbrowse-dev@lists.the-brannons.com
Subject: Re: [Edbrowse-dev] <table> <form>
Date: Thu, 22 Dec 2016 21:13:32 +0100
Message-ID: <bd3d42ad-d97e-91d8-5b2e-6b5fb30d3376@geoffair.info>
In-Reply-To: <20161122133544.eklhad@comcast.net>

Hi Karl,

 > In an ideal world,

LOL! Well we all know that does not exist!

Tidy does leave the form open, waiting, as it
should, for a close form, but then it hits
an opening tr table-row element, and reports:

line 5 column 1 - Warning: missing close form before tr

It is at this point that it *must* close the
form... and carries on parsing the table
row.. etc...

And that is why tidy emits an error when it
does eventually find a close form...
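
A toy model of that sequence, just to make the order of
events concrete. This is only a sketch and not libtidy's
actual code: the function name toy_parse, the bare tag-name
input, and the wording of the second message are all
assumptions made for illustration.

```python
def toy_parse(tags):
    """Walk bare tag names (e.g. "form", "tr", "/form") and collect
    the diagnostics a strict, repair-as-you-go parser might emit.

    Hypothetical model only; it does not reproduce libtidy's logic.
    """
    messages = []
    form_open = False
    for tag in tags:
        if tag == "form":
            form_open = True
        elif tag == "tr" and form_open:
            # The form cannot stay open across a table row here,
            # so it is closed implicitly and a warning is recorded.
            messages.append("missing close form before tr")
            form_open = False
        elif tag == "/form":
            if form_open:
                form_open = False
            else:
                # The form was already closed implicitly, so the real
                # close tag now has nothing to match (wording assumed).
                messages.append("unexpected close form")
    return messages


if __name__ == "__main__":
    page = ["table", "form", "tr", "td", "/td", "/tr", "/form", "/table"]
    for message in toy_parse(page):
        print(message)
```

By the time the real close form arrives, the implicit
close has already happened, which is why the second
complaint follows from the first.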

I too have had the thought - does this not
tell tidy that the earlier implicit form
close it added was not right - but what can
it do about it at that stage?

 > postmuck with the tree

Yes, I hear you! That is *not* fun, and as you
point out in fixing one page, you can break so
many others...

 > Using libtidy

You know, for a long time I have wondered why
you do not write your own html parser!

Not that I particularly want you to abandon
libtidy... your participation has helped solve
some libtidy problems... and so do hope you
continue...

But like any standard html browser, IE, Firefox, Chrome,
whoever, you are not really interested in how
well a document is formed... browsers can just skip
over many problems...

If necessary, maybe by leveraging code from text-based
web browsers, like Lynx, but in my experimentation
with some of these, they too can get very hairy...

It is just that once you have the html text in a
buffer, parsing basically consists of looking for
`<` and the matching `>`, with not too many exceptions...
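
As a minimal sketch of that idea, assuming the whole
document is already in a string: the name scan_tags is
made up for this example, and only one exception (comments)
is handled. A real scanner would also need to cope with `>`
inside quoted attribute values, script contents, and so on.

```python
def scan_tags(html):
    """Yield (kind, text) pairs by looking for '<' and the next '>'.

    kind is "tag", "comment" or "text". Lenient by design: nothing
    here checks whether the document is well formed.
    """
    pos = 0
    while pos < len(html):
        lt = html.find("<", pos)
        if lt == -1:
            # No more tags: the remainder is plain text.
            yield ("text", html[pos:])
            return
        if lt > pos:
            yield ("text", html[pos:lt])
        if html.startswith("<!--", lt):
            # Exception: a comment ends at "-->", not at the first ">".
            end = html.find("-->", lt + 4)
            end = len(html) if end == -1 else end + 3
            yield ("comment", html[lt:end])
            pos = end
            continue
        gt = html.find(">", lt + 1)
        if gt == -1:
            # Unterminated "<": pass the rest through as text.
            yield ("text", html[lt:])
            return
        yield ("tag", html[lt:gt + 1])
        pos = gt + 1


if __name__ == "__main__":
    for kind, text in scan_tags("<table><form><tr><td>x</td></tr></form></table>"):
        print(kind, text)
```

Note that the scanner simply reports each tag in the order
it appears and leaves any decision about nesting to the caller.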

I have done this, with reasonable success, in several
Perl scripts I have written... as I am sure you
probably have... like I remember in your first Perl
version...

But I understand, this is a long, LONG way around...
quite an amount of new work initially...

But libtidy is always going to give you problems
when it runs into invalid html, and its efforts
to make it valid...

Just some thoughts... Sorry, I cannot seem to help
more...

Regards, Geoff.


Thread overview: 6+ messages
2016-12-20 19:14 Karl Dahlke
2016-12-21 14:01 ` Chris Brannon
2016-12-21 17:03   ` Geoff McLane
2016-12-22 18:35     ` Karl Dahlke
2016-12-22 20:13       ` Geoff McLane [this message]
2016-12-25 12:53         ` Adam Thompson
