edbrowse-dev - development list for edbrowse
From: Adam Thompson <arthompson1990@gmail.com>
To: Karl Dahlke <eklhad@comcast.net>
Cc: edbrowse-dev@lists.the-brannons.com
Subject: Re: [Edbrowse-dev] script tags in scripts
Date: Fri, 11 Sep 2015 19:02:17 +0100
Message-ID: <20150911180217.GB29720@toaster.adamthompson.me.uk>
In-Reply-To: <20150811061713.eklhad@comcast.net>

On Fri, Sep 11, 2015 at 06:17:13AM -0400, Karl Dahlke wrote:
> > I'm not sure what we can do about this,
> > but I'm inclined to think that whatever we do won't catch every case and that
> > at some stage we have to accept that and move on.
> 
> That was true of my parser, true of tidy5, and true of any parser,
> however, as you point out regularly, we should handle most websites
> that other browsers handle.
> And when we don't,
> entire web pages shouldn't disappear beyond the point of error.
> This bug is produced by fanfiction.net and fictionpress.com,
> two high volume sites that work on every other browser.

Agreed, we need to work out what's breaking here and why it affects tidy5
and not, say, Firefox. I may try the pages with some other HTML parsing
libs (not usable in edbrowse, unfortunately, as they're in, e.g.,
Python or Perl) to see what they do with the pages.
I'm just saying that I think we should continue to move forward with the design
on the basis that tidy5 will be fixed.
If it's not, then we'll need to look at alternatives, but there are a lot of
elements of the new design which should stay in any case, I think.
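
A minimal sketch of the kind of cross-check described above, assuming
Python's standard html.parser as the "other" library (the message does not
name one). The sample markup is hypothetical, not taken from the affected
sites, and the Recorder class and split_page helper are names made up for
this illustration. It only shows whether text after a script that mentions
script tags in a string still reaches the parser's output:

```python
from html.parser import HTMLParser

# Hypothetical input, NOT the real markup from the affected sites:
# a script whose string literal itself contains script tags.
SAMPLE = (
    "<p>before</p>"
    '<script>document.write("<script><\\/script>");</script>'
    "<p>after</p>"
)


class Recorder(HTMLParser):
    """Collect text inside and outside <script> elements separately."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.text = []    # character data seen outside <script>
        self.script = []  # character data seen inside <script>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        (self.script if self.in_script else self.text).append(data)


def split_page(markup):
    """Return (text outside scripts, text inside scripts) for markup."""
    parser = Recorder()
    parser.feed(markup)
    parser.close()  # flush any buffered trailing data
    return "".join(parser.text), "".join(parser.script)


if __name__ == "__main__":
    text, script = split_page(SAMPLE)
    print("outside scripts:", repr(text))
    print("inside scripts: ", repr(script))
```

If the text following the script is missing from the first value, the
parser under test lost the rest of the page, which is the symptom discussed
in this thread. A real comparison would feed in the saved pages rather than
this sample.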

> And by the way, my thanks to those users who exercise and test our bleeding edge software;
> you're as brave as a Windows 10 insider.

I second this. We need users to test this software and I appreciate the time
and effort it takes to keep on top of the latest code,
particularly when we're adding library dependencies.

> In any case, tidy5 needs to fix this,
> or we need to find a way to preprocess around it,
> the latter meaning I'd have to keep at least half of my parser,
> which I really wanted to throw away entirely.   :(

Maybe, or we keep the tidy-inspired design but rewrite the parsing logic,
maybe borrowing the parsing code from somewhere else and making it our own.
I know I said we should try to stay out of the HTML parsing business,
and ideally I still would like to,
but if we really can't, then we can at least keep the current design direction.
There has to be a parsing lib out there somewhere which works properly...
at least I hope there is.

Cheers,
Adam.


Thread overview: 8+ messages
2015-09-11  0:17 Tyler Spivey
2015-09-11  1:10 ` Karl Dahlke
2015-09-11  5:28   ` Kevin Carhart
2015-09-11  7:39     ` Adam Thompson
2015-09-11 10:17       ` Karl Dahlke
2015-09-11 18:02         ` Adam Thompson [this message]
2015-09-11 18:55           ` Karl Dahlke
2015-09-11 16:37     ` Chris Brannon
