edbrowse-dev - development list for edbrowse
From: Adam Thompson <arthompson1990@gmail.com>
To: Kevin Carhart <kevin@carhart.net>
Cc: Karl Dahlke <eklhad@comcast.net>, edbrowse-dev@lists.the-brannons.com
Subject: Re: [Edbrowse-dev] script tags in scripts
Date: Fri, 11 Sep 2015 08:39:39 +0100
Message-ID: <20150911073939.GA29720@toaster.adamthompson.me.uk>
In-Reply-To: <alpine.LRH.2.03.1509102200340.19704@carhart.net>

On Thu, Sep 10, 2015 at 10:28:03PM -0700, Kevin Carhart wrote:
> 
> Interesting.. Karl, does your certainty mean that you are saying
> that the distinction between the two tags is fundamentally
> unknowable for a parser?

It's certainly difficult if the parser isn't also capable of parsing the
scripting language within the script tags.
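To illustrate the difficulty, here is a minimal sketch of a scanner that
knows HTML but not javascript. The helper name naiveScriptBody and the sample
page are made up for illustration; this is not how tidy or edbrowse actually
scans. It ends the script at the first "</script" it sees, even when that
text sits inside a string literal.

```javascript
// Hypothetical helper: returns the script body as an HTML-only scanner
// would see it, i.e. everything up to the first "</script".
function naiveScriptBody(html) {
  const open = html.search(/<script[^>]*>/i);
  if (open === -1) return null;
  const start = html.indexOf(">", open) + 1;
  const rest = html.slice(start);
  const close = rest.search(/<\/script/i);
  return close === -1 ? rest : rest.slice(0, close);
}

// The inner </script> is inside a javascript string, but the scanner
// cannot know that without parsing the script itself.
const page = '<script>document.write("<script>alert(1)</script>");</script>';
const body = naiveScriptBody(page);
console.log(body);
```

The body comes back cut off in the middle of the string literal, which is the
kind of mis-parse being discussed in this thread.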

> I guess one good sign is that there appears to be a lot of
> past literature on this issue, on Tidy listservs.  Including
> one from 2006 called "Tidy barfs on split <SCRIPT> tags".
> Unless it's an impossible problem, maybe these past threads
> will contain something we can use.  I will read some of this
> correspondence.

I've also run the example through the main tidy html5 version, and it also
spits it out.

> This reminds me of other gnarly situations with literals.
> For instance, when there are regular expression criteria in
> javascript strings that contain just solely a close brace or close
> parenthesis, if I come along and want to make
> assumptions about pairs of braces, the unmatched literal gets me
> out of sync.

Agreed, literals in scripts can cause issues like this.
There's also the issue of JSON shoved into script tags (I've seen web apps
use this for pre-caching server responses).
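As a small sketch of the literal problem Kevin describes, the following
counts braces without understanding string literals. The helper name
naiveBraceDepth and both sample sources are hypothetical, chosen only to show
the counter getting out of sync.

```javascript
// Hypothetical helper: net brace depth, ignoring the fact that braces
// inside string or regex literals are not real block delimiters.
function naiveBraceDepth(source) {
  let depth = 0;
  for (const ch of source) {
    if (ch === "{") depth += 1;
    else if (ch === "}") depth -= 1;
  }
  return depth;
}

const balanced = 'function f() { return 1; }';
const withLiteral = 'function g(s) { return s.indexOf("}"); }';

console.log(naiveBraceDepth(balanced));     // braces pair up
console.log(naiveBraceDepth(withLiteral));  // the "}" literal is miscounted
```

Both sources are well-formed javascript, yet only the first one looks
balanced to the counter.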

I'm not sure what we can do about this, but I'm inclined to think that
whatever we do won't catch every case, and at some stage we have to accept
that and move on.
I seem to remember that the accepted "fix" for this in html is not to write
</script> whole inside the script, but rather to split it at the / thus:
document.write("<"); document.write("/script>");
But I may be wrong there.
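A rough check of why that split would help, assuming the parser simply looks
for the literal text "</script" (an assumption about the scanner, not
something confirmed for tidy). The helper firstScriptClose is hypothetical.
The backslash-escaped form is included because it is another commonly seen
variant.

```javascript
// Hypothetical helper: index of the first "</script" in raw script source,
// or -1 if an HTML-only scanner would find no close tag in it.
function firstScriptClose(source) {
  return source.search(/<\/script/i);
}

const unsplit = 'document.write("</script>");';
const split = 'document.write("<"); document.write("/script>");';
const escaped = 'document.write("<\\/script>");'; // source text reads <\/script>

console.log(firstScriptClose(unsplit) !== -1); // close tag visible to the scanner
console.log(firstScriptClose(split));          // nothing for the scanner to trip on
console.log(firstScriptClose(escaped));        // likewise

// At run time the two writes still produce the intended text.
const written = "<" + "/script>";
console.log(written === "</script>");
```

The split and escaped forms hide the close tag from a text-level scan while
still emitting </script> when the script runs.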

We should probably report a bug against tidy5 in any case for this.
That's why we're using a parsing library after all.
At least this one's maintained for us so there's a reasonable chance they'll
fix these things once they work out a workable solution.

Cheers,
Adam.
