edbrowse-dev - development list for edbrowse
* [Edbrowse-dev] questions about new finds
@ 2016-06-23 11:22 Kevin Carhart
  2016-06-23 16:41 ` Karl Dahlke
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Carhart @ 2016-06-23 11:22 UTC (permalink / raw)
  To: edbrowse-dev




Predominantly for Karl, hi Karl:

I have several things to submit.  In a couple of areas I'd like to find 
out whether you think the entire class of correction is a mistake before 
I make a patch out of it.  In a couple of others I'd like to know whether 
you see a good approach.

(1) Not a JS problem for a change: Sites are creating kinds of tags with 
their own invented vocabulary as the type. These tag types are not in 
availableTags, so we get a message that the tag could not be created. 
For instance, Amazon and other pages want to create:
video
canvas
header
comment
fragment
nav
modernizr

Do we want to add these phrases to availableTags even though they are just 
made-up names?  Probably not, since it becomes silly to hardcode 
an endless number of them.  'video' and 'comment' seem 
plausible, but 'modernizr' is just some library.  Is there some way that 
availableTags could be amended on the fly and these and all future 
unencountered tag types would get a generic placeholder?  The reason for 
wanting to support these things is that the site's javascript may expect 
to address its own tag type and then write to it later on.

(2) Amazon has a strange <script> block.  It is not javascript.  It is 
just a bunch of json in script tags, and the type is specified.  This may 
be known as JSONP - I'm not sure.   So it is 
like <script type="abc">{a:1,b:2,c:3}</script>. 
In prepareScript, you have code to check that the language is 
"javascript".  What do you think about also testing the "type" attribute? 
If a script has a strange type specified, return without parsing.
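
A minimal sketch of the check described in (2), written as standalone 
javascript rather than as a patch to prepareScript.  The helper name 
isRunnableScript and the list of accepted types are assumptions made for 
illustration.  Browsers do not execute a script whose type is not a 
javascript MIME type, so returning without parsing matches their behavior.

```javascript
// Hypothetical helper, not edbrowse code: decide whether a <script> should
// run, based on its "type" and "language" attributes.
const JS_TYPES = new Set([
  "text/javascript",
  "application/javascript",
  "application/x-javascript",
  "text/ecmascript",
  "application/ecmascript",
]);

function isRunnableScript(type, language) {
  if (type) {
    // Drop parameters such as "; charset=utf-8", then compare in lower case.
    const essence = type.split(";")[0].trim().toLowerCase();
    // type="module" is also javascript but needs separate handling,
    // so this sketch treats it as not runnable.
    return JS_TYPES.has(essence);
  }
  if (language) {
    // Accept "javascript", "JavaScript1.2", and so on.
    return language.trim().toLowerCase().startsWith("javascript");
  }
  // Neither attribute given: the default is javascript.
  return true;
}
```

With this, isRunnableScript("abc") is false, so the json block on Amazon 
would be skipped, while a script with no type at all still runs.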

(3) I think some page code within a timeout isn't getting to run, because 
it fails the strict test at the top of setTimeout where there must be 2 
arguments and the second one must be numeric.  Some of the jquery versions 
say this, for example:

setTimeout( callback );

It doesn't run, and good thing too, if the strictness is necessary. 
However, it seems like a lot of site developers consider just 1 argument 
to be legitimate.  Should we loosen the test?  I am worried 
about this either way.  I tried it more lenient, and bad code seemed to 
run.  It was hard to isolate but edbrowse was crashing - so I don't know 
what to do.  I was between a rock and a hard place, so I just went on to 
something else.  I bet this is a cause of some sites not working.

(4) I was wondering about a change in the event listener.  At least some 
of the time, I have gotten events to work by having a kind of object 
called Event.  Then you pass an object called 'click' or whatever it 
needs to be.  I had some success with this on the 
Drescher page from Sebastian, in fact.  (Sebastian points out that online 
banking is a higher priority to fix than Austrian metal with a sense of 
humor, but I happened to learn a lot from the Drescher site because they 
use jquery and xhr!)

So, in startwindow, the handler is currently fired off like a[i]();

But what if there were an argument inside the parentheses, which would be 
your event object?

So the eval block begins
eval('this.' + ev + ........................
var tempEvent = new Event;
tempEvent.type = ev; // actually would be ev without the "on" prepended
a[i](tempEvent);} };');
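
The idea above can be sketched outside of the eval string as follows.  
This is only an illustration: SketchEvent and fireHandlers are made-up 
names, the handler array is passed in directly because the real storage 
in startwindow is not shown here, and only the type field of the event is 
modelled.

```javascript
// Hypothetical stand-in for a DOM Event; only "type" is modelled.
function SketchEvent(type) {
  this.type = type;
}

// Call each handler with "this" set to the target and an event object
// as the single argument, instead of calling a[i]() with no arguments.
function fireHandlers(target, ev, handlers) {
  // "onclick" becomes "click"; names without the prefix pass through.
  const type = ev.startsWith("on") ? ev.slice(2) : ev;
  const event = new SketchEvent(type);
  for (let i = 0; i < handlers.length; i++) {
    handlers[i].call(target, event);
  }
  return event;
}
```

A handler registered by jquery can then read the type field of its 
argument, which is what much page code expects to find.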

That's enough for now, there are a lot of exciting areas to amend which I 
think will improve some sites.  We have that whole iframe question as 
well.  I also have some more normal additions I can submit soon for 
domLink and for the createElement switch statement in startwindow.  For 
instance, elements need to have nodeType.

thanks for reading
Kevin



* [Edbrowse-dev]  questions about new finds
  2016-06-23 11:22 [Edbrowse-dev] questions about new finds Kevin Carhart
@ 2016-06-23 16:41 ` Karl Dahlke
  2016-06-27  6:39   ` Kevin Carhart
  0 siblings, 1 reply; 8+ messages in thread
From: Karl Dahlke @ 2016-06-23 16:41 UTC (permalink / raw)
  To: edbrowse-dev

(1) Do you mean <modernizer> in the html text or createElement("modernizer")?
Those are radically different issues.
The first is up to tidy, and out of our hands.
If those tags are simply dropped, they should not be,
and we must submit a bug report to the tidy team.
As for createElement yes we should create some kind of object
for any string that comes in.
Then we'd need a way to scan through those created objects by type.
I don't know how far away we are from this general approach.
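
A rough sketch of that general approach, in plain javascript.  
SketchDocument and createdByTag are invented names, not part of 
startwindow; the point is only that any string yields an object, and that 
the created objects can later be scanned by type.

```javascript
// Hypothetical mini-document, not edbrowse's startwindow code.
function SketchDocument() {
  this.created = []; // every element made through createElement
}

// Create a generic element object for any tag name, known or not.
SketchDocument.prototype.createElement = function (name) {
  const el = {
    nodeName: String(name).toUpperCase(), // html element names read back in upper case
    nodeType: 1, // 1 is ELEMENT_NODE
    childNodes: [],
  };
  this.created.push(el);
  return el;
};

// Scan the created objects by type, ignoring case.
SketchDocument.prototype.createdByTag = function (name) {
  const want = String(name).toUpperCase();
  return this.created.filter(function (el) {
    return el.nodeName === want;
  });
};
```

Here createElement("modernizr") succeeds just like createElement("video"), 
so no list of names has to be maintained for the creation step.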

(2) Clearly <script> should not run if the type is specified and wrong; just like language.

(3) Sad truth is, if firefox runs some code some way then we have to do the same.
If it does timeout with one arg then so should we, but what is the delay?
Perhaps 1ms.
Hint: jseng_moz.cpp line 2030, the 2 forces exactly 2 args.
I think. You might have to change it to 0 for vararg then check
arg counts down the line.
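
To illustrate the "check arg counts down the line" step, here is a sketch 
in javascript rather than in jseng_moz.cpp.  The function parseTimerArgs 
is hypothetical, and the 1 ms default is only the guess made above; 
browsers generally treat a missing delay as 0.

```javascript
// Hypothetical argument check for a setTimeout-like native.
const DEFAULT_DELAY_MS = 1; // assumption: the "perhaps 1ms" guess above

function parseTimerArgs(args) {
  if (args.length < 1) {
    throw new TypeError("setTimeout needs at least a callback");
  }
  const callback = args[0];
  if (typeof callback !== "function" && typeof callback !== "string") {
    throw new TypeError("callback must be a function or a string of code");
  }
  // A missing, non-numeric, or negative delay falls back to the default.
  let delay = Number(args[1]);
  if (args.length < 2 || !Number.isFinite(delay) || delay < 0) {
    delay = DEFAULT_DELAY_MS;
  }
  // Anything after the delay would be passed on to the callback.
  const extra = Array.prototype.slice.call(args, 2);
  return { callback: callback, delay: delay, extra: extra };
}
```

The one-argument jquery call then yields a usable delay instead of being 
rejected, while a call with no callback at all is still refused.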

(4) I was wondering about a change in the event listener.
Well I don't know if I follow all this
but if you'd like to submit something we'll have a look.

(5) Iframe, still pending, as you say,
maybe start by making the object, as we did for XMLHttpRequest,
then we can tie the writer to a native method that pretty much does what innerHTML does.

I believe even more, as you do your research,
that the crazy asynchronous ajax stuff is not most of our problems,
and that would be *really* hard to get right anyways;
that instead it's parts of the dom that just aren't there or aren't implemented properly.
Wish we could borrow an entire dom, as we did with js and tidy html,
but then of course that has its own issues.
Not under our control etc.

Karl Dahlke


* Re: [Edbrowse-dev] questions about new finds
  2016-06-23 16:41 ` Karl Dahlke
@ 2016-06-27  6:39   ` Kevin Carhart
  2016-07-02 16:48     ` Adam Thompson
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Carhart @ 2016-06-27  6:39 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: edbrowse-dev



Thanks for the notes on my new finds.  I checked up on a couple of these.

> (1) Do you mean <modernizer> in the html text or createElement("modernizer")?
> Those are radically different issues.
> The first is up to tidy, and out of our hands.

Aha, so it actually is createElement.  The Amazon homepage says things 
like:
...r.createElement("video")....
and
...r.createElement("canvas")...
and then from tagFromJavaVar2, edbrowse can't find those strings in 
availableTags, and then reports:
cannot create tag node video
cannot create tag node canvas

And on item 3, about the setTimeout and badarg:

> Hint: jseng_moz.cpp line 2030, the 2 forces exactly 2 args.
> I think. You might have to change it to 0 for vararg then check
> arg counts down the line.

Aha, I wonder if that was the missing link when I was getting crashes. 
Because it now didn't adhere to the spec.  That could be it!

Kevin


* Re: [Edbrowse-dev] questions about new finds
  2016-06-27  6:39   ` Kevin Carhart
@ 2016-07-02 16:48     ` Adam Thompson
  2016-07-02 17:29       ` Karl Dahlke
  2016-07-02 22:11       ` Kevin Carhart
  0 siblings, 2 replies; 8+ messages in thread
From: Adam Thompson @ 2016-07-02 16:48 UTC (permalink / raw)
  To: Kevin Carhart; +Cc: Karl Dahlke, edbrowse-dev


Hi all,

First of all apologies for going silent for a while; work, life and computer issues basically.

On Sun, Jun 26, 2016 at 11:39:49PM -0700, Kevin Carhart wrote:
> 
> >(1) Do you mean <modernizer> in the html text or createElement("modernizer")?
> >Those are radically different issues.
> >The first is up to tidy, and out of our hands.
What do we do in the first case, i.e. if we get <unknowntag>? If it's something we don't know are we putting it in the DOM but not rendering, or ignoring it?
Also, I think *some* of those tags are actually html 5 tags, e.g. <video> and <canvas> (probably the header and comment also but can't remember) and thus we
should probably have some handling whether they're created in the html or via js (as in this case).

Cheers,
Adam.



* [Edbrowse-dev]  questions about new finds
  2016-07-02 16:48     ` Adam Thompson
@ 2016-07-02 17:29       ` Karl Dahlke
  2016-07-03  9:54         ` Adam Thompson
  2016-07-02 22:11       ` Kevin Carhart
  1 sibling, 1 reply; 8+ messages in thread
From: Karl Dahlke @ 2016-07-02 17:29 UTC (permalink / raw)
  To: edbrowse-dev

> What do we do in the first case, i.e. if we get <unknowntag>?

When tidy sees <foobar> it says "discarding unknown tag".
This may prove to be an unacceptable behavior at some point.
A lot depends on what other browsers do with it,
and whether we are expected to do the same.

Karl Dahlke


* Re: [Edbrowse-dev] questions about new finds
  2016-07-02 16:48     ` Adam Thompson
  2016-07-02 17:29       ` Karl Dahlke
@ 2016-07-02 22:11       ` Kevin Carhart
  2016-07-03 11:12         ` Adam Thompson
  1 sibling, 1 reply; 8+ messages in thread
From: Kevin Carhart @ 2016-07-02 22:11 UTC (permalink / raw)
  To: Adam Thompson; +Cc: Karl Dahlke, edbrowse-dev



Hi Adam and Chris,

There's a lot to do basically.  I see tidy errors, but I gravitate to the 
JS errors, I guess partially because they are autonomous for us to fix, 
while messages from tidy involve making threads with the tidy developers, 
fun and worthy in its own right, but the path of least resistance takes 
you to something you can work on right away.  (Dang, here's a toast to the 
tidy group!  It underpins everything now and it's so good I don't think 
about it!  We only integrated tidy around 1 year ago.)

We could add things to availableTags one by one as we find them but this 
will go on forever.  I experimented adding a "placeholder" to availableTags 
and editing html.c so that if the call to newTag fails to find a string,
it will create your element as a "placeholder."  But I don't know what I'm 
doing - I was also seeing some infinite loops in renderNode, which I 
probably caused by adding placeholder.  I don't know what the TAGACT or 
flags should be for a made-up placeholder.

K




--------
Kevin Carhart * 415 225 5306 * The Ten Ninety Nihilists


* Re: [Edbrowse-dev] questions about new finds
  2016-07-02 17:29       ` Karl Dahlke
@ 2016-07-03  9:54         ` Adam Thompson
  0 siblings, 0 replies; 8+ messages in thread
From: Adam Thompson @ 2016-07-03  9:54 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: edbrowse-dev


On Sat, Jul 02, 2016 at 01:29:35PM -0400, Karl Dahlke wrote:
> > What do we do in the first case, i.e. if we get <unknowntag>?
> 
> When tidy sees <foobar> it says "discarding unknown tag".
> This may prove to be an unacceptable behavior at some point.
> A lot depends on what other browsers do with it,
> and whether we are expected to do the same.

But these aren't unknown tags in tidy since they're html5, they're just unknown in Edbrowse.  I'm wondering if there's anything we can do in the meantime whilst
we get our html5 support completed to at least provide the html5 stuff as part of the DOM for use by js.  Any ideas?  I think that header, footer (probably
comment and fragment too) should be treated like div in edbrowse, and probably same for article (and its subtags) but obviously js needs to see the original
element types.  From memory there are a few more div-like tags as well though I can't remember what they are without going through the spec.
<video> is for embedding video files and, I think, <canvas> is some sort of drawing element though I don't know what it does (or if it can be used) with text.

Cheers,
Adam.



* Re: [Edbrowse-dev] questions about new finds
  2016-07-02 22:11       ` Kevin Carhart
@ 2016-07-03 11:12         ` Adam Thompson
  0 siblings, 0 replies; 8+ messages in thread
From: Adam Thompson @ 2016-07-03 11:12 UTC (permalink / raw)
  To: Kevin Carhart; +Cc: Karl Dahlke, edbrowse-dev


On Sat, Jul 02, 2016 at 03:11:58PM -0700, Kevin Carhart wrote:
> There's a lot to do basically.  I see tidy errors, but I gravitate to the JS
> errors, I guess partially because they are autonomous for us to fix, while
> messages from tidy involve making threads with the tidy developers, fun and
> worthy in its own right, but the path of least resistance takes you to
> something you can work on right away.  (Dang, here's a toast to the tidy
> group!  It underpins everything now and it's so good I don't think about it!
> We only integrated tidy around 1 year ago.)

Agreed, it's worked well and simplified the code-base as well as handling much of the oddness in modern html for us.

> We could add things to availableTags one by one as we find them but this
> will go on forever.  I experimented adding a "placeholder" to availableTags and
> editing html.c so that if the call to newTag fails to find a string,
> it will create your element as a "placeholder."  But I don't know what I'm
> doing - I was also seeing some infinite loops in renderNode, which I
> probably caused by adding placeholder.  I don't know what the TAGACT or
> flags should be for a made-up placeholder.

I think that we should probably treat them as... div... elements maybe, so at least things are rendered.  Maybe print a debug message at a relatively high
debug level so that we can track unimplemented tags and hopefully clean up some of the more common ones.  The trick here is (as I said in a previous email to
Karl) to expose the provided element type to js.  In some cases, i.e. with non-text elements, this may not necessarily be possible but I suspect that doing this
will hopefully fix a few things with just one task.
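
As a sketch of what I mean, in javascript for brevity even though the real 
change would be in html.c.  Everything named here is hypothetical: the 
KNOWN_ACTIONS table stands in for availableTags, the strings stand in for 
TAGACT values, and debug level 3 is an arbitrary choice.

```javascript
// Hypothetical stand-in for availableTags and its TAGACT values.
const KNOWN_ACTIONS = new Map([
  ["div", "div"],
  ["p", "paragraph"],
  ["a", "anchor"],
]);

// Tag name -> number of times an unimplemented tag was seen.
const unknownCounts = new Map();

function classifyTag(name, debugLevel, log) {
  const tag = String(name).toLowerCase();
  let action = KNOWN_ACTIONS.get(tag);
  if (action === undefined) {
    // Unknown tag: render like a div so its contents still show up.
    action = "div";
    unknownCounts.set(tag, (unknownCounts.get(tag) || 0) + 1);
    if (debugLevel >= 3) {
      log("unimplemented tag <" + tag + ">, rendering as div");
    }
  }
  // js sees the original element type; only the rendering is generic.
  return { nodeName: tag.toUpperCase(), renderAs: action };
}
```

The counts give a cheap way to see which unimplemented tags are most 
common and so worth handling properly first.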

Cheers,
Adam.

