From: Kevin Carhart <kevin@carhart.net>
To: Karl Dahlke <eklhad@comcast.net>
Cc: Edbrowse-dev@lists.the-brannons.com
Subject: Re: [Edbrowse-dev] acid[0]
Date: Sat, 19 Aug 2017 15:53:58 -0700 (PDT)
Message-ID: <alpine.LRH.2.03.1708191537320.6887@carhart.net>
In-Reply-To: <20170719113834.eklhad@comcast.net>
I think we're getting into CSS here. The acid3 html file has a text/css
section at the top including this:
#instructions:last-child { white-space: pre-wrap; white-space: x-bogus;
}
What are your feelings about CSS? I have been making a claim that I think
there's some evidence for, though I'm not positive: even though the
bulk of CSS is not useful or interesting to the edbrowse renderer, we
might still care about it, because sites use the presence of
CSS names and values as a way around user-agent spoofing. They poke and
prod 100 attributes, and take the collected results to be your browser
and OS fingerprint, overriding what you said it was. Diabolical, huh?
Do you think this is a compelling reason to get into CSS? I think I have
found some third-party JS code that we might be interested in, if we wanted
to do something with this. It might save work. One object is
a CSS parser. It would turn a .css file into JSON, which is easier to
traverse afterwards. There is also a JS implementation of
querySelectorAll, which works like getElementsByTagName, except that
the result elements are chosen by selector syntax rather
than by tag or name. The colon, the period, and the hash mark have particular
hardcoded meanings for different types of selections (pseudo-class,
class, and id, respectively).
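
To make the two ideas above concrete, here is a rough sketch. It is not the third-party code I mentioned; parseRule, matchesSimple, selectAll and the toy tree are names I made up for illustration. It turns one CSS rule into a JSON-friendly object, and selects nodes from a toy tree by tag, #id or .class. Pseudo-classes (the colon) are parsed as part of the selector text but not evaluated.

```javascript
// Hypothetical helper: parse a single "selector { prop: value; ... }" rule
// into a plain object. No comments, nesting, or at-rules are handled.
function parseRule(css) {
  const m = /^\s*([^{]+?)\s*\{([^}]*)\}\s*$/.exec(css);
  if (!m) return null;
  const declarations = m[2]
    .split(";")
    .map((d) => d.trim())
    .filter((d) => d.indexOf(":") > 0) // skip empty or malformed pieces
    .map((d) => {
      const i = d.indexOf(":");
      return { property: d.slice(0, i).trim(), value: d.slice(i + 1).trim() };
    });
  return { selector: m[1], declarations };
}

// Hypothetical toy tree; this is not the edbrowse DOM.
const tree = {
  tag: "body", id: "", classes: [], children: [
    { tag: "p", id: "instructions", classes: ["note"], children: [] },
    { tag: "p", id: "", classes: ["note", "big"], children: [] },
    { tag: "div", id: "main", classes: [], children: [
      { tag: "p", id: "", classes: [], children: [] },
    ] },
  ],
};

// Match one simple selector: "tag", "#id", or ".class".
function matchesSimple(node, selector) {
  if (selector.startsWith("#")) return node.id === selector.slice(1);
  if (selector.startsWith(".")) return node.classes.includes(selector.slice(1));
  return node.tag === selector.toLowerCase();
}

// Depth-first walk collecting matching descendants in document order.
function selectAll(root, selector) {
  const out = [];
  (function walk(node) {
    for (const child of node.children) {
      if (matchesSimple(child, selector)) out.push(child);
      walk(child);
    }
  })(root);
  return out;
}

const rule = parseRule(
  "#instructions:last-child { white-space: pre-wrap; white-space: x-bogus; }"
);
console.log(JSON.stringify(rule, null, 2));
console.log(selectAll(tree, "p").length, selectAll(tree, ".note").length);
```

A real selector engine also has to handle combinators, attribute selectors and pseudo-classes like :last-child, which is where most of the work would be.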
thanks
Kevin
On Sat, 19 Aug 2017, Karl Dahlke wrote:
> With Kevin pointing the way, I started looking at the first of 100 acid tests.
> It runs into a problem in that it expects a pure whitespace node that is not there.
> Note the following html.
>
> <body>
> <p>paragraph 1</p>
> <p>paragraph 2</p>
> </body>
>
> Browse with db5 and tidy gives us the two paragraph nodes in sequence, there is no node in between with the newline (whitespace) character.
> The javascript expects it to be there.
> Why is it not there?
>
> Note html-tidy.c line 126.
> I tell tidy not to drop empty elements, or empty paragraphs.
> Geoff, or anyone else, any insights?
>
> Karl Dahlke
>
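
For reference, a small sketch of the node sequence the test appears to expect for the html quoted above. This is a toy parser written only for illustration, not tidy and not edbrowse code; parseToy and its node shape are hypothetical, and it handles only simple, well-nested tags. It keeps every run of text between tags, including whitespace-only runs, as its own text node.

```javascript
// Toy parser: each run of characters between tags becomes a text node,
// even when it is only a newline.
function parseToy(html) {
  const root = { type: "root", children: [] };
  const stack = [root];
  const token = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)/g;
  let m;
  while ((m = token.exec(html)) !== null) {
    const parent = stack[stack.length - 1];
    if (m[3] !== undefined) {
      // Text, possibly whitespace only.
      parent.children.push({ type: "text", data: m[3] });
    } else if (m[1] === "/") {
      // Closing tag: go back up one level.
      if (stack.length > 1) stack.pop();
    } else {
      // Opening tag: new element becomes the current parent.
      const el = { type: "element", tag: m[2].toLowerCase(), children: [] };
      parent.children.push(el);
      stack.push(el);
    }
  }
  return root;
}

const html = "<body>\n<p>paragraph 1</p>\n<p>paragraph 2</p>\n</body>";
const body = parseToy(html).children[0];
const kinds = body.children.map((n) => (n.type === "text" ? "#text" : n.tag));
console.log(kinds.join(" "));
```

With whitespace kept, the two paragraphs are no longer adjacent siblings: a text node holding the newline sits between them, which is the node the acid test looks for.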
--------
Kevin Carhart * 415 225 5306 * The Ten Ninety Nihilists