* [Edbrowse-dev] acid[0] @ 2017-08-19 15:38 Karl Dahlke 2017-08-19 22:53 ` Kevin Carhart 0 siblings, 1 reply; 22+ messages in thread From: Karl Dahlke @ 2017-08-19 15:38 UTC (permalink / raw) To: Edbrowse-dev [-- Attachment #1: Type: text/plain, Size: 600 bytes --]

With Kevin pointing the way, I started looking at the first of 100 acid tests.
It runs into a problem in that it expects a pure whitespace node that is not there.
Note the following html.

<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
</body>

Browse with db5 and tidy gives us the two paragraph nodes in sequence, there is no node in between with the newline (whitespace) character.
The javascript expects it to be there.
Why is it not there?

Note html-tidy.c line 126.
I tell tidy not to drop empty elements, or empty paragraphs.
Geoff, or anyone else, any insights?

Karl Dahlke

^ permalink raw reply [flat|nested] 22+ messages in thread
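A rough model of the mismatch described above, for anyone following along. The node objects are hand-built stand-ins (not edbrowse or tidy structures), and the assumption is simply that the test script looks at the sibling right after the first paragraph.

```javascript
// Hand-built stand-ins for DOM nodes; nodeType 1 = element, 3 = text.
function makeChildren(includeWhitespace) {
  var nodes = [];
  nodes.push({ nodeType: 1, nodeName: "P", text: "paragraph 1" });
  if (includeWhitespace) {
    // the newline between the two <p> tags, kept as its own text node
    nodes.push({ nodeType: 3, nodeName: "#text", text: "\n" });
  }
  nodes.push({ nodeType: 1, nodeName: "P", text: "paragraph 2" });
  // Link siblings the way a DOM tree would.
  for (var i = 0; i < nodes.length; i++) {
    nodes[i].nextSibling = nodes[i + 1] || null;
  }
  return nodes;
}

// A script may assume that the node right after the first paragraph
// is whitespace-only text.
function nextIsWhitespaceText(node) {
  var n = node.nextSibling;
  return n !== null && n.nodeType === 3 && /^\s*$/.test(n.text);
}

var withWhitespace = makeChildren(true);     // tree that keeps the newline
var withoutWhitespace = makeChildren(false); // tree as described for tidy

console.log(nextIsWhitespaceText(withWhitespace[0]));
console.log(nextIsWhitespaceText(withoutWhitespace[0]));
```

The second call reports false because the two paragraph nodes sit next to each other, which is the situation the message describes.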
* Re: [Edbrowse-dev] acid[0] 2017-08-19 15:38 [Edbrowse-dev] acid[0] Karl Dahlke @ 2017-08-19 22:53 ` Kevin Carhart 2017-08-19 23:08 ` Karl Dahlke 0 siblings, 1 reply; 22+ messages in thread From: Kevin Carhart @ 2017-08-19 22:53 UTC (permalink / raw) To: Karl Dahlke; +Cc: Edbrowse-dev I think we're getting into CSS here. The acid3 html file has a text/css section at the top including this: #instructions:last-child { white-space: pre-wrap; white-space: x-bogus; } What are your feelings about css? I have been making a claim that I think there's some evidence for, but I'm not positive: Even though the bulk of CSS is not useful or interesting to the edbrowse renderer, we might still be interested in CSS because sites use the presence of CSS names and values as a workaround for user-agent spoofing. The collection of results from poking and prodding 100 attributes is what they take to be your browser and OS fingerprint, overriding what you said it was. Diabolical, huh? Do you think this is a compelling reason to get into CSS? I think I have found some 3rd-party JS code that we might be interested in, if we wanted to do something with this. It might save work. There's one object that is a CSS parser. It would turn a .css file into JSON, where it is easier to traverse afterwards. There is also a JS implementation of querySelectorAll, which works like getElementsByTagName, only the discernment of the result elements is based on selector syntax, rather than tag or name. The colon, the period, the hash mark have particular hardcoded meanings for different types of selections. thanks Kevin On Sat, 19 Aug 2017, Karl Dahlke wrote: > With Kevin pointing the way, I started looking at the first of 100 acid tests. > It runs into a problem in that it expects a pure whitespace node that is not there. > Note the following html. 
> > <body> > <p>paragraph 1</p> > <p>paragraph 2</p> > </body> > > Browse with db5 and tidy gives us the two paragraph nodes in sequence, there is no node in between with the newline (whitespace) character. > The javascript expects it to be there. > Why is it not there? > > Note html-tidy.c line 126. > I tell tidy not to drop empty elements, or empty paragraphs. > Geoff, or anyone else, any insights? > > Karl Dahlke > -------- Kevin Carhart * 415 225 5306 * The Ten Ninety Nihilists ^ permalink raw reply [flat|nested] 22+ messages in thread
* [Edbrowse-dev] acid[0] 2017-08-19 22:53 ` Kevin Carhart @ 2017-08-19 23:08 ` Karl Dahlke 2017-08-19 23:33 ` Kevin Carhart 0 siblings, 1 reply; 22+ messages in thread From: Karl Dahlke @ 2017-08-19 23:08 UTC (permalink / raw) To: Edbrowse-dev [-- Attachment #1: Type: text/plain, Size: 324 bytes --] Well duktape has some json support out of the box. It has a JSON global object, with JSON.parse() in it and I don't know what else, so using js to convert css to json might be a practical pathway. Course we'd have to follow up with a function to apply bgcolor=white to foo.style wherever that makes sense. Karl Dahlke ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Edbrowse-dev] acid[0] 2017-08-19 23:08 ` Karl Dahlke @ 2017-08-19 23:33 ` Kevin Carhart 2017-08-20 0:00 ` Karl Dahlke 0 siblings, 1 reply; 22+ messages in thread From: Kevin Carhart @ 2017-08-19 23:33 UTC (permalink / raw) To: Karl Dahlke; +Cc: Edbrowse-dev On Sat, 19 Aug 2017, Karl Dahlke wrote: > Well duktape has some json support out of the box. > It has a JSON global object, with JSON.parse() in it and I don't know what else, so using js to convert css to json might be a practical pathway. Correction: I was mistaken in saying it was JSON. The parser doesn't return JSON, the parser returns a tree of nested objects. The documentation simply turned it into JSON in order to serialize the contents of the object for readability. So this is even more familiar, just traversal with recursion maybe. > Course we'd have to follow up with a function to apply bgcolor=white to > foo.style wherever that makes sense. I think that's right. We might get the first two thirds of a three step process done "free" by the libraries and have to write the function that you describe. Since querySelectorAll returns elements (foo) and the parser css.js breaks down selectors, attribute names (bgcolor) and attribute values (white) into neat compartments, I think it would be (don't want to speak too soon) somewhat straightforward to dole out bgcolor=white to foo.style. Here is the code for the parser and then for querySelector: https://raw.githubusercontent.com/jotform/css.js/master/css.js https://raw.githubusercontent.com/yiminghe/query-selector/master/build/query-selector-debug.js And here are the git projects: https://github.com/jotform/css.js.git https://github.com/yiminghe/query-selector.git K ^ permalink raw reply [flat|nested] 22+ messages in thread
* [Edbrowse-dev] acid[0] 2017-08-19 23:33 ` Kevin Carhart @ 2017-08-20 0:00 ` Karl Dahlke 2017-08-20 0:37 ` Kevin Carhart 0 siblings, 1 reply; 22+ messages in thread From: Karl Dahlke @ 2017-08-20 0:00 UTC (permalink / raw) To: Edbrowse-dev [-- Attachment #1: Type: text/plain, Size: 535 bytes --] Not sure what querySelectorAll is all about; can't we just call document.getElementsByTagName()? So if an object says p.snork has bgcolor=white then we get the array a = document.getElementsByTagName("p"); Loop over array and if obj.class == "snork" then obj.style.bgcolor = white. Or if the descriptor is on #instructions rather than a class of nodes, we use getElementById to find the node and then set its values. So I think we already have the middle third, and the last third seems reasonably easy to write. Karl Dahlke ^ permalink raw reply [flat|nested] 22+ messages in thread
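The loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names applyClassRule and applyIdRule are made up, the class is read from className (the usual DOM property, where the message writes obj.class), and the document is a small stand-in so the sketch runs outside edbrowse.

```javascript
// Apply one style property to every element with a given tag and class.
// `doc` only needs getElementsByTagName(); elements need className and style.
function applyClassRule(doc, tag, cls, prop, value) {
  var a = doc.getElementsByTagName(tag);
  var count = 0;
  for (var i = 0; i < a.length; i++) {
    // className may hold several space-separated classes
    var classes = (a[i].className || "").split(/\s+/);
    if (classes.indexOf(cls) >= 0) {
      a[i].style[prop] = value;
      count++;
    }
  }
  return count;
}

// Apply one style property to the element with a given id, if there is one.
function applyIdRule(doc, id, prop, value) {
  var e = doc.getElementById(id);
  if (!e) return 0;
  e.style[prop] = value;
  return 1;
}

// Tiny stand-in document (hypothetical data).
var fakeElements = [
  { tagName: "p", id: "instructions", className: "snork", style: {} },
  { tagName: "p", id: "", className: "other", style: {} },
  { tagName: "div", id: "", className: "snork", style: {} }
];
var fakeDoc = {
  getElementsByTagName: function (t) {
    return fakeElements.filter(function (e) { return e.tagName === t; });
  },
  getElementById: function (id) {
    for (var i = 0; i < fakeElements.length; i++) {
      if (fakeElements[i].id === id) return fakeElements[i];
    }
    return null;
  }
};

var classHits = applyClassRule(fakeDoc, "p", "snork", "bgcolor", "white");
var idHits = applyIdRule(fakeDoc, "instructions", "whiteSpace", "pre-wrap");
console.log(classHits, idHits);
```

This covers the p.snork and #instructions cases only; anything like :last-child or attribute selectors would need more machinery.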
* Re: [Edbrowse-dev] acid[0] 2017-08-20 0:00 ` Karl Dahlke @ 2017-08-20 0:37 ` Kevin Carhart 2017-08-20 14:33 ` Karl Dahlke 0 siblings, 1 reply; 22+ messages in thread From: Kevin Carhart @ 2017-08-20 0:37 UTC (permalink / raw) To: Karl Dahlke; +Cc: Edbrowse-dev On Sat, 19 Aug 2017, Karl Dahlke wrote: > Not sure what querySelectorAll is all about; can't we just call document.getElementsByTagName()? It's a thing of its own. A lot of sites' JS uses this. For instance, in the nasa.gov code file vendor.js, e.querySelectorAll("[msallowcapture^='']") e.querySelectorAll("[selected]") e.querySelectorAll(":checked") a=r.querySelector("#morph-"+n) e.querySelectorAll("[id~="+q+"-]") e.querySelectorAll("a#"+q+"+*") The brackets, the hash and the colon have hardcoded meanings. And the syntax used here, I believe is the same selector syntax you find in CSS blocks. So at the least, there's also the period and the at symbol: .hidden { visibility: hidden; } @font-face { font-family: "AcidAhemTest"; src: url(font.ttf); } > So if an object says p.snork has bgcolor=white > then we get the array > a = document.getElementsByTagName("p"); > Loop over array and if obj.class == "snork" then obj.style.bgcolor = white. > Or if the descriptor is on #instructions rather than a class of nodes, we use getElementById to find the node and then set its values. > So I think we already have the middle third, and the last third seems reasonably easy to write. I don't rule out that this can be done. It depends if you want to dig in to the selectors language-within-a-language or use a component to hopefully avoid having to. If it's fun, that's good. If it's completely undesirable to learn a new mini syntax, maybe the outside component can do it for us. I think even if you wanted to do a certain thing within the implementation that called getElements, there would need to be a wrapper called querySel etc which is going to receive an argument beginning with a symbol. 
We can do any kind of node math we want under the hood in order to select the results. It's definitely possible that you will know how to do it, so that it would turn out to be less work than bringing in the outside code. I don't know which of those is less work. Kevin ^ permalink raw reply [flat|nested] 22+ messages in thread
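To make the "wrapper that receives an argument beginning with a symbol" idea concrete, here is a minimal sketch. It is hypothetical: simpleSelect handles only "#id", ".class" and bare tag names, and deliberately refuses everything else (":checked", "[selected]", "a#x+*"), which is the part a full selector engine would have to cover. The demo document is a stand-in.

```javascript
// Convert an array-like collection to a real array.
function toArray(list) {
  return Array.prototype.slice.call(list);
}

// Hypothetical dispatcher for the three simplest selector forms.
function simpleSelect(doc, selector) {
  var s = selector.trim();
  if (/^#[\w-]+$/.test(s)) {
    // "#id" maps onto getElementById
    var hit = doc.getElementById(s.slice(1));
    return hit ? [hit] : [];
  }
  if (/^\.[\w-]+$/.test(s)) {
    // ".class" scans every element and checks its class list
    var cls = s.slice(1);
    return toArray(doc.getElementsByTagName("*")).filter(function (e) {
      return (e.className || "").split(/\s+/).indexOf(cls) >= 0;
    });
  }
  if (/^[A-Za-z][\w-]*$/.test(s)) {
    // bare tag name maps onto getElementsByTagName
    return toArray(doc.getElementsByTagName(s));
  }
  throw new Error("unsupported selector: " + selector);
}

// Stand-in document (hypothetical data).
var demoNodes = [
  { tagName: "p", id: "jkl", className: "hidden" },
  { tagName: "table", id: "", className: "filbert" },
  { tagName: "p", id: "", className: "" }
];
var demoDoc = {
  getElementById: function (id) {
    return demoNodes.filter(function (e) { return e.id === id; })[0] || null;
  },
  getElementsByTagName: function (t) {
    return demoNodes.filter(function (e) { return t === "*" || e.tagName === t; });
  }
};

console.log(simpleSelect(demoDoc, "#jkl").length);
console.log(simpleSelect(demoDoc, ".hidden").length);
console.log(simpleSelect(demoDoc, "p").length);
```

The unsupported branch is where the trade-off in the message shows up: either that branch grows into a selector parser, or an outside component takes over.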
* [Edbrowse-dev] acid[0] 2017-08-20 0:37 ` Kevin Carhart @ 2017-08-20 14:33 ` Karl Dahlke 2017-08-20 20:00 ` Kevin Carhart 0 siblings, 1 reply; 22+ messages in thread From: Karl Dahlke @ 2017-08-20 14:33 UTC (permalink / raw) To: Edbrowse-dev [-- Attachment #1: Type: text/plain, Size: 2611 bytes --]

Ok Kevin, all that free software is just too irresistible!
I put both functions in startwindow.js and pushed, so everybody have a look.
The css parser works like a dream!
You can even do it stand alone.

duk -i startwindow.js
parser = new cssjs;
list = parser.parseCSS(css_string);

list is an array of the css descriptors in the string.
Each is an object with members: selector, rules, comment.
The selector is something like "p.snork".
Rules is another array of all the keyword value pairs like bgcolor = white
It's so simple and clean.
I took the ridiculous <style> tag out of acid3 and pushed it through the parser, and it worked perfectly.
45 descriptors corresponding to the contents of that <style> tag.

querySelectorAll is not as simple.
As Kevin pointed out, other websites use the same construct, maybe even the same code, and I don't want their function to collide with our function, especially if they work somewhat differently, so I call ours eb$qs.

eb$qs("div")

But there's another problem.
The code creates the function querySelectorAll, or in our case eb$qs, and in doing so it creates a temporary <div> tag.
That doesn't work unless we have a framework in place.
So I put a wrapper around it: eb$qs$start().
That sets everything up to then run eb$qs as often as you like.
Eventually edbrowse will call eb$qs$start() after the html document is browsed and before the first javascript runs.

eb$qs$start()
parser.parseCSS() on every style tag in the document and every file <link type=css href=>
Then map those values onto the objects by applying eb$qs to each selector in each css descriptor.

But there are more problems.
Try it with jsrt.
Set db3 so you can see what is going on.
browse, jdb, eb$qs$start() and now you're ready to go. list = eb$qs("script"); Holy crap it works, list is an array of 9 objects for the 9 scripts in jsrt. Step through each one and look at list[i].data. That is the contents of each script. This also works for "p" "a", and other such things. It doesn't work for "table.filbert", even though we have a <table class=filbert> tag. It calls a method getAttributeNode which we don't have. Oops. That's probably our omission, and something we should address anyways. Then try eb$qs("#jkl"); That doesn't work either. We are missing the compareDocumentPosition() method. So we can't move forward on this until we fill in some missing pieces in our DOM. Any volunteers to implement getAttributeNode() or compareDocumentPosition()? The former is easier, and more important, than the latter. Karl Dahlke ^ permalink raw reply [flat|nested] 22+ messages in thread
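The "map those values onto the objects" step might look something like the sketch below. Assumptions, stated loudly: the descriptor shape { selector, rules: [{ directive, value }] } is my recollection of the css.js output and the field names directive/value should be checked against the library; applyStylesheet and standInSelect are hypothetical names; the real selector lookup would be eb$qs rather than the table used here.

```javascript
// Walk the parsed descriptors and copy each rule onto every matching element.
// `select` is any function that maps a selector string to a list of elements.
function applyStylesheet(descriptors, select) {
  var applied = 0;
  descriptors.forEach(function (d) {
    if (!d.rules) return; // comment-only entries carry nothing to apply
    select(d.selector).forEach(function (el) {
      if (!el.style) el.style = {};
      d.rules.forEach(function (r) {
        el.style[r.directive] = r.value; // assumed field names, see lead-in
        applied++;
      });
    });
  });
  return applied;
}

// Hand-written descriptors standing in for parser output.
var parsed = [
  { selector: "p.snork", rules: [{ directive: "bgcolor", value: "white" }] },
  { selector: "#instructions:last-child",
    rules: [{ directive: "white-space", value: "pre-wrap" }] },
  { selector: "", comment: "/* nothing to apply */" }
];

// Stand-in for the selector engine: a plain lookup table.
var snork = { style: {} };
var instr = {};
var lookup = { "p.snork": [snork], "#instructions:last-child": [instr] };
function standInSelect(sel) {
  return lookup[sel] || [];
}

var appliedCount = applyStylesheet(parsed, standInSelect);
console.log(appliedCount, snork.style, instr.style);
```

Note that the property is stored under its dashed name here ("white-space"); whether it should also appear as whiteSpace is a separate question.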
* Re: [Edbrowse-dev] acid[0] 2017-08-20 14:33 ` Karl Dahlke @ 2017-08-20 20:00 ` Kevin Carhart 2017-08-20 20:08 ` [Edbrowse-dev] getAttributeNode / setAttributeNode Kevin Carhart 0 siblings, 1 reply; 22+ messages in thread From: Kevin Carhart @ 2017-08-20 20:00 UTC (permalink / raw) To: Karl Dahlke; +Cc: Edbrowse-dev Wow! Thank you for doing all of this! > It's so simple and clean. Yes, the parser in particular seems to be very high quality. The developers have notes that call it "battle tested". > querySelectorAll is not as simple. As Kevin pointed out, other websites > use the same construct, maybe even the same code, and I don't want their > function to collide with our function, especially if they work somewhat > differently, so I call ours eb$qs. I'd like to clarify something here. When nasa.gov calls querySelectorAll, it is on the same order as appendChild as far as that web developer is concerned. They expect it to be provided by the browser, which for all they know is a compiled, closed-source browser. Isn't collision not exactly right for the situation? Other websites use the same construct but only to call and expect it to be provided. There's nothing wrong with calling ours eb$qs, but are we then going to create a wrapper so that page code can lock on to it by name? - There's one more thing to mention that might be relevant. It's wonderful that you dove in! We might need to calibrate the querySelector code for browsers rather than node, which is the system for server-side javascript (I think it's like an interpreter - I may be describing it wrong. I have used it, but not that much.) If there are references to "exports", I think these need to be removed. I have definitely gotten qS working with edbrowse in the past! But I have not gone through the motions recently. Maybe you are way ahead of me if you got it working. > It calls a method getAttributeNode which we don't have. Oops. 
It's entirely possible that I implemented this in the same experimental build where I got qS working and have never turned it in. I will check. Kevin ^ permalink raw reply [flat|nested] 22+ messages in thread
* [Edbrowse-dev] getAttributeNode / setAttributeNode 2017-08-20 20:00 ` Kevin Carhart @ 2017-08-20 20:08 ` Kevin Carhart 2017-08-20 20:24 ` Karl Dahlke 0 siblings, 1 reply; 22+ messages in thread From: Kevin Carhart @ 2017-08-20 20:08 UTC (permalink / raw) To: Karl Dahlke; +Cc: Edbrowse-dev

Didn't we have this at one time? Maybe not, I don't remember.
It is basically a scalar-to-object converter. Given a string, it wants the balloon blown up. It wants an attribute node whose name is the passed in string.

document.getAttributeNode = function (name) {
    // build an attribute object that carries the current value;
    // rv is declared with var so it does not leak into the global scope
    var rv = document.createElement("Attr");
    rv.setAttribute(name, this[name.toLowerCase()]);
    return rv;
}

document.setAttributeNode = function (name, v) {
    // store the value both in the attributes table and on the object itself
    this.attributes[name.toLowerCase()] = v;
    this[name.toLowerCase()] = v;
}

^ permalink raw reply [flat|nested] 22+ messages in thread
* [Edbrowse-dev] getAttributeNode / setAttributeNode 2017-08-20 20:08 ` [Edbrowse-dev] getAttributeNode / setAttributeNode Kevin Carhart @ 2017-08-20 20:24 ` Karl Dahlke 2017-08-20 20:56 ` Kevin Carhart 0 siblings, 1 reply; 22+ messages in thread From: Karl Dahlke @ 2017-08-20 20:24 UTC (permalink / raw) To: Edbrowse-dev [-- Attachment #1: Type: text/plain, Size: 826 bytes --] Well as you see, I implemented getAttributeNode(), because it wasn't hard, but a little harder than your example suggests because of side effects. Setting value has to propagate down to setattribute in the original element, which I do with a setter. With this in place, much of eb$qs is working. Sounds like I misunderstood though, and it should really be called querySelectorAll, but that's just a one line change if we want to do that. Let me know if that's what we should do. I notice inside the code it checks navigator.userAgent, so it tailors itself to the kind of browser we are. God knows what it does with edbrowse. :) Anyways, to make this all work standalone, without edbrowse, duk -i startwindow.js, I had to put in something for navigator.userAgent, or it was blowing up. Line 195. Karl Dahlke ^ permalink raw reply [flat|nested] 22+ messages in thread
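The side effect mentioned above (setting value on the attribute node has to reach the original element) can be illustrated with a setter. This is not the code in startwindow.js, just a minimal sketch: makeAttributeNode and the stand-in element are hypothetical, and the element is assumed to offer getAttribute/setAttribute.

```javascript
// Build an attribute node whose `value` reads and writes through to the element.
function makeAttributeNode(element, name) {
  var key = name.toLowerCase();
  var node = { nodeType: 2, name: key, ownerElement: element };
  Object.defineProperty(node, "value", {
    get: function () { return element.getAttribute(key); },
    set: function (v) { element.setAttribute(key, v); }, // propagate to the element
    enumerable: true
  });
  return node;
}

// Stand-in element with a simple attribute table (hypothetical).
function makeElement(tag, attrs) {
  return {
    nodeName: tag,
    attrs: attrs,
    getAttribute: function (k) {
      return this.attrs.hasOwnProperty(k) ? this.attrs[k] : null;
    },
    setAttribute: function (k, v) { this.attrs[k] = String(v); }
  };
}

var table = makeElement("TABLE", { "class": "filbert" });
var attr = makeAttributeNode(table, "class");
var before = attr.value;
attr.value = "hazel";
console.log(before, table.getAttribute("class"));
```

With a plain copied value, as in the two-line version earlier in the thread, the assignment to attr.value would not be visible on the table.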
* Re: [Edbrowse-dev] getAttributeNode / setAttributeNode 2017-08-20 20:24 ` Karl Dahlke @ 2017-08-20 20:56 ` Kevin Carhart 2017-08-20 21:59 ` Kevin Carhart 0 siblings, 1 reply; 22+ messages in thread From: Kevin Carhart @ 2017-08-20 20:56 UTC (permalink / raw) To: Karl Dahlke; +Cc: Edbrowse-dev On Sun, 20 Aug 2017, Karl Dahlke wrote: > Well as you see, I implemented getAttributeNode(), because it wasn't hard, but a little harder than your example suggests because of side effects. Ah! Thank you. > > Sounds like I misunderstood though, and it should really be called querySelectorAll, but that's just a one line change if we want to do that. > Let me know if that's what we should do. Well.. I believe so, in the same way that we have apch, but pages by some random web developer in the world expect to lock on to appendChild. It's the DOM. querySelector and querySelectorAll are part of the DOM as far as they are concerned. We just happen to be implementing them in open javascript. > I notice inside the code it checks navigator.userAgent, so it tailors itself to the kind of browser we are. Yes.. I remember having a problem with a couple of lines that I think test for an IE version. I remember that the qS code has some multi byte Asian letters in some comments. I'll track them down later. Maybe they will sit merrily and be ignored, but I'm worried that they would make startwindow garbled if someone was compiling from source and didn't have a charset that renders these alphabets. Maybe it's fine. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Edbrowse-dev] getAttributeNode / setAttributeNode 2017-08-20 20:56 ` Kevin Carhart @ 2017-08-20 21:59 ` Kevin Carhart [not found] ` <20170721105041.eklhad@comcast.net> 0 siblings, 1 reply; 22+ messages in thread From: Kevin Carhart @ 2017-08-20 21:59 UTC (permalink / raw) To: Karl Dahlke; +Cc: Edbrowse-dev I don't think any of what you are doing is incorrect though. There are multiple sub-projects going on at once almost! It is very, very, very wonderful to parse and propagate style blocks and .css files. It is a quantum leap, iframe support is a quantum leap, so I am in heaven. So if we ourselves call qS almost like a private member that would not be exposed to pages, this is great. We can call it anything we want. There are two potential use cases for qS. The same work that can do two kinds of jobs. The first job is, we call qS ourselves, as part of I think a three step process to (a) parse css sections (b) identify and return the set of elements that the styles need to be doled out to (c) dole out the styles to that set of elements I leapt ahead, without enough explanation of what I was on about. Because pages *also* call querySelector and querySelectorAll. It's separate in some ways - it is separate from us being a web browser and doing an integral, fundamental thing with styles information. It is more like the toolbelt of the web designer. The web designer calls getElementsByTagName("blah") in one function and then calls querySelectorAll(":blah") or querySelectorAll(".blah") in the next. One job is low level and internal, and the other job is high level and external, but they both use qS to process the selectors mini-language and then search the tree. (Those terms "high level" and "low level" are so overloaded both in technical settings and regular society or whatnot that they are really useless. But I hope you get what I'm describing. 
If low-level is taken to mean, less about aesthetics, fundamental architecture of a web browser per se, that's the first job. If high-level is taken to mean, scripters and designers who build websites, that's the second job.) Does that make sense? Sorry if by overlapping two use cases I made anything confusing. It's like water rushing downstream because I am so excited about both of the scenarios!! >> Sounds like I misunderstood though, and it should really be called >> querySelectorAll, but that's just a one line change if we want to do that. >> Let me know if that's what we should do. > > Well.. I believe so, in the same way that we have apch, but pages by some > random web developer in the world expect to lock on to appendChild. It's the > DOM. querySelector and querySelectorAll are part of the DOM as far as they > are concerned. We just happen to be implementing them in open javascript. ^ permalink raw reply [flat|nested] 22+ messages in thread
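The two jobs can share one engine. A minimal sketch, assuming the private engine takes (selector, context) and returns an array; that signature, the name installPublicSelectors, and the stand-in engine are all hypothetical. In the thread the private role is played by eb$qs.

```javascript
// Expose the page-facing names on a document, backed by a private engine.
function installPublicSelectors(doc, engine) {
  doc.querySelectorAll = function (selector) {
    return engine(selector, this);
  };
  doc.querySelector = function (selector) {
    var list = engine(selector, this);
    return list.length ? list[0] : null;
  };
}

// Stand-in engine: matches tag names only, over a flat node list.
function standInEngine(selector, context) {
  return context.nodes.filter(function (n) { return n.tagName === selector; });
}

var pageDoc = { nodes: [{ tagName: "a" }, { tagName: "p" }, { tagName: "a" }] };
installPublicSelectors(pageDoc, standInEngine);

// Internal use (job one) calls the engine directly;
// page scripts (job two) go through the DOM names.
console.log(standInEngine("a", pageDoc).length);
console.log(pageDoc.querySelectorAll("a").length);
console.log(pageDoc.querySelector("div"));
```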
[parent not found: <20170721105041.eklhad@comcast.net>]
* Re: [Edbrowse-dev] getAttributeNode / setAttributeNode [not found] ` <20170721105041.eklhad@comcast.net> @ 2017-08-21 19:11 ` Kevin Carhart 2017-08-21 20:01 ` Karl Dahlke 0 siblings, 1 reply; 22+ messages in thread From: Kevin Carhart @ 2017-08-21 19:11 UTC (permalink / raw) To: Edbrowse-dev

On Mon, 21 Aug 2017, Karl Dahlke wrote:
> So css attributes from <style> tags or from <link> css files now apply to the objects as they should. It's cool.
> See tests 164 and 165 in jsrt.

This is great! I tried it out last night a bit.

>
> Still acid 0 is a long way away.

Yes, I went to acid 0. I think we are very close. 'last' and 'penultimate' did not use to have the right contents, and now they do. The assertion at the end that uses computedStyle would even work if the property being retrieved happened to be one of the ones propagated by our CSS code. qS("#instructions:last-child") returns zero elements. It isn't picking up penultimate. But all of the earlier steps prior to the last line are working.

> One of the mysteries remaining is they set "white-space" = "pre-wrap" in the style block, but then the test checks for .whiteSpace.
> Now how when or why does white-space equate to whiteSpace? I don't get that.

Aha! I found something out about this. There is this DOM implementation by Thatcher et al, called env.js. I used it a couple of years ago with an edbrowse 3.3.1 before we started ours. I learned a lot from using it. They have CSS-related code, and they have the following internal routines:

var __toCamelCase__ = function(name) {
    if (name) {
        return name.replace(/\-(\w)/g, function(all, letter) {
            return letter.toUpperCase();
        });
    }
    return name;
};

var __toDashed__ = function(camelCaseName) {
    if (camelCaseName) {
        return camelCaseName.replace(/[A-Z]/g, function(all) {
            return '-' + all.toLowerCase();
        });
    }
    return camelCaseName;
};

So I conclude that formalized conversion of camel case to/from dashed CSS is a thing.
I think that may be the missing link or one of them. Kevin ^ permalink raw reply [flat|nested] 22+ messages in thread
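A small worked example of that conversion, in case it helps. The function names toCamelCase, toDashed and setStyleFromCss are mine, not env.js or edbrowse names, and the style object is a plain stand-in; the idea is only that a dashed CSS property gets stored under its camel-case name so that a check like .whiteSpace finds it.

```javascript
// "white-space" -> "whiteSpace"
function toCamelCase(name) {
  return name.replace(/-([a-z])/g, function (all, letter) {
    return letter.toUpperCase();
  });
}

// "whiteSpace" -> "white-space"
function toDashed(name) {
  return name.replace(/[A-Z]/g, function (all) {
    return "-" + all.toLowerCase();
  });
}

// Store a CSS declaration on a style object under its camel-case key.
function setStyleFromCss(style, property, value) {
  style[toCamelCase(property)] = value;
  return style;
}

var style = setStyleFromCss({}, "white-space", "pre-wrap");
console.log(toCamelCase("white-space"));      // whiteSpace
console.log(toCamelCase("background-color")); // backgroundColor
console.log(toDashed("whiteSpace"));          // white-space
console.log(style.whiteSpace);                // pre-wrap
```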
* [Edbrowse-dev] getAttributeNode / setAttributeNode 2017-08-21 19:11 ` Kevin Carhart @ 2017-08-21 20:01 ` Karl Dahlke 2017-08-24 9:54 ` Kevin Carhart ` (2 more replies) 0 siblings, 3 replies; 22+ messages in thread From: Karl Dahlke @ 2017-08-21 20:01 UTC (permalink / raw) To: Edbrowse-dev [-- Attachment #1: Type: text/plain, Size: 2024 bytes --] Ok I now convert foo-bar to fooBar, as you suggest, and as the acid test 0 suggests, but I think it's weird. You say you're not finding the right properties in penultimate, but oh boy it's very subtle. There are several problems at play. The selector we're looking for is #instructions:last-child, and I had to read some of the MIT code to see what that meant. It means the node with id=instructions, and it has to be the last nontrivial child of its parent, where nontrivial means nodeType = 1. So a silly empty whitespace node doesn't count. At the time the acid test runs, and at the time it calls getComputedStyle() to make its calculation, it has already removed the paragraph after instructions, and the instructional paragraph is indeed the last child of its parent. So getComputedStyle creates a style object for this node, and it should have whiteSpace set properly, but it's just a dynamically created style node, it's not the actual style attached to the node. That style we might be messing with, might change it to green etc. getComputedStyle simply tells you what the style would be, right now, if all the rules were applied. So I'm starting to unravel that but there's another problem. After this test runs, and succeeds or fails, another script runs and does a document.write which adds all sorts of nodes to body. So now the browse is done, and you get into jdb, and you try to reproduce this stuff, but you can't, because instructions isn't the last child of its parent any more. It was but it isn't any more, so the machinery looks like it's not working but it works just fine. 
So - I think we are just one step away from test 0 passing. The test expects a blank node between the two paragraphs, a node corresponding to the newline character, an empty node, a text node (nodeType 3), but tidy doesn't give us this node, so nothing lines up. I asked Geoff about this and am waiting for his reply. If tidy doesn't give us those nodes, then acid test 0 will never pass. Karl Dahlke ^ permalink raw reply [flat|nested] 22+ messages in thread
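The :last-child reading given above (last child of nodeType 1, whitespace ignored) and the timing problem can both be seen in a small model. This is a sketch with hand-built nodes; isLastElementChild is a hypothetical helper, not the routine in the third-party selector code, and the ids other than "instructions" are invented.

```javascript
// True if `node` is the last child of its parent that is an element (nodeType 1).
// Text nodes, including whitespace-only ones, are skipped.
function isLastElementChild(node) {
  var parent = node.parentNode;
  if (!parent) return false;
  var kids = parent.childNodes;
  for (var i = kids.length - 1; i >= 0; i--) {
    if (kids[i].nodeType === 1) return kids[i] === node;
  }
  return false;
}

// Hand-built tree: <p id=instructions>, a newline, then one more paragraph.
var parent = { childNodes: [] };
function append(node) {
  node.parentNode = parent;
  parent.childNodes.push(node);
  return node;
}
var instructions = append({ nodeType: 1, id: "instructions" });
append({ nodeType: 3, data: "\n" });
append({ nodeType: 1, id: "extra-paragraph" });

var beforeRemoval = isLastElementChild(instructions);

// The test removes the paragraph after instructions...
parent.childNodes.pop();
var duringTest = isLastElementChild(instructions);

// ...and a later script adds more nodes, as document.write would.
append({ nodeType: 1, id: "written-later" });
var afterWrite = isLastElementChild(instructions);

console.log(beforeRemoval, duringTest, afterWrite);
```

The middle value is the state the test sees; the last one is what you find when poking at the page afterwards, which is why the machinery can look broken when it is not.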
* Re: [Edbrowse-dev] getAttributeNode / setAttributeNode 2017-08-21 20:01 ` Karl Dahlke @ 2017-08-24 9:54 ` Kevin Carhart 2017-08-24 9:57 ` Kevin Carhart 2017-08-25 8:19 ` Kevin Carhart 2 siblings, 0 replies; 22+ messages in thread From: Kevin Carhart @ 2017-08-24 9:54 UTC (permalink / raw) To: Karl Dahlke; +Cc: Edbrowse-dev I was visiting my parents and afk, but I am back now and excited by the latest. On Mon, 21 Aug 2017, Karl Dahlke wrote: > Ok I now convert foo-bar to fooBar, as you suggest, and as the acid test 0 suggests, but I think it's wweird. > > You say you're not finding the right properties in penultimate, but oh boy it's very subtle. Yeah. > There are several problems at play. > The selector we're looking for is #instructions:last-child, and I had to read some of the MIT code to see what that meant. Interesting. What is the MIT code? Is it like a CSS spec? > It was but it isn't any more, so the machinery looks like it's not working but it works just fine. > > So - I think we are just one step away from test 0 passing. Exactly!! I am glad you went there because now we both have our bearings in the same stuff. I completely agree about things getting clobbered later, creating the suggestion at jdb-time that it isn't working. I labored under this misapprehension for a while and wasted time before figuring this out. So as a workaround, I said wget http://acid3.acidtests.org, save it locally as index.html or another name, and then add alerts in the "test 0" code so that you can get your feedback from when it actually runs and not from jdb, later. > but tidy doesn't give us this node, so nothing lines up. Ah, is that right! So this is where we came in. You mentioned this a couple days ago and that was when I brought up the CSS components. So now we are really getting down to the problem. Amazing how much they pack into test zero. Woo! I am literally doing a little dance every day about new edbrowse. 
In honor of the fact that we are working on these Stylistic issues, the official soundtrack of edbrowse-dev, for a while at least, will be "Betcha By Golly Wow" by The Stylistics. Kevin ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Edbrowse-dev] getAttributeNode / setAttributeNode 2017-08-21 20:01 ` Karl Dahlke 2017-08-24 9:54 ` Kevin Carhart @ 2017-08-24 9:57 ` Kevin Carhart 2017-08-25 8:19 ` Kevin Carhart 2 siblings, 0 replies; 22+ messages in thread From: Kevin Carhart @ 2017-08-24 9:57 UTC (permalink / raw) To: Karl Dahlke; +Cc: Edbrowse-dev > The selector we're looking for is #instructions:last-child, and I had to read some of the MIT code to see what that meant. Oops, you're talking about the MIT-licensed code from Jotform and yiminghe, right? I get it now. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Edbrowse-dev] getAttributeNode / setAttributeNode 2017-08-21 20:01 ` Karl Dahlke 2017-08-24 9:54 ` Kevin Carhart 2017-08-24 9:57 ` Kevin Carhart @ 2017-08-25 8:19 ` Kevin Carhart 2017-08-25 22:09 ` [Edbrowse-dev] whitespace nodes Kevin Carhart 2 siblings, 1 reply; 22+ messages in thread From: Kevin Carhart @ 2017-08-25 8:19 UTC (permalink / raw) To: Edbrowse-dev

Thank you for the writeup of the routines in qS (my abbreviation for the third party querySelector code) that are called for #instructions:last-child! This writeup is very helpful.

> The test expects a blank node between the two paragraphs, a node corresponding to the newline character, an empty node, a node of nodeType 0,
> but tidy doesn't give us this node, so nothing lines up.
> I asked Geoff about this and am waiting for his reply.

So does this mean that all pages should have tons of these nodes all over the place? I guess we will know soon enough when certain persons are available. :=)) Or we could start a thread about this under Issues. But I am trying to play along in case I can make some headway now. Maybe it is comparable to the options we already set in html-tidy.c:

tidyOptSetBool(tdoc, TidyEscapeScripts, no);
tidyOptSetBool(tdoc, TidyDropEmptyElems, no);
tidyOptSetBool(tdoc, TidyDropEmptyParas, no);

My candidates so far are TidyNewline and TidyEmptyTags. I don't know what they do yet - those are just the ones with plausible names. For anyone reading who doesn't already know, there is a long list of tidy config options under tidy-html5-master/src, FYI.

Kevin

^ permalink raw reply [flat|nested] 22+ messages in thread
* [Edbrowse-dev] whitespace nodes 2017-08-25 8:19 ` Kevin Carhart @ 2017-08-25 22:09 ` Kevin Carhart 2017-08-25 22:56 ` Karl Dahlke 0 siblings, 1 reply; 22+ messages in thread From: Kevin Carhart @ 2017-08-25 22:09 UTC (permalink / raw) To: Edbrowse-dev I haven't been able to get additional nodes created out of newlines just by adding a certain tidyOptSet. I tried one called TidyLiteralAttribs, but this is for passing through the contents of tags (I think), and the whitespace we want is in between tags like "negative space", so to speak. ^ permalink raw reply [flat|nested] 22+ messages in thread
* [Edbrowse-dev] whitespace nodes 2017-08-25 22:09 ` [Edbrowse-dev] whitespace nodes Kevin Carhart @ 2017-08-25 22:56 ` Karl Dahlke 2017-08-26 4:25 ` [Edbrowse-dev] (something other than) " Kevin Carhart 0 siblings, 1 reply; 22+ messages in thread From: Karl Dahlke @ 2017-08-25 22:56 UTC (permalink / raw) To: Edbrowse-dev > I haven't been able to get additional nodes created out of newlines Not sure how hard we should work on this, or even if we want it, just to pass an acid test. It probably has no bearing in the real world, and who wants all those empty nodes cluttering up the tree? For now I think we should just delete or comment out line 227 in the acid test file, it's just understood that this line is nulled out, then test 0 should pass and we move on. Let's get the value out of the acid tests without becoming obsessed over them. That's my gut feeling right now. Karl Dahlke ^ permalink raw reply [flat|nested] 22+ messages in thread
* [Edbrowse-dev] (something other than) whitespace nodes 2017-08-25 22:56 ` Karl Dahlke @ 2017-08-26 4:25 ` Kevin Carhart 2017-09-02 9:03 ` Adam Thompson 0 siblings, 1 reply; 22+ messages in thread From: Kevin Carhart @ 2017-08-26 4:25 UTC (permalink / raw) To: Edbrowse-dev Thanks for pointing this out. I guess I took something overly literal that is not a part of the generic principle they're getting at in the test. Clearly node-ifying every '\n' in every web page isn't common or important or we would have hit it previously.. I could have keyed in to this fact sooner. Oh well. I was only in tidy for a short time, and the exploration seems useful anyhow. On Fri, 25 Aug 2017, Karl Dahlke wrote: >> I haven't been able to get additional nodes created out of newlines > > Not sure how hard we should work on this, or even if we want it, just to pass an acid test. > It probably has no bearing in the real world, and who wants all those empty nodes cluttering up the tree? > For now I think we should just delete or comment out line 227 in the acid test file, > it's just understood that this line is nulled out, then test 0 should pass and we move on. > Let's get the value out of the acid tests without becoming obsessed over them. > That's my gut feeling right now. > > Karl Dahlke > -------- Kevin Carhart * 415 225 5306 * The Ten Ninety Nihilists ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [Edbrowse-dev] (something other than) whitespace nodes 2017-08-26 4:25 ` [Edbrowse-dev] (something other than) " Kevin Carhart @ 2017-09-02 9:03 ` Adam Thompson 2017-09-02 15:42 ` Karl Dahlke 0 siblings, 1 reply; 22+ messages in thread From: Adam Thompson @ 2017-09-02 9:03 UTC (permalink / raw) To: Kevin Carhart; +Cc: Edbrowse-dev First of all, thanks for all the work you've all done on this, and apologies for going silent... again... Hopefully this time I'll keep my computers working at least long enough to participate in discussions again. On Fri, Aug 25, 2017 at 09:25:42PM -0700, Kevin Carhart wrote: > > Thanks for pointing this out. I guess I took something overly literal that > is not a part of the generic principle they're getting at in the test. > Clearly node-ifying every '\n' in every web page isn't common or important > or we would have hit it previously.. I could have keyed in to this fact > sooner. Oh well. I was only in tidy for a short time, and the exploration > seems useful anyhow. Tbh it sounds like it wasn't wasted time, in that we now understand that this could be a thing in the future (although it sounds like a strange thing, which is probably why tidy doesn't do it). I'd also say that the more we know about the tidy code the better so, as you say, the exploration was probably worth it. Anyway, I agree with safely ignoring the lack of a newline because... who cares about blank text nodes (which is, I guess, what this would be). Maybe we will need this in the future, but I can't imagine why. Cheers, Adam. ^ permalink raw reply [flat|nested] 22+ messages in thread
* [Edbrowse-dev] (something other than) whitespace nodes 2017-09-02 9:03 ` Adam Thompson @ 2017-09-02 15:42 ` Karl Dahlke 0 siblings, 0 replies; 22+ messages in thread From: Karl Dahlke @ 2017-09-02 15:42 UTC (permalink / raw) To: Edbrowse-dev Geoff has confirmed that tidy does not mess with intervening whitespace, certainly doesn't turn it into empty nodes, and isn't likely to in the future. After all, the html spec says such space is meaningless, so he's following the spec. But then acid3 assumes every browser creates these empty whitespace nodes. It's a contradiction. I'm not gonna worry about it. Just know that to pass acid test 0 you have to delete or comment out line 227, and on we go. I don't think this will ever be a problem in the real world. In fact the last-child and first-child css constructs are defined to be the last or first "real" nodes under a parent, so if for some reason the browser cranks out empty whitespace nodes, those are ignored. They are designed to work no matter how your browser behaves. I read the jotform code and it screens for nodeType = 1, real nodes. In other words, I think we're fine and we can move on to something else. Karl Dahlke ^ permalink raw reply [flat|nested] 22+ messages in thread
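To illustrate the point about screening for element nodes, here is a minimal sketch in JavaScript. It does not use edbrowse, tidy, or the acid3 source; the helpers `element`, `text`, `elementChildren`, `firstElementChild`, and `lastElementChild` are hypothetical stand-ins written for this example, and the trees are plain objects that only model `nodeType` and `childNodes`. The values 1 (element) and 3 (text) are the standard DOM node type codes. The sketch shows that code which filters on `nodeType === 1` sees the same children whether or not whitespace text nodes are present.

```javascript
// Standard DOM node type codes.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical helpers: build plain objects that mimic DOM nodes.
function element(tagName, childNodes = []) {
  return { nodeType: ELEMENT_NODE, tagName, childNodes };
}
function text(data) {
  return { nodeType: TEXT_NODE, data, childNodes: [] };
}

// Keep only "real" (element) children, ignoring text nodes,
// including any that hold nothing but whitespace.
function elementChildren(parent) {
  return parent.childNodes.filter((n) => n.nodeType === ELEMENT_NODE);
}
function firstElementChild(parent) {
  const kids = elementChildren(parent);
  return kids.length > 0 ? kids[0] : null;
}
function lastElementChild(parent) {
  const kids = elementChildren(parent);
  return kids.length > 0 ? kids[kids.length - 1] : null;
}

const p1 = element("p", [text("paragraph 1")]);
const p2 = element("p", [text("paragraph 2")]);

// Tree with no whitespace nodes between the paragraphs.
const withoutWhitespace = element("body", [p1, p2]);

// Tree with a whitespace text node around each paragraph,
// roughly what the acid test is described as expecting.
const withWhitespace = element("body", [
  text("\n"), p1, text("\n"), p2, text("\n"),
]);

// Raw child counts differ, element-only views agree.
console.log(withoutWhitespace.childNodes.length, withWhitespace.childNodes.length);
console.log(firstElementChild(withoutWhitespace) === firstElementChild(withWhitespace));
console.log(lastElementChild(withoutWhitespace) === lastElementChild(withWhitespace));
```

Under these assumptions, only code that indexes `childNodes` directly would notice the missing whitespace nodes, which matches the view in the thread that real-world scripts are unlikely to care.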
end of thread, other threads:[~2017-09-02 15:41 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-08-19 15:38 [Edbrowse-dev] acid[0] Karl Dahlke 2017-08-19 22:53 ` Kevin Carhart 2017-08-19 23:08 ` Karl Dahlke 2017-08-19 23:33 ` Kevin Carhart 2017-08-20 0:00 ` Karl Dahlke 2017-08-20 0:37 ` Kevin Carhart 2017-08-20 14:33 ` Karl Dahlke 2017-08-20 20:00 ` Kevin Carhart 2017-08-20 20:08 ` [Edbrowse-dev] getAttributeNode / setAttributeNode Kevin Carhart 2017-08-20 20:24 ` Karl Dahlke 2017-08-20 20:56 ` Kevin Carhart 2017-08-20 21:59 ` Kevin Carhart [not found] ` <20170721105041.eklhad@comcast.net> 2017-08-21 19:11 ` Kevin Carhart 2017-08-21 20:01 ` Karl Dahlke 2017-08-24 9:54 ` Kevin Carhart 2017-08-24 9:57 ` Kevin Carhart 2017-08-25 8:19 ` Kevin Carhart 2017-08-25 22:09 ` [Edbrowse-dev] whitespace nodes Kevin Carhart 2017-08-25 22:56 ` Karl Dahlke 2017-08-26 4:25 ` [Edbrowse-dev] (something other than) " Kevin Carhart 2017-09-02 9:03 ` Adam Thompson 2017-09-02 15:42 ` Karl Dahlke