On Tue, Mar 11, 2014 at 04:59:10PM -0400, Karl Dahlke wrote:
> > > perhaps growing as n^2 for the number of html tags,
>
> > Out of interest, how do you arrive at this figure?
>
> Mostly a gut feeling.
> As background, consider my garbage collector for lines in a buffer.
> This is just edits, no browsing.
> Make a change, then make another change, so that there are now some
> lines that you won't need even if you undo.
> How do I determine and free those lines?
> By comparing two maps of pointers.

I did wonder how this worked.

> The pointers in A, not in B, are not needed and can be freed.
> The first version was simple and stupid, looking for things in A not
> in B, and ran in n^2 time.

Ah.

> This didn't matter for a few years, until I started editing some
> really big files.
> You also have started editing some large files.

Yeah.

> If a file has a million lines, that's a trillion operations per edit.
> I enter a substitute command and wait 2 minutes for a response.
> I don't think so!
> So a few years ago I rewrote it using quicksort and an internal form
> of comm, and it is plenty quick, even for large files.

Yeah, I'd never really noticed this going on till I turned up the
debugging level.

> Look for qsort() in buffers.c.

Sounds interesting; I've put a rough sketch of how I picture that
working in the first P.S. below.

> Now fast forward to today; I am scanning a tree of js objects
> and comparing the tags that are implied by those objects with the
> tags we already have, and doing, essentially, a diff, so I can
> report to the user what has changed, which a sighted person would
> simply see as a change to something on the screen.
> It is very likely that there are quick ways to run this comparison,
> but if there absolutely aren't,
> and if a web page rewrote itself in a nasty way that changed every
> other line, then maybe the comparison would be as awful as n^2.

Perhaps, but I hope not.

> But as I say I expect there will be better algorithms
> once we really dive into it,
> and I'm quite sure that web pages only change small portions of
> themselves, the text in various boxes, or menus, etc,
> and don't completely rewrite themselves,
> so that's why I don't think it will be a performance issue,
> even if we implement it in a rather direct and plodding way.

Yeah, I doubt it'll be worse than the setter hell that full dynamic
page support would cause, at any rate.

As I've previously said, though, I think this is more of a reason to
come up with a js-independent DOM which we can get working
efficiently, then put the js on top. This allows much more
flexibility, and hopefully provides opportunities to remove the
entire post-scan each time js makes a change.

I'm thinking of the tag tree we previously discussed, with a flag set
at the point where a change is made. Any post-scan can then start
from that node rather than the top of the tree (rough sketch in the
second P.S. below). The only issue with this is if someone sets the
innerHTML of something to broken html which affects tags outside the
element where the innerHTML was set. I'm not sure what to do in that
case.

Cheers,
Adam.
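
P.S. To make the qsort/comm idea concrete, here is a minimal,
untested sketch of how I imagine the set difference: sort both
pointer arrays, then do a single comm-style merge pass, freeing
anything that turns up in A but not in B. That's two O(n log n)
sorts plus a linear walk, instead of the n^2 "search all of B for
each element of A". The names freeDiff() and ptrcmp() are made up;
the real code is the qsort() stuff in buffers.c and no doubt looks
different.

#include <stdint.h>
#include <stdlib.h>

/* Order two pointers by address, for qsort(). */
static int ptrcmp(const void *a, const void *b)
{
	uintptr_t p = (uintptr_t)*(void *const *)a;
	uintptr_t q = (uintptr_t)*(void *const *)b;
	if (p < q) return -1;
	if (p > q) return 1;
	return 0;
}

/* Free every line that appears in a[] but not in b[]. */
static void freeDiff(void **a, size_t na, void **b, size_t nb)
{
	size_t i = 0, j = 0;
	qsort(a, na, sizeof(void *), ptrcmp);
	qsort(b, nb, sizeof(void *), ptrcmp);
	while (i < na) {
		uintptr_t p = (uintptr_t)a[i];
		if (j == nb || p < (uintptr_t)b[j]) {
			free(a[i]);	/* in A only: dead even after undo */
			++i;
		} else if (p == (uintptr_t)b[j]) {
			++i, ++j;	/* in both maps: still needed */
		} else {
			++j;		/* in B only: nothing to do */
		}
	}
}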
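
P.P.S. And an equally rough sketch of the flagged tag tree I was
describing. Every name here (tagNode, markChanged, rescanSubtree) is
invented for illustration; the point is only that whatever hook sees
js touch the tree records where the change happened, so the
post-scan diffs just the flagged subtrees instead of walking the
whole tree from the root.

#include <stddef.h>

struct tagNode {
	struct tagNode *parent;
	struct tagNode *firstChild;
	struct tagNode *sibling;
	int changed;		/* set when js modifies this subtree */
	/* ... the tag data itself ... */
};

/* Placeholder for the real work: diff this subtree against the
 * tags we already have and report the change to the user. */
static void rescanSubtree(struct tagNode *t)
{
	(void)t;
}

#define MAXDIRTY 64
static struct tagNode *dirtyList[MAXDIRTY];
static size_t nDirty;

/* Called from whatever hook notices a js-side change:
 * flag the node and remember it for the next post-scan. */
static void markChanged(struct tagNode *t)
{
	if (!t->changed && nDirty < MAXDIRTY) {
		t->changed = 1;
		dirtyList[nDirty++] = t;
	}
}

/* The post-scan: start from the flagged nodes,
 * not the top of the tree. */
static void postScan(void)
{
	size_t i;
	for (i = 0; i < nDirty; ++i) {
		rescanSubtree(dirtyList[i]);
		dirtyList[i]->changed = 0;
	}
	nDirty = 0;
}

For the broken-innerHTML case, the safe fallback is probably to flag
the parent, or in the worst case the root, which degenerates to the
full scan but at least stays correct.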