* [Edbrowse-dev] tidy5 and versions
@ 2015-08-24 11:54 Karl Dahlke
2015-08-25 0:24 ` Kevin Carhart
2015-08-26 18:06 ` Chris Brannon
0 siblings, 2 replies; 3+ messages in thread
From: Karl Dahlke @ 2015-08-24 11:54 UTC (permalink / raw)
To: Edbrowse-dev
We stand on the edge of pushing a change that will require tidy5.
It's cautious: it does nothing except run the html through tidy,
in parallel with everything else we are doing,
and then free the tidy tree when the window is freed.
Just to get us started, and to make sure tidy doesn't segfault, etc.
But it will change the way edbrowse is built.
We now need another library etc.
Should we, and I kinda think we should, stamp another version, 3.5.4.2,
before we jump into the tidy pool?
Some work has been done since 3.5.4.1: some bug fixes, some cosmetics,
and the framework for imap, including a simple move/delete interface.
Chris, before you push Kevin's tidy patch, maybe stamp 3.5.4.2.
After we are using tidy to parse html,
and I hope this isn't a long time coming, we may want to jump up to 3.6.
Karl Dahlke
* Re: [Edbrowse-dev] tidy5 and versions
From: Kevin Carhart @ 2015-08-25 0:24 UTC (permalink / raw)
To: Edbrowse-dev
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1418 bytes --]
And a note to Adam, we've hashed this patch out offlist, but if you have
any critiques on this, please fire away. It's just a few lines and
straightforward, but as a patch-submission newbie I can use the
multifaceted scrutiny, and would like to know what you think.
Thanks
Kevin
On Mon, 24 Aug 2015, Karl Dahlke wrote:
> We stand on the edge of pushing a change that will require tidy5.
> It's cautious: it does nothing except run the html through tidy,
> in parallel with everything else we are doing,
> and then free the tidy tree when the window is freed.
> Just to get us started, and to make sure tidy doesn't segfault, etc.
> But it will change the way edbrowse is built.
> We now need another library etc.
> Should we, and I kinda think we should, stamp another version, 3.5.4.2,
> before we jump into the tidy pool?
> Some work has been done since 3.5.4.1: some bug fixes, some cosmetics,
> and the framework for imap, including a simple move/delete interface.
> Chris, before you push Kevin's tidy patch, maybe stamp 3.5.4.2.
>
> After we are using tidy to parse html,
> and I hope this isn't a long time coming, we may want to jump up to 3.6.
>
> Karl Dahlke
--------
Kevin Carhart * 415 225 5306 * The Ten Ninety Nihilists
[-- Attachment #2: Type: TEXT/PLAIN, Size: 3638 bytes --]
diff -Naur 1/edbrowse-master/README 2/edbrowse-master/README
--- 1/edbrowse-master/README 2015-08-23 01:46:57.000000000 -0700
+++ 2/edbrowse-master/README 2015-08-23 21:46:42.783741131 -0700
@@ -73,6 +73,19 @@
If you have to compile curl from source, be sure to specify
--ENABLE-VERSION-SYMBOLS (or some such) at the configure script.
+Edbrowse now uses the Tidy HTML parser. So there are a couple
+of things to install for this prerequisite.
+The Tidy compilation process uses cmake. Please either use your
+package manager to get cmake (for instance, apt-get install cmake),
+or follow the instructions at http://www.cmake.org/download/
+
+Once you have cmake, download the latest Tidy code from:
+https://github.com/htacg/tidy-html5/archive/master.zip
+Unzip and cd to build/cmake
+cmake ../..
+make install
+Now the latest Tidy library will be available to edbrowse.
+
Finally, you need the Spider Monkey javascript engine from Mozilla.org
ftp://ftp.mozilla.org/pub/mozilla.org/js/
Edbrowse 3.5.1 and higher requires Mozilla js version 2.4.
diff -Naur 1/edbrowse-master/src/buffers.c 2/edbrowse-master/src/buffers.c
--- 1/edbrowse-master/src/buffers.c 2015-08-23 01:46:57.000000000 -0700
+++ 2/edbrowse-master/src/buffers.c 2015-08-24 16:02:18.351550150 -0700
@@ -583,6 +583,7 @@
nzFree(w->firstURL);
nzFree(w->referrer);
nzFree(w->baseDirName);
+ tidyRelease(w->tdoc);
free(w);
} /* freeWindow */
diff -Naur 1/edbrowse-master/src/eb.h 2/edbrowse-master/src/eb.h
--- 1/edbrowse-master/src/eb.h 2015-08-23 01:46:57.000000000 -0700
+++ 2/edbrowse-master/src/eb.h 2015-08-23 21:34:37.165011656 -0700
@@ -26,6 +26,7 @@
#include <stdio.h>
#include <errno.h>
#include <fcntl.h>
+#include <tidy.h>
#include <curl/curl.h>
#ifdef DOSLIKE
#include <io.h>
@@ -362,6 +363,7 @@
jsobjtype jcx;
jsobjtype winobj;
jsobjtype docobj; /* window.document */
+ TidyDoc tdoc; /* tidy5 html parser */
struct DBTABLE *table; /* if in sqlMode */
};
extern struct ebWindow *cw; /* current window */
diff -Naur 1/edbrowse-master/src/html.c 2/edbrowse-master/src/html.c
--- 1/edbrowse-master/src/html.c 2015-08-23 01:46:57.000000000 -0700
+++ 2/edbrowse-master/src/html.c 2015-08-24 16:17:20.031306748 -0700
@@ -1668,6 +1668,21 @@
int nopt; /* number of options */
int intable = 0, inrow = 0;
bool tdfirst;
+ int TidyReturnValue; /* for Tidy methods that return
+success/failure */
+
+ // Tidy-related actions on incoming html
+
+ // At the moment, the goal is to get the parser into edbrowse
+ // and be able to call things without detrimental effect to
+ // any existing functionality
+
+ cw->tdoc = tidyCreate();
+printf("In case you wanted to know if this is the version with Tidy, it is\n");
+ // run tidyParseString here, or do something else
+ //TidyReturnValue = tidyParseString(cw->tdoc, html);
+
+ // The use of Tidy ends here ---
ns = initString(&ns_l);
preamble = initString(&preamble_l);
diff -Naur 1/edbrowse-master/src/makefile 2/edbrowse-master/src/makefile
--- 1/edbrowse-master/src/makefile 2015-08-23 21:32:17.459104575 -0700
+++ 2/edbrowse-master/src/makefile 2015-08-24 16:45:40.857553878 -0700
@@ -32,7 +32,7 @@
# Override JSLIB on the command-line, if your distro uses a different name.
# E.G., make JSLIB=-lmozjs
JSLIB = -lmozjs-24
-LDLIBS = -lpcre -lcurl -lreadline -lncurses
+LDLIBS = -lpcre -lcurl -lreadline -lncurses -ltidy
# Make the dynamically linked executable program by default.
all: edbrowse edbrowse-js
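
The patch above creates the tidy document when html is parsed and releases it in freeWindow. A rough sketch of that ownership pattern follows, with one extra guard: if the parse routine can run more than once on the same window (an assumption, not something the patch states), the old handle is released before a new one is stored. To build without libtidy, parser_create() and parser_release() are hypothetical stand-ins for tidyCreate() and tidyRelease(), and struct window is a cut-down stand-in for struct ebWindow. The sketch also assumes the window struct starts out zeroed.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for TidyDoc so the sketch builds without libtidy.
 * In edbrowse these would be tidyCreate() and tidyRelease(). */
typedef struct parser_doc { int unused; } *parser_doc_t;

static int live_docs = 0;	/* handles created but not yet released */

static parser_doc_t parser_create(void)
{
	parser_doc_t d = calloc(1, sizeof(*d));
	if (d)
		++live_docs;
	return d;
}

static void parser_release(parser_doc_t d)
{
	if (!d)
		return;		/* releasing NULL is a harmless no-op */
	free(d);
	--live_docs;
}

/* Hypothetical, cut-down version of struct ebWindow. */
struct window {
	parser_doc_t tdoc;	/* parser handle owned by this window */
};

static struct window *window_new(void)
{
	/* calloc leaves tdoc NULL until the first parse */
	return calloc(1, sizeof(struct window));
}

/* Called each time html is parsed for this window. */
static void window_parse(struct window *w)
{
	parser_release(w->tdoc);	/* drop any tree from an earlier parse */
	w->tdoc = parser_create();
}

/* Mirrors the tidyRelease(w->tdoc) line added to freeWindow. */
static void window_free(struct window *w)
{
	if (!w)
		return;
	parser_release(w->tdoc);
	free(w);
}
```

Calling window_parse() twice on one window and then window_free() should leave live_docs at zero. Without the release at the top of window_parse(), the first handle would be overwritten and never freed.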
* Re: [Edbrowse-dev] tidy5 and versions
From: Chris Brannon @ 2015-08-26 18:06 UTC (permalink / raw)
To: Edbrowse-dev
Karl Dahlke <eklhad@comcast.net> writes:
> Should we, and I kinda think we should, stamp another version, 3.5.4.2,
> before we jump into the tidy pool?
Yep, I haven't seen any objections, so I'll go ahead and push 3.5.4.2,
along with new static binaries.
-- Chris