edbrowse-dev - development list for edbrowse
* [Edbrowse-dev] curl things
@ 2014-12-28 15:35 Karl Dahlke
  2014-12-28 16:04 ` Adam Thompson
  2015-01-03 13:19 ` Chris Brannon
  0 siblings, 2 replies; 10+ messages in thread
From: Karl Dahlke @ 2014-12-28 15:35 UTC (permalink / raw)
  To: Edbrowse-dev

This is mostly aimed at Chris, who is our resident curl expert,
and maybe should be a pm but I like keeping everyone in the loop.
Do you have time for projects, and if not can I consult with you?

Imap is an obvious curl task.
Copy fetchmail.c to imap.c and keep most of what is there,
in that imap should allow me to do, with my emails,
all the things I already do,
but add commands to invoke specific imap calls
to list folders and list the emails in folders and create folders
and so on.
Part of me thinks it wouldn't be hard to do, but part of me thinks
it's a bit more, since imap has more power.
We might need 2 letter commands,
not just the simple 1 letter commands in fetchmail.c.
Not sure about that yet.

For downloading a file, I want to see that the header is
something other than text/html, allow the user to download,
then switch the curl callback function to write to disk, but can you switch
the callback function in mid stream?
I've already called curl_easy_perform, and it's running,
so how can I switch the callback function in mid stream?

As per this feature, I pushed a small change so
downdir = /home/eklhad/dld
in your config file becomes same in edbrowse, the download directory.
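
As an illustration only, here is one way a save path could be derived from the configured download directory. The message does not say how edbrowse names the file, so taking the last path component of the URL is an assumption, and build_download_path is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: join the download directory and the last path
 * component of a URL into buf. Returns 0 on success, -1 if the URL has
 * no usable file name or the result does not fit in buf. */
int build_download_path(char *buf, size_t buflen,
                        const char *downdir, const char *url)
{
    assert(buf && downdir && url);
    const char *rest = strstr(url, "://");
    rest = rest ? rest + 3 : url;            /* skip the scheme if present */
    size_t stop = strcspn(rest, "?#");       /* ignore query and fragment */
    size_t start = stop;
    while (start > 0 && rest[start - 1] != '/')
        start--;
    if (start == 0 || start == stop)
        return -1;                           /* host only, or path ends in '/' */
    int n = snprintf(buf, buflen, "%s/%.*s", downdir,
                     (int)(stop - start), rest + start);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With downdir set to /home/eklhad/dld, a URL ending in /pub/tails.iso?x=1 would map to /home/eklhad/dld/tails.iso under this assumption.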

Adam, the string push is fine.
Course the same functions are also in jseng-moz.cpp, but they haven't caused any trouble there.


Karl Dahlke


* Re: [Edbrowse-dev] curl things
  2014-12-28 15:35 [Edbrowse-dev] curl things Karl Dahlke
@ 2014-12-28 16:04 ` Adam Thompson
  2015-01-03 13:19 ` Chris Brannon
  1 sibling, 0 replies; 10+ messages in thread
From: Adam Thompson @ 2014-12-28 16:04 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev


On Sun, Dec 28, 2014 at 10:35:28AM -0500, Karl Dahlke wrote:
> For downloading a file, I want to see that the header is
> something other than text/html, allow the user to download,
> then switch the curl callback function to write to disk, but can you switch
> the callback function in mid stream?
> I've already called curl_easy_perform, and it's running,
> so how can I switch the callback function in mid stream?

I don't know if curl can do this, but if not,
make the callback given to curl support both operations,
perhaps by having that call another callback based on a file-level variable?

> As per this feature, I pushed a small change so
> downdir = /home/eklhad/dld
> in your config file becomes same in edbrowse, the download directory.

Cool, thanks.

> Adam, the string push is fine.
> Course the same functions are also in jseng-moz.cpp, but they haven't caused any trouble there.

Yeah, I'm wondering if we want to make that file more c++ anyway as it's got to
be in c++.

Cheers,
Adam.



* Re: [Edbrowse-dev] curl things
  2014-12-28 15:35 [Edbrowse-dev] curl things Karl Dahlke
  2014-12-28 16:04 ` Adam Thompson
@ 2015-01-03 13:19 ` Chris Brannon
  1 sibling, 0 replies; 10+ messages in thread
From: Chris Brannon @ 2015-01-03 13:19 UTC (permalink / raw)
  To: Edbrowse-dev

Karl Dahlke <eklhad@comcast.net> writes:

> Do you have time for projects, and if not can I consult with you?

Hey guys!  You've been busy!  Sorry, I've been out of town and mostly
away from a usable computer for the last couple of weeks.

Oh, I have plenty of time, and I'm happy to work on this.  I'll start
researching once I've had time to study all of the latest changes.

> We might need 2 letter commands,

Very likely, especially for folder manipulation.

> I've already called curl_easy_perform, and it's running,
> so how can I switch the callback function in mid stream?

Have a "wrapper" callback function that you pass to libcurl, and let
the wrapper delegate to the appropriate callback somehow.
I don't know if libcurl will let you switch the callback mid-stream, but
if it does, this seems like an implementation detail that may be
unreliable.
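
A minimal sketch of the wrapper idea, under stated assumptions: write_wrapper has the same signature as a libcurl CURLOPT_WRITEFUNCTION callback, and a pointer to the state struct would be supplied through CURLOPT_WRITEDATA. The struct, the two sinks, and switch_to_disk are hypothetical names. The callback registered with libcurl never changes; only the state it consults does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-transfer state, passed as the callback's userdata. */
struct xfer_state {
    FILE *disk;      /* non-NULL once the user has chosen to save to disk */
    char *mem;       /* in-memory buffer used until then */
    size_t mem_len;
};

/* Append a chunk to the in-memory buffer. */
static size_t sink_memory(struct xfer_state *st, const char *p, size_t n)
{
    char *grown = realloc(st->mem, st->mem_len + n + 1);
    if (!grown)
        return 0;                /* a short count makes libcurl abort */
    memcpy(grown + st->mem_len, p, n);
    st->mem = grown;
    st->mem_len += n;
    st->mem[st->mem_len] = '\0';
    return n;
}

/* Write a chunk straight to the open file. */
static size_t sink_disk(struct xfer_state *st, const char *p, size_t n)
{
    return fwrite(p, 1, n, st->disk);
}

/* The one callback handed to libcurl; it delegates on current state. */
size_t write_wrapper(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct xfer_state *st = userdata;
    size_t n = size * nmemb;
    assert(st != NULL);
    return st->disk ? sink_disk(st, ptr, n) : sink_memory(st, ptr, n);
}

/* Switch to disk mid-stream: flush what was buffered, then redirect. */
int switch_to_disk(struct xfer_state *st, FILE *out)
{
    if (st->mem_len && fwrite(st->mem, 1, st->mem_len, out) != st->mem_len)
        return -1;
    free(st->mem);
    st->mem = NULL;
    st->mem_len = 0;
    st->disk = out;
    return 0;
}
```

Because the switch is just a field assignment, it can happen between any two callback invocations without touching libcurl's options while the transfer is running.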

-- Chris


* Re: [Edbrowse-dev] curl things
  2015-01-06  0:25 ` Chris Brannon
@ 2015-01-06 21:17   ` Adam Thompson
  0 siblings, 0 replies; 10+ messages in thread
From: Adam Thompson @ 2015-01-06 21:17 UTC (permalink / raw)
  To: Chris Brannon; +Cc: Edbrowse-dev


On Mon, Jan 05, 2015 at 04:25:26PM -0800, Chris Brannon wrote:
> Karl Dahlke <eklhad@comcast.net> writes:
> 
> > then maybe we shouldn't spend a lot of time tracking it down.
> > I wasn't asking for the feature in the first place.
> 
> Me either.  I use GNU screen, so if my edbrowse blocks, I just open up
> another screen window and fire off another browser.
> The saving directly to disk feature is nice though, since that's what
> people are usually going to want to do with extremely large files.
> I do see why some folks would like downloading in the background, so
> if I can figure out why it barfs on https, I will.

Thanks. The problem I've run into on more than one occasion is that edbrowse
supports multiple buffers and yet blocks on large downloads.
This means that if I'm working in edbrowse and start a large (or more likely
just really slow) download, then I can't use the other edbrowse buffers I've
got open whilst the download happens.
Tbh, before edbrowse got saving to disk,
I rarely used it for large downloads because most of my machines are memory
limited (1 gb of RAM counts as memory limited when you try downloading a 1.5 gb
iso by mistake), however now I'd like to be able to download files just like
I'd do in a graphical browser, or when using links (which I sometimes have to
do when edbrowse won't handle a site's particular kind of html) for that matter.

> Something tells me that it's not ok to use the same curl handle across
> multiple processes though.  More research is required.

Yeah. I'm not sure if it's a factor since I've never run into this before,
but when running in gdb you can see that curl launches a new thread when
performing network operations. I wonder what happens when one forks the process
mid operation, and how that plays with whatever openssl is doing behind curl's 
threading.

Cheers,
Adam.



* Re: [Edbrowse-dev] curl things
  2015-01-05 21:33 Karl Dahlke
  2015-01-06  0:25 ` Chris Brannon
@ 2015-01-06 21:09 ` Adam Thompson
  1 sibling, 0 replies; 10+ messages in thread
From: Adam Thompson @ 2015-01-06 21:09 UTC (permalink / raw)
  To: Karl Dahlke; +Cc: Edbrowse-dev


On Mon, Jan 05, 2015 at 04:33:35PM -0500, Karl Dahlke wrote:
> https download example fails consistently for me, on two different machines.
> 
> curl 7.32.0
> OpenSSL 1.0.1e-fips 11
> 
> curl 7.36.0
> OpenSSL 1.0.1e-fips 11
> 
> I upgraded from 1.0.1e-30.fc20 to 1.0.1e-40.fc20 just for grins.
> Probably got rid of heartbleed bug, but didn't fix background download.
> read error right away.
> Well if it works for everyone but me
> then maybe we shouldn't spend a lot of time tracking it down.
> I wasn't asking for the feature in the first place.

True, but I'm concerned as to why it's doing this in some, but not all,
versions of curl. What OS are you running?
If I get time I'll set up a couple of virtual machines to try and reproduce
this.

> > is there any way we could have a command to tell you
> > what downloads are currently in progress?
> 
> maintain a dynamic array of background downloads, file name and child pid,
> catch SIGCHLD, signal handler removes child from the list.
> Yes it could be done.

That sounds like a sensible design. How tricky would it be to implement?

Cheers,
Adam.



* Re: [Edbrowse-dev] curl things
  2015-01-05 21:33 Karl Dahlke
@ 2015-01-06  0:25 ` Chris Brannon
  2015-01-06 21:17   ` Adam Thompson
  2015-01-06 21:09 ` Adam Thompson
  1 sibling, 1 reply; 10+ messages in thread
From: Chris Brannon @ 2015-01-06  0:25 UTC (permalink / raw)
  To: Edbrowse-dev

Karl Dahlke <eklhad@comcast.net> writes:

> then maybe we shouldn't spend a lot of time tracking it down.
> I wasn't asking for the feature in the first place.

Me either.  I use GNU screen, so if my edbrowse blocks, I just open up
another screen window and fire off another browser.
The saving directly to disk feature is nice though, since that's what
people are usually going to want to do with extremely large files.
I do see why some folks would like downloading in the background, so
if I can figure out why it barfs on https, I will.
Something tells me that it's not ok to use the same curl handle across
multiple processes though.  More research is required.
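
If sharing one handle across processes does turn out to be the problem, one alternative (not what the thread describes, which forks in the middle of a transfer) is to fork first and let the child do the whole download with a handle it creates itself. The sketch below shows only the process side of that pattern; download_fn, stub_download, and the other names are hypothetical, and the stub stands in for real libcurl work.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical job type: performs one complete download and returns 0 on
 * success. A real job would create, use, and clean up its own libcurl
 * handle entirely inside the child. */
typedef int (*download_fn)(const char *url, const char *path);

/* Start a download in a child process. Returns the child's pid, or -1. */
pid_t start_background_download(download_fn job, const char *url,
                                const char *path)
{
    assert(job != NULL);
    pid_t pid = fork();
    if (pid != 0)
        return pid;              /* parent (pid > 0) or fork failure (-1) */
    /* Child: no transfer state is inherited mid-operation. */
    int rc = job(url, path);
    _exit(rc == 0 ? 0 : 1);      /* skip the parent's atexit handlers */
}

/* Wait for one child and report whether it exited with status 0. */
int background_download_succeeded(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return 0;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/* Stand-in job for trying the pattern without a network. */
int stub_download(const char *url, const char *path)
{
    (void)path;
    return url[0] != '\0' ? 0 : -1;
}
```

The cost of this approach is that the decision to go to the background must be made before the transfer starts, so any headers already fetched in the parent would have to be fetched again by the child.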

-- Chris


* [Edbrowse-dev] curl things
@ 2015-01-05 21:33 Karl Dahlke
  2015-01-06  0:25 ` Chris Brannon
  2015-01-06 21:09 ` Adam Thompson
  0 siblings, 2 replies; 10+ messages in thread
From: Karl Dahlke @ 2015-01-05 21:33 UTC (permalink / raw)
  To: Edbrowse-dev

https download example fails consistently for me, on two different machines.

curl 7.32.0
OpenSSL 1.0.1e-fips 11

curl 7.36.0
OpenSSL 1.0.1e-fips 11

I upgraded from 1.0.1e-30.fc20 to 1.0.1e-40.fc20 just for grins.
Probably got rid of heartbleed bug, but didn't fix background download.
read error right away.
Well if it works for everyone but me
then maybe we shouldn't spend a lot of time tracking it down.
I wasn't asking for the feature in the first place.

> is there any way we could have a command to tell you
> what downloads are currently in progress?

maintain a dynamic array of background downloads, file name and child pid,
catch SIGCHLD, signal handler removes child from the list.
Yes it could be done.

Karl Dahlke


* Re: [Edbrowse-dev] curl things
  2015-01-05 18:12 ` Chris Brannon
@ 2015-01-05 20:48   ` Adam Thompson
  0 siblings, 0 replies; 10+ messages in thread
From: Adam Thompson @ 2015-01-05 20:48 UTC (permalink / raw)
  To: Chris Brannon; +Cc: Edbrowse-dev


On Mon, Jan 05, 2015 at 10:12:24AM -0800, Chris Brannon wrote:
> Karl Dahlke <eklhad@comcast.net> writes:
> 
> > As you see, download in the background works for http and ftp,
> > but not their secure versions,
> 
> That's strange.  The example works just fine for me.
> Usually, when I've seen the "cannot read data from the server" message,
> it indicates that the server closed the connection prematurely.  More
> often than not, it's just an intermittent failure.  Give it another go
> and tell me if it still fails for you.  If it does, libcurl is behaving
> differently between versions or something, and it's going to be fun to
> track down.

Ok, just got round to testing this and it seems to work for me also.
Out of interest, is there any way we could have a command to tell you what
downloads are currently in progress?

Cheers,
Adam.



* Re: [Edbrowse-dev] curl things
  2015-01-03 15:34 Karl Dahlke
@ 2015-01-05 18:12 ` Chris Brannon
  2015-01-05 20:48   ` Adam Thompson
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Brannon @ 2015-01-05 18:12 UTC (permalink / raw)
  To: Edbrowse-dev

Karl Dahlke <eklhad@comcast.net> writes:

> As you see, download in the background works for http and ftp,
> but not their secure versions,

That's strange.  The example works just fine for me.
Usually, when I've seen the "cannot read data from the server" message,
it indicates that the server closed the connection prematurely.  More
often than not, it's just an intermittent failure.  Give it another go
and tell me if it still fails for you.  If it does, libcurl is behaving
differently between versions or something, and it's going to be fun to
track down.

-- Chris


* [Edbrowse-dev] curl things
@ 2015-01-03 15:34 Karl Dahlke
  2015-01-05 18:12 ` Chris Brannon
  0 siblings, 1 reply; 10+ messages in thread
From: Karl Dahlke @ 2015-01-03 15:34 UTC (permalink / raw)
  To: Edbrowse-dev

Welcome back from wherever, and hope you had a grand time.
We have missed your thoughts and advice and wisdom,
but have pressed on anyways.
Your git pull will have a lot of fast forwarding to do.
Read through all our emails and you'll see where we are.
Do make comments on what we have done so far.

> I have plenty of time, and I'm happy to work on this.

That's great news.

As you see, download in the background works for http and ftp,
but not their secure versions,
due to some kind of curl interruptus.
This is the failure case.
https://archive.torproject.org/amnesia.boum.org/tails/stable/tails-i386-1.2.2/tails-i386-1.2.2.iso
Not sure the best plan - maybe (the easy way) I can just return
something different from callback and it will do what I want,
or I can just close the socket in parent before returning -1,
then whatever curl does will not disrupt the server
and what it is doing with the child.
Or maybe the roles of parent and child must switch, if ssl
uses the process id somewhere and must march along.
Uglier cases are to abort and restart the secure download,
or forget about doing it in the background and just download to disk,
which is valuable in itself.
Or just put plain downloads in background, as almost all of them are plain,
and keep secure in foreground, but that could be confusing.
IDK

The most important step in imap might be a clean command line
interface to same, that is not far different from the one in fetchmail.
Don't know a lot about imap, or curl's support thereof,
but would probably like it if it was there.

My next project, which is not a major redesign,
but more like filling in some holes,
is to fix tag.innerHTML = "foobar", in js.
This never really worked properly, never.
It injects html into the page, after it is parsed,
but my parsing software in html.c was built to be a one time thing,
so there is some work to do here.
I used to get around this by dumping the injected html into another buffer
and rendering it there, but that is ugly.
It belongs where it belongs, under the specified tag
on the current page.
And for calls to document.write() after browse, I used to do the same thing,
dump the html somewhere else and parse it in a separate window.
This is illustrated by line 3 in browsed jsrt, the timer, fire it and
watch what it does, ugly.
It should just append that (rendered) html to the current buffer.
Both issues are related, and I am making a series of small foundational
changes that will let me approach this one.

Karl Dahlke


Thread overview: 10+ messages
2014-12-28 15:35 [Edbrowse-dev] curl things Karl Dahlke
2014-12-28 16:04 ` Adam Thompson
2015-01-03 13:19 ` Chris Brannon
2015-01-03 15:34 Karl Dahlke
2015-01-05 18:12 ` Chris Brannon
2015-01-05 20:48   ` Adam Thompson
2015-01-05 21:33 Karl Dahlke
2015-01-06  0:25 ` Chris Brannon
2015-01-06 21:17   ` Adam Thompson
2015-01-06 21:09 ` Adam Thompson
