edbrowse-dev - development list for edbrowse
* [Edbrowse-dev] pdf auto download
@ 2015-04-10 11:48 Karl Dahlke
  2015-04-10 13:58 ` Adam Thompson
  0 siblings, 1 reply; 9+ messages in thread
From: Karl Dahlke @ 2015-04-10 11:48 UTC
  To: Edbrowse-dev

> I'd just alter the existing mime mechanism to support a contenttype keyword

Ok I see where you're headed here. That makes sense.
Then, along with the program to run to process that type of file,
would be some kind of download attribute if you want the option
to download that file to disk.
But I wonder if it shouldn't be the other way round: default for anything
other than text is to give you the option to download,
and if you want things always rendered,
always want to read the pdf or hear the music, then some kind of
render or inmemory keyword.
There are zillions of types of files out there and almost all of them,
other than text, I would want the download option,
which is why I'm thinking that would be the default.
In my config, my personal preference, a mime type application/pdf,
that would bypass autodownload and run my particular pdf converter,
which really should not be hardcoded as it is today in http.c.
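
A minimal sketch of the default proposed above: text is rendered, a configured type marked with a render keyword is rendered through its converter, and everything else falls through to the download option. The names `fetch_action`, `type_handler`, and `default_action` are hypothetical, as is the `render` flag; none of this is existing edbrowse code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

enum fetch_action { ACTION_RENDER, ACTION_OFFER_DOWNLOAD };

/* Hypothetical config entry; "render" stands for the proposed keyword. */
struct type_handler {
    const char *content_type;   /* e.g. "application/pdf" */
    const char *program;        /* converter or player to run */
    bool render;                /* true: bypass the download option */
};

/* h is the handler configured for this content type, or NULL if none. */
static enum fetch_action
default_action(const char *content_type, const struct type_handler *h)
{
    /* Text is always loaded into memory and rendered. */
    if (content_type != NULL && strncasecmp(content_type, "text/", 5) == 0)
        return ACTION_RENDER;
    /* A configured type is rendered only if it asks for it. */
    if (h != NULL && h->render)
        return ACTION_RENDER;
    /* Anything else gets the download option by default. */
    return ACTION_OFFER_DOWNLOAD;
}
```

With this shape, an application/pdf entry carrying the render flag would skip the download option, while an audio type without the flag would still offer it.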

I'd still support suffixes for local browsing,
strolling through a directory, load a file,
and type pb (play buffer) for example and it knows
which player to invoke by suffix,
but as you say this would mostly be for local files,
whereas web should use content type most of the time.
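
For the local case, picking a player by suffix needs little more than the text after the last dot in the file name. A small sketch, assuming a hypothetical helper `file_suffix` (not taken from the edbrowse source):

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the suffix after the last '.' in the final path component,
 * or NULL if there is none. Dots in directory names are ignored, and
 * a leading dot (as in ".ebrc") is not treated as a suffix.
 */
static const char *file_suffix(const char *path)
{
    const char *slash = strrchr(path, '/');
    const char *base = slash ? slash + 1 : path;
    const char *dot = strrchr(base, '.');

    if (dot == NULL || dot == base || dot[1] == '\0')
        return NULL;
    return dot + 1;
}
```

The returned suffix could then be matched against the suffix lists of the configured mime entries.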

Karl Dahlke


* Re: [Edbrowse-dev] pdf auto download
  2015-04-10 11:48 [Edbrowse-dev] pdf auto download Karl Dahlke
@ 2015-04-10 13:58 ` Adam Thompson
  0 siblings, 0 replies; 9+ messages in thread
From: Adam Thompson @ 2015-04-10 13:58 UTC
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Fri, Apr 10, 2015 at 07:48:37AM -0400, Karl Dahlke wrote:
> > I'd just alter the existing mime mechanism to support a contenttype keyword
> 
> Ok I see where you're headed here. That makes sense.
> Then, along with the program to run to process that type of file,
> would be some kind of download attribute if you want the option
> to download that file to disk.
> But I wonder if it shouldn't be the other way round: default for anything
> other than text is to give you the option to download,
> and if you want things always rendered,
> always want to read the pdf or hear the music, then some kind of
> render or inmemory keyword.
> There are zillions of types of files out there and almost all of them,
> other than text, I would want the download option,
> which is why I'm thinking that would be the default.
> In my config, my personal preference, a mime type application/pdf,
> that would bypass autodownload and run my particular pdf converter,
> which really should not be hardcoded as it is today in http.c.
> 
> I'd still support suffixes for local browsing,
> strolling through a directory, load a file,
> and type pb (play buffer) for example and it knows
> which player to invoke by suffix,
> but as you say this would mostly be for local files,
> whereas web should use content type most of the time.

Agreed. I like either render or display for the keyword to avoid downloading.
If the display (or render) keyword isn't specified,
the download prompt is displayed, and if the user chooses to load the file into
memory then the specified converter is used (if one exists).
I'd also like to see the hard-coded pdf conversion removed.
As for suffixes, I wonder if we shouldn't use libmagic to determine
mime types for local files instead.

Any idea how hard this would be to implement?

Cheers,
Adam.


* Re: [Edbrowse-dev] pdf auto download
  2015-04-07 18:23 Karl Dahlke
@ 2015-04-10 11:22 ` Adam Thompson
  0 siblings, 0 replies; 9+ messages in thread
From: Adam Thompson @ 2015-04-10 11:22 UTC
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Tue, Apr 07, 2015 at 02:23:53PM -0400, Karl Dahlke wrote:
> > If we changed the mechanism to be based on http headers rather than suffix,
> 
> The determination of whether to download an http file to disk,
> as opposed to in memory and browsing, is currently based on content-type
> as per the http headers.
> In other words, it already works the way you want.
> text/ or application/pdf is rendered, all else is downloaded.
> Of course you can override a download by typing a space, and keep it in memory.
> I do this sometimes for mp3 or wav off a website,
> if I just want to hear the sound and not save it somewhere.
> 
> This is not configurable, as you suggested.
> Perhaps something like
> 
> downtype = application/pdf
> 
> In the config file.
> I'm not sure what the defaults would be, and how to change them in .ebrc.
> It could be done, but I wonder if it would confuse
> more than it helps.
> I'm not sure the average user understands the various content types
> in http headers.

I'd just alter the existing mime mechanism to support a contenttype keyword
which would override the suffix keyword if specified.
I've never been a fan of specifying a mime type based on file extension,
and I find the existing mechanism of mapping arbitrary file extensions to
arbitrary mime types a bit... odd... anyway.
What I'm thinking is to have a mechanism where we check for a mime handler
for a particular type and adjust the prompt accordingly.
This would move us closer to the plugin mechanism supported by most other
modern browsers. In addition, given the way the web is heading,
this change will become increasingly important,
particularly as scripts are increasingly generating non-html output.
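
The lookup order described above (content type first, suffix only as a fallback) might look roughly like this. It is a sketch only: `struct mime_entry`, `find_handler`, and the program names used with it are hypothetical, not edbrowse's actual structures.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical mime entry with an optional contenttype keyword. */
struct mime_entry {
    const char *content_type;   /* NULL if no contenttype keyword given */
    const char *suffix;         /* NULL if no suffix keyword given */
    const char *program;        /* handler to run */
};

/*
 * Find a handler for a fetched resource. A content type match wins over
 * any suffix match; the suffix is consulted only when no entry matches
 * the content type (or when no content type is known, e.g. local files).
 */
static const struct mime_entry *
find_handler(const struct mime_entry *tab, size_t n,
             const char *content_type, const char *suffix)
{
    size_t i;

    if (content_type != NULL)
        for (i = 0; i < n; i++)
            if (tab[i].content_type != NULL &&
                strcasecmp(tab[i].content_type, content_type) == 0)
                return &tab[i];

    if (suffix != NULL)
        for (i = 0; i < n; i++)
            if (tab[i].suffix != NULL &&
                strcmp(tab[i].suffix, suffix) == 0)
                return &tab[i];

    return NULL;
}
```

A script whose URL ends in .php but which reports application/pdf would then reach the pdf handler, and the caller could adjust the prompt depending on whether a handler was found.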

> Remember that ftp files are always
> downloaded to disk, since we have no indication of their type,
> the one exception to this being an ftp directory, which I load and browse,
> assuming you will select a file or subdirectory from this directory etc.

This seems sensible to me.

Cheers,
Adam.


* [Edbrowse-dev] pdf auto download
@ 2015-04-07 18:23 Karl Dahlke
  2015-04-10 11:22 ` Adam Thompson
  0 siblings, 1 reply; 9+ messages in thread
From: Karl Dahlke @ 2015-04-07 18:23 UTC
  To: Edbrowse-dev

> If we changed the mechanism to be based on http headers rather than suffix,

The determination of whether to download an http file to disk,
as opposed to in memory and browsing, is currently based on content-type
as per the http headers.
In other words, it already works the way you want.
text/ or application/pdf is rendered, all else is downloaded.
Of course you can override a download by typing a space, and keep it in memory.
I do this sometimes for mp3 or wav off a website,
if I just want to hear the sound and not save it somewhere.

This is not configurable, as you suggested.
Perhaps something like

downtype = application/pdf

In the config file.
I'm not sure what the defaults would be, and how to change them in .ebrc.
It could be done, but I wonder if it would confuse
more than it helps.
I'm not sure the average user understands the various content types
in http headers.
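
To make the config idea concrete, here is a rough sketch of reading one such line. The `downtype` keyword is only the suggestion floated above, and `parse_downtype` is a hypothetical helper; edbrowse's real config parser is not shown in this thread.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Parse a line of the form "downtype = type/subtype".
 * On success, copy the value (without surrounding whitespace) into out
 * and return true. Return false for any other line, for an empty value,
 * or if the value does not fit in out.
 */
static bool parse_downtype(const char *line, char *out, size_t outsize)
{
    static const char key[] = "downtype";
    const size_t klen = sizeof(key) - 1;
    size_t len;

    while (isspace((unsigned char)*line))
        line++;
    if (strncmp(line, key, klen) != 0)
        return false;
    line += klen;
    while (isspace((unsigned char)*line))
        line++;
    if (*line != '=')
        return false;
    line++;
    while (isspace((unsigned char)*line))
        line++;

    /* Trim trailing whitespace, including a newline if present. */
    len = strlen(line);
    while (len > 0 && isspace((unsigned char)line[len - 1]))
        len--;
    if (len == 0 || len >= outsize)
        return false;

    memcpy(out, line, len);
    out[len] = '\0';
    return true;
}
```

Each accepted value would be added to a list of content types, which the http code could consult when deciding whether to download.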

Remember that ftp files are always
downloaded to disk, since we have no indication of their type,
the one exception to this being an ftp directory, which I load and browse,
assuming you will select a file or subdirectory from this directory etc.

Karl Dahlke


* Re: [Edbrowse-dev] pdf auto download
  2015-03-28 11:18 Karl Dahlke
@ 2015-04-07 17:43 ` Adam Thompson
  0 siblings, 0 replies; 9+ messages in thread
From: Adam Thompson @ 2015-04-07 17:43 UTC
  To: Karl Dahlke; +Cc: Edbrowse-dev

On Sat, Mar 28, 2015 at 07:18:07AM -0400, Karl Dahlke wrote:
> Chuck asks:
> > how would a user be able to retrieve and save the original pdf?
> 
> You can save the original of any file you access on the web,
> be it pdf or html or whatever, by typing ub to unbrowse the file,
> thus reproducing the original, then w to write.
> Or w/ to write to the internet filename, if that is the filename you want.
> You wouldn't lose any capabilities; probably this is how you did it
> in edbrowse 3.5.2 and earlier.

This makes me wonder whether we should make the content type handling
(as in the mime type reported by the http header, not based on suffix)
configurable rather than hard-coding defaults and then using the download
prompt for the rest?
I know we have the mime mechanism at the moment,
but it needs a suffix (as far as I know), and it'd be nice to not rely on that,
particularly for websites which don't use standard suffixes (e.g.
some php scripts I've seen generate pdf output but have the .php suffix in the URL).
If we changed the mechanism to be based on http headers rather than suffix,
then we could also add a download or view option
so that you could configure pdfs to download rather than view if you wanted to.

Cheers,
Adam.


* Re: [Edbrowse-dev] pdf auto download
  2015-03-28  3:42 Karl Dahlke
  2015-03-28 10:03 ` Chuck Hallenbeck
@ 2015-03-28 12:13 ` Chris Brannon
  1 sibling, 0 replies; 9+ messages in thread
From: Chris Brannon @ 2015-03-28 12:13 UTC
  To: Edbrowse-dev

> I propose adding, somewhere near http.c line 1568,
> the additional check for application/pdf,

Go for it.

-- Chris


* [Edbrowse-dev] pdf auto download
@ 2015-03-28 11:18 Karl Dahlke
  2015-04-07 17:43 ` Adam Thompson
  0 siblings, 1 reply; 9+ messages in thread
From: Karl Dahlke @ 2015-03-28 11:18 UTC
  To: Edbrowse-dev

Chuck asks:
> how would a user be able to retrieve and save the original pdf?

You can save the original of any file you access on the web,
be it pdf or html or whatever, by typing ub to unbrowse the file,
thus reproducing the original, then w to write.
Or w/ to write to the internet filename, if that is the filename you want.
You wouldn't lose any capabilities; probably this is how you did it
in edbrowse 3.5.2 and earlier.

Karl Dahlke


* Re: [Edbrowse-dev] pdf auto download
  2015-03-28  3:42 Karl Dahlke
@ 2015-03-28 10:03 ` Chuck Hallenbeck
  2015-03-28 12:13 ` Chris Brannon
  1 sibling, 0 replies; 9+ messages in thread
From: Chuck Hallenbeck @ 2015-03-28 10:03 UTC
  To: Karl Dahlke; +Cc: Edbrowse-dev

Karl,

If you make that change for pdf files, how would a user be able to
retrieve and save the original pdf?  At present I allow the automatic
download, then apply a simple bash script to convert the saved pdf to a
txt version.

Chuck


-- 
--

Sent from the bottom of my heart



* [Edbrowse-dev] pdf auto download
@ 2015-03-28  3:42 Karl Dahlke
  2015-03-28 10:03 ` Chuck Hallenbeck
  2015-03-28 12:13 ` Chris Brannon
  0 siblings, 2 replies; 9+ messages in thread
From: Karl Dahlke @ 2015-03-28  3:42 UTC
  To: Edbrowse-dev

An unexpected side effect of our recent download to disk feature
is that it takes place on pdf files,
and I really don't want it to.
Here is the logic, when surfing the web.
If the http headers say the file that is coming is text,
it is loaded into memory and rendered.
You still have the option to save it to disk,
formatted or unformatted, but it doesn't go down that path automatically,
because it's most likely a web page that you just want to read.
I think a pdf file should be treated the same way, though it is not today,
because its content type is application/pdf.
I propose adding, somewhere near http.c line 1568,
the additional check for application/pdf, so it is treated like other
text/html files, like other web pages.
Passing the idea by you first, before I make the change.
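
A minimal sketch of the kind of check being proposed, assuming a hypothetical helper `should_render_in_memory`; the actual code near http.c line 1568 is not quoted in this thread and may look quite different. Treating a missing content type as "download" is also an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/*
 * Decide from the Content-Type header value whether a fetched file is
 * loaded into memory and rendered (true) or sent down the download
 * path (false).
 */
static bool should_render_in_memory(const char *content_type)
{
    static const char pdf[] = "application/pdf";
    const size_t pdflen = sizeof(pdf) - 1;

    if (content_type == NULL)
        return false;   /* assumption: unknown type is downloaded */

    /* Any text type, e.g. "text/html; charset=utf-8". */
    if (strncasecmp(content_type, "text/", 5) == 0)
        return true;

    /* The additional check: pdf is treated like a web page. */
    if (strncasecmp(content_type, pdf, pdflen) == 0) {
        /* Accept only the exact type, optionally followed by parameters. */
        char c = content_type[pdflen];
        return c == '\0' || c == ';' || c == ' ' || c == '\t';
    }

    return false;
}
```

The comparison ignores case and tolerates trailing parameters, since header values such as "text/html; charset=utf-8" are common.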
