* [9front] htmlfs
@ 2021-08-29 21:05 Philip Silva
2021-08-30 4:12 ` Amavect
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Philip Silva @ 2021-08-29 21:05 UTC (permalink / raw)
To: 9front
Speaking of Go and gumbo, I was also wondering if there is some sort of htmlfs. At least one forum mentioned an xmlfs somewhere in contrib, but I couldn't find it. I'd be most curious about how it's structured, given that tags must be ordered. Maybe /mnt/.../html/body/001-div/... or .../body/001/div/...? I'm not sure that would be a nice solution. For the opossum browser I eventually wrote an RPC that takes a CSS selector as input and returns HTML nodes as JSON. (Not really something for a reusable API, I guess, but enough to try some stuff :) Although I'd much prefer using a common, clean interface/fs.)
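To make the first naming idea concrete, here is a minimal sketch (not an existing htmlfs) that lists one path per element and numbers siblings in document order, in the style of 001-div. It uses Python's standard html.parser; the names PathLister and list_paths are made up for illustration, and the code assumes well-nested markup and ignores text nodes.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}

class PathLister(HTMLParser):
    """Collect one path per element, numbering siblings in document order."""

    def __init__(self):
        super().__init__()
        self.stack = []     # path components of the currently open elements
        self.counts = [0]   # children seen so far at each open depth
        self.paths = []

    def handle_starttag(self, tag, attrs):
        self.counts[-1] += 1
        name = f"{self.counts[-1]:03d}-{tag}"
        self.paths.append("/" + "/".join(self.stack + [name]))
        if tag not in VOID:
            self.stack.append(name)
            self.counts.append(0)

    def handle_endtag(self, tag):
        # Assumption: input is well nested; real-world HTML needs error recovery.
        if self.stack and self.stack[-1].split("-", 1)[1] == tag:
            self.stack.pop()
            self.counts.pop()

def list_paths(markup):
    """Return the numbered element paths of an HTML string."""
    lister = PathLister()
    lister.feed(markup)
    lister.close()
    return lister.paths
```

The alternative layout, .../body/001/div/..., would only change how name is built. Either way the number carries the ordering that plain tag names cannot.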
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [9front] htmlfs
2021-08-29 21:05 [9front] htmlfs Philip Silva
@ 2021-08-30 4:12 ` Amavect
2021-08-31 5:33 ` unobe
2021-08-31 11:49 ` hiro
2021-08-31 5:23 ` unobe
2021-08-31 13:20 ` Pavel Renev
2 siblings, 2 replies; 18+ messages in thread
From: Amavect @ 2021-08-30 4:12 UTC (permalink / raw)
To: 9front
Not everything needs to be a file system.
A program still needs to deserialize and load structs.
A 9p fs just doesn't do that.
I think you just want a programming library.
Thanks,
Amavect
* Re: [9front] htmlfs
2021-08-30 4:12 ` Amavect
@ 2021-08-31 5:33 ` unobe
2021-08-31 9:46 ` jstsmthrgk
2021-08-31 20:09 ` hiro
2021-08-31 11:49 ` hiro
1 sibling, 2 replies; 18+ messages in thread
From: unobe @ 2021-08-31 5:33 UTC (permalink / raw)
To: 9front
Quoth Amavect <amavect@gmail.com>:
> Not everything needs to be a file system.
> A program still needs to deserialize and load structs.
> A 9p fs just doesn't do that.
That's true, there are benefits to a programming library (namely,
performance). But doesn't a file system that presents a consistent
interface allow for a choice of programming language and for the
ability to abstract further? For instance, having xmlfs (if such a
thing existed) would allow for rc programs to do some simple tasks
that need to muck with xml.
* Re: [9front] htmlfs
2021-08-31 5:33 ` unobe
@ 2021-08-31 9:46 ` jstsmthrgk
2021-08-31 10:36 ` hiro
2021-08-31 20:09 ` hiro
1 sibling, 1 reply; 18+ messages in thread
From: jstsmthrgk @ 2021-08-31 9:46 UTC (permalink / raw)
To: 9front
There exists a piece of software (for linux) with some interesting ideas regarding xml scriptability: https://manpages.debian.org/unstable/xml2/html2.1.en.html
On 31 August 2021 07:33:25 CEST, unobe@cpan.org wrote:
>Quoth Amavect <amavect@gmail.com>:
>> Not everything needs to be a file system.
>> A program still needs to deserialize and load structs.
>> A 9p fs just doesn't do that.
>
>That's true, there are benefits to a programming library (namely,
>performance). But doesn't a file system that presents a consistent
>interface allow for a choice of programming language and for the
>ability to abstract further? For instance, having xmlfs (if such a
>thing existed) would allow for rc programs to do some simple tasks
>that need to muck with xml.
>
* Re: [9front] htmlfs
2021-08-31 9:46 ` jstsmthrgk
@ 2021-08-31 10:36 ` hiro
2021-08-31 12:29 ` Steve Simon
0 siblings, 1 reply; 18+ messages in thread
From: hiro @ 2021-08-31 10:36 UTC (permalink / raw)
To: 9front
i can confirm, but the man page doesn't represent the ideas.
if you end up using this, you will see that for each element one line
of text is printed, including the full path/name of the value.
it feels a little bit like the output of grep -r, just with xml
hierarchy instead of file paths.
i always thought it would be neat to have a real fs instead, allowing
globbing instead of grep to read a specific element.
example:
$ html2 < poettering-Walkthrough\ for\ Portable\ Services.html 2>/dev/null | grep head/title
/html/head/title= Walkthrough for Portable Services
would be cool to instead run:
; htmlfs poettering*html
mounted at /n/htmlfs
; cat /n/htmlfs/*/*/title
Walkthrough for Portable Services
;
cool.
but not necessary for my use case.
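As a rough illustration of what globbing over element paths would buy, the sketch below matches a pattern such as /*/*/title against (path, value) pairs one component at a time, so that * never crosses a /. The function names and the second sample entry are invented for the example; only the title entry comes from the session above.

```python
from fnmatch import fnmatchcase

def glob_match(pattern, path):
    """Match path against pattern component by component, like a shell glob."""
    pat_parts = pattern.strip("/").split("/")
    path_parts = path.strip("/").split("/")
    return len(pat_parts) == len(path_parts) and all(
        fnmatchcase(part, pat) for part, pat in zip(path_parts, pat_parts))

def select(entries, pattern):
    """Return the values of all (path, value) entries whose path matches."""
    return [value for path, value in entries if glob_match(pattern, path)]

entries = [
    ("/html/head/title", "Walkthrough for Portable Services"),
    ("/html/body/h1", "placeholder heading"),  # hypothetical entry, not from the thread
]

print(select(entries, "/*/*/title"))
```

With a real file server the shell would do this matching itself; the sketch only shows that the selection is a plain per-component pattern match.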
A file is useful for separating binary data and strings that contain
newlines, but it just so happens that inside html you can often ignore
newlines, which means that practically the value of any entity should
fit quite well on a line of text as html2/xml2 output it.
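The one-line-per-value idea can be sketched in a few lines of Python. This is not html2's implementation and does not claim to reproduce its exact output format; it only prints path=text for each run of text, collapsing newlines and other whitespace into single spaces, which is the simplification described above. Flattener and flatten are illustrative names.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they must not stay on the path stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}

class Flattener(HTMLParser):
    """Emit one 'path=text' line for each non-empty run of text."""

    def __init__(self):
        super().__init__()
        self.stack = []   # tags of the currently open elements
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Assumption: well-nested input.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = " ".join(data.split())   # newlines become ordinary spaces
        if text:
            self.lines.append("/" + "/".join(self.stack) + "=" + text)

def flatten(markup):
    """Return the flattened 'path=text' lines of an HTML string."""
    flattener = Flattener()
    flattener.feed(markup)
    flattener.close()
    return flattener.lines
```

The result can be filtered with grep exactly as in the html2 example, since every line carries its full path.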
* Re: [9front] htmlfs
2021-08-31 5:33 ` unobe
2021-08-31 9:46 ` jstsmthrgk
@ 2021-08-31 20:09 ` hiro
2021-08-31 22:40 ` Stuart Morrow
1 sibling, 1 reply; 18+ messages in thread
From: hiro @ 2021-08-31 20:09 UTC (permalink / raw)
To: 9front
indeed. either a filesystem or a piping process, as discussed in
previous emails, can be used easily from rc
* Re: [9front] htmlfs
2021-08-30 4:12 ` Amavect
2021-08-31 5:33 ` unobe
@ 2021-08-31 11:49 ` hiro
2021-08-31 15:42 ` Philip Silva
1 sibling, 1 reply; 18+ messages in thread
From: hiro @ 2021-08-31 11:49 UTC (permalink / raw)
To: 9front
> Not everything needs to be a file system.
> A program still needs to deserialize and load structs.
not everything needs to use structs.
esp. if everything is a string.
* Re: [9front] htmlfs
2021-08-29 21:05 [9front] htmlfs Philip Silva
2021-08-30 4:12 ` Amavect
@ 2021-08-31 5:23 ` unobe
2021-08-31 13:20 ` Pavel Renev
2 siblings, 0 replies; 18+ messages in thread
From: unobe @ 2021-08-31 5:23 UTC (permalink / raw)
To: 9front
Quoth Philip Silva <philip.silva@protonmail.com>:
> Speaking of Go and gumbo, I was also wondering if there is some sort of htmlfs. At least one forum mentioned an xmlfs somewhere in contrib, but I couldn't find it. I'd be most curious about how it's structured, given that tags must be ordered. Maybe /mnt/.../html/body/001-div/... or .../body/001/div/...? I'm not sure that would be a nice solution. For the opossum browser I eventually wrote an RPC that takes a CSS selector as input and returns HTML nodes as JSON. (Not really something for a reusable API, I guess, but enough to try some stuff :) Although I'd much prefer using a common, clean interface/fs.)
Is this what you're looking for?
cpu% 9fs sources
post...
cpu% ls -l /n/sources/contrib/steve/libxml*
--rw-r--r-- M 505 bootes sys 10178 Jun 7 2017 /n/sources/contrib/steve/libxml.tbz
cpu%
* Re: [9front] htmlfs
2021-08-29 21:05 [9front] htmlfs Philip Silva
2021-08-30 4:12 ` Amavect
2021-08-31 5:23 ` unobe
@ 2021-08-31 13:20 ` Pavel Renev
2021-08-31 15:40 ` Philip Silva
2021-09-02 11:44 ` hiro
2 siblings, 2 replies; 18+ messages in thread
From: Pavel Renev @ 2021-08-31 13:20 UTC (permalink / raw)
To: 9front
I have a half-baked DOMfs:
http://git.nsmpr.xyz/domfs/files.html
but it just represents documents as a flat list of numbered nodes (the way rio serves its windows) and their hierarchy is provided through a separate file.
The challenge with xml/html is that, unlike traditional file trees, their elements do not have unique names and are instead addressed by their order. Additionally, an element's attributes often play a bigger role than the text data it contains.
Style can also override the tree hierarchy when it comes to rendering, and when it comes to javascript, programs look up the elements they need via a global search by id and usually only care about an element's immediate parent/children.
TL;DR: the tree is a lie.
Maybe serving html via some kind of database query interface would be better.
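The flat layout described above can be sketched as follows. This is not the domfs code, only an illustration of the idea: nodes are numbered in document order, the hierarchy lives in a separate parent table, and an id index stands in for the global lookup by id. FlatDom and parse_flat are hypothetical names, and the input is assumed to be well nested.

```python
from html.parser import HTMLParser

VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}

class FlatDom(HTMLParser):
    """Flat list of numbered nodes, with the hierarchy kept separately."""

    def __init__(self):
        super().__init__()
        self.nodes = []    # node number -> (tag, attribute dict)
        self.parent = []   # node number -> parent's number, None for the root
        self.by_id = {}    # id attribute -> node number
        self.open = []     # numbers of the currently open elements

    def handle_starttag(self, tag, attrs):
        number = len(self.nodes)
        attrs = dict(attrs)
        self.nodes.append((tag, attrs))
        self.parent.append(self.open[-1] if self.open else None)
        if attrs.get("id"):
            self.by_id[attrs["id"]] = number
        if tag not in VOID:
            self.open.append(number)

    def handle_endtag(self, tag):
        if self.open and self.nodes[self.open[-1]][0] == tag:
            self.open.pop()

    def children(self, number):
        """Immediate children of a node, derived from the parent table."""
        return [i for i, p in enumerate(self.parent) if p == number]

def parse_flat(markup):
    dom = FlatDom()
    dom.feed(markup)
    dom.close()
    return dom
```

A lookup such as dom.by_id["main"] followed by dom.children(...) covers the access pattern mentioned above (find by id, then look at the immediate neighbours) without walking a tree of names.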
* Re: [9front] htmlfs
2021-08-31 13:20 ` Pavel Renev
@ 2021-08-31 15:40 ` Philip Silva
2021-09-02 11:44 ` hiro
1 sibling, 0 replies; 18+ messages in thread
From: Philip Silva @ 2021-08-31 15:40 UTC (permalink / raw)
To: 9front
Cool, thanks for sharing! Yes, I was thinking of use cases like connecting a separate JS process for DOM manipulation, or an rc script. For automation, a numbered tree is probably often what is actually needed...
> I have a half-baked DOMfs:
>
> http://git.nsmpr.xyz/domfs/files.html
>
> but it just represents documents as a flat list of numbered nodes (the way rio serves its windows) and their hierarchy is provided through a separate file.
* Re: [9front] htmlfs
2021-08-31 13:20 ` Pavel Renev
2021-08-31 15:40 ` Philip Silva
@ 2021-09-02 11:44 ` hiro
2021-09-02 12:32 ` hiro
1 sibling, 1 reply; 18+ messages in thread
From: hiro @ 2021-09-02 11:44 UTC (permalink / raw)
To: 9front
On 8/31/21, Pavel Renev <an2qzavok@gmail.com> wrote:
> I have a half-baked DOMfs:
> http://git.nsmpr.xyz/domfs/files.html
> but it just represents documents as a flat list of numbered nodes (the way
> rio serves its windows) and their hierarchy is provided through a separate
> file.
>
> The challenge with xml/html is that, unlike traditional file trees, their
> elements do not have unique names and are instead addressed by their order.
> Additionally, an element's attributes often play a bigger role than the text
> data it contains.
> Style can also override the tree hierarchy when it comes to rendering, and when
> it comes to javascript, programs look up the elements they need via a global
> search by id and usually only care about an element's immediate parent/children.
>
> TL;DR: the tree is a lie.
> Maybe serving html via some kind of database query interface would be
> better.
>
why not do like xpath? numbers can signify order.
we don't support javascript anyway, so the tree wouldn't really change
under our feet...
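For illustration, the positional part of xpath is available in Python's standard xml.etree.ElementTree, which supports a small XPath subset including tag[position]. The sketch assumes well-formed XML input (real-world HTML usually is not), and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample; the root element here is <html>.
doc = ET.fromstring(
    "<html><body>"
    "<div>first</div><p>between</p><div>second</div>"
    "</body></html>")

def nth(root, path):
    """Return the element selected by a simple positional path, or None."""
    return root.find(path)

# div[2] means the second <div> among its siblings; positions start at 1.
print(nth(doc, "./body/div[2]").text)   # prints "second"
```

The number counts only siblings with the same tag, so the p element in between does not shift the index. That is the n-th-element-of-a-type addressing, as opposed to numbering all children together.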
* Re: [9front] htmlfs
2021-09-02 11:44 ` hiro
@ 2021-09-02 12:32 ` hiro
0 siblings, 0 replies; 18+ messages in thread
From: hiro @ 2021-09-02 12:32 UTC (permalink / raw)
To: 9front
after writing my last email i googled xpath, and realized that i have only
ever seen a subset of its insane complexity: the simple path
notation that includes a way to specify the n-th element of a type,
which i have seen used a lot in practice by adblockers and anything
that needs to scrape content from websites that don't supply
meaningful element names.
on the other hand, i have indeed seen that some websites randomize their
element names to prevent this kind of javascript-free processing. so
yes, our low effort will not help with websites that really don't want
to be scraped...
end of thread, other threads:[~2021-09-02 14:13 UTC | newest]
Thread overview: 18+ messages
2021-08-29 21:05 [9front] htmlfs Philip Silva
2021-08-30 4:12 ` Amavect
2021-08-31 5:33 ` unobe
2021-08-31 9:46 ` jstsmthrgk
2021-08-31 10:36 ` hiro
2021-08-31 12:29 ` Steve Simon
2021-09-01 6:53 ` hiro
2021-09-01 8:30 ` sirjofri
2021-09-01 9:02 ` kvik
2021-08-31 20:09 ` hiro
2021-08-31 22:40 ` Stuart Morrow
2021-08-31 11:49 ` hiro
2021-08-31 15:42 ` Philip Silva
2021-08-31 5:23 ` unobe
2021-08-31 13:20 ` Pavel Renev
2021-08-31 15:40 ` Philip Silva
2021-09-02 11:44 ` hiro
2021-09-02 12:32 ` hiro