* [9fans] sept. town hall meeting @ 2005-09-09 20:55 Francisco Ballesteros
  2005-09-09 21:05 ` Uriel
  0 siblings, 1 reply; 11+ messages in thread

From: Francisco Ballesteros @ 2005-09-09 20:55 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

What about sched'ing the next town hall meeting for the third Friday of
September? I'm willing to run it if there are no objections and nobody else
prefers to do it.

Preferences regarding time? Was the last one (8pm GMT?) convenient?

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [9fans] sept. town hall meeting
  2005-09-09 20:55 [9fans] sept. town hall meeting Francisco Ballesteros
@ 2005-09-09 21:05 ` Uriel
  2005-09-09 22:10 ` [9fans] reliability and failing over Francisco Ballesteros
  0 siblings, 1 reply; 11+ messages in thread

From: Uriel @ 2005-09-09 21:05 UTC (permalink / raw)
To: 9fans

On Fri, Sep 09, 2005 at 10:55:01PM +0200, Francisco Ballesteros wrote:
> What about sched'ing the next town hall meeting
> for the third friday of september?
> I'm willing to run it if there are no objections or anyone else
> prefers to do that.
>
> Preferences regarding time? Was the last one (8pm gmt?)
> convenient?

I had tentatively scheduled the next THM for Sep 10th, but I have been too
stressed to even send out an announcement.

After some research it seems that Saturdays are the best days for most people
(especially given the diversity of timezones involved), and 20:00 GMT seems
like a time that works reasonably well for most people.

For consistency (and to avoid having to come up with a new date/time each
month), I suggest having a THM on the third Saturday of the month at 20:00 GMT.

Any complaints? If not, I will make it 'official' in the Wiki.

For a summary of the last THM, see (thanks to Hyperion for writing it up):

http://plan9.bell-labs.com/wiki/plan9/thm_2005-08-15_Summary/

uriel
* [9fans] reliability and failing over.
  2005-09-09 21:05 ` Uriel
@ 2005-09-09 22:10 ` Francisco Ballesteros
  2005-09-09 22:24 ` Russ Cox
  2005-09-10  1:22 ` Eric Van Hensbergen
  0 siblings, 2 replies; 11+ messages in thread

From: Francisco Ballesteros @ 2005-09-09 22:10 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

Funny. The 9P reliability project looks to me a lot like the redirfs that we
played with before introducing the Plan B volumes into the kernel. It provided
failover (on FSs replicated by some other means) and could recover the fids
just by keeping track of their paths.

The user-level process I'm working on now is quite similar to that (apart from
including the language to select particular volumes): it maintains a fid table
recording which server, and which path within that server, correspond to each
fid. It's what the Plan B kernel ns does, but within a server.

Probably the increase in latency you are seeing is the one I'm going to see in
volfs. The 2x penalty in performance is what one could expect, because you
have twice the latency. However, the in-kernel implementation has no penalty
at all, because the kernel can rewrite the mount tables.

Maybe we should talk about this. Eric? Russ? What do you say? Is it worth
paying the extra latency just to avoid a (serious, I admit) change in the
kernel?

On 9/9/05, Uriel <uriell@binarydream.org> wrote:
> [...]
> I had tentatively scheduled the next THM for Sep 10th, but I have been
> too stressed to even send out an announcement.
* Re: [9fans] reliability and failing over.
  2005-09-09 22:10 ` [9fans] reliability and failing over Francisco Ballesteros
@ 2005-09-09 22:24 ` Russ Cox
  2005-09-09 22:50 ` Gorka guardiola
  2005-09-10  1:22 ` Eric Van Hensbergen
  1 sibling, 1 reply; 11+ messages in thread

From: Russ Cox @ 2005-09-09 22:24 UTC (permalink / raw)
To: Francisco Ballesteros, Fans of the OS Plan 9 from Bell Labs

> Funny. The 9p reliability project looks to me a lot like the redirfs
> that we played with before introducing the Plan B volumes into the kernel.
> [...]
> Maybe we should talk about this.
> Eric? Russ? What do you say? Is it worth to pay the extra
> latency just to avoid a change (serious, I admit) in the kernel?

I keep seeing that 2x number, but I still don't believe it's reasonable to
measure the hit on an empty loopback file server. Do something over a 100Mbps
file server connection talking to fossil and see what performance hit you get
then.

Stuff in the kernel is much harder to change and debug. Unless there's a
compelling reason, I'd like to see it stay in user space. And I'm not yet
convinced that performance is a compelling reason.

Russ
* Re: [9fans] reliability and failing over.
  2005-09-09 22:24 ` Russ Cox
@ 2005-09-09 22:50 ` Gorka guardiola
  2005-09-09 23:02 ` Francisco Ballesteros
  0 siblings, 1 reply; 11+ messages in thread

From: Gorka guardiola @ 2005-09-09 22:50 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

On 9/9/05, Russ Cox <rsc@swtch.com> wrote:
> > Funny. The 9p reliability project looks to me a lot like the redirfs
> > that we played with before introducing the Plan B volumes into the kernel.
> > It provided failover (on replicated FSs, by some other means) and could
> > recover the fids just by keeping track of their paths.

It is almost the same thing. The difference is that one is 9P-9P and the other
is 9P-syscall (read, write, ...). We didn't have a Plan 9 kernel on the other
side, so we couldn't use redirfs; that is why we ported recover. It is better
for Linux (p9p) too, as it doesn't require mounting the filesystem before
using it, so you don't depend on having someone serve the 9P files by mounting
them. I am not sure about Plan 9. On one side, recover knows more about the
stuff under it, so it has more granularity and can fail in a nicer way. On the
other side, redirfs is much, much simpler.

> [...]
> I keep seeing that 2x number but I still don't believe it's actually
> reasonable to measure the hit on an empty loopback file server.
> Do something over a 100Mbps file server connection
> talking to fossil and see what performance hit you get then.

Yes, this increase in latency is on the loopback. If you are using a network
it is probably completely lost in the noise, the network being probably 100
times slower than the loopback.

> Stuff in the kernel is much harder to change and debug.
> Unless there's a compelling reason, I'd like to see it stay
> in user space. And I'm not yet convinced that performance
> is a compelling reason.

I agree, though it depends on the application. For us (normal users) I agree
completely that it is not compelling. Some people out there are doing stuff
where performance is important (let them write the code? :-)).

--
- curiosity sKilled the cat
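The "lost in the noise" point is just a ratio, and a tiny sketch makes it
concrete. The round-trip figures below are invented for illustration, not
measurements:

```go
package main

import "fmt"

// overheadPct gives the extra latency as a percentage of the base RTT.
func overheadPct(extraMs, baseMs float64) float64 {
	return 100 * extraMs / baseMs
}

func main() {
	// Illustrative round-trip figures, not measurements:
	proxyHop := 0.01 // ms: one extra pass through a user-level 9P proxy
	loopback := 0.01 // ms: RTT of an empty loopback file server
	network := 1.0   // ms: RTT over a 100Mbps LAN to a real fossil

	fmt.Printf("loopback: +%.0f%%\n", overheadPct(proxyHop, loopback)) // the dreaded 2x
	fmt.Printf("network:  +%.0f%%\n", overheadPct(proxyHop, network))  // lost in the noise
}
```

With these (made-up) numbers, doubling a loopback RTT is a dramatic 2x, while
the same extra hop in front of a network RTT to fossil costs about 1%.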
* Re: [9fans] reliability and failing over.
  2005-09-09 22:50 ` Gorka guardiola
@ 2005-09-09 23:02 ` Francisco Ballesteros
  2005-09-09 23:06 ` Russ Cox
  2005-09-10 13:11 ` Sape Mullender
  0 siblings, 2 replies; 11+ messages in thread

From: Francisco Ballesteros @ 2005-09-09 23:02 UTC (permalink / raw)
To: paurea, Fans of the OS Plan 9 from Bell Labs

Beware of the latency.

Plan B usage shows that latency is the biggest problem. I admit it shows only
in unions, but if you join N different file servers in a directory and then
double the latency, it may be a problem. A workaround is just not to join too
many servers, but it's a workaround, not a fix.

Anyway, when I complete volfs we'll see whether it's bearable or not. If it
is, I'll be happy to discard the kernel changes. If it's not, we'll have to
see why.

On 9/10/05, Gorka guardiola <paurea@gmail.com> wrote:
> It is almost the same thing. The difference is that one is 9P-9P
> and the other is 9P-syscall (read, write, ...).
> [...]
> On the other side, redirfs is much, much simpler.
* Re: [9fans] reliability and failing over.
  2005-09-09 23:02 ` Francisco Ballesteros
@ 2005-09-09 23:06 ` Russ Cox
  2005-09-09 23:16 ` Francisco Ballesteros
  2005-09-10 13:11 ` Sape Mullender
  1 sibling, 1 reply; 11+ messages in thread

From: Russ Cox @ 2005-09-09 23:06 UTC (permalink / raw)
To: Francisco Ballesteros, Fans of the OS Plan 9 from Bell Labs

> Plan B usage shows that latency is the biggest problem.
> I admit that only in unions, but if you join N different file servers
> in a directory, and then you dup the latency, it may be a problem.
> A workaround is just not to join too many servers, but it's a workaround,
> not a fix.

Of course, one could issue walks in parallel to all members of a union and
then you'd get rid of this particular problem. And doing so would be easier
in user space.

Russ
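The parallel-walk idea can be sketched with stand-in servers. Everything here
is invented for illustration — walkOne and the member lists are stubs; a real
version would send Twalk messages over 9P to each member:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// walkOne stands in for a Twalk to one member of the union; a real
// version would speak 9P to that server.
func walkOne(member, name string, files map[string][]string) bool {
	for _, f := range files[member] {
		if f == name {
			return true
		}
	}
	return false
}

// unionWalk issues the walk to every member concurrently, then honors
// union order by returning the earliest member that succeeded.  Latency
// becomes one round trip to the slowest member instead of the sum of
// round trips over all members.
func unionWalk(members []string, name string, files map[string][]string) (string, error) {
	found := make([]bool, len(members))
	var wg sync.WaitGroup
	for i, m := range members {
		wg.Add(1)
		go func(i int, m string) {
			defer wg.Done()
			found[i] = walkOne(m, name, files)
		}(i, m)
	}
	wg.Wait()
	for i, ok := range found {
		if ok {
			return members[i], nil
		}
	}
	return "", errors.New(name + ": not found in union")
}

func main() {
	// Invented member names and contents, for illustration only.
	files := map[string][]string{
		"srvA": {"lib", "bin"},
		"srvB": {"x10", "gui"},
	}
	m, err := unionWalk([]string{"srvA", "srvB"}, "x10", files)
	fmt.Println(m, err) // prints: srvB <nil>
}
```

One caveat: union directories have ordered precedence, so the sketch waits for
all members and then takes the earliest hit; a smarter version could return as
soon as every member earlier in the order has answered.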
* Re: [9fans] reliability and failing over.
  2005-09-09 23:06 ` Russ Cox
@ 2005-09-09 23:16 ` Francisco Ballesteros
  0 siblings, 0 replies; 11+ messages in thread

From: Francisco Ballesteros @ 2005-09-09 23:16 UTC (permalink / raw)
To: Russ Cox, Fans of the OS Plan 9 from Bell Labs

Yep, I stand corrected, thanks. But let's try the simplest (unoptimized)
version first. If it works well under real usage, I prefer a simpler (maybe
slower) version.

On 9/10/05, Russ Cox <rsc@swtch.com> wrote:
> Of course, one could issue walks in parallel to all members
> of a union and then you'd get rid of this particular problem.
> And doing so would be easier in user space.
* Re: [9fans] reliability and failing over.
  2005-09-09 23:02 ` Francisco Ballesteros
  2005-09-09 23:06 ` Russ Cox
@ 2005-09-10 13:11 ` Sape Mullender
  2005-09-10 21:36 ` Francisco Ballesteros
  1 sibling, 1 reply; 11+ messages in thread

From: Sape Mullender @ 2005-09-10 13:11 UTC (permalink / raw)
To: nemo, 9fans

> Beware of the latency.
>
> Plan B usage shows that latency is the biggest problem.

That's funny. We've been working on a low-latency wireless comms protocol here
at the Labs that we've called Plan B (our wireless emulator runs on Plan 9, of
course), because we'd come to the conclusion that latency is the biggest
problem in current cellular systems.

Sape
* Re: [9fans] reliability and failing over.
  2005-09-10 13:11 ` Sape Mullender
@ 2005-09-10 21:36 ` Francisco Ballesteros
  0 siblings, 0 replies; 11+ messages in thread

From: Francisco Ballesteros @ 2005-09-10 21:36 UTC (permalink / raw)
To: Fans of the OS Plan 9 from Bell Labs

Funny, I admit. We saw the problem when combining multiple user-level file
servers (not fossil, but device drivers for X10, GUIs, and the like) within
the same dir, especially on wireless at home.

Issuing requests in parallel as Russ suggested will help, for sure, but if
things don't get too bad, the simplest user-level file server will be a good
option. We just happen to have the kernel changed because I wanted to be able
to say "no overhead, except for the unions", which required a kernel to
measure. That's why I was considering the kernel changes besides the
user-level alternative.

Guess I have an opportunity to experiment with a few things now that it's
going to be a user program :-)

On 9/10/05, Sape Mullender <sape@plan9.bell-labs.com> wrote:
> That's funny. We've been working on a low-latency wireless comms protocol
> here at the Labs that we've called Plan B (our wireless emulator runs on
> Plan 9, of course), because we'd come to the conclusion that latency is
> the biggest problem in current cellular systems.
* Re: [9fans] reliability and failing over.
  2005-09-09 22:10 ` [9fans] reliability and failing over Francisco Ballesteros
  2005-09-09 22:24 ` Russ Cox
@ 2005-09-10  1:22 ` Eric Van Hensbergen
  1 sibling, 0 replies; 11+ messages in thread

From: Eric Van Hensbergen @ 2005-09-10 1:22 UTC (permalink / raw)
To: Francisco Ballesteros, Fans of the OS Plan 9 from Bell Labs

On 9/9/05, Francisco Ballesteros <nemo@lsub.org> wrote:
> Funny. The 9p reliability project looks to me a lot like the redirfs
> that we played with before introducing the Plan B volumes into the kernel.

Yes, Gorka and I talked about this when he first started down the path. There
were some differences that seemed important at the time, but I can't recall
what they were.

> Probably, the increase in latency you are seeing is the one I'm going to
> see in volfs. The 2x penalty in performance is what one could expect,
> because you have twice the latency. However, the in-kernel implementation
> has no penalty at all, because the kernel can rewrite the mount tables.
>
> Maybe we should talk about this.
> Eric? Russ? What do you say? Is it worth to pay the extra
> latency just to avoid a change (serious, I admit) in the kernel?

It's something important to figure out. I'll admit we are testing the
worst-case scenario here -- but the application we started the work for
(inter-partition communication) needs very low latency, so it's desirable to
eliminate as much overhead as possible. Still, it's worth running local-area
network tests, though I'm going to go for gigabit rather than 100Mbit. Gorka
and I will work on this next week while we finish up the recover paper.

It's quite likely I'll look at integrating something like this service into
my library-OS version of the 9P client and/or v9fs. Still not sure whether it
would make sense in Plan 9 proper or not.

-eric
end of thread, other threads: [~2005-09-10 21:36 UTC | newest]

Thread overview: 11+ messages
2005-09-09 20:55 [9fans] sept. town hall meeting Francisco Ballesteros
2005-09-09 21:05 ` Uriel
2005-09-09 22:10 ` [9fans] reliability and failing over Francisco Ballesteros
2005-09-09 22:24 ` Russ Cox
2005-09-09 22:50 ` Gorka guardiola
2005-09-09 23:02 ` Francisco Ballesteros
2005-09-09 23:06 ` Russ Cox
2005-09-09 23:16 ` Francisco Ballesteros
2005-09-10 13:11 ` Sape Mullender
2005-09-10 21:36 ` Francisco Ballesteros
2005-09-10  1:22 ` Eric Van Hensbergen