From mrhandle at outlook.com Wed Jan 2 08:21:26 2013 From: mrhandle at outlook.com (mrhandle) Date: Wed, 2 Jan 2013 19:21:26 +0300 Subject: [rust-dev] [Windows] Using result::get() causes "Entry Point Not Found" Message-ID: Hey all, I am trying Rust on Windows using the 0.5 release installer. So far I've been enjoying the language (especially the module system and the ffi. So convenient I love it). However I've hit a bug that is currently stopping me: result::get() causes "Entry Point Not Found" on Windows https://github.com/mozilla/rust/issues/4320 I'd like to see some confirmation that this bug is in Rust and is not caused by my configuration. Also, if someone can educate me on why this happens (as in what is wrong with the generated binary) that would be great. Thanks for the hard work, mrhandle From ben at benalpert.com Wed Jan 2 00:01:47 2013 From: ben at benalpert.com (Ben Alpert) Date: Wed, 2 Jan 2013 01:01:47 -0700 Subject: [rust-dev] Faster vec::each_permutation implementation Message-ID: Hi all, I was perusing the source code of vec::each_permutation and noticed the comment: > This does not seem like the most efficient implementation. You > could make far fewer copies if you put your mind to it. so I decided to try my hand at a faster version, which I have posted here: https://gist.github.com/48b14f37051e7b91c22c Code review would be appreciated. In particular, I did some awkward shenanigans to pass v around my recursive function called 'helper' in order to satisfy the ownership checker. Originally I had helper(v: &mut [T], ...) -> bool, but that causes an error because put takes a &[T]. It would be great if there was a simpler way to do this. For comparison, it seems to be around 100x faster on this trivial test: let mut i = 0; for each_permutation(~[1,2,3,4,5,6,7,8,9,10,11]) |_| { i += 1; } assert i == 39916800; Let me know whether it makes sense to open a pull request with this new implementation. Thanks, Ben P.S. 
I am puzzled by the fact that each_permutation is listed in the docs but it's not marked as pub so it appears to be inaccessible from outside the vec.rs file. From graydon at mozilla.com Wed Jan 2 15:40:08 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Wed, 02 Jan 2013 15:40:08 -0800 Subject: [rust-dev] REPL is broken in 0.5 In-Reply-To: <50D625C6.7000409@mozilla.com> References: <50D5298E.5060200@mozilla.com> <50D52C2A.7040002@mozilla.com> <50D53C4E.2090302@mozilla.com> <50D625C6.7000409@mozilla.com> Message-ID: <50E4C558.1020208@mozilla.com> On 12-12-22 01:27 PM, Patrick Walton wrote: > OK, I have fixed the REPL. I will look into respinning a 0.5.1 release > for this, if there are no objections. (Mentioned on IRC, but repeating here) The time taken to put together a release is still, sadly, quite nontrivial; I'd prefer not to divert resources to another point release that could be spent on more pressing issues (of which there are many -- nightly builds, easier releases, a PPA, packaging, perf work and whatnot among them). -Graydon From niko at alum.mit.edu Wed Jan 2 16:20:23 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 02 Jan 2013 16:20:23 -0800 Subject: [rust-dev] Faster vec::each_permutation implementation In-Reply-To: References: Message-ID: <50E4CEC7.1040006@alum.mit.edu> I think we should be able to do away with the unsafe code. I'll take a look and get back to you. Niko Ben Alpert wrote: > Hi all, > > I was perusing the source code of vec::each_permutation and noticed the comment: > >> This does not seem like the most efficient implementation. You >> could make far fewer copies if you put your mind to it. > > so I decided to try my hand at a faster version, which I have posted here: > > https://gist.github.com/48b14f37051e7b91c22c > > Code review would be appreciated. In particular, I did some awkward > shenanigans to pass v around my recursive function called 'helper' in > order to satisfy the ownership checker. 
Originally I had helper(v: > &mut [T], ...) -> bool, but that causes an error because put takes a > &[T]. It would be great if there was a simpler way to do this. > > For comparison, it seems to be around 100x faster on this trivial test: > > let mut i = 0; > for each_permutation(~[1,2,3,4,5,6,7,8,9,10,11]) |_| { i += 1; } > assert i == 39916800; > > Let me know whether it makes sense to open a pull request with this > new implementation. > > Thanks, > Ben > > P.S. I am puzzled by the fact that each_permutation is listed in the > docs but it's not marked as pub so it appears to be inaccessible from > outside the vec.rs file. > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko at alum.mit.edu Wed Jan 2 16:27:11 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 02 Jan 2013 16:27:11 -0800 Subject: [rust-dev] Faster vec::each_permutation implementation In-Reply-To: References: Message-ID: <50E4D05F.7000204@alum.mit.edu> Here is a revised version: https://gist.github.com/4439672 The only real change is the workaround for the lack of mutable function arguments. Basically instead of a parameter `v: ~[mut T]` I changed it to `v: ~[T]` and added `let mut v = v`. This takes advantage of inherited mutability. It also means there is no need for unsafe code because the compiler can see that the vector is owned by the permutation function and thus allows it to be temporarily declared as immutable during the callback. Niko Ben Alpert wrote: > Hi all, > > I was perusing the source code of vec::each_permutation and noticed the comment: > >> This does not seem like the most efficient implementation. You >> could make far fewer copies if you put your mind to it. 
> > so I decided to try my hand at a faster version, which I have posted here: > > https://gist.github.com/48b14f37051e7b91c22c > > Code review would be appreciated. In particular, I did some awkward > shenanigans to pass v around my recursive function called 'helper' in > order to satisfy the ownership checker. Originally I had helper(v: > &mut [T], ...) -> bool, but that causes an error because put takes a > &[T]. It would be great if there was a simpler way to do this. > > For comparison, it seems to be around 100x faster on this trivial test: > > let mut i = 0; > for each_permutation(~[1,2,3,4,5,6,7,8,9,10,11]) |_| { i += 1; } > assert i == 39916800; > > Let me know whether it makes sense to open a pull request with this > new implementation. > > Thanks, > Ben > > P.S. I am puzzled by the fact that each_permutation is listed in the > docs but it's not marked as pub so it appears to be inaccessible from > outside the vec.rs file. > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From graydon at mozilla.com Wed Jan 2 16:44:25 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Wed, 02 Jan 2013 16:44:25 -0800 Subject: [rust-dev] RFC: "impl Trait for Type" In-Reply-To: <50E11628.7030105@mozilla.com> References: <50E11628.7030105@mozilla.com> Message-ID: <50E4D469.9010805@mozilla.com> On 12-12-30 08:35 PM, Patrick Walton wrote: > The way to fix this is to change the code to: > > fn main() { > let x = Constructible::new(); > } > > This state of affairs strikes me as confusing to C++/Java/C# users, who > might be expecting the methods to be available in `Foo` as well as the > `Constructible` interface. 
Indeed this was the case in earlier versions > of Rust (and is the case today in some cases), but I worry that this > feature is dangerous, as we might want our type to later implement a > different typeclass with a different method called `new` and there would > then be a name clash. This is a bit of a gotcha, true. It looks a little less zany if you write Constructible::new::(). I don't think there's anything much you can do to make it less surprising, in terms of the decl site. Users are going to expect _static_ methods to be scoped to the type, not the trait, no matter how you declare it. I agree with your rationale for why _not_ to do that, but I don't think you can avoid the surprise. Type::static_method() is how it works in the languages you're concerned with users from (it's why the "factory pattern" exists!) and IMO anyone who knows that will be surprised by Trait::static_method() with the concrete type inferred (or parameterized). It surprises me every time I see it, despite my belief that it's a superior feature :) > I understand that there was an objection last time this syntax was > brought up that our trait syntax in generics is `T:Trait`, and therefore > it's more consistent to say `impl T : Trait`. However, it seems to me > that the situation back then is different from the situation now in > these ways: > > ... These are interesting but, to my eyes, don't make a clear case for any particular arrangement; as I said above, I think twiddling the decl site isn't going to make it any less of a toe-stub. And I still find the "mirror the generic type-constraint form" argument most compelling for the decl. Minor preference, anyway. Not a hill I'll die on, but traits as type-constraints seems to me like a good hint. 
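(Editorial sketch, not part of the original message: the `Constructible::new()` situation under discussion can be illustrated in modern Rust syntax — the trait name `Constructible` comes from the thread, while `Foo` and its field are hypothetical. Today's Rust ended up allowing all three call forms shown below, which resolves the scoping surprise Graydon describes.)

```rust
// A trait with a static (associated) function, as in Patrick's example.
trait Constructible {
    fn new() -> Self;
}

struct Foo(u32);

impl Constructible for Foo {
    fn new() -> Self {
        Foo(0)
    }
}

fn main() {
    // Called through the trait, with Self inferred from the annotation:
    let a: Foo = Constructible::new();
    // Or with the concrete type named explicitly (the modern spelling of
    // the Constructible::new::<Foo>() form mentioned above):
    let b = <Foo as Constructible>::new();
    // Modern Rust also scopes it to the type, matching C++/Java intuition:
    let c = Foo::new();
    assert_eq!(a.0 + b.0 + c.0, 0);
}
```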
-Graydon From ben at benalpert.com Wed Jan 2 16:45:08 2013 From: ben at benalpert.com (Ben Alpert) Date: Wed, 2 Jan 2013 17:45:08 -0700 Subject: [rust-dev] Faster vec::each_permutation implementation In-Reply-To: <50E4D05F.7000204@alum.mit.edu> References: <50E4D05F.7000204@alum.mit.edu> Message-ID: Is there any way to avoid passing v around all the time to make the code cleaner? After I sent the message last night, I attempted to make the helper take v: & ~mut [T] instead (or &mut ~[T] based on your last change): https://gist.github.com/0362eef759791558d13d but this gives the error "illegal borrow unless pure: creating immutable alias to mutable vec content" on the put(*v). It seems to me like this is safe so I filed https://github.com/mozilla/rust/issues/4331 but please correct me if I'm wrong here. I suppose it's probably not possible to get rid of the unsafe block with that approach. Ben On Wed, Jan 2, 2013 at 5:27 PM, Niko Matsakis wrote: > Here is a revised version: > > https://gist.github.com/4439672 > > The only real change is the workaround for the lack of mutable function > arguments. Basically instead of a parameter `v: ~[mut T]` I changed it to > `v: ~[T]` and added `let mut v = v`. This takes advantage of inherited > mutability. It also means there is no need for unsafe code because the > compiler can see that the vector is owned by the permutation function and > thus allows it to be temporarily declared as immutable during the callback. > > > > Niko > > Ben Alpert wrote: > > Hi all, > > I was perusing the source code of vec::each_permutation and noticed the > comment: > > This does not seem like the most efficient implementation. You > could make far fewer copies if you put your mind to it. > > so I decided to try my hand at a faster version, which I have posted here: > > https://gist.github.com/48b14f37051e7b91c22c > > Code review would be appreciated. 
In particular, I did some awkward > shenanigans to pass v around my recursive function called 'helper' in > order to satisfy the ownership checker. Originally I had helper(v: > &mut [T], ...) -> bool, but that causes an error because put takes a > &[T]. It would be great if there was a simpler way to do this. > > For comparison, it seems to be around 100x faster on this trivial test: > > let mut i = 0; > for each_permutation(~[1,2,3,4,5,6,7,8,9,10,11]) |_| { i += 1; } > assert i == 39916800; > > Let me know whether it makes sense to open a pull request with this > new implementation. > > Thanks, > Ben > > P.S. I am puzzled by the fact that each_permutation is listed in the > docs but it's not marked as pub so it appears to be inaccessible from > outside the vec.rs > file. > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From niko at alum.mit.edu Wed Jan 2 17:13:33 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 02 Jan 2013 17:13:33 -0800 Subject: [rust-dev] Faster vec::each_permutation implementation In-Reply-To: References: <50E4D05F.7000204@alum.mit.edu> Message-ID: <50E4DB3D.30301@alum.mit.edu> It is not currently possible to do it without threading the vector through. However, if we implement the so-called INHTPAMA changes I proposed here: http://www.smallcultfollowing.com/babysteps/blog/2012/11/18/imagine-never-hearing-the-phrase-aliasable/ then it would work fine with &mut. Niko Ben Alpert wrote: > Is there any way to avoid passing v around all the time to make the > code cleaner? > > After I sent the message last night, I attempted to make the helper > take v:& ~mut [T] instead (or&mut ~[T] based on your last change): > > https://gist.github.com/0362eef759791558d13d > > but this gives the error "illegal borrow unless pure: creating > immutable alias to mutable vec content" on the put(*v). 
It seems to me > like this is safe so I filed > > https://github.com/mozilla/rust/issues/4331 > > but please correct me if I'm wrong here. > > I suppose it's probably not possible to get rid of the unsafe block > with that approach. > > Ben > > On Wed, Jan 2, 2013 at 5:27 PM, Niko Matsakis wrote: >> Here is a revised version: >> >> https://gist.github.com/4439672 >> >> The only real change is the workaround for the lack of mutable function >> arguments. Basically instead of a parameter `v: ~[mut T]` I changed it to >> `v: ~[T]` and added `let mut v = v`. This takes advantage of inherited >> mutability. It also means there is no need for unsafe code because the >> compiler can see that the vector is owned by the permutation function and >> thus allows it to be temporarily declared as immutable during the callback. >> >> >> >> Niko >> >> Ben Alpert wrote: >> >> Hi all, >> >> I was perusing the source code of vec::each_permutation and noticed the >> comment: >> >> This does not seem like the most efficient implementation. You >> could make far fewer copies if you put your mind to it. >> >> so I decided to try my hand at a faster version, which I have posted here: >> >> https://gist.github.com/48b14f37051e7b91c22c >> >> Code review would be appreciated. In particular, I did some awkward >> shenanigans to pass v around my recursive function called 'helper' in >> order to satisfy the ownership checker. Originally I had helper(v: >> &mut [T], ...) -> bool, but that causes an error because put takes a >> &[T]. It would be great if there was a simpler way to do this. >> >> For comparison, it seems to be around 100x faster on this trivial test: >> >> let mut i = 0; >> for each_permutation(~[1,2,3,4,5,6,7,8,9,10,11]) |_| { i += 1; } >> assert i == 39916800; >> >> Let me know whether it makes sense to open a pull request with this >> new implementation. >> >> Thanks, >> Ben >> >> P.S. 
I am puzzled by the fact that each_permutation is listed in the >> docs but it's not marked as pub so it appears to be inaccessible from >> outside the vec.rs >> file. >> _______________________________________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/listinfo/rust-dev From vadimcn at gmail.com Wed Jan 2 19:09:49 2013 From: vadimcn at gmail.com (Vadim) Date: Wed, 2 Jan 2013 19:09:49 -0800 Subject: [rust-dev] Rust and C++ Message-ID: Hi, Is there any expectation that Rust's traits would be binary-compatible with C++ vtables? I expect the answer is "no", but just in case... Reason I'm asking: I am toying with the idea of using Windows COM components from Rust. Has anyone here done anything like that in the past? Vadim -------------- next part -------------- An HTML attachment was scrubbed... URL: From pwalton at mozilla.com Wed Jan 2 19:23:42 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Wed, 02 Jan 2013 19:23:42 -0800 Subject: [rust-dev] Rust and C++ In-Reply-To: References: Message-ID: <50E4F9BE.9000906@mozilla.com> On 1/2/13 7:09 PM, Vadim wrote: > Hi, > Is there any expectation that Rust's traits would be binary-compatible > with C++ vtables? I expect the answer is "no", but just in case... No, we aren't committing to any stable API. > Reason I'm asking: I am toying with the idea of using Windows COM > components from Rust. Has anyone here done anything like that in the past? For Servo we have bindings to Core Foundation on the Mac, which is a COM-like system. We implement the Clone trait and use structs with destructors in order to enforce the proper use of reference counting. 
You can find the code here: https://github.com/mozilla-servo/rust-core-foundation Patrick From pwalton at mozilla.com Wed Jan 2 19:24:00 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Wed, 02 Jan 2013 19:24:00 -0800 Subject: [rust-dev] Rust and C++ In-Reply-To: <50E4F9BE.9000906@mozilla.com> References: <50E4F9BE.9000906@mozilla.com> Message-ID: <50E4F9D0.2020008@mozilla.com> On 1/2/13 7:23 PM, Patrick Walton wrote: > On 1/2/13 7:09 PM, Vadim wrote: >> Hi, >> Is there any expectation that Rust's traits would be binary-compatible >> with C++ vtables? I expect the answer is "no", but just in case... > > No, we aren't committing to any stable API. That should say "ABI". Patrick From vadimcn at gmail.com Thu Jan 3 17:05:53 2013 From: vadimcn at gmail.com (Vadim) Date: Thu, 3 Jan 2013 17:05:53 -0800 Subject: [rust-dev] Rust and C++ In-Reply-To: <50E4F9BE.9000906@mozilla.com> References: <50E4F9BE.9000906@mozilla.com> Message-ID: Thanks for the pointer! So now I'm trying to use Rust macros to generate all required COM artifacts from a single interface definition: macro_rules! com_interface( ( $ifc_name:ident { $( fn $fn_name:ident ( $( $param:ident : $ptype:ty ),* ) -> $tres:ty );+ ; } ) => ( trait $ifc_name { $( fn $fn_name ( $($param : $ptype),* ) -> $tres );+ ; } struct concat_idents!($ifc_name, _vtable) { $( $fn_name : *fn ( this: **concat_idents!($ifc_name, _vtable), $($param : $ptype),* ) -> $tres ),+ } ) ) com_interface!( IInterface { fn Foo(a : int, b : int, c : ~str) -> (); fn Bar(a : int) -> int; //fn Baz() -> (); } ) This is supposed to produce the following code: trait IInterface { fn Foo(a : int, b : int, c : ~str) -> (); fn Bar(a : int) -> int; } struct IInterface_vtable { Foo : *fn(this : **IInterface_vtable, a : int, b : int, c : ~str) -> (), Bar : *fn(this : **IInterface_vtable, a : int) -> int } I am having several problems with this: 1. 
Apparently this only generates the trait block and silently discards the struct portion of expansion (according to output of "rustc --pretty expanded test_macro.rs") 2. I can't figure out the syntax of concat_idents... rustc says: test_macro.rs:18:22: 18:23 error: expected `{`, `(`, or `;` after struct name but found `!` test_macro.rs:18 struct concat_idents!($ifc_name, _vtable) (comment out the trait block to see this in action). 3. This macro doesn't work with parameter-less functions. Uncommenting Baz() in the interface definition above produces: test_macro.rs:29:9: 29:10 error: Local ambiguity: multiple parsing options: built-in NTs ident ('param') or 1 other options. test_macro.rs:29 fn Baz() -> (); 4. Even if Baz were parsed correctly, I would probably still have a problem with expansion because of the extra trailing comma after "this" parameter. Any ideas how to work around that? thanks! On Wed, Jan 2, 2013 at 7:23 PM, Patrick Walton wrote: > On 1/2/13 7:09 PM, Vadim wrote: > >> Hi, >> Is there any expectation that Rust's traits would be binary-compatible >> with C++ vtables? I expect the answer is "no", but just in case... >> > > No, we aren't committing to any stable API. > > Reason I'm asking: I am toying with the idea of using Windows COM >> components from Rust. Has anyone here done anything like that in the >> past? >> > > For Servo we have bindings to Core Foundation on the Mac, which is a > COM-like system. We implement the Clone trait and use structs with > destructors in order to enforce the proper use of reference counting. You > can find the code here: > > https://github.com/mozilla-**servo/rust-core-foundation > > Patrick > > ______________________________**_________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/**listinfo/rust-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
Apparently this only generates the trait block and silently discards the struct portion of expansion (according to output of "rustc --pretty expanded test_macro.rs") 2. I can't figure out the syntax of concat_idents... rustc says: test_macro.rs:18:22: 18:23 error: expected `{`, `(`, or `;` after struct name but found `!` test_macro.rs:18 struct concat_idents!($ifc_name, _vtable) (comment out the trait block to see this in action). 3. This macro doesn't work with parameter-less functions. Uncommenting Baz() in the interface definition above produces: test_macro.rs:29:9: 29:10 error: Local ambiguity: multiple parsing options: built-in NTs ident ('param') or 1 other options. test_macro.rs:29 fn Baz() -> (); 4. Even if Baz were parsed correctly, I would probably still have a problem with expansion because of the extra trailing comma after "this" parameter. Any ideas how to work around that? thanks! On Wed, Jan 2, 2013 at 7:23 PM, Patrick Walton wrote: > On 1/2/13 7:09 PM, Vadim wrote: > >> Hi, >> Is there any expectation that Rust's traits would be binary-compatible >> with C++ vtables? I expect the answer is "no", but just in case... >> > > No, we aren't committing to any stable API. > > Reason I'm asking: I am toying with the idea of using Windows COM >> components from Rust. Has anyone here done anything like that in the >> past? >> > > For Servo we have bindings to Core Foundation on the Mac, which is a > COM-like system. We implement the Clone trait and use structs with > destructors in order to enforce the proper use of reference counting. You > can find the code here: > > https://github.com/mozilla-servo/rust-core-foundation > > Patrick > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pwalton at mozilla.com Sat Jan 5 13:37:37 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Sat, 5 Jan 2013 13:37:37 -0800 Subject: [rust-dev] Plans for improving compiler performance Message-ID: <50E89BC5.4090208@mozilla.com> I thought I'd begin a discussion as to how to improve performance of the rustc compiler. First off, I do not see any way we can improve *general* compiler performance by an order of magnitude. The language is simply designed to favor safety and expressivity over compile-time performance. Other than code generation, typechecking is by far the longest pass for large Rust. But there is an upper bound on how fast we can make typechecking, because the language requires subtyping, generics, and a variant of Hindley-Milner type inference. This means that these common tricks cannot be used: 1. Fast C-like typechecking won't work because we need to solve for type variables. For instance, the type of `let x = [];` or `let y = None;` is determined from use, unlike for example C++, Java, C#, or Go. 2. Fast ML-like "type equality can be determined with a pointer comparison" tricks will not work, because we have subtyping and must recur on type structure to unify. 3. Nominal types in general cannot be represented as a simple integer "class ID", as in early Java. They require a heap-allocated vector to represent the type parameter substitutions. In general, the low-hanging fruit for general compiler performance is mostly picked at this point. I would put an upper bound of compiler performance improvements for all stages of a self-hosted build of the Rust compiler at 20% or so. The reasons for this are: 1. Typechecking and LLVM code generation are mostly optimal. When compiling `rustc`, the time spent in these two passes dwarfs all the others. Typechecking cannot be algorithmically improved, and LLVM code generation is about as straightforward as it can possibly be. 
The remaining performance issues in these two passes are generally due to allocating too much, but allocation and freeing in Rust is no more than 15% of the compile time. Thus even if we spent all our time on the allocator and got its cost down to a theoretical zero, we would only improve performance by 15% or so. 2. LLVM optimizations end up dominating compile time when they're turned on (75% of compile time). However, the Rust compiler, like most Rust (or C++) code, is dependent on LLVM optimizations for good performance. So if you turn off optimization, you have a slow compiler. But if you turn on optimization, the vast majority of your self-hosting time is spent in LLVM optimizations. The obvious way around this catch-22 is to spend a lot of time manually writing the optimizations that LLVM would have performed into our compiler in order to improve performance at -O0, but I don't think that's a particularly good use of our time, and it would hurt the compiler's maintainability. There are, however, some more situational things we can do. # General code generation performance * We can make `~"string"` allocations (and some others, like ~[ 1, 2, 3, 4, 5 ]) turn into calls to the moral equivalent of `strdup`. This improves some workloads, such as the PGP key in cargo (which should just be a constant in the first place). `rustc` still allocates a lot of strings like this, so this might improve the LLVM portion of `rustc`'s compilation speed. * Visitor glue should be optional; you should have to opt into its generation, like Haskell's `Data.Typeable`. This would potentially remove 15% of our code size and improve our code generation performance by a similar amount, but, as Graydon points out, it is needed for precise-on-the-heap GC. Perhaps we could use conservative GC at -O0, and thus reduce the amount of visitor glue we need to generate for unoptimized builds. 
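(Editorial sketch, not part of the original message: the `~"string"` point above — turning owned-string construction into "the moral equivalent of `strdup`" — can be illustrated in modern Rust syntax, since the 2013 `~"string"` form no longer exists. This is only an analogy for the proposed lowering, not the compiler's actual code generation; the function names are hypothetical.)

```rust
// Naive construction: one iterator step per byte.
fn build_slow() -> Vec<u8> {
    "string".bytes().collect()
}

// strdup-like construction: a single allocation plus one memcpy
// of the constant bytes out of static data.
fn build_fast() -> Vec<u8> {
    b"string".to_vec()
}

fn main() {
    // Both produce the same owned buffer; the second does so without
    // walking the literal element by element.
    assert_eq!(build_slow(), build_fast());
}
```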
# -O0 performance For -O0 (which is the default), we get kicked off LLVM's fast instruction selector far too often. We need to stop generating the instructions that cause LLVM to bail out to the slow SelectionDAG path at -O0. This only affects -O0, but since that's the most common case that matters for compilation speed, that's fine. Note that these optimizations are severely limited in what they can do for self-hosting performance, for the reasons stated above. * Invoke instructions cause FastISel bailouts. This means that we can't use the C++ exception infrastructure for task failure if we want fast builds. Graydon has work on an optional return-value-based unwinding mode which is nearing completion. I have a patch in review for a "disable_unwinding" flag, which disables unwinding for failure; this should be safe to turn on for libsyntax and librustc, since they have no need to handle failure gracefully, and doing so improves compile-time -O0 LLVM performance by 1.9x. * Visitor glue used to cause FastISel bailouts, but this is fixed in incoming. * Switch instructions cause FastISel bailouts. Pattern matching on enums (and sometimes on integers too) generates these. Drop and take glue on enums generates these too. This shouldn't be too hard to fix. * Integer division instructions result in FastISel bailouts on x86. We generate a lot of these due to the fact that our vector lengths are in bytes. We could change that, or we could try to hack LLVM, or we could turn integer divisions into function calls to libcore on -O0. (Note that integer division turns into a function call *anyway* on ARM, since ARM has no integer divide instruction. So I'm inclined to try the last one.) # Memory allocation performance Our memory allocation is suboptimal in several ways. I do not think that improving it will improve compiler performance as long as you aren't already swapping, but I'll list them anyway. * We do not have our own allocator; we just use the system malloc. 
However, we need to trace all allocations, to clean up @ cycles on task death. So we thread all allocations into a doubly-linked list. This is a huge waste of memory for the next and previous pointers. We could fix this by using an allocator that allows us to trace allocations. I would be surprised if fixing this had a huge impact in performance, but maybe it would bump some allocations that were previously in higher storage classes into the TINY class, which generally has a fast path in the allocator. And, of course, it would reduce swapping when self-hosting if you don't have enough memory. * We don't clean up @ cycles until task death. Fixing this will, in all likelihood, worsen the compiler's performance. However, its memory usage will improve. * ~ allocations don't really need to be linked into any list or be traceable, *unless* they contain @ pointers, at which point they do need to be traceable. Fixing this will improve memory usage and improve performance by a negligible amount. # External metadata We currently read external crate metadata in its entirety for external crates during a few phases of the compiler. This dominates the compilation time of small programs only, as in larger programs such as rustc, the cost quickly shrinks to nothing compared to the larger compilation. However, since newcomers to Rust generally compile small programs, this is most of the cost they see. Also, this constitutes the majority of the time that our test suite takes. Finally, this is the performance bottleneck for the REPL. Improving this will not improve the compilation speed of self-hosting by more than 1%. The biggest benefit of fixing this is that small programs will appear to compile instantly, which improves the first impressions of Rust a lot for those used to fast builds in other languages. * External metadata reading takes a long time (0.3 s). I'm not sure whether all of this is necessary, as I'm not too familiar with this pass. 
* Language item collection reads all the items in external crates to look for language items (another 0.3 s). This is silly and is easy to fix; we just add a new table to the metadata that specifies def IDs for language items. * Name resolution has to read all the items in external crates (another 0.3 s). This was the easiest way to approximate the 0.5 name resolution semantics. (The actual semantics were basically unimplementable, but this algorithm got close enough to work in practice -- usually.) With the new semantics in Rust 0.6 we should be able to do better here and avoid reading modules until they're actually referenced. Unfortunately, fixing this will require rewriting resolve, which is a month's worth of work. # Stack switching * We could run rustc with a large stack and avoid stack switching. This is functionality we need for Servo anyway. This might improve compiler performance by 1% or so. None of these optimizations will improve the `rustc` self-hosting time by anything approaching an order of magnitude. However, I think they could have a positive impact on the experience for newcomers to Rust. Patrick From pwalton at mozilla.com Sat Jan 5 16:20:46 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Sat, 5 Jan 2013 16:20:46 -0800 Subject: [rust-dev] Plans for improving compiler performance In-Reply-To: <50E89BC5.4090208@mozilla.com> References: <50E89BC5.4090208@mozilla.com> Message-ID: <50E8C1F7.30601@mozilla.com> I realized I forgot some things. # Garbage collection * Not generating reference count manipulations (which would be possible if we switched to tracing GC) improves code generation speed. I created a test that resulted in a 22.5% improvement in LLVM pass speed. Not an order of magnitude difference, but it's nice. # FastISel * There is another issue: `i1` parameters cause FastISel bailouts. These would be our `bool` type. Probably the solution here is to translate our `bool` as `i8` instead of `i1`. 
Patrick From james at mansionfamily.plus.com Sun Jan 6 08:22:13 2013 From: james at mansionfamily.plus.com (james) Date: Sun, 06 Jan 2013 16:22:13 +0000 Subject: [rust-dev] Plans for improving compiler performance In-Reply-To: <50E89BC5.4090208@mozilla.com> References: <50E89BC5.4090208@mozilla.com> Message-ID: <50E9A4B5.4090602@mansionfamily.plus.com> Could you use multiple threads to type check and code gen in parallel? Could you retain information from a previous run of the compiler and reuse it (especially for code generation)? From niko at alum.mit.edu Sun Jan 6 09:16:37 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Sun, 06 Jan 2013 09:16:37 -0800 Subject: [rust-dev] Plans for improving compiler performance In-Reply-To: <50E89BC5.4090208@mozilla.com> References: <50E89BC5.4090208@mozilla.com> Message-ID: <50E9B175.3010200@alum.mit.edu> I basically agree with everything you said but I still think we'll be able to improve performance quite a bit over time. I guess I'd say there is a lot of "mid-level hanging fruit" to be picked. Perhaps our type checker is optimal in the Big O sense, but I am quite confident that the hidden constants are much larger than they have to be. Here are some relatively simple things we could do to improve performance in type-checking, off the top of my head: - Right now I think we allocate a fair number of empty vectors. If we used @[] or Option<~[]>, we could avoid allocation on the empty case and lower overall memory use. - Convert structural records in the compiler to structs. - Caching for method lookup results (*) and possibly other similar things. - More use of arenas (eventually, we're not quite ready for this yet). Longer term, we could refactor the compiler to support parallel compilation and to track dependencies. Niko (*) This is non-trivial. But right now we definitely do a lot of work for every method lookup and I'm certain we could cache some of it. 
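(Editorial sketch, not part of the original message: Niko's first bullet — make the empty-vector case cost nothing, e.g. via `@[]` or `Option<~[]>` — maps onto guarantees today's Rust provides directly, shown here in modern syntax since the 2013 `@`/`~` types are gone.)

```rust
use std::mem::size_of;

fn main() {
    // An empty Vec performs no heap allocation at all in modern Rust,
    // so the empty case is already free.
    let empty: Vec<u32> = Vec::new();
    assert_eq!(empty.capacity(), 0);

    // And the Option<~[]> idea costs no extra space either: the non-null
    // pointer inside Box provides a niche, so None reuses the null
    // representation instead of adding a discriminant word.
    assert_eq!(size_of::<Option<Box<[u32]>>>(), size_of::<Box<[u32]>>());
}
```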
Patrick Walton wrote: > I thought I'd begin a discussion as to how to improve performance of > the rustc compiler. > > First off, I do not see any way we can improve *general* compiler > performance by an order of magnitude. The language is simply designed > to favor safety and expressivity over compile-time performance. Other > than code generation, typechecking is by far the longest pass for > large Rust programs. But there is an upper bound on how fast we can make > typechecking, because the language requires subtyping, generics, and a > variant of Hindley-Milner type inference. This means that these common > tricks cannot be used: > > 1. Fast C-like typechecking won't work because we need to solve for > type variables. For instance, the type of `let x = [];` or `let y = > None;` is determined from use, unlike for example C++, Java, C#, or Go. > > 2. Fast ML-like "type equality can be determined with a pointer > comparison" tricks will not work, because we have subtyping and must > recur on type structure to unify. > > 3. Nominal types in general cannot be represented as a simple integer > "class ID", as in early Java. They require a heap-allocated vector to > represent the type parameter substitutions. > > In general, the low-hanging fruit for general compiler performance is > mostly picked at this point. I would put an upper bound of compiler > performance improvements for all stages of a self-hosted build of the > Rust compiler at 20% or so. The reasons for this are: > > 1. Typechecking and LLVM code generation are mostly optimal. When > compiling `rustc`, the time spent in these two passes dwarfs all the > others. Typechecking cannot be algorithmically improved, and LLVM code > generation is about as straightforward as it can possibly be. The > remaining performance issues in these two passes are generally due to > allocating too much, but allocation and freeing in Rust is no more > than 15% of the compile time.
Thus even if we spent all our time on > the allocator and got its cost down to a theoretical zero, we would > only improve performance by 15% or so. > > 2. LLVM optimizations end up dominating compile time when they're > turned on (75% of compile time). However, the Rust compiler, like most > Rust (or C++) code, is dependent on LLVM optimizations for good > performance. So if you turn off optimization, you have a slow > compiler. But if you turn on optimization, the vast majority of your > self-hosting time is spent in LLVM optimizations. The obvious way > around this catch-22 is to spend a lot of time manually writing the > optimizations that LLVM would have performed into our compiler in > order to improve performance at -O0, but I don't think that's a > particularly good use of our time, and it would hurt the compiler's > maintainability. > > There are, however, some more situational things we can do. > > # General code generation performance > > * We can make `~"string"` allocations (and some others, like ~[ 1, 2, > 3, 4, 5 ]) turn into calls to the moral equivalent of `strdup`. This > improves some workloads, such as the PGP key in cargo (which should > just be a constant in the first place). `rustc` still allocates a lot > of strings like this, so this might improve the LLVM portion of > `rustc`'s compilation speed. > > * Visitor glue should be optional; you should have to opt into its > generation, like Haskell's `Data.Typeable`. This would potentially > remove 15% of our code size and improve our code generation > performance by a similar amount, but, as Graydon points out, it is > needed for precise-on-the-heap GC. Perhaps we could use conservative > GC at -O0, and thus reduce the amount of visitor glue we need to > generate for unoptimized builds. > > # -O0 performance > > For -O0 (which is the default), we get kicked off LLVM's fast > instruction selector far too often. 
We need to stop generating the > instructions that cause LLVM to bail out to the slow SelectionDAG path > at -O0. > > This only affects -O0, but since that's the most common case that > matters for compilation speed, that's fine. Note that these > optimizations are severely limited in what they can do for > self-hosting performance, for the reasons stated above. > > * Invoke instructions cause FastISel bailouts. This means that we > can't use the C++ exception infrastructure for task failure if we want > fast builds. Graydon has work on an optional return-value-based > unwinding mode which is nearing completion. I have a patch in review > for a "disable_unwinding" flag, which disables unwinding for failure; > this should be safe to turn on for libsyntax and librustc, since they > have no need to handle failure gracefully, and doing so improves > compile-time -O0 LLVM performance by 1.9x. > > * Visitor glue used to cause FastISel bailouts, but this is fixed in > incoming. > > * Switch instructions cause FastISel bailouts. Pattern matching on > enums (and sometimes on integers too) generates these. Drop and take > glue on enums generates these too. This shouldn't be too hard to fix. > > * Integer division instructions result in FastISel bailouts on x86. We > generate a lot of these due to the fact that our vector lengths are in > bytes. We could change that, or we could try to hack LLVM, or we could > turn integer divisions into function calls to libcore on -O0. (Note > that integer division turns into a function call *anyway* on ARM, > since ARM has no integer divide instruction. So I'm inclined to try > the last one.) > > # Memory allocation performance > > Our memory allocation is suboptimal in several ways. I do not think > that improving it will improve compiler performance as long as you > aren't already swapping, but I'll list them anyway. > > * We do not have our own allocator; we just use the system malloc. 
> However, we need to trace all allocations, to clean up @ cycles on > task death. So we thread all allocations into a doubly-linked list. > This is a huge waste of memory for the next and previous pointers. We > could fix this by using an allocator that allows us to trace allocations. > > I would be surprised if fixing this had a huge impact in performance, > but maybe it would bump some allocations that were previously in > higher storage classes into the TINY class, which generally has a fast > path in the allocator. And, of course, it would reduce swapping when > self-hosting if you don't have enough memory. > > * We don't clean up @ cycles until task death. Fixing this will, in > all likelihood, worsen the compiler's performance. However, its memory > usage will improve. > > * ~ allocations don't really need to be linked into any list or be > traceable, *unless* they contain @ pointers, at which point they do > need to be traceable. Fixing this will improve memory usage and > improve performance by a negligible amount. > > # External metadata > > We currently read external crate metadata in its entirety for external > crates during a few phases of the compiler. This dominates the > compilation time of small programs only, as in larger programs such as > rustc, the cost quickly shrinks to nothing compared to the larger > compilation. However, since newcomers to Rust generally compile small > programs, this is most of the cost they see. Also, this constitutes > the majority of the time that our test suite takes. Finally, this is > the performance bottleneck for the REPL. > > Improving this will not improve the compilation speed of self-hosting > by more than 1%. The biggest benefit of fixing this is that small > programs will appear to compile instantly, which improves the first > impressions of Rust a lot for those used to fast builds in other > languages. > > * External metadata reading takes a long time (0.3 s). 
I'm not sure > whether all of this is necessary, as I'm not too familiar with this pass. > > * Language item collection reads all the items in external crates to > look for language items (another 0.3 s). This is silly and is easy to > fix; we just add a new table to the metadata that specifies def IDs > for language items. > > * Name resolution has to read all the items in external crates > (another 0.3 s). This was the easiest way to approximate the 0.5 name > resolution semantics. (The actual semantics were basically > unimplementable, but this algorithm got close enough to work in > practice -- usually.) With the new semantics in Rust 0.6 we should be > able to do better here and avoid reading modules until they're > actually referenced. Unfortunately, fixing this will require rewriting > resolve, which is a month's worth of work. > > # Stack switching > > * We could run rustc with a large stack and avoid stack switching. > This is functionality we need for Servo anyway. This might improve > compiler performance by 1% or so. > > None of these optimizations will improve the `rustc` self-hosting time > by anything approaching an order of magnitude. However, I think they > could have a positive impact on the experience for newcomers to Rust. > > Patrick > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From lindsey at rockstargirl.org Sun Jan 6 09:50:07 2013 From: lindsey at rockstargirl.org (Lindsey Kuper) Date: Sun, 6 Jan 2013 12:50:07 -0500 Subject: [rust-dev] Plans for improving compiler performance In-Reply-To: <50E89BC5.4090208@mozilla.com> References: <50E89BC5.4090208@mozilla.com> Message-ID: What stands in the way of doing incremental compilation? 
From pwalton at mozilla.com Sun Jan 6 10:29:30 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Sun, 6 Jan 2013 10:29:30 -0800 Subject: [rust-dev] Plans for improving compiler performance In-Reply-To: References: <50E89BC5.4090208@mozilla.com> Message-ID: <50E9C135.8040001@mozilla.com> On 1/6/13 9:50 AM, Lindsey Kuper wrote: > What stands in the way of doing incremental compilation? Personally, I think our time would be better spent making it easy to break large projects up into multiple finer-grained crates. We should be able to tell cargo and/or the work-in-progress `fbuild` workalike to compile a crate and rebuild all of its dependent crates (if they were modified) in one command. This strikes me as more reliable than incremental compilation, because crate structure enforces a DAG. What worries me with incremental compilation is that we'll do a lot of work to make it work, then we'll discover that in practice intra-crate dependencies are so intertwined that most changes result in a full rebuild anyway. Patrick From niko at alum.mit.edu Sun Jan 6 10:34:45 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Sun, 06 Jan 2013 10:34:45 -0800 Subject: [rust-dev] Plans for improving compiler performance In-Reply-To: <50E9B175.3010200@alum.mit.edu> References: <50E89BC5.4090208@mozilla.com> <50E9B175.3010200@alum.mit.edu> Message-ID: <50E9C3C5.60004@alum.mit.edu> Niko Matsakis wrote: > Here are some relatively simple things we could do to improve > performance in type-checking, off the top of my head: I should add that the overall impact of these (and other) changes might easily be small. I know you have profiles showing malloc to be a relatively minor contribution to overall performance, for example. My feeling is that it's hard to estimate the impact in advance---past experience suggests that sometimes these sorts of changes have an oversized impact relative to the profile and sometimes none at all. 
Long term I think we should try to tighten up performance and, if we do enough of that, things *will* get faster. I guess I'd say it's hard to estimate the impact in advance---past experience suggests that sometimes these sorts of changes have an oversized impact relative to the profile and sometimes none at all. Have we tried to profile memory consumption at all? I'd be curious to know e.g. what portion of our memory is used in the AST representation vs other things. It should be easy enough to use dtrace and get an idea how much is allocated in each pass. Niko From catamorphism at gmail.com Sun Jan 6 14:57:52 2013 From: catamorphism at gmail.com (Tim Chevalier) Date: Sun, 6 Jan 2013 14:57:52 -0800 Subject: [rust-dev] Plans for improving compiler performance In-Reply-To: <50E9C135.8040001@mozilla.com> References: <50E89BC5.4090208@mozilla.com> <50E9C135.8040001@mozilla.com> Message-ID: On Sun, Jan 6, 2013 at 10:29 AM, Patrick Walton wrote: > On 1/6/13 9:50 AM, Lindsey Kuper wrote: >> >> What stands in the way of doing incremental compilation? > > > Personally, I think our time would be better spent making it easy to break > large projects up into multiple finer-grained crates. We should be able to > tell cargo and/or the work-in-progress `fbuild` workalike to compile a crate > and rebuild all of its dependent crates (if they were modified) in one > command. > > This strikes me as more reliable than incremental compilation, because crate > structure enforces a DAG. What worries me with incremental compilation is > that we'll do a lot of work to make it work, then we'll discover that in > practice intra-crate dependencies are so intertwined that most changes > result in a full rebuild anyway. I think that it would be good to do an experiment to see whether or not that worry is justified, which is to say, printing out what the dependency graph is and seeing how modular it is for rustc and perhaps other Rust crates. I put in some work towards doing this, but got derailed at some point. It's not particularly difficult, though, and then we could make an informed decision about whether or not to go down this path.
A lot of the work for incremental compilation will likely also be useful for parallelizing the compiler. So I don't see it as a waste of time. Cheers, Tim -- Tim Chevalier * http://catamorphism.org/ * Often in error, never in doubt "Too much to carry, too much to let go Time goes fast, learning goes slow." -- Bruce Cockburn From vadimcn at gmail.com Mon Jan 7 00:10:06 2013 From: vadimcn at gmail.com (Vadim) Date: Mon, 7 Jan 2013 00:10:06 -0800 Subject: [rust-dev] Matching items in macros Message-ID: Hi, Am I totally off-base in expecting the following to work?

macro_rules! gen_mod(
    ( $name:ident { $( $fun:item )+ } ) => (
        mod $name { $( $fun )+ }
    )
)

gen_mod!( MyMod {
    fn Foo() -> u32 {}
    fn Bar() -> u32 {}
} )

Gives me this error:

test.rs:21:4: 21:5 error: Local ambiguity: multiple parsing options: built-in NTs item ('fun') or 1 other options.
test.rs:21 }

-------------- next part -------------- An HTML attachment was scrubbed... URL: From graydon at mozilla.com Mon Jan 7 09:40:22 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Mon, 07 Jan 2013 09:40:22 -0800 Subject: [rust-dev] Plans for improving compiler performance In-Reply-To: <50E89BC5.4090208@mozilla.com> References: <50E89BC5.4090208@mozilla.com> Message-ID: <50EB0886.3050500@mozilla.com> On 05/01/2013 1:37 PM, Patrick Walton wrote: > First off, I do not see any way we can improve *general* compiler > performance by an order of magnitude. I don't feel this sort of quantitative statement can be made, from where we are now. When I say "I expect we should be able to get it an order of magnitude faster" I'm just talking comparatively, against codebases of similar size/complexity in similar-abstraction-level languages: it builds too slowly and produces artifacts that are too big. The landscape of possible changes is IMO too complex to judge the particulars in quantitative terms yet.
And I'm not going to address the abstract arguments about upper bounds of speed; in particular, things like "we can't typecheck the way $LANGUAGE does" in a big-O sense is just not the level I want to be looking at the problem. I'm mostly concerned about residual systemic taxes / technical debt:

- Non-use of arenas and &, pervasively
- Allocating and freeing effectively constant data we don't properly constify. Note: _no_ expression-level constant folding happens presently outside of const item initializers.
- Tag discriminant values loaded from memory (!)
- Implicit copying
- Inefficient representations: as you point out, our memory manager probably uses 2x-3x more memory than it needs to, especially for small allocations
- Allocator locking, fragmentation, general non-optimality
- Refcounting traffic
- Stack growth and task lifecycle stuff that may be optional or obsolete
- Landing pads
- Cleanup duplication in _any_ scope-exit scenario
- Redundant basic blocks
- The match code translation, which (in our own admission) nobody is comfortable enough with to hack on presently
- Glue code (not just visitor; we still generate drop and free)
- Falling off fastisel
- Not annotating functions correctly (noalias/nocapture/nounwind/readonly/readnone/optsize)
- Metadata encoding, organization, I/O paths
- Use of dynamic closures rather than traits and/or vtables, &Objects
- Wrong inline settings in libs (we don't capture or measure this)
- Wrong monomorphization settings (likewise; count copies of vec::foo in any resulting binary)
- Non-optimality / taxes of:
  - fmt!
  - assert and fail
  - bounds checks
  - / and % checks, idiv on []

That's just off the top of my head. My general rule with perf work is that you don't even discover the worst of it until you're knee deep in solving "curious unexplained parts" of such problems.
Given how many of the items on that list have direct consequences in terms of how much code we throw at LLVM, I'm not confident saying anything is "optimal" or "not able to be improved" yet. All I see are lots of areas of performance-related technical debt juxtaposed with too-big binaries and too-slow compiles. I'm not doing any arithmetic beyond that. -Graydon From graydon at mozilla.com Mon Jan 7 09:50:46 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Mon, 07 Jan 2013 09:50:46 -0800 Subject: [rust-dev] Plans for improving compiler performance In-Reply-To: References: <50E89BC5.4090208@mozilla.com> <50E9C135.8040001@mozilla.com> Message-ID: <50EB0AF6.70605@mozilla.com> On 06/01/2013 2:57 PM, Tim Chevalier wrote: >> Personally, I think our time would be better spent making it easy to break >> large projects up into multiple finer-grained crates. We should be able to >> tell cargo and/or the work-in-progress `fbuild` workalike to compile a crate >> and rebuild all of its dependent crates (if they were modified) in one >> command. >> >> This strikes me as more reliable than incremental compilation, because crate >> structure enforces a DAG. What worries me with incremental compilation is >> that we'll do a lot of work to make it work, then we'll discover that in >> practice intra-crate dependencies are so intertwined that most changes >> result in a full rebuild anyway. > > I think that it would be good to do an experiment to see whether or > not that worry is justified, which is to say, printing out what the > dependency graph is and seeing how modular is for rustc and perhaps > other Rust crates. In terms of immediate project scheduling, getting the build / package system under control is higher priority as it's the main gate on community involvement / growing the library ecosystem / scaling up servo work. It'll also make it easier to develop patches to (say) libstd without always cycling through a triple-rustc-bootstrap. 
And easier to get started by just downloading a prebuilt librustllvm. &c &c. In the longer term, for parallelizing rustc as much as possible, you're right that formalizing intra-crate item dependency is an essential first step. It's just a matter of priority-making. -Graydon From sh4.seo at samsung.com Wed Jan 9 03:07:16 2013 From: sh4.seo at samsung.com (Sanghyeon Seo) Date: Wed, 09 Jan 2013 11:07:16 +0000 (GMT) Subject: [rust-dev] Release Statistics Message-ID: <22952044.476311357729635900.JavaMail.weblogic@epml11> Is some sort of download statistics for Rust releases available? I am interested in a ballpark figure. Things like the number of unique IP addresses who downloaded rust-0.5.tar.gz. static.rust-lang.org seems to be hosted on Amazon S3. Is web log available for analysis? From davidb at davidb.org Wed Jan 9 08:43:01 2013 From: davidb at davidb.org (David Brown) Date: Wed, 9 Jan 2013 08:43:01 -0800 Subject: [rust-dev] Integer constant treated as 16-bit value? Message-ID: <20130109164301.GA17982@davidb.org> Somewhere between 0.5 and rustc 0.6 (09bb07b 2012-12-24 18:29:02 -0800) host: x86_64-unknown-linux-gnu The program below started printing 16959 instead of 999999 (16959 == 999999 & 0xFFFF). It seems rustc has decided to use a 16-bit integer, even though the constant doesn't even fit.

fn main() {
    let mut count = 0;
    for 999_999.times() {
        count += 1;
    }
    io::println(fmt!("%u", count));
}

Is this known, or should I file a ticket? David From graydon at mozilla.com Wed Jan 9 08:45:22 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Wed, 09 Jan 2013 08:45:22 -0800 Subject: [rust-dev] Release Statistics In-Reply-To: <22952044.476311357729635900.JavaMail.weblogic@epml11> References: <22952044.476311357729635900.JavaMail.weblogic@epml11> Message-ID: <50ED9EA2.9030500@mozilla.com> On 09/01/2013 3:07 AM, Sanghyeon Seo wrote: > Is some sort of download statistics for Rust releases available? I am interested > in a ballpark figure.
Things like the number of unique IP addresses who downloaded > rust-0.5.tar.gz. > > static.rust-lang.org seems to be hosted on Amazon S3. Is web log available for > analysis? No, we did not have request logging enabled until today. I just turned it on, so we should start gathering stats presently. It'll take a while to get a picture from that (and will, in any case, probably not be interesting before the next release). One thing I've often wondered about is adding a telemetry-reporting mode for rustc (opt-in of course) that reports usage stats to us, so we can see things like:

- Which ICEs are most often hit
- Which error messages and warnings are most often emitted
- Which packages are most often installed
- Which language constructs are most often used
- How rustc is performing in terms of memory use and compile time

I haven't seen this sort of telemetry in a command-line tool before but given how valuable it has turned out to be in browsers, I wonder if it'd be a tolerable addition for users. -Graydon From niko at alum.mit.edu Wed Jan 9 09:46:59 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 09 Jan 2013 09:46:59 -0800 Subject: [rust-dev] Release Statistics In-Reply-To: <50ED9EA2.9030500@mozilla.com> References: <22952044.476311357729635900.JavaMail.weblogic@epml11> <50ED9EA2.9030500@mozilla.com> Message-ID: <50EDAD13.8080601@alum.mit.edu> That's a really intriguing idea!
Graydon Hoare wrote: > On 09/01/2013 3:07 AM, Sanghyeon Seo wrote: >> Is some sort of download statistics for Rust releases available? I am interested >> in a ballpark figure. Things like the number of unique IP addresses who downloaded >> rust-0.5.tar.gz. >> >> static.rust-lang.org seems to be hosted on Amazon S3. Is web log available for >> analysis? > > No, we did not have request logging enabled until today. I just turned > it on, so we should start gathering stats presently. It'll take a while > to get a picture from that (and will, in any case, probably not be > interesting before the next release). > > One thing I've often wondered about is adding a telemetry-reporting mode > for rustc (opt-in of course) that reports usage stats to us, so we can > see things like:
>
> - Which ICEs are most often hit
> - Which error messages and warnings are most often emitted
> - Which packages are most often installed
> - Which language constructs are most often used
> - How rustc is performing in terms of memory use and compile time
>
> I haven't seen this sort of telemetry in a command-line tool before but > given how valuable it has turned out to be in browsers, I wonder if it'd > be a tolerable addition for users. > > -Graydon > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From banderson at mozilla.com Wed Jan 9 12:56:43 2013 From: banderson at mozilla.com (Brian Anderson) Date: Wed, 09 Jan 2013 12:56:43 -0800 Subject: [rust-dev] Integer constant treated as 16-bit value?
In-Reply-To: <20130109164301.GA17982@davidb.org> References: <20130109164301.GA17982@davidb.org> Message-ID: <50EDD98B.3060907@mozilla.com> On 01/09/2013 08:43 AM, David Brown wrote: > Somewhere between 0.5 and > > rustc 0.6 (09bb07b 2012-12-24 18:29:02 -0800) > host: x86_64-unknown-linux-gnu > > The program below started printing 16959 instead of 999999 (16959 == > 999999 & 0xFFFF). It seems rustc has decided to use a 16-bit integer, > even though the constant doesn't even fit. > This problem has tickets open: https://github.com/mozilla/rust/issues/3211 https://github.com/mozilla/rust/issues/3398 From catamorphism at gmail.com Wed Jan 9 17:44:47 2013 From: catamorphism at gmail.com (Tim Chevalier) Date: Wed, 9 Jan 2013 17:44:47 -0800 Subject: [rust-dev] Plans for improving compiler performance In-Reply-To: <50E9A4B5.4090602@mansionfamily.plus.com> References: <50E89BC5.4090208@mozilla.com> <50E9A4B5.4090602@mansionfamily.plus.com> Message-ID: I noticed that nobody answered this question... On Sun, Jan 6, 2013 at 8:22 AM, james wrote: > Could you use multiple threads to type check and code gen in parallel? > Yes, but with a non-trivial amount of work. rustc is not very parallel right now. Parallelizing the compiler would require a lot of refactoring (in fact, probably a lot of the same refactoring that would be necessary to do incremental recompilation) and I think it's something we all want, but the time isn't budgeted for it right now. > Could you retain information from a previous run of the compiler and reuse > it (especially for code generation)? I think we could. This would be more of a research project (possible intern project or volunteer project for someone with a love for compiler research!) The only real thing that I see standing in the way of that is time. It's also possible that LLVM has infrastructure for this kind of profile-guided optimization already, I really don't know.
Cheers, Tim -- Tim Chevalier * http://catamorphism.org/ * Often in error, never in doubt "Too much to carry, too much to let go Time goes fast, learning goes slow." -- Bruce Cockburn From stevej at fruitless.org Thu Jan 10 15:19:38 2013 From: stevej at fruitless.org (Steve Jenson) Date: Thu, 10 Jan 2013 15:19:38 -0800 Subject: [rust-dev] Question about lifetime analysis (a 0.5 transition question) In-Reply-To: <50E0AF48.6080703@alum.mit.edu> References: <50DCD6DE.5050409@alum.mit.edu> <50E0AF48.6080703@alum.mit.edu> Message-ID: On Sun, Dec 30, 2012 at 1:16 PM, Niko Matsakis wrote: > Oh, one other thing: > > Your each() method does not obey the for protocol! When the callback > returns false, you should abort iteration altogether. This presumably > means you need to do the recursion in a helper method that itself returns > bool so that you can detect when to carry on and when to abort. > Thanks for pointing that out, I didn't realize. I'm working on making each abortable but noticed that I'm not able to call a function defined in an anonymous impl from this method, there's an interaction with &self here that I don't understand. Here is the error: red_black_tree.rs:107:8: 107:32 error: type `&self/red_black_tree::RBMap<'a,'b>` does not implement any method in scope named `real_each` red_black_tree.rs:107 self.real_each(f, true); Here is the call site: https://github.com/stevej/rustled/blob/master/red_black_tree.rs#L107 And here is the definition of the function real_each: https://github.com/stevej/rustled/blob/master/red_black_tree.rs#L61 Do you understand what is going on here? Thanks, Steve -------------- next part -------------- An HTML attachment was scrubbed... URL: From clements at brinckerhoff.org Thu Jan 10 17:39:53 2013 From: clements at brinckerhoff.org (John Clements) Date: Thu, 10 Jan 2013 17:39:53 -0800 Subject: [rust-dev] change "use of moved variable" to "use of moved value" ? 
Message-ID: There's currently a somewhat misleading (I claim) error message that states "error: use of moved variable". I claim this should instead be "use of moved value". In particular, there's no problem with using the variable; you can mutate it to some new value. For example, this code is adapted from the borrowed pointers tutorial:

fn example3() -> int {
    struct R { g: int }
    struct S { mut f: ~R }

    let mut x = ~S {mut f: ~R {g: 3}};
    let qqq = x;
    let y = &x.f; // Error reported here.
    x = ~S {mut f: ~R {g: 4}}; // ... but this line is fine!
    3
}

FWIW, this error message actually misled me; I'm not just being a PL pedant :). This change seems significant enough that I should ask, rather than just making a change. John From niko at alum.mit.edu Fri Jan 11 06:03:25 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Fri, 11 Jan 2013 06:03:25 -0800 Subject: [rust-dev] change "use of moved variable" to "use of moved value" ? In-Reply-To: References: Message-ID: <50F01BAD.3090400@alum.mit.edu> I am fine with this change. Niko John Clements wrote: > There's currently a somewhat misleading (I claim) error message that states "error: use of moved variable". I claim this should instead be "use of moved value". In particular, there's no problem with using the variable; you can mutate it to some new value. For example, this code is adapted from the borrowed pointers tutorial:
>
> fn example3() -> int {
>     struct R { g: int }
>     struct S { mut f: ~R }
>
>     let mut x = ~S {mut f: ~R {g: 3}};
>     let qqq = x;
>     let y = &x.f; // Error reported here.
>     x = ~S {mut f: ~R {g: 4}}; // ... but this line is fine!
>     3
> }
>
> FWIW, this error message actually misled me; I'm not just being a PL pedant :).
>
> This change seems significant enough that I should ask, rather than just making a change.
> > John > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From niko at alum.mit.edu Fri Jan 11 10:50:06 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Fri, 11 Jan 2013 10:50:06 -0800 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter Message-ID: <50F05EDE.7090508@alum.mit.edu> Hi, Right now we use no delimiter at all to separate multiple trait bounds for a single type variable:

fn foo<T: Ord Eq>(...) {...}

Originally we thought to use commas, but inserting commas into the list creates an ambiguity with the comma that separates type parameters, as illustrated here:

fn foo<T: Ord, Eq>(...) {...}

Marijn and I hashed through all kinds of delimiters and failed to find one, and hence he settled on spaces. It seemed like a good idea at the time. However, over time, I have found that this syntax is very hard for me to parse, visually speaking. Moreover, Patrick recently observed that there is an ambiguity when the trait names are multi-component paths:

fn foo<T: Ord::Eq>(...) {...}

Does this indicate one bound `Ord::Eq` or two bounds `Ord` and `::Eq`? The current fix for this is to make the tokenizer treat `id::` differently from `id ::` (note the separating space in the latter example). The first is 1 token. The second is 2 tokens. Actually, I think this behavior already existed, and I can't remember why -- I think it had something to do with the flexible treatment of keywords that we used to have. @brson, do you remember? Anyway, it was proposed on IRC today that we could do something like this instead:

fn foo<T: Ord>(...) {...}       // One bound is the same
fn foo<T: (Ord, Eq)>(...) {...} // Multiple bounds require parentheses

I find this visually appealing. It's easier for my eye to read, particularly if the bounds are complicated.
I know it's a syntax change, and we're trying to avoid those, but I thought I'd throw it out for a wider audience to ponder, particularly as it would eliminate a rather surprising whitespace dependency. Niko From lucian.branescu at gmail.com Fri Jan 11 10:54:41 2013 From: lucian.branescu at gmail.com (Lucian Branescu) Date: Fri, 11 Jan 2013 18:54:41 +0000 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: <50F05EDE.7090508@alum.mit.edu> References: <50F05EDE.7090508@alum.mit.edu> Message-ID: Whitespace dependency is painful in the long term. This looks nice, the only downside is that changing between one and two traits requires editing in two places.. Perhaps if parens were merely optional with one trait and was valid this would be mitigated somewhat. On 11 January 2013 18:50, Niko Matsakis wrote: > Hi, > > Right now we use no delimiter at all to separate multiple trait bounds for > a single type variable: > > fn foo(...) {...} > > Originally we thought to use commas, but inserting commas into the list > creates an ambiguity between the comma that separates type parameters, as > illustrated here: > > fn foo(...) {...} > > Marijn and I hashed through all kinds of delimeters and failed to find > one, and hence he settled on spaces. It seemed like a good idea at the time. > > However, over time, I have found that this syntax is very hard for me to > parse, visually speaking. Moreover, Patrick recently observed that there is > an ambiguity when the trait names are multi-component paths: > > fn foo(...) {...} > > Does this indicate one bound `Ord::Eq` or two bounds `Ord` and `::Eq`? The > current fix for this is to make the tokenizer treat `id::` differently from > `id ::` (note the separating space in the latter example). The first is 1 > token. The second is 2 tokens. 
Actually, I think this behavior already > existed, and I can't remember why?I think it had something to do with the > flexible treatment of keywords that we used to have. @brson, do you > remember? > > Anyway, it was proposed on IRC today that we could do something like this > instead: > > fn foo(..) {...} // One bound is the same > fn foo(...) {...} // Multiple bounds require parentheses > > I find this visually appealing. It's easier for my eye to read, > particularly if the bounds are complicated. > > I know it's a syntax change, and we're trying to avoid those, but I > thought I'd throw it out for a wider audience to ponder, particularly as it > would eliminate a rather surprising whitespace dependency. > > > > Niko > ______________________________**_________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/**listinfo/rust-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko at alum.mit.edu Fri Jan 11 11:03:09 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Fri, 11 Jan 2013 11:03:09 -0800 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: References: <50F05EDE.7090508@alum.mit.edu> Message-ID: <50F061ED.8060107@alum.mit.edu> Lucian Branescu wrote: > This looks nice, the only downside is that changing between one and > two traits requires editing in two places.. Perhaps if parens were > merely optional with one trait and was valid this would be > mitigated somewhat. I was assuming they'd be optional for the single trait case. For the reason you state, I also considered the possibility of writing instead of , but that seemed like a strictly larger change that also has implications for our "impl Type: Trait" syntax (which of course @pcwalton wants to change to "impl Trait for Type"). 
Niko From graydon at mozilla.com Fri Jan 11 11:25:49 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Fri, 11 Jan 2013 11:25:49 -0800 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: <50F05EDE.7090508@alum.mit.edu> References: <50F05EDE.7090508@alum.mit.edu> Message-ID: <50F0673D.3090804@mozilla.com> On 11/01/2013 10:50 AM, Niko Matsakis wrote: > Anyway, it was proposed on IRC today that we could do something like > this instead: > > fn foo(..) {...} // One bound is the same > fn foo(...) {...} // Multiple bounds require parentheses > > I find this visually appealing. It's easier for my eye to read, > particularly if the bounds are complicated. > > I know it's a syntax change, and we're trying to avoid those, but I > thought I'd throw it out for a wider audience to ponder, particularly as > it would eliminate a rather surprising whitespace dependency. I'm ok with this, and much less fond of solving the ambiguity by adding whitespace dependency there. IME a multi-trait-bound signature is a bit of a code smell anyways. (Incidentally, this sort of thing is exactly why I feel like we need someone to start formalizing the grammar. It's not really ok to say "we've frozen the syntax" before we know that the existing grammar is unambiguous!) -Graydon From clements at brinckerhoff.org Fri Jan 11 11:41:37 2013 From: clements at brinckerhoff.org (John Clements) Date: Fri, 11 Jan 2013 11:41:37 -0800 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: <50F0673D.3090804@mozilla.com> References: <50F05EDE.7090508@alum.mit.edu> <50F0673D.3090804@mozilla.com> Message-ID: On Jan 11, 2013, at 11:25 AM, Graydon Hoare wrote: > On 11/01/2013 10:50 AM, Niko Matsakis wrote: > >> Anyway, it was proposed on IRC today that we could do something like >> this instead: >> >> fn foo(..) {...} // One bound is the same >> fn foo(...) 
{...} // Multiple bounds require parentheses >> >> I find this visually appealing. It's easier for my eye to read, >> particularly if the bounds are complicated. >> >> I know it's a syntax change, and we're trying to avoid those, but I >> thought I'd throw it out for a wider audience to ponder, particularly as >> it would eliminate a rather surprising whitespace dependency. > > I'm ok with this, and much less fond of solving the ambiguity by adding > whitespace dependency there. IME a multi-trait-bound signature is a bit > of a code smell anyways. > > (Incidentally, this sort of thing is exactly why I feel like we need > someone to start formalizing the grammar. It's not really ok to say > "we've frozen the syntax" before we know that the existing grammar is > unambiguous!) +1, possibly volunteering :). John From banderson at mozilla.com Fri Jan 11 11:50:45 2013 From: banderson at mozilla.com (Brian Anderson) Date: Fri, 11 Jan 2013 11:50:45 -0800 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: <50F05EDE.7090508@alum.mit.edu> References: <50F05EDE.7090508@alum.mit.edu> Message-ID: <50F06D15.8020806@mozilla.com> On 01/11/2013 10:50 AM, Niko Matsakis wrote: > Hi, > > Right now we use no delimiter at all to separate multiple trait bounds > for a single type variable: > > fn foo(...) {...} Supertraits share the same syntax, so we should consider them together. If or when impls can implement multiple traits they may also need a similar syntax. trait Foo: Ord Eq Hash { } > > Originally we thought to use commas, but inserting commas into the > list creates an ambiguity between the comma that separates type > parameters, as illustrated here: > > fn foo(...) {...} > > Marijn and I hashed through all kinds of delimeters and failed to find > one, and hence he settled on spaces. It seemed like a good idea at the > time. > > However, over time, I have found that this syntax is very hard for me > to parse, visually speaking. 
Moreover, Patrick recently observed that > there is an ambiguity when the trait names are multi-component paths: > > fn foo<T: Ord::Eq>(...) {...} > > Does this indicate one bound `Ord::Eq` or two bounds `Ord` and `::Eq`? > The current fix for this is to make the tokenizer treat `id::` > differently from `id ::` (note the separating space in the latter > example). The first is 1 token. The second is 2 tokens. Actually, I > think this behavior already existed, and I can't remember why -- I think > it had something to do with the flexible treatment of keywords that we > used to have. @brson, do you remember? No, but it's the ugliest part of the lexer. Let's make it go away. > > Anyway, it was proposed on IRC today that we could do something like > this instead: > > fn foo<T: Ord>(..) {...} // One bound is the same > fn foo<T: (Ord, Eq)>(...) {...} // Multiple bounds require > parentheses As an example, this is why the parens are needed: fn foo<T: (Ord, Eq), U: Hash>(...) {...} so you _could_ still allow `foo<T: Ord, Eq>` since there's no ambiguity, but that's more complex than necessary. Here's how it looks in a trait: trait Foo:(Ord, Eq, Hash) { } The parens aren't necessary here for disambiguation. Would we want them anyway? With the parens, the colons aren't necessary so you could instead have: fn foo(...) {...} fn foo(...) {...} > > I find this visually appealing. It's easier for my eye to read, > particularly if the bounds are complicated. It does contain many new tokens though. > > I know it's a syntax change, and we're trying to avoid those, but I > thought I'd throw it out for a wider audience to ponder, particularly > as it would eliminate a rather surprising whitespace dependency. One thing to consider is that presumably trait inheritance will eventually make multi-trait bounds less common. It could even replace multi-trait bounds entirely, though I don't think that's a great idea. 
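For later readers of this archive: the `+` spelling suggested further down this thread is the one Rust ultimately adopted. A minimal sketch of it in post-1.0 syntax (the function name `describe_max` and its particular bounds are illustrative, not taken from the thread):

```rust
use std::fmt::Debug;

// Multiple trait bounds on a single type parameter, joined with `+`,
// which leaves the comma free to separate type parameters.
fn describe_max<T: Ord + Debug>(items: &[T]) -> Option<String> {
    items.iter().max().map(|m| format!("max is {:?}", m))
}

fn main() {
    assert_eq!(describe_max(&[1, 3, 2]), Some(String::from("max is 3")));
    assert_eq!(describe_max::<i32>(&[]), None);
}
```

Because `+` binds the bounds together into one list, `fn foo<T: Ord + Eq, U: Hash>` needs neither whitespace sensitivity nor parentheses to stay unambiguous.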
From banderson at mozilla.com Fri Jan 11 11:54:30 2013 From: banderson at mozilla.com (Brian Anderson) Date: Fri, 11 Jan 2013 11:54:30 -0800 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: <50F06D15.8020806@mozilla.com> References: <50F05EDE.7090508@alum.mit.edu> <50F06D15.8020806@mozilla.com> Message-ID: <50F06DF6.3000301@mozilla.com> On 01/11/2013 11:50 AM, Brian Anderson wrote: > On 01/11/2013 10:50 AM, Niko Matsakis wrote: >> Hi, >> >> Right now we use no delimiter at all to separate multiple trait >> bounds for a single type variable: >> >> fn foo(...) {...} > > Supertraits share the same syntax, so we should consider them > together. If or when impls can implement multiple traits they may also > need a similar syntax. > > trait Foo: Ord Eq Hash { } > >> >> Originally we thought to use commas, but inserting commas into the >> list creates an ambiguity between the comma that separates type >> parameters, as illustrated here: >> >> fn foo(...) {...} >> >> Marijn and I hashed through all kinds of delimeters and failed to >> find one, and hence he settled on spaces. It seemed like a good idea >> at the time. >> >> However, over time, I have found that this syntax is very hard for me >> to parse, visually speaking. Moreover, Patrick recently observed that >> there is an ambiguity when the trait names are multi-component paths: >> >> fn foo(...) {...} >> >> Does this indicate one bound `Ord::Eq` or two bounds `Ord` and >> `::Eq`? The current fix for this is to make the tokenizer treat >> `id::` differently from `id ::` (note the separating space in the >> latter example). The first is 1 token. The second is 2 tokens. >> Actually, I think this behavior already existed, and I can't remember >> why?I think it had something to do with the flexible treatment of >> keywords that we used to have. @brson, do you remember? > > No, but it's the ugliest part of the lexer. Let's make it go away. 
> >> Anyway, it was proposed on IRC today that we could do something like >> this instead: >> >> fn foo<T: Ord>(..) {...} // One bound is the same >> fn foo<T: (Ord, Eq)>(...) {...} // Multiple bounds require >> parentheses > > As an example, this is why the parens are needed: > > fn foo<T: (Ord, Eq), U: Hash>(...) {...} > > so you _could_ still allow `foo<T: Ord, Eq>` since there's no > ambiguity, but that's more complex than necessary. Of course, there _is_ an ambiguity; this would just be requiring parens to add the second type parameter, so not a good idea. From graydon at mozilla.com Fri Jan 11 14:03:31 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Fri, 11 Jan 2013 14:03:31 -0800 Subject: [rust-dev] change "use of moved variable" to "use of moved value" ? In-Reply-To: References: Message-ID: <50F08C33.7040201@mozilla.com> On 10/01/2013 5:39 PM, John Clements wrote: > This change seems significant enough that I should ask, rather than just making a change? In general, we do changes to the code (both "asking" and "just making") via bugs, pull requests and code review. If you are pretty sure something ought to be done, you can make the change locally and open a pull request to have it integrated; if there's disagreement with the change we can discuss there. If you want to discuss first before spending time on producing a change (say, it's something involved) then file a bug and we can discuss there. 
We're trying to stick to the rule now that the only 'no review' changes are those that are either completely trivial or necessary to un-break a broken tree (see http://buildbot.rust-lang.org for current build status) -Graydon From garethdanielsmith at gmail.com Fri Jan 11 14:27:44 2013 From: garethdanielsmith at gmail.com (Gareth Smith) Date: Fri, 11 Jan 2013 22:27:44 +0000 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: <50F05EDE.7090508@alum.mit.edu> References: <50F05EDE.7090508@alum.mit.edu> Message-ID: <50F091E0.6010907@gmail.com> On 11/01/13 18:50, Niko Matsakis wrote: > > fn foo<T: Ord>(..) {...} // One bound is the same > fn foo<T: (Ord, Eq)>(...) {...} // Multiple bounds require > parentheses How about using { ... } rather than ( ... ), like imports: use xxx::{a, b, c}; fn foo<T: {Ord, Eq}>(...) { ... } I don't know that this is better but maybe it is worth considering? Gareth. From gaozm55 at gmail.com Fri Jan 11 18:21:14 2013 From: gaozm55 at gmail.com (James Gao) Date: Sat, 12 Jan 2013 10:21:14 +0800 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: <50F091E0.6010907@gmail.com> References: <50F05EDE.7090508@alum.mit.edu> <50F091E0.6010907@gmail.com> Message-ID: and how about these two cases: a) fn foo<T1: X; T2: Y>(...) {...} b) fn foo<T: X + Y>(...) {...} On Sat, Jan 12, 2013 at 6:27 AM, Gareth Smith wrote: > On 11/01/13 18:50, Niko Matsakis wrote: >> >> fn foo<T: Ord>(..) {...} // One bound is the same >> fn foo<T: (Ord, Eq)>(...) {...} // Multiple bounds require >> parentheses >> > > How about using { ... } rather than ( ... ), like imports: > > use xxx::{a, b, c}; > > fn foo<T: {Ord, Eq}>(...) { ... } > > I don't know that this is better but maybe it is worth considering? > > Gareth. > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthieu.monrocq at gmail.com Sat Jan 12 04:51:04 2013 From: matthieu.monrocq at gmail.com (Matthieu Monrocq) Date: Sat, 12 Jan 2013 13:51:04 +0100 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: References: <50F05EDE.7090508@alum.mit.edu> <50F091E0.6010907@gmail.com> Message-ID: On Sat, Jan 12, 2013 at 3:21 AM, James Gao wrote: > and how about these two case: > > a) fn foo (...) {...} > > b) fn foo (...) {...} > > Really likes b), + looks especially suiting since we are adding up requirements. -- Matthieu > > On Sat, Jan 12, 2013 at 6:27 AM, Gareth Smith > wrote: > >> On 11/01/13 18:50, Niko Matsakis wrote: >> >>> >>> fn foo(..) {...} // One bound is the same >>> fn foo(...) {...} // Multiple bounds require >>> parentheses >>> >> >> How about using { ... } rather than ( ... ), like imports: >> >> use xxx::{a, b, c}; >> >> fn foo(...) { ... } >> >> I don't know that this is better but maybe it is worth considering? >> >> Gareth. >> >> ______________________________**_________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/**listinfo/rust-dev >> > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From j.a.boyden at gmail.com Sat Jan 12 07:37:24 2013 From: j.a.boyden at gmail.com (James Boyden) Date: Sun, 13 Jan 2013 02:37:24 +1100 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: References: <50F05EDE.7090508@alum.mit.edu> <50F091E0.6010907@gmail.com> Message-ID: On Sat, Jan 12, 2013 at 1:21 PM, James Gao wrote: > and how about these two case: > > a) fn foo (...) 
{...} I think that a problem with using semicolon as the delimiter between trait type parameters (i.e., between `T1: X` and `T2: Y`) is that it would differ subtly and unexpectedly from the use of comma in function definitions and structures (i.e., "everywhere else"). As a result, you would have: fn f(a: X, b: Y) ... struct Z {a: X, b: Y} versus: fn foo<T1: X; T2: Y> ... > b) fn foo<T: X + Y>(...) {...} Similarly to `+`, perhaps `|` or `&` could make sense as the delimiter between trait bounds within a single type variable, if you interpret (or define) the combination of trait bounds as a set operation: either "the union of the constraints specified by the bounds" or "the intersection of types that meet the constraints". fn foo<T: X | Y>(...) {...} fn foo<T: X & Y>(...) {...} jb > On Sat, Jan 12, 2013 at 6:27 AM, Gareth Smith > wrote: >> >> On 11/01/13 18:50, Niko Matsakis wrote: >>> >>> >>> fn foo(..) {...} // One bound is the same >>> fn foo(...) {...} // Multiple bounds require >>> parentheses >> >> >> How about using { ... } rather than ( ... ), like imports: >> >> use xxx::{a, b, c}; >> >> fn foo(...) { ... } >> >> I don't know that this is better but maybe it is worth considering? >> >> Gareth. >> >> _______________________________________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/listinfo/rust-dev > > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > From knocte at gmail.com Sat Jan 12 07:59:53 2013 From: knocte at gmail.com (Andres G. Aragoneses) Date: Sat, 12 Jan 2013 15:59:53 +0000 Subject: [rust-dev] Why in the Rust language functions are not pure by default? In-Reply-To: <50B3AC12.4080700@mozilla.com> References: <50B3AC12.4080700@mozilla.com> Message-ID: On 26/11/12 17:51, Graydon Hoare wrote: > On 12-11-26 08:11 AM, Andres G. 
Aragoneses wrote: >> Hello, >> >> I just wanted to know the rationale behind the decision about having a >> "pure" keyword instead of an "impure" one (in the same way, the analogy >> here is that there is an "unsafe" keyword, not a "safe" one). > > The decision is an old one related to a time when we had a full effect > system and typestate system: the definition was too strong to meet in > most cases and wound up requiring 'impure' (at the time, spelled 'io') > annotations on nearly every function in normal code. It is no longer > strongly justified by those reasons, imo, Then, with the aim of avoiding Perfect being the enemy of Done, would a pull-request from an external contributor that flipped this two (to make pure be the default) be accepted? (It's just that this small room for improvement seems the only blocker for me to take a serious look at Rust. So I feel I could manage to create a patch to inverse the default purity, but of course I could not manage any simplification over this because I've never looked at rust's implementation and likely won't have enough time to figure out.) > but I suspect any change to it > now would be accompanied by an attempt to simplify the relationship > between borrowing and purity altogether (it's a bit unintuitive > presently). I expect there may still be some reform in this area, though > the details are mostly in the heads of others presently. Those with the upcoming changes in their head would still be able to simplify things, but dealing with the "impure" keyword instead of the "pure" one. From niko at alum.mit.edu Sat Jan 12 08:05:11 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Sat, 12 Jan 2013 08:05:11 -0800 Subject: [rust-dev] Why in the Rust language functions are not pure by default? 
In-Reply-To: References: <50B3AC12.4080700@mozilla.com> Message-ID: <50F189B7.40003@alum.mit.edu> I think you underestimate the impact of this change---flipping the default is easy, but dealing with the repercussions, particularly as regards self-hosting, is not. That said, now that we have agreed to attempt the write barrier plan (sometimes called INHTWAMA, after this blog post [1]), it is very likely that the distinction between pure and impure fns will go away entirely. Niko [1] http://www.smallcultfollowing.com/babysteps/blog/2012/11/18/imagine-never-hearing-the-phrase-aliasable/ Andres G. Aragoneses wrote: > On 26/11/12 17:51, Graydon Hoare wrote: >> On 12-11-26 08:11 AM, Andres G. Aragoneses wrote: >>> Hello, >>> >>> I just wanted to know the rationale behind the decision about having a >>> "pure" keyword instead of an "impure" one (in the same way, the analogy >>> here is that there is an "unsafe" keyword, not a "safe" one). >> >> The decision is an old one related to a time when we had a full effect >> system and typestate system: the definition was too strong to meet in >> most cases and wound up requiring 'impure' (at the time, spelled 'io') >> annotations on nearly every function in normal code. It is no longer >> strongly justified by those reasons, imo, > > Then, with the aim of avoiding Perfect being the enemy of Done, would > a pull-request from an external contributor that flipped this two (to > make pure be the default) be accepted? > > (It's just that this small room for improvement seems the only blocker > for me to take a serious look at Rust. So I feel I could manage to > create a patch to inverse the default purity, but of course I could > not manage any simplification over this because I've never looked at > rust's implementation and likely won't have enough time to figure out.) 
> > >> but I suspect any change to it >> now would be accompanied by an attempt to simplify the relationship >> between borrowing and purity altogether (it's a bit unintuitive >> presently). I expect there may still be some reform in this area, though >> the details are mostly in the heads of others presently. > > Those with the upcoming changes in their head would still be able to > simplify things, but dealing with the "impure" keyword instead of the > "pure" one. > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From pwalton at mozilla.com Sat Jan 12 12:34:54 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Sat, 12 Jan 2013 12:34:54 -0800 Subject: [rust-dev] Why in the Rust language functions are not pure by default? In-Reply-To: References: <50B3AC12.4080700@mozilla.com> Message-ID: <50F1C8EE.1050203@mozilla.com> On 1/12/13 7:59 AM, Andres G. Aragoneses wrote: > (It's just that this small room for improvement seems the only blocker > for me to take a serious look at Rust. So I feel I could manage to > create a patch to inverse the default purity, but of course I could not > manage any simplification over this because I've never looked at rust's > implementation and likely won't have enough time to figure out.) Once the write barrier ("Imagine Never Hearing the Phrase 'Aliasable, Mutable' Again") changes go through, purity will not be needed for soundness in any part of the language (with the exception of the "unsafe" effect). When that happens, I think "pure" should be removed from the language, and future efforts toward effect systems would likely be better spent investigating pluggable effect systems along the lines of "Lightweight Polymorphic Effects" proposed for Scala [1]. 
The reasons are: (a) there are many effects that programmers may want, "pure" being only one of a large set; and (b) there's been an endless debate since we started about what "pure" actually means. To elaborate on (a), consider that there are many useful invariants that you might want the compiler to check. For instance, if you're building a mobile app that needs touch responsiveness, you might want at "gc" effect to help the compiler ensure that you aren't using the GC on the UI thread. For another example, you might want the compiler to verify that you can't do blocking I/O on your server; this would be a "blocking" effect. If you're doing functional programming, you might want a "pure" effect that's stronger than the "pure" we have now. These examples aren't just made up; people have asked for all of these at some point or another. It strikes me as better to see whether we can roll these ideas together into one simple system than to try to build more and more into the language over time, or to try to layer static analysis tools on top of the language that won't work as well as they would if we build the system into the language to begin with. Regarding (b), formulating a concept of purity that's precise enough to be useful but forgiving enough to be practical has been hard. For example, can you write to "&mut" parameters in pure functions? Allowing it is extremely useful and has no effect on safety, but it strays far enough from the formal notion of purity that some have questioned Rust for it. There are tougher questions as well -- for instance, can you log errors in a pure function? You might think that it's obviously harmless to do so, but consider the case in which you're running a server and your error logging is sent over the network; now pure functions can perform random network I/O. Yet if you can't log errors this has serious ramifications for usability; even Haskell breaks purity for `Debug.Trace`. 
The take-away from all of this, to me, is that there are different notions of purity that are useful in different circumstances, and instead of debating about which one is the best one we should investigate whether we can allow programmers to define a notion of purity suitable for the task at hand. Incidentally, pluggable effect systems strike me as very much not a Rust 1.0 feature; I'd like to leave the door open for them at a later date, but they'd most likely strain the complexity and time budget too much to be worth it. Patrick [1]: http://infoscience.epfl.ch/record/175240/files/ecoop_1.pdf From clements at brinckerhoff.org Sat Jan 12 22:53:16 2013 From: clements at brinckerhoff.org (John Clements) Date: Sat, 12 Jan 2013 22:53:16 -0800 Subject: [rust-dev] significance of apparently extraneous #[legacy_exports] decl? Message-ID: In the file rust/src/libsyntax/syntax.rc, I see this code: mod print { #[legacy_exports]; #[legacy_exports] mod pp; #[legacy_exports] mod pprust; } Is there any significance to the repetition of the #[legacy_exports] declaration before 'mod pp;'? John Clements -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 4370 bytes Desc: not available URL: From banderson at mozilla.com Sun Jan 13 19:29:34 2013 From: banderson at mozilla.com (Brian Anderson) Date: Sun, 13 Jan 2013 19:29:34 -0800 Subject: [rust-dev] significance of apparently extraneous #[legacy_exports] decl? In-Reply-To: References: Message-ID: <50F37B9E.9020007@mozilla.com> On 01/12/2013 10:53 PM, John Clements wrote: > In the file rust/src/libsyntax/syntax.rc, I see this code: > > mod print { > #[legacy_exports]; > #[legacy_exports] > mod pp; > #[legacy_exports] > mod pprust; > } > > Is there any significance to the repetition of the #[legacy_exports] declaration before 'mod pp;'? > > John Clements There is! Attributes can be applied either to the outside of elements or the inside. 
The first `#[legacy_exports];` is applying itself to `mod print`, as indicated by the semicolon, and the second to `mod pp`. These attributes were added by a script - a person probably wouldn't write them that way. This semicolon distinction is oft-maligned. From dherman at mozilla.com Tue Jan 15 00:19:09 2013 From: dherman at mozilla.com (David Herman) Date: Tue, 15 Jan 2013 00:19:09 -0800 Subject: [rust-dev] significance of apparently extraneous #[legacy_exports] decl? In-Reply-To: <50F37B9E.9020007@mozilla.com> References: <50F37B9E.9020007@mozilla.com> Message-ID: On Jan 13, 2013, at 7:29 PM, Brian Anderson wrote: > On 01/12/2013 10:53 PM, John Clements wrote: >> In the file rust/src/libsyntax/syntax.rc, I see this code: >> >> mod print { >> #[legacy_exports]; >> #[legacy_exports] >> mod pp; >> #[legacy_exports] >> mod pprust; >> } >> >> Is there any significance to the repetition of the #[legacy_exports] declaration before 'mod pp;'? >> >> John Clements > > There is! Attributes can be applied either to the outside of elements or the inside. The first `#[legacy_exports];` is applying itself to `mod print`, as indicated by the semicolon, and the second to `mod pp`. These attributes were added by a script - a person probably wouldn't write them that way. This semicolon distinction is oft-maligned. Is there really value in attributes being applied on the inside? That seems really odd to me. Dave From graydon at mozilla.com Tue Jan 15 00:39:53 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 15 Jan 2013 00:39:53 -0800 Subject: [rust-dev] significance of apparently extraneous #[legacy_exports] decl? In-Reply-To: References: <50F37B9E.9020007@mozilla.com> Message-ID: <50F515D9.7070701@mozilla.com> On 15/01/2013 12:19 AM, David Herman wrote: > Is there really value in attributes being applied on the inside? That seems really odd to me. 1. Applying them to the crate you're in. 2. Doc attributes. 
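A note for later readers of this archive: the inner-attribute position Graydon defends here survived, although post-1.0 Rust spells it `#![attr]` rather than the 0.5-era trailing-semicolon form. A small sketch in modern syntax (the module and attribute choices are illustrative, not from the thread):

```rust
// Outer attribute: applies to the item that follows it.
#[allow(dead_code)]
mod print {
    // Inner attribute: applies to the enclosing module itself,
    // today's spelling of the 0.5-era `#[attr];` form.
    #![allow(non_camel_case_types)]

    pub struct pp; // tolerated because of the inner attribute above

    pub fn width() -> usize {
        80
    }
}

fn main() {
    assert_eq!(print::width(), 80);
}
```

The inner form is what makes crate-level attributes possible, since there is no enclosing item from which to hang an outer attribute on the crate root.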
From niko at alum.mit.edu Tue Jan 15 05:17:23 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 15 Jan 2013 05:17:23 -0800 Subject: [rust-dev] significance of apparently extraneous #[legacy_exports] decl? In-Reply-To: References: <50F37B9E.9020007@mozilla.com> Message-ID: <50F556E3.5010304@alum.mit.edu> David Herman wrote: > Is there really value in attributes being applied on the inside? That seems really odd to me. I thought so too at first but now I find it much more readable in general. Comments, for example, seem to look better inside the function than in front. Then I can see the function name and parameters immediately, which is usually what I want to know first: fn foo(x: T, y: U) { /*! * ... */ } I haven't tried writing other attributes inside much, but I could imagine a similar principle might apply. For example: struct Foo { #[auto_encode]; #[auto_decode]; #[deriving_eq]; field1: int, field2: int } vs #[auto_encode] #[auto_decode] #[deriving_eq] struct Foo { field1: int, field2: int } I always find the latter (which is what we do today) pretty hard to read, actually, but the former seems pretty easy. Niko From ben.striegel at gmail.com Tue Jan 15 12:25:55 2013 From: ben.striegel at gmail.com (Benjamin Striegel) Date: Tue, 15 Jan 2013 15:25:55 -0500 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: References: <50F05EDE.7090508@alum.mit.edu> <50F091E0.6010907@gmail.com> Message-ID: > fn foo<T: X + Y>(...) {...} I heavily prefer using + to wrapping in parens. That said, what does ::Eq even mean? Is it possible to avert all this discussion by changing ::Eq to ..Eq or something? I've never seen this syntax used before. On Fri, Jan 11, 2013 at 9:21 PM, James Gao wrote: > and how about these two cases: > > a) fn foo<T1: X; T2: Y>(...) {...} > > b) fn foo<T: X + Y>(...) {...} > > On Sat, Jan 12, 2013 at 6:27 AM, Gareth Smith > wrote: >> On 11/01/13 18:50, Niko Matsakis wrote: >> >>> >>> fn foo(..) 
{...} // One bound is the same >>> fn foo(...) {...} // Multiple bounds require >>> parentheses >>> >> >> How about using { ... } rather than ( ... ), like imports: >> >> use xxx::{a, b, c}; >> >> fn foo(...) { ... } >> >> I don't know that this is better but maybe it is worth considering? >> >> Gareth. >> >> ______________________________**_________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/**listinfo/rust-dev >> > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko at alum.mit.edu Tue Jan 15 12:52:47 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 15 Jan 2013 12:52:47 -0800 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: References: <50F05EDE.7090508@alum.mit.edu> <50F091E0.6010907@gmail.com> Message-ID: <50F5C19F.20600@alum.mit.edu> Benjamin Striegel wrote: > > I heavily prefer using + to wrapping in parens. That said, what does > ::Eq even mean? Is it possible to avert all this discussion by > changing ::Eq to ..Eq or something? I've never seen this syntax used > before. FWIW the only two directly comparable scenarios I can think of are Java and Scala. Java uses `&` for this same purpose and (presumably) for this same reason. Scala uses `with`. I'd be fine with `+` or `&`. Niko From banderson at mozilla.com Tue Jan 15 16:25:28 2013 From: banderson at mozilla.com (Brian Anderson) Date: Tue, 15 Jan 2013 16:25:28 -0800 Subject: [rust-dev] RFC: syntax of multiple trait bounds on a type parameter In-Reply-To: References: <50F05EDE.7090508@alum.mit.edu> <50F091E0.6010907@gmail.com> Message-ID: <50F5F378.7090801@mozilla.com> On 01/12/2013 04:51 AM, Matthieu Monrocq wrote: > > > On Sat, Jan 12, 2013 at 3:21 AM, James Gao > wrote: > > and how about these two case: > > a) fn foo (...) 
{...} > > b) fn foo (...) {...} > > > Really like b); + looks especially fitting since we are adding up > requirements. Agree, + looks like a nice solution. -------------- next part -------------- An HTML attachment was scrubbed... URL: From snopanen at gmail.com Wed Jan 16 12:14:37 2013 From: snopanen at gmail.com (Sami Nopanen) Date: Wed, 16 Jan 2013 15:14:37 -0500 Subject: [rust-dev] Couple of Random Questions Message-ID: Hi, Some random questions based on playing around with Rust for a few weeks. 1. Do mutable variant types allocated in stack always occupy a fixed space (the size of the largest possible value)? I guess it would have to be so, just wondering if there might be some other magic going on. That is, e.g. what happens with the following piece of code: pub fn test() -> Option<float> { let mut r = None; let s = 10; // Just some value presumably allocated after r in stack r = Some(10.3f); r } 2. How to allocate a mutable managed vector dynamically? I can create an owned vector with: let a = vec::from_elem(..); I can create a managed vector with: let a = at_vec::from_elem(..); But I can't seem to figure out how to create a mutable managed vector (apart from using a literal). I tried: let mut a = at_vec::from_elem(...); a[0] = 1.1; But the compiler didn't seem to like it. 3. Can you return stack allocated literal vectors (either created directly in the calling stack frame or as a copied value type)? I tried: fn newInStack() -> [int] { [1, 2] } But was hit with a compile error. 4. Can you control where a result gets built from the calling side? Making several copies of a simple constructor function to be able to allocate things in different places seems a bit silly.
fn newInStack() -> Foo { Foo(1,2) } fn newOwned() -> ~Foo { ~Foo(1,2) } fn newManaged() -> @Foo { @Foo(1,2) } I especially was wondering about this in the context of 'vec' and 'at_vec' and trying (and failing) to create a mutable managed array; and left wondering if I'd need to create a new module 'at_mut_vec'. (And somewhere in the back of my head wondering, if all these modules would really be needed or if this is an indication of a problem with expressiveness of the language in such cases). Thanks, Sami -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko at alum.mit.edu Wed Jan 16 14:14:38 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 16 Jan 2013 14:14:38 -0800 Subject: [rust-dev] Couple of Random Questions In-Reply-To: References: Message-ID: <50F7264E.3090308@alum.mit.edu> Sami Nopanen wrote: > Hi, > > Some random questions based on playing around with Rust for a few weeks. > > 1. Do mutable variant types allocated in stack always occupy a fixed > space (the size of the largest possible value)? Yes. > 2. How to allocate a mutable managed vector dynamically? > I can create an owned vector with: let a = vec::from_elem(..); > I can create a managed vector with: let a = at_vec::from_elem(..); You cannot. Because the elements of a managed vector are stored inline without indirection, and managed vectors are inherently shared, they cannot change length after they are created. Think of a managed vector like a Java array, which has the same properties. If you want a mutable vector, you must place an owned vector into a managed box. At the moment, this is most conveniently done using the `DVec` wrapper (this is what it exists for). That is, a type like `@DVec` is basically the equivalent of Java's `ArrayList`. In the future, we plan to build in better support for managed, mutable data, so it is likely that `@DVec` will be removed in favor of something like `@mut ~[T]`. > 3.
> Can you return stack allocated literal vectors (either created > directly in the calling stack frame or as a copied value type)? > I tried: > fn newInStack() -> [int] { [1, 2] } > But was hit with a compile error. You can only return types with a fixed size, and you cannot return pointers into your own stack frame. The return type you gave here (`[int]`) is not in fact a valid Rust type. Arrays come in two varieties: pointers, like `~[int]`, `@[int]`, or `&[int]`, and fixed-length, like `[int * 2]`. That function might best be written using a fixed-length vector (`[int * 2]`), presuming of course you know how long the result will be. This would be legal because the caller would know how much memory to allocate in order to store the array on their stack, and then the callee will write directly into the caller's stack frame. > 4. Can you control where a result gets built from the calling side? > Making several copies of a simple constructor function to be able to > allocate things in different places seems a bit silly. > fn newInStack() -> Foo { Foo(1,2) } > fn newOwned() -> ~Foo { ~Foo(1,2) } > fn newManaged() -> @Foo { @Foo(1,2) } Yes, that would be silly, but you don't have to do it. Just make the one version that returns by value: fn newFoo() -> Foo { Foo(1, 2) } and then on the caller's side you can write `@newFoo()` or `~newFoo()` as desired. This doesn't work with vector returns, though. If you want to write a function that results in a vector of unknown length it's easiest to just return `~[int]`, though it is possible to use generic builder types to return either `@[int]` or `~[int]` depending on what the user wants. > I especially was wondering about this in the context of 'vec' and > 'at_vec' and trying (and failing) to create > a mutable managed array; and left wondering if I'd need to create a > new module 'at_mut_vec'.
(And somewhere in the back of my head wondering, if all these modules would really be needed or if this is an indication of a problem with expressiveness of the language in such cases). There is something of a balancing act here. In principle there are many, many ways that one could write generic functions (generic over @ vs ~, generic over mutability, etc) but we've tried to keep that limited in order to best manage complexity. There are ways to write functions and types that maximize reusability: - Use borrowed pointers whenever possible so that you can accept inputs from anywhere (stack, `@`, `~`). - Return value types (like `Foo`) instead of a pointer type (like `@Foo`). - If you can't use `&`, `~` is somewhat more general than `@`, because a `~` value can always be placed into a managed box. - Use inherited mutability rather than declaring mutability at the field level to allow for freezing. We are still tuning some of these aspects (particularly mutability) and we plan on removing some of the "choices" that are available today, at least once we're certain what the best choices are. Niko From eddycizeron at gmail.com Fri Jan 18 06:14:45 2013 From: eddycizeron at gmail.com (Eddy Cizeron) Date: Fri, 18 Jan 2013 15:14:45 +0100 Subject: [rust-dev] Arithmetics and programming Message-ID: Hello everybody, I recently realised something while browsing the core library of Rust. It is about an old and recurrent debate that might have arisen in every programming language creation (or at least every modern language). I'm sorry if this topic has already been debated. Which arithmetic operations does trait num::Num have? - add, div, mul, rem, sub. And which types implement trait num::Num? - int, uint, (+ fixed size variants) and float (+ fixed size variants). To me this is fundamentally wrong, because the euclidean division of integer types is a distinct operation from the division of floats.
Int family aims at reproducing the arithmetic of ZZ (or NN) which is a ring (or a semi-ring) while Float family tries to reproduce the behavior of RR which is a field. So yeah yeah yeah, I know, that's not how C works and Rust tries to remain highly compliant with C language. But what? This way operation div in trait num::Num is very unlikely to be used at all because it seems very unlikely to me that anyone would like to have a function where division could be euclidean or natural depending on the type of the inputs. Actually, I find it very natural to have two different traits (I don't mind the names, it's just for the example): - num::IntegerNum with functions add, eucDiv, mul, rem, sub - num::GenericNum with functions add, eucDiv, div, mul, rem, sub Note that for floats, euclidean division (as euclidean remainder) is still a meaningful operation. So num::GenericNum can be a sub-trait of num::IntegerNum. One other important question is: which behavior should be chosen for the two euclidean operations? As I feared, for now Rust just imitates C. My point is: in all the programs I've ever written, I think I have never used the pair "truncated division + remainder" (which are the only operations C-family languages have) and I've always found myself recoding the pair "floored division + modulo" which makes more sense mathematically. Some of you certainly know this topic provoked a big debate when Guido van Rossum changed Python's behavior in favor of the latter option. I personally think he was absolutely right, even if (of course) it would have been better to think about it from the beginning instead of creating a breaking change for such a primitive operation. I don't mind if Rust also has the first pair of operations for compatibility with C, I just think it would be great if it naturally has both. -- Eddy -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jhebus at googlemail.com Fri Jan 18 09:21:09 2013 From: jhebus at googlemail.com (Paul Harvey) Date: Fri, 18 Jan 2013 17:21:09 +0000 Subject: [rust-dev] Uniqueness of Array Slices Message-ID: Hi, So, I am trying to get my head around slices in Rust, specifically how being unique affects them. In C I can do the following, and when I am done fiddling, the values are in the struct. struct foo { int a[100]; int b[100]; }; struct foo f; int *dumb_a = f.a; int *dumb_b = f.b; fiddle(dumb_a); fiddle(dumb_b); I am trying to figure out if this is possible in Rust with unique types: fn main(){ let mut v = ~[1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,]; let mut part1 = v.view(0,5); let mut part2 = vec::slice(v, 6 , 10); let mut count = 0; for 5.times{ io::println(fmt!("%?", part1[count])); count = count + 1; } part1[2] = 3; } This code is giving me an error, and I am not sure how to declare a unique vector with mutable content. Now apart from the fact that I am getting a compile error, would the values of vector v be changed after the statement : part1[2] = 3; ???? Is this even legal? What would happen if I sent my slice over a channel and fiddled with it? Is that allowed? Paul From niko at alum.mit.edu Sun Jan 20 16:06:09 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Sun, 20 Jan 2013 16:06:09 -0800 Subject: [rust-dev] Uniqueness of Array Slices In-Reply-To: References: Message-ID: <50FC8671.7000408@alum.mit.edu> The view method provides an *immutable view*, which is why you're getting a compilation error. I suggest the function `vec::mut_view()`---unfortunately it does not appear to be offered as a method at the moment. We're still much too inconsistent about that, but methods are the future. So just change `let part1 = v.view(0, 5)` to `let part1 = vec::mut_view(v, 0, 5)` and you should be able to modify `part1[2]`. The `vec::slice()` function copies data out and results in a new, unique array, so changes to the result of `vec::slice()` will not affect the original.
However, the current `slice()` is deprecated and the plan is to rename what is now `view()` to `slice()` once `slice()` is removed (see issue #3869 [1]). Sorry for the confusion, pardon our dust and all that. (The current `slice()` function predates the existence of what we now call slices.) You cannot send slices (that is, the result of view()) over a channel. A slice is a borrowed vector---and you can never send borrowed content over channels, only owned content. You can invoke `slice.to_vec()` to copy out the data from the slice into a fresh, owned vector, and then send that. Hopefully that helps! regards, Niko [1] https://github.com/mozilla/rust/issues/3869 Paul Harvey wrote: > Hi, > > So, I am trying to get my head around slices in Rust, specifically how > being unique affects them. > > In C I can do the following, and when I am done fiddling, the values > are in the struct. > > struct foo { > int a[100]; > int b[100]; > }; > > struct foo f; > int *dumb_a = f.a; > int *dumb_b = f.b; > > > fiddle(dumb_a); > fiddle(dumb_b); > > > I am trying to figure out if this is possible in Rust with unique types: > > fn main(){ > let mut v = ~[1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,]; > > let mut part1 = v.view(0,5); > let mut part2 = vec::slice(v, 6 , 10); > let mut count = 0; > for 5.times{ > io::println(fmt!("%?", part1[count])); > count = count + 1; > } > part1[2] = 3; > } > > > This code is giving me an error, and I am not sure how to declare a > unique vector with mutable content. > > Now apart from the fact that I am getting a compile error, would the > values of vector v be changed after the statement : part1[2] = 3; ???? > > Is this even legal? > > What would happen if I sent my slice over a channel and fiddled with it? > > Is that allowed?
> > Paul > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From deansherthompson at gmail.com Tue Jan 22 06:55:46 2013 From: deansherthompson at gmail.com (Dean Thompson) Date: Tue, 22 Jan 2013 06:55:46 -0800 Subject: [rust-dev] "intimidation factor" vs target audience Message-ID: I am new to Rust, but quite excited about it. I have read most of the docs carefully. I'm looking at some code that Niko Matsakis updated in https://github.com/stevej/rustled/commits/master/red_black_tree.rs pure fn each(&self, f: fn(&(&self/K, &self/V)) -> bool) { match *self { Leaf => (), Tree(_, ref left, ref key, ref maybe_value, ref right) => { let left: &self/@RBMap = left; let key: &self/K = key; let maybe_value: &self/Option = maybe_value; let right: &self/@RBMap = right; left.each(f); match *maybe_value { Some(ref value) => { let value: &self/V = value; f(&(key, value)); } None => () }; right.each(f); } } } I understand this code reasonably well. I greatly value the attention to safety in Rust, and I appreciate the value of pointer lifetimes in maintaining that safety. My gut reaction, though, is that this code is almost as intimidating as Haskell. Even more worrisome to me, I think most mainstream programmers would find the *explanation* of this code intimidating. Who is our target audience for Rust? Graydon has said it is "frustrated C++ developers", but how sophisticated and how "brave" are we thinking they will be? (I'd like to think of myself as a team member who is just getting started, so while deferring to the senior folks, I'll say "we".) How intimidating do we think Rust is today? Am I just overreacting to unfamiliarity? How can we calibrate our "intimidation factor" before language decisions start getting harder to change? 
Do we want (and is it feasible) to define a simpler subset of the language that beginners are encouraged to stick to and that most libraries don't force clients away from? Dean ---- Dean Thompson https://github.com/deansher From bruant.d at gmail.com Tue Jan 22 08:16:52 2013 From: bruant.d at gmail.com (David Bruant) Date: Tue, 22 Jan 2013 17:16:52 +0100 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: Message-ID: <50FEBB74.4070806@gmail.com> On 22/01/2013 15:55, Dean Thompson wrote: > (...) > > Who is our target audience for Rust? Graydon has said it is > "frustrated C++ developers", but how sophisticated and how "brave" > are we thinking they will be? I think it's a cost/benefit question. Given the cost of learning a new language (Rust), are the gains in performance and safety over the existing options (C/C++) worth it? Is my answer worth enough for you to learn/understand enough French to understand it? Probably not ;-) Each team has to weigh the pros and cons and make a choice. > (I'd like to think of myself as a team member who is just getting > started, so while deferring to the senior folks, I'll say "we".) > > How intimidating do we think Rust is today? Am I just overreacting > to unfamiliarity? As intimidating as learning any new language that builds ideas in ways different from what you are used to. Maybe one day Rust will replace C or C++ in teaching; on that day, will you wonder what the audience for C or C++ is? > How can we calibrate our "intimidation factor" before language > decisions start getting harder to change? > > Do we want (and is it feasible) to define a simpler subset of the > language that beginners are encouraged to stick to and that most > libraries don't force clients away from? You mean creating a mini-Rust? Or in the tutorials/documentation? The fact that a language is complicated doesn't force people to understand all of it. If you read a book in French, you don't need to know every word or every expression. Knowing the heart of the language is enough to read programs. The complicated patterns get learned along the way. David From bruant.d at gmail.com Tue Jan 22 08:21:49 2013 From: bruant.d at gmail.com (David Bruant) Date: Tue, 22 Jan 2013 17:21:49 +0100 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: <50FEBB74.4070806@gmail.com> References: <50FEBB74.4070806@gmail.com> Message-ID: <50FEBC9D.9030908@gmail.com> On 22/01/2013 17:16, David Bruant wrote: > On 22/01/2013 15:55, Dean Thompson wrote: >> (...) >> >> Who is our target audience for Rust? Graydon has said it is >> "frustrated C++ developers", but how sophisticated and how "brave" >> are we thinking they will be? > I think it's a cost/benefit question. Given the cost of learning a new > language (Rust), are the gains in performance and safety over the > existing options (C/C++) worth it? Most important part of my answer: "Is my answer worth enough for you to learn/understand enough French to understand it? Probably not ;-) Each team must weigh whether it's worth it for them." Your point about a subset is interesting and relates to how we learn natural languages. You don't need to learn every French word and expression to read most texts. Just learn the heart of the language and grow your knowledge while you work on projects, read new code, etc. But it requires a lot of work to determine what's a useful heart of the language and expose that in a friendly manner, so maybe it's too early to do that for Rust.
David From kodafox at gmail.com Tue Jan 22 09:05:14 2013 From: kodafox at gmail.com (Jake Kerr) Date: Wed, 23 Jan 2013 02:05:14 +0900 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: <50FEBC9D.9030908@gmail.com> References: <50FEBB74.4070806@gmail.com> <50FEBC9D.9030908@gmail.com> Message-ID: I'm in a similar position to Dean, being new to the language but having studied the docs quite a bit. I have to agree that the sample he posted is rather intimidating. There are a few things about it: I find the syntax for lifetimes quite hard to get used to, since it overloads the & operator to mean something different, and because the convention seems to be to name the lifetime 'self' inside methods, which overloads that meaning as well. So now just in the method signature you have &self in three places that mean two separate things. All of the casting (is it casting? I'll use that term as I don't know the correct one) to local variables with said lifetime is also a bit noisy and hard to parse. It would be nice if there was syntax such that the lifetime could be applied as the variables are bound in the match deconstruction. And couldn't the types be inferred in the casting case, such that the lifetime doesn't need to have the type after the slash? If the point of the casting is just to specify the lifetime, it would be nice if there was a syntax to specify the lifetime without repeating the type info of the operand. I hope my suggestions make at least a bit of sense. I'm sorry, it's quite the challenge to talk about a language when you don't have all the vocabulary down. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From graydon at mozilla.com Tue Jan 22 09:23:51 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 22 Jan 2013 09:23:51 -0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: Message-ID: <50FECB27.8020700@mozilla.com> On 22/01/2013 6:55 AM, Dean Thompson wrote: > I'm looking at some code that Niko Matsakis updated in > https://github.com/stevej/rustled/commits/master/red_black_tree.rs > > pure fn each(&self, f: fn(&(&self/K, &self/V)) -> bool) { > match *self { > Leaf => (), > Tree(_, ref left, ref key, ref maybe_value, ref right) => { > let left: &self/@RBMap = left; > let key: &self/K = key; > let maybe_value: &self/Option = maybe_value; > let right: &self/@RBMap = right; > left.each(f); > match *maybe_value { > Some(ref value) => { > let value: &self/V = value; > f(&(key, value)); > } > None => () > }; > right.each(f); > } > } > } > > I understand this code reasonably well. I greatly value the attention > to safety in Rust, and I appreciate the value of pointer lifetimes in > maintaining that safety. > > My gut reaction, though, is that this code is almost as intimidating > as Haskell. Even more worrisome to me, I think most mainstream > programmers would find the *explanation* of this code intimidating. I agree that the cognitive load on this code sample is high. This is the main risk we took (aside from "potential unsoundness", which I didn't really think to be a big risk, judging from Niko's comfort with the semantics) when adopting first class region pointers: that the resulting types would be too complex to understand, and/or require too much chatter when writing out in full. To my eyes the matter is not yet entirely clear. It's complex but it's not quite "impossibly complex"; if you made all the '&self/' symbols into just '&' it would be, I think, not so bad. 
Compare if you like to the associated bits of code from libc++ required to implement roughly-equivalent "iterate through the treemap" sort of functionality: _LIBCPP_INLINE_VISIBILITY __tree_iterator& operator++() { __ptr_ = static_cast<__node_pointer>( __tree_next( static_cast<__node_base_pointer>(__ptr_))); return *this; } template <class _NodePtr> _NodePtr __tree_next(_NodePtr __x) _NOEXCEPT { if (__x->__right_ != nullptr) return __tree_min(__x->__right_); while (!__tree_is_left_child(__x)) __x = __x->__parent_; return __x->__parent_; } template <class _NodePtr> inline _LIBCPP_INLINE_VISIBILITY bool __tree_is_left_child(_NodePtr __x) _NOEXCEPT { return __x == __x->__parent_->__left_; } template <class _NodePtr> inline _LIBCPP_INLINE_VISIBILITY _NodePtr __tree_min(_NodePtr __x) _NOEXCEPT { while (__x->__left_ != nullptr) __x = __x->__left_; return __x; } And keep in mind that there is no memory-safety in that code: if I invalidate a C++ map while iterating, I just get a wild pointer dereference and crash. If I rewrote it in terms of shared_ptr<> it'd be even chattier. > Who is our target audience for Rust? Graydon has said it is > "frustrated C++ developers", but how sophisticated and how "brave" > are we thinking they will be? The target audience is frustrated C++ developers, same as always. If they balk at the syntax for lifetime-bounds on borrowed pointers, then yes, we've blown the cognitive budget, and have failed. It is not clear to me yet that that's true. But it's a risk. One we're all aware of and worried about. > How intimidating do we think Rust is today? Am I just overreacting > to unfamiliarity? I don't know. It's a very hard thing to measure. I know of lots of languages that have failed for this reason. It's a major hazard. > How can we calibrate our "intimidation factor" before language > decisions start getting harder to change?
If you search our mailing list, IRC logs or meeting minutes for "cognitive budget", "cognitive load" or "cognitive burden" you will see we have always been keenly aware of this risk and treat it as a primary constraint when doing design work. It's a leading reason why many features have been removed, simplified, minimized or excluded from consideration. > Do we want (and is it feasible) to define a simpler subset of the > language that beginners are encouraged to stick to and that most > libraries don't force clients away from? Personal opinion: no. That just makes the issue even more confusing. The way to approach this is head-on, by looking at the things that cause the most confusion and trying to make them cause less. Thanks for bringing this up. I'm interested to hear others' opinions on whether we're past a reasonable limit of comprehensibility. It's a hard thing to hear, but better to hear now than later, if true. -Graydon From snopanen at gmail.com Tue Jan 22 09:25:17 2013 From: snopanen at gmail.com (Sami Nopanen) Date: Tue, 22 Jan 2013 12:25:17 -0500 Subject: [rust-dev] Couple of Random Questions In-Reply-To: <50F7264E.3090308@alum.mit.edu> References: <50F7264E.3090308@alum.mit.edu> Message-ID: Thank you for the answers. 2. How to allocate a mutable managed vector dynamically? > I can create an owned vector with: let a = vec::from_elem(..); >> I can create a managed vector with: let a = at_vec::from_elem(..); >> > > You cannot. Because the elements of a managed vector are stored inline > without indirection, and managed vectors are inherently shared, they cannot > change length after they are created. Think of a managed vector like a Java > array, which has the same properties. > > If you want a mutable vector, you must place an owned vector into a > managed box. At the moment, this is most conveniently done using the `DVec` > wrapper (this is what it exists for). That is, a type like `@DVec` is > basically the equivalent of Java's `ArrayList`. 
> > In the future, we currently plan to build in better support for managed, > mutable data using a plan, so it is likely that `@DVec` will be removed > in favor of something like `@mut ~[T]`. I'm finding not being able to store array types in mutable boxes a bit concerning. Some of my main fields of interest are in computer graphics and numerical computing, both in which you'd end up having large arrays of mutable data (frame buffer: mut [u8], matrices: mut [float]). And in both cases, the sizes would not be known until runtime, so they could not be represented as [type * cnt]. In both cases, growing the vector is of no importance. I guess I could store both in the exchange heap, but it just seems a bit weird as I don't really have interest in passing the data between different threads; and I'll lose the flexibility of being able to have multiple pointers to the data, apart from using a wrapper. -------------- next part -------------- An HTML attachment was scrubbed... URL: From graydon at mozilla.com Tue Jan 22 10:09:43 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 22 Jan 2013 10:09:43 -0800 Subject: [rust-dev] Couple of Random Questions In-Reply-To: References: <50F7264E.3090308@alum.mit.edu> Message-ID: <50FED5E7.6080602@mozilla.com> On 22/01/2013 9:25 AM, Sami Nopanen wrote: > And in both cases, the sizes would not be known until runtime, so they > could not be represented as [type * cnt]. In both cases, growing the > vector is of no importance. Oh, I think perhaps Niko overestimated what you were asking for. One can't create a _resizable_ @[], due to the managed ownership, but one with a fixed size that happens to only be learned at runtime should be quite possible, mutable or otherwise. I think this is just a missing function in our standard library for creating the mutable variant. 
-Graydon From niko at alum.mit.edu Tue Jan 22 10:32:52 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 22 Jan 2013 10:32:52 -0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: Message-ID: <50FEDB54.8070501@alum.mit.edu> I think this is a very important question. Rust is certainly betting that programmers are willing to learn new concepts and tools. I don't think anyone will be able to say for certain if this bet will pay off. Any time that we talk about introducing a new language feature, though, you can be sure that we worry about exceeding our complexity budget. I want to look again at this example that you raised. However, I've modified it by removing the `let` statements that are there just to work around a bug. It looks a lot simpler now, but there are still a fair number of concepts at play here. > pure fn each(&self, f: fn(&(&self/K, &self/V)) -> bool) { > match *self { > Leaf => (), > Tree(_, ref left, ref key, ref maybe_value, ref right) => { > left.each(f); > match *maybe_value { > Some(ref value) => { f(&(key, value)); } > None => {} > } > right.each(f); > } > } > } Here as I see it are the important new ideas: - closures. - matching against disjoint unions and `ref` bindings that create pointers into the structure being matched. - lifetime declarations in the signature, tying the lifetime of the key/value pointers that will be provided to the lifetime of the receiver. The first two seem to me to be clearly within your average C++ programmer's range of understanding. (Let's not forget that to write C++ programs that actually work without crashing, as opposed to just ones that compile, you've got to have a fair amount of knowledge to start with.) The third point, lifetimes, is where things get more complex. Certainly C++ programmers have an intuition for this, but it's never been necessary to notate it before. There are some mitigating circumstances.
For example, the signature could be modified to drop all explicit lifetimes, as follows: pure fn each(&self, f: fn(&(&K,&V)) -> bool) { This would still typecheck. However, it would be providing less information to the iteratee. That is, the function `f` is only being told to expect two borrowed pointers, but it is not being told what the lifetimes of those pointers are. Therefore, it must assume they are only valid for the duration of the call. This is usually good enough but not always, so it's certainly better for libraries to go the extra mile in terms of providing full annotation. Anyway, I'm not precisely answering your question, of course. I don't know the answer. I think though that, if you want to guarantee safety conditions, there is a big danger in designing the language to be *too* simple. I think a lot of things that seem simple at first wind up being frustrating once you've gained more experience. We definitely want Rust to be as simple as possible, but not so simple that it can't express the kinds of things you want to express. For example, the by-reference modes that predated lifetimes were quite possibly easier to understand (though I'm not sure, let's not forget that they engendered a fair amount of confusion, even amongst the core developers), but they were also fairly inexpressive. I ran up against these limits *all the time* in practice and the only escape was to make use of managed pointers. That was part of the motivation for regions in the first place. That said, you don't want the language to be too intimidating. I worry about the idea of a subset that we encourage people to stick to, but I do think we should think about the order in which ideas should be "phased in". I hope also that when writing casual code people can get away with less specific annotation. Niko Dean Thompson wrote: > I am new to Rust, but quite excited about it. I have read most of the > docs > carefully. 
> > I'm looking at some code that Niko Matsakis updated in > https://github.com/stevej/rustled/commits/master/red_black_tree.rs > > pure fn each(&self, f: fn(&(&self/K,&self/V)) -> bool) { > match *self { > Leaf => (), > Tree(_, ref left, ref key, ref maybe_value, ref right) => { > let left:&self/@RBMap = left; > let key:&self/K = key; > let maybe_value:&self/Option = maybe_value; > let right:&self/@RBMap = right; > left.each(f); > match *maybe_value { > Some(ref value) => { > let value:&self/V = value; > f(&(key, value)); > } > None => () > }; > right.each(f); > } > } > } > > I understand this code reasonably well. I greatly value the attention > to safety in Rust, and I appreciate the value of pointer lifetimes in > maintaining that safety. > > My gut reaction, though, is that this code is almost as intimidating > as Haskell. Even more worrisome to me, I think most mainstream > programmers would find the *explanation* of this code intimidating. > > Who is our target audience for Rust? Graydon has said it is > "frustrated C++ developers", but how sophisticated and how "brave" > are we thinking they will be? > > (I'd like to think of myself as a team member who is just getting > started, so while deferring to the senior folks, I'll say "we".) > > How intimidating do we think Rust is today? Am I just overreacting > to unfamiliarity? > > How can we calibrate our "intimidation factor" before language > decisions start getting harder to change? > > Do we want (and is it feasible) to define a simpler subset of the > language that beginners are encouraged to stick to and that most > libraries don't force clients away from? 
> > Dean > ---- > > Dean Thompson > https://github.com/deansher > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From snopanen at gmail.com Tue Jan 22 10:37:24 2013 From: snopanen at gmail.com (Sami Nopanen) Date: Tue, 22 Jan 2013 13:37:24 -0500 Subject: [rust-dev] Couple of Random Questions In-Reply-To: <50FED5E7.6080602@mozilla.com> References: <50F7264E.3090308@alum.mit.edu> <50FED5E7.6080602@mozilla.com> Message-ID: > > And in both cases, the sizes would not be known until runtime, so they > > could not be represented as [type * cnt]. In both cases, growing the > > vector is of no importance. > > Oh, I think perhaps Niko overestimated what you were asking for. One > can't create a _resizable_ @[], due to the managed ownership, but one > with a fixed size that happens to only be learned at runtime should be > quite possible, mutable or otherwise. I think this is just a missing > function in our standard library for creating the mutable variant. > Thanks, it sounded like something that should be possible :) Anyway, this brings me to yet another question/concern regarding mutable managed vectors: I was reading pcwaltons blog regarding the new borrow checker rules. It mentions that for managed mutable boxes, attempting to mutate will go through runtime checks to enforce these rules. It wasn't quite clear on when these runtime checks actually do occur. Let's say I'm representing a large matrix as @mut [float], and make a function to do some operating on it: fn fooOp(data : &mut [float]) { for uint::range(0u, vec::len(data)) |idx| { data[idx] = someOp(data[idx]); } } Will the new borrow checker now cause such tight inner loops (that would be a pretty common use case I'd assume) to have additional runtime overhead? Sami -------------- next part -------------- An HTML attachment was scrubbed... 
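[A note for readers following this thread today: the dynamic check Sami asks about maps onto modern Rust's `RefCell`, and the cost model is the one Patrick describes in his reply — one check at the moment of borrowing, none inside the loop. A minimal sketch in post-1.0 syntax; the `foo_op` name mirrors Sami's example, and this is an illustration rather than the 0.5-era code under discussion:]

```rust
use std::cell::RefCell;

// Tight inner loop over a plain `&mut [f64]` -- no dynamic checks in here.
fn foo_op(data: &mut [f64]) {
    for x in data.iter_mut() {
        *x *= 2.0;
    }
}

fn main() {
    // `RefCell` plays the role of `@mut` here: mutation is guarded by a
    // runtime borrow check, but the check runs once per borrow, not once
    // per element access.
    let matrix = RefCell::new(vec![1.0, 2.0, 3.0]);
    {
        let mut guard = matrix.borrow_mut(); // the one dynamic check
        foo_op(&mut guard[..]);              // the loop itself runs check-free
    }
    assert_eq!(*matrix.borrow(), vec![2.0, 4.0, 6.0]);
}
```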
URL: From pwalton at mozilla.com Tue Jan 22 10:42:48 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Tue, 22 Jan 2013 10:42:48 -0800 Subject: [rust-dev] Couple of Random Questions In-Reply-To: References: <50F7264E.3090308@alum.mit.edu> <50FED5E7.6080602@mozilla.com> Message-ID: <50FEDDA8.6030009@mozilla.com> On 1/22/13 10:37 AM, Sami Nopanen wrote: > Anyway, this brings me to yet another question/concern regarding mutable > managed vectors: > I was reading pcwaltons blog regarding the new borrow checker rules. It > mentions that for managed mutable boxes, attempting to mutate will go > through runtime checks to enforce these rules. It wasn't quite clear on > when these runtime checks actually do occur. Let's say I'm representing > a large matrix as @mut [float], and make a function to do some operating > on it: > > fn fooOp(data : &mut [float]) { > for uint::range(0u, vec::len(data)) |idx| { > data[idx] = someOp(data[idx]); > } > } > > Will the new borrow checker now cause such tight inner loops (that would > be a pretty common use case I'd assume) to have additional runtime overhead? There's no runtime overhead for the code you wrote. The check is performed at the moment you borrow the `@mut [float]` vector to `&mut [float]`, and no checks are performed as long as the `&mut [float]` pointer remains alive. Patrick From deansherthompson at gmail.com Tue Jan 22 11:01:43 2013 From: deansherthompson at gmail.com (Dean Thompson) Date: Tue, 22 Jan 2013 11:01:43 -0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: <50FECB27.8020700@mozilla.com> Message-ID: Looking at Niko's blog post http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ We do, to my eye, get a huge improvement if we both tweak the notation and also augment the ref deconstruction syntax to indicate the resulting pointer timeline. 
Doing this with Niko's preferred option 8 gives us: pure fn each(&self, f: fn(&(&{self}K, &{self}V)) -> bool) { match *self { Leaf => (), Tree(_, ref{self} left, ref{self} key, ref{self} maybe_value, ref{self} right) => { left.each(f); match *maybe_value { Some(ref{self} value) => { f(&(key, value)); } None => () }; right.each(f); } } FWIW, Niko's ${foo}bar notation helps my mental "parser" a great deal, because it makes foo look like a modifier to me. When I see &foo/bar, my mind fights to make it a pointer to foo with a strange trailing bar. Dean On 1/22/13 9:23 AM, "Graydon Hoare" wrote: >On 22/01/2013 6:55 AM, Dean Thompson wrote: > >> I'm looking at some code that Niko Matsakis updated in >> https://github.com/stevej/rustled/commits/master/red_black_tree.rs >> >> pure fn each(&self, f: fn(&(&self/K, &self/V)) -> bool) { >> match *self { >> Leaf => (), >> Tree(_, ref left, ref key, ref maybe_value, ref right) => { >> let left: &self/@RBMap = left; >> let key: &self/K = key; >> let maybe_value: &self/Option = maybe_value; >> let right: &self/@RBMap = right; >> left.each(f); >> match *maybe_value { >> Some(ref value) => { >> let value: &self/V = value; >> f(&(key, value)); >> } >> None => () >> }; >> right.each(f); >> } >> } >> } >> >> I understand this code reasonably well. I greatly value the attention >> to safety in Rust, and I appreciate the value of pointer lifetimes in >> maintaining that safety. >> >> My gut reaction, though, is that this code is almost as intimidating >> as Haskell. Even more worrisome to me, I think most mainstream >> programmers would find the *explanation* of this code intimidating. > >I agree that the cognitive load on this code sample is high. 
This is the >main risk we took (aside from "potential unsoundness", which I didn't >really think to be a big risk, judging from Niko's comfort with the >semantics) when adopting first class region pointers: that the resulting >types would be too complex to understand, and/or require too much >chatter when writing out in full. > >To my eyes the matter is not yet entirely clear. It's complex but it's >not quite "impossibly complex"; if you made all the '&self/' symbols >into just '&' it would be, I think, not so bad. Compare if you like to >the associated bits of code from libc++ required to implement >roughly-equivalent "iterate through the treemap" sort of functionality: > > >_LIBCPP_INLINE_VISIBILITY >__tree_iterator& operator++() { > __ptr_ = static_cast<__node_pointer>( > __tree_next( > static_cast<__node_base_pointer>(__ptr_))); > return *this; >} > >template <class _NodePtr> >_NodePtr >__tree_next(_NodePtr __x) _NOEXCEPT >{ > if (__x->__right_ != nullptr) > return __tree_min(__x->__right_); > while (!__tree_is_left_child(__x)) > __x = __x->__parent_; > return __x->__parent_; >} > >template <class _NodePtr> >inline _LIBCPP_INLINE_VISIBILITY >bool >__tree_is_left_child(_NodePtr __x) _NOEXCEPT >{ > return __x == __x->__parent_->__left_; >} > >template <class _NodePtr> >inline _LIBCPP_INLINE_VISIBILITY >_NodePtr >__tree_min(_NodePtr __x) _NOEXCEPT >{ > while (__x->__left_ != nullptr) > __x = __x->__left_; > return __x; >} > >And keep in mind that there is no memory-safety in that code: if I >invalidate a C++ map while iterating, I just get a wild pointer >dereference and crash. If I rewrote it in terms of shared_ptr<> it'd be >even chattier. > >> Who is our target audience for Rust? Graydon has said it is >> "frustrated C++ developers", but how sophisticated and how "brave" >> are we thinking they will be? > >The target audience is frustrated C++ developers, same as always. If >they balk at the syntax for lifetime-bounds on borrowed pointers, then >yes, we've blown the cognitive budget, and have failed. 
> >It is not clear to me yet that that's true. But it's a risk. One we're >all aware of and worried about. > >> How intimidating do we think Rust is today? Am I just overreacting >> to unfamiliarity? > >I don't know. It's a very hard thing to measure. I know of lots of >languages that have failed for this reason. It's a major hazard. > >> How can we calibrate our "intimidation factor" before language >> decisions start getting harder to change? > >If you search our mailing list, IRC logs or meeting minutes for >"cognitive budget", "cognitive load" or "cognitive burden" you will see >we have always been keenly aware of this risk and treat it as a primary >constraint when doing design work. It's a leading reason why many >features have been removed, simplified, minimized or excluded from >consideration. > >> Do we want (and is it feasible) to define a simpler subset of the >> language that beginners are encouraged to stick to and that most >> libraries don't force clients away from? > >Personal opinion: no. That just makes the issue even more confusing. The >way to approach this is head-on, by looking at the things that cause the >most confusion and trying to make them cause less. > >Thanks for bringing this up. I'm interested to hear others' opinions on >whether we're past a reasonable limit of comprehensibility. It's a hard >thing to hear, but better to hear now than later, if true. 
> >-Graydon > >_______________________________________________ >Rust-dev mailing list >Rust-dev at mozilla.org >https://mail.mozilla.org/listinfo/rust-dev From illissius at gmail.com Tue Jan 22 15:46:06 2013 From: illissius at gmail.com (=?ISO-8859-1?Q?G=E1bor_Lehel?=) Date: Wed, 23 Jan 2013 00:46:06 +0100 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: <50FECB27.8020700@mozilla.com> Message-ID: On Tue, Jan 22, 2013 at 8:01 PM, Dean Thompson wrote: > FWIW, Niko's ${foo}bar notation helps my mental "parser" a great deal, > because it > makes foo look like a modifier to me. When I see &foo/bar, my mind fights > to make > it a pointer to foo with a strange trailing bar. +1 to this. I haven't been able to articulate why &foo/bar feels unnatural to me, but this is probably it. It's hard to intuit by looking at it which half is what, and why it is that way instead of the reverse. That, and my brain tries to interpret it as division, or something for which division is a good metaphor, and it doesn't lead anywhere. (I don't mean to harp on the issue, I already commented about it at Niko's blog.) -- Your ship was destroyed in a monadic eruption. From snopanen at gmail.com Tue Jan 22 19:33:19 2013 From: snopanen at gmail.com (Sami Nopanen) Date: Tue, 22 Jan 2013 22:33:19 -0500 Subject: [rust-dev] Lifetime Questions Message-ID: Hi, I'm trying to get my head around lifetime parameters. I think I mostly get them, for the simple cases anyway, but there are couple of examples that are leaving me confused. Copied some of the example code here from http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/: struct StringReader { value: &str, count: uint } impl StringReader { fn new(value: &self/str) -> StringReader/&self { StringReader { value: value, count: 0 } } } fn remaining(s: &StringReader) -> uint { return s.value.len() - s.count; } fn value(s: &v/StringReader) -> &v/str { return s.value; } 1. 
Why is the lifetime name sometimes before a type and sometimes after the type? For example in the return types of the following two functions: fn new(value: &self/str) -> StringReader/&self { .. } fn value(s: &v/StringReader) -> &v/str { .. } And I think there was a mention that in general the notation is just a shorthand for &lf1/(type/&lf2); what's the difference between the two lifetimes lf1 and lf2 and why do we need two of them? 2. In the 'new' method, is the 'self' in the parameter 'value: &self/str' just a random name for the lifetime parameter or does this refer somehow to the 'self' type (in which case I'd be ever more confused, I guess :-): fn new(value: &self/str) -> StringReader/&self { StringReader { value: value, count: 0 } } 3. In the 'new' method, we are actually building a new instance of a type with a lifetime as restricted by the 'value' parameter. As far as I understand, this would mean that the type instance would have to be built in the same stack frame as where the variable binding defining the lifetime parameter resides. I can see how this could work if the lifetime parameter comes from the directly calling function, such as: fn foo() { let s = ~"foobar"; let sr = StringReader::new(s); .. } Now, using the 'build rvalue directly in calling stack' ABI trick, this would make sense. But how about if the string has a lifetime beyond the calling function, such as: fn foo(s : &str) { let sr = StringReader::new(s); ... sr } fn bar() { let s = ~"foobar"; let sr = foo(s); } Now, for the lifetime of StringReader to match the lifetime of the string, it AFAIK would need to be stored in the stack frame of 'bar' instead of the stack frame of 'foo'. Does this actually happen (e.g. the compiler is able to preallocate the correct slots in the stack), is there some other magic going on or would this just be a compile error? Sami -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ben.striegel at gmail.com Tue Jan 22 19:39:53 2013 From: ben.striegel at gmail.com (Benjamin Striegel) Date: Tue, 22 Jan 2013 22:39:53 -0500 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: <50FECB27.8020700@mozilla.com> Message-ID: Sadly, you should really read this subsequent blog post: http://smallcultfollowing.com/babysteps/blog/2013/01/15/lifetime-notation-redux/ It turns out that this syntax is ambiguous without introducing a whitespace dependency. I think it might still be worth it, but I know that a lot of people tend to shy away from such things on principle. On Tue, Jan 22, 2013 at 2:01 PM, Dean Thompson wrote: > Looking at Niko's blog post > > > http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ > > We do, to my eye, get a huge improvement if we both tweak the notation and > also augment the ref deconstruction syntax to indicate the resulting > pointer > timeline. > > Doing this with Niko's preferred option 8 gives us: > > pure fn each(&self, f: fn(&(&{self}K, &{self}V)) -> bool) { > match *self { > Leaf => (), > Tree(_, ref{self} left, ref{self} key, > ref{self} maybe_value, ref{self} right) => { > left.each(f); > match *maybe_value { > Some(ref{self} value) => { > f(&(key, value)); > } > None => () > }; > right.each(f); > } > } > > > FWIW, Niko's ${foo}bar notation helps my mental "parser" a great deal, > because it > makes foo look like a modifier to me. When I see &foo/bar, my mind fights > to make > it a pointer to foo with a strange trailing bar. 
> > Dean > > > On 1/22/13 9:23 AM, "Graydon Hoare" wrote: > > >On 22/01/2013 6:55 AM, Dean Thompson wrote: > > > >> I'm looking at some code that Niko Matsakis updated in > >> https://github.com/stevej/rustled/commits/master/red_black_tree.rs > >> > >> pure fn each(&self, f: fn(&(&self/K, &self/V)) -> bool) { > >> match *self { > >> Leaf => (), > >> Tree(_, ref left, ref key, ref maybe_value, ref right) => { > >> let left: &self/@RBMap = left; > >> let key: &self/K = key; > >> let maybe_value: &self/Option = maybe_value; > >> let right: &self/@RBMap = right; > >> left.each(f); > >> match *maybe_value { > >> Some(ref value) => { > >> let value: &self/V = value; > >> f(&(key, value)); > >> } > >> None => () > >> }; > >> right.each(f); > >> } > >> } > >> } > >> > >> I understand this code reasonably well. I greatly value the attention > >> to safety in Rust, and I appreciate the value of pointer lifetimes in > >> maintaining that safety. > >> > >> My gut reaction, though, is that this code is almost as intimidating > >> as Haskell. Even more worrisome to me, I think most mainstream > >> programmers would find the *explanation* of this code intimidating. > > > >I agree that the cognitive load on this code sample is high. This is the > >main risk we took (aside from "potential unsoundness", which I didn't > >really think to be a big risk, judging from Niko's comfort with the > >semantics) when adopting first class region pointers: that the resulting > >types would be too complex to understand, and/or require too much > >chatter when writing out in full. > > > >To my eyes the matter is not yet entirely clear. It's complex but it's > >not quite "impossibly complex"; if you made all the '&self/' symbols > >into just '&' it would be, I think, not so bad. 
Compare if you like to > >the associated bits of code from libc++ required to implement > >roughly-equivalent "iterate through the treemap" sort of functionality: > > > > > >_LIBCPP_INLINE_VISIBILITY > >__tree_iterator& operator++() { > > __ptr_ = static_cast<__node_pointer( > > __tree_next( > > static_cast<__node_base_pointer>(__ptr_))); > > return *this; > >} > > > >template > >_NodePtr > >__tree_next(_NodePtr __x) _NOEXCEPT > >{ > > if (__x->__right_ != nullptr) > > return __tree_min(__x->__right_); > > while (!__tree_is_left_child(__x)) > > __x = __x->__parent_; > > return __x->__parent_; > >} > > > >template > >inline _LIBCPP_INLINE_VISIBILITY > >bool > >__tree_is_left_child(_NodePtr __x) _NOEXCEPT > >{ > > return __x == __x->__parent_->__left_; > >} > > > >template > >inline _LIBCPP_INLINE_VISIBILITY > >_NodePtr > >__tree_min(_NodePtr __x) _NOEXCEPT > >{ > > while (__x->__left_ != nullptr) > > __x = __x->__left_; > > return __x; > >} > > > >And keep in mind that there is no memory-safety in that code: if I > >invalidate a C++ map while iterating, I just get a wild pointer > >dereference and crash. If I rewrote it in terms of shared_ptr<> it'd be > >even chattier. > > > >> Who is our target audience for Rust? Graydon has said it is > >> "frustrated C++ developers", but how sophisticated and how "brave" > >> are we thinking they will be? > > > >The target audience is frustrated C++ developers, same as always. If > >they balk at the syntax for lifetime-bounds on borrowed pointers, then > >yes, we've blown the cognitive budget, and have failed. > > > >It is not clear to me yet that that's true. But it's a risk. One we're > >all aware of and worried about. > > > >> How intimidating do we think Rust is today? Am I just overreacting > >> to unfamiliarity? > > > >I don't know. It's a very hard thing to measure. I know of lots of > >languages that have failed for this reason. It's a major hazard. 
> > > >> How can we calibrate our "intimidation factor" before language > >> decisions start getting harder to change? > > > >If you search our mailing list, IRC logs or meeting minutes for > >"cognitive budget", "cognitive load" or "cognitive burden" you will see > >we have always been keenly aware of this risk and treat it as a primary > >constraint when doing design work. It's a leading reason why many > >features have been removed, simplified, minimized or excluded from > >consideration. > > > >> Do we want (and is it feasible) to define a simpler subset of the > >> language that beginners are encouraged to stick to and that most > >> libraries don't force clients away from? > > > >Personal opinion: no. That just makes the issue even more confusing. The > >way to approach this is head-on, by looking at the things that cause the > >most confusion and trying to make them cause less. > > > >Thanks for bringing this up. I'm interested to hear others' opinions on > >whether we're past a reasonable limit of comprehensibility. It's a hard > >thing to hear, but better to hear now than later, if true. > > > >-Graydon > > > >_______________________________________________ > >Rust-dev mailing list > >Rust-dev at mozilla.org > >https://mail.mozilla.org/listinfo/rust-dev > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From niko at alum.mit.edu Tue Jan 22 20:29:31 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 22 Jan 2013 20:29:31 -0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: <50FEBB74.4070806@gmail.com> <50FEBC9D.9030908@gmail.com> Message-ID: <50FF672B.1010600@alum.mit.edu> Jake Kerr wrote: > > I find the syntax for lifetimes to be quite hard to get used to since > it overloads the & operator to mean something different, and because > the convention seems to be to name the lifetime 'self' inside methods, > which overloads that meaning as well. So now just in the method > signature you have &self in three places that mean two separate things. > Many people mention the convention of using the name `self` to refer to the main lifetime parameter as confusing. It seems likely that we should change this convention! I've been talking about redesigning some of the lifetime stuff to be less implicit, this may help here as well. > > All of the casting (is it casting?, I'll use that term as I don't know > the correct one. ) to local variables with said lifecycle, is also a > bit noisy and hard to parse. It would be nice if there was syntax such > that the lifecycle could be applied as the variables are bound in the > match deconstruction. > I agree those casts are ugly; as I said in my earlier e-mail, though, those are actually there as a workaround for a bug. Niko From niko at alum.mit.edu Tue Jan 22 20:29:42 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 22 Jan 2013 20:29:42 -0800 Subject: [rust-dev] Couple of Random Questions In-Reply-To: References: <50F7264E.3090308@alum.mit.edu> Message-ID: <50FF6736.6080509@alum.mit.edu> If all that you want is the ability to mutate the elements of the vector, that should be possible. It may be that the requisite library support is missing, however, I'm not really certain. Niko Sami Nopanen wrote: > Thank you for the answers. > > 2. 
How to allocate a mutable managed vector dynamically? > > I can create an owned vector with: let a = vec::from_elem(..); > I can create a managed vector with: let a = at_vec::from_elem(..); > > > You cannot. Because the elements of a managed vector are stored > inline without indirection, and managed vectors are inherently > shared, they cannot change length after they are created. Think of > a managed vector like a Java array, which has the same properties. > > If you want a mutable vector, you must place an owned vector into > a managed box. At the moment, this is most conveniently done using > the `DVec` wrapper (this is what it exists for). That is, a type > like `@DVec<T>` is basically the equivalent of Java's `ArrayList<T>`. > > In the future, we plan to build in better support for > managed, mutable data, so it is likely that > `@DVec<T>` will be removed in favor of something like `@mut ~[T]`. > > I'm finding not being able to store array types in mutable boxes a bit > concerning. Some of my main fields of interest are in computer > graphics and numerical computing, both in which you'd end up having > large arrays of mutable data (frame buffer: mut [u8], matrices: mut > [float]). > And in both cases, the sizes would not be known until runtime, so they > could not be represented as [type * cnt]. In both cases, growing the > vector is of no importance. I guess I could store both in the exchange > heap, but it just seems a bit weird as I don't really have interest > in passing the data between different threads; and I'll lose the > flexibility of being able to have multiple pointers to the data, apart > from using > a wrapper. -------------- next part -------------- An HTML attachment was scrubbed... 
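[A modern footnote on the `@DVec` / `@mut ~[T]` discussion above: in post-1.0 Rust, the shared, growable, mutable vector described here is spelled `Rc<RefCell<Vec<T>>>` — shared ownership plus dynamically checked mutation. A small illustrative sketch, with names chosen for the example:]

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    // Shared ownership (Rc) of a growable vector (Vec) with
    // dynamically-checked mutation (RefCell).
    let a: Rc<RefCell<Vec<i32>>> = Rc::new(RefCell::new(vec![1, 2]));
    let b = Rc::clone(&a); // a second pointer to the same data

    b.borrow_mut().push(3); // grow through one handle
    a.borrow_mut()[0] = 10; // mutate through the other

    assert_eq!(*a.borrow(), vec![10, 2, 3]);
    assert_eq!(*b.borrow(), vec![10, 2, 3]);
}
```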
URL: From deansherthompson at gmail.com Wed Jan 23 05:16:20 2013 From: deansherthompson at gmail.com (Dean Thompson) Date: Wed, 23 Jan 2013 05:16:20 -0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: Message-ID: Benjamin Striegel writes: Sadly, you should really read this subsequent blog post: http://smallcultfollowing.com/babysteps/blog/2013/01/15/lifetime-notation-redux/ Ok, here's another suggestion: pure fn each(&self, f: fn(&(&/f/K, &/f/V)) -> bool) { ... } Lifetimes are always written in slashes. We drop the convention of using a /self/ lifetime. We require the explicit lifetime parameter on the function, to minimize magic. ----------- struct StringReader</s/> { value: &/s/str, count: uint } struct Foo</f/, T> { value: &/f/T, count: uint } We treat the lifetime as a type parameter, in <...>. We stick to requiring that it be explicit on the struct and the field, to minimize magic. I believe this meets Niko's goal of recognizing lifetimes in the parser. ------------- struct RefPair</fst/, /snd/, T> { first: &/fst/T, second: &/snd/T } Multiple lifetime parameters on a struct work fine. It is clear both to the human reader and to the parser that /snd/ is another lifetime while T is a type. ------------- impl StringReader { fn new(value: &/f/str) -> StringReader { StringReader { value: value, count: 0 } } } fn value(s: &/f/StringReader) -> &/f/str { return s.value; } Lots of characters, yes, but the author of the code is intentionally exercising great control, so I feel it is worth spelling out what is going on. To my eye, the consistency of always writing the lifetime in the same way, /f/, helps tie the mentions of it together. Although &/f/str is less concise than today's syntax &f/str, I feel that &/f/str alerts the C++-trained reader that something special is going on, and perhaps even suggests that f is a modifier of some kind. 
--------------- fn remaining(s: &StringReader) -> uint { return s.value.len() - s.count; } Here, the author has chosen to exercise less control. We default to scoping the lifetimes across the function declaration. --------------- It's a thought! Dean From sh4.seo at samsung.com Wed Jan 23 05:16:40 2013 From: sh4.seo at samsung.com (Sanghyeon Seo) Date: Wed, 23 Jan 2013 13:16:40 +0000 (GMT) Subject: [rust-dev] Indenting "match" Message-ID: <22213239.105641358947000510.JavaMail.weblogic@epml26> Rust compiler seems to use 4 spaces indentation, but indentation of "match" is mixed. Sometimes arms are indented 2 spaces, sometimes 4 spaces. Is there a hidden rule behind this, or is it a personal preference? From ben.striegel at gmail.com Wed Jan 23 07:04:59 2013 From: ben.striegel at gmail.com (Benjamin Striegel) Date: Wed, 23 Jan 2013 10:04:59 -0500 Subject: [rust-dev] Indenting "match" In-Reply-To: <22213239.105641358947000510.JavaMail.weblogic@epml26> References: <22213239.105641358947000510.JavaMail.weblogic@epml26> Message-ID: A long time ago, the Rust mode for Emacs would indent match arms by two spaces, so any two-space match arms are a remnant of that time. I believe the current convention is to use four spaces for match arms. On Wed, Jan 23, 2013 at 8:16 AM, Sanghyeon Seo wrote: > Rust compiler seems to use 4 spaces indentation, but indentation of > "match" is mixed. > Sometimes arms are indented 2 spaces, sometimes 4 spaces. > > Is there a hidden rule behind this, or is it a personal preference? > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
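[For concreteness, the full-indent convention described above looks like this — shown in modern Rust syntax, with `describe` as a purely illustrative example:]

```rust
// Match arms take a full four-space indent, one level in from `match`.
fn describe(n: i32) -> &'static str {
    match n {
        0 => "zero",
        n if n < 0 => "negative",
        _ => "positive",
    }
}

fn main() {
    assert_eq!(describe(0), "zero");
    assert_eq!(describe(-3), "negative");
    assert_eq!(describe(7), "positive");
}
```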
URL: From sh4.seo at samsung.com Wed Jan 23 08:31:02 2013 From: sh4.seo at samsung.com (Sanghyeon Seo) Date: Wed, 23 Jan 2013 16:31:02 +0000 (GMT) Subject: [rust-dev] Indenting "match" Message-ID: <32756176.106391358958662565.JavaMail.weblogic@epml26> > A long time ago, the Rust mode for Emacs would indent match arms by two > spaces, so any two-space match arms are a remnant of that time. I believe > the current convention is to use four spaces for match arms. After I sent the email I found that "rustc --pretty normal" indents match arms by 2 spaces... Or is that too a remnant? From pwalton at mozilla.com Wed Jan 23 08:31:42 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Wed, 23 Jan 2013 08:31:42 -0800 Subject: [rust-dev] Indenting "match" In-Reply-To: References: <22213239.105641358947000510.JavaMail.weblogic@epml26> Message-ID: <5100106E.9080000@mozilla.com> On 1/23/13 7:04 AM, Benjamin Striegel wrote: > A long time ago, the Rust mode for Emacs would indent match arms by two > spaces, so any two-space match arms are a remnant of that time. I > believe the current convention is to use four spaces for match arms. This is correct. We used to half-indent match arms, but it looked bad, so we went to a full indent for them. Unfortunately not all the code has been updated. Someday we should fix the rough edges in the pretty printer, close the tree for a day, and tidy up the entire Rust codebase. Patrick From pwalton at mozilla.com Wed Jan 23 08:32:10 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Wed, 23 Jan 2013 08:32:10 -0800 Subject: [rust-dev] Indenting "match" In-Reply-To: <32756176.106391358958662565.JavaMail.weblogic@epml26> References: <32756176.106391358958662565.JavaMail.weblogic@epml26> Message-ID: <5100108A.50907@mozilla.com> On 1/23/13 8:31 AM, Sanghyeon Seo wrote: >> A long time ago, the Rust mode for Emacs would indent match arms by two >> spaces, so any two-space match arms are a remnant of that time. 
I believe >> the current convention is to use four spaces for match arms. > > After I sent the email I found that "rustc --pretty normal" indents match arms by > 2 spaces... Or is that too a remnant? Yes, that is a bug in the pretty printer. Patrick From niko at alum.mit.edu Wed Jan 23 13:03:00 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 23 Jan 2013 13:03:00 -0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: Message-ID: <51005004.2000205@alum.mit.edu> How would people feel about something like this? &lt Foo Foo<&lt> It's somewhat inconsistent in that lifetime names do not always begin with `&`, but I think it retains the "modifier feeling" without introducing any ambiguities. Another option might be `&<&lt> Foo` but that feels like &-overload to me! Niko Dean Thompson wrote: > Benjamin Striegel writes: > > > Sadly, you should really read this subsequent blog post: > > > http://smallcultfollowing.com/babysteps/blog/2013/01/15/lifetime-notation-r > edux/ > > Ok, here's another suggestion: > > > pure fn each(&self, f: fn(&(&/f/K,&/f/V)) -> bool) { ... } > > > Lifetimes are always written in slashes. We drop the convention > of using a /self/ lifetime. We require the explicit lifetime > parameter on the function, to minimize magic. > > ----------- > struct StringReader { > value:&/s/str, > count: uint > } > > > struct Foo { > value:&/f/T, > count: uint > } > > We treat the lifetime as a type parameter, in<...>. We stick to > requiring that it be explicit on the struct and the field, to > minimize magic. I believe this meets Niko's goal of recognizing > lifetimes in the parser. > > ------------- > struct RefPair { > first:&/fst/T, > second:&/snd/T > } > > Multiple lifetime parameters on a struct work fine. > It is clear both to the human reader and to the parser > that /snd/ is another lifetime while T is a type. 
> > ------------- > impl StringReader { > fn new(value:&/f/str) -> StringReader { > StringReader { value: value, count: 0 } > } > } > > > > fn value(s:&/f/StringReader) -> &/f/str { > return s.value; > } > > > Lots of characters, yes, but the author of the code is > intentionally exercising great control, so I feel it is > worth spelling out what is going on. To my eye, the > consistency of always writing the lifetime in the same > way, /f/, helps tie the mentions of it together. Although > &/f/str is less concise than today's syntax &f/str, I > feel that &/f/str alerts the C++-trained reader that > something special is going on, and perhaps even suggests > that f is a modifier of some kind. > > > --------------- > fn remaining(s:&StringReader) -> uint { > return s.value.len() - s.count; > } > > Here, the author has chosen to exercise less control. > > We default to scoping the lifetimes across the function > declaration. > > --------------- > > It's a thought! > > Dean > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From niko at alum.mit.edu Wed Jan 23 13:29:15 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 23 Jan 2013 13:29:15 -0800 Subject: [rust-dev] Lifetime Questions In-Reply-To: References: Message-ID: <5100562B.3080000@alum.mit.edu> Sami Nopanen wrote: > 1. Why is the lifetime name sometimes before a type and sometimes > after the type. There are two kinds of types which can be annotated with lifetime names. The first is a borrowed pointer: &lt/T In this case, the lifetime `lt` is the lifetime of this pointer. The second is a struct, enum, or type alias: StringReader/&lt In this case, the lifetime `lt` is the lifetime of any borrowed pointers contained within the struct, enum, or type alias. 
So, in the case of the `StringReader` example you gave, the definition of `StringReader` was:

    struct StringReader {
        value: &str,
        count: uint
    }

Therefore, `StringReader/&lt` means "a StringReader where the lifetime of the `value` field is `lt`". The definition of StringReader is really a kind of generic type definition parameterized over a lifetime, kind of like:

    struct StringReader<&lt> {
        value: &lt/str,
        count: uint
    }

Right now the compiler hides this declaration from you to make things more pleasant and easy to type. However, we have all agreed on changing this so that lifetime parameters on structs, enums, etc will become explicit. But there are some niggling details remaining as to the precise syntax.

> 2. In the 'new' method, is the 'self' in the parameter 'value:
> &self/str' just a random name for the lifetime parameter or does
> this refer somehow to the 'self' type (in which case I'd be ever
> more confused, I guess :-):

The `self` is a lifetime in that instance. I mentioned before that (today) the compiler implicitly decides when a type is parameterized by a lifetime. When it does, it uses the name `self` as the name of the lifetime parameter. So the `self` lifetime here refers to the lifetime of the borrowed pointers contained within the `self` struct.

> As far as I understand, this would mean that the type instance would
> have to be built in the same stack frame as where the variable binding
> defining the lifetime parameter resides.

This is close but not quite right. It's not actually important where the newly constructed struct is stored. It's only important when it gets used. So if the struct is built with a string with lifetime X, then we must make sure that the struct is not used outside of that lifetime X.

> But how about if the string has a lifetime beyond the calling
> function, such as:
>     fn foo(s : &str) {
>         let sr = StringReader::new(s);
>         ...
>         sr
>     }
>     fn bar() {
>         let s = ~"foobar";
>         let sr = foo(s);
>     }

This example can be typed, but it requires some annotations. In particular, `foo` should look like:

    fn foo(s : &v/str) -> StringReader/&v {
        let sr = StringReader::new(s);
        ...
        sr
    }

Here, the return type `StringReader/&v` tells the caller of `foo()`: "I am going to be returning a StringReader that is valid for the same lifetime as the string `s` that you gave me." If you did not supply the explicit lifetime annotation, you will get a compile-time error, because the caller does not know that there is a link between the lifetime of `s` and the lifetime of the returned value. In general, whenever you return a borrowed pointer, or a value that contains borrowed pointers, you will need an explicit lifetime annotation so that you can link the lifetime of that return value to the lifetime of a parameter.

> Now, for the lifetime of StringReader to match the lifetime of the
> string, it AFAIK would need to be stored in the stack frame of 'bar'
> instead of the stackframe of 'foo'. Does this actually happen (e.g. the
> compiler is able to preallocate the correct slots in the stack), is
> there some other magic going on or would this just be a compile error?

As I wrote above, there is less magic going on than you think. The `StringReader` will initially be stored in the stack frame of `foo` and then copied into the stack frame of `bar()` when `foo()` returns. It's not required that it only be stored in `bar()`, so long as it is never used after `bar()` returns.

Hope this helps.

regards,
Niko

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From deansherthompson at gmail.com Wed Jan 23 13:37:19 2013
From: deansherthompson at gmail.com (Dean Thompson)
Date: Wed, 23 Jan 2013 13:37:19 -0800
Subject: [rust-dev] "intimidation factor" vs target audience
In-Reply-To: <51005004.2000205@alum.mit.edu>
Message-ID:

On 1/23/13 1:03 PM, "Niko Matsakis" wrote:
>How would people feel about something like this?
>
>&lt Foo
>Foo<&lt>

I'm ok with that.

Personally, though, I find myself increasingly attracted to the idea of having a consistent notation for writing a lifetime everywhere one appears, independently of the & symbol. (/lt/ is the only such notation I've found yet that seems reasonable.)

Consider, for example, how the documentation would read: A lifetime name is written as /name/. The & operator can be followed by a lifetime name. The name of the item can be followed by angle brackets containing a list of lifetime names and type parameters.

It strikes me as very easy for the programmer to learn to see /foo/ and at least go "hmm ... how is that lifetime being used?"

With the "permutations of &" approach, you can never make such definitive statements. Instead, you are always saying things like this: The name of the item can be followed by a set of angle brackets with a list of lifetime names (which are marked with a & prefix) and type parameters. The & operator can be followed by a lifetime name in angle brackets...

Also, with "permutations of &", every time there's a new need to refer to a lifetime name, we would have to come up with a new way of representing it. For example, if ref needed an explicit lifetime name, we'd have to decide where to put the & marker. There are some corresponding decisions to make in my approach, but somehow it just seems more direct to me. But then, that's me :-)

Dean

On 1/23/13 1:03 PM, "Niko Matsakis" wrote:
>How would people feel about something like this?
>
>&lt Foo
>Foo<&lt>
>
>It's somewhat inconsistent in that lifetime names do not always begin
>with `&`, but I think it retains the "modifier feeling" without
>introducing any ambiguities. Another option might be `&<lt> Foo` but
>that feels like &-overload to me!
>
>Niko
>
>Dean Thompson wrote:
>> Benjamin Striegel writes:
>> > Sadly, you should really read this subsequent blog post:
>> > http://smallcultfollowing.com/babysteps/blog/2013/01/15/lifetime-notation-redux/
>>
>> Ok, here's another suggestion:
>>
>> pure fn each(&self, f: fn(&(&/f/K, &/f/V)) -> bool) { ... }
>>
>> Lifetimes are always written in slashes. We drop the convention
>> of using a /self/ lifetime. We require the explicit lifetime
>> parameter on the function, to minimize magic.
>>
>> -----------
>> struct StringReader</s/> {
>>     value: &/s/str,
>>     count: uint
>> }
>>
>> struct Foo</f/, T> {
>>     value: &/f/T,
>>     count: uint
>> }
>>
>> We treat the lifetime as a type parameter, in <...>. We stick to
>> requiring that it be explicit on the struct and the field, to
>> minimize magic. I believe this meets Niko's goal of recognizing
>> lifetimes in the parser.
>>
>> -------------
>> struct RefPair</fst/, /snd/, T> {
>>     first: &/fst/T,
>>     second: &/snd/T
>> }
>>
>> Multiple lifetime parameters on a struct work fine.
>> It is clear both to the human reader and to the parser
>> that /snd/ is another lifetime while T is a type.
>>
>> -------------
>> impl StringReader {
>>     fn new(value: &/f/str) -> StringReader {
>>         StringReader { value: value, count: 0 }
>>     }
>> }
>>
>> fn value(s: &/f/StringReader) -> &/f/str {
>>     return s.value;
>> }
>>
>> Lots of characters, yes, but the author of the code is
>> intentionally exercising great control, so I feel it is
>> worth spelling out what is going on. To my eye, the
>> consistency of always writing the lifetime in the same
>> way, /f/, helps tie the mentions of it together.
Although
>> &/f/str is less concise than today's syntax &f/str, I
>> feel that &/f/str alerts the C++-trained reader that
>> something special is going on, and perhaps even suggests
>> that f is a modifier of some kind.
>>
>> ---------------
>> fn remaining(s: &StringReader) -> uint {
>>     return s.value.len() - s.count;
>> }
>>
>> Here, the author has chosen to exercise less control.
>>
>> We default to scoping the lifetimes across the function
>> declaration.
>>
>> ---------------
>>
>> It's a thought!
>>
>> Dean
>>
>> _______________________________________________
>> Rust-dev mailing list
>> Rust-dev at mozilla.org
>> https://mail.mozilla.org/listinfo/rust-dev

From niko at alum.mit.edu Wed Jan 23 13:44:25 2013
From: niko at alum.mit.edu (Niko Matsakis)
Date: Wed, 23 Jan 2013 13:44:25 -0800
Subject: [rust-dev] "intimidation factor" vs target audience
In-Reply-To:
References:
Message-ID: <510059B9.8020308@alum.mit.edu>

Dean Thompson wrote:
> Personally, though, I find myself increasingly attracted to the idea of
> having a consistent notation for writing a lifetime everywhere one
> appears, independently of the & symbol. (/lt/ is the only such notation
> I've found yet that seems reasonable.)

I like this idea too, I just don't like /lt/ for that role.

Maybe `'`? (shades of ML)

    &'lt Foo
    Foo<'lt>

That actually doesn't look half bad to me.

Maybe `.`?

    &.lt Foo
    Foo<.lt>

I don't like Foo<.lt>, but &.lt Foo and Foo<lt> might be ok, though it doesn't adhere to the principle (in that lifetime names are just like any other identifier).

Another option:

    &{lt} Foo
    Foo<{lt}>

But the latter form feels pretty sigil heavy.

Niko

From deansherthompson at gmail.com Wed Jan 23 13:56:00 2013
From: deansherthompson at gmail.com (Dean Thompson)
Date: Wed, 23 Jan 2013 13:56:00 -0800
Subject: [rust-dev] "intimidation factor" vs target audience
In-Reply-To: <510059B9.8020308@alum.mit.edu>
Message-ID:

Lol -
I like 'lt

Somehow it makes me laugh compared to the ML usage, but I like it. Perhaps it plays off my brain's wiring for reading something like "I think &'s are pretty". (Or for reading "brain's", for that matter.)

I like the {lt} approach ok, too, although not as much. I hadn't noticed that the ambiguity in your original Option 8 went away when we put the lifetime in the angle brackets, even if we then used curlies. But yeah, it feels heavier.

Dean

From: Niko Matsakis
Date: Wednesday, January 23, 2013 1:44 PM
To: Dean Thompson
Cc:
Subject: Re: [rust-dev] "intimidation factor" vs target audience

Dean Thompson wrote:
> Personally, though, I find myself increasingly attracted to the idea of
> having a consistent notation for writing a lifetime everywhere one
> appears, independently of the & symbol. (/lt/ is the only such notation
> I've found yet that seems reasonable.)

I like this idea too, I just don't like /lt/ for that role.

Maybe `'`? (shades of ML)

    &'lt Foo
    Foo<'lt>

That actually doesn't look half bad to me.

Maybe `.`?

    &.lt Foo
    Foo<.lt>

I don't like Foo<.lt>, but &.lt Foo and Foo<lt> might be ok, though it doesn't adhere to the principle (in that lifetime names are just like any other identifier).

Another option:

    &{lt} Foo
    Foo<{lt}>

But the latter form feels pretty sigil heavy.

Niko

From pnathan at vandals.uidaho.edu Wed Jan 23 13:59:09 2013
From: pnathan at vandals.uidaho.edu (Paul Nathan)
Date: Wed, 23 Jan 2013 13:59:09 -0800
Subject: [rust-dev] "intimidation factor" vs target audience
In-Reply-To: <510059B9.8020308@alum.mit.edu>
References: <510059B9.8020308@alum.mit.edu>
Message-ID: <4BA886B4-4ECB-420D-BF7B-4747243D86FC@vandals.uidaho.edu>

My general 2c, worthless as it might be:

One of my criticisms of perl when I use it is that my cats could sit on the keyboard and produce correct code.

Sigil heavy code produces IMO a heavy mental burden.
There is the popular c++ quiz question 'what does this syntax bundle do'. It's really quite pointless and just a puzzle IMO. Mostly a mental exercise to prove something..... I'd like not to endure that with Rust. Lifetime-qualifier/variable seems to be reasonable: I'd suggest further information could be stacked up with /s as Info/lifetime/variable Or Type/info/lt/variable Etc. I write lisp a lot and if verbosity is onerous, a macro can be written to shorten it. I presume/hope rust's macros are sufficient to that task. I think it's better to opt for obvious and lengthy at first and then allow power users to use shortcuts. Too many sigils is confusing to the uninitiated. Regards, Paul Nathan Sent from my iPhone On Jan 23, 2013, at 1:44 PM, Niko Matsakis wrote: > > > Dean Thompson wrote: >> >> Personally, though, I find myself increasingly attracted to the idea of >> having a consistent notation for writing a lifetime everywhere one >> appears, independently of the & symbol. (/lt/ is the only such notation >> I've found yet that seems reasonable.) > > I like this idea too, I just don't like /lt/ for that role. > > Maybe `'`? (shades of ML) > > &'lt Foo > Foo<'lt> > > That actually doesn't look half bad to me. > > Maybe `.`? > > &.lt Foo > Foo<.lt> > > I don't like Foo<.lt>, but &.lt Foo and Foo might be ok, though it doesn't adhere to the principle (in that lifetime names are just like any other identifier). > > Another option: > > &{lt} Foo > Foo<{lt}> > > But the latter form feels pretty sigil heavy. > > > Niko > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From niko at alum.mit.edu Wed Jan 23 15:00:17 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 23 Jan 2013 15:00:17 -0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: <4BA886B4-4ECB-420D-BF7B-4747243D86FC@vandals.uidaho.edu> References: <510059B9.8020308@alum.mit.edu> <4BA886B4-4ECB-420D-BF7B-4747243D86FC@vandals.uidaho.edu> Message-ID: <51006B81.5020506@alum.mit.edu> It's a tough balancing act. Sigils are bad but then if things are too concrete that's bad too. The slashes are visually hard to parse, I think everyone agrees. Imagine this: `&a/B/c/d/e`. That would be a legal type under that proposal and I think it's pretty darn confusing vs `&'a B<'c, 'd, 'e>` I'm really starting to like the 'lt notation, I have to say. Niko Paul Nathan wrote: > My general 2c, worthless as it might be: > > One of my criticisms of perl when I use it is that my cats could sit > on the keyboard and produce correct code. > > Sigil heavy code produces IMO a heavy mental burden. There is the > popular c++ quiz question 'what does this syntax bundle do'. It's > really quite pointless and just a puzzle IMO. Mostly a mental exercise > to prove something..... > > I'd like not to endure that with Rust. > > Lifetime-qualifier/variable seems to be reasonable: I'd suggest > further information could be stacked up with /s as > > Info/lifetime/variable > > Or > > Type/info/lt/variable > > Etc. I write lisp a lot and if verbosity is onerous, a macro can be > written to shorten it. I presume/hope rust's macros are sufficient to > that task. > > I think it's better to opt for obvious and lengthy at first and then > allow power users to use shortcuts. Too many sigils is confusing to > the uninitiated. 
>
> Regards,
> Paul Nathan
>
> Sent from my iPhone
>
> On Jan 23, 2013, at 1:44 PM, Niko Matsakis wrote:
>>
>> Dean Thompson wrote:
>>> Personally, though, I find myself increasingly attracted to the idea of
>>> having a consistent notation for writing a lifetime everywhere one
>>> appears, independently of the & symbol. (/lt/ is the only such notation
>>> I've found yet that seems reasonable.)
>>
>> I like this idea too, I just don't like /lt/ for that role.
>>
>> Maybe `'`? (shades of ML)
>>
>> &'lt Foo
>> Foo<'lt>
>>
>> That actually doesn't look half bad to me.
>>
>> Maybe `.`?
>>
>> &.lt Foo
>> Foo<.lt>
>>
>> I don't like Foo<.lt>, but &.lt Foo and Foo<lt> might be ok, though
>> it doesn't adhere to the principle (in that lifetime names are just
>> like any other identifier).
>>
>> Another option:
>>
>> &{lt} Foo
>> Foo<{lt}>
>>
>> But the latter form feels pretty sigil heavy.
>>
>> Niko
>> _______________________________________________
>> Rust-dev mailing list
>> Rust-dev at mozilla.org
>> https://mail.mozilla.org/listinfo/rust-dev

From kodafox at gmail.com Wed Jan 23 22:30:02 2013
From: kodafox at gmail.com (Jake Kerr)
Date: Thu, 24 Jan 2013 15:30:02 +0900
Subject: [rust-dev] "intimidation factor" vs target audience
In-Reply-To: <51006B81.5020506@alum.mit.edu>
References: <510059B9.8020308@alum.mit.edu> <4BA886B4-4ECB-420D-BF7B-4747243D86FC@vandals.uidaho.edu> <51006B81.5020506@alum.mit.edu>
Message-ID:

+1 for the 'lt syntax, it's the first suggestion that I haven't found difficult to visually parse. Also it's easier to distinguish from other constructs, since it's not overloading another common symbol. I also think the `.lt` option is pretty (actually more so, personally), but could see the dot being more confusing because at first glance it looks similar to accessing a struct field: &.lt, whereas &'lt is obviously different, so less chance for confusion.
I'd be happy with either of these options. On Thu, Jan 24, 2013 at 8:00 AM, Niko Matsakis wrote: > It's a tough balancing act. Sigils are bad but then if things are too > concrete that's bad too. > > The slashes are visually hard to parse, I think everyone agrees. Imagine > this: `&a/B/c/d/e`. That would be a legal type under that proposal and I > think it's pretty darn confusing vs `&'a B<'c, 'd, 'e>` > > I'm really starting to like the 'lt notation, I have to say. > > > Niko > > > Paul Nathan wrote: > > My general 2c, worthless as it might be: > > One of my criticisms of perl when I use it is that my cats could sit on > the keyboard and produce correct code. > > Sigil heavy code produces IMO a heavy mental burden. There is the popular > c++ quiz question 'what does this syntax bundle do'. It's really quite > pointless and just a puzzle IMO. Mostly a mental exercise to prove > something..... > > I'd like not to endure that with Rust. > > Lifetime-qualifier/variable seems to be reasonable: I'd suggest further > information could be stacked up with /s as > > Info/lifetime/variable > > Or > > Type/info/lt/variable > > Etc. I write lisp a lot and if verbosity is onerous, a macro can be > written to shorten it. I presume/hope rust's macros are sufficient to that > task. > > I think it's better to opt for obvious and lengthy at first and then allow > power users to use shortcuts. Too many sigils is confusing to the > uninitiated. > > > Regards, > Paul Nathan > > Sent from my iPhone > > On Jan 23, 2013, at 1:44 PM, Niko Matsakis wrote: > > > > Dean Thompson wrote: > > Personally, though, I find myself increasingly attracted to the idea of > having a consistent notation for writing a lifetime everywhere one > appears, independently of the & symbol. (*/lt/* is the only such notation > I've found yet that seems reasonable.) > > > I like this idea too, I just don't like /lt/ for that role. > > Maybe `'`? 
(shades of ML) > > &'lt Foo > Foo<'lt> > > That actually doesn't look half bad to me. > > Maybe `.`? > > &.lt Foo > Foo<.lt> > > I don't like Foo<.lt>, but &.lt Foo and Foo might be ok, though it > doesn't adhere to the principle (in that lifetime names are just like any > other identifier). > > Another option: > > &{lt} Foo > Foo<{lt}> > > But the latter form feels pretty sigil heavy. > > > Niko > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko at alum.mit.edu Wed Jan 23 22:37:28 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 23 Jan 2013 22:37:28 -0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: <51006B81.5020506@alum.mit.edu> References: <510059B9.8020308@alum.mit.edu> <4BA886B4-4ECB-420D-BF7B-4747243D86FC@vandals.uidaho.edu> <51006B81.5020506@alum.mit.edu> Message-ID: <5100D6A8.3000200@alum.mit.edu> Niko Matsakis wrote: > It's a tough balancing act. Sigils are bad but then if things are too > concrete that's bad too. Re-reading this mail, I'm not sure if I interpreted your e-mail correctly. I think I was responding as much to thoughts in my own head as to what you wrote. What I meant to say that sometimes verbose types like Region<'t, int> (which you did not propose, of course) can be hard to read too if things get too explicit. Anyhow, we have to find a notation that's as readable as possible, to be sure. > The slashes are visually hard to parse, I think everyone agrees. > Imagine this: `&a/B/c/d/e`. 
That would be a legal type under that
> proposal and I think it's pretty darn confusing vs `&'a B<'c, 'd, 'e>`

To be clear, what I don't like about &a/B/c/d/e is that it is hard to extract the structure. I think the use of `<>` is better in that respect. I also like `<>` because lifetime parameters on types are exactly analogous to type parameters and so I hope that `<>` provides some intuition for that. That is, just as `Option<int>` means "a type that's just like the definition of Option but with int in place of the type parameter", `Foo<'a>` would mean "a type just like Foo but with the lifetime 'a in place of 'self".

regards,
Niko

From alexander.kjeldaas at gmail.com Thu Jan 24 02:10:58 2013
From: alexander.kjeldaas at gmail.com (Alexander Kjeldaas)
Date: Thu, 24 Jan 2013 11:10:58 +0100
Subject: [rust-dev] "intimidation factor" vs target audience
In-Reply-To: <50FECB27.8020700@mozilla.com>
References: <50FECB27.8020700@mozilla.com>
Message-ID:

On Tue, Jan 22, 2013 at 6:23 PM, Graydon Hoare wrote:
> On 22/01/2013 6:55 AM, Dean Thompson wrote:
>
> > I'm looking at some code that Niko Matsakis updated in
> > https://github.com/stevej/rustled/commits/master/red_black_tree.rs
> >
> > pure fn each(&self, f: fn(&(&self/K, &self/V)) -> bool) {
> >     match *self {
> >       Leaf => (),
> >       Tree(_, ref left, ref key, ref maybe_value, ref right) => {
> >         let left: &self/@RBMap<K, V> = left;
> >         let key: &self/K = key;
> >         let maybe_value: &self/Option<V> = maybe_value;
> >         let right: &self/@RBMap<K, V> = right;
> >         left.each(f);
> >         match *maybe_value {
> >           Some(ref value) => {
> >             let value: &self/V = value;
> >             f(&(key, value));
> >           }
> >           None => ()
> >         };
> >         right.each(f);
> >       }
> >     }
> > }
> >
> > I understand this code reasonably well. I greatly value the attention
> > to safety in Rust, and I appreciate the value of pointer lifetimes in
> > maintaining that safety.
> > > > My gut reaction, though, is that this code is almost as intimidating > > as Haskell. Even more worrisome to me, I think most mainstream > > programmers would find the *explanation* of this code intimidating. > > I agree that the cognitive load on this code sample is high. This is the > main risk we took (aside from "potential unsoundness", which I didn't > really think to be a big risk, judging from Niko's comfort with the > semantics) when adopting first class region pointers: that the resulting > types would be too complex to understand, and/or require too much > chatter when writing out in full. > > To my eyes the matter is not yet entirely clear. It's complex but it's > not quite "impossibly complex"; if you made all the '&self/' symbols > into just '&' it would be, I think, not so bad. Compare if you like to > the associated bits of code from libc++ required to implement > roughly-equivalent "iterate through the treemap" sort of functionality: > > And with this clue, I think I can jump into the discussion because I know *nothing* about the semantics you are discussing. This is my wild guess at what is happening in this function, and what is not obvious, line by line: > pure fn each(&self, f: fn(&(&self/K, &self/V)) -> bool) { There is some difference between & and 'ref', but they are both some sort of reference. 'fn' is a callback function that takes a tuple with the key and value. The &self/K syntax is impossible to understand, but given your clue that I could just assume it doesn't exist, I will guess that &self/K means that the lifetime of K is somehow bound to &self, so that the callback function must copy K and V if they should be retained. I am pretty familiar with the concept of alias analysis in a compiler. It is not obvious why there is a '&self/' instead of 'self/' though. > match *self { I'm not sure why we need *self here, but it looks like dereferencing so we're matching on the structure of the tree. 
>       Leaf => (),
>       Tree(_, ref left, ref key, ref maybe_value, ref right) => {

As I said earlier, why this is 'ref' and not '&' is not clear, but it is "obvious" that they are both some kind of reference.

>         let left: &self/@RBMap<K, V> = left;
>         let key: &self/K = key;
>         let maybe_value: &self/Option<V> = maybe_value;
>         let right: &self/@RBMap<K, V> = right;

The above is sort of straightforward, but I am a little disappointed that the &self/ stuff isn't inferred, given that we have f(&(key, value)) below. I mean, if my intuition on &self/ is correct, then in Haskell at least, the '&self/' part of the 'key' and 'value' would be inferred by the compiler because of the restriction in the type of 'fn'.

>         left.each(f);

This tells me that the first parameter to the function is special. I am not sure whether each(left, f) would be valid or not.

>         match *maybe_value {

Here again, it seems like using the '*' is a little unnecessary syntax. When would we want to match on a reference which can only be a reference?

>           Some(ref value) => {
>             let value: &self/V = value;
>             f(&(key, value));

Given the above, I ask myself why this can't be written like this:

    Some(ref &self/V value) => {
        f(&(key, value));
    }

or something like that?

>           }
>           None => ()

I ask myself whether this case is needed. The return value of the 'match *maybe_value' is not used, since this is imperative code. Probably this is here because the compiler enforces full case analysis and the runtime will crash on an unmatched item. Maybe Rust is missing something that doesn't enforce full case analysis, I think:

    match_some *maybe_value { ... }

>         };
>         right.each(f);
>       }
>     }
> }

All in all, my feeling after trying to understand the code is that it is too verbose, and that especially the '&self/' thing should be in some sort of type inference part of the compiler.
Alexander

> _LIBCPP_INLINE_VISIBILITY
> __tree_iterator& operator++() {
>     __ptr_ = static_cast<__node_pointer>(
>         __tree_next(
>             static_cast<__node_base_pointer>(__ptr_)));
>     return *this;
> }
>
> template <class _NodePtr>
> _NodePtr
> __tree_next(_NodePtr __x) _NOEXCEPT
> {
>     if (__x->__right_ != nullptr)
>         return __tree_min(__x->__right_);
>     while (!__tree_is_left_child(__x))
>         __x = __x->__parent_;
>     return __x->__parent_;
> }
>
> template <class _NodePtr>
> inline _LIBCPP_INLINE_VISIBILITY
> bool
> __tree_is_left_child(_NodePtr __x) _NOEXCEPT
> {
>     return __x == __x->__parent_->__left_;
> }
>
> template <class _NodePtr>
> inline _LIBCPP_INLINE_VISIBILITY
> _NodePtr
> __tree_min(_NodePtr __x) _NOEXCEPT
> {
>     while (__x->__left_ != nullptr)
>         __x = __x->__left_;
>     return __x;
> }
>
> And keep in mind that there is no memory-safety in that code: if I
> invalidate a C++ map while iterating, I just get a wild pointer
> dereference and crash. If I rewrote it in terms of shared_ptr<> it'd be
> even chattier.
>
> > Who is our target audience for Rust? Graydon has said it is
> > "frustrated C++ developers", but how sophisticated and how "brave"
> > are we thinking they will be?
>
> The target audience is frustrated C++ developers, same as always. If
> they balk at the syntax for lifetime-bounds on borrowed pointers, then
> yes, we've blown the cognitive budget, and have failed.
>
> It is not clear to me yet that that's true. But it's a risk. One we're
> all aware of and worried about.
>
> > How intimidating do we think Rust is today? Am I just overreacting
> > to unfamiliarity?
>
> I don't know. It's a very hard thing to measure. I know of lots of
> languages that have failed for this reason. It's a major hazard.
>
> > How can we calibrate our "intimidation factor" before language
> > decisions start getting harder to change?
>
> If you search our mailing list, IRC logs or meeting minutes for
> "cognitive budget", "cognitive load" or "cognitive burden" you will see
> we have always been keenly aware of this risk and treat it as a primary
> constraint when doing design work. It's a leading reason why many
> features have been removed, simplified, minimized or excluded from
> consideration.
>
> > Do we want (and is it feasible) to define a simpler subset of the
> > language that beginners are encouraged to stick to and that most
> > libraries don't force clients away from?
>
> Personal opinion: no. That just makes the issue even more confusing. The
> way to approach this is head-on, by looking at the things that cause the
> most confusion and trying to make them cause less.
>
> Thanks for bringing this up. I'm interested to hear others' opinions on
> whether we're past a reasonable limit of comprehensibility. It's a hard
> thing to hear, but better to hear now than later, if true.
>
> -Graydon
>
> _______________________________________________
> Rust-dev mailing list
> Rust-dev at mozilla.org
> https://mail.mozilla.org/listinfo/rust-dev

From deansherthompson at gmail.com Thu Jan 24 04:00:32 2013
From: deansherthompson at gmail.com (Dean Thompson)
Date: Thu, 24 Jan 2013 04:00:32 -0800
Subject: [rust-dev] "intimidation factor" vs target audience
In-Reply-To:
Message-ID:

Alexander,

Thanks, that's a good exercise!

Niko has indicated that the sequences like this:

>           Some(ref value) => {
>             let value: &self/V = value;

are working around a current bug in ref. The let should be unnecessary.

> All in all, my feeling after trying to understand the code is that it is too verbose,
> and that especially the '&self/' thing should be in some sort of type inference part
> of the compiler.

Yes, the Rust team is trying to balance good defaults with allowing the programmer to exercise control when needed.
Here's a good stream-of-consciousness blog post from Niko exploring the options:

http://smallcultfollowing.com/babysteps/blog/2013/01/15/lifetime-notation-redux/

I'd be interested in how you react to the syntax Niko is currently experimenting with:

    &'lt Foo
    Foo<'lt>

In this syntax, a lifetime named "foo" is always written as 'foo.

Dean

From: Alexander Kjeldaas
Date: Thursday, January 24, 2013 2:10 AM
To: Graydon Hoare
Cc:
Subject: Re: [rust-dev] "intimidation factor" vs target audience

On Tue, Jan 22, 2013 at 6:23 PM, Graydon Hoare wrote:
> On 22/01/2013 6:55 AM, Dean Thompson wrote:
>
>> > I'm looking at some code that Niko Matsakis updated in
>> > https://github.com/stevej/rustled/commits/master/red_black_tree.rs
>> >
>> > pure fn each(&self, f: fn(&(&self/K, &self/V)) -> bool) {
>> >     match *self {
>> >       Leaf => (),
>> >       Tree(_, ref left, ref key, ref maybe_value, ref right) => {
>> >         let left: &self/@RBMap<K, V> = left;
>> >         let key: &self/K = key;
>> >         let maybe_value: &self/Option<V> = maybe_value;
>> >         let right: &self/@RBMap<K, V> = right;
>> >         left.each(f);
>> >         match *maybe_value {
>> >           Some(ref value) => {
>> >             let value: &self/V = value;
>> >             f(&(key, value));
>> >           }
>> >           None => ()
>> >         };
>> >         right.each(f);
>> >       }
>> >     }
>> > }
>> >
>> > I understand this code reasonably well. I greatly value the attention
>> > to safety in Rust, and I appreciate the value of pointer lifetimes in
>> > maintaining that safety.
>> >
>> > My gut reaction, though, is that this code is almost as intimidating
>> > as Haskell. Even more worrisome to me, I think most mainstream
>> > programmers would find the *explanation* of this code intimidating.
>
> I agree that the cognitive load on this code sample is high.
This is the > main risk we took (aside from "potential unsoundness", which I didn't > really think to be a big risk, judging from Niko's comfort with the > semantics) when adopting first class region pointers: that the resulting > types would be too complex to understand, and/or require too much > chatter when writing out in full. > > To my eyes the matter is not yet entirely clear. It's complex but it's > not quite "impossibly complex"; if you made all the '&self/' symbols > into just '&' it would be, I think, not so bad. Compare if you like to > the associated bits of code from libc++ required to implement > roughly-equivalent "iterate through the treemap" sort of functionality: > And with this clue, I think I can jump into the discussion because I know *nothing* about the semantics you are discussing. This is my wild guess at what is happening in this function, and what is not obvious, line by line: > pure fn each(&self, f: fn(&(&self/K, &self/V)) -> bool) { There is some difference between & and 'ref', but they are both some sort of reference. 'fn' is a callback function that takes a tuple with the key and value. The &self/K syntax is impossible to understand, but given your clue that I could just assume it doesn't exist, I will guess that &self/K means that the lifetime of K is somehow bound to &self, so that the callback function must copy K and V if they should be retained. I am pretty familiar with the concept of alias analysis in a compiler. It is not obvious why there is a '&self/' instead of 'self/' though. > match *self { I'm not sure why we need *self here, but it looks like dereferencing so we're matching on the structure of the tree. > Leaf => (), > Tree(_, ref left, ref key, ref maybe_value, ref right) => { As I said earlier, why this is 'ref' and not '&' is not clear, but it is "obvious" that they are both some kind of reference. 
> let left: &self/@RBMap<K, V> = left; > let key: &self/K = key; > let maybe_value: &self/Option<V> = maybe_value; > let right: &self/@RBMap<K, V> = right; The above is sort of straightforward, but I am a little disappointed that the &self/ stuff isn't inferred, given that we have f(&(key, value)) below. I mean, if my intuition on &self/ is correct, then in Haskell at least, the '&self/' part of the 'key' and 'value' would be inferred by the compiler because of the restriction in the type of 'fn'. > left.each(f); This tells me that the first parameter to the function is special. I am not sure whether each(left, f) would be valid or not. > match *maybe_value { Here again, it seems like using the '*' is a little unnecessary syntax. When would we want to match on a reference which can only be a reference? > Some(ref value) => { > let value: &self/V = value; > f(&(key, value)); Given the above, I ask myself why this can't be written like this: Some(ref &self/V value) => { f (&(key, value)); } or something like that? > } > None => () I ask myself whether this case is needed. The return value of the 'match *maybe_value' is not used, since this is imperative code. Probably this is here because the compiler enforces full case analysis and the runtime will crash on an unmatched item. Maybe Rust is missing something that doesn't enforce full case analysis, I think: match_some *maybe_value { ... } > }; > right.each(f); > } > } > } All in all, my feeling after trying to understand the code is that it is too verbose, and that especially the '&self/' thing should be in some sort of type inference part of the compiler.
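(A hedged aside for comparison: in the lifetime notation Rust later settled on, every annotation Alexander objects to above is inferred. The sketch below is not from the thread; it uses present-day syntax and assumed, simplified type names to show the same tree walk with no lifetime annotations at all.)

```rust
// Hypothetical present-day sketch of the each() walk above; the type and
// field layout are assumptions, simplified from the quoted RBMap code.
enum RBMap<K, V> {
    Leaf,
    Tree(Box<RBMap<K, V>>, K, Option<V>, Box<RBMap<K, V>>),
}

impl<K, V> RBMap<K, V> {
    // No `&self/K`-style annotations: the borrows of key and value are
    // inferred to live as long as the borrow of self.
    fn each(&self, f: &mut dyn FnMut(&K, &V)) {
        match self {
            RBMap::Leaf => (),
            RBMap::Tree(left, key, maybe_value, right) => {
                left.each(f);
                if let Some(value) = maybe_value {
                    f(key, value);
                }
                right.each(f);
            }
        }
    }
}

fn main() {
    let t = RBMap::Tree(Box::new(RBMap::Leaf), 1, Some("one"), Box::new(RBMap::Leaf));
    let mut seen = Vec::new();
    t.each(&mut |k, v| seen.push((*k, *v)));
    assert_eq!(seen, vec![(1, "one")]);
}
```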
Alexander > > _LIBCPP_INLINE_VISIBILITY > __tree_iterator& operator++() { > __ptr_ = static_cast<__node_pointer>( > __tree_next( > static_cast<__node_base_pointer>(__ptr_))); > return *this; > } > > template <class _NodePtr> > _NodePtr > __tree_next(_NodePtr __x) _NOEXCEPT > { > if (__x->__right_ != nullptr) > return __tree_min(__x->__right_); > while (!__tree_is_left_child(__x)) > __x = __x->__parent_; > return __x->__parent_; > } > > template <class _NodePtr> > inline _LIBCPP_INLINE_VISIBILITY > bool > __tree_is_left_child(_NodePtr __x) _NOEXCEPT > { > return __x == __x->__parent_->__left_; > } > > template <class _NodePtr> > inline _LIBCPP_INLINE_VISIBILITY > _NodePtr > __tree_min(_NodePtr __x) _NOEXCEPT > { > while (__x->__left_ != nullptr) > __x = __x->__left_; > return __x; > } > > And keep in mind that there is no memory-safety in that code: if I > invalidate a C++ map while iterating, I just get a wild pointer > dereference and crash. If I rewrote it in terms of shared_ptr<> it'd be > even chattier. > >> > Who is our target audience for Rust? Graydon has said it is >> > "frustrated C++ developers", but how sophisticated and how "brave" >> > are we thinking they will be? > > The target audience is frustrated C++ developers, same as always. If > they balk at the syntax for lifetime-bounds on borrowed pointers, then > yes, we've blown the cognitive budget, and have failed. > > It is not clear to me yet that that's true. But it's a risk. One we're > all aware of and worried about. > >> > How intimidating do we think Rust is today? Am I just overreacting >> > to unfamiliarity? > > I don't know. It's a very hard thing to measure. I know of lots of > languages that have failed for this reason. It's a major hazard. > >> > How can we calibrate our "intimidation factor" before language >> > decisions start getting harder to change?
> > If you search our mailing list, IRC logs or meeting minutes for > "cognitive budget", "cognitive load" or "cognitive burden" you will see > we have always been keenly aware of this risk and treat it as a primary > constraint when doing design work. It's a leading reason why many > features have been removed, simplified, minimized or excluded from > consideration. > >> > Do we want (and is it feasible) to define a simpler subset of the >> > language that beginners are encouraged to stick to and that most >> > libraries don't force clients away from? > > Personal opinion: no. That just makes the issue even more confusing. The > way to approach this is head-on, by looking at the things that cause the > most confusion and trying to make them cause less. > > Thanks for bringing this up. I'm interested to hear others' opinions on > whether we're past a reasonable limit of comprehensibility. It's a hard > thing to hear, but better to hear now than later, if true. > > -Graydon > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev _______________________________________________ Rust-dev mailing list Rust-dev at mozilla.org https://mail.mozilla.org/listinfo/rust-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From samuel at framond.fr Thu Jan 24 04:37:27 2013 From: samuel at framond.fr (Samuel de Framond) Date: Thu, 24 Jan 2013 20:37:27 +0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: <5100D6A8.3000200@alum.mit.edu> References: <510059B9.8020308@alum.mit.edu> <4BA886B4-4ECB-420D-BF7B-4747243D86FC@vandals.uidaho.edu> <51006B81.5020506@alum.mit.edu> <5100D6A8.3000200@alum.mit.edu> Message-ID: <51012B07.4010005@framond.fr> Just a stupid question, could we read this (assuming this syntax): &'a B<'c, 'd, 'e> as: A borrowed pointer of lifetime 'a to a value of type B with the lifetime parameters 'c, 'd, 'e. 
This must be obvious to many but I admit being quite new to this kind of lifetime parameter concept. Thanks! -- Samuel de Framond P: +86 135 8556 8964 M: samuel at framond.fr -------------- next part -------------- An HTML attachment was scrubbed... URL: From deansherthompson at gmail.com Thu Jan 24 04:51:29 2013 From: deansherthompson at gmail.com (Dean Thompson) Date: Thu, 24 Jan 2013 04:51:29 -0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: <51012B07.4010005@framond.fr> Message-ID: Yes. :-) From: Samuel de Framond Date: Thursday, January 24, 2013 4:37 AM To: Subject: Re: [rust-dev] "intimidation factor" vs target audience Just a stupid question, could we read this (assuming this syntax): > &'a B<'c, 'd, 'e> > as: > A borrowed pointer of lifetime 'a to a value of type B with the lifetime > parameters 'c, 'd, 'e. > This must be obvious to many but I admit being quite new to this kind of lifetime parameter concept. Thanks! -- Samuel de Framond P: +86 135 8556 8964 M: samuel at framond.fr _______________________________________________ Rust-dev mailing list Rust-dev at mozilla.org https://mail.mozilla.org/listinfo/rust-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaozm55 at gmail.com Thu Jan 24 09:23:41 2013 From: gaozm55 at gmail.com (James Gao) Date: Fri, 25 Jan 2013 01:23:41 +0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: <51012B07.4010005@framond.fr> References: <510059B9.8020308@alum.mit.edu> <4BA886B4-4ECB-420D-BF7B-4747243D86FC@vandals.uidaho.edu> <51006B81.5020506@alum.mit.edu> <5100D6A8.3000200@alum.mit.edu> <51012B07.4010005@framond.fr> Message-ID: I think we should try to clear some definitions about lifetime. IMO, a lifetime is a compile-time constant bound to some run-time variable. 
While constructing one variable and its lifetime, we can deduce a lifetime from variables within the same stack frame, or do some intersection operations over lifetimes of other variables. Based on the new popular syntax, we have some shorthand: - 'x means lifetime of x - 'x.fieldA.subFieldB means lifetime of x.fieldA.subfieldB - 'x ^ 'y means intersection of lifetimes binding x and y - ['x.*] means lifetime vector of x's deep fields, this sigil is only for easy description. For the snippet "x: &'a B = y", we mean a borrowed box x ref to a resource y of type B, with 'x == 'y ^ {all lifetimes tagged with 'a} and ['x.*] == ['c, 'd, 'e]. IMO, the token 'a here may introduce some confusion: it allocates a new lifetime constant that equals some others' intersection, instead of simply fetching an existing one from a variable. So how about trying to avoid "named lifetime", and explicitly defining the lifetime with intersect operations, for example: - x: &T ^ 'y ^ 'z = t, or - x: &T ~ 'y ^ 'z = t, or - x: & , 'y ^ 'z> = t means a borrowed box x ref to a resource t of type T, with 'x == 't ^ 'y ^ 'z. Then we can declare a function like this (the part assigning ['returnVal.*] is somewhat ugly): - fn foo(x: &X, y: &Y) -> T { ...
} If T<> has many fields that need to bind lifetimes on the return type, we can give a default one when defining T<>: struct T { fieldA: &str, fieldB: &str ~ 'fieldA, // default activity: 'fieldB = 'fieldA, else use the one in <{fieldB: ...}> fieldC: [&str * 3] ~ 'fieldA, // bind 'fieldC and we have 'fieldC[0] == 'fieldC[1] == 'fieldC[2] fieldD: [&U * 3] ~ 'fieldA // we have ['fieldC[0].*] == ['fieldC[1].*] == ['fieldC[2].*] } then define fn foo and fn each as: - fn foo(x: &T) -> T { - pure fn each(&self, f: fn(&(&K ~ 'self, &V ~ 'self)) -> bool) { Here is the StringReader example from Niko's blog without named lifetimes: struct StringReader { value: &str, count: uint } impl StringReader { fn new(value: &str) -> StringReader <{value: 'value}> { StringReader { value: value, count: 0 } } } fn remaining(s: &StringReader) -> uint { return s.value.len() - s.count; } fn value(s: &StringReader) -> &str ~ 's.value { return s.value; } -- James Gao On Thu, Jan 24, 2013 at 8:37 PM, Samuel de Framond wrote: > Just a stupid question, could we read this (assuming this syntax): > > &'a B<'c, 'd, 'e> > > as: > > A borrowed pointer of lifetime 'a to a value of type B with the lifetime > parameters 'c, 'd, 'e. > > This must be obvious to many but I admit being quite new to this kind of > lifetime parameter concept. > > Thanks! > > -- > Samuel de Framond > P: +86 135 8556 8964 > M: samuel at framond.fr > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From niko at alum.mit.edu Thu Jan 24 09:43:29 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Thu, 24 Jan 2013 09:43:29 -0800 Subject: Re: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: <510059B9.8020308@alum.mit.edu> <4BA886B4-4ECB-420D-BF7B-4747243D86FC@vandals.uidaho.edu> <51006B81.5020506@alum.mit.edu> <5100D6A8.3000200@alum.mit.edu> <51012B07.4010005@framond.fr> Message-ID: <510172C1.5000009@alum.mit.edu> James Gao wrote: > IMO, a lifetime is a compile-time constant bound to some run-time > variable. This is not correct---or, at least, this is not what the Rust compiler implements. The names of lifetimes and the names of variables are not related. > Based on the new popular syntax, we have some shorthand: > > - 'x means lifetime of x > - 'x.fieldA.subFieldB means lifetime of x.fieldA.subfieldB > - 'x ^ 'y means intersection of lifetimes binding x and y > - ['x.*] means lifetime vector of x's deep fields, this sigil is > only for easy description. > Likewise, this is incorrect, as is what follows. Lifetime names are not expressions, though that would be a reasonable way to construct a system. We often use the same names for lifetimes and variables, but that is to suggest a relationship; it does not create one. For example, the following definitions of `foo()` are all valid and equivalent (using the proposed syntax): struct Foo { f: int } fn foo(v: &'a Foo) -> &'a int { &v.f } fn foo(v: &'b Foo) -> &'b int { &v.f } fn foo(v: &'v Foo) -> &'v int { &v.f } Niko -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.ronnquist at gmail.com Thu Jan 24 10:51:36 2013 From: peter.ronnquist at gmail.com (Peter Ronnquist) Date: Thu, 24 Jan 2013 19:51:36 +0100 Subject: [rust-dev] Stuck with pointer/references to vector elements Message-ID: Hi, I am trying to create a vector of pointers into another vector without much luck.
I would be happy if someone could tell me if this a good way to use rust or if there is a better way to achieve a similar data structure or if I am using the wrong language :-) I have a vector of physical objects: obj_vec = [ mut @obj1, @obj2, @obj3, ..., @objn ] I will iterate through those objects and see which ones are close to each other and I want to store a pointer/reference to those objects in a separate vector: obj_close_vec = [ &obj2, &obj3 ] The objects in obj_close_vec will be investigated further and might be updated. I want to avoid making copies of the objects. It would be nice to be able to allocate the obj_vec once (preferably on the stack) and then manipulate the objects through pointers/references. Stack: obj_vec obj_close_vec [ [ obj1 |------ &obj2 | obj2 <---------| |-- &obj3 | obj3 <--------------| ] ... objn ] A small test program: fn main() { let x = @ mut 10; let a = x; *x = 3; // Gives a=3, x=3 as I want. io::println(fmt!("a: %d", *a) ); io::println(fmt!("x: %d", *x) ); // Same thing with vector elements let mut v = @ mut [mut @1, @2]; let e = v[0]; // Does not work: // error: assigning to dereference of immutable @ pointer // *(v[0]) = 0; io::println(fmt!("e: %d", *e) ); io::println(fmt!("x: %d", *v[0]) ); } I have struggled with this (different variations of @ ~ vectors and dvec) for far too long and would appreciate any comments. Regards Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From pwalton at mozilla.com Thu Jan 24 11:48:42 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Thu, 24 Jan 2013 11:48:42 -0800 Subject: Re: [rust-dev] Stuck with pointer/references to vector elements In-Reply-To: References: Message-ID: <5101901A.80103@mozilla.com> On 1/24/13 10:51 AM, Peter Ronnquist wrote: > > Hi, > > I am trying to create a vector of pointers into another vector without > much luck.
I would be happy if someone could tell me if this a good way > to use rust or if there is a better way to achieve a similar data > structure or if I am using the wrong language :-) > > I have a vector of physical objects: > > > obj_vec = [ mut @obj1, @obj2, @obj3, ..., @objn ] > > I will iterate through those objects and see which ones are close to > each other and I want to store a pointer/reference to those objects in a > separate vector: > > > obj_close_vec = [ &obj2, &obj3 ] > > The objects in obj_close_vec will be investigated further and might be > updated. Does something like this work? let mut obj_vec = [ @mut obj1, @mut obj2, ... @mut objn ]; let mut obj_close_vec: ~[@mut ObjType] = ~[]; for obj_vec.each |&obj| { if obj.is_interesting() { obj_close_vec.push(obj); } } for obj_close_vec.each |&obj| { if i_want_to_mutate(obj) { *obj = munge(obj); } } Patrick From peter.ronnquist at gmail.com Thu Jan 24 14:02:13 2013 From: peter.ronnquist at gmail.com (Peter Ronnquist) Date: Thu, 24 Jan 2013 23:02:13 +0100 Subject: Re: [rust-dev] Stuck with pointer/references to vector elements Message-ID: On 1/24/13 10:51 AM, Peter Ronnquist wrote: > > Hi, > > I am trying to create a vector of pointers into another vector without > much luck. I would be happy if someone could tell me if this a good way > to use rust or if there is a better way to achieve a similar data > structure or if I am using the wrong language :-) > > I have a vector of physical objects: > > > obj_vec = [ mut @obj1, @obj2, @obj3, ..., @objn ] > > I will iterate through those objects and see which ones are close to > each other and I want to store a pointer/reference to those objects in a > separate vector: > > > obj_close_vec = [ &obj2, &obj3 ] > > The objects in obj_close_vec will be investigated further and might be > updated. Does something like this work? let mut obj_vec = [ @mut obj1, @mut obj2, ...
@mut objn ]; let mut obj_close_vec: ~[@mut ObjType] = ~[]; for obj_vec.each |&obj| { if obj.is_interesting() { obj_close_vec.push(obj); } } for obj_close_vec.each |&obj| { if i_want_to_mutate(obj) { *obj = munge(obj); } } Patrick ---------------------------------- I tried this: struct Vec2 { x: float, y: float } fn main () { let mut obj_vec = [@mut Vec2 {x: 0.3, y: 0.4}, @mut Vec2 {x: 0.1, y: 0.2} ]; let mut obj_close_vec: ~[@mut Vec2] = ~[]; for obj_vec.each |&obj| { obj_close_vec.push(obj); } } and got: rustc test_vec.rs test_vec.rs:14:24: 14:27 error: obsolete syntax: by-mutable-reference mode test_vec.rs:14 for obj_vec.each |&obj| { ^~~ note: Declare an argument of type &mut T instead error: aborting due to previous error hmm, don't know what that means but I will try to get past that error. Thanks for your quick reply! -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.ronnquist at gmail.com Thu Jan 24 14:21:06 2013 From: peter.ronnquist at gmail.com (Peter Ronnquist) Date: Thu, 24 Jan 2013 23:21:06 +0100 Subject: [rust-dev] Stuck with pointer/references to vector elements In-Reply-To: References: Message-ID: On Thu, Jan 24, 2013 at 11:02 PM, Peter Ronnquist wrote: > > On 1/24/13 10:51 AM, Peter Ronnquist wrote: > > > > Hi, > > > > I am trying to create a vector of pointers into another vector without > > much luck. I would be happy if someone could tell me if this a good way > > to use rust or if there is a better way to achieve a similar data > > structure or if I am using the wrong language :-) > > > > I have a vector of physical objects: > > > > > > obj_vec = [ mut @obj1, @obj2, @obj3, ..., @objn ] > > > > I will iterate through those objects and see which ones are close to > > each other and I want to store a pointer/reference to those objects in a > > separate vector: > > > > > > obj_close_vec = [ &obj2, &obj3 ] > > > > The objects in obj_close_vec will be investigated further and might be > > updated. 
> > Does something like this work? > > let mut obj_vec = [ @mut obj1, @mut obj2, ... @mut objn ]; > let mut obj_close_vec: ~[@mut ObjType] = ~[]; > for obj_vec.each |&obj| { > if obj.is_interesting() { > obj_close_vec.push(obj); > } > } > for obj_close_vec.each |&obj| { > if i_want_to_mutate(obj) { > *obj = munge(obj); > } > } > > Patrick > > ---------------------------------- > > I tried this: > > > struct Vec2 { > x: float, > y: float > } > > fn main () { > > let mut obj_vec = [@mut Vec2 {x: 0.3, y: 0.4}, @mut Vec2 {x: 0.1, y: 0.2} ]; > > > let mut obj_close_vec: ~[@mut Vec2] = ~[]; > > for obj_vec.each |&obj| { > obj_close_vec.push(obj); > } > > } > > and got: > > > rustc test_vec.rs > > > test_vec.rs:14:24: 14:27 error: obsolete syntax: by-mutable-reference mode test_vec.rs:14 for obj_vec.each |&obj| { ^~~ > > note: Declare an argument of type &mut T instead > > error: aborting due to previous error > > > hmm, don't know what that means but I will try to get past that error. > Thanks for your quick reply! ------------------------------------------- My problem was that I used rust v0.4 on this computer; after updating to v0.5 it compiles! Thanks a lot. From mneumann at ntecs.de Thu Jan 24 15:55:41 2013 From: mneumann at ntecs.de (Michael Neumann) Date: Fri, 25 Jan 2013 00:55:41 +0100 Subject: [rust-dev] Misc questions Message-ID: <5101C9FD.7030204@ntecs.de> Hi, Again a couple of random questions... * Would it be possible to optimize this kind of enum (two cases, where one case contains a borrowed pointer) into a simple pointer, where None would be represented as the null pointer? enum Option<A> { None, Some(~A) } As the pointer to A can never be null it should be possible. This probably wouldn't affect performance much, but when storing it into an Array that would save a lot of space (basically cut space usage in half). * match() statements. I think the order in which the matches are performed is important.
But when I have a very simple statement like this: match io.read_char() as u8 { 0x0c => ..., 0x0d => ..., 0x0f .. 0x1a => ... } will the compiler construct an efficient goto jump table or will it construct sequential if statements instead? My question is if it makes sense to reorder more frequent cases to the top or not. Also I wonder why I get a "non-exhaustive patterns" error message for this one: match c as u8 { 0 .. 255 => 1 } * Using str::as_bytes() I cannot get str::as_bytes working. The example in the documentation is not working for several reasons (wrong syntax...) I tried this: fn write(buf: ~[u8]) { io::println(fmt!("%?", buf)); } fn main() { let mystr = ~"Hello World"; do str::as_bytes(&mystr) |bytes| { write(*bytes); } } But get the compiler error: t.rs:8:10: 8:16 error: moving out of dereference of immutable & pointer t.rs:8 write(*bytes); ^~~~~~ * A ~str is internally represented by an ~[u8] vector. It is basically a heap-allocated struct rust_vec { size_t fill; size_t alloc; uint8_t data[]; } When I correctly read the code the string is allocated in-place, i.e. for a string of size 5, you will allocate sizeof(struct rust_vec) + 5 + 1 bytes. So the data pointer points past the data. As a reallocation can change the pointer to the rust_vec, I now understand why I sometimes have to pass a &mut ~[T] into a function. In case the reallocation returns a new pointer, the passed pointer has to be updated. I guess slices use the same rust_vec struct, but allocated on the stack, where data points to another ~vector's data. What I don't understand is the following comment for struct rust_vec: size_t fill; // in bytes; if zero, heapified Please correct me if I am wrong. * There is a severe performance bug in TcpSocketBuf.read(). I am trying to fix it right now myself, once I am done, I will do another post. This explains why I get very bad network I/O performance. Basically the function copies the internal buffer over and over again, once for each call.
This is especially bad when using read_line(), as it calls read() for every byte. * What exactly are the semantics of "as"? Is it like a C cast? Imagine if I have let b: u8 = 255; let s: i8 = b as i8; This gives -1 for s. But when I do "b as i32", it gives 255. If I want to keep the sign I have to do "(b as i8) as i32". * I don't like the way libuv is currently integrated into the system. It works, but performance is quite low and IMHO the blocking interface is not very usable. For example I want to write a process that accepts messages from other processes, and then writes something to the socket or reads from the socket. This will currently not work, as reading from the socket will block the process, and then no more requests can be sent to the process. So instead of using the read() / write() API of an io::Reader, I'd prefer to expose the read/write events of libuv via messages (this is already done between the iotask and the read()/write() methods, but it is not accessible to the "end-user"). So instead of: io.read(...) one would simply write: readport.recv() The same for writes. EOF results in closing the readport. The question is how these messages should look like to be usable for the programmer (how to handle errors?). What do you think? Actually there would be connecting ports, which receive events whenever a new connection is established. A successfully established connection would then be represented by a readport and writechannel. * I'd like to know more how the task scheduler and the pipes work together. Is there any info available somewhere? Also, if I would create a native pthread in C, could I simply call an external rust function?
Best, Michael From pwalton at mozilla.com Thu Jan 24 16:37:58 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Thu, 24 Jan 2013 16:37:58 -0800 Subject: Re: [rust-dev] Misc questions In-Reply-To: <5101C9FD.7030204@ntecs.de> References: <5101C9FD.7030204@ntecs.de> Message-ID: <5101D3E6.7070902@mozilla.com> On 1/24/13 3:55 PM, Michael Neumann wrote: > Hi, > > Again a couple of random questions... > > * Would it be possible to optimize this kind of enum (two cases, where > one case contains a borrowed pointer) > into a simple pointer, where None would be represented as the null pointer? > > enum Option<A> { > None, > Some(~A) > } > > As the pointer to A can never be null it should be possible. This > probably wouldn't affect performance much, > but when storing it into an Array that would save a lot of space > (basically cut space usage in half). Yes. This has been on the agenda for years. The reason why we don't make any guarantees as to the memory layout for enums is precisely so that we can implement optimizations like this. > * match() statements. I think the order in which the matches are > performed is important. But when I have > a very simple statement like this: > > match io.read_char() as u8 { > 0x0c => ..., > 0x0d => ..., > 0x0f .. 0x1a => > ... > } > > will the compiler construct an efficient goto jump table or will it > construct sequential if statements instead? > My question is if it makes sense to reorder more frequent cases to the > top or not. LLVM will construct a jump table. I've verified this in my NES emulator. > > Also I wonder why I get a "non-exhaustive patterns" error message for > this one: > > match c as u8 { > 0 .. 255 => 1 > } The exhaustiveness checker currently doesn't know about integer ranges. This is probably a bug. > * Using str::as_bytes() > > I cannot get str::as_bytes working. The example in the documentation is > not working for several reasons (wrong syntax...)
> > I tried this: > > fn write(buf: ~[u8]) { > io::println(fmt!("%?", buf)); > } > > fn main() { > let mystr = ~"Hello World"; > do str::as_bytes(&mystr) |bytes| { > write(*bytes); > } > } > > But get the compiler error: > > t.rs:8:10: 8:16 error: moving out of dereference of immutable & pointer > t.rs:8 write(*bytes); > ^~~~~~ I think you want `&[u8]`. > > * A ~str is internally represented by an ~[u8] vector. > It is basically a heap-allocated > > struct rust_vec { > size_t fill; > size_t alloc; > uint8_t data[]; > } > > When I correctly read the code the string is allocated in-place, i.e. > for a string of size 5, > you will allocate sizeof(struct rust_vec) + 5 + 1 bytes. So the data > pointer points past the > data. As a reallocation can change the pointer to the rust_vec, I now > understand why > I sometimes have to pass a &mut ~[T] into a function. In case the > reallocation returns a new > pointer, the passed pointer has to be updated. > > I guess slices use the same rust_vec struct, but allocated on the stack, > where data points to > another ~vector's data. > > What I don't understand is the following comment for struct rust_vec: > > size_t fill; // in bytes; if zero, heapified This comment is incorrect. It refers to the representation in Rust as of June 2012, instead of today. > Please correct me if I am wrong. > > * There is a severe performance bug in TcpSocketBuf.read(). I am trying > to fix it right now myself, > once I am done, I will do another post. This explains why I get very bad > network I/O performance. > Basically the function copies the internal buffer over and over again, > once for each call. > This is especially bad when using read_line(), as it calls read() for > every byte. Ah, that's indeed very bad. Patches welcome :) > * What exactly are the semantics of "as"? Is it like a C cast? > > Imagine if I have > > let b: u8 = 255; > let s: i8 = b as i8; > > This gives -1 for s. But when I do "b as i32", it gives 255.
> If I want to keep the sign I have to do "(b as i8) as i32". It's supposed to be like a C cast. This seems like a bug to me. > * I don't like the way libuv is currently integrated into the system. It > works, but performance is > quite low and IMHO the blocking interface is not very usable. For > example I want to write a process > that accepts messages from other processes, and then writes something to > the socket or reads from > the socket. This will currently not work, as reading from the socket > will block the process, and > then no more requests can be sent to the process. > So instead of using the read() / write() API of an io::Reader, I'd > prefer to expose the read/write > events of libuv via messages (this is already done between the iotask > and the read()/write() methods, > but it is not accessible to the "end-user"). > > So instead of: > > io.read(...) > > one would simply write: > > readport.recv() > > The same for writes. EOF results in closing the readport. The question > is how these messages > should look like to be usable for the programmer (how to handle errors?). > > What do you think? > > Actually there would be connecting ports, which receive events whenever > a new connection is established. > A successfully established connection would then be represented by a > readport and writechannel. brson is working on a rewrite of the scheduler. This new scheduler should run directly on the libuv event loop. This should have much higher performance. > > * I'd like to know more how the task scheduler and the pipes work > together. Is there any info available somewhere? I think brson knows best. > Also, if I would create a native pthread in C, could I simply call an > external rust function? You need a Rust task stored in the TLS. You'd have to set that up yourself somehow. 
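(Aside: Michael's observations about `as`, quoted above, can be checked directly. A minimal sketch in present-day Rust syntax, not 0.5, showing that a cast to a narrower type reinterprets the low bits, while widening from a signed type sign-extends:)

```rust
fn main() {
    let b: u8 = 255;
    // Narrowing u8 -> i8 keeps the same bits, reinterpreted as signed.
    assert_eq!(b as i8, -1);
    // Widening u8 -> i32 zero-extends, so the value is preserved.
    assert_eq!(b as i32, 255);
    // Widening from i8 sign-extends, matching "(b as i8) as i32".
    assert_eq!((b as i8) as i32, -1);
    println!("ok");
}
```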
Patrick From banderson at mozilla.com Thu Jan 24 19:01:06 2013 From: banderson at mozilla.com (Brian Anderson) Date: Thu, 24 Jan 2013 19:01:06 -0800 Subject: Re: [rust-dev] Misc questions In-Reply-To: <5101D3E6.7070902@mozilla.com> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> Message-ID: <5101F572.3090202@mozilla.com> On 01/24/2013 04:37 PM, Patrick Walton wrote: > On 1/24/13 3:55 PM, Michael Neumann wrote: > >> * I don't like the way libuv is currently integrated into the system. It I sympathize. >> works, but performance is >> quite low and IMHO the blocking interface is not very usable. For >> example I want to write a process >> that accepts messages from other processes, and then writes something to >> the socket or reads from >> the socket. This will currently not work, as reading from the socket >> will block the process, and >> then no more requests can be sent to the process. >> So instead of using the read() / write() API of an io::Reader, I'd >> prefer to expose the read/write >> events of libuv via messages (this is already done between the iotask >> and the read()/write() methods, >> but it is not accessible to the "end-user"). >> >> So instead of: >> >> io.read(...) >> >> one would simply write: >> >> readport.recv() Both of these are blocking. The significant advantage of using a port here though is that core::pipes has several ways to receive on multiple ports at once, so you could wait for both the read event and some other signal instead of being stuck until the read either succeeds or times out. There is a lot of overlap functionally between Port/Chan and Reader/Writer (std::flatpipes even implements the GenericPort and GenericChan traits on top of Reader and Writer). They both have their strengths though. Streaming data over channels is just going to leave you with a bunch of byte buffers, whereas Readers give you lots of control over how to interpret those bytes.
I could see the api here being channel based, letting the user opt into Reader and Writer implementations for Port<~[u8]> etc. as needed. >> >> The same for writes. EOF results in closing the readport. The question >> is how these messages >> should look like to be usable for the programmer (how to handle >> errors?). >> >> What do you think? >> >> Actually there would be connecting ports, which receive events whenever >> a new connection is established. >> A successfully established connection would then be represented by a >> readport and writechannel. > > brson is working on a rewrite of the scheduler. This new scheduler > should run directly on the libuv event loop. This should have much > higher performance. The current implementation will go away completely. It was useful as a prototype but it has problems. The new intent is to design the Rust scheduler so that it can be driven by an arbitrary event loop, then use the uv event loop for that purpose by default (this could also be useful for e.g. integrating with the Win32 event loop, etc.). The main advantage of integrating uv into the scheduler over the current design is that there will be much less synchronization and context switching to dispatch events. This will unfortunately necessitate a complex integration of the scheduler and the I/O system and I don't know how that is going to work yet. > >> >> * I'd like to know more how the task scheduler and the pipes work >> together. Is there any info available somewhere? > > I think brson knows best. There's no info besides the source code. The relevant files are `src/libcore/pipes.rs` and `src/rt/rust_task.cpp`. Pipes uses three foreign calls to indirectly control the scheduler: `task_wait_event`, `task_signal_event`, `task_clear_event_reject`, the details of which I don't know off hand but which look fairly straightforward. 
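(Aside: the channel-based reading interface discussed above can be illustrated with present-day std::sync::mpsc channels — an assumed stand-in, not the 0.5 pipes API. A producer thread plays the role of the I/O task, byte buffers arrive as messages, and dropping the sending half signals EOF, much like "EOF results in closing the readport":)

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // The sending half plays the role of the I/O task.
    let (tx, rx) = mpsc::channel::<Vec<u8>>();
    thread::spawn(move || {
        tx.send(b"hello ".to_vec()).unwrap();
        tx.send(b"world".to_vec()).unwrap();
        // tx is dropped here: the "readport" sees the stream end.
    });

    let mut bytes = Vec::new();
    for buf in rx {
        // Iteration ends when the channel is closed (our EOF signal).
        bytes.extend_from_slice(&buf);
    }
    assert_eq!(bytes, b"hello world".to_vec());
    println!("ok");
}
```

Errors could travel the same way by making the message type a Result instead of a bare byte buffer.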
The implementation is made more complex by the continued existence of `core::oldcomm`, which uses a slightly different method of signalling events and relies on much more foreign code. > >> Also, if I would create a native pthread in C, could I simply call an >> external rust function? > Not yet. Today all Rust code depends on the Rust runtime (I've been saying that code must be run 'in task context'). Creating a pthread puts you outside of a task context. Being in task context essentially means that various runtime calls are able to locate a pointer to `rust_task` in TLS, and of course that pointer must be set up and managed correctly. It's not something you can do manually. We are slowly working on 'freestanding Rust', which will let you run Rust code without the runtime. This is particularly needed at the moment to port the Rust runtime to Rust. The first steps are in this pull request: https://github.com/mozilla/rust/pull/4619 After that pull request you can make foreign calls and use the exchange heap outside of task context, but if you do call a function that requires the runtime the process will abort. From mneumann at ntecs.de Fri Jan 25 09:33:32 2013 From: mneumann at ntecs.de (Michael Neumann) Date: Fri, 25 Jan 2013 18:33:32 +0100 Subject: [rust-dev] Problem with conflicting implementations for traits Message-ID: <5102C1EC.80507@ntecs.de> Hi, I am getting the following error: msgpack.rs:545:0: 555:1 error: conflicting implementations for a trait msgpack.rs:545 pub impl, msgpack.rs:547 V: serialize::Decodable> ~[(K,V)]: serialize::Decodable { msgpack.rs:548 static fn decode(&self, d: &D) -> ~[(K,V)] { msgpack.rs:549 do d.read_map |len| { msgpack.rs:550 do vec::from_fn(len) |i| { ... 
msgpack.rs:539:0: 543:1 note: note conflicting implementation here msgpack.rs:539 pub impl T: serialize::Decodable { msgpack.rs:540 static fn decode(&self, d: &D) -> T { msgpack.rs:541 serialize::Decodable::decode(d as &serialize::Decoder) msgpack.rs:542 } msgpack.rs:543 } It's obvious that the two trait implementations conflict, as the one (for T) is more general than the other (for ~[(K,V)]). Is there anything I can do to fix it? I have found this discussion [1] but I see no solution to the problem. For msgpack, I want to support "maps". They are specially encoded, so I need a special Decoder (serialize::Decoder does not support maps in any way). Above I tried to extend the serialize::Decoder trait for read_map() and read_map_elt(), leading to DecoderWithMap: pub trait DecoderWithMap : serialize::Decoder { fn read_map(&self, f: fn(uint) -> T) -> T; fn read_map_elt(&self, _idx: uint, f: fn() -> T) -> T; } Then I tried to implement Decodable for ~[(K,V)] (which I want to use as a Rust representation of a map; here I'd probably run into problems again as serializer defines a generic implementation for ~[] and for tuples... this at least I could solve by using a different type). Now I would need to reimplement Decodable for any type I use, so I tried to use the second generic trait implementation. But this is where it failed with a conflict. Would it be possible to "override" a standard (more generic) trait implementation, either implicitly (like C++ is doing) or explicitly?
Best, Michael [1]: https://github.com/mozilla/rust/issues/3429 From mneumann at ntecs.de Sat Jan 26 01:50:00 2013 From: mneumann at ntecs.de (Michael Neumann) Date: Sat, 26 Jan 2013 10:50:00 +0100 Subject: [rust-dev] Misc questions In-Reply-To: <5101D3E6.7070902@mozilla.com> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> Message-ID: <5103A6C8.7070407@ntecs.de> Am 25.01.2013 01:37, schrieb Patrick Walton: > On 1/24/13 3:55 PM, Michael Neumann wrote: >> Hi, >> >> Again a couple of random question... >> >> * Would it be possible to optimize this kind of enum (two cases, where >> one case contains a borrowed pointer) >> into a simple pointer, where None would be represented as the null >> pointer? >> >> enum Option { >> None, >> Some(~A) >> } >> >> As the pointer to A can never be null it should be possible. This >> probably wouldn't affect performance much, >> but when storing it into an Array that would save a lot of space >> (basically cut down space usage half). > > Yes. This has been on the agenda for years. The reason why we don't > make any guarantees as to the memory layout for enums is precisely so > that we can implement optimizations like this. Great to know that this will eventually be implemented. >> * match() statements. I think the order in which the matches are >> performed are important. But when I have >> a very simple statement like this: >> >> match io.read_char() as u8 { >> 0x0c => ..., >> 0x0d => ..., >> 0x0f .. 0x1a => >> ... >> } >> >> will the compiler construct an efficient goto jump table or will it >> construct sequential if statements instead? >> ? My question is if it makes sense to reorder more frequent cases to the >> top or not. > > LLVM will construct a jump table. I've verified this in my NES emulator. Great. Hopefully it merges common statements as well. That is, for 0x00 .. 0x0f => io::println("...") it doesn't generate 15 separate io::println instructions.
I assume it generates a table which contains IP offsets. >> >> Also I wonder why I get a "non-exhaustive patterns" error message for >> this one: >> >> match c as u8 { >> 0 .. 255 => 1 >> } > > The exhaustiveness checker currently doesn't know about integer > ranges. This is probably a bug. It's unintuitive, so I think it's a bug :) >> * Using str::as_bytes() >> >> I cannot get str::as_bytes working. The example in the documentation is >> not working for several reasons (wrong syntax...) >> >> I tried this: >> >> fn write(buf: ~[u8]) { >> io::println(fmt!("%?", buf)); >> } >> >> fn main() { >> let mystr = ~"Hello World"; >> do str::as_bytes(&mystr) |bytes| { >> write(*bytes); >> } >> } >> >> But get the compiler error: >> >> t.rs:8:10: 8:16 error: moving out of dereference of immutable & pointer >> t.rs:8 write(*bytes); >> ^~~~~~ > > I think you want `&[u8]`. Of course! I feel a little stupid now *g*. >> * What exactly is the semantic of "as"? Is it like a C-cast? >> >> Imagine if I have >> >> let b: u8 = 255; >> let s: i8 = b as i8; >> >> This gives -1 for s. But when I do "b as i32", it gives 255. >> If I want to keep the sign I have to do "(b as i8) as i32". > > It's supposed to be like a C cast. This seems like a bug to me. Hm, for me it makes sense somehow. I think that every cast from unsigned to signed (or vice versa) will first extend the size to the requested size so that: u8 as i32 is equivalent to (u8 as u32) as i32 and this is clearly different from (u8 as i8) as i32, because i8 as i32 will sign-extend, i.e. keep the sign bit.
Thanks for your answers, Michael From mneumann at ntecs.de Sat Jan 26 03:28:19 2013 From: mneumann at ntecs.de (Michael Neumann) Date: Sat, 26 Jan 2013 12:28:19 +0100 Subject: [rust-dev] Misc questions In-Reply-To: <5101F572.3090202@mozilla.com> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> <5101F572.3090202@mozilla.com> Message-ID: <5103BDD3.9000401@ntecs.de> Am 25.01.2013 04:01, schrieb Brian Anderson: > On 01/24/2013 04:37 PM, Patrick Walton wrote: >> On 1/24/13 3:55 PM, Michael Neumann wrote: >> >>> * I don't like the way libuv is currently integrated into the >>> system. It > > I sympathize. :) > >>> works, but performance is >>> quite low and IMHO the blocking interface is not very usable. For >>> example I want to write a process >>> that accepts messages from other processes, and then writes >>> something to >>> the socket or reads from >>> the socket. This will currently not work, as reading from the socket >>> will block the process, and >>> then no more requests can be sent to the process. >>> So instead of using the read() / write() API of an io::Reader, I'd >>> prefer to expose the read/write >>> events of libuv via messages (this is already done between the iotask >>> and the read()/write() methods, >>> but it is not accessible to the "end-user"). >>> >>> So instead of: >>> >>> io.read(...) >>> >>> one would simply write: >>> >>> readport.recv() > > Both of these are blocking. The significant advantage of using a port > here though is that core::pipes has several ways to receive on > multiple ports at once, so you could wait for both the read event and > some other signal instead of being stuck until the read either > succeeds or times out. Exactly! > There is a lot of overlap functionally between Port/Chan and > Reader/Writer (std::flatpipes event implements the GenericPort and > GenericChan traits on top of Reader and Writer). They both have their > strengths though. 
Streaming data over channels is just going to leave > you with a bunch of byte buffers, whereas Readers give you lots of > control over how to interpret those bytes. I could see the api here > being channel based, letting the user opt into Reader and Writer > implementations for Port<~[u8]> etc. as needed. > >>> >>> The same for writes. EOF results in closing the readport. The question >>> is how these messages >>> should look like to be usable for the programmer (how to handle >>> errors?). >>> >>> What do you think? >>> >>> Actually there would be connecting ports, which receive events whenever >>> a new connection is established. >>> A successfully established connection would then be represented by a >>> readport and writechannel. >> >> brson is working on a rewrite of the scheduler. This new scheduler >> should run directly on the libuv event loop. This should have much >> higher performance. > > The current implementation will go away completely. It was useful as a > prototype but it has problems. The new intent is to design the Rust > scheduler so that it can be driven by an arbitrary event loop, then > use the uv event loop for that purpose by default (this could also be > useful for e.g. integrating with the Win32 event loop, etc.). The main > advantage of integrating uv into the scheduler over the current design > is that there will be much less synchronization and context switching > to dispatch events. This will unfortunately necessitate a complex > integration of the scheduler and the I/O system and I don't know how > that is going to work yet. Do you mean that the libuv callbacks -> messages conversion will then go away? Currently every callback is sent from the iotask to the "interested" task via a message. Will the new design be different in this regard? I'd like to hear more details about what you are trying to accomplish. Is there already some code?
I think another thing that currently makes I/O slow is that for every read() call, first a message is sent to the iotask to let libuv know that we want to start reading from the socket (uv_read_start()). Then we actually request the read (another message) and finally we stop reading again (uv_read_stop()). So in total, every read will involve 3 complete message cycles between iotask and the requesting task. I hope this will be reduced to just one (but this is probably just a library issue and not related to the scheduler...). Another question: When a task sends a message to another task, and this task is waiting exactly for this event, will it directly switch to that task, or will it buffer the message? Sometimes this could be quite handy and efficient. I remember this was done in the L4 microkernel (www.l4ka.org), which only allowed synchronous IPC. It could make sense to provide a send_and_receive directive, which sends to the channel and lets the scheduler know that it is now waiting for a message to receive from another port. So send_and_receive could directly switch to the other task, and when this does a send back to the calling task, it will switch back to it. If you don't have send_and_receive as an atomic operation, there is no way to switch back to the other task, as it might still be running. In L4 it is always the sender who blocks when the receiving side is not ready. How about the pipes implementation in Rust? What I noticed when doing some benchmarks with I/O was that the timing results did vary a lot. I think this is due to the scheduling. >> >>> >>> * I'd like to know more how the task scheduler and the pipes work >>> together. Is there any info available somewhere? >> >> I think brson knows best. > > There's no info besides the source code. The relevant files are > `src/libcore/pipes.rs` and `src/rt/rust_task.cpp`.
Pipes uses three > foreign calls to indirectly control the scheduler: `task_wait_event`, > `task_signal_event`, `task_clear_event_reject`, the details of which I > don't know off hand but which look fairly straightforward. The > implementation is made more complex by the continued existence of > `core::oldcomm`, which uses a slightly different method of signalling > events and relies on much more foreign code. What's the current branch you are working on the new scheduler? Is it newsched of brson/rust? > >> >>> Also, if I would create a native pthread in C, could I simply call an >>> external rust function? >> > > Not yet. Today all Rust code depends on the Rust runtime (I've been > saying that code must be run 'in task context'). Creating a pthread > puts you outside of a task context. Being in task context essentially > means that various runtime calls are able to locate a pointer to > `rust_task` in TLS, and of course that pointer must be set up and > managed correctly. It's not something you can do manually. Is it possible to initialize a pthread like rust_pthread_init() so that later on it has a task context? > We are slowly working on 'freestanding Rust', which will let you run > Rust code without the runtime. This is particularly needed at the > moment to port the Rust runtime to Rust. The first steps are in this > pull request: > > https://github.com/mozilla/rust/pull/4619 > > After that pull request you can make foreign calls and use the > exchange heap outside of task context, but if you do call a function > that requires the runtime the process will abort. Very interesting... I think any memory allocation will require the runtime, no? Though, it's easy to call malloc() for libc... 
Thanks for your answers, Michael From mneumann at ntecs.de Sat Jan 26 04:01:23 2013 From: mneumann at ntecs.de (Michael Neumann) Date: Sat, 26 Jan 2013 13:01:23 +0100 Subject: [rust-dev] Misc questions In-Reply-To: <5103BDD3.9000401@ntecs.de> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> <5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de> Message-ID: <5103C593.4080904@ntecs.de> Am 26.01.2013 12:28, schrieb Michael Neumann: > Another question: When a task sends a message to another task, and > this task is waiting exactly for this event, will it directly switch > to that task, or will it buffer the message? > Sometimes this could be quite handy and efficient. I rember this was > done in the L4 microkernel (www.l4ka.org), which only allowed > synchronous IPC. It could make sense to provide a > send_and_receive directive, which sends to the channel and lets the > scheduler know that it is now waiting for a message to receive from > another port. So send_and_receive could > directly switch to the other task, and when this does a send back to > the calling task, it will switch back to it. If you don't have > send_and_receive as atomic operation, there > is no way to switch back to the other task, as it might still be running. "as it might still be running" is here of course wrong (as we switched to another thread). What I wanted to say is, that it is not waiting for any event, so it is not in a blocking state, so that we cannot directly switch back (matching the recv() and the send()). Ideally the task that wants to read would do the non-blocking I/O itself, and the scheduler would just notify when it can "read". But I think this is not possible with libuv as you have no control over when to read (except using uv_read_start() / _stop). I think this would be much more efficient and even more powerful (one can read directly into a buffer... there is no need to allocate a new buffer for each read as done by libuv). 
So what I would suggest is the following: // task blocking_read(socket, buffer, ...) // this will register socket with the scheduler's event queue (if not yet done) and block. // once the scheduler receives a "data is available" event from the kernel // it will unblock the task. // then the task will do a non-blocking read() on its own. Basically it's the same as what libuv does internally on its own, just that the responsibility for doing the reads, for example, is moved into the task itself, so there is no longer a need for an I/O task and we gain full control of the asynchronous reads. The advantage is: * we no longer need messages for I/O. * more flexibility * much better memory usage (no need to copy anymore) * the design is much easier and better to understand, libraries become so much easier Maybe that's just what you want to implement with the scheduler rewrite? Best, Michael From mneumann at ntecs.de Sat Jan 26 04:07:08 2013 From: mneumann at ntecs.de (Michael Neumann) Date: Sat, 26 Jan 2013 13:07:08 +0100 Subject: [rust-dev] Misc questions In-Reply-To: <5103C593.4080904@ntecs.de> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> <5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de> <5103C593.4080904@ntecs.de> Message-ID: <5103C6EC.5050500@ntecs.de> Am 26.01.2013 13:01, schrieb Michael Neumann: > Am 26.01.2013 12:28, schrieb Michael Neumann: >> Another question: When a task sends a message to another task, and >> this task is waiting exactly for this event, will it directly switch >> to that task, or will it buffer the message? >> Sometimes this could be quite handy and efficient. I remember this was >> done in the L4 microkernel (www.l4ka.org), which only allowed >> synchronous IPC. It could make sense to provide a >> send_and_receive directive, which sends to the channel and lets the >> scheduler know that it is now waiting for a message to receive from >> another port.
So send_and_receive could >> directly switch to the other task, and when this does a send back to >> the calling task, it will switch back to it. If you don't have >> send_and_receive as atomic operation, there >> is no way to switch back to the other task, as it might still be >> running. > > "as it might still be running" is here of course wrong (as we switched > to another thread). What I wanted to say is, that it is not waiting > for any event, so it is not in a blocking state, so that > we cannot directly switch back (matching the recv() and the send()). > > Ideally the task that wants to read would do the non-blocking I/O > itself, and the scheduler would just notify when it can "read". But I > think this is not possible with libuv as you > have no control over when to read (except using uv_read_start() / > _stop). I think this would be much more efficient and even more > powerful (one can read directly into a buffer... > there is no need to allocate a new buffer for each read as done by > libuv). So what I would suggest is the following: > > // task > blocking_read(socket, buffer, ...) > // this will register socket with the schedulers event queue (if > not yet done) and block. > // once the scheduler will receive an "data is available" event > from the kernel > // it will unblock the task. > // then the task will do an non-blocking read() on it's own. > > Basically it's the same what libuv does internally on it's own, just > that the responsibility for doing the read's for example is > moved into the task itself, so there is no longer a need for an I/O > task and we gain full control of the asynchronous reads. > > The advantage is: > > * we no longer need messages for I/O. > * more flexibility > * much better memory usage (no need to copy anymore) > * the design is much easier and better to understand, > libraries become so much easier * and message passing could be done synchronous, i.e. 
very fast :) From j.a.boyden at gmail.com Sat Jan 26 06:26:57 2013 From: j.a.boyden at gmail.com (James Boyden) Date: Sun, 27 Jan 2013 01:26:57 +1100 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: <51006B81.5020506@alum.mit.edu> References: <510059B9.8020308@alum.mit.edu> <4BA886B4-4ECB-420D-BF7B-4747243D86FC@vandals.uidaho.edu> <51006B81.5020506@alum.mit.edu> Message-ID: On Thu, Jan 24, 2013 at 10:00 AM, Niko Matsakis wrote: > The slashes are visually hard to parse, I think everyone agrees. Imagine > this: `&a/B/c/d/e`. That would be a legal type under that proposal and I > think it's pretty darn confusing vs `&'a B<'c, 'd, 'e>` Hi, This is a very interesting discussion. I'm also new to the concept of lifetime parameters, but as a frustrated longtime C++ programmer, I appreciate the concept. Unfortunately, having studied these pages: http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html#returning-borrowed-pointers http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html#named-lifetimes http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ http://smallcultfollowing.com/babysteps/blog/2013/01/15/lifetime-notation-redux/ I still find myself no closer to understanding what `B<'c, 'd, 'e>` would practically mean, nor how it would be realistically used. Would you be able to clarify? Thanks, jb From illissius at gmail.com Sat Jan 26 08:06:42 2013 From: illissius at gmail.com (=?ISO-8859-1?Q?G=E1bor_Lehel?=) Date: Sat, 26 Jan 2013 17:06:42 +0100 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: <50FECB27.8020700@mozilla.com> Message-ID: I've since seen multiple people express that &{foo} ("option 8") is the most appealing to them (which may or may not be confirmation bias, given that it's also the most appealing to me). 
In either case, one argument in favor of that route might be that lifetimes are (probably by far) the most alien and intimidating part of the language, and anything to make them more accessible is worth pursuing. So *if* there is a consensus that "option 8" is in fact the nicest option, then it might be worth biting the bullet. If it were any other part of the language, I'm not sure, but here if whitespace dependency is the price, it might be worth the sacrifice. (That said, I agree that &'foo is also better than the current syntax.) On Wed, Jan 23, 2013 at 4:39 AM, Benjamin Striegel wrote: > Sadly, you should really read this subsequent blog post: > > http://smallcultfollowing.com/babysteps/blog/2013/01/15/lifetime-notation-redux/ > > It turns out that this syntax is ambiguous without introducing a whitespace > dependency. I think it might still be worth it, but I know that a lot of > people tend to shy away from such things on principle. > > > On Tue, Jan 22, 2013 at 2:01 PM, Dean Thompson > wrote: >> >> Looking at Niko's blog post >> >> >> http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ >> >> We do, to my eye, get a huge improvement if we both tweak the notation and >> also augment the ref deconstruction syntax to indicate the resulting >> pointer >> timeline. >> >> Doing this with Niko's preferred option 8 gives us: >> >> pure fn each(&self, f: fn(&(&{self}K, &{self}V)) -> bool) { >> match *self { >> Leaf => (), >> Tree(_, ref{self} left, ref{self} key, >> ref{self} maybe_value, ref{self} right) => { >> left.each(f); >> match *maybe_value { >> Some(ref{self} value) => { >> f(&(key, value)); >> } >> None => () >> }; >> right.each(f); >> } >> } >> >> >> FWIW, Niko's ${foo}bar notation helps my mental "parser" a great deal, >> because it >> makes foo look like a modifier to me. When I see &foo/bar, my mind fights >> to make >> it a pointer to foo with a strange trailing bar. 
>> >> Dean >> >> >> On 1/22/13 9:23 AM, "Graydon Hoare" wrote: >> >> >On 22/01/2013 6:55 AM, Dean Thompson wrote: >> > >> >> I'm looking at some code that Niko Matsakis updated in >> >> https://github.com/stevej/rustled/commits/master/red_black_tree.rs >> >> >> >> pure fn each(&self, f: fn(&(&self/K, &self/V)) -> bool) { >> >> match *self { >> >> Leaf => (), >> >> Tree(_, ref left, ref key, ref maybe_value, ref right) => { >> >> let left: &self/@RBMap = left; >> >> let key: &self/K = key; >> >> let maybe_value: &self/Option = maybe_value; >> >> let right: &self/@RBMap = right; >> >> left.each(f); >> >> match *maybe_value { >> >> Some(ref value) => { >> >> let value: &self/V = value; >> >> f(&(key, value)); >> >> } >> >> None => () >> >> }; >> >> right.each(f); >> >> } >> >> } >> >> } >> >> >> >> I understand this code reasonably well. I greatly value the attention >> >> to safety in Rust, and I appreciate the value of pointer lifetimes in >> >> maintaining that safety. >> >> >> >> My gut reaction, though, is that this code is almost as intimidating >> >> as Haskell. Even more worrisome to me, I think most mainstream >> >> programmers would find the *explanation* of this code intimidating. >> > >> >I agree that the cognitive load on this code sample is high. This is the >> >main risk we took (aside from "potential unsoundness", which I didn't >> >really think to be a big risk, judging from Niko's comfort with the >> >semantics) when adopting first class region pointers: that the resulting >> >types would be too complex to understand, and/or require too much >> >chatter when writing out in full. >> > >> >To my eyes the matter is not yet entirely clear. It's complex but it's >> >not quite "impossibly complex"; if you made all the '&self/' symbols >> >into just '&' it would be, I think, not so bad. 
Compare if you like to >> >the associated bits of code from libc++ required to implement >> >roughly-equivalent "iterate through the treemap" sort of functionality: >> > >> > >> >_LIBCPP_INLINE_VISIBILITY >> >__tree_iterator& operator++() { >> > __ptr_ = static_cast<__node_pointer>( >> > __tree_next( >> > static_cast<__node_base_pointer>(__ptr_))); >> > return *this; >> >} >> > >> >template <class _NodePtr> >> >_NodePtr >> >__tree_next(_NodePtr __x) _NOEXCEPT >> >{ >> > if (__x->__right_ != nullptr) >> > return __tree_min(__x->__right_); >> > while (!__tree_is_left_child(__x)) >> > __x = __x->__parent_; >> > return __x->__parent_; >> >} >> > >> >template <class _NodePtr> >> >inline _LIBCPP_INLINE_VISIBILITY >> >bool >> >__tree_is_left_child(_NodePtr __x) _NOEXCEPT >> >{ >> > return __x == __x->__parent_->__left_; >> >} >> > >> >template <class _NodePtr> >> >inline _LIBCPP_INLINE_VISIBILITY >> >_NodePtr >> >__tree_min(_NodePtr __x) _NOEXCEPT >> >{ >> > while (__x->__left_ != nullptr) >> > __x = __x->__left_; >> > return __x; >> >} >> > >> >And keep in mind that there is no memory-safety in that code: if I >> >invalidate a C++ map while iterating, I just get a wild pointer >> >dereference and crash. If I rewrote it in terms of shared_ptr<> it'd be >> >even chattier. >> > >> >> Who is our target audience for Rust? Graydon has said it is >> >> "frustrated C++ developers", but how sophisticated and how "brave" >> >> are we thinking they will be? >> > >> >The target audience is frustrated C++ developers, same as always. If >> >they balk at the syntax for lifetime-bounds on borrowed pointers, then >> >yes, we've blown the cognitive budget, and have failed. >> > >> >It is not clear to me yet that that's true. But it's a risk. One we're >> >all aware of and worried about. >> > >> >> How intimidating do we think Rust is today? Am I just overreacting >> >> to unfamiliarity? >> > >> >I don't know. It's a very hard thing to measure. I know of lots of >> >languages that have failed for this reason. It's a major hazard.
>> > >> >> How can we calibrate our "intimidation factor" before language >> >> decisions start getting harder to change? >> > >> >If you search our mailing list, IRC logs or meeting minutes for >> >"cognitive budget", "cognitive load" or "cognitive burden" you will see >> >we have always been keenly aware of this risk and treat it as a primary >> >constraint when doing design work. It's a leading reason why many >> >features have been removed, simplified, minimized or excluded from >> >consideration. >> > >> >> Do we want (and is it feasible) to define a simpler subset of the >> >> language that beginners are encouraged to stick to and that most >> >> libraries don't force clients away from? >> > >> >Personal opinion: no. That just makes the issue even more confusing. The >> >way to approach this is head-on, by looking at the things that cause the >> >most confusion and trying to make them cause less. >> > >> >Thanks for bringing this up. I'm interested to hear others' opinions on >> >whether we're past a reasonable limit of comprehensibility. It's a hard >> >thing to hear, but better to hear now than later, if true. >> > >> >-Graydon >> > >> >_______________________________________________ >> >Rust-dev mailing list >> >Rust-dev at mozilla.org >> >https://mail.mozilla.org/listinfo/rust-dev >> >> >> _______________________________________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/listinfo/rust-dev > > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > -- Your ship was destroyed in a monadic eruption. 
From j.a.boyden at gmail.com Sat Jan 26 08:40:46 2013 From: j.a.boyden at gmail.com (James Boyden) Date: Sun, 27 Jan 2013 03:40:46 +1100 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: <510059B9.8020308@alum.mit.edu> <4BA886B4-4ECB-420D-BF7B-4747243D86FC@vandals.uidaho.edu> <51006B81.5020506@alum.mit.edu> Message-ID: Upon re-reading my message, I realised I could have explained my source of confusion better. So, if I may clarify my question: If `B<'c, 'd, 'e>` is intended to be equivalent to what's currently written `B/c/d/e`, does this mean that struct B would be declared something like this: struct B<'X, 'Y, 'Z> ? That is, a template with 3 independent lifetime parameters? In which case, what would be a hypothetical example which might require 3 lifetime parameters? I can understand the purpose of a single lifetime parameter that corresponds to the lifetime of the struct instance, such as in the example: struct StringReader<&self> but I can't think of an example involving 3 independent lifetime parameters at the struct definition level. Or have I misunderstood what `B<'c, 'd, 'e>` is meant to mean? Thanks, jb On Sun, Jan 27, 2013 at 1:26 AM, James Boyden wrote: > On Thu, Jan 24, 2013 at 10:00 AM, Niko Matsakis wrote: >> The slashes are visually hard to parse, I think everyone agrees. Imagine >> this: `&a/B/c/d/e`. That would be a legal type under that proposal and I >> think it's pretty darn confusing vs `&'a B<'c, 'd, 'e>` > > Hi, > > This is a very interesting discussion. I'm also new to the concept of > lifetime parameters, but as a frustrated longtime C++ programmer, I > appreciate the concept. 
> > Unfortunately, having studied these pages: > http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html#returning-borrowed-pointers > http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html#named-lifetimes > http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ > http://smallcultfollowing.com/babysteps/blog/2013/01/15/lifetime-notation-redux/ > > I still find myself no closer to understanding what `B<'c, 'd, 'e>` > would practically mean, nor how it would be realistically used. > > Would you be able to clarify? > > Thanks, > jb From niko at alum.mit.edu Sat Jan 26 09:46:09 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Sat, 26 Jan 2013 09:46:09 -0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: <510059B9.8020308@alum.mit.edu> <4BA886B4-4ECB-420D-BF7B-4747243D86FC@vandals.uidaho.edu> <51006B81.5020506@alum.mit.edu> Message-ID: <51041661.7070300@alum.mit.edu> Hi, You are interpreting it correctly. I don't have time at the moment to write up an example where multiple independent lifetime parameters would be required, but it is certainly true that this would be very unusual. In any case, I have several examples that require two lifetime parameters to be done most naturally, but none that require three, and I'll try to write them up (perhaps in a blog post) over the next few days. The main thing is that, today, we are limited to one---it is *almost* always enough, but not quite, which is why I'd like to generalize to multiple for advanced uses. Niko James Boyden wrote: > Upon re-reading my message, I realised I could have explained > my source of confusion better. So, if I may clarify my question: > > If `B<'c, 'd, 'e>` is intended to be equivalent to what's currently > written `B/c/d/e`, does this mean that struct B would be declared > something like this: > struct B<'X, 'Y, 'Z> > ? > That is, a template with 3 independent lifetime parameters? 
> > In which case, what would be a hypothetical example which might > require 3 lifetime parameters? > > I can understand the purpose of a single lifetime parameter that > corresponds to the lifetime of the struct instance, such as in the > example: > struct StringReader<&self> > but I can't think of an example involving 3 independent lifetime > parameters at the struct definition level. > > Or have I misunderstood what `B<'c, 'd, 'e>` is meant to mean? > > Thanks, > jb > > > On Sun, Jan 27, 2013 at 1:26 AM, James Boyden wrote: >> On Thu, Jan 24, 2013 at 10:00 AM, Niko Matsakis wrote: >>> The slashes are visually hard to parse, I think everyone agrees. Imagine >>> this: `&a/B/c/d/e`. That would be a legal type under that proposal and I >>> think it's pretty darn confusing vs `&'a B<'c, 'd, 'e>` >> Hi, >> >> This is a very interesting discussion. I'm also new to the concept of >> lifetime parameters, but as a frustrated longtime C++ programmer, I >> appreciate the concept. >> >> Unfortunately, having studied these pages: >> http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html#returning-borrowed-pointers >> http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html#named-lifetimes >> http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ >> http://smallcultfollowing.com/babysteps/blog/2013/01/15/lifetime-notation-redux/ >> >> I still find myself no closer to understanding what `B<'c, 'd, 'e>` >> would practically mean, nor how it would be realistically used. >> >> Would you be able to clarify? 
>> >> Thanks, >> jb From j.a.boyden at gmail.com Sat Jan 26 09:54:16 2013 From: j.a.boyden at gmail.com (James Boyden) Date: Sun, 27 Jan 2013 04:54:16 +1100 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: <51041661.7070300@alum.mit.edu> References: <510059B9.8020308@alum.mit.edu> <4BA886B4-4ECB-420D-BF7B-4747243D86FC@vandals.uidaho.edu> <51006B81.5020506@alum.mit.edu> <51041661.7070300@alum.mit.edu> Message-ID: Thanks for the clarification. I would be interested to read more blog posts about further details of this topic. jb On Sun, Jan 27, 2013 at 4:46 AM, Niko Matsakis wrote: > Hi, > > You are interpreting it correctly. I don't have time at the moment to write > up an example where multiple independent lifetime parameters would be > required, but it is certainly true that this would be very unusual. In any > case, I have several examples that require two lifetime parameters to be > done most naturally, but none that require three, and I'll try to write them > up (perhaps in a blog post) over the next few days. The main thing is that, > today, we are limited to one---it is *almost* always enough, but not quite, > which is why I'd like to generalize to multiple for advanced uses. > > > Niko > > > > James Boyden wrote: >> >> Upon re-reading my message, I realised I could have explained >> my source of confusion better. So, if I may clarify my question: >> >> If `B<'c, 'd, 'e>` is intended to be equivalent to what's currently >> written `B/c/d/e`, does this mean that struct B would be declared >> something like this: >> struct B<'X, 'Y, 'Z> >> ? >> That is, a template with 3 independent lifetime parameters? >> >> In which case, what would be a hypothetical example which might >> require 3 lifetime parameters? 
>> >> I can understand the purpose of a single lifetime parameter that >> corresponds to the lifetime of the struct instance, such as in the >> example: >> struct StringReader<&self> >> but I can't think of an example involving 3 independent lifetime >> parameters at the struct definition level. >> >> Or have I misunderstood what `B<'c, 'd, 'e>` is meant to mean? >> >> Thanks, >> jb >> >> >> On Sun, Jan 27, 2013 at 1:26 AM, James Boyden >> wrote: >>> >>> On Thu, Jan 24, 2013 at 10:00 AM, Niko Matsakis >>> wrote: >>>> >>>> The slashes are visually hard to parse, I think everyone agrees. >>>> Imagine >>>> this: `&a/B/c/d/e`. That would be a legal type under that proposal and >>>> I >>>> think it's pretty darn confusing vs `&'a B<'c, 'd, 'e>` >>> >>> Hi, >>> >>> This is a very interesting discussion. I'm also new to the concept of >>> lifetime parameters, but as a frustrated longtime C++ programmer, I >>> appreciate the concept. >>> >>> Unfortunately, having studied these pages: >>> >>> http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html#returning-borrowed-pointers >>> >>> http://static.rust-lang.org/doc/tutorial-borrowed-ptr.html#named-lifetimes >>> >>> http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ >>> >>> http://smallcultfollowing.com/babysteps/blog/2013/01/15/lifetime-notation-redux/ >>> >>> I still find myself no closer to understanding what `B<'c, 'd, 'e>` >>> would practically mean, nor how it would be realistically used. >>> >>> Would you be able to clarify? 
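[As an aside on the question above: Niko's "two lifetime parameters" case can be illustrated in the `<'a, 'b>` notation under discussion, which is the form Rust eventually adopted. The struct and names below are hypothetical, not from Niko's examples; the point is only that the two borrows are independent.]

```rust
// A hypothetical struct with two independent lifetime parameters,
// written in the `<'a, 'b>` notation discussed in this thread.
struct Pair<'a, 'b> {
    first: &'a str,  // borrow with lifetime 'a
    second: &'b str, // borrow with lifetime 'b, independent of 'a
}

// The result is tied to 'a only, so the 'b borrow may end earlier
// without invalidating the returned reference.
fn pick_first<'a, 'b>(p: &Pair<'a, 'b>) -> &'a str {
    p.first
}

fn main() {
    let long_lived = String::from("long");
    let result;
    {
        let short_lived = String::from("short");
        let p = Pair { first: &long_lived, second: &short_lived };
        assert_eq!(p.second, "short");
        result = pick_first(&p);
    } // the 'b borrow ends here; `result` (tied to 'a) is still valid
    assert_eq!(result, "long");
    println!("{}", result);
}
```

With a single shared lifetime parameter, `result` would be forced to end when `short_lived` does, which is exactly the "almost always enough, but not quite" limitation Niko describes.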
>>> >>> Thanks, >>> jb > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From steven099 at gmail.com Sat Jan 26 10:20:06 2013 From: steven099 at gmail.com (Steven Blenkinsop) Date: Sat, 26 Jan 2013 13:20:06 -0500 Subject: [rust-dev] Problem with conflicting implementations for traits In-Reply-To: <5102C1EC.80507@ntecs.de> References: <5102C1EC.80507@ntecs.de> Message-ID: You could define an `enum Default = T` On Friday, 25 January 2013, Michael Neumann wrote: > Hi, > > I am getting the following error: > > msgpack.rs:545:0: 555:1 error: conflicting implementations for a trait > msgpack.rs:545 pub impl msgpack.rs:546 K: serialize::Decodable, > msgpack.rs:547 V: serialize::Decodable> ~[(K,V)]: > serialize::Decodable { > msgpack.rs:548 static fn decode(&self, d: &D) -> ~[(K,V)] { > msgpack.rs:549 do d.read_map |len| { > msgpack.rs:550 do vec::from_fn(len) |i| { > ... > msgpack.rs:539:0: 543:1 note: note conflicting implementation here > msgpack.rs:539 pub impl T: serialize::Decodable { > msgpack.rs:540 static fn decode(&self, d: &D) -> T { > msgpack.rs:541 serialize::Decodable::decode(d as &serialize::Decoder) > msgpack.rs:542 } > msgpack.rs:543 } > > > It's obvious that the two trait implementations conflict, as the one (for > T) is > more general as the other (for ~[(K,V)]). Is there anything I can do to > fix it? > I have found this discussion [1] but I see no solution to the problem. > > For msgpack, I want to support "maps". They are specially encoded, so I > need a > special Decoder (serialize::Decoder does not support maps in any way). > Above I > tried to extend the serialize::Decoder trait for read_map() and > read_map_elt(), > leading to DecoderWithMap: > > pub trait DecoderWithMap : serialize::Decoder { > fn read_map(&self, f: fn(uint) ? T) ? T; > fn read_map_elt(&self, _idx: uint, f: fn() ? T) ? 
T; > } > > Then I tried to implement Decodable for ~[(K,V)] (which I want to use as a > Rust > representation of a map; here I'd probably run into problems again as > serializer defines a generic implementation for ~[] and for tuples... > this at least I could solve by using a different type). > > Now I would need to reimplement Decodable for any type I use, so I tried > to use the second generic trait implementation. But this is where it failed > with a conflict. > > Would it be possible to "override" a standard (more generic) trait > implementation, > either implicitly (like C++ is doing) or explicitly? > > Best, > > Michael > > [1]: https://github.com/mozilla/rust/issues/3429 > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From j.a.boyden at gmail.com Sat Jan 26 11:51:20 2013 From: j.a.boyden at gmail.com (James Boyden) Date: Sun, 27 Jan 2013 06:51:20 +1100 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: <50FECB27.8020700@mozilla.com> References: <50FECB27.8020700@mozilla.com> Message-ID: On Wed, Jan 23, 2013 at 4:23 AM, Graydon Hoare wrote: > The target audience is frustrated C++ developers, same as always. If > they balk at the syntax for lifetime-bounds on borrowed pointers, then > yes, we've blown the cognitive budget, and have failed. [...] > Thanks for bringing this up. I'm interested to hear others' opinions on > whether we're past a reasonable limit of comprehensibility. It's a hard > thing to hear, but better to hear now than later, if true. As an archetypal "frustrated longtime C++ programmer" I've been reading about borrowed pointer lifetimes, and following this thread, with great interest. For what it's worth, here's my 2 cents: 1. Borrowed pointer lifetimes are a very interesting concept. I think they definitely enrich Rust. However... 2.
The current syntax of `&lifetime/type` completely threw me when I first saw it in the wild (before reading up on the topic in the tutorials). I'm concerned that this could be something that might push Rust slightly too far towards "obscure academic theory language" for many C++ programmers. Specifically: 2a. I'm used to seeing the pointed-to type right next to the pointer. I can mentally reverse C++'s `int &p` to `p: &int` without problems, but having extra line-noise in there seems to stretch my on-the-fly language-parsing capability beyond its limits -- particularly so when the identifiers both before and after the `/` look like types, and there's no whitespace breaks to guide my mental tokeniser. 2b. I found the `p: &r/float` syntax especially confusing in that sometimes, without any prior or intervening warning, the next token after the borrow-pointer sigil `&` was a lifetime instead of a type (but of course, not all the time). When you're mentally parsing it, you don't get any explanation of what you're reading until *after* you've read those tokens. 2c. I care more about the pointed-to type than the lifetime (at least on my first scan through the function parameters), so I'd strongly prefer to have the pointed-to type closer to the front, and closer to the parameter name and pointer sigil (i.e., before the lifetime name). This has the additional benefit that now all the parameter type info is together, followed by the lifetime info. 2d. `&` means "pointers" or (occasionally) bitwise-AND. Please don't use it for any other plumbing! (I'm thinking of "Lifetime parameter designated with &" in the previous notation.) For the above reasons, is there any way that the lifetime syntax could be moved *after* the type? The current proposal of `&'lt Foo` does address the ambiguity described in point 2b, but not 2a or 2c. (The `/` sigil itself actually doesn't faze me that much.) 3. 
I'd politely discourage any conventions of calling things `self` (such as the main lifetime parameter) which aren't the official parameter-of-the-same-type-as-the-struct-or-impl. Seeing the blessed, method-making parameter-name `self` serving multiple purposes within a single parameter list was very confusing. 4. As an aside (off the specific topic of lifetime syntax, but still on the more general topic of "intimidation factor" in this thread): While we're on the topic of (not) overloading `self`, could I propose introducing `%` as a unary operator for "typeof(variable)"? If you had a variable `v` of type `T`, then `%v` would be evaluated to `T` at compile-time, as if `T` had been specified directly. Then `%self` could be used as a type in traits and `impl`s, instead of overloading `self` to also mean a type in those particular contexts. So the example in the tutorial would become: // In a trait, `self` refers to the self argument; // `%self` refers to the type implementing the trait trait Eq { fn equals(&self, other: &%self) -> bool; } 5. Back onto the topic of lifetime syntax, whitespace dependency would be disappointing, so I'm not keen on the `{lifetime}` syntax. Still a fan of Rust, just wanting very much to see it succeed! jb From niko at alum.mit.edu Sat Jan 26 14:16:39 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Sat, 26 Jan 2013 14:16:39 -0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: <50FECB27.8020700@mozilla.com> Message-ID: <510455C7.90104@alum.mit.edu> James Boyden wrote: > For the above reasons, is there any way that the lifetime syntax > could be moved *after* the type? The current proposal of `&'lt Foo` > does address the ambiguity described in point 2b, but not 2a or 2c. > (The `/` sigil itself actually doesn't faze me that much.) We considered this, but it's very hard to mix prefix and postfix notation like this without ambiguity.
Consider: &&T/b Is this: &'b &T or &&'b T I suppose we could just resolve it arbitrarily, like if-else-if chains. Niko -------------- next part -------------- An HTML attachment was scrubbed... URL: From martindemello at gmail.com Sun Jan 27 00:47:30 2013 From: martindemello at gmail.com (Martin DeMello) Date: Sun, 27 Jan 2013 00:47:30 -0800 Subject: [rust-dev] Compiling without interference from the system LLVM Message-ID: How do I configure and make rust (from incoming) so that it doesn't try to link against my system LLVM and die? The main problem seems to be that the configure script picks up /usr/bin/llvm-config configure: CFG_LLVM_CONFIG := /usr/bin/llvm-config (3.2) but I couldn't find anything in the configure options to stop it doing that. Leaving llvm-root unset was in theory supposed to do the right thing, but it fails. I filed a bug about it here: https://github.com/mozilla/rust/issues/4607 but I figured I'd check that it wasn't just something I was doing wrong. martin From graydon at mozilla.com Mon Jan 28 10:55:29 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Mon, 28 Jan 2013 10:55:29 -0800 Subject: [rust-dev] Misc questions In-Reply-To: <5103C593.4080904@ntecs.de> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> <5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de> <5103C593.4080904@ntecs.de> Message-ID: <5106C9A1.4030304@mozilla.com> On 13-01-26 04:01 AM, Michael Neumann wrote: > So what I would suggest is the following: > > // task > blocking_read(socket, buffer, ...) > // this will register socket with the schedulers event queue (if not > yet done) and block. > // once the scheduler will receive an "data is available" event from > the kernel > // it will unblock the task. > // then the task will do an non-blocking read() on it's own. 
> > Basically it's the same what libuv does internally on it's own, just > that the responsibility for doing the read's for example is > moved into the task itself, so there is no longer a need for an I/O task > and we gain full control of the asynchronous reads. > > The advantage is: > > * we no longer need messages for I/O. > * more flexibility > * much better memory usage (no need to copy anymore) > * the design is much easier and better to understand, > libraries become so much easier > > Maybe that's just what you want to implement with the scheduler rewrite? That is what I would like to see in the rewrite, yes. It will require some care in designing the event-loop interface that's called from the rust-library side, but I think it ought to work. Really we should have done this from the beginning, we just (wrongly) deferred attention to AIO library work too long and wound up already having grown a (duplicate) task scheduler that didn't integrate an IO event loop. So now we have to de-duplicate them. -Graydon From niko at alum.mit.edu Mon Jan 28 10:59:39 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Mon, 28 Jan 2013 10:59:39 -0800 Subject: [rust-dev] "intimidation factor" vs target audience In-Reply-To: References: Message-ID: <5106CA9B.9020601@alum.mit.edu> Dean Thompson wrote: > Niko has indicated that the sequences like this: > > > Some(ref value) => { > > let value: &self/V = value; > > are working around a current bug in ref. The let should be unnecessary. Hello, I just wanted to note that as of a recent push, the bug (issue #3148) that made it necessary to include these manual annotations should be fixed. 
Niko From banderson at mozilla.com Mon Jan 28 12:23:39 2013 From: banderson at mozilla.com (Brian Anderson) Date: Mon, 28 Jan 2013 12:23:39 -0800 Subject: [rust-dev] Compiling without interference from the system LLVM In-Reply-To: References: Message-ID: <5106DE4B.5030504@mozilla.com> On 01/27/2013 12:47 AM, Martin DeMello wrote: > How do I configure and make rust (from incoming) so that it doesn't > try to link against my system LLVM and die? > > The main problem seems to be that the configure script picks up > /usr/bin/llvm-config > > configure: CFG_LLVM_CONFIG := /usr/bin/llvm-config (3.2) > > but I couldn't find anything in the configure options to stop it doing > that. Leaving llvm-root unset was in theory supposed to do the right > thing, but it fails. > > I filed a bug about it here: > https://github.com/mozilla/rust/issues/4607 but I figured I'd check > that it wasn't just something I was doing wrong. > > I left some comments in the bug ticket. From mneumann at ntecs.de Mon Jan 28 13:24:12 2013 From: mneumann at ntecs.de (Michael Neumann) Date: Mon, 28 Jan 2013 22:24:12 +0100 Subject: [rust-dev] Illegal borrow unless pure Message-ID: <5106EC7C.80109@ntecs.de> Hi, I am trying to get the following example to work. What I find strange is that I can call vec::push in fn main(), but when I do the same from fn pushit(), it fails with: t.rs:6:17: 6:24 error: illegal borrow unless pure: unique value in aliasable, mutable location t.rs:6 vec::push(&mut a.arr[0], 3); ^~~~~~~ t.rs:6:2: 6:11 note: impure due to access to impure function t.rs:6 vec::push(&mut a.arr[0], 3); ^~~~~~~~~ Example: struct A { arr: ~[ ~[int] ] } fn pushit(a: &mut A) /*unsafe*/ { vec::push(&mut a.arr[0], 3); } fn main() { let mut a: A = A {arr: ~[ ~[1,2,3], ~[3,4,5] ] }; vec::push(&mut a.arr[0], 3); // WORKS pushit(&mut a); // DOES NOT work!!! error!("%?", a); } I am a bit lost here. When I use "unsafe" it works, but is it safe???
Best, Michael From niko at alum.mit.edu Mon Jan 28 13:27:15 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Mon, 28 Jan 2013 13:27:15 -0800 Subject: [rust-dev] Illegal borrow unless pure In-Reply-To: <5106EC7C.80109@ntecs.de> References: <5106EC7C.80109@ntecs.de> Message-ID: <5106ED33.7080401@alum.mit.edu> This looks like a bug. The new so-called "INHTWAMA semantics" should allow this sort of thing. I'll take a look. Also, as an aside, I think the code reads nicer if you use methods: (e.g., `a.arr[0].push(3);`). But it's equivalent anyhow. Niko Michael Neumann wrote: > Hi, > > I am trying to get the following example to work. What I find strange > is, that > I can call vec::push in fn main(), but when I do the same from fn > pushit(), it fails > with: > > t.rs:6:17: 6:24 error: illegal borrow unless pure: unique value in > aliasable, mutable location > t.rs:6 vec::push(&mut a.arr[0], 3); > ^~~~~~~ > t.rs:6:2: 6:11 note: impure due to access to impure function > t.rs:6 vec::push(&mut a.arr[0], 3); > ^~~~~~~~~ > > Example: > > struct A { > arr: ~[ ~[int] ] > } > > fn pushit(a: &mut A) /*unsafe*/ { > vec::push(&mut a.arr[0], 3); > } > > fn main() { > let mut a: A = A {arr: ~[ ~[1,2,3], ~[3,4,5] ] }; > vec::push(&mut a.arr[0], 3); // WORKS > pushit(&mut a); // DOES NOT work!!! > error!("%?", a); > } > > I am a bit lost here. When I use "unsafe" it works, but is it safe??? 
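[For reference: Michael's example, translated to the syntax Rust later settled on (`Vec<Vec<i64>>` standing in for `~[~[int]]`, and the method form Niko suggests), is accepted by the borrow checker with no `unsafe`, consistent with Niko's view that this was a bug rather than intended behaviour. This is a sketch of the translated code, not the 0.5-era original:]

```rust
// Michael's example in later syntax; the &mut borrow through `pushit`
// is accepted, matching the "INHTWAMA semantics" Niko mentions.
struct A {
    arr: Vec<Vec<i64>>,
}

fn pushit(a: &mut A) {
    // Equivalent of `vec::push(&mut a.arr[0], 3)`, written as a
    // method call per Niko's suggestion.
    a.arr[0].push(3);
}

fn main() {
    let mut a = A { arr: vec![vec![1, 2, 3], vec![3, 4, 5]] };
    a.arr[0].push(3); // works in main
    pushit(&mut a);   // also works through a &mut borrow, no unsafe
    assert_eq!(a.arr[0], vec![1, 2, 3, 3, 3]);
    println!("{:?}", a.arr);
}
```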
> > Best, > > Michael > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From banderson at mozilla.com Mon Jan 28 13:30:13 2013 From: banderson at mozilla.com (Brian Anderson) Date: Mon, 28 Jan 2013 13:30:13 -0800 Subject: [rust-dev] Misc questions In-Reply-To: <5103BDD3.9000401@ntecs.de> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> <5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de> Message-ID: <5106EDE5.4000505@mozilla.com> On 01/26/2013 03:28 AM, Michael Neumann wrote: > Am 25.01.2013 04:01, schrieb Brian Anderson: >> On 01/24/2013 04:37 PM, Patrick Walton wrote: >>> On 1/24/13 3:55 PM, Michael Neumann wrote: >>> >>>> * I don't like the way libuv is currently integrated into the >>>> system. It >> >> I sympathize. > > :) > >> >>>> works, but performance is >>>> quite low and IMHO the blocking interface is not very usable. For >>>> example I want to write a process >>>> that accepts messages from other processes, and then writes >>>> something to >>>> the socket or reads from >>>> the socket. This will currently not work, as reading from the socket >>>> will block the process, and >>>> then no more requests can be sent to the process. >>>> So instead of using the read() / write() API of an io::Reader, I'd >>>> prefer to expose the read/write >>>> events of libuv via messages (this is already done between the iotask >>>> and the read()/write() methods, >>>> but it is not accessible to the "end-user"). >>>> >>>> So instead of: >>>> >>>> io.read(...) >>>> >>>> one would simply write: >>>> >>>> readport.recv() >> >> Both of these are blocking. The significant advantage of using a port >> here though is that core::pipes has several ways to receive on >> multiple ports at once, so you could wait for both the read event and >> some other signal instead of being stuck until the read either >> succeeds or times out. > > Exactly! 
> >> There is a lot of overlap functionally between Port/Chan and >> Reader/Writer (std::flatpipes even implements the GenericPort and >> GenericChan traits on top of Reader and Writer). They both have their >> strengths though. Streaming data over channels is just going to leave >> you with a bunch of byte buffers, whereas Readers give you lots of >> control over how to interpret those bytes. I could see the API here >> being channel based, letting the user opt into Reader and Writer >> implementations for Port<~[u8]> etc. as needed. >> >>>> >>>> The same for writes. EOF results in closing the readport. The question >>>> is how these messages >>>> should look to be usable for the programmer (how to handle >>>> errors?). >>>> >>>> What do you think? >>>> >>>> Actually there would be connecting ports, which receive events >>>> whenever >>>> a new connection is established. >>>> A successfully established connection would then be represented by a >>>> readport and writechannel. >>> >>> brson is working on a rewrite of the scheduler. This new scheduler >>> should run directly on the libuv event loop. This should have much >>> higher performance. >> >> The current implementation will go away completely. It was useful as >> a prototype but it has problems. The new intent is to design the Rust >> scheduler so that it can be driven by an arbitrary event loop, then >> use the uv event loop for that purpose by default (this could also be >> useful for e.g. integrating with the Win32 event loop, etc.). The >> main advantage of integrating uv into the scheduler over the current >> design is that there will be much less synchronization and context >> switching to dispatch events. This will unfortunately necessitate a >> complex integration of the scheduler and the I/O system and I don't >> know how that is going to work yet. > > Do you mean that the libuv callbacks -> messages conversion will then > go away?
Currently every callback is sent from the iotask to the "interested" task via a message. > Will the new design be different in this regard? I'd like to hear more > details about what you are trying to accomplish. Is there already some > code? > The callback to message conversion will go away I think since the callbacks will be happening on the same thread where the data is sent or received. What I am trying to accomplish is to minimize the overhead imposed by the Rust runtime, particularly regarding context switches and locking. There is no code yet as I am still working through some blockers. I am not even sure how it's going to work, I just have a vague idea that our scheduler threads are event loops, and uv is an event loop, so let's plug them together. I think the transition from uv's asynchronous callbacks to Rust imperative code will still involve a buffered message object that gets returned to Rust code through a context switch, but it won't involve nearly as much work to do the handoff since the I/O work and the Rust code are both running in the same thread. > I think another thing that currently makes I/O slow is that for every > read() call, at first a message is sent to the iotask to let > libuv know that we want to start reading from > the socket (uv_read_start()). Then we actually request the read > (another message) and finally we stop reading again (uv_read_stop()). > So in total, every read will involve 3 complete > message cycles between iotask and the requesting task. I hope this > is going to be reduced to just one (but this is probably just a > library issue and not related to the scheduler...). This overhead should go away I think. > > Another question: When a task sends a message to another task, and > this task is waiting exactly for this event, will it directly switch > to that task, or will it buffer the message? It will not directly switch to the other task.
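Brian's answers above (sends are buffered rather than switching directly, and a task can wait on several ports at once) can be sketched with today's std::sync::mpsc rather than the 0.5-era pipes API; the `Event` type and the two producer threads are hypothetical stand-ins for socket data and control messages:

```rust
use std::sync::mpsc;
use std::thread;

// Two kinds of events one task might wait on: bytes from a socket,
// or a control message from another task (both hypothetical here).
#[derive(Debug, PartialEq)]
enum Event {
    Data(Vec<u8>),
    Shutdown,
}

// Cloned Senders merge both producers into one queue; recv() drains the
// buffered messages in arrival order, and the senders never block or
// context-switch directly into the receiver.
fn run() -> (usize, usize) {
    let (tx, rx) = mpsc::channel();
    let data_tx = tx.clone();

    thread::spawn(move || data_tx.send(Event::Data(vec![1, 2, 3])).unwrap());
    thread::spawn(move || tx.send(Event::Shutdown).unwrap());

    let (mut data_seen, mut shutdown_seen) = (0, 0);
    for _ in 0..2 {
        match rx.recv().unwrap() {
            Event::Data(buf) => {
                assert_eq!(buf, vec![1, 2, 3]);
                data_seen += 1;
            }
            Event::Shutdown => shutdown_seen += 1,
        }
    }
    (data_seen, shutdown_seen)
}

fn main() {
    assert_eq!(run(), (1, 1));
    println!("received one Data and one Shutdown event");
}
```

Either event may arrive first, but both are buffered until the receiver asks for them, which is the behaviour Brian describes.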
I rember this was > done in the L4 microkernel (www.l4ka.org), which only allowed > synchronous IPC. It could make sense to provide a > send_and_receive directive, which sends to the channel and lets the > scheduler know that it is now waiting for a message to receive from > another port. So send_and_receive could > directly switch to the other task, and when this does a send back to > the calling task, it will switch back to it. If you don't have > send_and_receive as atomic operation, there > is no way to switch back to the other task, as it might still be running. We've discussed this optimization and it could be important, but don't have any specific plans for it at the moment. > > In L4 it is always the sender who blocks when the receiving side is > not ready. How about the pipes implementation in Rust? In Rust the sender does not block. > > What I noticed when doing some benchmarks with I/O was that the timing > results did vary a lot. I think this is due to the scheduling. Likely. Rust tasks to not preempt at all at the moment. > >>> >>>> >>>> * I'd like to know more how the task scheduler and the pipes work >>>> together. Is there any info available somewhere? >>> >>> I think brson knows best. >> >> There's no info besides the source code. The relevant files are >> `src/libcore/pipes.rs` and `src/rt/rust_task.cpp`. Pipes uses three >> foreign calls to indirectly control the scheduler: `task_wait_event`, >> `task_signal_event`, `task_clear_event_reject`, the details of which >> I don't know off hand but which look fairly straightforward. The >> implementation is made more complex by the continued existence of >> `core::oldcomm`, which uses a slightly different method of signalling >> events and relies on much more foreign code. > > What's the current branch you are working on the new scheduler? Is it > newsched of brson/rust? Sort of. There's no scheduler code there but there are a few prerequisite changes to make Rust code run outside of task context. 
I have not started writing any new scheduler code yet. Here are my current work items: * Remove oldcomm - Before moving the scheduler to Rust I want to reduce the existing implementation to something minimal so it is easier to understand. The first step here is removing the old comm subsystem. * Upgrade libuv - pfox__ is doing some work on this * Freestanding Rust - We need to be able to run Rust code without the runtime * Reorganize core - core needs a bit of restructuring to accommodate all the forthcoming runtime code. My ideas on this are here: https://github.com/mozilla/rust/issues/2240 * Start sketching out the scheduler types. > >> >>> >>>> Also, if I were to create a native pthread in C, could I simply call an >>>> external Rust function? >>> >> >> Not yet. Today all Rust code depends on the Rust runtime (I've been >> saying that code must be run 'in task context'). Creating a pthread >> puts you outside of a task context. Being in task context essentially >> means that various runtime calls are able to locate a pointer to >> `rust_task` in TLS, and of course that pointer must be set up and >> managed correctly. It's not something you can do manually. > > Is it possible to initialize a pthread like rust_pthread_init() so > that later on it has a task context? Not independently of the runtime. The task object provides several services, some of which depend on further scheduler or kernel (runtime) resources, and some of which could conceivably be factored out independently of the rest of the scheduler. * Local heap allocation * Stack growth and foreign stack switching. Stack growth could be done without the scheduler but the current scheme for doing foreign stack switches involves interaction with the scheduler. * Signalling and waiting on events (for scheduling) All of this lives in the C++ rust_task class which has back references to scheduler types.
At this point you either have access to the runtime or don't, but part of this work involves splitting out these capabilities so that fewer features require access to a running scheduler. > >> We are slowly working on 'freestanding Rust', which will let you run >> Rust code without the runtime. This is particularly needed at the >> moment to port the Rust runtime to Rust. The first steps are in this >> pull request: >> >> https://github.com/mozilla/rust/pull/4619 >> >> After that pull request you can make foreign calls and use the >> exchange heap outside of task context, but if you do call a function >> that requires the runtime the process will abort. > > Very interesting... > > I think any memory allocation will require the runtime, no? Though, > it's easy to call malloc() for libc... The exchange heap can be implemented as calls to malloc. From banderson at mozilla.com Mon Jan 28 13:46:05 2013 From: banderson at mozilla.com (Brian Anderson) Date: Mon, 28 Jan 2013 13:46:05 -0800 Subject: [rust-dev] Misc questions In-Reply-To: <5103C6EC.5050500@ntecs.de> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> <5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de> <5103C593.4080904@ntecs.de> <5103C6EC.5050500@ntecs.de> Message-ID: <5106F19D.8010802@mozilla.com> On 01/26/2013 04:07 AM, Michael Neumann wrote: > Am 26.01.2013 13:01, schrieb Michael Neumann: >> Am 26.01.2013 12:28, schrieb Michael Neumann: >>> Another question: When a task sends a message to another task, and >>> this task is waiting exactly for this event, will it directly switch >>> to that task, or will it buffer the message? >>> Sometimes this could be quite handy and efficient. I rember this was >>> done in the L4 microkernel (www.l4ka.org), which only allowed >>> synchronous IPC. It could make sense to provide a >>> send_and_receive directive, which sends to the channel and lets the >>> scheduler know that it is now waiting for a message to receive from >>> another port. 
So send_and_receive could >>> directly switch to the other task, and when this does a send back to >>> the calling task, it will switch back to it. If you don't have >>> send_and_receive as an atomic operation, there >>> is no way to switch back to the other task, as it might still be >>> running. >> >> "as it might still be running" is here of course wrong (as we >> switched to another thread). What I wanted to say is that it is not >> waiting for any event, so it is not in a blocking state, so that >> we cannot directly switch back (matching the recv() and the send()). >> >> Ideally the task that wants to read would do the non-blocking I/O >> itself, and the scheduler would just notify when it can "read". But I >> think this is not possible with libuv as you >> have no control over when to read (except using uv_read_start() / >> _stop). I think this would be much more efficient and even more >> powerful (one can read directly into a buffer... >> there is no need to allocate a new buffer for each read as done by >> libuv). So what I would suggest is the following: >> >> // task >> blocking_read(socket, buffer, ...) >> // this will register socket with the scheduler's event queue (if >> not yet done) and block. >> // once the scheduler receives a "data is available" event >> from the kernel >> // it will unblock the task. >> // then the task will do a non-blocking read() on its own. I'm not that familiar with the uv API. Is there a distinct 'data available' event that happens before we start reading? I've been assuming that, as you say, we have no control over when the read events happen, so we would need to check whether the task initiating this read was currently waiting for data, and either buffer it or context switch to the task depending on its state.
>> >> Basically it's the same what libuv does internally on it's own, just >> that the responsibility for doing the read's for example is >> moved into the task itself, so there is no longer a need for an I/O >> task and we gain full control of the asynchronous reads. >> >> The advantage is: >> >> * we no longer need messages for I/O. >> * more flexibility >> * much better memory usage (no need to copy anymore) Agree on these points. >> * the design is much easier and better to understand, >> libraries become so much easier I do not think the design that integrates the scheduler with the I/O loop will be easier to understand. I expect the interactions will be complicated. > > * and message passing could be done synchronous, i.e. very fast :) In many cases this should be true, if the task initiating the I/O is still waiting on it. It could go off and execute other code and block on other things though, in which case the data needs to be buffered until the task can be scheduled. From mneumann at ntecs.de Mon Jan 28 15:37:06 2013 From: mneumann at ntecs.de (Michael Neumann) Date: Tue, 29 Jan 2013 00:37:06 +0100 Subject: [rust-dev] Misc questions In-Reply-To: <5106F19D.8010802@mozilla.com> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> <5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de> <5103C593.4080904@ntecs.de> <5103C6EC.5050500@ntecs.de> <5106F19D.8010802@mozilla.com> Message-ID: <51070BA2.6000608@ntecs.de> Am 28.01.2013 22:46, schrieb Brian Anderson: > On 01/26/2013 04:07 AM, Michael Neumann wrote: >> Am 26.01.2013 13:01, schrieb Michael Neumann: >>> Am 26.01.2013 12:28, schrieb Michael Neumann: >>>> Another question: When a task sends a message to another task, and >>>> this task is waiting exactly for this event, will it directly >>>> switch to that task, or will it buffer the message? >>>> Sometimes this could be quite handy and efficient. 
I rember this >>>> was done in the L4 microkernel (www.l4ka.org), which only allowed >>>> synchronous IPC. It could make sense to provide a >>>> send_and_receive directive, which sends to the channel and lets the >>>> scheduler know that it is now waiting for a message to receive from >>>> another port. So send_and_receive could >>>> directly switch to the other task, and when this does a send back >>>> to the calling task, it will switch back to it. If you don't have >>>> send_and_receive as atomic operation, there >>>> is no way to switch back to the other task, as it might still be >>>> running. >>> >>> "as it might still be running" is here of course wrong (as we >>> switched to another thread). What I wanted to say is, that it is not >>> waiting for any event, so it is not in a blocking state, so that >>> we cannot directly switch back (matching the recv() and the send()). >>> >>> Ideally the task that wants to read would do the non-blocking I/O >>> itself, and the scheduler would just notify when it can "read". But >>> I think this is not possible with libuv as you >>> have no control over when to read (except using uv_read_start() / >>> _stop). I think this would be much more efficient and even more >>> powerful (one can read directly into a buffer... >>> there is no need to allocate a new buffer for each read as done by >>> libuv). So what I would suggest is the following: >>> >>> // task >>> blocking_read(socket, buffer, ...) >>> // this will register socket with the schedulers event queue (if >>> not yet done) and block. >>> // once the scheduler will receive an "data is available" event >>> from the kernel >>> // it will unblock the task. >>> // then the task will do an non-blocking read() on it's own. > > I'm not that familiar with the uv API. Is there a distinct 'data > available' event that happens before we start reading? 
I've been > assuming that, as you say, we have no control over when the read > events happen, so we would need to check whether the task initiating > this read was currently waiting for data, and either buffer it or > context switch to the task depending on its state. No there isn't! The reason why, as far as I understand it, lies in the way Windows handles reads. In UNIX you get notified when you can read, while in Windows you get notified when a read has completed, so you are basically doing the read asynchronously in the background (saving you another context switch to the kernel). I think this is called Proactor (the UNIX way is called Reactor). libuv wants to do this in a platform-independent way, where the programmer who uses libuv does not have to care about which platform he is working with. So when we think about this sequence in libuv

    uv_read_start(fd)
      -> on_read_cb gets triggered
    uv_read_stop(fd)

what it does internally is the following:

UNIX:

    register event for `fd` in event queue
    epoll()
      -> allocate buffer
      -> read(fd, "nonblocking")
      -> call on_read_cb
    unregister event

Windows:

    allocate buffer
    start asynchronous read request
    wait for completion (of any outstanding I/O)
      -> call on_read_cb

I think libuv is doing too much here. For example, if I don't want to remove the socket from the event queue, just disable the callback, then this is not possible. I'd prefer it if I could just tell libuv that I am interested in event X (on Windows: I/O completion, on UNIX: I/O availability). I think a simple hack would be to store the buffer address and buffer size in the uv_handle_t structure:

    struct our_handle {
        uv_handle_t handle;
        void *buffer;
        size_t buffer_size;
    };

and then have the alloc_cb return that:

    static uv_buf_t alloc_cb(uv_handle_t *handle, size_t suggested_size)
    {
        struct our_handle *h = (struct our_handle*)handle;
        return uv_buf_init(h->buffer, h->buffer_size);
    }

You specify the alloc_cb in uv_read_start().
The only thing that you need to consider is that when on_read_cb gets called, you had better call uv_read_stop(), otherwise the buffer could be overwritten on the next read. Well, yes, this should work for both UNIX and Windows. If you need specific help, let me know. I've been hacking a lot with libuv lately and I can't wait to use async I/O in Rust (in a way that actually performs well). Regards, Michael From banderson at mozilla.com Mon Jan 28 16:56:38 2013 From: banderson at mozilla.com (Brian Anderson) Date: Mon, 28 Jan 2013 16:56:38 -0800 Subject: [rust-dev] Misc questions In-Reply-To: <51070BA2.6000608@ntecs.de> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> <5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de> <5103C593.4080904@ntecs.de> <5103C6EC.5050500@ntecs.de> <5106F19D.8010802@mozilla.com> <51070BA2.6000608@ntecs.de> Message-ID: <51071E46.5080709@mozilla.com> On 01/28/2013 03:37 PM, Michael Neumann wrote: > Am 28.01.2013 22:46, schrieb Brian Anderson: >> On 01/26/2013 04:07 AM, Michael Neumann wrote: >>> Am 26.01.2013 13:01, schrieb Michael Neumann: >>>> Am 26.01.2013 12:28, schrieb Michael Neumann: >>>>> Another question: When a task sends a message to another task, and >>>>> this task is waiting exactly for this event, will it directly >>>>> switch to that task, or will it buffer the message? >>>>> Sometimes this could be quite handy and efficient. I rember this >>>>> was done in the L4 microkernel (www.l4ka.org), which only allowed >>>>> synchronous IPC. It could make sense to provide a >>>>> send_and_receive directive, which sends to the channel and lets >>>>> the scheduler know that it is now waiting for a message to receive >>>>> from another port. So send_and_receive could >>>>> directly switch to the other task, and when this does a send back >>>>> to the calling task, it will switch back to it.
If you don't have >>>>> send_and_receive as atomic operation, there >>>>> is no way to switch back to the other task, as it might still be >>>>> running. >>>> >>>> "as it might still be running" is here of course wrong (as we >>>> switched to another thread). What I wanted to say is, that it is >>>> not waiting for any event, so it is not in a blocking state, so that >>>> we cannot directly switch back (matching the recv() and the send()). >>>> >>>> Ideally the task that wants to read would do the non-blocking I/O >>>> itself, and the scheduler would just notify when it can "read". But >>>> I think this is not possible with libuv as you >>>> have no control over when to read (except using uv_read_start() / >>>> _stop). I think this would be much more efficient and even more >>>> powerful (one can read directly into a buffer... >>>> there is no need to allocate a new buffer for each read as done by >>>> libuv). So what I would suggest is the following: >>>> >>>> // task >>>> blocking_read(socket, buffer, ...) >>>> // this will register socket with the schedulers event queue >>>> (if not yet done) and block. >>>> // once the scheduler will receive an "data is available" event >>>> from the kernel >>>> // it will unblock the task. >>>> // then the task will do an non-blocking read() on it's own. >> >> I'm not that familiar with the uv API. Is there a distinct 'data >> available' event that happens before we start reading? I've been >> assuming that, as you say, we have to control over when the read >> events happen, so we would need to check whether the task initiating >> this read was currently waiting for data, and either buffer it or >> context switch to the task depending on its state. > > No there isn't! The reason why, as far as I understand it, lies in the > way Windows handles reads. 
In UNIX you get notified, when you can > read, while in Windows, > you get notified when a read completed, so you are basically doing the > read asynchronously in the background (saving you another context > switch to the kernel). > I think this is called Proactor (the UNIX-way is called Reactor). > libuv wants to do this in a platform-independent way, where the > programmer who uses libuv does > not have to care about which platform he is working with. > > So when we think about this sequence in libuv > > uv_read_start(fd) > -> on_read_cb gets triggered > uv_read_stop(fd) > > what it does internally is the following: > > UNIX: > > register event for `fd` in event queue > epoll() > -> allocate buffer > -> read(fd, "nonblocking") > -> call on_read_cb > unregister event > > Windows: > > allocate buffer > start asynchronous read request > wait for completion (of any outstanding I/O) > -> call on_read_cb > > I think libuv is doing too much here. For example, if I don't want to > remove the socket from the event > queue, just disable the callback, then this is not possible. I'd > prefer when I could just tell libuv that > I am interested in event X (on Windows: I/O completion, on UNIX: I/O > availability). > > I think a simple hack would be to store the buffer address and size of > buffer in the uv_handle_t structure: > > struct our_handle { > uv_handle_t handle; > void *buffer; > size_t buffer_size; > } > > and then have the alloc_cb return that: > > static uv_buf_t alloc_cb(uv_handle_t *handle, size_t suggested_size) > { > struct our_handle *h = (struct our_handle*)handle; > return uv_buf_init(h->buffer, h->buffer_size); > } > > You specify the alloc_cb in uv_read_start(). The only thing that you > need to consider > is that when on_read_cb gets called, you better call uv_read_stop(), > otherwise > the buffer could be overwritten the next time. > > Well, yes, this should work for both UNIX and Windows. If you need > specific help, let me know. 
> I've been hacking a lot with libuv lately and I can't wait using async > I/O in rust (which actually > performs well). > Is it possible to do this optimization later or do we need to plan for this ahead of time? I would prefer to use the uv API as it's presented to start with. I welcome any help here. One important and big step we need to get through before trying to integrate uv into the scheduler is to create safe Rust bindings to libuv. Last time around we coded directly to the uv API and it resulted in some big maintenance problems (unsafe code everywhere). pfox is working on updating libuv to upstream trunk now, after which I expect somebody will start on the bindings. If you want to discuss the integration of uv into the scheduler there is an issue open: https://github.com/mozilla/rust/issues/4419. From graydon at mozilla.com Mon Jan 28 17:29:14 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Mon, 28 Jan 2013 17:29:14 -0800 Subject: [rust-dev] Misc questions In-Reply-To: <51071E46.5080709@mozilla.com> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> <5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de> <5103C593.4080904@ntecs.de> <5103C6EC.5050500@ntecs.de> <5106F19D.8010802@mozilla.com> <51070BA2.6000608@ntecs.de> <51071E46.5080709@mozilla.com> Message-ID: <510725EA.2090907@mozilla.com> On 13-01-28 04:56 PM, Brian Anderson wrote: >> I think libuv is doing too much here. For example, if I don't want to >> remove the socket from the event >> queue, just disable the callback, then this is not possible. I'd >> prefer when I could just tell libuv that >> I am interested in event X (on Windows: I/O completion, on UNIX: I/O >> availability). Yet the optimization you suggest has to do with recycling the buffer, not listening for one kind of event vs. another. In general I'm not interested in trying to "get underneath" the abstraction uv is providing. 
It's providing an IOCP-oriented interface, I would like to code to that and make the rust IO library not have to worry when it's on windows vs. unix. That's the point of the abstraction uv provides, and it's valuable. If it means bouncing off epoll a few too many times (or reallocating a buffer a few too many times), I'm not too concerned. Those should both be O(1) operations. > Is it possible to do this optimization later or do we need to plan for > this ahead of time? I would prefer to use the uv API as it's presented > to start with. The optimization to use a caller-provided buffer should (a) not be necessary to get us started and (b) be equally possible on either platform, unix or windows, _so long as_ we're actually sleeping a task during its period of interest in IO (either the pre-readiness sleep or a post-issue, pre-completion sleep). In other words, if we're simulating sync IO, then we can use a task-local buffer. If we're _not_ simulating sync IO (I sure hope we do!) then we should let uv allocate and free dynamic buffers as it needs them. But I really hope we wind up structuring it so it simulates sync IO. We're providing a task abstraction. Users _want_ the sync IO abstraction the same way they want the sequential control flow abstraction. (Indeed, on an appropriately-behaving system I fully expect task=thread and sync IO calls=system calls) > I welcome any help here. One important and big step we need to get > through before trying to integrate uv into the scheduler is to create > safe Rust bindings to libuv. Last time around we coded directly to the > uv API and it resulted in some big maintenance problems (unsafe code > everywhere). pfox is working on updating libuv to upstream trunk now, > after which I expect somebody will start on the bindings. If you want to > discuss the integration of uv into the scheduler there is an issue open: > https://github.com/mozilla/rust/issues/4419. I'll try to remain more-involved this time. 
We have to get this right. -Graydon From banderson at mozilla.com Mon Jan 28 18:01:33 2013 From: banderson at mozilla.com (Brian Anderson) Date: Mon, 28 Jan 2013 18:01:33 -0800 Subject: [rust-dev] Misc questions In-Reply-To: <510725EA.2090907@mozilla.com> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> <5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de> <5103C593.4080904@ntecs.de> <5103C6EC.5050500@ntecs.de> <5106F19D.8010802@mozilla.com> <51070BA2.6000608@ntecs.de> <51071E46.5080709@mozilla.com> <510725EA.2090907@mozilla.com> Message-ID: <51072D7D.8090609@mozilla.com> On 01/28/2013 05:29 PM, Graydon Hoare wrote: > On 13-01-28 04:56 PM, Brian Anderson wrote: > >>> I think libuv is doing too much here. For example, if I don't want to >>> remove the socket from the event >>> queue, just disable the callback, then this is not possible. I'd >>> prefer when I could just tell libuv that >>> I am interested in event X (on Windows: I/O completion, on UNIX: I/O >>> availability). > Yet the optimization you suggest has to do with recycling the buffer, > not listening for one kind of event vs. another. > > In general I'm not interested in trying to "get underneath" the > abstraction uv is providing. It's providing an IOCP-oriented interface, > I would like to code to that and make the rust IO library not have to > worry when it's on windows vs. unix. That's the point of the abstraction > uv provides, and it's valuable. If it means bouncing off epoll a few too > many times (or reallocating a buffer a few too many times), I'm not too > concerned. Those should both be O(1) operations. > >> Is it possible to do this optimization later or do we need to plan for >> this ahead of time? I would prefer to use the uv API as it's presented >> to start with. 
> The optimization to use a caller-provided buffer should (a) not be > necessary to get us started and (b) be equally possible on either > platform, unix or windows, _so long as_ we're actually sleeping a task > during its period of interest in IO (either the pre-readiness sleep or a > post-issue, pre-completion sleep). In other words, if we're simulating > sync IO, then we can use a task-local buffer. If we're _not_ simulating > sync IO (I sure hope we do!) then we should let uv allocate and free > dynamic buffers as it needs them. > > But I really hope we wind up structuring it so it simulates sync IO. > We're providing a task abstraction. Users _want_ the sync IO abstraction > the same way they want the sequential control flow abstraction. Presenting the scheduler-originating I/O as synchronous is what I intend. I am not sure that we can guarantee that a task is actually waiting for I/O when an I/O event occurs that that task is waiting for. A task may block on some other unrelated event while the event loop is doing I/O. Pseudocode:

    let port = IOPort::connect(); // Assume we're doing I/O reads using something portlike
    while port.recv() {
        // Block on a different port, while uv continues doing I/O on our behalf
        let intermediate_value = some_other_port.recv();
    }

This is why I'm imagining that the scheduler will sometimes need to buffer. > > (Indeed, on an appropriately-behaving system I fully expect task=thread > and sync IO calls=system calls) > >> I welcome any help here. One important and big step we need to get >> through before trying to integrate uv into the scheduler is to create >> safe Rust bindings to libuv. Last time around we coded directly to the >> uv API and it resulted in some big maintenance problems (unsafe code >> everywhere). pfox is working on updating libuv to upstream trunk now, >> after which I expect somebody will start on the bindings.
If you want to >> discuss the integration of uv into the scheduler there is an issue open: >> https://github.com/mozilla/rust/issues/4419. > I'll try to remain more-involved this time. We have to get this right. > > -Graydon > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From mneumann at ntecs.de Tue Jan 29 04:47:35 2013 From: mneumann at ntecs.de (Michael Neumann) Date: Tue, 29 Jan 2013 13:47:35 +0100 Subject: [rust-dev] Misc questions In-Reply-To: <51072D7D.8090609@mozilla.com> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> <5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de> <5103C593.4080904@ntecs.de> <5103C6EC.5050500@ntecs.de> <5106F19D.8010802@mozilla.com> <51070BA2.6000608@ntecs.de> <51071E46.5080709@mozilla.com> <510725EA.2090907@mozilla.com> <51072D7D.8090609@mozilla.com> Message-ID: <5107C4E7.70609@ntecs.de> Am 29.01.2013 03:01, schrieb Brian Anderson: > On 01/28/2013 05:29 PM, Graydon Hoare wrote: >> On 13-01-28 04:56 PM, Brian Anderson wrote: >> >>>> I think libuv is doing too much here. For example, if I don't want to >>>> remove the socket from the event >>>> queue, just disable the callback, then this is not possible. I'd >>>> prefer when I could just tell libuv that >>>> I am interested in event X (on Windows: I/O completion, on UNIX: I/O >>>> availability). >> Yet the optimization you suggest has to do with recycling the buffer, >> not listening for one kind of event vs. another. >> >> In general I'm not interested in trying to "get underneath" the >> abstraction uv is providing. It's providing an IOCP-oriented interface, >> I would like to code to that and make the rust IO library not have to >> worry when it's on windows vs. unix. That's the point of the abstraction >> uv provides, and it's valuable. 
If it means bouncing off epoll a few too >> many times (or reallocating a buffer a few too many times), I'm not too >> concerned. Those should both be O(1) operations. >> >>> Is it possible to do this optimization later or do we need to plan for >>> this ahead of time? I would prefer to use the uv API as it's presented >>> to start with. >> The optimization to use a caller-provided buffer should (a) not be >> necessary to get us started and (b) be equally possible on either >> platform, unix or windows, _so long as_ we're actually sleeping a task >> during its period of interest in IO (either the pre-readiness sleep or a >> post-issue, pre-completion sleep). In other words, if we're simulating >> sync IO, then we can use a task-local buffer. If we're _not_ simulating >> sync IO (I sure hope we do!) then we should let uv allocate and free >> dynamic buffers as it needs them. >> >> But I really hope we wind up structuring it so it simulates sync IO. >> We're providing a task abstraction. Users _want_ the sync IO abstraction >> the same way they want the sequential control flow abstraction. > > Presenting the scheduler-originating I/O as synchronous is what I > intend. I am not sure that we can guarantee that a task is actually > waiting for I/O when an I/O event occurs that that task is waiting > for. A task may block on some other unrelated event while the event > loop is doing I/O. Pseudocode: > > let port = IOPort::connect(); // Assume we're doing I/O reads using > something portlike > while port.recv() { > // Block on a different port, while uv continues doing I/O on our > behalf > let intermediate_value = some_other_port.recv(); > } > > This is why I'm imagining that the scheduler will sometimes need to > buffer. I don't think so. Let me explain. This anyway is only a problem (which can be solved) iff we want to be able to treat I/O like a port and want to wait for either one to resume our thread. 
And I assume we want this, so that we can listen on an I/O socket AND, for example, for incoming messages at the same time. The kernel provides a way to do (task-local) blocking I/O operations. There is no way for the task to return from a read() call unless data comes in or in case of EOF (or any other error condition). This behaves basically like a blocking POSIX read() call, just that it is converted into an asynchronous read by libuv under the hood. To expose I/O as a port, we have to start a new task:

    let fd = open(...);
    let (po, ch) = streams::pipe();
    do task::spawn {
        loop {
            let buf: ~[u8] = vec::from_fn(1000, || 0);
            let nread = fd.read(buf, 1000);
            if nread > 0 { ch.send(Data(buf)) }
            else if nread == 0 { ch.send(EOF) }
            else { ch.send(Error) }
        }
    }
    // now we can treat `po` as a Port and call select() on it

But I don't think channel I/O will be used that often. Note that one big advantage is that we can specify the buffer size ourselves! If we let libuv create a buffer for us, how would it know the buffer size? The alloc_cb you provide to libuv upon uv_read_start() will get a suggested_size parameter passed, but this is 64k by default, and libuv cannot know what kind of I/O protocol you are handling. When I do line-oriented I/O, I do not need a full 64k buffer allocated for every read, which in the worst case would return only one byte in the case of a very slow sender (sending one byte each second). Nor is 64k necessarily enough for receiving a very large packet. We clearly want a way to tell the I/O system how large we expect the packet to be that will arrive over I/O, otherwise this is completely useless IMHO. We would still have one separate iotask per scheduler. This is a native thread and runs the I/O loop. There is no way to do that inside the scheduler, as we would block any task while waiting for I/O. The callbacks like on_read_cb would simply notify the scheduler that the task that was responsible for doing this read operation can now resume.
As the scheduler lives in another thread (the thread in which all tasks of that scheduler live) and might be active, we need to do some locking here. The next time the scheduler gets activated, either by a task issuing a blocking I/O operation, giving up via task::yield, waiting for a message on a port, or blocking on a send, the scheduler can decide which task to schedule next and consider those for which I/O has arrived as well. One thing to consider is that we'd need a way to return the number of bytes written to the buffer to the task that called read(). We should store this in the same manner as the pointer to the buffer and the buffer_size in the stream_t handle. This is safe, as one I/O object is always exclusively used by one task. We can call this field last_nread, for example, and when the scheduler reactivates a task blocked on a read I/O, we would simply return this field as the number of bytes read. In short:

* A task can block on exactly *one* I/O object, or on a channel/port.
* Each I/O object belongs exclusively to one task.
* I/O and Port/Chan are two different things.
* I/O is "lower" than Port/Chan, but can easily be wrapped into a Port/Chan abstraction (see code above).
* When a task blocks on an I/O event, it blocks until this I/O event arrives.
* A task can only ever block on *one* I/O event.
* For channel I/O (I/O over Port/Chan) a separate task is needed for each connection object.
* We have one iotask per scheduler.

Actually what we are doing is reversing what the core provides us by default. Right now, blocking I/O (read(), read_line()) is provided over a Chan/Port system. In the future, blocking I/O will be provided by the scheduler, and it is very easy to build Chan/Port on top of this. And I forgot, the scheduler of course also has to talk to the iotask.
For example, when a task blocks on I/O, the scheduler has to notify the iotask somehow that it should update its event list (call uv_read_start(), for example), which it would integrate upon the next event loop iteration. Note that while the iotask -> scheduler communication can be done using a simple mutex, as the scheduler only runs for a very short period of time, the reverse communication from scheduler -> iotask must be non-blocking, as the iotask might sleep for quite some time. Actually, we would need a way to wake up the event loop from the scheduler. Because imagine there are two tasks. One is blocked for I/O. This means that the iotask is blocked on I/O. Now when the second task also wants to do I/O, it cannot register its event in the iotask, as this is still blocked in epoll/whatever. If no I/O arrives for the first task, the second would be blocked forever, waiting to register its interest in I/O. Luckily libuv provides a way to wake up the event loop from another thread (in our case the scheduler thread). This is uv_async_init(), uv_async_send().

    struct iotask {
        uv_loop_t *evloop;
        uv_async_t *async;
        list *pending_event_registrations;
    };

    struct iohandle {
        uv_stream_t stream;
        void *buffer;
        size_t buffer_size;
        ssize_t last_nread;
        task *owning_task;
        scheduler *sched;
    };

    uv_buf_t alloc_cb(io, _suggested_size) {
        return uv_buf_init(io->buffer, io->buffer_size);
    }

    on_read_cb(io, ssize_t nread, int status) {
        uv_read_stop(io);
        io->last_nread = nread;
        scheduler_notify(io);
    }

    async_cb() {
        mutex_lock_on(iot->pending_event_registrations_list);
        foreach e in iot->pending_event_registration_list {
            match e {
                DoRead(iohandle) -> uv_read_start(iohandle, alloc_cb, on_read_cb),
                ...
            }
        mutex_unlock(iot->pending_event_registrations_list);
    }

    iotask_start(iotask *iot) {
        uv_async_init(iot->evloop, iot->async, async_cb);
        uv_run(iot->evloop, UV_RUN_DEFAULT); // this will run forever
    }

    scheduler_notify(io) {
        mutex_lock(io->sched);
        io->sched->unblock_task(io->task);
        mutex_unlock(io->sched);
    }

    scheduler_read(io) {
        mutex_lock_on(io->iotask->pending_event_registrations_list);
        io->iotask->pending_registration_list.append( DoRead(io) );
        mutex_unlock(io->iotask->pending_event_registrations_list);
        uv_async_send(iotask->async);
        // now block calling task and switch to other task
    }

    task() {
        io.read(buf, 1000);
        // this will do the following:
        //   io.buffer = buf;
        //   io.buffer_size = 1000;
        //   scheduler_read(io);
    }

Seems to be similar in concept to the code we have right now, just that we would not use channels for communication with the iotask, and that the uv_read_stop is done in the iotask itself, i.e. no need for further communication. Also there are no context switches, because scheduler and iotask are separate tasks. The only downside is that every read() needs to wake the iotask. This might be an issue! If instead we would use Chan I/O:

* We allocate a buffer for each read(). We only need a way to tell the system a per-allocation buffer size, *and* a "total number of bytes" value used before blocking (blocking would mean here calling uv_read_stop()); then this is OK. The length of the queue (channels are buffered) would suffice here, as the total number of bytes before blocking would then simply be length_of_channel_buffer * per_alloc_buffer_size.
* The on_read_cb would then simply acquire the lock for the associated channel and append the buffer to it.
* Only when the channel is blocked do we have to call uv_read_stop(). And once it is unblocked (someone recv()ed from it) we have to notice this and call uv_read_start() again.

Hm, now that I think about it, forget everything I said in the beginning.
I start to like this channel-based approach more, and I think it will perform much better, because the iotask is not interrupted anymore (no more uv_read_stop). And it seems to be simpler to implement, given that it is easy (and fast) to put something on a channel from the iotask (which is in C or low-level Rust). Can I specify a size for a Channel, so it works like a SizedQueue? I guess this would be important to limit DoS attacks, otherwise someone could flood our channel from the outside. In case we want to read exactly n bytes from an I/O handle, we could also special-case it easily, so that it reuses an allocated buffer until it contains n bytes (or EOF) before pushing it to the channel. This would be quite easy to do, but we should not take this special case into account right now. Another thing we could do to optimize certain cases is to explicitly specify an allocation method on the I/O handle which is responsible for buffer allocation. Using I/O channels this will be a very cool system! It will be as fast as pure libuv minus a) allocation overhead (vs. hand-crafted code) and b) task switching (vs. simple callbacks). While a) is negligible with a good memory allocator (with a GC this would be a problem!), b) is the cost we pay for easier coding and is exactly what we want! This would be the "pseudo" code for the Channel I/O. Note that there would be special code in the Port/Chan code that invokes uv_read_start() again once a channel turns from full to not-full. I am assuming a SizedQueue here.

    struct iotask {
        uv_loop_t *evloop;
        uv_async_t *async;
        list *pending_event_registrations;
    };

    struct iohandle {
        uv_stream_t stream;
        size_t buffer_size;
        chan *read_chan;
    };

    uv_buf_t alloc_cb(io, _suggested_size) {
        return uv_buf_init(malloc(io->buffer_size), io->buffer_size);
    }

    on_read_cb(io, uv_buf_t buf, ssize_t nread, int status) {
        lock(io->read_chan);
        // this will never block! NOTE that the iotask should be the only
        // one allowed to send to the read_chan (i.e. single writer!)
    io->read_chan->push(buf);
    if io->read_chan->full() {
        // if the channel is full, we have to tell libuv to stop reading.
        // whoever consumes from the full channel then needs to call
        // scheduler_start_read(io) again.
        uv_read_stop(io);
    }
    unlock(io->read_chan);
}

async_cb() {
    mutex_lock_on(iot->pending_event_registrations_list);
    foreach e in iot->pending_event_registration_list {
        match e {
            StartRead(iohandle) -> uv_read_start(iohandle, alloc_cb, on_read_cb),
            ...
        }
    }
    mutex_unlock(iot->pending_event_registrations_list);
}

iotask_start(iotask *iot) {
    uv_async_init(iot->evloop, iot->async, async_cb);
    uv_run(iot->evloop, UV_RUN_DEFAULT);  // this will run forever
}

scheduler_start_read(io) {
    mutex_lock_on(io->iotask->pending_event_registrations_list);
    io->iotask->pending_registration_list.append( StartRead(io) );
    mutex_unlock(io->iotask->pending_event_registrations_list);
    uv_async_send(iotask->async);
    // return to calling task!!!
}

task() {
    io = SocketOpen(buffer_size: 1000);
    // this will call scheduler_start_read(io)
    // io should actually contain internally two channels/ports, one pair
    // for each direction
    while io.recv() {
        ...
    }
}

Regards,

Michael

From mneumann at ntecs.de  Tue Jan 29 06:26:34 2013
From: mneumann at ntecs.de (Michael Neumann)
Date: Tue, 29 Jan 2013 15:26:34 +0100
Subject: [rust-dev] Misc questions
In-Reply-To: <510725EA.2090907@mozilla.com>
References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com>
	<5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de>
	<5103C593.4080904@ntecs.de> <5103C6EC.5050500@ntecs.de>
	<5106F19D.8010802@mozilla.com> <51070BA2.6000608@ntecs.de>
	<51071E46.5080709@mozilla.com> <510725EA.2090907@mozilla.com>
Message-ID: <5107DC1A.9000803@ntecs.de>

Am 29.01.2013 02:29, schrieb Graydon Hoare:
> On 13-01-28 04:56 PM, Brian Anderson wrote:
>
>>> I think libuv is doing too much here. For example, if I don't want to
>>> remove the socket from the event
I'd >>> prefer when I could just tell libuv that >>> I am interested in event X (on Windows: I/O completion, on UNIX: I/O >>> availability). > Yet the optimization you suggest has to do with recycling the buffer, > not listening for one kind of event vs. another. > > In general I'm not interested in trying to "get underneath" the > abstraction uv is providing. It's providing an IOCP-oriented interface, > I would like to code to that and make the rust IO library not have to > worry when it's on windows vs. unix. That's the point of the abstraction > uv provides, and it's valuable. If it means bouncing off epoll a few too > many times (or reallocating a buffer a few too many times), I'm not too > concerned. Those should both be O(1) operations. I think allocating a buffer performs much better than waking up the event loop for every read, because waking up the event loop involves kernel activity on both scheduler and iotask, while malloc should in most cases be pure user-level. And allocating buffers allows us to asynchronously continue reading while the task is still doing some computations. If we would allow multiple readers on a single port (do we? I think channels in Go allow that), then we would even have a very simple way to load balance I/O to multiple tasks. This could actually make sense in many scenarios. Kind of work-stealing. And we could build arbitrary pipelines. Of course we can simulate the same by using a dispatcher task, but this would incur some overhead. >> Is it possible to do this optimization later or do we need to plan for >> this ahead of time? I would prefer to use the uv API as it's presented >> to start with. > The optimization to use a caller-provided buffer should (a) not be > necessary to get us started and (b) be equally possible on either > platform, unix or windows, _so long as_ we're actually sleeping a task > during its period of interest in IO (either the pre-readiness sleep or a > post-issue, pre-completion sleep). 
In other words, if we're simulating
> sync IO, then we can use a task-local buffer. If we're _not_ simulating
> sync IO (I sure hope we do!) then we should let uv allocate and free
> dynamic buffers as it needs them.

We are kind of simulating sync IO by using a channel. But IO would be async in the background (if we do not want to wake up the event loop for every read), so we would need buffers.

> But I really hope we wind up structuring it so it simulates sync IO.
> We're providing a task abstraction. Users _want_ the sync IO abstraction
> the same way they want the sequential control flow abstraction.

Yes. I (now) fully agree.

> (Indeed, on an appropriately-behaving system I fully expect task=thread
> and sync IO calls=system calls)

If using a "SizedChannel(1)", each io.recv would correspond to one read() syscall, except that the syscall could have happened long ago. Channels with longer queues would mean that up to "n" (size of queue) read() calls could have happened.

Regards,

Michael

From mneumann at ntecs.de  Tue Jan 29 06:47:58 2013
From: mneumann at ntecs.de (Michael Neumann)
Date: Tue, 29 Jan 2013 15:47:58 +0100
Subject: [rust-dev] Misc questions
In-Reply-To: <51071E46.5080709@mozilla.com>
References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com>
	<5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de>
	<5103C593.4080904@ntecs.de> <5103C6EC.5050500@ntecs.de>
	<5106F19D.8010802@mozilla.com> <51070BA2.6000608@ntecs.de>
	<51071E46.5080709@mozilla.com>
Message-ID: <5107E11E.6000109@ntecs.de>

Am 29.01.2013 01:56, schrieb Brian Anderson:
> On 01/28/2013 03:37 PM, Michael Neumann wrote:
>> Am 28.01.2013 22:46, schrieb Brian Anderson:
>>> On 01/26/2013 04:07 AM, Michael Neumann wrote:
>>>> Am 26.01.2013 13:01, schrieb Michael Neumann:
>>>>> Am 26.01.2013 12:28, schrieb Michael Neumann:
>>>>>> Another question: When a task sends a message to another task,
>>>>>> and this task is waiting exactly for this event, will it directly
>>>>>> switch to that task, or
will it buffer the message? >>>>>> Sometimes this could be quite handy and efficient. I rember this >>>>>> was done in the L4 microkernel (www.l4ka.org), which only allowed >>>>>> synchronous IPC. It could make sense to provide a >>>>>> send_and_receive directive, which sends to the channel and lets >>>>>> the scheduler know that it is now waiting for a message to >>>>>> receive from another port. So send_and_receive could >>>>>> directly switch to the other task, and when this does a send back >>>>>> to the calling task, it will switch back to it. If you don't have >>>>>> send_and_receive as atomic operation, there >>>>>> is no way to switch back to the other task, as it might still be >>>>>> running. >>>>> >>>>> "as it might still be running" is here of course wrong (as we >>>>> switched to another thread). What I wanted to say is, that it is >>>>> not waiting for any event, so it is not in a blocking state, so that >>>>> we cannot directly switch back (matching the recv() and the send()). >>>>> >>>>> Ideally the task that wants to read would do the non-blocking I/O >>>>> itself, and the scheduler would just notify when it can "read". >>>>> But I think this is not possible with libuv as you >>>>> have no control over when to read (except using uv_read_start() / >>>>> _stop). I think this would be much more efficient and even more >>>>> powerful (one can read directly into a buffer... >>>>> there is no need to allocate a new buffer for each read as done by >>>>> libuv). So what I would suggest is the following: >>>>> >>>>> // task >>>>> blocking_read(socket, buffer, ...) >>>>> // this will register socket with the schedulers event queue >>>>> (if not yet done) and block. >>>>> // once the scheduler will receive an "data is available" >>>>> event from the kernel >>>>> // it will unblock the task. >>>>> // then the task will do an non-blocking read() on it's own. >>> >>> I'm not that familiar with the uv API. 
Is there a distinct 'data >>> available' event that happens before we start reading? I've been >>> assuming that, as you say, we have to control over when the read >>> events happen, so we would need to check whether the task initiating >>> this read was currently waiting for data, and either buffer it or >>> context switch to the task depending on its state. >> >> No there isn't! The reason why, as far as I understand it, lies in >> the way Windows handles reads. In UNIX you get notified, when you can >> read, while in Windows, >> you get notified when a read completed, so you are basically doing >> the read asynchronously in the background (saving you another context >> switch to the kernel). >> I think this is called Proactor (the UNIX-way is called Reactor). >> libuv wants to do this in a platform-independent way, where the >> programmer who uses libuv does >> not have to care about which platform he is working with. >> >> So when we think about this sequence in libuv >> >> uv_read_start(fd) >> -> on_read_cb gets triggered >> uv_read_stop(fd) >> >> what it does internally is the following: >> >> UNIX: >> >> register event for `fd` in event queue >> epoll() >> -> allocate buffer >> -> read(fd, "nonblocking") >> -> call on_read_cb >> unregister event >> >> Windows: >> >> allocate buffer >> start asynchronous read request >> wait for completion (of any outstanding I/O) >> -> call on_read_cb >> >> I think libuv is doing too much here. For example, if I don't want to >> remove the socket from the event >> queue, just disable the callback, then this is not possible. I'd >> prefer when I could just tell libuv that >> I am interested in event X (on Windows: I/O completion, on UNIX: I/O >> availability). 
>> >> I think a simple hack would be to store the buffer address and size >> of buffer in the uv_handle_t structure: >> >> struct our_handle { >> uv_handle_t handle; >> void *buffer; >> size_t buffer_size; >> } >> >> and then have the alloc_cb return that: >> >> static uv_buf_t alloc_cb(uv_handle_t *handle, size_t suggested_size) >> { >> struct our_handle *h = (struct our_handle*)handle; >> return uv_buf_init(h->buffer, h->buffer_size); >> } >> >> You specify the alloc_cb in uv_read_start(). The only thing that you >> need to consider >> is that when on_read_cb gets called, you better call uv_read_stop(), >> otherwise >> the buffer could be overwritten the next time. >> >> Well, yes, this should work for both UNIX and Windows. If you need >> specific help, let me know. >> I've been hacking a lot with libuv lately and I can't wait using >> async I/O in rust (which actually >> performs well). >> > > Is it possible to do this optimization later or do we need to plan for > this ahead of time? I would prefer to use the uv API as it's presented > to start with. > > I welcome any help here. One important and big step we need to get > through before trying to integrate uv into the scheduler is to create > safe Rust bindings to libuv. Last time around we coded directly to the > uv API and it resulted in some big maintenance problems (unsafe code > everywhere). pfox is working on updating libuv to upstream trunk now, > after which I expect somebody will start on the bindings. If you want > to discuss the integration of uv into the scheduler there is an issue > open: https://github.com/mozilla/rust/issues/4419. Hm, there would be really little integration between scheduler and iotask. The iotask only needs to know about channels. And there is this "async wakeup" call between a task and the iotask. For performance reasons, I would suggest writing the iotask directly in C++. Having each registered callback call back into Rust would probably be a pain. 
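The design where "the iotask only needs to know about channels" can be sketched concretely. This is a modern-Rust sketch (std::thread and std::sync::mpsc, not 0.5-era APIs), and the `Event` variants are invented for illustration: the native event-loop thread never calls back into task code, it only converts each callback into a message.

```rust
use std::sync::mpsc::channel;
use std::thread;

// Hypothetical event types the iotask would emit; stand-ins for
// libuv's on_connect / on_read / on_close callbacks.
#[derive(Debug, PartialEq)]
enum Event {
    Connected,
    Read(Vec<u8>),
    Closed,
}

fn run_iotask_demo() -> Vec<Event> {
    let (tx, rx) = channel();
    // The spawned thread plays the role of the iotask: callbacks fire
    // inside it, and each one is turned into exactly one message.
    let iotask = thread::spawn(move || {
        tx.send(Event::Connected).unwrap();
        tx.send(Event::Read(b"data".to_vec())).unwrap();
        tx.send(Event::Closed).unwrap();
        // tx is dropped here, which ends the receiver's iterator.
    });
    // The task side only ever blocks on recv(), never on a callback.
    let events: Vec<Event> = rx.iter().collect();
    iotask.join().unwrap();
    events
}
```

Because the callbacks run entirely inside the iotask thread and only sends cross the boundary, nothing in task-land can accidentally block the event loop.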
Also, a lot of the functionality of libuv will not need to be accessible from Rust. We only need to be able to open/connect a socket/whatever. Every uv call that needs a callback would be associated with a channel, for example uv_listen(). It's important that all callbacks are performed inside the iotask. We don't want to expose them to the scheduler thread or any task, as otherwise this would block everything (callbacks may never block). The "task" of the iotask is to convert every callback into a message. That's all. For uv_tcp_connect() we'd need a single-shot channel, which fires exactly once.

Regards,

Michael

From deansherthompson at gmail.com  Tue Jan 29 09:26:21 2013
From: deansherthompson at gmail.com (Dean Thompson)
Date: Tue, 29 Jan 2013 09:26:21 -0800
Subject: [rust-dev] "intimidation factor" vs target audience
In-Reply-To: <510455C7.90104@alum.mit.edu>
Message-ID: 

It just occurred to me that the 'lt syntax for lifetimes does (I think) provide an unambiguous way to keep the pointed-to type right next to the & sigil:

    'lt&T

Here, the usual & sigil becomes 'lt&, which taken together is a prefix on the type T. In this notation, there is still a clear distinction between Niko's examples below, which become

    'b&T
    &'b&T

I'm not sure whether I like it better or worse than the prevailing alternative

    &'b&T   (with the opposite meaning!)
    &&'b T

I believe it does address James Boyden's goals (below) 2a and 2b, although not 2c, and this issue is orthogonal to 2d. Since I resonate with 2a and 2b myself, I lean toward thinking I like this alternative approach a little better.

Dean

James Boyden had written:
> 
> For the above reasons, is there any way that the lifetime syntax
> could be moved *after* the type? The current proposal of `&'lt Foo`
> does address the ambiguity described in point 2b, but not 2a or 2c.
> (The `/` sigil itself actually doesn't faze me that much.)
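For readers following this syntax debate: the form Rust ultimately shipped is the "prevailing alternative" above, with the tick-prefixed lifetime sitting between `&` and the pointed-to type. A small illustration in modern Rust (the function names here are arbitrary):

```rust
// The adopted syntax: the lifetime name follows `&` and precedes the type.
fn first<'a>(v: &'a Vec<i32>) -> &'a i32 {
    &v[0]
}

// The nested case from the examples above: `&'b &i32` annotates the
// *outer* reference with 'b; the inner reference's lifetime is elided.
fn outer<'b>(r: &'b &i32) -> &'b i32 {
    *r
}
```

So where this thread writes `&'b&T` and `&&'b T`, modern code reads them as "outer reference lives for 'b" and "inner reference lives for 'b" respectively.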
Niko Matsakis had written: > We considered this, but it's very hard to mix prefix and postfix notation like > this without ambiguity. Consider: > &&T/b > Is this: > &'b &T > or > &&'b T > > I suppose we could just resolve it arbitrarily, like if-else-if chains. More completely (for reference), James Boyden had written: > For what it's worth, here's my 2 cents: > > 1. Borrowed pointer lifetimes are a very interesting concept. I think > they definitely enrich Rust. However... > > 2. The current syntax of `&lifetime/type` completely threw me when > I first saw it in the wild (before reading up on the topic in the > tutorials). I'm concerned that this could be something that might > push Rust slightly too far towards "obscure academic theory language" > for many C++ programmers. Specifically: > > 2a. I'm used to seeing the pointed-to type right next to the pointer. > I can mentally reverse C++'s `int &p` to `p: &int` without problems, > but having extra line-noise in there seems to stretch my on-the-fly > language-parsing capability beyond its limits -- particularly so when > the identifiers both before and after the `/` look like types, and > there's no whitespace breaks to guide my mental tokeniser. > > 2b. I found the `p: &r/float` syntax especially confusing in that > sometimes, without any prior or intervening warning, the next token > after the borrow-pointer sigil `&` was a lifetime instead of a type > (but of course, not all the time). When you're mentally parsing it, > you don't get any explanation of what you're reading until *after* > you've read those tokens. > > 2c. I care more about the pointed-to type than the lifetime (at least > on my first scan through the function parameters), so I'd strongly > prefer to have the pointed-to type closer to the front, and closer to > the parameter name and pointer sigil (i.e., before the lifetime name). > This has the additional benefit that now all the parameter type info > is together, followed by the lifetime info. 
> > 2d. `&` means "pointers" or (occasionally) bitwise-AND. Please don't > use it for any other plumbing! (I'm thinking of "Lifetime parameter > designated with &" in the previous notation.) > > > For the above reasons, is there any way that the lifetime syntax > could be moved *after* the type? The current proposal of `&'lt Foo` > does address the ambiguity described in point 2b, but not 2a or 2c. > (The `/` sigil itself actually doesn't faze me that much.) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From banderson at mozilla.com Tue Jan 29 15:43:41 2013 From: banderson at mozilla.com (Brian Anderson) Date: Tue, 29 Jan 2013 15:43:41 -0800 Subject: [rust-dev] Misc questions In-Reply-To: <5107C4E7.70609@ntecs.de> References: <5101C9FD.7030204@ntecs.de> <5101D3E6.7070902@mozilla.com> <5101F572.3090202@mozilla.com> <5103BDD3.9000401@ntecs.de> <5103C593.4080904@ntecs.de> <5103C6EC.5050500@ntecs.de> <5106F19D.8010802@mozilla.com> <51070BA2.6000608@ntecs.de> <51071E46.5080709@mozilla.com> <510725EA.2090907@mozilla.com> <51072D7D.8090609@mozilla.com> <5107C4E7.70609@ntecs.de> Message-ID: <51085EAD.5090204@mozilla.com> On 01/29/2013 04:47 AM, Michael Neumann wrote: > Am 29.01.2013 03:01, schrieb Brian Anderson: >> On 01/28/2013 05:29 PM, Graydon Hoare wrote: >>> On 13-01-28 04:56 PM, Brian Anderson wrote: >>> >>>>> I think libuv is doing too much here. For example, if I don't want to >>>>> remove the socket from the event >>>>> queue, just disable the callback, then this is not possible. I'd >>>>> prefer when I could just tell libuv that >>>>> I am interested in event X (on Windows: I/O completion, on UNIX: I/O >>>>> availability). >>> Yet the optimization you suggest has to do with recycling the buffer, >>> not listening for one kind of event vs. another. >>> >>> In general I'm not interested in trying to "get underneath" the >>> abstraction uv is providing. 
It's providing an IOCP-oriented interface, >>> I would like to code to that and make the rust IO library not have to >>> worry when it's on windows vs. unix. That's the point of the >>> abstraction >>> uv provides, and it's valuable. If it means bouncing off epoll a few >>> too >>> many times (or reallocating a buffer a few too many times), I'm not too >>> concerned. Those should both be O(1) operations. >>> >>>> Is it possible to do this optimization later or do we need to plan for >>>> this ahead of time? I would prefer to use the uv API as it's presented >>>> to start with. >>> The optimization to use a caller-provided buffer should (a) not be >>> necessary to get us started and (b) be equally possible on either >>> platform, unix or windows, _so long as_ we're actually sleeping a task >>> during its period of interest in IO (either the pre-readiness sleep >>> or a >>> post-issue, pre-completion sleep). In other words, if we're simulating >>> sync IO, then we can use a task-local buffer. If we're _not_ simulating >>> sync IO (I sure hope we do!) then we should let uv allocate and free >>> dynamic buffers as it needs them. >>> >>> But I really hope we wind up structuring it so it simulates sync IO. >>> We're providing a task abstraction. Users _want_ the sync IO >>> abstraction >>> the same way they want the sequential control flow abstraction. >> >> Presenting the scheduler-originating I/O as synchronous is what I >> intend. I am not sure that we can guarantee that a task is actually >> waiting for I/O when an I/O event occurs that that task is waiting >> for. A task may block on some other unrelated event while the event >> loop is doing I/O. 
Pseudocode: >> >> let port = IOPort::connect(); // Assume we're doing I/O reads using >> something portlike >> while port.recv() { >> // Block on a different port, while uv continues doing I/O on our >> behalf >> let intermediate_value = some_other_port.recv(); >> } >> >> This is why I'm imagining that the scheduler will sometimes need to >> buffer. > > I don't think so. Let me explain. > > This anyway is only a problem (which can be solved) iff we want to be > able to treat I/O like a > port and want to wait for either one to resume our thread. And I > assume we want this, so > that we can listen on an I/O socket AND for example for incoming > messages at the same time. > > The kernel provides a way to do (task-local) blocking I/O operations. > There is no way for the > task to return from a read() call unless data comes in or in case of > EOF (or any other error > condition).This behaves basically like a blocking POSIX read() call, > just that it is converted > into asynchronous read by libuv under the hood. To expose I/O as port, > we have to start > a new task: > > let fd = open(...); > let (po, ch) = streams::pipe(); > do task::spawn { > loop { > let buf: ~[u8] = vec::from_fn(1000, || 0); > let nread = fd.read(buf, 1000); > if nread > 0 { > ch.send(Data(buf)) > } > else if nread == 0 { > ch.send(EOF) > } > else { > ch.send(Error) > } > } > } Yes, a single call to 'read' will not return until some I/O arrives, but after 'read' returns I/O continues to arrive and that I/O needs to be stored somewhere if the task doesn't immediately block in another call to 'read' on that same fd. Taking the above example: loop { // This will block until data arrives at which point the task will be context-switched in and the data returned. let nread = fd.read(buf, 1000); // This will put the task to sleep waiting on a message on cmd_port let command = cmd_port.recv(); } Until data arrives on cmd_port the task cannot be scheduled. 
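Michael's port-wrapping pattern quoted above translates almost mechanically to modern Rust. This is a hedged sketch using std::thread and std::sync::mpsc in place of 0.5's task::spawn and streams::pipe; the `IoEvent` variants mirror his Data/EOF/Error, and any blocking `Read` stands in for the fd:

```rust
use std::io::Read;
use std::sync::mpsc::{channel, Receiver};
use std::thread;

#[derive(Debug, PartialEq)]
enum IoEvent {
    Data(Vec<u8>),
    Eof,
}

// Spawn a task that performs blocking reads in a loop and forwards each
// result over a channel, turning the blocking reader into a port.
fn reader_port<R: Read + Send + 'static>(mut r: R) -> Receiver<IoEvent> {
    let (tx, rx) = channel();
    thread::spawn(move || loop {
        let mut buf = vec![0u8; 1000];
        match r.read(&mut buf) {
            Ok(0) => {
                // EOF, as in the `nread == 0` branch of the 2013 code.
                let _ = tx.send(IoEvent::Eof);
                break;
            }
            Ok(n) => {
                buf.truncate(n);
                if tx.send(IoEvent::Data(buf)).is_err() {
                    break; // receiver gone, stop reading
                }
            }
            Err(_) => break, // the 2013 code sent an Error variant here
        }
    });
    rx
}
```

The returned `Receiver` is the `po` of the original snippet: the caller can now block on it, or combine it with other ports, without touching the underlying reader.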
While the task is asleep the I/O loop can't be blocked since other tasks are using it too. So in the meantime uv continues to receive data from the open fd and it needs to live somewhere until the task calls 'read' again on the same fd. Perhaps there's something I don't understand about the uv API here, but I think that once we start reading uv is going to continually provide us with data whether we are ready for it or not. > > // now we can treat `po` as a Port and call select() on it > > > But I don't think channel I/O will be used that often. > > Note that one big advantage is that we can specify the buffer size > ourself! > When we would let libuv create a buffer for us, how would it know the > buffer size? The alloc_cb you provide to libuv upon uv_start_read() > will get > a suggested_size parameter passed, but this is 64k by default, and libuv > cannot know what kind of I/O protocol you are handling. When I do > line oriented I/O, I would not need a full 64k buffer allocated for every > read, which in the worst case would only return one byte in it in case > of a very slow sender (send one byte each second). Or is 64k enough > for receiving a very large packet. We clearly want a way to tell the I/O > system how large we expect the packet to be that will arrive over I/O > otherwise this is completely useless IMHO. I am not sure how to do this with the forementioned issues - when we receive the alloc_cb the task may not have been able to communicate to the event loop the size of the next buffer. Imagine this sequence of events: * task issues fd.read(buf, 1000); * alloc_cb arrives. great, 'read' already told us the size of the next buffer. * task wakes up and starts handling data * task goes to sleep for some other reason * alloc_cb arrives. how big is the buffer supposed to be? * task wakes up and issues fd.read(buf, 1000). What can we do with '1000'? 
we already missed the underlying read event

> We would still have one separate iotask per scheduler. This is a native
> thread and runs the I/O loop. There is no way to do that inside the
> scheduler as we would block any task while waiting for I/O.

I have something different in mind. There is no iotask. The scheduler is the event loop and I/O callbacks are interleaved with running tasks. There will be no thread synchronization required to pass data from the event loop to a task - only context switches to schedule and deschedule the task. I don't anticipate problems with blocking the scheduler - when an external event requires the scheduler to wake up it will create an async_cb to run some scheduler code.

> The callbacks like on_read_cb would simply notify the scheduler
> that the task that was responsible for doing this read operation
> can now resume. As the scheduler lives in another thread
> (the thread in which all tasks of that scheduler live in)
> and might be active, we need to do some locking here.
> When the scheduler gets activated next time, either by
> issuing a blocking I/O operation, giving up by using task::yield
> or by waiting for a message on a port, or when sending a message
> blocks, the scheduler can decide which task to schedule next
> and consider those for which I/O has arrived as well.
>
> One thing to consider is that we'd need a way to return the number
> of bytes written to the buffer to the calling task of read().
> We should store this in the same manner as the pointer to the buffer
> and the buffer_size in the stream_t handle. This is safe, as one I/O
> object is always exclusively used by one task.

I am not familiar enough with the mechanics of the uv API to understand this point.
I think though that it assumes that the synchronous code will handle I/O events as they arrive and we will be passing the uv handles from the async code to the sync code (therefore uv won't be overwriting a handle while a Rust task is in possession of it). For the reasons I mentioned before I don't see how this is possible (in the general case) since the scheduler may need to buffer data until it can be acted on by the task.

> We can call this field
> last_nread for example, and when the scheduler reactivates a
> task blocked on a read I/O, we would simply return this field as number
> of read bytes.
>
> In short:
>
> * A task can either block on exactly *one* I/O object
>   or on a channel/port.
> * Each I/O object belongs exclusively to one task
> * I/O and Port/Chan are two different things
> * I/O is "lower" than Port/Chan, but can be easily
>   wrapped into a Port/Chan abstraction (see code above)
> * When a task blocks on an I/O event, it blocks until
>   this I/O event arrives.
> * A task can only ever block on *one* I/O event
> * For Channel I/O (I/O over Port/Chan) a separate task
>   is needed for each connection object.
> * We have one iotask per scheduler.

I mostly agree with these points but some are at odds with my previous statements. Please forgive me for not responding to the remaining points in detail. We can discuss more later.

Regards,
Brian

From mneumann at ntecs.de  Tue Jan 29 16:03:10 2013
From: mneumann at ntecs.de (Michael Neumann)
Date: Wed, 30 Jan 2013 01:03:10 +0100
Subject: [rust-dev] Problem with conflicting implementations for traits
In-Reply-To: 
References: <5102C1EC.80507@ntecs.de>
Message-ID: <5108633E.5080302@ntecs.de>

Am 26.01.2013 19:20, schrieb Steven Blenkinsop:
> You could define an `enum Default<T> = T`

Hm, can you make an example? How will that work? How do I use that?
Michael

> On Friday, 25 January 2013, Michael Neumann wrote:
>
> Hi,
>
> I am getting the following error:
>
> msgpack.rs:545:0: 555:1 error: conflicting implementations for a trait
> msgpack.rs:545 pub impl msgpack.rs:546 K: serialize::Decodable,
> msgpack.rs:547 V: serialize::Decodable>
> ~[(K,V)]: serialize::Decodable {
> msgpack.rs:548     static fn decode(&self, d: &D) -> ~[(K,V)] {
> msgpack.rs:549         do d.read_map |len| {
> msgpack.rs:550             do vec::from_fn(len) |i| {
> ...
> msgpack.rs:539:0: 543:1 note: note conflicting implementation here
> msgpack.rs:539 pub impl T> T: serialize::Decodable {
> msgpack.rs:540     static fn decode(&self, d: &D) -> T {
> msgpack.rs:541         serialize::Decodable::decode(d as &serialize::Decoder)
> msgpack.rs:542     }
> msgpack.rs:543 }
>
> It's obvious that the two trait implementations conflict, as the one (for T) is
> more general than the other (for ~[(K,V)]). Is there anything I can do to fix it?
> I have found this discussion [1] but I see no solution to the problem.
>
> For msgpack, I want to support "maps". They are specially encoded, so I need a
> special Decoder (serialize::Decoder does not support maps in any way). Above I
> tried to extend the serialize::Decoder trait with read_map() and read_map_elt(),
> leading to DecoderWithMap:
>
> pub trait DecoderWithMap : serialize::Decoder {
>     fn read_map<T>(&self, f: fn(uint) -> T) -> T;
>     fn read_map_elt<T>(&self, _idx: uint, f: fn() -> T) -> T;
> }
>
> Then I tried to implement Decodable for ~[(K,V)] (which I want to use as a Rust
> representation of a map; here I'd probably run into problems again, as
> serialize defines a generic implementation for ~[] and for tuples...
> this at least I could solve by using a different type).
>
> Now I would need to reimplement Decodable for any type I use, so I tried
> to use the second generic trait implementation. But this is where it failed
> with a conflict.
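Steven's `enum Default<T> = T` suggestion is 2013 Rust's newtype syntax; the idea is to give the special representation its own nominal type so an impl can target it without colliding with impls on the underlying type. A minimal modern-Rust sketch (the `Decode` trait and `MapRepr` names are invented here, and the wrapper is spelled as a tuple struct, which is how modern Rust writes newtypes):

```rust
// A stand-in for the Decodable trait in the discussion.
trait Decode: Sized {
    fn decode() -> Self;
}

// An ordinary impl on a concrete type.
impl Decode for i32 {
    fn decode() -> i32 {
        0
    }
}

// The newtype: a distinct nominal type for "decoded as a msgpack map",
// so the impl below does not overlap any impl written for Vec<(K, V)>.
struct MapRepr<K, V>(Vec<(K, V)>);

impl<K: Decode, V: Decode> Decode for MapRepr<K, V> {
    fn decode() -> Self {
        // Toy body: a one-entry "map" built from the element decoders.
        MapRepr(vec![(K::decode(), V::decode())])
    }
}
```

The wrapper costs nothing at runtime; its only job is to carry a different type identity for trait resolution.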
>
> Would it be possible to "override" a standard (more generic) trait
> implementation, either implicitly (like C++ does) or explicitly?
>
> Best,
>
> Michael
>
> [1]: https://github.com/mozilla/rust/issues/3429
>
> _______________________________________________
> Rust-dev mailing list
> Rust-dev at mozilla.org
> https://mail.mozilla.org/listinfo/rust-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pshagl007 at gmail.com  Thu Jan 31 00:39:28 2013
From: pshagl007 at gmail.com (piyush agarwal)
Date: Thu, 31 Jan 2013 14:09:28 +0530
Subject: [rust-dev] Hash Table
Message-ID: 

How can we implement a hash table in Rust, or is there any built-in type for it?

-- 
Piyush Agarwal
Please don't print this e-mail unless you really need to!

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From simon.sapin at exyr.org  Thu Jan 31 00:49:33 2013
From: simon.sapin at exyr.org (Simon Sapin)
Date: Thu, 31 Jan 2013 09:49:33 +0100
Subject: [rust-dev] Hash Table
In-Reply-To: 
References: 
Message-ID: <510A301D.1080101@exyr.org>

Le 31/01/2013 09:39, piyush agarwal a écrit :
> How can we implement a hash table in Rust, or is there any built-in type
> for it?

Hi,

Have you looked into the std::map module? If that does not fit your use case, I think the underlying hash is in core::hash.
http://static.rust-lang.org/doc/std/map.html http://static.rust-lang.org/doc/core/hash.html Cheers, -- Simon Sapin From sh4.seo at samsung.com Thu Jan 31 04:54:28 2013 From: sh4.seo at samsung.com (Sanghyeon Seo) Date: Thu, 31 Jan 2013 12:54:28 +0000 (GMT) Subject: [rust-dev] Lifetime notation Message-ID: <2805762.735911359636868523.JavaMail.weblogic@epml02> UtherII on Reddit /r/rust suggested an idea I like: &{'lt} T T{'lt} Basically "option 8" of http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ with ' from https://mail.mozilla.org/pipermail/rust-dev/2013-January/002942.html This does need a lookahead but as far as I can tell unambiguous and manageable. More on: http://www.reddit.com/r/rust/comments/17ka3b/meeting_weekly_20130129_region_syntax_impl_type/c86t7wg From niko at alum.mit.edu Thu Jan 31 05:58:36 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Thu, 31 Jan 2013 05:58:36 -0800 Subject: [rust-dev] Lifetime notation In-Reply-To: <2805762.735911359636868523.JavaMail.weblogic@epml02> References: <2805762.735911359636868523.JavaMail.weblogic@epml02> Message-ID: <510A788C.40400@alum.mit.edu> Interesting. That would indeed address the ambiguity issue. Niko Sanghyeon Seo wrote: > UtherII on Reddit /r/rust suggested an idea I like: > > &{'lt} T > T{'lt} > > Basically "option 8" of http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ > with ' from https://mail.mozilla.org/pipermail/rust-dev/2013-January/002942.html > > This does need a lookahead but as far as I can tell unambiguous and manageable. 
More on:
> http://www.reddit.com/r/rust/comments/17ka3b/meeting_weekly_20130129_region_syntax_impl_type/c86t7wg
> _______________________________________________
> Rust-dev mailing list
> Rust-dev at mozilla.org
> https://mail.mozilla.org/listinfo/rust-dev

From ben.striegel at gmail.com  Thu Jan 31 06:33:23 2013
From: ben.striegel at gmail.com (Benjamin Striegel)
Date: Thu, 31 Jan 2013 09:33:23 -0500
Subject: [rust-dev] Lifetime notation
In-Reply-To: <510A788C.40400@alum.mit.edu>
References: <2805762.735911359636868523.JavaMail.weblogic@epml02>
	<510A788C.40400@alum.mit.edu>
Message-ID: 

+1 to this. Option 8 was always the best-case syntax, and prefixing an apostrophe on lifetime names is entirely inoffensive.

On Thu, Jan 31, 2013 at 8:58 AM, Niko Matsakis wrote:

> Interesting. That would indeed address the ambiguity issue.
>
> Niko
>
> Sanghyeon Seo wrote:
>
>> UtherII on Reddit /r/rust suggested an idea I like:
>>
>> &{'lt} T
>> T{'lt}
>>
>> Basically "option 8" of
>> http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/
>> with ' from
>> https://mail.mozilla.org/pipermail/rust-dev/2013-January/002942.html
>>
>> This does need a lookahead but as far as I can tell unambiguous and
>> manageable. More on:
>> http://www.reddit.com/r/rust/comments/17ka3b/meeting_weekly_20130129_region_syntax_impl_type/c86t7wg
>> _______________________________________________
>> Rust-dev mailing list
>> Rust-dev at mozilla.org
>> https://mail.mozilla.org/listinfo/rust-dev
>
> _______________________________________________
> Rust-dev mailing list
> Rust-dev at mozilla.org
> https://mail.mozilla.org/listinfo/rust-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From hatahet at gmail.com Thu Jan 31 06:53:28 2013 From: hatahet at gmail.com (Ziad Hatahet) Date: Thu, 31 Jan 2013 06:53:28 -0800 Subject: [rust-dev] Lifetime notation In-Reply-To: References: <2805762.735911359636868523.JavaMail.weblogic@epml02> <510A788C.40400@alum.mit.edu> Message-ID: Would using a dot '.' instead of a quote ' also resolve the ambiguity, without introducing an extra sigil into the language? &{.lt}T T{.lt} -- Ziad On Thu, Jan 31, 2013 at 6:33 AM, Benjamin Striegel wrote: > +1 to this. Option 8 was always the best-case syntax, and prefixing an > apostrophe on lifetime names is entirely inoffensive. > > > On Thu, Jan 31, 2013 at 8:58 AM, Niko Matsakis wrote: > >> Interesting. That would indeed address the ambiguity issue. >> >> >> Niko >> >> >> Sanghyeon Seo wrote: >> >>> UtherII on Reddit /r/rust suggested an idea I like: >>> >>> &{'lt} T >>> T{'lt} >>> >>> Basically "option 8" of http://smallcultfollowing.com/** >>> babysteps/blog/2012/12/30/**lifetime-notation/ >>> with ' from https://mail.mozilla.org/**pipermail/rust-dev/2013-** >>> January/002942.html >>> >>> This does need a lookahead but as far as I can tell unambiguous and >>> manageable. More on: >>> http://www.reddit.com/r/rust/**comments/17ka3b/meeting_** >>> weekly_20130129_region_syntax_**impl_type/c86t7wg >>> ______________________________**_________________ >>> Rust-dev mailing list >>> Rust-dev at mozilla.org >>> https://mail.mozilla.org/**listinfo/rust-dev >>> >> ______________________________**_________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/**listinfo/rust-dev >> > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From deansherthompson at gmail.com Thu Jan 31 07:11:24 2013 From: deansherthompson at gmail.com (Dean Thompson) Date: Thu, 31 Jan 2013 07:11:24 -0800 Subject: [rust-dev] Lifetime notation In-Reply-To: Message-ID: I expect it would, but at the expense of no longer being able to make as simple a statement in the language tutorial as this: The notation 'foo means a lifetime called "foo". To me, it seems nicer for a newbie to wonder "how is that lifetime being used?" than to wonder "what's that thing after the dot?" Dean From: Ziad Hatahet Date: Thursday, January 31, 2013 6:53 AM To: Benjamin Striegel , Niko Matsakis Cc: "rust-dev at mozilla.org" Subject: Re: [rust-dev] Lifetime notation Would using a dot '.' instead of a quote ' also resolve the ambiguity, without introducing an extra sigil into the language? &{.lt}T T{.lt} -- Ziad On Thu, Jan 31, 2013 at 6:33 AM, Benjamin Striegel wrote: > +1 to this. Option 8 was always the best-case syntax, and prefixing an > apostrophe on lifetime names is entirely inoffensive. > > > On Thu, Jan 31, 2013 at 8:58 AM, Niko Matsakis wrote: >> Interesting. That would indeed address the ambiguity issue. >> >> >> Niko >> >> >> Sanghyeon Seo wrote: >>> UtherII on Reddit /r/rust suggested an idea I like: >>> >>> &{'lt} T >>> T{'lt} >>> >>> Basically "option 8" of >>> http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ >>> >>> with ' from >>> https://mail.mozilla.org/pipermail/rust-dev/2013-January/002942.html >>> >>> >>> This does need a lookahead but as far as I can tell unambiguous and >>> manageable. 
More on: >>> http://www.reddit.com/r/rust/comments/17ka3b/meeting_weekly_20130129_region_ >>> syntax_impl_type/c86t7wg >>> >> _syntax_impl_type/c86t7wg> >>> _______________________________________________ >>> Rust-dev mailing list >>> Rust-dev at mozilla.org >>> https://mail.mozilla.org/listinfo/rust-dev >>> >> _______________________________________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/listinfo/rust-dev >> > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > _______________________________________________ Rust-dev mailing list Rust-dev at mozilla.org https://mail.mozilla.org/listinfo/rust-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From lucian.branescu at gmail.com Thu Jan 31 07:25:11 2013 From: lucian.branescu at gmail.com (Lucian Branescu) Date: Thu, 31 Jan 2013 15:25:11 +0000 Subject: [rust-dev] Lifetime notation In-Reply-To: References: Message-ID: I would also find it valuable to have a particular sigil for lifetimes, as there is for the various pointers. On 31 January 2013 15:11, Dean Thompson wrote: > I expect it would, but at the expense of no longer being able to make as > simple a statement in the language tutorial as this: > > The notation 'foo means a lifetime called "foo". > > To me, it seems nicer for a newbie to wonder "how is that lifetime being > used?" than to wonder "what's that thing after the dot?" > > Dean > > From: Ziad Hatahet > Date: Thursday, January 31, 2013 6:53 AM > To: Benjamin Striegel , Niko Matsakis < > niko at alum.mit.edu> > Cc: "rust-dev at mozilla.org" > Subject: Re: [rust-dev] Lifetime notation > > Would using a dot '.' instead of a quote ' also resolve the ambiguity, > without introducing an extra sigil into the language? > > &{.lt}T > T{.lt} > > > -- > Ziad > > > On Thu, Jan 31, 2013 at 6:33 AM, Benjamin Striegel > wrote: > >> +1 to this. 
Option 8 was always the best-case syntax, and prefixing an >> apostrophe on lifetime names is entirely inoffensive. >> >> >> On Thu, Jan 31, 2013 at 8:58 AM, Niko Matsakis wrote: >> >>> Interesting. That would indeed address the ambiguity issue. >>> >>> >>> Niko >>> >>> >>> Sanghyeon Seo wrote: >>> >>>> UtherII on Reddit /r/rust suggested an idea I like: >>>> >>>> &{'lt} T >>>> T{'lt} >>>> >>>> Basically "option 8" of http://smallcultfollowing.com/** >>>> babysteps/blog/2012/12/30/**lifetime-notation/ >>>> with ' from https://mail.mozilla.org/**pipermail/rust-dev/2013-** >>>> January/002942.html >>>> >>>> This does need a lookahead but as far as I can tell unambiguous and >>>> manageable. More on: >>>> http://www.reddit.com/r/rust/**comments/17ka3b/meeting_** >>>> weekly_20130129_region_syntax_**impl_type/c86t7wg >>>> ______________________________**_________________ >>>> Rust-dev mailing list >>>> Rust-dev at mozilla.org >>>> https://mail.mozilla.org/**listinfo/rust-dev >>>> >>> ______________________________**_________________ >>> Rust-dev mailing list >>> Rust-dev at mozilla.org >>> https://mail.mozilla.org/**listinfo/rust-dev >>> >> >> >> _______________________________________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/listinfo/rust-dev >> >> > _______________________________________________ Rust-dev mailing list > Rust-dev at mozilla.org https://mail.mozilla.org/listinfo/rust-dev > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko at alum.mit.edu Thu Jan 31 07:44:08 2013 From: niko at alum.mit.edu (Niko Matsakis) Date: Thu, 31 Jan 2013 07:44:08 -0800 Subject: [rust-dev] Lifetime notation In-Reply-To: References: Message-ID: <510A9148.9010902@alum.mit.edu> Wouldn't the sigil be .identifier instead of 'identifier? 
Anyhow, I have some thoughts on this, but no time to reply atm. Niko Lucian Branescu wrote: > I would also find it valuable to have a particular sigil for > lifetimes, as there is for the various pointers. > > > On 31 January 2013 15:11, Dean Thompson > wrote: > > I expect it would, but at the expense of no longer being able to > make as simple a statement in the language tutorial as this: > > The notation 'foo means a lifetime called "foo". > > To me, it seems nicer for a newbie to wonder "how is that lifetime > being used?" than to wonder "what's that thing after the dot?" > > Dean > > From: Ziad Hatahet > > Date: Thursday, January 31, 2013 6:53 AM > To: Benjamin Striegel >, Niko Matsakis > > Cc: "rust-dev at mozilla.org " > > > Subject: Re: [rust-dev] Lifetime notation > > Would using a dot '.' instead of a quote ' also resolve the > ambiguity, without introducing an extra sigil into the language? > > &{.lt}T > T{.lt} > > > -- > Ziad > > > On Thu, Jan 31, 2013 at 6:33 AM, Benjamin Striegel > > wrote: > > +1 to this. Option 8 was always the best-case syntax, and > prefixing an apostrophe on lifetime names is entirely inoffensive. > > > On Thu, Jan 31, 2013 at 8:58 AM, Niko Matsakis > > wrote: > > Interesting. That would indeed address the ambiguity issue. > > > Niko > > > Sanghyeon Seo wrote: > > UtherII on Reddit /r/rust suggested an idea I like: > > &{'lt} T > T{'lt} > > Basically "option 8" of > http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ > with ' from > https://mail.mozilla.org/pipermail/rust-dev/2013-January/002942.html > > This does need a lookahead but as far as I can tell > unambiguous and manageable. 
More on: > http://www.reddit.com/r/rust/comments/17ka3b/meeting_weekly_20130129_region_syntax_impl_type/c86t7wg > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > > _______________________________________________ Rust-dev mailing > list Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lucian.branescu at gmail.com Thu Jan 31 07:55:53 2013 From: lucian.branescu at gmail.com (Lucian Branescu) Date: Thu, 31 Jan 2013 15:55:53 +0000 Subject: [rust-dev] Lifetime notation In-Reply-To: <510A9148.9010902@alum.mit.edu> References: <510A9148.9010902@alum.mit.edu> Message-ID: I meant that ' is not overloaded with anything, whereas things like . are. On 31 January 2013 15:44, Niko Matsakis wrote: > Wouldn't the sigil be .identifier instead of 'identifier? Anyhow, I have > some thoughts on this, but no time to reply atm. > > > Niko > > > Lucian Branescu wrote: > > I would also find it valuable to have a particular sigil for lifetimes, as > there is for the various pointers. > > > On 31 January 2013 15:11, Dean Thompson wrote: > >> I expect it would, but at the expense of no longer being able to make as >> simple a statement in the language tutorial as this: >> >> The notation 'foo means a lifetime called "foo". >> >> To me, it seems nicer for a newbie to wonder "how is that lifetime being >> used?" 
than to wonder "what's that thing after the dot?" >> >> Dean >> >> From: Ziad Hatahet >> Date: Thursday, January 31, 2013 6:53 AM >> To: Benjamin Striegel , Niko Matsakis < >> niko at alum.mit.edu> >> Cc: "rust-dev at mozilla.org" >> Subject: Re: [rust-dev] Lifetime notation >> >> Would using a dot '.' instead of a quote ' also resolve the ambiguity, >> without introducing an extra sigil into the language? >> >> &{.lt}T >> T{.lt} >> >> >> -- >> Ziad >> >> >> On Thu, Jan 31, 2013 at 6:33 AM, Benjamin Striegel < >> ben.striegel at gmail.com> wrote: >> >>> +1 to this. Option 8 was always the best-case syntax, and prefixing an >>> apostrophe on lifetime names is entirely inoffensive. >>> >>> >>> On Thu, Jan 31, 2013 at 8:58 AM, Niko Matsakis wrote: >>> >>>> Interesting. That would indeed address the ambiguity issue. >>>> >>>> >>>> Niko >>>> >>>> >>>> Sanghyeon Seo wrote: >>>> >>>>> UtherII on Reddit /r/rust suggested an idea I like: >>>>> >>>>> &{'lt} T >>>>> T{'lt} >>>>> >>>>> Basically "option 8" of >>>>> http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ >>>>> with ' from >>>>> https://mail.mozilla.org/pipermail/rust-dev/2013-January/002942.html >>>>> >>>>> This does need a lookahead but as far as I can tell unambiguous and >>>>> manageable. 
More on: >>>>> >>>>> http://www.reddit.com/r/rust/comments/17ka3b/meeting_weekly_20130129_region_syntax_impl_type/c86t7wg >>>>> _______________________________________________ >>>>> Rust-dev mailing list >>>>> Rust-dev at mozilla.org >>>>> https://mail.mozilla.org/listinfo/rust-dev >>>>> >>>> _______________________________________________ >>>> Rust-dev mailing list >>>> Rust-dev at mozilla.org >>>> https://mail.mozilla.org/listinfo/rust-dev >>>> >>> >>> >>> _______________________________________________ >>> Rust-dev mailing list >>> Rust-dev at mozilla.org >>> https://mail.mozilla.org/listinfo/rust-dev >>> >>> >> _______________________________________________ Rust-dev mailing list >> Rust-dev at mozilla.org https://mail.mozilla.org/listinfo/rust-dev >> >> _______________________________________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/listinfo/rust-dev >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hatahet at gmail.com Thu Jan 31 09:07:06 2013 From: hatahet at gmail.com (Ziad Hatahet) Date: Thu, 31 Jan 2013 09:07:06 -0800 Subject: [rust-dev] Lifetime notation In-Reply-To: References: Message-ID: On Thu, Jan 31, 2013 at 7:11 AM, Dean Thompson wrote: > I expect it would, but at the expense of no longer being able to make as > simple a statement in the language tutorial as this: > > The notation 'foo means a lifetime called "foo". > > To me, it seems nicer for a newbie to wonder "how is that lifetime being > used?" than to wonder "what's that thing after the dot?" > > Dean > True; however, is it worth to introduce more visual noise in order to be able to make a statement like that in the tutorial? At least the dot operator is not being used for package scopes, like Java or C# for instance. -- Ziad -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pwalton at mozilla.com Thu Jan 31 11:27:44 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Thu, 31 Jan 2013 11:27:44 -0800 Subject: [rust-dev] Lifetime notation In-Reply-To: References: <2805762.735911359636868523.JavaMail.weblogic@epml02> <510A788C.40400@alum.mit.edu> Message-ID: <510AC5B0.2070705@mozilla.com> On 1/31/13 6:33 AM, Benjamin Striegel wrote: > +1 to this. Option 8 was always the best-case syntax, and prefixing an > apostrophe on lifetime names is entirely inoffensive. I like this as well. Patrick From banderson at mozilla.com Thu Jan 31 11:29:41 2013 From: banderson at mozilla.com (Brian Anderson) Date: Thu, 31 Jan 2013 11:29:41 -0800 Subject: [rust-dev] Hash Table In-Reply-To: <510A301D.1080101@exyr.org> References: <510A301D.1080101@exyr.org> Message-ID: <510AC625.5090004@mozilla.com> On 01/31/2013 12:49 AM, Simon Sapin wrote: > Le 31/01/2013 09:39, piyush agarwal a écrit : >> How can we implement hash table in rust ..or there is any built-in type >> for it. > > Hi, > > Have you looked into the std::map module? If that does not fit your > use case, I think the underlying hash is in core::hash. > > http://static.rust-lang.org/doc/std/map.html > http://static.rust-lang.org/doc/core/hash.html > > Cheers, On the Rust master branch the `core::hashmap::linear::LinearMap` type is the preferred hashmap (it uses `core::hash`). LinearMap is an owned type that has a more 'Rusty' design than `std::map`. From graydon at mozilla.com Thu Jan 31 11:43:37 2013 From: graydon at mozilla.com (Graydon Hoare) Date: Thu, 31 Jan 2013 11:43:37 -0800 Subject: [rust-dev] Lifetime notation In-Reply-To: <510AC5B0.2070705@mozilla.com> References: <2805762.735911359636868523.JavaMail.weblogic@epml02> <510A788C.40400@alum.mit.edu> <510AC5B0.2070705@mozilla.com> Message-ID: <510AC969.2040506@mozilla.com> On 13-01-31 11:27 AM, Patrick Walton wrote: > On 1/31/13 6:33 AM, Benjamin Striegel wrote: >> +1 to this.
Option 8 was always the best-case syntax, and prefixing an >> apostrophe on lifetime names is entirely inoffensive. > > I like this as well. As awkward as it is to be a source of direct contradiction, much less one on syntax (sigh) I have to express my objection: I'm fine with the use of a variable-sigil like 'a but putting the whole thing in {} is terribly offputting to my eyes -- indeed, bringing any other bracketing forms into the type language at all. Particularly when combining with type parameters: Foo{'lt}<X,Y> seems past the point of tolerable reading. I'm sympathetic to the points raised in the reddit thread concerning introducing lifetime names via the <> binder on a function call, as well as the lower value of &<'a> vs. &'a. I am ok with &'a T (or even 'a&T) rather than &<'a>T if there's strong preference there; the preference I expressed in the meeting for the latter was only minor. I would strongly prefer no more uses of brackets though. -Graydon From pwalton at mozilla.com Thu Jan 31 11:46:20 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Thu, 31 Jan 2013 11:46:20 -0800 Subject: [rust-dev] Lifetime notation In-Reply-To: <510AC969.2040506@mozilla.com> References: <2805762.735911359636868523.JavaMail.weblogic@epml02> <510A788C.40400@alum.mit.edu> <510AC5B0.2070705@mozilla.com> <510AC969.2040506@mozilla.com> Message-ID: <510ACA0C.2080709@mozilla.com> On 1/31/13 11:43 AM, Graydon Hoare wrote: > On 13-01-31 11:27 AM, Patrick Walton wrote: >> On 1/31/13 6:33 AM, Benjamin Striegel wrote: >>> +1 to this. Option 8 was always the best-case syntax, and prefixing an >>> apostrophe on lifetime names is entirely inoffensive. >> >> I like this as well.
> > As awkward as it is to be a source of direct contradiction, much less > one on syntax (sigh) I have to express my objection: I'm fine with the > use of a variable-sigil like 'a but putting the whole thing in {} is > terribly offputting to my eyes -- indeed, bringing any other bracketing > forms into the type language at all. Particularly when combining with > type parameters: > > Foo{'lt}<X,Y> > > seems past the point of tolerable reading. Totally fair, and agreed. Patrick From malte.schuetze at fgms.de Thu Jan 31 12:56:51 2013 From: malte.schuetze at fgms.de (Malte Schütze) Date: Thu, 31 Jan 2013 21:56:51 +0100 Subject: [rust-dev] Lifetime notation In-Reply-To: <510ACA0C.2080709@mozilla.com> References: <2805762.735911359636868523.JavaMail.weblogic@epml02> <510A788C.40400@alum.mit.edu> <510AC5B0.2070705@mozilla.com> <510AC969.2040506@mozilla.com> <510ACA0C.2080709@mozilla.com> Message-ID: <510ADA93.2000004@fgms.de> On 01/31/2013 08:46 PM, Patrick Walton wrote: > On 1/31/13 11:43 AM, Graydon Hoare wrote: >> On 13-01-31 11:27 AM, Patrick Walton wrote: >>> On 1/31/13 6:33 AM, Benjamin Striegel wrote: >>>> +1 to this. Option 8 was always the best-case syntax, and prefixing an >>>> apostrophe on lifetime names is entirely inoffensive. >>> >>> I like this as well. >> >> As awkward as it is to be a source of direct contradiction, much less >> one on syntax (sigh) I have to express my objection: I'm fine with the >> use of a variable-sigil like 'a but putting the whole thing in {} is >> terribly offputting to my eyes -- indeed, bringing any other bracketing >> forms into the type language at all. Particularly when combining with >> type parameters: >> >> Foo{'lt}<X,Y> >> >> seems past the point of tolerable reading. > > Totally fair, and agreed. > > Patrick > I really prefer Foo{'lt} over Foo'lt - the former makes it visually clearer to me where each section of the declaration starts and ends.
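For concreteness, the consolidated angle-bracket form being weighed against the curly-brace form is the syntax Rust ultimately adopted: lifetime parameters and type parameters share one `<...>` list. A minimal sketch in later Rust syntax — the struct and field names here are invented for illustration, not taken from the thread:

```rust
// Two lifetime parameters and one type parameter in a single
// angle-bracket list, the "Foo<'lt, 'xy, X>" style from this thread.
struct Pair<'lt, 'xy, X> {
    first: &'lt X,
    second: &'xy X,
}

fn main() {
    let a = 1u32;
    let b = 2u32;
    // Lifetimes are inferred at the use site; the names only appear
    // in the declaration.
    let p = Pair { first: &a, second: &b };
    println!("{} {}", p.first, p.second);
}
```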
From deansherthompson at gmail.com Thu Jan 31 12:58:55 2013 From: deansherthompson at gmail.com (Dean Thompson) Date: Thu, 31 Jan 2013 12:58:55 -0800 Subject: [rust-dev] Lifetime notation In-Reply-To: <510ADA93.2000004@fgms.de> Message-ID: On 1/31/13 12:56 PM, "Malte Schütze" wrote: > >I really prefer Foo{'lt} over Foo'lt - the former makes it >visually clearer to me where each section of the declaration starts and >ends. The non-curly choice is Foo<'lt,X,Y>. How does that grab you? Dean From malte.schuetze at fgms.de Thu Jan 31 13:37:46 2013 From: malte.schuetze at fgms.de (Malte Schütze) Date: Thu, 31 Jan 2013 22:37:46 +0100 Subject: [rust-dev] Lifetime notation In-Reply-To: References: Message-ID: <510AE42A.1090409@fgms.de> On 01/31/2013 09:58 PM, Dean Thompson wrote: > On 1/31/13 12:56 PM, "Malte Schütze" wrote: >> I really prefer Foo{'lt} over Foo'lt - the former makes it >> visually clearer to me where each section of the declaration starts and >> ends. > The non-curly choice is Foo<'lt,X,Y>. How does that grab you? > > Dean I'm worried that it might be confusing to read when it becomes longer. Foo<'lt,X,Y> still is readable, but Foo<'lt,'xy,X,Y,Z> isn't anymore. Having it in curly braces (Foo{'lt,'xy}<X,Y,Z>) breaks it down in smaller parts and makes it easier to understand in my opinion. From deansherthompson at gmail.com Thu Jan 31 13:42:45 2013 From: deansherthompson at gmail.com (Dean Thompson) Date: Thu, 31 Jan 2013 13:42:45 -0800 Subject: [rust-dev] Lifetime notation In-Reply-To: <510AE42A.1090409@fgms.de> Message-ID: Makes sense. One counterpoint though, which I find more persuasive: the more common case by far is surely a single lifetime parameter and a single type parameter. In which case Foo<'lt,X> seems less noisy than Foo{'lt}<X>. Having said that, Graydon gently invoked BDFL rights to push against using the curlies. :-) So that ship has presumably sailed.
On 1/31/13 1:37 PM, "Malte Schütze" wrote: >On 01/31/2013 09:58 PM, Dean Thompson wrote: >> On 1/31/13 12:56 PM, "Malte Schütze" wrote: >>> I really prefer Foo{'lt} over Foo'lt - the former makes it >>> visually clearer to me where each section of the declaration starts and >>> ends. >> The non-curly choice is Foo<'lt,X,Y>. How does that grab you? >> >> Dean > >I'm worried that it might be confusing to read when it becomes longer. >Foo<'lt,X,Y> still is readable, but Foo<'lt,'xy,X,Y,Z> isn't anymore. >Having it in curly braces (Foo{'lt,'xy}<X,Y,Z>) breaks it down in >smaller parts and makes it easier to understand in my opinion. >_______________________________________________ >Rust-dev mailing list >Rust-dev at mozilla.org >https://mail.mozilla.org/listinfo/rust-dev From martindemello at gmail.com Thu Jan 31 14:06:54 2013 From: martindemello at gmail.com (Martin DeMello) Date: Thu, 31 Jan 2013 14:06:54 -0800 Subject: [rust-dev] Lifetime notation In-Reply-To: <510AE42A.1090409@fgms.de> References: <510AE42A.1090409@fgms.de> Message-ID: Personally, Foo<'lt, 'xy, X, Y, Z> is perfectly readable, and far less noisy-looking than having two delimited lists one after another. It will be even more readable with syntax highlighting. If multiple lifetime parameters were really a common thing I might have liked a second separator, maybe Foo<'lt, 'xy | X, Y, Z> or Foo<'lt, 'xy / X, Y, Z>, but that's a very minor issue. martin On Thu, Jan 31, 2013 at 1:37 PM, Malte Schütze wrote: > On 01/31/2013 09:58 PM, Dean Thompson wrote: >> On 1/31/13 12:56 PM, "Malte Schütze" wrote: >>> I really prefer Foo{'lt} over Foo'lt - the former makes it >>> visually clearer to me where each section of the declaration starts and >>> ends. >> The non-curly choice is Foo<'lt,X,Y>. How does that grab you? >> >> Dean > > I'm worried that it might be confusing to read when it becomes longer. > Foo<'lt,X,Y> still is readable, but Foo<'lt,'xy,X,Y,Z> isn't anymore.
Having > it in curly braces (Foo{'lt,'xy}<X,Y,Z>) breaks it down in smaller parts and > makes it easier to understand in my opinion. > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From illissius at gmail.com Thu Jan 31 14:08:47 2013 From: illissius at gmail.com (Gábor Lehel) Date: Thu, 31 Jan 2013 23:08:47 +0100 Subject: [rust-dev] Lifetime notation In-Reply-To: <2805762.735911359636868523.JavaMail.weblogic@epml02> References: <2805762.735911359636868523.JavaMail.weblogic@epml02> Message-ID: (Not sure if anyone cares about my opinion, but: if apostrophes are a given, the braces of "option 8" aren't obviously preferable to me any more. The appeal of option 8 was that it visually distinguished lifetime parameters, and just overall looked nice, gave the right impression. With apostrophes the lifetimes are already distinguished by the apostrophes, and it no longer looks as nice. It's not clearly better to me than angle brackets with a consolidated lifetime + type parameter list. So if I had a vote, I would probably cast it for the latter, because at least it's simpler.) On Thu, Jan 31, 2013 at 1:54 PM, Sanghyeon Seo wrote: > UtherII on Reddit /r/rust suggested an idea I like: > > &{'lt} T > T{'lt} > > Basically "option 8" of http://smallcultfollowing.com/babysteps/blog/2012/12/30/lifetime-notation/ > with ' from https://mail.mozilla.org/pipermail/rust-dev/2013-January/002942.html > > This does need a lookahead but as far as I can tell unambiguous and manageable. More on: > http://www.reddit.com/r/rust/comments/17ka3b/meeting_weekly_20130129_region_syntax_impl_type/c86t7wg > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev -- Your ship was destroyed in a monadic eruption.
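The apostrophe sigil debated throughout this thread — Dean's "the notation 'foo means a lifetime called \"foo\"" — is what shipped. A minimal sketch of it in a function signature, in later Rust syntax; the `longest` function is illustrative only, not from the thread:

```rust
// `'a` names a lifetime; `&'a str` is a reference to a str that
// lives at least as long as 'a. The return value is tied to the
// shorter of the two input lifetimes.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

fn main() {
    println!("{}", longest("lifetime", "sigil"));
}
```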
From pwalton at mozilla.com Thu Jan 31 14:37:33 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Thu, 31 Jan 2013 14:37:33 -0800 Subject: [rust-dev] RFC: Explicit stack switching Message-ID: <510AF22D.3040509@mozilla.com> Hi everyone, With the revamp of the scheduler underway, I'd like to propose a change to the way C functions work. Currently, we generate a shim and a stack switch for every function call from Rust to C and likewise from C to Rust, except for functions annotated with `#[rust_stack]`. These wrappers result in a significant performance overhead. For some workloads this performance overhead is acceptable in order to maintain small stacks. For some workloads the performance overhead is undesirable. For instance, the DOM in Servo requires lots of very small calls from JavaScript to Rust. The overhead of stack switching swamps most of the time here. Popular Web benchmarks will do things like `someElement.clientX;` over and over, which require calls from JavaScript to Rust to retrieve a cached value. So we must carefully consider every CPU cycle spent in the C-to-Rust transition. To address these issues I would like to propose a somewhat radical change: don't have the compiler generate stack switching stubs at all. Instead, the scheduler can expose a primitive that generates the stack switch, and it's the programmer's responsibility to perform the stack switch to call out to C functions. To avoid the obvious footgun here, I propose a lint pass, on by default, that ensures that functions not annotated with `#[rust_stack]` are called inside a stack switching helper. The rationale here is as follows: 1. It should be possible to group many C calls under a single stack switching operation. For example: do stackswitch { c_function_1(); c_function_2(); c_function_3(); } This amortizes the cost of the stack switch over many native function calls. 2. 
It should be possible to have sections of Rust code that run on a big C stack and do not use segmented stacks; for example, the new Rust scheduler (which is to be written in Rust), or the Servo DOM as mentioned above. 3. If (2) is possible, the Rust compiler never knows whether there's enough stack space available to safely call a C function. Therefore, performing the stack switch ought to be under the programmer's control. 4. We should have a lint pass that ensures that stack switches are performed properly, because we do not want programmers to accidentally shoot themselves in the foot. 5. Because C functions are always unsafe in the Rust sense, Rust code will almost always wrap functionality provided by foreign libraries into safe Rust abstractions. The stack switch can be moved into these abstractions. 6. C functions are always unsafe, so this does not, formally, add any new unsafety. Whatever decision we come to, we should make this decision soon (before 0.6), because this will break code. Thoughts? Patrick From banderson at mozilla.com Thu Jan 31 15:35:48 2013 From: banderson at mozilla.com (Brian Anderson) Date: Thu, 31 Jan 2013 15:35:48 -0800 Subject: [rust-dev] RFC: Explicit stack switching In-Reply-To: <510AF22D.3040509@mozilla.com> References: <510AF22D.3040509@mozilla.com> Message-ID: <510AFFD4.7030207@mozilla.com> On 01/31/2013 02:37 PM, Patrick Walton wrote: > Hi everyone, > > With the revamp of the scheduler underway, I'd like to propose a > change to the way C functions work. > > Currently, we generate a shim and a stack switch for every function > call from Rust to C and likewise from C to Rust, except for functions > annotated with `#[rust_stack]`. These wrappers result in a significant > performance overhead. For some workloads this performance overhead is > acceptable in order to maintain small stacks. For some workloads the > performance overhead is undesirable. 
> > For instance, the DOM in Servo requires lots of very small calls from > JavaScript to Rust. The overhead of stack switching swamps most of the > time here. Popular Web benchmarks will do things like > `someElement.clientX;` over and over, which require calls from > JavaScript to Rust to retrieve a cached value. So we must carefully > consider every CPU cycle spent in the C-to-Rust transition. > > To address these issues I would like to propose a somewhat radical > change: don't have the compiler generate stack switching stubs at all. > Instead, the scheduler can expose a primitive that generates the stack > switch, and it's the programmer's responsibility to perform the stack > switch to call out to C functions. To avoid the obvious footgun here, > I propose a lint pass, on by default, that ensures that functions not > annotated with `#[rust_stack]` are called inside a stack switching > helper. > > The rationale here is as follows: > > 1. It should be possible to group many C calls under a single stack > switching operation. For example: > > do stackswitch { > c_function_1(); > c_function_2(); > c_function_3(); > } > > This amortizes the cost of the stack switch over many native function > calls. I think this API requires #4479 and #4480 to be safe. Currently, the execution environment after the stack switch is very different, so running arbitrary Rust code there is dangerous. We may want to think of 'stack switching' instead as 'make sure I'm running on a stack segment that is big'. Then whatever code executes after that doesn't matter - if it's C code it will run till it runs off the stack, if it's Rust code it will request a new segment when it hits the end. https://github.com/mozilla/rust/issues/4479 https://github.com/mozilla/rust/issues/4480 An API that was more like `stackswitch!(function, args)` wouldn't have that problem. > > 2. 
It should be possible to have sections of Rust code that run on a > big C stack and do not use segmented stacks; for example, the new Rust > scheduler (which is to be written in Rust), or the Servo DOM as > mentioned above. > > 3. If (2) is possible, the Rust compiler never knows whether there's > enough stack space available to safely call a C function. Therefore, > performing the stack switch ought to be under the programmer's control. I don't strictly agree with this but also don't think it invalidates your argument. My pull request #4691 makes the decision to do the stack switch dynamic in a way that it should work correctly whether it's in task context or not. There is overhead though. > > 4. We should have a lint pass that ensures that stack switches are > performed properly, because we do not want programmers to accidentally > shoot themselves in the foot. This does force more work on programmers in the common case, and it doesn't just punish the person declaring the functions but every caller. That's a difficult trade off. > > 5. Because C functions are always unsafe in the Rust sense, Rust code > will almost always wrap functionality provided by foreign libraries > into safe Rust abstractions. The stack switch can be moved into these > abstractions. Sure. Alleviates the problem I mentioned above. > > 6. C functions are always unsafe, so this does not, formally, add any > new unsafety. > > Whatever decision we come to, we should make this decision soon > (before 0.6), because this will break code. Thoughts? I do want 'extern fn's (the ones that call into Rust from C) to be plain C ABI functions, but I think this is solvable just with #4479 and #4480. Under that scenario calling into Rust via C would not ever have a stack switch. Instead, it would simply be relying on the segmented stack prologue to determine if there was enough stack. 
I would also like, though, to move the stack switching machinery out of trans, and I want access to foreign function pointers. I think a syntax extension is doable, but I'm worried about how inconvenient it might be, and I'm also reluctant to surface the stack switching in the language. Maybe we can phrase it in a way that isn't tied to the implementation, like 'prepare_foreign_call' (just spitballing here). From banderson at mozilla.com Thu Jan 31 15:41:15 2013 From: banderson at mozilla.com (Brian Anderson) Date: Thu, 31 Jan 2013 15:41:15 -0800 Subject: [rust-dev] RFC: Explicit stack switching In-Reply-To: <510AFFD4.7030207@mozilla.com> References: <510AF22D.3040509@mozilla.com> <510AFFD4.7030207@mozilla.com> Message-ID: <510B011B.1080904@mozilla.com> On 01/31/2013 03:35 PM, Brian Anderson wrote: > On 01/31/2013 02:37 PM, Patrick Walton wrote: >> Hi everyone, >> >> With the revamp of the scheduler underway, I'd like to propose a >> change to the way C functions work. >> >> Currently, we generate a shim and a stack switch for every function >> call from Rust to C and likewise from C to Rust, except for functions >> annotated with `#[rust_stack]`. These wrappers result in a >> significant performance overhead. For some workloads this performance >> overhead is acceptable in order to maintain small stacks. For some >> workloads the performance overhead is undesirable. 
Instead, the scheduler can expose a primitive that generates the >> stack switch, and it's the programmer's responsibility to perform the >> stack switch to call out to C functions. To avoid the obvious footgun >> here, I propose a lint pass, on by default, that ensures that >> functions not annotated with `#[rust_stack]` are called inside a >> stack switching helper. >> >> The rationale here is as follows: >> >> 1. It should be possible to group many C calls under a single stack >> switching operation. For example: >> >> do stackswitch { >> c_function_1(); >> c_function_2(); >> c_function_3(); >> } >> >> This amortizes the cost of the stack switch over many native function >> calls. > > I think this API requires #4479 and #4480 to be safe. Currently, the > execution environment after the stack switch is very different, so > running arbitrary Rust code there is dangerous. We may want to think > of 'stack switching' instead as 'make sure I'm running on a stack > segment that is big'. Then whatever code executes after that doesn't > matter - if it's C code it will run till it runs off the stack, if > it's Rust code it will request a new segment when it hits the end. > > https://github.com/mozilla/rust/issues/4479 > https://github.com/mozilla/rust/issues/4480 > > An API that was more like `stackswitch!(function, args)` wouldn't have > that problem. There's also the problem with failure after switching stacks. Right now there is a guard in the stack switch that catches exceptions thrown by foreign code and aborts the process, which makes this bogus: do stackswitch { fail; } We could remove that guard and leave the behavior undefined ... 
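The guard Brian describes can be approximated in modern Rust with `catch_unwind`: unwinding is stopped at the boundary rather than propagating through foreign frames. `guarded_switch` below is a hypothetical stand-in for the guarded stack switch, not the 0.5-era runtime code, and it returns an error where the real guard would abort the process, just to keep the sketch runnable.

```rust
use std::panic;

// Hypothetical stand-in for the guarded stack switch: run the closure
// and stop any unwinding at the boundary. The real guard aborts the
// process on a caught failure; this sketch reports it as an Err instead.
fn guarded_switch<R>(f: impl FnOnce() -> R + panic::UnwindSafe) -> Result<R, ()> {
    panic::catch_unwind(f).map_err(|_| ())
}

fn main() {
    // A normal call passes its result back through the guard.
    assert_eq!(guarded_switch(|| 7), Ok(7));
    // A `fail` (panic) inside the switched region is caught at the
    // boundary instead of unwinding through foreign frames.
    assert!(guarded_switch(|| -> u32 { panic!("fail inside stackswitch") }).is_err());
    println!("guard caught the failure");
}
```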
From pwalton at mozilla.com Thu Jan 31 15:42:31 2013 From: pwalton at mozilla.com (Patrick Walton) Date: Thu, 31 Jan 2013 15:42:31 -0800 Subject: [rust-dev] RFC: Explicit stack switching In-Reply-To: <510B011B.1080904@mozilla.com> References: <510AF22D.3040509@mozilla.com> <510AFFD4.7030207@mozilla.com> <510B011B.1080904@mozilla.com> Message-ID: <510B0167.9000109@mozilla.com> On 1/31/13 3:41 PM, Brian Anderson wrote: > There's also the problem with failure after switching stacks. Right now > there is a guard in the stack switch that catches exceptions thrown by > foreign code and aborts the process, which makes this bogus: > > do stackswitch { > fail; > } > > We could remove that guard and leave the behavior undefined ... Could we implement unsafe `catch` to make this work? Then the stack switching code could catch the failure and abort. The scheduler probably needs the ability to catch anyway, right? Patrick From banderson at mozilla.com Thu Jan 31 15:43:55 2013 From: banderson at mozilla.com (Brian Anderson) Date: Thu, 31 Jan 2013 15:43:55 -0800 Subject: [rust-dev] RFC: Explicit stack switching In-Reply-To: <510B011B.1080904@mozilla.com> References: <510AF22D.3040509@mozilla.com> <510AFFD4.7030207@mozilla.com> <510B011B.1080904@mozilla.com> Message-ID: <510B01BB.9020106@mozilla.com> On 01/31/2013 03:41 PM, Brian Anderson wrote: > On 01/31/2013 03:35 PM, Brian Anderson wrote: >> On 01/31/2013 02:37 PM, Patrick Walton wrote: >>> Hi everyone, >>> >>> With the revamp of the scheduler underway, I'd like to propose a >>> change to the way C functions work. >>> >>> Currently, we generate a shim and a stack switch for every function >>> call from Rust to C and likewise from C to Rust, except for >>> functions annotated with `#[rust_stack]`. These wrappers result in a >>> significant performance overhead. For some workloads this >>> performance overhead is acceptable in order to maintain small >>> stacks. For some workloads the performance overhead is undesirable. 
>>> >>> For instance, the DOM in Servo requires lots of very small calls >>> from JavaScript to Rust. The overhead of stack switching swamps most >>> of the time here. Popular Web benchmarks will do things like >>> `someElement.clientX;` over and over, which require calls from >>> JavaScript to Rust to retrieve a cached value. So we must carefully >>> consider every CPU cycle spent in the C-to-Rust transition. >>> >>> To address these issues I would like to propose a somewhat radical >>> change: don't have the compiler generate stack switching stubs at >>> all. Instead, the scheduler can expose a primitive that generates >>> the stack switch, and it's the programmer's responsibility to >>> perform the stack switch to call out to C functions. To avoid the >>> obvious footgun here, I propose a lint pass, on by default, that >>> ensures that functions not annotated with `#[rust_stack]` are called >>> inside a stack switching helper. >>> >>> The rationale here is as follows: >>> >>> 1. It should be possible to group many C calls under a single stack >>> switching operation. For example: >>> >>> do stackswitch { >>> c_function_1(); >>> c_function_2(); >>> c_function_3(); >>> } >>> >>> This amortizes the cost of the stack switch over many native >>> function calls. >> >> I think this API requires #4479 and #4480 to be safe. Currently, the >> execution environment after the stack switch is very different, so >> running arbitrary Rust code there is dangerous. We may want to think >> of 'stack switching' instead as 'make sure I'm running on a stack >> segment that is big'. Then whatever code executes after that doesn't >> matter - if it's C code it will run till it runs off the stack, if >> it's Rust code it will request a new segment when it hits the end. >> >> https://github.com/mozilla/rust/issues/4479 >> https://github.com/mozilla/rust/issues/4480 >> >> An API that was more like `stackswitch!(function, args)` wouldn't >> have that problem. 
> > There's also the problem with failure after switching stacks. Right > now there is a guard in the stack switch that catches exceptions > thrown by foreign code and aborts the process, which makes this bogus: > > do stackswitch { > fail; > } > > We could remove that guard and leave the behavior undefined ... I'm going to stop replying to myself, but with return-based unwinding the story here will be changing. Probably we can thread the return flag through the stack switch and leave the DWARF unwinding case undefined. From banderson at mozilla.com Thu Jan 31 15:46:58 2013 From: banderson at mozilla.com (Brian Anderson) Date: Thu, 31 Jan 2013 15:46:58 -0800 Subject: [rust-dev] RFC: Explicit stack switching In-Reply-To: <510B0167.9000109@mozilla.com> References: <510AF22D.3040509@mozilla.com> <510AFFD4.7030207@mozilla.com> <510B011B.1080904@mozilla.com> <510B0167.9000109@mozilla.com> Message-ID: <510B0272.6060703@mozilla.com> On 01/31/2013 03:42 PM, Patrick Walton wrote: > On 1/31/13 3:41 PM, Brian Anderson wrote: >> There's also the problem with failure after switching stacks. Right now >> there is a guard in the stack switch that catches exceptions thrown by >> foreign code and aborts the process, which makes this bogus: >> >> do stackswitch { >> fail; >> } >> >> We could remove that guard and leave the behavior undefined ... > > Could we implement unsafe `catch` to make this work? Then the stack > switching code could catch the failure and abort. > > The scheduler probably needs the ability to catch anyway, right? I think that's not exactly the issue. If we catch the failure we could be catching a legitimate Rust failure that shouldn't abort. We won't know if we're catching a Rust failure or a C++ failure. The more I think about it though the more I think it's not a major obstacle. 
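Point 1 of the proposal, grouping many foreign calls under one switch, can be sketched as follows. This is modern Rust, not 0.5-era code; `stackswitch` and the `c_function_*` stubs are hypothetical stand-ins, with a thread-local counter modeling the fixed cost of the transition that a real implementation would pay by swapping to a large C stack.

```rust
use std::cell::Cell;

thread_local! {
    // Counts how many (simulated) stack switches have occurred.
    static SWITCHES: Cell<u32> = Cell::new(0);
}

// Stand-in for the proposed scheduler primitive: pay the fixed
// transition cost once, then run the closure "on the big stack".
fn stackswitch<R>(f: impl FnOnce() -> R) -> R {
    SWITCHES.with(|c| c.set(c.get() + 1));
    f()
}

// Stand-ins for foreign (C) functions.
fn c_function_1() -> u32 { 1 }
fn c_function_2() -> u32 { 2 }
fn c_function_3() -> u32 { 3 }

fn main() {
    // One switch is amortized over three foreign calls, instead of
    // paying the transition cost on every call.
    let total = stackswitch(|| c_function_1() + c_function_2() + c_function_3());
    println!("total={} switches={}", total, SWITCHES.with(|c| c.get()));
}
```

The point of the grouping is visible in the counter: three calls, one transition.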
From a.stavonin at gmail.com Thu Jan 31 22:05:15 2013 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Fri, 1 Feb 2013 15:05:15 +0900 Subject: [rust-dev] Tuples and to_str() Message-ID: I'm trying to convert a tuple of uints to a string using the to_str module, like this: io::println((1,2).to_string()); It looks like it should work, as to_str has an implementation for (A, B), but: test.rs:11 io::println((1,2).to_string()); ^~~~~~~~~~~~~~~~~~ error: aborting due to previous error Is it a compiler error, or have I missed something? -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.stavonin at gmail.com Thu Jan 31 22:07:00 2013 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Fri, 1 Feb 2013 15:07:00 +0900 Subject: [rust-dev] Tuples and to_str() In-Reply-To: References: Message-ID: Full error report: test.rs:11:16: error: type `(,)` does not implement any method in scope named `to_string` test.rs:11 io::println((1,2).to_string()); ^~~~~~~~~~~~~~~~~~ error: aborting due to previous error From andrewrink at gmail.com Thu Jan 31 22:18:29 2013 From: andrewrink at gmail.com (Andrew Rink) Date: Thu, 31 Jan 2013 22:18:29 -0800 Subject: [rust-dev] Tuples and to_str() In-Reply-To: References: Message-ID: Hi Alexander Looks like a typo. The function your code calls is to_string() instead of to_str(). 
Tried the statements below and both worked: io::println(fmt!("%s", (1,2).to_str())); io::println((1,2).to_str()); Andrew On Thu, Jan 31, 2013 at 10:07 PM, Alexander Stavonin wrote: > Full error report: > > test.rs:11:16: error: type `(,)` does not implement any method > in scope named `to_string` > > test.rs:11 io::println((1,2).to_string()); > ^~~~~~~~~~~~~~~~~~ > error: aborting due to previous error > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > From sh4.seo at samsung.com Thu Jan 31 22:20:10 2013 From: sh4.seo at samsung.com (Sanghyeon Seo) Date: Fri, 01 Feb 2013 06:20:10 +0000 (GMT) Subject: [rust-dev] Tuples and to_str() Message-ID: <8906695.771491359699610482.JavaMail.weblogic@epml02> > I'm trying to convert a tuple of uints to a string using the to_str module, like this: > > io::println((1,2).to_string()); > > Is it a compiler error, or have I missed something? Did you mean .to_str()? The following program works for me. fn main() { io::println((1, 2).to_str()); } From hatahet at gmail.com Thu Jan 31 22:26:52 2013 From: hatahet at gmail.com (Ziad Hatahet) Date: Thu, 31 Jan 2013 22:26:52 -0800 Subject: [rust-dev] Tuples and to_str() In-Reply-To: References: Message-ID: As others mentioned, it is to_str() instead of to_string(). You can also use the %? format string qualifier: io::println(fmt!("Tuple is %?", (1, 2))); -- Ziad From a.stavonin at gmail.com Thu Jan 31 22:31:42 2013 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Fri, 1 Feb 2013 15:31:42 +0900 Subject: [rust-dev] Tuples and to_str() In-Reply-To: References: Message-ID: Yes, you're right, just a typo. Thanks! 2013/2/1 Andrew Rink > Hi Alexander > > Looks like a typo. The function your code calls is to_string() instead of > to_str(). 
Tried the statements below and both worked: > io::println(fmt!("%s", (1,2).to_str())); > io::println((1,2).to_str()); > > Andrew > From a.stavonin at gmail.com Thu Jan 31 23:12:39 2013 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Fri, 1 Feb 2013 16:12:39 +0900 Subject: [rust-dev] trait and lifetime Message-ID: I want to add functions like _1(), _2(), etc. for Rust tuples. Unfortunately, I do not understand how to tell the compiler the lifetime of the returned result in the case of a `trait`: pub trait TupleVal<T> { pub pure fn _1() -> T; pub pure fn _2() -> T; } impl<T> (T, T): TupleVal<T> { pure fn _1() -> T { let (a, _) = self; a } pure fn _2() -> T { let (_, b) = self; b } } And the errors: test.rs:31:21: 31:25 error: moving out of self reference test.rs:31 let (a, _) = self; ^~~~ test.rs:35:21: 35:25 error: moving out of self reference test.rs:35 let (_, b) = self; ^~~~ error: aborting due to 2 previous errors How can I tell the compiler the returned value's lifetime? It couldn't be longer than the lifetime of self.
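The fix the borrow checker is asking for here is to return a reference borrowed from self rather than moving a value out, so the result's lifetime is tied to the tuple's automatically. A minimal sketch in modern Rust syntax follows; the method names and signatures are a hypothetical modernization of the email's trait, not 0.5-era code.

```rust
// Hypothetical modern rendering of the TupleVal idea: _1/_2 become
// first/second and return references borrowed from `self`. Nothing is
// moved out of the tuple, and the borrow cannot outlive it, which is
// exactly the lifetime relationship the question asks for.
trait TupleVal<T> {
    fn first(&self) -> &T;
    fn second(&self) -> &T;
}

impl<T> TupleVal<T> for (T, T) {
    fn first(&self) -> &T { &self.0 }
    fn second(&self) -> &T { &self.1 }
}

fn main() {
    let t = (1, 2);
    // The borrows live no longer than `t` itself.
    println!("{} {}", t.first(), t.second()); // prints "1 2"
}
```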