From pwalton at mozilla.com Tue Nov 1 08:52:02 2011 From: pwalton at mozilla.com (Patrick Walton) Date: Tue, 01 Nov 2011 08:52:02 -0700 Subject: [rust-dev] Naming convention for libraries Message-ID: <4EB015A2.9080405@mozilla.com> We recently renamed libstd.so to libruststd.so to avoid stomping on a libstd that might exist in /usr/lib. Perhaps we should attack this in a more holistic way: either (a) all Rust libraries should start with rust* or (b) Rust libraries should install themselves into /usr/lib/rust. The latter seems to be more common among language runtimes, but I'm not sure if there are going to be library path issues on some systems if we go down that route. Patrick From marijnh at gmail.com Tue Nov 1 09:50:20 2011 From: marijnh at gmail.com (Marijn Haverbeke) Date: Tue, 1 Nov 2011 17:50:20 +0100 Subject: [rust-dev] First skeleton of a tutorial In-Reply-To: References: Message-ID: I wrote the sections on modules and native functions today, cleaned up some of the existing text, and added pretty syntax highlighting to the code in the HTML output. Still at http://marijnhaverbeke.nl/rust_tutorial Cheers, Marijn From brendan at mozilla.org Tue Nov 1 10:04:03 2011 From: brendan at mozilla.org (Brendan Eich) Date: Tue, 1 Nov 2011 10:04:03 -0700 Subject: [rust-dev] Naming convention for libraries In-Reply-To: <4EB015A2.9080405@mozilla.com> References: <4EB015A2.9080405@mozilla.com> Message-ID: On Nov 1, 2011, at 8:52 AM, Patrick Walton wrote: > We recently renamed libstd.so to libruststd.so to avoid stomping on a libstd that might exist in /usr/lib. Perhaps we should attack this in a more holistic way: either (a) all Rust libraries should start with rust* or (b) Rust libraries should install themselves into /usr/lib/rust. > > The latter seems to be more common among language runtimes, but I'm not sure if there are going to be library path issues on some systems if we go down that route. Let's find out. 
It seems much better to use the filesystem than mangle the filenames. /be From banderson at mozilla.com Tue Nov 1 10:14:47 2011 From: banderson at mozilla.com (Brian Anderson) Date: Tue, 01 Nov 2011 10:14:47 -0700 Subject: [rust-dev] Naming convention for libraries In-Reply-To: <4EB015A2.9080405@mozilla.com> References: <4EB015A2.9080405@mozilla.com> Message-ID: <4EB02907.1080105@mozilla.com> On 11/01/2011 08:52 AM, Patrick Walton wrote: > We recently renamed libstd.so to libruststd.so to avoid stomping on a > libstd that might exist in /usr/lib. Perhaps we should attack this in > a more holistic way: either (a) all Rust libraries should start with > rust* or (b) Rust libraries should install themselves into /usr/lib/rust. > > The latter seems to be more common among language runtimes, but I'm > not sure if there are going to be library path issues on some systems > if we go down that route. The current setup is that libraries that rustc needs, libruststd and librustllvm, live in /usr/lib. Then we have our own area under /usr/lib/rustc that we expect to find rust libraries (including another copy of std). The rpathing is set up so there are some fallback options, but with the expectation that rust libraries are basically managed by 'the rust system', whatever that turns out to be. From banderson at mozilla.com Tue Nov 1 10:16:54 2011 From: banderson at mozilla.com (Brian Anderson) Date: Tue, 01 Nov 2011 10:16:54 -0700 Subject: [rust-dev] Naming convention for libraries In-Reply-To: References: <4EB015A2.9080405@mozilla.com> Message-ID: <4EB02986.9060906@mozilla.com> On 11/01/2011 10:04 AM, Brendan Eich wrote: > On Nov 1, 2011, at 8:52 AM, Patrick Walton wrote: > >> We recently renamed libstd.so to libruststd.so to avoid stomping on a libstd that might exist in /usr/lib. Perhaps we should attack this in a more holistic way: either (a) all Rust libraries should start with rust* or (b) Rust libraries should install themselves into /usr/lib/rust. 
>> >> The latter seems to be more common among language runtimes, but I'm not sure if there are going to be library path issues on some systems if we go down that route. > Let's find out. It seems much better to use the filesystem than mangle the filenames. Our envisioned (but unimplemented) library versioning scheme entailed attaching sha-1's (of the types in the crate + metadata) to file names. From banderson at mozilla.com Tue Nov 1 12:31:58 2011 From: banderson at mozilla.com (Brian Anderson) Date: Tue, 01 Nov 2011 12:31:58 -0700 Subject: [rust-dev] Fwd: Re: Naming convention for libraries Message-ID: <4EB0492E.8060504@mozilla.com> I accidentally took this discussion off-list. Here's some further explanation of rust's versioning. On 11/01/2011 10:23 AM, Brendan Eich wrote: > On Nov 1, 2011, at 10:16 AM, Brian Anderson wrote: > >> On 11/01/2011 10:04 AM, Brendan Eich wrote: >>> On Nov 1, 2011, at 8:52 AM, Patrick Walton wrote: >>> >>>> We recently renamed libstd.so to libruststd.so to avoid stomping on a libstd that might exist in /usr/lib. Perhaps we should attack this in a more holistic way: either (a) all Rust libraries should start with rust* or (b) Rust libraries should install themselves into /usr/lib/rust. >>>> >>>> The latter seems to be more common among language runtimes, but I'm not sure if there are going to be library path issues on some systems if we go down that route. >>> Let's find out. It seems much better to use the filesystem than mangle the filenames. >> Our envisioned (but unimplemented) library versioning scheme entailed attaching sha-1's (of the types in the crate + metadata) to file names. > But also in metadata in the files? Names change and get (truly, as in accidentally) mangled. > > Sorry to be out of touch on this library versioning scheme. Sure, Linux has a naming scheme that lacks crypto-hashes. Are we trying to improve that by making names based on contents, instead of making up version numbers? How will the user choose? 
> > /be

The decision of which library to link is made solely by the metadata within the libraries. The file naming scheme is to allow different versions and configurations to live alongside each other in a way that is OS-agnostic. The user doesn't choose what library they want by filename - they specify some amount of metadata in their 'use' statements and rustc picks the right crate to link.

To clarify, I looked up the proposed naming scheme and it's 'libname-hash-version.so' where the hash is a hash of the exported metadata (type hashes don't come into play here).

From marijnh at gmail.com Fri Nov 4 05:32:19 2011
From: marijnh at gmail.com (Marijn Haverbeke)
Date: Fri, 4 Nov 2011 13:32:19 +0100
Subject: [rust-dev] Kind system revision proposal
Message-ID:

Kinds are assigned to generic type parameters to determine what a generic function can do with values of that kind.

Rust currently has three kinds:

- 'unique' types can be moved and sent (all the scalar types, unique boxes with no resources in them)
- 'shared' types can be moved, but can not be sent (anything involving refcounted boxes -- @, obj, lambda)
- 'pinned' types can be sent, but not moved (this currently applies to resources, anything including resources, and blocks)

I'm arguing that this is not a very good system, and that we'd be better off considering 'copying' rather than 'moving' as a property. This was the way the system was originally designed, but when implementing it, Graydon changed to the system outlined above. In private mail, he told me that the reason for this was that we couldn't allow blocks (functions closing over local frames) to be moved.

I propose the following alternative, which allows resources to be moved again (*), greatly reduces the awkwardness of declaring types for generic functions, defines a more specialized approach to safety for blocks (**), and actually helps define noncopyability (which the current system doesn't do at all).
We'd have the following kinds (better terms will have to be found, but I'm using these for clarity):

- 'send,copy' types can be copied and sent (scalar types, unique boxes without resources in them)
- 'nosend,copy' types can be copied, not sent (the same as the 'shared' types above, refcounted things)
- 'nosend,nocopy' types can't be sent or copied (resources, unique boxes with resources in them, blocks)

Contrary to the old kinds, this forms a hierarchy, and the kinds lower in the list can be treated as sub-kinds of those above them (a 'send,copy' value can be safely treated as a 'nosend,nocopy' value). Most generic functions will neither send nor copy values of their parameterized type, so they can safely default to 'nosend,nocopy' without losing genericity. When they *do* copy or send, and the programmer forgets to annotate the type parameter as such, a clear, understandable error message can be provided.

Generic types (type and tag) do not seem to have a reason to ever narrow their kind bounds in this system, so (unless I am missing something), they should probably not even be allowed to specify a kind on their parameters.

We'll have to define more closely when copying happens. I probably forgot some cases here, but an lvalue (rvalues are always conceptually moved) is copied when:

- It is assigned to some other lvalue with =
- It is copied into a lambda closure
- It is passed as an argument in an unsafe way (safe argument passing does not copy)
- It is used as the content of a newly created data structure (it is not yet clear how tag/resource constructors can be distinguished from other functions for this purpose.
maybe they should take their args in move mode, though that would require one to always say option::some(copy localvar), which is also awkward)
- It is returned from a function or block

It might be useful to consider the last use of a local (let) variable to be an rvalue, since that local will never be referenced again, and it is always safe to move out of it. This will cause most returns to become moves (though returning a non-move-mode argument or the content of a data structure is still a copy). The intention is to only annotate things as a copy where there will *actually* be two reachable versions of a value after the operation. (The translator pass is already optimizing most of the situations where this isn't the case into a move or construct-in-place. This system could move some of that logic into the kind-checking pass.)

Block safety will have to be handled on a more or less syntactic level. Values with block type can be restricted to only appear in function argument or callee position, and when appearing in argument position, the argument mode has to be by-reference. This is a kludge, but a relatively simple one (a small pass coming after typechecking can verify it).

Given this, we can have a kind-checking pass that actually works, without imposing a big burden on users. Generic functions (and objects, as they currently exist) will have their parameters be 'nosend,nocopy' when no kind annotation is present, which will usually be the right thing. Generic data types will never have to be annotated with kinds.

I hope people agree this system is easier to understand than the current one. I also hope I didn't miss any gaping soundness holes. Please comment.

*) Resources that can't be moved are rather useless. They can only exist as local variables and never be stored in a data structure. This is why there is currently a kludge in the kind-checking code to allow resources to be constructed into data structures.
This is needlessly special-cased, and doesn't really work very well at the moment (you can't, for example, put a resource in a tag value.)

**) If the 'regions' design that Patrick and Niko are working on pans out, that will produce a much better way of handling blocks.

Best,
Marijn

From banderson at mozilla.com Fri Nov 4 11:52:49 2011
From: banderson at mozilla.com (Brian Anderson)
Date: Fri, 04 Nov 2011 11:52:49 -0700
Subject: [rust-dev] Kind system revision proposal
In-Reply-To:
References:
Message-ID: <4EB43481.1040402@mozilla.com>

On 11/04/2011 05:32 AM, Marijn Haverbeke wrote:
> Kinds are assigned to generic type parameters to determine what a
> generic function can do with values of that kind.
>
> Rust currently has three kinds:
> - 'unique' types can be moved and sent (all the scalar types, unique
> boxes with no resources in them)
> - 'shared' types can be moved, but can not be sent (anything
> involving refcounted boxes-- @, obj, lambda)
> - 'pinned' types can be sent, but not moved (this currently applies
> to resources, anything including resources, and blocks)

Pinned types cannot be sent or moved. Also, in the current implementation, but not in the current design, kinds play a role in determining what can be copied (unique and shared can be copied).

> I'm arguing that this is not a very good system, and that we'd be
> better off considering 'copying' rather than 'moving' as a property.
> This was the way the system was originally designed, but when
> implementing it,

I agree that copying feels like the more important property, and the current system is unsatisfactory.

> I propose the following alternative, which allows resources to be
> moved again (*), greatly reduces the awkwardness of declaring types
> for generic functions, and defines a more specialized approach to
> safety for blocks (**), and actually helps define noncopyability (which
> the current system doesn't do at all).
> > We'd have the following kinds (better terms will have to be found, but
> I'm using these for clarity):
> - 'send,copy' types can be copied and sent (scalar types, unique boxes
> without resources in them)
> - 'nosend,copy' types can be copied, not sent (the same as the
> 'shared' types above, refcounted things)
> - 'nosend,nocopy' types can't be sent or copied (resources, unique
> boxes with resources in them, blocks)

These copy rules are implemented currently (at least to the extent I could guess where copying was happening), but they are not reflected in the big explanatory comments in kind.rs because they are contrary to Graydon's design. The difference here is that 'pinned' kinds become movable and blocks get some rules of their own. That seems ok.

> Contrary to the old kinds, this forms a hierarchy, and the kinds lower
> in the list can be treated as sub-kinds of those above them (a
> 'send,copy' value can be safely treated as a 'nosend,nocopy' value).
> Most generic functions will neither send nor copy values of their
> parameterized type, so they can safely default to 'nosend,nocopy'
> without losing genericity.

I'm skeptical that this will be the case, primarily because of return values and vectors.

> We'll have to define more closely when copying happens. I probably
> forgot some cases here, but an lvalue (rvalues are always conceptually
> moved) is copied when:
> - It is assigned to some other lvalue with =
> - It is copied into a lambda closure
> - It is passed as an argument in an unsafe way (safe argument passing
> does not copy)
> - It is used as the content of a newly created data structure (it is
> not yet clear how tag/resource constructors can be distinguished from
> other functions for this purpose. maybe they should take their args in
> move mode, though that would require one to always say
> option::some(copy localvar), which is also awkward)
> - It is returned from a function or block

Most operations on vectors involve copy.
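In modern Rust terms -- an anachronism relative to this 2011 thread, since none of today's syntax existed yet -- the proposed kind lattice corresponds roughly to trait bounds on type parameters: an unbounded parameter is the 'nosend,nocopy' default, and copying or sending must be requested explicitly. A sketch of that mapping (`Rc` standing in for a refcounted @-box, `String` for a non-copyable unique value; the function names are illustrative, not from the proposal):

```rust
use std::rc::Rc;
use std::thread;

// The proposed default for a generic parameter -- 'nosend,nocopy' -- is
// what an unbounded `T` gives you: the function may move the value
// through, but may not duplicate or send it.
fn pass_through<T>(x: T) -> T {
    x
}

// Copying must be requested explicitly, like the proposal's 'copy' bound.
fn duplicate<T: Copy>(x: T) -> (T, T) {
    (x, x)
}

// Sending likewise requires an explicit bound.
fn send_to_task<T: Send + 'static>(x: T) -> T {
    thread::spawn(move || x).join().unwrap()
}

fn main() {
    // Rc is a stand-in for a refcounted @-box: 'nosend,copy'-ish
    // (clonable but not sendable). An unbounded generic still accepts it.
    let shared = pass_through(Rc::new(7));
    assert_eq!(*shared, 7);

    // Scalars are 'send,copy'.
    assert_eq!(duplicate(3), (3, 3));
    assert_eq!(send_to_task(41) + 1, 42);

    // The "last use is an rvalue" idea: a non-Copy value's final use is a
    // move, while keeping it alive requires an explicit clone -- the moral
    // equivalent of writing option::some(copy localvar).
    let s = String::from("hello");
    let kept = s.clone();        // copy: two reachable versions afterwards
    let moved = pass_through(s); // move: last use of `s`
    assert_eq!(kept, moved);

    // duplicate(shared) or send_to_task(shared) would both fail to
    // compile, which is the clear kind error the proposal asks for.
}
```

The compile errors mentioned in the final comment are exactly the "clear, understandable error message" Marijn wants when a programmer forgets a bound.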
> It might be useful to consider the last use of a local (let) variable
> to be an rvalue, since that local will never be referenced again, and
> it is always safe to move out of it. This will cause most returns to
> become moves (though returning a non-move-mode argument or the content
> of a data structure is still a copy). The intention is to only
> annotate things as a copy where there will *actually* be two reachable
> versions of a value after the operation. (The translator pass is
> already optimizing most of the situations where this isn't the case
> into a move or construct-in-place. This system could move some of that
> logic into the kind-checking pass.)

This seems like the same kind of special casing that allows @resource to work, i.e. the kind checking pass already uses knowledge about the optimizations in trans to make some things work, and I agree with your assessment that the @resource thing is not good. A big problem we have now is that the type system cares about copyability and movability, and those properties are mostly defined by optimizations in trans. Now kind checking (maybe some other passes) and trans are both doing their own analyses of what is allowed and hoping they agree.

I'm all for any attempts to make the kind system understandable.

From banderson at mozilla.com Sat Nov 5 16:34:50 2011
From: banderson at mozilla.com (Brian Anderson)
Date: Sat, 05 Nov 2011 16:34:50 -0700
Subject: [rust-dev] LLVM is now a git submodule of Rust
Message-ID: <4EB5C81A.3020502@mozilla.com>

Rust's build system is now set up to manage the LLVM build automatically. There are two reasons for this: 1) building LLVM correctly is a consistent obstacle for newcomers, and 2) we're going to be carrying some patches to LLVM for a while to support split stacks. So now the build process is just ./configure && make. Nice. For those who still want to use their own LLVM, configure provides an --llvm-root option.
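For reference, the new checkout-and-build flow looks roughly like this (the repository URL is the graydon/rust tree mentioned elsewhere in this thread; `/path/to/llvm` is a placeholder):

```shell
# Clone the compiler together with its LLVM submodule.
git clone --recursive https://github.com/graydon/rust.git
cd rust

# If the tree was cloned without --recursive, fetch the submodule after the fact.
git submodule update --init

# Build; LLVM is configured and built automatically.
./configure
make

# Or point the build at an existing LLVM instead of the bundled one:
#   ./configure --llvm-root=/path/to/llvm
```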
The remote for our LLVM submodule is located here: https://github.com/brson/llvm. Hopefully, this will just be a temporary location until we finish the work on split stacks, at which point we can use the official git repo.

As part of this transition I recommend manually reconfiguring your Rust build. If you don't, the build will continue to use its existing LLVM installation, and will break mysteriously some day. I will be updating the instructions on the wiki shortly.

Regards,
Brian

From graydon at mozilla.com Tue Nov 8 07:41:27 2011
From: graydon at mozilla.com (Graydon Hoare)
Date: Tue, 08 Nov 2011 07:41:27 -0800
Subject: [rust-dev] LLVM is now a git submodule of Rust
In-Reply-To: <4EB5C81A.3020502@mozilla.com>
References: <4EB5C81A.3020502@mozilla.com>
Message-ID: <4EB94DA7.20908@mozilla.com>

On 05/11/2011 4:34 PM, Brian Anderson wrote:
> Rust's build system is now set up to manage the LLVM build
> automatically. There are two reasons for this: 1) building LLVM
> correctly is a consistent obstacle for newcomers, and 2) we're going to
> be carrying some patches to LLVM for a while to support split stacks. So
> now the build process is just ./configure && make. Nice. For those who
> still want to use their own LLVM configure provides an --llvm-root option.

Awesome, thanks. Submodules seem to be the best-of-the-bad ways of doing this sort of thing, I'm glad to see everyone's comfortable enough with them to move this way. I wasn't sure. So many years of wrestling the problem... (Next up, libicu submodule?
Hmm)

-Graydon

From graydon at mozilla.com Tue Nov 8 07:53:26 2011
From: graydon at mozilla.com (Graydon Hoare)
Date: Tue, 08 Nov 2011 07:53:26 -0800
Subject: [rust-dev] Kind system revision proposal
In-Reply-To:
References:
Message-ID: <4EB95076.5030208@mozilla.com>

On 04/11/2011 5:32 AM, Marijn Haverbeke wrote:
> Contrary to the old kinds, this forms a hierarchy, and the kinds lower
> in the list can be treated as sub-kinds of those above them (a
> 'send,copy' value can be safely treated as a 'nosend,nocopy' value).

The old scheme was a lattice as well. But I appreciate your motives here and the direction you're taking this. If you can make it hold together, I've no objection.

Try to tone down the negativity though? I had to hold off replying to this email longer than I wanted to while stifling my emotional reaction to all the judgmental language ("not good", "awkward", "doesn't do", "actually work", "useless", etc.) I did *try* in the existing kind system, did the best I could, and it has caught a lot of bugs and helped enforce our memory model. That said, I agree it's not ideal (counterintuitive in some places, ill-defined in others). It's just more productive to phrase things in terms of what is improved, which I see you're also doing.

A few notes (mostly vigorous agreement):

> Most generic functions will neither send nor copy values of their
> parameterized type, so they can safely default to 'nosend,nocopy'
> without losing genericity. When they *do* copy or send, and the
> programmer forgets to annotate the type parameter as such, a clear,
> understandable error message can be provided.

I think there's more copying than you expect, but "most" is something it's worth doing experiments on, certainly. I like nocopy/nosend as a default though, because it encourages writing in move-centric style, which is often synonymous with "efficient code" style.
> Generic types (type and tag) do not seem to have a reason to
> ever narrow their kind bounds in this system, so (unless I am missing
> something), they should probably not even be allowed to specify a kind
> on their parameters.

Agreed.

> It might be useful to consider the last use of a local (let) variable
> to be an rvalue, since that local will never be referenced again, and
> it is always safe to move out of it. This will cause most returns to
> become moves (though returning a non-move-mode argument or the content
> of a data structure is still a copy).

Agreed. Returns should be considered move-outs.

> Block safety will have to be handled on a more or less syntactic
> level. Values with block type can be restricted to only appear in
> function argument or callee position, and when appearing in argument
> position, the argument mode has to be by-reference. This is a kludge,
> but a relatively simple one (a small pass coming after typechecking
> can verify it).

This kludge is my only real concern, but if you can nail it down then go for it.

> *) Resources that can't be moved are rather useless. They can only
> exist as local variables and never be stored in a data structure. This
> is why there is currently a kludge in the kind-checking code to allow
> resources to be constructed into data structures. This is needlessly
> special-cased, and doesn't really work very well at the moment (you
> can't, for example, put a resource in a tag value.)

The existing approach required us to differentiate "initialization" as a separate evaluation context from "existing rvalue". We did not fully implement this distinction, which is why there are such gaps in the existing scheme. I agree that this scheme is counter-intuitive, and your proposed approach, by eliminating that distinction, may work better.
-Graydon

From graydon at mozilla.com Tue Nov 8 08:09:59 2011
From: graydon at mozilla.com (Graydon Hoare)
Date: Tue, 08 Nov 2011 08:09:59 -0800
Subject: [rust-dev] LLVM is now a git submodule of Rust
In-Reply-To: <4EB5C81A.3020502@mozilla.com>
References: <4EB5C81A.3020502@mozilla.com>
Message-ID: <4EB95457.3070108@mozilla.com>

Oh, I should also point out: it might be nice to have the configury and makefile notice when submodules are not expanded out, and emit a pleasant error rather than an obscure one. It's likely a lot of people will forget --recursive when cloning. I've filed this as https://github.com/graydon/rust/issues/1156

-Graydon

From graydon at mozilla.com Tue Nov 8 08:16:31 2011
From: graydon at mozilla.com (Graydon Hoare)
Date: Tue, 08 Nov 2011 08:16:31 -0800
Subject: [rust-dev] Fwd: Re: Naming convention for libraries
In-Reply-To: <4EB0492E.8060504@mozilla.com>
References: <4EB0492E.8060504@mozilla.com>
Message-ID: <4EB955DF.4060809@mozilla.com>

On 01/11/2011 12:31 PM, Brian Anderson wrote:
> To clarify, I looked up the proposed naming scheme and it's
> 'libname-hash-version.so' where the hash is a hash of the exported
> metadata (type hashes don't come into play here).

Yes, this is the way the compiler *should* be producing libraries by default. Even with the recent changes to "emitting proper library names" (eg. bug #744) I don't think we're quite doing it right yet. Better though! And I *think* it was probably safe to keep "std" as "std", not "ruststd", since there's unlikely to be a collision with any normal C-ABI stuff on libstd--0.1.so (or, I suppose, libstd-.so.0.1 on linux, as is their style)

The idea here is that this gives us a fast name-based scanning set of libraries to inspect while resolving a "use" directive, gives us the ability to dodge all (or all likely) name collisions in the field, and hooks us into whatever OS package-versioning primitives and conventions exist.
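To make the 'libname-hash-version.so' scheme concrete, here is a small sketch of how such a filename could be derived. It is purely illustrative: the function name is made up, and `DefaultHasher` merely stands in for whatever hash rustc actually applies to the exported metadata.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Build a library filename of the form "lib<name>-<hash>-<version>.so",
// where the hash is derived from the crate's exported metadata.
fn library_filename(name: &str, version: &str, exported_metadata: &str) -> String {
    let mut hasher = DefaultHasher::new();
    exported_metadata.hash(&mut hasher);
    // Truncate to 8 hex digits purely for readability.
    let hash = format!("{:08x}", hasher.finish() as u32);
    format!("lib{}-{}-{}.so", name, hash, version)
}

fn main() {
    let fname = library_filename("std", "0.1", "iface animal; fn speak();");
    println!("{}", fname);
    assert!(fname.starts_with("libstd-") && fname.ends_with("-0.1.so"));
}
```

Because the hash depends only on the exported metadata, two configurations of the same crate version get distinct filenames and can coexist in one directory, which is exactly the coexistence property discussed above.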
-Graydon

From pwalton at mozilla.com Tue Nov 8 08:18:33 2011
From: pwalton at mozilla.com (Patrick Walton)
Date: Tue, 08 Nov 2011 08:18:33 -0800
Subject: [rust-dev] Object system redesign
Message-ID: <4EB95659.8010804@mozilla.com>

Hi everyone,

I've been working on this for quite a while, and I think it's time to get it out in the open. I tried hard to keep this as minimal as possible, while addressing the very real issues in expressivity and developer ergonomics that we have run into with the current Rust object system. It also tries to preserve the benefits of our current object system (in particular, the convenience of interfaces).

This has been through several iterations by now, but it should still be considered a strawman proposal. Feedback is welcome.

Patrick

Object system redesign
======================

Motivation
----------

The current object system in Rust suffers from these limitations:

1. All method calls are virtual.
2. All objects must be stored on the heap.
3. There are no private methods.
4. **self** is not a first-class value, which means in particular that self-dispatch requires special machinery in the compiler.
5. There are no public fields.
6. Only single inheritance is supported through object extension.
7. Object extension requires the construction of forwarding and backwarding vtables.
8. Object types are structural and so cannot be recursive.

Additionally, it's a very common pattern to have a context record threaded throughout a module (most notably `trans.rs`). It would be convenient if the language provided a way to avoid having to directly thread common state through many functions.

Proposal
--------

There are three main components to this proposal: *classes*, *interfaces*, and *traits*. We describe each separately.

### Classes

A class is a nominal type that groups together related methods and fields.
A class declaration looks like this:

    class cat {
        priv {
            let mutable x : int;
            fn meow() { log_err "Meow"; }
        }
        let y : int;
        new(in_x : int, in_y : int) { x = in_x; self.y = in_y; }
        fn speak() { meow(); }
        fn eat() { ... }
    }

And its use looks like this:

    let c : cat = cat(1, 2);
    c.speak();

Class instances may be allocated in any region: on the stack, in the exchange heap, or on the task heap.

Fields in a class are immutable unless declared with the **mutable** keyword. They may, however, be mutated in the constructor. Fields and methods of the current instance may, but need not, be prefixed with `self.`.

Fields and methods in the **priv** section are private to the class (class-private, not instance-private). Fields and methods in the **pub** section (as well as fields and methods outside any section) are public.

The **new** keyword delimits the constructor inside the class declaration. It is not used to create instances of the class; rather, the class declaration results in the introduction of the constructor into the module as a first-class function with bare function type and the same name as the class itself. The constructor must initialize all fields of the object and cannot call any methods on **self** until it has done so. After calling a method on **self**, the constructor is not allowed to mutate any of its immutable fields. The flow analysis used in typestate enforces these invariants.

There is an alternate class form, **@class** (for example, `@class dog { ... }`), which has two effects:

1. An instance of a class declared with **@class** is always allocated on the task heap.
2. Within a class declared with **class**, **self** is not a first-class value. It may only be used to reference fields and methods. But in a class declared with **@class**, **self** is a first-class value.

Classes do not feature polymorphism or inheritance except through interfaces and traits. As a result, all method dispatch is non-virtual.
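As a historical footnote for today's readers: the proposed `class` form corresponds fairly closely to what eventually shipped as a struct with an inherent `impl`. A sketch of the `cat` example in modern Rust syntax -- an anachronistic translation, with the accessor `total` added by us so the private field is actually used:

```rust
// Private fields come from module privacy rather than a `priv` section,
// and methods live in an inherent impl with static (non-virtual) dispatch.
pub struct Cat {
    x: i32,      // class-private, like the `priv` section
    pub y: i32,  // public field
}

impl Cat {
    // Stands in for the `new(in_x, in_y)` constructor; `Cat::new` is an
    // ordinary first-class function, much as the proposal describes.
    pub fn new(in_x: i32, in_y: i32) -> Cat {
        Cat { x: in_x, y: in_y }
    }

    // Private method, visible only within the defining module.
    fn meow(&self) -> &'static str {
        "Meow"
    }

    pub fn speak(&self) -> &'static str {
        self.meow()
    }

    // Illustrative accessor (ours, not the proposal's) that reads the
    // private field.
    pub fn total(&self) -> i32 {
        self.x + self.y
    }
}

fn main() {
    // `let c : cat = cat(1, 2); c.speak();` from the proposal becomes:
    let c = Cat::new(1, 2);
    assert_eq!(c.speak(), "Meow");
    assert_eq!((c.y, c.total()), (2, 3));
}
```

The key similarity is that all of these method calls are statically dispatched, matching the proposal's "all method dispatch is non-virtual" claim for plain classes.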
A class may have a destructor notated by **drop**:

    class file {
        let fd : int;
        drop { os::close(fd); }
    }

Destructors may not reference any data in the task heap, in order to disallow object resurrection. Classes with destructors replace resources.

Class instances are copyable if and only if the class has no destructor and all of its fields are copyable. Class instances are sendable if and only if all of its fields are sendable and the class was not declared with **@class**.

The order of fields in a class instance is significant; its runtime representation is the same as that of a record with identical fields laid out in the same order.

Classes may be type-parametric. Methods may not be type-parametric.

### Interfaces

Interfaces are the way we achieve polymorphism. An interface is a nominal type. (This makes recursive types easier to deal with.) The following is an example of an interface:

    iface animal {
        fn speak();
        fn play();
    };

Interfaces allow us to create *views* of a class that expose only the subset of the methods defined in the interface:

    class cat {
        fn speak() { ... }
        fn play() { ... }
    }
    class dog {
        fn speak() { ... }
        fn play() { ... }
    }

    let c : @cat = @cat();
    let ac : animal = c as animal;
    let d : @dog = @dog();
    let ad : animal = d as animal;
    let animals = [ ac, ad ];
    vec::each(animals, { |a| a.speak(); });

Views are represented at runtime as a pointer to the class instance and a vtable. The methods in the vtable are laid out in memory in the same order they were defined in the interface. The class instance must be in the task heap; thus all views have shared kind.

Views are created with the **as** operator. The left-hand side of the **as** operator must be one of (a) a boxed instance of a class declared with **class**; (b) an unboxed instance of a class declared with **@class**; or (c) another view.
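Readers coming from modern Rust will recognize this `iface`-plus-views design as essentially trait objects: the `c as animal` coercion became coercion to `&dyn Trait` or `Box<dyn Trait>`, a pointer-to-instance paired with a vtable, with virtual dispatch on every call. A sketch of the cat/dog example in today's syntax (an anachronistic translation, not the 2011 proposal itself):

```rust
// The `iface animal` of the proposal, as a modern trait used as a
// trait object.
trait Animal {
    fn speak(&self) -> String;
}

struct Cat;
struct Dog;

impl Animal for Cat {
    fn speak(&self) -> String {
        "meow".to_string()
    }
}

impl Animal for Dog {
    fn speak(&self) -> String {
        "woof".to_string()
    }
}

fn main() {
    // The heap-allocated "views" of the proposal: each element is a
    // pointer to the instance plus a vtable, and every call through it
    // is dispatched virtually.
    let animals: Vec<Box<dyn Animal>> = vec![Box::new(Cat), Box::new(Dog)];
    let voices: Vec<String> = animals.iter().map(|a| a.speak()).collect();
    assert_eq!(voices, vec!["meow", "woof"]);
}
```

The `drop` destructor form from this same section likewise survives in modern Rust as the `Drop` trait, and a type implementing `Drop` cannot implement `Copy` -- the same copyability rule the proposal states.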
In all cases, the class instance or view on the left-hand side of the **as** operator must expose, for each method named in the view, a method whose name and signature matches. All calls through a view are dispatched through a vtable.

### Traits

Traits are the way we achieve implementation reuse. A trait is not a type but instead provides reusable implementations that can be mixed in to classes. Here is an example of a trait and its use:

    trait playful {
        req {
            let mutable is_tired : bool;
            fn fetch();
        }
        fn play() { if !is_tired { fetch(); } }
    }

    class dog : playful {
        let mutable is_tired : bool;
        fn fetch() { ... }
        ...
    }

A trait describes the fields and methods that the class inheriting from it must expose via a **req** block. The trait is allowed to reference those (and only those) fields and methods of the class.

Trait definitions are duplicated at compile time for each class that inherits from them. In particular, there is no virtual dispatch when calling a method of the class, so there is no runtime abstraction penalty for factoring out common functionality into a trait. Note that this means that a crate exposing a trait must be statically linked with the crates that implement the trait.

Traits may be combined together. The resulting class inherits all the methods of all the traits it derives from. For instance:

    trait hungry {
        req {
            let mutable is_hungry : bool;
        }
        fn feed() { ... }
    }

    class dog : playful, hungry { ... }

    let d = dog();
    d.fetch();
    d.feed();

It is a compile-time error to attempt to derive from two traits that define a method with the same name. Trait composition affords renaming of fields or methods:

    class dog : playful with play = please_play,
                hungry with eat = please_eat { ... }

    let d = dog();
    d.please_play();
    d.please_eat();

Fields or methods named in the **req** section can be renamed in the same way:

    class dog : playful with is_tired = is_sleepy {
        let mutable is_sleepy : bool;
        ...
    }

Traits may not define fields.
This prevents the diamond inheritance problem of C++. Trait implementations may call private methods and reference private fields of the class. The methods and fields named in the **req** section may be either public or private. The order of trait composition is not significant.

### Scoped object extensions (concepts, categories)

Niko has a proposal for this, so I'll let him present it.

Patrick

From graydon at mozilla.com Tue Nov 8 08:19:43 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 08 Nov 2011 08:19:43 -0800 Subject: [rust-dev] First skeleton of a tutorial In-Reply-To: References: Message-ID: <4EB9569F.1090207@mozilla.com> On 31/10/2011 8:24 AM, Marijn Haverbeke wrote:

> http://marijnhaverbeke.nl/rust_tutorial (I didn't have time to
> integrate it with rust-lang.net today, and that probably should wait
> until it's a bit more fleshed out).
>
> Please comment. There's a bunch of stuff completely missing for now
> (tasks, most notably).

This is great, thanks so much. I'll go through it in "editorial nit-picking" mode at some point in the coming weeks but at the moment I think it reads at just about the right level of detail. Much friendlier than diving into the lexical-class rules :)

-Graydon

From banderson at mozilla.com Tue Nov 8 09:12:34 2011 From: banderson at mozilla.com (Brian Anderson) Date: Tue, 8 Nov 2011 09:12:34 -0800 (PST) Subject: [rust-dev] Fwd: Re: Naming convention for libraries In-Reply-To: <4EB955DF.4060809@mozilla.com> Message-ID: <186209090.101887.1320772354670.JavaMail.root@zimbra1.shared.sjc1.mozilla.com> ----- Original Message ----- From: "Graydon Hoare" To: rust-dev at mozilla.org Sent: Tuesday, November 8, 2011 8:16:31 AM Subject: Re: [rust-dev] Fwd: Re: Naming convention for libraries On 01/11/2011 12:31 PM, Brian Anderson wrote:

> To clarify, I looked up the proposed naming scheme and it's
> 'libname-hash-version.so' where the hash is a hash of the exported
> metadata (type hashes don't come into play here).
Yes, this is the way the compiler *should* be producing libraries by default. Even with the recent changes to "emitting proper library names" (e.g. bug #744) I don't think we're quite doing it right yet. Better though! And I *think* it was probably safe to keep "std" as "std", not "ruststd", since there's unlikely to be a collision with any normal C-ABI stuff on libstd--0.1.so

This is an excellent point. I will file a bug to implement library naming as designed and revert to the name-based crate search.

From dteller at mozilla.com Tue Nov 8 09:12:52 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Tue, 08 Nov 2011 18:12:52 +0100 Subject: [rust-dev] Object system redesign In-Reply-To: <4EB95659.8010804@mozilla.com> References: <4EB95659.8010804@mozilla.com> Message-ID: <4EB96314.1050202@mozilla.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/8/11 5:18 PM, Patrick Walton wrote:

> The **new** keyword delimits the constructor inside the class
> declaration. It is not used to create instances of the class;
> rather, the class declaration results in the introduction of the
> constructor into the module as a first-class function with bare
> function type and the same name as the class itself.

Does this mean that we can only have one constructor? If we wish to have several constructors, and if we accept that they must not have the same name, the class could yield a full module, in which each constructor is a function.

> There is an alternate class form, **@class** (for example, `@class
> dog { ... }`), which has two effects:

What is the rationale behind **@class**?

> Destructors may not reference any data in the task heap, in order
> to disallow object resurrection.

So a destructor cannot trigger cleanup of any heap-allocated data?

> Class instances are copyable if and only if the class has no
> destructor and all of its fields are copyable.
Class instances are > sendable if and only if all of its fields are sendable and the > class was not declared with **@class**. No destructor? I would rather have guessed no constructor, what am I missing? > Classes may be type-parametric. Methods may not be > type-parametric. Hmmm... What is the rationale? > ### Interfaces I like it. Be warned that someone is bound to ask for RTTI to pattern-match on interfaces and/or on interface fields. > ### Traits Ahah, I was just wondering if there were any plans for functors :) > ### Scoped object extensions (concepts, categories) > > Niko has a proposal for this, so I'll let him present it. Generally, this looks very good. One more question, though: what about class-less objects? I feel that they could be quite useful to provide defer-style cleanup. Cheers, David -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (Darwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJOuWMUAAoJED+FkPgNe9W+Ml8H/2cBdFC3Uc6RYNFaNIFBjyYu DKUUDDtDW/7TXns44RBEZtiR2whT+CSq5QdoDvs3DLYlaTl71kU0AwhjiDbYrU4b WFom51ddJVZBpu0w/sLkmD6JCP/CY3nQ0QDhvtCqf0KnqKsq9DlKmPqFK3LOny51 IqS02tMBG1JK+77Lwuixdh0oNYBi+pCVCqlynioS+RRzlArTpwSniC7FEqrMcoz9 CL1OvYZ2OatbKuK+QDv/hE+HUuJLPK8CwkoBXPtDWADzc3EEa1mJSHRcjJuZuInK m4p4blbrjuOB1Rs5faCo15F7raT5e+Re04edQAE3cTjeQhDiNMjhkYQUcO7U+Ug= =gFHf -----END PGP SIGNATURE----- From graydon at mozilla.com Tue Nov 8 09:24:37 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 08 Nov 2011 09:24:37 -0800 Subject: [rust-dev] Object system redesign In-Reply-To: <4EB95659.8010804@mozilla.com> References: <4EB95659.8010804@mozilla.com> Message-ID: <4EB965D5.1040303@mozilla.com> On 08/11/2011 8:18 AM, Patrick Walton wrote: > Hi everyone, > > I've been working on this for quite a while, and I think it's time to > get it out in the open. 
> I tried hard to keep this as minimal as possible, while addressing the
> very real issues in expressivity and developer ergonomics that we have
> run into with the current Rust object system. It also tries to preserve
> the benefits of our current object system (in particular, the
> convenience of interfaces).
>
> This has been through several iterations by now, but it should still be
> considered a strawman proposal. Feedback is welcome.

I'm still just observing a bit of the action from a distance, but I thought I'd give a few bits of response:

- Overall: technically strong proposal, but I'm lukewarm about doing it, and definitely don't want to *now*. In order, I'd prefer to finish what we've got scoped for the next while, then explore Niko's sketch, then get to this if we find it necessary.

- If priv {} is class-private, I think this means adding global state to rust? I'm not sure, but that's what I think 'class private' (vs. 'instance private') means. If so I'd prefer to rethink that.

- The treatments of class vs. @class, copy/send-ability, implied location in the task heap (vs. exchange), etc. are, I think, not quite well enough baked. It's hard. I realize it's hard. But I think it's a bit too ad-hoc as proposed. It'll surprise users constantly (including us).

- The trait/class/interface split, while I think elegant and conceptually accurate when it comes to what's often under the covers in OO systems, is conceptually *heavy* to users: it makes you think a lot about OO-design-theory to write code in. It also carries a high engineering cost to implement (pickling across crates, lots of logic for simply implementing these abstractions in-crate). These costs are the first and main thing on my mind when reading.

- The existing obj system is very specifically there for "flexible integration of unrelated code". Hence defaulting to the loose-coupled version of every OO-system device (all-virtual, all-private, ad-hoc methods as types, no hierarchy).
It's as minimal as possible in order to accomplish that one role, no other. I think it's an important role since combining not-really-related code in the field is a very common occurrence. What you're proposing with 'class' is, IIUC, geared towards people who really want to write their modules in OO style from the get-go, who "think" in classes. I recognize this is an existing large audience, but am ... I guess overall less sympathetic or interested simply due to my own experience working with systems in OO style: they suffer from a unique sort of "abstraction spaghetti", where control and data have been prematurely over-factored, to the point of being hard to understand. I gather this is my own taste-bias though, and will try to keep that aesthetic response separate (if possible).

- Given that this doesn't subsume the built-in tycons -- [], {}, tag don't turn into classes -- I'm more inclined towards initially exploring Niko's scheme and seeing if we really need an independent concept of a class. I realize we're not able to factor as aggressively as we'd like in some cases now; I'm not convinced taking this cost is the best way to do it.

I realize that, by volume, that's a lot of negative. I'm sorry for that level of negativity. Let me reiterate that this is a conceptually strong direction to take an OO system, particularly if you want to encourage more OO-style coding. My own objections -- beyond the technical hair-splitting around heaps and private global state -- are mostly to do with "do we want something like that" and "can we afford it". Philosophical more than technical.
-Graydon From dherman at mozilla.com Tue Nov 8 09:29:11 2011 From: dherman at mozilla.com (David Herman) Date: Tue, 8 Nov 2011 09:29:11 -0800 Subject: [rust-dev] Object system redesign In-Reply-To: <4EB965D5.1040303@mozilla.com> References: <4EB95659.8010804@mozilla.com> <4EB965D5.1040303@mozilla.com> Message-ID: <7CA51003-4C7D-402F-9C37-795D84DFCA21@mozilla.com> Just a couple detail responses: > - Overall: technically strong proposal, but I'm lukewarm about doing it, > and definitely don't want to *now*. In order, I'd prefer to finish > what we've got scoped for the next while, then explore Niko's > sketch, then get to this if we find it necessary. I agree, and I think Patrick does too. > - If priv {} is class-private, I think this means adding global state to > rust? I'm not sure, but that's what I think 'class private' (vs. > 'instance private') means. If so I'd prefer to rethink that. No global state; by "class-private" it simply means that all instances of the same class can see each other's private fields. They're still instance fields. Dave From graydon at mozilla.com Tue Nov 8 09:29:49 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 08 Nov 2011 09:29:49 -0800 Subject: [rust-dev] Renaming "tag" and "log_err" In-Reply-To: <4EABEC4A.9010706@mozilla.com> References: <4EAB0A0A.5070905@mozilla.com> <4EAB2BCC.6000607@mozilla.com> <4EABEC4A.9010706@mozilla.com> Message-ID: <4EB9670D.2040607@mozilla.com> On 29/10/2011 5:06 AM, David Rajchenbach-Teller wrote: > Between "enum" and "union", I tend to favor "enum", for a simple > reason: > - attempting to use a variant as a C-style or Java-style enum will work > flawlessly; > - by opposition, attempting to use a variant as a C-style union will > fail for reasons that will be very unclear for C programmers. If we have to rename it, I'd go with enum as well. 
Also for mapping C libs into rust-ese, it'll be nice to support providing-the-numbers style, as in: "enum x { foo=2; bar=16; }", which we currently don't support, but should. I'm not sure it's *that* weird for "enum" to be overloaded to mean "newtype" as well. People regularly use enum-in-C++ for just that reason: to make a type-disjoint int-like-thing. Let's try just renaming it and see how it goes. -Graydon From graydon at mozilla.com Tue Nov 8 09:38:00 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 08 Nov 2011 09:38:00 -0800 Subject: [rust-dev] Renaming "tag" and "log_err" In-Reply-To: References: <4EAB0A0A.5070905@mozilla.com> <4EAB2BCC.6000607@mozilla.com> <4EABEC4A.9010706@mozilla.com> <4EAC1922.6070201@mozilla.com> <4EAC66CF.2020105@mozilla.com> <4EAC6C69.5080801@mozilla.com> Message-ID: <4EB968F8.4050202@mozilla.com> On 31/10/2011 2:39 AM, Marijn Haverbeke wrote: > Also relevant here: log_err was originally added as a stopgap > temporary solution, with the idea being that logging eventually would > be a more primitive operation where you specified both a log level > and a message, and there would be a macro that'd help you do this in a > more nice-looking way. We have four log levels, but are only using two > of them at the moment. Making log polymorphic has removed one reason > for making it a macro (the idea was to integrate #fmt), but proper > support for more log levels would still be nice. I agree. log_err was a kludge and I'd prefer to un-kludge it before shipping rather than adding another keyword. Multiple log-levels is the way to go. Macro if there's something relatively easy, or just keep 'log' as compiler-supported, but extend the syntax and include a bunch of log-level numeric constant names in the prelude (err, warn, info, debug, say?) > (Tangentially, the standard prelude, if such a thing materializes, > would be the ideal spot for such a logging macro.) Yes. Or just the constants, as above. 
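Returning to the enum point earlier in this message: the providing-the-numbers style Graydon asks for is what explicit enum discriminants later became. A sketch in modern Rust syntax (names follow Rust conventions rather than the `enum x { foo=2; bar=16; }` spelling above):

```rust
// Explicit discriminants, handy for mirroring a C library's constants.
#[derive(Clone, Copy, PartialEq, Debug)]
enum X {
    Foo = 2,
    Bar = 16,
}

fn main() {
    // A C-like enum casts to its discriminant with `as`.
    assert_eq!(X::Foo as i32, 2);
    assert_eq!(X::Bar as i32, 16);
}
```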
-Graydon From pwalton at mozilla.com Tue Nov 8 09:41:29 2011 From: pwalton at mozilla.com (Patrick Walton) Date: Tue, 08 Nov 2011 09:41:29 -0800 Subject: [rust-dev] Renaming "tag" and "log_err" In-Reply-To: <4EB968F8.4050202@mozilla.com> References: <4EAB0A0A.5070905@mozilla.com> <4EAB2BCC.6000607@mozilla.com> <4EABEC4A.9010706@mozilla.com> <4EAC1922.6070201@mozilla.com> <4EAC66CF.2020105@mozilla.com> <4EAC6C69.5080801@mozilla.com> <4EB968F8.4050202@mozilla.com> Message-ID: <4EB969C9.50106@mozilla.com> > I agree. log_err was a kludge and I'd prefer to un-kludge it before > shipping rather than adding another keyword. Multiple log-levels is the > way to go. Macro if there's something relatively easy, or just keep > 'log' as compiler-supported, but extend the syntax and include a bunch > of log-level numeric constant names in the prelude (err, warn, info, > debug, say?) By the way, `alert` was suggested in a bug as a replacement for log_err (and as a synonym for log(err, ...)), which Marijn and I like. Has a cute JavaScripty feel to it. Patrick From graydon at mozilla.com Tue Nov 8 09:41:52 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 08 Nov 2011 09:41:52 -0800 Subject: [rust-dev] Renaming "tag" and "log_err" In-Reply-To: <60CACA18-33E9-4323-9B2A-EBAB63795021@mozilla.org> References: <4EAB0A0A.5070905@mozilla.com> <4EAB2BCC.6000607@mozilla.com> <4EABEC4A.9010706@mozilla.com> <4EAC1922.6070201@mozilla.com> <60CACA18-33E9-4323-9B2A-EBAB63795021@mozilla.org> Message-ID: <4EB969E0.9060701@mozilla.com> On 29/10/2011 1:34 PM, Brendan Eich wrote: > This seems like it will pay off for many, and rarely or never bite back. Have to keep it small, of course. Likely yes. Though we should offer a compiler flag / crate attribute to disable the auto-import of it. 
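The leveled logging being discussed, one primitive taking a level and a message with constants like err/warn/info/debug available by default, can be mocked up with an ordinary enum and a threshold check. All names here are hypothetical illustration, not the API that was eventually adopted:

```rust
// Hypothetical sketch: the four levels mentioned in the thread,
// ordered so that lower numbers are more severe.
#[derive(Clone, Copy, PartialEq, PartialOrd, Debug)]
enum Level {
    Err = 0,
    Warn = 1,
    Info = 2,
    Debug = 3,
}

struct Logger {
    threshold: Level,
    lines: Vec<String>,
}

impl Logger {
    fn log(&mut self, level: Level, msg: &str) {
        // Only messages at or above the threshold's severity are kept.
        if level <= self.threshold {
            self.lines.push(format!("{:?}: {}", level, msg));
        }
    }
}

fn main() {
    let mut l = Logger { threshold: Level::Warn, lines: Vec::new() };
    l.log(Level::Err, "boom");    // kept
    l.log(Level::Debug, "trace"); // filtered out
    assert_eq!(l.lines, vec!["Err: boom"]);
}
```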
-Graydon From pwalton at mozilla.com Tue Nov 8 09:42:26 2011 From: pwalton at mozilla.com (Patrick Walton) Date: Tue, 08 Nov 2011 09:42:26 -0800 Subject: [rust-dev] Renaming "tag" and "log_err" In-Reply-To: <4EB969E0.9060701@mozilla.com> References: <4EAB0A0A.5070905@mozilla.com> <4EAB2BCC.6000607@mozilla.com> <4EABEC4A.9010706@mozilla.com> <4EAC1922.6070201@mozilla.com> <60CACA18-33E9-4323-9B2A-EBAB63795021@mozilla.org> <4EB969E0.9060701@mozilla.com> Message-ID: <4EB96A02.8050207@mozilla.com> On 11/8/11 9:41 AM, Graydon Hoare wrote: > Likely yes. Though we should offer a compiler flag / crate attribute to > disable the auto-import of it. In fact, we'll have to, in order to bootstrap. Patrick From pwalton at mozilla.com Tue Nov 8 09:46:05 2011 From: pwalton at mozilla.com (Patrick Walton) Date: Tue, 08 Nov 2011 09:46:05 -0800 Subject: [rust-dev] Object system redesign In-Reply-To: <4EB96314.1050202@mozilla.com> References: <4EB95659.8010804@mozilla.com> <4EB96314.1050202@mozilla.com> Message-ID: <4EB96ADD.3020408@mozilla.com> On 11/8/11 9:12 AM, David Rajchenbach-Teller wrote: > Does this mean that we can only have one constructor? > If we wish to have several constructors ? and if we accept that they > must not have the same name ? the class could yield a full module, in > which each constructor is a function. Yeah, we may want multiple constructors. I left it out due to simplicity, but I don't have any opposition. If we do have them, they should be named though, as you say. > What is the rationale behind **@class**? Because you may want to e.g. register the instance with an observer, and if all you have is an alias to self you can't do that. > So a destructor cannot trigger cleanup of any heap-allocated data? Right, but keep in mind that if a class instance contains the last reference to heap-allocated data, the destructor on that heap-allocated data will be called as usual. 
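Patrick's answer here, that the destructor of owned heap data still runs when its last owner goes away, matches how drop ordering works in the Rust that eventually shipped. A sketch in modern, post-1.0 syntax; the shared counter is only instrumentation for this example:

```rust
use std::cell::Cell;
use std::rc::Rc;

// Stand-in for heap-allocated data with its own destructor.
struct Inner {
    drops: Rc<Cell<u32>>,
}

impl Drop for Inner {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// A "class instance" owning the inner data. Note that neither type can
// be Copy: Inner implements Drop, and copying would run the cleanup
// twice -- the same reason the proposal makes destructor-bearing
// classes non-copyable.
struct Outer {
    _inner: Inner,
}

fn demo() -> (u32, u32) {
    let drops = Rc::new(Cell::new(0));
    let before;
    {
        let _o = Outer { _inner: Inner { drops: Rc::clone(&drops) } };
        before = drops.get(); // still alive: no destructor has run
    } // _o goes out of scope; Inner's destructor runs exactly once
    (before, drops.get())
}

fn main() {
    assert_eq!(demo(), (0, 1));
}
```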
>> Class instances are copyable if and only if the class has no
>> destructor and all of its fields are copyable. Class instances are
>> sendable if and only if all of its fields are sendable and the
>> class was not declared with **@class**.
>
> No destructor? I would rather have guessed no constructor; what am I
> missing?

Destructors are for classes that encapsulate things like OS file descriptors. If you copy their contents, you'd end up closing the file twice.

>> Classes may be type-parametric. Methods may not be
>> type-parametric.
>
> Hmmm... What is the rationale?

Mostly because it's what we did before. I wouldn't be opposed to making methods type-parametric either, honestly.

> I like it. Be warned that someone is bound to ask for RTTI to
> pattern-match on interfaces and/or on interface fields.

Rust has always been slated to have reflection -- RTTI is baked into the system already, it's just a matter of implementing it.

> One more question, though: what about class-less objects? I feel that
> they could be quite useful to provide defer-style cleanup.

I'm not opposed to that, if we can make it work. Keep in mind that you can define nominal classes (and nominal items generally) inside functions already, so this is mostly a question of avoiding giving a class a name (and maybe closing over surrounding lexical items; the same memory management issues we have with closures apply, of course).

Patrick

From niko at alum.mit.edu Tue Nov 8 10:01:16 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 8 Nov 2011 10:01:16 -0800 Subject: [rust-dev] Object system redesign In-Reply-To: <4EB95659.8010804@mozilla.com> References: <4EB95659.8010804@mozilla.com> Message-ID: <2EEBCF89-9C94-4519-8C8D-43F2BBDE60E7@alum.mit.edu>

> ### Scoped object extensions (concepts, categories)
>
> Niko has a proposal for this, so I'll let him present it.

Here is my proposal. It is still fairly raw. If you'd prefer to read it marked up, you can find it at .
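Niko's categories, described below, are close in spirit to what modern Rust idiom calls an extension trait: methods defined apart from the type, imported into scope, and dispatched statically. A sketch in post-1.0 syntax; the trait and method names are invented for illustration:

```rust
// An "extension trait": a bundle of methods attached to an existing
// type, callable only where the trait is in scope -- matching the
// proposal's "method bundles can be imported", with static dispatch.
trait VecMethods {
    fn second(&self) -> Option<&i32>;
}

impl VecMethods for Vec<i32> {
    fn second(&self) -> Option<&i32> {
        self.get(1)
    }
}

fn main() {
    let v = vec![10, 20, 30];
    assert_eq!(v.second(), Some(&20));
}
```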
# Categories

## Goals

This proposal is a bridge between Haskell's type classes and class-based OOP. It also addresses a pet theme of mine, which is the ability to add methods to classes and define groups of related methods together. The core unit of organization is called a category, and it is a bundle of statically dispatched methods based on a shared receiver type (a form of static overloading, essentially). Traits can be used as before to implement these methods and provide implementation inheritance. Interfaces can be used as before for polymorphism.

Categories can either complement the classes found in pcwalton's proposal or they can replace them altogether. If one preserves classes, you can have a stronger notion of privacy. Preserving classes also allows for destructors, which don't make sense in a system based purely on categories.

## Syntax

I assume a nominal type system, though this is not a requirement. Records are declared using the `struct` keyword. There is no class keyword. Instead, there is the `category` keyword that is used to declare a group of methods. It works as follows:

    category name(T) {
        fn foo(params) -> ret { /* self has type &T */ }
        priv fn bar(params) -> ret { /* self has type &T */ }
    }

This declares a set of methods that can be invoked on any instance of type `T`, which can be any type. The method bundle has a name (`name`), but it is not a type. Method bundles can be imported. When a method call `rcvr.foo(...)` is seen, the type of `rcvr` is resolved and then all imported method bundles are searched for a function `foo()`. If one is found, then the call is statically dispatched.

## Structs

Although not necessary, I think the system works more smoothly if we move to nominal records, which I have termed struct. The syntax would be:

    struct T<...> {
        member1: T1;
        mutable member2: T2;
        priv member3: T3;
    }

and so forth.

## Access control

Methods and fields may be marked as private.
The effect of this is to disallow access to those members except (a) when the instance of the struct is initially created and (b) from methods declared on that struct type. There have been objections that this notion of private is not very, well, private. This is true. A stronger notion could be achieved by allowing methods to be defined inside of structs, and saying that those methods are the only ones with access to the private fields (in that case, structs are basically classes, because they would combine fields and a default category of methods, as well as a constructor). ## Traits One can incorporate traits in the typical way: category name(T) : trait1, trait2 { ... } If an object is cast to an interface, the set of imported methods is searched to find matching objects for each interface method. ## Constructors There is no need of constructors in this system. A constructor is just a function that returns an instance of the struct: fn make_struct_T(m1: T1, m2: T2) -> T { ret { m1: m1, m2: m2, m3: initial_value_for_private_field }; } ## Generics Method blocks may include generic parameters. For example, the following category category vec_mthds([A]) { fn len() -> int { // self has type &[A]: ret vec::len(self); } } ## Regions, boxes, etc self is always passed by reference and therefore has the type `&T` where `T` is the type of the method receiver. A function name can be prefixed by `@` to require that self has the type `@T`. ## Potential abuse Because multiple blocks of methods can be defined for any given type, there is the chance for ambiguity and odd scenarios. Consider: struct T { ... } mthds foo1(T) { fn bar() { ... } } mthds foo2(T) { fn bar() { ... } } iface inter { fn bar(); } This raises several questions: - What happens when t.bar() is invoked if both groups of methods are in scope? - I think the result is a static error. Perhaps we allow the syntax t.foo1::bar() to make it clear. - What happens when t is cast to an instance of "inter"? 
- Again a static error.

Finally, this also raises the potential to have two instances of the interface `inter`, both based on the same receiver, but with different vtables and hence different definitions of `bar()`! This can arise if you have two separate modules like so:

Module 1:

    import T, inter;
    mthds foo1(T) {
        fn bar() { ... }
    }
    fn make_i(t: T) -> i {
        ret t as inter; // uses foo1::bar()
    }

Module 2:

    import T, inter;
    mthds foo2(T) {
        fn bar() { ... }
    }
    fn make_i(t: T) -> i {
        ret t as inter; // uses foo2::bar()
    }

I see this potential for abuse as a fair trade for the power of defining methods on any type and also breaking them up into categories and so forth. Others may disagree.

From niko at alum.mit.edu Tue Nov 8 10:04:54 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 8 Nov 2011 10:04:54 -0800 Subject: [rust-dev] Object system redesign In-Reply-To: <4EB965D5.1040303@mozilla.com> References: <4EB95659.8010804@mozilla.com> <4EB965D5.1040303@mozilla.com> Message-ID:

> - The trait/class/interface split, while I think elegant and
> conceptually accurate when it comes to what's often under the covers
> in OO systems, is conceptually *heavy* to users: it makes you think
> a lot about OO-design-theory to write code in. It also carries a
> high engineering cost to implement (pickling across crates, lots
> of logic for simply implementing these abstractions in-crate).
> These costs are the first and main thing on my mind when reading.

It is certainly possible that the three abstractions will feel ponderous in practice and I do think this is something to be avoided. In general, I think having many ways to achieve the same goal in a language is dangerous because it requires one to mentally choose what design to use. However, I think the strict dissection of class/iface/trait in this proposal does help in that regard: when polymorphism is required, you must use an interface. When code reuse is desired, you must use traits. Otherwise, you can get away with classes.
In other words, there is usually only one way to do it. Ultimately this does seem hard to judge without using the system. One other question is the code evolution path: presumably most users will start with a class-based design. Adding traits to achieve code reuse is easy and can be done in a backwards-compatible fashion. Changing a class into an interface is mostly sound, but for a couple of corner cases: constructors and fields. I am not so concerned about constructors, but having no way to model fields in interfaces will result in an annoying round of "search and replace foo.f with foo.f()". Maybe this is ok, particularly as Rust generally aims to make costs transparent, and so a field access should never look like a method call. I personally would favor simple properties (e.g., `this.f` desugars into `this.get_f()` and `this.set_f()` if no field `f` is defined) but these do hide costs and they also introduce more than one way to do things. Niko From niko at alum.mit.edu Tue Nov 8 10:14:21 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 8 Nov 2011 10:14:21 -0800 Subject: [rust-dev] Object system redesign In-Reply-To: References: <4EB95659.8010804@mozilla.com> <4EB965D5.1040303@mozilla.com> Message-ID: In case it's not clear, I should add that I like the proposal overall. It feels reminiscent of C++ in that you have fine-grained control over the costs of your abstractions but without the overwhelming complexity that C++ brings to the table. Niko On Nov 8, 2011, at 10:04 AM, Niko Matsakis wrote: >> - The trait/class/interface split, while I think elegant and >> conceptually accurate when it comes to what's often under the covers >> in OO systems, is conceptually *heavy* to users: it makes you think >> an lot about OO-design-theory to write code in. It also carries a >> high engineering cost to implement (pickling across crates, lots >> of logic for simply implementing these abstractions in-crate). 
>> These costs are the first and main thing on my mind when reading. > > It is certainly possible that the three abstractions will feel ponderous in practice and I do think this is something to be avoided. In general, I think having many ways to achieve the same goal in a language is dangerous because it requires one to mentally choose what design to use. However, I think the strict dissection of class/iface/trait in this proposal does help in that regard: when polymorphism is required, you must use an interface. When code reuse is desired, you must use traits. Otherwise, you can get away with classes. In other words, there is usually only one way to do it. Ultimately this does seem hard to judge without using the system. > > One other question is the code evolution path: presumably most users will start with a class-based design. Adding traits to achieve code reuse is easy and can be done in a backwards-compatible fashion. Changing a class into an interface is mostly sound, but for a couple of corner cases: constructors and fields. I am not so concerned about constructors, but having no way to model fields in interfaces will result in an annoying round of "search and replace foo.f with foo.f()". Maybe this is ok, particularly as Rust generally aims to make costs transparent, and so a field access should never look like a method call. > > I personally would favor simple properties (e.g., `this.f` desugars into `this.get_f()` and `this.set_f()` if no field `f` is defined) but these do hide costs and they also introduce more than one way to do things. 
> > > Niko > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > From dteller at mozilla.com Tue Nov 8 12:01:05 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Tue, 08 Nov 2011 21:01:05 +0100 Subject: [rust-dev] Renaming "tag" and "log_err" In-Reply-To: <4EB969C9.50106@mozilla.com> References: <4EAB0A0A.5070905@mozilla.com> <4EAB2BCC.6000607@mozilla.com> <4EABEC4A.9010706@mozilla.com> <4EAC1922.6070201@mozilla.com> <4EAC66CF.2020105@mozilla.com> <4EAC6C69.5080801@mozilla.com> <4EB968F8.4050202@mozilla.com> <4EB969C9.50106@mozilla.com> Message-ID: <4EB98A81.7060404@mozilla.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/8/11 6:41 PM, Patrick Walton wrote: > By the way, `alert` was suggested in a bug as a replacement for > log_err (and as a synonym for log(err, ...)), which Marijn and I > like. Has a cute JavaScripty feel to it. +1 (although any self-respecting JS developer knows that alert() should never be used and that the correct function to call is console.error() ) Cheers, David -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (Darwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJOuYqBAAoJED+FkPgNe9W+uB4IAJF9JfWMdKyiQNyTRvVd5FhO 0aQ0PkjWUVNjk9PZxvErT3B5vpaQb/fHHeYehdwDPe4Q2MyhN7aSoy6UlDgg0u5z aLPoVusa0npLuap+CBdu6MS/yPjuNUkqbsjxFg0Vn+Dxu56eiOPbx2XWtxrDrOw+ sIxP+rTlXYya5lt4Eq/XbafIEl0ykDXVuJjKHYGZ76fqlENwnEJm9UXg7nYXZEW5 kuI3YXeTyynD7iTls+FGZAP+7obhugbxKpRcAlAAxiFI5n4O99vSZOTN/f1dKwo2 F1IEQHWICeKgqzWt7CnxEK2X+HQRmBdAwXpOFLG6Am3LrG4isUAjHNWYdKUiR34= =lj+o -----END PGP SIGNATURE----- From elly at leptoquark.net Wed Nov 9 16:39:50 2011 From: elly at leptoquark.net (Elly Jones) Date: Wed, 9 Nov 2011 18:39:50 -0600 Subject: [rust-dev] Bindings for libcrypto Message-ID: <20111110003950.GE24012@leptoquark.net> In principle, there should exist Rust crypto libraries. 
I would be happy to implement them, but there are a couple of competing forces at work, and I'd like opinions.

First, the forces:

1) The more portable, the better.
2) All other things being equal, Rust code is better than native code.
3) Writing one's own crypto is historically a very poor idea.

The options:

1) Write our own crypto from scratch, in Rust.
2) Write bindings for OpenSSL's libcrypto.
3) Write bindings for something else external.
4) Pull something else external into rustrt, write bindings for that.

My evaluations:

I think option 1 is a nonstarter, at least for me, since a) I'm not really qualified to do it, b) it would have ~0 eyes on it, most likely, and c) the world really doesn't need another crypto implementation from scratch with different bugs and a different subset of features.

Option 2 is tempting, but libcrypto is _large_ (~1.7M on my system). The advantage of this is that we don't need to pull libcrypto into the basis, since openssl is installed on basically every system. This also gives us support for a truly huge selection of cryptographic primitives.

Option 3 is also tempting. There are a lot of other libraries, but they all seem to fall along a sliding scale between "elephantine" and "un-audited", with OpenSSL at one extreme and things like PolarSSL at the other.

Option 4 is probably not a very good idea. Doing this leaves us responsible for picking up upstream security fixes to the library we'd have pulled in, and if we picked a large, well-audited one (e.g. OpenSSL) it would increase our source tree size and build times commensurately. On the other hand, if we do this, every Rust program can depend on having crypto primitives present at all times.

I'm leaning toward option 2, I think. Does anyone have other thoughts?

-- elly

-------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 490 bytes Desc: Digital signature URL: From giles at thaumas.net Wed Nov 9 17:00:31 2011 From: giles at thaumas.net (Ralph Giles) Date: Wed, 9 Nov 2011 17:00:31 -0800 Subject: [rust-dev] Bindings for libcrypto In-Reply-To: <20111110003950.GE24012@leptoquark.net> References: <20111110003950.GE24012@leptoquark.net> Message-ID: On 9 November 2011 16:39, Elly Jones wrote: > 1) Write our own crypto from scratch, in Rust. > 2) Write bindings for OpenSSL's libcrypto. > 3) Write bindings for something else external. > 4) Pull something else external into rustrt, write bindings for that. Do (2) and then use it to validate a much smaller set of routines in (1). -r From as at hacks.yi.org Wed Nov 9 17:27:33 2011 From: as at hacks.yi.org (austin seipp) Date: Wed, 9 Nov 2011 19:27:33 -0600 Subject: [rust-dev] Bindings for libcrypto In-Reply-To: <20111110003950.GE24012@leptoquark.net> References: <20111110003950.GE24012@leptoquark.net> Message-ID: On Wed, Nov 9, 2011 at 6:39 PM, Elly Jones wrote: > The options: > > 1) Write our own crypto from scratch, in Rust. > 2) Write bindings for OpenSSL's libcrypto. > 3) Write bindings for something else external. > 4) Pull something else external into rustrt, write bindings for that. > ... > snip > ... > Option 2 is tempting, but libcrypto is _large_ (~1.7M on my system). The > advantage of this is that we don't need to pull libcrypto into the basis, since > openssl is installed on basically every system. This also gives us support for a > truly huge selection of cryptographic primitives. > > Option 3 is also tempting. There are a lot of other libraries, but they all seem > to fall along a sliding scale between "elephantine" and "un-audited", with > OpenSSL at one extreme and things like PolarSSL at the other. Pragmatically I am in support of option 2, since indeed, almost every machine will have a recent version of OpenSSL installed. 
In the name of option 3, I'd like to bring on Dan Bernstein's NaCl project, for which I've been writing Haskell bindings: http://nacl.cace-project.eu - it's very simple and a very nice library. Unfortunately at the moment I don't think it's a realistic option, because it has a strange compilation model (djb software, who'da thought) in order to select optimized crypto primitives, and extracting the portable reference implementations in a way that's totally compatible with optimized implementations is difficult (although I'm working on this). It otherwise has some nice, very attractive properties, though - and it's djb, so you know it's good. That's just me thinking out loud. For the long haul, I'd say 2 is probably the way to go it seems from a distribution standpoint. Doesn't Mozilla also have their own cryptographic networking library? Network Security Services (NSS), I believe? Either way, me and others can write various crypto bindings to other libraries for Rust if needed/desired. If the proposal is to find something and get it into the standard library as it stands - for wide-spread usage - OpenSSL may be the only serious contender, I'm afraid. -- Regards, Austin From elly at leptoquark.net Wed Nov 9 17:04:57 2011 From: elly at leptoquark.net (Elly Jones) Date: Wed, 9 Nov 2011 19:04:57 -0600 Subject: [rust-dev] Bindings for libcrypto In-Reply-To: References: <20111110003950.GE24012@leptoquark.net> Message-ID: <20111110010457.GF24012@leptoquark.net> On Wed, Nov 09, 2011 at 05:00:31PM -0800, Ralph Giles wrote: > On 9 November 2011 16:39, Elly Jones wrote: > > > 1) Write our own crypto from scratch, in Rust. > > 2) Write bindings for OpenSSL's libcrypto. > > 3) Write bindings for something else external. > > 4) Pull something else external into rustrt, write bindings for that. > > Do (2) and then use it to validate a much smaller set of routines in (1). The concern is not correctness. Correctness is easy to test.
The concern is things like exposure to timing attacks or side-channel attacks. These things cannot be tested for exhaustively, and avoiding them is subtle and fraught with peril. > -r > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev -- elly -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 490 bytes Desc: Digital signature URL: From tellrob at gmail.com Thu Nov 10 07:09:49 2011 From: tellrob at gmail.com (Rob Arnold) Date: Thu, 10 Nov 2011 07:09:49 -0800 Subject: [rust-dev] Bindings for libcrypto In-Reply-To: <20111110003950.GE24012@leptoquark.net> References: <20111110003950.GE24012@leptoquark.net> Message-ID: OpenSSL is not installed on Windows which is a platform that Rust intends to support natively (i.e. no cygwin, msys). I think using NSS would be a better choice since it is used on all the platforms that Rust runs on. You could also use the Windows crypto libraries for Windows but then you'd need two bindings and would have platform-specific differences. -Rob On Wed, Nov 9, 2011 at 4:39 PM, Elly Jones wrote: > In principle, there should exist Rust crypto libraries. I would be happy to > implement them, but there are a couple of competing forces at work, and > I'd like > opinions. First, the forces: > > 1) The more portable, the better. > 2) All other things being equal, Rust code is better than native code. > 3) Writing one's own crypto is historically a very poor idea. > > The options: > > 1) Write our own crypto from scratch, in Rust. > 2) Write bindings for OpenSSL's libcrypto. > 3) Write bindings for something else external. > 4) Pull something else external into rustrt, write bindings for that. 
> > My evaluations: > > I think option 1 is a nonstarter, at least for me, since a) I'm not really > qualified to do it, b) it would have ~0 eyes on it, most likely and c) the > world > really doesn't need another crypto implementation from scratch with > different > bugs and a different subset of features. > > Option 2 is tempting, but libcrypto is _large_ (~1.7M on my system). The > advantage of this is that we don't need to pull libcrypto into the basis, > since > openssl is installed on basically every system. This also gives us support > for a > truly huge selection of cryptographic primitives. > > Option 3 is also tempting. There are a lot of other libraries, but they > all seem > to fall along a sliding scale between "elephantine" and "un-audited", with > OpenSSL at one extreme and things like PolarSSL at the other. > > Option 4 is probably not a very good idea. Doing this leaves us > responsible for > picking up upstream security fixes to the library we'd have pulled in, and > if we > picked a large, well-audited one (e.g. OpenSSL) would increase our source > tree > size and build times commesurately. On the other hand, if we do this, > every Rust > program can depend on having crypto primitives present at all times. > > I'm leaning toward option 2, I think. Does anyone have other thoughts? 
> > -- elly > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.10 (GNU/Linux) > > iQEcBAEBAgAGBQJOux1WAAoJEEySSMpJmAEIGa0H/1CfsUTRWKlJQfHEikLJsRTL > Uz39MlZYW1LtajIMZKzac9GdybBNIYtNjEK8olx2gfj1SMCem+m0t7Dkq/p93039 > 9rVZjhS5KH85+MPUDE6EhesYjXV+4tWPn2YTWW/12HFjeqFOObfdKas3HUFBC5/a > 7buYSMO3sc6KvHHM2RO6CqcTQQsuptTKDoThywFVXlPhs3KJJ1mPWEvOZOWDox3n > rH50jFXTZ9FCF9z3BobeuoQshQyMwFJwWXwmYsIEWq5nPYE5uMwfc/r8pK02Yf72 > hjfkaDMlqVC6gXh7EVcfRWKReFGAVgWXDSPUlQvJYHzEdT70nZhfTG8ACcpwVzc= > =tKgw > -----END PGP SIGNATURE----- > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dteller at mozilla.com Thu Nov 10 07:42:38 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Thu, 10 Nov 2011 16:42:38 +0100 Subject: [rust-dev] Object system redesign In-Reply-To: <2EEBCF89-9C94-4519-8C8D-43F2BBDE60E7@alum.mit.edu> References: <4EB95659.8010804@mozilla.com> <2EEBCF89-9C94-4519-8C8D-43F2BBDE60E7@alum.mit.edu> Message-ID: <4EBBF0EE.2060709@mozilla.com> A few questions and remarks: - How do you compile the following extract? fn call_foo(x: T) { x.foo(); } > Perhaps we allow the syntax t.foo1::bar() to make it clear. What would you think of the following? (t as foo1(T)).bar() > Finally, this also raises the potential to have two instances of the interface inter, both based on the same receiver, but with different vtables and hence different definitions of bar()! What exactly is the issue there? Readability?
Cheers, David From graydon at mozilla.com Thu Nov 10 08:35:57 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Thu, 10 Nov 2011 08:35:57 -0800 Subject: [rust-dev] Bindings for libcrypto In-Reply-To: <20111110003950.GE24012@leptoquark.net> References: <20111110003950.GE24012@leptoquark.net> Message-ID: <4EBBFD6D.5010806@mozilla.com> On 09/11/2011 4:39 PM, Elly Jones wrote: > In principle, there should exist Rust crypto libraries. I'd stop here to discuss a bit further, before getting too into crypto. In principle we should have libraries for lots of stuff. At some point we are going to need to start having a more-serious discussion about organization of the library ecosystem; I think this is a reasonable place to begin the conversation. "Crypto libraries" can mean anything from: - NaCL: hard-wired to a small, secure, ultra-modern algorithm set. - Suite B like: modern standards, but multiple modes of use. - Botan, openssl, Crypto++ etc.: mix and match every algorithm ever. Somewhere along the line openPGP and X.509 interop comes into play as well. It's a *big* landscape. I don't think it's sensible to talk about which point to choose in this landscape "for crypto" without forming a larger philosophy about libraries in general. Crypto is a subset of concerns that will come up over and over. Let's talk a bit more generally about how to handle the tension between "include everything in the standard library" and "make installations tractable and easy to manage". Discuss! Is the python distribution the model to aim for (massive stdlib, "all batteries included") or is the C++ model better? So far we've talked some, but not a lot, about following the python-y path. If so, how do we develop that? One big repo with lots of submodules? Lots of stuff detected by configury? Automatic bindings generated by a tool? On-the-fly bindings using clang as a dependency?
-Graydon From josh at joshmatthews.net Thu Nov 10 10:42:20 2011 From: josh at joshmatthews.net (Josh Matthews) Date: Thu, 10 Nov 2011 13:42:20 -0500 Subject: [rust-dev] How safe can reinterpret_cast be? Message-ID: I have written the following code: tag debug_metadata { file_metadata(@metadata); compile_unit_metadata(@metadata); subprogram_metadata(@metadata); } fn md_from_metadata(val: debug_metadata) -> T unsafe { alt val { file_metadata(md) { unsafe::reinterpret_cast(md) } compile_unit_metadata(md) { unsafe::reinterpret_cast(md) } subprogram_metadata(md) { unsafe::reinterpret_cast(md) } } } Assume that I know precisely what type I am extracting at any given point when I call md_from_metadata, so I call the specific typed version that gives me the correct output (ie. I am never actually casting a value to the incorrect type). My Principles of Software Engineering prof would surely call this "bad zen", but using md_from_metadata in this way makes the calling code noticeably cleaner in my eyes. Are there any safety concerns that come with using reinterpret_cast in this way, or is the code simply a harmless hack? Cheers, Josh From martine at danga.com Thu Nov 10 14:15:37 2011 From: martine at danga.com (Evan Martin) Date: Thu, 10 Nov 2011 14:15:37 -0800 Subject: [rust-dev] Bindings for libcrypto In-Reply-To: <4EBBFD6D.5010806@mozilla.com> References: <20111110003950.GE24012@leptoquark.net> <4EBBFD6D.5010806@mozilla.com> Message-ID: On Thu, Nov 10, 2011 at 8:35 AM, Graydon Hoare wrote: > On 09/11/2011 4:39 PM, Elly Jones wrote: > >> In principle, there should exist Rust crypto libraries. > > I'd stop here to discuss a bit further, before getting too into crypto. > > In principle we should have libraries for lots of stuff. At some point we > are going to need to start having a more-serious discussion about > organization of the library ecosystem; I think this is a reasonable place to > begin the conversation. 
> > "Crypto libraries" can mean anything from: > > - NaCL: hard-wired to a small, secure, ultra-modern algorithm set. > - Suite B like: modern standards, but multiple modes of use. > - Botan, openssl, Crypto++ etc.: mix and match every algorithm ever. > > Somewhere along the line openPGP and X.509 interop comes into play as well. > It's a *big* landscape. I don't think it's sensible to talk about which > point to choose in this landscape "for crypto" without forming a larger > philosophy about libraries in general. Crypto is a subset of concerns that > will come up over and over. > > Let's talk a bit more generally about how to handle the tension between > "include everything in the standard library" and "make installations > tractable and easy to manage". Discuss! Is the python distribution the model > to aim for (massive stdlib, "all batteries included") or is the C++ model > better? So far we've talked some, but not a lot, about following the > python-y path. If so, how do we develop that? One big repo with lots of > submodules? Lots of stuff detected by configury? Automatic bindings generated > by a tool? On-the-fly bindings using clang as a dependency? For comparison, the Go language has a crypto library, written in Go, as part of its standard library. (It's written by one of the few people I'd trust to get it right, but new crypto code is still scary.) For their v1 release they plan to kick it out into a submodule: https://docs.google.com/document/pub?id=1ny8uI-_BHrDCZv_zNBSthNKAMX_fR_0dc6epA6lztRE&pli=1 (search for "crypto"). Go's model is that it's trivial to install modules from third parties -- instead of 'import "crypto"' in your code you can write 'import "github.com/foobar/crypto"' and a built-in program "goinstall" knows how to extract those references from your source, download, compile, install, etc. So I think they don't worry too much about what is core or not.
But Haskell's model is similar (though cabal takes a lot more effort to participate in) and my experience as an enthusiast is that it is kind of a disaster: any given project requires a bunch more additional modules, frequently in versions that conflict with the versions you already have installed. This may not be a problem in Go just due to the youth of the language and libraries (I haven't, for my bits of Go code, ever actually needed to use goinstall; their standard library is pretty large). My lurker's opinion on this subject in particular: as always in these discussions I wish you guys would aggressively limit scope by punting it for later, and just worry about how to not paint yourself into a corner for the future once all the other stuff (like "can I build hello world quickly") is in place. :) From niko at alum.mit.edu Fri Nov 11 07:29:20 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Fri, 11 Nov 2011 07:29:20 -0800 Subject: [rust-dev] Object system redesign In-Reply-To: <4EBBF0EE.2060709@mozilla.com> References: <4EB95659.8010804@mozilla.com> <2EEBCF89-9C94-4519-8C8D-43F2BBDE60E7@alum.mit.edu> <4EBBF0EE.2060709@mozilla.com> Message-ID: <6DE715DC-3736-45B0-875F-86DB39EA3961@alum.mit.edu> On Nov 10, 2011, at 7:42 AM, David Rajchenbach-Teller wrote: > - How do you compile the following extract? > > fn call_foo(x: T) { > x.foo(); > } You don't. If you wanted to write `call_foo()`, you would probably write: iface has_foo { fn foo(); } fn call_foo(x: has_foo) { x.foo(); } then you could call this method like: call_foo(x as has_foo) In principle, it might be nice to allow something like bounded polymorphism: fn call_foo(x: T) { x.foo(); } Without subtyping, it would make less sense. 
Perhaps it corresponds to passing the vtable that converts a `T` into a `has_foo`, so when you invoke x.foo() it compiles down to "T_vtable.foo(x)" (which I think is how Haskell type classes work at runtime, but that's more from me guessing, perhaps people who know Haskell better can correct me). > > Perhaps we allow the syntax t.foo1::bar() to make it clear. > > What would you think of the following? > > (t as foo1(T)).bar() Well, that makes a category into a type, which it currently is not. > > Finally, this also raises the potential to have two instances of the interface inter, both based on the same receiver, but with different vtables and hence different definitions of bar()! > > What exactly is the issue there? Readability? Just that it's potentially confusing. Here you have two instances of the same interface, both bound to the same receiver, but they behave differently. I can see some hard-to-find bugs arising this way. Niko From pwalton at mozilla.com Fri Nov 11 09:01:59 2011 From: pwalton at mozilla.com (Patrick Walton) Date: Fri, 11 Nov 2011 09:01:59 -0800 Subject: [rust-dev] Object system redesign In-Reply-To: <6DE715DC-3736-45B0-875F-86DB39EA3961@alum.mit.edu> References: <4EB95659.8010804@mozilla.com> <2EEBCF89-9C94-4519-8C8D-43F2BBDE60E7@alum.mit.edu> <4EBBF0EE.2060709@mozilla.com> <6DE715DC-3736-45B0-875F-86DB39EA3961@alum.mit.edu> Message-ID: <4EBD5507.5000408@mozilla.com> On 11/11/2011 07:29 AM, Niko Matsakis wrote: > In principle, it might be nice to allow something like bounded polymorphism: > > fn call_foo(x: T) { > x.foo(); > } > > Without subtyping, it would make less sense. Perhaps it corresponds > to passing the vtable that converts a `T` into a `has_foo`, so when you invoke x.foo() it compiles down to "T_vtable.foo(x)" (which I think is how Haskell type classes work at runtime, but that's more from me guessing, perhaps people who know Haskell better can correct me). It does.
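In modern Rust syntax (which postdates this thread), the bounded-polymorphism form Niko sketches became trait bounds, and the `x as has_foo` coercion became trait objects carrying a vtable at runtime. A minimal illustration; `HasFoo` and `Thing` are hypothetical names standing in for the thread's `has_foo`:

```rust
// Sketch of the bounded polymorphism discussed above, in modern Rust
// syntax (the thread's `iface`/`as` forms predate this).
trait HasFoo {
    fn foo(&self) -> &'static str;
}

struct Thing;

impl HasFoo for Thing {
    fn foo(&self) -> &'static str {
        "foo on Thing"
    }
}

// Static dispatch: the compiler resolves the impl at compile time,
// so no vtable is passed at runtime.
fn call_foo<T: HasFoo>(x: &T) -> &'static str {
    x.foo()
}

// Dynamic dispatch: a fat pointer carries the vtable Niko describes,
// matching the thread's `x as has_foo` coercion.
fn call_foo_dyn(x: &dyn HasFoo) -> &'static str {
    x.foo()
}

fn main() {
    let t = Thing;
    assert_eq!(call_foo(&t), "foo on Thing");
    assert_eq!(call_foo_dyn(&t), "foo on Thing");
}
```

Both calls dispatch to the same impl; only the dynamic form pays for a vtable indirection.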
Patrick From dteller at mozilla.com Sat Nov 12 14:08:43 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Sat, 12 Nov 2011 23:08:43 +0100 Subject: [rust-dev] Object system redesign In-Reply-To: References: <4EB95659.8010804@mozilla.com> <4EB965D5.1040303@mozilla.com> Message-ID: <4EBEEE6B.4020109@mozilla.com> On 11/8/11 7:04 PM, Niko Matsakis wrote: > One other question is the code evolution path: presumably most users will start with a class-based design. Adding traits to achieve code reuse is easy and can be done in a backwards-compatible fashion. Changing a class into an interface is mostly sound, but for a couple of corner cases: constructors and fields. I am not so concerned about constructors, but having no way to model fields in interfaces will result in an annoying round of "search and replace foo.f with foo.f()". Maybe this is ok, particularly as Rust generally aims to make costs transparent, and so a field access should never look like a method call. > > I personally would favor simple properties (e.g., `this.f` desugars into `this.get_f()` and `this.set_f()` if no field `f` is defined) but these do hide costs and they also introduce more than one way to do things. Why not go for uniformity and allow virtual fields? Cheers, David From niko at alum.mit.edu Sat Nov 12 22:13:04 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Sat, 12 Nov 2011 22:13:04 -0800 Subject: [rust-dev] compatibility of Rust data structures with C Message-ID: I wanted to mention some of the problems I've encountered in doing the x86_64 port of Rust and query the opinions of people as to whether there are problems that need to be fixed and---if so---what is the best remedy. At various points in the C runtime, we have C types that are intended to be bit-for-bit compatible with corresponding Rust types. The problem is that the translation of Rust types into LLVM does not always preserve this compatibility.
Both problems I've encountered so far have had to do with tags. The first problem arises from simple "enum-like" tags. We were translating this Rust type: tag task_result { /* Variant: tr_success */ tr_success; /* Variant: tr_failure */ tr_failure; } into `enum task_result { tr_success, tr_failure }`. This seems reasonable except that the C compiler was not allocating 64 bits for an enum but rather 32. Rust always allocates a full word for the variant ID of a tag. The second problem concerns alignment. This type from comm.rs: tag chan { chan_t(task::task, port_id); } is translated in C as: struct chan_handle { rust_task_id task; rust_port_id port; }; (where both rust_task_id and rust_port_id are effectively uint64_t). The rust tag variant translates to char[16]. While both of these types have the same SIZE in bytes, they have different alignment properties. This causes problems because the chan_handle/chan<> is then embedded in a struct, and suddenly the Rust and C code disagree about the offsets of the various fields. For now, I've been working around these problems by adjusting the C code. However, it seems like it would be nice to have a simple way to reproduce C types in Rust and vice-versa. We're fairly close to it now but some tweaks are needed. I don't know what's the best translation for tags. It would be nice if it matched C enums when there is no associated data (i32, it seems, but we could check the clang source perhaps). If there is associated data, it would be nice if it matched unions. Sadly, LLVM does not seem to have a union type in its type system; clang seems to translate unions into structs with the most highly aligned variant and then use bitcasts. 
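The size-versus-alignment mismatch Niko describes can be checked directly. A small sketch in modern Rust, assuming a 64-bit target; `ChanHandle` mirrors the thread's C `struct chan_handle`, with the byte array standing in for the `char[16]` the tag variant lowered to:

```rust
use std::mem::{align_of, size_of};

// The C-side view of `chan`: two word-sized fields, word alignment.
#[repr(C)]
struct ChanHandle {
    task: u64, // rust_task_id
    port: u64, // rust_port_id
}

fn main() {
    // Same SIZE as the char[16] the tag variant became...
    assert_eq!(size_of::<ChanHandle>(), size_of::<[u8; 16]>());
    // ...but different ALIGNMENT, which is exactly what shifts field
    // offsets once the value is embedded in an outer struct.
    assert_eq!(align_of::<ChanHandle>(), 8);
    assert_eq!(align_of::<[u8; 16]>(), 1);
}
```

The two asserted alignments are why the C and Rust sides disagreed about offsets despite agreeing on size.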
Niko From eric.holk at gmail.com Sat Nov 12 13:19:30 2011 From: eric.holk at gmail.com (Eric Holk) Date: Sat, 12 Nov 2011 16:19:30 -0500 Subject: [rust-dev] Kind system revision proposal In-Reply-To: <4EB95076.5030208@mozilla.com> References: <4EB95076.5030208@mozilla.com> Message-ID: <49B9CF55-E0EC-42D8-95F6-E0006A9F62C9@gmail.com> On Nov 8, 2011, at 10:53 AM, Graydon Hoare wrote: > On 04/11/2011 5:32 AM, Marijn Haverbeke wrote: >> Generic types (type and tag) do not seem to have a reason to >> ever narrow their kind bounds in this system, so (unless I am missing >> something), they should probably not even be allowed to specify a kind >> on their parameters. > > Agreed. Unless it's changed from when I left, ports and chans are defined as generic types, and here I think it still makes sense to narrow their type parameter to only things that can be sent. It's a minor detail though, as we can just leave those kind annotations on all of the functions to operate on ports and chans and get basically the same result. -Eric From dteller at mozilla.com Sun Nov 13 22:52:11 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Mon, 14 Nov 2011 07:52:11 +0100 Subject: [rust-dev] Bindings for libcrypto In-Reply-To: References: <20111110003950.GE24012@leptoquark.net> <4EBBFD6D.5010806@mozilla.com> Message-ID: <4EC0BA9B.5000703@mozilla.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/10/11 11:15 PM, Evan Martin wrote: > My lurker's opinion on this subject in particular: as always in > these discussions I wish you guys would aggressively limit scope > by punting it for later, and just worry about how to not paint > yourself into a corner for the future once all the other stuff > (like "can I build hello world quickly") is in place. :) I like this approach. 
IMHO, the first question to answer before attempting to build a comprehensive platform is to choose a strategy wrt the first projects that should be written with Rust: a part of the Mozilla Platform itself? A server-side component? A part of the build process? OS-level utilities? etc. In my experience, this is the kind of thing that influences stdlib development. Best regards, David -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (Darwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJOwLqbAAoJED+FkPgNe9W+PQQIAInsX+K9SPfqlVCgPt0sxvDN 1jpAmyLvwnbBZz+26BeNynNY1IItEXX1W8J6Qzs+GfY9uCNPnIjXCLRoyKN7/XwQ /kiL3rsF03HbGoDrD/Cj5KYUf3McTEzZvTNt1/5HkJ+gWb/33LqPTieURRAgAMWF 5D0Ou4mJoclGe6coIPV6qAMD2MHoXtsLDafh+y60fzudbBG2cN36sGZPIGLD7e3q P6hHyKuwKMB8BFQDtvt9AvOrrKSX1/gG1sGzvtPAYcygWjTDWqtmPUOmD4BeYyfP 8FCKMzVxg8htcC7iwKi5+B0Pm91k8GObj5L6UpuAtNAQ+iumPVg6njOFZDMX/m0= =MSrg -----END PGP SIGNATURE----- From dteller at mozilla.com Sun Nov 13 23:53:27 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Mon, 14 Nov 2011 08:53:27 +0100 Subject: [rust-dev] Writing cross-platform low-level code Message-ID: <4EC0C8F7.80706@mozilla.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Dear Rusties, I am currently in the early stages of writing a file system access library for mozilla-central that might eventually replace some or all of mozilla-central low-level file access code with something faster and a little higher level. I also consider porting, perhaps even prototyping, this library to Rust. For this purpose, I need a little guidance on a few points. *** #ifdef My code heavily relies on #ifdefs with macros to compile code conditionally, depending on both: - - which platform is targeted (in mozilla-central, that's macros XP_WIN, XP_UNIX); - - which primitives are available in libc (through autoconf's AC_CHECK_FUNS and the macros HAVE_xxxxx that it defines). What is the best way to achieve this in Rust?
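The `cfg` attribute Brian describes in his reply is the thread's answer to this #ifdef question. A minimal sketch in modern attribute syntax, where the thread's `target_os = "win32"` is nowadays spelled `"windows"`; `path_separator` is a hypothetical example function:

```rust
// Conditional compilation per target, the Rust replacement for
// #ifdef XP_WIN / XP_UNIX. Exactly one definition is compiled in.
#[cfg(target_os = "windows")]
fn path_separator() -> char {
    '\\'
}

#[cfg(not(target_os = "windows"))]
fn path_separator() -> char {
    '/'
}

fn main() {
    let sep = path_separator();
    assert!(sep == '/' || sep == '\\');
    println!("separator: {}", sep);
}
```

Feature probes in the HAVE_xxxxx style are handled the same way, by passing `--cfg FEATURE` to rustc and gating items with `#[cfg(FEATURE)]`.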
*** Foreign types - From what I see in the code of unix_os.rs, we can simply define `type foobar` in a foreign module and use this in the code. Are there any limitations to this that I should know before employing the technique? *** Unicode On Unix platforms, file names are `char*`. On Windows platforms, they are `wchar*`. In mozilla-central, I use the strings API to make my life simpler and to handle all conversions for me. How should I do this with Rust? As far as I understand, the current win32_os.rs simply assumes that any `str` passed can be used as a valid `char*`, and relies on Cygwin to handle any oddity. As most of the features of Win32 are not available through Cygwin, this is probably something that I cannot use. *** Garbage-collection / destruction I am not familiar with Rust resources yet, but chances are that I will need to interact with them to get destruction and garbage-collection on file descriptors and directory structures. - - Do I understand correctly that resources are the right tool for the task? - - From the documentation, I understand that a failed task will not call destructors. Do I understand correctly? If so, this seems like a major issue, as a process with failed tasks will leak file descriptors. *** Error-handling For file access, some error conditions are indeed failures, while some are simply information that should be propagated. As I understand, Rust will not have ML/C++-style exceptions in any foreseeable future. I therefore see the following possibilities: - - Either aggressively `spawn_notify` tasks, `unsupervise` them and `fail` in case of any error, with some convention to pass `errno`; - - or have each function return `either::t` and propagate error conditions manually, much as is done in current mozilla-central code; - - or offer both styles, with some naming convention to differentiate them, say submodules `fails` / `dnf`. Any suggestion on the best policy? Well, that covers most of my questions.
Any suggestion or idea? Thanks, David -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (Darwin) iQEcBAEBAgAGBQJOwMj3AAoJED+FkPgNe9W+5XIH/iHhkD5SFjJtEaabRtmvEwie dqKHzv10wD0nO4yNB9N2SUpUHNbysm3Cc1d5Rug48eBiqaPjRJYm2Gila0dYeOI3 SJs/8AQ6Bis0mLbggs9lBNNsfmBPw+E7RjXTsqS0AZG20Cyn1xeCPI1fnXuzKyms 2lrDSduSZNyVbbpWV/9zR9+7iXASX/wx1CaNbaznJ92Uf+FMFdkwV20VXtlnkQYU yWHMicwswmYuav3xRYGtCKG+qKgFxh0KRkJl0gXuNdR//v/qZaRC6YCneMjm/Qeb jlfO2jdE3NXxYwqsZA4gxbPVHpIYn9g27s0hO4WwkR84Ou9Qux2FwuIKPGOqPS4= =2EG1 -----END PGP SIGNATURE----- From eric.holk at gmail.com Mon Nov 14 08:17:06 2011 From: eric.holk at gmail.com (Eric Holk) Date: Mon, 14 Nov 2011 08:17:06 -0800 Subject: [rust-dev] How safe can reinterpret_cast be? In-Reply-To: References: Message-ID: Is it possible to write this code so that the return type isn't a T? I don't think it's possible to have a function that returns T but takes no parameters of type T in a way that is truly safe. As long as only you use md_from_metadata, and you are careful about how you use it, you can probably avoid any errors, but these sorts of errors are exactly what a type system is supposed to protect you from in the first place. More problems are likely to come if other people start using md_from_metadata. Normally, the right place for reinterpret_cast is when you're writing really low-level code that can't really be done safely anyway. For example, at least at one point (and possibly still now), task.rs would construct new stack frames and this required casting data into raw bytes to store on the stack. The fact that you called this debug_metadata suggests you might be in an appropriate place to use reinterpret_cast, but if so, try not to export md_from_metadata, and if at all possible, try to write your code so you don't need reinterpret_cast.
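Eric's restructuring advice can be sketched in modern Rust: instead of one generic extractor that conjures a `T` via a cast, expose one typed accessor per variant, so the types are checked and no unsafe code is needed. All names below are illustrative stand-ins for Josh's `debug_metadata`:

```rust
// A type-safe shape for the extractor: per-variant accessors
// returning Option instead of a generic reinterpret_cast.
struct Metadata(&'static str);

enum DebugMetadata {
    File(Metadata),
    CompileUnit(Metadata),
    Subprogram(Metadata),
}

impl DebugMetadata {
    // Some only when the value really is the expected variant;
    // a wrong assumption shows up as None, not as memory corruption.
    fn as_file(&self) -> Option<&Metadata> {
        match self {
            DebugMetadata::File(md) => Some(md),
            _ => None,
        }
    }
}

fn main() {
    let file = DebugMetadata::File(Metadata("main.rs"));
    let cu = DebugMetadata::CompileUnit(Metadata("crate"));
    assert!(file.as_file().is_some());
    assert!(cu.as_file().is_none());
}
```

The call sites stay almost as clean as with the cast-based version, but a mismatched variant is an observable `None` rather than undefined behavior.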
-Eric On Nov 10, 2011, at 10:42 AM, Josh Matthews wrote: > I have written the following code: > > tag debug_metadata { > file_metadata(@metadata); > compile_unit_metadata(@metadata); > subprogram_metadata(@metadata); > } > > fn md_from_metadata(val: debug_metadata) -> T unsafe { > alt val { > file_metadata(md) { unsafe::reinterpret_cast(md) } > compile_unit_metadata(md) { unsafe::reinterpret_cast(md) } > subprogram_metadata(md) { unsafe::reinterpret_cast(md) } > } > } > > Assume that I know precisely what type I am extracting at any given > point when I call md_from_metadata, so I call the specific typed > version that gives me the correct output (ie. I am never actually > casting a value to the incorrect type). My Principles of Software > Engineering prof would surely call this "bad zen", but using > md_from_metadata in this way makes the calling code noticeably cleaner > in my eyes. Are there any safety concerns that come with using > reinterpret_cast in this way, or is the code simply a harmless hack? > > Cheers, > Josh > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From banderson at mozilla.com Mon Nov 14 12:53:08 2011 From: banderson at mozilla.com (Brian Anderson) Date: Mon, 14 Nov 2011 12:53:08 -0800 Subject: [rust-dev] Writing cross-platform low-level code In-Reply-To: <4EC0C8F7.80706@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> Message-ID: <4EC17FB4.40108@mozilla.com> On 11/13/2011 11:53 PM, David Rajchenbach-Teller wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Dear Rusties, > > I am currently in the early stages of writing a file system access > library for mozilla-central that might eventually replace some or all > of mozilla-central low-level file access code with something faster > and a little higher level. I also consider porting ? perhaps even > prototyping ? this library to Rust. 
> > For this purpose, I need a little guidance on a few points. > > *** #ifdef > > My code heavily relies on #ifdefs with macros to compile code > conditionally, depending on both: > - - which platform is targeted (in mozilla-central, that's macros > XP_WIN, XP_UNIX); > - - which primitives are available in libc (through autoconf's > AC_CHECK_FUNS and the macros HAVE_xxxxx that it defines). > > What is the best way to achieve this in Rust? > Rust items can be conditionally compiled with the 'cfg' attribute, so your build can call 'rustc --cfg FEATURE' (and you can provide --cfg any number of times) then you can have functions annotated #[cfg(FEATURE)]. The target platform is already set by default as 'target_os', so you can say #[cfg(target_os = "win32")]. If your question is more about how to integrate your autoconf-based build with Rust's anti-autoconf build then the answer is probably that they have to be kept separate. > *** Foreign types > > - From what I see in the code of unix_os.rs, we can simply define `type > foobar` in a foreign module and use this in the code. Are there any > limitations to this that I should know before employing the technique? > These types are pointer sized, so they can be used to pass around pointers to opaque types, but not much else. If that doesn't work then you have to create a Rust declaration with the same structure as the C declaration. std does this in a few places. > *** Unicode > > On Unix platforms, file names are `char*`. On Windows platforms, they > are `wchar*`. In mozilla-central, I use the strings API to make my > life simpler and to handle all conversions for me. How should I do > this with Rust? > > As far as I understand, the current win32_os.rs simply assumes that > any `str` passed can be used as a valid `char*`, and relies on Cygwin > to handle any oddity. As most of the features of Win32 are not > available through Cygwin, this is probably something that I cannot use. 
stdlib's string handling for win32 is wrong and I don't believe anybody has put much thought into what needs to happen. Suggestions welcome. > *** Garbage-collection / destruction > > I am not familiar with Rust resources yet, but chances are that I will > need to interact with them to get destruction and garbage-collection > on file descriptors and directory structures. > > - - Do I understand correctly that resources are the right tool for the task? > - - From the documentation, I understand that a failed task will not > call destructors. Do I understand correctly? If so, this seems like a > major issue, as a process with failed tasks will leak file descriptors. > Resources are what you want, but you usually want to box them and wrap them in some other type because they are a bit unwieldy from a user perspective. Failed tasks will call destructors during unwinding. Can you point me to the incorrect documentation? One thing to note is that we still don't implement unwinding on win32 so failure is currently unrecoverable on that platform. > *** Error-handling > > For file access, some error conditions are indeed failures, while some > are simply information that should be propagated. As I understand, > Rust will not have ML/C++-style exceptions in any foreseeable future. > I therefore see the following possibilities: > > - - Either aggressively `spawn_notify` tasks, `unsupervise` them and > `fail` in case of any error, with some convention to pass `errno`; I've tried to implement this approach before and found that the language is not yet expressive enough to do this in a convenient way (because we can't spawn closures). It's also quite inefficient for general use. > - - or have each function return `either::t` and > propagate error conditions manually, much as is done in current > mozilla-central code; std has a 'result::t' type that I am trying to use for this purpose. std::io makes use of this now.
> - - or offer both styles, with some naming convention to differentiate > them, say, submodules `fails` / `dnf`. > > Any suggestion on the best policy? I prefer std::result > Well, that covers most of my questions. Any suggestion or idea? Sounds promising. From niko at alum.mit.edu Mon Nov 14 15:47:18 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Mon, 14 Nov 2011 15:47:18 -0800 Subject: [rust-dev] Writing cross-platform low-level code In-Reply-To: <4EC17FB4.40108@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> Message-ID: On Nov 14, 2011, at 12:53 PM, Brian Anderson wrote: > std has a 'result::t' type that I am trying to use for this purpose. std::io makes use of this now. Besides exceptions, I do not have a better alternative than `std::result`. However, I fear that if we go too far down this style, it will be very painful and we will find ourselves wishing for syntactic sugar to support chaining (whether that is monads or something else). At least this has been my experience when working in OCaml. Are we sure that we do not want exceptions? Niko From niko at alum.mit.edu Mon Nov 14 21:25:41 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Mon, 14 Nov 2011 21:25:41 -0800 Subject: [rust-dev] compile-command in each file Message-ID: <56ED9F10-BB39-4BB0-B2A1-C693723F8FEF@alum.mit.edu> How do people feel about removing the "compile-command" local variable in each Rust file? They are often slightly different and in any case they tend to make my life (at least) somewhat more difficult when using emacs. Without those variables, I could build once by selecting the Makefile and then doing "M-x compile" and "make". Then I could just do "M-x recompile". But with the variables, "M-x recompile" is overridden by the file local variable.
This also sets the compilation directory to be the same as the file I am currently editing, and so while the compilation generally proceeds due to the -C flag to make, the relative paths are not resolved properly by emacs. Niko From dteller at mozilla.com Mon Nov 14 23:40:26 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Tue, 15 Nov 2011 08:40:26 +0100 Subject: [rust-dev] Writing cross-platform low-level code In-Reply-To: References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> Message-ID: <4EC2176A.5070106@mozilla.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/15/11 12:47 AM, Niko Matsakis wrote: > > On Nov 14, 2011, at 12:53 PM, Brian Anderson wrote: > >> std has a 'result::t' type that I am trying to use for this >> purpose. std::io makes use of this now. > > Besides exceptions, I do not have a better alternative than > `std::result`. However, I fear that if we go too far down this > style, it will be very painful and we will find ourselves wishing for > syntactic sugar to support chaining (whether that is monads or > something else). At least this has been my experience when working > in OCaml. > > Are we sure that we do not want exceptions? Actually, since we have `ret`, I'm sure that we can have small syntactic sugar without resorting to monads. Assuming we have, foo(x: str) -> result::t { ... } We could write let x = #result[foo(y)] And expand it to let x = { let __result = foo(y); alt(__result) { ok(z) { z } err(_) { //Possibly add some logging here ret __result } } } In practice, this is already what happens in mozilla-central. Not ideal, but probably easier to sell to non-fp users than monads, and it takes advantage of `ret`. I wonder about performance, though. Is there a way that the compiler could recognize this pattern and somehow optimize it to attain the same kind of performance as exceptions?
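The expansion above is essentially the pattern that later versions of Rust standardized as the `?` operator. A hand-written sketch in modern syntax, with `foo` and `caller` invented for illustration:

```rust
fn foo(y: i32) -> Result<i32, String> {
    if y < 0 {
        Err("negative input".to_string())
    } else {
        Ok(y * 2)
    }
}

fn caller(y: i32) -> Result<i32, String> {
    // Hand expansion of `let x = #result[foo(y)]`: bind the success
    // value, or return early, propagating the error to the caller.
    let x = match foo(y) {
        Ok(z) => z,
        err @ Err(_) => return err,
    };
    Ok(x + 1)
}

fn main() {
    assert_eq!(caller(5), Ok(11));
    assert!(caller(-1).is_err());
}
```

In today's Rust the whole dance collapses to `let x = foo(y)?;`, and since it compiles to an ordinary early return rather than unwinding, the pattern carries no exception-style runtime cost.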
Cheers, David -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (Darwin) iQEcBAEBAgAGBQJOwhdqAAoJED+FkPgNe9W+AMoH/i5scrjiLO2tyscpUcrc3+x5 682VB3aoKN30ddmZDWMbtFSDTMpRERfoNWlNIYtkL0fB5jnw+jyVibPIHzEzArsd p+VlXLicPizWHywvIKsNcIkJs2kcQOVP/GQ8BM3XccvnU4I17mYBRWWHsNLtQEFG YVNmHplLp3bSFORST+R9ZxG+ToTzmhPWCDWmqJECJnSkTj4owlyKd223XyqgYacG NV9eZFbfTTV94m7hgu9shSIWzfbV7KG2JLbGzMCFT1wb8AK6+uZvQohjXbAe+u9G yk6KGaoF4JDo02osH/cf9CY1t23sOBAHfzTfb8BKjnHF+pce6LrYa5gZTWGZ9c4= =87nd -----END PGP SIGNATURE----- From niko at alum.mit.edu Tue Nov 15 06:30:34 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 15 Nov 2011 06:30:34 -0800 Subject: [rust-dev] Writing cross-platform low-level code In-Reply-To: <4EC2176A.5070106@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> Message-ID: Yes, I was thinking something similar yesterday. Such a pattern might well be perfect. On Nov 14, 2011, at 11:40 PM, David Rajchenbach-Teller wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 11/15/11 12:47 AM, Niko Matsakis wrote: >> >> On Nov 14, 2011, at 12:53 PM, Brian Anderson wrote: >> >>> std has a 'result::t' type that I am trying to use for this >>> purpose. std::io makes use of this now. >> >> Besides exceptions, I do not have a better alternative than >> `std::result`. However, I fear that if we go too far down this >> style, it will very painful and we will find ourselves wishing for >> syntactic sugar to support chaining (whether that is monads or >> something else). At least this has been my experience when working >> in OCaml. >> >> Are we sure that we do not want exceptions? > > Actually, since we have `ret`, I'm sure that we can have small > syntactic sugar without resorting to monads. > > Assuming we have, > > foo(x: str) -> result::t { ... 
} > > We could write > > let x = #result[foo(y)] > > And expand it to > > let x = { > let __result = foo(y); > alt(__result) { > ok(z) { z } > err(_){ > //Possibly add some logging here > ret __result > } > } > } > > In practice, this is already what happens in mozilla-central. Not > ideal, but probably easier to sell to non-fp users as monads, and it > takes advantage of `ret`. I wonder about performance, though. Is there > a way that the compiler could recognize this pattern and somehow > optimize it to attain the same kind of performance as exceptions? > > Cheers, > David > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.11 (Darwin) > > iQEcBAEBAgAGBQJOwhdqAAoJED+FkPgNe9W+AMoH/i5scrjiLO2tyscpUcrc3+x5 > 682VB3aoKN30ddmZDWMbtFSDTMpRERfoNWlNIYtkL0fB5jnw+jyVibPIHzEzArsd > p+VlXLicPizWHywvIKsNcIkJs2kcQOVP/GQ8BM3XccvnU4I17mYBRWWHsNLtQEFG > YVNmHplLp3bSFORST+R9ZxG+ToTzmhPWCDWmqJECJnSkTj4owlyKd223XyqgYacG > NV9eZFbfTTV94m7hgu9shSIWzfbV7KG2JLbGzMCFT1wb8AK6+uZvQohjXbAe+u9G > yk6KGaoF4JDo02osH/cf9CY1t23sOBAHfzTfb8BKjnHF+pce6LrYa5gZTWGZ9c4= > =87nd > -----END PGP SIGNATURE----- > From banderson at mozilla.com Tue Nov 15 10:10:53 2011 From: banderson at mozilla.com (Brian Anderson) Date: Tue, 15 Nov 2011 10:10:53 -0800 Subject: [rust-dev] compatibility of Rust data structures with C In-Reply-To: References: Message-ID: <4EC2AB2D.7090405@mozilla.com> On 11/12/2011 10:13 PM, Niko Matsakis wrote: > I wanted to mention some of the problems I've encountered in doing the x86_64 port of Rust and query the opinions of people as to whether there are problems that need to be fixed and---if so---what is the best remedy. > > At various points in the C runtime, we have C types that are intended to be bit-for-bit compatible with corresponding Rust types. The problem is that the translation of Rust types into LLVM does not always preserve this compatibility. Both problems I've encountered so far have had to do with tags. > > The first problem arises from simple "enum-like" tags. 
We were translating this Rust type: > > tag task_result { > /* Variant: tr_success */ > tr_success; > /* Variant: tr_failure */ > tr_failure; > } > > into `enum task_result { tr_success, tr_failure }`. This seems reasonable except that the C compiler was not allocating 64 bits for an enum but rather 32. Rust always allocates a full word for the variant ID of a tag. We should do what C does. > The second problem concerns alignment. This type from comm.rs: > > tag chan { > chan_t(task::task, port_id); > } > > is translated in C as: > > struct chan_handle { > rust_task_id task; > rust_port_id port; > }; > > (where both rust_task_id and rust_port_id are effectively uint64_t). The rust tag variant translates to char[16]. While both of these types have the same SIZE in bytes, they have different alignment properties. This causes problems because the chan_handle/chan<> is then embedded in a struct, and suddenly the Rust and C code disagree about the offsets of the various fields. Seems like we should be able to get it right in this case. I know we have lots of alignment problems. 
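The alignment mismatch is easy to demonstrate in modern Rust; a sketch with invented type names, using `#[repr(C)]` to request C-compatible layout (the exact sizes assume a typical 64-bit target):

```rust
use std::mem::{align_of, size_of};

// C-compatible mirror of the chan variant: two 64-bit ids.
#[repr(C)]
struct ChanHandle {
    task: u64, // rust_task_id
    port: u64, // rust_port_id
}

// What the 2011 compiler emitted instead: an opaque 16-byte blob.
#[repr(C)]
struct Blob {
    data: [u8; 16],
}

// Embed each after a one-byte field: the alignment disagreement now
// shifts every following field offset.
#[repr(C)]
struct WithHandle { flag: u8, h: ChanHandle }
#[repr(C)]
struct WithBlob { flag: u8, b: Blob }

fn main() {
    // Same size...
    assert_eq!(size_of::<ChanHandle>(), size_of::<Blob>());
    // ...but different alignment: the blob claims none.
    assert_eq!(align_of::<ChanHandle>(), align_of::<u64>());
    assert_eq!(align_of::<Blob>(), 1);
    // The aligned version gets padding after `flag`; the blob does not,
    // so C and blob-emitting code disagree about offsets.
    assert_eq!(size_of::<WithHandle>(), align_of::<u64>() + 16);
    assert_eq!(size_of::<WithBlob>(), 17);
}
```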
From banderson at mozilla.com Tue Nov 15 10:39:11 2011 From: banderson at mozilla.com (Brian Anderson) Date: Tue, 15 Nov 2011 10:39:11 -0800 Subject: [rust-dev] Writing cross-platform low-level code In-Reply-To: <4EC21D07.4030006@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC21D07.4030006@mozilla.com> Message-ID: <4EC2B1CF.50102@mozilla.com> On 11/15/2011 12:04 AM, David Rajchenbach-Teller wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 11/14/11 9:53 PM, Brian Anderson wrote: >> On 11/13/2011 11:53 PM, David Rajchenbach-Teller wrote: >>> *** #ifdef >>> >>> My code heavily relies on #ifdefs with macros to compile code >>> conditionally, depending on both: - - which platform is targeted >>> (in mozilla-central, that's macros XP_WIN, XP_UNIX); - - which >>> primitives are available in libc (through autoconf's >>> AC_CHECK_FUNS and the macros HAVE_xxxxx that it defines). >>> >>> What is the best way to achieve this in Rust? >>> >> Rust items can be conditionally compiled with the 'cfg' attribute, >> so your build can call 'rustc --cfg FEATURE' (and you can provide >> --cfg any number of times) then you can have functions annotated >> #[cfg(FEATURE)]. The target platform is already set by default as >> 'target_os', so you can say #[cfg(target_os = "win32")]. >> >> If your question is more about how to integrate your >> autoconf-based build with Rust's anti-autoconf build then the >> answer is probably that they have to be kept separate. > Indeed, I have seen `cfg`, and I was wondering if this was the right > tool. From what I see, it is much coarser-grained than what I achieve > atm with autoconf. 
> > > Let me detail: > - - I need to compile some code for MacOS only; > - - I need to compile some code for all Unix platforms; > - - I need to compile some code for Unix platforms in which libc defines > a given function, say, `mkdirat`; > - - I need to compile some code for Unix platforms in which libc does > not define a given function. > > How do you suggest I do this? > > To complicate matters, on the MacOS platform, I need to compile and link > to Objective-C code. > Unfortunately the cfg attribute is the only tool we have for this. Maybe we need more. >>> *** Unicode >> stdlib's string handling for win32 is wrong and I don't believe >> anybody has put much thought into what needs to happen. Suggestions >> welcome. > Well, I can try and work on this along the way. For this purpose, > though, I may need Unicode conversion. > We have an intent to integrate libicu. Maybe that would help. >>> *** Garbage-collection / destruction >> Resources are what you want, but you usually want to box them and >> wrap them in some other type because they are a bit unwieldy from a >> user perspective. > Good to know, thanks. > > In cases where I would use an `auto_ptr` / go's `defer` / Java's > `finally` / etc., though, I suppose that I should rather have my > resource on the stack, no? Ideally, yes. Currently resources are pinned and can't be copied or moved (though Marijn is soon going to make them moveable). If you put them in a box and are careful about taking references to them they will still run the destructor when you expect. >> Failed tasks will call destructors during unwinding. Can you point >> me to the incorrect documentation? > I probably over-interpreted "The unwinding procedure of hard failure > frees resources but does not execute destructors." (Ref.Task.Life) This is referring to what happens when a destructor itself fails during unwinding (after the task has already entered the failure state). This is, by the way, not implemented.
Failing within a destructor will leak currently. >> One thing to note is that we still don't implement unwinding on >> win32 so failure is currently unrecoverable on that platform. > Will that happen? > Is this related to the fact that win32 stack introspection is off by > default on 64bit platforms, as I have heard? It will happen, but probably not by the time 0.1 is released. It's because LLVM exception handling doesn't seem to work on Windows. The reason for that I don't know. >>> *** Error-handling - - Either aggressively `spawn_notify` tasks, >>> `unsupervise` them and `fail` in case of any error, with some >>> convention to pass `errno`; >> I've tried to implement this approach before and found that the >> language is not yet expressive enough to do this in a convenient >> way (because we can't spawn closures). It's also quite inefficient >> for general use. > Ah, I was not aware that we could not spawn closures. I guess it makes > sense, but it's a bit counter-intuitive for a fp/process algebra guy > like me. I assume this is because type-state cannot make guarantees on > values hidden inside a closure. Are there plans to improve the > expressivity of the language on this front? Yes, it is because we don't know what is inside the closure. There is a plan to add a unique closure type that is guaranteed to only contain sendable things, but not everyone likes the idea of adding yet another type of function. We definitely do have to come up with some solution, because working with tasks is not very easy right now.
Regards, Brian From banderson at mozilla.com Tue Nov 15 10:42:01 2011 From: banderson at mozilla.com (Brian Anderson) Date: Tue, 15 Nov 2011 10:42:01 -0800 Subject: [rust-dev] compile-command in each file In-Reply-To: <56ED9F10-BB39-4BB0-B2A1-C693723F8FEF@alum.mit.edu> References: <56ED9F10-BB39-4BB0-B2A1-C693723F8FEF@alum.mit.edu> Message-ID: <4EC2B279.5050402@mozilla.com> On 11/14/2011 09:25 PM, Niko Matsakis wrote: > How do people feel about removing the "compile-command" local variable in each Rust file? They are often slightly different and in any case they tend to make my life (at least) somewhat more difficult when using emacs. I agree. From niko at alum.mit.edu Tue Nov 15 11:55:25 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 15 Nov 2011 11:55:25 -0800 Subject: [rust-dev] compatibility of Rust data structures with C In-Reply-To: <4EC2AB2D.7090405@mozilla.com> References: <4EC2AB2D.7090405@mozilla.com> Message-ID: <728C83A6-C3F1-4E66-AC58-F0F0D16AB4A7@alum.mit.edu> > We should do what C does. Agreed, I am going to change this on the x64 branch to always use 32-bits for the variant ID. There is an existing issue #792 that is relevant. I just added the following comment: In general, I think tags should be laid out as follows (this is fairly close to what the current code does): - No data: just like an enum. - Only one variant: just like that variant's data - Multiple variants with data: struct Tag { unsigned variant_id; union { variant_1, variant_2, variant_3 } data; } where variant_N stands in for the translation of the data types. The tricky case of course is with generic types. For something like tag option<T> { none; some(val: T); } where is the data located? That depends on the type of T and its corresponding alignment restrictions. I am not clear on how the current dynamic shape-walking code (GEP-tuple-like, Shape.h, and friends) handles this kind of dynamic alignment.
I could see an argument for using a pessimistic alignment (i.e., maximally align the data in tags always), but that seems to waste a lot of space, since the maximal reasonable alignment is probably 16 bytes (vector data), and you only NEED 4 bytes (for the variant ID). For reference, the current behavior is: - No data: unsigned long variant_id - Only one variant: use a char[X] array where X is the size of the variant - Multiple variants: struct Tag { unsigned long variant_id; char[X] data; } where X is the size of the largest variant Niko From dteller at mozilla.com Tue Nov 15 12:01:44 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Tue, 15 Nov 2011 21:01:44 +0100 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> Message-ID: <4EC2C528.3030507@mozilla.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Tue Nov 15 15:30:34 2011, Niko Matsakis wrote: > Yes, I was thinking something similar yesterday. Such a pattern > might well be perfect. I have just submitted a blocker issue https://github.com/graydon/rust/issues/1176 Syntax-wise, let's write this as follows let x = #do[e] ...and explore possible issues. 1/ Mixing and matching distinct kinds of errors As such, not too good: foo() -> result::t bar() -> result::t { let x = #do[ foo() ] let y = #do[ bar() ] ... } this will cause a type error. Unless there are plans to implement open sums, which I doubt, we need to impose that the second member of `result::t` has a predictable type. At the moment, the only extensible predictable type that I know of is `any` (or some ADT built on top of it). In the future, we will probably have interfaces and RTTI to differentiate them. Both options seem acceptable to me. For the moment, let's call `exn` the type of exceptions. 
2/ Syntax I suggest we manipulate it only through the following macros (here displayed with their pseudo-type and ret): #do[expr] //Continue if `expr` succeeds, propagate exception otherwise //If `expr` has type result::t, this has type A and // returns result::t #throw[err] //Throw exception //If `err` has type A, this has type `any` and // returns result::t #success[val] //Returns a success //If `val` has type A, this has type result::t // returns any #catch[expr, handler] //Catch some exceptions, maybe not all //If `expr` has type result::t //and `handler` has type block(exn) -> result::t //this has type A //and returns result::t #try[expr] //Leave the world of exceptions //If `expr` has type result::t //this has type A //and fails if `expr` has evaluated to an error Example: fn div(a: int, b: int) -> result::t { if b == 0 { #throw[ arith_error(division_by_zero) ] } #success[ a / b ] } fn write_int(a: int) -> result::t { //... } fn main() { #try[ #catch[[ let x = #do[div(5,10)] + #do[div(5,30)]; #do[write_int(x)]; #success[()] ]]{|e| -> //log error } ] } Ok, that is still a bit heavy, syntax-wise. Any idea on how to improve this?
Cheers, David -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (Darwin) iQEcBAEBAgAGBQJOwsUoAAoJED+FkPgNe9W+0w0H/jKb52dHf5YTE0lPxW13h4NI jhy+qF6pBIXg9bkLsQOVWJzpGTdahKqVOvfIjLEnAQvpDDd3HaHkZbegB9Yc4BQ9 JmPzLB9GvedJoTcXlNaWfmecpjYZZPzZvtfVl0/m3dfathiJBZdValRjlQuKI7V0 x+DXTVEWybBZraRi0dUtIduTfqC8B/OYK6qoOTCHlaVA+/43QHXkbvtLYYIE2+V2 g4uGBl+47hlt4tEdumNvclfqmbeze7fpMMgB4HhdjR3+L28OdQNvWzQgb009/4aA CNQ8BGCTkJ7Z1d7dNw8sFrZm0/uIMwiSo/QDGQQ3U8btWYFnEZOoeo1L6szqgKM= =Et5p -----END PGP SIGNATURE----- From niko at alum.mit.edu Wed Nov 16 10:58:56 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 16 Nov 2011 10:58:56 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC2C528.3030507@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> Message-ID: I think this set of macros may be overkill. The #do[] macro alone seems sufficient to me. "Throw" is just "ret error()" and succeed is just "ret success()", after all, both of which are fairly clear and succinct. "Catch" is just "alt" with pattern matching. As for the issues with the error types, that can be annoying, probably something like any will be the best. In fact, I could imagine that a useful type alias might be type may_fail = std::result or perhaps type may_fail = std::result Niko On Nov 15, 2011, at 12:01 PM, David Rajchenbach-Teller wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On Tue Nov 15 15:30:34 2011, Niko Matsakis wrote: >> Yes, I was thinking something similar yesterday. Such a pattern >> might well be perfect. > > I have just submitted a blocker issue > https://github.com/graydon/rust/issues/1176 > > Syntax-wise, let's write this as follows > let x = #do[e] > > ...and explore possible issues. 
> > > 1/ Mixing and matching distinct kinds of errors > > As such, not too good: > foo() -> result::t > bar() -> result::t > > { > let x = #do[ foo() ] > let y = #do[ bar() ] > ... > } > this will cause a type error. > > Unless there are plans to implement open sums, which I doubt, we need > to impose that the second member of `result::t` has a predictable > type. At the moment, the only extensible predictable type that I know > of is `any` (or some ADT built on top of it). In the future, we will > probably have interfaces and RTTI to differentiate them. Both options > seem acceptable to me. > > For the moment, let's call `exn` the type of exceptions. > > > > 2/ Syntax > > I suggest we manipulate it only through the following macros (here > displayed with their pseudo-type and ret): > > #do[expr] > //Continue if `expr` succeeds, propagate exception otherwise > //If `expr` has type result::t, this has type A and > // returns result::t > > #throw[err] > //Throw exception > //If `err` has type A, this has type `any` and > // returns result::t > > #success[val] > //Returns a success > //If `val` has type A, this has type result::t > // returns any > > #catch[expr, handler] > //Catch some exceptions, maybe not all > //If `expr` has type result::t > //and `handler` has type block(exn) -> result::t > //this has type A > //and returns result::t > > #try[expr] > //Leave the world of exceptions > //If `expr` has type result::t > //this has type A > //and fails if `expr` has evaluated to an error > > > Example: > fn div(a: int, b: int) -> result::t { > if x == 0 { #throw[ arith_error(division_by_zero) ] } > #success[ a / b ] > } > fn write_int(a: int) -> result::t { > //... > } > fn main() { > #try[ > #catch[[ > let x = #do[div(5,10)] + #do[div(5,30)]; > #do[write_int(x)]; > #success[()] > ]]{|e| -> > //log error > } > ] > } > > > Ok, that is still a bit heavy, syntax-wise. Any idea on how to improve > this? 
> > 3/ Debugging aid > > Let us return to the definition of exceptions. Once we have safely > semi-abstracted our definition of exceptions behind macros, we can > define `exn` for instance as follows > > type exn = { > spec: any //Exception-specific data > //or some root interface and RTTI > text: option::t //Optional description > > #cfg[debug] > stack: @list::t //Stack information > } > > Which would essentially give us the debugging power of Java exceptions. > > > > Any thoughts? > > Cheers, > David > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.11 (Darwin) > > iQEcBAEBAgAGBQJOwsUoAAoJED+FkPgNe9W+0w0H/jKb52dHf5YTE0lPxW13h4NI > jhy+qF6pBIXg9bkLsQOVWJzpGTdahKqVOvfIjLEnAQvpDDd3HaHkZbegB9Yc4BQ9 > JmPzLB9GvedJoTcXlNaWfmecpjYZZPzZvtfVl0/m3dfathiJBZdValRjlQuKI7V0 > x+DXTVEWybBZraRi0dUtIduTfqC8B/OYK6qoOTCHlaVA+/43QHXkbvtLYYIE2+V2 > g4uGBl+47hlt4tEdumNvclfqmbeze7fpMMgB4HhdjR3+L28OdQNvWzQgb009/4aA > CNQ8BGCTkJ7Z1d7dNw8sFrZm0/uIMwiSo/QDGQQ3U8btWYFnEZOoeo1L6szqgKM= > =Et5p > -----END PGP SIGNATURE----- > From banderson at mozilla.com Wed Nov 16 11:07:17 2011 From: banderson at mozilla.com (Brian Anderson) Date: Wed, 16 Nov 2011 11:07:17 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> Message-ID: <4EC409E5.6000904@mozilla.com> On 11/16/2011 10:58 AM, Niko Matsakis wrote: > I think this set of macros may be overkill. The #do[] macro alone seems sufficient to me. "Throw" is just "ret error()" and succeed is just "ret success()", after all, both of which are fairly clear and succinct. "Catch" is just "alt" with pattern matching. As for the issues with the error types, that can be annoying, probably something like any will be the best. 
In fact, I could imagine that a useful type alias might be > > type may_fail = std::result > > or perhaps > > type may_fail = std::result > I had thought to redefine result as result { ok(T); err(any); } once that was possible, but I do think maybe creating an exn type as David proposed could be better. From niko at alum.mit.edu Wed Nov 16 21:10:20 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 16 Nov 2011 21:10:20 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC409E5.6000904@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> Message-ID: <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> On Nov 16, 2011, at 11:07 AM, Brian Anderson wrote: > I had thought to redefine result as result { ok(T); err(any); } once that was possible, but I do think maybe creating an exn type as David proposed could be better. I am somewhat indifferent as to the precise type of exceptions, so long as you can print it out it's good enough 99% of the time. I could imagine this being a good place for an interface, too. 
Something like: iface failure { fn msg() -> str; fn data() -> any; } Niko From dteller at mozilla.com Thu Nov 17 08:27:40 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Thu, 17 Nov 2011 17:27:40 +0100 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> Message-ID: <4EC535FC.70408@mozilla.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Thu Nov 17 06:10:20 2011, Niko Matsakis wrote: > > On Nov 16, 2011, at 11:07 AM, Brian Anderson wrote: > >> I had thought to redefine result as result { ok(T); err(any); >> } once that was possible, but I do think maybe creating an exn >> type as David proposed could be better. > > I am somewhat indifferent as to the precise type of exceptions, so > long as you can print it out it's good enough 99% of the time. I > could imagine this being a good place for an interface, too. > Something like: > > iface failure { fn msg() -> str; fn data() -> any; } Here's a reworked version with two macros and one type. #throw[val] //if `val` has type T //result typescheme: forall A, A //return typescheme: forall A, result::t #do[val] //if `val` has type A //result typescheme: A //return typescheme: result::t type exn = { //Public API: spec: any //Private API, for debugging only: #[cfg(debug)] stack: list::t//Added and updated by #throw and #do } All of `#throw`, `#do` and `exn` are trivial to implement (or, well, will be once we really have macros :) ) and neither should hurt. 
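A sketch of that triviality in modern Rust syntax; the macro names and the `Exn` shape are adapted for illustration (`#do` is renamed `attempt!` since `do` is a reserved word, and the `text`/`stack` fields are omitted):

```rust
use std::any::Any;

// Adapted `exn`: just the exception-specific payload.
struct Exn {
    spec: Box<dyn Any>,
}

// `#throw[val]`: return an error carrying `val`.
macro_rules! throw {
    ($val:expr) => {
        return Err(Exn { spec: Box::new($val) })
    };
}

// `#do[expr]`: unwrap a success, or propagate the error to the caller.
macro_rules! attempt {
    ($e:expr) => {
        match $e {
            Ok(v) => v,
            Err(e) => return Err(e),
        }
    };
}

fn div(a: i32, b: i32) -> Result<i32, Exn> {
    if b == 0 {
        throw!("division by zero");
    }
    Ok(a / b)
}

fn compute() -> Result<i32, Exn> {
    let x = attempt!(div(5, 1)) + attempt!(div(6, 2));
    Ok(x)
}

fn main() {
    assert_eq!(compute().ok(), Some(8));
    // Handlers downcast the `any`-style payload back to a concrete type.
    match div(1, 0) {
        Err(e) => assert_eq!(e.spec.downcast_ref::<&str>(),
                             Some(&"division by zero")),
        Ok(_) => unreachable!(),
    }
}
```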
Cheers, David -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (Darwin) iQEcBAEBAgAGBQJOxTX8AAoJED+FkPgNe9W+XhUH/0srKVq2WKP8AvS2B7bZXCLr ZvAtuYoMsOSsTLWXuZhZxmk/bXkUGUpKuSuaN2uRUJ7Wbkir9E6iXTmTkyb949x+ aQPYkSTS25N3uUo+6011IWtQKNo9cP59LZGJ/ieypXGsXbWa4hoHcdeSSk0ehoSZ vrPhGxIWoLZpnEKs0KHKQICi6Sedko+0/7yIsdT31WWtuH3zEMgPMBM/WB1OFyIc vZtqd0PcKsof9Z+RlQx/SXO66pk9N9azuBY3olOAERq+mcaivcQn9mFL1vWksM0M j8NlD6BOeyL3hgACdrzWBGRxojRpr9HpCXQ7mxoTxtL7/kKkZUVIIjOz4irg80k= =SzZ/ -----END PGP SIGNATURE----- From lihaitao at gmail.com Thu Nov 17 22:58:53 2011 From: lihaitao at gmail.com (Haitao Li) Date: Fri, 18 Nov 2011 14:58:53 +0800 Subject: [rust-dev] Optional trailing comma in record type-constructor and initializers Message-ID: How about allowing an optional trailing comma in records? Then the following coding style would be encouraged when writing a record with many fields: type album = { title: str, ...... artist: str, }; let x = { title: "some", ...... artist: "artist", }; To append a new field to the record, adding one new line is enough, instead of (1) adding a comma to the previous last field and (2) adding the new field. Besides the convenience of editing or reordering fields within a record, the trailing comma helps generate clean patches. Since the order of record fields is important, new fields are more likely to be appended to the end of the record. It's not uncommon that this kind of merge conflict happens. Another approach is writing the record like: type album = { title: str , artist: str }; But this looks more like an afterthought, and the same issue exists when changing the first field. ECMAScript5[1] allows trailing commas in object initializers. Python has long supported this. C structure fields are delimited by semicolons, as Rust tags are, while C99 allows a trailing comma in enum declarations[2]. Go doesn't have this issue, since each field is written on its own line and delimiters are eliminated entirely. --- 1.
https://bugzilla.mozilla.org/show_bug.cgi?id=508637 2. http://david.tribble.com/text/cdiffs.htm#C99-enum-decl PS. Not sure if this is the right time to raise this kind of syntax discussion. I read a line on the wiki about avoiding syntax discussions for the present. Sorry for breaking the rule if there is one. :) From marijnh at gmail.com Fri Nov 18 07:37:16 2011 From: marijnh at gmail.com (Marijn Haverbeke) Date: Fri, 18 Nov 2011 16:37:16 +0100 Subject: [rust-dev] I added another parameter passing style Message-ID: I'm sorry. There is now a mode called by-copy, which was needed to make constructors behave sensibly in the face of proper enforcement of noncopyability. By-copy works just like by-move, except that when the passed value is an lvalue, it is copied instead of moved (or, if it is a type that can't be copied, an error is raised). By-copy guarantees that the callee owns the passed argument, and does it in a way that generates as few copies as possible. Tag variant constructors, object constructors, and resource constructors now all take their arguments by copy. Don't worry too much about passing-style proliferation. I think by-copy will turn out to supersede by-move. Once we have warnings for accidental copies of uniques (the code exists, but it won't be practical to turn it on until vectors become non-unique again), there won't be much of a reason to use by-move over by-copy. We'll also be able to remove the user-visible distinction between by-val and by-ref at some point (though we might want to keep that in native functions only). Cheers, Marijn From marijnh at gmail.com Fri Nov 18 07:55:50 2011 From: marijnh at gmail.com (Marijn Haverbeke) Date: Fri, 18 Nov 2011 16:55:50 +0100 Subject: [rust-dev] Kind system update Message-ID: I pushed the new kind system today. Things to be aware of are: - We now have proper copyability analysis, and resources can be used in saner ways. - The way arguments are passed to tag, object, and resource constructors changed.
See my other e-mail to the list. - 'Last uses' of locals (arguments passed in a mode that makes them owned by the function, and local let variables) are now treated specially -- when stored or passed somewhere, they are moved instead of copied. Most importantly, this makes returning a local or putting it in a data structure more efficient in most cases. This is taken into account by the copyability analysis, so that you only get an error when your program actually tries to use a noncopyable local after storing it somewhere. - The kinds are now called 'sendable', 'copyable', and 'noncopyable'. The keywords to mark generic parameters are 'send' and 'copy' (noncopyable is what you get when you don't specify a keyword). - I got rid of the 'implicit shared [copyable] kind for generic functions' thing again. This means you'll once again often forget to add 'copy' and have to add it after the compiler complains. About 40% of the generic functions in our standard library require copyable arguments -- this is more than I expected. Still, over half can operate on noncopyable types. Defaulting to copyable would have an effect similar to the 'const' keyword in C++ -- people forget to think about it when they write generic functions, so when you do need to apply a generic to a noncopyable kind, you'll first have to fix the generic and all generics it calls to have the right kind bound. - Warning about copying of unique types is easy now (it's implemented and commented out at https://github.com/graydon/rust/blob/master/src/comp/middle/kind.rs#L127 ), but it generates an enormous number of warnings because we're copying vectors everywhere. I think we might as well leave this off until we have non-unique vector types. That's it. I'll write something up in the tutorial early next week. I think the new system is easier to think about and to explain. And the last-use analysis provided a nice speedup by saving us a bunch of copies.
Best, Marijn From graydon at mozilla.com Fri Nov 18 10:37:21 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Fri, 18 Nov 2011 10:37:21 -0800 Subject: [rust-dev] Kind system update In-Reply-To: References: Message-ID: <4EC6A5E1.90409@mozilla.com> On 18/11/2011 7:55 AM, Marijn Haverbeke wrote: > That's it. I'll write something up in the tutorial early next week. I > think the new system is easier to think about and to explain. And the > last-use analysis provided a nice speedup by saving us a bunch of > copies. This is fantastic news, I'm very happy you did this (and it seems to have held together). Thanks. -Graydon From graydon at mozilla.com Fri Nov 18 10:39:23 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Fri, 18 Nov 2011 10:39:23 -0800 Subject: [rust-dev] I added another parameter passing style In-Reply-To: References: Message-ID: <4EC6A65B.6070308@mozilla.com> On 18/11/2011 7:37 AM, Marijn Haverbeke wrote: > I'm sorry. > > There is now a mode called by-copy, which was needed to make > constructors behave sensibly in the face of proper enforcement of > noncopyability. Ok. How is it denoted? Your patch looks like it added + and possibly ++ or #? I can't quite tell. > Don't worry too much about passing-style proliferation. I think > by-copy will turn out to supersede by-move. Once we have warnings for > accidental copies of uniques (the code exists, but it won't be > practical to turn it on until vectors become non-unique again), there > won't be much of a reason to use by-move over by-copy. I'm a bit worried about the proliferation, but I'm hopeful there's enough redundancy at this point to factor-and-discard one or more of them, as you say. I'm confused about by-move going away; the point of by-move is that it takes ownership in the callee. How do we express "the channel takes ownership" in the comm system if we eliminate by-move? 
-Graydon From graydon at mozilla.com Fri Nov 18 10:40:54 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Fri, 18 Nov 2011 10:40:54 -0800 Subject: [rust-dev] I added another parameter passing style In-Reply-To: <4EC6A65B.6070308@mozilla.com> References: <4EC6A65B.6070308@mozilla.com> Message-ID: <4EC6A6B6.3060707@mozilla.com> On 18/11/2011 10:39 AM, Graydon Hoare wrote: > Ok. How is it denoted? Your patch looks like it added + and possibly ++ > or #? I can't quite tell. Nevermind, got the tyencode / parser rules confused while reading. -Graydon From banderson at mozilla.com Fri Nov 18 11:30:47 2011 From: banderson at mozilla.com (Brian Anderson) Date: Fri, 18 Nov 2011 11:30:47 -0800 Subject: [rust-dev] Kind system update In-Reply-To: References: Message-ID: <4EC6B267.1010509@mozilla.com> On 11/18/2011 07:55 AM, Marijn Haverbeke wrote: > - 'Last uses' of locals (arguments passed in a mode that makes them > owned by the function, and local let variables) are now treated > specially -- when stored or passed somewhere, they are moved instead > of copied. Most importantly, this makes most returning a local or > putting it in a data structure more efficient. This is taken into > account by the copyability analysis, so that you only get an error > when your program actually tries to use a noncopyable local after > storing it somewhere. I'm worried that this rule is too clever. If I have some code like: let a = "whatever"; let v = variant(a); then 'a' is moved into 'v'. If I change the code later to use 'a' again after the initialization of 'v' then variant(a) becomes a copy. I don't know that there's actually any risk there, but it seems counterintuitive. 
From graydon at mozilla.com Fri Nov 18 11:45:12 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Fri, 18 Nov 2011 11:45:12 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC535FC.70408@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> Message-ID: <4EC6B5C8.8010906@mozilla.com> Woah woah, I'm arriving late here. We *have* exceptions. They're called 'failure'. The only point about them is that you can't catch them. For a couple reasons: - We can't re-establish task typestate from unknown starting point; memory might not be initialized, arbitrary other predicates not hold, etc. - Philosophically, catching means either one of two things: 1. You know exactly what the failure means and an exact, transparent recovery mode. For this we recommend simply modeling the recovery mode you want for expected-but-rare circumstances as arguments passed *into* the callee, and handled at the site of circumstance. Think of how O_CREAT and O_TRUNC work in ::open. We have these wonderful tag things for passing in such options, they can be highly structured if necessary. 2. You don't or can't really understand and manage the failure, want to wall off a whole subsystem, to contain an unknown failure but otherwise, at best, just log it and try to reset/try again. For this we have tasks and failure. Intentionally unrecoverable until the task boundary, and intentionally not-very-typed. IOW, our approach to managing faults requires thinking a bit and classifying them, but not much more than you'd have to classify a java exception as checked or unchecked. And we (hopefully) don't wind up throwing or unwinding quite as much either, as a result. 
Throwing (in terms of implementation) is actually not all that great: it's a fair bit slower than just passing a flag, and catch paths in production code are often bitrotted and/or untested. Flags for common/expected special circumstances are much more likely to get tested, run quickly and correctly. There's a bit of fudging possible on that line because we've talked about (and I'm still ok with) a cross-task failure notice being something that carries an inspectable stack of "any" values (stacktrace entries, "note"s, an original "fail" argument). But that's unimplemented and, in any case, will never be precise: an "any" can always be some type you weren't expecting. -Graydon From graydon at mozilla.com Fri Nov 18 11:57:56 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Fri, 18 Nov 2011 11:57:56 -0800 Subject: [rust-dev] Kind system update In-Reply-To: <4EC6B267.1010509@mozilla.com> References: <4EC6B267.1010509@mozilla.com> Message-ID: <4EC6B8C4.6090104@mozilla.com> On 18/11/2011 11:30 AM, Brian Anderson wrote: > I'm worried that this rule is too clever. If I have some code like: > > let a = "whatever"; > let v = variant(a); > > then 'a' is moved into 'v'. If I change the code later to use 'a' again > after the initialization of 'v' then variant(a) becomes a copy. I don't > know that there's actually any risk there, but it seems counterintuitive. Ah, I see, this is how he's proposing to obviate the need for move-mode. Just have last-use-into-copy-mode-argument do a move? And with the "warn on big copy" behavior, the risk is ... mitigated. I agree it feels a bit too clever / counterintuitive that way, but I haven't used it much yet. It might be ok. My initial thought is to add a unary "move" operator to mirror "<-", the same way "copy" mirrors "=". Where "<-" and "move" now both really only have the effect of turning the putative "large-copy warning" into a hard error. A careful author might well want to do this inside their module. 
I certainly would, if I knew I was writing in "move style". I'd place the moves where I think they should occur, to minimize the amount of "lval last-use reasoning" I have to keep in my head, make it explicit. (Especially if I had internalized the concept "move means fast"; who doesn't want to add all the "go fast" symbols to their code they can? :) Anyway, I think this is all definitely on the right track, but we might need to play with it a bit more to get it feeling just right. -Graydon From marijnh at gmail.com Fri Nov 18 12:07:37 2011 From: marijnh at gmail.com (Marijn Haverbeke) Date: Fri, 18 Nov 2011 21:07:37 +0100 Subject: [rust-dev] Kind system update In-Reply-To: References: <4EC6B267.1010509@mozilla.com> <4EC6B8C4.6090104@mozilla.com> Message-ID: I somewhat share your worries about the cleverness of the last-use rule. But my logic for adding it anyway is: A) We need to do this anyway as an optimization, so B) if we are already doing it, we might as well remove the burden of explicitly annotating moves from the user to the compiler. I still think an explicit unary move would be a useful addition for allowing people to write more obvious code, but I don't think the kind checker should complain about copies that are already going to be optimized to moves anyway. From graydon at mozilla.com Fri Nov 18 12:10:51 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Fri, 18 Nov 2011 12:10:51 -0800 Subject: [rust-dev] Kind system update In-Reply-To: References: <4EC6B267.1010509@mozilla.com> <4EC6B8C4.6090104@mozilla.com> Message-ID: <4EC6BBCB.80907@mozilla.com> On 18/11/2011 12:07 PM, Marijn Haverbeke wrote: > I somewhat share your worries about the cleverness of the > last-use rule. But my logic for adding it anyway is: A) We need to do > this anyway as an optimization, so B) if we are already doing it, we > might as well remove the burden of explicitly annotating moves from the > user to the compiler.
I still think an explicit unary move would be > a useful addition for allowing people to write more obvious code, but > I don't think the kind checker should complain about copies that > are already going to be optimized to moves anyway. Agreed. (You might also want to take a look -- sorry I didn't catch you doing this earlier and mention it -- at the flow analysis done in typestate. I think it knows last uses as well. You might be able to borrow what it knows.) -Graydon From dherman at mozilla.com Fri Nov 18 12:17:35 2011 From: dherman at mozilla.com (David Herman) Date: Fri, 18 Nov 2011 15:17:35 -0500 Subject: [rust-dev] Kind system update In-Reply-To: References: Message-ID: <92BA2217-484A-4B81-AA8E-40A11AD20029@mozilla.com> I'm very concerned about these changes. Let's talk about this on Tuesday. Dave On Nov 18, 2011, at 10:55 AM, Marijn Haverbeke wrote: > I pushed the new kind system today. Things to be aware of are: > > - We now have proper copyability analysis, and resources can be used > in saner ways. > > - The way arguments are passed to tag, object, and resource > constructors changed. See my other e-mail to the list. > > - 'Last uses' of locals (arguments passed in a mode that makes them > owned by the function, and local let variables) are now treated > specially -- when stored or passed somewhere, they are moved instead > of copied. Most importantly, this makes most returning a local or > putting it in a data structure more efficient. This is taken into > account by the copyability analysis, so that you only get an error > when your program actually tries to use a noncopyable local after > storing it somewhere. > > - The kinds are now called 'sendable', 'copyable', and 'noncopyable'. > The keywords to mark generic parameters are 'send' and 'copy' > (noncopyable is what you get when you don't specify a keyword). > > - I got rid of the 'implicit shared [copyable] kind for generic > functions' thing again.
This means you'll once again often forget to > add 'copy' and have to add it after the compiler complains. About 40% > of the generic functions in our standard library require copyable > arguments -- this is more than I expected. Still, over half can > operate on noncopyable types. Defaulting to copyable would have an > effect similar to the 'const' keyword in C++ -- people forget to think > about it when they write generic functions, so when you do need to > apply a generic to a noncopyable kind, you'll first have to fix the > generic and all generics it calls to have the right kind bound. > > - Warning about copying of unique types is easy now (it's implemented > and commented out at > https://github.com/graydon/rust/blob/master/src/comp/middle/kind.rs#L127 > ), but it generates an enormous amount of warnings because we're > copying vectors everywhere. I think we might as well leave this off > until we have non-unique vector types. > > That's it. I'll write something up in the tutorial early next week. I > think the new system is easier to think about and to explain. And the > last-use analysis provided a nice speedup by saving us a bunch of > copies. > > Best, > Marijn > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From jim at uazu.net Fri Nov 18 13:12:22 2011 From: jim at uazu.net (Jim Peters) Date: Fri, 18 Nov 2011 16:12:22 -0500 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC6B5C8.8010906@mozilla.com> References: <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> Message-ID: <20111118211222.GA5950@uazu.net> Graydon Hoare wrote: > - Philosophically, catching means either one of two things: > > 1. 
You know exactly what the failure means and an exact, transparent > recovery mode. For this we recommend simply modeling the recovery > mode you want for expected-but-rare circumstances as arguments > passed *into* the callee, and handled at the site of > circumstance. Think of how O_CREAT and O_TRUNC work in ::open. > We have these wonderful tag things for passing in such options, > they can be highly structured if necessary. > > 2. You don't or can't really understand and manage the failure, > want to wall off a whole subsystem, to contain an unknown failure > but otherwise, at best, just log it and try to reset/try again. > For this we have tasks and failure. Intentionally unrecoverable > until the task boundary, and intentionally not-very-typed. Let's say I have a function parse_file(), it might fail because of I/O problems or syntax failures, or it may give me back a valid parsed object. Given that both errors may be detected at any depth in the internal call tree, in another language exceptions would be a good approach, keeping the success path apart from the failure path. With the approach above, this means using a task to contain the failure, i.e. the model is spawn/wait, rather than try/catch. (Tasks are again taking a role like a language construct, not merely as a scheduling entity.) Jim -- Jim Peters (_)/=\~/_(_) jim at uazu.net (_) /=\ ~/_ (_) Uaz? 
(_) /=\ ~/_ (_) http:// in Peru (_) ____ /=\ ____ ~/_ ____ (_) uazu.net From graydon at mozilla.com Fri Nov 18 13:34:59 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Fri, 18 Nov 2011 13:34:59 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <20111118211222.GA5950@uazu.net> References: <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <20111118211222.GA5950@uazu.net> Message-ID: <4EC6CF83.1050405@mozilla.com> On 18/11/2011 1:12 PM, Jim Peters wrote: > Let's say I have a function parse_file(), it might fail because of I/O > problems or syntax failures, or it may give me back a valid parsed > object. Given that both errors may be detected at any depth in the > internal call tree, in another language exceptions would be a good > approach, keeping the success path apart from the failure path. > > With the approach above, this means using a task to contain the > failure, i.e. the model is spawn/wait, rather than try/catch. To some extent. It depends a lot on the parsing task. I don't mean to be dismissive. This is a great example. But I want to clarify things: - In a Serious Parser, errors are managed explicitly because error reporting to the user is an Important Diagnostic Feature of the parsing. There's a whole error formatting, diagnostic emitting, suppressing-and-counting logic built in that goes beyond what an exception will do. You are always going to be doing manual work All Through The Parser. - In a Throwaway Parser, it's not clear to me that treating all failures as interchangeable, as with tasks ("we can't get meaningful input") is losing a lot of fidelity over an exception-based approach. That's what failure is for in rust. 
And in both cases, putting the parser in its own task has the wonderful advantage of also allowing you do to other things while the parser is waiting on I/O. It's a natural approach. Sorry to blather on like this, but ... the point of "lightweight tasks" in this language is that they're lightweight, and get used early and often. If everyone's reaction to "use a task" is "oh bother, those are far too heavy", I think we've made a mistake somewhere. > (Tasks are again taking a role like a language construct, not merely > as a scheduling entity.) They originally were first class, and I am guilty-as-charged with being willing to treat them as relatively high-ranking concepts in the design. The design thesis in rust is that proper structured programming *needs* task-like memory-and-concurrency partitioning to scale robustly, and it's one of the things we've been falling down on all through the 80s, 90s, 2000s. Languages keep making devices that don't quite get used, systems wind up with far too few internal boundaries, so are far too serial and fragile. -Graydon From erick.tryzelaar at gmail.com Fri Nov 18 14:18:14 2011 From: erick.tryzelaar at gmail.com (Erick Tryzelaar) Date: Fri, 18 Nov 2011 14:18:14 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC6CF83.1050405@mozilla.com> References: <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <20111118211222.GA5950@uazu.net> <4EC6CF83.1050405@mozilla.com> Message-ID: Has anyone considered haskell's monadic errors? As far as the end user is concerned, it's pretty much equivalent to exceptions, but minus the stack-jumping-ness of true exceptions. 
Since at its heart it's just returning values of either, I would imagine it wouldn't break typestate: http://www.randomhacks.net/articles/2007/03/10/haskell-8-ways-to-report-errors From jim at uazu.net Fri Nov 18 14:46:27 2011 From: jim at uazu.net (Jim Peters) Date: Fri, 18 Nov 2011 17:46:27 -0500 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC6CF83.1050405@mozilla.com> References: <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <20111118211222.GA5950@uazu.net> <4EC6CF83.1050405@mozilla.com> Message-ID: <20111118224626.GA6133@uazu.net> Graydon Hoare wrote: > To some extent. It depends a lot on the parsing task. I don't mean > to be dismissive. This is a great example. But I want to clarify > things: > > - In a Serious Parser, errors are managed explicitly because error > reporting to the user is an Important Diagnostic Feature of the > parsing. There's a whole error formatting, diagnostic emitting, > suppressing-and-counting logic built in that goes beyond what an > exception will do. You are always going to be doing manual work > All Through The Parser. > > - In a Throwaway Parser, it's not clear to me that treating all > failures as interchangeable, as with tasks ("we can't get > meaningful input") is losing a lot of fidelity over an > exception-based approach. That's what failure is for in rust. Okay, that makes sense. The coder just needs to get used to using tasks for containment instead of other constructions. As I code from now on (in other languages) I will try to imagine how I would structure the same thing in Rust based on your two options (pre-specify a solution, or fail the task). > [...] > > Sorry to blather on like this, but ...
the point of "lightweight > tasks" in this language is that they're lightweight, and get used > early and often. If everyone's reaction to "use a task" is "oh > bother, those are far too heavy", I think we've made a mistake > somewhere. > > >(Tasks are again taking a role like a language construct, not merely > >as a scheduling entity.) > > They originally were first class, and I am guilty-as-charged with > being willing to treat them as relatively high-ranking concepts in > the design. The design thesis in rust is that proper structured > programming *needs* task-like memory-and-concurrency partitioning to > scale robustly, and it's one of the things we've been falling down > on all through the 80s, 90s, 2000s. Languages keep making devices > that don't quite get used, systems wind up with far too few internal > boundaries, so are far too serial and fragile. Sorry -- I didn't mean for it to sound like criticism. In a previous thread I was trying to establish whether Rust's tasks were first-class language constructs and optimisable, or just scheduling entities. I agree. I really do like the whole first-class "task as a construct" approach -- with task primitives in the same list as if/while/for. However, I still worry that you may need to apply "identity transformations" on task use to make them sufficiently efficient, when Rust approaches C's efficiency in all other aspects. By "identity transformations" I mean something like algebraic identities, for example, converting a certain pattern of task+port use into another one, or into an inlined (serial) version, where the effect is indistinguishable to the programmer, but lets Rust run it faster. (There were some scenarios in a previous thread.) 
The question is where will Rust's eventual task-costs lie on the scale: totally free (optimised down to what I might hand-code if I knew the number of cores I was running on), cheap-enough (probably I shouldn't use too many hundred, and tasks will slow me down in inner loops), or too-slow (avoid tasks!). I would prefer "totally free" if possible -- when Rust eventually gets to that level of optimisation -- which means that the coder can use them freely as a language construct like braces or inline functions or something, without any concern at all that they might cause inefficiency. Jim -- Jim Peters (_)/=\~/_(_) jim at uazu.net (_) /=\ ~/_ (_) Uaz? (_) /=\ ~/_ (_) http:// in Peru (_) ____ /=\ ____ ~/_ ____ (_) uazu.net From banderson at mozilla.com Fri Nov 18 15:10:54 2011 From: banderson at mozilla.com (Brian Anderson) Date: Fri, 18 Nov 2011 15:10:54 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC6CF83.1050405@mozilla.com> References: <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <20111118211222.GA5950@uazu.net> <4EC6CF83.1050405@mozilla.com> Message-ID: <4EC6E5FE.3080905@mozilla.com> On 11/18/2011 01:34 PM, Graydon Hoare wrote: > Sorry to blather on like this, but ... the point of "lightweight > tasks" in this language is that they're lightweight, and get used > early and often. If everyone's reaction to "use a task" is "oh bother, > those are far too heavy", I think we've made a mistake somewhere. The reason std::io returns error values now is largely because it's still not practical to use tasks as they are intended. I did first try to just isolate in a task the IO failures I needed to handle, but the language is not expressive enough yet to do this in a general way (really needs unique closures). 
Error handling is super important and so far we aren't providing any guidance on how to do it the Rust way (since rustc mostly is ok with the entire process failing). From niko at alum.mit.edu Fri Nov 18 16:45:46 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Fri, 18 Nov 2011 16:45:46 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: References: <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <20111118211222.GA5950@uazu.net> <4EC6CF83.1050405@mozilla.com> Message-ID: <0DFA5C65-FFDA-4923-AA15-4BBDCA337DB3@alum.mit.edu> A #do[] macro which just returns in the case of failure is basically equivalent to this in practice, I think. Niko On Nov 18, 2011, at 2:18 PM, Erick Tryzelaar wrote: > Has anyone considered haskell's monadic errors? As far as the end user > is concerned, it's pretty much equivalent to exceptions, but minus the > stack-jumping-ness of true exceptions. 
Since at it's heart it's just > returning values of either, I would imagine it wouldn't break > typestate: > > http://www.randomhacks.net/articles/2007/03/10/haskell-8-ways-to-report-errors > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > From niko at alum.mit.edu Fri Nov 18 16:51:49 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Fri, 18 Nov 2011 16:51:49 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC6B5C8.8010906@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> Message-ID: <00A816FC-0F28-40F6-941C-F11D656E4888@alum.mit.edu> I too hope that we can get by with using tasks to contain failure rather than catch handlers. However, I think there will be "lighter weight" cases where a long function wants to bail out early. This is sometimes because some internal operation failed due to error or might be something like a recursive search which found a result. These cases are well-modeled with a type like either<> or std::result<>, but sometimes the syntactic burden of matching and returning failures distracts from the main line of the code. I think this is an important issue to address: a simple macro is probably enough, though. On Nov 18, 2011, at 11:45 AM, Graydon Hoare wrote: > Woah woah, I'm arriving late here. > > We *have* exceptions. They're called 'failure'. The only point about them is that you can't catch them. For a couple reasons: > > - We can't re-establish task typestate from unknown starting point; > memory might not be initialized, arbitrary other predicates not > hold, etc. > > - Philosophically, catching means either one of two things: > > 1. 
You know exactly what the failure means and an exact, transparent > recovery mode. For this we recommend simply modeling the recovery > mode you want for expected-but-rare circumstances as arguments > passed *into* the callee, and handled at the site of > circumstance. Think of how O_CREAT and O_TRUNC work in ::open. > We have these wonderful tag things for passing in such options, > they can be highly structured if necessary. > > 2. You don't or can't really understand and manage the failure, > want to wall off a whole subsystem, to contain an unknown failure > but otherwise, at best, just log it and try to reset/try again. > For this we have tasks and failure. Intentionally unrecoverable > until the task boundary, and intentionally not-very-typed. > > IOW, our approach to managing faults requires thinking a bit and classifying them, but not much more than you'd have to classify a java exception as checked or unchecked. And we (hopefully) don't wind up throwing or unwinding quite as much either, as a result. > > Throwing (in terms of implementation) is actually not all that great: it's a fair bit slower than just passing a flag, and catch paths in production code are often bitrotted and/or untested. Flags for common/expected special circumstances are much more likely to get tested, run quickly and correctly. > > There's a bit of fudging possible on that line because we've talked about (and I'm still ok with) a cross-task failure notice being something that carries an inspectable stack of "any" values (stacktrace entries, "note"s, an original "fail" argument). But that's unimplemented and, in any case, will never be precise: an "any" can always be some type you weren't expecting. 
> > -Graydon > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > From niko at alum.mit.edu Fri Nov 18 16:56:35 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Fri, 18 Nov 2011 16:56:35 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <00A816FC-0F28-40F6-941C-F11D656E4888@alum.mit.edu> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <00A816FC-0F28-40F6-941C-F11D656E4888@alum.mit.edu> Message-ID: <35B320DA-4CB3-4D79-813F-1D9902BC34A1@alum.mit.edu> On Nov 18, 2011, at 4:51 PM, Niko Matsakis wrote: > 1. You know exactly what the failure means and an exact, transparent > recovery mode. For this we recommend simply modeling the recovery > mode you want for expected-but-rare circumstances as arguments > passed *into* the callee, and handled at the site of > circumstance. Think of how O_CREAT and O_TRUNC work in ::open. > We have these wonderful tag things for passing in such options, > they can be highly structured if necessary. This is a good point, however. Moving the recovery into the code itself is elegant, when it applies; I'm not sure how much this generalizes. Have to think about it. 
Niko From dteller at mozilla.com Sat Nov 19 04:02:22 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Sat, 19 Nov 2011 13:02:22 +0100 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC6B5C8.8010906@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> Message-ID: <4EC79ACE.8020305@mozilla.com> On 11/18/11 8:45 PM, Graydon Hoare wrote: > Woah woah, I'm arriving late here. > > We *have* exceptions. They're called 'failure'. The only point about > them is that you can't catch them. For a couple reasons: I feel that there is a little misunderstanding. If you do not mind, I would like to contextualize our conversation a little. Let's start with a little safety vocabulary, not specifically related to Rust. It is quite common to have two distinct words: "exceptions" for exceptional but expected behavior -- that you typically want to catch -- and "failures" for issues that are beyond the control of the developer and can at best be contained by shutting down a system and relaunching it. To attain a high level of safety, having a manner of dealing with both is a Good Thing (tm). Fortunately, these two concepts mostly map to your points 1. or 2.: > [...] > - Philosophically, catching means either one of two things: > > 1. You know exactly what the failure means and an exact, transparent > recovery mode. For this we recommend simply modeling the recovery > mode you want for expected-but-rare circumstances as arguments > passed *into* the callee, and handled at the site of > circumstance. Think of how O_CREAT and O_TRUNC work in ::open. > We have these wonderful tag things for passing in such options, > they can be highly structured if necessary. > > 2. 
You don't or can't really understand and manage the failure, > want to wall off a whole subsystem, to contain an unknown failure > but otherwise, at best, just log it and try to reset/try again. > For this we have tasks and failure. Intentionally unrecoverable > until the task boundary, and intentionally not-very-typed. For category 2., as you mention, Rust has the mechanism of tasks and failures. The non-spawnability of closures has me a little worried, but I am sure that most/all useful cases can be encoded without this feature, and since my current hands-on experience with Rust tasks and failures is essentially non-existent, I feel incompetent to discuss these in depth. However, this whole thread was about category 1 and, more precisely, about the library design of category 1, rather than language design. For reporting exceptional behaviors, the "wonderful tag things" you mention are a necessary and sufficient *language* mechanism, but stopping the *library* design at this point is inviting either the same mess as Haskell or OCaml or the same mess as mozilla-central. Both Haskell and OCaml have around 6 distinct -- and largely type-incompatible -- manners of reporting exceptions. This does not even take into account the fact that both OCaml and Haskell sum types are (or can be made) more flexible/powerful than Rust tags, something we probably do not want in Rust. On the other side, mozilla-central has only one mechanism, which has a fixed set of exceptions, and new exceptions can only be added by rebuilding the whole platform. In order to avoid both pitfalls, I advocate that we need to decide on a standard manner of reporting exceptions very early in the development of Rust -- ideally before anybody starts writing or using any library that makes heavy use of exceptions, such as an IO library. The rest of the thread was about how best to offer standard library mechanisms for dealing with exception reporting. I hope this clarified matters a little. 
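To illustrate the kind of single standard reporting convention being argued for here (an editorial sketch in today's Rust, not code from the thread -- though it is worth noting that Rust did eventually standardize exactly this shape as std::result::Result; the names `Report` and `checked_div` are invented):

```rust
// One shared reporting tag that every library uses for
// expected-but-exceptional outcomes, so handling composes uniformly
// instead of each library inventing its own convention.
#[derive(Debug, PartialEq)]
enum Report<T, E> {
    Ok(T),
    Err(E),
}

// A library function reports the expected-but-rare case through the
// standard tag rather than by failing the whole task.
fn checked_div(n: i64, d: i64) -> Report<i64, String> {
    if d == 0 {
        Report::Err("division by zero".to_string())
    } else {
        Report::Ok(n / d)
    }
}
```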
Cheers, David From dteller at mozilla.com Sat Nov 19 04:09:35 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Sat, 19 Nov 2011 13:09:35 +0100 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <0DFA5C65-FFDA-4923-AA15-4BBDCA337DB3@alum.mit.edu> References: <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <20111118211222.GA5950@uazu.net> <4EC6CF83.1050405@mozilla.com> <0DFA5C65-FFDA-4923-AA15-4BBDCA337DB3@alum.mit.edu> Message-ID: <4EC79C7F.6030702@mozilla.com> On 11/19/11 1:45 AM, Niko Matsakis wrote: > A #do[] macro which just returns in the case of failure is basically equivalent to this in practice, I think. > > > Niko Not only that, but it plays better with the strengths of Rust. Cheers, David From dteller at mozilla.com Sat Nov 19 04:14:31 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Sat, 19 Nov 2011 13:14:31 +0100 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <35B320DA-4CB3-4D79-813F-1D9902BC34A1@alum.mit.edu> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <00A816FC-0F28-40F6-941C-F11D656E4888@alum.mit.edu> <35B320DA-4CB3-4D79-813F-1D9902BC34A1@alum.mit.edu> Message-ID: <4EC79DA7.5080308@mozilla.com> On 11/19/11 1:56 AM, Niko Matsakis wrote: > > On Nov 18, 2011, at 4:51 PM, Niko Matsakis wrote: > >> 1. You know exactly what the failure means and an exact, transparent >> recovery mode. For this we recommend simply modeling the recovery >> mode you want for expected-but-rare circumstances as arguments >> passed *into* the callee, and handled at the site of >> circumstance. Think of how O_CREAT and O_TRUNC work in ::open. >> We have these wonderful tag things for passing in such options, >> they can be highly structured if necessary. > > This is a good point, however. Moving the recovery into the code itself is elegant, when it applies; I'm not sure how much this generalizes. Have to think about it. I can't remember the title of the original exceptions paper, but if I recall correctly, exceptions were initially designed specifically to avoid such behaviors, as this quite often leads to messy and unreliable code. 
Oh, and a simple example where it does not work: you do not want to deal with the scenario "user has suddenly removed the USB key" smack in the middle of your code for, say, serializing your data structure to a compressed output stream for said key. You want to do this at a higher level. Cheers, David From graydon at mozilla.com Sat Nov 19 11:41:13 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Sat, 19 Nov 2011 11:41:13 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC79ACE.8020305@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <4EC79ACE.8020305@mozilla.com> Message-ID: <4EC80659.2090809@mozilla.com> On 19/11/2011 4:02 AM, David Rajchenbach-Teller wrote: > Let's start with a little safety vocabulary, not specifically related to > Rust. It is quite common to have two distinct words: "exceptions" for > exceptional but expected behavior -- that you typically want to catch -- Heh. I appreciate the attempt to pick such terminology neutrally, but "exception" and "catch" very plainly point towards the modern typed-exceptions-with-unwind-semantics implementation. There are many other systems for error management: error codes, monads, conditions-and-restarts, signals-and-handlers, etc. etc. Unwinding-and-catching is a very specific strategy. 
I point all this out because, quite a long while ago (before publishing anything), the Rust-on-paper design I worked on had a non-unwind-based condition system; errors represented an opportunity for a dynamically-scoped handler to offer recovery, *or fail*. > and "failures" for issues that are beyond the control of the developer > and can at best be contained by shutting down a system and relaunching it. > To attain a high level of safety, having a manner of dealing with both > is a Good Thing (tm). > > Fortunately, these two concepts mostly map to your points 1. or 2.: Agreed. There are at least expected-and-possibly-handled things and unexpected-or-definitely-can't-handle things. 1. and 2. :) > For category 2., as you mention, Rust has the mechanism of tasks and > failures. The non-spawnability of closures has me a little worried, but > I am sure that most/all useful cases can be encoded without this feature > and since my current hands-on experience with Rust tasks and failures is > essentially non-existent, I feel incompetent to discuss these in depth. Closures will be spawnable when we've implemented unique closures. That's a WIP, not a permanently-missing-feature. > However, this whole thread was about category 1 and, more precisely, > about library-design of category 1, rather than language-design. Ok. > For reporting exceptional behaviors, the "wonderful tag things" you > mention are necessary, and a sufficient *language* mechanism, but > stopping the *library* design at this point is inviting either the same > mess as Haskell or OCaml or the same mess as mozilla-central. Both > Haskell and OCaml have around 6 distinct -- and largely type-incompatible > -- manners of reporting exceptions. This does not even take into account > the fact that both OCaml and Haskell sum types are (or can be made) more > flexible/powerful than Rust tags, something we probably do not want in > Rust. 
On the other side, mozilla-central has only one mechanism, which > has a fixed set of exceptions, and new exceptions can only be added by > rebuilding the whole platform. > > In order to avoid both pitfalls, I advocate that we need to decide on a > standard manner of reporting exceptions very early in the development of > Rust -- ideally before anybody starts writing or using any library that > makes heavy use of exceptions, such as an IO library. Yes, and I am proposing one: pass *in* your handlers (or symbolic codes indicating handler-strategy) and have the callee handle *at the site of the condition*. Sorry if I wasn't clear enough about the implied use of tags I meant, up-thread. I'm serious. I've read and understood what you wrote above, so I'll ask you to do the courtesy of reading and understanding the following paragraphs fully as well. I'm writing them not to clobber you with the Obvious Superiority of my own beliefs -- they may well be wrong -- just to clarify exactly what I'm suggesting, what I'd do as alternatives, and why. The old Rust condition system was modeled on the condition system in Mesa. You named "signals" as global items and gave a syntactic form for routing a given signal to a locally-defined "handler" in the caller, much as you would a try/catch block. The difference is that a handler in this scheme is a typed function-like definition dangling after the protected block. It reads like so:

    try {
        os::open(fname);
    } handle os::file_not_exist(str filename) -> file {
        ret os::create(fname);
    }

So the recovery logic remains off the main code path, like a modern "catch block", but with a fn-like signature: arguments and its own "recovery value" return type. At signal occurrence, the originating site would invoke the signal by item name; this would cause the runtime to find the innermost installed handler via either "head-of-a-task-local-list" search, or by static code-range search of the caller stack, similar to C++-unwinding, and call it. 
The handler would either return the typed recovery value, or fail. Failure to locate a handler at all, of course, also generates a fail. This is a nice, pleasant scheme halfway between Liskov's CLU exceptions and lisp's restarts. It has the positive property that it's oriented towards handling at the signalling site rather than unwinding when you actually intend to continue; but it introduces fewer moving parts than the lisp system. During early review, someone -- I think possibly Brendan? -- pointed out some retrospective comments -- I think possibly from Lampson? -- on the system in Mesa. The retrospective was somewhat damning: not of the system in particular, but of the whole notion of splitting the recovery path off into a slow-to-invoke secondary handler (as in Mesa but also as in most modern exception systems). The retrospective reasoning, IIRC (working from memory here; if Brendan or whoever pointed it out is reading I'd appreciate a pointer to the original text) went like this:

- The conditions you expect to generate, the author of the callee code necessarily can enumerate in their own mind. They invoke the signals when things go wrong, after all!

- The set of plausible-and-useful recoveries for any given signal is really quite small and predictable; that same author of the callee can mentally enumerate all the ways they could expect to be told to recover anyways.

- If a signal is frequent enough to make it into the API this way, it's frequent enough that you're going to be invoking the handler regularly. Having that invocation be hundreds of times slower is undesirable.

- Having the recovery logic at a distance from the origin and duplicated for each caller who wants to follow a given pattern actually leads to buggier, more fragile and less likely recovery. 
- The above points combined to -- quite naturally and without stated intention -- make the programmers using the system gradually shift any API they designed from using signals to using flags or variants that described the recovery mode they wanted, into any subsystem with predictable signals.

So they eventually removed the remaining uses of the signal system (mostly bitrotted) and were happier for it. I found this argument compelling. So much so that I'm probably exaggerating or mis-stating the arguments a bit. But it led me to reflect a bit more on the *realistic* uses of exceptions I've seen in programs, and found myself unable to debate it: most catch clauses I see do one of a small number of very predictable things: ignore, retry, hard-fail, log, or try one of a very small number of alternative strategies to accomplishing the initial goal (create the file rather than open it, say) that the author of the callee could very well have predicted and codified in a small tag-set of extra arguments.

So I removed the condition system from the design docs, and never implemented it. I propose structuring the libraries along these lines. That is, to have the callee authors actually think a bit about what an unusual-return means, which ways there might plausibly be for recovering from it, and take a tag or vec-of-tags carrying the preferred recovery strategy.

If this fails to hold together and we really, really have to revive some kind of structured at-a-distance recovery system, I'm going to suggest going back to the Mesa-like signal scheme I sketched out earlier (and above). The main (substantial!) advantage it offers is that recovery paths cause no actual unwinding-or-destruction -- recovery occurs effectively "at the signal site" -- so there's no question of perturbation of the typestate. Unwinding still only happens during failure. The handler is invoked like any other function and if it succeeds the unwinder is never even involved. 
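The search-for-a-handler-then-call-in-place scheme described here can be sketched in today's Rust (an editorial sketch, not the 2011 syntax above; a thread-local handler stack stands in for the task-local list, and all names -- `signal_file_not_exist`, `with_handler`, `open_file` -- are invented):

```rust
use std::cell::RefCell;

// A dynamically-scoped stack of handlers for one named signal.
thread_local! {
    static FILE_NOT_EXIST: RefCell<Vec<fn(&str) -> String>> =
        RefCell::new(Vec::new());
}

// Raise the signal: find the innermost handler and call it right here,
// at the signalling site. No unwinding occurs; the handler's return
// value becomes the recovery value. With no handler installed, we fail.
fn signal_file_not_exist(name: &str) -> String {
    FILE_NOT_EXIST.with(|h| {
        let f = *h.borrow().last().expect("no handler installed: fail");
        f(name)
    })
}

// The "try { .. } handle .. " form, modeled as a scoped installation.
// (A sketch only: a panic in `body` would skip the pop.)
fn with_handler<R>(handler: fn(&str) -> String, body: impl FnOnce() -> R) -> R {
    FILE_NOT_EXIST.with(|h| h.borrow_mut().push(handler));
    let result = body();
    FILE_NOT_EXIST.with(|h| {
        h.borrow_mut().pop();
    });
    result
}

fn open_file(name: &str) -> String {
    // Pretend the file never exists, so the signal always fires.
    signal_file_not_exist(name)
}
```

Note how the dynamic scope is *searched* but never unwound: `open_file` resumes normally with whatever the handler returned.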
IMO it's much tidier than try/throw/catch and/or monads-by-macros. -Graydon From graydon at mozilla.com Sat Nov 19 11:50:48 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Sat, 19 Nov 2011 11:50:48 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC79DA7.5080308@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <00A816FC-0F28-40F6-941C-F11D656E4888@alum.mit.edu> <35B320DA-4CB3-4D79-813F-1D9902BC34A1@alum.mit.edu> <4EC79DA7.5080308@mozilla.com> Message-ID: <4EC80898.9030705@mozilla.com> On 19/11/2011 4:14 AM, David Rajchenbach-Teller wrote: > I can't remember the title of the original exceptions paper, but if I > recall correctly, exceptions were initially designed specifically to > avoid such behaviors, as this quite often leads to messy and unreliable > code. Exceptions date back a long, long way. People have been batting this idea around since the 60s or earlier. I'd have to dig into the HOPL stuff on the shelf to find the origin, if it's even possible to locate. What I want to point out here is this: if you study any of the deployed exception systems, and how they're used, you'll find they're still an active point of debate among language designers, some 50 years on. There's no approach that's clearly and unambiguously superior to others yet discovered. Nothing where people look at the resulting software and say "oh yes, it's orders of magnitude better at handling faults, or easier to reason about the faults in, than this other approach". 
> Oh, and a simple example where it does not work: you do not want to deal > with scenario "user has suddenly removed the USB key" smack in the > middle of your code for, say, serializing your data structure to a > compressed output stream for said key. You want to do this at higher level. No. That's a category #2 error. You moved the goalposts. Nobody involved in any of the libraries is able to predict that occurrence. It's an externally generated, byzantine fault. Those can happen anywhere and no API beyond "crash the subsystem and restart it" can sensibly defend against them. That's why we have "failure". Same as a divide-by-zero or whatever. -Graydon From dteller at mozilla.com Sat Nov 19 12:01:35 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Sat, 19 Nov 2011 21:01:35 +0100 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC80898.9030705@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <00A816FC-0F28-40F6-941C-F11D656E4888@alum.mit.edu> <35B320DA-4CB3-4D79-813F-1D9902BC34A1@alum.mit.edu> <4EC79DA7.5080308@mozilla.com> <4EC80898.9030705@mozilla.com> Message-ID: <4EC80B1F.3070707@mozilla.com> On Sat Nov 19 20:50:48 2011, Graydon Hoare wrote: >> Oh, and a simple example where it does not work: you do not want to deal >> with scenario "user has suddenly removed the USB key" smack in the >> middle of your code for, say, serializing your data structure to a >> compressed output stream for said key. You want to do this at higher >> level. > > No. That's a category #2 error. You moved the goalposts. Nobody > involved in any of the libraries is able to predict that occurrence. > It's an externally generated, byzantine fault. 
Those can happen > anywhere and no API beyond "crash the subsystem and restart it" can > sensibly defend against them. That's why we have "failure". Same as a > divide-by-zero or whatever. You are absolutely right. I was just giving a quick counter-example to show that we cannot handle everything at a low level. Cheers, David From dteller at mozilla.com Sat Nov 19 14:31:27 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Sat, 19 Nov 2011 23:31:27 +0100 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC80659.2090809@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <4EC79ACE.8020305@mozilla.com> <4EC80659.2090809@mozilla.com> Message-ID: <4EC82E3F.2050309@mozilla.com> On 11/19/11 8:41 PM, Graydon Hoare wrote: > On 19/11/2011 4:02 AM, David Rajchenbach-Teller wrote: > >> Let's start with a little safety vocabulary, not specifically related to >> Rust. It is quite common to have two distinct words: "exceptions" for >> exceptional but expected behavior -- that you typically want to catch -- > > Heh. I appreciate the attempt to pick such terminology neutrally, but > "exception" and "catch" very plainly point towards the modern > typed-exceptions-with-unwind-semantics implementation. > > There are many other systems for error management: error codes, monads, > conditions-and-restarts, signals-and-handlers, etc. etc. > Unwinding-and-catching is a very specific strategy. Let's switch vocabulary, then. "Report" and "Handle" are probably less loaded with implicit meanings. 
In my mind, error codes and monads are effectively mechanisms for type 1 report-and-handle (in addition to which, Haskell-style error monads are effectively a different implementation of the unwind-and-catch paradigm). Conditions-and-restarts are somewhat wilder, but I would essentially sit them under this umbrella -- note that my knowledge of conditions-and-restarts is extremely superficial. Still in my mind, signals-and-handlers are on the other side, i.e. a mechanism for failures. > I point all this out because, quite a long while ago (before publishing > anything), the Rust-on-paper design I worked on had a non-unwind-based > condition system; errors represented an opportunity for a > dynamically-scoped handler to offer recovery, *or fail*. If you still have notes, I am interested in reading them. Strike that, here they are :) >> [...] > > Closures will be spawnable when we've implemented unique closures. > That's a WIP, not a permanently-missing-feature. Great :) > Yes, and I am proposing one: pass *in* your handlers (or symbolic codes > indicating handler-strategy) and have the callee handle *at the site of > the condition*. Sorry if I wasn't clear enough about the implied use of > tags I meant, up-thread. > > I'm serious. I've read and understood what you wrote above, so I'll ask > you to do the courtesy of reading and understanding the following > paragraphs fully as well. I'm writing them not to clobber you with > the Obvious Superiority of my own beliefs -- they may well be wrong -- > just to clarify exactly what I'm suggesting, what I'd do as > alternatives, and why. Hey, don't worry, I love brainstorming and learning new stuff :) > The old Rust condition system was modeled on the condition system in > Mesa. [...] Can I assume that your example is actually too short to demonstrate the meaningful difference with C++/ML-style exceptions and that the following would also work? 
    fn foo(fname: str) -> uint {
        try {
            let x = os::open(fname);
            ret mylib::get_some_uint_property(x);
        } handle os::file_not_exist(str filename) -> file {
            ret os::create(fname);
        }
    }

> I found this argument compelling. So much so I'm probably exaggerating > or mis-stating the arguments a bit. But it led me to reflect a bit more > on the *realistic* uses of exceptions I've seen in programs, and found > myself unable to debate it: most catch clauses I see do one of a small > number of very predictable things: ignore, retry, hard-fail, log, or try > one of a very small number of alternative strategies to accomplishing > the initial goal (create the file rather than open it, say) that the > author of the callee could very well have predicted and codified in a > small tag-set of extra arguments. > > So I removed the condition system from the design docs, and never > implemented it. I propose structuring the libraries along these lines. > That is, to have the callee authors actually think a bit about what an > unusual-return means, which ways there might plausibly be for recovering > from it, and take a tag or vec-of-tags carrying the preferred recovery > strategy. Ok, I now get what you meant with "wonderful tag things". I hope you will forgive me for misunderstanding your previous e-mail, it was a tad elliptic :) This sounds interesting, and it might work, but I have the intuition that this would not be precise enough. Still, as the ground has shifted a little since the start of this thread, I would like to sum up the current setting. If I understand correctly, Rust 0.x will offer: * tasks that are lightweight enough that we can theoretically afford to launch as often as we want just for error-handling; * tasks that are first-class. In addition, I am nearly certain that we wish to be able to display a different error message for, say, a failure of a module due to a network error or a failure of the same module doing exactly the same thing but due to a disk error. 
Therefore, I also understand that: * `fail` or some variant thereof can actually report a failure reason (presumably as a task-bound value with type `any`). If so, in effect, we already have Report-and-handle, and more precisely, C++/ML-style type-safe-modulo-dynamic-typing exceptions, with performance comparable to C++ exceptions. And in that case, I have some difficulty finding out exactly which cases remain to be covered by the library design you detail above. Could you provide a few examples before I answer the following? > If this fails to hold together and we really, really have to revive some > kind of structured at-a-distance recovery system, I'm going to suggest > going back to the Mesa-like signal scheme I sketched out earlier (and > above). The main (substantial!) advantage it offers is that recovery > paths cause no actual unwinding-or-destruction -- recovery occurs > effectively "at the signal site" -- so there's no question of > perturbation of the typestate. Unwinding still only happens during > failure. The handler is invoked like any other function and if it > succeeds the unwinder is never even involved. IMO it's much tidier than > try/throw/catch and/or monads-by-macros. Thanks for your detailed answer, David From banderson at mozilla.com Sat Nov 19 15:44:36 2011 From: banderson at mozilla.com (Brian Anderson) Date: Sat, 19 Nov 2011 15:44:36 -0800 Subject: [rust-dev] Optional trailing comma in record type-constructor and initializers In-Reply-To: References: Message-ID: <4EC83F64.7060502@mozilla.com> On 11/17/2011 10:58 PM, Haitao Li wrote: > How about allowing an optional trailing comma appended in record? Thanks for this well-thought-out proposal. I am not opposed, and I do run into this regularly. 
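For reference, the shape being proposed looks like this in today's Rust, where the trailing comma did in fact become legal (an editorial sketch; `Point` and `make` are invented names). The payoff is that adding a field later touches only one line in a diff:

```rust
// Trailing comma after the last field of the type definition...
struct Point {
    x: i64,
    y: i64, // <- legal, and diff-friendly
}

// ...and after the last field of the record initializer.
fn make() -> Point {
    Point {
        x: 1,
        y: 2, // <- same here
    }
}
```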
From banderson at mozilla.com Sat Nov 19 15:51:58 2011 From: banderson at mozilla.com (Brian Anderson) Date: Sat, 19 Nov 2011 15:51:58 -0800 Subject: [rust-dev] Optional trailing comma in record type-constructor and initializers In-Reply-To: <4EC83F64.7060502@mozilla.com> References: <4EC83F64.7060502@mozilla.com> Message-ID: <4EC8411E.5050201@mozilla.com> On 11/19/2011 03:44 PM, Brian Anderson wrote: > On 11/17/2011 10:58 PM, Haitao Li wrote: >> How about allowing an optional trailing comma appended in record? > > Thanks for this well-thought-out proposal. I am not opposed, and I do run > into this regularly. If we did this we would probably want to preserve the trailing comma in the pretty-printer. From graydon at mozilla.com Mon Nov 21 10:34:56 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Mon, 21 Nov 2011 10:34:56 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC82E3F.2050309@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <4EC79ACE.8020305@mozilla.com> <4EC80659.2090809@mozilla.com> <4EC82E3F.2050309@mozilla.com> Message-ID: <4ECA99D0.8050600@mozilla.com> On 19/11/2011 2:31 PM, David Rajchenbach-Teller wrote: > Let's switch vocabulary, then. "Report" and "Handle" are probably less > loaded with implicit meanings. Sure. Those are good terms. So long as it's clear there are a few dimensions to the issue:

- What is checked statically (if at all) for exhaustiveness.

- Whether you can recover at the reporting site or must always unwind a bit (say, to the nearest handler).

- Whether you permit unwinding at all.

- Whether unwinding, once started, is permitted to stop before it unwinds the whole task. 
- If you can stop (catch) an unwinding early, what the meaning of that is in terms of the consistency or inconsistency of the local heap.

- How handlers are located and associated (globally, by dynamic scope, by type, by static region, by task ownership, etc.)

> Can I assume that your example is actually too short to demonstrate the > meaningful difference with C++/ML-style exceptions and that the > following would also work? I think the example I gave was long enough, but I guess it depends how you look at it. The salient point is that recovery happens w/o unwinding. The dynamic scope is *searched* to find a handler, but the handler is invoked w/o any unwinding, and since the handler succeeds (executes a 'ret', returning a value to the reporting site) execution continues at the reporting site. No unwinding occurs. Unwinding is limited to failure only, which is also unrecoverable (so there's no question of "what happens to the local heap on unwinding" -- it's always thrown away). > If I understand correctly, Rust 0.x will offer: > * tasks that are lightweight enough that we can theoretically afford to > launch as often as we want just for error-handling; I'd like that to be true. It's hard to promise. > * tasks that are first-class. I'm not sure what this means in the context of this discussion. > In addition, I am nearly certain that we wish to be able to display a > different error message for, say, a failure of a module due to a > network error or a failure of the same module doing exactly the same thing > but due to a disk error. Therefore, I also understand that: > * `fail` or some variant thereof can actually report a failure reason > (presumably as a task-bound value with type `any`) We have no 'any' type yet. When we do, I'd like this to be true. 
Spawning a task is going to be more expensive than a C++ try-protected block; there's a private heap and private scheduling (failure is async relative to the execution of the parent). I can't pretend the costs are equivalent. It'll be more expensive. But the intent was for that expense to be the price paid for more-useful isolation: IME most C++ try-protected blocks fail to actually contain their error, because the heap is corrupted on the way out. Aside from the cost angle, yes, I think given an implementation of 'any' and the appropriate failure-message-conveyance on task failure, we'll have something that behaves similarly to C++ exceptions, but with task isolation -- more-proper containment. That's why I chimed in initially on this thread saying "we already have exceptions". That much is on the drawing board. IMO this thread is about whether we need something else, in addition, that *is* as cheap as C++ try-protected blocks, and possibly does finer-grained recovery (in terms of language abstractions employed). To me that's not clear yet, and I'm suggesting we err on the side of not providing it and seeing how the existing abstractions hold up. > And in that case, I have some difficulty finding out exactly which cases > remain to be covered by the library design you detail above. > > Could you provide a few examples before I answer the following? Which, the pass-flags-in approach, or reviving the older design for signals-and-handlers? 
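(To make the task-isolation point above concrete: an editorial sketch in today's Rust, with threads and panics standing in for 2011 tasks and `fail`; `contained` and the error strings are invented. The parent never catches a typed exception; it only learns that the child died, and decides whether to log, retry, or restart.)

```rust
use std::thread;

fn contained(divisor: i64) -> Result<i64, String> {
    // Wall off the risky work behind a task (thread) boundary.
    let child = thread::spawn(move || {
        if divisor == 0 {
            // The unrecoverable 'fail' of the 2011 design: it unwinds
            // the whole child, never anything less.
            panic!("fault in subsystem");
        }
        100 / divisor
    });
    // join() reports the child's panic as a not-very-typed Err;
    // the parent's heap is untouched by the child's unwinding.
    child
        .join()
        .map_err(|_| "subsystem failed; log and restart it".to_string())
}
```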
-Graydon From jws at csse.unimelb.edu.au Mon Nov 21 17:20:38 2011 From: jws at csse.unimelb.edu.au (Jeff Schultz) Date: Tue, 22 Nov 2011 12:20:38 +1100 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <4EC80659.2090809@mozilla.com> References: <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <4EC79ACE.8020305@mozilla.com> <4EC80659.2090809@mozilla.com> Message-ID: <20111122012038.GA21755@mulga.csse.unimelb.edu.au> On Sat, Nov 19, 2011 at 11:41:13AM -0800, Graydon Hoare wrote: > Yes, and I am proposing one: pass *in* your handlers (or symbolic codes > indicating handler-strategy) and have the callee handle *at the site of the > condition*. Sorry if I wasn't clear enough about the implied use of tags I > meant, up-thread. > The old Rust condition system was modeled on the condition system in Mesa. > You named "signals" as global items and gave a syntactic form to routing a > given signal to a locally-defined "handler" in the caller, much as you > would a try/catch block. The difference is that a handler in this scheme is > a typed function-like definition dangling after the protected block. It > reads like so: I may not be understanding this correctly, but it seems to have security implications by breaking encapsulation. The caller-supplied handler will have access to values at the exception site that the caller would not normally be able to see. Further, code has no way to prevent this from happening. It can define its own handlers for all known conditions to dynamically shadow any defined by its caller, but it can't anticipate new conditions defined in later versions of functions it calls. [Snip] > So I removed the condition system from the design docs, and never > implemented it. I propose structuring the libraries along these lines. 
That > is, to have the callee authors actually think a bit about what an > unusual-return means, which ways there might plausibly be for recovering > from it, and take a tag or vec-of-tags carrying the preferred recovery > strategy. > If this fails to hold together and we really, really have to revive some > kind of structured at-a-distance recovery system, I'm going to suggest > going back to the Mesa-like signal scheme I sketched out earlier (and > above). The main (substantial!) advantage it offers is that recovery paths > cause no actual unwinding-or-destruction -- recovery occurs effectively "at > the signal site" -- so there's no question of perturbation of the > typestate. Unwinding still only happens during failure. The handler is > invoked like any other function and if it succeeds the unwinder is never > even involved. IMO it's much tidier than try/throw/catch and/or > monads-by-macros. Jeff Schultz From graydon at mozilla.com Tue Nov 22 16:44:50 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 22 Nov 2011 16:44:50 -0800 Subject: [rust-dev] Exceptions without exceptions (was Re: Writing cross-platform low-level code) In-Reply-To: <20111122012038.GA21755@mulga.csse.unimelb.edu.au> References: <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <4EC79ACE.8020305@mozilla.com> <4EC80659.2090809@mozilla.com> <20111122012038.GA21755@mulga.csse.unimelb.edu.au> Message-ID: <4ECC4202.3020006@mozilla.com> On 21/11/2011 5:20 PM, Jeff Schultz wrote: > I may not be understanding this correctly, but it seems to have > security implications by breaking encapsulation. The caller-supplied > handler will have access to values at the exception site that the > caller would not normally be able to see. Couple comments on this: - Security and encapsulation are not the same concept. 
And security systems have (imo) no business in language design. It's the wrong level to try to defend against an attacker. If your threat model is "turing complete attack code mixed into my call chain", imo you've already lost. I've never seen a language that can credibly defend against an attacker running in-process with the target; the attack surface between same-process language abstractions is just way too big. - Encapsulation is a worthy goal, but encapsulation does not mean "the encapsulated part exposes zero information". It means the encapsulated part exposes a well-defined signature that is rich enough to use the subsystem but not expose pointless detail. The signature of any signals the encapsulated part might need help handling is part of the subsystem signature, the same way the set of unwinding exceptions thrown by a function is. (Like exceptions, could be "checked" or "unchecked", statically; I'd go with unchecked for a variety of other reasons, but that is not relevant to this discussion, I'd say the same for exns.) > Further, code has no way to prevent this from happening. It can > define its own handlers for all known conditions to dynamically shadow > any defined by its caller, but it can't anticipate new conditions > defined in later versions of functions it calls. I don't see how this is any different from the set of encapsulation concerns that arise with unwinding exceptions. Any sort of nonlocal handler introduces, by definition, some sort of additional "distant" coupling between the reporting site and the handling site, whether it's done by a handler-call or an unwinding. It's a detail through which extra information leaks out from the reporting site, sure; but if properly documented, this information can be seen as "part of the subsystem's invocation interface", and modeled as such (restricted to only the information necessary for recovery, say).
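Stripped of any dedicated syntax, the handler scheme reduces to passing a function in and calling it at the trouble spot; a minimal sketch with plain closures (all names invented for illustration):

```rust
// The condition the callee may raise, part of its documented signature.
enum MalformedLine {
    SkipLine,
    UseDefault(i32),
}

// The callee consults the caller-supplied handler *at the site of the
// condition* and then simply continues; no unwinding ever occurs.
fn parse_all(lines: &[&str], handler: &dyn Fn(&str) -> MalformedLine) -> Vec<i32> {
    let mut out = Vec::new();
    for line in lines {
        match line.parse::<i32>() {
            Ok(n) => out.push(n),
            Err(_) => match handler(line) {
                MalformedLine::SkipLine => {}
                MalformedLine::UseDefault(d) => out.push(d),
            },
        }
    }
    out
}

fn main() {
    // Two callers, two recovery policies, one reporting site.
    let v = parse_all(&["1", "oops", "3"], &|_| MalformedLine::UseDefault(0));
    assert_eq!(v, vec![1, 0, 3]);
    let w = parse_all(&["1", "oops", "3"], &|_| MalformedLine::SkipLine);
    assert_eq!(w, vec![1, 3]);
}
```

The handler sees only what the `MalformedLine` signature exposes, which is the "part of the subsystem's invocation interface" point above.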
-Graydon From lindsey at rockstargirl.org Tue Nov 22 17:49:55 2011 From: lindsey at rockstargirl.org (Lindsey Kuper) Date: Tue, 22 Nov 2011 20:49:55 -0500 Subject: [rust-dev] Vectorization In-Reply-To: References: Message-ID: On Tue, Oct 11, 2011 at 6:22 PM, Lindsey Kuper wrote: > Has there ever been any discussion of vectorization (i.e., taking > advantage of LLVM's vector types) in Rust? Patrick said that he'd > brought it up in passing before, but I don't think I've seen it come > up on the mailing list yet. I'm thinking about trying it out for a > class project. I'm at the "looked at the Wikipedia page some" level > of knowledge about vectorization right now, so I have a lot to learn. > Um...thoughts? So I've been reading more about vectorization. The gold standard seems to be the Allen & Kennedy vectorization algorithm (chapter 2 of http://www.amazon.com/Optimizing-Compilers-Modern-Architectures-Dependence-based/dp/1558602860 -- sadly not free online, although a bootleg PDF can be found if you look hard enough). In order to get the data dependence graph that the vectorization algorithm uses to do its magic, you need to do a data dependence analysis; how to do this is the focus of Chapter 3 (and of the rest of the book, in some sense). So one step in adding vectorization to Rust would be to add a data dependence analysis pass to the compiler. However, it looks like a data dependence analysis is happening in LLVM already. At the very least, there are two passes in the LLVM documentation (http://llvm.org/docs/Passes.html) that have the word "dependence" in their names. The "loop dependence analysis" pass (http://llvm.org/docs/Passes.html#lda) is a likely suspect. Does anyone here know if that pass (or one that uses the result of it) introduces LLVM vector instructions into previously unvectorized IR? If we want vectorization to happen in Rust, should it happen in our LLVM codegen or should it happen during an LLVM pass (or both)?
Here's some other interesting stuff I found while digging around: https://llvm.org/svn/llvm-project/poolalloc/tags/RELEASE_14/lib/DSA/Parallelize.cpp -- "This file implements a pass that automatically parallelizes a program, using the Cilk multi-threaded runtime system to execute parallel code. The pass uses the Program Dependence Graph (class PDGIterator) to identify parallelizable function calls, i.e., calls whose instances can be executed in parallel with instances of other function calls." This is the first time I've seen a use of Cilk in the wild, btw. https://llvm.org/svn/llvm-project/poolalloc/tags/RELEASE_14/lib/DSA/PgmDependenceGraph.cpp -- Here's the aforementioned Program Dependence Graph and PDGIterator. I don't know how similar or different any of this is to what Allen & Kennedy present, but it's all apparently part of something called the Automatic Pool Allocator, which presents itself as an LLVM optimization pass (https://llvm.org/svn/llvm-project/poolalloc/trunk/README) and which was the subject of a 2005 PLDI paper (http://llvm.org/pubs/2005-05-21-PLDI-PoolAlloc.pdf) that later became a chapter of Lattner's dissertation. (By the way, the paper contrasts "pools" with regions a little -- from my 30-second glance at it, the gist is something like "they're like regions, but for unsafe languages!".) I'm in far over my head now, so I guess I'm just going to go back to reading Allen & Kennedy and try to understand their algorithm for producing the dependence graph. Comments on any of this are welcome. 
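As a concrete anchor for what a vectorizer would act on: the canonical candidate is a loop with no loop-carried dependence, like the element-wise add below. Given such scalar IR, LLVM's loop vectorizer can (depending on backend and flags; this is a sketch, not a guarantee) rewrite it to operate on vector-typed values such as `<4 x float>` rather than one element at a time.

```rust
// Element-wise add: iteration i never reads what iteration i-1 wrote,
// so the iterations can be executed in parallel vector lanes.
fn add_slices(a: &[f32], b: &[f32], out: &mut [f32]) {
    assert!(a.len() == b.len() && a.len() == out.len());
    for i in 0..a.len() {
        out[i] = a[i] + b[i];
    }
}

fn main() {
    let a = [1.0f32, 2.0, 3.0, 4.0];
    let b = [10.0f32, 20.0, 30.0, 40.0];
    let mut out = [0.0f32; 4];
    add_slices(&a, &b, &mut out);
    assert_eq!(out, [11.0, 22.0, 33.0, 44.0]);
}
```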
Lindsey From dteller at mozilla.com Fri Nov 25 07:29:59 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Fri, 25 Nov 2011 16:29:59 +0100 Subject: [rust-dev] Exceptions without exceptions In-Reply-To: <4ECA99D0.8050600@mozilla.com> References: <4EC0C8F7.80706@mozilla.com> <4EC17FB4.40108@mozilla.com> <4EC2176A.5070106@mozilla.com> <4EC2C528.3030507@mozilla.com> <4EC409E5.6000904@mozilla.com> <34962223-F3ED-458F-98AE-7BA1F4AFD7FB@alum.mit.edu> <4EC535FC.70408@mozilla.com> <4EC6B5C8.8010906@mozilla.com> <4EC79ACE.8020305@mozilla.com> <4EC80659.2090809@mozilla.com> <4EC82E3F.2050309@mozilla.com> <4ECA99D0.8050600@mozilla.com> Message-ID: <4ECFB477.2040005@mozilla.com> On 11/21/11 7:34 PM, Graydon Hoare wrote: > On 19/11/2011 2:31 PM, David Rajchenbach-Teller wrote: > >> Let's switch vocabulary, then. "Report" and "Handle" are probably less >> loaded with implicit meanings. > > Sure. Those are good terms. So long as it's clear there are a few > dimensions to the issue: > > - What is checked statically (if at all) for exhaustiveness. > > - Whether you can recover at the reporting site or must always > unwind a bit (say, to the nearest handler). > > - Whether you permit unwinding at all. > > - Whether unwinding, once started, is permitted to stop before it > unwinds the whole task. > > - If you can stop (catch) an unwinding early, what the meaning of > that is in terms of the consistency or inconsistency of the local > heap. > > - How handlers are located and associated (globally, by dynamic scope, > by type, by static region, by task ownership, etc.) Absolutely. >> Can I assume that your example is actually too short to demonstrate the >> meaningful difference with C++/ML-style exceptions and that the >> following would also work? > > I think the example I gave was long enough, but I guess it depends how > you look at it. The salient point is that recovery happens w/o > unwinding. 
The dynamic scope is *searched* to find a > handler, but the > handler is invoked w/o any unwinding, and since the handler succeeds > (executes a 'ret', returning a value to the reporting site) execution > continues at the reporting site. No unwinding occurs. Ah, I had missed the dynamic scoping, thanks for pointing this out. > Spawning a task is going to be more expensive than a C++ try-protected > block; there's a private heap and private scheduling (failure is async > relative to the execution of the parent). > > I can't pretend the costs are equivalent. It'll be more expensive. But > the intent was for that expense to be the price paid for more-useful > isolation: IME most C++ try-protected blocks fail to actually contain > their error, because the heap is corrupted on the way out. Really? I was not aware of this. How is it corrupted? Do I understand that, in a try-catch scenario, the main interest of using a task instead of the C++ model is to protect the heap? > Aside from the cost angle, yes, I think given an implementation of 'any' > and the appropriate failure-message-conveyance on task failure, we'll > have something that behaves similarly to C++ exceptions, but with task > isolation -- more-proper containment. That's why I chimed in initially > on this thread saying "we already have exceptions". That much is on the > drawing board. > > IMO this thread is about whether we need something else, in addition, > that *is* as cheap as C++ try-protected blocks, and possibly does > finer-grained recovery (in terms of language abstractions employed). To > me that's not clear yet, and I'm suggesting we err on the side of not > providing it and seeing how the existing abstractions hold up. Well, the initial thread was about writing a nice I/O library and, in particular, how Reporting should be designed. I admit that, for the moment, I do not see any good way of using either pass-flags-in or signals-and-handlers in any meaningful manner in that specific scenario.
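For concreteness, the pass-flags-in style being weighed here amounts to roughly the following: the caller picks a recovery strategy from a documented set up front, and the callee applies it at the point of trouble (all names below are invented for illustration):

```rust
// Recovery strategies the callee documents and accepts.
#[derive(Clone, Copy)]
enum OnMissing {
    Fail,            // report an error to the caller
    Retry,           // try the lookup once more
    Substitute(i32), // proceed with a caller-chosen default
}

fn lookup(table: &[(i32, i32)], key: i32, flag: OnMissing) -> Result<i32, String> {
    for _attempt in 0..2 {
        if let Some(&(_, v)) = table.iter().find(|&&(k, _)| k == key) {
            return Ok(v);
        }
        // The recovery decision is applied here, at the trouble spot.
        match flag {
            OnMissing::Retry => continue,
            OnMissing::Substitute(d) => return Ok(d),
            OnMissing::Fail => return Err(format!("no key {}", key)),
        }
    }
    Err(format!("no key {} after retries", key))
}

fn main() {
    let t = [(1, 10), (2, 20)];
    assert_eq!(lookup(&t, 2, OnMissing::Fail), Ok(20));
    assert_eq!(lookup(&t, 9, OnMissing::Substitute(-1)), Ok(-1));
    assert!(lookup(&t, 9, OnMissing::Fail).is_err());
}
```

The "purely local" worry above is visible even in this toy: choosing usefully between `Retry` and `Substitute` requires knowing quite a bit about what the callee does internally.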
> >> And in that case, I have some difficulty finding out exactly which cases >> remain to be covered by the library design you detail above. >> >> Could you provide a few examples before I answer the following? > > Which, the pass-flags-in approach, or reviving the older design for > signals-and-handlers? I was thinking about pass-flags-in, but actually both would be nice. I have toyed with both on paper and, at the moment, my personal impression is that neither mechanism is particularly useful. Both seem to allow recovery only from the most trivial mistakes, and both seem to require so much outside knowledge of the internals of the function that the only uses I can see are purely local, in wrapper functions co-developed with the function that may Report. I suspect that you have a different impression/experience, so please do not hesitate to convince me :) Thanks, David From marijnh at gmail.com Fri Nov 25 08:06:52 2011 From: marijnh at gmail.com (Marijn Haverbeke) Date: Fri, 25 Nov 2011 17:06:52 +0100 Subject: [rust-dev] Interface / typeclass proposal Message-ID: Niko proposed categories [1] two weeks ago. I'm happy that we're looking in this direction. Niko's proposal makes interfaces structural. I'm going to argue that nominal interfaces have advantages both in programmer ergonomics and in compilation-model complexity. [1]: https://mail.mozilla.org/pipermail/rust-dev/2011-November/000941.html I'll be sticking rather closely to Haskell's type class system, which has proven itself in practice. If you aren't already enthusiastic about Haskell's type classes, I recommend watching Simon Peyton Jones' talk about them [2]. He goes over the way type classes can be implemented, and shows a number of really cool applications.
[2]: http://channel9.msdn.com/posts/MDCC-TechTalk-Classes-Jim-but-not-as-we-know-them (try to skip the first 3 minutes, they might spoil your appetite)

Context: I'd like to implement some minimum viable dynamic dispatch system and get rid of our `obj` implementation before the first public release. Something that can be extended later with classes and traits, but for now just allows us to define vtables that accompany types.

To recap, Niko's categories look like this:

    category vec_seq([T]) {
        fn len() -> uint { vec::len(self) }
        fn iter(f: block(T)) { for elt in self { f(elt); } }
    }

Having that, you can do

    import vec::vec_seq;
    assert [1].len() == 1u;
    [1, 2, 3].iter({|i| log i; })

Which is statically resolved. Dynamic dispatch (if I understand correctly) would look like this:

    type iterable = iface { fn iter(block(T)); };
    fn append_to_vec(x: iterable, y: iterable) -> [T] {
        let r = [];
        x.iter {|e| r += [e];}
        y.iter {|e| r += [e];}
        r
    }
    // Assuming there exists a category for lists that implements iter
    append_to_vec([1, 2, 3] as iterable, list::cons(4, list::nil) as iterable)

That causes the compiler to create two vtables, both containing an `iter` method, and wrap the arguments in {vtable, @value} structures when they are cast to `iterable` (they'll probably have to be boxed to make sure the size of such a value is uniform, and cleanup is straightforward).

Alternatively, my proposal looks like this: interfaces could be fixed groups of methods that are always implemented all at once.

    // Define an interface called `seq`
    interface seq {
        fn len() -> uint;
        fn iter(f: block(T));
    }
    // Declare [T] to be an instance of seq
    instance [T] seq {
        fn len() -> uint { vec::len(self) }
        fn iter(f: block(T)) { for elt in self { f(elt); } }
    }

The static way to use this would look the same as above. If you've imported `vec::seq` (std::vec's implementation of seq), you can simply say [1].len().
If there is any instance in scope that applies to type [int] and has a method `len`, that instance's implementation is called. If multiple instances are found, the one in the closest scope is chosen. If they are in the same scope, or no instance is found, you get a compiler error.

Dynamic dispatch works differently.

    // Declare T to be an instance of the seq interface
    fn total_len<T: seq>(seqs: [T]) -> uint {
        let count = 0u;
        for s in seqs { count += s.len(); }
        count
    }

In this proposal, the seq vtable is not something that gets attached to the value by casting it to an interface, but rather acts as an implicit parameter to the function. The cool thing is that we already have these implicit parameters -- they map very closely to our type descriptors, which we are implicitly passing to generics. What would happen, for such a call, is that the compiler notices that the type parameter has an interface bound, so that instead of passing just a tydesc, it passes both a tydesc and the `seq` vtable that belongs to that type. Inside the function, `s` is known to be of type `T:seq`, so the `s.len` call looks up the `len` method in the vtable passed for type parameter T. You can also require a type parameter to conform to multiple interfaces, as in `fn foo(...)` -- that requires passing multiple vtables. (Niko: this is the thing you asked about. Turns out it's not hard to do.)

It should be noted that this has both advantages and disadvantages compared to the 'wrap by casting to interface' approach. For one thing, it doesn't allow this

    fn draw_all<T: drawable>(shapes: [T]) { for s in shapes { s.draw(); } }

.. or at least, it doesn't do what you want, because it requires all arguments to be of the same type, and only passes a single vtable. An extension can be implemented (at a later time) to support this instead:

    fn draw_all(shapes: [drawable]) { ...
    }
    draw_all([my_circle as drawable, my_rectangle as drawable]);

The drawable interface, when used as a type instead of a type parameter bound, would denote a boxed value paired with a vtable, just like in Niko's proposal. And the good part: In the case where the interface is used as a type parameter bound, which should cover most use cases, things do not have to be boxed to be handled generically, and the content of regular data structures (such as `[int]`) can be approached generically. This is fast, and it allows type classes to be applied all over the place... Everything that's currently an obj could become an instance. We'd get static, super-fast dispatch when using them monomorphically, and be able to decide on our own representation (rather than being locked into a boxed representation, as objs are) for the values. Being able to define methods on built-in types means that many things wouldn't require defining a new representation at all. The operations that are currently magically implemented by the compiler and runtime, such as comparing and logging, could be methods on interfaces (see Haskell's Eq, Cmp, and Show type classes). That'd make them overridable and extendable. With a Sufficiently Smart Inliner, we could even do arithmetic with methods, and get operator overloading for free (see Haskell's Num type class), and allow things like generic implementations of sum, average, and similar numeric operations over sequences of `T: num` type. Our type parameter bounds, `copy` and `send`, could become interfaces, with suitable implementations in the standard library. Copying of generic types would be forbidden, but `copy` would be a method of the `copy` typeclass, so you could say `copy(elt)` if you declared your type parameter with a `: copy` bound. Implementing a system like this in its simple form is not terribly hard, especially since we already have tydescs implemented.
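The interface/instance pairing described here can be sketched in a trait-style notation (the syntax below is a later evolution of this very proposal, shown only for concreteness; method names are illustrative), covering both the bound-based, dictionary-passing form and the boxed, cast-to-interface form:

```rust
// The `seq` interface as a trait, with vectors as one instance.
trait Seq {
    fn length(&self) -> usize;
    fn each(&self, f: &mut dyn FnMut(i32));
}

impl Seq for Vec<i32> {
    fn length(&self) -> usize {
        self.len()
    }
    fn each(&self, f: &mut dyn FnMut(i32)) {
        for &e in self {
            f(e);
        }
    }
}

// Bound form: the vtable rides along with the type parameter,
// like the tydesc-plus-vtable passing sketched above. No boxing.
fn total_len<T: Seq>(seqs: &[T]) -> usize {
    seqs.iter().map(|s| s.length()).sum()
}

// Boxed form: vtable attached to the value, like `x as iterable`.
fn sum_boxed(seqs: &[&dyn Seq]) -> i32 {
    let mut total = 0;
    for s in seqs {
        s.each(&mut |e| total += e);
    }
    total
}

fn main() {
    let a = vec![1, 2, 3];
    let b = vec![4, 5];
    assert_eq!(total_len(&[a.clone(), b.clone()]), 5);
    assert_eq!(sum_boxed(&[&a as &dyn Seq, &b as &dyn Seq]), 15);
}
```

Note how `total_len` keeps every element unboxed and monomorphic, while `sum_boxed` pays for boxing but may mix instance types -- exactly the trade-off between the two forms discussed above.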
Making it as powerful as Haskell's requires some extensions (most importantly: default implementations of methods, and interfaces that are parameterized with other interfaces), but those can be tackled piecemeal. There's even a credible path towards multiple dispatch (methods dispatching on the type of more than one argument), though it requires a more complicated interface-dispatching mechanism. I'd like to spend next week implementing this. Comments, additions, and violently disagreeing flames are welcome. Cheers, Marijn From lindsey at rockstargirl.org Sun Nov 27 14:32:33 2011 From: lindsey at rockstargirl.org (Lindsey Kuper) Date: Sun, 27 Nov 2011 17:32:33 -0500 Subject: [rust-dev] Vectorization In-Reply-To: References: Message-ID: On Tue, Nov 22, 2011 at 8:49 PM, Lindsey Kuper wrote: > So I've been reading more about vectorization. Thanks to Kevin Cantú and Manuel Chakravarty for pointing it out on Twitter, I'm now aware that there's a *lot* of information on the GHC wiki about the ongoing effort to use LLVM vector types for SIMD support in (the LLVM backend of) GHC. Recommended reading for those interested in this stuff. http://hackage.haskell.org/trac/ghc/wiki/SIMD http://hackage.haskell.org/trac/ghc/wiki/SIMDPlan Lindsey From dherman at mozilla.com Sun Nov 27 17:27:53 2011 From: dherman at mozilla.com (David Herman) Date: Sun, 27 Nov 2011 17:27:53 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: References: Message-ID: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> On Nov 25, 2011, at 8:06 AM, Marijn Haverbeke wrote: > Niko proposed categories [1] two weeks ago. I'm happy that we're > looking in this direction. Niko's proposal makes interfaces > structural. I'm going to argue that nominal interfaces have advantages > both in programmer ergonomics and in compilation-model complexity. The sense I got was that most everyone prefers a move towards almost everything being nominal.
Niko's proposal does not depend on being structural; IIRC he wrote it that way primarily to stay closer to the current object system. > I'll be sticking rather closely to Haskell's type class system, which > has proven itself in practice. Well, it's not without its issues. Some of the problems relate to inference and are therefore not an issue for us, but others are more serious. The most important issue I see is that instance declarations are anti-modular: if multiple modules have conflicting instance declarations, the program can't be compiled. Anything we do in this space should address this issue. > Context: I'd like to implement some minimum viable dynamic dispatch > system and get rid of our `obj` implementation before the first public > release. Something that can be extended later with classes and traits, > but for now just allows us to define vtables that accompany types. If there's internal work needed for vtables, it makes sense to work on that. But we don't need to hurry to get new user-facing features in for the first release. > Alternatively, my proposal looks like this: interfaces could be fixed > groups of methods, that are always implemented all at once. > > // Define an interface called `seq` > interface seq { > fn len() -> uint; > fn iter(f: block(T)); > } > // Declare [T] to be an instance of seq > instance [T] seq { > fn len() -> uint { vec::len(self) } > fn iter(f: block(T)) { for elt in self { f(elt); } } > } > > The static way to use this would look the same as above. If you've > imported `vec::seq` (std::vec's implementation of seq), you can simply > say [1].len(). If there is any instance in scope that applies to type > [int] and has a method `len`, that instance's implementation is > called. What would you do with multiple instance declarations that differ by specificity? For example, one declared on [int] and one declared on [T]. > If multiple interfaces are found, the one in the closes scope > is chosen. 
If they are in the same scope, or no interface is found, > you get a compiler error. Conflict resolution is often the place where the usability bugs show up in systems with overloading. You want (a) to avoid having automatic resolution rules that are too strict, but (b) to avoid having automatic resolution rules that are too subtle, and then (c) to make sure there's some sort of lightweight syntax for explicitly resolving ambiguous/failed resolutions. > Dynamic dispatch works differently. > > // Declare T to be an instance of the seq interface > fn total_len<T: seq>(seqs: [T]) -> uint { > let count = 0u; > for s in seqs { count += s.len(); } > count > } > > In this proposal, the seq vtable is not something that gets attached to > the value by casting it to an interface, but rather acts as an > implicit parameter to the function. The cool thing is that we already > have these implicit parameters -- they map very closely to our type > descriptors, which we are implicitly passing to generics. Haven't we been talking about eliminating type descriptors, in particular for monomorphization? I haven't been close enough to that topic to fully understand the trade-offs, but I know that Patrick believes we will need to monomorphize. If we do, I would imagine one approach is simply to hard-wire each specialized copy of a function to specific vtables. Another is to say that we don't specialize bounded type parameters, but that's strange and asymmetric. This comes down to the different implementation approaches to polymorphism. Monomorphizing is rare in the ML/Haskell tradition, so type classes with dictionary-passing is a nice fit. But it's not as clear what happens to type classes when you introduce specialization. > It should be noted that this has both advantages and disadvantages > compared to the 'wrap by casting to interface approach'. For one > thing, it doesn't allow this > > fn draw_all<T: drawable>(shapes: [T]) { for s in shapes { s.draw(); } } > > ..
or at least, it doesn't do what you want, because it requires all > arguments to be of the same type, and only passes a single vtable. An > extension can be implemented (at a later time), to support this > instead: > > fn draw_all(shapes: [drawable]) { ... } > draw_all([my_circle as drawable, my_rectangle as drawable]); This corresponds to where the implicit "forall" occurs in the type: forall drawable T.([T] -> unit) [forall drawable T] -> unit GHC allows both, but the second one requires an explicit forall. > With a Sufficiently Smart Inliner, we could even do arithmetic with > methods, and get operator overloading for free (see Haskell's Num type > class), and allow things like generic implementations of sum, average, > and similar numeric operations over sequences of `T: num` type. Overloaded arithmetic is definitely a pleasant aspect of Haskell's type classes. But they do exploit type inference pretty heavily. For example, literals have polymorphic types, and expressions can be given different types based on the expected type of their context. Have you thought about how this would look in our system, where inference is more limited? > Our type parameter bounds, `copy` and `send`, could become interfaces, > with suitable implementations in the standard library. Copying of > generic types would be forbidden, but `copy` would be a method of the > `copy` typeclass, so you could say `copy(elt)` if you declared your > type parameter with a `: copy` bound. If this worked, it might be the biggest win of type classes. Specifically: if we can squeeze all notions of type bounds in the entire language to just one concept (type classes), then that would be a big win for conceptual simplicity. I'm not sure yet I understand how this works. For example, how do we prevent programmers from declaring some things to be instances of `send` that aren't supposed to be? 
> Implementing a system like this in its simple form is not terribly > hard, especially since we already have tydescs implemented. Making it > as powerful as Haskell's requires some extensions (most importantly: > default implementations of methods, and interfaces that are > parameterized with other interfaces), but those can be tackled > piecemeal. There's even a credible path towards multiple dispatch > (methods dispatching on the type of more than one argument), though it > requires a more complicated interface-dispatching mechanism. It would help if you could flesh this out more in a wiki proposal. > I'd like to spend next week implementing this. Is that a good way to proceed? When we have new proposals, we need to get the group on board before jumping in with both feet. A full proposal needs more than an overview email, and it needs to be fleshed out, integrated into the rest of the design, etc. We should discuss how this does or doesn't fit in with objects, classes and traits, what consequences monomorphization would have on it or vice versa, etc etc. And IINM, there are still blockers that are well-established and fundamental, like stack growth and x64, no? Dave From niko at alum.mit.edu Sun Nov 27 22:13:19 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Sun, 27 Nov 2011 22:13:19 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: References: Message-ID: <4ED3267F.9050204@alum.mit.edu> Hi Marijn, Sorry for the late reply. Overall, I am a fan of this idea. In the previous thread the idea of bounded polymorphism came up and it's been rattling around in my head ever since: I'm glad to see it be sketched out in more detail. To my mind, this need not conflict with the previous proposals: basically it can be seen as a pure addition that allows one to avoid constructing interface instances but still make use of dynamic dispatch. Some comments: - I don't see why every category must declare an associated interface. 
I like the idea of *allowing* categories to declare associated interfaces, as it will lead to better error reporting, but I think *requiring* it is too much. It implies that the main purpose of a category is dynamic dispatch and I am not sure that will be the case. - I also like the idea that an interface can be created after the fact and make use of methods declared in categories or elsewhere that are not aware of the new interface. This is not possible in your proposal unless you declare a wrapper "instance." Maybe this is not a big deal in practice, or even a good thing, as you avoid the potential problem of the compiler assembling an interface out of various unrelated methods that happen to have the right names. - I think the term instance ought to be reserved for runtime values: most languages that aren't Haskell use instance to refer to an instance of a type, such as an object etc. I'm not sure why Haskell felt the need to change that convention but I think it's confusing and there is no need for us to follow suit. Therefore, I will keep calling instances categories :). - I really like the idea of making copy, equality, and other such things interface functions (not sure whether send will work or not). I am a bit concerned about the definition of comparison and equality because they really want to have two receiver types. I guess that one could do:

    interface ord<T> { fn lt(other: T) -> bool; }

and now you can write a sort function:

    fn sort<S: ord<S>>(v: [S]) { ... }

assuming F-bounded polymorphism, where the bounds of the type variable S can reference S. We have to make sure we get this right of course. - It also raises some questions about static overloading. Assuming that there are multiple ord<> categories declared for the type of the variable `x`, what happens when `x.lt(y)` is invoked? Under my proposal, I suppose, each category would have a unique name, and an explicit category name would be required: `x.foo::lt(y)`.
The more Java-like solution would be to decide which `lt()` is invoked based on the type of `y`, but this complicates type inference and is confusing to boot (static overloading vs dynamic dispatch). Niko On 11/25/11 8:06 AM, Marijn Haverbeke wrote: > Niko proposed categories [1] two weeks ago. I'm happy that we're > looking in this direction. Niko's proposal makes interfaces > structural. I'm going to argue that nominal interfaces have advantages > both in programmer ergonomics and in compilation-model complexity. > > [1]:https://mail.mozilla.org/pipermail/rust-dev/2011-November/000941.html > > I'll be sticking rather closely to Haskell's type class system, which > has proven itself in practice. If you aren't already enthusiastic > about Haskell's type classes, I recommend watching Simon Peyton Jones' > talk about them [2]. He goes over the way type classes can be > implemented, and shows a number of really cool applications. > > [2]:http://channel9.msdn.com/posts/MDCC-TechTalk-Classes-Jim-but-not-as-we-know-them > (try to skip the first 3 minutes, they might spoil your appetite) > > Context: I'd like to implement some minimum viable dynamic dispatch > system and get rid of our `obj` implementation before the first public > release. Something that can be extended later with classes and traits, > but for now just allows us to define vtables that accompany types. > > To recap, Niko's categories look like this: > > category vec_seq([T]) { > fn len() -> uint { vec::len(self) } > fn iter(f: block(T)) { for elt in self { f(elt); } } > } > > Having that, you can do > > import vec::vec_seq; > assert [1].len() == 1u; > [1, 2, 3].iter({|i| log i; }) > > Which is statically resolved. 
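[An anachronistic aside for modern readers: the statically resolved category above is very close to what eventually shipped in Rust as a trait plus an impl on `Vec<T>`. The sketch below uses today's syntax, not the 2011 syntax; the method names `len2`/`iter2` are illustrative renames to avoid clashing with the inherent `Vec::len`.]

```rust
// Hypothetical modern-Rust rendering of the `vec_seq` category.
trait VecSeq<T> {
    fn len2(&self) -> usize;
    fn iter2(&self, f: &mut dyn FnMut(&T));
}

impl<T> VecSeq<T> for Vec<T> {
    fn len2(&self) -> usize { self.len() }
    fn iter2(&self, f: &mut dyn FnMut(&T)) {
        for elt in self { f(elt); }
    }
}

fn main() {
    // Like `import vec::vec_seq; assert [1].len() == 1u;` above:
    // bringing the trait into scope makes the methods callable,
    // and the calls resolve statically.
    let v = vec![1, 2, 3];
    assert_eq!(v.len2(), 3);
    let mut sum = 0;
    v.iter2(&mut |i| sum += *i);
    assert_eq!(sum, 6);
}
```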
Dynamic dispatch (if I understand > correctly), would look like this: > > type iterable<T> = iface { fn iter(block(T)); }; > fn append_to_vec<T>(x: iterable<T>, y: iterable<T>) -> [T] { > let r = []; > x.iter {|e| r += [e];} > y.iter {|e| r += [e];} > r > } > // Assuming there exists a category for lists that implements iter > append_to_vec([1, 2, 3] as iterable, list::cons(4, list::nil) as iterable) > > That causes the compiler to create two vtables, both containing an > `iter` method, and wrap the arguments in {vtable, @value} structures > when they are cast to `iterable` (they'll probably have to be boxed to > make sure the size of such a value is uniform, and cleanup is > straightforward). > > Alternatively, my proposal looks like this: interfaces could be fixed > groups of methods that are always implemented all at once. > > // Define an interface called `seq` > interface seq<T> { > fn len() -> uint; > fn iter(f: block(T)); > } > // Declare [T] to be an instance of seq > instance [T] seq<T> { > fn len() -> uint { vec::len(self) } > fn iter(f: block(T)) { for elt in self { f(elt); } } > } > > The static way to use this would look the same as above. If you've > imported `vec::seq` (std::vec's implementation of seq), you can simply > say [1].len(). If there is any instance in scope that applies to type > [int] and has a method `len`, that instance's implementation is > called. If multiple interfaces are found, the one in the closest scope > is chosen. If they are in the same scope, or no interface is found, > you get a compiler error. > > Dynamic dispatch works differently. > > // Declare T to be an instance of the seq interface > fn total_len<T: seq>(seqs: [T]) -> uint { > let cnt = 0u; > for s in seqs { cnt += s.len(); } > cnt > } > > In this proposal, the seq vtable is not something that gets attached to > the value by casting it to an interface, but rather acts as an > implicit parameter to the function. 
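[A modern-Rust rendering of the `total_len` example just quoted — anachronistic, with illustrative trait and method names. The trait bound plays exactly the role of the implicitly passed vtable described above, although today's compiler monomorphizes the generic rather than passing a dictionary at runtime:]

```rust
trait Seq {
    fn length(&self) -> usize;
}

impl<T> Seq for Vec<T> {
    fn length(&self) -> usize { self.len() }
}

// The 2011 sketch `fn total_len<T: seq>(seqs: [T]) -> uint`,
// in later syntax. The bound `T: Seq` is what the proposal
// models as an implicit vtable parameter.
fn total_len<T: Seq>(seqs: &[T]) -> usize {
    let mut cnt = 0;
    for s in seqs {
        cnt += s.length();
    }
    cnt
}

fn main() {
    let seqs = vec![vec![1, 2], vec![3, 4, 5]];
    assert_eq!(total_len(&seqs), 5);
}
```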
The cool thing is that we already > have these implicit parameters -- they map very closely to our type > descriptors, which we are implicitly passing to generics. What would > happen, for such a call, is that the compiler notices that the type > parameter has an interface bound, so that instead of passing just a > tydesc, it passes both a tydesc and the `seq` vtable that belongs to > that type. Inside the function, `s` is known to be of type `T:seq`, so > the `s.len` call looks up the `len` method in the vtable passed for > type parameter T. > > You can also require a type parameter to conform to multiple interfaces, > as in `fn foo(...)` -- that requires passing multiple > vtables. (Niko: this is the thing you asked about. Turns out it's not > hard to do.) > > It should be noted that this has both advantages and disadvantages > compared to the 'wrap by casting to interface' approach. For one > thing, it doesn't allow this > > fn draw_all<T: drawable>(shapes: [T]) { for s in shapes { s.draw(); } } > > .. or at least, it doesn't do what you want, because it requires all > arguments to be of the same type, and only passes a single vtable. An > extension can be implemented (at a later time), to support this > instead: > > fn draw_all(shapes: [drawable]) { ... } > draw_all([my_circle as drawable, my_rectangle as drawable]); > > The drawable interface, when used as a type instead of a type > parameter bound, would denote a boxed value paired with a vtable, just > like in Niko's proposal. > > And the good part: In the case where the interface is used as a type > parameter bound, which should cover most use cases, things do not have > to be boxed to be handled generically, and the content of regular data > structures (such as `[int]`) can be approached generically. This is > fast, and it allows type classes to be applied all over the place... > > Everything that's currently an obj could become an instance. 
We'd get > static, super-fast dispatch when using them monomorphically, and be > able to decide on our own representation (rather than being locked > into a boxed representation, as objs are) for the values. Being able > to define methods on built-in types means that many things wouldn't > require defining a new representation at all. > > The operations that are currently magically implemented by the > compiler and runtime, such as comparing and logging, could be methods > on interfaces (see Haskell's Eq, Cmp, and Show type classes). That'd > make them overridable and extendable. > > With a Sufficiently Smart Inliner, we could even do arithmetic with > methods, and get operator overloading for free (see Haskell's Num type > class), and allow things like generic implementations of sum, average, > and similar numeric operations over sequences of `T: num` type. > > Our type parameter bounds, `copy` and `send`, could become interfaces, > with suitable implementations in the standard library. Copying of > generic types would be forbidden, but `copy` would be a method of the > `copy` typeclass, so you could say `copy(elt)` if you declared your > type parameter with a `: copy` bound. > > Implementing a system like this in its simple form is not terribly > hard, especially since we already have tydescs implemented. Making it > as powerful as Haskell's requires some extensions (most importantly: > default implementations of methods, and interfaces that are > parameterized with other interfaces), but those can be tackled > piecemeal. There's even a credible path towards multiple dispatch > (methods dispatching on the type of more than one argument), though it > requires a more complicated interface-dispatching mechanism. > > > I'd like to spend next week implementing this. Comments, additions, > and violently disagreeing flames are welcome. 
> > Cheers, > Marijn > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > From marijnh at gmail.com Mon Nov 28 00:16:07 2011 From: marijnh at gmail.com (Marijn Haverbeke) Date: Mon, 28 Nov 2011 09:16:07 +0100 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> Message-ID: Hi David, > What would you do with multiple instance declarations that differ by specificity? For example, one declared on [int] and one declared on [T]. Eventually, we'll have to specify a scoring scheme. The closest import always wins, so you can disambiguate explicitly. If you have multiple interface implementations imported at the same level, I'd make that an error in the first version of this system, but I agree that eventually we'll want specificity rules. > This comes down to the different implementation approaches to polymorphism. Monomorphizing is rare in the ML/Haskell tradition, so type classes with dictionary-passing is a nice fit. But it's not as clear what happens to type classes when you introduce specialization. Monomorphizing generics that take interfaces sounds wrong. It is probably a win in some situations, and heuristics can be found to detect such situations, but I doubt we'll want to do it in general. (What happens seems quite clear though -- you compile a version of the function with the vtable known at compile-time, so all dispatch becomes static.) > Overloaded arithmetic is definitely a pleasant aspect of Haskell's type classes. But they do exploit type inference pretty heavily. For example, literals have polymorphic types, and expressions can be given different types based on the expected type of their context. Have you thought about how this would look in our system, where inference is more limited? 
If we stick to our current approach (literals have unambiguous types, binops always act on values of the same type), the only new subtlety is that the type checker has to figure out the return type of a method. I think this can be done early enough, but I'm not deeply familiar with our type checker, so I'd have to try and see. > For example, how do we prevent programmers from declaring some things to be instances of `send` that aren't supposed to be? For `copy`, this is easy -- your `copy` method does a copy, and the compiler has the full type of the arguments when compiling the method body, so it checks, and complains when you implement a copy method on an invalid type. We'll have to introduce some explicit way of checking whether a type is sendable to do the same for `send`. We'll probably have to wire in automatic 'derived' instances/categories for this to be pleasant though, or you'll be forced to implement your own instances for every tag you declare. Or maybe `copy` and `send` will just look and behave like interfaces on the surface, and in fact refer to some built-in magic interface that the compiler handles specially. > And IINM, there are still blockers that are well-established and fundamental, like stack growth and x64, no? Right, but those have very competent people working on them already (and are progressing really well). Actually, I agree that this isn't something that has any bearing on a first release. Coming up with new neat requirements at the last moment is a sure way to make that release be delayed. But at least having a proposal to point people to when they ask about support for dynamic dispatch would be a plus. (And yes, this will end up on the wiki, but I find that initial discussion is better done in the mailing list.) 
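[An anachronistic aside: the "built-in magic interface" idea floated above is roughly where the design landed years later — in modern Rust, `Send` is a compiler-implemented marker trait inferred structurally, and `Copy` is opted into per type, with both usable as ordinary bounds. A sketch in today's syntax; the helper `assert_sendable` is an illustrative name, not a standard API:]

```rust
// `Copy` is derived per type; `Send` is inferred by the compiler.
#[derive(Clone, Copy, PartialEq, Debug)]
struct Point { x: i32, y: i32 }

// A generic function whose bound is the `copy` "interface".
fn duplicate<T: Copy>(v: T) -> (T, T) { (v, v) }

// A compile-time-only check that a type satisfies `Send`.
fn assert_sendable<T: Send>(_: &T) {}

fn main() {
    let p = Point { x: 1, y: 2 };
    let (a, b) = duplicate(p);
    assert_eq!(a, b);
    assert_sendable(&p); // checked entirely at compile time
}
```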
From marijnh at gmail.com Mon Nov 28 00:38:36 2011 From: marijnh at gmail.com (Marijn Haverbeke) Date: Mon, 28 Nov 2011 09:38:36 +0100 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <4ED3267F.9050204@alum.mit.edu> References: <4ED3267F.9050204@alum.mit.edu> Message-ID: Hi Niko, Actually, after I sent this off I realized "... but I didn't really make any good points in favour of nominal interfaces at all". I prefer them because they enforce a much more structured way of thinking. Declaring grab-bags of methods and assembling them into interfaces ad-hoc seems clumsy and, as you mention, error prone --- just because a method has a given name, doesn't mean it actually does the same thing as other methods by that name. Explicitly declaring what interface you're implementing would remove this vagueness. It also makes generating and reusing vtables easier -- most of them would only have to be generated once, and can be exported from the module that declares the category. Once we have categories / instances that depend on another type conforming to an interface, those'll have to be parameterized by the other type's vtable, so things get more tricky (much like our current dynamic tydescs). > - I don't see why every category must declare an associated interface. I > like the idea of *allowing* categories to declare associated interfaces, as > it will lead to better error reporting, but I think *requiring* it is too > much. It implies that the main purpose of a category is dynamic dispatch > and I am not sure that will be the case. I don't really see explicitly declared interfaces as having a lot, specifically, to do with dynamic dispatch. I guess you're right that ad-hoc category declarations can be nice sometimes, especially since they allow post-hoc interop with code that wasn't written with any kind of interface in mind. But the latter can already be done by, once you've defined your interface, simply declaring an instance that calls the existing methods. 
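[The "instance that calls the existing methods" move is indeed cheap in practice. A modern-Rust sketch — trait name illustrative — of an interface defined long after the fact, whose impl just forwards to a method the type already had:]

```rust
// A new interface, defined well after `String` existed,
// in a module the `String` authors never saw.
trait Measurable {
    fn size(&self) -> usize;
}

// The trivial adapter impl: forward to the pre-existing method.
impl Measurable for String {
    fn size(&self) -> usize { self.len() }
}

fn main() {
    let s = String::from("hello");
    assert_eq!(s.size(), 5);
}
```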
Would you be satisfied with ad-hoc, interface-less methods that can only be called statically? > - I think the term instance ought to be reserved for runtime values You're probably right. But I'm not sure 'category' is much more helpful here. Maybe 'implementation'? (`impl`) > I am a bit > concerned about the definition of comparison and equality because they > really want to have two receiver types. Our == requires both operands to be of the same type. It seems all sane equality implementations have that property. However, there are two receiver values, and we definitely don't want to end up writing foo.==(bar) or something awful like that. There are several ways to get around this by doing some surface-syntax-shuffling. But I think eventually we'll want to support both oo-method-style interfaces (written val.method(), used when there is clearly a single receiver), and functional-style interfaces (written method(arg, arg2), used when this is not the case). The resolution for the second type is actually easier to do. x == y could desugar to std::cmp::eq(x, y), or we could define some correspondence between infix ops and function names (which would remove the need for things like std::int::add and std::str::eq) the way Haskell does it. Best, Marijn From dherman at mozilla.com Mon Nov 28 23:02:26 2011 From: dherman at mozilla.com (David Herman) Date: Mon, 28 Nov 2011 23:02:26 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> Message-ID: I think we're beginning to converge; in many ways there's not a ton of difference between the "categories" proposal and typeclasses. >> What would you do with multiple instance declarations that differ by specificity? For example, one declared on [int] and one declared on [T]. > > Eventually, we'll have to specify a scoring scheme. The closest import > always wins, so you can disambiguate explicitly. 
If you have multiple > interface implementations imported at the same level, I'd make that an > error in the first version of this system, but I agree that eventually > we'll want specificity rules. One of the things I do like about the categories proposal is that instance declarations are named and scoped. This makes it possible to have finer-grained control over what typeclass implementations are in scope, and it also makes it possible to disambiguate explicitly when the system can't automatically resolve ambiguities. Consequently it takes the pressure off us to try to construct the "perfect" set of disambiguation rules. Now from your proposal, I *love* the idea of unifying interfaces and typeclasses (ie type parameter bounds), and I like the idea of separating interfaces from implementations. (I also, incidentally, like the name `impl` that you suggested. I also have suggested `view` but Patrick really doesn't like that. But I'm cool with `impl` much more than `category` or `cat`.) So I would favor this combination of the two ideas: - typeclasses are declared with `iface` - typeclass instances are declared with `impl` and are named and scoped - we have some overload resolution rules based on scope and specificity - we allow explicit disambiguation by naming, e.g.: `x.I::y(...)` where I is an impl - we are conservative about avoiding subtle automatic resolution, and instead rely on explicit disambiguation >> This comes down to the different implementation approaches to polymorphism. Monomorphizing is rare in the ML/Haskell tradition, so type classes with dictionary-passing is a nice fit. But it's not as clear what happens to type classes when you introduce specialization. > > Monomorphizing generics that take interfaces sounds wrong. It is > probably a win in some situations, and heuristics can be found to > detect such situations, but I doubt we'll want to do it in general. 
> > (What happens seems quite clear though -- you compile a version of the > function with the vtable known at compile-time, so all dispatch > becomes static.) After thinking about this some more, let me try to make the case that monomorphizing bounded generics is the right thing to do. You mentioned the difference between fn draw_all<T: drawable>(shapes: [T]) { ... } and fn draw_all(shapes: [drawable]) { ... } In Haskell, this distinction is subtle, as it involves higher-rank types (polymorphic types where at least one "forall" is not at the top level of the type). That is, the second function is actually something like: fn draw_all(shapes: [forall<T: drawable>. T]) { ... } But as frightening as that makes it appear, it's a very common thing to want to express in OO programming. Now, we (rightly) haven't planned on allowing first-class polymorphism (aka higher-rank types). Instead, we can use the more user-friendly OO idioms of interfaces and classes for dynamic polymorphism. So this leads to a clean separation of use cases. If you want static resolution, you use bounded type parameters. If you want dynamic resolution, you use interface types. With full monomorphization, you always get static resolution for bounded type parameters. And instead of the mind-bending forall-types, interface types are just the familiar dynamic pattern from OO programming. I think this combines the well-studied type discipline of Haskell with the tried-and-true performance model of C++: with type parameters, you get static resolution, and with objects (interfaces), you get vtables and dynamic resolution. >> For example, how do we prevent programmers from declaring some things to be instances of `send` that aren't supposed to be? > > For `copy`, this is easy -- your `copy` method does a copy, and the > compiler has the full type of the arguments when compiling the method > body, so it checks, and complains when you implement a copy method on > an invalid type. 
We'll have to introduce some explicit way of checking > whether a type is sendable to do the same for `send`. I guess what I was suggesting was that we currently have a notion of types that *must not* be instances of `send`, and there's no way with typeclasses to enforce *non*-implementation of an interface. But I'm beginning to suspect that that's not a big deal. If you want to declare that a type implements `send`, even when that type really shouldn't be, you can go ahead and do it, but the implementation is going to be silly. For example, you could declare that @int is sendable with a no-op implementation. It's a stupid thing for a programmer to do, but it's not unsafe. I think that's totally worth living with, for the benefit of having `send` just be an iface. That is such a win in terms of mental burden on the user. > We'll probably have to wire in automatic 'derived' > instances/categories for this to be pleasant though, or you'll be > forced to implement your own instances for every tag you declare. Or > maybe `copy` and `send` will just look and behave like interfaces on > the surface, and in fact refer to some built-in magic interface that > the compiler handles specially. Yeah, Haskell's `deriving` is nice. >> And IINM, there are still blockers that are well-established and fundamental, like stack growth and x64, no? > > Right, but those have very competent people working on them already > (and are progressing really well). > > Actually, I agree that this isn't something that has any bearing on a > first release. Coming up with new neat requirements at the last moment > is a sure way to make that release be delayed. But at least having a > proposal to point people to when they ask about support for dynamic > dispatch would be a plus. > > (And yes, this will end up on the wiki, but I find that initial > discussion is better done in the mailing list.) Fair enough. 
Dave From dherman at mozilla.com Mon Nov 28 23:08:36 2011 From: dherman at mozilla.com (David Herman) Date: Mon, 28 Nov 2011 23:08:36 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: References: <4ED3267F.9050204@alum.mit.edu> Message-ID: On Nov 28, 2011, at 12:38 AM, Marijn Haverbeke wrote: > I prefer them because they enforce a much more structured way of > thinking. Declaring grab-bags of methods and assembling them into > interfaces ad-hoc seems clumsy and, as you mention, error prone --- > just because a method has a given name, doesn't mean it actually does > the same thing as other methods by that name. Explicitly declaring > what interface you're implementing would remove this vagueness. It makes sense to have nominal interfaces, but being too strict about requiring explicit implementation declarations has a well-known modularity issue: if you use a third-party library that does not declare that it implements a particular interface, there's no way to declare that it does without changing the third-party library's code. This comes up in two ways: 1) classes that don't explicitly declare they implement an interface 2) interfaces that don't explicitly declare they extend an interface We should *allow* explicit declaration of interface implementation, but we should consider being less strict than Java about requiring these declarations. Dave From marijnh at gmail.com Mon Nov 28 23:16:09 2011 From: marijnh at gmail.com (Marijn Haverbeke) Date: Tue, 29 Nov 2011 08:16:09 +0100 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: References: <4ED3267F.9050204@alum.mit.edu> Message-ID: > there's no way to declare that it does without changing the third-party library's code This is true in classical sealed classes, but not in open typeclasses. You can always declare an implementation of an interface for a 3rd-party datatype in your own module, or in the module that declares the interface. 
If there really is no connection between the interface and the methods that a module happens to implement, the chance that they will coincidentally match and implement the same protocol is very small anyway. A wrapper/adapter impl is extremely trivial to write. I'm still skeptical about monomorphizing all bounded generics. Might work out, but it'll require an overhaul of the compiler and some extensive measurement (for unintended bloat). The first version of this should probably do vtable-passing. Best, Marijn From niko at alum.mit.edu Tue Nov 29 12:00:58 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 29 Nov 2011 12:00:58 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> Message-ID: <4ED539FA.1050006@alum.mit.edu> On 11/28/11 11:02 PM, David Herman wrote: > I guess what I was suggesting was that we currently have a notion of > types that *must not* be instances of `send`, and there's no way with > typeclasses to enforce *non*-implementation of an interface. But I'm > beginning to suspect that that's not a big deal. If you want to > declare that a type implements `send`, even when that type really > shouldn't be, you can go ahead and do it, but the implementation is > going to be silly. For example, you could declare that @int is > sendable with a no-op implementation. It's a stupid thing for a > programmer to do, but it's not unsafe. I think that's totally worth > living with, for the benefit of having `send` just be an iface. That > is such a win in terms of mental burden on the user. The other day there was some off-list discussion about how this might work. I just wanted to write it up and put it on the list and make sure we're all on the same page. 
The basic idea is that there is a well-known interface sendable, defined something like: iface sendable { fn send(chan: chan); } (Note the implicit `self` variable; I think something like this will be needed, though it introduces complexities. If we decide it's infeasible we can probably get rid of it. For now I will assume we can make it work) Now when the compiler encounters an interface bound: fn foo<S: sendable>(s: S) { ... } it searches for an implementation of `sendable` as normal. Assuming nothing is found, a special case rule kicks in: if the type matches our definition of `sendable` (no transitive reference to an `@`), then a default compiler implementation is used. An analogous process applies for copy and log. One twist that just occurred to me: sending a value *moves* the receiver. We need a way to note that. I like that this allows the user to customize how types can be copied, logged, and sent. I think it works better in a nominal type system, though, because otherwise you end up defining too much (i.e., you define a rule for how to send all records with the same set of fields, rather than the particular kind of record you meant). Maybe this is not a problem in practice (kind of like duck typing, which seems to work fine in practice). Niko From graydon at mozilla.com Tue Nov 29 15:11:08 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 29 Nov 2011 15:11:08 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <4ED539FA.1050006@alum.mit.edu> References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> Message-ID: <4ED5668C.1030200@mozilla.com> On 11-11-29 12:00 PM, Niko Matsakis wrote: > I like that this allows the user to customize how types can be copied, > logged, and sent. I don't. It's in conflict with predictability and the guarantees the language is trying to make for its users' mental models. IMO that's more important. 
-Graydon From dherman at mozilla.com Tue Nov 29 15:43:25 2011 From: dherman at mozilla.com (David Herman) Date: Tue, 29 Nov 2011 15:43:25 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <4ED5668C.1030200@mozilla.com> References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> Message-ID: <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> On Nov 29, 2011, at 3:11 PM, Graydon Hoare wrote: > On 11-11-29 12:00 PM, Niko Matsakis wrote: > >> I like that this allows the user to customize how types can be copied, >> logged, and sent. > > I don't. It's in conflict with predictability and the guarantees the language is trying to make for its users' mental models. I don't see any evidence of this claim. These are still totally statically predictable, and not only that, the client can control which implementations they want to use by means of scope. Dave From graydon at mozilla.com Tue Nov 29 15:51:16 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 29 Nov 2011 15:51:16 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> Message-ID: <4ED56FF4.5000809@mozilla.com> On 11-11-29 03:43 PM, David Herman wrote: > On Nov 29, 2011, at 3:11 PM, Graydon Hoare wrote: > >> On 11-11-29 12:00 PM, Niko Matsakis wrote: >> >>> I like that this allows the user to customize how types can be copied, >>> logged, and sent. >> >> I don't. It's in conflict with predictability and the guarantees the language is trying to make for its users' mental models. > > I don't see any evidence of this claim. These are still totally statically predictable, and not only that, the client can control which implementations they want to use by means of scope. 
"Customize" and "predictable" are, in general, opposites. If custom code might be invoked as part of an operation, the reader has to go read the custom code to figure out what it does. I have the exact same objection to operator overloading and several other dimensions along which extensibility may be offered. I think too much extensibility is not a good thing. -Graydon From dherman at mozilla.com Tue Nov 29 16:10:13 2011 From: dherman at mozilla.com (David Herman) Date: Tue, 29 Nov 2011 16:10:13 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <4ED56FF4.5000809@mozilla.com> References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> <4ED56FF4.5000809@mozilla.com> Message-ID: On Nov 29, 2011, at 3:51 PM, Graydon Hoare wrote: > On 11-11-29 03:43 PM, David Herman wrote: >> On Nov 29, 2011, at 3:11 PM, Graydon Hoare wrote: >> >>> On 11-11-29 12:00 PM, Niko Matsakis wrote: >>> >>>> I like that this allows the user to customize how types can be copied, >>>> logged, and sent. >>> >>> I don't. It's in conflict with predictability and the guarantees the language is trying to make for its users' mental models. >> >> I don't see any evidence of this claim. These are still totally statically predictable, and not only that, the client can control which implementations they want to use by means of scope. > > "Customize" and "predictable" are, in general, opposites. If custom code might be invoked as part of an operation, the reader has to go read the custom code to figure out what it does. The same argument can be made against user-defined functions. But in practice, you don't have to read code you invoke to understand it, because the writer of the code provides a high-level description of what it does. Overloaded methods similarly invoke user-defined functions. 
This *theoretically* means that, in general, the definer of the overloaded implementation could do anything, but just like other functions, in practice they should write programs that obey the general contract of the overloaded operation. And the nice thing about this design is that clients only get the overloadings they explicitly ask for (by bringing them into scope via `import`). So you don't get silent changes to the meaning of operations -- this makes a big difference for predictability and client-side control. Crucially, it also decreases the amount of magic in the language by making things like `send` into part of a single, uniform notion of a type bound. This is a really big win for the mental model. > I have the exact same objection to operator overloading and several other dimensions along which extensibility may be offered. I think too much extensibility is not a good thing. I think operator overloading is important; I don't think it's too much extensibility. But I think typeclasses and method overloading are even more important. Maybe you're not arguing against typeclasses and bounded types, but just that `send` and `copy` should be fixed typeclasses that can't be user-instantiated. That at least preserves the uniformity of concept of type bounds. But I think the restriction adds complexity and hurts expressiveness (in particular, that you can't define new sendable data abstractions) for not enough gain. 
Dave From graydon at mozilla.com Tue Nov 29 18:18:54 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 29 Nov 2011 18:18:54 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> <4ED56FF4.5000809@mozilla.com> Message-ID: <4ED5928E.5080903@mozilla.com> On 11-11-29 04:10 PM, David Herman wrote: > The same argument can be made against user-defined functions. Absolutely. I think making a language is, in many ways, a process of figuring out which parts are "worth" making user-definable -- because some must be, or users expect them -- and which parts are better left as "fixed" structure users can rely on when they read other people's code. > And the nice thing about this design is that clients only get the overloadings they explicitly ask for (by bringing them into scope via `import`). People will import foo::* and by doing so change the meaning of otherwise-obvious code that interacts with type foo. It's *somewhat bad* in the case of send: the surprise might be that nothing gets sent, or something is sent multiple times, or there are side effects you didn't expect. It's clearer what's going on if sending is sending and conversion functions or other pre/post-send operations are separate functions. It's *very bad* in the case of copy: copies are made all over the place and the world where copy-does-not-mean-copy is a familiar C++ nightmare I do not want to recreate. Copy constructors, move constructors, conversion operators and assignment operators. Please no. If someone wants to invoke custom code, make them write custom_code() or custom.code(). At least. In C++ the argument is "we want custom code to look like built-in code", and have access to all the same sorts of magic. I want the opposite of that. 
I want custom code to *look* like it's doing custom things, so that the absence of custom-looking-operations can be safely skimmed by a reader with better ways to spend their day. > Crucially, it also decreases the amount of magic in the language by making things like `send` into part of a single, uniform notion of a type bound. This is a really big win for the mental model. Mental model, yes. This is the argument in favour. I can see this, if it fits. But it might be a "typeclass" that's defined by the compiler and can't be touched by the user. Customization model, I'm much less keen on. > I think operator overloading is important; I don't think it's too much extensibility. But I think typeclasses and method overloading are even more important. Ok. We'll have to arm-wrestle on the first point when the time comes :) The second remains to be seen. I'm hopeful though! > Maybe you're not arguing against typeclasses and bounded types, but just that `send` and `copy` should be fixed typeclasses that can't be user-instantiated. I suppose that's a way of looking at it. I'm generally arguing that I want send to send and copy to copy. That the value of an author being able to "transparently" change the meaning of those operations is less than the value of a reader being able to rely on fixed meanings for them. I guess as you say the cognitive load goes down if you can eliminate "kind" in favour of "typeclass that is automatically defined for all tycons and also cannot be redefined for user types". I'd be tentatively ok with that if the typeclass thing works out. Same for ord and eq, IMO. And all the other operators. If we absolutely cannot agree on whether these should be overridable, I will at least go implement a #[no(custom_operators)] pragma to accompany #[no(gc)]. For every user who wants expanded flexibility there are many who just want to fix their bugs and go home at 5. 
-Graydon From marijnh at gmail.com Wed Nov 30 00:24:51 2011 From: marijnh at gmail.com (Marijn Haverbeke) Date: Wed, 30 Nov 2011 09:24:51 +0100 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <4ED5928E.5080903@mozilla.com> References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> <4ED56FF4.5000809@mozilla.com> <4ED5928E.5080903@mozilla.com> Message-ID: I agree with Graydon that there's probably not much to be gained from overriding copy and send. I think they should look and act like interfaces when specified as type parameter bounds, but the actual implementations should be automatically derived by the compiler (as they are now). For operator overloading, I disagree -- I think we'll definitely want that to implement decent bignum, complex number, or (mathematical) vector types. Yes, people can go crazy and do dumb things with overloaded operators, but people can do dumb things with just about any feature you give them. A pragma to turn it off is probably an easy thing to support. 
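[Editorial sketch] Marijn's complex-number use case is the classic one. A minimal sketch in modern Rust syntax (not 2011 Rust), where overloading `+` means implementing an `Add` interface rather than hooking the operator directly:

```rust
use std::ops::Add;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Complex {
    re: f64,
    im: f64,
}

// Implementing the `Add` trait is what gives `Complex + Complex` meaning;
// there is no way to change `+` on built-in types out from under a reader.
impl Add for Complex {
    type Output = Complex;
    fn add(self, rhs: Complex) -> Complex {
        Complex { re: self.re + rhs.re, im: self.im + rhs.im }
    }
}

fn main() {
    let z = Complex { re: 1.0, im: 2.0 } + Complex { re: 3.0, im: -1.0 };
    assert_eq!(z, Complex { re: 4.0, im: 1.0 });
}
```

Precedence and associativity stay fixed by the language; only the meaning of the operator on the new type is supplied by the user.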
From marijnh at gmail.com Wed Nov 30 03:08:46 2011 From: marijnh at gmail.com (Marijn Haverbeke) Date: Wed, 30 Nov 2011 12:08:46 +0100 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> <4ED56FF4.5000809@mozilla.com> <4ED5928E.5080903@mozilla.com> Message-ID: https://github.com/graydon/rust/wiki/Interface-and-Implementation-Proposal From dteller at mozilla.com Wed Nov 30 05:55:36 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Wed, 30 Nov 2011 14:55:36 +0100 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <4ED56FF4.5000809@mozilla.com> References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> <4ED56FF4.5000809@mozilla.com> Message-ID: <4ED635D8.1090202@mozilla.com> I second Graydon's remark. While syntactically very nice, customizable operator overloading is the kind of feature that makes it much harder to understand snippets and, in particular, to review patches. Now, of course, avoiding operator overloading can be a matter of project-specific coding guidelines and/or pragmas rather than something hardwired into the language. Cheers, David On 11/30/11 12:51 AM, Graydon Hoare wrote: > > "Customize" and "predictable" are, in general, opposites. If custom code > might be invoked as part of an operation, the reader has to go read the > custom code to figure out what it does. > > I have the exact same objection to operator overloading and several > other dimensions along which extensibility may be offered. I think too > much extensibility is not a good thing. 
> > -Graydon > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From niko at alum.mit.edu Wed Nov 30 06:49:00 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 30 Nov 2011 06:49:00 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <4ED5928E.5080903@mozilla.com> References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> <4ED56FF4.5000809@mozilla.com> <4ED5928E.5080903@mozilla.com> Message-ID: <4ED6425C.9050503@alum.mit.edu> On 11/29/11 6:18 PM, Graydon Hoare wrote: > It's *very bad* in the case of copy: copies are made all over the > place and the world where copy-does-not-mean-copy is a familiar C++ > nightmare I do not want to recreate. Copy constructors, move > constructors, conversion operators and assignment operators. Please > no. If someone wants to invoke custom code, make them write > custom_code() or custom.code(). At least. I have two objections to this: First, I don't think copies should be going on all over the place, to start, and certainly not implicitly. If I had any concern about Rust as it stands today, it's that we implicitly copy left and right. If state is immutable, it's harmless but inefficient. If state is mutable, it's changing the semantics of the program in *very* subtle and non-obvious ways. Second, unless copies can be user-defined, the built-in semantics are just not always suitable. Imagine a user-defined hashtable, for example. If I write "copy map" where map is an instance of this hashtable, I want it to be a *deep copy*. I do not want to be sharing the bucket lists and so forth. 
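[Editorial sketch] Niko's hashtable example can be made concrete. A sketch in modern Rust with hypothetical `Shared`/`Deep` types, contrasting a memberwise copy that shares the underlying buckets with a type-specific deep clone:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Memberwise copy: cloning the handle shares the bucket storage.
#[derive(Clone)]
struct Shared {
    buckets: Rc<RefCell<Vec<i32>>>,
}

// Type-specific deep copy: cloning duplicates the bucket storage.
struct Deep {
    buckets: Vec<i32>,
}

impl Deep {
    fn clone_deep(&self) -> Deep {
        Deep { buckets: self.buckets.clone() }
    }
}

fn main() {
    let a = Shared { buckets: Rc::new(RefCell::new(vec![1])) };
    let b = a.clone(); // shallow: same bucket storage as `a`
    b.buckets.borrow_mut().push(2);
    assert_eq!(a.buckets.borrow().len(), 2); // mutation visible through `a`

    let c = Deep { buckets: vec![1] };
    let mut d = c.clone_deep(); // deep: independent storage
    d.buckets.push(2);
    assert_eq!(c.buckets.len(), 1); // `c` is unaffected
}
```

The thread's eventual compromise (an explicit `cloneable`-style interface rather than an overridable built-in `copy`) corresponds to calling `clone_deep` by name instead of overloading the copy operation.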
Niko From niko at alum.mit.edu Wed Nov 30 12:36:20 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 30 Nov 2011 12:36:20 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <4ED6425C.9050503@alum.mit.edu> References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> <4ED56FF4.5000809@mozilla.com> <4ED5928E.5080903@mozilla.com> <4ED6425C.9050503@alum.mit.edu> Message-ID: <4ED693C4.4070908@alum.mit.edu> On 11/30/11 6:49 AM, Niko Matsakis wrote: > On 11/29/11 6:18 PM, Graydon Hoare wrote: >> It's *very bad* in the case of copy: copies are made all over the >> place and the world where copy-does-not-mean-copy is a familiar C++ >> nightmare I do not want to recreate. Copy constructors, move >> constructors, conversion operators and assignment operators. Please >> no. If someone wants to invoke custom code, make them write >> custom_code() or custom.code(). At least. > I have two objections to this: I have been thinking over my own objections and I realize I was mistaken. I still think we should avoid implicit copies, but we do have to be very careful with allowing people to modify the copy mechanism, for two reasons - copy must have a certain minimal meaning for reasoning about unique pointers to be sound; - given a user-defined type where copying ought to be customized, it would be unfortunate if failing to import the proper implementation gave you the wrong copy. To put this another way, I think allowing users to extend the language to cover undefined operations is generally fine. For example, if copying were not permitted unless the user defined it, that would be one thing. But if there is a default behavior, then letting the user override that behavior---particularly via scoped implementations---is probably not a good idea. 
Niko From graydon at mozilla.com Wed Nov 30 14:45:09 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Wed, 30 Nov 2011 14:45:09 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <4ED693C4.4070908@alum.mit.edu> References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> <4ED56FF4.5000809@mozilla.com> <4ED5928E.5080903@mozilla.com> <4ED6425C.9050503@alum.mit.edu> <4ED693C4.4070908@alum.mit.edu> Message-ID: <4ED6B1F5.1070907@mozilla.com> On 11-11-30 12:36 PM, Niko Matsakis wrote: > I have been thinking over my own objections and I realize I was > mistaken. I still think we should avoid implicit copies, but we do have > to be very careful with allowing people to modify the copy mechanism, > for two reasons > - copy must have a certain minimal meaning for reasoning about unique > pointers to be sound; > - given a user-defined type where copying ought to be customized, it > would be unfortunate if failing to import the proper implementation gave > you the wrong copy. > > To put this another way, I think allowing users to extend the language > to cover undefined operations is generally fine. For example, if copying > were not permitted unless the user defined it, that would be one thing. > But if there is a default behavior, then letting the user override that > behavior---particularly via scoped implementations---is probably not a good > idea. Agreed. Relative to your example, I'd be totally fine with a "cloneable" iface (to reuse the java terminology) that does a deep, type-specific copy, on types that choose to implement that. Hashtbl.clone() is fine. I also agree that careful design consideration for minimizing implicit or accidental copies is important. 
It's definitely good to keep in mind, and a fair bit of the existing machinery tends in this direction already. I just don't want to go injecting custom code paths into the meaning of initialization, assignment, moving, argument-passing, and similar "primitives". Remove too many primitives and the user has no solid ground to stand on. Thanks for giving this further thought! I felt like I came off quite a bit more harsh yesterday than I wanted to; I'm sorry for the reactionary tone. -Graydon From graydon at mozilla.com Wed Nov 30 17:21:09 2011 From: graydon at mozilla.com (Graydon Hoare) Date: Wed, 30 Nov 2011 17:21:09 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> <4ED56FF4.5000809@mozilla.com> <4ED5928E.5080903@mozilla.com> Message-ID: <4ED6D685.70505@mozilla.com> On 11-11-30 03:08 AM, Marijn Haverbeke wrote: > https://github.com/graydon/rust/wiki/Interface-and-Implementation-Proposal Looks good to me. At least the first chunk. Gets a bit more debatable in the "Extensions" part, but we can cross those bridges later. Concerning operator overloading (as I gather there are >1 fans of it here): I often mentally differentiate these operators:

    + - * / ^ % < <= == >= > ! || && []

From these operators:

    . () ~ @ # &(unary) *(unary)

In the sense that the former group are "more ALU-like" and the latter group are "more load/store/jump-like". Values vs. memory-and-control. Do you feel (straw-vote) like you'd be sufficiently happy to be able to override the former group but not the latter? Languages vary on how far down the rabbit hole of operator overloading they permit, and I wonder where each proponent of the concept draws the line. The former group is enough to implement most intuitively-arithmetic-ish types, which seems to be the big use-case. 
(Also: please say you've no interest in permitting user-defined operator-symbols with their own associativity and precedence. Right?) -Graydon From niko at alum.mit.edu Wed Nov 30 17:31:00 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 30 Nov 2011 17:31:00 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <4ED6D685.70505@mozilla.com> References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> <4ED56FF4.5000809@mozilla.com> <4ED5928E.5080903@mozilla.com> <4ED6D685.70505@mozilla.com> Message-ID: <4ED6D8D4.2070501@alum.mit.edu> On 11/30/11 5:21 PM, Graydon Hoare wrote: > Do you feel (straw-vote) like you'd be sufficiently happy to be able > to override the former group but not the latter? Languages vary on how > far down the rabbit hole of operator overloading they permit, and I > wonder where each proponent of the concept draws the line. The former > group is enough to implement most intuitively-arithmetic-ish types, > which seems to be the big use-case. I'd probably be happy with the first set. What's important to me is that custom numeric types and collections feel natural to use. Niko From niko at alum.mit.edu Wed Nov 30 21:26:14 2011 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 30 Nov 2011 21:26:14 -0800 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> <4ED56FF4.5000809@mozilla.com> <4ED5928E.5080903@mozilla.com> Message-ID: <4ED70FF6.6020506@alum.mit.edu> On 11/30/11 3:08 AM, Marijn Haverbeke wrote: > https://github.com/graydon/rust/wiki/Interface-and-Implementation-Proposal As I mentioned on IRC, I think it might be useful to have some syntax to denote the mode of the receiver. 
Almost all the time you want it to be `&&`, but for the `sendable` interface, in particular, a move mode is more appropriate. I am not sure what this syntax ought to be, but here are some possibilities that come to mind.

Prefix the method declaration:

    iface sendable { -fn send(c: chan<_>); }

After the method declaration (à la C++):

    iface sendable { fn send(c: chan<_>) -; }

A first argument that consists of just a mode:

    iface sendable { fn send(-, c: chan<_>); }

A `self` keyword with no type as the first argument:

    iface sendable { fn send(-self, c: chan<_>); }

However, it occurs to me that we could just ignore this issue if we want to make sendable more of a marker interface (i.e., it would be an interface with no methods that is not explicitly implementable). In that case, sending would be a function like:

    fn send(chan: chan, -msg: T);

Niko From dteller at mozilla.com Wed Nov 30 23:59:07 2011 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Thu, 01 Dec 2011 08:59:07 +0100 Subject: [rust-dev] Interface / typeclass proposal In-Reply-To: <4ED6D685.70505@mozilla.com> References: <3EF691C1-2F2C-4A5F-B580-755D2AB632D3@mozilla.com> <4ED539FA.1050006@alum.mit.edu> <4ED5668C.1030200@mozilla.com> <1425DC23-16B4-4573-852D-E13582FBE994@mozilla.com> <4ED56FF4.5000809@mozilla.com> <4ED5928E.5080903@mozilla.com> <4ED6D685.70505@mozilla.com> Message-ID: <4ED733CB.1060105@mozilla.com> What about the following convention that would at least attract attention to the fact that we are dealing with custom/overloaded operators?

    foo + bar       // usual operator
    foo `+` bar     // regular function "+" with infix notation
    foo `plus` bar  // regular function "plus" with infix notation

In which the functions "+" and "plus" can be overloaded in some manner. Also, yes, custom precedence and associativity sounds like a bad idea. 
Cheers, David On 12/1/11 2:21 AM, Graydon Hoare wrote: > On 11-11-30 03:08 AM, Marijn Haverbeke wrote: >> https://github.com/graydon/rust/wiki/Interface-and-Implementation-Proposal >> > > Looks good to me. At least the first chunk. Gets a bit more debatable in > the "Extensions" part, but we can cross those bridges later. > > Concerning operator overloading (as I gather there are >1 fans of it here): > > I often mentally differentiate these operators: > > + - * / ^ % < <= == >= > ! || && [] > > From these operators: > > . () ~ @ # &(unary) *(unary) > > In the sense that the former group are "more ALU-like" and the latter > group are "more load/store/jump-like". Values vs. memory-and-control. > > Do you feel (straw-vote) like you'd be sufficiently happy to be able to > override the former group but not the latter? Languages vary on how far > down the rabbit hole of operator overloading they permit, and I wonder > where each proponent of the concept draws the line. The former group is > enough to implement most intuitively-arithmetic-ish types, which seems > to be the big use-case. > > (Also: please say you've no interest in permitting user-defined > operator-symbols with their own associativity and precedence. Right?) > > -Graydon > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev
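[Editorial sketch] Niko's marker-interface variant of `sendable`, floated earlier in the thread, can be sketched in modern Rust, where `Send` is exactly such a method-less marker bound, derived by the compiler and not user-overridable in safe code, and sending is an ordinary function (modern syntax, not the 2011 proposal's):

```rust
use std::sync::mpsc;
use std::thread;

// The `Send` bound says "values of T may cross thread boundaries" without
// invoking any custom code at the send itself; sending is a plain function.
fn send_to<T: Send + 'static>(tx: mpsc::Sender<T>, msg: T) {
    tx.send(msg).expect("receiver still alive");
}

fn main() {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || send_to(tx, 42));
    assert_eq!(rx.recv().unwrap(), 42);
}
```

Because the marker has no methods, there is nothing for a scoped implementation to silently override, which addresses Graydon's "nothing gets sent, or something is sent multiple times" worry.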