From bilal at bilalhusain.com Sun Apr 1 02:49:04 2012
From: bilal at bilalhusain.com (Mohd. Bilal Husain)
Date: Sun, 1 Apr 2012 15:19:04 +0530
Subject: [rust-dev] building on SunOS, regex
Message-ID:

I wrote to Marijn directly and got pointed to this mailing list. I hope
this mailing list also serves rust-users.

Two questions:

1. How to compile for SunOS. From what I understand, I first need a
precompiled snapshot for SunOS# which was probably written in Ocaml##
I have no clue which earlier version to checkout for the same. Or if the
process has been changed.

2. Is there support for regular expressions; especially, in alt arms?
Also, I couldn't locate regex in std or core module.

Thanks.

# https://github.com/mozilla/rust/blob/master/INSTALL.txt
## http://en.wikipedia.org/wiki/Rust_%28programming_language%29

From graydon at mozilla.com Sun Apr 1 09:38:05 2012
From: graydon at mozilla.com (Graydon Hoare)
Date: Sun, 01 Apr 2012 09:38:05 -0700
Subject: [rust-dev] building on SunOS, regex
In-Reply-To:
References:
Message-ID: <4F78846D.1000802@mozilla.com>

On 01/04/2012 2:49 AM, Mohd. Bilal Husain wrote:
> 1. How to compile for SunOS. From what I understand, I first need a
> precompiled snapshot for SunOS# which was probably written in Ocaml##
> I have no clue which earlier version to checkout for the same. Or if the
> process has been changed.

Re-bootstrapping from rustboot (the original ocaml-based bootstrap
compiler) is not a good route to a new target. The dialect of the language
it compiled is long-since obsolete. You'd have to walk it through a year
and a half of changes, hundreds of builds and snapshots. It would be very
delicate and hard.
The only programs that can process the current rust dialect are the
existing stage0 snapshots (which we have binary snapshots of for 6 hosts
presently: {macos,linux}-{x86,x64}, freebsd-x64, win32-x86).

The efficient route to getting a new target is to add support for the
target architecture, in the form of .S files, to the runtime (if it's not
already there), then add support to the configuration machinery, driver and
linkage-driver for the new target. Then keep fiddling with it until it
produces binaries that run on your target.

If you want a new _host_, you get 'target' mode working (as above) and
then just cross-compile rustc from an existing host to your new SunOS
target, and register the output from that cross-compilation as your
target's new stage0 snapshot.

When you say "compile for SunOS", it depends if you mean as host or as
target. We're going to be a bit cautious about adding more and more
supported kinds of host. We added freebsd this time around, partly because
it's so similar to the macos and linux ports that there was very little
delta; but every host we support is a new bit of build machinery we have to
keep online and moving forward in lock-step with the others. That's more
ongoing porting and maintenance effort for us (mozilla) and at some point
we're going to draw a line.

We should probably work out some sort of policy about community-supported
hosts, possibly a way for people to run secondary repos or branches that
advance at their own pace (rather than as a bottleneck on our master
branch) while still using mostly-similar infrastructure.

Extra targets, though, I think should usually be welcome in our master
branch. They cost us much less than extra hosts.

> 2. Is there support for regular expressions; especially, in alt arms?
> Also, I couldn't locate regex in std or core module.

There's a pcre module in cargo[1].
We intend to integrate regexp-based switching (alt-like) via a syntax
extension at some future date, and will probably bring re2 or pcre or
something into libstd once we have a plausible story for optional std
components, but have not done any work on this yet.

-Graydon

[1] https://github.com/mozilla/cargo-central

From bilal at bilalhusain.com Sun Apr 1 10:34:28 2012
From: bilal at bilalhusain.com (Mohd. Bilal Husain)
Date: Sun, 1 Apr 2012 23:04:28 +0530
Subject: [rust-dev] building on SunOS, regex
In-Reply-To: <4F78846D.1000802@mozilla.com>
References: <4F78846D.1000802@mozilla.com>
Message-ID:

1. Thanks a ton for the detailed instructions. Although I do understand
it in parts, I am not sure if I am able to comprehend a lot of things.
I'll try and see if I can make some progress.

2. Thanks for pointing to the pcre package.

Apart from that I have a question about bind (probably due to my lack of
functional programming knowledge) and a few remarks:

I am following the tutorial at http://doc.rust-lang.org/doc/tutorial.html

3. Section 5.2 Bind. I understand that I can
option::unwrap(daynum("do")) to get back the uint.
What is the difference if I skip the bind keyword?
The llvm bitcode files that are generated w/ and w/o appear to be the same.

Sidenote: Is the item 'do' a joke amidst 'mo', 'tu', ... which appear
to be weekdays?

4. Suggesting a few edits
a) Section 8.6 - The map functions should read
vec::map([1, 2, 3], plus1);

b) Section 11, line 8 should declare acc mutable
let mut acc = "";

c) Similarly, in section 13 Testing, the variable i must be mutable so
line 7 should read
let mut i = -100;

And again, I am feeling intimidated for posting this on dev mailing list.

From graydon at mozilla.com Sun Apr 1 11:12:23 2012
From: graydon at mozilla.com (Graydon Hoare)
Date: Sun, 01 Apr 2012 11:12:23 -0700
Subject: [rust-dev] building on SunOS, regex
In-Reply-To:
References: <4F78846D.1000802@mozilla.com>
Message-ID: <4F789A87.7060508@mozilla.com>

On 01/04/2012 10:34 AM, Mohd. Bilal Husain wrote:
> 1. Thanks a ton for the detailed instructions. Although I do understand
> it in parts, I am not sure if I am able to comprehend a lot of things.
> I'll try and see if I can make some progress.

Ok. If you want to stop by our IRC channel, irc.mozilla.org channel #rust
during weekdays, some mozilla developers should be around and able to help
answer questions a bit more "interactively", clarify the parts you do not
understand. Most of us are there during working hours (9h-17h) in UTC-7,
but Marijn works in UTC+1.
You're also welcome to keep posting here asking for clarification on
details of what I wrote in the previous message, if you prefer email.

Nobody's really written up instructions on how to add a new target yet, so
it's worth elaborating in some detail here. Maybe it would make for a good
section in the wiki or manual. There are lots of targets yet to support,
and there's no need for mozilla developers to act as bottlenecks on new
targets (indeed, the FreeBSD target was contributed by Jyun-Yan You, a
volunteer).

> Apart from that I have a question about bind (probably due to my lack of
> functional programming knowledge) and few remarks:
>
> I am following the tutorial at http://doc.rust-lang.org/doc/tutorial.html
>
> 3. Section 5.2 Bind. I understand that I can
> option::unwrap(daynum("do")) to get back the uint.
> What is the difference if I skip the bind keyword.
> The llvm bitcode files that are generated w/ and w/o appear to be the same.

I believe we're in the process of deprecating/removing the 'bind' keyword,
but I'm not certain. Niko is adjusting that part of the syntax, so you
might have run across a temporary redundancy.

> 4. Suggesting a few edits

Thanks, those are sharp eyes! We'll try to fix these up.

> And again, I am feeling intimidated for posting this on dev mailing list.

Oh, don't worry about it. This is the only mailing list we have that's not
just automated commit-postings. We don't have enough users yet to warrant
separate -users and -dev mailing lists. Your questions and suggestions are
quite welcome here.

-Graydon

From banderson at mozilla.com Sun Apr 1 15:40:43 2012
From: banderson at mozilla.com (Brian Anderson)
Date: Sun, 01 Apr 2012 15:40:43 -0700
Subject: [rust-dev] building on SunOS, regex
In-Reply-To:
References: <4F78846D.1000802@mozilla.com>
Message-ID: <4F78D96B.3010907@mozilla.com>

On 04/01/2012 10:34 AM, Mohd. Bilal Husain wrote:
> I am following the tutorial at http://doc.rust-lang.org/doc/tutorial.html
>
> 3. Section 5.2 Bind.
> I understand that I can
> option::unwrap(daynum("do")) to get back the uint.
> What is the difference if I skip the bind keyword.
> The llvm bitcode files that are generated w/ and w/o appear to be the
> same.

The bind keyword here is redundant. Any time you have a function call
where one of the parameters is `_`, that's a bind. The only time that bind
is currently necessary is when you want to bind all the arguments. In that
case there's no way to discern that it's a bind without the keyword.

> Sidenote: Is the item 'do' a joke amidst 'mo', 'tu', ... which appear
> to be weekdays

'do' is the German abbreviation.

> 4. Suggesting a few edits
> a) Section 8.6 - The map functions should read
> vec::map([1, 2, 3], plus1);
>
> b) Section 11, line 8 should declare acc mutable
> let mut acc = "";
>
> c) Similarly, in section 13 Testing, the variable i must be mutable so
> line 7 should read
> let mut i = -100;
>
> And again, I am feeling intimidated for posting this on dev mailing list.

Thanks. I checked in these fixes.

From marijnh at gmail.com Sun Apr 1 23:05:26 2012
From: marijnh at gmail.com (Marijn Haverbeke)
Date: Mon, 2 Apr 2012 08:05:26 +0200
Subject: [rust-dev] building on SunOS, regex
In-Reply-To: <4F78D96B.3010907@mozilla.com>
References: <4F78846D.1000802@mozilla.com> <4F78D96B.3010907@mozilla.com>
Message-ID:

> 'do' is the German abbreviation.

Dutch, too. I guess my brain momentarily flipped back to my native
language when typing out these abbreviations. Thanks for fixing that.

From mictadlo at gmail.com Tue Apr 3 01:04:14 2012
From: mictadlo at gmail.com (Mic)
Date: Tue, 3 Apr 2012 18:04:14 +1000
Subject: [rust-dev] read file line by line
Message-ID:

Hello,
I found read_line, but I do not know how to convert the following Python
code (skip first line and print all other lines from a file) to Rust.

f = open(file_name, 'r')
f.next() #skip line
for line in f:
    print line
f.close()

How does Rust handle exceptions?

Thank you in advance.

From mictadlo at gmail.com Tue Apr 3 01:18:15 2012
From: mictadlo at gmail.com (Mic)
Date: Tue, 3 Apr 2012 18:18:15 +1000
Subject: [rust-dev] hashmap benchmark
Message-ID:

Hello,
here are some benchmarks for hashmaps
http://lh3lh3.users.sourceforge.net/udb.shtml .

How is Rust's hashmap memory and speed efficiency compared to the above
link?

Thank you in advance.

From bilal at bilalhusain.com Tue Apr 3 03:36:46 2012
From: bilal at bilalhusain.com (Mohd. Bilal Husain)
Date: Tue, 3 Apr 2012 16:06:46 +0530
Subject: [rust-dev] read file line by line
In-Reply-To:
References:
Message-ID:

As you figured out, the function read_line can be used from the
reader_util implementation from module io

~~~~

import io::reader_util;

#[doc = "reads the entire file line by line except the first line"]
fn main(args: [str]) {
    if args.len() == 1u {
        fail #fmt("usage: %s ", args[0]);
    }

    let r = io::file_reader(args[1]); // r is result
    if result::failure(r) {
        fail result::get_err(r);
    }

    let rdr = result::get(r);
    rdr.read_line(); // skip line
    while !rdr.eof() {
        io::println(rdr.read_line());
    }
}
~~~~

I don't think Rust lets you catch exceptions while reading the stream as
you can't do much about it*.

* Error handling in Rust is unrecoverable unwinding
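
For readers following this thread with a present-day Rust toolchain, the
same task (skip the first line, keep the rest) can be sketched roughly as
below. This is only an illustration under stated assumptions: it uses
today's standard library (std::fs::File, std::io::BufRead), not the 2012
dialect shown in the message above, and the function names
lines_after_first and file_lines_after_first are made up for this sketch.
I/O errors come back as ordinary io::Result values rather than through
unwinding.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Returns every line of `reader` except the first one.
/// (Hypothetical helper name; not part of any library.)
fn lines_after_first<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    let mut lines = reader.lines();
    // Consume the first line, but still report an I/O error if reading it fails.
    if let Some(first) = lines.next() {
        first?;
    }
    // Collecting an iterator of io::Result<String> stops at the first error.
    lines.collect()
}

/// Opens `path` and returns all of its lines except the first.
/// (Hypothetical convenience wrapper around `lines_after_first`.)
fn file_lines_after_first<P: AsRef<Path>>(path: P) -> io::Result<Vec<String>> {
    let file = File::open(path)?;
    lines_after_first(BufReader::new(file))
}
```

A caller would typically match on the returned Result, printing each line
on success and reporting the error otherwise.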

From mictadlo at gmail.com Tue Apr 3 06:54:38 2012
From: mictadlo at gmail.com (Mic)
Date: Tue, 3 Apr 2012 23:54:38 +1000
Subject: [rust-dev] read file line by line
In-Reply-To:
References:
Message-ID:

Thank you. How do I check whether the last line is not empty?
Because line.split_char('\t') would not make sense to run on an empty line.

In python I did it in the following way:

with open(args.output, 'r') as infile:
    for line in infile:
        try:
            parts = [part.strip() for part in line.split('\t')]
        except IndexError:
            continue

From bilal at bilalhusain.com Tue Apr 3 09:21:23 2012
From: bilal at bilalhusain.com (Mohd.
 Bilal Husain)
Date: Tue, 3 Apr 2012 21:51:23 +0530
Subject: [rust-dev] read file line by line
In-Reply-To:
References:
Message-ID:

Although I have doubts about IndexError in the python code, you can
possibly check for an empty line by testing the string length.

You can use str::len to get the string length

str::len(line) == 0u

split using str::split_char

let parts = str::split_char(line, '\t');

and iterate on parts

for part in parts {
    /* ... */
}

Use str::trim for trimming unicode space characters and the cont keyword*
to continue the loop.

Also, in case you are benchmarking Rust vs Python code for text
processing, can you post your results and if you liked writing Rust code :)

* http://doc.rust-lang.org/doc/tutorial.html#loops

From bilal at bilalhusain.com Tue Apr 3 09:59:47 2012
From: bilal at bilalhusain.com (Mohd. Bilal Husain)
Date: Tue, 3 Apr 2012 22:29:47 +0530
Subject: [rust-dev] enscripten demo?
Message-ID:

After I couldn't build on SunOS (as host) due to lack of swap space (one
of the hindrances), knowledge and control of machine, in general, I was
wondering that with all llvm bindings, it should be straightforward to run
emscripten on Rust's IR generated for LLVM. And indeed there seem to be
working pieces.* What I want to know is whether there's an online demo or
compiled javascript which I can play with?

Thanks.

* I read in the emscripten paper that there's a frontend for Rust.

From banderson at mozilla.com Tue Apr 3 10:19:42 2012
From: banderson at mozilla.com (Brian Anderson)
Date: Tue, 03 Apr 2012 10:19:42 -0700
Subject: [rust-dev] enscripten demo?
In-Reply-To:
References:
Message-ID: <4F7B312E.5070402@mozilla.com>

On 04/03/2012 09:59 AM, Mohd.
Bilal Husain wrote:
> After I couldn't build on SunOS (as host) due to lack of swap space
> (one of the hindrance), knowledge and control of machine, in general,
> I was wondering that with all llvm bindings, it should be
> straightforward to run emscripten on Rust's IR generated for LLVM. And
> indeed there seems to be working pieces.* What I want to know is
> whether there's an online demo or compiled javascript which I can play
> with?

Nobody has done this yet, but it gets talked about frequently and would be
awesome. Getting something working minimally would not be too difficult.
You would probably start by running rust bitcode through emscripten, then
rewriting missing runtime functions in javascript. I would love patches for
this.

From banderson at mozilla.com Tue Apr 3 10:44:26 2012
From: banderson at mozilla.com (Brian Anderson)
Date: Tue, 03 Apr 2012 10:44:26 -0700
Subject: [rust-dev] hashmap benchmark
In-Reply-To:
References:
Message-ID: <4F7B36FA.1000507@mozilla.com>

On 04/03/2012 01:18 AM, Mic wrote:
> Hello,
> here are some benchmarks for hashmaps
> http://lh3lh3.users.sourceforge.net/udb.shtml .
>
> How is Rust's hashmap memory and speed efficient compare to the above
> link?

It would be great if somebody were to write some Rust hashing benchmarks,
contribute them under `src/test/bench` and share the results.

From ehsan.akhgari at gmail.com Tue Apr 3 10:48:03 2012
From: ehsan.akhgari at gmail.com (Ehsan Akhgari)
Date: Tue, 3 Apr 2012 13:48:03 -0400
Subject: [rust-dev] enscripten demo?
In-Reply-To: <4F7B312E.5070402@mozilla.com>
References: <4F7B312E.5070402@mozilla.com>
Message-ID:

I would expect most of the work here would be implementing the rust
runtime library functions.

From mbrubeck at mozilla.com Tue Apr 3 11:11:35 2012
From: mbrubeck at mozilla.com (mbrubeck)
Date: Tue, 03 Apr 2012 11:11:35 -0700
Subject: [rust-dev] enscripten demo?
In-Reply-To:
References:
Message-ID: <4F7B3D57.3040004@mozilla.com>

On 04/03/2012 09:59 AM, Mohd. Bilal Husain wrote:
> I was wondering that with all llvm bindings, it should be
> straightforward to run emscripten on Rust's IR generated for LLVM. And
> indeed there seems to be working pieces.* What I want to know is
> whether there's an online demo or compiled javascript which I can play
> with?

Here's what I got several months ago when I ran a simple Rust file through
Emscripten, just out of curiosity. It doesn't run because of missing
functions like isPointerType:

http://limpet.net/mbrubeck/temp/emscripted.js

(Obviously I never tried running the original Rust code either, since I
later noticed it has an infinite loop bug.)

From niko at alum.mit.edu Tue Apr 3 11:45:12 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Tue, 03 Apr 2012 11:45:12 -0700
Subject: [rust-dev] hashmap benchmark
In-Reply-To:
References:
Message-ID: <4F7B4538.40809@alum.mit.edu>

On 4/3/12 1:18 AM, Mic wrote:
> How is Rust's hashmap memory and speed efficient compare to the above
> link?

We haven't measured it, but the algorithm being used is not sophisticated
and I'm sure that there is much room for improvement.

Niko

From mictadlo at gmail.com Wed Apr 4 02:47:06 2012
From: mictadlo at gmail.com (Mic)
Date: Wed, 4 Apr 2012 19:47:06 +1000
Subject: [rust-dev] read file line by line
In-Reply-To:
References:
Message-ID:

Thank you. I can make a benchmark comparing it to Python.

How do I use str::trim on each element in the parts array? In python I did
it with 'strip' in the following way:

parts = [part.strip() for part in line.split('\t')]

Thank you in advance.

From bilal at bilalhusain.com Wed Apr 4 02:55:27 2012
From: bilal at bilalhusain.com (Mohd. Bilal Husain)
Date: Wed, 4 Apr 2012 15:25:27 +0530
Subject: [rust-dev] enscripten demo?
In-Reply-To: <4F7B3D57.3040004@mozilla.com>
References: <4F7B3D57.3040004@mozilla.com>
Message-ID:

Passed a dumb sample rust bitcode to emscripten, got js functions#.

Realized I need to run on core modules too for printing simple hello
world. Took io from libcore, decimated code to avoid few build errors,
emcc throws error

Unclear type in struct

Anyways, need to figure out how to build native modules and core lib, std
lib; and how to map these modules to imports in a sample hello-world.

Can use some help and pointers about strategy.

Thanks.

# http://bilalhusain.com/rust/CANinyTjctG+-Ka6hk-Jf1kdFnq3y_-DoOD+Gv+K9VmirqMNPyA/

From mictadlo at gmail.com Wed Apr 4 06:46:40 2012
From: mictadlo at gmail.com (Mic)
Date: Wed, 4 Apr 2012 23:46:40 +1000
Subject: [rust-dev] vim with Rust
Message-ID:

Hello,
I did following steps:
1. mkdir ~/.vim
2. cp ~/Downloads/rust-0.2/src/etc/vim ~/.vim

Where have I to copy ctags.rust?
For what are these files?: ~/Downloads/rust-0.2/src/etc $ ls apple-darwin.supp combine-tests.py extract_grammar.py gyp-uv make-snapshot.py snapshot.py vim check-links.pl ctags.rust extract-tests.py latest-unix-snaps.py mirror-all-snapshots.py tidy.py x86.supp cmathconsts.c emacs get-snapshot.py libc.c pkg unicode.py Is possible to get with Vim Rust this http://blog.dispatched.ch/wp-content/uploads/2009/05/omnicompletion.png ? Thank you in advance. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric.holk at gmail.com Wed Apr 4 08:09:49 2012 From: eric.holk at gmail.com (Eric Holk) Date: Wed, 4 Apr 2012 11:09:49 -0400 Subject: [rust-dev] read file line by line In-Reply-To: References: Message-ID: In Rust, you can do something like this instead: let parts = vec::map([" a", "b ", " c ", "d"]) {|s| str::trim(s) }; Obviously, you'd want to replace the vector literal in map with the vector you actually want to trim everything in. I think there are cleaner ways to do this using the new extension methods, but this way compiled on my machine. -Eric On Wed, Apr 4, 2012 at 5:47 AM, Mic wrote: > Thank you. I can make a benchmark compare to python. > > How to use str::trim on each element in parts array? In python I did it > with 'strip' in the following way: > parts = [part.strip() for part in line.split('\t')] > > Thank you in advance. > > On Wed, Apr 4, 2012 at 2:21 AM, Mohd. Bilal Husain wrote: > >> Although I have doubts about IndexError in the python code, you can >> possibly check empty line by testing the string length. >> >> You can use str::len to get the string length >> >> str::len(line) == 0u >> >> split using str::split_char >> >> let parts = str::split_char(line, '\t'); >> >> and iterate on parts >> >> for part in parts { >> /* ... */ >> } >> >> Use str::trim for trimming unicode space characters and cont keyword* to >> continue the loop. 
>> >> Also, in case you are benchmarking Rust vs Python code for text >> processing, can you post your results and if you liked writing Rust code :) >> >> * http://doc.rust-lang.org/doc/tutorial.html#loops >> >> >> On 3 April 2012 19:24, Mic wrote: >> >>> Thank you. How to check whether the last line is not empty? >>> Because line.split_char('\t') would not make sense to run on an empty line. >>> >>> In python I did it in the following way: >>> with open(args.output, 'r') as outfile: >>> for line in infile: >>> try: >>> parts = [part.strip() for part in line.split('\t')] >>> except IndexError: >>> continue >>> >>> On Tue, Apr 3, 2012 at 8:36 PM, Mohd. Bilal Husain < >>> bilal at bilalhusain.com> wrote: >>> >>>> As you figured out, the function read_line can be used from the >>>> reader_util implementation from module io >>>> >>>> ~~~~ >>>> >>>> import io::reader_util; >>>> >>>> #[doc = "reads the entire file line by line except the first line"] >>>> fn main(args: [str]) { >>>> if args.len() == 1u { >>>> fail #fmt("usage: %s ", args[0]); >>>> } >>>> >>>> let r = io::file_reader(args[1]); // r is result >>>> if result::failure(r) { >>>> fail result::get_err(r); >>>> } >>>> >>>> let rdr = result::get(r); >>>> rdr.read_line(); // skip line >>>> while !rdr.eof() { >>>> io::println(rdr.read_line()); >>>> } >>>> } >>>> ~~~~ >>>> >>>> I don't think Rust lets you catch exceptions while reading the stream >>>> as you can't do much about it*. >>>> >>>> * Error handling in Rust is unrecoverable unwinding >>>> >>>> On 3 April 2012 13:34, Mic wrote: >>>> >>>>> Hello, >>>>> I found read_line, but I do not how to convert the following Python >>>>> code (skip first line and print all other lines from a file) to Rust. >>>>> >>>>> f = open(file_name, 'r') >>>>> f.next() #skip line >>>>> for line in f: >>>>> print line >>>>> f.close() >>>>> >>>>> How rust handle exceptions? >>>>> >>>>> Thank you in advance. 
From masklinn at masklinn.net Wed Apr 4 08:17:43 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 4 Apr 2012 17:17:43 +0200
Subject: [rust-dev] read file line by line

On 4 Apr. 2012, at 17:09, Eric Holk wrote:

> In Rust, you can do something like this instead:
>
> let parts = vec::map([" a", "b ", " c ", "d"]) {|s|
> str::trim(s)
> };

Isn't it possible to pass str::trim directly to vec::map? Is the indirection through the block really needed?

From marijnh at gmail.com Wed Apr 4 12:16:07 2012
From: marijnh at gmail.com (Marijn Haverbeke)
Date: Wed, 4 Apr 2012 21:16:07 +0200
Subject: [rust-dev] read file line by line

> Isn't it possible to pass str::trim directly to vec::map? Is the indirection through the block really needed?

Currently it is, due to argument modes (map takes a function that expects its argument by-reference, str::trim takes it by value). We're hopeful that we've found a way to get rid of this restriction, once the region work is more complete, but it might take a while before all that is implemented.

From banderson at mozilla.com Wed Apr 4 13:39:05 2012
From: banderson at mozilla.com (Brian Anderson)
Date: Wed, 04 Apr 2012 13:39:05 -0700
Subject: [rust-dev] read file line by line
Message-ID: <4F7CB169.8020409@mozilla.com>

On 04/04/2012 08:17 AM, Masklinn wrote:
> On 4 Apr.
2012, at 17:09, Eric Holk wrote:
>> In Rust, you can do something like this instead:
>>
>> let parts = vec::map([" a", "b ", " c ", "d"]) {|s|
>> str::trim(s)
>> };
> Isn't it possible to pass str::trim directly to vec::map? Is the indirection through the block really needed?

In this case I believe the block isn't necessary, but in many situations it is, so I've gotten used to just always using it (sadly). The reason is that generic functions always take arguments by reference while functions on scalars take their arguments by value, so composing them isn't possible without an adapter between them. Strings are passed by reference, though, so this example should work without the extra block.

From eric.holk at gmail.com Thu Apr 5 09:58:08 2012
From: eric.holk at gmail.com (Eric Holk)
Date: Thu, 5 Apr 2012 12:58:08 -0400
Subject: [rust-dev] read file line by line
In-Reply-To: <4F7CB169.8020409@mozilla.com>
References: <4F7CB169.8020409@mozilla.com>

On Wed, Apr 4, 2012 at 4:39 PM, Brian Anderson wrote:

> On 04/04/2012 08:17 AM, Masklinn wrote:
>
>> On 4 Apr. 2012, at 17:09, Eric Holk wrote:
>>
>>> In Rust, you can do something like this instead:
>>>
>>> let parts = vec::map([" a", "b ", " c ", "d"]) {|s|
>>> str::trim(s)
>>> };
>>>
>> Isn't it possible to pass str::trim directly to vec::map? Is the
>> indirection through the block really needed?
>>
>
> In this case I believe the block isn't necessary, but in many situations
> it is so I've gotten used to just always using it (sadly). The reason is
> because generic functions always take arguments by reference while
> functions on scalars take their arguments by value, so composing them isn't
> possible without an adapter between them. Strings are passed by reference
> though so this example should work without the extra block.
>

I originally wrote this without the extra block, and the compiler complained.
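
For comparison, here is a minimal sketch of the same question (pass str::trim directly, or wrap it in a block?) in present-day Rust, which differs substantially from the 0.2 dialect used in this thread. The function names trim_all_direct and trim_all_closure are made up for illustration. As far as I can tell, a similar by-reference mismatch still shows up: iterating a slice of strings yields references to the elements, so the sketch copies them out before handing str::trim to map.

```rust
/// Trim every element by passing the function path straight to `map`.
/// `iter()` yields `&&str`, so `copied()` turns each item into a plain `&str`
/// that matches the argument type `str::trim` expects.
fn trim_all_direct<'a>(parts: &[&'a str]) -> Vec<&'a str> {
    parts.iter().copied().map(str::trim).collect()
}

/// Same result, with an explicit closure acting as the adapter.
/// Method-call auto-dereferencing hides the extra level of reference.
fn trim_all_closure<'a>(parts: &[&'a str]) -> Vec<&'a str> {
    parts.iter().map(|s| s.trim()).collect()
}

fn main() {
    let parts = [" a", "b ", " c ", "d"];
    println!("{:?}", trim_all_direct(&parts));
    println!("{:?}", trim_all_closure(&parts));
}
```

Both functions return the same trimmed vector; which form reads better is mostly a matter of taste.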
I was also trying to use the extension methods (`[" a", "b ", " c ", "d"].map(str::trim)`) instead, but the compiler was having trouble finding them. I'm pretty sure I was using the rustc I had just pulled from GitHub.

-Eric

From mictadlo at gmail.com Fri Apr 6 02:21:16 2012
From: mictadlo at gmail.com (Mic)
Date: Fri, 6 Apr 2012 19:21:16 +1000
Subject: [rust-dev] spawn on top of OpenCL

Hi,
In the next 3 years some GPUs will have 20 000 cores.

Looking at this http://jogamp.org/jocl/www/ benchmark, there is now a huge difference between GPU and CPU code. In Aparapi http://blogs.amd.com/developer/2011/09/14/i-dont-always-write-gpu-code-in-java-but-when-i-do-i-like-to-use-aparapi/ you write everything in Java.

Are there any plans to build spawn on top of OpenCL ( http://en.wikipedia.org/wiki/OpenCL ) so that the user can choose, when running the application, whether it should run on the CPU, the GPU or Hadoop?

Cheers,

On Tue, Apr 3, 2012 at 6:43 PM, Mic wrote:

> Hello,
> Any plans to make spawn on top of OpenCL (
> http://en.wikipedia.org/wiki/OpenCL ) similar to:
> http://code.google.com/p/copperhead/
> http://deeplearning.net/software/theano/
>

From dteller at mozilla.com Fri Apr 6 02:40:34 2012
From: dteller at mozilla.com (David Rajchenbach-Teller)
Date: Fri, 06 Apr 2012 11:40:34 +0200
Subject: [rust-dev] spawn on top of OpenCL
Message-ID: <4F7EBA12.9000207@mozilla.com>

Quick note: writing GPU code and writing CPU code are extremely different. Things may have changed, but last time I checked, GPUs were very antagonistic to non-static memory management, which basically meant no stack and no memory allocation.

For these reasons, I doubt that |spawn| can even be implemented on top of OpenCL.
On the other hand, it is certainly possible to write a nice library of combinators for on-GPU computations. I suspect that Rust-style typeclasses can even be used to define generic algorithms that can be implemented both on-GPU and on-CPU. Cheers, David On 4/6/12 11:21 AM, Mic wrote: > Hi, > In the next 3 years some GPUs will have 20 000 cores. > > Looking at this http://jogamp.org/jocl/www/ benchmark it is now a huge > difference between GPU and CPU code. In Aparapi > http://blogs.amd.com/developer/2011/09/14/i-dont-always-write-gpu-code-in-java-but-when-i-do-i-like-to-use-aparapi/ you > write everything in Java. > > Any plans to make spawn on top of OpenCL > ( http://en.wikipedia.org/wiki/OpenCL ) so the user can choose by > running the application whether it would like to do it on CPU, GPU > or Hadoop? > > Cheers, > > On Tue, Apr 3, 2012 at 6:43 PM, Mic > wrote: > > Hello, > Any plans to make spawn on top of OpenCL > ( http://en.wikipedia.org/wiki/OpenCL ) similar to: > http://code.google.com/p/copperhead/ > http://deeplearning.net/software/theano/ > > > > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev -- David Rajchenbach-Teller, PhD Performance Team, Mozilla -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 487 bytes Desc: OpenPGP digital signature URL: From kobi2187 at gmail.com Fri Apr 6 06:13:15 2012 From: kobi2187 at gmail.com (Kobi Lurie) Date: Fri, 06 Apr 2012 16:13:15 +0300 Subject: [rust-dev] hello Message-ID: <4F7EEBEB.1050409@gmail.com> hello list, newbie here, just read the tutorial. I am interested in rust, and would like to ask general questions, or suggest features or simplifications. Is this the right place? first question: I am an early adopter type, I don't mind api breaking under my feet. 
I'd like to know if at this stage the language is suitable enough for writing small libraries or apps.
Are the general semantics expected to change dramatically?

Thanks, Kobi

From mictadlo at gmail.com Fri Apr 6 06:53:48 2012
From: mictadlo at gmail.com (Mic)
Date: Fri, 6 Apr 2012 23:53:48 +1000
Subject: [rust-dev] executing error

Hello,
I get "Permission denied" when I execute the code below:

fn main() {
    for i in [1, 2, 3] {
        io::println(#fmt("hello %d\n", i));
    }
}

$ rustc hello.rs
$ ./hello
bash: ./hello: Permission denied
$ ls -ahl
total 44K
drwx------  2 mictadlo mictadlo 4.0K Apr  6 23:49 .
drwx------ 27 mictadlo mictadlo 8.0K Apr  6 23:43 ..
-rw-r--r--  1 mictadlo mictadlo  14K Apr  6 23:49 hello

What did I do wrong?

Thank you in advance.

Cheers,

From rick.richardson at gmail.com Fri Apr 6 08:06:41 2012
From: rick.richardson at gmail.com (Rick Richardson)
Date: Fri, 6 Apr 2012 11:06:41 -0400
Subject: [rust-dev] hello
In-Reply-To: <4F7EEBEB.1050409@gmail.com>
References: <4F7EEBEB.1050409@gmail.com>

Welcome! I too am relatively new and have been enjoying the hell out of Rust.

Rust is definitely suitable for writing small libraries or apps. Rust is definitely going to change its API, and, to a lesser extent, its language semantics. The parts that are still in flux are some finer points relating to pointers and memory management. For many apps you'd never notice, but in my opinion there are some non-intuitive quirks. These are being ironed out as the region-based memory management is implemented.

In all honesty, this is the best experience I've ever had with a 0.2 language. The toolchain is quite solid and everything works as expected. The team is also very responsive to issues filed in the GitHub tracker.

If you haven't already, check out cargo central. It is a repository of user-contributed libraries and anyone can contribute to it.
So if you see something missing from Rust's core or stdlib, that would be a great way to contribute.

On Fri, Apr 6, 2012 at 9:13 AM, Kobi Lurie wrote:
> hello list, newbie here, just read the tutorial.
> I am interested in rust, and would like to ask general questions, or suggest
> features or simplifications.
>
> Is this the right place?
>
> first question: I am an early adopter type, I don't mind api breaking under
> my feet.
> I'd like to know if at this stage the language is suitable enough for
> writing small libraries or apps.
> Are the general semantics expected to change dramatically?
>
> Thanks, Kobi

From rick.richardson at gmail.com Fri Apr 6 08:08:29 2012
From: rick.richardson at gmail.com (Rick Richardson)
Date: Fri, 6 Apr 2012 11:08:29 -0400
Subject: [rust-dev] executing error

Your code works perfectly for me (except there are two newlines between each print (the \n is redundant))

What is your platform and rustc version?

On Fri, Apr 6, 2012 at 9:53 AM, Mic wrote:
> Hello,
> I am getting by executing the bellow code with Permission denied
>
> fn main() {
>         for i in [1, 2, 3] {
>                 io::println(#fmt("hello %d\n", i));
>         }
> }
>
> $ rustc hello.rs
> $ ./hello
> bash: ./hello: Permission denied
> $ ls -ahl
> total 44K
> drwx------  2 mictadlo mictadlo 4.0K Apr  6 23:49 .
> drwx------ 27 mictadlo mictadlo 8.0K Apr  6 23:43 ..
> -rw-r--r--  1 mictadlo mictadlo  14K Apr  6 23:49 hello
>
> What did I do wrong?
>
> Thank you in advance.
>
> Cheers,

From rick.richardson at gmail.com Fri Apr 6 08:16:23 2012
From: rick.richardson at gmail.com (Rick Richardson)
Date: Fri, 6 Apr 2012 11:16:23 -0400
Subject: [rust-dev] executing error

Also, out of curiosity, what is your umask? And if you 'chmod +x hello', does it work then?

On Fri, Apr 6, 2012 at 11:08 AM, Rick Richardson wrote:
> Your code works perfectly for me (except there are two newlines
> between each print (the \n is redundant))
>
> What is your platform and rustc version?
>
> On Fri, Apr 6, 2012 at 9:53 AM, Mic wrote:
>> Hello,
>> I am getting by executing the bellow code with Permission denied
>>
>> fn main() {
>>         for i in [1, 2, 3] {
>>                 io::println(#fmt("hello %d\n", i));
>>         }
>> }
>>
>> $ rustc hello.rs
>> $ ./hello
>> bash: ./hello: Permission denied
>> $ ls -ahl
>> total 44K
>> drwx------  2 mictadlo mictadlo 4.0K Apr  6 23:49 .
>> drwx------ 27 mictadlo mictadlo 8.0K Apr  6 23:43 ..
>> -rw-r--r--  1 mictadlo mictadlo  14K Apr  6 23:49 hello
>>
>> What did I do wrong?
>>
>> Thank you in advance.
>>
>> Cheers,

From eric.holk at gmail.com Fri Apr 6 08:49:15 2012
From: eric.holk at gmail.com (Eric Holk)
Date: Fri, 6 Apr 2012 11:49:15 -0400
Subject: [rust-dev] spawn on top of OpenCL
In-Reply-To: <4F7EBA12.9000207@mozilla.com>
References: <4F7EBA12.9000207@mozilla.com>

Newer versions of CUDA do allow memory allocation on the GPU. Even if you're not using CUDA, it should be possible to implement your own on-GPU memory allocation. Doing recursion on the GPU isn't supported yet as far as I know, but I suspect if you're clever there's a way to do it.
At any rate, GPUs become significantly more powerful with each generation, so many of these restrictions will probably be relaxed in a few years.

I don't think Rust's spawn is a great fit for GPU programming though, since the GPU style is to spawn thousands of identical threads. Rust's spawn so far seems to be used for a smaller number of tasks, each responsible for different things. It's hard to be sure at this point though, since there is not a lot of parallel Rust code in existence.

I think Rust will definitely grow a GPU computing story someday. I'm not really sure what the best approach is. Rust's interface system seems to be approaching the power needed to do something like what Accelerate does for Haskell. There may be some benefit in making GPU computing a more baked-in part of the language, although this is probably somewhat contrary to Rust's overall design.

-Eric

On Fri, Apr 6, 2012 at 5:40 AM, David Rajchenbach-Teller <dteller at mozilla.com> wrote:

> Quick note: writing GPU code and CPU code is extremely different. Things
> may have changed, but last time I checked, GPUs were very antagonistic
> to non-static memory management, which basically meant no stack and no
> memory allocation.
>
> For these reasons, I doubt that |spawn| can even be implemented on top
> of OpenCL. On the other hand, it is certainly possible to write a nice
> library of combinators for on-GPU computations. I suspect that
> Rust-style typeclasses can even be used to define generic algorithms
> that can be implemented both on-GPU and on-CPU.
>
> Cheers,
> David
>
> On 4/6/12 11:21 AM, Mic wrote:
> > Hi,
> > In the next 3 years some GPUs will have 20 000 cores.
> >
> > Looking at this http://jogamp.org/jocl/www/ benchmark it is now a huge
> > difference between GPU and CPU code. In Aparapi
> > http://blogs.amd.com/developer/2011/09/14/i-dont-always-write-gpu-code-in-java-but-when-i-do-i-like-to-use-aparapi/ you
> > write everything in Java.
> > > > Any plans to make spawn on top of OpenCL > > ( http://en.wikipedia.org/wiki/OpenCL ) so the user can choose by > > running the application whether it would like to do it on CPU, GPU > > or Hadoop? > > > > Cheers, > > > > On Tue, Apr 3, 2012 at 6:43 PM, Mic > > wrote: > > > > Hello, > > Any plans to make spawn on top of OpenCL > > ( http://en.wikipedia.org/wiki/OpenCL ) similar to: > > http://code.google.com/p/copperhead/ > > http://deeplearning.net/software/theano/ > > > > > > > > > > > > _______________________________________________ > > Rust-dev mailing list > > Rust-dev at mozilla.org > > https://mail.mozilla.org/listinfo/rust-dev > > > -- > David Rajchenbach-Teller, PhD > Performance Team, Mozilla > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From graydon at mozilla.com Fri Apr 6 13:04:13 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Fri, 06 Apr 2012 13:04:13 -0700 Subject: [rust-dev] hello In-Reply-To: <4F7EEBEB.1050409@gmail.com> References: <4F7EEBEB.1050409@gmail.com> Message-ID: <4F7F4C3D.1040109@mozilla.com> On 06/04/2012 6:13 AM, Kobi Lurie wrote: > hello list, newbie here, just read the tutorial. Welcome! Thanks for your interest. > I am interested in rust, and would like to ask general questions, or > suggest features or simplifications. > > Is this the right place? Best we have so far. As the community grows, we may set up a separate -users list but for now most of the language users and developers are the same people :) > first question: I am an early adopter type, I don't mind api breaking > under my feet. > I'd like to know if at this stage the language is suitable enough for > writing small libraries or apps. > Are the general semantics expected to change dramatically? It depends on how you view "dramatic". 
We certainly aren't making source-level compatibility promises yet. I don't expect we'll be at a place to make those sorts of promises for ... probably the rest of this year. We have a modest number of changes queued up for the course of this year, some of which we don't know the exact outcome of yet. The main changes-in-progress (proposed or under debate) are currently open as "RFC" bugs in our tracker. See:

https://github.com/mozilla/rust/issues?labels=B-RFC

Many of these are sort of "corner case" work: changing an awkwardness we find in the existing syntax and semantics without changing anything deep about them. A few are additions (classes, slices) which are not going to break old code, just enable interesting kinds of new code.

A few queued changes are somewhat deep: the changes to import/export control, the attribute system, the "extern" C FFI, region memory management, vectors, unwinding, gc and error signalling, for example. But in all these cases we're hoping to, again, refactor the semantics and "file down" rough parts we discovered in the existing language while working with it, not fundamentally alter the "character" of it. Of course this is subjective. But if you like the current semantics, my hope is that we're just going to be making them more robust and useful, not springing any surprises on you.

-Graydon

From mictadlo at gmail.com Fri Apr 6 14:00:58 2012
From: mictadlo at gmail.com (Mic)
Date: Sat, 7 Apr 2012 07:00:58 +1000
Subject: [rust-dev] executing error

Hello,
Thank you, but I still get "Permission denied".

$ umask
0022
$ chmod +x hello
$ ./hello
bash: ./hello: Permission denied
$ ls -ahl
total 32K
drwx------  2 mictadlo mictadlo 4.0K Apr  6 23:54 .
drwx------ 27 mictadlo mictadlo 8.0K Jan  1  1970 ..
-rw-r--r--  1 mictadlo mictadlo  14K Apr  6 23:49 hello
-rw-r--r--  1 mictadlo mictadlo   77 Apr  6 23:46 hello.rs

$ rustc -v
rustc 0.2
host: x86_64-unknown-linux-gnu

I am using Sabayon Linux 8, 64-bit.

On Sat, Apr 7, 2012 at 1:16 AM, Rick Richardson wrote:

> Also, out of curiosity, what is your umask? and if you 'chmod +x
> hello' does it work then?
>
>
> On Fri, Apr 6, 2012 at 11:08 AM, Rick Richardson
> wrote:
> > Your code works perfectly for me (except there are two newlines
> > between each print (the \n is redundant))
> >
> > What is your platform and rustc version?
> >
> > On Fri, Apr 6, 2012 at 9:53 AM, Mic wrote:
> >> Hello,
> >> I am getting by executing the bellow code with Permission denied
> >>
> >> fn main() {
> >> for i in [1, 2, 3] {
> >> io::println(#fmt("hello %d\n", i));
> >> }
> >> }
> >>
> >> $ rustc hello.rs
> >> $ ./hello
> >> bash: ./hello: Permission denied
> >> $ ls -ahl
> >> total 44K
> >> drwx------ 2 mictadlo mictadlo 4.0K Apr 6 23:49 .
> >> drwx------ 27 mictadlo mictadlo 8.0K Apr 6 23:43 ..
> >> -rw-r--r-- 1 mictadlo mictadlo 14K Apr 6 23:49 hello
> >>
> >> What did I do wrong?
> >>
> >> Thank you in advance.
> >>
> >> Cheers,
> >>

From mbrubeck at mozilla.com Fri Apr 6 14:08:28 2012
From: mbrubeck at mozilla.com (mbrubeck)
Date: Fri, 06 Apr 2012 14:08:28 -0700
Subject: [rust-dev] executing error
Message-ID: <4F7F5B4C.4050806@mozilla.com>

On 04/06/2012 02:00 PM, Mic wrote:
> $ umask
> 0022
> $ chmod +x hello
> $ ./hello
> bash: ./hello: Permission denied
> $ ls -ahl
> total 32K
> drwx------ 2 mictadlo mictadlo 4.0K Apr 6 23:54 .
> drwx------ 27 mictadlo mictadlo 8.0K Jan 1 1970 ..
> -rw-r--r-- 1 mictadlo mictadlo 14K Apr 6 23:49 hello
> -rw-r--r-- 1 mictadlo mictadlo 77 Apr 6 23:46 hello.rs
>

For some reason the execute bit is still not set on the "hello" binary. Is it possible you are working within a filesystem that is mounted with the "noexec" option?
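
A rough sketch of what looking for that option amounts to, written in present-day Rust rather than the 0.2 dialect of this thread. The helper name line_has_noexec and the sample mount line are made up for illustration, and the parsing assumes the common Linux output format "device on /path type fs (opt1,opt2,...)"; other systems may print mounts differently.

```rust
/// Returns true when one line of `mount` output lists `noexec` among the
/// comma-separated options inside the trailing parentheses.
/// Assumed (not guaranteed) format: "dev on /path type fs (opt1,opt2,...)".
fn line_has_noexec(line: &str) -> bool {
    match (line.rfind('('), line.rfind(')')) {
        (Some(open), Some(close)) if open < close => line[open + 1..close]
            .split(',')
            .any(|opt| opt.trim() == "noexec"),
        // No parenthesised option list on this line.
        _ => false,
    }
}

fn main() {
    // Hypothetical line for a USB stick; real output will differ.
    let sample = "/dev/sdb1 on /media/usb type vfat (rw,nosuid,nodev,noexec,uid=1000)";
    println!("{}", line_has_noexec(sample));
}
```

Comparing whole option names, rather than searching the line for the substring, avoids false matches on paths or labels that happen to contain "noexec".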
(You can check this by running "mount" and looking for "noexec" in the output.)

From rick.richardson at gmail.com Fri Apr 6 14:08:06 2012
From: rick.richardson at gmail.com (Rick Richardson)
Date: Fri, 6 Apr 2012 17:08:06 -0400
Subject: [rust-dev] executing error

> -rw-r--r--  1 mictadlo mictadlo  14K Apr  6 23:49 hello

Notice, there, that your hello binary is *still* not executable.

I don't think that this has anything to do with rustc at this point.

On Fri, Apr 6, 2012 at 5:00 PM, Mic wrote:
> Hello,
> Thank you, but I get still permission denied.
>
> $ umask
> 0022
> $ chmod +x hello
> $ ./hello
> bash: ./hello: Permission denied
> $ ls -ahl
> total 32K
> drwx------  2 mictadlo mictadlo 4.0K Apr  6 23:54 .
> drwx------ 27 mictadlo mictadlo 8.0K Jan  1  1970 ..
> -rw-r--r--  1 mictadlo mictadlo  14K Apr  6 23:49 hello
> -rw-r--r--  1 mictadlo mictadlo   77 Apr  6 23:46 hello.rs
>
> $ rustc -v
> rustc 0.2
> host: x86_64-unknown-linux-gnu
>
> I am using Sabayon linux 8 64 bit.
>
> On Sat, Apr 7, 2012 at 1:16 AM, Rick Richardson
> wrote:
>>
>> Also, out of curiosity, what is your umask? and if you 'chmod +x
>> hello' does it work then?
>>
>>
>> On Fri, Apr 6, 2012 at 11:08 AM, Rick Richardson
>> wrote:
>> > Your code works perfectly for me (except there are two newlines
>> > between each print (the \n is redundant))
>> >
>> > What is your platform and rustc version?
>> >
>> > On Fri, Apr 6, 2012 at 9:53 AM, Mic wrote:
>> >> Hello,
>> >> I am getting by executing the bellow code with Permission denied
>> >>
>> >> fn main() {
>> >>         for i in [1, 2, 3] {
>> >>                 io::println(#fmt("hello %d\n", i));
>> >>         }
>> >> }
>> >>
>> >> $ rustc hello.rs
>> >> $ ./hello
>> >> bash: ./hello: Permission denied
>> >> $ ls -ahl
>> >> total 44K
>> >> drwx------  2 mictadlo mictadlo 4.0K Apr  6 23:49 .
>> >> drwx------ 27 mictadlo mictadlo 8.0K Apr ?6 23:43 .. >> >> -rw-r--r-- ?1 mictadlo mictadlo ?14K Apr ?6 23:49 hello >> >> >> >> What did I do wrong? >> >> >> >> Thank you in advance. >> >> >> >> Cheers, >> >> >> >> _______________________________________________ >> >> Rust-dev mailing list >> >> Rust-dev at mozilla.org >> >> https://mail.mozilla.org/listinfo/rust-dev >> >> > > From mictadlo at gmail.com Fri Apr 6 16:38:03 2012 From: mictadlo at gmail.com (Mic) Date: Sat, 7 Apr 2012 09:38:03 +1000 Subject: [rust-dev] executing error In-Reply-To: References: Message-ID: I was compiling it on my usb stick which has fat or fat32. When I copied the code to my HDD it was working. On Sat, Apr 7, 2012 at 7:08 AM, Rick Richardson wrote: > > -rw-r--r-- 1 mictadlo mictadlo 14K Apr 6 23:49 hello > > Notice, there, that your hello binary is *still* not executable. > > I don't think that this has anything to do with rustc at this point. > > > > On Fri, Apr 6, 2012 at 5:00 PM, Mic wrote: > > Hello, > > Thank you, but I get still permission denied. > > > > $ umask > > 0022 > > $ chmod +x hello > > $ ./hello > > bash: ./hello: Permission denied > > $ ls -ahl > > total 32K > > drwx------ 2 mictadlo mictadlo 4.0K Apr 6 23:54 . > > drwx------ 27 mictadlo mictadlo 8.0K Jan 1 1970 .. > > -rw-r--r-- 1 mictadlo mictadlo 14K Apr 6 23:49 hello > > -rw-r--r-- 1 mictadlo mictadlo 77 Apr 6 23:46 hello.rs > > > > $ rustc -v > > rustc 0.2 > > host: x86_64-unknown-linux-gnu > > > > I am using Sabayon linux 8 64 bit. > > > > On Sat, Apr 7, 2012 at 1:16 AM, Rick Richardson < > rick.richardson at gmail.com> > > wrote: > >> > >> Also, out of curiosity, what is your umask? and if you 'chmod +x > >> hello' does it work then? > >> > >> > >> On Fri, Apr 6, 2012 at 11:08 AM, Rick Richardson > >> wrote: > >> > Your code works perfectly for me (except there are two newlines > >> > between each print (the \n is redundant)) > >> > > >> > What is your platform and rustc version? 
> >> > > >> > On Fri, Apr 6, 2012 at 9:53 AM, Mic wrote: > >> >> Hello, > >> >> I am getting by executing the bellow code with Permission denied > >> >> > >> >> fn main() { > >> >> for i in [1, 2, 3] { > >> >> io::println(#fmt("hello %d\n", i)); > >> >> } > >> >> } > >> >> > >> >> $ rustc hello.rs > >> >> $ ./hello > >> >> bash: ./hello: Permission denied > >> >> $ ls -ahl > >> >> total 44K > >> >> drwx------ 2 mictadlo mictadlo 4.0K Apr 6 23:49 . > >> >> drwx------ 27 mictadlo mictadlo 8.0K Apr 6 23:43 .. > >> >> -rw-r--r-- 1 mictadlo mictadlo 14K Apr 6 23:49 hello > >> >> > >> >> What did I do wrong? > >> >> > >> >> Thank you in advance. > >> >> > >> >> Cheers, > >> >> > >> >> _______________________________________________ > >> >> Rust-dev mailing list > >> >> Rust-dev at mozilla.org > >> >> https://mail.mozilla.org/listinfo/rust-dev > >> >> > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mictadlo at gmail.com Fri Apr 6 16:38:42 2012 From: mictadlo at gmail.com (Mic) Date: Sat, 7 Apr 2012 09:38:42 +1000 Subject: [rust-dev] executing error In-Reply-To: References: Message-ID: Thank you. I was compiling it on my usb stick which has fat or fat32. When I copied the code to my HDD it was working. On Sat, Apr 7, 2012 at 9:38 AM, Mic wrote: > I was compiling it on my usb stick which has fat or fat32. When I copied > the code to my HDD it was working. > > > On Sat, Apr 7, 2012 at 7:08 AM, Rick Richardson > wrote: > >> > -rw-r--r-- 1 mictadlo mictadlo 14K Apr 6 23:49 hello >> >> Notice, there, that your hello binary is *still* not executable. >> >> I don't think that this has anything to do with rustc at this point. >> >> >> >> On Fri, Apr 6, 2012 at 5:00 PM, Mic wrote: >> > Hello, >> > Thank you, but I get still permission denied. >> > >> > $ umask >> > 0022 >> > $ chmod +x hello >> > $ ./hello >> > bash: ./hello: Permission denied >> > $ ls -ahl >> > total 32K >> > drwx------ 2 mictadlo mictadlo 4.0K Apr 6 23:54 . 
>> > drwx------ 27 mictadlo mictadlo 8.0K Jan 1 1970 .. >> > -rw-r--r-- 1 mictadlo mictadlo 14K Apr 6 23:49 hello >> > -rw-r--r-- 1 mictadlo mictadlo 77 Apr 6 23:46 hello.rs >> > >> > $ rustc -v >> > rustc 0.2 >> > host: x86_64-unknown-linux-gnu >> > >> > I am using Sabayon linux 8 64 bit. >> > >> > On Sat, Apr 7, 2012 at 1:16 AM, Rick Richardson < >> rick.richardson at gmail.com> >> > wrote: >> >> >> >> Also, out of curiosity, what is your umask? and if you 'chmod +x >> >> hello' does it work then? >> >> >> >> >> >> On Fri, Apr 6, 2012 at 11:08 AM, Rick Richardson >> >> wrote: >> >> > Your code works perfectly for me (except there are two newlines >> >> > between each print (the \n is redundant)) >> >> > >> >> > What is your platform and rustc version? >> >> > >> >> > On Fri, Apr 6, 2012 at 9:53 AM, Mic wrote: >> >> >> Hello, >> >> >> I am getting by executing the bellow code with Permission denied >> >> >> >> >> >> fn main() { >> >> >> for i in [1, 2, 3] { >> >> >> io::println(#fmt("hello %d\n", i)); >> >> >> } >> >> >> } >> >> >> >> >> >> $ rustc hello.rs >> >> >> $ ./hello >> >> >> bash: ./hello: Permission denied >> >> >> $ ls -ahl >> >> >> total 44K >> >> >> drwx------ 2 mictadlo mictadlo 4.0K Apr 6 23:49 . >> >> >> drwx------ 27 mictadlo mictadlo 8.0K Apr 6 23:43 .. >> >> >> -rw-r--r-- 1 mictadlo mictadlo 14K Apr 6 23:49 hello >> >> >> >> >> >> What did I do wrong? >> >> >> >> >> >> Thank you in advance. >> >> >> >> >> >> Cheers, >> >> >> >> >> >> _______________________________________________ >> >> >> Rust-dev mailing list >> >> >> Rust-dev at mozilla.org >> >> >> https://mail.mozilla.org/listinfo/rust-dev >> >> >> >> > >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From mictadlo at gmail.com Fri Apr 6 16:45:12 2012
From: mictadlo at gmail.com (Mic)
Date: Sat, 7 Apr 2012 09:45:12 +1000
Subject: [rust-dev] hello
In-Reply-To: <4F7F4C3D.1040109@mozilla.com>
References: <4F7EEBEB.1050409@gmail.com> <4F7F4C3D.1040109@mozilla.com>

Hi,
I found cargo here https://github.com/mozilla/cargo-central/blob/master/packages.json but it is a JSON file. Why is it in JSON format, and is there a tool for downloading packages, like gem or easy_install?

Thank you in advance.

On Sat, Apr 7, 2012 at 6:04 AM, Graydon Hoare wrote:

> On 06/04/2012 6:13 AM, Kobi Lurie wrote:
>
>> hello list, newbie here, just read the tutorial.
>>
>
> Welcome! Thanks for your interest.
>
>
>> I am interested in rust, and would like to ask general questions, or
>> suggest features or simplifications.
>>
>> Is this the right place?
>>
>
> Best we have so far. As the community grows, we may set up a separate
> -users list but for now most of the language users and developers are the
> same people :)
>
>
>> first question: I am an early adopter type, I don't mind api breaking
>> under my feet.
>> I'd like to know if at this stage the language is suitable enough for
>> writing small libraries or apps.
>> Are the general semantics expected to change dramatically?
>>
>
> It depends on how you view "dramatic". We certainly aren't making
> source-level compatibility promises yet. I don't expect we'll be at a place
> to make those sorts of promises for ... probably the rest of this year. We
> have a modest number of changes queued up for the course of this year, some
> of which we don't know the exact outcome of yet. The main
> changes-in-progress (proposed or under debate) are currently open as "RFC"
> bugs in our tracker. See:
>
> https://github.com/mozilla/rust/issues?labels=B-RFC
>
> Many of these are sort of "corner case" work; changing an awkwardness we
> find in the existing syntax and semantics without changing anything deep
> about them.
A few are additions (classes, slices) which are not going to > break old code, just enable interesting kinds of new code. > > A few queued changes are somewhat deep: the changes to import/export > control, the attribute system, the "extern" C FFI, region memory > management, vectors, unwinding, gc and error signalling, for example. But > in all these cases we're hoping to, again, refactor the semantics and "file > down" rough parts we discovered in the existing language while working with > it, not fundamentally alter the "character" of it. Of course this is > subjective. But if you like the current semantics, my hope is that we're > just going to be making them more robust and useful, not springing any > surprises on you. > > -Graydon > > ______________________________**_________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/**listinfo/rust-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mictadlo at gmail.com Fri Apr 6 16:48:48 2012 From: mictadlo at gmail.com (Mic) Date: Sat, 7 Apr 2012 09:48:48 +1000 Subject: [rust-dev] hello In-Reply-To: References: <4F7EEBEB.1050409@gmail.com> <4F7F4C3D.1040109@mozilla.com> Message-ID: I found it: $ cargo init $ cargo sync $ cargo list On Sat, Apr 7, 2012 at 9:45 AM, Mic wrote: > Hi, > I found cargo here > https://github.com/mozilla/cargo-central/blob/master/packages.json but it > is a json file. Why it is json format and is there a tool how to download > like gem or easy_install? > > Thank you in advance. > > > On Sat, Apr 7, 2012 at 6:04 AM, Graydon Hoare wrote: > >> On 06/04/2012 6:13 AM, Kobi Lurie wrote: >> >>> hello list, newbie here, just read the tutorial. >>> >> >> Welcome! Thanks for your interest. >> >> >> I am interested in rust, and would like to ask general questions, or >>> suggest features or simplifications. >>> >>> Is this the right place? >>> >> >> Best we have so far. 
As the community grows, we may set up a separate >> -users list but for now most of the language users and developers are the >> same people :) >> >> >> first question: I am an early adopter type, I don't mind api breaking >>> under my feet. >>> I'd like to know if at this stage the language is suitable enough for >>> writing small libraries or apps. >>> Are the general semantics expected to change dramatically? >>> >> >> It depends on how you view "dramatic". We certainly aren't making >> source-level compatibility promises yet. I don't expect we'll be at a place >> to make those sorts of promises for ... probably the rest of this year. We >> have a modest number of changes queued up for the course of this year, some >> of which we don't know the exact outcome of yet. The main >> changes-in-progress (proposed or under debate) are currently open as "RFC" >> bugs in our tracker. See: >> >> https://github.com/mozilla/**rust/issues?labels=B-RFC >> >> Many of these are sort of "corner case" work; changing an awkwardness we >> find in the existing syntax and semantics without changing anything deep >> about them. A few are additions (classes, slices) which are not going to >> break old code, just enable interesting kinds of new code. >> >> A few queued changes are somewhat deep: the changes to import/export >> control, the attribute system, the "extern" C FFI, region memory >> management, vectors, unwinding, gc and error signalling, for example. But >> in all these cases we're hoping to, again, refactor the semantics and "file >> down" rough parts we discovered in the existing language while working with >> it, not fundamentally alter the "character" of it. Of course this is >> subjective. But if you like the current semantics, my hope is that we're >> just going to be making them more robust and useful, not springing any >> surprises on you. 
>> >> -Graydon >> >> ______________________________**_________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/**listinfo/rust-dev >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.sylvan at gmail.com Fri Apr 6 18:19:27 2012 From: sebastian.sylvan at gmail.com (Sebastian Sylvan) Date: Fri, 6 Apr 2012 18:19:27 -0700 Subject: [rust-dev] Performance optimization Message-ID: Hi, What do you guys use to profile rust programs? Just manual timers in the code, or do you have any tools to recommend? I saw that brson had updated my old Rust ray tracer to Rust 0.2 so I downloaded his version and started piling on new features. And while it's already many times faster than it used to be (largely from "free" compiler improvements, or "cheap" inline annotations on (most) hot functions, but also some major changes to the core data structures/algorithms), I'm getting to the point where I'm essentially just guessing at what might be the main hotspots and trying different ways of doing it. Having some kind of sampling profiler would be awesome. (I can't resist! Eye candy, as of last night: http://i.imgur.com/77lAr.png . That's 300k triangles, at 512x512 and 3x3 super sampling, with 1 area light and 1-bounce global illumination, Took just under 6 minutes on my Core i7 2600k - single core, since there's no real way to write parallel programs in rust yet). Seb From niko at alum.mit.edu Fri Apr 6 18:44:01 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Fri, 06 Apr 2012 18:44:01 -0700 Subject: [rust-dev] Performance optimization In-Reply-To: References: Message-ID: <4F7F9BE1.2040804@alum.mit.edu> Something like oprofile would work, or Instruments on Mac. We tend to use instruments when profiling the compiler. Niko On 4/6/12 6:19 PM, Sebastian Sylvan wrote: > Hi, > What do you guys use to profile rust programs? Just manual timers in > the code, or do you have any tools to recommend? 
>
> I saw that brson had updated my old Rust ray tracer to Rust 0.2 so I
> downloaded his version and started piling on new features. And while
> it's already many times faster than it used to be (largely from "free"
> compiler improvements, or "cheap" inline annotations on (most) hot
> functions, but also some major changes to the core data
> structures/algorithms), I'm getting to the point where I'm essentially
> just guessing at what might be the main hotspots and trying different
> ways of doing it. Having some kind of sampling profiler would be
> awesome.
>
> (I can't resist! Eye candy, as of last night:
> http://i.imgur.com/77lAr.png . That's 300k triangles, at 512x512 and
> 3x3 super sampling, with 1 area light and 1-bounce global
> illumination, Took just under 6 minutes on my Core i7 2600k - single
> core, since there's no real way to write parallel programs in rust
> yet).
>
> Seb
> _______________________________________________
> Rust-dev mailing list
> Rust-dev at mozilla.org
> https://mail.mozilla.org/listinfo/rust-dev

From mictadlo at gmail.com Fri Apr 6 18:48:53 2012
From: mictadlo at gmail.com (Mic)
Date: Sat, 7 Apr 2012 11:48:53 +1000
Subject: [rust-dev] read file line by line
In-Reply-To:
References: <4F7CB169.8020409@mozilla.com>
Message-ID:

Hi
I have trouble to compile the following code:

import io::reader_util;
import vec::map;

fn main(args: [str]) {

    let r = io::file_reader(args[1]); // r is result
    if result::failure(r) {
        fail result::get_err(r);
    }

    let rdr = result::get(r);

    while !rdr.eof() {
        let line = rdr.read_line();
        io::println(line);
        if str::len(line) != 0u {
            let parts = vec::map(line.split_char(',')) {|s|
                str::trim(s)
            };
        }

    }
}

and got the errors:
$ rustc csv.rs
csv.rs:17:33: 17:48 error: attempted access of field split_char on type str, but no public field or method with that name was found
csv.rs:17             let parts = vec::map(line.split_char(',')) {|s|
                                           ^~~~~~~~~~~~~~~
csv.rs:17:33: 17:53 error: the type of this value must be known in this
context csv.rs:17 let parts = vec::map(line.split_char(',')) {|s| ^~~~~~~~~~~~~~~~~~~~ What did I do wrong? On Fri, Apr 6, 2012 at 2:58 AM, Eric Holk wrote: > On Wed, Apr 4, 2012 at 4:39 PM, Brian Anderson wrote: > >> On 04/04/2012 08:17 AM, Masklinn wrote: >> >>> On 4 avr. 2012, at 17:09, Eric Holk wrote: >>> >>>> In Rust, you can do something like this instead: >>>> >>>> let parts = vec::map([" a", "b ", " c ", "d"]) {|s| >>>> str::trim(s) >>>> }; >>>> >>> Isn't it possible to pass str::trim directly to vec::map? It the >>> indirection through the block really needed? >>> >> >> In this case I believe the block isn't necessary, but in many situations >> it is so I've gotten used to just always using it (sadly). The reason is >> because generic functions always take arguments by reference while >> functions on scalars take their arguments by value, so composing them isn't >> possible without an adapter between them. Strings are passed by reference >> though so this example should work without the extra block. >> >> > I originally wrote this without the extra block, and the compiler > complained. > > I was also trying to use the extension methods (`[" a", "b ", " c ", > "d"].map(str::trim)`) instead, but the compiler was having trouble finding > them. I'm pretty sure I was using the rustc I had just pulled from Github. > > -Eric > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From grahame at angrygoats.net Fri Apr 6 19:08:56 2012 From: grahame at angrygoats.net (Grahame Bowland) Date: Sat, 7 Apr 2012 10:08:56 +0800 Subject: [rust-dev] read file line by line In-Reply-To: References: <4F7CB169.8020409@mozilla.com> Message-ID: Hey You want str::split_char(line), not line.split_char(). As a tip, I'd just do a 'git grep' in a checkout of rust when you hit problems like this. 
It's what I do when I can't figure something out - thanks to the test coverage, you can find example code for pretty much everything in the standard/core libraries. Cheers Grahame On 7 April 2012 09:48, Mic wrote: > Hi > I have trouble to compile the following code: > > import io::reader_util; > import vec::map; > > fn main(args: [str]) { > > let r = io::file_reader(args[1]); // r is result > if result::failure(r) { > fail result::get_err(r); > } > > let rdr = result::get(r); > > while !rdr.eof() { > let line = rdr.read_line(); > io::println(line); > if str::len(line) != 0u { > let parts = vec::map(line.split_char(',')) {|s| > str::trim(s) > }; > } > > } > } > > and got the errors: > $ rustc csv.rs > csv.rs:17:33: 17:48 error: attempted access of field split_char on type > str, but no public field or method with that name was found > csv.rs:17 let parts = vec::map(line.split_char(',')) {|s| > > ^~~~~~~~~~~~~~~ > csv.rs:17:33: 17:53 error: the type of this value must be known in this > context > csv.rs:17 let parts = vec::map(line.split_char(',')) {|s| > > ^~~~~~~~~~~~~~~~~~~~ > > What did I do wrong? > > > On Fri, Apr 6, 2012 at 2:58 AM, Eric Holk wrote: > >> On Wed, Apr 4, 2012 at 4:39 PM, Brian Anderson wrote: >> >>> On 04/04/2012 08:17 AM, Masklinn wrote: >>> >>>> On 4 avr. 2012, at 17:09, Eric Holk wrote: >>>> >>>>> In Rust, you can do something like this instead: >>>>> >>>>> let parts = vec::map([" a", "b ", " c ", "d"]) {|s| >>>>> str::trim(s) >>>>> }; >>>>> >>>> Isn't it possible to pass str::trim directly to vec::map? It the >>>> indirection through the block really needed? >>>> >>> >>> In this case I believe the block isn't necessary, but in many situations >>> it is so I've gotten used to just always using it (sadly). The reason is >>> because generic functions always take arguments by reference while >>> functions on scalars take their arguments by value, so composing them isn't >>> possible without an adapter between them. 
Strings are passed by reference >>> though so this example should work without the extra block. >>> >>> >> I originally wrote this without the extra block, and the compiler >> complained. >> >> I was also trying to use the extension methods (`[" a", "b ", " c ", >> "d"].map(str::trim)`) instead, but the compiler was having trouble finding >> them. I'm pretty sure I was using the rustc I had just pulled from Github. >> >> -Eric >> >> >> _______________________________________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/listinfo/rust-dev >> >> > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From graydon at mozilla.com Fri Apr 6 21:31:38 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Fri, 06 Apr 2012 21:31:38 -0700 Subject: [rust-dev] Performance optimization In-Reply-To: References: Message-ID: <4F7FC32A.3090102@mozilla.com> On 06/04/2012 6:19 PM, Sebastian Sylvan wrote: > What do you guys use to profile rust programs? Just manual timers in > the code, or do you have any tools to recommend? On linux, I use 'perf', it's part of the kernel utilities, pretty widely available, very high quality. Hardware sampling, numerous hardware events, flat and hierarchical modes, differential modes, summary modes, etc. (The graphs on bot.rust-lang.org are generated by parsing the output of perf runs) > Having some kind of sampling profiler would be > awesome. Stock "system" profilers work on most platforms; our symbols can be read by them. Instruments, shark, xperf, oprofile, callgrind, etc. > (I can't resist! Eye candy, as of last night: > http://i.imgur.com/77lAr.png . 
That's 300k triangles, at 512x512 and > 3x3 super sampling, with 1 area light and 1-bounce global > illumination, Took just under 6 minutes on my Core i7 2600k - single > core, since there's no real way to write parallel programs in rust > yet). Do you mean data-parallel or task-parallel? We are (I hope) able to do the latter reasonably well at this point. Not flawlessly, but tolerably. -Graydon From graydon at mozilla.com Fri Apr 6 22:22:08 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Fri, 06 Apr 2012 22:22:08 -0700 Subject: [rust-dev] Performance optimization In-Reply-To: References: <4F7FC32A.3090102@mozilla.com> Message-ID: <4F7FCF00.4020003@mozilla.com> On 06/04/2012 9:52 PM, Sebastian Sylvan wrote: > I assume you mean on *nix? Maybe I'm doing something wrong, but on > windows I don't get any stack symbols. Huh. That's surprising. I'd have thought at least the PE/COFF symbols would show up. Guess it depends on the tool. We're not producing .pdb files or anything, if that's what you mean. > Either, really. The main hindrance is the inability to share data > between "tasks" (or data parallel "kernels"), is there a story for > that now? > In this particular example I have a multi-megabyte kd-tree structure > for the model that I really need to share between all the parallel > tasks. At the moment we have no mechanism for "sharing" data, just moving it. So scatter/gather operations have to be done by "breaking" data structures (say, swapping 'some(node)' with 'none' at various points in a tree, in a structure made up of option) , and moving the broken-off subtrees to sub-tasks for processing. This is not entirely convenient and not always a useful model. 
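
The "break off and move" pattern described above can be sketched roughly as follows. This is written in present-day Rust, not the 2012 dialect under discussion, and uses OS threads in place of tasks; `Node`, `leaf`, `sum` and `parallel_sum` are hypothetical names chosen for illustration only.

```rust
use std::thread;

// Hypothetical tree type; children are `Option`s so a subtree can be detached.
struct Node {
    value: u64,
    left: Option<Box<Node>>,
    right: Option<Box<Node>>,
}

// Convenience constructor for a node without children.
fn leaf(value: u64) -> Node {
    Node { value, left: None, right: None }
}

// Sequential sum over a (possibly absent) subtree.
fn sum(node: &Option<Box<Node>>) -> u64 {
    match node {
        Some(n) => n.value + sum(&n.left) + sum(&n.right),
        None => 0,
    }
}

fn parallel_sum(root: &mut Node) -> u64 {
    // "Break" the tree: swap each child with `None` and take ownership of it.
    let left = root.left.take();
    let right = root.right.take();

    // Scatter: move each detached subtree into its own worker. Nothing is
    // shared; the worker hands the subtree back together with its result.
    let lh = thread::spawn(move || {
        let s = sum(&left);
        (left, s)
    });
    let rh = thread::spawn(move || {
        let s = sum(&right);
        (right, s)
    });

    // Gather: re-attach the subtrees and combine the partial results.
    let (left, ls) = lh.join().unwrap();
    let (right, rs) = rh.join().unwrap();
    root.left = left;
    root.right = right;
    root.value + ls + rs
}
```

The inconvenience mentioned above is visible here: while the workers run, the root is left with holes in it, and the caller has to remember to re-attach every piece.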
We've discussed in the past -- are likely to investigate further, especially now that regions are starting to work -- creating a library mechanism of some sort (perhaps a region associated with the lifetime of a group of tasks) with multi-reader / single-writer semantics, or publish/subscribe semantics, or similar. In programs like that, channels would be used for signalling but not for carrying the bulkier underlying data. But we haven't built anything like that yet, no. -Graydon From graydon at mozilla.com Sat Apr 7 01:09:00 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Sat, 07 Apr 2012 01:09:00 -0700 Subject: [rust-dev] Performance optimization In-Reply-To: References: <4F7FC32A.3090102@mozilla.com> <4F7FCF00.4020003@mozilla.com> Message-ID: <4F7FF61C.502@mozilla.com> On 07/04/2012 12:46 AM, Sebastian Sylvan wrote: > You'd probably want data parallel code to use a cheaper abstraction > than tasks anyway (e.g. no real need to have an individual stack for > each "job" - just a shared set of worker threads across the whole > program that all data parallel work share). > > That said, you may want some abstraction for sharing large, immutable, > "database"-type data between long running concurrent tasks too (where > you can't guarantee that all "jobs" have finished by a specific chunk, > it may be completely dynamic, unlike the data parallel scenario). Yeah. Plausibly either some kind of pool-based abstraction or fork/join (or, yes, CUDA/OpenCL) might sneak in in future versions. We focused on task parallelism at first because, well, because it's my area of interest and I wrote the first cut. MIMD is the most general case, and the CSP model has complementary roles in both concurrency and modularity (tasks being isolated). But various SIMD and MISD flavours are often appropriate abstractions, particularly when a problem is compute bound rather than I/O bound. 
-Graydon From grahame at angrygoats.net Sat Apr 7 04:08:29 2012 From: grahame at angrygoats.net (Grahame Bowland) Date: Sat, 7 Apr 2012 19:08:29 +0800 Subject: [rust-dev] Performance optimization In-Reply-To: <4F7FF61C.502@mozilla.com> References: <4F7FC32A.3090102@mozilla.com> <4F7FCF00.4020003@mozilla.com> <4F7FF61C.502@mozilla.com> Message-ID: On 7 April 2012 16:09, Graydon Hoare wrote: > On 07/04/2012 12:46 AM, Sebastian Sylvan wrote: > > You'd probably want data parallel code to use a cheaper abstraction >> than tasks anyway (e.g. no real need to have an individual stack for >> each "job" - just a shared set of worker threads across the whole >> program that all data parallel work share). >> >> That said, you may want some abstraction for sharing large, immutable, >> "database"-type data between long running concurrent tasks too (where >> you can't guarantee that all "jobs" have finished by a specific chunk, >> it may be completely dynamic, unlike the data parallel scenario). >> > > Yeah. Plausibly either some kind of pool-based abstraction or fork/join > (or, yes, CUDA/OpenCL) might sneak in in future versions. We focused on > task parallelism at first because, well, because it's my area of interest > and I wrote the first cut. MIMD is the most general case, and the CSP model > has complementary roles in both concurrency and modularity (tasks being > isolated). But various SIMD and MISD flavours are often appropriate > abstractions, particularly when a problem is compute bound rather than I/O > bound. For the case of "one big data structure multiple workers want to read from", couldn't we write a module to do this within the language as it stands? The module could take a unique reference (which can't contain anything mutable), then issue (via unsafe code) immutable pointers to the structure on request. 
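
For comparison, a sketch of "one big read-only structure, many workers" in present-day Rust. Note that this swaps in a different technique from the one proposed here: it uses the reference-counted `std::sync::Arc` rather than unsafe code handing out raw immutable pointers, and nothing like it existed in the 2012 language. `parallel_total` and its chunking scheme are assumptions made for illustration.

```rust
use std::sync::Arc;
use std::thread;

// Sum a large, immutable data set using `workers` threads that all read
// from the same allocation. Hypothetical helper, not a library function.
fn parallel_total(data: Arc<Vec<u64>>, workers: usize) -> u64 {
    let workers = workers.max(1);
    // Round up so that every element falls into some chunk.
    let chunk = ((data.len() + workers - 1) / workers).max(1);

    let handles: Vec<_> = (0..workers)
        .map(|i| {
            // Cloning the Arc copies a pointer, not the underlying data.
            let shared = Arc::clone(&data);
            thread::spawn(move || {
                let start = (i * chunk).min(shared.len());
                let end = ((i + 1) * chunk).min(shared.len());
                shared[start..end].iter().sum::<u64>()
            })
        })
        .collect();

    // Collect the per-worker partial sums on the calling thread.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Each worker only ever gets shared, read-only access to the vector, which is roughly the guarantee the proposed module would have had to uphold by hand.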
Obviously it's a broken thing to do (and afaik the language doesn't guarantee that the address won't change unexpectedly, although I don't think it will in the current implementation), but it might be an interesting experiment. Pretty much all the signals processing code I've worked on does not operate on arrays in place; you might have an input data set, go to the fourier domain, do some filtering, come back to time domain, then pass that result to the next processing stage. Being able to split that large input set between workers would be a big win, especially if unique pointers allow the results to be transferred back to a master thread cheaply. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kobi2187 at gmail.com Sat Apr 7 06:36:05 2012 From: kobi2187 at gmail.com (Kobi Lurie) Date: Sat, 07 Apr 2012 16:36:05 +0300 Subject: [rust-dev] idea: access modifiers as part of function signature In-Reply-To: <4F7F4C3D.1040109@mozilla.com> References: <4F7EEBEB.1050409@gmail.com> <4F7F4C3D.1040109@mozilla.com> Message-ID: <4F8042C5.9090608@gmail.com> Thanks for the prompt answers :-) What do you think of explicitly defining what a function can do with a certain argument? I talk about adding keywords such as: ro (readonly), rw (readwrite), wo (writeonly), unused. I think this can help in finding a bug, by shortening the "search space" - the argument was readonly so it wasn't changed here.. it's also less of a mental burden than thinking about pointers/references. C# also has the 'out' modifier for writeonly. this can make things more clear about permission/ownership. (I know that when passing by value, this somewhat achieves readonly, as the value passed will not have effects beyond the function scope.) The more fleshed out suggestion is that the compiler will force writing or removing these modifiers according to the function body. 
I am not sure whether interfaces should include or not include these modifiers on the function signatures. I have a few ideas, and the thought is perhaps a young language would be more open to consider them. see you, Kobi From rick.richardson at gmail.com Sat Apr 7 07:01:02 2012 From: rick.richardson at gmail.com (Rick Richardson) Date: Sat, 7 Apr 2012 10:01:02 -0400 Subject: [rust-dev] idea: access modifiers as part of function signature In-Reply-To: <4F8042C5.9090608@gmail.com> References: <4F7EEBEB.1050409@gmail.com> <4F7F4C3D.1040109@mozilla.com> <4F8042C5.9090608@gmail.com> Message-ID: http://doc.rust-lang.org/doc/tutorial.html#argument-passing This isn't the most detailed treatise on arguments and ptrs in Rust, but could you describe a use-case where ro/rw/out modifiers would be beneficial over what currently exists? On Sat, Apr 7, 2012 at 9:36 AM, Kobi Lurie wrote: > Thanks for the prompt answers :-) > > What do you think of explicitly defining what a function can do with a > certain argument? > I talk about adding keywords such as: ro (readonly), rw (readwrite), wo > (writeonly), unused. > I think this can help in finding a bug, by shortening the "search space" - > the argument was readonly so it wasn't changed here.. > it's also less of a mental burden than thinking about pointers/references. > C# also has the 'out' modifier for writeonly. > > this can make things more clear about permission/ownership. > > (I know that when passing by value, this somewhat achieves readonly, as the > value passed will not have effects beyond the function scope.) > > The more fleshed out suggestion is that the compiler will force writing or > removing these modifiers according to the function body. > I am not sure whether interfaces should include or not include these > modifiers on the function signatures. > > I have a few ideas, and the thought is perhaps a young language would be > more open to consider them. 
> > see you, Kobi > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From pwalton at mozilla.com Sat Apr 7 09:45:14 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Sat, 07 Apr 2012 09:45:14 -0700 Subject: [rust-dev] Performance optimization In-Reply-To: References: <4F7FC32A.3090102@mozilla.com> <4F7FCF00.4020003@mozilla.com> <4F7FF61C.502@mozilla.com> Message-ID: <4F806F1A.3030206@mozilla.com> On 04/07/2012 04:08 AM, Grahame Bowland wrote: > For the case of "one big data structure multiple workers want to read > from", couldn't we write a module to do this within the language as it > stands? The module could take a unique reference (which can't contain > anything mutable), then issue (via unsafe code) immutable pointers to > the structure on request. > > Obviously it's a broken thing to do (and afaik the language doesn't > guarantee that the address won't change unexpectedly, although I don't > think it will in the current implementation), but it might be an > interesting experiment. Relevant here are Niko's thoughts on parallel blocks: http://smallcultfollowing.com/babysteps/blog/2011/12/09/pure-blocks/ I'm very much in favor of experimentation in the unsafe module to find what works in advance of baking stuff into the language. I feel that's the right way to do programming language work -- "pave the cow paths", as Dave Herman likes to say :) Patrick From kobi2187 at gmail.com Sat Apr 7 10:03:49 2012 From: kobi2187 at gmail.com (Kobi Lurie) Date: Sat, 07 Apr 2012 20:03:49 +0300 Subject: [rust-dev] idea: access modifiers as part of function signature In-Reply-To: References: <4F7EEBEB.1050409@gmail.com> <4F7F4C3D.1040109@mozilla.com> <4F8042C5.9090608@gmail.com> Message-ID: <4F807375.80200@gmail.com> Hi Rick, thanks for replying! I think limitations are sometimes as important as flexibility. 
I witnessed situations where there was no allotted time in the schedule for refactoring or changing legacy code that had no design, and the problem built up further. I think sometimes it's better for the compiler to enforce: "you just can't do it". This is one of those specifications.

The programmer gets another level of declaration from this permission/ownership perspective: a way to specify on the outside (signature) what is going to happen, generally, inside the function. Of course, it's a rough idea; it could be improved or consolidated with another feature you had in mind.

In part 7.4, argument passing style, the following example

    fn vec_push(&v: [int], elt: int) { v += [elt]; }

will be written:

    fn vec_push(rw v: [int], ro elt: int) { v += [elt]; }

From the signature itself, you can already deduce that elt would not be changed inside the function, and that the vector will be read and written to.

Note that a decision should be made whether the parts of a record argument are considered individually or as a whole. The same goes for a data structure: if a node inside a tree changes, it is different from the tree being replaced (in this example, a vector object and its cells).

This example of a constructor:

    fn make_person(+name: str, +address: str) -> person { ret {name: name, address: address}; }

will be written:

    fn make_person(ro +name: str, ro +address: str) -> person { ret {name: name, address: address}; }

Pass by value here is significant, so the new person will not have its name changed if the calling side afterwards assigns the str name reference a different value.

If combined with an iface, the impls will have to conform. I'm not sure it's a good idea for an interface; maybe too limiting. I feel that most of the time the distinction of ref/value is about low-level performance.
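
For comparison, a sketch of how the same intent reads in present-day Rust (not the 2012 dialect discussed in this thread), where the access permission is carried by the parameter's type rather than by a separate keyword: `&T` is read-only, `&mut T` is read-write, and a by-value parameter takes ownership. `total` and `Person` are hypothetical names added for illustration.

```rust
// Read-write access: an exclusive `&mut` borrow. `elt` is passed by value.
fn vec_push(v: &mut Vec<i32>, elt: i32) {
    v.push(elt);
}

// Read-only access: a shared `&` borrow. The body cannot modify `v`.
fn total(v: &[i32]) -> i32 {
    v.iter().sum()
}

struct Person {
    name: String,
    address: String,
}

// By-value parameters: the function takes ownership of both strings, so
// nothing the caller does afterwards can change the new person's fields.
fn make_person(name: String, address: String) -> Person {
    Person { name, address }
}
```

As in the proposal, the signature alone tells the caller what may happen to each argument; trying to write through the `&[i32]` parameter in `total` is rejected at compile time.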
bye, Kobi On 4/7/2012 5:01 PM, Rick Richardson wrote: > http://doc.rust-lang.org/doc/tutorial.html#argument-passing > > This isn't the most detailed treatise on arguments and ptrs in Rust, > but could you describe a use-case > where ro/rw/out modifiers would be beneficial over what currently exists? > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amitava.shee at gmail.com Sat Apr 7 12:28:19 2012 From: amitava.shee at gmail.com (Amitava Shee) Date: Sat, 7 Apr 2012 15:28:19 -0400 Subject: [rust-dev] build error Message-ID: I am on osx snow leopard (10.6.8) amitava:src amitava$ uname -a Darwin amitava.local 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun 7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386 It used to build fine up until now. I get the following error src/rustc/middle/typeck.rs:370:14: 370:28 error: unresolved name: ast::re_static src/rustc/middle/typeck.rs:370 ast::re_static { ^~~~~~~~~~~~~~ src/rustc/middle/typeck.rs:370:14: 370:28 error: not a enum variant: ast::re_static src/rustc/middle/typeck.rs:370 ast::re_static { ^~~~~~~~~~~~~~ src/rustc/middle/region.rs:380:6: 380:20 error: unresolved name: ast::re_static src/rustc/middle/region.rs:380 ast::re_static { /* fallthrough */ } ^~~~~~~~~~~~~~ src/rustc/middle/region.rs:380:6: 380:20 error: not a enum variant: ast::re_static src/rustc/middle/region.rs:380 ast::re_static { /* fallthrough */ } ^~~~~~~~~~~~~~ src/rustc/metadata/creader.rs:272:8: 272:32 error: unresolved name: attr::find_linkage_attrs src/rustc/metadata/creader.rs:272 for attr::find_linkage_attrs(attrs).each {|attr| ^~~~~~~~~~~~~~~~~~~~~~~~ error: aborting due to previous errors make: *** [x86_64-apple-darwin/stage0/lib/rustc/x86_64-apple-darwin/lib/librustc.dylib] Error 101 Here's my git log output ================== amitava:src amitava$ git log commit 7aaa120bcc8819c32c78f49aee90b331a2759a40 Author: Haitao Li Date: Sun Apr 8 02:00:00 2012 +0800 Check version when resolving transitive 
dependent crates

    Issue #2138

commit 5aa5220f8a3fc3ec0c58dede9f4c14be4a3752af
Author: Haitao Li
Date: Sun Apr 8 01:59:37 2012 +0800

    Encode crate dependencies' hash and version data

commit 5300662b4e06f4cde219427bbcc2313eca0a5c51
Author: Niko Matsakis
Date: Fri Apr 6 17:37:37 2012 -0700

    Refactor inference so that subtyping/lub/glb share more code

commit 2f42b14b4f57d602ffa446a8f30840c2dcbeb127
Author: Haitao Li
Date: Fri Apr 6 18:45:49 2012 +0800

    Use version and hash in crate_map name

    Related issue #2137

Thanks & Regards,
Amitava Shee

From niko at alum.mit.edu Sat Apr 7 15:32:32 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Sat, 07 Apr 2012 15:32:32 -0700
Subject: [rust-dev] build error
In-Reply-To:
References:
Message-ID: <4F80C080.7010703@alum.mit.edu>

This sounds like it might be an issue with linking to older versions of the libraries. Try a "make clean" and "make uninstall".

Niko

On 4/7/12 12:28 PM, Amitava Shee wrote:
> I am on osx snow leopard (10.6.8)
>
> amitava:src amitava$ uname -a
> Darwin amitava.local 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun 7
> 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386
>
> It used to build fine up until now.
I get the following error > > src/rustc/middle/typeck.rs:370:14: 370:28 error: unresolved name: > ast::re_static > src/rustc/middle/typeck.rs:370 > ast::re_static { > ^~~~~~~~~~~~~~ > src/rustc/middle/typeck.rs:370:14: 370:28 error: not a enum variant: > ast::re_static > src/rustc/middle/typeck.rs:370 > ast::re_static { > ^~~~~~~~~~~~~~ > src/rustc/middle/region.rs:380:6: 380:20 error: unresolved name: > ast::re_static > src/rustc/middle/region.rs:380 > ast::re_static { /* fallthrough */ } > ^~~~~~~~~~~~~~ > src/rustc/middle/region.rs:380:6: 380:20 error: not a enum variant: > ast::re_static > src/rustc/middle/region.rs:380 > ast::re_static { /* fallthrough */ } > ^~~~~~~~~~~~~~ > src/rustc/metadata/creader.rs:272:8: 272:32 error: unresolved name: > attr::find_linkage_attrs > src/rustc/metadata/creader.rs:272 for > attr::find_linkage_attrs(attrs).each {|attr| > ^~~~~~~~~~~~~~~~~~~~~~~~ > error: aborting due to previous errors > make: *** > [x86_64-apple-darwin/stage0/lib/rustc/x86_64-apple-darwin/lib/librustc.dylib] > Error 101 > > Here's my git log output > ================== > amitava:src amitava$ git log > commit 7aaa120bcc8819c32c78f49aee90b331a2759a40 > Author: Haitao Li > > Date: Sun Apr 8 02:00:00 2012 +0800 > > Check version when resolving transitive dependent crates > > Issue #2138 > > commit 5aa5220f8a3fc3ec0c58dede9f4c14be4a3752af > Author: Haitao Li > > Date: Sun Apr 8 01:59:37 2012 +0800 > > Encode crate dependencies' hash and version data > > commit 5300662b4e06f4cde219427bbcc2313eca0a5c51 > Author: Niko Matsakis > > Date: Fri Apr 6 17:37:37 2012 -0700 > > Refactor inference so that subtyping/lub/glb share more code > > commit 2f42b14b4f57d602ffa446a8f30840c2dcbeb127 > Author: Haitao Li > > Date: Fri Apr 6 18:45:49 2012 +0800 > > Use version and hash in crate_map name > > Related issue #2137 > > Thankd & Regards, > Amitava Shee > > > > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > 
https://mail.mozilla.org/listinfo/rust-dev From banderson at mozilla.com Sat Apr 7 16:08:47 2012 From: banderson at mozilla.com (Brian Anderson) Date: Sat, 07 Apr 2012 16:08:47 -0700 Subject: [rust-dev] Performance optimization In-Reply-To: References: Message-ID: <4F80C8FF.8000109@mozilla.com> On 04/06/2012 06:19 PM, Sebastian Sylvan wrote: > Hi, > What do you guys use to profile rust programs? Just manual timers in > the code, or do you have any tools to recommend? On linux I use either perf or callgrind. I usually prefer callgrind because, in combination with kcachegrind, it displays things with easily comprehensible pictures. > (I can't resist! Eye candy, as of last night: > http://i.imgur.com/77lAr.png . That's 300k triangles, at 512x512 and > 3x3 super sampling, with 1 area light and 1-bounce global > illumination, Took just under 6 minutes on my Core i7 2600k - single > core, since there's no real way to write parallel programs in rust > yet). So awesome! Can a ray tracer not be usefully written in a task-parallel way? Is data-parallelism the only thing that will help here? 
-Brian

From banderson at mozilla.com Sat Apr 7 16:13:54 2012
From: banderson at mozilla.com (Brian Anderson)
Date: Sat, 07 Apr 2012 16:13:54 -0700
Subject: [rust-dev] read file line by line
In-Reply-To:
References: <4F7CB169.8020409@mozilla.com>
Message-ID: <4F80CA32.9010906@mozilla.com>

On 04/06/2012 06:48 PM, Mic wrote:
> Hi
> I have trouble to compile the following code:
>
> import io::reader_util;
> import vec::map;
>
> fn main(args: [str]) {
>
>     let r = io::file_reader(args[1]); // r is result
>     if result::failure(r) {
>         fail result::get_err(r);
>     }
>
>     let rdr = result::get(r);
>
>     while !rdr.eof() {
>         let line = rdr.read_line();
>         io::println(line);
>         if str::len(line) != 0u {
>             let parts = vec::map(line.split_char(',')) {|s|
>                 str::trim(s)
>             };
>         }
>
>     }
> }
>
> and got the errors:
> $ rustc csv.rs
> csv.rs:17:33: 17:48 error: attempted access of field split_char on
> type str, but no public field or method with that name was found
> csv.rs:17 let parts =
> vec::map(line.split_char(',')) {|s|
>
> ^~~~~~~~~~~~~~~
> csv.rs:17:33: 17:53 error: the type of this value must be known in
> this context
> csv.rs:17 let parts =
> vec::map(line.split_char(',')) {|s|
>
> ^~~~~~~~~~~~~~~~~~~~
>
> What did I do wrong?
>

Hi Mic.

The available extension methods (as in `line.split_char(',')`) have been changing a lot recently, so my guess is that your compiler is just slightly out of date and doesn't have the `split_char` extension on `str`. Try updating to Rust HEAD where you will also notice that `result::failure` is now called `result::is_failure`.

-Brian

From amitava.shee at gmail.com Sat Apr 7 19:46:26 2012
From: amitava.shee at gmail.com (Amitava Shee)
Date: Sat, 7 Apr 2012 22:46:26 -0400
Subject: [rust-dev] build error
In-Reply-To: <4F80C080.7010703@alum.mit.edu>
References: <4F80C080.7010703@alum.mit.edu>
Message-ID:

make uninstall && make clean did not resolve the issue.
However, when I removed everything in x86_64-apple-darwin, it worked. $ rm -fr x86_64-apple-darwin Thanks for the hint. Amitava On Sat, Apr 7, 2012 at 6:32 PM, Niko Matsakis wrote: > This sounds like it might be an issue with linking to older versions of > the libraries. Try a "make clean" and "make uninstall". > > > > Niko > > > On 4/7/12 12:28 PM, Amitava Shee wrote: > >> I am on osx snow leopard (10.6.8) >> >> amitava:src amitava$ uname -a >> Darwin amitava.local 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun 7 >> 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386 >> >> It used to build fine up until now. I get the following error >> >> src/rustc/middle/typeck.rs:370:14: 370:28 error: unresolved name: >> ast::re_static >> src/rustc/middle/typeck.rs:370 >> ast::re_static { >> >> ^~~~~~~~~~~~~~ >> src/rustc/middle/typeck.rs:370:14: 370:28 error: not a enum variant: >> ast::re_static >> src/rustc/middle/typeck.rs:370 >> ast::re_static { >> >> ^~~~~~~~~~~~~~ >> src/rustc/middle/region.rs:380:6: 380:20 error: unresolved name: >> ast::re_static >> src/rustc/middle/region.rs:380 >> ast::re_static { /* fallthrough */ } >> >> ^~~~~~~~~~~~~~ >> src/rustc/middle/region.rs:380:6: 380:20 error: not a enum variant: >> ast::re_static >> src/rustc/middle/region.rs:380 >> ast::re_static { /* fallthrough */ } >> >> ^~~~~~~~~~~~~~ >> src/rustc/metadata/creader.rs:272:8: 272:32 error: unresolved name: >> attr::find_linkage_attrs >> src/rustc/metadata/creader.rs:272 for attr::find_linkage_attrs(attrs).each >> {|attr| >> >> ^~~~~~~~~~~~~~~~~~~~~~~~ >> error: aborting due to previous errors >> make: *** [x86_64-apple-darwin/stage0/lib/rustc/x86_64-apple-darwin/lib/librustc.dylib] >> Error 101 >> >> Here's my git log output >> ================== >> amitava:src amitava$ git log >> commit 7aaa120bcc8819c32c78f49aee90b331a2759a40 >> Author: Haitao Li > >> >> Date: Sun Apr 8 02:00:00 2012 +0800 >> >> Check version when resolving
transitive dependent crates >> >> Issue #2138 >> >> commit 5aa5220f8a3fc3ec0c58dede9f4c14be4a3752af >> Author: Haitao Li > >> >> Date: Sun Apr 8 01:59:37 2012 +0800 >> >> Encode crate dependencies' hash and version data >> >> commit 5300662b4e06f4cde219427bbcc2313eca0a5c51 >> Author: Niko Matsakis > >> >> Date: Fri Apr 6 17:37:37 2012 -0700 >> >> Refactor inference so that subtyping/lub/glb share more code >> >> commit 2f42b14b4f57d602ffa446a8f30840c2dcbeb127 >> Author: Haitao Li > >> >> Date: Fri Apr 6 18:45:49 2012 +0800 >> >> Use version and hash in crate_map name >> >> Related issue #2137 >> >> Thanks & Regards, >> Amitava Shee >> >> >> >> >> >> _______________________________________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/listinfo/rust-dev >> > > -- Amitava Shee Software Architect There are two ways of constructing a software design. One is to make it so simple that there are obviously no deficiencies; the other is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult. -- C. A. R. Hoare The Emperor's Old Clothes, CACM February 1981 -------------- next part -------------- An HTML attachment was scrubbed... URL: From kobi2187 at gmail.com Sun Apr 8 11:13:42 2012 From: kobi2187 at gmail.com (Kobi Lurie) Date: Sun, 08 Apr 2012 21:13:42 +0300 Subject: [rust-dev] idea: specific visibility Message-ID: <4F81D556.6090203@gmail.com> hi guys, another suggestion. What do you think of this: the called side (functions, members, maybe classes) can specify its visibility down to the function level (with some wildcard syntax). This extends the private/public/internal visibility seen in most languages. The Eiffel language has something similar for classes, namely selective export: only the listed classes can see a given function.
In the Eiffel language: class ARRAYED_LIST [ G ] inherit ARRAY [ G ] rename count as capacity, item as array_item, put as array_put export {NONE} all {ANY} capacity end ('none' is the "null" class, 'any' is like object, meaning that all classes can see 'capacity' but everything else is private) this idea can be extended to functions, and checked by the compiler. for example, a member can declare that only the setter can change it. same thing for a getter. sometimes a dll has a lot of inter-related functionality inside, and you want a certain "internal" function to be visible, but only used by certain functions. The general idea, is that in a large project, this helps maintainability. In a glance, you can see and limit misbehavior, therefore easier control of the code, and a more stable basis to build on. when things are simple, it is possible to continue developing, when things get complex, every advancing step is sluggish and slower. I know these suggestions are kind of exotic, I have a few more traditional ones. Bye, Kobi From kobi2187 at gmail.com Sun Apr 8 11:15:57 2012 From: kobi2187 at gmail.com (Kobi Lurie) Date: Sun, 08 Apr 2012 21:15:57 +0300 Subject: [rust-dev] console? Message-ID: <4F81D5DD.5000000@gmail.com> hi rust list, is there something like Console.ReadLine in rust? I want to experiment, get a feel for the language by writing a little hangman game. Many thanks, Kobi From pwalton at mozilla.com Sun Apr 8 11:20:14 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Sun, 08 Apr 2012 11:20:14 -0700 Subject: [rust-dev] console? In-Reply-To: <4F81D5DD.5000000@gmail.com> References: <4F81D5DD.5000000@gmail.com> Message-ID: <4F81D6DE.5080200@mozilla.com> On 04/08/2012 11:15 AM, Kobi Lurie wrote: > hi rust list, is there something like Console.ReadLine in rust? > I want to experiment, get a feel for the language by writing a little > hangman game. 
Check out the io module: http://doc.rust-lang.org/doc/core/io.html In particular use stdin() to get a handle to standard input, and then use read_line(). You will need to import reader and reader_util. Patrick From kobi2187 at gmail.com Sun Apr 8 11:29:59 2012 From: kobi2187 at gmail.com (Kobi Lurie) Date: Sun, 08 Apr 2012 21:29:59 +0300 Subject: [rust-dev] console? In-Reply-To: <4F81D6DE.5080200@mozilla.com> References: <4F81D5DD.5000000@gmail.com> <4F81D6DE.5080200@mozilla.com> Message-ID: <4F81D927.1090807@gmail.com> Thanks!! On 4/8/2012 9:20 PM, Patrick Walton wrote: > On 04/08/2012 11:15 AM, Kobi Lurie wrote: >> hi rust list, is there something like Console.ReadLine in rust? >> I want to experiment, get a feel for the language by writing a little >> hangman game. > > Check out the io module: http://doc.rust-lang.org/doc/core/io.html > > In particular use stdin() to get a handle to standard input, and then > use read_line(). You will need to import reader and reader_util. > > Patrick > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From stefan.plantikow at googlemail.com Sun Apr 8 12:15:31 2012 From: stefan.plantikow at googlemail.com (Stefan Plantikow) Date: Sun, 8 Apr 2012 21:15:31 +0200 Subject: [rust-dev] hashmap benchmark In-Reply-To: <4F7B4538.40809@alum.mit.edu> References: <4F7B4538.40809@alum.mit.edu> Message-ID: <034275CD565A473F889C074C3E7BB564@googlemail.com> Hi, I was thinking about this, too. One of the state-of-the art algorithms seems to be hopscotch hashing, wikipedia has a quite good introduction to it. Even though it has been developed for concurrent access, it should also be quite good in a single core scenario and has a really low memory footprint (90% full hash table still works reasonably). I was thinking about implementing that for fun once the currently ongoing changes to regions and vectors are complete. 
Hashing algorithms (hopscotch, bloom filters) could greatly benefit from having access to the llvm bit manipulation intrinsics (ctpop, ctlz, cttz, bswap). I think the general plan was to access these using some form of inline llvm asm. However in the absence of that I wonder wether we should just have support for those directly in core or std for all the integer types (quite some languages do that). Feedback/Suggestions? PS: Is there yet a plan on how to move towards more use of interfaces in the libs, and (by extension) rustc? -- Stefan Plantikow From mictadlo at gmail.com Sat Apr 7 20:58:35 2012 From: mictadlo at gmail.com (Mic) Date: Sun, 8 Apr 2012 13:58:35 +1000 Subject: [rust-dev] write_str error Message-ID: Hi, I am getting the following errors: $ rustc csv_create.rs csv_create.rs:17:1: 17:14 error: attempted access of field write_str on type core::io::writer, but no public field or method with that name was found csv_create.rs:17 rdr.write_str("aaa, bbb,ccc , ddd,eee,fff,ggg,hhh,iii,jjj,kkk,lll,mmm,nnn\n"); ^~~~~~~~~~~~~ csv_create.rs:17:1: 17:78 error: mismatched types: expected function or native function but found _|_ csv_create.rs:17 rdr.write_str("aaa, bbb,ccc , ddd,eee,fff,ggg,hhh,iii,jjj,kkk,lll,mmm,nnn\n"); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ with the following code: import io::reader_util; import vec::map; fn main(args: [str]) { let r = io::file_writer("csv.csv" , [io::create, io::truncate]); // r is result if result::is_failure(r) { fail result::get_err(r); } let rdr = result::get(r); let count = 0; while true { if count == 4000000 { break; } rdr.write_str("aaa, bbb,ccc , ddd,eee,fff,ggg,hhh,iii,jjj,kkk,lll,mmm,nnn\n"); count += 1; } } What did I do wrong and would it possible to rewrite the while loop with for loop? Thank you in advance. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From banderson at mozilla.com Sun Apr 8 13:32:13 2012 From: banderson at mozilla.com (Brian Anderson) Date: Sun, 08 Apr 2012 13:32:13 -0700 Subject: [rust-dev] hashmap benchmark In-Reply-To: <034275CD565A473F889C074C3E7BB564@googlemail.com> References: <4F7B4538.40809@alum.mit.edu> <034275CD565A473F889C074C3E7BB564@googlemail.com> Message-ID: <4F81F5CD.3010803@mozilla.com> On 04/08/2012 12:15 PM, Stefan Plantikow wrote: > Hi, > > I was thinking about this, too. One of the state-of-the art algorithms seems to be hopscotch hashing, wikipedia has a quite good introduction to it. Even though it has been developed for concurrent access, it should also be quite good in a single core scenario and has a really low memory footprint (90% full hash table still works reasonably). I was thinking about implementing that for fun once the currently ongoing changes to regions and vectors are complete. > > Hashing algorithms (hopscotch, bloom filters) could greatly benefit from having access to the llvm bit manipulation intrinsics (ctpop, ctlz, cttz, bswap). I think the general plan was to access these using some form of inline llvm asm. However in the absence of that I wonder wether we should just have support for those directly in core or std for all the integer types (quite some languages do that). We used to have an 'llvm' ABI for native mods that was intended to give access to the llvm intrinsics. We could add that back, or just add them as rust intrinsics ('rust-intrinsic' ABI) as needed. > Feedback/Suggestions? > > PS: Is there yet a plan on how to move towards more use of interfaces in the libs, and (by extension) rustc? > I don't think there's a plan yet. It's probably worth waiting for classes. From banderson at mozilla.com Sun Apr 8 13:37:35 2012 From: banderson at mozilla.com (Brian Anderson) Date: Sun, 08 Apr 2012 13:37:35 -0700 Subject: [rust-dev] enscripten demo? 
In-Reply-To: References: <4F7B3D57.3040004@mozilla.com> Message-ID: <4F81F70F.6040202@mozilla.com> On 04/04/2012 02:55 AM, Mohd. Bilal Husain wrote: > Passed a dumb sample rust bitcode to emscripten, got js functions#. > Realized I need to run on core modules too for printing simple hello > world. Took io from libcore, decimated code to avoid few build errors, > emcc throws error > > Unclear type in struct > > Anyways, need to figure out how to build native modules and core lib, > std lib; and how to map these modules to imports in a sample hello-world. I would start with test cases that do not use core or std at all, like: #[no_core]; fn main() { log(0, "hello, world"); } Build that with --emit-llvm -S and build it with emscripten. After enough fiddling it should complain about not being able to find `rust_start` (the entry point to the rust runtime). Then you write a `rust_start` function and keep repeating until you've rewritten the runtime in javascript. FWIW, last time I tried to run our bitcode through emscripten I had to use 32-bit code because emscripten had some bugs related to 64-bit code. 
-Brian From banderson at mozilla.com Sun Apr 8 13:45:12 2012 From: banderson at mozilla.com (Brian Anderson) Date: Sun, 08 Apr 2012 13:45:12 -0700 Subject: [rust-dev] write_str error In-Reply-To: References: Message-ID: <4F81F8D8.5060705@mozilla.com> On 04/07/2012 08:58 PM, Mic wrote: > Hi, > I am getting the following errors: > > $ rustc csv_create.rs > csv_create.rs:17:1: 17:14 error: attempted access of field write_str > on type core::io::writer, but no public field or method with that name > was found > csv_create.rs:17 rdr.write_str("aaa, > bbb,ccc , ddd,eee,fff,ggg,hhh,iii,jjj,kkk,lll,mmm,nnn\n"); > ^~~~~~~~~~~~~ > csv_create.rs:17:1: 17:78 error: mismatched types: expected function > or native function but found _|_ > csv_create.rs:17 rdr.write_str("aaa, > bbb,ccc , ddd,eee,fff,ggg,hhh,iii,jjj,kkk,lll,mmm,nnn\n"); > > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > with the following code: > > import io::reader_util; > import vec::map; > > fn main(args: [str]) { > > let r = io::file_writer("csv.csv" , [io::create, io::truncate]); > // r is result > if result::is_failure(r) { > fail result::get_err(r); > } > > let rdr = result::get(r); > > let count = 0; > while true { > > if count == 4000000 { break; } > rdr.write_str("aaa, bbb,ccc , > ddd,eee,fff,ggg,hhh,iii,jjj,kkk,lll,mmm,nnn\n"); > count += 1; > } > } > > What did I do wrong and would it possible to rewrite the while loop > with for loop? > I believe the problem is that you are using a writer type but have imported a reader impl. if you add an `import io::writer_util;` statement then it will get farther. The most concise way to write your while loop would be using `iter::repeat` which just executes a function a specific number of times, like `iter::repeat(4000000) {|| ... }`. Sadly `iter::repeat` can't be used in a for loop yet. Our iteration strategy still needs an overhaul to be compatible with `for`. 
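For readers on a present-day toolchain, the counted loop asked about above can be sketched with a plain range-based `for`. This is an illustration in today's Rust, not the 2012 dialect used in this thread; the helper name `write_rows` and its parameters are made up for the example, and it takes any `std::io::Write` implementor so that it does not depend on a particular file API.

```rust
use std::io::{self, Write};

/// Writes `count` copies of `row` to any writer (a file, a buffer, stdout).
/// Hypothetical helper for illustration; not part of any library.
pub fn write_rows<W: Write>(out: &mut W, row: &str, count: usize) -> io::Result<()> {
    // A range-based `for` replaces the manual counter and `break`.
    for _ in 0..count {
        out.write_all(row.as_bytes())?;
    }
    // Make sure buffered data reaches the underlying sink.
    out.flush()
}
```

To mirror the original benchmark one would presumably pass a `std::io::BufWriter` wrapped around the result of `std::fs::File::create("csv.csv")` and a count of 4000000; `File::create` creates the file or truncates an existing one, which is the role `[io::create, io::truncate]` plays in the code above.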
-Brian -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.plantikow at googlemail.com Sun Apr 8 13:55:15 2012 From: stefan.plantikow at googlemail.com (Stefan Plantikow) Date: Sun, 8 Apr 2012 22:55:15 +0200 Subject: [rust-dev] hashmap benchmark In-Reply-To: <4F81F5CD.3010803@mozilla.com> References: <4F7B4538.40809@alum.mit.edu> <034275CD565A473F889C074C3E7BB564@googlemail.com> <4F81F5CD.3010803@mozilla.com> Message-ID: <1F872970AFAF45C5A4E36D43248A9971@googlemail.com> Hi, Am Sonntag, 8. April 2012 um 22:32 schrieb Brian Anderson: > > Hashing algorithms (hopscotch, bloom filters) could greatly benefit from having access to the llvm bit manipulation intrinsics (ctpop, ctlz, cttz, bswap). I think the general plan was to access these using some form of inline llvm asm. However in the absence of that I wonder wether we should just have support for those directly in core or std for all the integer types (quite some languages do that). > > > We used to have an 'llvm' ABI for native mods that was intended to give > access to the llvm intrinsics. We could add that back, or just add them > as rust intrinsics ('rust-intrinsic' ABI) as needed. > That would be nice. For hash table algorithms, ctpop/ctlz/cttz will be really useful (should also speedup the sudoku benchmark :). And swap is quite helpful for dealing with utf8 byte order swapping. How would one call these via the rust intrinsics? I am not deep enough into the llvm-rust-bits. > > > Feedback/Suggestions? > > Is going for hopscotch a good idea? Ah well, will try in any case ;) > > > > PS: Is there yet a plan on how to move towards more use of interfaces in the libs, and (by extension) rustc? > > > I don't think there's a plan yet. It's probably worth waiting for classes. Ok, that makes sense to some extent, though things like comparison etc already could move towards using interfaces, I guess. 
-- Stefan Plantikow From banderson at mozilla.com Sun Apr 8 14:16:34 2012 From: banderson at mozilla.com (Brian Anderson) Date: Sun, 08 Apr 2012 14:16:34 -0700 Subject: [rust-dev] hashmap benchmark In-Reply-To: <1F872970AFAF45C5A4E36D43248A9971@googlemail.com> References: <4F7B4538.40809@alum.mit.edu> <034275CD565A473F889C074C3E7BB564@googlemail.com> <4F81F5CD.3010803@mozilla.com> <1F872970AFAF45C5A4E36D43248A9971@googlemail.com> Message-ID: <4F820032.1080502@mozilla.com> On 04/08/2012 01:55 PM, Stefan Plantikow wrote: > Hi, > > > Am Sonntag, 8. April 2012 um 22:32 schrieb Brian Anderson: > >>> Hashing algorithms (hopscotch, bloom filters) could greatly benefit from having access to the llvm bit manipulation intrinsics (ctpop, ctlz, cttz, bswap). I think the general plan was to access these using some form of inline llvm asm. However in the absence of that I wonder wether we should just have support for those directly in core or std for all the integer types (quite some languages do that). >> >> We used to have an 'llvm' ABI for native mods that was intended to give >> access to the llvm intrinsics. We could add that back, or just add them >> as rust intrinsics ('rust-intrinsic' ABI) as needed. >> > That would be nice. For hash table algorithms, ctpop/ctlz/cttz will be really useful (should also speedup the sudoku benchmark :). And swap is quite helpful for dealing with utf8 byte order swapping. How would one call these via the rust intrinsics? I am not deep enough into the llvm-rust-bits. As a rust intrinsic perhaps like #[abi = "rust-intrinsic"] native mod intrinsics { fn bswap_i16(i: i16) -> i16; } The rust intrinsics would all just be hardcoded into the compiler to translate to the appropriate llvm intrinsics. 
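For comparison, and stepping outside the 2012 dialect of this thread: present-day Rust exposes these bit operations as ordinary methods on the integer types (`count_ones`, `leading_zeros`, `trailing_zeros`, `swap_bytes`), so no `native mod` declaration is needed. A small sketch follows; the function name `bit_summary` is invented here purely to group the four calls.

```rust
/// Returns (population count, leading zeros, trailing zeros, byte-swapped value)
/// for a 32-bit value. Hypothetical helper; the four methods it calls are
/// standard methods on `u32`.
pub fn bit_summary(x: u32) -> (u32, u32, u32, u32) {
    (
        x.count_ones(),     // ctpop
        x.leading_zeros(),  // ctlz
        x.trailing_zeros(), // cttz
        x.swap_bytes(),     // bswap
    )
}
```

The same methods exist on the other integer widths, which is roughly the "support directly in core for all the integer types" option raised earlier in the thread.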
As an llvm intrinsic: #[abi = "llvm-intrinsic"] native mod intrinsics { #[link_name = "bswap.i16"] fn bswap_i16(i16: i16) -> i16; } In this case rustc probably doesn't need to know anything specific about the intrinsic - we just generate an intrinsic instruction with the given name and types. > >>> Feedback/Suggestions? >>> > Is going for hopscotch a good idea? Ah well, will try in any case ;) I'm not really familiar with the subject. If the intent is to replace std::map then presumably anything that demonstrates better performance is a good idea. >>> PS: Is there yet a plan on how to move towards more use of interfaces in the libs, and (by extension) rustc? >> >> I don't think there's a plan yet. It's probably worth waiting for classes. > Ok, that makes sense to some extent, though things like comparison etc already could move towards using interfaces, I guess. > > From niko at alum.mit.edu Sun Apr 8 14:21:44 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Sun, 08 Apr 2012 14:21:44 -0700 Subject: [rust-dev] hashmap benchmark In-Reply-To: <034275CD565A473F889C074C3E7BB564@googlemail.com> References: <4F7B4538.40809@alum.mit.edu> <034275CD565A473F889C074C3E7BB564@googlemail.com> Message-ID: <4F820168.5000802@alum.mit.edu> On 4/8/12 12:15 PM, Stefan Plantikow wrote: > Hi, > > I was thinking about this, too. One of the state-of-the art algorithms seems to be hopscotch hashing, wikipedia has a quite good introduction to it. Even though it has been developed for concurrent access, it should also be quite good in a single core scenario and has a really low memory footprint (90% full hash table still works reasonably). I was thinking about implementing that for fun once the currently ongoing changes to regions and vectors are complete. I am totally excited about offering a variety of map abstractions. 
One additional thing I would particularly like (which is kind of orthogonal) is a default map implementation that begins as a straight-up list of some fixed size and then shifts to another algorithm as the table is populated. Of course we'd want to test and tune to see where it makes sense to shift, but I believe that a lot of hash tables are generally small (but not always) and thus can benefit from changing strategies as they fill up. In fact I would like to have a similar approach to each of the basic container types (a default implementation that adjusts and does the right thing, plus a variety of more specific implementations). I also think persistent collections would be very useful. > Hashing algorithms (hopscotch, bloom filters) could greatly benefit from having access to the llvm bit manipulation intrinsics (ctpop, ctlz, cttz, bswap). I was going to say that we should just build them into the compiler as intrinsics (after Marijn's work on intrinsics, this should be fairly straightforward). But Brian's e-mail also looks pretty nice, actually. In general I'd rather avoid introducing arbitrary LLVM assembly---the further we can avoid exposing our use of LLVM, the better imo---but packaging things as "native" functions helps to keep us relatively portable should we move away from LLVM in the future. Niko From sebastian.sylvan at gmail.com Sun Apr 8 14:23:45 2012 From: sebastian.sylvan at gmail.com (Sebastian Sylvan) Date: Sun, 8 Apr 2012 14:23:45 -0700 Subject: [rust-dev] hashmap benchmark In-Reply-To: <1F872970AFAF45C5A4E36D43248A9971@googlemail.com> References: <4F7B4538.40809@alum.mit.edu> <034275CD565A473F889C074C3E7BB564@googlemail.com> <4F81F5CD.3010803@mozilla.com> <1F872970AFAF45C5A4E36D43248A9971@googlemail.com> Message-ID: On Sun, Apr 8, 2012 at 1:55 PM, Stefan Plantikow wrote: > > Hi, > > > Am Sonntag, 8. 
April 2012 um 22:32 schrieb Brian Anderson: > >> > Hashing algorithms (hopscotch, bloom filters) could greatly benefit from having access to the llvm bit manipulation intrinsics (ctpop, ctlz, cttz, bswap). I think the general plan was to access these using some form of inline llvm asm. However in the absence of that I wonder wether we should just have support for those directly in core or std for all the integer types (quite some languages do that). >> >> >> We used to have an 'llvm' ABI for native mods that was intended to give >> access to the llvm intrinsics. We could add that back, or just add them >> as rust intrinsics ('rust-intrinsic' ABI) as needed. >> > > That would be nice. For hash table algorithms, ctpop/ctlz/cttz will be really useful (should also speedup the sudoku benchmark :). And swap is quite helpful for dealing with utf8 byte order swapping. How would one call these via the rust intrinsics? I am not deep enough into the llvm-rust-bits. > > >> >> > Feedback/Suggestions? >> > > Is going for hopscotch a good idea? Ah well, will try in any case ;) One of the most surprisingly awesome hash table algorithms (to me at least) is open addressing based on "robin hood hashing". It's been around for ages, but almost nobody knows about it. It's *such* a simple (i.e. fast!) tweak to the regular algorithm, and it makes all the difference. I implemented the basic ops in C++ and it was something like two orders of magnitude faster than the built in unordered_set in Visual Studio for insertions and one order of magnitude for lookups. Basically, you do the "normal" open addressing where if there's a collision you just linearly walk until you find an empty slot. However, there's a twist. For each filled element you check its "displacement" (i.e. distance from the hash bucket it "wants" to be in to its current location). 
If the displacement for the value you're trying to place is larger than the displacement of the value already stored at the location then you swap place and keep going with the other value. This simple change causes variance of displacements to drop way down and means you can get 90+% occupancy with just a few probes per query. (Turns out this makes deletions and queries simpler too, because the latter exits based on this displacement invariant (instead of exiting based on finding an empty slot), which means the former can be done by just marking the slot as empty.) You need some way to efficiently check the "displacement" of a stored element. Maybe you store a small number (3 bits is enough) that contains the actual displacement, but a simpler strategy is to just cache the hash value with the element. You usually do this anyway because calling the hash function is sometimes expensive so you want a cheap "early out" when checking equality (especially for things like strings). Anyway, I like this quite a bit better than hopscotch hashing because it's just *so* damn simple. With my suggestion above you get one DWORD of overhead per element (for the cached hash value). No pointers. One cache miss per lookup, typically, and all the code is very simple logic and runs really fast. -- Sebastian Sylvan From mictadlo at gmail.com Sun Apr 8 16:52:58 2012 From: mictadlo at gmail.com (Mic) Date: Mon, 9 Apr 2012 09:52:58 +1000 Subject: [rust-dev] read file line by line In-Reply-To: <4F80CA32.9010906@mozilla.com> References: <4F7CB169.8020409@mozilla.com> <4F80CA32.9010906@mozilla.com> Message-ID: Hello, Thank you it is working. I created a writing and reading benchmark. In both cases Python is about 3 times faster than Rust. 
Please find the results below and the code attached (csv_create.py/rs has to run first, because it creates the csv file that csv.py/rs reads). *BENCHMARK 1*: Writing 4000000 lines to a file $ time python csv_create.py real 0m3.620s user 0m1.942s sys 0m0.339s $ ls -ahl csv.csv -rw-r--r-- 1 mictadlo mictadlo 226M Apr 9 09:05 csv.csv $ time ./csv_create real 0m11.299s user 0m3.222s sys 0m5.973s $ ls -ahl csv.csv -rw-r--r-- 1 mictadlo mictadlo 226M Apr 9 09:07 csv.csv *BENCHMARK 2*: Reading a csv file and trimming each field $ time python csv.py real 0m22.136s user 0m21.728s sys 0m0.095s $ time ./csv real 1m6.796s user 1m6.364s sys 0m0.145s If you are happy with the benchmark, I could commit it to git. What is the git command to commit in 'src/test/bench'? Thank you in advance. On Sun, Apr 8, 2012 at 9:13 AM, Brian Anderson wrote: > On 04/06/2012 06:48 PM, Mic wrote: > > Hi > I have trouble to compile the following code: > > import io::reader_util; > import vec::map; > > fn main(args: [str]) { > > let r = io::file_reader(args[1]); // r is result > if result::failure(r) { > fail result::get_err(r); > } > > let rdr = result::get(r); > > while !rdr.eof() { > let line = rdr.read_line(); > io::println(line); > if str::len(line) != 0u { > let parts = vec::map(line.split_char(',')) {|s| > str::trim(s) > }; > } > > } > } > > and got the errors: > $ rustc csv.rs > csv.rs:17:33: 17:48 error: attempted access of field split_char on type > str, but no public field or method with that name was found > csv.rs:17 let parts = vec::map(line.split_char(',')) {|s| > > ^~~~~~~~~~~~~~~ > csv.rs:17:33: 17:53 error: the type of this value must be known in this > context > csv.rs:17 let parts = vec::map(line.split_char(',')) {|s| > > ^~~~~~~~~~~~~~~~~~~~ > > What did I do wrong? > > > Hi Mic.
> > The available extension methods (as in `line.split_char(',')`) have been > changing a lot recently, so my guess is that your compiler is just slightly > out of date and doesn't have the `split_char` extension on `str`. Try > updating to Rust HEAD where you will also notice that `result::failure` is > now called `result::is_failure`. > > -Brian > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: csv_create.rs Type: application/octet-stream Size: 420 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: csv_create.py Type: application/octet-stream Size: 205 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: csv.rs Type: application/octet-stream Size: 486 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: csv.py Type: application/octet-stream Size: 255 bytes Desc: not available URL: From pwalton at mozilla.com Sun Apr 8 17:09:46 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Sun, 08 Apr 2012 17:09:46 -0700 Subject: [rust-dev] read file line by line In-Reply-To: References: <4F7CB169.8020409@mozilla.com> <4F80CA32.9010906@mozilla.com> Message-ID: <4F8228CA.6020501@mozilla.com> On 04/08/2012 04:52 PM, Mic wrote: > Hello, > Thank you it is working. I created a writing and reading benchmark. In > both cases Python is about 3 times faster than Rust. I'd bet it's due to allocating too many vectors and copying vectors. We tend to get killed in vector allocation performance due to the fact that all vectors are unique and on the heap. This is what Graydon's work should alleviate. 
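As a rough illustration of the allocation point, here is a sketch in present-day Rust (not the dialect benchmarked above) of the read, split and trim loop written so that one line buffer is reused and the fields are borrowed slices rather than copies. The function name `count_fields` is made up for the example, and it counts fields instead of collecting them so that nothing is allocated per field; it is not a claim about how the 2012 runtime behaved.

```rust
use std::io::{self, BufRead};

/// Reads comma-separated lines from `input`, trims each field, and returns
/// the total number of non-empty fields seen. Hypothetical helper.
pub fn count_fields<R: BufRead>(mut input: R) -> io::Result<usize> {
    let mut line = String::new(); // one buffer, reused for every line
    let mut total = 0;
    loop {
        line.clear();
        // read_line appends to `line` and returns Ok(0) at end of input.
        if input.read_line(&mut line)? == 0 {
            break;
        }
        // `split` and `trim` yield borrowed &str slices into `line`,
        // so no per-field string is allocated.
        total += line
            .split(',')
            .map(str::trim)
            .filter(|field| !field.is_empty())
            .count();
    }
    Ok(total)
}
```

In practice `input` would typically be a `std::io::BufReader` around an opened file; a byte slice also works, which is convenient for trying the function out.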
Patrick From a.stavonin at gmail.com Sun Apr 8 17:35:05 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Mon, 9 Apr 2012 09:35:05 +0900 Subject: [rust-dev] Main page example compilation error Message-ID: Hi, I'm trying to compile the main page example, but it fails with an error: > rustc main.rs main.rs:5:0: 5:1 error: expecting in, found } main.rs:5 } The state of the example on the main page makes the "very small taste" of Rust not too good. Is it possible to fix the example? -------------- next part -------------- An HTML attachment was scrubbed... URL: From pwalton at mozilla.com Sun Apr 8 17:36:46 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Sun, 08 Apr 2012 17:36:46 -0700 Subject: [rust-dev] Main page example compilation error In-Reply-To: References: Message-ID: <4F822F1E.8030800@mozilla.com> On 04/08/2012 05:35 PM, Alexander Stavonin wrote: > Hi, I'm trying to compile main page example, but it failed with error: > > > > rustc main.rs > > main.rs:5:0: 5:1 error: expecting in, found } > > main.rs:5 } > > > The situation with example on main page makes "very small taste" of Rust > not too good. Is it possible to fix example? The example is for the current git tip of Rust, not any released version. Perhaps there should be a disclaimer to that effect... Patrick From mictadlo at gmail.com Sun Apr 8 18:17:28 2012 From: mictadlo at gmail.com (Mic) Date: Mon, 9 Apr 2012 11:17:28 +1000 Subject: [rust-dev] write_str error In-Reply-To: <4F81F8D8.5060705@mozilla.com> References: <4F81F8D8.5060705@mozilla.com> Message-ID: Thank you, it is working. However, why does Rust require writing '[io::create, io::truncate]' rather than a mode string, as in Python's open(filename, mode)? The first argument is a string containing the filename. The second argument is another string containing a few characters describing the way in which the file will be used.
mode can be 'r' when the file will only be read, 'w' for only writing (an existing file with the same name will be erased), and 'a' opens the file for appending; any data written to the file is automatically added to the end. 'r+' opens the file for both reading and writing. The mode argument is optional; 'r' will be assumed if it's omitted. On Windows, 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows makes a distinction between text and binary files; the end-of-line characters in text files are automatically altered slightly when data is read or written. This behind-the-scenes modification to file data is fine for ASCII text files, but it'll corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files. On Unix, it doesn't hurt to append a 'b' to the mode, so you can use it platform-independently for all binary files. On Mon, Apr 9, 2012 at 6:45 AM, Brian Anderson wrote: > On 04/07/2012 08:58 PM, Mic wrote: > > Hi, > I am getting the following errors: > > $ rustc csv_create.rs > csv_create.rs:17:1: 17:14 error: attempted access of field write_str on > type core::io::writer, but no public field or method with that name was > found > csv_create.rs:17 rdr.write_str("aaa, bbb,ccc , > ddd,eee,fff,ggg,hhh,iii,jjj,kkk,lll,mmm,nnn\n"); > ^~~~~~~~~~~~~ > csv_create.rs:17:1: 17:78 error: mismatched types: expected function or > native function but found _|_ > csv_create.rs:17 rdr.write_str("aaa, bbb,ccc , > ddd,eee,fff,ggg,hhh,iii,jjj,kkk,lll,mmm,nnn\n"); > > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > with the following code: > > import io::reader_util; > import vec::map; > > fn main(args: [str]) { > > let r = io::file_writer("csv.csv" , [io::create, io::truncate]); // > r is result > if result::is_failure(r) { > fail result::get_err(r); > } > > let rdr = result::get(r); > > let count = 0; >
while true { > > if count == 4000000 { break; } > rdr.write_str("aaa, bbb,ccc , > ddd,eee,fff,ggg,hhh,iii,jjj,kkk,lll,mmm,nnn\n"); > count += 1; > } > } > > What did I do wrong and would it possible to rewrite the while loop with > for loop? > > > I believe the problem is that you are using a writer type but have > imported a reader impl. if you add an `import io::writer_util;` statement > then it will get farther. > > The most concise way to write your while loop would be using > `iter::repeat` which just executes a function a specific number of times, > like `iter::repeat(4000000) {|| ... }`. Sadly `iter::repeat` can't be used > in a for loop yet. Our iteration strategy still needs an overhaul to be > compatible with `for`. > > -Brian > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From grahame at angrygoats.net Sun Apr 8 21:11:01 2012 From: grahame at angrygoats.net (Grahame Bowland) Date: Mon, 9 Apr 2012 12:11:01 +0800 Subject: [rust-dev] read file line by line In-Reply-To: References: <4F7CB169.8020409@mozilla.com> <4F80CA32.9010906@mozilla.com> Message-ID: Hi I've written a CSV reader implementation which handles things like escaping, quotes, etc - so it's better than a naive character split. https://github.com/grahame/rust-csv It's a bit slower than Python's implementation (on the order of 3x). I found most of the time was in str::from_chars, but since spending some time speeding that up you're correct - most of the time is spent in allocations. The Python CSV module is written in C and has a maximum line length limit of 128K. It's going to be fairly hard to beat, but it's also not doing exactly the same thing. Also you're not really racing Python and Rust, you're racing C + a tiny amount of Python and Rust. Cheers Grahame (I haven't written a writer as I don't need one, but it'd be a welcome addition if someone wants to add one.) On 9 April 2012 07:52, Mic wrote: > Hello, > Thank you it is working. 
I created a writing and reading benchmark. In > both cases Python is about 3 times faster than Rust. > > Please find below the results and attached the codes (create_csv.py/rshas to run first, because it creates a csv file which is used for > csv.py/rs) > > *BENCHMARK 1*: Writting 4000000 lines to a file > > $ time python csv_create.py > > real 0m3.620s > user 0m1.942s > sys 0m0.339s > $ ls -ahl csv.csv > -rw-r--r-- 1 mictadlo mictadlo 226M Apr 9 09:05 csv.csv > > > $ time ./csv_create > > real 0m11.299s > user 0m3.222s > sys 0m5.973s > $ ls -ahl csv.csv > -rw-r--r-- 1 mictadlo mictadlo 226M Apr 9 09:07 csv.csv > > > *BENCHMARK 2: *Readind a csv file and trim each field > > $ time python csv.py > > real 0m22.136s > user 0m21.728s > sys 0m0.095s > > $ time ./csv > > real 1m6.796s > user 1m6.364s > sys 0m0.145s > > If you guys happy benchmark than I could commit it to git. What is the git > command to commit in 'src/test/bench'? > > Thank you in advance. > > On Sun, Apr 8, 2012 at 9:13 AM, Brian Anderson wrote: > >> ** >> On 04/06/2012 06:48 PM, Mic wrote: >> >> Hi >> I have trouble to compile the following code: >> >> import io::reader_util; >> import vec::map; >> >> fn main(args: [str]) { >> >> let r = io::file_reader(args[1]); // r is result >> if result::failure(r) { >> fail result::get_err(r); >> } >> >> let rdr = result::get(r); >> >> while !rdr.eof() { >> let line = rdr.read_line(); >> io::println(line); >> if str::len(line) != 0u { >> let parts = vec::map(line.split_char(',')) {|s| >> str::trim(s) >> }; >> } >> >> } >> } >> >> and got the errors: >> $ rustc csv.rs >> csv.rs:17:33: 17:48 error: attempted access of field split_char on type >> str, but no public field or method with that name was found >> csv.rs:17 let parts = vec::map(line.split_char(',')) {|s| >> >> ^~~~~~~~~~~~~~~ >> csv.rs:17:33: 17:53 error: the type of this value must be known in this >> context >> csv.rs:17 let parts = vec::map(line.split_char(',')) {|s| >> >> ^~~~~~~~~~~~~~~~~~~~ >> >> 
What did I do wrong? >> >> >> Hi Mic. >> >> The available extension methods (as in `line.split_char(',')`) have been >> changing a lot recently, so my guess is that your compiler is just slightly >> out of date and doesn't have the `split_char` extension on `str`. Try >> updating to Rust HEAD where you will also notice that `result::failure` is >> now called `result::is_failure`. >> >> -Brian >> >> > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mictadlo at gmail.com Sun Apr 8 21:25:43 2012 From: mictadlo at gmail.com (Mic) Date: Mon, 9 Apr 2012 14:25:43 +1000 Subject: [rust-dev] spawn on computer cluster Message-ID: Hi, Does spawn spread the task across computer nodes in a cluster eg like in Julia http://julialang.org/manual/parallel-computing/ ? Any plans maybe also to build spawn on top of MapReduce and HDFS API for Hadoop like in Python http://sourceforge.net/apps/mediawiki/pydoop/index.php?title=Main_Paget ? So the user of the rust application can choose by starting the application on what it should run? Thank you in advance. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.plantikow at googlemail.com Mon Apr 9 04:20:42 2012 From: stefan.plantikow at googlemail.com (Stefan Plantikow) Date: Mon, 9 Apr 2012 13:20:42 +0200 Subject: [rust-dev] hashmap benchmark In-Reply-To: References: <4F7B4538.40809@alum.mit.edu> <034275CD565A473F889C074C3E7BB564@googlemail.com> <4F81F5CD.3010803@mozilla.com> <1F872970AFAF45C5A4E36D43248A9971@googlemail.com> Message-ID: Hi again, Am Sonntag, 8. April 2012 um 23:23 schrieb Sebastian Sylvan: > On Sun, Apr 8, 2012 at 1:55 PM, Stefan Plantikow > wrote: > > > > Feedback/Suggestions? > > > > > > > Is going for hopscotch a good idea? 
Ah well, will try in any case ;) > > > One of the most surprisingly awesome hash table algorithms (to me at > least) is open addressing based on "robin hood hashing". It's been > around for ages, but almost nobody knows about it. It's *such* a > simple (i.e. fast!) tweak to the regular algorithm, and it makes all > the difference. I implemented the basic ops in C++ and it was > something like two orders of magnitude faster than the built in > unordered_set in Visual Studio for insertions and one order of > magnitude for lookups. > Thanks for the pointer, that is definitely an algorithm to keep in mind! I read the hopscotch paper today and among other things they evaluate hopscotch against a highly optimized linear probing algorithm. Hopscotch outperforms that linear probing approach by a noticeable margin, likely due to better cache alignment (hopscotch on average requires < 1 cache miss!). Though it is not completely clear how much that carries over to another linear probing algorithm, I still tend towards implementing single core hopscotch with key displacement for now. Actually, it may be possible to combine robin hood probe counts with hopscotch hashing to get the reduced variance effect and I probably will try that. While digging through papers, I also noticed that there are new results that limit the memory requirements of near perfect hash functions (fixed key sets), and stumbled upon judy arrays (which I didn't know before and may be interesting to try, too, even though they are badly described by the authors). May rust get more and faster data structures, Cheers, Stefan From a.stavonin at gmail.com Mon Apr 9 05:03:13 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Mon, 9 Apr 2012 21:03:13 +0900 Subject: [rust-dev] Main page example compilation error (Patrick Walton) In-Reply-To: References: Message-ID: Patrick, I checked the example with the latest version of Rust, compiled from Git.
Results: astavonin:/Users/../RustTests: rustc --version rustc 0.2 (9e1e42d 2012-04-08 14:16:55 -0700) host: x86_64-apple-darwin astavonin:/Users/../RustTests: rustc test.rs test.rs:2:4: 2:7 error: `for` must be followed by a block call test.rs:2 for i in [1, 2, 3] { ^~~ What I've made wrong? Regards, Alexander. > The example is for the current git tip of Rust, not any released > version. Perhaps there should be a disclaimer to that effect... > > Patrick From kobi2187 at gmail.com Mon Apr 9 05:24:54 2012 From: kobi2187 at gmail.com (Kobi Lurie) Date: Mon, 09 Apr 2012 15:24:54 +0300 Subject: [rust-dev] hashmap benchmark In-Reply-To: References: <4F7B4538.40809@alum.mit.edu> <034275CD565A473F889C074C3E7BB564@googlemail.com> <4F81F5CD.3010803@mozilla.com> <1F872970AFAF45C5A4E36D43248A9971@googlemail.com> Message-ID: <4F82D516.8020601@gmail.com> hello Stefan, the felix programming language uses Judy arrays (extensively?), and seem to have a good understanding of them. maybe you can check out what he came up with. It's a nice language, too, with some interesting features. bye, Kobi On 4/9/2012 2:20 PM, Stefan Plantikow wrote: > Thanks for the pointer, that is definitely an algorithm to keep in mind! > > I read the hopscotch paper today and among other things they evaluate hopscotch against a highly optimized linear probing algorithm. Hopscotch outperforms that linear probing approach by a noticeable margin, likely due to better cache alignment (hopscotch on average requires< 1 cache miss!). Though it is not completely clear how much that carries over to another linear probing algorithm, I still tend towards implementing single core hopscotch with key displacement for now. Actually, it may be possible to combine robin hood probe counts with hopscotch hashing to get the reduced variance effect and I probably will try that. 
> > While digging through papers, I also notices that there are new results that limit the memory requirements of near perfect hash functions (fixed key sets), and stumbled upon judy arrays (which I didn't know before and may be interesting to try, too, even though they are badly described by the authors). > > May rust get more and faster data structures, > > > Cheers, > > > Stefan > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From stefan.plantikow at googlemail.com Mon Apr 9 05:36:46 2012 From: stefan.plantikow at googlemail.com (Stefan Plantikow) Date: Mon, 9 Apr 2012 14:36:46 +0200 Subject: [rust-dev] hashmap benchmark In-Reply-To: <4F820032.1080502@mozilla.com> References: <4F7B4538.40809@alum.mit.edu> <034275CD565A473F889C074C3E7BB564@googlemail.com> <4F81F5CD.3010803@mozilla.com> <1F872970AFAF45C5A4E36D43248A9971@googlemail.com> <4F820032.1080502@mozilla.com> Message-ID: <69366FC819F843EDA0DAE9BFA73E31DD@googlemail.com> Hi, Am Sonntag, 8. April 2012 um 23:16 schrieb Brian Anderson: > > #[abi = "rust-intrinsic"] > native mod intrinsics { > fn bswap_i16(i: i16) -> i16; > } > > The rust intrinsics would all just be hardcoded into the compiler to > translate to the appropriate llvm intrinsics. > > As an llvm intrinsic: > > #[abi = "llvm-intrinsic"] > native mod intrinsics { > #[link_name = "bswap.i16"] > fn bswap_i16(i16: i16) -> i16; > } > > In this case rustc probably doesn't need to know anything specific about > the intrinsic - we just generate an intrinsic instruction with the given > name and types. is there any advantage from choosing one form over the other (maybe in terms of optimizability) beyond what already was mentioned? 
Greets, Stefan From grahame at angrygoats.net Mon Apr 9 05:48:32 2012 From: grahame at angrygoats.net (Grahame Bowland) Date: Mon, 9 Apr 2012 20:48:32 +0800 Subject: [rust-dev] Main page example compilation error (Patrick Walton) In-Reply-To: References: Message-ID: Hi Alexander Tip moves pretty quickly; old style for has just been removed, I think you want for vec::each([1,2,3]) { |i| now. Cheers Grahame On 9 April 2012 20:03, Alexander Stavonin wrote: > Patrick, > > I checked the example with latest version of Rust, compiled from Git. > Results: > > astavonin:/Users/../RustTests: rustc --version > rustc 0.2 (9e1e42d 2012-04-08 14:16:55 -0700) > host: x86_64-apple-darwin > astavonin:/Users/../RustTests: rustc test.rs > test.rs:2:4: 2:7 error: `for` must be followed by a block call > test.rs:2 for i in [1, 2, 3] { > ^~~ > > What I've made wrong? > > Regards, > Alexander. > > > The example is for the current git tip of Rust, not any released > > version. Perhaps there should be a disclaimer to that effect... > > > > Patrick > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Mon Apr 9 05:48:55 2012 From: masklinn at masklinn.net (Masklinn) Date: Mon, 9 Apr 2012 14:48:55 +0200 Subject: [rust-dev] hashmap benchmark In-Reply-To: <4F820168.5000802@alum.mit.edu> References: <4F7B4538.40809@alum.mit.edu> <034275CD565A473F889C074C3E7BB564@googlemail.com> <4F820168.5000802@alum.mit.edu> Message-ID: <44AE94E7-2AC2-4671-A42C-9B6513823942@masklinn.net> On 2012-04-08, at 23:21 , Niko Matsakis wrote: > On 4/8/12 12:15 PM, Stefan Plantikow wrote: >> Hi, >> >> I was thinking about this, too. One of the state-of-the art algorithms seems to be hopscotch hashing, wikipedia has a quite good introduction to it. 
Even though it has been developed for concurrent access, it should also be quite good in a single core scenario and has a really low memory footprint (90% full hash table still works reasonably). I was thinking about implementing that for fun once the currently ongoing changes to regions and vectors are complete. > > I am totally excited about offering a variety of map abstractions. One additional thing I would particularly like (which is kind of orthogonal) is a default map implementation that begins as a straight-up list of some fixed size and then shifts to another algorithm as the table is populated. Of course we'd want to test and tune to see where it makes sense to shift, but I believe that a lot of hash tables are generally small (but not always) and thus can benefit from changing strategies as they fill up. In fact I would like to have a similar approach to each of the basic container types (a default implementation that adjusts and does the right thing, plus a variety of more specific implementations). Cocoa has a lot of that kind of things, if somebody wants to do it. From stefan.plantikow at googlemail.com Mon Apr 9 06:12:36 2012 From: stefan.plantikow at googlemail.com (Stefan Plantikow) Date: Mon, 9 Apr 2012 15:12:36 +0200 Subject: [rust-dev] hashmap benchmark In-Reply-To: <69366FC819F843EDA0DAE9BFA73E31DD@googlemail.com> References: <4F7B4538.40809@alum.mit.edu> <034275CD565A473F889C074C3E7BB564@googlemail.com> <4F81F5CD.3010803@mozilla.com> <1F872970AFAF45C5A4E36D43248A9971@googlemail.com> <4F820032.1080502@mozilla.com> <69366FC819F843EDA0DAE9BFA73E31DD@googlemail.com> Message-ID: <1E19F5BA12C446188A5982C4A26B1D6B@googlemail.com> Hi, > > #[abi = "rust-intrinsic"] > > native mod intrinsics { > > fn bswap_i16(i: i16) -> i16; > > } > > > > The rust intrinsics would all just be hardcoded into the compiler to > > translate to the appropriate llvm intrinsics. 
> > > > As an llvm intrinsic: > > > > #[abi = "llvm-intrinsic"] > > native mod intrinsics { > > #[link_name = "bswap.i16"] > > fn bswap_i16(i16: i16) -> i16; > > } > > > > In this case rustc probably doesn't need to know anything specific about > > the intrinsic - we just generate an intrinsic instruction with the given > > name and types. > > > > > is there any advantage from choosing one form over the other (maybe in terms of optimizability) beyond what already was mentioned? > turns out, ctlz and cttz need an i1 as a second argument, so I guess this means there is the need to implement this via rust-intrinsic. Stefan. From niko at alum.mit.edu Mon Apr 9 06:41:26 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Mon, 09 Apr 2012 06:41:26 -0700 Subject: [rust-dev] spawn on computer cluster In-Reply-To: References: Message-ID: <4F82E706.2000000@alum.mit.edu> On 4/8/12 9:25 PM, Mic wrote: > Hi, > Does spawn spread the task across computer nodes in a cluster eg like > in Julia http://julialang.org/manual/parallel-computing/ ? Currently, no. > Any plans maybe also to build spawn on top of MapReduce and HDFS API > for Hadoop like in Python > http://sourceforge.net/apps/mediawiki/pydoop/index.php?title=Main_Paget ? Currently, no. We are currently not targeting distributed computing. Some of the features in Rust (e.g., unique pointer transfer between tasks) are really intended for processes with shared memory. Nonetheless, the current design would permit a distributed implementation: all sendable things are also copyable (and tree-shaped, for that matter), which means that they could in theory be efficiently serialized and sent over the wire. However, as we evolve, there are some planned features that do not lend themselves so well to a distributed setting. For example, we would like to make use of regions to allow the construction of a message that has arbitrary shape (for example, a graph) and which can then be sent as a whole.
While it is of course possible to serialize graphs, it's just harder and slower. But I guess that so long as we stick to a strict "no shared memory" model (which I think we will) then a distributed implementation is always a possibility. (Data or small-scale task-parallelism, as discussed in the recent thread on ray tracing, is a different matter of course. That often only makes sense with shared memory. But we don't currently have any features targeting this.) Niko From ben.striegel at gmail.com Mon Apr 9 07:02:23 2012 From: ben.striegel at gmail.com (Benjamin Striegel) Date: Mon, 9 Apr 2012 10:02:23 -0400 Subject: [rust-dev] Shapiro: BitC isn't going to work In-Reply-To: <4F6E4873.1070401@mozilla.com> References: <4F6E4873.1070401@mozilla.com> Message-ID: Here's Jonathan Shapiro's followup where he talks specifically about typeclasses: http://www.bitc-lang.org/pipermail/bitc-dev/2012-April/003315.html I'm interested to know if any of the specific issues he raises (multiple instantiation, operator overloading, a desire to emulate inheritance) apply to Rust. On Sat, Mar 24, 2012 at 6:19 PM, Patrick Walton wrote: > On 03/24/2012 01:57 PM, Sebastian Sylvan wrote: > >> Here's a note by Jonathan Shapiro saying that BitC is no longer going >> to work: http://www.coyotos.org/pipermail/bitc-dev/2012-March/003300.html >> >> It had many of the same goals as Rust, so it may be interesting to >> this mailing list to learn from BitC. >> >> >> > Very interesting. I posted my thoughts on Hacker News (along with some > comparisons to Go): > > http://news.ycombinator.com/item?id=3750882 > > Patrick > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From graydon at mozilla.com Mon Apr 9 10:58:47 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Mon, 09 Apr 2012 10:58:47 -0700 Subject: [rust-dev] idea: access modifiers as part of function signature In-Reply-To: <4F807375.80200@gmail.com> References: <4F7EEBEB.1050409@gmail.com> <4F7F4C3D.1040109@mozilla.com> <4F8042C5.9090608@gmail.com> <4F807375.80200@gmail.com> Message-ID: <4F832357.6040902@mozilla.com> On 12-04-07 10:03 AM, Kobi Lurie wrote: > I feel that most of the times the distinction of ref/value is about low > level performance. It is sometimes, but passing a mutable reference says pretty clearly "this function is going to change this argument". I think we already support (and will continue to support) expressing this sort of difference via reference, mutable-reference, move and value modes. -Graydon From graydon at mozilla.com Mon Apr 9 11:02:23 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Mon, 09 Apr 2012 11:02:23 -0700 Subject: [rust-dev] idea: specific visibility In-Reply-To: <4F81D556.6090203@gmail.com> References: <4F81D556.6090203@gmail.com> Message-ID: <4F83242F.7030502@mozilla.com> On 12-04-08 11:13 AM, Kobi Lurie wrote: > this idea can be extended to functions, and checked by the compiler. > for example, a member can declare that only the setter can change it. > same thing for a getter. > sometimes a dll has a lot of inter-related functionality inside, and you > want a certain "internal" function to be visible, but only used by > certain functions. Rust has a visibility-control system already in the form of 'export'.
It's going to be modified/rewritten in the near future either based on this bug: https://github.com/mozilla/rust/issues/1893 And/or this proposal: https://mail.mozilla.org/pipermail/rust-dev/2012-March/001464.html this will combine (hopefully) with the work on classes, outlined in this bug: https://github.com/mozilla/rust/issues/1726 -Graydon From banderson at mozilla.com Mon Apr 9 11:47:08 2012 From: banderson at mozilla.com (Brian Anderson) Date: Mon, 09 Apr 2012 11:47:08 -0700 Subject: [rust-dev] hashmap benchmark In-Reply-To: <69366FC819F843EDA0DAE9BFA73E31DD@googlemail.com> References: <4F7B4538.40809@alum.mit.edu> <034275CD565A473F889C074C3E7BB564@googlemail.com> <4F81F5CD.3010803@mozilla.com> <1F872970AFAF45C5A4E36D43248A9971@googlemail.com> <4F820032.1080502@mozilla.com> <69366FC819F843EDA0DAE9BFA73E31DD@googlemail.com> Message-ID: <4F832EAC.80604@mozilla.com> On 04/09/2012 05:36 AM, Stefan Plantikow wrote: > Hi, > > > Am Sonntag, 8. April 2012 um 23:16 schrieb Brian Anderson: > >> #[abi = "rust-intrinsic"] >> native mod intrinsics { >> fn bswap_i16(i: i16) -> i16; >> } >> >> The rust intrinsics would all just be hardcoded into the compiler to >> translate to the appropriate llvm intrinsics. >> >> As an llvm intrinsic: >> >> #[abi = "llvm-intrinsic"] >> native mod intrinsics { >> #[link_name = "bswap.i16"] >> fn bswap_i16(i16: i16) -> i16; >> } >> >> In this case rustc probably doesn't need to know anything specific about >> the intrinsic - we just generate an intrinsic instruction with the given >> name and types. > > is there any advantage from choosing one form over the other (maybe in terms of optimizability) beyond what already was mentioned? > I don't think so. 
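For anyone following the intrinsics part of this thread on a present-day toolchain, here is a minimal sketch of the three operations mentioned (bswap, ctlz, cttz). It is an assumption that the reader has a current Rust compiler: the code uses the integer methods of today's standard library, not the `native mod` forms proposed above, and the wrapper names are hypothetical, chosen only to mirror the names in the discussion.

```rust
/// Swap the two bytes of a 16-bit integer (the `bswap.i16` operation).
/// `bswap_i16` is a hypothetical wrapper name, not a compiler intrinsic.
pub fn bswap_i16(i: i16) -> i16 {
    i.swap_bytes()
}

/// Count leading zero bits of a 32-bit value (the `ctlz` operation).
pub fn ctlz_u32(x: u32) -> u32 {
    x.leading_zeros()
}

/// Count trailing zero bits of a 32-bit value (the `cttz` operation).
pub fn cttz_u32(x: u32) -> u32 {
    x.trailing_zeros()
}
```

Both counting methods are defined for a zero input (they return the bit width), so the caller does not have to supply the extra flag argument that came up for ctlz and cttz.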
-Brian From banderson at mozilla.com Mon Apr 9 12:06:00 2012 From: banderson at mozilla.com (Brian Anderson) Date: Mon, 09 Apr 2012 12:06:00 -0700 Subject: [rust-dev] Main page example compilation error In-Reply-To: <4F822F1E.8030800@mozilla.com> References: <4F822F1E.8030800@mozilla.com> Message-ID: <4F833318.5040401@mozilla.com> On 04/08/2012 05:36 PM, Patrick Walton wrote: > On 04/08/2012 05:35 PM, Alexander Stavonin wrote: >> Hi, I'm trying to compile main page example, but it failed with error: >> >> >> > rustc main.rs >> >> main.rs:5:0: 5:1 error: expecting in, found } >> >> main.rs:5 } >> >> >> The situation with example on main page makes "very small taste" of Rust >> not too good. Is it possible to fix example? > > The example is for the current git tip of Rust, not any released > version. Perhaps there should be a disclaimer to that effect... I've changed the example to be compatible with both. The docs are all still for tip though. From banderson at mozilla.com Mon Apr 9 13:00:04 2012 From: banderson at mozilla.com (Brian Anderson) Date: Mon, 09 Apr 2012 13:00:04 -0700 Subject: [rust-dev] enscripten demo? In-Reply-To: References: <4F7B3D57.3040004@mozilla.com> Message-ID: <4F833FC4.2050206@mozilla.com> On 04/04/2012 02:55 AM, Mohd. Bilal Husain wrote: > Passed a dumb sample rust bitcode to emscripten, got js functions#. > Realized I need to run on core modules too for printing simple hello > world. Took io from libcore, decimated code to avoid few build errors, > emcc throws error > > Unclear type in struct > > Anyways, need to figure out how to build native modules and core lib, > std lib; and how to map these modules to imports in a sample hello-world. I did some tinkering this weekend and found the following: * Rust code needs to be compiled for 32-bit targets. Emscripten is not heavily tested for 64-bit targets. 
* rustc should be invoked with --no-asm-comments because emscripten does not like asm * We were generating some bogus ll asm, fixed by #2167 * Emscripten doesn't handle quoted labels, fixed in my branch: https://github.com/brson/emscripten/tree/rust * Emscripten doesn't handle the `frem` instruction, also in my branch * Emscripten doesn't handle empty structs in some situations, filed here: https://github.com/kripken/emscripten/issues/364 From banderson at mozilla.com Mon Apr 9 13:16:54 2012 From: banderson at mozilla.com (Brian Anderson) Date: Mon, 09 Apr 2012 13:16:54 -0700 Subject: [rust-dev] enscripten demo? In-Reply-To: <4F833FC4.2050206@mozilla.com> References: <4F7B3D57.3040004@mozilla.com> <4F833FC4.2050206@mozilla.com> Message-ID: <4F8343B6.4040103@mozilla.com> On 04/09/2012 01:00 PM, Brian Anderson wrote: > On 04/04/2012 02:55 AM, Mohd. Bilal Husain wrote: >> Passed a dumb sample rust bitcode to emscripten, got js functions#. >> Realized I need to run on core modules too for printing simple hello >> world. Took io from libcore, decimated code to avoid few build >> errors, emcc throws error >> >> Unclear type in struct >> >> Anyways, need to figure out how to build native modules and core lib, >> std lib; and how to map these modules to imports in a sample >> hello-world. > > I did some tinkering this weekend and found the following: > * Emscripten doesn't handle quoted labels, fixed in my branch: > https://github.com/brson/emscripten/tree/rust This was fixed on emscripten master independently of my fix, so it's no longer in my branch From amitava.shee at gmail.com Mon Apr 9 14:26:23 2012 From: amitava.shee at gmail.com (Amitava Shee) Date: Mon, 9 Apr 2012 17:26:23 -0400 Subject: [rust-dev] How to build multiple .rs source files? Message-ID: Is there a starter project or a Makefile? 
When I try to compile a source file without linking, I get the following error amitava:learn amitava$ rustc -g -c shape.rs shape.rs:1:0: 1:0 error: main function not found shape.rs:1 class shape { ^ error: aborting due to previous errors How do I compile several source files to obj files and then link them together into an executable? Thanks & Regards, Amitava Shee From catamorphism at gmail.com Mon Apr 9 14:35:26 2012 From: catamorphism at gmail.com (Tim Chevalier) Date: Mon, 9 Apr 2012 14:35:26 -0700 Subject: [rust-dev] How to build multiple .rs source files? In-Reply-To: References: Message-ID: On Mon, Apr 9, 2012 at 2:26 PM, Amitava Shee wrote: > Is there a starter project or a Makefile? > > When I try to compile a source file without linking, I get the following > error > > amitava:learn amitava$ rustc -g -c shape.rs > shape.rs:1:0: 1:0 error: main function not found > shape.rs:1 class shape { > ^ > error: aborting due to previous errors > > How do I compile several source files to obj files and then link them > together into an executable? > Hi, Amitava -- A single source file has to contain a function named "main" in order to be compiled to an executable, much as in C/C++. Rust's way of organizing multiple source files is called a "crate". The tutorial explains how they work: http://doc.rust-lang.org/doc/tutorial.html#modules-and-crates Feel free to ask either here or on the Rust IRC channel (see https://github.com/mozilla/rust/wiki/Note-development-policy for details) if you have more questions. Cheers, Tim -- Tim Chevalier * http://catamorphism.org/ * Often in error, never in doubt "Debate is useless when one participant denies the full dignity of the other."
-- Eric Berndt From grahame at angrygoats.net Mon Apr 9 23:23:29 2012 From: grahame at angrygoats.net (Grahame Bowland) Date: Tue, 10 Apr 2012 14:23:29 +0800 Subject: [rust-dev] How to build multiple .rs source files? In-Reply-To: References: Message-ID: Hi Amitava I've attached a simple template from a project I'm working on. I've got a dependency "rust-csv" which I'm not building with cargo, but from a git submodule. I use stamp files for the libraries, as the output files are versioned and thus the filename produced changes. Hopefully this is of use. Grahame On 10 April 2012 05:26, Amitava Shee wrote: > Is there a starter project or a Makefile? > > When I try to compile a source file without linking, I get the following > error > > amitava:learn amitava$ rustc -g -c shape.rs > shape.rs:1:0: 1:0 error: main function not found > shape.rs:1 class shape { > ^ > error: aborting due to previous errors > > How do I compile several source files to obj files and then link them > together into an executable? > > Thanks & Regards, > Amitava Shee > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Makefile Type: application/octet-stream Size: 510 bytes Desc: not available URL: From stefan.plantikow at googlemail.com Tue Apr 10 02:42:27 2012 From: stefan.plantikow at googlemail.com (Stefan Plantikow) Date: Tue, 10 Apr 2012 11:42:27 +0200 Subject: [rust-dev] Shapiro: BitC isn't going to work In-Reply-To: <4F6E3F17.9000502@alum.mit.edu> References: <4F6E3F17.9000502@alum.mit.edu> Message-ID: <55E5C4AB1F3B422D9B4DB11E576ACB87@googlemail.com> Hi, > 3. Type class instance coherence (what I was calling the Hashtable > Problem)---we've still got some work to do here. 
missed that mail, can you please explain what you refer to by this? Thanks, Stefan From a.stavonin at gmail.com Tue Apr 10 04:07:41 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Tue, 10 Apr 2012 20:07:41 +0900 Subject: [rust-dev] sizeof type Message-ID: <86A40F4B-689E-4813-9E21-E8966D7A7981@gmail.com> Hi, I have a type which will be bind to appropriate C structure: type test_type = { val1 : i16; val2 : i32; } How could I get size of the test_type? How could I provide an information about alignment of the test_type? Regards, Alexander. From a.stavonin at gmail.com Tue Apr 10 05:16:02 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Tue, 10 Apr 2012 21:16:02 +0900 Subject: [rust-dev] **libc::c_char to ??? Message-ID: Hi all, it's again me. I have a C function returns array of null terminated strings. And I need to convert it to an Rust string type. C function declaration: const char** func(); Rust code: native mod c { fn func() -> **libc::c_char; } #[test] fn test_func() { let results = c::func(); // how to print all string in results??? } I've tried next idea without success: let v: [str] = methods; // mismatched types: expected `[str]` but found `**core::libc::types::os::arch::c95::c_char` (vector vs *-ptr) What is the best way to do it? Regards, Alexander. From a.stavonin at gmail.com Tue Apr 10 05:25:56 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Tue, 10 Apr 2012 21:25:56 +0900 Subject: [rust-dev] How to build multiple .rs source files? In-Reply-To: References: Message-ID: Where can I read something regarding class keyword? Or this is just misspell? > shape.rs:1 class shape { Alexander From amitava.shee at gmail.com Tue Apr 10 05:42:56 2012 From: amitava.shee at gmail.com (Amitava Shee) Date: Tue, 10 Apr 2012 08:42:56 -0400 Subject: [rust-dev] How to build multiple .rs source files? 
In-Reply-To: References: Message-ID: I found this file illustrative of the class construct: src/test/run-pass/classes.rs Regards, Amitava Shee On Tue, Apr 10, 2012 at 8:25 AM, Alexander Stavonin wrote: > Where can I read something regarding class keyword? Or this is just > misspell? > > > shape.rs:1 class shape { > > Alexander > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > -- Amitava Shee Software Architect There are two ways of constructing a software design. One is to make it so simple that there are obviously no deficiencies; the other is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult. -- C. A. R. Hoare The Emperor's Old Clothes, CACM February 1981 From hsivonen at iki.fi Tue Apr 10 05:53:42 2012 From: hsivonen at iki.fi (Henri Sivonen) Date: Tue, 10 Apr 2012 15:53:42 +0300 Subject: [rust-dev] Fall-through in alt, break&continue by label Message-ID: It appears that Rust does not have labeled loops with break and continue by label the way Java has. Also, it appears that alt does not have fall-through the way switch in C has. Are break and continue by label and/or fall-through in alt supported in some non-obvious and unadvertised way? If not, are there plans to add these features? (While I understand that fall-through in switch is largely considered a misfeature, break and continue by label seem less controversial.) If there are no plans to add these features, what are the recommended ways to emulate these features in a way that compiles to efficient machine code? The use case I have is targeting Rust with the translator that currently targets C++ and generates the HTML parser in Gecko. (It uses goto hidden behind macros to emulate break and continue by label in C++.)
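For readers coming to this question later: Rust did eventually gain loop labels, and `alt` was renamed `match`. The sketch below is written in present-day syntax, so it is not something the April 2012 compiler accepts, and the function names are made up for illustration. It shows break and continue by label, plus the usual substitute for fall-through, which is to merge arms that share a body with `|`.

```rust
/// Return the (row, column) of the first negative entry, if any.
/// A zero entry means "skip the rest of this row".
pub fn first_negative(grid: &[Vec<i32>]) -> Option<(usize, usize)> {
    let mut found = None;
    'rows: for (r, row) in grid.iter().enumerate() {
        for (c, &v) in row.iter().enumerate() {
            if v == 0 {
                // Continue the *outer* loop, abandoning this row.
                continue 'rows;
            }
            if v < 0 {
                found = Some((r, c));
                // Leave both loops at once.
                break 'rows;
            }
        }
    }
    found
}

/// `match` has no fall-through; arms with the same body are merged with `|`.
pub fn classify(b: u8) -> &'static str {
    match b {
        b' ' | b'\t' | b'\n' | b'\r' => "whitespace",
        b'<' => "tag-open",
        _ => "other",
    }
}
```

True fall-through, where one arm runs its own code and then the next arm's code as well, still has to be expressed some other way, for example by factoring the shared tail into a function or by looping over an explicit state variable.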
-- Henri Sivonen hsivonen at iki.fi http://hsivonen.iki.fi/ From amitava.shee at gmail.com Tue Apr 10 05:56:51 2012 From: amitava.shee at gmail.com (Amitava Shee) Date: Tue, 10 Apr 2012 08:56:51 -0400 Subject: [rust-dev] How to build multiple .rs source files? In-Reply-To: References: Message-ID: Thanks. Just to confirm my understanding - rustc deviates from the gcc way of generating object files and linking them together without first packaging them into libraries. Is there a way to compile and link several .rs files into a single executable without an intermediate library? -Amitava On Tue, Apr 10, 2012 at 2:23 AM, Grahame Bowland wrote: > Hi Amitava > > I've attached a simple template from a project I'm working on. I've got a > dependency "rust-csv" which I'm not building with cargo, but from a git > submodule. > > I use stamp files for the libraries, as the output files are versioned and > thus the filename produced changes. > > Hopefully this is of use. > > Grahame > > On 10 April 2012 05:26, Amitava Shee wrote: > >> Is there a starter project or a Makefile? >> >> When I try to compile a source file without linking, I get the following >> error >> >> amitava:learn amitava$ rustc -g -c shape.rs >> shape.rs:1:0: 1:0 error: main function not found >> shape.rs:1 class shape { >> ^ >> error: aborting due to previous errors >> >> How do I compile several source files to obj files and then link them >> together into an executable? >> >> Thanks & Regards, >> Amitava Shee >> >> _______________________________________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/listinfo/rust-dev >> >> > -- Amitava Shee Software Architect There are two ways of constructing a software design. One is to make it so simple that there are obviously no deficiencies; the other is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult. -- C. A. R. 
Hoare The Emperor's Old Clothes, CACM February 1981

From niko at alum.mit.edu Tue Apr 10 06:18:43 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Tue, 10 Apr 2012 06:18:43 -0700
Subject: [rust-dev] Shapiro: BitC isn't going to work
In-Reply-To: <55E5C4AB1F3B422D9B4DB11E576ACB87@googlemail.com>
References: <4F6E3F17.9000502@alum.mit.edu> <55E5C4AB1F3B422D9B4DB11E576ACB87@googlemail.com>
Message-ID: <4F843333.7040601@alum.mit.edu>

On 4/10/12 2:42 AM, Stefan Plantikow wrote:
> Hi,
>
>> 3. Type class instance coherence (what I was calling the Hashtable
>> Problem)---we've still got some work to do here.
>
> missed that mail, can you please explain what you refer to by this?

This mail describes what I still believe to be the best solution:

https://mail.mozilla.org/pipermail/rust-dev/2011-December/001036.html

Niko

From niko at alum.mit.edu Tue Apr 10 06:27:20 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Tue, 10 Apr 2012 06:27:20 -0700
Subject: [rust-dev] sizeof type
In-Reply-To: <86A40F4B-689E-4813-9E21-E8966D7A7981@gmail.com>
References: <86A40F4B-689E-4813-9E21-E8966D7A7981@gmail.com>
Message-ID: <4F843538.2070702@alum.mit.edu>

Hello,

The `sys` module in core has `size_of()` and `align_of()` functions. So you could write `sys::size_of::<test_type>()`, for example.

Niko

On 4/10/12 4:07 AM, Alexander Stavonin wrote:
> Hi,
>
> I have a type which will be bind to appropriate C structure:
>
> type test_type = {
>     val1 : i16;
>     val2 : i32;
> }
>
> How could I get size of the test_type? How could I provide an information about alignment of the test_type?
>
> Regards,
> Alexander.
> _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From niko at alum.mit.edu Tue Apr 10 06:28:57 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 10 Apr 2012 06:28:57 -0700 Subject: [rust-dev] How to build multiple .rs source files? In-Reply-To: References: Message-ID: <4F843599.3070105@alum.mit.edu> On 4/10/12 5:56 AM, Amitava Shee wrote: > Just to confirm my understanding - rustc deviates from the gcc way of > generating object files and linking them together without first > packaging them into libraries. We prefer to say "improves upon", but yes. =) > Is there a way to compile and link several .rs files into a single > executable without an intermediate library? There is no need for an intermediate library. A .rc file can directly produce an application. There is, however, no way to link together multiple .rs files without an .rc file. Niko From niko at alum.mit.edu Tue Apr 10 06:42:07 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 10 Apr 2012 06:42:07 -0700 Subject: [rust-dev] Fall-through in alt, break&continue by label In-Reply-To: References: Message-ID: <4F8438AF.9090003@alum.mit.edu> On 4/10/12 5:53 AM, Henri Sivonen wrote: > The use case I have is targeting Rust with the translator that > currently targets C++ and generates the HTML parser in Gecko. (It uses > goto hidden behind macros to emulate break and continue by label in > C++.) There is currently no way to do that kind of control flow beyond using flags with `if` checks or restructuring the code in some other way (tail calls, if they worked, seem like they would be useful). I believe our `break` can only target loops in any case. How hard would it be do you think to prototype a version that avoids these control-flow features? Also, how important is fall-through for alt vs break to labeled blocks? 
I think adding labeled blocks/loops and the ability to break/continue with a label is plausible, but a fair bit of work. Fall-through in alt seems less likely. Certainly it would be good to do some experiments and measurements, either of your parser or of micro-benchmarks. Niko From amitava.shee at gmail.com Tue Apr 10 07:56:28 2012 From: amitava.shee at gmail.com (Amitava Shee) Date: Tue, 10 Apr 2012 10:56:28 -0400 Subject: [rust-dev] How to build multiple .rs source files? In-Reply-To: <4F843599.3070105@alum.mit.edu> References: <4F843599.3070105@alum.mit.edu> Message-ID: Thanks. For those still in baby step phase, I have the most rudimentary project in github - https://github.com/ashee/rust-babysteps Please note that .rc implicitly references .rs that contains the main() entry point. -Amitava On Tue, Apr 10, 2012 at 9:28 AM, Niko Matsakis wrote: > On 4/10/12 5:56 AM, Amitava Shee wrote: > >> Just to confirm my understanding - rustc deviates from the gcc way of >> generating object files and linking them together without first packaging >> them into libraries. >> > > We prefer to say "improves upon", but yes. =) > > > Is there a way to compile and link several .rs files into a single >> executable without an intermediate library? >> > > There is no need for an intermediate library. A .rc file can directly > produce an application. There is, however, no way to link together > multiple .rs files without an .rc file. > > Niko > -- Amitava Shee Software Architect There are two ways of constructing a software design. One is to make it so simple that there are obviously no deficiencies; the other is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult. -- C. A. R. Hoare The Emperor's Old Clothes, CACM February 1981 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From niko at alum.mit.edu Tue Apr 10 09:56:20 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 10 Apr 2012 09:56:20 -0700 Subject: [rust-dev] How to build multiple .rs source files? In-Reply-To: References: <4F843599.3070105@alum.mit.edu> Message-ID: <4F846634.5090101@alum.mit.edu> On 4/10/12 7:56 AM, Amitava Shee wrote: > For those still in baby step phase, I have the most rudimentary > project in github - > https://github.com/ashee/rust-babysteps This look about right, except of course that the dependencies in the Makefile are incomplete You probably want "myapp: myapp.rc $(wildcard *.rs)" or something like that. Otherwise changing, for example, amod.rs will not trigger a recompilation. Or you could just declare "myapp" as `.PHONY` (my personal preference, since I am lazy and I'd rather recompile than miss a dependency). Niko From catamorphism at gmail.com Tue Apr 10 11:30:46 2012 From: catamorphism at gmail.com (Tim Chevalier) Date: Tue, 10 Apr 2012 11:30:46 -0700 Subject: [rust-dev] How to build multiple .rs source files? In-Reply-To: References: Message-ID: On Tue, Apr 10, 2012 at 5:25 AM, Alexander Stavonin wrote: > Where can I read something regarding class keyword? Or this is just misspell? > >> shape.rs:1 class shape { Classes are still experimental and the documentation hasn't been updated yet. In the meantime, Amitava is correct -- you can grep for "class" under rustc/src/test/run-pass. I realize this isn't ideal, it's just that the docs are lagging behind the code in this case, which is entirely my fault. (Part of the reason is that the syntax for classes is still in flux.) You can also read https://github.com/mozilla/rust/issues/1726 and https://mail.mozilla.org/pipermail/rust-dev/2011-November/000929.html -- but that's a bit risky since parts of the latter were superseded by ifaces/impls, which were implemented later. At least that should give you more examples. 
Cheers,
Tim

--
Tim Chevalier * http://catamorphism.org/ * Often in error, never in doubt
"Debate is useless when one participant denies the full dignity of the other." -- Eric Berndt

From banderson at mozilla.com Tue Apr 10 14:22:46 2012
From: banderson at mozilla.com (Brian Anderson)
Date: Tue, 10 Apr 2012 14:22:46 -0700
Subject: [rust-dev] **libc::c_char to ???
In-Reply-To:
References:
Message-ID: <4F84A4A6.7080701@mozilla.com>

On 04/10/2012 05:16 AM, Alexander Stavonin wrote:
> Hi all, it's again me.
>
> I have a C function returns array of null terminated strings. And I need to convert it to an Rust string type.
>
> C function declaration:
>
> const char** func();
>
> Rust code:
>
> native mod c {
>     fn func() -> **libc::c_char;
> }
>
> #[test]
> fn test_func() {
>     let results = c::func();
>     // how to print all string in results???
> }
>
> I've tried next idea without success:
>
> let v: [str] = methods; // mismatched types: expected `[str]` but found `**core::libc::types::os::arch::c95::c_char` (vector vs *-ptr)
>
>
> What is the best way to do it?

Something like:

    let buf = func();
    let buflen = buf_len(buf);
    let strs = unsafe {
        // Copy the `buflen` C string pointers into a Rust vector,
        // then convert each one to a Rust string.
        let cstrs: [*c_char] = vec::unsafe::from_buf(buf, buflen);
        vec::map(cstrs) {|cstr| str::unsafe::from_c_str(cstr) }
    };

The problem is that `buf_len` doesn't exist. We could probably use some iterators over unsafe pointers in core.

Assuming that your array of string pointers is null terminated, buf_len might look like:

    unsafe fn buf_len(buf: **c_char) -> uint {
        // Index of the first null entry, i.e. the number of strings.
        position(buf) {|i| i == ptr::null() }
    }

    // This should probably be in core::ptr
    unsafe fn position<T>(buf: *T, f: fn(T) -> bool) -> uint {
        let mut offset = 0u;
        loop {
            // Read the element `offset` slots past `buf` and test it.
            if f(*ptr::offset(buf, offset)) { ret offset; }
            else { offset += 1u; }
        }
    }

From a.stavonin at gmail.com Tue Apr 10 20:07:03 2012
From: a.stavonin at gmail.com (Alexander Stavonin)
Date: Wed, 11 Apr 2012 12:07:03 +0900
Subject: [rust-dev] crate_type = "lib" and testing
Message-ID: <9E6334D8-6811-4239-A29F-096ABB646CEE@gmail.com>

Hi all,

I've run into strange behavior of the #[test] command. Could someone explain to me whether it is a bug or a feature?

--------- test.rc ----------
#[link (name="test",
        vers = "0.1",
        uuid = "B019C86D-C7ED-4263-810E-B12A33E6954C")];
#[crate_type = "lib"];

use std;
mod test;
------ END -------

--------- test.rs ----------
fn foo() -> bool { ret true }

#[test]
fn foo_test() {
    assert foo() == true;
}
------ END -------

Compiling and running it:

astavonin:/Users/../RustTests: rustc --test test.rc
warning: no debug symbols in executable (-arch x86_64)
astavonin:/Users/../RustTests: ./test

running 2 tests
test foo_test ... ok
test test::foo_test ... ok

result: ok. 2 passed; 0 failed; 0 ignored

Question: Why were 2 tests created from just 1 foo_test() test function? Also, it looks like both tests are running in parallel, and sometimes this provokes segmentation faults in more complex tests.

Regards,
Alexander.

From niko at alum.mit.edu Tue Apr 10 20:28:36 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Tue, 10 Apr 2012 20:28:36 -0700
Subject: [rust-dev] crate_type = "lib" and testing
In-Reply-To: <9E6334D8-6811-4239-A29F-096ABB646CEE@gmail.com>
References: <9E6334D8-6811-4239-A29F-096ABB646CEE@gmail.com>
Message-ID: <4F84FA64.1070609@alum.mit.edu>

The reason is that there is no need for the `mod test` directive in test.rc. This effectively loads the `test.rs` file twice, once in the root namespace and once as the module `test`.

Niko

On 4/10/12 8:07 PM, Alexander Stavonin wrote:
> Hi all,
>
> I've faced with strange behavior of #[test] command. Could someone
> explain me is it bug or feature.
>
> --------- test.rc ----------
> #[link (name="test",
> vers = "0.1",
> uuid = "B019C86D-C7ED-4263-810E-B12A33E6954C")];
> #[crate_type = "lib"];
>
> use std;
> mod test;
> ------ END -------
>
>
> --------- test.rs ----------
> fn foo() -> bool { ret true }
>
> #[test]
> fn foo_test() {
> assert foo() == true;
> }
> ------ END -------
>
> Compiling and run it:
>
> astavonin:/Users/../RustTests: rustc --test test.rc
> warning: no debug symbols in executable (-arch x86_64)
> astavonin:/Users/../RustTests: ./test
>
> running *_2_* tests
> test foo_test ... ok
> test test::foo_test ... ok
>
> result: ok. 2 passed; 0 failed; 0 ignored
>
> Question: Why 2 tests was created from just 1 foo_test() test
> function? Also, it's looks like both tests are running parallel and
> sometimes it provoke segmentation faults in case of more complex tests.
>
> Regards,
> Alexander.
> > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From mictadlo at gmail.com Wed Apr 11 02:54:47 2012 From: mictadlo at gmail.com (Mic) Date: Wed, 11 Apr 2012 19:54:47 +1000 Subject: [rust-dev] class.new Message-ID: Hello, How about to create an instance with class.new like in Ruby, because in the class is a new method? class cat { priv { let mutable x : int; fn meow() { log_err "Meow"; } } let y : int; new(in_x : int, in_y : int) { x = in_x; self.y = in_y; } fn speak() { meow(); } fn eat() { ... } } let c : cat = cat(1, 2).*new*; c.speak(); -------------- next part -------------- An HTML attachment was scrubbed... URL: From amitava.shee at gmail.com Wed Apr 11 05:11:43 2012 From: amitava.shee at gmail.com (Amitava Shee) Date: Wed, 11 Apr 2012 08:11:43 -0400 Subject: [rust-dev] class.new In-Reply-To: References: Message-ID: You don't need to invoke new - let c : cat = cat(1, 2); will do. -Amitava On Wed, Apr 11, 2012 at 5:54 AM, Mic wrote: > Hello, > How about to create an instance with class.new like in Ruby, because in > the class is a new method? > > class cat { > priv { > let mutable x : int; > fn meow() { log_err "Meow"; } > } > > let y : int; > > new(in_x : int, in_y : int) { x = in_x; self.y = in_y; } > > fn speak() { meow(); } > > fn eat() { ... } > } > > > let c : cat = cat(1, 2).*new*; > c.speak(); > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -- Amitava Shee Software Architect There are two ways of constructing a software design. One is to make it so simple that there are obviously no deficiencies; the other is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult. -- C. A. R. Hoare The Emperor's Old Clothes, CACM February 1981 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mictadlo at gmail.com Wed Apr 11 06:13:34 2012 From: mictadlo at gmail.com (Mic) Date: Wed, 11 Apr 2012 23:13:34 +1000 Subject: [rust-dev] class.new In-Reply-To: References: Message-ID: Thank you. I did a mistake I meant cat.*new*(1, 2). It would be easier to distinguish between a function cat(1,2) and a class cat(1,2). On Wed, Apr 11, 2012 at 10:11 PM, Amitava Shee wrote: > You don't need to invoke new - > > let c : cat = cat(1, 2); > > will do. > > -Amitava > > On Wed, Apr 11, 2012 at 5:54 AM, Mic wrote: > >> Hello, >> How about to create an instance with class.new like in Ruby, because in >> the class is a new method? >> >> class cat { >> priv { >> let mutable x : int; >> fn meow() { log_err "Meow"; } >> } >> >> let y : int; >> >> new(in_x : int, in_y : int) { x = in_x; self.y = in_y; } >> >> fn speak() { meow(); } >> >> fn eat() { ... } >> } >> >> >> let c : cat = cat(1, 2).*new*; >> c.speak(); >> >> >> _______________________________________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/listinfo/rust-dev >> >> > > > -- > Amitava Shee > Software Architect > > There are two ways of constructing a software design. One is to make it so > simple that there are obviously no deficiencies; the other is to make it so > complicated that there are no obvious deficiencies. The first method is far > more difficult. > -- C. A. R. Hoare The Emperor's Old Clothes, CACM February 1981 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.ronnquist at gmail.com Wed Apr 11 04:26:13 2012 From: peter.ronnquist at gmail.com (Peter Ronnquist) Date: Wed, 11 Apr 2012 13:26:13 +0200 Subject: [rust-dev] '-' as prefix to a function argument? 
Message-ID: Hi, I have a question regarding '-' as a prefix to a function argument as used in the task-perf-word-count.rs test file: ----------------------------------------------- rust-0.2\src\test\bench\task-perf-word-count.rs: fn map_reduce(-inputs: [str]) { .... fn main(argv: [str]) { let inputs = if vec::len(argv) < 2u { [input1(), input2(), input3()] } else { vec::map(vec::slice(argv, 1u, vec::len(argv)), {|f| result::get(io::read_whole_file_str(f)) }) }; let start = time::precise_time_ns(); map_reduce::map_reduce(inputs); .... ----------------------------------------------- What does the minus sign mean when used as a prefix for the argument "input" in the function map_reduce()? I looked for this in the tutorial and the reference manual but I could only find reference to '+' in the 7.4 "Argument passing styles" : "Then there is the by-copy style, written +." Thanks Peter Ronnquist From catamorphism at gmail.com Wed Apr 11 10:19:17 2012 From: catamorphism at gmail.com (Tim Chevalier) Date: Wed, 11 Apr 2012 10:19:17 -0700 Subject: [rust-dev] '-' as prefix to a function argument? In-Reply-To: References: Message-ID: On Wed, Apr 11, 2012 at 4:26 AM, Peter Ronnquist wrote: > Hi, > > I have a question regarding '-' as a prefix to a function argument as > used in the task-perf-word-count.rs test file: Hi, Peter -- The '-' prefix means that an argument is passed by move, meaning that when control passes to the callee, it becomes deinitialized at the call site. (Obviously, this means the caller has to pass an l-value.) This does not seem to be documented, and should be. Cheers, Tim -- Tim Chevalier * http://catamorphism.org/ * Often in error, never in doubt "Debate is useless when one participant denies the full dignity of the other." -- Eric Berndt From graydon at mozilla.com Wed Apr 11 10:21:27 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Wed, 11 Apr 2012 10:21:27 -0700 Subject: [rust-dev] '-' as prefix to a function argument? 
In-Reply-To: References: Message-ID: <4F85BD97.9090600@mozilla.com>

On 12-04-11 04:26 AM, Peter Ronnquist wrote:
> What does the minus sign mean when used as a prefix for the argument
> "input" in the function map_reduce()?

It's "move-in" mode. It works like '+' mode except that the caller is obliged to relinquish its ownership; it's an error for the caller to carry on using the argument after passing it, rather than an implicit copy being made.

Modes are likely to change substantially as we finish up the region-pointer system, hopefully during this development/release cycle. There may not be any concept of distinct argument-modes when we're done; we'll have to see.

For the time being I've added a brief description of '-' to the tutorial.

-Graydon

From niko at alum.mit.edu Wed Apr 11 10:24:15 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Wed, 11 Apr 2012 10:24:15 -0700
Subject: [rust-dev] '-' as prefix to a function argument?
In-Reply-To:
References:
Message-ID: <4F85BE3F.5080001@alum.mit.edu>

On 4/11/12 10:19 AM, Tim Chevalier wrote:
> The '-' prefix means that an argument is passed by move, meaning that
> when control passes to the callee, it becomes deinitialized at the
> call site. (Obviously, this means the caller has to pass an l-value.)
> This does not seem to be documented, and should be.

This is correct, except that it does not require that the caller use an lvalue. The value is indeed moved from the caller to the callee---and thus the caller must use a value that they own. If this value is an lvalue, then the lvalue must be of the sort that it can be deinitialized. Examples might help:

    fn take(-x: int) { ... }
    fn give(&x: int) {
        take(x);     // ERROR: x is not owned by give().
        let y = 3;
        take(y);     // OK
        let z = y;   // ERROR --- y was given away, can't use it anymore
        take(x+2);   // OK --- rvalue
    }

Moving vs copying is most important for unique values, as a move does not require that the unique value be cloned.
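The by-move rules described above can be sketched in present-day Rust, where passing an owned, non-`Copy` value moves it by default. This is only an illustration under that assumption, not the 2012 dialect used in the thread: the `-` mode syntax is replaced by ordinary pass-by-value, `Box<i32>` stands in for a unique value, and the names `take` and `give` simply mirror the example above.

```rust
// Sketch in present-day Rust (an assumption; not the 2012 syntax quoted above).
// Passing an owned, non-Copy value moves it, which plays the role of the
// old `-` argument mode.

/// Takes ownership of `x`; the caller cannot use the value it passed afterwards.
fn take(x: Box<i32>) -> i32 {
    *x
}

/// `x` is only borrowed here, so `give` does not own it and cannot move it out.
fn give(x: &Box<i32>) -> i32 {
    // take(*x);      // ERROR: cannot move out of `*x`, which is behind a reference
    let y = Box::new(3);
    let a = take(y);  // OK: `y` is owned by `give` and is moved into `take`
    // let z = y;     // ERROR: `y` was moved and cannot be used any more
    let b = take(Box::new(**x + 2)); // OK: a freshly built temporary (rvalue)
    a + b
}
```

Calling `give(&Box::new(1))` exercises both accepted calls; uncommenting either marked line makes the compiler reject the function, which is the present-day counterpart of the errors in the example above. No box is cloned along the way, only moved.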
Niko From pwalton at mozilla.com Wed Apr 11 13:28:05 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Wed, 11 Apr 2012 13:28:05 -0700 Subject: [rust-dev] Brace-free if and alt Message-ID: <4F85E955.3040605@mozilla.com> Here's a total bikeshed. Apologies in advance: There's been some criticism of Rust's syntax for being too brace-heavy. I've been thinking this for a while. Here's a minimal delta on the current syntax to address this: Examples: // before: if foo() == "bar" { 10 } else { 20 } // after: if foo() == "bar" then 10 else 20 // or: if foo() == "bar" { 10 } else { 20 } // before: alt foo() { "bar" { 10 } "baz" { 20 } "boo" { 30 } } // after: alt foo() { "bar" => 10, "baz" => 20, "boo" => 30 } // or: alt foo() { "bar" { 10 } "baz" { 20 } "boo" { 30 } } BNF: if ::== "if" expr ("then" expr | block) ("else" expr)? alt ::== "alt" expr "{" (arm* last-arm) "}" arm ::== block-arm | pat "=>" expr "," last-arm ::== block-arm | pat "=>" expr ","? block-arm ::== pat block You can think of it this way: We insert a "then" before the then-expression of each if; however, you can omit it if you use a block. We also insert a "=>" before each expression in an alt arm and a "," to separate expressions from subsequent patterns; however, both can be omitted if the arm expression is a block. This does, unfortunately, create the dangling else ambiguity. I'm not sure this is much of a problem in practice, but it might be an issue. The pretty printer would always omit the "then" and the "=>"/"," when the alt arm is a block. That way, we aren't introducing multiple preferred syntactic forms of the same Rust code (which I agree is generally undesirable); the blessed style is to never over-annotate when a "then" body or an alt expression is a block. Here's an example piece of code (Jonanin's emulator) written before-and-after: Before: https://github.com/Jonanin/rust-dcpu16/blob/master/asm.rs After: https://gist.github.com/2360838 Thoughts? 
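For comparison, here is a sketch of how the two arm styles from the proposal read in present-day Rust, which spells the construct `match` rather than `alt`. The function names and the wildcard arms are illustrative additions (a `match` on a string has to be exhaustive), and unlike the BNF above, today's Rust keeps the `=>` even when the arm body is a block.

```rust
// Present-day Rust, shown only to illustrate the arm styles discussed above.

/// Expression arms: `pattern => expr,` with a comma separating the arms.
fn lookup(s: &str) -> i32 {
    match s {
        "bar" => 10,
        "baz" => 20,
        "boo" => 30,
        _ => 0, // added for exhaustiveness; not part of the proposal's example
    }
}

/// Block arms: when the arm body is a block, the trailing comma may be omitted.
fn lookup_blocks(s: &str) -> i32 {
    match s {
        "bar" => { 10 }
        "baz" => { 20 }
        "boo" => { 30 }
        _ => { 0 }
    }
}

/// `if` used as an expression; both branches are written as blocks.
fn choose(s: &str) -> i32 {
    if s == "bar" { 10 } else { 20 }
}
```

Both `lookup` and `lookup_blocks` return the same values for the same input, so the choice between the two forms is purely one of layout.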
Patrick From banderson at mozilla.com Wed Apr 11 13:47:41 2012 From: banderson at mozilla.com (Brian Anderson) Date: Wed, 11 Apr 2012 13:47:41 -0700 Subject: [rust-dev] crate_type = "lib" and testing In-Reply-To: <4F84FA64.1070609@alum.mit.edu> References: <9E6334D8-6811-4239-A29F-096ABB646CEE@gmail.com> <4F84FA64.1070609@alum.mit.edu> Message-ID: <4F85EDED.8080801@mozilla.com> On 04/10/2012 08:28 PM, Niko Matsakis wrote: > the reason is that there is no need for the `mod test` directive in > test.rc. This effectively creates loads the `test.rs` file twice, once > in the root namespace and once as the module `test`. > There's a bug open[1] to change this behavior since it's a frequent source of confusion. [1] https://github.com/mozilla/rust/issues/1277 From banderson at mozilla.com Wed Apr 11 13:50:59 2012 From: banderson at mozilla.com (Brian Anderson) Date: Wed, 11 Apr 2012 13:50:59 -0700 Subject: [rust-dev] crate_type = "lib" and testing In-Reply-To: <9E6334D8-6811-4239-A29F-096ABB646CEE@gmail.com> References: <9E6334D8-6811-4239-A29F-096ABB646CEE@gmail.com> Message-ID: <4F85EEB3.8050408@mozilla.com> On 04/10/2012 08:07 PM, Alexander Stavonin wrote: > Hi all, > > Question: Why 2 tests was created from just 1 foo_test() test function? > Also, it's looks like both tests are running parallel and sometimes it > provoke segmentation faults in case of more complex tests. > Yes, tests are run in parallel. I would be interested to see what tests this causes issues for. From personal experience I've encountered difficulties when tests modify global state, like getenv/setenv. I also ran into a major problem with bindings to a library that wasn't threadsafe. As a hack, if you run with RUST_THREADS=1, the tests will not run in parallel. 
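The difficulty with tests that touch global state can be sketched as follows, in present-day Rust rather than the 2012 dialect. The `SETTING` static and both functions are hypothetical and not taken from any project mentioned in this thread; the sketch only assumes that tests may run on several threads at once, as described above, and shows one way to keep a write-then-read sequence from being interleaved with another thread's.

```rust
use std::sync::Mutex;
use std::thread;

// Hypothetical piece of global state that several tests read and write.
static SETTING: Mutex<i32> = Mutex::new(0);

/// Writes `value` to the global and reads it back while still holding the
/// lock, so no other thread can change it in between.
fn set_and_check(value: i32) -> bool {
    let mut guard = SETTING.lock().unwrap();
    *guard = value;
    thread::yield_now(); // give other threads a chance to run; they block on the lock
    *guard == value
}

/// Calls `set_and_check` from several threads at once, roughly the way a
/// parallel test runner would, and reports whether every call saw its own value.
fn run_in_parallel(n: i32) -> bool {
    let handles: Vec<_> = (0..n)
        .map(|i| thread::spawn(move || set_and_check(i)))
        .collect();
    handles.into_iter().all(|h| h.join().unwrap())
}
```

Without the lock, a thread could observe a value written by another thread between its own write and read, which is the kind of interference seen with getenv/setenv-style globals. With today's test harness, the counterpart of the `RUST_THREADS=1` hack is to pass `--test-threads=1` to the test binary.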
From mictadlo at gmail.com Wed Apr 11 03:02:41 2012 From: mictadlo at gmail.com (Mic) Date: Wed, 11 Apr 2012 20:02:41 +1000 Subject: [rust-dev] c++ interface Message-ID: Hello, any plans to support C++ like in D http://dlang.org/cpp_interface.html ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Tue Apr 10 12:26:41 2012 From: masklinn at masklinn.net (Masklinn) Date: Tue, 10 Apr 2012 21:26:41 +0200 Subject: [rust-dev] Class UI Message-ID: I was reading http://smallcultfollowing.com/babysteps/blog/2012/04/09/rusts-object-system/ today, and saw the description of the classes definition. Next to it, Nicholas notes: > I am not fond of the definition of constructors, in particular I can only agree, for a simple reason: the example is that of an initializer, but uses naming generally used for actual constructors. Let's back up to what I mean: the role of a constructor is to construct, so it would take nothing, allocate a chunk of memory and put object content in it, then return the type-tagged chunk of memory (or a reference to it with runtime/GC cooperation). An initializer is the part the constructor delegates to for the "put object content in it", all arguments to the constructor are forwarded to the initializer but the initializer gets a "ready to use" object and makes it "readier" by initializing fields if needs be. I realize some languages (mostly in the Java line) call the initializer "constructor" and give no access to the constructor itself, that's fine. There's also C++ which gives access to both as respectively the constructor and *the new operator*. 
The latter part hints at what bothers me with the current syntax/naming: languages giving access to actual constructors (the "class method" which handles allocation and fully creates an instances from scratch) very often use the `new` naming: C++ uses `operator new` (called through the likewise named operator), Python uses `__new__` (called through the call operator `()`), Ruby uses `new` (called directly on the class), and I think that's how Perl5 seems to work (the instance is created within a proc called `new` ? from scratch ? then blessed and returned). The only significant differer I've found seems to be F#. As a result, really don't think Rust should call its initializer `new`. Although I *do* think it would be neat if Rust provided an actual constructor which would have to return some sort of instance build from scratch using core memory allocation thingies. But whether it provides a constructor or not, it should not name its initializer `new`. `init` is sometimes used, at least in javascript "class" libraries as well as in Objective-C[0] (and in Python, with dunders since it's a magic method) (Ruby uses `initialize`) [0] Yeah I know it's a bit weirder in Obj-C as it can return the instance as well, but the memory for the instance is setup in a preceding `alloc` call so it still works. From ben.striegel at gmail.com Wed Apr 11 13:52:39 2012 From: ben.striegel at gmail.com (Benjamin Striegel) Date: Wed, 11 Apr 2012 16:52:39 -0400 Subject: [rust-dev] Brace-free if and alt In-Reply-To: <4F85E955.3040605@mozilla.com> References: <4F85E955.3040605@mozilla.com> Message-ID: Really not a fan of the alternative `if` syntax, and I think that the problem of "too many braces" applies primarily to `alt` expressions anyway. 
Another reason to be wary of your `if` proposal is that it makes Rust's already-a-little-scary semicolon rules a bit harder to express ("you don't need a semicolon after a closing brace in a control flow expression statement that is not a component of a larger statement" is already about at the threshold of credulity without adding "...or immediately after an `if` statement making use of the alternative syntax"). An alternative `alt` syntax could be cool. One thing that I like about the current syntax with the required braces is that it flows nicely from the understanding "this is the end of a block, therefore this is where the implicit `ret` exists", which is handy enough that it helps get over the initial anxiety of semicolon significance. What if your alternative syntax used a colon rather than a fat arrow, to mirror record literal syntax? On Wed, Apr 11, 2012 at 4:28 PM, Patrick Walton wrote: > Here's a total bikeshed. Apologies in advance: > > There's been some criticism of Rust's syntax for being too brace-heavy. > I've been thinking this for a while. Here's a minimal delta on the current > syntax to address this: > > Examples: > > // before: > if foo() == "bar" { 10 } else { 20 } > > // after: > if foo() == "bar" then 10 else 20 > // or: > if foo() == "bar" { 10 } else { 20 } > > // before: > alt foo() { > "bar" { 10 } > "baz" { 20 } > "boo" { 30 } > } > > // after: > alt foo() { > "bar" => 10, > "baz" => 20, > "boo" => 30 > } > // or: > alt foo() { > "bar" { 10 } > "baz" { 20 } > "boo" { 30 } > } > > BNF: > > if ::== "if" expr ("then" expr | block) ("else" expr)? > alt ::== "alt" expr "{" (arm* last-arm) "}" > arm ::== block-arm | pat "=>" expr "," > last-arm ::== block-arm | pat "=>" expr ","? > block-arm ::== pat block > > You can think of it this way: We insert a "then" before the > then-expression of each if; however, you can omit it if you use a block. 
We > also insert a "=>" before each expression in an alt arm and a "," to > separate expressions from subsequent patterns; however, both can be omitted > if the arm expression is a block. > > This does, unfortunately, create the dangling else ambiguity. I'm not sure > this is much of a problem in practice, but it might be an issue. > > The pretty printer would always omit the "then" and the "=>"/"," when the > alt arm is a block. That way, we aren't introducing multiple preferred > syntactic forms of the same Rust code (which I agree is generally > undesirable); the blessed style is to never over-annotate when a "then" > body or an alt expression is a block. > > Here's an example piece of code (Jonanin's emulator) written > before-and-after: > > Before: https://github.com/Jonanin/**rust-dcpu16/blob/master/asm.rs > After: https://gist.github.com/**2360838 > > Thoughts? > > Patrick > ______________________________**_________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/**listinfo/rust-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pwalton at mozilla.com Wed Apr 11 13:54:37 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Wed, 11 Apr 2012 13:54:37 -0700 Subject: [rust-dev] Brace-free if and alt In-Reply-To: References: <4F85E955.3040605@mozilla.com> Message-ID: <4F85EF8D.5060003@mozilla.com> On 4/11/12 1:52 PM, Benjamin Striegel wrote: > Really not a fan of the alternative `if` syntax, and I think that the > problem of "too many braces" applies primarily to `alt` expressions > anyway. 
Another reason to be wary of your `if` proposal is that it makes > Rust's already-a-little-scary semicolon rules a bit harder to express > ("you don't need a semicolon after a closing brace in a control flow > expression statement that is not a component of a larger statement" is > already about at the threshold of credulity without adding "...or > immediately after an `if` statement making use of the alternative syntax"). > > An alternative `alt` syntax could be cool. One thing that I like about > the current syntax with the required braces is that it flows nicely from > the understanding "this is the end of a block, therefore this is where > the implicit `ret` exists", which is handy enough that it helps get over > the initial anxiety of semicolon significance. What if your alternative > syntax used a colon rather than a fat arrow, to mirror record literal > syntax? Requires unbounded lookahead due to type annotations in patterns also using ":". Patrick From graydon at mozilla.com Wed Apr 11 14:15:15 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Wed, 11 Apr 2012 14:15:15 -0700 Subject: [rust-dev] Brace-free if and alt In-Reply-To: <4F85E955.3040605@mozilla.com> References: <4F85E955.3040605@mozilla.com> Message-ID: <4F85F463.8060908@mozilla.com> On 12-04-11 01:28 PM, Patrick Walton wrote: > Before: https://github.com/Jonanin/rust-dcpu16/blob/master/asm.rs > After: https://gist.github.com/2360838 > > Thoughts? Bikeshed indeed! Aesthetically, I prefer the braces. Block structure's more visible. I also use C-M-space to move code around all the time, so practically speaking, editor-obvious block boundaries are my preference. Take that for no more than it's worth though. Straw-vote preference. I've written a lot of ML that looks like your proposal too, and my fingers didn't break. 
-Graydon From banderson at mozilla.com Wed Apr 11 14:30:58 2012 From: banderson at mozilla.com (Brian Anderson) Date: Wed, 11 Apr 2012 14:30:58 -0700 Subject: [rust-dev] c++ interface In-Reply-To: References: Message-ID: <4F85F812.3080806@mozilla.com> On 04/11/2012 03:02 AM, Mic wrote: > Hello, > any plans to support C++ like in D http://dlang.org/cpp_interface.html ? Nothing firm but there is a bug open[1]. I imagine it will happen in small steps, as needed. [1] https://github.com/mozilla/rust/issues/37 From a.stavonin at gmail.com Wed Apr 11 16:10:44 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Thu, 12 Apr 2012 08:10:44 +0900 Subject: [rust-dev] crate_type = "lib" and testing In-Reply-To: <4F85EEB3.8050408@mozilla.com> References: <9E6334D8-6811-4239-A29F-096ABB646CEE@gmail.com> <4F85EEB3.8050408@mozilla.com> Message-ID: Hi, Brian I can't reproduce error with segmentation faults any more, and I suppose that were my mistake because of bad understanding of Rust. The issue were in the re_get_supported_methods (line 40, vec::map(cstrs) {|cstr| str::unsafe::from_c_str(cstr) }) function in revent.rs. https://github.com/astavonin/revent Regards, Alexander. Yes, tests are run in parallel. I would be interested to see what tests > this causes issues for. From personal experience I've encountered > difficulties when tests modify global state, like getenv/setenv. I also ran > into a major problem with bindings to a library that wasn't threadsafe. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steven099 at gmail.com Wed Apr 11 21:56:31 2012 From: steven099 at gmail.com (Steven Blenkinsop) Date: Thu, 12 Apr 2012 00:56:31 -0400 Subject: [rust-dev] Class UI In-Reply-To: References: Message-ID: I don't know about your use of the term "constructor", but it is true that "new" is often associated with the allocation side of things instead of the initialization side of things in languages that make a distinction (many don't). 
Go faces the opposite problem since people think of "new" as syntax for calling a constructor which does initialization, and its "new" does allocation. But maybe that just helps prove your point, which is that people have strong preconceptions about what "new" means, so it might be worthwhile for Rust to pick a different word, all else being equal. I can't see it creating huge problems, since the worst case scenario is that people will continuously be griping about how "new" is the wrong word, but if you can minimize the number of perennial meaningless complaints, you have more time to address real problems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fw at deneb.enyo.de Wed Apr 11 22:46:03 2012 From: fw at deneb.enyo.de (Florian Weimer) Date: Thu, 12 Apr 2012 07:46:03 +0200 Subject: [rust-dev] Brace-free if and alt In-Reply-To: <4F85E955.3040605@mozilla.com> (Patrick Walton's message of "Wed, 11 Apr 2012 13:28:05 -0700") References: <4F85E955.3040605@mozilla.com> Message-ID: <87bomxh59g.fsf@mid.deneb.enyo.de> * Patrick Walton: > // after: > if foo() == "bar" then 10 else 20 This paves the way to: if foo() == "bar" then f(); g(); I expect that if braces are optional, some users will want an option to make them mandatory again. From kobi2187 at gmail.com Thu Apr 12 00:25:33 2012 From: kobi2187 at gmail.com (Kobi Lurie) Date: Thu, 12 Apr 2012 10:25:33 +0300 Subject: [rust-dev] Class UI In-Reply-To: References: Message-ID: <4F86836D.90301@gmail.com> what about a private new() in the class, that returns the allocated instance, and a public 'make' fn, with how many overloads you want, to serve as a ctor. rustc can require that atleast one 'make' fn will exist for a class, and that it returns the same object that came from new(). (the user cannot write a fn named 'make', if it doesn't return the class type, and the actual object from new(). 
-- if it's possible to verify such a thing statically)

bye, kobi

On 4/12/2012 7:56 AM, Steven Blenkinsop wrote:
> I don't know about your use of the term "constructor", but it is true
> that "new" is often associated with the allocation side of things
> instead of the initialization side of things in languages that make a
> distinction (many don't). Go faces the opposite problem since people
> think of "new" as syntax for calling a constructor which does
> initialization, and its "new" does allocation. But maybe that just helps
> prove your point, which is that people have strong preconceptions about
> what "new" means, so it might be worthwhile for Rust to pick a different
> word, all else being equal. I can't see it creating huge problems, since
> the worst case scenario is that people will continuously be griping
> about how "new" is the wrong word, but if you can minimize the number of
> perennial meaningless complaints, you have more time to address real
> problems.

From a.stavonin at gmail.com Thu Apr 12 01:08:21 2012
From: a.stavonin at gmail.com (Alexander Stavonin)
Date: Thu, 12 Apr 2012 17:08:21 +0900
Subject: [rust-dev] Functions overloading
Message-ID: <6CFF2562-BFD7-4A47-B8B1-072A638E3B93@gmail.com>

What about function overloading? Something like this:

fn foo(val: int) {
    io::println("int");
}

fn foo(val: str) {
    io::println("str");
}

fn main() {
    foo(1);
    foo("test");
}

But :(

main.rs:3:0: 5:1 error: duplicate definition of foo
main.rs:3 fn foo(val: int) {
main.rs:4     io::println("int");
main.rs:5 }

As I understand it, similar code can be implemented by using iface and impl, but it's a very cumbersome way.
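
[Editorial sketch, not part of the original thread.] For readers following along today, the iface/impl route mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses present-day Rust syntax (`trait` instead of `iface`, `i32` instead of `int`), not the 2012 dialect discussed on the list, and the trait name `Foo` and the choice to return a label instead of printing it are hypothetical, made so the behaviour is easy to check.

```rust
// Hypothetical trait standing in for the "overloaded" function `foo`.
// Each implementing type supplies its own behaviour.
trait Foo {
    /// Returns the label that the matching "overload" would print.
    fn foo(&self) -> &'static str;
}

// The "overload" for integers.
impl Foo for i32 {
    fn foo(&self) -> &'static str {
        "int"
    }
}

// The "overload" for string slices.
impl Foo for &str {
    fn foo(&self) -> &'static str {
        "str"
    }
}

/// Single generic entry point: the implementation is selected statically
/// from the argument's type, which is how the overloading is emulated.
fn foo<T: Foo>(val: T) -> &'static str {
    val.foo()
}

fn main() {
    println!("{}", foo(1));
    println!("{}", foo("test"));
}
```

The call sites look the same as in the question (`foo(1)`, `foo("test")`), but every accepted argument type has to opt in through an explicit impl, which is arguably the kind of explicitness the replies in this thread ask for.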
From dteller at mozilla.com Thu Apr 12 03:01:34 2012 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Thu, 12 Apr 2012 12:01:34 +0200 Subject: [rust-dev] Functions overloading In-Reply-To: <6CFF2562-BFD7-4A47-B8B1-072A638E3B93@gmail.com> References: <6CFF2562-BFD7-4A47-B8B1-072A638E3B93@gmail.com> Message-ID: <4F86A7FE.10107@mozilla.com> On Thu Apr 12 10:08:21 2012, Alexander Stavonin wrote: > What's about function overloading? Something like this: > > fn foo(val: int) { > io::println("int"); > } > > fn foo(val: str) { > io::println("str"); > } > > fn main() { > foo(1); > foo("test"); > } > > But :( > > main.rs:3:0: 5:1 error: duplicate definition of foo > main.rs:3 fn foo(val: int) { > main.rs:4 io::println("int"); > main.rs:5 } > > As I understood, similar code can be implemented by using ifase and impl, but it's very cumbersome way. My personal experience is that you want function overloading to be very explicit. Having too much overloading makes it much easier to misread code. So, I personally would not advocate C++-style function overloading. -- David Rajchenbach-Teller, PhD Performance Team, Mozilla -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 487 bytes Desc: OpenPGP digital signature URL: From a.stavonin at gmail.com Thu Apr 12 06:15:42 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Thu, 12 Apr 2012 22:15:42 +0900 Subject: [rust-dev] Alignment and tuples Message-ID: <727D95A2-A965-4F37-A662-206986949E69@gmail.com> I'm confused, how it is possible? 

type tuple_type1 = (u64, u32, u16);
type tuple_type2 = (u8, u32, u16);

fn main() {
    io::println(#fmt("size of tuple_type1 = %u, size of tuple_type2 = %u",
                     sys::size_of::<tuple_type1>(), sys::size_of::<tuple_type2>()));
    io::println(#fmt("align of tuple_type1 = %u, align of tuple_type2 = %u",
                     sys::align_of::<tuple_type1>(), sys::align_of::<tuple_type2>()));
}

Result:

size of tuple_type1 = 16, size of tuple_type2 = 12
align of tuple_type1 = 8, align of tuple_type2 = 8

I expected the same size for tuple_type1 and tuple_type2 if both have alignment 8. Or does tuple_type2 have alignment 4, and align_of returns an invalid result?

From niko at alum.mit.edu Thu Apr 12 06:47:12 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Thu, 12 Apr 2012 06:47:12 -0700
Subject: [rust-dev] Functions overloading
In-Reply-To: <4F86A7FE.10107@mozilla.com>
References: <6CFF2562-BFD7-4A47-B8B1-072A638E3B93@gmail.com> <4F86A7FE.10107@mozilla.com>
Message-ID: <4F86DCE0.6040609@alum.mit.edu>

Not only that, but it adds an extra layer of complexity for type inferencing, which quite frankly is already complex enough.

That said, you can do a limited form of overloading using impls:

impl methods for int { fn foo() { ... } }
impl methods for uint { fn foo() { ... } }

Niko

On 4/12/12 3:01 AM, David Rajchenbach-Teller wrote:
> On Thu Apr 12 10:08:21 2012, Alexander Stavonin wrote:
>> What about function overloading? Something like this:
>>
>> fn foo(val: int) {
>>     io::println("int");
>> }
>>
>> fn foo(val: str) {
>>     io::println("str");
>> }
>>
>> fn main() {
>>     foo(1);
>>     foo("test");
>> }
>>
>> But :(
>>
>> main.rs:3:0: 5:1 error: duplicate definition of foo
>> main.rs:3 fn foo(val: int) {
>> main.rs:4     io::println("int");
>> main.rs:5 }
>>
>> As I understand it, similar code can be implemented by using iface and impl, but it's a very cumbersome way.
> My personal experience is that you want function overloading to be very
> explicit. Having too much overloading makes it much easier to misread
> code.
> > So, I personally would not advocate C++-style function overloading. > > -- > David Rajchenbach-Teller, PhD > Performance Team, Mozilla > > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From niko at alum.mit.edu Thu Apr 12 06:49:23 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Thu, 12 Apr 2012 06:49:23 -0700 Subject: [rust-dev] Alignment and tuples In-Reply-To: <727D95A2-A965-4F37-A662-206986949E69@gmail.com> References: <727D95A2-A965-4F37-A662-206986949E69@gmail.com> Message-ID: <4F86DD63.4050202@alum.mit.edu> On 4/12/12 6:15 AM, Alexander Stavonin wrote: > type tuple_type1 = (u64, u32, u16); > type tuple_type2 = (u8, u32, u16); > > ... > > size of tuple_type1 = 16, size of typle_type2 = 12 > align of tuple_type1 = 8, align of tuple_type2 = 8 > > I expected same size for tuple_type1 and tuple_type2 in case of alignment 8. Or tuple_type2 have alignment 4, but align_of returns invalid result? The alignment refers to the alignment of the overall structure. The internal components do not all share the same alignment. To be honest, though, I am not sure why the alignment of tuple_type2 is listed as `8`, that seems like a bug. I would expect 4 and a size of 12. Niko From niko at alum.mit.edu Thu Apr 12 06:58:00 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Thu, 12 Apr 2012 06:58:00 -0700 Subject: [rust-dev] Class UI In-Reply-To: <4F86836D.90301@gmail.com> References: <4F86836D.90301@gmail.com> Message-ID: <4F86DF68.8020602@alum.mit.edu> In fact, Rust separates initialization from allocation. The OP was correct that new might better be called an initializer, but I think that ship has sailed---constructor is the generally accepted term for the function which initiailizes the instance, for better or worse. Still, for a class C, it is possible to write C(...), @C(...), ~C(...) 
and so forth, each of which will allocate memory from a different location (stack, task heap, exchange heap). With regions, it will also be possible to use user-specified memory pools. The current syntax is `new(pool) C`, harkening back to overloaded-new in C++, but I am not sure if we'll stay with it. What I mainly dislike about the current constructor system is having to repeat everything all the time. Sometimes this is ok but usually a constructor just wants to initialize all (or most) of the fields from the values given in the parameters. One possibility that patrick and I had talked about is that if there is no constructor defined, one can construct the class using a literal syntax like `C { f1: ..., f2: ... }`. But I am concerned that this is kind of discontinuous with the constructor---implementing a constructor would then require rewriting every allocation site. So perhaps we should just say there is a default constructor of the form: new(f1: T1, ..., fN: T2) { self.f1 = f1; ... self.fN = fN; } for each field `f1`, ..., `fN` defined in the class (and in the same order as they are defined). Niko On 4/12/12 12:25 AM, Kobi Lurie wrote: > what about a private new() in the class, that returns the allocated > instance, and a public 'make' fn, with how many overloads you want, to > serve as a ctor. rustc can require that atleast one 'make' fn will > exist for a class, and that it returns the same object that came from > new(). > (the user cannot write a fn named 'make', if it doesn't return the > class type, and the actual object from new(). -- if it's possible to > verify such a thing statically) > > bye, kobi > > > > On 4/12/2012 7:56 AM, Steven Blenkinsop wrote: >> I don't know about your use of the term "constructor", but it is true >> that "new" is often associated with the allocation side of things >> instead of the initialization side of things in languages that make a >> distinction (many don't). 
Go faces the opposite problem since people >> think of "new" as syntax for calling a constructor which does >> initialization, and its "new" does allocation. But maybe that just helps >> prove your point, which is that people have strong preconceptions about >> what "new" means, so it might be worthwhile for Rust to pick a different >> word, all else being equal. I can't see it creating huge problems, since >> the worst case scenario is that people will continuously be griping >> about how "new" is the wrong word, but if you can minimize the number of >> perennial meaningless complaints, you have more time to address real >> problems. >> >> >> _______________________________________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/listinfo/rust-dev > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From jonathan.bragg at alum.rpi.edu Thu Apr 12 07:22:30 2012 From: jonathan.bragg at alum.rpi.edu (Nate Bragg) Date: Thu, 12 Apr 2012 10:22:30 -0400 Subject: [rust-dev] Class UI In-Reply-To: References: <4F86836D.90301@gmail.com> <4F86DF68.8020602@alum.mit.edu> Message-ID: On Thu, Apr 12, 2012 at 9:58 AM, Niko Matsakis wrote: > What I mainly dislike about the current constructor system is having to > repeat everything all the time. Sometimes this is ok but usually a > constructor just wants to initialize all (or most) of the fields from the > values given in the parameters. One possibility that patrick and I had > talked about is that if there is no constructor defined, one can construct > the class using a literal syntax like `C { f1: ..., f2: ... }`. But I am > concerned that this is kind of discontinuous with the > constructor---implementing a constructor would then require rewriting every > allocation site. 
> > So perhaps we should just say there is a default constructor of the form: > > new(f1: T1, ..., fN: T2) { > self.f1 = f1; > ... > self.fN = fN; > } > > for each field `f1`, ..., `fN` defined in the class (and in the same order > as they are defined). > Anecdotally, from the perspective of this prospective user, default constructors like that give me heartburn (for the first example that pops into my head, say two fields of the same type change in position inside the class - now you have to fix all of your default constructors). If there is going to be an implicit constructor, I would prefer the literal style you listed. It hearkens back to c99-style field-name designated initializers, and is very explicit. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.striegel at gmail.com Thu Apr 12 07:45:55 2012 From: ben.striegel at gmail.com (Benjamin Striegel) Date: Thu, 12 Apr 2012 10:45:55 -0400 Subject: [rust-dev] Class UI In-Reply-To: References: <4F86836D.90301@gmail.com> <4F86DF68.8020602@alum.mit.edu> Message-ID: Optional named parameters in function invocations could be a third option, although of course that introduces a third inflexibility to consider. :P So, which happens most often? 1) You create a class without any constructor besides property assignment, and then later introduce a more substantial constructor. 2) You alter the order of fields/memory layout of a class. 3) You change the names of a class' fields. On Thu, Apr 12, 2012 at 10:22 AM, Nate Bragg wrote: > On Thu, Apr 12, 2012 at 9:58 AM, Niko Matsakis wrote: > >> What I mainly dislike about the current constructor system is having to >> repeat everything all the time. Sometimes this is ok but usually a >> constructor just wants to initialize all (or most) of the fields from the >> values given in the parameters. 
One possibility that patrick and I had >> talked about is that if there is no constructor defined, one can construct >> the class using a literal syntax like `C { f1: ..., f2: ... }`. But I am >> concerned that this is kind of discontinuous with the >> constructor---implementing a constructor would then require rewriting every >> allocation site. >> >> So perhaps we should just say there is a default constructor of the form: >> >> new(f1: T1, ..., fN: T2) { >> self.f1 = f1; >> ... >> self.fN = fN; >> } >> >> for each field `f1`, ..., `fN` defined in the class (and in the same >> order as they are defined). >> > > Anecdotally, from the perspective of this prospective user, default > constructors like that give me heartburn (for the first example that pops > into my head, say two fields of the same type change in position inside the > class - now you have to fix all of your default constructors). If there is > going to be an implicit constructor, I would prefer the literal style you > listed. It hearkens back to c99-style field-name designated initializers, > and is very explicit. > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko at alum.mit.edu Thu Apr 12 10:02:17 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Thu, 12 Apr 2012 10:02:17 -0700 Subject: [rust-dev] paring back the self type to be, well, a type Message-ID: <4F870A99.8030205@alum.mit.edu> I've been reading more closely into the impl/iface code and I think I've found another problem. The current `self` type is actually more than a type---rather it's a kind of type constructor. I think we should change it to be just a type, because we do not have other type constructors in our system and in general things do not hold together. Let me explain briefly what I mean. 
Currently, the type `self` refers to "the type implemented the current iface". However, the self type is parameterized just as the containing iface. So, technically, you could have an iface like: iface foo { fn bar(x: self); } Now this would correspond to an impl like: impl of foo for [T] { fn bar(x: [S]) { ... } } Here, the type `self` was transformed to `[S]`. This is a fairly complex transformation, actually, as must "reverse-link" the parameter T from its use in `[T]` to the appearance in `foo` and so forth. In fact, this transformation is not especially well-defined: there may be multiple types or no types which are equally valid. To see why there could be multiple types, consider something like `impl of foo for either`. What type does `self` have? It would be valid for any `either`. To see why there might be no types, consider something like `impl of foo for uint`. What type does `self` have? Or, similarly, consider: fn some_func>(t: T) { t.bar(...); } What is the type of the parameter expected by `bar`? In other languages, there is support for parameterizing a function or type by a "type constructor" (which is essentially what `self` is). I don't think we need those capabilities, though of course we could add them later. I would rather see the `self` type be like any other type parameter: non-parameterized. In that case, the iface shown above would be invalid. This does naturally mean that our type system is less expressive. However, its behavior is well-defined in all cases, which seems like a nice property to have. Niko From niko at alum.mit.edu Thu Apr 12 10:33:02 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Thu, 12 Apr 2012 10:33:02 -0700 Subject: [rust-dev] Brace-free if and alt In-Reply-To: <4F85E955.3040605@mozilla.com> References: <4F85E955.3040605@mozilla.com> Message-ID: <4F8711CE.8070001@alum.mit.edu> I am not a big fan of the `if` syntax. Or at least I don't mind our current one and it is nicely unambiguous. 
However, I really like "alt with arrow" syntax. I find the current one quite unreadable, particularly for long blocks. The "=>" arrow (or `->`, I am somewhat indifferent) helps to set apart the conditions and the blocks. Here is a random example to illustrate what I mean:

Before:

> alt elsopt {
>   some(els) {
>     let if_t = fcx.next_ty_var();
>     let thn_bot = check_block(fcx, thn);
>     let thn_t = fcx.node_ty(thn.node.id);
>     demand::simple(fcx, thn.span, if_t, thn_t);
>     let els_bot = check_expr_with(fcx, els, if_t);
>     (if_t, thn_bot & els_bot)
>   }
>   none {
>     check_block_no_value(fcx, thn);
>     (ty::mk_nil(fcx.ccx.tcx), false)
>   }
> };

After:

> alt elsopt {
>   some(els) => {
>     let if_t = fcx.next_ty_var();
>     let thn_bot = check_block(fcx, thn);
>     let thn_t = fcx.node_ty(thn.node.id);
>     demand::simple(fcx, thn.span, if_t, thn_t);
>     let els_bot = check_expr_with(fcx, els, if_t);
>     (if_t, thn_bot & els_bot)
>   }
>   none => {
>     check_block_no_value(fcx, thn);
>     (ty::mk_nil(fcx.ccx.tcx), false)
>   }
> };

I personally find the second example quite a bit easier to read. In the first, my eyes get lost and I have trouble distinguishing the patterns from the code.

It is also much nicer for small alts, for example:

> let pass1 = alt ty::get(self.self_ty).struct {
>   ty::ty_param(n, did) {
>     self.method_from_param(n, did)
>   }
>   ty::ty_iface(did, tps) {
>     self.method_from_iface(did, tps)
>   }
>   ty::ty_class(did, tps) {
>     self.method_from_class(did, tps)
>   }
>   _ {
>     none
>   }
> };

becomes:

> let pass1 = alt ty::get(self.self_ty).struct {
>   ty::ty_param(n, did) => self.method_from_param(n, did)
>   ty::ty_iface(did, tps) => self.method_from_iface(did, tps)
>   ty::ty_class(did, tps) => self.method_from_class(did, tps)
>   _ => none
> };

So I propose we make the syntax for an alt arm be:

alt := `alt` expr { arm* }
arm := pattern => expr

Niko

On 4/11/12 1:28 PM, Patrick Walton wrote:
> Here's a total bikeshed.
Apologies in advance: > > There's been some criticism of Rust's syntax for being too > brace-heavy. I've been thinking this for a while. Here's a minimal > delta on the current syntax to address this: > > Examples: > > // before: > if foo() == "bar" { 10 } else { 20 } > > // after: > if foo() == "bar" then 10 else 20 > // or: > if foo() == "bar" { 10 } else { 20 } > > // before: > alt foo() { > "bar" { 10 } > "baz" { 20 } > "boo" { 30 } > } > > // after: > alt foo() { > "bar" => 10, > "baz" => 20, > "boo" => 30 > } > // or: > alt foo() { > "bar" { 10 } > "baz" { 20 } > "boo" { 30 } > } > > BNF: > > if ::== "if" expr ("then" expr | block) ("else" expr)? > alt ::== "alt" expr "{" (arm* last-arm) "}" > arm ::== block-arm | pat "=>" expr "," > last-arm ::== block-arm | pat "=>" expr ","? > block-arm ::== pat block > > You can think of it this way: We insert a "then" before the > then-expression of each if; however, you can omit it if you use a > block. We also insert a "=>" before each expression in an alt arm and > a "," to separate expressions from subsequent patterns; however, both > can be omitted if the arm expression is a block. > > This does, unfortunately, create the dangling else ambiguity. I'm not > sure this is much of a problem in practice, but it might be an issue. > > The pretty printer would always omit the "then" and the "=>"/"," when > the alt arm is a block. That way, we aren't introducing multiple > preferred syntactic forms of the same Rust code (which I agree is > generally undesirable); the blessed style is to never over-annotate > when a "then" body or an alt expression is a block. > > Here's an example piece of code (Jonanin's emulator) written > before-and-after: > > Before: https://github.com/Jonanin/rust-dcpu16/blob/master/asm.rs > After: https://gist.github.com/2360838 > > Thoughts? 
>
> Patrick

From niko at alum.mit.edu Thu Apr 12 11:54:50 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Thu, 12 Apr 2012 11:54:50 -0700
Subject: [rust-dev] Class UI
In-Reply-To:
References: <4F86836D.90301@gmail.com> <4F86DF68.8020602@alum.mit.edu>
Message-ID: <4F8724FA.1080900@alum.mit.edu>

On 4/12/12 7:45 AM, Benjamin Striegel wrote:
> Optional named parameters in function invocations could be a third
> option, although of course that introduces a third inflexibility to
> consider. :P
>
> So, which happens most often?

I suppose it's not only a question of what happens most often, but what is most surprising. That is, if I change the order of fields, perhaps I am very surprised that code breaks, whereas if I change the names, less so?

> 1) You create a class without any constructor besides property
> assignment, and then later introduce a more substantial constructor.

Personally speaking, the majority of my classes tend to have no "real" constructor; this happens a lot to me. I am not sure how often they migrate to a more substantial constructor. Probably infrequently.

> 2) You alter the order of fields/memory layout of a class.
> 3) You change the names of a class' fields.

Hard to say. However, I will add one other wrinkle: if I change the names of the fields, I get errors that are very clear and easy to correct. If I change the order of the fields, and the fields have the same type, I get no errors at all---until runtime, that is! This is bad. (Incidentally, it's why I far prefer Smalltalk-style "match based on a label" style to the far more prevalent "parameter-list" style. But that ship, too, has sailed.)

So, therefore, I propose something like this. Class instances are *always* created by writing "C { f1: v1, ..., fN: vN }".
A class can, however, declare "priv new" or something like that to indicate that the constructor is private to the class. We can then permit static functions (which are probably useful anyhow) that can serve as blessed constructors: class C { static fn create(...) -> C { ret C { f1: v1, ..., fN: vN }; } } You would then create an instance of C using `C::create(...)`. The only thing I am not sure about is how one declares the construction to be private? Perhaps it's enough to say "if there are private fields, you must use a static function within the class to construct an instance". In any case we probably want such a rule (at least pcwalton feels strongly that it should exist). Niko From arcata at gmail.com Thu Apr 12 12:03:29 2012 From: arcata at gmail.com (Joe Groff) Date: Thu, 12 Apr 2012 12:03:29 -0700 Subject: [rust-dev] Class UI In-Reply-To: <4F8724FA.1080900@alum.mit.edu> References: <4F86836D.90301@gmail.com> <4F86DF68.8020602@alum.mit.edu> <4F8724FA.1080900@alum.mit.edu> Message-ID: On Thu, Apr 12, 2012 at 11:54 AM, Niko Matsakis wrote: > You would then create an instance of C using `C::create(...)`. The only > thing I am not sure about is how one declares the construction to be > private? Have you guys read Bob Nystrom's notes about designing a safe constructor interface? His design is by his own words unusual but could be a good fit for Rust. http://journal.stuffwithstuff.com/2010/12/14/the-trouble-with-constructors/ -Joe From nmatsakis at mozilla.com Thu Apr 12 09:18:01 2012 From: nmatsakis at mozilla.com (Niko Matsakis) Date: Thu, 12 Apr 2012 09:18:01 -0700 Subject: [rust-dev] paring back the self type to be, well, just a type Message-ID: <4F870039.5030401@mozilla.com> I've been reading more closely into the impl/iface code and I think I've found another problem. The current `self` type is actually more than a type---rather it's a kind of type constructor. 
I think we should change it to be just a type, because we do not have other type constructors in our system and in general things do not hold together. Let me explain briefly what I mean. Currently, the type `self` refers to "the type implemented the current iface". However, the self type is parameterized just as the containing iface. So, technically, you could have an iface like: iface foo { fn bar(x: self); } Now this would correspond to an impl like: impl of foo for [T] { fn bar(x: [S]) { ... } } Here, the type `self` was transformed to `[S]`. This is a fairly complex transformation, actually, as must "reverse-link" the parameter T from its use in `[T]` to the appearance in `foo` and so forth. In fact, this transformation is not especially well-defined: there may be multiple types or no types which are equally valid. To see why there could be multiple types, consider something like `impl of foo for either`. What type does `self` have? It would be valid for any `either`. To see why there might be no types, consider something like `impl of foo for uint`. What type does `self` have? Or, similarly, consider: fn some_func>(t: T) { t.bar(...); } What is the type of the parameter expected by `bar`? In other languages, there is support for parameterizing a function or type by a "type constructor" (which is essentially what `self` is). I don't think we need those capabilities, though of course we could add them later. I would rather see the `self` type be like any other type parameter: non-parameterized. In that case, the iface shown above would be invalid. This does naturally mean that our type system is less expressive. However, its behavior is well-defined in all cases, which seems like a nice property to have. 
Niko From banderson at mozilla.com Thu Apr 12 12:14:04 2012 From: banderson at mozilla.com (Brian Anderson) Date: Thu, 12 Apr 2012 12:14:04 -0700 Subject: [rust-dev] Brace-free if and alt In-Reply-To: <4F8711CE.8070001@alum.mit.edu> References: <4F85E955.3040605@mozilla.com> <4F8711CE.8070001@alum.mit.edu> Message-ID: <4F87297C.3000808@mozilla.com> On 04/12/2012 10:33 AM, Niko Matsakis wrote: > I am not a big fan of the `if` syntax. Or at least I don't mind our > current one and it is nicely unambiguous. However, I really like "alt > with arrow" syntax. I find the current one quite unreadable, > particularly for long blocks. The "=>" arrow (or `->`, I am somewhat > indifferent) helps to set apart the conditions and the blocks. I share the same opinion on both subjects. From niko at alum.mit.edu Thu Apr 12 12:19:21 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Thu, 12 Apr 2012 12:19:21 -0700 Subject: [rust-dev] Class UI In-Reply-To: References: <4F86836D.90301@gmail.com> <4F86DF68.8020602@alum.mit.edu> <4F8724FA.1080900@alum.mit.edu> Message-ID: <4F872AB9.5060701@alum.mit.edu> Based on a quick read, this sounds almost identical to what I proposed, except that `construct()` would be the "literal" syntax. Looks like a neat blog, thanks for the tip. Niko On 4/12/12 12:03 PM, Joe Groff wrote: > On Thu, Apr 12, 2012 at 11:54 AM, Niko Matsakis wrote: >> You would then create an instance of C using `C::create(...)`. The only >> thing I am not sure about is how one declares the construction to be >> private? > Have you guys read Bob Nystrom's notes about designing a safe > constructor interface? His design is by his own words unusual but > could be a good fit for Rust. 
> > http://journal.stuffwithstuff.com/2010/12/14/the-trouble-with-constructors/ > > -Joe From niko at alum.mit.edu Thu Apr 12 12:20:26 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Thu, 12 Apr 2012 12:20:26 -0700 Subject: [rust-dev] paring back the self type to be, well, just a type In-Reply-To: <4F870039.5030401@mozilla.com> References: <4F870039.5030401@mozilla.com> Message-ID: <4F872AFA.9030302@alum.mit.edu> Sorry for the double post. Niko On 4/12/12 9:18 AM, Niko Matsakis wrote: > I've been reading more closely into the impl/iface code and I think > I've found another problem. The current `self` type is actually more > than a type---rather it's a kind of type constructor. I think we > should change it to be just a type, because we do not have other type > constructors in our system and in general things do not hold > together. Let me explain briefly what I mean. > > Currently, the type `self` refers to "the type implemented the current > iface". However, the self type is parameterized just as the > containing iface. So, technically, you could have an iface like: > > iface foo { > fn bar(x: self); > } > > Now this would correspond to an impl like: > > impl of foo for [T] { > fn bar(x: [S]) { ... } > } > > Here, the type `self` was transformed to `[S]`. This is a fairly > complex transformation, actually, as must "reverse-link" the parameter > T from its use in `[T]` to the appearance in `foo` and so forth. > In fact, this transformation is not especially well-defined: there may > be multiple types or no types which are equally valid. > > To see why there could be multiple types, consider something like > `impl of foo for either`. What type does `self` have? > It would be valid for any `either`. > > To see why there might be no types, consider something like `impl of > foo for uint`. What type does `self` have? Or, similarly, > consider: > > fn some_func>(t: T) { > t.bar(...); > } > > What is the type of the parameter expected by `bar`? 
> > In other languages, there is support for parameterizing a function or > type by a "type constructor" (which is essentially what `self` is). I > don't think we need those capabilities, though of course we could add > them later. > > I would rather see the `self` type be like any other type parameter: > non-parameterized. In that case, the iface shown above would be > invalid. This does naturally mean that our type system is less > expressive. However, its behavior is well-defined in all cases, which > seems like a nice property to have. > > > Niko > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From halperin.dr at gmail.com Thu Apr 12 14:08:11 2012 From: halperin.dr at gmail.com (Dave Halperin) Date: Thu, 12 Apr 2012 17:08:11 -0400 Subject: [rust-dev] paring back the self type to be, well, just a type In-Reply-To: <4F870039.5030401@mozilla.com> References: <4F870039.5030401@mozilla.com> Message-ID: Hi, I've been a lurker on this list for a little while now. Just thought I'd point out that the solution to this in Haskell is the 'kind' system, see http://www.haskell.org/onlinereport/decls.html#sect4.1.1. The equivalent of the line 'impl of foo for uint' would cause an error that foo requires a type with kind '* -> *'. Higher kinded types have turned out to be useful in Haskell. If you were to try to implement monads in Rust you'd want to something along the following lines: iface monad { fn bind(fn (A -> self)) -> self; } I'm not advocating either way on this in terms of the complexity tradeoff for adding a kind system, just pointing out that it's what you'd need to make the current system work and it's not completely crazy to want that flexibility out of the type system. Dave On Thu, Apr 12, 2012 at 12:18 PM, Niko Matsakis wrote: > I've been reading more closely into the impl/iface code and I think I've > found another problem. 
The current `self` type is actually more than a > type---rather it's a kind of type constructor. I think we should change it > to be just a type, because we do not have other type constructors in our > system and in general things do not hold together. Let me explain briefly > what I mean. > > Currently, the type `self` refers to "the type implemented the current > iface". However, the self type is parameterized just as the containing > iface. So, technically, you could have an iface like: > > iface foo { > fn bar(x: self); > } > > Now this would correspond to an impl like: > > impl of foo for [T] { > fn bar(x: [S]) { ... } > } > > Here, the type `self` was transformed to `[S]`. This is a fairly > complex transformation, actually, as must "reverse-link" the parameter T > from its use in `[T]` to the appearance in `foo` and so forth. In fact, > this transformation is not especially well-defined: there may be multiple > types or no types which are equally valid. > > To see why there could be multiple types, consider something like `impl > of foo for either`. What type does `self` have? It would be > valid for any `either`. > > To see why there might be no types, consider something like `impl of > foo for uint`. What type does `self` have? Or, similarly, > consider: > > fn some_func>(t: T) { > t.bar(...); > } > > What is the type of the parameter expected by `bar`? > > In other languages, there is support for parameterizing a function or type > by a "type constructor" (which is essentially what `self` is). I don't > think we need those capabilities, though of course we could add them later. > > I would rather see the `self` type be like any other type parameter: > non-parameterized. In that case, the iface shown above would be invalid. > This does naturally mean that our type system is less expressive. However, > its behavior is well-defined in all cases, which seems like a nice property > to have. 
>
>
> Niko

From peter.ronnquist at gmail.com Thu Apr 12 13:52:15 2012
From: peter.ronnquist at gmail.com (Peter Ronnquist)
Date: Thu, 12 Apr 2012 22:52:15 +0200
Subject: [rust-dev] '-' as prefix to a function argument?
Message-ID: 

Thank you all for your quick replies, the example was enlightening.

Peter

From niko at alum.mit.edu Thu Apr 12 15:52:44 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Thu, 12 Apr 2012 15:52:44 -0700
Subject: [rust-dev] paring back the self type to be, well, just a type
In-Reply-To: 
References: <4F870039.5030401@mozilla.com>
Message-ID: <4F875CBC.2030903@alum.mit.edu>

On 4/12/12 2:08 PM, Dave Halperin wrote:
> I'm not advocating either way on this in terms of the complexity
> tradeoff for adding a kind system, just pointing out that it's what
> you'd need to make the current system work and it's not completely
> crazy to want that flexibility out of the type system.

Yes, this was one use case that the self type was intended to model (though not the one that I think will come up most often). I don't think this is crazy by any means, but right now our type system has no notion of (Haskell-style) kinds and it'd be a big change to add them. It's possible that the self type should be yanked altogether, though it's come in handy for me many times, always in its simpler incarnation of "the type of the receiver" rather than "the type function defined by the current iface".

In any case, I spent some time trying to adapt the iface system---in any form!---to writing generic monadic code and I decided it's just not a very good fit. There are two major hurdles. First, we have no good way to define a "return" function (though perhaps we could add static or class methods to ifaces).
Second, and this is more major, we define monadic implementations for some concrete type and we should be doing it for all types. In principle, something like impl of monad for option { ... } seems like it does what we want, but it also permits things like: impl of monad for option { ... } for which there is no clear mapping from which to derive the self type. I think to do this correctly, we'd rather write something like: impl of monad for option { ... } which would also fit with the "kind" notion of Haskell: here the type being implemented for has kind * => * and so does monad. Of course, we would also need to be able to write things like: fn mapM( a: [A], b: fn(A) -> M) -> M<[B]> { ... } Here the parameter M is bound to some (* => *) type constructor for which monad is defined. I am not opposed to adding such capabilities at some point. But they don't feel like burning issues to me right *now*. Niko From a.stavonin at gmail.com Thu Apr 12 21:30:46 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Fri, 13 Apr 2012 13:30:46 +0900 Subject: [rust-dev] Functions overloading Message-ID: The closest variant can be implemented by using enums, but I can't say that resulted code is nice.: enum foo_param { i(int), s(str) } fn foo(param: foo_param) { alt param { i(int_val) { io::println(#fmt("foo was called with int == %d", int_val)); } s(str_val) { io::println(#fmt("foo was called with str == %s", str_val)); } } } fn main() { foo(i(10)); foo(s("test")); } Not only that, but it adds an extra layer of complexity for type > inferencing, which quite frankly is already complex enough. > > That said, you can do a limited form of overloading using impls: > > impl methods for int { > fn foo() { ... } > } > > impl methods for uint { > fn foo() { ... } > } > > > > > Niko -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From niko at alum.mit.edu Thu Apr 12 22:00:55 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Thu, 12 Apr 2012 22:00:55 -0700
Subject: [rust-dev] Functions overloading
In-Reply-To: 
References: 
Message-ID: <4F87B307.6070302@alum.mit.edu>

On 4/12/12 9:30 PM, Alexander Stavonin wrote:
> The closest variant can be implemented by using enums, but I can't say
> that resulted code is nice.:

What is wrong with the impl-based technique I suggested?


Niko

From a.stavonin at gmail.com Thu Apr 12 22:19:41 2012
From: a.stavonin at gmail.com (Alexander Stavonin)
Date: Fri, 13 Apr 2012 14:19:41 +0900
Subject: [rust-dev] Functions overloading
In-Reply-To: <4F87B307.6070302@alum.mit.edu>
References: <4F87B307.6070302@alum.mit.edu>
Message-ID: 

I suppose this style is less confusing:

    foo(i(10));
    foo(s("test"));

than this one, at least for passing parameters to a function:

    10.foo();
    "test".foo();

On 13 April 2012 at 14:00, Niko Matsakis wrote:

> On 4/12/12 9:30 PM, Alexander Stavonin wrote:
>
>> The closest variant can be implemented by using enums, but I can't say
>> that resulted code is nice.:
>>
>
> What is wrong with the impl-based technique I suggested?
>
>
> Niko
>

From a.stavonin at gmail.com Fri Apr 13 05:39:10 2012
From: a.stavonin at gmail.com (Alexander Stavonin)
Date: Fri, 13 Apr 2012 21:39:10 +0900
Subject: [rust-dev] Work with records in C style
Message-ID: <757C582F-FE63-4898-9619-DF641C51B46D@gmail.com>

When we want to create a new record we have to enumerate all fields during creation:

    let base = {x: 1, y: 2, z: 3};

This works very well as long as we are working with small records, without nested subrecords and with a low number of fields. What is the best way to initialize big records?
For example:

    type sockaddr4_in = {sin_family: i16, sin_port: u16, sin_addr: in4_addr,
                         sin_zero: (x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x)};

How to initialize it quickly and easily? Using the "with" keyword is not a great option, since in that case I have to enumerate all the fields at least once.
How about C/C++ style, something like this?

    let addr = sockaddr4_in::any_keyword_as_you_like();
or
    let addr = sockaddr4_in();

From sebastian.sylvan at gmail.com Fri Apr 13 08:30:39 2012
From: sebastian.sylvan at gmail.com (Sebastian Sylvan)
Date: Fri, 13 Apr 2012 08:30:39 -0700
Subject: [rust-dev] Work with records in C style
In-Reply-To: <757C582F-FE63-4898-9619-DF641C51B46D@gmail.com>
References: <757C582F-FE63-4898-9619-DF641C51B46D@gmail.com>
Message-ID: 

On Fri, Apr 13, 2012 at 5:39 AM, Alexander Stavonin wrote:
> When we want to create a new record we have to enumerate all fields during creation:
>
> let base = {x: 1, y: 2, z: 3};
>
> This idea is very good, until we are working with small records, without internal subrecords or with low count of field. What is the best way for big records initializing? For example:
>
> type sockaddr4_in = {sin_family: i16, sin_port: u16, sin_addr: in4_addr,
>                      sin_zero: (x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x)};
>
> How to initialize it quick and easy? Idea with using "with" keyword is not too god, as in that case I have to enumerate all fields at list once.
> How about C/C++ style, something like that?
>
> let addr = sockaddr4_in::any_keyword_as_you_like();
> or
> let addr = sockaddr4_in();

What about when the initial values need to be, say, 1 rather than 0, or "Unknown Name" instead of "", or NaN instead of 0.0f, etc.? Granted the latter may be slightly more common, but it seems like a hack to treat it specially just the same, in order to save a tiny bit of typing to create a "default" record that people can copy from.
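A minimal sketch of the "default record that people can copy from" idea, assuming present-day Rust syntax rather than the 2012 record syntax in this thread. `SockAddrIn`, its field names and types, and `addr_on_port` are hypothetical stand-ins for the `sockaddr4_in` record in the question; the value used for `family` is likewise only an assumed placeholder.

```rust
// Present-day Rust, not the 2012 record syntax from the thread.
// `SockAddrIn` and its fields are hypothetical stand-ins for `sockaddr4_in`.

#[derive(Debug, Clone, PartialEq)]
struct SockAddrIn {
    family: i16,
    port: u16,
    addr: u32,
    zero: [u8; 8],
}

impl Default for SockAddrIn {
    // The "default record" is spelled out once, and it may use any
    // initial values, not only zeroes.
    fn default() -> Self {
        SockAddrIn {
            family: 2, // assumed placeholder standing in for an address family
            port: 0,
            addr: 0,
            zero: [0; 8],
        }
    }
}

/// Build an address by overriding only the field that differs and copying
/// every other field from the default record.
fn addr_on_port(port: u16) -> SockAddrIn {
    SockAddrIn {
        port,
        ..SockAddrIn::default()
    }
}
```

Because the defaults are written explicitly by the type's author, non-zero initial values such as 1, "Unknown Name" or NaN need no special treatment.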
-- Sebastian Sylvan From kobi2187 at gmail.com Fri Apr 13 23:07:08 2012 From: kobi2187 at gmail.com (Kobi Lurie) Date: Sat, 14 Apr 2012 09:07:08 +0300 Subject: [rust-dev] 2 possible simplifications: reverse application, records as arguments Message-ID: <4F89140C.70408@gmail.com> such a flurry of activity here :-D I saw a few things I liked, in the felix language (and some that are above my head for now.) Do you think they fit rust well or not? one is reverse application. it's actually logical and might simplify things. (similar to extension methods in c#) there are no classes, but a syntax like: obj.method(b,c) still exists. from what I could tell, the function is really method(obj,b,c), and the previous syntax gets translated to it. which is just nicer for the programmers, minimizing parentheses. another thing is that instead of passing arguments, you pass just one (anonymous) record. the record is the arguments. which means, that argument names become mandatory, and the order wouldn't matter. or alternatively, a tuple, and then names don't matter. (personally I prefer the record with explicit names) from the calling side it looks the same. this is also very logical, as it's the same concept, but more refined. the fields to use don't come after the fact, in a way you also design the arguments (as the record). I'm not quite sure I make sense here, because it looks like you have the same benefits, but I think there's something to inquire here. actually there is a third one, virtual sequences, as I saw in the factor programming language (which is a really cool language btw, very nicely done) http://docs.factorcode.org/content/article-virtual-sequences.html ok, enough talk in the air, let's actually put it to usage. 
bye, kobi From dteller at mozilla.com Sat Apr 14 03:10:21 2012 From: dteller at mozilla.com (David Rajchenbach-Teller) Date: Sat, 14 Apr 2012 12:10:21 +0200 Subject: [rust-dev] 2 possible simplifications: reverse application, records as arguments In-Reply-To: <4F89140C.70408@gmail.com> References: <4F89140C.70408@gmail.com> Message-ID: <4F894D0D.4000905@mozilla.com> On 4/14/12 8:07 AM, Kobi Lurie wrote: > such a flurry of activity here :-D > > I saw a few things I liked, in the felix language (and some that are > above my head for now.) > > Do you think they fit rust well or not? > > one is reverse application. > it's actually logical and might simplify things. > (similar to extension methods in c#) > there are no classes, but a syntax like: obj.method(b,c) still exists. > from what I could tell, the function is really method(obj,b,c), and the > previous syntax gets translated to it. which is just nicer for the > programmers, minimizing parentheses. Looks like a nasty case of overloading of |method|, though :/ > another thing is that instead of passing arguments, you pass just one > (anonymous) record. the record is the arguments. > which means, that argument names become mandatory, and the order > wouldn't matter. > or alternatively, a tuple, and then names don't matter. (personally I > prefer the record with explicit names) > from the calling side it looks the same. I personally like a lot the idea of being able to label arguments. > actually there is a third one, virtual sequences, as I saw in the factor > programming language (which is a really cool language btw, very nicely > done) > http://docs.factorcode.org/content/article-virtual-sequences.html Not sure I understand. Is that a variant on enumerations or streams? Cheers, David -- David Rajchenbach-Teller, PhD Performance Team, Mozilla -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc
Type: application/pgp-signature
Size: 487 bytes
Desc: OpenPGP digital signature
URL: 

From ben.striegel at gmail.com Sat Apr 14 09:00:39 2012
From: ben.striegel at gmail.com (Benjamin Striegel)
Date: Sat, 14 Apr 2012 12:00:39 -0400
Subject: [rust-dev] 2 possible simplifications: reverse application, records as arguments
In-Reply-To: <4F894D0D.4000905@mozilla.com>
References: <4F89140C.70408@gmail.com> <4F894D0D.4000905@mozilla.com>
Message-ID: 

So this would basically mean that a function like:

    fn wtever(foo: int, bar: str) { ... }

could be called as either:

    wtever(1, "hello");  // "tuple" syntax
    wtever{foo: 1, bar: "hello"};  // "record" syntax

Not sure how I feel about invoking a function using a record literal. It's a little bit elegant, but it also feels clunky to have different enclosing glyphs depending on whether you want to pass by name or by position. What I'd *really* love is just for Rust to have optional named parameters in function invocations, like so many other languages have. Then you could do:

    wtever(1, "hello");
    wtever(foo: 1, bar: "hello");
    wtever(1, bar: "hello");  // not possible using just tuple and record literals!

AFAIK named parameters aren't supported in C++, so this could be an area where Rust really improves upon it.

On Sat, Apr 14, 2012 at 6:10 AM, David Rajchenbach-Teller <dteller at mozilla.com> wrote:

> On 4/14/12 8:07 AM, Kobi Lurie wrote:
> > such a flurry of activity here :-D
> >
> > I saw a few things I liked, in the felix language (and some that are
> > above my head for now.)
> >
> > Do you think they fit rust well or not?
> >
> > one is reverse application.
> > it's actually logical and might simplify things.
> > (similar to extension methods in c#)
> > there are no classes, but a syntax like: obj.method(b,c) still exists.
> > from what I could tell, the function is really method(obj,b,c), and the
> > previous syntax gets translated to it. which is just nicer for the
> > programmers, minimizing parentheses.
>
> Looks like a nasty case of overloading of |method|, though :/
>
> > another thing is that instead of passing arguments, you pass just one
> > (anonymous) record. the record is the arguments.
> > which means, that argument names become mandatory, and the order
> > wouldn't matter.
> > or alternatively, a tuple, and then names don't matter. (personally I
> > prefer the record with explicit names)
> > from the calling side it looks the same.
>
> I personally like a lot the idea of being able to label arguments.
>
> > actually there is a third one, virtual sequences, as I saw in the factor
> > programming language (which is a really cool language btw, very nicely
> > done)
> > http://docs.factorcode.org/content/article-virtual-sequences.html
>
> Not sure I understand. Is that a variant on enumerations or streams?
>
> Cheers,
>  David
>
>
> --
> David Rajchenbach-Teller, PhD
>  Performance Team, Mozilla

From arcata at gmail.com Sat Apr 14 10:06:16 2012
From: arcata at gmail.com (Joe Groff)
Date: Sat, 14 Apr 2012 10:06:16 -0700
Subject: [rust-dev] Brace-free if and alt
In-Reply-To: <4F8711CE.8070001@alum.mit.edu>
References: <4F85E955.3040605@mozilla.com> <4F8711CE.8070001@alum.mit.edu>
Message-ID: 

On Thu, Apr 12, 2012 at 10:33 AM, Niko Matsakis wrote:
> I am not a big fan of the `if` syntax.  Or at least I don't mind our current
> one and it is nicely unambiguous.  However, I really like "alt with arrow"
> syntax.  I find the current one quite unreadable, particularly for long
> blocks.  The "=>" arrow (or `->`, I am somewhat indifferent) helps to set
> apart the conditions and the blocks.

Total bikeshed comment: ASCII arrows are a massive pain to type. Is there any possibility of using a single-character symbol there?
-Joe

From me at kevincantu.org Sat Apr 14 14:02:08 2012
From: me at kevincantu.org (Kevin Cantu)
Date: Sat, 14 Apr 2012 14:02:08 -0700
Subject: [rust-dev] Brace-free if and alt
In-Reply-To: 
References: <4F85E955.3040605@mozilla.com> <4F8711CE.8070001@alum.mit.edu>
Message-ID: 

That reminds me, I keep meaning to make APL-style keyboard mappings...

Also: I'm another one who likes this for `alt` syntax.

-- 
Kevin Cantu

On Sat, Apr 14, 2012 at 10:06 AM, Joe Groff wrote:
> On Thu, Apr 12, 2012 at 10:33 AM, Niko Matsakis wrote:
>> I am not a big fan of the `if` syntax.  Or at least I don't mind our current
>> one and it is nicely unambiguous.  However, I really like "alt with arrow"
>> syntax.  I find the current one quite unreadable, particularly for long
>> blocks.  The "=>" arrow (or `->`, I am somewhat indifferent) helps to set
>> apart the conditions and the blocks.
>
> Total bikeshed comment: ASCII arrows are a massive pain to type. Is
> there any possibility of using a single-character symbol there?
>
> -Joe

From banderson at mozilla.com Sat Apr 14 16:11:28 2012
From: banderson at mozilla.com (Brian Anderson)
Date: Sat, 14 Apr 2012 16:11:28 -0700
Subject: [rust-dev] Some documentation about rustdoc
Message-ID: <4F8A0420.6030806@mozilla.com>

I wrote up an introduction to rustdoc.

http://brson.github.com/rust/2012/04/14/how-to-rustdoc/

From theoneandonlykosta at googlemail.com Thu Apr 12 00:38:41 2012
From: theoneandonlykosta at googlemail.com (Kosta Welke)
Date: Thu, 12 Apr 2012 09:38:41 +0200
Subject: [rust-dev] Class UI
In-Reply-To: 
References: 
Message-ID: 

On Apr 10, 2012, at 9:26 PM, Masklinn wrote:
> I was reading
> http://smallcultfollowing.com/babysteps/blog/2012/04/09/rusts-object-system/
> today, and saw the description of the classes definition.
>
> Next to it, Nicholas notes:
>> I am not fond of the definition of constructors, in particular
>
> I can only agree, for a simple reason: the example is that of an
> initializer, but uses naming generally used for actual constructors.

Apart from the naming, I always found constructors as used in C, Java, Python, etc. to be quite a bit verbose and tedious. I think Dylan's got this about right: You'd just note whether the value should be initialized with a default value or must appear in the constructor. Consider:

    class colored_point {
        let x: int = 0, y: int = 0;
        let color: color init-required;
        fn abs() -> int { sqrt(x*x+y*y) }
    }

And using it like this:

    let p1 = colored_point(color: red); //0,0
    let p2 = colored_point(color: blue, y: 2); //0,2
    let p3 = colored_point(color: black, x: 3, y: 5) //3,5
    let p4 = colored_point(x: 5) //compile time error: "color" is missing as constructor argument

Of course, you can always supply your own constructor for more complicated stuff. But 95% of the time, this is very concise and helpful.

On the other hand, I'm not sure whether Rust has or wants to have keyword arguments for functions...

Cheers,
Kosta

From banderson at mozilla.com Sun Apr 15 17:38:33 2012
From: banderson at mozilla.com (Brian Anderson)
Date: Sun, 15 Apr 2012 17:38:33 -0700
Subject: [rust-dev] Fall-through in alt, break&continue by label
In-Reply-To: 
References: 
Message-ID: <4F8B6A09.3010205@mozilla.com>

On 04/10/2012 05:53 AM, Henri Sivonen wrote:
> It appears that Rust does not to have labeled loops with break and
> continue by label the way Java has. Also, it appears that alt does
> not have fall-through the way switch in C has.
>
> Are break and continue by label and/or fall-through in alt supported
> in some non-obvious and unadvertised way? If not, are there plans to
> add to these features? (While I understand that fall-through in
> switch is largely considered a misfeature, break and continue by label
> seem less controversial.)
>
> If there are no plans to add these features, what are the recommended
> ways to emulate these features in a way that compiles to efficient
> machine code?
>
> The use case I have is targeting Rust with the translator that
> currently targets C++ and generates the HTML parser in Gecko. (It uses
> goto hidden behind macros to emulate break and continue by label in
> C++.)
>

I spent some time piddling with Rust macros but came up empty handed. So I don't have a solution, but eliminating obstacles to generating an HTML parser strikes me as high priority.

I opened an issue for it.

https://github.com/mozilla/rust/issues/2216

From sebastian.sylvan at gmail.com Sun Apr 15 18:17:34 2012
From: sebastian.sylvan at gmail.com (Sebastian Sylvan)
Date: Sun, 15 Apr 2012 18:17:34 -0700
Subject: [rust-dev] Fall-through in alt, break&continue by label
In-Reply-To: 
References: 
Message-ID: 

On Tue, Apr 10, 2012 at 5:53 AM, Henri Sivonen wrote:
> It appears that Rust does not to have labeled loops with break and
> continue by label the way Java has.  Also, it appears that alt does
> not have fall-through the way switch in C has.
>
> Are break and continue by label and/or fall-through in alt supported
> in some non-obvious and unadvertised way?  If not, are there plans to
> add to these features?  (While I understand that fall-through in
> switch is largely considered a misfeature, break and continue by label
> seem less controversial.)
>
> If there are no plans to add these features, what are the recommended
> ways to emulate these features in a way that compiles to efficient
> machine code?

Could tail calls work? I.e. each "label" would equal a separate function (any state would have to be passed through), and then you'd just keep tail-calling from state to state. Without really knowing exactly what kind of code you're trying to generate, this seems like it might be workable.
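A rough sketch of the "one function per label" idea above, with one deliberate substitution: instead of real tail calls, each state function returns the next state and a small driver loop dispatches on it (a trampoline), so the sketch does not depend on tail-call elimination. It assumes present-day Rust syntax rather than the 2012 dialect, and every name in it (`State`, `Machine`, `count_tags`, the toy tag-counting grammar) is hypothetical and unrelated to the actual Gecko HTML parser.

```rust
// Present-day Rust syntax, not the 2012 dialect used in this thread.
// All names here (`State`, `Machine`, `count_tags`) are hypothetical.

#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Data,
    TagOpen,
    TagName,
    Done,
}

/// The state that has to be "passed through" from one label to the next.
struct Machine<'a> {
    input: &'a [u8],
    pos: usize,
    tags: usize,
}

impl Machine<'_> {
    /// Consume and return the next input byte, if any.
    fn next_byte(&mut self) -> Option<u8> {
        let byte = self.input.get(self.pos).copied();
        if byte.is_some() {
            self.pos += 1;
        }
        byte
    }
}

// One function per "label". Instead of tail-calling the next state,
// each function returns the state that should run next.
fn data(m: &mut Machine<'_>) -> State {
    match m.next_byte() {
        None => State::Done,
        Some(b'<') => State::TagOpen,
        Some(_) => State::Data,
    }
}

fn tag_open(m: &mut Machine<'_>) -> State {
    match m.next_byte() {
        None => State::Done,
        Some(b) if b.is_ascii_alphabetic() => State::TagName,
        Some(_) => State::Data,
    }
}

fn tag_name(m: &mut Machine<'_>) -> State {
    match m.next_byte() {
        None => State::Done,
        Some(b'>') => {
            m.tags += 1; // a complete `<name>` tag was seen
            State::Data
        }
        Some(_) => State::TagName,
    }
}

/// Driver loop: dispatch on the current state until `Done` is reached,
/// then return the number of complete tags seen.
fn count_tags(input: &str) -> usize {
    let mut m = Machine { input: input.as_bytes(), pos: 0, tags: 0 };
    let mut state = State::Data;
    loop {
        state = match state {
            State::Data => data(&mut m),
            State::TagOpen => tag_open(&mut m),
            State::TagName => tag_name(&mut m),
            State::Done => return m.tags,
        };
    }
}
```

Whether a compiler turns this dispatch loop into direct jumps between states is a separate question that this sketch does not answer.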
Seb -- Sebastian Sylvan From pwalton at mozilla.com Sun Apr 15 18:19:12 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Sun, 15 Apr 2012 18:19:12 -0700 Subject: [rust-dev] Fall-through in alt, break&continue by label In-Reply-To: References: Message-ID: <4F8B7390.1080300@mozilla.com> On 04/15/2012 06:17 PM, Sebastian Sylvan wrote: > Could tail calls work? I.e. each "label" would equal a separate > function (any state would have to be passed through), and then you'd > just keep tail-calling from state to state. Without really knowing > exactly what kind of code you're trying to generate, this seems like > it might be workable. Only if LLVM's optimizer is smart enough to turn that code into a goto-based state machine. I'm not sure if it is. (Of course, if it's not, that's possibly fixable...) Patrick From arcata at gmail.com Sun Apr 15 18:23:23 2012 From: arcata at gmail.com (Joe Groff) Date: Sun, 15 Apr 2012 18:23:23 -0700 Subject: [rust-dev] Fall-through in alt, break&continue by label In-Reply-To: <4F8B7390.1080300@mozilla.com> References: <4F8B7390.1080300@mozilla.com> Message-ID: On Sun, Apr 15, 2012 at 6:19 PM, Patrick Walton wrote: > Only if LLVM's optimizer is smart enough to turn that code into a goto-based > state machine. I'm not sure if it is. (Of course, if it's not, that's > possibly fixable...) IIRC there was talk of adding explicit tail calls to the language a while back. Did that get shot down? -Joe From a.stavonin at gmail.com Mon Apr 16 06:01:35 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Mon, 16 Apr 2012 22:01:35 +0900 Subject: [rust-dev] How to join together C and Rust code? Message-ID: Is it possible to include into the one crate as Rust as C code? From a.stavonin at gmail.com Sun Apr 15 17:51:27 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Mon, 16 Apr 2012 09:51:27 +0900 Subject: [rust-dev] How to allocate record on memory? 
Message-ID: 

I have a C function with a void* argument which will be passed to a callback, and I want to pass a Rust record as that argument. The question is how to allocate the record on the heap rather than on the stack. I've tried a lot of different ways, all with the same result: in the callback function the record data looks as if it has already been destroyed.

example:

...
type data_rec = {
    on_connect_cb: fn@(listner: *evconnlistener, sock: c_int) -> bool,
    data: int,
    data1: int
};

fn re_listener_new_bind(ev_base: *event_base, flags: [listner_flags],
                        addr: sockaddr,
                        on_connect: fn@(listner: *evconnlistener, sock: c_int) -> bool)
    -> *evconnlistener unsafe {
    ...
    // initialising callback:
    let callback = unsafe{ {on_connect_cb: on_connect, data: 10, data1: -10} }; // will the record be allocated on the stack?
    ret ev::evconnlistener_new_bind(ev_base, connect_callback, data, res_flags,
                                    -1 as c_int, ptr::addr_of(a), l as c_int);
}

crust fn connect_callback(listner: *evconnlistener, sock: c_int,
                          sockaddr: *c_void, len: c_int, ptr: *c_void) unsafe {
    let data = ptr as *data_rec;
    io::println(#fmt("%u, data: %d, data1 %d", ptr as uint, (*data).data, (*data).data1));
}
...

console output:
1088426520, data: 1088426784, data1 146814400

expected output:
1088426520, data: 10, data1 -10

From graydon at mozilla.com Mon Apr 16 11:39:23 2012
From: graydon at mozilla.com (Graydon Hoare)
Date: Mon, 16 Apr 2012 11:39:23 -0700
Subject: [rust-dev] How to join together C and Rust code?
In-Reply-To: 
References: 
Message-ID: <4F8C675B.1030303@mozilla.com>

On 12-04-16 06:01 AM, Alexander Stavonin wrote:
> Is it possible to include into the one crate as Rust as C code?

Rust can call C by declaring the C functions within a suitably named rust native module. C can call rust only by rust passing C a pointer to a 'crust' function.

Currently the rust compiler cannot compile C code directly; it must be compiled and linked separately.
We might be changing this in the near future, by incorporating clang into our LLVM support library, but we have not decided to do that yet, nor started work on it. So that feature, if it happens, is still a ways off.

-Graydon

From banderson at mozilla.com Mon Apr 16 11:41:17 2012
From: banderson at mozilla.com (Brian Anderson)
Date: Mon, 16 Apr 2012 11:41:17 -0700
Subject: [rust-dev] How to join together C and Rust code?
In-Reply-To: 
References: 
Message-ID: <4F8C67CD.4030802@mozilla.com>

On 04/16/2012 06:01 AM, Alexander Stavonin wrote:
> Is it possible to include into the one crate as Rust as C code?

Yes it is possible. There's no explicit support for it but native modules may refer to static libraries and it's relatively easy to link arbitrary static libraries into a crate. rust-azure[1] does this so it doesn't have to worry about keeping track of a native dynamic library.

It's not quite ergonomic right now but rust-azure does it by including the following hack (in test.rs of all places):

    #[link_args = "-L. -lcairo -lazure"]
    #[nolink]
    native mod m { }

The native mod is just a place to tack on the extra link args: `-L` adds to the library search path, `-lcairo` is a system-installed dependency of azure, and `-lazure` is the static library that is linked into the crate.

There are a few[2] upcoming changes[3] that will make this easier in the future.
-Brian [1]:https://github.com/brson/rust-azure [2]:https://github.com/mozilla/rust/issues/2217 [3]:https://github.com/mozilla/rust/issues/2218 From graydon at mozilla.com Mon Apr 16 11:46:18 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Mon, 16 Apr 2012 11:46:18 -0700 Subject: [rust-dev] Fall-through in alt, break&continue by label In-Reply-To: References: <4F8B7390.1080300@mozilla.com> Message-ID: <4F8C68FA.3090004@mozilla.com> On 12-04-15 06:23 PM, Joe Groff wrote: > On Sun, Apr 15, 2012 at 6:19 PM, Patrick Walton wrote: >> Only if LLVM's optimizer is smart enough to turn that code into a goto-based >> state machine. I'm not sure if it is. (Of course, if it's not, that's >> possibly fixable...) > > IIRC there was talk of adding explicit tail calls to the language a > while back. Did that get shot down? They're already "present" (were from the beginning) but they broke when we shifted from rustboot (hand-rolled code generator) to rustc (LLVM). It turns out that you have to adopt a somewhat pessimistic ABI in all cases if your functions are to be tail-callable. There's a bug open on this[1] that discusses in some more detail, but I think the feature is drifting towards a decision to remove the feature altogether. -Graydon [1] https://github.com/mozilla/rust/issues/217 From pwalton at mozilla.com Mon Apr 16 11:49:47 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Mon, 16 Apr 2012 11:49:47 -0700 Subject: [rust-dev] Fall-through in alt, break&continue by label In-Reply-To: <4F8C68FA.3090004@mozilla.com> References: <4F8B7390.1080300@mozilla.com> <4F8C68FA.3090004@mozilla.com> Message-ID: <4F8C69CB.8070704@mozilla.com> On 4/16/12 11:46 AM, Graydon Hoare wrote: > They're already "present" (were from the beginning) but they broke when > we shifted from rustboot (hand-rolled code generator) to rustc (LLVM). > It turns out that you have to adopt a somewhat pessimistic ABI in all > cases if your functions are to be tail-callable. 
There's a bug open on > this[1] that discusses in some more detail, but I think the feature is > drifting towards a decision to remove the feature altogether. I actually disagree with this; I think that we should measure. I'm not sure that the Pascal calling convention is worse than the C calling convention in practice. In any case, I believe we're doing sibling call optimization already. Patrick From graydon at mozilla.com Mon Apr 16 11:54:43 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Mon, 16 Apr 2012 11:54:43 -0700 Subject: [rust-dev] Fall-through in alt, break&continue by label In-Reply-To: <4F8C69CB.8070704@mozilla.com> References: <4F8B7390.1080300@mozilla.com> <4F8C68FA.3090004@mozilla.com> <4F8C69CB.8070704@mozilla.com> Message-ID: <4F8C6AF3.5010704@mozilla.com> On 12-04-16 11:49 AM, Patrick Walton wrote: > On 4/16/12 11:46 AM, Graydon Hoare wrote: >> They're already "present" (were from the beginning) but they broke when >> we shifted from rustboot (hand-rolled code generator) to rustc (LLVM). >> It turns out that you have to adopt a somewhat pessimistic ABI in all >> cases if your functions are to be tail-callable. There's a bug open on >> this[1] that discusses in some more detail, but I think the feature is >> drifting towards a decision to remove the feature altogether. > > I actually disagree with this; I think that we should measure. I'm not > sure that the Pascal calling convention is worse than the C calling > convention in practice. > > In any case, I believe we're doing sibling call optimization already. Ok! I certainly don't _dislike_ tail calls (put them in intentionally), I just thought the mood had mostly soured on them due to the assumed perf overhead. I'm happy to dig further and try to get this working again. 
(They also interact with borrowing rules and alias analysis -- the caller frame may not exist -- but it's probably tractable to express that in terms of region constraints a tail-caller must make sure to satisfy) -Graydon From graydon at mozilla.com Mon Apr 16 12:22:47 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Mon, 16 Apr 2012 12:22:47 -0700 Subject: [rust-dev] 2 possible simplifications: reverse application, records as arguments In-Reply-To: <4F89140C.70408@gmail.com> References: <4F89140C.70408@gmail.com> Message-ID: <4F8C7187.2000107@mozilla.com> On 12-04-13 11:07 PM, Kobi Lurie wrote: > one is reverse application. > it's actually logical and might simplify things. > (similar to extension methods in c#) > there are no classes, but a syntax like: obj.method(b,c) still exists. > from what I could tell, the function is really method(obj,b,c), and the > previous syntax gets translated to it. which is just nicer for the > programmers, minimizing parentheses. This is what our typeclasses (impls and ifaces) currently do. > another thing is that instead of passing arguments, you pass just one > (anonymous) record. the record is the arguments. We actually had quite an argument with one of the Felix authors about this. This was not, back then, a terribly realistic option during that conversation (argument modes were still the primary way we were doing safe references, which are not first class types). But it's conceivably something we could look into if we get the argument-passing logic down to "always by-value and use region pointers for safe references" (which is where we're going). There remain some hitches: - Our syntax isn't quite compatible with the idea; records have to be brace-parenthesized and tuples given round parentheses. They'd need reform, and the syntax is already pretty crowded. - We'd have to decide the rules surrounding keyword mismatches, partial provision of keywords, argument permutation, and function types. 
- Our records are order-sensitive, to be C-structure compatible. Keyword arguments are usually argued-for (as you are doing here) as a way to make function arguments order-insensitive. We'd need to decide whether we wanted order to matter or not. - Argument-passing tends to work best when you can pass stuff in registers, not require the arguments to be spilled to memory and addressable as a contiguous slab. So we'd want to be careful not to require the "arguments structure" to be _actually_ addressable as a structure at any point. Rather, calling f(x) would be some kind of semantic sugar for `f(x.a, x.b, x.c)`, making separate copies for the sake of passing. So .. I can see a possibility here, but it'd be a complicated set of issues to work through. Would need some serious design work. I've never been intrinsically opposed to it, just felt that we were constrained by other choices in the language. At the time, argument modes were completely prohibitive; now it might be possible, but is still not entirely straightforward. -Graydon From banderson at mozilla.com Mon Apr 16 12:40:19 2012 From: banderson at mozilla.com (Brian Anderson) Date: Mon, 16 Apr 2012 12:40:19 -0700 Subject: [rust-dev] -L flag now also used for native libraries Message-ID: <4F8C75A3.8070605@mozilla.com> Hey. Using bindings has been complicated by our lack of configuration for locating native libraries. Until now we've had to hack in `link_args` attributes in order to specify where native libraries live. Today I checked in a change that makes the `-L` flag also apply to native libraries, so when you want to run bindgen with your build of libclang located at /home/brian/dev/clang-for-bindgen/Release+Asserts/lib you can simply write `rustc bindgen.rc -L /home/brian/dev/clang-for-bindgen-/Release+Asserts/lib`. Yay. 
-Brian From jonathan.bragg at alum.rpi.edu Mon Apr 16 14:09:04 2012 From: jonathan.bragg at alum.rpi.edu (Nate Bragg) Date: Mon, 16 Apr 2012 17:09:04 -0400 Subject: [rust-dev] 2 possible simplifications: reverse application, records as arguments In-Reply-To: <4F8C7187.2000107@mozilla.com> References: <4F89140C.70408@gmail.com> <4F8C7187.2000107@mozilla.com> Message-ID: On Mon, Apr 16, 2012 at 3:22 PM, Graydon Hoare wrote: > - Our records are order-sensitive, to be C-structure compatible. > Keyword arguments are usually argued-for (as you are doing here) as > a way to make function arguments order-insensitive. We'd need to > decide whether we wanted order to matter or not. > Could the compiler just reorder them behind the scenes? Then records could remain order-sensitive internally, and order mattering would be a non-issue. As a tangent, I have found the best use of keyword args to be made by Python, where they enable arbitrary parameters to have defaults (as opposed to c++ style, where default arguments can only be given from the final parameter backwards). -------------- next part -------------- An HTML attachment was scrubbed... URL: From marijnh at gmail.com Mon Apr 16 15:26:03 2012 From: marijnh at gmail.com (Marijn Haverbeke) Date: Tue, 17 Apr 2012 00:26:03 +0200 Subject: [rust-dev] paring back the self type to be, well, just a type In-Reply-To: <4F875CBC.2030903@alum.mit.edu> References: <4F870039.5030401@mozilla.com> <4F875CBC.2030903@alum.mit.edu> Message-ID: It's not just monads that require parameterized self types. It comes up even in something like a generic collection type, if you want to have a map operator. I agree that the problems you raised are real, but I think giving up on the self type this easily would be a shame. It seems that it wouldn't be hard to take the low-tech approach of simply spitting out a well-defined error when one of the situations you describe occurs. 
Self types are used only as a kind of templates that are filled in when an impl implements an iface. They don't leak into the type system in general, as far as I can see (granted, I can't see all that far, for I am not a type theoretician). Best, Marijn On Fri, Apr 13, 2012 at 12:52 AM, Niko Matsakis wrote: > On 4/12/12 2:08 PM, Dave Halperin wrote: >> >> I'm not advocating either way on this in terms of the complexity tradeoff >> for adding a kind system, just pointing out that it's what you'd need to >> make the current system work and it's not completely crazy to want that >> flexibility out of the type system. > > > Yes, this was one use case that the self type was intended to model (though > not the one that I think will come up most often). I don't think this is > crazy by any means, but right now our type system has no notion of > (Haskell-style) kinds and it'd be a big change to add them. > > It's possible that the self type should be yanked altogether, but it's come > in handy for me many times, but always in its simpler incarnation of "the > type of the receiver" rather than "the type function defined by the current > iface". > > In any case, I spent some time trying to adapt the iface system---in any > form!---to writing generic monadic code and I decided it's just not a very > good fit. There are two major hurdles. > > First, we have no good way to define a "return" function (though perhaps we > could add static or class methods to ifaces). > > Second, and this is more major, we define monadic implementations for some > concrete type and we should be doing it for all types. In principle, > something like > impl of monad for option { ... } > seems like it does what we want, but it also permits things like: > impl of monad for option { ... } > for which there is no clear mapping from which to derive the self type. I > think to do this correctly, we'd rather write something like: > impl of monad for option { ... 
} > which would also fit with the "kind" notion of Haskell: here the type being > implemented for has kind * => * and so does monad. > > Of course, we would also need to be able to write things like: > fn mapM( > a: [A], > b: fn(A) -> M) -> M<[B]> { ... } > Here the parameter M is bound to some (* => *) type constructor for which > monad is defined. > > I am not opposed to adding such capabilities at some point. But they don't > feel like burning issues to me right *now*. > > > > Niko > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From stefan.plantikow at googlemail.com Mon Apr 16 16:07:43 2012 From: stefan.plantikow at googlemail.com (Stefan Plantikow) Date: Tue, 17 Apr 2012 01:07:43 +0200 Subject: [rust-dev] Fall-through in alt, break&continue by label In-Reply-To: <4F8C6AF3.5010704@mozilla.com> References: <4F8B7390.1080300@mozilla.com> <4F8C68FA.3090004@mozilla.com> <4F8C69CB.8070704@mozilla.com> <4F8C6AF3.5010704@mozilla.com> Message-ID: <6EA8A23646264C49A25513CF08EDC734@googlemail.com> Hi, On Monday, 16 April 2012 at 20:54, Graydon Hoare wrote: > On 12-04-16 11:49 AM, Patrick Walton wrote: > > On 4/16/12 11:46 AM, Graydon Hoare wrote: > > > They're already "present" (were from the beginning) but they broke when > > > we shifted from rustboot (hand-rolled code generator) to rustc (LLVM). > > > It turns out that you have to adopt a somewhat pessimistic ABI in all > > > cases if your functions are to be tail-callable. There's a bug open on > > > this[1] that discusses in some more detail, but I think the feature is > > > drifting towards a decision to remove the feature altogether. > > > > > > > > I actually disagree with this; I think that we should measure. I'm not > > sure that the Pascal calling convention is worse than the C calling > > convention in practice. > > > > In any case, I believe we're doing sibling call optimization already. 
> > Ok! I certainly don't _dislike_ tail calls (put them in intentionally), > I just thought the mood had mostly soured on them due to the assumed > perf overhead. I'm happy to dig further and try to get this working again. > +1 Some algorithms just yearn for being written in a recursive style. I imagine a new user who first discovers rust type classes will be quite surprised to find out that there are no tail calls; it goes quite against the idea of being able to program in a functional style and recursion seems to be quite a perfect match for a language that already has pattern matching over immutable variables. In erlang there is an interesting use of tail calls related to atomic code migration: Actors spin on a tail-callable function. This recursive call serves as the handover point during code migration, i.e. the loop state is passed on but the code changes. To achieve something similar with an actor that loops over requests via while seems more involved to me. I'd rather see tail callability remain as a feature that has to be requested explicitly by the user than to nick it completely. At one point I wrote a wiki page that discusses the idea of making "tail callability" a part of the function type. This would allow switching to the proper calling convention at the call site. While that is admittedly awkward and limiting it should still be enough for many useful scenarios (module-local recursion, and recursively written main loops). Cheers, Stefan. 
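The trade-off in this thread (tail calls versus explicit loops for state machines) can be sketched roughly as follows. This is only an illustration in present-day Rust syntax, not the 2012 dialect used by the posters; the `State` enum and the `step` and `run` functions are hypothetical names invented for the example. Each transition returns the next state and a driver loop replaces the chain of tail calls, so stack use stays constant.

```rust
// Hypothetical three-state machine; all names here are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Start,
    Running(u32),
    Done(u32),
}

// One transition of the machine. In tail-call style, each arm would
// instead end by calling the function that handles the next state.
fn step(state: State) -> State {
    match state {
        State::Start => State::Running(0),
        State::Running(n) if n >= 3 => State::Done(n),
        State::Running(n) => State::Running(n + 1),
        State::Done(n) => State::Done(n),
    }
}

// Driver loop standing in for the chain of tail calls: the loop state
// is handed from one iteration to the next without growing the stack.
fn run(mut state: State) -> u32 {
    loop {
        if let State::Done(n) = state {
            return n;
        }
        state = step(state);
    }
}
```

Calling `run(State::Start)` walks the machine through its transitions until it reaches `Done`. The cost relative to tail-call style is the extra enum and the central dispatch in `step`, which is the kind of "constant factor of contortion" the thread refers to.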
From graydon at mozilla.com Mon Apr 16 16:50:09 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Mon, 16 Apr 2012 16:50:09 -0700 Subject: [rust-dev] Fall-through in alt, break&continue by label In-Reply-To: <6EA8A23646264C49A25513CF08EDC734@googlemail.com> References: <4F8B7390.1080300@mozilla.com> <4F8C68FA.3090004@mozilla.com> <4F8C69CB.8070704@mozilla.com> <4F8C6AF3.5010704@mozilla.com> <6EA8A23646264C49A25513CF08EDC734@googlemail.com> Message-ID: <4F8CB031.1000901@mozilla.com> On 12-04-16 04:07 PM, Stefan Plantikow wrote: > Some algorithms just yearn for being written in a recursive style. Or state-machine style, yeah. IMO the use case for tail calls is more state machines than recursion. But both can be rewritten without. I think it's just a style issue; I don't actually know of any cases where a rewrite from tailcall style to non costs more than a constant factor of contortion. But I agree it's a nice style to support if it can be kept cheap. > In erlang there is an interesting use of tail calls related to atomic code migration Yeah, but they are running in a late-bound VM with a fixed repertoire of latent types. Everything about that environment is different. In rust, if you change anything about your datatypes a hot code reload couldn't work (it'll fail to find the same symbols). And even if you keep the types exactly the same, your lazily-bound PLTs will break. And even if they don't, any pointers into the static data regions of your DSOs will break. And so on. I went to some unusual efforts to try to support hot code upgrading at first in rust (as much as "manually linking DSOs") but have long since given up on it. It doesn't match this space. Use subprocesses. > At one point I wrote a wiki page that discusses the idea of making "tail callability" a part of the function type. It's possible. But I (personally) feel this particular strategy is more cost than it's worth for the style benefit. 
Would be better to push on the costs to see if we can get them to "tolerable", if we want to keep the feature (or limit to "just module-local and static" or something). -Graydon From nmatsakis at mozilla.com Mon Apr 16 15:50:26 2012 From: nmatsakis at mozilla.com (Nicholas Matsakis) Date: Mon, 16 Apr 2012 15:50:26 -0700 (PDT) Subject: [rust-dev] paring back the self type to be, well, just a type In-Reply-To: Message-ID: <1001865906.415729.1334616626083.JavaMail.root@zmmbox4.mail.corp.phx1.mozilla.com> I think that if we want to support something like the `self` type, we need to go all out and truly support higher-kinded types. The current system is too weak to really express anything useful. I mean, you can write useful ifaces, but you can't work with them in a generic fashion. For example: iface coll { fn map(f: fn(A) -> B) -> self; } fn to_int>(c: C) -> ... { c.map { |i| i as int } } This cannot be compiled. What return type should it have? (Note that if you don't care about working with the collections generically, then you don't need to include `map` in the iface anyhow, so the self type is rather irrelevant) Also, a word of caution: it is not at all clear that `self` is the correct result type for a map operation on a collection. For many collections, it is, but not all. Basically it prevents collections which are specialized to types of their contents (e.g., bitsets) from implementing the collection interface. The Scala guys took this line of reasoning to the hilt, I doubt we want to go that far (though the resulting system is very powerful). A simpler solution is just to have a basic collection type that does not include operations to create new collections (this could be implemented by bitset), and then an extended collection type that includes mapping operations. The latter would only be used by collections (like vector) that are fully generic with respect to the kinds of types they can store. 
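For readers comparing with later developments: the question "what return type should it have?" can be expressed in present-day Rust with a generic associated type. The sketch below is an assumption-laden illustration, not anything the 2012 compiler supported; the trait `Coll`, its associated type `Mapped`, and the method `map_all` are hypothetical names made up for this example.

```rust
// Hypothetical collection trait; `Mapped<B>` names "the same kind of
// collection, but holding B", which is what the thread's `self` type
// could not express without higher-kinded types.
trait Coll {
    type Item;
    type Mapped<B>;
    fn map_all<B, F: FnMut(Self::Item) -> B>(self, f: F) -> Self::Mapped<B>;
}

impl<A> Coll for Vec<A> {
    type Item = A;
    type Mapped<B> = Vec<B>;
    fn map_all<B, F: FnMut(A) -> B>(self, f: F) -> Vec<B> {
        self.into_iter().map(f).collect()
    }
}

impl<A> Coll for Option<A> {
    type Item = A;
    type Mapped<B> = Option<B>;
    fn map_all<B, F: FnMut(A) -> B>(self, f: F) -> Option<B> {
        self.map(f)
    }
}

// Generic code over any collection of i32: the return type is written
// in terms of the collection's own `Mapped` constructor.
fn to_i64<C: Coll<Item = i32>>(c: C) -> C::Mapped<i64> {
    c.map_all(|i| i64::from(i))
}
```

Because each implementation chooses its own `Mapped<B>`, a specialized collection is free to map into a different container type, which speaks to the caution raised above about `self` not always being the right result type.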
My current (not terribly satisfying) plan was to have `map()` return `[T]`. Obviously not great. Naturally, you'd end up with customized methods to different collections that returned results in a more specialized type. It would then be impossible to work with these customized methods in a generic way which is too bad. Niko ----- Original Message ----- From: "Marijn Haverbeke" To: "Niko Matsakis" Cc: "Dave Halperin" , rust-dev at mozilla.org, "Niko Matsakis" Sent: Monday, April 16, 2012 3:26:03 PM Subject: Re: [rust-dev] paring back the self type to be, well, just a type It's not just monads that require parameterized self types. It comes up even in something like a generic collection type, if you want to have a map operator. I agree that the problems you raised are real, but I think giving up on the self type this easily would be a shame. It seems that it wouldn't be hard to take the low-tech approach of simply spitting out a well-defined error when one of the situations you describe occurs. Self types are used only as a kind of templates that are filled in when an impl implements an iface. They don't leak into the type system in general, as far as I can see (granted, I can't see all that far, for I am not a type theoretician). Best, Marijn On Fri, Apr 13, 2012 at 12:52 AM, Niko Matsakis wrote: > On 4/12/12 2:08 PM, Dave Halperin wrote: >> >> I'm not advocating either way on this in terms of the complexity tradeoff >> for adding a kind system, just pointing out that it's what you'd need to >> make the current system work and it's not completely crazy to want that >> flexibility out of the type system. > > > Yes, this was one use case that the self type was intended to model (though > not the one that I think will come up most often). I don't think this is > crazy by any means, but right now our type system has no notion of > (Haskell-style) kinds and it'd be a big change to add them. 
> > It's possible that the self type should be yanked altogether, but it's come > in handy for me many times, but always in its simpler incarnation of "the > type of the receiver" rather than "the type function defined by the current > iface". > > In any case, I spent some time trying to adapt the iface system---in any > form!---to writing generic monadic code and I decided it's just not a very > good fit. There are two major hurdles. > > First, we have no good way to define a "return" function (though perhaps we > could add static or class methods to ifaces). > > Second, and this is more major, we define monadic implementations for some > concrete type and we should be doing it for all types. In principle, > something like > impl of monad for option { ... } > seems like it does what we want, but it also permits things like: > impl of monad for option { ... } > for which there is no clear mapping from which to derive the self type. I > think to do this correctly, we'd rather write something like: > impl of monad for option { ... } > which would also fit with the "kind" notion of Haskell: here the type being > implemented for has kind * => * and so does monad. > > Of course, we would also need to be able to write things like: > fn mapM( > a: [A], > b: fn(A) -> M) -> M<[B]> { ... } > Here the parameter M is bound to some (* => *) type constructor for which > monad is defined. > > I am not opposed to adding such capabilities at some point. But they don't > feel like burning issues to me right *now*. 
> > > > Niko > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From stefan.plantikow at googlemail.com Mon Apr 16 17:58:22 2012 From: stefan.plantikow at googlemail.com (Stefan Plantikow) Date: Tue, 17 Apr 2012 02:58:22 +0200 Subject: [rust-dev] Functions overloading In-Reply-To: <4F86DCE0.6040609@alum.mit.edu> References: <6CFF2562-BFD7-4A47-B8B1-072A638E3B93@gmail.com> <4F86A7FE.10107@mozilla.com> <4F86DCE0.6040609@alum.mit.edu> Message-ID: <79F159C0DE47447FB4DEB6DAD5B89D16@googlemail.com> Hi, On Thursday, 12 April 2012 at 15:47, Niko Matsakis wrote: > Not only that, but it adds an extra layer of complexity for type > inferencing, which quite frankly is already complex enough. > > That said, you can do a limited form of overloading using impls: > > impl methods for int { > fn foo() { ... } > } > > impl methods for uint { > fn foo() { ... } > } > > I was reading this again and thought that as a middle ground between type classes and enums, one could use type classes that wrap supported types into enum variants for easily switching over them. enum input { val(t), seq([t]) } iface to_input { fn to_input() -> input } impl of to_input for int { fn to_input() -> input { ret val(self); } } impl of to_input for [int] { fn to_input() -> input { ret seq(self); } } fn to_input(i: to_input) { alt i.to_input() { val(v) { ... } seq(v) { ... } } } This way the user does not have to know about the internal enums and the implementor can simply switch (especially useful with multiple arguments). I think it would be nice to have some syntax support for building ifaces like that, perhaps via macros. 
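The enum-wrapping pattern described above can be sketched in present-day Rust as follows. This is a hedged illustration rather than the 2012 code from the message: the names `Input`, `ToInput`, and `total` are hypothetical, and traits stand in for what the thread calls ifaces.

```rust
// Internal enum the implementor switches over; callers never name it.
enum Input<T> {
    Val(T),
    Seq(Vec<T>),
}

// Conversion trait: each supported argument type wraps itself in a variant.
trait ToInput<T> {
    fn to_input(self) -> Input<T>;
}

impl ToInput<i32> for i32 {
    fn to_input(self) -> Input<i32> {
        Input::Val(self)
    }
}

impl ToInput<i32> for Vec<i32> {
    fn to_input(self) -> Input<i32> {
        Input::Seq(self)
    }
}

// Hypothetical "overloaded" function: accepts a single int or a vector
// of ints and dispatches with one match on the wrapped value.
fn total<I: ToInput<i32>>(i: I) -> i32 {
    match i.to_input() {
        Input::Val(v) => v,
        Input::Seq(vs) => vs.iter().sum(),
    }
}
```

Callers write `total(5_i32)` or `total(vec![1, 2, 3])` and never see `Input`. With several such parameters the body can match on a tuple of wrapped values, which is the multiple-argument case mentioned in the message.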
-- Stefan From niko at alum.mit.edu Mon Apr 16 18:08:33 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Mon, 16 Apr 2012 18:08:33 -0700 Subject: [rust-dev] Functions overloading In-Reply-To: <79F159C0DE47447FB4DEB6DAD5B89D16@googlemail.com> References: <6CFF2562-BFD7-4A47-B8B1-072A638E3B93@gmail.com> <4F86A7FE.10107@mozilla.com> <4F86DCE0.6040609@alum.mit.edu> <79F159C0DE47447FB4DEB6DAD5B89D16@googlemail.com> Message-ID: <4F8CC291.5080604@alum.mit.edu> On 4/16/12 5:58 PM, Stefan Plantikow wrote: > I think it would be nice too have some syntax support for building ifaces like that, perhaps via macros. I can see this approach being useful in cases of multiple inputs. In any case, it reminds me of this bug: https://github.com/mozilla/rust/issues/1838 Niko From steven099 at gmail.com Mon Apr 16 18:23:13 2012 From: steven099 at gmail.com (Steven Blenkinsop) Date: Mon, 16 Apr 2012 21:23:13 -0400 Subject: [rust-dev] iface type parameter restrictions Message-ID: Motivated by the thread "paring back the self type to be, well, just a type" I decided to see how far I could get toward implementing monads given the self type [constructor] as-is. I ran into several things that got in the way, so I'm wondering how many of these are intentional. Here are some reduced test cases and the compiler errors they gave. The last one is obviously a compiler bug. ************* iface I1>> {} iface I2>> {} fn main() {} testcase1.rs:1:12: 1:21 error: illegal recursive type. insert a enum in the cycle, if this is desired) testcase1.rs:1 iface I1>> {} ^~~~~~~~~ ************** The enum suggestion here isn't really useful. There's no inherent reason that this can't work afaics, other than the difficulty you'd have instantiating it. 
************* iface I1 {} iface I2 {} fn f, U: I2>() {} fn main() {} testcase5.rs:4:11: 4:12 error: unresolved typename: U testcase5.rs:4 fn f, U: I2>() {} ^ ************** Okay, I get it, type parameters must be introduced before you can use them in your constraints. It would be nice to be able to instantiate those recursive interfaces, though (if they worked). ************* iface I1 { fn F>>(); } iface I2>> {} fn main() {} testcase2.rs:2:9: 2:18 error: unbound path I2> testcase2.rs:2 fn F>>(); ^~~~~~~~~ ************** See, here I was hoping to get around the restriction on recursive interfaces by leaving the type parameterization until method invocation, but clearly it doesn't work, though I can't entirely tell from the error message why. ************* iface I1 { fn F>>(); } iface I2>> {} impl of I1 for [T] { fn F>() {} } fn main() {} testcase3.rs:7:13: 7:14 error: unresolved typename: T testcase3.rs:7 fn F>() {} ^ error: aborting due to previous errors ************* This is even weirder. It won't let me use impl type parameters in method type parameter constraints. This is masking the previous error. ************* iface I1 { fn f(u: U) -> self; } impl of I1 for [T] { fn f(u: U) -> [U] { [u] } } fn func>(i: I) { i.f(5); } fn main() {} error: internal compiler error unexpected failure note: The compiler hit an unexpected failure path. This is a bug. Try running with RUST_LOG=rustc=0,::rt::backtrace to get further details and report the results to github.com/mozilla/rust/issues ************** Yeah, this is a bug. I'm not sure whether the code itself is right, though. ** -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From niko at alum.mit.edu Mon Apr 16 20:24:19 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Mon, 16 Apr 2012 20:24:19 -0700 Subject: [rust-dev] iface type parameter restrictions In-Reply-To: References: Message-ID: <4F8CE263.8090804@alum.mit.edu> On 4/16/12 6:23 PM, Steven Blenkinsop wrote: > Motivated by the thread "paring back the self type to be, well, just a > type" > I > decided to see how far I could get toward implementing monads given > the self type [constructor] as-is. I ran into several things that got > in the way, so I'm wondering how many of these are intentional. Here > are some reduced test cases and the compiler errors they gave. The > last one is obviously a compiler bug. I haven't read through these all in detail, but these kinds of tests are always helpful. I was planning to go through soon and make up some more twisted torture tests of recursive iface bounds and so forth, so now you gave me a headstart. :) The first message (about enums) reflects a somewhat outdated rule dating from the time in which enums were the only nominal type in our type system (meaning a type whose equality is based on its name, not its definition), and hence all cycles had to go through an enum. Interface types are also nominal (and class types too) so we ought to loosen that rule. Niko From ben.striegel at gmail.com Tue Apr 17 08:53:29 2012 From: ben.striegel at gmail.com (Benjamin Striegel) Date: Tue, 17 Apr 2012 11:53:29 -0400 Subject: [rust-dev] Keyword cleanup Message-ID: There was an exchange on IRC last week about removing some keywords. It seemed a bit too broad and open-ended to open an issue for it, but I didn't want it to get lost entirely. pcwalton:we could get rid of "resource" and "be", maybe "while" and "log", although "trait" will add another pcwalton:maybe unifying import and use like we talked about graydon:'as' can go if we pick up go's expr.(type) syntax pcwalton:yes pcwalton:I like that syntax graydon:is 'tag' still listed? 
looks like in my docs. it should be 'enum' now pcwalton:"note" might be better done as a special kind of class pcwalton:just RAII-based pcwalton:notes sitting on your stack print themselves out during unwinding, but hide otherwise graydon:maybe. that's how it was done in monotone (where I copied the feature from) graydon:'block' is a dead keyword, you can remove that graydon:as is 'obj' graydon:'syntax' is likely to prove redundant, or might if we shift around the order of evaluating attributes and/or make "activating a compiler plugin" something you can do via attributes. graydon:pcwalton: 'in' is also dead, I think. 'with' could probably be turned into something symbolic w/o much work. -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.stavonin at gmail.com Mon Apr 16 16:21:36 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Tue, 17 Apr 2012 08:21:36 +0900 Subject: [rust-dev] How to allocate record on memory? Message-ID: Is this the only way to create record in memory? Maybe someone has better ideas. fn mk_mem_obj<T>() -> *T { libc::malloc(sys::size_of::<T>()) as *T } unsafe fn mk_mem_copy_of_obj<T>(src: T) -> *T { let size = sys::size_of::<T>(); let dst = libc::malloc(size); libc::memcpy(dst, ptr::addr_of(src) as *libc::c_void, size); ret dst as *T; } -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko at alum.mit.edu Tue Apr 17 14:13:54 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 17 Apr 2012 14:13:54 -0700 Subject: [rust-dev] How to allocate record on memory? In-Reply-To: References: Message-ID: <4F8DDD12.2010407@alum.mit.edu> I don't *really* understand what you are trying to do, but I think you have two choices: (1) A call to libc::malloc, like you showed in your later mail. (2) Allocate the type as a @T and then use ptr::addr_of(*x) to get an unsafe ptr from that. Then you are responsible for keeping a live reference to the @T so that we don't collect it. 
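The underlying problem in this thread (a record handed to a C callback through a `void*` must outlive the function that created it) can be sketched in present-day Rust as follows. This is an illustration under stated assumptions, not the 2012 API: `DataRec`, `connect_callback`, `fake_c_register`, and `demo` are hypothetical names, and `fake_c_register` is a plain Rust stand-in for the C library so that the example is self-contained.

```rust
use std::ffi::c_void;

// Hypothetical record, loosely mirroring `data_rec` from the question.
struct DataRec {
    data: i32,
    data1: i32,
}

// Callback with a C ABI; `ptr` is the opaque user-data pointer.
extern "C" fn connect_callback(ptr: *mut c_void) -> i32 {
    // Safety: `ptr` must come from `Box::into_raw` on a `DataRec`
    // that has not been freed yet.
    let rec = unsafe { &*(ptr as *const DataRec) };
    rec.data - rec.data1
}

// Stand-in for the C library (an assumption for this sketch): it simply
// invokes the callback with the user data it was given.
fn fake_c_register(cb: extern "C" fn(*mut c_void) -> i32, user_data: *mut c_void) -> i32 {
    cb(user_data)
}

fn demo() -> i32 {
    // `Box::new` places the record on the heap; `Box::into_raw` gives up
    // ownership, so the record is not freed when this scope ends.
    let raw = Box::into_raw(Box::new(DataRec { data: 10, data1: -10 })) as *mut c_void;
    let result = fake_c_register(connect_callback, raw);
    // Once the C side no longer uses the pointer, take ownership back so
    // the allocation is released exactly once.
    unsafe { drop(Box::from_raw(raw as *mut DataRec)) };
    result
}
```

A record created as a local value lives in the creating function's stack frame, so a callback that runs after that function returns reads dead memory, which matches the garbage output reported in the question. Moving the record to the heap and releasing it explicitly avoids that, at the cost of manual lifetime management.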
Niko On 4/15/12 5:51 PM, Alexander Stavonin wrote: > I have a C function with void* argument wich will be passed to > callback and I want to pass Rust record as the argument. The question > is, how to allocate record on memory but not on the stack? I've tried > a lot of different ways, but with same result: in callback function > record data looked as already destroyed. > > example: > > ... > type data_rec = { > on_connect_cb: fn@(listner: *evconnlistener, sock: c_int) -> bool, > data: int, > data1: int > }; > > fn re_listener_new_bind(ev_base: *event_base, flags: [listner_flags], > addr: sockaddr, > on_connect: fn@(listner: *evconnlistener, sock: c_int) -> > bool) > -> *evconnlistener unsafe { > ... > // initialising callback: > > let callback = unsafe{ {on_connect_cb: on_connect, data: 10, > data1: -10} }; // will the record be allocated on the stack? > ret ev::evconnlistener_new_bind(ev_base, connect_callback, data, > res_flags, -1 as c_int, > ptr::addr_of(a), l as c_int); > } > > crust fn connect_callback(listner: *evconnlistener, sock: c_int, > sockaddr: *c_void, len: c_int, ptr: *c_void) unsafe { > > let data = ptr as *data_rec; > io::println(#fmt("%u, data: %d, data1 %d", ptr as uint, > (*data).data, (*data).data1)); > } > ... > > console output: > > 1088426520, data: 1088426784, data1 146814400 > > expected output: > 1088426520, data: 10, data1 -10 > > > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From banderson at mozilla.com Tue Apr 17 14:41:34 2012 From: banderson at mozilla.com (Brian Anderson) Date: Tue, 17 Apr 2012 14:41:34 -0700 Subject: [rust-dev] Keyword cleanup In-Reply-To: References: Message-ID: <4F8DE38E.90009@mozilla.com> On 04/17/2012 08:53 AM, Benjamin Striegel wrote: > There was an exchange on IRC last week about removing some keywords. 
It > seemed a bit too broad and open-ended to open an issue for it, but I > didn't want it to get lost entirely. > > pcwalton:we could get rid of "resource" and "be", maybe "while" and > "log", although "trait" will add another > pcwalton:maybe unifying import and use like we talked about > graydon:'as' can go if we pick up go's expr.(type) syntax > pcwalton:yes > pcwalton:I like that syntax > graydon:is 'tag' still listed? looks like in my docs. it should be > 'enum' now > pcwalton:"note" might be better done as a special kind of class > pcwalton:just RAII-based > pcwalton:notes sitting on your stack print themselves out during > unwinding, but hide otherwise > graydon:maybe. that's how it was done in monotone (where I copied the > feature from) > graydon:'block' is a dead keyword, you can remove that > graydon:as is 'obj' > graydon:'syntax' is likely to prove redundant, or might if we shift > around the order of evaluating attributes and/or make "activating a > compiler plugin" something you can do via attributes. > graydon:pcwalton: 'in' is also dead, I think. 'with' could probably be > turned into something symbolic w/o much work. 
> Based on this conversation pcwalton, niko and I filed a bunch of new papercut issues: * Remove `be` https://github.com/mozilla/rust/issues/2227 * Implement `assert` in the library https://github.com/mozilla/rust/issues/2228 * Rename `cont` to `next` https://github.com/mozilla/rust/issues/2229 * Remove `do` loops https://github.com/mozilla/rust/issues/2230 * Use go's casting syntax https://github.com/mozilla/rust/issues/2231 * Implement `fail` in the library https://github.com/mozilla/rust/issues/2232 * Remove `while` https://github.com/mozilla/rust/issues/2233 And here are some old ones that cover the same territory * Implement `log` in the library https://github.com/mozilla/rust/issues/554 * Implement `note` https://github.com/mozilla/rust/issues/415 -Brian From niko at alum.mit.edu Tue Apr 17 16:01:35 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 17 Apr 2012 16:01:35 -0700 Subject: [rust-dev] bikeshed on closure syntax Message-ID: <4F8DF64F.4060000@alum.mit.edu> Apologies in advance. An outbreak of Syntax Fever has struck the Mountain View offices today. One of the results was a total bikeshed proposal for a streamlined closure syntax. The idea is like this: 1. Closure expressions can be written `x -> expr` or `(x, y) -> expr`. This requires arbitrary lookahead to disambiguate from tuples. But it's relatively simple to write the code for it: you basically scan forward to find the matching parenthesis and then check the next token to see whether it is "->". 2. You may omit the last argument of a function that expects a closure using a syntax like the following: vec.iter: x { ... } The general structure is ": args blk". In BNF-like form, this looks like: Arguments := ID | "(" Argument0, ... ArgumentN ")" Argument := IrrefutablePattern Field := Expr "." ID Call := Expr "(" ... ")" TailCall := (Field | Call) ":" Arguments "{" ... "}" ClosureExpr := Arguments "->" Expr Here are some examples: spawn: { ... } for vec.each: x { ... 
} let xs = xs.filter(x -> x.isEven()); let ys = xs.map(x -> x * 2).filter(x -> x.isEven()); let ys = xs.map: x { ... some_thing_complex(x) }; In all cases we would infer the @ vs ~ vs & as we do today. I am not sure whether (x, y) -> x.isEven(y) is lightweight enough to replace _.isEven(_) in my heart, but it certainly looks better than {|x, y| x.isEven(y)}. Niko From graydon at mozilla.com Tue Apr 17 16:57:44 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 17 Apr 2012 16:57:44 -0700 Subject: [rust-dev] bikeshed on closure syntax In-Reply-To: <4F8DF64F.4060000@alum.mit.edu> References: <4F8DF64F.4060000@alum.mit.edu> Message-ID: <4F8E0378.1030503@mozilla.com> On 12-04-17 04:01 PM, Niko Matsakis wrote: > Apologies in advance. Ha! Apologies in return, then, as at least the first bit I'm kinda sour on. The rest I'm just confused over. > This requires arbitrary lookahead to disambiguate from tuples. This bit in particular. Really really don't want to cross the bridge to arbitrary lookahead in the grammar. > 2. You may omit the last argument of a function that expects a closure > using a syntax like the following: > > vec.iter: x { ... } Ok. I think I can see where you're going with this -- now that I'm reading it as a _replacement_ for the existing block syntax -- but looking it over I think I don't get it and/or prefer what we've got: - It doesn't seem any shorter: foo {|x| ... } and foo: x { ... } are equally long. - Earlier today (in meeting) we discussed adopting the pattern => expr form in alts. If you're going to have any kind of "pat arrow expr" form for lambdas, I think it should be the same arrow as used in alt's "pat arrow expr". - I can imagine a "\ pat => expr" or "fn: pat => expr" form, but in all honesty I find the "{|pat| expr}" form easier to read. Because of the braces. For two reasons: 1. the binder scope is structurally visible, beginning and ending. 2. 
they transition smoothly to multiline blocks when the expression inevitably grows more complex or line-wraps due to indentation. > Here are some examples: I'm writing "how it's written now" examples beneath, strictly for aesthetic sense. I find the current form has grown on me and I quite like it now. But it's also notable to me that the size doesn't change: > spawn: { ... } surely this has to be "spawn(): { ... }" > for vec.each: x { ... } for vec.each {|x| ... } > let xs = xs.filter(x -> x.isEven()); let xs = xs.filter {|x| x.isEven()}; > let ys = xs.map(x -> x * 2).filter(x -> x.isEven()); let ys = xs.map {|x| x * 2}.filter {|x| x.isEven()}; I guess I'm having a hard time seeing the motive. Is it a preference for parens over braces? We could probably support (|pat| expr) as a lambda just as well as {|pat| expr} -- parser can see the transition point to pattern grammar -- though it loses the "transitions to multi-line easily" aspect I mention in point #2 above.. -Graydon From stefan.plantikow at googlemail.com Tue Apr 17 17:21:03 2012 From: stefan.plantikow at googlemail.com (Stefan Plantikow) Date: Wed, 18 Apr 2012 02:21:03 +0200 Subject: [rust-dev] bikeshed on closure syntax In-Reply-To: <4F8DF64F.4060000@alum.mit.edu> References: <4F8DF64F.4060000@alum.mit.edu> Message-ID: <7A9E107006B34C3694B2B8AFFF829A10@googlemail.com> Hi, > > 2. You may omit the last argument of a function that expects a closure > using a syntax like the following: Independent from the lambda syntax (I like the ruby/smalltalkish block syntax that is in place now), is there a reason why this is limited to closures? It may be visually appealing to be able to pass other expressions as the last argument in a similar fashion. x.append() [1,2,3,4] print(f) "foo" -- Stefan. 
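The disambiguation rule from the original proposal (scan forward to the matching parenthesis, then check whether the next token is "->") can be sketched as follows. This is present-day Rust rather than the 2012 dialect, and the `Tok` type, the whitespace-splitting `lex` helper and `starts_lambda` are hypothetical names made up for illustration; none of this is the rustc parser.

```rust
/// A deliberately tiny token type, just enough to show the scan.
#[derive(Debug, Clone, PartialEq)]
pub enum Tok {
    LParen,
    RParen,
    Comma,
    Arrow,
    Ident(String),
    Other(String),
}

/// Toy lexer: tokens must be separated by whitespace.
pub fn lex(src: &str) -> Vec<Tok> {
    src.split_whitespace()
        .map(|w| match w {
            "(" => Tok::LParen,
            ")" => Tok::RParen,
            "," => Tok::Comma,
            "->" => Tok::Arrow,
            _ if w.chars().all(|c| c.is_alphanumeric() || c == '_') => {
                Tok::Ident(w.to_string())
            }
            _ => Tok::Other(w.to_string()),
        })
        .collect()
}

/// Returns true if `toks[start]` opens a parenthesized group whose matching
/// `)` is immediately followed by `->`.
pub fn starts_lambda(toks: &[Tok], start: usize) -> bool {
    if toks.get(start) != Some(&Tok::LParen) {
        return false;
    }
    let mut depth = 0usize;
    for (i, tok) in toks.iter().enumerate().skip(start) {
        match tok {
            Tok::LParen => depth += 1,
            Tok::RParen => {
                depth -= 1;
                if depth == 0 {
                    // Found the matching parenthesis: peek one token further.
                    return toks.get(i + 1) == Some(&Tok::Arrow);
                }
            }
            _ => {}
        }
    }
    false // unbalanced parentheses
}
```

The number of tokens inspected before a decision is made grows with the size of the parenthesized group, which is the "arbitrary lookahead" being objected to in the replies.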
From pwalton at mozilla.com Tue Apr 17 17:29:56 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Tue, 17 Apr 2012 17:29:56 -0700 Subject: [rust-dev] bikeshed on closure syntax In-Reply-To: <4F8E0378.1030503@mozilla.com> References: <4F8DF64F.4060000@alum.mit.edu> <4F8E0378.1030503@mozilla.com> Message-ID: <4F8E0B04.2030905@mozilla.com> On 4/17/12 4:57 PM, Graydon Hoare wrote: > On 12-04-17 04:01 PM, Niko Matsakis wrote: >> Apologies in advance. > > Ha! Apologies in return, then, as at least the first bit I'm kinda sour > on. The rest I'm just confused over. > >> This requires arbitrary lookahead to disambiguate from tuples. > > This bit in particular. Really really don't want to cross the bridge to > arbitrary lookahead in the grammar. Yes, I don't like this either. Note that these two are not problematic: "-> foo" (no lookahead) or "x -> x + 1" (one token lookahead). Only multi-arity functions require arbitrary lookahead: "(x, y) -> x + y". This could be fixed with some sort of sigil: "\(x, y) -> x + y" or bars: "|x, y| -> x + y". I kind of like the latter, although I'm told that JavaScript developers balked when TC39 proposed it. >> 2. You may omit the last argument of a function that expects a closure >> using a syntax like the following: >> >> vec.iter: x { ... } This is ambiguous. Is "{ spawn: { hello() } }" a call to a function "spawn" with a block argument or a record literal with "spawn" as the key? This can be repaired by requiring the parentheses: for [ 1, 2, 3 ].each(): x { ... } spawn(): { ... } Or it can be repaired by requiring some sort of prefix for record literals. > I guess I'm having a hard time seeing the motive. Is it a preference for > parens over braces? We could probably support (|pat| expr) as a lambda > just as well as {|pat| expr} -- parser can see the transition point to > pattern grammar -- though it loses the "transitions to multi-line > easily" aspect I mention in point #2 above.. 
To me (just IMHO) there are three main issues with the current syntax: (1) In curly-brace-structured languages, "{" generally ends the line. (2) "{||" reads like line noise. (3) Programmers don't seem to like Ruby/Smalltalk-like bars much in languages that aren't in the syntactic tradition of Smalltalk. I didn't realize this when I proposed it... that said, I do like the way "|x, y| x + y" looks. Anyway, just a little more shameless bikeshedding... Patrick From niko at alum.mit.edu Tue Apr 17 18:46:21 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 17 Apr 2012 18:46:21 -0700 Subject: [rust-dev] bikeshed on closure syntax In-Reply-To: <7A9E107006B34C3694B2B8AFFF829A10@googlemail.com> References: <4F8DF64F.4060000@alum.mit.edu> <7A9E107006B34C3694B2B8AFFF829A10@googlemail.com> Message-ID: <4F8E1CED.1040505@alum.mit.edu> We considered this. It seemed to invite abuse---or at least provide for a lot of choice. Niko On 4/17/12 5:21 PM, Stefan Plantikow wrote: > > Hi, >> 2. You may omit the last argument of a function that expects a closure >> using a syntax like the following: > > Independent from the lambda syntax (I like the ruby/smalltalkish block syntax that is in place now), is there a reason why this is limited to closures? > It may be visually appealing to be able to pass other expressions as the last argument in a similar fashion. > > x.append() [1,2,3,4] > print(f) "foo" > > > > -- Stefan. 
> _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From niko at alum.mit.edu Tue Apr 17 19:05:47 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 17 Apr 2012 19:05:47 -0700 Subject: [rust-dev] bikeshed on closure syntax In-Reply-To: <4F8E0378.1030503@mozilla.com> References: <4F8DF64F.4060000@alum.mit.edu> <4F8E0378.1030503@mozilla.com> Message-ID: <4F8E217B.6070805@alum.mit.edu> I think the motivation was was just to see if we could find something that seemed "prettier". Obviously a subjective thing. Your objections seem quite reasonable, to be honest. Personally speaking I find the current form decent, but not an optimum. I find it quite heavyweight for small expressions, but to be honest I am not sure that the proposal below is a big improvement in that regard. I'm still mulling over my old underscore-as-syntactic-sugar-for-closures, trying to think if there is nice way to bring that in. For everyone's productivity, we should probably pick an arbitrary Rust release whereupon pointless bikesheds will no longer be entertained. =) Niko On 4/17/12 4:57 PM, Graydon Hoare wrote: > On 12-04-17 04:01 PM, Niko Matsakis wrote: >> Apologies in advance. > Ha! Apologies in return, then, as at least the first bit I'm kinda sour > on. The rest I'm just confused over. > >> This requires arbitrary lookahead to disambiguate from tuples. > This bit in particular. Really really don't want to cross the bridge to > arbitrary lookahead in the grammar. > >> 2. You may omit the last argument of a function that expects a closure >> using a syntax like the following: >> >> vec.iter: x { ... } > Ok. I think I can see where you're going with this -- now that I'm > reading it as a _replacement_ for the existing block syntax -- but > looking it over I think I don't get it and/or prefer what we've got: > > - It doesn't seem any shorter: > > foo {|x| ... } and > foo: x { ... 
} > > are equally long. > > - Earlier today (in meeting) we discussed adopting the pattern => > expr form in alts. If you're going to have any kind of "pat > arrow expr" form for lambdas, I think it should be the same > arrow as used in alt's "pat arrow expr". > > - I can imagine a "\ pat => expr" or "fn: pat => expr" form, but > in all honesty I find the "{|pat| expr}" form easier to read. > Because of the braces. For two reasons: > > 1. the binder scope is structurally visible, beginning and ending. > > 2. they transition smoothly to multiline blocks when the expression > inevitably grows more complex or line-wraps due to indentation. > >> Here are some examples: > I'm writing "how it's written now" examples beneath, strictly for > aesthetic sense. I find the current form has grown on me and I quite > like it now. But it's also notable to me that the size doesn't change: > >> spawn: { ... } > surely this has to be "spawn(): { ... }" > >> for vec.each: x { ... } > for vec.each {|x| ... } > >> let xs = xs.filter(x -> x.isEven()); > let xs = xs.filter {|x| x.isEven()}; > >> let ys = xs.map(x -> x * 2).filter(x -> x.isEven()); > let ys = xs.map {|x| x * 2}.filter {|x| x.isEven()}; > > I guess I'm having a hard time seeing the motive. Is it a preference for > parens over braces? We could probably support (|pat| expr) as a lambda > just as well as {|pat| expr} -- parser can see the transition point to > pattern grammar -- though it loses the "transitions to multi-line > easily" aspect I mention in point #2 above.. > > -Graydon > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From a.stavonin at gmail.com Tue Apr 17 19:24:18 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Wed, 18 Apr 2012 11:24:18 +0900 Subject: [rust-dev] How to allocate record on memory? Message-ID: I need unmanaged, C-compatibe structure on heap which will not be autodeleted in any cases. 
I don't *really* understand what you are trying to do, but I think you > have two choices: > > (1) A call to libc::malloc, like you showed in your later mail. > > (2) Allocate the type as a @T and then use ptr::addr_of(*x) to get an > unsafe ptr from that. Then you are responsible for keeping a live > reference to the @T so that we don't collect it. > > > Niko -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko at alum.mit.edu Tue Apr 17 19:42:29 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 17 Apr 2012 19:42:29 -0700 Subject: [rust-dev] How to allocate record on memory? In-Reply-To: References: Message-ID: <4F8E2A15.6080207@alum.mit.edu> On 4/17/12 7:24 PM, Alexander Stavonin wrote: > I need unmanaged, C-compatibe structure on heap which will not be > autodeleted in any cases. Then calling malloc seems like what you want. Niko From pwalton at mozilla.com Tue Apr 17 19:46:39 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Tue, 17 Apr 2012 19:46:39 -0700 Subject: [rust-dev] bikeshed on closure syntax In-Reply-To: <4F8E0B04.2030905@mozilla.com> References: <4F8DF64F.4060000@alum.mit.edu> <4F8E0378.1030503@mozilla.com> <4F8E0B04.2030905@mozilla.com> Message-ID: <4F8E2B0F.6060106@mozilla.com> Here's a little before-and-after with some of the syntax and semantic changes discussed (snippet from Sebastian Sylvan's raytracer [1] and modified slightly): --- Before --- #[inline(always)] fn get_rand_env() -> rand_env { let rng = rand::rng(); let disk_samples = vec::from_fn(513u) { |_x| // compute random position on light disk let r_sqrt = f32::sqrt(rng.next_float() as f32); let theta = rng.next_float() as f32 * 2f32 * f32::consts::pi; (r_sqrt * theta.cos(), r_sqrt * theta.sin()) } let mut hemicos_samples = []; for uint::range(0u, NUM_GI_SAMPLES_SQRT) { |x| for uint::range(0u, NUM_GI_SAMPLES_SQRT) { |y| let (u, v) = ((x as f32 + rng.next_float() as f32) / NUM_GI_SAMPLES_SQRT as f32, (y as f32 + rng.next_float() as 
f32) / NUM_GI_SAMPLES_SQRT as f32); hemicos_samples.push(cosine_hemisphere_sample(u, v)); } } { rng: rng, floats: vec::from_fn(513u, { |_x| rng.next_float() as f32 }), disk_samples: disk_samples, hemicos_samples: hemicos_samples } } --- After --- #[inline(always)] fn get_rand_env() -> rand_env { let rng = rand::rng(); let disk_samples = vec::from_fn(513): x { // compute random position on light disk let r_sqrt = rng.next_float().(f32).sqrt(); let theta = rng.next_float().(f32) * 2.0 * f32::consts::pi; (r_sqrt * theta.cos(), r_sqrt * theta.sin()); } let mut hemicos_samples = []/~; for uint::range(0, NUM_GI_SAMPLES_SQRT): x { for uint::range(0, NUM_GI_SAMPLES_SQRT): y { let (u, v) = ((x.(f32) + rng.next_float().(f32)) / NUM_GI_SAMPLES_SQRT.(f32), (y.(f32) + rng.next_float().(f32)) / NUM_GI_SAMPLES_SQRT.(f32)); hemicos_samples.push(cosine_hemisphere_sample(u, v)); } } { rng: rng, floats: vec::from_fn(513, _ -> rng.next_float().(f32)), disk_samples: disk_samples, hemicos_samples: hemicos_samples }; } --- Patrick [1]: https://github.com/brson/rustray/blob/master/raytracer.rs From qwertie256 at gmail.com Tue Apr 17 22:03:20 2012 From: qwertie256 at gmail.com (David Piepgrass) Date: Tue, 17 Apr 2012 23:03:20 -0600 Subject: [rust-dev] bikeshed on closure syntax Message-ID: > > > This requires arbitrary lookahead to disambiguate from tuples. > > This bit in particular. Really really don't want to cross the bridge to > arbitrary lookahead in the grammar. > Pardon me, but I'm not convinced that there is a problem in lambdas like (x, y) -> (x + y). By analogy, you can realize that ((x * y) + z, q) is a tuple instead of a simple parenthesized expression when you reach the comma -- you don't need to look ahead for a comma in advance. So why not treat (x, y) as a tuple until you reach the "->" and then reinterpret the contents at that point? This works as long as the syntax of a lambda argument list is a subset of the tuple syntax, anyway. 
If that's not the case, parsing gets messier, though I'm sure arbitrary lookahead is not the only possible implementation.

-- 
- David
http://loyc-etc.blogspot.com

From pwalton at mozilla.com Tue Apr 17 22:06:46 2012
From: pwalton at mozilla.com (Patrick Walton)
Date: Tue, 17 Apr 2012 22:06:46 -0700
Subject: [rust-dev] bikeshed on closure syntax
In-Reply-To:
References:
Message-ID: <4F8E4BE6.2000306@mozilla.com>

On 04/17/2012 10:03 PM, David Piepgrass wrote:
> > This requires arbitrary lookahead to disambiguate from tuples.
>
> This bit in particular. Really really don't want to cross the bridge to
> arbitrary lookahead in the grammar.
>
>
> Pardon me, but I'm not convinced that there is a problem in lambdas
> like (x, y) -> (x + y). By analogy, you can realize that ((x * y) + z,
> q) is a tuple instead of a simple parenthesized expression when you
> reach the comma -- you don't need to look ahead for a comma in advance.
> So why not treat (x, y) as a tuple until you reach the "->" and then
> reinterpret the contents at that point? This works as long as the syntax
> of a lambda argument list is a subset of the tuple syntax, anyway. If
> that's not the case, parsing gets messier, though I'm sure arbitrary
> lookahead is not the only possible implementation.

Patterns are not a subset of the expression grammar. For example, ":" has meaning in a pattern (type test), but not in an expression.

Patrick

From grahame at angrygoats.net Tue Apr 17 22:35:36 2012
From: grahame at angrygoats.net (Grahame Bowland)
Date: Wed, 18 Apr 2012 13:35:36 +0800
Subject: [rust-dev] bikeshed on closure syntax
In-Reply-To: <4F8E2B0F.6060106@mozilla.com>
References: <4F8DF64F.4060000@alum.mit.edu> <4F8E0378.1030503@mozilla.com> <4F8E0B04.2030905@mozilla.com> <4F8E2B0F.6060106@mozilla.com>
Message-ID:

Hi Patrick

Thanks for the visual example. I prefer the old { |x| } style.
It might look slightly ugly but glancing upwards from a piece of code I find it much easier to spot that a given pair of braces are a block closure. I guess || sticks out like a pair of thumbs, which is useful. Cheers Grahame On 18 April 2012 10:46, Patrick Walton wrote: > Here's a little before-and-after with some of the syntax and semantic > changes discussed (snippet from Sebastian Sylvan's raytracer [1] and > modified slightly): > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.striegel at gmail.com Wed Apr 18 06:01:54 2012 From: ben.striegel at gmail.com (Benjamin Striegel) Date: Wed, 18 Apr 2012 09:01:54 -0400 Subject: [rust-dev] bikeshed on closure syntax In-Reply-To: References: <4F8DF64F.4060000@alum.mit.edu> <4F8E0378.1030503@mozilla.com> <4F8E0B04.2030905@mozilla.com> <4F8E2B0F.6060106@mozilla.com> Message-ID: Count me alongside the fans of the current style. The explicit braces are nice (if Rust is going to use braces, it shouldn't be ashamed about it!), and the bars make it easy to visually identify closures. Plus, the notation is instantly familiar to anyone who's seen Ruby. On Wed, Apr 18, 2012 at 1:35 AM, Grahame Bowland wrote: > Hi Patrick > > Thanks for the visual example. I prefer the old { |x| } style. It might > look slightly ugly but glancing upwards from a piece of code I find it much > easier to spot that a given pair of braces are a block closure. I guess || > sticks out like a pair of thumbs, which is useful. > > Cheers > Grahame > > > On 18 April 2012 10:46, Patrick Walton wrote: > >> Here's a little before-and-after with some of the syntax and semantic >> changes discussed (snippet from Sebastian Sylvan's raytracer [1] and >> modified slightly): >> > > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From niko at alum.mit.edu Wed Apr 18 06:46:01 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 18 Apr 2012 06:46:01 -0700 Subject: [rust-dev] bikeshed on closure syntax In-Reply-To: References: <4F8DF64F.4060000@alum.mit.edu> <4F8E0378.1030503@mozilla.com> <4F8E0B04.2030905@mozilla.com> <4F8E2B0F.6060106@mozilla.com> Message-ID: <4F8EC599.80400@alum.mit.edu> On 4/17/12 10:35 PM, Grahame Bowland wrote: > Hi Patrick > > Thanks for the visual example. I prefer the old { |x| } style. It > might look slightly ugly but glancing upwards from a piece of code I > find it much easier to spot that a given pair of braces are a block > closure. I guess || sticks out like a pair of thumbs, which is useful. I have to admit, even I prefer the "|x|" to ": x {" when I see it laid out like that. They are much easier to spot. Niko From niko at alum.mit.edu Wed Apr 18 06:48:01 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 18 Apr 2012 06:48:01 -0700 Subject: [rust-dev] bikeshed on closure syntax In-Reply-To: References: Message-ID: <4F8EC611.1090100@alum.mit.edu> On 4/17/12 10:03 PM, David Piepgrass wrote: > > > This requires arbitrary lookahead to disambiguate from tuples. > > This bit in particular. Really really don't want to cross the > bridge to > arbitrary lookahead in the grammar. > > > Pardon me, but I'm not convinced that there is a problem in lambdas > like (x, y) -> (x + y). By analogy, you can realize that ((x * y) + z, > q) is a tuple instead of a simple parenthesized expression when you > reach the comma -- you don't need to look ahead for a comma in > advance. So why not treat (x, y) as a tuple until you reach the "->" > and then reinterpret the contents at that point? This works as long as > the syntax of a lambda argument list is a subset of the tuple syntax, > anyway. If that's not the case, parsing gets messier, though I'm sure > arbitrary lookahead is not be the only possible implementation. 
You are correct that, in theory, we could parse, but we wouldn't be able to build AST nodes until we know definitively whether it's a lambda or not. Niko From pwalton at mozilla.com Wed Apr 18 08:14:01 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Wed, 18 Apr 2012 08:14:01 -0700 Subject: [rust-dev] bikeshed on closure syntax In-Reply-To: <4F8EC599.80400@alum.mit.edu> References: <4F8DF64F.4060000@alum.mit.edu> <4F8E0378.1030503@mozilla.com> <4F8E0B04.2030905@mozilla.com> <4F8E2B0F.6060106@mozilla.com> <4F8EC599.80400@alum.mit.edu> Message-ID: <4F8EDA39.1010803@mozilla.com> On 04/18/2012 06:46 AM, Niko Matsakis wrote: > On 4/17/12 10:35 PM, Grahame Bowland wrote: >> Hi Patrick >> >> Thanks for the visual example. I prefer the old { |x| } style. It >> might look slightly ugly but glancing upwards from a piece of code I >> find it much easier to spot that a given pair of braces are a block >> closure. I guess || sticks out like a pair of thumbs, which is useful. > > I have to admit, even I prefer the "|x|" to ": x {" when I see it laid > out like that. They are much easier to spot. Yeah, I have to agree. If we want a minimal delta on the current syntax that addresses issues (1) and (2) (and I'm not insisting that we address them, mind you), how about this? Change the current function call syntax to: Call ::== Primary '(' Args ')' (':' BlockLambda (',' BlockLambda)*)? And change BlockLambda to: BlockLambda ::== '->' Expr | '|' InferredArg (',' InferredArg)* '|' Expr It establishes a fairly simple set of rules: 1. Bars indicate a closure. 2. A thin arrow indicates a zero-argument thunk. 3. In any function call, trailing closure arguments can be pulled out and placed after a colon. If this is done, the semicolon statement separator after the call can be omitted. 
So we would have: #[inline(always)] fn get_rand_env() -> rand_env { let rng = rand::rng(); let disk_samples = vec::from_fn(513): |x| { // compute random position on light disk let r_sqrt = rng.next_float().(f32).sqrt(); let theta = rng.next_float().(f32) * 2.0 * f32::consts::pi; (r_sqrt * theta.cos(), r_sqrt * theta.sin()); } let mut hemicos_samples = []/~; for uint::range(0, NUM_GI_SAMPLES_SQRT): |x| { for uint::range(0, NUM_GI_SAMPLES_SQRT): |y| { let (u, v) = ((x.(f32) + rng.next_float().(f32)) / NUM_GI_SAMPLES_SQRT.(f32), (y.(f32) + rng.next_float().(f32)) / NUM_GI_SAMPLES_SQRT.(f32)); hemicos_samples.push(cosine_hemisphere_sample(u, v)); } } { rng: rng, floats: vec::from_fn(513, |_| rng.next_float().(f32)), disk_samples: disk_samples, hemicos_samples: hemicos_samples }; } Patrick From banderson at mozilla.com Wed Apr 18 11:31:20 2012 From: banderson at mozilla.com (Brian Anderson) Date: Wed, 18 Apr 2012 11:31:20 -0700 Subject: [rust-dev] bikeshed on closure syntax In-Reply-To: <4F8EC599.80400@alum.mit.edu> References: <4F8DF64F.4060000@alum.mit.edu> <4F8E0378.1030503@mozilla.com> <4F8E0B04.2030905@mozilla.com> <4F8E2B0F.6060106@mozilla.com> <4F8EC599.80400@alum.mit.edu> Message-ID: <4F8F0878.2090803@mozilla.com> On 04/18/2012 06:46 AM, Niko Matsakis wrote: > On 4/17/12 10:35 PM, Grahame Bowland wrote: >> Hi Patrick >> >> Thanks for the visual example. I prefer the old { |x| } style. It >> might look slightly ugly but glancing upwards from a piece of code I >> find it much easier to spot that a given pair of braces are a block >> closure. I guess || sticks out like a pair of thumbs, which is useful. > > I have to admit, even I prefer the "|x|" to ": x {" when I see it laid > out like that. They are much easier to spot. > Agreed. Would still be nice to have a lightweight version for one-liners. 
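For comparison only: present-day Rust writes closures with bars and no surrounding braces, which covers the "lightweight one-liner" case, and allows a braced body when the closure grows. The sketch below is modern Rust, not any of the 2012 syntaxes proposed in this thread, and the function names are hypothetical.

```rust
/// One-liner closures: bars around the argument, then a bare expression.
pub fn evens_doubled(xs: &[i32]) -> Vec<i32> {
    xs.iter()
        .copied()
        .filter(|x| x % 2 == 0)
        .map(|x| x * 2)
        .collect()
}

/// The same bars with a braced body once the closure needs several statements.
pub fn sum_of_squares(xs: &[i32]) -> i32 {
    let mut total = 0;
    xs.iter().for_each(|x| {
        let sq = x * x;
        total += sq;
    });
    total
}

/// A zero-argument closure (a thunk) is written with empty bars.
pub fn call_thunk() -> i32 {
    let thunk = || 42;
    thunk()
}
```

The bars stay visually distinctive, which was the property several posters said they valued, while the one-expression form avoids the "{||" sequence that was criticised.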
From graydon at mozilla.com Wed Apr 18 11:54:25 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Wed, 18 Apr 2012 11:54:25 -0700 Subject: [rust-dev] strings, slices and nulls Message-ID: <4F8F0DE1.3030607@mozilla.com> Hi, Our current strings always have a trailing null. This fact is pretty much solely for interop with C. It's convenient that when you grab a pointer to the buffer storing the string, you get something null-terminated that you can pass to C. We accomplish this by setting the fill field of a string to 1 longer than the number of bytes we're given, and writing a null to the str[fill] index. There are a couple places this is visible in the language, and a couple new places where it'll surface with fixed-size and slice strings: existing: - the index operator [] lets you index past the str::len length. That is, x[str::len(x)] == 0 as uint, even though the same thing fails as a bounds-overrun on a vec. - If you _cast_ to a vec (rather than asking for a copy) you get a vec that includes a trailing 0 byte. new: - "hello"/5 is a "fixed size" string of str::len 5, but it costs 6 bytes of storage. And the type, say a str/5 in a structure, will eat 6 bytes of contiguous storage. - "hello"/& makes a slice, a (*u8,uint) pair. The uint field is length, but it is also 6u, not 5u. That is, a slice always points one byte beyond the "part that the user wants". This is to support the idea of taking a slice from the middle of a string and passing it to C: the library function that produces a *c_char will have to look at the len'th byte, check for null, and make a temporary copy if the slice doesn't "end in null" already. (You only notice this if you manually unpack a slice to tuple form and inspect it. asking for str::slice_len(s) will return 5 as with any other "non-raw-pointers" view of a string) Here are some possible paths forward: 1. Keep everything as-described here. It's perfect! 2. Fix [] on strings to fault on s[str::len(s)], like vec. 3. 
Remove the null termination stuff altogether. Make all strings (fixed-size, slice, unique, shared) work exactly like vecs in terms of length, and _always_ make temporary copies that we manually null terminate before passing to C. My current thinking is #2 here. Fix the indexing operator to relate to observable "length" the same way vec does, but otherwise try to "preserve the illusion" that most strings can pass through to C "for cheap", without making a copy. Only slices-to-the-middle-of-strings need copies. Which should not be most slices. Other opinions? -Graydon From banderson at mozilla.com Wed Apr 18 12:17:53 2012 From: banderson at mozilla.com (Brian Anderson) Date: Wed, 18 Apr 2012 12:17:53 -0700 Subject: [rust-dev] strings, slices and nulls In-Reply-To: <4F8F0DE1.3030607@mozilla.com> References: <4F8F0DE1.3030607@mozilla.com> Message-ID: <4F8F1361.7070002@mozilla.com> On 04/18/2012 11:54 AM, Graydon Hoare wrote: > Hi, > > Our current strings always have a trailing null. This fact is pretty > much solely for interop with C. It's convenient that when you grab a > pointer to the buffer storing the string, you get something > null-terminated that you can pass to C. > > We accomplish this by setting the fill field of a string to 1 longer > than the number of bytes we're given, and writing a null to the > str[fill] index. > > There are a couple places this is visible in the language, and a couple > new places where it'll surface with fixed-size and slice strings: > > existing: > > - the index operator [] lets you index past the str::len > length. That is, x[str::len(x)] == 0 as uint, even though > the same thing fails as a bounds-overrun on a vec. This sounds like a bug. I've never encountered this before. > > - If you _cast_ to a vec (rather than asking for a copy) you > get a vec that includes a trailing 0 byte. This we rely on extensively, but hopefully most places that makes use of this fact do it via str::as_bytes. 
> > new: > > - "hello"/5 is a "fixed size" string of str::len 5, but it costs > 6 bytes of storage. And the type, say a str/5 in a structure, > will eat 6 bytes of contiguous storage. > > - "hello"/& makes a slice, a (*u8,uint) pair. The uint field is > length, but it is also 6u, not 5u. That is, a slice always points > one byte beyond the "part that the user wants". This is to > support the idea of taking a slice from the middle of a string and > passing it to C: the library function that produces a *c_char will > have to look at the len'th byte, check for null, and make a > temporary copy if the slice doesn't "end in null" already. > > (You only notice this if you manually unpack a slice to tuple form > and inspect it. asking for str::slice_len(s) will return 5 as with > any other "non-raw-pointers" view of a string) > > Here are some possible paths forward: > > 1. Keep everything as-described here. It's perfect! > > 2. Fix [] on strings to fault on s[str::len(s)], like vec. Yes please. > > 3. Remove the null termination stuff altogether. Make all strings > (fixed-size, slice, unique, shared) work exactly like vecs in terms > of length, and _always_ make temporary copies that we manually null > terminate before passing to C. > > My current thinking is #2 here. Fix the indexing operator to relate to > observable "length" the same way vec does, but otherwise try to > "preserve the illusion" that most strings can pass through to C "for > cheap", without making a copy. Only slices-to-the-middle-of-strings need > copies. Which should not be most slices. I agree with this. If we change str::as_bytes to copy as needed then most code should not be affected. The documentation for as_bytes/buf/c_str should reflect this though because it's sneaky. I would kind of like for Rust strings not to expose the fact that they are null-terminated (or just not be null terminated) but it seems unavoidable. 
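The "look at the len'th byte, check for null, copy only if needed" idea for passing slices to C can be sketched like this. It is written in present-day Rust, not the 2012 dialect; `c_view` and the use of `Cow` are assumptions made for illustration, and the sketch ignores interior null bytes entirely.

```rust
use std::borrow::Cow;

/// `storage` models a string buffer that keeps one trailing 0 byte after its
/// contents, as described in the thread. `start..start + len` is the part the
/// caller wants to hand to C. The result always ends in a 0 byte.
pub fn c_view(storage: &[u8], start: usize, len: usize) -> Cow<'_, [u8]> {
    let end = start + len;
    if storage.get(end) == Some(&0) {
        // The byte just past the slice is already a null: borrow in place,
        // terminator included, with no allocation.
        Cow::Borrowed(&storage[start..=end])
    } else {
        // The slice stops in the middle of the string: make a temporary
        // copy and terminate it by hand.
        let mut copy = storage[start..end].to_vec();
        copy.push(0);
        Cow::Owned(copy)
    }
}
```

With a buffer holding "hello world" plus its terminator, a slice covering "world" is borrowed as-is, while a slice covering "hello" has to be copied, which is the "only slices to the middle of strings need copies" behaviour argued for above.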
From niko at alum.mit.edu Wed Apr 18 12:43:40 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Wed, 18 Apr 2012 12:43:40 -0700
Subject: [rust-dev] strings, slices and nulls
In-Reply-To: <4F8F1361.7070002@mozilla.com>
References: <4F8F0DE1.3030607@mozilla.com> <4F8F1361.7070002@mozilla.com>
Message-ID: <4F8F196C.4090109@alum.mit.edu>

On 4/18/12 12:17 PM, Brian Anderson wrote:
> I would kind of like for Rust strings not to expose the fact that they
> are null-terminated (or just not be null terminated) but it seems
> unavoidable.

+1

I can see the practicality of null-terminating, but I would really like
to preserve the flexibility to reverse that decision in the future.
Forcing people to go through str::as_bytes() to observe the null
terminating seems like a win -- except that I'd probably call it as_cstr()
or something like that, to emphasize its null-terminated-ness.

Niko

From niko at alum.mit.edu Wed Apr 18 13:09:33 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Wed, 18 Apr 2012 13:09:33 -0700
Subject: [rust-dev] bikeshed on closure syntax
In-Reply-To: <4F8F0878.2090803@mozilla.com>
References: <4F8DF64F.4060000@alum.mit.edu> <4F8E0378.1030503@mozilla.com> <4F8E0B04.2030905@mozilla.com> <4F8E2B0F.6060106@mozilla.com> <4F8EC599.80400@alum.mit.edu> <4F8F0878.2090803@mozilla.com>
Message-ID: <4F8F1F7D.2020504@alum.mit.edu>

On 4/18/12 11:31 AM, Brian Anderson wrote:
> Agreed. Would still be nice to have a lightweight version for one-liners.

I still favor something based on underscores as sugar for a closure. I
ought to dust off my old proposal---I think now that I don't want to use
it as the basis for the iter library, it might be simplified and perhaps
made more appealing.
Niko

From graydon at mozilla.com Wed Apr 18 13:37:51 2012
From: graydon at mozilla.com (Graydon Hoare)
Date: Wed, 18 Apr 2012 13:37:51 -0700
Subject: [rust-dev] strings, slices and nulls
In-Reply-To: <4F8F196C.4090109@alum.mit.edu>
References: <4F8F0DE1.3030607@mozilla.com> <4F8F1361.7070002@mozilla.com> <4F8F196C.4090109@alum.mit.edu>
Message-ID: <4F8F261F.90107@mozilla.com>

On 12-04-18 12:43 PM, Niko Matsakis wrote:

> I can see the practicality of null-terminating, but I would really like
> to preserve the flexibility to reverse that decision in the future.
> Forcing people to go through str::as_bytes() to observe the null
> terminating seems like a win -- except that I'd probably call it as_cstr()
> or something like that, to emphasize its null-terminated-ness.

Right, but .. you can still observe it by allocating a str/5 in a record
and noticing that it takes 6 bytes (i.e. with sizeof). Tolerable?

-Graydon

From banderson at mozilla.com Wed Apr 18 13:51:10 2012
From: banderson at mozilla.com (Brian Anderson)
Date: Wed, 18 Apr 2012 13:51:10 -0700
Subject: [rust-dev] strings, slices and nulls
In-Reply-To: <4F8F196C.4090109@alum.mit.edu>
References: <4F8F0DE1.3030607@mozilla.com> <4F8F1361.7070002@mozilla.com> <4F8F196C.4090109@alum.mit.edu>
Message-ID: <4F8F293E.1050601@mozilla.com>

On 04/18/2012 12:43 PM, Niko Matsakis wrote:
> On 4/18/12 12:17 PM, Brian Anderson wrote:
>> I would kind of like for Rust strings not to expose the fact that they
>> are null-terminated (or just not be null terminated) but it seems
>> unavoidable.
>
> +1
>
> I can see the practicality of null-terminating, but I would really like
> to preserve the flexibility to reverse that decision in the future.
> Forcing people to go through str::as_bytes() to observe the null
> terminating seems like a win -- except that I'd probably call it as_cstr()
> or something like that, to emphasize its null-terminated-ness.
>
>

I was considering that the behavior of as_bytes is a little surprising
because it exposes the null-terminator. We also already have as_c_str,
which is built off of as_bytes. (Also we have as_buf). Ideally only
as_c_str exposes the null-terminator, but if that's the case then
there's no reason for as_bytes or as_buf and you should just use
str::bytes to get a fresh, non-null-terminated copy. They are all just
doing an unsafe cast so exposing the implementation is unavoidable.

From banderson at mozilla.com Wed Apr 18 13:54:58 2012
From: banderson at mozilla.com (Brian Anderson)
Date: Wed, 18 Apr 2012 13:54:58 -0700
Subject: [rust-dev] strings, slices and nulls
In-Reply-To: <4F8F293E.1050601@mozilla.com>
References: <4F8F0DE1.3030607@mozilla.com> <4F8F1361.7070002@mozilla.com> <4F8F196C.4090109@alum.mit.edu> <4F8F293E.1050601@mozilla.com>
Message-ID: <4F8F2A22.10605@mozilla.com>

On 04/18/2012 01:51 PM, Brian Anderson wrote:
> On 04/18/2012 12:43 PM, Niko Matsakis wrote:
>> On 4/18/12 12:17 PM, Brian Anderson wrote:
>>> I would kind of like for Rust strings not to expose the fact that they
>>> are null-terminated (or just not be null terminated) but it seems
>>> unavoidable.
>>
>> +1
>>
>> I can see the practicality of null-terminating, but I would really like
>> to preserve the flexibility to reverse that decision in the future.
>> Forcing people to go through str::as_bytes() to observe the null
>> terminating seems like a win -- except that I'd probably call it as_cstr()
>> or something like that, to emphasize its null-terminated-ness.
>>
>>
>
> I was considering that the behavior of as_bytes is a little surprising
> because it exposes the null-terminator. We also already have as_c_str,
> which is built off of as_bytes. (Also we have as_buf). Ideally only
> as_c_str exposes the null-terminator, but if that's the case then
> there's no reason for as_bytes or as_buf and you should just use
> str::bytes to get a fresh, non-null-terminated copy.
They are all just
> doing an unsafe cast so exposing the implementation is unavoidable.

As I think about this further, with slices we probably don't need as_buf
or as_bytes since I think we will be able to create a string slice from
a vec slice and vice versa.

From banderson at mozilla.com Wed Apr 18 13:58:56 2012
From: banderson at mozilla.com (Brian Anderson)
Date: Wed, 18 Apr 2012 13:58:56 -0700
Subject: [rust-dev] strings, slices and nulls
In-Reply-To: <4F8F261F.90107@mozilla.com>
References: <4F8F0DE1.3030607@mozilla.com> <4F8F1361.7070002@mozilla.com> <4F8F196C.4090109@alum.mit.edu> <4F8F261F.90107@mozilla.com>
Message-ID: <4F8F2B10.5070304@mozilla.com>

On 04/18/2012 01:37 PM, Graydon Hoare wrote:
> On 12-04-18 12:43 PM, Niko Matsakis wrote:
>
>> I can see the practicality of null-terminating, but I would really like
>> to preserve the flexibility to reverse that decision in the future.
>> Forcing people to go through str::as_bytes() to observe the null
>> terminating seems like a win -- except that I'd probably call it as_cstr()
>> or something like that, to emphasize its null-terminated-ness.
>
> Right, but .. you can still observe it by allocating a str/5 in a record
> and noticing that it takes 6 bytes (i.e. with sizeof). Tolerable?

I think this is acceptable. Who's to say what that other byte is really
doing there?
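The "str/5 costs 6 bytes" observation can be sketched in present-day Rust, which post-dates this thread. The type `FixedStr5` and the function `storage_size` below are hypothetical stand-ins, not anything from the 2012 compiler or library; the sketch only assumes that a fixed-size string is stored as its payload bytes plus one trailing null.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the thread's `str/5`:
/// five payload bytes followed by one trailing null byte.
#[repr(C)]
pub struct FixedStr5 {
    bytes: [u8; 6],
}

impl FixedStr5 {
    /// Copies the five payload bytes; the sixth byte stays 0.
    pub fn new(payload: &[u8; 5]) -> FixedStr5 {
        let mut bytes = [0u8; 6];
        bytes[..5].copy_from_slice(payload);
        FixedStr5 { bytes }
    }

    /// Observable length, excluding the terminator.
    pub fn len(&self) -> usize {
        self.bytes.len() - 1
    }

    /// The user-visible bytes, without the terminator.
    pub fn as_bytes(&self) -> &[u8] {
        &self.bytes[..5]
    }
}

/// Storage cost in bytes, which is where the extra byte becomes visible.
pub fn storage_size() -> usize {
    size_of::<FixedStr5>()
}
```

With this layout, `len()` reports 5 while `storage_size()` reports 6, which is the only place the terminator shows through unless a caller goes looking at the raw bytes.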
From graydon at mozilla.com Wed Apr 18 13:59:28 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Wed, 18 Apr 2012 13:59:28 -0700 Subject: [rust-dev] strings, slices and nulls In-Reply-To: <4F8F2A22.10605@mozilla.com> References: <4F8F0DE1.3030607@mozilla.com> <4F8F1361.7070002@mozilla.com> <4F8F196C.4090109@alum.mit.edu> <4F8F293E.1050601@mozilla.com> <4F8F2A22.10605@mozilla.com> Message-ID: <4F8F2B30.8010200@mozilla.com> On 12-04-18 01:54 PM, Brian Anderson wrote: > As I think about this further, with slices we probably don't need as_buf > or as_bytes since I think we will be able to create a string slice from > a vec slice and vice versa. Not if string slices are, as I say, one byte "longer" than the area of interest. You can get a vec slice from a str slice but going the other way won't work (you have no way to know if the vec is even addressable one byte further than the slice). -Graydon From graydon at mozilla.com Wed Apr 18 14:04:36 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Wed, 18 Apr 2012 14:04:36 -0700 Subject: [rust-dev] strings, slices and nulls In-Reply-To: <4F8F2B10.5070304@mozilla.com> References: <4F8F0DE1.3030607@mozilla.com> <4F8F1361.7070002@mozilla.com> <4F8F196C.4090109@alum.mit.edu> <4F8F261F.90107@mozilla.com> <4F8F2B10.5070304@mozilla.com> Message-ID: <4F8F2C64.8030108@mozilla.com> On 12-04-18 01:58 PM, Brian Anderson wrote: > I think this is acceptable, Who's to say what that other byte is really > doing there? Heh. Rust strings: now with a "mystery byte"! 
(Not as interesting as the mystery bytes in ocaml strings, they have
something pretty clever going on in them but it's specific to their
storage manager:
http://caml.inria.fr/pub/ml-archives/caml-list/2002/08/e109df224ff0150b302033e2002dbf87.en.html )

From a.stavonin at gmail.com Wed Apr 18 16:26:20 2012
From: a.stavonin at gmail.com (Alexander Stavonin)
Date: Thu, 19 Apr 2012 08:26:20 +0900
Subject: [rust-dev] Functions overloading
Message-ID:

Stefan, I understood your idea but I have a problem with compilation.

fn_overloading.rs:31:34: 31:80 error: method `to_input` has an incompatible type: type parameter vs int
fn_overloading.rs:31 impl of to_input for int { fn to_input() -> input { ret val(self); } }
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Also, am I right that in this case we could not make overloading for
different types, for example for str and int?

Alexander.

> I was reading this again and thought that as a middle ground between type
> classes and enums, one could use type classes that wrap supported types
> into enum variants
> for easily switching over them.
>
> enum input { val(t), seq([t]) }
>
> iface to_input { fn to_input() -> input }
> impl of to_input for int { fn to_input() -> input { ret val(self); } }
> impl of to_input for [int] { fn to_input() -> input { ret seq(self); } }
>
> fn to_input(i: to_input) {
>   alt i.to_input() {
>     val(v) { ... }
>     seq(v) { ... }
>   }
> }
>
> This way the user does not have to know about the internal enums and the
> implementor can simply switch (especially useful with multiple arguments).
>
> I think it would be nice to have some syntax support for building ifaces
> like that, perhaps via macros.
>
> -- Stefan
>

From erick.tryzelaar at gmail.com Wed Apr 18 17:07:12 2012
From: erick.tryzelaar at gmail.com (Erick Tryzelaar)
Date: Wed, 18 Apr 2012 17:07:12 -0700
Subject: [rust-dev] strings, slices and nulls
In-Reply-To: <4F8F2A22.10605@mozilla.com>
References: <4F8F0DE1.3030607@mozilla.com> <4F8F1361.7070002@mozilla.com> <4F8F196C.4090109@alum.mit.edu> <4F8F293E.1050601@mozilla.com> <4F8F2A22.10605@mozilla.com>
Message-ID:

On Wed, Apr 18, 2012 at 1:54 PM, Brian Anderson wrote:
>
> As I think about this further, with slices we probably don't need as_buf or
> as_bytes since I think we will be able to create a string slice from a vec
> slice and vice versa.

That would be great. I always forget that str::as_bytes includes a
null, and it inevitably causes some obscure bug that takes me some
time to track down.

From jws at csse.unimelb.edu.au Wed Apr 18 17:07:36 2012
From: jws at csse.unimelb.edu.au (Jeff Schultz)
Date: Thu, 19 Apr 2012 10:07:36 +1000
Subject: [rust-dev] bikeshed on closure syntax
In-Reply-To: <4F8EDA39.1010803@mozilla.com>
References: <4F8DF64F.4060000@alum.mit.edu> <4F8E0378.1030503@mozilla.com> <4F8E0B04.2030905@mozilla.com> <4F8E2B0F.6060106@mozilla.com> <4F8EC599.80400@alum.mit.edu> <4F8EDA39.1010803@mozilla.com>
Message-ID: <20120419000736.GA3261@mulga.csse.unimelb.edu.au>

On Wed, Apr 18, 2012 at 08:14:01AM -0700, Patrick Walton wrote:
> And change BlockLambda to:
> BlockLambda ::== '->' Expr
>                | '|' InferredArg (',' InferredArg)* '|' Expr
> 1. Bars indicate a closure.
> 2. A thin arrow indicates a zero-argument thunk.

Any reason we can't just have an empty '||' instead of the '->'?

It's easier to type and makes it easier to find all closures.

> 3. In any function call, trailing closure arguments can be pulled out and
> placed after a colon. If this is done, the semicolon statement separator
> after the call can be omitted.

Why is the ':' needed?
Jeff Schultz From pwalton at mozilla.com Wed Apr 18 17:09:24 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Wed, 18 Apr 2012 17:09:24 -0700 Subject: [rust-dev] bikeshed on closure syntax In-Reply-To: <20120419000736.GA3261@mulga.csse.unimelb.edu.au> References: <4F8DF64F.4060000@alum.mit.edu> <4F8E0378.1030503@mozilla.com> <4F8E0B04.2030905@mozilla.com> <4F8E2B0F.6060106@mozilla.com> <4F8EC599.80400@alum.mit.edu> <4F8EDA39.1010803@mozilla.com> <20120419000736.GA3261@mulga.csse.unimelb.edu.au> Message-ID: <4F8F57B4.2080705@mozilla.com> On 4/18/12 5:07 PM, Jeff Schultz wrote: > Any reason we can't just have an empty '||' instead of the '->'? > > It's easier to type and makes it easier to find all closures. || looks a little like line noise to me, although I'm not wedded to the thin arrow. spawn(): -> { log("Hi!"); } vs. spawn(): || { log("Hi!"); } > > >> 3. In any function call, trailing closure arguments can be pulled out and >> placed after a colon. If this is done, the semicolon statement separator >> after the call can be omitted. > > Why is the ':' needed? To disambiguate a block from bitwise or. Patrick From niko at alum.mit.edu Wed Apr 18 17:10:43 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Wed, 18 Apr 2012 17:10:43 -0700 Subject: [rust-dev] Functions overloading In-Reply-To: References: Message-ID: <4F8F5803.5060705@alum.mit.edu> On 4/18/12 4:26 PM, Alexander Stavonin wrote: > Stefan, I understood you idea but I have problem with compilation. > > fn_overloading.rs:31:34: 31:80 error: method `to_input` has an > incompatible type: type parameter vs int > fn_overloading.rs:31 impl of > to_input for int { fn to_input() -> input { ret val(self); } } > > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ I think you wanted: impl of to_input for int { fn to_input() -> input { ret val(self); } } Actually, we should probably not allow free variables like that which are unbound in the iface type. 
I am not sure what problem it can cause but I have distant memory of
unsoundness that results from such things when combined with existential
types... have to go refresh my memory.

> Also, am I right that in this case we could not make overloading for
> different types, for example for str and int?

You could add to the enum and thus support overloading for as many types
as you like.

Niko

From steven099 at gmail.com Wed Apr 18 17:44:50 2012
From: steven099 at gmail.com (Steven Blenkinsop)
Date: Wed, 18 Apr 2012 20:44:50 -0400
Subject: [rust-dev] Functions overloading
In-Reply-To:
References:
Message-ID:

On Wednesday, April 18, 2012, Alexander Stavonin wrote:
> Stefan, I understood your idea but I have a problem with compilation.
>
> fn_overloading.rs:31:34: 31:80 error: method `to_input` has an
> incompatible type: type parameter vs int
> fn_overloading.rs:31 impl of to_input for int { fn to_input() ->
> input { ret val(self); } }
>
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Also, am I right that in this case we could not make overloading for
> different types, for example for str and int?
>
> Alexander.
>
>

I think you're looking for:

enum input {
  int(int),
  str(str)
}

iface to_input {
  fn to_input() -> input;
}

impl of to_input for int {
  fn to_input() -> input { ret int(self); }
}

impl of to_input for str {
  fn to_input() -> input { ret str(self); }
}

fn to_input(t: T) {
  alt t.to_input() {
    int(v) { io::println("int"); }
    str(v) { io::println("str"); }
  }
}

fn main() {
  to_input(5);
  to_input("hello")
}

From niko at alum.mit.edu Thu Apr 19 06:25:39 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Thu, 19 Apr 2012 06:25:39 -0700
Subject: [rust-dev] proposal for auto-unboxing and impls
Message-ID: <4F901253.6020402@alum.mit.edu>

As some of you have noticed, if you have an expression a.b, we currently
do the following:

1. Find the type of `a`, let's call it T_a
2. Auto-deref T_a to D_a
3. Search for a field b in the type D_a
4. Assuming none is found, search for an impl defining a method b on the
   type T_a

As you can see, we search for fields using the deref'd type D_a but
search for *methods* using the original type, T_a. This is plainly
inconsistent. It also leads to annoying things like `(*a).foo()`.
However, the reason we do that is also sensible: we want to be able to
define an impl on a type `@T`, for example.

I was thinking, though, that we could simply change the step where we
scan for matching impls to take both the original and dereferenced type
as input. For each impl, we can then examine the type the impl is
defined for and check either T_a or D_a as appropriate.

To decide which T_a or D_a to use, we examine the type T_f that the impl
is `for`. If `T_f` is dereferencable, we use T_a, otherwise we use D_a.

Examples:

    class C { ... }
    type R = { ... }
    enum kilometer = uint;

    impl of X for @uint { }     // use T_a
    impl of X for option { }    // use D_a
    impl of X for uint { }      // use D_a
    impl of X for C { }         // use D_a
    impl of X for kilometer { } // use T_a
    impl of X for R { }         // use D_a

How's this sound?

Niko

From marijnh at gmail.com Thu Apr 19 06:32:12 2012
From: marijnh at gmail.com (Marijn Haverbeke)
Date: Thu, 19 Apr 2012 15:32:12 +0200
Subject: [rust-dev] proposal for auto-unboxing and impls
In-Reply-To: <4F901253.6020402@alum.mit.edu>
References: <4F901253.6020402@alum.mit.edu>
Message-ID:

How about intermediate half-unboxed types? (If there's an impl for @X,
can you directly call its methods on @@X?) I was going to implement
this, but it somehow slipped through the cracks.
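As a point of comparison for the lookup question above, here is a minimal sketch of how method lookup through boxes behaves in present-day Rust, which post-dates this thread and does not implement the exact rule proposed here. The trait `Describe` and the three helper functions are hypothetical. The sketch only relies on the documented behaviour that a method call walks the receiver's dereference chain one step at a time and takes the first type that has a matching impl.

```rust
trait Describe {
    fn describe(&self) -> String;
}

// An impl on the unboxed type.
impl Describe for u32 {
    fn describe(&self) -> String {
        format!("plain {}", self)
    }
}

// A separate impl on the boxed type, analogous to an impl for `@uint`.
impl Describe for Box<u32> {
    fn describe(&self) -> String {
        format!("boxed {}", **self)
    }
}

/// Calling on a box finds the impl for the boxed type first.
fn on_box() -> String {
    let b: Box<u32> = Box::new(7);
    b.describe()
}

/// An explicit dereference selects the impl for the unboxed type.
fn on_deref() -> String {
    let b: Box<u32> = Box::new(7);
    (*b).describe()
}

/// A doubly boxed value has no impl of its own, so lookup unwraps one
/// level and stops at the impl for the singly boxed type.
fn on_double_box() -> String {
    let bb: Box<Box<u32>> = Box::new(Box::new(7));
    bb.describe()
}
```

In this model the "half-unboxed" case does resolve without an explicit `*`, at the cost of the step-by-step scan through intermediate types.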
From niko at alum.mit.edu Thu Apr 19 06:56:35 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Thu, 19 Apr 2012 06:56:35 -0700 Subject: [rust-dev] proposal for auto-unboxing and impls In-Reply-To: References: <4F901253.6020402@alum.mit.edu> Message-ID: <4F901993.9090904@alum.mit.edu> On 4/19/12 6:32 AM, Marijn Haverbeke wrote: > How about intermediate half-unboxed types? (If there's an impl for @X, > can you directly call its methods on @@X?) No. Under this proposal, you would have to do (*x).foo() in that case. I was trying to avoid looping through types, unboxing a step at a time and scanning for impls. That seemed to me to be very close to selecting the most specific type, something we have hitherto avoided. But I guess the loop doesn't have to stop. You could scan all the way through and at the end there should still be only one match. I'm not sure which I prefer. I guess I'd basically be happy either way. Niko From niko at alum.mit.edu Thu Apr 19 07:22:47 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Thu, 19 Apr 2012 07:22:47 -0700 Subject: [rust-dev] Syntax of vectors, slices, etc Message-ID: <4F901FB7.70805@alum.mit.edu> In general I love Graydon's proposal for strings and arrays, but I am not crazy about the notation. In particular I think []/@ and []/~ is not a good syntax for shared/unique vectors. It's not the slash, it's that I find it inconsistent. Generally speaking, a @ or ~ after the main type is a bound, and before it indicates the kind of the pointer. But here, it indicates the kind of pointer. And []/3 is not a pointer at all. In Graydon's proposal, there are three kinds of vector-like things: - Fixed-length arrays ([T]/3, T[3] at runtime) - Vectors ([T]/@, [T]/~, boxed>* or rust_vec* at runtime) - Slices ([T] or [T]/&, pair of T* and length) Of these, the notation for slices seems exactly right: it is short and the "/" suffix indicates a bound. 
In fact, I think maybe we should change fn@() to fn/@() and so forth, and
just have "/" be a trailing bound indicator.

That leaves fixed-length arrays and vectors to represent somehow. And
let's not forget strings, which just complicate everything.

So here is my overall proposal (best viewed in fixed width). The
comparison is between my proposal, Graydon's proposal, and an
English-language description. In some cases (such as ifaces), I have
also integrated work on the type system I would like to do in the future.

    New type        Old type        Descr.
    --------        --------        ------
    fn(S) -> T      fn(S)
    fn/@(S) -> T    fn@(S) -> T
    fn/~(S) -> T    fn~(S) -> T
    :N [T]          [T]/N           fixed-length array
    [N]T            [T]/N           fixed-length array
    :[T]            N/A             (see below)
    @:[T]           [T]/@           boxed vec
    ~:[T]           [T]/~           unique vec
    [T]             [T]             slice
    [T]/&r          [T]/&r          slice with expl. region
    Id              Id              enum/class/resource/iface
    Id/&r           Id&r            ...with expl. region bound
    Id/@            Id@             iface with @ bound
    Id/~            Id~             iface with ~ bound
    str             str             slice
    str/&r          str/&r          slice with expl. region
    :N str          str/N           fixed-length str
    :str            N/A             (see below)
    @:str           str/@           boxed str
    ~:str           str/~           unique str

Explanation and rationale:

- A trailing slash always indicates a bound, meaning that it limits the
  types contained "within" the affected type. Normally, the bound is a
  region. In the case of opaque types (like fn and ifaces), this bound
  can also be @ or ~.

- The types `:N [T]` and `:N str` correspond to `T[N]` and `u8[N+1]`
  respectively. That is, each is a "by-value" array. If we want to allow
  N to be an arbitrary (const) expression, we may need to write
  `:(expr) [T]`, since `str` is no longer a keyword.

- Now everything which is in fact a pointer into the task/exchange heaps
  is prefixed with a @/~.

- The pseudo-type `:[T]` is supposed to look like "an array with an
  unspecified length". It refers to a rust_vec (by-value). I say that it
  is a pseudo-type because you cannot write `:[T]` on its own. In fact,
  it is not even a type. You can only write `@:[T]` or `~:[T]`---we just
  use a bit of look-ahead.
The reason to keep `:[T]` from being a type is that it has unknown size. To support this safely with generic types, we'd need to add kinds. I would like to do this eventually so that we can declare records with an inline vector at the end, but it's not necessary now. I am not at all crazy about `:` prefix, I just couldn't come up with a better character. I wanted `#` for number, but (a) it's in use by macros and (b) it's kind of heavy. `*` (think: repeat) is used for unsafe ptrs. `^` is random. `+` (again, repeat) looks like an infix operator, not a prefix operator. Rejected ideas: My original plan was "N:[T]" which I think looks way better than ":N [T]", but I scrapped it because `N` might eventually be a const expression and we need some clue that it's coming in the parser. Another plan which I liked a lot was to have []T be slice, [N]T be constant length array, and [:]T or [.]T be unknown length array. I think this looks *great*, but there are two problems: First, I don't know how it extends to `str`. Second, the region bound, if any, is ambiguous, so you'd need parentheses to clear it up: []T/&r could be [](T/&r) or ([]T)/&r. But maybe that's ok as I don't expect explicit region bounds to appear very often at all. Thoughts? Niko From jruderman at gmail.com Thu Apr 19 07:25:24 2012 From: jruderman at gmail.com (Jesse Ruderman) Date: Thu, 19 Apr 2012 07:25:24 -0700 Subject: [rust-dev] strings, slices and nulls In-Reply-To: <4F8F0DE1.3030607@mozilla.com> References: <4F8F0DE1.3030607@mozilla.com> Message-ID: My preference is to remove null termination: * I'm guessing most strings aren't passed to C. (What are the most common C string calls in rustc?) * C functions that scan for null are inefficient, so they're even more likely to be replaced with Rust equivalents than other C functions. * Null termination is not sufficient for interop with C. You also have to ensure the strings don't contain null characters. 
(This is a common source of bugs in Firefox, since JavaScript strings and strings from the network can contain null characters.) And if null characters are present, what do you do? * Each C function has its own expectations about character encoding and allowed characters, so calls to C involve extra state-tracking or checks anyway. From graydon at mozilla.com Thu Apr 19 12:59:07 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Thu, 19 Apr 2012 12:59:07 -0700 Subject: [rust-dev] strings, slices and nulls In-Reply-To: References: <4F8F0DE1.3030607@mozilla.com> Message-ID: <4F906E8B.603@mozilla.com> On 12-04-19 07:25 AM, Jesse Ruderman wrote: > My preference is to remove null termination: > > * I'm guessing most strings aren't passed to C. (What are the most > common C string calls in rustc?) All the filesystem access stuff, at this point. In the future it's harder to say. > * C functions that scan for null are inefficient, so they're even more > likely to be replaced with Rust equivalents than other C functions. Hm, I think this is not a reasonable stance: $ find /usr/include/ -name \*.h \ | xargs cat \ | grep -c 'char\( *const\)\? *\*' 10488 There are a lot of C APIs that take strings. "Rewrite the world in rust" is going to take a long time. > * Null termination is not sufficient for interop with C. You also have > to ensure the strings don't contain null characters. (This is a common > source of bugs in Firefox, since JavaScript strings and strings from > the network can contain null characters.) And if null characters are > present, what do you do? I can see some cases where that might be a bug, but in general I think an embedded null just ... makes a string shorter, from C's perspective. It's the same as passing a short string. Of course if the C code requires some other kind of well-formedness condition in the prefix, you'd need to enforce that, but that condition presumably holds over shorter and longer strings alike. 
Most C APIs aren't written to take strings of a fixed size. > * Each C function has its own expectations about character encoding > and allowed characters, so calls to C involve extra state-tracking or > checks anyway. For APIs that take UTF-16, such as the win32 APIs, we already do the conversion before calling, yes. But for APIs that take "char *" they tend to be set up so they can accept UTF-8 input: they're either agnostic to the differences between ASCII and UTF-8 (as UTF-8 was designed to exploit) or else they can operate in UTF-8 mode via LC_CTYPE or such. Sure you need to either enforce that and/or re-encode when it's not true, but again, this is about opportunistic recoding-avoidance by careful choice of defaults, rather than a guarantee that we never need to recode. Sometimes users want an array of UCS4 as well, but it's not our default string representation. -Graydon From banderson at mozilla.com Thu Apr 19 13:33:32 2012 From: banderson at mozilla.com (Brian Anderson) Date: Thu, 19 Apr 2012 13:33:32 -0700 Subject: [rust-dev] strings, slices and nulls In-Reply-To: References: <4F8F0DE1.3030607@mozilla.com> Message-ID: <4F90769C.4000409@mozilla.com> On 04/19/2012 07:25 AM, Jesse Ruderman wrote: > My preference is to remove null termination: > > * I'm guessing most strings aren't passed to C. (What are the most > common C string calls in rustc?) In rustc it's all the calls to LLVM that require a things to be named. From brendan at mozilla.org Thu Apr 19 13:42:31 2012 From: brendan at mozilla.org (Brendan Eich) Date: Thu, 19 Apr 2012 13:42:31 -0700 Subject: [rust-dev] strings, slices and nulls In-Reply-To: <4F906E8B.603@mozilla.com> References: <4F8F0DE1.3030607@mozilla.com> <4F906E8B.603@mozilla.com> Message-ID: <4F9078B7.6060000@mozilla.org> Graydon Hoare wrote: >> * C functions that scan for null are inefficient, so they're even more >> > likely to be replaced with Rust equivalents than other C functions. 
> > Hm, I think this is not a reasonable stance: > > $ find/usr/include/ -name \*.h \ > | xargs cat \ > | grep -c 'char\( *const\)\? *\*' > 10488 > > There are a lot of C APIs that take strings. "Rewrite the world in rust" > is going to take a long time. Also guessing that C code is slow because of NUL-termination searches needs evidence. Over my ~30 years of C/Unix I've heard this but I've never seen such evidence. Maybe I missed it! /be From qwertie256 at gmail.com Fri Apr 20 18:20:59 2012 From: qwertie256 at gmail.com (David Piepgrass) Date: Fri, 20 Apr 2012 19:20:59 -0600 Subject: [rust-dev] strings, slices and nulls Message-ID: > > My preference is to remove null termination: > > * I'm guessing most strings aren't passed to C. (What are the most > common C string calls in rustc?) > It isn't just C, I'm afraid. It's pretty much every other language and OS, too, because every language and OS is designed for C as the lowest-common-denominator. * Each C function has its own expectations about character encoding and allowed characters, so calls to C involve extra state-tracking or > checks anyway. > I can't really agree. Some C functions have 'expectations about character encoding and allowed characters' but many don't, and those that do don't necessarily require 'extra state-tracking or checks' at run-time. It would be best if rust could hide the 'implementation detail' of null-termination, but for the foreseeable future, the potential interop performance advantages probably outweigh a byte of wasted storage here and there (after all, if one byte were important, we wouldn't want to use a whole 4 bytes to hold the string length.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From me at kevincantu.org Fri Apr 20 20:33:05 2012 From: me at kevincantu.org (Kevin Cantu) Date: Fri, 20 Apr 2012 20:33:05 -0700 Subject: [rust-dev] strings, slices and nulls In-Reply-To: References: Message-ID: The cost isn't an extra byte of storage. 
The cost is that every slice
will have to be a copy+append.

Both D and Go have special slice types to store pointers and offsets
to the underlying arrays/strings, so slice operations can avoid that
hit, IIRC.

-- Kevin Cantu

On Fri, Apr 20, 2012 at 6:20 PM, David Piepgrass wrote:
>> My preference is to remove null termination:
>>
>> * I'm guessing most strings aren't passed to C. (What are the most
>> common C string calls in rustc?)
>
>
> It isn't just C, I'm afraid. It's pretty much every other language and OS,
> too, because every language and OS is designed for C as the
> lowest-common-denominator.
>
>> * Each C function has its own expectations about character encoding
>>
>> and allowed characters, so calls to C involve extra state-tracking or
>> checks anyway.
>
>
> I can't really agree. Some C functions have 'expectations about character
> encoding and allowed characters' but many don't, and those that do don't
> necessarily require 'extra state-tracking or checks' at run-time.
>
> It would be best if rust could hide the 'implementation detail' of
> null-termination, but for the foreseeable future, the potential interop
> performance advantages probably outweigh a byte of wasted storage here and
> there (after all, if one byte were important, we wouldn't want to use a
> whole 4 bytes to hold the string length.)
>

From a.stavonin at gmail.com Sun Apr 22 17:19:04 2012
From: a.stavonin at gmail.com (Alexander Stavonin)
Date: Mon, 23 Apr 2012 09:19:04 +0900
Subject: [rust-dev] String constans
Message-ID:

Do you have a plan to add string constants? Or are you proposing some
other way of creating string constants?
string_const.rs:3:18: 3:24 error: string constants are not supported
string_const.rs:3 const NAME: str = "sting";
                                    ^~~~~~
error: aborting due to previous errors

From arcata at gmail.com Sun Apr 22 18:02:53 2012
From: arcata at gmail.com (Joe Groff)
Date: Sun, 22 Apr 2012 18:02:53 -0700
Subject: [rust-dev] strings, slices and nulls
In-Reply-To:
References:
Message-ID: <3513163521000981743@unknownmsgid>

On Apr 20, 2012, at 8:33 PM, Kevin Cantu wrote:

> The cost isn't an extra byte of storage. The cost is that every slice
> will have to be a copy+append.
>
> Both D and Go have special slice types to store pointers and offsets
> to the underlying arrays/strings, so slice operations can avoid that
> hit, IIRC.

It would definitely be more forward-thinking to use slices as the
primary native mechanism for passing around strings and
memory-contiguous sequences in general. Regions should help make
data-sharing slices safe to use. Since C calls already require that
arguments be copied to the FFI context, does making null-terminated
copies of strings add that much additional overhead? Copying is pretty
cheap on modern platforms.

-Joe

From gasche.dylc at gmail.com Sat Apr 21 10:28:52 2012
From: gasche.dylc at gmail.com (gasche)
Date: Sat, 21 Apr 2012 19:28:52 +0200
Subject: [rust-dev] 2 possible simplifications: reverse application, records as arguments
Message-ID:

I've been wondering about a problem tightly related to named
parameters: named enum constructor arguments. SML has had the ability
to define algebraic datatypes as sum of (named) records, and I think
that is something that is missing in current rust. The code
antipattern that cries for it is pattern matching for an enumeration
name with a tedious number of "_" following it:

rust/src % grep "_, _" -R .
| wc -l 547 See also the following code example: alt it.node { ast::item_impl(tps, _, _, _) { if ns == ns_type { ret lookup_in_ty_params(e, name, tps); } } ast::item_enum(_, tps, _) | ast::item_ty(_, tps, _) { if ns == ns_type { ret lookup_in_ty_params(e, name, tps); } } ... } The position of the 'tps' variable among a variable number of ignored parameters is fragile, increase maintainance costs (if you add a parameter to some constructor in an enumeration, a lot of code changes are required just to say that you ignore it most of the time), and creates redundancy. This could be solved by having enum constructors with named arguments, and conversely a syntax for enum patterns matching named argument. Being able to write something like: alt it.node { | ast::item_impl(tps:tps) | ast::item_enum(tps:tps) | ast::item_ty(tps:tps) { if ns == ns_type { ret lookup_in_ty_params(e, name, tps); } } would be a win (then you can say that "tps:" is a shorthand for "tps:tps" or what not). Note: OCaml has made the choice that (K _) is a pattern that matches the constructor K no matter what its arity is (including none or several parameters). While that doesn't help with the 'tps' example above, it would still simplify a lot of places and does not require named constructor arguments. But I think that's an inferior solution. > - Our records are order-sensitive, to be C-structure compatible. > Keyword arguments are usually argued-for (as you are doing here) as > a way to make function arguments order-insensitive. We'd need to > decide whether we wanted order to matter or not. > > - Argument-passing tends to work best when you can pass stuff in > registers, not require the arguments to be spilled to memory and > addressable as a contiguous slab. So we'd want to be careful not to > require the "arguments structure" to be _actually_ addressable as a > structure at any point. 
Rather, calling f(x) would be some kind of > semantic sugar for `f(x.a, x.b, x.c)`, making separate copies for > the sake of passing. The clean solution is to make a distinction between the data structure you call "records" here and the structure that is denoted by the parameter-building syntax. Haskell has a semantic distinction between "boxed tuples" (the usual thing) and "unboxed tuples" (primitive, less exposed, can't be used to instantiate polymorphic types). Similarly you would have "data records" (contiguous, C-compatible, etc.) and, say, "native records", that would be a different type with less flexibility for the user and more flexibility for the implementer: not adressable, possibly non-contiguous memory layout, a field order decided by the compiler, etc. You could then explain {x:1, y:2} as syntaxic sugard for, say, record(x:1, y:2), where `record` is a polytypic primitive that builds a "data record" from some "native record". (You could do the same for tuples and handle mixed named/unnamed parameters by adopting the convention existing in some languages, for example Oz, that a tuple (x,y,z) is just a record of numeric fields (0:x, 1:y, 2:z)) > - We'd have to decide the rules surrounding keyword mismatches, partial > provision of keywords, argument permutation, and function types. Re. function types: if you consider those parameter-passing structures as "first class" (which does necessarily mean that they are convenient to use, for example if they're not adressable they will be less flexible), the natural choice is to have a family of types for them. Those types could come with restrictions and an unspoken kinding discipline, so that for example they cannot be used to instantiate type variables, maybe cannot be nested, etc. That's the main reason why I think one should think of such structures as real structures rather than syntactic sugar; it forces you to have a proper design for types and other aspects. 
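
The named-constructor-argument idea from earlier in this message can be
sketched as follows. This is only an illustration written in present-day
Rust syntax, not the 2012 dialect discussed in this thread, and the
`Item` enum, its fields, and `ty_params` are hypothetical stand-ins for
the `ast::item_*` variants rather than real compiler types. Each variant
names its fields, and the pattern picks out `tps` by name while `..`
ignores the rest, so adding a field to a variant does not touch this
match.

```rust
// Hypothetical, simplified stand-ins for the AST item variants mentioned
// in the message; the field names are invented for this sketch.
#[allow(dead_code)]
#[derive(Debug)]
enum Item {
    Impl { tps: Vec<String>, methods: Vec<String> },
    Enum { variants: Vec<String>, tps: Vec<String> },
    Ty { tps: Vec<String>, rhs: String },
    Const { value: i64 },
}

/// Returns the type parameters of an item, if the item kind has any.
fn ty_params(item: &Item) -> Option<&[String]> {
    match item {
        // `tps` is selected by name; `..` ignores every other field,
        // whatever their number or position.
        Item::Impl { tps, .. } | Item::Enum { tps, .. } | Item::Ty { tps, .. } => {
            Some(tps.as_slice())
        }
        Item::Const { .. } => None,
    }
}

fn main() {
    let item = Item::Enum {
        variants: vec!["some".to_string(), "none".to_string()],
        tps: vec!["T".to_string()],
    };
    println!("{:?}", ty_params(&item));
}
```

The three alternatives share one arm because each binds `tps` with the
same type, which mirrors the single-arm form proposed above.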

> > another thing is that instead of passing arguments, you pass just one
> > (anonymous) record. the record is the arguments.
>
> We actually had quite an argument with one of the Felix authors about
> this. This was not, back then, a terribly realistic option during that
> conversation (argument modes were still the primary way we were doing
> safe references, which are not first class types). But it's conceivably
> something we could look into if we get the argument-passing logic down
> to "always by-value and use region pointers for safe references" (which
> is where we're going). There remain some hitches:
>
> - Our syntax isn't quite compatible with the idea; records have to be
>   brace-parenthesized and tuples given round parentheses. They'd need
>   reform, and the syntax is already pretty crowded.
>
> - We'd have to decide the rules surrounding keyword mismatches, partial
>   provision of keywords, argument permutation, and function types.
>
> - Our records are order-sensitive, to be C-structure compatible.
>   Keyword arguments are usually argued-for (as you are doing here) as
>   a way to make function arguments order-insensitive. We'd need to
>   decide whether we wanted order to matter or not.
>
> - Argument-passing tends to work best when you can pass stuff in
>   registers, not require the arguments to be spilled to memory and
>   addressable as a contiguous slab. So we'd want to be careful not to
>   require the "arguments structure" to be _actually_ addressable as a
>   structure at any point. Rather, calling f(x) would be some kind of
>   semantic sugar for `f(x.a, x.b, x.c)`, making separate copies for
>   the sake of passing.
>
> So .. I can see a possibility here, but it'd be a complicated set of
> issues to work through. Would need some serious design work. I've never
> been intrinsically opposed to it, just felt that we were constrained by
> other choices in the language. At the time, argument modes were
> completely prohibitive; now it might be possible, but is still not
> entirely straightforward.
>
> -Graydon

From ben.striegel at gmail.com Mon Apr 23 08:15:26 2012
From: ben.striegel at gmail.com (Benjamin Striegel)
Date: Mon, 23 Apr 2012 11:15:26 -0400
Subject: [rust-dev] 2 possible simplifications: reverse application, records as arguments
In-Reply-To:
References:
Message-ID:

> Note: OCaml has made the choice that (K _) is a pattern that matches
> the constructor K no matter what its arity is (including none or
> several parameters). While that doesn't help with the 'tps' example
> above, it would still simplify a lot of places and does not require
> named constructor arguments. But I think that's an inferior solution.

I'm not familiar with OCaml, but I think that something along these lines
was added just last week:

https://github.com/mozilla/rust/commit/37b054973083ed4201a2ba73be6bdd39daf13cf6

An example:

    enum pattern { tabby, tortoiseshell, calico }
    enum breed { beagle, rottweiler, pug }
    type name = str;
    enum ear_kind { lop, upright }
    enum animal { cat(pattern), dog(breed), rabbit(name, ear_kind), tiger }

    fn noise(a: animal) -> option<str> {
        // (*) matches the constructor whatever its number of arguments
        alt a {
          cat(*)    { some("meow") }
          dog(*)    { some("woof") }
          rabbit(*) { none }
          tiger(*)  { some("roar") }
        }
    }

    fn main() {
        assert noise(cat(tabby)) == some("meow");
        assert noise(dog(pug)) == some("woof");
        assert noise(rabbit("Hilbert", upright)) == none;
        assert noise(tiger) == some("roar");
    }

From pwalton at mozilla.com Mon Apr 23 08:34:48 2012
From: pwalton at mozilla.com (Patrick Walton)
Date: Mon, 23 Apr 2012 08:34:48 -0700
Subject: [rust-dev] 2 possible simplifications: reverse application, records as arguments
In-Reply-To:
References:
Message-ID: <4F957698.1050706@mozilla.com>

On 04/21/2012 10:28 AM, gasche wrote:
> I've been wondering about a problem tightly related to named
> Re. function types: if you consider those parameter-passing structures
> as "first class" (which does not necessarily mean that they are
> convenient to use, for example if they're not addressable they will be
> less flexible), the natural choice is to have a family of types for
> them. Those types could come with restrictions and an unspoken kinding
> discipline, so that for example they cannot be used to instantiate
> type variables, maybe cannot be nested, etc.
>
> That's the main reason why I think one should think of such structures
> as real structures rather than syntactic sugar; it forces you to have
> a proper design for types and other aspects.

There are several issues with going to tupled arguments:

* We'd still need formal parameters for C interoperability. At the ABI
  level, a single-argument function applied to a 3-ary tuple is very
  different from a function with 3 arguments.

* It prohibits us from having optional parameters in the future (at
  least, not without some very hairy typechecking).

* I don't know how to make the block loop syntax work.

Patrick

From arcata at gmail.com Mon Apr 23 10:45:37 2012
From: arcata at gmail.com (Joe Groff)
Date: Mon, 23 Apr 2012 10:45:37 -0700
Subject: [rust-dev] 2 possible simplifications: reverse application, records as arguments
In-Reply-To: <4F957698.1050706@mozilla.com>
References: <4F957698.1050706@mozilla.com>
Message-ID: <-1039930478732374265@unknownmsgid>

On Apr 23, 2012, at 8:34 AM, Patrick Walton wrote:
>
> * We'd still need formal parameters for C interoperability. At the ABI
> level, a single-argument function applied to a 3-ary tuple is very
> different from a function with 3 arguments.

Could the rust calling convention behave similarly to the x86-64
convention, where small composite types are destructured when passed by
value?

> * It prohibits us from having optional parameters in the future (at
> least, not without some very hairy type checking).

Why would optional parameters need to be any more complicated than
either C++-style default values or option-typed slots?

-Joe

From graydon at mozilla.com Mon Apr 23 12:04:12 2012
From: graydon at mozilla.com (Graydon Hoare)
Date: Mon, 23 Apr 2012 12:04:12 -0700
Subject: [rust-dev] strings, slices and nulls
In-Reply-To: <3513163521000981743@unknownmsgid>
References: <3513163521000981743@unknownmsgid>
Message-ID: <4F95A7AC.6030204@mozilla.com>

On 12-04-22 06:02 PM, Joe Groff wrote:

> It would definitely be more forward-thinking to use slices as the
> primary native mechanism for passing around strings and
> memory-contiguous sequences in general.

That is the current plan. I'm in the process of implementing it.

https://github.com/mozilla/rust/issues/2112

> Regions should help make
> data-sharing slices safe to use. Since C calls already require that
> arguments be copied to the FFI context, does making null-terminated
> copies of strings add that much additional overhead? Copying is pretty
> cheap on modern platforms.

Copying a pointer (as is required by the FFI) and copying a string are
different operations. The latter is what we're trying to avoid, and it's
costly, yes. Particularly when in inner loops, as C calls often are.

-Graydon

From graydon at mozilla.com Mon Apr 23 12:10:24 2012
From: graydon at mozilla.com (Graydon Hoare)
Date: Mon, 23 Apr 2012 12:10:24 -0700
Subject: [rust-dev] String constans
In-Reply-To:
References:
Message-ID: <4F95A920.2070100@mozilla.com>

On 12-04-22 05:19 PM, Alexander Stavonin wrote:
> Do you have a plan to add string constants? Or are you proposing some
> other way of creating string constants?
>
> string_const.rs:3:18: 3:24 error: string constants are not supported
> string_const.rs:3 const NAME: str = "sting";
>                                     ^~~~~~

This derives from a basic inability to express long-lived pointers to
constants in the current system of memory ownership; we lost that
ability with the last vector-system rewrite and haven't yet regained it.

It's one of the (many) reasons we're making region pointers first class
and introducing slice types (that use region pointers). The new work on
vectors will permit constant strings, slices, and general vectors (and
other things built out of them).

-Graydon

From rick.richardson at gmail.com Mon Apr 23 12:58:33 2012
From: rick.richardson at gmail.com (Rick Richardson)
Date: Mon, 23 Apr 2012 15:58:33 -0400
Subject: [rust-dev] Syntax of vectors, slices, etc
In-Reply-To: <4F901FB7.70805@alum.mit.edu>
References: <4F901FB7.70805@alum.mit.edu>
Message-ID:

I like your suggestion about having / always specify bound. However,
syntactically decoupling the size of the type from the type seems odd.
Also, I'm not sure how one would express multidimensional arrays.
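
To make the question about sizes and multidimensional arrays concrete,
here is a small sketch in present-day Rust syntax. It is not any of the
notations proposed in this thread, and the `trace` function is a
hypothetical example. It shows a fixed-length array whose length is part
of its type, nested to form a 4x4 matrix.

```rust
/// Sums the main diagonal of a 4x4 matrix. Both dimensions are part of
/// the parameter's type, so a 5x5 matrix is a different, incompatible type.
fn trace(m: &[[i32; 4]; 4]) -> i32 {
    (0..4).map(|i| m[i][i]).sum()
}

fn main() {
    // An array of four arrays of four i32 values, all initialised to zero.
    let mut identity = [[0i32; 4]; 4];
    for i in 0..4 {
        identity[i][i] = 1;
    }
    println!("trace = {}", trace(&identity));

    // The following would be rejected at compile time, because
    // [[i32; 5]; 5] is not [[i32; 4]; 4]:
    // let five = [[0i32; 5]; 5];
    // trace(&five);
}
```

The nesting reads inside-out: the element type comes first and the
length second, at each level.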

My line of thinking is this:
If you have an array of four Ints, the type of the array is 4 Ints.
It's incompatible with an array that is 5 ints without an adaptor.
Since [] indicates a vector, and a vector in Rust has both a length
and a type, Why not make the size of the vector the first of two
parameters in the []?

e.g.     [N,T]

or, for unspecified length:  [_,T]  (possibly sugared to [T])

For multidimensional arrays:

    let matrix : [4, [4, Int]]

Then, not that this is a big deal, but no backtracking required, and
it's a bit more in line with existing paradigms.

P.S. If I had to order my preferences, I would still prefer the :N [T]
over [T]/N because I think [T]/N could be rather misleading to the
uninitiated. Although I think I would end up expressing it as :

    let foo : N[T]

But I also express my ptrs in C as foo* varname instead of
foo *varname, because I think the latter is nonsensical.

On Thu, Apr 19, 2012 at 10:22 AM, Niko Matsakis wrote:
> In general I love Graydon's proposal for strings and arrays, but I am not
> crazy about the notation. In particular I think []/@ and []/~ is not a good
> syntax for shared/unique vectors. It's not the slash, it's that I find it
> inconsistent. Generally speaking, a @ or ~ after the main type is a bound,
> and before it indicates the kind of the pointer. But here, it indicates the
> kind of pointer. And []/3 is not a pointer at all.
>
> In Graydon's proposal, there are three kinds of vector-like things:
>
> - Fixed-length arrays ([T]/3, T[3] at runtime)
> - Vectors ([T]/@, [T]/~, boxed>* or rust_vec* at runtime)
> - Slices ([T] or [T]/&, pair of T* and length)
>
> Of these, the notation for slices seems exactly right: it is short and the
> "/" suffix indicates a bound. In fact, I think maybe we should change fn@()
> to fn/@() and so forth, and just have "/" be a trailing bound indicator.
> That leaves fixed-length arrays and vectors to represent somehow. And
> let's not forget strings, which just complicate everything.
>
> So here is my overall proposal (best viewed in fixed width). The comparison
> is between my proposal, Graydon's proposal, and an English-language
> description. In some cases (such as ifaces), I have also integrated work on
> the type system I would like to do in the future.
>
>    New type      Old type     Descr.
>    --------      --------     ------
>    fn(S) -> T    fn(S)
>    fn/@(S) -> T  fn@(S) -> T
>    fn/~(S) -> T  fn~(S) -> T
>
>    :N [T]        [T]/N        fixed-length array
>    [N]T          [T]/N        fixed-length array
>
>    :[T]          N/A          (see below)
>    @:[T]         [T]/@        boxed vec
>    ~:[T]         [T]/~        unique vec
>
>    [T]           [T]          slice
>    [T]/&r        [T]/&r       slice with expl. region
>
>    Id            Id           enum/class/resource/iface
>    Id/&r         Id&r         ...with expl. region bound
>
>    Id/@          Id@          iface with @ bound
>    Id/~          Id~          iface with ~ bound
>
>    str           str          slice
>    str/&r        str/&r       slice with expl. region
>
>    :N str        str/N        fixed-length str
>
>    :str          N/A          (see below)
>    @:str         str/@        boxed str
>    ~:str         str/~        unique str
>
> Explanation and rationale:
>
> - A trailing slash always indicates a bound, meaning that it limits the
> types contained "within" the affected type. Normally, the bound is a
> region. In the case of opaque types (like fn and ifaces), this bound can
> also be @ or ~.
>
> - The type `:N [T]` and `:N str`, corresponds to `T[N]` or `u8[N+1]`
> respectively. That is, it is a "by-value" array. If we want to allow N to
> be an arbitrary (const) expression, we may need to write `:(expr) [T]`,
> since `str` is no longer a keyword.
>
> - Now everything which is in fact a pointer into the task/exchange heaps is
> prefixed with a @/~.
>
> - The pseudo-type `:[T]` is supposed to look like "an array with an
> unspecified length". It refers to a rust_vec (by-value). I say that it
> is a pseudo-type because you cannot write `:[T]` on its own. In fact, it is
> not even a type. You can only write `@:[T]` or `~:[T]`---we just use a bit
> of look-ahead.
>
> The reason to keep `:[T]` from being a type is that it has unknown size. To
> support this safely with generic types, we'd need to add kinds. I would like
> to do this eventually so that we can declare records with an inline vector
> at the end, but it's not necessary now.
>
> I am not at all crazy about `:` prefix, I just couldn't come up with a
> better character. I wanted `#` for number, but (a) it's in use by macros
> and (b) it's kind of heavy. `*` (think: repeat) is used for unsafe ptrs.
> `^` is random. `+` (again, repeat) looks like an infix operator, not a
> prefix operator.
>
> Rejected ideas:
>
> My original plan was "N:[T]" which I think looks way better than ":N [T]",
> but I scrapped it because `N` might eventually be a const expression and we
> need some clue that it's coming in the parser.
>
> Another plan which I liked a lot was to have []T be slice, [N]T be constant
> length array, and [:]T or [.]T be unknown length array. I think this looks
> *great*, but there are two problems: First, I don't know how it extends to
> `str`. Second, the region bound, if any, is ambiguous, so you'd need
> parentheses to clear it up: []T/&r could be [](T/&r) or ([]T)/&r. But maybe
> that's ok as I don't expect explicit region bounds to appear very often at
> all.
>
> Thoughts?
>
> Niko

From graydon at mozilla.com Mon Apr 23 15:10:34 2012
From: graydon at mozilla.com (Graydon Hoare)
Date: Mon, 23 Apr 2012 15:10:34 -0700
Subject: [rust-dev] Syntax of vectors, slices, etc
In-Reply-To:
References: <4F901FB7.70805@alum.mit.edu>
Message-ID: <4F95D35A.1070302@mozilla.com>

On 12-04-23 12:58 PM, Rick Richardson wrote:

> My line of thinking is this:
> If you have an array of four Ints, the type of the array is 4 Ints.
> It's incompatible with an array that is 5 ints without an adaptor.
> Since [] indicates a vector, and a vector in Rust has both a length
> and a type, Why not make the size of the vector the first of two
> parameters in the []?
>
> e.g.     [N,T]
>
> or, for unspecified length:  [_,T]  (possibly sugared to [T])

All versions of this syntax that work with the bound-inside-the-brackets
need to have something to say for the 'str' type, which has no brackets.
That's the problem.

-Graydon

From rick.richardson at gmail.com Mon Apr 23 15:21:24 2012
From: rick.richardson at gmail.com (Rick Richardson)
Date: Mon, 23 Apr 2012 18:21:24 -0400
Subject: [rust-dev] Syntax of vectors, slices, etc
In-Reply-To: <4F95D35A.1070302@mozilla.com>
References: <4F901FB7.70805@alum.mit.edu> <4F95D35A.1070302@mozilla.com>
Message-ID:

Should a str be subject to the same syntax? Because it will have
different semantics.
A UTF-8 string has differently sized characters, so you can't treat
it as a vector, and there are obvious and currently discussed
interoperability issues regarding the null terminator.

It should definitely get a slice syntax, since that will likely be the
most common operation on a string.
I would also like to support a notion of static sizing, but with UTF-8
even that's not always possible.

I reckon a string should be an object, and potentially be convertible
to/from a vector. But trying to treat it like a vector will just lead
to surprising semantics for some. But that's just my opinion.

From graydon at mozilla.com Mon Apr 23 16:12:06 2012
From: graydon at mozilla.com (Graydon Hoare)
Date: Mon, 23 Apr 2012 16:12:06 -0700
Subject: [rust-dev] Syntax of vectors, slices, etc
In-Reply-To:
References: <4F901FB7.70805@alum.mit.edu> <4F95D35A.1070302@mozilla.com>
Message-ID: <4F95E1C6.60907@mozilla.com>

On 12-04-23 03:21 PM, Rick Richardson wrote:
> Should a str be subject to the same syntax? Because it will have
> different semantics.

I think the semantics are almost identical to vectors. Save the null issue.

> A UTF-8 string has differently sized characters, so you can't treat
> it as a vector, there are obvious and currently discussed
> interoperability issues regarding the null terminator.

You certainly can treat it as a (constrained) vector. It's just a byte
vector, not a character vector. A character vector is [char]. Indexing
into a str gives you a byte. You can iterate through it in terms of
bytes or characters (or words, lines, paragraphs, etc.)
or convert to characters or utf-16 code units or any other encoding of
unicode.

> It should definitely get a slice syntax, since that will likely be the
> most common operation on a string.
> I would also like to support a notion of static sizing, but with UTF-8
> even that's not always possible.

Yes it is. The static size is a byte count. The compiler knows that size
statically and can complain if you get it wrong (or fill it in if you
leave it as a wildcard, as I expect most will do.)

> I reckon a string should be an object, and potentially be convertible
> to/from a vector. But trying to treat it like a vector will just lead
> to surprising semantics for some. But that's just my opinion.

The set of use-cases to address simultaneously is large and covers much
of the same ground as vectors:

- Sometimes people want to be able to send strings between tasks.
- Sometimes people want a shared, refcounted string.
- Sometimes people want strings of arbitrary length.
- Sometimes people want an interior string that's part of another
  structure (with necessarily-fixed size), copied by value.
- String literals exist and ought to turn into something useful,
  something in static memory when possible, dynamic otherwise.
- Passing strings and substrings should be cheap, cheaper than
  refcount-adjustment even (when possible).

As far as I know, our class system can't really satisfy these
requirements. This is why they're a built-in type (just like vectors).
To make the class system strong enough to do all those things would be
much more work, and would be approaching more like the C++0x model,
which I believe to be over-engineered in pursuit of the "make libraries
able to do anything a built in type can do" goal.

But reasonable people disagree on this.
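
The byte-vector view of strings described in this message can be
illustrated with a short sketch. It is written in present-day Rust
syntax rather than the 2012 dialect, and `byte_and_char_len` is a
hypothetical helper introduced only for this example. It shows that the
length of a UTF-8 string is a byte count, that indexing the byte view
yields a byte, and that iterating by character is a separate operation.

```rust
/// Returns (length in bytes, length in characters) of a UTF-8 string.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // U+00E9 (e with acute accent) occupies two bytes in UTF-8.
    let s = "h\u{e9}llo";

    let (bytes, chars) = byte_and_char_len(s);
    println!("{} bytes, {} chars", bytes, chars);

    // Indexing the byte view gives a byte, not a character.
    println!("byte at index 1: {:#04x}", s.as_bytes()[1]);

    // Iterating by character decodes the UTF-8 sequences.
    for c in s.chars() {
        println!("{}", c);
    }
}
```

Because the size is measured in bytes, it is known as soon as the
literal is, whatever mix of character widths the string contains.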

-Graydon

From niko at alum.mit.edu Mon Apr 23 16:40:37 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Mon, 23 Apr 2012 16:40:37 -0700
Subject: [rust-dev] Syntax of vectors, slices, etc
In-Reply-To: <4F95E1C6.60907@mozilla.com>
References: <4F901FB7.70805@alum.mit.edu> <4F95D35A.1070302@mozilla.com> <4F95E1C6.60907@mozilla.com>
Message-ID: <4F95E875.5040208@alum.mit.edu>

One thing that is unclear to me is the utility of the str/N type. I
can't think of a case where a *user* might want this type---it seems to
me to represent a string of exactly N bytes (not a buffer of at most N
bytes). Graydon, did you have use cases in mind?

Niko

From niko at alum.mit.edu Mon Apr 23 17:06:32 2012
From: niko at alum.mit.edu (Niko Matsakis)
Date: Mon, 23 Apr 2012 17:06:32 -0700
Subject: [rust-dev] Syntax of vectors, slices, etc
In-Reply-To: <4F95E875.5040208@alum.mit.edu>
References: <4F901FB7.70805@alum.mit.edu> <4F95D35A.1070302@mozilla.com> <4F95E1C6.60907@mozilla.com> <4F95E875.5040208@alum.mit.edu>
Message-ID: <4F95EE88.7070504@alum.mit.edu>

Some more thoughts on the matter:

http://smallcultfollowing.com/babysteps/blog/2012/04/23/vectors-strings-and-slices/

Niko

On 4/23/12 4:40 PM, Niko Matsakis wrote:
> One thing that is unclear to me is the utility of the str/N type. I
> can't think of a case where a *user* might want this type---it seems
> to me to represent a string of exactly N bytes (not a buffer of at
> most N bytes). Graydon, did you have use cases in mind?
>
>
> Niko
>
> On 4/23/12 4:12 PM, Graydon Hoare wrote:
>> On 12-04-23 03:21 PM, Rick Richardson wrote:
>>> Should a str be subject to the same syntax? Because it will have
>>> different semantics.
>> I think the semantics are almost identical to vectors. Save the null
>> issue.
>>
>>> A UTF-8 string has differently sized characters, so you can't treat
>>> it as a vector, there are obvious and currently discussed
>>> interoperability issues regarding the null terminator.
>> You certainly can treat it as a (constrained) vector. It's just a byte
>> vector, not a character vector. A character vector is [char]. Indexing
>> into a str gives you a byte. You can iterate through it in terms of
>> bytes or characters (or words, lines, paragraphs, etc.) or convert to
>> characters or utf-16 code units or any other encoding of unicode.
>>
>>> It should definitely get a slice syntax, since that will likely be the
>>> most common operation on a string.
>>> I would also like to support a notion of static sizing, but with UTF-8 >>> even that's not always possible. >> Yes it is. The static size is a byte count. The compiler knows that size >> statically and can complain if you get it wrong (or fill it in if you >> leave it as a wildcard, as I expect most will do.) >> >>> I reckon a string should be an object, and potentially be convertible >>> to/from a vector. But trying to treat it like a vector will just lead >>> to surprising semantics for some. But that's just my opinion. >> The set of use-cases to address simultaneously is large and covers much >> of the same ground as vectors: >> >> - Sometimes people want to be able to send strings between tasks. >> - Sometimes people want a shared, refcounted string. >> - Sometimes people want strings of arbitrary length. >> - Sometimes people want an interior string that's part of another >> structure (with necessarily-fixed size), copied by value. >> - String literals exist and ought to turn into something useful, >> something in static memory when possible, dynamic otherwise. >> - Passing strings and substrings should be cheap, cheaper than >> refcount-adjustment even (when possible). >> >> As far as I know, our class system can't really satisfy these >> requirements. This is why they're a built-in type (just like vectors). >> To make the class system strong enough to do all those things would be >> much more work, and would be approaching more like the C++0x model, >> which I believe to be over-engineered in pursuit of the "make libraries >> able to do anything a built in type can do" goal. >> >> But reasonable people disagree on this. 
>> >> -Graydon >> _______________________________________________ >> Rust-dev mailing list >> Rust-dev at mozilla.org >> https://mail.mozilla.org/listinfo/rust-dev > From niko at alum.mit.edu Tue Apr 24 10:15:33 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Tue, 24 Apr 2012 10:15:33 -0700 Subject: [rust-dev] 2 possible simplifications: reverse application, records as arguments In-Reply-To: References: Message-ID: <4F96DFB5.1070706@alum.mit.edu> Record patterns are fairly close to what you want. We don't use them consistently, but if we did, they would allow us to write things like: alt it.node { ast::item_impl({tps, _}) { ... } ast::item_enum({tps, _}) { ... } ... } It has an extra level of braces, though. Also, I don't know why we require a`_` to omit parameters and so forth. Niko On 4/21/12 10:28 AM, gasche wrote: > I've been wondering about a problem tightly related to named > parameters: named enum constructor arguments. SML has had the ability > to define algebraic datatypes as sum of (named) records, and I think > that is something that is missing in current rust. The code > antipattern that cries for it is pattern matching for a enumeration > name with a tedious number of "_" following it: > > rust/src % grep "_, _" -R . | wc -l > 547 > > See also the following code example: > > alt it.node { > ast::item_impl(tps, _, _, _) { > if ns == ns_type { ret lookup_in_ty_params(e, name, tps); } > } > ast::item_enum(_, tps, _) | ast::item_ty(_, tps, _) { > if ns == ns_type { ret lookup_in_ty_params(e, name, tps); } > } > ... } > > The position of the 'tps' variable among a variable number of ignored > parameters is fragile, increase maintainance costs (if you add > a parameter to some constructor in an enumeration, a lot of code > changes are required just to say that you ignore it most of the time), > and creates redundancy. 
> > This could be solved by having enum constructors with named arguments, > and conversely a syntax for enum patterns matching named > argument. Being able to write something like: > > alt it.node { > | ast::item_impl(tps:tps) > | ast::item_enum(tps:tps) > | ast::item_ty(tps:tps) { > if ns == ns_type { ret lookup_in_ty_params(e, name, tps); } > } > > would be a win (then you can say that "tps:" is a shorthand for > "tps:tps" or what not). > > Note: OCaml has made the choice that (K _) is a pattern that matches > the constructor K no matter what its arity is (including none or > several parameters). While that doesn't help with the 'tps' example > above, it would still simplify a lot of places and does not require > named constructor arguments. But I think that's an inferior solution. > >> - Our records are order-sensitive, to be C-structure compatible. >> Keyword arguments are usually argued-for (as you are doing here) as >> a way to make function arguments order-insensitive. We'd need to >> decide whether we wanted order to matter or not. >> >> - Argument-passing tends to work best when you can pass stuff in >> registers, not require the arguments to be spilled to memory and >> addressable as a contiguous slab. So we'd want to be careful not to >> require the "arguments structure" to be _actually_ addressable as a >> structure at any point. Rather, calling f(x) would be some kind of >> semantic sugar for `f(x.a, x.b, x.c)`, making separate copies for >> the sake of passing. > The clean solution is to make a distinction between the data structure > you call "records" here and the structure that is denoted by the > parameter-building syntax. Haskell has a semantic distinction between > "boxed tuples" (the usual thing) and "unboxed tuples" (primitive, less > exposed, can't be used to instantiate polymorphic types). Similarly > you would have "data records" (contiguous, C-compatible, etc.) 
and, > say, "native records", that would be a different type with less > flexibility for the user and more flexibility for the implementer: not > adressable, possibly non-contiguous memory layout, a field order > decided by the compiler, etc. > > You could then explain {x:1, y:2} as syntaxic sugard for, say, > record(x:1, y:2), where `record` is a polytypic primitive that builds > a "data record" from some "native record". > > (You could do the same for tuples and handle mixed named/unnamed > parameters by adopting the convention existing in some languages, for > example Oz, that a tuple (x,y,z) is just a record of numeric fields > (0:x, 1:y, 2:z)) > >> - We'd have to decide the rules surrounding keyword mismatches, partial >> provision of keywords, argument permutation, and function types. > Re. function types: if you consider those parameter-passing structures > as "first class" (which does necessarily mean that they are convenient > to use, for example if they're not adressable they will be > less flexible), the natural choice is to have a family of types for > them. Those types could come with restrictions and an unspoken kinding > discipline, so that for example they cannot be used to instantiate > type variables, maybe cannot be nested, etc. > > That's the main reason why I think one should think of such structures > as real structures rather than syntactic sugar; it forces you to have > a proper design for types and other aspects. > > >>> another thing is that instead of passing arguments, you pass just one >>> (anonymous) record. the record is the arguments. >> We actually had quite an argument with one of the Felix authors about >> this. This was not, back then, a terribly realistic option during that >> conversation (argument modes were still the primary way we were doing >> safe references, which are not first class types). 
But it's conceivably >> something we could look into if we get the argument-passing logic down >> to "always by-value and use region pointers for safe references" (which >> is where we're going). There remain some hitches: >> >> - Our syntax isn't quite compatible with the idea; records have to be >> brace-parenthesized and tuples given round parentheses. They'd need >> reform, and the syntax is already pretty crowded. >> >> - We'd have to decide the rules surrounding keyword mismatches, partial >> provision of keywords, argument permutation, and function types. >> >> - Our records are order-sensitive, to be C-structure compatible. >> Keyword arguments are usually argued-for (as you are doing here) as >> a way to make function arguments order-insensitive. We'd need to >> decide whether we wanted order to matter or not. >> >> - Argument-passing tends to work best when you can pass stuff in >> registers, not require the arguments to be spilled to memory and >> addressable as a contiguous slab. So we'd want to be careful not to >> require the "arguments structure" to be _actually_ addressable as a >> structure at any point. Rather, calling f(x) would be some kind of >> semantic sugar for `f(x.a, x.b, x.c)`, making separate copies for >> the sake of passing. >> >> So .. I can see a possibility here, but it'd be a complicated set of >> issues to work through. Would need some serious design work. I've never >> been intrinsically opposed to it, just felt that we were constrained by >> other choices in the language. At the time, argument modes were >> completely prohibitive; now it might be possible, but is still not >> entirely straightforward. 
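Editorial sketch: the named-constructor-argument idea quoted earlier in this message (matching `tps` by name instead of counting `_` placeholders) can be illustrated in present-day Rust syntax, which is not the 2012 dialect under discussion. The `Item` enum and its fields are simplified, hypothetical stand-ins for the `ast::item_*` constructors mentioned in the thread, and `ty_params` is an illustrative name.

```rust
// Hypothetical, simplified stand-in for the AST item constructors.
#[allow(dead_code)]
enum Item {
    Impl { tps: Vec<String>, methods: Vec<String> },
    Enum { variants: Vec<String>, tps: Vec<String> },
    Ty { ty: String, tps: Vec<String> },
    Const { ty: String },
}

/// Returns the type parameters of an item, if the item kind has any.
/// Fields are matched by name, so their position and the number of
/// ignored fields do not matter; `..` skips everything not named.
fn ty_params(item: &Item) -> Option<&[String]> {
    match item {
        Item::Impl { tps, .. }
        | Item::Enum { tps, .. }
        | Item::Ty { tps, .. } => Some(tps.as_slice()),
        Item::Const { .. } => None,
    }
}
```

Adding a field to one of these constructors would leave `ty_params` unchanged, which is the maintenance benefit argued for in the quoted text.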
>> >> -Graydon > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From graydon at mozilla.com Tue Apr 24 10:40:45 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 24 Apr 2012 10:40:45 -0700 Subject: [rust-dev] tool interfaces Message-ID: <4F96E59D.3060407@mozilla.com> Hi, There's been some casual conversation on IRC and around mozilla lately about the longer-term evolution of tool interfaces (command-line and crate/library interfaces) for rust. I thought I'd poll the mailing list a bit and see if anyone has strong opinions. Here is what's been discussed: 1. Creating an 'outermost' command-line tool called simply "rust", through which any remotely rust-related sub-tool can be discovered (and invoked as "rust [...] footsteps of other broad tool-sets such as git (and more recently go). 2. Renaming sub-tools rustc, rustdoc, fuzzer and cargo to a uniform naming scheme (either, say, renaming the latter two to "rustfuzz" and "rustpkg", or perhaps renaming them all to hyphen-names like "rust-compile", "rust-doc", "rust-fuzzer", "rust-package") 3. Merging tools or splitting them. Moving cargo functionality into rustc, for example (name a link requirement in a crate => download and compile it as a dependency). Generating docs as part of a compilation pass. 4. Moving more of the compiler to separate crates with their own library interfaces, LLVM-like, to make it easy to make tools with different command-line interfaces, but shared code paths. It should not escape notice that these topics are somewhat contradictory or at least pulling the problem in multiple directions at once. That's fine, it just points to the existence of a problem-space we need to adopt strategy around. Currently we don't have much of a _strategy_. As in, not many really clear organizing principles for where to draw lines between crates or tools. I'm polling the list here mostly to request advice on such principles. What are some ways you'd divide responsibility between command line tools? How many should we aim for? One tool per general role of developer? One tool per intended man page? One tool per different default interpretation of an unadorned command-line argument? One tool per step in a build process? One tool ("rust") with all subcommands as merely library calls? Some other principle? Thoughts, opinions?
-Graydon From ben.striegel at gmail.com Tue Apr 24 10:59:46 2012 From: ben.striegel at gmail.com (Benjamin Striegel) Date: Tue, 24 Apr 2012 13:59:46 -0400 Subject: [rust-dev] tool interfaces In-Reply-To: <4F96E59D.3060407@mozilla.com> References: <4F96E59D.3060407@mozilla.com> Message-ID: Here's a summary of Go's subcommands, which could be useful for a starting point: build compile packages and dependencies clean remove object files doc run godoc on package sources env print Go environment information fix run go tool fix on packages fmt run gofmt on package sources get download and install packages and dependencies install compile and install packages and dependencies list list packages run compile and run Go program test test packages tool run specified go tool version print Go version vet run go tool vet on packages Not intending to throw more fuel on the Rust vs. Go fire, but they do have a lot of good ideas! :) On Tue, Apr 24, 2012 at 1:40 PM, Graydon Hoare wrote: > Hi, > > There's been some casual conversation on IRC and around mozilla lately > about the longer-term evolution of tool interfaces (command-line and > crate/library interfaces) for rust. I thought I'd poll the mailing list a > bit and see if anyone has strong opinions. Here is what's been discussed: > > 1. Creating an 'outermost' command-line tool called simply "rust", through > which any remotely rust-related sub-tool can be discovered (and invoked as > "rust broad tool-sets such as git (and more recently go). > > 2. Renaming sub-tools rustc, rustdoc, fuzzer and cargo to a uniform naming > scheme (either, say, renaming the latter two to "rustfuzz" and "rustpkg", > or perhaps renaming them all to hyphen-names like "rust-compile", > "rust-doc", "rust-fuzzer", "rust-package") > > 3. Merging tools or splitting them. Moving cargo functionality into rustc, > for example (name a link requirement in a crate => download and compile it > as a dependency). Generating docs as part of a compilation pass. 
> > 4. Moving more of the compiler to separate crates with their own library > interfaces, LLVM-like, to make it easy to make tools with different > command-line interfaces, but shared code paths. > > It should not escape notice that these topics are somewhat contradictory > or at least pulling the problem in multiple directions at once. That's > fine, it just points to the existence of a problem-space we need to adopt > strategy around. Currently we don't have much of a _strategy_. As in, not > many really clear organizing principles for where to draw lines between > crates or tools. > > I'm polling the list here mostly to request advice on such principles. > What are some ways you'd divide responsibility between command line tools? > How many should we aim for? One tool per general role of developer? One > tool per intended man page? One tool per different default interpretation > of an unadorned command-line argument? One tool per step in a build > process? One tool ("rust") with all subcommands as merely library calls? > Some other principle? > > Thoughts, opinions? > > -Graydon > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From graydon at mozilla.com Tue Apr 24 11:06:34 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 24 Apr 2012 11:06:34 -0700 Subject: [rust-dev] tool interfaces In-Reply-To: References: <4F96E59D.3060407@mozilla.com> Message-ID: <4F96EBAA.7020104@mozilla.com> On 4/24/2012 10:59 AM, Benjamin Striegel wrote: > Here's a summary of Go's subcommands, which could be useful for a > starting point: > > ... > > Not intending to throw more fuel on the Rust vs. Go fire, but they do > have a lot of good ideas! :) Thanks!
Handy to see but I'm not sure which principles fall out of that list aside from "all the tools that got written previously, now under one tool interface!" No need to consider go's competitive existence a fire to put fuel on; they're sufficiently different projects that they can peacefully coexist and just copy good ideas. We copy ideas from lots of languages. Everyone does. -Graydon From banderson at mozilla.com Tue Apr 24 11:26:19 2012 From: banderson at mozilla.com (Brian Anderson) Date: Tue, 24 Apr 2012 11:26:19 -0700 Subject: [rust-dev] tool interfaces In-Reply-To: <4F96E59D.3060407@mozilla.com> References: <4F96E59D.3060407@mozilla.com> Message-ID: <4F96F04B.10205@mozilla.com> On 04/24/2012 10:40 AM, Graydon Hoare wrote: > Hi, > > There's been some casual conversation on IRC and around mozilla lately > about the longer-term evolution of tool interfaces (command-line and > crate/library interfaces) for rust. I thought I'd poll the mailing list > a bit and see if anyone has strong opinions. Here is what's been discussed: There's an issue open here: https://github.com/mozilla/rust/issues/2238 > 1. Creating an 'outermost' command-line tool called simply "rust", > through which any remotely rust-related sub-tool can be discovered (and > invoked as "rust footsteps of other broad tool-sets such as git (and more recently go). > > 2. Renaming sub-tools rustc, rustdoc, fuzzer and cargo to a uniform > naming scheme (either, say, renaming the latter two to "rustfuzz" and > "rustpkg", or perhaps renaming them all to hyphen-names like > "rust-compile", "rust-doc", "rust-fuzzer", "rust-package") > > 3. Merging tools or splitting them. Moving cargo functionality into > rustc, for example (name a link requirement in a crate => download and > compile it as a dependency). Generating docs as part of a compilation pass. > > 4. 
Moving more of the compiler to separate crates with their own library > interfaces, LLVM-like, to make it easy to make tools with different > command-line interfaces, but shared code paths. This is my main interest. Whatever the resulting UI looks like I want everything to be factored into libraries and not all dumped into rustc. > It should not escape notice that these topics are somewhat contradictory > or at least pulling the problem in multiple directions at once. That's > fine, it just points to the existence of a problem-space we need to > adopt strategy around. Currently we don't have much of a _strategy_. As > in, not many really clear organizing principles for where to draw lines > between crates or tools. > > I'm polling the list here mostly to request advice on such principles. > What are some ways you'd divide responsibility between command line > tools? How many should we aim for? One tool per general role of > developer? One tool per intended man page? One tool per different > default interpretation of an unadorned command-line argument? One tool > per step in a build process? One tool ("rust") with all subcommands as > merely library calls? Some other principle? My preference is that every tool is a library crate that exports a known interface. There is one driver that knows how to discover and load libraries that implement this interface. We have many configurations of the driver that restrict it to just the rustdoc tool, etc. and one 'master' configuration that can drive all available tools. Users can create libraries that implement this interface and publish them via cargo to add features to the Rust toolchain. > > Thoughts, opinions? 
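Editorial sketch: the "every tool is a library exporting a known interface, plus one driver" preference stated above could look roughly like the following in present-day Rust syntax (not the 2012 dialect). The `Tool` trait, the `Driver` type, and the `Doc` and `Compile` tools are all hypothetical names invented for illustration; dynamic discovery and loading of tool libraries is not modeled, and tools are registered by hand instead.

```rust
use std::collections::BTreeMap;

/// Hypothetical interface that every tool library would export.
trait Tool {
    /// Subcommand name under which the driver exposes the tool.
    fn name(&self) -> &'static str;
    /// Runs the tool on its arguments and returns a report or an error.
    fn run(&self, args: &[String]) -> Result<String, String>;
}

/// Stand-in for a documentation tool.
struct Doc;
impl Tool for Doc {
    fn name(&self) -> &'static str {
        "doc"
    }
    fn run(&self, args: &[String]) -> Result<String, String> {
        Ok(format!("documenting {}", args.join(" ")))
    }
}

/// Stand-in for a compiler front end.
struct Compile;
impl Tool for Compile {
    fn name(&self) -> &'static str {
        "compile"
    }
    fn run(&self, args: &[String]) -> Result<String, String> {
        Ok(format!("compiling {}", args.join(" ")))
    }
}

/// A driver configuration is just the set of tools registered with it.
struct Driver {
    tools: BTreeMap<&'static str, Box<dyn Tool>>,
}

impl Driver {
    fn new() -> Self {
        Driver { tools: BTreeMap::new() }
    }

    /// Adds a tool under the name it reports.
    fn register(&mut self, tool: Box<dyn Tool>) {
        let name = tool.name();
        self.tools.insert(name, tool);
    }

    /// Lists the subcommands this configuration can run, sorted by name.
    fn available(&self) -> Vec<&'static str> {
        self.tools.keys().copied().collect()
    }

    /// Treats the first argument as the subcommand and passes the rest on.
    fn dispatch(&self, argv: &[String]) -> Result<String, String> {
        let (cmd, rest) = argv
            .split_first()
            .ok_or_else(|| "no subcommand given".to_string())?;
        match self.tools.get(cmd.as_str()) {
            Some(tool) => tool.run(rest),
            None => Err(format!("unknown subcommand: {}", cmd)),
        }
    }
}
```

Under these assumptions, a driver with only `Doc` registered plays the role of a restricted configuration, while one with every tool registered corresponds to the 'master' configuration.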
> > -Graydon > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From matthieu.monrocq at gmail.com Tue Apr 24 11:30:55 2012 From: matthieu.monrocq at gmail.com (Matthieu Monrocq) Date: Tue, 24 Apr 2012 20:30:55 +0200 Subject: [rust-dev] Syntax of vectors, slices, etc In-Reply-To: <4F95EE88.7070504@alum.mit.edu> References: <4F901FB7.70805@alum.mit.edu> <4F95D35A.1070302@mozilla.com> <4F95E1C6.60907@mozilla.com> <4F95E875.5040208@alum.mit.edu> <4F95EE88.7070504@alum.mit.edu> Message-ID: Hello, As this is going to be my first e-mail on this list, please do not hesitate to correct me if I speak out of turn. Also do note that I am not a native English speaker, I still promise to do my best and I will gladly welcome any correction. ---- First, I agree that operations on vectors and strings are mostly similar. However this is at the condition of considering strings as list of codepoints, and not list of bytes. List of bytes are useful in encoding and decoding operations, but to manipulate Arabic or Korean, they fall short: having users manipulate the strings byte-wise instead of codepoint-wise is a recipe to disaster outside of English and Latin-1 representable languages. I understand that this may seem contradictory to Rust's original direction of utf-8 encoded strings, but having worked with utf-8 strings using C++ `std::string` I can assure you that apart from blindly passing them around, one cannot do much. All modifiying operations require the use of Unicode aware libraries... even `substr`. Second, I do not think that statically known sizes are so important in the type system. I am a huge fan, and abuser, of the C++ template system, but I will be the first to admit it is really complex and generally poorly understood even among usually savvy C++ users. As I understand, fixed-length vectors were imagined for C-compatibility. 
Statically allocated buffers have lifetime that exceed that of all other objects in the system, therefore they can perfectly be accessed through slices. Other uses implying C-compatibility should be based on dynamically allocated memory, and the size will be unknown at compilation. In the blog article linked, an issue regarding the variable-size of `rust_vec` is made because it plays havoc with stack-allocation. However, is real stack-allocation necessary here ? It seems to me that was is desirable is the semantic aspect of a scope-bound variable. Whether the actual representation is instantiated on the stack or on the task heap is an implementation detail, and the compiler could perfectly well be enhanced such that all variably-sized types are actually instantiated on the heap, but automatically collected at the end of the function scope. A "parallel" stack dedicated to such allocations could even be used, as the allocation/deallocation pattern is stack-like. I hope my suggestions are reasonable. Do feel free to ignore them if they are not! -- Matthieu On Tue, Apr 24, 2012 at 2:06 AM, Niko Matsakis wrote: > Some more thoughts on the matter: > > http://smallcultfollowing.com/**babysteps/blog/2012/04/23/** > vectors-strings-and-slices/ > > Niko > > > On 4/23/12 4:40 PM, Niko Matsakis wrote: > >> One thing that is unclear to me is the utility of the str/N type. I >> can't think of a case where a *user* might want this type---it seems to me >> to represent a string of exactly N bytes (not a buffer of at most N bytes). >> Graydon, did you have use cases in mind? >> >> >> Niko >> >> On 4/23/12 4:12 PM, Graydon Hoare wrote: >> >>> On 12-04-23 03:21 PM, Rick Richardson wrote: >>> >>>> Should a str be subject to the same syntax? Because it will have >>>> different semantics. >>>> >>> I think the semantics are almost identical to vectors. Save the null >>> issue. 
>>> >>> A UTF-8 string has differently sized characters, so you can't treat >>>> it as a vector, there are obvious and currently discussed >>>> interoperability issues regarding the null terminator. >>>> >>> You certainly can treat it as a (constrained) vector. It's just a byte >>> vector, not a character vector. A character vector is [char]. Indexing >>> into a str gives you a byte. You can iterate through it in terms of >>> bytes or characters (or words, lines, paragraphs, etc.) or convert to >>> characters or utf-16 code units or any other encoding of unicode. >>> >>> It should definitely get a slice syntax, since that will likely be the >>>> most common operation on a string. >>>> I would also like to support a notion of static sizing, but with UTF-8 >>>> even that's not always possible. >>>> >>> Yes it is. The static size is a byte count. The compiler knows that size >>> statically and can complain if you get it wrong (or fill it in if you >>> leave it as a wildcard, as I expect most will do.) >>> >>> I reckon a string should be an object, and potentially be convertible >>>> to/from a vector. But trying to treat it like a vector will just lead >>>> to surprising semantics for some. But that's just my opinion. >>>> >>> The set of use-cases to address simultaneously is large and covers much >>> of the same ground as vectors: >>> >>> - Sometimes people want to be able to send strings between tasks. >>> - Sometimes people want a shared, refcounted string. >>> - Sometimes people want strings of arbitrary length. >>> - Sometimes people want an interior string that's part of another >>> structure (with necessarily-fixed size), copied by value. >>> - String literals exist and ought to turn into something useful, >>> something in static memory when possible, dynamic otherwise. >>> - Passing strings and substrings should be cheap, cheaper than >>> refcount-adjustment even (when possible). 
>>> >>> As far as I know, our class system can't really satisfy these >>> requirements. This is why they're a built-in type (just like vectors). >>> To make the class system strong enough to do all those things would be >>> much more work, and would be approaching more like the C++0x model, >>> which I believe to be over-engineered in pursuit of the "make libraries >>> able to do anything a built in type can do" goal. >>> >>> But reasonable people disagree on this. >>> >>> -Graydon >>> _______________________________________________ >>> Rust-dev mailing list >>> Rust-dev at mozilla.org >>> https://mail.mozilla.org/listinfo/rust-dev >>> >> >> > _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev From arcata at gmail.com Tue Apr 24 11:49:56 2012 From: arcata at gmail.com (Joe Groff) Date: Tue, 24 Apr 2012 11:49:56 -0700 Subject: [rust-dev] Syntax of vectors, slices, etc In-Reply-To: References: <4F901FB7.70805@alum.mit.edu> <4F95D35A.1070302@mozilla.com> <4F95E1C6.60907@mozilla.com> <4F95E875.5040208@alum.mit.edu> <4F95EE88.7070504@alum.mit.edu> Message-ID: On Tue, Apr 24, 2012 at 11:30 AM, Matthieu Monrocq wrote: > However this is at the condition of considering strings as list of > codepoints, and not list of bytes. List of bytes are useful in encoding and > decoding operations, but to manipulate Arabic or Korean, they fall short: > having users manipulate the strings byte-wise instead of codepoint-wise is a > recipe to disaster outside of English and Latin-1 representable languages. > > I understand that this may seem contradictory to Rust's original direction > of utf-8 encoded strings, but having worked with utf-8 strings using C++ > `std::string` I can assure you that apart from blindly passing them around, > one cannot do much.
All modifiying operations require the use of Unicode > aware libraries... even `substr`. Well, that's why you should use ICU instead of builtin language facilities for Unicode-aware processing. But there's a lot of code that really does just need to blindly pass around pre-composed strings, and an ICU or equivalent dependency (and in many cases even UTF encoding/decoding) would be overkill for those applications. In previous discussions about text processing on the list, IIRC it's been decided that the builtin string facilities should remain low-level, and bindings to ICU used for real text processing. -Joe From graydon at mozilla.com Tue Apr 24 14:24:15 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Tue, 24 Apr 2012 14:24:15 -0700 Subject: [rust-dev] Syntax of vectors, slices, etc In-Reply-To: References: <4F901FB7.70805@alum.mit.edu> <4F95D35A.1070302@mozilla.com> <4F95E1C6.60907@mozilla.com> <4F95E875.5040208@alum.mit.edu> <4F95EE88.7070504@alum.mit.edu> Message-ID: <4F9719FF.1000907@mozilla.com> On 12-04-24 11:30 AM, Matthieu Monrocq wrote: > However this is at the condition of considering strings as list of > codepoints, and not list of bytes. List of bytes are useful in encoding > and decoding operations, but to manipulate Arabic or Korean, they fall > short: having users manipulate the strings byte-wise instead of > codepoint-wise is a recipe to disaster outside of English and Latin-1 > representable languages. Could you elaborate on this a little bit? I'm curious to hear impressions -- even if vague or hard to specify -- about the experience of working with known-language, non-Latin-1 text. I'm an English-speaker and much technical material is English-derived, so usually when I'm working with text-processing code, it falls into one of two categories: - ASCII-subset by construction (eg. structured-language keywords) - Totally unknown language semantics, has to work with everything, can't assume I know anything about the language (eg. 
"human input") I am emphatically not saying these are the _only_ two possible environments, just the two that I have experience in. So in my experience byte-operations in ASCII range works for the former and using a proper language-and-locale-aware unicode library like ICU works for the latter. That's where my "usability biases" emerge in the design of str. In particular I want to know if you would feel that there are common operations you expect to be able to do codepoint-at-a-time on the datatype "str", that you would not be comfortable doing on the datatype "[char]", if you converted str to [char] as a one-time pass in advance of performing the operation. That's what I assume people will do if they need random (rather than sequential) codepoint access. Sequential access we already have iterators for. But I understand this might not be right; it's a design space with a lot of tensions. There are as many different string representations in the world as there are opinionated programmers :) > I understand that this may seem contradictory to Rust's original > direction of utf-8 encoded strings, but having worked with utf-8 strings > using C++ `std::string` I can assure you that apart from blindly passing > them around, one cannot do much. All modifiying operations require the > use of Unicode aware libraries... even `substr`. Naturally so. We're intending to ship a relatively full binding to libicu for just this reason. Unicode Text Is Hard To Do By Hand. (Though, hmm, substr is actually fine on UTF-8, no? You just have to land on character boundaries. Which are easy to find; O(1) from any given start point -- at most 5 bytes away -- and the guaranteed output of any other algorithm that iterates over character boundaries...) > Second, I do not think that statically known sizes are so important in > the type system. 
I am a huge fan, and abuser, of the C++ template > system, but I will be the first to admit it is really complex and > generally poorly understood even among usually savvy C++ users. > > As I understand, fixed-length vectors were imagined for C-compatibility. > Statically allocated buffers have lifetime that exceed that of all other > objects in the system, therefore they can perfectly be accessed through > slices. Other uses implying C-compatibility should be based on > dynamically allocated memory, and the size will be unknown at compilation. They're useful for a lot of reasons. You can alloca them, which is good for small buffers. And a decent number of heap structures also have need of small fixed-fanout arrays, caches, lookup tables and the like. But beyond that they simply _occur_ in the C type system. With annoying frequency! We've designed (and intend to maintain) a degree of compatibility with C in our structured types: a rust record and a C struct containing the same elements ought to be memory-compatible. When a C struct has an array in the middle of it, we need to be able to represent that somehow. There are a nontrivial number of C structures that have that property (or, say, a fixed-sized reserved region). We currently address this by having users generate a sequence of fields like: pad1: u8; pad2: u8; pad3: u8; pad4: u8; etc. etc. Not so fun. > In the blog article linked, an issue regarding the variable-size of > `rust_vec` is made because it plays havoc with stack-allocation. > However, is real stack-allocation necessary here ? It's not necessary, but if it's not done on the stack, it's done on a parallel-to-the-stack LIFO structure (a.k.a. "dynastack"), which we used to have, but have removed since we managed to move everything to the stack. If we had to re-acquire the dynastack for this purpose, it would not be the end of the world, but we'd like to avoid it. It's one more moving part. > I hope my suggestions are reasonable. 
Do feel free to ignore them if > they are not! Quite reasonable. I hope I've provided useful answers to some. -Graydon From matthieu.monrocq at gmail.com Wed Apr 25 12:14:52 2012 From: matthieu.monrocq at gmail.com (Matthieu Monrocq) Date: Wed, 25 Apr 2012 21:14:52 +0200 Subject: [rust-dev] Syntax of vectors, slices, etc In-Reply-To: <4F9719FF.1000907@mozilla.com> References: <4F901FB7.70805@alum.mit.edu> <4F95D35A.1070302@mozilla.com> <4F95E1C6.60907@mozilla.com> <4F95E875.5040208@alum.mit.edu> <4F95EE88.7070504@alum.mit.edu> <4F9719FF.1000907@mozilla.com> Message-ID: On Tue, Apr 24, 2012 at 11:24 PM, Graydon Hoare wrote: > On 12-04-24 11:30 AM, Matthieu Monrocq wrote: > > > However this is at the condition of considering strings as list of > > codepoints, and not list of bytes. List of bytes are useful in encoding > > and decoding operations, but to manipulate Arabic or Korean, they fall > > short: having users manipulate the strings byte-wise instead of > > codepoint-wise is a recipe to disaster outside of English and Latin-1 > > representable languages. > > Could you elaborate on this a little bit? I'm curious to hear > impressions -- even if vague or hard to specify -- about the experience > of working with known-language, non-Latin-1 text. I'm an English-speaker > and much technical material is English-derived, so usually when I'm > working with text-processing code, it falls into one of two categories: > > - ASCII-subset by construction (eg. structured-language keywords) > > - Totally unknown language semantics, has to work with everything, > can't assume I know anything about the language (eg. "human input") > > I am emphatically not saying these are the _only_ two possible > environments, just the two that I have experience in. So in my > experience byte-operations in ASCII range works for the former and using > a proper language-and-locale-aware unicode library like ICU works for > the latter. 
That's where my "usability biases" emerge in the design of str. > > In particular I want to know if you would feel that there are common > operations you expect to be able to do codepoint-at-a-time on the > datatype "str", that you would not be comfortable doing on the datatype > "[char]", if you converted str to [char] as a one-time pass in advance > of performing the operation. That's what I assume people will do if they > need random (rather than sequential) codepoint access. Sequential access > we already have iterators for. > > But I understand this might not be right; it's a design space with a lot > of tensions. There are as many different string representations in the > world as there are opinionated programmers :) > > I understand that this may seem contradictory to Rust's original > > direction of utf-8 encoded strings, but having worked with utf-8 strings > > using C++ `std::string` I can assure you that apart from blindly passing > > them around, one cannot do much. All modifiying operations require the > > use of Unicode aware libraries... even `substr`. > > Naturally so. We're intending to ship a relatively full binding to > libicu for just this reason. Unicode Text Is Hard To Do By Hand. > > (Though, hmm, substr is actually fine on UTF-8, no? You just have to > land on character boundaries. Which are easy to find; O(1) from any > given start point -- at most 5 bytes away -- and the guaranteed output > of any other algorithm that iterates over character boundaries...) > Thanks for this answer: I had not considered the ability to do a str -> [char] -> str with actual Unicode work on the [char] type. I also did not know about the intent of integrating a subset of libicu. Indeed with a full library handling [char] correctly, and two simple facilities to convert back and fro, then it would be trivial for the user to use real Unicode operations (to_lower / to_upper / capitalize are not fun :x) without too much hassle. 
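The str -> [char] -> str round trip discussed above can be sketched roughly as follows. This is only an illustration in present-day Rust syntax, not the 2012 dialect used on this list, and `truncate_codepoints` is a hypothetical helper name, not an existing library function. It counts code points, which matches graphemes only for text in a suitable canonical form; real grapheme handling would still need a library such as ICU.

```rust
/// Truncate `text` to at most `max_chars` code points by converting it to a
/// vector of `char`s, working on that vector, and converting back.
/// (Hypothetical helper, for illustration only.)
fn truncate_codepoints(text: &str, max_chars: usize) -> String {
    // One-time pass: decode the UTF-8 bytes into a vector of code points.
    let mut chars: Vec<char> = text.chars().collect();
    // Length and indexing are now per code point rather than per byte.
    chars.truncate(max_chars);
    // Re-encode the remaining code points as a UTF-8 string.
    chars.into_iter().collect()
}
```

Calling `truncate_codepoints("αβγδε", 3)` keeps three Greek letters, whereas cutting the same string after three bytes would split the second letter in half.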
Regarding the use cases I have encountered, they were in a general public web app: - wrap-around at a specified length (in number of graphemes, which in the appropriate canonical form was the number of codepoints in all the languages we cared for) - truncation at a specified length (also in number of graphemes) - sorting lists (the first time we presented a list of countries in Greek, it was nigh unusable...) Pretty basic operations, we used ICU for sorting (collation) and conversion to 32bits unicode codepoint value for length operations. It was all the more funny with Arabic, of course, because of the control characters for the direction of display which do not have a graphical representation, but since we counted by hand, we just ignored them. > > Second, I do not think that statically known sizes are so important in > > the type system. I am a huge fan, and abuser, of the C++ template > > system, but I will be the first to admit it is really complex and > > generally poorly understood even among usually savvy C++ users. > > > > As I understand, fixed-length vectors were imagined for C-compatibility. > > Statically allocated buffers have lifetime that exceed that of all other > > objects in the system, therefore they can perfectly be accessed through > > slices. Other uses implying C-compatibility should be based on > > dynamically allocated memory, and the size will be unknown at > compilation. > > They're useful for a lot of reasons. You can alloca them, which is good > for small buffers. And a decent number of heap structures also have need > of small fixed-fanout arrays, caches, lookup tables and the like. > > But beyond that they simply _occur_ in the C type system. With annoying > frequency! We've designed (and intend to maintain) a degree of > compatibility with C in our structured types: a rust record and a C > struct containing the same elements ought to be memory-compatible. 
When > a C struct has an array in the middle of it, we need to be able to > represent that somehow. There are a nontrivial number of C structures > that have that property (or, say, a fixed-sized reserved region). We > currently address this by having users generate a sequence of fields like: > > pad1: u8; pad2: u8; pad3: u8; pad4: u8; > > etc. etc. Not so fun. > Ah indeed, to emulate C's layout they seem quite necessary. > > > In the blog article linked, an issue regarding the variable-size of > > `rust_vec` is made because it plays havoc with stack-allocation. > > However, is real stack-allocation necessary here ? > > It's not necessary, but if it's not done on the stack, it's done on a > parallel-to-the-stack LIFO structure (a.k.a. "dynastack"), which we used > to have, but have removed since we managed to move everything to the > stack. If we had to re-acquire the dynastack for this purpose, it would > not be the end of the world, but we'd like to avoid it. It's one more > moving part. > > Yes, that is the issue. The more parts there are in a task and the heavier they get. > I hope my suggestions are reasonable. Do feel free to ignore them if > > they are not! > > Quite reasonable. I hope I've provided useful answers to some. > > -Graydon > Thank you very much for the detailed answer. Having just discovered Rust a few weeks ago I am afraid that I lack a lot of background on those questions. I definitely hope to get up to speed as the design of Rust (if not the syntax ;) ) is extremely interesting: typestate, built-in log/note, task & fail interactions, region pointers => that's a lot of goodness for someone coming from a C++ background! -- Matthieu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From niko at alum.mit.edu Fri Apr 27 15:15:32 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Fri, 27 Apr 2012 15:15:32 -0700 Subject: [rust-dev] In favor of types of unknown size Message-ID: <4F9B1A84.7040806@alum.mit.edu> Hi, This is a post I recently put on my blog. I thought I'd post it to the mailing list as it pertains to our recent discussion on the syntax of vectors, slices, and so forth. I feel like I'm harping on this issue so I think it's the last thing I will write about it for a while. =) Niko ------------ http://smallcultfollowing.com/babysteps/blog/2012/04/27/in-favor-of-types-of-unknown-size/ ## Summary First, The Grand ASCII Art Table, summarizing everything (sad fact: `M-x picture-mode` is way more convenient than making an HTML table). Blank spaces indicate things that are inexpressible in one proposal or the other (for better or worse). ``` +---------------------++---------------------+ | This proposal: || Original proposal: | |--------+------------||-------+-------------| | Type | Literal || Type | Literal | |--------+------------||-------+-------------| | [:]T | || [T] | [1, 2, 3] | | []T | [1, 2, 3] || | | | &[]T | &[1, 2, 3] || | | | @[]T | @[1, 2, 3] || [T]/@ | [1, 2, 3]/@ | | ~[]T | ~[1, 2, 3] || [T]/~ | [1, 2, 3]/~ | | [3]T | [|1, 2, 3] || [T]/3 | [1, 2, 3]/_ | | | || | | | substr | || str | "abc" | | str | "abc" || | | | &str | &"abc" || | | | @str | @"abc" || str/@ | "abc"/@ | | ~str | ~"abc" || str/~ | "abc"/~ | | | || str/3 | "abc"/_ | +---------------------++---------------------+ ``` The types `[]T` and `str` would represent vectors and strings, respectively. These types have the C representation `rust_vec` and `rust_vec`. They are of *dynamic size*, meaning that their size depends on their length. The literal form for vectors and strings are `[a, b, c]` and `"foo"`, just as normal. The types `[:]T` and `substr` represent slices of vectors and strings. Their representation is the pair of a pointer and a length. 
They are each associated with a [lifetime][ref] that specifies how long the slice is valid, and thus can be more fully notated as `[:]/&r T` and `substr/&r`, but users will not have to write this very often, if ever. Vectors, strings, and fixed-length vectors are implicitly coercible to slices just as today. Furthermore, one can explicitly take a slice using a Python-like slice notation: `v[3:-5]`, or `v[:]` to take a slice of the entire vector. It is also allowed to take a slice of a slice. This is where the `:` in the slice type comes from: it's supposed to echo this syntactic form.

[ref]: /blog/2012/04/25/references

Fixed-length vectors are written `[N]T`. They are represented just like a C vector `T[N]`. The literal form is `[| v1, ..., vN]`. The leading `|` serves to distinguish a fixed-length vector. It is random but whatever, this is a specialized use case for C compatibility. The length of the literal form is always derived from the number of items. I opted not to include a way to represent fixed-length strings for the [same reasons I previously stated][bg].

## Advantages

The big advantage is that everything is written the way that seems to me to be most natural. For example, a vector on the stack is written `&[1, 2, 3]`. A task-local vector is written `@[1, 2, 3]`. A unique vector is written `~[1, 2, 3]`. Same with strings.

I also like that the indication of where memory is allocated is orthogonal to what is stored in the memory. The type and unary operators `&`, `@` and `~` tell you where the memory is allocated, and the types which follow tell you what you will find at that memory. If we have types like `[1, 2, 3]/@`, they combine where the memory is allocated with what you will find there (to be clear, that is by design, so as to avoid the disadvantages in the next section).

There is no need for a literal form for slices. If you create a vector and then use it where a slice is expected, the type will be coercible, so no error will result.
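To make the coercion rule concrete, here is a rough sketch of the same idea. It is written in present-day Rust syntax rather than the notation proposed in this post, so the spellings (`&[i32]`, `Vec<i32>`, `v[1..4]`) are assumptions for illustration and not part of the proposal. The names `sum` and `demo` are hypothetical.

```rust
/// Sum any contiguous run of integers. The parameter is a slice, that is,
/// a (pointer, length) pair borrowed from whoever owns the data.
fn sum(items: &[i32]) -> i32 {
    items.iter().sum()
}

/// Exercise the coercions: a heap vector, a fixed-length array, an explicit
/// sub-slice, and a slice of a slice all go through the same function.
fn demo() -> (i32, i32, i32, i32) {
    let heap_vector: Vec<i32> = vec![1, 2, 3, 4, 5]; // growable, heap-allocated
    let fixed_array: [i32; 3] = [10, 20, 30]; // length is part of the type

    let whole = sum(&heap_vector); // vector used where a slice is expected
    let fixed = sum(&fixed_array); // fixed-length array used as a slice
    let middle = sum(&heap_vector[1..4]); // explicit slice: elements 2, 3, 4
    let nested = sum(&heap_vector[1..4][..2]); // slice of a slice: elements 2, 3
    (whole, fixed, middle, nested)
}
```

No literal form for slices is needed here either: every call site starts from a vector or an array and lets the borrow produce the slice.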
## Disadvantages

The primary disadvantage is that the types `[]T` and `str` are of dynamic length. This implies a kind distinction that does not exist today. I'd be inclined to just make a rule that types of dynamic length cannot be used as the types of local variables, fields, vector contents, nor the values of generic type parameters (and maybe a few other places). Later we could add an explicit kind if that seems necessary. It basically means you would get an error message like "the type `[T]` has unknown size and cannot be used as the type of a local variable; use a pointer like `@[T]` or `&[T]`".

Having types of unknown size is a complication, to be sure, but I feel it is a lesser complication than having special types, expression forms, and rules for vectors and strings. Furthermore, this same case (types of unknown size) has come up from time to time when thinking about other possible future designs, so I am not sure that it can be avoided.

A second, more subtle point is that slices are no longer the shortest type in terms of how they are written, although they are probably the most common thing you will want to use. I am not too worried about this either: `[:]T` is still fairly short and we will use it ubiquitously.

One thing I don't like is that I find `[:]` somewhat hard to type. Maybe that will get easier, or maybe something else (e.g., `[.]` and a slice notation of `v[1..3]`) would be better.

## Other kinds of variably sized types...?

Records of dynamic size are common in C, and we may ultimately have to be able to model that (though we could admittedly use the C trick of pretending all types have fixed size when in fact the memory allocated may be greater, combined with unsafe pointers).
Still, there is a legitimate use case for allocating a variably-sized vector interior to a record even in Rust code, and we could support that (it's the same trick that we in fact use to implement vectors themselves---if it's important enough for us, maybe it's important enough for our users). Another example would be base types. We may sometime want to allow records or classes that can be extended with subtypes. In that case, we could say that the base types have variable size, since the number of fields they possess are unknown---this would mean that you only refer to them by pointer, preventing the common C++ problems of [slicing][slice] and unsafe array arithmetic. I'm not sure where else this comes up. Perhaps that's it. [slice]: http://stackoverflow.com/questions/274626/what-is-the-slicing-problem-in-c From niko at alum.mit.edu Fri Apr 27 19:05:51 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Fri, 27 Apr 2012 19:05:51 -0700 Subject: [rust-dev] iter library Message-ID: <4F9B507F.30206@alum.mit.edu> I just pushed a rather different version of the iter library to master. The existing approach wasn't really working out. The new one is simpler, and I think it will work out quite well, especially as we start making better use of slices. It is also plays nicely with the `for` loop syntax. Lemme know what you think. To implement an iterable type, you have to implement this interface: iface base_iter { // invoke f() on each item, stopping if it returns false fn each(f: fn(A) -> bool); // returns the number of items in your collection, if you know it, // or none otherwise. fn size_hint() -> option; } There are then a variety of various helper methods that you get "for free" (to_vec(), foldl(), contains(), map_to_vec() and so forth). Unfortunately, since we don't have traits, I have to resort to a clever trick that Brian came up with to share these methods. I have defined two instances so far: vec and option. 
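For readers who want to see the shape of this interface, here is a rough sketch. It is written in present-day Rust, which does have traits, so default trait methods stand in for the sharing trick mentioned above. The trait name `BaseIter` and the exact signatures are assumptions made for illustration; this is not the code that was pushed to master.

```rust
/// Rough analogue of the `base_iter` interface: an internal iterator that
/// drives a callback instead of handing out a cursor.
trait BaseIter<A> {
    /// Invoke `f` on each item, stopping as soon as it returns false.
    fn each(&self, f: &mut dyn FnMut(&A) -> bool);

    /// Number of items in the collection, if known, or None otherwise.
    fn size_hint(&self) -> Option<usize>;

    /// Helper obtained "for free": copy every item into a vector.
    fn to_vec(&self) -> Vec<A>
    where
        A: Clone,
    {
        let mut out = Vec::with_capacity(self.size_hint().unwrap_or(0));
        self.each(&mut |item| {
            out.push(item.clone());
            true // keep going
        });
        out
    }

    /// Helper obtained "for free": left fold over the items.
    fn foldl<B>(&self, init: B, mut step: impl FnMut(B, &A) -> B) -> B {
        let mut acc = Some(init);
        self.each(&mut |item| {
            let previous = acc.take().unwrap();
            acc = Some(step(previous, item));
            true
        });
        acc.unwrap()
    }

    /// Helper obtained "for free": stops iterating at the first match.
    fn contains(&self, needle: &A) -> bool
    where
        A: PartialEq,
    {
        let mut found = false;
        self.each(&mut |item| {
            found = item == needle;
            !found // returning false ends the iteration early
        });
        found
    }
}

/// The two instances mentioned above: vectors and options.
impl<A> BaseIter<A> for Vec<A> {
    fn each(&self, f: &mut dyn FnMut(&A) -> bool) {
        for item in self.iter() {
            if !f(item) {
                break;
            }
        }
    }

    fn size_hint(&self) -> Option<usize> {
        Some(self.len())
    }
}

impl<A> BaseIter<A> for Option<A> {
    fn each(&self, f: &mut dyn FnMut(&A) -> bool) {
        if let Some(item) = self {
            f(item);
        }
    }

    fn size_hint(&self) -> Option<usize> {
        Some(if self.is_some() { 1 } else { 0 })
    }
}
```

A call such as `BaseIter::foldl(&vec![1, 2, 3], 0, |acc, x| acc + x)` then works for any type that supplies only `each` and `size_hint`. The fully qualified form is used because vectors already have built-in methods named `to_vec` and `contains`.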
If you want to see how to implement such a type, check out iter-trait.rs and iter-trait/vec.rs. The magic also relies on various #[path] attributes in core.rc. It's deep voodoo, let me tell you. ## Future plans Later, I would like to add str, map, and a variety of other types. For types (like str and map) where there are multiple possibilities for how to iterate, my plan is to use wrapper types like so: enum keys> = &M; This wrapper would allow you to iterate over the keys in a map. You would use it something like: for keys(&map).each { |k| ... } This also allows things like keys(&map).map_to_vec { |k| ... } Another example where I think such wrapper types would be helpful is for iterating over two slices in parallel. I planned to have an enum like: enum zip { zip([A]/&, [B]/&) } which would implement the iterable interface for the type (&A, &B). And so forth. I also wanted to replace (or supplement) the `eachi()` method with something like enum enumerate> = IA; where `enumerate` implements the iface for the type `(uint, IA)`. This would then allow you to write: for enumerate(keys(&map)).each { |(i, k)| ... } (This requires support for irrefutable patterns in argument types, but we should add those) ## Possible far future plans If we added support for higher-kinded types, we could support a `map()` method in the iteration trait. But until then we have `map_to_vec()`, which always results in a type of `[A]`. Maybe we should find a shorter name, to just do `iterable.to_vec().map()`. I dunno. Niko From arcata at gmail.com Fri Apr 27 20:02:28 2012 From: arcata at gmail.com (Joe Groff) Date: Fri, 27 Apr 2012 20:02:28 -0700 Subject: [rust-dev] iter library In-Reply-To: <4F9B507F.30206@alum.mit.edu> References: <4F9B507F.30206@alum.mit.edu> Message-ID: <4457010042099824986@unknownmsgid> On Apr 27, 2012, at 7:05 PM, Niko Matsakis wrote: > > Later, I would like to add str, map, and a variety of other types. 
For types (like str and map) where there are multiple possibilities for how to iterate, my plan is to use wrapper types like so: > > enum keys> = &M; > > This wrapper would allow you to iterate over the keys in a map. You would use it something like: > > for keys(&map).each { |k| ... } > > This also allows things like > > keys(&map).map_to_vec { |k| ... } Is the wrapper type necessary? I thought named implementations were intended to allow multiple implementations without wrapping. > Another example where I think such wrapper types would be helpful is for iterating over two slices in parallel. I planned to have an enum like: > > enum zip { > zip([A]/&, [B]/&) > } > > which would implement the iterable interface for the type (&A, &B). And so forth. Would the each() method of an iterator be resumable if you called it again after stopping a previous iteration? If so, you could implement a generic zip over any two iterators. > ## Possible far future plans > > If we added support for higher-kinded types, we could support a `map()` method in the iteration trait. But until then we have `map_to_vec()`, which always results in a type of `[A]`. Maybe we should find a shorter name, to just do `iterable.to_vec().map()`. I dunno. Could you implement a map O, iterable> : iterable adapter without higher kinds? -Joe From niko at alum.mit.edu Fri Apr 27 21:35:38 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Fri, 27 Apr 2012 21:35:38 -0700 Subject: [rust-dev] iter library In-Reply-To: <4457010042099824986@unknownmsgid> References: <4F9B507F.30206@alum.mit.edu> <4457010042099824986@unknownmsgid> Message-ID: <4F9B739A.8030201@alum.mit.edu> On 4/27/12 8:02 PM, Joe Groff wrote: > Is the wrapper type necessary? I thought named implementations were > intended to allow multiple implementations without wrapping. Sorry if I didn't make my purpose clear. The wrapper type is to distinguish the different modes of iteration. 
For example, with maps, do you want to iterate over the keys in the map, the values, or the (key->value) pairs? Right now, we have separate methods for those things (iter_keys, iter_values, etc), but that doesn't allow you to make use of all the other associated iteration methods (foldl, map, etc). You would need "foldl_keys", "foldl_values", and so forth. A similar case occurs with strings (iterate by byte, by unicode character, by word, by line, etc). > Would the each() method of an iterator be resumable if you called it > again after stopping a previous iteration? If so, you could implement > a generic zip over any two iterators. No, this is not currently possible. We don't support a cursor-based API. This decision was made before I came around but I think it's a good one. Cursor-based APIs are kind of a poor fit for the Rust memory model, I think, which is very stack-oriented. Also, function-based APIs are easier to compile efficiently and give you side benefits like making it easy to determine when iteration has started and ended. Still, a cursor-like iface or perhaps an iface for O(1) indexable types might be useful additions in the future. They would enable more generic combinators. > Could you implement a map O, iterable> : iterable > adapter without higher kinds? So, what is specifically not possible is use ifaces to write a generic function that works on, say, any mappable collection. Of course we can still define map() methods for any type and you can write methods that operate generically using closures. It's not clear to me that the current limits will be a problem in practice. Niko From marijnh at gmail.com Fri Apr 27 23:03:40 2012 From: marijnh at gmail.com (Marijn Haverbeke) Date: Sat, 28 Apr 2012 08:03:40 +0200 Subject: [rust-dev] iter library In-Reply-To: <4F9B739A.8030201@alum.mit.edu> References: <4F9B507F.30206@alum.mit.edu> <4457010042099824986@unknownmsgid> <4F9B739A.8030201@alum.mit.edu> Message-ID: > Sorry if I didn't make my purpose clear. 
> The wrapper type is to distinguish
> the different modes of iteration.

What Joe meant is that you could simply write multiple impls on the same type with different names for the various modes of iteration.

impl of iter for maptype { ... }
impl iter_keys of iter for maptype { ... }
impl iter_vals of iter for maptype { ... }

You could then do `import map::iter_keys;` at the top of a block to force the key-iterating impl to take precedence there.

I think the ergonomics of this kind of trick didn't work out as well as hoped, though. You'd get multiple applicable impl errors when importing `map::*`, and seeing which impl is currently closest in a scope is somewhat indirect and confusing.

Best,
Marijn

From marijnh at gmail.com Fri Apr 27 23:12:44 2012
From: marijnh at gmail.com (Marijn Haverbeke)
Date: Sat, 28 Apr 2012 08:12:44 +0200
Subject: [rust-dev] In favor of types of unknown size
In-Reply-To: <4F9B1A84.7040806@alum.mit.edu>
References: <4F9B1A84.7040806@alum.mit.edu>
Message-ID:

I must say I prefer Graydon's syntax. `[]T` sets off all kinds of alarms in my head.

I have no strong opinion on dynamically-sized types. Not having them is definitely a win in terms of compiler complexity, but yes, some of the things that they make possible are nice to have.

From matthieu.monrocq at gmail.com Sat Apr 28 03:17:44 2012
From: matthieu.monrocq at gmail.com (Matthieu Monrocq)
Date: Sat, 28 Apr 2012 12:17:44 +0200
Subject: [rust-dev] In favor of types of unknown size
In-Reply-To:
References: <4F9B1A84.7040806@alum.mit.edu>
Message-ID:

On Sat, Apr 28, 2012 at 8:12 AM, Marijn Haverbeke wrote:

> I must say I prefer Graydon's syntax. `[]T` sets off all kinds of
> alarms in my head.
>
> I have no strong opinion on dynamically-sized types. Not having them
> is definitely a win in terms of compiler complexity, but yes, some of
> the things that they make possible are nice to have.
> _______________________________________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/listinfo/rust-dev > Hello Niko, First I really appreciate you thinking hard about it and if you don't want to bother the list I would certainly not mind talking it out with you in private; I feel it's very important for these things to be thought through extensively and I really like that decisions in Rust are always considered carefully and objectively. That being said, I have two remarks: I would like to ask a question on the vectors syntax: why the focus on [] ? I understand it in the literal form, however a string type is denoted as `str` so why not denote a vector of Ts as `vec` ? Yes, it's slightly more verbose, but this is how all the other generic types will be expressed anyway. Similarly, since a substring is expressed as `substr`, one could simply express a slice as `slice` or `svec` or even `array_ref`. I don't think being overly clever with the syntax type will really help the users. Imagine grepping for all uses of the slice type in a crate ? It's so much simpler with an alphabetic name. (Also, `[:]/&r T` feel *really* weird, look at the mess C is with its pointer to function syntax that let's you specify the name in the *middle* of the type...) As for types of unknown sizes, I would like to point out that prevent users from having plain `str` attributes in their records is kinda weird. The pointer syntax is not only more verbose, it also means that suddenly getting a local *copy* of the string gets more difficult. Sure it's equivalent (semantically) to a unique pointer `~str`, but it does not make copying easier, while it's one of the primary operations in impure languages (because the original may easily get modified at a later point in time). 
I think that `rust_vec` having an unknown size rather than being (in effect) a pointer to a heap allocated structure is nice from an implementation point of view, but it should not get in the way of using it. I would therefore venture that either it has an unknown size and the compiler just extend this unknown size property to all types so they can have `vec` and `str` attributes naturally, or it's better for it *not* to have an unknown size. I would also like to point out that if it's an implementation detail, the actual representation might vary from known size to unknown size without impact for the user, so starting without for the moment because it's easier and refining it later is an option. Another option is to have a fixed size with an alternative representation using something similar to SSO (Short String Optimization); that is small vectors/strings allocate their storage in place while larger ones push their storage to the heap to avoid trashing the stack. Hope this does not look harsh, I sometimes have difficulties expressing my opinions without being seen as patronizing: I can assure you I probably know less than you do :) -- Matthieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From pwalton at mozilla.com Sat Apr 28 03:45:20 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Sat, 28 Apr 2012 03:45:20 -0700 Subject: [rust-dev] In favor of types of unknown size In-Reply-To: References: <4F9B1A84.7040806@alum.mit.edu> Message-ID: <4F9BCA40.7020202@mozilla.com> On 04/28/2012 03:17 AM, Matthieu Monrocq wrote: > I would also like to point out that if it's an implementation detail, > the actual representation might vary from known size to unknown size > without impact for the user, so starting without for the moment because > it's easier and refining it later is an option. 
Another option is to > have a fixed size with an alternative representation using something > similar to SSO (Short String Optimization); that is small > vectors/strings allocate their storage in place while larger ones push > their storage to the heap to avoid trashing the stack. We tried this once. It was a disaster in terms of code size; you really don't want all strings and vectors doing this. Patrick From pwalton at mozilla.com Sat Apr 28 03:53:01 2012 From: pwalton at mozilla.com (Patrick Walton) Date: Sat, 28 Apr 2012 03:53:01 -0700 Subject: [rust-dev] In favor of types of unknown size In-Reply-To: <4F9B1A84.7040806@alum.mit.edu> References: <4F9B1A84.7040806@alum.mit.edu> Message-ID: <4F9BCC0D.2040706@mozilla.com> On 04/27/2012 03:15 PM, Niko Matsakis wrote: > Hi, > > This is a post I recently put on my blog. I thought I'd post it to the > mailing list as it pertains to our recent discussion on the syntax of > vectors, slices, and so forth. I feel like I'm harping on this issue so > I think it's the last thing I will write about it for a while. =) I like the idea of eliminating [T]/@ in favor of @[T]; it simplifies the user-facing syntax and semantics a lot. On the other hand, I agree with Marijn that []T doesn't look as nice as [T] (although Go is popularizing the former). [:]T is also strange-looking. There's also the issue that users might use &[]T, which is almost never useful, instead of the more-useful [:]T. I'd honestly be ok with going back to vec or vec for the vector type and using [T] for the slice, to discourage this hazard. 
Patrick From arcata at gmail.com Sat Apr 28 08:02:15 2012 From: arcata at gmail.com (Joe Groff) Date: Sat, 28 Apr 2012 08:02:15 -0700 Subject: [rust-dev] iter library In-Reply-To: References: <4F9B507F.30206@alum.mit.edu> <4457010042099824986@unknownmsgid> <4F9B739A.8030201@alum.mit.edu> Message-ID: <-7463212431943616118@unknownmsgid> On Apr 27, 2012, at 11:03 PM, Marijn Haverbeke wrote: > What Joe meant is that you could simply write multiple impls on the > same type with different names for the various modes of iteration. > > impl of iter for maptype { ... } > impl iter_keys of iter for maptype { ... } > impl iter_vals of iter for maptype { ... } > > You could then do 'import map::iter_keys;` at the top of a block to > force the key-iterating impl to take precedence there. > > I think the ergonomics of this kind of trick didn't work out as well > as hoped, though. You'd get multiple applicable impl errors when > importing `map::*`, and seeing which impl is currently closest in a > scope is somewhat indirect and confusing. That is indeed what I was going for, but I suggested it thinking it was possible to instantiate a named impl explicitly when multiple impls are in scope. If that's not the case then wrapper types make sense, but this looks like a prime use case of named impls to me. Are there problems with allowing impls to be used as a constructor or cast, to allow for example `iter_keys(map).each` or `map.(iter_keys).each` to just work given an impl iter_keys? -Joe From ben.striegel at gmail.com Sat Apr 28 11:03:00 2012 From: ben.striegel at gmail.com (Benjamin Striegel) Date: Sat, 28 Apr 2012 14:03:00 -0400 Subject: [rust-dev] In favor of types of unknown size In-Reply-To: <4F9BCC0D.2040706@mozilla.com> References: <4F9B1A84.7040806@alum.mit.edu> <4F9BCC0D.2040706@mozilla.com> Message-ID: > I'd honestly be ok with going back to vec or vec for the vector type and using [T] for the slice, to discourage this hazard. I think this could be a win for clarity. 
There are enough potential use cases here that overloading [] doesn't seem to give all that much benefit. On Sat, Apr 28, 2012 at 6:53 AM, Patrick Walton wrote: > On 04/27/2012 03:15 PM, Niko Matsakis wrote: > >> Hi, >> >> This is a post I recently put on my blog. I thought I'd post it to the >> mailing list as it pertains to our recent discussion on the syntax of >> vectors, slices, and so forth. I feel like I'm harping on this issue so >> I think it's the last thing I will write about it for a while. =) >> > > I like the idea of eliminating [T]/@ in favor of @[T]; it simplifies the > user-facing syntax and semantics a lot. > > On the other hand, I agree with Marijn that []T doesn't look as nice as > [T] (although Go is popularizing the former). [:]T is also strange-looking. > There's also the issue that users might use &[]T, which is almost never > useful, instead of the more-useful [:]T. I'd honestly be ok with going back > to vec or vec for the vector type and using [T] for the slice, to > discourage this hazard. > > Patrick > > ______________________________**_________________ > Rust-dev mailing list > Rust-dev at mozilla.org > https://mail.mozilla.org/**listinfo/rust-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko at alum.mit.edu Sat Apr 28 14:44:17 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Sat, 28 Apr 2012 14:44:17 -0700 Subject: [rust-dev] In favor of types of unknown size In-Reply-To: References: <4F9B1A84.7040806@alum.mit.edu> Message-ID: <4F9C64B1.7070601@alum.mit.edu> The main thing I was trying to argue for is not a specific syntax but rather the idea that a leading @, ~, or & sigil indicates the kind of pointer, and what comes after indicates the data that is being pointed at. The orthogonality appeals to me; it seems to make the language "fit together" more neatly. As far as pure visual aesthetics, I think what I prefer most is `[T]` for slices and `vec` for vectors. 
I proposed this previously but amended it because we would need to support types like `vec`, which are different from ordinary type parameters that do not permit a `mut` qualifier. Since type names are no longer keywords, this is somewhat awkward to do, though of course we could manage it (either by making `vec` a keyword or by allowing `` as an alternate type parameter syntax that can only be used with vectors). Niko On 4/27/12 11:12 PM, Marijn Haverbeke wrote: > I must say I prefer Graydon's syntax. `[]T` sets off all kinds of > alarms in my head. > > I have no strong opinion on dynamically-sized types. Not having them > is definitely a win in terms of compiler complexity, but yes, some of > the things that they make possible are nice to have. From sebastian.sylvan at gmail.com Sat Apr 28 15:21:05 2012 From: sebastian.sylvan at gmail.com (Sebastian Sylvan) Date: Sat, 28 Apr 2012 15:21:05 -0700 Subject: [rust-dev] In favor of types of unknown size In-Reply-To: <4F9B1A84.7040806@alum.mit.edu> References: <4F9B1A84.7040806@alum.mit.edu> Message-ID: On Fri, Apr 27, 2012 at 3:15 PM, Niko Matsakis wrote: > The types `[]T` and `str` would represent vectors and strings, > respectively. These types have the C representation `rust_vec` and > `rust_vec`. They are of *dynamic size*, meaning that their size > depends on their length. The literal forms for vectors and strings are > `[a, b, c]` and `"foo"`, just as normal. Back when I was entertaining the idea of writing my own rust-like language (before I was aware of rust's existence), I had the idea that all records/objects could have dynamic size if any of their members had dynamic size (and the root cause of dynamic size would be fixed-size arrays - fixed at the time of construction, not a static constant size).
This is only slightly related, but it's close enough that I can't resist presenting the gist of it (it's not completely worked out), in case anyone else wants to figure it out and see if it makes sense :-) Basically the idea spawned from an attempt to avoid pointers as much as possible. Keep things packed, with "chunky" objects, reduce the complexity for GC/RC, reduce memory fragmentation, etc. Aside from actual honest-to-goodness graphs (which are fairly rare, and most are small, and unavoidable anyway), the conjecture is that the main source of pointers is arrays. Okay, so basically the idea is that arrays are length-prefixed blocks of elements. They're statically sized (can't be expanded), but you can pass in a dynamic, non-constant value when you construct them. Unlike C/C++ though these arrays can still live *inside* an object. There's some fiddliness here, e.g. do you put all arrays (except ones which have true const sizes?) at the end of the object so other members have a constant offset? If you have more than a small number of arrays in an object it probably makes sense to have a few pointers indicating the start of each instead of having to add up the sizes of preceding arrays each time an access is made to one of the "later" arrays. So, during construction of an object, you'd have to proceed in two phases. First is the constructor logic where you compute values, and the second is the allocation of the object and moving the values into it. You need to hold off on allocation because you don't know the size of any member objects until you've constructed them. Moving an array is now expensive, since it requires a copy, not just a pointer move. So ideally the compiler would try to move the allocation to happen as early as possible so most of the values can be written directly to their final location instead of having to be constructed on the stack (or heap) and then moved. There are of course cases where this couldn't be done. E.g.
if the size of an array X depends on some computation done on array Y in the same object - you have to create Y on the stack, or heap, to run the computation before you can know the total size of the object, and only then can you allocate the final object and copy the arrays into it. I'm not 100% sold on the idea, since it does make things a bit more complex, but it is pretty appealing to me that you can allocate dynamic-but-fixed sized arrays on the stack, inside other objects etc. For a language that emphasizes immutable data structures I'd imagine the opportunity to use these fixed arrays "in-place" would be extremely frequent. Seb -- Sebastian Sylvan From matthieu.monrocq at gmail.com Sun Apr 29 04:53:45 2012 From: matthieu.monrocq at gmail.com (Matthieu Monrocq) Date: Sun, 29 Apr 2012 13:53:45 +0200 Subject: [rust-dev] In favor of types of unknown size In-Reply-To: References: <4F9B1A84.7040806@alum.mit.edu> Message-ID: Hi Sebastian, I have a few comments. On Sun, Apr 29, 2012 at 12:21 AM, Sebastian Sylvan < sebastian.sylvan at gmail.com> wrote: > On Fri, Apr 27, 2012 at 3:15 PM, Niko Matsakis wrote: > > The types `[]T` and `str` would represent vectors and strings, > > respectively. These types have the C representation `rust_vec` and > > `rust_vec`. They are of *dynamic size*, meaning that their size > > depends on their length. The literal forms for vectors and strings are > > `[a, b, c]` and `"foo"`, just as normal. > > Back when I was entertaining the idea of writing my own rust-like > language (before I was aware of rust's existence), I had the idea that > all records/objects could have dynamic size if any of their members had > dynamic size (and the root cause of dynamic size would be fixed-size > arrays - fixed at the time of construction, not a static constant > size).
> > This is only slightly related, but it's close enough that I can't resist > presenting the gist of it (it's not completely worked out), in case anyone > else wants to figure it out and see if it makes sense :-) > > Basically the idea spawned from an attempt to avoid > pointers as much as possible. Keep things packed, with "chunky" > objects, reduce the complexity for GC/RC, reduce memory fragmentation, > etc. Aside from actual honest-to-goodness graphs (which are fairly rare, > and most are small, and unavoidable anyway), the conjecture is that > the main source of pointers is arrays. > > Okay, so basically the idea is that arrays are length-prefixed blocks > of elements. They're statically sized (can't be expanded), but you can > pass in a dynamic, non-constant value when you construct them. Unlike > C/C++ though these arrays can still live *inside* an object. There's > some fiddliness here, e.g. do you put all arrays (except ones which > have true const sizes?) at the end of the object so other members have a > constant offset? If you have more than a small number of arrays in an > object it probably makes sense to have a few pointers indicating the start > of each instead of having to add up the sizes of preceding arrays > each time an access is made to one of the "later" arrays. > > Small reactions on "pointers": I think it's a good idea to pack the variable length structures at the end of the current object. However I would use cumulative offsets rather than pointers, because of size (on 64-bit architectures, which are becoming the de facto standard for PCs and servers). The idea would be in C-style: struct Object { int scalar1; int scalar2; unsigned __offset0; unsigned __offset1; unsigned __offset2; SomeObject __obj0; Table __obj1[X]; }; Where __offset0 indicates the offset from the start of Object to the start of __obj0, __offset1 the offset from the start of Object to the start of __obj1 and __offset2 the offset from the start of Object to the start of __obj2.
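A minimal C sketch of the cumulative-offset layout described above, under stated assumptions rather than as anyone's actual implementation: the names ObjectHeader, object_new, member_start and member_size are hypothetical, there are two variable-length members (an array of 32-bit ints, then raw bytes), and one extra trailing offset is assumed to mark the end of the object. It also follows the two-phase construction from the quoted text: the layout is computed first, then a single allocation is filled in.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: fixed-size fields first, then cumulative offsets
   (in bytes, from the start of the object) for the variable-length members.
   The last entry of offset[] is an assumed extra slot marking the end. */
#define MEMBER_COUNT 2u

typedef struct {
    int32_t  scalar1;
    int32_t  scalar2;
    uint32_t offset[MEMBER_COUNT + 1u];
    /* member 0 (int32_t array) and member 1 (bytes) follow in memory */
} ObjectHeader;

/* Builds the object with one allocation; returns NULL if malloc fails. */
ObjectHeader *object_new(int32_t scalar1, int32_t scalar2,
                         const int32_t *ints, uint32_t int_count,
                         const char *bytes, uint32_t byte_count)
{
    /* Phase 1: the member sizes are known, so compute the whole layout. */
    uint32_t ints_size = int_count * (uint32_t)sizeof(int32_t);
    uint32_t start0 = (uint32_t)sizeof(ObjectHeader);
    uint32_t start1 = start0 + ints_size;
    uint32_t end = start1 + byte_count;

    /* Phase 2: allocate once, then copy each member to its final place. */
    ObjectHeader *obj = malloc(end);
    if (obj == NULL) {
        return NULL;
    }
    obj->scalar1 = scalar1;
    obj->scalar2 = scalar2;
    obj->offset[0] = start0;
    obj->offset[1] = start1;
    obj->offset[2] = end;
    if (ints_size > 0) {
        memcpy((char *)obj + start0, ints, ints_size);
    }
    if (byte_count > 0) {
        memcpy((char *)obj + start1, bytes, byte_count);
    }
    return obj;
}

/* Start of member `index`: one addition to the object pointer. */
void *member_start(ObjectHeader *obj, unsigned index)
{
    assert(index < MEMBER_COUNT);
    return (char *)obj + obj->offset[index];
}

/* Size in bytes of member `index`: one subtraction of adjacent offsets. */
uint32_t member_size(const ObjectHeader *obj, unsigned index)
{
    assert(index < MEMBER_COUNT);
    return obj->offset[index + 1u] - obj->offset[index];
}
```

In this sketch the extra end offset is what makes the size of the last member recoverable; without it, only the sizes of the earlier members could be derived from the offsets alone.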
This means you have direct access to any attribute with a simple addition to the pointer, and you can know the size with a simple subtraction (the size of __obj0 is __offset1 - __offset0). > So, during construction of an object, you'd have to proceed in two > phases. First is the constructor logic where you compute values, and > the second is the allocation of the object and moving the values into > it. You need to hold off on allocation because you don't know the size > of any member objects until you've constructed them. Moving an array > is now expensive, since it requires a copy, not just a pointer move. > So ideally the compiler would try to move the allocation to happen as > early as possible so most of the values can be written directly to their > final location instead of having to be constructed on the stack (or > heap) and then moved. There are of course cases where this couldn't be > done. E.g. if the size of an array X depends on some computation done > on array Y in the same object - you have to create Y on the stack, or > heap, to run the computation before you can know the total size of the > object, and only then can you allocate the final object and copy the > arrays into it. > > Yes, this is getting quite difficult at this stage. It's good once the size is settled but the construction can be expensive. > I'm not 100% sold on the idea, since it does make things a bit more > complex, but it is pretty appealing to me that you can allocate > dynamic-but-fixed sized arrays on the stack, inside other objects > etc. For a language that emphasizes immutable data structures I'd > imagine the opportunity to use these fixed arrays "in-place" would be > extremely frequent. > Seb > > -- > Sebastian Sylvan > There is a subtle issue that I had not noticed earlier. This mechanism works great for fixed-size arrays, but is not amenable to extensible arrays: vectors and strings *grow*.
So it would work if the field/attribute is runtime-fixed-size, either because the type imposes it or because it's declared immutable; however, it will not work in the general case. This is important because it means that in general, we need the vector or string to be allocated on the heap because we want it growable. Having realized that, I wonder if it's worth considering types of unknown size to start with. The idea of a pointer to a `vec` that is of runtime-fixed-size is pleasant enough, but it means that the vector itself is not modified; instead, a new vector is built and the pointer is reseated. This in turn means that I need to pass types such as `&@vec` to my functions... Frankly, this is not nice to the user, and since we are talking about very common types here, I believe it would be worth sugar-coating it a bit, syntax-wise. Having specific types for those runtime-fixed-length structures (a fixed_array?) could be worth it, but I strongly believe that `vec` and `str` should be manipulable as-is rather than always prefixed by `~` or `@` to be useful, even more because most of the routines would then have to be duplicated to handle both types of ownership: *shiver*. -- Matthieu From garethdanielsmith at gmail.com Sun Apr 29 07:48:58 2012 From: garethdanielsmith at gmail.com (Gareth Smith) Date: Sun, 29 Apr 2012 15:48:58 +0100 Subject: [rust-dev] Bikeshed impl method extraction Message-ID: <4F9D54DA.6020704@gmail.com> Hi, I have written up some thoughts about enabling a less repetitious API for constructing hashmaps (amongst other possibilities), here: https://github.com/mozilla/rust/wiki/Bikeshed-impl-method-extraction Does this make any sense?
Gareth From steven099 at gmail.com Sun Apr 29 23:08:27 2012 From: steven099 at gmail.com (Steven Blenkinsop) Date: Mon, 30 Apr 2012 02:08:27 -0400 Subject: [rust-dev] Bikeshed impl method extraction In-Reply-To: <4F9D54DA.6020704@gmail.com> References: <4F9D54DA.6020704@gmail.com> Message-ID: On Sunday, April 29, 2012, Gareth Smith wrote: > > Hi, > > I have written up some thoughts about enabling a less repetitious API > for constructing hashmaps (amongst other possibilities), here: > https://github.com/mozilla/rust/wiki/Bikeshed-impl-method-extraction > > Does this make any sense? Couldn't you just do something like: fn hashmap () -> std::map::hashmap { ret std::map::hashmap({|k| k.hash()}, {|k1,k2| k1.equals(k2)}); } ? From niko at alum.mit.edu Mon Apr 30 08:33:31 2012 From: niko at alum.mit.edu (Niko Matsakis) Date: Mon, 30 Apr 2012 08:33:31 -0700 Subject: [rust-dev] Bikeshed impl method extraction In-Reply-To: <4F9D54DA.6020704@gmail.com> References: <4F9D54DA.6020704@gmail.com> Message-ID: <4F9EB0CB.60001@alum.mit.edu> It seems to me that classes will solve this problem neatly. Niko On 4/29/12 7:48 AM, Gareth Smith wrote: > Hi, > > I have written up some thoughts about enabling a less repetitious > API for constructing hashmaps (amongst other possibilities), here: > https://github.com/mozilla/rust/wiki/Bikeshed-impl-method-extraction > > Does this make any sense?
> > Gareth From graydon at mozilla.com Mon Apr 30 10:59:07 2012 From: graydon at mozilla.com (Graydon Hoare) Date: Mon, 30 Apr 2012 10:59:07 -0700 Subject: [rust-dev] Bikeshed impl method extraction In-Reply-To: <4F9EB0CB.60001@alum.mit.edu> References: <4F9D54DA.6020704@gmail.com> <4F9EB0CB.60001@alum.mit.edu> Message-ID: <4F9ED2EB.3010509@mozilla.com> On 12-04-30 08:33 AM, Niko Matsakis wrote: > It seems to me that classes will solve this problem neatly. Agreed. IIRC it was actually one of the motivating examples! ("the hashtable problem", cue ominous music) -Graydon From garethdanielsmith at gmail.com Mon Apr 30 11:35:31 2012 From: garethdanielsmith at gmail.com (Gareth Smith) Date: Mon, 30 Apr 2012 19:35:31 +0100 Subject: [rust-dev] Bikeshed impl method extraction In-Reply-To: References: <4F9D54DA.6020704@gmail.com> Message-ID: <4F9EDB73.5070707@gmail.com> On 30/04/12 07:08, Steven Blenkinsop wrote: > On Sunday, April 29, 2012, Gareth Smith wrote: > > Hi, > > I have written up some thoughts about enabling a less > repetitious API for constructing hashmaps (amongst other > possibilities), here: > https://github.com/mozilla/rust/wiki/Bikeshed-impl-method-extraction > > Does this make any sense? > > > Couldn't you just do something like: > > fn hashmap () -> std::map::hashmap { > ret std::map::hashmap({|k| k.hash()}, {|k1,k2| k1.equals(k2)}); > } > > ?
Ah, I had not considered that - and it looks to me like it should work - but it doesn't: hello.rs:26:31: 26:37 error: the type of this value must be known in this context hello.rs:26 ret std::map::hashmap({|k| k.hash()}, {|k1, k2| k1.equals(k2)}); but this might be a type inference bug, because if it is rewritten with type annotations then it works: let hashfn:fn@(K)->uint = {|k| k.hash()}; let eqfn:fn@(K, K)->bool = {|a, b| a.equals(b)}; ret std::map::hashmap(hashfn, eqfn); Considering your answer, and the other answers, I have abandoned this bikeshed. Gareth From sebastian.sylvan at gmail.com Mon Apr 30 23:23:42 2012 From: sebastian.sylvan at gmail.com (Sebastian Sylvan) Date: Mon, 30 Apr 2012 23:23:42 -0700 Subject: [rust-dev] Interesting paper on RC vs GC Message-ID: I found this quite interesting. The upshot is that they measure the performance of state-of-the-art GC vs RC and find the latter to be about 30% slower. However, they also figure out where it's slower, and apply some simple optimizations bringing it to roughly on par with GC, performance-wise. I'm thinking that Rust's language-based techniques for reducing reference increments (highly stack-based allocations, etc.), combined with the fact that reference counts are cheaper (since they're task local), could mean these techniques in the setting of Rust would make it beat GC. And if not, at least the performance measurements are informative. R. Shahriyar, S. M. Blackburn, and D. Frampton, "Down for the Count? Getting Reference Counting Back in the Ring," in Proceedings of the Eleventh ACM SIGPLAN International Symposium on Memory Management, ISMM '12, Beijing, China, June 15-16, 2012.
http://users.cecs.anu.edu.au/~steveb/downloads/pdf/rc-ismm-2012.pdf -- Sebastian Sylvan From a.stavonin at gmail.com Mon Apr 30 23:44:19 2012 From: a.stavonin at gmail.com (Alexander Stavonin) Date: Tue, 1 May 2012 15:44:19 +0900 Subject: [rust-dev] Strange info inside output Message-ID: <0A6DA044-E8D8-48A6-B4D4-47B7680D483A@gmail.com> Am I right that this output is some kind of memory leak report? If so, is it possible to find out when the memory was allocated? Information example:

---------> <--------- (28 bytes from 0x7fbb9b415320)
       +0          +4          +8          +c          0   4   8   c
+0000  1c 1e 1f 99 00 00 00 00 00 00 00 00 00 00 00 00 ................
+0010  00 00 00 00 00 00 00 01 00 00 00 00             ............
---------> <--------- (16 bytes from 0x7fbb9b4148e0)
       +0          +4          +8          +c          0   4   8   c
+0000  10 02 1f 99 7f 00 00 01 00 00 00 00 00 00 00 00 ................
---------> <--------- (28 bytes from 0x7fbb9b4152d0)
       +0          +4          +8          +c          0   4   8   c
+0000  1c 1e 1f 99 00 00 00 00 fe 80 00 00 00 00 00 00 ................
+0010  00 00 00 00 00 00 00 01 01 00 00 00             ............