How to refer to the struct itself in Julia - struct

I have this code:
struct MyStruct
    text::String
    function MyStruct(_text::String)
        text = _text
        # do other things
    end
end
When I wrote this, I realized that Julia does not recognize text as MyStruct's field. How can I do in Julia what this line does in Python?
self.text = _text

Don't try to imitate Python. Julia is not object oriented.
You could emulate a Python-style constructor by
mutable struct MyStruct
    text::String
    function MyStruct(text::String)
        self = new()
        self.text = some_stuff(text)
        return self
    end
end
but for this to work the struct needs to be mutable. Then you can set up an uninitialized instance with new() and overwrite the fields.
Note that this is closer to a combination of both __init__ and __new__. In Python, the new part is (99% of the time) already done for you, and you just mutate the already created empty object in __init__. In Julia, you have to do both on your own; in particular, you must return the new value at the end of the constructor!
Having said all this, it's rarely useful to write it that way. More idiomatic would be just
struct MyStruct
    text::String
    MyStruct(text::String) = new(some_stuff(text))
end
unless you absolutely need the struct to be mutable (which has consequences with respect to memory layout and possible optimizations).
And also read up on the difference between inner and outer constructors. If you want the above to be the only valid way to construct MyStruct, this is fine. If you want "convenience constructors", e.g. with default arguments or conversions from other types, prefer outer constructors (where you don't have new but recursively call constructors until an inner constructor is reached).

Taking a quick glance at the constructors documentation and trying out the playground, I was able to come up with this:
struct MyStruct
    text::String
    function MyStruct(_text::String)
        s = new(_text * "def")
        # do other things
        s
    end
end
s = MyStruct("abc")
println(s.text) # abcdef

Related

Ergonomically passing a slice of trait objects

I am converting a variety of types to String when they are passed to a function. I'm not concerned about performance as much as ergonomics, so I want the conversion to be implicit. The original, less generic implementation of the function simply used &[impl Into<String>], but I think that it should be possible to pass a variety of types at once without manually converting each to a string.
The key is that ideally, all of the following cases should be valid calls to my function:
// String literals
perform_tasks(&["Hello", "world"]);
// Owned strings
perform_tasks(&[String::from("foo"), String::from("bar")]);
// Non-string types
perform_tasks(&[1,2,3]);
// A mix of any of them
perform_tasks(&["All", 3, String::from("types!")]);
Some various signatures I've attempted to use:
fn perform_tasks(items: &[impl Into<String>])
The original version fails twice; it can't handle numeric types without manual conversion, and it requires all of the arguments to be the same type.
fn perform_tasks(items: &[impl ToString])
This is slightly closer, but it still requires all of the arguments to be of one type.
fn perform_tasks(items: &[&dyn ToString])
Doing it this way is almost enough, but it won't compile unless I manually add a borrow on each argument.
And that's where we are. I suspect that either Borrow or AsRef will be involved in a solution, but I haven't found a way to get them to handle this situation. For convenience, here is a playground link to the final signature in use (without the needed references for it to compile), alongside the various tests.
The following way works for the first three cases if I understand your intention correctly.
pub fn perform_tasks<I, A>(values: I) -> Vec<String>
where
    A: ToString,
    I: IntoIterator<Item = A>,
{
    values.into_iter().map(|s| s.to_string()).collect()
}
As the other comments pointed out, Rust does not support an array of mixed types. However, you can do one extra step to convert them into a &[&dyn fmt::Display] and then call the same function perform_tasks to get their strings.
let slice: &[&dyn std::fmt::Display] = &[&"All", &3, &String::from("types!")];
perform_tasks(slice);
Here is the playground.
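Putting the pieces of this answer together, here is a minimal self-contained sketch. The `main` function and the variable name `mixed` are illustrative only; the calls mirror the four cases from the question. Note that the mixed case still needs an explicit borrow on each element.

```rust
use std::fmt::Display;

// Accepts anything iterable whose items implement ToString.
pub fn perform_tasks<I, A>(values: I) -> Vec<String>
where
    A: ToString,
    I: IntoIterator<Item = A>,
{
    values.into_iter().map(|s| s.to_string()).collect()
}

fn main() {
    // Homogeneous arrays work without any extra borrows on the elements.
    println!("{:?}", perform_tasks(&["Hello", "world"]));
    println!("{:?}", perform_tasks(&[String::from("foo"), String::from("bar")]));
    println!("{:?}", perform_tasks(&[1, 2, 3]));

    // Mixed types must first be coerced to a common trait object type.
    let mixed: &[&dyn Display] = &[&"All", &3, &String::from("types!")];
    println!("{:?}", perform_tasks(mixed));
}
```

The generic version keeps static dispatch for the homogeneous cases, and the trait-object slice is only paid for when the element types really differ.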
If I understand your intention correctly, you want something like this:
fn main() {
    let a = 1;
    myfn(a);
}
fn myfn(i: &dyn SomeTrait) {
    //do something
}
So you want an object to be borrowed implicitly when it is passed as a function argument. However, Rust won't let you borrow implicitly here: borrowing is an important safety mechanism in Rust, and the & helps other programmers quickly identify which values are references and which are not. Rust therefore requires the explicit & to avoid confusion.
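As a small sketch of that point, the version below compiles only because the call site writes the borrow explicitly. The function name `describe` is hypothetical, and `std::fmt::Display` stands in for the unspecified `SomeTrait`.

```rust
use std::fmt::Display;

// Takes a borrowed trait object; the caller decides what to lend.
fn describe(item: &dyn Display) -> String {
    format!("value: {}", item)
}

fn main() {
    let a = 1;
    // `describe(a)` would be rejected: the parameter is a reference,
    // and Rust does not insert the borrow for you.
    println!("{}", describe(&a));
    println!("{}", describe(&"text"));
}
```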

Can I use an iterator in global state in Rust?

I want to use an iterator as global state in Rust. Simplified example:
static nums = (0..).filter(|&n|n%2==0);
Is this possible?
You can do it, but you'll have to fight the language along the way.
First, true Rust statics created with the static declaration need to be compile-time constants. So something like static FOO: usize = 10 will compile, but static BAR: String = "foo".to_string() won't, because BAR requires a run-time allocation. While your iterator doesn't require a run-time allocation (though using it will make your life simpler, as you'll see later), its type is complex enough that it doesn't support compile-time initialization.
Second, Rust statics require specifying the full type up-front. This is a problem for arbitrary iterators, which one would like to create by combining iterator adapters and closures. While in this particular case, as mcarton points out, one could specify the type as Filter<RangeFrom<i32>, fn(&i32) -> bool>, it'd be closely tied to the current implementation. You'd have to change the type as soon as you switch to a different combinator. To avoid the hassle it's better to hide the iterator behind a dyn Iterator reference, i.e. type-erase it by putting it in a Box. Erasing the type involves dynamic dispatch, but so would specifying the filter function through a function pointer.
Third, Rust statics are read-only, and Iterator::next() takes &mut self, as it updates the state of the iteration. Statics must be read-only because Rust is multi-threaded, and writing to a static without proof that there are no readers or other writers would allow a data race in safe code. So to advance your global iterator, you must wrap it in a Mutex, which provides both thread safety and interior mutability.
After the long introduction, let's take a look at the fairly short implementation:
use lazy_static::lazy_static;
use std::sync::Mutex;
lazy_static! {
    static ref NUMS: Mutex<Box<dyn Iterator<Item = u32> + Send + Sync>> =
        Mutex::new(Box::new((0..).filter(|&n| n % 2 == 0)));
}
lazy_static is used to implement the create-on-first-use idiom to work around the non-const initial value. The first time NUMS is accessed, it will create the iterator.
As explained above, the iterator itself is boxed and wrapped in a Mutex. Since global variables are assumed to be accessed from multiple threads, our boxed iterator implements Send and Sync in addition to Iterator.
The result is used as follows:
fn main() {
    assert_eq!(NUMS.lock().unwrap().next(), Some(0)); // take single value
    assert_eq!(
        // take multiple values
        Vec::from_iter(NUMS.lock().unwrap().by_ref().take(5)),
        vec![2, 4, 6, 8, 10]
    );
}
Playground
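For readers who prefer to avoid the extra crate, here is a sketch of the same create-on-first-use idea using only the standard library. It swaps lazy_static for `std::sync::OnceLock`, so it assumes Rust 1.70 or later; the names `GlobalIter`, `nums`, and `next_even` are hypothetical and not part of the answer above.

```rust
use std::sync::{Mutex, OnceLock};

// Type-erased iterator behind a Mutex, as in the lazy_static version.
// Only `Send` is required here because the Mutex provides `Sync`.
type GlobalIter = Mutex<Box<dyn Iterator<Item = u32> + Send>>;

// Returns the process-wide iterator, creating it on first access.
fn nums() -> &'static GlobalIter {
    static NUMS: OnceLock<GlobalIter> = OnceLock::new();
    NUMS.get_or_init(|| Mutex::new(Box::new((0u32..).filter(|&n| n % 2 == 0))))
}

// Advances the global iterator by one step.
fn next_even() -> Option<u32> {
    nums().lock().unwrap().next()
}

fn main() {
    let first = next_even();
    let second = next_even();
    println!("{:?} {:?}", first, second);
}
```

Wrapping the static in a function keeps the locking detail in one place, so callers only see `next_even()`.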
No. For multiple reasons:
Iterator types tend to be complicated. This is usually not a problem because iterator types must rarely be named, but statics must be explicitly typed. In this case the type is still relatively simple: core::iter::Filter<core::ops::RangeFrom<i32>, fn(&i32) -> bool>.
Iterator's main method, next, needs a &mut self parameter. Statics can't be mutable by default, as this would not be safe.
Iterators can only be iterated once. Therefore it makes little sense to have a global iterator in the first place.
The value to initialize a static must be a constant expression. Your initializer is not a constant expression.
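To make the first point concrete, here is a sketch that names the iterator type in full by using a function pointer for the predicate. The helper names `is_even` and `evens` are hypothetical; the sketch only shows that the type can be written out, not that it can initialize a static.

```rust
use std::iter::Filter;
use std::ops::RangeFrom;

// A plain function, so it can be used as a `fn(&i32) -> bool` pointer.
fn is_even(n: &i32) -> bool {
    n % 2 == 0
}

// The full iterator type is spelled out in the return position.
fn evens() -> Filter<RangeFrom<i32>, fn(&i32) -> bool> {
    (0..).filter(is_even as fn(&i32) -> bool)
}

fn main() {
    let first_three: Vec<i32> = evens().take(3).collect();
    println!("{:?}", first_three);
}
```

Changing the adapter chain changes this type, which is why type erasure is often the more practical choice.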

What are the differences between the multiple ways to create zero-sized structs?

I found four different ways to create a struct with no data:
struct A{} // empty struct / empty braced struct
struct B(); // empty tuple struct
struct C(()); // unit-valued tuple struct
struct D; // unit struct
(I'm leaving arbitrarily nested tuples that contain only ()s and single-variant enum declarations out of the question, as I understand why those shouldn't be used).
What are the differences between these four declarations? Would I use them for specific purposes, or are they interchangeable?
The book and the reference were surprisingly unhelpful. I did find this accepted RFC (clarified_adt_kinds) which goes into the differences a bit, namely that the unit struct also declares a constant value D and that the tuple structs also declare constructors B() and C(_: ()). However it doesn't offer a design guideline on why to use which.
My guess would be that when I export them with pub, there are differences in which kinds can actually be constructed outside of my module, but I found no conclusive documentation about that.
There are only two functional differences between these four definitions (and a fifth possibility I'll mention in a minute):
Syntax (the most obvious). mcarton's answer goes into more detail.
When the struct is marked pub, whether its constructor (also called struct literal syntax) is usable outside the module it's defined in.
The only one of your examples that is not directly constructible from outside the current module is C. If you try to do this, you will get an error:
mod stuff {
    pub struct C(());
}
let _c = stuff::C(()); // error[E0603]: tuple struct `C` is private
This happens because the field is not marked pub; if you declare C as pub struct C(pub ()), the error goes away.
There's another possibility you didn't mention that gives a marginally more descriptive error message: a normal struct, with a zero-sized non-pub member.
mod stuff {
    pub struct E {
        _dummy: (),
    }
}
let _e = stuff::E { _dummy: () }; // error[E0451]: field `_dummy` of struct `main::stuff::E` is private
(Again, you can make the _dummy field available outside of the module by declaring it with pub.)
Since E's constructor is only usable inside the stuff module, stuff has exclusive control over when and how values of E are created. Many structs in the standard library take advantage of this, like Box (to take an obvious example). Zero-sized types work in exactly the same way; in fact, from outside the module it's defined in, the only way you would know that an opaque type is zero-sized is by calling mem::size_of.
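A minimal sketch of that pattern, assuming a hypothetical constructor `E::new` that the module chooses to expose:

```rust
mod stuff {
    // Public type with a private zero-sized field: code outside this
    // module cannot use struct literal syntax to build it.
    pub struct E {
        _dummy: (),
    }

    impl E {
        // The only way to obtain an `E` from outside `stuff`.
        pub fn new() -> E {
            E { _dummy: () }
        }
    }
}

fn main() {
    let _e = stuff::E::new();
    // The private field does not add any size.
    assert_eq!(std::mem::size_of::<stuff::E>(), 0);
}
```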
See also
What is an idiomatic way to create a zero-sized struct that can't be instantiated outside its crate?
Why define a struct with single private field of unit type?
struct D; // unit struct
This is the usual way for people to write a zero-sized struct.
struct A{} // empty struct / empty braced struct
struct B(); // empty tuple struct
These are just special cases of the basic struct and the tuple struct which happen to have no fields. RFC 1506 explains the rationale for allowing them (they were not allowed before):
Permit tuple structs and tuple variants with 0 fields. This restriction is artificial and can be lifted trivially. Macro writers dealing with tuple structs/variants will be happy to get rid of this one special case.
As such, they could easily be generated by macros, but people will rarely write those on their own.
struct C(()); // unit-valued tuple struct
This is another special case of tuple struct. In Rust, () is a type just like any other type, so struct C(()); isn't much different from struct E(u32);. While the type itself isn't very useful, forbidding it would make yet another special case that would need to be handled in macros or generics (struct F<T>(T) can of course be instantiated as F<()>).
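As a quick sketch, the four declarations differ in how a value is written, not in size. The `main` function below is illustrative only.

```rust
use std::mem::size_of;

struct A {}   // empty braced struct
struct B();   // empty tuple struct
struct C(()); // unit-valued tuple struct
struct D;     // unit struct

fn main() {
    // Each form has its own construction syntax.
    let _a = A {};
    let _b = B();
    let _c = C(());
    let _d = D;

    // All four occupy zero bytes.
    assert_eq!(size_of::<A>(), 0);
    assert_eq!(size_of::<B>(), 0);
    assert_eq!(size_of::<C>(), 0);
    assert_eq!(size_of::<D>(), 0);
}
```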
Note that there are many other ways to have empty types in Rust. E.g., it is possible to have a function return Result<(), !> to indicate that it doesn't produce a value and cannot fail. While you might think that returning () in that case would be better, you might have to do that if you implement a trait that requires you to return Result<T, E> but lets you choose T = () and E = !.

How to constrain the element type of an iterator?

I’m converting some older Rust code to work on 1.0.0. I need to convert a function that takes an iterator over characters, which used to be written like this:
fn f<I: Iterator<char>>(char_iter: I)
Now that Iterator doesn’t take a parameter, the constraint on I can only be I: Iterator. The element type is then I::Item. Is there a way to express the constraint that I::Item = char? (Or should I be doing this another way entirely?)
fn f<I: Iterator<Item = char>>(char_iter: I)
Associated types were recently added to the language, and many library types were updated to take advantage of them. For example, Iterator defines one associated type, named Item. You can add a constraint on the associated type by writing the name of the associated type, an equals sign, and the type you need.
Okay, I was able to figure this out from reading some RFC discussions, and the answer is that you can instantiate associated types in the trait (like signature fibration in ML):
fn f<I: Iterator<Item = char>>(char_iter: I)
Soon it should be possible to use equality constraints in where clauses, but this doesn’t work in 1.0.0-alpha:
fn f<I: Iterator>(char_iter: I) where I::Item == char
You can write I: Iterator<Item = char>. At some point in the future, a where clause like where I::Item == char may work too, but not now.
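A short sketch of the constraint in use. The function names are hypothetical; the second variant shows that the same associated-type binding can also be written in a where clause (which is different from the `==` form mentioned above).

```rust
// Bound written inline on the type parameter.
fn count_vowels<I: Iterator<Item = char>>(char_iter: I) -> usize {
    char_iter.filter(|c| "aeiou".contains(*c)).count()
}

// The same bound moved into a where clause.
fn count_vowels_where<I>(char_iter: I) -> usize
where
    I: Iterator<Item = char>,
{
    char_iter.filter(|c| "aeiou".contains(*c)).count()
}

fn main() {
    println!("{}", count_vowels("iterator".chars()));
    println!("{}", count_vowels_where("iterator".chars()));
}
```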

How to write a fn that processes input and returns an iterator instead of the full result?

Forgive me if this is a dumb question, but I'm new to Rust, and having a hard time writing this toy program to test my understanding.
I want a function that given a string, returns the first word in each line, as an iterator (because the input could be huge, I don't want to buffer the result as an array). Here's the program I wrote which collects the result as an array first:
fn get_first_words(input: ~str) -> ~[&str] {
    return input.lines_any().filter_map(|x| x.split_str(" ").nth(0)).collect();
}
fn main() {
    let s = ~"Hello World\nFoo Bar";
    let words = get_first_words(s);
    for word in words.iter() {
        println!("{}", word);
    }
}
Result (as expected):
Hello
Foo
How do I modify this to return an Iterator instead? I'm apparently not allowed to make Iterator<&str> the return type. If I try @Iterator<&str>, rustc says
error: The managed box syntax is being replaced by the `std::gc::Gc` and `std::rc::Rc` types. Equivalent functionality to managed trait objects will be implemented but is currently missing.
I can't figure out for the life of me how to make that work.
Similarly, trying to return ~Iterator<&str> makes rustc complain that the actual type is std::iter::FilterMap<....blah...>.
In C# this is really easy, as you simply return the result of the equivalent map call as an IEnumerable<string>. Then the caller doesn't have to know the actual type that's returned; it only uses methods available in the IEnumerable interface.
Is there nothing like returning an interface in Rust??
(I'm using Rust 0.10)
I believe that the equivalent of the C# example would be returning ~Iterator<&str>. This can be done, but must be written explicitly: rather than returning x, return ~x as ~Iterator<&'a str>. (By the way, your function is going to have to take &'a str rather than ~str—if you don’t know why, ask and I’ll explain.)
This is not, however, idiomatic Rust because it is needlessly inefficient. The idiomatic approach is to spell out the return type explicitly. You can specify it in one place like this if you like:
use std::iter::{FilterMap, Map};
use std::str::CharSplits;
type Foo = FilterMap<'a, &'a str, &'a str,
Map<'a, &'a str, &'a str,
CharSplits<'a, char>>>
And then list Foo as the return type.
Yes, this is cumbersome. At present, there is no such thing as inferring a return type in any way. This has, however, been discussed and I believe it likely that it will come eventually in some syntax similar to fn foo<'a>(&'a str) -> Iterator<&'a str>. For now, though, there is no fancy sugar.
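For comparison, later Rust releases (1.26 and up) did add a way to hide the concrete type: `impl Trait` in return position. The sketch below uses current syntax rather than the Rust 0.10 syntax of this thread, and the function names are hypothetical. It also uses `split_whitespace` instead of splitting on a single space, so empty lines are skipped.

```rust
// Statically dispatched: the caller only sees "some iterator of &str".
fn first_words<'a>(input: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    input
        .lines()
        .filter_map(|line| line.split_whitespace().next())
}

// Boxed trait object: the modern spelling of the `~Iterator` approach.
fn first_words_boxed<'a>(input: &'a str) -> Box<dyn Iterator<Item = &'a str> + 'a> {
    Box::new(first_words(input))
}

fn main() {
    for word in first_words("Hello World\nFoo Bar") {
        println!("{}", word);
    }
    for word in first_words_boxed("Hello World\nFoo Bar") {
        println!("{}", word);
    }
}
```

Both versions are lazy, so nothing is buffered; the boxed one trades a heap allocation and dynamic dispatch for a type that can be named and stored.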
