I'm trying to instrument a function using tokio tracing, get its arguments and then walk to the parent span and get the values for a handful of fields. I was able to successfully get the argument by creating a new Layer as follows:
impl<S> Layer<S> for MyTestLayer
where
S: tracing::Subscriber,
S: for<'lookup> tracing_subscriber::registry::LookupSpan<'lookup>,
{
fn on_new_span(&self, attrs: &Attributes<'_>, id: &Id, ctx: Context<'_, S>) {
let span = ctx.span(id).unwrap();
let mut visitor = MyVisitor{ ... };
attrs.record(&mut visitor);
...
}
This seems to work and lets me get the attributes for the expected span. However, visiting each field/value with a visitor feels like I'm doing something wrong. Is this the best way to get the attribute values?
From here, I'd like to get two fields from the parent span. It seems like I should be able to do something like:
let parent = span.parent().unwrap();
let parent_fields = p.fields();
// ... Read the data from the fields?
However, this doesn't seem to work as fields() provides a FieldSet and no apparent access to the values or ValueSet. What's the proper way of getting a value from the parent field? Is there an easier/better way to do this?
Attributes, and therefore the values contained, are only available when a new span is created. The tracing framework does not keep them around for inspection which keeps the overhead low. It is the Subscriber that is responsible for persisting any values that it needs when processing spans.
Fortunately there's some tools to help. You are already taking advantage of the Layer ecosystem from tracing-subscriber and are even relying on its Registry and the LookupSpan trait as well. The registry provides a general purpose mechanism for storing data with spans, done via Extensions.
What this means is that your MyTestLayer should store any data that will be needed later when processing the parent span, which can then be fetched from the registry when processing the child.
Here's an example that fetches and persists a "test" field value that can be looked-up later:
use std::fmt::Debug;
use tracing::field::{Field, Visit};
use tracing::span::{Attributes, Id};
use tracing_subscriber::layer::{Context, Layer};
struct TestVisitor(Option<String>);
impl Visit for TestVisitor {
fn record_debug(&mut self, _field: &Field, _value: &dyn Debug) {
// do nothing
}
fn record_str(&mut self, field: &Field, value: &str) {
if field.name() == "test" {
self.0 = Some(value.to_owned());
}
}
}
struct Test(String);
pub struct MyTestLayer;
impl<S> Layer<S> for MyTestLayer
where
S: tracing::Subscriber,
S: for<'lookup> tracing_subscriber::registry::LookupSpan<'lookup>,
{
fn on_new_span(&self, attrs: &Attributes<'_>, id: &Id, ctx: Context<'_, S>) {
let span = ctx.span(id).unwrap();
let mut visitor = TestVisitor(None);
attrs.record(&mut visitor);
if let Some(test) = visitor.0 {
span.extensions_mut().insert(Test(test));
} else {
// span has no test value
}
let parent = span.parent().unwrap();
if let Some(Test(test)) = parent.extensions().get::<Test>() {
// parent test value is `test`
} else {
// parent has no test value
};
}
}
However, visiting each field/value with a visitor feels like I'm doing something wrong. Is this the best way to get the attribute values?
Using .record() is the only way to access the span's values as far as I'm aware.
Related
Here is a Rust function.
impl Room {
pub fn getPageContent(&mut self, bookNumber: u32, pageNumber: u16, alp: &Alphabet) {
let page = &mut self.books[(bookNumber-1) as usize].pages[(pageNumber-1) as usize];
if page.text.is_empty() {
page.generateContent(alp, bookNumber, &self);
}
}
}
Each page needs information from to room in order to generate its content.
However, doing that as show in the function of course leads to
cannot borrow `self` as immutable because it is also borrowed as mutable´
My question is : what is the intended design pattern to be used in such cases ? Especially at scale, as this is only an example.
Giving the page each information it needs from the Room would be a solution if all those informations implement Clone, but doesn't seem advisable (what if ´generateContent´ needs 10 different informations ? Such a pattern would quickly multiply to work to be done along with potential mistakes and overall complexity).
Another solution is to generate the content directly in the ´getPageContent´, but this pattern would quickly lead to having everything be done inside Room, which does not seem to be a good idea either when applied at scale either.
You can pull the mutation out of generateContent then change generateContent to take a non-mut reference.
pub fn get_page_content(&mut self, book_number: u32, page_number: u16, alp: &Alphabet) {
let page = &self.books[(book_number - 1) as usize].pages[(page_number - 1) as usize];
if page.text.is_empty() {
let text = page.generate_content(alp, book_number, self);
// lookup the page again but mut this time
let page =
&mut self.books[(book_number - 1) as usize].pages[(page_number - 1) as usize];
page.text = text;
}
}
I have a custom struct like the following:
struct MyStruct {
first_field: i32,
second_field: String,
third_field: u16,
}
Is it possible to get the number of struct fields programmatically (like, for example, via a method call field_count()):
let my_struct = MyStruct::new(10, "second_field", 4);
let field_count = my_struct.field_count(); // Expecting to get 3
For this struct:
struct MyStruct2 {
first_field: i32,
}
... the following call should return 1:
let my_struct_2 = MyStruct2::new(7);
let field_count = my_struct2.field_count(); // Expecting to get count 1
Is there any API like field_count() or is it only possible to get that via macros?
If this is achievable with macros, how should it be implemented?
Are there any possible API like field_count() or is it only possible to get that via macros?
There is no such built-in API that would allow you to get this information at runtime. Rust does not have runtime reflection (see this question for more information). But it is indeed possible via proc-macros!
Note: proc-macros are different from "macro by example" (which is declared via macro_rules!). The latter is not as powerful as proc-macros.
If this is achievable with macros, how should it be implemented?
(This is not an introduction into proc-macros; if the topic is completely new to you, first read an introduction elsewhere.)
In the proc-macro (for example a custom derive), you would somehow need to get the struct definition as TokenStream. The de-facto solution to use a TokenStream with Rust syntax is to parse it via syn:
#[proc_macro_derive(FieldCount)]
pub fn derive_field_count(input: TokenStream) -> TokenStream {
let input = parse_macro_input!(input as ItemStruct);
// ...
}
The type of input is ItemStruct. As you can see, it has the field fields of the type Fields. On that field you can call iter() to get an iterator over all fields of the struct, on which in turn you could call count():
let field_count = input.fields.iter().count();
Now you have what you want.
Maybe you want to add this field_count() method to your type. You can do that via the custom derive (by using the quote crate here):
let name = &input.ident;
let output = quote! {
impl #name {
pub fn field_count() -> usize {
#field_count
}
}
};
// Return output tokenstream
TokenStream::from(output)
Then, in your application, you can write:
#[derive(FieldCount)]
struct MyStruct {
first_field: i32,
second_field: String,
third_field: u16,
}
MyStruct::field_count(); // returns 3
It's possible when the struct itself is generated by the macros - in this case you can just count tokens passed into macros, as shown here. That's what I've come up with:
macro_rules! gen {
($name:ident {$($field:ident : $t:ty),+}) => {
struct $name { $($field: $t),+ }
impl $name {
fn field_count(&self) -> usize {
gen!(#count $($field),+)
}
}
};
(#count $t1:tt, $($t:tt),+) => { 1 + gen!(#count $($t),+) };
(#count $t:tt) => { 1 };
}
Playground (with some test cases)
The downside for this approach (one - there could be more) is that it's not trivial to add an attribute to this function - for example, to #[derive(...)] something on it. Another approach would be to write the custom derive macros, but this is something that I can't speak about for now.
I've been playing with Rust for the past few days,
and I'm still struggling with the memory management (figures).
I wrote a simple project that creates a hierarchy of structs ("keywords") from lexing/parsing a text file.
A keyword is defined like this:
pub struct Keyword<'a> {
name: String,
code: u32,
parent: Option<&'a Keyword<'a>>,
}
which is my equivalent for this C struct:
typedef struct Keyword Keyword;
struct Keyword {
char* name;
unsigned int code;
Keyword* parent;
};
A hierarchy is just a container for keywords, defined like this:
pub struct KeywordHierarchy<'a> {
keywords: Vec<Box<Keyword<'a>>>,
}
impl<'a> KeywordHierarchy<'a> {
fn add_keyword(&mut self, keyword: Box<Keyword<'a>>) {
self.keywords.push(keyword);
}
}
In the parse function (which is a stub of the complete parser), I recreated the condition that spawns the lifetime error in my code.
fn parse<'a>() -> Result<KeywordHierarchy<'a>, String> {
let mut hierarchy = KeywordHierarchy { keywords: Vec::new() };
let mut first_if = true;
let mut second_if = false;
while true {
if first_if {
// All good here.
let root_keyword = Keyword {
name: String::from("ROOT"),
code: 0,
parent: None,
};
hierarchy.add_keyword(Box::new(root_keyword));
first_if = false;
second_if = true;
}
if second_if {
// Hierarchy might have expired here?
// Find parent
match hierarchy.keywords.iter().find(|p| p.code == 0) {
None => return Err(String::from("No such parent")),
Some(parent) => {
// If parent is found, create a child
let child = Keyword {
name: String::from("CHILD"),
code: 1,
parent: Some(&parent),
};
hierarchy.add_keyword(Box::new(child));
}
}
second_if = false;
}
if !first_if && !second_if {
break;
}
}
Ok(hierarchy)
}
There's a while loop that goes through the lexer tokens.
In the first if, I add the ROOT keyword to the hierarchy, which is the only one that doesn't have a parent, and everything goes smoothly as expected.
In the second if, I parse the children keywords and I get a lifetime error when invoking KeywordHierarchy.add_keyword().
hierarchy.keywords` does not live long enough
Could you guys recommend an idiomatic way to fix this?
Cheers.
P.S. Click here for the playground
The immediate problem I can see with your design is that your loop is going to modify the hierarchy.keywords vector (in the first_if block), but it also borrows elements from the hierarchy.keywords vector (in the second_if block).
This is problematic, because modifying a vector may cause its backing buffer to be reallocated, which, if it were allowed, would invalidate all existing borrows to the vector. (And thus it is not allowed.)
Have you considered using an arena to hold your keywords instead of a Vec? Arenas are designed so that you can allocate new things within them while still having outstanding borrows to elements previously allocated within the arena.
Update: Here is a revised version of your code that illustrates using an arena (in this case a rustc_arena::TypedArena, but that's just so I can get this running on the play.rust-lang.org service) alongside a Vec<&Keyword> to handle the lookups.
https://play.rust-lang.org/?gist=fc6d81cb7efa7e4f32c481ab6482e587&version=nightly&backtrace=0
The crucial bits of code are this:
First: the KeywordHierarchy now holds a arena alongside a vec:
pub struct KeywordHierarchy<'a> {
keywords: Vec<&'a Keyword<'a>>,
kw_arena: &'a TypedArena<Keyword<'a>>,
}
Second: Adding a keyword now allocates the spot in the arena, and stashes a reference to that spot in the vec:
fn add_keyword(&mut self, keyword: Keyword<'a>) {
let kw = self.kw_arena.alloc(keyword);
self.keywords.push(kw);
}
Third: the fn parse function now takes an arena (reference) as input, because we need the arena to outlive the stack frame of fn parse:
fn parse<'a>(arena: &'a TypedArena<Keyword<'a>>) -> Result<KeywordHierarchy<'a>, String> {
...
Fourth: To avoid borrow-checker issues with borrowing hierarchy as mutable while also iterating over it, I moved the hierarchy modification outside of your Find parent match:
// Find parent
let parent = match hierarchy.keywords.iter().find(|p| p.code == 0) {
None => return Err(String::from("No such parent")),
Some(parent) => *parent, // (this deref is important)
};
// If parent is found, create a child
let child = Keyword {
name: String::from("CHILD"),
code: 1,
parent: Some(parent),
};
hierarchy.add_keyword(child);
second_if = false;
I am trying to share a reference in two collections: a map and a list. I want to push to both of the collections and remove from the back of the list and remove from the map too. This code is just a sample to present what I want to do, it doesn't even compile!
What is the right paradigm to implement this? What kind of guarantees are the best practice?
use std::collections::{HashMap, LinkedList};
struct Data {
url: String,
access: u64,
}
struct Holder {
list: LinkedList<Box<Data>>,
map: HashMap<String, Box<Data>>,
}
impl Holder {
fn push(&mut self, d: Data) {
let boxed = Box::new(d);
self.list.push_back(boxed);
self.map.insert(d.url.to_owned(), boxed);
}
fn remove_last(&mut self) {
if let Some(v) = self.list.back() {
self.map.remove(&v.url);
}
self.list.pop_back();
}
}
The idea of storing a single item in multiple collections simultaneously is not new... and is not simple.
The common idea to sharing elements is either doing it unsafely (knowing how many collections there are) or doing it with a shared pointer.
There is a some inefficiency in just blindly using regular collections with a shared pointer:
Node based collections mean you have a double indirection
Switching from one view to the other require finding the element all over again
In Boost.MultiIndex (C++), the paradigm used is to create an intrusive node to wrap the value, and then link this node in the various views. The trick (and difficulty) is to craft the intrusive node in a way that allows getting to the surrounding elements to be able to "unlink" it in O(1) or O(log N).
It would be quite unsafe to do so, and I don't recommend attempting it unless you are ready to spend significant time on it.
Thus, for a quick solution:
type Node = Rc<RefCell<Data>>;
struct Holder {
list: LinkedList<Node>,
map: HashMap<String, Node>,
}
As long as you do not need to remove by url, this is efficient enough.
Complete example:
use std::collections::{HashMap, LinkedList};
use std::cell::RefCell;
use std::rc::Rc;
struct Data {
url: String,
access: u64,
}
type Node = Rc<RefCell<Data>>;
struct Holder {
list: LinkedList<Node>,
map: HashMap<String, Node>,
}
impl Holder {
fn push(&mut self, d: Data) {
let url = d.url.to_owned();
let node = Rc::new(RefCell::new(d));
self.list.push_back(Rc::clone(&node));
self.map.insert(url, node);
}
fn remove_last(&mut self) {
if let Some(node) = self.list.back() {
self.map.remove(&node.borrow().url);
}
self.list.pop_back();
}
}
fn main() {}
Personally, I would use Rc instead of Box.
Alternatively, you could store indexes into your list as the value type of your hash map (i.e. use HashMap<String, usize> instead of HashMap<String, Box<Data>>.
Depending on what you are trying to do, it might be a better idea to only have either a list or a map, and not both.
Most of this is boilerplate, provided as a compilable example. Scroll down.
use std::rc::{Rc, Weak};
use std::cell::RefCell;
use std::any::{Any, AnyRefExt};
struct Shared {
example: int,
}
struct Widget {
parent: Option<Weak<Box<Widget>>>,
specific: RefCell<Box<Any>>,
shared: RefCell<Shared>,
}
impl Widget {
fn new(specific: Box<Any>,
parent: Option<Rc<Box<Widget>>>) -> Rc<Box<Widget>> {
let parent_option = match parent {
Some(parent) => Some(parent.downgrade()),
None => None,
};
let shared = Shared{pos: 10};
Rc::new(box Widget{
parent: parent_option,
specific: RefCell::new(specific),
shared: RefCell::new(shared)})
}
}
struct Woo {
foo: int,
}
impl Woo {
fn new() -> Box<Any> {
box Woo{foo: 10} as Box<Any>
}
}
fn main() {
let widget = Widget::new(Woo::new(), None);
{
// This is a lot of work...
let cell_borrow = widget.specific.borrow();
let woo = cell_borrow.downcast_ref::<Woo>().unwrap();
println!("{}", woo.foo);
}
// Can the above be made into a function?
// let woo = widget.get_specific::<Woo>();
}
I'm learning Rust and trying to figure out some workable way of doing a widget hierarchy. The above basically does what I need, but it is a bit cumbersome. Especially vexing is the fact that I have to use two statements to convert the inner widget (specific member of Widget). I tried several ways of writing a function that does it all, but the amount of reference and lifetime wizardry is just beyond me.
Can it be done? Can the commented out method at the bottom of my example code be made into reality?
Comments regarding better ways of doing this entire thing are appreciated, but put it in the comments section (or create a new question and link it)
I'll just present a working simplified and more idiomatic version of your code and then explain all the changed I made there:
use std::rc::{Rc, Weak};
use std::any::{Any, AnyRefExt};
struct Shared {
example: int,
}
struct Widget {
parent: Option<Weak<Widget>>,
specific: Box<Any>,
shared: Shared,
}
impl Widget {
fn new(specific: Box<Any>, parent: Option<Rc<Widget>>) -> Widget {
let parent_option = match parent {
Some(parent) => Some(parent.downgrade()),
None => None,
};
let shared = Shared { example: 10 };
Widget{
parent: parent_option,
specific: specific,
shared: shared
}
}
fn get_specific<T: 'static>(&self) -> Option<&T> {
self.specific.downcast_ref::<T>()
}
}
struct Woo {
foo: int,
}
impl Woo {
fn new() -> Woo {
Woo { foo: 10 }
}
}
fn main() {
let widget = Widget::new(box Woo::new() as Box<Any>, None);
let specific = widget.get_specific::<Woo>().unwrap();
println!("{}", specific.foo);
}
First of all, there are needless RefCells inside your structure. RefCells are needed very rarely - only when you need to mutate internal state of an object using only & reference (instead of &mut). This is useful tool for implementing abstractions, but it is almost never needed in application code. Because it is not clear from your code that you really need it, I assume that it was used mistakingly and can be removed.
Next, Rc<Box<Something>> when Something is a struct (like in your case where Something = Widget) is redundant. Putting an owned box into a reference-counted box is just an unnecessary double indirection and allocation. Plain Rc<Widget> is the correct way to express this thing. When dynamically sized types land, it will be also true for trait objects.
Finally, you should try to always return unboxed values. Returning Rc<Box<Widget>> (or even Rc<Widget>) is unnecessary limiting for the callers of your code. You can go from Widget to Rc<Widget> easily, but not the other way around. Rust optimizes by-value returns automatically; if your callers need Rc<Widget> they can box the return value themselves:
let rc_w = box(RC) Widget::new(...);
Same thing is also true for Box<Any> returned by Woo::new().
You can see that in the absence of RefCells your get_specific() method can be implemented very easily. However, you really can't do it with RefCell because RefCell uses RAII for dynamic borrow checks, so you can't return a reference to its internals from a method. You'll have to return core::cell::Refs, and your callers will need to downcast_ref() themselves. This is just another reason to use RefCells sparingly.