Build variable length arguments array for @call - metaprogramming

I've recently started learning Zig.
As a little project I wanted to implement a small QuickCheck [1] style helper library for writing randomized tests.
However, I can't figure out how to write a generic way to call a function with an arbitrary number of arguments.
Here's a simplified version that can test functions with two arguments:
const std = @import("std");
const Prng = std.rand.DefaultPrng;
const Random = std.rand.Random;
const expect = std.testing.expect;
// the thing we want to test
fn some_property(a: u64, b: u64) !void {
    var tmp: u64 = undefined;
    var c1 = @addWithOverflow(u64, a, b, &tmp);
    var c2 = @addWithOverflow(u64, a, b, &tmp);
    try expect(c1 == c2);
}
// helper for generating random arguments for the function under test
fn gen(comptime T: ?type, rnd: Random) (T orelse undefined) {
    switch (T orelse undefined) {
        u64 => return rnd.int(u64),
        f64 => return rnd.float(f64),
        else => @compileError("unsupported type"),
    }
}
/// tests if 'property' holds.
fn for_all(property: anytype) !void {
    var rnd = Prng.init(0);
    const arg_types = @typeInfo(@TypeOf(property)).Fn.args;
    var i: usize = 0;
    while (i < 100) {
        var a = gen(arg_types[0].arg_type, rnd.random());
        var b = gen(arg_types[1].arg_type, rnd.random());
        var args = .{a, b}; // <-- how do I build args for functions with any number of arguments?
        try @call(.{}, property, args);
        i += 1;
    }
}
test "test" {
    try for_all(some_property);
}
I've tried a few different things, but I can't figure out how to get the above code to work for functions with any number of arguments.
Things I've tried:
Make args an array and fill it with an inline for loop. Doesn't work since []anytype is not a valid type.
Use a bit of comptime magic to build a struct type whose fields hold the arguments for @call. This hits a TODO in the compiler: error: TODO: struct args.
Write generic functions that return an appropriate argument tuple. I don't really like this one, since you need one function for every arity you want to support. But it doesn't seem to work anyway, since anytype is not a valid return type.
I'm on Zig 0.9.1.
Any insight would be appreciated.
[1] https://hackage.haskell.org/package/QuickCheck

This can be done with std.meta.ArgsTuple, which is defined in the Zig standard library:
const Args = std.meta.ArgsTuple(@TypeOf(property));
var i: usize = 0;
while (i < 1000) : (i += 1) {
    var args: Args = undefined;
    inline for (std.meta.fields(Args)) |field, index| {
        args[index] = gen(field.field_type, rnd.random());
    }
    try @call(.{}, property, args);
}
Internally, ArgsTuple constructs a tuple type with @Type(). We can then fill a value of that type field by field and pass it to @call.


Struct initialization and method declaration in Go

I am new to Go and curious about structs. Let's define a struct T:
type T struct {
    size int
}
I have seen different types of struct initialization. What are the differences?
new(T) // 1
T{size:1} // 2
&T{size:1} // 3
And the two types of method declarations:
func (r *T) area() int // 1
func (r T) area() int // 2
Which is the right way?
1. Allocation
new(T) and &T{size:1} return *T.
T{size:1} returns T.
From the Go specification: "The built-in function new takes a type T, allocates storage for a variable of that type at run time, and returns a value of type *T pointing to it. The variable is initialized as described in the section on initial values."
2. Method declarations
From the Go specification: "The method set of any other named type T consists of all methods with receiver type T. The method set of the corresponding pointer type *T is the set of all methods with receiver *T or T (that is, it also contains the method set of T)."
var pt *T
var t T
func (r *T) area() int
With a pointer receiver you can call pt.area(), and also t.area() as long as t is addressable (the compiler rewrites the call to (&t).area()).
func (r T) area() int
With a value receiver you can call both t.area() and pt.area(); the pointer is dereferenced automatically and the method operates on a copy.
Usually we use func (r *T) area() int.
Here are different examples:
type Animal struct {
    Legs int
    Kingdom string
    Carnivore bool
}
Initialization by reference
Return the pointer to the struct
var tiger = &Animal{4, "mammalia", true}
fmt.Println(tiger.Kingdom) // print "mammalia"
func changeKingdom(a *Animal) {
    a.Kingdom = "alien" // modify original struct
}
changeKingdom(tiger)
fmt.Println(tiger.Kingdom) // print "alien"
Initialization with new
Return a pointer to a struct with zeroed values
var xAnimal = new(Animal)
fmt.Println(xAnimal.Kingdom) // print ""
fmt.Println(xAnimal.Legs) // print 0
fmt.Println(xAnimal.Carnivore) // print false
changeKingdom(xAnimal)
fmt.Println(xAnimal.Kingdom) // print "alien"
Initialization by value (copy)
Return a separate copy of the original struct
var giraffe = Animal{4, "mammalia", false}
fmt.Println(giraffe.Kingdom) // print "mammalia"
func changeKingdom(a Animal) {
    a.Kingdom = "extraterrestrial" // modifies only the local copy
}
changeKingdom(giraffe)
fmt.Println(giraffe.Kingdom) // print "mammalia"
When using structs, you'll deal with pointers more often than with copies.

How many vectors can be added in DataFrame::create( vec1, vec2 ... )?

I am creating a DataFrame to hold parsed HAProxy HTTP log files, which have quite a few fields (25+).
If I add more than 20 vectors (one for each field), I get the compilation error:
no matching function call to 'create'
The create method:
return DataFrame::create(
    _["clientIp"] = clientIp,
    _["clientPort"] = clientPort,
    _["acceptDate"] = acceptDate,
    _["frontendName"] = frontendName,
    _["backendName"] = backendName,
    _["serverName"] = serverName,
    _["tq"] = tq,
    _["tw"] = tw,
    _["tc"] = tc,
    _["tr"] = tr,
    _["tt"] = tt,
    _["status_code"] = statusCode,
    _["bytes_read"] = bytesRead,
#if CAPTURED_REQUEST_COOKIE_FIELD == 1
    _["capturedRequestCookie"] = capturedRequestCookie,
#endif
#if CAPTURED_REQUEST_COOKIE_FIELD == 1
    _["capturedResponseCookie"] = capturedResponseCookie,
#endif
    _["terminationState"] = terminationState,
    _["actconn"] = actconn,
    _["feconn"] = feconn,
    _["beconn"] = beconn,
    _["srv_conn"] = srvConn,
    _["retries"] = retries,
    _["serverQueue"] = serverQueue,
    _["backendQueue"] = backendQueue
);
Questions:
Have I hit a hard limit?
Is there a workaround to allow me to add more than 20 vectors to a data frame?
Yes, you have hit a hard limit. Rcpp is limited by the C++98 standard, which has no variadic templates, so supporting 'variadic' arguments requires explicit code bloat: a separate overload of create must be generated for every argument count. To avoid choking the compiler, Rcpp only provides overloads for up to 20 arguments.
A workaround would be to use a 'builder' class, where you successively add elements and then convert to a DataFrame at the end. Here is a simple example of such a class: we create a ListBuilder object, to which we successively add new columns. Try running Rcpp::sourceCpp() with this file to see the output.
#include <Rcpp.h>
using namespace Rcpp;
class ListBuilder {
public:
    ListBuilder() {};
    ~ListBuilder() {};
    inline ListBuilder& add(std::string const& name, SEXP x) {
        names.push_back(name);
        // NOTE: we need to protect the SEXPs we pass in; there is
        // probably a nicer way to handle this but ...
        elements.push_back(PROTECT(x));
        return *this;
    }
    inline operator List() const {
        List result(elements.size());
        for (size_t i = 0; i < elements.size(); ++i) {
            result[i] = elements[i];
        }
        result.attr("names") = wrap(names);
        UNPROTECT(elements.size());
        return result;
    }
    inline operator DataFrame() const {
        List result = static_cast<List>(*this);
        result.attr("class") = "data.frame";
        result.attr("row.names") = IntegerVector::create(NA_INTEGER, XLENGTH(elements[0]));
        return result;
    }
private:
    std::vector<std::string> names;
    std::vector<SEXP> elements;
    ListBuilder(ListBuilder const&) {}; // not safe to copy
};
// [[Rcpp::export]]
DataFrame test_builder(SEXP x, SEXP y, SEXP z) {
    return ListBuilder()
        .add("foo", x)
        .add("bar", y)
        .add("baz", z);
}
/*** R
test_builder(1:5, letters[1:5], rnorm(5))
*/
PS: With Rcpp11, we have variadic functions and hence the limitations are removed.
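To illustrate why variadic templates remove the fixed cap, here is a minimal C++11 sketch that does not use Rcpp at all. Column, make_column, collect and make_table are hypothetical names invented for this illustration, with a plain pair standing in for a named column; the point is only that one recursive template covers every argument count, where C++98 needs one hand-written overload per count.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a named column (not an Rcpp type).
typedef std::pair<std::string, std::vector<double> > Column;

// Hypothetical convenience helper: a column holding a single value.
inline Column make_column(const std::string& name, double value) {
    return Column(name, std::vector<double>(1, value));
}

// Base case: no columns left to append.
inline void collect(std::vector<Column>&) {}

// Recursive case: append the first column, then recurse on the rest.
template <typename... Rest>
void collect(std::vector<Column>& out, const Column& first, const Rest&... rest) {
    out.push_back(first);
    collect(out, rest...);
}

// Accepts any number of columns; no per-arity overloads are needed.
template <typename... Cols>
std::vector<Column> make_table(const Cols&... cols) {
    std::vector<Column> out;
    collect(out, cols...);
    return out;
}
```

Calling make_table with 3 or with 30 columns instantiates the same two templates, which is the property the 20-argument limit lacks.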
The other common approach with Rcpp is to use an outer list containing as many DataFrame objects as you need, with each one limited by the number of elements provided via the old-school macro expansion / repetition in the corresponding header.
In (untested) code:
Rcpp::DataFrame a = Rcpp::DataFrame::create(/* ... */);
Rcpp::DataFrame b = Rcpp::DataFrame::create(/* ... */);
Rcpp::DataFrame c = Rcpp::DataFrame::create(/* ... */);
return Rcpp::List::create(Rcpp::Named("a") = a,
                          Rcpp::Named("b") = b,
                          Rcpp::Named("c") = c);

Comparing String.Index values

Is it possible to compare two String.Index values in Swift? I'm trying to process a string character by character, and several times I need to check if I am at the end of the string. I've tried just doing
while (currentIndex < string.endIndex) {
    //do things...
    currentIndex = currentIndex.successor()
}
Which complained about type conversions. Then, I tried defining an overload for < as such:
@infix func <(lhs: String.Index, rhs: String.Index) -> Bool {
    var ret = true //what goes here?
    return ret
}
Which gets rid of compilation errors, but I have no clue what to do in order to compare lhs and rhs properly. Is this the way I should go about using String.Index, or is there a better way to compare them?
The simplest option is the distance() function:
var string = "Hello World"
var currentIndex = string.startIndex
while (distance(currentIndex, string.endIndex) > 0) {
    println("currentIndex: \(currentIndex)")
    currentIndex = currentIndex.successor()
}
Beware that distance() has O(N) performance, so avoid it for large strings. However, the String type doesn't currently handle large strings well anyway; you should probably switch to CFString if performance is critical.
Using an operator overload is a bad idea, but just as a learning exercise this is how you'd do it:
var string = "Hello World"
var currentIndex = string.startIndex
@infix func <(lhs: String.Index, rhs: String.Index) -> Bool {
    return distance(lhs, rhs) > 0
}
while (currentIndex < string.endIndex) {
    currentIndex = currentIndex.successor()
}
String indexes support == and !=. They are an opaque type, not integers, and cannot be compared like integers.
Use: if (currentIndex != string.endIndex)
var currentIndex = string.startIndex
while (currentIndex != string.endIndex) {
    println("currentIndex: \(currentIndex)")
    currentIndex = currentIndex.successor()
}
I believe this REPL/Playground example should illuminate what you (and others) need to know about working with the String.Index concept.
// This will be our working example
let exampleString = "this is a string"
// And here we'll call successor a few times to get an index partway through the example
var someIndexInTheMiddle = exampleString.startIndex
for _ in 1...5 {
    someIndexInTheMiddle = someIndexInTheMiddle.successor()
}
// Now iterate over the string and report how each index relates to the one selected above: just after it, equal to it, or just before it
println("\n\nsomeIndexInTheMiddle = \(exampleString[someIndexInTheMiddle])")
for var index: String.Index = exampleString.startIndex; index != exampleString.endIndex; index = index.successor() {
    println(" - \(exampleString[index])")
    if index != exampleString.startIndex && index.predecessor() == someIndexInTheMiddle {
        println("current character comes after someIndexInTheMiddle")
    } else if index == someIndexInTheMiddle {
        println("current character is the one indicated by someIndexInTheMiddle")
    } else if index != exampleString.endIndex && index.successor() == someIndexInTheMiddle {
        println("current character comes before someIndexInTheMiddle")
    }
}
Hopefully that provides the necessary information.
Whichever way you decide to iterate over a String, you will immediately want to capture the iteration in a function that can be invoked repeatedly, applying a closure to each character. As in:
extension String {
    func each (f: (Character) -> Void) {
        for var index = self.startIndex;
            index != self.endIndex;
            index = index.successor() {
            f (self[index])
        }
    }
}
Apple already provides these for C-Strings and will for general strings as soon as they get character access solidified.

C++ lambdas for std::sort and std::lower_bound/equal_range on a struct element in a sorted vector of structs

I have a std::vector of this struct:
struct MS
{
    double aT;
    double bT;
    double cT;
};
which I want to use std::sort on as well as std::lower_bound/equal_range etc...
I need to be able to sort it and look it up on either of the first two elements of the struct. So at the moment I have this:
class MSaTLess
{
public:
    bool operator() (const MS &lhs, const MS &rhs) const
    {
        return TLess(lhs.aT, rhs.aT);
    }
    bool operator() (const MS &lhs, const double d) const
    {
        return TLess(lhs.aT, d);
    }
    bool operator() (const double d, const MS &rhs) const
    {
        return TLess(d, rhs.aT);
    }
private:
    bool TLess(const double& d1, const double& d2) const
    {
        return d1 < d2;
    }
};
class MSbTLess
{
public:
    bool operator() (const MS &lhs, const MS &rhs) const
    {
        return TLess(lhs.bT, rhs.bT);
    }
    bool operator() (const MS &lhs, const double d) const
    {
        return TLess(lhs.bT, d);
    }
    bool operator() (const double d, const MS &rhs) const
    {
        return TLess(d, rhs.bT);
    }
private:
    bool TLess(const double& d1, const double& d2) const
    {
        return d1 < d2;
    }
};
This allows me to call both std::sort and std::lower_bound with MSaTLess() to sort/lookup based on the aT element and with MSbTLess() to sort/lookup based on the bT element.
I'd like to get away from the functors and use C++0x lambdas instead. For sort that is relatively straightforward as the lambda will take two objects of type MS as arguments.
What about for the lower_bound and other binary search lookup algorithms though? They need to be able to call a comparator with (MS, double) arguments and also the reverse, (double, MS), right? How can I best provide these with a lambda in a call to lower_bound? I know I could create an MS dummy object with the required key value being searched for and then use the same lambda as with std::sort but is there a way to do it without using dummy objects?
It's a little awkward, but if you check the definitions of lower_bound and upper_bound from the standard, you'll see that the definition of lower_bound puts the dereferenced iterator as the first parameter of the comparison (and the value second), whereas upper_bound puts the dereferenced iterator second (and the value first).
So, I haven't tested this but I think you'd want:
std::lower_bound(vec.begin(), vec.end(), 3.142, [](const MS &lhs, double rhs) {
    return lhs.aT < rhs;
});
and
std::upper_bound(vec.begin(), vec.end(), 3.142, [](double lhs, const MS &rhs) {
    return lhs < rhs.aT;
});
This is pretty nasty, and without looking up a few more things I'm not sure you're actually entitled to assume that the implementation uses the comparator only in the way it's described in the text - that's a definition of the result, not the means to get there. It also doesn't help with binary_search or equal_range.
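As a way to try the two calls above, here is a minimal self-contained sketch. The wrapper functions lower_index_by_aT and upper_index_by_aT, the helper is_sorted_by_aT and the sample data are all hypothetical, added only for illustration. It relies on std::lower_bound invoking the comparator as comp(element, value) and std::upper_bound invoking it as comp(value, element).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct MS {
    double aT;
    double bT;
    double cT;
};

// Precondition check used below: binary searches need a range sorted by aT.
inline bool is_sorted_by_aT(const std::vector<MS>& vec) {
    return std::is_sorted(vec.begin(), vec.end(),
                          [](const MS& a, const MS& b) { return a.aT < b.aT; });
}

// Index of the first element whose aT is not less than key.
inline std::size_t lower_index_by_aT(const std::vector<MS>& vec, double key) {
    assert(is_sorted_by_aT(vec));
    auto it = std::lower_bound(vec.begin(), vec.end(), key,
                               [](const MS& lhs, double rhs) { return lhs.aT < rhs; });
    return static_cast<std::size_t>(it - vec.begin());
}

// Index of the first element whose aT is greater than key.
inline std::size_t upper_index_by_aT(const std::vector<MS>& vec, double key) {
    assert(is_sorted_by_aT(vec));
    auto it = std::upper_bound(vec.begin(), vec.end(), key,
                               [](double lhs, const MS& rhs) { return lhs < rhs.aT; });
    return static_cast<std::size_t>(it - vec.begin());
}

// Hypothetical sample data, already sorted by aT (values 1, 2, 2, 4).
inline std::vector<MS> sample_sorted_by_aT() {
    return {{1.0, 9.0, 0.0}, {2.0, 8.0, 0.0}, {2.0, 7.0, 0.0}, {4.0, 6.0, 0.0}};
}
```

For the sample data and key 2.0, the two indices delimit the run of elements with aT equal to 2.0, which is the range equal_range would report if it accepted a comparator of this shape.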
It's not explicitly stated in 25.3.3.1 that the iterator's value type must be convertible to T, but it's sort of implied by the fact that the requirement for the algorithm is that T (in this case, double) must be LessThanComparable, not that T must be comparable to the value type of the iterator in any particular order.
So I think it's better just to always use a lambda (or functor) that compares two MS structs, and instead of passing a double as a value, pass a dummy MS with the correct field set to the value you're looking for:
std::upper_bound(vec.begin(), vec.end(), MS(3.142,0,0), [](const MS &lhs, const MS &rhs) {
    return lhs.aT < rhs.aT;
});
If you don't want to give MS a constructor (because you want it to be POD), then you can write a function to create your MS object:
MS findA(double d) {
    MS result = {d, 0, 0};
    return result;
}
MS findB(double d) {
    MS result = {0, d, 0};
    return result;
}
Really, now that there are lambdas, for this job we want a version of binary search that takes a unary "comparator":
double d = something();
unary_upper_bound(vec.begin(), vec.end(), [d](const MS &rhs) {
    return d < rhs.aT;
});
C++0x doesn't provide it, though.
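There is no standard algorithm under that name, but std::partition_point, which is part of C++11, is a binary search driven by a unary predicate, so a unary_upper_bound can be sketched on top of it. This is one possible implementation under that assumption, not something from the standard library; unary_upper_index and the sample data are hypothetical helpers for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

struct MS {
    double aT;
    double bT;
    double cT;
};

// Returns the first position for which is_greater(element) is true.
// Assumes every element for which is_greater is false precedes every
// element for which it is true (which holds for a sorted range).
template <typename ForwardIt, typename UnaryPred>
ForwardIt unary_upper_bound(ForwardIt first, ForwardIt last, UnaryPred is_greater) {
    typedef typename std::iterator_traits<ForwardIt>::value_type Value;
    // partition_point finds the first element for which its predicate is
    // false, so the predicate is negated here.
    return std::partition_point(first, last,
                                [&is_greater](const Value& x) { return !is_greater(x); });
}

// Usage in the shape suggested above: capture the key in the lambda.
inline std::size_t unary_upper_index(const std::vector<MS>& vec, double d) {
    auto it = unary_upper_bound(vec.begin(), vec.end(),
                                [d](const MS& rhs) { return d < rhs.aT; });
    return static_cast<std::size_t>(it - vec.begin());
}

// Hypothetical sample data, already sorted by aT (values 1, 2, 2, 4).
inline std::vector<MS> sample_sorted_by_aT() {
    return {{1.0, 0.0, 0.0}, {2.0, 0.0, 0.0}, {2.0, 0.0, 0.0}, {4.0, 0.0, 0.0}};
}
```

A unary lower bound follows the same pattern with the predicate rhs.aT < d passed to std::partition_point directly, without the negation.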
The algorithms std::sort, std::lower_bound, and std::binary_search take a predicate that compares two elements of the container. Any lambda that compares two MS objects and returns true when the first is ordered strictly before the second should work for all three algorithms, provided the value being searched for is itself passed as an MS.
Not directly relevant to what you're saying about lambdas, but this might be an idea for using the binary search functions:
#include <iostream>
#include <algorithm>
#include <vector>
struct MS
{
    double aT;
    double bT;
    double cT;
    MS(double a, double b, double c) : aT(a), bT(b), cT(c) {}
};
// template parameter is a data member of MS, of type double
template <double MS::*F>
struct Find {
    double d;
    Find(double d) : d(d) {}
};
template <double MS::*F>
bool operator<(const Find<F> &lhs, const Find<F> &rhs) {
    return lhs.d < rhs.d;
}
template <double MS::*F>
bool operator<(const Find<F> &lhs, const MS &rhs) {
    return lhs.d < rhs.*F;
}
template <double MS::*F>
bool operator<(const MS &lhs, const Find<F> &rhs) {
    return lhs.*F < rhs.d;
}
int main() {
    std::cout << (Find<&MS::bT>(1) < Find<&MS::bT>(2)) << "\n";
    std::cout << (Find<&MS::bT>(1) < MS(1,0,0)) << "\n";
    std::cout << (MS(1,0,0) < Find<&MS::bT>(1)) << "\n";
    std::vector<MS> vec;
    vec.push_back(MS(1,0,0));
    vec.push_back(MS(0,1,0));
    std::lower_bound(vec.begin(), vec.end(), Find<&MS::bT>(0.5));
    std::upper_bound(vec.begin(), vec.end(), Find<&MS::bT>(0.5));
}
Basically, by using Find as the value, we don't have to supply a comparator, because Find compares to MS using the field that we specify. This is the same kind of thing as the answer you saw over here: how to sort STL vector, but using the value rather than the comparator as in that case. Not sure if it'd be all that great to use, but it might be, since it specifies the value to search for and the field to search in a single short expression.
I had the same problem for std::equal_range and came up with an alternative solution.
I have a collection of pointers to objects sorted on a type field. I need to find the range of objects for a given type.
const auto range = std::equal_range (next, blocks.end(), nullptr,
    [type] (Object* o1, Object* o2)
    {
        return (o1 ? o1->Type() : type) < (o2 ? o2->Type() : type);
    });
Although it is less efficient than a dedicated predicate as it introduces an unnecessary nullptr test for each object in my collection, it does provide an interesting alternative.
As an aside, when I do use a class as in your example, I tend to do the following. As well as being shorter, this allows me to add additional types with only 1 function per type rather than 4 operators per type.
class MSbTLess
{
private:
    static inline const double& value (const MS& val)
    {
        return val.bT;
    }
    static inline const double& value (const double& val)
    {
        return val;
    }
public:
    template <typename T1, typename T2>
    bool operator() (const T1& lhs, const T2& rhs) const
    {
        return value (lhs) < value (rhs);
    }
};
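The same idea, one templated call operator plus one key-extraction overload per type, also satisfies std::equal_range, which invokes the comparator in both argument orders. A minimal sketch follows; ByBT, count_with_bT and the sample data are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

struct MS {
    double aT;
    double bT;
    double cT;
};

// Comparator on the bT field that accepts (MS, double), (double, MS)
// and (MS, MS) through a single templated call operator.
struct ByBT {
    static double key(const MS& m) { return m.bT; }
    static double key(double d) { return d; }

    template <typename L, typename R>
    bool operator()(const L& lhs, const R& rhs) const {
        return key(lhs) < key(rhs);
    }
};

// Number of elements whose bT equals b, in a vector sorted by bT.
inline std::size_t count_with_bT(const std::vector<MS>& vec, double b) {
    auto range = std::equal_range(vec.begin(), vec.end(), b, ByBT());
    return static_cast<std::size_t>(range.second - range.first);
}

// Hypothetical sample data, already sorted by bT (values 1, 2, 2, 3).
inline std::vector<MS> sample_sorted_by_bT() {
    return {{0.0, 1.0, 0.0}, {0.0, 2.0, 0.0}, {0.0, 2.0, 0.0}, {0.0, 3.0, 0.0}};
}
```

Because the call operator also accepts two MS objects, the same ByBT() instance can be handed to std::sort to establish the ordering that the search relies on.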
In the definition of lower_bound, the Compare function is such that its first parameter type must match the value type of the Forward Iterator and its second must match T (i.e., the type of the value). For upper_bound the two are the other way round.
template< class ForwardIt, class T, class Compare >
ForwardIt lower_bound( ForwardIt first, ForwardIt last, const T& value, Compare comp );
So one can indeed compare objects of different types (doing what the other response called a unary comparator). In C++11:
vector<MS> v = SomeSortedVectorofMSByFieldaT();
double a_key = 3.142; // example key to search for
auto it = std::lower_bound(v.begin(),
                           v.end(),
                           a_key,
                           [](const MS& m, const double& a) {
                               return m.aT < a;
                           });
And this can be used with other STL algorithm functions as well.

How can I create a reference to a variable in specman?

I have the following code in specman:
var x := some.very.long.path.to.a.variable.in.another.struct;
while (x == some_value) {
    //do something that uses x;
    //wait for something
    //get a new value for x
    x = some.very.long.path.to.a.variable.in.another.struct;
};
Now, it seems wasteful to write the assignment to x twice; once during initialization and once during the loop.
What I really want to use is a reference to the long variable name, so that I could do:
var x := reference to some.very.long.path.to.a.variable.in.another.struct;
while (x == some_value) {
    //do something that uses x;
    //wait for something
    //no need to update x now since it's a reference
};
Can this be done in specman?
specman/e generally uses references for structs and lists, so if your variable is of either kind, your second example should work. For integers or booleans I don't know of a way to hold a reference to a variable. Anyway, here are two ideas which might help you:
Add a pointer to the other struct and bind it in a config file:
struct a { other_variable : uint; };
struct b {
    other_struct : a;
    some_func() is {
        var x : uint = other_struct.other_variable;
        while (x == some_value) {
            x = other_struct.other_variable;
        };
    };
};
extend cfg {
    struct_a : a;
    struct_b : b;
    keep struct_b.other_struct == struct_a;
};
UPDATE: You can find some more information on this technique in this Team Specman Post.
Wrap your while loop in a function; there you can pass parameters by reference (see help pass reference):
some_func(x : *uint) is {
    while (x == some_value) {
        // stuff ...
    };
};
Hope this helps!
Daniel
