Rcpp check if list has an element - rcpp

My program deals with clustering. Apart from the dataset, the user has to specify some details about the clusters. There are two ways to do that: specify the number of clusters, or prepare a list of cluster descriptions.
args <- list(dataset=points, K=5)
args <- list(dataset=points, clusters=list(
list(type="spherical",radius=4),
list(type="covariance",covMat=matrix)
))
Next, you call the appropriate function (my program) in R with args as the argument.
classification <- CEC(args)
I would like to write CEC like this:
SEXP CEC(SEXP args) {
Rcpp::List list(args);
arma::mat dataset = Rcpp::as<arma::mat>(list["dataset"]);
if(list.contains("K")) {
//something
} else if(list.contains("clusters")) {
//something
}
}
I cannot find any API documentation for List, or an example of how to do this. I have also studied the Rcpp headers, but the definition of List, which is typedef Vector<VECSXP> List ;, is hardly helpful.
Is there anything I can use instead of list.contains() ?

You are probably looking for the containsElementNamed method:
Rcpp::List list(args);
if( list.containsElementNamed("K") ){
// something
} else {
// something else
}
https://github.com/RcppCore/Rcpp/blob/master/inst/include/Rcpp/vector/Vector.h#L584

Related

Register array with a field in only one of the registers in DML 1.4

I want to make a register array, where one of the registers should include a field in bit 0 with a value of 1.
I have tried using a conditional without any success.
register feature_bits[i < N_FEATURE_SELECT_REGS] size 4 is unmapped {
#if (i == 1) {
field virtio_f_version_1 @ [0] "VIRTIO_F_VERSION_1" {
param init_val = 1;
}
}
}
I have also tried indexing the register element and setting the field accordingly:
register feature_bits[i < N_FEATURE_SELECT_REGS] size 4 is unmapped;
register feature_bits[1] {
field VIRTIO_F_VERSION_1 @ [0] {
param init_val = 1;
}
}
Neither of these approaches worked.
If we take a step back and look at what you're trying to accomplish, a third option may be usable. If only the init value of some fields differs, you can create a template type and iterate over its instances like this:
template feature_bit {
param lsb : uint64;
}
bank regs is (init, hard_reset) {
register features[i < 4] size 4 @ unmapped;
method init() {
foreach f in (each feature_bit in (this)) {
features[f.lsb / 32].val[f.lsb % 32] = 1;
}
}
method hard_reset() {
init();
}
group featA is feature_bit { param lsb = 42; }
group featB is feature_bit { param lsb = 3; }
}
The init-template is depth-first recursive, so the feature registers will be initialized first, and then the init() method of the bank will run and set the value of the feature-registers by iterating over all instances of the feature_bit template. We also call init from hard_reset, which is also depth-first recursive, otherwise the register will be 0 after reset.
Arrays in DML are homogeneous; every subobject must exist for all indices. This is because when you write a method inside an array, each array index translates to an implicit method argument, so in your case, if a method in your register calls this.virtio_f_version_1.read(), this translates to something like regs__feature_bits__virtio_f_version_1__read(_dev, _i). This function exists for all values of i, therefore the field must exist for all values of i.
There are two approaches to solve your problem. The first is to let the field exist in all indices, and add code to make it pretend to not exist in indices other than 1:
register feature_bits[...] {
field VIRTIO_F_VERSION_1 @ [0] {
param init_val = i == 1 ? 1 : 0;
method write(uint64 value) {
if (i == 1) {
return set_f_version_1(value);
} else if (value != this.val) {
log spec_viol: "write outside fields";
}
}
}
}
The second approach is to accept that your set of registers is heterogeneous, and use a template to share code between them instead:
template feature_bit {
param index;
#if (index == 1) {
field virtio_f_version ... { ... }
}
... lots of common fields here ...
}
register feature_bits_0 is feature_bit { param index = 0; }
register feature_bits_1 is feature_bit { param index = 1; }
register feature_bits_2 is feature_bit { param index = 2; }
...
Which approach to choose is a trade-off; the first solution is more economic in terms of memory use and compile speed (because DML doesn't need to create almost-identical copies of methods across register indices), whereas the second solution gives a model that more accurately reflects the specification upon inspection (because the model will declare to the simulator that virtio_f_version_1 is a field only for one of the registers).
If you read the spec and you get the feeling that this field is a singular exception to an otherwise homogeneous array, then the first approach is probably better, but if registers vary wildly across indices then the second approach probably makes more sense.

Why would a function called from another not show in the profile output of a node app?

I have a NodeJS program. This is a trimmed down version of a much larger program with a lot of complexity that's irrelevant to this question removed. It goes through lists looking for matching objects:
/**
* Checks that the given attributes are defined in the given dictionaries
* and have the same value
*/
function areDefinedAndEqual(a, b, keys) {
return (
a &&
b &&
keys
.map(function (k) {
return a[k] && b[k] && a[k] === b[k]
})
.reduce(function (a, b) {
return a && b
}, true)
)
}
function calculateOrder() {
const matchingRules = [
{
desc: 'stuff, more stuff and different stuff',
find: (po, dp) => areDefinedAndEqual(po, dp, ['stuff', 'more_stuff', 'different_stuff'])
},
{
desc: 'stuff and different stuff',
find: (po, dp) => areDefinedAndEqual(po, dp, ['stuff', 'different_stuff'])
},
{
desc: 'just stuff',
find: (po, dp) => areDefinedAndEqual(po, dp, ['stuff'])
}
]
let listOfStuff = []
listOfStuff[999] = { stuff: 'Hello' }
listOfStuff[9999] = { stuff: 'World' }
listOfStuff[99999] = { stuff: 'Hello World' }
// Run through lots of objects running through different rules to
// find things that look similar to what we're searching for
for (let i = 0; i < 100000000; i++) {
for (let j = 0; j < matchingRules.length; j++) {
if (matchingRules[j].find({ stuff: 'Hello World' }, listOfStuff[i])) {
console.log(`Found match at position ${i} on ${matchingRules[j].desc}`)
}
}
}
}
calculateOrder()
Now all calculateOrder does is repeatedly call functions listed under matchingRules which in turn call areDefinedAndEqual which does some actual checking.
Now if I run this as follows:
richard@sophia:~/cc/sheetbuilder (main) $ node --prof fred.js
Found match at position 99999 on just stuff
richard@sophia:~/cc/sheetbuilder (main) $
I get just what I'd expect. So far so good.
I can then run the profile output through prof-process to get something more readable.
node --prof-process isolate-0x57087f0-56563-v8.log
However if I look at the output, I see this:
[JavaScript]:
ticks total nonlib name
4197 46.0% 89.0% LazyCompile: *calculateOrder /home/richard/cogcred/eng-data_pipeline_misc/sheetbuilder/fred.js:19:24
All the time is being spent in calculateOrder. I'd expect to see a large percentage of the time spent in the various "find" functions and in areDefinedAndEqual, but I don't. There's no mention of any of them at all. Why? Are they potentially being optimized out or inlined in some way? If so, how do I begin to debug that? Or are there restrictions that stop certain functions from showing in the output? In which case, where are those restrictions defined? Any pointers would be much appreciated.
I'm running Node v16.5.0
Functions show up in the profile when tick samples have been collected for them. Since sample-based profiling is a statistical affair, it could happen that a very short-running function just wasn't sampled.
In the case at hand, inlining is the more likely answer. Running node with --trace-turbo-inlining spits out a bunch of information about inlining decisions.
If I run the example you posted, I see areDefinedAndEqual getting inlined into find, and accordingly find (and calculateOrder) are showing up high on the profile. Looking closely, in the particular run I profiled, areDefinedAndEqual was caught by a single profiler tick, before it got inlined.

If statements not working with JSON array

I have a JSON file containing 2 Discord client IDs:
{
"premium": [
"a random string of numbers that is a client id",
"a random string of numbers that is a client id"
]
}
I have tried to access these client IDs to do things in the program using a for loop + if statement:
for(i in premium.premium){
if(premium.premium[i] === msg.author.id){
//do some stuff
}else{
//do some stuff
}
}
When the program is run, it goes through the for loop and enters the else branch first (which is not supposed to happen), then runs the code in the if branch twice. But there are only 2 client IDs and the for loop has run 3 times, and on the first iteration it goes straight to the else branch even though the person who sent the message has their client ID in the JSON file.
How can I fix this? Any help is greatly appreciated.
You may want to add a return statement within your for loop. Otherwise, the loop will keep running until it has nothing else to loop over, regardless of whether a match was already found. See the documentation on for loops.
For example, here it is without return statements:
const json = {
"premium": [
"aaa-1",
"bbb-1"
]
}
for (const i in json.premium) {
if (json.premium[i] === "aaa-1") {
console.log("this is aaa-1!!!!")
} else {
console.log("this is not what you're looking for-1...")
}
}
And here it is with return statements:
const json = {
"premium": [
"aaa-2",
"bbb-2"
]
}
function loopOverJson() {
for (const i in json.premium) {
if (json.premium[i] === "aaa-2") {
console.log("this is aaa-2!!!!")
return
} else {
console.log("this is not what you're looking for-2...")
return
}
}
}
loopOverJson()
Note: without wrapping the above in a function, the console will show: "Syntax Error: Illegal return statement."
for(i in premium.premium){
if(premium.premium[i] === msg.author.id){
//do some stuff
} else{
//do some stuff
}
}
1) It will loop through all your premium.premium entries. If there are 3 entries it will execute three times. You could use a break statement if you want to exit the loop once a match is found.
2) You should check the type of your msg.author.id. Since you are using the strict comparison operator === it will evaluate to false if your msg.author.id is an integer since you are comparing to a string (based on your provided json).
Use implicit casting: if (premium.premium[i] == msg.author.id)
Use explicit casting: if (premium.premium[i] === String(msg.author.id))
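The two points above (stop at the first match, and compare the IDs as strings) can be combined in one small helper. This is only a sketch: isPremium and the sample IDs are hypothetical, and it assumes the IDs in the JSON file are stored as strings.

```javascript
// Hypothetical helper: returns true as soon as one stored ID matches.
function isPremium(premiumIds, authorId) {
  // Normalise to a string so a numeric ID still matches under ===.
  const wanted = String(authorId);
  for (const id of premiumIds) {
    if (id === wanted) {
      return true; // stop looping at the first match
    }
  }
  return false; // reached only when no entry matched
}

// Made-up data standing in for the parsed JSON file.
const premium = { premium: ["111", "222"] };

console.log(isPremium(premium.premium, "222")); // true
console.log(isPremium(premium.premium, 111));   // true (number is converted)
console.log(isPremium(premium.premium, "333")); // false
```

Putting the "not found" handling after the loop, rather than in an else branch inside it, avoids running that code once per non-matching entry.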
The really fun and easy way to solve problems like this is to use the built-in Array methods like map, reduce or filter. Then you don't have to worry about your iterator values.
eg.
const doSomethingAuthorRelated = (el) => console.log(el, 'whoohoo!');
const authors = premium.premium
.filter((el) => el === msg.author.id)
.map(doSomethingAuthorRelated);
As John Lonowski points out in the comment link, using for ... in on JavaScript arrays is not reliable, because it's designed to iterate over Object properties, so you can't really be sure what it's iterating over unless you've clearly defined the data and are working in an environment where you know no other library has mucked with the Array object.
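A short illustration of why for ... in can surprise you on arrays. The names forInKeys, ids and note are made up for this sketch; the point is only that for ... in visits every enumerable property key, not just the element indices.

```javascript
// Collect whatever keys for...in visits on an array.
function forInKeys(arr) {
  const keys = [];
  for (const key in arr) {
    keys.push(key);
  }
  return keys;
}

const ids = ["111", "222"];  // made-up IDs
ids.note = "not an element"; // an extra enumerable property on the array

console.log(forInKeys(ids));      // [ '0', '1', 'note' ]
console.log([...ids]);            // [ '111', '222' ] (for...of / spread see elements only)
console.log(ids.includes("222")); // true
```

Note also that the keys are strings ("0", "1"), not numbers, which is another reason to prefer for ... of or the Array methods when you only care about the elements.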

Elegant way to check if multiple strings are empty

How can I check if multiple strings are empty in an elegant way? This is how I currently do it:
//if one required field is empty, close the connection
if (registerRequest.Email == "") ||
(registerRequest.PhoneNumber == "")||
(registerRequest.NachName =="") ||
(registerRequest.VorName =="") ||
(registerRequest.Password =="") ||
(registerRequest.VerificationId ==""){
//Could not proceed
w.WriteHeader(UNABLE_TO_PROCEED)
w.Write([]byte("Unable to register account."))
return
}
Note: You may use the solution below if you keep the "is-valid" condition in your handler, and also if you separate your condition into another function or method.
You can create a simple helper function, which has a variadic parameter, and you can call it with any number of string values:
func containsEmpty(ss ...string) bool {
for _, s := range ss {
if s == "" {
return true
}
}
return false
}
Example using it:
if containsEmpty("one", "two", "") {
fmt.Println("One is empty!")
} else {
fmt.Println("All is non-empty.")
}
if containsEmpty("one", "two", "three") {
fmt.Println("One is empty!")
} else {
fmt.Println("All is non-empty.")
}
Output of the above (try it on the Go Playground):
One is empty!
All is non-empty.
Your example would look like this:
if containsEmpty(registerRequest.Email,
registerRequest.PhoneNumber,
registerRequest.NachName,
registerRequest.VorName,
registerRequest.Password,
registerRequest.VerificationId) {
// One of the listed strings is empty
}
Also, registerRequest is a rather long name; it could be shortened to something like r. If you can't or don't want to rename it in the surrounding code but still want to shorten the condition, and registerRequest is a pointer (or interface), you could write:
if r := registerRequest; containsEmpty(r.Email,
r.PhoneNumber,
r.NachName,
r.VorName,
r.Password,
r.VerificationId) {
// One of the listed strings is empty
}
Actually you can do this even if registerRequest is not a pointer, but then the struct will be copied. If registerRequest is a struct, then you can take its address to avoid having to copy it like this:
if r := &registerRequest; containsEmpty(r.Email,
r.PhoneNumber,
r.NachName,
r.VorName,
r.Password,
r.VerificationId) {
// One of the listed strings is empty
}
As Mario Santini mentioned in comment, a way to increase testability, encapsulate this logic, and decouple it from your handler method (which judging by the number of fields looks like it is at risk of changing at a different rate than your handler) could be to put this logic in a function:
func validRequest(registerRequest ?) bool {
// Valid only when every required field is non-empty.
return registerRequest.Email != "" &&
registerRequest.PhoneNumber != "" &&
registerRequest.NachName != "" &&
registerRequest.VorName != "" &&
registerRequest.Password != "" &&
registerRequest.VerificationId != ""
}
This now supports very focused, table-driven tests that can exercise what it means to be a valid request, independent of any method involving writing headers.
It allows you to verify the valid/invalid path of your enclosing function, but to have very focused tests here. It also allows you to change what it means to be a valid request and verify it independent of your enclosing function.
You can use a switch:
switch "" {
case registerRequest.Email,
registerRequest.NachName,
registerRequest.Password,
registerRequest.PhoneNumber,
registerRequest.VerificationId,
registerRequest.VorName:
w.WriteHeader(UNABLE_TO_PROCEED)
w.Write([]byte("Unable to register account."))
return
}
https://golang.org/ref/spec#Switch_statements

Map/Reduce differences between Couchbase & Cloudant

I've been playing around with Couchbase Server and now just tried replicating my local db to Cloudant, but am getting conflicting results for my map/reduce function pair to build a set of unique tags with their associated projects...
// map.js
function(doc) {
if (doc.tags) {
for(var t in doc.tags) {
emit(doc.tags[t], doc._id);
}
}
}
// reduce.js
function(key,values,rereduce) {
if (!rereduce) {
var res=[];
for(var v in values) {
res.push(values[v]);
}
return res;
} else {
return values.length;
}
}
In Couchbase Server this returns JSON like:
{"rows":[
{"key":"3d","value":["project1","project3","project8","project10"]},
{"key":"agents","value":["project2"]},
{"key":"fabrication","value":["project3","project5"]}
]}
That's exactly what I wanted & expected. However, the same query on the Cloudant replica, returns this:
{"rows":[
{"key":"3d","value":4},
{"key":"agents","value":1},
{"key":"fabrication","value":2}
]}
So it somehow only returns the length of the value array... Highly confusing & am grateful for any insights by some M&R ninjas... ;)
It looks like this is exactly the behavior you would expect given your reduce function. The key part is this:
else {
return values.length;
}
In Cloudant, rereduce is always called (since the reduce needs to span over multiple shards.) In this case, rereduce calls values.length, which will only return the length of the array.
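To see the effect, here is a rough simulation of a reduce followed by a rereduce. It is a sketch only: simulateShardedReduce and the way the values are split into shards are assumptions made for illustration (the real database decides how values are partitioned), and safeReduce is just one hypothetical way to make the rereduce step return the same shape as the reduce step.

```javascript
// The reduce function from the question, unchanged in behaviour.
function originalReduce(key, values, rereduce) {
  if (!rereduce) {
    return values.slice(); // list of project ids
  }
  return values.length;    // number of partial results, not the ids
}

// Hypothetical variant: rereduce concatenates the partial lists instead.
function safeReduce(key, values, rereduce) {
  if (!rereduce) {
    return values.slice();
  }
  return values.reduce((all, part) => all.concat(part), []);
}

// Assumed model: reduce each shard separately, then rereduce the partials.
function simulateShardedReduce(reduceFn, shards) {
  const partials = shards.map((values) => reduceFn(null, values, false));
  return reduceFn(null, partials, true);
}

// Made-up split of the "3d" values across three shards.
const shards = [["project1", "project3"], ["project8"], ["project10"]];

console.log(simulateShardedReduce(originalReduce, shards)); // 3
console.log(simulateShardedReduce(safeReduce, shards));
// [ 'project1', 'project3', 'project8', 'project10' ]
```

With the original function the final value depends on how many partial results reach the rereduce step, which is why a number comes back instead of the list.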
I prefer to reduce/re-reduce implicitly rather than depending on the rereduce parameter.
function(doc) { // map
if (doc.tags) {
for(var t in doc.tags) {
emit(doc.tags[t], {id:doc._id, tag:doc.tags[t]});
}
}
}
Then reduce checks whether it is accumulating document ids from the identical tag, or whether it is just counting different tags.
function(keys, vals, rereduce) {
var initial_tag = vals[0].tag;
return vals.reduce(function(state, val) {
if(initial_tag && val.tag === initial_tag) {
// Accumulate ids which produced this tag.
var ids = state.ids;
if(!ids)
ids = [ state.id ]; // Build initial list from the state's id.
return { tag: val.tag
, ids: ids.concat([val.id])
};
} else {
var state_count = state.ids ? state.ids.length : state;
var val_count = val.ids ? val.ids.length : val;
return state_count + val_count;
}
})
}
(I didn't test this code, but you get the idea. As long as the tag value is the same, it doesn't matter whether it's a reduce or rereduce. Once different tags start reducing together, it detects that because the tag value will change. So at that point it just starts counting.)
I have used this trick before, although IMO it's rarely worth it.
Also in your specific case, this is a dangerous reduce function. You are building a wide list to see all the docs that have a tag. CouchDB likes tall lists, not fat lists. If you want to see all the docs that have a tag, you could map them.
for(var a = 0; a < doc.tags.length; a++) {
emit(doc.tags[a], doc._id);
}
Now you can query /db/_design/app/_view/docs_by_tag?key="3d" and you should get
{"total_rows":287,"offset":30,"rows":[
{"id":"project1","key":"3d","value":"project1"},
{"id":"project3","key":"3d","value":"project3"},
{"id":"project8","key":"3d","value":"project8"},
{"id":"project10","key":"3d","value":"project10"}
]}
