Nested namespaces in Rcpp

I have a working R package with the necessary R/C++ code in place. My question here is more about good practice when using nested namespaces.
Currently, my package has: 1) cost functions, 2) optimization functions, and 3) parallel workers that call the optimization functions. At present the workers are defined in the .cpp files that contain the C++ functions exported to R. The optimization functions have their own namespace and header file, and the cost functions have their own header/namespace as well. All the optimization functions call a cost function, so they include the 'cost function' header and use the 'cost function' namespace.
As the .cpp files that run in parallel and call the workers are on the order of ~700 lines, I was thinking of moving the workers to a separate namespace and header file. These would include the 'optimization' header and use the 'optimization' namespace.
That got me thinking: do I really need 3 nested namespaces? It would be nice not to have 3 namespaces with only 2-4 functions in each of them. Additionally, the functions are/would always be called in the order parallel_worker/optimize_function/cost_function, each living in a different namespace/header.
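For reference, a minimal sketch of the layout described above - three small namespaces chained through their headers. All names here are hypothetical, not from the actual package:

    // cost.h -- hypothetical names, only sketching the layout described in the question
    #include <vector>

    namespace cost {
    inline double sum_sq(const std::vector<double>& x) {
      double s = 0.0;
      for (double v : x) s += v * v;
      return s;
    }
    }  // namespace cost

    // optim.h
    #include "cost.h"

    namespace optim {
    inline double optimize(std::vector<double> x) {
      // ...iterate, calling a cost function at each step...
      return cost::sum_sq(x);
    }
    }  // namespace optim

    // workers.h
    #include "optim.h"

    namespace workers {
    struct Worker {
      // In the real package this would be e.g. an RcppParallel-style worker
      double operator()(const std::vector<double>& x) const { return optim::optimize(x); }
    };
    }  // namespace workers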

Related

Can Terraform mention multiple module instances in one resource?

In Terraform is there some way to refer to the set of instances of a module?
I have several workloads that each need very similar (but still separate) infrastructure, and I also want to configure another item of infrastructure to be used in common between them.
For example, say each needs several pieces of infrastructure (AWS S3 bucket, SQS queue, IAM role, ...) but with mostly equivalent attributes. I want to achieve this without code duplication (e.g., by writing a Terraform module to be reused for each instance, with input variables for name prefixes and specific IAM policies).
Is there a Terraform syntax for then making a reference to all of those instances in a single resource, and with minimal boilerplate? (Maybe something analogous to a classmethod, to define a common resource to only be created once, even if multiple instances of that module get created?) For example, to make a shared kubernetes config-map that contains an index of the generated addresses (bucket names or SQS URLs), as a mechanism for passing those addresses to the containerised workloads that will use them? Another example might be setting up a single load balancer or DNS server with rules referring individually to every service from this group.
Or does the problem have to be approached in the other direction, by starting with a list of parameter sets, and somehow looping over that list to create the infrastructure? (Requiring every instance of this kind to be specified together in the same file?)
The Terraform terminology for modularity is a child module that is called from multiple places in the configuration. The call uses a module block (where parameter values are passed). Since modules are directories, the child module lives in its own directory, separate from the root module's own .tf files (commonly a subdirectory such as ./modules/..., or an external source). The calling module can access output values exported from the child module.
You can use a single module call to create multiple instances, by putting a for_each argument in the module block, and passing a map (or set) through this meta-argument. The other argument expressions in the block can use the each object to refer to the particular for_each element corresponding to the current instance, as Terraform iterates through them. From outside of the module block (elsewhere in the calling module), the output values can themselves be accessed like a map.
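A minimal sketch of that pattern; the module, variable, and attribute names here are all hypothetical:

    variable "workloads" {
      # One entry per workload; the object carries the per-instance settings.
      type = map(object({
        iam_policy = string
      }))
    }

    module "workload" {
      source   = "./modules/workload"   # the parameterised child module
      for_each = var.workloads

      name_prefix = each.key
      iam_policy  = each.value.iam_policy
    }

    # Elsewhere in the calling module, the instances' outputs behave like a map,
    # e.g. module.workload["analytics"].bucket_name (assuming the child module
    # declares a bucket_name output).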
You can use [for .. in .. : .. if ..] expressions to filter and transform maps. (Some transformations can also be performed more concisely by splat expressions.) Some resources (such as kubernetes_config_map) have arguments that take maps directly. (Also, some data or resources can accept a sequence of nested blocks, which can be generated from a map by using dynamic block syntax.)
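Continuing the hypothetical module.workload example, all of the instances' outputs can be folded into a single map with a for expression and passed to a resource that takes a map directly:

    resource "kubernetes_config_map" "workload_index" {
      metadata {
        name = "workload-index"
      }

      # workload name => bucket name, built from the module call's outputs
      data = { for name, w in module.workload : name => w.bucket_name }
    }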
Note: do not use the older count meta-argument as an alternative to for_each here. Because count identifies instances by their position in a list, there is a documented tendency for it to produce unintended destruction and recreation of other infrastructure if one element of the list has to be decommissioned. Similarly, passing a list-derived set, instead of a map, to for_each can make indexing more cumbersome.
Thus, for the OP's example, the approach would be to first create a parameterised child module defining the nearly-duplicated parts of the infrastructure (e.g., bucket, queue, role, etc.). Make sure the child module has either inputs or outputs for any aspect that needs customisation (for example, output a handle for the created bucket resource, or at least its auto-generated globally-unique name). Then have a module in your configuration that creates the whole collection of instances, via a single module block that sources the child module and uses for_each. Customisation of individual instances (e.g., some having additional policies or other specific infrastructure) can be achieved by a combination of the parameter sets initially passed to the call and supplementary resource blocks that each refer to the outputs of an individual instance. The outputs can also be referred to collectively, for example parsing the module call's outputs into a list of bucket names (or addresses) and then passing this list to another resource (e.g., a k8s config map). Again, this must be done from the calling module; the child module does not have access to a list of instances of itself.
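Still with hypothetical names, here are the two kinds of follow-on reference described above: a supplementary resource tied to a single instance, and a collective value gathered from all of them:

    variable "reporting_extra_policy_arn" {
      type = string
    }

    # Per-instance customisation: refer to one instance's outputs by its map key.
    resource "aws_iam_role_policy_attachment" "reporting_extra" {
      role       = module.workload["reporting"].role_name
      policy_arn = var.reporting_extra_policy_arn
    }

    # Collective reference: the same attribute from every instance, as a list.
    output "all_bucket_names" {
      value = [for w in module.workload : w.bucket_name]
    }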

Identify heavily called function

I am tracking down a performance problem with Procmon. The tool lists a lot of operation sequences like:
CreateFile
QueryInformationVolume
QueryAllInformationFile
CloseFile
All operations are performed on the same file somewhere in the ProgramData tree. QueryAllInformationFile fails with BUFFER OVERFLOW; the others succeed.
My first thought was that it could be related to a call to the API function GetVolumeInformation. But that API function rejects any call whose RootPathName is a file name rather than a drive name, so it cannot be what issues QueryInformationVolume for the file.
I have a huge amount of source code and want to identify the reason for this repeated sequence. The packages involved include, e.g., the MXE cross-compiler suite and some GLib-based libraries such as glibmm, glibio and others. The actual problem occurs when a program, "PulseView", that was compiled with MXE is running.
How can I identify the API function that is responsible for the operations?

Azure durable entity or static variables?

Question: is it thread-safe to use static variables (as shared storage between orchestrations), or is it better to save/retrieve the data via a durable entity?
There are a couple of Azure Functions in the same namespace: a hub trigger, a durable entity, two orchestrations (the main process and one that monitors the whole process), and an activity.
They all need some shared variables. In my case I need to know the number of main orchestration instances (to decide whether to start a new one or hold on). This is done in another orchestration (the monitor).
I've tried both options and am asking because I see different results.
Static variables: in my case there is a generic List<SomeMyType>, where SomeMyType holds the Id of the task, its state, number of attempts, records processed, and other info.
When I need to start a new orchestration I call List.Add(); when I need to retrieve and modify an entry I use a simple List.First(id_of_the_task). With First() I know for sure the needed task is there.
With static variables I sometimes see that tasks become duplicated for some reason - I retrieve the task with List.First(id_of_the_task), change something on the result variable, and that is it. Not a lot of code.
Durable entity: the major difference is that I keep the List on a durable entity, and each time I need it I call .CallEntityAsync("getTask") and .CallEntityAsync("saveTask"), which might slow down the app.
This approach requires more code and more calls; however, it looks more stable and I don't see any duplicates.
Please advise.
I can't answer why you would see duplicates with the static-variables approach without seeing the code; it may be because List is not thread-safe and you would need something like ConcurrentBag, but I'm not sure. One issue with a static variable is whether the function app is Always On and whether it can scale out to multiple instances: when the function host unloads (or crashes) the state is lost, and static variables are not shared across instances either, so under high load it won't work (if there can be many instances).
Durable entities seem better here. They can be shared across many concurrent function instances, and each entity executes only one operation at a time, so they are for sure the safer option. The performance cost is a bit higher, but they should not be slower than orchestrators, since both perform a lot of the same underlying operations: writing to Table Storage, checking for events, etc.
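As a rough illustration of the entity approach (a class-based Durable Functions entity; SomeMyType and the getTask/saveTask operation names mirror the question, everything else is illustrative rather than your actual code):

    // Illustrative only: a class-based durable entity holding the shared task list.
    using System.Collections.Generic;
    using System.Threading.Tasks;
    using Microsoft.Azure.WebJobs;
    using Microsoft.Azure.WebJobs.Extensions.DurableTask;
    using Newtonsoft.Json;

    public class SomeMyType
    {
        public string Id { get; set; }
        public string State { get; set; }
        public int Attempts { get; set; }
        public int RecordsProcessed { get; set; }
    }

    [JsonObject(MemberSerialization.OptIn)]
    public class TaskStore
    {
        [JsonProperty("tasks")]
        public Dictionary<string, SomeMyType> Tasks { get; set; } = new Dictionary<string, SomeMyType>();

        public void SaveTask(SomeMyType task) => Tasks[task.Id] = task;

        public SomeMyType GetTask(string id) =>
            Tasks.TryGetValue(id, out var task) ? task : null;

        [FunctionName(nameof(TaskStore))]
        public static Task Run([EntityTrigger] IDurableEntityContext ctx)
            => ctx.DispatchAsync<TaskStore>();
    }

    // From an orchestrator, all access goes through the entity, and the entity
    // processes operations one at a time, which is what prevents the duplicates:
    //   var entityId = new EntityId(nameof(TaskStore), "singleton");
    //   var task = await context.CallEntityAsync<SomeMyType>(entityId, "GetTask", taskId);
    //   task.Attempts++;
    //   await context.CallEntityAsync(entityId, "SaveTask", task);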
I can't say if it's right for you, but instead of List.First(id_of_the_task) you should just be able to access the orchestrators' properties through the client, which can hold custom data. Another idea, depending on the usage, is to query the underlying Table Storage directly with the CloudTable class for information about the running orchestrators.
Although not entirely related, you can also look at the settings for parallelism in durable functions: Azure (Durable) Functions - Managing parallelism.
Please ask any questions if I should clarify anything or if I misunderstood your question.

Is there a way to declare a class inside a try and catch block in jscript?

I want to use the System.Threading methods provided by .NET to create two test conditions that must happen simultaneously.
Using this solution to put the function in a class, I was able to create two threads and run them in parallel with the jsc compiler.
But the environment I am writing my test scripts in places my entire code in a try block before compiling, which throws the error:
JS1109: Class definition not allowed in this context.
Is there any way to make multithreading possible without the use of classes in JScript?

File read access from threads

I have a static class that contains a number of functions that read values from configuration files. The configuration files are provided with the software and the software itself NEVER writes to them.
I have a number of threads that are running in my application and I need to call a function in the static class. The function will then go to one of the configuration files, look up a value (depending on the parameter that I pass when I call the function) and then return a result.
I need the threads to be able to read the file all at the same time (or rather, without synchronising to the main thread). The threads will NEVER write to the configuration files.
My question is simply, therefore, will there be any issues in allowing multiple threads to call the same static functions to read values from the same file at the same time? I can appreciate that there would be serialization issues if some threads were writing to the file while others were reading, but this will never happen.
Basically:
1. Are there any issues allowing multiple threads to read from the same file at the same time?
2. Are there any issues allowing multiple threads to call the same static functions (in the same static class) at the same time?
Yes, this CAN be an issue, depending on how the class is actually locating and reading from the files, and more so if the class is also caching the values so it does not need to read from the files every time. Without seeing your class's actual code, there is no way to tell you whether your code is thread-safe or not.
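To make the distinction concrete, here is a rough sketch of a reader that is safe for concurrent read-only callers, assuming C#/.NET (which the phrase "static class" suggests) and an entirely hypothetical key=value file format and class name:

    using System.Collections.Concurrent;
    using System.IO;

    public static class ConfigReader
    {
        // Thread-safe cache, so repeated lookups don't re-read the file.
        private static readonly ConcurrentDictionary<string, string> Cache =
            new ConcurrentDictionary<string, string>();

        public static string GetValue(string filePath, string key)
        {
            return Cache.GetOrAdd(filePath + "|" + key, _ =>
            {
                // Each call opens its own read-only stream; FileShare.Read lets any
                // number of threads have the file open at the same time, which is
                // fine as long as nothing ever writes to it.
                using (var stream = new FileStream(filePath, FileMode.Open,
                                                   FileAccess.Read, FileShare.Read))
                using (var reader = new StreamReader(stream))
                {
                    string line;
                    while ((line = reader.ReadLine()) != null)
                    {
                        var parts = line.Split(new[] { '=' }, 2);
                        if (parts.Length == 2 && parts[0].Trim() == key)
                            return parts[1].Trim();
                    }
                }
                return null;
            });
        }
    }

Trouble starts only if the class keeps mutable shared state another way (say, a plain Dictionary cache populated lazily without locking), which is why the answer hinges on the actual implementation.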
