I'm new to OpenMDAO and started off with the newest version (version 2.3.1 at the time of this post).
I'm working on the setup to a fairly complicated aero-structural optimization using several external codes, specifically NASTRAN and several executables (compiled C++) that post process NASTRAN results.
Ideally I would like to break these down into multiple components to generate my model, run NASTRAN, post process the results, and then extract my objective and constraints from text files. All of my existing interfaces are through text file inputs and outputs. According to the GitHub page, the file variable feature that existed in an old version (v1.7.4) has not yet been implemented in version 2.
https://github.com/OpenMDAO/OpenMDAO
Is there a good workaround for this until the feature is added?
So far the best solution I've come up with is to group everything into one large component that maps input variables to final outputs by running everything internally, instead of using multiple smaller components that break up the process.
Thanks!
File variables themselves are no longer implemented in OpenMDAO. They caused a lot of headaches and didn't fundamentally offer useful functionality, because they required serializing the whole file into memory and passing it around as string buffers. The whole process was duplicative and inefficient, since the files ultimately got written to and read from disk far more times than necessary.
In your case, since you're setting up an aerostructural problem, you really wouldn't want to use them anyway. You will want access to either analytic or at least semi-analytic total derivatives for efficient execution. That means the boundary of each component must be composed of only floating-point variables or arrays of floating-point variables.
What you want to do is wrap your analysis tools using ExternalCodeImplicitComp, which tells OpenMDAO that the underlying analysis is actually implicit. Then, even if you use finite differences to compute the partial derivatives, you only need to FD across the residual evaluation. For NASTRAN this might be a bit tricky to set up, since I don't know if it directly exposes the residual evaluation, but if you can get to the stiffness matrix then you should be able to compute it. You'll be rewarded for your efforts with greatly improved efficiency and accuracy.
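As a rough illustration, here is a minimal sketch of such a wrapper. The executable names, file names, array sizes, and helper functions are hypothetical placeholders invented for the example, not part of OpenMDAO; only the ExternalCodeImplicitComp options and method overrides come from the library:

    import numpy as np
    import openmdao.api as om


    def write_input_file(path, loads, disps):
        # Placeholder text-file format: loads then displacements, one per line.
        np.savetxt(path, np.concatenate([loads, disps]))


    def read_resid_file(path):
        return np.loadtxt(path)


    class StructSolverComp(om.ExternalCodeImplicitComp):
        """Wraps a hypothetical external structural solver that
        communicates only through text files."""

        def setup(self):
            self.add_input('loads', shape=10)
            self.add_output('displacements', shape=10)

            self.options['external_input_files'] = ['input.dat']
            self.options['external_output_files'] = ['resid.dat']

            # command_apply evaluates residuals only; command_solve runs the
            # full solve. Both executables here are made up for the example.
            self.options['command_apply'] = ['eval_resid', 'input.dat', 'resid.dat']
            self.options['command_solve'] = ['solve', 'input.dat', 'resid.dat']

            # FD only across the (relatively cheap) residual evaluation.
            self.declare_partials(of='*', wrt='*', method='fd')

        def apply_nonlinear(self, inputs, outputs, residuals):
            # Write the current state, run the residual evaluation,
            # then pull the residuals back out of the output file.
            write_input_file('input.dat', inputs['loads'], outputs['displacements'])
            super().apply_nonlinear(inputs, outputs, residuals)
            residuals['displacements'] = read_resid_file('resid.dat')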
Inside each wrapper, you can use the built-in file-wrapping tools to read through the files that were written and pull out the numerical values, which you then push into the outputs vector. For NASTRAN you might consider using pyNASTRAN instead of the file-wrapping tools, to save yourself some work.
If you can't expose the residual evaluation, then you can use ExternalCodeComp instead and treat the analysis as if it were explicit. This will make your FD more costly and less accurate, but for linear analyses you should be OK (still not ideal, but better than nothing).
The key idea here is that you're not asking OpenMDAO to pass around file objects. You are wrapping each component with only numerical data at its boundaries. This has the advantage of allowing OpenMDAO's automatic derivatives features to work (even if you use FD to compute the partial derivatives). It also has a secondary advantage that if you (hopefully) graduate to in-memory wrappers for your codes then you won't have to update your models. Only the component's internal code will change.
I am using Python's multiprocessing module. I have a networkx graph which I wish to share between many subprocesses. These subprocesses do not modify the graph in any way and only read its attributes (nodes, edges, etc.). Right now every subprocess has its own copy of the graph, but I am looking for a way to share the graph between all of them, which will reduce the memory footprint of the entire program. Since the computations are very CPU-intensive, I would want this to be done in a way that does not cause big performance issues (avoiding locks if possible, etc.).
Note: I want this to work on various operating systems, including Windows, which means copy-on-write (COW) does not help (and if I understand correctly, it probably wouldn't have helped regardless, due to reference counting).
I found https://docs.python.org/3/library/multiprocessing.html#proxy-objects and
https://docs.python.org/3/library/multiprocessing.shared_memory.html, but I'm not sure which (or if either) is suitable. What is the right way to go about this? I'm using Python 3.8, but can use later versions if helpful.
There are a few options for sharing data in Python during multiprocessing, but you may not be able to do exactly what you want.
In C++ you could use simple shared memory for ints, floats, structs, etc. Python's shared memory manager does allow this type of sharing for simple objects, but it doesn't work for classes or anything more complex than a list of base types. For shared complex Python objects, you really only have a few choices:
1. Create a copy of the object in your forked process (which it sounds like you don't want to do).
2. Put the object in a centralized process (i.e. Python's Manager / proxy objects) and interact with it via pipes and pickled data.
3. Convert your networkx graph to a list of simple ints and put it in shared memory.
What works for you is going to depend on some specifics. Option #2 has a bit of overhead because every time you need to access the object, data has to be pickled and piped to the centralized process and the result pickled/piped for return. This works well if you only need a small portion of the centralized data at a time and your processing steps are relatively long (compared to the pickle/pipe time).
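As a minimal sketch of option #2, using only the standard library (the adjacency data here is a stand-in dictionary, not a real networkx graph):

    from multiprocessing import Manager, Process


    def worker(shared):
        # Each access pickles the request, pipes it to the manager
        # process, and pickles/pipes the result back.
        adj = shared['adj']          # fetches the whole inner dict
        print('neighbors of 0:', adj[0])


    if __name__ == '__main__':
        with Manager() as m:
            shared = m.dict()
            # Stand-in adjacency data; a real graph would be much larger.
            shared['adj'] = {0: [1, 2], 1: [0], 2: [0]}
            p = Process(target=worker, args=(shared,))
            p.start()
            p.join()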
Option #3 could be a lot of work. You would fundamentally be changing the data format from networkX objects to a list of ints, so it's going to change the way you do your processing a lot.
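Here is a minimal sketch of option #3, assuming integer node labels, which shares only the edge list as a NumPy array in shared memory (requires Python 3.8+):

    import numpy as np
    import networkx as nx
    from multiprocessing import Process, shared_memory


    def worker(shm_name, shape, dtype):
        # Attach to the existing block; no copy of the edge data is made.
        shm = shared_memory.SharedMemory(name=shm_name)
        edges = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
        print('edges seen by worker:', len(edges))
        shm.close()


    if __name__ == '__main__':
        g = nx.gnm_random_graph(1000, 5000)
        # One-time copy of the edge list into shared memory.
        edges = np.array(list(g.edges()), dtype=np.int64)
        shm = shared_memory.SharedMemory(create=True, size=edges.nbytes)
        shared = np.ndarray(edges.shape, dtype=edges.dtype, buffer=shm.buf)
        shared[:] = edges
        p = Process(target=worker, args=(shm.name, edges.shape, edges.dtype))
        p.start()
        p.join()
        shm.close()
        shm.unlink()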
A while back I put together PythonDataServe, which allows you to serve your data to multiple processes from another process. It's a very similar solution to #2 above. This type of approach works if you only need a small portion of the data at a time, but if you need it all, it's much easier to just create a local copy.
I am working on an obfuscated binary as part of a crackme challenge. It has a sequence of push, pop, and nop instructions (which repeats thousands of times). Functionally, these chunks have no effect on the program, but they make CFG generation and the process of reversing very hard.
There are solutions for changing the instructions to nops so that I can remove them. But in my case, I would like to completely strip out those instructions so that I can get a better view of the CFG. If instructions are stripped out, I understand that the memory offsets must be modified too. As far as I could see, there were no tools available to achieve this directly.
I am using IDA Pro evaluation version. I am open to solutions using other reverse engineering frameworks too. It is preferable, if it is scriptable.
I went through a similar question, but the proposed solution is not applicable in my case.
I would like to completely strip off those instructions ... I understand that the memory offsets must be modified too ...
In general, this is practically impossible:
If the binary exports any dynamic symbols, you would have to update the .dynsym (these are probably the offsets you are thinking of).
You would have to find every statically-assigned function pointer, and update it with the new address, but there is no effective way to find such pointers.
Computed GOTOs and switch statements create function pointer tables even when none are present in the program source.
As Peter Cordes pointed out, it's possible to write programs that use the delta between two assembly labels, and use such deltas (small immediate values directly encoded into instructions) to control program flow.
It's possible that your target program is free from all of the above complications, but spending much effort on a technique that only works for that one program seems wasteful.
I need to load a mid-sized XML file into memory, make many random access modifications to the file (perhaps hundreds of thousands), then write the result out to STDIO. Most of these modifications will be node insertion/deletions, as well as character insertion/deletions within the text nodes. These XML files will be small enough to fit into memory, but large enough that I won't want to keep multiple copies around.
I am trying to settle on the architecture/libraries and am looking for suggestions.
Here is what I have come up with so far-
I am looking for the ideal XML library for this, and so far I haven't found anything that seems to fit the bill. The libraries generally store nodes in Haskell lists and text in Haskell Data.Text objects. This only allows linear-time node and text inserts, and I believe the text inserts will have to do a full rewrite on every insert/delete.
I think storing both nodes and text in sequences seems to be the way to go: it supports O(log n) inserts and deletes, and only needs to rewrite a small fraction of the tree on each alteration. None of the XML libs are based on this, though, so I will have to either write my own lib, or use one of the other libs to parse and then convert to my own form (given how easy it is to parse XML, I would almost just as soon do the former, rather than have a shadow parse of everything).
I had briefly considered the possibility that this might be a rare case where Haskell might not be the best tool.... But then I realized that mutability doesn't offer much of an advantage here, because my modifications aren't char replacements, but rather add/deletes. If I wrote this in C, I would still need to store the strings/nodes in some sort of tree structure to avoid large byte moves for each insert/delete. (Actually, Haskell probably has some of the best tools to deal with this, but I would be open to suggestions of a better choice of language for this task if you feel there is one).
To summarize-
Is Haskell the right choice for this?
Does any Haskell lib support fast (O(log n)) node/text inserts and deletes?
Is Data.Sequence the best data structure to store a list of items (in my case, nodes and chars) for fast inserts and deletes?
I will answer my own question-
I chose to wrap a Text.XML tree with a custom object that stores nodes and text in Data.Sequence objects. Because Haskell is lazy, I believe it only temporarily holds the Text.XML data in memory, node by node, as the data streams in; it is then garbage collected before I actually start any real work modifying the Sequence trees.
(It would be nice if someone here could verify that this is how Haskell works internally, but I've implemented things and the performance seems reasonable: not great, about 30k insert/deletes per second, but this should do.)
Today I read that there is software called WinCalibra (scroll a bit down) which can take a text file with properties as input.
This program can then optimize the input properties based on the output values of your algorithm. See this paper or the user documentation for more information (see link above; sadly, the doc is a zipped exe).
Do you know other software which can do the same and runs under Linux? (preferably Open Source)
EDIT: Since I need this for a Java application: should I invest my research in Java libraries like gaul or watchmaker? The problem is that I don't want to roll my own solution, nor do I have time to do so. Do you have pointers to out-of-the-box applications like Calibra? (Internet searches weren't successful; I only found libraries.)
I decided to give away the bounty (otherwise no one would have a benefit), although I didn't find a satisfactory solution :-( (an out-of-the-box application)
Some kind of probability-selected random walk (Metropolis-algorithm-like) is a possibility in this instance, perhaps with simulated annealing to improve the final selection, though the timing parameters you've supplied are not optimal for getting a really great result this way.
It works like this:
1. You start at some point. Use your existing data to pick one that looks promising (like the highest value you've got). Set o to the output value at this point.
2. Propose a randomly selected step in the input space, and assign the output value there to n.
3. Accept the step (that is, update the working position) if 1) n > o, or 2) the new value is lower but a random number on [0,1) is less than f(n/o), for some monotonically increasing f() with range and domain on [0,1).
4. Repeat steps 2 and 3 as long as you can afford, collecting statistics at each step.
5. Finally, compute the result. In your case an average of all points is probably sufficient.
Important frill: This approach has trouble if the space has many local maxima with deep dips between them, unless the step size is big enough to get past the dips; but big steps make the whole thing slow to converge. To fix this you do two things:
Do simulated annealing (start with a large step size and gradually reduce it, thus allowing the walker to move between local maxima early on, but trapping it in one region later to accumulate precise results).
Use several (many, if you can afford it) independent walkers, so that they can get trapped in different local maxima. The more you use, and the bigger the difference in output values, the more likely you are to find the best maximum.
This is not necessary if you know that you only have one, big, broad, nicely behaved local extreme.
Finally, the selection of f(). You can just use f(x) = x, but you'll get optimal convergence if you use f(x) = exp(-(1/x)).
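For concreteness, here is a minimal Python sketch of the walk described above; the toy objective, Gaussian step proposal, and annealing schedule are placeholder assumptions, not part of the original description:

    import math
    import random


    def metropolis_walk(objective, start, n_steps=10000, step0=1.0, anneal=0.999):
        """Maximize `objective` with a Metropolis-style random walk.

        The step size starts at step0 and shrinks by `anneal` each
        iteration (simulated annealing). Assumes positive output values,
        since the acceptance rule uses f(n/o) = exp(-(o/n)).
        """
        x, o = list(start), objective(start)
        best_x, best_o = x, o
        step = step0
        for _ in range(n_steps):
            # Propose a random step in the input space.
            cand = [xi + random.gauss(0.0, step) for xi in x]
            n = objective(cand)
            # Accept if better, or probabilistically if worse
            # (guard n > 0 so f(n/o) stays well defined).
            if n > o or (n > 0 and random.random() < math.exp(-(o / n))):
                x, o = cand, n
                if o > best_o:
                    best_x, best_o = list(x), o
            step *= anneal
        return best_x, best_o


    if __name__ == '__main__':
        # Toy stand-in for the external algorithm: a bump peaked at (1, 2).
        f = lambda p: math.exp(-((p[0] - 1) ** 2 + (p[1] - 2) ** 2))
        print(metropolis_walk(f, [0.0, 0.0]))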
Again, you don't have enough time for a great many steps (though if you have multiple computers, you can run separate instances to get the multiple walkers effect, which will help), so you might be better off with some kind of deterministic approach. But that is not a subject I know enough about to offer any advice.
There is a lot of genetic-algorithm-based software that can do exactly that. I wrote a PhD about it a decade or two ago.
A Google search for Genetic Algorithms Linux shows a load of starting points.
Intrigued by the question, I did a bit of poking around, trying to get a better understanding of the nature of CALIBRA, its standing in academic circles, and the existence of similar software or projects in the Open Source and Linux world.
Please be kind (and, please, edit directly, or suggest edits) for the likely instances where my assertions are incomplete, inexact, or even flat-out incorrect. While working in related fields, I'm by no means an Operational Research (OR) authority!
The [algorithm] parameter tuning problem is a relatively well defined problem, typically framed as a solution-search problem whereby the combination of all possible parameter values constitutes a solution space, and the parameter-tuning logic's aim is to "navigate" [portions of] this space in search of an optimal (or locally optimal) set of parameters.
The optimality of a given solution is measured in various ways, and such metrics help direct the search. In the case of the parameter tuning problem, the validity of a given solution is measured, directly or through a function, from the output of the algorithm [i.e. the algorithm being tuned, not the algorithm of the tuning logic!].
Framed as a search problem, the discipline of algorithm parameter tuning doesn't differ significantly from other solution-search problems where the solution space is defined by something other than the parameters to a given algorithm. But because it works on algorithms which are themselves solutions of sorts, this discipline is sometimes referred to as metaheuristics or metasearch. (A metaheuristic approach can be applied to various algorithms.)
Certainly there are many specific features of the parameter tuning problem as compared to other optimization applications, but with regard to the solution searching per se, the approaches and problems are generally the same.
Indeed, while well defined, the search problem is generally still broadly unsolved, and is the object of active research in very many different directions, for many different domains. Various approaches offer mixed success depending on the specific conditions and requirements of the domain, and this vibrant and diverse mix of academic research and practical applications is a common trait to Metaheuristics and to Optimization at large.
So... back to CALIBRA...
From its own authors' admission, CALIBRA has several limitations:
Limit of 5 parameters, maximum
Requirement of a range of values for [some of?] the parameters
Works better when the parameters are relatively independent (but... wait, when that is the case, isn't the whole search problem much easier? ;-) )
CALIBRA is based on a combination of approaches, which are repeated in sequence: a mix of guided search and local optimization.
The paper where CALIBRA was presented is dated 2006. Since then, there's been relatively few references to this paper and to CALIBRA at large. Its two authors have since published several other papers in various disciplines related to Operational Research (OR).
This may be indicative that CALIBRA hasn't been perceived as a breakthrough.
State of the art in that area ("parameter tuning", "algorithm configuration") is the SPOT package in R. You can connect external fitness functions using a language of your choice. It is really powerful.
I am working on adapters for e.g. C++ and Java that simplify the experimental setup, which requires some getting used to in SPOT. The project goes under the name InPUT, and a first version of the tuning part will be up soon.
So I'm currently working on a new programming language. Inspired by ideas from concurrent programming and Haskell, one of the primary goals of the language is management of side effects. More or less, each module will be required to specify which side effects it allows. So, if I were making a game, the graphics module would have no ability to do IO. The input module would have no ability to draw to the screen. The AI module would be required to be totally pure. Scripts and plugins for the game would have access to a very restricted subset of IO for reading configuration files. Et cetera.
However, what constitutes a side effect isn't clear-cut. I'm looking for any thoughts or suggestions on the subject that I might want to consider in my language. Here are my current thoughts.
Some side effects are blatant. Whether it's printing to the user's console or launching your missiles, any action that reads from or writes to a user-owned file or interacts with external hardware is a side effect.
Others are more subtle and these are the ones I'm really interested in. These would be things like getting a random number, getting the system time, sleeping a thread, implementing software transactional memory, or even something very fundamental such as allocating memory.
Unlike other languages built to control side effects (looking at you Haskell), I want to design my language to be pragmatic and practical. The restrictions on side effects should serve two purposes:
To aid in the separation of concerns. (No one module can do everything.)
To sandbox each module in the application. (Any module could be used as a plugin.)
With that in mind, how should I handle "pseudo"-side effects, like random numbers and sleeping, as I mention above? What else might I have missed? In what ways might I manage memory usage and time as resources?
The problem of how to describe and control effects is currently occupying some of the best scientific minds in programming languages, including people like Greg Morrisett of Harvard University. To my knowledge, the most ambitious pioneering work in this area was done by David Gifford and Pierre Jouvelot in the FX programming language started in 1987. The language definition is online, but you may get more insight into the ideas by reading their 1991 POPL paper.
This is a really interesting question, and it represents one of the stages I've gone through and, frankly, moved beyond.
I remember seminars in which Carl Hewitt, in talking about his Actors formalism, discussed this. He defined it in terms of a method giving a response that was solely a function of its arguments, or that could give different answers at different times.
I say I moved beyond this because it makes the language itself (or the computational model) the main subject, as opposed to the problem(s) it is supposed to solve. It is based on the idea that the language should have a formal underlying model so that its properties are easy to verify. That is fine, but still remains a distant goal, because there is still no language (to my knowledge) in which the correctness of something as simple as bubble sort is easy to prove, let alone more complex systems.
The above is a fine goal, but the direction I went was to look at information systems in terms of information theory. Specifically, assuming a system starts with a corpus of requirements (on paper or in somebody's head), those requirements can be transmitted to a program-writing machine (whether automatic or human) to generate source code for a working implementation. THEN, as changes occur to the requirements, the changes are processed through as delta changes to the implementation source code.
Then the question is: What properties of the source code (and the language it is encoded in) facilitate this process? Clearly it depends on the type of problem being solved, what kinds of information go in and out (and when), how long the information has to be retained, and what kind of processing needs to be done on it. From this one can determine the formal level of the language needed for that problem.
I realized the process of cranking through delta changes of requirements to source code is made easier as the format of the code comes more to resemble the requirements, and there is a nice quantitative way to measure this resemblance: not in terms of superficial resemblance, but in terms of editing actions. The well-known technology that best expresses this is domain-specific languages (DSLs). So I came to realize that what I look for most in a general-purpose language is the ability to create special-purpose languages.
Depending on the application, such special-purpose languages may or may not need specific formal features like functional notation, side-effect control, parallelism, etc. In fact, there are many ways to make a special-purpose language, from parsing, interpreting, and compiling, down to just macros in an existing language, down to simply defining classes, variables, and methods in an existing language. As soon as you declare a variable or subroutine, you've created new vocabulary and thus a new language in which to solve your problem. In fact, in this broad sense, I don't think you can solve any programming problem without being, at some level, a language designer.
So best of luck, and I hope it opens up new vistas for you.
A side effect is any effect on the world other than returning a value, i.e. mutating something that could be visible in some way outside the function.
A pure function neither depends on nor affects any mutable state outside the scope of that invocation of the function, which means that the function's output depends only on constants and its inputs. This implies that if you call a function twice with the same arguments, you are guaranteed to get the same result both times, regardless of how the function is written.
If you have a function that modifies a variable that it has been passed, that modification is a side effect because it's visible output from the function other than the return value. A void function that is not a no-op must have side effects, because it has no other way of affecting the world.
The function could have a private variable only visible to that function that it reads and modifies, and calling it would still have the side effect of changing the way the function behaves in the future. Being pure means having exactly one channel for output of any kind: the return value.
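To make the distinction concrete, here is a small illustrative sketch in Python (the function names and hidden counter are made up for the example):

    # Impure: mutates its argument and keeps hidden state between calls.
    _call_count = 0

    def impure_append(items, value):
        global _call_count
        _call_count += 1        # hidden channel: behavior visible outside
        items.append(value)     # mutates the caller's list: a side effect
        return len(items)

    # Pure counterpart: output depends only on the inputs, and the only
    # output channel is the return value.
    def pure_append(items, value):
        return items + [value]  # builds a new list; caller's list untouched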
It is possible to generate random numbers purely, but you have to pass the random seed around manually. Most random functions keep a private seed value that is updated each time they're called, so that you get a different random number each time. Here's a Haskell snippet using System.Random:
    randomColor :: StdGen -> (Color, Int, StdGen)
    randomColor gen1 = (color, intensity, gen3)  -- return the final generator
      where
        (color,     gen2) = random gen1          -- assumes Color has a Random instance
        (intensity, gen3) = randomR (1, 100) gen2
The random functions each return the randomized value and a new generator with a new seed (based on the previous one). To get a new value each time, the chain of new generators (gen1, gen2, gen3) has to be passed along. Implicit generators just use an internal variable to store the gen1... values in the background.
Doing this manually is a pain, and in Haskell you can use a state monad to make it a lot easier. In your language, you'll want to implement something less pure or use a facility like monads, arrows, or uniqueness values to abstract it away.
Getting the system time is impure because the time could be different each time you ask.
Sleeping is fuzzier, because sleep doesn't affect the result of the function, and you could always delay execution with a busy loop, which wouldn't affect purity. The thing is that sleeping is done for the sake of something else, which IS a side effect.
Memory allocation in pure languages has to happen implicitly, because explicitly allocating and freeing memory are side effects if you can do any kind of pointer comparison. Otherwise, creating two new objects with the same parameters would produce observably different values, because they would have different identities (e.g. they would not be equal by Java's == operator).
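In Python terms (where `is` compares identity, analogous to Java's `==` on object references), the observable difference looks like this:

    a = [1, 2, 3]
    b = [1, 2, 3]
    print(a == b)  # True: same contents
    print(a is b)  # False: two allocations, two distinct identities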
I know I've rambled on a bit, but hopefully that explains what side effects are.
Give a serious look to Clojure and its use of software transactional memory, agents, and atoms to keep side effects under control.