Programmatically detect Terraform input and/or output variables

I would like to automate some Terraform documentation and CI/CD checks related to input variables. Is there any way to do one or more of the following:
detect what input variables a specific module will take
detect what output variables a specific module can generate
detect the data type and description fields of the above (when applicable)
If not possible, I guess I will have to resort to regex parsing of all files in a module folder - but this seems like brute force, and far from ideal.
Any ideas?

I have had a good bit of success with the terraform-docs open-source tool. You essentially point it at your module and it generates fairly standard-looking docs in the format you choose.
This tool can also output JSON if you'd like a raw tree of data to process yourself.
If you're looking for something a little more "low level", you could also look into the library that powers terraform-docs: terraform-config-inspect.
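For instance, here is a minimal sketch of a CI check that shells out to terraform-docs and reads its JSON output. It assumes terraform-docs is on PATH; the exact JSON field names have varied between terraform-docs versions, so treat the keys below as illustrative rather than definitive:

    import json
    import subprocess

    def module_interface(module_dir):
        # Field names ('inputs', 'name', 'type', 'description') follow
        # recent terraform-docs JSON output and may differ by version.
        raw = subprocess.check_output(['terraform-docs', 'json', module_dir])
        doc = json.loads(raw)
        inputs = {v['name']: (v.get('type'), v.get('description'))
                  for v in doc.get('inputs', [])}
        outputs = {o['name']: o.get('description')
                   for o in doc.get('outputs', [])}
        return inputs, outputs

    if __name__ == '__main__':
        inputs, outputs = module_interface('./my-module')
        for name, (var_type, description) in inputs.items():
            print('{} ({}): {}'.format(name, var_type, description))

From there it is straightforward to fail the build when, say, a variable is missing a description.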

Related

Create a C and C++ preprocessor using ANTLR

I want to create a tool that can analyze C and C++ code and detect unwanted behaviors, based on a config file. I thought about using ANTLR for this task, as I already created a simple compiler with it from scratch a few years ago (variables, conditions, loops, and functions).
I grabbed C.g4 and CPP14.g4 from the ANTLR grammars repository. However, I came to notice that they don't handle preprocessing, as that's a separate step in the compilation.
I tried to find a grammar that does the preprocessing part (updated to ANTLR4), with no luck. Moreover, I also understood that if I go with two-step parsing I won't be able to retain the original location of each character, as I'd have already modified the input stream.
I wonder if there's a good ANTLR grammar or program (preferably in Python, but I can deal with other languages as well) that can help me preprocess the C code. I also thought about using gcc -E, but then I wouldn't be able to inspect the macro definitions. For example, I want to warn if a user used a #pragma GCC directive (some students at my university, for whom I'm writing this program, used this to bypass some of the course's coding-style restrictions). Moreover, gcc -E will include library header contents, which I don't want to process.
My question, therefore, is whether you can recommend a grammar or program that I can use to preprocess C and C++ code. Alternatively, if you can guide me on how to create such a grammar myself, that'd be perfect. I was able to write the basic #define, #pragma, etc. processing, but I'm unable to deal with conditionals and function-like macros, as I'm unsure how to handle them.
Thanks in advance!
This question is almost off-topic, as it asks for an external resource. However, it also contains a part that deserves some attention.
The term "preprocessor" already indicates what the handling of macros etc. is about. The parser never sees the disabled parts of the input, which means those parts can contain anything, including text that isn't valid in the language being parsed. Hence a good approach for parsing C-like languages is to send the input through a preprocessor (which can be a specialized input stream) to strip out all preprocessing constructs, resolve macros, and remove disabled text. The parse position is not a problem: push the current token position before you open a new input stream and restore it when you are done with that stream. Store reported errors together with your input-stream stack. This way you keep the correct token positions. I have used exactly this approach in my Windows resource file parser.

OpenMDAO version 2.x File Variable Workaround

I'm new to OpenMDAO and started off with the newest version (version 2.3.1 at the time of this post).
I'm working on the setup of a fairly complicated aero-structural optimization using several external codes, specifically NASTRAN and several executables (compiled C++) that post-process NASTRAN results.
Ideally I would like to break these down into multiple components that generate my model, run NASTRAN, post-process the results, and then extract my objective and constraints from text files. All of my existing interfaces are through text-file inputs and outputs. According to the GitHub page, the file variable feature that existed in an old version (v1.7.4) has not yet been implemented in version 2.
https://github.com/OpenMDAO/OpenMDAO
Is there a good workaround for this until the feature is added?
So far the best solution I've come up with is to group everything into one large component that maps input variables to the final outputs by running the whole chain, instead of using multiple smaller components that break up the process.
Thanks!
File variables themselves are no longer implemented in OpenMDAO. They caused a lot of headaches and didn't fundamentally offer useful functionality, because they required serializing the whole file into memory and passing it around as string buffers. The whole process was duplicative and inefficient, since the files were ultimately getting written to and read from disk far more times than necessary.
In your case, since you're setting up an aerostructural problem, you really wouldn't want to use them anyway. You will want access to either analytic or at least semi-analytic total derivatives for efficient execution, which means that the boundary of each component must be composed of only floating-point variables or arrays of floating-point variables.
What you want to do is wrap your analysis tools using ExternalCodeImplicitComp, which tells OpenMDAO that the underlying analysis is actually implicit. Then, even if you use finite differences to compute the partial derivatives, you only need to FD across the residual evaluation. For NASTRAN this might be a bit tricky to set up, since I don't know if it directly exposes the residual evaluation, but if you can get to the stiffness matrix then you should be able to compute it. You'll be rewarded for your efforts with greatly improved efficiency and accuracy.
Inside each wrapper, you can use the built-in file-wrapping tools to read through the files that were written and pull out the numerical values, which you then push into the outputs vector. For NASTRAN you might consider using pyNASTRAN, instead of the file-wrapping tools, to save yourself some work.
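As a rough illustration, a wrapper might look like the sketch below. ExternalCodeImplicitComp and its options are real OpenMDAO API, but the variable names, file names, the two commands, and the toy reader/writer are placeholders for an actual NASTRAN setup:

    import numpy as np
    import openmdao.api as om

    def write_input_file(path, inputs, outputs=None):
        # Stand-in for your real deck writer (or pyNASTRAN); a real
        # residual evaluation also needs the current state written out.
        with open(path, 'w') as f:
            f.write(' '.join(str(v) for v in inputs['loads']))
            if outputs is not None:
                f.write('\n' + ' '.join(str(v) for v in outputs['displacements']))

    def read_vector(path):
        # Stand-in for your real output parser (or pyNASTRAN).
        with open(path) as f:
            return np.array([float(tok) for tok in f.read().split()])

    class StructuralSolver(om.ExternalCodeImplicitComp):

        def setup(self):
            self.add_input('loads', shape=10)
            self.add_output('displacements', shape=10)

            self.options['external_input_files'] = ['model.inp']
            self.options['external_output_files'] = ['results.out']
            # command_solve runs the analysis to convergence;
            # command_apply only evaluates the residual R(u) at the
            # current state -- the cheap piece you FD across.
            self.options['command_solve'] = ['run_solver', 'model.inp']
            self.options['command_apply'] = ['eval_residual', 'model.inp']

            self.declare_partials(of='*', wrt='*', method='fd')

        def apply_nonlinear(self, inputs, outputs, residuals):
            write_input_file('model.inp', inputs, outputs)
            super().apply_nonlinear(inputs, outputs, residuals)  # runs command_apply
            residuals['displacements'] = read_vector('results.out')

        def solve_nonlinear(self, inputs, outputs):
            write_input_file('model.inp', inputs)
            super().solve_nonlinear(inputs, outputs)  # runs command_solve
            outputs['displacements'] = read_vector('results.out')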
If you can't expose the residual evaluation, then you can use ExternalCodeComp instead and treat the analysis as if it were explicit. This will make your FD more costly and less accurate, but for linear analyses you should be OK (still not ideal, but better than nothing).
The key idea here is that you're not asking OpenMDAO to pass around file objects. You are wrapping each component with only numerical data at its boundaries. This has the advantage of allowing OpenMDAO's automatic derivative features to work (even if you use FD to compute the partial derivatives). It also has the secondary advantage that if you (hopefully) graduate to in-memory wrappers for your codes, you won't have to update your models; only the component's internal code will change.

Canonical method of modifying (add/delete code) a binary executable programmatically?

I've always had a great interest in computer security and after reading:
Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection
I'd like to implement some of these algorithms, which assume that you are able to modify an executable at the binary level.
Using a hex editor, I have managed to insert a simple checksum algorithm for tamper-proofing a code region. However, this technique is not feasible in practice, so I'm looking for ways of automating it.
Are there any well-known frameworks or techniques that take an executable as input and allow the programmer to work with the code programmatically? If not, what are my options?
By programmatically, I mean parsing it into, e.g., a tree-like structure that can be modified and written back out as a new executable.
Thank you for your time and interest.

Preprocess only local #includes into single file?

I understand VC++ will let you emit C++ source files which are the result of the preprocessor's operations, e.g. macros are expanded and includes are "copy-pasted inline".
Is it possible to restrict this to embedding only the included files that belong to my own project, rather than standard library headers?
From the outside there's no way to tell which syntax form (<> or "") a given piece of content was included through, unless the preprocessor exposed some kind of API for it, which is not the case here.
A not-so-elegant (and not strictly correct) solution I could propose would be to index a preprocessed version of all the standard headers (there are not that many) and, after preprocessing the source of interest, run a string-matching script to detect the known files and remove the corresponding content from the final output.
Notice this is subject to flaws, because the #include system is purely textual, influenced by whatever macros are (un)defined at the time of inclusion, and order matters. But depending on the complexity of the code you're working on, this might give reasonable results.
By the way, may I ask what the ultimate goal of your task is?
Edit: Or actually... maybe you could filter the sources beforehand to remove the undesired #includes and then submit the result to preprocessing?
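A quick sketch of that filtering idea in Python. It assumes cl.exe is on PATH and uses its /E flag to print preprocessed output, and it simply disables every <...> include, which can break local headers that depend on them, per the caveats above:

    import re
    import subprocess

    SYSTEM_INCLUDE = re.compile(r'^\s*#\s*include\s*<[^>]+>')

    def preprocess_local_only(path):
        with open(path) as f:
            lines = f.readlines()

        filtered = []
        for line in lines:
            if SYSTEM_INCLUDE.match(line):
                # Disable the standard-library include but keep it
                # visible in the output.
                filtered.append('// [deferred] ' + line)
            else:
                filtered.append(line)

        tmp = path + '.filtered.cpp'
        with open(tmp, 'w') as f:
            f.writelines(filtered)

        # cl /E writes the preprocessed translation unit to stdout
        # (use gcc/clang -E elsewhere).
        result = subprocess.run(['cl', '/E', tmp],
                                capture_output=True, text=True)
        return result.stdout

    if __name__ == '__main__':
        print(preprocess_local_only('main.cpp'))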

Custom log processing/parsing

I have a log format like this:
[26830431.7966868][4][0.013590574264526367][30398][api][1374829886.320353][init]
GET /foo
{"controller"=>"foo", "action"=>"index"}
[26830431.7966868][666][2.1876697540283203][30398][api][1374829888.4944339][request_end]
200 OK
Each entry is constructed using this pattern:
[request_id][user_id][time_from_request_started][process_id][app][timestamp][tagline]
payload
During a request there are many points where I log something - the app basically has complex behaviour. This helps me a lot when debugging user behaviour.
I would like to parse it into a directory structure like this:
req_id
|
|----[time_from_request_started][process_id][timestamp][tagline]
|
etc
Basically each directory will be named after a req_id, with files whose names are the rest of the tag line. These files will contain the payload.
I will also have another directory, with user ids, which will contain symlinks to the requests made by each user.
First question: Is this structure correct? In my opinion it will make log access easy and fast. The reason I want to use directories and files is that I like the Unix approach and want to try it (to feel its drawbacks and advantages for myself).
Second question: I would have no problem using Ruby to create this, but I would like to learn some new tool that is better suited for this. I am thinking about using plain Unix tools (pipes, awk, etc.) to achieve this, or writing the parser in Go, which I am learning right now (I even have time to implement a simple map-reduce). What tool is best suited for this?
I would not store logs in a directory structure to see how users behave.
Depending on what behaviour you want to keep track of, you could use different tools. Two of these are mixpanel and keen.io.
Instead of logging what the user did to a log file, you would send an event to either of those (they are pretty similar; pick the one you think has better docs/libraries), and then graph those events to better understand the behaviour of your users. I've done this a lot recently; to display the data in a nice way I've used rickshaw.
The key point of this suggestion is that if you go the file route you will still have to find a way to understand your data, something that graphs help a lot with. Also, visualization is something keen.io does by default; you may still want to make your own graphs, but it's a good start.
Hope this helped.
Is this structure correct?
Only you can know that; it depends directly on how the data needs to be accessed and used.
What tool is best suited for this?
You could probably use Unix tools to achieve this, but it would also be a good exercise to practice your Go skills by writing it yourself. A dedicated parser would also be more extensible.
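For reference, a minimal sketch of the parsing step (shown in Python purely for illustration; the same header-regex-plus-payload grouping ports directly to Go or an awk pipeline, and the directory names below just follow the layout proposed in the question):

    import os
    import re

    HEADER = re.compile(
        r'^\[(?P<req_id>[^\]]*)\]\[(?P<user_id>[^\]]*)\]\[(?P<elapsed>[^\]]*)\]'
        r'\[(?P<pid>[^\]]*)\]\[(?P<app>[^\]]*)\]\[(?P<ts>[^\]]*)\]\[(?P<tag>[^\]]*)\]$'
    )

    def split_entries(lines):
        # Group each header line with the payload lines that follow it.
        entry = None
        for line in lines:
            m = HEADER.match(line.rstrip('\n'))
            if m:
                if entry:
                    yield entry
                entry = (m.groupdict(), [])
            elif entry:
                entry[1].append(line)
        if entry:
            yield entry

    def write_tree(log_path, out_root='requests', user_root='users'):
        with open(log_path) as f:
            for fields, payload in split_entries(f):
                req_dir = os.path.join(out_root, fields['req_id'])
                os.makedirs(req_dir, exist_ok=True)
                name = '[{elapsed}][{pid}][{ts}][{tag}]'.format(**fields)
                with open(os.path.join(req_dir, name), 'w') as out:
                    out.writelines(payload)
                # Symlink the request dir under the user's directory.
                user_dir = os.path.join(user_root, fields['user_id'])
                os.makedirs(user_dir, exist_ok=True)
                link = os.path.join(user_dir, fields['req_id'])
                if not os.path.islink(link):
                    os.symlink(os.path.abspath(req_dir), link)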
