Skylark struct with methods

Here's something I tried to do, and it works well. I find it useful, but it feels like a hack, so I'm afraid it will break in the future.
I'm working on converting a large project to Bazel, and we have lots of local wrappers, something like:
my_cc_library(name='a', srcs=['lib.c'])
my_cc_binary(name='b', deps=['a'], srcs=['main.c'])
This requires lots of load commands, which are annoying and a recurring source of errors.
If it was normal Python, I'd simply import a whole module. But Skylark requires loading functions one by one.
I can remove the need for individual loads, using a struct.
In my.bzl:
# (bodies were elided in the original; shown here as thin wrappers over the native rules)
def _my_cc_library(**kwargs):
    native.cc_library(**kwargs)
def _my_cc_binary(**kwargs):
    native.cc_binary(**kwargs)
my = struct(cc_library = _my_cc_library, cc_binary = _my_cc_binary)
In various BUILD files:
load('//:my.bzl', 'my')
my.cc_library(name='a', srcs=['lib.c'])
my.cc_binary(name='b', deps=['a'], srcs=['main.c'])
As I wrote above, it all works well. I can use it for wrappers around native rules, and for various other functions.
But am I abusing the language? Is it prone to break in the future?
Thanks.

This pattern is used in other places (e.g. https://github.com/bazelbuild/bazel-skylib/blob/master/lib/collections.bzl), so it's safe to use.
Not all tools support it well, though. For example, you won't be able to update your BUILD files with Buildozer - although it's something that can be fixed.
> This requires lots of load commands, which are annoying and a recurring source of errors.
I agree it's annoying. In the future, we should have better tooling for updating load lines (to automatically add/remove them).
Rules in BUILD files look like my_cc_library(...) for historical reasons. For a long time, load didn't exist and all rules were hard-coded in Bazel. Maybe we should encourage the my.cc_library(...) syntax and make it easier to use.

Related

Securely running user's code

I am looking to create an AI environment where users can submit their own code for the AI and let them compete. The language could be anything, but something easy to learn like JavaScript or Python is preferred.
Basically I see three options with a couple of variants:
1. Make my own language, e.g. a JavaScript clone with only very basic features like variables, loops, conditionals, arrays, etc. This is a lot of work if I want to properly implement common language features.
1.1. Take an existing language and strip it to its core. Just remove lots of features from, say, Python until there is nothing left but the above (variables, conditionals, etc.). Still a lot of work, especially if I want to keep up to date with upstream (though I could also just ignore upstream).
2. Use a language's built-in features to lock it down. I know from PHP that you can disable functions, and from searching around, similar solutions seem to exist for Python (with lots and lots of caveats). For this I'd need a good understanding of all the language's features, so as not to miss anything.
2.1. Make a preprocessor that rejects code containing dangerous constructs (preferably whitelist-based). Similar to option 1, except that I only have to implement the parser rather than all the features: the preprocessor has to understand the language well enough that you can have a variable named "eval" but not call the function named "eval". Still a lot of work, but more manageable than option 1.
2.2. Run the code in a very locked-down environment. Chroot, no unnecessary permissions... perhaps in a virtual machine or container. Something along those lines. I'd have to research how to achieve this and how to get the results back in a secure way, but that seems doable.
3. Manually read through all code. Doable on a small scale or with moderators, though still tedious and error-prone (I might miss stuff like if (user.id = 0)).
The way I imagine 2.2 working is this: run both AIs in virtual machines (or something similar) and constrain them to communicate only with the host machine (no other Internet or LAN access). Each AI runs in its own machine and they communicate with each other (well, with the playing field, and thereby see each other's positions) through an API running on the host.
Option 2.2 seems the most doable, but also relatively hacky... I'd be letting someone's code loose in a virtualized or locked-down environment, hoping it keeps them in while they have free rein to try to DoS or break out of it. Then again, most other options are not much better.
TL;DR: how do I let people give me 'logic' for an AI (which I think is most easily done using code) and then run it without compromising the functionality of the system? There must be at least 2 AIs working on the same playing field.
This is really just a plugin system, so researching how others implement plugins is a good starting point. In particular, I'd look at web browsers like Chrome and Safari and their plugin systems.
A common theme in modern plugin systems is process isolation. Ideally you should run the plugin in its own process space, in a sandbox. On OS X, look at XPC, which is designed explicitly for this problem. On Linux (or more portably), I would probably look at NaCl (Native Client). The JVM is also designed to provide sandboxing, and offers a rich selection of languages. (That said, I don't personally consider the JVM a very strong sandbox; it has had a history of security problems.)
In general, my preference on these kinds of projects is a language-agnostic API; I most often use REST (or "REST-like") APIs. This allows the plugin to be highly restricted while not restricting the language choice. I like plain HTTP for communication whenever possible because it has rich support in numerous languages, so it puts little burden on the plugin author. In fact, given your description, you wouldn't even have to run the plugin on your hardware (and certainly not on the main server); making the plugins remote clients removes many potential concerns.
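To make that concrete, here is a minimal sketch of what a plugin AI could look like as a remote client. It's Python purely for illustration; the /state and /move endpoints, the JSON shapes, and the host URL are hypothetical, not an existing API:

import json
import time
import urllib.request

HOST = 'http://game-host.example:8000'  # hypothetical playing-field server

def get_state():
    # Poll the host for the current playing field (each AI sees the other's position).
    with urllib.request.urlopen(HOST + '/state') as resp:
        return json.load(resp)

def send_move(move):
    # Submit this AI's move; the host validates every move, so a hostile
    # client can only send garbage or flood requests, both easy to throttle.
    req = urllib.request.Request(
        HOST + '/move',
        data=json.dumps(move).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )
    urllib.request.urlopen(req).close()

while True:
    state = get_state()
    if state.get('finished'):
        break
    # Placeholder 'logic': step toward the opponent.
    dx = 1 if state['opponent']['x'] > state['you']['x'] else -1
    send_move({'dx': dx, 'dy': 0})
    time.sleep(0.1)  # be polite to the host

The crucial property is that the host only ever receives data, never code, so this channel stays safe regardless of what the plugin does internally.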
But ultimately, I think something like your "2.2" is the right direction.
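To sketch what "2.2" could look like on the host side (Linux, Python standard library only; a real deployment would layer containers, namespaces, or seccomp on top; this shows just the resource-limit and timeout part, and contestant_ai.py is a hypothetical untrusted entry point):

import resource
import subprocess

def limit_resources():
    # Runs in the child between fork() and exec(): cap CPU time and memory,
    # and forbid spawning further processes. A first line of defense only.
    resource.setrlimit(resource.RLIMIT_CPU, (2, 2))                      # 2 s of CPU
    resource.setrlimit(resource.RLIMIT_AS, (256 * 2**20, 256 * 2**20))   # 256 MiB
    resource.setrlimit(resource.RLIMIT_NPROC, (0, 0))                    # no fork()

try:
    result = subprocess.run(
        ['python3', 'contestant_ai.py'],    # the untrusted code
        input=b'{"you": {"x": 0}, "opponent": {"x": 5}}',  # game state in via stdin
        capture_output=True,
        timeout=5,                  # wall-clock limit, independent of RLIMIT_CPU
        preexec_fn=limit_resources,
    )
    move = result.stdout            # the move comes back out on stdout
except subprocess.TimeoutExpired:
    move = None                     # treat a hung or looping AI as a forfeited turn

The stdin/stdout exchange could equally be the HTTP API above; either way the untrusted process gets data, limits, and a deadline, and nothing else.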

How can I find unused symbols using cscope?

I am doing a little code refactoring. Usually, after I rewrite/reimplement everything, the first thing I do is a clean-up. That means removing unused libraries, unused functions, etc.
My question is: how can I find functions that are not used anywhere? By this I mean functions for which there is only a definition and a declaration and nothing more.
Even if it is possible, I don't think cscope is the best tool for this.
You'll probably have more success with a static code analyzer, which will also find other problems such as uninitialized or unused variables, dead code, etc.
I don't know which language(s) you are working with, but there are generally multiple open-source solutions available.
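That said, if you want to stay close to cscope, you can script it: build the cross-reference once, list every function definition, and flag definitions that nothing calls. A rough sketch in Python, assuming ctags (Exuberant/Universal) and cscope are on your PATH; it will also flag entry points like main and anything invoked only through function pointers, so treat the output as candidates, not certainties:

import subprocess

# Build the cscope cross-reference once (-b: build only, -R: recurse).
subprocess.run(["cscope", "-b", "-R"], check=True)

# List every C function definition (-x: tabular output, --c-kinds=f: functions only).
tags = subprocess.run(
    ["ctags", "-x", "--c-kinds=f", "-R", "."],
    capture_output=True, text=True, check=True,
).stdout

for name in sorted({line.split()[0] for line in tags.splitlines() if line.strip()}):
    # -d: reuse the existing database, -L: line-oriented output,
    # -3: "find functions calling this function".
    callers = subprocess.run(
        ["cscope", "-d", "-L", "-3", name],
        capture_output=True, text=True,
    ).stdout
    if not callers.strip():
        print("possibly unused:", name)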

unifdef for Makefiles?

Is there an equivalent to the unifdef program that works on Makefiles? I'd like to automate removal of certain features from Makefiles as well as source files. I would think that the two would go hand in hand, but so far I can't seem to find anything.
The unifdef program already works on makefiles (or any other text file), with the caveat that make treats the #ifdef/#endif lines as comments, so the "raw" file still runs all of your statements.
I use unifdef on makefiles mainly for the following purpose: in my code I have blocks that are only used in development and that, because they are sloppy or possibly dangerous, I don't want in the published code. By running unifdef over both the source code and the makefiles, I can get a publishable codebase from my development code without having to sync my development code with a public branch or similar.
I can also use unifdef to produce code, including build scripts and makefiles, targeted at different platforms or audiences. Since the #ifdef statements are parsed as comments, though, using e.g. #else can be a bit tricky, and overly advanced applications of this technique will probably become unmaintainable fairly quickly.
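A tiny example of the pattern; the DEVEL symbol and the targets are made up for illustration:

#ifdef DEVEL
debug-server: all
	./server --insecure --verbose
#endif

release: all
	tar czf release.tar.gz bin/

make accepts this file as-is, and both targets exist, because the #ifdef and #endif lines are just comments to it. Running unifdef -UDEVEL Makefile > Makefile.pub then produces a published makefile containing only the release target.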

@Grape in scripts with multiple files

I'd like to use @Grape in my Groovy program, but my program consists of several files. The examples on the Groovy Grape page all seem to assume that your script consists of one file. How can I do this? Should I just add it to one of the files and expect that the imports will work from the others? If so, is it common to place all the @Grab calls in one file with no other code? Or do I need to add the @Grab call to every file that imports the package? Do I need to download the JAR and create a Gradle file, which I had been getting away without up to this point?
The Grape engine and the @Grab annotation were created as part of core Groovy with single-file scripts in mind, to allow a chunk of text to easily become a fully functional program.
For larger applications, Gradle is an awesome build tool with lots of useful features.
But yes, you can manage all of an application's dependencies with Grape alone.
Whether you annotate every file or a single one does not matter; just make sure the @Grab-annotated file is read before you try to use the external class.
Annotating the main class is probably better, as you will easily lose track of library versions if the annotations are scattered.
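For illustration, a sketch of the annotate-the-main-class approach; the library, coordinates, and class names are just examples:

// Main.groovy -- the entry point carries the @Grab annotation.
@Grab('org.apache.commons:commons-lang3:3.12.0')
import org.apache.commons.lang3.StringUtils

println new Helper().greet('world')

// Helper.groovy -- a separate file with no @Grab of its own; the grabbed
// jar is already on the classpath because Main.groovy is read first.
import org.apache.commons.lang3.StringUtils

class Helper {
    String greet(String name) { 'Hello ' + StringUtils.capitalize(name) }
}

Run it with groovy Main.groovy and Groovy compiles Helper.groovy on demand, with the commons-lang3 jar already resolved.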
And yes, you should consider Gradle for any application with more than a dozen files, or for anything you might want to reuse elsewhere as a library.
In my opinion, it depends on how your program is to be run...
If your program is to be run as a collection of standalone scripts, then I'd probably stick the @Grab annotations required by each script at the top of it.
If your program is more of a standard-style program with a single point of entry, then I'd go for a build tool like Gradle (as you say), as you get a lot of easy wins by using it.
Firstly, it makes it easy to define your dependencies (and to build a single large jar containing all of them).
Secondly, Gradle makes it really easy to start writing tests, to include code-coverage plugins, or to add useful tools like CodeNarc that suggest possible fixes and improvements for your code. These become invaluable not only for improving your code (and knowing that it works), but also when refactoring: you know you haven't broken anything that used to work.
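For a sense of scale, the Gradle side of that trade is small; a minimal build.gradle replacing the @Grab annotations might look like this (the versions are just examples):

// build.gradle -- declared dependencies instead of @Grab annotations
apply plugin: 'groovy'

repositories { mavenCentral() }

dependencies {
    implementation 'org.codehaus.groovy:groovy-all:3.0.9'
    implementation 'org.apache.commons:commons-lang3:3.12.0'
    testImplementation 'junit:junit:4.13.2'
}

From there, gradle build compiles everything under src/main/groovy and runs the tests, and the plugins mentioned above are each a one-line addition.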

Are there any context-sensitive code search tools?

I have been getting very frustrated recently dealing with a massive bulk of legacy code that I am trying to get familiar with.
Say I search for a particular function call: I get loads of results that turn out to be completely irrelevant. Some of them are easy to spot, e.g. a comment saying
// Fixed functionality in foo() so don't need to handle this here any more
But others are much harder to spot manually, because they turn out to be calls from other functions in modules that are only compiled in certain cases, or are part of a much larger block of code that is #if 0'd out in its entirety.
What I'd like would be a search tool that would allow me to search for a term and give me the choice to include or exclude commented out or #if 0'd out code. Then the search results would be displayed alongside a list of #defines that are required in order for that snippet of code to be relevant.
I'm working in C / C++, but other than the specific comment syntax I guess the techniques should be more generally applicable.
Does such a tool exist?
Not entirely what you're after, but I find this quite handy.
GrepWin - A free visual "grep" tool for searching files.
I find it quite helpful because:
It's a separate app (doesn't lock up my editor)
Handles regular expressions
It's fast
Can specify which folder to search and which file types (handles regexes here too)
Can limit by file size
Can include subdirectories (or exclude them by regex)
etc.
Almost any decent source browser will let you go to where a function is defined, and/or list all the calls of that function and take you directly to a call site. This will normally be based on a fairly complete parse of the source code so it will ignore comments, code that's excluded by the preprocessor, and so on (in fact, in at least one case, the parser used by the source browser is almost certainly better than the one used in the compiler itself).
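If no browser is at hand, the filtering half of the question (ignore comments and #if 0 regions) is also easy to approximate with a short script. A naive sketch in Python: it tracks //, /* */, and nested #if 0/#endif state, does not attempt #else/#elif or the "which #defines guard this match" report, and can be fooled by comment markers inside string literals, so treat it as a coarse filter rather than a parser:

import re
import sys

def active_lines(path):
    # Yield (line_number, text) for code outside comments and #if 0 blocks.
    in_block_comment = False
    if_stack = []    # one entry per open conditional; True means "#if 0"
    for lineno, line in enumerate(open(path), start=1):
        stripped = line.strip()
        if not in_block_comment:
            if re.match(r'#\s*if\s+0\b', stripped):
                if_stack.append(True)
                continue
            if re.match(r'#\s*if', stripped):        # #if, #ifdef, #ifndef
                if_stack.append(False)
            elif re.match(r'#\s*endif', stripped) and if_stack:
                if_stack.pop()
                continue
        if any(if_stack):
            continue                                 # inside an #if 0 region
        out, i = [], 0                               # strip comments (naively)
        while i < len(line):
            if in_block_comment:
                end = line.find('*/', i)
                if end == -1:
                    i = len(line)
                else:
                    in_block_comment, i = False, end + 2
            elif line.startswith('/*', i):
                in_block_comment, i = True, i + 2
            elif line.startswith('//', i):
                break
            else:
                out.append(line[i])
                i += 1
        text = ''.join(out)
        if text.strip():
            yield lineno, text

pattern, path = sys.argv[1], sys.argv[2]
for lineno, text in active_lines(path):
    if re.search(pattern, text):
        print('%s:%d: %s' % (path, lineno, text.rstrip()))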
