capturing pipe exit status in R

I am using R's pipe() function to capture output from a shell command, but I would also like to get the exit code from the command.
I know that I could use system2 here, but I need the advantage of a pipe i.e. the ability to process output in a streaming fashion.
I am considering writing my own library to wrap the popen() and pclose() C functions to take advantage of the fact that pclose() returns the exit status, but maybe this can be avoided.
Any suggestions? Thanks!
Note
There are certainly ways to do this with temporary files, named pipes, etc but I would ideally like to avoid these workarounds. I am willing to compile a shared library with an R->C function in it (and I'm even willing to copy-paste part of the R source code), but I'm not willing to rebuild R.
Update
I started reading through the R source code and found the unchecked pclose call:
in src/main/connections.c:
    static void pipe_close(Rconnection con)
    {
        pclose(((Rfileconn)(con->private))->fp);
        con->isopen = FALSE;
    }
I tried going forward with the approach of implementing an R_pclose C function that duplicates the R code for close() but saves this return value. I unfortunately ran into this static variable in src/main/connections.c
static Rconnection Connections[NCONNECTIONS];
Since I'd have to run objcopy --globalize-symbol=Connections libR.so /path/to/my/libR.so anyway to access the variable, it looks like my best solution is to rebuild R with my own patch to capture the pclose return value.

Ugly hack: you can wrap your command call into a small shell script which writes the exit code of its child to some temporary file. So when the stream has ended, you could wait until that file has non-zero size, then read the status from there. I hope someone comes up with a better solution, but at least this is a kind of solution.

Related

NodeJS: how disable auto output after every line in console?

As you can see in the image, an automatic output appears after every line. I want to disable this. I do not intend to use it as a workaround editor; the problem is that for some functions this output is several screens long, which makes it hard to find the expected result.

The default CLI, IIRC, uses the standard Node REPL, but does not provide a trivial way to customize it.
You could start your own REPL and provide options as described in the REPL docs, specifically focusing on:
eval
ignoreUndefined
The easiest solution is to append something to whatever is returning the long value, e.g., if there's a function called foo that returns an object that spans pages, like:
    > foo()
    // countless lines of output
Tack on something to the command, like:
    > foo(); null
    null

Trick open() into reading from a non-file?

I realize this is a pretty weird thing to want to do - it's mainly just to simplify my unittests.
I have a class whose __init__ takes a filename as an argument, which it open()s and reads a bunch of data from. It'd be really awesome if I could somehow "trick" that open() into reading from a string object instead without actually writing a temporary file to disk.
Is this possible, and if so, how would I go about it?

You can monkey-patch the module that contains the class before running the test, and restore it after the test:
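    def my_fake_open(fake_filename):
        # Return any object with read() and close() that feeds the test data.
        return object_with_read_and_close_that_will_feed_correct_data()
    def test_something(self):
        # Shadow the built-in open() inside the module under test only.
        module_containing_test.open = my_fake_open
        try:
            ...  # run test
        finally:
            # Remove the override so the built-in open() is visible again.
            del module_containing_test.open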
Check out the mock library if you don't want to write your own mock objects.
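With Python 3 the mock library ships in the standard library as unittest.mock, and its mock_open helper builds the fake file object for you. The following is a minimal sketch, not the asker's actual code: Reader and load_with_fake_file are hypothetical names, and it assumes the class calls the built-in open() with just a filename.

```python
from unittest import mock


class Reader:
    """Hypothetical stand-in for the class under test."""

    def __init__(self, filename):
        with open(filename) as fh:
            self.data = fh.read()


def load_with_fake_file(text):
    """Build a Reader while open() is patched to serve `text`."""
    # mock_open returns a mock whose file handle yields read_data on read().
    fake_open = mock.mock_open(read_data=text)
    with mock.patch("builtins.open", fake_open):
        reader = Reader("ignored.txt")
    # call_args[0] holds the positional arguments open() was called with.
    return reader.data, fake_open.call_args[0]
```

Patching builtins.open affects every caller for the duration of the with block, so keeping the patched region as small as possible limits surprises in unrelated code.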

Reading a bit too fast, I thought the answer would be io.StringIO, which lets you create a file-like object from the contents of a string.
But what you want is an object that, when passed to the standard open function, would yield a file-like object with the contents of your string.
The thing is that open takes a string (or a file descriptor), and anything else will pose a problem.
So I don't believe this way is practical.
But actually, it's not difficult to create a file using tempfile:
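    import tempfile

    # mode="w" so that a str can be written; the default mode is binary.
    with tempfile.NamedTemporaryFile(mode="w") as tmp_file:
        tmp_file.write(your_string)
        tmp_file.flush()  # push the data to disk before the file is reopened by name
        yourmodule.YourClass(tmp_file.name)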
(If you're on Windows you might want to play with delete=False and close before sending the name for it to be opened.)
Another approach might be to just change the API: if all that __init__ does with the name is open a file, why not pass a file-like object directly?
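A minimal sketch of that API change, under the assumption that the class only needs to iterate over the file's lines. Parser and from_path are hypothetical names, not taken from the question.

```python
import io


class Parser:
    """Hypothetical class that takes an already-open file-like object."""

    def __init__(self, fileobj):
        # Works with real files, io.StringIO, sockets wrapped as files, etc.
        self.lines = [line.rstrip("\n") for line in fileobj]

    @classmethod
    def from_path(cls, path):
        """Convenience constructor for callers that still have a filename."""
        with open(path) as fh:
            return cls(fh)


# In a unit test, no file on disk is needed:
parser = Parser(io.StringIO("alpha\nbeta\n"))
```

Production code can keep calling Parser.from_path(filename), while tests hand in an io.StringIO and never touch the filesystem.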

Executing functions stored in a string

Let's say that there is a function in my Delphi app:
MsgBox
and there is a string which has MsgBox in it.
I know most of you are going to say that it's not possible, but I think it is, because I opened the compiled exe (compiled using Delphi XE2) in a resource editor built for Delphi, and I could see most of the code I wrote, as I wrote it. So since the variable names, function names, etc. aren't changed during compilation, there should be a way to execute a function from a string, but how? Any help will be appreciated.
EDIT:
What I want to do is create a simple interpreter/scripting engine. This is how it's supposed to work:
There are two files, scr.txt and arg.txt
scr.txt contains:
msg_show
0
arg.txt contains:
"Message"
And now let me explain what that 0 is:
The first line of scr.txt is the function name.
The second line gives the line in arg.txt where its arguments are, i.e. 0 means that "Message" is the argument for msg_show.
I hope my question is now clear.

I want to make a simple scripting engine.
In order to execute arbitrary code stored as text, you need a compiler or an interpreter. Either you need to write one yourself, or embed one that already exists. Realistically, the latter option is your best option. There are a number available but in my view it's hard to look past dwscript.

I think I've already solved my problem! The answer is in this question's first answer.
EDIT:
But as a workaround for the problem mentioned in the first comment, I have a very easy solution.
You don't need to pass all the arguments/parameters to it. Take my example:
You have two files, as mentioned in the question. Now you need to execute them. It is as simple as this:
Read the first line of scr.txt.
Check whether it is a function name. If not, skip the line.
If it is, read the next line, which gives the index of its arguments in arg.txt.
Pass that index (an integer) to the "Call" function.
The function being executed has to know how many arguments it needs, e.g. 2.
Let's say the function is "Sum(a, b: integer)". It needs 2 arguments.
The function then reads its two arguments from arg.txt, starting at that index.
And it's done!
I hope it will help you all.
And I can get some rep :)
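The steps above amount to a lookup table from names to routines plus an argument index. Here is a language-neutral sketch of that idea, written in Python rather than Delphi; FUNCTIONS, run_script, and the sample routines are all hypothetical, and the arity is stored in the table instead of being known by each routine.

```python
def msg_show(text):
    return f"MSG: {text}"


def sum_two(a, b):
    return int(a) + int(b)


# Map each script-visible name to (callable, number of arguments).
FUNCTIONS = {"msg_show": (msg_show, 1), "sum": (sum_two, 2)}


def run_script(script_lines, arg_lines):
    """Execute name/index pairs from scr.txt against the lines of arg.txt."""
    results = []
    i = 0
    while i < len(script_lines):
        name = script_lines[i].strip()
        # Skip lines that are not known function names or lack an index line.
        if name not in FUNCTIONS or i + 1 >= len(script_lines):
            i += 1
            continue
        func, argc = FUNCTIONS[name]
        index = int(script_lines[i + 1])
        # Read as many argument lines as the function needs, dropping quotes.
        args = [a.strip().strip('"') for a in arg_lines[index:index + argc]]
        results.append(func(*args))
        i += 2
    return results
```

For the files in the question, run_script(["msg_show", "0"], ['"Message"']) calls msg_show with the string Message. Only names listed in the table can be called, which is what keeps this from being arbitrary code execution.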

Get the end address of Linux kernel function on run-time

I am trying to get the boundaries of a kernel function (a system call, for example). If I understand correctly, I can get the start address of the function of interest by reading /proc/kallsyms or System.map, but I don't know how to get its end address.
As you may know, /proc/kallsyms lets us view the symbol table of the Linux kernel, so we can see the start address of all exported symbols. Can we use the start address of the next function to calculate the end address of the previous function? If not, could you suggest another way?

Generally, executables store only the start address of a function, as it is all that is required to call the function. You will have to infer the end address, rather than simply looking it up.
You could try to find the start address of the subsequent function, but that wouldn't always work either. Imagine the following:
    void func_a() {
        // do something
    }
    static void helper_function() {
        // do something else
    }
    void func_b() {
        // ...
        helper_function();
        // ...
    }
You could get the addresses of func_a and func_b, but helper_function would not show up, because nothing needs to link to it. If you tried to use the start of func_b as the end of func_a (assuming that the order in the compiled code is equivalent to the order in the source code, which is not guaranteed), you would end up accidentally including code that you didn't need to include, and you might not find code that you need to find when inlining other functions into func_b.
So, how do we find this information? Well, if you think about it - the information does exist - all of the paths within func_a will eventually terminate (in a loop, return statement, tail call, etc), probably before helper_function begins.
You would need to parse out the code of func_a and build up a map of all of the possible code paths within it. Of course, you would need to do this anyway to inline other functions into it - so it shouldn't be too much harder to simply not care about the end address of the function.
One final note: in this example, you would have trouble finding helper_function in order to know to inline it, because the symbol wouldn't show up in kallsyms. The solution here is that you can track the call instructions in individual functions to determine what hidden functions exist that you didn't know about otherwise.
TL;DR: You can only find the end address by parsing the compiled code. You have to parse this anyway, so just do it once.
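For comparison, the "next symbol" heuristic from the question can be sketched as follows. It only yields an upper bound on where a function ends, for the reasons given above. The sketch assumes lines in the "address type name" layout used by /proc/kallsyms; SAMPLE, parse_symbols, and upper_bound are hypothetical, and the addresses are made up.

```python
import bisect

# Made-up sample in the "address type name" layout.
SAMPLE = """\
ffffffff81000000 T func_a
ffffffff81000090 T func_b
"""


def parse_symbols(text):
    """Return a list of (address, name) pairs sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def upper_bound(symbols, name):
    """Return (start, next_start); next_start is None for the last symbol."""
    addrs = [addr for addr, _ in symbols]
    for addr, sym in symbols:
        if sym == name:
            # First symbol whose address is strictly greater than this one.
            i = bisect.bisect_right(addrs, addr)
            next_start = addrs[i] if i < len(addrs) else None
            return addr, next_start
    raise KeyError(name)
```

Here upper_bound(parse_symbols(SAMPLE), "func_a") spans everything up to func_b, so any code placed between the two listed symbols is silently attributed to func_a.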

How to make this Groovy string search code more efficient?

I'm using the following groovy code to search a file for a string, an account number. The file I'm reading is about 30MB and contains 80,000-120,000 lines. Is there a more efficient way to find a record in a file that contains the given AcctNum? I'm a novice, so I don't know which area to investigate, the toList() or the for-loop. Thanks!
    AcctNum = 1234567890
    if (testfile.exists())
    {
        lines = testfile.readLines()
        words = lines.toList()
        for (word in words)
        {
            if (word.contains(AcctNum)) { done = true; match = 'YES' ; break }
            chunks += 1
            if (done) { break }
        }
    }

Sad to say, I don't even have Groovy installed on my current laptop - but I wouldn't expect you to have to call toList() at all. I'd also hope you could express the condition in a closure, but I'll have to refer to Groovy in Action to check...
Having said that, do you really need it split into lines? Could you just read the whole thing using getText() and then just use a single call to contains()?
EDIT: Okay, if you need to find the actual line containing the record, you do need to call readLines() but I don't think you need to call toList() afterwards. You should be able to just use:
    for (line in lines)
    {
        if (line.contains(AcctNum))
        {
            // Grab the results you need here
            break;
        }
    }

When you say efficient, you usually have to decide which direction you mean: whether it should run quickly, or use as few resources (memory, ...) as possible. The two often lie on opposite sides, and you have to pick a trade-off.
If you want the search to be memory-friendly, I'd suggest reading the file line by line instead of reading it all at once, which I suspect readLines does (I could be wrong there, but in other languages something like readLines reads the whole file into an array of strings).
If you want it to run quickly I'd suggest, as already mentioned, reading in the whole file at once and looking for the given pattern. Instead of just checking with contains you could use indexOf to get the position and then read the record as needed from that position.
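The two strategies can be contrasted in a short sketch. It is written in Python rather than Groovy, the function names are hypothetical, and it assumes one record per line; str.find plays the role of indexOf.

```python
def find_record_streaming(lines, acct_num):
    """Memory-friendly: look at one line at a time, stop at the first hit."""
    needle = str(acct_num)
    for line in lines:
        if needle in line:
            return line.rstrip("\n")
    return None


def find_record_in_text(text, acct_num):
    """Whole file in memory: one search, then cut out the enclosing line."""
    needle = str(acct_num)
    pos = text.find(needle)
    if pos == -1:
        return None
    # rfind returns -1 when no newline precedes the hit, so +1 gives 0.
    start = text.rfind("\n", 0, pos) + 1
    end = text.find("\n", pos)
    if end == -1:
        end = len(text)
    return text[start:end]
```

With a real file, the first function can be fed the open file object directly (for example find_record_streaming(open(path), 1234567890)), so only one line is held in memory at a time; the second needs the result of reading the whole file into a string.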

I should have explained it better, if I find a record with the AcctNum, I extract out other information on the record...so I thought I needed to split the file into multiple lines.

If you control the format of the file you are reading, the solution is to add an index.
In fact, this is how databases are able to locate records so quickly.
But for 30MB of data, I think a modern computer with a decent hard drive can handle a plain search, so there is no need to overcomplicate the program.
