Running Linux commands from R

I have a bunch of assorted files, and I am going to run the Linux file command on each of them. The terminal output looks like this:
m7% file date-file.csv
date-file.csv: ASCII text, with CRLF line terminators
m7% file image-file.JPG
image-file.JPG: JPEG image data, EXIF standard
I want to run an R script that reads a file and finds all of its column names, but only when the file command reports that it is a text file. In the output above, the script should run only on the first file. How can I achieve this conditional processing?
Is there a way to run Linux commands from R? If so, I could inspect the command's output to see whether it contains "text" and then run the R script only when required.
I am having difficulty achieving this, and any help is appreciated.

Try system()
08/27 7:08 [nodakai@kaidev01] ~/R$ R -q
> system("ls /")
bin boot dev etc home initrd.img initrd.img.old lib lib32 lib64 libGL.so libnss3.so lost+found media mnt opt proc root run sbin selinux sftp srv sys tmp usr var vmlinuz vmlinuz.old
> x = system("ls", TRUE)
> x
[1] "a.csv" "abi.pdf"
[3] "b.csv" "image.jpeg"
[5] "x86_64-pc-linux-gnu-library"
> for (f in system("ls", TRUE)) { if (length(grep("ASCII", system(paste("file", f), TRUE)))) { print(f) } }
[1] "a.csv"
[1] "b.csv"
http://stat.ethz.ch/R-manual/R-devel/library/base/html/system.html
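For comparison, the same conditional logic can be sketched outside R. The Python below is only an illustration under assumptions: the helper names (looks_like_text, file_description, csv_column_names, column_names_if_text) are made up for this sketch, the file utility is assumed to be on the PATH, and the first line of each text file is assumed to be a comma-separated header.

```python
import csv
import subprocess


def looks_like_text(description):
    """Return True if a `file` description mentions text (e.g. "ASCII text")."""
    return "text" in description.lower()


def file_description(path):
    """Run `file -b` on path and return the description without the file name."""
    # Assumption: the `file` utility is installed and on the PATH.
    result = subprocess.run(
        ["file", "-b", path], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def csv_column_names(path):
    """Return the fields of the first row, assuming it is a CSV header."""
    # newline="" lets the csv module handle CRLF line terminators itself.
    with open(path, newline="") as handle:
        return next(csv.reader(handle), [])


def column_names_if_text(path):
    """Return the column names for text files, or None for anything else."""
    if looks_like_text(file_description(path)):
        return csv_column_names(path)
    return None
```

Calling column_names_if_text("date-file.csv") would then read the header only when file classifies the input as text, and return None for something like image-file.JPG.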
Globbing
If you're sure all files ending ".csv" are really plain text CSV files, just use Sys.glob()
> list.files()
[1] "a.csv" "abi.pdf"
[3] "b.csv" "image.jpeg"
[5] "x86_64-pc-linux-gnu-library"
> Sys.glob("*.csv")
[1] "a.csv" "b.csv"
http://stat.ethz.ch/R-manual/R-devel/library/base/html/Sys.glob.html


Integrating Crashpad with Qt on Linux

I am trying to integrate Crashpad into a Qt application on Linux. I am using the BugSplat database for testing. I followed this tutorial and managed to build this "dummy" application, which should serve as an example of using Qt with Crashpad.
I made minor adjustments to the files to get the build working on my Linux platform, mainly making the version easier to change and fixing the creation of the crashpad directory and files next to the application binaries.
All of the changes are listed below as a diff:
diff --git a/Crashpad/Tools/Linux/symbols.sh b/Crashpad/Tools/Linux/symbols.sh
index 095f295..b065438 100644
--- a/Crashpad/Tools/Linux/symbols.sh
+++ b/Crashpad/Tools/Linux/symbols.sh
@@ -3,6 +3,6 @@ symupload="${1}/Crashpad/Tools/Linux/symupload"
app="${2}/${4}.debug"
sym="${4}.sym"
url="https://${3}.bugsplat.com/post/bp/symbol/breakpadsymbols.php?appName=${4}&appVer=${5}"
-
+echo ${url}
eval "${dump_syms} ${app} > ${sym}"
eval $"${symupload} \"${sym}\" \"${url}\""
diff --git a/main.cpp b/main.cpp
index db97dd4..b721dc5 100644
--- a/main.cpp
+++ b/main.cpp
@@ -26,7 +26,7 @@ int main(int argc, char *argv[])
{
QString dbName = "Fred";
QString appName = "myQtCrasher";
- QString appVersion = "1.0";
+ QString appVersion = QString::number(MAJOR_VERSION) + "." + QString::number(MINOR_VERSION);
initializeCrashpad(dbName, appName, appVersion);
diff --git a/myQtCrasher.pro b/myQtCrasher.pro
index 3005e41..3bf7a3e 100644
--- a/myQtCrasher.pro
+++ b/myQtCrasher.pro
@@ -15,6 +15,12 @@ DEFINES += QT_DEPRECATED_WARNINGS
# You can also select to disable deprecated APIs only up to a certain version of Qt.
#DEFINES += QT_DISABLE_DEPRECATED_BEFORE=0x060000 # disables all the APIs deprecated before Qt 6.0.0
+MAJOR_VERSION = 4
+MINOR_VERSION = 9
+
+DEFINES += MAJOR_VERSION=$$MAJOR_VERSION
+DEFINES += MINOR_VERSION=$$MINOR_VERSION
+
SOURCES += \
main.cpp \
mainwindow.cpp \
@@ -94,7 +100,8 @@ linux {
LIBS += -L$$PWD/Crashpad/Libraries/Linux/ -lbase
# Copy crashpad_handler to build directory and run dump_syms and symupload
- QMAKE_POST_LINK += "cp $$PWD/Crashpad/Bin/Linux/crashpad_handler $$OUT_PWD/crashpad"
- QMAKE_POST_LINK += "&& bash $$PWD/Crashpad/Tools/Linux/symbols.sh $$PWD $$OUT_PWD fred myQtCrasher 1.0 > $$PWD/Crashpad/Tools/Linux/symbols.out 2>&1"
- QMAKE_POST_LINK += "&& cp $$PWD/Crashpad/attachment.txt $$OUT_PWD/attachment.txt"
+ QMAKE_POST_LINK += "mkdir $$OUT_PWD/crashpad"
+ QMAKE_POST_LINK += "&& cp $$PWD/Crashpad/Bin/Linux/crashpad_handler $$OUT_PWD/crashpad"
+ QMAKE_POST_LINK += "&& bash $$PWD/Crashpad/Tools/Linux/symbols.sh $$PWD $$OUT_PWD fred myQtCrasher $$MAJOR_VERSION"."$$MINOR_VERSION > $$PWD/Crashpad/Tools/Linux/symbols.out 2>&1"
+# QMAKE_POST_LINK += "&& cp $$PWD/Crashpad/attachment.txt $$OUT_PWD/attachment.txt" #if any attachment is needed
}
The build generates both myQtCrasher.debug and the externally generated myQtCrasher.sym symbols file.
Using their dummy database (the credentials are fred@bugsplat.com with Flintstone as the password), I managed to report a crash, but for some reason the crash report does not contain the uploaded symbols. I tried to upload them manually by running dump_syms and then symupload, sending the request to https://fred.bugsplat.com/post/bp/symbol/breakpadsymbols.php?appName=myQtCrasher&appVer=4.9, but without success.
The symupload application output is:
Failed to open curl lib from binary, use libcurl.so instead
Successfully sent the symbol file.
How can I properly upload the *.sym file and view the stack trace for a crash?
Thanks for your help!
We were able to get the symbols to resolve for this crash report. Right after the symupload warning "Failed to open curl lib from binary, use libcurl.so instead", it says "Successfully sent the symbol file." I confirmed that the symbol file was uploaded correctly.
I found two issues with the symbol file. When minidump_stackwalk was looking for the corresponding symbols, it was looking for:
/myQtCrasher-4.9/myQtCrasher/C03D64A46AB29A093459A592482836E50/myQtCrasher.sym
The file that was uploaded to BugSplat was myQtCrasher.debug.sym and the module on the first line of the sym file was myQtCrasher.debug. I changed the file name to myQtCrasher.sym and the module name to myQtCrasher and the symbols for the myQtCrasher stack frames displayed function names and line numbers.
I'm not sure if these issues with mismatched symbols were due to your script changes but it seems like our script attempts to set the following variables:
app="${2}/${4}.debug"
sym="${4}.sym"
Therefore the script expects the user to generate sym files from the .debug file, but name them based on the corresponding executable.
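The manual fix described above (renaming the module on the first line and naming the file after the executable) can be sketched in Python. This is a hedged illustration, not BugSplat's tooling: the function names are hypothetical, the MODULE record layout (MODULE, OS, architecture, debug id, name) follows the Breakpad symbol file format, and the name/id/name.sym layout is inferred from the path quoted above.

```python
def parse_module_record(sym_text):
    """Split the first line of a Breakpad .sym file into its five fields."""
    first_line = sym_text.partition("\n")[0]
    # A MODULE record reads: MODULE <os> <arch> <debug id> <name>
    fields = first_line.split(" ", 4)
    if len(fields) != 5 or fields[0] != "MODULE":
        raise ValueError("first line is not a MODULE record")
    return fields


def rename_module(sym_text, module_name):
    """Return sym_text with the name in its MODULE record replaced."""
    fields = parse_module_record(sym_text)
    fields[4] = module_name
    _, newline, rest = sym_text.partition("\n")
    return " ".join(fields) + newline + rest


def expected_sym_path(sym_text):
    """Return the <name>/<debug id>/<name>.sym path for this symbol file."""
    _, _, _, debug_id, name = parse_module_record(sym_text)
    return f"{name}/{debug_id}/{name}.sym"
```

With a first line ending in myQtCrasher.debug, rename_module(text, "myQtCrasher") yields a record whose expected path ends in myQtCrasher.sym, which is the name the stack walker was looking for.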

Bash file descriptors vs Linux file descriptors

I'm just trying to reconcile these two seemingly similar concepts.
In Bash, one is allowed to make arbitrary redirections, and importantly, using one's chosen file descriptor number. However in Linux, the value returned by an open call (AFAIK) cannot be chosen by the calling process.
Thus, are Bash fd numbers the same as the fd numbers returned by system calls? If not, what's the difference?
Here's a little experiment that might shed some light on what's going on when you open a file descriptor in bash with a number of your choosing:
> cat test.txt
foobar!
> cat test.sh
#!/bin/bash
exec 17<test.txt
read -u 17 line
echo "$line"
exec 17>&-
> strace ./test.sh
//// A bunch of stuff omitted so we can skip to the interesting part...
open("test.txt", O_RDONLY) = 3
fcntl(17, F_GETFD) = -1 EBADF (Bad file descriptor)
dup2(3, 17) = 17
close(3) = 0
fcntl(17, F_GETFD) = 0
ioctl(17, TCGETS, 0x7ffc56f093f0) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(17, 0, SEEK_CUR) = 0
read(17, "foobar!\n", 128) = 8
write(1, "foobar!\n", 8foobar!) = 8
fcntl(17, F_GETFD) = 0
fcntl(17, F_DUPFD, 10) = 10
fcntl(17, F_GETFD) = 0
fcntl(10, F_SETFD, FD_CLOEXEC) = 0
close(17) = 0
The part that answers your question is where it calls open() on test.txt, which returns a value of 3. This is what you would most likely get in a C program if you did the same, because file descriptors 0, 1, and 2 (i.e., stdin, stdout, and stderr) are all you have open initially. The number 3 is just the next available file descriptor.
And we see that in the strace output of the bash script as well. What bash does differently is that it then calls fcntl(17, F_GETFD) to check if file descriptor 17 is already open (because it wants to use that fd for test.txt). Then, when fcntl returns EBADF indicating no such fd is open, bash knows it is free to use it. So then it calls dup2(3, 17) to make fd 17 a copy of fd 3. Finally, it calls close() on fd 3 to free it up again, leaving fd 17 (and only fd 17) as an open file descriptor for test.txt.
So the answer to your question is that bash file descriptors are not special creatures set apart from the "normal" file descriptors that everyone else uses. They are in fact just the same thing. You could easily use the same trick in your C program to open files with file descriptor numbers of your choosing.
Also, it's worth pointing out that bash doesn't really get to choose its own file descriptor when it calls open(). It has to make do with whatever open() returns, like everyone else. All that's really going on in your bash script is some smoke and mirrors (via dup2()) to make it seem as if you get to choose your own file descriptor.
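The same open, check, dup2, close sequence seen in the strace output can be reproduced from any program. Here is a minimal Python sketch of it for a Unix-like system; the helper names fd_is_free and open_at are made up for illustration and are not part of any library.

```python
import errno
import fcntl
import os


def fd_is_free(fd):
    """Return True if fd is not currently open in this process."""
    try:
        fcntl.fcntl(fd, fcntl.F_GETFD)
    except OSError as exc:
        if exc.errno == errno.EBADF:
            return True  # same EBADF answer bash got for fd 17
        raise
    return False


def open_at(path, target_fd):
    """Open path read-only so that it ends up on target_fd."""
    if not fd_is_free(target_fd):
        raise ValueError(f"fd {target_fd} is already in use")
    fd = os.open(path, os.O_RDONLY)  # the kernel picks this number
    if fd == target_fd:
        return fd
    try:
        os.dup2(fd, target_fd)  # copy it onto the number we wanted
    finally:
        os.close(fd)  # drop the kernel-chosen descriptor
    return target_fd
```

For example, fd = open_at("test.txt", 17) followed by os.read(fd, 128) mirrors what the script did with exec 17<test.txt and read -u 17.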

How to add new file descriptors under /dev/fd?

I am using a remote Linux server over ssh, so I don't have superuser privileges. However, the file descriptors available under /dev/fd are not enough:
user >ls /dev/fd/
0 1 2
or:
user >ls /proc/self/fd
0 1 2
What I want is to add new file descriptors so that I can redirect the output streams like this:
user >./main.exe 1>1.txt 2>2.txt 3>3.txt ...
Since there are not enough file descriptors, I can't use one such as /dev/fd/3; trying to do so triggers an error:
IOError: [Errno 2] No such file or directory: '/dev/fd/3'
/dev/fd is not a real directory. You don't add files to it, it just shows which fds the process (ls in your case) has open.
To open new FDs from the shell, you can just run
./yourprogram 3>myfile
If the program writes to FD 3, the output will end up in myfile.
Here's an example:
$ cat foo.c
#include <unistd.h>
int main(void) {
    /* fd 3 is expected to be opened by the shell redirection (3> myfile) */
    write(3, "hello world\n", 12);
    return 0;
}
$ gcc foo.c -o foo
$ ./foo 3> myfile
$ cat myfile
hello world
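The point that /dev/fd merely reflects what the current process has open can be checked directly. The Python sketch below assumes a system that provides /dev/fd (Linux or macOS, for instance); the function names are hypothetical.

```python
import os
import tempfile


def fd_entry_exists(fd):
    """Return True if /dev/fd/<fd> exists for the current process."""
    # lexists checks the entry itself and does not open any descriptor.
    return os.path.lexists(f"/dev/fd/{fd}")


def demo():
    """Report whether a scratch file's fd is listed while open and after close."""
    with tempfile.TemporaryFile() as scratch:
        fd = scratch.fileno()
        while_open = fd_entry_exists(fd)
    after_close = fd_entry_exists(fd)
    return while_open, after_close
```

The entry appears as soon as the descriptor is opened and disappears when it is closed, which is why /dev/fd/3 does not exist until something such as 3>myfile opens it.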

node.js filesystem mangles Cygwin drive name

I recently installed Cygwin64 after using Cygwin32 for quite some time, and I'm now having problems with one of our production scripts. Node.js's 'readFileSync' seems to be prepending the Windows drive letter to the path and then failing to resolve it, e.g. /cygdrive/c/foo becomes c:/cygdrive/c/foo.
I've found various mentions of similar issues online but so far I've been unable to resolve this problem. My co-worker has a seemingly identical setup and does not experience the problem.
Here it is in a nutshell -
$ cat filetest.js
var fs = require('fs');
function main(argv) {
console.log("fileName => ", argv[2]);
var data = fs.readFileSync(argv[2], 'utf8');
console.log("success");
}
main(process.argv);
$ s/node filetest.js filetest.js
fileName => filetest.js
success
$ ls -l /cygdrive/c/temp/Test.txt
----rwx---+ 1 bdodd Domain Users 14 Jun 3 14:12 /cygdrive/c/temp/Test.txt
$ s/node filetest.js /cygdrive/c/temp/Test.txt
fileName => /cygdrive/c/temp/Test.txt
fs.js:338
return binding.open(pathModule._makeLong(path), stringToFlags(flags), mode);
^
Error: ENOENT, no such file or directory 'C:\cygdrive\c\temp\Test.txt'
and just for completeness...
$ cat /etc/fstab
# /etc/fstab
#
# This file is read once by the first process in a Cygwin process tree.
# To pick up changes, restart all Cygwin processes. For a description
# see https://cygwin.com/cygwin-ug-net/using.html#mount-table
# This is default anyway:
none /cygdrive cygdrive binary,posix=0,user 0 0
$ cd /cygdrive/
$ ls -l
total 40
d---rwx---+ 1 NT SERVICE+TrustedInstaller NT SERVICE+TrustedInstaller 0 Jun 3 14:09 c
dr-xrwxr-x 1 Unknown+User Unix_Group+33 0 Apr 29 2014 u
Insight would be greatly appreciated. Thanks!

What is the command in Linux related to structure size

Some time ago I came across a Linux command that writes the detailed sizes of the structures defined in a source file to an output file with the same name as the source file but a different extension. Does anyone know of such a command?
Thanks
My best guess is you are talking about nm which lists symbols from object files. A quick example:
file test.c
int int_array[10];
double double_array[10];
int main()
{
int_array[0] = 0;
double_array[0] = 0;
return 0;
}
Build an object file:
$ gcc -c test.c
Now list symbols with size information:
$ nm -S test.o
This prints something like this on my macbook:
0000000000000040 n EH_frame0
0000000000000050 C _double_array
0000000000000028 C _int_array
0000000000000000 T _main
0000000000000058 N _main.eh
Check the nm manpage for further information (http://linux.about.com/library/cmd/blcmdl1_nm.htm)
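As a cross-check on the numbers above, the sizes of the two arrays can be computed independently. The sketch below uses Python's ctypes to mirror the declarations in test.c; the names IntArray, DoubleArray and hex_size are made up here. On a platform where int is 4 bytes and double is 8 bytes, the results are 0x28 and 0x50, which matches the values printed next to _int_array and _double_array (for the two C, i.e. common, symbols the first column appears to hold the size).

```python
import ctypes

# Mirror the two globals from test.c as ctypes array types.
IntArray = ctypes.c_int * 10        # int int_array[10];
DoubleArray = ctypes.c_double * 10  # double double_array[10];


def hex_size(ctype):
    """Return the size of a ctypes type as a 16-digit hex string, like nm."""
    return format(ctypes.sizeof(ctype), "016x")


def report():
    """Return a mapping from array name to its hex-formatted size."""
    return {
        "int_array": hex_size(IntArray),
        "double_array": hex_size(DoubleArray),
    }
```

Calling report() gives the sizes for the platform the script runs on, so the values can differ from the listing if the C types have different widths there.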
