Get directory entries for "slow protocols" asynchronously - linux

I want a function for getting directory entries on Linux. I use ioutil.ReadDir and usually it is fast.
But if I want to read some mounted virtual file system on /run/user/1000/gvfs/, this function becomes slow. If the directory has many file entries I need to wait a long time.
If I run the ls command in a terminal, the result is the same.
When I tried ls -U -a -p -1, I got line-by-line output immediately.
I tried running this in Go with exec.Command, but it didn't work asynchronously: Go waits for the program's full output. What did I do wrong?
m.cmd = exec.Command("ls", "-U", "-a", "-p", "-1")
// for example some "slow" directory:
m.cmd.Dir = "/run/user/1000/gvfs/dav:host=webdav.yandex.ru,ssl=true,user=...../"
reader, _ := m.cmd.StdoutPipe()
bufReader := bufio.NewReader(reader)
go func() {
	m.cmd.Start()
	for {
		line, _, err := bufReader.ReadLine()
		if err != nil {
			break
		}
		linestr := string(line)
		if linestr != "./" && linestr != "../" {
			fmt.Println(linestr)
		}
	}
}()
I need line by line printing immediately in Go.

Try ls -U -a -p -1 | cat to see if you get line-by-line output.
Go doesn't control ls; ls does line-by-line writing if ls chooses to do so, and ls chooses not to do that when its output is a pipe. You could allocate a pty pair and use that, but that's the wrong way to do this.
ioutil.ReadDir first reads the entire directory (by calling Readdir(-1)), then sorts the file names. If you use os.Open to open the directory, then call the Readdir or Readdirnames function with a small (but not negative) number, you should get something more to your liking.

Scons PreAction Command is printed but apparently not executed

I'm building a large project with SCons. For reasons outside the scope of this question (long story), I need to pass the object files to the final link command through a file.
E.g.:
gcc -o program.elf @objects_file.txt -T linker_file.ld
This command works; I've tested it manually. But now I need to run it from within the project build files. My first idea was to collect all the options into a file in the following way:
dbg_exe = own_env.Program('../' + target_path, components)
own_env.AddPreAction(dbg_exe, 'echo \'$SOURCES\' > objects_file.txt')
Note: $SOURCES contains all the object files I need.
As I expected, the command appears to run: I see it printed in the terminal. But for some reason it is not actually executed, since I can't find objects_file.txt anywhere.
Curiously, if I copy and paste the printed line into the same terminal, the command succeeds, so I assume the constructed syntax is correct.
I also tried a shorter test:
own_env.AddPreAction(dbg_exe, 'ls -l > salida_ls.txt')
... and got another surprise: this time I get an error in the console:
scons: done reading SConscript files.
scons: Building targets ...
ls -l > salida_ls.txt
ls: cannot access '>': No such file or directory
ls: cannot access 'salida_ls.txt': No such file or directory
A plain 'ls -l' works fine.
Any idea why shell commands like these don't work as expected? Is the > redirection symbol affecting SCons?
Some possibly useful information:
OS: Windows 10
Terminal: mingw32
SCons v2.3.1
After searching, I found that this is related to the redefinition of the SPAWN construction variable:
def w32api_spawn(sh, escape, cmd, args, e_env):
    print "CMD value"
    print sh
    print escape
    print cmd
    print args
    print e_env
    print " ********************************** "
    if cmd == "SHELL":
        return SCons.Platform.win32.spawn(sh,escape,args[1], args[1:],e_env)
    cmdline = cmd + ' ' + string.join(args[1:], ' ')
    startupinfo = subprocess.STARTUPINFO()
    startupinfo.dwFlags |= _subprocess.STARTF_USESHOWWINDOW
    proc = subprocess.Popen(
        cmdline,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        startupinfo=startupinfo,
        shell = False,
        env = None
    )
    data, err = proc.communicate()
    print data
    rv = proc.wait()
    if rv:
        print "====="
        print err
        print "====="
    return rv
Looks like you'll need to swap back to the default SPAWN for that Program().
Add this to the top of that SConscript
from SCons.Platform.win32 import spawn
Then replace the logic you pasted above with:
dbg_exe = own_env.Program('../' + target_path, components, SPAWN=spawn)
own_env.AddPreAction(dbg_exe, 'echo \'$SOURCES\' > objects_file.txt')
This assumes that you're only building on win32. If that's not true you'll need to conditionally add the SPAWN to your Program() above only when you're on win32.
In the end I found a workaround: a native Python function that builds the file I needed. Unfortunately I can't spend more time on this issue. I didn't find the real cause or a proper solution, but it is clear that the problem lies not in SCons's normal behavior but in the trick performed in the custom SPAWN.
scons_common.GenerateObjectsFile('../' + objects_file, components)

How can I dynamically pass arguments to a node script using unix commands?

// index.js
console.log(process.argv) // expect this to print [.., .., '1']
# terminal
$ echo 1 | node index.js   # just prints [.., ..]
What's the trick? How do I dynamically pass arguments to a node script from the command line via unix commands like echo, ls, ps aux, and so on?
Note: I see that I can read the output of unix commands from within my script using stdin, but what I'd like is to truly pass arguments to the script from the command line.
$ echo 1 | node index.js
In this command, echo prints 1 to its standard output, which the pipe connects to the standard input of the node command; node itself receives only index.js as an argument. If you want to read the string printed by echo, read the standard input, e.g.:
var text = '';
process.stdin.setEncoding('utf8');
process.stdin.on('readable', function () {
  var chunk = process.stdin.read();
  if (chunk !== null) {
    text += chunk;
  }
});
process.stdin.on('end', function () {
  console.log(text);
});
How do I dynamically pass arguments to a node script from the command line via unix commands like echo, ls, ps aux, and so on?
With a pipe you can only redirect the bulk output from a command. You may use command substitution to pass the outputs of multiple commands as strings, e.g.:
node index.js --arg1="$(ls -ld /tmp)" --arg2="$(stat -c%a /tmp)"
Assign the output of the commands to shell variables in order to make your script more readable:
arg1="$(ls -ld /tmp)"
node index.js --arg1="$arg1"
A friend of mine showed me this:
$ node index.js `echo 1 2 3 4`
This does exactly what I want. It results in:
// index.js
process.argv // [.., .., '1', '2', '3', '4']
The difference between this and @RuslanOsmanov's answer is that the above passes the whole output as the arguments to the node process, whereas:
$ node index.js --arg1=`echo 1` --arg2=`echo 2`
requires an individual command for each individual argument.
It would not work as expected with ls if your filenames contain spaces, as space characters are treated as argument delimiters.
See What does the backtick - ` - do in a command line invocation specifically with regards to Git commands? for more about this use of back ticks.
You’ve been using the process.stdout writable stream implicitly every time you’ve called console.log(). Internally, the console.log() function calls process.stdout.write() after formatting the input arguments. But the console functions are more for debugging and inspecting objects. When you need to write structured data to stdout, you can call process.stdout.write() directly.
Say your program connects to an HTTP URL and writes the response to stdout. Stream#pipe() works well in this context, as shown here:
var http = require('http');
var url = require('url');
var target = url.parse(process.argv[2]);
var req = http.get(target, function (res) {
  res.pipe(process.stdout);
});
Before you can read from stdin, you must call process.stdin.resume() to indicate that your script is interested in data from stdin. After that, stdin acts like any other readable stream, emitting data events as data is received from the output of another process, or as the user enters keystrokes into the terminal window.
The following listing shows a command-line program that prompts the user for their age before deciding whether to continue executing.
To improve on @Joseph's answer: you can put the command either between backticks (`...`) or inside $(...):
$ node index.js --arg1=`echo 1` --arg2=$(echo 2)

How does grep know it is writing to the input file?

If I try to redirect the output of grep to the same file that it is reading from, like so:
$ grep stuff file.txt > file.txt
I get the error message grep: input file 'file.txt' is also the output. How does grep determine this?
According to the GNU grep source code, grep checks the i-nodes of the input and the output:
if (!out_quiet && list_files == 0 && 1 < max_count
    && S_ISREG (out_stat.st_mode) && out_stat.st_ino
    && SAME_INODE (st, out_stat)) /* <------------------ */
  {
    if (! suppress_errors)
      error (0, 0, _("input file %s is also the output"), quote (filename));
    errseen = 1;
    goto closeout;
  }
The out_stat is filled by calling fstat against STDOUT_FILENO.
if (fstat (STDOUT_FILENO, &tmp_stat) == 0 && S_ISREG (tmp_stat.st_mode))
  out_stat = tmp_stat;
Looking at the source code - you can see that it checks for this case (the file is already open for reading by grep) and reports it, see the SAME_INODE check below:
/* If there is a regular file on stdout and the current file refers
   to the same i-node, we have to report the problem and skip it.
   Otherwise when matching lines from some other input reach the
   disk before we open this file, we can end up reading and matching
   those lines and appending them to the file from which we're reading.
   Then we'd have what appears to be an infinite loop that'd terminate
   only upon filling the output file system or reaching a quota.
   However, there is no risk of an infinite loop if grep is generating
   no output, i.e., with --silent, --quiet, -q.
   Similarly, with any of these:
     --max-count=N (-m) (for N >= 2)
     --files-with-matches (-l)
     --files-without-match (-L)
   there is no risk of trouble.
   For --max-count=1, grep stops after printing the first match,
   so there is no risk of malfunction.  But even --max-count=2, with
   input==output, while there is no risk of infloop, there is a race
   condition that could result in "alternate" output.  */
if (!out_quiet && list_files == 0 && 1 < max_count
    && S_ISREG (out_stat.st_mode) && out_stat.st_ino
    && SAME_INODE (st, out_stat))
  {
    if (! suppress_errors)
      error (0, 0, _("input file %s is also the output"), quote (filename));
    errseen = true;
    goto closeout;
  }
Here is how to write back to some file:
grep stuff file.txt > tmp && mv tmp file.txt
Try a pipeline with cat or tac:
cat file | grep 'searchpattern' > newfile
It's best practice and quick to implement.

Running sh/bash/python scripts with arguments using Go

I've been stuck on this one for a few days. I'm trying to run a bash script that acts on its first argument (maybe I should give up all hope, haha).
The syntax for running the script can be assumed to be:
sudo bash script argument, or, since it has og+x, it can be run as just sudo script argument
In Go I'm running it using the following:
package main

import (
	"os"
	"os/exec"
	"fmt"
)

func main() {
	c := exec.Command("/bin/bash", "script " + argument)
	if err := c.Run(); err != nil {
		fmt.Println("Error: ", err)
	}
	os.Exit(0)
}
I have had absolutely no luck, I've tried loads of other variations as well for this...
exec.Command("/bin/sh", "-c", "sudo script", argument)
exec.Command("/bin/sh", "-c", "sudo script " + argument) (my first try)
exec.Command("/bin/bash", "-c", "sudo script" + argument)
exec.Command("/bin/bash", "sudo script", argument)
exec.Command("/bin/bash sudo script" + argument)
With most of these I get '/bin/bash sudo etc.': no such file or directory, or Error: exit status 1. I have even gone as far as writing a Python wrapper that takes an argument and executes the bash script with subprocess. To rule out the script's path not being resolved, I have tried all of the above with the full path to the script rather than just its name.
For the sake of my remaining hair, what am I doing wrong here? How can I better diagnose this problem so that I can get more information rather than exit status 1?
You don't need to call bash/sh at all; simply pass each argument as a separate string. To see the error, you also have to capture the command's stderr. Here's a working example:
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
)

func main() {
	c := exec.Command("sudo", "ls", "/tmp")
	stderr := &bytes.Buffer{}
	stdout := &bytes.Buffer{}
	c.Stderr = stderr
	c.Stdout = stdout
	if err := c.Run(); err != nil {
		fmt.Println("Error: ", err, "|", stderr.String())
	} else {
		fmt.Println(stdout.String())
	}
	os.Exit(0)
}

Detect if pid is zombie on Linux

We can detect whether a process is a zombie from the shell command line:
ps ef -o pid,stat | grep <pid> | grep Z
To get that info in our C/C++ programs we use popen(), but we would like to avoid using popen(). Is there a way to get the same result without spawning additional processes?
We are using Linux 2.6.32-279.5.2.el6.x86_64.
You need to use the proc(5) filesystem. Access to files inside it (e.g. /proc/1234/stat ...) is really fast (it does not involve any physical I/O).
You probably want the third field from /proc/1234/stat (which is readable by everyone, but you should read it sequentially, since it is unseekable). If that field is Z, then the process with pid 1234 is a zombie.
There is no need to fork a process (e.g. with popen or system); in C you might code:
// needs <stdbool.h>, <stdio.h>, <stdlib.h> and <sys/types.h>
pid_t somepid;
// put the process pid you are interested in into somepid
bool iszombie = false;
// open the /proc/*/stat file
char pbuf[32];
snprintf(pbuf, sizeof(pbuf), "/proc/%d/stat", (int) somepid);
FILE* fpstat = fopen(pbuf, "r");
if (!fpstat) { perror(pbuf); exit(EXIT_FAILURE); }
{
    int rpid = 0; char rcmd[32]; char rstatc = 0;
    // only trust rstatc if all three fields were parsed
    if (fscanf(fpstat, "%d %30s %c", &rpid, rcmd, &rstatc) == 3)
        iszombie = rstatc == 'Z';
}
fclose(fpstat);
Consider also procps and libproc; see this answer.
(You could also read the second line of /proc/1234/status but this is probably harder to parse in C or C++ code)
BTW, I find that the stat file in /proc/ has a weird format: if your executable happens to contain both spaces and parentheses in its name (which is disgusting, but permitted), parsing the /proc/*/stat file becomes tricky.
