Broken symlinks and a mysterious (deleted) - Linux

I've been working with the proc filesystem on Linux, and I've come across some behavior I'd like to have clarified.
Each process in /proc has a symlink to its executable file, /proc/{pid}/exe. If a process continues to run after its executable has been deleted, reading this symlink will return the path to the executable, with (deleted) appended to the end.
Running this command you may even see a few on your system:
grep '(deleted)' <(for dir in $(ls /proc | grep -E '^[0-9]+'); do echo "$dir $(readlink /proc/$dir/exe)"; done)
I tried recreating this behavior with some simple bash commands:
>>> echo "temporary file" >> tmpfile.test
>>> ln -s tmpfile.test tmpfile.link
>>> rm tmpfile.test
>>> readlink tmpfile.link
tmpfile.test
There is no (deleted) appended to the name! Trying a cat tmpfile.link confirms that the link is broken (cat: tmpfile.link: No such file or directory).
However, the other day this same test did result in a (deleted) being appended to the output of readlink. What gives?
Here is what I would like to know:
Is there a sequence of events that guarantees (deleted) will be appended to the name?
Why does /proc/{pid}/exe show (deleted) for removed executables?
How can I get the name of an executable through /proc/{pid}/exe without any appended (deleted), and guarantee that the original executable wasn't just named some_executable (deleted)?

It is not readlink doing this: Linux itself changes the symlink to point to <filename> (deleted), i.e., (deleted) gets appended to the target of the link.
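You can watch this from user space with readlink(2). Below is a minimal C sketch (the pid 1234 is a placeholder for whatever process you care about); it also strips a trailing " (deleted)", which, as the question notes, is inherently ambiguous, since an executable really could have been named some_executable (deleted):
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    // 1234 is a placeholder pid; substitute the process you care about.
    ssize_t n = readlink("/proc/1234/exe", buf, sizeof(buf) - 1);
    if (n < 0) { perror("readlink"); return 1; }
    buf[n] = '\0';  // readlink(2) does not NUL-terminate

    // If the executable was unlinked, the target ends in " (deleted)".
    const char *suffix = " (deleted)";
    size_t slen = strlen(suffix);
    if ((size_t)n >= slen && strcmp(buf + n - slen, suffix) == 0)
        buf[n - slen] = '\0';  // strip it; ambiguous if the file was really named that

    printf("%s\n", buf);
    return 0;
}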

FWIW, the special <filename> (deleted) behavior is implemented in the Linux kernel function d_path() here: https://elixir.bootlin.com/linux/v4.1.13/source/fs/dcache.c#L3080.
The source comments (in the snippet below) suggest the special behavior only applies to names (paths) generated on-the-fly for some 'synthetic filesystems' (e.g. procfs) and 'pseudo inodes'.
/**
 * d_path - return the path of a dentry
 * @path: path to report
 * @buf: buffer to return value in
 * @buflen: buffer length
 *
 * Convert a dentry into an ASCII path name. If the entry has been deleted
 * the string " (deleted)" is appended. Note that this is ambiguous.
 *
 * Returns a pointer into the buffer or an error code if the path was
 * too long. Note: Callers should use the returned pointer, not the passed
 * in buffer, to use the name! The implementation often starts at an offset
 * into the buffer, and may leave 0 bytes at the start.
 *
 * "buflen" should be positive.
 */
char *d_path(const struct path *path, char *buf, int buflen)
{
    char *res = buf + buflen;
    struct path root;
    int error;

    /*
     * We have various synthetic filesystems that never get mounted. On
     * these filesystems dentries are never used for lookup purposes, and
     * thus don't need to be hashed. They also don't need a name until a
     * user wants to identify the object in /proc/pid/fd/. The little hack
     * below allows us to generate a name for these objects on demand:
     *
     * Some pseudo inodes are mountable. When they are mounted
     * path->dentry == path->mnt->mnt_root. In that case don't call d_dname
     * and instead have d_path return the mounted path.
     */
    if (path->dentry->d_op && path->dentry->d_op->d_dname &&
        (!IS_ROOT(path->dentry) || path->dentry != path->mnt->mnt_root))
        return path->dentry->d_op->d_dname(path->dentry, buf, buflen);

    rcu_read_lock();
    get_fs_root_rcu(current->fs, &root);
    error = path_with_deleted(path, &root, &res, &buflen);
    rcu_read_unlock();

    if (error < 0)
        res = ERR_PTR(error);
    return res;
}

Related

Get directory entries for "slow protocols" asynchronously

I want a function for getting directory entries on Linux. I use ioutil.ReadDir and usually it is fast.
But if I want to read some mounted virtual file system on /run/user/1000/gvfs/, this function becomes slow. If the directory has many file entries I need to wait a long time.
I can use the ls command in a terminal and the result will be the same.
When I tried ls -U -a -p -1 I got line by line output immediately.
I tried running this in Go with exec.Command, but it didn't work asynchronously. Go is waiting for full program output. What did I do wrong?
m.cmd = exec.Command("ls", "-U", "-a", "-p", "-1")
// for example some "slow" directory:
m.cmd.Dir = "/run/user/1000/gvfs/dav:host=webdav.yandex.ru,ssl=true,user=...../"
reader, _ := m.cmd.StdoutPipe()
bufReader := bufio.NewReader(reader)
go func() {
    m.cmd.Start()
    for {
        line, _, err := bufReader.ReadLine()
        if err != nil {
            break
        }
        linestr := string(line)
        if linestr != "./" && linestr != "../" {
            fmt.Println(linestr)
        }
    }
}()
I need line by line printing immediately in Go.
Try ls -U -a -p -1 | cat to see if you still get line-by-line output.
Go doesn't control ls; ls does line-by-line writing if ls chooses to do so, and ls chooses not to do that when its output is a pipe. You could allocate a pty pair and use that, but that's the wrong way to do this.
ioutil.ReadDir first reads the entire directory (by calling Readdir(-1)), then sorts the file names. If you use os.Open to open the directory, then call its Readdir or Readdirnames method with a small positive count, you should get something more to your liking.
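For reference, the incremental pattern that answer describes looks like this one level down — a minimal sketch in C using opendir(3)/readdir(3), which emits each entry as it is read instead of collecting and sorting the whole listing first:
#include <dirent.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : ".";
    DIR *dir = opendir(path);
    if (!dir) { perror(path); return 1; }

    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        // Each entry is printed as soon as it arrives -- no sorting,
        // no buffering of the full listing.
        printf("%s\n", ent->d_name);
        fflush(stdout);  // stay line-by-line even when stdout is a pipe
    }
    closedir(dir);
    return 0;
}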

Perl script cron/environment issue

The following Perl script generates an .xls file from a text file. It runs great in our Linux test environment, but generates an empty spreadsheet (.xls) in our production environment when run via cron (cron works in test, as well). Nothing jumps out at our sysadmins in terms of system-level settings that might account for this behavior. Towards the bottom of the script, in the import_data subroutine, the correct number of lines is reported, but nothing is written to the spreadsheet and no errors are returned at either the script or system level. I ran it through the Perl debugger, but my skills fell short of being able to interactively watch it populate the file. The cron entry looks like this:
cd <script directory>; cvs2xls input.txt output.xls 2>&1
Any debugging tips would be appreciated, as well as potential system settings that I can forward on to our sysadmins.
#!/usr/bin/perl
use strict;
use warnings;
use lib '/apps/tu01688/perl5/lib/perl5';
use Spreadsheet::WriteExcel;
use Text::CSV::Simple;
BEGIN {
    unshift @INC, "/apps/tu01688/jobs/mayo-expert";
};
my $infile = shift;
usage() unless defined $infile && -f $infile;
my $parser = Text::CSV::Simple->new;
my @data = $parser->read_file($infile);
my $headers = shift @data;
my $outfile = shift || $infile . ".xls";
my $subject = shift || 'worksheet';
sub usage {
    print "csv2xls infile [outfile] [subject]\n";
    exit;
}
my $workbook = Spreadsheet::WriteExcel->new($outfile);
my $bold = $workbook->add_format();
$bold->set_bold(1);
import_data($workbook, $subject, $headers, \@data);
# Add a worksheet
sub import_data {
    my $workbook  = shift;
    my $base_name = shift;
    my $colums    = shift;
    my $data      = shift;
    my $limit     = shift || 50_000;
    my $start_row = shift || 1;
    my $worksheet = $workbook->add_worksheet($base_name);
    $worksheet->add_write_handler(qr[\w], \&store_string_widths);
    #$worksheet->add_write_handler(qr[\w]| \&store_string_widths);
    my $w = 1;
    $worksheet->write('A' . $start_row, $colums, $bold);
    my $i = $start_row;
    my $qty = 0;
    for my $row (@$data) {
        $qty++;
        if ($i > $limit) {
            $i = $start_row;
            $w++;
            $worksheet = $workbook->add_worksheet("$base_name - $w");
            $worksheet->write('A1', $colums, $bold);
        }
        $worksheet->write($i++, 0, $row);
    }
    autofit_columns($worksheet);
    warn "Converted $qty rows.";
    return $worksheet;
}
###############################################################################
###############################################################################
#
# Functions used for Autofit.
#
###############################################################################
#
# Adjust the column widths to fit the longest string in the column.
#
sub autofit_columns {
    my $worksheet = shift;
    my $col = 0;
    for my $width (@{$worksheet->{__col_widths}}) {
        $worksheet->set_column($col, $col, $width) if $width;
        $col++;
    }
}
###############################################################################
#
# The following function is a callback that was added via add_write_handler()
# above. It modifies the write() function so that it stores the maximum
# unwrapped width of a string in a column.
#
sub store_string_widths {
    my $worksheet = shift;
    my $col   = $_[1];
    my $token = $_[2];
    # Ignore some tokens that we aren't interested in.
    return if not defined $token;    # Ignore undefs.
    return if $token eq '';          # Ignore blank cells.
    return if ref $token eq 'ARRAY'; # Ignore array refs.
    return if $token =~ /^=/;        # Ignore formulas.
    # Ignore numbers
    #return if $token =~ /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/;
    # Ignore various internal and external hyperlinks. In a real scenario
    # you may wish to track the length of the optional strings used with
    # urls.
    return if $token =~ m{^[fh]tt?ps?://};
    return if $token =~ m{^mailto:};
    return if $token =~ m{^(?:in|ex)ternal:};
    # We store the string width as data in the Worksheet object. We use
    # a double underscore key name to avoid conflicts with future names.
    #
    my $old_width    = $worksheet->{__col_widths}->[$col];
    my $string_width = string_width($token);
    if (not defined $old_width or $string_width > $old_width) {
        # You may wish to set a minimum column width as follows.
        #return undef if $string_width < 10;
        $worksheet->{__col_widths}->[$col] = $string_width;
    }
    # Return control to write();
    return undef;
}
###############################################################################
#
# Very simple conversion between string length and string width for Arial 10.
# See below for a more sophisticated method.
#
sub string_width {
    return length $_[0];
}
Uhmm... don't put chained commands in cron; use an external script instead. Anyway, some suggestions that may help you:
Debugging cron commands
Check the mail! By default cron will mail any output from the command to the user it is running the command as. If there is no output, there will be no mail. If you want cron to send mail to a different account, you can set the MAILTO environment variable in the crontab file, e.g.
MAILTO=user@somehost.tld
1 2 * * * /path/to/your/command
Capture the output yourself
1 2 * * * /path/to/your/command &>/tmp/mycommand.log
which captures stdout and stderr to /tmp/mycommand.log
Look at the logs; cron logs its actions via syslog, which (depending on your setup) often goes to /var/log/cron or /var/log/syslog.
If required you can filter the cron statements with e.g.
grep CRON /var/log/syslog
Now that we've gone over the basics of cron, where the files are and how to use them, let's look at some common problems.
Check that cron is running
If cron isn't running then your commands won't be scheduled ...
ps -ef | grep cron | grep -v grep
should get you something like
root 1224 1 0 Nov16 ? 00:00:03 cron
or
root 2018 1 0 Nov14 ? 00:00:06 crond
If not restart it
/sbin/service cron start
or
/sbin/service crond start
There may be other methods; use what your distro provides.
cron runs your command in a restricted environment.
What environment variables are available is likely to be very limited. Typically, you'll only get a few variables defined, such as $LOGNAME, $HOME, and $PATH.
Of particular note, PATH is restricted to /bin:/usr/bin. The vast majority of "my cron script doesn't work" problems are caused by this restrictive path. If your command is in a different location, you can solve this in a couple of ways:
Provide the full path to your command.
1 2 * * * /path/to/your/command
Provide a suitable PATH in the crontab file
PATH=/usr:/usr/bin:/path/to/something/else
1 2 * * * command
If your command requires other environment variables you can define them in the crontab file too.
cron runs your command with cwd == $HOME
Regardless of where the program you execute resides on the filesystem, the current working directory of the program when cron runs it will be the user's home directory. If you access files in your program, you'll need to take this into account if you use relative paths, or (preferably) just use fully-qualified paths everywhere, and save everyone a whole lot of confusion.
The last command in my crontab doesn't run
Cron generally requires that commands are terminated with a new line. Edit your crontab; go to the end of the line which contains the last command and insert a new line (press enter).
Check the crontab format
You can't use a user crontab formatted crontab for /etc/crontab or the fragments in /etc/cron.d and vice versa. A user formatted crontab does not include a username in the 6th position of a row, while a system formatted crontab includes the username and runs the command as that user.
I put a file in /etc/cron.{hourly,daily,weekly,monthly} and it doesn't run
Check that the filename doesn't have an extension; see run-parts.
Ensure the file has execute permissions.
Tell the system what to use when executing your script (eg. put #!/bin/sh at top)
Cron date related bugs
If your date was recently changed by a user or a system update, a timezone change, or otherwise, then crontab can start behaving erratically and exhibiting bizarre bugs, sometimes working, sometimes not. This is crontab's attempt to "do what you want" when the time changes out from underneath it. The "minute" field will become ineffective after the hour is changed; in this scenario, only asterisks will be accepted. Restart cron and try it again without connecting to the internet (so the date doesn't have a chance to be reset by one of the time servers).
Percent signs, again
To emphasise the advice about percent signs, here's an example of what cron does with them:
# cron entry
* * * * * cat >$HOME/cron.out%foo%bar%baz
will create the ~/cron.out file containing the 3 lines
foo
bar
baz
This is particularly intrusive when using the date command. Be sure to escape the percent signs
* * * * * /path/to/command --day "$(date "+\%Y\%m\%d")"
Thanks so much for the extensive feedback, everyone. I'm certainly taking a lot more away from this than I put into it. In any event, I ran across the answer. In my perl5 lib folder I found that somehow the IO and OLE libraries were missing on production. Copying those over from development resulted in everything working fine. The fact that I was unable to determine/capture this through conventional debugging efforts, as opposed to merely comparing directory listings out of exasperation, speaks to how much more I have to learn along these lines. But I'm confident that the great feedback I received will go a long way towards getting me there. Thanks again, everyone.

How to make a folder named after something read from a text file, using the Linux terminal?

I have a text file that contains a folder name. I want to read that text file's contents via the Ubuntu terminal and make a folder with that name (which is written in the txt file). I don't know what to do. I have made a file named "in.txt" whose contents are "name", and I have tried the C code below:
#include <stdio.h>
int main(){
    FILE *fopen (const char in.txt, const char r+);
}
What should i write in the terminal?
Let the text file in.txt be:
foldername1
foldername2
Then you can create a script file script.sh:
#!/usr/bin/env bash
# $1 will be replaced with the first argument passed to the script:
mkdir -p $(cat "$1")
Run the script in the shell as: ./script.sh in.txt. The folders named in in.txt should now be available. This assumes one folder name per line in in.txt, without spaces.
Let's say you have a text file called in.txt which contains
Rambo
Johnny
Jacky
Hulk
Next you need to read the file word by word; for that you can use fscanf(). Once a word is read, you need to create a folder with that name. The command to create a folder is mkdir. To execute mkdir from your program, use either system() or execlp().
Here is the simple C program
#include <stdio.h>
#include <unistd.h>

int main(){
    FILE *fp = fopen("in.txt", "r");
    if (fp == NULL) {
        /* write some message that the file is not present */
        return 0;
    }
    char buf[100];
    while (fscanf(fp, "%99s", buf) > 0) {
        /* buf contains each word of the file */
        /* now create that folder using execlp */
        if (fork() == 0)
            execlp("/bin/mkdir", "mkdir", buf, (char *)NULL); /* creates a folder with the name read from the file */
    }
    fclose(fp);
    return 0;
}
Note that mkdir fails if directory already exists.
Compile it like gcc -Wall test.c and execute it like ./a.out; it will create the folders.
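As the note above says, mkdir fails if the directory already exists. If you'd rather skip the fork()/execlp() round trip, you can call the mkdir(2) system call directly and treat EEXIST as harmless — a minimal sketch along the same lines, using the same in.txt input as above:
#include <stdio.h>
#include <errno.h>
#include <sys/stat.h>

int main(void) {
    FILE *fp = fopen("in.txt", "r");
    if (fp == NULL) {
        perror("in.txt");
        return 1;
    }
    char name[100];
    while (fscanf(fp, "%99s", name) == 1) {
        /* 0755 = rwxr-xr-x; EEXIST just means the directory is already there */
        if (mkdir(name, 0755) == -1 && errno != EEXIST)
            perror(name);
    }
    fclose(fp);
    return 0;
}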
EDIT: The same thing, if you want to use the open() system call. There you won't have any call that reads word by word, so you need to do the string manipulation yourself.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    int inputfd = open("in.txt", O_RDONLY);
    if (inputfd == -1) {
        perror("open");
        exit(EXIT_FAILURE);
    }
    /* first find the size of the file */
    int pos = lseek(inputfd, 0, SEEK_END);
    printf("pos = %d \n", pos);
    /* again make inputfd point to the beginning */
    lseek(inputfd, 0, SEEK_SET);
    /* allocate memory equal to the size of the file, not a random 1024 bytes;
       +1 for the terminating null byte the loop below relies on */
    char *buf = malloc(pos + 1);
    read(inputfd, buf, pos); /* read reads the whole file data at a time */
    buf[pos] = '\0';
    /* you need to find the words in buf, because buf contains the whole data, not one word */
    char cmd[50]; /* buffer to store one folder name */
    for (int row = 0, index = 0; buf[row]; row++) {
        if (buf[row] != ' ' && buf[row] != '\n') {
            cmd[index] = buf[row];
            index++;
            continue;
        } else {
            cmd[index] = '\0';
            index = 0; /* the next word should start from cmd[0] */
            if (fork() == 0)
                execlp("/bin/mkdir", "mkdir", cmd, (char *)NULL);
        }
    }
    close(inputfd);
    return 0;
}
Another option if you want to perform other commands besides mkdir
while read -r line; do mkdir -p "$line" && chown www-data:www-data "$line"; done < file.txt
You can use a for loop from the Linux terminal like:
for line in $(cat in.txt); do mkdir "$line"; done
That should work fine to create multiple folders.

Detect if pid is zombie on Linux

We can detect whether a process is a zombie via the shell command line:
ps ef -o pid,stat | grep <pid> | grep Z
To get that info in our C/C++ programs we use popen(), but we would like to avoid using popen(). Is there a way to get the same result without spawning additional processes?
We are using Linux 2.6.32-279.5.2.el6.x86_64.
You need to use the proc(5) filesystem. Access to files inside it (e.g. /proc/1234/stat ...) is really fast (it does not involve any physical I/O).
You probably want the third field of /proc/1234/stat (which is readable by everyone, but you should read it sequentially, since it is unseekable). If that field is Z, then the process with pid 1234 is a zombie.
No need to fork a process (e.g. with popen or system); in C you might code
pid_t somepid;
// put the process pid you are interested in into somepid
bool iszombie = false;
// open the /proc/*/stat file
char pbuf[32];
snprintf(pbuf, sizeof(pbuf), "/proc/%d/stat", (int) somepid);
FILE* fpstat = fopen(pbuf, "r");
if (!fpstat) { perror(pbuf); exit(EXIT_FAILURE); };
{
    int rpid = 0; char rcmd[32]; char rstatc = 0;
    fscanf(fpstat, "%d %30s %c", &rpid, rcmd, &rstatc);
    iszombie = rstatc == 'Z';
}
fclose(fpstat);
Consider also procps and libproc; see this answer.
(You could also read the second line of /proc/1234/status but this is probably harder to parse in C or C++ code)
BTW, I find that the stat file in /proc/ has a weird format: if your executable happens to contain both spaces and parentheses in its name (which is disgusting, but permitted), parsing the /proc/*/stat file becomes tricky.
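One common way around that ambiguity — sketched below, assuming the usual /proc/<pid>/stat layout where comm is the only parenthesised field — is to search for the last ')' in the line and read the state character just after it:
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

// Returns true if /proc/<pid>/stat says the process is a zombie.
// Scans backwards for the last ')' so that spaces or parentheses
// inside the command name cannot confuse the parse.
bool is_zombie(int pid)
{
    char path[64], line[512];
    snprintf(path, sizeof(path), "/proc/%d/stat", pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return false;
    if (!fgets(line, sizeof(line), f)) {
        fclose(f);
        return false;
    }
    fclose(f);
    char *rparen = strrchr(line, ')');
    // The state character follows ") ", e.g. "1234 (my prog) Z ..."
    return rparen && rparen[1] == ' ' && rparen[2] == 'Z';
}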

Is there a command to write random garbage bytes into a file?

I am now doing some tests of my application against corrupted files. But I found it is hard to find test files.
So I'm wondering whether there are existing tools that can write random/garbage bytes into a file of some format.
Basically, I need this tool to:
It writes random garbage bytes into the file.
It does not need to know the format of the file; just writing random bytes is OK for me.
It is best to write at random positions of the target file.
Batch processing is also a bonus.
Thanks.
The /dev/urandom pseudo-device, along with dd, can do this for you:
dd if=/dev/urandom of=newfile bs=1M count=10
This will create a file newfile of size 10M.
The /dev/random device will often block if there is not sufficient randomness built up, urandom will not block. If you're using the randomness for crypto-grade stuff, you can steer clear of urandom. For anything else, it should be sufficient and most likely faster.
If you want to corrupt just bits of your file (not the whole file), you can simply use the C-style random functions. Just use rand() to figure out an offset and a length n, then call it n times to grab random bytes to overwrite your file with.
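Here is a minimal C sketch of that approach; the fallback file name testfile and the 16-byte cap are arbitrary choices, and it assumes a non-empty file:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// Overwrite a random span of the file with random bytes.
int main(int argc, char **argv)
{
    const char *name = argc > 1 ? argv[1] : "testfile"; // placeholder name
    srand((unsigned) time(NULL));

    FILE *f = fopen(name, "r+b");
    if (!f) { perror(name); return 1; }

    fseek(f, 0, SEEK_END);
    long size = ftell(f);

    long n = 1 + rand() % 16;                        // corrupt up to 16 bytes
    long off = rand() % (size > n ? size - n : 1);   // random start offset
    fseek(f, off, SEEK_SET);
    for (long i = 0; i < n; i++)
        fputc(rand() % 256, f);                      // one random byte at a time

    fclose(f);
    printf("corrupted %ld bytes at offset %ld of %s\n", n, off, name);
    return 0;
}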
The following Perl script shows how this can be done (without having to worry about compiling C code):
use strict;
use warnings;

sub corrupt ($$$$) {
    # Get parameters, names should be self-explanatory.
    my $filespec = shift;
    my $mincount = shift;
    my $maxcount = shift;
    my $charset  = shift;

    # Work out position and size of corruption.
    my @fstat = stat ($filespec);
    my $size  = $fstat[7];
    my $count = $mincount + int (rand ($maxcount + 1 - $mincount));
    my $pos   = 0;
    if ($count >= $size) {
        $count = $size;
    } else {
        $pos = int (rand ($size - $count));
    }

    # Output for debugging purposes.
    my $last = $pos + $count - 1;
    print "'$filespec', $size bytes, corrupting $pos through $last\n";

    # Open file, seek to position, corrupt and close.
    open (my $fh, "+<$filespec") || die "Can't open $filespec: $!";
    seek ($fh, $pos, 0);
    while ($count-- > 0) {
        my $newval = substr ($charset, int (rand (length ($charset))), 1);
        print $fh $newval;
    }
    close ($fh);
}

# Test harness.
system ("echo ==========");           #DEBUG
system ("cp base-testfile testfile"); #DEBUG
system ("cat testfile");              #DEBUG
system ("echo ==========");           #DEBUG

corrupt ("testfile", 8, 16, "ABCDEFGHIJKLMNOPQRSTUVWXYZ ");

system ("echo ==========");           #DEBUG
system ("cat testfile");              #DEBUG
system ("echo ==========");           #DEBUG
It consists of the corrupt function that you call with a file name, minimum and maximum corruption size and a character set to draw the corruption from. The bit at the bottom is just unit testing code. Below is some sample output where you can see that a section of the file has been corrupted:
==========
this is a file with nothing in it except for lowercase
letters (and spaces and punctuation and newlines).
that will make it easy to detect corruptions from the
test program since the character range there is from
uppercase a through z.
i have to make it big enough so that the random stuff
will work nicely, which is why i am waffling on a bit.
==========
'testfile', 344 bytes, corrupting 122 through 135
==========
this is a file with nothing in it except for lowercase
letters (and spaces and punctuation and newlines).
that will make iFHCGZF VJ GZDYct corruptions from the
test program since the character range there is from
uppercase a through z.
i have to make it big enough so that the random stuff
will work nicely, which is why i am waffling on a bit.
==========
It's tested at a basic level but you may find there are edge error cases which need to be taken care of. Do with it what you will.
Just for completeness, here's another way to do it:
shred -s 10 - > my-file
Writes 10 random bytes to stdout and redirects it to a file. shred is usually used for destroying (safely overwriting) data, but it can be used to create new random files too.
So if you already have a file that you want to fill with random data, use this:
shred my-existing-file
You could read from /dev/random:
# generate a 50MB file named `random.stuff` filled with random stuff ...
dd if=/dev/random of=random.stuff bs=1000000 count=50
You can specify the size also in a human readable way:
# generate just 2MB ...
dd if=/dev/random of=random.stuff bs=1M count=2
You can also use cat and head. Both are usually installed.
# write 1024 random bytes to my-file-to-override
cat /dev/urandom | head -c 1024 > my-file-to-override
