Effect of the cp and mv commands on inodes - linux

I interpret an inode as a pointer to the actual place where the file is stored.
But I have a problem understanding the following:
If I use cp file1 file2 in a place where file2 already exists, the inode of file2 doesn't change. And if there was originally a hard link to file2, both names now point to the new contents just copied there.
The only reason I can think of is that Linux interprets this as modifying
the file instead of deleting it and creating a new one. Why is it designed this way?
But when I use mv file1 file2, the inode of file2 changes to the inode of file1.

You are correct in stating that cp will modify the file instead of deleting and recreating.
Here is a view of the underlying system calls as seen by strace (part of the output of strace cp file1 file2):
stat("file2", {st_mode=S_IFREG|0664, st_size=6, ...}) = 0
stat("file1", {st_mode=S_IFREG|0664, st_size=3, ...}) = 0
stat("file2", {st_mode=S_IFREG|0664, st_size=6, ...}) = 0
open("file1", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0664, st_size=3, ...}) = 0
open("file2", O_WRONLY|O_TRUNC) = 4
fstat(4, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(3, "hi\n", 65536) = 3
write(4, "hi\n", 3) = 3
read(3, "", 65536) = 0
close(4) = 0
close(3) = 0
As you can see, it detects that file2 is present (stat returns 0), but then opens it for writing (O_WRONLY|O_TRUNC) without first doing an unlink.
See for example POSIX.1-2017, which specifies that the destination file shall only be unlink-ed where it could not be opened for writing and -f is used:
A file descriptor for dest_file shall be obtained by performing
actions equivalent to the open() function defined in the System
Interfaces volume of POSIX.1-2017 called using dest_file as the path
argument, and the bitwise-inclusive OR of O_WRONLY and O_TRUNC as the
oflag argument.
If the attempt to obtain a file descriptor fails and the -f option is
in effect, cp shall attempt to remove the file by performing actions
equivalent to the unlink() function defined in the System Interfaces
volume of POSIX.1-2017 called using dest_file as the path argument. If
this attempt succeeds, cp shall continue with step 3b.
This implies that if the destination file exists, the copy will succeed (without resorting to the -f behaviour) as long as the cp process has write permission on it (it need not run as the user that owns the file), even if it does not have write permission on the containing directory. By contrast, unlinking and recreating would require write permission on the directory. I would speculate that this is the reason the standard is written the way it is.
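The truncate-in-place behaviour is not specific to cp. As a minimal sketch, assuming a POSIX shell and a scratch directory from mktemp (the directory, the variable names and the inode_of helper are illustrative, not part of the original answer): a plain shell redirection also opens an existing target for writing with truncation, so it keeps the inode and changes what every hard link sees.

```shell
#!/bin/sh
dir=$(mktemp -d)                 # scratch directory, illustrative only
echo hi > "$dir/file1"
echo world > "$dir/file2"
ln "$dir/file2" "$dir/file3"     # file3 is a second name for file2's inode

# Hypothetical helper: print the inode number of a path using ls -i.
inode_of() { ls -i "$1" | awk '{ print $1 }'; }

before=$(inode_of "$dir/file2")
cat "$dir/file1" > "$dir/file2"  # '>' truncates and rewrites the existing file
after=$(inode_of "$dir/file2")
via_link=$(cat "$dir/file3")     # read the data through the other hard link

echo "inode before=$before after=$after, file3 now contains: $via_link"
rm -rf "$dir"
```

The inode number is unchanged and file3 shows the new contents, which matches what the default cp does to an existing destination.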
The --remove-destination option on GNU cp will make it do instead what you thought ought to be the default.
Here is the relevant part of the output of strace cp --remove-destination file1 file2. Note the unlink this time.
stat("file2", {st_mode=S_IFREG|0664, st_size=6, ...}) = 0
stat("file1", {st_mode=S_IFREG|0664, st_size=3, ...}) = 0
lstat("file2", {st_mode=S_IFREG|0664, st_size=6, ...}) = 0
unlink("file2") = 0
open("file1", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0664, st_size=3, ...}) = 0
open("file2", O_WRONLY|O_CREAT|O_EXCL, 0664) = 4
fstat(4, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(3, "hi\n", 65536) = 3
write(4, "hi\n", 3) = 3
read(3, "", 65536) = 0
close(4) = 0
close(3) = 0
When you use mv and the source and destination paths are on the same filesystem, it will do a rename, which has the effect of unlinking any existing file at the target path. Here is the relevant part of the output of strace mv file1 file2.
access("file2", W_OK) = 0
rename("file1", "file2") = 0
In either case where a destination path is unlinked (whether explicitly by unlink() as called from cp --remove-destination, or as part of the effect of rename() as called from mv), the link count of the inode it pointed to is decremented. The inode remains on the filesystem as long as its link count is still greater than 0 or any process has it open. Any other (hard) links to this inode (i.e. other directory entries for it) remain.
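The "remains while a process has it open" part can be observed directly. This is a small sketch assuming a POSIX shell; the scratch directory, the variable names and the choice of descriptor 3 are illustrative.

```shell
#!/bin/sh
dir=$(mktemp -d)            # scratch directory, illustrative only
echo world > "$dir/file2"

exec 3< "$dir/file2"        # keep file2's inode open on descriptor 3
rm "$dir/file2"             # the only name is gone, link count drops to 0

still_there=$(cat <&3)      # the data is still readable through the open descriptor
exec 3<&-                   # closing the last descriptor lets the filesystem free the inode

echo "read after unlink: $still_there"
rm -rf "$dir"
```

Only when the last name and the last open descriptor are both gone can the filesystem reclaim the inode and its data blocks.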
Investigating using ls -i
ls -i will show the inode numbers (as the first column when combined with -l), which helps demonstrate what is happening.
Example with default cp action
$ rm file1 file2 file3
$ echo hi > file1
$ echo world > file2
$ ln file2 file3
$ ls -li file*
49 -rw-rw-r-- 1 myuser mygroup 3 Jun 13 10:43 file1
50 -rw-rw-r-- 2 myuser mygroup 6 Jun 13 10:43 file2
50 -rw-rw-r-- 2 myuser mygroup 6 Jun 13 10:43 file3
$ cp file1 file2
$ ls -li file*
49 -rw-rw-r-- 1 myuser mygroup 3 Jun 13 10:43 file1
50 -rw-rw-r-- 2 myuser mygroup 3 Jun 13 10:43 file2 <=== existing inode
50 -rw-rw-r-- 2 myuser mygroup 3 Jun 13 10:43 file3 <=== existing inode
(Note existing inode 50 now has size 3).
Example with --remove-destination
$ rm file1 file2 file3
$ echo hi > file1
$ echo world > file2
$ ln file2 file3
$ ls -li file*
49 -rw-rw-r-- 1 myuser mygroup 3 Jun 13 10:46 file1
50 -rw-rw-r-- 2 myuser mygroup 6 Jun 13 10:46 file2
50 -rw-rw-r-- 2 myuser mygroup 6 Jun 13 10:46 file3
$ cp --remove-destination file1 file2
$ ls -li file*
49 -rw-rw-r-- 1 myuser mygroup 3 Jun 13 10:46 file1
55 -rw-rw-r-- 1 myuser mygroup 3 Jun 13 10:47 file2 <=== new inode
50 -rw-rw-r-- 1 myuser mygroup 6 Jun 13 10:46 file3 <=== existing inode
(Note new inode 55 has size 3. Unmodified inode 50 still has size 6.)
Example with mv
$ rm file1 file2 file3
$ echo hi > file1
$ echo world > file2
$ ln file2 file3
$ ls -li file*
49 -rw-rw-r-- 1 myuser mygroup 3 Jun 13 11:05 file1
50 -rw-rw-r-- 2 myuser mygroup 6 Jun 13 11:05 file2
50 -rw-rw-r-- 2 myuser mygroup 6 Jun 13 11:05 file3
$ mv file1 file2
$ ls -li file*
49 -rw-rw-r-- 1 myuser mygroup 3 Jun 13 11:05 file2 <== existing inode
50 -rw-rw-r-- 1 myuser mygroup 6 Jun 13 11:05 file3 <== existing inode

@alaniwi's answer covers what is happening, but there's an implicit why here as well.
The reason cp works the way it does is to provide a way of replacing a file that has multiple names, so that all of those names refer to the new contents. When the destination of cp is a file that already exists, possibly with multiple names via hard or soft links, cp will make all of those names refer to the new contents. There will be no 'orphan' references to the old file left over.
Given this command, it is pretty easy to get the 'just change the file for one name' behavior: unlink the file first. Given just that as a primitive, it would be very hard to implement the 'change all references to point to the new contents' behavior.
Of course, doing rm followed by cp has race-condition issues (it is two commands), which is why the install command was added in BSD Unix. It basically does rm + cp, along with some checks to make it atomic in the rare case that two people try to install to the same path simultaneously. It also avoids the more serious problem of someone reading from the file you're installing to (a problem with plain cp). The GNU version later added options to back up the old version and do various other useful bookkeeping.
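A minimal sketch of the 'change the file for one name only' behavior, assuming an install(1) implementation that removes an existing destination before copying, as described above (implementations may differ in detail). The file names follow the earlier examples; the scratch directory and variable names are illustrative.

```shell
#!/bin/sh
dir=$(mktemp -d)                 # scratch directory, illustrative only
echo hi > "$dir/file1"
echo world > "$dir/file2"
ln "$dir/file2" "$dir/file3"     # file3 shares file2's inode

install -m 644 "$dir/file1" "$dir/file2"   # replace the name file2 only

new_name=$(cat "$dir/file2")     # contents now reachable as file2
old_link=$(cat "$dir/file3")     # contents still reachable through the old hard link

echo "file2: $new_name, file3: $old_link"
rm -rf "$dir"
```

Under that assumption file3 keeps the old contents, which is the opposite of the default cp behavior shown earlier.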

An inode is a collection of metadata for a file, i.e. information about a file, in a Unix or Unix-like filesystem. It includes permission data, last access and modification times, file size, etc.
Notably, a file's name or path is not part of the inode. A filename is just a human-readable identifier for an inode. A file can have one or more names, and the inode records how many as its number of "links" (hard links). The inode number, which I believe you're interpreting as the file's physical location on disk, is simply a unique identifier for the inode within its filesystem. An inode does contain the location of the file's data on disk, but that is not the inode number.
So, knowing this, the difference you're seeing comes from how cp and mv work. When you cp a file to a name that does not exist yet, you create a new inode with a new name and copy the contents of the old file to a new location on disk; when the destination already exists, cp by default writes into the existing inode instead, which is why its inode number stays the same. When you mv a file (within one filesystem), all you're doing is changing one of its names. If the new name is already the name of another file, the name is disassociated from that file (whose link count is reduced by 1) and associated with the file being moved.

Related

What happens internally when `ls *.c` is executed?

I've recently become very interested in Linux internals and am currently trying to understand how things work.
I know that when I type ls:
opendir() is called;
readdir() is called for each entry in the directory;
stat() may be called to get additional information on files, if required.
Please correct me if I'm missing something or if this is wrong.
The part that is a mystery to me is filename expansion (globbing).
I've compared the output of strace ls
open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=270336, ...}) = 0
getdents(3, /* 14 entries */, 32768) = 440
getdents(3, /* 0 entries */, 32768) = 0
close(3) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
write(1, "2q.c ds.c fglob fnoglob\n", 27
2q.c ds.c fglob fnoglob
and strace ls *.c,
stat("2q.c", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
lstat("2q.c", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
stat("ds.c", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
lstat("ds.c", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
write(1, "2q.c ds.c\n", 11) = 11
From my limited knowledge I can tell that in the first case it indeed behaves as I expected: open and fstat, followed by getdents.
But the latter case, with shell globbing, isn't clear to me, because there is already a list of files that match the pattern. Where does this list come from?
Thanks!
Shell globbing patterns on the command line are expanded by the shell before the utility is invoked.
You can see this by enabling tracing in the shell with set -x:
$ set -x
$ ls -l f*
+ ls -l file1 file2 file3
-rw-r--r-- 1 kk wheel 0 May 11 16:49 file1
-rw-r--r-- 1 kk wheel 0 May 11 16:49 file2
-rw-r--r-- 1 kk wheel 0 May 11 16:49 file3
As you can see, the shell tells you what command it invokes (at the + prompt), and at that point it has already expanded the pattern on the command line.
The ls command does not do filename globbing. In fact, if you single quote the globbing pattern to protect it from the shell, ls is bound to be confused:
$ ls -l 'f*'
+ ls -l f*
ls: f*: No such file or directory
(unless there's actually something in the current directory called f* of course).
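The expansion can also be observed without involving ls at all. A small sketch, assuming a POSIX shell; the scratch directory and variable names are illustrative, and the file names are the ones from the question.

```shell
#!/bin/sh
dir=$(mktemp -d)                  # scratch directory, illustrative only
cd "$dir" || exit 1
touch 2q.c ds.c fglob fnoglob     # same file names as in the question

set -- *.c                        # unquoted: the shell replaces the pattern with matching names
unquoted_count=$#
unquoted_args=$*

set -- '*.c'                      # quoted: the pattern is passed through literally
quoted_args=$*

echo "unquoted: $unquoted_count arguments: $unquoted_args"
echo "quoted: $quoted_args"
cd / && rm -rf "$dir"
```

The positional parameters here play the role of the argument list that ls would receive: two file names in the unquoted case, one literal pattern in the quoted case.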

How to change umask so all files start with different mode bits than new directories

Specifically, I need to give files rw----r--
and dirs rwx--xr-x
Use umask 062.
This works because umask only unsets bits, and files aren't normally created with executable bits set in the first place (programs typically request mode 0666 for new files and 0777 for new directories):
$ umask 062
$ touch myfile; mkdir mydir
$ ls -ld myfile mydir
drwx--xr-x 1 user user 0 Dec 5 15:21 mydir
-rw----r-- 1 user user 0 Dec 5 15:21 myfile
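The arithmetic behind this can be checked in the shell itself. A sketch assuming the usual requested modes of 0666 for files and 0777 for directories; the variable names are illustrative.

```shell
#!/bin/sh
mask=0062    # same value as umask 062; the leading 0 marks it as octal for shell arithmetic

# The resulting mode is the requested mode with the mask bits cleared.
file_mode=$(printf '%03o' $(( 0666 & ~$mask )))   # files are typically requested as 0666
dir_mode=$(printf '%03o' $(( 0777 & ~$mask )))    # directories are typically requested as 0777

echo "files: $file_mode"
echo "dirs:  $dir_mode"
```

This gives 604 (rw----r--) for files and 715 (rwx--xr-x) for directories, matching the listing above.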

chown does not set SGID

I am trying to create a directory with permissions 02770, so that the resultant permissions would be drwxrws---
When I run the below commands I get the expected behavior
rsam.svtest2.serendipity> (/home/svtest2)
$ mkdir abc
rsam.svtest2.serendipity> (/home/svtest2)
$ ls -lrt
drwxrwxr-x 2 svtest2 users 6 Apr 18 10:57 abc
rsam.svtest2.serendipity> (/home/svtest2)
$ chmod 02770 abc
rsam.svtest2.serendipity> (/home/svtest2)
$ ls -lrt
drwxrws--- 2 svtest2 users 6 Apr 18 10:57 abc
UPDATE#1
Following from above, after running mkdir and chmod on a directory, when I run chown the SGID bit gets cleared off.
rsam.svtest2.serendipity> (/home/svtest2)
$ chown svtest2:users abc
rsam.svtest2.serendipity> (/home/svtest2)
$ ls -lrt
drwxrwx--- 2 svtest2 users 6 Apr 18 10:57 abc
From the chown documentation,
Only a privileged process (Linux: one with the CAP_CHOWN capability)
may change the owner of a file. The owner of a file may change the
group of the file to any group of which that owner is a member. A
privileged process (Linux: with CAP_CHOWN) may change the group
arbitrarily.
The problem is that my user svtest does not have the CAP_CHOWN capability.
Now the question boils down to: how do I get the user to have the CAP_CHOWN capability?
It looks like there are some instructions here - SO - setting CAP_CHOWN
but I have yet to try them out.
However, when I run the C++ code below (part of a Tuxedo server)
// Check if the directory exists and, if not, create the directory
// with the given permissions.
struct stat st;
int lreturn_code = stat(l_string, &st);
if (lreturn_code != 0 &&
    (mkdir(l_string, lpermission) != 0 ||
     chmod(l_string, lpermission) != 0)) {
    ....
    ....
}
....
....
// Convert group name to group id into lgroup
if (chown(l_string, -1, lgroup) != 0) {
    // System error.
}
The directory is created as below:
$ ls -l|grep DirLevel1
drwxrwx--- 2 svtest2 users 6 Apr 18 11:14 DirLevel1
Notice that the SGID bit is not set, unlike when the commands were run directly as shown above.
Excerpt from strace for the operation:
5864 stat("/home/svtest2/data/server/log/DirLevel1/", 0x7ffd235f29f0) = -1 ENOENT (No such file or directory)
5864 mkdir("/home/svtest2/data/server/log/DirLevel1/", 02770) = 0
5864 chmod("/home/svtest2/data/server/log/DirLevel1/", 02770) = 0
5864 socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 15
5864 connect(15, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
5864 close(15) = 0
5864 socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 15
5864 connect(15, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
5864 close(15) = 0
5864 open("/etc/group", O_RDONLY|O_CLOEXEC) = 15
5864 fstat(15, {st_mode=S_IFREG|0644, st_size=652, ...}) = 0
5864 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f414d7c4000
5864 read(15, "root:x:0:\nbin:x:1:\ndaemon:x:2:\ns"..., 4096) = 652
5864 close(15) = 0
5864 munmap(0x7f414d7c4000, 4096) = 0
5864 chown("/home/svtest2/data/server/log/DirLevel1/", 4294967295, 100) = 0
5864 write(7, "\0\0\2~\6\0\0\0\0\0\21i\216\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\1\0\0"..., 638) = 638
5864 read(7, "\0\0\0\300\6\0\0\0\0\0\10\0\0\0\0\250\0\0\0\0\0\0\0\0\0(\0\0\0\0\0\0"..., 8208) = 192
5864 write(7, "\0\0\1}\6\0\0\0\0\0\3h\221\1\0\0\0\0\0\0\0\376\377\377\377\377\377\377\377\250\0\0"..., 381) = 381
5864 read(7, "\0\0\0\26\6\0\0\0\0\0\10\4\0\0\0\t\1\0\0\0\215\f", 8208) = 22
5864 msgsnd(43679799, {805306373, "y\0\0\0007\200\232\2\0\0\0\0\f\2\0\0\0\0\0\200\0\0\0\0\0\0\0\0\0\0\0\0"...}, 516, IPC_NOWAIT) = 0
5864 msgrcv(43614264,
From http://man.sourcentral.org/RHEL7/2+chown,
When the owner or group of an executable file are changed by an
unprivileged user the S_ISUID and S_ISGID mode bits are cleared. POSIX
does not specify whether this also should happen when root does the
chown(); the Linux behavior depends on the kernel version. In case of
a non-group-executable file (i.e., one for which the S_IXGRP bit is
not set) the S_ISGID bit indicates mandatory locking, and is not
cleared by a chown().
The above highlights a possible scenario, but I'm not sure how it applies to my case, because this is a directory rather than an executable file.
Since *nix systems consider a file to be executable based on the 'x' permission bit, I believe a searchable directory might be considered executable, too.
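Whatever the reason the bit is cleared, one ordering that avoids depending on it is to change the group first and set the mode last. This is a sketch of that ordering, not a diagnosis of the server's behaviour. It assumes a POSIX shell, uses the caller's own primary group as the target group, and the scratch directory and variable names are illustrative.

```shell
#!/bin/sh
dir=$(mktemp -d)               # scratch directory, illustrative only
target="$dir/DirLevel1"

mkdir "$target"
chgrp "$(id -g)" "$target"     # change the group first (here: the caller's own primary group)
chmod 2770 "$target"           # set the mode last, so no later call can clear the setgid bit

mode=$(ls -ld "$target" | cut -c1-10)   # first ten characters: file type plus permission bits
echo "$mode"
rm -rf "$dir"
```

In the C++ code this would correspond to calling chown() before chmod() rather than after it.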

Bash script not producing desired result

I am running a cron-ed bash script to extract cache hits and bytes served per IP address. The script (ProxyUsage.bash) has two parts:
(uniqueIP.awk) find the unique IPs and create a bash script to add up the hits and bytes
run that script to get the hits and bytes per IP
ProxyUsage.bash
#!/usr/bin/env bash
sudo gawk -f /home/maxg/scripts/uniqueIP.awk /var/log/squid3/access.log.1 > /home/maxg/scripts/pxyUsage.bash
source /home/maxg/scripts/pxyUsage.bash
uniqueIP.awk
{
    arrIPs[$3]++;
}
END {
    for (n in arrIPs) {
        m++; # count arrIPs elements
        #print "Array elements: " m;
        arrAddr[i++] = n; # fill arrAddr with IPs
        #print i " " n;
    }
    asort(arrAddr); # sort the array values
    for (i = 1; i <= m; i++) { # write one command line per IP address
        #printf("#!/usr/bin/env bash\n");
        printf("sudo gawk -f /home/maxg/scripts/proxyUsage.awk -v v_Var=%s /var/log/squid3/access.log.1 >> /home/maxg/scripts/pxyUsage.txt\n", arrAddr[i])
    }
}
pxyUsage.bash
sudo gawk -f /home/maxg/scripts/proxyUsage.awk -v v_Var=192.168.1.13 /var/log/squid3/access.log.1 >> /home/maxg/scripts/pxyUsage.txt
sudo gawk -f /home/maxg/scripts/proxyUsage.awk -v v_Var=192.168.1.14 /var/log/squid3/access.log.1 >> /home/maxg/scripts/pxyUsage.txt
sudo gawk -f /home/maxg/scripts/proxyUsage.awk -v v_Var=192.168.1.22 /var/log/squid3/access.log.1 >> /home/maxg/scripts/pxyUsage.txt
The ProxyUsage.bash script runs as scheduled and creates the pxyUsage.bash script.
However, the pxyUsage.txt file is not updated with the latest values when the script runs.
So far I have been running pxyUsage.bash myself every day, as I cannot figure out why the result is not written to the file.
Both bash scripts are set to execute. Actually the file permissions are below:
-rwxr-xr-x 1 maxg maxg 169 Mar 14 08:40 ProxySummary.bash
-rw-r--r-- 1 maxg maxg 910 Mar 15 17:15 proxyUsage.awk
-rwxrwxrwx 1 maxg maxg 399 Mar 17 06:10 pxyUsage.bash
-rw-rw-rw- 1 maxg maxg 2922 Mar 17 07:32 pxyUsage.txt
-rw-r--r-- 1 maxg maxg 781 Mar 16 07:35 uniqueIP.awk
Any hints appreciated. Thanks.
The sudo(8) command typically needs a terminal here (to prompt for a password, or because requiretty is set in sudoers), and no pseudo-tty is allocated under cron(8); one is allocated when you log in the usual way.
Instead of mucking about with sudo(8), just run the script as the correct user.
If you cannot do that, then in the root crontab, do something like this:
su - username -c '/path/to/mycommand arg1 arg2...'
This will work because root can use su(1) without needing a password.

Linux proc/pid/fd for stdout is 11?

I am executing a script with stdout redirected to a file, so /proc/$$/fd/1 should point to that file (since the stdout fileno is 1). However, the actual fd of the file is 11. Please explain why.
Here is the session:
$ cat hello.sh
#!/bin/sh -e
ls -l /proc/$$/fd >&2
$ ./hello.sh > /tmp/1
total 0
lrwx------ 1 nga users 64 May 28 22:05 0 -> /dev/pts/0
lrwx------ 1 nga users 64 May 28 22:05 1 -> /dev/pts/0
lr-x------ 1 nga users 64 May 28 22:05 10 -> /home/me/hello.sh
l-wx------ 1 nga users 64 May 28 22:05 11 -> /tmp/1
lrwx------ 1 nga users 64 May 28 22:05 2 -> /dev/pts/0
I have a suspicion, but this is highly dependent on how your shell behaves. The file descriptors you see are:
0: standard input
1: standard output
2: standard error
10: the running script
11: a backup copy of the script's normal standard out
Descriptors 10 and 11 are close on exec, so won't be present in the ls process. 0-2 are, however, prepared for ls before forking. I see this behaviour in dash (Debian Almquist shell), but not in bash (Bourne again shell). Bash instead does the file descriptor manipulations after forking, and incidentally uses 255 rather than 10 for the script. Doing the change after forking means it won't have to restore the descriptors in the parent, so it doesn't have the spare copy to dup2 from.
The output of strace can be helpful here.
The relevant section is
fcntl64(1, F_DUPFD, 10) = 11
close(1) = 0
fcntl64(11, F_SETFD, FD_CLOEXEC) = 0
dup2(2, 1) = 1
stat64("/home/random/bin/ls", 0xbf94d5e0) = -1 ENOENT (No such file or directory)
stat64("/usr/local/bin/ls", 0xbf94d5e0) = -1 ENOENT (No such file or directory)
stat64("/usr/bin/ls", 0xbf94d5e0) = -1 ENOENT (No such file or directory)
stat64("/bin/ls", {st_mode=S_IFREG|0755, st_size=96400, ...}) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb75a8938) = 22748
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 22748
--- SIGCHLD (Child exited) @ 0 (0) ---
dup2(11, 1) = 1
So, the shell duplicates its existing stdout onto an available file descriptor numbered 10 or higher (namely, 11), then duplicates its existing stderr onto its own stdout (due to the >&2 redirect), then restores 11 to its own stdout when the ls command is finished.
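The same save, redirect and restore sequence can be written out by hand. A sketch assuming a POSIX shell: the two temporary files stand in for /tmp/1 and the terminal, and descriptor 3 is chosen explicitly here where the shell in the trace picked 11 on its own.

```shell
#!/bin/sh
out=$(mktemp)    # stands in for /tmp/1 in the question
err=$(mktemp)    # stands in for the terminal
(
    exec 1>"$out" 2>"$err"   # start state: stdout goes to a file, stderr elsewhere
    exec 3>&1                # save a copy of stdout on a spare descriptor
    exec 1>&2                # the >&2 redirect: stdout now points at stderr
    echo "written while redirected"
    exec 1>&3 3>&-           # restore stdout from the saved copy, then close the copy
    echo "written after restore"
)
to_err=$(cat "$err")
to_out=$(cat "$out")
echo "stderr got: $to_err"
echo "stdout got: $to_out"
rm -f "$out" "$err"
```

While the redirect is in effect, listing /proc/$$/fd would show the spare descriptor pointing at the output file, which is what the question observed as fd 11.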

Resources