What is there behind a symbolic link? - linux

How are symbolic links managed internally by UNIX/Linux systems. It is known that a symbolic link may exist even without an actual target file (Dangling link). So what is that which represents a symbolic link internally.
In Windows, the answer is a reparse point.
Questions:
Is the answer an inode in UNIX/Linux?
If yes, then will the inode number be same for target and links?
If yes, can the link inode can have permissions different from that of target's inode (if one exists)?

It is not about UNIX/Linux but about filesystem implementation - but yes, Unix/Linux uses inodes at kernel level and filesystem implementations have inodes (at least virtual ones).
In the general, symbolic links are simply files (btw, directories are also files), that have:
the flag file-type in the "inode" that tells to the system this file is a "symbolic link"
file-content: path to the target - in other words: a symbolic link is simply a file which contains a filename with a flag in the inode.
Virtual filesystems can have symbolic links too, so, check FUSE or some other filesystem implementation sources. (ext2/ext3/ufs..etc)
So,
Is the answer an inode in UNIX/Linux?
depends on filesystem implementation, but yes, generally the inode contains a "file-type" (and owners, access rights, timestamps, size, pointers to data blocks). There are filesystems that don't have inodes (in a physical implementation) but have only "virtual inodes" for maintaining compatibility with the kernel.
If yes, then will the inode number be same for target and links?
No. Usually, the symlink is a file with its own inode, (with file-type, own data blocks, etc.)
If yes, can the link inode can have permissions different from that of target's
inode(if one exists)?
This is about how symlink files are handled. Usually, the kernel doesn't allow changes to the symlink permissions - and symlinks always have default permissions. You could write your own filesystem that would allow different permissions for symlinks, but you would get into trouble because common programs like chmod don't change permissions on symlinks themselves, so making such a filesystem would be pointless anyway)
To understand the difference between hard links and symlinks, you should understand directories first.
Directories are files (with differentiated by a flag in the inode) that tell the kernel, "handle this file as a map of file-name to inode_number". Hard-links are simply file names that map to the same inode. So if the directory-file contains:
file_a: 1000
file_b: 1001
file_c: 1000
the above means, in this directory, are 3 files:
file_a described by inode 1000
file_b described by inode 1001 and
file_c again described by inode 1000 (so it is a hard link with file_a, not hardlink to file_a - because it is impossible to tell which filename came first - they are identical).
This is the main difference to symlinks, where the inode of file_b (inode 1001) could have content "file_a" and a flag meaning "this is a symlink". In this case, file_b would be a symlink pointing to file_a.

You can also easily explore this on your own:
$ touch a
$ ln -s a b
$ ln a c
$ ls -li
total 0
95905 -rw-r--r-- 1 regnarg regnarg 0 Jun 19 19:01 a
96990 lrwxrwxrwx 1 regnarg regnarg 1 Jun 19 19:01 b -> a
95905 -rw-r--r-- 2 regnarg regnarg 0 Jun 19 19:01 c
The -i option to ls shows inode numbers in the first column. You can see that the symlink has a different inode number while the hardlink has the same. You can also use the stat(1) command:
$ stat a
File: 'a'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 28h/40d Inode: 95905 Links: 2
[...]
$ stat b
File: 'b' -> 'a'
Size: 1 Blocks: 0 IO Block: 4096 symbolic link
Device: 28h/40d Inode: 96990 Links: 1
[...]
If you want to do this programmatically, you can use the lstat(2) system call to find information about the symlink itself (its inode number etc.), while stat(2) shows information about the target of the symlink, if it exists. Example in Python:
>>> import os
>>> os.stat("b").st_ino
95905
>>> os.lstat("b").st_ino
96990

Related

What kind of file is the smp_affinity-file?

The IRQ affinity can be set by writing a bit mask to /proc/irq/<irqid>/smp_affinity.
I guess there is a kernel module behind smp_affinity, however, ls tells me it is a normal file:
# ls
-rw-r--r-- 1 root root 0 Feb 9 16:06 smp_affinity
So I wonder, what kind of file /proc/irq/<irqid>/smp_affinity is?
Read about procfs - https://man7.org/linux/man-pages/man5/procfs.5.html https://en.wikipedia.org/wiki/Procfs etc.
smp_affinity is a file inside /proc filesystem. File operation on that file are handled specially by the kernel. Writing or reading - instead of storing or retrieving the data using some non-volatile medium - the kernel executes special function with special semantics instead.
The file would be created somewhere in kernel/irq/proc.c.

Buildroot doesn’t run as root and doesn’t want to run as root

I have 2 questions:
I am not sure to undrestand(from the directories description in Buildroot manual):
target/ which contains almost the complete root filesystem for the target:everything needed is present except the device files in /dev/ (Buildroot doesn’t run as root and doesn’t want to run as root)
Why buildroot need to be root to create the /dev
what i know is that buildroot uses target to generate images/rootfs.tar; is it a simple compression with taror ...? could you please help me find the make target that generate images/rootfs.tar?
In case of using NFS why can't we use directly the targetfolder as rootfs what makes "untaring" images/rootfs.tar different than target
Ref: http://free-electrons.com/~thomas/buildroot/manual/html/ch03.html
I am not sure to undrestand(from the directories description in Buildroot manual):
Buildroot, a tool for generating a kernel and root filesystem, is executed on your host system as a normal user without need of superuser privileges.
Why buildroot need to be root to create the /dev
Buildroot does not use superuser privileges.
what i know is that buildroot uses target to generate images/rootfs.tar; is it a simple compression with taror ...?
The .tar is an ordinary archive without compression.
You can configure/specify compression (and/or select filesystem images) using the make menuconfig procedure.
could you please help me find the make target that generate images/rootfs.tar?
You do not specify this in the make shell command.
You can configure/specify tar and/or cpio archives with optional compression (and/or select filesystem images) using the make menuconfig procedure.
In case of using NFS why can't we use directly the target folder as rootfs
Because it is not suitable as a roofs.
File owners & groups are incorrect (this could be irrelevant for NFS usage).
File permissions may not be correct (e.g. setuid for the busybox binary).
The /dev directory does not have the minimal device nodes that the target kernel requires.
Instead of the required minimal device nodes (e.g. console), the target directory has ordinary files in dev:
buildroot-2015.05/output/target$ ls -l dev
total 4
-rw--w--w- 1 me swdev 0 Sep 15 16:34 console
lrwxrwxrwx 1 me swdev 10 Aug 14 2015 log -> ../tmp/log
drwxrwxr-x 2 me swdev 4096 May 31 2015 pts
$
The target kernel cannot use these files when it expects device nodes. Instead of I/O performed through device nodes, ordinary file transfers will be attempted with these files.
The actual dev directory should be:
crw--w--w- 1 root root 5, 1 Sep 15 16:34 console
lrwxrwxrwx 1 root root 10 Aug 14 2015 log -> ../tmp/log
drwxr-xr-x 2 root root 4096 May 31 2015 pts
what makes "untaring" images/rootfs.tar different than target
Buildroot can cleverly create entries for the device nodes and assign the proper owner and group to each filename as it creates the archive (or filesystem image).
This is simply generating binary data in the appropriate format that is inserted with actual archive entries (or to the fs image) written to a file.
Only when it is un-archived (or the filesystem image is mounted) that the "data" is properly interpreted as for device nodes.

Unable to copy the shared library's soft links with their original size in linux

I have created a shared object of 1.2 M and created 4 soft links for that SO.
Size of all the links is 20B and the size of the main so is 1.2M
20 May 23 10:56 libAbc.so -> libAbc.so.2.0.11.0
20 May 23 10:56 libAbc.so.1 -> libAbc.so.2.0.11.0
20 May 23 10:56 libAbc.so.1.0 -> libAbc.so.2.0.11.0
1.2M May 23 10:56 libAbc.so.2.0.11.0
While i tried to copy all these files to another folder using cp , the size of the links is equal to the main file.
1.2M May 24 08:07 libABC.so
1.2M May 24 08:07 libABC.so.1
1.2M May 24 08:07 libABC.so.1.0
1.2M May 24 08:07 libABC.so.2.0.11.0
I have also used
cp -s libAgent.so* src/
which is also failing with an error
cp: `src/libAgent.so': can make relative symbolic links only in current directory
Why is it failing to copy the softlinks with their original size(20B)
Copying a soft link usually copies the file the soft link refers to and does not replicate the soft link.
If you wish to use cp to copy soft links as new soft links, specify the -a option to cp.
cp -a libAbc.so* /path/to/another/folder
The info page for cp says:
`-a'
`--archive'
Preserve as much as possible of the structure and attributes of the
original files in the copy (but do not attempt to preserve internal
directory structure; i.e., `ls -U' may list the entries in a copied
directory in a different order). Equivalent to `-dpR'.
The key is to select the "archive" option of the command used to copy the link. rsync is an effective alternative as Maxim Egorushkin points out, and rsync has the capability of copying extended attributes and ACLs if the command includes the appropriate command-line arguments.
rsync -aAX libAbc.so* /path/to/another/folder
The rsync man page says:
-a, --archive archive mode; equals -rlptgoD (no -H,-A,-X)
--no-OPTION turn off an implied OPTION (e.g. --no-D)
-A, --acls preserve ACLs (implies -p)
-X, --xattrs preserve extended attributes
In the OP's use case, the -A and -X are not needed.
A caveat is that if the soft link is a relative link, copying it to a new location may make the link inoperative because it does not point to an absolute path.
For example:
$ ls -al /usr/lib/rpmpopt
lrwxrwxrwx 1 root root 19 2012-05-02 12:40 /usr/lib/rpmpopt -> rpm/rpmpopt-4.4.2.3
cp -a /usr/lib/rpmpopt ${HOME}/tmp and rsync -av /usr/lib/rpmpopt ${HOME}/tmp on my machine both create a broken link rpmpopt -> rpm/rpmpopt-4.4.2.3 because my local copy has no rpm sub-folder. One needs to consider this fact when deciding what to copy.
This link is broken because ${HOME}/tmp/rpm is not present:
$ ls -nl ${HOME}/tmp/rpmpopt
lrwxrwxrwx 1 505 700 19 2016-05-24 10:04 /home/kbulgrien/tmp/rpmpopt -> rpm/rpmpopt-4.4.2.3
rsync is less error-prone then cp:
rsync -av libAgent.so* src/

How can I tar a file that is being used by another process?

I'm archiving a directory. This directory has a file that is being written by another process. When I tar this using Linux tar/Perl Tar module, in the archive the entry for the file is there but the contents are null.
Before tarring the files are...
-rw-r--r-- 1 irraju dba 28 Feb 18 02:22 a
-rw-r--r-- 1 irraju dba 25 Feb 18 02:23 b
-rw-r--r-- 1 irraju dba 29 Feb 18 03:38 c
After untarring
-rw-r--r-- irraju/dba 28 2009-02-18 02:22:58 a
-rw-r--r-- irraju/dba 25 2009-02-18 02:23:17 b
-rw-r--r-- irraju/dba 0 2009-02-18 03:33:12 c
How can I fix this problem? I want the file to be in the archive with the contents it has at the instant it is archived. This file can be a log file and assume that we can not close the file handle before tarring.
As you tagged the question with "Linux" there's a chance you're using an LVM partition.
If indeed you're running on an LVM partition, you can use the LVM snapshot feature.
Here's a link to the relevant LVM documentation on how to perform the operation.
Here's a part of the LVM snapshot intro:
A wonderful facility provided by LVM is 'snapshots'. This allows the administrator to create a new block device which presents an exact copy of a logical volume, frozen at some point in time. Typically this would be used when some batch processing, a backup for instance, needs to be performed on the logical volume, but you don't want to halt a live system that is changing the data. When the snapshot device has been finished with the system administrator can just remove the device. This facility does require that the snapshot be made at a time when the data on the logical volume is in a consistent state - the VFS-lock patch for LVM1 makes sure that some filesystems do this automatically when a snapshot is created, and many of the filesystems in the 2.6 kernel do this automatically when a snapshot is created without patching.
Try copying the files first...
cp a a.tmp
cp b b.tmp
cp c c.tmp
...then tarball everything together...
tar *.tmp abc.tar
...and clean up:
rm *.tmp
If that doesn't work then the process holding the file handle doesn't want to share read access...
You may find that this depends on the filesystem used and the application that is accessing the file. The closest to a generic solution is to use a filesystem that supports snapshots and create a snapshot before running tar.
Your second output is made after your first, that can't be right. I'm guessing that tar is right here: when it was doing its job, the file was empty. You may be dealing with a race condition here.
As others have said, it depends on the file system & OS being used. sync first (or whatever the equivalent is on your file system), copy the files to a temp directory and then tar them up. If the file system won't allow you to copy an opened file, then you're SOL; Perl can't get around file system limitations.

Detecting a chroot jail from within

How can one detect being in a chroot jail without root privileges? Assume a standard BSD or Linux system. The best I came up with was to look at the inode value for "/" and to consider whether it is reasonably low, but I would like a more accurate method for detection.
[edit 20080916 142430 EST] Simply looking around the filesystem isn't sufficient, as it's not difficult to duplicate things like /boot and /dev to fool the jailed user.
[edit 20080916 142950 EST] For Linux systems, checking for unexpected values within /proc is reasonable, but what about systems that don't support /proc in the first place?
The inode for / will always be 2 if it's the root directory of an ext2/ext3/ext4 filesystem, but you may be chrooted inside a complete filesystem. If it's just chroot (and not some other virtualization), you could run mount and compare the mounted filesystems against what you see. Verify that every mount point has inode 2.
On Linux with root permissions, test if the root directory of the init process is your root directory. Although /proc/1/root is always a symbolic link to /, following it leads to the “master” root directory (assuming the init process is not chrooted, but that's hardly ever done). If /proc isn't mounted, you can bet you're in a chroot.
[ "$(stat -c %d:%i /)" != "$(stat -c %d:%i /proc/1/root/.)" ]
# With ash/bash/ksh/zsh
! [ -x /proc/1/root/. ] || [ /proc/1/root/. -ef / ]
This is more precise than looking at /proc/1/exe because that could be different outside a chroot if init has been upgraded since the last boot or if the chroot is on the main root filesystem and init is hard linked in it.
If you do not have root permissions, you can look at /proc/1/mountinfo and /proc/$$/mountinfo (briefly documented in filesystems/proc.txt in the Linux kernel documentation). This file is world-readable and contains a lot of information about each mount point in the process's view of the filesystem. The paths in that file are restricted by the chroot affecting the reader process, if any. If the process reading /proc/1/mountinfo is chrooted into a filesystem that's different from the global root (assuming pid 1's root is the global root), then no entry for / appears in /proc/1/mountinfo. If the process reading /proc/1/mountinfo is chrooted to a directory on the global root filesystem, then an entry for / appears in /proc/1/mountinfo, but with a different mount id. Incidentally, the root field ($4) indicates where the chroot is in its master filesystem. Again, this is specific to Linux.
[ "$(awk '$5=="/" {print $1}' </proc/1/mountinfo)" != "$(awk '$5=="/" {print $1}' </proc/$$/mountinfo)" ]
If you are not in a chroot and the root filesystem is ext2/ext3/ext4, the inode for / will always be 2. You may check that using
stat -c %i /
or
ls -id /
Interresting, but let's try to find path of chroot directory. Ask to stat on which device / is located:
stat -c %04D /
First byte is major of device and lest byte is minor. For example, 0802, means major 8, minor 1. If you check in /dev, you will see this device is /dev/sda2. If you are root you can directly create correspondong device in your chroot:
mknode /tmp/root_dev b 8 1
Now, let's find inode associated to our chroot. debugfs allows list contents of files using inode numbers. For exemple, ls -id / returned 923960:
sudo debugfs /tmp/root_dev -R 'ls <923960>'
923960 (12) . 915821 (32) .. 5636100 (12) var
5636319 (12) lib 5636322 (12) usr 5636345 (12) tmp
5636346 (12) sys 5636347 (12) sbin 5636348 (12) run
5636349 (12) root 5636350 (12) proc 5636351 (12) mnt
5636352 (12) home 5636353 (12) dev 5636354 (12) boot
5636355 (12) bin 5636356 (12) etc 5638152 (16) selinux
5769366 (12) srv 5769367 (12) opt 5769375 (3832) media
Interesting information is inode of .. entry: 915821. I can ask its content:
sudo debugfs /tmp/root_dev -R 'ls <915821>'
915821 (12) . 2 (12) .. 923960 (20) debian-jail
923961 (4052) other-jail
Directory called debian-jail has inode 923960. So last component of my chroot dir is debian-jail. Let's see parent directory (inode 2) now:
sudo debugfs /tmp/root_dev -R 'ls <2>'
2 (12) . 2 (12) .. 11 (20) lost+found 1046529 (12) home
130817 (12) etc 784897 (16) media 3603 (20) initrd.img
261633 (12) var 654081 (12) usr 392449 (12) sys 392450 (12) lib
784898 (12) root 915715 (12) sbin 1046530 (12) tmp
1046531 (12) bin 784899 (12) dev 392451 (12) mnt
915716 (12) run 12 (12) proc 1046532 (12) boot 13 (16) lib64
784945 (12) srv 915821 (12) opt 3604 (3796) vmlinuz
Directory called opt has inode 915821 and inode 2 is root of filesystem. So my chroot directory is /opt/debian-jail. Sure, /dev/sda1 may be mounted on another filesystem. You need to check that (use lsof or directly picking information /proc).
Preventing stuff like that is the whole point. If it's your code that's supposed to run in the chroot, have it set a flag on startup. If you're hacking, hack: check for several common things in known locations, count the files in /etc, something in /dev.
On BSD systems (check with uname -a), proc should always be present. Check if the dev/inode pair of /proc/1/exe (use stat on that path, it won't follow the symlink by text but by the underlying hook) matches /sbin/init.
Checking the root for inode #2 is also a good one.
On most other systems, a root user can find out much faster by attempting the fchdir root-breaking trick. If it goes anywhere you are in a chroot jail.
I guess it depends why you might be in a chroot, and whether any effort has gone into disguising it.
I'd check /proc, these files are automatically generated system information files. The kernel will populate these in the root filesystem, but it's possible that they don't exist in the chroot filesystem.
If the root filesystem's /proc has been bound to /proc in the chroot, then it is likely that there are some discrepancies between that information and the chroot environment. Check /proc/mounts for example.
Similrarly, check /sys.
I wanted the same information for a jail running on FreeBSD (as Ansible doesn't seem to detect this scenario).
On the FreeNAS distribution of FreeBSD 11, /proc is not mounted on the host, but it is within the jail. Whether this is also true on regular FreeBSD I don't know for sure, but procfs: Gone But Not Forgotten seems to suggest it is. Either way, you probably wouldn't want to try mounting it just to detect jail status and therefore I'm not certain it can be used as a reliable predictor of being within a jail.
I also ruled out using stat on / as certainly on FreeNAS all jails are given their own file system (i.e. a ZFS dataset) and therefore the / node on the host and in the jail both have inode 4. I expect this is common on FreeBSD 11 in general.
So the approach I settled on was using procstat on pid 0.
[root#host ~]# procstat 0
PID PPID PGID SID TSID THR LOGIN WCHAN EMUL COMM
0 0 0 0 0 1234 - swapin - kernel
[root#host ~]# echo $?
0
[root#host ~]# jexec guest tcsh
root#guest:/ # procstat 0
procstat: sysctl(kern.proc): No such process
procstat: procstat_getprocs()
root#guest:/ # echo $?
1
I am making an assumption here that pid 0 will always be the kernel on the host, and there won't be a pid 0 inside the jail.
If you entered the chroot with schroot, then you can check the value of $debian_chroot.

Resources