Get the path that a link points to? - linux

Is it possible to get the absolute path that a link is pointing to?
Is there any simple system command?
I need this for all of the following OSes:
HP-UX 11i, 1123u, 1123i
AIX 5.2 and 5.3
Suse Linux 10
Solaris 10

You didn't specify a language, so I assume you want a command that can be run in whatever shell you are using. The ls command has the -l (that is an ell) option, which prints out a lot of information about the file. For a symbolic link, the last field is the link's target, so you should be able to say
ls -l file | awk '{print $NF}'
on any SUS2-compliant machine (which should be all of the commercial UNIXes). This will have a problem if the file or any of the directories leading up to it contain spaces, though.

If you are looking for a system call, you want readlink(2). This is standardized, and so should be available on all POSIX compliant systems.
Here's an example of its usage, taken from the link given earlier:
#include <unistd.h>

char buf[1024];
ssize_t len;

if ((len = readlink("/modules/pass1", buf, sizeof(buf) - 1)) != -1)
    buf[len] = '\0';
If you're looking for a command line utility, it doesn't look like there is one standardized, but GNU (Linux) and BSD both have readlink(1).
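If Python happens to be available on those systems, its os module wraps readlink(2) portably; a minimal sketch, reusing the /modules/pass1 example path from above:
import os

link = "/modules/pass1"  # the example symlink from the readlink(2) snippet

# os.readlink returns the raw target, which may be a relative path
target = os.readlink(link)

# resolve it against the link's own directory to make it absolute
print(os.path.normpath(os.path.join(os.path.dirname(link), target)))

# os.path.realpath resolves a whole chain of links in one call
print(os.path.realpath(link))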


How to execute an svn command along with grep on Windows?

I'm trying to execute an svn command on a Windows machine and capture its output.
Code:
import subprocess
cmd = "svn log -l1 https://repo/path/trunk | grep ^r | awk '{print \$3}'"
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
Running this, I get the error:
'grep' is not recognized as an internal or external command,
operable program or batch file.
I do understand that grep is not a Windows utility.
Is this only possible on Linux?
Can we execute the same on Windows?
Is my code right?
On Windows your command will look something like the following:
svn log -l1 https://repo/path/trunk | find "string_to_find"
You need to use the find utility on Windows to get the same effect as grep. For example:
svn --version | find "ra"
* ra_svn : Module for accessing a repository using the svn network protocol.
* ra_local : Module for accessing a repository on local disk.
* ra_serf : Module for accessing a repository via WebDAV protocol using serf.
Use svn log --search FOO instead of grep-ing the command's output.
grep and awk are certainly available for Windows as well, but there is really no need to install them -- the code is easy to replace with native Python.
import subprocess

p = subprocess.run(["svn", "log", "-l1", "https://repo/path/trunk"],
                   capture_output=True, text=True)
for line in p.stdout.splitlines():
    # grep ^r
    if line.startswith('r'):
        # awk '{ print $3 }'
        print(line.split()[2])
Because we don't need a pipeline, and just run a single static command, we can avoid shell=True.
Because we don't want to do the necessary plumbing (which you forgot anyway) for Popen(), we prefer subprocess.run(). With capture_output=True we conveniently get its output in the resulting object's stdout attribute; because we expect text output, we pass text=True (in older Python versions you might need to switch to the old, slightly misleading synonym universal_newlines=True).
I guess the intent is to search for the committer in each revision's output, but this will incorrectly grab the third token on any line which starts with an r (so if you have a commit message like "refactored to use Python native code", the code will extract "use" from that). A better approach altogether is to request machine-readable output from svn and parse that (but it's unfortunately rather clunky XML, so there's another not entirely trivial rabbit hole for you; see the sketch at the end of this answer). Perhaps as middle ground, implement a more specific pattern for finding those lines -- maybe look for a specific number of fields, and static strings where you know where to expect them.
if line.startswith('r'):
    fields = line.split()
    # the revision header shown below splits into 14 whitespace-separated fields
    if len(fields) == 14 and fields[1] == '|' and fields[3] == '|':
        print(fields[2])
You could also craft a regular expression to look for a date stamp in the third |-separated field, and the number of lines in the log message in the fourth.
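For example, a hedged sketch of such a pattern, continuing from the subprocess.run() snippet above (the exact format is an assumption based on the sample log entry shown next):
import re

# matches revision headers like:
# r16110 | tripleee | 2020-10-09 10:41:13 +0300 (Fri, 09 Oct 2020) | 4 lines
HEADER = re.compile(
    r'^r\d+ \| (?P<author>\S+) \| '
    r'\d{4}-\d{2}-\d{2} [\d:]+ [+-]\d{4} \([^)]+\) \| \d+ lines?$')

for line in p.stdout.splitlines():
    m = HEADER.match(line)
    if m:
        print(m.group('author'))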
For the record, a complete log entry from Subversion looks like
------------------------------------------------------------------------
r16110 | tripleee | 2020-10-09 10:41:13 +0300 (Fri, 09 Oct 2020) | 4 lines
refactored to use native Python instead of grep + awk
(which is a useless use of grep anyway; see http://www.iki.fi/era/unix/award.html#grep)
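For reference, here is a minimal sketch of the XML route mentioned earlier, assuming your svn supports --xml (the repository URL is the hypothetical one from the question):
import subprocess
import xml.etree.ElementTree as ET

p = subprocess.run(["svn", "log", "--xml", "-l1", "https://repo/path/trunk"],
                   capture_output=True, text=True)

# svn log --xml emits one <logentry> element per revision,
# with <author>, <date> and <msg> children
root = ET.fromstring(p.stdout)
for entry in root.iter("logentry"):
    print(entry.findtext("author"))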

on-the-fly output redirection: seeing the redirected output in the file while the program is still running

If I use a command like this one:
./program >> a.txt &
and the program is a long-running one, then I can only see the output once the program has ended. That means I have no way of knowing whether the computation is going well until it actually stops computing. I want to be able to read the redirected output in the file while the program is running.
What I want is similar to opening a file, appending to it, then closing it again after every write. If the file is only closed at the end of the program, then no data can be read from it until the program ends. The only redirection I know of behaves as if the file were closed at the end of the program.
You can test it with this little python script. The language doesn't matter. Any program that writes to standard output has the same problem.
l = range(0, 100000)
for i in l:
    if i % 1000 == 0:
        print i
    for j in l:
        s = i + j
One can run this with:
./python program.py >> a.txt &
Then cat a.txt: you will only get results once the script is done computing.
From the stdout manual page:
The stream stderr is unbuffered. The stream stdout is line-buffered when it points to a terminal. Partial lines will not appear until fflush(3) or exit(3) is called, or a newline is printed.
Bottom line: Unless the output is a terminal, your program will have its standard output in fully buffered mode by default. This essentially means that it will output data in large-ish blocks, rather than line-by-line, let alone character-by-character.
Ways to work around this:
Fix your program: If you need real-time output, you need to fix your program. In C you can use fflush(stdout) after each output statement, or setvbuf() to change the buffering mode of the standard output. For Python there is sys.stdout.flush(), or even some of the other suggestions, as in the sketch below.
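For instance, a minimal sketch of the test script from above, rewritten for Python 3 with explicit flushing:
import sys

l = range(0, 100000)
for i in l:
    if i % 1000 == 0:
        print(i)
        sys.stdout.flush()  # force the line out even when stdout is a file
    for j in l:
        s = i + j
With this change, cat a.txt shows the numbers as they are produced.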
Use a utility that can record from a PTY, rather than outright stdout redirections. GNU Screen can do this for you:
screen -d -m -L python test.py
would be a start. This will log the output of your program to a file called screenlog.0 (or similar) in your current directory with a default delay of 10 seconds, and you can use screen to connect to the session where your command is running to provide input or terminate it. The delay and the name of the logfile can be changed in a configuration file or manually once you connect to the background session.
EDIT:
On most Linux systems there is a third workaround: you can use the LD_PRELOAD variable and a preloaded library to override select functions of the C library and use them to set the stdout buffering mode when those functions are called by your program. This method may work, but it has a number of disadvantages:
It won't work at all on static executables
It's fragile and rather ugly.
It won't work at all with SUID executables - the dynamic loader will refuse to read the LD_PRELOAD variable when loading such executables for security reasons.
It's fragile and rather ugly.
It requires that you find and override a library function that is called by your program after it initially sets the stdout buffering mode and preferably before any output. getenv() is a good choice for many programs, but not all. You may have to override common I/O functions such as printf() or fwrite() - if push comes to shove you may just have to override all functions that control the buffering mode and introduce a special condition for stdout.
It's fragile and rather ugly.
It's hard to ensure that there are no unwelcome side-effects. To do this right you'd have to ensure that only stdout is affected and that your overrides will not crash the rest of the program if e.g. stdout is closed.
Did I mention that it's fragile and rather ugly?
That said, the process is relatively simple. You put the replacement functions in a C file, e.g. linebufferedstdout.c:
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>
char *getenv(const char *s) {
    static char *(*getenv_real)(const char *s) = NULL;
    if (getenv_real == NULL) {
        getenv_real = dlsym(RTLD_NEXT, "getenv");
        setlinebuf(stdout);
    }
    return getenv_real(s);
}
Then you compile that file as a shared object:
gcc -O2 -o linebufferedstdout.so -fpic -shared linebufferedstdout.c -ldl -lc
Then you set the LD_PRELOAD variable to load it along with your program:
$ LD_PRELOAD=./linebufferedstdout.so python test.py | tee -a test.out
0
1000
2000
3000
4000
If you are lucky, your problem will be solved with no unfortunate side-effects.
You can set the LD_PRELOAD library in the shell, if necessary, or even specify that library system-wide (definitely NOT recommended) in /etc/ld.so.preload.
If you're trying to modify the behavior of an existing program, try stdbuf (part of GNU coreutils starting with version 7.5, apparently).
This buffers stdout up to a line:
stdbuf -oL command > output
This disables stdout buffering altogether:
stdbuf -o0 command > output
Have you considered piping to tee?
./program | tee a.txt
However, even tee won't work if "program" doesn't write anything to stdout until it is done. So, the effectiveness depends a lot on how your program behaves.
If the program writes to a file, you can read it while it is being written using tail -f a.txt.
Your problem is that most programs check to see if the output is a terminal or not. If the output is a terminal then it is buffered one line at a time (so each line is output as it is generated), but if the output is not a terminal then it is buffered in larger chunks (4096 bytes at a time is typical). This is the normal behaviour of the C library (when using printf for example) and also of the C++ library (when using cout for example), so any program written in C or C++ will do this.
Most other scripting languages (like perl, python, etc.) are written in C or C++ and so they have exactly the same buffering behaviour.
The answer above (using LD_PRELOAD) can be made to work on perl or python scripts, since the interpreters are themselves written in C.
The unbuffer command from the expect package does exactly what you are looking for.
$ sudo apt-get install expect
$ unbuffer python program.py | cat -
<watch output immediately show up here>

Compressing the core files during core generation

Is there a way to compress core files during core dump generation?
If storage space is limited on the system, is there a way of conserving it, should a core dump be needed, by compressing the dump immediately as it is generated?
Ideally the method would work on older versions of Linux such as 2.6.x.
The Linux kernel /proc/sys/kernel/core_pattern file will do what you want: http://www.mjmwired.net/kernel/Documentation/sysctl/kernel.txt#191
Set the filename to something like |/bin/gzip -1 > /var/crash/core-%t-%p-%u.gz and your core files should be saved compressed for you.
For embedded Linux systems, the following change works perfectly to generate compressed core files, in two steps.
step 1: create the script

touch /bin/gen_compress_core.sh
chmod +x /bin/gen_compress_core.sh
cat > /bin/gen_compress_core.sh
#!/bin/sh
exec /bin/gzip -f - >"/var/core/core-$1.$2.gz"

(press Ctrl+D to end the cat)
step 2: update the core pattern file

cat > /proc/sys/kernel/core_pattern
|/bin/gen_compress_core.sh %e %p

(press Ctrl+D to end the cat)
As suggested by the other answer, the Linux kernel /proc/sys/kernel/core_pattern file is a good place to start: http://www.mjmwired.net/kernel/Documentation/sysctl/kernel.txt#141
As the documentation says, you can specify the special character "|", which tells the kernel to pipe the core dump to a program. As suggested, you could use |/bin/gzip -1 > /var/crash/core-%t-%p-%u.gz as the name; however, it doesn't seem to work for me. I expect the reason is that on my system the kernel doesn't treat the > character as output redirection; it probably just passes it as a parameter to gzip.
To avoid this problem, as others suggested, you can create a script in some location; I am using /home/<username>/crashes/core.sh. Create it using the following command, replacing <username> with your user (alternatively you can obviously change the entire path):
echo -e '#!/bin/bash\nexec /bin/gzip -f - >"/home/<username>/crashes/core-$1-$2-$3-$4-$5.gz"' > ~/crashes/core.sh
This script will take 5 input parameters, concatenate them and append them to the core path. The full paths must be specified inside ~/crashes/core.sh, and the location of the script itself can be changed. Now let's tell the kernel to use our executable, with parameters, when generating the file:
sudo sysctl -w kernel.core_pattern="|/home/<username>/crashes/core.sh %e %p %h %t"
Again, <username> should be replaced (or the entire path changed to match the location and name of your core.sh script). The next step is to crash some program; let's create an example crashing C++ file:
int main() {
    int *a = nullptr;
    int b = *a;
}
After compiling and running it, there are two possibilities: either we will see
Segmentation fault (core dumped)
Or
Segmentation fault
In case we see the latter, there are a few possible reasons:
ulimit is not set; ulimit -c should show the size limit for core files
apport or your distro's core dump collector is not running; this should be investigated further
there is an error in the script we wrote
To check that the other things aren't the reason, I suggest first testing with a basic dump path; the below should create /tmp/core.dump:
sudo sysctl -w kernel.core_pattern="/tmp/core.dump"
I know there is already an answer for this question, but it wasn't obvious to me why it wasn't working "out of the box", so I wanted to summarize my findings; hope it helps someone.

How do I increase the /proc/pid/cmdline 4096 byte limit?

For my Java apps with very long classpaths, I cannot see the main class specified near the end of the arg list when using ps. I think this stems from my Ubuntu system's size limit on /proc/pid/cmdline. How can I increase this limit?
For looking at Java processes jps is very useful.
This will give you the main class and jvm args:
jps -vl | grep <pid>
You can't change this dynamically; the limit is hard-coded in the kernel to PAGE_SIZE in fs/proc/base.c:
int res = 0;
unsigned int len;
struct mm_struct *mm = get_task_mm(task);
if (!mm)
        goto out;
if (!mm->arg_end)
        goto out_mm;    /* Shh! No looking before we're done */

len = mm->arg_end - mm->arg_start;

if (len > PAGE_SIZE)
        len = PAGE_SIZE;

res = access_process_vm(task, mm->arg_start, buffer, len, 0);
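For what it's worth, here is a small sketch of reading what the kernel does expose; /proc/<pid>/cmdline stores argv as NUL-separated bytes (the PID argument is whatever process you want to inspect):
import sys

pid = sys.argv[1]
# argv is stored NUL-separated; on kernels with the hard-coded limit
# the contents are truncated at PAGE_SIZE (typically 4096 bytes)
with open("/proc/%s/cmdline" % pid, "rb") as f:
    raw = f.read()

for arg in raw.rstrip(b"\0").split(b"\0"):
    print(arg.decode("utf-8", "replace"))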
The way I temporarily get around the 4096-character command line argument limitation of ps (or rather /proc/PID/cmdline) is by using a small script to replace the java command.
During development, I always use an unpacked JDK version from Sun and never the installed JRE or JDK of the OS, no matter if Linux or Windows (e.g. download the .bin versus the .rpm.bin).
I do not recommend changing the script for your default Java installation (e.g. because it might break updates or get overwritten or create problems or ...)
So assuming the java command is in /x/jdks/jdk1.6.0_16_x32/bin/java
first move the actual binary away:
mv /x/jdks/jdk1.6.0_16_x32/bin/java /x/jdks/jdk1.6.0_16_x32/bin/java.orig
then create a script /x/jdks/jdk1.6.0_16_x32/bin/java like e.g.:
#!/bin/bash
# log the full command line, then run the real binary
echo "$@" > /tmp/java.$$.cmdline
/x/jdks/jdk1.6.0_16_x32/bin/java.orig "$@"
and then make the script runnable
chmod a+x /x/jdks/jdk1.6.0_16_x32/bin/java
In case of copying and pasting the above, you should make sure that there are no extra spaces in /x/jdks/jdk1.6.0_16_x32/bin/java and that #!/bin/bash is the first line.
The complete command line ends up in e.g. /tmp/java.26835.cmdline where 26835 is the PID of the shell script.
I think there is also some shell limit on the length of the command line; I cannot remember exactly, but it was possibly 64K characters.
You can change the script to remove the command line text from /tmp/java.PROCESS_ID.cmdline at the end.
After I have got the command line, I always move the script to something like "java.script" and copy (cp -a) the actual binary java.orig back to java. I only use the script when I hit the 4K limit.
There might be problems with escaped characters and maybe even spaces in paths or such, but it works fine for me.
You can use jconsole to get access to the original command line without all the length limits.
It is possible to use newer Linux distributions where this limit was removed, for example RHEL 6.8 or later:
"The /proc/pid/cmdline file length limit for the ps command was previously hard-coded in the kernel to 4096 characters. This update makes sure the length of /proc/pid/cmdline is unlimited, which is especially useful for listing processes with long command line arguments. (BZ#1100069)"
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.8_Release_Notes/new_features_kernel.html
For Java based programs where you are just interested in inspecting the command line args your main class got, you can run:
jps -m
I'm pretty sure that if you're actually seeing the arguments truncated in /proc/$pid/cmdline then you're exceeding the maximum argument length supported by the OS. As far as I can tell, in Linux, the size is limited to the memory page size. See "ps ww" length restriction for reference.
The only way to get around that would be to recompile the kernel. If you're interested in going that far to resolve this then you may find this post useful: "Argument list too long": Beyond Arguments and Limitations
Additional reference:
ARG_MAX, maximum length of arguments for a new process
Perhaps the 'w' option to ps is what you want. Add two 'w's for even wider output. It tells ps to ignore the line width of the terminal.

Why is ARG_MAX not defined via limits.h?

On Fedora Core 7, I'm writing some code that relies on ARG_MAX. However, even if I #include <limits.h>, the constant is still not defined. My investigations show that it's present in <sys/linux/limits.h>, but this is supposed to be portable across Win32/Mac/Linux, so directly including it isn't an option. What's going on here?
The reason it's not in limits.h is that it's not a quantity giving the limits of the value range of an integral type based on bit width on the current architecture. That's the role assigned to limits.h by the ISO standard.
The value in which you're interested is not hardware-bound in practice and can vary from platform to platform and perhaps system build to system build.
The correct thing to do is to call sysconf and ask it for "ARG_MAX" or "_POSIX_ARG_MAX". I think that's the POSIX-compliant solution, anyway.
According to my documentation, you include one or both of unistd.h or limits.h based on which values you're requesting.
One other point: many implementations of the exec family of functions return E2BIG or a similar value if you try to call them with an oversized environment. This is one of the defined conditions under which exec can actually return.
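If you are working in Python rather than C, the same value is available through os.sysconf; a quick sketch:
import os

# equivalent to sysconf(_SC_ARG_MAX) in the C demonstration below
print(os.sysconf("SC_ARG_MAX"))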
For the edification of future people like myself who find themselves here after a web search for "arg_max posix", here is a demonstration of the POSIXly-correct method for discerning ARG_MAX on your system that Thomas Kammeyer refers to in his answer:
cc -x c <(echo '
#include <unistd.h>
#include <stdio.h>
int main() { printf("%li\n", sysconf(_SC_ARG_MAX)); }
')
This uses the process substitution feature of Bash; put the same lines in a file and run cc thefile.c if you are using some other shell.
Here's the output for macOS 10.14:
$ ./a.out
262144
Here's the output for a RHEL 7.x system configured for use in an HPC environment:
$ ./a.out
4611686018427387903
$ ./a.out | numfmt --to=iec-i # 'numfmt' from GNU coreutils
4.0Ei
For contrast, here is the method prescribed by https://porkmail.org/era/unix/arg-max.html, which uses the C preprocessor:
cpp <<HERE | tail -1
#include <limits.h>
ARG_MAX
HERE
This does not work on Linux for reasons still not entirely clear to me (I am not a systems programmer and not conversant in the POSIX or ISO specs), but it is probably explained above.
ARG_MAX is defined in /usr/include/linux/limits.h. My Linux kernel version is 3.2.0-38.
