How do I increase the /proc/pid/cmdline 4096 byte limit? - linux

For my Java apps with very long classpaths, I cannot see the main class specified near the end of the arg list when using ps. I think this stems from my Ubuntu system's size limit on /proc/pid/cmdline. How can I increase this limit?

For looking at Java processes jps is very useful.
This will give you the main class and jvm args:
jps -vl | grep <pid>

You can't change this dynamically, the limit is hard-coded in the kernel to PAGE_SIZE in fs/proc/base.c:
274 int res = 0;
275 unsigned int len;
276 struct mm_struct *mm = get_task_mm(task);
277 if (!mm)
278 goto out;
279 if (!mm->arg_end)
280 goto out_mm; /* Shh! No looking before we're done */
281
282 len = mm->arg_end - mm->arg_start;
283
284 if (len > PAGE_SIZE)
285 len = PAGE_SIZE;
286
287 res = access_process_vm(task, mm->arg_start, buffer, len, 0);

I temporarily get around the 4096 character command line argument limitation of ps (or rather /proc/PID/cmdline) is by using a small script to replace the java command.
During development, I always use an unpacked JDK version from SUN and never use the installed JRE or JDK of the OS no matter if Linux or Windows (eg. download the bin versus the rpm.bin).
I do not recommend changing the script for your default Java installation (e.g. because it might break updates or get overwritten or create problems or ...)
So assuming the java command is in /x/jdks/jdk1.6.0_16_x32/bin/java
first move the actual binary away:
mv /x/jdks/jdk1.6.0_16_x32/bin/java /x/jdks/jdk1.6.0_16_x32/bin/java.orig
then create a script /x/jdks/jdk1.6.0_16_x32/bin/java like e.g.:
#!/bin/bash
echo "$#" > /tmp/java.$$.cmdline
/x/jdks/jdk1.6.0_16_x32/bin/java.orig $#
and then make the script runnable
chmod a+x /x/jdks/jdk1.6.0_16_x32/bin/java
in case of copy and pasting the above, you should make sure that there are not extra spaces in /x/jdks/jdk1.6.0_16_x32/bin/java and #!/bin/bash is the first line
The complete command line ends up in e.g. /tmp/java.26835.cmdline where 26835 is the PID of the shell script.
I think there is also some shell limit on the number of command line arguments, cannot remember but it was possibly 64K characters.
you can change the script to remove the command line text from /tmp/java.PROCESS_ID.cmdline
at the end
After I got the commandline, I always move the script to something like "java.script" and copy (cp -a) the actual binary java.orig back to java. I only use the script when I hit the 4K limit.
There might be problems with escaped characters and maybe even spaces in paths or such, but it works fine for me.

You can use jconsole to get access to the original command line without all the length limits.

It is possible to use newer linux distributions, where this limit was removed, for example RHEL 6.8 or later
"The /proc/pid/cmdline file length limit for the ps command was previously hard-coded in the kernel to 4096 characters. This update makes sure the length of /proc/pid/cmdline is unlimited, which is especially useful for listing processes with long command line arguments. (BZ#1100069)"
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.8_Release_Notes/new_features_kernel.html

For Java based programs where you are just interested in inspecting the command line args your main class got, you can run:
jps -m

I'm pretty sure that if you're actually seeing the arguments truncated in /proc/$pid/cmdline then you're actually exceeding the maximum argument length supported by the OS. As far as I can tell, in Linux, the size is limited to the memory page size. See "ps ww" length restriction for reference.
The only way to get around that would be to recompile the kernel. If you're interested in going that far to resolve this then you may find this post useful: "Argument list too long": Beyond Arguments and Limitations
Additional reference:
ARG_MAX, maximum length of arguments for a new process

Perhaps the 'w' parameter to ps is what you want. Add two 'w' for greater output. It tells ps to ignore the line width of the terminal.

Related

Use more than one core in bash

I have a linux tool that (greatly simplifying) cuts me the sequences specified in illumnaSeq file. I have 32 files to grind. One file is processed in about 5 hours. I have a server on the centos, it has 128 cores.
I've found a few solutions, but each one works in a way that only uses one core. The last one seems to fire 32 nohups, but it'll still pressurize the whole thing with one core.
My question is, does anyone have any idea how to use the server's potential? Because basically every file can be processed independently, there are no relations between them.
This is the current version of the script and I don't know why it only uses one core. I wrote it with the help of advice here on stack and found on the Internet:
#!/bin/bash
FILES=/home/daw/raw/*
count=0
for f in $FILES
to
base=${f##*/}
echo "process $f file..."
nohup /home/daw/scythe/scythe -a /home/daw/scythe/illumina_adapters.fa -o "OUT$base" $f &
(( count ++ ))
if (( count = 31 )); then
wait
count=0
fi
done
I'm explaining: FILES is a list of files from the raw folder.
The "core" line to execute nohup: the first path is the path to the tool, -a path is the path to the file with paternas to cut, out saves the same file name as the processed + OUT at the beginning. The last parameter is the input file to be processed.
Here readme tools:
https://github.com/vsbuffalo/scythe
Does anybody know how you can handle it?
P.S. I also tried move nohup before count, but it's still use one core. I have no limitation on server.
IMHO, the most likely solution is GNU Parallel, so you can run up to say, 64 jobs in parallel something like this:
parallel -j 64 /home/daw/scythe/scythe -a /home/daw/scythe/illumina_adapters.fa -o OUT{.} {} ::: /home/daw/raw/*
This has the benefit that jobs are not batched, it keeps 64 running at all times, starting a new one as each job finishes, which is better than waiting potentially 4.9 hours for all 32 of your jobs to finish before starting the last one which takes a further 5 hours after that. Note that I arbitrarily chose 64 jobs here, if you don't specify otherwise, GNU Parallel will run 1 job per CPU core you have.
Useful additional parameters are:
parallel --bar ... gives a progress bar
parallel --dry-run ... does a dry run so you can see what it would do without actually doing anything
If you have multiple servers available, you can add them in a list and GNU Parallel will distribute the jobs amongst them too:
parallel -S server1,server2,server3 ...

Why there are written randomly null characters in some of my output files?

I have some scripts in my RedHat server which contains Microfocus COBOL programs which generates a huge file of aprox 3GB in a sort of time of 3 hours on average. The programs write their output files directly in the directory /my_test/files/.
The problem is that sometimes (randomly) some files generated contains null character sections in the middle of the file. And when I check them up, if I reexecute the script again (with the same input parameters), the output file is perfectly generated (it doesn't contain any nullchars). I've checked it a lot of times and I'm pretty sure is not the fault of the COBOL programs (they use quite simple operations). The space in use of that folder is 40%.
Some programs updates the database, and if they finish with return code 0, then the changes are commited, and I don't have any backup, so this is the point of what I'm doing.
This is an example of a file declaration of one of the problematic COBOL programs:
FILE-CONTROL.
SELECT MYFILE
ASSIGN TO MYFILE
ORGANIZATION IS SEQUENTIAL
ACCESS MODE IS SEQUENTIAL
FILE STATUS IS FILE-STATUS.
DATA DIVISION.
FILE SECTION.
FD MYFILE
LABEL RECORD STANDARD
RECORDING MODE F.
01 REG-OUTPUT PIC X(400).
I've also checked for the nulls in the COBOL programs before the NULL files, but unfortunately there are no nulls spotted.
Then I thought about creating a crontab which executes the following script each 5 seconds:
if [[ -f /tmp/sorry_im_working ]]; then
exit
fi
trap 'rm -rf /tmp/sorry_im_working' EXIT
touch /tmp/sorry_im_working
lsof | awk 'BEGIN{
sfiles="";
} {
if($1=="PROGRAM" && $9~/my_test\/files/){
sfiles=sfiles" "$9
}
}END{
comm="find "sfiles" -newermt \x27-2 seconds\x27 -exec env LC_ALL=C bash -c \x27grep -Pq \x22\x5Cx00{200}\x22 <(tail -c 1000 {}) && echo {}\x27 \x5C\x3B";
while(comm | getline sout){
print sout;
};
close(comm);
}' >> /home/ouhma/nullfiles.txt
Therefore, I would like to ask you the following questions:
Any idea of what's going on here?
Do you have any other way to trigger the lastest modified files?
What other information of interest could I add to my log?
If you construct a file d with only \x00 :
hexdump -C d
00000000 5c 78 30 30 0a |\x00.|
00000005
and you :
grep -Faq '\x00' d;echo $?
0
But they're no null caracter inside d.
Maybe, is better to use grep -Paq '\x00'
Depending on the configuration and record structure that is used for the file MF will pad different characters with hex null.
Please copy the 'ASSIGN' clause and the 'FD' clause of the COBOL program.
BTW: if your COBOL programs run three ours to do some calculations and write three GB of data back you should investigate the storage and / or get a COBOL programmer to check the programs, sounds much to slow.
I suspect you are have non-printable characters in your file, the null inserts can be controlled, take a look # INSERTNULL file configuration.

How do I implement "file -s <file>" on Linux in pure Go?

Intent:
Does Go have the functionality (package or otherwise) to perform a special file stat on Linux akin to the command file -s <path>
Example:
[root#localhost ~]# file /proc/uptime
/proc/uptime: empty
[root#localhost ~]# file -s /proc/uptime
/proc/uptime: ASCII text
Use Case:
I have a fileglob of files in /proc/* that I need to very quickly detect if they are truly empty instead of appearing to be empty.
Using The os Package:
Code:
result,_ := os.Stat("/proc/uptime")
fmt.Println("Name:",result.Name()," Size:",result.Size()," Mode:",int(result.Mode()))
fmt.Printf("%q",result)
Result:
Name: uptime Size: 0 Mode: 292
&{"uptime" '\x00' 'Ĥ' {%!q(int64=63606896088) %!q(int32=413685520) %!q(*time.Location=&{ [] [] 0 0 <nil>})} {'\x03' %!q(uint64=4026532071) '\x01' '脤' '\x00' '\x00' '\x00' '\x00' '\x00' 'Ѐ' '\x00' {%!q(int64=1471299288) %!q(int64=413685520)} {%!q(int64=1471299288) %!q(int64=413685520)} {%!q(int64=1471299288) %!q(int64=413685520)} ['\x00' '\x00' '\x00']}}
Obvious Workaround:
There is the obvious workaround of the following. But it's a little over the top to need to call in a bash shell in order to get file stats.
output,_ := exec.Command("bash","-c","file -s","/proc/uptime").Output()
//parse output etc...
EDIT/MY PRACTICAL USE CASE:
Quickly determining which files are zero size without needing to read each one of them first.
file -s /cgroup/memory/lsf/<cluster>/*/tasks | <clean up commands> | uniq -c
6 /cgroup/memory/lsf/<cluster>/<jobid>/tasks: ASCII text
805 /cgroup/memory/lsf/<cluster>/<jobid>/tasks: empty
So in this case, I know that only those 6 jobs are running and the rest (805) have terminated. Reading the file works like this:
# cat /cgroup/memory/lsf/<cluster>/<jobid>/tasks
#
or
# cat /cgroup/memory/lsf/<cluster>/<jobid>/tasks
12352
53455
...
I'm afraid you might be confusing matters here: file is special in precisely a way it "knows" a set of heuristics to carry out its tasks.
To my knowledge, Go does not have anything like this in its standard library, and I've not came across a 3rd-party package implementing a file-like functionality (though I invite you to search by relevant keywords on http://godoc.org)
On the other hand, Go provides full access to the syscall interface of the underlying OS so when it comes to querying the OS in a way file does it, there's nothing you could not do in plain Go.
So I suggest you to just fetch the source code of file, learn what it does in its mode turned on by the "-s" command-line option and implement that in your Go code.
We'll try to have you with specific problems doing that — should you have any.
Update
Looks like I've managed to grasp the OP is struggling with: a simple check:
$ stat -c %s /proc/$$/status && wc -c < $_
0
849
That is, the stat call on a file under /proc shows it has no contents but actually reading from that file returns that contents.
OK, so the solution is simple: instead of doing a call to os.Stat() while traversing the subtree of the filesystem one should instead merely attempt to read a single byte from the file, like in:
var buf [1]byte
f, err := os.Open(fname)
if err != nil {
// do something, or maybe ignore.
// A not existing file is OK to ignore
// (the POSIX error code will be ENOENT)
// because after the `path/filepath.Walk()` fetched an entry for
// this file from its directory, the file might well have gone.
}
_, err = f.Read(buf[:])
if err != nil {
if err == io.EOF {
// OK, we failed to read 1 byte, so the file is empty.
}
// Otherwise, deal with the error
}
f.Close()
You might try to be more clever and first obtain the stat information
(using a call to os.Stat()) to see if the file is a regular file—to
not attempt reading from sockets etc.
I have a fileglob of files in /proc/* that I need to very quickly
detect if they are truly empty instead of appearing to be empty.
They are truly empty in some sense (eg. they occupy no space on file system). If you want to check whether any data can be read from them, try reading from them - that's what file -s does:
-s, --special-files
Normally, file only attempts to read and
determine the type of argument files which stat(2) reports are
ordinary files. This prevents problems, because reading special files
may have peculiar consequences. Specifying the -s option causes file
to also read argument files which are block or character special
files. This is useful for determining the filesystem types of the
data in raw disk partitions, which are block special files. This
option also causes file to disregard the file size as reported by
stat(2) since on some systems it reports a zero size for raw disk
partitions.

How long is the limitation of the length of the file `/proc/{processId}/cmdline`?

I have a java process and its classpath includes many jars, so the startup command is very long.
Assume that the process id is 110101, when I check the command by command cat /proc/110101/cmdline, I find the command is incomplete, it only contains about 4000 characters.
Is there any other method to show the original startup command?
man proc
/proc/[pid]/cmdline
This holds the complete command line for the process, unless
the process is a zombie. In the latter case, there is nothing
in this file: that is, a read on this file will return 0
characters. The command-line arguments appear in this file as
a set of strings separated by null bytes ('\0'), with a
further null byte after the last string.
It may not give you all args by cat the file.

Compressing the core files during core generation

Is there way to compress the core files during core dump generation?
If the storage space is limited in the system, is there a way of conserving it in case of need for core dump generation with immediate compression?
Ideally the method would work on older versions of linux such as 2.6.x.
The Linux kernel /proc/sys/kernel/core_pattern file will do what you want: http://www.mjmwired.net/kernel/Documentation/sysctl/kernel.txt#191
Set the filename to something like |/bin/gzip -1 > /var/crash/core-%t-%p-%u.gz and your core files should be saved compressed for you.
For an embedded Linux systems, following script change perfectly works to generate compressed core files in 2 steps
step 1: create a script
touch /bin/gen_compress_core.sh
chmod +x /bin/gen_compress_core.sh
cat > /bin/gen_compress_core.sh #!/bin/sh exec /bin/gzip -f - >"/var/core/core-$1.$2.gz"
ctrl +d
step 2: update the core pattern file
cat > /proc/sys/kernel/core_pattern |/bin/gen_compress_core.sh %e %p ctrl+d
As suggested by other answer, the Linux kernel /proc/sys/kernel/core_pattern file is good place to start: http://www.mjmwired.net/kernel/Documentation/sysctl/kernel.txt#141
As documentation says you can specify the special character "|" which will tell kernel to output the file to script. As suggested you could use |/bin/gzip -1 > /var/crash/core-%t-%p-%u.gz as name, however it doesn't seem to work for me. I expect that the reason is that on my system kernel doesn't treat the > character as a output, rather it probably passes it as a parameter to gzip.
In order to avoid this problem, like other suggested you can create your file in some location I am using /home//crash/core.sh, create it using the following command, replacing with your user. Alternatively you can also obviously change the entire path.
echo -e '#!/bin/bash\nexec /bin/gzip -f - >"/home/<username>/crashes/core-$1-$2-$3-$4-$5.gz"' > ~/crashes/core.sh
Now this script will take 5 input parameters and concatenate them and add to core-path. The full paths must be specified in the ~/crashes/core.sh. Also the location of this script can be specified. Now lets tell kernel to use tour executable with parameters when generating file:
sudo sysctl -w kernel.core_pattern="|/home/<username>/crashes/core.sh %e %p %h %t"
Again should be replaced (or entire path to match location and name of core.sh script). Next step is to crash some program, lets create example crashing cpp file:
int main (){
int * a = nullptr;
int b = *a;
}
After compiling and running there are 2 options, either we will see:
Segmentation fault (core dumped)
Or
Segmentation fault
In case we see the latter, there are few possible reasons.
ulimit is not set, ulimit -c should specify what is limit for cores
apport or your distro core dump collector is not running, this should be investigated further
there is an error in script we wrote, I suggest than checking some basic dump path to check if the other things aren't reason the below should create /tmp/core.dump:
sudo sysctl -w kernel.core_pattern="/tmp/core.dump"
I know there is already an answer for this question however it wasn't obvious for me why it isn't working "out of the box" so I wanted to summarize my findings, hope it helps someone.

Resources