Multiple nested echo statements piped to a command - kernel limitation (Linux)

I've got a straightforward bash script generated with fwbuilder that nests several echo statements and pipes them through to iptables-restore.
We compile this way instead of just having multiple "iptables -A xxx" lines since it compiles and deploys much quicker and it also doesn't drop existing connections.
The problem is we seem to have hit the limit of allowed multiple redirects (~23'850 lines don't work, ~23'600 lines do).
Run it on kernel 2.6.18 (CentOS 5.x) and it breaks, run it on 2.6.32 (6.x) and it works like a charm.
Script essentially looks like this, comes out as just one long line piped to the command:
(echo "1"; echo "2"; echo "3"; ... ; echo "25000") | /do/anything
So I guess the question is, is there an easy way to increase this limit without recompiling the kernel? I'd imagine it's some sort of stdin character limitation of piping. Or do I have to do an OS upgrade?
edit: Oh, and I would also like to add that when running on the older kernel, no errors are shown, but a segfault shows up in dmesg.

The reason you're not observing the problem on 2.6.32 but are observing it on 2.6.18 is that, starting with kernel 2.6.23, the ARG_MAX limitation was removed. This is the commit for the change.
For some ways to circumvent the limit, see ARG_MAX.
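If you want to confirm that this is what you're hitting before deciding between a workaround and an OS upgrade, the limit is easy to inspect (a sketch; the exact value varies per system and kernel):
# On the box that fails, report the kernel's argument/environment size limit
getconf ARG_MAX
# Pre-2.6.23 kernels typically report a fixed 131072 (32 pages of 4 KiB);
# newer kernels derive the limit from the stack rlimit instead.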

Can you use a here-doc instead?
cat <<EOF | /do/anything
1
2
3
...
25000
EOF
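A here-document also keeps the data off the command line entirely, so ARG_MAX never comes into play. And if the payload really were just a numeric sequence as in the simplified example (rather than the real iptables rules), you could generate it on the fly instead of embedding it in the script at all:
seq 1 25000 | /do/anything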

Bash: Unexpected parallel behavior when reading arguments from file using xargs

Previous
This is a follow-up to this question.
Specs
My system is a dedicated server running Ubuntu Desktop, Release 12.04 (precise) 64-bit, 3.14.32-xxxx-std-ipv6-64. Neither the release nor the kernel can be upgraded, but I can install any package.
Problem
The problem described in the question above seems to be solved; however, that solution doesn't work for me. I've installed the latest lftp and parallel packages and they seem to work fine on their own.
Running lftp works fine.
Running ./job.sh ftp.microsoft.com works fine, but I needed to chmod +x the script first.
Running sed 's/|.*$//' end_unique.txt | xargs parallel -j20 ./job.sh ::: does not work and produces bash errors in the form of /bin/bash: <server>: command not found.
To simplify things, I cleaned the input file end_unique.txt, now it has the following format for each line:
<server>
Each line ends in a CRLF, because it is imported from a Windows server.
Edit 1:
This is the job.sh script:
#!/bin/sh
server="$1"
lftp -e "find .; exit" "$server" >"$server-files.txt"
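Since the input file originally had CRLF line endings, a slightly more defensive variant of the script could strip a trailing carriage return itself (just a sketch, not part of the original job.sh):
#!/bin/sh
# remove any carriage return the argument may still carry from the Windows export
server=$(printf '%s\n' "$1" | tr -d '\r')
lftp -e "find .; exit" "$server" >"$server-files.txt"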
Edit 2:
I took the file and ran it against fromdos. Now it should be standard unix format, one server per line. Keep in mind that the server in the file can vary in format:
ftp.server.com
www.server.com
server.com
123.456.789.190
etc. All of those servers are ftp servers, accessible by ftp://<serverfromfile>/.
With :::, parallel expects the list of arguments it needs to complete the commands it's going to run to appear on the command line, as in
parallel -j20 ./job.sh ::: server1 server2 server3
Without ::: it reads the arguments from stdin, which serves us better in this case. You can simply say
parallel -j20 ./job.sh < end_unique.txt
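Combined with the sed step from the question (which strips everything after the |), that becomes something like this, assuming end_unique.txt is already in Unix line-ending format:
sed 's/|.*$//' end_unique.txt | parallel -j20 ./job.sh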
Addendum: Things that can go wrong
Make certain of two things:
That you are using GNU parallel and not another version (such as the one from moreutils), because, as far as I'm aware, only the GNU version supports reading an argument list from stdin, and
That GNU parallel is not configured to disable the GNU extensions. It turned out, after a lengthy discussion in the comments, that they are disabled by default on Ubuntu 12.04, so it is not inconceivable that this sort of thing might be found elsewhere (particularly downstream from Ubuntu). Such a configuration can hide in any of the following (a quick check is sketched after this list):
The environment variable $PARALLEL,
/etc/parallel/config, or
~/.parallel/config
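A quick way to check all three places at once (a sketch; missing files are simply ignored, and on affected systems the culprit is typically a line containing --tollef):
# show anything that might be silently switching parallel out of GNU mode
echo "PARALLEL=$PARALLEL"
cat /etc/parallel/config ~/.parallel/config 2>/dev/null
Passing --gnu explicitly on the parallel command line should also override such a configuration.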
If the GNU version of parallel is not available to you, and if your argument list is not too long for the shell and none of the arguments in it contain whitespace, the equivalent with the moreutils parallel is
parallel -j20 job.sh -- $(cat end_unique.txt)
This did not work for the OP because the file contained more servers than the shell was willing to put on a command line, but it might work for others with similar problems.

Using the tee command to show output promptly, even for a single command

I am new to using the tee command.
I am trying to run one of my programs, which takes a long time to finish but prints out information as it progresses. I am using tee to save the output to a file as well as to see the output in the shell (bash).
But the problem is that tee doesn't forward the output to the shell until the end of my command.
Is there any way to do that ?
I am using Debian and bash.
This actually depends on the amount of output and the implementation of whatever command you are running. No program is obliged to print stuff straight to stdout or stderr and flush it all the time. So even though most C runtime implementations flush after a certain amount of data has been written using one of the runtime routines, such as printf, this may not be true for every implementation.
If tee doesn't output it right away, it is likely only receiving the input at the very end of the run of your command. It might be helpful to mention which exact command it is.
The problem you are experienced is most probably related to buffering.
You may have a look at stdbuf command, which does the following:
stdbuf - Run COMMAND, with modified buffering operations for its standard streams.
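For example, forcing line-buffered output on the producing command is usually enough for tee to show progress as it happens (a sketch; long_running_program stands in for your actual command, and this only helps programs whose C stdio buffering stdbuf can adjust):
# run the producer with line-buffered stdout so tee receives each line immediately
stdbuf -oL ./long_running_program | tee output.log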
If you were to post your usage I could give a better answer, but as it is
(for i in `seq 10`; do echo $i; sleep 1s; done) | tee ./tmp
is proper usage of the tee command and seems to work. Replace the part before the pipe with your command and you should be good to go.

Read stdout from a running process (embedded Linux)

Before flagging the question as a duplicate, please read about the various issues I encountered.
A bit of background: we are developing a C++ application running on an embedded ARM SBC using a lite variant of Debian Linux. The application starts at boot, launched by the boot script, and prints various information to stdout. What we would like is the ability to connect using SSH/Telnet and read the application's output, without having to kill the process and restart it for the current bash session. I want to create a simple .sh script for non-tech-savvy people to use.
The first solution for the similar question posted here is to use gdb. First, it's not user-friendly (you need to type multiple commands manually), and, for a reason I don't understand, it doesn't seem to output anything into the file.
The second solution, strace -ewrite -p PID, works perfectly; that's what I want. The problem is that there's a lot more information than just stdout, and it's badly formatted.
I managed to get an "acceptable" result with strace -e write=1 -s 1024 -p 20049 2>&1 | grep "write(1," but it still has the superfluous write(1, "...", 19) = 19 text. Up to this point it's simply a bit of string formatting, and I've found on multiple other pages this line, which supposedly achieves good formatting: strace -ff -e write=1,2 -s 1024 -p PID 2>&1 | grep "^ |" | cut -c11-60 | sed -e 's/ //g' | xxd -r -p
There are some things I find strange in this command (why -ff? why grep "^ |"? why use xxd there?), and it just doesn't output anything when I try it.
Unfortunately, we use an old, buggy version of busybox (1.7.1) that has problems with multiple pipes, and that bug gives me bad results. For example, if I only do the grep it works, and if I only do the cut it also works, but grep "write(1," | cut -c11-60 returns nothing.
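If the old busybox really cannot chain more than one pipe, one possible workaround is to let a single sed do both the filtering and the trimming (only a sketch: the payload will still contain backslash escapes such as \n, and strace output formats differ slightly between versions):
# keep only write(1, ...) lines and strip everything but the quoted payload
strace -e write -s 1024 -p 20049 2>&1 | sed -n 's/^write(1, "\(.*\)".*/\1/p'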
I know the real solution would simply be to update busybox and use those multiple pipes to format the string, but we can't update it since the OS distribution is already installed on thousands of boards shipped to our clients worldwide.
Anyone have a miraculous solution? Thanks
Screen can be connected to an existing process using reptyr (http://blog.nelhage.com/2011/01/reptyr-attach-a-running-process-to-a-new-terminal/), or you can use neercs (http://caca.zoy.org/wiki/neercs), which I haven't used but which apparently is like screen and supports attaching to an existing process all by itself.
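If reptyr can be built for the board, the typical flow is to start screen in the SSH session first, then steal the application's terminal (a sketch; it requires ptrace to be allowed on the target, and 20049 is just the example PID from the question):
# inside the new SSH session, start screen so the reattached terminal
# survives the SSH connection going away
screen
# now grab the running application's terminal
reptyr 20049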

How do I set a ulimit from inside a Perl script that applies to its children?

I have a Perl script that does various installation steps to set up a development box for our company. It runs various shell scripts, some of which crash due to lower than required ulimits (specifically, stack size -s in my case).
Therefore, I'd like to set a ulimit that would apply to all scripts (children) started from within my main Perl one, but I am not sure how to achieve that - any attempts at calling ulimit from within the script only set it on that specific child shell, which immediately exits.
I am aware that I can call ulimit before I run the Perl script or use /etc/security/limits.conf but I don't want the user to know any of this - they should only know how to run the script, which should take care of all of that for them.
I can also run ulimit every time I run a command, like this ulimit -s BLA; ./cmd but I don't want to duplicate this every time and I feel like there's a better, cleaner solution out there.
Another crazy "workaround" is to make a wrapper script called BLA.sh which would set ulimit and call BLA.pl, but again, it's a hack in my mind and now I'd have 2 scripts (I could even make BLA.pl call itself with "ulimit -s BLA; ./BLA.pl --foo" and act differently based on whether it sees --foo or not but that's even hackier than before).
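For what it's worth, that wrapper would only be a few lines; here is a sketch of the BLA.sh idea (the limit value is just an example):
#!/bin/sh
# raise the stack size for this shell and everything it spawns,
# then replace the shell with the real Perl installer
ulimit -s 65536        # kbytes; pick whatever the child scripts need
exec ./BLA.pl "$@"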
Finally, apparently I could install BSD::Resource but I'd like to avoid using external dependencies.
So what is THE way to set the ulimit from within a Perl script and make it apply to all children?
Thank you.
You've already answered your question: use BSD::Resource.
There isn't anything in the Perl core that interfaces with setrlimit. If you can't (or won't) use the standard method, then you have to use a hack. Any of the methods you've already described would work. (Note that you could create a subroutine to prepend ulimit -s BLA; to every command, and then use that sub instead of system.)
Here's an example of how to set the CPU limit without using BSD::Resource (but assuming the Perl system headers are there). To adapt it to other resources, make the obvious changes.
require 'syscall.ph';
require 'sys/resource.ph';
# set the soft cpu limit to 1 (second), and the hard limit to 10.
$rstruct = pack "L!L!",1,10; # L! means native long unsigned int.
syscall(&SYS_setrlimit,&RLIMIT_CPU,$rstruct);
This assumes knowledge that rlim_t is in fact unsigned long; I don't know if there's a way to extract this info from the Perl headers.
You can always wrap your perl in a little shell script:
#!/bin/sh -- # --*-Perl-*--
ulimit -n 2048
exec /usr/bin/perl -x -S $0 ${1+"$@"}
#!/usr/bin/perl
#line 6
use strict;
# etc, etc....
It's ugly, and obviously, script start up time will be slightly longer.
I ended up prepending ulimit -s BLA to the commands that needed it. I specifically didn't want to go with BSD::Resource because it's not a default Perl package and was missing on about half of the existing dev machines. No user interaction was a specific requirement.

REDUX: How to overcome an incompatibility between the ksh on Linux vs. that installed on AIX/Solaris/HPUX?

I have uncovered another problem in the effort that we are making to port several hundreds of ksh scripts from AIX, Solaris and HPUX to Linux. See here for the previous problem.
This code:
#!/bin/ksh
if [ -a k* ]; then
echo "Oh yeah!"
else
echo "No way!"
fi
exit 0
(when run in a directory with several files whose names start with k) produces "Oh yeah!" when called with the AT&T ksh variants (ksh88 and ksh93). On the other hand, it produces an error message followed by "No way!" on the other ksh variants (pdksh, MKS ksh and bash).
Again, my questions are:
Is there an environment variable that will cause pdksh to behave like ksh93? Failing that:
Is there an option on pdksh to get the required behavior?
I wouldn't use pdksh on Linux anymore.
Since AT&T ksh became open source, there are packages available from the various Linux distributions. E.g. Red Hat Enterprise Linux and CentOS include ksh93 as the "ksh" RPM package.
pdksh is still mentioned in many installation requirement documentations from software vendors. We replaced pdksh on all our Linux systems with ksh93 with no problems so far.
Well after one year there seems to be no solution to my problem.
I am adding this answer to say that I will have to live with it......
In Bash the test -a operation is for a single file.
I'm guessing that in ksh88 the test -a operation is also for a single file, but it doesn't complain because the extra test words are an unspecified condition as far as -a is concerned.
you want something like
for K in /etc/rc2.d/K* ; do test -a "$K" && echo heck-yea ; done
I can say that ksh93 works just like bash in this regard.
Regrettably, I think the code was written poorly; that's my opinion, and likely an unfair one, since the root cause of the problem is the ksh88 built-in test allowing for sloppy code.
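If touching the scripts is an option at all, a portable rewrite of the original snippet that behaves the same on ksh88, ksh93, pdksh and bash is to let the shell expand the glob and test the first match (a sketch; it uses the POSIX -e test and assumes no file is literally named k*):
#!/bin/ksh
# expand the glob; $1 becomes the first match, or the literal pattern if none
set -- k*
if [ -e "$1" ]; then
echo "Oh yeah!"
else
echo "No way!"
fi
exit 0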
You do realize that [ is an alias (often a link, symbolic or hard) for /usr/bin/test, right? So perhaps the actual problem is different versions of /usr/bin/test ?
OTOH, ksh overrides it with a builtin. Maybe there's a way to get it to not do that? or maybe you can explicitly alias [ to /usr/bin/test, if /usr/bin/test on all platforms is compatible?
