Dropping privileges from perl script? - linux

I have a Perl script running as root, and from within it I want to execute a system command bar as a less privileged user foo. So I have my system call wrapped as follows:
sub dosys
{
system(@_) == 0
or die "system @_ failed: $?";
}
And so I want to say:
as user foo dosys("bar")
Is there a mechanism within perl or the underlying bash shell that I can use to do this? (I would prefer one that didn't require installing an additional cpan library if possible)

The POSIX module is a Perl core module, and it includes the functions:
setuid()
setgid()
and related get*id() functions, though the values are also available through special variables:
$) and $( (effective and real GID)
$> and $< (effective and real UID)
You can also try setting those directly (per $EGID and $UID).

system('su www-data -c whoami')
> www-data

You have to change groups first, remember to quash supplementary groups, and then change user. You'll want to do this in a separate process, so that the [UG]ID changing doesn't affect privs on your root process.
sub su_system {
my $acct = shift;
my $gid = getgrnam $acct; # XXX error checking!
my $uid = getpwnam $acct;
if (fork) { # XXX error checking!
wait;
return $? >> 8;
}
# -- child
$( = $) = "$gid $gid"; # No supp. groups; see perlvar $)
$< = $> = $uid;
exec @_; # XXX not as safe as exec {prog} @argv
# oh, and what if $acct had [ug]id zero? darn
}
Proceed with caution.
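The caveat in the last comment (an account whose uid or gid is zero) can be checked before forking. A minimal sketch in shell, assuming the standard id(1) utility is available; the function name unprivileged_ids is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print "uid gid" for an account, but refuse
# unknown accounts and any account that maps to uid 0 or gid 0.
unprivileged_ids() {
    local acct=$1 uid gid
    uid="$(id -u "$acct" 2>/dev/null)" || return 1   # unknown account
    gid="$(id -g "$acct" 2>/dev/null)" || return 1
    if [ "$uid" -eq 0 ] || [ "$gid" -eq 0 ]; then
        echo "refusing: $acct has uid $uid / gid $gid" >&2
        return 1
    fi
    printf '%s %s\n' "$uid" "$gid"
}
```

A caller would bail out when the helper fails, for example unprivileged_ids foo || exit 1, before doing any fork and exec.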


Changing global var inside function doesnt mutate global variable [duplicate]

I'm working with this:
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
I have a script like below:
#!/bin/bash
e=2
function test1() {
e=4
echo "hello"
}
test1
echo "$e"
Which returns:
hello
4
But if I assign the result of the function to a variable, the global variable e is not modified:
#!/bin/bash
e=2
function test1() {
e=4
echo "hello"
}
ret=$(test1)
echo "$ret"
echo "$e"
Returns:
hello
2
I've heard of the use of eval in this case, so I did this in test1:
eval 'e=4'
But the same result.
Could you explain why it is not modified? How can I save the output of the test1 function in ret and modify the global variable too?
When you use a command substitution (i.e., the $(...) construct), you are creating a subshell. Subshells inherit variables from their parent shells, but this only works one way: A subshell cannot modify the environment of its parent shell.
Your variable e is set within a subshell, but not the parent shell. There are two ways to pass values from a subshell to its parent. First, you can output something to stdout, then capture it with a command substitution:
myfunc() {
echo "Hello"
}
var="$(myfunc)"
echo "$var"
The above outputs:
Hello
For a numerical value in the range of 0 through 255, you can use return to pass the number as the exit status:
mysecondfunc() {
echo "Hello"
return 4
}
var="$(mysecondfunc)"
num_var=$?
echo "$var - num is $num_var"
This outputs:
Hello - num is 4
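Note the 0 through 255 limit: the exit status is a single byte, so larger numbers wrap around. A small sketch (the function name status_demo is hypothetical) that returns its argument as the status while also printing to stdout:

```shell
#!/bin/bash
# Hypothetical helper: prints a message and returns its first argument as status.
status_demo() {
    echo "Hello"
    return "$1"
}

out="$(status_demo 4)";   small=$?     # fits into one byte
out="$(status_demo 300)"; wrapped=$?   # 300 does not fit, so it is reduced modulo 256
echo "$out - small is $small, wrapped is $wrapped"
```

This prints Hello - small is 4, wrapped is 44, which is why the exit status is only suitable for small numbers.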
This needs bash 4.1 if you use {fd}, or bash 4.3 if you use local -n.
The rest should work in bash 3.x I hope. I am not completely sure due to printf %q - this might be a bash 4 feature.
Summary
Your example can be modified as follows to achieve the desired effect:
# Add following 4 lines:
_passback() { while [ 1 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; return $1; }
passback() { _passback "$@" "$?"; }
_capture() { { out="$("${@:2}" 3<&-; "$2_" >&3)"; ret=$?; printf "%q=%q;" "$1" "$out"; } 3>&1; echo "(exit $ret)"; }
capture() { eval "$(_capture "$@")"; }
e=2
# Add following line, called "Annotation"
function test1_() { passback e; }
function test1() {
e=4
echo "hello"
}
# Change following line to:
capture ret test1
echo "$ret"
echo "$e"
prints as desired:
hello
4
Note that this solution:
Works for e=1000, too.
Preserves $? if you need $?
The only bad side effects are:
It needs a modern bash.
It forks considerably more often.
It needs the annotation (a function named after yours, with an added _).
It sacrifices file descriptor 3.
You can change it to another FD if you need that.
In _capture, just replace all occurrences of 3 with another (higher) number.
The following (which is quite long, sorry for that) hopefully explains how to adapt this recipe to other scripts, too.
The problem
d() { let x++; date +%Y%m%d-%H%M%S; }
x=0
d1=$(d)
d2=$(d)
d3=$(d)
d4=$(d)
echo $x $d1 $d2 $d3 $d4
outputs
0 20171129-123521 20171129-123521 20171129-123521 20171129-123521
while the wanted output is
4 20171129-123521 20171129-123521 20171129-123521 20171129-123521
The cause of the problem
Shell variables (or, generally speaking, the environment) are passed from parent processes to child processes, but not vice versa.
Output capturing usually runs in a subshell, so passing variables back is difficult.
Some even tell you that it is impossible to fix. This is wrong, but it is a long-known problem that is difficult to solve.
There are several ways to solve it; which one is best depends on your needs.
Here is a step-by-step guide on how to do it.
Passing back variables into the parental shell
There is a way to pass variables back to the parent shell. However, this is a dangerous path, because it uses eval. Done improperly, it risks many evil things. Done properly, it is perfectly safe, provided that there is no bug in bash.
_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }
d() { let x++; d=$(date +%Y%m%d-%H%M%S); _passback x d; }
x=0
eval `d`
d1=$d
eval `d`
d2=$d
eval `d`
d3=$d
eval `d`
d4=$d
echo $x $d1 $d2 $d3 $d4
prints
4 20171129-124945 20171129-124945 20171129-124945 20171129-124945
Note that this works for dangerous things, too:
danger() { danger="$*"; _passback danger; }
eval `danger '; /bin/echo *'`
echo "$danger"
prints
; /bin/echo *
This is due to printf '%q', which quotes everything so that you can safely reuse it in a shell context.
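The quoting behaviour can be checked in isolation. A minimal sketch, assuming bash's builtin printf; the variable names value, copy and assignment are only for illustration:

```shell
#!/bin/bash
# A value full of characters that would be dangerous if eval saw them unquoted.
value='; /bin/echo * $(id) "quoted"'

# Build an assignment statement; %q escapes the value for reuse as shell input.
assignment="$(printf '%q=%q;' copy "$value")"

# eval only performs the assignment; nothing inside the value is executed.
eval "$assignment"

[ "$copy" = "$value" ] && echo "round trip ok"
```

The variable name itself is also passed through %q, but it still has to be a valid identifier for the assignment to work.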
But this is a pain in the a..
Not only does this look ugly, it is also a lot to type, so it is error-prone. Just one single mistake and you are doomed, right?
Well, we are at shell level, so you can improve it. Just think about an interface you want to see, and then you can implement it.
Augment, how the shell processes things
Let's take a step back and think about an API which allows us to easily express what we want to do.
Well, what do we want to do with the d() function?
We want to capture the output into a variable.
OK, then let's implement an API for exactly this:
# This needs a modern bash 4.3 (see "help declare" if "-n" is present,
# we get rid of it below anyway).
: capture VARIABLE command args..
capture()
{
local -n output="$1"
shift
output="$("$@")"
}
Now, instead of writing
d1=$(d)
we can write
capture d1 d
Well, this looks like we haven't changed much, as, again, the variables are not passed back from d into the parent shell, and we need to type a bit more.
However now we can throw the full power of the shell at it, as it is nicely wrapped in a function.
Think about an easy to reuse interface
A second thing is that we want to be DRY (Don't Repeat Yourself).
So we definitely do not want to type something like
x=0
capture1 x d1 d
capture1 x d2 d
capture1 x d3 d
capture1 x d4 d
echo $x $d1 $d2 $d3 $d4
The x here is not only redundant, it is also error-prone to repeat it in the correct context every time. What if you use it 1000 times in a script and then add a variable? You definitely do not want to alter all 1000 locations where a call to d is involved.
So leave the x out, so we can write:
_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }
d() { let x++; output=$(date +%Y%m%d-%H%M%S); _passback output x; }
xcapture() { local -n output="$1"; eval "$("${@:2}")"; }
x=0
xcapture d1 d
xcapture d2 d
xcapture d3 d
xcapture d4 d
echo $x $d1 $d2 $d3 $d4
outputs
4 20171129-132414 20171129-132414 20171129-132414 20171129-132414
This already looks very good. (But there is still the local -n, which does not work in the older but still common bash 3.x.)
Avoid changing d()
The last solution has some big flaws:
d() needs to be altered
It needs to use some internal details of xcapture to pass the output.
Note that this shadows (burns) one variable named output,
so we can never pass this one back.
It needs to cooperate with _passback
Can we get rid of this, too?
Of course, we can! We are in a shell, so there is everything we need to get this done.
If you look a bit closer at the call to eval, you can see that we have 100% control at this location. "Inside" the eval we are in a subshell,
so we can do everything we want without fear of doing something bad to the parent shell.
Yeah, nice, so let's add another wrapper, now directly inside the eval:
_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }
# !DO NOT USE!
_xcapture() { "${@:2}" > >(printf "%q=%q;" "$1" "$(cat)"); _passback x; } # !DO NOT USE!
# !DO NOT USE!
xcapture() { eval "$(_xcapture "$@")"; }
d() { let x++; date +%Y%m%d-%H%M%S; }
x=0
xcapture d1 d
xcapture d2 d
xcapture d3 d
xcapture d4 d
echo $x $d1 $d2 $d3 $d4
prints
4 20171129-132414 20171129-132414 20171129-132414 20171129-132414
However, this, again, has some major drawbacks:
The !DO NOT USE! markers are there,
because there is a very bad race condition in this,
which you cannot see easily:
The >(printf ..) is a background job. So it might still
execute while the _passback x is running.
You can see this yourself if you add a sleep 1; before printf or _passback.
_xcapture a d; echo then outputs x or a first, respectively.
The _passback x should not be part of _xcapture,
because this makes it difficult to reuse that recipe.
Also we have an unneeded fork here (the $(cat)),
but as this solution is !DO NOT USE! I took the shortest route.
However, this shows that we can do it without modifying d() (and without local -n)!
Please note that we do not necessarily need _xcapture at all,
as we could have written everything right in the eval.
However doing this usually isn't very readable.
And if you come back to your script in a few years,
you probably want to be able to read it again without much trouble.
Fix the race
Now let's fix the race condition.
The trick could be to wait until printf has closed its STDOUT, and then output x.
There are many ways to achieve this:
You cannot use shell pipes, because pipes run in different processes.
One can use temporary files,
or something like a lock file or a fifo, which allows waiting for the lock or fifo,
or different channels to output the information, and then assemble the output in the correct sequence.
Following the last path could look like this (note that it does the printf last because this works better here):
_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }
_xcapture() { { printf "%q=%q;" "$1" "$("${@:2}" 3<&-; _passback x >&3)"; } 3>&1; }
xcapture() { eval "$(_xcapture "$@")"; }
d() { let x++; date +%Y%m%d-%H%M%S; }
x=0
xcapture d1 d
xcapture d2 d
xcapture d3 d
xcapture d4 d
echo $x $d1 $d2 $d3 $d4
outputs
4 20171129-144845 20171129-144845 20171129-144845 20171129-144845
Why is this correct?
_passback x directly talks to STDOUT.
However, as STDOUT needs to be captured in the inner command,
we first "save" it into FD3 (you can use others, of course) with '3>&1'
and then reuse it with >&3.
The $("${@:2}" 3<&-; _passback x >&3) finishes after the _passback,
when the subshell closes STDOUT.
So the printf cannot happen before the _passback,
regardless of how long _passback takes.
Note that the printf command is not executed before the complete
command line is assembled, so we cannot see artefacts from printf,
regardless of how printf is implemented.
Hence _passback executes first, then the printf.
This resolves the race, sacrificing one fixed file descriptor 3.
You can, of course, choose another file descriptor in case
FD3 is not free in your shell script.
Please also note the 3<&-, which prevents FD3 from being passed to the function.
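The ordering argument can be tried out on its own, without eval or _passback. A minimal sketch with made-up names (demo, inner, result) that saves the outer stdout in FD 3 and writes to it from inside a command substitution:

```shell
#!/bin/bash
demo() {
    {
        # "captured" goes into the inner command substitution,
        # "side channel" bypasses it through FD 3 (the saved outer stdout).
        inner="$(echo "captured"; echo "side channel" >&3)"
        # This line runs only after the substitution above has finished.
        echo "inner=$inner"
    } 3>&1
}

result="$(demo)"
echo "$result"
```

This prints side channel first and inner=captured second, because the inner substitution must complete before the following echo can run.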
Make it more generic
_xcapture contains parts which belong to d(), which is bad
from a reusability perspective. How to solve this?
Well, do it the desperate way by introducing one more thing:
an additional function, which must return the right things
and which is named after the original function with _ appended.
This function is called after the real function and can augment things.
This way, it can be read as an annotation, so it is very readable:
_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }
_capture() { { printf "%q=%q;" "$1" "$("${@:2}" 3<&-; "$2_" >&3)"; } 3>&1; }
capture() { eval "$(_capture "$@")"; }
d_() { _passback x; }
d() { let x++; date +%Y%m%d-%H%M%S; }
x=0
capture d1 d
capture d2 d
capture d3 d
capture d4 d
echo $x $d1 $d2 $d3 $d4
still prints
4 20171129-151954 20171129-151954 20171129-151954 20171129-151954
Allow access to the return-code
There is only one bit missing:
v=$(fn) sets $? to what fn returned. So you probably want this, too.
It needs some bigger tweaking, though:
# This is all the interface you need.
# Remember, that this burns FD=3!
_passback() { while [ 1 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; return $1; }
passback() { _passback "$@" "$?"; }
_capture() { { out="$("${@:2}" 3<&-; "$2_" >&3)"; ret=$?; printf "%q=%q;" "$1" "$out"; } 3>&1; echo "(exit $ret)"; }
capture() { eval "$(_capture "$@")"; }
# Here is your function, annotated with which side effects it has.
fails_() { passback x y; }
fails() { x=$1; y=69; echo FAIL; return 23; }
# And now the code which uses it all
x=0
y=0
capture wtf fails 42
echo $? $x $y $wtf
prints
23 42 69 FAIL
There is still a lot of room for improvement
_passback() can be eliminated with passback() { set -- "$@" "$?"; while [ 1 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; return $1; }
_capture() can be eliminated with capture() { eval "$({ out="$("${@:2}" 3<&-; "$2_" >&3)"; ret=$?; printf "%q=%q;" "$1" "$out"; } 3>&1; echo "(exit $ret)")"; }
The solution pollutes a file descriptor (here 3) by using it internally.
You need to keep that in mind if you happen to pass FDs.
Note that bash 4.1 and above has {fd} to use some unused FD.
(Perhaps I will add a solution here when I come around.)
Note that this is why I put it in separate functions like _capture: stuffing this all into one line is possible, but makes it increasingly harder to read and understand.
Perhaps you want to capture STDERR of the called function, too.
Or you want to even pass in and out more than one filedescriptor
from and to variables.
I have no solution yet, however here is a way to catch more than one FD, so we can probably pass back the variables this way, too.
Also do not forget:
This must call a shell function, not an external command.
There is no easy way to pass environment variables out of external commands.
(With LD_PRELOAD= it should be possible, though!)
But this then is something completely different.
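On the file descriptor point above: bash 4.1 and later can pick an unused descriptor itself with the {fd} redirection syntax. The following is only a sketch of that idea, not the author's solution; the function name capture_with_free_fd and the variable names are hypothetical:

```shell
#!/bin/bash
capture_with_free_fd() {
    local fd inner
    exec {fd}>&1        # bash chooses a free descriptor (10 or higher) and stores its number in fd
    inner="$(echo "captured"; echo "side channel" >&"$fd")"
    exec {fd}>&-        # close the descriptor again
    printf '%s\n' "inner=$inner"
}

result="$(capture_with_free_fd)"
echo "$result"
```

The side channel line still arrives before inner=captured, as with the fixed FD 3, but no descriptor number is hard-coded.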
Last words
This is not the only possible solution. It is one example of a solution.
As always you have many ways to express things in the shell.
So feel free to improve and find something better.
The solution presented here is quite far from being perfect:
It was hardly tested at all, so please forgive typos.
There is a lot of room for improvement, see above.
It uses many features from modern bash, so probably is hard to port to other shells.
And there might be some quirks I haven't thought about.
However I think it is quite easy to use:
Add just 4 lines of "library".
Add just 1 line of "annotation" for your shell function.
Sacrifices just one file descriptor temporarily.
And each step should be easy to understand even years later.
Maybe you can use a file, write to file inside function, read from file after it. I have changed e to an array. In this example blanks are used as separator when reading back the array.
#!/bin/bash
declare -a e
e[0]="first"
e[1]="secondddd"
function test1 () {
e[2]="third"
e[1]="second"
echo "${e[@]}" > /tmp/tempout
echo hi
}
ret=$(test1)
echo "$ret"
read -r -a e < /tmp/tempout
echo "${e[@]}"
echo "${e[0]}"
echo "${e[1]}"
echo "${e[2]}"
Output:
hi
first second third
first
second
third
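Because blanks act as the separator, an element that itself contains a blank would be split when read back. One possible variation, sketched here under the assumption that the file is only ever written by your own script, is to serialise the array with declare -p and source it afterwards; the name state_file is made up for illustration:

```shell
#!/bin/bash
declare -a e=("first" "secondddd")
state_file="$(mktemp)"            # hypothetical scratch file

test1() {
    e[2]="third item"             # element containing a blank
    e[1]="second"
    declare -p e > "$state_file"  # write the array as a quoted declare statement
    echo hi
}

ret="$(test1)"
source "$state_file"              # re-create e in the parent shell
rm -f -- "$state_file"
echo "$ret ${#e[@]} ${e[2]}"
```

This prints hi 3 third item. Sourcing a file executes it as shell code, so this should not be used with a file that others can write to.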
You are executing test1 via
$(test1)
in a subshell (child shell), and child shells cannot modify anything in the parent.
You can find this in the bash manual.
Please Check: Things results in a subshell here
I had a similar problem when I wanted to remove temporary files I had created automatically. The solution I came up with was not to use command substitution, but rather to pass the name of the variable, that should take the final result, into the function. E.g.
#!/usr/bin/env bash
# array that keeps track of tmp-files
remove_later=()
# function that manages tmp-files
new_tmp_file() {
file=$(mktemp)
remove_later+=( "$file" )
# assign value (safe form of `eval "$1=$file"`)
printf -v "$1" '%s' "$file"
}
# function to remove all tmp-files
remove_tmp_files() { rm -- "${remove_later[@]}"; }
# define trap to remove all tmp-files upon EXIT
trap remove_tmp_files EXIT
# generate tmp-files
new_tmp_file tmpfile1
new_tmp_file tmpfile2
So, adapting this to the OP, it would be:
#!/usr/bin/env bash
e=2
function test1() {
e=4
printf -v "$1" '%s' "hello"
}
test1 ret
echo "$ret"
echo "$e"
Works and has no restrictions on the "return value".
Assuming that local -n is available, the following script lets the function test1 modify a global variable:
#!/bin/bash
e=2
function test1() {
local -n var=$1
var=4
echo "hello"
}
test1 e
echo "$e"
Which gives the following output:
hello
4
I'm not sure if this works in your environment, but I found that if the function produces no output, so that it can be called directly rather than through $(...), it behaves like a void function and can change global variables.
Here's the code I used:
let ran1=$(( ((1<<63)-1)/3 ))
let ran2=$(( ((1<<63)-1)/5 ))
let c=0
function randomize {
c=$(( ran1+ran2 ))
ran2=$ran1
ran1=$c
c=$(( c > 0 ))
}
It's a simple randomizer for games that effectively modifies the needed variables.
It's because command substitution is performed in a subshell, so while the subshell inherits the variables, changes to them are lost when the subshell ends.
Reference:
Command substitution, commands grouped with parentheses, and asynchronous commands are invoked in a subshell environment that is a duplicate of the shell environment
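The three cases named in the quote can be tried side by side. A small sketch; the variable after_subshells exists only to record the intermediate value:

```shell
#!/bin/bash
e=2
: "$(e=4)"          # command substitution
( e=5 )             # commands grouped with parentheses
{ e=6; } &          # asynchronous command
wait
after_subshells=$e  # none of the three assignments reached this shell
{ e=7; }            # braces group commands in the current shell, no subshell
echo "$after_subshells $e"
```

This prints 2 7: only the assignment inside the braces changes the variable in the running shell.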
A solution to this problem, without having to introduce complex functions and heavily modify the original one, is to store the value in a temporary file and read / write it when needed.
This approach helped me greatly when I had to mock a bash function called multiple times in a bats test case.
For example, you could have:
# Usage read_value path_to_tmp_file
function read_value {
cat "${1}"
}
# Usage: set_value path_to_tmp_file the_value
function set_value {
echo "${2}" > "${1}"
}
#----
# Original code:
function test1() {
e=4
set_value "${tmp_file}" "${e}"
echo "hello"
}
# Create the temp file
# Note that tmp_file is available in test1 as well
tmp_file=$(mktemp)
# Your logic
e=2
# Store the value
set_value "${tmp_file}" "${e}"
# Run test1
test1
# Read the value modified by test1
e=$(read_value "${tmp_file}")
echo "$e"
The drawback is that you might need multiple temp files for different variables. You might also need to issue a sync command to persist the contents to disk between a write and the following read.
You can always use an alias:
alias next='printf "blah_%02d" $count;count=$((count+1))'

finding a file in directory using perl script

I'm trying to develop a perl script that looks through all of the user's directories for a particular file name without the user having to specify the entire pathname to the file.
For example, let's say the file of interest is data.list. It's located at /home/path/directory/project/userabc/data.list. At the command line, the user would normally have to specify the full pathname in order to access the file, like so:
cd /home/path/directory/project/userabc/data.list
Instead, I want the user to only have to enter script.pl ABC on the command line; the Perl script then runs automatically and retrieves the information in data.list, which in my case means counting the number of lines and uploading the result using curl. The rest is done; only the part that automatically locates the file is missing.
Even though very feasible in Perl, this looks more appropriate in Bash:
#!/bin/bash
filename=$(find ~ -name "$1" )
wc -l "$filename"
curl .......
The main issue, of course, arises if you have multiple files with the same name, for example /home/user/dir1/data1 and /home/user/dir2/data1. You will need a way to handle that, and how you handle it depends on your specific situation.
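One way to handle duplicates is to refuse to continue unless exactly one file matches. This is only a sketch of that policy; the function name find_unique is hypothetical, and it assumes a find that supports -print0 (GNU and BSD find do):

```shell
#!/bin/bash
# Hypothetical helper: find_unique DIR NAME
# Prints the single matching path, or fails if there are zero or several matches.
find_unique() {
    local dir=$1 name=$2 path
    local matches=()
    # Read NUL-separated paths so that unusual file names survive intact.
    while IFS= read -r -d '' path; do
        matches+=("$path")
    done < <(find "$dir" -type f -name "$name" -print0)
    if [ "${#matches[@]}" -ne 1 ]; then
        echo "expected exactly one '$name' under $dir, found ${#matches[@]}" >&2
        return 1
    fi
    printf '%s\n' "${matches[0]}"
}
```

In the script above, filename=$(find_unique ~ "$1") || exit 1 could then replace the plain find call. Other policies, such as picking the newest match, are equally possible.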
In Perl that would be much more complicated:
#! /usr/bin/perl -w
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
if 0; #$running_under_some_shell
use strict;
# Import the module File::Find, which will do all the real work
use File::Find ();
# Set the variable $File::Find::dont_use_nlink if you're using AFS,
# since AFS cheats.
# for the convenience of &wanted calls, including -eval statements:
# Here, we "import" specific variables from the File::Find module
# The purpose is to be able to just type '$name' instead of the
# complete '$File::Find::name'.
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
# We declare the sub here; the content of the sub will be created later.
sub wanted;
# This is a simple way to get the first argument. There is no
# checking on validity.
our $filename=$ARGV[0];
# Traverse desired filesystem. /home is the top-directory where we
# start our search. The sub wanted will be executed for every file
# we find
File::Find::find({wanted => \&wanted}, '/home');
exit;
sub wanted {
# Check if the file is our desired filename (\Q...\E matches it literally)
if ( /^\Q$filename\E\z/) {
# Open the file, read it and count its lines
my $lines=0;
open(my $F,'<',$name) or die "Cannot open $name";
while (<$F>){ $lines++; }
print("$name: $lines\n");
# Your curl command here
}
}
You will need to look at the argument parsing, for which I simply used $ARGV[0], and I don't know what your curl command looks like.
A simpler (though not recommended) way would be to abuse Perl as a sort of shell:
#!/usr/bin/perl
#
my $fn=`find /home -name '$ARGV[0]'`;
chomp $fn;
my $wc=`wc -l '$fn'`;
print "$wc\n";
system ("your curl command");
The following code snippet demonstrates one of many ways to achieve the desired result.
The code takes one parameter, a word to look for inside every data.list file in all subdirectories, and prints a list of the matching files to the terminal.
The code uses a subroutine lookup($dir,$filename,$search), which calls itself recursively whenever it comes across a subdirectory.
The search starts from the current working directory (the question did not specify a starting directory).
use strict;
use warnings;
use feature 'say';
my $search = shift || die "Specify what look for";
my $fname = 'data.list';
my $found = lookup('.',$fname,$search);
if( @$found ) {
say for @$found;
} else {
say 'Not found';
}
exit 0;
sub lookup {
my $dir = shift;
my $fname = shift;
my $search = shift;
my $files = []; # always return an array reference, even when nothing matches
my @items = glob("$dir/*");
for my $item (@items) {
if( -f $item && $item =~ /\b$fname\b/ ) {
my $found;
open my $fh, '<', $item or die $!;
while( my $line = <$fh> ) {
$found = 1 if $line =~ /\b$search\b/;
if( $found ) {
push @{$files}, $item;
last;
}
}
close $fh;
}
if( -d $item ) {
my $ret = lookup($item,$fname,$search);
push @{$files}, $_ for @$ret;
}
}
return $files;
}
Run as script.pl search_word
Output sample
./capacitor/data.list
./examples/data.list
./examples/test/data.list
Reference:
glob,
Perl file test operators

Perl script cron/environment issue

The following Perl script generates an .xls file from a text file. It runs great in our linux test environment, but generates an empty spreadsheet (.xls) in our production environment when run via cron (cron works in test, as well.) Nothing jumps out at our sys admins in terms of system level settings that might account for this behavior. Towards the bottom of the script in the import_data subroutine, the correct number of lines is reported, but nothing is written to the spreadsheet and no errors are returned at either the script or system level. I ran it through the perl debugger but my skills fell short of being able to interactively watch it populate the file. The cron entry looks like this:
cd <script directory>; cvs2xls input.txt output.xls 2>&1
Any debugging tips would be appreciated, as well as potential system settings that I can forward on to our sysadmins.
#!/usr/bin/perl
use strict;
use warnings;
use lib '/apps/tu01688/perl5/lib/perl5';
use Spreadsheet::WriteExcel;
use Text::CSV::Simple;
BEGIN {
unshift @INC, "/apps/tu01688/jobs/mayo-expert";
};
my $infile = shift;
usage() unless defined $infile && -f $infile;
my $parser = Text::CSV::Simple->new;
my @data = $parser->read_file($infile);
my $headers = shift @data;
my $outfile = shift || $infile . ".xls";
my $subject = shift || 'worksheet';
sub usage {
print "csv2xls infile [outfile] [subject]\n";
exit;
}
my $workbook = Spreadsheet::WriteExcel->new($outfile);
my $bold = $workbook->add_format();
$bold->set_bold(1);
import_data($workbook, $subject, $headers, \@data);
# Add a worksheet
sub import_data {
my $workbook = shift;
my $base_name = shift;
my $colums = shift;
my $data = shift;
my $limit = shift || 50_000;
my $start_row = shift || 1;
my $worksheet = $workbook->add_worksheet($base_name);
$worksheet->add_write_handler(qr[\w], \&store_string_widths);
#$worksheet->add_write_handler(qr[\w]| \&store_string_widths);
my $w = 1;
$worksheet->write('A' . $start_row, $colums, ,$bold);
my $i = $start_row;
my $qty = 0;
for my $row (@$data) {
$qty++;
if ($i > $limit) {
$i = $start_row;
$w++;
$worksheet = $workbook->add_worksheet("$base_name - $w");
$worksheet->write('A1', $colums,$bold);
}
$worksheet->write($i++, 0, $row);
}
autofit_columns($worksheet);
warn "Converted $qty rows.";
return $worksheet;
}
###############################################################################
###############################################################################
#
# Functions used for Autofit.
#
###############################################################################
#
# Adjust the column widths to fit the longest string in the column.
#
sub autofit_columns {
my $worksheet = shift;
my $col = 0;
for my $width (@{$worksheet->{__col_widths}}) {
$worksheet->set_column($col, $col, $width) if $width;
$col++;
}
}
###############################################################################
#
# The following function is a callback that was added via add_write_handler()
# above. It modifies the write() function so that it stores the maximum
# unwrapped width of a string in a column.
#
sub store_string_widths {
my $worksheet = shift;
my $col = $_[1];
my $token = $_[2];
# Ignore some tokens that we aren't interested in.
return if not defined $token; # Ignore undefs.
return if $token eq ''; # Ignore blank cells.
return if ref $token eq 'ARRAY'; # Ignore array refs.
return if $token =~ /^=/; # Ignore formula
# Ignore numbers
#return if $token =~ /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/;
# Ignore various internal and external hyperlinks. In a real scenario
# you may wish to track the length of the optional strings used with
# urls.
return if $token =~ m{^[fh]tt?ps?://};
return if $token =~ m{^mailto:};
return if $token =~ m{^(?:in|ex)ternal:};
# We store the string width as data in the Worksheet object. We use
# a double underscore key name to avoid conflicts with future names.
#
my $old_width = $worksheet->{__col_widths}->[$col];
my $string_width = string_width($token);
if (not defined $old_width or $string_width > $old_width) {
# You may wish to set a minimum column width as follows.
#return undef if $string_width < 10;
$worksheet->{__col_widths}->[$col] = $string_width;
}
# Return control to write();
return undef;
}
###############################################################################
#
# Very simple conversion between string length and string width for Arial 10.
# See below for a more sophisticated method.
#
sub string_width {
return length $_[0];
}
Uhmm.. don't put chained commands in cron, use an external script instead. Anyway: some suggestions that may help you:
Debugging cron commands
Check the mail! By default cron will mail any output from the command to the user it is running the command as. If there is no output there will be no mail. If you want cron to send mail to a different account then you can set the MAILTO environment variable in the crontab file e.g.
MAILTO=user@somehost.tld
1 2 * * * /path/to/your/command
Capture the output yourself
1 2 * * * /path/to/your/command >/tmp/mycommand.log 2>&1
which captures stdout and stderr to /tmp/mycommand.log (cron normally runs commands with /bin/sh, where the bash-only &> shorthand is not reliable)
Look at the logs; cron logs its actions via syslog, which (depending on your setup) often go to /var/log/cron or /var/log/syslog.
If required you can filter the cron statements with e.g.
grep CRON /var/log/syslog
Now that we've gone over the basics of cron, where the files are and how to use them let's look at some common problems.
Check that cron is running
If cron isn't running then your commands won't be scheduled ...
ps -ef | grep cron | grep -v grep
should get you something like
root 1224 1 0 Nov16 ? 00:00:03 cron
or
root 2018 1 0 Nov14 ? 00:00:06 crond
If not restart it
/sbin/service cron start
or
/sbin/service crond start
There may be other methods; use what your distro provides.
cron runs your command in a restricted environment.
What environment variables are available is likely to be very limited. Typically, you'll only get a few variables defined, such as $LOGNAME, $HOME, and $PATH.
Of particular note is the PATH is restricted to /bin:/usr/bin. The vast majority of "my cron script doesn't work" problems are caused by this restrictive path. If your command is in a different location you can solve this in a couple of ways:
Provide the full path to your command.
1 2 * * * /path/to/your/command
Provide a suitable PATH in the crontab file
PATH=/usr:/usr/bin:/path/to/something/else
1 2 * * * command
If your command requires other environment variables you can define them in the crontab file too.
cron runs your command with cwd == $HOME
Regardless of where the program you execute resides on the filesystem, the current working directory of the program when cron runs it will be the user's home directory. If you access files in your program, you'll need to take this into account if you use relative paths, or (preferably) just use fully-qualified paths everywhere, and save everyone a whole lot of confusion.
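The points about PATH and the working directory can be combined in a small wrapper that cron calls instead of a chained command. A minimal sketch; the function name run_job and the chosen PATH value are assumptions to adapt, not something cron prescribes:

```shell
#!/bin/sh
# Hypothetical wrapper: run_job WORKDIR LOGFILE COMMAND [ARGS...]
# The body runs in a subshell, so the caller's PATH and directory stay untouched.
run_job() (
    workdir=$1
    logfile=$2
    shift 2
    PATH=/usr/local/bin:/usr/bin:/bin   # assumed search path; adjust to your system
    export PATH
    cd "$workdir" || exit 1             # do not rely on cron's working directory
    "$@" >>"$logfile" 2>&1              # keep stdout and stderr for later inspection
)
```

A script containing this function plus one run_job call can then be the only thing named in the crontab line, which keeps the entry itself free of cd, redirections and semicolons.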
The last command in my crontab doesn't run
Cron generally requires that commands are terminated with a new line. Edit your crontab; go to the end of the line which contains the last command and insert a new line (press enter).
Check the crontab format
You can't use a user crontab formatted crontab for /etc/crontab or the fragments in /etc/cron.d and vice versa. A user formatted crontab does not include a username in the 6th position of a row, while a system formatted crontab includes the username and runs the command as that user.
I put a file in /etc/cron.{hourly,daily,weekly,monthly} and it doesn't run
Check that the filename doesn't have an extension (see run-parts).
Ensure the file has execute permissions.
Tell the system what to use when executing your script (eg. put #!/bin/sh at top)
Cron date related bugs
If your date is recently changed by a user or system update, timezone or other, then crontab will start behaving erratically and exhibit bizarre bugs, sometimes working, sometimes not. This is crontab's attempt to try to "do what you want" when the time changes out from underneath it. The "minute" field will become ineffective after the hour is changed. In this scenario, only asterisks would be accepted. Restart cron and try it again without connecting to the internet (so the date doesn't have a chance to reset to one of the time servers).
Percent signs, again
To emphasise the advice about percent signs, here's an example of what cron does with them:
# cron entry
* * * * * cat >$HOME/cron.out%foo%bar%baz
will create the ~/cron.out file containing the 3 lines
foo
bar
baz
This is particularly intrusive when using the date command. Be sure to escape the percent signs:
* * * * * /path/to/command --day "$(date "+\%Y\%m\%d")"
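Another way to sidestep the escaping, sketched here under the assumption that you are free to add a small wrapper script: move the date call out of the crontab line. Inside a script % has no special meaning. The function name today and the path /path/to/run-daily.sh are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as /path/to/run-daily.sh and made
# executable. No backslashes are needed before % here, because cron's
# percent handling only applies to the crontab line itself.
today() {
    date "+%Y%m%d"
}

# The crontab entry then contains no percent signs at all:
#   * * * * * /path/to/run-daily.sh
# and the wrapper would end with something like:
#   /path/to/command --day "$(today)"
```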
Thanks so much for the extensive feedback, everyone. I'm certainly taking a lot more away from this than I put into it. In any event, I ran across the answer. In my perl5 lib folder I found that somehow the IO and OLE libraries were missing on production. Copying those over from development resulted in everything working fine. The fact that I was unable to determine/capture this through conventional debugging efforts as opposed to merely comparing directory listings out of exasperation speaks to how much more I have to learn along these lines. But I'm confident that the great feedback I received will go a long ways towards getting me there. Thanks again, everyone.

Perl run the same script for different directories at the same time

I have a directory that contains other directories (the number of directories is arbitrary), like this:
Main_directory_samples/
subdirectory_sample_1/
subdirectory_sample_2/
subdirectory_sample_3/
subdirectory_sample_4/
I have a script that receives as input one directory each time and it takes 1h to run (for each directory). To run the script I have the following code:
opendir DIR, $maindirectory or die "Can't open directory!!";
while(my $dir = readdir DIR){
    if($dir ne '.' && $dir ne '..'){
        system("/bin/bash", "my_script.sh", $maindirectory.'/'.$dir);
    }
}
closedir DIR;
However, I want to run the script for different directories at the same time. For instance, the subdirectory_sample_1/ and subdirectory_sample_2/ would run in the same thread; subdirectory_sample_3/ and subdirectory_sample_4/ in another. But I just can't find a way to do this.
As you're just starting external processes and waiting for them, a non-threading option:
use strict;
use warnings;
use Path::Tiny;
use IO::Async::Loop;
use Future::Utils 'fmap_concat';
my $loop = IO::Async::Loop->new;
my $maindirectory = '/foo/bar';
my @subdirs = grep { -d } path($maindirectory)->children; # excludes . and ..
# runs this code to maintain up to 'concurrent' pending futures at once
my $main_future = fmap_concat {
    my $dir = shift;
    my $future = $loop->new_future;
    my $process = $loop->open_process(
        command => ['/bin/bash', 'my_script.sh', $dir],
        on_finish => sub { $future->done(@_) },
        on_exception => sub { $future->fail(@_) },
    );
    return $future;
} foreach => \@subdirs, concurrent => 2;
# run event loop until all futures are done or one fails, throw exception on failure
my @exit_codes = $main_future->get;
See the docs for IO::Async::Loop and Future::Utils.
One way is to fork and have each child handle a group of directories.
A basic example
use warnings;
use strict;
use feature 'say';
use List::MoreUtils qw(natatime);
use POSIX qw(:sys_wait_h); # for WNOHANG
use Time::HiRes qw(sleep); # for fractional seconds
my @all_dirs = qw(d1 d2 d3 d4);
my $path = 'maindir';
my @procs;
# Get iterator over groups (of 2)
my $it = natatime 2, @all_dirs;
while (my @dirs = $it->()) {
    my $pid = fork // do { #/
        warn "Can't fork for @dirs: $!";
        next;
    };
    if ($pid == 0) {
        # child: work through this group's directories one after another
        foreach my $dir (@dirs) {
            my @cmd = ('/bin/bash', 'my_script.sh', "$path/$dir");
            say "in $$, \@cmd: (@cmd)";
            # system(@cmd) == 0 or do { inspect $? }
        };
        exit;
    };
    push @procs, $pid;
}
# Poll with non-blocking wait for processes (reap them)
my $gone;
while (($gone = waitpid -1, WNOHANG) > -1) {
    my $status = $?;
    say "Process $gone exited with $status" if $gone > 0;
    sleep 0.1;
}
See system and/or exec for details, in particular on error checking, as well as $? variable. It can be unpacked to retrieve more details about the error; or, at least print a warning and skip to the next item (which happens above anyway).
The code above prints out the command and pid's with their exit status, but replace @cmd with a test command of no consequence and un-comment the system line to try this out.
Watch for how many jobs there are. A basic rule of thumb is to not have more than 2 per core at which point the performance starts suffering, but this depends on many details. Experiment to find the sweet spot for your case. I like to have a job per core and then at least one core free. In order to throttle this see modules linked at the end.
To break all jobs (directories) into groups I used natatime from List::MoreUtils (n-at-a-time). If there are more specific criteria about how to group directories adjust that.
See Forks::Super and Parallel::ForkManager for higher-level ways to work with forked processes.
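For comparison, the same "at most two at a time" throttling can be sketched without Perl. This assumes GNU or BSD find and xargs, since -mindepth, -maxdepth, -print0, -0 and -P are extensions rather than POSIX. The function name run_per_dir is hypothetical. Unlike the grouped fork example, each directory here is its own job, and xargs starts the next one as soon as a slot frees up.

```shell
#!/bin/sh
# Hypothetical helper: run SCRIPT once per immediate subdirectory of MAINDIR,
# keeping at most JOBS processes running at any one time.
# usage: run_per_dir MAINDIR SCRIPT JOBS
run_per_dir() {
    # -print0 / -0 keep directory names with spaces or newlines intact;
    # -n 1 passes one directory per invocation, -P caps the concurrency.
    find "$1" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -n 1 -P "$3" /bin/bash "$2"
}
```

With the question's names this would be called as run_per_dir Main_directory_samples my_script.sh 2. xargs exits non-zero if any invocation fails, so the return status can be checked in the usual way.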

prstat in Ubuntu or Centos

As the book Java Performance says:
Solaris prstat has additional capabilities
such as reporting both user and kernel or system CPU utilization along with other
microstate information using the prstat -m and -L options. The -m option prints
microstate information, and -L prints statistics on per lightweight process.
Is there any tool like prstat available on CentOS or Ubuntu?
I believe the Linux commands you are looking for are top and pstree.
Here is ptree for Linux:
#!/bin/sh
# Solaris style ptree
[ -x /usr/bin/ptree ] && exec /usr/bin/ptree "$@"
# Print process tree
# $1 = PID : extract tree for this process
# $1 = user : filter for this (existing) user
# $1 = user $2 = PID : do both
PATH=/bin:/usr/bin:/usr/sbin:/sbin
export PATH
psopt="-e"
case $1 in
    [a-z]*) psopt="-u $1";shift;;
esac
[ -z "$1" ] &&
    exec ps $psopt -Ho pid=,args=
#some effort to add less to the ps list
tmp=/tmp/ptree.$$
trap 'rm $tmp' 0 HUP INT TERM
ps $psopt -Ho pid=,args= >$tmp
<$tmp awk '
{ ci=index(substr($0,7),$2); o[ci]=$0 }
ci>s[a] { s[++a]=ci }
$1==pid {
    for(i=1;i<=a;i++) {
        si=s[i]; if(si<=ci) print o[si]
    }
    walkdown=ci
    next
}
ci<walkdown { exit }
walkdown!=0 { print }
' pid="$1"
There is no prstat "equivalent" tool in Linux. You can use a combination of top and ps (or /proc/$pid/ resources) to get some useful result; maybe writing a shell script (using grep, sed and awk) which collects results from above commands and files.
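A rough sketch of the /proc approach, Linux only. It reads the user and system CPU time of a process (fields 14 and 15 of /proc/PID/stat, utime and stime, counted in clock ticks), and does the same for each thread under /proc/PID/task. The function names cpu_ticks and thread_ticks are hypothetical, and this is nowhere near the microstate detail of prstat -m.

```shell
#!/bin/sh
# Hypothetical helper: print "utime stime" (clock ticks) from a stat file.
# The command name in field 2 may contain spaces, so everything up to the
# last ") " is stripped first; utime and stime are then fields 12 and 13.
cpu_ticks() {
    sed 's/^.*) //' "$1" | awk '{ print $12, $13 }'
}

# Hypothetical helper: one line per thread (LWP) of the process given as $1,
# in the form "tid utime stime".
thread_ticks() {
    for t in /proc/"$1"/task/*; do
        if [ -r "$t/stat" ]; then
            echo "${t##*/} $(cpu_ticks "$t/stat")"
        fi
    done
}
```

For example, thread_ticks $$ lists the threads of the current shell. Divide by the value of getconf CLK_TCK to convert ticks to seconds, and sample twice to get a rate. For an interactive per-thread view, top -H and ps -L are the usual starting points.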
Just for reference, I found this link about the top command and kernel, user and idle CPU utilization interesting:
http://blog.scoutapp.com/articles/2015/02/24/understanding-linuxs-cpu-stats
Hope this helps.

Resources