Bash script not producing desired result - linux

I am running a cron-ed bash script to extract cache hits and bytes served per IP address. The script (ProxyUsage.bash) has two parts:
1. (uniqueIP.awk) find the unique IPs and create a bash script to add up the hits and bytes
2. run the hits and bytes per IP
ProxyUsage.bash
#!/usr/bin/env bash
sudo gawk -f /home/maxg/scripts/uniqueIP.awk /var/log/squid3/access.log.1 > /home/maxg/scripts/pxyUsage.bash
source /home/maxg/scripts/pxyUsage.bash
uniqueIP.awk
{
arrIPs[$3]++;
}
END {
for (n in arrIPs) {
m++; # count arrIPs elements
#print "Array elements: " m;
arrAddr[i++] = n; # fill arrAddr with IPs
#print i " " n;
}
asort(arrAddr); # sort the array values
for (i = 1; i <= m; i++) { # write one command line per IP address
#printf("#!/usr/bin/env bash\n");
printf("sudo gawk -f /home/maxg/scripts/proxyUsage.awk -v v_Var=%s /var/log/squid3/access.log.1 >> /home/maxg/scripts/pxyUsage.txt\n", arrAddr[i])
}
}
pxyUsage.bash
sudo gawk -f /home/maxg/scripts/proxyUsage.awk -v v_Var=192.168.1.13 /var/log/squid3/access.log.1 >> /home/maxg/scripts/pxyUsage.txt
sudo gawk -f /home/maxg/scripts/proxyUsage.awk -v v_Var=192.168.1.14 /var/log/squid3/access.log.1 >> /home/maxg/scripts/pxyUsage.txt
sudo gawk -f /home/maxg/scripts/proxyUsage.awk -v v_Var=192.168.1.22 /var/log/squid3/access.log.1 >> /home/maxg/scripts/pxyUsage.txt
The ProxyUsage.bash script runs as scheduled and creates the pxyUsage.bash script.
However, the pxyUsage.txt file is not updated with the latest values when the script runs.
So far I have been running pxyUsage.bash by hand every day, as I cannot figure out why the result is not written to the file.
Both bash scripts are set to execute. The file permissions are below:
-rwxr-xr-x 1 maxg maxg 169 Mar 14 08:40 ProxySummary.bash
-rw-r--r-- 1 maxg maxg 910 Mar 15 17:15 proxyUsage.awk
-rwxrwxrwx 1 maxg maxg 399 Mar 17 06:10 pxyUsage.bash
-rw-rw-rw- 1 maxg maxg 2922 Mar 17 07:32 pxyUsage.txt
-rw-r--r-- 1 maxg maxg 781 Mar 16 07:35 uniqueIP.awk
Any hints appreciated. Thanks.

The sudo(8) command usually needs a terminal: it may have to prompt for a password, and some sudoers configurations set requiretty. No terminal is allocated under cron(8), whereas you do have one when logged in the usual way.
Instead of mucking about with sudo(8), just run the script as the correct user.
If you cannot do that, then in the root crontab, do something like this:
su - username -c '/path/to/mycommand arg1 arg2...'
This will work because root can use su(1) without needing a password.
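As a rough sketch of the first suggestion (run everything as one suitable user, with no sudo inside the script), the two steps from the question can be collapsed into a single script. The function names unique_ips and report_usage are made up for this illustration. It assumes the cron user can read the squid log and that proxyUsage.awk only needs its v_Var variable:

```shell
#!/usr/bin/env bash
# Sketch only: unique_ips and report_usage are hypothetical helper names.

# Print the sorted, unique values of field 3 (the client IP in the question's log).
unique_ips() {
    awk '{ print $3 }' "$1" | sort -u
}

# Run the per-IP report once per address and append the result to the output file.
# Arguments: log file, awk program, output file. No sudo is involved.
report_usage() {
    local log=$1 prog=$2 out=$3 ip
    unique_ips "$log" | while read -r ip; do
        gawk -f "$prog" -v v_Var="$ip" "$log" >> "$out"
    done
}
```

Called as report_usage /var/log/squid3/access.log.1 /home/maxg/scripts/proxyUsage.awk /home/maxg/scripts/pxyUsage.txt from the crontab of a user with read access to the log, this would avoid both sudo and the generated pxyUsage.bash file.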

Related

How to monitor newly created file in a directory with bash?

I have a log directory that contains a bunch of log files; one log file is created each time a system event happens. I want to write a one-line bash script that continuously monitors the file list and displays the content of each newly created file on the terminal.
Currently, all I have displays the content of the whole directory:
for f in *; do cat $f; done
It lacks the monitoring feature that I want. One limitation of my system is that I do not have the watch command, nor any package manager to install fancy tools. Raw BSD is all I have. I do have tail, and I was thinking of something like tail -F $(ls), but this tails each file instead of the file list.
In summary, I want to modify my script so that I can monitor the content of all newly created files.
First approach: use a hidden file in your directory (in my example it is named .watch). Then your one-liner might look like:
for f in $(find . -type f -newer .watch); do cat $f; done; touch .watch
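That one-liner reports new files once per invocation, so it has to be repeated to get continuous monitoring. A minimal sketch of the same marker-file idea wrapped in a function follows; cat_new_files is a name invented here, and the old timestamp used on the first run is an arbitrary choice:

```shell
# Sketch only: cat_new_files is a hypothetical helper name.
# Print every regular file in a directory that is newer than the marker file,
# then refresh the marker so the next call only reports later files.
cat_new_files() {
    local dir=$1 marker=$1/.watch
    # First run: create the marker with an old timestamp so existing files are shown once.
    [ -e "$marker" ] || touch -t 200001010000 "$marker"
    find "$dir" -type f -newer "$marker" -exec cat {} +
    touch "$marker"
}
```

It can be polled with something like while :; do cat_new_files my_logdir; sleep 1; done. A file written between the find and the touch can be missed, so this is a best-effort approach.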
Second approach - use inotify-tools: https://unix.stackexchange.com/questions/273556/when-a-particular-file-arrives-then-execute-a-procedure-using-shell-script/273563#273563
You can cram it into a one-liner if you want, but I'd recommend just running the script in the background:
#!/bin/bash
[ ! -d "$1" ] && {
printf "error: argument is not a valid directory to monitor.\n"
exit 1
}
while :; fname="$1/$(inotifywait -q -e modify -e create --format '%f' "$1")"; do
cat "$fname"
done
This watches the directory given as the first argument and cats any new or changed file in that directory. Example:
$ bash watchdir.sh my_logdir &
This will then cat new or changed files in my_logdir.
Using inotifywait in monitor mode
First this little demo:
Open one terminal and run this:
ext=(php css other)
while :;do
subname=''
((RANDOM%10))||printf -v subname -- "-%04x" $RANDOM
date >/tmp/test$subname.${ext[RANDOM%3]}
sleep 1
done
This will randomly create files named /tmp/test.php, /tmp/test.css and /tmp/test.other; about one time in ten, the name will instead be /tmp/test-XXXX.[css|php|other], where XXXX is a random hexadecimal number.
Open another terminal and run this:
waitPaths=(/{home,tmp})
while read file ;do
if [ "$file" ] &&
( [ -z "${file##*.php}" ] || [ -z "${file##*.css}" ] ) ;then
(($(stat -c %Y-%X $file)))||echo -n new
echo file: $file, content:
cat $file
fi
done < <(
inotifywait -qme close_write --format %w%f ${waitPaths[*]}
)
This may produce something like:
file: /tmp/test.css, content:
Tue Apr 26 18:53:19 CEST 2016
file: /tmp/test.php, content:
Tue Apr 26 18:53:21 CEST 2016
file: /tmp/test.php, content:
Tue Apr 26 18:53:23 CEST 2016
file: /tmp/test.css, content:
Tue Apr 26 18:53:25 CEST 2016
file: /tmp/test.php, content:
Tue Apr 26 18:53:27 CEST 2016
newfile: /tmp/test-420b.php, content:
Tue Apr 26 18:53:28 CEST 2016
file: /tmp/test.php, content:
Tue Apr 26 18:53:29 CEST 2016
file: /tmp/test.php, content:
Tue Apr 26 18:53:30 CEST 2016
file: /tmp/test.php, content:
Tue Apr 26 18:53:31 CEST 2016
Some explanation:
waitPaths=(/{home,tmp}) could be written waitPaths=(/home /tmp), or for only one directory: waitPaths=/var/log
The if condition selects filenames matching *.php or *.css.
(($(stat -c %Y-%X $file)))||echo -n new compares the modification time (%Y) with the last access time (%X); when they are equal, the difference is zero and "new" is printed.
inotifywait
-q to stay quiet (don't print more than required)
-m for monitor mode: the command does not terminate, but prints each matching event.
-e close_write reacts only to the specified kind of event.
--format %w%f output format: path/file
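The stat test used in that loop can be pulled out into a small helper, sketched here. It assumes GNU stat (the -c option), and looks_new is a name invented for this example. Equal modification and access times only suggest that nobody has read the file since it was written, so treat the result as a heuristic:

```shell
# Sketch only: looks_new is a hypothetical helper; requires GNU stat.
# Succeed when the file's modification time equals its last access time.
looks_new() {
    local mtime atime
    mtime=$(stat -c %Y -- "$1") || return 2
    atime=$(stat -c %X -- "$1") || return 2
    [ "$mtime" -eq "$atime" ]
}
```

Inside the loop this could be used as looks_new "$file" && echo -n new. On filesystems mounted with noatime the access time is never updated on reads, so files there would keep looking new.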
Another way:
Here is a more sophisticated sample:
Listening for two kinds of events (CLOSE_WRITE | CREATE)
Using a list of new-file flags to know which files are new when a CLOSE_WRITE event occurs.
In the second console, hit Ctrl+C, or in a new terminal, try this:
waitPaths=(/{home,tmp})
declare -A newFiles
while read path event file; do
if [ "$file" ] && ( [ -z "${file##*.php}" ] || [ -z "${file##*.css}" ] ); then
if [ "$event" ] && [ -z "${event//*CREATE*}" ]; then
newFiles[$file]=1
else
if [ "${newFiles[$file]}" ]; then
unset newFiles[$file]
echo NewFile: $file, content:
sed 's/^/>+ /' $path/$file
else
echo file: $file, content:
sed 's/^/> /' $path/$file
fi
fi
fi
done < <(inotifywait -qme close_write -e create ${waitPaths[*]})
May produce something like:
file: test.css, content:
> Tue Apr 26 22:16:02 CEST 2016
file: test.php, content:
> Tue Apr 26 22:16:03 CEST 2016
NewFile: test-349b.css, content:
>+ Tue Apr 26 22:16:05 CEST 2016
file: test.css, content:
> Tue Apr 26 22:16:08 CEST 2016
file: test.css, content:
> Tue Apr 26 22:16:10 CEST 2016
file: test.css, content:
> Tue Apr 26 22:16:13 CEST 2016
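The CREATE / CLOSE_WRITE bookkeeping can be tried without inotifywait by feeding event lines to a function on standard input. The sketch below assumes inotifywait's default output format (watched path, event names, file name) and uses an invented function name, classify_events:

```shell
# Sketch only: classify_events is a hypothetical helper; needs bash 4+ for
# associative arrays. Reads "path event file" lines and prints "new <file>"
# for a CLOSE_WRITE that follows a CREATE, and "changed <file>" otherwise.
classify_events() {
    local -A pending=()
    local path event file
    while read -r path event file; do
        [ -n "$file" ] || continue
        case $event in
            *CREATE*)
                pending[$path$file]=1 ;;      # remember the file as freshly created
            *CLOSE_WRITE*)
                if [ -n "${pending[$path$file]:-}" ]; then
                    unset "pending[$path$file]"
                    echo "new $path$file"
                else
                    echo "changed $path$file"
                fi ;;
        esac
    done
}
```

With real events it would be used as inotifywait -qme close_write -e create /tmp | classify_events.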
Watching for new files AND new lines in old files, using bash
There is another solution by using some bashisms like associative arrays:
Sample:
declare -A files # sizes already printed, keyed by file name
wpath=/var/log
while : ;do
while read -a crtfile ;do
if [ "${crtfile:0:1}" = "-" ] &&
[ "${crtfile[8]##*.}" != "gz" ] &&
[ "${files[${crtfile[8]}]:-0}" -lt ${crtfile[4]} ] ;then
printf "\e[47m## %-14s :- %(%a %d %b %y %T)T ##\e[0m\n" ${crtfile[8]} -1
tail -c +$[1+${files[${crtfile[8]}]:-0}] $wpath/${crtfile[8]}
files[${crtfile[8]}]=${crtfile[4]}
fi
done < <( /bin/ls -l $wpath )
sleep 1
done
This will dump each file in /var/log whose name does not end in .gz, then watch for modified or new files and dump the new lines.
Demo:
In a first terminal console, run:
ext=(php css other)
( while :; do
subname=''
((RANDOM%10)) || printf -v subname -- "-%04x" $RANDOM
name=test$subname.${ext[RANDOM%3]}
printf "%-16s" $name
{
date +"%a %d %b %y %T" | tee /dev/fd/5
fortune /usr/share/games/fortunes/bofh-excuses
} >> /tmp/$name
sleep 1
done ) 5>&1
You need to have fortune installed with the BOFH excuses library.
If you really do not have fortune, you could use this instead:
LANG=C ext=(php css other)
( while :; do
subname=''
((RANDOM%10)) || printf -v subname -- "-%04x" $RANDOM
name=test$subname.${ext[RANDOM%3]}
printf "%-16s" $name
{
date +"%a %d %b %y %T" | tee /dev/fd/5
for ((1; RANDOM%5; 1))
do
printf -v str %$[RANDOM&12]s
str=${str// /blah, }
echo ${str%, }.
done
} >> /tmp/$name
sleep 1
done ) 5>&1
This may output something like:
test.css Thu 28 Apr 16 12:00:02
test.php Thu 28 Apr 16 12:00:03
test.other Thu 28 Apr 16 12:00:04
test.css Thu 28 Apr 16 12:00:05
test.css Thu 28 Apr 16 12:00:06
test.other Thu 28 Apr 16 12:00:07
test.php Thu 28 Apr 16 12:00:08
test.css Thu 28 Apr 16 12:00:09
test.other Thu 28 Apr 16 12:00:10
test.other Thu 28 Apr 16 12:00:11
test.php Thu 28 Apr 16 12:00:12
test.other Thu 28 Apr 16 12:00:13
In a second terminal console, run:
declare -A files
wpath=/tmp
while :; do
while read -a crtfile; do
if [ "${crtfile:0:1}" = "-" ] && [ "${crtfile[8]:0:4}" = "test" ] &&
( [ "${crtfile[8]##*.}" = "css" ] || [ "${crtfile[8]##*.}" = "php" ] ) &&
[ "${files[${crtfile[8]}]:-0}" -lt ${crtfile[4]} ]; then
printf "\e[47m## %-14s :- %(%a %d %b %y %T)T ##\e[0m\n" ${crtfile[8]} -1
tail -c +$[1+${files[${crtfile[8]}]:-0}] $wpath/${crtfile[8]}
files[${crtfile[8]}]=${crtfile[4]}
fi
done < <(/bin/ls -l $wpath)
sleep 1
done
Every second, this will:
for all entries in the watched directory,
search for regular files (first character is -),
search for filenames beginning with test,
search for filenames ending in css or php,
compare the already-printed size with the new file size,
if the new size is greater,
print the new bytes using tail -c and
store the new already-printed size,
then sleep 1 second.
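The size-tracking step at the heart of that loop can be isolated as follows. This is only a sketch: print_new_bytes and seen_size are names invented here, and it reads the size with wc -c rather than parsing ls -l as the answer does:

```shell
# Sketch only: print_new_bytes and seen_size are hypothetical names; bash 4+.
declare -A seen_size   # bytes already printed, keyed by file path

# Print only the bytes appended to a file since the previous call for that file.
print_new_bytes() {
    local file=$1 size
    size=$(wc -c < "$file") || return 1
    size=$((size))     # normalise: some wc implementations pad with spaces
    if [ "${seen_size[$file]:-0}" -lt "$size" ]; then
        # tail -c +N starts output at byte N (1-based)
        tail -c "+$(( ${seen_size[$file]:-0} + 1 ))" "$file"
        seen_size[$file]=$size
    fi
}
```

The function has to run in the current shell: calling it inside $(...) would update seen_size in a subshell only, and the same bytes would be printed again next time.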
This may output something like:
## test.css :- Thu 28 Apr 16 12:00:09 ##
Thu 28 Apr 16 12:00:02
BOFH excuse #216:
What office are you in? Oh, that one. Did you know that your building was built over the universities first nuclear research site? And wow, aren't you the lucky one, your office is right over where the core is buried!
Thu 28 Apr 16 12:00:05
BOFH excuse #145:
Flat tire on station wagon with tapes. ("Never underestimate the bandwidth of a station wagon full of tapes hurling down the highway" Andrew S. Tannenbaum)
Thu 28 Apr 16 12:00:06
BOFH excuse #301:
appears to be a Slow/Narrow SCSI-0 Interface problem
## test.php :- Thu 28 Apr 16 12:00:09 ##
Thu 28 Apr 16 12:00:03
BOFH excuse #36:
dynamic software linking table corrupted
Thu 28 Apr 16 12:00:08
BOFH excuse #367:
Webmasters kidnapped by evil cult.
## test.css :- Thu 28 Apr 16 12:00:10 ##
Thu 28 Apr 16 12:00:09
BOFH excuse #25:
Decreasing electron flux
## test.php :- Thu 28 Apr 16 12:00:13 ##
Thu 28 Apr 16 12:00:12
BOFH excuse #3:
electromagnetic radiation from satellite debris
Note: if a file is modified more than once between two checks, all modifications will be printed on the next check.
Although not really nice, the following gives (and repeats) the last 50 lines of the newest file in the current directory:
while true; do tail -n 50 $(ls -Art | tail -n 1); sleep 5; done
You can refresh every minute using a cron job:
$ crontab -e
* * * * * /home/script.sh
If you need to refresh more often than once a minute, you can use the sleep command inside your script.

Why linux split program have weird behavior with large files >20GB?

I'm running the following command on Ubuntu:
split --number=l/5 /pathToSource.csv /pathToOutputDirectory
If I run "ls":
myUser@serverNAme:/pathToOutputDirectory> ls -la
total 21467452
drwxr-xr-x 2 myUser group 4096 Jun 23 08:51 .
drwxrwxrwx 4 myUser group 4096 Jun 23 08:44 ..
-rw-r--r-- 1 myUser group 10353843231 Jun 23 08:48 aa
-rw-r--r-- 1 myUser group 0 Jun 23 08:48 ab
-rw-r--r-- 1 myUser group 11376663825 Jun 23 08:51 ac
-rw-r--r-- 1 myUser group 0 Jun 23 08:51 ad
-rw-r--r-- 1 myUser group 252141913 Jun 23 08:51 ae
If I run "du" on the ab and ad files:
$ du -h ab ad
0 ab
0 ad
As you can see, split divided the file unevenly.
Does anyone know what's going on?
Could some unprintable character be confusing split?
Thank you.
Best Regards!
Francisco.
While this is unusual data, with an average line length of 114137, I'm not sure that fully explains the issue. You have 21982648969 bytes of data, so each bucket that split is trying to fill is 4396529793 bytes. That's larger than 2^32, so I wonder whether there is a 32-bit overflow. Are you on a 32-bit or 64-bit platform? Looking at the code I don't see an overflow issue, to be honest. Note that you could anonymize and compress the data, and provide the resulting file for download somewhere:
tr -c '\n' . < /pathToSource.csv | xz > /pathToSource.csv.xz
It's also worth specifying the version, since the implementation changed a bit between v8.8 and v8.13.
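The figures quoted in this answer can be reproduced from the ls listing in the question with bash's 64-bit integer arithmetic. This is only a check of the numbers, not a diagnosis of split itself:

```shell
# Sizes in bytes of the three non-empty pieces (aa, ac, ae) from the listing.
total=$(( 10353843231 + 11376663825 + 252141913 ))
chunk=$(( total / 5 ))     # size each of the 5 requested chunks should have
limit=$(( 1 << 32 ))       # 2^32

echo "total=$total chunk=$chunk limit=$limit"
if [ "$chunk" -gt "$limit" ]; then
    echo "each chunk is larger than 2^32 bytes"
fi
```

The total comes to 21982648969 bytes and the per-chunk target to 4396529793 bytes, which is indeed above 2^32 = 4294967296.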
A workaround in Groovy:
class Sanitizer {
public static void main(String[] args) {
def textOnly = new File('/path/NoDanger.txt')
def data = new File('/path/danger.txt')
String line = null
data.withReader { reader ->
while ( ( line = reader.readLine() ) != null ){
/*char[] stringToCharArray = line.toCharArray();
for(int i = 0; i < 5; i++ ){
char a = stringToCharArray[i]
int b = Character.getNumericValue(a);
println Integer.toHexString(b)
if (!(b =~ /\w/)) {
println "inside"
} else println "outside"
}*/
String newString = line.replaceAll("[^\\p{Print}]", "");
textOnly << newString+"\n"
}
} //reader
}
}

See stdin/stdout/stderr of a running process - Linux kernel

Is there a way to redirect/see the stdin/stdout/stderr of a given running process(By PID) in a simple way ?
I tried the following (Assume that 'pid' contains a running user process):
int foo(const void* data, struct file* file, unsigned fd)
{
printf("Fd = %x\n", fd);
return 0;
}
struct task_struct* task = pid_task(find_vpid(pid), PIDTYPE_PID);
struct files_struct* fs = task->files;
iterate_fd(fs, 0, foo, NULL);
I get 3 calls to foo (This process probably has 3 opened files, makes sense) but I can't really read from them (from the file pointers).
It prints:
0
1
2
Is it possible to achieve what I asked for in a fairly simple way ?
thanks
First, if you can change your architecture, you can run it under something like screen, tmux, nohup, or dtach, which will make your life easier.
But if you have a running program, you can use strace to monitor its system calls, including all reads and writes. You will need to limit what it sees (try -e), and maybe filter the output for just the first 3 FDs. Also add -s, because by default strace truncates the recorded data. Something like: strace -p <PID> -e read,write -s 1000000
You can achieve it via gdb.
Check the file handles the process has open:
$ ls -l /proc/6760/fd
total 3
lrwx------ 1 rjc rjc 64 Feb 27 15:32 0 -> /dev/pts/5
l-wx------ 1 rjc rjc 64 Feb 27 15:32 1 -> /tmp/foo1
lrwx------ 1 rjc rjc 64 Feb 27 15:32 2 -> /dev/pts/5
Now run GDB:
$ gdb -p 6760 /bin/cat
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
[lots more license stuff snipped]
Attaching to program: /bin/cat, process 6760
[snip other stuff that's not interesting now]
(gdb) p close(1)
$1 = 0
Provide a new file name to receive the output (here process_log):
(gdb) p creat("/tmp/process_log", 0600)
$2 = 1
(gdb) q
The program is running. Quit anyway (and detach it)? (y or n) y
Detaching from program: /bin/cat, process 6760
After that verify the result as:
ls -l /proc/6760/fd/
total 3
lrwx------ 1 rjc rjc 64 2008-02-27 15:32 0 -> /dev/pts/5
l-wx------ 1 rjc rjc 64 2008-02-27 15:32 1 -> /tmp/process_log <====
lrwx------ 1 rjc rjc 64 2008-02-27 15:32 2 -> /dev/pts/5
In a similar way, you can redirect stdin and stderr too.
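The before-and-after check of /proc/PID/fd can also be scripted. A minimal sketch, assuming a Linux /proc filesystem and using an invented helper name, fd_target:

```shell
# Sketch only: fd_target is a hypothetical helper; requires Linux /proc.
# Print what file descriptor <fd> of process <pid> currently points to.
fd_target() {
    readlink "/proc/$1/fd/$2"
}
```

For the process in the example, fd_target 6760 1 would print /tmp/foo1 before the gdb session and /tmp/process_log afterwards. Inspecting another user's process this way needs appropriate privileges.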

How to find user memory usage in linux

How can I see memory usage per user on Linux (CentOS 6)?
For example:
USER USAGE
root 40370
admin 247372
user2 30570
user3 967373
This one-liner worked for me on at least four different Linux systems with different distros and versions. It also worked on FreeBSD 10.
ps hax -o rss,user | awk '{a[$2]+=$1;}END{for(i in a)print i" "int(a[i]/1024+0.5);}' | sort -rnk2
About the implementation, there are no shell loop constructs here; this uses an associative array in awk to do the grouping & summation.
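To see the grouping and summation in isolation, the same awk idea can be run on a few hand-written "rss user" lines instead of live ps output. The function name sum_rss_by_user and the sample figures are invented for this sketch:

```shell
# Sketch only: sum_rss_by_user is a hypothetical name. Reads "rss_kb user"
# lines on stdin and prints "user total_mb", largest total first.
sum_rss_by_user() {
    awk '{ rss[$2] += $1 }
         END { for (u in rss) printf "%s %d\n", u, int(rss[u] / 1024 + 0.5) }' |
        sort -k2,2nr -k1,1
}

# Made-up sample input: two root processes are merged into one line.
printf '%s\n' '1048576 mysql' '524288 root' '524288 root' '2048 avahi' | sum_rss_by_user
```

With live data the input would come from ps hax -o rss,user, as in the one-liner.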
Here's sample output from one of my servers that is running a decent sized MySQL, Tomcat, and Apache. Figures are in MB.
mysql 1566
joshua 1186
tomcat 353
root 28
wwwrun 12
vbox 1
messagebus 1
avahi 1
statd 0
nagios 0
Caveat: like most similar solutions, this only considers the resident set size (RSS). Pages shared between processes are counted once per process, so the per-user totals can overstate real usage.
EDIT: A more human-readable version.
echo "USER RSS PROCS" ; echo "-------------------- -------- -----" ; ps hax -o rss,user | awk '{rss[$2]+=$1;procs[$2]+=1;}END{for(user in rss) printf "%-20s %8.0f %5.0f\n", user, rss[user]/1024, procs[user];}' | sort -rnk2
And the output:
USER RSS PROCS
-------------------- -------- -----
mysql 1521 1
joshua 1120 28
tomcat 379 1
root 19 107
wwwrun 10 10
vbox 1 3
statd 1 1
nagios 1 1
messagebus 1 1
avahi 1 1
Per-user memory usage in percent using standard tools:
for _user in $(ps haux | awk '{print $1}' | sort -u)
do
ps haux | awk -v user=${_user} '$1 ~ user { sum += $4} END { print user, sum; }'
done
or for more precision:
TOTAL=$(free | awk '/Mem:/ { print $2 }')
for _user in $(ps haux | awk '{print $1}' | sort -u)
do
ps hux -U ${_user} | awk -v user=${_user} -v total=$TOTAL '{ sum += $6 } END { printf "%s %.2f\n", user, sum / total * 100; }'
done
The first version just sums up the memory percentage for each process as reported by ps. The second version sums the resident memory instead (ps reports it in kilobytes) and calculates the percentage of total memory afterwards, which gives higher precision.
If your system supports it, install and use smem:
smem -u
User Count Swap USS PSS RSS
gdm 1 0 308 323 820
nobody 1 0 912 932 2240
root 76 0 969016 1010829 1347768
or
smem -u -t -k
User Count Swap USS PSS RSS
gdm 1 0 308.0K 323.0K 820.0K
nobody 1 0 892.0K 912.0K 2.2M
root 76 0 937.6M 978.5M 1.3G
ameskaas 46 0 1.2G 1.2G 1.5G
124 0 2.1G 2.2G 2.8G
In Ubuntu, smem can be installed by typing
sudo apt install smem
This will return the total ram usage by users in GBs, reverse sorted
sudo ps --no-headers -eo user,rss | awk '{arr[$1]+=$2}; END {for (i in arr) {print i,arr[i]/1024/1024}}' | sort -nk2 -r
The following Python script uses only the os module. Note that it reports the disk usage of each user's home directory (via du), not memory usage.
import os
# Get list of all users present in the system
allUsers = os.popen('cut -d: -f1 /etc/passwd').read().split('\n')[:-1]
for user in allUsers:
    # Check if the home directory exists for the user
    if os.path.exists('/home/' + user):
        # Show the disk usage of the user's home directory (du prints it itself)
        os.system('du -sh /home/' + user)

procmail disregards /etc/group?

Sample procmailrc:
SHELL=/bin/bash
LOGFILE=$HOME/procmail.log
VERBOSE=yes
:0
* ^Subject: envdump please$
{
LOG="`id`"
:0
/dev/null
}
/etc/group file contains (note the other usernames are vain attempts to make this work):
someuser:x:504:
s3:x:505:someuser,someotheruser,postfix,postdrop,mail,root
If I run as "someuser" the command id:
[someuser@lixyz-pqr ~]$ id
uid=504(someuser) gid=504(someuser) groups=504(someuser),505(s3)
However when I run procmail by sending an email with the subject "envdump please", the 505/s3 group disappears (this is in procmail.log):
procmail: [17618] Mon Dec 19 17:39:50 2011
procmail: Match on "^Subject: envdump please$"
procmail: Executing "id"
procmail: Assigning "LOG=uid=504(someuser) gid=504(someuser) groups=504(someuser)"
uid=504(someuser) gid=504(someuser) groups=504(someuser)procmail: Assigning "LASTFOLDER=/dev/null"
This server is running Fedora 14 with Postfix 2.7.5.
Procmail wasn't installed setuid.
For reference, it should look like:
[root@li321-238 postfix]# ls -l /usr/bin/procmail
-rwsr-sr-x. 1 root mail 92816 Jul 28 2009 /usr/bin/procmail
which you can set up via:
chmod ug+s /usr/bin/procmail
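Whether a given binary already carries both bits can be checked with the shell's -u (setuid) and -g (setgid) file tests. A small sketch, with has_setid_bits as an invented helper name:

```shell
# Sketch only: has_setid_bits is a hypothetical helper name.
# Succeed when the file has both the setuid and the setgid bit set.
has_setid_bits() {
    [ -u "$1" ] && [ -g "$1" ]
}
```

For example, has_setid_bits /usr/bin/procmail || echo "procmail is not setuid/setgid" would flag the situation described above.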
