How to delete identical lines in my output files?

I have this script :
#!/bin/bash
ps -eo lstart,pid,cmd --sort lstart | while read line 2> /dev/null
do
if [ "$(date -d "${line::24}" "+%Y%m%d%H%M%S")" -gt "$(date -d "Thu Apr 7 00:55:38" "+%Y%m%d%H%M%S")" ] 2> /dev/null
then echo "Date : $(date -d "${line::24}" "+%d/%m/%Y %H:%M:%S") | PID & CMD : ${line:25:29}" >> process.log 2> /dev/null
fi 2> /dev/null
done
sort process.log | uniq > process.log
#sort process.log | uniq -u | tee process.log
My script runs automatically every 10 seconds, so I would like identical lines to be removed from the output file. As you can see, I tried uniq, but it doesn't work.
The second time the script runs, the output file ends up empty, and I don't understand why.
I would also like nothing to be displayed in my terminal when the script runs. I tried tee, but then the uniq command prints its output to my terminal. How can I suppress it?
Thank you in advance for your help.

You should never parse ps output, and especially not lstart. You are also running date again and again inside the loop.
I think something along these lines would be better:
some_date=$(date -d "Thu Apr 7 00:55:38" +%s)
now=$(date +%s)
how_much_time_ago=$(( now - some_date ))
ps -eo etimes,pid,cmd --sort etimes |
awk -v v="$how_much_time_ago" '$1 < v' |  # elapsed time below v means started after some_date
while read -r etimes pid cmd; do
printf "Date : %s | PID & CMD : %s %s\n" \
"$(date -d "@$((now - etimes))" "+%d/%m/%Y %H:%M:%S")" \
"$pid" "$cmd"
done |
sort |
uniq > process.log
Note that you can pipe the output of a while .... done | stuff loop to another thing normally. Instead of sprinkling 2>/dev/null everywhere, try to actually solve the issue, not hide the error.
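
On the empty output file: in `sort process.log | uniq > process.log` the shell opens process.log for writing, and truncates it, while it sets up the pipeline, which usually happens before sort has read anything. A minimal sketch of one way around that, assuming a sort that supports -u and -o (both are specified by POSIX); the function name dedupe_in_place is made up for illustration:

```shell
#!/bin/bash
# Sort a file and drop duplicate lines, writing the result back to the same file.
# "sort -o FILE FILE" is safe because sort reads all of its input before it
# opens the output file, unlike "sort FILE > FILE".
dedupe_in_place() {
    sort -u -o "$1" "$1"
}
```

Called as `dedupe_in_place process.log`, this writes to the file rather than to standard output, so nothing is printed on the terminal and tee is not needed.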

Related

Bash script to check if a directory is modified

This has probably been done before, and better than mine. I have a directory where files exist for only a few milliseconds before being removed. I researched and couldn't find what I was looking for, so I wrote something to do it and added a few more features.
How it works:
You run the script, enter the directory, and enter how long you want it to run, from 10 seconds to 6000 seconds (100 minutes). It validates what you enter to make sure the directory exists and the time is within that range. Using sdiff -s, it compares the state of the directory when the script began to a new listing every 0.001 seconds. If there are changes, it tells you.
I wanted to share it since others may find it useful and, more importantly, to ask whether you have improvements. I have been teaching myself bash scripting (mostly using Stack Exchange) for almost a year and I really love it. I am always looking to improve my code. I am new to interactive scripts, so if you have recommendations for input validation I'd love to hear them. I couldn't figure out how to combine the "if" statements for the time in seconds into one check for anything less than 10 or greater than 6,000, despite trying a lot of things, so I just made them separate. The "sed" portions are kind of wonky and I didn't do a great job optimizing them. I just worked on them until the output was what I wanted.
EDIT: I don't have inotify and I don't think I could get it on this locked-down system.
#!/bin/bash
# Directory Check Script
# Created 13 Aug 2022
CLISESSID=$$
export CLISESSID
### DEFINE A LOCATION WHERE FILES CAN BE TEMPORARILY MADE ###
tmp=/tmp
temp1=$tmp/temp1.txt
temp2=$tmp/temp2.txt
echo "This script will check a directory to see if any files were added for the length of time you specify"
read -ep 'What is the full directory you would like to verify? ' dir
if [ ! -d "$dir" ] ; then
echo "Directory does not exist. Exiting."
exit
fi
read -ep '(This must be between 10-6000. i.e 5 minutes = 300, 10 minutes = 600, 1 hour = 3600)
How many seconds would you like to check for? ' seconds
if [[ "$seconds" -lt 10 ]] ; then
echo "Seconds must be between 10 and 6000"
exit
fi
if [[ "$seconds" -gt 6000 ]] ; then
echo "Seconds must be between 10 and 6000"
exit
fi
echo "checking $dir for $seconds seconds."
ls --full-time "$dir" | tail -n +2 > "$temp1"
SECONDS=0
echo "Checking for changes to $dir every 0.001 seconds for $seconds seconds."
until [[ "$(ls --full-time "$dir" | tail -n +2)" != "$(cat "$temp1")" ]] > /dev/null 2>&1
do
if (( SECONDS > $seconds ))
then
echo "Exceeded defined time of $seconds seconds. Exiting."
exit 1
fi
sleep 0.001
done
ls --full-time "$dir" | tail -n +2 > "$temp2"
if [[ $(sdiff -w 400 -s $temp1 $temp2 | grep " |" | wc -l) -gt 0 ]] ; then
echo "
File has been modified in $dir:"
sdiff -w 400 -s $temp1 $temp2 | sed 's/|/\n/' | sed 's/^ *//g' | sed '1~ i Before:' | sed '3~ i After:' | sed 's/^ *//g' | sed -e 's/^[ \t]*//'
fi
if [[ $(sdiff -w 400 -s $temp1 $temp2 | grep " >" | wc -l) -gt 0 ]] ; then
echo "
File has been added to $dir:"
sdiff -w 400 -s "$temp1" "$temp2" | sed 's/>/\n/' | grep -v " |" | sed 's/^ *//g' | sed '1~ i Added file:' | sed 's/^ *//g' | sed -e 's/^[ \t]*//' | sed '/./!d'
fi
if [[ $(sdiff -w 400 -s $temp1 $temp2 | grep " <" | wc -l) -gt 0 ]] ; then
echo "
File has been removed from $dir:"
sdiff -w 400 -s "$temp1" "$temp2" | sed 's/</\n/' | grep -v " |" | sed 's/^ *//g' | sed '1~ i Removed file:' | sed 's/^ *//g' | sed -e 's/^[ \t]*//' | sed '/./!d' | sed 's/ *$//'
fi
rm -f $temp1 $temp2
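
On combining the two range checks: bash arithmetic evaluation accepts `||` and `&&`, so both bounds fit in a single test. A small sketch, in which the function name valid_seconds is hypothetical and the digits-only regex is an added assumption about what should count as valid input:

```shell
#!/bin/bash
# Succeed only when $1 is a whole number between 10 and 6000 inclusive.
# The regex rejects empty or non-numeric input before the arithmetic test;
# 10# forces base 10 so that input such as "0090" is not read as octal.
valid_seconds() {
    [[ $1 =~ ^[0-9]+$ ]] && (( 10#$1 >= 10 && 10#$1 <= 6000 ))
}

# Possible use in the script above:
#   valid_seconds "$seconds" || { echo "Seconds must be between 10 and 6000"; exit 1; }
```

The inverse form, `if (( seconds < 10 || seconds > 6000 ))`, works equally well when the input is already known to be numeric.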

Identify processes running more than 3 hrs in linux

I want to find processes that have been running for more than 3 hours. I have written a command for this, but it is not returning the expected output:
ps -u <user> -o pid,stime,pcpu,pmem,etime,cmd --sort=start_time | \
grep <searchString> | grep -v grep| awk '{print $5}' | \
sed 's/:|-/ /g;'| awk '{print $4" "$3" "$2" "$1"}' | \
awk '$1+$2*60+$3*3600+$4*86400 > 10800'
It prints only the etime values, but the expected output is the full "pid,stime,pcpu,pmem,etime,cmd" line for each matching process.
I am not able to find the exact issue with this.

You are executing "awk '{print $5}'", which takes the input and prints only column 5, which in your case is "etime". Everything else is lost from that point on.
If your system supports etimes (notice the s on the end), you can easily do this with
ps -eo pid,etimes,etime,comm,user,tty | awk '{if ( $2>10800) print $0}'
On a system that does not support etimes, you have to parse etime, which is printed as [[dd-]hh:]mm:ss, that is, just mm:ss if no hours have passed and with a dd- prefix once a full day has passed:
ps -eo pid,etime,comm,user,tty | awk -v seconds_old=10800 '{ d = 0; t = $2; if (split(t, p, "-") == 2) { d = p[1]; t = p[2] } n = split(t, a, ":"); b = (n == 3) ? a[1]*3600 + a[2]*60 + a[3] : a[1]*60 + a[2]; if (d*86400 + b > seconds_old) print $0 }'
Adjust "seconds_old" to change the age you want to test for.
There are other ways of doing this, for example with find, as explained here:
https://serverfault.com/questions/181477/how-do-i-kill-processes-older-than-t
However, the solutions above should match your expected output.
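
The etime conversion can also be done in plain bash. The following is a sketch under the assumption that the value has the [[dd-]hh:]mm:ss form documented for procps ps; the function name etime_to_seconds is made up for illustration:

```shell
#!/bin/bash
# Convert a ps "etime" value ([[dd-]hh:]mm:ss) to whole seconds.
etime_to_seconds() {
    local etime=$1 days=0 h=0 m=0 s=0
    # Split off the optional "dd-" prefix.
    if [[ $etime == *-* ]]; then
        days=${etime%%-*}
        etime=${etime#*-}
    fi
    # Split the remainder on ":" into an array.
    local IFS=:
    local -a parts
    read -r -a parts <<< "$etime"
    case ${#parts[@]} in
        3) h=${parts[0]} m=${parts[1]} s=${parts[2]} ;;
        2) m=${parts[0]} s=${parts[1]} ;;
        *) return 1 ;;
    esac
    # 10# forces base 10 so that values such as "08" are not read as octal.
    echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Possible use: print processes older than 3 hours.
#   ps -e -o pid= -o etime= -o comm= | while read -r pid etime comm; do
#       (( $(etime_to_seconds "$etime") > 10800 )) && echo "$pid $etime $comm"
#   done
```

This starts no awk or sed process per line, at the cost of being bash-specific.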

Try this:
ps -u <user> -o pid,stime,pcpu,pmem,etime=,cmd --sort=start_time|grep <searchString>|while read -r z;do tago=$(echo "$z"|awk '{print $5}'|sed -E 's/(:|-)/ /g'| awk '{print $NF + $(NF-1)*60 + (NF>2 ? $(NF-2)*3600 : 0) + (NF>3 ? $(NF-3)*86400 : 0)}');if [ "$tago" -ge 10800 ];then echo "$z";fi;done
It prints only processes >= 10800 secs old.
You can readjust the output further to fit your needs.

I was able to find processes running for more than 3 hours with the command below.
ps -u <user> -o pid,stime,pcpu,pmem,etime,cmd --sort=start_time |grep -v grep|awk 'substr($0,23,2) > 3'

Problem with putting value in array in bash

I would like to build an array of users, adding them one at a time in a for loop. For example:
y[1]="user1"
y[2]="user2"
...
y[n]="usern"
I tried to do it like this
#!/bin/bash
x=$(who | cut -d " " -f1 | sort | uniq | wc -l)
for (( i=1; i<=$x; i++ )); do
y[$i]=$(who | cut -d " " -f1 | sort | uniq | sed -n '$ip')
p[$i]=$(lsof -u ${y[$i]} | wc -l)
echo "Users:"
echo ${y[$i]}
echo -e "Number of launched files:\n" ${p[$i]}
done
Most likely I'm using command "sed" wrong.
Can you help me?

Indeed your sed command seems to be a bit off. I can't really guess what you're trying to do there. Besides that, I'm wondering why you're executing who twice. You can make use of the data first obtained in the following manner.
#!/bin/bash
# define two arrays
y=()
p=()
while read -r username; do
y+=("$username")
p+=("$(lsof -u "$(id -u "$username")" | wc -l)")
echo -e "User:\n${y[-1]}"
echo -e "Open files:\n${p[-1]}"
# The -1 index is the last index in the array.
# Process substitution feeds the loop one user name per line.
done < <(who | cut -d " " -f1 | sort | uniq)
# The number of users is the number of elements in the array.
echo "Amount of users: ${#y[@]}"
exit 0
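
As for the sed call in the question: inside single quotes the shell does not expand `$i`, and even inside double quotes `$ip` would be read as a variable named ip. Writing the script as `"${i}p"` avoids both problems. A minimal sketch, with the helper name nth_line made up for illustration:

```shell
#!/bin/bash
# Print line number $1 of standard input.
# Double quotes let the shell expand the number; the braces keep the
# variable name separate from the sed command "p".
nth_line() {
    sed -n "${1}p"
}
```

With that, the assignment in the original loop could read `y[$i]=$(who | cut -d " " -f1 | sort | uniq | nth_line "$i")`.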

Command substitution as a variable in one-liner

I get the following error:
> echo "${$(qstat -a | grep kig):0:7}"
-bash: ${$(qstat -a | grep kig):0:7}: bad substitution
I'm trying to take the number before the "." in the output of
> qstat -a | grep kig
1192530.perceus- kigumen lr_regul pbs.sh 27198 2 16 -- 24:00:00 R 00:32:23
and use it as an argument to qdel in openPBS so that I can delete all processes that I started with my login kigumen
so ideally, this should work:
qdel ${$(qstat -a | grep kig):0:7}
so far, only this works:
str=$(qstat -a | grep kig); qdel "${str:0:7}"
but I want a clean one-liner without a temporary variable.

The shell substring construct you're using (:0:7) only works on variables, not command substitution. If you want to do this in a single operation, you'll need to trim the string as part of the pipeline, something like one of these:
echo "$(qstat -a | grep kig | sed 's/[.].*//')"
echo "$(qstat -a | awk -F. '/kig/ {print $1}')"
echo "$(qstat -a | awk '/kig/ {print substr($0, 1, 7)}')"
(Note that the first two print everything before the first ".", while the last prints the first 7 characters.) I don't know that any of them are particularly cleaner, but they do it without a temp variable...
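
A variation on the awk approach matches the user column exactly instead of grepping the whole line. This is only a sketch: it assumes the user name is the second whitespace-separated column, as in the sample qstat -a line in the question, and the function name job_ids is hypothetical:

```shell
#!/bin/bash
# Read "qstat -a"-style lines on standard input and print the part of the
# job ID before the first "." for every job owned by user $1.
# Assumption: column 1 is the job ID and column 2 is the user name.
job_ids() {
    awk -v user="$1" '$2 == user { sub(/\..*/, "", $1); print $1 }'
}
```

It could then be used as `qdel $(qstat -a | job_ids kigumen)`, which passes every matching ID to a single qdel call.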

qstat -u palle | cut -f 1 -d "." | xargs qdel
Kills all my jobs... normally I grep out the jobname(s) before cut'ing...
So I use a small script "idlist":
qstat -u palle | grep -E "*.in" | grep -E "$1" | cut -f 1 -d "." | xargs
To see all my "map_..." jobs:
idlist "map_*"
For killing all my "map_...." jobs:
idlist "map_*" | xargs qdel

Yet another way:
while read -r m1; do
if [[ $m1 =~ kig ]]; then
# keep only the part of the line before the first "."
m2=${m1%%.*}
echo "kig found $m2"
break
fi
done < <(qstat -a)

Idle time of a process in Linux

I need to calculate CPU usage (user mode, system mode, idle time) of a process in Linux.
I am able to calculate usage in user and system mode using utime and stime values from /proc/PID/stat, but I found nothing which is related to idle time.
I know I can get idle time from /proc/stat, but that value is for the whole machine, not for a particular process.
Is it possible to calculate idle time of a process knowing its PID (reading data from /proc directory)?

I don't know much about it, but maybe the following works:
1) Get the process start time. I'm sure that's possible.
2) Compute the time difference (dTime = CurrentTime - TimeProcessStarted).
3) Subtract the time the process has spent running (dTime - (usageSystemMode + usageUserMode)).
Hope this helps! :D
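
The three steps above can be sketched as follows. The function name idle_seconds and its argument order are made up for illustration. The inputs are assumed to come from /proc: the first field of /proc/uptime (seconds since boot), fields 22 (starttime), 14 (utime) and 15 (stime) of /proc/PID/stat (all in clock ticks), and the tick rate from `getconf CLK_TCK`.

```shell
#!/bin/bash
# idle_seconds UPTIME_S STARTTIME_TICKS UTIME_TICKS STIME_TICKS TICKS_PER_SECOND
# Prints the wall-clock time since the process started minus the CPU time it
# has used, in seconds with two decimals.
idle_seconds() {
    awk -v up="$1" -v start="$2" -v ut="$3" -v st="$4" -v hz="$5" \
        'BEGIN {
            elapsed = up - start / hz     # seconds since the process started
            cpu = (ut + st) / hz          # seconds spent in user + system mode
            printf "%.2f\n", elapsed - cpu
        }'
}
```

The result is "time not spent on a CPU", which includes sleeping and waiting for I/O, and for a multi-threaded process the CPU time can exceed the elapsed time, so treat it as an approximation.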

It's late, but this command may be useful:
IFS=$'\n';for i in `ps -eo uname:20,pid,cmd | grep -v "USER\|grep\|root"`; \
do if [ $(id -g `echo $i | cut -d" " -f1`) -gt 1000 ] && \
[ $(echo $((($(date +%s) - $(date -d "$(ls -lu \
--time-style=+"%y-%m-%d %H:%M:%S" /proc/$(echo $i | \
awk '{print $2}')/cwd | awk '{print $6" "$7}')" +%s))/3600))) -ge 1 ]; \
then echo $i; fi; done
To use it in a bash script file:
#!/bin/bash
IFS=$'\n'
for i in `ps -eo uname:20,pid,cmd | grep -v "USER\|grep\|root"`
do
Name="`echo $i | cut -d' ' -f1`"
Id="$(id -g $Name)"
Pid="`echo $i | awk '{print $2}'`"
Time1=$(date +%s)
Time2=$(date -d "$(/usr/bin/ls -lu --time-style=+"%y-%m-%d %H:%M:%S" \
/proc/$Pid/cwd | awk '{print $6" "$7}')" +%s)
# Whole hours since the last access time of /proc/PID/cwd
Time=$(( (Time1 - Time2) / 3600 ))
if [ "$Id" -gt 1000 ] && [ "$Time" -ge 1 ]
then
echo $i
fi
done
You can change grep -v "grep\|root" as you wish.
This one-line command lists all processes that are not owned by root or by system users and whose cwd entry under /proc was last accessed at least an hour ago.
