I'm trying to write a script that builds a list of nodes, then sshes into the first node of that list
and runs a checknodes.sh script, which itself is just a for loop that calls checknode.sh.
The first two lines seem to work: the list builds successfully. But then I either get just the echo line of checknodes.sh printed, or an error saying cat: gpcnodes.txt: No such file or directory
MYSCRIPT.sh:
#gets the master node for the job
MASTERNODE=`qstat -t -u \* | grep $1 | awk '{print$8}' | cut -d'@' -f 2 | cut -d'.' -f 1 | sed -e 's/$/.com/' | head -n 1`
#builds list of nodes in job
ssh -qt $MASTERNODE "qstat -t -u \* | grep $1 | awk '{print$8}' | cut -d'@' -f 2 | cut -d'.' -f 1 | sed -e 's/$/.com/' > /users/issues/slow_job_starts/gpcnodes.txt"
ssh -qt $MASTERNODE cd /users/issues/slow_job_starts/
ssh -qt $MASTERNODE /users/issues/slow_job_starts/checknodes.sh
checknodes.sh
for i in `cat gpcnodes.txt `
do
echo "### $i ###"
ssh -qt $i /users/issues/slow_job_starts/checknode.sh
done
checknode.sh
str=`hostname`
cd /tmp
time perf record qhost >/dev/null 2>&1 | sed -e 's/^/${str}/'
perf report --pretty=raw | grep % | head -20 | grep -c kernel.kallsyms | sed -e "s/^/`hostname`:/"
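Each ssh call starts a new shell on the remote host, so when ssh -qt $MASTERNODE cd /users/issues/slow_job_starts/ finishes, the changed directory is lost. The next ssh call starts in the home directory again, which is why checknodes.sh cannot find gpcnodes.txt.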
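A minimal sketch of why the directory change does not survive, using sh -c as a local stand-in for each ssh call (an assumption made so the example runs without a remote host). Every invocation is a fresh shell, so a cd in one has no effect on the next:

```shell
# Each "sh -c" plays the role of one ssh invocation: a brand-new shell.
start_dir=$(pwd)

sh -c 'cd /tmp'                     # the cd happens, then that shell exits
after_separate=$(sh -c 'pwd')       # a new shell starts in the original directory

# Doing the cd and the command in the SAME invocation keeps the directory.
after_combined=$(sh -c 'cd /tmp && pwd')

echo "separate calls: $after_separate"
echo "combined call:  $after_combined"
```

Applied to the script, that would mean doing both steps in one call, for example ssh -qt $MASTERNODE "cd /users/issues/slow_job_starts/ && ./checknodes.sh", or using absolute paths inside checknodes.sh.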
With the backquotes replaced by $(..) (not an error here, but get used to it) and an absolute path to the node list, checknodes.sh would be something like
for i in $(cat /users/issues/slow_job_starts/gpcnodes.txt)
do
echo "### $i ###"
ssh -nqt $i /users/issues/slow_job_starts/checknode.sh
done
or better
while read -r i; do
echo "### $i ###"
ssh -nqt $i /users/issues/slow_job_starts/checknode.sh
done < /users/issues/slow_job_starts/gpcnodes.txt
Perhaps you would also like to change your last script (start with cd /users/issues/slow_job_starts)
You will find more problems, like sed -e 's/^/${str}/' (the ${str} inside single quotes won't be expanded to the host name), but this should get you started.
EDIT:
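I added the option -n to the ssh call.
It redirects stdin from /dev/null (in effect, it prevents ssh from reading stdin).
Without this option only one node is checked in the while read loop, because the first ssh consumes the rest of the node list that the loop is reading.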
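The effect of -n can be reproduced without any remote host. In this sketch cat is a stand-in for ssh (an assumption for illustration): like ssh without -n, it reads whatever is on stdin, which inside the loop is the node list itself. The node names are made up.

```shell
# "cat" stands in for ssh: like ssh without -n, it reads from stdin.
nodes=$(mktemp)
printf '%s\n' node1 node2 node3 > "$nodes"

# Unprotected: the first "cat" swallows the remaining lines of the list.
visited_unprotected=0
while read -r i; do
    visited_unprotected=$((visited_unprotected + 1))
    cat > /dev/null                 # consumes node2 and node3 from the loop's stdin
done < "$nodes"

# Protected: stdin comes from /dev/null (what ssh -n does), the list stays intact.
visited_protected=0
while read -r i; do
    visited_protected=$((visited_protected + 1))
    cat < /dev/null > /dev/null
done < "$nodes"

rm -f "$nodes"
echo "unprotected: $visited_unprotected, protected: $visited_protected"
```

The for i in $(cat ...) variant is not affected in the same way, because there the list is read completely before the loop starts.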
I am stuck with this piece of code; any help is appreciated. This is the code I am executing from Jenkins.
#!/bin/bash
aws ec2 describe-instances --query 'Reservations[*].Instances[*].[Tags[?Key==`Name`].Value|[0],PrivateIpAddress]' --output text | column | grep devtools > devtools
ip=`awk '{print $2}' devtool`
echo $ip
ssh ubuntu@$ip -n "aws s3 cp s3://bucket/$userlistlocation . --region eu-central-1"
cd builds/${BUILD_NUMBER}/
scp * ubuntu@$ip:/home/ubuntu
if [ $port_type == "normal" ]; then
if [ $duplicate_value == "no" ]; then
if [ $userlist == "uuid" ]; then
ssh ubuntu@$ip -n "export thread_size='"'$thread_size'"'; filename=$(echo $userlistlocation | sed -E 's/.*\/(.*)$/\1/') ; echo $filename ; echo filename='"'$filename'"'; chmod +x uuidwithduplicate.sh; ./uuidwithduplicate.sh"
fi
fi
fi
fi
userlistlocation is a user input; it can be a full path such as /rahul/december/file.csv or simply file.csv.
With the sed command I am able to extract the file name and store it in the "filename" variable.
But when I try to echo $filename, it prints echo $filename instead of file.csv.
This file.csv will be the source file for one more script, uuidwithduplicate.sh.
Both userlistlocation and thread_size are specified through Jenkins job parameters.
I am not facing issues while exporting thread_size; the only issue is with filename.
Breaking down the ssh command:
ssh ubuntu@$ip -n "export thread_size='"'$thread_size'"'; filename=$(echo $userlistlocation | sed -E 's/.*\/(.*)$/\1/') ; echo $filename ; echo filename='"'$filename'"'; chmod +x uuidwithduplicate.sh; ./uuidwithduplicate.sh"
Into segments of single/double quoted items
"export thread_size='"
'$thread_size'
"#'; filename=$(echo $userlistlocation | sed -E 's/.*\/(.*)$/\1/') ; echo $filename ; echo filename='#"
'$filename'
"'; chmod +x uuidwithduplicate.sh; ./uuidwithduplicate.sh"
Note: in the 3rd segment, a '#' was added between each double quote and the adjacent single quote to make it more readable. It is not part of the command.
On the surface, there are a few issues:
The '$thread_size' should be "$thread_size" to enable expansion on the local host.
The echo $filename is inside double quotes, so $filename is expanded on the local host, where it is not set, whereas the assignment filename=$(echo ...) only takes effect on the remote host.
There are two echo commands for filename; it is not clear why.
The proposed solution is to set filename on the local host (which simplifies the command) and to move thread_size into double quotes. The complete command can then be written as a single double-quoted item:
filename=$(echo $userlistlocation | sed -E 's/.*\/(.*)$/\1/')
ssh localhost -n "export thread_size='$thread_size'; echo '$filename' ; echo filename='$filename'; chmod +x uuidwithduplicate.sh; ./uuidwithduplicate.sh"
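The local-versus-remote distinction can be tried without a remote machine. In this sketch sh -c stands in for the shell that ssh would start on the other side (an assumption for illustration), and the parameter expansion ${userlistlocation##*/} is used as an alternative to the sed call for stripping the directory part. The sample path is the one from the question.

```shell
# "sh -c" stands in for the remote shell that ssh would start.
userlistlocation=/rahul/december/file.csv     # sample value from the question

# Strip everything up to the last slash, locally, without sed.
filename=${userlistlocation##*/}

# A plain file name without any slash is left untouched.
plain=file.csv
plain_name=${plain##*/}

# Double quotes: $filename is expanded locally before the command is sent.
sent_expanded=$(sh -c "echo '$filename'")

# Single quotes: the literal text is sent; the "remote" shell has no such
# variable (it was never exported), so it expands to an empty string there.
sent_literal=$(sh -c 'echo "$filename"')

echo "expanded locally: [$sent_expanded]  expanded remotely: [$sent_literal]"
```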
I am currently trying to write a bash script that validates whether the SSH keys on a server are still linked to known hosts that are active on the local area network. Below is the beginning of my bash script:
#!/bin/bash
# LAN SSH KEYS DISCOVERY SCRIPT
# TRYING TO FIND THOSE SSH KEYS NOW
cat /etc/passwd | grep /bin/bash > bash_users
cat bash_users | cut -d ":" -f 6 > cutted.bash_users_home_dir
for bash_users in $(cat cutted.bash_users_home_dir)
do
ls -al $bash_users/.ssh/*id_* >> ssh-keys.txt
done
# DISCOVERING THE KNOWN_HOSTS NOW
for known_hosts in $(cat cutted.bash_users_home_dir)
do
cat $bash_users/.ssh/known_hosts | awk '{print $1}' | sort -u >> hosts_known.txt
sleep 2
done
hosts_known=$(wc -l hosts_known.txt)
echo "We have $hosts_known known hosts that could be still active via SSH keys"
# TIME TO TEST WHICH SSH servers are still active with the SSH keys
# AND THIS IS WHERE I AM FROZEN...
# Would love to have bash script that could
# ssh -l $users_that_have_/bin/bash -i $ssh_keys $ssh_servers
# Would also be very nice if it could save active
# SSH servers with the valid keys in output.txt in the format
# username:local-IP:/path/to/SSH_key
Please feel free to edit or modify the bash script above if that serves the goals described better.
Any help would be much appreciated.
Thanks
The following works well:
</etc/passwd \
grep /bin/bash |
cut -d: -f6 |
sudo xargs -i -- sh -c '
[ -e "$1" ] && cat "$1"
' -- {}/.ssh/known_hosts |
cut -d' ' -f1 |
tr ',' '\n' |
sed '
/^\[/{
s/\[\(.*\)\]:\(.*\)/\1 \2/;
t;
};
s/$/ 22/;
' |
sort -u |
xargs -l1 -- sh -c '
if echo "~" | nc -q1 -w3 "$1" "$2" | grep -q "^SSH"; then
echo "#### SUCCESS $1 $2";
else
echo "#### ERROR $1 $2";
fi
' --
So:
Start with /etc/passwd
Keep only the "bash_users", as you call them
Keep only the user home directories (cut -d: -f6)
For each user home directory (sudo xargs -i --):
Check whether the file .ssh/known_hosts exists inside the user home directory
If it does, print it
Keep only the host names (the first field)
Several host names may share the same key and are then separated by commas. Replace each comma with a newline
Now a sed script:
If a line starts with a [, it has the format [host]:port, and I replace it with host port
If the line does not start with a [, I append 22 to the end of the line, so it becomes host 22
Then I sort -u
Now for each line:
I get the SSH version banner: echo "~" | nc hostname port returns something like "SSH-2.0-OpenSSH_6.0" + newline + "Protocol mismatch".
So if the line returned by nc hostname port starts with SSH, there is an SSH server running on the other side
I added a timeout for unresponsive hosts (the nc -w option); nc -q 1 is specified as well.
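The host-normalizing part of the pipeline can be tried on its own. This sketch feeds a few made-up known_hosts lines (the host names and key strings are placeholders, not real data) through the same cut, tr and sed steps, with the sed script written one command per line:

```shell
# Made-up known_hosts content: a comma-separated pair, a [host]:port entry,
# and a duplicate host name.
sample_known_hosts='example-host,192.0.2.10 ssh-ed25519 AAAAexamplekey1
[gateway.example]:2222 ssh-rsa AAAAexamplekey2
example-host ssh-rsa AAAAexamplekey3'

targets=$(printf '%s\n' "$sample_known_hosts" |
    cut -d" " -f1 |
    tr "," "\n" |
    sed '
        /^\[/{
            s/\[\(.*\)\]:\(.*\)/\1 \2/
            t
        }
        s/$/ 22/
    ' |
    LC_ALL=C sort -u)

# One "host port" pair per line, duplicates removed.
printf '%s\n' "$targets"
```

The t command jumps to the end of the sed script once the [host]:port substitution has succeeded, which is what keeps the default port 22 from being appended to those lines as well.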
Now the real fun is, when you add the max-procs option to the last xargs line, you can check all hosts simultaneously. On my host I have 47 unique addresses and xargs -P30 checks them ALL in like 2 seconds.
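A small sketch of the parallel variant, assuming an xargs that supports -P (GNU and BSD xargs do; it is not part of POSIX). Here sleep 1 stands in for one slow host check and the host names are made up:

```shell
# "sleep 1" stands in for one slow check; -P4 runs up to four at the same time.
start=$(date +%s)
results=$(printf '%s\n' hostA hostB hostC hostD |
    xargs -P4 -n1 sh -c 'sleep 1; echo "checked $1"' -- |
    LC_ALL=C sort)
elapsed=$(( $(date +%s) - start ))

printf '%s\n' "$results"
echo "took about ${elapsed}s instead of roughly 4s when run one by one"
```

The output order of parallel jobs is not deterministic, which is why the results are sorted afterwards.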
But there are some real problems. The script needs root to read every user's known_hosts. Worse, known_hosts may be hashed. It would be better to first obtain the list of hosts on your network and then generate known_hosts from it, with something like ssh-keyscan -f list_of_hosts > ~/.ssh/known_hosts. Generally, ssh-keygen -F hostname should be used to check whether a host exists in known_hosts; sadly, there is no listing command. The known_hosts file format is described in the ssh documentation.
EDIT: Working script below
I have used this site MANY times to get answers, but I am a little stumped with this.
I am tasked with writing a script, in bash, to log into roughly 2000 Unix servers (Solaris, AIX, Linux) and check the size of OS filesystems, most notably /var, /usr and /opt.
I have set some variables, which may be where I am going wrong right off the bat.
1.) First I connect to another server that has a list of all hosts in the infrastructure. Then I parse this data with some sed commands to get a list I can use properly.
2.) Then I do a ping test to see whether the server is alive or has been decommissioned. The idea is that if the server is not pingable, I don't want it reported on, or any attempt made to connect to it, as that just wastes time. I feel I am doing this wrong, but don't know how to do it correctly (a recurring theme you will hear in this post lol).
If any FS is over the 80% mark, it should be written to a text file with the server name, filesystem and size on one line <== very important for me
If the FS is under 80% full, I don't want it in my output; it can be omitted completely.
I have created something that I will post below, and am hoping to get some help figuring out where I am going wrong. I am very new to bash scripting, but have experience as a Unix admin (I have never been good at scripting).
Can anyone provide some direction and teach me where I am going wrong?
I will upload my script, which I can confirm is working, hopefully tomorrow. Thanks everyone for your input on this!
Here is my "disk usage" Linux script; I hope it helps you.
#!/bin/sh
# Skip the df header line, then read "use% mountpoint" pairs.
df -H | awk 'NR>1 { print $5 " " $6 }' | while read -r output;
do
echo "$output"
usep=$(echo "$output" | awk '{ print $1 }' | cut -d'%' -f1 )
partition=$(echo "$output" | awk '{ print $2 }' )
if [ "$usep" -ge 90 ]; then
echo "Running out of space \"$partition ($usep%)\" on $(hostname) as on $(date)" |
mail -s "Warning! There is no space on the disk: $usep%" root@domain.com
fi
done
Some trouble is here:
ping -c 1 -W 3 $i > /dev/null 2>&1
if [ $? -ne 0 ]; then
echo "$i is offline" >> $LOG
fi
You need a continue statement inside that if. Your program isn't really treating non-pingable hosts differently, just logging they're not pingable.
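A minimal sketch of the loop with continue in place. The function is_alive is a hypothetical stand-in for the ping test, and the host names are made up, so that the example runs anywhere:

```shell
# is_alive is a hypothetical stand-in for: ping -c 1 -W 3 "$1" > /dev/null 2>&1
is_alive() {
    [ "$1" != "deadhost" ]
}

LOG=$(mktemp)
checked=""
for i in web1 deadhost db1; do
    if ! is_alive "$i"; then
        echo "$i is offline" >> "$LOG"
        continue                    # skip the rest of the loop body for this host
    fi
    checked="$checked $i"           # stands in for the ssh/df work on a live host
done

offline=$(cat "$LOG")
rm -f "$LOG"
echo "checked:$checked"
echo "$offline"
```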
Okay, now I'm looking a little deeper, and there's more naive stuff in here. These shouldn't work:
SOLVARFS=$(df -h /var |cut -f5 |grep -v capacity |awk '{print $5}')
SOLUSRFS=$(df -h /usr |cut -f5 |grep -v capacity |awk '{print $5}')
SOLOPTFS=$(df -h /opt |cut -f5 |grep -v capacity |awk '{print $5}')
etc...
The problem with these lines is, the command substitution gets assigned to the variables before the ssh session happens. So the content of each variable is the command's result on your local system, not the command itself. Since you're doing command substitution around your ssh calls, it might well work just to rewrite these lines as (note the backslash escapes on $5):
SOLVARFS="df -h /var |cut -f5 |grep -v capacity |awk '{print \$5}'"
SOLUSRFS="df -h /usr |cut -f5 |grep -v capacity |awk '{print \$5}'"
SOLOPTFS="df -h /opt |cut -f5 |grep -v capacity |awk '{print \$5}'"
etc...
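Why the backslash on $5 matters can be seen locally. In this sketch sh -c stands in for the remote shell that ssh would start, and a printf with one made-up line replaces the real df output (both are assumptions for illustration):

```shell
set --     # make sure no positional parameters are set for this demo

# Hypothetical stand-in for one line of df output.
sample="printf 'fs 10G 8G 2G 85%% /var\n'"

# Escaped: \$5 is stored as $5, so awk later prints the fifth field.
ESCAPED="$sample | awk '{print \$5}'"

# Unescaped: the local shell expands $5 (empty here) while the string is
# being stored, so awk receives '{print }' and prints the whole line.
UNESCAPED="$sample | awk '{print $5}'"

escaped_result=$(sh -c "$ESCAPED")
unescaped_result=$(sh -c "$UNESCAPED")

echo "escaped:   $escaped_result"
echo "unescaped: $unescaped_result"
```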
The part where you're contacting another server has some more stuff to correct. You don't need three if statements per server, and there's no reason to echo anything to /dev/null. Here's a rewrite for the SunOS section. For each directory you're checking, it outputs the host name, the command name (so you can see which dir was being checked), and the result:
if [[ $UNAME = "SunOS" ]]; then
for SSH_COMMAND in SOLVARFS SOLUSRFS SOLOPTFS ; do
RESULT=`ssh -o PasswordAuthentication=no -o BatchMode=yes -o StrictHostKeyChecking=no -o ConnectTimeout=2 -o GSSAPIAuthentication=no -q $i ${!SSH_COMMAND}`
RESULT=${RESULT%\%} # strip a trailing % so the numeric comparison works
if [ "$RESULT" -gt 80 ] ; then
echo "$i, $SSH_COMMAND, $RESULT" >> $LOG
fi
done
fi
Note that the ${!BLAH} construction is variable indirection. "Give me the contents of the variable named by BLAH".
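A short sketch of the indirection on its own. It requires bash, since ${!name} is not available in plain POSIX sh; the two command strings are shortened placeholders rather than the full pipelines:

```shell
# Requires bash: ${!name} expands to the value of the variable whose NAME
# is stored in "name".
SOLVARFS="df -h /var"
SOLUSRFS="df -h /usr"

seen=""
for SSH_COMMAND in SOLVARFS SOLUSRFS; do
    # $SSH_COMMAND is the variable's name, ${!SSH_COMMAND} is its contents.
    seen="$seen$SSH_COMMAND=${!SSH_COMMAND};"
done

echo "$seen"
```

This is what lets the loop log the variable name (to show which directory was checked) while passing the variable's contents to ssh.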
Your original script does a bunch of things less-than-optimally. Rather than running an almost-identical block of code for each filesystem and each operating system, the thing to do would be to record the differences in a way that a SINGLE piece of code can iterate over all your objects, adapting as required.
Here's my take on this. Commands should appear ONCE, but
they get run multiple times by loops, and
they get run multiple ways using arrays.
The following script passes lint checks, but obviously this is untested, as I don't have your environment to test in.
You might still want to think about how your logging and notifications work.
#!/bin/bash
# Assign temp file, remove it automatically upon successful exit.
tmpfile=$(mktemp /tmp/${0##*/}.XXXX)
trap "rm '$tmpfile'" 0
#NOW=$(date +"%Y-%m-%d-%T")
NOW=$(date +"%F")
LOG=/usr/scripts/disk_usage/Unix_df_issues-$NOW.txt
printf '' > "$LOG"
# Use variables to refer to commonly accessed files. If you change a name, just do it once.
rawhostlist=all_vms.txt
host_os=${rawhostlist}_OS
# Commonly-used options need only be declared once. Use an array for easier management.
declare -a ssh_opts=()
ssh_opts+=(-o PasswordAuthentication=no)
ssh_opts+=(-o BatchMode=yes)
ssh_opts+=(-o StrictHostKeyChecking=no) # Eliminate prompts on new hosts
ssh_opts+=(-o ConnectTimeout=2) # This should make your `ping` unnecessary.
ssh_opts+=(-o GSSAPIAuthentication=no) # This is default. Do we really need it?
# Note: Associative arrays require Bash 4.x.
declare -A df_opts=(
[SunOS]="-h"
[Linux]="-hP"
[AIX]=""
)
declare -A df_column=(
[SunOS]=5
[Linux]=5
[AIX]=4
)
# Fetch host list from configserver, stripping /^adm/ on the remote end.
ssh "${ssh_opts[@]}" -q configserver "sed 's/^adm//' /reports/*/HOSTNAME" > "$rawhostlist"
# Confirm that our host_os cache is up to date and process any missing hosts.
awk '
NR==FNR { h[$1]; next } # Add everything in rawhostlist to an array...
{ delete h[$1] } # Then remove any entries that exist in host_os.
END {
for (i in h) print i # And print whatever remains.
}' "$rawhostlist" "$host_os" |
while read h; do
# -n keeps ssh from swallowing the rest of the host list on stdin.
printf '%s\t%s\n' "$h" "$(ssh -n "$h" "${ssh_opts[@]}" -q uname -s)"
done >> "$host_os"
# Next, step through the host list and collect data.
while read host os; do
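# -n keeps ssh off the loop's stdin; the awk step strips the % so the usage is a plain number.
ssh -n "${ssh_opts[@]}" "$host" df "${df_opts[$os]}" /var /usr /opt |
awk -v column="${df_column[$os]}" -v host="$host" 'NR>1 { sub(/%$/, "", $column); print host,$1,$column }'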
done < "$host_os" > "$tmpfile"
# Now that we have all our data, check for warning/critical levels.
while read host filesystem usage; do
if [ "$usage" -gt 80 ]; then
status="CRITICAL"
elif [ "$usage" -gt 70 ]; then
status="WARNING"
else
continue
fi
# Log our results to our log file, AND send them to stderr.
printf "[%s] %s: %s:%s at %d%%\n" "$(date +"%F %T")" "$status" "$host" "$filesystem" "$usage" | tee -a "$LOG" >&2
done < "$tmpfile"
# Email and record our results.
if [ -s "$LOG" ]; then
mail -s "Daily Unix /var Report - $NOW" unixsystems@examplle.com < "$LOG"
mv "$LOG" /var/log/vm_reports/
fi
Consider this example code. If you like the way it looks, your next task is to debug it, or open new questions for parts that you're having trouble debugging. :-)
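As a starting point for that debugging, the awk idiom the script uses to find hosts that are missing from the OS cache can be tried in isolation. The file names and host names here are made up, and the cache file is assumed to exist already:

```shell
workdir=$(mktemp -d)

# Full host list, and a cache that already knows two of the three hosts.
printf '%s\n' alpha beta gamma > "$workdir/all_hosts"
printf '%s\t%s\n' alpha Linux gamma SunOS > "$workdir/host_os"

missing=$(awk '
    NR==FNR { h[$1]; next }   # first file: remember every host name
    { delete h[$1] }          # second file: forget the hosts already cached
    END { for (i in h) print i }
' "$workdir/all_hosts" "$workdir/host_os")

rm -rf "$workdir"
echo "hosts still to be probed: $missing"
```

NR==FNR is only true while awk is reading the first file, which is how a single awk program can treat the two inputs differently.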
Hi, I am new to bash scripting.
This is my script. It uses a while loop to ping the IPs listed in the server file, and that part works. For every IP that responds, I then want to create a config file and add it to nagios.cfg, as shown below. But this has an issue: the script takes only one IP as input and creates only one file, and the rest of the input is not processed. For example, if there are 5 IPs in the file, only the file for the first IP is created. I think more while loops may be needed.
#!/bin/bash
l2=$(tail -1 /root/serverfile | grep hadoop | tr ' ' '\n' | grep hadoop)
awk '{print $1}' < serverFile.txt | while read ip; do
if ping -c1 $ip >/dev/null 2>&1; then
cd /usr/local/nagios/etc/objects/Hadoop
cp Hadoop-node.cfg $l2.cfg
sed -i 's/192.168.0.1/'$ip'/' $l2.cfg
sed -i 's/Hadoop-node/'$l2'/' $l2.cfg
echo "cfg_file=/usr/local/nagios/etc/objects/Hadoop/$l2.cfg" >> /usr/local/nagios/etc/nagios.cfg
service nagios restart
echo " Node is added successfull"
echo $ip IS UP
else
echo $ip IS DOWN NOT PINGING
fi
done