How to log output of multiple shell scripts - linux

I am not familiar with this platform, so if this is in the frozen section my apologies :P
I am working on upgrading a Raspberry Pi script project. It decodes NOAA APT satellite images and runs from the at scheduler (I think) plus shell scripts. The scripts start recordings and do the automatic processing.
I have been having some problems and am trying to get a log of what is processed by the scripts; there are 3. I have tried adding something like ...) >> log.txt to the files, but the logs are always empty.
I can't call them as sh -x script.sh >>log.txt because they are scheduled to trigger at different times and it would be a pain to replace all the calls.
Ideally I would like something I could add at the end of each script to log everything it processes into its own log file (script1.log, script2.log, script3.log).
Thanks!!
Jake
Edit: I was advised to post the scripts. These are not "mine"; I got them from an Instructable and made some changes to fit my needs, and I would rather not break them more than I already have. Ideally I would like something I could put after the #!/bin/bash line that would log all of the commands processed by the script.
Thanks!
Script 1, the main scheduling script. Some lines have been commented out because I don't use NOAA 15 or Meteor M2.
#!/bin/bash
# Update Satellite Information
wget -qr https://www.celestrak.com/NORAD/elements/weather.txt -O /home/pi/weather/predict/weather.txt
#grep "NOAA 15" /home/pi/weather/predict/weather.txt -A 2 > /home/pi/weather/predict/weather.tle
grep "NOAA 18" /home/pi/weather/predict/weather.txt -A 2 >> /home/pi/weather/predict/weather.tle
grep "NOAA 19" /home/pi/weather/predict/weather.txt -A 2 >> /home/pi/weather/predict/weather.tle
#grep "METEOR-M 2" /home/pi/weather/predict/weather.txt -A 2 >> /home/pi/weather/predict/weather.tle
#Remove all AT jobs
for i in `atq | awk '{print $1}'`;do atrm $i;done
#Schedule Satellite Passes:
/home/pi/weather/predict/schedule_satellite.sh "NOAA 19" 137.1000
/home/pi/weather/predict/schedule_satellite.sh "NOAA 18" 137.9125
#/home/pi/weather/predict/schedule_satellite.sh "NOAA 15" 137.6200
Script 2, the individual satellite scheduler. It uses information from the first script to find the times the satellite passes overhead.
#!/bin/bash
PREDICTION_START=`/usr/bin/predict -t /home/pi/weather/predict/weather.tle -p "${1}" | head -1`
PREDICTION_END=`/usr/bin/predict -t /home/pi/weather/predict/weather.tle -p "${1}" | tail -1`
var2=`echo $PREDICTION_END | cut -d " " -f 1`
MAXELEV=`/usr/bin/predict -t /home/pi/weather/predict/weather.tle -p "${1}" | awk -v max=0 '{if($5>max){max=$5}}END{print max}'`
while [ `date --date="TZ=\"UTC\" @${var2}" +%D` == `date +%D` ]; do
START_TIME=`echo $PREDICTION_START | cut -d " " -f 3-4`
var1=`echo $PREDICTION_START | cut -d " " -f 1`
var3=`echo $START_TIME | cut -d " " -f 2 | cut -d ":" -f 3`
TIMER=`expr $var2 - $var1 + $var3`
OUTDATE=`date --date="TZ=\"UTC\" $START_TIME" +%Y%m%d-%H%M%S`
if [ $MAXELEV -gt 28 ]
then
echo ${1//" "}${OUTDATE} $MAXELEV
echo "/home/pi/weather/predict/receive_and_process_satellite.sh \"${1}\" $2 /home/pi/weather/${1//" "}${OUTDATE} /home/pi/weather/predict/weather.tle $var1 $TIMER" | at `date --date="TZ=\"UTC\" $START_TIME" +"%H:%M %D"`
fi
nextpredict=`expr $var2 + 60`
PREDICTION_START=`/usr/bin/predict -t /home/pi/weather/predict/weather.tle -p "${1}" $nextpredict | head -1`
PREDICTION_END=`/usr/bin/predict -t /home/pi/weather/predict/weather.tle -p "${1}" $nextpredict | tail -1`
MAXELEV=`/usr/bin/predict -t /home/pi/weather/predict/weather.tle -p "${1}" $nextpredict | awk -v max=0 '{if($5>max){max=$5}}END{print max}'`
var2=`echo $PREDICTION_END | cut -d " " -f 1`
done
The final script records the audio from the satellite at the specified frequency, adjusted for Doppler shift, decodes and processes it automatically, and posts the result to my archive and web server.
#!/bin/bash
# $1 = Satellite Name
# $2 = Frequency
# $3 = FileName base
# $4 = TLE File
# $5 = EPOC start time
# $6 = Time to capture
sudo timeout $6 rtl_fm -f ${2}M -s 60k -g 45 -p 55 -E wav -E deemp -F 9 - | sox -t wav - $3.wav rate 11025
#pass start 150 was 90
PassStart=`expr $5 + 150`
if [ -e $3.wav ]
then
/usr/local/bin/wxmap -T "${1}" -H $4 -p 0 -l 0 -o $PassStart ${3}-map.png
/usr/local/bin/wxtoimg -m ${3}-map.png -e ZA $3.wav ${3}.png
/usr/local/bin/wxtoimg -m ${3}-map.png -e NO $3.wav ${3}.NO.png
/usr/local/bin/wxtoimg -m ${3}-map.png -e MSA $3.wav ${3}.MSA.png
/usr/local/bin/wxtoimg -m ${3}-map.png -e MCIR $3.wav ${3}.MCIR.png
/usr/local/bin/wxtoimg -m ${3}-map.png -e MSA-PRECIP $3.wav ${3}.MSA-PRECIP.png
/usr/local/bin/wxtoimg -m ${3}-map.png -e EC $3.wav ${3}.EC.png
/usr/local/bin/wxtoimg -m ${3}-map.png -e HVCT $3.wav ${3}.HVCT.png
/usr/local/bin/wxtoimg -m ${3}-map.png -e CC $3.wav ${3}.CC.png
/usr/local/bin/wxtoimg -m ${3}-map.png -e SEA $3.wav ${3}.SEA.png
fi
NOW=$(date +%m-%d-%Y_%H-%M)
mkdir /home/pi/weather/Pictures/${NOW}
sudo cp /home/pi/weather/*.png /home/pi/weather/Pictures/${NOW}/ #move pictures to date folder in pi/pictures
sudo mv /var/www/html/APT_Pictures/PREVIOUS/* /var/www/html/APT_Pictures/ARCHIVE #move previous to archive
sudo mv /var/www/html/APT_Pictures/LATEST/* /var/www/html/APT_Pictures/PREVIOUS #move latest pictures to previous folder
sudo cp /home/pi/weather/Pictures/${NOW} /var/www/html/APT_Pictures/LATEST -r #copys date folder to latest
sudo cp /home/pi/weather/*-map.png home/pi/weather/Pictures/${NOW}/ #copys map to archive folder
##sudo mv /home/pi/weather/Pictures/${NOW}/*-map.png /home/pi/weather/maps #moves map from /pi/pictures date to maps folder
sudo rm /home/pi/weather/*.png #removes pictures from weather folder
sudo mv /home/pi/weather/*.wav /home/pi/weather/audio #moves audio to audio folder

Perhaps the scripts are writing their status messages to stderr instead of stdout? Your ...) >> log.txt method would only have captured stdout.
Here's how I'd capture stdout and stderr for debugging purposes.
$ /bin/bash script1.sh 1>>script1_stdout.log 2>>script1_stderr.log
$ /bin/bash script2.sh 1>>script2_stdout.log 2>>script2_stderr.log
$ /bin/bash script3.sh 1>>script3_stdout.log 2>>script3_stderr.log
Or combine the two streams into a single log file:
$ /bin/bash script1.sh 1>>script1.log 2>&1
$ /bin/bash script2.sh 1>>script2.log 2>&1
$ /bin/bash script3.sh 1>>script3.log 2>&1
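The "1" in 1>> refers to stdout and the "2" in 2>> refers to stderr. 2>&1 sends stderr to wherever stdout currently points.
Edit: If you want to keep seeing the stdout/stderr messages on the terminal and still write them to a file, use tee. tee copies everything it reads on stdin to its stdout and also writes a copy to the file path provided. Note that tee overwrites the file unless you pass -a to append.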
$ /bin/bash script1.sh 2>&1 | tee script1.log
$ /bin/bash script2.sh 2>&1 | tee script2.log
$ /bin/bash script3.sh 2>&1 | tee script3.log
Reference about stdout and stderr.
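For the request in the question (something placed right after the #!/bin/bash line), one option is to let the script redirect its own output with exec and turn on command tracing with set -x. This is only a minimal sketch: the log path /tmp/script1.log and the two echo lines are placeholders, not part of the original scripts.

```shell
#!/bin/bash
# Hypothetical log location; use a different file name in each script.
LOG_FILE="${LOG_FILE:-/tmp/script1.log}"

# From here on, append stdout and stderr of every command to the log file.
exec >>"$LOG_FILE" 2>&1
# Print each command (prefixed with "+") to stderr before it runs.
set -x

# Placeholder commands standing in for the real script body.
echo "update started"
echo "a warning message" >&2
```

Because the redirection lives inside the script, the existing at jobs and the calls between the scripts would not need to change.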

Related

Synchronize all current users in bash script using mkfifo pipe

I'm creating a program written in bash script that manages users and groups. I want to reload all current users when I add or delete a user in a specific tab. I tried to use mkfifo fpipe but it only reloads all users when I restart the app. Any ideas to solve this problem? Below is the code that performs this function.
mkfifo "$fpipe"
trap "rm -f $fpipe $fts" EXIT
fpipe="OUTPUT.txt"
#getAllUsers function
function get_all_user(){
echo -e '\f' >> "$fpipe"
alluser=$(cat /etc/passwd | awk -F: '$7=="/bin/bash" {print $1"\\n"$3"\\n"$4"\\n"}' | tr -d '[:space:]' )
echo -e $alluser > "$fpipe"
}
export -f get_all_user
#get_selected_user function
function get_selected_user()
{
echo -e '\f' > "$temp"
echo "$1" > "$temp"
cat $temp
}
export -f get_selected_user
#adduser function
function run_adduser()
{
# check for this in '/etc/passwd' and '/etc/shadow'
# $2 is the username
# $3 is the password
if id "$2" &>/dev/null; then
zenity --warning \
--text="Username existed. Please enter another username."
else
useradd -m -p $(openssl passwd -1 $3) -s /bin/bash -G sudo $2
zenity --info \
--text="User added successfully."
fi
}
export -f run_adduser
# Users information tab
get_all_user
yad --plug=$KEY --tabnum=1 --width=600 --height=450 --expand-column=0 --limit=10 \
--list --select-action='#bash -c "get_selected_user %s %s %s"' --column="Username" --column="UID" --column="GID" <&3 &
exec 3>&-

$! returns "not found" when executed in a shell function

I'm trying to monitor the duration and CPU/memory usage of a compile command. To do this I want to use the PID of the compile command, which I want to get using $!.
This is my script; I'm testing it on the sleep command for now:
#!/bin/sh
FILE_EXTENSION="$(date +"%d-%m-%Y-%T")"
TIME_FILE_NAME=outout/test_duration_$FILE_EXTENSION.txt
RESOURCES_FILE_NAME=output/test_internal_resources$FILE_EXTENSION.csv
monitorResources(){
PID=$1
echo "the pid i found is $($PID)"
mkdir -p output
echo "TIME_STAMP, Usage%, Memory Usage (MB)" > "$RESOURCES_FILE_NAME"
TOTAL="$(free -m | grep Mem | tr -s ' ' | cut -d ' ' -f 2)"
while ps -p "$PID" > /dev/null
do
logResources "$PID" "$TOTAL"
done
}
logResources(){
PID=$1
TOTAL=$2
echo "logging procces $(PID)"
DATE=$(date +"%H:%M:%S:%s%:z")
echo "$DATE, " >> "$RESOURCES_FILE_NAME"
top -b -n 1 -p "$PID" | tr -s ' ' | cut -d ' ' -f 10 >> "$RESOURCES_FILE_NAME"
##VAR="$(top -b -n 1 -p $PID | tr -s ' ' | cut -d ' ' -f 11)"
##echo "scale=3; ($VAR*$TOTAL/100)" | bc >> $RESOURCES_FILE_NAME
sleep 1
}
##cd /php-src
##{ time make -j $(nproc) ; } > $TIME_FILE_NAME 2>&1 &
sleep 5 &
monitorResources $!
wait
cat "$TIME_FILE_NAME"
cat "$RESOURCES_FILE_NAME"
Currently this returns:
./run_test.sh: 1: 1209: not found
./run_test.sh: 1: the: not found
./run_test.sh: 22: PID: not found
logging procces
./run_test.sh: 22: PID: not found
logging procces
Should I pass the variable differently? I've tried using pgrep as well, but that gave similar results. I'm pretty new to shell scripting.
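The "not found" messages above are consistent with $(...) being command substitution: $($PID) tries to run the PID value itself as a command, and $(PID) tries to run a command named PID. A minimal sketch, assuming the intent was only to print the variable (show_pid and bg_pid are hypothetical names for this illustration):

```shell
#!/bin/sh
# Hypothetical reduced version of monitorResources.
show_pid() {
    pid=$1
    # ${pid} (or $pid) expands the variable; $(pid) would run a command named "pid".
    echo "the pid i found is ${pid}"
}

sleep 1 &
bg_pid=$!          # PID of the most recent background job
show_pid "$bg_pid"
wait "$bg_pid"
```

With this form, $! is passed and printed as a plain value, so the way the variable is handed to the function does not need to change.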

Multithreading in bash scripting

I run a bash script that loops over every line of a text file and uses cURL to request the site listed on that line.
Here is my script:
SECRET_KEY='zuhahaha'
FILE_NAME=""
case "$1" in
"sma")
FILE_NAME="sma.txt"
;;
"smk")
FILE_NAME="smk.txt"
;;
"smp")
FILE_NAME="smp.txt"
;;
"sd")
FILE_NAME="sd.txt"
;;
*)
echo "not in case !"
;;
esac
function save_log()
{
printf '%s\n' \
"Header Code : $1" \
"Executed at : $(date)" \
"Response Body : $2" \
"====================================================================================================="$'\r\n\n' >> output.log
}
while IFS= read -r line;
do
HTTP_RESPONSE=$(curl -L -s -w "HTTPSTATUS:%{http_code}\\n" -H "X-Gitlab-Event: Push Hook" -H 'X-Gitlab-Token: '$SECRET_KEY --insecure $line 2>&1) &
HTTP_BODY=$(echo $HTTP_RESPONSE | sed -e 's/HTTPSTATUS\:.*//g') &
HTTP_STATUS=$(echo $HTTP_RESPONSE | tr -d '\n' | sed -e 's/.*HTTPSTATUS://') &
save_log "$HTTP_STATUS" "$HTTP_BODY" &
done < $FILE_NAME
How can I run this with threads, or otherwise make the loop faster in bash?
You should be able to do this relatively easily. Don't try to background each command, but instead put the body of your while loop into a subshell and background that. That way, your commands (which clearly depend on each other) run sequentially, but all the lines in the file can be processed in parallel.
while IFS= read -r line;
do
(
HTTP_RESPONSE=$(curl -L -s -w "HTTPSTATUS:%{http_code}\\n" -H "X-Gitlab-Event: Push Hook" -H 'X-Gitlab-Token: '$SECRET_KEY --insecure $line 2>&1)
HTTP_BODY=$(echo $HTTP_RESPONSE | sed -e 's/HTTPSTATUS\:.*//g')
HTTP_STATUS=$(echo $HTTP_RESPONSE | tr -d '\n' | sed -e 's/.*HTTPSTATUS://')
save_log "$HTTP_STATUS" "$HTTP_BODY" ) &
done < $FILE_NAME
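If the script needs to know when all the background subshells have finished (for example before reading output.log), wait can be added after the loop. A small self-contained sketch of the same pattern, where process_line is a hypothetical stand-in for the curl call and the site names are made up:

```shell
#!/bin/bash
# Hypothetical stand-in for the curl call: pause briefly, then report.
process_line() {
    sleep 0.1
    printf 'done: %s\n' "$1"
}

out_file=$(mktemp)   # hypothetical output file for this demonstration
for line in site-a site-b site-c; do
    (
        # Steps inside the subshell still run one after another.
        result=$(process_line "$line")
        printf '%s\n' "$result" >> "$out_file"
    ) &
done
wait   # block until every background subshell has finished
sort "$out_file"
```

The order in which the jobs finish is not guaranteed, which is why the sketch sorts the file before printing it.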
My favourite way to do this is to generate a file that lists all the commands you wish to perform. If you have a script that performs your operations, create a file like:
$ cat commands.txt
echo 1
echo 2
echo $[12+3]
....
For example this could be hundreds of commands long.
To execute each line in parallel, use the parallel command with, say, at most 3 jobs running in parallel at any time.
$ cat commands.txt | parallel -j 3
1
2
15
For your curl example you could generate thousands of curl commands and execute them, say, 30 in parallel at any one time.
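If GNU parallel is not installed, xargs with the -P option (available in GNU and BSD xargs) can give a similar effect. This is a sketch under that assumption, using a temporary file in place of commands.txt:

```shell
#!/bin/bash
# Hypothetical commands file, one shell command per line.
commands_file=$(mktemp)
printf '%s\n' 'echo 1' 'echo 2' 'echo $((12+3))' > "$commands_file"

# -P 3  : run at most 3 processes at a time
# -I {} : substitute each input line for {} and run it through sh -c
# Completion order is not guaranteed, so the output is sorted for display.
xargs -P 3 -I {} sh -c '{}' < "$commands_file" | sort -n
```

Each line is executed as shell code, so this only suits command files you generated yourself.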

make scripts run only one instance when a user logs in

I am trying to run a few bash scripts continually while I am logged in to my Linux Mint install. Adding them to startup applications doesn't appear to work, because they are not always running when I check. I also don't want to create multiple instances of the scripts, so adding them to my .bashrc or a cron job seems to be out. Any other suggestions?
An example script (warns me when my battery is below 30%):
#!/bin/bash
while :
do
#echo "starting script: $(date)">>battery_log
percent=$(upower -i /org/freedesktop/UPower/devices/battery_BAT0| grep -E "percentage" | grep -o '[0-9]\+')
cpu_temp=$(cat /sys/class/thermal/thermal_zone0/temp | awk '{print "deg C: "$1/1000}')
discharge=$(upower -i /org/freedesktop/UPower/devices/battery_BAT0| grep -E "state")
is_discharging=$(echo $discharge | grep -c discharging)
#echo "$percent"
#echo "$cpu_temp"
#echo hello | grep -c he
#if [echo $discharge | grep -c discharging -gt 0 ]; then
#echo "success"
#fi
#echo "$discharge"
if [ "$is_discharging" -gt 0 ]; then
echo "---discharging: $(date)">>battery_log
if [ "$percent" -lt 30 ]; then
#exec 2>>/home/king/Scripts/battery_log.txt
export DISPLAY=:0
#export XAUTHORITY=~otheruser/.Xauthority
#kdialog --msgbox "$(upower -i /org/freedesktop/UPower/devices/battery_BAT0| grep -E "state|to\ full|percentage") \n cpu temp: $(cat /sys/class/thermal/thermal_zone0/temp | awk '{print "deg C: "$1/1000}')"
kdialog --title "Low Battery" --passivepopup "$(upower -i /org/freedesktop/UPower/devices/battery_BAT0| grep -E "state|to\ full|percentage") \n cpu temp: $(cat /sys/class/thermal/thermal_zone0/temp | awk '{print "deg C: "$1/1000}')" 20
fi
fi
sleep 300 #5min
done
Before you run the script, check if an instance of your script is already running.
pgrep script.sh.
If it's already running, then you can exit, otherwise continue the script.
pgrep script.sh && exit should do the trick.
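One caveat: when pgrep is run from inside the script, it may match the script's own process, so the check can succeed even when no other instance exists. An alternative sketch uses flock from util-linux with a lock file; the lock path below is a hypothetical choice, and the final echo stands in for the real loop.

```shell
#!/bin/bash
# Hypothetical lock file; any path writable by the user would do.
lock_file="${TMPDIR:-/tmp}/battery_monitor.lock"

# Open the lock file on file descriptor 9 and try to take an exclusive lock.
exec 9>"$lock_file"
if ! flock -n 9; then
    # Another process already holds the lock, so leave quietly.
    echo "another instance is already running" >&2
    exit 0
fi

# The lock is held until this shell exits; the main loop would go here.
echo "lock acquired, starting main loop"
```

The lock is released automatically when the script ends, including when it is killed, so no stale state has to be cleaned up.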

passing double-quotes through weka machine learning

I am using the Weka CLI, namely Primer, and I have tried many different ways of passing several arguments, with no success. When I pass something like this:
weka_options=("weka.classifiers.functions.SMOreg -C 1.0 -N 0")
the program runs with no issue, but passing something like this:
weka_options=("weka.classifiers.functions.SMOreg -C 1.0 -N 0 -I \"weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -W 1 -P 1.0e-12 -T 0.001 -V\" -K \"weka.classifiers.functions.supportVector.NormalizedPolyKernel -C 250007 -E 8.0\"")
with or without escape characters, and even single-quoted, gives me an error in my bash scripts:
bash ./weka.sh "$sub_working_dir" $train_percentage "$weka_options" $files_string > $predictions
where weka.sh contains:
java -Xmx1024m -classpath ".:$WEKAPATH" $weka_options -t "$train_set" -T "$test_set" -c 53 -p 53
Here is what I get:
---Registering Weka Editors---
Trying to add database driver (JDBC): jdbc.idbDriver - Error, not in CLASSPATH?
Weka exception: Can't open file No suitable converter found for '0.001'!.
Can anyone pinpoint the issue?
Updated question: here is the code:
# Usage:
#
# ./aca2_explore.sh working-dir datasets/*
# e.g.
# ./aca2_explore.sh "aca2-explore-working-dir/" datasets/*
#
# Place this script in the same folder as aca2.sh and the folder containing the datasets.
#
#
# Please note that:
# - All the notes contained in aca2.sh apply
# - This script will erase the contents of working-dir
# to properly sort negative floating numbers, independently of local language options
export LC_ALL=C
# parameters parsing
output_directory=$1
first_file_index=2
files=${@:$first_file_index}
# global constants
datasets=$(($# - 1))
output_row=$(($datasets + 3))
output_columns_range="2-7"
learned_model_mae_column=4
results_learned_model_mae_column=4
# parameters
working_dir="$output_directory"
if [ -d "$working_dir" ];
then
rm -r "$working_dir"
fi
mkdir "$working_dir"
sub_working_dir="$working_dir""aca2-explore-sub-working-dir/"
path_to_results_file="$sub_working_dir""results.csv"
train_percentage=25
logfile="$working_dir""aca2_explore_log.csv"
echo "" > "$logfile"
reduced_log_header="Options,average_test_set_speedup,null_model_mae,learned_model_mae,learned_model_rmse,mae_ratio,R^2"
reduced_logfile="$working_dir""aca2_explore_reduced_log.csv"
echo "$reduced_log_header" > "$reduced_logfile"
sorted_reduced_logfile="$working_dir""aca2_explore_sorted_reduced_log.csv"
weka_options_list=(
"weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8"
"weka.classifiers.functions.MultilayerPerceptron -L 0.3 -M 0.2 -N 100 -V 0 -S 0 -E 20 -H a"
"weka.classifiers.meta.AdditiveRegression -S 1.0 -I 10 -W weka.classifiers.trees.DecisionStump"
"weka.classifiers.meta.Bagging -P 100 -S 1 -num-slots 1 -I 10 -W weka.classifiers.trees.REPTree -- -M 2 -V 0.001 -N 3 -S 1 -L -1 -I 0.0"
"weka.classifiers.meta.CVParameterSelection -X 10 -S 1 -W weka.classifiers.rules.ZeroR"
"weka.classifiers.meta.MultiScheme -X 0 -S 1 -B \"weka.classifiers.rules.ZeroR \""
"weka.classifiers.meta.RandomCommittee -S 1 -num-slots 1 -I 10 -W weka.classifiers.trees.RandomTree -- -K 0 -M 1.0 -V 0.001 -S 1"
"weka.classifiers.meta.RandomizableFilteredClassifier -S 1 -F \"weka.filters.unsupervised.attribute.RandomProjection -N 10 -R 42 -D Sparse1\" -W weka.classifiers.lazy.IBk -- -K 1 -W 0 -A \"weka.core.neighboursearch.LinearNNSearch -A \"weka.core.EuclideanDistance -R first-last\"\""
"weka.classifiers.meta.RandomSubSpace -P 0.5 -S 1 -num-slots 1 -I 10 -W weka.classifiers.trees.REPTree -- -M 2 -V 0.001 -N 3 -S 1 -L -1 -I 0.0"
"weka.classifiers.meta.RegressionByDiscretization -B 10 -K weka.estimators.UnivariateEqualFrequencyHistogramEstimator -W weka.classifiers.trees.J48 -- -C 0.25 -M 2"
"weka.classifiers.meta.Stacking -X 10 -M \"weka.classifiers.rules.ZeroR \" -S 1 -num-slots 1 -B \"weka.classifiers.rules.ZeroR \""
"weka.classifiers.meta.Vote -S 1 -B \"weka.classifiers.rules.ZeroR \" -R AVG"
"weka.classifiers.rules.DecisionTable -X 1 -S \"weka.attributeSelection.BestFirst -D 1 -N 5\""
"weka.classifiers.rules.M5Rules -M 4.0"
"weka.classifiers.rules.ZeroR"
"weka.classifiers.trees.DecisionStump"
"weka.classifiers.trees.M5P -M 4.0"
"weka.classifiers.trees.RandomForest -I 100 -K 0 -S 1 -num-slots 1"
"weka.classifiers.trees.RandomTree -K 0 -M 1.0 -V 0.001 -S 1"
"weka.classifiers.trees.REPTree -M 2 -V 0.001 -N 3 -S 1 -L -1 -I 0.0")
files_string=""
for file in ${files[@]}
do
files_string="$files_string""$file"" "
done
#echo $files_string
for weka_options in "${weka_options_list[@]}"
do
echo "$weka_options"
echo "$weka_options" >> "$logfile"
bash ./aca2.sh "$sub_working_dir" $train_percentage "$weka_options" $files_string
cat "$path_to_results_file" >> "$logfile"
result_columns=$(tail -n +"$output_row" "$path_to_results_file" | head -1 | cut -d, -f"$output_columns_range")
echo "$weka_options"",""$result_columns" >> "$reduced_logfile"
echo "" >> "$logfile"
done
tail -n +2 "$reduced_logfile" > "$sorted_reduced_logfile"
sort --field-separator=',' --key="$results_learned_model_mae_column" "$sorted_reduced_logfile" -o "$sorted_reduced_logfile"".tmp"
echo "$reduced_log_header" > "$sorted_reduced_logfile"
cat "$sorted_reduced_logfile"".tmp" >> "$sorted_reduced_logfile"
rm "$sorted_reduced_logfile"".tmp"
where the file aca2.sh is:
#!/bin/bash
# Run this script as ./script.sh working-directory train-set-filter-percentage "weka_options" datasets/*
#
# e.g.
# Place this script in a folder together with a directory containing your datasets. Call then the script as
# ./aca2.sh "aca2-working-dir/" 25 "weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8" datasets_folder/*
#
# NOTE: the script will erase the content of working-directory
# for correct behaviour $WEKAHOME environment variable must be set to the folder containing weka.jar, otherwise modify the call to the weka classifier below
#
# To define the error measures used in this script, I made use of some of the notions found in this article:
# http://scott.fortmann-roe.com/docs/MeasuringError.html
# parameters parsing
output_directory=$1
train_set_percentage=$2
if [ $train_set_percentage -lt 1 ] || [ $train_set_percentage -gt 100 ];
then
echo "Invalid train set percentage: "$train_set_percentage
exit 1
fi
weka_options=$3
first_file_index=4
files=${@:$first_file_index}
# global constants
predictions_characters_range_value="23-28"
predictions_characters_range_error="34-39"
tmp_dir="$output_directory"
if [ -d "$tmp_dir" ];
then
rm -r "$tmp_dir"
fi
mkdir "$tmp_dir"
results_header="testfile,average_test_set_speedup,null_model_mae,learned_model_mae,learned_model_rmse,mae_ratio,R^2"
results_file=$tmp_dir"results.csv"
echo "$results_header" > "$results_file"
arff_header="% ARFF conversion of CSV dataset
@RELATION program
@ATTRIBUTE ...
@DATA"
# global constants
datasets_per_program=5
entries_per_dataset=128
train_set_instances_to_select=$((datasets_per_program*entries_per_dataset*train_set_percentage/100))
all_prediction="$tmp_dir""all_predictions.txt"
count=0
prediction_efficiency_ideal_avg=0
arff_header_file="$tmp_dir""arff_header.txt"
echo "$arff_header" > "$arff_header_file"
count=0
for filename in ${files[@]}
do
echo "Test set: $filename"
echo "$filename" >> "$all_prediction"
cur_dir="$tmp_dir$filename.dir/"
mkdir -p $cur_dir
testfile=$filename
train_set="$cur_dir""train_set.arff"
echo "$arff_header" > $train_set
selected_train_subset="$cur_dir""selected_train_subset.csv"
for trainfile in ${files[@]}
do
if [ "$trainfile" != "$testfile" ]; then
# filter train set to feed only top 25% for model generation
sort --field-separator=',' --key=53 "$trainfile" -o "$selected_train_subset"
head -$train_set_instances_to_select "$selected_train_subset" >> $train_set
fi
done
test_set="$cur_dir""test_set.arff"
#echo "$arff_header" > $test_set
cp "$testfile" "$test_set"
# This file will contain the full configuration space dataset relative to the test program
complete_test_set="$cur_dir""complete_test_set.csv"
cp "$test_set" "$complete_test_set"
sort --field-separator=',' --key=53 "$test_set" -o "$test_set"
head -8 "$test_set" > "$test_set"".tmp"
mv "$test_set"".tmp" "$test_set"
cur_prediction="$cur_dir""cur_prediction.tmp"
# generate basis for predicted test set file by copying the actual test set, removing speedups
predicted_test_set="$cur_dir""predicted_test_set.csv"
cp "$test_set" "$predicted_test_set"
cut -d, -f53 --complement "$predicted_test_set" > "$predicted_test_set"".tmp"
mv "$predicted_test_set"".tmp" "$predicted_test_set"
cat "$arff_header_file" "$test_set" > "$test_set"".tmp"
mv "$test_set"".tmp" "$test_set"
java -Xmx1024m -classpath ".:$WEKAHOME/weka.jar:$WEKAJARS/*" $weka_options -t "$train_set" -T "$test_set" -c 53 -p 53 | tail -n +6 | head -8 > "$cur_prediction"
predictions_file="$cur_dir""predictions.csv"
cut -c"$predictions_characters_range_value" "$cur_prediction" | tr -d " " > "$predictions_file"
paste -d',' "$actual_speedups" "$predictions_file" > "$predictions_file"".tmp"
mv "$predictions_file"".tmp" "$predictions_file"
done
You almost have this right. You were trying to do the right thing it looks like (or just getting accidentally close).
You cannot use a string for arbitrarily quoted arguments (this is Bash FAQ 050).
You need to use an array instead, and the array needs a separate element for each argument, not one element holding the whole string.
weka_options=(weka.classifiers.functions.SMOreg -C 1.0 -N 0)
or
weka_options=(weka.classifiers.functions.SMOreg -C 1.0 -N 0 -I "weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -W 1 -P 1.0e-12 -T 0.001 -V" -K "weka.classifiers.functions.supportVector.NormalizedPolyKernel -C 250007 -E 8.0")
(I assume the string weka.classifiers.functions.supportVector.RegSMOImproved -L 1.0e-3 -W 1 -P 1.0e-12 -T 0.001 -V is the argument to the -I flag and that the string weka.classifiers.functions.supportVector.NormalizedPolyKernel -C 250007 -E 8.0 is the argument to the -K flag. If that's not the case then those quotes likely want to get removed also.)
And then when you use the array you need to use "${weka_options[@]}" to get the elements of the array as individual quoted words.
java -Xmx1024m -classpath ".:$WEKAPATH" "${weka_options[@]}" -t "$train_set" -T "$test_set" -c 53 -p 53
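The difference between a string and an array can be seen without Weka. In this sketch count_args is a hypothetical helper and the option values are made up; it only shows how many arguments a command would receive in each case.

```shell
#!/bin/bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

opts_string='-I "a b" -K "c d"'
opts_array=(-I "a b" -K "c d")

# Unquoted string: split on whitespace, the quote characters stay literal.
count_args $opts_string          # prints 6
# Array expansion: every element is passed as exactly one argument.
count_args "${opts_array[@]}"    # prints 4
```

The quotes stored inside the string are ordinary characters by the time the shell splits it, which is why the nested option strings arrive broken apart.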
