I have logs in this format:
log1,john,time,etc
log2,peter,time,etc
log3,jack,time,etc
log4,peter,time,etc
I want to create a list for every person in the format
"name"=("no.lines" "line" "line" ...)
For example:
peter=("2" "log2,peter,time,etc" "log4,peter,time,etc")
I already have this structure and know how to create variables like
declare "${FIELD[1]}"=1
but I don't know how to increase the number of records, and I get an error when I try to create a list like this and append to it.
#!/bin/bash
F=("log1,john,time,etc" "log2,peter,time,etc" "log3,jack,time,etc" "log4,peter,time,etc")
echo "${F[#]}"
declare -a CLIENTS
for LINE in "${F[@]}"
do
echo "$LINE"
IFS=',' read -ra FIELD < <(echo "$LINE")
if [ -z "${!FIELD[1]}" ] && [ -n "${FIELD[1]}" ] # check if there is already record for given line, if not create
then
CLIENTS=("${CLIENTS[@]}" "${FIELD[1]}") # add person to list of variables records for later access
declare -a "${FIELD[1]}"=("1" "LINE") # ERROR
elif [ -n "${!FIELD[1]}" ] && [ -n "${FIELD[1]}" ] # if already record for client
then
echo "Increase records number" # ???
echo "Append record"
"${FIELD[#]}"=("${FIELD[#]}" "$LINE") # ERROR
else
echo "ELSE"
fi
done
echo -e "CLIENTS: \n ${CLIENTS[#]}"
echo "Client ${CLIENTS[0]} has ${!CLIENTS[0]} records"
echo "Client ${CLIENTS[1]} has ${!CLIENTS[1]} records"
echo "Client ${CLIENTS[2]} has ${!CLIENTS[2]} records"
echo "Client ${CLIENTS[3]} has ${!CLIENTS[3]} records"
Be warned: The below uses namevars, a new bash 4.3 feature.
First: I would strongly suggest namespacing your arrays with a prefix to avoid collisions with unrelated variables. Thus, using content_ as that prefix:
read_arrays() {
    while IFS= read -r line && IFS=, read -r -a fields <<<"$line"; do
        name=${fields[1]}
        declare -g -a "content_${name}"
        declare -n cur_array="content_${name}"
        cur_array+=( "$line" )
        unset -n cur_array
    done
}
Then:
lines_for() {
    declare -n cur_array="content_$1"
    printf '%s\n' "${#cur_array[@]}" ## emit length of array for given person
}
...or...
for_each_line() {
    declare -n cur_array="content_$1"; shift
    for line in "${cur_array[@]}"; do
        "$@" "$line"
    done
}
Tying all this together:
$ read_arrays <<'EOF'
log1,john,time,etc
log2,peter,time,etc
log3,jack,time,etc
log4,peter,time,etc
EOF
$ lines_for peter
2
$ for_each_line peter echo
log2,peter,time,etc
log4,peter,time,etc
...and, if you really want the format you asked for, with the number of lines as explicit data, and variable names that aren't safely namespaced, it's easy to convert from one to the other:
# this should probably be run in a subshell to avoid namespace pollution
# thus, (generate_stupid_format) >output
generate_stupid_format() {
    for scoped_varname in "${!content_@}"; do
        unscoped_varname="${scoped_varname#content_}"
        declare -n unscoped_var=$unscoped_varname
        declare -n scoped_var=$scoped_varname
        unscoped_var=( "${#scoped_var[@]}" "${scoped_var[@]}" )
        declare -p "$unscoped_varname"
    done
}
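Run against the sample input above, the emitted declare -p lines look roughly like this (the exact quoting in declare -p output varies between bash versions):

declare -a peter=([0]="2" [1]="log2,peter,time,etc" [2]="log4,peter,time,etc")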
Bash with Coreutils, grep and sed
If I understand your code right, you're trying to build multidimensional arrays, which Bash doesn't support. If I were to solve this problem from scratch, I'd use this mix of command line tools (see security concerns at the end of the answer!):
#!/bin/bash
while read -r name; do
    printf "%s=(\"%d\" \"%s\")\n" \
        "$name" \
        "$(grep -c "$name" "$1")" \
        "$(grep "$name" "$1" | tr $'\n' ' ' | sed 's/ /" "/g;s/" "$//')"
done < <(cut -d ',' -f 2 "$1" | sort -u)
Sample output:
$ ./SO.sh infile
jack=("1" "log3,jack,time,etc")
john=("1" "log1,john,time,etc")
peter=("2" "log2,peter,time,etc" "log4,peter,time,etc")
This uses process substitution to prepare the log file so we can loop over unique names; the output of the substitution looks like
$ cut -d ',' -f 2 "$1" | sort -u
jack
john
peter
i.e., a list of unique names.
For each name, we then print the summarized log line with
printf "%s=(\"%d\" \"%s\")\n"
Where
The %s string is just the name ("$name").
The log line count is the output of a grep command,
grep -c "$name" "$1"
which counts the number of occurrences of "$name". If the name can occur elsewhere in the log line, we can limit the search to just the second field of the log lines with
grep -c "$name" <(cut -d ',' -f 2 "$1")
Finally, to get all log lines on one line with proper quoting and all, we use
grep "$name" "$1" | tr $'\n' ' ' | sed 's/ /" "/g;s/" "$//'
This gets all lines containing "$name", replaces newlines with spaces, then surrounds the spaces with quotes and removes the extra quotes from the end of the line.
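As an illustration, tracing that pipeline on peter's two lines:

$ printf '%s\n' "log2,peter,time,etc" "log4,peter,time,etc" | tr $'\n' ' ' | sed 's/ /" "/g;s/" "$//'
log2,peter,time,etc" "log4,peter,time,etc

The surrounding printf format ("%s") then supplies the outer pair of quotes.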
Pure Bash
After initially thinking that pure Bash would be too cumbersome, it turned out to be not all that complicated:
#!/bin/bash
declare -A count
declare -A lines
old_ifs=$IFS
IFS=,
while read -r -a line; do
    name="${line[1]}"
    (( ++count[$name] ))
    lines[$name]+="\"${line[*]}\" "
done < "$1"
for name in "${!count[@]}"; do
    printf "%s=(\"%d\" %s)\n" "$name" "${count[$name]}" "${lines[$name]% }"
done
IFS="$old_ifs"
This updates two associative arrays while looping over the input file: count keeps track of the number of times a certain name occurs, and lines appends the log lines to an entry per name.
To separate fields by commas, we set the input field separator IFS to a comma (but save it beforehand so it can be reset at the end).
read -r -a reads the lines into an array line with comma separated fields, so the name is now in ${line[1]}. We increase the count for that name in the arithmetic expression (( ... )), and append (+=) the log line in the next line.
${line[*]} prints all fields of the array separated by IFS, which is exactly what we want. We also add a space here; the unwanted space at the end of the line (after the last element) will be removed later.
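To see that joining behavior in isolation (a standalone snippet):

IFS=,
line=(log2 peter time etc)
echo "\"${line[*]}\""   # prints "log2,peter,time,etc"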
The second loop iterates over all the keys of the count array (the names), then prints the properly formatted line for each. ${lines[$name]% } removes the space from the end of the line.
Security concerns
As it seems that the output of these scripts is supposed to be reused by the shell, we might want to prevent malicious code execution if we can't trust the contents of the log file.
A way to do that for the Bash solution (hat tip: Charles Duffy) would be the following: the for loop would have to be replaced by
for name in "${!count[#]}"; do
IFS=' ' read -r -a words <<< "${lines[$name]}"
printf -v words_str '%q ' "${words[#]}"
printf "%q=(\"%d\" %s)\n" "$name" "${count[$name]}" "${words_str% }"
done
That is, we split the combined log lines into an array words, print that with the %q formatting flag into a string words_str and then use that string for our output, resulting in escaped output like this:
peter=("2" \"log2\,peter\,time\,etc\" \"log4\,peter\,time\,etc\")
jack=("1" \"log3\,jack\,time\,etc\")
john=("1" \"log1\,john\,time\,etc\")
The same could be done for the first solution.
You can use awk. As a demo:
awk -F, '{a1[$2]=a1[$2]" \""$0"\""; sum[$2]++} END{for (e in sum){print e"=(" "\""sum[e]"\""a1[e]")"}}' file
john=("1" "log1,john,time,etc")
peter=("2" "log2,peter,time,etc" "log4,peter,time,etc")
jack=("1" "log3,jack,time,etc")
Related
I am passing an argument that I have to match in a file to extract the corresponding information. Could you please show me how I can do that?
Example:
I have below details in file-
iMedical_Refined_load_Procs_task_id=970113
HV_Rawlayer_Execution_Process=988835
iMedical_HV_Refined_Load=988836
DHS_RawLayer_Execution_Process=988833
iMedical_DHS_Refined_Load=988834
If I pass 'hv' as the argument, it should pick 'iMedical_HV_Refined_Load' and give the result '988836'.
If I pass 'dhs', it should pick 'iMedical_DHS_Refined_Load' and give the result '988834'.
I tried the logic below, but it's not giving the correct result. What changes do I need to make?
echo $1 | tr a-z A-Z
g=${1^^}
echo $g
echo $1
val=$(awk -F= -v s="$g" '$g ~ s{print $2}' /medaff/Scripts/Aggrify/sltconfig.cfg)
echo "TASK ID is $val"
Assuming your matching criterion is the first string after the _ delimiter, and the output needed is the number after the = char, then you can try this sed
$ sed -n "/_$1/I{s/[^=]*=\(.*\)/\1/p}" input_file
$ read -r input
hv
$ sed -n "/_$input/I{s/[^=]*=\(.*\)/\1/p}" input_file
988836
$ read -r input
dhs
$ sed -n "/_$input/I{s/[^=]*=\(.*\)/\1/p}" input_file
988834
If I'm reading it right, 2 quick versions -
$: cat 1
awk -F= -v s="_${1^^}_" '$1~s{print $2}' file
$: cat 2
sed -En "/_${1^^}_/{s/^.*=//;p;}" file
Both basically the same logic.
In pure bash -
$: cat 3
while IFS='=' read key val; do [[ "$key" =~ "_${1^^}_" ]] && echo "$val"; done < file
That's a lot less efficient, though.
If you know for sure there will be only one hit, all these could be improved a bit by short-circuit exits, but on such a small sample it won't matter at all. If you have a larger dataset to read, then I strongly suggest you formalize your specs better than "in this set I should get...".
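For instance, a sketch of the awk version with a short-circuit exit (assuming the first hit is all you need):

awk -F= -v s="_${1^^}_" '$1~s{print $2; exit}' file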
I want to check which lines of the file /etc/passwd end with the "/bin/bash" string (field number 7, ":" as delimiter).
So far, I've written the following code:
while read line
do
if [ $("$line" | cut -d : -f 7)=="/bin/bash" ]
then
echo $line | cut -d : -f 1
echo "\n"
fi
done < /etc/passwd
Currently, executing the script throws errors that suggest the syntax is being misinterpreted.
I'd appreciate if you could help me.
You MUST surround the == operator with spaces. [ and [[ do different things based on how many arguments are given:
if [ "$( echo "$line" | cut -d: -f7 )" == "/bin/bash" ]; ...
I would actually do this: parse the line into fields while you're reading it.
while IFS=: read -ra fields; do
    [[ ${fields[-1]} == "/bin/bash" ]] && printf "%s\n\n" "${fields[0]}"
done < /etc/passwd
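One caveat: negative subscripts like ${fields[-1]} need a reasonably recent bash (4.2 or newer). On older shells an explicit index does the same job, since /etc/passwd lines have exactly seven fields:

[[ ${fields[6]} == "/bin/bash" ]] && printf "%s\n\n" "${fields[0]}"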
This line is wrong:
if [ $("$line" | cut -d : -f 7)=="/bin/bash" ]
Also, this is not going to do what you want:
echo "\n"
Bash's echo doesn't interpret backslash-escaped characters without -e. If you want to print a newline, use plain echo, but notice that the previous echo:
echo $line | cut -d : -f 1
will already add a newline.
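If an explicit blank line really is wanted, printf is the portable choice:

printf '\n'    # always prints exactly one newline, no -e needed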
You should always check your scripts with
shellcheck. The correct script would be:
#!/usr/bin/env bash
while read -r line
do
    if [ "$(echo "$line" | cut -d : -f 7)" == "/bin/bash" ]
    then
        echo "$line" | cut -d : -f 1
    fi
done < /etc/passwd
But notice that you don't really need a loop, which is very slow; you could use the following awk one-liner:
awk -v FS=: '$7 == "/bin/bash" {print $1}' /etc/passwd
Instead of looping through the rows, and then checking for the /bin/bash part, why not use something like grep to get all the desired rows, like so:
grep ':/bin/bash$' /etc/passwd
Optionally, you can loop over the rows using a simple while:
grep ':/bin/bash$' /etc/passwd | while read -r line ; do
    echo "Processing $line"
done
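Note that with the pipe, the while loop runs in a subshell, so any variables set inside it are lost afterwards; feeding the loop from process substitution avoids that:

while read -r line ; do
    echo "Processing $line"
done < <(grep ':/bin/bash$' /etc/passwd)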
Don't do while read | cut. Use IFS as:
#!/bin/sh
while IFS=: read -r name passwd uid gid gecos home shell; do
    if test "$shell" = /bin/bash; then
        echo "$name"
    fi
done < /etc/passwd
But for this particular use case, it's probably better to do:
awk '$7 == "/bin/bash"{print $1}' FS=: /etc/passwd
The issue your code has is a common error. Consider the line:
if [ $("$line" | cut -d : -f 7)=="/bin/bash" ]
Assume you have a value in $line in which the final field is /bin/dash. The command substitution will insert the string /bin/dash, and bash will attempt to execute:
if [ /bin/dash==/bin/bash ]
since /bin/dash==/bin/bash is a non-empty string, the command [ /bin/dash==/bin/bash ] returns successfully. It does not perform any sort of string comparison. In order for [ to do a string comparison, you need to pass it 4 arguments. For example, [ /bin/dash = /bin/bash ] would fail. Note the 4 arguments to that call are /bin/dash, =, /bin/bash, and ]. [ is an incredibly bizarre command that requires its final argument to be ]. I strongly recommend never using it, and replacing it instead with its cousin test (very closely related; indeed, both test and [ used to be linked to the same executable), which behaves exactly the same but does not require its final argument to be ].
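A quick demonstration of the difference the argument count makes:

$ [ /bin/dash==/bin/bash ] && echo match    # one non-empty argument: always true
match
$ [ /bin/dash = /bin/bash ] && echo match   # a real string comparison: false, prints nothing
$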
I'm looking for a way to read a variable that has comma-separated fields (e.g., a,b,c,d,e,f) and generate another variable from it (e.g., a,'a',b,'b',c,'c',d,'d',e,'e',f,'f'). I have tried a for loop approach, but it adds a comma at the end.
Eg:
Var1=a,b,c,d,e,f
Expected output:
Var2=a,'a',b,'b',c,'c',d,'d',e,'e',f,'f'
for i in $(echo $Var1 | sed "s/,/ /g")
do
    Var2="$Var2$i,'$i',"
done
I'm getting Var2=a,'a',b,'b',c,'c',d,'d',e,'e',f,'f', which ends with a comma.
Is there any good approach to get it done without making more complex?
Thanks
DMP
Here is a way to do this in sed.
$ var1="a,b,c,d,e,f,"
$ var2=$(sed -e "s/[a-z]/&,\'&\'/g" -e 's/,$//g' <<<"$var1")
$ echo $var2
a,'a',b,'b',c,'c',d,'d',e,'e',f,'f'
The first -e expression in sed duplicates each single character with quotes around the copy, and the next -e removes the trailing comma.
The above will not work if var1 has multiple characters between the commas. For that, use sed's -E (extended regex) option:
$ var1='abc,192,hk3,def,HoZ,'
$ var2=$(sed -E -e "s/[a-zA-Z0-9]+/&,\'&\'/g" -e 's/,$//g' <<<"$var1")
$ echo "$var2"
abc,'abc',192,'192',hk3,'hk3',def,'def',HoZ,'HoZ'
You'll have to deal with an extra comma one way or another.
Here's what I'd offer as a solution. I'm also using an actual array to make sure we can process strings with spaces:
#!/usr/bin/env bash
# Input variable
VAR1=a,b,c,d,e,f
# Read VAR1 into an array
IFS=',' read -r -a VAR1_ARRAY <<< "${VAR1}"
VAR2=''
for EL in "${VAR1_ARRAY[@]}"; do
    VAR2="${VAR2},${EL},'${EL}'"
done
# Remove the leading comma
VAR2=${VAR2:1}
echo "${VAR2}"
The output:
a,'a',b,'b',c,'c',d,'d',e,'e',f,'f'
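The leading-comma trim can also be written with a prefix-stripping parameter expansion, equivalent for this input:

VAR2=${VAR2#,}    # strip one leading comma, if present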
I am trying to create a bash script that reads in a file containing a single line; whenever it encounters whitespace while reading that line, it should start a new output line and then continue reading.
File trying to read
david:69 jim:98 peter:21 sam:56 april:32
This is my current bash script
#!/bin/bash
fileName=$1 # storing the file
numberSpaces=$2 # will be used later to specify the number of spaces inbetween name:score
# Checking if no file was specified on the command line
[ $# -eq 0 ] && { echo "No file specified"; exit 1;}
# Checking if the file entered on the command line exists
[ ! -f $fileName ] && { echo "File $fileName not found."; exit 2;}
# Internal Field Separator (IFS) will read what is on the left and right of a specified char
while IFS=' ' read -r name;
do
echo "$name"
echo -e "\n"
done < $fileName
#while read -r line
#do
# name="$line"
# echo "Content of file - $name"
#done < "$fileName"
I am trying to get it to print to the screen
$ spacing.sh file.txt
$ david:69
$ jim:98
$ peter:21
$ sam:56
$ april:32
It is currently just printing the file contents on one line.
Any help or suggestion would be greatly appreciated
In one line:
tr -s " " "\n" < file.txt
You will need to use the -d option of read to make it read till the given delimiter:
while IFS= read -r -d ' ' name;
do
    echo "$name"
done < <(sed 's/$/ /' file.txt)
As per help read:
-d delim: continue until the first character of DELIM is read, rather than newline
sed is used to insert a space at the end of the line so that -d can work with space as the delimiter.
You can just split the line into an array, then iterate over that array.
read -r -a fields < "$fileName"
for field in "${fields[@]}"; do
    echo "$field"
done
There's no need to set the value of IFS here, because the default (space, tab, newline) is sufficient to split a line delimited by arbitrary whitespace. One example of using IFS would be to subsequently split the colon-delimited parts of each field into two separate variables:
IFS=: read -r name count <<< "$field" # field=david:69
# name=david
# count=69
I want to make a script for an automated setup for a multiseat system. First action is
lspci | grep -i 'vga\|graphic' | cut -b 1-7 > text.txt
Now i want to put the two lines of the file into variables. My dowdy solution was this:
VAR1=$(head -n 1 text.txt)
VAR2=$(tail -n 1 text.txt)
It works; however, there's probably a better solution for converting a text file line by line into variables.
The following should achieve exactly what you're doing, without the use of a temporary file
#!/bin/bash
{ read -r var1 _ && read -r var2 _; } < <(lspci | grep -i 'vga\|graphics')
Now, if you have several lines from lspci | grep -i 'vga\|graphics' (or just one, or none), you might want something more general, i.e., put the results in an array:
#!/bin/bash
var=()
while read -r f _; do var+=( "$f" ); done < <(lspci | grep -i 'vga\|graphics')
# display the content of var
declare -p var
If you have a recent version of Bash, and you love mapfile and awk (but who doesn't?), you could also do something like this:
#!/bin/bash
mapfile -t var < <(lspci | awk 'tolower($0) ~ /vga|graphics/ { print $1 }')
# display the content of var
declare -p var
For a Pure Bash possibility (except for lspci, of course):
#!/bin/bash
shopt -s extglob
var=()
while read -r v rest; do
    [[ ${rest,,} = *@(vga|graphics)* ]] && var+=( "$v" )
done < <(lspci)
# display var
declare -p var
This uses:
Lower case conversion of rest with ${rest,,}
Pattern matching and extended globs with *@(vga|graphics)* (to avoid regular expressions altogether); a quick standalone check follows below.
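A standalone check of that pattern (using a made-up lspci-style line):

$ shopt -s extglob
$ rest="VGA compatible controller: SomeVendor GPU"
$ [[ ${rest,,} = *@(vga|graphics)* ]] && echo match
match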
If you can format your text file into name value pairs, you could use bash associative arrays to store and reference each item. Note in this code = is used as the delimiter to separate the name value pair.
#read in fast config (name value pair file)
declare -A MYMAP
while read -r item; do
    NAME=$(cut -d "=" -f1 <<<"$item")
    VALUE=$(cut -d "=" -f2 <<<"$item")
    MYMAP["$NAME"]="$VALUE"
done <./config_file.txt
#size of map
MYMAP_N=${#MYMAP[@]}
#make a list of keys
KEYS=("${!MYMAP[@]}")
#dereference map
SELECTION="${MYMAP["my_first_key"]}"
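For example, with a hypothetical config_file.txt containing the line my_first_key=some_value, SELECTION above would end up holding some_value:

echo "$SELECTION"    # prints: some_value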
If the values do not contain spaces, in bash using the array variable:
declare -a vars
eval "vars=(`echo line1; echo line2`)" # the `echo ...` simulates your command
echo number of values: ${#vars[@]}
for ((I = 0; I < ${#vars[@]}; ++I )); do
    echo value \#$I is ${vars[$I]}
done
echo all values : ${vars[*]}
The trick is to generate the statement initializing the array with the values, and then eval it.
If the values have spaces/special characters, then escaping/quoting might be necessary.
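If eval feels risky (and with untrusted input it is), mapfile fills an array directly in bash 4+, with no quoting worries; a sketch using the same simulated command:

mapfile -t vars < <(echo line1; echo line2)
declare -p vars    # prints: declare -a vars=([0]="line1" [1]="line2")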
read VAR1 VAR2 < <(sed -n '1p;$p' myfile | tr '\n' ' ')
This ought to do what you need. It uses process substitution to print the lines you want and then redirects them into the variables. If you want different lines, just build this statement as needed with a loop, putting in whatever lines you want; if you want them all, use wc to count the lines, then build VAR1 .. VAR[n] and sed -n '1p;2p;3p..[n]p', and eval the built statement.