unix concatenate list of filenames into one line - linux

In a directory, there are several files, such as:
file1
file2
file3
Is there a simple way to concatenate those filenames to get one line (connected by "OR") in bash as follows:
file1 OR file2 OR file3
Or do I need to write a script for it?

You can use this function to print all filenames (including ones with spaces, newlines or special characters) with " OR " as the separator (assuming your filenames don't contain ASCII code 4):
orfiles() {
    local IFS=$'\4'
    local out="$*"
    echo "${out//$'\4'/ OR }"
}
Then call it as:
orfiles *
How it works:
We set IFS (the Internal Field Separator) to ASCII 4 locally inside the function.
We store the expansion of "$*" in the local variable out. This places \4 between the filenames in $out.
Finally, using bash string substitution, we globally replace \4 with " OR " while printing $out.
When joining "$*", bash uses only the first character of IFS as the delimiter, so IFS cannot hold the multi-character string " OR " and we have to do this in 2 steps as shown above.
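For example, with the three files from the question:
$ orfiles file1 file2 file3
file1 OR file2 OR file3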

You can simply do that with
printf '%s OR ' $(ls -1 *) | sed 's/ OR $//'; echo
where $(ls -1 *) expands to the filenames in the directory (note that this breaks on filenames containing whitespace).

One thing to consider is that a filename could contain whitespace.
Use the following ls + awk solution:
ls -1 * | awk '{ r=(r)? r" OR "$0 : $0 }END{ print r }'
Workaround for filenames with newline(s):
echo -e $(ls -1b hello* | awk -v RS= '{gsub(/\n/," OR ",$0); gsub(/\\ /," ",$0); print $0}')
-b - the ls option to print C-style escapes for nongraphic characters

ls -1|awk -v q='"' '{printf "%s%s", NR==1?"":" OR ", q $0 q}END{print ""}'
the ls & awk way to do it, with an example where a filename contains spaces:
kent$ ls -1
file1
file2
'file with OR and space'
kent$ ls -1|awk -v q='"' '{printf "%s%s", NR==1?"":" OR ", q $0 q}END{print ""}'
"file1" OR "file2" OR "file with OR and space"

$ for f in *; do printf '%s%s' "$s" "$f"; s=" OR "; done; printf '\n'
file1 OR file2 OR file3
Here the separator variable s is empty on the first iteration, so no leading " OR " is printed.

Related

Difficulty creating a .txt file from a loop in bash

I have this data:
cat >data1.txt <<'EOF'
2020-01-27-06-00;/dev/hd1;100;/
2020-01-27-12-00;/dev/hd1;100;/
2020-01-27-18-00;/dev/hd1;100;/
2020-01-27-06-00;/dev/hd2;200;/usr
2020-01-27-12-00;/dev/hd2;200;/usr
2020-01-27-18-00;/dev/hd2;200;/usr
EOF
cat >data2.txt <<'EOF'
2020-02-27-06-00;/dev/hd1;120;/
2020-02-27-12-00;/dev/hd1;120;/
2020-02-27-18-00;/dev/hd1;120;/
2020-02-27-06-00;/dev/hd2;230;/usr
2020-02-27-12-00;/dev/hd2;230;/usr
2020-02-27-18-00;/dev/hd2;230;/usr
EOF
cat >data3.txt <<'EOF'
2020-03-27-06-00;/dev/hd1;130;/
2020-03-27-12-00;/dev/hd1;130;/
2020-03-27-18-00;/dev/hd1;130;/
2020-03-27-06-00;/dev/hd2;240;/usr
2020-03-27-12-00;/dev/hd2;240;/usr
2020-03-27-18-00;/dev/hd2;240;/usr
EOF
I would like to create a .txt file for each filesystem (so hd1.txt, hd2.txt, hd3.txt and hd4.txt) and put in each .txt file the sum of the values for that FS from each dataX.txt. It's hard for me to explain in English what I want, so here is an example of the desired result.
Expected content for the output file hd1.txt:
2020-01;/dev/hd1;300;/
2020-02;/dev/hd1;360;/
2020-03;/dev/hd1;390;/
Expected content for the file hd2.txt:
2020-01;/dev/hd2;600;/usr
2020-02;/dev/hd2;690;/usr
2020-03;/dev/hd2;720;/usr
The implementation I've currently tried:
for i in $(cat *.txt | awk -F';' '{print $2}' | cut -d '/' -f3| uniq)
do
cat *.txt | grep -w $i | awk -F';' -v date="$(cat *.txt | awk -F';' '{print $1}' | cut -d'-' -f-2 | uniq )" '{sum+=$3} END {print date";"$2";"sum}' >> $i
done
But it doesn't work...
Can you show me how to do that?
Because the format appears to be quite regular, you can split the input on multiple separators and parse it easily in awk:
awk -v FS='[-;/]' '
prev != $9 {
    # the value column changed: flush the previous group (if any) and reset
    if (length(output)) {
        print output >> fileoutput
    }
    prev = $9
    sum = 0
}
{
    sum += $9
    output = sprintf("%s-%s;/%s/%s;%d;/%s", $1, $2, $7, $8, sum, $11)
    fileoutput = $8 ".txt"
}
END {
    print output >> fileoutput
}
' *.txt
Tested on repl, this generates:
+ cat hd1.txt
2020-01;/dev/hd1;300;/
2020-02;/dev/hd1;360;/
2020-03;/dev/hd1;390;/
+ cat hd2.txt
2020-01;/dev/hd2;600;/usr
2020-02;/dev/hd2;690;/usr
2020-03;/dev/hd2;720;/usr
Alternatively, you could use -v FS=';' and split() to take apart the first and second columns to extract the year, month and the hdX name, as sketched below.
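A sketch of that variant (not the answer's tested code; it accumulates the sums in memory and uses an order array so each hdX.txt keeps input order):
awk -F';' '
{
    split($1, d, "-")               # d[1]=year, d[2]=month
    n = split($2, p, "/")           # p[n] is the hdX name
    key = d[1] "-" d[2] ";" $2      # e.g. "2020-01;/dev/hd1"
    if (!(key in sum)) order[++m] = key
    sum[key] += $3
    path[key] = $4
    dev[key] = p[n]
}
END {
    for (i = 1; i <= m; i++) {
        k = order[i]
        print k ";" sum[k] ";" path[k] >> (dev[k] ".txt")
    }
}' *.txt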
If you seek a bash solution, I suggest you invert the loops - first iterate over files, then over identifiers in second column.
for file in *.txt; do
    prev=
    output=
    while IFS=';' read -r date dev num path; do
        hd=$(basename "$dev")
        if [[ "$hd" != "${prev:-}" ]]; then
            if ((${#output})); then
                printf "%s\n" "$output" >> "$fileoutput"
            fi
            sum=0
            prev="$hd"
        fi
        sum=$((sum + num))
        output=$(
            printf "%s;%s;%d;%s" \
                "$(cut -d'-' -f1-2 <<<"$date")" \
                "$dev" "$sum" "$path"
        )
        fileoutput="${hd}.txt"
    done < "$file"
    printf "%s\n" "$output" >> "$fileoutput"
done
You could also translate the awk almost 1:1 to bash by using IFS='-;/' in the while read loop, as sketched below.
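A sketch of that translation (same grouping logic as the awk version, keyed on the value column changing; the _ variables discard fields we don't need):
for file in *.txt; do
    prev=
    output=
    while IFS='-;/' read -r year month _ _ _ _ dev hd value _ path; do
        if [[ "$value" != "$prev" ]]; then
            [[ -n "$output" ]] && printf '%s\n' "$output" >> "$fileoutput"
            prev="$value"
            sum=0
        fi
        (( sum += value ))
        output="$year-$month;/$dev/$hd;$sum;/$path"
        fileoutput="$hd.txt"
    done < "$file"
    printf '%s\n' "$output" >> "$fileoutput"
done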

Passing two variables instead of two files to awk

Assume two multi-line text files that are dynamically generated during execution of a bash shell script: file1 and file2
$ echo -e "foo-bar\nbar-baz\nbaz-qux" > file1
$ cat file1
foo-bar
bar-baz
baz-qux
$ echo -e "foo\nbar\nbaz" > file2
$ cat file2
foo
bar
baz
Further assume that I wish to use awk involving an operation on the text strings of both files. For example:
$ awk 'NR==FNR{var1=$1;next} {print $var1"-"$1}' FS='-' file1 FS=' ' file2
Is there any way that I can skip having to save the text strings as files in my script and, instead, pass along the text strings to awk as variables (or as here-strings or the like)?
Something along the lines of:
$ var1=$(echo -e "foo-bar\nbar-baz\nbaz-qux")
$ var2=$(echo -e "foo\nbar\nbaz")
$ awk 'NR==FNR{var1=$1;next} {print $var1"-"$1}' FS='-' "$var1" FS=' ' "$var2"
# awk: fatal: cannot open file `foo-bar
# bar-baz
# baz-qux' for reading (No such file or directory)
$ awk '{print FILENAME, FNR, $0}' <(echo 'foo') <(echo 'bar')
/dev/fd/63 1 foo
/dev/fd/62 1 bar
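Applied to the strings from the question (a sketch using $'...' quoting in place of echo -e; the a[FNR] pairing is illustrative and matches the two inputs line by line):
$ var1=$'foo-bar\nbar-baz\nbaz-qux'
$ var2=$'foo\nbar\nbaz'
$ awk 'NR==FNR{a[FNR]=$1;next} {print a[FNR]"-"$1}' FS='-' <(printf '%s\n' "$var1") FS=' ' <(printf '%s\n' "$var2")
foo-foo
bar-bar
baz-baz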

Searching and selecting strings from a file

I'm having trouble separating a few exact 'fields' as strings and then putting them into a .txt file. I need to extract the 'nologin' users from the /etc/passwd file, and that is an easy step. I'm using this command:
grep -n 'nologin' /etc/passwd > file1.txt
The cat command then gives me, for example:
2:daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
3:bin:x:2:2:bin:/bin:/usr/sbin/nologin
4:sys:x:3:3:sys:/dev:/usr/sbin/nologin
and it is saved to file1.txt
Now I have to extract from file1.txt the number (2, 3, 4), login (daemon, bin, sys), UID and shell. It should look like this:
2:daemon:1:/usr/sbin/nologin
3:bin:2:/usr/sbin/nologin
4:sys:3:/usr/sbin/nologin
I also have to save that output to a *.txt file.
How can I achieve this?
You can use the cut command like this:
cut -d':' -f1,2,4,8 file1.txt > file2.txt
According to the man page:
-d, --delimiter=DELIM
use DELIM instead of TAB for field delimiter
-f, --fields=LIST
select only these fields; also print any line that contains no delimiter character, unless the -s option is specified
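With the grep -n output shown above, fields 1, 2, 4 and 8 are exactly the requested columns:
$ cut -d':' -f1,2,4,8 file1.txt
2:daemon:1:/usr/sbin/nologin
3:bin:2:/usr/sbin/nologin
4:sys:3:/usr/sbin/nologin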
I think you can use awk to get your specific fields.
1- Go to your terminal and make sure you know your file path.
2- Use the awk command, where -F indicates the separator and $N refers to field number N.
your file1.txt contains:
2:daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
3:bin:x:2:2:bin:/bin:/usr/sbin/nologin
4:sys:x:3:3:sys:/dev:/usr/sbin/nologin
using the awk command in your terminal with file1.txt:
awk -F '[:/]' '{print $1 " " $2 " " $4 " " $(NF-2) " " $(NF-1) " " $NF}' file1.txt
make sure your file path is correct
$1, $2, ... = the field (column) number
$NF = the last column
$(NF-1) = the column before the last column
and so on
" " = a literal space
your output will be:
2 daemon 1 usr sbin nologin
3 bin 2 usr sbin nologin
4 sys 3 usr sbin nologin
you can add the delimiters back in the awk command instead of spaces,
for example,
awk -F '[:/]' '{print $1 ":" $2 ":" $4 ":/" $(NF-2) "/" $(NF-1) "/" $NF}' file1.txt
output:
2:daemon:1:/usr/sbin/nologin
3:bin:2:/usr/sbin/nologin
4:sys:3:/usr/sbin/nologin
Thanks and have a good day!
awk -F: '{print $1,$2,$4,$NF}' OFS=: file1.txt > newfile.txt
2:daemon:1:/usr/sbin/nologin
3:bin:2:/usr/sbin/nologin
4:sys:3:/usr/sbin/nologin

bash, extract string from text file with space delimiter

I have a text files with a line like this in them:
MC exp. sig-250-0 events & $0.98 \pm 0.15$ & $3.57 \pm 0.23$ \\
sig-250-0 is something that can change from file to file (but I always know what it is for each file). There are lines before and after this one, but the string "MC exp. sig-250-0 events" is unique in the file.
For a particular file, is there a good way to extract the second number 3.57 in the above example using bash?
use awk for this:
awk '/MC exp. sig-250-0/ {print $10}' your.txt
Note that this will print $3.57 - with the leading $. If you don't like this, pipe the output to tr:
awk '/MC exp. sig-250-0/ {print $10}' your.txt | tr -d '$'
In comments you wrote that you need to call it in a script like this:
while read p ; do
echo $p,awk '/MC exp. sig-$p/ {print $10}' filename | tr -d '$'
done < grid.txt
Note that you need command substitution $() for the awk pipe, like this:
echo "$p",$(awk '/MC exp. sig-$p/ {print $10}' filename | tr -d '$')
If you want to pass a shell variable to the awk pattern use the following syntax:
awk -v p="MC exp. sig-$p" '$0 ~ p {print $10}' a.txt | tr -d '$'
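Putting it together, the loop from the comments becomes (grid.txt and filename are taken from the question):
while read -r p; do
    echo "$p,$(awk -v p="MC exp. sig-$p" '$0 ~ p {print $10}' filename | tr -d '$')"
done < grid.txt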
More context lines would've been nice, but I guess you'd like a simple awk approach.
awk '{print $N}' $file
If you don't tell awk which field separator to use, it splits on whitespace by default. Now you just have to count the fields to find the one you want. In your case it is the 10th:
awk '{print $10}' file.txt
$3.57
Don't want the $?
Pipe your awk result to cut:
awk '{print $10}' foo | cut -d'$' -f2
-d uses the $ as field separator and -f selects the second field.
If you know you always have the same number of fields, then
#!/bin/bash
file=$1
key=$2
while read -ra f; do
    if [[ "${f[0]} ${f[1]} ${f[2]} ${f[3]}" == "MC exp. $key events" ]]; then
        echo "${f[9]}"
    fi
done < "$file"
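Saved as e.g. extract.sh with a data file results.txt (both names are just for illustration), a session would look like this; note the leading $ is kept, so pipe through tr -d '$' if you don't want it:
$ ./extract.sh results.txt sig-250-0
$3.57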

awk - backticks in solaris to accept command line arg

I have this script that is meant to trim the field specified as an argument to the script,
i.e. sh script.sh file.txt "|" 2
#!/bin/bash
filename="$1"
delim="$2"
arg="$3"
gsubber="\"gsub("^[ \t]*|[ \t]*$","",'\$$arg')\""
myout=`nawk -F"$delim" -v fl="$gsubber" \'{ { fl } }1\' OFS="$delim" "$filename"`
echo "$myout"
So with this file 'file.txt' as input:
sid|storeNo|latitude
9| gerdy| fd¿kjhn422-405
0000543210 |gfdjk39
gfd|fd||fd
becomes this output:
sid|storeNo|latitude
9|gerdy| fd¿kjhn422-405
0000543210 |gfdjk39
gfd|fd||fd
I get this error:
nawk: syntax error at source line 1
context is
' <<<
missing }
nawk: bailing out at source line 1
Once someone can assist with providing the correct syntax, I should have no trouble extending it to support multiple fields, i.e. sh script.sh file.txt "|" 2 3 could then trim the 2nd and 3rd fields only.
Thanks in advance!
Try:
#!/bin/bash
filename=$1
delim=$2
arg=$3
regex='^[ \t]*|[ \t]*$'
myout=$(
    nawk -F"$delim" -v regex="$regex" -v arg="$arg" '
    { gsub(regex, "", $arg) }
    1' OFS="$delim" "$filename"
)
printf '%s\n' "$myout"
Edit:
In order to handle multiple fields in the arguments (see comments below):
#!/bin/bash
filename=$1
delim=$2
shift 2
args="$*"
regex='^[ \t]*|[ \t]*$'
myout=$(
    nawk -F"$delim" -v regex="$regex" -v args="$args" '{
        n = split(args, t, " ")
        for (i = 1; i <= n; i++)
            gsub(regex, "", $t[i])
    }1' OFS="$delim" "$filename"
)
printf '%s\n' "$myout"
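With this version the invocation from the question works as desired:
sh script.sh file.txt "|" 2 3
which trims the 2nd and 3rd fields only.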
this should work:
#!/bin/bash
filename="$1"
delim="$2"
arg="$3"
myout=`nawk -F"$delim" -v f="$arg" '{gsub(/^[ \t]*|[ \t]*$/,"",$f) }1' OFS="$delim" "$filename"`
echo "$myout"
You don't have to build the gsub call as a string, since in the gsub function call only the field index varies; you can pass the field index as a variable to awk.
