Line from bash command output stored in variable as string - linux

I'm trying to find a solution to a problem analogous to this one:
#command_A
A_output_Line_1
A_output_Line_2
A_output_Line_3
#command_B
B_output_Line_1
B_output_Line_2
Now I need to compare A_output_Line_2 and B_output_Line_1 and echo "Correct" if they are equal and "Not Correct" otherwise.
I guess the easiest way to do this is to copy a line of output into some variable and then, after executing the two commands, simply compare the variables and echo something.
I need to implement this in a bash script, and any information on how to get a certain line of output stored in a variable would help me put the pieces together.
Also, it would be cool if anyone could tell me not only how to copy/store a line, but also just a word or a sequence like: line 1, bytes 4-12, stored as a string in a variable.
I am not a complete beginner, but also nowhere near an advanced Linux bash user. Thanks in advance for any help, and sorry for my bad English!

An easier way might be to use diff, no?
Something like:
command_A > command_A.output
command_B > command_B.output
diff command_A.output command_B.output
This will work when you want to compare entire multi-line outputs.
But, since you want to know about single lines (and words in the lines), here are some pointers:
# first line of output of command_A
command_A | head -n 1
The -n 1 option says to print only the first line (the default is 10).
# second line of output of command_A
command_A | head -n 2 | tail -n 1
That will take the first two lines of the output of command_A and then the last of those two lines. Happy times :)
You can now store this information in a variable:
output_A=$(command_A | head -n 2 | tail -n 1)
output_B=$(command_B | head -n 1)
And then compare them:
if [ "$output_A" == "$output_B" ]; then echo 'Correct'; else echo 'Not Correct'; fi
To just get parts of a string, try looking into cut or (for more powerful stuff) sed and awk.
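For example, to pull out line 1, bytes 4-12 (the case you asked about), something like this should work (part_A is just an illustrative variable name):
# characters 4 through 12 of the first line of command_A's output
part_A=$(command_A | head -n 1 | cut -c4-12)
Note that cut -c selects characters; use cut -b if you literally mean bytes.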
Also, just learning a good general-purpose scripting language like Python or Ruby (or even Perl) can go a long way with this kind of problem.

Use the IFS (internal field separator) to separate on newlines and store the outputs in an array.
#!/bin/bash
# split only on newlines, so each output line becomes one array element
IFS='
'
array_a=( $(./a.sh) )
array_b=( $(./b.sh) )
if [ "${array_a[1]}" = "${array_b[0]}" ]; then
    echo "CORRECT"
else
    echo "INCORRECT"
fi
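If your bash is version 4 or newer, a mapfile variant sidesteps the glob expansion that the unquoted $( ) above is still subject to (a sketch, reusing the same a.sh/b.sh):
#!/bin/bash
# read each line of output into one array element, stripping the newlines
mapfile -t array_a < <(./a.sh)
mapfile -t array_b < <(./b.sh)
if [ "${array_a[1]}" = "${array_b[0]}" ]; then
    echo "CORRECT"
else
    echo "INCORRECT"
fi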

How do I add the first 2 letters of every line in a file to a list using bash?

I have a file ($ScriptName). I want the first 2 characters of every line to be in a list (Starters). I am using a bash script.
How would I do this?
I have declared my array like this:
array=() #Empty array
Using guidance from this: https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays
I am using Manjaro 19 and the latest kernel.
To get the first two characters from each line, you can use
cut -c1,2 "$ScriptName"
-c1,2 means "output characters in positions 1 and 2"
I'm not sure what you mean by a "list". If you just want to create a file with the results, use redirection:
cut -c1,2 "$ScriptName" > Starters
If you want to populate an array, just use
while IFS= read -r starter ; do Starters+=("$starter") ; done < <(cut -c1,2 "$ScriptName")
Moreover, if you're interested in letters rather than characters, you can use sed to remove non-letters from each line and then use the solution shown above.
sed 's/[^[:alpha:]]//g' "$ScriptName" | cut -c1,2
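Combining that with the array-populating loop above, a sketch would be:
while IFS= read -r starter ; do Starters+=("$starter") ; done < <(sed 's/[^[:alpha:]]//g' "$ScriptName" | cut -c1,2)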
Try this Shellcheck-clean (except for a missing initialization of ScriptName) pure Bash code:
Starters=()
while IFS= read -r line || [[ -n $line ]]; do
    Starters+=( "${line:0:2}" )
done < "$ScriptName"
See Arrays [Bash Hackers Wiki] for information about using arrays in Bash.
See BashFAQ/001 (How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?)
for information about reading files line-by-line in Bash.
See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) (particularly the bit about "range notation") for an explanation of ${line:0:2}.
The mapfile bash built-in command combined with cut makes it simple. Note that mapfile has to read from a redirection rather than sit at the receiving end of a pipe, or the array would be created in a subshell and lost:
mapfile -t Starters < <(cut -c1,2 "$ScriptName")
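You can then inspect the result with, for example:
printf '%s\n' "${Starters[@]}"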

Finding a line that shows in a file only once

Assume I have a file with 100 lines. Many lines repeat themselves in the file, and only one line does not.
I want to find the line that appears only once. Is there a command for that, or do I have to build some complicated loop as below?
My code so far:
#!/bin/bash
filename="repeat_lines.txt"
var="$(wc -l <$filename )"
echo "length:" $var
#cp ex4.txt ex4_copy.txt
for((index=1; index <= var; index++));
do
    one="$(head -n $index $filename | tail -1)"
    counter=0
    for((index2=1; index2 <= var; index2++));
    do
        two="$(head -n $index2 $filename | tail -1)"
        if [ "$one" == "$two" ]; then
            counter=$((counter+1))
        fi
    done
    echo "$one is $counter times in the text"
done
If I understood your question correctly, then
sort repeat_lines.txt | uniq -u should do the trick.
e.g. for file containing:
a
b
a
c
b
it will output c.
For further reference, see sort manpage, uniq manpage.
You've got a reasonable answer that uses standard shell tools sort and uniq. That's probably the solution you want to use, if you want something that is portable and doesn't require bash.
But an alternative would be to use functionality built into your bash shell. One method might be to use an associative array, which is a feature of bash 4 and above.
$ cat file.txt
a
b
c
a
b
$ declare -A lines
$ while read -r x; do ((lines[$x]++)); done < file.txt
$ for x in "${!lines[@]}"; do [[ ${lines["$x"]} -gt 1 ]] && unset lines["$x"]; done
$ declare -p lines
declare -A lines='([c]="1" )'
What we're doing here is:
declare -A creates the associative array. This is the bash 4 feature I mentioned.
The while loop reads each line of the file, and increments a counter that uses the content of a line of the file as the key in the associative array.
The for loop steps through the array, deleting any element whose counter is greater than 1.
declare -p prints the details of an array in a predictable, re-usable format. You could alternately use another for loop to step through the remaining array elements (of which there might be only one) in order to do something with them.
Note that this solution, while fine for small files (say, up to a few thousand lines), may not scale well for very large files of, say, millions of lines. Bash isn't the fastest at reading input this way, and one must be cognizant of memory limits when using arrays.
The sort alternative has the benefit of memory optimization using files on disk for extremely large files, at the expense of speed.
If you're dealing with files of only a few hundred lines, then it's hard to predict which solution will be faster. In the end, the form of output may dictate your choice of solution. The sort | uniq pipe generates a list to standard output. The bash solution above generates the same list as keys in an array. Otherwise, they are functionally equivalent.
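If you want a third option, here is an awk sketch (an illustration, not taken from the answers above) that reads the file twice: once to count each line, then once more to print the lines whose count is 1, preserving input order:
awk 'NR == FNR { count[$0]++; next } count[$0] == 1' file.txt file.txt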

Grep filtering output from a process after it has already started?

Normally when one wants to look at specific output lines from running something, one can do something like:
./a.out | grep IHaveThisString
but what if IHaveThisString is something that changes every time, so you first need to run it, watch the output to catch what IHaveThisString is on that particular run, and then grep for it? I could just dump to a file and grep it later, but is it possible to do something like backgrounding it and then bringing it back to the foreground, now piped to some grep? Something akin to:
./a.out
Ctrl-Z
fg | grep NowIKnowThisString
just wondering..
No, it is only in your screen buffer if you didn't save it in some other way.
Short form: You can do this, but you need to know that you need to do it ahead-of-time; it's not something that can be put into place interactively after-the-fact.
Write your script to determine what the string is. We'd need a more detailed example of the output format to give a better example of usage, but here's one for the trivial case where the entire first line is the filter target:
run_my_command | { read -r string_to_filter_for; fgrep -e "$string_to_filter_for"; }
Replace the read string_to_filter_for with as many commands as necessary to read enough input to determine what the target string is; this could be a loop if necessary.
For instance, let's say that the output contains the following:
Session id: foobar
and thereafter, you want to grep for lines containing foobar.
...then you can pipe through the following script:
re='Session id: (.*)'
while read -r; do
    if [[ $REPLY =~ $re ]] ; then
        target=${BASH_REMATCH[1]}
        break
    else
        # if you want to print the preamble; leave this out otherwise
        printf '%s\n' "$REPLY"
    fi
done
[[ $target ]] && grep -F -e "$target"
If you want to manually specify the filter target, this can be done by having the loop check for a file being created with filter contents, and using that when starting up grep afterwards.
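A rough sketch of that idea (the control-file path /tmp/filter_target is made up for illustration):
# pass output through until someone writes the target into the control file
while IFS= read -r line; do
    if [[ -s /tmp/filter_target ]]; then
        target=$(</tmp/filter_target)
        break
    fi
    printf '%s\n' "$line"
done
[[ $target ]] && grep -F -e "$target"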
What you need is a little unusual, but you can do it this way:
start a script session first;
then use the shell as usual;
start and interrupt your program;
then run grep over the typescript file.
Example:
$ script
$ ./a.out
Ctrl-Z
$ fg
$ grep NowIKnowThisString typescript
You could use a stream editor such as sed instead of grep. Here's an example of what I mean:
$ cat list
Name to look for: Mike
Dora 1
John 2
Mike 3
Helen 4
Here we find the name to look for in the first line and want to grep for it. Now piping the command to sed:
$ cat list | sed -ne '1{s/Name to look for: //;h}' \
> -e ':r;n;G;/^.*\(.\+\).*\n\1$/P;s/\n.*//;br'
Mike 3
Note: sed itself can take a file as a parameter, but here you're filtering a command's output rather than a text file, so this is how you'd use it.
Of course, you'd need to modify the command for your case.

How to convert the script from using command,read to command,cut?

Here is the test sample:
test_catalog,test_title,test_type,test_artist
And I can use the following script to split the text above on commas and set the variables respectively:
IFS=","
read cdcatnum cdtitle cdtype cdac < "$temp_file"
(PS: $temp_file is the path to the test sample.)
And if I want to replace the read with cut, any ideas?
There are many solutions:
line=$(head -1 "$temp_file")
echo $line | cut -d, ...
or
cut -d, ... <<< "$line"
or you can tell bash to copy the line into an array:
IFS=,
ARRAY=( $(head -1 "$temp_file") )
# use it
echo "${ARRAY[0]}" # test_catalog
echo "${ARRAY[1]}" # test_title
...
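To make the cut variants above concrete, a full invocation might look like this (the field number is chosen from the sample line):
echo "$line" | cut -d, -f2 # test_title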
I prefer the array solution because it gives you a distinct data type and clearly communicates your intent. The echo/cut solution is also somewhat slower.
[EDIT] On the other hand, the read command splits the line into individual variables, which gives each value a name. Which is more readable: ${ARRAY[0]} or $cdcatnum?
If you move columns around, you will just need to rearrange the arguments to the read command - if you use arrays, you will have to update all the array indices which you will get wrong.
Also read makes it much more simple to process the whole file:
while read -r cdcatnum cdtitle cdtype cdac ; do
....
done < "$temp_file"
man cut ?
But seriously, if you have something that works, why do you want to change it?
Personally, I'd probably use awk or perl to manipulate CSV files in linux.

Egrep acts strange with -f option

I've got a strangely acting egrep -f.
Example:
$ egrep -f ~/tmp/tmpgrep2 orig_20_L_A_20090228.txt | wc -l
3
$ for lines in `cat ~/tmp/tmpgrep2` ; do egrep $lines orig_20_L_A_20090228.txt ; done | wc -l
12
Could someone give me a hint what could be the problem?
No, the files did not change between executions. The expected answer for the egrep line count is 12.
UPDATE on file contents: the searched file contains ca. 13,000 lines, each 500 characters long; the pattern file contains 12 lines, each 24 characters long. The pattern always (and only) occurs at a fixed position in the searched file (columns 26-49).
UPDATE on pattern contents: every pattern in tmpgrep2 is a 24-character number.
If the search patterns are found on the same lines, then you can get the result you see:
Suppose you look for:
abc
def
ghi
jkl
and the data file is:
abcdefghijklmnoprstuvwxzy
then the one-time command will print 1 and the loop will print 4.
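A quick way to reproduce this (with throwaway files patterns.txt and data.txt):
$ printf 'abc\ndef\nghi\njkl\n' > patterns.txt
$ printf 'abcdefghijklmnoprstuvwxzy\n' > data.txt
$ egrep -f patterns.txt data.txt | wc -l
1
$ for p in `cat patterns.txt`; do egrep $p data.txt; done | wc -l
4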
Could it be that the lines read contain something that the shell is expanding/substituting for you in the second version? That expansion doesn't happen when grep reads the patterns itself, which would lead to a different set of patterns being matched.
I'm not totally sure the shell does any expansion on the variable value in an invocation like that, but it's an idea at least.
EDIT: Nope, it doesn't seem to do any substitutions. But it could be a quoting issue: if your patterns contain whitespace, the for loop will step through each token, not through each line. Take a look at the read bash builtin.
Do you have any duplicates in ~/tmp/tmpgrep2? Egrep will only use the dupes one time, but your loop will use each occurrence.
Get rid of dupes by doing something like this:
$ for lines in `sort < ~/tmp/tmpgrep2 | uniq` ; do egrep $lines orig_20_L_A_20090228.txt ; done | wc -l
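(sort -u ~/tmp/tmpgrep2 is a shorter way to write the sort | uniq pipeline here.)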
I second #unwind.
Why don't you run without wc -l and see what each search is finding?
And maybe:
for lines in `cat ~/tmp/tmpgrep2` ; do echo $lines ; done
Just to see how the shell is handling $lines.
The others have already come up with most of the things I would look at. The next thing I would check is the environment variable GREP_OPTIONS, or whatever it is called on your machine. I've gotten the strangest error messages or behaviors when using a command line argument that interfered with the environment settings.
