Here is the test sample:
test_catalog,test_title,test_type,test_artist
And I can use the following script to split the text above on commas and set the variables accordingly:
IFS=","
read cdcatnum cdtitle cdtype cdac < $temp_file
(PS: $temp_file is the path to the test sample.)
What if I want to replace read with the cut command? Any ideas?
There are many solutions:
line=$(head -1 "$temp_file")
echo $line | cut -d, ...
or
cut -d, ... <<< "$line"
or you can tell bash to copy the line into an array:
IFS=, read -r -a ARRAY < "$temp_file"
# use it
echo "${ARRAY[0]}" # test_catalog
echo "${ARRAY[1]}" # test_title
...
I prefer the array solution because it gives you a distinct data type and clearly communicates your intent. The echo/cut solution is also somewhat slower.
[EDIT] On the other hand, the read command splits the line into individual variables, which gives each value a name. Which is more readable: ${ARRAY[0]} or $cdcatnum?
If you move columns around, you just need to rearrange the arguments to the read command. With arrays, you have to update every array index, which is easy to get wrong.
Also, read makes it much simpler to process the whole file:
while IFS=, read -r cdcatnum cdtitle cdtype cdac ; do
....
done < "$temp_file"
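For the cut variant mentioned at the start of this answer, a minimal sketch might look like the following. The sample file is recreated with mktemp purely for illustration, and the sketch assumes no field contains an embedded comma, since cut knows nothing about CSV quoting.

```bash
#!/bin/bash
# Recreate the one-line sample; temp_file stands in for the real path.
temp_file=$(mktemp)
printf '%s\n' 'test_catalog,test_title,test_type,test_artist' > "$temp_file"

# Take the first line once, then cut out one field per variable.
line=$(head -n 1 "$temp_file")
cdcatnum=$(printf '%s\n' "$line" | cut -d, -f1)
cdtitle=$(printf '%s\n' "$line" | cut -d, -f2)
cdtype=$(printf '%s\n' "$line" | cut -d, -f3)
cdac=$(printf '%s\n' "$line" | cut -d, -f4)

echo "$cdcatnum / $cdtitle / $cdtype / $cdac"
rm -f "$temp_file"
```

Each cut call starts a separate process, which is one reason read tends to be the lighter option here.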
man cut ?
But seriously, if you have something that works, why do you want to change it?
Personally, I'd probably use awk or perl to manipulate CSV files in linux.
I have a file ($ScriptName). I want the first 2 characters of every line to be in a list (Starters). I am using a bash script.
How would I do this?
I have declared my array like this:
array=() #Empty array
Using guidance from this: https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays
I am using Manjaro 19 and the latest kernel.
To get the first two characters from each line, you can use
cut -c1,2 "$ScriptName"
-c1,2 means "output characters in positions 1 and 2"
I'm not sure what you mean by a "list". If you just want to create a file with the results, use redirection:
cut -c1,2 "$ScriptName" > Starters
If you want to populate an array, just use
while IFS= read -r starter ; do Starters+=("$starter") ; done < <(cut -c1,2 "$ScriptName")
Moreover, if you're interested in letters rather than characters, you can use sed to remove non-letters from each line and then use the solution shown above.
sed 's/[^[:alpha:]]//g' "$ScriptName" | cut -c1,2
Try this Shellcheck-clean (except for a missing initialization of ScriptName) pure Bash code:
Starters=()
while IFS= read -r line || [[ -n $line ]]; do
Starters+=( "${line:0:2}" )
done < "$ScriptName"
See Arrays [Bash Hackers Wiki] for information about using arrays in Bash.
See BashFAQ/001 (How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?)
for information about reading files line-by-line in Bash.
See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) (particularly the bit about "range notation") for an explanation of ${line:0:2}.
The mapfile bash built-in command combined with cut makes it simple (process substitution keeps mapfile in the current shell, so the array survives):
mapfile -t Starters < <(cut -c1,2 "$ScriptName")
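One thing to keep in mind with mapfile: in bash, each stage of a pipeline normally runs in a subshell, so an array filled at the end of a pipe is empty once the pipeline finishes. The sketch below compares that with process substitution. The sample lines and the Piped array are made up for illustration, and it assumes bash 4 or later with the lastpipe option left at its default (off).

```bash
#!/bin/bash
# Hypothetical sample input standing in for "$ScriptName".
ScriptName=$(mktemp)
printf '%s\n' 'echo hello' 'ls -l' 'pwd' > "$ScriptName"

# Piped: mapfile runs in a subshell, so Piped stays empty in this shell.
Piped=()
cut -c1,2 "$ScriptName" | mapfile -t Piped

# Process substitution: mapfile runs in the current shell.
mapfile -t Starters < <(cut -c1,2 "$ScriptName")

echo "piped: ${#Piped[@]} elements, substituted: ${#Starters[@]} elements"
printf '[%s]\n' "${Starters[@]}"
rm -f "$ScriptName"
```

The -t option strips the trailing newline from each element, which is usually what you want when the values are compared or printed later.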
Suppose I have a file with 100 lines. Many of the lines repeat, and only one line does not.
I want to find the line that appears only once. Is there a command for that, or do I have to build a complicated loop like the one below?
My code so far:
#!/bin/bash
filename="repeat_lines.txt"
var="$(wc -l <$filename )"
echo "length:" $var
#cp ex4.txt ex4_copy.txt
for((index=0; index < var; index++));
do
one="$(head -n $index $filename | tail -1)"
counter=0
for((index2=0; index2 < var; index2++));
do
two="$(head -n $index2 $filename | tail -1)"
if [ "$one" == "$two" ]; then
counter=$((counter+1))
fi
done
echo $one"is "$counter" times in the text: "
done
If I understood your question correctly, then
sort repeat_lines.txt | uniq -u should do the trick.
e.g. for file containing:
a
b
a
c
b
it will output c.
For further reference, see sort manpage, uniq manpage.
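As a quick check of the pipeline above, this sketch rebuilds the five-line example in a temporary file (the file and variable names are illustrative only). The sort step matters because uniq only compares adjacent lines.

```bash
#!/bin/bash
# Rebuild the example input: a, b, a, c, b.
sample=$(mktemp)
printf '%s\n' a b a c b > "$sample"

# sort groups identical lines together; uniq -u keeps lines that occur exactly once.
unique_lines=$(sort "$sample" | uniq -u)
echo "$unique_lines"

# uniq -c prints every distinct line prefixed with its count instead.
sort "$sample" | uniq -c
rm -f "$sample"
```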
You've got a reasonable answer that uses standard shell tools sort and uniq. That's probably the solution you want to use, if you want something that is portable and doesn't require bash.
But an alternative would be to use functionality built into your bash shell. One method might be to use an associative array, which is a feature of bash 4 and above.
$ cat file.txt
a
b
c
a
b
$ declare -A lines
$ while read -r x; do ((lines[$x]++)); done < file.txt
$ for x in "${!lines[@]}"; do [[ ${lines["$x"]} -gt 1 ]] && unset lines["$x"]; done
$ declare -p lines
declare -A lines='([c]="1" )'
What we're doing here is:
declare -A creates the associative array. This is the bash 4 feature I mentioned.
The while loop reads each line of the file, and increments a counter that uses the content of a line of the file as the key in the associative array.
The for loop steps through the array, deleting any element whose counter is greater than 1.
declare -p prints the details of an array in a predictable, re-usable format. You could alternately use another for loop to step through the remaining array elements (of which there might be only one) in order to do something with them.
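The steps above can also be sketched as a script that ends with a loop over the remaining keys. This is a variation, not the exact commands shown earlier: it collects the keys counted once into a second array (unique, a name chosen here) instead of unsetting the others, and it increments with a plain assignment, on the assumption that this is more forgiving than arithmetic on keys containing unusual characters. It requires bash 4 or later.

```bash
#!/bin/bash
# Requires bash 4 or later for associative arrays.
input=$(mktemp)
printf '%s\n' a b c a b > "$input"

declare -A count
while IFS= read -r line; do
    [[ -n $line ]] || continue     # an empty string is not a valid key
    count[$line]=$(( ${count[$line]:-0} + 1 ))
done < "$input"

# Collect the keys seen exactly once rather than deleting the others.
unique=()
for key in "${!count[@]}"; do
    if [[ ${count[$key]} -eq 1 ]]; then
        unique+=("$key")
    fi
done

printf '%s\n' "${unique[@]}"
rm -f "$input"
```

Keys of an associative array come back in no particular order, so sort the result if the order matters.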
Note that this solution, while fine for small files (say, up to a few thousand lines), may not scale well for very large files of, say, millions of lines. Bash isn't the fastest at reading input this way, and one must be cognizant of memory limits when using arrays.
The sort alternative has the benefit of memory optimization using files on disk for extremely large files, at the expense of speed.
If you're dealing with files of only a few hundred lines, then it's hard to predict which solution will be faster. In the end, the form of output may dictate your choice of solution. The sort | uniq pipe generates a list to standard output. The bash solution above generates the same list as keys in an array. Otherwise, they are functionally equivalent.
I'm trying to find a solution to a problem analogous to this one:
#command_A
A_output_Line_1
A_output_Line_2
A_output_Line_3
#command_B
B_output_Line_1
B_output_Line_2
Now I need to compare A_output_Line_2 and B_output_Line_1 and echo "Correct" if they are equal and "Not Correct" otherwise.
I guess the easiest way to do this is to copy a line of output in some variable and then after executing the two commands, simply compare the variables and echo something.
I need to implement this in a bash script, and any information on how to store a certain line of output in a variable would help me put the pieces together.
Also, it would be cool if anyone could tell me how to store not only a whole line but also just a word or a byte range, e.g. line 1, bytes 4-12, as a string in a variable.
I am not a complete beginner, but I am nowhere near an advanced Linux bash user either. Thanks in advance for any help, and sorry for my bad English!
An easier way might be to use diff, no?
Something like:
command_A > command_A.output
command_B > command_B.output
diff command_A.output command_B.output
This will work for comparing multiple strings.
But, since you want to know about single lines (and words in the lines) here are some pointers:
# first line of output of command_A
command_A | head -n 1
The -n 1 option limits the output to the first line (the default is 10 lines).
# second line of output of command_A
command_A | head -n 2 | tail -n 1
That takes the first two lines of the output of command_A and then keeps the last of those two lines. Happy times :)
You can now store this information in a variable:
export output_A=`command_A | head -n 2 | tail -n 1`
export output_B=`command_B | head -n 1`
And then compare it:
if [ "$output_A" == "$output_B" ]; then echo 'Correct'; else echo 'Not Correct'; fi
To just get parts of a string, try looking into cut or (for more powerful stuff) sed and awk.
Also, just learning a good general-purpose scripting language like Python or Ruby (even Perl) can go a long way with this kind of problem.
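To tie the pieces of this answer together, here is a minimal sketch that also covers the "line 1, bytes 4-12" part of the question. The two functions are stand-ins for the real commands and their output is invented. It uses sed -n 'Np' as an alternative to the head/tail pair for picking line N, and cut -b for the byte range.

```bash
#!/bin/bash
# Stand-ins for the real commands; their output is invented for this example.
command_A() { printf '%s\n' 'A_output_Line_1' 'shared value' 'A_output_Line_3'; }
command_B() { printf '%s\n' 'shared value' 'B_output_Line_2'; }

# sed -n 'Np' prints only line N.
output_A=$(command_A | sed -n '2p')
output_B=$(command_B | sed -n '1p')

if [ "$output_A" = "$output_B" ]; then
    verdict='Correct'
else
    verdict='Not Correct'
fi
echo "$verdict"

# Line 1, bytes 4-12, stored as a string.
fragment=$(command_A | sed -n '1p' | cut -b4-12)
echo "$fragment"
```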
Use the IFS (internal field separator) to separate on newlines and store the outputs in an array.
#!/bin/bash
IFS='
'
array_a=( $(./a.sh) )
array_b=( $(./b.sh) )
if [ "${array_a[1]}" = "${array_b[0]}" ]; then
echo "CORRECT"
else
echo "INCORRECT"
fi
I'd like to convert a list separated by '\n' into another one separated by spaces.
Ex:
Get a dictionary like ispell english dictionary. http://downloads.sourceforge.net/wordlist/ispell-enwl-3.1.20.zip
My initial idea was using a variable as accumulator:
a=""; cat american.0 | while read line; do a="$a $line"; done; echo $a
... but the result is just a '\n' string!!!
Questions:
Why is it not working?
What is the correct way to do that?
Thanks.
The problem is that when you have a pipeline:
command_1 | command_2
each command is run in a separate subshell, with a separate copy of the parent environment. So any variables that the command creates, or any modifications it makes to existing variables, will not be perceived by the containing shell.
In your case, you don't really need the pipeline, because this:
cat filename | command
is equivalent, in every way that you need, to this:
command < filename
So you can write:
a=""; while read line; do a="$a $line"; done < american.0; echo $a
to avoid creating any subshells.
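The difference is easy to see with a small experiment. In this sketch a three-word temporary file stands in for american.0, and the variable names piped and redirected are chosen here for illustration. It assumes bash with default settings, where the loop at the end of a pipe runs in a subshell.

```bash
#!/bin/bash
# A tiny word list standing in for american.0.
words=$(mktemp)
printf '%s\n' apple banana cherry > "$words"

# Pipeline: the loop runs in a subshell, so its changes to piped are lost.
piped=""
cat "$words" | while IFS= read -r line; do piped="$piped $line"; done

# Redirection: the loop runs in the current shell, so redirected keeps its value.
redirected=""
while IFS= read -r line; do redirected="$redirected $line"; done < "$words"

echo "piped:      [$piped]"
echo "redirected: [$redirected]"
rm -f "$words"
```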
That said, according to this StackOverflow answer, you can't really rely on a shell variable being able to hold more than about 1–4KB of data, so you probably need to rethink your overall approach. Storing the entire word-list in a shell variable likely won't work, and even if it does, it likely won't work well.
Edited to add: To create a temporary file named /tmp/american.tmp that contains what the variable $a would have, you can write:
while IFS= read -r line; do
printf %s " $line"
done < american.0 > /tmp/american.tmp
If you want to replace '\n' with a space, you can simply use tr as follows:
a=$(tr '\n' ' ' < american.0)
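One detail worth knowing: tr also turns the final newline into a space, so the result ends with a trailing space. If that matters, paste -s is a possible alternative, since it joins lines with the delimiter only between them. The sketch below uses a made-up three-word file in place of american.0.

```bash
#!/bin/bash
# A tiny word list standing in for american.0.
words=$(mktemp)
printf '%s\n' apple banana cherry > "$words"

# tr replaces every newline, including the last one, with a space.
with_tr=$(tr '\n' ' ' < "$words")

# paste -s joins all lines into one, separated by the -d delimiter.
with_paste=$(paste -s -d ' ' "$words")

printf '[%s]\n' "$with_tr" "$with_paste"
rm -f "$words"
```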
I'm trying to execute a command for each line coming from a cat command. I'm basing this on sample code I got from a vendor.
Here's the script:
for tbl in 'cat /tmp/tables'
do
echo $tbl
done
So I was expecting the output to be each line in the file. Instead I'm getting this:
cat
/tmp/tables
That's obviously not what I wanted.
I'm going to replace the echo with an actual command that interfaces with a database.
Any help in straightening this out would be greatly appreciated.
You are using the wrong type of quotes.
You need to use back-quotes rather than single quotes, so that the argument is run as a command and its output is fed to the for loop.
for tbl in `cat /tmp/tables`
do
echo "$tbl"
done
Also for better readability (if you are using bash), you can write it as
for tbl in $(cat /tmp/tables)
do
echo "$tbl"
done
If you expect to get each line (the for loops above give you each word), then you may be better off using xargs, like this
cat /tmp/tables | xargs -L1 echo
or as a loop
cat /tmp/tables | while read line; do echo "$line"; done
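The word-versus-line difference shows up as soon as a line contains a space. In this sketch the table names are invented and a temporary file stands in for /tmp/tables. The for loop counts words, while the read loop counts lines.

```bash
#!/bin/bash
# Two lines, three words; the names are made up for illustration.
tables=$(mktemp)
printf '%s\n' 'customer orders' 'invoices' > "$tables"

# for + command substitution splits on any whitespace: one iteration per word.
word_count=0
for tbl in $(cat "$tables"); do
    word_count=$((word_count + 1))
done

# while + read consumes one line per iteration.
line_count=0
while IFS= read -r tbl; do
    line_count=$((line_count + 1))
done < "$tables"

echo "for loop iterations:   $word_count"
echo "while loop iterations: $line_count"
rm -f "$tables"
```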
The single quotes should be backticks:
for tbl in `cat /etc/tables`
However, this iterates by word, not by line. To process the file line by line, try something like:
cat /etc/tables | while read -r line; do
echo "$line"
done
With while loop:
while read line
do
echo "$line"
done < "file"
while IFS= read -r tbl; do echo "$tbl" ; done < /etc/tables
read this.
You can do a lot of parsing in bash by redefining IFS (the Internal Field Separator), for example
IFS=$'\t\n' # Use $'...' quoting; plain double quotes do not interpret escape sequences.
for tbl in `cat /tmp/tables`
do
echo "$tbl"
done
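A quick way to check how bash treats escape sequences inside different quotes is to look at string lengths. The sketch below also splits a made-up tab-separated string with IFS set to tab and newline only. All variable names and data are illustrative, and the $'...' form is bash-specific ANSI-C quoting.

```bash
#!/bin/bash
literal="\t\n"      # double quotes: backslash, t, backslash, n
ansi=$'\t\n'        # ANSI-C quoting: a real tab and a real newline
echo "double quotes: ${#literal} characters"
echo "ANSI-C quotes: ${#ansi} characters"

# Splitting on tab/newline only keeps values with spaces in one piece.
data=$'first table\tsecond table'
old_ifs=$IFS
IFS=$'\t\n'
parts=($data)
IFS=$old_ifs

printf '[%s]\n' "${parts[@]}"
```

Restoring IFS afterwards, as done here with old_ifs, avoids surprising the rest of the script.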