How do I add the first 2 letters of every line in a file to a list using bash? - linux

I have a file ($ScriptName). I want the first 2 characters of every line to be in a list (Starters). I am using a bash script.
How would I do this?
I have declared my array like this:
array=() #Empty array
Using guidance from this: https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays
I am using manjaro 19 and the latest kernel.

To get the first two characters from each line, you can use
cut -c1,2 "$ScriptName"
-c1,2 means "output characters in positions 1 and 2"
I'm not sure what you mean by a "list". If you just want to create a file with the results, use redirection:
cut -c1,2 "$ScriptName" > Starters
If you want to populate an array, just use
while IFS= read -r starter ; do Starters+=("$starter") ; done < <(cut -c1,2 "$ScriptName")
Moreover, if you're interested in letters rather than characters, you can use sed to remove non-letters from each line and then use the solution shown above.
sed 's/[^[:alpha:]]//g' "$ScriptName" | cut -c1,2

Try this Shellcheck-clean (except for a missing initialization of ScriptName) pure Bash code:
Starters=()
while IFS= read -r line || [[ -n $line ]]; do
Starters+=( "${line:0:2}" )
done < "$ScriptName"
See Arrays [Bash Hackers Wiki] for information about using arrays in Bash.
See BashFAQ/001 (How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?)
for information about reading files line-by-line in Bash.
See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) (particularly the bit about "range notation") for an explanation of ${line:0:2}.

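The mapfile bash built-in command combined with cut makes it simple. Feed it through process substitution rather than a pipe: each part of a pipeline runs in a subshell, so an array filled there would be lost when the pipeline ends. The -t option strips the trailing newline from each element:
mapfile -t Starters < <(cut -c1,2 "$ScriptName")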
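A minimal, self-contained sketch of the array approach, assuming a throwaway sample file made with mktemp (its contents are invented for illustration). It feeds mapfile through process substitution so that Starters is filled in the current shell:

```shell
#!/bin/bash
# Hypothetical sample input; ScriptName mirrors the variable from the question.
ScriptName=$(mktemp)
printf '%s\n' 'alpha' 'bravo' 'c' > "$ScriptName"

# Process substitution keeps mapfile in the current shell, so the array
# survives; -t strips the trailing newline from every element.
mapfile -t Starters < <(cut -c1,2 "$ScriptName")

printf '%s\n' "${Starters[@]}"
rm -f "$ScriptName"
```

A line shorter than two characters (the "c" line here) is kept as-is, so the array still has one element per input line.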

Related

For loop in command line runs bash script reading from text file line by line

I have a bash script which asks for two arguments with a space between them. Now I would like to automate filling out the prompt in the command line with reading from a text file. The text file contains a list with the argument combinations.
So something like this in the command line I think;
for line in 'cat text.file' ; do script.sh ; done
Can this be done? What am I missing/doing wrong?
Thanks for the help.
A while loop is probably what you need. Put the space separated strings in the file text.file :
cat text.file
bingo yankee
bravo delta
Then write the script in question like below.
#!/bin/bash
while read -r arg1 arg2
do
/path/to/your/script.sh "$arg1" "$arg2"
done<text.file
Don't use for to read files line by line
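A small sketch of the while read approach that can be run without the real script. The function show_pair is a hypothetical stand-in for script.sh, and the list file is a temporary file with the two sample lines from above:

```shell
# Hypothetical stand-in for script.sh: reports the two arguments it received.
show_pair() { printf 'first=%s second=%s\n' "$1" "$2"; }

# Temporary list file holding one argument pair per line.
list=$(mktemp)
printf '%s\n' 'bingo yankee' 'bravo delta' > "$list"

# read splits each line on whitespace into arg1 and arg2.
out=$(while read -r arg1 arg2; do show_pair "$arg1" "$arg2"; done < "$list")
printf '%s\n' "$out"
rm -f "$list"
```

If a line held more than two words, the extra words would all end up in arg2, because read assigns the remainder of the line to the last variable.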
Try something like this:
#!/bin/bash
ARGS=
while IFS= read -r line; do
ARGS="${ARGS} ${line}"
done < ./text.file
script.sh "$ARGS"
This would add each line to a variable which then is used as the arguments of your script.
'cat text.file' is a string literal; $(cat text.file) would expand to the output of the command. However, cat is unnecessary because bash can read the file using redirection. Also, inside double quotes the expansion is treated as a single argument, and without quotes it is split on spaces, tabs and newlines.
Bash syntax to read a file line by line (this will be slow for big files):
while IFS= read -r line; do ... "$line"; done < text.file
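Setting IFS to the empty string for the read command preserves leading and trailing whitespace
the -r option keeps backslashes literal instead of treating them as escape characters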
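The effect of those two settings can be seen with a short sketch. The sample line is made up for illustration: two leading spaces followed by a literal backslash.

```shell
# Sample line: two leading spaces, then a literal backslash followed by t.
sample='  a\tb'

# Plain read: leading whitespace is trimmed and the backslash is consumed.
read plain <<EOF
$sample
EOF

# IFS= keeps the leading spaces; -r keeps the backslash.
IFS= read -r raw <<EOF
$sample
EOF

printf '[%s]\n' "$plain" "$raw"
```

The first bracketed value should come out as atb, while the second keeps the original line unchanged.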
Another way to read a whole file is content=$(<file); note the < inside the command substitution. So a creative way to read a file into an array, with each element a non-empty line:
read_to_array () {
local oldsetf=${-//[^f]} oldifs=$IFS
set -f
IFS=$'\n' array_content=($(<"$1")) IFS=$oldifs
[[ $oldsetf ]]||set +f
}
read_to_array "file"
for element in "${array_content[@]}"; do ...; done
oldsetf used to store current set -f or set +f setting
oldifs used to store current IFS
IFS=$'\n' to split on newlines (multiple newlines will be treated as one)
set -f avoid glob expansion for example in case line contains single *
note () around $() to store the result of splitting to an array
If I were to create a solution determined by the literal of what you ask for (using a for loop and parsing lines from a file) I would use iterations determined by the number of lines in the file (if it isn't too large).
Assuming each line has two strings separated by a single space (to be used as positional parameters in your script):
file="$1"
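f_count="$(wc -l < "$file")"
for line in $(seq 1 "$f_count")
do
# the command substitution is left unquoted on purpose so the line splits into two arguments
script.sh $(head -n "$line" "$file" | tail -n 1) && wait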
done
You may have a much better time using sjsam's solution however.

Bash: How to extract numbers preceded by _ and followed by

I have the following format for filenames: filename_1234.svg
How can I retrieve the numbers preceded by an underscore and followed by a dot. There can be between one to four numbers before the .svg
I have tried:
width=${fileName//[^0-9]/}
but if the fileName contains a number as well, it will return all numbers in the filename, e.g.
file6name_1234.svg
I found solutions for two underscores (and splitting it into an array), but I am looking for a way to check for the underscore as well as the dot.
You can use simple parameter expansion with substring removal to simply trim from the right up to, and including, the '.', then trim from the left up to, and including, the '_', leaving the number you desire, e.g.
$ width=filename_1234.svg; val="${width%.*}"; val="${val##*_}"; echo $val
1234
note: # trims from left to first-occurrence while ## trims to last-occurrence. % and %% work the same way from the right.
Explained:
width=filename_1234.svg - width holds your filename
val="${width%.*}" - val holds filename_1234
val="${val##*_}" - finally val holds 1234
Of course, there is no need to use a temporary value like val if your intent is that width should hold the width. I just used a temp to protect against changing the original contents of width. If you want the resulting number in width, just replace val with width everywhere above and operate directly on width.
note 2: using shell capabilities like parameter expansion prevents creating a separate subshell and spawning a separate process that occurs when using a utility like sed, grep or awk (or anything that isn't part of the shell for that matter).
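The four trimming operators can be compared side by side in a short sketch. The string a_b_c and the variable names are chosen only for illustration; fileName is the example from the question.

```shell
s='a_b_c'
first_left=${s#*_}     # shortest "*_" prefix removed -> b_c
last_left=${s##*_}     # longest "*_" prefix removed  -> c
first_right=${s%_*}    # shortest "_*" suffix removed -> a_b
last_right=${s%%_*}    # longest "_*" suffix removed  -> a

fileName='file6name_1234.svg'
width=${fileName%.*}   # drop the extension -> file6name_1234
width=${width##*_}     # drop everything through the last underscore -> 1234

printf '%s\n' "$first_left" "$last_left" "$first_right" "$last_right" "$width"
```

The digit inside file6name does not matter here, because only the text after the last underscore is kept.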
Try the following code :
filename="filename_6_1234.svg"
if [[ "$filename" =~ ^(.*)_([^.]*)\..*$ ]];
then
echo "${BASH_REMATCH[0]}" #will display 'filename_6_1234.svg'
echo "${BASH_REMATCH[1]}" #will display 'filename_6'
echo "${BASH_REMATCH[2]}" #will display '1234'
fi
Explanation :
=~ : bash operator for regex comparison
^(.*)_([^.]*)\..*$ : we look for any characters, followed by an underscore, followed by any characters other than a dot, followed by a dot and an extension. We create 2 capture groups, one for the text before the last underscore and one for the text between that underscore and the dot
BASH_REMATCH : array containing the captured groups
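A stricter variant of the same idea, as a sketch: it accepts only one to four digits between the underscore and the dot, as the question describes. The function name extract_number and the pattern are assumptions for illustration, not part of the answer above.

```shell
#!/bin/bash
# Hypothetical helper: prints the 1-4 digits that sit between an underscore
# and the final dot of the name; returns non-zero when there is no match.
extract_number() {
    local re='_([0-9]{1,4})\.[^.]*$'
    [[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

extract_number 'file6name_1234.svg'
extract_number 'filename_6_1234.svg'
```

Keeping the pattern in a variable and leaving it unquoted on the right-hand side of =~ avoids quoting problems, since a quoted pattern is matched as a literal string.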
Some more ways
[akshay@localhost tmp]$ filename=file1b2aname_1234.svg
[akshay@localhost tmp]$ after=${filename##*_}
[akshay@localhost tmp]$ echo ${after//[^0-9]}
1234
Using awk
[akshay@localhost tmp]$ awk -F'[_.]' '{print $2}' <<< "$filename"
1234
I would use
sed 's!_! !g' | awk '{print "_" $NF}'
to get from filename_1234.svg to _1234.svg then
sed 's!\.svg$!!'
to get rid of the extension.
If you set IFS, you can use Bash's built-in read.
This splits the filename by underscores and dots and stores the result in the array a.
IFS='_.' read -a a <<<'file1b2aname_1234.svg'
And this takes the second last element from the array.
echo ${a[-2]}
There's a solution using cut:
name="file6name_1234.svg"
num=$(echo "$name" | cut -d '_' -f 2 | cut -d '.' -f 1)
echo "$num"
-d is for specifying a delimiter.
-f refers to the desired field.
I don't know anything about performance but it's simple to understand and simple to maintain.

formatting issue in printf script

I have a file stv.txt containing some names
For example stv.txt is as follows:
hello
world
I want to generate another file by using these names and adding some extra text to them. I have written a script as follows:
for i in `cat stvv.txt`;
do printf 'if(!strcmp("$i",optarg))' > my_file;
done
output
if(!strcmp("$i",optarg))
desired output
if(!strcmp("hello",optarg))
if(!strcmp("world",optarg))
how can I get the correct result?
This is a working solution.
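1. Everything inside single quotes is treated as a literal string, so $i is not expanded there.
2. Close the single quotes before $i and reopen them after it, so that the shell expands the variable.
The code below should fix it:
for i in `cat stvv.txt`;
do printf 'if(!strcmp("'"$i"'",optarg))\n';
done > my_file
Basically, break the printf statement into three parts:
1: the string 'if(!strcmp("'
2: "$i" (outside the single quotes, so it is expanded)
3: the string '",optarg))\n'
The redirection is placed after done so that my_file is written once instead of being overwritten on every iteration.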
hope that helps!
To insert a string into a printf format, use %s in the format string:
$ for line in $(cat stvv.txt); do printf 'if(!strcmp("%s",optarg))\n' "$line"; done
if(!strcmp("hello",optarg))
if(!strcmp("world",optarg))
The code $(cat stvv.txt) will perform word splitting and pathname expansion on the contents of stvv.txt. You probably don't want that. It is generally safer to use a while read ... done <stvv.txt loop such as this one:
$ while read -r line; do printf 'if(!strcmp("%s",optarg))\n' "$line"; done <stvv.txt
if(!strcmp("hello",optarg))
if(!strcmp("world",optarg))
Aside on cat
If you are using bash, then $(cat stvv.txt) could be replaced with the more efficient $(<stvv.txt). This question, however, is tagged shell not bash. The cat form is POSIX and therefore portable to all POSIX shells while the bash form is not.
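A runnable sketch of the %s approach, assuming a temporary file stands in for stvv.txt. One invented name contains a percent sign, to show that data passed as an argument is not interpreted as part of the format:

```shell
# Temporary stand-in for stvv.txt; the third name is made up to include a '%'.
names=$(mktemp)
printf '%s\n' 'hello' 'world' 'odd%name' > "$names"

# The format string is fixed; each name is passed as data through %s.
out=$(while IFS= read -r name; do
    printf 'if(!strcmp("%s",optarg))\n' "$name"
done < "$names")

printf '%s\n' "$out"
rm -f "$names"
```

Splicing the name directly into the format string would make printf treat the % in odd%name as the start of a conversion specification.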

Line from bash command output stored in variable as string

I'm trying to find a solution to a problem analogous to this one:
#command_A
A_output_Line_1
A_output_Line_2
A_output_Line_3
#command_B
B_output_Line_1
B_output_Line_2
Now I need to compare A_output_Line_2 and B_output_Line_1 and echo "Correct" if they are equal and "Not Correct" otherwise.
I guess the easiest way to do this is to copy a line of output in some variable and then after executing the two commands, simply compare the variables and echo something.
This I need to implement in a bash script and any information on how to get certain line of output stored in a variable would help me put the pieces together.
Also, it would be cool if anyone can tell me not only how to copy/store a line, but probably just a word or sequence like : line 1, bytes 4-12, stored like string in a variable.
I am not a complete beginner but also not anywhere near advanced linux bash user. Thanks to any help in advance and sorry for bad english!
An easier way might be to use diff, no?
Something like:
command_A > command_A.output
command_B > command_B.output
diff command_A.output command_B.output
This will work for comparing multiple strings.
But, since you want to know about single lines (and words in the lines) here are some pointers:
# first line of output of command_A
command_A | head -n 1
The -n 1 option says to print only the first line (the default is 10).
# second line of output of command_A
command_A | head -n 2 | tail -n 1
that will take the first two lines of the output of command_A and then the last of those two lines. Happy times :)
You can now store this information in a variable:
export output_A=`command_A | head -n 2 | tail -n 1`
export output_B=`command_B | head -n 1`
And then compare it:
if [ "$output_A" == "$output_B" ]; then echo 'Correct'; else echo 'Not Correct'; fi
To just get parts of a string, try looking into cut or (for more powerful stuff) sed and awk.
Also, just learning a good general purpose scripting language like python or ruby (even perl) can go a long way with this kind of problem.
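For the "line 1, bytes 4-12" part of the question, a sketch using sed and cut. The function command_A is a hypothetical stand-in that prints three made-up lines:

```shell
# Hypothetical stand-in for the real command_A.
command_A() { printf '%s\n' 'first line of text' 'second line' 'third line'; }

# sed -n '2p' prints only the second line of the output.
second=$(command_A | sed -n '2p')

# Take line 1, then keep character positions 4 through 12 of it.
chunk=$(command_A | head -n 1 | cut -c4-12)

printf '%s\n' "$second" "$chunk"
```

For plain ASCII text, character positions and byte positions coincide; cut also has a -b option that selects by byte explicitly.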
Use the IFS (internal field separator) to separate on newlines and store the outputs in an array.
#!/bin/bash
IFS='
'
array_a=( $(./a.sh) )
array_b=( $(./b.sh) )
if [ "${array_a[1]}" = "${array_b[0]}" ]; then
echo "CORRECT"
else
echo "INCORRECT"
fi
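An alternative sketch that avoids changing IFS globally: mapfile reads one line per array element and performs no word splitting or glob expansion. The functions cmd_a and cmd_b are hypothetical stand-ins for ./a.sh and ./b.sh, with invented output that includes a * on purpose:

```shell
#!/bin/bash
# Hypothetical stand-ins for ./a.sh and ./b.sh.
cmd_a() { printf '%s\n' 'one' 'match *' 'three'; }
cmd_b() { printf '%s\n' 'match *' 'other'; }

# One array element per output line; -t drops the trailing newlines.
mapfile -t array_a < <(cmd_a)
mapfile -t array_b < <(cmd_b)

# Quoting the right-hand side makes == a literal string comparison.
if [[ ${array_a[1]} == "${array_b[0]}" ]]; then
    result=CORRECT
else
    result=INCORRECT
fi
echo "$result"
```

With the unquoted array=( $(...) ) form, a line containing * could be expanded to file names unless globbing is turned off with set -f.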

How to convert the script from using command,read to command,cut?

Here is the test sample:
test_catalog,test_title,test_type,test_artist
And I can use the following script to split the text above on commas and set the variables respectively:
IFS=","
read cdcatnum cdtitle cdtype cdac < $temp_file
(P.S. $temp_file is the path of the test sample.)
How can I replace read with the cut command? Any ideas?
There are many solutions:
line=$(head -n 1 "$temp_file")
echo "$line" | cut -d, ...
or
cut -d, ... <<< "$line"
or you can tell Bash to read the line into an array (read -a is the Bash way; set -A is ksh syntax):
IFS=, read -r -a ARRAY < "$temp_file"
# use it
echo "${ARRAY[0]}" # test_catalog
echo "${ARRAY[1]}" # test_title
...
I prefer the array solution because it gives you a distinct data type and clearly communicates your intent. The echo/cut solution is also somewhat slower.
[EDIT] On the other hand, the read command splits the line into individual variables which gives each value a name. Which is more readable: ${ARRAY[0]} or $cdcatnum?
If you move columns around, you will just need to rearrange the arguments to the read command - if you use arrays, you will have to update all the array indices which you will get wrong.
Also, read makes it much simpler to process the whole file:
while IFS=, read -r cdcatnum cdtitle cdtype cdac ; do
....
done < "$temp_file"
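To answer the question as literally asked, a sketch that fills the same four variables with cut instead of read. The temporary file is an assumption standing in for the real $temp_file:

```shell
# Temporary stand-in for the sample file from the question.
temp_file=$(mktemp)
printf '%s\n' 'test_catalog,test_title,test_type,test_artist' > "$temp_file"

# Take the first line once, then pick each comma-separated field by number.
line=$(head -n 1 "$temp_file")
cdcatnum=$(printf '%s\n' "$line" | cut -d, -f1)
cdtitle=$(printf '%s\n' "$line" | cut -d, -f2)
cdtype=$(printf '%s\n' "$line" | cut -d, -f3)
cdac=$(printf '%s\n' "$line" | cut -d, -f4)

rm -f "$temp_file"
printf '%s\n' "$cdcatnum" "$cdtitle" "$cdtype" "$cdac"
```

This starts one cut process per field, which is part of why the read version is usually the simpler and faster choice.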
man cut ?
But seriously, if you have something that works, why do you want to change it?
Personally, I'd probably use awk or perl to manipulate CSV files in linux.
