Concatenating strings in a shell script with an accumulator - string

I'd like to convert a list separated by '\n' into another one separated by spaces.
Ex:
Get a dictionary like ispell english dictionary. http://downloads.sourceforge.net/wordlist/ispell-enwl-3.1.20.zip
My initial idea was using a variable as accumulator:
a=""; cat american.0 | while read line; do a="$a $line"; done; echo $a
... but the result is just a '\n' (an empty string)!
Questions:
Why is it not working?
What is the correct way to do that?
Thanks.

The problem is that when you have a pipeline:
command_1 | command_2
each command is run in a separate subshell, with a separate copy of the parent environment. So any variables that the command creates, or any modifications it makes to existing variables, will not be perceived by the containing shell.
In your case, you don't really need the pipeline, because this:
cat filename | command
is equivalent, in every way that you need, to this:
command < filename
So you can write:
a=""; while read line; do a="$a $line"; done < american.0; echo $a
to avoid creating any subshells.
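As a rough illustration of the difference, the sketch below runs the same loop both ways. The file contents and the variable names words_file, piped and redirected are made up for the demo. It assumes bash or a POSIX sh such as dash; ksh and zsh run the last pipeline stage in the current shell and behave differently.

```shell
# Hypothetical three-word list standing in for american.0
words_file=$(mktemp)
printf '%s\n' alpha beta gamma > "$words_file"

# Pipeline: the loop runs in a subshell, so its changes to $piped are lost
piped=""
cat "$words_file" | while read -r line; do piped="$piped $line"; done

# Redirection: the loop runs in the current shell, so $redirected keeps its value
redirected=""
while read -r line; do redirected="$redirected $line"; done < "$words_file"

echo "piped=[$piped]"
echo "redirected=[$redirected]"
rm -f "$words_file"
```

Only the second variable should still hold the words after its loop ends.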
That said, according to this StackOverflow answer, you can't really rely on a shell variable being able to hold more than about 1–4KB of data, so you probably need to rethink your overall approach. Storing the entire word-list in a shell variable likely won't work, and even if it does, it likely won't work well.
Edited to add: To create a temporary file named /tmp/american.tmp that contains what the variable $a would have, you can write:
while IFS= read -r line; do
printf %s " $line"
done < american.0 > /tmp/american.tmp

If you want to replace '\n' with a space, you can simply use tr as follows:
a=$(tr '\n' ' ' < american.0)
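One detail worth knowing: tr also turns the final newline into a space, so the result ends with a trailing space. The sketch below compares it with paste -s, which joins lines without that trailing separator. The file contents and the names words_file, with_tr and with_paste are hypothetical.

```shell
# Hypothetical three-word list standing in for american.0
words_file=$(mktemp)
printf '%s\n' alpha beta gamma > "$words_file"

with_tr=$(tr '\n' ' ' < "$words_file")        # the final newline becomes a trailing space
with_paste=$(paste -s -d ' ' "$words_file")   # joins the lines, no trailing space

printf '[%s]\n' "$with_tr" "$with_paste"
rm -f "$words_file"
```

Whether the trailing space matters depends on what consumes the variable afterwards.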

Related

formatting issue in printf script

I have a file stv.txt containing some names
For example stv.txt is as follows:
hello
world
I want to generate another file by using these names and adding some extra text to them. I have written a script as follows:
for i in `cat stvv.txt`;
do printf 'if(!strcmp("$i",optarg))' > my_file;
done
output
if(!strcmp("$i",optarg))
desired output
if(!strcmp("hello",optarg))
if(!strcmp("world",optarg))
how can I get the correct result?
This is a working solution.
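1. Everything inside single quotes is treated as a literal string, so $i is not expanded there.
2. When using printf this way, close the single quotes before the variable and reopen them after it (in this example).
The code below should fix it:
for i in `cat stvv.txt`;
do printf 'if(!strcmp("'$i'",optarg))\n' >> my_file;
done
Basically, break the printf argument into three parts:
1: the string 'if(!strcmp("'
2: $i (outside the single quotes, so the shell expands it)
3: the string '",optarg))\n'
Note the >> instead of >, so that each iteration appends to my_file rather than overwriting it.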
hope that helps!
To insert a string into a printf format, use %s in the format string:
$ for line in $(cat stvv.txt); do printf 'if(!strcmp("%s",optarg))\n' "$line"; done
if(!strcmp("hello",optarg))
if(!strcmp("world",optarg))
The code $(cat stvv.txt) will perform word splitting and pathname expansion on the contents of stvv.txt. You probably don't want that. It is generally safer to use a while read ... done <stvv.txt loop such as this one:
$ while read -r line; do printf 'if(!strcmp("%s",optarg))\n' "$line"; done <stvv.txt
if(!strcmp("hello",optarg))
if(!strcmp("world",optarg))
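A small sketch of two properties of printf that make the %s form convenient: the format is reused for every remaining argument, and only the format string is interpreted, so a value containing % passes through untouched. The names generated and tricky and the sample values are made up for illustration.

```shell
# Hypothetical names standing in for the contents of stvv.txt
generated=$(printf 'if(!strcmp("%s",optarg))\n' hello world)
echo "$generated"

# A value containing % is safe as an argument, because only the format is interpreted
tricky=$(printf 'if(!strcmp("%s",optarg))\n' '100%d')
echo "$tricky"
```

Splicing the variable directly into the format string would instead treat the %d in the second value as a conversion.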
Aside on cat
If you are using bash, then $(cat stvv.txt) could be replaced with the more efficient $(<stvv.txt). This question, however, is tagged shell not bash. The cat form is POSIX and therefore portable to all POSIX shells while the bash form is not.

Extract substrings from a file and store them in shell variables

I am working on a script. I have a file called test.txt whose contents are as follows:
a. parent = 192.168.1.2
b. child1 = 192.168.1.21
c. child2 = 192.154.1.2
I need to store the values in three different variables called parent, child1 and child2 as follows and then my script will use these values:
parent = 192.168.1.2
child1= 192.168.1.21
child2= 192.154.1.2
How can I do that using sed or awk? I know there is a way to extract substrings using the awk function substr, but my particular requirement is to store them in variables as mentioned above. Thanks
Try this if you're using bash:
$ declare $(awk '{print $2"="$4}' file)
$ echo "$parent"
192.168.1.2
If the file contained white space in the values you want to init the variables with then you'd just have to set IFS to a newline before invoking declare, e.g. (simplified the input file to highlight the important part of white space on the right of the = signs):
$ cat file
parent=192.168.1.2 is first
child1=192.168.1.21 comes after it
child2=and then theres 192.154.1.2
$ IFS=$'\n'; declare $(awk -F'=' '{print $1"="$2}' file)
$ echo "$parent"
192.168.1.2 is first
$ echo "$child1"
192.168.1.21 comes after it
Ed Morton's answer is the way to go for the specific problem at hand - elegant and concise.
Update: Ed has since updated his answer to also provide a solution that correctly deals with variable values with embedded spaces - the original lack of which prompted this answer.
His solution is superior to this one - more concise and more efficient (the only caveat is that you may have to restore the previous $IFS value afterward).
This solution may still be of interest if you need to process variable definitions one by one, e.g., in order to transform variable values based on other shell functions or variables before assigning them.
The following uses bash with process substitution on a simplified problem to process variable definitions one by one:
#!/usr/bin/env bash
while read -r name val; do # read a name-value pair
# Assign the value after applying a transformation to it; e.g.:
# 'value of' -> 'value:'
declare $name="${val/ of /: }" # `declare "$name=${val/ of /: }"` would work too.
done < <(awk -F= '{print $1, $2}' <<<$'v1=value of v1\nv2= value of v2')
echo "v1=[$v1], v2=[$v2]" # -> 'v1=[value: v1], v2=[value: v2]'
awk's output lines are read line by line, split into name and value, and declared as shell variables individually.
Since read, which trims by whitespace, is only given 2 variable names to read into, the 2nd one receives everything from the 2nd token through the end of the line, thus preserving interior whitespace (and, as written, will trim leading and trailing whitespace in the process).
Note that declare normally does not require a variable reference on the RHS of the assignment (the value) to be double-quoted (e.g. a=$b; though it never hurts). In this particular case, however - seemingly because the LHS (the name) is also a variable reference - the double quotes are needed.
I also got it done finally. Thanks everyone for helping.
counter=0
while read line
do
declare $(echo $line | awk '{print $2"="$4}')
#echo "$parent"
if [ "$counter" -eq 0 ]
then
parent=$(echo $parent)
elif [ "$counter" -eq 1 ]
then
child1=$(echo $child1)
else
child2=$(echo $child2)
fi
counter=$((counter+1))
done < "/etc/cluster_info.txt"
eval "$( sed 's/..//;s/ *//g' YourFile )"
Just a sed equivalent of Ed's solution, with eval instead of declare.
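Both eval and an unquoted declare trust whatever the file contains. If the file might be edited by someone else, a more defensive variant is to accept only the names you expect. The following is a minimal POSIX sh sketch under that assumption; the temp file stands in for test.txt, and cluster_file, _label and _eq are hypothetical names.

```shell
# Hypothetical copy of test.txt from the question
cluster_file=$(mktemp)
printf '%s\n' 'a. parent = 192.168.1.2' \
              'b. child1 = 192.168.1.21' \
              'c. child2 = 192.154.1.2' > "$cluster_file"

# Fields on each line: label, name, "=", value
while read -r _label name _eq value; do
    case $name in
        parent|child1|child2)
            # The name is one of three known identifiers, so this eval only assigns
            eval "$name=\$value"
            ;;
    esac
done < "$cluster_file"
rm -f "$cluster_file"

echo "parent=$parent child1=$child1 child2=$child2"
```

Lines with any other name are silently skipped, and the value is never re-parsed as shell code because the eval only sees the literal text $value.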

Rename a variable in a for loop

Let's say I have a nested for loop:
for i in $test
do
name=something
for j in $test2
do
name2=something
jj=$j | sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g
if [ name == name2 ]
then
qsub scrip.sh $i $j $jj
fi
done
done
Now the problem occurs when I try to transform the variable $j into the variable $jj. I only get empty values back when submitting the script within the if statement. Is there another way to derive such variables so that I can pass them through to the if part of the code?
PS. I tried 3 for loops but this makes the script awfully slow.
Your problem is piping the assignment into sed. Try something like
jj=$(echo $j | sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g)
This uses command substitution to assign jj.
This is not correct:
jj=$j | sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g
In order to assign the output of a command to a variable you need to use command substitution like this:
jj=$(sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g <<< "$j")
You may not even have to use sed because bash has in-built string replacement. For example, the following will replace foo with bar in the j variable and assign it to jj:
jj=${j//foo/bar}
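Applied to the paths from the question, the same expansion might look like the sketch below. It is Bash-specific. The sample value of j and the helper variables old and new are made up for illustration; keeping the two strings in variables avoids having to escape the slashes inside the expansion.

```shell
# Hypothetical value standing in for $j from the question
j='/data/tRap/tRapTrain/run1'

old='tRap/tRapTrain'
new='BEEML/BEEMLTrain'

# Quoting "$old" makes Bash match it literally rather than as a glob pattern
jj=${j//"$old"/"$new"}
echo "$jj"
```

This avoids starting a sed process on every iteration of the inner loop.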
There is also a problem with your if-statement. It should be:
if [ "$name" == "$name2" ]
A tiny little thing:
Sed treats the first character after the action selector as the field separator.
Knowing this you can translate your expression:
sed s/'tRap\/tRapTrain'/'BEEML\/BEEMLTrain'/g
into:
sed s%'tRap/tRapTrain'%'BEEML/BEEMLTrain'%g
So you don't have to worry about escaping your slashes when substituting paths. I normally use '%', but feel free to use any other character. I think the optimal approach would be using a non-printable character:
SEP=$'\001' ; sed s${SEP}'tRap/tRapTrain'${SEP}'BEEML/BEEMLTrain'${SEP}g
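Put together with command substitution, the alternate delimiter could be used as in this sketch. The sample value of j is hypothetical.

```shell
j='/data/tRap/tRapTrain/run1'   # hypothetical value standing in for $j

# With % as the delimiter, the slashes in the paths need no escaping
jj=$(printf '%s\n' "$j" | sed 's%tRap/tRapTrain%BEEML/BEEMLTrain%g')
echo "$jj"
```

The delimiter only has to be a character that appears in neither the pattern nor the replacement.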

Looping through the elements of a path variable in Bash

I want to loop through a path list that I have gotten from an echo $VARIABLE command.
For example:
echo $MANPATH will return
/usr/lib:/usr/sfw/lib:/usr/info
So that is three different paths, each separated by a colon. I want to loop through each of those paths. Is there a way to do that? Thanks.
Thanks for all the replies so far, it looks like I actually don't need a loop after all. I just need a way to take out the colon so I can run one ls command on those three paths.
You can set the Internal Field Separator:
( IFS=:
for p in $MANPATH; do
echo "$p"
done
)
I used a subshell so the change in IFS is not reflected in my current shell.
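A slightly more defensive variant of the same idea, as a sketch: besides changing IFS, it turns off globbing with set -f, so that an entry containing * or ? is not expanded against the file system. The command substitution is itself a subshell, so neither setting leaks out. The value of demo_path and the name entries are made up for the demo.

```shell
demo_path='/usr/lib:/opt/my tools:/usr/info'   # hypothetical value standing in for $MANPATH

entries=$(
    IFS=:      # split on colons only, so the space in "my tools" survives
    set -f     # disable globbing so characters like * stay literal
    for p in $demo_path; do
        printf '[%s]\n' "$p"
    done
)
echo "$entries"
```

This stays within POSIX sh, so it should also work in shells without arrays.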
The canonical way to do this, in Bash, is to use the read builtin appropriately:
IFS=: read -r -d '' -a path_array < <(printf '%s:\0' "$MANPATH")
This is the only robust solution: it will do exactly what you want, splitting the string on the delimiter : while being safe with respect to spaces, newlines, and glob characters like *, [ ], etc. (unlike the other answers: they are all broken).
After this command, you'll have an array path_array, and you can loop on it:
for p in "${path_array[@]}"; do
printf '%s\n' "$p"
done
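To see how this behaves on awkward input, here is a sketch with a made-up path list (demo_path is hypothetical) that contains a space and an empty entry. It requires Bash, since it relies on arrays, read -a and process substitution.

```shell
# Hypothetical path list with a space and an empty entry
demo_path='/usr/lib:/opt/my tools::/usr/info'

# Append ':' plus NUL so read sees every field, including a trailing empty one
IFS=: read -r -d '' -a path_array < <(printf '%s:\0' "$demo_path")

echo "${#path_array[@]} entries"
for p in "${path_array[@]}"; do
    printf '[%s]\n' "$p"
done
```

The entry with the space stays in one piece and the empty entry between the two colons is kept as an empty array element.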
You can use Bash's pattern substitution parameter expansion to populate your loop variable. For example:
MANPATH=/usr/lib:/usr/sfw/lib:/usr/info
# Replace colons with spaces to create list.
for path in ${MANPATH//:/ }; do
echo "$path"
done
Note: Don't enclose the substitution expansion in quotes. You want the expanded values from MANPATH to be interpreted by the for-loop as separate words, rather than as a single string.
In this way you can safely go through the $PATH with a single loop, while $IFS will remain the same inside or outside the loop.
while IFS=: read -d: -r path; do # `$IFS` is only set for the `read` command
echo $path
done <<< "${PATH:+"${PATH}:"}" # append an extra ':' if `$PATH` is set
You can check the value of $IFS,
IFS='xxxxxxxx'
while IFS=: read -d: -r path; do
echo "${IFS}${path}"
done <<< "${PATH:+"${PATH}:"}"
and the output will be something like this.
xxxxxxxx/usr/local/bin
xxxxxxxx/usr/bin
xxxxxxxx/bin
Reference to another question on StackExchange.
for p in $(echo $MANPATH | tr ":" " ") ;do
echo $p
done
IFS=:
arr=(${MANPATH})
for path in "${arr[@]}" ; do # <- quotes required
echo $path
done
... it does take care of spaces :o) but also adds empty elements if you have something like:
:/usr/bin::/usr/lib:
... then indices 0 and 2 will be empty (''); I cannot say why index 4 isn't set at all
This can also be solved with Python, on the command line:
python -c "import os,sys;[os.system(' '.join(sys.argv[1:]).format(p)) for p in os.getenv('PATH').split(':')]" echo {}
Or as an alias:
alias foreachpath="python -c \"import os,sys;[os.system(' '.join(sys.argv[1:]).format(p)) for p in os.getenv('PATH').split(':')]\""
With example usage:
foreachpath echo {}
The advantage to this approach is that {} will be replaced by each path in succession. This can be used to construct all sorts of commands, for instance to list the size of all files and directories in the directories in $PATH, including directories with spaces in the name:
foreachpath 'for e in "{}"/*; do du -h "$e"; done'
Here is an example that shortens the length of the $PATH variable by creating symlinks to every file and directory in the $PATH in $HOME/.allbin. This is not useful for everyday usage, but may be useful if you get the too many arguments error message in a docker container, because bitbake uses the full $PATH as part of the command line...
mkdir -p "$HOME/.allbin"
python -c "import os,sys;[os.system(' '.join(sys.argv[1:]).format(p)) for p in os.getenv('PATH').split(':')]" 'for e in "{}"/*; do ln -sf "$e" "$HOME/.allbin/$(basename $e)"; done'
export PATH="$HOME/.allbin"
This should also, in theory, speed up regular shell usage and shell scripts, since there are fewer paths to search for every command that is executed. It is pretty hacky, though, so I don't recommend that anyone shorten their $PATH this way.
The foreachpath alias might come in handy, though.
Combining ideas from:
https://stackoverflow.com/a/29949759 - gniourf_gniourf
https://stackoverflow.com/a/31017384 - Yi H.
code:
PATHVAR='foo:bar baz:spam:eggs:' # demo path with space and empty
printf '%s:\0' "$PATHVAR" | while IFS=: read -d: -r p; do
echo $p
done | cat -n
output:
1 foo
2 bar baz
3 spam
4 eggs
5
You can use Bash's for X in ${} notation to accomplish this:
for p in ${PATH//:/$'\n'} ; do
echo $p;
done
OP's update wants to ls the resulting folders, and has pointed out that ls only requires a space-separated list.
ls $(echo $PATH | tr ':' ' ') is nice and simple and should fit the bill nicely.

How to convert the script from using command,read to command,cut?

Here is the test sample:
test_catalog,test_title,test_type,test_artist
And I can use the following script to split the text above on commas and set the variables respectively:
IFS=","
read cdcatnum cdtitle cdtype cdac < $temp_file
(PS: $temp_file is the path of the test sample.)
What if I want to replace read with the cut command? Any ideas?
There are many solutions:
line=$(head -1 "$temp_file")
echo $line | cut -d, ...
or
cut -d, ... <<< "$line"
or you can tell Bash to copy the line into an array (set -A is the ksh syntax; Bash uses a plain array assignment):
typeset IFS=,
ARRAY=( $(head -1 "$temp_file") )
# use it
echo "${ARRAY[0]}" # test_catalog
echo "${ARRAY[1]}" # test_title
...
I prefer the array solution because it gives you a distinct data type and clearly communicates your intent. The echo/cut solution is also somewhat slower.
[EDIT] On the other hand, the read command splits the line into individual variables which gives each value a name. Which is more readable: ${ARRAY[0]} or $cdcatnum?
If you move columns around, you will just need to rearrange the arguments to the read command - if you use arrays, you will have to update all the array indices which you will get wrong.
Also read makes it much more simple to process the whole file:
while read cdcatnum cdtitle cdtype cdac ; do
....
done < "$temp_file"
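For completeness, a sketch of what the cut variant could look like when filled in for the sample line, with one cut call per field. The temp file is created here only to make the example self-contained; the field numbers are assumed from the order of the names in the question.

```shell
# Hypothetical temp file holding the sample line from the question
temp_file=$(mktemp)
printf '%s\n' 'test_catalog,test_title,test_type,test_artist' > "$temp_file"

line=$(head -n 1 "$temp_file")
cdcatnum=$(printf '%s\n' "$line" | cut -d, -f1)
cdtitle=$(printf '%s\n' "$line" | cut -d, -f2)
cdtype=$(printf '%s\n' "$line" | cut -d, -f3)
cdac=$(printf '%s\n' "$line" | cut -d, -f4)
rm -f "$temp_file"

echo "$cdcatnum | $cdtitle | $cdtype | $cdac"
```

Each field costs a pipeline and a cut process, which is one reason the read version is usually the simpler choice.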
man cut ?
But seriously, if you have something that works, why do you want to change it?
Personally, I'd probably use awk or perl to manipulate CSV files in linux.

Resources