How to substitute shell variables in complex text files - linux

I have several text files in which I have introduced shell variables ($VAR1 or $VAR2 for instance).
I would like to take those files (one by one) and save them in new files where all variables would have been replaced.
To do this, I used the following shell script (found on StackOverflow):
while read line
do
eval echo "$line" >> destination.txt
done < "source.txt"
This works very well on very basic files.
But on more complex files, the "eval" command does too much:
Lines starting with "#" are skipped
Parsing XML files results in tons of errors
Is there a better way to do it? (in shell script... I know this is easily done with Ant for instance)
Kind regards

Looking around, it turns out that on my system there is an envsubst command, which is part of the gettext-base package.
So, this makes it easy:
envsubst < "source.txt" > "destination.txt"
Note that if you want to use the same file for both input and output, you'll have to use something like moreutils' sponge, as suggested by Johnny Utahh: envsubst < "source.txt" | sponge "source.txt". (Because the shell redirect would otherwise empty the file before it is read.)

In reference to answer 2, when discussing envsubst, you asked:
How can I make it work with the variables that are declared in my .sh script?
The answer is that you simply need to export your variables before calling envsubst.
You can also limit which variable strings are replaced in the input by using the envsubst SHELL-FORMAT argument (avoiding the unintended replacement of strings in the input that happen to match common shell variables - e.g. $HOME).
For instance:
export VAR1='somevalue' VAR2='someothervalue'
MYVARS='$VAR1:$VAR2'
envsubst "$MYVARS" <source.txt >destination.txt
This will replace all instances of $VAR1 and $VAR2 (and only those) in source.txt with 'somevalue' and 'someothervalue', respectively.
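To make the effect concrete, suppose (hypothetically) source.txt contains:
Greeting: $VAR1, detail: $VAR2, home: $HOME
Then destination.txt will contain:
Greeting: somevalue, detail: someothervalue, home: $HOME
with $HOME preserved, because it was not listed in $MYVARS.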

I know this topic is old, but I have a simpler working solution without exporting the variables. It can be a one-liner, but I prefer to split it using \ at the end of each line (note the space before each backslash, which keeps the assignments separate words after the line continuation).
var1='myVar1' \
var2=2 \
var3=${var1} \
envsubst '$var1,$var3' < "source.txt" > "destination.txt"
# ^^^^^^^^^^^ ^^^^^^^^^^ ^^^^^^^^^^^^^^^
# define which to replace input output
The variables need to be defined on the same line as the envsubst call in order to be passed to it as environment variables.
The '$var1,$var3' argument is optional and restricts the replacement to the variables listed. Imagine an input file containing ${VARIABLE_USED_BY_JENKINS}, which should not be replaced.

Define your ENV variable
$ export MY_ENV_VAR=congratulation
Create a template file (in.txt) with the following content
$MY_ENV_VAR
You can also use any other environment variables defined by your system, like (on Linux) $TERM, $SHELL, $HOME...
Run this command to replace all environment variables in your in.txt file and write the result to out.txt
$ envsubst "`printf '${%s} ' $(sh -c "env|cut -d'=' -f1")`" < in.txt > out.txt
Check the content of out.txt file
$ cat out.txt
and you should see "congratulation".

There is also this option:
define your variables in a file
$ cat variables.env
# info about what this var is
export var1=a
# info about var again
export var2=b
define a template file that uses the variables
$ cat file1-template.txt
This is var1: "${var1}"
This is var2: "${var2}"
generate the final file, with variables replaced with values
$ source variables.env
$ envsubst < file1-template.txt > file1.txt
$ cat file1.txt
This is var1: "a"
This is var2: "b"

If you want env variables to be replaced in your source files while keeping all of the non env variables as they are, you can use the following command:
envsubst "$(printf '${%s} ' $(env | sed 's/=.*//'))" < source.txt > destination.txt
The syntax for replacing only specific variables is explained in the SHELL-FORMAT answer above. The command above uses a sub-shell to list all defined variables and then passes them to envsubst.
So if there's a defined env variable called $NAME, and your source.txt file looks like this:
Hello $NAME
Your balance is 123 ($USD)
The destination.txt will be:
Hello Arik
Your balance is 123 ($USD)
Notice that the $NAME is replaced and the $USD is left untouched

while IFS='=' read -r name value ; do
# Print line if found variable
sed -n '/${'"${name}"'}/p' docker-compose.yml
# Replace variable with value.
sed -i 's|${'"${name}"'}|'"${value}"'|' docker-compose.yml
done < <(env)
Note: Variable name or value should not contain "|", because it is used as a delimiter.
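If a value might contain one of those characters, one possible workaround (a sketch of my own, assuming GNU sed and values without newlines) is to backslash-escape the characters that are special in the replacement text before substituting:
while IFS='=' read -r name value ; do
    # Escape \, & and the | delimiter so sed treats them literally.
    escaped=$(printf '%s' "$value" | sed -e 's/[\\&|]/\\&/g')
    sed -i 's|${'"${name}"'}|'"${escaped}"'|' docker-compose.yml
done < <(env)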

If you really only want to use bash (and sed), then I would go through each of your environment variables (as returned by set in posix mode) and build a bunch of -e 'regex' for sed from that, terminated by a -e 's/\$[a-zA-Z_][a-zA-Z0-9_]*//g', then pass all that to sed.
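A minimal sketch of that idea (my own illustration, using env for simplicity and GNU sed's \b; it assumes env prints one NAME=value per line and that neither names nor values contain |, &, backslashes, or newlines):
args=()
while IFS='=' read -r name value ; do
    args+=(-e 's|\$'"${name}"'\b|'"${value}"'|g')   # replace $NAME with its value
done < <(env)
args+=(-e 's/\$[a-zA-Z_][a-zA-Z0-9_]*//g')          # blank out any variable left unset
sed "${args[@]}" source.txt > destination.txt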
Perl would do a nicer job, though: you have access to the environment variables via the %ENV hash, and you can do executable replacements, so you only match each environment variable once.

Actually, you need to change your read to read -r, which will stop it from treating backslashes as escape characters.
Also, you should escape quotes and backslashes.
So
while read -r line; do
line="${line//\\/\\\\}"
line="${line//\"/\\\"}"
line="${line//\`/\\\`}"
eval echo "\"$line\""
done > destination.txt < source.txt
Still a terrible way to do expansion though.

Export all the needed variables and then use a perl one-liner:
TEXT=$(echo "$TEXT"|perl -wpne 's#\${?(\w+)}?# $ENV{$1} // $& #ge;')
This will replace all the ENV variables present in TEXT with actual values.
Quotes are also preserved :)
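For completeness, the same substitution can also be applied directly to a file instead of a shell variable (the file names here are placeholders):
export VAR1='somevalue'
perl -wpne 's#\${?(\w+)}?# $ENV{$1} // $& #ge;' source.txt > destination.txt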

Call the perl binary in per-line search-and-replace mode (the -pi), running the code given with -e in single quotes. The code iterates over the keys of the special %ENV hash, which contains the exported variable names as keys and the exported variable values as the keys' values, and for each key simply replaces any occurrence of $<<key>> with its <<value>>.
perl -pi -e 'foreach $key(sort keys %ENV){ s/\$$key/$ENV{$key}/g}' file
Caveat:
Additional logic is required for cases in which two or more variable names start with the same string ...
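One way to handle that caveat (my own suggestion, assuming the variable names are plain identifiers without regex metacharacters) is to require a word boundary after the name, so that e.g. $HOME no longer also matches the start of $HOMEDIR:
perl -pi -e 'foreach $key (sort keys %ENV) { s/\$$key\b/$ENV{$key}/g }' file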

envsubst seemed like exactly what I wanted to use, but the -v option surprised me a bit.
While envsubst < template.txt was working fine, the same command with the -v option was not:
$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.1 (Maipo)
$ envsubst -V
envsubst (GNU gettext-runtime) 0.18.2
Copyright (C) 2003-2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Bruno Haible.
As I wrote, this was not working:
$ envsubst -v < template.txt
envsubst: missing arguments
$ cat template.txt | envsubst -v
envsubst: missing arguments
I had to do this to make it work (with -v/--variables, envsubst expects the text as a command-line argument and prints the names of the variables occurring in it, rather than reading from stdin):
TEXT=`cat template.txt`; envsubst -v "$TEXT"
Maybe it helps someone.


envsubst: default values for unset variables

I've got a json file input.json like the following one:
{
"variable" : "${ENV_VAR}"
}
of course, I can invoke envsubst from bash like the following:
$ export ENV_VAR=myvalue
$ envsubst < input.json > output.json
$ cat output.json
{
"variable" : "myvalue"
}
Now I wish I could set default values for variables in input.json for the case when ENV_VAR is not set, like in the following example, which unfortunately doesn't work:
$ cat input.json
{
"variable" : "${ENV_VAR:=defaultvalue}"
}
$ export ENV_VAR=newvalue
$ envsubst < input.json > output.json
$ cat output.json
{
"variable" : "${ENV_VAR:=defaultvalue}"
}
$ unset ENV_VAR
$ envsubst < input.json > output.json
$ cat output.json
{
"variable" : "${ENV_VAR:=defaultvalue}"
}
What's curious is that if I execute envsubst as in the following example (without involving an input file), it works:
$ export ENV_VAR=myvalue
$ echo "value is ${ENV_VAR:=defaultvalue}" | envsubst
value is myvalue
$ unset ENV_VAR
$ echo "value is ${ENV_VAR:=defaultvalue}" | envsubst
value is defaultvalue
Where is the problem with the files?
According to man envsubst, envsubst will only ever replace references to environment variables in the form ${VAR} or $VAR. Special shell features like ${VAR:-default} are not supported. (In the echo example above it is the shell, not envsubst, that expands ${ENV_VAR:=defaultvalue} inside the double quotes before envsubst ever sees the text.) The only thing you can do is (re)define all variables in the environment of the envsubst invocation and assign local default values if they are missing:
ENV_VAR="${ENV_VAR:-defaultvalue}" \
OTHER_VAR="${OTHER_VAR:-otherdefault}" \
envsubst < input.json > output.json
Note that this is actually a single command line split into multiple lines, each ending with a line continuation \. The first two lines are variable assignments that are only effective in the environment of the executed command envsubst in the last line. What happens is that the shell creates an environment for the execution of the command (as it always does). That environment is initially a copy of the current shell environment. Within that new environment, ENV_VAR and OTHER_VAR are assigned the values of expanding the expression ${VAR:-default}, which essentially expands to default unless VAR is defined and has a non-empty value. The command envsubst is executed, receiving the file input.json as standard input and having its standard output redirected to output.json (both are done by the shell, transparently to the command). After the command execution, the shell discards the command environment and returns to its original environment, i.e. the local variable assignments are no longer effective.
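A quick way to see that scoping in action (a made-up one-liner, not part of the original answer):
$ unset ENV_VAR
$ ENV_VAR="${ENV_VAR:-defaultvalue}" envsubst <<< 'variable is ${ENV_VAR}'
variable is defaultvalue
$ echo "ENV_VAR is now: '$ENV_VAR'"
ENV_VAR is now: ''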
There is no way to define default values from inside the JSON file, unless you implement a program to do so yourself, or use another tool that can do that.
You could do something like the following, but it is NOT RECOMMENDED:
eval echo "$(cat input.json)" > output.json
which will read input.json into a string and then evaluate the command echo <string> as if it had been typed literally, which means that any embedded ${VAR:-default} constructs will be expanded by the shell before the string is passed to echo. BUT any other embedded shell feature will be evaluated as well, which poses a HUGE SECURITY RISK.
I'm using https://github.com/a8m/envsubst and it has an enhancement over the original gettext envsubst: the expressions in the template file support default values.
The example in the README just works.
echo 'welcome $HOME ${USER:=a8m}' | envsubst
Similarly, a Rust variant called 'envsub' is available which also supports the default values. See https://github.com/stephenc/envsub .
There's also https://github.com/busyloop/envcat
which supports complex templates (so you can not only do default values but also conditions etc.).

What does "DOLLAR=$ "do? [duplicate]

Is there a way to prevent envsubst from substituting a $VARIABLE? For example, I would expect something like:
export THIS=THAT
echo "dont substitute \\\$THIS" | envsubst
and have it return
dont substitute $THIS
but instead I get
dont substitute \THAT
is there any escape character for doing this?
If you give envsubst a list of variables, it only substitutes those variables, ignoring other substitutions. I'm not exactly sure how it works, but something like the following seems to do what you want:
$ export THIS=THAT FOO=BAR
$ echo 'dont substitute $THIS but do substitute $FOO' | envsubst '$FOO'
dont substitute $THIS but do substitute BAR
Note that $THIS is left alone, but $FOO is replaced by BAR.
export DOLLAR='$'
export THIS=THAT
echo '${DOLLAR}THIS' | envsubst
Or more clearly:
export THIS=THAT
echo '${DOLLAR}THIS' | DOLLAR='$' envsubst
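Either way, the output should be the literal string $THIS, because envsubst does not rescan the text it has just substituted:
$ export THIS=THAT
$ echo '${DOLLAR}THIS' | DOLLAR='$' envsubst
$THIS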
My workaround is as follows:
Original template:
$change_this
$dont_change_this
Edited template:
$change_this
§dont_change_this
Now you can process:
envsubst < $template | sed -e 's/§/$/g'
This relies on the character § not occurring anywhere else in your template. You can use any other character.
$ echo $SHELL
/bin/bash
$ echo \$SHELL
$SHELL
$ echo \$SHELL | envsubst
/bin/bash
$ echo \$\${q}SHELL | envsubst
$SHELL
So doing $$ allows you to add a $ character. Then just "substitute" a non-existent variable (here I used ${q}, but it can be something more meaningful like ${my_empty_variable}) and you'll end up with what you need.
Just as with the § solution, you need something special - here, a non-existent variable - which I like a bit more than performing an additional sed pass on the templates.
If there's only one or two variables you don't want to expand, you can sort of whitelist them by temporarily setting them to their own name, like this:
$ echo 'one $two three $four' | four='$four' envsubst
one three $four
Here, the $four variable gets replaced with $four, effectively leaving it unchanged.
In my case I wanted to substitute only the variables that are already defined, leaving any undefined ones untouched. To do so, run:
envsubst "$(env | sed -e 's/=.*//' -e 's/^/\$/g')"
Another way to "escape" some environment variable substitution is to use default value assignment (or any other variable processing) as envsubst will not substitute these:
$ export two=2
$ echo 'one $two three ${four:-}' | envsubst
one 2 three ${four:-}
The $four variable is not substituted, and the default-value syntax is still there in the output. This does not matter, though, because processing the line later on will still deliver nothing if the variable is not set, and its value when it is set.
Here's an alternative that I use, as it saves installing the entire gettext package for just one program. I have this awk script, which I call envtmpl; it will swap any environment variable that looks like {{ENV-VAR}} for the value of ENV-VAR:
#! /usr/bin/awk -f
{ for (a in ENVIRON) gsub("{{" _ a _ "}}",ENVIRON[a]); print }
So
$ echo "My shell '{{SHELL}}' is cool" | envtmpl
My shell '/bin/bash' is cool
As you can see, if {{ & }} aren't what you prefer, it's really easy to change, and this script works fine with busybox's awk.
It's not going to be the world's fastest solution, but it's really easy to implement and I mostly run it to prepare config files, so speed is pretty irrelevant.
WARNING: The only major difference between this and envsubst is that this will NOT alter variables for which no value exists. That is, {{HAS-NO-VALUE}} will be left exactly as it is, whereas envsubst will remove those (replace them with blank).
You can fix this by adding more code into the awk, if you want.
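For instance, a possible extension (a sketch of mine, not the original script) that mimics envsubst by blanking out any placeholder that had no matching environment variable:
#! /usr/bin/awk -f
{
    for (a in ENVIRON) gsub("{{" a "}}", ENVIRON[a])
    gsub(/\{\{[A-Za-z_][A-Za-z0-9_-]*\}\}/, "")   # drop placeholders with no value
    print
}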
The way I did it is
export DONT_CHANGE_THIS=\${DONT_CHANGE_THIS}
envsubst < some-template.yml > changed.yml
So envsubst will replace ${DONT_CHANGE_THIS} with the literal string ${DONT_CHANGE_THIS}, and in the output you will get ${DONT_CHANGE_THIS} printed as it is.
I used an escape character for this:
MYENVVAR="\${MYENVVAR}"
export MYENVVAR
envsubst #whatever you want
then reset it to what I actually want:
MYENVVAR="my value"
export MYENVVAR
I just connected parts of other answers to create a one-liner that substitutes variables prefixed with $, but ignores $$:
echo "\$TEST ; \$\$l" > TEST_FILE
cat TEST_FILE
# $TEST ; $$l
export TEST=1
cat TEST_FILE | sed -e 's/\$\$/§/g' | envsubst | sed -e 's/§/\$/g'
# 1 ; $l

bash script to replace all occurrences of placeholders in file

I'm trying to write a bash script to replace all occurrences of a placeholder in a file with an environment variable of the same name. As an example, if I have a file like the following...
This is an {{VAR1}} {{VAR2}}.
It should work across multiple lines in this {{VAR2}}.
... and I have the following environment variables set:
VAR1='example'
VAR2='file'
after running the script on my file, I should get the output:
This is an example file.
It should work across multiple lines in this file.
I'm sure there must be a solution using awk/sed, but so far the closest I've come can't handle more than one variable on a line. Here's my attempt so far:
cat example.txt | grep -o '{{.*}}' > temp
while read placeholder; do
varName=$(echo "$placeholder" | tr -d '{}')
value="${!varName}"
sed -i "s/$placeholder/$value/g" "$file"
done < temp
rm -rf temp
I'd use Perl:
perl -pe 's/{{(.*?)}}/$ENV{$1}/g' filename
This assumes that VAR1 and VAR2 are environment variables (i.e., are exported), so that Perl can pick them out of its environment. This would be required of any approach that isn't pure shell; I just mention it to avoid confusion.
This works as follows:
s/pattern/replacement/g is a substitution command; you may recognize it from sed. The difference is that here we can use Perl's more powerful regex engine and variables. The g flag makes it so that all matches are replaced; without it, it would apply only to the first.
In the pattern, .*? matches non-greedily, so that in a line that contains foo {{VAR1}} bar {{VAR2}} baz, the pattern {{.*?}} matches only {{VAR1}} instead of {{VAR1}} bar {{VAR2}}.
The part between {{ and }} is captured because it is between () and can be reused as $1
$ENV{$1} in the replacement uses the special %ENV hash that contains the environment of the Perl process. $ENV{$1} is the value of the environment variable that has the name $1, which is the captured group from before.
Only bash and sed:
$ VAR1='example'
$ VAR2='file'
$ export VAR1 VAR2
$ sed -e '{s/{{\([^{]*\)}}/${\1}/g; s/^/echo "/; s/$/";/}' -e e filename
This is an example file.
It should work across multiple lines in this file.
sed -e '{s/{{\([^{]*\)}}/${\1}/g;}' filename:
This is an ${VAR1} ${VAR2}.
It should work across multiple lines in this ${VAR2}.
{{\([^{]*\)}} - Search for {{..}}
[^{] - Matches any character other than {, so the pattern cannot run past the next placeholder (this stands in for a non-greedy match)
\1 - Access to the bracketed values \(...\).
sed -e '{s/{{\([^{]*\)}}/${\1}/g; s/^/echo "/; s/$/";/}' filename:
echo "This is an ${VAR1} ${VAR2}.";
echo "It should work across multiple lines in this ${VAR2}.";
s/^/echo "/ - Replace the beginning of the line with echo "
s/$/";/ - Replace the end of the line with ";
I was just playing with your original approach. Wouldn't adding another loop on $varName work?
cat example.txt | grep -o '{{.*}}' > temp
while read placeholder; do
varName=$(echo "$placeholder" | tr -d '{}')
for i in $varName; do
value="${!i}"
sed -i "s/{{$i}}/$value/g" example.txt
done
done < temp
rm -rf temp

Extract all variable values in a shell script

I'm debugging an old shell script; I want to check the values of all the variables used. It's a huge, ugly script with more than 140 variables. Is there any way I can extract the variable names from the script and put them in a convenient pattern like:
#!/bin/sh
if [ ${BLAH} ....
.....
rm -rf ${JUNK}.....
to
echo ${BLAH}
echo ${JUNK}
...
Try running your script as follows:
bash -x ./script.bash
Or enable the setting in the script:
set -x
You can dump all the variables you are interested in with one command:
set | grep -w -e BLAH -e JUNK
To dump all the variables to stdout use:
set
or
env
from inside your script.
You can extract a (sub)list of the variables declared in your script using grep:
grep -Po "([a-z][a-zA-Z0-9_]+)(?==\")" ./script.bash | sort -u
Disclaimer: why "sublist"?
The expression given will match string followed by an egal sign (=) and a double quote ("). So if you don't use syntax such as myvar="my-value" it won't work.
But you got the idea.
grep Options
-P --perl-regexp: Interpret PATTERN as a Perl regular expression (PCRE, see below) (experimental) ;
-o --only-matching: Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
Pattern
I'm using a positive lookahead, (?==\"), to require an equals sign followed by a double quote.
In bash, but not sh, compgen -v will list the names of all variables assigned (compare this to set, which has a great deal of output other than variable names, and thus needs to be parsed).
Thus, if you change the top of the script to #!/bin/bash, you will be able to use compgen -v to generate that list.
That said, the person who advised you to use set -x did well. Consider this extension of that:
PS4=':$BASH_SOURCE:$LINENO+'; set -x
This will print the source file and line number before every command (or variable assignment) which is executed, so you will have a log not only of which variables are set, but just where in the source each one was assigned. This makes tracking down where each variable is set far easier.

Shell - Write variable contents to a file

I would like to copy the contents of a variable (here called var) into a file.
The name of the file is stored in another variable destfile.
I'm having problems doing this. Here's what I've tried:
cp $var $destfile
I've also tried the same thing with the dd command... Obviously the shell thought that $var was referring to a directory and so told me that the directory could not be found.
How do I get around this?
Use the echo command:
var="text to append";
destdir=/some/directory/path/filename
if [ -f "$destdir" ]
then
echo "$var" > "$destdir"
fi
The if tests that $destdir exists as a regular file.
The > truncates the file and then writes the text. If you only want to append the text in $var to the file's existing contents, use >> instead:
echo "$var" >> "$destdir"
The cp command is used for copying files (to files), not for writing text to a file.
echo has the problem that if var contains something like -e, it will be interpreted as a flag. Another option is printf, but printf "$var" > "$destdir" will expand any escaped characters in the variable, so if the variable contains backslashes the file contents won't match. However, because printf only interprets backslashes as escapes in the format string, you can use the %s format specifier to store the exact variable contents to the destination file:
printf "%s" "$var" > "$destdir"
None of the answers above work if your variable:
starts with -e
starts with -n
starts with -E
contains a \ followed by an n
should not have an extra newline appended after it
and so they cannot be relied upon for arbitrary string contents.
In bash, you can use "here strings" as:
cat <<< "$var" > "$destdir"
As noted in the comment by Ash below, @Trebawa's answer (formulated in the same room as mine!) using printf is a better approach than cat.
All of the above work, but also have to work around a problem (escapes and special characters) that doesn't need to occur in the first place: Special characters when the variable is expanded by the shell. Just don't do that (variable expansion) in the first place. Use the variable directly, without expansion.
Also, if your variable contains a secret that you want to copy into a file, you might want to avoid expansion on the command line, because tracing/command echo of the shell commands might reveal the secret. That means all answers which use $var on the command line may pose a potential security risk by exposing the variable contents to the shell's tracing and logging.
For variables that are already exported, use this:
printenv var >file
That means, in case of the OP question:
printenv var >"$destfile"
Note: variable names are case sensitive.
Warning: It is not a good idea to export a variable just for the sake of printing it with printenv. If you have a non-exported script variable that contains a secret, exporting it will expose it to all future child processes (unless unexported, for example using export -n).
If I understood you right, you want to copy $var into a file (if it's a string).
echo $var > $destdir
When you say "copy the contents of a variable", does that variable contain a file name, or does it contain a name of a file?
I'm assuming by your question that $var contains the contents you want to copy into the file:
$ echo "$var" > "$destdir"
This will echo the value of $var into the file named by $destdir. Note the quotes: it's very important to have "$var" enclosed in double quotes, and also "$destdir" if there's a space in the name. To append instead:
$ echo "$var" >> "$destdir"
You may need to edit a conf file in a build process:
echo "db-url-host=$POSTGRESQL_HOST" >> my-service.conf
You can test this solution by running export POSTGRESQL_HOST="localhost" beforehand.
