How can I re-add a Unicode byte order marker in Linux?

I have a rather large SQL file which starts with the byte order marker FFFE. I have split this file into 100,000-line chunks using the Unicode-aware Linux split tool. But when passing the parts back to Windows, it does not like any of them other than the first one, as only the first still has the FFFE byte order marker.
How can I add this two-byte code using echo (or any other bash command)?

Based on Anonymous's sed solution, sed -i '1s/^/\xef\xbb\xbf/' foo adds the BOM to the UTF-8 encoded file foo. Usefully, it also converts ASCII files to UTF-8 with a BOM.
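You can verify the result by dumping the first bytes (assuming xxd is available):
$ head -c 3 foo | xxd
0000000: efbb bf                                  ...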

To add BOMs to all the files whose names start with "foo-", you can use sed. sed has an option to make a backup.
sed -i '1s/^\(\xff\xfe\)\?/\xff\xfe/' foo-*
Running this under strace shows that sed creates a temp file with a name starting with "sed". If you know for sure there is no BOM already, you can simplify the command:
sed -i '1s/^/\xff\xfe/' foo-*
Make sure UTF-16 is really what you need, because e.g. the UTF-8 BOM is different.
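For reference, these are the standard Unicode signatures you would prepend for each encoding (listed here for convenience):
printf '\xef\xbb\xbf'        # UTF-8
printf '\xff\xfe'            # UTF-16 little-endian (the FFFE from the question)
printf '\xfe\xff'            # UTF-16 big-endian
printf '\xff\xfe\x00\x00'    # UTF-32 little-endian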

For a general-purpose solution—something that sets the correct byte-order mark regardless of whether the file is UTF-8, UTF-16, or UTF-32—I would use vim’s 'bomb' option:
$ echo 'hello' > foo
$ xxd < foo
0000000: 6865 6c6c 6f0a                           hello.
$ vim -e -s -c ':set bomb' -c ':wq' foo
$ xxd < foo
0000000: efbb bf68 656c 6c6f 0a                   ...hello.
(-e means run in ex mode instead of visual mode; -s means don’t print status messages; -c means “do this”)

Try uconv
uconv --add-signature
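A sketch of a full invocation, assuming the ICU uconv and hypothetical file names; -f and -t select the source and target encodings:
uconv -f utf-16 -t utf-16 --add-signature < part.sql > part-with-bom.sql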

Something like (backup first):
for i in $(ls *.sql)
do
    cp "$i" "$i.temp"
    printf '\xFF\xFE' > "$i"
    cat "$i.temp" >> "$i"
    rm "$i.temp"
done

Matthew Flaschen's answer is a good one, however it has a couple of flaws.
There's no check that the copy succeeded before the original file is truncated. It would be better to make everything contingent on a successful copy, to test for the existence of the temporary file, or to operate on the copy. If you're a belt-and-suspenders kind of person, you'd do a combination, as I've illustrated below.
The ls is unnecessary.
I'd use a better variable name than "i" - perhaps "file".
Of course, you could be very paranoid and check for the existence of the temporary file at the beginning so you don't accidentally overwrite it and/or use a UUID or a generated file name. One of mktemp, tempfile or uuidgen would do the trick.
td=$TMPDIR
export TMPDIR=
usertemp=~/temp    # set this to use a temp directory on the same filesystem
                   # you could use ./temp to ensure that it's the same one
                   # you can use mktemp -d to create the dir instead of mkdir
if [[ ! -d $usertemp ]]    # if this user temp directory doesn't exist
then                       # then create it, unless you can't
    mkdir $usertemp || export TMPDIR=$td    # if you can't create it and TMPDIR is/was
fi                         # empty then mktemp automatically falls
                           # back to /tmp
for file in *.sql
do
    # TMPDIR, if set, overrides the argument to -p
    temp=$(mktemp -p $usertemp) || { echo "$0: Unable to create temp file."; exit 1; }
    { printf '\xFF\xFE' > "$temp" &&
      cat "$file" >> "$temp"; } || { echo "$0: Write failed on $file"; exit 1; }
    { rm "$file" &&
      mv "$temp" "$file"; } || { echo "$0: Replacement failed for $file"; exit 1; }
done
export TMPDIR=$td
Traps might be better than all the separate error handlers I've added.
No doubt all this extra caution is overkill for a one-shot script, but these techniques can save you when push comes to shove, especially in a multi-file operation.
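A minimal sketch of the trap alternative (my assumption of one way to structure it, not the author's exact design):
trap 'rm -f "$temp"' EXIT    # whatever happens, remove the current temp file
With that in place, each error handler only needs to report and exit; the cleanup happens automatically.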

$ printf '\xEF\xBB\xBF' > bom.txt
Then check:
$ grep -rl $'\xEF\xBB\xBF' .
./bom.txt

Related

Netcat: cat hello.txt | nc -l 14223 > hello.txt does not transfer things in hello.txt to another machine [duplicate]

Basically I want to take as input text from a file, remove a line from that file, and send the output back to the same file. Something along these lines if that makes it any clearer.
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name > file_name
however, when I do this I end up with a blank file.
Any thoughts?
Use sponge for this kind of task. It's part of moreutils.
Try this command:
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | sponge file_name
You cannot do that because bash processes the redirections first, then executes the command. So by the time grep looks at file_name, it is already empty. You can use a temporary file though.
#!/bin/sh
tmpfile=$(mktemp)
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name > ${tmpfile}
cat ${tmpfile} > file_name
rm -f ${tmpfile}
Something like that. Consider using mktemp to create the tmpfile, as shown, but note that it's not POSIX.
Use sed instead:
sed -i '/seg[0-9]\{1,\}\.[0-9]\{1\}/d' file_name
Try this simple one:
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | tee file_name
Your file will not be blank this time :) and your output is also printed to your terminal.
You can't use a redirection operator (> or >>) to the same file, because the redirection is processed first and creates/truncates the file before the command is even invoked. To avoid that, you should use appropriate tools such as tee, sponge, sed -i or any other tool which can write results to the file (e.g. sort file -o file).
Basically redirecting input to the same original file doesn't make sense and you should use appropriate in-place editors for that, for example Ex editor (part of Vim):
ex '+g/seg[0-9]\{1,\}\.[0-9]\{1\}/d' -scwq file_name
where:
'+cmd'/-c - run any Ex/Vim command
g/pattern/d - remove lines matching a pattern using global (help :g)
-s - silent mode (man ex)
-c wq - execute :write and :quit commands
You may use sed to achieve the same (as already shown in other answers), however in-place editing (-i) is a non-standard FreeBSD extension (it may work differently between Unix/Linux), and sed is fundamentally a stream editor, not a file editor. See: Does Ex mode have any practical use?
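For instance, the -i syntax itself is not portable (a well-known difference between the implementations):
sed -i 's/old/new/' file       # GNU sed: the backup suffix, if any, is attached to -i
sed -i '' 's/old/new/' file    # BSD/macOS sed: the suffix is a separate, possibly empty, argument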
One-liner alternative: set the content of the file as a variable:
VAR=`cat file_name`; echo "$VAR"|grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' > file_name
Since this question is the top result in search engines, here's a one-liner based on https://serverfault.com/a/547331 that uses a subshell instead of sponge (which often isn't part of a vanilla install, e.g. on OS X):
echo "$(grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name)" > file_name
The general case is:
echo "$(cat file_name)" > file_name
Edit, the above solution has some caveats:
printf '%s' <string> should be used instead of echo <string> so that files containing -n don't cause undesired behavior.
Command substitution strips trailing newlines (this is a bug/feature of shells like bash) so we should append a postfix character like x to the output and remove it on the outside via parameter expansion of a temporary variable like ${v%x}.
Using a temporary variable $v stomps the value of any existing variable $v in the current shell environment, so we should nest the entire expression in parentheses to preserve the previous value.
Another bug/feature of shells like bash is that command substitution strips unprintable characters like null from the output. I verified this by calling dd if=/dev/zero bs=1 count=1 >> file_name and viewing it in hex with cat file_name | xxd -p. But echo $(cat file_name) | xxd -p is stripped. So this answer should not be used on binary files or anything using unprintable characters, as Lynch pointed out.
The general solution (albeit slightly slower, more memory intensive and still stripping unprintable characters) is:
(v=$(cat file_name; printf x); printf '%s' "${v%x}" > file_name)
Test from https://askubuntu.com/a/752451:
printf "hello\nworld\n" > file_uniquely_named.txt && for ((i=0; i<1000; i++)); do (v=$(cat file_uniquely_named.txt; printf x); printf '%s' ${v%x} > file_uniquely_named.txt); done; cat file_uniquely_named.txt; rm file_uniquely_named.txt
Should print:
hello
world
Whereas calling cat file_uniquely_named.txt > file_uniquely_named.txt in the current shell:
printf "hello\nworld\n" > file_uniquely_named.txt && for ((i=0; i<1000; i++)); do cat file_uniquely_named.txt > file_uniquely_named.txt; done; cat file_uniquely_named.txt; rm file_uniquely_named.txt
Prints an empty string.
I haven't tested this on large files (probably over 2 or 4 GB).
I have borrowed this answer from Hart Simha and kos.
This is very much possible, you just have to make sure that by the time you write the output, you're writing it to a different file. This can be done by removing the file after opening a file descriptor to it, but before writing to it:
exec 3<file ; rm file; COMMAND <&3 >file ; exec 3>&-
Or line by line, to understand it better:
exec 3<file # open a file descriptor reading 'file'
rm file # remove file (but fd3 will still point to the removed file)
COMMAND <&3 >file # run command, with the removed file as input
exec 3>&- # close the file descriptor
It's still a risky thing to do, because if COMMAND fails to run properly, you'll lose the file contents. That can be mitigated by restoring the file if COMMAND returns a non-zero exit code:
exec 3<file ; rm file; COMMAND <&3 >file || cat <&3 >file ; exec 3>&-
We can also define a shell function to make it easier to use:
# Usage: replace FILE COMMAND
replace() { exec 3<"$1"; rm "$1"; "${@:2}" <&3 >"$1" || cat <&3 >"$1"; exec 3>&-; }
Example:
$ echo aaa > test
$ replace test tr a b
$ cat test
bbb
Also, note that this will keep a full copy of the original file (until the third file descriptor is closed). If you're using Linux, and the file you're processing is too big to fit twice on the disk, you can check out this script that will pipe the file to the specified command block-by-block while unallocating the already processed blocks. As always, read the warnings in the usage page.
The following will accomplish the same thing that sponge does, without requiring moreutils:
shuf --output=file --random-source=/dev/zero
The --random-source=/dev/zero part tricks shuf into doing its thing without doing any shuffling at all, so it will buffer your input without altering it.
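So the grep example from the question becomes (note this relies on shuf reading all of its input before it opens the output file):
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | shuf --output=file_name --random-source=/dev/zero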
However, it is true that using a temporary file is best, for performance reasons. So, here is a function that I have written that will do that for you in a generalized way:
# Pipes a file into a command, and pipes the output of that command
# back into the same file, ensuring that the file is not truncated.
# Parameters:
# $1: the file.
# $2: the command. (With $3... being its arguments.)
# See https://stackoverflow.com/a/55655338/773113
siphon()
{
    local tmp file rc=0
    [ "$#" -ge 2 ] || { echo "Usage: siphon filename [command...]" >&2; return 1; }
    file="$1"; shift
    tmp=$(mktemp -- "$file.XXXXXX") || return
    "$@" <"$file" >"$tmp" || rc=$?
    mv -- "$tmp" "$file" || rc=$(( rc | $? ))
    return "$rc"
}
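With the function loaded, the question's filter becomes:
siphon file_name grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}'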
There's also ed (as an alternative to sed -i):
# cf. http://wiki.bash-hackers.org/howto/edit-ed
printf '%s\n' H 'g/seg[0-9]\{1,\}\.[0-9]\{1\}/d' wq | ed -s file_name
You can slurp the whole file with POSIX Awk:
!/seg[0-9]\{1,\}\.[0-9]\{1\}/ {
    q = q ? q RS $0 : $0
}
END {
    print q > ARGV[1]
}
Example
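A hypothetical invocation, assuming the program above is saved as remove_seg.awk:
awk -f remove_seg.awk file_name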
This does the trick pretty nicely in most of the cases I faced:
cat <<< "$(do_stuff_with f)" > f
Note that while $(…) strips trailing newlines, <<< ensures a final newline, so generally the result is magically satisfying.
(Look for “Here Strings” in man bash if you want to learn more.)
Full example:
#! /usr/bin/env bash
get_new_content() {
sed 's/Initial/Final/g' "${1:?}"
}
echo 'Initial content.' > f
cat f
cat <<< "$(get_new_content f)" > f
cat f
This does not truncate the file and yields:
Initial content.
Final content.
Note that I used a function here for the sake of clarity and extensibility, but that’s not a requirement.
A common use case is JSON editing:
echo '{ "a": 12 }' > f
cat f
cat <<< "$(jq '.a = 24' f)" > f
cat f
This yields:
{ "a": 12 }
{
"a": 24
}
Try this:
echo -e "AAA\nBBB\nCCC" > testfile
cat testfile
AAA
BBB
CCC
echo "$(grep -v 'AAA' testfile)" > testfile
cat testfile
BBB
CCC
I usually use the tee program to do this:
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | tee file_name
Note that this relies on grep reading the file before tee truncates it; tee does not buffer through a tempfile, so this is a race and can still empty large files.

Bash add line numbers to a file and save the output to the input file itself [duplicate]


How to execute Linux shell variables within double quotes?

I have the following hacking challenge, where we don't know if there is a valid solution.
We have the following server script:
read s # read user input into var s
echo "$s"
# tests if it starts with 'a-f'
echo "$s" > "/home/user/${s}.txt"
We only control the input "$s". Is there any possibility of sending OS commands like uname, or do you think there's no way?
I don't see any avenue for executing arbitrary commands. The script quotes $s every time it is referenced, so that limits what you can do.
The only serious attack vector I see is that the echo statement writes to a file name based on $s. Since you control $s, you can cause the script to write to some unexpected locations.
$s could contain a string like bob/important.txt. This script would then overwrite /home/user/bob/important.txt if executed with sufficient permissions. Sorry, Bob!
Or, worse, $s could be bob/../../../etc/passwd. The script would try to write to /home/user/bob/../../../etc/passwd. If the script is running as root... uh oh!
It's important to note that the script can only write to these places if it has the right permissions.
You could embed unusual characters in $s that would cause irregular file names to be created. Careless scripts could be taken advantage of. For example, if $s were foo -rf . bar, then the file /home/user/foo -rf . bar.txt would be created.
If someone ran for file in /home/user/*; do rm $file; done they'd have a surprise on their hands. They would end up running rm /home/user/foo -rf . bar.txt, which is a disaster. If you take out /home/user/foo and bar.txt you're left with rm -rf . — everything in the current directory is deleted. Oops!
(They should have quoted "$file"!)
And there are two other minor things which, while I don't know how to take advantage of them maliciously, do cause the script to behave slightly differently than intended.
read allows backslashes to escape characters like space and newline. You can enter \space to embed spaces and \enter to have read parse multiple lines of input.
echo accepts a couple of flags. If $s is -n or -e then it won't actually echo $s; rather, it will interpret $s as a command-line flag.
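A quick demonstration of the echo flag problem:
$ s='-n'
$ echo "$s"
$
Nothing is printed, because echo consumed -n as its own option.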
Use read -r s or any \ will be lost/misinterpreted by your command.
read -r s?"Your input: "
if [ -n "${s}" ]
then
    # "filter" file name from command
    echo "${s##*/}" | sed 's|^ *\([[:alnum:]_]\{1,\}\)[[:blank:]].*|/home/user/\1.txt|' | read Output
    (
        # put any limitation on user here
        ulimit -t 5 1>/dev/null 2>&1
        `${s}`
    ) > ${Output}
else
    echo "Bad command" > /home/user/Error.txt
fi
Sure:
read s
$s > /home/user/"$s".txt
If I enter uname, this prints Linux. But beware: this is a security nightmare. What if someone enters rm -rf $HOME? You'd also have issues with commands containing a slash.

Make SED command work for any variable

deploy.sh
USERNAME="Tom"
PASSWORD="abc123"
FILE="config.conf"
sed -i "s/\PLACEHOLDER_USERNAME/$USERNAME/g" $FILE
sed -i "s/\PLACEHOLDER_PASSWORD/$PASSWORD/g" $FILE
config.conf
deloy="PLACEHOLDER_USERNAME"
pass="PLACEHOLDER_PASSWORD"
This script substitutes the variables defined in deploy.sh into my config file. I can't source the file, so I want to put my variables in this way.
Question
I want a command that is generic enough to work for all placeholder variables, using some sort of loop rather than needing one command per variable. That means any term starting with PLACEHOLDER_ in the file should be replaced with the value of the variable of that name already defined in deploy.sh.
All variables should be set and not empty. I guess if there is the ability to print a warning if it can't find the variable that would be good but it isn't mandatory for this.
Basically, use shell code to write a sed script and then use sed -i .bak -f sed.script config.conf to apply it:
trap "rm -f sed.script; exit 1" 0 1 2 3 13 15
for var in USERNAME PASSWORD
do
echo "s/PLACEHOLDER_$var/${!var}/"
done > sed.script
sed -i .bak -f sed.script config.conf
rm -f sed.script
trap 0
The main 'tricks' here are:
knowing that ${!var} expands to the value of the variable named by $var, and
knowing that sed will take a script full of commands via -f sed.script, and
knowing how to use trap to ensure temporary files are cleaned up.
You could also use sed -e "s/.../.../" -e "s/.../.../" -i .bak config.conf too, but the script file is easier, I think, especially if you have more than 2 values to substitute. If you want to go down this route, use a bash array to hold the arguments to sed. A more careful script would use at least $$ in the script file name, or use mktemp to create the temporary file.
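A quick illustration of the ${!var} indirection:
USERNAME="Tom"
var=USERNAME
echo "${!var}"    # prints: Tom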
Revised answer
The trouble is, although much closer to being generic, it is still not generic since I have to manually put in what variables I want to change. Can it not be more like "for each placeholder_, find the variable in deploy.sh and add that variable, so it can work for any number of variables.
So, find what the variables are in the configuration file, then apply the techniques of the previous answer to solve that problem:
trap "rm -f $tmp; exit 1" 0 1 2 3 13 15
for file in "$#"
do
for var in $(sed 's/.*PLACEHOLDER_\([A-Z0-9_]*\).*/\1/' "$file")
do
value="${!var}"
[ -z "$value" ] && { echo "$0: variable $var not set for $file" >&2; exit 1; }
echo "s/PLACEHOLDER_$var/$value/"
done > $tmp
sed -i .bak -f $tmp "$file"
rm -f $tmp
done
trap 0
This code still pulls the values from the environment. You need to clarify what is required if you want to extract the settings from the shell script, but it can be done — the script will have to be sufficiently self-aware to find its source so it can search it for the names. But the basics are in this answer; the rest is a question of tinkering until it does what you need.
#!/bin/ksh
TemplateFile=$1
SourceData=$2
(sed 's/.*/#V0r:PLACEHOLDER_&:r0V#/' ${SourceData}; cat ${TemplateFile}) | sed -n "
s/$/²/
H
$ {
x
s/^\(\n *\)*//
# also reset t flag
t varxs
:varxs
s/^#V0r:\([a-zA-Z0-9_]\{1,\}\)=\([^²]*\):r0V#²\(\n.*\)\"\1\"/#V0r:\1=\2:r0V#²\3\2/
t varxs
# clean the line when no more occurrences in the text
s/^[^²]*:r0V#²\n//
# and next
t varxs
# clean the marker
s/²\(\n\)/\1/g
s/²$//
# display the result
p
}
"
Call it like this: YourScript.ksh YourTemplateFile YourDataSourceFile, where:
YourTemplateFile is the file that contains the structure with generic values like deloy="PLACEHOLDER_USERNAME"
YourDataSourceFile is the file that contains all the generic value = specific value pairs, like USERNAME="Tom"

Why does sed leave many files around?

I noticed many files in my directory, called "sedAbCdEf" or such.
Why does it create these files?
Do these have any value after a script has run?
Can I send these files to another location, e.g. /tmp/?
Update:
I checked the scripts until I found one which makes the files. Here is some sample code:
#!/bin/bash
a=1
b=`wc -l < ./file1.txt`
while [ $a -le $b ]; do
    for i in `sed -n "$a"p ./file1.txt`; do
        for j in `sed -n "$a"p ./file2.txt`; do
            sed -i "s/$i/\nZZ$jZZ\n/g" ./file3.txt
            c=`grep -c $j file3.txt`
            if [ "$c" -ge 1 ]
            then
                echo $j >> file4.txt
                echo "Replaced "$i" with "$j" "$c" times ("$a"/"$b")."
            fi
            echo $i" not found ("$a"/"$b")."
            a=`expr $a + 1`
        done
    done
done
Why does it create these files?
sed -i "s/$i/\nZZ$jZZ\n/g" ./file3.txt
the -i option makes sed store its output in a temporary file.
After sed is done, it renames this temp file to replace your original file3.txt.
If something goes wrong while sed is running, these sedAbCdEf temp files are left behind.
Do these have any value after a script has run?
Usually no; your old file is untouched.
Can I send these files to another location , e.g. /tmp/?
Yes you can, see above.
If you use the -i option (it means make changes in place), sed writes to a temporary file and then renames it to your file. Thus if the operation is aborted, your file is left unchanged.
You can see which files are opened, renamed with strace:
$ strace -e open,rename sed -i 's/a/b/g' somefile
Note: somefile is opened read-only.
It seems there is no way to override the backup directory. GNU sed always writes in the file's directory (±symlinks). From sed/execute.c:
if (follow_symlinks)
    input->in_file_name = follow_symlink (name);
else
    input->in_file_name = name;

/* get the base name */
tmpdir = ck_strdup(input->in_file_name);
if ((p = strrchr(tmpdir, '/')))
    *(p + 1) = 0;
else
    strcpy(tmpdir, ".");
The prefix "sed" is hardcoded:
output_file.fp = ck_mkstemp (&input->out_file_name, tmpdir, "sed");
This may be because you run so many sed actions in a loop; sed can leave temp files behind that are not removed properly.
Sed creates un-deleteable files in Windows
Take a look at the post above; this sed issue has been reported before. The better way out is a script or function that deletes all leftover files whose names start with "sed".
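A sketch of such a cleanup (assuming nothing else in the directory is legitimately named "sed" plus six characters, which is the pattern GNU sed uses for its temp files):
find . -maxdepth 1 -type f -name 'sed??????' -delete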
