How to check if sed has changed a file - linux

I am trying to find a clever way to figure out if the file passed to sed has been altered successfully or not.
Basically, I want to know if the file has been changed or not without having to look at the file modification date.
The reason why I need this is because I need to do some extra stuff if sed has successfully replaced a pattern.
I currently have:
grep -q $pattern $filename
if [ $? -eq 0 ]
then
sed -i s:$pattern:$new_pattern: $filename
# DO SOME OTHER STUFF HERE
else
# DO SOME OTHER STUFF HERE
fi
The above code is a bit expensive and I would love to be able to use some hacks here.

A bit late to the party but for the benefit of others, I found the 'w' flag to be exactly what I was looking for.
sed -i "s/$pattern/$new_pattern/w changelog.txt" "$filename"
if [ -s changelog.txt ]; then
# CHANGES MADE, DO SOME STUFF HERE
else
# NO CHANGES MADE, DO SOME OTHER STUFF HERE
fi
changelog.txt will contain each change (i.e. the changed text) on its own line. If there were no changes, changelog.txt will be zero bytes.
A really helpful sed resource (and where I found this info) is http://www.grymoire.com/Unix/Sed.html.
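If you would rather not leave changelog.txt lying around between runs, the same idea works with a throwaway file from mktemp (a sketch, assuming GNU sed; the variable names are just examples):
changes=$(mktemp)
sed -i "s/$pattern/$new_pattern/w $changes" "$filename"
if [ -s "$changes" ]; then
echo "changed"
else
echo "not changed"
fi
rm -f "$changes"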

I believe you may find these GNU sed extensions useful
t label
If a s/// has done a successful substitution since the last input line
was read and since the last t or T command, then branch to label; if
label is omitted, branch to end of script.
and
q [exit-code]
Immediately quit the sed script without processing any more input, except
that if auto-print is not disabled the current pattern space will be printed.
The exit code argument is a GNU extension.
It seems like exactly what you are looking for.

This might work for you (GNU sed):
sed -i.bak '/'"$old_pattern"'/{s//'"$new_pattern"'/;h};${x;/./{x;q1};x}' file || echo changed
Explanation:
/'"$old_pattern"'/{s//'"$new_pattern"'/;h} if the pattern space (PS) contains the old pattern, replace it by the new pattern and copy the PS to the hold space (HS).
${x;/./{x;q1};x} on encountering the last line, swap to the HS and test it for the presence of any string. If a string is found in the HS (i.e. a substitution has taken place) swap back to the original PS and exit using the exit code of 1, otherwise swap back to the original PS and exit with the exit code of 0 (the default).

You can diff the original file with the sed output to see if it changed:
sed -i.bak "s:$pattern:$new_pattern:" "$filename"
if ! diff "$filename" "$filename.bak" &> /dev/null; then
echo "changed"
else
echo "not changed"
fi
rm "$filename.bak"

You could use awk instead:
awk '$0 ~ p { gsub(p, r); t=1 } 1; END{ exit (!t) }' p="$pattern" r="$repl"
I'm ignoring the -i feature: you can use the shell to do redirections as necessary.
Sigh. Many comments below asking for a basic tutorial on the shell. You can use the above command as follows:
if awk '$0 ~ p { gsub(p, r); t=1 } 1; END{ exit (!t) }' \
p="$pattern" r="$repl" "$filename" > "${filename}.new"; then
cat "${filename}.new" > "${filename}"
# DO SOME OTHER STUFF HERE
else
# DO SOME OTHER STUFF HERE
fi
It is not clear to me if "DO SOME OTHER STUFF HERE" is the same in each case. Any similar code in the two blocks should be refactored accordingly.

On macOS I just do it as follows:
changes=""
changes+=$(sed -i '' "s/$to_replace/$replacement/g w /dev/stdout" "$f")
if [ "$changes" != "" ]; then
echo "CHANGED!"
fi
I checked, and this is faster than md5, cksum and sha comparisons

I know it is an old question and using awk instead of sed is perhaps the best idea, but if one wants to stick with sed, an idea is to use the w flag of the s command. The file given to the w flag will only contain the lines where a substitution was made, so we only need to check that it is not empty.

perl -sple '$replaced++ if s/$from/$to/g;
END{if($replaced != 0){ print "[Info]: $replaced replacement done in $ARGV(from/to)($from/$to)"}
else {print "[Warning]: 0 replacement done in $ARGV(from/to)($from/$to)"}}' -- -from="FROM_STRING" -to="$DESIRED_STRING" </file/name>
Example:
The command will produce the following output, stating the number of changes made per file.
perl -sple '$replaced++ if s/$from/$to/g;
END{if($replaced != 0){ print "[Info]: $replaced replacement done in $ARGV(from/to)($from/$to)"}
else {print "[Warning]: 0 replacement done in $ARGV(from/to)($from/$to)"}}' -- -from="timeout" -to="TIMEOUT" *
[Info]: 5 replacement done in main.yml(from/to)(timeout/TIMEOUT)
[Info]: 1 replacement done in task/main.yml(from/to)(timeout/TIMEOUT)
[Info]: 4 replacement done in defaults/main.yml(from/to)(timeout/TIMEOUT)
[Warning]: 0 replacement done in vars/main.yml(from/to)(timeout/TIMEOUT)
Note: I have removed -i from the above command, so it will not update the files for people who are just trying out the command. If you want to enable in-place replacement in the files, add -i after perl in the above command.

To check if sed has changed MANY files (a recursive replace over all files in one directory) and produce a list of all modified files, a workaround is to work in two stages: match, then replace.
g='hello.*world'
s='s/hello.*world/bye world/g;'
d='./' # directory of input files
o='modified-files.txt'
grep -r -l -Z -E "$g" "$d" | tee "$o" | xargs -0 sed -i "$s"
the file paths in $o are zero-delimited
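To read that list back as ordinary newline-separated paths, one option is (a small sketch):
tr '\0' '\n' < modified-files.txt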

$ echo hi > abc.txt
$ sed "s/hi/bye/g; t; q1;" -i abc.txt && (echo "Changed") || (echo "Failed")
Changed
$ sed "s/hi/bye/g; t; q1;" -i abc.txt && (echo "Changed") || (echo "Failed")
Failed
https://askubuntu.com/questions/1036912/how-do-i-get-the-exit-status-when-using-the-sed-command/1036918#1036918

Don't use sed to tell if it has changed a file; instead, use grep to tell if it is going to change a file, then use sed to actually change the file. Notice the single line of sed usage at the very end of the Bash function below:
# Usage: `gs_replace_str "regex_search_pattern" "replacement_string" "file_path"`
gs_replace_str() {
REGEX_SEARCH="$1"
REPLACEMENT_STR="$2"
FILENAME="$3"
num_lines_matched=$(grep -c -E "$REGEX_SEARCH" "$FILENAME")
# Count number of matches, NOT lines (`grep -c` counts lines),
# in case there are multiple matches per line; see:
# https://superuser.com/questions/339522/counting-total-number-of-matches-with-grep-instead-of-just-how-many-lines-match/339523#339523
num_matches=$(grep -o -E "$REGEX_SEARCH" "$FILENAME" | wc -l)
# If num_matches > 0
if [ "$num_matches" -gt 0 ]; then
echo -e "\n${num_matches} matches found on ${num_lines_matched} lines in file"\
"\"${FILENAME}\":"
# Now show these exact matches with their corresponding line 'n'umbers in the file
grep -n --color=always -E "$REGEX_SEARCH" "$FILENAME"
# Now actually DO the string replacing on the files 'i'n place using the `sed`
# 's'tream 'ed'itor!
sed -i "s|${REGEX_SEARCH}|${REPLACEMENT_STR}|g" "$FILENAME"
fi
}
Place that in your ~/.bashrc file, for instance. Close and reopen your terminal and then use it.
Usage:
gs_replace_str "regex_search_pattern" "replacement_string" "file_path"
Example: replace do with bo so that "doing" becomes "boing" (I know, we should be fixing spelling errors not creating them :) ):
$ gs_replace_str "do" "bo" test_folder/test2.txt
9 matches found on 6 lines in file "test_folder/test2.txt":
1:hey how are you doing today
2:hey how are you doing today
3:hey how are you doing today
4:hey how are you doing today hey how are you doing today hey how are you doing today hey how are you doing today
5:hey how are you doing today
6:hey how are you doing today?
$SHLVL:3
References:
https://superuser.com/questions/339522/counting-total-number-of-matches-with-grep-instead-of-just-how-many-lines-match/339523#339523
https://unix.stackexchange.com/questions/112023/how-can-i-replace-a-string-in-a-files/580328#580328

Related

Netcat: cat hello.txt | nc -l 14223 > hello.txt does not transfer things in hello.txt to another machine [duplicate]

Basically I want to take as input text from a file, remove a line from that file, and send the output back to the same file. Something along these lines if that makes it any clearer.
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name > file_name
however, when I do this I end up with a blank file.
Any thoughts?
Use sponge for this kind of task. It's part of moreutils.
Try this command:
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | sponge file_name
You cannot do that because bash processes the redirections first, then executes the command. So by the time grep looks at file_name, it is already empty. You can use a temporary file though.
#!/bin/sh
tmpfile=$(mktemp)
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name > ${tmpfile}
cat ${tmpfile} > file_name
rm -f ${tmpfile}
Like that; consider using mktemp to create the tmpfile, but note that it's not POSIX.
Use sed instead:
sed -i '/seg[0-9]\{1,\}\.[0-9]\{1\}/d' file_name
try this simple one
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | tee file_name
Your file will not be blank this time :) and your output is also printed to your terminal.
You can't use redirection operator (> or >>) to the same file, because it has a higher precedence and it will create/truncate the file before the command is even invoked. To avoid that, you should use appropriate tools such as tee, sponge, sed -i or any other tool which can write results to the file (e.g. sort file -o file).
Basically redirecting input to the same original file doesn't make sense and you should use appropriate in-place editors for that, for example Ex editor (part of Vim):
ex '+g/seg[0-9]\{1,\}\.[0-9]\{1\}/d' -scwq file_name
where:
'+cmd'/-c - run any Ex/Vim command
g/pattern/d - remove lines matching a pattern using global (help :g)
-s - silent mode (man ex)
-c wq - execute :write and :quit commands
You may use sed to achieve the same (as already shown in other answers); however, in-place editing (-i) is a non-standard FreeBSD extension (it may work differently between Unix/Linux), and sed is basically a stream editor, not a file editor. See: Does Ex mode have any practical use?
One liner alternative - set the content of the file as variable:
VAR=`cat file_name`; echo "$VAR"|grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' > file_name
Since this question is the top result in search engines, here's a one-liner based on https://serverfault.com/a/547331 that uses a subshell instead of sponge (which often isn't part of a vanilla install like OS X):
echo "$(grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name)" > file_name
The general case is:
echo "$(cat file_name)" > file_name
Edit, the above solution has some caveats:
printf '%s' <string> should be used instead of echo <string> so that files containing -n don't cause undesired behavior.
Command substitution strips trailing newlines (this is a bug/feature of shells like bash) so we should append a postfix character like x to the output and remove it on the outside via parameter expansion of a temporary variable like ${v%x}.
Using a temporary variable $v stomps the value of any existing variable $v in the current shell environment, so we should nest the entire expression in parentheses to preserve the previous value.
Another bug/feature of shells like bash is that command substitution strips unprintable characters like null from the output. I verified this by calling dd if=/dev/zero bs=1 count=1 >> file_name and viewing it in hex with cat file_name | xxd -p. But echo $(cat file_name) | xxd -p is stripped. So this answer should not be used on binary files or anything using unprintable characters, as Lynch pointed out.
The general solution (albeit slightly slower, more memory intensive and still stripping unprintable characters) is:
(v=$(cat file_name; printf x); printf '%s' ${v%x} > file_name)
Test from https://askubuntu.com/a/752451:
printf "hello\nworld\n" > file_uniquely_named.txt && for ((i=0; i<1000; i++)); do (v=$(cat file_uniquely_named.txt; printf x); printf '%s' ${v%x} > file_uniquely_named.txt); done; cat file_uniquely_named.txt; rm file_uniquely_named.txt
Should print:
hello
world
Whereas calling cat file_uniquely_named.txt > file_uniquely_named.txt in the current shell:
printf "hello\nworld\n" > file_uniquely_named.txt && for ((i=0; i<1000; i++)); do cat file_uniquely_named.txt > file_uniquely_named.txt; done; cat file_uniquely_named.txt; rm file_uniquely_named.txt
Prints an empty string.
I haven't tested this on large files (probably over 2 or 4 GB).
I have borrowed this answer from Hart Simha and kos.
This is very much possible, you just have to make sure that by the time you write the output, you're writing it to a different file. This can be done by removing the file after opening a file descriptor to it, but before writing to it:
exec 3<file ; rm file; COMMAND <&3 >file ; exec 3>&-
Or line by line, to understand it better :
exec 3<file # open a file descriptor reading 'file'
rm file # remove file (but fd3 will still point to the removed file)
COMMAND <&3 >file # run command, with the removed file as input
exec 3>&- # close the file descriptor
It's still a risky thing to do, because if COMMAND fails to run properly, you'll lose the file contents. That can be mitigated by restoring the file if COMMAND returns a non-zero exit code :
exec 3<file ; rm file; COMMAND <&3 >file || cat <&3 >file ; exec 3>&-
We can also define a shell function to make it easier to use :
# Usage: replace FILE COMMAND
replace() { exec 3<"$1"; rm "$1"; "${@:2}" <&3 >"$1" || cat <&3 >"$1"; exec 3>&-; }
Example :
$ echo aaa > test
$ replace test tr a b
$ cat test
bbb
Also, note that this will keep a full copy of the original file (until the third file descriptor is closed). If you're using Linux, and the file you're processing on is too big to fit twice on the disk, you can check out this script that will pipe the file to the specified command block-by-block while unallocating the already processed blocks. As always, read the warnings in the usage page.
The following will accomplish the same thing that sponge does, without requiring moreutils:
shuf --output=file --random-source=/dev/zero
The --random-source=/dev/zero part tricks shuf into doing its thing without doing any shuffling at all, so it will buffer your input without altering it.
However, it is true that using a temporary file is best, for performance reasons. So, here is a function that I have written that will do that for you in a generalized way:
# Pipes a file into a command, and pipes the output of that command
# back into the same file, ensuring that the file is not truncated.
# Parameters:
# $1: the file.
# $2: the command. (With $3... being its arguments.)
# See https://stackoverflow.com/a/55655338/773113
siphon()
{
local tmp file rc=0
[ "$#" -ge 2 ] || { echo "Usage: siphon filename [command...]" >&2; return 1; }
file="$1"; shift
tmp=$(mktemp -- "$file.XXXXXX") || return
"$#" <"$file" >"$tmp" || rc=$?
mv -- "$tmp" "$file" || rc=$(( rc | $? ))
return "$rc"
}
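For the question's task, a call might look like this (a sketch):
siphon file_name grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}'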
There's also ed (as an alternative to sed -i):
# cf. http://wiki.bash-hackers.org/howto/edit-ed
printf '%s\n' H 'g/seg[0-9]\{1,\}\.[0-9]\{1\}/d' wq | ed -s file_name
You can use slurp with POSIX Awk:
!/seg[0-9]\{1,\}\.[0-9]\{1\}/ {
q = q ? q RS $0 : $0
}
END {
print q > ARGV[1]
}
Example
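One way to run the program above (a sketch) is to inline it, passing the file as both the input and the target of the redirection in the END block:
awk '!/seg[0-9]\{1,\}\.[0-9]\{1\}/ { q = q ? q RS $0 : $0 } END { print q > ARGV[1] }' file_name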
This does the trick pretty nicely in most of the cases I faced:
cat <<< "$(do_stuff_with f)" > f
Note that while $(…) strips trailing newlines, <<< ensures a final newline, so generally the result is magically satisfying.
(Look for “Here Strings” in man bash if you want to learn more.)
Full example:
#! /usr/bin/env bash
get_new_content() {
sed 's/Initial/Final/g' "${1:?}"
}
echo 'Initial content.' > f
cat f
cat <<< "$(get_new_content f)" > f
cat f
This does not truncate the file and yields:
Initial content.
Final content.
Note that I used a function here for the sake of clarity and extensibility, but that’s not a requirement.
A common use case is JSON editing:
echo '{ "a": 12 }' > f
cat f
cat <<< "$(jq '.a = 24' f)" > f
cat f
This yields:
{ "a": 12 }
{
"a": 24
}
Try this
echo -e "AAA\nBBB\nCCC" > testfile
cat testfile
AAA
BBB
CCC
echo "$(grep -v 'AAA' testfile)" > testfile
cat testfile
BBB
CCC
I usually use the tee program to do this:
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | tee file_name
It creates and removes a tempfile by itself.


Why does sed leave many files around?

I noticed many files in my directory, called "sedAbCdEf" or such.
Why does it create these files?
Do these have any value after a script has run?
Can I send these files to another location , e.g. /tmp/?
Update:
I checked the scripts until I found one which makes the files. Here is some sample code:
#!/bin/bash
a=1
b=`wc -l < ./file1.txt`
while [ $a -le $b ]; do
for i in `sed -n "$a"p ./file1.txt`; do
for j in `sed -n "$a"p ./file2.txt`; do
sed -i "s/$i/\nZZ$jZZ\n/g" ./file3.txt
c=`grep -c $j file3.txt`
if [ "$c" -ge 1 ]
then
echo $j >> file4.txt
echo "Replaced "$i" with "$j" "$c" times ("$a"/"$b")."
fi
echo $i" not found ("$a"/"$b")."
a=`expr $a + 1`
done
done
done
Why does it create these files?
sed -i "s/$i/\nZZ$jZZ\n/g" ./file3.txt
The -i option makes sed store its output in a temporary file.
After sed is done, it renames this temp file to replace your original file3.txt.
If something goes wrong while sed is running, these sedAbCdEf temp files are left behind.
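Conceptually, sed -i behaves roughly like this (a simplified sketch of what happens under the hood; real sed picks a random temp name and preserves permissions):
sed 's/old/new/g' file3.txt > ./sedAbCdEf && mv ./sedAbCdEf file3.txt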
Do these have any value after a script has run?
Usually not; your original file is left untouched.
Can I send these files to another location , e.g. /tmp/?
Yes you can, see above.
Edit: see this for further reading.
If you use -i option (it means make changes inplace) sed writes to a temporary file and then renames it to your file. Thus if operation is aborted your file is left unchanged.
You can see which files are opened, renamed with strace:
$ strace -e open,rename sed -i 's/a/b/g' somefile
Note: somefile is opened as readonly.
It seems there is no way to override the backup directory. GNU sed always writes in the file's directory (±symlinks). From sed/execute.c:
if (follow_symlinks)
input->in_file_name = follow_symlink (name);
else
input->in_file_name = name;
/* get the base name */
tmpdir = ck_strdup(input->in_file_name);
if ((p = strrchr(tmpdir, '/')))
*(p + 1) = 0;
else
strcpy(tmpdir, ".");
Prefix sed is hardcoded:
output_file.fp = ck_mkstemp (&input->out_file_name, tmpdir, "sed");
This may be because you run so many sed actions, in a loop, that sed creates temp files which are not always removed properly.
Sed creates un-deleteable files in Windows
Take a look at that post; this kind of issue with sed has been reported before. A workaround is to make a script or function that removes the leftovers, i.e. deletes all files whose names start with sed (something like ^sed*).
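A cleanup sketch along those lines (it assumes the leftovers are in the current directory, that find supports -maxdepth and -delete, and that none of your real files are named sed plus six characters):
find . -maxdepth 1 -type f -name 'sed??????' -delete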

Appending a line to a file only if it does not already exist

I need to add the following line to the end of a config file:
include "/configs/projectname.conf"
to a file called lighttpd.conf
I am looking into using sed to do this, but I can't work out how.
How would I only insert it if the line doesn't already exist?
Just keep it simple :)
grep + echo should suffice:
grep -qxF 'include "/configs/projectname.conf"' foo.bar || echo 'include "/configs/projectname.conf"' >> foo.bar
-q be quiet
-x match the whole line
-F pattern is a plain string
https://linux.die.net/man/1/grep
Edit: incorporated @cerin's and @thijs-wouters' suggestions.
This would be a clean, readable and reusable solution using grep and echo to add a line to a file only if it doesn't already exist:
LINE='include "/configs/projectname.conf"'
FILE='lighttpd.conf'
grep -qF -- "$LINE" "$FILE" || echo "$LINE" >> "$FILE"
If you need to match the whole line use grep -xqF
Add -s to ignore errors when the file does not exist, creating a new file with just that line.
Try this:
grep -q '^option' file && sed -i 's/^option.*/option=value/' file || echo 'option=value' >> file
Using sed, the simplest syntax:
sed \
-e '/^\(option=\).*/{s//\1value/;:a;n;ba;q}' \
-e '$aoption=value' filename
This would replace the parameter if it exists, else would add it to the bottom of the file.
Use the -i option if you want to edit the file in-place.
If you want to accept and keep white spaces, and in addition to remove the comment, if the line already exists, but is commented out, write:
sed -i \
-e '/^#\?\(\s*option\s*=\s*\).*/{s//\1value/;:a;n;ba;q}' \
-e '$aoption=value' filename
Please note that neither option nor value must contain a slash /, or you will have to escape it to \/.
To use bash-variables $option and $value, you could write:
sed -i \
-e '/^#\?\(\s*'${option//\//\\/}'\s*=\s*\).*/{s//\1'${value//\//\\/}'/;:a;n;ba;q}' \
-e '$a'${option//\//\\/}'='${value//\//\\/} filename
The bash expression ${option//\//\\/} quotes slashes, it replaces all / with \/.
Note: I just ran into a problem. In bash you may quote "${option//\//\\/}", but in the sh of busybox this does not work, so you should avoid the quotes, at least in non-Bourne shells.
All combined in a bash function:
# call option with parameters: $1=name $2=value $3=file
function option() {
name=${1//\//\\/}
value=${2//\//\\/}
sed -i \
-e '/^#\?\(\s*'"${name}"'\s*=\s*\).*/{s//\1'"${value}"'/;:a;n;ba;q}' \
-e '$a'"${name}"'='"${value}" $3
}
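Hypothetical usage of that function (the option name, value and file are just examples):
option max_clients 512 lighttpd.conf
This sets max_clients=512 if such a line (possibly commented out) already exists, or appends it to the end of lighttpd.conf otherwise.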
Explanation:
/^\(option=\).*/: Match lines that start with option= and (via .*) ignore everything after the =. The \(…\) encloses the part we will reuse as \1 later.
/^#\?\(\s*'"${option//\//\\/}"'\s*=\s*\).*/: Ignore commented-out code with # at the beginning of the line. \? means «optional». The comment will be removed, because it is outside of the copied part in \(…\). \s* means «any number of white spaces» (space, tab). White spaces are copied, since they are within \(…\), so you do not lose formatting.
/^\(option=\).*/{…}: If a line matches /…/, then execute the next command. The command to execute is not a single command, but a block {…}.
s//…/: Search and replace. Since the search term is empty //, it applies to the last match, which was /^\(option=\).*/.
s//\1value/: Replace the last match with everything in \(…\), referenced by \1, followed by the text value.
:a;n;ba;q: Set label a, then read the next line with n, then branch b (or goto) back to label a; that means: read all lines up to the end of the file, so after the first match, just fetch all following lines without further processing. Then q quits and therefore ignores everything else.
$aoption=value: At the end of the file ($), append (a) the text option=value.
More information on sed and a command overview is on my blog:
https://marc.wäckerlin.ch/computer/stream-editor-sed-overview-and-reference
If writing to a protected file, @drAlberT's and @rubo77's answers might not work for you, since one can't sudo >>. A similarly simple solution, then, is to use tee --append (or, on macOS, tee -a):
LINE='include "/configs/projectname.conf"'
FILE=lighttpd.conf
grep -qF "$LINE" "$FILE" || echo "$LINE" | sudo tee --append "$FILE"
Here's a sed version:
sed -e '\|include "/configs/projectname.conf"|h; ${x;s/incl//;{g;t};a\' -e 'include "/configs/projectname.conf"' -e '}' file
If your string is in a variable:
string='include "/configs/projectname.conf"'
sed -e "\|$string|h; \${x;s|$string||;{g;t};a\\" -e "$string" -e "}" file
If, one day, someone else has to deal with this code as "legacy code", that person will be grateful if you write less esoteric code, such as
grep -q -F 'include "/configs/projectname.conf"' lighttpd.conf
if [ $? -ne 0 ]; then
echo 'include "/configs/projectname.conf"' >> lighttpd.conf
fi
Another sed solution is to always append the entry after the last line and delete any pre-existing one.
sed -e '$a\' -e '<your-entry>' -e "/<your-entry-properly-escaped>/d"
"properly-escaped" means to put a regex that matches your entry, i.e. to escape all regex controls from your actual entry, i.e. to put a backslash in front of ^$/*?+().
this might fail on the last line of your file or if there's no dangling newline, I'm not sure, but that could be dealt with by some nifty branching...
Here is a one-liner sed which does the job inline. Note that it preserves the location of the variable and its indentation in the file when it exists. This is often important for the context, like when there are comments around or when the variable is in an indented block. Any solution based on "delete-then-append" paradigm fails badly at this.
sed -i '/^[ \t]*option=/{h;s/=.*/=value/};${x;/^$/{s//option=value/;H};x}' test.conf
With a generic pair of variable/value you can write it this way:
var=c
val='12 34' # it handles spaces nicely btw
sed -i '/^[ \t]*'"$var"'=/{h;s/=.*/='"$val"'/};${x;/^$/{s//c='"$val"'/;H};x}' test.conf
Finally, if you want also to keep inline comments, you can do it with a catch group. E.g. if test.conf contains the following:
a=123
# Here is "c":
c=999 # with its own comment and indent
b=234
d=567
Then running this
var='c'
val='"yay"'
sed -i '/^[ \t]*'"$var"'=/{h;s/=[^#]*\(.*\)/='"$val"'\1/;s/'"$val"'#/'"$val"' #/};${x;/^$/{s//'"$var"'='"$val"'/;H};x}' test.conf
Produces that:
a=123
# Here is "c":
c="yay" # with its own comment and indent
b=234
d=567
As an awk-only one-liner:
awk -v s=option=value '/^option=/{$0=s;f=1} {a[++n]=$0} END{if(!f)a[++n]=s;for(i=1;i<=n;i++)print a[i]>ARGV[1]}' file
ARGV[1] is your input file. It is opened and written to in the for loop of the END block. Opening file for output in the END block replaces the need for utilities like sponge, or for writing to a temporary file and then mv-ing the temporary file onto file.
The two assignments to array a[] accumulate all output lines into a. if(!f)a[++n]=s appends the new option=value if the main awk loop couldn't find option in file.
I have added some spaces (not many) for readability, but you really need just one space in the whole awk program, the space after print.
If file includes # comments they will be preserved.
Here's an awk implementation
/^option *=/ {
print "option=value"; # print this instead of the original line
done=1; # set a flag, that the line was found
next # all done for this line
}
{print} # all other lines -> print them
END { # end of file
if(done != 1) # haven't found /option=/ -> add it at the end of output
print "option=value"
}
Run it using
awk -f update.awk < /etc/fdm_monitor.conf > /etc/fdm_monitor.conf.tmp && \
mv /etc/fdm_monitor.conf.tmp /etc/fdm_monitor.conf
or
awk -f update.awk < /etc/fdm_monitor.conf | sponge /etc/fdm_monitor.conf
EDIT:
As a one-liner:
awk '/^option *=/ {print "option=value";d=1;next}{print}END{if(d!=1)print "option=value"}' /etc/fdm_monitor.conf | sponge /etc/fdm_monitor.conf
use awk
awk 'FNR==NR && /configs.*projectname\.conf/{f=1;next}f==0;END{ if(!f) { print "your line"}} ' file file
sed -i 's/^option.*/option=value/g' /etc/fdm_monitor.conf
grep -q "option=value" /etc/fdm_monitor.conf || echo "option=value" >> /etc/fdm_monitor.conf
here is an awk one-liner:
awk -v s="option=value" '/^option/{f=1;$0=s}7;END{if(!f)print s}' file
This doesn't change the file in place; you can, however, do:
awk '...' file > tmpfile && mv tmpfile file
Using sed, you could say:
sed -e '/option=/{s/.*/option=value/;:a;n;ba;q}' -e '$aoption=value' filename
This would replace the parameter if it exists, else would add it to the bottom of the file.
Use the -i option if you want to edit the file in-place:
sed -i -e '/option=/{s/.*/option=value/;:a;n;ba;q}' -e '$aoption=value' filename
sed -i '1 h
1 !H
$ {
x
s/^option.*/option=value/g
t
s/$/\
option=value/
}' /etc/fdm_monitor.conf
Load the whole file into the buffer; at the end, change all occurrences, and if no change occurred, add the line to the end.
The answers using grep are wrong: you need to add the -x option to match the entire line, otherwise lines like #text to add will still match when you are looking to add exactly text to add.
So the correct solution is something like:
grep -qxF 'include "/configs/projectname.conf"' foo.bar || echo 'include "/configs/projectname.conf"' >> foo.bar
Using sed: it will append the line at the end of the file. You can also pass in variables as usual, of course.
grep -qxF "port=9033" $light.conf
if [ $? -ne 0 ]; then
sed -i "$ a port=9033" $light.conf
else
echo "port=9033 already added"
fi
Using oneliner sed
grep -qxF "port=9033" $lightconf || sed -i "$ a port=9033" $lightconf
Using echo may not work when the file requires root permissions, but the following will. However, it will not let you automate things, since it might ask for a password.
I had a problem when trying to edit as root on behalf of a particular user. Just adding sudo -u $user_name in front was the fix for me.
grep -qxF "port=9033" light.conf
if [ $? -ne 0 ]; then
sudo -u $user_name echo "port=9033" >> light.conf
else
echo "already there"
fi
I elaborated on kev's grep/sed solution by setting variables in order to reduce duplication.
Set the variables in the first line (hint: $_option shall match everything on the line up until the value, including any separator like = or :).
_file="/etc/ssmtp/ssmtp.conf" _option="mailhub=" _value="my.domain.tld" \
sh -c '\
grep -q "^$_option" "$_file" \
&& sed -i "s/^$_option.*/$_option$_value/" "$_file" \
|| echo "$_option$_value" >> "$_file"\
'
Mind that the sh -c '...' just has the effect of widening the scope of the variables without the need for an export. (See Setting an environment variable before a command in bash not working for second command in a pipe)
You can use this function to find and replace config values:
#!/bin/bash
#Find and Replace config values
find_and_replace_config () {
file=$1
var=$2
new_value=$3
awk -v var="$var" -v new_val="$new_value" 'BEGIN{FS=OFS="="}match($1, "^\\s*" var "\\s*") {$2=" " new_val}1' "$file" > output.tmp && sudo mv output.tmp $file
}
find_and_replace_config /etc/php5/apache2/php.ini max_execution_time 60
If you want to run this command using a python script within a Linux terminal...
import os,sys
LINE = 'include '+ <insert_line_STRING>
FILE = <insert_file_path_STRING>
os.system('grep -qxF $"'+LINE+'" '+FILE+' || echo $"'+LINE+'" >> '+FILE)
The $ and double quotations had me in a jungle, but this worked.
Thanks everyone
Try:
LINE='include "/configs/projectname.conf"'
sed -n "\|$LINE|q;\$a $LINE" lighttpd.conf >> lighttpd.conf
Use the pipe as separator and quit if $LINE has been found. Otherwise, append $LINE at the end.
Since we only read the file in sed command, I suppose we have no clobber issue in general (it depends on your shell settings).
Using only sed I'd suggest the following solution:
sed -i \
-e 's#^include "/configs/projectname.conf"#include "/configs/projectname.conf"#' \
-e t \
-e '$ainclude "/configs/projectname.conf"' lighttpd.conf
s replaces the line include "/configs/projectname.conf" with itself (using # as the delimiter here)
t if the replacement was successful skip the rest of the commands
$a otherwise jump to the last line and append include "/configs/projectname.conf after it
Almost all of the answers work, but not in all scenarios or on every OS, in my experience. The only thing that worked on older systems as well as newer and different flavours of OS is the following.
I needed to append the KUBECONFIG path to the bashrc file if it didn't exist yet. So what I did is:
assume that the line exists and delete it with sed;
then append the line I want with echo.
sed -i '/KUBECONFIG=/d' ~/.bashrc
echo 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml' >> ~/.bashrc
I needed to edit a file with restricted write permissions so needed sudo. working from ghostdog74's answer and using a temp file:
awk 'FNR==NR && /configs.*projectname\.conf/{f=1;next}f==0;END{ if(!f) { print "your line"}} ' file > /tmp/file
sudo mv /tmp/file file

Quick unix command to display specific lines in the middle of a file?

Trying to debug an issue with a server and my only log file is a 20GB log file (with no timestamps even! Why do people use System.out.println() as logging? In production?!)
Using grep, I've found an area of the file that I'd like to take a look at, line 347340107.
Other than doing something like
head -<$LINENUM + 10> filename | tail -20
... which would require head to read through the first 347 million lines of the log file, is there a quick and easy command that would dump lines 347340100 - 347340200 (for example) to the console?
Update: I totally forgot that grep can print the context around a match ... this works well. Thanks!
I found two other solutions if you know the line number but nothing else (no grep possible):
Assuming you need lines 20 to 40,
sed -n '20,40p;41q' file_name
or
awk 'FNR>=20 && FNR<=40' file_name
When using sed it is more efficient to quit processing after having printed the last line than continue processing until the end of the file. This is especially important in the case of large files and printing lines at the beginning. In order to do so, the sed command above introduces the instruction 41q in order to stop processing after line 41 because in the example we are interested in lines 20-40 only. You will need to change the 41 to whatever the last line you are interested in is, plus one.
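The same early-exit idea can be applied to the awk version as well (a small sketch):
awk 'FNR>=20 { print } FNR==40 { exit }' file_name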
# print line number 52
sed -n '52p' # method 1
sed '52!d' # method 2
sed '52q;d' # method 3, efficient on large files
Method 3 is efficient on large files and is the fastest way to display specific lines.
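With the line number in a shell variable, that becomes (a sketch):
sed "${LINENUM}q;d" filename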
with GNU-grep you could just say
grep --context=10 ...
No there isn't, files are not line-addressable.
There is no constant-time way to find the start of line n in a text file. You must stream through the file and count newlines.
Use the simplest/fastest tool you have to do the job. To me, using head makes much more sense than grep, since the latter is way more complicated. I'm not saying "grep is slow", it really isn't, but I would be surprised if it's faster than head for this case. That'd be a bug in head, basically.
What about:
tail -n +347340107 filename | head -n 100
I didn't test it, but I think that would work.
I prefer just going into less and
typing 50% to goto halfway the file,
43210G to go to line 43210
:43210 to do the same
and stuff like that.
Even better: hit v to start editing (in vim, of course!), at that location. Now, note that vim has the same key bindings!
You can use the ex command, a standard Unix editor (part of Vim now), e.g.
display a single line (e.g. 2nd one):
ex +2p -scq file.txt
corresponding sed syntax: sed -n '2p' file.txt
range of lines (e.g. 2-5 lines):
ex +2,5p -scq file.txt
sed syntax: sed -n '2,5p' file.txt
from the given line till the end (e.g. 5th to the end of the file):
ex +5,p -scq file.txt
sed syntax: sed -n '5,$p' file.txt
multiple line ranges (e.g. 2-4 and 6-8 lines):
ex +2,4p +6,8p -scq file.txt
sed syntax: sed -n '2,4p;6,8p' file.txt
Above commands can be tested with the following test file:
seq 1 20 > file.txt
Explanation:
+ or -c followed by the command - execute the (vi/vim) command after file has been read,
-s - silent mode, also uses current terminal as a default output,
q followed by -c is the command to quit editor (add ! to do force quit, e.g. -scq!).
I'd first split the file into few smaller ones like this
$ split --lines=50000 /path/to/large/file /path/to/output/file/prefix
and then grep on the resulting files.
If the line number you want to read is 100:
head -100 filename | tail -1
Get ack
Ubuntu/Debian install:
$ sudo apt-get install ack-grep
Then run:
$ ack --lines=$START-$END filename
Example:
$ ack --lines=10-20 filename
From $ man ack:
--lines=NUM
Only print line NUM of each file. Multiple lines can be given with multiple --lines options or as a comma separated list (--lines=3,5,7). --lines=4-7 also works.
The lines are always output in ascending order, no matter the order given on the command line.
sed will need to read the data too, in order to count the lines.
The only way a shortcut would be possible is if there were context/order in the file to operate on. For example, if the log lines were prepended with a fixed-width time/date, etc.,
you could use the look Unix utility to binary-search through the file for particular dates/times.
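For example, if every line began with a sortable timestamp, something like this could locate the region by binary search (hypothetical prefix and file name; look requires the file to be sorted on that prefix):
look '2013-07-05 14:3' server.log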
Use
x=`cat -n <file> | grep <match> | awk '{print $1}'`
Here you will get the line number where the match occurred.
Now you can use the following command to print 100 lines
awk -v var="$x" 'NR>=var && NR<=var+100{print}' <file>
or you can use "sed" as well
sed -n "${x},${x+100}p" <file>
With sed -e '1,N d; M q' you'll print lines N+1 through M. This is probably a bit better than grep -C as it doesn't try to match lines to a pattern.
Building on Sklivvz' answer, here's a nice function one can put in a .bash_aliases file. It is efficient on huge files when printing stuff from the front of the file.
function middle()
{
startidx=$1
len=$2
endidx=$(($startidx+$len))
filename=$3
awk "FNR>=${startidx} && FNR<=${endidx} { print NR\" \"\$0 }; FNR>${endidx} { print \"END HERE\"; exit }" $filename
}
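For the log file in the question, that function might be called as (a sketch):
middle 347340100 100 filename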
To display a line from a <textfile> by its <line#>, just do this:
perl -wne 'print if $. == <line#>' <textfile>
If you want a more powerful way to show a range of lines with regular expressions -- I won't say why grep is a bad idea for doing this, it should be fairly obvious -- this simple expression will show you your range in a single pass which is what you want when dealing with ~20GB text files:
perl -wne 'print if m/<regex1>/ .. m/<regex2>/' <filename>
(tip: if your regex has / in it, use something like m!<regex>! instead)
This would print out <filename> starting with the line that matches <regex1> up until (and including) the line that matches <regex2>.
It doesn't take a wizard to see how a few tweaks can make it even more powerful.
Last thing: perl, since it is a mature language, has many hidden enhancements to favor speed and performance. With this in mind, it makes it the obvious choice for such an operation since it was originally developed for handling large log files, text, databases, etc.
print line 5
sed -n '5p' file.txt
sed '5q;d' file.txt
print everything other than line 5
sed '5d' file.txt
and my creation using google
#!/bin/bash
#removeline.sh
#remove deleting it comes move line xD
usage() { # Function: Print a help message.
echo "Usage: $0 -l LINENUMBER -i INPUTFILE [ -o OUTPUTFILE ]"
echo "line is removed from INPUTFILE"
echo "line is appended to OUTPUTFILE"
}
exit_abnormal() { # Function: Exit with error.
usage
exit 1
}
while getopts l:i:o:b flag
do
case "${flag}" in
l) line=${OPTARG};;
i) input=${OPTARG};;
o) output=${OPTARG};;
esac
done
if [ -f tmp ]; then
echo "Temp file:tmp exist. delete it yourself :)"
exit
fi
if [ -f "$input" ]; then
re_isanum='^[0-9]+$'
if ! [[ $line =~ $re_isanum ]] ; then
echo "Error: LINENUMBER must be a positive, whole number."
exit 1
elif [ $line -eq "0" ]; then
echo "Error: LINENUMBER must be greater than zero."
exit_abnormal
fi
if [ ! -z $output ]; then
sed -n "${line}p" $input >> $output
fi
if [ ! -z $input ]; then
# remove this sed command and this comes move line to other file
sed "${line}d" $input > tmp && cp tmp $input
fi
fi
if [ -f tmp ]; then
rm tmp
fi
You could try this command:
egrep -n "*" <filename> | egrep "<line number>"
Easy with perl! If you want to get line 1, 3 and 5 from a file, say /etc/passwd:
perl -e 'while(<>){if(++$l~~[1,3,5]){print}}' < /etc/passwd
I am surprised only one other answer (by Ramana Reddy) suggested to add line numbers to the output. The following searches for the required line number and colours the output.
file=FILE
lineno=LINENO
wb="107"; bf="30;1"; rb="101"; yb="103"
cat -n ${file} | { GREP_COLORS="se=${wb};${bf}:cx=${wb};${bf}:ms=${rb};${bf}:sl=${yb};${bf}" grep --color -C 10 "^[[:space:]]\\+${lineno}[[:space:]]"; }
