What's the simplest way to do a find and replace for a given input string, say abc, and replace with another string, say XYZ in file /tmp/file.txt?
I am writing an app and using IronPython to execute commands through SSH, but I don't know Unix that well and don't know what to look for.
I have heard that Bash, apart from being a command line interface, can be a very powerful scripting language. So, if this is true, I assume you can perform actions like these.
Can I do it with bash, and what's the simplest (one line) script to achieve my goal?
The easiest way is to use sed (or perl):
sed -i -e 's/abc/XYZ/g' /tmp/file.txt
This invokes sed to do an in-place edit, thanks to the -i option. It can be called from Bash.
If you really really want to use just bash, then the following can work:
while IFS='' read -r a; do
echo "${a//abc/XYZ}"
done < /tmp/file.txt > /tmp/file.txt.t
mv /tmp/file.txt{.t,}
This loops over each line, doing a substitution, and writing to a temporary file (don't want to clobber the input). The move at the end just moves temporary to the original name. (For robustness and security, the temporary file name should not be static or predictable, but let's not go there.)
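If you do want a safer temporary name, a minimal sketch using mktemp (assuming it is available on your system) might look like this:
tmp=$(mktemp /tmp/file.txt.XXXXXX) || exit 1
while IFS='' read -r a; do
    echo "${a//abc/XYZ}"
done < /tmp/file.txt > "$tmp"
mv "$tmp" /tmp/file.txt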
For Mac users:
sed -i '' 's/abc/XYZ/g' /tmp/file.txt
(See the note for Mac users further down for an explanation.)
File manipulation isn't normally done by Bash, but by programs invoked by Bash, e.g.:
perl -pi -e 's/abc/XYZ/g' /tmp/file.txt
The -i flag tells it to do an in-place replacement.
See man perlrun for more details, including how to take a backup of the original file.
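For example, giving -i a suffix makes Perl keep a backup of the original (standard perlrun behaviour):
perl -pi.bak -e 's/abc/XYZ/g' /tmp/file.txt
The unmodified file is preserved as /tmp/file.txt.bak.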
I was surprised when I stumbled over this...
There is a replace command which ships with the "mysql-server" package, so if you have it installed, try it out:
# replace the string abc with XYZ in files
replace "abc" "XYZ" -- file.txt file2.txt file3.txt
# or pipe an echo to replace
echo "abcdef" | replace "abc" "XYZ"
See man replace for more on this.
This is an old post, but for anyone wanting to use variables: as @centurian said, the single quotes mean nothing will be expanded.
A simple way to get variables in is string concatenation. Since this is done by juxtaposition in Bash, the following should work:
sed -i -e "s/$var1/$var2/g" /tmp/file.txt
Bash, like other shells, is just a tool for coordinating other commands. Typically you would try to use standard UNIX commands, but you can of course use Bash to invoke anything, including your own compiled programs, other shell scripts, Python and Perl scripts etc.
In this case, there are a couple of ways to do it.
If you want to read a file, and write it to another file, doing search/replace as you go, use sed:
sed 's/abc/XYZ/g' <infile >outfile
If you want to edit the file in place (as if opening the file in an editor, editing it, then saving it), supply instructions to the line editor 'ex':
echo "%s/abc/XYZ/g
w
q
" | ex file
ex is like vi without the fullscreen mode. You can give it the same commands you would at vi's : prompt.
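The same commands can also be fed to ex with a here document, which avoids embedding newlines in an echo string (a minor variant of the same approach):
ex file <<'EOF'
%s/abc/XYZ/g
w
q
EOF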
I found this thread among others, and I agree it contains the most complete answers, so I'm adding mine too:
sed and ed are so useful... by hand.
Look at this code from @Johnny:
sed -i -e 's/abc/XYZ/g' /tmp/file.txt
When my restriction is to use it in a shell script, no variable can be used in place of "abc" or "XYZ". The BashFAQ seems to agree with my understanding, at least. So I can't use:
x='abc'
y='XYZ'
sed -i -e 's/$x/$y/g' /tmp/file.txt
#or,
sed -i -e "s/$x/$y/g" /tmp/file.txt
But what can we do? As @Johnny said, use a while read... but unfortunately, that's not the end of the story. The following worked well for me:
#edit user's virtual domain
result=
#if nullglob is set, unset it temporarily
is_nullglob=$( shopt -s | grep -i 'nullglob' )
if [[ -n $is_nullglob ]]; then
shopt -u nullglob
fi
while IFS= read -r line; do
line="${line//'<servername>'/$server}"
line="${line//'<serveralias>'/$alias}"
line="${line//'<user>'/$user}"
line="${line//'<group>'/$group}"
result="$result""$line"'\n'
done < "$tmp"
echo -e "$result" > "$tmp"
#if nullglob was set, re-enable it
if [[ -n $is_nullglob ]]; then
shopt -s nullglob
fi
#move user's virtual domain to Apache 2 domain directory
......
As one can see, if nullglob is set then it behaves strangely when there is a string containing a *, as in:
<VirtualHost *:80>
ServerName www.example.com
which becomes
<VirtualHost ServerName www.example.com
There is no closing angle bracket, and Apache 2 can't even load.
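As an aside, a tighter way to save and restore the option is to test it with shopt -q, whose exit status reports whether the option is set (a sketch):
if shopt -q nullglob; then
    had_nullglob=1
    shopt -u nullglob
fi
# ... do the parsing ...
if [[ ${had_nullglob-} ]]; then
    shopt -s nullglob
fi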
This kind of parsing should be slower than a one-hit search and replace but, as you already saw, four variables with four different search patterns are handled in one parse cycle. It is the most suitable solution I can think of under the given assumptions of the problem.
You can use sed:
sed -i 's/abc/XYZ/gi' /tmp/file.txt
You can use find and sed if you don't know your filename:
find ./ -type f -exec sed -i 's/abc/XYZ/gi' {} \;
Find and replace in all Python files:
find ./ -iname "*.py" -type f -exec sed -i 's/abc/XYZ/gi' {} \;
Be careful if you replace URLs containing the "/" character.
An example of how to do it (using % as the sed delimiter instead):
sed -i "s%http://domain.com%http://www.domain.com/folder/%g" "test.txt"
Extracted from: http://www.sysadmit.com/2015/07/linux-reemplazar-texto-en-archivos-con-sed.html
If the file you are working on is not so big, and temporarily storing it in a variable is no problem, then you can use Bash string substitution on the whole file at once - there's no need to go over it line by line:
file_contents=$(</tmp/file.txt)
echo "${file_contents//abc/XYZ}" > /tmp/file.txt
The whole file contents will be treated as one long string, including linebreaks.
XYZ can be a variable, e.g. $replacement, and one advantage of not using sed here is that you need not be concerned that the search or replacement string might contain the sed pattern delimiter character (usually, but not necessarily, /). A disadvantage is not being able to use regular expressions or any of sed's more sophisticated operations.
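For instance, with both sides in variables (a minimal sketch; note that the search side of ${var//pattern/string} is a glob pattern, not a regular expression):
search='abc'
replacement='XYZ'
file_contents=$(</tmp/file.txt)
echo "${file_contents//$search/$replacement}" > /tmp/file.txt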
You may also use the ed command to do in-file search and replace:
# delete all lines matching foobar
ed -s test.txt <<< $'g/foobar/d\nw'
See more in "Editing files via scripts with ed".
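ed can also do the substitution itself, following the same pattern (a small sketch reusing the abc/XYZ example):
# replace abc with XYZ on every matching line, then write
ed -s test.txt <<< $'g/abc/s//XYZ/g\nw'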
To edit text in a file non-interactively, you need an in-place text editor such as vim.
Here is a simple example of how to use it from the command line:
vim -esnc '%s/foo/bar/g|:wq' file.txt
This is equivalent to @slim's answer using the ex editor; it is basically the same thing.
Here are a few practical ex examples.
Replacing text foo with bar in the file:
ex -s +%s/foo/bar/ge -cwq file.txt
Removing trailing whitespaces for multiple files:
ex +'bufdo!%s/\s\+$//e' -cxa *.txt
Troubleshooting (when the terminal is stuck):
Add -V1 param to show verbose messages.
Force quit by: -cwq!.
See also:
How to edit files non-interactively (e.g. in pipeline)? at Vi SE
Try the following shell command:
find ./ -type f -name "file*.txt" | xargs sed -i -e 's/abc/xyz/g'
You can use Python within the Bash script too. I didn't have much success with some of the top answers here, and found this to work without the need for loops (note that the Python code has to be fed to the interpreter, here via a here document):
#!/bin/bash
# feed the embedded script to Python on stdin
python - <<'EOF'
filetosearch = '/home/ubuntu/ip_table.txt'
texttoreplace = 'tcp443'
texttoinsert = 'udp1194'
s = open(filetosearch).read()
s = s.replace(texttoreplace, texttoinsert)
f = open(filetosearch, 'w')
f.write(s)
f.close()
EOF
The simplest way to replace multiple strings in a file is with a single sed command:
Command:
sed -i 's#a/b/c#D/E#g;s#/x/y/z#D:/X#g;' filename
In the above command, s#a/b/c#D/E#g replaces a/b/c with D/E; after the ; the second expression does the same for the next pattern. Using # as the delimiter avoids having to escape the / characters in the patterns.
You can use the rpl command. For example, if you want to change a domain name in a whole PHP project:
rpl -ivRpd -x'.php' 'old.domain.name' 'new.domain.name' ./path_to_your_project_folder/
This is not pure Bash, of course, but it's very quick and useful. :)
For Mac users, in case you don't read the comments :)
As mentioned by @Austin, if you get the Invalid command code error, it's because BSD sed requires a file extension after the -i flag for in-place replacements; it uses it to save a backup with that extension.
sed -i '.bak' 's/find/replace/' /file.txt
You can use an empty string '' if you want to skip the backup.
sed -i '' 's/find/replace/' /file.txt
All credit to @Austin.
Open the file in the vim editor. Then, in command mode:
:%s/abc/xyz/g
This is the simplest way.
In the case of making changes across multiple files together, we can do it in a single line:
user_name=$(whoami)
for file in file1.txt file2.txt file3.txt; do sed -i -e "s/default_user/${user_name}/g" "$file"; done
Adding this in case it could be useful.
I am quite new to shell scripting.
I am scraping a website, and the scraped text contains a lot of repetitions; usually they are the menus on a forum, for example. Mostly I do this in Python, but I thought that the sed command would save me reading and printing the input, loops, etc. I want to delete thousands of repeated lines from the same single file. I do not want to copy it to another file, because I would end up with 100 new files. The following is a shadow script which I run from the Bash shell:
#!/bin/sed -f
sed -i '/^how$/d' input_file.txt
sed -i '/^is test$/d' input_file.txt
sed -i '/^repeated text/d' input_file.txt
This is the content of the input file:
how to do this task
why it is not working
this is test
Stackoverflow is a very helpful community of programmers
that is test
this is text
repeated text is common
this is repeated text of the above line
Then I run in the shell the following command:
sed -f scriptFile input_file.txt
I get the following error:
sed: scriptFile line 2: unterminated `s' command
How can I correct the script, and what is the correct syntax of the command I should use to get it work?
Any help is highly appreciated.
Assuming you know what your script is doing, it's very easy to put the commands into a script. In your case, the script should be:
/^how$/d
/^is test$/d
/^repeated text/d
That's good enough.
Making the script itself executable is easy too:
#!/usr/bin/env sed -f
/^how$/d
/^is test$/d
/^repeated text/d
then
chmod +x your_sed_script
./your_sed_script <old >new
Here is a very good and compact tutorial; you can learn a lot from it.
Following is an example from the site, just in case the link goes dead:
If you have a large number of sed commands, you can put them into a file and use
sed -f sedscript <old >new
where sedscript could look like this:
# sed comment - This script changes lower case vowels to upper case
s/a/A/g
s/e/E/g
s/i/I/g
s/o/O/g
s/u/U/g
Wouldn't it be easier to do it with egrep followed by a mv? For example:
egrep -v 'pattern1|pattern2|pattern3|...' <input_file.txt >tmpfile.txt
mv tmpfile.txt input_file.txt
Each pattern describes the lines being deleted, much as in sed. You would not end up with additional files, because the mv replaces the original.
If you have so many patterns that you don't want to specify them directly on the command line, you can store them in a file and use the -f option of egrep, as sketched below.
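A sketch, assuming the patterns are stored one per line in a file called patterns.txt (a hypothetical name):
egrep -v -f patterns.txt <input_file.txt >tmpfile.txt
mv tmpfile.txt input_file.txt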
I am trying to do a simple operation here, which is to cut a few characters from one file (style.css) and do a find and replace on another file (client_custom.css), for more than 100 directories with different names.
When I use the following command
for d in */; do sed -n 73p ~/assets/*/style.css | cut -c 29-35 | xargs -I :hex: sed -i 's/!BGCOLOR!/:hex:/' ~/assets/*/client_custom.css $d; done
It keeps giving me the following error for all the directories
sed: couldn't edit dirname/: not a regular file
I am confused about why it's giving me that error message when I explicitly gave the full path to the file. It works perfectly fine without a for loop.
Can anyone please help me out with this issue?
sed doesn't support folders as input.
for d in */;
puts folders into $d. If you write sed ... $d, then Bash will put the folder name into the arguments of sed, and the poor tool will be confused.
The same applies to ~/assets/*/client_custom.css, since this will expand to all the files which match the pattern, so sed will be called once with all the file names. You probably want to invoke sed once per file name.
Try
for f in ~/assets/*/client_custom.css; do
... | sed -i 's/!BGCOLOR!/:hex:/' "$f"
done
or, even better:
for f in ~/assets/*/client_custom.css; do
... | sed 's/!BGCOLOR!/:hex:/' "${f}.in" > "${f}"
done
(which doesn't overwrite the output file). This way, you can keep the "*.in" files, edit them with the patterns and then use sed to "expand" all the variables.
I'm trying to copy a bunch of files below a directory and a number of the files have spaces and single-quotes in their names. When I try to string together find and grep with xargs, I get the following error:
find .|grep "FooBar"|xargs -I{} cp "{}" ~/foo/bar
xargs: unterminated quote
Any suggestions for a more robust usage of xargs?
This is on Mac OS X 10.5.3 (Leopard) with BSD xargs.
You can combine all of that into a single find command:
find . -iname "*foobar*" -exec cp -- "{}" ~/foo/bar \;
This will handle filenames and directories with spaces in them. You can use -name to get case-sensitive results.
Note: The -- flag passed to cp prevents it from processing files starting with - as options.
find . -print0 | grep -z 'FooBar' | xargs -0 ...
I don't know whether grep supports -z (--null-data), nor whether xargs supports -0, on Leopard, but on GNU it's all good.
The easiest way to do what the original poster wants is to change the delimiter from any whitespace to just the end-of-line character like this:
find whatever ... | xargs -d "\n" cp -t /var/tmp
This is more efficient as it does not run "cp" multiple times:
find -name '*FooBar*' -print0 | xargs -0 cp -t ~/foo/bar
I ran into the same problem. Here's how I solved it:
find . -name '*FooBar*' | sed 's/.*/"&"/' | xargs cp -t ~/foo/bar
I used sed to substitute each line of input with the same line, but surrounded by double quotes. From the sed man page, "...An ampersand (``&'') appearing in the replacement is replaced by the string matching the RE..." -- in this case, .*, the entire line. Since xargs appends the file names at the end of the command, cp's -t option (a GNU extension) is used to name the target directory up front.
This solves the xargs: unterminated quote error.
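To see the effect of the sed expression on its own:
$ echo 'file with spaces.txt' | sed 's/.*/"&"/'
"file with spaces.txt"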
This method works on Mac OS X v10.7.5 (Lion):
find . | grep FooBar | xargs -I{} cp {} ~/foo/bar
I also tested the exact syntax you posted. That also worked fine on 10.7.5.
Just don't use xargs. It is a neat program but it doesn't go well with find when faced with non trivial cases.
Here is a portable (POSIX) solution, i.e. one that doesn't require find, xargs or cp GNU specific extensions:
find . -name "*FooBar*" -exec sh -c 'cp -- "$#" ~/foo/bar' sh {} +
Note the ending + instead of the more usual ;. The trailing sh becomes $0 inside the inline script, and the file names collected by {} + arrive as "$@".
This solution:
correctly handles files and directories with embedded spaces, newlines or whatever exotic characters.
works on any Unix and Linux system, even those not providing the GNU toolkit.
doesn't use xargs which is a nice and useful program, but requires too much tweaking and non standard features to properly handle find output.
is also more efficient (read: faster) than the accepted answer and most, if not all, of the other answers.
Note also that, despite what is stated in some other replies or comments, quoting {} is useless (unless you are using the exotic fish shell).
Look into using the --null commandline option for xargs with the -print0 option in find.
For those who rely on commands other than find, e.g. ls:
find . | grep "FooBar" | tr \\n \\0 | xargs -0 -I{} cp "{}" ~/foo/bar
find | perl -lne 'print quotemeta' | xargs ls -d
I believe that this will work reliably for any character except line-feed (and I suspect that if you've got line-feeds in your filenames, then you've got worse problems than this). It doesn't require GNU findutils, just Perl, so it should work pretty much anywhere.
I have found that the following syntax works well for me.
find /usr/pcapps/ -mount -type f -size +1000000c | perl -lpe ' s{ }{\\ }g ' | xargs ls -l | sort +4nr | head -200
In this example, I am looking for the largest 200 files over 1,000,000 bytes in the filesystem mounted at "/usr/pcapps".
The Perl one-liner between "find" and "xargs" escapes/quotes each blank, so "xargs" passes any filename with embedded blanks to "ls" as a single argument.
Frame challenge: you're asking how to use xargs. The answer is: you don't use xargs, because you don't need it.
The comment by user80168 describes a way to do this directly with cp, without calling cp for every file:
find . -name '*FooBar*' -exec cp -t /tmp -- {} +
This works because:
the -t flag allows you to give cp the target directory near the beginning of the command, rather than near the end. From man cp:
-t, --target-directory=DIRECTORY
copy all SOURCE arguments into DIRECTORY
The -- flag tells cp to interpret everything after as a filename, not a flag, so files starting with - or -- do not confuse cp; you still need this because the -/-- characters are interpreted by cp, whereas any other special characters are interpreted by the shell.
The find -exec command {} + variant essentially does the same as xargs. From man find:
-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of
invocations of the command will be much less than the number of
matched files. The command line is built in much the same way
that xargs builds its command lines. Only one instance of `{}'
is allowed within the command, and (when find is being invoked
from a shell) it should be quoted (for example, '{}') to protect
it from interpretation by shells. The command is executed in
the starting directory. If any invocation returns a non-zero
value as exit status, then find returns a non-zero exit status.
If find encounters an error, this can sometimes cause an
immediate exit, so some pending commands may not be run at all.
This variant of -exec always returns true.
By using this in find directly, this avoids the need of a pipe or a shell invocation, such that you don't need to worry about any nasty characters in filenames.
With Bash (not POSIX) you can use process substitution to get the current line inside a variable. This enables you to use quotes to escape special characters:
while IFS= read -r line ; do cp "$line" ~/bar ; done < <(find . | grep foo)
Be aware that most of the options discussed in other answers are not standard on platforms that do not use the GNU utilities (Solaris, AIX, HP-UX, for instance). See the POSIX specification for 'standard' xargs behaviour.
I also find the behaviour of xargs whereby it runs the command at least once, even with no input, to be a nuisance.
I wrote my own private version of xargs (xargl) to deal with the problems of spaces in names (only newlines separate names), though the 'find ... -print0' and 'xargs -0' combination is pretty neat, given that file names cannot contain ASCII NUL '\0' characters. My xargl isn't as complete as it would need to be to be worth publishing, especially since GNU has facilities that are at least as good.
For me, I was trying to do something a little different. I wanted to copy my .txt files into my tmp folder. The .txt filenames contain spaces and apostrophe characters. This worked on my Mac.
$ find . -type f -name '*.txt' | sed 's/'"'"'/\'"'"'/g' | sed 's/.*/"&"/' | xargs -I{} cp -v {} ./tmp/
If the find and xargs versions on your system don't support the -print0 and -0 switches (for example, AIX find and xargs), you can use this terrible-looking code:
find . -name "*foo*" | sed -e "s/'/\\\'/g" -e 's/"/\\"/g' -e 's/ /\\ /g' | xargs cp /your/dest
Here sed will take care of escaping the spaces and quotes for xargs.
Tested on AIX 5.3
I created a small portable wrapper script called "xargsL" around "xargs" which addresses most of the problems.
Contrary to xargs, xargsL accepts one pathname per line. The pathnames may contain any character except (obviously) newline or NUL bytes.
No quoting is allowed or supported in the file list - your file names may contain all sorts of whitespace, backslashes, backticks, shell wildcard characters and the like - xargsL will process them as literal characters, no harm done.
As an added bonus feature, xargsL will not run the command even once if there is no input!
Note the difference:
$ true | xargs echo no data
no data
$ true | xargsL echo no data # No output
Any arguments given to xargsL will be passed through to xargs.
Here is the "xargsL" POSIX shell script:
#! /bin/sh
# Line-based version of "xargs" (one pathname per line which may contain any
# amount of whitespace except for newlines) with the added bonus feature that
# it will not execute the command if the input file is empty.
#
# Version 2018.76.3
#
# Copyright (c) 2018 Guenther Brunthaler. All rights reserved.
#
# This script is free software.
# Distribution is permitted under the terms of the GPLv3.
set -e
trap 'test $? = 0 || echo "$0 failed!" >& 2' 0
if IFS= read -r first
then
{
printf '%s\n' "$first"
cat
} | sed 's/./\\&/g' | xargs ${1+"$#"}
fi
Put the script into some directory in your $PATH and don't forget to make it executable:
$ chmod +x xargsL
bill_starr's Perl version won't work well for embedded newlines (it only copes with spaces). For those on e.g. Solaris, where you don't have the GNU tools, a more complete version might be (using sed):
find . -type f | sed 's/./\\&/g' | xargs grep string_to_find
Adjust the find and grep arguments, or other commands, as you require, but the sed will fix your embedded newlines/spaces/tabs.
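To make the effect concrete: the sed simply prefixes every character with a backslash, which xargs then strips while splitting arguments. For example:
$ printf '%s\n' 'a file name' | sed 's/./\\&/g'
\a\ \f\i\l\e\ \n\a\m\e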
I used Bill Starr's answer, slightly modified, on Solaris:
find . -mtime +2 | perl -pe 's{^}{\"};s{$}{\"}' > ~/output.file
This will put quotes around each line. I didn't use the -l option, although it probably would help.
The file list I was going through might have '-', but not newlines. I haven't used the output file with any other commands, as I want to review what was found before I just start massively deleting them via xargs.
I played with this a little, started contemplating modifying xargs, and realised that for the kind of use case we're talking about here, a simple reimplementation in Python is a better idea.
For one thing, having ~80 lines of code for the whole thing means it is easy to figure out what is going on, and if different behaviour is required, you can just hack it into a new script in less time than it takes to get a reply on somewhere like Stack Overflow.
See https://github.com/johnallsup/jda-misc-scripts/blob/master/yargs and https://github.com/johnallsup/jda-misc-scripts/blob/master/zargs.py.
With yargs as written (and Python 3 installed) you can type:
find .|grep "FooBar"|yargs -l 203 cp --after ~/foo/bar
to do the copying 203 files at a time. (Here 203 is just a placeholder, of course, and using a strange number like 203 makes it clear that this number has no other significance.)
If you really want something faster and without the need for Python, take zargs and yargs as prototypes and rewrite in C++ or C.
You might need to grep for FooBar within a directory, like:
find . -name "file.ext"| grep "FooBar" | xargs -i cp -p "{}" .
If you are using Bash, you can convert stdout to an array of lines with mapfile:
find . | grep "FooBar" | (mapfile -t; cp "${MAPFILE[@]}" ~/foobar)
The benefits are:
It's built in, so it's faster.
It executes the command with all file names at once, so it's faster.
You can append other arguments to the file names. For cp, you can also:
find . -name '*FooBar*' -exec cp -t ~/foobar -- {} +
However, some commands don't have such a feature.
The disadvantages:
It may not scale well if there are too many file names. (The limit? I don't know, but I tested with a 10 MB list file which included 10,000+ file names with no problem, under Debian.)
Well... who knows if Bash is available on OS X?
I have a set of 10 CSV files, which normally have entries of this kind:
a,b,c,d
d,e,f,g
Now, due to some error, entries in these files have become of this kind:
a,b,c,d
d,e,f,g
,,,
h,i,j,k
Now I want to remove the line with only commas in all the files. These files are on a Linux filesystem.
Is there any command you would recommend that can replace the erroneous lines in all the files?
It depends on what you mean by replace. If you mean 'remove', then a trivial variant on @wnoise's solution is:
grep -v '^,,,$' old-file.csv > new-file.csv
Note that this deletes just those lines with exactly three commas. If you want to delete mal-formed lines with any number of commas (including zero) - and no other characters on the line, then:
grep -v '^,*$' ...
There are endless other variations on the regex that would deal with other scenarios. Dealing with full CSV data with commas inside quotes starts to need something other than a regex machine. It can be done, within broad limits, especially in more complex regex systems such as PCRE or Perl. But it requires more work.
Check out Mastering Regular Expressions.
sed 's/,,,/replacement/' < old-file.csv > new-file.csv
optionally followed by
mv new-file.csv old-file.csv
Replace or remove? Your post is not clear. For replacement, see wnoise's answer. For removal, you could use:
awk '$0 !~ /,,,/ {print}' <old-file.csv > new-file.csv
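If you only want to drop the all-comma lines, the pattern can also be anchored and the print left implicit, since awk prints matching lines by default:
awk '!/^,+$/' old-file.csv > new-file.csv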
What about trying to keep only the lines which match the desired format, instead of handling one exception?
If the provided input is what you really want to match:
grep -E '[a-z],[a-z],[a-z],[a-z]' < oldfile.csv > newfile.csv
If the input is different, provide it; the regular expression should not be too hard to write.
Do you want to replace them with something, or delete them entirely? Either way, it can be done with sed. To delete:
sed -i -e '/^,\+$/ D' yourfile1.csv yourfile2.csv ...
To replace: well, see wnoise's answer, or if you don't want to create new files with the output,
sed -i -e '/^,\+$/ s//replacement/' yourfile1.csv yourfile2.csv ...
or
sed -i -e '/^,\+$/ c\
replacement' yourfile1.csv yourfile2.csv ...
(that should be entered exactly as is, including the line break). Of course, you can also do this with awk or perl or, if you're only deleting lines, even grep:
egrep -v '^,+$' < oldfile.csv > newfile.csv
I tested these to make sure they work, but I'd advise you to do the same before using them (just in case). You can omit the -i option from sed, in which case it'll print out the results (rather than writing them back to the file), or omit the output redirection >newfile.csv from grep.
EDIT: It was pointed out in a comment that some features of these sed commands only work on GNU sed. As far as I can tell, these are the -i option (which can be replaced with shell redirection, sed ... <infile >outfile) and the \+ modifier (which can be replaced with \{1,\}).
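Putting those two substitutions together, a portable (POSIX) version of the delete command might look like this:
sed -e '/^,\{1,\}$/ d' <oldfile.csv >newfile.csv
mv newfile.csv oldfile.csv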
Most simply:
$ grep -v ,,, oldfile > newfile
$ mv newfile oldfile
Yes, awk or grep are very good options if you are working on a Linux platform. However, you can use Perl regexes on other platforms, using join and split operations.