How to remove multiple lines in multiple files on Linux using bash

I am trying to remove 2 lines from all my Javascript files on my Linux shared hosting. I wanted to do this without writing a script as I know this should be possible with sed. My current attempt looks like this:
find . -name "*.js" | xargs sed -i ";var
O0l='=sTKpUG"
The second line is actually longer than this but is malicious code so I have not included it here. As you guessed my server has been hacked so I need to clean up all these JavaScript files.
I forgot to mention that the output I am getting at the moment is:
sed: -e expression #1, char 4: expected newer version of sed
The 2 lines are just as follows consecutively:
;var
O0l='=sTKpUG
except that the second line is longer, but the rest of the second line should not influence the command.

He meant removing two adjacent lines.
You can do something like this; remember to back up your files first.
find . -name "*.js" | xargs sed -i -e "/^;var/N;/^;var\nO0l='=sTKpUG/d"
Since sed processes its input line by line, it does not keep the trailing newline '\n' in its pattern space, so we tell it to append the next line (together with its newline) by using the N command.
/^;var/N;
Then we do our pattern searching and deleting.
/^;var\nO0l='=sTKpUG/d
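To sanity-check the expression before pointing it at real files, you can try it on a throwaway file first (sample.js and the xyz tail here are invented):
cat > sample.js <<'EOF'
before
;var
O0l='=sTKpUGxyz
after
EOF
sed "/^;var/N;/^;var\nO0l='=sTKpUG/d" sample.js
This should print only the before and after lines; once that looks right, add -i and the find | xargs wrapper.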

It really isn't clear yet what the two lines look like, and it isn't clear if they are adjacent to each other in the JavaScript, so we'll assume not. However, the answer is likely to be:
find . -name "*.js" |
xargs sed -i -e '/^distinctive-pattern1$/d' -e '/^alternative-pattern-2a$/d'
There are other ways of writing the sed script using a single command string; I prefer to use separate arguments for separate operations (it makes the script clearer).
Clearly, if you need to keep some of the information on one of the lines, you can use a search pattern adjusted as appropriate, and then do a substitute s/short-pattern// instead of d to remove the short section that must be removed. Similarly with the long line if that's relevant.
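For instance, if the malicious fragment had been appended to an otherwise legitimate line, a sketch along these lines (the pattern is invented for illustration; check it against the real infection first) would strip only the injected tail and keep the rest of the line:
find . -name "*.js" | xargs sed -i -e "s/O0l='=sTKpUG.*$//"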

Related

Wildcard in sed command to replace string not working

I'm trying to use the sed command in terminal to replace a specific line in all my text files with a certain extension by a specific string:
sed -i.bak '35s/^.*$/5\) 1\-4/' fitting_file*.feedme
So I am trying to replace line 35 in each of these files with the string "5) 1-4". When I run an ls fitting_file*.feedme | wc -l command in this directory, I get 221 files. However, when I run the above sed command, it only edits the FIRST file in the order of ls fitting_file*.feedme. I know this because grep '5) 1-4' fitting_file*.feedme continually only returns the first file on the list after I run the replacement command. I also tried replacing fitting_file*.feedme with a space-separated list of a couple of these files in my sed command as a test, but it still only operated on the one I chose to list first. Why is this happening?
sed operates on a single stream. It essentially concatenates all the files together and treats that as a single stream. So it replaces the 35th line of the big concatenated stream.
To see this, make a 20 line file called A and a 20 line file called B. Apply your sed command as
sed -i.bak '35s/^.*$/5\) 1\-4/' A B
and you will see the 15th line of B replaced.
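You can watch the single-stream numbering directly without touching any files (note that GNU sed's -i implies -s, which switches to per-file numbering, so this demo leaves -i off):
seq 20 > A
seq 20 > B
sed '35s/^.*$/5) 1-4/' A B | grep -n '5) 1-4'
This reports line 35 of the combined 40-line stream, which is the 15th line of B.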
I think this should answer your direct question. As for how to get done what you'd like, I assume you've already figured out that wrapping your sed command in a for loop is one way to do it. :)
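For example, a minimal version of that loop would be:
for f in fitting_file*.feedme; do sed -i.bak '35s/^.*$/5\) 1\-4/' "$f"; done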
Try this: create a file containing your sed instruction, like so
#!/bin/bash
sed -i.bak '35s/^.*$/5\) 1\-4/' "$1"
exit 0
and call it prog.sh. Next, make it executable:
chmod u+x prog.sh
now you can solve your problem using
find . -name fitting_file\*.feedme -exec ./prog.sh {} \;
You could do all this on one line but frankly the number of escapes required is a bit much. Good luck.
To do what you're trying to do without using a shell loop is:
awk -i inplace -v inplace::suffix=.bak 'FNR==35{$0="5) 1-4"}1' fitting_file*.feedme
Note that unlike sed, which just counts lines across all input files, awk has NR to track the number of records (lines, by default) across all files and FNR for the same count within just the current file.
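A quick way to see the difference on throwaway files:
seq 3 > A
seq 3 > B
awk '{print FILENAME, NR, FNR}' A B
This prints A 1 1 through A 3 3, then B 4 1 through B 6 3: NR keeps climbing across files while FNR resets to 1 for each new file.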
The above uses GNU awk for inplace editing just like GNU sed has -i for that. The default awk on MacOS is BSD awk, not GNU awk, but you should install GNU awk as it doesn't have all the bugs/quirks that BSD awk does and it has a ton of extremely useful extensions.
If you just want to use MacOS's awk then it'd be something like:
find . -name 'fitting_file*.feedme' -exec sh -c "\
awk 'FNR==35{\$0=\"5) 1-4\"}1' \"\$1\" > \"\$1.bak\" &&
mv -- \"\$1.bak\" \"\$1\"
" sh {} \;
which is obviously getting kinda complicated - I'd probably put the awk+mv script in a file to execute from sh -c or just resort to a shell loop myself if faced with that alternative (or a similar quoting nightmare with xargs)!
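As a sketch of that script-in-a-file approach (the fix35.sh name is invented here):
#!/bin/sh
# fix35.sh: replace line 35 of the file passed as $1, via a temporary copy
awk 'FNR==35{$0="5) 1-4"}1' "$1" > "$1.bak" && mv -- "$1.bak" "$1"
Make it executable with chmod u+x fix35.sh, then run:
find . -name 'fitting_file*.feedme' -exec ./fix35.sh {} \;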

Replace spaces in all files in a directory with underscores

I have found some similar questions here but not this specific one and I do not want to break all my files. I have a list of files and I simply need to replace all spaces with underscores. I know this is a sed command but I am not sure how to generically apply this to every file.
I do not want to rename the files, just modify them in place.
Edit: To clarify, just in case it's not clear, I only want to replace whitespace within the files, file names should not be changed.
find . -type f -exec sed -i -e 's/ /_/g' {} \;
find grabs all items in the directory (and subdirectories) that are files, and passes those filenames as arguments to the sed command using the {} \; notation. The sed command it appears you already understand.
if you only want to search the current directory, and ignore subdirectories, you can use
find . -maxdepth 1 -type f -exec sed -i -e 's/ /_/g' {} \;
This is a two-part problem: step 1 is providing the proper sed command, and step 2 is applying that command to every file in a given directory.
Substitution in sed commands follows the form s/ItemToReplace/ItemToReplaceWith/flags, where s stands for the substitution and the trailing flags control how the operation takes place. According to this Super User post, in order to match whitespace characters you can use a plain space, \s, or [[:space:]] in your sed command; the last two also catch tabs, and [[:space:]] is the POSIX-compliant form. Lastly, you need to specify a global operation, which is simply /g at the end, so every space on a line is replaced rather than just the first. This replaces all spaces in a file with underscores.
Therefore, your sed command is
sed 's/ /_/g' FileNameHere
However, this only accomplishes half of your task. You also need to be able to do this for every file within a directory. Wildcards won't save us here, as something like * > * would be ambiguous; the solution is to iterate through each file and overwrite it individually. Shell for loops come equipped with file iteration syntax: used with a wildcard, they expand to all files in a directory. However, plain sed with its output redirected back to the same file loses the contents, because the shell truncates the file before sed reads it. To correct this, use sed's -i flag so it edits files in place. Whatever suffix you pass after -i is used to create a backup of the old file; if an empty suffix is passed (-i '' with BSD/macOS sed, for instance), no backup is created.
Therefore the final command should simply be
for i in *; do sed -i '' 's/ /_/g' "$i"; done
This looks at every entry in your current directory and applies the sed edit to each file in place (directories do get listed, but no action occurs with them).
Well... since I was trying to get something running I found a method that worked for me:
for file in `ls`; do sed -i 's/ /_/g' "$file"; done
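A slightly safer variant avoids parsing the output of ls (which mangles names containing whitespace) and skips directories:
for file in *; do [ -f "$file" ] && sed -i 's/ /_/g' "$file"; done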

sed for a string in only 1 line

What I want to do here is locate any file that contains a specific string in a specific line, and remove said line, not just the string.
What I have is something along the lines of this:
find / -type f -name '*.foo' -exec sed '1,/stringtodetect/d' {} \;
However, given that sed argument (1,/stringtodetect/d), this will remove everything BETWEEN line 1 and the string.
Let's say I have a .php file, and I'm looking for the string 'gotcha'.
I only want to edit the file if it has the string in the FIRST line of the file, like so:
gotcha
useful text
more text
dont delete me
If I ran the script, I'd want the contents of the same file to appear as such:
useful text
more text
dont delete me
Any tips?
You are using the following range address for the delete command:
1,/stringtodetect/
This means all lines from line 1 until the first occurrence of stringtodetect.
Furthermore, you need not (and should not!) iterate over the results from find. find has the -exec option for that. It executes a command for each file which has been found, passing the filename as an argument.
It should be:
find / -type f -name '*.foo' -exec sed '/stringtodetect/d' {} \;
Test the command first. Once you are sure it works, use sed -i to modify the files in place. If you want a backup you can use sed -i.backup (for example). To remove the backups once you are sure you can use find again:
find / -type f -name '*.foo.backup' -delete
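Putting those pieces together, the tested in-place pass with backups might look like this (the .backup suffix is arbitrary):
find / -type f -name '*.foo' -exec sed -i.backup '/stringtodetect/d' {} \;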
You need a sed script that skips, by line number, every line except the one you are interested in, and deletes that one line only if it matches.
sed -e1bt -eb -e:t -e/string/d < "$file"
-e1bt = for line 1, branch to label "t"
-eb = branch unconditionally to the end of the script (at which point it will print the line).
-e:t = define label "t"
-e/string/d = delete the line if it contains "string" - this instruction will only be reached if the unconditional branch to the end of the script was NOT taken, i.e. if the line number branch WAS taken.
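A quick check on throwaway input shows it only touching line 1:
printf 'string here\nstring again\nlast\n' > test.txt
sed -e1bt -eb -e:t -e/string/d < test.txt
This prints string again and last: the first line was deleted because it matched, while the later match was left alone.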
Could it be that it is matching parts of a string? Trying an exact match might help.
Also, remove the 1, at the beginning, or replace it with 0, (a GNU sed extension):
sed '/<stringtodetect>/d' "$file";
sed is for simple substitutions on individual lines, that is all. For anything else just use awk for simplicity, clarity, robustness, portability and all of the other desirable attributes of software:
awk '!(NR==1 && /stringtodetect/)' file
You were close. I think what you're looking for is: sed '1{/gotcha/d;}'
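Once tested, that can be combined with find for in-place editing (the path and extension here are placeholders taken from the question):
find / -type f -name '*.php' -exec sed -i '1{/gotcha/d;}' {} \;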

grep based on blacklist -- without procedural code?

It's a well-known task, simple to describe:
Given a text file foo.txt, and a blacklist file of exclusion strings, one per line, produce foo_filtered.txt that has only the lines of foo.txt that do not contain any exclusion string.
A common application is filtering compiler warnings from a build log, so as to ignore warnings in files that are not yours. The file foo.txt is the warnings file (itself filtered from the build log), and the blacklist file excluded_filenames.txt holds file names, one per line.
I know how it's done in procedural languages like Perl or AWK, and I've even done it with combinations of Linux commands such as cut, comm, and sort.
But I feel that I should be really close with xargs, and just can't see the last step.
I know that if excluded_filenames.txt has only 1 file name in it, then
grep -v `cat excluded_filenames.txt` foo.txt
will do it.
And I know that I can get the filenames one per line with
xargs -L1 -a excluded_filenames.txt
So how do I combine those two into a single solution, without explicit loops in a procedural language?
Looking for the simple and elegant solution.
You should use the -f option (not to be confused with fgrep, which is equivalent to grep -F):
grep -vf excluded_filenames.txt foo.txt
You could also use -F which is more directly the answer to what you asked:
grep -vF "`cat excluded_filenames.txt`" foo.txt
from man grep
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file contains zero patterns, and therefore matches nothing.
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched.
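A quick check with throwaway data shows the effect:
printf 'one\ntwo\nthree\n' > foo.txt
printf 'two\n' > excluded_filenames.txt
grep -vFf excluded_filenames.txt foo.txt
This prints one and three. Combining -F with -f treats each blacklist line as a fixed string rather than a regex, which is usually what you want for file names (so dots are not treated as metacharacters).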

Replacing a line in a csv file?

I have a set of 10 CSV files, which normally have entries of this kind
a,b,c,d
d,e,f,g
Now, due to some error, entries in these files have become of this kind
a,b,c,d
d,e,f,g
,,,
h,i,j,k
Now I want to remove the line with only commas in all the files. These files are on a Linux filesystem.
Is there any command you can recommend that replaces the erroneous lines in all the files?
It depends on what you mean by replace. If you mean 'remove', then a trivial variant on wnoise's solution is:
grep -v '^,,,$' old-file.csv > new-file.csv
Note that this deletes just those lines with exactly three commas. If you want to delete mal-formed lines with any number of commas (including zero) - and no other characters on the line, then:
grep -v '^,*$' ...
There are endless other variations on the regex that would deal with other scenarios. Dealing with full CSV data with commas inside quotes starts to need something other than a regex machine. It can be done, within broad limits, especially in more complex regex systems such as PCRE or Perl. But it requires more work.
Check out Mastering Regular Expressions.
sed 's/,,,/replacement/' < old-file.csv > new-file.csv
optionally followed by
mv new-file.csv old-file.csv
Replace or remove, your post is not clear... For replacement see wnoise's answer. For removing, you could use
awk '$0 !~ /,,,/ {print}' <old-file.csv > new-file.csv
What about trying to keep only lines which match the desired format, instead of handling one exception?
If the provided input is what you really want to match:
grep -E '[a-z],[a-z],[a-z],[a-z]' < oldfile.csv > newfile.csv
If the input is different, provide it; the regular expression should not be too hard to write.
Do you want to replace them with something, or delete them entirely? Either way, it can be done with sed. To delete:
sed -i -e '/^,\+$/ D' yourfile1.csv yourfile2.csv ...
To replace: well, see wnoise's answer, or if you don't want to create new files with the output,
sed -i -e '/^,\+$/ s//replacement/' yourfile1.csv yourfile2.csv ...
or
sed -i -e '/^,\+$/ c\
replacement' yourfile1.csv yourfile2.csv ...
(that should be entered exactly as is, including the line break). Of course, you can also do this with awk or perl or, if you're only deleting lines, even grep:
egrep -v '^,+$' < oldfile.csv > newfile.csv
I tested these to make sure they work, but I'd advise you to do the same before using them (just in case). You can omit the -i option from sed, in which case it'll print out the results (rather than writing them back to the file), or omit the output redirection >newfile.csv from grep.
EDIT: It was pointed out in a comment that some features of these sed commands only work on GNU sed. As far as I can tell, these are the -i option (which can be replaced with shell redirection, sed ... <infile >outfile ) and the \+ modifier (which can be replaced with \{1,\} ).
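So a maximally portable version of the delete, using only those POSIX features, would be:
sed '/^,\{1,\}$/d' old-file.csv > new-file.csv && mv new-file.csv old-file.csv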
Most simply:
$ grep -v ,,, oldfile > newfile
$ mv newfile oldfile
Yes, awk and grep are very good options if you are working on a Linux platform. On other platforms you can use a Perl regex, or Perl's join and split functions.
