How can I exclude specific lines with sed? - linux

I have this command to run git log and refresh a list of files:
sed -E 's|(.*): .*|echo \1: $(git log -1 --pretty="format:%ct" \1)|e' app/config/file.yml
My problem is that this command refreshes every line in file.yml, but lines with a certain prefix should not be refreshed. The prefix is web/compile/*
I tried the following, but unfortunately it deletes everything without the /web/compile prefix.
sed -i.bkp '/web\/compiled\/*/!e' -E 's|(.*): .*|echo \1: $(git log -1 --pretty="format:%ct" \1)|e' app/config/file.yml

First of all, you should know that using sed to modify a YAML file is risky. (I see you keep everything up to the last ": " so that the indentation is not disturbed, but that is still not safe.)
With sed, you can restrict a command to matching lines with /pattern/{action}; here you just fill in the pattern part with your "prefix" path. Double-check the prefix, though: if the lines have leading spaces (indentation), you may want \s*/web/com....
Just as in your s|pat|rep_or_cmd|e, you can use a different delimiter for the address pattern when it contains slashes; in an address, the opening delimiter must be escaped with a backslash. So \#^\s*/web/compile# will be easier to read.
Now to the solution. You have two different ways to do what you want:
sed '\#^\s*/web/compile#n; s|your s cmd....|' file
OR
sed '\#^\s*/web/compile#!s|your s cmd....|' file
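Note
In your question, the prefix is sometimes written web/compile and sometimes /web/compile (and once web/compiled); use the right one in the sed command and test.
With the n variant, the line that follows a prefixed line is loaded and handed straight to the s command without being checked against the prefix, so prefer the ! variant if prefixed lines can be adjacent.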
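To see the ! form in action without touching git, here is a minimal sketch. The sample lines and the REFRESHED placeholder are made up for illustration (they stand in for file.yml and the git log call), and [[:space:]] is used in place of \s so it does not depend on GNU sed.

```shell
# Hypothetical sample data standing in for app/config/file.yml
tmp=$(mktemp)
printf '%s\n' 'src/app.js: 100' '/web/compile/app.js: 200' 'src/lib.js: 300' > "$tmp"

# Rewrite the value on every line EXCEPT those starting with /web/compile.
# \#...# is the address with a custom delimiter; the ! after it negates the match.
result=$(sed '\#^[[:space:]]*/web/compile#!s|\(.*\): .*|\1: REFRESHED|' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Only the two src/ lines get the placeholder; the /web/compile line passes through untouched.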

Related

Replace spaces in all files in a directory with underscores

I have found some similar questions here but not this specific one and I do not want to break all my files. I have a list of files and I simply need to replace all spaces with underscores. I know this is a sed command but I am not sure how to generically apply this to every file.
I do not want to rename the files, just modify them in place.
Edit: To clarify, just in case it's not clear, I only want to replace whitespace within the files, file names should not be changed.
find . -type f -exec sed -i -e 's/ /_/g' {} \;
find grabs all items in the directory (and subdirectories) that are files, and runs the sed command once per file, substituting the filename for {}; the \; marks the end of the command. The sed command it appears you already understand.
If you only want to search the current directory and ignore subdirectories, you can use
find . -maxdepth 1 -type f -exec sed -i -e 's/ /_/g' {} \;
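If you want to try this safely first, the following sketch runs the same idea in a throwaway directory. The file names are made up, and two details differ from the commands above by choice: -i.bak keeps a backup (this spelling is accepted by both GNU and BSD sed), and {} + batches the files so sed starts only after find has finished listing them.

```shell
# Throwaway directory with hypothetical sample files
dir=$(mktemp -d)
mkdir "$dir/sub"
printf 'a b c\n' > "$dir/top file.txt"
printf 'd e f\n' > "$dir/sub/nested.txt"

# -maxdepth 1 limits find to the top level, so sub/nested.txt is left alone.
find "$dir" -maxdepth 1 -type f -exec sed -i.bak -e 's/ /_/g' {} +

# The file keeps its name (space included); only its content changes.
top=$(cat "$dir/top file.txt")
nested=$(cat "$dir/sub/nested.txt")
printf '%s\n%s\n' "$top" "$nested"
rm -rf "$dir"
```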
This is a 2 part problem. Step 1 is providing the proper sed command, step 2 is providing the proper command to apply it to all files in a given directory.
Substitution in sed commands follows the form s/ItemToReplace/ItemToReplaceWith/flags, where s stands for substitute and the trailing flags control how the operation takes place. According to this super user post, in order to match whitespace characters you must use either just a space in your sed command, \s, or [[:space:]]. The difference is that the last 2 match any whitespace (tabs as well as spaces), with [[:space:]] being the POSIX-compliant form. Lastly, you need the global flag, which is simply g at the end, so that every space on a line is replaced rather than only the first.
Therefore, your sed command is
sed 's/ /_/g' FileNameHere
However, this only accomplishes half of your task. You also need to be able to do this for every file within a directory. Unfortunately, wildcards won't save us in the sed command, as * > * would be ambiguous. Your only option is to iterate through the files and overwrite each one individually. A shell for loop over a wildcard expands to all entries in the directory. However, redirecting sed's output to the same file it is reading loses the content, because the shell truncates the file before sed gets to read it. To correct this, run sed with the -i flag so it edits the files in place. With BSD/macOS sed, the argument after -i is the suffix used to create a backup of the old file; if an empty one is passed (-i ''), no backup is created. GNU sed expects the suffix attached to the flag instead (-i.bak), and a bare -i creates no backup.
Therefore the final command should simply be
for i in *;do sed -i '' 's/ /_/g' "$i";done
This goes through everything in your current directory and writes the sed output back to each file (directories are matched by * too, but sed does not modify them).
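A slightly more defensive version of the loop, as a sketch only: it runs in a temporary directory with made-up names, quotes the loop variable so names containing spaces survive, skips anything that is not a regular file, and uses -i.bak, which both GNU and BSD sed accept.

```shell
dir=$(mktemp -d)
printf 'one two\n' > "$dir/my notes.txt"
mkdir "$dir/a folder"

# The glob is expanded once, before the loop runs, so the .bak files
# created along the way are not picked up again.
for f in "$dir"/*; do
    [ -f "$f" ] || continue        # skip directories and other non-files
    sed -i.bak 's/ /_/g' "$f"      # quoting "$f" keeps "my notes.txt" as one argument
done

edited=$(cat "$dir/my notes.txt")
backup=$(cat "$dir/my notes.txt.bak")
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"
rm -rf "$dir"
```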
Well... since I was trying to get something running I found a method that worked for me:
for file in *; do sed -i 's/ /_/g' "$file"; done

SED replacing with 'possible' newline

I have a sed command that is working fine, except when it comes across a newline right in the file somewhere. Here is my command:
sed -i 's,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g'
Now, it works perfectly, but I just ran across this file that has an a tag like so:
<a href="link">Click
here now</a>
Of course it didn't find this one. So I need to modify it somehow to allow for line breaks in the search. But I have no clue how to make it allow for that unless I go over the entire file first and remove all \n beforehand. The problem there is that I lose all formatting in the file.
You can do this by inserting a loop into your sed script:
sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}' yourfile
As-is, that will leave an embedded newline in the output, and it wasn't clear if you wanted it that way or not. If not, just substitute out the newline:
sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s/\n//g;s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}' yourfile
And maybe clean up extra spaces:
sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s/\n//g;s/\s\{2,\}/ /g;s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}' yourfile
Explanation: The /<a href/{...} lets us ignore lines we don't care about. Once we find one we like, we check to see if it has the end marker. If not (/<\/a>/!), we append a newline and the next line to the pattern space (N) and branch (b) back to :next to see if we've found it yet. Once we find it, we continue on with the substitutions.
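Here is the same loop spelled out one command per -e so each step is visible, run against a small made-up snippet. The final substitution is an assumption about the goal (link text, then " - ", then the URL), and the newline is replaced by a space rather than removed so that "Click" and "here" stay separate words.

```shell
# Hypothetical input: one link broken across two lines
html=$(printf '%s\n' '<p>intro</p>' '<a href="link">Click' 'here now</a>' '<p>outro</p>')

joined=$(printf '%s\n' "$html" | sed \
    -e '/<a href/{' \
    -e ':next' \
    -e '/<\/a>/!{' \
    -e 'N' \
    -e 'b next' \
    -e '}' \
    -e 's/\n/ /g' \
    -e 's,<a href="\([^"]*\)">\(.*\)</a>,\2 - \1,' \
    -e '}')

printf '%s\n' "$joined"
```

Lines without <a href never enter the block, so the rest of the file keeps its formatting.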
Here is a quick and dirty solution that assumes there will be no more than one newline in a link:
sed -i '' -e '/<a href=.*>/{/<\/a>/!{N;s|\n||;};}' -e 's,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g'
The first command (/<a href=.*>/{/<\/a>/!{N;s|\n||;};}) checks for the presence of <a href=...> without </a>, in which case it reads the next line into the pattern space and removes the newline. The second is yours.

How to remove multiple lines in multiple files on Linux using bash

I am trying to remove 2 lines from all my Javascript files on my Linux shared hosting. I wanted to do this without writing a script as I know this should be possible with sed. My current attempt looks like this:
find . -name "*.js" | xargs sed -i ";var
O0l='=sTKpUG"
The second line is actually longer than this but is malicious code so I have not included it here. As you guessed my server has been hacked so I need to clean up all these JavaScript files.
I forgot to mention that the output I am getting at the moment is:
sed: -e expression #1, char 4: expected newer version of sed
The 2 lines are just as follows consecutively:
;var
O0l='=sTKpUG
except that the second line is longer, but the rest of the second line should not influence the command.
He meant removing two adjacent lines.
You can do something like this; remember to back up your files first.
find . -name "*.js" | xargs sed -i -e "/^;var/N;/^;var\nO0l='=sTKpUG/d"
Since sed processes its input line by line, the pattern space holds a single line without its trailing newline, so a pattern containing \n can never match on its own. We therefore use the N command to append the next line to the pattern space, separated by a newline character.
/^;var/N;
Then we do our pattern searching and deleting.
/^;var\nO0l='=sTKpUG/d
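A dry run on sample text shows what gets removed before you let it loose on real files. This is only a sketch: the surrounding JavaScript lines are invented, and the two commands are wrapped in a block so the delete test is only tried after a ;var line has pulled in its neighbour.

```shell
# Hypothetical file content: two clean lines around the injected pair
js=$(printf '%s\n' 'var ok = 1;' ';var' "O0l='=sTKpUGxyz" 'console.log(ok);')

# On a line that is exactly ";var", append the next line (N), then delete
# the pair only if that next line starts with the second marker.
cleaned=$(printf '%s\n' "$js" | sed -e '/^;var$/{' -e 'N' -e "/\nO0l='=sTKpUG/d" -e '}')

printf '%s\n' "$cleaned"
```

A ;var line followed by anything else is printed unchanged together with its successor.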
It really isn't clear yet what the two lines look like, and it isn't clear if they are adjacent to each other in the JavaScript, so we'll assume not. However, the answer is likely to be:
find . -name "*.js" |
xargs sed -i -e '/^distinctive-pattern1$/d' -e '/^alternative-pattern-2a$/d'
There are other ways of writing the sed script using a single command string; I prefer to use separate arguments for separate operations (it makes the script clearer).
Clearly, if you need to keep some of the information on one of the lines, you can use a search pattern adjusted as appropriate, and then do a substitute s/short-pattern// instead of d to remove the short section that must be removed. Similarly with the long line if that's relevant.

Replace Windows newlines in a lot of files using sed - but it doesn't

I have a lot of files that end in the classical ^M, an artifact from my Windows times. As this is all source code, git actually thinks those files changed, so I want to remove those nasty lines once and for all.
Here is what I created:
sed -i 's/^M//g' file
But that does not work. Of course I did not type a literal ^M but rather ^V^M (ctrl V, ctrl M). In vim it works (:%s/^M//g) and if I modify it like this:
sed -i 's/^M/a/g' file
It also works, i.e. it ends every line with an 'a'. It also works to do this:
sed -i 's/random_string//g' file
Where random_string exists in the file. So I can replace ^M by any character and I can remove lines but I cannot remove ^M. Why?
Note: It is important that it is just removed, no replacing by another invisible char or something. I would also like to avoid double execution and adding an arbitrary string and removing it afterwards. I want to understand why this fails (but it does not report an error).
That character (a carriage return) is matched by \r in GNU sed. Use:
sed -e "s/\r//g" input-file
For my case, I had to do
sed -e "s/\r/\n/g" filename.csv
After that, wc -l filename showed the correct output instead of 0 lines.
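If your sed does not understand \r, one workaround (a sketch, not the only way) is to let printf produce the carriage return and splice it into the expression. The file content below is made up; the CR counts are there only to verify the result.

```shell
tmp=$(mktemp)
printf 'first\r\nsecond\r\n' > "$tmp"     # hypothetical file with Windows line endings

cr=$(printf '\r')                         # a literal carriage return in a variable
sed "s/${cr}\$//" "$tmp" > "$tmp.unix"    # strip one CR at the end of each line

before=$(tr -cd '\r' < "$tmp" | wc -c)      # number of CR bytes in the original
after=$(tr -cd '\r' < "$tmp.unix" | wc -c)  # number of CR bytes after the fix
converted=$(cat "$tmp.unix")
printf 'CRs before: %d, after: %d\n' "$before" "$after"
rm -f "$tmp" "$tmp.unix"
```

Anchoring on the end of the line leaves any carriage return in the middle of a line alone; drop the \$ and add the g flag to remove every one.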

Replacing a line in a csv file?

I have a set of 10 CSV files, which normally have entries of this kind:
a,b,c,d
d,e,f,g
Now, due to some error, entries in these files have become like this:
a,b,c,d
d,e,f,g
,,,
h,i,j,k
Now I want to remove the line with only commas in all the files. These files are on a Linux filesystem.
Is there a command you would recommend that can replace the erroneous lines in all the files?
It depends on what you mean by replace. If you mean 'remove', then a trivial variant on @wnoise's solution is:
grep -v '^,,,$' old-file.csv > new-file.csv
Note that this deletes just those lines with exactly three commas. If you want to delete malformed lines with any number of commas (including zero) and no other characters on the line, then:
grep -v '^,*$' ...
There are endless other variations on the regex that would deal with other scenarios. Dealing with full CSV data with commas inside quotes starts to need something other than a regex machine. It can be done, within broad limits, especially in more complex regex systems such as PCRE or Perl. But it requires more work.
Check out Mastering Regular Expressions.
sed 's/,,,/replacement/' < old-file.csv > new-file.csv
optionally followed by
mv new-file.csv old-file.csv
Replace or remove, your post is not clear... For replacement see wnoise's answer. For removing, you could use
awk '$0 !~ /,,,/ {print}' <old-file.csv > new-file.csv
What about keeping only the lines that match the desired format instead of handling one exception?
If the provided input is what you really want to match:
grep -E '[a-z],[a-z],[a-z],[a-z]' < oldfile.csv > newfile.csv
If the input is different, provide it, the regular expression should not be too hard to write.
Do you want to replace them with something, or delete them entirely? Either way, it can be done with sed. To delete:
sed -i -e '/^,\+$/ D' yourfile1.csv yourfile2.csv ...
To replace: well, see wnoise's answer, or if you don't want to create new files with the output,
sed -i -e '/^,\+$/ s//replacement/' yourfile1.csv yourfile2.csv ...
or
sed -i -e '/^,\+$/ c\
replacement' yourfile1.csv yourfile2.csv ...
(that should be entered exactly as is, including the line break). Of course, you can also do this with awk or perl or, if you're only deleting lines, even grep:
egrep -v '^,+$' < oldfile.csv > newfile.csv
I tested these to make sure they work, but I'd advise you to do the same before using them (just in case). You can omit the -i option from sed, in which case it'll print out the results (rather than writing them back to the file), or omit the output redirection >newfile.csv from grep.
EDIT: It was pointed out in a comment that some features of these sed commands only work on GNU sed. As far as I can tell, these are the -i option (which can be replaced with shell redirection, sed ... <infile >outfile ) and the \+ modifier (which can be replaced with \{1,\} ).
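For reference, here is a sketch of the delete written with only the portable pieces mentioned above (\{1,\} instead of \+, and output captured instead of -i), run on the sample data from the question via a temporary file.

```shell
tmp=$(mktemp)
printf '%s\n' 'a,b,c,d' 'd,e,f,g' ',,,' 'h,i,j,k' > "$tmp"

# Delete lines that consist of one or more commas and nothing else.
kept=$(sed '/^,\{1,\}$/d' "$tmp")

# grep -v with the same pattern yields the same lines.
kept_grep=$(grep -v '^,\{1,\}$' "$tmp")

printf '%s\n' "$kept"
rm -f "$tmp"
```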
Most simply:
$ grep -v ',,,' oldfile > newfile
$ mv newfile oldfile
Yes, awk or grep are very good options if you are working on a Linux platform. On other platforms you can use Perl regular expressions, together with the join and split functions.
