Bash script to remove 'x' amount of characters the end of multiple filenames in a directory? - linux

I have a list of file names in a directory (/path/to/local). I would like to remove a certain number of characters from all of those filenames.
Example filenames:
iso1111_plane001_00321.moc1
iso1111_plane002_00321.moc1
iso2222_plane001_00123.moc1
In every filename I wish to remove the last 5 characters before the file extension.
For example:
iso1111_plane001_.moc1
iso1111_plane002_.moc1
iso2222_plane001_.moc1
I believe this can be done using sed, but I cannot determine the exact coding. Something like...
for filename in /path/to/local/*.moc1; do
mv $filname $(echo $filename | sed -e 's/.....^//');
done
...but that does not work. Sorry if I butchered the sed options, I do not have much experience with it.

mv $filname $(echo $filename | sed -e 's/.....\.moc1$//');
or
echo ${filename%%?????.moc1}.moc1
%% is a bash internal operator...

This sed command will work for all the examples you gave.
sed -e 's/\(.*\)_.*\.moc1/\1_.moc1/'
However, if you just want to specifically "remove 5 characters before the last extension in a filename" this command is what you want:
sed -e 's/\(.*\)[0-9a-zA-Z]\{5\}\.\([^.]*\)/\1.\2/'
You can implement this in your script like so:
for filename in /path/to/local/*.moc1; do
mv $filename "$(echo $filename | sed -e 's/\(.*\)[0-9a-zA-Z]\{5\}\.\([^.]*\)/\1.\2/')";
done
First Command Explanation
The first sed command works by grabbing all characters until the first underscore: \(.*\)_
Then it discards all characters until it finds .moc1: .*\.moc1
Then it replaces the text that it found with everything it grabbed at first inside the parenthesis: /\1
And finally adds the .moc1 extension back on the end and ends the regex: .moc1/
Second Command Explanation
The second sed command works by grabbing all characters at first: \(.*\)
And then it is forced to stop grabbing characters so it can discard five characters, or more specifically, five characters that lie in the ranges 0-9, a-z, and A-Z: [0-9a-zA-Z]\{5\}
Then comes the dot '.' character to mark the last extension : \.
And then it looks for all non-dot characters. This ensures that we are grabbing the last extension: \([^.]*\)
Finally, it replaces all that text with the first and second capture groups, separated by the . character, and ends the regex: /\1.\2/

This might work for you (GNU sed):
sed -r 's/(.*).{5}\./\1./' file

Related

linux sed remove block of code with linebreak

Trying to remove this entire block of code from a script:
https://pastebin.com/gBnFBQSR
I am able to do so up until the linebreak and ending }
sed '/var gfjfgjk/,/appendChild(s);\n}/d'
how can I have it include the linebreak and } at the end
Thanks to #Wiktor Stribiżew link:
for i in $(grep -rl gfjfgjk) ; do if grep -m1 gfjfgjk $i >/dev/null 2>&1 ; then echo $i ; sed -i -e '1,6d;7s/^}//' $i ; fi; done
The only different really is, I wanted to make sure gfjfgjk matched the first line of the file in case it was injected somewhere else in the script and then sed removed the first 7 lines of legit code.
With sed, would you please try the following:
sed '
:l # define a label "l"
N # read next line and append to the pattern space
$!b l # goto "l" unless eof
s/var gfjfgjk.*appendChild(s);\n}//
# remove the specified block including newlines
' file
It first slurps all lines into the pattern space so we can process
multiple lines (including the newline characters) at once.
The possible problem is if the file contains the pattern appendChild(s);\n}
in multiple lines, sed will fall in the longest match due to the
greedy nature of regex.
As an alternative, if perl is your option, you can also say:
perl -0777 -pe 's/var gfjfgjk.*?appendChild\(s\);\n}//s' file
The -0777 option tells perl to read the all lines at once.
The regex .*? enables the shortest match.
The s option at the end makes the dot . match newlines. Otherwise
the dot in perl regex does not match newlines.

Why sed script is not deleting lines?

The following is my sed script:
#!/bin/bash
sed -r 's/^\s+?\/\/.*$/d/;
s/LOG.debug/System.out.println/g' "$1"
What the first command should do is delete all lines beginning with // preceded by any number of spaces or tabs. The only problem is instead of deleting those lines, it replaces them with a literal 'd'.
If you need to remove the matching lines, you should not use the s command.
Use
sed '/^[[:blank:]]*\/\//d; s/LOG\.debug/System.out.println/g' "$1"
# ^^^^^^^^^^^^^^^^^^^^
Also, note that to match a literal dot, you need to escape it, hence, LOG\.debug.
See an online sed demo.

Remove root directory from filename in linux

I want to split this line
/home/edwprod/abortive_visit/bin/abortive_proc_call.ksh
to
/edwprod/abortive_visit/bin/abortive_proc_call.ksh
Can I use sed or awk command for this?
you don't need awk or sed , just try this
echo -n "/"; echo "/home/edwprod/abortive_visit/bin/abortive_proc_call.ksh" |cut -f3-6 -d/
echo '/home/edwprod/abortive_visit/bin/abortive_proc_call.ksh' | sed 's#^/[^/]\+##'
Explanatory words: using sed's replace function, we redefine the separator, which is commonly /, to #, saving us the escaping of slashes within the string. We anchor the regex at the beginning of line ^, and replace the first slash, followed by any non-slash, with nothing, thus removing the first element of the path (not the root, btw).

Add spaces after punctuation marks with sed

I need to capitalize a txt file but I found some problems when I try to add a space after any punctuation mark with sed. For instance: "Hello,World" -> to "Hello, World"
I tried the following:
#!/bin/bash
if [ $# != 1 ]; then
echo "No parameter"
exit
fi
cp $1 $1.bak
ARCH1=/tmp/`basename $1`.$$
sed 's/[A-Z]*/\L&/g' $1 > $ARCH1
sed -i 's/^./\u&/' $ARCH1
sed 's/ */\ /g' $ARCH1 #Here I replace >= 2 spaces for 1
sed 's/, */, /g' $ARCH1
#These 2 lines don't work well
sed 's/. */. /g' $ARCH1
sed 's/; */; /g' $ARCH1
mv $ARCH1 $1
The script doesn't crash, but the output is not the one that I expect.
I believe the reason your script doesn't work is that you forgot to pass -i to sed in several calls, and also that you don't escape . in the regex, so that . matches any character.
I also believe that a simpler way to do what you're trying to do is
sed -i.bak 's/[A-Z]*/\L&/g; s/\([.,;]\) */\1 /' "$1"
-i.bak edits the file in-place and creates a backup with the .bak extension, and the script is simply
s/[A-Z]*/\L&/g # lower-case everything (I got that from your code)
s/\([.,;]\) */\1 / # replace spaces after period, comma or semicolon
Here
[.,;] is a character set matching period, comma or semicolon,
\(stuff\) captures stuff in a group for later use, and
\1 is a back reference referring to the first such capture.
Note that this is a very simple approach. If your text, for example, contains ellipses (...), it'll waltz right over that and make ... into . . ., and similar caveats apply for ?! and such.
Using GNU sed:
$ echo "foo;BAR,BaZ.qux" | sed -r 's/[[:punct:]]+/& /g; s/[[:alnum:]]+/\L\u&/g'
Foo; Bar, Baz. Qux
\L lower cases the whole word, then \u upper cases the first character.
See your regex(7) man page for regular expression documentation.

Replace whole line containing a string using Sed

I have a text file which has a particular line something like
sometext sometext sometext TEXT_TO_BE_REPLACED sometext sometext sometext
I need to replace the whole line above with
This line is removed by the admin.
The search keyword is TEXT_TO_BE_REPLACED
I need to write a shell script for this. How can I achieve this using sed?
You can use the change command to replace the entire line, and the -i flag to make the changes in-place. For example, using GNU sed:
sed -i '/TEXT_TO_BE_REPLACED/c\This line is removed by the admin.' /tmp/foo
You need to use wildcards (.*) before and after to replace the whole line:
sed 's/.*TEXT_TO_BE_REPLACED.*/This line is removed by the admin./'
The Answer above:
sed -i '/TEXT_TO_BE_REPLACED/c\This line is removed by the admin.' /tmp/foo
Works fine if the replacement string/line is not a variable.
The issue is that on Redhat 5 the \ after the c escapes the $. A double \\ did not work either (at least on Redhat 5).
Through hit and trial, I discovered that the \ after the c is redundant if your replacement string/line is only a single line. So I did not use \ after the c, used a variable as a single replacement line and it was joy.
The code would look something like:
sed -i "/TEXT_TO_BE_REPLACED/c $REPLACEMENT_TEXT_STRING" /tmp/foo
Note the use of double quotes instead of single quotes.
The accepted answer did not work for me for several reasons:
my version of sed does not like -i with a zero length extension
the syntax of the c\ command is weird and I couldn't get it to work
I didn't realize some of my issues are coming from unescaped slashes
So here is the solution I came up with which I think should work for most cases:
function escape_slashes {
sed 's/\//\\\//g'
}
function change_line {
local OLD_LINE_PATTERN=$1; shift
local NEW_LINE=$1; shift
local FILE=$1
local NEW=$(echo "${NEW_LINE}" | escape_slashes)
# FIX: No space after the option i.
sed -i.bak '/'"${OLD_LINE_PATTERN}"'/s/.*/'"${NEW}"'/' "${FILE}"
mv "${FILE}.bak" /tmp/
}
So the sample usage to fix the problem posed:
change_line "TEXT_TO_BE_REPLACED" "This line is removed by the admin." yourFile
All of the answers provided so far assume that you know something about the text to be replaced which makes sense, since that's what the OP asked. I'm providing an answer that assumes you know nothing about the text to be replaced and that there may be a separate line in the file with the same or similar content that you do not want to be replaced. Furthermore, I'm assuming you know the line number of the line to be replaced.
The following examples demonstrate the removing or changing of text by specific line numbers:
# replace line 17 with some replacement text and make changes in file (-i switch)
# the "-i" switch indicates that we want to change the file. Leave it out if you'd
# just like to see the potential changes output to the terminal window.
# "17s" indicates that we're searching line 17
# ".*" indicates that we want to change the text of the entire line
# "REPLACEMENT-TEXT" is the new text to put on that line
# "PATH-TO-FILE" tells us what file to operate on
sed -i '17s/.*/REPLACEMENT-TEXT/' PATH-TO-FILE
# replace specific text on line 3
sed -i '3s/TEXT-TO-REPLACE/REPLACEMENT-TEXT/'
for manipulation of config files
i came up with this solution inspired by skensell answer
configLine [searchPattern] [replaceLine] [filePath]
it will:
create the file if not exists
replace the whole line (all lines) where searchPattern matched
add replaceLine on the end of the file if pattern was not found
Function:
function configLine {
local OLD_LINE_PATTERN=$1; shift
local NEW_LINE=$1; shift
local FILE=$1
local NEW=$(echo "${NEW_LINE}" | sed 's/\//\\\//g')
touch "${FILE}"
sed -i '/'"${OLD_LINE_PATTERN}"'/{s/.*/'"${NEW}"'/;h};${x;/./{x;q100};x}' "${FILE}"
if [[ $? -ne 100 ]] && [[ ${NEW_LINE} != '' ]]
then
echo "${NEW_LINE}" >> "${FILE}"
fi
}
the crazy exit status magic comes from https://stackoverflow.com/a/12145797/1262663
In my makefile I use this:
#sed -i '/.*Revision:.*/c\'"`svn info -R main.cpp | awk '/^Rev/'`"'' README.md
PS: DO NOT forget that the -i changes actually the text in the file... so if the pattern you defined as "Revision" will change, you will also change the pattern to replace.
Example output:
Abc-Project written by John Doe
Revision: 1190
So if you set the pattern "Revision: 1190" it's obviously not the same as you defined them as "Revision:" only...
bash-4.1$ new_db_host="DB_HOSTNAME=good replaced with 122.334.567.90"
bash-4.1$
bash-4.1$ sed -i "/DB_HOST/c $new_db_host" test4sed
vim test4sed
'
'
'
DB_HOSTNAME=good replaced with 122.334.567.90
'
it works fine
To do this without relying on any GNUisms such as -i without a parameter or c without a linebreak:
sed '/TEXT_TO_BE_REPLACED/c\
This line is removed by the admin.
' infile > tmpfile && mv tmpfile infile
In this (POSIX compliant) form of the command
c\
text
text can consist of one or multiple lines, and linebreaks that should become part of the replacement have to be escaped:
c\
line1\
line2
s/x/y/
where s/x/y/ is a new sed command after the pattern space has been replaced by the two lines
line1
line2
cat find_replace | while read pattern replacement ; do
sed -i "/${pattern}/c ${replacement}" file
done
find_replace file contains 2 columns, c1 with pattern to match, c2 with replacement, the sed loop replaces each line conatining one of the pattern of variable 1
To replace whole line containing a specified string with the content of that line
Text file:
Row: 0 last_time_contacted=0, display_name=Mozart, _id=100, phonebook_bucket_alt=2
Row: 1 last_time_contacted=0, display_name=Bach, _id=101, phonebook_bucket_alt=2
Single string:
$ sed 's/.* display_name=\([[:alpha:]]\+\).*/\1/'
output:
100
101
Multiple strings delimited by white-space:
$ sed 's/.* display_name=\([[:alpha:]]\+\).* _id=\([[:digit:]]\+\).*/\1 \2/'
output:
Mozart 100
Bach 101
Adjust regex to meet your needs
[:alpha] and [:digit:]
are Character Classes and Bracket Expressions
This worked for me:
sed -i <extension> 's/.*<Line to be replaced>.*/<New line to be added>/'
An example is:
sed -i .bak -e '7s/.*version.*/ version = "4.33.0"/'
-i: The extension for the backup file after the replacement. In this case, it is .bak.
-e: The sed script. In this case, it is '7s/.*version.*/ version = "4.33.0"/'. If you want to use a sed file use the -f flag
s: The line number in the file to be replaced. In this case, it is 7s which means line 7.
Note:
If you want to do a recursive find and replace with sed then you can grep to the beginning of the command:
grep -rl --exclude-dir=<directory-to-exclude> --include=\*<Files to include> "<Line to be replaced>" ./ | sed -i <extension> 's/.*<Line to be replaced>.*/<New line to be added>/'
The question asks for solutions using sed, but if that's not a hard requirement then there is another option which might be a wiser choice.
The accepted answer suggests sed -i and describes it as replacing the file in-place, but -i doesn't really do that and instead does the equivalent of sed pattern file > tmp; mv tmp file, preserving ownership and modes. This is not ideal in many circumstances. In general I do not recommend running sed -i non-interactively as part of an automatic process--it's like setting a bomb with a fuse of an unknown length. Sooner or later it will blow up on someone.
To actually edit a file "in place" and replace a line matching a pattern with some other content you would be well served to use an actual text editor. This is how it's done with ed, the standard text editor.
printf '%s\n' '/TEXT_TO_BE_REPLACED/' d i 'This line is removed by the admin' . w q | \
ed -s /tmp/foo > /dev/null
Note that this only replaces the first matching line, which is what the question implied was wanted. This is a material difference from most of the other answers.
That disadvantage aside, there are some advantages to using ed over sed:
You can replace the match with one or multiple lines without any extra effort.
The replacement text can be arbitrarily complex without needing any escaping to protect it.
Most importantly, the original file is opened, modified, and saved. A copy is not made.
How it works
How it works:
printf will use its first argument as a format string and print each of its other arguments using that format, effectively meaning that each argument to printf becomes a line of output, which is all sent to ed on stdin.
The first line is a regex pattern match which causes ed to move its notion of "the current line" forward to the first line that matches (if there is no match the current line is set to the last line of the file).
The next is the d command which instructs ed to delete the entire current line.
After that is the i command which puts ed into insert mode;
after that all subsequent lines entered are written to the current line (or additional lines if there are any embedded newlines). This means you can expand a variable (e.g. "$foo") containing multiple lines here and it will insert all of them.
Insert mode ends when ed sees a line consisting of .
The w command writes the content of the file to disk, and
the q command quits.
The ed command is given the -s switch, putting it into silent mode so it doesn't echo any information as it runs,
the file to be edited is given as an argument to ed,
and, finally, stdout is thrown away to prevent the line matching the regex from being printed.
Some Unix-like systems may (inappropriately) ship without an ed installed, but may still ship with an ex; if so you can simply use it instead. If have vim but no ex or ed you can use vim -e instead. If you have only standard vi but no ex or ed, complain to your sysadmin.
It is as similar to above one..
sed 's/[A-Za-z0-9]*TEXT_TO_BE_REPLACED.[A-Za-z0-9]*/This line is removed by the admin./'
Below command is working for me. Which is working with variables
sed -i "/\<$E\>/c $D" "$B"
I very often use regex to extract data from files I just used that to replace the literal quote \" with // nothing :-)
cat file.csv | egrep '^\"([0-9]{1,3}\.[0-9]{1,3}\.)' | sed s/\"//g | cut -d, -f1 > list.txt

Resources