Remove a repeated character from every line in a file in Unix

I have a file in Unix that has a '\' character at the end of every line. I would like to remove it from every line. There are over 1000 lines.
I have seen some examples, but they didn't quite work. I am new to Unix and am hoping to get an answer here.
Thanks,
Ab

Try this:
sed -i.~ 's#\\$##g' file.txt
EXPLANATIONS
-i performs the substitution in the file itself
.~ makes a backup file with this suffix
s### is the skeleton syntax for substitutions (I have arbitrarily chosen # as the delimiter)
\\ matches a literal backslash
$ means end of line
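A quick way to check the substitution before touching a real file is to run it on a throwaway copy. This is a minimal sketch: the scratch file from mktemp and the sample lines are made up for illustration, and -i is left out so nothing is modified.

```shell
# Build a hypothetical sample file whose first two lines end in a backslash.
sample=$(mktemp)
printf '%s\n' 'first line\' 'second line\' 'no backslash here' > "$sample"

# Strip one trailing backslash per line; without -i the file is left alone.
stripped=$(sed 's#\\$##' "$sample")
printf '%s\n' "$stripped"
rm -f "$sample"
```

Once the output looks right, the same script can be run with -i.~ on the real file.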

This eliminates the last character of every line:
sed 's/.$//' original_file > new_file

A Perl one-liner.
perl -i~ -pe 's/\\$//' file
This will create a backup of the original with a ~ extension and remove the \ at the end of each line.

Related

How to add character at the end of specific line in UNIX/LINUX?

Here is my input file. I want to add the character ":" to the end of lines that have ">" at the beginning of the line. I tried sed -i 's|$|:|' input.txt, but ":" was added to the end of every line. Calling out specific line numbers is not practical either, because the lines starting with ">" fall on different line numbers in each of my input files, and I want to run this in a loop over multiple files.
>Pas_pyrG_2
AAAGTCACAATGGTTAAAATGGATCCTTATATTAATGTCGATCCAGGGACAATGAGCCCA
TTCCAGCATGGTGAAGTTTTTGTTACCGAAGATGGTGCAGAAACAGATCTGGATCTGGGT
>Pas_rpoB_4
CAAACTCACTATGGTCGTGTTTGTCCAATTGAAACTCCTGAAGGTCCAAACATTGGTTTG
ATCAACTCGCTTTCTGTATACGCAAAAGCGAATGACTTCGGTTTCTTGGAAACTCCATAC
CGCAAAGTTGTAGATGGTCGTGTAACTGATGATGTTGAATATTTATCTGCAATTGAAGAA
>Pas_cpn60_2
ATGAACCCAATGGATTTAAAACGCGGTATCGACATTGCAGTAAAAACTGTAGTTGAAAAT
ATCCGTTCTATTGCTAAACCAGCTGATGATTTCAAAGCAATTGAACAAGTAGGTTCAATC
TCTGCTAACTCTGATACTACTGTTGGTAAACTTATTGCTCAAGCAATGGAAAAAGTAGGT
AAAGAAGGCGTAATCACTGTAGAAGAAGGCTCAGGCTTCGAAGACGCATTAGACGTTGTA
Here is the expected output file:
>Pas_pyrG_2:
AAAGTCACAATGGTTAAAATGGATCCTTATATTAATGTCGATCCAGGGACAATGAGCCCA
TTCCAGCATGGTGAAGTTTTTGTTACCGAAGATGGTGCAGAAACAGATCTGGATCTGGGT
>Pas_rpoB_4:
CAAACTCACTATGGTCGTGTTTGTCCAATTGAAACTCCTGAAGGTCCAAACATTGGTTTG
ATCAACTCGCTTTCTGTATACGCAAAAGCGAATGACTTCGGTTTCTTGGAAACTCCATAC
CGCAAAGTTGTAGATGGTCGTGTAACTGATGATGTTGAATATTTATCTGCAATTGAAGAA
>Pas_cpn60_2:
ATGAACCCAATGGATTTAAAACGCGGTATCGACATTGCAGTAAAAACTGTAGTTGAAAAT
ATCCGTTCTATTGCTAAACCAGCTGATGATTTCAAAGCAATTGAACAAGTAGGTTCAATC
TCTGCTAACTCTGATACTACTGTTGGTAAACTTATTGCTCAAGCAATGGAAAAAGTAGGT
AAAGAAGGCGTAATCACTGTAGAAGAAGGCTCAGGCTTCGAAGACGCATTAGACGTTGTA
Does sed have more options for this, or can other commands solve this problem?
sed -i '/^>/ s/$/:/' input.txt
This searches the input for lines that match ^> (the regex for "starts with the > character"). On those lines it substitutes : at end-of-line (you got this part right).
/ is the conventional separator character in sed. The s command accepts other delimiters, so a quoted s|$|:| works as well; the quotes matter because | and $ are meaningful to the shell. It's simplest to stick with / unless the pattern itself contains slashes, in which case things get unwieldy.
Be careful with sed -i. Make a backup, and use diff to compare the files so you know what changed.
On OSX, -i requires an argument.
Using ed to edit the file:
printf "%s\n" 'g/^>/s/$/:/' w | ed -s input.txt
For every line starting with >, add a colon to the end, and then write the changed file back to disk.
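Since the question mentions looping over multiple files, here is a minimal sketch of that loop. The directory, file names and shortened sequences are made up for illustration; -i.bak is used because the attached-suffix form should be accepted by both GNU and BSD sed and leaves a backup of each file.

```shell
# Hypothetical working directory with two small FASTA-like files.
workdir=$(mktemp -d)
printf '%s\n' '>Pas_pyrG_2' 'AAAGTC' '>Pas_rpoB_4' 'CAAACT' > "$workdir/a.fasta"
printf '%s\n' 'ATGAAC' '>Pas_cpn60_2' 'ATCCGT' > "$workdir/b.fasta"

for f in "$workdir"/*.fasta; do
    # Append ":" only on lines starting with ">"; keep a .bak copy of each file.
    sed -i.bak '/^>/ s/$/:/' "$f"
done

cat "$workdir/a.fasta" "$workdir/b.fasta"
```

Comparing each file with its .bak copy using diff shows exactly which lines changed.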

Inserting string in file in nth line after pattern using sed

I want to insert a line at the nth line after a pattern using sed.
I tried to modify this command, but it inserts only at the first line after the pattern.
sed -i '/myPattern/a \ LineIWantToinser ' myFile
What command should I use to insert, for example, at the third line after the pattern?
The easiest way to do it with GNU sed (maybe a more direct solution exists) is:
sed -n '/pattern/=' file
to see the line number where the pattern is (grep -n can also be used here),
then, if line number + number of lines is, for example, 123:
sed '123aSOME INSERTED TEXT AFTER THAT LINE' file
where a is the append command (the text goes after that line; with i it is inserted before that line).
ps. I'm eager to see if #neronlevelu (or other sed Lover) will find some better sed solution.
Edit: I've found it. It seems that a (for append) or i (for insert) must be at the start of a line when used inside { } with ;, like:
sed '/pattern/{N;N;N;
a SOME TEXT FOR INSERTING
}' file
sed '/pattern/{N;N;N;a\
Line to add after 3 lines, with the pattern as the starting counter
}' YourFile
The number of N commands sets how many lines lie between the pattern and the inserted line.
There is no check for end of file or for the pattern recurring within those 3 lines (not specified by the OP).
A version with bash and ed:
ed -s myFile <<<$'/myPattern/+3a\n LineIWantToinser \n.\nwq'
ed enables us to use the line addressing /myPattern/+3.
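To see what the N-based approach does, it can be run on a small made-up file. This sketch assumes the pattern line is followed by at least three more lines. It puts each command on its own line and uses the a\ form, which should behave the same in GNU and BSD sed.

```shell
# Hypothetical sample file; "myPattern" is on line 2.
f=$(mktemp)
printf '%s\n' 'line 1' 'myPattern' 'line 3' 'line 4' 'line 5' 'line 6' > "$f"

# On the matching line, pull in the next three lines, then queue the new text
# so that it is printed right after them.
inserted=$(sed '/myPattern/{
N
N
N
a\
INSERTED
}' "$f")
printf '%s\n' "$inserted"
rm -f "$f"
```

Using two N commands instead of three would make the new text the third line after the pattern rather than placing it after the third line.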

How can I remove the last character of a file in unix?

Say I have some arbitrary multi-line text file:
sometext
moretext
lastline
How can I remove only the last character (the e, not the newline or null) of the file without making the text file invalid?
A simpler approach (outputs to stdout, doesn't update the input file):
sed '$ s/.$//' somefile
$ is a Sed address that matches the last input line only, thus causing the following function call (s/.$//) to be executed on the last line only.
s/.$// replaces the last character on the (in this case last) line with an empty string; i.e., effectively removes the last char. (before the newline) on the line.
. matches any character on the line, and following it with $ anchors the match to the end of the line; note how the use of $ in this regular expression is conceptually related, but technically distinct from the previous use of $ as a Sed address.
Example with stdin input (assumes Bash, Ksh, or Zsh):
$ sed '$ s/.$//' <<< $'line one\nline two'
line one
line tw
To update the input file too (do not use if the input file is a symlink):
sed -i '$ s/.$//' somefile
Note:
On macOS, you'd have to use -i '' instead of just -i; for an overview of the pitfalls associated with -i, see the bottom half of this answer.
If you need to process very large input files and/or performance / disk usage are a concern and you're using GNU utilities (Linux), see ImHere's helpful answer.
truncate
truncate -s-1 file
Removes one byte (-1) from the end of the file, in place, just as >> appends to the same file.
The problem with this approach is that it doesn't retain a trailing newline if one existed: the newline is the byte that gets removed.
The solution is:
if [ -n "$(tail -c1 file)" ]    # if the file does not have a trailing newline
then
    truncate -s-1 file          # remove one char, as the question requests
else
    truncate -s-2 file          # remove the last two characters
    echo "" >> file             # add the trailing newline back
fi
This works because tail takes the last byte (not char).
It takes almost no time even with big files.
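The branch logic can be tried on a scratch copy of the question's three-line sample. This is a sketch that assumes GNU coreutils truncate is available and that the last character is a single byte; a multi-byte character (for example in UTF-8) would need more bytes removed.

```shell
# Hypothetical sample file that ends with a newline.
f=$(mktemp)
printf '%s\n' sometext moretext lastline > "$f"

if [ -n "$(tail -c1 "$f")" ]; then
    truncate -s-1 "$f"      # no trailing newline: drop the last byte
else
    truncate -s-2 "$f"      # drop the newline and the byte before it
    echo "" >> "$f"         # put the trailing newline back
fi

cat "$f"
```

The command substitution strips a lone newline, which is why an empty result from tail -c1 signals that the file ends with one.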
Why not sed
The problem with a sed solution like sed '$ s/.$//' file is that it reads the whole file first (taking a long time with large files), then you need a temporary file (of the same size as the original):
sed '$ s/.$//' file > tempfile
rm file; mv tempfile file
And then move the tempfile to replace the file.
Here's another using ex, which I find not as cryptic as the sed solution:
printf '%s\n' '$' 's/.$//' wq | ex somefile
The $ goes to the last line, the s deletes the last character, and wq is the well known (to vi users) write+quit.
After a whole bunch of playing around with different strategies (and avoiding sed -i or perl), the best way I found to do this was with:
sed '$! { P; D; }; s/.$//' somefile
If the goal is to remove the last character in the last line, this awk should do:
awk '{a[NR]=$0} END {for (i=1;i<NR;i++) print a[i];sub(/.$/,"",a[NR]);print a[NR]}' file
sometext
moretext
lastlin
It stores all the data in an array, then prints it out, changing the last line.
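If holding the whole file in an array is a concern, a variant that keeps only one line in memory is possible. This is a sketch of a different technique, not the answer's code: each line is printed one step late, so the final line can be trimmed in the END block.

```shell
trimmed=$(printf '%s\n' sometext moretext lastline |
    awk 'NR > 1 { print prev }   # print the previous line, now known not to be the last
         { prev = $0 }           # remember the current line
         END { if (NR) { sub(/.$/, "", prev); print prev } }')
printf '%s\n' "$trimmed"
```

The if (NR) guard simply avoids printing an empty line when the input is empty.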
Just a remark: sed will temporarily remove the file.
So if you are tailing the file, you'll get a "No such file or directory" warning until you reissue the tail command.
EDITED ANSWER
I created a test file on my Desktop and put your text inside. This test file is saved as "old_file.txt"
sometext
moretext
lastline
Afterwards I wrote a small script to take the old file and eliminate the last character in the last line
#!/bin/bash
old_file='/root/Desktop/old_file.txt'
new_file='/root/Desktop/my_new_file'
# Count the newline characters; this equals the number of lines when the file ends with a newline.
no_of_lines=$(wc -l < "$old_file")
# Copy every line except the last one unchanged.
head -n "$((no_of_lines - 1))" "$old_file" > "$new_file"
# Append the last line with its final character removed.
sed -n "${no_of_lines}p" "$old_file" | sed 's/.$//' >> "$new_file"
Opening the new file I created showed the following output:
sometext
moretext
lastlin
I apologize for my previous answer (wasn't reading carefully)
sed 's/.$//' filename | tee newFilename
This should do your job.
A couple perl solutions, for comparison/reference:
(echo 1a; echo 2b) | perl -e '$_=join("",<>); s/.$//; print'
(echo 1a; echo 2b) | perl -e 'while(<>){ if(eof) {s/.$//}; print }'
I find the first read-whole-file-into-memory approach can be generally quite useful (less so for this particular problem). You can now do regex's which span multiple lines, for example to combine every 3 lines of a certain format into 1 summary line.
For this problem, truncate would be faster and the sed version is shorter to type. Note that truncate requires a file to operate on, not a stream. Normally I find sed to lack the power of perl and I much prefer the extended-regex / perl-regex syntax. But this problem has a nice sed solution.

linux command for finding a substring and moving it to the end of line

I need to read a file line by line in Linux, find a substring in each line, remove it and place it at the end of that line.
Example:
Line in the original file:
a,b,c,substring,d,e,f
Line in the output file:
a,b,c,d,e,f,substring
How do I do it with a Linux command? Thanks!
sed '/substring/{ s///; s/$/substring/;} '
will handle a fixed substring. Note that if substring begins with a ,, this handles your example case well. If the substring is not fixed but may be a general regular expression:
sed 's/\(substring\)\(.*\)/\2\1/'
If you are looking for general csv parsing, you should rephrase the question. (It will be difficult to apply this solution to find a fixed string at the start of a line if you are thinking of the input as comma separated fields.)
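Applied to the example line from the question, the capture-and-swap form might look like the sketch below. It assumes the substring is preceded by a comma, so the separator travels with it; a line where the substring is the first field would not match and would pass through unchanged.

```shell
line='a,b,c,substring,d,e,f'
# Capture everything after ",substring" and swap the two pieces.
moved=$(printf '%s\n' "$line" | sed 's/,substring\(.*\)/\1,substring/')
printf '%s\n' "$moved"
```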
I always prefer perl's command line for such regex tasks. Perl is powerful enough to cover awk and sed in most of my usages, and it is available on both Windows and Linux, so it is easy and handy for me. The solution in perl would be (single-quoted here so that a Unix shell does not expand $1):
perl -ne 's/^(.*?)(?:(?<comma>,)(?<substr>substring)|(?<substr>substring)(?<comma>,))(?<right>.*)$/$1$+{right}$+{comma}$+{substr}/; print' input.txt > output.txt
or a simpler one:
perl -lpe 'if(s/(,substring|substring,)//){ s/$/,substring/ }' input.txt > output.txt
input.txt
substring,a,b,c,d,e,f
a,b,c,substring,d,e,f
a,b,c,d,e,f,substring
substring,a
a,substring
substring
a
output.txt
a,b,c,d,e,f,substring
a,b,c,d,e,f,substring
a,b,c,d,e,f,substring
a,substring
a,substring
substring
a
You can edit the command based on your actual input, for example:
If there are any spaces between words and commas
If you are using tab as the separator
Some explanation of the command line:
use perl's -n -e options: -n means process the input line by line in a loop; -e means the program is given on the command line
use perl's -l -p options: -l strips the line ending from each input line and adds it back when printing; -p means loop over the input like -n and always print each line
The one line program is just a regex replacement and a print
(?:pattern) means group but don't capture the match
(?<comma>) is a named group, you then need to use $+{comma} hash to access it

Replace whole line containing a string using Sed

I have a text file which has a particular line something like
sometext sometext sometext TEXT_TO_BE_REPLACED sometext sometext sometext
I need to replace the whole line above with
This line is removed by the admin.
The search keyword is TEXT_TO_BE_REPLACED
I need to write a shell script for this. How can I achieve this using sed?
You can use the change command to replace the entire line, and the -i flag to make the changes in-place. For example, using GNU sed:
sed -i '/TEXT_TO_BE_REPLACED/c\This line is removed by the admin.' /tmp/foo
You need to use wildcards (.*) before and after to replace the whole line:
sed 's/.*TEXT_TO_BE_REPLACED.*/This line is removed by the admin./'
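A dry run on a scratch file shows that only the matching line is rewritten. This is a minimal sketch: the surrounding lines are made up, and the output goes to stdout rather than back into the file.

```shell
# Hypothetical sample file with one line containing the search keyword.
f=$(mktemp)
printf '%s\n' 'keep this line' 'sometext TEXT_TO_BE_REPLACED sometext' 'keep this too' > "$f"

# .* on both sides makes the match cover the whole line.
replaced=$(sed 's/.*TEXT_TO_BE_REPLACED.*/This line is removed by the admin./' "$f")
printf '%s\n' "$replaced"
rm -f "$f"
```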
The Answer above:
sed -i '/TEXT_TO_BE_REPLACED/c\This line is removed by the admin.' /tmp/foo
Works fine if the replacement string/line is not a variable.
The issue is that on Redhat 5 the \ after the c escapes the $. A double \\ did not work either (at least on Redhat 5).
Through trial and error, I discovered that the \ after the c is redundant if your replacement string/line is only a single line. So I did not use \ after the c, used a variable as a single replacement line and it was joy.
The code would look something like:
sed -i "/TEXT_TO_BE_REPLACED/c $REPLACEMENT_TEXT_STRING" /tmp/foo
Note the use of double quotes instead of single quotes.
The accepted answer did not work for me for several reasons:
my version of sed does not like -i with a zero length extension
the syntax of the c\ command is weird and I couldn't get it to work
I didn't realize some of my issues were coming from unescaped slashes
So here is the solution I came up with which I think should work for most cases:
function escape_slashes {
    sed 's/\//\\\//g'
}
function change_line {
    local OLD_LINE_PATTERN=$1; shift
    local NEW_LINE=$1; shift
    local FILE=$1
    local NEW=$(echo "${NEW_LINE}" | escape_slashes)
    # FIX: No space after the option i.
    sed -i.bak '/'"${OLD_LINE_PATTERN}"'/s/.*/'"${NEW}"'/' "${FILE}"
    mv "${FILE}.bak" /tmp/
}
So the sample usage to fix the problem posed:
change_line "TEXT_TO_BE_REPLACED" "This line is removed by the admin." yourFile
All of the answers provided so far assume that you know something about the text to be replaced which makes sense, since that's what the OP asked. I'm providing an answer that assumes you know nothing about the text to be replaced and that there may be a separate line in the file with the same or similar content that you do not want to be replaced. Furthermore, I'm assuming you know the line number of the line to be replaced.
The following examples demonstrate the removing or changing of text by specific line numbers:
# replace line 17 with some replacement text and make changes in file (-i switch)
# the "-i" switch indicates that we want to change the file. Leave it out if you'd
# just like to see the potential changes output to the terminal window.
# "17s" indicates that we're searching line 17
# ".*" indicates that we want to change the text of the entire line
# "REPLACEMENT-TEXT" is the new text to put on that line
# "PATH-TO-FILE" tells us what file to operate on
sed -i '17s/.*/REPLACEMENT-TEXT/' PATH-TO-FILE
# replace specific text on line 3
sed -i '3s/TEXT-TO-REPLACE/REPLACEMENT-TEXT/' PATH-TO-FILE
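Line-number addressing can be previewed without -i. This sketch uses a made-up three-line file and combines both kinds of edit in a single sed call:

```shell
# Hypothetical sample file.
f=$(mktemp)
printf '%s\n' 'line one' 'line two' 'line three' > "$f"

# Replace the whole of line 2, then change one word on line 3 only.
edited=$(sed -e '2s/.*/REPLACEMENT-TEXT/' -e '3s/three/3/' "$f")
printf '%s\n' "$edited"
rm -f "$f"
```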
For manipulation of config files,
I came up with this solution, inspired by skensell's answer:
configLine [searchPattern] [replaceLine] [filePath]
It will:
create the file if it does not exist
replace the whole line (all lines) where searchPattern matches
add replaceLine at the end of the file if the pattern was not found
Function:
function configLine {
    local OLD_LINE_PATTERN=$1; shift
    local NEW_LINE=$1; shift
    local FILE=$1
    local NEW=$(echo "${NEW_LINE}" | sed 's/\//\\\//g')
    touch "${FILE}"
    sed -i '/'"${OLD_LINE_PATTERN}"'/{s/.*/'"${NEW}"'/;h};${x;/./{x;q100};x}' "${FILE}"
    if [[ $? -ne 100 ]] && [[ ${NEW_LINE} != '' ]]
    then
        echo "${NEW_LINE}" >> "${FILE}"
    fi
}
the crazy exit status magic comes from https://stackoverflow.com/a/12145797/1262663
In my makefile I use this:
@sed -i '/.*Revision:.*/c\'"`svn info -R main.cpp | awk '/^Rev/'`"'' README.md
PS: Do not forget that -i actually changes the text in the file, so if the text your pattern matches is itself changed by the replacement, the pattern will no longer match on the next run.
Example output:
Abc-Project written by John Doe
Revision: 1190
So a pattern of "Revision: 1190" stops matching once the revision number changes, whereas a pattern of just "Revision:" keeps working.
bash-4.1$ new_db_host="DB_HOSTNAME=good replaced with 122.334.567.90"
bash-4.1$
bash-4.1$ sed -i "/DB_HOST/c $new_db_host" test4sed
vim test4sed
'
'
'
DB_HOSTNAME=good replaced with 122.334.567.90
'
it works fine
To do this without relying on any GNUisms such as -i without a parameter or c without a linebreak:
sed '/TEXT_TO_BE_REPLACED/c\
This line is removed by the admin.
' infile > tmpfile && mv tmpfile infile
In this (POSIX compliant) form of the command
c\
text
text can consist of one or multiple lines, and linebreaks that should become part of the replacement have to be escaped:
c\
line1\
line2
s/x/y/
where s/x/y/ is a new sed command after the pattern space has been replaced by the two lines
line1
line2
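The escaped-linebreak form can be checked on a scratch file. This is a minimal sketch with made-up sample lines; the script ends right after line2, so no further sed command follows the replacement text.

```shell
# Hypothetical sample file with one matching line.
f=$(mktemp)
printf '%s\n' 'before' 'x TEXT_TO_BE_REPLACED x' 'after' > "$f"

# The backslash after line1 keeps the text going; line2 ends it.
changed=$(sed '/TEXT_TO_BE_REPLACED/c\
line1\
line2' "$f")
printf '%s\n' "$changed"
rm -f "$f"
```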
cat find_replace | while read pattern replacement ; do
    sed -i "/${pattern}/c ${replacement}" file
done
The find_replace file contains 2 columns: c1 with the pattern to match and c2 with the replacement. The sed loop replaces each line containing one of the patterns from column 1.
To replace a whole line containing a specified string with part of the content of that line:
Text file:
Row: 0 last_time_contacted=0, display_name=Mozart, _id=100, phonebook_bucket_alt=2
Row: 1 last_time_contacted=0, display_name=Bach, _id=101, phonebook_bucket_alt=2
Single string:
$ sed 's/.* display_name=\([[:alpha:]]\+\).*/\1/'
output:
Mozart
Bach
Multiple strings delimited by white-space:
$ sed 's/.* display_name=\([[:alpha:]]\+\).* _id=\([[:digit:]]\+\).*/\1 \2/'
output:
Mozart 100
Bach 101
Adjust the regex to meet your needs.
[:alpha:] and [:digit:]
are character classes, used inside bracket expressions.
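The \+ operator in the commands above is a GNU extension to basic regular expressions. Here is a sketch of the same extraction using -E (extended regular expressions), which should be accepted by both GNU and BSD sed; the scratch file simply holds the two sample rows.

```shell
# Scratch file holding the two sample rows.
f=$(mktemp)
printf '%s\n' \
    'Row: 0 last_time_contacted=0, display_name=Mozart, _id=100, phonebook_bucket_alt=2' \
    'Row: 1 last_time_contacted=0, display_name=Bach, _id=101, phonebook_bucket_alt=2' > "$f"

# With -E, + and ( ) need no backslashes.
names_ids=$(sed -E 's/.* display_name=([[:alpha:]]+).* _id=([[:digit:]]+).*/\1 \2/' "$f")
printf '%s\n' "$names_ids"
rm -f "$f"
```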
This worked for me:
sed -i <extension> 's/.*<Line to be replaced>.*/<New line to be added>/'
An example is:
sed -i .bak -e '7s/.*version.*/ version = "4.33.0"/'
-i: The extension for the backup file made before the replacement. In this case, it is .bak.
-e: The sed script. In this case, it is '7s/.*version.*/ version = "4.33.0"/'. If you want to use a sed file use the -f flag
7s: 7 is the line number in the file to be replaced and s is the substitute command, so 7s means substitute on line 7.
Note:
If you want to do a recursive find and replace with sed, you can put grep at the beginning of the command and pass the matching file names to sed with xargs:
grep -rl --exclude-dir=<directory-to-exclude> --include=\*<Files to include> "<Line to be replaced>" ./ | xargs sed -i <extension> 's/.*<Line to be replaced>.*/<New line to be added>/'
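A recursive replace needs the file names from grep to reach sed as arguments, which is what xargs does. This is a sketch under stated assumptions: the directory layout and file contents are invented, the file names contain no whitespace, and -i.bak (suffix attached) is used because that form should work with both GNU and BSD sed.

```shell
# Hypothetical project tree: one file to edit, one inside an excluded directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/vendor"
printf '%s\n' 'name = "demo"' 'version = "1.0.0"' > "$workdir/src/app.toml"
printf '%s\n' 'version = "0.0.1"' > "$workdir/vendor/lib.toml"

# List files containing "version" (skipping vendor/) and hand them to sed.
grep -rl --exclude-dir=vendor --include='*.toml' 'version' "$workdir" |
    xargs sed -i.bak 's/.*version.*/version = "4.33.0"/'

cat "$workdir/src/app.toml"
```

If grep finds no files, some xargs implementations still run sed once with no file arguments, so guarding against an empty list is worth considering in scripts.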
The question asks for solutions using sed, but if that's not a hard requirement then there is another option which might be a wiser choice.
The accepted answer suggests sed -i and describes it as replacing the file in-place, but -i doesn't really do that and instead does the equivalent of sed pattern file > tmp; mv tmp file, preserving ownership and modes. This is not ideal in many circumstances. In general I do not recommend running sed -i non-interactively as part of an automatic process--it's like setting a bomb with a fuse of an unknown length. Sooner or later it will blow up on someone.
To actually edit a file "in place" and replace a line matching a pattern with some other content you would be well served to use an actual text editor. This is how it's done with ed, the standard text editor.
printf '%s\n' '/TEXT_TO_BE_REPLACED/' d i 'This line is removed by the admin' . w q | \
ed -s /tmp/foo > /dev/null
Note that this only replaces the first matching line, which is what the question implied was wanted. This is a material difference from most of the other answers.
That disadvantage aside, there are some advantages to using ed over sed:
You can replace the match with one or multiple lines without any extra effort.
The replacement text can be arbitrarily complex without needing any escaping to protect it.
Most importantly, the original file is opened, modified, and saved. A copy is not made.
How it works:
printf will use its first argument as a format string and print each of its other arguments using that format, effectively meaning that each argument to printf becomes a line of output, which is all sent to ed on stdin.
The first line is a regex pattern match which causes ed to move its notion of "the current line" forward to the first line that matches (if there is no match the current line is set to the last line of the file).
The next is the d command which instructs ed to delete the entire current line.
After that is the i command which puts ed into insert mode;
after that all subsequent lines entered are written to the current line (or additional lines if there are any embedded newlines). This means you can expand a variable (e.g. "$foo") containing multiple lines here and it will insert all of them.
Insert mode ends when ed sees a line consisting of .
The w command writes the content of the file to disk, and
the q command quits.
The ed command is given the -s switch, putting it into silent mode so it doesn't echo any information as it runs,
the file to be edited is given as an argument to ed,
and, finally, stdout is thrown away to prevent the line matching the regex from being printed.
Some Unix-like systems may (inappropriately) ship without an ed installed, but may still ship with an ex; if so you can simply use it instead. If you have vim but no ex or ed you can use vim -e instead. If you have only standard vi but no ex or ed, complain to your sysadmin.
This is similar to the one above:
sed 's/[A-Za-z0-9]*TEXT_TO_BE_REPLACED.[A-Za-z0-9]*/This line is removed by the admin./'
The command below works for me, and it works with variables:
sed -i "/\<$E\>/c $D" "$B"
I very often use regex to extract data from files. I just used this to replace the literal quote \" with nothing :-)
cat file.csv | egrep '^\"([0-9]{1,3}\.[0-9]{1,3}\.)' | sed s/\"//g | cut -d, -f1 > list.txt

Resources