Unix/Linux, Delete comments from lines

I need to remove comments from a user-input line without deleting any code. For example:
mail -s 'text' brown < text #comments
How do I remove the comments and leave the code intact?
I can delete lines that begin with #, but not if it begins somewhere in the middle of the lines.
I tried:
echo $line | sed -e 's/\
but it does not work. Any idea what I'm doing wrong?
Also, how do I detect cases in which # does not begin a comment?
For example, a quoted # or a line of code that ends with # is not a comment.
echo $line | sed -e '/^#/d'
In this line, the # is not used as a comment but as part of the code. I figured out that I need to detect whether the # is within quotes or has no whitespace character before it. How do I leave such lines as they are?

You can remove all from # to end of line using this awk
awk '{sub(/#.*$/,"")}1' file
But if you have file like this:
#!/bin/bash
pidof tail #See if tail is running
if [ $? -ne 0 ] ; then #start loop
awk '{print " # "$8}' file >tmp # this is my code
fi # end of loop
awk -F# '{for (i=1;i<=NF;i++) print $i}' file > tmp2
a=a+1 # increment a
There is no way to remove the comments automatically without destroying some of the code.

Well, consider what almost always comes after a comment in bash.
#comment...
#another comment
A line break! Which is effectively a character. So all you have to do is add a wildcard after your # to match the actual comment text, then match the line break 'character' at the end. You'll need to write \n rather than hitting Enter. Unfortunately I'm not on Linux at the moment, and sometimes escape characters (the backslash) don't work properly. Something like `\n` might work, or maybe $'\n'.
EDIT: With regex ^ will indicate the start of a new line, while $ indicates the end.
As for not deleting actual code, matching for a space immediately followed by # should work. I would match for a space OR line break preceding the #.
At any rate, please be sure not to accidentally ruin whatever you're working on, just in case I'm wrong.
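
The rule discussed above (treat # as a comment only when it starts the line or follows whitespace, and never inside quotes) can be sketched in awk. This is a rough illustration rather than a real shell parser: the function name strip_comments is made up here, backslash escapes and here-documents are not handled, and a #! line at the top of a script would be stripped like any other comment.

```shell
# Hypothetical helper: reads lines on stdin, writes them without trailing comments.
strip_comments() {
    awk '{
        out = ""; q = ""                      # q holds the currently open quote character, if any
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (q != "") {                    # inside quotes: only look for the closing quote
                if (c == q) q = ""
            } else if (c == "\"" || c == "\047") {
                q = c                         # opening double or single quote
            } else if (c == "#") {
                p = (i == 1) ? " " : substr($0, i - 1, 1)
                if (p == " " || p == "\t") break   # comment starts here
            }
            out = out c
        }
        sub(/[ \t]+$/, "", out)               # drop whitespace left in front of the comment
        print out
    }'
}

printf '%s\n' "mail -s 'text' brown < text #comments" | strip_comments
```

A # that directly follows a non-blank character, as in echo $#, is left alone, which matches the whitespace rule suggested in the answer.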

Related

Unexpected End Of File Error for invalid line number [duplicate]

I need my script to send an email from terminal. Based on what I've seen here and many other places online, I formatted it like this:
/var/mail -s "$SUBJECT" "$EMAIL" << EOF
Here's a line of my message!
And here's another line!
Last line of the message here!
EOF
However, when I run this I get this warning:
myfile.sh: line x: warning: here-document at line y delimited by end-of-file (wanted 'EOF')
myfile.sh: line x+1: syntax error: unexpected end of file
...where line x is the last written line of code in the program, and line y is the line with /var/mail in it. I've tried replacing EOF with other things (ENDOFMESSAGE, FINISH, etc.) but to no avail. Nearly everything I've found online has it done this way, and I'm really new at bash so I'm having a hard time figuring it out on my own. Could anyone offer any help?

The EOF token must be at the beginning of the line, you can't indent it along with the block of code it goes with.
If you write <<-EOF you may indent it, but it must be indented with Tab characters, not spaces. So it still might not end up even with the block of code.
Also make sure you have no whitespace after the EOF token on the line.

The line that starts or ends the here-doc probably has some non-printable or whitespace characters (for example, carriage return) which means that the second "EOF" does not match the first, and doesn't end the here-doc like it should. This is a very common error, and difficult to detect with just a text editor. You can make non-printable characters visible for example with cat:
cat -A myfile.sh
Once you see the output from cat -A the solution will be obvious: remove the offending characters.
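
As a complement to reading the cat -A output by eye, a grep pattern can point at suspicious terminator lines directly. This is only a sketch: the function name check_heredoc_terminator is invented for illustration, it assumes you pass the terminator word yourself (default EOF), and it will also flag tab-indented terminators that are perfectly valid with <<-.

```shell
# Hypothetical helper: list lines where the terminator word has leading
# whitespace, or trailing whitespace such as a space or a carriage return.
check_heredoc_terminator() {
    # $1: script file, $2: terminator word (defaults to EOF)
    grep -n -E "^([[:space:]]+${2:-EOF}[[:space:]]*|${2:-EOF}[[:space:]]+)\$" "$1"
}
```

Running check_heredoc_terminator myfile.sh prints the line number and content of each questionable terminator line, and prints nothing when every terminator stands alone on its line.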

Please try to remove the preceding spaces before EOF:
/var/mail -s "$SUBJECT" "$EMAIL" <<-EOF
Using <tab> instead of <spaces> for indentation AND using <<-EOF works fine.
The "-" removes the <tabs>, not <spaces>, but at least this works.

Note one can also get this error if you do this;
while read line; do
echo $line
done << somefile
Because << somefile should read < somefile in this case.

May be old but I had a space after the ending EOF
<< EOF
blah
blah
EOF <-- this was the issue. Had it for years, finally looked it up here

For anyone stumbling here who googled "bash warning: here-document delimited by end-of-file", it may be that you are getting the
warning: here-document at line 74 delimited by end-of-file
...type warning because you accidentally used a here document symbol (<<) when you meant to use a here string symbol (<<<). That was my case.

Here is a flexible way to deal with multiple indented lines without using a heredoc.
echo 'Hello!'
sed -e 's:^\s*::' < <(echo '
Some indented text here.
Some indented text here.
')
if [[ true ]]; then
sed -e 's:^\s\{4,4\}::' < <(echo '
Some indented text here.
Some extra indented text here.
Some indented text here.
')
fi
Some notes on this solution:
if the content is expected to contain single quotes, either escape them using \ or replace the string delimiters with double quotes. In the latter case, be careful that constructions like $(command) will be interpreted. If the string contains both single and double quotes, you'll have to escape at least one kind.
the given example prints a trailing empty line; there are numerous ways to get rid of it, not included here to keep the proposal to a minimum of clutter
the flexibility comes from the ease with which you can control how much leading space should stay or go, provided that you know some sed regular expressions, of course.

When I want to have docstrings for my bash functions, I use a solution similar to the suggestion of user12205 in a duplicate of this question.
See how I define USAGE for a solution that:
auto-formats well for me in my IDE of choice (sublime)
is multi-line
can use spaces or tabs as indentation
preserves indentations within the comment.
function foo {
# Docstring
read -r -d '' USAGE <<' END'
# This method prints foo to the terminal.
#
# Enter `foo -h` to see the docstring.
# It has indentations and multiple lines.
#
# Change the delimiter if you need hashtag for some reason.
# This can include $$ and = and eval, but won't be evaluated
END
if [ "$1" = "-h" ]
then
echo "$USAGE" | cut -d "#" -f 2 | cut -c 2-
return
fi
echo "foo"
}
So foo -h yields:
This method prints foo to the terminal.
Enter `foo -h` to see the docstring.
It has indentations and multiple lines.
Change the delimiter if you need hashtag for some reason.
This can include $$ and = and eval, but won't be evaluated
Explanation
cut -d "#" -f 2: Retrieve the second portion of the # delimited lines. (Think a csv with "#" as the delimiter, empty first column).
cut -c 2-: Retrieve the 2nd to end character of the resultant string
Also note that if [ "$1" = "-h" ] evaluates as False if there is no first argument, w/o error, since it becomes an empty string.
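
The two cut steps explained above can be tried in isolation. In this small sketch the function name strip_doc_prefix and the sample lines are made up for illustration; the pipeline itself is the one from the answer.

```shell
# Hypothetical wrapper around the answer's pipeline: keep the text after the
# first "#", then drop the single space that follows it.
strip_doc_prefix() {
    cut -d '#' -f 2 | cut -c 2-
}

printf '%s\n' '    # Prints foo.' '    #' '    #   indented detail' | strip_doc_prefix
```

Because cut -f 2 keeps only the second field, anything after a second # on the same line is dropped, which is presumably why the docstring advises changing the delimiter if you need a hash character in the text.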

Make sure the ending EOF is at the very beginning of a new line.

Along with the other answers mentioned by Barmar and Joni, I've noticed that I sometimes have to leave a blank line before and after my EOF when using <<-EOF.

How can I remove the last character of a file in unix?

Say I have some arbitrary multi-line text file:
sometext
moretext
lastline
How can I remove only the last character (the e, not the newline or null) of the file without making the text file invalid?

A simpler approach (outputs to stdout, doesn't update the input file):
sed '$ s/.$//' somefile
$ is a Sed address that matches the last input line only, thus causing the following function call (s/.$//) to be executed on the last line only.
s/.$// replaces the last character on the (in this case last) line with an empty string; i.e., effectively removes the last char. (before the newline) on the line.
. matches any character on the line, and following it with $ anchors the match to the end of the line; note how the use of $ in this regular expression is conceptually related, but technically distinct from the previous use of $ as a Sed address.
Example with stdin input (assumes Bash, Ksh, or Zsh):
$ sed '$ s/.$//' <<< $'line one\nline two'
line one
line tw
To update the input file too (do not use if the input file is a symlink):
sed -i '$ s/.$//' somefile
Note:
On macOS, you'd have to use -i '' instead of just -i; for an overview of the pitfalls associated with -i, see the bottom half of this answer.
If you need to process very large input files and/or performance / disk usage are a concern and you're using GNU utilities (Linux), see ImHere's helpful answer.

truncate
truncate -s-1 file
Removes one (-1) character from the end of the same file. Exactly as a >> will append to the same file.
The problem with this approach is that it doesn't retain a trailing newline if it existed.
The solution is:
if [ -n "$(tail -c1 file)" ] # if the file has no trailing newline
then
truncate -s-1 file # remove one char as the question request.
else
truncate -s-2 file # remove the last two characters
echo "" >> file # add the trailing new line back
fi
This works because tail takes the last byte (not char).
It takes almost no time even with big files.
Why not sed
The problem with a sed solution like sed '$ s/.$//' file is that it reads the whole file first (taking a long time with large files), then you need a temporary file (of the same size as the original):
sed '$ s/.$//' file > tempfile
rm file; mv tempfile file
And then move the tempfile to replace the file.

Here's another using ex, which I find not as cryptic as the sed solution:
printf '%s\n' '$' 's/.$//' wq | ex somefile
The $ goes to the last line, the s deletes the last character, and wq is the well known (to vi users) write+quit.

After a whole bunch of playing around with different strategies (and avoiding sed -i or perl), the best way I found to do this was with:
sed '$! { P; D; }; s/.$//' somefile

If the goal is to remove the last character in the last line, this awk should do:
awk '{a[NR]=$0} END {for (i=1;i<NR;i++) print a[i];sub(/.$/,"",a[NR]);print a[NR]}' file
sometext
moretext
lastlin
It stores all the data in an array, then prints it out, changing the last line.
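
If holding the whole file in memory is a concern, the same idea works while remembering only the previous line. This is a variation on the answer, not the author's code, and the function name chop_last_char is invented for the example.

```shell
# Hypothetical variant: print each line one step late, so that the final
# line can be shortened by one character before it is printed.
chop_last_char() {
    awk 'NR > 1 { print prev }
         { prev = $0 }
         END { if (NR) { sub(/.$/, "", prev); print prev } }' "$@"
}

printf 'sometext\nmoretext\nlastline\n' | chop_last_char
```

Like the array version, this writes to stdout and leaves the input file untouched.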

Just a remark: sed -i will temporarily remove the file.
So if you are tailing the file, you'll get a "No such file or directory" warning until you reissue the tail command.

EDITED ANSWER
I created a script and put your text inside on my Desktop. this test file is saved as "old_file.txt"
sometext
moretext
lastline
Afterwards I wrote a small script to take the old file and eliminate the last character in the last line
#!/bin/bash
no_of_new_line_characters=`wc '/root/Desktop/old_file.txt'|cut -d ' ' -f2`
let "no_of_lines=no_of_new_line_characters+1"
sed -n 1,"$no_of_new_line_characters"p '/root/Desktop/old_file.txt' > '/root/Desktop/my_new_file'
sed -n "$no_of_lines","$no_of_lines"p '/root/Desktop/old_file.txt'|sed 's/.$//g' >> '/root/Desktop/my_new_file'
opening the new_file I created, showed the output as follows:
sometext
moretext
lastlin
I apologize for my previous answer (wasn't reading carefully)

sed '$ s/.$//' filename | tee newFilename
This should do your job.

A couple perl solutions, for comparison/reference:
(echo 1a; echo 2b) | perl -e '$_=join("",<>); s/.$//; print'
(echo 1a; echo 2b) | perl -e 'while(<>){ if(eof) {s/.$//}; print }'
I find the first read-whole-file-into-memory approach can be generally quite useful (less so for this particular problem). You can now do regex's which span multiple lines, for example to combine every 3 lines of a certain format into 1 summary line.
For this problem, truncate would be faster and the sed version is shorter to type. Note that truncate requires a file to operate on, not a stream. Normally I find sed to lack the power of perl and I much prefer the extended-regex / perl-regex syntax. But this problem has a nice sed solution.

How to do something like grep -B to select only one line?

Everything is in the title. Basically, let's say I have this pattern
some text lalala
another line
much funny wow grep
I grep funny and I want my output to be "lalala"
Thank you

One possible answer is to use either ed or ex to do this (it is trivial in them):
ed - yourfile <<< 'g/funny/.-2p'
(Or replace ed with ex. You might have red, the restricted editor, too; it can't modify files.) This looks for the pattern /funny/ globally, and whenever it is found, prints the line 2 before the matching line (that's the .-2p part). Or, if you want the most recent line containing 'lalala' before the line matching 'funny':
ed - yourfile <<< 'g/funny/?lalala?p'
The only problem is if you're trying to process standard input rather than a file; then you have to save the standard input to a file and process that file, which spoils the concurrency.
You can't do negative offsets in sed (though GNU sed allows you to do positive offsets, so you could use sed -n '/lalala/,+2p' file to get the 'lalala' to 'funny' lines (which isn't quite what you want) based on finding 'lalala', but you cannot find the 'lalala' lines based on finding 'funny'). Standard sed does not allow offsets at all.
If you need to print just the IP address found on a line 8 lines before the pattern-matching line, you need a slightly more involved ed script, but it is still doable:
ed - yourfile <<< 'g/funny/.-8s/.* //p'
This uses the same basic mechanism to find the right line, then runs a substitute command to remove everything up to the last space on the line and print the modified version. Since there isn't a w command, it doesn't actually modify the file.

Since grep -B always prints whole lines before the match, you'll have to pipe the output into something like grep or Awk.
grep -B 2 "funny" file|awk 'NR==1{print $NF; exit}'
You could also just use Awk.
awk -v s="funny" '/[[:space:]]lalala$/{n=NR+2; o=$NF}NR==n && $0~s{print o}' file
For the specific example of an IP address 8 lines before the match as mentioned in your comment:
awk -v s="funny" '
/[[:space:]][0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$/ {
n=NR+8
ip=$NF
}
NR==n && $0~s {
print ip
}' file
These Awk solutions first find the output field you might want, then print the output only if the word you want exists in the nth following line.

Here's an attempt at a slightly generalized Awk solution. It maintains a circular queue of the last q lines and prints the line at the head of the queue when it sees a match.
#!/bin/sh
: ${q=8}
e=$1
shift
awk -v q="$q" -v e="$e" '{ m[(NR%q)+1] = $0 }
$0 ~ e { print m[((NR+1)%q)+1] }' "${@--}"
Adapting to a different default (I set it to 8) or proper option handling (currently, you'd run it like q=3 ./qgrep regex file) as well as remembering (and hence printing) the entire line should be easy enough.
(I also didn't bother to make it work correctly if you see a match in the first q-1 lines. It will just print an empty line then.)
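
For comparison, here is a sketch of the same ring-buffer idea with the offset passed as an argument and matches too close to the start of the input skipped rather than printed as empty lines. The function name line_before and its argument order are assumptions made for this example, not part of the answer.

```shell
# Hypothetical helper: print the line that sits a fixed number of lines
# before each line matching a regex.
line_before() {
    # $1: regex, $2: how many lines back, remaining arguments: files (stdin if none)
    pat=$1; back=$2; shift 2
    awk -v pat="$pat" -v back="$back" '
        # The buffer has back+1 slots, so line NR-back is still present here.
        $0 ~ pat && NR > back { print buf[(NR - back) % (back + 1)] }
        { buf[NR % (back + 1)] = $0 }
    ' "$@"
}

printf 'some text lalala\nanother line\nmuch funny wow grep\n' | line_before funny 2
```

To get only the last word, as the question asks, the output can be piped through awk '{print $NF}'.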

SED replacing with 'possible' newline

I have a sed command that is working fine, except when it comes across a newline right in the file somewhere. Here is my command:
sed -i 's,\(.*\),\2 - \1,g'
Now, it works perfectly, but I just ran across this file that has the a tag like so:
<a href="link">Click
here now</a>
Of course it didn't find this one. So I need to modify it somehow to allow for line breaks in the search. But I have no clue how to make it allow for that unless I first go over the entire file and remove all \n beforehand. The problem there is that I lose all formatting in the file.

You can do this by inserting a loop into your sed script:
sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s,\(.*\),\2 - \1,g;}' yourfile
As-is, that will leave an embedded newline in the output, and it wasn't clear if you wanted it that way or not. If not, just substitute out the newline:
sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s/\n//g;s,\(.*\),\2 - \1,g;}' yourfile
And maybe clean up extra spaces:
sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s/\n//g;s/\s\{2,\}/ /g;s,\(.*\),\2 - \1,g;}' yourfile
Explanation: The /<a href/{...} lets us ignore lines we don't care about. Once we find one we like, we check to see if it has the end marker. If not (/<\/a>/!) we append a newline and the next line to the pattern space (N) and branch (b) back to :next to see if we've found it yet. Once we find it we continue on with the substitutions.
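
The loop on its own, without the substitution from the question, can be sketched as follows. The function name join_links is hypothetical, the embedded newline is replaced by a space here as a design choice, and the script is split across several -e options so that it does not rely on ; after a label.

```shell
# Hypothetical helper: join a link that is split across lines into one line.
join_links() {
    sed -e ':next' \
        -e '/<a href/{' \
        -e '/<\/a>/!{' \
        -e 'N' \
        -e 'b next' \
        -e '}' \
        -e 's/\n/ /g' \
        -e '}' "$@"
}

printf '%s\n' '<a href="link">Click' 'here now</a>' 'plain text' | join_links
```

Lines without <a href pass through unchanged, so the rest of the file keeps its formatting.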

Here is a quick and dirty solution that assumes there will be no more than one newline in a link:
sed -i '' -e '/\(.*\),\2 - \1,g'
The first command (/<a href=.*>/{/<\/a>/!{N;s|\n||;};}) checks for the presence of <a href=...> without </a>, in which case it reads the next line into the pattern space and removes the newline. The second is yours.

Replace whole line containing a string using Sed

I have a text file which has a particular line something like
sometext sometext sometext TEXT_TO_BE_REPLACED sometext sometext sometext
I need to replace the whole line above with
This line is removed by the admin.
The search keyword is TEXT_TO_BE_REPLACED
I need to write a shell script for this. How can I achieve this using sed?

You can use the change command to replace the entire line, and the -i flag to make the changes in-place. For example, using GNU sed:
sed -i '/TEXT_TO_BE_REPLACED/c\This line is removed by the admin.' /tmp/foo

You need to use wildcards (.*) before and after to replace the whole line:
sed 's/.*TEXT_TO_BE_REPLACED.*/This line is removed by the admin./'

The Answer above:
sed -i '/TEXT_TO_BE_REPLACED/c\This line is removed by the admin.' /tmp/foo
Works fine if the replacement string/line is not a variable.
The issue is that on Redhat 5 the \ after the c escapes the $. A double \\ did not work either (at least on Redhat 5).
Through hit and trial, I discovered that the \ after the c is redundant if your replacement string/line is only a single line. So I did not use \ after the c, used a variable as a single replacement line and it was joy.
The code would look something like:
sed -i "/TEXT_TO_BE_REPLACED/c $REPLACEMENT_TEXT_STRING" /tmp/foo
Note the use of double quotes instead of single quotes.

The accepted answer did not work for me for several reasons:
my version of sed does not like -i with a zero length extension
the syntax of the c\ command is weird and I couldn't get it to work
I didn't realize some of my issues are coming from unescaped slashes
So here is the solution I came up with which I think should work for most cases:
function escape_slashes {
sed 's/\//\\\//g'
}
function change_line {
local OLD_LINE_PATTERN=$1; shift
local NEW_LINE=$1; shift
local FILE=$1
local NEW=$(echo "${NEW_LINE}" | escape_slashes)
# FIX: No space after the option i.
sed -i.bak '/'"${OLD_LINE_PATTERN}"'/s/.*/'"${NEW}"'/' "${FILE}"
mv "${FILE}.bak" /tmp/
}
So the sample usage to fix the problem posed:
change_line "TEXT_TO_BE_REPLACED" "This line is removed by the admin." yourFile
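
Escaping only slashes still leaves characters such as & and \ with special meaning in the sed replacement. One way around escaping altogether, sketched here as an alternative rather than as a fix to the function above, is to let awk do a literal substring test. The function name replace_matching_lines and the environment variable names are made up for this example, and the result goes to stdout instead of being written back to the file.

```shell
# Hypothetical helper: replace every line containing a literal string.
replace_matching_lines() {
    # $1: literal text to look for, $2: replacement line, remaining arguments: files
    needle=$1; newline=$2; shift 2
    # Pass the strings through the environment so awk does not interpret
    # backslashes in them, as it would with -v assignments.
    NEEDLE=$needle NEWLINE=$newline awk '
        index($0, ENVIRON["NEEDLE"]) { print ENVIRON["NEWLINE"]; next }
        { print }
    ' "$@"
}

printf '%s\n' 'first' 'x TEXT_TO_BE_REPLACED y' 'last' |
    replace_matching_lines 'TEXT_TO_BE_REPLACED' 'This line is removed by the admin.'
```

Note that the search string is matched literally here, not as a regular expression, which differs from the sed versions.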

All of the answers provided so far assume that you know something about the text to be replaced which makes sense, since that's what the OP asked. I'm providing an answer that assumes you know nothing about the text to be replaced and that there may be a separate line in the file with the same or similar content that you do not want to be replaced. Furthermore, I'm assuming you know the line number of the line to be replaced.
The following examples demonstrate the removing or changing of text by specific line numbers:
# replace line 17 with some replacement text and make changes in file (-i switch)
# the "-i" switch indicates that we want to change the file. Leave it out if you'd
# just like to see the potential changes output to the terminal window.
# "17s" indicates that we're searching line 17
# ".*" indicates that we want to change the text of the entire line
# "REPLACEMENT-TEXT" is the new text to put on that line
# "PATH-TO-FILE" tells us what file to operate on
sed -i '17s/.*/REPLACEMENT-TEXT/' PATH-TO-FILE
# replace specific text on line 3
sed -i '3s/TEXT-TO-REPLACE/REPLACEMENT-TEXT/' PATH-TO-FILE

For manipulating config files, I came up with this solution, inspired by skensell's answer:
configLine [searchPattern] [replaceLine] [filePath]
It will:
create the file if it does not exist
replace the whole line (all lines) where searchPattern matches
add replaceLine at the end of the file if the pattern was not found
Function:
function configLine {
local OLD_LINE_PATTERN=$1; shift
local NEW_LINE=$1; shift
local FILE=$1
local NEW=$(echo "${NEW_LINE}" | sed 's/\//\\\//g')
touch "${FILE}"
sed -i '/'"${OLD_LINE_PATTERN}"'/{s/.*/'"${NEW}"'/;h};${x;/./{x;q100};x}' "${FILE}"
if [[ $? -ne 100 ]] && [[ ${NEW_LINE} != '' ]]
then
echo "${NEW_LINE}" >> "${FILE}"
fi
}
the crazy exit status magic comes from https://stackoverflow.com/a/12145797/1262663

In my makefile I use this:
#sed -i '/.*Revision:.*/c\'"`svn info -R main.cpp | awk '/^Rev/'`"'' README.md
PS: Do not forget that -i actually changes the text in the file, so if the text you matched as "Revision" changes, you must also change the pattern you search for.
Example output:
Abc-Project written by John Doe
Revision: 1190
So a pattern of "Revision: 1190" is obviously not the same as a pattern of "Revision:" only.

bash-4.1$ new_db_host="DB_HOSTNAME=good replaced with 122.334.567.90"
bash-4.1$
bash-4.1$ sed -i "/DB_HOST/c $new_db_host" test4sed
vim test4sed
'
'
'
DB_HOSTNAME=good replaced with 122.334.567.90
'
it works fine

To do this without relying on any GNUisms such as -i without a parameter or c without a linebreak:
sed '/TEXT_TO_BE_REPLACED/c\
This line is removed by the admin.
' infile > tmpfile && mv tmpfile infile
In this (POSIX compliant) form of the command
c\
text
text can consist of one or multiple lines, and linebreaks that should become part of the replacement have to be escaped:
c\
line1\
line2
s/x/y/
where s/x/y/ is a new sed command after the pattern space has been replaced by the two lines
line1
line2

cat find_replace | while read pattern replacement ; do
sed -i "/${pattern}/c ${replacement}" file
done
The find_replace file contains two columns: c1 with the pattern to match and c2 with the replacement. The sed loop replaces each line containing one of the patterns from column 1.

To replace whole line containing a specified string with the content of that line
Text file:
Row: 0 last_time_contacted=0, display_name=Mozart, _id=100, phonebook_bucket_alt=2
Row: 1 last_time_contacted=0, display_name=Bach, _id=101, phonebook_bucket_alt=2
Single string:
$ sed 's/.* display_name=\([[:alpha:]]\+\).*/\1/'
output:
Mozart
Bach
Multiple strings delimited by white-space:
$ sed 's/.* display_name=\([[:alpha:]]\+\).* _id=\([[:digit:]]\+\).*/\1 \2/'
output:
Mozart 100
Bach 101
Adjust regex to meet your needs
[:alpha:] and [:digit:] are character classes, used inside bracket expressions.

This worked for me:
sed -i <extension> 's/.*<Line to be replaced>.*/<New line to be added>/'
An example is:
sed -i .bak -e '7s/.*version.*/ version = "4.33.0"/'
-i: The extension for the backup file after the replacement. In this case, it is .bak.
-e: The sed script. In this case, it is '7s/.*version.*/ version = "4.33.0"/'. If you want to use a sed file use the -f flag
7s: The address and the command. 7 restricts the command to line 7 of the file, and s is the substitute command.
Note:
If you want to do a recursive find and replace with sed, you can put grep at the beginning of the command and hand the matching file names to sed with xargs:
grep -rl --exclude-dir=<directory-to-exclude> --include=\*<Files to include> "<Line to be replaced>" ./ | xargs sed -i <extension> 's/.*<Line to be replaced>.*/<New line to be added>/'

The question asks for solutions using sed, but if that's not a hard requirement then there is another option which might be a wiser choice.
The accepted answer suggests sed -i and describes it as replacing the file in-place, but -i doesn't really do that and instead does the equivalent of sed pattern file > tmp; mv tmp file, preserving ownership and modes. This is not ideal in many circumstances. In general I do not recommend running sed -i non-interactively as part of an automatic process--it's like setting a bomb with a fuse of an unknown length. Sooner or later it will blow up on someone.
To actually edit a file "in place" and replace a line matching a pattern with some other content you would be well served to use an actual text editor. This is how it's done with ed, the standard text editor.
printf '%s\n' '/TEXT_TO_BE_REPLACED/' d i 'This line is removed by the admin' . w q | \
ed -s /tmp/foo > /dev/null
Note that this only replaces the first matching line, which is what the question implied was wanted. This is a material difference from most of the other answers.
That disadvantage aside, there are some advantages to using ed over sed:
You can replace the match with one or multiple lines without any extra effort.
The replacement text can be arbitrarily complex without needing any escaping to protect it.
Most importantly, the original file is opened, modified, and saved. A copy is not made.
How it works
printf will use its first argument as a format string and print each of its other arguments using that format, effectively meaning that each argument to printf becomes a line of output, which is all sent to ed on stdin.
The first line is a regex pattern match which causes ed to move its notion of "the current line" forward to the first line that matches (if there is no match the current line is set to the last line of the file).
The next is the d command which instructs ed to delete the entire current line.
After that is the i command which puts ed into insert mode;
after that all subsequent lines entered are written to the current line (or additional lines if there are any embedded newlines). This means you can expand a variable (e.g. "$foo") containing multiple lines here and it will insert all of them.
Insert mode ends when ed sees a line consisting of .
The w command writes the content of the file to disk, and
the q command quits.
The ed command is given the -s switch, putting it into silent mode so it doesn't echo any information as it runs,
the file to be edited is given as an argument to ed,
and, finally, stdout is thrown away to prevent the line matching the regex from being printed.
Some Unix-like systems may (inappropriately) ship without an ed installed, but may still ship with an ex; if so you can simply use it instead. If you have vim but no ex or ed you can use vim -e instead. If you have only standard vi but no ex or ed, complain to your sysadmin.

This is similar to the one above:
sed 's/[A-Za-z0-9]*TEXT_TO_BE_REPLACED.[A-Za-z0-9]*/This line is removed by the admin./'

The command below works for me, and it works with variables:
sed -i "/\<$E\>/c $D" "$B"

I very often use regex to extract data from files. I just used this to replace the literal quote \" with nothing :-)
cat file.csv | egrep '^\"([0-9]{1,3}\.[0-9]{1,3}\.)' | sed s/\"//g | cut -d, -f1 > list.txt
