Background information:
I am trying to write a small shell script that searches for a pattern (string) in a .fas file and prints the line and position where the pattern was found. The following code snippet works when I call the shell script:
Script (search.sh):
#!/bin/bash
awk 's=index($0, "CAATCTCC"){print "line=" NR, "start position=" s}' 100nt_upstream_of_mTSS.fas
Command line call:
$ ./search.sh
First problem:
When I change the script to:
awk 's=index($0, "CAATCTCC"){print "line=" NR, "start position=" s}'
and do the following command line call in my bash:
$ ./search.sh 100nt_upstream_of_mTSS.fas
"nothing" happens (something is running, but it takes way too long and no results come up, so I terminate the process).
Worth knowing:
I am in the directory, where search.sh is located
the file 100nt_upstream_of_mTSS.fas is located there, too
search.sh is executable
I might be "screen blind", but I can't find the reason, why I am unable to pass a command line argument to my script.
Solution - see comments
Note: Only the first occurrence of the pattern in a line is found this way.
Second problem:
Furthermore, I would like to make the motif (the string) I search for variable. I tried this:
Script:
#!/bin/bash
FILE=$1
MOTIF=$2
awk 's=index($0, "$MOTIF"){print "line=" NR, "start position=" s}' "$FILE"
Command line call:
$ ./search.sh 100nt_upstream_of_mTSS.fas CAATCTCC
Idea: The first command-line argument worked and was substituted correctly. Why is the second one not substituted correctly?
Solution so far:
Script:
#!/bin/bash
file=$1
awk -v s="$2" 'i=index($0, s){print "line: " NR, "pos: " i}' "$file"
Testing:
Testfile (test.txt):
1 GAGAGAGAGA
2 CTCTCTCTCT
3 TATATATATA
4 CGCGCGCGCG
5 CCCCCCCCCC
6 GGGGGGGGGG
7 AAAAAAAAAA
8 TTTTTTTTTT
9 TGATTTTTTT
10 CCCCCCCCGA
$ ./search.sh test.txt GA
will print:
line: 1 pos: 1
line: 4 pos: 2
line: 6 pos: 1
line: 9 pos: 2
line: 10 pos: 9
This script will print line and first match position in the line of only the first character of my pattern. How do I manage to have all results printed and the full pattern being used?
As far as I understand, you want to pass the input file (the file the script should process) as an argument. If so, the following may help:
cat search.sh
#!/bin/bash
variable=$1
awk 's=index($0, "CAATCTCC"){print "line=" NR, "start position=" s}' "$variable"
./search.sh 100nt_upstream_of_mTSS.fas
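For the remaining question in this thread (reporting every occurrence of the full motif, not only the first one per line), one possible approach is to call index() repeatedly on the not-yet-searched rest of each line. This is a minimal sketch, assuming the motif is passed in with -v as in the "solution so far". The function name find_motif and the output format are made up for illustration. Overlapping occurrences are reported too.

```shell
#!/bin/bash
# find_motif FILE MOTIF
# Hypothetical helper: prints the line number and start position of every
# occurrence of MOTIF (a literal string, not a regex) in FILE.
find_motif() {
    awk -v motif="$2" '
    BEGIN { if (motif == "") exit 1 }    # an empty motif makes no sense
    {
        offset = 0                       # characters of $0 already skipped
        rest = $0
        while ((i = index(rest, motif)) > 0) {
            print "line: " NR, "pos: " (offset + i)
            offset += i                  # resume one character after the match start
            rest = substr(rest, i + 1)
        }
    }' "$1"
}

# Small demo on a throwaway file.
demo=$(mktemp)
printf '%s\n' GAGA CCCC TGAT > "$demo"
find_motif "$demo" GA
rm -f "$demo"
```

On the three demo lines this prints positions 1 and 3 for line 1 and position 2 for line 3. Because the shell variable reaches awk through -v, it is never expanded inside the single-quoted awk program, which is what went wrong with "$MOTIF".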
I have a file named test like the one below (the X lines stand for arbitrary different words):
XXXXXXXXXXXXXXXXXXX
always-a-constant:::pete
XXXXXXXXXXXXXXXXX
I need to add steve next to pete and generate a new file. The output should look like this:
XXXXXXXXXXXXXXXXXXX
always-a-constant:::pete,steve
XXXXXXXXXXXXXXXXX
I can use awk
cat test | grep -i alw| awk '{print $0",steve"}'
always-a-constant:::pete,steve
but I still need the other lines (the XXX ones).
Method 1: sed
With this as our test file:
$ cat test
XXXXXXXXXXXXXXXXXXX
always-a-constant:::pete
XXXXXXXXXXXXXXXXX
We can add ,steve after any line that starts with alw with:
$ sed '/^alw/ s/$/,steve/' test
XXXXXXXXXXXXXXXXXXX
always-a-constant:::pete,steve
XXXXXXXXXXXXXXXXX
In sed's regular expressions, $ matches at the end of a line. Thus, s/$/,steve/ adds ,steve on at the end of the line.
Method 2: awk
$ awk '/^alw/ {$0=$0",steve"} 1' test
XXXXXXXXXXXXXXXXXXX
always-a-constant:::pete,steve
XXXXXXXXXXXXXXXXX
In awk, $0 represents the current line. $0",steve" represents the current line followed by the string ,steve. Thus, $0=$0",steve" replaces the current line with the current line followed by ,steve.
The final 1 in the awk command is important: it is awk's shorthand for print-the-line. Any non-zero number would work. In more detail, awk treats the 1 as a condition (a boolean). Because it evaluates to true (nonzero), the action is performed. Since we didn't actually specify an action, awk performs the default action which is to print the line. Hence, 1 is shorthand for print-the-line.
Method 3: sed
Alternatively, we can add ,steve after any line that ends with :::pete (the $ anchors the match to the end of the line, and & stands for the matched text) with:
$ sed 's/:::pete$/&,steve/' test
XXXXXXXXXXXXXXXXXXX
always-a-constant:::pete,steve
XXXXXXXXXXXXXXXXX
Hi, I'm trying to add text to the 1st line of a file using sed.
So far I've tried
#!/bin/bash
touch test
sed -i -e '1i/etc/example/live/example.com/fullchain.pem;\' test
And this doesn't work.
I also tried
#!/bin/bash
touch test
sed -i "1i ssl_certificate /etc/example/live/example.com/fullchain.pem;" test
This doesn't seem to work either.
Oddly, when I try
#!/bin/bash
touch test
echo "ssl_certificate /etc/example/live/example.com/fullchain.pem;" > test
I get the 1st line of text to appear when I use cat test,
but as soon as I type sed -i "2i ssl_certificate_key /etc/example/live/example.com/privkey.pem;"
I can't see the text that should be on line 2, namely ssl_certificate_key /etc/example/live/example.com/privkey.pem;
So, to summarise my question:
Can text be inserted into the 1st line of a newly created file using sed?
If yes, what's the best way of inserting text after the 1st line of text?
Suppose you have a file like this:
one
two
Then to append to the first line:
$ sed '1 s_$_/etc/example/live/example.com/fullchain.pem;_' file
one/etc/example/live/example.com/fullchain.pem;
two
To insert before the first line:
$ sed '1 i /etc/example/live/example.com/fullchain.pem;' file
/etc/example/live/example.com/fullchain.pem;
one
two
Or, to append after the first line:
$ sed '1 a /etc/example/live/example.com/fullchain.pem;' file
one
/etc/example/live/example.com/fullchain.pem;
two
Note the number 1 in those sed expressions: it is called the address in sed terminology. It tells sed on which line the command that follows is to operate.
If your file doesn't contain the line you're addressing, the sed command won't get executed. That's why you can't insert/append on line 1, if your file is empty.
Instead of using the stream editor, to append (to empty files), just use a shell redirection >>:
echo "content" >> file
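The two cases above (an empty file, where the address 1 never matches, and a non-empty file, where it does) can be combined into one small helper. This is only a sketch under stated assumptions: the name prepend_line is hypothetical, and it relies on GNU sed for -i without a suffix and for the one-line "1i text" form. The text must not contain backslashes, since sed would interpret them.

```shell
#!/bin/bash
# prepend_line FILE TEXT
# Hypothetical helper: make TEXT the first line of FILE.
# Assumes GNU sed (-i without a suffix, one-line "1i text").
prepend_line() {
    if [ -s "$1" ]; then
        # File has content, so line 1 exists and the address matches.
        sed -i "1i $2" "$1"
    else
        # Empty (or missing) file: sed has no line to address, so redirect.
        printf '%s\n' "$2" > "$1"
    fi
}

# Small demo on a throwaway file: start empty, add two lines "on top".
demo=$(mktemp)
prepend_line "$demo" "ssl_certificate_key /etc/example/live/example.com/privkey.pem;"
prepend_line "$demo" "ssl_certificate /etc/example/live/example.com/fullchain.pem;"
cat "$demo"
rm -f "$demo"
```

The demo leaves ssl_certificate on line 1 and ssl_certificate_key on line 2. The -s test is what decides between the sed path and the plain redirection.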
Your problem stems from the fact that sed cannot locate the line you're telling it to write at, for example:
touch test
sed -i -e '1i/etc/example/live/example.com/fullchain.pem;\' test
attempts to insert at line 1 of test, but that line doesn't exist at this point. If you had created your file as:
echo -en "\n" > test
sed -i '1i/etc/example/live/example.com/fullchain.pem;\' test
it would not complain, but you'd end up with an extra (empty) line. Similarly, when you call:
sed -i "2i ssl_certificate_key /etc/example/live/example.com/privkey.pem;"
you're telling sed to insert the following data at the line 2 which doesn't exist at that point so sed doesn't get to edit the file.
So, to write the first line of an empty file or to add a new last line, you don't need sed: the simple > and >> redirections are more than enough.
Your command will work if you make sure the input file has at least one line:
[ "$(wc -l < test)" -gt 0 ] || printf '\n' >> test
sed -i -e '1 i/etc/example/live/example.com/fullchain.pem;\' test
To insert text on the first line and put the rest on a new line using sed on macOS, this worked for me:
sed -i '' '1 i \
Insert
' ~/Downloads/File-path.txt
First and Last
I would assume that anyone who searched for how to insert/append text to the beginning/end of a file probably also needs to know how to do the other also.
cal | \
gsed -E \
-e '1i\{' \
-e '1i\ "lines": [' \
-e 's/(.*)/ "\1",/' \
-e '$s/,$//' \
-e '$a\ ]' \
-e '$a\}'
Explanation
This is cal output piped to gnu-sed (called gsed on macOS installed via brew.sh) with extended RegEx (-E) and 6 "scripts" applied (-e) and line breaks escaped with \ for readability. Scripts 1 & 2 use 1i\ to "at line 1, insert". Scripts 5 & 6 use $a\ to "at line <last>, append". I vertically aligned the text outputs to make the code represent what is expected in the result. Scripts 3 & 4 do substitutions (the latter applying only to "line <last>"). The result is converting command output to valid JSON.
output
{
"lines": [
" October 2019 ",
"Su Mo Tu We Th Fr Sa ",
" 1 2 3 4 5 ",
" 6 7 8 9 10 11 12 ",
"13 14 15 16 17 18 19 ",
"20 21 22 23 24 25 26 ",
"27 28 29 30 31 ",
" "
]
}
For help getting this to work with the macos/BSD version of sed, see my answer here.
I'm trying to create a little script that basically uses dig +short to find the IP of a website, and then pipe that to sed/awk/grep to replace a line. This is what the current file looks like:
#Server
123.455.1.456
246.523.56.235
So, basically, I want to search for the '#Server' line in a text file, and then replace the two lines underneath it with an IP address acquired from dig.
I understand some of the syntax of sed, but I'm really having trouble figuring out how to replace two lines underneath a match. Any help is much appreciated.
Based on the OP, it's not 100% clear exactly what needs to be replaced where, but here's a one-liner for the general case, using GNU sed and bash. Replace the two lines after "3" with standard input:
echo Hoot Gibson | sed -e '/3/{r /dev/stdin' -e ';p;N;N;d;}' <(seq 7)
Outputs:
1
2
3
Hoot Gibson
6
7
Note: sed's r command is opaquely documented (in Linux anyway). For more about r, see:
"5.9. The 'r' command isn't inserting the file into the text" in this sed FAQ.
here's how in awk:
newip=12.34.56.78
awk -v newip="$newip" '{
if($1 == "#Server"){
l = NR;
print $0
}
else if(l>0 && NR == l+1){
print newip
}
else if(l==0 || NR != l+2){
print $0
}
}' file > file.tmp
mv -f file.tmp file
explanation:
pass $newip to awk
if the first field of the current line is #Server, let l = current line number.
else if the current line is one past #Server, print the new ip.
else if the current row is not two past #Server, print the line.
overwrite original file with modified version.
What is the best way to remove all lines from a text file starting at first empty line in Bash? External tools (awk, sed...) can be used!
Example
1: ABC
2: DEF
3:
4: GHI
Line 3 and 4 should be removed and the remaining content should be saved in a new file.
With GNU sed:
sed '/^$/Q' "input_file.txt" > "output_file.txt"
With AWK:
$ awk '/^$/{exit} 1' test.txt > output.txt
Contents of output.txt
$ cat output.txt
ABC
DEF
Walkthrough: For lines that match ^$ (start-of-line immediately followed by end-of-line, i.e. an empty line), exit (the whole script). For all lines, print the whole line. Of course, we won't get to this part after a line has made us exit.
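Since the Q command is a GNU extension, the same idea can also be sketched with only POSIX sed features: suppress automatic printing with -n, quit on the first empty line, and print every line before it. The function name until_blank is made up for illustration.

```shell
#!/bin/sh
# until_blank
# Hypothetical helper: copy stdin to stdout, stopping before the first
# empty line. Uses only POSIX sed (-n, q, p).
until_blank() {
    sed -n -e '/^$/q' -e 'p'
}

# Small demo with the example data from the question.
printf 'ABC\nDEF\n\nGHI\n' | until_blank
```

The demo prints ABC and DEF. With -n in effect, q stops without printing the empty line that triggered it.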
Bet there are some more clever ways to do this, but here's one using bash's 'read' builtin. The question asks us to keep lines before the blank in one file and send lines after the blank to another file. You could send some of standard out one place and some another if you are willing to use 'exec' and reroute stdout mid-script, but I'm going to take a simpler approach and use a command line argument to let me know where the post-blank data should go:
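#!/bin/bash
# Takes as its argument the name of the file that receives the lines found
# after the first blank line; lines before it go to stdout.
found_blank=0
: > "$1"                          # start with an empty output file
while IFS= read -r stuff; do
    if [ "$found_blank" -eq 0 ] && [ -z "$stuff" ]; then
        found_blank=1
        continue                  # drop the first blank line itself
    fi
    if [ "$found_blank" -eq 1 ]; then
        printf '%s\n' "$stuff" >> "$1"    # append, don't overwrite
    else
        printf '%s\n' "$stuff"
    fi
done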
run it like this:
$ ./delete_from_empty.sh rest_of_stuff < demo
output is:
ABC
DEF
and 'rest_of_stuff' has
GHI
if you want the before-blank lines to go somewhere else besides stdout, simply redirect:
$ ./delete_from_empty.sh after_blank < input_file > before_blank
and you'll end up with two new files: after_blank and before_blank.
Perl version
perl -e '
open $fh, ">","stuff";
open $efh, ">", "rest_of_stuff";
while(<>){
if ($_ !~ /\w+/){
$fh=$efh;
}
print $fh $_;
}
' demo
This creates two output files and iterates over the demo data. When it hits a blank line, it flips the output from one file to the other.
Creates
stuff:
ABC
DEF
rest_of_stuff:
<blank line>
GHI
Another awk would be:
awk -vRS= '1;{exit}' file
By setting the record separator RS to the empty string, we define the records as paragraphs separated by one or more empty lines. It is now easy to adapt this to select the nth block by passing n with -v, for example the second block:
awk -v RS= -v n=2 '(FNR==n){print;exit}' file
There is a problem with this method when processing files with a DOS line-ending (CRLF). There will be no empty lines as there will always be a CR in the line. But this problem applies to all presented methods.
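One way to cope with CRLF input is to strip a trailing CR from each line before testing for emptiness. This is a sketch, not a drop-in replacement: the function name until_blank_crlf is hypothetical, and as a side effect the lines that are printed come out with LF-only endings.

```shell
#!/bin/sh
# until_blank_crlf
# Hypothetical helper: like the awk one-liner above, but tolerant of DOS
# line endings. A trailing carriage return is removed from every line
# first, so a line containing only CR counts as empty.
until_blank_crlf() {
    awk '{ sub(/\r$/, "") } /^$/ { exit } 1'
}

# Small demo with CRLF-terminated input.
printf 'ABC\r\nDEF\r\n\r\nGHI\r\n' | until_blank_crlf
```

The demo prints ABC and DEF, each terminated by a plain LF.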
UPDATED:
Using sed, how can I insert (NOT SUBSTITUTE) a new line on only the first match of keyword for each file.
Currently I have the following but this inserts for every line containing Matched Keyword and I want it to only insert the New Inserted Line for only the first match found in the file:
sed -ie '/Matched Keyword/ i\New Inserted Line' *.*
For example:
Myfile.txt:
Line 1
Line 2
Line 3
This line contains the Matched Keyword and other stuff
Line 4
This line contains the Matched Keyword and other stuff
Line 6
changed to:
Line 1
Line 2
Line 3
New Inserted Line
This line contains the Matched Keyword and other stuff
Line 4
This line contains the Matched Keyword and other stuff
Line 6
You can sort of do this in GNU sed:
sed '0,/Matched Keyword/s//New Inserted Line\n&/'
But it's not portable. Since portability is good, here it is in awk:
awk '/Matched Keyword/ && !x {print "Text line to insert"; x=1} 1' inputFile
Or, if you want to pass a variable to print:
awk -v "var=$var" '/Matched Keyword/ && !x {print var; x=1} 1' inputFile
These both insert the text line before the first occurrence of the keyword, on a line by itself, per your example.
Remember that with both sed and awk, the matched keyword is a regular expression, not just a keyword.
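If the keyword should be taken literally rather than as a regular expression, awk's index() function can replace the /.../ match. This is a minimal sketch: the function name insert_before_first is made up, and because values passed with -v go through awk's escape processing, backslashes in the arguments would need doubling.

```shell
#!/bin/sh
# insert_before_first KEYWORD TEXT
# Hypothetical helper: reads stdin and prints TEXT on its own line before
# the first line that contains KEYWORD as a literal string.
insert_before_first() {
    awk -v kw="$1" -v text="$2" '
        index($0, kw) && !seen { print text; seen = 1 }
        1    # print every input line
    '
}

# Small demo: as a regex, a.c would already match "abc" on the first line;
# as a literal string it first matches on the second line.
printf 'abc\na.c\na.c\n' | insert_before_first 'a.c' 'New Inserted Line'
```

The demo prints abc, then New Inserted Line, then the two a.c lines.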
UPDATE:
Since this question is also tagged bash, here's a simple solution that is pure bash and doesn't require sed:
#!/bin/bash
n=0
while IFS= read -r line; do
if [[ "$line" =~ 'Matched Keyword' && $n = 0 ]]; then
echo "New Inserted Line"
n=1
fi
echo "$line"
done
As it stands, this works as a filter in a pipe (it reads stdin and writes stdout). You can easily wrap it in something that acts on files instead.
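One possible wrapper, sketched under the assumption that rewriting each file through a temporary file is acceptable. The function names insert_once and insert_once_in_files are hypothetical. The result is copied back with cat rather than mv so the original file keeps its permissions.

```shell
#!/bin/bash
# insert_once
# The loop from above as a filter function (stdin to stdout).
insert_once() {
    local n=0 line
    while IFS= read -r line || [ -n "$line" ]; do
        if [[ $line == *"Matched Keyword"* && $n -eq 0 ]]; then
            echo "New Inserted Line"
            n=1
        fi
        printf '%s\n' "$line"
    done
}

# insert_once_in_files FILE...
# Hypothetical wrapper: run the filter over each file and write the
# result back into the same file.
insert_once_in_files() {
    local f tmp
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        insert_once < "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}
```

It could then be called as, for example, insert_once_in_files *.txt, which inserts the new line before the first match in each file separately.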
If you want one with sed*:
sed '0,/Matched Keyword/s//Matched Keyword\nNew Inserted Line/' myfile.txt
*only works with GNU sed
This might work for you:
sed -i -e '/Matched Keyword/{i\New Inserted Line' -e ':a;n;ba}' file
You're nearly there! Just create a loop to read from the Matched Keyword to the end of the file.
After inserting a line, the remainder of the file can be printed out by:
Introducing a loop place holder :a (here a is an arbitrary name).
Print the current line and fetch the next into the pattern space with the n command.
Redirect control back using the ba command, which is essentially a goto to the a place holder. The end-of-file condition is naturally taken care of by the n command, which terminates any further sed commands if it tries to read past the end-of-file.
With a little help from bash, a true one liner can be achieved:
sed $'/Matched Keyword/{iNew Inserted Line\n:a;n;ba}' file
Alternative:
sed 'x;/./{x;b};x;/Matched Keyword/h;//iNew Inserted Line' file
This uses the Matched Keyword as a flag in the hold space and once it has been set any processing is curtailed by bailing out immediately.
If you want to append a line after the first match only, use awk instead of sed, as below:
awk '{print} /Matched Keyword/ && !n {print "New Inserted Line"; n++}' myfile.txt
Output:
Line 1
Line 2
Line 3
This line contains the Matched Keyword and other stuff
New Inserted Line
Line 4
This line contains the Matched Keyword and other stuff
Line 6