I am trying to use a literal $ as an end anchor for a search using grep. The entire problem is to search for line(s) in a file that start with At and end with a literal $. I have tried several variations of the code I think will work and get no results, even though there should be some.
grep '\<At[a-zA-z]\{1,\}\$\>' test.txt
Any suggestions would be appreciated. I am a first-year student of Linux, so forgive me if I am missing something simple. Thank you.
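grep '^At.*[$]$' test.txt
To avoid playing shell escaping games, put the $ inside a character class, where it is always literal. The word anchors from the question are replaced here with line anchors: \> cannot match directly after $, because $ is not a word character, and the range A-z should have been A-Z.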
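As a hedged sketch of the character-class idea: the sample lines and the temporary file below are made up for illustration, and the pattern uses line anchors (^ and a final $) on the assumption that the goal is whole lines that start with At and end in a dollar sign.

```shell
# Hypothetical sample data; only the first line starts with "At" and ends with "$".
sample=$(mktemp)
printf '%s\n' 'Atlas costs 5$' 'Atlas costs 5' 'Not At the start$' > "$sample"

# [$] is a bracket expression, so the dollar sign inside it is literal.
# The final, unbracketed $ is the end-of-line anchor.
matches=$(grep '^At.*[$]$' "$sample")

# Equivalent form with a backslash; the single quotes keep the shell out of it.
escaped=$(grep '^At.*\$$' "$sample")

printf '%s\n' "$matches"
rm -f "$sample"
```

Both patterns select the same line; the bracket form is simply easier to read when several layers of quoting are involved.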
Related
I am trying to find a way to escape the dollar sign within the sed command in a bash script. I have found here tons of answers that say that you need to put four backslashes in a row in order to escape the sign. I have also tried the version with two backslashes, but for some reason, I can't get it to work. Can someone please tell me what I'm doing wrong?
Let's say that you have a file called aaa.txt located in /home/Documents. The file has only one line of text, which says aaa. I am trying to replace it using this command (please don't tell me that I can reference it in a different way than with line numbers, because this is just a reduced example of something else I am doing):
sed -i "1ccd /home/userr/$PARAMETER/$METHOD" "/home/userr/Documents/aaa.txt"
The output that I get is this:
cd /home/userr/\/\
Which is not what I want. I want to have exactly this output, with dollar signs in the string:
cd /home/userr/$PARAMETER/$METHOD
What is the proper form of the string passed to the sed command to achieve this?
You can escape the dollar sign with a backslash:
sed -i "1ccd /home/userr/\$PARAMETER/\$METHOD" "/home/userr/Documents/aaa.txt"
Documentation here: https://www.gnu.org/software/bash/manual/html_node/Escape-Character.html
You need to replace the double quotes with single quotes so that the shell will not try to expand the variables before sed sees them. This sed command should work with no escaping needed.
$ sed -i '1ccd /home/userr/$PARAMETER/$METHOD' /home/userr/Documents/aaa.txt
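The two answers can be compared side by side. This is a minimal sketch that works on a temporary file (a stand-in for aaa.txt) and prints to standard output instead of using -i; the one-line form of the c command is a GNU sed extension, as in the original commands.

```shell
# Hypothetical demo file standing in for aaa.txt.
demo_file=$(mktemp)
printf 'aaa\n' > "$demo_file"

# Single quotes: the shell passes $PARAMETER and $METHOD to sed untouched.
single=$(sed '1ccd /home/userr/$PARAMETER/$METHOD' "$demo_file")

# Double quotes: one backslash before each $ stops the shell from expanding it.
escaped=$(sed "1ccd /home/userr/\$PARAMETER/\$METHOD" "$demo_file")

printf '%s\n' "$single" "$escaped"
rm -f "$demo_file"
```

Both variants produce the same replacement line, so the choice mostly depends on whether other parts of the sed script need shell variables expanded.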
I have to search for a string in a file using grep, as shown below, but it is not working as expected.
It's just a simple string search, and I am not sure why it is not working.
echo "Naizhu NZ1020 Lady Necklace Sexy Tcollarbone Chain Alloy PlatingSilver" | grep "Lady Necklace"
Can somebody help me understand why it's not working? I would like to know the reason.
The command grep will print the whole line matching the pattern.
So
echo "Naizhu NZ1020 Lady Necklace Sexy Tcollarbone Chain Alloy PlatingSilver" | grep "Lady Necklace"
will give you
Naizhu NZ1020 Lady Necklace Sexy Tcollarbone Chain Alloy PlatingSilver
You might use grep -o (or --only-matching), which the manual describes as:
Print only the matched (non-empty) parts of a matching line, with each
such part on a separate output line.
With -o, the output is only
Lady Necklace
In the comments on the question it was mentioned that a file is used for input. Since its encoding is currently unknown, you may also try using character classes:
grep -o "Lady[[:space:]]Necklace"
Please see man grep for more options.
You should also check in your input file that the words you are looking for are on the same line and are separated by a space rather than by some other, non-printable character.
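The difference between the default output and -o can be sketched as follows. The tab-separated input in the last step is an invented example of a whitespace character other than a plain space.

```shell
line='Naizhu NZ1020 Lady Necklace Sexy Tcollarbone Chain Alloy PlatingSilver'

# Default behaviour: grep prints the whole matching line.
whole=$(printf '%s\n' "$line" | grep 'Lady Necklace')

# -o prints only the part of the line that matched the pattern.
only=$(printf '%s\n' "$line" | grep -o 'Lady[[:space:]]Necklace')

# [[:space:]] also matches a tab, which a literal space in the pattern would miss.
tabbed=$(printf 'Lady\tNecklace\n' | grep -c 'Lady[[:space:]]Necklace')

printf '%s\n' "$whole" "$only" "$tabbed"
```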
I would like to search file.txt with grep to locate a URL ending with ".doc". When it finds .doc, I want grep to go backwards and find "http://" at the beginning of that string.
The output would be http://somesite.com/random-code-that-changes-daily/somefilename.doc
There is only 1 .doc url on this page, so multiple search results should not be an issue.
Please excuse me, I am a novice. I did locate the answer at one time, but I have searched for an hour and can no longer find it. I am willing to read and learn, but I do not think I'm using the correct search terms for what I want to do. Thank you.
You can use regular expressions:
with the anchor ^ you can indicate the start of the line you are looking for;
with the anchor $ you can indicate the end of the line you are looking for.
Then, if the URL occupies the whole line, you can do something like
grep '^http://.*\.doc$' file.txt
or, to match lines that either start with http:// or end with .doc,
grep '^http://\|\.doc$' file.txt
or, without the anchors, use the pattern that @choroba suggested:
grep 'http://.*\.doc' file.txt
You can also search for http:// and print the line if it contains .doc somewhere after it:
grep 'http://.*\.doc' file.txt
If you want to only print the matching part, use the -o option (if your version of grep supports it).
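A hedged sketch of the -o variant: the HTML-like line below is an assumed example of what file.txt might contain, and the bracket expression that stops at whitespace or a double quote is an addition not present in the answers, meant to keep the match from running past the end of the URL.

```shell
# Hypothetical page content; the real file.txt is not shown in the question.
page=$(mktemp)
printf '%s\n' \
    '<p>Intro text with no link</p>' \
    '<a href="http://somesite.com/random-code-that-changes-daily/somefilename.doc">Download</a>' \
    > "$page"

# -o prints only the matched text. [^[:space:]"]* keeps the match inside
# one URL by refusing to cross whitespace or a closing double quote.
url=$(grep -o 'http://[^[:space:]"]*\.doc' "$page")

printf '%s\n' "$url"
rm -f "$page"
```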
I'm in the process of switching from zsh to bash, and I need to produce a bash script that can remove duplicate entries in $PATH without reordering the entries (thus no sort -d magic). zsh has some nice array handling shortcuts that made it easy to do this efficiently, but I'm not aware of such shortcuts in bash. I came across this answer which has gotten me 90% of the way there, but there is a small problem that I would like to understand better. It appears that when I run that awk command, the last record processed incorrectly matches the pattern.
$ awk 'BEGIN{RS=ORS=":"}!a[$0]++' <<<"aa:bb:cc:aa:bb:cc"
aa:bb:cc:cc
$ awk 'BEGIN{RS=ORS=":"}!a[$0]++' <<<"aa:bb:cc:aa:bb"
aa:bb:cc:bb
$ awk 'BEGIN{RS=ORS=":"}!a[$0]++' <<<"aa:bb:cc:aa:bb:cc:" # note trailing colon
aa:bb:cc:
I don't understand awk well enough to know why it behaves this way, but I have managed to work around the issue by using an intermediate array like so.
array=($(awk 'BEGIN{RS=":";ORS=" "}!a[$0]++' <<<"aa:bb:cc:aa:bb:cc:"))
# Use a subshell to avoid modifying $IFS in current context
echo $(export IFS=":"; echo "${array[*]}")
aa:bb:cc
This seems like a sub-optimal solution however, so my question is: did I do something wrong in the awk command that is causing false positive matches on the final record processed?
The last record in your original string is cc\n, which is different from cc, because the here-string (<<<) appends a newline to the input. When you are unsure what's happening in any program in any language, adding some print statements is step 1 of debugging/investigating:
$ awk 'BEGIN{RS=ORS=":"} {print "<"$0">"}' <<<"aa:bb:cc:aa:bb:cc"
<aa>:<bb>:<cc>:<aa>:<bb>:<cc
>:$
If you want the RS to be : or \n then just state that (with GNU awk at least):
$ awk 'BEGIN{RS="[:\n]"; ORS=":"} !a[$0]++' <<<"aa:bb:cc:aa:bb:cc"
aa:bb:cc:$
The $ in all of the above is my prompt.
Another possible workaround instead of your bash array solution
$ echo "aa:bb:cc:aa:bb:cc" | tr ':' '\n' | awk '!a[$0]++' | paste -sd:
aa:bb:cc
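One more possible variant, offered as a sketch rather than as the canonical fix: feed awk the string with printf '%s' so that no trailing newline is added, and print the separator before every entry except the first so that no trailing colon appears. The function name dedupe_path is made up for this example.

```shell
# dedupe_path is a hypothetical helper name, not something from the question.
dedupe_path() {
    # printf '%s' adds no newline, so the last record is "cc" rather than "cc\n".
    # sep is empty for the first printed entry and ":" for every later one.
    printf '%s' "$1" |
        awk -v RS=: '!seen[$0]++ { printf "%s%s", sep, $0; sep = ":" }'
}

result=$(dedupe_path "aa:bb:cc:aa:bb:cc")
shorter=$(dedupe_path "aa:bb:cc:aa:bb")

printf '%s\n' "$result" "$shorter"
```

It could then be applied as PATH=$(dedupe_path "$PATH"), though that is worth trying in a throwaway shell first.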
I am sorry for posting this but this is driving me crazy. I am very new to bash scripting and am really struggling. I have files with the following format 8_S58_L001.sorted.bam and I would like to take the first digit (8 in this case) from many files and generate a csv file. This will give me the order in which samples were processed by a downstream function.
The script is as follows and it works; however, I get an error (-bash: syntax error near unexpected token `done') every time I run it, and I am struggling to understand why. So far I have spent 2 days trying to get to the bottom of it and have searched extensively through various forums.
do
test=$(ls -LR | grep .bam$| sed 's/_.*//'| awk '{print}' ORS=',' | sed 's/*$//')
echo $test>../SampleOrder/fileOrder2.csv
done
If I just run
test=$(ls -LR | grep .bam$| sed 's/_.*//'| awk '{print}' ORS=',' | sed 's/*$//')
echo $test>../SampleOrder/fileOrder2.csv
Then I get the desired output and no errors, but if it is incorporated within a do statement I get the above error. I am hoping to incorporate this into a larger script, so I want to deal with this error first.
I should say that this is being run on a linux based cluster.
Can someone with more experience tell me where I am going wrong.
Thanks in advance
Sam
bash doesn't have a do statement, and done is a reserved word when it is the first word in a command.
So in
do
something
something
done
do is a syntax error. do is only useful in the context of for and while loops, where it serves to separate the condition from the body of the loop.
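For comparison, here is a minimal sketch of the two loop forms in which do and done are valid. The loop contents are arbitrary placeholders.

```shell
# for loop: do separates the word list from the loop body.
count=0
for word in alpha beta gamma; do
    count=$((count + 1))
done

# while loop: do separates the condition from the loop body.
n=3
while [ "$n" -gt 0 ]; do
    n=$((n - 1))
done

printf 'count=%s n=%s\n' "$count" "$n"
```

If the commands only need to run once, as in the question, they need no do/done wrapper at all.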
Since you're reporting a syntax error on the done as opposed to the do, my guess is that you've let Windows line-endings creep into your file. Bash doesn't regard the \r (CR) character as special, so if your file actually contains do\r, then that will be considered to be the name of an external command.
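One way to check for this, sketched on a throwaway file that is deliberately written with Windows line endings (the file contents are invented for the demonstration):

```shell
# Hypothetical script file written with CRLF line endings on purpose.
script=$(mktemp)
printf 'do\r\necho hi\r\ndone\r\n' > "$script"

# Hold a literal carriage return in a variable so grep can search for it.
cr=$(printf '\r')

# Count the lines that contain a carriage return; a clean file reports 0.
crlf_lines=$(grep -c "$cr" "$script")

# tr -d deletes every carriage return, leaving Unix line endings.
clean=$(tr -d '\r' < "$script")

printf 'lines with CR: %s\n' "$crlf_lines"
rm -f "$script"
```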
You should be aware that grep .bam$ doesn't do what you are expecting it to do. The dot is a grep wildcard which matches any single character, so the pattern .bam$ will match any string of 4 or more characters that ends in "bam". If you are trying to match all strings that end in ".bam", you should escape the dot and write grep "\.bam$"
But as a previous commenter correctly noted, you should be using shell wildcards (ls *.bam) instead of grep (ls | grep .bam$)
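A sketch of the wildcard approach, under some assumptions: the second file name is invented, the files sit in a single directory (the glob does not recurse the way ls -LR does), and the result is built with parameter expansion instead of the sed and awk pipeline.

```shell
# Hypothetical working directory with two sample files and one unrelated file.
workdir=$(mktemp -d)
touch "$workdir/8_S58_L001.sorted.bam" "$workdir/12_S59_L001.sorted.bam" "$workdir/notes.txt"

order=""
for path in "$workdir"/*.bam; do
    name=${path##*/}       # strip the directory part
    sample=${name%%_*}     # keep everything before the first underscore
    # Prepend a comma only when order already holds an entry.
    order="${order:+$order,}$sample"
done

printf '%s\n' "$order"
rm -rf "$workdir"
```

Glob results come back in sorted order, so 12 is listed before 8 here; if the processing order matters, it is worth confirming that the downstream function sees the files in the same order.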