Command for printing part of a String? [closed] - linux

I have a file named test,
it contains the string James Bond 007,
and I want to print only James Bond.
I tried the following commands:
$ strings -n 2 test
$ sed -n '/James/,/Bond/p' test
$ awk '{print substr($1,10)}' test

To print the first two words, you can use awk:
awk '{print $1, $2}' test
To print the first ten characters, you can put the file contents in a variable, then use the bash substring operation:
contents=$(cat test)
echo "${contents:0:10}"
Or in awk:
awk '{print substr($0, 1, 10)}' test
Notice that $0 means the whole line, and here substr() is given both a starting index and a length (the length is optional; without it, substr() returns everything to the end of the string). Indexes in awk start at 1 rather than 0.
In sed, /James/,/Bond/ is a line range expression; it processes all the lines starting from a line containing James until a line containing Bond. It doesn't process just part of the lines.
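A sed substitution, by contrast, does operate on part of a line. A minimal sketch for this particular file, assuming the line always ends in a trailing number such as 007:
sed 's/ [0-9]*$//' test
This deletes the final space and digits, leaving James Bond.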

Related

how can I remove some numbers at the end of line in a text file [closed]

I have a text file which contains a series of lines that are the same except at the end.
eg
lesi-1-1500-1
lesi-1-1500-2
lesi-1-1500-3
How can I remove the last number? It goes up to 250.
To change the file in place:
sed -i 's/[0-9]\+$//' /path/to/file
or
sed 's/[0-9]\+$//' /path/to/file > /path/to/output
The pattern [0-9]\+$ matches one or more digits anchored to the end of the line, and replacing them with nothing removes the trailing number (the hyphen before it is left in place).
You can do it with Awk by breaking it into fields.
echo "lesi-1-1500-2" > foo.txt
echo "lesi-1-1500-3" >> foo.txt
awk -F '-' '{print $1 "-" $2 "-" $3}' foo.txt
The -F switch sets the field delimiter, which here is -. Then we just print the first three fields, joined with - for formatting.
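If you prefer to stay in the shell, a sketch using parameter expansion, assuming every line ends in a hyphen followed by the number to drop:
while IFS= read -r line; do
    printf '%s\n' "${line%-*}"    # strip the last "-" and everything after it
done < foo.txt
${line%-*} removes the shortest suffix matching -*, so lesi-1-1500-2 becomes lesi-1-1500.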

How to remove a word and following characters to the next coma [closed]

I have a string. Part of it contains "Log":true, which I would like to remove using bash and sed.
Original line
[...]\"Date\":\"1661731200000\",\"Log\":true,\"$$type\":\"system\",\"created\":\"2022-08-01T13:37:43+0[...]
Modified line
[...]\"Date\":\"1661731200000\",\"$$type\":\"system\",\"created\":\"2022-08-01T13:37:43+0[...]
I'm struggling to find the right expression. Is it possible to achieve this with sed?
Match ,\"Log\": followed by any sequence of alphabetic characters.
sed 's/,\"Log\":[a-z]*//' filename
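If the value after \"Log\": is not purely alphabetic (a number or a quoted string, for instance), a sketch of the same substitution that removes everything up to the next comma:
sed 's/,\"Log\":[^,]*//' filename
It keeps the pattern above and only widens the value match from [a-z]* to [^,]*.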
#!/bin/sh
cat << EOF > edpop
2d
wq
EOF
tr ',' '\n' < file > file2
ed -s file2 < edpop
tr '\n' ',' < file2 > file
rm -v ./file2
rm -v ./edpop
This replaces the commas with newlines, deletes the second line with ed (which corresponds to the second comma-separated field, \"Log\":true in the sample), and then replaces the newlines with commas again.

print multiple words in specific pattern [closed]

I have a file word.txt
$ cat word.txt
cat
dog
rat
bird
I have a URL like this
https://example.com/?word=
I want to generate a URL list like this
https://example.com/?word=cat
https://example.com/?word=cat,dog
https://example.com/?word=cat,dog,rat
https://example.com/?word=cat,dog,rat,bird
I have 177 words, so how can I automate this process with Bash or any other easy programming language?
Read the input line by line, add the line to the URL. Don't include the comma for the first line.
#! /bin/bash
url='https://example.com/?word='
while read -r line ; do
    url+=$comma$line
    comma=,
    echo "$url"
done < word.txt
This task can be accomplished with a single GNU sed command:
sed -n 's|^|https://example.com/?word=|; :a; p; N; s/\n/,/; ba' word.txt
That should be more efficient than the plain Bash loop.
Explanation:
-n   With this option, sed only produces output when explicitly told to via the p command.
s|^|https://example.com/?word=|   Replaces the (empty) beginning of the line with https://example.com/?word=, which effectively prepends that string to the pattern space.
:a   Label a for branch command (b). Used when looping through the lines.
p   Prints the pattern space.
N   Adds a newline to the pattern space, then appends the next line of input to the pattern space.
s/\n/,/   Replaces the newline with the comma (,).
ba   Jumps to the label a. This effectively creates a loop for all input lines except the first line.
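For comparison, an awk sketch of the same idea; it accumulates the comma-separated list and prints the URL after reading each line:
awk -v url='https://example.com/?word=' '{ list = (NR == 1 ? $0 : list "," $0); print url list }' word.txt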
Another variant:
url='https://example.com/?word='
while read -r word; do
    words+=("$word")
    list="${words[*]}"
    printf '%s%s\n' "$url" "${list// /,}"
done < word.txt
You can use something like this if you want to do it in Python:
words = ["a", "b", "c"]
joinedWords = ",".join(words)
url = "https://example.com/?word="
reqUrl = url + joinedWords
print(reqUrl)

How can I shorten header in a fasta file? [closed]

I have a file that looks like this:
>Gene.10::S0008.1::g.10::m.10 Gene.10::S0008.1::g.10 ORF type:complete len:250 (-),score=22.42 S_0008.1:286-1035(-)
MKGDDFNIITAPVPINRIWWYSLTNRQRIALVFYMSFYVAGTLTNTASMFIDKFYIYIMR
LESLQMGSADPIDYKYLLEVQIVRGFWREDVHEVVDKVFRGKSIGYIKTNLMIPVEIWNN
CQVRSFRGIPCHSVAIICLIFGMLILYYHCTTVALFRTFMILNANLAAILLFIAMSMEYS
AAVEYDYCVNSVFMNRKTGGKAFVRGRYYNRTLEASGSTFKLMMVGDILFFCPMIGLGCY
LLFCNRENL*
>Gene.11::S0009.1::g.10::m.11 Gene.11::S0009.1::g.10 ORF type:complete len:250 (-),score=22.42 S_0008.1:286-1035(-)
QSAISNDEELNKIMDA
....
I want to delete everything in the header after the first space. How can I do this easily in Linux?
Resultant file:
>Gene.10::S0008.1::g.10::m.10
MKGDDFNIITAPVPINRIWWYSLTNRQRIALVFYMSFYVAGTLTNTASMFIDKFYIYIMR
LESLQMGSADPIDYKYLLEVQIVRGFWREDVHEVVDKVFRGKSIGYIKTNLMIPVEIWNN
CQVRSFRGIPCHSVAIICLIFGMLILYYHCTTVALFRTFMILNANLAAILLFIAMSMEYS
AAVEYDYCVNSVFMNRKTGGKAFVRGRYYNRTLEASGSTFKLMMVGDILFFCPMIGLGCY
LLFCNRENL*
>Gene.11::S0009.1::g.10::m.11
QSAISNDEELNKIMDA
I would use sed:
sed '/^>/s/^>\([^ ]*\) .*/>\1 /'
If a line starts with > then remove everything after the first space. The following:
echo '>Gene.10::S0008.1::g.10::m.10 Gene.10::S0008.1::g.10 ORF type:complete len:250 (-),score=22.42 Sxl_rink_0008.1:286-1035(-)
MKGDDFNIITAPVPINRIWWYSLTNRQRIALVFYMSFYVAGTLTNTASMFIDKFYIYIMR
LESLQMGSADPIDYKYLLEVQIVRGFWREDVHEVVDKVFRGKSIGYIKTNLMIPVEIWNN
CQVRSFRGIPCHSVAIICLIFGMLILYYHCTTVALFRTFMILNANLAAILLFIAMSMEYS
AAVEYDYCVNSVFMNRKTGGKAFVRGRYYNRTLEASGSTFKLMMVGDILFFCPMIGLGCY
LLFCNRENL*
>Gene.11::S0009.1::g.10::m.11 Gene.11::S0009.1::g.10 ORF type:complete len:250 (-),score=22.42 Sxl_rink_0008.1:286-1035(-)
QSAISNDEELNKIMDA' | sed '/^>/s/^>\([^ ]*\) .*/>\1 /'
outputs:
>Gene.10::S0008.1::g.10::m.10
MKGDDFNIITAPVPINRIWWYSLTNRQRIALVFYMSFYVAGTLTNTASMFIDKFYIYIMR
LESLQMGSADPIDYKYLLEVQIVRGFWREDVHEVVDKVFRGKSIGYIKTNLMIPVEIWNN
CQVRSFRGIPCHSVAIICLIFGMLILYYHCTTVALFRTFMILNANLAAILLFIAMSMEYS
AAVEYDYCVNSVFMNRKTGGKAFVRGRYYNRTLEASGSTFKLMMVGDILFFCPMIGLGCY
LLFCNRENL*
>Gene.11::S0009.1::g.10::m.11
QSAISNDEELNKIMDA
I don't know if the one space left after the header is relevant or not, but I left it.
If those long sequence lines contain no spaces anywhere, you can simply keep everything up to the first space with cut:
cut -d' ' -f1
which removes all characters after the first space (including the space itself, so no trailing space is left; lines without a space pass through unchanged).
#edit: As the OP edited both the input and the output, the answer now removes everything after the first space, rather than everything after the second space.
Using awk you get a more readable solution:
awk 'NR==1{print $1}NR!=1{print}' test.txt
Then you can redirect the output to a new file to store the result:
awk 'NR==1{print $1}NR!=1{print}' test.txt > new_test.txt
EDIT
I thought there were multiple files, with just one header per file.
awk '{print $1}' test.txt
would work on your example since the other lines do not contain spaces.
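If the file contains many headers (the usual case for a multi-record FASTA file), a sketch that trims only the lines starting with > and passes the sequence lines through untouched:
awk '/^>/ {print $1; next} {print}' test.txt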
Perl to the rescue!
perl -pe 's/ .*// if /^>/' -- file.fasta

Grep the most recent value of a particular column from a CSV file [closed]

"cola","colb","colc","cold","cole","colf"
"a","b","c","d","e","f"
"a1","b1","c1","d1","e1","f1"
"a2","b2","c2","d2","e2","f2"
Assuming this is the CSV file, I want to grep the value "e" from the column "cole" and store it into a shell variable. And then use the shell variable as a part of a wget command.
How would I do this?
set -f # disable globbing
variable="$(awk -F, 'NR==2 {gsub(/"/, "", $5); print $5}' file)"
set +f
Awk is well suited to this. Since the fields are comma-separated and quoted, tell awk to split on commas and strip the quotes. If you know the column number you can simply do:
$ awk -F, 'NR==2 {gsub(/"/, "", $5); print $5}' file.csv
e
This prints the fifth field on the second line. If you want to use the column name instead:
$ awk -F, 'NR==1 {for (i=1; i<=NF; i++) {gsub(/"/, "", $i); c[$i]=i}} NR==2 {v=$(c[col]); gsub(/"/, "", v); print v}' col="cole" file.csv
e
Just set col="<name of column to use>".
You can use command substitution to store the value in a variable:
$ val="$(awk -F, 'NR==2 {gsub(/"/, "", $5); print $5}' file.csv)"
$ wget --what-ever-option "$val"
Or just use it in place:
$ wget --what-ever-option "$(awk -F, 'NR==2 {gsub(/"/, "", $5); print $5}' file.csv)"
