Extract multiple substrings from very long line

I have a very long line of text in this format:
http://address1/user1=username1;ip1=ipaddres1;password1=pass1;some text;http://address2/user2=username2;ip2=ipaddress2;password2=pass2;some text;...etc
How can I extract the usernames (the part after user1=, user2=) and the IP addresses (the part after ip1=, ip2=) from this line and put them in two files (user.txt, ip.txt)? The line contains more than 20 usernames and IP addresses.
Thanks

You can use grep to find the matching parts and cut to remove the stuff before =:
grep -o 'user[0-9][0-9]*=[^;]*' input.txt | cut -d= -f2- > user.txt
grep -o 'ip[0-9][0-9]*=[^;]*' input.txt | cut -d= -f2- > ip.txt
-o prints only the matching parts. If there are several matches on the same line, each one is printed on a separate line. [0-9][0-9]* matches one or more digits, so two-digit indexes such as user12= are covered as well.
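To try the grep and cut approach without touching real data, here is a minimal sketch. The sample line, the names alice/bob/carol, and the temporary directory are made up for illustration; only the record format comes from the question.

```shell
#!/bin/sh
# Hypothetical sample data in the format described in the question.
workdir=$(mktemp -d)

# Build one long line holding three records, one with a two-digit index.
printf '%s' \
  'http://a1/user1=alice;ip1=10.0.0.1;password1=p1;some text;' \
  'http://a2/user2=bob;ip2=10.0.0.2;password2=p2;some text;' \
  'http://a12/user12=carol;ip12=10.0.0.12;password12=p12;some text;' \
  > "$workdir/input.txt"
printf '\n' >> "$workdir/input.txt"

# grep -o prints each match on its own line; cut keeps what follows the first "=".
grep -o 'user[0-9][0-9]*=[^;]*' "$workdir/input.txt" | cut -d= -f2- > "$workdir/user.txt"
grep -o 'ip[0-9][0-9]*=[^;]*' "$workdir/input.txt" | cut -d= -f2- > "$workdir/ip.txt"

cat "$workdir/user.txt"
cat "$workdir/ip.txt"
```

Note that grep -o is available in GNU and BSD grep but is not part of POSIX, so this assumes one of those implementations.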

Using awk (GNU awk is needed for the three-argument match), writing the results to separate files:
gawk -v RS='some text' '{$1=$1;match($0,/user[0-9]+=([^;]+).*ip[0-9]+=([^;]+).*/,a);print a[1]>"username";print a[2] > "ipaddress"}' long_file
cat username
username1
username2
cat ipaddress
ipaddres1
ipaddress2
This awk command assumes that the literal string "some text" appears between records.
Or using grep with -P:
grep -oP 'user[0-9]+=\K[^;]+' long_file > username
grep -oP 'ip[0-9]+=\K[^;]+' long_file >ip_address

Related

Bash issue with floating point numbers in specific format

(I need this in bash on Linux.) I have a file with numbers like this:
1.415949602
91.09582241
91.12042924
91.40270349
91.45625033
91.70150341
91.70174342
91.70660043
91.70966213
91.72597066
91.7287678315
91.7398645966
91.7542977976
91.7678146465
91.77196659
91.77299733
abcdefghij
91.7827827
91.78288651
91.7838959
91.7855
91.79080605
91.80103075
91.8050505
sed 's/^91\.//' file (working)
Is there any way I can do these 3 steps together?
First I tried this:
cat input | tr -d 91. > 1.txt (didn't work)
cat input | tr -d "91." > 1.txt (didn't work)
cat input | tr -d '91.' > 1.txt (didn't work)
then
grep -x '.\{10\}' (working)
then
grep "^[6-9]" (working)
Final one-line solution:
cat input.txt | sed 's/\91.//g' | grep -x '.\{10\}' | grep "^[6-9]" > output.txt

Your "final" solution:
cat input.txt |
sed 's/\91.//g' |
grep -x '.\{10\}' |
grep "^[6-9]" > output.txt
should avoid the useless cat, and the backslash in the sed script should move to the correct place, in front of the dot (I also added a ^ anchor and removed the g flag, since you don't expect more than one match on a line anyway):
sed 's/^91\.//' input.txt |
grep -x '.\{10\}' |
grep "^[6-9]" > output.txt
You might also be able to get rid of at least one useless grep but at this point, I would switch to Awk:
awk '{ sub(/^91\./, "") } /^[6-9].{9}$/' input.txt >output.txt
The sub() does what your sed replacement did; the final condition says to print lines which match the regex.
The same can conveniently, but less readably, be written in sed:
sed -n 's/^91\.\([6-9][0-9]\{9\}\)$/\1/p' input.txt >output.txt
assuming your sed dialect supports BRE regex with repetitions like [0-9]\{9\}.
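As a quick check of the approach, the following sketch runs the three-step pipeline and a single-pass sed over a small made-up sample. The sample values and variable names are assumptions for illustration, not the asker's real data.

```shell
#!/bin/sh
# Hypothetical sample in the same shape as the question's file.
input=$(mktemp)
cat > "$input" <<'EOF'
1.415949602
91.09582241
91.7287678315
91.7398645966
abcdefghij
91.7855
91.6123456789
EOF

# Three steps: strip a leading "91.", keep lines of exactly 10 characters,
# then keep only those that start with a digit from 6 to 9.
result=$(sed 's/^91\.//' "$input" | grep -x '.\{10\}' | grep '^[6-9]')

# Single sed pass: print only lines that are "91." followed by 10 digits,
# the first of which is 6-9, with the prefix removed.
single=$(sed -n 's/^91\.\([6-9][0-9]\{9\}\)$/\1/p' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

The two variants agree on this sample, but they are not strictly equivalent: the pipeline would also keep a 10-character line starting with 6-9 that never had the 91. prefix, while the single sed command would not.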

How can I extract all the services from /etc/services?

I want to extract the services from the file /etc/services. The problem is that when extracting them, I get the following output when entering head file.txt:
acr-nema
afbackup
afbackup
afmbackup
afmbackup
afpovertcp
afpovertcp
afs3-bos 7007
But the desired output should be as follows:
acr-nema 104/udp dicom
afbackup 2988/tcp #
afbackup 2988/udp
afmbackup 2989/tcp #
afmbackup 2989/udp
afpovertcp 548/tcp #
afpovertcp 548/udp
afs3-bos 7007/tcp #
The command that I am entering is the following:
cat /etc/services | sed '/^#/ d' | cut -d ' ' -f 1 | sort | awk '!a[$0]++' > file.txt

Give this a try:
awk '$0&&/^[^#]/&&!a[$0]++' /etc/services |sort
By the way, don't do cat aFile | awk '...'; instead, do awk '...' aFile.

cut -f1 /etc/services | grep '^[^#].*s$' | sort --unique
can be used to get the unique service names. If you want to store them in a separate file, append > file.txt.
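If the goal is the desired output shown in the question (name, port/protocol, and any alias on one line), a sketch along these lines may help. It works on a small made-up sample rather than the real /etc/services, and the variable names are illustrative.

```shell
#!/bin/sh
# Hypothetical excerpt in the layout of /etc/services.
services=$(mktemp)
cat > "$services" <<'EOF'
# sample comment line
acr-nema        104/udp    dicom
afbackup        2988/tcp
afbackup        2988/udp

afpovertcp      548/tcp
acr-nema        104/udp    dicom
EOF

# Skip comment lines and blank lines, squeeze runs of whitespace into single
# spaces ($1 = $1 makes awk rebuild the record), then sort and drop duplicates.
listing=$(awk '/^[^#]/ && NF { $1 = $1; print }' "$services" | LC_ALL=C sort -u)

printf '%s\n' "$listing"
rm -f "$services"
```

LC_ALL=C is set only to make the sort order predictable across locales.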

How to grep for a key in the file?

I have a text file that carries the following values
Key 1: 0e3f02b50acfe57e21ba991b39d75170d80d98e831400250d3b4813c9b305fd801
Key 2: 8e3db2b4cdfc55d91512daa9ed31b348545f6ba80fcf2c3e1dbb6ce9405f959602
I am using the following grep command to extract value of Key 1
grep -Po '(?<=Key 1=)[^"]*' abc.txt
However, it doesn't seem to work.
Please help me figure out the correct grep command
My output should be:
0e3f02b50acfe57e21ba991b39d75170d80d98e831400250d3b4813c9b305fd801

A grep+cut solution: Search for the right key, then return the third field:
$ grep '^Key 1:' abc.txt | cut -d' ' -f3
Or, equivalently in awk:
$ awk '/^Key 1:/ { print $3 }' abc.txt

Don't use grep to modify the matching string, that's pointless, messy, and non-portable when sed already does it concisely and portably:
$ sed -n 's/^Key 1: //p' file
0e3f02b50acfe57e21ba991b39d75170d80d98e831400250d3b4813c9b305fd801

If your version of grep doesn't support PCRE, you can do the same with sed, e.g.
$ sed -n '/^Key 1: [^"]/s/^Key 1: //p' file.txt
0e3f02b50acfe57e21ba991b39d75170d80d98e831400250d3b4813c9b305fd801
Explanation
-n suppress normal printing of pattern space
/^Key 1: [^"]/ find the pattern
s/^Key 1: // substitute (nothing) for pattern
p print the remainder

There is a mistake in your grep pattern: change Key 1= to Key 1: followed by a space.
grep -Po '(?<=Key 1: )[^"]*' abc.txt

grep -oP '(?<=Key 1: )[^"]+' abc.txt
seems to work for me.
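The sed and awk answers can be compared side by side with a small sketch. The shortened key values and the get_key helper are hypothetical, introduced only for this example.

```shell
#!/bin/sh
# Hypothetical file with shortened values in the "Key N: value" layout.
keys=$(mktemp)
cat > "$keys" <<'EOF'
Key 1: 0e3f02b5
Key 2: 8e3db2b4
EOF

# get_key NUMBER FILE: print whatever follows "Key NUMBER: " on a matching line.
get_key() {
    sed -n "s/^Key $1: //p" "$2"
}

value=$(get_key 1 "$keys")
# The awk variant prints the third space-separated field instead.
value_awk=$(awk '/^Key 1:/ { print $3 }' "$keys")

printf '%s\n' "$value"
rm -f "$keys"
```

Both variants avoid grep -P, which is only available when grep was built with PCRE support.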

How to get just numerical value from a string in bash

I have an XML file and I want to extract just the numerical value from a string in the file. One of the solutions I came up with is
cat file.xml |grep -i "mu "| grep -o '[0-9]'
But I get each digit on its own line, e.g. for 100 I get 1, then a new line, then 0, and so on. The other solution I came up with is
cat file.xml |grep -i "mu "|cut -d ' ' -f 4| tr '=' ' '|cut -d ' ' -f2|tr '""' ' '|sed -e 's/^ *//g' -e 's/ *$//g'
My question: is there a simpler way to get just the numerical value from a line, without caring about fields and without using cut or tr?

Use grep -E (egrep is its older, obsolescent spelling). The + makes each run of digits one match:
grep -Eo '[0-9]+'

One option you have is to delete everything that is not a digit from your input
tr -cd '[:digit:]'
Or for floating numbers
tr -cd '[:digit:].'
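Here is a minimal sketch of both suggestions applied to one line. The XML content is made up, assuming an element of the form <mu value="100" /> like the one mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical XML file; only the <mu ... /> line is of interest.
xml=$(mktemp)
cat > "$xml" <<'EOF'
<config>
  <ma value="7" />
  <mu value="100" />
</config>
EOF

# Select the line containing "mu ", then print each run of digits as one match.
value=$(grep -i 'mu ' "$xml" | grep -Eo '[0-9]+')

# Alternative: delete every character that is not a digit. This glues all
# digits on the line together, so it only suits lines with a single number.
value_tr=$(grep -i 'mu ' "$xml" | tr -cd '[:digit:]')

printf '%s\n' "$value"
rm -f "$xml"
```

For anything beyond a quick one-off, a real XML parser is the safer choice, since line-based matching breaks as soon as the element spans several lines.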

Personally, I would encourage avoiding XML as a format, at least for your own use. Instead of "<mu value="100" />", you could use the following:
# Name your data file ma-me-mo-mu.txt
100+200+300+400
and then:
while IFS='+' read -r ma me mo mu
do
echo "${ma}"
echo "${me}"
echo "${mo}"
echo "${mu}"
done < ma-me-mo-mu.txt
You don't need to name your columns inside the data file itself. They go in the file name.

Get line number while using grep

I am using grep recursively to search files for a string, and all the matched files and the lines containing that string are printed on the terminal. Is it possible to get the line numbers of those lines too?
For example, what I get at present is /var/www/file.php: $options = "this.target", but what I am trying to get is /var/www/file.php: 1142 $options = "this.target";, where 1142 is the number of the line containing that string.
The syntax I am using to grep recursively is sudo grep -r 'pattern' '/var/www/file.php'
One more question: how do we get results for lines that do not match a pattern? For example, all the files except the ones containing a certain string?

grep -n SEARCHTERM file1 file2 ...

Line numbers are printed with grep -n:
grep -n pattern file.txt
To get only the line number (without the matching line), one may use cut:
grep -n pattern file.txt | cut -d : -f 1
Lines not containing a pattern are printed with grep -v:
grep -v pattern file.txt
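The options above can be tried together in a small sketch. The directory and file contents are made up; grep -L (list files with no match) is not mentioned in the answer itself and is added here as one way to address the "files not having a certain string" part of the question, assuming GNU or BSD grep.

```shell
#!/bin/sh
# Hypothetical directory with one matching and one non-matching file.
dir=$(mktemp -d)
printf '%s\n' 'first line' '$options = "this.target";' > "$dir/a.php"
printf '%s\n' 'nothing here' > "$dir/b.php"

# Recursive search with line numbers: output is file:line:text.
# -F treats the pattern as a fixed string, so the dot is literal.
with_numbers=$(grep -rnF 'this.target' "$dir")

# Only the line number for a single file.
line_only=$(grep -nF 'this.target' "$dir/a.php" | cut -d: -f1)

# Files that contain no matching line at all.
without=$(grep -rLF 'this.target' "$dir")

printf '%s\n' "$with_numbers" "$line_only" "$without"
rm -rf "$dir"
```

Keep in mind the difference between the two negations: grep -v prints the lines that do not match, while grep -L prints the names of files in which nothing matches.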

If you want only the line number do this:
grep -n Pattern file.ext | gawk '{print $1}' FS=":"
Example:
$ grep -n 9780545460262 EXT20130410.txt | gawk '{print $1}' FS=":"
48793
52285
54023

grep -A20 -B20 pattern file.txt
This searches for the pattern and shows the 20 lines before and after each match.

grep -nr "search string" directory
This gives you the line with the line number.

In order to display the results with the line numbers, you might try this
grep -nr "word to search for" /path/to/file/file
The result should be something like this:
linenumber: other data "word to search for" other data

When working with vim you can place
function grepn() {
grep -n "$@" /dev/null | awk -F $':' '{t = $1; $1 = $2; $2 = t; print; }' OFS=$':' | sed 's/^/vim +/' | sed '/:/s// /' | sed '/:/s// : /'
}
in your .bashrc and then
grepn SEARCHTERM file1 file2 ...
results in
vim +123 file1 : xxxxxxSEARCHTERMxxxxxxxxxx
vim +234 file2 : xxxxxxSEARCHTERMxxxxxxxxxx
Now you can open vim on the corresponding line (for example line 123) by simply copying
vim +123 file1
to your shell.
