Strip text from output in rhel - linux

So I'm trying to strip away some of the text from this output using awk.
This is my output:
href="/warning:understand-how-this-works!/5HpHagT65TZzG1PH3CSu63k8DbpvD8s5ip4nEB3kEsrePxLM2Uo">+</a>
href="/warning:understand-how-this-works!/5HpHagT65TZzG1PH3CSu63k8DbpvD8s5ip4nEB3kEsrePxLM2Uo">+</a>
href="/warning:understand-how-this-works!/5HpHagT65TZzG1PH3CSu63k8DbpvD8s5ip4nEB3kEsrePxLM2Uo">+</a>
Basically, I am trying to take that output, which comes from a text file, and remove this part:
href="/warning:understand-how-this-works!/
and this part
">+</a>
So it only shows:
5HpHagT65TZzG1PH3CSu63k8DbpvD8s5ip4nEB3kEsrePxLM2Uo
or, outputs that.
Running on centos 6

Could you please try the following and let me know if it helps?
awk '{sub(/.*!\//,""); sub(/".*/,""); print}' Input_file

You can use grep if you want:
grep -oP '!/\K.*?(?=")' inputfile
Or awk, by setting the field separator (FS):
awk -F'!/|">' '{print $2}' input
Or use sed with a backreference:
sed -r 's/^.*!\/([^"]*)">.*/\1/' input
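For a quick check against the sample line, here is a minimal sketch using a portable sed expression that needs neither `-P` nor `-r`. The function name `extract_key` is made up for illustration, and the sketch assumes the key follows the last `!/` on the line and contains no double quote.

```shell
# Hypothetical helper: print the text between the last "!/" and the next double quote.
extract_key() {
    # -n plus the p flag prints only lines where the substitution matched
    sed -n 's/.*!\/\([^"]*\)".*/\1/p'
}

line='href="/warning:understand-how-this-works!/5HpHagT65TZzG1PH3CSu63k8DbpvD8s5ip4nEB3kEsrePxLM2Uo">+</a>'
printf '%s\n' "$line" | extract_key
```

Lines that do not match the pattern are dropped rather than printed unchanged, which may or may not be what you want.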

Related

How do you change column names to lowercase with linux and store the file as it is?

I am trying to change the column names to lowercase in a csv file. I found the code to do that online, but I don't know how to replace the old (uppercase) column names with the new (lowercase) ones in the original file. I did something like this:
$cat head -n1 xxx.csv | tr "[A-Z]" "[a-z]"
But it simply just prints out the column names in lowercase, which is not enough for me.
I tried to add sed -i but it did not do any good. Thanks!!
Using awk (readability winner) :
concise way:
awk 'NR==1{print tolower($0);next}1' file.csv
or using ternary operator:
awk '{print (NR==1) ? tolower($0): $0}' file.csv
or using if/else statements:
awk '{if (NR==1) {print tolower($0)} else {print $0}}' file.csv
To change the file for real:
awk 'NR==1{print tolower($0);next}1' file.csv | tee /tmp/temp
mv /tmp/temp file.csv
For your information, sed with the in-place edit switch -i does the same thing: it uses a temporary file under the hood.
You can check this by using :
strace -f -s 800 sed -i'' '...' file
Using perl:
perl -i -pe '$_=lc() if $.==1' file.csv
The -i switch replaces the file in place.
You can use sed to tell it to replace the first line with all lower-case and then print the rest as-is:
sed '1s/.*/\L&/' ./xxx.csv
Redirect the output or use -i to do an in-place edit.
Proof of Concept
$ echo -e "COL1,COL2,COL3\nFoO,bAr,baZ" | sed '1s/.*/\L&/'
col1,col2,col3
FoO,bAr,baZ
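The temporary-file pattern behind these in-place edits can be sketched as a small function. `lowercase_header` is a hypothetical name; the sketch assumes `mktemp` is available and only replaces the original file if awk succeeds.

```shell
# Hypothetical helper: lowercase the first line of a CSV file in place.
lowercase_header() {
    file=$1
    # Create the temporary file next to the original so mv stays on one filesystem
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if awk 'NR==1 {print tolower($0); next} 1' "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        # Leave the original untouched if awk failed
        rm -f "$tmp"
        return 1
    fi
}
```

It would be called as `lowercase_header xxx.csv`. Because `mktemp` creates the temporary file with restrictive permissions, the rewritten file may end up with different permissions than the original.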

Print between special characters with sed,grep

I need to print the string between these characters....
atob(' ')
I am using a = in the second part as an attempt to stop the code on an equal signs (which the base64 string I'm trying to get ends in.)
I use this script, but it prints the entire line containing the above characters. I need just the data in between.
sed -n '/atob/,${p;/==/q;}'
I appreciate any help. Thank you.
Does this work (tested with GNU sed 4.2.2)?
 sed -n -e "s/atob('\(.*\)')/\1/p" b.txt
where b.txt is
atob('safdasdfasf')
or you can try awk
awk -F\' '/atob/ {print $2}' b.txt
(tested with GNU awk 4.0.2, with the suggestion by Jotne added)
And another working sed:
echo "atob('safdasdfasf')" | sed -r "/atob/ s/^[^']+'([^']+)'.*/\1/"
safdasdfasf
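If the `atob('...')` call from the question sits in the middle of a longer line, or appears more than once per line, one possible approach is to isolate each call first and then strip the wrapper. This is only a sketch: `extract_atob` is a hypothetical name, and it assumes a grep that supports `-o` (GNU and BSD grep both do) and a payload without single quotes.

```shell
# Hypothetical helper: print the single-quoted argument of each atob('...') call,
# one per line, even when the call sits in the middle of a longer line.
extract_atob() {
    grep -o "atob('[^']*')" | sed "s/^atob('//; s/')\$//"
}

printf '%s\n' "var x = atob('aGVsbG8gd29ybGQ='); run(x);" | extract_atob
```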

How to extract string between 2 xml tags?

I have a string like this
<anytag>my message</anytag>
How I can extract the message between the tags with sed or awk?
So I get only "my message"
Using xmllint (from libxml2):
xmllint --xpath '//anytag/text()' <(echo "<anytag>my message</anytag>")
Try:
awk -F'[><]' '{print $3}' Input_file
This sets the field separator to '[><]' and prints the third field.
sed 's/<.*>\(.*\)<\/.*>/\1/g' file
I do not want to install an XML parser just for a light string extraction; my XML message is not complicated.
For simple strings you may use the following sed approach:
s="<anytag>my message</anytag>"
sed 's~<[^<>]*>\([^<>]*\)</[^<>]*>~\1~' <<< "$s"
The output:
my message
You can use the following awk command if each line of your file is in the format you have shown.
awk -F '<[^<>]+>' '{print $2}' <filename>
Input:
<anytag>my message</anytag>
<mytag>abc</mytag>
Output:
my message
abc
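When a line holds several elements and only one tag name is of interest, a variation on the same idea is to pull out each matching element first. This is a rough sketch rather than a real XML parser: `tag_text` is a hypothetical name, and it assumes a grep with `-o`, a tag name without regex metacharacters or slashes, and elements with no attributes or nesting that each fit on one line.

```shell
# Hypothetical helper: print the text of every <tag>...</tag> pair read from stdin.
# Assumes no attributes, no nesting, and one element per line at most spanning one line.
tag_text() {
    tag=$1
    grep -o "<$tag>[^<]*</$tag>" | sed "s/^<$tag>//; s/<\/$tag>\$//"
}

printf '%s\n' '<a>one</a><b>skip</b><a>two</a>' | tag_text a
```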

Difference between awk -FS and awk -f in shell scripting

I am new to shell scripting and I'm very confused between awk -FS and awk -f commands used. I've tried reading multiple pages on the difference between these two but was not able to understand clearly. Kindly help.
Here is an example:
Let's consider a text file, say data.txt, that has the details below.
S.No Product Qty Price
1-Pen-2-10
2-Pencil-1-5
3-Eraser-1-2
Now, when I try to use the following command:
$ awk -f'-' '{print $1,$2} data.txt
I get the below output:
1 Pen
2 Pencil
3 Eraser
But when I use the command:
$ awk -FS'-' '{print $1,$2} data.txt
the output is:
1-Pen-2-10
2-Pencil-1-5
3-Eraser-1-2
I don't understand what difference the -FS option makes. Could somebody explain what exactly happens with each of these two commands? Thanks!
You are more confused than you think. There is no -FS.
FS is a variable that contains the field separator.
-F is an option that sets FS to its argument.
-f is an option whose argument is the name of a file that contains the script to execute.
The scripts you posted would have produced syntax errors, not the output you say they produced, so idk what to tell you...
-FS is not an argument to awk. -F is, as is -f.
The -F argument tells awk what value to use for FS (the field separator).
The -f argument tells awk to use its argument as the script file to run.
This command (I fixed your quoting):
awk -f'-' '{print $1,$2}' data.txt
tells awk to read its script from standard input (that's what - means). This should hang when run in a terminal, and should then produce an error, as awk tries to use '{print $1,$2}' as the name of a file to read from.
This command:
awk -FS'-' '{print $1,$2}' data.txt
tells awk to use S- as the value of FS, which you can see by running this command:
awk -FS'-' 'BEGIN {print "["FS"]"}'
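To make the distinction concrete, here is a small sketch that runs the question's sample rows through `-F`, the equivalent `-v FS=`, and `-f` with the program stored in a file. The scratch directory, the file names, and the variable names are arbitrary choices for the example.

```shell
# Work in a scratch directory; all file names here are arbitrary.
workdir=$(mktemp -d)
printf '%s\n' '1-Pen-2-10' '2-Pencil-1-5' '3-Eraser-1-2' > "$workdir/data.txt"

# -F sets the field separator variable FS.
with_F=$(awk -F'-' '{print $1, $2}' "$workdir/data.txt")

# -v FS=... assigns the same variable explicitly before the program runs.
with_v=$(awk -v FS='-' '{print $1, $2}' "$workdir/data.txt")

# -f names a file that holds the awk program itself.
printf '%s\n' 'BEGIN { FS = "-" } { print $1, $2 }' > "$workdir/prog.awk"
with_f=$(awk -f "$workdir/prog.awk" "$workdir/data.txt")

printf '%s\n' "$with_F"
rm -rf "$workdir"
```

All three invocations should split the rows on `-` in the same way; only the place where the separator or the program is specified differs.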

awk to print some parameters of a line

I have lines in a file on Linux, and I am trying to print each line without the | characters and without some of the parameters.
$cat file
2013-07-15,Provider 1.99,3|30000055|2347|0,12222,1,3,0,0,0,19,aaa,bbb
2013-07-15,Provider 1.99,3|30000055|2347|0,12222,44,12,0,0,0,33,aaa,bbb
and I need the output to look like:
2013-07-15,Provider,2347,12222,1,3,0,0,0,19,aaa,bbb
2013-07-15,Provider,2347,12222,44,12,0,0,0,33,aaa,bbb
I am trying with awk, but I have some problems.
If your lines share a similar pattern that you would like to retain, then you can do:
awk 'BEGIN{FS=OFS=","}{$2="Provider";$3=2347}1' file
If you don't know what the patterns are then here is a more generic one:
awk 'BEGIN{FS=OFS=","}{split($2,a,/ /);split($3,b,/\|/);$2=a[1];$3=b[3]}1' file
If it doesn't solve your problem, I am pretty sure it will help guide you to a solution.
Using sed:
sed 's/ [^|]*|[^|]*|\([^|]*\)|[^,]*/,\1/' input
and a somewhat shorter version:
sed 's/ .*|\([^|]*\)|[^,]*/,\1/' input
and even shorter:
sed 's/ .*|\(.*\)|[^,]*/,\1/' input
Use awk, and let blank or comma or pipe be the field separators:
awk -F '[[:blank:],|]' -v OFS=, '{
print $1,$2,$6,$8,$9,$10,$11,$12,$13,$14,$15,$16
}' file
2013-07-15,Provider,2347,12222,1,3,0,0,0,19,aaa,bbb
2013-07-15,Provider,2347,12222,44,12,0,0,0,33,aaa,bbb
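A quick way to check any of these against the sample data is to pipe a line in directly. The sketch below wraps the `split`-based variant in a hypothetical function, `trim_fields`, and assumes every line has the same layout as the two sample lines.

```shell
# Hypothetical wrapper around the split-based awk approach.
trim_fields() {
    awk 'BEGIN { FS = OFS = "," }
    {
        split($2, name, " ")     # "Provider 1.99" -> name[1] is "Provider"
        split($3, part, /[|]/)   # "3|30000055|2347|0" -> part[3] is "2347"
        $2 = name[1]
        $3 = part[3]
        print
    }'
}

printf '%s\n' '2013-07-15,Provider 1.99,3|30000055|2347|0,12222,1,3,0,0,0,19,aaa,bbb' | trim_fields
```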

Resources