I have the input file (myfile) as:
/data/152.18224487:2,S/proforma invoice.doc
/data/152.916612:2,/proforma invoice.doc
/data/152.48152834/Bank T.T Copy 12 d3d.doc
/data/155071755/Bank T.T Copy.doc
/data/1521/Quotation Request.doc
/data/15.462/Quotation Request 2ds.doc
/data/15.22649962_test4/Quotation Request 33 zz (.doc
/data/15.226462_test6/Quotation Request.doc
and I need to exclude all data after latest "/" to the end of the row to have this output:
/data/152.18224487:2,S
/data/152.916612:2,
/data/152.48152834
/data/155071755
/data/1521
/data/15.462
/data/15.22649962_test4
/data/15.226462_test6
How can I do this from command line linux ?
This is a follow-up question related to extract last section of data from file using linux command
Could you please try following.
awk 'match($0,/\/.*\//){print substr($0,RSTART,RLENGTH-1)}' Input_file
Above will look from / to till last occurrence of / in case your Input_file can start other than / then try following.
awk 'match($0,/.*\//){print substr($0,RSTART,RLENGTH-1)}' Input_file
This one is combined with your previous question ,
ie. data:
>> Vi 'x' found in file /data/152.916612:2,/proforma invoice.doc
>> Vi 'x' found in file /data/152.48152834/Bank T.T Copy 12 d3d.doc
>> Vi 'x' found in file /data/155071755/Bank T.T Copy.doc
...
wwk:
$ awk '
(s=match($0,/found in file /)+RLENGTH) && (match(substr($0,s),/.*\//)) {
print substr($0,s,RLENGTH-1)
}' file
Output:
/data/152.18224487:2,S
/data/152.916612:2,
/data/152.48152834
...
Try
sed 's:/[^/]*$::' < inputfile > outputfile
You stated in a comment elsewhere that you also need only the rest after the last slash, so here we go:
sed 's:^.*/::' < inputfile > outputfile
awk -F/ '{print "/"$1$2"/"$3}' file
/data/152.18224487:2,S
/data/152.916612:2,
/data/152.48152834
/data/155071755
/data/1521
/data/15.462
/data/15.22649962_test4
/data/15.226462_test6
Related
My file contains some urls in urls.txt
Another file contains some different words in words.txt
I need to be add every word at the end of the every url.
I want output like (line by line)
url.1/hi
url.1/hello
url.1/there
url.2/hi
url.2/hello
url.2/there
url.3/hi
url.3/hello
url.3/there
I just want single line command :)
like
urls.txt + words.txt >> output.txt
Given:
cat words.txt
hi
hello
there
cat urls.txt
url.1
url.2
url.3
You can use awk this way:
awk 'FNR==NR{words[++cnt]=$1; next}
{for (i=1; i<=cnt; i++) print $1 words[i]}
' words.txt urls.txt
Prints:
url.1hi
url.1hello
url.1there
url.2hi
url.2hello
url.2there
url.3hi
url.3hello
url.3there
This might work for you (GNU sed & Parallel)
parallel echo "{1}{2}" :::: urlsFile :::: wordsFile
or
sed 's#.*#echo "&" | sed "s#.*#&\&#" wordsFile#e' urlsFile
we have the following file example
we want to remove the , character on the last line that topic word exists
more file
{"topic":"life_is_hard","partition":84,"replicas":[1006,1003]},
{"topic":"life_is_hard","partition":85,"replicas":[1001,1004]},
{"topic":"life_is_hard","partition":86,"replicas":[1002,1005]},
{"topic":"life_is_hard","partition":87,"replicas":[1003,1006]},
{"topic":"life_is_hard","partition":88,"replicas":[1004,1001]},
{"topic":"life_is_hard","partition":89,"replicas":[1005,1002]},
{"topic":"life_is_hard","partition":90,"replicas":[1006,1004]},
{"topic":"life_is_hard","partition":91,"replicas":[1001,1005]},
{"topic":"life_is_hard","partition":92,"replicas":[1002,1006]},
{"topic":"life_is_hard","partition":93,"replicas":[1003,1001]},
{"topic":"life_is_hard","partition":94,"replicas":[1004,1002]},
{"topic":"life_is_hard","partition":95,"replicas":[1005,1003]},
{"topic":"life_is_hard","partition":96,"replicas":[1006,1005]},
{"topic":"life_is_hard","partition":97,"replicas":[1001,1006]},
{"topic":"life_is_hard","partition":98,"replicas":[1002,1001]},
{"topic":"life_is_hard","partition":99,"replicas":[1003,1002]},
expected output
{"topic":"life_is_hard","partition":84,"replicas":[1006,1003]},
{"topic":"life_is_hard","partition":85,"replicas":[1001,1004]},
{"topic":"life_is_hard","partition":86,"replicas":[1002,1005]},
{"topic":"life_is_hard","partition":87,"replicas":[1003,1006]},
{"topic":"life_is_hard","partition":88,"replicas":[1004,1001]},
{"topic":"life_is_hard","partition":89,"replicas":[1005,1002]},
{"topic":"life_is_hard","partition":90,"replicas":[1006,1004]},
{"topic":"life_is_hard","partition":91,"replicas":[1001,1005]},
{"topic":"life_is_hard","partition":92,"replicas":[1002,1006]},
{"topic":"life_is_hard","partition":93,"replicas":[1003,1001]},
{"topic":"life_is_hard","partition":94,"replicas":[1004,1002]},
{"topic":"life_is_hard","partition":95,"replicas":[1005,1003]},
{"topic":"life_is_hard","partition":96,"replicas":[1006,1005]},
{"topic":"life_is_hard","partition":97,"replicas":[1001,1006]},
{"topic":"life_is_hard","partition":98,"replicas":[1002,1001]},
{"topic":"life_is_hard","partition":99,"replicas":[1003,1002]}
we try to removed the character , from the the last line that contain topic word as the following sed cli but this syntax not renewed the ,
sed -i '${s/,[[:blank:]]*$//}' file
sed (GNU sed) 4.2.2
In case you have control M characters in your Input_file then remove them by doing:
tr -d '\r' < Input_file > temp && mv temp Input_file
Could you please try following once. From your question what I understood is you want to remove comma from very last line which has string topic in it, if this is the case then I am coming up with tac + awk solution here.
tac Input_file |
awk '/topic/ && ++count==1{sub(/,$/,"")} 1' |
tac
Once you are happy with above results then append > temp && mv temp Input_file to above command too, to save output into Input_file itself.
Explanation:
Atac will read Input_file from bottom line to first line then passing it's output to awk where I am checking if first occurrence of topic is coming remove comma from last and rest of lines simply print then passing this output to tac again to make Input_file in original form again.
You should use the address $ (last line):
sed '$s/,$//' file
Using awk:
$ awk '{if(NR>1)print p;p=$0}END{sub(/,$/,"",p);print p}' file
Output:
...
{"topic":"life_is_hard","partition":98,"replicas":[1002,1001]},
{"topic":"life_is_hard","partition":99,"replicas":[1003,1002]}
I have to write a script file to cut the following column and paste it the end of the same row in a new .arff file. I guess the file type doesn't matter.
Current file:
63,male,typ_angina,145,233,t,left_vent_hyper,150,no,2.3,down,0,fixed_defect,'<50'
67,male,asympt,160,286,f,left_vent_hyper,108,yes,1.5,flat,3,normal,'>50_1'
The output should be:
male,typ_angina,145,233,t,left_vent_hyper,150,no,2.3,down,0,fixed_defect,'<50',63
male,asympt,160,286,f,left_vent_hyper,108,yes,1.5,flat,3,normal,'>50_1',67
how can I do this? using a Linux script file?
sed -r 's/^([^,]*),(.*)$/\2,\1/' Input_file
Brief explanation,
^([^,]*) would match the first field which separated by commas, and \1 behind refer to the match
(.*)$ would be the remainding part except the first comma, and \2 would refer to the match
Shorter awk solution:
$ awk -F, '{$(NF+1)=$1;sub($1",","")}1' OFS=, input.txt
gives:
male,typ_angina,145,233,t,left_vent_hyper,150,no,2.3,down,0,fixed_defect,'<50',63
male,asympt,160,286,f,left_vent_hyper,108,yes,1.5,flat,3,normal,'>50_1',67
Explanation:
{$(NF+1)=$1 # add extra field with value of field $1
sub($1",","") # search for string "$1," in $0, replace it with ""
}1 # print $0
EDIT: Reading your comments following your question, looks like your swapping more columns than just the first to the end of the line. You might consider using a swap function that you call multiple times:
func swap(i,j){s=$i; $i=$j; $j=s}
However, this won't work whenever you want to move a column to the end of the line. So let's change that function:
func swap(i,j){
s=$i
if (j>NF){
for (k=i;k<NF;k++) $k=$(k+1)
$NF=s
} else {
$i=$j
$j=s
}
}
So now you can do this:
$ cat tst.awk
BEGIN{FS=OFS=","}
{swap(1,NF+1); swap(2,5)}1
func swap(i,j){
s=$i
if (j>NF){
for (k=i;k<NF;k++) $k=$(k+1)
$NF=s
} else {
$i=$j
$j=s
}
}
and:
$ awk -f tst.awk input.txt
male,t,145,233,typ_angina,left_vent_hyper,150,no,2.3,down,0,fixed_defect,'<50',63
male,f,160,286,asympt,left_vent_hyper,108,yes,1.5,flat,3,normal,'>50_1',67
Why using sed or awk, the shell can handle this easily
while read l;do echo ${l#*,},${l%%,*};done <infile
If it's a win file with \r
while read l;do f=${l%[[:cntrl:]]};echo ${f#*,},${l%%,*};done <infile
If you want to keep the file in place.
printf "%s" "$(while read l;do f=${l%[[:cntrl:]]};printf "%s\n" "${f#*,},${l%%,*}";done <infile)">infile
I'm trying to create a little script that basically uses dig +short to find the IP of a website, and then pipe that to sed/awk/grep to replace a line. This is what the current file looks like:
#Server
123.455.1.456
246.523.56.235
So, basically, I want to search for the '#Server' line in a text file, and then replace the two lines underneath it with an IP address acquired from dig.
I understand some of the syntax of sed, but I'm really having trouble figuring out how to replace two lines underneath a match. Any help is much appreciated.
Based on the OP, it's not 100% clear exactly what needs to replaced where, but here's a a one-liner for the general case, using GNU sed and bash. Replace the two lines after "3" with standard input:
echo Hoot Gibson | sed -e '/3/{r /dev/stdin' -e ';p;N;N;d;}' <(seq 7)
Outputs:
1
2
3
Hoot Gibson
6
7
Note: sed's r command is opaquely documented (in Linux anyway). For more about r, see:
"5.9. The 'r' command isn't inserting the file into the text" in this sed FAQ.
here's how in awk:
newip=12.34.56.78
awk -v newip=$newip '{
if($1 == "#Server"){
l = NR;
print $0
}
else if(l>0 && NR == l+1){
print newip
}
else if(l==0 || NR != l+2){
print $0
}
}' file > file.tmp
mv -f file.tmp file
explanation:
pass $newip to awk
if the first field of the current line is #Server, let l = current line number.
else if the current line is one past #Server, print the new ip.
else if the current row is not two past #Server, print the line.
overwrite original file with modified version.
I have a file like this (tens of variables) :
PLAY="play"
APPS="/opt/play/apps"
LD_FILER="/data/mysql"
DATA_LOG="/data/log"
I need a script that will output the variables into another file like this (with space between them):
PLAY=${PLAY} APPS=${APPS} LD_FILER=${LD_FILER}
Is it possible ?
I would say:
$ awk -F= '{printf "%s=${%s} ", $1,$1} END {print ""}' file
PLAY=${PLAY} APPS=${APPS} LD_FILER=${LD_FILER} DATA_LOG=${DATA_LOG}
This loops through the file and prints the content before = in a format var=${var} together with a space. At the end, it prints a new line.
Note this leaves a trailing space at the end of the line. If this matters, we can check how to improve it.
< input sed -e 's/\(.*\)=.*/\1=${\1}/' | tr \\n \ ; echo
sed 's/"\([^"]*"\)"/={\1}/;H;$!d
x;y/\n/ /;s/.//' YourFile
your sample exclude last line so if this is important
sed '/DATA_LOG=/ d
s/"\([^"]*"\)"/={\1}/;H;$!d
x;y/\n/ /;s/.//' YourFile