How can i join consecutive lines into a single lines using awk? Actually i have this with my awk command:
awk -F "\"*;\"*" '{if (NR!=1) {print $2}}' file.csv
I remove the first line
44895436200043
38401951900014
72204547300054
38929771400013
32116464200027
50744963500014
i want to have this:
44895436200043 38401951900014 72204547300054 38929771400013 32116464200027 50744963500014
csv file
That's a job for tr:
# tail -n +2 prints the whole file from line 2 on
# tr '\n' ' ' translates newlines to spaces
tail -n +2 file | tr '\n' ' '
With awk, you can achieve this by changing the output record separator to " ":
# BEGIN{ORS= " "} sets the internal output record separator to a single space
# NR!=1 adds a condition to the default action (print)
awk 'BEGIN{ORS=" "} NR!=1' file
I assume you want to modify your existing awk, so that it prints a horizontal space separated list, instead of words, one per row.
You can replace the print $2 action in your command, you can do this:
awk -F "\"*;\"*" 'NR!=1{u=u s $2; s=" "} END {print u}' file.csv
or replace the ORS (output record separator)
awk -F "\"*;\"*" -v ORS=" " 'NR!=1{print $2}' file.csv
or pipe output to xargs:
awk -F "\"*;\"*" 'NR!=1{print $2}' file.csv | xargs
I have file that contains below information
$ cat test.txt
Studentename:Ram
rollno:12
subjects:6
Highest:95
Lowest:65
Studentename:Krish
rollno:13
subjects:6
Highest:90
Lowest:45
Studentename:Sam
rollno:14
subjects:6
Highest:75
Lowest:65
I am trying place info of single student in single.
i.e My output should be
Studentename:Ram rollno:12 subjects:6 Highest:95 Lowest:65
Studentename:Krish rollno:13 subjects:6 Highest:90 Lowest:45
Studentename:Sam rollno:14 subjects:6 Highest:75 Lowest:65.
Below is the command I wrote
cat test.txt | tr "\n" " " | sed 's/Lowest:[0-9]\+/Lowest:[0:9]\n/g'
Above command is breaking line at regex Lowest:[0-9] but it doesn't print the pattern. Instead it is printing Lowest:[0-9].
Please help
Try:
$ sed '/^Studente/{:a; N; /Lowest/!ba; s/\n/ /g}' test.txt
Studentename:Ram rollno:12 subjects:6 Highest:95 Lowest:65
Studentename:Krish rollno:13 subjects:6 Highest:90 Lowest:45
Studentename:Sam rollno:14 subjects:6 Highest:75 Lowest:65
How it works
/^Studente/{...} tells sed to perform the commands inside the curly braces only on lines that start with Studente. Those commands are:
:a
This defines a label a.
N
This reads in the next line and appends it to the pattern space.
/Lowest/!ba
If the current pattern space does not contain Lowest, this tells sed to branch back to label a.
In more detail, /Lowest/ is true if the line contains Lowest. In sed, ! is negation so /Lowest/! is true if the line does not containLowest. Inba, thebstands for the branch command anda` is the label to branch to.
s/\n/ /g
This tells sed to replace all newlines with spaces.
Try this using awk :
awk '{if ($1 !~ /^Lowest/) {printf "%s ", $0} else {print}}' file.txt
Or shorter but more obfuscated :
awk '$1!~/^Lowest/{printf"%s ",$0;next}1' file.txt
Or correcting your command :
tr "\n" " " < file.txt | sed 's/Lowest:[0-9]\+/&\n/g'
Explanation: & is whats matched in the left part of substitution
Another possible GNU sed that doesn't assume Lowest is the last item:
sed ':a; N; /\nStudent/{P; D}; s/\n/ /; ba' test.txt
This might work for you (GNU sed):
sed '/^Studentename:/{:a;x;s/\n/ /gp;d};H;$ba;d' file
Use the hold space to gather up the fields and then remove the newlines to produce a record.
Replace comma with space using a shell script
Given the following input:
Test,10.10.10.10,"80,22,3306",connect
I need to get below output using a bash script
Test 10.10.10.10 "80,22,3306" connect
If you have gawk, you can use FPAT (field pattern), setting it to a regular expression.
awk -v FPAT='([^,]+)|(\"[^"]+\")' '{ for(i=1;i<=NF;i++) { printf "%s ",$i } }' <<< "Test,10.10.10.10,\"80,22,3306\",connect"
We set FPAT to separate the text based on anything that is not a comma and also data enclosed in quotation marks as as well as anything that is not a quotation mark. We then print all the fields with a spaces in between.
Considering if your Input_file is same as shown sample then following sed may help you in same too.
sed 's/\(.[^,]*\),\([^,]*\),\(".*"\),\(.*\)/\1 \2 \3 \4/g' Input_file
Assuming you can read your input from the file, this works
#!/usr/bin/bash
while read -r line;do
declare -a begin=$(echo $line | awk -F'"' '{print $1}' | tr "," " " )
declare -a end=$(echo $line |awk -F'"' '{print $3}' | tr "," " " )
declare -a middle=$(echo $line | awk -F'"' '{print $2}' )
echo "${begin[#]} \"${middle[#]}\" ${end[#]}"
done < connect_file
Edit: I see,that you want to keep the commas between port numbers. I have edited the script.
echo Test,10.10.10.10,\"80,22,3306\",connect|awk '{sub(/,/," ")gsub(/,"80,22,3306",/," \4280,22,3306\42 ")}1'
Test 10.10.10.10 "80,22,3306" connect
I have a 10000 line file that contains on each line a string in the form of "data:key", which is also right-padded by 8 characters, where ':' is the delimiter. I am attempting to use awk from within Linux to print these pairs on their own lines, so that line #1 = data and line #2 = key, and I have achieved this using the command:
awk -F: '{print $1; print$2}' < ~/prices.txt
My problem occurs on the second line of each set. For some reason, it is padded with as much whitespace as there was from removing the data from the line. So, if my line was "26900:9976", the first line would be '26900' and the second line would be ' 9976', whitespace included.
If curious, I want to do it this way because I am piping the results to db_load to use within a B+-tree.
Not exactly your answer but you can use tr for this:
tr ':' '\n' < input
also I don't see the behaviour you are describing with your awk command, however, you can always add a sed to the pipeline to remove leading white space:
tr ':' '\n' < ~/prices.txt | sed 's/^[ \t]*//'
awk -F: '{print $1; print$2}' < ~/prices.txt | sed 's/^[ \t]*//'
You can use a regular expression as the field separator: a colon followed by zero or more whitespace chars will separate the fields.
awk -F ':[[:space:]]*' '{print $1; print $2}' < ~/prices.txt
Let's say I have the following string:
something1: +12.0 (some unnecessary trailing data (this must go))
something2: +15.5 (some more unnecessary trailing data)
something4: +9.0 (some other unnecessary data)
something1: +13.5 (blah blah blah)
How do I turn that into simply
+12.0,+15.5,+9.0,+13.5
in bash?
Clean and simple:
awk '{print $2}' file.txt | paste -s -d, -
You can use awk and sed:
awk -vORS=, '{ print $2 }' file.txt | sed 's/,$/\n/'
Or if you want to use a pipe:
echo "data" | awk -vORS=, '{ print $2 }' | sed 's/,$/\n/'
To break it down:
awk is great at handling data broken down into fields
-vORS=, sets the "output record separator" to ,, which is what you wanted
{ print $2 } tells awk to print the second field for every record (line)
file.txt is your filename
sed just gets rid of the trailing , and turns it into a newline (if you want no newline, you can do s/,$//)
cat data.txt | xargs | sed -e 's/ /, /g'
This might work for you:
cut -d' ' -f5 file | paste -d',' -s
+12.0,+15.5,+9.0,+13.5
or
sed '/^.*\(+[^ ]*\).*/{s//\1/;H};${x;s/\n/,/g;s/.//p};d' file
+12.0,+15.5,+9.0,+13.5
or
sed 's/\S\+\s\+//;s/\s.*//;H;$!d;x;s/.//;s/\n/,/g' file
For each line in the file; chop off the first field and spaces following, chop off the remainder of the line following the second field and append to the hold space. Delete all lines except the last where we swap to the hold space and after deleting the introduced newline at the start, convert all newlines to ,'s.
N.B. Could be written:
sed 's/\S\+\s\+//;s/\s.*//;1h;1!H;$!d;x;s/\n/,/g' file
$ awk -v ORS=, '{print $2}' data.txt | sed 's/,$//'
+12.0,+15.5,+9.0,+13.5
$ cat data.txt | tr -s ' ' | cut -d ' ' -f 2 | tr '\n' ',' | sed 's/,$//'
+12.0,+15.5,+9.0,+13.5
awk one liner
$ awk '{printf (NR>1?",":"") $2}' file
+12.0,+15.5,+9.0,+13.5
This should work too
awk '{print $2}' file | sed ':a;{N;s/\n/,/};ba'
You can use grep:
grep -o "+\S\+" in.txt | tr '\n' ','
which finds the string starting with +, followed by any string \S\+, then convert new line characters into commas. This should be pretty quick for large files.
Try this easy code:
awk '{printf("%s,",$2)}' File1
try this:
sedSelectNumbers='s".* \(+[0-9]*[.][0-9]*\) .*"\1,"'
sedClearLastComma='s"\(.*\),$"\1"'
cat file.txt |sed "$sedSelectNumbers" |tr -d "\n" |sed "$sedClearLastComma"
the good thing is the easy part of deleting newline "\n" characters!
EDIT: another great way to join lines into a single line with sed is this: |sed ':a;N;$!ba;s/\n/ /g' got from here.
A solution written in pure Bash:
#!/bin/bash
sometext="something1: +12.0 (some unnecessary trailing data (this must go))
something2: +15.5 (some more unnecessary trailing data)
something4: +9.0 (some other unnecessary data)
something1: +13.5 (blah blah blah)"
a=()
while read -r a1 a2 a3; do
# we can add some code here to check valid values or modify them
a+=("${a2}")
done <<< "${sometext}"
# between parenthesis to modify IFS for the current statement only
(IFS=',' ; printf '%s: %s\n' "Result" "${a[*]}")
Result: +12.0,+15.5,+9.0,+13.5
Don't seen this simple solution with awk
awk 'b{b=b","}{b=b$2}END{print b}' infile
With perl:
fg#erwin ~ $ perl -ne 'push #l, (split(/\s+/))[1]; END { print join(",", #l) . "\n" }' <<EOF
something1: +12.0 (some unnecessary trailing data (this must go))
something2: +15.5 (some more unnecessary trailing data)
something4: +9.0 (some other unnecessary data)
something1: +13.5 (blah blah blah)
EOF
+12.0,+15.5,+9.0,+13.5
You can also do it with two sed calls:
$ cat file.txt
something1: +12.0 (some unnecessary trailing data (this must go))
something2: +15.5 (some more unnecessary trailing data)
something4: +9.0 (some other unnecessary data)
something1: +13.5 (blah blah blah)
$ sed 's/^[^:]*: *\([+0-9.]\+\) .*/\1/' file.txt | sed -e :a -e '$!N; s/\n/,/; ta'
+12.0,+15.5,+9.0,+13.5
First sed call removes uninteresting data, and the second join all lines.
You can also print like this:
Just awk: using printf
bash-3.2$ cat sample.log
something1: +12.0 (some unnecessary trailing data (this must go))
something2: +15.5 (some more unnecessary trailing data)
something4: +9.0 (some other unnecessary data)
something1: +13.5 (blah blah blah)
bash-3.2$ awk ' { if($2 != "") { if(NR==1) { printf $2 } else { printf "," $2 } } }' sample.log
+12.0,+15.5,+9.0,+13.5
Another Perl solution, similar to Dan Fego's awk:
perl -ane 'print "$F[1],"' file.txt | sed 's/,$/\n/'
-a tells perl to split the input line into the #F array, which is indexed starting at 0.
Well the hardest part probably is selecting the second "column" since I wouldn't know of an easy way to treat multiple spaces as one. For the rest it's easy. Use bash substitutions.
# cat bla.txt
something1: +12.0 (some unnecessary trailing data (this must go))
something2: +15.5 (some more unnecessary trailing data)
something4: +9.0 (some other unnecessary data)
something1: +13.5 (blah blah blah)
# cat bla.sh
OLDIFS=$IFS
IFS=$'\n'
for i in $(cat bla.txt); do
i=$(echo "$i" | awk '{print $2}')
u="${u:+$u, }$i"
done
IFS=$OLDIFS
echo "$u"
# bash ./bla.sh
+12.0, +15.5, +9.0, +13.5
Yet another AWK solution
Run
awk '{printf "%s", $c; while(getline){printf "%s%s", sep, $c}}' c=2 sep=','
to use the 2nd column to form the list separated by commas. Give the input as usual in standard input or as a file name argument.