How can you run AWK in Vim's selection of the search?
My pseudo-code
%s/!awk '{ print $2 }'//d
I am trying to delete the given column in the file.
Though they probably address the issue of the original poster, none of the answers addresses the issue advertised in the title of the question. My proposal to remove the first line of the question and to retitle it as "Deleting one column in vim" having been unanimously rejected, here is a solution for people arriving here by actually looking for that.
Deleting a column (here the second one, as in OP's pseudocode example) with awk in vim :
:%!awk '{$2=""; print $0}'
Of course, it also works for a portion of the file — e.g. for lines 10 to 20 :
:10,20!awk '{$2=""; print $0}'
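For readers who want to try the filter outside Vim first, here is a minimal shell sketch of what the awk program does to the lines it receives. The sample rows are made up for illustration. Note that emptying a field keeps its separators, so a doubled blank remains where the column used to be.

```shell
# Hypothetical sample rows, not taken from the question.
input='a b c
d e f'

# Same awk program as in the Vim filter: blank out field 2,
# then print the rebuilt line.
result=$(printf '%s\n' "$input" | awk '{$2=""; print $0}')
printf '%s\n' "$result"
```

If the doubled blank matters, the output can be squeezed afterwards, for instance with tr -s ' '.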
As for "[running] awk in Vim's selection of the search", not sure you can exactly do that but anyway the search and substitution is an easy job for awk, if not its primary purpose. The following replaces "pattern" with "betterpattern" in the second column if it matches :
:%!awk '$2~"pattern" {gsub("pattern","betterpattern",$2)} {print $0}'
Note that the NOT operator requires escaping (\! instead of !), because Vim otherwise expands a bare ! in a :! command to the previous external command. The following adds 10 to the value in the second column if it matches "number" and leaves other lines unchanged :
:%!awk '$2~"number" {$2=$2+10; print $0} $2\!~"number" {print $0}'
Apart from this point, it's just awk syntax.
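The same kind of conditional edit can be tried in a plain shell, where no escaping of ! is needed inside single quotes. This is only a sketch: the sample rows and the patterns "foo" and "^[0-9]+$" are assumptions chosen for the demonstration.

```shell
# Hypothetical sample rows.
words='x foo 1
y bar 2'
numbers='a 5
b x'

# Replace "foo" by "FOO" in field 2 where it matches; print every line.
replaced=$(printf '%s\n' "$words" |
  awk '$2 ~ "foo" { gsub("foo", "FOO", $2) } { print $0 }')

# Add 10 to field 2 when it is an unsigned integer; print every line.
incremented=$(printf '%s\n' "$numbers" |
  awk '$2 ~ "^[0-9]+$" { $2 = $2 + 10 } { print $0 }')

printf '%s\n' "$replaced" "$incremented"
```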
In normal mode, press Ctrl-v to enter visual block mode, then block-select the column using the cursor movement keys. You can then yank and put it, delete it, or do whatever else you need using the appropriate vim commands and keystrokes.
You do not have to use awk, even if the second column is not a rectangular region. Use a substitution:
:%s/ \w\+ / /
The second column consists of one or more word characters (\w\+) surrounded by blanks. The replacement is a single blank. This one is for a selected range of lines:
:'<,'>s/ \w\+ / /
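A rough shell counterpart of this substitution, as a sketch only: it uses the portable bracket expression [^ ] (any non-blank character) instead of \w, so it is slightly more permissive than the Vim pattern. It also shows a limitation the pattern shares with the Vim version: a second column that ends the line has no trailing blank and is left alone.

```shell
# Hypothetical sample rows.
input='one two three
four five
six'

# Remove the first " word " occurrence, leaving a single blank.
result=$(printf '%s\n' "$input" | sed 's/ [^ ][^ ]* / /')
printf '%s\n' "$result"
```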
If you want to delete something, use :%s/pattern//
The pattern can't be a command; it is a regular expression, and expressing the 2nd field as a regular expression is not very easy.
If you want to delete the 2nd field, you can filter the text through the cut utility (the --complement option is a GNU cut extension):
:%! cut -d ' ' -f 2 --complement
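Where --complement is not available, the same result can be had by listing the fields to keep. A small sketch with made-up rows, assuming single blanks as delimiters:

```shell
# Hypothetical sample rows.
input='a b c
d e'

# Keep field 1 and everything from field 3 on, i.e. drop field 2.
result=$(printf '%s\n' "$input" | cut -d ' ' -f 1,3-)
printf '%s\n' "$result"
```

Inside Vim this would be used as a filter in the same way, i.e. :%!cut -d ' ' -f 1,3-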
You can delete a given column in a file just from vim.
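In command-line mode, use the following to delete character column n (replace n-1 with the actual number, e.g. 2 to delete the 3rd character of every line):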
:%s/\(.\{n-1}\).\{1}\(.*$\)/\1\2/g
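To see what this pattern does with a concrete number, here is a sketch of the same idea with sed for n = 3 (so n-1 is 2). The sample lines are invented; lines shorter than n characters do not match and pass through unchanged.

```shell
# Hypothetical sample lines.
input='abcdef
ab'

# Capture the first 2 characters, skip the 3rd, capture the rest.
result=$(printf '%s\n' "$input" | sed 's/^\(.\{2\}\).\(.*\)$/\1\2/')
printf '%s\n' "$result"
```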
you could press 0, then press w to go to your 2nd column, and do cw.
How to change delimiter from current comma (,) to semicolon (;) inside .txt file using linux command?
Here is my ME_1384_DataWarehouse_*.txt file:
Data Warehouse,ME_1384,Budget for HW/SVC,13/05/2022,10,9999,13/05/2022,27,08,27,08
Data Warehouse,ME_1384,Budget for HW/SVC,09/05/2022,10,9999,09/05/2022,45,58,45,58
Data Warehouse,ME_1384,Budget for HW/SVC,25/05/2022,10,9999,25/05/2022,7,54,7,54
Data Warehouse,ME_1384,Budget for HW/SVC,25/05/2022,10,9999,25/05/2022,7,54,7,54
It is very important that the values of the last two columns stay numbers with 2 decimal places; in the first row, for example, each of the last 2 columns has the value "27,08".
That is probably the main reason why the delimiter could not be changed properly.
I tried with:
sed 's/,/;/g' ME_1384_DataWarehouse_*.txt
and every comma was changed, including the decimal commas in the values of the last 2 columns.
Is there anyone who can help me out with this issue?
With sed you can replace the nth occurrence of a certain lookup string. Example:
$ sed 's/,/;/4' file
will replace the 4th comma with a semicolon.
So, if you know you have 11 fields (10 commas), you can do
$ sed 's/,/;/g;s/;/,/10;s/;/,/8' file
Example:
$ seq 1 11 | paste -sd, | sed 's/,/;/g;s/;/,/10;s/;/,/8'
1;2;3;4;5;6;7;8,9;10,11
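Applied to the first row from the question, the same command can be checked like this (a sketch; it assumes every row has exactly 10 commas, as in the sample data):

```shell
# First sample row from the question.
row='Data Warehouse,ME_1384,Budget for HW/SVC,13/05/2022,10,9999,13/05/2022,27,08,27,08'

# Turn every comma into a semicolon, then turn the 10th and the 8th
# semicolon back into commas (the 10th first, so the count of the 8th
# is not disturbed).
result=$(printf '%s\n' "$row" | sed 's/,/;/g;s/;/,/10;s/;/,/8')
printf '%s\n' "$result"
```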
Your question is somewhat unclear, but if you are trying to say "don't change the last comma, or the third-to-last one", a solution to that might be
perl -pi~ -e 's/,(?![^,]+(?:,[^,]+,[^,]+)?$)/;/g' ME_1384_DataWarehouse_*.txt
Perl in isolation does not perform any loop over the input lines, but the -p option says to loop over input one line at a time, like sed, and print every line (there is also -n to simulate the behavior of sed -n). The -i~ says to modify the file, but save the original with a tilde added to its file name as a backup. The regex uses a negative lookahead (?!...) to protect the two commas you want to exempt from the replacement. Lookaheads are a modern regex feature which is not supported by older tools like sed.
Once you are satisfied with the solution, you can remove the ~ after -i to disable the generation of backups.
You can do this with awk:
awk -F, 'BEGIN {OFS=";"} {a=$NF;NF-=1; printf "%s,%s\n",$0,a} ' input_file
This should work with most awk versions (do not count on the standard Solaris awk).
The idea is to store the last field of each row in a variable, decrease the number of fields by one, and then print the remaining fields joined by the new delimiter, followed by a comma and the stored last field.
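The one-liner above protects only the last comma. If both decimal values have to survive, as in the sample data, one possible variant is to choose the separator per field instead of changing NF. This is a sketch under the assumption that the last four comma-separated pieces always form two decimal numbers:

```shell
# First sample row from the question.
row='Data Warehouse,ME_1384,Budget for HW/SVC,13/05/2022,10,9999,13/05/2022,27,08,27,08'

# Join the fields with ";" except in front of the last field and in
# front of the third-to-last field, where the decimal comma is kept.
result=$(printf '%s\n' "$row" | awk -F, '{
  out = $1
  for (i = 2; i <= NF; i++) {
    sep = (i == NF || i == NF - 2) ? "," : ";"
    out = out sep $i
  }
  print out
}')
printf '%s\n' "$result"
```

Because it never assigns to NF, this form should not depend on how a given awk handles shrinking the field count.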
I have a text file and I want to remove every word except the first word on every line and I have no idea how to do this.
So, if I have:
one two three
four five
six
I want to remain with:
one
four
six
Got any ideas?
If the lines don't start with whitespace, you could replace ' .*' (which matches everything after the first word) with an empty string:
:%s/ .*//g
A more robust solution is to filter it through a program that is really good at these kinds of manipulations: awk.
Say you had this content:
one two three
four five
six
Run :%!awk '{print $1}' and you will get:
one
four
six
awk's default field separator is any run of blanks (spaces or tabs), though you can change it to whatever you need with the -F option.
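The difference between the two approaches shows up on lines that start with whitespace. A small shell sketch, using the sample from the question with one line indented on purpose (that indentation is an assumption added for the demonstration):

```shell
# Sample from the question; the second line is indented on purpose.
input='one two three
  four five
six'

# awk skips leading blanks when splitting fields.
with_awk=$(printf '%s\n' "$input" | awk '{print $1}')

# The plain substitution cuts at the first blank, wherever it is.
with_sed=$(printf '%s\n' "$input" | sed 's/ .*//')

printf 'awk:\n%s\nsed:\n%s\n' "$with_awk" "$with_sed"
```

With awk the indented line still yields "four"; with the substitution it becomes empty.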
Alternatively, you can do it using a macro.
Type qa in normal mode to start recording a macro in register a.
Then type 0elDj to delete everything on the current line but the first word, and go to the next line.
Type q again to end recording the macro.
Now you can fire the macro on any line with @a.
Run :%norm! @a to apply the macro to every line in the buffer.
This way you can repeat any complex operation you want, not just substituting.
I love macros :)
EDIT: Note that it doesn't work when a line has strictly less than 2 characters. For this reason, this is generally not the best approach to this problem.
Basically after I sort I want my columns to be separated by tabs. right now it is separated by two spaces. The man pages did not have anything related to output formatting (at least I didn't notice it).
If it's not possible, I guess I have to use awk to sort and print. Is there a better alternative?
EDIT:
To clarify the question, the location of the double spaces is not consistent. I actually have data like this:
<date>\t<user>\t<message>.
I sort by date by year, month, day and time which looks like
Wed Jan 11 23:44:30 CST 2012
and then have the output of the sorted data like the original file that is
<date>\t<user>\t<message>.
EDIT 2: It seems my test for tabs was wrong. I was copy-pasting raw lines from bash to my Windows box, which is why the tabs showed up as spaces. I downloaded the whole file to Windows and now I can see that the fields are tab separated.
Also, I figured out that separation of fields (\t \n , : ;, etc) is same in the new file after sorting. That means, in the original file if I have tab separated field, my sorted file is also going to be tab separated.
One last thing, the "correct" answer was not exactly the correct solution to the problem. I don't know if I can comment on my own thread and mark it as correct. If it is OK to do that, please let me know.
Thanks for the comments guys. Really appreciate your help!
Pipe your output to column:
sort <whatever> | column -t -s $'\t'
You can use sed:
sort data.txt | sed 's/  /\t/g'
                       ^^
                       ||
                 2 blank spaces
This will take the output of your sort operation and substitute a single tab for 2 consecutive blanks.
From what I understood the file is already sorted and what you want is to replace the two separating spaces by a TAB character, in that case, use the following:
sed 's/  /\t/g' < sorted_file > new_formatted_file
(Be careful to copy/paste correctly the two spaces in the regular expression)
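Since \t in the replacement part is not understood by every sed, a more portable sketch puts a literal tab into a shell variable first. The two sample rows and the variable name tab are assumptions for the demonstration:

```shell
# A literal tab character, produced portably.
tab=$(printf '\t')

# Hypothetical rows whose fields are separated by two blanks.
result=$(printf 'b  2\na  1\n' | LC_ALL=C sort | sed "s/  /${tab}/g")
printf '%s\n' "$result"
```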
I already know how to do it with
:%s/\(\S\+\)^I\(\S\+\)/\2^I\1/
but I feel like I'm typing way too much stuff. Is there a cleaner, quicker way to do it?
If the columns are lined up, you can use visual block mode by hitting Ctrl+V, then cut and paste. If the columns are not lined up, increase the tab width first so that it's longer than the content of the columns in question.
Best way to do it in VIM is - not to do it with VIM and (re)use existing tools for the job. *NIX specific solution:
:%!awk -F \\t '{print $2 FS $1}'
This pipes the content of the tab-delimited file through awk, which prints the first two columns swapped, separated by the field separator (FS). awk is also available for Windows.
P.S. Initially I wanted to write the same with cut but for whatever reason on my system the cut -f 2,1 (-d is not needed as TAB is the default delimiter) printed the fields in the same order, not swapped :|
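The behaviour described in the P.S. is expected: cut treats the field list as a set and always prints the selected fields in their input order. A short sketch with an invented row shows both commands side by side:

```shell
# Hypothetical tab-separated row.
row=$(printf 'left\tright')

# awk prints the fields in the order given in the program.
swapped=$(printf '%s\n' "$row" | awk -F '\t' '{print $2 FS $1}')

# cut ignores the order of the list and keeps the input order.
cut_out=$(printf '%s\n' "$row" | cut -f 2,1)

printf '%s\n%s\n' "$swapped" "$cut_out"
```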
Using Vim 6.0. Say I'm editing this file:
sdfsdg
dfgdfg
34 12
2 4
45 1
34 5
How do I sort the second column?
If you have decent shell available, select your numbers and run the command
:'<,'>!sort -n -k 2
If you type this in visual mode, the markers '<,'> will appear automatically after you type the colon, and you'll only have to type the rest.
This type of command (:[range]!) is called filtering. You can learn more by consulting vim's help:
:h filter
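What the filter receives and returns can be checked in a shell with the four numeric lines from the question. This is a sketch of the sort invocation only, not of the Vim selection:

```shell
# The numeric lines from the question.
input='34 12
2 4
45 1
34 5'

# Numeric sort on the key that starts at field 2.
result=$(printf '%s\n' "$input" | LC_ALL=C sort -n -k 2)
printf '%s\n' "$result"
```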
Sort all lines on virtual column N (here N is 2) by using Vim's :sort command, e.g.
:sort /.*\%2v/
Reference: vimtips.txt
For vim7 I would go for:
:sort n /.*\s/
This will sort numbers ignoring text matched by given regexp. In your case it is second column.
Sort by 2nd column by selecting it in visual mode (e.g. Control+v), then run:
!sort
or to sort by third column
sort -k 3
or
:sort /.*\%3v/
Alternatively select the lines you wish to sort using the Shift+V command. Then enter
!sort -k 3n
or use the below code to tell Vim to skip the first two words in every line and sort on whatever follows:
:%sort /^\S\+\s\+\S\+\s\+/
or, e.g., to sort by the 8th column (assuming it starts at virtual column 55):
:sort /.*\%55v/
The 'virtual' specification (\%v) is the absolute screen column, so a tab counts as the number of screen cells it occupies rather than as a single character.
To sort by the last column:
:%sort /\<\S\+\>$/ r
If more columns were there, you may use repetition to avoid complicated pattern.
For example, this will sort the entire file by the 100th column ("column" here means the space separated column)
:%sort /^\(\S\+\s\+\)\{99}/
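Outside Vim, sorting on the last space-separated column can be sketched with a decorate, sort, undecorate pipeline. This is not what :sort does internally, only a shell approximation with invented rows; it assumes the lines contain no tab characters, since a tab is used as the temporary delimiter.

```shell
# Hypothetical rows with a varying number of columns.
input='a z
b c x
c y'

# Prefix each line with its last field and a tab, sort on that prefix,
# then cut the prefix off again.
result=$(printf '%s\n' "$input" |
  awk '{print $NF "\t" $0}' |
  LC_ALL=C sort |
  cut -f 2-)
printf '%s\n' "$result"
```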