What is the easiest way to join 12 columns?

I have 12 columns separated by a tab. How can I join them side-by-side?
[Added] You can also suggest methods other than awk: the faster the better.

Since you asked specifically about awk (there are tools better suited to the job), the following is a first-cut solution:
awk '{print $1$2$3$4$5$6$7$8$9$10$11$12}'
A more complicated and configurable solution, where you could change the number of columns used for output, would be:
awk -v lim=12 '{for(x=1;x<=lim;x++){printf "%s",$x};print ""}'
Other possibilities, if you're not restricted to awk, are:
tr -d '\011' # to combine ALL columns on the line.
cut --output-delimiter='' -f1-12 # more general (1-12 or 3-7 or 1-6,9).
Based on your edit and comments, I suggest cut is the best tool for the job. Use "man cut", "info cut" or "cut --help" for more details (this depends on your platform).
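As a sanity check on the awk loop and the tr form above (cols.txt is just a throwaway example file; note the loop uses `<=` so column 12 is included):

```shell
# Build a throwaway file with 12 tab-separated columns.
printf 'c1\tc2\tc3\tc4\tc5\tc6\tc7\tc8\tc9\tc10\tc11\tc12\n' > cols.txt

# Loop over the first lim fields; <= keeps the last column.
awk -v lim=12 '{for(x=1;x<=lim;x++){printf "%s",$x};print ""}' cols.txt
# c1c2c3c4c5c6c7c8c9c10c11c12

# Delete every tab on the line.
tr -d '\011' < cols.txt
# c1c2c3c4c5c6c7c8c9c10c11c12
```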

If you are just using awk to concatenate the columns, I would use tr and delete the tabs:
cat file1 | tr -d '\011' > file2

Try this:
{
print $1$2$3$4$5$6$7$8$9$(10)$(11)$(12)
}
I'm not an awk genius so I don't know if there's some sort of looping construct you can use.

Well, it depends on your editor/command of choice. But generally, it boils down to replacing the tab character with nothing.
For example, in vim: ":%s/\t//g"

You did not mention what tool you would like to use, but any text editor can replace a tab with an empty string. I guess that would work; it's what I usually do.


I am trying to replace a text for example

Example:
"word" -nothing
To
word" - nothing
in gvim.
I tried
:%s/^.*\"/
But what I get is: -nothing
Well I am new to scripting so I would like to know if it can be done in any other way like using gvim or awk or sed.
In vim... Check for \(word + quote + space + hyphen\) as first reference, followed directly by another \(word\) as second reference... replace by first reference + space + second reference... Make sure the find/replace can happen multiple times on a line with g suffix.
:%s/\(\w" -\)\(\w\)/\1 \2/g
Note that I left out the leading quote... I suppose it is possible you might have spaces in the quoted text, and I think this form might be better for you. Now in sed, that is the really cool thing about the relationship between *nix tools: they all use a similar (or the same) regular-expression pattern language. So, the same exact pattern as above can be done in sed (using : as the delimiter for clarity).
sed 's:\(\w" -\)\(\w\):\1 \2:g'
Awk doesn't do back references; so, not to say it can't be done, but it is not so convenient.
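Run against the sample line from the question (GNU sed; `\w` is a GNU extension):

```shell
# Capture (word char, quote, space-hyphen) and the following word
# char, then reinsert them with a space in between.
printf '"word" -nothing\n' | sed 's:\(\w" -\)\(\w\):\1 \2:g'
# "word" - nothing
```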
Could you please try the following and let me know if it helps:
awk '{sub(/^"/,"");sub(/-/,"- ")} 1' Input_file
Second solution, with sed:
sed 's/^"//;s/-/- /' Input_file
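On the sample line from the question, both commands produce the same result (Input_file here just holds the example data):

```shell
printf '"word" -nothing\n' > Input_file

# Strip the leading quote, then put a space after the first hyphen.
awk '{sub(/^"/,"");sub(/-/,"- ")} 1' Input_file
# word" - nothing

sed 's/^"//;s/-/- /' Input_file
# word" - nothing
```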
Since you also tagged grep: GNU grep has the -P switch for PCRE (Perl compatible reg ex) which has \K: Keep the stuff left of the \K, don't include it in $&, so:
$ echo \"word\" | grep -oP "\"\Kword\""
word"
If I understand your question correctly you want to replace first " in each line with empty string. So in sed it is just:
sed 's/"//'
Without g flag it will replace only first occurrence in each line.
EDIT:
The same way it will work in Vim (unless you have 'gdefault' option set), so in Vim you can:
:%s/"//
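A quick check of the no-g behaviour on the question's sample line:

```shell
# Without the g flag only the first quote on the line is removed.
printf '"word" -nothing\n' | sed 's/"//'
# word" -nothing
```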
try this :
:%s/\"\(.*\)\"/\1\"/gc

Embedding quotation marks in command string generated by AWK?

I need to match all instances of strings in one file, with a master list in another. However, if my string is abc I want only that, not abcdef, abc1234 and so on.
So, a word boundary for the regex? Right now, I'm using a simple awk one liner:
cat results_file | sort -k 1 | awk -F" " '{ print $1" /home/owner/file_2_search"}' |
xargs -L 1 /bin/grep -i
However, to force a word boundary, I'd need to grep string\b and the quotes (single or double) seem to be required.
In awk, \b is a special character, you need \\b ... And the quoted quotes ... (arg) ... Or am I missing something and overdoing this?
This is a Linux box, so presumably gawk. I have gone over quoting rules for awk, and realize this has got to be simple (and not complex ... but), but am not seeing it.
Had meant to post as an answer, not a comment. Will try to pose a more readable question, but confess to having second thoughts about doing this as a one-liner in the first place -- may be best to follow an alternate method. Appreciate the willingness to help.
--Joe
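For what it's worth, one way to sketch the quoting the question is after (terms.txt is a hypothetical master list; in an awk string, `\\` emits a literal backslash and `\"` a literal double quote):

```shell
printf 'abc\n' > terms.txt

# Wrap each term in quotes with \b word-boundary anchors.
awk '{print "\"\\b" $1 "\\b\""}' terms.txt
# "\babc\b"
```

Alternatively, grep's -w flag anchors the whole pattern at word boundaries, so `grep -iw abc file` sidesteps the quoting problem entirely.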

What are the differences among grep, awk & sed? [duplicate]

This question already has answers here:
What are the differences between Perl, Python, AWK and sed? [closed]
(5 answers)
What is the difference between sed and awk? [closed]
(3 answers)
Closed last month.
I am confused about the differences between grep, awk and sed in terms of their role in Unix/Linux system administration and text processing.
Short definition:
grep: search for specific terms in a file
#usage
$ cat file.txt
Every line containing "This"
Every line containing "This"
Every line containing "That"
Every line containing "This"
Every line containing "This"
$ grep This file.txt
Every line containing "This"
Every line containing "This"
Every line containing "This"
Every line containing "This"
Now awk and sed are completely different from grep.
awk and sed are text processors. Not only do they have the ability to find what you are looking for in text, they have the ability to remove, add and modify the text as well (and much more).
awk is mostly used for data extraction and reporting; sed is a stream editor.
Each one of them has its own functionality and specialties.
Example
Sed
$ sed -i 's/cat/dog/' file.txt
# this replaces the first occurrence of 'cat' on each line with 'dog', editing file.txt in place
Awk
$ awk '{print $2}' file.txt
# this will print the second column of file.txt
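A minimal side-by-side sketch on a throwaway file (sed without -i writes to stdout instead of editing the file):

```shell
printf 'cat 10\ndog 20\n' > file.txt

grep dog file.txt           # keeps only matching lines: dog 20
awk '{print $2}' file.txt   # prints the second field: 10, then 20
sed 's/cat/bird/' file.txt  # rewrites text: bird 10, then dog 20
```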
Basic awk usage:
Compute sum/average/max/min/etc., whatever you may need.
$ cat file.txt
A 10
B 20
C 60
$ awk 'BEGIN {sum=0; count=0; OFS="\t"} {sum+=$2; count++} END {print "Average:", sum/count}' file.txt
Average: 30
I recommend that you read this book: Sed & Awk: 2nd Ed.
It will help you become a proficient sed/awk user on any unix-like environment.
Grep is useful if you want to quickly search for lines that match in a file. It can also return some other simple information like matching line numbers, match count, and file name lists.
Awk is an entire programming language built around reading field-delimited text files, processing the records, and optionally printing out a result data set. It can do many things, but it is not the easiest tool to use for simple tasks.
Sed is useful when you want to make changes to a file based on regular expressions. It allows you to easily match parts of lines, make modifications, and print out results. It's less expressive than awk, but that lends it to somewhat easier use for simple tasks. It has many more complicated operators you can use (I think it's even Turing-complete), but in general you won't use those features.
I just want to mention one thing: there are many tools that can do text processing, e.g.
sort, cut, split, join, paste, comm, uniq, column, rev, tac, tr, nl, pr, head, tail.....
They are very handy, but you have to learn their options etc.
A lazy way (not the best way) to learn text processing might be: only learn grep, sed and awk. With these three tools, you can solve almost 99% of text-processing problems and don't need to memorize all the different commands and options above. :)
And if you've learned and used the three, you know the difference. Actually, the difference here means which tool is good at solving what kind of problem.
An even lazier way might be learning a scripting language (Python, Perl or Ruby) and doing all text processing with it.

More efficient shell text manipulation

I am using this command:
cut -d: -f2
to sort and re-edit text. Is there a more efficient way to do this without using sed or awk?
I would also like to know how I would append a period to the end of each field
At the moment the output is like $x['s'] and I would like it to be $x['s'] .
Just using standard unix tools
edit: I just wanted to know if it was possible without sed or awk, otherwise how would you do it with awk?
Short answer: not really
Longer answer: cut is intended for slicing up lines of text, and it does that well. If you need more complicated behavior, you'll need a text-manipulation language. You have rejected the old-time answers, so I'll recommend Perl.
Any particular reason you don't want to use sed or awk?
edit: I just wanted to know if it was possible without sed or awk, otherwise how would you do it with awk?
awk -F: '{print $2"."}'
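For example, on a colon-delimited line:

```shell
# Take the second colon-separated field and append a period.
printf 'a:b:c\n' | awk -F: '{print $2"."}'
# b.
```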

truncate output in BASH

How do I truncate output in BASH?
For example, if I "du file.name" how do I just get the numeric value and nothing more?
later addition:
all solutions work perfectly. I chose to accept the most enlightening "cut" answer because I prefer the simplest approach in bash files that others are supposed to be able to read.
If you know what the delimiters are then cut is your friend
du | cut -f1
Cut defaults to tab delimiters so in this case you are selecting the first field.
You can change delimiters: cut -d ' ' would use a space as a delimiter. (from Tomalak)
You can also select individual character positions or ranges:
ls | cut -c1-2
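Simulating du's tab-separated output with printf (the size 8 is made up) shows both the field and character-range forms:

```shell
# du prints "size<TAB>path"; -f1 keeps just the size.
printf '8\t./file.name\n' | cut -f1
# 8

# -c selects character positions instead of fields.
printf 'file.name\n' | cut -c1-2
# fi
```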
I'd recommend cut, as others have said. But another alternative that is sometimes useful because it allows any whitespace as separators, is to use awk:
du file.name | awk '{print $1}'
du | cut -f 1
If you just want the number of bytes of a single file, use stat:
SIZE=$(stat -c %s file.name)
That gives you a different number than du, but I'm not sure how exactly you're using this.
This has the advantage of not having to run du, and having bash get the size of the file directly.
It's hard to answer questions like this in a vacuum, because we don't know how you're going to use the data. Knowing that might suggest an entirely different answer.
