Cut and Awk command : Delimiter behaviour - linux

I tried to use cut command to get a list of file names and their sizes from "ls -l" command output.
$ ls -l | cut -f 5,9 -d " "
It gives me output based on 'SINGLE WHITE SPACE' as a delimiter. When "ls -l" output contains consecutive spaces in certain rows, then the output of the command is not proper for those rows.
The rows which have only single white space as column separator, give correct output.
When I run following command:
$ ls -l | awk '{ print $5"\t"$9 }'
It gives correct output for all rows: awk ignores the repeated spaces and properly extracts the columns from the "ls -l" output.
cut, on the other hand, treats each space as a delimiter, thereby putting values in the wrong columns.
Why is this happening? What can I do to make this work with the cut command?

awk splits fields on whitespace. cut splits fields on a delimiting character. awk is the better tool for this problem.
As an alternative, you can pipe ls -l through a utility that squeezes each run of spaces into a single space (tr -s ' ' does exactly this; sed can do it too). Then cut will do what you want it to.
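A minimal sketch of the difference. The sample line is made up for illustration and only stands in for one row of "ls -l" output; real rows will have different spacing.

```shell
# Hypothetical sample line with runs of spaces between columns.
line='-rw-r--r--  1 alice staff   4096 Jan  5 10:00 notes.txt'

# cut treats every single space as a delimiter, so each extra space
# creates an empty field and the field numbers shift.
with_cut=$(printf '%s\n' "$line" | cut -d ' ' -f 5,9)

# Squeezing repeated spaces first makes cut's numbering match awk's.
with_tr=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 5,9)

# awk splits on runs of whitespace by default.
with_awk=$(printf '%s\n' "$line" | awk '{ print $5, $9 }')

printf 'cut alone: [%s]\n' "$with_cut"
printf 'tr + cut:  [%s]\n' "$with_tr"
printf 'awk:       [%s]\n' "$with_awk"
```

With this sample, cut alone picks up the wrong columns, while the other two agree on the size and the name.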

Don't parse ls -- your code will not print the full filename if it contains spaces. To get the file size and name, use stat:
stat -c "%s %n" *

Try this:
ls -l | tr -s ' ' | cut -d ' ' -f 5,9

Related

What do back brackets do in this bash script code?

I'm working on a bash script problem: ./namefreq.sh ANA should return a list of two names (on separate lines), ANA and RENEE, both of which have frequency 0.120.
I have a file, table.csv, used in the code below, that holds names with a frequency number next to them, e.g. Anna, 0.120
I'm still unsure what the `` does in this code, and I'm also struggling to understand how the code is able to print out two names with identical frequencies. The way I read the code is:
grep compares the word (-w) typed by the user (./bashscript.sh Anna) to the value of (a), which then uses the cut command to be able to compare the 2nd field of the line separated by the delimiter "," which is the frequency from the file table.csv and then | cut -f1 -d"," prints out the first fields which are the names with the same frequency
^ would this be correct?
thanks :)
#!/bin/bash
a=`grep -w $1 table.csv | cut -f2 -d','`
grep -w $a table.csv | cut -f1 -d',' | sort -d
When a command is in backticks or $(), the output of the command is substituted back into the command line in its place. So if the file has Anna,0.120
a=`grep -w Anna table.csv | cut -f2 -d','`
will execute the grep and cut commands, which will output 0.120, so it will be equivalent to
a=0.120
Then the command looks for all the lines that match 0.120, extracts the first field with cut, and sorts them.
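The two steps can be tried in isolation. This is a minimal sketch: the contents of table.csv below are invented for illustration (the MARY row in particular is hypothetical), and a temporary file stands in for the real one.

```shell
# Hypothetical table.csv contents, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'ANA,0.120' 'RENEE,0.120' 'MARY,0.250' > "$tmp"

# Step 1: the command substitution runs first, and its output
# becomes the value of a.
a=$(grep -w 'ANA' "$tmp" | cut -f2 -d',')
echo "a=$a"

# Step 2: find every line with that frequency, keep the name column, sort.
names=$(grep -w "$a" "$tmp" | cut -f1 -d',' | sort -d)
echo "$names"

rm -f "$tmp"
```

Because two rows share the frequency stored in a, the second grep matches both of them, which is why two names are printed.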

Empty string as an output field separator for cut

How can I use cut with --output-delimiter=""? I want to join two columns using cut.
I tried the following command. However, cat -v shows that there are non-printable characters in the output, specifically "^#". Any suggestions on how I can overcome this?
cut -d, -f 3,6 --output-delimiter="" file1.csv | cat -v
This is the content of my file
011,IBM,Palmisano,t,t,t
012,INTC,Otellini,t,t,t
013,SAP,Snabe,t,t,t
014,VMW,Maritz,t,t,t
015,ORCL,Ellison,t,t,t
017,RHT,Whitehurst,t,t,t
When I run my command I see
Palmisano^#t
Otellini^#t
Snabe^#t
Maritz^#t
Ellison^#t
Whitehurst^#t
Expected output: Basically I want to exclude ^# in the output
Palmisanot
Otellinit
Snabet
Maritzt
Ellisont
Whitehurstt
Thank you.
The output delimiter is not an empty string, but probably the NULL character. You might want to try
cut -d, -f 3,6 --output-delimiter=$'\00' file1.csv
(Assuming your shell supports $'...'-quoting; bash and zsh are fine here, not sure about others).
edit:
cut apparently puts the NULL character if the output separator is set to the empty string. I do not see a way around it.
If awk is an acceptable solution, this will do the trick:
awk -F, '{print $3 $6}' file*
If you want to be more verbose and explicit:
awk 'BEGIN{FS=","; OFS=""}; {print $3,$6}' file*
FS="," sets the field separator to ,.
OFS="" sets the Output Field Separator to the empty string.
You probably don't want to cut by fields but instead by characters or perhaps bytes. See the description of -c and/or -b in the man page, instead of using -f.
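If staying with cut is a requirement, one possible workaround (not taken from the answers above) is to let cut keep its normal delimiter and delete it afterwards with tr. A minimal sketch, using two rows copied from the question and assuming the selected fields never contain a comma themselves:

```shell
# Two sample rows copied from the question.
rows='011,IBM,Palmisano,t,t,t
012,INTC,Otellini,t,t,t'

# cut joins fields 3 and 6 with a comma; tr then deletes that comma.
joined=$(printf '%s\n' "$rows" | cut -d, -f3,6 | tr -d ',')
echo "$joined"
```

This avoids passing an empty string to --output-delimiter altogether.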

Sed, Awk for combining the output of two cut statements

I'm trying to combine the outputs below into one command. The issue is that the fields I'm trying to grab are in reverse order. I was told that cut doesn't support a "reverse" option and to use awk instead, but it didn't end up working for my purpose. I'm trying to take the output of ls -l against /dev/block to list the partitions and automatically build a dd if= / of= command for each line of that output.
I tried piping the output to awk:
cut -d' ' -f23,25 ... | awk '{print $2,$1}'
However, when I then used sed to add the prefix and suffix, the result wasn't in the appropriate order.
I built the two statements below which individually return the expected output, just looking for the "right" way to combine both of these statements in the most efficient manner using sed / awk.
ls -l /dev/block/platform/msm_sdcc.1/by-name/ | cut -d' ' -f 25 | sed "s/^/dd if=/"
ls -l /dev/block/platform/msm_sdcc.1/by-name/ | cut -d' ' -f 23 | sed "s/.*/of=\/external_sd\/&.dsk/"
Any assistance will be appreciated.
Thank you.
If you're already using awk, I don't think you'll need cut or sed. You can probably do something like the following, though I'll have to trust you on the field numbers:
ls -l /dev/block/platform/msm_sdcc.1/by-name | awk '{print "dd if=" $25 " of=/external_sd/" $23 ".dsk"}'
awk splits on runs of whitespace rather than on every single space, so the field numbers will most likely be lower than the ones cut needed; adjust $25 and $23 to match. In return, the result should be more reliable when the column widths change.
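One way to avoid counting fields at all is to address them from the end of the line. The sketch below assumes each row is a symlink listing of the form "name -> target"; the sample row is invented, since the real ls -l output is not shown in the question.

```shell
# Hypothetical row of "ls -l" output for a by-name symlink.
row='lrwxrwxrwx root root 2014-01-01 12:00 boot -> /dev/block/mmcblk0p7'

# $NF is the last field (the link target); $(NF-2) is the field
# before the "->" arrow (the partition name).
cmd=$(printf '%s\n' "$row" |
    awk '{ print "dd if=" $NF " of=/external_sd/" $(NF-2) ".dsk" }')
echo "$cmd"
```

This only holds as long as neither the name nor the target contains whitespace.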

Find line number in a text file - without opening the file

In a very large file I need to find the position (line number) of a string, then extract the 2 lines above and below that string.
To do this right now, I launch vi, find the string, note its line number, exit vi, then use sed to extract the lines surrounding that string.
Is there a way to streamline this process... ideally without having to run vi at all.
Maybe using grep like this:
grep -n -2 your_searched_for_string your_large_text_file
This will give you almost what you expect:
-n : tells grep to print the line number
-2 : prints 2 lines of context before and after each match, in addition to the matching line itself (same as -C 2)
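A small sketch of what the combined options print, using a seven-line made-up input instead of a large file:

```shell
# Seven-line sample input; "d" is the line being searched for.
out=$(printf '%s\n' a b c d e f g | grep -n -C 2 'd')
echo "$out"
```

In the output, the matching line is marked with a ":" after its line number, while the context lines are marked with a "-".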
You can do
grep -C 2 yourSearch yourFile
To send it in a file, do
grep -C 2 yourSearch yourFile > result.txt
Use grep -n string file to find the line number without opening the file.
You can use cat -n to number the lines, grep for the word, and then use awk to pull out the line number:
cat -n FILE | grep WORD | awk '{print $1;}'
although grep already does what you mention if you give -C 2 (above/below 2 lines):
grep -C 2 WORD FILE
You can do it with grep -A and -B options, like this:
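grep -B 2 -A 2 "searchstring" yourFile | sed 3d
grep finds the line and shows two lines of context before and after it; sed 3d then deletes the third line of that output, which is the matching line itself (assuming a single match with at least two lines above it).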
If you want to automate this, you can simply write a shell script. You may try the following:
#!/bin/bash
VAL="your_search_keyword"
NUM1=$(grep -n -m 1 "$VAL" file.txt | cut -f1 -d ':') # line number of the first match
echo "$NUM1" # show the line number of the matched keyword
MYNUMUP=$((NUM1 - 1)) # line number just above the keyword
MYNUMDOWN=$((NUM1 + 1)) # line number just below the keyword
sed -n "${MYNUMUP}p" file.txt # display the line above the keyword
sed -n "${MYNUMDOWN}p" file.txt # display the line below the keyword
The advantage of the script is that you can change the keyword in the VAL variable as you like and run it again to get the needed output.
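The same idea can be extended to print two lines above and below in a single sed call, as the question asks. This is a minimal sketch: the sample file and keyword are made up, and only the first match is considered.

```shell
# Hypothetical sample file and keyword.
tmp=$(mktemp)
printf '%s\n' one two three four five six seven > "$tmp"
keyword='four'

# Line number of the first match only (-m 1 stops after one match).
num=$(grep -n -m 1 "$keyword" "$tmp" | cut -d ':' -f 1)

# Two lines either side; clamp the start so it never drops below line 1.
start=$((num - 2))
[ "$start" -lt 1 ] && start=1
end=$((num + 2))

# Print the whole range, matching line included.
context=$(sed -n "${start},${end}p" "$tmp")
echo "match on line $num"
echo "$context"

rm -f "$tmp"
```

If the match is near the end of the file, sed simply stops at the last line, so no upper clamp is needed.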

How to trim specific text with grep

I need to trim some text with grep. I have tried various other methods and haven't had much luck. For example:
C:\Users\Admin\Documents\report2011.docx: My Report 2011
C:\Users\Admin\Documents\newposter.docx: Dinner Party Poster 08
How would it be possible to trim the text file so that the ":" after the file name and all characters following it are removed?
E.g. so the output would be like:
C:\Users\Admin\Documents\report2011.docx
C:\Users\Admin\Documents\newposter.docx
use awk?
awk -F: '{print $1 ":" $2}' inputFile > outFile
you can use grep
(note that -o returns only the matching text)
grep -oe "^C:[^:]*" inputFile > outFile
That is pretty simple to do with grep -o:
$ grep -o '^C:[^:]*' input
C:\Users\Admin\Documents\report2011.docx
C:\Users\Admin\Documents\newposter.docx
If you can have other drives just replace C by .:
$ grep -o '^.:[^:]*' input
If a line can start with something other than a drive name, you can cover both cases, a drive name at the beginning of the line or no drive name at all:
$ grep -o '^\(.:\|\)[^:]*' input
cat inputFile | cut -f1,2 -d":"
The -d specifies your delimiter, in this case ":". The -f1,2 means you want the first and second fields.
The first part doesn't necessarily have to be cat inputFile, it's just whatever it takes to get the text that you referred to. The key part being cut -f1,2 -d":"
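A minimal sketch of the cut approach applied to the two sample lines from the question. It relies on the drive letter being the only colon before the one to be trimmed.

```shell
# The two sample lines from the question (single quotes keep the
# backslashes literal).
lines='C:\Users\Admin\Documents\report2011.docx: My Report 2011
C:\Users\Admin\Documents\newposter.docx: Dinner Party Poster 08'

# Field 1 is the drive letter, field 2 is the rest of the path;
# cut rejoins them with the same ":" delimiter.
trimmed=$(printf '%s\n' "$lines" | cut -d ':' -f 1,2)
printf '%s\n' "$trimmed"
```

printf is used instead of echo here because some shells' echo would interpret sequences such as \n and \r inside the Windows paths.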
Your text looks like output of grep. If what you're asking is how to print filenames matching a pattern, use GNU grep option --files-with-matches
You can use this as well for your example
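grep -E -o "^C\S+" inputFile | sed 's/:$//'
egrep -o "^C\S+" inputFile | sed 's/:$//'
\S here matches any non-space character (a GNU grep extension), and sed strips only the trailing ":", leaving the colon after the drive letter intact.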

Resources