Searching and selecting strings from a file - linux

I'm having trouble extracting a few specific 'fields' from matching lines and then putting them into a .txt file. I need to extract the 'nologin' users from the /etc/passwd file, and that is the easy step. I'm using this command:
grep -n 'nologin' /etc/passwd > file1.txt
The cat command then gives me, for example:
2:daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
3:bin:x:2:2:bin:/bin:/usr/sbin/nologin
4:sys:x:3:3:sys:/dev:/usr/sbin/nologin
and it is saved to file1.txt
Now I have to extract from file1.txt the line number (2, 3, 4), the login (daemon, bin, sys), the UID and the shell. It should look like this:
2:daemon:1:/usr/sbin/nologin
3:bin:2:/usr/sbin/nologin
4:sys:3:/usr/sbin/nologin
I also have to save that output to a *.txt file.
How can I achieve this?

You can use the cut command like this:
cut -d':' -f1,2,4,8 file1.txt > file2.txt
According to the man page:
-d, --delimiter=DELIM
use DELIM instead of TAB for field delimiter
-f, --fields=LIST
select only these fields; also print any line that contains no delimiter character, unless the -s option is specified
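With the grep -n output above, field 1 is the line number, field 2 the login, field 4 the UID and field 8 the shell, so for the sample file1.txt, file2.txt should contain:
2:daemon:1:/usr/sbin/nologin
3:bin:2:/usr/sbin/nologin
4:sys:3:/usr/sbin/nologin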

I think you can use awk to get your specific fields:
1- Go to your terminal and make sure you know your file path.
2- Use an awk command, where -F indicates the field separator and $ indicates the field number.
your file1.txt contains:
2:daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
3:bin:x:2:2:bin:/bin:/usr/sbin/nologin
4:sys:x:3:3:sys:/dev:/usr/sbin/nologin
using the awk command in your terminal with file1.txt:
awk -F '[:/]' '{print $1 " " $2 " " $4 " " $(NF-2) " " $(NF-1) " " $NF}' file1.txt
make sure your file path is correct
$1, $2, ... = the field (column) number
$NF = the last column
$(NF-1) = the column before the last column
and so on
" " = a literal space
your output will be:
2 daemon 1 usr sbin nologin
3 bin 2 usr sbin nologin
4 sys 3 usr sbin nologin
you can put the delimiter back in the awk command instead of a space, for example:
awk -F '[:/]' '{print $1 ":" $2 ":" $4 ":/" $(NF-2) "/" $(NF-1) "/" $NF}' file1.txt
output:
2:daemon:1:/usr/sbin/nologin
3:bin:2:/usr/sbin/nologin
4:sys:3:/usr/sbin/nologin
Thanks, and have a good day.

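You can also do it with a single awk command, keeping : as the separator in the output: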
awk -F: '{print $1,$2,$4,$NF}' OFS=: file1.txt > newfile.txt
2:daemon:1:/usr/sbin/nologin
3:bin:2:/usr/sbin/nologin
4:sys:3:/usr/sbin/nologin
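Equivalently, FS and OFS can both be set in a BEGIN block (a minimal variant of the same command; any POSIX awk should accept either form):
awk 'BEGIN{FS=OFS=":"} {print $1,$2,$4,$NF}' file1.txt > newfile.txt
Here -F: / FS sets the input field separator, and OFS is what the commas in print become on output.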

Related

How to shorten awk command and add leading zeros to the record count

I am adding header and trailer records to a file with awk commands. How can I shorten/consolidate the awk commands? Is it possible to avoid temp files and do all the manipulation on the original file, so the command gets shorter that way as well? Also, I need the trailer record count to be 10 bytes long with leading zeros. Here is the awk command I am using:
awk -v today="$(date +%Y%m%d)" \
-v last_day="$(date +"%Y%m%d" \
-d "$(date +%Y-%m-01) +1 month -1 day")" \
'BEGIN { print "ABCDEFG MC " today " FB XXX1 " today, last_day } 1' \
original_file.txt > original_fileH.txt
awk '{print} END{print "TRL" NR - 1}' \
original_fileH.txt > original_fileT.txt
rm original_file.txt
rm original_fileH.txt
mv original_fileT.txt original_file.txt
Here is file sample:
Record line 1
Record line 2
Here is the file result after awk execution:
ABCDEFG MC 20201007 FB XXX1 20201007 20201031
Record line 1
Record line 2
TRL3
If you just need to add a header and a trailer, you could just do the following:
awk -v today="$(date +%Y%m%d)" \
-v last_day="$(date -d "$(date +%Y-%m-01) +1 month -1 day" +%Y%m%d)" \
'BEGIN { print "ABCDEFG MC",today,"FB XXX1",today, last_day }
1; END {print "TRL", NR}' file1 > file2
mv file2 file1
If you have GNU awk, you can clean it up a bit more:
awk -i inplace \
'BEGIN{ today=strftime("%Y%m%d"); Y=substr(today,1,4); m=substr(today,5,2)
# day 0 of month m+1 normalizes to the last day of month m
last_day=strftime("%Y%m%d", mktime(Y" "m+1" 0 0 0 0"))
print "ABCDEFG MC",today,"FB XXX1",today, last_day }
1;END{print "TRL", NR}' file1
I didn't use your date functions for the sake of clarity, and also because the question here is more of an awk thing.
Basically for the file awk1 with contents:
Record line 1 Record line 2
We can use the following one-liner
cat awk1 | awk -v val1="foo" -v val2="bar" '{ print "START " val1 " MIDDLE " val2 " " $0 " END " NF }'
START foo MIDDLE bar Record line 1 Record line 2 END 6
Basically, you don't need BEGIN or END unless you're doing things like headers or summaries. Passing variables in at the right positions here should give you what you want.
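None of the commands above pads the trailer count, and the question asks for a 10-byte count with leading zeros; awk's printf can supply that (a minimal sketch of just the trailer step, assuming the count covers data lines only):
awk '1; END{printf "TRL%010d\n", NR}' file1
For a 2-line file this appends TRL0000000002.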

Joining consecutive lines using awk

How can I join consecutive lines into a single line using awk? Currently I have this awk command:
awk -F "\"*;\"*" '{if (NR!=1) {print $2}}' file.csv
It removes the first line and prints the second field, giving me:
44895436200043
38401951900014
72204547300054
38929771400013
32116464200027
50744963500014
I want to have this:
44895436200043 38401951900014 72204547300054 38929771400013 32116464200027 50744963500014
That's a job for tr:
# tail -n +2 prints the whole file from line 2 on
# tr '\n' ' ' translates newlines to spaces
tail -n +2 file | tr '\n' ' '
With awk, you can achieve this by changing the output record separator to " ":
# BEGIN{ORS= " "} sets the internal output record separator to a single space
# NR!=1 adds a condition to the default action (print)
awk 'BEGIN{ORS=" "} NR!=1' file
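One caveat: with ORS=" " the output ends in a trailing space and has no final newline. If that matters, an END block can add one (a minimal sketch):
awk 'BEGIN{ORS=" "} NR!=1; END{printf "\n"}' file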
I assume you want to modify your existing awk so that it prints a horizontal, space-separated list instead of one value per row.
You can replace the print $2 action in your command like this:
awk -F "\"*;\"*" 'NR!=1{u=u s $2; s=" "} END {print u}' file.csv
or replace the ORS (output record separator)
awk -F "\"*;\"*" -v ORS=" " 'NR!=1{print $2}' file.csv
or pipe output to xargs:
awk -F "\"*;\"*" 'NR!=1{print $2}' file.csv | xargs

unix concatenate list of files into one line

In a directory, there are several files, such as:
file1
file2
file3
Is there a simple way to concatenate those file names into one line (joined by "OR") in bash, as follows:
file1 OR file2 OR file3
Or do I need to write a script for it?
You can use this function to print all filenames (including ones with spaces, newlines or special characters) with " OR " as the separator (assuming no filename contains ASCII code 4):
orfiles() {
local IFS=$'\4'
local out="$*"
echo "${out//$'\4'/ OR }"
}
Then call it as:
orfiles *
How it works:
We set IFS (Internal Field Separator) to ASCII 4 locally inside the function
We store the expansion of "$*" in the local variable out. This places \4 between the filenames in $out.
Finally, using bash string substitution, we globally replace \4 with " OR " while printing $out.
When expanding "$*", bash joins the parameters using only the first character of IFS, so it cannot insert the multi-character string " OR " directly; that is why this takes 2 steps as shown above.
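For the three files in the question, orfiles * should therefore print:
file1 OR file2 OR file3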
You can simply do that with:
printf '%s OR ' $(ls -1 *) | sed 's/ OR $//'; echo
where ls -1 * lists the files in the directory.
One thing that should be considered is that a filename could contain whitespace(s).
Use the following ls + awk solution:
ls -1 * | awk '{ r=(r)? r" OR "$0 : $0 }END{ print r }'
Workaround for filenames with newline(s):
echo -e $(ls -1b hello* | awk -v RS= '{gsub(/\n/," OR ",$0); gsub(/\\ /," ",$0); print $0}')
-b - ls option to print C-style escapes for nongraphic characters
ls -1|awk -v q='"' '{printf "%s%s", NR==1?"":" OR ", q $0 q}END{print ""}'
the ls & awk way to do it, with example that the filename containing spaces:
kent$ ls -1
file1
file2
'file with OR and space'
kent$ ls -1|awk -v q='"' '{printf "%s%s", NR==1?"":" OR ", q $0 q}END{print ""}'
"file1" OR "file2" OR "file with OR and space"
$ for f in *; do printf '%s%s' "$s" "$f"; s=" OR "; done; printf '\n'
file1 OR file2 OR file3
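Here $s is empty on the first iteration, so no leading separator is printed, and because the loop uses shell globbing instead of parsing ls, filenames with spaces are handled safely.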

extract string from a file using shell script

I have a file called log.txt. The file contents are like below:
/proc
used avail
10 100
How can I extract the strings below from that file using a shell script? I want the following strings to be extracted:
/proc
10
100
awk -v RS= '{print $1 "\n" $4 "\n" $5}' log.txt
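Setting RS to the empty string puts awk into paragraph mode: the whole block is read as a single record, so $1, $4 and $5 are /proc, 10 and 100.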
awk '/\/proc/ {print;getline;getline;print $1"\n"$2}' log.txt
The above awk command calls getline twice whenever a line matches /proc, so by the time the second print runs, $0 holds the second line after the match; that print then outputs its first and second fields on separate lines.
Output:
/proc
10
100
Using sed, and if it is spaces you have between 10 and 100:
sed -e '2d;3s/  */\n/' log.txt
If it is tabs you have between 10 and 100, and you have GNU sed:
sed -e '2d;3s/\t\t*/\n/' log.txt
If you do not have GNU sed, use the same command with real tabs instead of the two \t above.
Avoiding sed (or awk) and using standard UNIX utilities, and if it is spaces you have between 10 and 100:
paste -s -d " " file | tr -s " " | cut -d " " -f 1,4,5 | tr " " "\n"
if it is tabs you have between 10 and 100, and you have GNU utilities:
paste -s -d "\t" file | tr -s "\t" | cut -f 1,4,5 | tr "\t" "\n"
if it is tabs you have between 10 and 100 and you do not have GNU utilities, use the same commands with real tabs instead of the \t above and a real newline instead of the \n.
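For reference, here is what each stage of the space variant does:
paste -s -d " " file   # join all lines into one, space-separated
tr -s " "              # squeeze runs of spaces down to one
cut -d " " -f 1,4,5    # keep fields 1, 4 and 5
tr " " "\n"            # put each selected field on its own line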

how to get required field from a file on linux?

I have one file which contains three fields separated by two spaces. I need to get only the third field from the file. The file content is as in the following example:
kuldeep  Mirat  Shakti
balaji  salunke  pune
.
.
.
How can I get the third field?
To get the 3rd field, assuming you don't have any "embedded spaces", just
awk '{print $3}' file
awk by default uses runs of whitespace as the field delimiter. So even if you have 2 spaces or more, the 3rd field is always $3.
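For the sample above, this should print:
Shakti
pune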
However, if you want to be specific, then specify the field delimiter:
awk -F"  " '{print $3}' file
If you are open to other tools, a Ruby one:
ruby -F"  " -ane 'print $F[2]' file
ruby -ane 'print $F[2]' file
Update: If you need to get all fields after the 3rd:
awk -F"  " '{$1=$2=$3=""}1' OFS=" " file # add a pipe to `sed 's/^[ \t]*//'` if desired
ruby -F"  " -ane 'puts $F[3..-1].join(" ")' file
Use awk:
awk -F'  ' '{print $3}' file
Because the delimiter is two spaces, this also works if fields contain embedded single spaces.
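For example, with a (hypothetical) line where the first field contains a single space:
kuldeep kumar  Mirat  Shakti
the command still prints Shakti, since only double spaces count as separators.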
To get the third field of each line, pipe through awk, e.g.:
cat filename | awk '{print $3}'
If you just want to get the third field of the first line, use head, too:
cat filename | head -n 1 | awk '{print $3}'
Given #balaji's comment to #kurani's answer:
perl -pe 's/^.*?  .*?  //' filename
awk -F'  ' '{for(i=3; i<NF; i++) {printf("%s%s",$i,FS)}; print $NF}' filename
less filename | cut -d" " -f 3
