Linux counting words in random characters


I have generated files of random characters drawn from A-Z and a-z, in different sizes (for example 10,000 or 1,000,000 characters). I would like to count how many times the word 'cat' or 'dog' appears in them. Could someone provide a Linux command (grep ... | wc ..., or any other command) that can handle this task?

grep has a -c option that counts the number of matching lines (not the number of individual matches).
So
grep -c "cat\|dog" <file name>
reports how many lines contain 'cat' or 'dog'. Add -i if you want a case-insensitive count.

You can use grep with the flag -o. For example:
grep -o "dog\|cat" <filename> | wc -l
About the flag -o, according to man grep: «Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.»
This solution works in several situations: text spread over multiple lines or on a single line, the word surrounded by whitespace or by other characters, etc.
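The difference between counting lines and counting matches can be seen in a small sketch. The sample string below is made up for illustration (it stands in for a file of random letters written as one long line), and -E is used here so that the alternation can be written as cat|dog:

```shell
# Hypothetical sample: a single long line, as a random-letter file would likely be.
sample=$(mktemp)
printf 'xqcatzzdogpkcatmm\n' > "$sample"

# -c counts matching LINES: the whole file is one line here.
line_count=$(grep -cE 'cat|dog' "$sample")

# -o prints each match on its own line, so wc -l counts occurrences.
match_count=$(grep -oE 'cat|dog' "$sample" | wc -l)

echo "lines: $line_count, matches: $match_count"
rm -f "$sample"
```

On this sample the first count should be 1 and the second 3, which is why grep -o ... | wc -l is the better fit for a file that has few or no line breaks.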

Related

How to count exact match of certain patterns in a text file using linux shell command?

I want to count the occurrences of a certain pattern in a text file that also contains many other, similar patterns, using a Linux shell command.
I have a text file which contains the patterns below:
[--------------]
[+--------------+]
[+----------+------------+--------------------+]
[+---------------------+---------------------+]
How can I find the exact count of only the first pattern, [--------------]?
Note: the square brackets are not part of the pattern. Only the special characters inside the square brackets are the pattern.

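cat ./file | sed -e 's/\]/\]\n/g' | grep -c "\[--------------\]"
cat reads the file
sed replaces every ] with ]\n, so each bracketed pattern ends up on its own line (the g flag is needed to handle more than one ] per line; \n in the replacement works in GNU sed)
grep -c searches every line for your expression and prints the number of matching lines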
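As an alternative sketch that avoids the sed step, grep -o can print each occurrence on its own line and wc -l can count them. The sample file below is hypothetical (its last line holds two occurrences on purpose), and -F is an assumption of convenience: it makes grep treat the pattern as a fixed string, so the brackets need no escaping.

```shell
# Hypothetical sample file; the last line contains the target twice.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[--------------]
[+--------------+]
[+----------+------------+--------------------+]
[--------------] [--------------]
EOF

# -F: fixed-string match, -o: one output line per occurrence.
bracket_count=$(grep -oF '[--------------]' "$sample" | wc -l)

echo "$bracket_count"
rm -f "$sample"
```

The lines that start with [+ are not counted, because the fixed string requires the dashes to follow the opening bracket directly.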

grep the files that contain specific word and don't contain those words

I need to use the grep command to get all the files that contain the word [form]
and, at the same time, do not contain the words [head] or [cfGenerate].

GNU grep with PCRE:
grep -P '^(?!.*head)(?!.*cfGenerate).*form'
Negative lookaheads can be combined to make the match fail if any unwanted pattern occurs on the same line. Note that grep applies this test line by line, not to the file as a whole.
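If the condition is meant per file rather than per line, one possible approach (a sketch, not part of the answer above) is to chain two grep calls: -l lists files that contain a match, and -L lists files that contain none. The directory and file names below are hypothetical, -L is a common extension rather than POSIX, and the xargs step assumes file names without whitespace.

```shell
# Hypothetical test files in a temporary directory.
dir=$(mktemp -d)
printf '<form>\n' > "$dir/a.txt"            # form only: wanted
printf '<head>\n<form>\n' > "$dir/b.txt"    # form and head on different lines
printf '<cfGenerate>\n' > "$dir/c.txt"      # no form at all

# Step 1: list files containing "form".
# Step 2: of those, keep files with no line matching head or cfGenerate.
result=$(grep -l 'form' "$dir"/* | xargs grep -L -e 'head' -e 'cfGenerate')

echo "$result"
rm -rf "$dir"
```

Here b.txt is excluded even though its form line does not itself contain head, which is the case a purely line-based pattern would miss.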

Using grep to get 12 letter alphabet only lines

Using grep
How many 12 letter - alphabet only lines are in testing.txt?
excerpt of testing.txt
tyler1
Tanktop_Paedo
xyz2#geocities.com
milt#uole.com
justincrump
cranges10
namer#uole.com
soulfunkbrotha
timetolearnz
hotbooby#geocities.com
Fire_Crazy
helloworldad
dingbat#geocities.com
From this excerpt, I want to get a result of 2 (helloworldad and timetolearnz).
I want to check every line and grep only those that have 12 characters in each line. I can't think of a way to do this with grep though.
For the alphabet only, I think I can use
grep [A-Za-z] testing.txt
However, how do I make it so only the characters [A-Za-z] show up in those 12 characters?

You can do it with extended regex -E and by specifying that the match is exactly {12} characters from start ^ to finish $
$ grep -E "^[A-Za-z]{12}$" testing.txt
timetolearnz
helloworldad
Or if you want to get the count -c of the lines you can use
$ grep -cE "^[A-Za-z]{12}$" testing.txt
2

grep supports whole-line match and counting, e.g.:
grep -xc '[[:alpha:]]\{12\}' testing.txt
Output:
2
The [:alpha:] character class matches alphabetic characters; in the C/POSIX locale that is the same as [A-Za-z]. See section 3.2 of the info pages: info grep 'Regular Expressions' 'Character Classes and Bracket Expressions' for more on this subject. Or look it up in the PDF manual online.

How to print the longest word in a file by using combination of grep and wc

I am trying to find the longest word in a text file.
I was able to find the number of characters in the longest word in the file
by using the command
wc -L
I need to print the longest word by using this number and the grep command.

If you must use the two commands given, I'd suggest:
grep -E ".{$(wc -L < test.txt)}" test.txt
The command substitution is used to build the correct brace expression to match the line(s) with exactly the given number of characters. -E is needed to enable extended regular expression support; otherwise, the braces need to be escaped: grep ".\{...\}" test.txt.
Using an awk command that makes a single pass through the file may be faster.
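A minimal sketch of such a single-pass awk command, assuming one word per line as in the wc -L approach (the sample words are made up for illustration):

```shell
# Hypothetical sample file with one word per line.
sample=$(mktemp)
printf 'cat\nelephant\ndog\n' > "$sample"

# Remember the longest line seen so far and print it once the input ends.
longest=$(awk 'length($0) > max { max = length($0); line = $0 } END { print line }' "$sample")

echo "$longest"
rm -f "$sample"
```

Because the comparison is strict, only the first of several equally long lines is printed; the grep command above would print all of them.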

Find the number of occurrences of certain string sequences

I want to count the number of occurrences of the IP address 192.168.1.10 in a text file using grep | wc.
The command I use is:
cat ./capture.txt|grep "192.168.1.10"|wc -w
which returns 0, and I don't know why.
Here is the content of my .txt file:

Give this a try:
grep -Fwo '192.168.1.10' file|wc -l
-F makes grep take your pattern as a literal string instead of a regex
-w excludes 192.168.1.101 or 192.168.1.100
-o prints each match on its own line. grep matches line by line, so without -o a line that contains the pattern twice would be counted only once and the occurrence count could be wrong.

cat ./capture.txt | grep "\b192\.168\.1\.10\b" -c
\. search for dot, not any character
\b match at the beginning or end of a word
-c returns the number of matching lines (not the number of individual occurrences)
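The two counting styles above can give different numbers. In the following sketch the sample capture data is invented for illustration: the first line contains the address twice, and the second line contains a longer address that -w should reject.

```shell
# Hypothetical capture data.
sample=$(mktemp)
printf '192.168.1.10 -> 192.168.1.10\n192.168.1.101\n192.168.1.10\n' > "$sample"

# Number of lines that contain the address as a whole word.
line_count=$(grep -cFw '192.168.1.10' "$sample")

# Number of individual occurrences: one output line per match.
match_count=$(grep -oFw '192.168.1.10' "$sample" | wc -l)

echo "lines: $line_count, occurrences: $match_count"
rm -f "$sample"
```

On this data the line count should be 2 and the occurrence count 3, so the right command depends on whether lines or occurrences are wanted.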
