Searching for a specific hex string in multiple images on Linux

I am doing some research on image processing and I wanted to know if it's possible to search for a specific hex string/byte array in various images. It would be great if it gave me a list of the images that contain that specific string, basically what grep -r "" does. For some reason grep doesn't do the job. I am not familiar with strings; I did have a look at "man strings" but it didn't help much. Anyway, I would like to look for a specific hex string, say "0002131230443" (or even a specific byte array, i.e. a base64 string), in several images in the same folder. Any help would be highly appreciated.
I found the following command, which does exactly what I want, using xxd and grep:
xxd -p /your/file | tr -d '\n' | grep -c '22081b00081f091d2733170d123f3114'
FYI:
It'll return 1 if the content matches, 0 otherwise.
xxd -p converts the file to a plain hex dump, tr -d '\n' removes the newlines added by xxd, and grep -c counts the number of matching lines.
Does anyone know how to run the command above over a specific directory using a bash script? I have around 400 images, and I want it to return 400 (i.e. a count of matching files) if all 400 images contain that particular string. I found the script below, but it just runs the same code 400 times, printing either 0 or 1 each time:
#!/bin/bash
FILES=/FILEPATH/*
for f in $FILES
do
    echo "PROCESSING $f FILES"
    echo "-------------------"
    xxd -u "$f" | grep ABCD
    echo "-------------------"
done
Thanks guys.
Plasma33

With GNU grep:
#!/bin/bash
files=/FILEPATH/*
for f in $files
do
    grep -m 1 -P '\x22\x08\x1b\x00' < "$f"
done | wc -l
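If you specifically want the single count (e.g. 400 when every file matches), here is a minimal sketch that reuses the xxd pipeline from the question and counts the files that contain the pattern (the pattern and /FILEPATH are placeholders to adjust):
#!/bin/bash
# Count how many files in the directory contain the given hex string.
pattern='22081b00081f091d2733170d123f3114'
count=0
for f in /FILEPATH/*
do
    # xxd -p dumps plain hex; tr joins the wrapped lines so the pattern
    # can't be split across a line break; grep -q just reports a match.
    if xxd -p "$f" | tr -d '\n' | grep -q "$pattern"; then
        count=$((count + 1))
    fi
done
echo "$count"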

Related

Search, match and copy directories into another based on names in a txt file

My goal is to copy a batch of specific directories whose names are listed in a txt file, as follows:
$ cat names.txt
raw1
raw2
raw3
raw4
raw5
These directories have subdirectories, hence it is important to copy all of their contents. When I list the directory in my terminal, it looks like this:
$ ls -l
raw3
raw7
raw1
raw8
raw5
raw6
raw2
raw4
To perform this task, I have tried the following:
cat names.txt | while read line; do grep -l '$line' | xargs -r0 cp -t <desired_destination>; done
But I get this error:
cp: cannot stat No such file or directory
I suppose it's because the names in the file list (names.txt) are not in the same order as the ones in the terminal. Notice that they are unsorted, and using while read line doesn't work. Thank you for taking the time and commitment to help me.
I'm having problems following the logic of the current code, so in the name of K.I.S.S. I propose:
tgtdir=/my/target/directory
while read -r srcdir
do
    [[ -d "${srcdir}" ]] && cp -rp "${srcdir}" "${tgtdir}"
done < <(tr -d '\r' < names.txt)
NOTES:
the < <(tr -d '\r' < names.txt) is used to remove windows/dos line endings from names.txt (per comments from OP); if names.txt is updated to remove the \r characters, then the tr -d will be a no-op (i.e., a bit of overhead to spawn the subprocess, but the script will still read names.txt correctly)
assumes the script is run from the directory where the source directories reside; otherwise the code can be modified to either cd to said directory or prefix the ${srcdir} references with said directory
OP can add/modify the cp flags as needed, but I'm assuming at a minimum -r will be needed in order to recursively copy the directories
UUoC (Useless Use of cat).
cat names.txt | while read line; do ...; done
is better written
while read line; do ...; done < names.txt
The do grep -l '$line' | part is eating your input.
printf "%s\n" 1 2 3 |while read line; do echo "Read: [$line]"; grep . | cat; done
Read: [1]
2
3
In your case, grep is finding no lines that match the literal string $line, which you embedded in single-quote marks; single quotes prevent the variable from being expanded. Use "$line" with double quotes. Even if it did match, -l wouldn't be helpful here:
$: printf "%s\n" 1 2 3 | grep -l .
(standard input)
You didn't tell grep what to read from, so -l is pointless: it's reading the same stdin stream that read is.
I think what you want is a little simpler -
xargs cp -Rt /your/desired/target/directory/ < names.txt
Assuming you wanted to leave the originals where they were.
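If names.txt still has DOS line endings (as discussed in the other answer), a hedged variant that strips them first (assumes GNU cp for -t and names without whitespace):
tr -d '\r' < names.txt | xargs cp -Rt /your/desired/target/directory/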

grep - limit number of files read

I have a directory with over 100,000 files. I want to know if the string "str1" exists as part of the content of any of these files.
The command:
grep -l 'str1' * takes too long as it reads all of the files.
How can I ask grep to stop reading any further files if it finds a match? Any one-liner?
Note: I have tried grep -l 'str1' * | head but the command takes just as much time as the previous one.
Naming 100,000 files in your command args is going to cause a problem; it probably exceeds the shell's maximum command-line length.
But you don't have to name all the files if you use the recursive option with just the name of the directory the files are in (which is . if you want to search files in the current directory):
grep -l -r 'str1' . | head -1
Use grep -m 1 so that grep stops after finding the first match in a file. It is extremely efficient for large text files.
grep -m 1 str1 * /dev/null | head -1
If the glob expands to a single file, the /dev/null argument above ensures that grep still prints the file name in the output (grep omits file names when given only one file to search).
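To illustrate the /dev/null trick (only.txt is a hypothetical file name):
$ grep -m 1 str1 only.txt
str1 is on this line
$ grep -m 1 str1 only.txt /dev/null
only.txt:str1 is on this line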
If you want to stop after finding the first match in any file:
for file in *; do
    if grep -q -m 1 str1 "$file"; then
        echo "$file"
        break
    fi
done
The for loop also saves you from the "too many arguments" issue when a directory contains a very large number of files.
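A sketch that combines both ideas, using find so the 100,000 names never touch the shell's argument limit (assumes GNU grep; -maxdepth 1 keeps the search within the one directory):
find . -maxdepth 1 -type f -exec grep -l -m 1 'str1' {} + | head -1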

How to get lines that don't contain certain patterns

I have a file that contain many lines :
ABRD0455252003666
JLKS8568875002886
KLJD2557852003625
.
.
.
AION9656532007525
BJRE8242248007866
I want to extract the lines that do not start with ABRD or AION but do have the numbers 003 or 007 in columns 12 to 14.
The output should be
KLJD2557852003625
BJRE8242248007866
I have tried this and it works, but it is too long a command and I want to optimise it for performance reasons:
egrep -a --text '^.{12}(?:003|007)' file.txt > result.txt |touch results.txt && chmod 777 results.txt |egrep -v -a --text "ABRD|AION" result.txt > result2.text
The -a option is a non-standard extension for dealing with binary files; it is not needed for text files.
grep -E '^.{11}(003|007)' file.txt | grep -Ev '^(ABRD|AION)'
The first stage matches any line with either 003 or 007 in the twelfth through fourteenth columns.
The second stage filters out any line starting with either ABRD or AION.
You really need to read a regexp tutorial, but meanwhile try this (GNU grep with -P, so a negative lookahead can exclude the ABRD/AION lines in a single pass):
grep -P '^(?!ABRD|AION).{11}00[37]' file.txt

How to check if a file contains only zeros in a Linux shell?

How can I check whether a large file contains only zero bytes ('\0') in Linux using a shell command? I could write a small program for this, but that seems like overkill.
If you're using bash, you can use read -n 1 to exit early if a non-NUL character has been found:
<your_file tr -d '\0' | read -n 1 || echo "All zeroes."
where you substitute the actual filename for your_file.
The "file" /dev/zero returns a sequence of zero bytes on read, so a cmp file /dev/zero should give essentially what you want (reporting the first different byte just beyond the length of file).
If you have Bash,
cmp file <(tr -dc '\000' <file)
If you don't have Bash, the following should be POSIX (but I guess there may be legacy versions of cmp which are not comfortable with reading standard input):
tr -dc '\000' <file | cmp - file
Perhaps more economically, assuming your grep can read arbitrary binary data,
tr -d '\000' <file | grep -q -m 1 ^ || echo All zeros
I suppose you could tweak the last example even further with a dd pipe to truncate any output from tr after one block of data (in case there are very long sequences without newlines), or even down to one byte. Or maybe just force there to be newlines.
tr -d '\000' <file | tr -c '\000' '\n' | grep -q -m 1 ^ || echo All zeros
It won't win a prize for elegance, but:
xxd -p file | grep -qEv '^(00)*$'
xxd -p prints a file in the following way:
23696e636c756465203c6572726e6f2e683e0a23696e636c756465203c73
7464696f2e683e0a23696e636c756465203c7374646c69622e683e0a2369
6e636c756465203c737472696e672e683e0a0a766f696420757361676528
63686172202a70726f676e616d65290a7b0a09667072696e746628737464
6572722c202255736167653a202573203c
So we grep to see if there is a line that is not made up entirely of 0's, which means the file contains a byte different from '\0'. If there is no such line, the file consists entirely of zero bytes.
(The return code signals which case occurred; I assumed you wanted this for a script. If not, tell me and I'll write something else.)
EDIT: added -E for grouping and -q to discard output.
Straightforward:
if [ -n "$(tr -d '\000' < file | head -c 1)" ]; then
    echo a nonzero byte
fi
The tr -d removes all NUL bytes. If any bytes are left, the [ -n ... ] test sees a nonempty string. (Note the quotes around the command substitution; without them, [ -n ] would always be true.)
Completely changed my answer based on the reply here
Try
perl -0777ne'print /^\x00+$/ ? "yes" : "no"' file
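A quick sanity check of the one-liner (zeros.bin is a hypothetical file built from /dev/zero):
$ head -c 1024 /dev/zero > zeros.bin
$ perl -0777ne'print /^\x00+$/ ? "yes" : "no"' zeros.bin
yes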

Bash Sorting STDIN

I want to write a bash script that sorts the input into different files according to rules. The first rule is to write all chars or strings to file1. The second rule is to write all numbers to file2. The third rule is to write all alphanumerical strings to file3. All special chars must be ignored. Because I am not familiar with bash, I don't know how to realize this.
Could someone help me?
Thanks,
Haniball
Thanks for the answers. I wrote this script:
#!/bin/bash
inp=0
echo "Which filename for strings?"
read strg
touch $strg
echo "Which filename for nums?"
read nums
touch $nums
echo "Which filename for alphanumerics?"
read alphanums
touch $alphanums
while [ "$inp" != "quit" ]
do
    echo "Input: "
    read inp
    echo $inp | grep -o '\<[a-zA-Z]+>' > $strg
    echo $inp | grep -o '\<[0-9]>' > $nums
    echo $inp | grep -o -E '\<[0-9]{2,}>' > $nums
done
After I ran it, it only writes strings to the string file.
Greetings, Haniball
Sure can help. See here:
How To Ask Questions The Smart Way
Help Vampires: A Spotter’s Guide
A cool site about bash is here: http://wiki.bash-hackers.org/doku.php
for sorting try man sort
for pattern matching try man grep
other useful tools: man sed man awk man strings man tee
And it is always correct to tag your homework as "homework" ;)
You can try something like:
<input_file strings -1 -a | tee chars_and_strings.txt |\
grep "^[A-Za-z0-9][A-Za-z0-9]*$" | tee alphanum.txt |\
grep "^[0-9][0-9]*$" > numonly.txt
The above is for ASCII only; with international (read: Unicode) characters things become a little more complicated.
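For example, pushing a tiny hypothetical sample through the pipeline:
$ printf 'abc\n123\n%%\n' > input_file
$ <input_file strings -1 -a | tee chars_and_strings.txt |\
grep "^[A-Za-z0-9][A-Za-z0-9]*$" | tee alphanum.txt |\
grep "^[0-9][0-9]*$" > numonly.txt
$ cat numonly.txt
123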
grep is sufficient (your question is a bit vague. If I got something wrong, let me know...)
Using the following input file:
this is a string containing words,
single digits as in 1 and 2 as well
as whole numbers 42 1066
all chars or strings
$ grep -o '\<[a-zA-Z]\+\>' sorting_input
this
is
a
string
containing
words
single
digits
as
in
and
as
well
all single digit numbers
$ grep -o '\<[0-9]\>' sorting_input
1
2
all multiple digit numbers
$ grep -o -E '\<[0-9]{2,}\>' sorting_input
42
1066
Redirect the output to a file, i.e. grep ... > file1
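Putting those corrected patterns back into the script from the question gives a sketch like the following; note >> (append) instead of >, so each loop iteration doesn't overwrite the previous one:
#!/bin/bash
read -p "Which filename for strings? " strg
read -p "Which filename for nums? " nums
inp=""
while [ "$inp" != "quit" ]
do
    read -p "Input: " inp
    echo "$inp" | grep -o '\<[a-zA-Z]\+\>' >> "$strg"    # words
    echo "$inp" | grep -o -E '\<[0-9]+\>' >> "$nums"     # numbers
done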
Bash really isn't the best language for this kind of task. While it is possible, I'd highly recommend the use of perl, python, or tcl for this.
That said, you can write all of stdin to a temporary file with shell redirection. Then use a command like grep to write the matches to another file. It might look something like this:
#!/bin/bash
# Save stdin to a temporary file, then grep it once per pattern.
cat > temp
grep pattern1 temp > file1
grep pattern2 temp > file2
grep pattern3 temp > file3
rm -f temp
Then run it like this:
cat file_to_process | ./script.sh
I'll leave the specifics of the pattern matching to you.
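For completeness, one hedged way to fill in those patterns, reusing the greps shown earlier in this thread (temp is the temporary file from the script above; the loose alphanumeric pattern also matches pure words and pure numbers, so tighten it if that matters):
grep -o '\<[a-zA-Z]\+\>' temp > file1          # words only
grep -o -E '\<[0-9]+\>' temp > file2           # numbers only
grep -o -E '\<[a-zA-Z0-9]+\>' temp > file3     # alphanumeric tokens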
