Compute checksum on file from command line - linux

Looking for a command or set of commands that are readily available on Linux distributions that would allow me to create a script to generate a checksum for a file.
This checksum is generated by a build system I have no control over by summing every single byte in the file and then truncating that number to 4 bytes.
I know how to do this using tools like node.js, perl, python, C/C++, etc., but I need to be able to do this on a bare-bones Linux distribution running remotely that I can't modify (it's on a PLC).
Any ideas? I've been searching for a while and haven't found anything that looks straightforward yet.

The following script sums the file byte by byte and truncates the total to 4 bytes using only basic shell commands (bash plus xxd).
#!/usr/bin/env bash
# Read every byte of INPUT_FILE as a two-digit hex string, one array element per byte.
bytes=( $(xxd -p -c 1 INPUT_FILE) )
total=0
for (( i = 0; i < ${#bytes[@]}; i++ )); do
    # Add the byte value and keep only the low 32 bits (2^32-1 = 4294967295).
    total=$(( (total + 0x${bytes[i]}) & 4294967295 ))
done
echo "Checksum: $total"

If you just want to sum the file byte by byte and truncate the result to 4 bytes, the following command can be used (it needs GNU awk for strtonum() and and()).
xxd -p -c 1 <Input file> | gawk '{s += strtonum("0x" $1); if (s > 4294967295) s = and(4294967295, s)} END {print s}'
The xxd command extracts a hex dump of the input file, one byte per line, and each byte is converted from hex and added to the sum. If the sum exceeds 2^32-1 = 4294967295, a bitwise AND is performed to truncate it to 32 bits.
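On a target where xxd or GNU awk is missing, the same byte sum can probably be obtained with od and plain awk. The following is a minimal sketch, assuming od accepts -An, -v and -tu1 (GNU coreutils does; a stripped-down BusyBox build may not). The name byte_sum32 is a hypothetical helper, not something taken from the build system.

```shell
# byte_sum32 FILE: print the sum of all bytes in FILE, truncated to 32 bits.
# Hypothetical helper; assumes od accepts -An -v -tu1.
byte_sum32() {
    # od prints each byte as an unsigned decimal number; awk adds them up
    # modulo 2^32 so the running total always fits in 4 bytes.
    od -An -v -tu1 "$1" |
        awk '{ for (i = 1; i <= NF; i++) s = (s + $i) % 4294967296 }
             END { printf "%.0f\n", s + 0 }'
}
```

Call it as byte_sum32 INPUT_FILE. Taking the total modulo 2^32 gives the same result as the bitwise AND with 4294967295, and printing with %.0f avoids awk implementations that clamp %d to a signed 32-bit range.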

Have you tried cksum? I use it inside a few scripts. It's very simple to use.
http://linux.die.net/man/1/cksum

Related

NULL (\0) added at the end of file

I'm trying to clean a binary file by deleting all the NULs in it. The task is quite simple, but I found that a lot of files have a NUL at the end of the file and I don't know why. I'm dumping the hexadecimal value of each byte and I don't see the NUL anywhere, but if I do a hexdump of the file, I see a value 00 at the end and I don't know why. It could be an EOF, but that's weird because it doesn't appear in all files. This is the script I have, quite a simple one: it generates 100 random binary files, and then reads them file by file, char by char. Following the premise that bash won't store NULs in variables, rewriting each char after storing it in a variable should remove the NULs, but it doesn't.
#!/bin/bash
for i in $(seq 0 100)
do
echo "$i %"
time dd if=/dev/urandom of=$i bs=1 count=1000
while read -r -n 1 c;
do
echo -n "$c" >> temp
done < $i
mv temp $i
done
I also tried with:
tr -d '\000' <inFile > outfile
But same result.
This is how the hexdump of one of the files with this problem looks:
00003c0 0b12 a42b cb50 2a90 1fd6 a4f9 89b4 ddb6
00003d0 3fa3 eb7e 00c4
c4 should be the last byte, but as you can see, there's a 00 there.
Any clue?
EDIT:
Forgot to mention that the machine where I'm running this is something similar to a Raspberry Pi and the tools provided with it are quite limited.
Try these other commands:
od -tx1 inFile
xxd inFile
By default hexdump prints 16-bit words, so when the file size is an odd number of bytes it pads the last word with 00; that byte is not in the file.
It seems hexdump without options behaves like -x; hexdump -h gives the list of options; hexdump -C may also help.
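To confirm whether a 00 shown by a dump is really in the file, it may help to count the NUL bytes and check the file size directly. This is a small sketch using only tr and wc; count_nul_bytes and file_size are hypothetical helper names.

```shell
# count_nul_bytes FILE: print how many NUL (0x00) bytes FILE really contains.
count_nul_bytes() {
    # tr -cd deletes everything except NUL bytes; wc -c counts what is left.
    tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# file_size FILE: print the size of FILE in bytes.
# An odd size is what makes a word-oriented dump show a padding 00.
file_size() {
    wc -c < "$1" | tr -d ' '
}
```

If count_nul_bytes prints 0 for a file whose dump ends in 00 and file_size prints an odd number, the trailing 00 is only padding added by the dump format.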

Created a 100-byte file containing zeros, but I can't see the zeros

I want to create a 100 Byte file of zeroes.
I used the script:
dd if=/dev/zero of=zero_file_100B bs=50 count=2
It works fine, but when I do cat zero_file_100B it does not print anything. What might be the reason for that?
The reason is that the character with ASCII code 0 has no graphical representation in your terminal emulator, which is the expected behavior. You shouldn't confuse it with the character '0', which has ASCII code 48.
To view binary data in a portable way, you might use od:
od -v -t u1 zero_file_100B
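The difference between the NUL byte and the character '0' can be made visible with a short sketch like the one below, assuming od accepts -An -v -tu1. The name byte_values is a hypothetical helper.

```shell
# byte_values: read bytes on standard input and print their decimal values
# on one line, separated by single spaces.
byte_values() {
    od -An -v -tu1 |
        awk '{ for (i = 1; i <= NF; i++) { printf "%s%s", sep, $i; sep = " " } }
             END { print "" }'
}
```

For example, { printf '\000'; printf '0'; } | byte_values prints 0 48: the first byte is the invisible NUL, the second is the printable digit.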
but when I do cat zero_file_100B it does not print anything
perhaps you were looking for this
$ printf "%0*d" 100 0 > 38650594
$ ls -l 38650594
-rw-rw-r-- 1 me me 100 Jul 29 10:32 38650594 # Size is 100 bytes.
What do you want with zero?
You already succeeded with \0; you can print '0' with
for i in {1..100}; do printf 0; done
Slower than other solutions, but you can change this for other requirements:
# create file with size 100 only containing strings "zero"
for i in {1..25}; do printf "zero"; done
# create file with size 100 only containing strings "zeros"
for i in {1..20}; do printf "zeros"; done

Bash script that prints out contents of a binary file, one word at a time, without xxd

I'd like to create a BASH script that reads a binary file, word (32-bits) by word and pass that word to an application called devmem.
Right now, I have:
...
for (( i=0; i<${num_words}; i++ ))
do
val=$(dd if=${file_name} skip=${i} count=1 bs=4 2>/dev/null)
echo -e "${val}" # Weird output...
devmem ${some_address} 32 ${val}
done
...
${val} has some weird (ASCII?) format character representations that look like a diamond with a question mark.
If I replace the "val=" line with:
val=$(dd ... | xxd -r -p)
I get my desired output.
What is the easiest way of replicating the functionality of xxd using BASH?
Note: I'm working on an embedded Linux project where the requirements don't allow me to install xxd.
This script is performance driven, please correct me if I'm wrong in my approach, but for this reason I chose (dd -> binary word -> devmem) instead of (hexdump -> file, parse file -> devmem).
- Regardless of the optimal route for my end goal, this exercise has been giving me some trouble and I'd very much appreciate someone helping me figure out how to do this.
Thanks!
As I see it, you shouldn't use '|' or echo, because they are both ASCII tools. Instead, I think '>' could work for you.
I think devmem is a bash function or alias, so I would try something like this:
for (( i=0; i<${num_words}; i++ ))
do
dd if=${file_name} skip=${i} count=1 bs=4 2>/dev/null 1> binary_file
# echo -e "${val}" # Weird output...
devmem ${some_address} 32 $(cat binary_file)
done
"As cat simply catenates streams of bytes, it can be also used to concatenate binary files, where it will just concatenate sequence of bytes." wiki
Or you can alter devmem to accept file as input...I hope this will help!
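Another possible route, when xxd is unavailable but od is, is to let od convert the whole file to 32-bit hex words in one pass and feed those to devmem. This is only a sketch: it assumes od accepts -An -v -tx4, and od reads each word in the host's byte order, which may or may not match what the target expects. The name words_hex is a hypothetical helper.

```shell
# words_hex FILE: print every 32-bit word of FILE as a 0x-prefixed hex value,
# one per line, in host byte order. Assumes od accepts -An -v -tx4.
words_hex() {
    od -An -v -tx4 "$1" |
        tr -s ' ' '\n' |
        sed -n 's/^\([0-9a-f]\{8\}\)$/0x\1/p'
}

# Possible use with the loop from the question (not run here):
#   words_hex "$file_name" | while read -r val; do
#       devmem "$some_address" 32 "$val"
#   done
```

Running od once over the file avoids starting a new dd process for every word, which should matter if the script is performance driven.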

How to loop an executable command in the terminal in Linux?

Let me first describe my situation. I am working on a Linux platform and have a collection of .bmp files whose picture numbers increase by one from filename0022.bmp up to filename0680.bmp, so a total of 659 pictures. I want to be able to run each of these pictures through a .exe file that operates on the picture and then writes the result to a file specified by the user; it also takes some threshold arguments: lower, upper. So the typical call for the executable is:
./filter inputfile outputfile lower upper
Is there a way that I can loop this call over all the files just from the terminal or by creating some kind of bash script? My problem is similar to this: Execute a command over multiple files with a batch file but this time I am working in a Linux command line terminal.
You may be interested in looking into bash scripting.
You can execute commands in a for loop directly from the shell.
A simple loop generates the numbers you specifically mentioned. For example, from the shell:
user@machine $ for i in {22..680} ; do
> echo "filename${i}.bmp"
> done
This will give you a list from filename22.bmp to filename680.bmp. That simply handles the iteration over the range you mentioned; it doesn't cover zero-padding the numbers. To do this you can use printf. The printf syntax is printf format argument. We can use the $i variable from our previous loop as the argument and apply the %Wd format, where W is the width. Prefixing the width with 0 pads the number with leading zeros. Example:
user@machine $ for i in {22..680} ; do
> echo "filename$(printf '%04d' $i).bmp"
> done
In the above, $() is command substitution: it runs the enclosed command and is replaced by that command's output, as opposed to a variable, which holds a predefined value.
This should now give you the filenames you had specified. We can take that and apply it to the actual application (the output file name used here is only an example):
user@machine $ for i in {22..680} ; do
> ./filter "filename$(printf '%04d' $i).bmp" "filtered$(printf '%04d' $i).bmp" lower upper
> done
This can be rewritten to form one line:
user@machine $ for i in {22..680} ; do ./filter "filename$(printf '%04d' $i).bmp" "filtered$(printf '%04d' $i).bmp" lower upper ; done
One thing to note from the question: .exe files are generally Windows executables in PE (COFF-based) format, whereas Linux expects an ELF-format executable.
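The zero-padded naming and the loop can also be wrapped in small functions, which keeps the padding logic in one place. This is a sketch under assumptions: bmp_name and filter_range are hypothetical helper names, the filtered_ output prefix is made up for illustration, and ./filter is assumed to take its arguments in the order given in the question.

```shell
# bmp_name N: print the zero-padded file name for picture number N.
# Hypothetical helper; pass N without leading zeros (22, not 022).
bmp_name() {
    printf 'filename%04d.bmp\n' "$1"
}

# filter_range FIRST LAST LOWER UPPER: run ./filter on every picture in the
# range, skipping numbers whose input file does not exist.
filter_range() {
    i=$1
    while [ "$i" -le "$2" ]; do
        src=$(bmp_name "$i")
        if [ -f "$src" ]; then
            # The filtered_ prefix for the output name is an assumption.
            ./filter "$src" "filtered_$src" "$3" "$4"
        fi
        i=$((i + 1))
    done
}
```

A call would look like filter_range 22 680 lower upper, with lower and upper replaced by the actual threshold values.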
here is a simple example:
for i in {1..100}; do echo "Hello Linux Terminal"; done
to append to a file:(>> is used to append, you can also use > to overwrite)
for i in {1..100}; do echo "Hello Linux Terminal" >> file.txt; done
You can try something like this...
#!/bin/bash
# Use 22, not 022: a leading zero makes bash read the number as octal.
for ((a=22; a <= 680; a++))
do
printf "./filter filename%04d.bmp outputfile lower upper\n" $a | sh
done
You can leverage xargs for iterating:
ls | xargs -I {} ./filter {} {}_out lower upper
Note:
{} corresponds to one line of output from the pipe; here it's the input file name.
Output files would be named with the suffix '_out'.
You can test that AS-IS in your shell :
for i in *; do
echo "$i" | tr '[:lower:]' '[:upper:]'
done
If you have a special path, replace * with your path plus a glob, for example:
for i in /home/me/*.exe; do ...
See http://mywiki.wooledge.org/glob
This will prefix the names of the output images with filtered_, giving names like filtered_filename0055.bmp:
for i in *; do
./filter "$i" "filtered_$i" lower upper
done

How to create a hex dump of file containing only the hex characters without spaces in bash?

How do I create an unmodified hex dump of a binary file in Linux using bash? The od and hexdump commands both insert spaces in the dump and this is not ideal.
Is there a way to simply write a long string with all the hex characters, minus spaces or newlines in the output?
xxd -p file
Or if you want it all on a single line:
xxd -p file | tr -d '\n'
Format strings can make hexdump behave exactly as you want it to (no whitespace at all, byte by byte):
hexdump -ve '1/1 "%.2x"'
1/1 means "each format is applied once and takes one byte", and "%.2x" is the actual format string, like in printf. In this case: 2-character hexadecimal number, leading zeros if shorter.
It seems to depend on the details of the version of od. On OSX, use this:
od -t x1 -An file |tr -d '\n '
(That's print as type hex bytes, with no address. And whitespace deleted afterwards, of course.)
Perl one-liner:
perl -e 'local $/; print unpack "H*", <>' file
The other answers are preferable, but for a pure Bash solution, I've modified the script in my answer here to be able to output a continuous stream of hex characters representing the contents of a file. (Its normal mode is to emulate hexdump -C.)
I think this is the most widely supported version (requiring only POSIX defined tr and od behavior):
cat "$file" | od -v -t x1 -A n | tr -d ' \n'
This uses od to print each byte as hex, without addresses and without skipping repeated bytes, and tr to delete all spaces and linefeeds in the output. Note that not even a trailing linefeed is emitted here. (The cat is intentional, to allow multicore processing where cat can wait for the filesystem while od is still processing the previously read part. Single-core users may want to replace that with < "$file" od ... to save starting one additional process.)
TL;DR:
$ od -t x1 -A n -v <empty.zip | tr -dc '[:xdigit:]' && echo
504b0506000000000000000000000000000000000000
$
Explanation:
Use the od tool to print single hexadecimal bytes (-t x1) --- without address offsets (-A n) and without eliding repeated "groups" (-v) --- from empty.zip, which has been redirected to standard input. Pipe that to tr which deletes (-d) the complement (-c) of the hexadecimal character set ('[:xdigit:]'). You can optionally print a trailing newline (echo) as I've done here to separate the output from the next shell prompt.
References:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/od.html
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/tr.html
This code produces a "pure" hex dump string and it runs faster than all the other examples given.
It has been tested on 1GB files filled with binary zeros and on files consisting entirely of linefeeds.
It is not data content dependent and reads 1MB records instead of lines.
perl -pe 'BEGIN{$/=\1e6} $_=unpack "H*"'
Dozens of timing tests show that for 1GB files, these other methods below are slower.
All tests were run writing output to a file which was then verified by checksum.
Three 1GB input files were tested: all bytes, all binary zeros, and all LFs.
hexdump -ve '1/1 "%.2x"' # ~10x slower
od -v -t x1 -An | tr -d "\n " # ~15x slower
xxd -p | tr -d \\n # ~3x slower
perl -e 'local $/; print unpack "H*", <>' # ~1.5x slower
- this also slurps the whole file into memory
To reverse the process:
perl -pe 'BEGIN{$/=\1e6} $_=pack "H*",$_'
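Where perl is not available, the reverse direction can be sketched in plain shell with tr, fold and printf. This is much slower than the perl one-liner and is only meant for small inputs; hex_to_bin is a hypothetical helper name, and the sketch assumes the shell's printf accepts 0x-prefixed numbers and the %b conversion.

```shell
# hex_to_bin: read a continuous hex string on standard input and write the
# corresponding raw bytes to standard output.
hex_to_bin() {
    # Strip whitespace, split into two-character chunks, then emit each
    # chunk as one byte through an octal escape.
    tr -d ' \n' | fold -w 2 | while read -r hh || [ -n "$hh" ]; do
        printf '%b' "\\0$(printf '%03o' "0x$hh")"
    done
}
```

For example, printf '414243' | hex_to_bin writes the three bytes ABC. Because it starts a subshell per byte, this is better suited to short strings than to the 1GB files measured above.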
You can use Python for this purpose:
python -c "print(open('file.bin','rb').read().hex())"
...where file.bin is your filename.
Explanation:
Open file.bin in rb (read binary) mode.
Read contents (returned as bytes object).
Use the bytes method .hex() (Python 3.5 or later), which returns a hex dump without spaces or newlines.
Print output.

Resources