Azure Blob MD5 checksum and local MD5 checksum do not match

My file test.txt contains:
checksum test file
When I upload it to a blob, its MD5 is:
CONTENT-MD5 cvL65GNcvWFoqZUTI5oscw==
When I run md5sum test.txt locally, the value is:
72f2fae4635cbd6168a99513239a2c73

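As discussed in the comments, the two values are the same digest in different encodings (base64 versus hex). The solution, quoted from a related answer:
I googled around and found a suggestion to use openssl dgst, and it worked!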
openssl dgst -md5 -binary $filename | base64
Turns out, md5sum returns a hex representation of the hash and I had
to unhex it before computing its base64:
md5sum --binary $filename | awk '{print $1}' | xxd -p -r | base64
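To double-check this against the values in the question, here is a minimal sketch that converts a hex digest to base64 without xxd, using only POSIX shell features and base64. The function name hex_to_base64 is made up for illustration.

```shell
# Hypothetical helper: turn a hex digest into the base64 form Azure shows.
hex_to_base64() {
  hex=$1
  # Refuse input with an odd number of hex digits.
  [ $(( ${#hex} % 2 )) -eq 0 ] || return 1
  while [ -n "$hex" ]; do
    rest=${hex#??}            # everything after the first two hex digits
    byte=${hex%"$rest"}       # the first two hex digits
    # Emit the raw byte through an octal escape, which POSIX printf understands.
    printf "\\$(printf '%03o' "0x$byte")"
    hex=$rest
  done | base64
}

hex_to_base64 72f2fae4635cbd6168a99513239a2c73
# prints cvL65GNcvWFoqZUTI5oscw==
```

The output matches the CONTENT-MD5 value from the blob, so the upload is intact and only the encoding differs.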

Related

Weird hash output

I'm trying to create a hash for files in the directory using this script:
for file in *.zip; do openssl dgst -sha256 -binary ${file%.*}.zip $file | base64 >> ${file%.*}.zip.base64sha256; done
It creates a hash like this:
b5iQL1fo5r+6osykGr0mcEZ14Xdbn8y0SrFGIuzMfeRvmJAvV+jmv7qh7OUavSZwRnXhd1ufzLRKsUYi7Mx95A==
But for Terraform and AWS Lambda I need a shorter hash value. I can get it by running a command like this in the terminal:
openssl dgst -sha256 -binary archive.zip | base64 >> hash.base64sha256
And output is b5iQL1fo5r+6osykGr0mcEZ14Xdbn8y0SrFGIuzMfeQ=
So the question is: how can I get the short version of the hash? It's required by Terraform and AWS (when the hash value is long, the Lambdas are redeployed every time).
If you decode the "long" base64 you'll see that it's the same sequence of bytes repeated. That's because here
openssl dgst -sha256 -binary ${file%.*}.zip $file
you are specifying the file twice, once removing the extension and then re-adding it as .zip in ${file%.*}.zip, the other plainly as $file. This results in outputting the concatenated hash for both inputs (that are the same). To fix this, just specify it once:
openssl dgst -sha256 -binary "$file"
(with quotes to avoid problems with whitespace in shell expansion)
Instead of
for file in *.zip; do openssl dgst -sha256 -binary ${file%.*}.zip $file | base64 >> ${file%.*}.zip.base64sha256; done
try
for file in *.zip; do openssl dgst -sha256 -binary ${file%.*}.zip | base64 >> ${file%.*}.zip.base64sha256; done
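Putting both answers together, here is a sketch of the loop as a small function. write_zip_hashes is a hypothetical name. The sketch also assumes each .base64sha256 file should hold exactly one hash, so it writes with > rather than >> (with >>, every re-run appends another line, so the file changes again).

```shell
# Hypothetical helper: write <name>.zip.base64sha256 next to every zip
# in the given directory (default: the current directory).
write_zip_hashes() {
  dir=${1:-.}
  for file in "$dir"/*.zip; do
    [ -e "$file" ] || continue   # no zips: the glob stays literal, skip it
    # One input file gives one 32-byte digest, i.e. 44 base64 characters.
    # '>' overwrites, so running this twice does not append a second hash.
    openssl dgst -sha256 -binary "$file" | base64 > "$file.base64sha256"
  done
}
```

Calling write_zip_hashes with no argument processes the current directory, like the original one-liner.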

How can I get the md5sum of all files inside a zip without extracting it

Is there any way to get the md5sum of all files (or any one file) inside a zip without extracting the zip?
I can extract the needed files using unzip <.zip>
But I need to get the md5sum without extracting the zip.
This may not be exactly what you are looking for but it will get you closer. You wouldn't be extracting the entire zip but extracting a file to pipe it to md5sum to get checksum. Without reading the contents of the file, md5sum won't be able to generate a hash.
Let's say you have 3 files with these MD5 checksums:
b1946ac92492d2347c6235b4d2611184 a.txt
591785b794601e212b260e25925636fd b.txt
6f5902ac237024bdd0c176cb93063dc4 c.txt
You zip them into a single file using zip final.zip a.txt b.txt c.txt
When you list the files, you see there are 3 files.
unzip -l final.zip
Archive:  final.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        6  2021-08-08 17:20   a.txt
        6  2021-08-08 17:20   b.txt
       12  2021-08-08 17:20   c.txt
---------                     -------
       24                     3 files
To get MD5 of each of the files without extracting the entire zip, you can do this:
unzip -p final.zip a.txt | md5sum
b1946ac92492d2347c6235b4d2611184 -
unzip -p final.zip b.txt | md5sum
591785b794601e212b260e25925636fd -
unzip -p final.zip c.txt | md5sum
6f5902ac237024bdd0c176cb93063dc4 -
Alternative
You can run md5sum *.txt > checksums to hash all the files and store the results in a checksums file. Add that file to the zip so you know the MD5 each file had when it was added to the archive.
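A sketch of that alternative, using two throwaway files in a scratch directory (the file contents here are placeholders, not the ones from the example above). md5sum -c re-reads the stored list and reports OK or FAILED for each file.

```shell
# Scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
(
  cd "$workdir" || exit 1
  printf 'first\n'  > a.txt
  printf 'second\n' > b.txt
  # Record the checksums; this file would be added to the zip as well.
  md5sum a.txt b.txt > checksums
  # Later, after extracting the zip, verify the files against the list.
  md5sum -c checksums
)
```

This only tells you what the files looked like when the list was written, so the list has to be regenerated whenever a file in the archive is replaced.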
For all files in a zip you can use this:
File='final.zip' ; unzip -lqq $File | while read L ; do unzip -p $File ${L##*[[:space:]]} | md5sum | sed "s/-/${L##*[[:space:]]}/" ; done
Gives:
b1946ac92492d2347c6235b4d2611184 a.txt
591785b794601e212b260e25925636fd b.txt
6f5902ac237024bdd0c176cb93063dc4 c.txt
Based on @MartinMann's answer, a version that works correctly even if the file names contain spaces or special characters:
ZIPFILE="final.zip"; unzip -Z1 "$ZIPFILE" | grep -v '/$' | while IFS= read -r L; do echo "$(unzip -p "$ZIPFILE" "$L" | md5sum | cut -d ' ' -f1) $L"; done
Gives:
b1946ac92492d2347c6235b4d2611184 a.txt
591785b794601e212b260e25925636fd b.txt
6f5902ac237024bdd0c176cb93063dc4 path/to/file with spaces.txt

Filter output of command in Linux

How do I print just the hash sum and the file name with the sha256sum command? I want the hash and only the file name, not the full path.
Command:
sha256sum /mydir/someOtherDir/file.txt
Output:
123Hashsum /mydir/someOtherDir/file.txt
Desired Output:
123Hashsum file.txt
You can read the output into variables
read -r sha file < <(sha256sum /mydir/someOtherDir/file.txt)
Then you can read just the filename with basename
echo "$sha" "$(basename "$file")"
You can try piping to sed as below (works with absolute paths only):
sha256sum /mydir/someOtherDir/file.txt | sed 's:/.*/::'
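Another option is to strip the directory with shell parameter expansion, which avoids both the extra basename process and the sed pattern. A sketch, with sha256_basename as a made-up name:

```shell
# Hypothetical helper: print "<hash>  <file name>" for each path given.
sha256_basename() {
  for path in "$@"; do
    # Reading from stdin keeps the path out of sha256sum's output.
    sum=$(sha256sum < "$path") || return 1
    # ${sum%% *} keeps the hash (the text before the first space);
    # ${path##*/} drops everything up to and including the last slash.
    printf '%s  %s\n' "${sum%% *}" "${path##*/}"
  done
}
```

Called as sha256_basename /mydir/someOtherDir/file.txt, it prints the hash followed by file.txt, and it works the same for relative paths.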

sha256sum hashing of email address

I am trying to use sha256sum hashing command to hash email address.
$ echo -n example#gmail.com | sha256sum | awk '{print $1}'
264e53d93759bde067fd01ef2698f98d1253c730d12f021116f02eebcfa9ace6
Now I want to apply the same to an input file that contains only email addresses, but the command below produces a different hash. It looks like cat is the culprit here. How can I overcome this?
$ echo -n | cat test.txt | sha256sum | awk '{print $1}'
5c98fab97a397b50d060913638c18f7fd42345248bb973c486b6347232e8013e
Ideally, if test.txt has only one record, example#gmail.com, I would like to see the output below (the file can contain any number of email addresses):
example#gmail.com|264e53d93759bde067fd01ef2698f98d1253c730d12f021116f02eebcfa9ace6
It's a bit unclear, but I think your test.txt file looks like this:
example#gmail.com
rkj#stackoverflow.com
sean#stackoverflow.com
And you want to produce the output:
example#gmail.com|264e53d93759bde067fd01ef2698f98d1253c730d12f021116f02eebcfa9ace6
rkj#stackoverflow.com|2a583d8e55db9bac7247cac8dc4b52780010583217844e864e159c458ce0185c
sean#stackoverflow.com|797f02327b00f12486f2ed5e85e61680ecb75a0f969b54a627e329506aaf595c
If that is the case, this will do it:
while IFS= read -r email; do SHA=$(echo -n "$email" | sha256sum | awk '{print $1}'); echo "$email|$SHA"; done < test.txt
In your initial command, you are using echo -n which prints the arguments without a trailing newline. So you are getting the hash of just the e-mail address itself. Once you start working with a file, every line is going to have a newline on it, so for each line you have to strip off the newline and get the hash. That is effectively what my suggested solution is doing.
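The same idea written out as a function, as a rough sketch. hash_lines is a hypothetical name, and the handling of blank lines and Windows line endings is an assumption about what such a file may contain, not something stated in the question.

```shell
# Hypothetical helper: read one address per line on stdin and
# print "address|sha256" for each of them.
hash_lines() {
  cr=$(printf '\r')
  # '|| [ -n "$line" ]' also handles a last line with no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    line=${line%"$cr"}            # tolerate Windows (CRLF) line endings
    [ -n "$line" ] || continue    # skip blank lines
    # printf '%s' writes the text with no newline, like echo -n.
    digest=$(printf '%s' "$line" | sha256sum)
    printf '%s|%s\n' "$line" "${digest%% *}"
  done
}
```

Run it as hash_lines < test.txt. Each address is hashed without its newline, so the result matches what echo -n gives for a single address.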

Checksum on string

Is there a way to calculate a checksum on a string in Linux? The checksum commands that I have seen (cksum, md5sum, sha1sum, etc.) all require a file as input and I do not have a file. I only have a path to a location and want to calculate the checksum on that path.
echo -n 'exampleString' | md5sum
should work.
echo -n "yourstring" |md5sum
echo -n "yourstring" |sha1sum
echo -n "yourstring" |sha256sum
Don't forget -n, or the result will change (because the trailing newline will be hashed too).
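printf '%s' is a more portable way to get the same effect, since not every echo implementation honors -n. A small sketch (string_md5 is a hypothetical name), checked against the "abc" test vector from RFC 1321:

```shell
# Hypothetical helper: print the MD5 of the string itself, not of a file.
string_md5() {
  digest=$(printf '%s' "$1" | md5sum)
  printf '%s\n' "${digest%% *}"   # keep only the hash, drop the "  -"
}

string_md5 'abc'
# prints 900150983cd24fb0d6963f7d28e17f72

# With a trailing newline the input is different, and so is the digest.
printf '%s\n' 'abc' | md5sum
```

For the case in the question, the path would be passed as the argument, for example string_md5 "$path".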
