Generating just a SHA-256 hash from the Linux command line

I'm looking for a single command that emits just the sha256 hash, as a hexadecimal number, of the contents of a single supplied file.
I am aware of shasum -a 256, openssl dgst -sha256, sha256sum et al., but they all emit other information along with the checksum, and I would like to avoid post-processing the result with sed or some such.

You may use:
sh -c 'shasum -a 256 < "$1" | cut -d" " -f1' -- "$file"
(Note the -a 256: plain shasum defaults to SHA-1. The -- becomes $0 of the inner script, so "$file" arrives as its $1, which keeps this safe for arbitrary file names.)

The following Perl one-liner actually seems to satisfy all my stated requirements:
perl -e 'use Digest::SHA; use warnings; use strict; use autodie; my $sha = Digest::SHA->new(256); open my $fh, "<", $ARGV[0]; $sha->addfile($fh, "b"); print $sha->hexdigest . "\n";' "$file"
In particular it is safe for "funny" file names and can be used for arbitrary shell scripting, including as a find -exec argument (which is one thing I need it for).
(I have not written in Perl for a very long time, so the above can probably be improved.)
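For what it's worth, a slightly more compact sketch of the same idea (relying on Digest::SHA's documented addfile($filename, "b") form, which opens the file in binary mode itself, and on its methods returning the object so calls can be chained):
perl -MDigest::SHA -e 'print Digest::SHA->new(256)->addfile($ARGV[0], "b")->hexdigest, "\n"' "$file"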

Related

parsing complex string using shell script

I've been trying the whole day to find a good way to parse some strings with a shell script. The strings are used as call parameters for some applications.
They look like:
parsingParams -c "id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'" start
I'm only allowed to use shell script. I tried some sed and cut commands, but nothing works well.
My attempts look like:
prog=$(echo $@ | cut -d= -f3 | sed 's|\s.*$||')
That returns the correct value of prog, but I couldn't find a good way to get the value of arg.
The info parameter is optional, so it may be omitted.
Does anyone have a good idea to solve this problem?
Many thanks in advance.
Looks like you could use eval to let the shell parse your input string, but if you don't control the input (if it comes from an unreliable source), that will introduce a major vulnerability (imagine an attacker somehow passes -c "rm -rf /" to your program).
A safer way would be to explicitly specify allowed forms of user input.
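A minimal sketch of that safer approach (parse_params is a hypothetical helper, not an existing tool): validate each name=value token against the allowed forms and perform the assignment with printf -v, so eval never sees the input:
parse_params() {
    local s=$1 re name
    # each token: name=, then "..." or '...' or a bare word, then optional spaces
    re="^([A-Za-z_][A-Za-z0-9_]*)=(\"([^\"]*)\"|'([^']*)'|([^[:space:]\"']*))[[:space:]]*(.*)\$"
    while [[ -n $s ]]; do
        [[ $s =~ $re ]] || return 1      # reject anything not in an allowed form
        name=${BASH_REMATCH[1]}          # (real code would also whitelist the names)
        # exactly one of capture groups 3, 4, 5 is non-empty
        printf -v "$name" '%s' "${BASH_REMATCH[3]}${BASH_REMATCH[4]}${BASH_REMATCH[5]}"
        s=${BASH_REMATCH[6]}
    done
}

parse_params "$params" && echo "$arg"    # -D -t5 >/dev/null 1>&2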
The problem you have with splitting on space (with cut) when the space is quoted can be avoided if you specify what a valid field looks like (the content, not the separator). For example, in GNU awk you can use FPAT:
$ params="id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'"
$ awk -v FPAT="[^=]+=(\"[^\"]*\"|'[^']*'|[^ ]*) *" '{for (i=1; i<=NF; i++) print $i}' <<<"$params"
id=uid5
prog=/opt/bin/example
arg="-D -t5 >/dev/null 1>&2"
info='fdhff fd'
Valid fields will be in one of the following forms:
var="val with spaces"
var='val with spaces'
var=val_no_spaces
Now with assignments split (one per line, assuming newline is not allowed in params), you can process them further, even with cut:
$ awk ... | cut -d $'\n' -f3
arg="-D -t5 >/dev/null 1>&2"
eval
$ eval "id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'"
$ echo $id
uid5
$ echo $prog
/opt/bin/example
$ echo $arg
-D -t5 >/dev/null 1>&2
$ echo $info
fdhff fd
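To make the injection risk mentioned above concrete (a harmless demonstration; the command substitution runs the moment eval parses the string):
$ params='arg="$(touch /tmp/pwned)"'
$ eval "$params"
$ ls /tmp/pwned
/tmp/pwned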

What does grep -Po '...\K...' do? How else can that effect be achieved?

I have this script script.sh:
#!/bin/bash
file_path=$1
result=$(grep -Po 'value="\K.*?(?=")' $file_path)
echo $result
and this file text.txt:
value="a"
value="b"
value="c"
When I run the command ./script.sh /file/directory/text.txt, the output in the terminal is the following:
a b c
I understand what the script does, but I don't understand HOW it works, so I need a detailed explanation of this part of the command:
-Po 'value="\K.*?(?=")'
If I understood correctly, \K is a Perl command. Can you give me an alternative in shell (for example with awk command)?
Thank you in advance.
grep -P enables PCRE syntax. (This is a non-standard extension -- not even all builds of GNU grep support it, as it depends on the optional libpcre library, and whether to link it in is a compile-time option.)
grep -o emits only the matched text, not the entire line containing it. (This too is nonstandard, though more widely available than -P.)
\K is a PCRE extension to regex syntax that discards everything matched before that point from the reported match: value=" must be present, but is not included in the output. The rest of the pattern, .*?(?="), lazily matches up to the next double quote, and the lookahead (?=") requires that closing quote without consuming it -- which is why the output is just a, b and c.
Since your shell is bash, you have ERE support built in. As an alternative that uses only built-in functionality (no external tools, grep, awk or otherwise):
#!/usr/bin/env bash
regex='value="([^"]*)"'                   # store regex (w/ match group) in a variable
results=( )                               # define an empty array to store results
while IFS= read -r line; do               # iterate over lines on input
    if [[ $line =~ $regex ]]; then        # ...and, when one matches the regex...
        results+=( "${BASH_REMATCH[1]}" ) # ...put the group's contents in the array
    fi
done <"$1"                                # with stdin coming from the file named in $1
printf '%s\n' "${results[*]}"             # combine array results with spaces and print
See http://wiki.bash-hackers.org/syntax/ccmd/conditional_expression for a discussion of =~, and http://wiki.bash-hackers.org/syntax/shellvars#bash_rematch for a discussion of BASH_REMATCH. See BashFAQ #1 for a discussion of reading files line-by-line with a while read loop.
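Since you asked about awk specifically, here is a minimal sketch using only POSIX awk's match() and substr() (no PCRE needed; unlike the script above, it prints one value per line rather than space-joined):
awk 'match($0, /value="[^"]*"/) { print substr($0, RSTART + 7, RLENGTH - 8) }' "$1"
The + 7 skips the value=" prefix, and the - 8 additionally drops the closing quote.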

How to use functions in bash? [duplicate]

I have a file that lists hashes of files against their filenames.
For example,
fb7e0a4408e46fd5573ffb9e73aec021a9dcf426235c0ccfc37d2f5e09a68a23 /path/to/some/file
237e0a4408e46fe3573f239e73aec021a9dcf426235c023fc37d2f5e09a68a12 /path/to/another/file
... and so on...
I need the hash converted to base64 encoded format.
So I used a combination of a bash function and awk.
Here is what I wrote:
#!/bin/sh
base64Encode() {
    $1 | openssl base64 -A
}

awk '{ t = base64Encode $1; print t }' file.txt
But it does not seem to work. I'm using hashdeep to generate the hash-list file and hashdeep does not support base64 encoded output. That is why I'm using openssl.
Any help or tips regarding this would be great!
Edit:
The given answers work, but it seems I'm having some other issue.
Usually cat filename | openssl dgst -sha256 | openssl base64 -A gives a base64 encoded output for the file filename, which is absolutely correct,
and the output from hashdeep matched the output from cat filename | openssl dgst -sha256.
So I thought of piping the output obtained from the above step to openssl base64 -A for base64 output. But I still get values different from the actual result.
Although this might perhaps be better suited to a separate question, I would appreciate any help with this.
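A likely cause of that mismatch (a sketch, assuming the goal is to base64-encode the digest itself): openssl dgst -sha256 prints a text line like (stdin)= <hex>, so piping that into openssl base64 encodes the hexadecimal text, not the 32 raw digest bytes. To encode the actual digest, either have openssl emit binary directly, or turn an existing hex digest (here in a placeholder variable $hash) back into bytes first:
openssl dgst -sha256 -binary filename | openssl base64 -A
# or, from an existing hex digest; xxd -r -p converts hex back to raw bytes
printf '%s' "$hash" | xxd -r -p | openssl base64 -A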
Awk only:
$ awk '{ c = "echo " $1 " | openssl base64 -A"
         c | getline r
         print r }' file
ZmI3ZTBhNDQwOGU0NmZkNTU3M2ZmYjllNzNhZWMwMjFhOWRjZjQyNjIzNWMwY2NmYzM3ZDJmNWUwOWE2OGEyMwo=
MjM3ZTBhNDQwOGU0NmZlMzU3M2YyMzllNzNhZWMwMjFhOWRjZjQyNjIzNWMwMjNmYzM3ZDJmNWUwOWE2OGExMgo=
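With many input lines it is also worth closing each command, so awk does not accumulate open pipes (a slightly hardened sketch of the same idea):
$ awk '{ cmd = "echo " $1 " | openssl base64 -A"
         cmd | getline r
         close(cmd)    # release the pipe before handling the next line
         print r }' file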
Because you asked specifically about how to use functions, I have divided the problem into several small functions. That is good practice in all (bigger) bash programs.
The basic rule is: functions behave like any other command:
you can redirect their input/output
you can call them with arguments
and so on.
The best functions behave like common unix executables, i.e. they read from stdin and print to stdout. This allows you to use them in pipelines too.
So, now the rewrite:
# function to create base64 - reads from stdin, writes to stdout
base64Encode() {
    openssl base64 -A
}

# function for dealing with your file
# i.e. reads lines "hash path" and prints "base64 path"
convert_hashes() {
    while read -r hash path; do
        b64=$(base64Encode <<< "$hash")
        echo "$b64 $path"
    done
}

# the "main" program
convert_hashes < your_file.txt
output
ZmI3ZTBhNDQwOGU0NmZkNTU3M2ZmYjllNzNhZWMwMjFhOWRjZjQyNjIzNWMwY2NmYzM3ZDJmNWUwOWE2OGEyMwo= /path/to/some/file
MjM3ZTBhNDQwOGU0NmZlMzU3M2YyMzllNzNhZWMwMjFhOWRjZjQyNjIzNWMwMjNmYzM3ZDJmNWUwOWE2OGExMgo= /path/to/another/file
Yes, I know, you want only the base64 without the attached path. Of course, you can modify the above convert_hashes and remove the path from the output: instead of echo "$b64 $path" you could use echo "$b64", and the output will be just the b64 string. But you're losing information inside the function - which string belongs to which path - which is, IMHO, not the best practice.
Therefore, you can leave the function as-is and use another tool to extract the first column - and only when needed, e.g. in the "main" program. This way you have designed the function in a more universal, reusable way.
convert_hashes < your_file.txt | cut -d ' ' -f1
output
ZmI3ZTBhNDQwOGU0NmZkNTU3M2ZmYjllNzNhZWMwMjFhOWRjZjQyNjIzNWMwY2NmYzM3ZDJmNWUwOWE2OGEyMwo=
MjM3ZTBhNDQwOGU0NmZlMzU3M2YyMzllNzNhZWMwMjFhOWRjZjQyNjIzNWMwMjNmYzM3ZDJmNWUwOWE2OGExMgo=
Now imagine that you extend the script so that the input no longer comes from a file but from another program. Let's simulate this with the following get_data function (of course, in a real app it would do something other than just cat):
get_data() {
    cat <<EOF
fb7e0a4408e46fd5573ffb9e73aec021a9dcf426235c0ccfc37d2f5e09a68a23 /path/to/some/file
237e0a4408e46fe3573f239e73aec021a9dcf426235c023fc37d2f5e09a68a12 /path/to/another/file
EOF
}
Now you can use all of the above as:
get_data | convert_hashes
The output will be the same as above.
Of course, you can do something with the output too, let's say:
get_data | convert_hashes | grep another/file | cut -d ' ' -f1
MjM3ZTBhNDQwOGU0NmZlMzU3M2YyMzllNzNhZWMwMjFhOWRjZjQyNjIzNWMwMjNmYzM3ZDJmNWUwOWE2OGExMgo=
Of course, with such a "modular" structure, you can easily replace any part without touching the others - say, replacing openssl with the base64 command:
base64Encode() {
    base64
}
And everything will continue to work without any other changes. (One caveat: GNU base64 wraps its output at 76 columns by default, so use base64 -w0 to match openssl base64 -A exactly.) Of course, in a real app it is (probably) pointless to have a function that calls only one program - but I am doing this especially because you asked about functions.
Otherwise, the above could be done simply as:
while read -r hash path; do
    openssl base64 -A <<< "$hash"
    echo
    # or: echo $(openssl base64 -A <<< "$hash")
    # or: printf "%s\n" "$(openssl base64 -A <<< "$hash")"
done < your_file.txt
or even:
cut -d ' ' -f1 your_file.txt | xargs -I% -n1 bash -c 'echo $(openssl base64 -A <<<"%")'
You need the echo or printf because openssl does not print a trailing newline by default. Output:
ZmI3ZTBhNDQwOGU0NmZkNTU3M2ZmYjllNzNhZWMwMjFhOWRjZjQyNjIzNWMwY2NmYzM3ZDJmNWUwOWE2OGEyMwo=
MjM3ZTBhNDQwOGU0NmZlMzU3M2YyMzllNzNhZWMwMjFhOWRjZjQyNjIzNWMwMjNmYzM3ZDJmNWUwOWE2OGExMgo=
PS: to be honest, I do not understand why you need to base64-encode an already hex-encoded hash - but YMMV. :)

Losing tabs and newlines when performing a replacement with sed

I need to replace -Xmx1024m (changing 1024 to 2048) in standalone.conf from a shell command.
I tried to do this with sed:
echo $(sed 's/1024/2048/g' standalone.conf) > standalone.conf
The command works, but the saved text loses its tabs and newlines.
Passing an expansion to echo unquoted subjects it to string-splitting and glob-expansion, passing every individual word produced by those processes as a separate argument to echo (which echo then combines with a single space between each).
Consider instead:
sed 's/1024/2048/g' <standalone.conf >standalone.conf.new && mv standalone.conf{.new,}
...and, in general, always use echo "$foo" instead of echo $foo -- it's the lack of quotes here that was most immediately responsible for your bug.
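A quick illustration of the difference (the tab and newline only survive the quoted form):
$ foo=$'a\tb\nc'
$ echo $foo
a b c
$ echo "$foo"
a	b
c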
Do not use echo, but let sed change the file:
sed -i 's/1024/2048/g' standalone.conf
sed 's/1024/2048/g' will translate 1024 to 2048 globally, throughout the file, which might be unwise. It would be much better to limit the translation somehow. If your goal is in fact the one you stated, namely changing -Xmx1024m to -Xmx2048m, then the following would at least be a reasonable start (assuming your sed supports the -i option):
sed -i -e 's/-Xmx1024m/-Xmx2048m/' standalone.conf
(If your sed does not support -i, then make the obvious changes.)
If the timestamp of the file is useful, and if you only want to update the file if it needs updating, then consider:
grep -q -e -Xmx1024m standalone.conf && sed -i -e 's/-Xmx1024m/-Xmx2048m/' standalone.conf

How can I randomize the lines in a file using standard tools on Red Hat Linux?

How can I randomize the lines in a file using standard tools on Red Hat Linux?
I don't have the shuf command, so I am looking for something like a perl or awk one-liner that accomplishes the same task.
Um, let's not forget
sort --random-sort
shuf is the best way.
sort -R is painfully slow. I just tried to sort a 5GB file. I gave up after 2.5 hours. Then shuf did it in a minute.
And a Perl one-liner you get!
perl -MList::Util -e 'print List::Util::shuffle <>'
It uses a module, but that module is part of the core Perl distribution. If that's not good enough, you may consider rolling your own (see the awk sketch below).
I tried using this with the -i flag ("edit-in-place") to have it edit the file. The documentation suggests it should work, but it doesn't. It still displays the shuffled file to stdout, but this time it deletes the original. I suggest you don't use it.
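If rolling your own appeals, a Fisher-Yates shuffle is only a few lines of awk (a sketch; srand() seeds from the current time, so two runs within the same second produce the same order):
awk 'BEGIN { srand() }
     { a[NR] = $0 }
     END {
         for (i = NR; i > 1; i--) { j = int(rand() * i) + 1; t = a[i]; a[i] = a[j]; a[j] = t }
         for (i = 1; i <= NR; i++) print a[i]
     }' file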
Consider a shell script:
#!/bin/bash
if [[ $# -eq 0 ]]
then
    echo "Usage: $0 [file ...]"
    exit 1
fi

for i in "$@"
do
    perl -MList::Util -e 'print List::Util::shuffle <>' "$i" > "$i.new"
    if [[ $(wc -c < "$i") -eq $(wc -c < "$i.new") ]]
    then
        mv "$i.new" "$i"
    else
        echo "Error for file $i!"
    fi
done
Untested, but hopefully works.
cat yourfile.txt | while IFS= read -r f; do printf "%05d %s\n" "$RANDOM" "$f"; done | sort -n | cut -c7-
Read the file, prepend every line with a random number, sort the file on those random prefixes, and cut the prefixes off afterwards. This one-liner should work in any semi-modern shell. (Note that $RANDOM only ranges from 0 to 32767, so on large files many lines share a key, and ties are then ordered by the rest of the line: it is a rough shuffle, not a uniform one.)
A one-liner for python (Python 2 syntax):
python -c "import random, sys; lines = open(sys.argv[1]).readlines(); random.shuffle(lines); print ''.join(lines)," myFile
And for printing just a single random line:
python -c "import random, sys; print random.choice(open(sys.argv[1]).readlines())," myFile
But see this post for the drawbacks of python's random.shuffle(). It won't work well with many (more than 2080) elements.
Related to Jim's answer:
My ~/.bashrc contains the following:
unsort ()
{
    LC_ALL=C sort -R "$@"
}
With GNU coreutils's sort, -R = --random-sort, which generates a random hash of each line and sorts by it. The randomized hash wouldn't actually be used in some locales in some older (buggy) versions, causing it to return normal sorted output, which is why I set LC_ALL=C.
Related to Chris's answer:
perl -MList::Util=shuffle -e'print shuffle<>'
is a slightly shorter one-liner. (-Mmodule=a,b,c is shorthand for -e 'use module qw(a b c);'.)
The reason giving it a simple -i doesn't work for shuffling in-place is because Perl expects that the print happens in the same loop the file is being read, and print shuffle <> doesn't output until after all input files have been read and closed.
As a shorter workaround,
perl -MList::Util=shuffle -i -ne'BEGIN{undef$/}print shuffle split/^/m'
will shuffle files in-place. (-n means "wrap the code in a while (<>) {...} loop"; BEGIN{undef$/} makes Perl operate on whole files at a time instead of lines at a time, and split/^/m is needed because the implicit $_ = <> now holds an entire file rather than a single line.)
When I install coreutils with homebrew
brew install coreutils
shuf becomes available as gshuf.
Mac OS X with DarwinPorts:
sudo port install unsort
cat $file | unsort | ...
FreeBSD has its own random utility:
cat $file | random | ...
It's in /usr/games/random, so if you have not installed games, you are out of luck.
You could consider installing ports like textproc/rand or textproc/msort. These might well be available on Linux and/or Mac OS X, if portability is a concern.
On OSX, grabbing latest from http://ftp.gnu.org/gnu/coreutils/ and something like
./configure
make
sudo make install
...should give you /usr/local/bin/sort --random-sort
without messing up /usr/bin/sort
Or get it from MacPorts:
$ sudo port install coreutils
and/or
$ /opt/local/libexec/gnubin/sort --random-sort
