I'm facing a problem: on the AIX platform, we use a command to generate checksums:
Sample:
exec 0<list
while read line
do
openssl md5 "$line" >> checksum.out
done
But this takes a long time, and I have found that our CPUs still have free resources.
Does openssl md5 run multithreaded? If not, how can I make it run in multiple threads or processes, or use some other method to speed it up?
Best Regards
Void
If I understand correctly from the answer and comments on this question, it can't be done, as there are dependencies between the steps of the hashing algorithm (and I guess OpenSSL would have a multithreaded implementation if it were generally possible).
However you could always parallelize the tasks by starting n instances of openssl md5 in parallel.
For example (assuming n = 4 parallel processes):
while read -r f0 && read -r f1 && read -r f2 && read -r f3; do
openssl md5 "$f0" >> checksum.out0 &
openssl md5 "$f1" >> checksum.out1 &
openssl md5 "$f2" >> checksum.out2 &
openssl md5 "$f3" >> checksum.out3
done
The last command is deliberately not run in the background: the loop blocks on it, which keeps roughly the intended number of processes running at the same time. You may also want the four files of one iteration to take about the same time to hash, otherwise the slower ones leave cores idle. Note that this sketch reads four names per iteration, so if the length of list is not a multiple of four, the last few names are skipped; the version in the EDIT below handles any length. Each instance writes its output to a separate file, and you can concatenate them afterwards (e.g. cat checksum.out* > checksum.out). It should be enough of an idea to help you get started.
EDIT:
A more robust variant: give each instance of openssl md5 its own output file by appending an incrementing counter, wait after every fourth job, and then add one extra line at the end of the script to cat all the outputs into a single file.
Resulting script:
exec 0<list
COUNT=0
while read -r line; do
openssl md5 "$line" >> checksum.out$COUNT.tmp &
COUNT=$((COUNT+1))
if [ $((COUNT % 4)) -eq 0 ]; then wait; fi
done
wait
cat checksum.out*.tmp > checksum.out
Should do the trick. Two notes: COUNT is incremented in the parent shell on its own line, because writing checksum.out$((COUNT++)) on a backgrounded command would only increment a subshell's copy, and every job would then write to the same file; and the .tmp suffix keeps the final cat from re-reading checksum.out itself. Just remember to clean up all the temporary files afterwards...
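If your xargs supports the -P flag, the whole loop can also be replaced with a one-liner. This is an untested sketch: -P is a widespread extension (GNU and BSD have it), but I am not sure AIX's xargs does, so check first:
# run up to 4 openssl processes at once, one file name per process
# (assumes no whitespace in the file names, as the original loop also does)
xargs -n 1 -P 4 openssl md5 < list > checksum.out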
Related
I have a Linux tool that (greatly simplifying) trims the sequences specified in an Illumina adapters file. I have 32 files to grind through. One file is processed in about 5 hours. I have a CentOS server with 128 cores.
I've found a few solutions, but each one works in a way that only uses one core. The last one seems to fire off 32 nohups, but it still pushes everything through a single core.
My question is: does anyone have an idea how to use the server's potential? Basically every file can be processed independently; there are no relations between them.
This is the current version of the script, and I don't know why it only uses one core. I wrote it with the help of advice here on Stack Overflow and elsewhere on the Internet:
#!/bin/bash
FILES=/home/daw/raw/*
count=0
for f in $FILES
do
base=${f##*/}
echo "process $f file..."
nohup /home/daw/scythe/scythe -a /home/daw/scythe/illumina_adapters.fa -o "OUT$base" "$f" &
(( count ++ ))
if (( count == 31 )); then
wait
count=0
fi
done
I'll explain: FILES is a list of files from the raw folder.
The "core" line executes nohup: the first path is the path to the tool, the -a path is the file with the adapter patterns to cut, and -o saves the output under the same name as the processed file with OUT prepended. The last parameter is the input file to be processed.
Here is the tool's readme:
https://github.com/vsbuffalo/scythe
Does anybody know how to handle this?
P.S. I also tried moving nohup before the counter, but it still uses one core. There are no limits configured on the server.
IMHO, the most likely solution is GNU Parallel, so that you can run up to, say, 64 jobs in parallel, something like this:
parallel -j 64 /home/daw/scythe/scythe -a /home/daw/scythe/illumina_adapters.fa -o OUT{/} {} ::: /home/daw/raw/*
This has the benefit that jobs are not batched; it keeps 64 running at all times, starting a new one as each job finishes. That is better than waiting potentially 4.9 hours for a whole batch of 31 jobs to finish before starting the last one, which then takes a further 5 hours. Note that I chose 64 jobs arbitrarily; if you don't specify otherwise, GNU Parallel will run one job per CPU core you have. ({/} expands to the basename of the input file, which mirrors the OUT$base naming in your script.)
Useful additional parameters are:
parallel --bar ... gives a progress bar
parallel --dry-run ... does a dry run so you can see what it would do without actually doing anything
If you have multiple servers available, you can add them in a list and GNU Parallel will distribute the jobs amongst them too:
parallel -S server1,server2,server3 ...
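Putting those flags together with the command above, for example, you can preview the exact 32 commands before committing 5 hours per file, then run for real with a progress bar:
# dry run first: print the commands without executing anything
parallel --dry-run -j 64 /home/daw/scythe/scythe -a /home/daw/scythe/illumina_adapters.fa -o OUT{/} {} ::: /home/daw/raw/*
# then run for real, with a progress bar
parallel --bar -j 64 /home/daw/scythe/scythe -a /home/daw/scythe/illumina_adapters.fa -o OUT{/} {} ::: /home/daw/raw/*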
I need to call openssl from a binary: I write XML text through a popen() call to a script that wraps openssl.
The problem: if my binary fails during writing, openssl still finishes writing my file successfully, but when I decode it I get a truncated file.
I would like to check, at the end of the openssl call, whether the received stream ends with "</myEndTag>".
Context: my binary must never write an unencrypted file, and I would like to avoid having to openssl-decode just to check.
Here is an example to illustrate (thanks to the comments: this is not a valid statement, just a way to give you the idea):
echo "blablaf foo bar" | openssl -out file.crypt | grep -E "bar$"
Then, if grep found "bar$", my file.crypt is good.
I found a solution fitting my needs:
my script now uses tee to capture the tail of the stream before openssl:
tee >(tail -n 2 > "${checkfile}") | /usr/bin/openssl enc -aes-256-cbc -out "${outfile}" -e -K "${KEY}" -iv "${KEYOPTION}"
Grepping for my closing XML tag in the checkfile, which only contains the last 2 lines, is secure enough.
As I mentioned, checking openssl's return code is not enough, since I write to openssl via a popen statement.
If my binary hangs while it is writing, the stream still goes to openssl, which sees an end of input whether it is the real end or a broken stream, and happily produces a valid encrypted output file from the truncated content.
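Putting the pieces together, a sketch of the whole wrapper might look like this (checkfile, outfile, KEY and KEYOPTION are the variables from my snippet above and are assumed to be set by the caller; deleting the output on a failed check is my own policy choice, adapt as needed):
#!/bin/bash
# encrypt stdin while capturing the last two plaintext lines for verification
tee >(tail -n 2 > "${checkfile}") | /usr/bin/openssl enc -aes-256-cbc -out "${outfile}" -e -K "${KEY}" -iv "${KEYOPTION}"
# keep the ciphertext only if the plaintext stream ended with the closing XML tag
if grep -q '</myEndTag>' "${checkfile}"; then
exit 0
else
rm -f "${outfile}" # truncated input: discard the encrypted file
exit 1
fi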
I have about 100 keystore files, e.g. "store15.jks", and a single X.509 certificate "mycert.pem". I need to find out which "store*.jks" the certificate "mycert.pem" is imported in. What I am trying to do is write a script that iterates 100 times and runs the command
keytool -list -keystore store*.jks
I initially came up with a simple script like this:
#!/bin/bash
for((i=1;i<100;i++))
do
cert="mycert.pem"
str="store"$i".jks"
OUTPUT="$(keytool -list -keystore $str)"
echo $OUTPUT
done
Alas, already at the first iteration I am prompted for the keystore password, like
Enter keystore password: //3 or 4 spaces after colon
That means I'd have to enter the password for every single iteration, and there must be a (much) better way, i.e. a way to simulate keyboard input when the password is prompted. Browsing Stack Overflow I found some examples using "expect" scripting, but they were either rudimentary or I just couldn't manage to get them right, so I failed at combining bash and expect. I must say I find it a bit strange that there is no pure bash technique for a task that seems pretty common. I would appreciate any help, preferably with example scripts. Thanks!
The easiest way to do this is to use the -storepass option which allows you to pass the password on the command line. If for some reason that does not work for you (maybe you have an earlier version), here is an expect script that works for me:
expect -c "spawn /usr/bin/keytool -list; expect \"assword:\" { exp_send \"the_password\r\"}; expect EOF {exit}"
First of all, thanks a lot to both of you guys, -storepass worked like a charm! You made me very happy :)
I'll now post my updated script that solved the problem:
#!/bin/bash
for((i=1;i<100;i++))
do
str="store"$i".jks"
sha="5A:6B:18"
OUTPUT="$(keytool -list -keystore $str -storepass mypass | grep $sha )"
echo $i
echo "$OUTPUT"
done
The answer to the original problem is store74.jks. Hope this helps someone someday.
For viewing public keys, no password is needed; if you can send a simple "ENTER" keystroke, that should suffice, too.
E.g.
echo "" |keytool -list -keystore key.jks
This question already has an answer here:
Best way to soft brute-force your own GPG/PGP passphrase?
(1 answer)
Closed 8 years ago.
I have forgotten the passphrase for my GPG key on Linux. Can someone please help me write a simple script to brute-force the key? I remember some of the words which MIGHT be in the passphrase, so hopefully it will not take long for my computer to crack it.
All is not lost if I can't recover the passphrase; it just means I will not be able to work on my project for the next 10 days, until I get back to work to fetch another copy of the files, this time with a new key for which I will remember the passphrase.
However, it would be nice to be able to work on my project during these 10 days.
Maybe something like:
#!/bin/bash
#
# try every line in words.txt as the passphrase
# (a while-read loop, so phrases containing spaces also work)
while IFS= read -r word; do
# try to decrypt with word; gpg expects --output before the --decrypt command
if echo "${word}" | gpg --passphrase-fd 0 --no-tty --output somegpgfile --decrypt somegpgfile.gpg; then
# decryption was successful; stop
echo "GPG passphrase is: ${word}"
exit 0
fi
done < words.txt
exit 1
1) The script won't be simple, at least not how you envisage "simple".
2) It will take a long time - that's the point of using passphrases instead of simple passwords. Taking the time to write such a script, incorporating the words which may or may not be in the phrase, plus a stab at iterating over variations, will probably take more than ten days.
3) You probably will forget the next passphrase too.
4) Ooops!
Sorry dude, time to start a new project (at least to while away the next ten days - I suggest a passphrase cracker as an ideal distraction.)
Merry Christmas!
-Oisin
Tersmitten's answer may be out of date.
echo "${word}" | gpg --passphrase-fd 0 -q --batch --allow-multiple-messages --no-tty --output the_decrypted_file -d /some/input/file.gpg;
I used the above line with gpg 2.0.20 and libgcrypt 1.5.2 to achieve the desired results.
My program requires at least Linux 2.6.26 (I use timerfd and some other Linux-specific features).
I have a general idea of how to write this macro, but I don't have enough knowledge about writing test macros for Autoconf. The algorithm:
Run "uname --release" and store the output
Parse the output and extract the Linux version number (MAJOR.MINOR.MICRO)
Compare the versions
I don't know how to run a command, store its output and parse it inside a macro.
Maybe such a macro already exists and is available somewhere (I haven't found any)?
I think you'd be better off detecting the specific functions you need using AC_CHECK_FUNC, rather than a specific kernel version.
This will also prevent breakage if you find yourself cross-compiling at some point in the future.
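A minimal sketch of that in configure.ac, with timerfd_create standing in for whichever function you actually need (HAVE_TIMERFD is a symbol name I made up for the example):
AC_CHECK_FUNC([timerfd_create],
[AC_DEFINE([HAVE_TIMERFD], [1], [Define to 1 if timerfd_create is available])],
[AC_MSG_ERROR([timerfd_create is required but was not found])])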
There is a macro for steps 2 (parse) and 3 (compare version): ax_compare_version. For example:
linux_version=$(uname --release)
AX_COMPARE_VERSION($linux_version, [eq3], [2.6.26],
[AC_MSG_NOTICE([Ok])],
[AC_MSG_ERROR([Bad Linux version])])
Here I used eq3 so that if $linux_version contains additional strings, such as -amd64, the comparison still succeeds. There is a plethora of comparison operators available.
I would suggest that you check not for the Linux version number but for the specific types or functions you need. Who knows, maybe someone decides to backport timerfd_settime() to 2.4.x? So I think AC_CANONICAL_TARGET and AC_CHECK_LIB or similar are your friends. If you need to check function arguments or test behaviour, you'd better write a simple program and use AC_LANG_CONFTEST([AC_LANG_PROGRAM(...)])/AC_TRY_RUN to do the job.
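A sketch of such a runtime test, written with AC_RUN_IFELSE (the modern spelling of AC_TRY_RUN); the cross-compiling branch is a guessed default, adjust to your policy:
AC_MSG_CHECKING([whether timerfd_create works])
AC_RUN_IFELSE(
[AC_LANG_PROGRAM([[#include <sys/timerfd.h>]],
[[return timerfd_create(CLOCK_MONOTONIC, 0) >= 0 ? 0 : 1;]])],
[AC_MSG_RESULT([yes])],
[AC_MSG_ERROR([a working timerfd_create is required])],
[AC_MSG_RESULT([cross-compiling, assuming yes])])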
Without going too deep and writing proper autoconf macros (which would be preferable anyway), don't forget that configure.ac is basically a shell script preprocessed by m4. So you can write shell commands directly:
# prev. part of configure.ac
if test `uname -r | cut -d. -f1` -lt 2; then echo "major v. error"; exit 1; fi
if test `uname -r | cut -d. -f2` -lt 6; then echo "minor v. error"; exit 1; fi
if test `uname -r | cut -d. -f3` -lt 26; then echo "micro error"; exit 1; fi
# ...
This is just an idea if you want to avoid writing autoconf macros. It is not a good choice, but it should work...
The best way is the one already suggested: you should check for features. Say a future kernel no longer provides timerfd, or it changes in a way that breaks your code; you won't catch that if you only test the version.
edit
As user foof says in the comments (in other words), this is a naive way to check major.minor.micro. E.g. 3.5.1 will fail because 5 is less than 6, yet 3.5.1 comes after 2.6.26, so it should (likely) be accepted. There are tricks that can transform x.y.z into a representation that puts each version in its natural order: e.g. if we expect that x, y and z never exceed 999, we can multiply the major by 1000000, the minor by 1000 and the micro by 1; then you can compare the result with 2006026, as foof suggested in the comment(s).
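A sketch of that trick in the same plain-shell style (it assumes a three-field version number, each field below 1000, and strips a trailing -suffix such as -amd64 first):
v=`uname -r | cut -d- -f1` # e.g. "3.5.1"
major=`echo $v | cut -d. -f1`
minor=`echo $v | cut -d. -f2`
micro=`echo $v | cut -d. -f3`
# encode x.y.z as x*1000000 + y*1000 + z and compare with 2.6.26 -> 2006026
if test `expr $major \* 1000000 + $minor \* 1000 + $micro` -lt 2006026; then
echo "Linux 2.6.26 or newer required"; exit 1
fi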