Checksum on string - linux

Is there a way to calculate a checksum on a string in Linux? The checksum commands that I have seen (cksum, md5sum, sha1sum, etc.) all require a file as input and I do not have a file. I only have a path to a location and want to calculate the checksum on that path.

echo -n 'exampleString' | md5sum
should work.

echo -n "yourstring" |md5sum
echo -n "yourstring" |sha1sum
echo -n "yourstring" |sha256sum
Don't forget -n, or the result will change, because the trailing newline that echo adds becomes part of the hashed input.
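To see the effect of the trailing newline, here is a minimal sketch. It uses printf '%s' instead of echo -n, since how echo treats -n varies between shells; string_sha256 is a hypothetical helper name, not a standard command.

```shell
# Hypothetical helper: hash exactly the bytes of the string, no trailing newline.
string_sha256() {
    printf '%s' "$1" | sha256sum | awk '{print $1}'
}

# echo without -n appends a newline, so this hashes "exampleString\n".
with_newline=$(echo 'exampleString' | sha256sum | awk '{print $1}')
without_newline=$(string_sha256 'exampleString')

echo "with newline:    $with_newline"
echo "without newline: $without_newline"
```

The two printed hashes differ, which is the mismatch the -n advice is guarding against.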

Related

Filter output of command in Linux

How do I print out just the hashsum and file name with sha256sum command? I want Hashsum and just the filename instead of the full path.
Command:
sha256sum /mydir/someOtherDir/file.txt
Output:
123Hashsum /mydir/someOtherDir/file.txt
Desired Output:
123Hashsum file.txt
You can read the output into variables
read -r sha file < <(sha256sum /mydir/someOtherDir/file.txt)
Then you can read just the filename with basename
echo "$sha" "$(basename "$file")"
You can try piping to sed as below (works with absolute paths only):
sha256sum /mydir/someOtherDir/file.txt | sed 's:/.*/::'
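If this is needed more than once, the same idea could be wrapped in a small function. This is only a sketch: hash_with_basename is a made-up name, and it feeds the file to sha256sum on stdin so the path never appears in the output in the first place.

```shell
# Hypothetical helper: print "<hash>  <basename>" for a single file.
hash_with_basename() {
    # Reading from stdin makes sha256sum print "<hash>  -".
    sum=$(sha256sum < "$1") || return 1
    # ${sum%% *} drops everything from the first space on, leaving the hash.
    printf '%s  %s\n' "${sum%% *}" "$(basename -- "$1")"
}
```

It would be called as hash_with_basename /mydir/someOtherDir/file.txt, and it should behave the same for relative paths.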

Can't use Linux wc on one-line file

I have a one-line file with no newline character at the end of the line. When I run the following:
diff oneline-file.txt any-file.txt | wc -c
I get:
Warning: missing newline character in oneline-file.txt
Why is this an error? How can I fix it? I could do this first:
echo "\n" >> oneline-file.txt
I'd rather do something that does not change the file. Thanks.
Thanks to Barmar. This steered me in the right direction. Here's what I used:
diff <(sed 's/$/\n/g' oneline-file.txt) <(cat any-file.txt) | wc -c
You could pipe the result of your echo into diff, replacing the file name as an argument to diff with - (telling it to get that input from stdin).
You can use process substitution to echo the newline after the file:
diff <(cat oneline-file.txt; echo "") any-file.txt | wc -c
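Both suggestions can be combined into a variant that only adds the newline when it is actually missing, and that feeds diff through stdin using -. This is a sketch under assumptions: with_final_newline is a hypothetical helper, and the demo files are throwaway stand-ins for the ones in the question.

```shell
# Hypothetical helper: print a file, adding a final newline only if it lacks one.
with_final_newline() {
    cat -- "$1"
    # Command substitution strips a trailing newline, so the result is
    # non-empty only when the last byte is something other than a newline.
    if [ -n "$(tail -c 1 -- "$1")" ]; then
        echo
    fi
}

# Demo with throwaway files; the originals are never modified.
tmpdir=$(mktemp -d)
printf 'same line' > "$tmpdir/oneline-file.txt"    # no trailing newline
printf 'same line\n' > "$tmpdir/any-file.txt"
with_final_newline "$tmpdir/oneline-file.txt" | diff - "$tmpdir/any-file.txt" | wc -c
rm -rf "$tmpdir"
```

Because the only difference between the demo files is the missing newline, diff has nothing to report once it is supplied.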

Ambiguous Redirection on shell script

I was trying to create a little shell script that allowed me to check the transfer progress when copying large files from my laptop's hdd to an external drive.
From the command line this is a simple feat using pv and simple redirection, although the line is rather long and you must know the file size (which is why I wanted the script):
console: du filename (to get the exact file size)
console: cat filename | pv -s FILE_SIZE -e -r -p > dest_path/filename
On my shell script I added egrep "[0-9]{1,}" -o to strip the filename and keep just the size numbers from the return value of du, and the rest should be straightforward.
#!/bin/bash
du $1 | egrep "[0-9]{1,}" -o
sudo cat $1 | pv -s $? -e -r -p > $2/$1
The problem is when I try to copy file12345.mp3 using this I get an ambiguous redirection error because egrep is getting the 12345 from the filename, but I just want the size.
Which means the return value from the first line is actually:
FILE_SIZE
12345
which bugs it.
How should I modify this script to parse just the first numbers until the first " " (space)?
Thanks in advance.
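If I understand you correctly:
To keep only the file size from the du output, print the first field:
du "$1" | awk '{print $1}'
(assuming the first field is the size of the file)
Add double quotes to your redirection to avoid the error:
sudo cat "$1" | pv -s $? -e -r -p > "$2/$1"
The quoting matters because an unquoted redirection target that expands to several words (for example, when $2 or $1 contains spaces) or to nothing is what bash reports as an ambiguous redirect. Note also that $? is the exit status of the previous command, not its output, so the size from du still has to be captured separately, for example with command substitution.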
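Putting the pieces together, a corrected script might look roughly like the sketch below. The function names are hypothetical, wc -c is swapped in for du because pv -s expects a byte count while du reports disk usage in blocks by default, and sudo is left out for brevity.

```shell
#!/bin/bash
# Hypothetical helper: size of a file in bytes.
file_size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Hypothetical wrapper: copy_with_progress SRC DEST_DIR
copy_with_progress() {
    src=$1
    dest_dir=$2
    # Capture the command's output; $? would only give its exit status.
    size=$(file_size_bytes "$src") || return 1
    # Quote the redirection target so spaces in either part are harmless.
    pv -s "$size" -e -r -p "$src" > "$dest_dir/$(basename -- "$src")"
}
```

It would be invoked as copy_with_progress file12345.mp3 /path/to/dest, and digits in the file name no longer matter because nothing parses the name.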

Problems with Grep Command in bash script

I'm having some rather unusual problems using grep in a bash script. Below is an example of the bash script code that I'm using that exhibits the behaviour:
UNIQ_SCAN_INIT_POINT=1
cat "$FILE_BASENAME_LIST" | uniq -d >> $UNIQ_LIST
sed '/^$/d' $UNIQ_LIST >> $UNIQ_LIST_FINAL
UNIQ_LINE_COUNT=`wc -l $UNIQ_LIST_FINAL | cut -d \ -f 1`
while [ -n "`cat $UNIQ_LIST_FINAL | sed "$UNIQ_SCAN_INIT_POINT"'q;d'`" ]; do
CURRENT_LINE=`cat $UNIQ_LIST_FINAL | sed "$UNIQ_SCAN_INIT_POINT"'q;d'`
CURRENT_DUPECHK_FILE=$FILE_DUPEMATCH-$CURRENT_LINE
grep $CURRENT_LINE $FILE_LOCTN_LIST >> $CURRENT_DUPECHK_FILE
MATCH=`grep -c $CURRENT_LINE $FILE_BASENAME_LIST`
CMD_ECHO="$CURRENT_LINE matched $MATCH times," cmd_line_echo
echo "$CURRENT_DUPECHK_FILE" >> $FILE_DUPEMATCH_FILELIST
let UNIQ_SCAN_INIT_POINT=UNIQ_SCAN_INIT_POINT+1
done
On numerous occasions, when grepping for the current line in the file location list, it has put no output to the current dupechk file even though there have definitely been matches to the current line in the file location list (I ran the command in terminal with no issues).
I've rummaged around the internet to see if anyone else has had similar behaviour, and thus far all I have found is that it is something to do with buffered and unbuffered outputs from other commands operating before the grep command in the Bash script....
However no one seems to have found a solution, so basically I'm asking you guys if you have ever come across this, and any idea/tips/solutions to this problem...
Regards
Paul
The 'problem' is the standard I/O library. When it is writing to a terminal it is line-buffered, but if it is writing to a pipe then it sets up full buffering.
Try changing
CURRENT_LINE=`cat $UNIQ_LIST_FINAL | sed "$UNIQ_SCAN_INIT_POINT"'q;d'`
to
CURRENT_LINE=`sed "$UNIQ_SCAN_INIT_POINT"'q;d' $UNIQ_LIST_FINAL`
Are there any directories with spaces in their names in $FILE_LOCTN_LIST? If there are, those spaces will need to be escaped somehow. Some combination of find and xargs can usually deal with that for you, especially xargs -0.
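Spaces can also bite inside the script itself, because the variables passed to grep are unquoted. A minimal sketch of the same loop with quoting and fixed-string matching follows; report_dupes and its two arguments are hypothetical names, and it counts matches in the location list rather than reproducing the original script exactly.

```shell
# Hypothetical helper: report_dupes BASENAME_LIST LOCATION_LIST
# Prints how often each duplicated base name occurs in the location list.
report_dupes() {
    uniq -d -- "$1" | while IFS= read -r name; do
        [ -n "$name" ] || continue        # skip blank lines
        # -F treats the name as a literal string rather than a regex;
        # the quotes keep names containing spaces as a single argument.
        count=$(grep -c -F -- "$name" "$2")
        printf '%s matched %s times\n' "$name" "$count"
    done
}
```

Reading the list line by line also avoids re-running sed on the whole file for every iteration.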
A small bash script using md5sum and sort that detects duplicate files in the current directory:
CURRENT=""
md5sum * |
sort |
while read -r md5sum filename
do
[[ $CURRENT == "$md5sum" ]] && echo "$filename is duplicate"
CURRENT=$md5sum
done
You tagged linux, so I assume you have tools like GNU find, md5sum, uniq, sort, etc. Here's a simple example to find duplicate files:
$ echo "hello world">file
$ md5sum file
6f5902ac237024bdd0c176cb93063dc4 file
$ cp file file1
$ md5sum file1
6f5902ac237024bdd0c176cb93063dc4 file1
$ echo "blah" > file2
$ md5sum file2
0d599f0ec05c3bda8c3b8a68c32a1b47 file2
$ find . -type f -exec md5sum "{}" \; |sort -n | uniq -w32 -D
6f5902ac237024bdd0c176cb93063dc4 ./file
6f5902ac237024bdd0c176cb93063dc4 ./file1
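Since uniq -w32 -D is specific to GNU uniq, here is a hedged sketch of the same grouping done with awk instead. list_duplicates is a hypothetical name, and the script assumes the hash occupies the first 32 characters of each md5sum output line.

```shell
# Hypothetical helper: list_duplicates DIR
# Prints every md5sum line whose hash is shared with at least one other file.
list_duplicates() {
    find "$1" -type f -exec md5sum {} + |
    sort |
    awk '{
        hash = substr($0, 1, 32)
        if (hash == prev_hash) {
            if (!printed) print prev_line   # first member of the group
            print
            printed = 1
        } else {
            printed = 0
        }
        prev_hash = hash
        prev_line = $0
    }'
}
```

Called as list_duplicates . in the directory from the example, it should list file and file1 but not file2.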

Linux using grep to print the file name and first n characters

How do I use grep to perform a search which, when a match is found, will print the file name as well as the first n characters in that file? Note that n is a parameter that can be specified and it is irrelevant whether the first n characters actually contains the matching string.
grep -l pattern *.txt |
while read line; do
echo -n "$line: ";
head -c $n "$line";
echo;
done
Change -c to -n if you want to see the first n lines instead of bytes.
You need to pipe the output of grep to sed to accomplish what you want. Here is an example:
grep mypattern *.txt | sed 's/^\([^:]*:.......\).*/\1/'
The number of dots is the number of characters you want to print. Many versions of sed provide an option, like -r (GNU/Linux) or -E (FreeBSD), that enables extended regular expressions. This makes it possible to specify numerically the number of characters you want to print.
N=7
grep mypattern *.txt /dev/null | sed -r "s/^([^:]*:.{$N}).*/\1/"
Note that this solution is a lot more efficient than the others proposed, which invoke multiple processes.
There are few tools that print 'n characters' rather than 'n lines'. Are you sure you really want characters and not lines? The whole thing can perhaps be best done in Perl. As specified (using grep), we can do:
pattern="$1"
shift
n="$1"
shift
grep -l "$pattern" "$@" |
while IFS= read -r file
do
echo "$file:" "$(dd if="$file" bs=1 count="$n" 2>/dev/null)"
done
The quotes around $file preserve multiple spaces in file names correctly. We can debate the command line usage, currently (assuming the command name is 'ngrep'):
ngrep pattern n [file ...]
I note that @litb used 'head -c $n'; that's neater than the dd command I used. There might be some systems without head (but they'd be pretty archaic). I note that the POSIX version of head only supports -n and the number of lines; the -c option is probably a GNU extension.
Two thoughts here:
1) If efficiency was not a concern (like that would ever happen), you could check $status [csh] after running grep on each file. E.g.: (For N characters = 25.)
foreach FILE ( file1 file2 ... fileN )
grep targetToMatch ${FILE} > /dev/null
if ( $status == 0 ) then
echo -n "${FILE}: "
head -c25 ${FILE}
endif
end
2) GNU [FSF] head contains a --verbose [-v] switch. GNU grep and xargs also offer --null, to accommodate filenames with spaces. And there's '--', to handle filenames like "-c". So you could do:
grep --null -l targetToMatch -- file1 file2 ... fileN |
xargs --null head -v -c25 --
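For output in the "name: first n characters" shape the question asks for, the null-delimited pipeline can drive a small loop instead of head -v. This is a sketch assuming GNU grep and xargs; first_chars_of_matches is a hypothetical name and its argument order (n, pattern, files) is an arbitrary choice.

```shell
# Hypothetical helper: first_chars_of_matches N PATTERN FILE...
first_chars_of_matches() {
    n=$1
    pattern=$2
    shift 2
    # -Z ends each file name with a NUL byte so odd names survive the pipe.
    grep -l -Z -e "$pattern" -- "$@" |
    xargs -0 -r sh -c '
        n=$1; shift
        for f in "$@"; do
            printf "%s: " "$f"
            head -c "$n" -- "$f"
            echo
        done
    ' sh "$n"
}
```

A call such as first_chars_of_matches 25 targetToMatch *.txt starts only a handful of processes, however many files match.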
