Hexdump reverse command - linux

The hexdump command converts any file to hex values.
But what if I have hex values and I want to reverse the process? Is this possible?

There is a similar tool called xxd. If you run xxd with just a file name it dumps the data in a fairly standard hex dump format:
# xxd bdata
0000000: 0001 0203 0405
......
Now if you pipe the output back to xxd with the -r option and redirect that to a new file, you can convert the hex dump back to binary:
# xxd bdata | xxd -r >bdata2
# cmp bdata bdata2
# xxd bdata2
0000000: 0001 0203 0405

I've written a short AWK script which reverses hexdump -C output back to the
original data. Use it like this:
reverse-hexdump.sh hex.txt > data
It handles '*' repeat markers and regenerates the original data even when it is binary.
hexdump -C and reverse-hexdump.sh make a data round-trip pair. It is available here:
GitHub reverse-hexdump repo
Direct to reverse-hexdump.sh
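The core idea can be sketched in a few lines of awk (a simplified illustration, not the linked script; unlike reverse-hexdump.sh, this sketch ignores '*' repeat markers):

```shell
# Pull the hex-byte columns out of `hexdump -C` output and hand them to
# xxd -r -p. The loop stops at the first field that is not exactly two hex
# digits, which skips the offset column and the trailing |ascii| column.
hexdump -C file | awk '{
    for (i = 2; i <= 17 && $i ~ /^[0-9a-f][0-9a-f]$/; i++)
        printf "%s", $i
}' | xxd -r -p > restored
```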

Restore file, given only the output of hexdump file
If you only have the output of hexdump file and want to restore the original file, first note that hexdump's default output depends on the endianness of the system you ran hexdump on!
If you have access to the system that created the dump, you can determine its endianness with the following command:
[[ "$(printf '\01\03' | hexdump)" == *0103* ]] && echo big || echo little
Reversing little-endian hexdump
This is the most common case. All x86/x64 systems are little-endian. If you don't know the endianness of the system that ran hexdump file, try this.
sed 's/ \(..\)\(..\)/ \2\1/g;$d' dump | xxd -r
The sed part converts hexdump's format into xxd's format, at least so far that xxd -r works.
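For example, a little-endian hexdump of the four bytes ABCD consists of the line 0000000 4241 4443 plus a final offset line; the sed swaps each byte pair back into file order and drops that final line (the dump here is written by hand so the demo works regardless of the host's own endianness):

```shell
# Hand-written little-endian `hexdump` output for the bytes "ABCD"
printf '0000000 4241 4443\n0000004\n' > dump
# Swap byte pairs, delete the final offset-only line, reverse with xxd
sed 's/ \(..\)\(..\)/ \2\1/g;$d' dump | xxd -r
```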
Reversing big-endian hexdump
sed '$d' dump | xxd -r
Known Bugs (see comment section)
A trailing null byte is added if the original file was of odd length (1, 3, 5, 7, ... bytes long).
Repeating sections of the original file are not restored correctly if they were hexdumped using a *.
You can check your dump for the problematic cases above by running the following command:
grep -qE '^\*|^[0-9a-f]*[13579bdf] *$' dump && echo bug || echo ok
Better alternative to create hexdumps in the first place
Besides the non-POSIX (and therefore not so portable) xxd there is od (octal dump), which should be available on all Unix-like systems, as it is specified by POSIX:
od -tx1 -An -v
This will print a hexadecimal dump, grouping digits as single bytes (-tx1), with no address prefixes (-An, similar to xxd -p) and without abbreviating repeated sections as * (-v). You can reverse such a dump using xxd -r -p.
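A full round-trip with od then looks like this (xxd -r -p tolerates whitespace, so od's spacing needs no cleanup):

```shell
printf 'hello\n' > orig
od -tx1 -An -v orig > dump        # e.g. " 68 65 6c 6c 6f 0a"
xxd -r -p dump > restored
cmp orig restored && echo "round-trip OK"
```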

As someone who sucks at bash, I could not understand the examples already posted.
Here is what would have helped me when I was originally searching:
Take your text file "AYE.TXT" and convert it into a hex dump called "BEE.TXT"
xxd -p "AYE.TXT" > "BEE.TXT"
Take your hex dump file ("BEE.TXT") and convert it back to the ASCII file "CEE.TXT":
xxd -r -p "BEE.TXT" > "CEE.TXT"
Now that you have some simple working code, feel free to check out
"xxd -help" on the command line for an explanation of what all those flags do.
(That part is the easy part, the hard part is the specifics of the bash syntax)

There are a tonne of more elegant ways to get this done, but I've quickly hacked something together that Works for Me (tm) when regenerating a binary file from a hex dump generated by hexdump -C some_file.bin:
sed 's/\(.\{8\}\) \(..\) \(..\) \(..\) \(..\) \(..\) \(..\) \(..\) \(..\)/\1: \2\3 \4\5 \6\7 \8\9/g' some_file.hexdump | sed 's/\(.*\) \(..\) \(..\) \(..\) \(..\) \(..\) \(..\) \(..\) \(..\) |/\1 \2\3 \4\5 \6\7 \8\9 /g' | sed 's/.$//g' | xxd -r > some_file.restored
Basically, it uses 2 sed processes, each handling its part of each line. Ugly, but someone might find it useful.

If you don't have xxd, use hexdump, od, perl or python:
The following all give the same output:
# If you only have hexdump
hexdump -ve '1/1 "%.2x"' mybinaryfile > mydump
# This gives exactly the same output as:
xxd -p mybinaryfile > mydump
# Or, much slower:
od -v -t x1 -An < mybinaryfile | tr -d "\n " > mydump
# Or, the fastest:
perl -pe 'BEGIN{$/=\1e6} $_=unpack "H*"' < mybinaryfile > mydump
# Or, if you somehow have Python, and not Perl:
python -c "print(open('mybinaryfile','rb').read().hex())" > mydump
Then you can copy and paste, or pipe the output, and convert back with:
xxd -r -p mydump mybinaryfileagain
# Or
xxd -r -p < mydump > mybinaryfileagain
The hexdump command is available almost everywhere, and is usually part of the default busybox - if it's not linked, you can try running busybox hexdump or busybox xxd.
If xxd is not available to reverse the data, then you can try awk
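For instance, a plain hex dump like mydump above can be reversed with POSIX awk alone, mapping each byte value through a lookup table built with sprintf "%c". This is only a sketch: NUL bytes are dropped, because awk strings cannot portably contain them.

```shell
# Build a hex-string -> character table (NUL, i.e. "00", is skipped:
# awk strings cannot portably hold it), then emit two hex digits at a time.
awk 'BEGIN { for (i = 1; i < 256; i++) chr[sprintf("%02x", i)] = sprintf("%c", i) }
{
    line = tolower($0)
    for (i = 1; i <= length(line) - 1; i += 2)
        printf "%s", chr[substr(line, i, 2)]
}' mydump > mybinaryfileagain
```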
The old days: Zmodem
In the old days we used to use X/Y/Zmodem which is available in the package lrzsz which can tolerate lossy comms - but it's a bidirectional protocol so the binaries need to be running at the same time and there needs to be bidirectional comms:
# Demo on local machine, using FIFOs
mkfifo /tmp/fifo-in
mkfifo /tmp/fifo-out
sz -b mybinaryfile > /tmp/fifo-out < /tmp/fifo-in
mkdir out; cd out
rz -b < /tmp/fifo-out > /tmp/fifo-in
Luckily, screen supports receiving Zmodem, so if you're in a screen session:
screen
telnet somehost
Then type Ctrl+A and : and then zmodem catch and Enter. Then inside the screen on the remote host, issue:
# sz -b mybinaryfile
Press Enter when you see the string starting with "!!!".
When you see "Transfer Complete", you may want to run reset if you want to continue the terminal session normally.

This program reverses hexdump -C output back to the original data.
Usage:
make
make test
./unhexdump -i inputfile -o outputfile
See https://github.com/zhouzq-thu/unhexdump for details.

I found a simpler solution:
bin2hex
echo -n "abc" | hexdump -ve '1/1 "%02x"'
hex2bin
echo -n "616263" | grep -Eo ".{2}" | sed 's/\(.*\)/\\x\1/' | tr -d '\n' | xargs -0 echo -ne


Is it possible to use grep on a continuous stream?
What I mean is sort of a tail -f <file> command, but with grep on the output in order to keep only the lines that interest me.
I've tried tail -f <file> | grep pattern but it seems that grep can only be executed once tail finishes, that is to say never.
Turn on grep's line buffering mode when using BSD grep (FreeBSD, Mac OS X etc.)
tail -f file | grep --line-buffered my_pattern
It looks like a while ago --line-buffered didn't matter for GNU grep (used on pretty much any Linux) as it flushed by default (YMMV for other Unix-likes such as SmartOS, AIX or QNX). However, as of November 2020, --line-buffered is needed (at least with GNU grep 3.5 in openSUSE, but it seems generally needed based on comments below).
I use the tail -f <file> | grep <pattern> all the time.
It will wait till grep flushes, not till it finishes (I'm using Ubuntu).
I think that your problem is that grep uses some output buffering. Try
tail -f file | stdbuf -o0 grep my_pattern
it will set output buffering mode of grep to unbuffered.
If you want to find matches in the entire file (not just the tail), and you want it to sit and wait for any new matches, this works nicely:
tail -c +0 -f <file> | grep --line-buffered <pattern>
The -c +0 flag says that the output should start 0 bytes (-c) from the beginning (+) of the file.
In most cases, you can tail -f /var/log/some.log |grep foo and it will work just fine.
If you need to use multiple greps on a running log file and you find that you get no output, you may need to stick the --line-buffered switch into your middle grep(s), like so:
tail -f /var/log/some.log | grep --line-buffered foo | grep bar
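As a self-contained illustration (a finite stream stands in for the growing log file here), only lines matching both patterns come through, and each intermediate grep flushes per line:

```shell
# Both greps are line-buffered, so matches appear as soon as they arrive
printf 'foo bar\nbaz\nfoo qux\n' \
    | grep --line-buffered foo \
    | grep --line-buffered bar
# → foo bar
```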
You may consider this answer as an enhancement. Usually I am using
tail -F <fileName> | grep --line-buffered <pattern> -A 3 -B 5
-F is better in case the file is rotated (-f will not work properly if the file is rotated)
-A and -B are useful to get the lines just before and after the pattern occurrence; these blocks will appear between dashed-line separators
But for me, I prefer doing the following:
tail -F <file> | less
This is very useful if you want to search inside streamed logs, i.e. go back and forward and look deeply.
Didn't see anyone offer my usual go-to for this:
less +F <file>
ctrl + c
/<search term>
<enter>
shift + f
I prefer this, because you can use ctrl + c to stop and navigate through the file whenever, and then just hit shift + f to return to the live, streaming search.
sed would be a better choice (stream editor)
tail -n0 -f <file> | sed -n '/search string/p'
and then if you wanted the tail command to exit once you found a particular string:
tail --pid=$(($BASHPID+1)) -n0 -f <file> | sed -n '/search string/{p; q}'
Obviously a bashism: $BASHPID will be the process id of the tail command. The sed command is next after tail in the pipe, so the sed process id will be $BASHPID+1.
Yes, this will actually work just fine. Grep and most Unix commands operate on streams one line at a time. Each line that comes out of tail will be analyzed and passed on if it matches.
This one command works for me (SUSE):
mail-srv:/var/log # tail -f /var/log/mail.info |grep --line-buffered LOGIN >> logins_to_mail
collecting logins to mail service
Coming somewhat late to this question, and considering this kind of work an important part of the monitoring job, here is my (not so short) answer...
Following logs using bash
1. Command tail
This command is a little more powerful than the already-published answers suggest.
Difference between follow option tail -f and tail -F, from manpage:
-f, --follow[={name|descriptor}]
output appended data as the file grows;
...
-F same as --follow=name --retry
...
--retry
keep trying to open a file if it is inaccessible
This means that by using -F instead of -f, tail will re-open the file(s) when they are removed (on log rotation, for example).
This is useful for watching logfiles over many days.
Ability of following more than one file simultaneously
I've already used:
tail -F /var/www/clients/client*/web*/log/{error,access}.log /var/log/{mail,auth}.log \
/var/log/apache2/{,ssl_,other_vhosts_}access.log \
/var/log/pure-ftpd/transfer.log
For following events through hundreds of files... (consider the rest of this answer to understand how to make this readable... ;)
Using the -n switch (don't use -c, which counts bytes, not lines!). By default tail will show the 10 last lines. This can be tuned:
tail -n 0 -F file
will follow the file, but only new lines will be printed.
tail -n +0 -F file
will print the whole file before following its progression.
2. Buffer issues when piping:
If you plan to filter the output, consider buffering! See the -u option for sed, --line-buffered for grep, or the stdbuf command:
tail -F /some/files | sed -une '/Regular Expression/p'
This is (a lot more efficient than using grep and) a lot more reactive than if you don't use the -u switch in the sed command.
tail -F /some/files |
sed -une '/Regular Expression/p' |
stdbuf -i0 -o0 tee /some/resultfile
3. Recent journaling system
On recent systems, instead of tail -f /var/log/syslog you have to run journalctl -xf, in nearly the same way...
journalctl -axf | sed -une '/Regular Expression/p'
But read the man page; this tool was built for log analysis!
4. Integrating this in a bash script
Colored output of two files (or more)
Here is a sample script watching many files, coloring the output differently for the 1st file than for the others:
#!/bin/bash
tail -F "$@" |
sed -une "
/^==> /{h;};
//!{
G;
s/^\\(.*\\)\\n==>.*${1//\//\\\/}.*<==/\\o33[47m\\1\\o33[0m/;
s/^\\(.*\\)\\n==> .* <==/\\o33[47;31m\\1\\o33[0m/;
p;}"
This works fine on my host, running:
sudo ./myColoredTail /var/log/{kern.,sys}log
Interactive script
Are you watching logs in order to react to events?
Here is a little script that plays a sound when some USB device appears or disappears; the same script could send mail, or do any other interaction, like powering on the coffee machine...
#!/bin/bash
exec {tailF}< <(tail -F /var/log/kern.log)
tailPid=$!
while :;do
read -rsn 1 -t .3 keyboard
[ "${keyboard,}" = "q" ] && break
if read -ru $tailF -t 0 _ ;then
read -ru $tailF line
case $line in
*New\ USB\ device\ found* ) play /some/sound.ogg ;;
*USB\ disconnect* ) play /some/othersound.ogg ;;
esac
printf "\r%s\e[K" "$line"
fi
done
echo
exec {tailF}<&-
kill $tailPid
You could quit by pressing Q key.
You certainly won't succeed with
tail -f /var/log/foo.log |grep --line-buffered string2search
when you use "colortail" as an alias for tail, e.g. in bash:
alias tail='colortail -n 30'
You can check with
type tail
If this outputs something like
tail is an alias for colortail -n 30
then you have your culprit :)
Solution:
remove the alias with
unalias tail
ensure that you're using the 'real' tail binary by this command
type tail
which should output something like:
tail is /usr/bin/tail
and then you can run your command
tail -f foo.log |grep --line-buffered something
Good luck.
Use awk (another great utility) instead of grep when you don't have the line-buffered option! It will continuously stream your data from tail.
This is how you would use grep:
tail -f <file> | grep pattern
This is how you would use awk
tail -f <file> | awk '/pattern/{print $0}'

How to check if a file contains only zeros in a Linux shell?

How do I check whether a large file contains only zero bytes ('\0') in Linux, using a shell command? I could write a small program for this, but that seems like overkill.
If you're using bash, you can use read -n 1 to exit early if a non-NUL character has been found:
<your_file tr -d '\0' | read -n 1 || echo "All zeroes."
where you substitute the actual filename for your_file.
The "file" /dev/zero returns a sequence of zero bytes on read, so cmp file /dev/zero should give essentially what you want: for an all-zero file it reports EOF on file just past its length, otherwise it reports the first differing byte.
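In a script, that EOF report is a wrinkle: GNU cmp exits non-zero for an all-zero file too, because file is shorter than /dev/zero. A sketch of a workaround, assuming GNU cmp (its -n option is not POSIX), is to limit the comparison to the file's own length:

```shell
# Succeeds iff the file consists only of NUL bytes.
# -s silences output; -n limits cmp to the file's own byte count,
# so the "EOF on shorter file" case never triggers.
is_all_zeros() {
    cmp -s -n "$(wc -c < "$1")" "$1" /dev/zero
}

printf '\0\0\0' > zeros.bin
printf 'a\0b'   > mixed.bin
is_all_zeros zeros.bin && echo "zeros.bin is all zeroes"
is_all_zeros mixed.bin || echo "mixed.bin has other bytes"
```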
If you have Bash,
cmp file <(tr -dc '\000' <file)
If you don't have Bash, the following should be POSIX (but I guess there may be legacy versions of cmp which are not comfortable with reading standard input):
tr -dc '\000' <file | cmp - file
Perhaps more economically, assuming your grep can read arbitrary binary data,
tr -d '\000' <file | grep -q -m 1 ^ || echo All zeros
I suppose you could tweak the last example even further with a dd pipe to truncate any output from tr after one block of data (in case there are very long sequences without newlines), or even down to one byte. Or maybe just force there to be newlines.
tr -d '\000' <file | tr -c '\000' '\n' | grep -q -m 1 ^ || echo All zeros
It won't win a prize for elegance, but:
xxd -p file | grep -qEv '^(00)*$'
xxd -p prints a file in the following way:
23696e636c756465203c6572726e6f2e683e0a23696e636c756465203c73
7464696f2e683e0a23696e636c756465203c7374646c69622e683e0a2369
6e636c756465203c737472696e672e683e0a0a766f696420757361676528
63686172202a70726f676e616d65290a7b0a09667072696e746628737464
6572722c202255736167653a202573203c
So we grep to see whether there is a line that is not made up entirely of 0's, which would mean there is a char different from '\0' in the file. If there is none, the file is made up entirely of zero chars.
(The return code signals which one happened, I assumed you wanted it for a script. If not, tell me and I'll write something else)
EDIT: added -E for grouping and -q to discard output.
Straightforward:
if [ -n "$(tr -d '\000' < file | head -c 1)" ]; then
echo a nonzero byte
fi
The tr -d removes all null bytes. If there are any bytes left, the if [ -n ... ] sees a nonempty string. (The quotes around the command substitution matter: unquoted, [ -n ] with an empty result would still be true. Also note '\000', not '\0000': tr's octal escapes are at most three digits, so '\0000' would delete the digit 0 as well.)
Completely changed my answer based on the reply here
Try
perl -0777ne'print /^\x00+$/ ? "yes" : "no"' file

Count the number of occurences of binary data

I need to count the occurrences of the hex string 0xFF 0x84 0x03 0x07 in a binary file, without too much hassle... is there a quick way of grepping for this data from the linux command line or should I write dedicated code to do it?
Patterns without linebreaks
If your version of grep takes the -P parameter, then you can use grep -a -P, to search for an arbitrary binary string (with no linebreaks) inside a binary file. This is close to what you want:
grep -a -c -P '\xFF\x84\x03\x07' myfile.bin
-a ensures that binary files will not be skipped
-c outputs the count
-P specifies that your pattern is a Perl-compatible regular expression (PCRE), which allows strings to contain hex characters in the above \xNN format.
Unfortunately, grep -c will only count the number of "lines" the pattern appears on - not actual occurrences.
To get the exact number of occurrences with grep, it seems you need to do:
grep -a -o -P '\xFF\x84\x03\x07' myfile.bin | wc -l
grep -o separates out each match onto its own line, and wc -l counts the lines.
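A quick self-check with a synthetic file containing the pattern twice (the \xNN escapes in printf and the -P pattern are interpreted as raw bytes; t.bin is just a scratch name):

```shell
# Two occurrences of FF 84 03 07, separated by filler bytes
printf '\xFF\x84\x03\x07__\xFF\x84\x03\x07' > t.bin
grep -a -o -P '\xFF\x84\x03\x07' t.bin | wc -l
# → 2
```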
Patterns containing linebreaks
If you do need to grep for linebreaks, one workaround I can think of is to use tr to swap the character for another one that's not in your search term.
# set up test file (0a is newline)
xxd -r <<< '0:08 09 0a 0b 0c 0a 0b 0c' > test.bin
# grep for '\xa\xb\xc' doesn't work
grep -a -o -P '\xa\xb\xc' test.bin | wc -l
# swap newline with oct 42 and grep for that
tr '\n\042' '\042\n' < test.bin | grep -a -o -P '\042\xb\xc' | wc -l
(Note that 042 octal is the double quote " sign in ASCII.)
Another way, if your string doesn't contain Nulls (0x0), would be to use the -z flag, and swap Nulls for linebreaks before passing to wc.
grep -a -o -P -z '\xa\xb\xc' test.bin | tr '\0\n' '\n\0' | wc -l
(Note that -z and -P may be experimental in conjunction with each other. But with simple expressions and no Nulls, I would guess it's fine.)
use hexdump like
hexdump -v -e '"0x" 1/1 "%02X" " "' <filename> | grep -o "0xFF 0x84 0x03 0x07" | wc -l
hexdump will output the binary file in the given format, one 0xNN token per byte
grep -o will find all occurrences of the string, printing each match on its own line, so repeats within a line are counted too
wc -l will give you the final count (wc -w would count four words per match)
did you try grep -a?
from grep man page:
-a, --text
Process a binary file as if it were text; this is equivalent to the --binary-files=text option.
How about:
$ hexdump a.out | grep -Ec 'ff ?84 ?03 ?07'
This doesn't quite answer your question, but does solve the problem when the search string is ASCII but the file is binary:
cat binaryfile | sed 's/SearchString/SearchString\n/g' | grep -c SearchString
Basically, 'grep' was almost there except it only counted one occurrence if there was no newline byte in between, so I added the newline bytes.

How to print only the hex values from hexdump without the line numbers or the ASCII table? [duplicate]

This question already has answers here:
How to create a hex dump of file containing only the hex characters without spaces in bash?
(9 answers)
Closed 7 years ago.
following Convert decimal to hexadecimal in UNIX shell script
I am trying to print only the hex values from hexdump, i.e. without printing the line numbers or the ASCII table.
But the following command line doesn't print anything:
hexdump -n 50 -Cs 10 file.bin | awk '{for(i=NF-17; i>2; --i) print $i}'
Using xxd is better for this job:
xxd -p -l 50 -seek 10 file.bin
From man xxd:
xxd - make a hexdump or do the reverse.
-p | -ps | -postscript | -plain
output in postscript continuous hexdump style. Also known as plain hexdump style.
-l len | -len len
stop after writing <len> octets.
-seek offset
When used after -r: revert with <offset> added to file positions found in hexdump.
You can specify the exact format that you want hexdump to use for output, but it's a bit tricky. Here's the default output, minus the file offsets:
hexdump -e '16/1 "%02x " "\n"' file.bin
(To me, it looks like this would produce an extra trailing space at the end
of each line, but for some reason it doesn't.)
As an alternative, consider using xxd -p file.bin.
First of all, remove -C, which emits the ASCII column.
Then you could drop the offset with
hexdump -n 50 -s 10 file.bin | cut -c 9-
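If you also want to drop the spaces and get continuous plain hex (the equivalent of xxd -p), a custom format string does it in one step; -v prevents abbreviating repeated lines as *:

```shell
# 50 bytes starting at offset 10, printed as bare hex digits
hexdump -n 50 -s 10 -v -e '1/1 "%02x"' file.bin
```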

Using sed to get an env var from /proc/*/environ weirdness with \x00

I'm trying to grovel through some other process's environment to get a specific env var.
So I've been trying a sed command like:
sed -n "s/\x00ENV_VAR_NAME=\([^\x00]*\)\x00/\1/p" /proc/pid/environ
But I'm getting as output the full environ file. If I replace the \1 with just a static string, I get that string plus the entire environ file:
sed -n "s/\x00ENV_VAR_NAME=\([^\x00]*\)\x00/BLAHBLAH/p" /proc/pid/environ
I should just be getting "BLAHBLAH" in the last example. This doesn't happen if I get rid of the null chars and use some other test data set.
This led me to try transforming the \x00s to \x01s, which does seem to work:
cat /proc/pid/environ | tr '\000' '\001' | sed -n "s/\x01ENV_VAR_NAME=\([^\x01]*\)\x01/\1/p"
Am I missing something simple about sed here? Or should I just stick to this workaround?
A lot of programs written in C tend to fail on strings with embedded NULs, as a NUL terminates a C-style string, unless they are specially written to handle it.
I process /proc/*/environ on the command line with xargs:
xargs -n 1 -0 < /proc/pid/environ
This gives you one env var per line. Without a command, xargs just echos the argument. You can then easily use grep, sed, awk, etc on that by piping to it.
xargs -n 1 -0 < /proc/pid/environ | sed -n 's/^ENV_VAR_NAME=\(.*\)/\1/p'
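The same pipeline can be tried without access to /proc by feeding it synthetic NUL-delimited data (the variable names here are made up for the demo):

```shell
# xargs -0 splits on NULs and, with no command given, echoes one arg per line;
# values containing spaces survive intact
printf 'PATH=/bin\0ENV_VAR_NAME=hello world\0HOME=/root\0' \
    | xargs -n 1 -0 \
    | sed -n 's/^ENV_VAR_NAME=\(.*\)/\1/p'
# → hello world
```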
I use this often enough that I have a shell function for it:
pidenv()
{
xargs -n 1 -0 < /proc/${1:-self}/environ
}
This gives you the environment of a specific pid, or self if no argument is supplied.
You could process the list with gawk, setting the record separator to \0 and the field separator to =:
gawk -v 'RS=\0' -F= '$1=="ENV_VAR_NAME" {print $2}' /proc/pid/environ
Or you could use read in a loop to read each NUL-delimited line. For instance:
while read -d $'\0' ENV; do declare "$ENV"; done < /proc/pid/environ
echo $ENV_VAR_NAME
(Do this in a sub-shell to avoid clobbering your own environment.)
cat /proc/PID/environ | tr '\0' '\n' | sed 's/^/export /' ;
then copy and paste as needed.
In spite of this being a really old and answered question, I am adding one very simple one-liner, probably the simplest way of getting text output for further processing:
strings /proc/$PID/environ
For some reason sed does not match \0 with .
% echo -n "\00" | xxd
0000000: 00 .
% echo -n "\00" | sed 's/./a/g' | xxd
0000000: 00 .
% echo -n "\01" | xxd
0000000: 01 .
% echo -n "\01" | sed 's/./a/g' | xxd
0000000: 61 a
Solution: do not use sed or use your workaround.
