efficient Unix script to base64 decode a file with encrypted data - linux

For the sample file below, I need to decode the third column and output the result as a fourth column.
883122374206 883122002074206 UzJocGUUh4c=
883122406445 883122002106445 U5dkVjlVWIc=
883122533096 883122002233096 U0dwcGORRxA=
883122624312 883122002324312 U5OJkFQ1NIc=
883122759484 883122002459484 U4NmgHUwV4c=
883122763589 883122002463589 U4WTAYBQmYc=
883122981968 883122002478427 UyY3QDAAFoI=
883122936510 883122002636510 U4kggBFncxA=
883122326985 883122002666363 U1lwcHcyBBA=
883122330017 883122002668313 U3JlEVRBiIc=
883122339137 883122002673700 UwUiESBIAYc=
883122438696 883122002733023 U1MJgGJgg4c=
883122242176 883122002875188 U4Q3IBFBB0U=
883122230176 883122002883257 U2GUAXdZaIc=
883122532560 883122002232560 U4kVkFBzVhA=
The output should look like:
883122374206 883122002074206 UzJocGUUh4c=
883122406445 883122002106445 U5dkVjlVWIc=
883122533096 883122002233096 U0dwcGORRxA=
883122624312 883122002324312 U5OJkFQ1NIc=
883122759484 883122002459484 U4NmgHUwV4c=
883122763589 883122002463589 U4WTAYBQmYc=
883122981968 883122002478427 UyY3QDAAFoI=
883122936510 883122002636510 U4kggBFncxA=
883122326985 883122002666363 U1lwcHcyBBA=
883122330017 883122002668313 U3JlEVRBiIc=
883122339137 883122002673700 UwUiESBIAYc=
883122438696 883122002733023 U1MJgGJgg4c=
883122242176 883122002875188 U4Q3IBFBB0U=
883122230176 883122002883257 U2GUAXdZaIc=
883122532560 883122002232560 U4kVkFBzVhA=
When decoding one item I use this command:
echo "U2GUAXdZaIc=" | base64 --decode | hexdump -v -e '/1 "%x"' | dd conv=swab status=none;echo
When I decode only column 3 using base64 -d as below, I get a one-line output file. How can I get line-by-line output?
cat DS10_export41.ldif | base64 --decode | hexdump -v -e '/1 "%x"' | dd conv=swab status=none oflag=sync of=DS10_export42.ldif
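One possible way to get line-by-line output is to decode each row separately instead of piping the whole file through base64. The sketch below uses Python rather than shell, and the function names are made up for illustration. It assumes the purpose of the hexdump | dd conv=swab step is to swap the two hex digits of every decoded byte. Unlike hexdump's "%x" format, it pads every byte to two digits, so bytes below 0x10 do not shift the pairs.

```python
import base64


def decode_swapped(b64_text):
    """Base64-decode b64_text and return hex digits with each byte's nibbles swapped."""
    raw = base64.b64decode(b64_text)
    # Format every byte as two hex digits, then reverse the pair (0x53 -> "35").
    return "".join(f"{byte:02x}"[::-1] for byte in raw)


def add_decoded_column(lines):
    """Yield each row with the decoded third column appended as a fourth column."""
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            # Pass blank or malformed rows through unchanged.
            yield line.rstrip("\n")
            continue
        yield " ".join(fields[:3] + [decode_swapped(fields[2])])


# One row taken from the sample in the question.
sample = ["883122230176 883122002883257 U2GUAXdZaIc=\n"]
for row in add_decoded_column(sample):
    print(row)
```

To process a whole file, pass an open file object containing the three-column rows to add_decoded_column and write the yielded rows to the output file.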

Related

parsing data from log using awk

I want to extract machineId, userId, origReqUri, filename, mime, size and checksum as comma-separated values from this log pattern. Is there an awk command to do it?
test1.1/test.log.2020-07-14-20:2020-07-14 20:47:44,239 [http--1594759553405 sessionId:4567 nodeId:node-1 machineId:31656 userId:2540397 origReqUri:/test1/batch] INFO com.test.company - [RETURN INFO - RETURN] - TRACK_PREPROCESSED_DATA_POPULATION: Populated test_doc_version entry for doc version [1130783_1_0] with data from test_doc_metadata. File name: [09014b3080135f44.doc]. Mime type: [application/msword]. Content size: [100352]. MD5 checksum: [7ef30e834107990c95c7e53f7b6f6ee6]. [source:]
I tried
grep machineId:31656 test.1/test.log.2020-07-14-* |grep "Populated test_doc_version entry" | awk machineId |awk origReqUri
I didn't use AWK, but I would resolve your problem using mostly SED and GREP, like this:
sed s/': '/':'/g input | sed s/' '/\\n/g | grep 'machineId\|userId\|origReqUri\|name\|type\|size\|checksum' | sed 's/\[\|\]\|\.//g' | tr '\n' ',' | sed 's/name/filename/g' | sed 's/type/mime/g' | sed 's/.$//'
ps.: "input" is the name of the file where I wrote the input.
The result for the provided input is:
machineId:31656,userId:2540397,origReqUri:/test1/batch,filename:09014b3080135f44doc,mime:application/msword,size:100352,checksum:7ef30e834107990c95c7e53f7b6f6ee6
It is probably not the best solution and we can certainly make it smaller and more beautiful, but I hope it helps you.
There's another solution, simpler and way more readable. You could do it like this:
tr -s ' :[]' ' ' < input | cut -d ' ' -f 12,14,16,39,43,47,51
In here, it's not comma-separated. I guess it's better not to use commas since they are in the list of special symbols.
The result for this one is:
31656 2540397 /test1/batch 09014b3080135f44.doc application/msword 100352 7ef30e834107990c95c7e53f7b6f6ee6
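If the field positions are not stable enough for cut, matching on the labels may be more robust. Here is a rough sketch in Python (not awk) that pulls the seven values out with a regular expression and writes them comma-separated. The pattern is an assumption based only on the single sample line in the question, and the function names are hypothetical.

```python
import csv
import io
import re

# One named group per wanted field, in output order.
PATTERN = re.compile(
    r"machineId:(?P<machineId>\d+)\s+"
    r"userId:(?P<userId>\d+)\s+"
    r"origReqUri:(?P<origReqUri>[^\]\s]+)\]"
    r".*?File name: \[(?P<filename>[^\]]*)\]"
    r".*?Mime type: \[(?P<mime>[^\]]*)\]"
    r".*?Content size: \[(?P<size>\d+)\]"
    r".*?MD5 checksum: \[(?P<checksum>[0-9a-fA-F]+)\]"
)


def extract_fields(line):
    """Return the seven fields of a matching log line as a list, or None."""
    match = PATTERN.search(line)
    return list(match.groups()) if match else None


def to_csv(lines):
    """Render the fields of all matching lines as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        fields = extract_fields(line)
        if fields is not None:
            writer.writerow(fields)
    return out.getvalue()


# Shortened copy of the sample line from the question.
LINE = (
    "2020-07-14 20:47:44,239 [http--1594759553405 sessionId:4567 nodeId:node-1 "
    "machineId:31656 userId:2540397 origReqUri:/test1/batch] INFO com.test.company - "
    "File name: [09014b3080135f44.doc]. Mime type: [application/msword]. "
    "Content size: [100352]. MD5 checksum: [7ef30e834107990c95c7e53f7b6f6ee6]. [source:]"
)
print(to_csv([LINE]), end="")
```

Using the csv module means a value that happens to contain a comma gets quoted instead of silently breaking the columns.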

Hex dump to binary data conversion

I need to convert a hex dump file to binary data using the xxd command (or any other suitable method that works). The raw hex dump was not produced with xxd.
Tried two variants with different options:
xxd -r input.log binout.bin
xxd -r -p input.log binout.bin
Both methods produce wrong results: the first command creates a 2.2 GB binary file and the second produces an 82382-byte binary file. Neither matches the expected size of 65536 bytes.
part of hex file:
807e0000: 4562 537f e0b1 6477 84bb 6bae 1cfe 81a0 | EbS...dw..k.....
807e0010: 94f9 082b 5870 4868 198f 45fd 8794 de6c | ...+XpHh..E....l
807e0020: b752 7bf8 23ab 73d3 e272 4b02 57e3 1f8f | .R{.#.s..rK.W...
807e0030: 2a66 55ab 07b2 eb28 032f b5c2 9a86 c57b | *fU....(./.....{
807e0040: a5d3 3708 f230 2887 b223 bfa5 ba02 036a | ..7..0(..#.....j
807e0050: 5ced 1682 2b8a cf1c 92a7 79b4 f0f3 07f2 | \...+.....y.....
807e0060: a14e 69e2 cd65 daf4 d506 05be 1fd1 3462 | .Ni..e........4b
What can be the issue here and how to convert data correctly?
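Before converting, you need to remove the first part (the offset column) and the last part (the ASCII column) of each line. xxd -r honors the offsets, so it seeks to 0x807e0000 (about 2.2 GB) before writing anything, and xxd -r -p most likely picks up hex digits from the offset and ASCII columns as data.
$ sed -i 's/^[0-9a-f]*: //' binary.txt
$ sed -i 's/ *|.*$//' binary.txt
binary.txt is a copy of your hex dump file.
After that you can convert it to binary again, one byte (two hex digits) at a time, in bash:
$ for i in $(tr -d ' \n' < binary.txt | fold -w 2) ; do printf "\x$i" ; done > mybinary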
After this, if you have the original .bin file, you can compare the md5sums of the two files. If they have the same value, the transformation completed successfully.
$ md5sum originbinary
$ md5sum mybinary
You can find more details in the first part of this post: https://acassis.wordpress.com/2012/10/21/how-to-transfer-files-to-a-linux-embedded-system-over-serial/
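The same cleanup and conversion can also be done in one pass. Below is a minimal sketch in Python, assuming every line has the layout shown in the question (an offset, a colon, up to eight groups of four hex digits, then the ASCII column). The function name is made up for illustration.

```python
import re

# Offset, colon, then up to eight groups of four hex digits; the rest is ignored.
LINE_RE = re.compile(r"^[0-9a-fA-F]+:\s+((?:[0-9a-fA-F]{4}(?:\s+|$)){1,8})")


def dump_to_bytes(lines):
    """Convert dump lines of the form 'offset: hhhh hhhh ... | ascii' to bytes."""
    out = bytearray()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip blank or unrecognized lines
        # Drop the spaces between groups before converting hex to bytes.
        out += bytes.fromhex("".join(match.group(1).split()))
    return bytes(out)


# First two lines of the dump from the question.
sample = [
    "807e0000: 4562 537f e0b1 6477 84bb 6bae 1cfe 81a0 | EbS...dw..k.....",
    "807e0010: 94f9 082b 5870 4868 198f 45fd 8794 de6c | ...+XpHh..E....l",
]
data = dump_to_bytes(sample)
print(len(data), data[:4])
```

For the real file, read the lines from input.log and write the returned bytes to binout.bin opened in "wb" mode. The offsets are ignored, so the output starts at byte 0 rather than at 0x807e0000.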

View images from blob - sqlite [duplicate]

I have a sqlite3 database. One column has the TEXT type and contains blobs which I would like to save as files. Those are gzipped files.
The output of the command sqlite3 db.sqlite3 ".dump" is:
INSERT INTO "data" VALUES(1,'objects','object0.gz',X'1F8B080000000000000 [.. a few thousands of hexadecimal characters ..] F3F5EF')
How can I extract the binary data from the sqlite file to a file using the command line?
sqlite3 cannot output binary data directly, so you have to convert the data to a hexdump, use cut to extract the hex digits from the blob literal, and use xxd (part of the vim package) to convert the hexdump back into binary:
sqlite3 my.db "SELECT quote(MyBlob) FROM MyTable WHERE id = 1;" \
| cut -d\' -f2 \
| xxd -r -p \
> object0.gz
With SQLite 3.8.6 or later, the command-line shell includes the fileio extension, which implements the writefile function:
sqlite3 my.db "SELECT writefile('object0.gz', MyBlob) FROM MyTable WHERE id = 1"
You can use writefile if using the sqlite3 command line tool:
Example usage:
select writefile('blob.bin', blob_column) from table where key='12345';
In my case, I use "hex" instead of "quote" to retrieve an image from the database, so there is no need for "cut" in the pipeline. For example:
sqlite3 fr.db "select hex(bmp) from reg where id=1" | xxd -r -p > 2.png
I had to make some minor changes on CL's answer, in order to make it work for me:
The structure for the command that he is using does not have the database name in it, the syntax that I am using is something like:
sqlite3 mydatabase.sqlite3 "Select quote(BlobField) From TableWithBlod Where StringKey = '1';" | ...
The way he is using the cut command does not work on my machine. The correct way for me is:
cut -d "'" -f2
So the final command would be something like:
sqlite3 mydatabase.sqlite3 "Select quote(BlobField) From TableWithBlod Where StringKey = '1';" | cut -d "'" -f2 | xxd -r -p > myfile.extension
And in my case:
sqlite3 osm-carto_z14_m8_m.mbtiles "select quote(images.tile_data) from images where images.tile_id = '1';" | cut -d "'" -f2 | xxd -r -p > image.png
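If a scripting language is available, the hex round trip can be skipped entirely. The following is only a sketch using Python's standard sqlite3 module. The helper name export_blob and the demo table and column names are hypothetical; adapt the query to your own schema.

```python
import os
import sqlite3
import tempfile


def export_blob(conn, query, params, out_path):
    """Write the first column of the first result row to out_path; return byte count."""
    row = conn.execute(query, params).fetchone()
    if row is None or row[0] is None:
        raise LookupError("query returned no data")
    data = row[0]
    if isinstance(data, str):
        # A value stored as text arrives as str; encode it before writing.
        data = data.encode("utf-8")
    with open(out_path, "wb") as fh:
        fh.write(data)
    return len(data)


# Demo against a throwaway in-memory database (table and column names are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, name TEXT, content TEXT)")
conn.execute("INSERT INTO data VALUES (1, 'object0.gz', ?)", (b"\x1f\x8b\x08\x00",))
out_path = os.path.join(tempfile.mkdtemp(), "object0.gz")
written = export_blob(conn, "SELECT content FROM data WHERE id = ?", (1,), out_path)
print(written, "bytes written to", out_path)
```

A blob stays a blob even in a column declared as TEXT, so the value comes back as bytes and can be written unchanged.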

Find HEX value in file and grep the following value

I have a 2GB file in raw format. I want to search for all occurrences of a specific HEX value "355A3C2F74696D653E" and collect the following 28 characters.
Example: 355A3C2F74696D653E323031312D30342D32365431343A34373A30322D31343A34373A3135
In this case I want the output: "323031312D30342D32365431343A34373A30322D31343A34373A3135" or better: 2011-04-26T14:47:02-14:47:15
I have tried with
xxd -u InputFile | grep '355A3C2F74696D653E' | cut -c 1-28 > OutputFile.txt
and
xxd -u -ps -c 4000000 InputFile | grep '355A3C2F74696D653E' | cut -b 1-28 > OutputFile.txt
But I can't get it working.
Can anybody give me a hint?
As you are using xxd it seems to me that you want to search the file as if it were binary data. I'd recommend using a more powerful programming language for this; the Unix shell tools assume there are line endings and that the text is mostly 7-bit ASCII. Consider using Python:
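#!/usr/bin/env python3
import mmap

# Byte sequence 35 5A 3C 2F 74 69 6D 65 3E, i.e. b"5Z</time>"
needle = b"\x35\x5A\x3C\x2F\x74\x69\x6D\x65\x3E"
with open("file_to_search", "rb") as fd:
    haystack = mmap.mmap(fd.fileno(), length=0, access=mmap.ACCESS_READ)
    i = haystack.find(needle)
    while i >= 0:
        i += len(needle)
        # Print the 28 bytes that follow each match as text
        print(haystack[i : i + 28].decode("ascii", errors="replace"))
        i = haystack.find(needle, i)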
If your grep supports -P parameter then you could simply use the below command.
$ echo '355A3C2F74696D653E323031312D30342D32365431343A34373A30322D31343A34373A3135' | grep -oP '355A3C2F74696D653E\K.{28}'
323031312D30342D32365431343A
For 56 chars,
$ echo '355A3C2F74696D653E323031312D30342D32365431343A34373A30322D31343A34373A3135' | grep -oP '355A3C2F74696D653E\K.{56}'
323031312D30342D32365431343A34373A30322D31343A34373A3135
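The 28 versus 56 difference comes from each byte being two hex digits. To go one step further and get the readable text, the hex digits can be converted back to bytes. A small sketch in Python, assuming the input is one unbroken hex string (for example xxd plain output with line breaks removed); the function name is made up for illustration:

```python
PREFIX = "355A3C2F74696D653E"  # hex for "5Z</time>"


def following_text(hex_dump, n_bytes=28):
    """Return the n_bytes after each PREFIX occurrence, decoded as ASCII text."""
    hex_dump = hex_dump.upper()
    results = []
    pos = hex_dump.find(PREFIX)
    while pos >= 0:
        start = pos + len(PREFIX)
        chunk = hex_dump[start:start + 2 * n_bytes]
        # Only accept matches that begin on a byte boundary and are complete.
        if pos % 2 == 0 and len(chunk) == 2 * n_bytes:
            results.append(bytes.fromhex(chunk).decode("ascii", errors="replace"))
        pos = hex_dump.find(PREFIX, pos + 1)
    return results


example = ("355A3C2F74696D653E"
           "323031312D30342D32365431343A34373A30322D31343A34373A3135")
print(following_text(example))
```

The byte-boundary check avoids false hits where the pattern happens to start in the middle of a byte, which a plain text search over hex digits cannot rule out.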
Why convert to hex first? See if this awk script works for you. It looks for the string you want to match on, then prints the next 28 characters. The slash is escaped with a backslash in the pattern; < and > are left unescaped because gawk treats \< and \> as word-boundary operators.
Adapted from this post: Grep characters before and after match?
I added some blank lines for readability.
VirtualBox:~$ cat data.dat
Thisis a test of somerandom characters before thestringI want5Z</time>2011-04-26T14:47:02-14:47:15plus somemoredata
VirtualBox:~$ cat test.sh
awk '/5Z<\/time>/ {
    match($0, /5Z<\/time>/); print substr($0, RSTART + 9, 28);
}' data.dat
VirtualBox:~$ ./test.sh
2011-04-26T14:47:02-14:47:15
VirtualBox:~$
EDIT: I just realized something. The regular expression may need to be tweaked (to be non-greedy, etc.), and the awk script needs to be changed to handle multiple occurrences, as you need. Perhaps some of the folks more up on awk can chime in with improvements, as I am real rusty. An approach to consider anyway.

Base64 decoding sometimes has "%" at the end of the result. Is it supposed to be there? Any solution to that?

I am just studying base64 encoding and decoding algorithms and trying some programs. I found some example code online, but the result looks a little weird to me.
Here is the link: http://knol2share.blogspot.com/2011/07/base64-encoding-and-decoding-in-c.html
I tried to use it to encode and decode a string.
Enter a string:
02613
Base64 Encoded value: MDI2MTM=
Base64 Decoded value: 02613% -- I do not know why there is a "%", is there a way to get the correct result
I even tried the Base64 program in linux and got the same result after removing the newline in encoding.
Here is the result:
%echo -n 02613 |base64
MDI2MTM=
%echo -n MDI2MTM= | base64 --decode
02613%
Does anyone know how I can get the exact same result with the input string? Thanks.
It is printed by the shell, not by base64. Shells such as zsh show a % to mark output that does not end with a newline, so it is not part of the decoded text.
$ printf "foobar\n" | base64 | base64 --decode
foobar
$ printf "foobar" | base64 | base64 --decode
foobar%
Isn't the % sign a command prompt ?
Add new line after decoded b64 and check.
Here's the example with the echo command. The -n option removes the trailing newline character; that's why in the first case we don't have the % symbol and in the second case we have one appended.
➜ ~ echo "HELLO" | base64 | base64 --decode
HELLO
➜ ~ echo -n "HELLO" | base64 | base64 --decode
HELLO%
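One way to confirm that the % is not part of the decoded data is to look at the raw bytes directly. A minimal check, sketched here in Python rather than with the C program from the question:

```python
import base64

# Decode the value from the question and show the exact bytes.
decoded = base64.b64decode("MDI2MTM=")
print(repr(decoded))
print(len(decoded), "bytes")
```

The result is exactly the five input characters, with no trailing % and no newline. That is why the shell adds its own marker when the output is printed to a terminal.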
