What encoding is this binary string? - text

I have a binary data file and need to retrieve some data from it. Through trial and error and with the help of a hex editor, I have identified the regions of text that I need, but I'm not sure what encoding is being used.
Each character uses two bytes, but in my sample set the second byte is always zero.
1F00 : a
1C00 : b
1A00 : d
1B00 : e
1900 : g
1600 : h
1700 : i
1500 : k
1200 : l
1000 : n
1100 : o
0E00 : p
0F00 : q
0C00 : r
0D00 : s
0A00 : t
0B00 : u
0800 : v
0900 : w
5000 : .
5E00 : <- space
3F00 : A
3C00 : B
3D00 : C
3A00 : D
3B00 : E
2D00 : S
For example, the word hello is represented as
16 00 1B 00 12 00 12 00 11 00
Obviously the weird thing is that 0x41 is not A, and that the alphabet is not even consecutive. It is possible that some weird cipher is being used, but I doubt it.
Joop Eggen found the solution below - a simple xor!

You have probably noticed it already, but the values are simply XORed.
This is a poor man's encryption. Treating every char as an int:
code = (plain ^ 0x7e) << 8
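A rough sketch of that rule in Python, in case it helps. The function names are mine, and it assumes the second byte of every pair carries no information (it is always 00 in the sample), so only the first byte of each pair is XORed with 0x7E:

```python
XOR_KEY = 0x7E  # key taken from the formula above


def decode(data: bytes) -> str:
    """Decode two-byte units: XOR the first byte of each pair, ignore the second.

    Assumption: the second byte is always 0x00, as in the sample set.
    """
    return "".join(chr(b ^ XOR_KEY) for b in data[0::2])


def encode(text: str) -> bytes:
    """Inverse of decode(): XOR each character code and append a zero byte."""
    out = bytearray()
    for ch in text:
        out += bytes([ord(ch) ^ XOR_KEY, 0x00])
    return bytes(out)


sample = bytes.fromhex("16001B00120012001100")
print(decode(sample))            # hello
print(encode("hello").hex(" "))  # 16 00 1b 00 12 00 12 00 11 00
```

If a non-zero second byte ever shows up in the real file, this sketch would silently drop it, so it is worth checking for that before relying on the output.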

If you are on Linux, try enca (it detects the character set and encoding of text files and can also convert them to other encodings).

There is definitely a pattern. Laid out on a grid of code bytes from 04 to 1F, each letter a to z lands on exactly one code:
a=1F b=1C c=1D d=1A e=1B f=18 g=19 h=16 i=17 j=14 k=15 l=12 m=13
n=10 o=11 p=0E q=0F r=0C s=0D t=0A u=0B v=08 w=09 x=06 y=07 z=04

Related

Chip EMV - Getting AFL for every smart card

Continue from: EMV Reading PAN Code
I'm working in C, so I don't have the Java tools and functions that automatically parse the response to an APDU command.
I want to read all types of smart cards.
I have to parse the response to a GET PROCESSING OPTIONS command and get the AFL (Application File Locator) of every card.
I have three cards with three different situations:
A) HelloBank: 77 12 82 2 38 0 94 c 10 2 4 1 18 1 1 0 20 1 1 0 90
B) PayPal: 77 12 82 2 39 0 94 c 18 1 1 0 20 1 1 0 28 1 3 1 90
C) PostePay: 80 a 1c 0 8 1 1 0 18 1 2 0 90
Case A)
I get three AFL entries: 10 2 4 1, 18 1 1 0, 20 1 1 0
So I send 00 B2 SFI P2 00, where SFI is 10>>3 (10 being the first byte of the first AFL entry) and P2 is SFI<<3|4, and this way I get the correct PAN of my card.
Case B)
I get three AFL entries: 18 1 1 0, 20 1 1 0, 28 1 3 1.
So I send 00 B2 SFI P2 00, built in the same way as in Case A, but I get the response 6A 83 for every AFL entry.
Case C)
I get two AFL entries: 8 1 1 0, 18 1 2 0, but I cannot parse them automatically because the response does not use the same tag as the previous ones.
If I use those AFL entries, it works and I can get the PAN of the card.
How can I read the correct AFL in a universal way, and how do I build the correct command from it?
Here is the decoding of AFL:
The AFL normally comes as a multiple of 4 bytes. Divide the complete AFL into chunks of 4 bytes. Let's take one chunk as an example:
AABBCCDD
AA -> SFI (Decoding is described below)
BB -> First Record under this SFI
CC -> Last Record under this SFI
DD -> Record involved for Offline Data Authentication (Not for your use for the moment)
Taking your example 10 02 04 01 18 01 01 00 20 01 01 00
Chunks are 10 02 04 01, 18 01 01 00, 20 01 01 00
10 02 04 01 -->
1st byte 10 : 00010000. Take the first 5 bits from the MSB --> 00010 = 2, so this is SFI 2
2nd byte 02 : the first record under SFI 2 is 02
3rd byte 04 : the last record under SFI 2 is 04
The 4th byte is skipped here since it is not needed
Summary : SFI 2 contains records 2 to 4
How the READ RECORD command is formed:
APDU structure : CLA INS P1 P2 LE
CLA 00
INS B2
P1 (record number) 02 (since the first record in SFI 2 is 02)
P2 (SFI) SFI 02 : write the SFI as 5 binary digits (00010) and append 100 at the end : 00010100 : in hex, 14
So P2 is 14
LE 00
APDU to Read SFI 2 Rec 2 : 00 B2 02 14 00
APDU to Read SFI 2 Rec 3 : 00 B2 03 14 00
APDU to Read SFI 2 Rec 4 : 00 B2 04 14 00
If you now try to read Rec 5, you will get SW 6A83, since this record is not present.
Use the same procedure for every chunk to identify the available records and SFIs.
With this mechanism you can write a function to parse the AFL.
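The steps above can be sketched as follows. The question is about C, but Python keeps the sketch short and the bit operations carry over unchanged. The function names are mine. The extract_afl helper is an addition that the answer above does not cover: it assumes the two GPO response formats from EMV Book 3 (tag 80: AIP followed directly by the AFL; tag 77: TLV objects with the AFL under tag 94), and it only handles short-form lengths.

```python
def extract_afl(gpo_data: bytes) -> bytes:
    """Return the raw AFL from GPO response data (status word already removed)."""
    tag, length = gpo_data[0], gpo_data[1]
    value = gpo_data[2:2 + length]
    if tag == 0x80:
        # Format 1: 2-byte AIP, then the AFL with no tag of its own.
        return value[2:]
    if tag != 0x77:
        raise ValueError(f"unexpected template tag {tag:#04x}")
    # Format 2: walk the TLV objects until tag 94 (AFL) is found.
    i = 0
    while i < len(value):
        t = value[i]
        i += 1
        if (t & 0x1F) == 0x1F:        # two-byte tag such as 9F36
            t = (t << 8) | value[i]
            i += 1
        n = value[i]                  # assumption: short-form length (< 0x80)
        i += 1
        if t == 0x94:
            return value[i:i + n]
        i += n
    raise ValueError("no AFL (tag 94) in response")


def parse_afl(afl: bytes) -> list:
    """Split the AFL into (sfi, first_record, last_record, oda_count) tuples."""
    if len(afl) % 4 != 0:
        raise ValueError("AFL length must be a multiple of 4")
    entries = []
    for i in range(0, len(afl), 4):
        sfi_byte, first, last, oda = afl[i:i + 4]
        entries.append((sfi_byte >> 3, first, last, oda))  # SFI = top 5 bits
    return entries


def read_record_apdus(afl: bytes) -> list:
    """Build one READ RECORD APDU per record listed in the AFL."""
    apdus = []
    for sfi, first, last, _oda in parse_afl(afl):
        p2 = (sfi << 3) | 0x04        # SFI in the top 5 bits, then 100
        for record in range(first, last + 1):
            apdus.append(bytes([0x00, 0xB2, record, p2, 0x00]))
    return apdus


card_a = bytes.fromhex("7712820238009 40c100204011801010020010100".replace(" ", ""))
card_c = bytes.fromhex("800a1c000801010018010200")
for apdu in read_record_apdus(extract_afl(card_a)):
    print(apdu.hex(" "))
print(parse_afl(extract_afl(card_c)))  # [(1, 1, 1, 0), (3, 1, 2, 0)]
```

Note that P1 is always the record number, never the SFI. For the Case B card the first command comes out as 00 B2 01 1C 00; putting the SFI (3) into P1 instead would ask for record 3 of SFI 3, which that AFL does not list, and that may well be where the 6A 83 came from.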

difficulty understanding the example in RFC 6979

I'm trying to follow section A.1.2 of RFC 6979 and am having some difficulty.
So h1 is as follows:
h1
AF 2B DB E1 AA 9B 6E C1 E2 AD E1 D6 94 F4 1F C7
1A 83 1D 02 68 E9 89 15 62 11 3D 8A 62 AD D1 BF
If that is run through bits2octets(h1) you're supposed to get this:
01 79 5E DF 0D 54 DB 76 0F 15 6D 0D AC 04 C0 32
2B 3A 20 42 24
I don't understand how.
Here's bits2octets defined in Java (from the RFC):
private byte[] bits2octets(byte[] in)
{
    BigInteger z1 = bits2int(in);
    BigInteger z2 = z1.subtract(q);
    return int2octets(z2.signum() < 0 ? z1 : z2);
}
Here's bits2int:
private BigInteger bits2int(byte[] in)
{
    BigInteger v = new BigInteger(1, in);
    int vlen = in.length * 8;
    if (vlen > qlen) {
        v = v.shiftRight(vlen - qlen);
    }
    return v;
}
Here's q:
q = 0x4000000000000000000020108A2E0CC0D99F8A5EF
h1 is 32 bytes long. q is 21 bytes long.
So bits2int returns the first 21 bytes of h1. ie.
af2bdbe1aa9b6ec1e2ade1d694f41fc71a831d0268
Convert that to an integer and then subtract q and you get this:
  af2bdbe1aa9b6ec1e2ade1d694f41fc71a831d0268
- 04000000000000000000020108A2E0CC0D99F8A5EF
  ------------------------------------------
  ab2bdbe1aa9b6ec1e2addfd58c513efb0ce9245c79
The result is positive, so it (z2) is kept.
Then int2octets() is called.
private byte[] int2octets(BigInteger v)
{
    byte[] out = v.toByteArray();
    if (out.length < rolen) {
        byte[] out2 = new byte[rolen];
        System.arraycopy(out, 0,
            out2, rolen - out.length,
            out.length);
        return out2;
    } else if (out.length > rolen) {
        byte[] out2 = new byte[rolen];
        System.arraycopy(out, out.length - rolen,
            out2, 0, rolen);
        return out2;
    } else {
        return out;
    }
}
q and v are the same size so ab2bdbe1aa9b6ec1e2addfd58c513efb0ce9245c79
is returned. But that's not what the test vector says:
bits2octets(h1)
01 79 5E DF 0D 54 DB 76 0F 15 6D 0D AC 04 C0 32
2B 3A 20 42 24
I don't get it. Did I mess up in my analysis somewhere?
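The output is obtained as (0xaf2b...d1bf >> (256 - 163)) mod q = 0x0179...4224. Your mistake was assuming bits2int shifted bytes instead of bits: qlen is the length of q in bits (163), not 21 bytes times 8 (168), so the shift is 256 - 163 = 93 bits rather than 11 whole bytes.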
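To check this numerically, here is a small Python sketch of the three functions using plain integers. The names mirror the Java above; deriving qlen and rolen from q inside bits2octets is my own shortcut, not something the RFC code does.

```python
# q from the question, written as 2^162 plus its low part (163 bits in total).
Q = (1 << 162) | 0x20108A2E0CC0D99F8A5EF

H1 = bytes.fromhex(
    "AF2BDBE1AA9B6EC1E2ADE1D694F41FC7"
    "1A831D0268E9891562113D8A62ADD1BF"
)


def bits2int(data: bytes, qlen: int) -> int:
    """Keep the leftmost qlen BITS of the input (not bytes)."""
    v = int.from_bytes(data, "big")
    vlen = len(data) * 8
    if vlen > qlen:
        v >>= vlen - qlen
    return v


def int2octets(v: int, rolen: int) -> bytes:
    """Big-endian encoding padded to exactly rolen bytes."""
    return v.to_bytes(rolen, "big")


def bits2octets(data: bytes, q: int) -> bytes:
    qlen = q.bit_length()        # 163 for this q
    rolen = (qlen + 7) // 8      # 21 bytes
    z1 = bits2int(data, qlen)
    z2 = z1 - q
    return int2octets(z1 if z2 < 0 else z2, rolen)


print(Q.bit_length())                    # 163
print(bits2octets(H1, Q).hex().upper())  # 01795EDF0D54DB760F156D0DAC04C0322B3A204224
```

The shift is 256 - 163 = 93 bits, which is not a whole number of bytes, and that is why slicing off the first 21 bytes of h1 cannot give the same value.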

Wordnet database has letters in weird/invalid places

I was noticing that some lines in the database files (like data.verb) are not following the correct format. (The database format is outlined here).
02286687 40 v 0a fall_upon d strike 0 come_upon 9 light_upon 0 chance_upon 0 come_across 2 chance_on 0 happen_upon 0 attain d discover 0 003 # 02285629 v 0000 + 07214432 n 0a01 + 00043195 n 0a01 01 + 08 00 | find unexpectedly; "the archeologists chanced upon an old tomb"; "she struck a goldmine"; "The hikers finally struck the main path to the lake"
Here the w_cnt 0a should be the number 10. This also happens in other places, like:
02575723 41 v 08 flim-flam 0 play_a_joke_on 1 play_tricks 0 trick 0 fob 0 fox 0 pull_a_fast_one_on 0 play_a_trick_on 0 008 # 02575082 v 0000 + 10022759 n 0602 + 00171618 n 0401 + 10463714 n 0404 + 06760722 n 0401 + 00752954 n 0401 + 00779248 n 010c ~ 02578384 v 0000 02 + 09 00 + 30 04 | deceive somebody; "We tricked the teacher into thinking that class would be cancelled next week"
Here 010c isn't a valid number, unless [digit][letter] is a valid format that is not described in the documentation I have read so far.
Why are there random letters among the numbers?
Looks like the numbers are in hexadecimal format - A is 10, for example.
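As an illustration, here is a hedged Python sketch that reads the first line from the question with that in mind. It assumes the field layout from the WordNet database documentation as I understand it: w_cnt is two hex digits, each lex_id is one hex digit, p_cnt is a three-digit decimal number, and the four-character source/target field is two 2-digit hex numbers. The function name and the returned dictionary are my own.

```python
SAMPLE = (
    "02286687 40 v 0a fall_upon d strike 0 come_upon 9 light_upon 0 "
    "chance_upon 0 come_across 2 chance_on 0 happen_upon 0 attain d "
    "discover 0 003 # 02285629 v 0000 + 07214432 n 0a01 + 00043195 n 0a01 "
    "01 + 08 00 | find unexpectedly"
)


def parse_data_line(line: str) -> dict:
    """Parse the synset header, words and pointers of one data.* line."""
    fields, _, gloss = line.partition("|")
    tok = fields.split()
    w_cnt = int(tok[3], 16)                  # hexadecimal: "0a" -> 10
    pos = 4
    words = []
    for _ in range(w_cnt):
        # each word is followed by a one-digit hexadecimal lex_id
        words.append((tok[pos], int(tok[pos + 1], 16)))
        pos += 2
    p_cnt = int(tok[pos])                    # decimal: "003" -> 3
    pos += 1
    pointers = []
    for _ in range(p_cnt):
        symbol, offset, ss_type, src_tgt = tok[pos:pos + 4]
        # source/target: two 2-digit hexadecimal word numbers
        pointers.append(
            (symbol, offset, ss_type, int(src_tgt[:2], 16), int(src_tgt[2:], 16))
        )
        pos += 4
    return {
        "synset_offset": tok[0],
        "lex_filenum": int(tok[1]),
        "ss_type": tok[2],
        "w_cnt": w_cnt,
        "words": words,
        "p_cnt": p_cnt,
        "pointers": pointers,
        "gloss": gloss.strip(),
    }


entry = parse_data_line(SAMPLE)
print(entry["w_cnt"], len(entry["words"]))  # 10 10
print(entry["pointers"][1])                 # ('+', '07214432', 'n', 10, 1)
```

Read this way, 010c in the second line would be source word 1 and target word 0x0c = 12, which is consistent with the hexadecimal explanation.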

HEX & Decimal conversion

I have a binary file whose record layout is defined below. All data is stored
in little-endian order (i.e. least significant byte first). The example numbers below are in hex.
11 63 39 46 --- Time, UTC in seconds since 1 Jan 1970.
01 00 --- 0001 = No Fix, 0002 = SPS
97 85 ff e0 7b db 4c 40 --- Latitude, as double
a1 d5 ce 56 8d 26 28 40 --- Longitude, as double
f0 37 e1 42 --- Height in meters, as float
fe 2b f0 3a --- Speed in km/h, as float
00 00 00 00 --- Heading (degrees ?), as float
01 00 --- RCR, log reason. 0001=Time, 0004=Distance
59 20 6a f3 4a 26 e3 3f --- Distance in meters, as double,
2a --- ? Don't know
a8 --- Checksum, xor of all bytes above not including 0x2a
The data from the binary file, in hex, is as below:
"F25D39460200269652F5032445401F4228D79BCC54C09A3A2743B4ADE73F2A83"
I would appreciate help translating this line of data according to the layout above.
Probably wrong, but here's a shot at it using Ruby:
hex = "F25D39460200269652F5032445401F4228D79BCC54C09A3A2743B4ADE73F2A83"
ints = hex.scan(/../).map{ |s| s.to_i(16) }
raw = ints.pack('C*')
fields = raw.unpack( 'VvEEVVVvE')
p fields
#=> [1178164722, 2, 42.2813707974677, -83.1970117467067, 1126644378, 1072147892, nil, 33578, nil]
p Time.at( fields.first )
#=> 2007-05-02 21:58:42 -0600
I'd appreciate it if someone well-versed in #pack and #unpack would show me a better way to accomplish the first three lines.
My Cygnus Hex Editor could load such a file and, using structure templates, display the data in its native formats.
Beyond that, it's just a matter of going through each value and working out the translation for each byte.
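For what it's worth, here is how the same record might be decoded with Python's struct module. This is only a sketch under one assumption: the sample line is 32 bytes, not the 46 bytes the full layout would need, so I assume it carries only time, fix, latitude, longitude, height and speed, followed by the 0x2a byte and the checksum. The function name and dictionary keys are mine.

```python
import struct
from datetime import datetime, timezone
from functools import reduce

RECORD_HEX = "F25D39460200269652F5032445401F4228D79BCC54C09A3A2743B4ADE73F2A83"


def decode_record(raw: bytes) -> dict:
    """Decode one 32-byte record (assumed layout, see the note above)."""
    body, marker, checksum = raw[:-2], raw[-2], raw[-1]
    # "<" = little-endian without padding; I = uint32, H = uint16,
    # d = 8-byte double, f = 4-byte float
    utc, fix, lat, lon, height, speed = struct.unpack("<IHddff", body)
    # checksum: XOR of every byte before the 0x2a marker
    xor = reduce(lambda a, b: a ^ b, body, 0)
    return {
        "utc": utc,
        "time": datetime.fromtimestamp(utc, tz=timezone.utc),
        "fix": fix,
        "latitude": lat,
        "longitude": lon,
        "height_m": height,
        "speed_kmh": speed,
        "marker": marker,
        "checksum_ok": xor == checksum,
    }


record = decode_record(bytes.fromhex(RECORD_HEX))
for key, value in record.items():
    print(key, value)
```

With this layout the XOR of the first 30 bytes comes out as 0x83, matching the last byte of the sample, which suggests the shortened layout is at least plausible. Height and speed are 4-byte floats (about 167.2 m and 1.81 km/h here), so they need a float format code rather than a 32-bit integer one.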

Haskell doubt: how to transform a Matrix represented as: [String] to a Matrix Represented as [[Int]]?

I'm trying to solve Problem 11 of Project Euler in Haskell. I almost did it, but right now I'm
stuck: I want to transform a matrix represented as [String] into a matrix represented as [[Int]].
I drew the matrices:
What I want:
"08 02 22 97 38 15 00 40                    [ ["08","02","22","97","38","15","00","40"],               [[08,02,22,97,38,15,00,40]
 49 49 99 40 17 81 18 57   map words lines    ["49","49","99","40","17","81","18","57"],   ??a          [49,49,99,40,17,81,18,57]
 81 49 31 73 55 79 14 29   ---------->        ["81","49","31","73","55","79","14","29"],   --------->   [81,49,31,73,55,79,14,29]
 52 70 95 23 04 60 11 42                      ["52","70","95","23","04","60","11","42"],                [52,70,95,23,04,60,11,42]
 22 31 16 71 51 67 63 89                      ["22","31","16","71","51","67","63","89"],                [22,31,16,71,51,67,63,89]
 24 47 32 60 99 03 45 02"                     ["24","47","32","60","99","03","45","02"] ]               [24,47,32,60,99,03,45,02]]
I'm stuck on the last transformation (??a).
Out of curiosity (and to learn), I also want to know how to build a matrix of digits:
Input:
"123456789               [ "123456789"                [ [1,2,3,4,5,6,7,8,9]
 124834924   lines         "124834924"    ??b           [1,2,4,8,3,4,9,2,4]
 328423423   --------->    "328423423"    --------->    [3,2,8,4,2,3,4,2,3]
 334243423                 "334243423"                  [3,3,4,2,4,3,4,2,3]
 932402343"                "932402343" ]                [9,3,2,4,0,2,3,4,3] ]
What is the best way to do (??a) and (??b)?
What you want is the read function:
read :: (Read a) => String -> a
This thoughtfully parses a string into whatever you're expecting (as long as it's an instance of the class Read, but fortunately Int is such).
So just map that over the words, like so:
parseMatrix :: (Read a) => String -> [[a]]
parseMatrix s = map (map read . words) $ lines s
Just use that in a context that expects [[Int]] and Haskell's type inference will take it from there.
To get the digits, just remember that String is actually just [Char]. Instead of using words, map a function that turns each Char into a single-element list; everything else is the same.

Resources