Octave advanced textread usage, bash - linux

I have the following text file:
079082084072079032084069067072000000000,0
082078032049050032067072065082071069000,1
076065066032065083083084000000000000000,0
082078032049050072082000000000000000000,1
082078032049050072082000000000000000000,1
082078032049050072082000000000000000000,1
070083087032073073032080068000000000000,0
080067065032049050032072082000000000000,0
082078032056072082000000000000000000000,1
070083087032073073073000000000000000000,0
082078032087069069075069078068000000000,1
082078032049050072082000000000000000000,1
077065073078084032077069067072032073073,0
082078032049050072082000000000000000000,1
080067065032049050032072082000000000000,0
082078032049050072082000000000000000000,1
I need two matrices:
X size 16x13
Y size 16x1
I want to separate each row of the file into 13 values, example:
079 082 084 072 079 032 084 069 067 072 000 000 000
Is it possible to import it into Octave using the textread function?
If not, can it be done using a Linux bash command?

Yes, you can do this with textscan (see the bottom if you really want to use textread):
octave> txt = "079082084072079032084069067072000000000,0\n082078032049050032067072065082071069000,1";
octave> textscan (txt, repmat ("%3d", 1, 13))
ans =
{
[1,1] =
79
82
[1,2] =
82
78
[1,3] =
84
32
[1,4] =
72
49
[...]
Note that you are reading them as numeric values, so you do not get the leading zeros. If you want them, you can read them as strings by using "%3s" in the format, at the cost of extra trouble and reduced performance, since you will then be handling cell arrays.
Since you are reading from a file:
[fid, msg] = fopen ("data.txt", "r");
if (fid < 0)
  error ("failed to fopen 'data.txt': %s", msg);
endif
data = textscan (fid, repmat ("%3d", 1, 13));
fclose (fid);
If you really want to use textread:
octave> [d1, d2, d3, d4, d5, d6, d7, d8, d9, d10, d11, d12, d13] = textread ("data.txt", repmat ("%3d", 1, 13))
d1 =
79
82
76
[...]
d2 =
82
78
65
[...]
or:
octave> data = cell (1, 13);
octave> [data{:}] = textread ("data.txt", repmat ("%3d", 1, 13))
data =
{
[1,1] =
79
82
76
[...]
[1,2] =
82
78
65
[...]
If you need to capture the value after the comma (not really part of your original question), you can use:
octave> textscan (txt, [repmat("%3d", 1, 13) ",%1d"])
ans =
{
[1,1] =
79
82
[1,2] =
82
78
[1,3] =
84
32
[...]
[1,14] =
0
1
}
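For comparison, the same fixed-width split can be sketched outside Octave. This is a minimal Python sketch and not part of the original answer; the function name `parse_fixed_width` and the inline sample lines are assumptions made for illustration. It cuts the part before the comma into 3-character fields (the rows of X) and keeps the value after the comma (the entries of Y).

```python
def parse_fixed_width(lines, width=3):
    """Split 'DDDDDD...,L' lines into rows of ints (X) and labels (Y)."""
    X, Y = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        digits, label = line.split(",")
        # Slice the digit string into consecutive fields of `width` characters.
        X.append([int(digits[i:i + width]) for i in range(0, len(digits), width)])
        Y.append(int(label))
    return X, Y


# Hypothetical sample: the first two lines of the file from the question.
sample = [
    "079082084072079032084069067072000000000,0",
    "082078032049050032067072065082071069000,1",
]
X, Y = parse_fixed_width(sample)
print(X[0])
print(Y)
```

To read the real file, the list could be replaced by an open file object, since iterating over a file yields its lines.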

You can do this pretty easily by reading three characters at a time using read in the shell:
while IFS="${IFS}," read -rn3 val tail; do
[[ $tail ]] && echo || printf '%s ' "$val"
done < file
This implementation assumes that if we encounter a value after the comma, we should go to the next line.

Related

What is the equivalent of perl's Win32::OLE::Variant in Python 3

I have the following code snippet in perl for automating an application script using Win32::OLE
use Win32::OLE;
use Win32::OLE::Variant;
my $app = new Win32::OLE 'Some.Application';
my $InfoPacket = "78 00 C4 10 95 B4
00 02 31 7F 80 FF";
my @Bytes = split(/[ \n][ \n]*/, $InfoPacket);
my @HexBytes;
foreach (@Bytes)
{
    push @HexBytes, eval "0x$_";
}
my $Packet = pack("C12", @HexBytes);
my $VarPacket = Variant(VT_UI1, $Packet);
my $InfoObj = $app->ProcessPacket($VarPacket);
print $InfoObj -> Text();
I have converted the entire code in Python 3, except for the [exact] equivalent of pack() and Variant() functions.
from win32com.client import Dispatch
from struct import pack
app = Dispatch("Some.Application")
InfoPacket = "78 00 C4 10 95 B4 \
00 02 31 7F 80 FF"
Bytes = InfoPacket.split()
HexBytes = [int(b, 16) for b in Bytes]
Packet = pack('B'*12, *HexBytes) # This however, is not giving the exact same output as perl's...
VarPacket = ... # Need to know the python equivalent of above Variant() function...
InfoObj = app.ProcessPacket(VarPacket)
print (InfoObj.Text())
Please suggest the python equivalent of the pack() and Variant() functions used in perl script in the given context so that the final variable VarPacket can be used by Python's Dispatch object to properly generate the InfoObj object.
Thanks !!!
I am not sure about the Python equivalent of the Perl Variant, but for the first question about packing the unsigned char array, the following works for me:
from struct import pack
def gen_list():
    info_packet = "78 00 C4 10 95 B4 00 02 31 7F 80 FF"
    count = 0
    for hex_str in info_packet.split():
        yield int(hex_str, 16)
        count += 1
    for j in range(count, 120):
        yield int(0)
packet = pack("120B", *list(gen_list()))
Edit
From the test file testPyComTest.py it looks like you can generate the variant like this:
import win32com.client
import pythoncom
variant = win32com.client.VARIANT(pythoncom.VT_ARRAY | pythoncom.VT_UI1, packet)
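If only the 12 bytes from the question are needed, without zero padding, the packing step can probably be shortened. The following is a minimal sketch using only the standard library; it covers the `pack()` part alone and makes no claim about what the COM object expects. The variable name `same` is introduced here purely for the comparison.

```python
from struct import pack

info_packet = "78 00 C4 10 95 B4 00 02 31 7F 80 FF"

# bytes.fromhex skips the spaces between the hex pairs.
packet = bytes.fromhex(info_packet)

# Equivalent result built with struct.pack, as in the question.
same = pack("12B", *(int(b, 16) for b in info_packet.split()))

print(packet == same)
```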

Changing protobuf optional field to oneof

I have the following message:
message Message {
    int64 id = 1;
    google.protobuf.FloatValue weight = 2;
    google.protobuf.FloatValue override_weight = 3;
}
and I wish to change the type of weight and override_weight (optional fields) to google.protobuf.DoubleValue, so what I did was the following:
message Message {
    int64 id = 1;
    oneof weight_oneof {
        google.protobuf.FloatValue weight = 2 [deprecated=true];
        google.protobuf.DoubleValue double_weight = 4;
    }
    oneof override_weight_oneof {
        google.protobuf.FloatValue override_weight = 3 [deprecated=true];
        google.protobuf.DoubleValue double_override_weight = 5;
    }
}
My question is: let's assume I have old messages that were produced with code compiled from the old message definition. Would I be able to parse them as the new message?
The documentation is very vague about this:
"Move optional fields into or out of a oneof: You may lose some of your information (some fields will be cleared) after the message is serialized and parsed. However, you can safely move a single field into a new oneof and may be able to move multiple fields if it is known that only one is ever set."
Has anyone tried this before? What is the best practice for this situation?
As far as I know, fields in a oneof are simply serialized using their tag number. The serialized data does not indicate whether a field is part of a oneof. This is all handled by the serializer and deserializer. So as long as the tag numbers do not conflict, it can be assumed that it will work in both directions: old messages to a new deserializer and new messages to an old deserializer.
You could test this using an online protobuf deserializer.
Verification:
The code does indeed produce the same byte strings. Below you will find the message definitions and python code I used. The python code will output a byte string you can copy and use in the decoder of Marc Gravell.
syntax = "proto3";
message MessageA {
    int64 id = 1;
    float weight = 2;
    float override_weight = 3;
}
message MessageB {
    int64 id = 1;
    oneof weight_oneof {
        float weight = 2 [deprecated=true];
        double double_weight = 4;
    }
    oneof override_weight_oneof {
        float override_weight = 3 [deprecated=true];
        double double_override_weight = 5;
    }
}
import Example_pb2
# Set some data in the original message
msgA = Example_pb2.MessageA()
msgA.id = 1234
msgA.weight = 3.21
msgA.override_weight = 5.43
# Output the serialized bytes in a pretty format
str = 'msgA = '
for x in msgA.SerializeToString():
    str += "{:02x} ".format(x)
print(str)
# Next set the original fields in the new message
msgB = Example_pb2.MessageB()
msgB.id = 1234
msgB.weight = 3.21
msgB.override_weight = 5.43
# Output the serialized bytes in a pretty format
str = 'msgB 1 = '
for x in msgB.SerializeToString():
    str += "{:02x} ".format(x)
print(str)
# And finally set the new fields in msgB
msgB.double_weight = 3.21
msgB.double_override_weight = 5.43
# Output the serialized bytes in a pretty format
str = 'msgB 2 = '
for x in msgB.SerializeToString():
    str += "{:02x} ".format(x)
print(str)
The output of the python script was:
msgA = 08 d2 09 15 a4 70 4d 40 1d 8f c2 ad 40
msgB 1 = 08 d2 09 15 a4 70 4d 40 1d 8f c2 ad 40
msgB 2 = 08 d2 09 21 ae 47 e1 7a 14 ae 09 40 29 b8 1e 85 eb 51 b8 15 40
As you can see, message A and message B yield the same byte string when the original fields are set. Only when you set the new fields do you get a different string.
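The byte strings above can also be inspected without an online decoder. The following is a minimal Python sketch of a wire-format reader, written only for this example: the names `read_varint` and `decode_fields` are hypothetical, it handles only wire types 0, 1 and 5, and it assumes that 4-byte values are floats and 8-byte values are doubles, which matches the example schema but is not true of protobuf in general. It shows that each field is identified by its number alone, with nothing marking oneof membership.

```python
import struct


def read_varint(data, pos):
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def decode_fields(data):
    """Return {field_number: value} for wire types 0, 1 and 5 only."""
    fields, pos = {}, 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            fields[number], pos = read_varint(data, pos)
        elif wire_type == 1:  # 8 bytes, assumed to be a double here
            fields[number] = struct.unpack_from("<d", data, pos)[0]
            pos += 8
        elif wire_type == 5:  # 4 bytes, assumed to be a float here
            fields[number] = struct.unpack_from("<f", data, pos)[0]
            pos += 4
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
    return fields


old = decode_fields(bytes.fromhex("08 d2 09 15 a4 70 4d 40 1d 8f c2 ad 40"))
new = decode_fields(bytes.fromhex(
    "08 d2 09 21 ae 47 e1 7a 14 ae 09 40 29 b8 1e 85 eb 51 b8 15 40"))
print(sorted(old))  # field numbers present in msgA / msgB 1
print(sorted(new))  # field numbers present in msgB 2
```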

How can I improve my hex-based conversion

I was facing a problem where I got data (String) from a database with line breaks as (hex) 0D. I displayed this data in a Textbox, which did not treat the 0D as a line break. I found that the Textbox needs 0D-0A (CR and LF, I don't know which is which) to actually show the new line. To solve this problem I came up with the following code.
Private Function convertString(txt As String) As String
    Dim data = System.Text.Encoding.Default.GetBytes(txt)
    Dim hexString As String = BitConverter.ToString(data)
    hexString = hexString.Replace("0D", "0D-0A")
    Dim arr As [String]() = hexString.Split("-"c)
    Dim array As Byte() = New Byte(arr.Length - 1) {}
    For i As Integer = 0 To arr.Length - 1
        array(i) = Convert.ToByte(arr(i), 16)
    Next
    Return System.Text.Encoding.Default.GetString(array)
End Function
Explanation/procedure:
1. Convert String to ByteArray
2. Convert ByteArray to Hex-String (Hex-Chars separated by '-' )
3. Adding the missing lf or cr by replacing the solo one
4. Convert Hex-String back to ByteArray
5. Convert ByteArray back to String
Now my question:
I am pretty sure there is a better way to do that. How can I simplify those lines of code?
You should be able to just Replace vbCr with vbCrLf:
Dim txt = "This is a" & vbCr & "test"
Encoding.UTF8.GetBytes(txt).HexDump()
gives (HexDump is a custom utility method, but not relevant to the question):
00000000 54 68 69 73 20 69 73 20 61 0D 74 65 73 74 This is a·test
Dim txt2 = txt.Replace(vbCr, vbCrLf)
Encoding.UTF8.GetBytes(txt2).HexDump()
gives:
00000000 54 68 69 73 20 69 73 20 61 0D 0A 74 65 73 74 This is a··test
So, your whole method would be:
Private Function convertString(txt As String) As String
    Return txt.Replace(vbCr, vbCrLf)
End Function
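One caveat worth checking: a plain replace of CR with CR-LF assumes the text contains only bare CR characters. If some of the data already contains CR-LF pairs, the replace would turn them into CR-LF-LF. The following is a minimal Python sketch of a normalization that avoids this by matching the pair first; the function name `normalize_newlines` is an assumption for illustration, and the same idea would have to be ported to VB.NET.

```python
import re


def normalize_newlines(text):
    """Turn CRLF, lone CR, or lone LF into CRLF without doubling existing pairs."""
    # The alternation tries "\r\n" first, so an existing pair is matched as one unit.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


bare_cr = "This is a\rtest"
already_crlf = "This is a\r\ntest"

print(repr(normalize_newlines(bare_cr)))
print(repr(normalize_newlines(already_crlf)))
print(repr(already_crlf.replace("\r", "\r\n")))  # the plain replace, for comparison
```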

Urlencode/decode, different representation of the same string

I am a bit out of my comfort zone here, so I'm not even sure I'm approaching the problem appropriately. Anyhow, here goes:
So I have a problem where I shall hash some info with sha1 that will work as that info's id.
When a client wants to signal which info is currently being used, it sends a percent-encoded sha1 string.
So one example is, my server hashes some info and gets a hex representation like so:
44 c1 b1 0d 6a de ce 01 09 fd 27 bc 81 7f 0e 90 e3 b7 93 08
and the client sends me
D%c1%b1%0dj%de%ce%01%09%fd%27%bc%81%7f%0e%90%e3%b7%93%08
Removing the % we get
D c1 b1 0dj de ce 01 09 fd 27 bc 81 7f 0e 90 e3 b7 93 08
which matches my hash except for the leading D and the j after the 0d; replacing those with their ASCII hex values (44 and 6a) gives an identical hash.
So, as I have read and understood URL encoding, the standard would allow a client to send the D as either D or %44? So different clients would be able to send different representations of the same hash, and I will not be able to just compare them for equality?
I would prefer to be able to compare the urlencoded strings as they are when they are sent, but one way to do it would be to decode them, removing all '%' and get the ascii hex value for whatever mismatch I get, much like the D and the j in my above example.
This all seems to be a very annoying way to do things, am I missing something, please tell me I am? :)
I am doing this in node.js but I suppose the solution would be language/platform agnostic.
I made this crude solution for now:
var unreserved = 'A B C D E F G H I J S O N K L M N O P Q R S T U V W X Y Za b c d e f g h i j s o n k l m n o p q r s t u v w x y z + 1 2 3 4 5 6 7 8 9 0 - _ . ~';
function hexToPercent(hex){
    var index = 0,
        end = hex.length,
        delimiter = '%',
        step = 2,
        result = '',
        tmp = '';
    if(end % step !== 0){
        console.log('\'' + hex + '\' must be dividable by ' + step + '.');
        return result;
    }
    while(index < end){
        tmp = hex.slice(index, index + step);
        if(unreserved.indexOf(String.fromCharCode('0x' + tmp)) !== -1){
            result = result + String.fromCharCode('0x' + tmp);
        }
        else{
            result = result + delimiter + tmp;
        }
        index = index + step;
    }
    return result;
}
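An alternative to re-encoding the server's hash is to compare in the other direction: decode whatever the client sent back to raw bytes and compare those with the digest. Since the question calls the problem language-agnostic, here is a minimal Python sketch of that idea using the standard library's `urllib.parse.unquote_to_bytes`; the function name `same_hash` is an assumption for illustration. Decoding first means `D` and `%44` end up as the same byte.

```python
from urllib.parse import unquote_to_bytes


def same_hash(percent_encoded, hex_digest):
    """Compare a percent-encoded digest with a hex digest, byte for byte."""
    return unquote_to_bytes(percent_encoded) == bytes.fromhex(hex_digest)


server_hex = "44 c1 b1 0d 6a de ce 01 09 fd 27 bc 81 7f 0e 90 e3 b7 93 08"
from_client = "D%c1%b1%0dj%de%ce%01%09%fd%27%bc%81%7f%0e%90%e3%b7%93%08"
# The same digest with every byte percent-encoded.
fully_encoded = "%44%c1%b1%0d%6a%de%ce%01%09%fd%27%bc%81%7f%0e%90%e3%b7%93%08"

print(same_hash(from_client, server_hex))
print(same_hash(fully_encoded, server_hex))
```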

C++ converting hex number to human readable string representation

I have a char[32] that consists of ASCII characters, some of which may be non-printable characters; however, they are all in the valid range of 0 to 255.
Let's say, it contains these numbers:
{ 9A 2E 5C 66 26 3D 5A 88 76 26 F1 09 B5 32 DE 10 91 3E 68 2F EA 7B C9 52 66 23 9D 77 16 BB C9 42 }
I'd like to print out or store in std::string as "9A2E5C66263D5A887626F109B532DE10913E682FEA7BC95266239D7716BBC942", however, if I print it out using printf or sprintf, it will yield the ASCII representative of each number, and instead it will print out "ö.\f&=Zàv&Ò µ2fië>h/Í{…Rf#ùwª…B", which is the correct ASCII character representation, since: ö = 9a, . = 2e, ...
How do I reliably get the string representation of the hex numbers? ie: I'd expect a char[64] which contains "9A2E5C66263D5A887626F109B532DE10913E682FEA7BC95266239D7716BBC942" instead.
Thanks!
void bytesToHex(char* dest, char* src, int size) {
    for (int i = 0; i < size; i++) {
        // Cast to unsigned char so bytes >= 0x80 are not sign-extended (e.g. "ffffff9a").
        sprintf(&dest[i * 2], "%02X", (unsigned char)src[i]);
    }
}
You'd have to allocate your own memory here.
It would be used like this:
char myBuffer[32];
char result[65];
bytesToHex(result, myBuffer, 32);
result[64] = 0;
// print it
printf("%s", result);
// or store it in an std::string
std::string str(result);
I tried:
char szStringRep[64];
for ( int i = 0; i < 32; i++ ) {
sprintf( &szStringRep[i*2], "%x", szHash[i] );
}
It works as intended, but when it encounters a value below 0x10, "%x" writes only one digit followed by the terminating null, so the resulting string ends at that point.
You can encode your string to another system like Base64, which is widely used when using encryption algorithms.

Resources