How do I insert a row with a TimeUUIDType column in Cassandra?

In Cassandra, I have the following Column Family:
<ColumnFamily CompareWith="TimeUUIDType" Name="Posts"/>
I'm trying to insert a record into it as follows, using a C++ client function generated by Thrift:
ColumnPath new_col;
new_col.__isset.column = true; /* this is required! */
new_col.column_family.assign("Posts");
new_col.super_column.assign("");
new_col.column.assign("1968ec4a-2a73-11df-9aca-00012e27a270");
client.insert("Keyspace1", "somekey", new_col, "Random Value", 1234, ONE);
However, I'm getting the following error: "UUIDs must be exactly 16 bytes"
I've even tried the Cassandra CLI with the following command:
set Keyspace1.Posts['somekey']['1968ec4a-2a73-11df-9aca-00012e27a270'] = 'Random Value'
but I still get the following error:
Exception null
InvalidRequestException(why:UUIDs must be exactly 16 bytes)
at org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:11994)
at org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:659)
at org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:632)
at org.apache.cassandra.cli.CliClient.executeSet(CliClient.java:420)
at org.apache.cassandra.cli.CliClient.executeCLIStmt(CliClient.java:80)
at org.apache.cassandra.cli.CliMain.processCLIStmt(CliMain.java:132)
at org.apache.cassandra.cli.CliMain.main(CliMain.java:173)

Thrift is a binary protocol; 16 bytes means 16 bytes. "1968ec4a-2a73-11df-9aca-00012e27a270" is 36 bytes: that's the hex-and-dashes text form, not the raw form. You need to get your library to give you the raw, 16-byte form.
I don't use C++ myself, but "version 1 uuid" is the magic string you want to google for when looking for a library that can do this. http://www.google.com/search?q=C%2B%2B+version+1+uuid
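To see the difference concretely, here is a minimal sketch in Python (illustration only, since any C++ UUID library exposes the same two forms); the standard uuid module gives both the 36-character text form and the raw 16-byte form, and can also generate the time-based version 1 UUIDs that TimeUUIDType expects:
import uuid

u = uuid.UUID("1968ec4a-2a73-11df-9aca-00012e27a270")
print(len(str(u)))   # 36 -- the hex-and-dashes text form that was rejected
print(len(u.bytes))  # 16 -- the raw form Cassandra expects

raw = uuid.uuid1().bytes  # a fresh time-based (version 1) UUID, as raw bytes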

Related

SAS - Column Pointer Error when Importing COBOL Structured Data

I have a structured mainframe file that I'm now trying to import into SAS. The fixed portion of the record is working, but when I hit the Days line the column pointer is at 30, while the variable should be at 15. How did the column pointer jump to 30 versus my expected value, and how can I fix the code below?
data test;
  INFILE file1 recfm=N;                /* stream input, no record boundaries */
  /* fixed portion: 9 + 2 + 1 + 2 = 14 bytes */
  INPUT seqno       s370fPD9.
        type        $ebcdic2.
        record_type $ebcdic1.
        occurs      s370fPD2.
        @;                             /* hold the pointer for the loop below */
  ARRAY Days{60} Days1-Days60;
  ARRAY AMT{60}  AMT1-AMT60;
  ARRAY CRED{60} CRED1-CRED60;
  ARRAY PAY{60}  PAY1-PAY60;
  DO I = 1 TO OCCURS;
    input Days{I} s370fPD2.
          AMT{I}  s370fPD6. +2
          CRED{I} s370fPD6. +2
          PAY{I}  s370fPD8. @;
  end;
run;
I get the following notes in the log. While there isn't a note for AMT1, it is not populated correctly either:
NOTE: INVALID DATA for Days1 at byte position 30-31.
NOTE: INVALID DATA for CRED1 at byte position 52-57.
NOTE: INVALID DATA for PAY1 at byte position 70-71.
For the first record, Days1 should be at 15, CRED1 at 25, and PAY1 at 33.
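To sanity-check those expected positions, you can add up the informat widths from the INPUT statement; here is the arithmetic as a small Python sketch (the widths are taken from the code above):
# Fixed portion: seqno (9) + type (2) + record_type (1) + occurs (2) = 14 bytes,
# so the first repeating group should begin at column 15.
fixed = 9 + 2 + 1 + 2
days1 = fixed + 1       # 15
amt1  = days1 + 2       # 17 (Days1 is a 2-byte packed field)
cred1 = amt1 + 6 + 2    # 25 (6-byte AMT1, then the +2 skip)
pay1  = cred1 + 6 + 2   # 33 (6-byte CRED1, then the +2 skip)
print(days1, amt1, cred1, pay1)  # 15 17 25 33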

How do I fix USER FATAL MESSAGE 740?

How do I fix USER FATAL MESSAGE 740? This error is generated by Nastran when I try to run a BDF/DAT file of mine.
*** USER FATAL MESSAGE 740 (RDASGN)
UNIT NUMBER 5 HAS ALREADY BEEN ASSIGNED TO THE LOGICAL NAME INPUT
USER ACTION: CHANGE THE UNIT NUMBER ON THE ASSIGN STATEMENT AND IF THE UNIT IS USED FOR
PARAM,POST,<0 THEN SPECIFY PARAM,OUNIT2 WITH THE NEW UNIT NUMBER.
AVOID USING THE FOLLOWING UNIT NUMBERS THAT ARE ASSIGNED TO SPECIAL FILES IN MSC.NASTRAN:
1 THRU 12, 14 THRU 22, 40, 50, 51, 91, 92. SEE THE MSC.NASTRAN INSTALLATIONS/OPERATIONS
GUIDE SECTION ON MAKING FILE ASSIGNMENTS OR MSC.NASTRAN QUICK REFERENCE GUIDE ON
ASSIGN PHYSICAL FILE FOR REFERENCE.
Below is the head of my BDF file.
assign userfile='SUB1_PLATE.csv', status=UNKNOWN, form=formatted, unit=52
SOL 200
CEND
ECHO = NONE
DESOBJ(MIN) = 35
set 30=1008,1007,1015,1016
DESMOD=SUB1_PLATE
SUBCASE 1
$! Subcase name : DefaultLoadCase
$LBCSET SUBCASE1 DefaultLbcSet
ANALYSIS = STATICS
SPC = 1
LOAD = 6
DESSUB = 99
DISPLACEMENT(SORT1,PLOT,REAL)=ALL
STRESS(SORT1,PLOT,VONMISES,CORNER)=ALL
BEGIN BULK
param,xyunit,52
[...]
ENDDATA
Below is the solution
Correct
assign userfile='SUB1_PLAT.csv', status=UNKNOWN, form=formatted, unit=52
I shortened the name of the CSV file to SUB1_PLAT.csv, which reduced the length of the line to 72 characters.
Incorrect
assign userfile='SUB1_PLATE.csv', status=UNKNOWN, form=formatted, unit=52
The File Management Section is limited to 72 characters per line, spaces included. The incorrect line stretches to 73 characters. The Nastran reader ignores everything from the 73rd character onward, so instead of reading "unit=52" it reads "unit=5", which triggers the error.
|<--------------------------- 72 characters -------------------------->|<- ignored ->
assign userfile='SUB1_PLATE.csv', status=UNKNOWN, form=formatted, unit=52
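As a quick sanity check before a run, you can scan the deck for over-long lines. A small Python sketch (the file name is a placeholder) that flags anything past column 72:
# Flag input-deck lines longer than 72 characters; Nastran silently
# ignores everything from column 73 onward.
with open("model.bdf") as deck:  # placeholder file name
    for lineno, line in enumerate(deck, start=1):
        text = line.rstrip("\n")
        if len(text) > 72:
            print(f"line {lineno}: {len(text)} chars, ignored tail: {text[72:]!r}")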
References
MSC Nastran Reference Guide
The records of the first four sections are input in free-field format
and only columns 1 through 72 are used for data. Any information in
columns 73 through 80 may appear in the printed echo, but will not be
used by the program. If the last character in a record is a comma,
then the record is continued to the next record.

Is there a string size limit when feeding .fromstring() method as input?

I'm working with multiple well-formed XML files whose sizes range from 100 MB to 4 GB. My goal is to read each one as a string and then import it as an ElementTree object using the .fromstring() method (from the xml.etree.ElementTree module).
However, as the process runs and the string sizes increase, two exceptions occurred, both related to memory restrictions:
xml.etree.ElementTree.ParseError: out of memory: line 1, column 0
OverflowError: size does not fit in an int
It looks like the .fromstring() method enforces a size limit on its input string, somewhere around 1 GB?
To debug this, I wrote a short script using a for loop:
import xml.etree.ElementTree as ET  # xml.etree.cElementTree is deprecated

xmlFiles_list = [path1, path2, ...]
for fp in xmlFiles_list:
    with open(fp, mode='r', encoding="utf-8") as xml_fo:
        xml_asStr = xml_fo.read()
    print(len(xml_asStr.encode("utf-8")) / 10**9)  # display string size in GB
    try:
        etree = ET.fromstring(xml_asStr)
        print(".fromstring() success!\n")
    except Exception as e:
        print(f"Error :{type(e)} {str(e)}\n")
        continue
The output is as follows:
0.895206753
.fromstring() success!
1.220224531
Error :<class 'xml.etree.ElementTree.ParseError'> out of memory: line 1, column 0
1.328233473
Error :<class 'xml.etree.ElementTree.ParseError'> out of memory: line 1, column 0
2.567867904
Error :<class 'OverflowError'> size does not fit in an int
4.080672538
Error :<class 'OverflowError'> size does not fit in an int
I found multiple workarounds to avoid this issue: the .parse() method, or the lxml module for better performance (see the sketch at the end of this question). I just hope someone could shed some light on this:
Is there a specific string size limit in the xml.etree.ElementTree module and its .fromstring() method?
Why do I end up with two different exceptions as the string size increases? Are they related to the same memory-allocation restriction?
Python version/system: 3.9 (64-bit)
RAM: 32 GB
I hope my question is clear enough; I'm new on Stack Overflow.
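For reference, the .parse() workaround mentioned above looks roughly like this (a minimal sketch with a placeholder path); it lets the parser read the file incrementally instead of being handed one multi-gigabyte Python string:
import xml.etree.ElementTree as ET

tree = ET.parse("huge_file.xml")  # placeholder path; parses straight from the file
root = tree.getroot()
print(root.tag)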

How to read binary data in pyspark

I'm reading the binary file http://snap.stanford.edu/data/amazon/productGraph/image_features/image_features.b using PySpark.
import array
from io import StringIO

img_embedding_file = sc.binaryRecords("s3://bucket/image_features.b", 4106)

def mapper(features):
    a = array.array('f')
    a.frombytes(features)
    return a.tolist()

def byte_mapper(bytes):
    return str(bytes)

decoded_embeddings = img_embedding_file.map(lambda x: [byte_mapper(x[:10]), mapper(x[10:])])
When just the product_id part is selected from the RDD (the byte_mapper(x[:10]) piece of the map above), the output for product_id is:
["b'1582480311'", "b'\\x00\\x00\\x00\\x00\\x88c-?\\xeb\\xe2'", "b'7#\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00'", "b'\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00'", "b'\\xec/\\x0b?\\x00\\x00\\x00\\x00K\\xea'", "b'\\x00\\x00c\\x7f\\xd9?\\x00\\x00\\x00\\x00'", "b'L\\xa6\\n>\\x00\\x00\\x00\\x00\\xfe\\xd4'", "b'\\x00\\x00\\x00\\x00\\x00\\x00\\xe5\\xd0\\xa2='", "b'\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00'", "b'\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00'"]
The file is hosted on S3. Each record has its first 10 bytes as the product_id and the next 4096 bytes as the image_features.
I'm able to extract all 4096 image features, but I'm facing issues when reading the first 10 bytes and converting them into a properly readable format.
EDIT:
Finally, the problem comes from the recordLength. It's not 4096 + 10 but 4096*4 + 10, because each of the 4096 features is a 4-byte float. Changing to:
img_embedding_file = sc.binaryRecords("s3://bucket/image_features.b", 16394)
should work.
Actually, you can find this in the code provided on the web site you downloaded the binary file from:
for i in range(4096):
    feature.append(struct.unpack('f', f.read(4)))  # <-- so 4096 * 4
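Putting the two parts together, a corrected end-to-end read might look like this (a sketch reusing the bucket path and record layout from the question, plus the decode("utf-8", "ignore") suggestion from the old answer below):
import array

RECORD_LEN = 10 + 4096 * 4  # 10-byte product_id + 4096 float32 features

img_embedding_file = sc.binaryRecords("s3://bucket/image_features.b", RECORD_LEN)

def decode_record(record):
    # First 10 bytes: product_id; remaining 16384 bytes: 4096 floats.
    product_id = record[:10].decode("utf-8", "ignore")
    features = array.array('f')
    features.frombytes(record[10:])
    return product_id, features.tolist()

decoded = img_embedding_file.map(decode_record)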
Old answer:
I think the issue comes from your byte_mapper function.
That's not the correct way to convert bytes to a string. You should be using decode:
raw = b'1582480311'  # renamed from "bytes" to avoid shadowing the built-in
print(str(raw))
# output: "b'1582480311'"
print(raw.decode("utf-8"))
# output: '1582480311'
If you're getting the error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x88 in position 4: invalid start byte
That means the product_id string contains non-UTF-8 bytes. If you don't know the input encoding, it's difficult to convert it into a string.
However, you may want to ignore those characters by passing "ignore" as the errors argument of decode:
raw.decode("utf-8", "ignore")

Understanding BLE characteristic values for cycle power measurement 0x2A63

I am currently using a Dart/Flutter BLE plugin to better understand BLE devices.
Plugin:
https://pub.dartlang.org/packages/flutter_blue
When I connect to my virtual cycle trainer I select the 0x1818 service and then I subscribe to the 0x2A63 characteristic for Cycle Power Measurement.
I am struggling to align the response list I get with the GATT documentation for this service/characteristic below. There are 18 values in this list, but only 17 fields in the GATT table, and the values don't seem to make any sense.
I also tried converting the first two values, '52' and '24', to 16-bit binary to see if that aligns with the flags for the first field, but the result was the following, which again makes no sense:
0x3418 = 11010000011000
https://www.bluetooth.com/specifications/gatt/viewer?attributeXmlFile=org.bluetooth.characteristic.cycling_power_measurement.xml
This screenshot is when I first connect to the trainer.
This screenshot is when I am cycling lightly on the bike
This screenshot is when I stop cycling but the pedals and wheel are still turning.
The cycle trainer is the CycleOps Magnus, which doesn't have the Cycling Speed and Cadence service 0x1816, but can provide virtual speed based on power.
My question is this: which of the values in the list correspond with the GATT characteristic fields, and, as a bonus question, how would I infer speed or cadence from the values in this service?
Based on section 3.55 of the Bluetooth GATT specs:
DEC  - [52, 24, 40, 0, 58, 29, 59, 0, 0, 0, 107, 136, 23, 0, 214, 81, 1, 0]
BYTE -   0   1   2  3   4   5   6  7  8  9   10   11  12 13   14  15 16 17
Flags field = bytes 0 and 1 = (52, 24). Multi-byte fields are little-endian, so the 16-bit flags value is 24*256 + 52:
0x1834 = 00011000 00110100
Per section 3.55.2.1, the set bits equate to:
- bit2 = Accumulated Torque Present
- bit4 = Wheel Revolution Data Present
- bit5 = Crank Revolution Data Present
- bit11 = Accumulated Energy Present
- bit12 = Offset Compensation Indicator
Then from section 3.55.2, you go down the list of fields based on the flags:
Instantaneous Power is bytes 2 (40) and 3 (0):
(40, 0) little-endian == 0x0028 == 00000000 00101000 == 40 W
To decipher the rest of the bytes, we refer back to the flags field, since which fields follow Instantaneous Power depends on what the flags say the trainer supports.
Bit 2 of the flags field says "Accumulated Torque Present" (present if bit 2 of the Flags field is set to 1), hence the next 2 bytes, (58, 29), represent the Accumulated Torque:
29*256 + 58 = 7482, in units of 1/32 Nm
The next field follows from bit 4 of the flags field: Wheel Revolution Data Present (present if bit 4 of the Flags field is set to 1). This is wheel data, which translates into speed once you take the wheel circumference into account. Wheel Revolution Data is the next 6 bytes:
Cumulative Wheel Revolutions - 4 bytes: (59, 0, 0, 0) = 59
Last Wheel Event Time - 2 bytes: (107, 136) = 34923, in units of 1/2048 s
Like you mentioned, this trainer does not offer the separate Cadence service 0x1816; however, bit 5 of the flags field (Crank Revolution Data Present) is set here, so the packet does carry crank data in the next 4 bytes: Cumulative Crank Revolutions (23, 0) = 23, and Last Crank Event Time (214, 81) = 20950, in units of 1/1024 s. Cadence can be derived from successive crank readings the same way speed is derived from the wheel readings. The final 2 bytes, (1, 0) = 1, are the Accumulated Energy (bit 11), in kJ.
For wheel speed, you would then decode those 6 bytes (Cumulative Wheel Revolutions and Last Wheel Event Time). I can't offer you code for decoding them, as you're using Flutter and I have no experience with that language (I do Swift), but you can likely take a look at this code from GoldenCheetah and convert it accordingly:
void BT40Device::getWheelRpm(QDataStream& ds)
{
    quint32 wheelrevs;
    quint16 wheeltime;
    ds >> wheelrevs;
    ds >> wheeltime;

    double rpm = 0.0;
    if (!prevWheelStaleness) {
        quint16 time = wheeltime - prevWheelTime;
        quint32 revs = wheelrevs - prevWheelRevs;
        // Power sensor uses 1/2048 second time base and CSC sensor 1/1024
        if (time) rpm = (has_power ? 2048 : 1024) * 60 * revs / double(time);
    }
    else prevWheelStaleness = false;

    prevWheelRevs = wheelrevs;
    prevWheelTime = wheeltime;
    dynamic_cast<BT40Controller*>(parent)->setWheelRpm(rpm);
}
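For completeness, here is a sketch of the field-by-field decode of the 18-byte sample packet (Python rather than Dart, purely to illustrate the byte layout; the bit positions follow the flag analysis above):
import struct

packet = bytes([52, 24, 40, 0, 58, 29, 59, 0, 0, 0,
                107, 136, 23, 0, 214, 81, 1, 0])

# Flags (uint16) and Instantaneous Power (sint16), both little-endian.
flags, power = struct.unpack_from('<Hh', packet, 0)
offset = 4
print(f"flags={flags:#06x}, power={power} W")  # flags=0x1834, power=40 W

if flags & (1 << 2):   # Accumulated Torque Present
    (torque,) = struct.unpack_from('<H', packet, offset); offset += 2
if flags & (1 << 4):   # Wheel Revolution Data Present
    wheel_revs, wheel_time = struct.unpack_from('<IH', packet, offset); offset += 6
if flags & (1 << 5):   # Crank Revolution Data Present
    crank_revs, crank_time = struct.unpack_from('<HH', packet, offset); offset += 4
if flags & (1 << 11):  # Accumulated Energy Present
    (energy_kj,) = struct.unpack_from('<H', packet, offset); offset += 2
From two successive packets, cadence comes from the deltas of crank_revs and crank_time (1/1024 s units), exactly as the GoldenCheetah routine above does for the wheel.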
