Background: I'm using node.js to get the volume setting from a device via serial connection. I need to obtain this data as an integer value.
I have the data in a buffer ('buf'), and am using readInt16BE() to convert to an int, as follows:
console.log( buf )
console.log( buf.readInt16BE(0) )
Which gives me the following output as I adjust the external device:
<Buffer 00 7e>
126
<Buffer 00 7f>
127
<Buffer 01 00>
256
<Buffer 01 01>
257
<Buffer 01 02>
258
Problem: All looks well until we reach 127; then there is a jump to 256. Maybe it's something to do with signed and unsigned integers - I don't know!
Unfortunately I have very limited documentation for the external device, so I'm having to reverse-engineer it! Is it possible it only sends 7-bit values? Hopefully there is a way around this?
Regarding a solution - I must also be able to convert back from int to this format!
Question: How can I create a sequential range of integers when 7F seems to be the largest value my device sends, which causes a big jump in my integer scale?
Thanks :)
127 is the maximum value of a signed 8-bit integer. If the value overflows into the next byte at 128, it is safe to assume you are not being sent a 16-bit value but rather two signed 8-bit values, and reading the pair as a single 16-bit integer is incorrect.
I would start by using the first byte as a multiplier of 128 and adding the second byte; this gives the series you are seeking.
buf = Buffer([0,127]) //<Buffer 00 7f>
buf.readInt8(0) * 128 + buf.readInt8(1)
>127
buf = Buffer([1,0]) //<Buffer 01 00>
buf.readInt8(0) * 128 + buf.readInt8(1)
>128
buf = Buffer([1,1]) //<Buffer 01 01>
buf.readInt8(0) * 128 + buf.readInt8(1)
>129
To convert back, divide by 128 and round down to get the first byte; the second byte is the remainder.
i = 129
buf = Buffer([Math.floor(i / 128), i % 128])
<Buffer 01 01>
The data needed to be treated as two signed 8-bit values. As per forrestj's answer, the solution is to do:
valueInt = buf.readInt8(0) * 128 + buf.readInt8(1)
We can also convert the int value into the original format by doing the following:
byte1 = Math.floor(valueInt / 128)
byte2 = valueInt % 128
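Putting both directions together, a small sketch (the helper names are just for illustration):
function toInt(buf) {
  // decode the device's two-byte, 7-bits-per-byte format into an integer
  return buf.readInt8(0) * 128 + buf.readInt8(1);
}
function toBuffer(valueInt) {
  // encode an integer back into the two-byte device format
  return Buffer.from([Math.floor(valueInt / 128), valueInt % 128]);
}
console.log(toInt(Buffer.from([0x01, 0x02]))); // 130
console.log(toBuffer(130)); // <Buffer 01 02>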
I need to parse UDP packets, which can be invalid or contain errors. I would like to replace invalid characters with '.' after a bytes-to-string conversion, in order to display the content of the packets.
How can I do it? This is my code:
func main() {
    a := []byte{'a', 0xff, 0xaf, 'b', 0xbf}
    s := string(a)
    s = strings.Replace(s, string(0xFFFD), ".", 0)
    fmt.Println("s: ", s) // I would like to display "a..b."
    for _, r := range s {
        fmt.Println("r: ", r)
    }
    rs := []rune(s)
    fmt.Println("rs: ", rs)
}
The root problem with your approach is that the result of type converting []byte to string does not have any U+FFFDs in it: this type-conversion only copies bytes from the source to the destination, verbatim.
Like byte slices, strings in Go are not obliged to contain UTF-8-encoded text; they can contain any data, including opaque binary data that has nothing to do with text.
But some operations on strings—namely type-converting them to []rune and iterating over them using range—do interpret strings as UTF-8-encoded text.
That is precisely where you got tripped up: your range debugging loop interprets the string as UTF-8, and each time an attempt to decode a code point fails, range yields the replacement character U+FFFD.
To reiterate, the string obtained by the type conversion does not contain the characters you wanted to replace with your strings.Replace call; the U+FFFDs only appear later, when the string is interpreted as UTF-8.
As to how to actually make a valid UTF-8-encoded string out of your data, you might employ a two-step process:
Type-convert your byte slice to a string—as you already do.
Iterate over the string, interpreting it as UTF-8, and replace each U+FFFD that appears during that decoding.
Something like this:
// interpret b as UTF-8: ranging over string(b) decodes code points, and each
// invalid byte comes out as U+FFFD, which we replace with '.'
func replaceInvalid(b []byte) string {
    var sb strings.Builder
    for _, c := range string(b) {
        if c == '\uFFFD' {
            sb.WriteByte('.')
        } else {
            sb.WriteRune(c)
        }
    }
    return sb.String()
}
A note on performance: since type-converting a []byte to string copies memory—because strings are immutable while slices are not—the first step with type-conversion might be a waste of resources for code dealing with large chunks of data and/or working in tight processing loops.
In this case, it may be worth using the DecodeRune function of the encoding/utf8 package which works on byte slices.
An example from its docs can be easily adapted to work with the loop above.
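For instance, a minimal sketch of that adaptation (the function name is illustrative); each byte of an invalid sequence decodes to utf8.RuneError with a reported size of 1:
package main

import (
    "fmt"
    "strings"
    "unicode/utf8"
)

// replaceInvalidBytes walks the byte slice directly with utf8.DecodeRune,
// avoiding the []byte-to-string copy; each byte of an invalid sequence
// decodes to utf8.RuneError with size 1 and is replaced with '.'.
func replaceInvalidBytes(b []byte) string {
    var sb strings.Builder
    for len(b) > 0 {
        r, size := utf8.DecodeRune(b)
        if r == utf8.RuneError && size == 1 {
            sb.WriteByte('.')
        } else {
            sb.WriteRune(r)
        }
        b = b[size:]
    }
    return sb.String()
}

func main() {
    fmt.Println(replaceInvalidBytes([]byte{'a', 0xff, 0xaf, 'b', 0xbf})) // a..b.
}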
See also: Remove invalid UTF-8 characters from a string
kostix's answer is correct and explains very clearly the issue with scanning Unicode runes from a string.
Just adding the following remark: if your intention is to view only characters in the ASCII range (printable characters < 127) and you don't really care about other Unicode code points, you can be more blunt:
// create a byte slice with the same byte length as s
var bs = make([]byte, len(s))
// scan s byte by byte:
for i := 0; i < len(s); i++ {
    switch {
    case 32 <= s[i] && s[i] <= 126:
        bs[i] = s[i]
    // depending on your needs, you may also keep characters in the 0..31 range,
    // like 'tab' (9), 'linefeed' (10) or 'carriage return' (13):
    // case s[i] == 9, s[i] == 10, s[i] == 13:
    //     bs[i] = s[i]
    default:
        bs[i] = '.'
    }
}
fmt.Printf("rs: %s\n", bs)
playground
This function will give you something close to the "text" part of hexdump -C.
You may want to use strings.ToValidUTF8() for this:
ToValidUTF8 returns a copy of the string s with each run of invalid UTF-8 byte sequences replaced by the replacement string, which may be empty.
It "seemingly" does exactly what you need. Testing it:
a := []byte{'a', 0xff, 0xaf, 'b', 0xbf}
s := strings.ToValidUTF8(string(a), ".")
fmt.Println(s)
Output (try it on the Go Playground):
a.b.
I wrote "seemingly" because as you can see, there's a single dot between a and b: because there may be 2 bytes, but a single invalid sequence.
Note that you may avoid the []byte => string conversion, because there's a bytes.ToValidUTF8() equivalent that operates on and returns a []byte:
a := []byte{'a', 0xff, 0xaf, 'b', 0xbf}
a = bytes.ToValidUTF8(a, []byte{'.'})
fmt.Println(string(a))
Output will be the same. Try this one on the Go Playground.
If it bothers you that multiple bytes of an invalid sequence may be collapsed into a single dot, read on.
Also note that to inspect arbitrary byte slices that may or may not contain text, you may simply use hex.Dump(), which generates output like this:
a := []byte{'a', 0xff, 0xaf, 'b', 0xbf}
fmt.Println(hex.Dump(a))
Output:
00000000 61 ff af 62 bf |a..b.|
There's your expected output a..b. with other (useful) data like the hex offset and hex representation of bytes.
To get a "better" picture of the output, try it with a little longer input:
a = []byte{'a', 0xff, 0xaf, 'b', 0xbf, 50: 0xff}
fmt.Println(hex.Dump(a))
00000000 61 ff af 62 bf 00 00 00 00 00 00 00 00 00 00 00 |a..b............|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 00 ff |...|
Try it on the Go Playground.
I use a Knowles SPH0645LM4H-B microphone to acquire data, which is in 24-bit PCM format with 18 bits of precision. The 24-bit PCM data is then truncated to 18 bits, because the last 6 bits are always 0 according to the specification. After that, the 18-bit data is stored in a 32-bit unsigned integer. When the MSB is 0 the value is positive, and when the MSB is 1 the value is negative.
After that, I find that all the data is positive, no matter which sound I use for testing. I tested with a dual-frequency signal and ran an FFT; the result is almost right, except that the low-frequency content around 0-100 Hz is too large. I also reconstructed the sound from the same data I used for the FFT; the reconstructed sound is almost right, but with noise.
I use a buffer to store the microphone data, which is transferred via DMA. The buffer is:
uint16_t fft_buffer[FFT_LENGTH*4];
The DMA is configured as follows:
DMA_InitStructure.DMA_Channel = DMA_Channel_0;
DMA_InitStructure.DMA_PeripheralBaseAddr = (uint32_t)&(SPI2->DR);
DMA_InitStructure.DMA_Memory0BaseAddr = (uint32_t)fft_buffer;
DMA_InitStructure.DMA_DIR = DMA_DIR_PeripheralToMemory;
DMA_InitStructure.DMA_PeripheralInc = DMA_PeripheralInc_Disable;
DMA_InitStructure.DMA_MemoryInc = DMA_MemoryInc_Enable;
DMA_InitStructure.DMA_PeripheralDataSize =DMA_PeripheralDataSize_HalfWord;
DMA_InitStructure.DMA_MemoryDataSize = DMA_MemoryDataSize_HalfWord;
DMA_InitStructure.DMA_BufferSize = FFT_LENGTH*4;
DMA_InitStructure.DMA_Mode = DMA_Mode_Normal;
DMA_InitStructure.DMA_Priority = DMA_Priority_VeryHigh;
DMA_InitStructure.DMA_FIFOMode = DMA_FIFOMode_Disable;
DMA_InitStructure.DMA_FIFOThreshold = DMA_FIFOThreshold_Full;
DMA_InitStructure.DMA_MemoryBurst = DMA_MemoryBurst_Single;
DMA_InitStructure.DMA_PeripheralBurst = DMA_PeripheralBurst_Single;
Data is extracted from the buffer, truncated to 18 bits, sign-extended to 32 bits and then stored in fft_integer:
int32_t fft_integer[FFT_LENGTH];
fft_buffer stores the original data from one channel plus redundant data from the other channel. Each original sample occupies two 16-bit elements of the array, e.g. fft_buffer[4] and fft_buffer[5]. This is why fft_buffer has FFT_LENGTH*4 elements: two elements hold a sample from one channel and two elements hold the sample from the other channel. fft_integer, on the other hand, only has FFT_LENGTH elements, because only one channel is kept and an 18-bit sample fits in a single int32_t element.
for (t = 0; t < FFT_LENGTH*4; t = t + 4) {
    uint8_t first_8_bits, second_8_bits, last_2_bits;
    uint32_t store_int;
    /* get the first 8 bits, middle 8 bits and last 2 bits, combine them into a new value */
    first_8_bits = fft_buffer[t] >> 8;
    second_8_bits = fft_buffer[t] & 0xFF;
    last_2_bits = (fft_buffer[t+1] >> 8) >> 6;
    store_int = (first_8_bits << 10) + (second_8_bits << 2) + last_2_bits;
    /* convert it to a signed integer according to the MSB of the value:
     * if the MSB is 1, set all the bits above the MSB to 1
     */
    const uint8_t negative = ((store_int & (1 << 17)) != 0);
    int32_t nativeInt;
    if (negative)
        nativeInt = store_int | ~((1 << 18) - 1);
    else
        nativeInt = store_int;
    fft_integer[cnt] = nativeInt;
    cnt++;
}
The microphone uses an I2S interface and is a single mono microphone, which means only half of the data is valid: the microphone drives the bus during only half of each transmission frame. It runs for about 128 ms and then stops.
This picture shows the data after it has been converted to integers.
My question is why there are such large low-frequency components even though the reconstructed sound is similar to the original. I'm sure there is no problem with the hardware configuration.
I did an experiment to see what raw data is stored in the buffer, using the following test:
uint8_t a, b, c, d;
for (t = 0; t < FFT_LENGTH*4; t = t + 4) {
    a = (fft_buffer[t] & 0xFF00) >> 8;
    b = fft_buffer[t] & 0x00FF;
    c = (fft_buffer[t+1] & 0xFF00) >> 8;
    /* set the tri-state to 0 */
    d = fft_buffer[t+1] & 0x0000;
    printf("%.2x", a);
    printf("%.2x", b);
    printf("%.2x", c);
    printf("%.2x\n", d);
}
The PCM data looks like the following:
0ec40000
0ec48000
0ec50000
0ec60000
0ec60000
0ec5c000
...
0cf28000
0cf20000
0cf10000
0cf04000
0cef8000
0cef0000
0cedc000
0ced4000
0cee4000
0ced8000
0cec4000
0cebc000
0ceb4000
....
0b554000
0b548000
0b538000
0b53c000
0b524000
0b50c000
0b50c000
...
Raw data in Memory:
c4 0e ff 00
c5 0e ff 40
...
52 0b ff c0
50 0b ff c0
I use it as little endian.
The large low-frequency component starting from DC in the original data is due to the large DC offset caused by incorrectly translating the 24-bit two's-complement samples to int32_t. A DC offset is inaudible unless it causes clipping or arithmetic overflow. There are not really any low frequencies up to 100 Hz; that is merely an artefact of the FFT's response to the strong DC (0 Hz) component. That is why you cannot hear any low frequencies.
Below I have stated a number of assumptions as clearly as possible so that the answer may perhaps be adapted to match the actualité.
Given:
Raw data in Memory:
c4 0e ff 00
c5 0e ff 40
...
52 0b ff c0
50 0b ff c0
I use it as little endian.
and
2 elements are used for data from one channel and 2 element is used for the other channel
and given the subsequent comment:
fft_buffer[0] stores the higher 16 bits, fft_buffer[1] stores the lower 16 bits
Then the data is in fact cross-endian such that for example, for:
c4 0e ff 00
then
fft_buffer[n] = 0x0ec4 ;
fft_buffer[n+1] = 0x00ff ;
and the reconstructed sample should be:
0x00ff0ec4
then the translation is a matter of reinterpreting fft_buffer as a 32-bit array, swapping the 16-bit word order, then shifting to move the sign bit to the int32_t sign-bit position and (optionally) rescaling, e.g.:
c4 0e ff 00 => 0x00ff0ec4
0x00ff0ec4 << 8 = 0xff0ec400
0xff0ec400 / 16384 = 0xffff0ec4 (-61756)
thus:
// Reinterpret DMA buffer as 32bit samples
int32_t* fft_buffer32 = (int32_t*)fft_buffer;
// For each even numbered DMA buffer sample...
for( t = 0; t < FFT_LENGTH * 2; t += 2 )
{
    // ... swap 16 bit word order
    int32_t sample = fft_buffer32[t] << 16 |
                     fft_buffer32[t] >> 16;
    // ... from 24 to 32 bit 2's complement and rescale to
    // maintain original magnitude. Copy to single channel
    // fft_integer array.
    fft_integer[t / 2] = (sample << 8) / 16384;
}
I wrote a Java Card applet that saves some data into the APDU buffer at offset ISO7816.OFFSET_CDATA and sends those bytes as a response.
Util.arrayCopy(Input_Data, (short)0, buffer, (short) ISO7816.OFFSET_CDATA, (short)Datalength);
apdu.setOutgoing();
apdu.setOutgoingLength((short)(DataLength) );
apdu.sendBytesLong(buffer, ISO7816.OFFSET_CDATA, (short)(DataLength));
I tested this in a simulator without any problem. But when I test this on a real smart card (Java Card v2.2.1 manufactured by Gemalto), I get the status word 0x6180 as response.
My command APDU is 00 40 00 00 80 Data, where the data is 128 bytes long, so I have 4+128 bytes in the buffer and the remaining (260-(4+128)) bytes are null.
Your simulator probably uses the T=1 transport protocol, but your real card does not. It uses the T=0 protocol, which means it can either receive data or send data in a single APDU, but not both.
Status word 0x6180 indicates there are 0x80 bytes to receive from the card. Generally, 61XX means XX bytes to receive.
How do you receive them? There is a special APDU command called GET RESPONSE. You should send it each time you get a 61XX status word, using XX as the Le byte of your GET RESPONSE APDU:
APDU -> 61 XX
00 C0 00 00 XX -> your data 90 00
A few other notes on your code:
Datalength vs DataLength?
Copy your output data to offset 0 instead of ISO7816.OFFSET_CDATA
Why do you cast DataLength to short each time? If it is a short, do not cast it. If it is a byte, you are casting it the wrong way, because an unsigned byte value >= 0x80 will be cast to a negative short. The correct cast from an unsigned byte to a short is (short) (DataLength & 0xFF)
Use setOutgoingAndSend whenever you can. It is much simpler.
Use arrayCopyNonAtomic instead of arrayCopy whenever you are not copying into a persistent array; the performance of arrayCopyNonAtomic is much better. A short sketch combining several of these notes follows.
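A minimal sketch of the copying and sending notes (assuming Input_Data is a byte array and DataLength is already a short, as named in the question):
// copy the response to offset 0 of the APDU buffer with the non-atomic copy,
// then send it with one call instead of setOutgoing/setOutgoingLength/sendBytesLong
Util.arrayCopyNonAtomic(Input_Data, (short) 0, buffer, (short) 0, DataLength);
apdu.setOutgoingAndSend((short) 0, DataLength);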
I'm receiving a high volume of "hex array" strings in the form:
'16 03 03 00 50 40 f2 12 71 0b c0 4f 99 dc 87 6f'
What's the most efficient way of copying them into an existing, larger buffer?
I'm guessing the naive way would be:
var lineBuffer = new Buffer(line.replace(/\s+/g, ''), 'hex');
lineBuffer.copy(mainBuffer, offset);
offset += 16;
I'm wary of using line[index] and doing the simple bit shift and sum, because string[index] just resolves to another string.
To move this out of the comments,
The solution you're looking for is probably something like this:
const line = '16 03 03 00 50 40 f2 12 71 0b c0 4f 99 dc 87 6f';
const regex = / /g;
const encoding = 'hex';
const replacement = '';
function getHexString(input) {
  return input.replace(regex, replacement);
}
const existingBuffer = Buffer.alloc(1024); // just as an example
const offset = existingBuffer.length; // or wherever you need them to go.
function write(buffer, str, pos) {
  const hexString = getHexString(str);
  buffer.write(hexString, pos, encoding);
}
write(existingBuffer, line, 0);
I can get about 1,600,000 ops/sec through benchmark.js with () => write(existingBuffer, line, 0). Cheating a bit, since I'm constantly writing to position 0 instead of appending, but it should be close enough to get you an idea. The closest I could come with other combinations I was trying was about 1,200,000 ops/sec.
Also, as a side note, I would strongly suggest using Buffer.alloc() wherever you create your original buffer, if you aren't already. You can also use Buffer.allocUnsafe(), which is faster, but may leave non-zero data in the buffer. That may be okay if you know you're going to fill the entire buffer with new data before using it (or only use known-filled slices). More reading here.
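For example, a sketch of the allocation choice only (separate from the benchmark above):
const mainBuffer = Buffer.alloc(1024); // zero-filled; the safe default
// const mainBuffer = Buffer.allocUnsafe(1024); // faster, but may contain stale memory until you overwrite it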
I have a buffer in node <Buffer 42 d9 00 00> that is supposed to represent the decimal 108.5. I am using this module to try and decode the buffer: https://github.com/feross/ieee754.
ieee754.read = function (buffer, offset, isLE, mLen, nBytes)
The arguments mean the following:
buffer = the buffer
offset = offset into the buffer
value = value to set (only for write)
isLe = is little endian?
mLen = mantissa length
nBytes = number of bytes
I try to read the value: ieee754.read(buffer, 0, false, 5832704, 4) but am not getting the expected result. I think I am calling the function correctly, although I am unsure about the mLen argument.
It turns out the Node Buffer class has that ability built in: buffer.readFloatBE(0).
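A minimal check (the ieee754.read call is included on the assumption that mLen is the mantissa length, which is 23 bits for a single-precision float):
const ieee754 = require('ieee754');
const buf = Buffer.from([0x42, 0xd9, 0x00, 0x00]);
// built-in: read a big-endian 32-bit IEEE 754 float at offset 0
console.log(buf.readFloatBE(0)); // 108.5
// the same value via the ieee754 module: mLen = 23 mantissa bits, nBytes = 4, big-endian
console.log(ieee754.read(buf, 0, false, 23, 4)); // 108.5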