How do I store an ArrayBuffer in MongoDB?

How do I store an ArrayBuffer in MongoDB? - node.js

I'm trying to build an Electron React App that captures short videos of the screen but am unable to store videos (as part of a larger "Payload" object:
{video_buffer: xxx, time_stamp: 3456345 ...}
The videos are created in the renderer process via MediaStream and MediaRecorder API's then converted to Blob. Here is some sample code:-
let mediaRecorder = new MediaRecorder(videoStream);
mediaRecorder.start()
let chunks = []
mediaRecorder.ondataavailable = (event) => {chunks.push(event.data)}
mediaRecorder.onstop = async (event) => {
let blob = new Blob(chunks, {type: 'video/mp4'})
chunks = []
const buffer = await blob.arrayBuffer()
The buffer works when I convert to a URL and set it to the Video element src
let videoURL = URL.createObjectURL(blob)
The videos are short and occupy about 2.5MB max.
I can only send the buffer (or the URL) via IPC (not the Blob).
In the main process, I'm using Mongoose with the schema:
new payloadSchema = {video_buffer: {type: Buffer}}
but this doesn't work, presumably because buffer is an ArrayBuffer and not a Data View or TypedArray.
So I tried converting it to a Uint8Array TypedArray object:
const new_buffer = new Uint8Array(buffer)
but i get this error on saving:
video_buffer: CastError: Cast to Buffer failed for value "Uint8Array(2041510) [
[electron] 26, 69, 223, 163, 159, 66, 134, 129, 1, 66, 247, 129,
[electron] 1, 66, 242, 129, 4, 66, 243, 129, 8, 66, 130, 132,
[electron] 119, 101, 98, 109, 66, 135, 129, 4, 66, 133, 129, 2,
[electron] 24, 83, 128, 103, 1, 255, 255, 255, 255, 255, 255, 255,
[electron] 21, 73, 169, 102, 153, 42, 215, 177, 131, 15, 66, 64,
[electron] 77, 128, 134, 67, 104, 114, 111, 109, 101, 87, 65, 134,
[electron] 67, 104, 114, 111, 109, 101, 22, 84, 174, 107, 171, 174,
[electron] 169, 215, 129, 1, 115, 197, 135, 155, 112, 30, 237, 62,
[electron] 162, 245, 131, 129,
[electron] ... 2041410 more items
[electron] ]" at path "video_buffer"
I then tried to convert the Uint8Array into a hex encoded string then storing that as a buffer .
On querying MongoDB in Terminal it has stored the string:
const buff_to_hex_string = new_buffer.toString('hex')
Then save it via Mongoose as before : (Schema {type: Buffer})
This seemed to work with following output from MongoDB:
The output of the saved Payload Schema is:
video_buffer: Binary {
[electron] _bsontype: 'Binary',
[electron] sub_type: 0,
[electron] position: 9514868,
[electron] buffer: <Buffer 32 36 2c 36 39 2c 32 32 33 2c 31 36 33 2c 31 35 39 2c 36 36 2c 31 33 34 2c 31 32 39 2c 31 2c 36 36 2c 32 34 37 2c 31 32 39 2c 31 2c 36 36 2c 32 34 32 ... 9514818 more bytes>
Question: On retrieving the stored buffer object, what is the easiest way to convert it back into an ArrayBuffer so that I can play it in a video element ?
Do I need to convert backwards --> hex string --> Uint8Array --> ArrayBuffer ---> Blob ---> URL ?
I've tried lots of methods provided but none work.
Any help would be much appreciated.
Thanks.

Related

What data format is this alongside ascii and decimal?

Consider the following:
use sha2::{Sha256,Digest};
fn main() {
let mut hasher = Sha256::new();
hasher.update(b"hello world");
let result = hasher.finalize();
let str_result = format!("{:x}", result);
println!("A string is: {:x}", result);
println!("ASCII decimal maps: {:?}", str_result.bytes());
println!("What data coding is this?: {:?}", result);
}
The SHA256 hash as a string is: b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
ASCII decimal maps: Bytes(Copied { it: Iter([98, 57, 52, 100, 50, 55, 98, 57, 57, 51, 52, 100, 51, 101, 48, 56, 97, 53, 50, 101, 53, 50, 100, 55, 100, 97, 55, 100, 97, 98, 102, 97, 99, 52, 56, 52, 101, 102, 101, 51, 55, 97, 53, 51, 56, 48, 101, 101, 57, 48, 56, 56, 102, 55, 97, 99, 101, 50, 101, 102, 99, 100, 101, 57]) })
What data coding is this?: [185, 77, 39, 185, 147, 77, 62, 8, 165, 46, 82, 215, 218, 125, 171, 250, 196, 132, 239, 227, 122, 83, 128, 238, 144, 136, 247, 172, 226, 239, 205, 233]
The first two make sense, we have the ASCII representations, followed by the ASCII > Decimal map. What is the third format? [185, 77, 39, 185, 147, 77, 62, 8, 165, 46, 82, 215, 218, 125, 171, 250, 196, 132, 239, 227, 122, 83, 128, 238, 144, 136, 247, 172, 226, 239, 205, 233]?

It's the bytes of the hash represented as an array of decimals instead of as a hexadecimal string.
b94d27... -> [185, 77, 39 ...]
0xb9 -> 185
0x4d -> 77
0x27 -> 39

Unexpected result when calling toString on a buffer in Node

I'm in a situation where I need to revert data back to a buffer that has had toString called on it. For example:
const buffer // I need this, or equivalent
const bufferString = buffer.toString() // This is all I have
The node documentation implies that .toString() defaults to 'utf8' encoding, and I can revert this with Buffer.from(bufferString, 'utf8'), but this doesn't work and I get different data. (maybe some data loss when it is converted to a string, although the documentation doesn't seem to mention this).
Does anyone know why this is happening or how to fix it?
Here is the data I have to reproduce this:
const intArr = [31, 139, 8, 0, 0, 0, 0, 0, 0, 0, 170, 86, 42, 201, 207, 78, 205, 83, 178, 82, 178, 76, 78, 53, 179, 72, 74, 51, 215, 53, 54, 51, 51, 211, 53, 49, 78, 50, 210, 77, 74, 49, 182, 208, 53, 52, 178, 180, 72, 75, 76, 52, 75, 180, 76, 50, 81, 170, 5, 0, 0, 0, 255, 255, 3, 0, 29, 73, 93, 151, 48, 0, 0, 0]
const buffer = Buffer.from(intArr) // The buffer I want!
const bufferString = buffer.toString() // The string I have!, note .toString() and .toString('utf8') are equivalent
const differentBuffer = Buffer.from(bufferString, 'utf8')
You can get the initial intArr from a buffer by doing this:
JSON.parse(JSON.stringify(Buffer.from(buffer)))['data']
Edit: interestingly calling .toString() on differentBuffer gives the same initial string.

I think the important part of the documentation you linked is When decoding a Buffer into a string that does not exclusively contain valid UTF-8 data, the Unicode replacement character U+FFFD � will be used to represent those errors. When you are converting your buffer into a utf8 string, not all characters are valid utf8, as you can see by doing a console.log(bufferString); almost all of it comes out as gibberish. Therefore you are irretrievably losing data when converting from the buffer into a utf8 string and you can't get that lost data back when converting back into the buffer.
In your example if you were to use utf16 instead of utf8 you don't lose information and thus your buffer is the same once converting back. I.E
const intArr = [31, 139, 8, 0, 0, 0, 0, 0, 0, 0, 170, 86, 42, 201, 207, 78, 205, 83, 178, 82, 178, 76, 78, 53, 179, 72, 74, 51, 215, 53, 54, 51, 51, 211, 53, 49, 78, 50, 210, 77, 74, 49, 182, 208, 53, 52, 178, 180, 72, 75, 76, 52, 75, 180, 76, 50, 81, 170, 5, 0, 0, 0, 255, 255, 3, 0, 29, 73, 93, 151, 48, 0, 0, 0]
const buffer = Buffer.from(intArr);
const bufferString = buffer.toString('utf16le');
const differentBuffer = Buffer.from(bufferString, 'utf16le') ;
console.log(buffer); // same as the below log
console.log(differentBuffer); // same as the above log

Use the 'latin1' or 'binary' encoding with Buffer.toString and Buffer.from. Those encodings are the same and map bytes to the unicode characters U+0000 to U+00FF.

Ruby equivalent of Node .toString('ascii')

I am struggling with converting a Node application to Ruby. I have a Buffer of integers that I need to encode as an ASCII string.
In Node this is done like this:
const a = Buffer([53, 127, 241, 120, 57, 136, 112, 210, 162, 200, 111, 132, 46, 146, 210, 62, 133, 88, 80, 97, 58, 139, 234, 252, 246, 19, 191, 84, 30, 126, 248, 76])
const b = a.toString('hex')
// b = "357ff178398870d2a2c86f842e92d23e855850613a8beafcf613bf541e7ef84c"
const c = a.toString('ascii')
// c = '5qx9\bpR"Ho\u0004.\u0012R>\u0005XPa:\u000bj|v\u0013?T\u001e~xL'
I want to get the same output in Ruby but I don't know how to convert a to c. I used b to validate that a is parsed the same in Ruby and Node and it looks like it's working.
a = [53, 127, 241, 120, 57, 136, 112, 210, 162, 200, 111, 132, 46, 146, 210, 62, 133, 88, 80, 97, 58, 139, 234, 252, 246, 19, 191, 84, 30, 126, 248, 76].pack('C*')
b = a.unpack('H*')
# ["357ff178398870d2a2c86f842e92d23e855850613a8beafcf613bf541e7ef84c"]
# c = ???
I have tried serveral things, virtually all of the unpack options, and I also tried using the encode function but I lack the understanding of what the problem is here.

Okay well I am not that familiar with Node.js but you can get fairly close with some basic understandings:
Node states:
'ascii' - For 7-bit ASCII data only. This encoding is fast and will strip the high bit if set.
Update After rereading the nod.js description I think it just means it will drop 127 and only focus on the first 7 bits so this can be simplified to:
def node_js_ascii(bytes)
bytes.map {|b| b % 128 }
.reject(&127.method(:==))
.pack('C*')
.encode(Encoding::UTF_8)
end
node_js_ascii(a)
#=> #=> "5qx9\bpR\"Ho\u0004.\u0012R>\u0005XPa:\vj|v\u0013?T\u001E~xL"
Now the only differences are that node.js uses "\u000b" to represent a vertical tab and ruby uses "\v" and that ruby uses uppercase characters for unicode rather than lowercase ("\u001E" vs "\u001e") (you could handle this if you so chose)
Please note This form of encoding is not reversible due to the fact that you have characters that are greater than 8 bits in your byte array.
TL;DR (previous explanation and solution only works up to 8 bits)
Okay so we know the max supported decimal is 127 ("1111111".to_i(2)) and that node will strip the high bit if set meaning [I am assuming] 241 (an 8 bit number will become 113 if we strip the high bit)
With that understanding we can use:
a = [53, 127, 241, 120, 57, 136, 112, 210, 162, 200, 111, 132, 46, 146, 210, 62, 133, 88, 80, 97, 58, 139, 234, 252, 246, 19, 191, 84, 30, 126, 248, 76].map do |b|
b < 128 ? b : b - 128
end.pack('C*')
#=> "5\x7Fqx9\bpR\"Ho\x04.\x12R>\x05XPa:\vj|v\x13?T\x1E~xL"
Then we can encode that as UTF-8 like so:
a.encode(Encoding::UTF_8)
#=> "5\u007Fqx9\bpR\"Ho\u0004.\u0012R>\u0005XPa:\vj|v\u0013?T\u001E~xL"
but there is still is still an issue here.
It seems Node.js also ignores the Delete (127) when it converts to 'ascii' (I mean the high bit is set but if we strip it then it is 63 ("?") which doesn't match the output) so we can fix that too
a = [53, 127, 241, 120, 57, 136, 112, 210, 162, 200, 111, 132, 46, 146, 210, 62, 133, 88, 80, 97, 58, 139, 234, 252, 246, 19, 191, 84, 30, 126, 248, 76].map do |b|
b < 127 ? b : b - 128
end.pack('C*')
#=> "5\xFFqx9\bpR\"Ho\x04.\x12R>\x05XPa:\vj|v\x13?T\x1E~xL"
a.encode(Encoding::UTF_8, undef: :replace, replace: '')
#=> "5qx9\bpR\"Ho\u0004.\u0012R>\u0005XPa:\vj|v\u0013?T\u001E~xL"
Now since 127 - 128 = -1 (negative signed bit) becomes "\xFF" an undefined character in UTF-8 so we add undef: :replace what to do when the character is undefined use replace and we add replace: '' to replace with nothing.

numpy array saving to csv

I'm trying to save numpy array to csv file but there is a problem,
I use two different solution but they did not work
my numpy array looks like,
In[39]: arr[0]
Out[39]:
array([ array([[ 30, 29, 198, ..., 149, 149, 149],
[ 29, 29, 197, ..., 149, 149, 149],
[ 29, 29, 197, ..., 149, 149, 149],
...,
[ 63, 63, 96, ..., 105, 104, 104],
[ 63, 63, 96, ..., 106, 105, 105],
[ 77, 77, 217, ..., 217, 217, 217]], dtype=uint8),
list([0, 0, 0, 0, 0, 0, 0, 0, 0])], dtype=object)
Its shape is (1200, 2) numpy array and I want to save it to csv file,
with np.savetxt function
In[40]: np.savetxt("numpy_array.csv", arr, delimiter=',')
Traceback (most recent call last):
File "D:\Program files\Anaconda3\lib\site-packages\numpy\lib\npyio.py", line 1254, in savetxt
fh.write(asbytes(format % tuple(row) + newline))
TypeError: only length-1 arrays can be converted to Python scalars
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Program files\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2862, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-41-673bcc1d77a6>", line 1, in <module>
np.savetxt("numpy_array.csv", arr, delimiter=',')
File "D:\Program files\Anaconda3\lib\site-packages\numpy\lib\npyio.py", line 1258, in savetxt
% (str(X.dtype), format))
TypeError: Mismatch between array dtype ('object') and format specifier ('%.18e,%.18e')
with pandas
In[42]: df = pd.DataFrame(arr)
In[43]: df[:5]
Out[43]:
0 \
0 [[30, 29, 198, 198, 197, 197, 197, 197, 197, 1...
1 [[29, 29, 197, 197, 196, 196, 197, 197, 197, 1...
2 [[29, 29, 196, 196, 196, 196, 196, 196, 196, 1...
3 [[29, 29, 196, 196, 196, 196, 196, 196, 196, 1...
4 [[29, 29, 196, 196, 196, 196, 196, 196, 197, 1...
1
0 [0, 0, 0, 0, 0, 0, 0, 0, 0]
1 [1, 0, 0, 0, 0, 0, 0, 0, 0]
2 [1, 0, 0, 0, 0, 0, 0, 0, 0]
3 [1, 0, 0, 0, 0, 0, 0, 0, 0]
4 [1, 0, 0, 0, 0, 0, 0, 0, 0]
In[44]: df.to_csv("h.csv", index=False)
In[45]: a = pd.read_csv("h.csv", header=None,names =['input', 'output'])
In[46]: a[:5]
Out[46]:
input \
0 0
1 [[ 30 29 198 ..., 149 149 149]\r\n [ 29 29 1...
2 [[ 29 29 197 ..., 149 149 149]\r\n [ 29 29 1...
3 [[ 29 29 196 ..., 149 149 149]\r\n [ 29 29 1...
4 [[ 29 29 196 ..., 149 149 149]\r\n [ 29 29 1...
output
0 1
1 [0, 0, 0, 0, 0, 0, 0, 0, 0]
2 [1, 0, 0, 0, 0, 0, 0, 0, 0]
3 [1, 0, 0, 0, 0, 0, 0, 0, 0]
4 [1, 0, 0, 0, 0, 0, 0, 0, 0]
when I print "df[:5]", everything looks great, but after I saved it to csv then read it from csv, it looks awful, there are not commas between numbers and there are '\r\n' between list.
I want to see like "df[:5]" 's output after read csv file, how can I do it, what is the problem?

Your array is 2d, (1200, 2) with object dtype. Evidently the first column contains 2d arrays, and the 2nd column lists.
arr[0,0] is a 2d array
array([[ 30, 29, 198, ..., 149, 149, 149],
[ 29, 29, 197, ..., 149, 149, 149],
[ 29, 29, 197, ..., 149, 149, 149],
...,
[ 63, 63, 96, ..., 105, 104, 104],
[ 63, 63, 96, ..., 106, 105, 105],
[ 77, 77, 217, ..., 217, 217, 217]], dtype=uint8)
You could easily write in a csv format. For example:
In [342]: arr = np.array([[ 30, 29, 198, 149, 149, 149],
...: [ 29, 29, 197, 149, 149, 149],
...: [ 29, 29, 197, 149, 149, 149],
...: [ 63, 63, 96, 105, 104, 104],
...: [ 63, 63, 96, 106, 105, 105],
...: [ 77, 77, 217, 217, 217, 217]], dtype=np.uint8)
...:
...:
In [343]: np.savetxt('arr.txt', arr, delimiter=',', fmt='%4d')
produces a file that looks like:
In [344]: cat arr.txt
30, 29, 198, 149, 149, 149
29, 29, 197, 149, 149, 149
29, 29, 197, 149, 149, 149
63, 63, 96, 105, 104, 104
63, 63, 96, 106, 105, 105
77, 77, 217, 217, 217, 217
Read savetxt for more details on fmt.
But the full array is not compatible with the simple 2d layout of a csv file. Sure you could write something more complicated, but you couldn't load it with a csv reader like np.genfromtxt or np.loadtxt. Those expect the neat row and column layout with a well defined delimiter.
In [346]: data = np.genfromtxt('arr.txt',delimiter=',',dtype=None)
In [347]: data
Out[347]:
array([[ 30, 29, 198, 149, 149, 149],
[ 29, 29, 197, 149, 149, 149],
[ 29, 29, 197, 149, 149, 149],
[ 63, 63, 96, 105, 104, 104],
[ 63, 63, 96, 106, 105, 105],
[ 77, 77, 217, 217, 217, 217]])
The pandas df shows two columns, one with the arrays, the other with the lists. But in a column 0 appears to contain string representations of the 2d arrays, as indicated by the newline characters. Did you look at the h.csv file? Part of the reason for using csv is so people can read it, and other programs (like excel) can read it.
Make an array like your big one
In [349]: barr = np.empty((3,2), object)
In [350]: barr[:,0]=[arr,arr,arr]
In [351]: barr[:,1]=[[0,0,0] for _ in range(3)]
In [352]: barr
Out[352]:
array([[array([[ 30, 29, 198, 149, 149, 149],
[ 29, 29, 197, 149, 149, 149],
[ 29, 29, 197, 149, 149, 149],
[ 63, 63, 96, 105, 104, 104],
[ 63, 63, 96, 106, 105, 105],
[ 77, 77, 217, 217, 217, 217]], dtype=uint8),
list([0, 0, 0])],
[array([[ 30, 29, 198, 149, 149, 149],
...
[ 77, 77, 217, 217, 217, 217]], dtype=uint8),
list([0, 0, 0])]], dtype=object)
Write it %s format, the only one that will work with objects like this:
In [354]: np.savetxt('barr.txt',barr, delimiter=',',fmt='%s')
In [355]: cat barr.txt
[[ 30 29 198 149 149 149]
[ 29 29 197 149 149 149]
[ 29 29 197 149 149 149]
[ 63 63 96 105 104 104]
[ 63 63 96 106 105 105]
[ 77 77 217 217 217 217]],[0, 0, 0]
[[ 30 29 198 149 149 149]
[ 29 29 197 149 149 149]
[ 29 29 197 149 149 149]
[ 63 63 96 105 104 104]
[ 63 63 96 106 105 105]
[ 77 77 217 217 217 217]],[0, 0, 0]
[[ 30 29 198 149 149 149]
[ 29 29 197 149 149 149]
[ 29 29 197 149 149 149]
[ 63 63 96 105 104 104]
[ 63 63 96 106 105 105]
[ 77 77 217 217 217 217]],[0, 0, 0]
That is not a valid csv file. It is text, but with [] and varying line lengths, none of the standard csv file readers can handle it.
Saving that array as you did with pandas, I get:
In [364]: cat pdbarr.txt
0,1
"[[ 30 29 198 149 149 149]
[ 29 29 197 149 149 149]
[ 29 29 197 149 149 149]
[ 63 63 96 105 104 104]
[ 63 63 96 106 105 105]
[ 77 77 217 217 217 217]]","[0, 0, 0]"
"[[ 30 29 198 149 149 149]
[ 29 29 197 149 149 149]
[ 29 29 197 149 149 149]
[ 63 63 96 105 104 104]
[ 63 63 96 106 105 105]
[ 77 77 217 217 217 217]]","[0, 0, 0]"
"[[ 30 29 198 149 149 149]
[ 29 29 197 149 149 149]
[ 29 29 197 149 149 149]
[ 63 63 96 105 104 104]
[ 63 63 96 106 105 105]
[ 77 77 217 217 217 217]]","[0, 0, 0]"
Notice all the quotes - it's writing those component arrays and lists as strings. Again, not a valid csv.

Numpy itself has no 'save as csv'-function. Normally you save it through another package (like pandas or pickle).
What you see 'it looks awful' is the pandas format. Add arr = np.array(a)
and you have you numpy format again.

Find color blending function from partially known input and output

I have a set of known destination (background) colors and known output colors for some blending function for some constant source color, but I don't know the source color or the function.
As you can see in the image below, there doesn't seem to be full correlation between hues. Destinations on the left are sorted in order of hue, output colors are clearly not. This suggests that the R,G,B channels are not handled independently.
I know that all these destination-output pairs are using the same function and source color, so how would I go about solving? Using a color palette is not suitable as the given set is only a close approximation.
The blend function will be implemented programmatically, but the solution from these unknowns can be either manual or programmatic for my purposes.
Image of given set, destination on the left, output on the right
RGB values:
Source Output
255, 145, 139 237, 139, 131
236, 197, 189 218, 133, 149
217, 195, 42 197, 131, 93
89, 175, 141 58, 135, 90
98, 167, 195 106, 115, 143
111, 169, 193 58, 128, 91
72, 132, 161 76, 115, 139
121, 130, 196 120, 125, 158
99, 104, 167 100, 124, 149
132, 133, 147 131, 126, 135
99, 100, 116 109, 128, 127
66, 63, 79 85, 131, 121

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How do I store an ArrayBuffer in MongoDB? - node.js

Related

What data format is this alongside ascii and decimal?

Unexpected result when calling toString on a buffer in Node

Ruby equivalent of Node .toString('ascii')

numpy array saving to csv

Find color blending function from partially known input and output

Categories

Resources