I want to use Google Chrome's IndexedDB to persist data on the client side.
The idea is to access the IndexedDB outside of Chrome, via Node.js, later on.
The background: I want to track usage behaviour locally and store the collected data on the client for later analysis, without a server backend.
From my understanding, IndexedDB is implemented on top of LevelDB. However, I cannot open the LevelDB with any of the tools/libs like LevelUp/LevelDown or leveldb-json.
I'm always getting this error message:
leveldb-dump-to-json --file test.json --db https_www.reddit.com_0.indexeddb.leveldb
events.js:141
      throw er; // Unhandled 'error' event
      ^
OpenError: Invalid argument: idb_cmp1 does not match existing comparator : leveldb.BytewiseComparator
    at /usr/local/lib/node_modules/leveldb-json/node_modules/levelup/lib/levelup.js:114:34
Can anybody please help? It seems as if the Chrome implementation is somehow special/different.
Keys in leveldb are arbitrary binary sequences. Clients implement comparators to define the ordering between keys. The default comparator for leveldb is roughly equivalent to strncmp. Chrome's comparator for Indexed DB's store is more complicated. If you try to use a leveldb instance with a different comparator than the one it was created with, you'll observe keys in seemingly random order, insertions will be unpredictable or cause corruption - dogs and cats living together, mass hysteria. So leveldb lets you name the comparator (persisted to the database) to help detect and avoid this mistake, which is what you're seeing. Chrome's code names its comparator for Indexed DB "idb_cmp1".
To inspect one of Chrome's Indexed DB leveldb instances outside of Chrome you'd need to implement a compatible comparator. The code lives in Chrome's implementation at content/browser/indexed_db/indexed_db_backing_store.cc - and note that there's no guarantee that this is fixed across versions. (Apart from backwards compatibility, of course.)
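To make the comparator issue concrete, here is a purely illustrative Python sketch (this is not Chrome's actual idb_cmp1, which is far more involved): the same keys sort differently under the default bytewise comparison and under a made-up custom comparator, which is exactly why leveldb records the comparator name and refuses to open the database with the wrong one.

import functools

# Keys in leveldb are arbitrary byte strings.
keys = [b"\x01-store-10", b"\x01-store-2", b"\x02-meta"]

# Default leveldb ordering: plain bytewise comparison (like strncmp/memcmp).
bytewise = sorted(keys)

# A made-up custom comparator (a stand-in for something like idb_cmp1):
# pretend trailing digits should compare numerically rather than byte-by-byte.
def custom_cmp(a: bytes, b: bytes) -> int:
    def split(k: bytes):
        i = len(k)
        while i > 0 and k[i - 1:i].isdigit():
            i -= 1
        return k[:i], int(k[i:] or 0)
    ka, kb = split(a), split(b)
    return (ka > kb) - (ka < kb)

custom = sorted(keys, key=functools.cmp_to_key(custom_cmp))

print(bytewise)  # [b'\x01-store-10', b'\x01-store-2', b'\x02-meta']
print(custom)    # [b'\x01-store-2', b'\x01-store-10', b'\x02-meta']

# Two different total orders over the same keys: data written under one ordering is
# unreadable under the other, so leveldb stores the comparator's name ("idb_cmp1" for
# Chrome's IndexedDB) and refuses to open the database with a mismatched one - which
# is the OpenError shown above.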
It's implemented and publicly available on GitHub now:
C# example
Python example
maven: https://github.com/hnuuhc/often-utils
code:
// reads the key/value pairs Chrome has stored locally for pixiv.net
Map<String, String> storages = LocalStorage.home().getForDomain("pixiv.net");
I am using Python 3.8.10 and fabric 2.7.0.
I have a Connection to a remote host. I am executing a command as follows:
resObj = connection.run("cat /usr/bin/binaryFile")
So in theory the bytes of /usr/bin/binaryFile are getting pumped into stdout, but I cannot figure out what wizardry is required to get them out of resObj.stdout and written to a local file with a matching checksum (as in, get all the bytes out of stdout). For starters, len(resObj.stdout) != binaryFile.size. Visually comparing what is present in resObj.stdout with what is in /usr/bin/binaryFile via hexdump or similar makes them look roughly similar, but something is going wrong.
May the record show, I am aware that this particular example would be better accomplished with...
connection.get('/usr/bin/binaryFile')
The point though is that I'd like to be able to get arbitrary binary data out of stdout.
Any help would be greatly appreciated!!!
I eventually gave up on doing this using the fabric library and reverted to straight up paramiko. People give paramiko a hard time for being "too low level" but the truth is that it offers a higher level API which is pretty intuitive to use. I ended up with something like this:
from paramiko import SSHClient, AutoAddPolicy

with SSHClient() as client:
    client.set_missing_host_key_policy(AutoAddPolicy())
    client.connect(hostname, **connectKwargs)
    stdin, stdout, stderr = client.exec_command("cat /usr/bin/binaryFile")
In this setup, I can get the raw bytes via stdout.read() (or similarly, stderr.read()).
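For completeness, a short sketch of getting those bytes onto disk and sanity-checking them (the local file name and the sha256 comparison are just illustrative, not part of the original setup):

import hashlib

# 'stdout' is the file-like object returned by exec_command() above;
# .read() returns raw bytes, so nothing gets mangled by text decoding.
data = stdout.read()

with open("binaryFile.local", "wb") as f:
    f.write(data)

# Optional sanity check: compare this digest with `sha256sum /usr/bin/binaryFile`
# run on the remote host.
print(hashlib.sha256(data).hexdigest())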
To do other things that fabric exposes, like put and get, it is easy enough to do:
# client from above
with client.open_sftp() as sftpClient:
    sftpClient.put(...)
    sftpClient.get(...)
I was also able to get the exit code, per this SO answer, by doing:
# stdout from above
stdout.channel.recv_exit_status()
The docs for recv_exit_status list a few gotchas that are worth being aware of too: https://docs.paramiko.org/en/latest/api/channel.html#paramiko.channel.Channel.recv_exit_status
The moral of the story for me is that fabric ends up feeling like an over-abstraction, while paramiko has an easy-to-use, higher-level API and also the low-level primitives when appropriate.
Can I route Node A into Node B, and Node B back into Node A (of course using a Mixer in between) -- otherwise called "Feedback"? (For example, WebAudio supports this).
No, trying to set up a recursive route will result in AVAudioEngine freezing and a seemingly unrelated error appearing in the console:
warning: could not execute support code to read Objective-C class data in the process. This may reduce the quality of type information available.
A Socket.IO handshake looks something like this:
http://localhost:3000/socket.io/?EIO=3&transport=polling&t=M5eHk0h
What is the t parameter? I can't find an explanation.
This is the timestampParam from engine.io-client. Its value is a unique ID generated using the npm package yeast.
This is referenced in the API docs under the Socket constructor options (docs below). If no value is given to timestampParam when creating a new instance of a Socket, the parameter name is switched to t and assigned a value from yeast(). You can see this in the source on line 223 of lib/transports/polling.js.
Socket constructor() Options
timestampParam (String): timestamp parameter (t)
To clarify where engine.io-client comes into play: it is a dependency of socket.io-client, which socket.io depends on. engine.io provides the actual communication-layer implementation that socket.io is built upon. engine.io-client is the client portion of engine.io.
Why does socket.io use t?
As jfriend00 pointed out in the comments, t is used for cache busting. Cache busting is a technique that prevents the browser from serving a cached copy of a resource instead of requesting the resource again.
Socket.io implements cache busting with a timestamp parameter in the query string. If you assign timestampParam a value of ts, then the key for the timestamp would be ts; it defaults to t if no value is assigned. By assigning this parameter a unique value created with yeast on every poll to the server, Socket.io is able to always retrieve the latest data from the server and circumvent the cache. Since polling transports would not work as expected without cache busting, timestamping is enabled by default and must be explicitly disabled.
AFAIK, the Socket.io server does not utilize the timestamp parameter for anything other than cache busting.
More about yeast()
yeast() guarantees a compressed unique ID specifically for cache busting. The README gives us some more detailed information on how yeast() works.
Yeast is a unique id generator. It has been primarily designed to generate a unique id which can be used for cache busting. A common practice for this is to use a timestamp, but there are a couple of downsides when using timestamps.
The timestamp is already 13 chars long. This might not matter for 1 request but if you make hundreds of them this quickly adds up in bandwidth and processing time.
It's not unique enough. If you generate two stamps right after each other, they would be identical because the timing accuracy is limited to milliseconds.
Yeast solves both of these issues by:
Compressing the generated timestamp using a custom encode() function that returns a string representation of the number.
Seeding the id in case of collision (when the id is identical to the previous one).
To keep the strings unique it will use the . char to separate the generated stamp from the seed.
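As a rough illustration of that scheme, here is a Python sketch of the behaviour described above; the real yeast is a tiny JavaScript module, and the exact alphabet and encoding here are just for illustration.

import string
import time

# 64-character alphabet used to shorten the millisecond timestamp.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase + "-_"

def encode(num: int) -> str:
    """Encode an integer as a short base-64-style string."""
    out = ""
    while True:
        out = ALPHABET[num % 64] + out
        num //= 64
        if num == 0:
            return out

_prev, _seed = None, 0

def yeast_like() -> str:
    """Compressed timestamp ID; adds a '.seed' suffix if two IDs collide in the same millisecond."""
    global _prev, _seed
    now = encode(int(time.time() * 1000))
    if now != _prev:
        _prev, _seed = now, 0
        return now
    _seed += 1
    return now + "." + encode(_seed)  # '.' is not in the alphabet, so IDs stay distinct

print(yeast_like())  # a ~7-character stamp instead of 13 digits of epoch milliseconds
print(yeast_like())  # called again within the same millisecond -> gets a '.1'-style suffix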
I would like to know if it is possible to send a block of data, like 128 bytes (a Motorola SREC, for example), to a Xively server. I need this to do firmware upgrades / download images to my Arduino-connected device. As far as I can see, one can only get datapoints / values?
A value of a datapoint can be a string. Firmware updates can be implemented using Xively API v2 by just storing string-encoded binaries as datapoints, provided that the size is small.
You can probably make some use of timestamps for rolling back to versions that did work, or something similar. Also, you probably want to use the datapoints endpoint so you can just grab the entire response body without needing to parse anything.
/v2/feeds/<feed_id>/datastreams/<datastream_id>/datapoints/<timestamp>.csv
I suppose you will need to implement this in the bootloader, which needs to be very small, so maybe you can actually skip parsing the HTTP headers and only attempt to verify whether the body looks right (i.e. it has some magic bytes that you put in there; you can also try to checksum it). This would be a little bit opportunistic, but might be okay for an experiment. You should probably add Xively device provisioning to this as well, but I wouldn't try implementing everything right away.
It is, however, quite challenging to implement reliable firmware updates, and there are several papers out there which you should read. Some suggest making the device's behaviour as primitive as you can, avoiding any logic and making it rely on what the server tells it to do.
To actually store the firmware string, you can use the cURL helper.
Add first version into a new datastream
Update with a new version
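The cURL examples from the original answer are not reproduced here, but the general shape of the calls against the datapoints endpoint above can be sketched. The following Python/requests sketch is only illustrative: the feed and datastream IDs are placeholders, and the X-ApiKey header, the JSON body layout, and base64-encoding of the image are my assumptions about one way to drive the (long since retired) Xively v2 API.

import base64
import requests

API_KEY = "YOUR_XIVELY_API_KEY"  # assumption: API key sent via the X-ApiKey header
FEED_ID = "123456789"            # placeholder feed id
DATASTREAM = "firmware"          # placeholder datastream id
BASE = "https://api.xively.com/v2/feeds/%s/datastreams/%s" % (FEED_ID, DATASTREAM)

def push_firmware_version(path):
    """Store a (small) firmware image as a string-encoded datapoint value."""
    with open(path, "rb") as f:
        payload = base64.b64encode(f.read()).decode("ascii")
    # Assumed body layout for posting datapoints; check the API docs of the day.
    body = {"datapoints": [{"value": payload}]}
    resp = requests.post(BASE + "/datapoints", json=body,
                         headers={"X-ApiKey": API_KEY})
    resp.raise_for_status()

# The bootloader would later GET <BASE>/datapoints/<timestamp>.csv and treat the
# response body as the encoded image, as described in the answer above.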
I've built a WSSv3 application which uploads files in small chunks; as each piece of data arrives, I temporarily keep it in a SQL 2005 image data type field, for performance reasons**.
The problem comes when the upload ends: I need to move the data from my SQL Server to the SharePoint document library through the WSSv3 object model.
Right now, I can think of two approaches:
SPFileCollection.Add(string, (byte[])reader[0]); // OutOfMemoryException
and
SPFile file = folder.Files.Add("filename", new byte[]{ });
using (Stream stream = file.OpenBinaryStream())
{
    // ... init vars and stuff ...
    while ((bytes = reader.GetBytes(0, offset, buffer, 0, BUFFER_SIZE)) > 0)
    {
        stream.Write(buffer, 0, (int)bytes); // Timeout issues
    }
    file.SaveBinary(stream);
}
Is there any other way to complete this task successfully?
** Performance reasons: if you try to write every chunk directly to SharePoint, you'll notice performance degradation as the file grows (>100 MB).
I ended up with the following code:
myFolder.Files.Add("filename",
    new DataRecordStream(dataReader,
        dataReader.GetOrdinal("Content"), length));
You can find the DataRecordStream implementation here. It's basically a Stream that reads data from a DbDataRecord through .GetBytes.
This approach is similar to OpenBinaryStream()/SaveBinary(stream), but it doesn't keep the whole byte[] in memory while you transfer the data. At some point, the DataRecordStream will be accessed from Microsoft.SharePoint.SPFile.CloneStreamToSPFileStream using 64k chunks.
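The linked DataRecordStream code itself isn't reproduced here, but the principle (pull the blob from the reader in fixed-size chunks rather than materializing one huge byte[]) can be sketched language-neutrally. In this rough Python illustration, read_chunk is a stand-in for IDataRecord.GetBytes; it is not the actual DataRecordStream implementation.

def iter_record_chunks(read_chunk, length, chunk_size=64 * 1024):
    """Yield a blob in fixed-size chunks; read_chunk(offset, size) -> bytes."""
    offset = 0
    while offset < length:
        data = read_chunk(offset, min(chunk_size, length - offset))
        if not data:
            break
        yield data
        offset += len(data)

# The consumer (CloneStreamToSPFileStream plays this role in SharePoint) copies the
# stream chunk by chunk instead of holding the whole file in memory:
# with open("destination", "wb") as out:
#     for chunk in iter_record_chunks(read_chunk, total_length):
#         out.write(chunk)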
Thank you all for the valuable info!
The first thing I would say is that SharePoint is really, really not designed for this. It stores all files in its own database so that's where these large files are going. This is not a good idea for lots of reasons: scalability, cost, backup/restore, performance, etc... So I strongly recommend using file shares instead.
You can increase the timeout of the web request by changing the executionTimeout attribute of the httpRuntime element in web.config.
Apart from that, I'm not sure what else to suggest. I haven't heard of such large files being stored in SharePoint. If you absolutely must do this, try also asking on Server Fault.
As mentioned previously, storing large files in SharePoint is generally a bad idea. See this article for more information: http://blogs.msdn.com/joelo/archive/2007/11/08/what-not-to-store-in-sharepoint.aspx
With that said, it is possible to use external storage for BLOBs, which may or may not help your performance issues -- Microsoft released a half-complete external BLOB storage provider that does the trick, but it unfortunately works at the farm level and affects all uploads. Ick.
Fortunately, since you can implement your own external BLOB provider, you may be able to write something to better handle these specific files. See this article for details: http://207.46.16.252/en-us/magazine/2009.06.insidesharepoint.aspx
Whether or not this would be worth the overhead depends on how much of a problem you're having. :)