Big Data for offline Electron app - node.js

I have a question about handling big data in an offline Electron application.
I built an application that handles a fair amount of data (currently a 250 MB SQLite database).
I'm handling the SQLite data with the sql.js library, because the sqlite npm library doesn't pass the build phase (even with some tricks found online).
What I'd like to know is whether this is the best solution for handling offline data with update capability.
My app currently connects to my server, syncs with the online database, and allows the user to work offline.
The sync is done by downloading the whole DB every time, because if I perform a massive INSERT statement (even with transactions, pragmas, and everything else) the sql.js library has memory issues and crashes.
Now they're asking me to add more data, so the DB size will grow.
Is there any other option I should evaluate?
I've been evaluating everything from localStorage to PouchDB, but I don't want to tear everything down without good reason...
Any help will be appreciated.
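For reference, this is roughly the batched-INSERT pattern described above, sketched with sql.js; the table schema, `rows` payload, and chunk size are hypothetical:

```js
// Minimal sketch: importing rows into sql.js in chunked transactions.
// The schema, `rows` format ([id, name] pairs), and CHUNK are hypothetical.
const initSqlJs = require('sql.js');

async function importRows(rows) {
  const SQL = await initSqlJs();
  const db = new SQL.Database(); // sql.js keeps the entire DB in RAM
  db.run('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)');

  const CHUNK = 5000; // smaller chunks keep peak memory lower
  for (let i = 0; i < rows.length; i += CHUNK) {
    db.run('BEGIN TRANSACTION');
    const stmt = db.prepare('INSERT INTO items (id, name) VALUES (?, ?)');
    for (const [id, name] of rows.slice(i, i + CHUNK)) {
      stmt.run([id, name]);
    }
    stmt.free();
    db.run('COMMIT');
  }
  return db.export(); // Uint8Array of the whole DB, ready to write to disk
}
```

Even with chunking, the whole database lives in memory with sql.js, which is why it struggles as the 250 MB file grows.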

Related

node.js firestore cache snapshot and persist even after reboot [duplicate]

https://firebase.google.com/docs/firestore/manage-data/enable-offline
How does Firestore work with offline data?
How are writes merged by many clients editing the same data offline, that then come online at the same time?
How long is offline data persisted? If my user uses my app for 5 years offline, then comes back online, will this be an issue? Do offline changes persist after device restarts?
Does query performance of offline data degrade as the data set gets larger?
I'm specifically interested in the web Firestore client.
Do all the language clients implement the above in the same manner?
Thanks.
How are writes merged by many clients editing the same data offline, that then come online at the same time?
Write operations are applied on the Firebase servers in the order in which they happened. The last (most recent) operation is the one that will be present in the database once synchronization occurs.
How long is offline data persisted? If my user uses my app for 5 years offline, then comes back online, will this be an issue?
The problem is not how long, but how many write operations you make while the device is offline. While offline, Firestore keeps all write operations in a queue. As this queue grows, local operations and app startup slow down. Nothing major, but over time these can add up. The bigger problem is that until the device reconnects, the data on the server stays unmodified; then what is the purpose of a realtime database? Firestore is really designed as an online database that can work for short to intermediate periods of being disconnected, not to stay offline for 5 years. Besides that, after 5 years the issue might be compatibility rather than the number of writes.
Do offline changes persist after device restarts?
The offline persistence is also called disk persistence. This type of persistence is enabled by default in Cloud Firestore and it means that recently listened data (as well as any pending writes from the app to the database) are persisted to disk. The data in this cache survives app restarts and device reboots.
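Note that for the web client this persistence is opt-in rather than enabled by default; a minimal sketch with the v8-style web SDK:

```js
// Minimal sketch: enabling disk persistence in the Firestore web client,
// where it is opt-in (v8-style SDK shown; config values are placeholders).
const firebase = require('firebase/app');
require('firebase/firestore');

firebase.initializeApp({ /* your project config */ });

firebase.firestore().enablePersistence()
  .then(() => {
    // Cached data and pending writes now survive page reloads.
  })
  .catch((err) => {
    if (err.code === 'failed-precondition') {
      // Several tabs open: persistence can only be enabled in one tab.
    } else if (err.code === 'unimplemented') {
      // The browser lacks the required features (e.g. IndexedDB).
    }
  });
```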
Does query performance of offline data degrade as the data set gets larger?
Yes, it does, as explained above.
Do all the language clients implement the above in the same manner?
No. For iOS and Android the offline feature works fine, while for the web it is still experimental.

What database to use for temporary data storage?

I'm creating a Node.js server on Heroku.
I only need to store data for around 30 min to 1 hr at a time; after that, I can release the data.
Heroku recommends not using SQLite because it is an in-memory database and will get reset every time the server goes to sleep.
Since I don't need the data for very long, is it okay if I go through with this?
If you're curious, the project is to track timestamps of summoner spells that occur in a game of League of Legends. Since League of Legends games only last around 30 min to 1 hr, I don't need to hold the data for very long.
SQLite is not an in-memory database. It stores the tables in a database file. In-memory tables are just an optional feature. The advantage of SQLite is that no setup or administration is needed, since it is embedded in your application. It has a small footprint, is very well tested and is actively developed. SQLite is the right choice.
Maybe the confusion comes from misunderstanding "embedded in your application". It is the database engine code that is embedded, not the database and its tables.
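To make the distinction concrete, here is a quick sketch with the node sqlite3 package (the file name and table are hypothetical):

```js
// Minimal sketch: file-backed vs. in-memory SQLite with the sqlite3 package.
// The file name and table are hypothetical.
const sqlite3 = require('sqlite3');

// File-backed: tables live in ./games.db and survive process restarts
// (though note that Heroku's dyno filesystem is itself ephemeral).
const fileDb = new sqlite3.Database('./games.db');

// In-memory: the *optional* feature that causes the confusion above.
const memDb = new sqlite3.Database(':memory:');

fileDb.serialize(() => {
  fileDb.run('CREATE TABLE IF NOT EXISTS spells (summoner TEXT, used_at INTEGER)');
  fileDb.run('INSERT INTO spells VALUES (?, ?)', ['flash', Date.now()]);
});
```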

How should I keep temporary data for socket.io interactions in node.js?

I am building a simple game in node.js using socket.io. My web experience with node.js has typically involved saving everything to a relational database and keeping nothing in memory. I set up a relational database for the state of a game. I am using sqlite3 for development and I might use something like PostgreSQL or MySQL for production.
My concern is that every time an event is emitted from the socket, the whole game state is loaded into memory on the server. I suspect that in practice this will be less efficient than just keeping all of the game-state data in memory. Events will probably be emitted every 5 seconds or so during a game. All of the game data is temporary; none of it will be needed after the game is over. A game state consists of a set of about 120 groups of small strings and integers (about 10 per group, but subject to change).
Is it good practice to keep this type of data in memory?
If not, should I stick with relational databases or switch to a third option like a file-based storage structure?
Should I avoid loading the whole game state for every event, even though that will lead to a lot more reads/writes (at least triple)?
I would not keep this data in the memory of your Node.js application; it's best to avoid storing state in your app server. If you really need faster read access than SQL provides, consider using a cache like Redis or Memcached as a layer between your app and DB.
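As a rough sketch of that cache layer with the node redis client (the key scheme, TTL, and the `loadGameStateFromSql` stand-in are hypothetical):

```js
// Minimal sketch: Redis as a read-through cache between the app and the SQL DB.
const { createClient } = require('redis');

const cache = createClient();
const ready = cache.connect(); // connect once at startup

// Hypothetical stand-in for the real relational query.
async function loadGameStateFromSql(gameId) {
  return { gameId, groups: [] };
}

async function getGameState(gameId) {
  await ready;
  const key = `game:${gameId}`;

  const cached = await cache.get(key);
  if (cached) return JSON.parse(cached); // cache hit: skip the DB

  const state = await loadGameStateFromSql(gameId); // cache miss: hit the DB
  await cache.set(key, JSON.stringify(state), { EX: 3600 }); // expire in 1 h
  return state;
}
```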
All that being said, it's best not to prematurely optimize your code. Most SQL engines have their own form of caching, and optimizing your SQL queries is a better place to start if you're experiencing performance issues: Postgresql Query Optimization
But don't worry about it until it's an actual problem (because most likely it never will be).
It sounds like a relational, SQL-type database is huge overhead for your specifics. Do you have an idea how big your data is and how many users you'd like to handle? Then you could compare that with your server's capacity. If the result is negative (it can't be handled in memory), I'd go with a quick NoSQL store like Mongo. For your example it sounds like the best choice: it'll be faster to get the data for a single session, easier to dump, and more elastic in structure.

Cleaning up mongodb after integration tests in node

I have an API written in Node with a MongoDB back end.
I'm using supertest to automate testing of the API. Of course, this results in a lot of changes to the database, and I'd like to get some input on options for managing this. The goal is for each test to have no permanent impact on the database: the database should look exactly the same after the test finishes as it did before the test ran.
In my case, I don't want the database to be dropped or fully emptied out between tests. I need some real data maintained in the database at all times. I just want the changes by the tests themselves to be reverted.
With a relational database, I would wrap each unit test in a transaction and roll it back after the test was done (pass or fail). As far as I know, this is not an option with Mongo.
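For contrast, the relational pattern being ruled out looks roughly like this with sqlite3 (the test body is hypothetical):

```js
// Minimal sketch of the relational approach: wrap each test in a transaction
// and roll it back regardless of outcome, so the DB is left untouched.
const sqlite3 = require('sqlite3');
const db = new sqlite3.Database('./test.db');

function runTestInTransaction(testFn, done) {
  db.serialize(() => {
    db.run('BEGIN TRANSACTION');
    testFn(db, (err) => {
      // Roll back whether the test passed or failed.
      db.run('ROLLBACK', () => done(err));
    });
  });
}
```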
Some options I have considered:
Fake databases
I've heard of in-memory fakes like fongo (which is a Java thing) and tingodb. I haven't used these, but the issue with this type of solution is always that it requires good parity with the actual product to remain viable: as soon as I use a Mongo feature that the fake doesn't support, I'll have a problem unit testing.
Manual cleanup
There is always the option of having a routine that finds all the data added by a test (marked in some way) and removes it. You'd have to be careful about updates and deletes here, and there is likely a lot of upkeep in making sure the cleanup routine accurately cleans things up.
Database copying
If it were fast enough, maybe having a baseline test database and making a copy of it before each test could work. It'd have to be pretty fast, though.
So how do people generally handle this?
I think this is a brand-new way of testing without transactions.
IMHO, using mongo >= 3.2 we can set up the inMemory storage engine, which is perfect for this kind of scenario:
1. Start mongo with the inMemory storage engine
2. Restore the database
3. Create a working copy for the test
4. Perform the test on the working copy
5. Drop the working copy
6. If there are more tests, GOTO 3
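A rough Node sketch of steps 3-6 with the official mongodb driver (the 'baseline' and 'work' database names and the test body are hypothetical):

```js
// Minimal sketch: clone a baseline DB into a working copy, run the test,
// then drop the copy. Assumes a mongod started with --storageEngine inMemory
// and the baseline data already restored into the 'baseline' database.
const { MongoClient } = require('mongodb');

async function withWorkingCopy(testFn) {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  try {
    const baseline = client.db('baseline');
    const work = client.db('work');

    // Step 3: create a working copy, collection by collection.
    for (const { name } of await baseline.listCollections().toArray()) {
      const docs = await baseline.collection(name).find().toArray();
      if (docs.length) await work.collection(name).insertMany(docs);
    }

    // Step 4: run the test against the working copy.
    await testFn(work);
  } finally {
    // Step 5: drop the working copy; the baseline stays pristine.
    await client.db('work').dropDatabase();
    await client.close();
  }
}

// Usage: await withWorkingCopy(async (db) => { /* assertions here */ });
```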

Checking iCloud for existing content

What is the best way to check iCloud for existing data?
I need to check whether the data already exists on the local device or in iCloud, so that I can download it when it isn't available locally.
Since you included the core-data tag I'm assuming you mean that you're using Core Data rather than iCloud file APIs or the ubiquitous key-value store.
With Core Data's built-in iCloud support, you check for existing data in exactly the same way as if you were not using iCloud. Once you create your Core Data stack, you check what data exists by doing normal Core Data fetches. There's no (exposed) concept of local vs. cloud data; there's just one data store that happens to know how to communicate with iCloud. You don't explicitly initiate downloads; they happen automatically.
At app launch time when you call addPersistentStoreWithType:configuration:URL:options:error:, Core Data internally initiates the download of any data that's not available locally yet. As a result this method may block for a while. If it returns successfully, then all current downloads can be assumed to be complete.
If new changes appear while your app is running, Core Data will download and import them and, when it's finished, will post NSPersistentStoreDidImportUbiquitousContentChangesNotification to tell you what just happened.
This all describes how Core Data's iCloud support is supposed to work. In practice you'll probably find that it doesn't always work as intended.
Thanks to @Tom Harrington for pointing out that this error has nothing to do with the developer's code; it's purely down to iCloud/Apple/connection issues.
More in this SO answer I found.

Resources