Allowing users to upload CSV data into a Parse app - node.js

I need to let my users upload CSV data into my app, such as contacts or products. There are a number of web-based client libraries that can handle the client-side logic. What I am looking for is a fast, reliable way to get the data into a Parse class.
I have not written any code yet; right now I am trying to work out the best process for this. I have played with Parse batch save and know it is not reliable for thousands of inserts. My thought is to upload the CSV, store it in a Parse class "uploads", and then have a background job lift out, say, 100 or 1,000 records at a time and insert them, then send a notification when it is done.
Is this the best option, or has anybody found a simpler, faster solution?
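For reference, here is a rough Cloud Code sketch of the flow described above. The "uploads" class name comes from the question; the job name, the target Contact class, the column names, the batch size, and the processed flag are assumptions made purely for illustration.

```javascript
// Sketch of the staged-upload + background-job idea from the question.
// "uploads" is the staging class from the question; "Contact", the column
// names, the batch size and the "processed" flag are hypothetical.
Parse.Cloud.job('processCsvUploads', async () => {
  const batchSize = 1000;

  // Pull the next batch of staged CSV rows that have not been processed yet.
  const query = new Parse.Query('uploads');
  query.equalTo('processed', false);
  query.limit(batchSize);
  const rows = await query.find({ useMasterKey: true });

  // Turn each staged row into the real object and save the batch in one call.
  const contacts = rows.map((row) => {
    const contact = new Parse.Object('Contact');
    contact.set('name', row.get('name'));
    contact.set('email', row.get('email'));
    return contact;
  });
  await Parse.Object.saveAll(contacts, { useMasterKey: true });

  // Mark the staged rows as done so the next run picks up the next batch.
  rows.forEach((row) => row.set('processed', true));
  await Parse.Object.saveAll(rows, { useMasterKey: true });
});
```

Running the job on a schedule and sending the notification once the query returns no more unprocessed rows would complete the flow.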

Related

Bulk Data Transfer through REST API

I have been informed that "REST API is not made / good for Bulk Data Transfer. Its a proven fact". I tried searching Google for this, but was unable to find any fruitful answer. Can anyone tell me whether this statement is actually true or not? If it is true, then why?
Note: I am not exposing bulk data (50 million rows from a database) over the web. I am saving it to the server in JSON format (approx. 3 GB file size) and transferring it to another system. I am using Node.js for this purpose. The network is not an issue for transferring the file.
There is nothing wrong with exposing an endpoint that returns huge data.
The concern is more about how you send that data, since memory could be an issue.
Why not consider streaming the data? That way the memory needed at any moment is only the packet of data currently being streamed.
Node.js has many ways to pipe data into the response object; you can also consider the JSONStream module from npmjs.org.
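As a minimal sketch of that idea, assuming the JSONStream package and an object-mode readable stream as the data source (the getRowStream() helper below is hypothetical, standing in for a database cursor or similar):

```javascript
// Sketch: stream a huge JSON payload instead of buffering it all in memory.
// Assumes `npm install JSONStream`; getRowStream() is a hypothetical
// object-mode readable stream standing in for the real data source.
const http = require('http');
const JSONStream = require('JSONStream');

http.createServer((req, res) => {
  res.setHeader('Content-Type', 'application/json');

  getRowStream()                     // e.g. a database cursor exposed as a stream
    .pipe(JSONStream.stringify())    // wraps the rows in `[ ..., ... ]` on the fly
    .pipe(res);                      // only one chunk is held in memory at a time
}).listen(3000);
```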

Node Streams to MySQL

I have to parse large CSVs (approx. 1 GB), map the headers to the database columns, and format every row. E.g. the CSV has "Gender" as "Male", but my database only accepts enum('M', 'F', 'U').
Since the files are so large, I have to use Node streams to transform the file and then use LOAD DATA INFILE to upload it all at once.
I would like granular control over the inserts, which LOAD DATA INFILE doesn't provide: if a single line has incorrect data, the whole upload fails. I am currently using mysqljs, which doesn't provide an API to check whether the pool has reached its queueLimit, so I can't pause the stream reliably.
I am wondering if I can use Apache Kafka or Spark to stream the instructions so they get added to the database sequentially. I have skimmed the docs and read some tutorials, but none of them show how to connect them to a database; they are mostly consumer/producer examples.
I know there are multiple ways of solving this problem, but I am very interested in a way to seamlessly integrate streams with databases. If streams can work with I/O, why not databases? I am pretty sure big companies don't use LOAD DATA INFILE or repeatedly add chunks of data to an array and insert them into the database.
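One possible sketch of a stream-to-database pipeline with manual backpressure, assuming the csv-parser and mysql (mysqljs) packages; the file name, table, columns, and gender mapping below are made up for illustration:

```javascript
// Sketch: stream a CSV into MySQL row by row, pausing while each insert is
// in flight so the pool's queue cannot grow unbounded.
// Assumes `npm install mysql csv-parser`; names are hypothetical.
const fs = require('fs');
const csv = require('csv-parser');
const mysql = require('mysql');

const pool = mysql.createPool({
  host: 'localhost',
  user: 'app',
  password: 'secret',
  database: 'app',
  connectionLimit: 10,
});

const genderMap = { Male: 'M', Female: 'F' };

const parser = fs.createReadStream('people.csv').pipe(csv());

parser.on('data', (row) => {
  parser.pause(); // hold the stream until this row's insert has finished

  const record = {
    name: row.Name,
    gender: genderMap[row.Gender] || 'U', // map free-text values onto the enum
  };

  pool.query('INSERT INTO people SET ?', record, (err) => {
    if (err) console.error('Skipping bad row:', err.message); // per-row handling instead of failing the whole load
    parser.resume();
  });
});

parser.on('end', () => pool.end());
```

This gives the per-row control that LOAD DATA INFILE lacks, at the cost of one round trip per row; batching a few hundred rows per INSERT is a common middle ground.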

Using GTFS data, how should I extend it with realtime GTFS?

I am building an application using GTFS data. I am a bit confused when it comes to GTFS-realtime.
I have stored all the GTFS information in a database (MongoDB), and I am able to retrieve the stop times for a specific bus stop.
So now I want to integrate GTFS-realtime information into it. What will be the best way to deal with the retrieved information? I am using gtfs-realtime-bindings (a Node.js library) by Google.
I have the following idea:
Store the realtime-GTFS information in a separate database and query it after getting the stoptime from GTFS. And I can update the database periodically to make sure the real time info is up to date.
Also, I know the retrieved data is in protocol-buffer binary format. Should I store it as ASCII, or is there a better way to deal with it?
I couldn't find much information about how to deal with the realtime data, so I hope someone can give me a direction on what to do next.
Thanks!
In your case GTFS-Realtime can be used as "ephemeral" data, and I would go with an object in memory, with the stop_id/route_id as keys.
For every request:
Check whether the realtime object contains the ID; if it does, present the realtime data, otherwise load it from the database.
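A hedged sketch of that pattern in Node.js; fetchRealtimeFeed() and loadStopTimesFromDb() are hypothetical helpers standing in for the gtfs-realtime-bindings decode step and the existing MongoDB query:

```javascript
// Sketch of the in-memory "ephemeral" realtime layer described above.
// fetchRealtimeFeed() and loadStopTimesFromDb() are hypothetical helpers.
const realtimeByStopId = new Map();

async function refreshRealtime() {
  const updates = await fetchRealtimeFeed(); // decode the protobuf feed elsewhere
  realtimeByStopId.clear();
  for (const update of updates) {
    realtimeByStopId.set(update.stopId, update); // key by stop_id (or route_id)
  }
}

// Refresh periodically so the in-memory object stays current.
setInterval(() => refreshRealtime().catch(console.error), 30 * 1000);

async function getStopTimes(stopId) {
  // Prefer realtime data when we have it; otherwise fall back to the static GTFS tables.
  if (realtimeByStopId.has(stopId)) {
    return realtimeByStopId.get(stopId);
  }
  return loadStopTimesFromDb(stopId);
}
```

Since the realtime feed is replaced on every refresh, there is no need to persist the decoded protobuf at all, which also answers the ASCII-vs-binary storage question.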

How to create an appropriate database model for IM

Recently we have been developing an IM feature for our app, and we want to save the chat records with Core Data. The strategy we came up with is:
Every account has a separate SQLite file.
Every chat has a separate table (created dynamically; refer to this article), although the table structure is always the same, for example:
sender_id
msg_id
content
msg_send_time
...
If we put all the chat messages in one table, we would fetch the records by "fromid and toid" to get the records of a specific dialog. However, with thousands upon thousands of messages in that table, we worry the fetch request would be very slow, so we create a specific table for each dialog.
So, is there any better solution for this problem?
Creating "tables" for conversations dynamically is a very bad idea. This will create so much overhead that it will make your code completely inefficient.
Instead, use a single entity (not a table, mind you; Core Data is not a database) to capture the messages and filter by user IDs.
This will perform without a glitch with hundreds of thousands of messages, far more than should be stored or displayed on a mobile device.

What's the best method to fetch huge files from the web server using C#?

Hi, I have a spec to fetch files from the server and identify the unused files in a directory. In this situation the fetch from the server can return huge files, and the problem is that CPU usage increases while I am fetching the large files, so I would like to avoid that. Does anyone know how to avoid this situation? Please share, as it would be very helpful for me.
Thanks
You can split your large file on the server into several smaller pieces, fetch some metadata about the number of pieces, their sizes, etc., then fetch them one by one from your client C# code and join the pieces in binary mode back into the larger file.
