On server startup I load a few large JSON payloads from a third-party API and write them to .json files (150 MB JSON files), then load a file into an object whenever I need to use it.
The thing is, I am not sure this is the right and efficient way to do so. Should I use a database instead? If yes, could you mention which one to use?
Thanks.
Glad to answer your question.
Modern databases handle data of this size comfortably, so size alone would not be an issue here.
Performance, however, still depends on how the application uses the data and what it is for.
For example, an application might require content caching. Most databases already have this built in, but there are also applications where it does not apply.
This issue also discusses the comparison of disk storage and database storage. There are lots of good answers in there; I hope it helps.
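If you stay with files, one way to apply the caching idea yourself is to keep the parsed object in memory and only re-read the file when it changes. This is a minimal sketch, assuming the data fits in memory; `load_json_cached` and `_cache` are hypothetical names, not part of any library.

```python
import json
import os

# Hypothetical in-process cache: path -> (modification time, parsed object).
_cache = {}


def load_json_cached(path):
    """Parse the JSON file once and reuse the object until the file changes."""
    mtime = os.stat(path).st_mtime_ns
    hit = _cache.get(path)
    if hit is not None and hit[0] == mtime:
        return hit[1]  # file unchanged: skip reading and parsing again
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    _cache[path] = (mtime, data)
    return data
```

Repeated calls with the same path return the same object, so callers should treat it as read-only. If you need to query parts of the data rather than load all of it, that is the point where a database starts to pay off.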
Related
I am currently considering whether I should store media in an Apache Cassandra database. The use case is that the site will take uploads from users for insurance claims and will need to store the files so that they cannot be accessed without the correct permissions, while still being streamable. If I store them on a file system, I have to deal with redundancy, backups and so on using old file-system-based tech. I am not really interested in dealing with a CDN, because many of them are expensive, but also because permission to view the content depends on information in the app, such as which adjuster is assigned to the case. In addition, I want to stream the files rather than require download-then-view, which would be the default mode for requests against a CDN. If I put them in Cassandra, it will handle replication and storage, and I can stream the binary data out of the database to the user with integrated permissions. What I am concerned about is whether I will run into problems with Cassandra rows holding huge HD video files that are sometimes 1 to 2 hours long (testimony).
I am interested in the recommendations of Cassandra users concerning this issue. How would you solve the problem? Are there any lessons you have learned that I can benefit from? Would you suggest anything specific about the video tables if I go with Cassandra storage? Is there any CDN that will stream, not require download, allow me to plug in permissions and at the same time be open source?
Thanks a bunch.
Cassandra is definitely not designed as an object store and should not be used as one. I've worked on plenty of use cases where Cassandra served as the metadata store alongside an object store/CDN, and it complements them quite nicely.
Check out KillrVideo for inspiration: https://killrvideo.github.io/
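To make the metadata-store-plus-object-store split concrete, here is a rough Python sketch. It uses in-memory stand-ins only: the `metadata` dict plays the role of a Cassandra table, `open_object` plays the role of an object-store client, and every name (`VideoMetadata`, `stream_video`, `assigned_adjuster`) is hypothetical rather than taken from any real driver or from KillrVideo.

```python
import io
from dataclasses import dataclass


@dataclass
class VideoMetadata:
    claim_id: str
    assigned_adjuster: str  # app-level permission data lives with the metadata
    object_key: str         # where the bytes live in the object store


def _read_chunks(blob, chunk_size):
    """Yield the blob piece by piece so it is never fully held in memory."""
    with blob:
        while chunk := blob.read(chunk_size):
            yield chunk


def stream_video(video_id, user, metadata, open_object, chunk_size=64 * 1024):
    """Check permissions against the metadata, then stream from the object store."""
    meta = metadata[video_id]
    if user != meta.assigned_adjuster:
        raise PermissionError(f"{user} may not view {video_id}")
    return _read_chunks(open_object(meta.object_key), chunk_size)
```

The permission check runs before any bytes are read, and the large file never passes through the metadata store.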
This seems like a good key-value use case for Streaming LOB support in Oracle NoSQL Database. You might want to look at this: http://docs.oracle.com/cd/NOSQL/html/GettingStartedGuide/lobapi.html
What is the best/easiest way to store data offline? I have a website that I only run locally (it's just for personal use), so I am not using any PHP or SQL. I have a lot of posts, each containing a date, a time and a description that consists of a lot of text, and a few of them contain an audio file (there are very few audio files, so they may be stored separately from the rest). Now I want to make a website which can show these posts on request, but since I am not using either a server or a database, I'm not sure how to store them. Use of any kind of framework or library is allowed, as long as I can use it without an internet connection.
Thanks.
EDIT: JSON is a good way to read data without a server-side language, but I don't know whether it's possible to write to a file without a server-side language, or how. To summarize: I want a database (for both storing and accessing) without the need for a server.
An easy way, without setting up a web or database server, is to use JSON files, IMO. The syntax is very easy to learn!
Edit: If there is a better way to do this without DB setup / server-side languages, I'd like to hear it.
I'd like to just use .json files to store data, rather than using a database. Many simple sites have little data, and reading/writing to a file (that can be added to version control) seems adequate, and eliminates the need for database versioning / deployment logistics.
npm: node-store
Here's one way to do it, but I'd need to implement all kinds of query functionality myself.
I'm really unfamiliar with CouchDB. From the little I've read, it looks like it might use files to store the JSON data, or it might use some other kind of disk storage. Can someone shed some light on this?
Does CouchDB store its JSON in text-based files that can be added to version control (git)?
Does anyone know of another text-based storage system with some query functionality?
CouchDB is a full-fledged database. The value it gives you over simple file-based storage is additional indexing. That is, if you go file-based, you can either do only key-based lookups (the file name) or build your own secondary indexing methodology (symlinks or whatever). Now you're in the database-building business instead of the app-building business, which is silly because your entire premise seems to be simplicity and focusing on your app.
Also, keep in mind that when you have many (even just 2) people causing writes to your file(s), then you're going to run into either file system locking problems or users overwriting one another.
You're correct, though: if you only have a few pieces of information, then a single JSON file (basically a config file) is far easier than a database, especially if people are only reading from the file.
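To show what "key-based lookups plus your own secondary index" looks like in practice, here is a small Python sketch of a file-per-record store. `JsonFileStore` and its methods are hypothetical names, and it assumes keys are safe to use as file names.

```python
import json
import os
import tempfile
from collections import defaultdict


class JsonFileStore:
    """One JSON file per record; the file name is the primary key."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.root, f"{key}.json")

    def put(self, key, record):
        # Write to a temp file, then rename it into place, so a reader
        # never sees a half-written file.
        fd, tmp = tempfile.mkstemp(dir=self.root, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(record, f)
        os.replace(tmp, self._path(key))

    def get(self, key):
        with open(self._path(key), encoding="utf-8") as f:
            return json.load(f)

    def build_index(self, field):
        """Scan every file to map field value -> keys (the DIY secondary index)."""
        index = defaultdict(list)
        for name in sorted(os.listdir(self.root)):
            if name.endswith(".json"):
                key = name[: -len(".json")]
                index[self.get(key).get(field)].append(key)
        return dict(index)
```

The rename keeps individual files intact, but it does nothing about two writers overwriting each other's changes (last writer wins), and `build_index` rereads every file. Those are exactly the parts a database would otherwise handle for you.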
Also, keep in mind that there are Database-as-a-Service solutions that remove the need for DIY install/configure/maintenance/administration. One of them is Cloudant which is based on CouchDB, is API compatible, contributes back, etc. (I work at Cloudant).
Does anyone know of another text-based storage system with some query functionality?
You can use the ueberDB module with Dirty file storage.
As far as I remember, this storage just appends your data to the same text file over and over again, so if you really have a small dataset, it'll work just fine.
If your data grows too much, you can always change the storage backend while using the same module.
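The append-only idea is easy to illustrate. The Python sketch below is not Dirty's actual file format or API; `AppendOnlyStore` is a hypothetical name that only shows the technique: every write appends one JSON line, and on startup the file is replayed so the last entry for each key wins.

```python
import json
import os


class AppendOnlyStore:
    """Key-value store backed by a text file that is only ever appended to."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            # Replay the log: later lines overwrite earlier values for a key.
            with open(path, encoding="utf-8") as f:
                for line in f:
                    if line.strip():
                        entry = json.loads(line)
                        self.data[entry["key"]] = entry["val"]

    def set(self, key, val):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"key": key, "val": val}) + "\n")
        self.data[key] = val

    def get(self, key):
        return self.data.get(key)
```

Because old values are never removed, the file grows with every update, which is why this approach suits small datasets.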
I don't know much about Firefox OS, hence this question.
I have an Android app that ships with already prepared data saved in an SQLite database. At runtime the app copies that DB to the device storage and uses it for reading and writing data. This is much more efficient than creating an empty DB file and inserting data when the app first starts (e.g. from JSON).
I was wondering how I can achieve the same thing in Firefox OS. Is there any way I can create an IndexedDB database, fill it with data and then add it to the app package as an asset?
Unfortunately, this behavior is not yet supported. As Fabrice Desré mentioned in Bugzilla, some of the files needed to achieve this behaviour are specific to Gaia apps, which Gecko does not have access to at the moment.
For now, you will have to stick with the less efficient method (depending on the size of your DB, the difference isn't that big).
Hope I was able to help,
cheers
I'm just wondering if anyone who has experience with Azure Table Storage could comment on whether it is a good idea to use one table to store multiple types.
The reason I want to do this is so I can do transactions. However, I also want to get a sense in terms of development, would this approach be easy or messy to handle? So far, I'm using Azure Storage Explorer to assist development and viewing multiple types in one table has been messy.
To give an example, say I'm designing a community site of blogs. If I store all blog posts, categories and comments in one table, what problems would I encounter? On the other hand, if I don't, then how do I ensure some consistency between category and post, for example (assume one post can have only one category)?
Or are there any other different approaches people take to get around this problem using table storage?
Thank you.
If your goal is to have perfect consistency, then using a single table is a good way to go about it. However, I think that you are probably going to make things more difficult for yourself and get very little reward. The reason I say this is that table storage is extremely reliable. Transactions are great if you are dealing with very important data, but in most cases, such as a blog, I think you would be better off just 1) allowing for some very small percentage of inconsistent data and 2) handling failures in a more manual way.
The biggest issue you will have with storing multiple types in the same table is serialization. Most of the current table storage SDKs and utilities were designed to handle a single type. That being said, you can certainly handle multiple schemas, either manually (i.e. deserializing your object to a master object that contains all possible properties) or by interacting directly with the REST services (i.e. not going through the Azure SDK). If you used the REST services directly, you would have to handle serialization yourself, and thus you could handle the multiple types more efficiently, but the trade-off is that you are doing manually everything that is normally handled by the Azure SDK.
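One common way to handle several schemas manually is to store a type discriminator on each entity and dispatch on it when reading. The Python sketch below works on plain dicts and makes no SDK calls. `PartitionKey` and `RowKey` are the real Table Storage key properties, while `EntityType`, `to_entity` and `from_entity` are hypothetical names chosen for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass
class Category:
    name: str


@dataclass
class Post:
    title: str
    category: str


# Hypothetical registry mapping the discriminator value to a class.
ENTITY_TYPES = {"Category": Category, "Post": Post}
SYSTEM_KEYS = ("PartitionKey", "RowKey", "EntityType")


def to_entity(partition_key, row_key, obj):
    """Flatten an object into a table entity tagged with its type name."""
    entity = {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        "EntityType": type(obj).__name__,
    }
    entity.update(asdict(obj))
    return entity


def from_entity(entity):
    """Rebuild the right class by looking at the discriminator property."""
    cls = ENTITY_TYPES[entity["EntityType"]]
    fields = {k: v for k, v in entity.items() if k not in SYSTEM_KEYS}
    return cls(**fields)
```

Since batch transactions in Table Storage only cover entities that share a partition key, a post and its category would need the same `PartitionKey` (for example the blog's ID) to be written atomically.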
There really is no right or wrong way to do this. Both situations will work, it is just a matter of what is most practical. I personally tend to put a single schema per table unless there is a very good reason to do otherwise. I think you will find table storage to be reliable enough without the use of transactions.
You may want to check out the Windows Azure Toolkit. We have designed that toolkit to simplify some of the more common Azure tasks.