Is it okay to store user data on GitHub Gist?

I recently made a web app which saves the user's data on GitHub Gist to make it shareable. These Gists are quite large (depending on the user) and contain binary data (mp3 files).
Even though Gist allows automated uploads through its API, I'm unsure whether storing users' data this way is an intended use or an abuse of Gist.
Thanks in advance!

Related

Share files and data with external clients

We have a use case to share data and some associated files with external clients. The data is stored in a data lake (Snowflake). We are thinking of storing the related files in S3 or Azure Blob Storage. These files are supplementary information or additional attachments to the data.
The data is securely served from snowflake.
Is it possible to generate a secure link to the file and serve that along with the data for users to access?
Pre-signed or SAS URLs will not work because of security concerns.
Is it possible to generate links to files that are easy to open in a browser with B2B-style authentication? Do we need to build a custom function or app to achieve this, or are there other options? Has anyone worked through a similar use case before?
Moving from comments to an answer for closure:
There are upcoming features that might be exactly what you need. These features are in private preview now, and official documentation will come soon.
In the meantime check the presentation at https://events.snowflake.com/summit/agenda/session/546417?widget=true to learn more.
I'll update this answer when the features get released and publicly documented.

Static Files and Heroku

I've been gathering some ideas for a web app that will include functionality for merging PDFs based on user selections. There won't be a huge number of documents, but the problem is that the Node plugin I'm using to merge them doesn't seem to be able to pull them down from S3 for the process. I know that storing static files on Heroku is frowned upon, but if they're not something that the user changes, is it okay to store some of them there, or is there something I'm overlooking? The argument I've heard against storing anything on Heroku is that the filesystem is ephemeral, so the PDFs that the user generates would be deleted when the Dyno is restarted. That's no problem here, because once they create them, it's a one-and-done download. Am I going to run into any issues storing just 100-200 MB of PDFs on the Dyno, or is there some clever way I could bridge that gap?

Best image upload directory structure practices?

I have developed a large web application with NodeJS. I allow my users to upload multiple images to my Google Cloud Storage bucket.
Currently, I am storing all images under the same directory of /uploads/images/.
I'm beginning to think that this is not the safest way, and that it could affect performance later down the track when the directory has thousands of images. It also opens up a threat, since some images are meant to be private and users could find images by guessing a unique ID, such as uploads/images/29rnw92nr89fdhw.png.
Would I be best changing my structure to something like /uploads/{user-id}/images/ instead? That way each directory only has a couple dozen images. Then again, can a directory handle thousands of subdirectories without suffering performance issues? Does Google Cloud Storage happen to accommodate issues like this?
GCS does not actually have "directories." They're an illusion that the UI and command-line tools provide as a nicety. As such, you can put billions of objects inside the same "directory" without running into any problems.
One addendum there: if you are inserting more than a thousand objects per second, there are some additional caveats worth being aware of. In such a case, you would see a performance benefit to avoiding sequential object names. In other words, uploading /uploads/user-id/images/000000.jpg through /uploads/user-id/images/999999.jpg, in order, in rapid succession, would likely be slower than if you used random object names. GCS has documentation with more on this, but this should not be a concern unless you are uploading in excess of 1000 objects per second.
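As a rough illustration of non-sequential naming, here is a minimal sketch that builds an object name from a random UUID. The function name `random_object_name` and the `uploads/{user_id}/images/` layout are assumptions made for this example, not anything GCS requires.

```python
import uuid


def random_object_name(user_id: str, extension: str) -> str:
    """Build an object name whose final component is random, not sequential.

    The path layout is hypothetical; only the random component matters here.
    """
    # uuid4() is generated from random bytes, so consecutive uploads
    # do not produce consecutive names.
    return f"uploads/{user_id}/images/{uuid.uuid4().hex}.{extension}"


if __name__ == "__main__":
    print(random_object_name("user-42", "png"))
```

The same random component also makes names hard to guess, which is relevant to the privacy concern raised in the question.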
A nice, long GUID should be effectively unguessable (or at least no more guessable than a password or an access token), but it has the downside of being non-revocable without renaming the image. Once someone knows it, they know it forever and can leak it to others. If you need firm control of your objects, you could keep them all private and visible only to your project, and allow users to access them only via signed URLs. This offers you the most flexibility and control, but it's also harder to implement.
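To make the idea behind signed URLs concrete, here is a minimal sketch of an expiring, HMAC-signed link. This is not GCS's actual signing scheme; it only illustrates the general principle that the server hands out a link that proves it was issued by the server and stops working after a deadline. `SECRET_KEY`, `sign_url`, and `verify_url` are hypothetical names. In practice you would most likely rely on the storage client library's own signed-URL support rather than rolling your own.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical server-side secret; it must never be sent to clients.
SECRET_KEY = b"replace-with-a-server-side-secret"


def _signature(path: str, expires_at: int, key: bytes) -> str:
    # The signature covers both the path and the expiry time,
    # so neither can be changed without invalidating the link.
    payload = f"{path}\n{expires_at}".encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def sign_url(path: str, expires_at: int, key: bytes = SECRET_KEY) -> str:
    """Return the path with an expiry timestamp and signature appended."""
    query = urlencode({"expires": expires_at, "sig": _signature(path, expires_at, key)})
    return f"{path}?{query}"


def verify_url(url: str, now: Optional[int] = None, key: bytes = SECRET_KEY) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["sig"][0]
    except (KeyError, IndexError, ValueError):
        return False
    expected = _signature(parsed.path, expires_at, key)
    current = int(time.time()) if now is None else now
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(signature.encode(), expected.encode()) and current <= expires_at


if __name__ == "__main__":
    link = sign_url("/uploads/images/29rnw92nr89fdhw.png", int(time.time()) + 900)
    print(link, verify_url(link))
```

Unlike a bare GUID in the object name, a link like this expires on its own, so a leaked URL is only useful for a limited time.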

Chrome Extension Database Storage

I am working on a page action extension and would like to store information that all users of the extension can access. The information will be key:value pairs, where the key is a web url and the value is an array of links.
I have to be able to update the database without redeploying the extension to the chrome store. What is it that I should look into using? The storage APIs seem oriented towards user data rather than data stored by the app and updated by the developer.
If you want something to be updated without deploying an updated version through CWS, you'll need to host the data yourself somewhere and have the extension query it.
Using chrome.storage.local as a cache for said data would be totally appropriate.
The question is pretty broad, so I'll give you some ideas I've used before.
Since you say you don't want to republish when the DB changes, you need to host the data for the DB yourself. This doesn't mean you need to host an actual DB, just a way for the extension to get the data.
Ideally, you are only adding new pairs. If so, an easy way is to store your pairs in a public Google spreadsheet. The extension then remembers the last row synced and uses the row feed to get data incrementally.
There are a few tricks to getting the spreadsheet sync right. Take a look at my GitHub project "Plus for Trello" for a full implementation.
This is a good way to sync incrementally, though if the DB isn't huge you could just host a CSV file and fetch it periodically from the extension.
Once you can get the data into the extension, decide how to store it. chrome.storage.local or IndexedDB should be fine, though IndexedDB is usually better if you later need to query something more complex than a hash table.
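The "remember the last row synced" idea can be sketched independently of any particular API. The snippet below is written in Python purely to show the logic; a real extension would implement it in JavaScript, with the cache kept in chrome.storage.local or IndexedDB. `sync_pairs`, `fetch_rows_from`, and the cache layout are assumptions made for this sketch, and it assumes rows are only ever appended.

```python
from typing import Callable, Dict, List, Tuple

# One row of the remote data: a page URL and its array of links.
Row = Tuple[str, List[str]]


def sync_pairs(cache: Dict, fetch_rows_from: Callable[[int], List[Row]]) -> Dict:
    """Fetch only rows added since the last sync and merge them into the cache.

    `cache` has the hypothetical shape {"last_row": int, "pairs": {url: [links]}}.
    `fetch_rows_from(n)` stands in for the network call that returns every
    row after the first n rows.
    """
    last_row = cache.get("last_row", 0)
    pairs = dict(cache.get("pairs", {}))
    new_rows = fetch_rows_from(last_row)
    for url, links in new_rows:
        pairs[url] = list(links)
    # Remember how far we got so the next sync starts from there.
    return {"last_row": last_row + len(new_rows), "pairs": pairs}


if __name__ == "__main__":
    # An in-memory list standing in for the hosted spreadsheet.
    sheet: List[Row] = [("https://a.example", ["https://x.example"])]
    cache = sync_pairs({}, lambda start: sheet[start:])
    sheet.append(("https://b.example", ["https://y.example"]))
    cache = sync_pairs(cache, lambda start: sheet[start:])
    print(cache)
```

If existing rows can be edited or deleted, this approach no longer works as is, and periodically refetching the whole file is the simpler choice.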

how can a program keep a secret from its creator?

The idea is that I want a program that can edit a file yet I, the programmer, cannot edit or forge the file. Encrypting the file is an obvious choice, but even then, I'll still have to keep the encryption key secret from myself somehow.
Obscuring the secret doesn't seem to work, because I could just use the de-obscuring part of the code that I would need for the program.
I'm asking this because I'm trying to make a program that will keep me productive by monitoring my activities and telling my friends/boss/family just how terrible a procrastinator I am if I don't live up to the goals I set the previous day. (In other words: present me can force future me not to procrastinate.)
It seems the content of the program doesn't matter that much, but you want to ensure that the timestamp and content of the log can't be forged. I suggest writing the log to some external site where you can add data but not delete it.
Writing false values to the log can only be prevented by having a log which progresses by time. For example, if you hide expenses from your bank account you'll run into problems because future balances will be lower than expected.
For short pieces of information like your account balance, just write it to some public site like Twitter. AFAIK it's not possible to post Twitter messages as if they had been sent some time earlier.
For more complex data, like the progress of a software development project, push your changes with a version control system like git to a remote repo where you can't delete or overwrite history.
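One way to picture a log that "progresses by time" is a hash chain, where every entry includes the hash of the entry before it. This is only an illustration of the principle, not something the suggestions above depend on, and the names `append_entry` and `verify_log` are made up for this sketch. It is tamper-evident rather than tamper-proof: it only helps if the latest hash is regularly published somewhere you cannot edit, such as the external site mentioned above.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: Dict) -> str:
    # sort_keys makes the serialization, and therefore the hash, deterministic.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log: List[Dict], timestamp: int, message: str) -> Dict:
    """Append an entry that commits to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {"timestamp": timestamp, "message": message, "prev_hash": prev_hash}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_log(log: List[Dict]) -> bool:
    """Recompute every hash; an edited past entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {key: entry[key] for key in ("timestamp", "message", "prev_hash")}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log: List[Dict] = []
    append_entry(log, 1, "wrote 500 words")
    append_entry(log, 2, "procrastinated all afternoon")
    print(verify_log(log))
    log[0]["message"] = "wrote 5000 words"  # try to rewrite history
    print(verify_log(log))
```

Someone with full control of the machine can still rebuild the whole chain from scratch, which is why the published hash has to live somewhere outside their control.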
Update: As you explained in the comments, you want to log distinct data on your computer, which could be forged into anything. IMHO it's virtually impossible to write a program yourself that runs on your own computer, where you have root, but that you cannot control. The only kind of software somewhat similar to your request is DRM software that calls home to prevent software "piracy". You would need a binary program written by somebody else, or one whose source code has been deleted. It would need some kind of encrypted and obfuscated network communication that you can't understand.
I think there is not much hope for you with this approach. Better to learn to control yourself and not answer random questions from strangers on Stack Overflow, ahem.
