I am using the Azure blob metadata mechanism mentioned here to save some information in the blob store and to retrieve it later.
My questions are mainly related to performance and maintenance concerns.
Is there any upper limit on the size of this metadata? What is the maximum number of keys I can store?
Does it expire after a certain date?
Is there any chance of losing data that is stored in the blob metadata?
If there is a chance of losing it, I would go ahead and write this information to a database from the service I am writing. Ideally, however, I would prefer to use the blob metadata feature, which is very useful and well thought out.
Check out this documentation:
https://learn.microsoft.com/en-us/rest/api/storageservices/fileservices/Setting-and-Retrieving-Properties-and-Metadata-for-Blob-Resources?redirectedfrom=MSDN
The size of the metadata cannot exceed 8 KB altogether. This means keys, values, semicolons, everything. There is no explicit limitation for the number of keys themselves, but all of them (with the actual values and other characters) must fit into the 8 KB limit.
As for the expiration, I don't think so. At least the documentation doesn't mention it. I guess if expiration was an issue, it would be important enough to be mentioned in the documentation :)
As for losing the metadata: metadata is stored alongside the blob, so if you lose the blob you lose the metadata (say, the datacenter explodes and you didn't have the appropriate replication for your account). Other than that, I don't think it can just disappear. The documentation also states that partial updates are not possible, so metadata is either updated fully or not at all; you can't lose half of an update.
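For reference, setting and reading blob metadata from Node.js looks roughly like this (a minimal sketch assuming the @azure/storage-blob package; the connection string, container name, blob name, and metadata keys are placeholders, since the question doesn't mention a specific client library):

```javascript
// Sketch: write and read blob metadata with @azure/storage-blob.
const { BlobServiceClient } = require("@azure/storage-blob");

async function main() {
  const serviceClient = BlobServiceClient.fromConnectionString(
    process.env.AZURE_STORAGE_CONNECTION_STRING
  );
  const blobClient = serviceClient
    .getContainerClient("my-container")
    .getBlobClient("my-blob.txt");

  // Metadata to attach; keys and values together must stay under the 8 KB limit.
  const metadata = { category: "invoice", uploadedBy: "service-a" };

  // Rough size check before writing (keys + values).
  const approxBytes = Object.entries(metadata).reduce(
    (sum, [k, v]) => sum + Buffer.byteLength(k) + Buffer.byteLength(v),
    0
  );
  if (approxBytes > 8 * 1024) {
    throw new Error("Metadata exceeds the 8 KB limit");
  }

  // setMetadata replaces the blob's metadata in full (no partial updates).
  await blobClient.setMetadata(metadata);

  // Read it back via the blob's properties.
  const props = await blobClient.getProperties();
  console.log(props.metadata);
}

main().catch(console.error);
```

Note that setMetadata replaces the whole metadata set, which matches the "no partial updates" behaviour described above.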
Related
If sensitive data were to enter a BigQuery table, is it possible to permanently delete the automatic backups used by the time travel feature before the retention period (default of 7 days, configurable down to a minimum of 2 days) elapses?
That would make it impossible to roll back and recover a snapshot of the table containing the sensitive data, allowing a complete and irreversible purge of the data from the project.
I haven't yet seen anything in the BigQuery docs to suggest this is possible, or any guidance on how to handle a situation like this, but it seems like a big caveat to handling sensitive data in BigQuery.
If this is not possible, what other options are there to restrict access to the historical data in BigQuery? Is time travel a permission that can be withdrawn via a custom role?
I understand the limitations of QUOTA_BYTES_PER_ITEM and QUOTA_BYTES when using chrome.storage.sync. I'm finding them quite limiting for an annotated-history extension I am writing. I understand that local storage would avoid this problem, but I need users to be able to keep their data when they move to other devices or someday replace their machine.
My question is: are there other storage methods to get around this? What about Google Keep? It is an extension too, but it appears capable of "unlimited" storage of notes, or at least far more than the limits of chrome.storage.sync. Is it simply not playing by the same rules, or are there other methods I could be using?
Currently I'm concatenating information into large strings in JavaScript, storing these using chrome.storage.sync, and then parsing that information later as my database (a rough sketch of this is below).
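To make that concrete, my current approach looks something like the following (key names such as hist_0 and hist_meta are just illustrative; the serialized data is split into chunks so each stored item stays under QUOTA_BYTES_PER_ITEM):

```javascript
// Sketch: chunk one large string across multiple chrome.storage.sync items (MV3 promise style).
// QUOTA_BYTES_PER_ITEM counts the key length plus the JSON-serialized value, so leave headroom.
const CHUNK_SIZE = 6000; // characters, comfortably under the ~8 KB per-item quota

function saveHistory(bigString) {
  const items = {};
  let count = 0;
  for (let i = 0; i < bigString.length; i += CHUNK_SIZE, count++) {
    items["hist_" + count] = bigString.slice(i, i + CHUNK_SIZE);
  }
  items["hist_meta"] = { count };
  // Total size across all items is still capped by QUOTA_BYTES (~100 KB) for sync storage.
  return chrome.storage.sync.set(items);
}

async function loadHistory() {
  const { hist_meta } = await chrome.storage.sync.get("hist_meta");
  if (!hist_meta) return "";
  const keys = Array.from({ length: hist_meta.count }, (_, i) => "hist_" + i);
  const stored = await chrome.storage.sync.get(keys);
  return keys.map((k) => stored[k] || "").join("");
}
```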
Thanks for any help!
This is mostly a GDPR question.
When data is soft-deleted using Azure Search's SoftDeleteColumnDeletionDetectionPolicy, is all of the original document data still kept, or is only enough info (document ID and the IsDeleted bit) kept to know that the document has been deleted?
Looking around at the Azure documentation, this isn't clear. It's clear that the policy is intended to be used for soft deletes on the data source side, but it's not clear whether Azure Search also treats this strictly as a soft delete and therefore keeps all the data, just marking the document as deleted via the soft-delete bit.
When an Azure Search indexer processes a document marked as deleted, the document is removed from the search index (i.e., "hard" deleted).
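For context, the policy only tells the indexer which column and value mark a row as deleted; it lives on the data source definition. A rough sketch of what that looks like via the REST API (the service name, admin key, connection string, and column names below are placeholders):

```javascript
// Sketch: create/update a data source with a soft-delete detection policy via the REST API.
const body = {
  name: "my-datasource",
  type: "azuresql",
  credentials: { connectionString: "<connection-string>" },
  container: { name: "Documents" },
  dataDeletionDetectionPolicy: {
    "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
    softDeleteColumnName: "IsDeleted",
    softDeleteMarkerValue: "true"
  }
};

fetch("https://<service>.search.windows.net/datasources/my-datasource?api-version=2020-06-30", {
  method: "PUT",
  headers: {
    "Content-Type": "application/json",
    "api-key": "<admin-key>"
  },
  body: JSON.stringify(body)
}).then((res) => console.log(res.status));
```

Nothing in that definition retains document content on the search side; it only drives the indexer's decision to delete the document from the index.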
I have developed a large web application with NodeJS. I allow my users to upload multiple images to my Google Cloud Storage bucket.
Currently, I am storing all images under the same directory of /uploads/images/.
I'm beginning to think that this is not the safest way, and it could affect performance later down the track when the directory has thousands of images. It also opens up a threat, since some images are meant to be private, and it could allow users to find images by guessing a unique ID, such as uploads/images/29rnw92nr89fdhw.png.
Would I be better off changing my structure to something like /uploads/{user-id}/images/ instead? That way each directory only has a couple dozen images. Also, can a directory handle thousands of subdirectories without suffering performance issues? Does Google Cloud Storage happen to accommodate issues like this?
GCS does not actually have "directories." They're an illusion that the UI and command-line tools provide as a nicety. As such, you can put billions of objects inside the same "directory" without running into any problems.
One addendum there: if you are inserting more than a thousand objects per second, there are some additional caveats worth being aware of. In such a case, you would see a performance benefit to avoiding sequential object names. In other words, uploading /uploads/user-id/images/000000.jpg through /uploads/user-id/images/999999.jpg, in order, in rapid succession, would likely be slower than if you used random object names. GCS has documentation with more on this, but this should not be a concern unless you are uploading in excess of 1000 objects per second.
A nice, long GUID should be effectively unguessable (or at least no more guessable than a password or an access token), but it does have the downside of being non-revocable without renaming the image. Once someone knows it, they know it forever and can leak it to others. If you need firm control of your objects, you could keep them all private and visible only to your project and allow users to access them only via signed URLs. This offers you the most flexibility and control, but it's also harder to implement.
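A minimal sketch of that signed-URL approach with the Node.js client (assuming the @google-cloud/storage package; the bucket name, object paths, and expiry are placeholders):

```javascript
// Sketch: keep objects private and hand out short-lived signed read URLs.
const { Storage } = require("@google-cloud/storage");
const crypto = require("crypto");

const storage = new Storage();
const bucket = storage.bucket("my-uploads-bucket");

// Non-sequential object name (only matters if you upload >1000 objects/sec).
function randomObjectName(userId, ext) {
  return `uploads/${userId}/images/${crypto.randomUUID()}.${ext}`;
}

async function getSignedReadUrl(objectName) {
  const [url] = await bucket.file(objectName).getSignedUrl({
    version: "v4",
    action: "read",
    expires: Date.now() + 15 * 60 * 1000, // 15 minutes
  });
  return url;
}

// Example (the object name would normally come from your own database of uploads):
getSignedReadUrl("uploads/user-123/images/29rnw92nr89fdhw.png").then(console.log);
```

With this pattern, only users your app has already authorized ever receive a URL, and a leaked URL stops working once it expires.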
I have a SharePoint 2007 database that is 16 GB in size, and I want to know why and how I can reduce the size. Ideally I would like a trimmed replica to use on a developer workstation, one that retains a good sample data set and can be refreshed.
Can you please tell me if there are any third party tools or other methods to accomplish this? I have found the Microsoft tool (stsadm) to be very limited in this regard.
Many thanks.
You can start with the Storage Space Allocation page, available in every site collection: http://server/_layouts/storman.aspx
That can tell you which lists are big, etc.
Trashcans are also good candidates for trimming a database.
I regularly make backups of every site collection and just inspect the ones that get too big. It's always something; large PPTs or loads of images, etc.
Ultimately, SQL Server will not just automatically shrink your database, so if you delete content the file size on disk will not decrease; shrinking is a SQL Server admin task.
16 GB is not that big, really. You can just back up and restore it in your dev environment (see the stsadm sketch below) and then delete some unneeded sites or site collections out of it to make it smaller.
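For reference, the site collection backup and restore can be done with stsadm itself (the URLs and file names below are placeholders):

```
stsadm -o backup -url http://server/sites/teamsite -filename teamsite.bak
stsadm -o restore -url http://devserver/sites/teamsite -filename teamsite.bak -overwrite
```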