I'm using Cosmos DB to store my data. The ID is a GUID. There is a new requirement to display a consecutive number. How can I achieve that in a document-based DB? I want to keep the GUID as "id" and have another unique field "display_id". The application runs in an App Service. I do not want to run a SQL Server with a "serial table".
There's no such facility built in to provide you with increasing numbers. That will be up to you to manage in your app layer (or potentially in a Cosmos DB stored procedure).
Also: there is nothing specific to document databases about increasing numbers. It's simply something that isn't offered via the database engine.
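One common app-layer approach is a dedicated counter document updated with optimistic concurrency (Cosmos DB's ETag / If-Match conditional replace), retrying when a concurrent writer wins. Here is a minimal in-memory sketch of that retry loop; `CounterStore` is a stand-in for the real Cosmos container, and all names here are hypothetical:

```python
import itertools

class EtagMismatch(Exception):
    """Raised when the stored ETag no longer matches (another writer updated first)."""

class CounterStore:
    # Stand-in for a Cosmos DB container holding a single counter document.
    def __init__(self):
        self._doc = {"id": "display-id-counter", "value": 0}
        self._etags = itertools.count()
        self._etag = next(self._etags)

    def read(self):
        # Returns a copy of the document plus its current ETag.
        return dict(self._doc), self._etag

    def replace(self, doc, if_match):
        # Mirrors a conditional replace with an If-Match header.
        if if_match != self._etag:
            raise EtagMismatch()
        self._doc = dict(doc)
        self._etag = next(self._etags)

def next_display_id(store, max_retries=5):
    """Reserve the next consecutive number, retrying on concurrent updates."""
    for _ in range(max_retries):
        doc, etag = store.read()
        doc["value"] += 1
        try:
            store.replace(doc, if_match=etag)
            return doc["value"]
        except EtagMismatch:
            continue  # another writer won the race; re-read and retry
    raise RuntimeError("could not reserve a display_id")
```

Against the real service, the same loop would use the SDK's read-item and conditional-replace calls instead of the in-memory store; the shape of the retry logic stays the same.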
Is there any way to purge/mask data in a Log Analytics workspace, with regular expressions or similar, to remove sensitive data that has been sent to the workspace?
For example, social security numbers that are part of a URL?
As per this Microsoft document, Log Analytics is a flexible store which, while prescribing a schema to your data, allows you to override every field with custom values. You can mask data in the Log Analytics workspace; here are a few strategies for handling personal data:
Where possible, stop the collection of, obfuscate, anonymize, or otherwise adjust the data being collected to exclude it from being considered "private". This is by far the preferred approach, saving you the need to create a very costly and impactful data handling strategy.
Where not possible, attempt to normalize the data to reduce the impact on the data platform and performance. For example, instead of logging an explicit user ID, create a lookup table that correlates the username and their details to an internal ID that can then be logged elsewhere. That way, should one of your users ask you to delete their personal information, deleting the corresponding row in the lookup table may be sufficient.
Finally, if private data must be collected, build a process around the purge API path and the existing query API path to meet any obligations you may have around exporting and deleting any private data associated with a user.
Here is a KQL query for finding private data (in this case, IPv4 addresses) in Log Analytics:
search *
| where * matches regex #'\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' //RegEx originally provided on https://stackoverflow.com/questions/5284147/validating-ipv4-addresses-with-regexp
| summarize count() by $table
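If you want to sanity-check that regex outside of Log Analytics first, the same pattern can be exercised in Python (the KQL and Python regex dialects are close enough here). Note that because each octet must be followed by `\.` or `$`, the pattern only matches when the address ends the string being tested:

```python
import re

# Same IPv4 pattern used in the KQL query above.
IPV4_RE = re.compile(r'\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b')

def contains_ipv4(value: str) -> bool:
    """True when the value ends with something shaped like an IPv4 address."""
    return IPV4_RE.search(value) is not None
```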
I have a customer that owns a carpet cleaning business, and we have all of his franchisees' data in a multi-tenant database model. We would like to move this data into a data warehouse in Snowflake. I don't want to build a separate database for each customer, because then I'd have to keep each database up to date with the latest data model; I want to use one data model to rule them all. I keep a tenant ID with each record to identify the franchisee's data.

I want to give a set of credentials to each franchisee so they can hook up their analytics tool of choice (Tableau, Power BI, etc.) and only get access to the rows that are applicable to them. Is there a way to secure the rows they see in each table based on their user, i.e. some sort of row-level access control similar to roles in Postgres?

Are there any better methods for handling this type of scenario? Ultimately I want to maintain and manage the smallest number of ELT jobs and data models.
This is the purpose of either Secure Views or Reader Accounts.
We use both, and they have about the same technical hassle/setup cost. But we use an internal tool to build/alter the schemas.
To expand on Simeon's answer:
You could have a single Snowflake account and create a Snowflake role & user for each franchisee. These roles would have access to a Secure View which uses the CURRENT_ROLE / CURRENT_USER context functions as in this example from the Snowflake documentation.
You'd have to have a role -> tenant ID "mapping table" which is used in the Secure View to limit the rows to the correct franchisee.
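The secure-view logic is easy to prototype before writing any DDL. Here is a rough Python simulation of what the view's join against the mapping table does; the role names and tenant IDs are made up for illustration:

```python
# Plays the part of the role -> tenant ID "mapping table".
ROLE_TO_TENANT = {
    "FRANCHISEE_A_ROLE": 101,
    "FRANCHISEE_B_ROLE": 102,
}

def visible_rows(rows, current_role):
    """Mimics a secure view that joins the base table to the mapping
    table on CURRENT_ROLE() and keeps only the caller's tenant's rows."""
    tenant = ROLE_TO_TENANT.get(current_role)
    return [r for r in rows if r["tenant_id"] == tenant]
```

In the actual Snowflake secure view, the `current_role` argument is replaced by the `CURRENT_ROLE()` context function, so each franchisee's role sees only its own tenant's rows regardless of which BI tool connects.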
We're in the design phase for building an audit trail in an existing web application. The application runs on Windows Azure and uses a SQL Azure database.
The audit logs must be filterable by user or by object type (e.g. show all actions of a user, or show all actions performed on an object).
We have to choose how to store the data: should we use SQL Azure, or table storage? We prefer table storage (cheaper).
However, the 'problem' with table storage is how to define the partition key. We have several thousand customers (the application users) in our SQL database, each in their own tenant. Using the tenant ID as partition key is not specific enough, so we have to add something to it. And there's the issue: given the filtering requirements, we can add a user ID to the partition key to make filtering by user easy, or add an object ID to make filtering by object easy.
So we see two possible solutions:
- use SQL Azure instead of table storage
- use table storage and use two tables with different partition keys, which means we duplicate all entries
Any ideas what's the best approach for our situation? Are there other, better solutions?
DocumentDB on Azure might be worth considering.
https://azure.microsoft.com/en-us/documentation/articles/documentdb-use-cases/
You can have the audit trail stored in DocumentDB as JSON documents (user, activity and object fields, and you can index on all fields).
Azure Table Storage is appropriate for storing log data; Azure App Service itself uses Table Storage for its diagnostics logs.
I think you can consider setting the PartitionKey to your user's tenant name and the RowKey to the user's ID, as according to the Table Storage data model we only need to ensure that:
Together the PartitionKey and RowKey uniquely identify every entity within a table
Alternatively, it would help if you could clarify your concern that:
Using the tenant ID as partition key is not specific enough, so we have to add something to the partition key
Additionally, you can refer to https://azure.microsoft.com/en-us/documentation/articles/storage-table-design-guide/#overview for more info about designing for Azure Table Storage.
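One concrete key scheme along these lines, sketched in Python: combine tenant and user in the PartitionKey, and use an inverted-timestamp RowKey so the newest audit entries sort first. The inverted-timestamp trick is a common Table Storage convention, not something from the question, and the tick math here is an assumption for illustration:

```python
from datetime import datetime, timezone

MAX_TICKS = 10**19  # larger than any tick value we generate below

def audit_keys(tenant_id: str, user_id: str, when: datetime):
    """PartitionKey groups one tenant's user; RowKey sorts newest-first.

    Table Storage returns rows in ascending RowKey order, so storing
    MAX_TICKS - ticks (zero-padded) makes recent entries come back first.
    """
    ticks = int(when.timestamp() * 10**7)  # 100ns ticks since the Unix epoch
    inverted = MAX_TICKS - ticks
    return f"{tenant_id}_{user_id}", f"{inverted:020d}"
```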
If you're worried about filtering in multiple ways - you could always write the same data to multiple partitions. It works really well. For example, in our app we have Staff and Customers. When there is an interaction we want to track/trace that applied to both of them (perhaps an over the phone Purchase), we will write the same information (typically as json) to our audit tables.
{
    "PurchaseId": 9485,
    "CustomerId": 138,
    "StaffId": 509,
    "ProductId": 707958,
    "Quantity": 20,
    "Price": 31.99,
    "Date": "2017-08-15 15:48:39"
}
And we will write that same row to the following partitions: Product_707958, Customer_138, Staff_509. The row key is the same across the three rows in each partition: Purchase_9485. Now if I want to go and query everything that has happened for a given staff, customer, or item, I just grab the whole partition. The storage is dirt cheap, so who cares if you write it to multiple places?
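The fan-out write described above is just the same entity stored under several partition keys. A small sketch of building those entities (the dicts stand in for the table client's entity objects):

```python
import json

def fan_out(purchase):
    """Build one audit entity per lens we want to query by.

    The same payload is duplicated under a Product_, Customer_ and
    Staff_ partition; the RowKey is identical across all three.
    """
    payload = json.dumps(purchase)
    row_key = f"Purchase_{purchase['PurchaseId']}"
    partitions = (
        f"Product_{purchase['ProductId']}",
        f"Customer_{purchase['CustomerId']}",
        f"Staff_{purchase['StaffId']}",
    )
    return [
        {"PartitionKey": pk, "RowKey": row_key, "Data": payload}
        for pk in partitions
    ]
```

Reading "everything that happened for staff 509" then becomes a single partition scan of `Staff_509`.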
Also, an idea for you considering you have multiple tenants - you could make the table name Tenant_[SomeId]. There are some other issues you might have to deal with, but it is in a sense another key to get at schema-less data.
In Azure Search we can create multiple indexes for different search results, and we have two types of api-key: one for administration and the other for querying. But with the same api-key, users can search all indexes.
In my solution I need to design a system so that different users get different results according to their privileges. I thought this could be solved with dedicated indexes for each role, but users can still query other indexes if they want to.
How can I be sure that every user is ONLY able to search a particular index?
Out of the box it is not possible to restrict key usage to a specific index. You would need to do something on your own.
Another possibility would be to create different search service accounts and then create indexes in them, instead of having one account. You can then grant your users access to the appropriate search service account.
UPDATE
Based on your comments, you're actually looking to restrict search results (documents) by the user's role, i.e. going one level deeper than indexes. To achieve this, you could dynamically append the role criteria to your search query as an OData filter. For example, say your index has boolean fields for each role type (Administrator, User, etc.) and the user searches for some keyword. You could then add an OData `$filter` that checks these conditions, so your search URL would look something like:
https://<search-service-name>.search.windows.net/indexes/<index-name>/docs?search=<search-string>&$filter=Administrator%20eq%20true
That way Search Service is doing all the filtering and you don't have to do anything in your code.
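For example, the server-side code that forwards user queries could append the role filter before calling the Search Service. A minimal sketch, where the service name, index name and role field are placeholders:

```python
from urllib.parse import quote

def build_search_url(service, index, query, role_field):
    """Append a $filter so users only see documents flagged for their role.

    The role_field is assumed to be a boolean field on each indexed
    document (e.g. Administrator), as in the example above.
    """
    filt = quote(f"{role_field} eq true")
    q = quote(query)
    return (f"https://{service}.search.windows.net/indexes/{index}"
            f"/docs?search={q}&$filter={filt}")
```

Because the filter is added server-side, the user never holds a key that can query unfiltered results.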
You can learn more about query options here: https://msdn.microsoft.com/en-us/library/azure/dn798927.aspx.
I have a rather well-known Solr issue. The index contains a group of docs of employee records that have a set of public fields and a set of secure fields. Based on the user's security credentials (which may be indexed in the doc as one field), when a document matches, the user should see all its public fields but only the secure fields he has access to. This list of secure fields varies from document to document in the same index. Example: the manager of a department (belonging to one company) can view all secure fields of employees (docs) under him, but not of those who do not work under him (whether in the same company or not). He can still see ALL the public fields of ALL the employees (matched and filtered docs).
So, being a manager, I can see all (public + secure) fields of everyone working under me, but my assistant can see only some of the secure fields of those under him. How can I implement this in Solr? Thanks.
The documentation states that Solr does not concern itself with security at the document level.
Solr is designed to be an index of your data, not a replacement for your database (access control is an important DB feature that only adds complexity to an index).
My suggestions:
Remove all sensitive data from the index. Each Solr document could include a reference (or link) to a 3rd party system/database holding the sensitive data requiring access control.
Encrypt the sensitive content within the index: using public/private key encryption, you can control who is able to decrypt the sensitive fields of a Solr document. (This solution wouldn't scale very well, nor does it allow searching of encrypted fields.)
Create a sensitive search index, for each manager: Use the web server's authentication mechanism to control access to the index and load sensitive data there.
I would suggest taking the following steps:
Separate out the public and secure content; you can use two separate cores.
Add a ServletFilter that sits between the user and the Solr webapp; then you can apply some basic ACL-based security on top of Solr's results to filter the content per your application's requirements.
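The filtering step of that second suggestion could look like the sketch below. The field names and the ACL shape are invented for illustration; in a real ServletFilter this logic would run in Java on each response, but the idea is language-independent:

```python
# Public fields always pass through; secure fields are gated per role.
PUBLIC_FIELDS = {"name", "department"}
ROLE_SECURE_FIELDS = {
    "manager": {"salary", "ssn"},
    "assistant": {"salary"},
}

def filter_doc(doc, role, manages_employee):
    """Strip secure fields the caller is not entitled to see.

    Secure fields are only granted when the caller actually manages
    this employee; public fields are visible to everyone.
    """
    allowed = set(PUBLIC_FIELDS)
    if manages_employee:
        allowed |= ROLE_SECURE_FIELDS.get(role, set())
    return {k: v for k, v in doc.items() if k in allowed}
```

Applied to every document in a Solr response, this gives the behavior from the question: the manager sees everything for his reports, the assistant sees a subset, and everyone sees the public fields.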