Row-level access control in Snowflake - security

I have a customer that owns a carpet cleaning business, and we have all of his franchisees' data in a multi-tenant database model that we would like to move into a data warehouse in Snowflake. I don't want to build a separate database for each customer, because then I'd have to keep every database up to date with the latest data model; I want one data model to rule them all. I keep a tenant ID with each record to identify which franchisee the data belongs to.

I want to give each franchisee a set of credentials so they can hook up their analytics tool of choice (Tableau, Power BI, etc.) and only get access to the rows that are applicable to them. Is there a way to secure the rows they see in each table based on their user, i.e. some sort of row-level access control similar to row-level security policies in Postgres? Are there any better methods for handling this type of scenario? Ultimately I want to maintain and manage the least number of ELT jobs and data models.

This is the purpose of either Secure Views or Reader Accounts.
We are using both, and they have about the same technical hassle/setup costs, but we are using an internal tool to build/alter the schemas.

To expand on Simeon's answer:
You could have a single Snowflake account and create a Snowflake role & user for each franchisee. These roles would have access to a Secure View which uses the CURRENT_ROLE / CURRENT_USER context functions, as in this example from the Snowflake documentation.
You'd need a role -> tenant ID "mapping table" which is used in the Secure View to limit the rows to the correct franchisee; a sketch of what that could look like follows below.
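For illustration, here is a minimal sketch of that pattern in Snowflake SQL. The jobs table, tenant_map table, and FRANCHISEE_A/FRANCHISEE_B roles are hypothetical names, not from the question:

-- Map each franchisee's role to its tenant ID
create table tenant_map (role_name string, tenant_id number);
insert into tenant_map values ('FRANCHISEE_A', 1), ('FRANCHISEE_B', 2);

-- Secure view over the shared multi-tenant table, filtered by the caller's role
create secure view jobs_v as
select j.*
from jobs j
join tenant_map m on m.tenant_id = j.tenant_id
where m.role_name = current_role();

grant select on view jobs_v to role FRANCHISEE_A;

Each franchisee then connects with a user whose default role is their franchisee role, so one data model and one view definition serve every tenant.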

Related

Azure Data Explorer - Restrict table and records

We have a requirement in Kusto/ADX where we need to provide access to only one table, and within it only to certain records when conditions are met for a group or a user.
I have explored RLS and Restricted View Access for this; here is where I stand:
- RLS and Restricted View Access cannot be applied together on the same table.
- RLS can restrict users only at the record level, not at the table level.
- Restricted View Access can restrict at the table level but not at the record level. It also has a pain point: I would have to apply the restricted view policy to every other table and add the restricted viewer role to all the users we don't want to restrict. Doing all of that just so a single group/user can access one table seems painful.
Is there any better approach to handle this scenario?
Thank you.
Bharath Kumar B
You need to split the tables into multiple databases, where each database has a different set of users who can view the data.
On top of that, you'll need to apply RLS (Row Level Security) on the tables where you want some users to see only some of the records; a sketch of such a policy is below.
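For illustration, a minimal sketch of what an RLS policy in ADX can look like. The Sales table, Region column, and AAD group are hypothetical:

// RLS query function: members of the managers group see every row,
// everyone else sees only rows flagged as public
.create-or-alter function SalesRLS() {
    Sales
    | where current_principal_is_member_of('aadgroup=sales-managers@contoso.com')
        or Region == "Public"
}

.alter table Sales policy row_level_security enable "SalesRLS"

The function must return the same schema as the table it protects; every query against Sales is then transparently rewritten through it.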

Azure Log Analytics Workspace and GDPR

Is there any way to purge/mask data in a Log Analytics workspace with regular expressions or similar, to remove sensitive data that has been sent to the workspace?
Like social security numbers that are part of a URL?
As per this Microsoft document, Log Analytics is a flexible store which, while prescribing a schema to your data, allows you to override every field with custom values. We can mask data in the Log Analytics workspace, and here are a few strategies for handling personal data:
Where possible, stop the collection of, obfuscate, anonymize, or otherwise adjust the data being collected to exclude it from being considered "private". This is by far the preferred approach, saving you the need to create a very costly and impactful data handling strategy.
Where not possible, attempt to normalize the data to reduce the impact on the data platform and performance. For example, instead of logging an explicit User ID, create a lookup table that correlates the username and their details to an internal ID that can then be logged elsewhere. That way, should one of your users ask you to delete their personal information, deleting only the corresponding row in the lookup table may be sufficient.
Finally, if private data must be collected, build a process around the purge API path and the existing query API path to meet any obligations you may have around exporting and deleting any private data associated with a user; a sketch of such a request follows below.
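For illustration, here is roughly the shape of a purge request; the subscription/workspace placeholders and the AppRequests/UserId filter are hypothetical:

POST https://management.azure.com/subscriptions/{subId}/resourceGroups/{rg}/providers/Microsoft.OperationalInsights/workspaces/{ws}/purge?api-version=2020-08-01

{
  "table": "AppRequests",
  "filters": [
    { "column": "UserId", "operator": "==", "value": "user@contoso.com" }
  ]
}

The call is asynchronous: it returns an operation ID you can poll for status, and purges can take a long time to complete, so this path is meant for compliance obligations rather than routine cleanup.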
Here is a KQL query for finding private data (IP addresses, in this example) in a Log Analytics workspace:
search *
| where * matches regex @'\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b' // regex originally provided on https://stackoverflow.com/questions/5284147/validating-ipv4-addresses-with-regexp
| summarize count() by $table

Secure filtering in Power BI Embedded

Currently I have the following scenario: I have a report in Power BI that reads from a dataset containing the data of all companies. In my ASP.NET MVC application, the user selects the company for which to display the report, and with Power BI Embedded the application filters the report by the ID of the company through the embed config defined in JS (filter parameters passed from the server).
I am using the app-owns-data approach, where I have a master account and the embed token is generated for that master account.
The user accessing the report does not have access rights to all companies, and this is handled server-side. With this approach, however, the user can easily alter the embed config in JS and display the report for a company which he is not authorized to access.
I looked into row-level security and found the following approach: https://community.powerbi.com/t5/Developer/PowerBi-Embedded-API-Works-with-RLS/td-p/231064, where there exists a role for every company and the embed token is generated for that particular company. This would be an ideal approach, but in my scenario the companies are not pre-defined and can be created at any time. Therefore, I would need to create a role per company, which cannot be achieved programmatically, as Power BI does not provide a means to automate role creation.
The only approach I can think of is to clone a report for each new company and create a dataset specific to that report which will only have the data for that particular company. Then the generated embed token will only be valid for that particular report.
Has anyone also experienced this dilemma? Any suggestions what I should do in such scenario?
You can still use RLS, but without roles per company. Use the USERPRINCIPALNAME() DAX function to find out which user is viewing the report. In the database, make a table that specifies which company can be seen by which user, and add it to your model. Then use RLS to filter this table to only the row (or rows) where the user is the current one (this is where USERPRINCIPALNAME() comes into play), and let the relationship between this table and your data tables filter out what should not be seen. This way there are no JavaScript filters at all, so nothing can be changed by a malicious user.
See Using the username() or userprincipalname() DAX function.
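For illustration, a minimal sketch of that role's filter expression, assuming a hypothetical UserCompany mapping table with UserEmail and CompanyId columns:

// DAX table filter for a single shared role (e.g. "Viewer"):
// keep only the mapping rows belonging to the user viewing the report
[UserEmail] = USERPRINCIPALNAME()

Combined with the relationship between this mapping table and your data tables, this single role serves every company. Since this is app-owns-data embedding, you would also pass the end user's identity (and this role name) as the effective identity when generating the embed token, so that USERPRINCIPALNAME() resolves to the viewer rather than the master account.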

How to implement dynamic security in Power View

I have created a Power View using a BISM connection in the Enterprise Portal of AX. That Power View report will be used by 100+ users. I want every user to see his/her own data in the Power View instead of viewing the complete data. One option is to create 100+ security roles in SSAS (multidimensional), which is not a viable option. Please guide me on how I can achieve dynamic security in Power View so that every user sees their own view. Thanks.
Power View doesn't offer any kind of security. You will need to do this in SSAS, but you don't need 100+ security roles; you will want to look into dynamic security. To create dynamic security, you need some way to relate a user to the information they should see. This usually means adding a field to an existing table or creating new tables.
If all users are secured by the same attribute, they can be contained in a single role. If some users are secured based on one attribute and others based upon another, then you may need multiple roles.
Here's how this might work:
1. Create a table that contains all users that will need access to your cube.
2. Create a bridge table that ties the users to the attribute on which you are securing their access. For instance, maybe users can only see certain products, so you have a table of User IDs and Product IDs.
3. Add these tables to your DSV.
4. Create a user dimension.
5. Create a measure group based upon your security bridge table.
6. Create a role for this user type and add an MDX statement to the Allowed Member Set. Also, check the Enable Visual Totals checkbox.
7. Populate the members for the role, preferably through an AD group rather than individually if you have 100+ users.
Your allowed member set will look something like:
Exists(
    {[Product].[Product ID].members},
    STRTOSET("[Users].[UserName].[UserName].&[" + Username() + "]"),
    "Bridge User Product"
)
You can find a good blog post here and a good video about SSAS security here (dynamic security starts around the 35-minute mark).

How to store audit data in Azure

We're in the design phase for building an audit trail in an existing web application. The application runs on Windows Azure and uses a SQL Azure database.
The audit logs must be filterable by user or by object type (e.g. show all actions of a user, or show all actions performed on an object).
We have to choose how to store the data: should we use SQL Azure, or should we use table storage? We would prefer table storage (it's cheaper).
However, the 'problem' with table storage is how to define the partition key. We have several thousand customers (the application users) in our SQL database, each in their own tenant. Using the tenant ID as the partition key is not specific enough, so we have to add something to it. And there's the issue: given the requirements for filtering, we can add a user ID to the partition key to make filtering by user easy, or we can add an object ID to make filtering by object easy.
So we see two possible solutions:
- use SQL Azure instead of table storage
- use table storage and use two tables with different partition keys, which means we duplicate all entries
Any ideas what's the best approach for our situation? Are there other, better solutions?
DocumentDB on Azure might be worth considering:
https://azure.microsoft.com/en-us/documentation/articles/documentdb-use-cases/
You can have the audit trail stored in DocumentDB as JSON documents (user, activity, and object fields), and you can index on all fields.
Azure Table Storage is appropriate for storing log data; Azure App Service, for example, uses Table Storage to store its diagnostics logs.
I think you can consider setting the PartitionKey to your user's tenant name and the RowKey to the user's ID. According to the Table Storage Data Model, we only need to ensure that:
Together the PartitionKey and RowKey uniquely identify every entity within a table
Alternatively, you could clarify your concern that:
Using the tenant ID as partition key is not specific enough, so we have to add something to the partition key
Additionally, you can refer to https://azure.microsoft.com/en-us/documentation/articles/storage-table-design-guide/#overview for more info about designing for Azure Table Storage.
If there is any update, please feel free to let me know.
If you're worried about filtering in multiple ways, you could always write the same data to multiple partitions. It works really well. For example, in our app we have Staff and Customers. When there is an interaction we want to track/trace that applies to both of them (perhaps an over-the-phone purchase), we will write the same information (typically as JSON) to our audit tables.
{
    PurchaseId: 9485,
    CustomerId: 138,
    StaffId: 509,
    ProductId: 707958,
    Quantity: 20,
    Price: 31.99,
    Date: '2017-08-15 15:48:39'
}
And we will write that same row to the following partitions: Product_707958, Customer_138, Staff_509. The row key is the same across all three partitions: Purchase_9485. Now if I want to query everything that has happened for a given staff member, customer, or product, I just grab the whole partition. The storage is dirt cheap, so who cares if you write it to multiple places? A sketch of this pattern is below.
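For illustration, here is a minimal sketch of that multi-partition write using the Azure.Data.Tables SDK; the table name, environment variable, and Payload property are hypothetical:

// Write one audit record under three partitions so that each access path
// (product, customer, staff) becomes a cheap single-partition query.
using System;
using Azure.Data.Tables;

var table = new TableClient(Environment.GetEnvironmentVariable("STORAGE_CONNECTION_STRING"), "Audit");
table.CreateIfNotExists();

string payload = "{ \"PurchaseId\": 9485, \"CustomerId\": 138, \"StaffId\": 509, ... }";
string rowKey = "Purchase_9485";

foreach (var partition in new[] { "Product_707958", "Customer_138", "Staff_509" })
{
    var entity = new TableEntity(partition, rowKey) { ["Payload"] = payload };
    table.UpsertEntity(entity); // same row key in every partition; upsert is safe to retry
}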
Also, an idea for you, considering you have multiple tenants: you could make the table name Tenant_[SomeId]. There are some other issues you might have to deal with, but it is in a sense another key for getting at schema-less data.
