I want to create an ACL-based authorization system for a web application that uses an SQL database as storage.
My problem is that I want to use the authorization rules defined as ACLs to filter searches. For this I need to convert the ACLs into flat boolean expressions that I can use in a WHERE clause.
My ACL system would be simple. Every entry would be either ALLOW or DENY and would carry a condition. The list would be scanned from top to bottom, and the first entry whose condition matches would apply. So e.g. if I have an ACL like:
ALLOW x = 3
ALLOW x = 5
DENY true
I will need to filter with: x = 3 OR x = 5
If I have:
DENY x = 3
DENY x = 5
ALLOW true
the filter will be NOT (x = 3 OR x = 5)
I am still, however, not sure how to create a universal method of conversion that would apply to any mixture of DENY and ALLOW rules in any order. We can, however, assume that at least one rule will always evaluate to true; we can guarantee that through ACL inheritance, by putting DENY true at the end of every list.
Could you help me solve this riddle?
I have solved this problem already. The key to avoiding the three-valued logic, which was the problem before, is to start from the end of the list. The algorithm looks like this:
group the acl into packs of ALLOW and DENY
discard the last pack (which should be DENY and end with DENY true)
add 'false' to the result
for every pack from the end:
    combine all the conditions with OR
    if pack is ALLOW:
        add "or condition" to the result
    elif pack is DENY:
        add "and not condition" to the result
if result == 'false':
    don't perform search, return empty list
else:
    discard 'false or' from the beginning of the result (redundant)
So for example:
DENY x = 7
ALLOW x > 3
DENY true
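would become false OR x > 3 AND NOT (x = 7). With more than two packs, the running result has to be wrapped in parentheses before each new operator is appended, because AND binds tighter than OR in SQL.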
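The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name acl_to_where and the (action, condition) tuple format are made up for the example, the conditions are trusted SQL fragments, and None stands in for the constant 'false' so that the redundant 'false OR' never has to be stripped.

```python
from itertools import groupby


def acl_to_where(acl):
    """Turn an ACL into a WHERE fragment, or None if nothing is allowed.

    acl is a list of (action, condition) tuples in top-to-bottom order
    (hypothetical format); the list is assumed to end with ("DENY", "true").
    Conditions are pasted into the result as-is, so they must be trusted.
    """
    # Group consecutive entries with the same action into packs.
    packs = [(action, [cond for _, cond in group])
             for action, group in groupby(acl, key=lambda entry: entry[0])]
    if not packs or packs[-1][0] != "DENY":
        raise ValueError("ACL is expected to end with a DENY pack")
    packs.pop()  # everything that reaches the final DENY pack is denied

    result = None  # None plays the role of 'false'
    for action, conds in reversed(packs):
        combined = "(" + " OR ".join(conds) + ")"
        if action == "ALLOW":
            result = combined if result is None else f"({result} OR {combined})"
        elif result is not None:
            # DENY: 'false AND NOT ...' stays false, so only non-empty results change.
            result = f"({result} AND NOT {combined})"
    return result


print(acl_to_where([("DENY", "x = 7"), ("ALLOW", "x > 3"), ("DENY", "true")]))
```

A return value of None corresponds to the "don't perform search, return empty list" branch. Every intermediate result is parenthesized, so operator precedence cannot change the meaning when ALLOW and DENY packs alternate several times.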
I recently found the following line of terraform in our code (some values sanitized):
subnet_ids = [ "${split(",", var.xxx_lb ? join(",", data.yyy_ids.private.ids) : join(",", concat(data.yyy_ids.public.ids, list(""))))}" ]
I'm trying to understand why code would be written this way. More specifically, what is the final join doing? Pulling it out for clarity:
join(",", concat(data.yyy_ids.public.ids, list("")))
It seems that someone (no longer at the company) was trying to ensure that a non-empty list is returned. We definitely don't want the empty ("") item in the list.
So, the questions here are:
What logically is going on in this statement?
Is there a better way?
If there is not a better way, how can we remove the empty entry from the resulting list?
Update for others who may run into this sort of code:
In Terraform versions lower than 0.12, conditionals don't work with lists, so join/split is used to turn lists into strings and then back into lists:
https://github.com/hashicorp/terraform/issues/12453
What logically is going on in this statement?
The original author attempts to create a list with the subnet ids
a) The outer split takes a string (in this case the joined subnet ids) and returns a list by splitting it on the delimiter ,
var.xxx_lb ? clause_if_true : clause_if_false
b) Next, Terraform evaluates this variable as a boolean and, using the ternary operator syntax, returns the private or the public subnet ids according to the result.
join(",", data.yyy_ids.private.ids)
c) If the boolean value is true, Terraform evaluates this part.
It returns a string by joining the items of the list, with the delimiter , between them. I assume the author joins them into a string to match the split in section a).
join(",", concat(data.yyy_ids.public.ids, list("")))
d) If the boolean value in b) evaluates to false, Terraform evaluates this part.
The concat function takes lists as input and returns them as a single list, here the public ids plus one empty string.
The result is then joined in the same way as in c).
The list function is deprecated; from Terraform 0.12 on, tolist([...]) should be used instead.
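To see where the empty entry comes from, the join/concat/split round trip in steps a) to d) can be mimicked in Python. This is only an analogy, not Terraform code: the helper name round_trip and the subnet ids are made up, and Python's str.join and str.split stand in for Terraform's join and split. The last line filters out empty strings, which is what Terraform's compact function does for a list of strings.

```python
def round_trip(public_ids):
    """Mimic split(",", join(",", concat(public_ids, list("")))) in Python."""
    joined = ",".join(public_ids + [""])  # concat with list(""), then join
    return joined.split(",")              # split back into a list


ids = ["subnet-aaa", "subnet-bbb"]  # hypothetical ids for illustration
result = round_trip(ids)
print(result)  # the appended "" survives as a trailing empty item

# Dropping empty strings afterwards, similar in spirit to compact(...):
cleaned = [item for item in result if item != ""]
print(cleaned)
```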
Is there a better way?
I would take a straightforward approach: check the boolean value; if it is true, use the list of private ids, otherwise the public ones. This requires Terraform 0.12 or later, where conditionals work with lists.
subnet_ids = var.xxx_lb ? data.yyy_ids.private.ids : data.yyy_ids.public.ids
I would appreciate suggestions for a more computationally efficient way to dynamically filter a Pandas DataFrame.
The size of the DataFrame, len(df.index), is around 680,000.
This code from the callback function of a Plotly Dash dashboard is triggered when points on a scatter graph are selected. These points are passed to points as a list of dictionaries containing various properties with keys 'A' to 'C'. This allows the user to select a subset of the data in the pandas.DataFrame instance df for cross-filtering analysis.
# Start with an all-False mask aligned to df's index
rows_boolean = pandas.Series(False, index=df.index)
for point in points:
    current_condition = ((df['A'] == point['a']) & (df['B'] == point['b'])
                         & (df['C'] >= point['c']) & (df['C'] < point['d']))
    # OR in the rows matched by this point
    rows_boolean = rows_boolean | current_condition
filtered = df.loc[rows_boolean, list_of_column_names]
The body of this for loop is very slow because each iteration scans the whole data frame; running it once is manageable, but not inside a loop.
Note that these filters do not narrow the selection: each successive iteration of the for loop increases, rather than decreases, the size of filtered (because the | operator is used rather than &).
Note also that I am aware of the existence of the method df['C'].between(point['c'], point['d']) as an alternative to the last two comparison operators, however, I only want this comparison to be inclusive at the lower end.
Solutions I have considered
Searching the many frustratingly similar posts on SO reveals a few ideas which get some of the way:
Using pandas.DataFrame.query() will require building a (potentially very large) query string as follows:
query = ' | '.join([f'((A == {point["a"]}) & (B == {point["b"]}) '
                    f'& (C >= {point["c"]}) & (C < {point["d"]}))' for point in points])
filtered = df.query(query)
My main concern here is that I don’t know how efficient the query method becomes when the query passed has several dozen (or even several hundred) conditions strung together. This solution also currently does not allow the selection of columns using list_of_column_names.
Another possible solution could come from implementing something like this.
To reiterate, speed is key here, so I'm not just after something that works, but something that works a darn sight faster than my boolean implementation above:
There should be one-- and preferably only one --obvious way to do it. (PEP 20)
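One further option, sketched here under assumptions and not benchmarked on the 680,000-row frame, is to turn points into a small DataFrame and do an inner merge on the equality columns, so that the range test on C only runs on rows that already match some (a, b) pair. The function name filter_by_points and the helper column _row are made up for the example, and df is assumed not to contain columns named a, b, c, d or _row.

```python
import numpy as np
import pandas as pd


def filter_by_points(df, points, columns):
    """Return rows of df that match at least one point (lower bound inclusive)."""
    if not points:
        return df.iloc[0:0][columns]
    pts = pd.DataFrame(points)[["a", "b", "c", "d"]]
    # Keep only the columns needed for matching, plus the row position.
    keyed = df[["A", "B", "C"]].assign(_row=np.arange(len(df)))
    # Inner merge pairs each row with every point that shares its (A, B) values.
    merged = keyed.merge(pts, left_on=["A", "B"], right_on=["a", "b"])
    hit = merged[(merged["C"] >= merged["c"]) & (merged["C"] < merged["d"])]
    # A row may match several points; unique() keeps it once.
    rows = np.sort(hit["_row"].unique())
    return df.iloc[rows][columns]


df = pd.DataFrame({"A": [1, 1, 2, 2], "B": ["x", "x", "y", "y"],
                   "C": [0.5, 1.5, 2.5, 3.5], "V": [10, 20, 30, 40]})
points = [{"a": 1, "b": "x", "c": 1.0, "d": 2.0},
          {"a": 2, "b": "y", "c": 2.5, "d": 3.5}]
print(filter_by_points(df, points, ["A", "V"]))
```

Whether this beats the boolean loop depends on how many rows share each (A, B) pair, so it is worth timing both on the real data.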
I have a static prompt which is a single select with two values; let's call them A and B. When the user selects option 'A', my report pulls all data from the DB, which is expected. When the user selects option 'B', the report should pull only the records whose code = 'M'. Here code is a column name in the report.
Note: For option 'A' I don't need to set any prompt in the report because it should pull all records by default.
Let's assume your parameter name is param and data item is named item.
Filter expression:
if (?param? = 'A')
then ([item])
else ('M')
= [item]
Note: You absolutely need to use a prompt. The result of selecting A should be to not filter.
I think I understand, try this:
Make the prompt a single value (i.e. B) with a use value of 'M'
Make the HEADER TEXT for the prompt A (so it is not an actual selection)
Make the filter optional
if the user selects A - the prompt is NULL and the optional filter is ignored
if the user selects B - the filter [Some data item] = ?YourParm? will occur
Also, if you prefer to not have header text
you can make static values A, B and modify the optional filter to be like this:
(?YourParm? <> 'M') OR ([Some data item] = ?YourParm?)
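The two filter expressions suggested above can be checked row by row with a small Python sketch. It is only an illustration: the function names and the sample codes are made up, and Python equality does not model SQL NULL handling, so rows with a null data item may behave differently in the real report.

```python
def keep_row_conditional(param, item):
    """if (?param? = 'A') then ([item]) else ('M') = [item]"""
    compare_to = item if param == "A" else "M"
    return compare_to == item


def keep_row_optional(param, item):
    """(?YourParm? <> 'M') OR ([Some data item] = ?YourParm?)"""
    return param != "M" or item == param


codes = ["M", "X", "M", "Y"]  # hypothetical values of the code column

# Option A keeps every row; option B keeps only rows whose code is 'M'.
print([c for c in codes if keep_row_conditional("A", c)])
print([c for c in codes if keep_row_conditional("B", c)])
# In the second variant, option B carries the use value 'M'.
print([c for c in codes if keep_row_optional("A", c)])
print([c for c in codes if keep_row_optional("M", c)])
```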
I have following text:
1 hwb wert: 330 kWh
In the first step, the following mapping takes place:
330 kWh is mapped as: Lookup.major = "unit"
hwb wert is mapped as: Lookup.major = "keyword"
The JAPE Rules:
Phase: composedUnits
Input: Token Lookup
Options: control=appelt debug=true
Rule: TableRow
Priority:10
(
({Lookup.majorType == "keyword"})
({Token.kind == punctuation})[0,4]
({Lookup.majorType == "unit"})
)
Rule: ReversedTableRow
Priority: -2
(
({Token.kind == number})
({Lookup.majorType == "keyword"})
)
I can't understand why the ReversedTableRow-Rule is matched and not the TableRow.
The appelt priorities work only for the same regions of text (e.g. earlier match wins and longer match wins). Text consumed by a previous rule cannot be matched by a later rule...
From the documentation:
With the appelt style, only one rule can be fired for the same region
of text, according to a set of priority rules. Priority operates in
the following way.
From all the rules that match a region of the document starting at
some point X, the one which matches the longest region is fired.
If more than one rule matches the same region, the one with the highest
priority is fired.
If there is more than one rule with the same priority, the one defined
earlier in the grammar is fired.
...
Note also that depending on the control style, firing a rule may
‘consume’ that part of the text, making it unavailable to be matched
by other rules. This can be a problem for example if one rule uses
context to make it more specific, and that context is then missed by
later rules, having been consumed due to use of for example the
‘Brill’ control style.
The TableRow rule can win as the longer match with the following modification. Note that I added the :tableRow label, which does not include the leading number token.
(
({Token.kind == number})?
(
({Lookup.majorType == "keyword"})
({Token.kind == punctuation})[0,4]
({Lookup.majorType == "unit"})
):tableRow
)
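The selection order described in the documentation excerpt can be sketched in Python. This is a simplified model and not GATE's implementation: the Match tuple and fire_appelt are made up for the example, and the character offsets were worked out by hand for the text "1 hwb wert: 330 kWh" (the number at 0, the keyword at 2 to 10, the unit ending at 19).

```python
from typing import NamedTuple


class Match(NamedTuple):
    rule: str
    start: int     # character offset where the match begins
    end: int       # character offset just past the match
    priority: int
    order: int     # position of the rule in the grammar file


def fire_appelt(matches):
    """Fire rules left to right; fired text is consumed (simplified model)."""
    fired, pos = [], 0
    while True:
        candidates = [m for m in matches if m.start >= pos]
        if not candidates:
            return fired
        first = min(m.start for m in candidates)
        here = [m for m in candidates if m.start == first]
        # Longest match wins, then highest priority, then earliest rule.
        best = max(here, key=lambda m: (m.end - m.start, m.priority, -m.order))
        fired.append(best)
        pos = max(best.end, best.start + 1)  # consume the matched region


# Original grammar: TableRow can only start at "hwb", after the number.
original = [Match("TableRow", 2, 19, 10, 0), Match("ReversedTableRow", 0, 10, -2, 1)]
# Modified grammar: the optional number lets TableRow start at offset 0.
modified = [Match("TableRow", 0, 19, 10, 0), Match("ReversedTableRow", 0, 10, -2, 1)]

print([m.rule for m in fire_appelt(original)])  # ['ReversedTableRow']
print([m.rule for m in fire_appelt(modified)])  # ['TableRow']
```

In the original grammar the two rules never compete on priority, because they do not start at the same point; ReversedTableRow fires first and consumes the keyword that TableRow would need.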
I'm trying to use get() to access a list element in R, but am getting an error.
example.list <- list()
example.list$attribute <- c("test")
get("example.list") # Works just fine
get("example.list$attribute") # breaks
## Error in get("example.list$attribute") :
## object 'example.list$attribute' not found
Any tips? I am looping over a vector of strings which identify the list names, and this would be really useful.
Here's the incantation that you are probably looking for:
get("attribute", example.list)
# [1] "test"
Or perhaps, for your situation, this:
get("attribute", eval(as.symbol("example.list")))
# [1] "test"
# Applied to your situation, as I understand it...
example.list2 <- example.list
listNames <- c("example.list", "example.list2")
sapply(listNames, function(X) get("attribute", eval(as.symbol(X))))
# example.list example.list2
# "test" "test"
Why not simply:
example.list <- list(attribute="test")
listName <- "example.list"
get(listName)$attribute
# or, if both the list name and the element name are given as arguments:
elementName <- "attribute"
get(listName)[[elementName]]
If your strings contain more than just object names, e.g. operators like here, you can evaluate them as expressions as follows:
> string <- "example.list$attribute"
> eval(parse(text = string))
[1] "test"
If your strings are all of the type "object$attribute", you could also parse them into object/attribute, so you can still get the object, then extract the attribute with [[:
> parsed <- unlist(strsplit(string, "\\$"))
> get(parsed[1])[[parsed[2]]]
[1] "test"
flodel's answer worked for my application, so I'm gonna post what I built on it, even though this is pretty uninspired. You can access each list element with a for loop, like so:
#============== List with five elements of non-uniform length ================#
example.list=
list(letters[1:5], letters[6:10], letters[11:15], letters[16:20], letters[21:26])
#===============================================================================#
#====== for loop that names and concatenates each consecutive element ========#
derp=c(); for(i in 1:length(example.list))
{derp=append(derp,eval(parse(text=example.list[i])))}
derp #Not a particularly useful application here, but it proves the point.
I'm using code like this for a function that calls certain sets of columns from a data frame by the column names. The user enters a list with elements that each represent different sets of column names (each set is a group of items belonging to one measure), and the big data frame containing all those columns. The for loop applies each consecutive list element as the set of column names for an internal function* applied only to the currently named set of columns of the big data frame. It then populates one column per loop of a matrix with the output for the subset of the big data frame that corresponds to the names in the element of the list corresponding to that loop's number. After the for loop, the function ends by outputting that matrix it produced.
Not sure if you're looking to do something similar with your list elements, but I'm happy I picked up this trick. Thanks to everyone for the ideas!
"Second example" / tangential info regarding application in graded response model factor scoring:
Here's the function I described above, just in case anyone wants to calculate graded response model factor scores* in large batches... Each column of the output matrix corresponds to an element of the list (i.e., a latent trait with ordinal indicator items specified by column name in the list element), and the rows correspond to the rows of the data frame used as input. Each row should presumably contain mutually dependent observations, as from a given individual, to whom the factor scores in the same row of the output matrix belong. Also, I feel I should add that if all the items in a given list element use the exact same Likert scale rating options, the graded response model may be less appropriate for factor scoring than a rating scale model (cf. http://www.rasch.org/rmt/rmt143k.htm).
grmscores <- function(ColumnNameList, DataFrame) {
  require(ltm)  # (Rizopoulos, 2006)
  x <- matrix(NA, nrow = nrow(DataFrame), ncol = length(ColumnNameList))
  for (i in 1:length(ColumnNameList)) {  # flodel's magic featured below!
    cols <- eval(parse(text = ColumnNameList[i]))  # column names for measure i
    x[, i] <- factor.scores(grm(DataFrame[, cols]),
                            resp.patterns = DataFrame[, cols])$score.dat$z1
  }
  x
}
Reference
*Rizopoulos, D. (2006). ltm: An R package for latent variable modelling and item response theory analyses, Journal of Statistical Software, 17(5), 1-25. URL: http://www.jstatsoft.org/v17/i05/