SCIM PATCH library - node.js

I am implementing SCIM provisioning for my current project, and implementing the PATCH method is turning out to be harder than expected.
From the RFC I understood that SCIM PATCH is almost like JSON Patch, but on closer inspection the way the path is described is different, which prevents me from using json-patch libraries.
Example:
"path":"addresses[type eq \"work\"]"
"path":"members[value eq \"2819c223-7f76-453a-919d-413861904646\"]"
Do you know any library that is doing SCIM PATCH out of the box?
My project is currently a node project, but I don't mind which language the library is written in; I can port it to JavaScript if needed.
Edit
I finally created my own library for this. It is called scim-patch and is available on npm: https://www.npmjs.com/package/scim-patch

I implemented the SCIM PATCH operation in my own library. Please take a look here and here. It is currently a work in progress for v2, but the CRUD capability required by patch operations has matured.
First of all, you need a way to parse the SCIM path, which can optionally include a filter. I implemented a finite state machine to parse the path and filter. A scanner goes through each byte of the text and points out interesting events, and a parser uses the scanner to break the text into meaningful tokens. For instance, emails[value eq "foo#bar.com"].type can be broken down into emails, [, value, eq, "foo#bar.com", ] and type. Finally, a compiler takes these tokens as input and assembles them into an abstract syntax tree. On paper, it looks something like the following:
emails -> eq -> type
         /  \
     value  "foo#bar.com"
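The scanning step can be sketched roughly as follows. This is a minimal illustration, not the code of the library mentioned above: the function name tokenizeScimPath is hypothetical, dots are treated as plain separators and dropped, and schema URN prefixes are not handled.

```javascript
// Hypothetical helper: split a SCIM path such as
//   emails[value eq "foo#bar.com"].type
// into tokens. Dots separate segments and are not emitted as tokens.
function tokenizeScimPath(path) {
  const tokens = [];
  let i = 0;
  while (i < path.length) {
    const ch = path[i];
    if (ch === " " || ch === ".") {
      i++; // separators carry no meaning of their own in this sketch
      continue;
    }
    if (ch === "[" || ch === "]") {
      tokens.push(ch);
      i++;
      continue;
    }
    if (ch === '"') {
      // Quoted literal: scan to the closing quote, skipping escaped characters.
      let j = i + 1;
      while (j < path.length && path[j] !== '"') {
        if (path[j] === "\\") j++;
        j++;
      }
      if (j >= path.length) throw new Error("unterminated string literal");
      tokens.push(path.slice(i, j + 1)); // keep the surrounding quotes
      i = j + 1;
      continue;
    }
    // Bare word: attribute name or operator.
    let j = i;
    while (j < path.length && !' []."'.includes(path[j])) j++;
    tokens.push(path.slice(i, j));
    i = j;
  }
  return tokens;
}

const scimTokens = tokenizeScimPath('emails[value eq "foo#bar.com"].type');
// scimTokens: emails, [, value, eq, "foo#bar.com", ], type
```

A real parser would go on to turn these tokens into the tree shown above; this sketch stops at the token list.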
Next, you need a way to traverse the resource data structure according to the abstract syntax tree. I designed my property model to carry a reference to the SCIM attribute. Consider the following resource:
{
  "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
  "userName": "imulab",
  "emails": [
    {
      "value": "foo#bar.com",
      "type": "work"
    },
    {
      "value": "bar#foo.com",
      "type": "home"
    }
  ]
}

I start traversing from the root of the resource and find the child called emails, which will return a multiValued property of complex type. I see my next token (eq) is the root of a filter, so I perform the filter operations on the two elements of emails. For each element, I go down the value child and evaluate its value. Since only the first element matches the filter, I finally go down the type child of that complex property and arrive at the target property. From there, you are free to perform Add, Replace and Remove operations.
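A rough sketch of that traversal, assuming the path has already been parsed into a small object. The shape of parsedPath and the names findTargets and replaceAt are made up for illustration. Only the eq operator is handled, and values are compared with strict equality, which ignores the attribute's caseExact setting.

```javascript
// Hypothetical parsed form of: emails[value eq "foo#bar.com"].type
const parsedPath = {
  attribute: "emails",
  filter: { attribute: "value", op: "eq", value: "foo#bar.com" },
  subAttribute: "type",
};

// Collect every { container, key } pair the path points at.
// A multi-valued attribute yields one candidate per element, so the
// traversal "splits" here; the filter then prunes the candidates.
function findTargets(resource, path) {
  const node = resource[path.attribute];
  if (node === undefined || node === null) return [];
  const elements = Array.isArray(node) ? node : [node];
  const matches = elements.filter((element) => {
    if (!path.filter) return true; // no filter: every element is a target
    if (path.filter.op !== "eq") {
      throw new Error("this sketch only supports the eq operator");
    }
    return element[path.filter.attribute] === path.filter.value;
  });
  return matches.map((element) => ({ container: element, key: path.subAttribute }));
}

// Apply a Replace operation to every target; returns how many were changed.
function replaceAt(resource, path, newValue) {
  const targets = findTargets(resource, path);
  for (const { container, key } of targets) container[key] = newValue;
  return targets.length;
}

const user = {
  userName: "imulab",
  emails: [
    { value: "foo#bar.com", type: "work" },
    { value: "bar#foo.com", type: "home" },
  ],
};
replaceAt(user, parsedPath, "other"); // only the first email's type changes
```

Returning container/key pairs rather than values keeps Add and Remove possible with the same lookup, since both need to modify the parent object.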
There are two things I recommend watching out for.
One is that your traversal path will split when you hit a multiValued property. In the above example, only one element matched the filter. In reality, there may be many matches, or there could be no filter at all, forcing you to traverse all elements.
The other is the syntax of the SCIM path. The specification allows the schema URN to be prefixed to the actual path, delimited by a :. So in that representation, emails.type and urn:ietf:params:scim:schemas:core:2.0:User:emails.type are actually equivalent. Note that the schema URN contains a dot (.) in the 2.0 part. This creates a further complication: you cannot simply split the text on . and hope to get all the correct tokens. I use a Trie data structure to record all schema URNs as reserved words. Whenever I start a new segment in the path, I try to match it in the Trie rather than relying solely on the . to terminate the segment.
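To illustrate the URN issue, here is a small sketch that swaps the Trie for a simpler longest-prefix scan over a list of known schema URNs. The list KNOWN_SCHEMAS and the function splitSchemaPrefix are hypothetical, and the sketch only handles filter-free paths, since it splits the remainder on dots.

```javascript
// Assumed set of schema URNs the server knows about.
const KNOWN_SCHEMAS = [
  "urn:ietf:params:scim:schemas:core:2.0:User",
  "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User",
];

// Strip a known schema URN prefix (if any) before splitting on dots,
// so that the "2.0" inside the URN is never mistaken for a separator.
function splitSchemaPrefix(path, schemas = KNOWN_SCHEMAS) {
  let best = null;
  for (const urn of schemas) {
    // The URN must be followed by ":" to count as a prefix of the path.
    if (path.startsWith(urn + ":") && (best === null || urn.length > best.length)) {
      best = urn; // prefer the longest matching URN
    }
  }
  if (best === null) {
    return { schema: null, segments: path.split(".") };
  }
  return { schema: best, segments: path.slice(best.length + 1).split(".") };
}

const prefixed = splitSchemaPrefix("urn:ietf:params:scim:schemas:core:2.0:User:emails.type");
const plain = splitSchemaPrefix("emails.type");
// Both calls yield the segments emails and type; only the schema field differs.
```

A linear scan is fine for a handful of schemas; a Trie becomes worthwhile when the match has to happen inside a scanner that consumes the path one character at a time.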
Hope it will help your work.

Have a look at scim2-filter-parser: https://github.com/15five/scim2-filter-parser
It is a library mainly used by the authors' django-scim2 library: https://github.com/15five/django-scim2
It relies on Python AST objects, but I think you can still get some useful takeaways from it.

Since I did not find any TypeScript library that implements SCIM PATCH operations, I implemented my own.
You can find it here: https://www.npmjs.com/package/scim-patch


Additive deserializing with Serde

I'd like to additively deserialize multiple files over the same data structure, where "additively" means that each new file deserializes by overwriting the fields that it effectively contains, leaving unmodified the ones that it does not. The context is config files; deserialize an "app" config provided by the app, then override it with a per-"user" config file.
I use "file" here for the sake of clarity; this could be any deserializing data source.
Note: After writing the below, I realized maybe the question boils down to: is there a clever use of #[serde(default = ...)] to provide a default from an existing data structure? I'm not sure if that's (currently) possible.
Example
Data structure
struct S {
    x: f32,
    y: String,
}
"App" file (using JSON for example):
{ "x": 5.0, "y": "app" }
"User" file overriding only "y":
{ "y": "user" }
Expected deserializing (app, then user):
assert_eq!(s.x, 5.0);
assert_eq!(s.y, "user");
Expected solution
I'm ignoring on purpose any "dynamic" solution storing all config settings into, say, a single HashMap; although this works and is flexible, this is fairly inconvenient to use at runtime, and potentially slower. So I'm calling this approach out of scope for this question.
The data structure can contain other structs. Avoid having to write too much per-struct code manually (like implementing Deserialize by hand). A typical config file for a moderate-sized app can contain hundreds of settings, and I don't want the burden of maintaining those.
All fields can be expected to implement Default. The idea is that the first deserialized file would fall back on Default::default() for all missing fields, while subsequent ones would fall back on already-existing values if not explicitly overridden in the new file.
Avoid having to change every single field of every single struct to Option<T> just for the sake of serializing/deserializing. This would make runtime usage very painful, and given the property above there would never be a None value once deserialization has completed anyway (since a field missing from all files defaults to Default::default()).
I'm fine with a solution containing only a fixed number (2) of overriding files ("app" and "user" in example above).
Current partial solution
I know how to do the first part of falling back to Default; this is well documented. Simply use #[serde(default)] on all structs.
One approach would be to simply deserialize twice with #[serde(default)] and override any field which is equal to its default in the app config with its value in the user config. But this 1) probably requires all fields to implement Eq or PartialEq, and 2) is potentially expensive and not very elegant (lose the info during deserialization, then try to somehow recreate it).
I have a feeling I need a custom Deserializer that holds a reference to (or the value of) the existing data structure, which I would fall back to when a field is not found, since the default one doesn't provide any user context when deserializing. But I'm not sure how to keep track of which field is currently being deserialized.
Any hint or idea much appreciated, thanks!
Frustratingly, serde::Deserialize has a method called deserialize_in_place that is explicitly omitted from docs.rs and is considered "part of the public API but hidden from rustdoc to hide it from newbies". This method does exactly what you're asking for (deserialize into an existing &mut T object), especially if you implement it yourself to ensure that only provided keys are overridden and other keys are ignored.

Shall I set an empty string computed string attribute for Terraform resource?

Context: I'm adding a new resource to a Terraform provider.
I've got an API that optionally returns a string attribute, so I represent it as:
"foo": {
    Type:     schema.TypeString,
    Computed: true,
    Optional: true,
},
Question: if the API returns no value or an empty string for response.foo, should I still set an empty string for the foo attribute in my resource schema, or should I not set any value at all (e.g., null)?
(Hello! I'm the same person who wrote the answer you included in your screenshot.)
If both approaches -- returning null or returning an empty string -- were equally viable from a technical standpoint then I would typically prefer to use null to represent the absence of a value, since that is clearly distinct from an empty string which for some situations would otherwise be a valid present value for the attribute.
However, since it seems like you are using the old SDK ("SDKv2") here, you will probably be constrained from a technical standpoint: SDKv2 was designed for Terraform v0.11 and earlier and so it predates the idea of attributes being null and so there is no way in its API to specify that. You may be able to "trick" the SDK into effectively returning null by not calling d.Set("foo", ...) at all in your Create function, but there is no API provided to unset an attribute and so once you've set it to something non-null there would typically be no way to get it to go back to being null again.
Given that, I'd suggest it better to be consistent and always use "" when using the old SDK, because that way users of the provider won't have to deal with the inconsistency of the value sometimes being null and sometimes being "" in this case.
When using the modern Terraform Plugin Framework this limitation doesn't apply, because that framework was designed with the modern Terraform language in mind. You aren't using that framework and so this part of the answer probably won't help you right now, but I'm mentioning it just in case someone else finds this answer in future who might already be using or be considering use of the new framework.

Cypress test: is .contains() equivalent to should('contain')?

Is this: cy.get('[name=planSelect]').contains(dummyPlan)
equivalent to this: cy.get('[name=planSelect]').should('contain', dummyPlan)
And if so, which is preferred? The first is more of an implicit assertion, but it's shorter and cleaner to my mind.
Follow-up question: After looking around to see how best to select elements for e2e testing I found that the Cypress docs recommend using data-cy attributes. Is there a reason this would be better than just adding name attributes to the markup? Should name only be used for forms fields?
The result of your Cypress test will be the same: if the element with name=planSelect does not contain dummyPlan, the test will fail at this point.
The difference between them is that in the first form, using contains(), you're actually trying to select an element, and the result of cy.get(...).contains() will yield this expected DOM element, allowing for further chaining of methods, like:
cy.get('[name=planSelect]').contains(dummyPlan).click();
In the second form you are making an explicit assertion to verify that dummyPlan exists within the other element, using the Chai chainer contain.
It is a subtle difference and the result is the same, but I would recommend you to use cy.get('[name=planSelect]').contains(dummyPlan) only in case you would like to chain some other method after contains, and use the second form if you want to explicitly assert that this element exists. Logically speaking, the first would represent a generic test failure (cypress tried to find an element that wasn't there) and the second represents an explicit assertion failure (element should contain dummyPlan but it does not).
As for your second question, name is a valid HTML attribute, and using it for your tests can cause confusion about whether the attribute is serving its original function (naming input fields) or is there just for testing purposes. I would recommend using data-cy, as the documentation suggests, because this avoids the ambiguity and makes it clear that the data-cy attribute is only there for testing purposes.
Furthermore, in some situations you might decide to strip all data-cy attributes from your code before sending it to production (during the build process, using some webpack plugin, like string-replace-loader). You would not be able to do the same with name, because you would also remove the required input names, if there were any inputs in your code.
Answer
.contains(selector, content) is the best selector; it retries element selection AND allows text matching (not just <tag> .class #id [attributes])
.should() is just an assertion and only the assertion is retried (not the element selection)
.should('exist') is implied unless you specify your own -- this is how they allowed .should('not.exist')
Tangent
Browsers support XPath 1.0 which is a pretty cool but obscure way to make complex queries based on DOM tree traversal. There's a contains predicate function:
//*[ contains(normalize-space(.), 'The quick brown fox jumped over the lazy dog.') ]
[not(.//*[contains(normalize-space(.), 'The quick brown fox jumped over the lazy dog.') ])]
This searches from root of the document for any node that contains the text and doesn't contain a descendant node which contains the text.
You can test it in the console with the Chrome $x() shortcut or this polyfill (and helper):
getLowestDomNodesByText("The quick brown fox jumped over the lazy dog.")
function getLowestDomNodesByText (text) {
  return x(`//*[contains(normalize-space(.), '${text}')][not(.//*[contains(normalize-space(.), '${text}') ])]`);
};
function x (expression) {
  const results = new XPathEvaluator().evaluate(expression, document);
  const nodes = [];
  let node = null;
  while (node = results.iterateNext()) {
    nodes.push(node);
  }
  return nodes;
}
If you need even more performance, you can use a TreeWalker with NodeFilter.SHOW_TEXT, as seen in this Chrome extension I've worked on for a long time.
I recommend using contains after get, then verifying existence with should.
cy.get('[name=planSelect]').contains(dummyPlan, {matchCase: false}).should('exist')

what does getType do in antlr4?

This question is with reference to the Cymbol code from the book (~ page 143) :
int t = ctx.type().start.getType(); // in DefPhase.enterFunctionDecl()
Symbol.Type type = CheckSymbols.getType(t);
What does each component return: "ctx.type()", "start", "getType()" ? The book does not contain any explanation about these names.
I can "kind of" understand that "ctx.type()" refers to the "type" rule, and "getType()" returns the number associated with it. But what exactly does the "start" do?
Also, to generalize this question: what is the mechanism to get the value/structure returned by a rule - especially in the context of usage in a listener?
I can see that for an ID, it is:
String name = ctx.ID().getText();
And as in above, for an enumeration of keywords it is via "start.getType()". Any other special kinds of access that I should be aware of?
Let's take the problem apart step by step. Obviously, ctx is an instance of CymbolParser.FunctionDeclContext. On pages 98-99 you can see how the grammar and ParseTree are implemented (at least to get a feel for it; for the real implementation please see the .g4 file).
Take a look at the figure of the AST on page 99: you can see that the FunctionDeclContext node has several children, one labeled type. Intuitively, it corresponds to the function's return type. This is the node you retrieve when calling CymbolParser.FunctionDeclContext::type. The return type is probably something like TypeContext.
Note that methods without 'get' at the beginning are usually child getters, e.g. you can access the block by calling CymbolParser.FunctionDeclContext::block.
So you now have the type context of the function declaration you were passed. You can access either start or stop on any context to get the first or last Token that defines the context. Put simply, start gets you "the first word". In this case, the first Token is of course the function return type itself, e.g. int.
And the last call, Token::getType, returns the integer token type of the Token.
You can find more information at API reference webpages - Context, Token. But the best way of understanding the behavior is reading through the generated ANTLR classes such as <GrammarName>Parser etc. And to be complete, I attach a link to the book.

Partial objects with JAXB?

I'm working to create some services with JAX-RS, and am relatively new to JAXB (actually XML in general) so please don't assume I know the pre-requisites that I probably should know! Here's the question: I want to send and receive "partial" objects in XML. That is, imagine one has an object (Java form, obviously) with:
class Thing { int x; String y; Customer z; }
I want to be able to send an XML output that contains (dynamically chosen, so I can't use XmlTransient) just x, or just z, or x and y, but not z, or any other combination that suits my client. The point, obviously, is that sometimes the client doesn't need everything, so I can save some bandwidth (particularly with lists of deep, complex objects, which this example clearly doesn't illustrate!).
Also, for input, the same bandwidth argument applies; I would like to be able to have the client send just the particular fields that should be updated in, say, a PUT operation, and ignore the rest, then have the server "merge" those new values onto existing objects and leave the un-mentioned fields unchanged.
This seems to be supported in the Jackson JSON libraries (though I'm still working on it), but I'm having trouble finding it in JAXB. Any ideas?
One thought that I was pondering is whether one can do this in some way via Maps. If I created a Map (potentially nested Maps, for nested complex objects) of what I want to send, could JAXB send that with a plausible structure? And if it could create such a map on input, I guess I could work through it to make the updates. Not perfect, but maybe?
And yes, I know that the "documents" that will be flying around will probably fail to comply with schemas, having missing fields and all that, but I'm ok with that, provided the infrastructure can be made to work.
Oh, and I know I could do this "manually" with SAX, StAX, or DOM parsing, but I'm hoping there's a rather more automatic way, particularly since JAXB handles the whole objects so effortlessly.
Cheers,
Toby
Note: I'm the EclipseLink JAXB (MOXy) lead and a member of the JAXB (JSR-222) expert group.
EclipseLink JAXB (MOXy) offers this support through its object graph extension. Object graphs allow you to specify a subset of properties for the purposes of marshalling and unmarshalling. They may be created at runtime programmatically:
// Create the Object Graph
ObjectGraph contactInfo = JAXBHelper.getJAXBContext(jc).createObjectGraph(Customer.class);
contactInfo.addAttributeNodes("name");
Subgraph location = contactInfo.addSubgraph("billingAddress");
location.addAttributeNodes("city", "province");
Subgraph simple = contactInfo.addSubgraph("phoneNumbers");
simple.addAttributeNodes("value");
// Output XML - Based on Object Graph
marshaller.setProperty(MarshallerProperties.OBJECT_GRAPH, contactInfo);
marshaller.marshal(customer, System.out);
or statically on the class through annotations:
@XmlNamedObjectGraph(
    name="contact info",
    attributeNodes={
        @XmlNamedAttributeNode("name"),
        @XmlNamedAttributeNode(value="billingAddress", subgraph="location"),
        @XmlNamedAttributeNode(value="phoneNumbers", subgraph="simple")
    },
    subgraphs={
        @XmlNamedSubgraph(
            name="location",
            attributeNodes = {
                @XmlNamedAttributeNode("city"),
                @XmlNamedAttributeNode("province")
            }
        )
    }
)
@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
public class Customer {
For More Information
http://blog.bdoughan.com/2013/03/moxys-object-graphs-partial-models-on.html
http://blog.bdoughan.com/2013/03/moxys-object-graphs-inputoutput-partial.html
http://blog.bdoughan.com/2011/05/specifying-eclipselink-moxy-as-your.html
