Is the xs:language datatype inconsistently defined by W3C?

According to Section 3.3.3 of the W3C Datatypes spec:
language represents natural language identifiers as defined by
[RFC 3066]. The value space of language is the set of all strings
that are valid language identifiers as defined [RFC 3066]. The
lexical space of language is the set of all strings that conform to
the pattern [a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*
However, in RFC 3066 it states in Section 2.5 that
language-range = language-tag / "*"
That is, a language-range has the same syntax as a language-tag, or
is the single character "*".
where
The special range "*" matches any tag.
Thus, the RFC would allow the use of the asterisk as a wildcard identifying all possible languages. However, the pattern specified by W3C does not allow the use of "*". In other words, it appears that the lexical space and the value space do not line up. Alternatively, I could be misunderstanding something in the definitions. Hence my question: Is the xs:language datatype inconsistently defined by W3C?
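To make the observation concrete, here is a minimal Python sketch that tests a few strings against the lexical pattern quoted from the spec. Python's re module is used only as a convenient regex tester, and the helper name in_lexical_space is made up for illustration. It shows what the pattern accepts and rejects; it does not by itself settle how the spec intends language-range to relate to the value space.

```python
import re

# Lexical pattern quoted from the W3C Datatypes spec for xs:language.
XS_LANGUAGE = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")


def in_lexical_space(candidate: str) -> bool:
    """Return True if the whole string matches the xs:language pattern.

    XSD patterns are implicitly anchored, so fullmatch is used here
    rather than search.
    """
    return XS_LANGUAGE.fullmatch(candidate) is not None


# Ordinary language tags pass; the RFC 3066 wildcard range "*" does not.
for candidate in ["en", "en-US", "x-klingon", "*"]:
    print(repr(candidate), in_lexical_space(candidate))
```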

Related

Can you set multiple (different) tags with the same value?

For some of my projects, I have had to use the viper package to handle configuration.
The package requires you to add the mapstructure:"fieldname" tag to identify and set your configuration object's fields correctly, but I have also had to add other tags for other purposes, leading to something like the following:
type MyStruct struct {
    MyField string `mapstructure:"myField" json:"myField" yaml:"myField"`
}
As you can see, it is quite redundant for me to write tag:"myField" for each of my tags, so I was wondering if there was any way to "bundle" them up and reduce the verbosity, with something like this: mapstructure,json,yaml:"myField"
Or is it simply not possible, so that you must specify every tag separately?
Struct tags are arbitrary string literals. Data stored in struct tags may look like whatever you want them to be, but if you don't follow the conventions, you'll have to write your own parser / processing logic. If you follow the conventions, you may use StructTag.Get() and StructTag.Lookup() to easily get tag values.
The conventions do not support "merging" multiple tags, so just write them all out.
The conventions, quoted from reflect.StructTag:
By convention, tag strings are a concatenation of optionally space-separated key:"value" pairs. Each key is a non-empty string consisting of non-control characters other than space (U+0020 ' '), quote (U+0022 '"'), and colon (U+003A ':'). Each value is quoted using U+0022 '"' characters and Go string literal syntax.
See related question: What are the use(s) for tags in Go?

JDL pattern is not correct in Java @Pattern

When I apply a pattern in JDL, the generated entity classes have the @Pattern annotation, but the value of that annotation is not exactly the pattern I declared in JDL.
For example, if I define the pattern as pattern('/[^\\s]+.*[^\\s]+/'), then in Java
it is generated as
@Pattern(regexp = "[^\\\\s]+.*[^\\\\s]+")
Notice that the Java class has four backslashes where there should be only two. Because of this, the functionality fails.
It looks to me like you are trying to use regex control characters in your pattern, which do not need to be doubled up in your JDL: see https://www.jhipster.tech/jdl/entities-fields, especially the part under "Regular Expressions" where it says: "/.../ the pattern is declared inside two slashes... \ anti-slashes needn’t be escaped"
So it's acting correctly. Since you have double backslashes in your JDL, the generator correctly writes them as quadruple backslashes in the Java string literal. Your solution is just to use single backslashes in your JDL, as per the documentation.
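A small Python sketch may help show why the doubled backslash breaks validation. Python's re module stands in for java.util.regex here (an assumption made for illustration; the two agree on these particular constructs), and the names INTENDED, DOUBLED, and full_match are invented for this example. The regex text [^\s] means "any non-whitespace character", while [^\\s] means "any character other than a backslash or the letter s".

```python
import re

# Regex text the validator sees when the JDL uses single backslashes:
# Java literal "[^\\s]+.*[^\\s]+"
INTENDED = r"[^\s]+.*[^\s]+"

# Regex text the validator sees when the JDL uses doubled backslashes:
# Java literal "[^\\\\s]+.*[^\\\\s]+"
DOUBLED = r"[^\\s]+.*[^\\s]+"


def full_match(pattern: str, value: str) -> bool:
    """Whole-string match, mirroring how a pattern constraint checks a value."""
    return re.fullmatch(pattern, value) is not None


# "sales" has no leading or trailing whitespace, so the intended pattern
# accepts it; the doubled pattern rejects it because it starts with 's'.
print(full_match(INTENDED, "sales"), full_match(DOUBLED, "sales"))

# " ab" starts with a space, so the intended pattern rejects it; the
# doubled pattern accepts it because a space is neither '\' nor 's'.
print(full_match(INTENDED, " ab"), full_match(DOUBLED, " ab"))
```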

XML schema restriction pattern for not allowing specific string

I need to write an XSD schema with a restriction on a field, to ensure that
the value of the field does not contain the substring FILENAME at any location.
For example, all of the following must be invalid:
FILENAME
ORIGINFILENAME
FILENAMETEST
123FILENAME456
None of these values should be valid.
In a regular expression language that supports negative lookahead, I could do this by writing /^((?!FILENAME).)*$/, but the XSD pattern language does not support negative lookahead.
How can I implement an XSD pattern restriction with the same effect as /^((?!FILENAME).)*$/ ?
I need to use pattern, because I don't have access to XSD 1.1 assertions, which are the other obvious possibility.
The question XSD restriction that negates a matching string covers a similar case, but in that case the forbidden string is forbidden only as a prefix, which makes checking the constraint easier. How can the solution there be extended to cover the case where we have to check all locations within the input string, and not just the beginning?
OK, the OP has persuaded me that while the other question mentioned has an overlapping topic, the fact that the forbidden string is forbidden at all locations, not just as a prefix, complicates things enough to require a separate answer, at least for the XSD 1.0 case. (I started to add this answer as an addendum to my answer to the other question, and it grew too large.)
There are two approaches one can use here.
First, in XSD 1.1, a simple assertion of the form
not(matches($value, 'FILENAME'))
ought to do the job.
Second, if one is forced to work with an XSD 1.0 processor, one needs a pattern that will match all and only strings that don't contain the forbidden substring (here 'FILENAME').
One way to do this is to ensure that the character 'F' never occurs in the input. That's too drastic, but it does do the job: strings not containing the first character of the forbidden string do not contain the forbidden string.
But what of strings that do contain an occurrence of 'F'? They are fine, as long as no 'F' is followed by the string 'ILENAME'.
Putting that last point more abstractly, we can say that any acceptable string (any string that doesn't contain the string 'FILENAME') can be divided into two parts:
1. a prefix which contains no occurrences of the character 'F'
2. zero or more occurrences of 'F' followed by a string that doesn't match 'ILENAME' and doesn't contain any 'F'.
The prefix is easy to match: [^F]*.
The strings that start with F but don't match 'FILENAME' are a bit more complicated; just as we don't want to outlaw all occurrences of 'F', we also don't want to outlaw 'FI', 'FIL', etc. -- but each occurrence of such a dangerous string must be followed either by the end of the string, or by a letter that doesn't match the next letter of the forbidden string, or by another 'F' which begins another region we need to test. So for each proper prefix of the forbidden string, we create a regular expression of the form
$prefix || '([^F' || next-character-in-forbidden-string || ']'
|| '[^F]*)?'
Then we join all of those regular expressions with or-bars.
The end result in this case is something like the following (I have inserted newlines here and there, to make it easier to read; before use, they will need to be taken back out):
[^F]*
((F([^FI][^F]*)?)
|(FI([^FL][^F]*)?)
|(FIL([^FE][^F]*)?)
|(FILE([^FN][^F]*)?)
|(FILEN([^FA][^F]*)?)
|(FILENA([^FM][^F]*)?)
|(FILENAM([^FE][^F]*)?))*
Two points to bear in mind:
1. XSD regular expressions are implicitly anchored; testing this with a non-anchored regular expression evaluator will not produce the correct results.
2. It may not be obvious at first why the alternatives in the choice all end with [^F]* instead of .*. Thinking about the string 'FEEFIFILENAME' may help. We have to check every occurrence of 'F' to make sure it's not followed by 'ILENAME'.
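The construction described above can be sketched in Python and checked against a brute-force substring test. This is an illustration, not part of the original answer: the function names are invented, Python's re.fullmatch is used to imitate XSD's implicit anchoring, and the sketch deliberately refuses forbidden strings whose first character recurs later in the string, because the decomposition relies on that character marking the start of each region to test.

```python
import re
from itertools import product


def pattern_without(forbidden: str) -> str:
    """Build a lookahead-free pattern matching strings that lack `forbidden`.

    Assumptions of this sketch: `forbidden` is alphanumeric, at least two
    characters long, and its first character does not recur within it.
    """
    if (len(forbidden) < 2 or not forbidden.isalnum()
            or forbidden[0] in forbidden[1:]):
        raise ValueError("unsupported forbidden string for this sketch")
    first = forbidden[0]
    alternatives = []
    # One alternative per proper prefix: the prefix, optionally followed by
    # a character that is neither `first` nor the next forbidden character,
    # and then any run of characters other than `first`.
    for k in range(1, len(forbidden)):
        prefix, nxt = forbidden[:k], forbidden[k]
        alternatives.append(f"({prefix}([^{first}{nxt}][^{first}]*)?)")
    return f"[^{first}]*(" + "|".join(alternatives) + ")*"


def is_valid(value: str, forbidden: str = "FILENAME") -> bool:
    """Anchored match, as an XSD pattern facet would apply it."""
    return re.fullmatch(pattern_without(forbidden), value) is not None


def brute_force_agrees(forbidden: str, alphabet: str, max_len: int) -> bool:
    """Compare the pattern with a plain substring test on every short string."""
    regex = re.compile(pattern_without(forbidden))
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if (regex.fullmatch(s) is not None) != (forbidden not in s):
                return False
    return True


for sample in ["FILENAME", "123FILENAME456", "FEEFIFILENAME", "FEEFIFILENAM"]:
    print(sample, is_valid(sample))
```

Running brute_force_agrees("FAB", "FABX", 6) exercises the same construction on a shorter forbidden string, where every candidate up to six characters can be enumerated quickly.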

Jape grammar to identify product release

How can I use an AND operation in a JAPE grammar? I just want to check whether a sentence contains 'organisation', 'jobtitle', and 'person' all together, in any order. How is that possible? The '|' (OR) operation is allowed, but I didn't see any documentation about an AND operation.
There isn't an "and" operator like that as such but you could do it with a set of contains checks:
Rule: OrgTitlePer
({Sentence contains {Organization},
Sentence contains {JobTitle},
Sentence contains {Person}}):sent
-->
:sent.Interesting = {}
When you have several constraints within the same set of braces that involve the same annotation type on the left (Sentence in this case) then all the constraints must be satisfied simultaneously by the same annotation.

Prefix namespace in RDF

I have this RDF statement (turtle format):
@prefix cd: <http://mai.com/contactwrapper/0.1#> .
<http://mai.com/contactwrapper/0.1#malzaa@m.com>
    cd:Belongs_To "1"^^xmls:string ;
    cd:Email_Address "malzaa@m.com"^^xmls:string ;
    cd:Email_Type "WORK"^^xmls:string .
As you can see, the prefix worked with the properties (Belongs_To, Email_Address, and Email_Type) but didn't work with the name of the resource (malzaa@m.com), where "http://mai.com/contactwrapper/0.1#" should have been replaced by cd:.
Could anyone please explain what's wrong with that?
Thank you
The abbreviated form is often called a QName (which stands for "qualified name"). The reason cd:malzaa@m.com does not work as a QName is the @ and the . characters in the part after the colon. Turtle syntax does not allow these characters in a QName, which is why the full URI is used instead.
See the Turtle grammar for an overview of what characters are allowed in a QName.
As an aside: your Turtle fragment does not declare the xmls: namespace either (which you use for your literal datatypes), so it will fail to parse.
As Jeen says, "@" is not allowed in a prefixed name in Turtle, even though prefixed names are broader than QNames.
In RDF 1.1, the Turtle language is being formally standardized. "@" is not legal in the local part of prefixed names, but "\@" is.
The latest grammar is: http://www.w3.org/TR/turtle/#sec-grammar-grammar
There are many parsers that accept the traditional Turtle. Jena writers are conservative: they output legal RDF in a way that maximises the chances of it being readable by another parser. Writing in full <..> form or using a prefixed name does not change the URI being written, only its surface appearance.
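As a rough illustration of the backslash escapes mentioned above, the following Python sketch escapes a local name so that characters such as "@" or "#" can appear in a prefixed name under the RDF 1.1 Turtle grammar. The function name escape_local_name is hypothetical, the sketch handles ASCII input only, and it is not how Jena or any other serializer is implemented; such tools may simply fall back to the full <...> form.

```python
# Characters that the RDF 1.1 Turtle grammar allows after a backslash in a
# local name (the PN_LOCAL_ESC production).
ESCAPABLE = set("_~.-!$&'()*+,;=/?#@%")


def escape_local_name(local: str) -> str:
    """Escape an ASCII local name for use after the prefix and colon.

    Letters, digits, '_' and ':' pass through unchanged. A '-' is left
    alone except at the start, and a '.' is left alone except at the
    start or end. Other characters from ESCAPABLE get a backslash; any
    remaining character is rejected by this sketch.
    """
    out = []
    last = len(local) - 1
    for i, ch in enumerate(local):
        if (ch.isascii() and ch.isalnum()) or ch in "_:":
            out.append(ch)
        elif ch == "-" and i > 0:
            out.append(ch)
        elif ch == "." and 0 < i < last:
            out.append(ch)
        elif ch in ESCAPABLE:
            out.append("\\" + ch)
        else:
            raise ValueError(f"cannot escape {ch!r} in a local name")
    return "".join(out)


# An e-mail-like local name needs only the '@' escaped.
print("cd:" + escape_local_name("malzaa@m.com"))
```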
