What kind of normalization is used by Swift string comparisons?

Elsewhere I've read that Swift's comparisons use NFD normalization.
However, running in the iSwift playground, I've found that
print("\u{0071}\u{0307}\u{0323}" == "\u{0071}\u{0323}\u{0307}");
gives false, despite this being an example of "Canonical Equivalence" taken straight from the Unicode standard, which Swift's documentation claims to follow.
So, what kind of canonicalization is performed by Swift, and is this a bug?

It seems that this was a bug in Swift that has since been fixed. With Swift 3 and Xcode 8.0,
print("\u{0071}\u{0307}\u{0323}" == "\u{0071}\u{0323}\u{0307}")
now prints true.
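For anyone wanting to check the canonical-equivalence claim outside Swift, here is a minimal sketch in Python using the standard unicodedata module. It only shows that the two sequences normalize to the same string under both NFD and NFC, so it confirms they are canonically equivalent but cannot tell which form Swift uses internally.

```python
import unicodedata

a = "\u0071\u0307\u0323"  # q, COMBINING DOT ABOVE, COMBINING DOT BELOW
b = "\u0071\u0323\u0307"  # q, COMBINING DOT BELOW, COMBINING DOT ABOVE

# Python compares code points one by one, so the raw strings differ.
raw_equal = a == b

# Canonical ordering sorts combining marks by combining class
# (U+0323 has class 220, U+0307 has class 230), so both normalize alike.
nfd_equal = unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)
nfc_equal = unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(raw_equal, nfd_equal, nfc_equal)  # False True True
```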

Related

How to understand the some("a value") in Swift 4.1

Here is the screenshot of my Xcode Playground:
As you can see, the str is printed as some("Hello"). This really confuses me as there seems to be no documentation on it.
Does anyone have a good explanation for this some()?
System info:
swift -version: 4.1.2
Xcode: 9.4.1

This appears to be a quirk in print for this compiler. Purely as conjecture, it may be an artefact of the work on changing the semantics of implicitly unwrapped optionals; see Abolish ImplicitlyUnwrappedOptional type.
The type Optional is, stripping to the basics, defined as:
enum Optional<Wrapped>
{
    case none
    case some(Wrapped)
}
Normally, if you print() an enum you get the case names, here none and some(); however, print() special-cases optionals and prints them as nil and Optional().
It seems in Xcode 9.4.1 (at least) implicitly unwrapped optionals are being printed as optionals but without the special casing, whereas Xcode 9.2 (at least) prints the unwrapped value as would be expected (because it is implicitly unwrapped).
Maybe there is other interesting behaviour for implicitly unwrapped optionals in 9.4.1. You should test in Xcode 10 Beta and/or report a bug (bugreport.apple.com) in 9.4.1 and see what Apple say.

Groovy - strange Collection#intersect behaviour

I have code like this:
def a1 = [[1],[2],[3]]
def a2 = [[2],[3],[4]]
a1.intersect(a2)
and as a result got:
[]
After some research I found out that the elements of the lists must be instances of Comparable. In DefaultGroovyMethods we can see the implementation of the intersect method. The first thing I noticed was the collection (a TreeSet) used to check whether objects exist in one of our lists (by the way, if a HashSet is used it works fine).
I checked NumberAwareComparator: there are two paths in its comparison method. The first delegates the comparison to another class (with a swallowed exception?!) and the second compares hashCodes.
The first path, via DefaultTypeTransformation, explains the behavior.
We can see that only Comparable objects are allowed to be compared; otherwise we get an exception that is swallowed later.
My question is: why is it like that? The documentation seems to lack information about it (or am I wrong?). Did I miss something?

Can't find it documented.
Feels like a great pull request contribution to the project on github with a change to the existing documentation/javadoc to make this more explicit?
The elements have to be comparable, as otherwise you can't compare them to check for an intersection, but you're right the documentation isn't explicit.
You could write your own implementation like so:
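Collection.metaClass.equalityIntersect = { Collection other ->
    // any() rather than find(): find() returns the matched element itself,
    // which Groovy treats as false for values such as 0, null or [].
    delegate.findAll { a -> other.any { it == a } }
}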
So that a1.equalityIntersect(a2) == [[2], [3]]
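The same idea can be sketched in Python, where lists are unhashable and a set-based intersection is likewise unavailable for lists of lists. The function name equality_intersect is just a hypothetical mirror of the Groovy one above; it relies only on == and is quadratic, which is fine for small inputs.

```python
def equality_intersect(left, right):
    """Keep the elements of `left` that are equal (==) to some element of `right`."""
    return [a for a in left if any(a == b for b in right)]

a1 = [[1], [2], [3]]
a2 = [[2], [3], [4]]
print(equality_intersect(a1, a2))  # [[2], [3]]
```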

This behavior was introduced somewhere down the line. I haven't pinpointed where, maybe 2.4.2 as per this commit. It used to work in 2.2.1 and 2.4.0 and is broken in 2.4.6... but it's fixed in 2.4.7.
$ groovy -v
Groovy Version: 2.4.7 JVM: 1.8.0_92 Vendor: Oracle Corporation OS: Mac OS X
$ groovy intersect.groovy
[[2], [3]]
How such a breaking change could roll out in a release is a mystery to me. Lack of testing?

Delphi: Upgrade from 6 to XE2 - TStringList

We have to upgrade to XE2 (from Delphi 6).
I have collected a lot of information about this, but one point isn't clear to me.
We are using String, which is AnsiString in XE.
As far as I know, we must switch everything in our libraries to (P)Ansi[String/Char] to avoid the side effects of Unicode conversions and to get our projects to compile.
That is fine, but we are also using TStringList, and I haven't found any TAnsiStringList class to simply swap it for... ;-)
What do you know about this? Can this cause problems too? Or does this class have an option to preserve the strings?
(OK, it looks like three questions, but it is really only one.)
The program / OS language is Hungarian and the charset is WIN-1250, which has some unusual characters, like Ő and Ű...
Thanks for any information, links, etc.

1) First of all, why should you use an AnsiStringList rather than converting your whole project to the Unicode-aware TStringList? There would need to be specific, detailed reasons before viable alternatives could be suggested.
Unicode is a superset of windows-1250, windows-1251 and such.
Normally all your locale-specific strings would just be losslessly converted to Unicode. It is the opposite conversion, Unicode to AnsiString, that may lose data,
whether explicit or implicit (like the AnsiChar reduction in "if char-var in char-set").
You may have type-unsafe APIs, as in DLLs, where the compiler cannot check whether you pass PChar or PAnsiChar, but you should not pass objects like TStrings into DLLs anyway; there are BPLs for that.
So you probably just do not need TAnsiStringList.
2) You can take TJclAnsiStringList from the Jedi Code Library.
3) You can use the stock XE2 TList<AnsiString> type.
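The point made in 1), that Windows-1250 text converts to Unicode without loss while the reverse direction can lose data, can be illustrated outside Delphi. This is a minimal sketch in Python using its built-in cp1250 codec; the sample strings are my own, and the "?" substitution comes from Python's errors="replace" handler, not from what Delphi's conversion routines would produce.

```python
text = "árvíztűrő tükörfúrógép ŐŰ"  # Hungarian sample with Ő and Ű

# Windows-1250 bytes -> Unicode -> Windows-1250 bytes: nothing is lost.
ansi_bytes = text.encode("cp1250")
round_trip = ansi_bytes.decode("cp1250").encode("cp1250")

# Unicode -> Windows-1250 can lose data: characters outside the code page
# have no byte to map to and are replaced here by "?".
mixed = "Ő and 日本"
lossy = mixed.encode("cp1250", errors="replace").decode("cp1250")

print(round_trip == ansi_bytes)  # True
print(lossy)                     # Ő and ??
```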

What's the name for hyphen-separated case?

This is PascalCase: SomeSymbol
This is camelCase: someSymbol
This is snake_case: some_symbol
So my question is whether there is a widely accepted name for this: some-symbol? It's commonly used in URLs.

There isn't really a standard name for this case convention, and there is disagreement over what it should be called.
That said, as of 2019, there is a strong case to be made that kebab-case is winning:
https://trends.google.com/trends/explore?date=all&q=kebab-case,spinal-case,lisp-case,dash-case,caterpillar-case
spinal-case is a distant second, and no other terms have any traction at all.
Additionally, kebab-case has entered the lexicon of several JavaScript code libraries, e.g.:
https://lodash.com/docs/#kebabCase
https://www.npmjs.com/package/kebab-case
https://v2.vuejs.org/v2/guide/components-props.html#Prop-Casing-camelCase-vs-kebab-case
However, there are still other terms that people use. Lisp has used this convention for decades as described in this Wikipedia entry, so some people have described it as lisp-case. Some other forms I've seen include caterpillar-case, dash-case, and hyphen-case, but none of these is standard.
So the answer to your question is: No, there isn't a single widely-accepted name for this case convention analogous to snake_case or camelCase, which are widely-accepted.

It's referred to as kebab-case. See lodash docs.

It's also sometimes known as caterpillar-case

This is the most famous case, and it has many names:
kebab-case: the name most adopted by official software
caterpillar-case
dash-case
hyphen-case or hyphenated-case
lisp-case
spinal-case
css-case
slug-case
friendly-url-case

As the character (-) is referred to as "hyphen" or "dash", it seems more natural to name this "dash-case", or "hyphen-case" (less frequently used).
As mentioned in Wikipedia, "kebab-case" is also used. Apparently (see answer) this is because the character would look like a skewer... It needs some imagination though.
It is used in the lodash library, for example.
Recently, "dash-case" was used by
Angular (https://angular.io/guide/glossary#case-types)
NPM modules
https://www.npmjs.com/package/case-dash (removed ?)
https://www.npmjs.com/package/dasherize

Adding the correct link here: Kebab Case,
which is all lowercase with - separating words.

I've always called it, and heard it be called, 'dashcase.'

There is no standardized name.
Libraries like jQuery and lodash refer to it as kebab-case. So does the Vue.js JavaScript framework. However, I am not sure whether it's safe to declare that it's referred to as kebab-case throughout the JavaScript world.

I've always known it as kebab-case.
On a funny note, I've heard people call it a SCREAM-KEBAB when all the letters are capitalized.
Kebab Case Warning
I've always liked kebab-case as it seems the most readable when you need whitespace. However, some programs interpret the dash as a minus sign, and it can cause problems as what you think is a name turns into a subtraction operation.
first-second // first minus second?
ten-2 // ten minus two?
Also, some frameworks parse dashes in kebab-cased names. For example, GitHub Pages uses Jekyll, and Jekyll parses any dashes it finds in an md file name. For example, a file named 2020-1-2-homepage.md on GitHub Pages gets put into a folder structured as \2020\1\2\homepage.html when the site is compiled.
Snake_case vs kebab-case
A safer alternative to kebab-case is snake_case, or SCREAMING_SNAKE_CASE, as underscores cause less confusion when compared to a minus sign.
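The minus-sign problem described above is easy to see in a language where the hyphen is an operator. A small sketch in Python (chosen only as an example of such a language), using the standard ast module to show how the parser reads a hyphenated name:

```python
import ast

# A hyphenated name is not a valid Python identifier; an underscored one is.
print("first-second".isidentifier())  # False
print("first_second".isidentifier())  # True

# The parser reads the hyphenated form as a subtraction of two names.
tree = ast.parse("first-second", mode="eval")
is_subtraction = isinstance(tree.body, ast.BinOp) and isinstance(tree.body.op, ast.Sub)
print(is_subtraction)  # True
```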

I'd simply say that it was hyphenated.

Worth mentioning, from vim-abolish:
https://github.com/tpope/vim-abolish/blob/master/doc/abolish.txt#L152
dash-case or kebab-case

In Salesforce, it is referred to as kebab-case. See below:
https://developer.salesforce.com/docs/component-library/documentation/lwc/lwc.js_props_names

Here is a more recent discombobulation. Documentation for AngularJS, Pluralsight courses, and books on Angular all refer to kebab-case as snake-case, without differentiating between the two.
It's too bad caterpillar-case did not stick, because snake_case and caterpillar-case are easily remembered and actually look like what they represent (if you have a good imagination).

My ECMAScript proposal for String.prototype.toKebabCase.
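String.prototype.toKebabCase = function () {
  // Treat existing hyphens as spaces, then put a space before every character
  // that equals its own upper-case form (capitals, but also digits and symbols).
  return this.valueOf().replace(/-/g, ' ').split('')
    .reduce((str, char) => char.toUpperCase() === char ?
      `${str} ${char}` :
      `${str}${char}`, ''
    )
    // Collapse runs of spaces, trim, hyphenate, and lower-case.
    .replace(/ * /g, ' ').trim().replace(/ /g, '-').toLowerCase();
}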
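For comparison, here is a regex-based sketch of the same conversion in Python. It is not part of the proposal above and behaves differently at the edges: it only splits at a lower-case-or-digit to upper-case boundary, so runs of capitals stay together, and it also accepts underscores and spaces as separators. The name to_kebab_case is my own.

```python
import re

def to_kebab_case(name):
    """Convert camelCase, PascalCase, snake_case or spaced text to kebab-case."""
    # Insert a hyphen at each lower/digit -> upper boundary: someSymbol -> some-Symbol.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", name)
    # Treat runs of underscores, whitespace and hyphens as a single separator.
    name = re.sub(r"[\s_\-]+", "-", name.strip())
    return name.lower()

for sample in ("SomeSymbol", "someSymbol", "some_symbol", "some symbol"):
    print(to_kebab_case(sample))  # some-symbol each time
```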

This casing can also be called a "slug", and the process of turning a phrase into it "slugify".
https://hexdocs.pm/slugify/Slug.html

Input string was not in a correct format when parse in multithreading

Can anybody explain this?
How is it possible for an exception to be thrown when parsing "55.01"? I use multithreading.
--edit--
but... sometimes it works
This really makes me sad ;(
I use .NET 4.0 and VS2010.
--edit 2---
OK, I made a little progress. When I do not use multithreading, everything works perfectly. But when I use multithreading, (probably) one of the threads throws a FormatException at the place shown in the picture.

It's possible the system is set for some culture that expects a comma as the decimal point.
From http://msdn.microsoft.com/en-us/library/fd84bdyt.aspx:
The s parameter is interpreted using the formatting information in a NumberFormatInfo object that is initialized for the current thread culture. For more information, see CurrentInfo. To parse a string using the formatting information of some other culture, call the Double.Parse(String, IFormatProvider) or Double.Parse(String, NumberStyles, IFormatProvider) method.
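In .NET the usual way to make the parse independent of the thread's culture is to pass one explicitly, e.g. Double.Parse(s, CultureInfo.InvariantCulture). To illustrate why only some threads fail, here is a toy model in Python. It is not the .NET API: _culture, parse_double and worker are hypothetical names, and the "culture" is reduced to a per-thread decimal separator.

```python
import threading

# Hypothetical stand-in for a per-thread culture: only the decimal separator.
_culture = threading.local()

def parse_double(text, decimal_sep=None):
    """Parse text with the given separator, or the current thread's if omitted."""
    sep = decimal_sep or getattr(_culture, "decimal_sep", ".")
    # Accept only digits around at most one separator (a leading sign is allowed).
    body = text[1:] if text[:1] in "+-" else text
    whole, _, frac = body.partition(sep)
    if not whole.isdigit() or (frac and not frac.isdigit()):
        raise ValueError(f"Input string {text!r} was not in a correct format")
    return float(text.replace(sep, "."))

results = {}

def worker(name, sep):
    _culture.decimal_sep = sep  # each thread sets its own "culture"
    try:
        results[name] = parse_double("55.01")
    except ValueError as exc:
        results[name] = exc
    # Explicit separator, in the spirit of passing an IFormatProvider.
    results[name + "-explicit"] = parse_double("55.01", decimal_sep=".")

threads = [threading.Thread(target=worker, args=("dot", ".")),
           threading.Thread(target=worker, args=("comma", ","))]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results["dot"])             # 55.01
print(repr(results["comma"]))     # a ValueError: the comma thread rejects "55.01"
print(results["comma-explicit"])  # 55.01
```

The thread whose separator is a comma fails on "55.01" while the other succeeds, and both succeed once the separator is passed explicitly.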
