ISO 3166-1 alpha-2 comma separated values - iso

http://www.iso.org/iso/country_names_and_code_elements has "VIRGIN ISLANDS, BRITISH". I would've thought the country is called "British Virgin Islands". Am I looking in the wrong place for country names? And yes, this is a programming question: I want to use it in Drupal core.
Edit: and no, the question is not just about that single country name. Every country name containing a comma seems to have a similar problem. I need an ISO source :/

ISO 3166 has no concept of "common name" for a country. It only carries the following names:
English (or French) short name
English (or French) full name
The short name will be something like:
Central African Republic (the)
Virgin Islands, British
The long name will be something like:
The Republic of Equatorial Guinea
People's Republic of China
Both the short and long names are somewhat useless in a user interface. That's why the Debian project maintains a 'common name' column in its ISO 3166 database. It fixes some of the blatant issues, but definitely not all of them (especially not the Virgin Islands, British entry).
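If the goal is only to make the inverted ISO short names readable, one rough option is to strip the trailing "(the)" and move the part after the comma to the front. A minimal sketch follows; the function name `display_name` and the `NOT_INVERTED` exception set are made up for illustration, and the rule is a heuristic, not something ISO 3166 defines.

```python
import re

# Hypothetical exception list: names where the comma is part of the name
# itself rather than an inversion. It would need to be maintained by hand.
NOT_INVERTED = {"Bonaire, Sint Eustatius and Saba"}


def display_name(short_name: str) -> str:
    """Turn an ISO-style short name into something closer to a UI label."""
    # Drop a trailing "(the)" marker, e.g. "Central African Republic (the)".
    name = re.sub(r"\s*\(the\)\s*$", "", short_name.strip(), flags=re.IGNORECASE)
    if name in NOT_INVERTED or "," not in name:
        return name
    # "Virgin Islands, British" -> head="Virgin Islands", tail="British"
    head, tail = name.split(",", 1)
    return f"{tail.strip()} {head.strip()}"


print(display_name("Virgin Islands, British"))
print(display_name("Central African Republic (the)"))
```

This still leaves results such as "Republic of Korea" that may not be what users expect, so a hand-maintained override table is probably needed on top of it.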

From Wikipedia:
"The official name of the British Virgin Islands is simply the Virgin Islands, and the official name of the United States Virgin Islands is the Virgin Islands of the United States."
Based on that, I would be tempted to drop the word British altogether and provide the US entry alongside whatever list you're putting it into. That way when locals scroll a-z to find "virgin islands" both entries are there, and also both entries are "correct" based on current naming conventions. Assuming Wikipedia is correct, of course.
Although I suspect locals would probably identify with being from the "Virgin Islands" first and British/American second, I have no cultural background on that region that would make me an expert on this particular issue, so if someone has a better answer I'd be happy to hear it.

You can get a country code to name map at http://country.io/names.json, and it has sensible names:
CF: "Central African Republic",
VG: "British Virgin Islands",
VI: "U.S. Virgin Islands",


Problem creating a Hispanic person in Dialogflow

So,
Due to cultural differences, people in Hispanic countries have quite a number of surnames.
Taking someone else's surname isn't the norm; in most cases you just combine your surnames:
1st husband, 1st bride, 2nd husband, 2nd bride, 3rd husband, 3rd bride, 4th husband, 4th bride.
You have to add a second surname to get Spanish nationality, and some people just repeat their last name because they refuse to understand how culturally important this is in Spain.
Athletic Bilbao can get away with saying all of their players have Basque origins if they start tracing back the multiple surnames, and the club has been known to do so, approaching foreign players who have a Basque surname somewhere in the never-ending list to ask if they would be interested in joining.
This can be quite problematic in some cases, but it makes it easy to differentiate people:
There can be a large number of Thomas Smiths in your city, but there are hardly ever two Thomas Smith matchingCommonSecondSurname in the same area.
Because of this, people in Hispanic countries are used to giving at least two of their surnames unless their name is unique enough.
On to my issue:
My Dialogflow agent asks someone to identify themselves in order to provide some extra information to the business.
I have added multiple examples with several surnames. The training process identifies them correctly, but in actual conversation the agent struggles with them, picking either the second surname as the full person or the person's first surname as the entity, never the full thing.
Neither option is valid in a Hispanic country, which is where I would be using this solution.
Anything I can do to improve this?
Creating a custom entity for a person seems like an arduous task to me.
It is not vital and I could do without the extra tidbit as I am storing their email already. It just seems like a basic thing that should be doable and I am struggling to believe I am the first person to face this issue.

About first national top level domains in Europe

I have read that some of the first TLDs were registered back in the 90s, including .cz, .pl and others. So was the .SU domain. Those were domains for national needs.
But who has the right to become the maintainer of a national domain? What does that procedure look like?
I also read that the .SU TLD was proposed by a Finnish student. But how can a student register a national domain that is supposed to represent a country?
I couldn't find information about that on Google.
You can find all data on the IANA webpage at https://www.iana.org/domains/root/db or just query it with whois.
.CZ is listed as created on 1993-01-12 and .PL on 1990-07-30
Some go back as far as 1985, like .UK or .US.
.SU has had a complicated life because, as a ccTLD, it should no longer exist, since the country it represented no longer exists. However, for non-technical reasons, it persists. You can find some discussion here: https://www.icann.org/news/announcement-2-2006-12-05-en
But who has the right to become the maintainer of a national domain?
This is a complicated question, and not a technical one nor a programming one.
In short, IANA uses the ISO list of country codes (with some exceptions, like .UK and .EU) and takes input from the relevant government. The problem is that some countries are not stable, and they also change, so there are a lot of complicated cases. Some ccTLDs are also marketed as non-ccTLDs (like .CO or .TV) because the government decided to hand management to an external company under some financial agreement.
"Mistakes" also happen; see for example https://medium.com/@Oskar456/stolen-sk-domain-717e070f6735
You can find more about the IANA process at https://www.iana.org/help/cctld-delegation
Each IANA decision to delegate a ccTLD to a country is associated with an "IANA report" listing the justifications. You can read them for whatever country you wish at https://www.iana.org/reports, like a recent one for .TD at https://www.iana.org/reports/2018/td-report-20180227.html
The core business was codified, before ICANN even existed, in https://www.rfc-editor.org/rfc/rfc1591
IANA adheres to that, and you can find further documentation at https://www.iana.org/domains/root/help
For more details in general, I would recommend reading my extensive reply to a related question about TLDs and wars: https://superuser.com/questions/1332236/what-happens-to-country-specific-tlds-in-a-war-involving-that-country/1332238#1332238

identity vs appositive coreference

What is the difference between identity coreference and appositive coreference?
In the following sentence for example:
Mohammad traveled to Washington last week. He was on leave of absence. The 30-year old man stayed in an hotel overlooking the National Mall.
As I understand it, there is an identity coreference between "Mohammad" and "he". Is there an appositive coreference between "he" and "the 30-year old man"? Or between "Mohammad" and "the 30-year old man"?
An appositive is a noun or noun phrase that renames another noun right beside it. An appositive can be a long or short combination of words. Here is a small example:
This impressive detective, Sherlock Holmes, always solves all kinds of problems.
So there is no appositive coreference between "he" and "the 30-year old man", or between "Mohammad" and "the 30-year old man", because in neither pair do the two phrases sit right beside each other.

Determining customary distance unit from ISO 3166 country code

ISO 3166 defines country codes such as GB, US, FR or RU.
I would like a reasonably definitive association from these country codes to the customary unit of measure for distances between places in those countries.
Specifically on iOS and OS X, the country code can be retrieved from NSLocale:
[[NSLocale currentLocale] objectForKey: NSLocaleCountryCode];
NSLocale also provides a way to see if a country uses metric or non metric units:
const bool useMetric = [[[NSLocale currentLocale] objectForKey: NSLocaleUsesMetricSystem] boolValue];
However, this is not sufficient. For example, in Great Britain (GB) the metric system is widely used, but distances between places continue to be officially measured in miles rather than kilometres.
I also faced this problem :-)
Countries which use the metric system but still use miles:
1. GB is the only exception, still using miles instead of kilometres.
Note: Canada has also started using km for road transport, although Canada still uses miles for train and horse transport.
Countries which do not use the metric system:
Liberia, Myanmar and United States of America.
Note: Myanmar (formerly Burma) is planning to move to the metric system. Currently, Myanmar uses its own system, different from both imperial and metric.
In my app, I check whether the country uses imperial or metric.
if (metric) then assign km for all countries except Britain
if (imperial) then assign miles for all countries except Burma
if Burma then assign the Burmese unit
if Britain then assign miles
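The rule above can be sketched as a small function. This is only an illustration of the logic as described: the name `distance_unit`, the `uses_metric` parameter (standing in for whatever the platform reports, such as NSLocaleUsesMetricSystem) and the "local" label for Myanmar's own units are all assumptions.

```python
def distance_unit(country_code: str, uses_metric: bool) -> str:
    """Pick a distance unit from an ISO 3166-1 alpha-2 code and a metric flag."""
    code = country_code.upper()
    if code == "GB":
        # Metric country, but distances between places are given in miles.
        return "mi"
    if code == "MM":
        # Myanmar is described above as using its own system; "local" is a
        # placeholder label, not a real unit name.
        return "local"
    return "km" if uses_metric else "mi"


print(distance_unit("GB", True))
print(distance_unit("FR", True))
print(distance_unit("US", False))
```

Checking the two special cases before the metric flag keeps the result independent of what the platform reports for those countries.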
A chart showing countries using miles per hour for road speeds is available. It cites Wikipedia's article on miles per hour as its source, which has the following to say:
These include roads in the United Kingdom,[1] the United States,[2] and UK and US territories; American Samoa,[3] the Bahamas,[4] Belize,[5] British Virgin Islands,[6] the Cayman Islands,[7] Dominica,[8] the Falkland Islands,[9] Grenada,[10] Guam,[11] Burma,[12] The N. Mariana Islands,[13] Samoa,[14] St. Lucia,[15] St. Vincent & The Grenadines,[16] St. Helena,[17] St. Kitts & Nevis,[18] Turks & Caicos Islands,[19] the U.S. Virgin Islands,[20][21] Antigua & Barbuda (although km are used for distance),[22] and Puerto Rico (same as former).[22]
I don't see a way to download this as data keyed by ISO 3166 country code, but it's not a huge task to compile one.
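As a starting point, the quoted list can be transcribed by hand into a set of ISO 3166-1 alpha-2 codes. The sketch below is an assumption-laden illustration: the codes were mapped manually from the names in the quote, the quote is about speed limits and not necessarily about distances, and the names `MPH_COUNTRIES`, `KM_DISTANCE_EXCEPTIONS` and `uses_miles_for_distance` are invented here.

```python
# Codes for the places named in the quoted passage (hand-transcribed).
MPH_COUNTRIES = frozenset({
    "GB", "US", "AS", "BS", "BZ", "VG", "KY", "DM", "FK", "GD", "GU",
    "MM", "MP", "WS", "LC", "VC", "SH", "KN", "TC", "VI", "AG", "PR",
})

# The quote says Antigua & Barbuda and Puerto Rico use km for distance.
KM_DISTANCE_EXCEPTIONS = frozenset({"AG", "PR"})


def uses_miles_for_distance(country_code: str) -> bool:
    """True if the quoted list suggests miles are customary for distances."""
    code = country_code.upper()
    return code in MPH_COUNTRIES and code not in KM_DISTANCE_EXCEPTIONS


print(uses_miles_for_distance("GB"))
print(uses_miles_for_distance("FR"))
```

Any code not in the set falls back to kilometres, which matches the idea that miles are the exception.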
I'll leave this answer unaccepted in case a better suggestion is available.
Officially, road distances in the UK are in kilometres, but road signs are in miles. Confusing? Yes! When a road engineer gets a plan of a road, everything is in kilometres, and government statistics are in kilometres, but road signs and car odometers are in miles. See https://en.wikipedia.org/wiki/Driver_location_sign for more info.

How to find if a word in a sentence is pointing to a city

I live in San Francisco
I work in San Jose
I was born in New York
Is there a way to find that "San Francisco" is a city in the above sentences?
The task of recognising possibly multi-word expressions that reference individuals of various specific types (locations, but also organisations, dates, etc.) is called named-entity recognition (NER).
For a simple task such as yours, existing freely available tools and models are sufficient. You could try the Stanford Named Entity Recognizer, which is free software. Try analysing your sentence using their online demo.
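If the set of cities is known in advance, a plain gazetteer lookup can serve as a baseline before reaching for a trained model. To be clear, this is a much simpler technique than statistical NER and is not how the Stanford tool works; the `CITIES` set and the `find_cities` function are hypothetical and only cover the example sentences.

```python
# Hypothetical gazetteer; a real one would come from a geographic database.
CITIES = {"san francisco", "san jose", "new york"}


def find_cities(sentence: str, cities=CITIES) -> list:
    """Return city names found in the sentence, preferring the longest match."""
    words = sentence.split()
    max_len = max(len(city.split()) for city in cities)
    found = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase starting at word i first.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n]).strip(".,!?")
            if candidate.lower() in cities:
                found.append(candidate)
                i += n
                break
        else:
            # No city starts at this word; move on.
            i += 1
    return found


print(find_cities("I live in San Francisco"))
```

A lookup like this cannot tell "Paris" the city from "Paris" the person, which is the kind of ambiguity a trained NER model is meant to handle.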
