What is the meaning of “StandardProductID” (ASIN and EAN) - amazon

I have read the answer to "What is the meaning of "StandardProductID" (ASIN) in Amazon Seller inventory?", but I want to know how to use the two identifiers (EAN and ASIN) together.
<StandardProductID>
<Type>ASIN</Type>
<Value>B000LQLG7E</Value>
</StandardProductID>
<StandardProductID>
<Type>EAN</Type>
<Value>9006900212131</Value>
</StandardProductID>
This is an example for ASIN and for EAN.
Thanks!

Amazon has different product identifiers that follow different international standards. You should usually not try to use both of them together as you are asking; the normal procedure is to supply one of them, and Amazon should be able to resolve the rest.
As the answer you linked says: you should use the ASIN if you have it, otherwise one of the other standards.
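As a minimal sketch of "supply one identifier", the snippet below builds a single StandardProductID element and prefers the ASIN when one is known. The helper name standard_product_id and the use of Python's xml.etree.ElementTree are assumptions made for illustration; only the element names and the two example values come from the question.

```python
import xml.etree.ElementTree as ET


def standard_product_id(asin=None, ean=None):
    """Build one <StandardProductID> element (hypothetical helper).

    Prefers the ASIN when it is known and falls back to the EAN otherwise.
    """
    if asin:
        id_type, value = "ASIN", asin
    elif ean:
        id_type, value = "EAN", ean
    else:
        raise ValueError("need at least one identifier")

    element = ET.Element("StandardProductID")
    ET.SubElement(element, "Type").text = id_type
    ET.SubElement(element, "Value").text = value
    return element


# Both identifiers are known here, so only the ASIN is emitted.
xml_text = ET.tostring(
    standard_product_id(asin="B000LQLG7E", ean="9006900212131"),
    encoding="unicode",
)
print(xml_text)
```

Called with only ean=..., the same helper would emit an EAN element instead.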
To learn what each of those means, have a look at this answer in Amazon help: What are UPCs, EANs, ISBNs and ASINs?
What are UPCs, EANs, ISBNs and ASINs?
UPC Universal Product Code (UPC) is a 12-digit bar code used
extensively for retail packaging in United States.
EAN The European Article Number (EAN) is a barcode standard, a 12- or
13-digit product identification code. Each EAN uniquely identifies
the product, manufacturer, and its attributes; typically, the EAN is
printed on a product label or packaging as a bar code. We require EAN
codes to improve quality of search results and the quality of the
catalog as a whole. You can obtain EANs from the manufacturer. If your
products do not have manufacturer EANs, and you need to buy EAN codes,
you should go directly to GS1 UK http://www.gs1uk.org
For example, the EAN of "Colgate Total 75 ml" is 4011200296908
ISBN The International Standard Book Number (ISBN) is a unique
commercial book identifier barcode. Each ISBN code identifies uniquely
a book. ISBN have either 10 or 13 digits. All ISBN assigned after 1
Jan 2007 have 13 digits. Typically, the ISBN is printed on the back
cover of the book.
For example, the ISBN code for J.K.Rowling's "Harry Potter and the
Deathly Hallows", Adult Edition, Paperback, UK edition is
978-0747595823, and this code identifies uniquely this book and
edition.
ASINs
Amazon Standard Identification Numbers (ASINs) are unique blocks of 10
letters and/or numbers that identify items. You can find the ASIN on
the item's product information page at Amazon.com. For books, the ASIN
is the same as the ISBN number, but for all other products a new ASIN
is created when the item is uploaded to our catalogue. You will find
an item's ASIN on the product detail page alongside further details
relating to the item, which may include information such as size,
number of pages (if it's a book) or number of discs (if it's a CD).
ASINs can be used to search for items in our catalogue. If you know
the ASIN or ISBN of the item you are looking for, simply type it into
the search box (which can be found near the top of most pages), hit
the "Go" button and, if the item is listed in our catalogue, it will
appear in your search results.
For example, the ASIN for Hasbro's "Monopoly" game is B00005N5PF.
For more information you can read about each standard separately:
UPC
EAN (now officially the International Article Number, "European" having been changed to "International", although the abbreviation EAN is still used)
ISBN
ASIN
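Before submitting an EAN it can help to check that it is well formed. The sketch below applies the standard GS1 check-digit rule for 13-digit codes (weights alternating 1 and 3 over the first twelve digits), which also covers ISBN-13. The function name ean13_is_valid is an assumption for illustration; the two sample codes are the ones quoted in the help text above.

```python
def ean13_is_valid(code: str) -> bool:
    """Return True if `code` is 13 ASCII digits with a correct check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(ch) for ch in code]
    # Weights alternate 1, 3, 1, 3, ... over the first 12 digits.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    # The 13th digit brings the weighted sum up to a multiple of 10.
    return (10 - total % 10) % 10 == digits[12]


print(ean13_is_valid("4011200296908"))  # EAN from the quoted help text
print(ean13_is_valid("9780747595823"))  # ISBN-13 from the quoted help text, hyphen removed
```

This only validates the format; it says nothing about whether Amazon's catalogue knows the code.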

Related

Identifying the gender and nationality of a list of names?

I have a list of names that I've extracted from articles, and I'm trying to guess demographic information about them (gender and nationality).
The list looks like:
Šefik Džaferović
Miloš Zeman
Abdel Fattah el-Sisi
სალომე ზურაბიშვილი
Michael D. Higgins
Maia Sandu
محمد السادس
Стево Пендаровски
with each list item including at least a first and second name.
Any advice on where to start?
You could get lists of names from different countries; most countries keep records of their most common first names, for example.
Once you have that data, you can set up a mapping between a name and the countries it is used in -- this will be a probability, as many names (most, probably) will occur in many countries, but will be more common in some than in others. For example, a lot of names of Turkish origin will be used in Germany, due to the sizable Turkish communities living there.
When you then get a name, you can consult that map, and get a likelihood for the nationality. If this is separate for first and last name, that might be more precise; but be aware that there is no absolute certainty.
With gender it would work the same way (helpfully, many lists of baby names are split by gender), but there are also some ambiguous names (Alex, Jan, Sam, Leslie, ...).
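The name-to-country mapping described above can be sketched as follows. The observations are toy data invented purely for illustration, and the function name nationality_likelihood is hypothetical; real per-country name lists would replace the hard-coded sample.

```python
from collections import Counter, defaultdict

# Toy (first name, country) observations, invented for illustration only.
observations = [
    ("Jan", "CZ"), ("Jan", "CZ"), ("Jan", "NL"), ("Jan", "DE"),
    ("Maia", "MD"), ("Maia", "GE"),
    ("Michael", "IE"), ("Michael", "US"), ("Michael", "US"),
]

# name -> Counter of how often the name was seen in each country
counts = defaultdict(Counter)
for name, country in observations:
    counts[name][country] += 1


def nationality_likelihood(name):
    """Return {country: probability} for a first name, or {} if unseen."""
    seen = counts.get(name)
    if not seen:
        return {}
    total = sum(seen.values())
    return {country: n / total for country, n in seen.items()}


print(nationality_likelihood("Jan"))
```

The same structure works for gender by replacing the country label with a gender label, and a separate table for surnames could be combined with the first-name result.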

REGEX for Netflix Viewing Activity Titles (TV Show vs Movie using 'Episode')

I have a CSV file containing Netflix viewing data for all users on an account (38k entries), which I am analyzing in Power BI.
There was no column for Movie/TV, which I needed, but it's clear that entries with the word 'Episode' in them are episodes of TV shows, Netflix series and so on, so I created a column in Power Query based on that. Here is an example of what I mean (other columns removed).
Title | ContentType
The Office (U.S.): Season 5: Heavy Competition (Episode 24) | TV
Forensic Files: Collection 1: A Tight Leash (Episode 2) | TV
Kung Fu Panda | Movie
Teen Wolf: Season 1: Lunatic (Episode 8) | TV
Kung Fu Panda 2 | Movie
This seems to have worked quite well, but ideally I want a way to be sure I don't have any erroneously labeled entries (e.g. "Star Wars: A New Hope (Episode IV)", which is not an entry in this dataset, but there is a risk of other 'Movie' titles using this format that I can't manually check for).
I am a total Regex beginner, and sloppily put together the expression \b[Ee]pisode[\s\S]\b[^0123456789] to try and find any entries with the word episode that didn't have a number following it, and all entries were still TV Shows, but this would not account for something like "A New Hope(Episode 4)".
I'm a little stuck now and there are likely other exceptions to the 'Episode' rule that I am not considering. Functionally, the way I have done this is working for my purposes, but I'm trying to show due diligence for anyone that reads my report.
My question: is there a better expression to try that would account for such outliers?
Thanks!
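One possible tightening, sketched here in Python rather than Power Query, is to require the whole shape seen in the sample rows: a show name, a season or collection segment, an episode title, and a trailing "(Episode <number>)". That shape is an assumption drawn only from the examples above, and the names TV_PATTERN, content_type and needs_review are hypothetical.

```python
import re

# Assumed TV shape:
# "<show>: <season or collection>: <episode title> (Episode <number>)"
TV_PATTERN = re.compile(r"^.+?: .+: .+ \(Episode \d+\)$")


def content_type(title: str) -> str:
    """Label a title 'TV' only if it matches the full assumed shape."""
    return "TV" if TV_PATTERN.match(title.strip()) else "Movie"


def needs_review(title: str) -> bool:
    """Flag titles that mention 'episode' but do not fit the TV shape."""
    return "episode" in title.lower() and content_type(title) == "Movie"


for title in [
    "The Office (U.S.): Season 5: Heavy Competition (Episode 24)",
    "Kung Fu Panda",
    "Star Wars: A New Hope (Episode IV)",
    "A New Hope(Episode 4)",
]:
    print(title, "->", content_type(title), "| review:", needs_review(title))
```

Listing the needs_review titles in the report gives a short set to check by hand. A series whose entries lack the middle season segment would be mislabelled by this pattern, so it is worth confirming that shape against the real data first.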

What standard is the currency parameter

When I wanted to get the median price of an item on the Steam market, I came across this answer. It gets the lowest and median price of an item. The one thing I had trouble understanding is which currency number corresponds to which currency and, if there is a system behind the numbers, which industry standard is used here.
This is an example URL:
https://steamcommunity.com/market/priceoverview/?appid=730&currency=3&market_hash_name=Tec-9%20%7C%20VariCamo%20(Minimal%20Wear)
In the documentation it says it is ISO 4217:
An optional ISO 4217 currency code. If specified, only prices for this currency need to be
But that's clearly not the case.
When I put in 1 as the currency parameter, I get dollar.
With 2 I get pounds.
And with 3 it responds with euro.
...
The max seems to be 41 with the Uruguayan Peso
All actual currency codes and other information about currencies on Steam can be found in global.js on https://steamcommunity.com/market/. Open this page, then open the developer console and search for the g_rgCurrencyData variable, as shown in the image below.
[Image: location of the g_rgCurrencyData variable]
Besides the codes, there is formatting information for each currency, which is useful when you need to parse data from the page or automate some actions.
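As a rough sketch of how the numeric IDs might be used, the snippet below builds the priceoverview URL from a small lookup table. Only the three IDs mentioned in the question are included, and reading "dollar" as USD is an assumption; the full table should be taken from g_rgCurrencyData. The names KNOWN_CURRENCY_IDS and price_overview_url are hypothetical, and no request is sent.

```python
from urllib.parse import quote, urlencode

# Numeric IDs observed in the question; extend from g_rgCurrencyData as needed.
KNOWN_CURRENCY_IDS = {"USD": 1, "GBP": 2, "EUR": 3}

BASE_URL = "https://steamcommunity.com/market/priceoverview/"


def price_overview_url(app_id: int, item_name: str, currency_code: str) -> str:
    """Build a priceoverview URL using Steam's numeric currency ID."""
    params = {
        "appid": app_id,
        "currency": KNOWN_CURRENCY_IDS[currency_code],
        "market_hash_name": item_name,
    }
    # quote (rather than the default quote_plus) encodes spaces as %20.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


url = price_overview_url(730, "Tec-9 | VariCamo (Minimal Wear)", "EUR")
print(url)
```

The parentheses in the item name come out percent-encoded here, which is an equivalent encoding of the example URL in the question.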

Book ordering comparison between spreadsheets for existing catalogue of a Library

I recently asked this question on Google's spreadsheet page.
I have a significant data comparison problem I would like to solve. It relates to purchasing books for a library. We have a catalogue of over 11,000 books. When we order new books we need to compare our proposed purchases to the current stock. Currently we can only compare them to our catalogue manually, very laboriously, book by book.
We need to do 3 things to make our life easier -
1 easily clean out bad data/characters in the ISBNs - these are either spaces, - (hyphens) or . (periods or full stops). A simple formula to run over all ISBN fields would be great.
2 I need to compare data between 1 spreadsheet with 11,000 books in it (current library stock), a second with up to 1000 books in it (currently on order) and finally the third currently active one (about to be ordered) with 50 to 200 books listed in it.
All spreadsheets use the same column configuration as below
Library orders
Title | Author | Publisher | ISBN (long version) | US$ | UKgpd | HK$ | Other$ | P/O no. | Date ordered
UNNATURAL SELECTION | MARA HVISTENDAHL | Public Affairs Publishing; Reprint edition (May 1, 2012) | 978610391511
Finally, the output of these comparisons should quickly and easily identify on which lines we have matches, and what type of match each one is: Author only, Author and Title, or Author, Title and ISBN, etc., for all the possible combinations. To make this easier, assume spreadsheet 1 is an unalterable master table, with spreadsheet 2 similar. It is really only on spreadsheet 3 that we need to be clear whether we are starting to reorder materials.
If it is possible to have these as different sheets in a workbook it would be ideal. The only additional feature is that any scripts that run need to be able to cope with spreadsheet 1 increasing in size as new acquisitions arrive and are included. Both spreadsheets 2 and 3 will vary (increase and decrease) as the ordering process proceeds.
Finally the absolute ideal would be for this comparison process to be instant (live) and ongoing as data is included.
If anyone would like to take this on 3 Library staff will be eternally grateful.
regards
Nick
This would be very much easier had you one sheet rather than three (simply add a column to each existing sheet to show whether in stock, on order or to be ordered – three individual letters would be sufficient, then append each of the smaller two files to the largest). Then for example you could apply Conditional Formatting to highlight duplicates one column at a time (Author, Title etc). Apart from the initial data cleansing it would mean in the future switching ‘between sheets’ would merely involve changing a one-letter flag. Filtering would allow you and your colleagues to appear to have three separate sheets and if anyone asks for a particular Title the search would be one-time, not in triplicate.
Also, http://www.microsoft.com/en-gb/download/details.aspx?id=15011 may be of interest, as may =SUBSTITUTE. And with data validation you could prevent entry of a new ISBN that is already in your list.
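Outside the spreadsheet, the cleaning and the match-type report could be sketched as below. This assumes the sheets have been exported to rows keyed by the column names in the question; the helper names are hypothetical, and the punctuation added to the sample ISBN is invented to exercise the cleaning step.

```python
import re


def clean_isbn(raw: str) -> str:
    """Strip spaces, hyphens and full stops from an ISBN."""
    return re.sub(r"[\s.\-]", "", raw)


def norm(text: str) -> str:
    """Case- and whitespace-insensitive form of a title or author."""
    return " ".join(text.lower().split())


def match_type(order_row: dict, stock_row: dict) -> list:
    """Return the names of the fields on which two rows agree."""
    matched = []
    if norm(order_row["Author"]) == norm(stock_row["Author"]):
        matched.append("Author")
    if norm(order_row["Title"]) == norm(stock_row["Title"]):
        matched.append("Title")
    isbn = clean_isbn(order_row["ISBN"])
    if isbn and isbn == clean_isbn(stock_row["ISBN"]):  # ignore blank ISBNs
        matched.append("ISBN")
    return matched


stock = [{"Title": "UNNATURAL SELECTION", "Author": "MARA HVISTENDAHL",
          "ISBN": "978610391511"}]
to_order = [{"Title": "Unnatural Selection", "Author": "Mara Hvistendahl",
             "ISBN": "978-6103.91511"}]

for order_row in to_order:
    for stock_row in stock:
        fields = match_type(order_row, stock_row)
        if fields:
            print(order_row["Title"], "matches on", ", ".join(fields))
```

Comparing every proposed order against every stock row is a simple nested loop, which should be acceptable at the sizes mentioned (a few hundred rows against roughly 11,000).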

Menu extracting

I am interested in extracting and structuring information about restaurant menus. What is needed is to extract the items from the menu in the form category / name / price.
For instance, we have the following website. Here we have a drinks section, and there are a number of items in it. For that website I'd like to be able to extract
Drink / Cappuccino / € 1,50
SANDWICHES / filled sandwich, pistolet (round roll) or emperor roll / € 1,30
etc ...
Of course it shouldn't be limited only to this website.
The only way I can see to handle that is applying a bunch of regexps, but I don't believe listing all possible dish names is feasible.
I know that the topic might be too broad for a question, but anyway any suggestions or references to relevant articles or books will be much appreciated.
This seems quite possible. You may not be able to list all possible dishes, but you can list all possible categories.
Assuming that in every menu the dish name follows the category name and is itself followed by the price, you can identify dish names.
The algorithm will look like this:
for category in category_list:
    for i, word in enumerate(document):
        if word == category:
            dish = document[i + 1]   # read next; if the data is a table, read the next row or column
            price = document[i + 2]  # read next and check that its format is a currency amount or price
The point is you will need to analyse different websites to understand how the information is structured and prepare your algorithm to deal with all possible structures.
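One way the "check its format" step might look is sketched below, assuming the page has already been reduced to text lines in reading order and that prices look like "€ 1,50". The price pattern, the function name extract_items and the sample category set are all assumptions; unlike the outline above, this variant also accepts several dishes under one category.

```python
import re

# Assumed price shape: optional currency symbol, digits, then a comma or
# dot followed by two decimals (for example "€ 1,50" or "2.00").
PRICE_RE = re.compile(r"^(?:[€$£]\s*)?\d+[.,]\d{2}$")


def extract_items(lines, categories):
    """Yield (category, name, price) tuples from menu text lines."""
    current = None       # most recent category heading
    pending_name = None  # most recent non-price, non-category line
    for line in (ln.strip() for ln in lines):
        if not line:
            continue
        if line.lower() in categories:
            current, pending_name = line, None
        elif PRICE_RE.match(line):
            if current and pending_name:
                yield current, pending_name, line
            pending_name = None
        else:
            pending_name = line


menu_lines = [
    "Drinks", "Cappuccino", "€ 1,50",
    "SANDWICHES",
    "filled sandwich, pistolet (round roll) or emperor roll", "€ 1,30",
]
items = list(extract_items(menu_lines, {"drinks", "sandwiches"}))
for item in items:
    print(" / ".join(item))
```

Each new site layout would still need its own step to turn the page into such lines, which is the analysis effort described above.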
