Will there be any problem if a page has errors in Google structured data? - structured-data

I have applied JSON-LD structured data for the Product schema. I have just found that a few fields required by Google's guidelines are flagged by the testing tool ( https://share.getcloudapp.com/7Ku09XQQ ).
Will the page still be accepted by Google? Or will there be a penalty or any other issue because of such errors/warnings/missing recommended fields?

Google's guidelines relate to eligibility for rich snippets. Their testing tools also show errors and warnings for issues that can prevent those rich snippets from showing.
There is no penalty.
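For context, a warning usually means a recommended property is absent rather than the markup being invalid. Below is a minimal sketch (my own illustration, not from the original post) of a Product JSON-LD block plus a quick check for properties the testing tool commonly lists; which fields count as required versus recommended is an assumption here and should be verified against Google's current Product documentation.

```python
import json

# Minimal schema.org/Product JSON-LD. The example values and the
# "recommended" list below are assumptions for illustration only.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "image": "https://example.com/widget.jpg",  # hypothetical URL
    "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

# Properties the testing tool typically warns about when missing (assumed list).
recommended = ["description", "sku", "brand", "aggregateRating", "review"]
missing = [field for field in recommended if field not in product]

print("Missing recommended fields:", missing)
print(json.dumps(product, indent=2))
```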

Related

How to get comments from social media and use them as your data?

I've proposed a title for our thesis, Movie Success Prediction through Social Media Comments using Sentiment Analysis. Is there a way to get the comments on social media (Twitter, Instagram, Facebook, etc.) and use them in your software, for example through an API or some other way? Is it even possible to use your software on different social media platforms to collect the comments for prediction, or should I change my title and stick to one platform like Facebook or Twitter only?
What is a good algorithm for this?
What programming language and framework/IDE should I use?
I've done lots of research on Google and am still hoping for more info here. Thank you.
Edit: I'll only use YouTube and the YouTube API.
From the title of your question, it seems that the method you need is distant supervision. You retrieve data together with labels you consider appropriate for your task. For instance, a tweet containing the #perfect hashtag is probably a positive tweet. So you can define sets of hashtags for your task (negative, positive, or even neutral) and then retrieve tweets matching them via the Twitter API. For your task, those should be about movies, so your data should contain movie-related information in the first place.
Given that you will deal with text data and you'd like to create your own dataset, it is better to start with Twitter. Its API fits your needs and is very well documented. The language and frameworks are up to you, since the APIs support many popular languages. Personally, I'd start with Python or Java, where community support makes future problems easier to solve.
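To make the distant-supervision idea concrete, here is a minimal sketch in Python that weakly labels already-collected tweets by the hashtag sets you define. The hashtag lists and example tweets are invented, and the actual retrieval step (e.g. via the Twitter API) is left out.

```python
from typing import Optional

# Distant supervision: seed hashtags act as noisy sentiment labels.
# These sets are illustrative only; choose ones that fit the movie domain.
POSITIVE_TAGS = {"#perfect", "#mustwatch", "#loved"}
NEGATIVE_TAGS = {"#boring", "#waste", "#disappointing"}

def weak_label(tweet: str) -> Optional[str]:
    """Return 'positive'/'negative' if a seed hashtag occurs, else None."""
    tokens = {token.lower() for token in tweet.split()}
    if tokens & POSITIVE_TAGS:
        return "positive"
    if tokens & NEGATIVE_TAGS:
        return "negative"
    return None  # unlabeled; keep as neutral or discard

tweets = [
    "That new sci-fi movie was #perfect, go see it",
    "Two hours of my life gone, total #waste",
    "Watched it yesterday, not sure how I feel",
]

# Keep only tweets that received a weak label.
dataset = [(t, label) for t in tweets if (label := weak_label(t)) is not None]
print(dataset)
```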
For a general survey of this area, you may dive into papers and resources from here:
https://scholar.google.com.tr/scholar?hl=en&q=distant+supervision+sentiment+analysis
Distant supervision can also be used to create a sentiment lexicon out of millions of English tweets by using sets of negative and positive hashtags. You may take a look at Chapter 5 of this thesis ( https://spectrum.library.concordia.ca/980377/1/Ozdemir_MCompSc_F2015.pdf ), which may give good insight for your thesis, too.
Hope this helps.
Cheers

Chrome extension to read shopping cart

How can I read the cart products from a Chrome extension when the user is on an online store like Amazon or eBay, and then, after reading them, move the items to another site?
Parsing the whole page to find the cart is a real pain. Is there any generic solution for this?
First of all, you need a content script; it can read the DOM structure.
Almost all modern e-commerce platforms support a structured markup language to describe products. I've worked on a JS product parser. In my experience, the most popular formats are:
Schema.org microdata (Product example)
JSON-LD (Product example)
There are plenty of modules for parsing such data available on NPM. For most e-commerce sites, this will be enough.
Update: I checked the question again, and you're asking about products that have already been added to the cart. Even in that case, I still suggest checking the page source for any structured data.
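To illustrate what that structured data looks like once you have the page source: in the extension itself this would be a JavaScript content script reading the DOM, but the extraction logic is the same. The sketch below (my own, using Python and BeautifulSoup, with an invented HTML snippet) just shows how JSON-LD Product blocks can be pulled out of the markup.

```python
import json
from bs4 import BeautifulSoup  # pip install beautifulsoup4

def extract_products(html: str):
    """Collect schema.org Product objects from JSON-LD script tags."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for tag in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(tag.string or "")
        except json.JSONDecodeError:
            continue  # some sites ship malformed JSON-LD; skip it
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "Product":
                products.append({"name": item.get("name"),
                                 "offers": item.get("offers")})
    return products

# Hypothetical fragment of a store page:
html = '''<script type="application/ld+json">
{"@context":"https://schema.org","@type":"Product","name":"USB-C Cable",
 "offers":{"@type":"Offer","price":"7.99","priceCurrency":"USD"}}
</script>'''
print(extract_products(html))
```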

Looking for ICD10 API [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 1 year ago.
Does anybody know of a good ICD-10 API for diagnostic code lookups that they can recommend? I am currently building a simple app to tag patients with medical conditions, and the idea is to have a lookup API where one can type, for example, asthma and get back all the different ICD-10 codes for asthma.
My R package, icd, converts ICD-9 and ICD-10 codes to descriptions, in addition to its main function of finding comorbidities. Documentation is at https://jackwasey.github.io/icd/ , and the code is at https://github.com/jackwasey/icd . It does this using the function explain_code. It currently uses ICD-10-CM, i.e. the USA billing-adapted ICD-10 code set, which in general is more specific than the canonical WHO version, but does have some areas of less detail.
E.g. WHO ICD-10 has "HIV disease resulting in Pneumocystis jirovecii pneumonia" as a subdivision of HIV infection, whereas ICD-10-CM just has HIV. On the other hand, ICD-10-CM has "Sucked into jet engine, subsequent encounter", whereas the WHO is happy with the terribly vague "Person on ground injured in air transport accident".
The volume of data for all the descriptions is not very high, just a handful of megabytes, so although an API may seem convenient, you might consider simply keeping all the data locally rather than having to ping some random server.
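If you do go the offline route, the lookup itself is trivial once you have a code-to-description table. A minimal sketch, assuming you have loaded such a table (the abbreviated excerpt below is just a sample; in practice load the full ICD-10-CM release or the icd package data):

```python
# Tiny excerpt of an ICD-10-CM code table; the real table has tens of
# thousands of rows and should be loaded from the official release.
ICD10 = {
    "J45.20":  "Mild intermittent asthma, uncomplicated",
    "J45.40":  "Moderate persistent asthma, uncomplicated",
    "J45.909": "Unspecified asthma, uncomplicated",
    "J44.9":   "Chronic obstructive pulmonary disease, unspecified",
}

def lookup(term: str):
    """Return all (code, description) pairs whose description mentions term."""
    term = term.lower()
    return [(code, desc) for code, desc in ICD10.items()
            if term in desc.lower()]

print(lookup("asthma"))
# e.g. [('J45.20', 'Mild intermittent asthma, uncomplicated'), ...]
```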
I'm going to assume you're ignoring all of the usual stuff around variations of spelling of medical terms, proper terms vs. colloquialisms, labels vs. descriptions, etc. that get to be a pain with term / code finders.
If you want to use a hosted option and are OK with the terms of use, you could use UMLS (https://uts.nlm.nih.gov/home.html#apidocumentation). It's a great resource, but the use case you're describing isn't necessarily what it's intended to address.
Personally - and I usually don't like to roll my own stuff - I'd consider doing your own thing. You could build something focused on your needs and tailor it to any specific behaviors you might want (like preferring certain codes based on an organization - e.g. billing preference). You could also probably make it far, far more ... perky ... and handle short forms of terms (e.g. synonyms like "DVT") or misspellings ("asthma" vs. "athsma"). If you go that route, I'd suggest getting your hands on the ICD-10 code data and then mashing it into Elasticsearch. You could extend the data by mixing it with other info and really make it hum. And Elastic is wicked fast.
That's just my $0.02, though.
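To sketch the Elasticsearch idea from the answer above: index each code with its description and query with fuzziness so misspellings like "athsma" still match. This assumes the official Python client (elasticsearch-py 8.x) and a local development node; the index name and the two sample documents are invented.

```python
from elasticsearch import Elasticsearch  # pip install elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumes a local dev node

# Index a couple of sample ICD-10 entries (normally: bulk-load the full set).
docs = [
    {"code": "J45.909", "description": "Unspecified asthma, uncomplicated"},
    {"code": "I82.401", "description": "Acute embolism and thrombosis of "
                                       "unspecified deep veins of right lower extremity"},
]
for i, doc in enumerate(docs):
    es.index(index="icd10", id=i, document=doc)
es.indices.refresh(index="icd10")

# A fuzzy match tolerates the misspelling "athsma".
resp = es.search(index="icd10", query={
    "match": {"description": {"query": "athsma", "fuzziness": "AUTO"}}
})
for hit in resp["hits"]["hits"]:
    print(hit["_source"]["code"], hit["_source"]["description"])
```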
There is a project called the Unified Medical Language System (UMLS), funded by the NIH, and apparently they are working on a RESTful Web API for medical terms.
https://documentation.uts.nlm.nih.gov/rest/home.html
I haven't worked with their API yet, and the samples I am seeing on their website look more SNOMED CT oriented.
The option I would go for is to get the whole ICD-10-CM data set from CMS and build my own Web API.
https://www.cms.gov/Medicare/Coding/ICD10/2016-ICD-10-CM-and-GEMs.html
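A self-hosted lookup service over that CMS data can be very small. A minimal sketch with Flask (the route name and the two inline sample rows are mine; in practice you would parse the full ICD-10-CM file into the dictionary at startup):

```python
from flask import Flask, jsonify, request  # pip install flask

app = Flask(__name__)

# In practice, load the full ICD-10-CM release downloaded from CMS here.
CODES = {
    "J45.909": "Unspecified asthma, uncomplicated",
    "J45.40":  "Moderate persistent asthma, uncomplicated",
}

@app.route("/icd10")
def search():
    """GET /icd10?q=asthma -> all codes whose description contains the term."""
    q = request.args.get("q", "").lower()
    hits = [{"code": code, "description": desc}
            for code, desc in CODES.items() if q and q in desc.lower()]
    return jsonify(hits)

if __name__ == "__main__":
    app.run(port=5000, debug=True)
```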
You can also check the full documentation of the WHO ICD API: https://icd.who.int/icdapi

How schema.org can help in NLP

I basically work on NLP, collecting interest-based data from web pages.
I came across http://schema.org/ as being helpful for NLP work.
I went through the documentation, from which I can see that it adds additional tag properties to identify HTML tag content.
It may help search engines get specific data for a user query.
It says: Schema.org provides a collection of shared vocabularies webmasters can use to mark up their pages in ways that can be understood by the major search engines: Google, Microsoft, Yandex and Yahoo!
But I don't understand how it can help me as an NLP person. Generally I parse web page content to process and extract data from it. schema.org may help there, but I don't know how to utilize it.
Any example or guidance would be appreciated.
Schema.org uses the microdata format for representation. People use microdata for text analytics and for extracting curated content. There can be numerous applications.
Suppose you want to create a news summarization system. You can use the hNews microformat to extract the most relevant content and perform summarization on it.
Suppose you have a review-based search engine where you want to list products with the most positive reviews. You can use the hReview microformat to extract the reviews and then perform sentiment analysis on them to identify whether a product has negative or positive reviews.
If you want to create a skill-based resume classifier, extract content with the hResume microformat, which can give you various details such as contact (using the hCard microformat), experience, achievements related to this work, education, skills/qualifications, affiliations, publications, performances, etc. You can then run a classifier on it to identify CVs with particular skill sets.
Though schema.org does not directly help NLP people, it provides a platform to perform text processing in a better way.
Check out http://en.wikipedia.org/wiki/Microformat#Specific_microformats to see the various microformats; the same page will give you more details.
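As a concrete example of the review scenario above, here is a small sketch (my own, in Python with BeautifulSoup) that pulls review text out of schema.org Review microdata via the itemprop attributes, ready to be fed into a sentiment classifier. The HTML fragment is invented.

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

# Invented page fragment using schema.org/Review microdata.
html = """
<div itemscope itemtype="https://schema.org/Review">
  <span itemprop="itemReviewed">Acme Blender</span>
  <span itemprop="reviewRating">2</span>
  <p itemprop="reviewBody">Stopped working after a week, very disappointing.</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
reviews = [p.get_text(strip=True)
           for p in soup.find_all(attrs={"itemprop": "reviewBody"})]
print(reviews)  # feed these strings into your sentiment model
```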
Schema.org is something like a vocabulary or ontology to annotate data, here specifically Web pages.
It's a good idea to extract microdata from Web pages, but is it really used by Web developers? I don't think so, and I think the majority of microdata is consumed by companies such as Google or Yahoo.
In the end, you can find data, but not a lot, and mainly on specific types of websites.
What do you want to extract, and for what type of application? You could probably use another kind of data source, such as DBpedia or Freebase, for example.
GoodRelations also supports schema.org. You can annotate your content on the fly from the front end based on the various domain contexts defined. So schema.org is very useful for NLP extraction. One can even use it for HATEOAS services with hypermedia link relations. Metadata (data about data) for any context is good for content and data in general. Alternatives include microformats, RDFa, RDFa Lite, etc. The more context you have, the better, as it will turn your data into smart content and help crawler bots understand it. It also leads further into the web of data and helps with global queries over resource domains. In the long run such approaches will help towards domain adaptation of agents for transfer learning on the web, pretty much making the web of pages an externalized unit of a massive commonsense knowledge base. They also help advertising agencies understand publisher sites and better contextualize ad retargeting.

Social media search engine question

I came across this site called Social Mention and am curious about how applications like this work; hopefully somebody can offer some glimpses/suggestions on this.
1. Upon looking at the search results, I realize that they grab results from Facebook, Twitter, Google, etc. I suppose this is done on the fly, probably through some REST API exposed by those sites?
2. If what I mention in point 1 is true, does that mean sentiment analysis on the returned documents/links is done on the fly too? Wouldn't that be too computationally intensive? I am curious because, besides sentiment, they also return the top keywords in the document set.
3. They have something called "trends". They look like the trending topics on Twitter, but they seem to also include phrases more than 3 words long. Is this related to NLP entity extraction or more to keyphrase extraction? Are there APIs other than Twitter's that provide this? Are "trends" generally computed from search queries submitted by users, or does the system actually process the pages?
A curious man.
Sentiment analysis can be fast and done on the fly if, for example, it is rule-based and the dictionaries are in memory. Curious? Get in touch.
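A toy version of what "rule-based with dictionaries in memory" can mean, to show why it is cheap enough to run per request (the word lists are illustrative only; real systems use much larger lexicons plus rules for negation, intensifiers, emoticons, and so on):

```python
# In-memory sentiment dictionaries; deliberately tiny for illustration.
POSITIVE = {"great", "love", "awesome", "perfect"}
NEGATIVE = {"bad", "boring", "hate", "waste"}

def score(text: str) -> int:
    """Positive score > 0, negative < 0; a simple bag-of-words rule."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(score("I love this awesome movie"))   # 2
print(score("boring plot and a waste of time"))  # -2
```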

Resources