Chrome extension to read shopping cart - google-chrome-extension

How Can I read the cart products, from the chrome extension if the user is on the online stores like amazon or ebay then after reading moving the items to the another site.
Parsing the Whole Page For cart is a real pain for it. Any generic solution for this?

First of all, you need to have a content-script, it can read DOM structure.
Almost all modern e-commerce platforms support a structured markup language to describe products. I've worked with JS product parser. According to my experience, most popular are:
Schema.org (Product example)
JSON-LD (Product example)
There are plenty modules for parsing such data available in NPM. For most e-commerce, it will be enough.
Updated.
Ah, I checked the question again: you're asking about products which already were added to cart. But even in this case I still suggest you check page source for any structured data.

Related

How can I scrape data from a website?

I want to scrape only four data items from the following page in each and every product from the following link that was an infinitive scroll down page.
name of the product
price of the product
href of the product
img src of the product.
All the data will be stored in a single csv file.
How can I do this?
Any idea?
i have not sure of this method.
get the original source code where you can get all of info of the website including the photo link or any word
This is usually considered a bad idea. If you write code to scrape a website for it's content, what happens when they change their markup? Or what happens when they realize you're scraping (stealing) their original content and ban your server's IP address or IP range even. It's a losing battle, so unless you have permission from them to do so I wouldn't recommend trying. It may work for a little while, but probably not for long. It's generally considered poor form to do something like this, so personally I wouldn't encourage anyone to teach someone how to scrape a website for it's content.
Furthermore, it says very clearly in their Terms of Use not to do exactly that:
You agree not to access (or attempt to access) the Website and the materials
or Services by any means other than through the interface that is provided by
Snapdeal. You shall not use any deep-link, robot, spider or other automatic
device, program, algorithm or methodology, or any similar or equivalent manual
process, to access, acquire, copy or monitor any portion of the Website or
Content (as defined below), or in any way reproduce or circumvent the
navigational structure or presentation of the Website, materials or any
Content, to obtain or attempt to obtain any materials, documents or
information through any means not specifically made available through the
Website.

how schema.org can help in nlp

I am basically working on nlp, collecting interest based data from web pages.
I came across this source http://schema.org/ as being helpful in nlp stuff.
I go through the documentation, from which I can see it adds additional tag properties to identify html tag content.
It may help search engine to get specific data as per user query.
it says : Schema.org provides a collection of shared vocabularies webmasters can use to mark up their pages in ways that can be understood by the major search engines: Google, Microsoft, Yandex and Yahoo!
But I don't understand how it can help me being nlp guy? Generally I parse web page content to process and extract data from it. schema.org may help there, but don't know how to utilize it.
Any example or guidance would be appreciable.
Schema.org uses microdata format for representation. People use microdata for text analytics and extracting curated contents. There can be numerous application.
Suppose you want to create news summarization system. So you can use hNews microformats to extract most relevant content and perform summrization onit
Suppose if you have review based search engine, where you want to list products with most positive review. You can use hReview microfomrat to extract the reviews, now perform sentiment analysis on it to identify product has -ve or +ve review
If you want to create skill based resume classifier then extract content with hResume microformat. Which can give you various details like contact (uses the hCard microformat), experience, achievements , related to this work, education , skills/qualifications, affiliations
, publications , performance/skills for performance etc. You can perform classifier on it to classify CVs with particular skillsets
Thought schema.org does not helps directly to nlp guys, it provides platform to perform text processing in better way.
Check out this http://en.wikipedia.org/wiki/Microformat#Specific_microformats to see various mircorformat, same page will give you more details.
Schema.org is something like a vocabulary or ontology to annotate data and here specifically Web pages.
It's a good idea to extract microdata from Web pages but is it really used by Web developper ? I don't think so and I think that the majority of microdata are used by company such as Google or Yahoo.
Finally, you can find data but not a lot and mainly used by a specific type of website.
What do you want to extract and for what type of application ? Because you can probably use another type of data such as DBpedia or Freebase for example.
GoodRelations also supports schema.org. You can annotate your content on the fly from the front-end based on the various domain contexts defined. So, schema.org is very useful for NLP extraction. One can even use it for HATEOS services for hypermedia link relations. Metadata (data about data) for any context is good for content and data in general. Alternatives, include microformats, RDFa, RDFa Lite, etc. The more context you have the better as it will turn your data into smart content and help crawler bots to understand the data. It also leads further into web of data and in helping global queries over resource domains. In long run such approaches will help towards domain adaptation of agents for transfer learning on the web. Pretty much making the web of pages an externalized unit of a massive commonsense knowledge base. They also help advertising agencies understand publisher sites and to better contextualize ad retargeting.

Extracting user interests from social profiles

This is my first time dabbling in NLP so please excuse my ignorance. I'm looking for a method to extract interests/likes/hobbies from users' social profiles. Here is an example where all the interests/likes/hobbies are in bold:
"I consider myself a pretty diverse character... I'm a professional
wrestler, but I'd take a bullet for Wall•E. I train like a one-man genocide machine in the gym, but I cried at
"Armageddon." I'll head bang to AC/DC, and I'm seriously
considering getting a Legend of Zelda tattoo. I'm 420-friendly. I
like to party it up with the frat crowd one night, hang out with
my Burning Man friends the next, play Halo and World of
Warcraft the next, and jam with friends that aren't any younger than
40 the next. My youngest friend is 16, my oldest friend is 66. I'll
sing karaoke at the bars, and I'm my friends' collective
psychiatrist/shoulder."
The profiles are plain text. There are no meta tags or ids associated with any of it, it's just a paragraph of text.
My naiive idea was to take each noun and match it against Freebase to see if it's an activity/artist/movie/book etc. The problem is that although most entities mentioned will be things the user likes, she will also mention things she doesn't like and I have no means of distinguishing the 2.
I have 2 questions:
What sub field of NLP should I be looking at? Some googleable algorithms/techniques/authors would be greatly appreciated.
How hard is this problem?
Thanks!
First, unless using NLP to do this is a particular objective for you, check your problem domain to see if you can avoid it completely.
For instance:
do these profiles have tags (supplied either by the Site or by the
user)?
what does the Site's API make available (assuming that's how you are accessing this data; if you are scraping it, then this doesn't of course apply)? A good example, Facebook. if you read a user's posts, you'll see words like "wrestler", "karaoke", etc. but if you look at what fields are exposed via the Graph API, you'll see that these activities nearly always have an associated FB ID.
I am not a specialist in this field, but I can recommend a couple of resources directed to NLP and which are accessible to the non-specialist or novice. The first is a text processing API. This simple web service uses REST and JSON IO. It is free and seems to have a fairly large rate limit.
This API appears to rely heavily on the excellent Natural Language Tooolkit (NLTK) which is a mature stable library in python, that includes modules directed to the problem in your Question, e.g., Sentiment Analysis, Tagging and Chunk Extraction, etc.
Which particular sub-domain is most relevant to solving the Question in the OP? I don't know, but I suspect there's a module somewhere in the NLTK that does what you need. Finding that module is hopefully just a matter of skimming the API Documentation (which is organized by module); reading the Getting Started section which contains an excellent survey of NLTK's modules as well as demos for all of each of them.

Magento: Attribute with thousands of values/options

I'm creating a Book store in Magento and am having trouble figuring out the best way to handle the Authors of a Book (which would be the product).
What I currently have is an Attribute called "authors" which is multi-select and a thousand [test] values. It's still manageable but does get a little slow when editing a product. Also, when adding an option/value to the authors attribute itself, a huge list is rendered in the HTML making this an inefficient solution.
Is there another approach I should take?
Is it possible to create an Author object (entity type?) which is associated to a product through a join table? If yes, can someone give me an explanation about how that is done or point me to some good documentation?
If I'd take the Author object approach, could that still be used in the layered navigation?
How would I show the list of all books for a single author?
Thanks in advance!
PS: I am aware of extensions like Improved Navigation but AFAIK it adds something like attributes to attributes themselves which is not what I'm looking for.
For Googlers: The same would apply for Artists of a music site or manufacturers.
If you create an author entity type, you'll just increase your work trying to add it to layered navigation, and I don't see a reason why it would be faster.
Your approach seems the best fit to the problem, given the way Magento is set up. How are you going to display 1,000 (which presumably pales in comparison to the actual list) authors in layered navigation?
Depending on the requirements, you could go the route of denormalizing the field and accepting text for it. That would still allow you to display it, search based on it, etc, but would eliminate the need to render every possible artist to manipulate the list. You could add a little code around selecting the proper artist (basically add an AJAX autocomplete to the backend field) to minimize typos as well.
Alternatively, you could write a simple utility to add a new artist to the system without some of the overhead of Magento's loading the list. To be honest, though, it seems that the lag that this has the potential to create on the frontend will probably outweigh the backend trouble.
Hope that helps!
Thanks,
Joe

Any flexible CMS perfect for restaurant website’s back-end?

I’m building a website for a restaurant which consists of several static pages like ‘About us’ and editable menu.
I need a CMS flexible enough to be able to add items individually (by individually, I mean adding items doesn’t equal pasting a HTML list of n products into another static page).
Each item should contain its name, description, price and category. The list of added items should be displayed using templates the way I want them to.
Can you suggest any lightweight CMS which can provide similar conditions?
There are tons of options for simple page creation. Have you considered just using one of the many free website builders out there? Then you don't even have to worry about finding hosting, just make it happen quickly and easily with one of them. For instance, take a look at Weebly (review here) or Wix. Both allow for free pages and both are incredibly easy to use. Squarespace (review here) is another solid option (and one of my favorites) but charges a small fee (which I personally think is worth it).
Weebly allows for some slick drag and drop of page elements into place as does Wix. They are what I would classify as the easiest of the batch while Squarespace provides for an excellent user interface experience.
Other options if you'd prefer something hosted on your own would depend on your experience level. I am a huge fan of Processwire and ImpressPages has come along nicely and is great little CMS too.
These are exceptions to the typical Top Three that everyone tends to recommend I know but I like to spread the word about other projects instead of the usual ones.
Cheers!
Mike
Sounds like a job for Wordpress 3.0 plus Custom Post Types UI + Verve Meta Boxes plugins. Wordpress will handle the static pages, the other two plugins will allow you to make a Menu Item post type with custom fields.
It is not exactly lightweight, but you could do it with Drupal. You can define you own content type "product", use the CCK module to add your fields (price, ...) and use the Views module to display it how you want.
Drupal has a relatively steep learning curve, so it may be overkill for this project. It is definitely flexible enough for this, though.

Resources