How can I scrape content from a Website with AlchemyAPI? - node.js

I would like to scrape content from a website with AlchemyAPI. I read about this feature at http://www.alchemyapi.com/api/scrape/qlang.html
I want to implement it the same way as in the example "Querying Inside Tables (Selecting a Column Inside a Specific Row)".
Could somebody please show me how to use this in Node.js with Cquery? Which parameters do I need in order to get specific fields, such as price, as output?

No, it's not currently possible to do this. Since AlchemyAPI was acquired by IBM, the remaining services have been incorporated into Watson. Most of the AlchemyAPI services are now covered in the Natural Language Understanding (NLU) service: https://www.ibm.com/watson/developercloud/doc/natural-language-understanding/ but there is no feature that allows you to scrape content from a web site per se.
The NLU service does allow you to retrieve text from a web page using the analyze endpoint: https://www.ibm.com/watson/developercloud/natural-language-understanding/api/v1/#post-analyze
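
As a rough illustration of the analyze endpoint mentioned above, the request could look something like the sketch below. The helper names, the `serviceUrl` and `authHeader` arguments, the `version` date, and the choice of the `keywords` feature are all assumptions made for the example; check them against the current NLU documentation for your service instance. It relies on the global `fetch` available in Node.js 18 and later.

```javascript
// Hypothetical helper: builds the JSON body for an NLU "analyze" request.
// At least one feature has to be requested; "keywords" is an arbitrary choice here.
function buildAnalyzeBody(pageUrl) {
  return {
    url: pageUrl,               // the web page NLU should fetch
    features: { keywords: {} },
    return_analyzed_text: true, // ask for the text NLU extracted from the page
  };
}

// Hypothetical wrapper around the analyze endpoint.
// serviceUrl and authHeader depend on your Watson instance (assumptions).
async function analyzePage(serviceUrl, authHeader, pageUrl) {
  const response = await fetch(`${serviceUrl}/v1/analyze?version=2017-02-27`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: authHeader },
    body: JSON.stringify(buildAnalyzeBody(pageUrl)),
  });
  if (!response.ok) {
    throw new Error(`NLU request failed with status ${response.status}`);
  }
  return response.json();
}
```

This only returns the text of the page as NLU sees it. Pulling out a specific field such as a price would still need your own parsing on top.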

Related

detecting if website has e-commerce in Node.js

I need to detect programmatically whether a website has an e-commerce platform/system.
I don't need to know which one, I just need to know if the website has one.
(I have a big list of websites, so I probably need to scrape them.)
Any suggestions on how I could do this without using external websites (like rescan.io, builtwith, etc.) would be greatly appreciated!
thank you!
You can use a package called Puppeteer which is used to do web-scraping in node.js.
I don't know which platforms you are trying to look for, but you could give the list of websites you want to check to a Node.js process and have Puppeteer scrape them all. Then you look at the content you get back and, for example, look for Shopify's CDN in the tags or check the tags for keywords.
You will definitely need to check each different platform like Magento or Shopify for unique source code that clearly sets apart the framework you are looking at from other tools.
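
A minimal sketch of that signature check, assuming you already have each page's HTML as a string (for example from Puppeteer's `page.content()`). The function names and the signature list are hypothetical, and the patterns are illustrative guesses, not a reliable fingerprint database. The WooCommerce entry is an extra platform added only as an example.

```javascript
// Illustrative signatures per platform (assumptions, extend and verify yourself).
const PLATFORM_SIGNATURES = {
  shopify: [/cdn\.shopify\.com/i],
  magento: [/Mage\.Cookies/, /\/static\/version\d+\//i],
  woocommerce: [/wp-content\/plugins\/woocommerce/i],
};

// Returns the names of all platforms whose signatures appear in the HTML.
function detectPlatforms(html) {
  return Object.entries(PLATFORM_SIGNATURES)
    .filter(([, patterns]) => patterns.some((pattern) => pattern.test(html)))
    .map(([name]) => name);
}

// True if at least one known platform signature was found.
function hasEcommerce(html) {
  return detectPlatforms(html).length > 0;
}
```

A site built on a custom shop system will not match any of these, so a negative result only means "no known platform detected".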

Amazon Cloud Search Experience

I might get flagged down for this question, but I will still give it a shot.
Since Google Site Search is being shut down and we are not interested in the free version of it, we decided to go with the Amazon Cloud Search option. The challenge, though, is that it is not straightforward. We have to build a crawler, and some features need to be custom built.
I am trying to find examples of websites that have used ACS successfully, but I am not able to find anything good. Has anyone tried using Amazon Cloud Search for their website search? Our website has more than 15,000 pages.
We are a .NET-based solution, so I am thinking of writing a crawler that extracts content on a nightly basis and sends it to Amazon. Would that be the right way?
ACS is based on Solr. If your site is under your control, I think the first step is to extract all useful content and generate XML/JSON files from it, then use the AWS CLI to upload these documents to ACS. You need to define the index fields before uploading the documents. ACS has REST APIs that let you get the query results.
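
To give an idea of what such a JSON file could contain, here is a small sketch that turns crawled pages into a CloudSearch document batch, which is an array of operations with `type`, `id` and `fields`. The function name, the shape of the `pages` input, and the field names `title`, `content` and `url` are assumptions for the example; the field names have to match the index fields you configured for your domain.

```javascript
// Hypothetical helper: converts crawled pages into "add" operations
// for a CloudSearch document batch.
function buildAddBatch(pages) {
  return pages.map((page) => ({
    type: 'add',
    id: page.id, // must be unique per document
    fields: {
      title: page.title,
      content: page.content,
      url: page.url,
    },
  }));
}

// Example input: one crawled page (made-up data).
const batch = buildAddBatch([
  { id: 'page_1', title: 'Home', content: 'Welcome to our site', url: 'https://example.com/' },
]);
const batchJson = JSON.stringify(batch, null, 2);
```

The resulting JSON can be written to a file and uploaded, for example with the `aws cloudsearchdomain upload-documents` command. Pages that no longer exist would need a corresponding `delete` operation in a later batch.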

Retrieving Google Instant Data

I want to develop an application that will visualize the recommendations of Google Instant. It is for a course project, and for now I don't know much about web programming tools. What I wonder is whether it is possible to retrieve that data from another web page. If you think it is possible, could you please tell me which platform to use and point me in the right direction?
Without more information on what you're actually trying to do, it's difficult to give a proper answer. From what I can understand, you just want a list of the auto-completed items from a Google search, to manipulate however you like?
In which case, using the highest-rated answer from here, you can use http://suggestqueries.google.com/complete/search?client=firefox&q=YOURQUERY to give you a JSON object which you can then manipulate to get the auto-complete results. The client= part is needed, but I haven't looked at various options you can put in there.
Personally, I've never used JSON before, so can't give you any help on how to go about parsing it, but you can find more information about it on the JSON website, and w3 website.
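
For what it's worth, parsing that response could look roughly like the sketch below. It assumes the endpoint answers with a JSON array of the form `["query", ["suggestion 1", "suggestion 2"]]`, which is what `client=firefox` appeared to return; that shape is an assumption and may change. The function names are made up for the example.

```javascript
// Hypothetical helper: builds the suggest URL for a query.
function buildSuggestUrl(query) {
  return (
    'http://suggestqueries.google.com/complete/search?client=firefox&q=' +
    encodeURIComponent(query)
  );
}

// Hypothetical helper: extracts the suggestion list from the response text.
// Assumes the shape ["query", ["suggestion", ...]]; returns [] otherwise.
function parseSuggestions(jsonText) {
  const data = JSON.parse(jsonText);
  const suggestions = Array.isArray(data) ? data[1] : undefined;
  return Array.isArray(suggestions) ? suggestions : [];
}
```

You would fetch the URL from `buildSuggestUrl` with any HTTP client and pass the response body to `parseSuggestions`.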
You will need to either mimic the JavaScript, run a JavaScript engine, or use a browser add-on and communicate with that add-on.
As you type, a JavaScript function is called, so you need to call this function yourself or mimic what it does. I guess it calls a web service or web page programmatically (Ajax) with what you have typed, and the server responds with the suggestions. That is not very difficult, as long as Google does not deny you when it realizes you're not a browser. I think they allow only about 100 free API calls, but you can search Google for the details.
HttpComponents in Java will help with calling the service, including cookies etc. You should use the dev tools in Firefox to see what happens under the hood when you type in the Google search bar, and look at the code.

Integrating Eventbrite with Expression Engine

Has anyone done an Eventbrite integration with an Expression Engine site? We'd like to set up events with Eventbrite and have them handle all ticket management. But we'd like to be able to display the events within the Expression Engine site and then enable users to click on a link to be redirected to Eventbrite. I've looked at the API, and it looks like we can create custom EE pages with it.
More importantly I'd like to let users search for events from our main site.
Has anyone done this type of work and have any hints or resources?
Thanks.
Todd Perkins got started on a module for this some time ago, but there hasn't been any action on it since then. Could be a good starting point for you though.
https://github.com/toddperkins/eventbrite
Eventbrite has a great PHP-based API client library that should be able to cover all of your API interaction needs.
These PHP examples might be useful as well:
https://github.com/ryanjarvinen/eventbrite.php/tree/master/examples
http://eventbrite.github.com/#examples
Please let #EventbriteAPI know if you make any major progress on this project. I'm sure they would love to add an Expression Engine integration to their open source projects list and application showcase!

Hosting Google Apps UI in my app

I'm investigating the possibility of re-using Google Apps/Docs in a local hybrid desktop/browser application.
I've been going through the Google documentation on manipulating docs, eg. the Spreadsheet. I can't seem to find any info on actually hosting the UI. Is this possible, or does it require some form of permission from Google?
So you basically want to embed a browser control in your application, pointed at the URL of a Google Apps doc? You could use the Google Document List API to retrieve the documents for a user, then use the URLs of those documents in your embedded browser control.
You don't need Google's permission to do that; you're writing a browser with some extra smarts built in.
What do you mean by "hosting the UI?" These apps are HTML/CSS/JavaScript. Are you thinking about embedding them in AIR or Titanium, or in some kind of web control in another app?
I briefly looked into doing this, and figured that if I really wanted to, I could just load the gdocs page content dynamically and use JavaScript to strip away the superfluous elements like the header and footer. But instead I'll probably just use an OS alternative, because they have come a long way and I want rich hooks.
