I have a task to parse the links and titles of YouTube videos from one category (e.g. music).
I know there's a huge number of videos, hence my question: how can I do this using, for example, Node.js?
My only idea so far is to use PhantomJS and scroll, scroll, scroll down the page to collect as many videos as I can, but that solution is clumsy.
Are there any other solutions, for example using the YouTube API, or other tools and methods?
Try using this. First find out which video category ID you need, then call:

https://www.googleapis.com/youtube/v3/search?part=snippet&type=video&videoCategoryId=10&key=YOUR_KEY_HERE
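The endpoint is language-agnostic (the question mentions Node.js, but any HTTP client works). Here's a minimal sketch in Python with the requests library, assuming YOUR_KEY_HERE is a placeholder for your own API key; it pages through results via nextPageToken, since each response returns at most 50 items:

import requests

API_KEY = "YOUR_KEY_HERE"  # placeholder: your own YouTube Data API v3 key
URL = "https://www.googleapis.com/youtube/v3/search"

params = {
    "part": "snippet",
    "type": "video",
    "videoCategoryId": "10",  # e.g. 10 = Music; list ids via the videoCategories endpoint
    "maxResults": 50,         # maximum the API allows per page
    "key": API_KEY,
}

videos = []
while True:
    data = requests.get(URL, params=params).json()
    for item in data.get("items", []):
        videos.append((item["id"]["videoId"], item["snippet"]["title"]))
    token = data.get("nextPageToken")
    if not token:
        break
    params["pageToken"] = token  # continue with the next page

print(len(videos), "videos collected")

Note that the search endpoint still caps the total results per query at a few hundred, so this enumerates a sample of the category rather than every video in it.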
Lately some of the articles have embedded videos, but these are not listed in the JSON responses when querying the API, not even in a "related videos" type of attribute. Only the normal videos are available via the API.
Unfortunately I don't think there is. For our older articles, which are just a blob of HTML, /api/v1/articles/<id> will give you an htmlContent field which will presumably contain some iframe. For our newer articles, the perseusContent field has the data we use to render it; I don't know that we document the format anywhere, but the code that renders it is open-source, so you could take a look at that, or just look at some actual articles with videos and see what they look like. Sorry we don't have a more direct way to do this!
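For the older HTML articles, the lookup the answer describes could be sketched like this in Python; the hostname, the article ID, and the iframe extraction are my assumptions, as only the /api/v1/articles/<id> path and the htmlContent field come from the answer:

import re
import requests

ARTICLE_ID = "x2f8bb11595b61c86"  # hypothetical article id

# Assumed host; the answer only gives the /api/v1/articles/<id> path.
resp = requests.get(f"https://www.khanacademy.org/api/v1/articles/{ARTICLE_ID}")
html = resp.json().get("htmlContent", "")

# Pull the src of any embedded iframe out of the HTML blob
# (crude regex, but sufficient for a quick look).
for src in re.findall(r'<iframe[^>]+src="([^"]+)"', html):
    print(src)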
I recently read this article on how to scrape Inbound.org community members' profiles using Excel. And you can watch the video here if you prefer it that way.
Since the release of this tutorial, the Inbound website structure has changed a bit: as you can see at minute 11:00 in the video, if you attempt to copy the XPath of the social media icons it now looks slightly different, and because of this I haven't been able to extract that information.
Here's what I get now:
/html/body/div[3]/div/div/div[1]/div/div[2]/a[1]/i
This is how I wrote the syntax in Excel:
=XPathOnUrl(A2,"//a[#class='twitter']","href")
And then like this:
=XPathOnUrl(A2,"//a[contains(#class,twitter)]/#href")
Although I tried in many different ways, none of them showed me the link to the member's social media profile.
I even tried changing the xpath in multiple ways to get different data from the page, but none of it was the social media information:
=XPathOnUrl(A2,"//*[contains(#class,member-banner-tagline)]/div[2]/div/div/div[1]/div/div[1]")
=XPathOnUrl(A2,"//*[contains(#class,member-banner-tagline)]/div[2]/div/div/div[1]/div/h1")
I honestly don't know what to try anymore; something's wrong and I can't figure it out. Does anybody have enough experience with this, or can anyone pinpoint the problem with my syntax?
Thanks a lot.
The first formula you tried looks fine, but this is the one that works for me (SEO Tools version 4.3.4):
=Dump(XPathOnUrl(A2;"//a[#class='twitter']";"href";HttpSettings(TRUE)))
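If you want to sanity-check the XPath outside Excel first, the same query can be run with Python and lxml; note that standard XPath uses @ where the SeoTools formulas above use #, and the profile URL below is a placeholder:

import requests
from lxml import html

# Placeholder profile URL; cell A2 in the spreadsheet holds the real one.
url = "https://inbound.org/in/some-member"
tree = html.fromstring(requests.get(url).content)

# Equivalent of //a[#class='twitter']/#href in the formulas above.
print(tree.xpath("//a[@class='twitter']/@href"))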
I am looking to create a custom Spotify playlist rather than use the generator via the website. I need a way of grabbing this XML, rather like the lookup and search facilities that the Web API provides. I have tried to use a playlist Spotify URI with the lookup functionality, but it doesn't seem to work.
e.g.
http://ws.spotify.com/lookup/1/?uri=spotify:user:XXX:playlist:YYY
However, using this just gives me the following error:
"You hit the rate limit, wait 10 seconds and try again"
I don't think I have really hit the rate limit; I only tried it a few times.
If this isn't the way to go, what other options are there? libSpotify? That seems like a rather big solution for just getting some XML for a playlist.
Any help appreciated.
The web API doesn't support playlist lookup at all. If you want to find playlist data, you'd have to use libspotify.
For my job, I'm looking into an idea in which people would use Google Search by Image with any celebrity photo they find. Google would return the results, and then on our end there'd be a database of professionals showing how to get that specific look.
I'm assuming this is extremely difficult to do, given that users could use ANY photo.
So, is there a way that I could have about 100 or so celebrity photos that the Google Image results could be compared against, choosing the one that is closest?
Basically:
Drag and drop a photo of Britney Spears
Google searches with that image
Google's results are compared with our 100 photos, and the closest match is selected
User gets to see a video of how to get the Britney Spears look
I'm not a programmer, but I'm looking for some API or Search by Image extension that could make this remotely possible for the programmers here at my job. Does something like that (a search-by-image API) exist? The best I could find was the support page, which is hardly of any help: http://support.google.com/images/bin/answer.py?hl=en&p=searchbyimagepage&answer=1325808
You can easily search by an existing image by inserting this into your address bar:
https://www.google.com/searchbyimage?site=search&sa=X&image_url=YOUR_IMAGE_URL
Example:
https://www.google.com/searchbyimage?site=search&sa=X&image_url=http://cdn.sstatic.net/Sites/stackoverflow/company/img/logos/so/so-icon.png
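If you build these lookup URLs from code, the image URL needs to be percent-encoded; here's a small Python sketch:

from urllib.parse import urlencode

image_url = "http://cdn.sstatic.net/Sites/stackoverflow/company/img/logos/so/so-icon.png"
query = urlencode({"site": "search", "sa": "X", "image_url": image_url})
print("https://www.google.com/searchbyimage?" + query)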
Sorry to say, but the Google Image Search API is deprecated:
Important: The Google Image Search API has been officially deprecated as of May 26, 2011. It will continue to work as per our deprecation policy, but the number of requests you may make per day may be limited.
I'm quite sure there are some alternatives (http://www.tineye.com/ and http://mrisa.mage.me.uk).
Update (2013): There is now Google Custom Search which allows image searches.
These answers are quite obsolete, but the question still comes up in searches. The Google Vision API now has a "web detection" feature that does a reverse image search. The first 1,000 requests per month are free; it's $3.50 per 1,000 afterwards.
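A minimal sketch with the google-cloud-vision Python client, assuming a configured service-account credential and a local photo.jpg:

from google.cloud import vision

client = vision.ImageAnnotatorClient()  # reads GOOGLE_APPLICATION_CREDENTIALS

with open("photo.jpg", "rb") as f:
    image = vision.Image(content=f.read())

response = client.web_detection(image=image)

# Entities Google associates with the image (for a celebrity photo this is
# typically the person's name), plus pages with matching images.
for entity in response.web_detection.web_entities:
    print(entity.score, entity.description)
for page in response.web_detection.pages_with_matching_images:
    print(page.url)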
I think Google Web Detection could be a solution for you; Google moved this capability permanently out of Image Search.
You can do it via www.images.google.com, but only from a browser (it lets you upload your own image and compares it to similar ones).
I'm working on doing it from code (not from the browser).
I had the same problem and came up with two solutions:
There are a number of APIs that give reverse image search results nowadays. The ones I used are https://reverseimageapi.com and TinEye.com.
As the selected answer mentions, you can easily scrape this information, but you will almost certainly need rotating proxies to avoid being banned by the search engine (see the sketch below). There are plenty of proxy rotation services (Zyte, Oxylabs, ScrapingBee, etc.) to make your life easier.
I ended up going with option 1 due to the upkeep of scraping search engines and elements changing / breaking.
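As a sketch of the proxy-rotation idea from option 2, with hypothetical proxy endpoints standing in for whichever service you use:

import itertools
import requests

# Hypothetical proxy endpoints; a rotation service would supply real ones.
PROXIES = itertools.cycle([
    "http://proxy1.example.com:8000",
    "http://proxy2.example.com:8000",
])

def fetch(url):
    proxy = next(PROXIES)  # round-robin over the pool
    return requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=10)

print(fetch("https://example.com").status_code)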
I have a search engine that searches albums.
For each music album, I have a page.
So, the workflow goes like this:
People search for music titles
The search engine displays a list of albums.
People click on an album to go to a details page.
I want Google to index my front page and the details pages, and I want the details pages to be highly ranked. How can I build a sitemap for this?
By the way, I have about 5 million albums (but I want the top 1000 to be highly ranked on Google).
You would not use a sitemap for that many results. You would want each album to appear as a page with a unique URI that references it. That way the search engine can crawl your site by following links, since search bots cannot submit form data. Each of those URIs should be simple, meaning limited to this part of the URI syntax:
scheme://authority_segment/path
Program your web application to remove and throw away any extraneous data, such as query strings or parameters. If you do this, you have to be sure that you are watching for URI poisoning or SQL injection, even through means of character encoding.
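Here's a sketch of that stripping step in Python, keeping only the scheme, authority, and path:

from urllib.parse import urlsplit, urlunsplit

def canonicalize(uri: str) -> str:
    # Drop query, fragment and anything else beyond scheme://authority/path.
    parts = urlsplit(uri)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

print(canonicalize("https://example.com/albums/123?utm_source=x#top"))
# -> https://example.com/albums/123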
How can I build a sitemap for this?
By pulling the addresses out of your database and creating an XML file with a high priority for some selected pages. Somehow I think that isn't your real question…
If I wanted to automate building a sitemap for a site like this, I'd employ Python. I'd pretty much write everything from the ground up (except the data store access). The format is quite simple.
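The format really is that simple. Here's a minimal sketch that writes a sitemap for a hypothetical list of album URLs, boosting the priority of the selected top pages:

# Hypothetical data: (url, is_top_album) pairs pulled from your database.
albums = [
    ("https://example.com/albums/1", True),
    ("https://example.com/albums/2", False),
]

lines = ['<?xml version="1.0" encoding="UTF-8"?>',
         '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
for url, is_top in albums:
    priority = "0.9" if is_top else "0.5"  # boost the top-1000 pages
    lines.append(f"  <url><loc>{url}</loc><priority>{priority}</priority></url>")
lines.append("</urlset>")

with open("sitemap.xml", "w") as f:
    f.write("\n".join(lines))

Keep in mind that a single sitemap file is limited to 50,000 URLs, so 5 million albums would need a sitemap index pointing at many such files.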
I'm not sure I quite understand your question...