Search Multiple URLs for New Content

I have a group of websites I want to check daily for new content and I'm not sure what the best way is. I'm hoping one of you can help me.
With Google Custom Search, I can search a group of websites -- but what I want is to find any content posted in the past 24 hours, not just content related to a specific keyword. I've tried searching with no keyword and I get no results.
With regular Google Search, I can choose a single site (site:www.example.com), use search tools to limit the results to the past 24 hours, enter no keyword and find anything that's new. But that only works for one site at a time, as far as I can tell.
With Google News search, I can find new content from multiple sites -- but that only works for news sources. If I enter nytimes.com, it works; if I enter dcenr.gov.ie/ I get nothing.
Any ideas on another way to approach this?

You can try creating an RSS feed for the web pages and then using an RSS reader to check for updates.
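If you would rather script the daily check than sit in a reader, the sketch below is one way it might look. It assumes each site already exposes an RSS/Atom feed, uses the third-party feedparser package, and the feed URLs are placeholders.

# Sketch: check a list of RSS/Atom feeds for entries published in the
# last 24 hours. Requires the third-party "feedparser" package and
# assumes each feed supplies publish/update dates.
import calendar
import time
import feedparser

FEED_URLS = [
    "https://www.example.com/feed.xml",  # placeholder feed URLs
    "https://www.example.org/rss",
]

cutoff = time.time() - 24 * 60 * 60  # Unix timestamp for 24 hours ago

for url in FEED_URLS:
    feed = feedparser.parse(url)
    for entry in feed.entries:
        # published_parsed / updated_parsed are UTC struct_time values
        published = entry.get("published_parsed") or entry.get("updated_parsed")
        if published and calendar.timegm(published) >= cutoff:
            print(entry.get("title", "(no title)"), "-", entry.get("link", ""))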

Related

Searching for Old YouTube Videos

I'm trying to find all of the YouTube videos created by IGN's channel during the month of February 2014. IGN currently has 118,000+ videos uploaded, so going back through all of them is not possible. I previously used the following Google search string and a custom date range to find them:
site:youtube.com ignentertainment
This doesn't work anymore for some reason. I'd be much obliged if anyone has any ideas of how to do this. I have no idea what an API is, but if there's a VERY simple way of using that to do what I want that can be explained briefly, I'm willing to go that route.
Thanks.
You can use Google to limit the period it fetches search hits from.
Start by searching for "site:youtube.com ignentertainment" (or simply "ignentertainment"), then click the Tools button; you now get a new bar between the search bar and the results that lets you limit the time range, among other things.
Open the time-related options, choose to enter a custom date range, and you're all done.
Edit: oh, and the command site:youtube.com ignentertainment sure worked for me.
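Since the question also asks about the API route: the sketch below uses the YouTube Data API v3 search endpoint to list a channel's videos published within a date window. The API key and channel ID are placeholders you would have to obtain yourself (the key from the Google Cloud console, the channel ID from the channel page or the API).

# Sketch: list videos a channel published in February 2014 via the
# YouTube Data API v3 search endpoint. API_KEY and CHANNEL_ID are
# placeholders, not real values.
import requests

API_KEY = "YOUR_API_KEY"
CHANNEL_ID = "CHANNEL_ID_HERE"

params = {
    "part": "snippet",
    "channelId": CHANNEL_ID,
    "type": "video",
    "order": "date",
    "publishedAfter": "2014-02-01T00:00:00Z",
    "publishedBefore": "2014-03-01T00:00:00Z",
    "maxResults": 50,
    "key": API_KEY,
}

while True:
    data = requests.get("https://www.googleapis.com/youtube/v3/search", params=params).json()
    for item in data.get("items", []):
        print(item["snippet"]["publishedAt"], item["snippet"]["title"])
    if "nextPageToken" not in data:
        break
    params["pageToken"] = data["nextPageToken"]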

SharePoint online search not returning Home Pages

I cannot for the life of me work out how to return home pages within SharePoint Online search.
I have a single site collection with a number of sub-sites that each have a home page set as the default page. However, when I create a query result source in SharePoint Online, I cannot retrieve any of the home pages. They seem to be excluded?
Any ideas or thoughts to why they would be excluded?
Ideally, I just want to return all homepages for each sub-site within the site collection.
Many thanks.
You need to make sure that all your home pages have the same content type you are using in your filter. You might also use the name of the content type in your filter instead of its ID:
ContentType:"your content type name"
You also need to make sure all these pages are checked in and published so they can be picked up by search. If you are sure of all that, then try reindexing the whole site collection from site settings and check again after a while; it normally takes a couple of hours for the crawl to finish and return your results, but it can sometimes take longer depending on the search index load in the cloud.
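If you want to test the ContentType filter outside the result source UI, one option is the SharePoint Online search REST endpoint. The sketch below is only illustrative: the site URL, content type name, and bearer token are placeholders, and acquiring the token (for example via an Azure AD app registration) is not shown.

# Illustrative sketch: run a KQL query against the SharePoint Online
# search REST API. SITE_URL, CONTENT_TYPE, and ACCESS_TOKEN are
# placeholders; authentication is assumed to be handled elsewhere.
import requests

SITE_URL = "https://yourtenant.sharepoint.com/sites/yoursite"
CONTENT_TYPE = "your content type name"
ACCESS_TOKEN = "YOUR_BEARER_TOKEN"

query = f'ContentType:"{CONTENT_TYPE}"'
resp = requests.get(
    f"{SITE_URL}/_api/search/query",
    params={"querytext": f"'{query}'"},
    headers={
        "Authorization": f"Bearer {ACCESS_TOKEN}",
        "Accept": "application/json;odata=nometadata",
    },
)
resp.raise_for_status()
rows = resp.json()["PrimaryQueryResult"]["RelevantResults"]["Table"]["Rows"]
for row in rows:
    cells = {c["Key"]: c["Value"] for c in row["Cells"]}
    print(cells.get("Title"), cells.get("Path"))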

can you have "variables" in text in google sites?

Sorry, this is a bad question. I don't even know what the title should be. I'm a total noob at making websites so this might be easy to find but I just don't know the terminology to search for. I cannot find anything about how to do this...
What I want to do is have something like references/variables that I can use in a block of text, which automatically get replaced with whatever value should be there. The best way I can think of to describe it: if I were using the site as a design doc for a game or something, I could type [Title] or something similar on any page, and when it loads that text would be replaced with whatever my Title is. That way, if I ever change titles, names, classes, races, places, items, etc., they would only have to be changed in one place and the change would be reflected everywhere.
I notice if I add a link to a page it will automatically use the Title of that page as the text of the link. That is almost exactly what I want. Except when I change the Title of the other page the text of the link remains as the original text. It doesn't get updated to the new Title and that is not at all what I want.
Also, I want to do this in Google Sites and as simply as possible. I don't really want to use a database. I was hoping Google Sites would have some kind of functionality for this.
I don't believe this is possible (on Google Sites), and you likely need to consider a hosted solution.
Quoting the answer from this relevant post:
You should consider hosting your solution using Google's App Engine instead of Google Sites. You can set it up so it uses PHP (see link below), you can configure it to use your domain name, and you get enough CPU, disk and bandwidth allowance to serve around five million page views for free each month; if you are serving more than that, their prices are extremely competitive.
Google App Engine: http://code.google.com/appengine/docs/whatisgoogleappengine.html
How to set up PHP using Google App Engine: http://blog.caucho.com/?p=187
Also, I'm not sure how your PHP skills are, but if you're unfamiliar with it then this should help to get you started.
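For what it's worth, the substitution the question describes is straightforward once you control the hosting. Here is a minimal, hypothetical sketch (the variable names and text are made up) of replacing [Title]-style tokens in a block of text:

# Sketch of the [Title]-style substitution described in the question.
# The variable names and page text are made up for illustration.
import re

VARIABLES = {
    "Title": "My Game",
    "MainCharacter": "Arden",
}

def render(text: str) -> str:
    # Replace [Name] tokens with their current values; unknown tokens stay as-is.
    return re.sub(
        r"\[(\w+)\]",
        lambda m: VARIABLES.get(m.group(1), m.group(0)),
        text,
    )

print(render("[Title] stars [MainCharacter] on a quest."))
# -> "My Game stars Arden on a quest."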

How would I best make this SEO_able?

I have a search engine that searches albums.
For each music album, I have a page.
So, the workflow goes like this:
People search for music titles
The search engine displays a list of albums.
People click on an album to go to a details page.
I want google to index my front page and the details page. I want the details page to be highly ranked. How can I build a sitemap for this?
By the way, I have about 5 million albums (but I want the top 1,000 to be highly ranked on Google).
You would not use a sitemap for that many results. You would want each album to appear as a page with a unique URI to reference that page. That way the search engine can crawl your site by crawling links since search bots cannot submit form data. Each of those URIs should be simple, meaning limited to this part of the URI syntax:
scheme://authority_segment/path
Program your web application to remove and discard any extraneous data, such as query strings or parameters. If you do this, you have to be sure you are watching for URI poisoning or SQL injection, including attempts disguised through character encoding.
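As a sketch of that clean-up step, here is one way to reduce an incoming URI to scheme://authority/path, discarding query strings and fragments (the example URL is hypothetical):

# Sketch of the URI clean-up described above: keep only
# scheme://authority/path and discard query strings and fragments.
from urllib.parse import urlsplit, urlunsplit

def canonical(uri: str) -> str:
    parts = urlsplit(uri)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

print(canonical("https://www.example.com/albums/some-album?ref=search#top"))
# -> https://www.example.com/albums/some-album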
How can I build a sitemap for this?
By pulling the addresses out of your database and creating an XML file with a high priority for some selected pages. Somehow I think that isn't your real question …
If I wanted to automate building a site map for a site like this, I'd employ Python. I'd pretty much write everything from the ground up (except the data store access). The format is quite simple.
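For illustration, a bare-bones sketch of that approach: take album records (hypothetical data below, in practice pulled from your database) and write out a sitemap with per-URL priorities. Keep in mind that a single sitemap file is limited to 50,000 URLs, so a 5-million-album catalogue would need a sitemap index.

# Bare-bones sketch of generating a sitemap for selected albums.
# The album data and domain are hypothetical placeholders.
from xml.sax.saxutils import escape

albums = [
    {"slug": "some-album", "priority": "1.0"},
    {"slug": "another-album", "priority": "0.8"},
]

lines = ['<?xml version="1.0" encoding="UTF-8"?>',
         '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
for album in albums:
    loc = f"https://www.example.com/albums/{album['slug']}"
    lines.append("  <url>")
    lines.append(f"    <loc>{escape(loc)}</loc>")
    lines.append(f"    <priority>{album['priority']}</priority>")
    lines.append("  </url>")
lines.append("</urlset>")

with open("sitemap.xml", "w", encoding="utf-8") as f:
    f.write("\n".join(lines))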
I'm not sure I quite understand your question...

Why and how does the googlebot use my website's search engine?

Looking through my search logs from time to time, I notice that by far the biggest user of my search engine is the google-bot. What gives? Is it looking for content that might not be directly accessible through navigation? If so, how does it know which words and phrases to look for (they're surprisingly relevant). Does it check the most popular keywords on the site? I know I seem to be answering my own question here, but this is really only working it out from first principles. I'd like to hear from someone who knows what they're talking about (i.e. not me).
If your search form's method is get instead of post, each search has its own url, and people might be posting those urls elsewhere. Or if you have a (possibly inadvertently) publicly accessible webstats page that listed those urls, that's another common way for search engines to stumble upon your internal search urls. A third way I've seen is sites that list recent searches on their pages, but this is more intentional. "MySQL Performance Blog" does this to an annoying extent, so any search of their site from google yields hundreds of pages of similar searches, even if none of them found what they were looking for.
Edit: it looks like Googlebot does fill in forms on occasion, but only GET forms:
http://googlewebmastercentral.blogspot.com/2008/04/crawling-through-html-forms.html
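To make the GET-versus-POST point concrete: with a GET form, every search maps to its own URL, which can then be linked, logged, and crawled. The site and parameter name below are placeholders.

# Illustration of why GET search forms are crawlable: each query maps to
# a distinct URL that can be linked to and followed by a bot.
# The domain and parameter name ("q") are placeholders.
from urllib.parse import urlencode

def search_url(query: str) -> str:
    return "https://www.example.com/search?" + urlencode({"q": query})

print(search_url("mysql performance tips"))
# -> https://www.example.com/search?q=mysql+performance+tips
# A POST form, by contrast, sends the query in the request body, so there
# is no per-search URL for other pages (or bots) to link to.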
Google will use words that occur on your site in search boxes to try to find pages that it can't reach otherwise.
Google says that for the past few months, it has been filling in forms on a "small number" of "high-quality" web sites to get back information. What words has it been entering into those forms? Words automatically selected that occur on the site, with check boxes and drop-down menus also being selected.
http://searchengineland.com/google-now-fills-out-forms-crawls-results-13760
