How to get sitemap of webpages? - web

I am trying to analyze some pages, and to get all the necessary information I need to know their web structure (a sitemap, i.e. a map of the site's pages, something like the ones for these Czech pages or Harvard pages).
Is there an online service or program to which I can give the URL of a page and which then displays the sitemap? (I know pages can link to other sites, so it would be great if the service had an option to search only down to a certain level.)

A sitemap is often provided as a file named sitemap.xml, located at the root of the website.
e.g.: http://www.abcdef.com/sitemap.xml
If it is not there, have a look at the robots.txt file (also at the root of the website).
The sitemap location can be declared in robots.txt like this:
Sitemap: http://www.abcdef.com/sitemap.xml
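
If you want to automate the two checks above, here is a minimal Python sketch. It assumes you have already downloaded the bodies of robots.txt and sitemap.xml as text (for example with urllib.request.urlopen); the function names sitemaps_from_robots and urls_from_sitemap are made up for this example, and sitemap index files are not handled.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol for <urlset>, <url>, <loc>.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemaps_from_robots(robots_txt):
    """Return the URLs declared on 'Sitemap:' lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def urls_from_sitemap(xml_text):
    """Return the text of every <loc> element found in a sitemap.xml body."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]
```

Calling sitemaps_from_robots on the robots.txt text gives you the sitemap URL(s) to download next, and urls_from_sitemap then lists the pages the site declares.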

Related

How to tell search engines like Google to display its tabs/url?

When we search in Google, it also displays the site's top tabs or links, for example when we search for "bing" or "net beans".
Q: How does it display those links? Do we have to tell it to display them?
Q: Does it have something to do with sitemap.xml/robots.txt, or does it display the links present in the website's index.php?
robots.txt: tells bots which pages they are allowed or disallowed to crawl.
sitemap.xml: lists the locations (loc) of your website's pages and also gives their change frequency.
Q: How does it display the description of a website?
I searched about the description; it has to do with the meta tag named description. But when I open the source of the NetBeans site, I see
<META NAME="description" CONTENT="Welcome to NetBeans">
But the description Google shows is
Fully-featured Java IDE written completely in Java, with many modules available, such as: debugger, form editor, object browser, CVS, emacs integration, ...
For your first question: the links you mention are generated automatically for the most visited websites and portals. If you set up sitemap.xml and robots.txt correctly in your website's root folder, then after a while, once your website gets a lot of visitor traffic, Google detects the links that users visit most and shows them in its results.
For the second question: meta tags are not the only thing search engines use for what they show in results. They also fetch the page content, extract context from its text, and show a description based on the keyword you entered. However, your meta description will be shown when the keyword is the website name or its domain.
Take a look at the Open Graph Protocol to learn more about meta tags and your SEO requirements.
Regards

How to discover amount of pages of an external website

I have to make an offer for a new website. It should be based on the number of pages in the existing site. There is no sitemap present.
Question: how can I get the total number of pages of an external website that does not belong to me?
Have you tried to reach a potential sitemap.xml file (http://www.yourwebsite.com/sitemap.xml)?
You can test page discovery with an online sitemap generator: https://www.xml-sitemaps.com/
You can also try a Google search like this: site:www.yourwebsite.com. You'll see all indexed pages.
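
If none of these helps, you can approximate what a sitemap generator does with a small crawler. The following is only a sketch, assuming every page is reachable through plain <a href> links from the start page; count_pages, fetch_html and the max_depth parameter are names invented for this example. It ignores robots.txt and rate limiting, which a real crawl should respect.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page and return its body as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def count_pages(start_url, max_depth=2, fetch=fetch_html):
    """Breadth-first crawl of same-host links, at most max_depth clicks deep.

    Returns the set of discovered URLs, including start_url.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # known page, but do not follow its links
        try:
            html = fetch(url)
        except Exception:
            continue  # unreachable page: keep it counted, skip its links
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url  # absolute, no #fragment
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen
```

len(count_pages("http://www.yourwebsite.com/", max_depth=3)) then gives a rough page count. The fetch argument exists so the crawl can be tried against canned HTML before pointing it at a live site.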

URL Rewrite IIS and search engine

I've configured my IIS (asp.net site) to use URL Rewrite.
In particular, this is my (dynamic) rule: any URL in the format number/string is redirected to a special aspx page.
So any URL of the form mysite/id/Name is redirected to showprof.aspx?id=id&title=Name. This works perfectly.
My question is about search engines. I don't have any "fixed" page containing links like mysite/id/Name that a spider can scan, so I'm trying to figure out how search engines could index my dynamic pages. Should I create a sitemap.xml? If yes, in which way? Or should I create a "hidden" page that contains links to all my dynamic content, like mysite/id1/Name1, mysite/id2/Name2 and so on?
thank you
A good starting point is definitely a sitemap.xml. You could try, for example, the IIS SEO Toolkit and see whether it is able to index any of your pages: http://www.iis.net/downloads/microsoft/search-engine-optimization-toolkit
It also has functionality to generate a sitemap.xml, although I'm guessing that in your case you have dynamic content, so a better approach would be a "handler" that generates the sitemap dynamically on demand (maybe cached for performance reasons).
I would also recommend having some pages that are actually reachable through normal links. For example, put a link on your home page to a "site map" page (not sitemap.xml) where you render the set of links you want indexed (at least the ones most important to you); that will make them easy to discover.
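
To illustrate what such a handler would produce, here is a language-neutral sketch written in Python rather than ASP.NET. It assumes the profiles come from your database as (id, name) pairs and that the public URLs follow the mysite/id/Name pattern from the question; build_sitemap is a made-up name, and a real handler would also set the XML content type and cache the result.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote


def build_sitemap(base_url, profiles):
    """Return a sitemap.xml document (as a string) for (id, name) pairs."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for profile_id, name in profiles:
        url = ET.SubElement(urlset, "url")
        loc = ET.SubElement(url, "loc")
        # Percent-encode the name so spaces and similar characters are URL-safe.
        loc.text = f"{base_url.rstrip('/')}/{profile_id}/{quote(str(name))}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

For example, build_sitemap("http://mysite", [(1, "Alice"), (2, "Bob Smith")]) yields one <url> entry per profile, which is what the crawler reads instead of needing a "hidden" links page.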

How to Hide Drupal Site from Google?

I have a website, and I use Drupal as the CMS, i.e. I write articles in mysite.com/drupal/ but I display them at mysite.com/show_article.php?article_id=x&article_name=y (actually mysite.com/article/x/y with an .htaccess RewriteRule) by loading the content from the Drupal database.
However, when I search for an article in Google, mysite.com/drupal/node/x/y appears as the result, but mysite.com/article/x/y doesn't.
So I guess I need to add some <google nofollow> tags in Drupal's PHP pages, but which ones? Or is there an easier configuration setting for this?
Thanks !
robots.txt:
User-agent: *
disallow: /drupal/
Then you'll want to submit an index for mysite.com/article to Google. It sounds like you won't be able to use an XML sitemap for this, however.
That should be all you need. Of course, the obvious question (if I understand what you're doing) is why you're using custom PHP to serve up Drupal content from the same domain instead of just doing this in Drupal.
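
If you want to double-check the rule before deploying it, Python's standard urllib.robotparser can evaluate it offline. This is just a sanity-check sketch: is_allowed is a helper name made up here, the URLs are the placeholders from the question, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The rule from the answer above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drupal/
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if robots_txt lets user_agent crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With this rule, is_allowed(ROBOTS_TXT, "http://mysite.com/drupal/node/1") comes back False while the /article/ URLs stay crawlable.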

SEO-Setting website to search dynamic data

I want to set up my website. It has many user profiles, which are dynamic,
e.g. http://test.com?profile=2, http://test.com?profile=3.
What steps do I need to take so that all profiles show up in search engines dynamically?
1) I have a Google Webmaster Tools account.
2) I added a sitemap and robots.txt for the site.
After a month or so (indexing is done, as I can see in my Webmaster Tools account),
if I search for a profile (say, by name) I don't see the user profile in the search results.
I have added the URL parameters as well, e.g. profile here.
Am I missing anything?
Can you get to a profile from the home page by basic links alone?
Search engines like to be able to find your pages on their own.
Do a more specific search first, e.g. add site:test.com to your search so that only your site is competing.
Check that you have not blocked the pages in the robots.txt file or via the robots meta tag on the page.
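
For the last check, a quick way to see whether a profile page carries a blocking robots meta tag is to parse its HTML. A minimal sketch, assuming you already have the page source as a string; RobotsMetaParser and is_noindex are names invented for this example, and it does not look at the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)  # attribute names arrive lowercased
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives += [
                part.strip().lower() for part in content.split(",") if part.strip()
            ]


def is_noindex(html):
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in parser.directives or "none" in parser.directives
```

If is_noindex returns True for a profile page, that tag alone is enough to keep the page out of the results, whatever the sitemap says.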
