I've had my photography site at photography.brianbattenfeld.com, but it's now becoming my primary income and I'm doing it pretty much full time, so my primary domain should be my photography portfolio.
I'm thinking about having brianbattenfeldphotography.com and/or brianbattenfeld.com be my new domain for photography.
So my questions are:
If I make brianbattenfeldphotography.com just an alias of photography.brianbattenfeld.com are there significant SEO or analytics issues I should be worried about?
Will one perform better than the other, or rank higher?
Does it make a difference which one people visit?
Do search engines generally acknowledge the alias as 'secondary' somehow, because it's not where the files are actually stored?
A lot of questions, I know, but I'm just trying to figure out what impact this may have.
In general, when moving a site or just changing the domain (because that is what you're doing, changing from a subdomain to the primary one), do NOT create duplicate content.
Essentially, if you go to subdomain.domain.com and get the same site as www.domain.com without the URL changing, you have duplicate content.
What I would suggest is that you create a permanent (301) redirect from subdomain.domain.com to domain.com. That way, Google will transfer your rankings from the old URLs to the new ones. It can take some time to happen, but it will happen.
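For example, if the old subdomain is served by Apache, an .htaccess file along these lines would do it (a minimal sketch, assuming mod_rewrite is enabled and that the new site lives at brianbattenfeld.com; adjust the scheme if you are not on HTTPS):

# Permanently (301) redirect every request on the old subdomain
# to the same path on the new primary domain.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^photography\.brianbattenfeld\.com$ [NC]
RewriteRule ^(.*)$ https://brianbattenfeld.com/$1 [R=301,L]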
So to answer your questions:
Do not make an alias (that would create duplicate content)
They will perform differently, based on the number of inbound links. They could also both perform poorly if Google sees them as duplicate content.
No difference to the visitors
It's not "secondary", it is a separate page. On this, however, I feel I need to mention canonical URLs. They should only be used when you have pages whose body duplicates that of another page, either on the same or a different domain. Using canonical URLs for every page here is A) overkill and B) not a great idea; you might as well use a 301 redirect. You can read more about canonical URLs here: http://googlewebmastercentral.blogspot.se/2009/02/specify-your-canonical.html
Hopefully that answers your question.
I want to build ... something (website? app? tool of some variety?) that searches other sites -- such as Amazon -- for specific items and then lists whether or not those items exist. Ideally it could also pull prices, but that's secondary.
I'd like to be able to enter a (very specific, an identification number) search term into the thing that I build and then have the thing return whether or not the searched item exists on the sites that it checks (a predetermined list). I'd also like it to take a list of ID numbers and search them all at once.
I have no idea where to begin. Can anyone point me in the right direction? What do I need to learn to make this happen?
You will need to learn a few key languages in order to start working on a program like this:
PHP: you need a server-side language to fetch and parse the other sites
JavaScript: for the input on the user's side
HTML: to embed the JavaScript and present the results
Once you learn the basics, search Stack Overflow for specific questions relating to a specific problem.
This is certainly too broad a question, but since the OP asks to be pointed in some direction, here are a few suggestions:
Well, this seems to be a big project. You'll need to find out whether the sites you want to fetch product info from offer an official API. If they do, use the API to retrieve the product info; otherwise, use web scraping, where you retrieve the data by parsing the page and storing it in your local database.
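As a very rough sketch of the existence check (hypothetical: the URL pattern, the IDs, and the "HTTP 200 means found" rule are placeholders you'd swap for the real site's API or markup, and you should check the site's terms of service before scraping):

<?php
// Check whether each product ID exists on a site, assuming the site exposes
// a predictable product URL pattern. Replace $urlTemplate with the real
// pattern or, better, with a call to the site's official API.

function productExists(string $id, string $urlTemplate): bool
{
    $url = str_replace('{id}', urlencode($id), $urlTemplate);

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // we only need the status code
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    return $status === 200;                          // anything else is treated as "not found"
}

$urlTemplate = 'https://www.example.com/products/{id}'; // placeholder, not a real endpoint
foreach (['ID-0001', 'ID-0002'] as $id) {
    echo $id . ': ' . (productExists($id, $urlTemplate) ? 'found' : 'not found') . PHP_EOL;
}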
Amazon also provides EC2 instances, where you can rent a server with a specific configuration by the hour as needed, e.g. Linux with Apache/MySQL/PHP, or Linux/Java.
Amazon has a set of other tools, like S3 storage, where you can host your images/docs/videos and link to them on the site.
Hope this helps in some way.
I have an eCommerce site that I am working on. The primary way customers find products is by filtering down results with a search menu (options include department, brand, color, size, etc.).
The problem is that the menu creates a lot of duplicate content, which I am afraid will cause problems with search engines like Google and Bing. A product can be found on multiple pages, depending on which combination of filters is used.
My question is, what is the best way to handle the duplicate content?
As far as I can tell, I have a few options: (1) do nothing and let search engines index everything; (2) use a canonical link tag in the header so search engines only index the department pages; (3) put rel="nofollow" on the filter links -- though, to be honest, I'm not sure how that works internally; (4) put noindex in the header of the filtered pages.
Any light that can be shed on this would be great.
This is exactly what canonical URLs are for. Choose a primary URL for those pages and make that the canonical URL. This is usually one that isn't found using filters. This way the search engines know to display that URL in the search results. And if they find the filtered pages from incoming links they give credit to the canonical URL which helps its rankings.
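For example, every filtered variation of a department page could carry the same canonical tag in its <head> (the URLs here are placeholders for your real department and filter URLs):

<!-- On a filtered page such as /shoes?brand=acme&color=blue -->
<link rel="canonical" href="https://www.example.com/shoes" />

Search engines then consolidate the filtered variations under /shoes instead of indexing each filter combination separately.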
I have a page that serves up dynamic content
/for-sale
the page should always have at least one parameter
/for-sale?id=1
I'd like to disallow
/for-sale
but allow
/for-sale?id=*
without affecting the bots' ability to crawl the site or risking a negative effect on SERPs.
Is this possible?
What you want does not work using robots.txt:
There is no Allow: directive in the original robots exclusion standard, although the RFC draft written by M. Koster suggests one (and some crawlers seem to support it).
Query strings and wildcards are not supported either, so disallowing the "naked" version will also disallow every URL that starts with it. Surely not what you want.
Anything in robots.txt is entirely optional, and merely a hint. No robot is required to request that file at all, or to respect anything you say.
You will almost certainly find one or several web crawlers for which any or all of the above is wrong, and you have no way of knowing.
To address the actual problem, you could put a rewrite rule into your Apache configuration file. There is readily available code for turning a URL with a query string into a normal URL.
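A minimal .htaccess sketch of that idea, assuming mod_rewrite is enabled and that /for-sale is whatever script currently handles the id parameter:

RewriteEngine On

# Redirect the old query-string form to a clean URL: /for-sale?id=123 -> /for-sale/123.
# THE_REQUEST is matched (not QUERY_STRING) so the internal rewrite below
# cannot re-trigger this rule and cause a redirect loop.
RewriteCond %{THE_REQUEST} \s/for-sale\?id=([0-9]+)\s
RewriteRule ^for-sale/?$ /for-sale/%1? [R=301,L]

# Internally map the clean URL back to whatever currently serves the page.
RewriteRule ^for-sale/([0-9]+)/?$ /for-sale?id=$1 [L]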
(Alternatively, you could just leave the id query string in place. The One Search Engine that makes up 85% of your traffic eats them just fine, and the other two that make up 90% of what is not Google do as well.
So your fear is really only about search engines that nobody uses, and about spam harvesters.)
I think this should work for crawlers that support Allow: and pick the most specific (longest) matching rule -- Googlebot and Bingbot do, the original standard does not -- as long as the rules sit inside a user-agent group:
User-agent: *
Disallow: /for-sale
Allow: /for-sale?id=
I need to categorize domains into the categories that represent the best use of each domain name.
Like categorizing 'gamez.com' as a gaming portal.
Is there any service that offers classification of domain names, like Sedo does?
All the systems that I am aware of manage a list, somewhat by hand.
Using web-filtering proxies (e.g. WebSense) for inspiration, you could scan for keywords contained in the domain name, or in web content/meta tags at the specified location. However, there are always items that seem to match more than one category, or no category, and these need deeper analysis.
Eventually you end up building your own fairly complex logic, maintaining a list by hand, or buying a list from someone else.
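A minimal sketch of the keyword-matching part (the keyword list and category names are made up for illustration; a real system needs a much larger, hand-maintained list plus content analysis for the ambiguous cases):

<?php
// Map substrings of the domain name to candidate categories.
// An empty or multi-element result is exactly the "needs deeper analysis" case.

function categorizeDomain(string $domain, array $keywordMap): array
{
    $matches = [];
    foreach ($keywordMap as $keyword => $category) {
        if (stripos($domain, $keyword) !== false) {
            $matches[] = $category;
        }
    }
    return array_values(array_unique($matches));
}

$keywordMap = [
    'game'  => 'Gaming',
    'photo' => 'Photography',
    'shop'  => 'E-commerce',
    'news'  => 'News/Media',
];

print_r(categorizeDomain('gamez.com', $keywordMap));   // one match: Gaming
print_r(categorizeDomain('abc123.com', $keywordMap));  // no match -> needs deeper analysis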
The SimilarWeb API does that.
It's really straightforward and returns a given domain's category from a URL.
If these are new or unused domains, there isn't any information on the internet yet. You can make use of Mechanical Turk: https://www.mturk.com/ .
You could post a task with your list and possible categories. The downside is that this will cost you money.
If these are domains that are already in use, you can use a bookmarking service such as Xmarks or Delicious. Retrieve all public bookmarks for that domain and count the tags; the most-used tags will indicate the domain's category.
I think https://tools.zvelo.com/ has pretty accurate categorization.
For example, gamez.com comes back with Hobbies and Interests as IAB-TIER-1 and Video & Computer Games as IAB-TIER-2.
It also tells you whether the domain is brand-safe and whether it hosts malicious or illegal content.
The answer to the first question would be a link to the web-site from a page that is being crawled (a page the search engine already knows). But if you register very_long_name_without_any_sense_123kni.com, I guess it will find it anyway.
The second question is about folders. If you have a robots.txt in your root directory, then it's a bit clearer. But if your web-site has no robots.txt, how will a search engine find all the folders that are allowed to be accessed?
If a search engine knows your web-site but your web-site has no robots.txt, how long will it take for it to appear in the most popular search engines? 10 minutes? 1 hour? 1 day? 1 week? Never? And how dangerous is it to leave pages (that should be protected) unprotected even for 1 minute, if your web-site is not crawled yet (because it's protected)?
P.S. These questions are not about how to make your web-site popular and get it onto the first pages of results... I'm just curious about the principles of how it works...
They can't, and don't.
That said, they can make some guesses based on known domain names (that information is accessible) and typical default website locations at those domain names.