Multiple URLs pointing to one website in IIS

I don't know whether this is possible or not; please help me with it.
I have URL1: mycompany.test.com
and URL2: mycompany1.test.com
My current customers use URL1 for my product. I have a requirement from one of our customers: they want to use "mycompany1.test.com" instead of "mycompany.test.com". How do I do that so that both URLs point to the same directory? These changes should not affect my remaining customers.
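For what it's worth, IIS supports this out of the box with host-header bindings: one site, one physical directory, several host names, so customers on URL1 are unaffected. A sketch with appcmd (the site name "MyProductSite" is a placeholder for your own site, and mycompany1.test.com must also resolve to the server in DNS); the same binding can be added in IIS Manager under the site's Bindings dialog:
appcmd set site "MyProductSite" /+bindings.[protocol='http',bindingInformation='*:80:mycompany1.test.com']
After that, both host names are served from the same site and directory.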

Related

Spring Data Rest Frontend deep linking

So I have been struggling with this question for some time now:
How to handle a details page, or deep linking, on the frontend.
Say we have a paged collection endpoint with user entities in it and a React app consuming the endpoint.
The flow would be: the user authenticates, gets the collection, clicks on an item, and then one of two things happens.
Option 1: they are redirected to a new URL, say webapp.com/users/userid.
Option 2: a modal opens with the user details.
Say we have a scenario where two people work with the webapp: Person 1 wants to share a link with Person 2, and Person 2 should do some updates on a specific user, who is identified by the link.
The link should be something like: https://www.webapp.com/users/{slug or id}
With option 2 this functionality cannot be mapped.
With option 1 we have to expose the IDs in the response to identify the resource, which may work, but we would still need to hardcode the URL, as the findById method is not exported as a URI template.
So my solution would be to add a slug to the resources, implement a search method for the slug, and then get the user, if found, via its self link.
That sounds like a good solution to me, but on the other hand I would have to add an extra frontend ID (the slug here), which would also need to be unique, to the database model.
So how do you handle a problem like this? Is anybody using Spring Data REST this way, or in production, where you have to handle situations like this?
I should mention that this isn't primarily a problem with Spring Data REST but rather with HATEOAS itself.
Thanks in advance,
Florian
You don't need to hardcode a URL template. Spring Data REST generates links for each resource.
You can refer to them from the front end with something like: {your_user_object}._links.self.href
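For illustration, a minimal front-end sketch of that flow. It assumes the hypothetical findBySlug search method from the question is exposed on the repository, and api.webapp.com and the slug variable are placeholders; only the _links.self.href field is standard Spring Data REST HAL output:
// Sketch: look up a user by slug, then use its self link for updates.
// findBySlug is the hypothetical search method from the question.
fetch('https://api.webapp.com/users/search/findBySlug?slug=' + slug)
  .then(function (res) { return res.json(); })
  .then(function (user) {
    var selfLink = user._links.self.href; // canonical resource URL
    // No hardcoded /users/{id} template: follow the self link instead.
    return fetch(selfLink, {
      method: 'PATCH',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ name: 'New Name' })
    });
  });
This keeps the slug purely a lookup key; the self link stays the canonical identifier.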

Kimono Desktop's payload url and index fields missing

With Kimono Web, the crawled payload always had url and index fields in every source URL JSON. But with the desktop version these fields are missing, and my product depends on them entirely.
I'm browsing the source code of Kimono Desktop, but I couldn't manage to find that part.
The index field is explained here: https://help.kimonolabs.com/hc/en-us/articles/203349674-Add-a-unique-index-to-each-result-object-
Can anyone help me with this?
Thanks
I've had the same issue. I found this workaround for the missing url field with the desktop application: http://mudd.com/blog/how-to-extract-vdp-data-from-your-website/
Also, in case you used the crawl scheduling feature with the Kimono web app: I found that if I edit my APIs and save them again, it lets me choose a crawl frequency. I just discovered this, so I'm crossing my fingers and waiting to see if it really works.
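Until that part turns up in the desktop sources, another workaround is to post-process the payload and re-add the index yourself. A rough Node.js sketch, assuming the payload is a JSON file whose results object maps collection names to arrays of result objects (your payload's exact shape may differ):
// Sketch: re-add a sequential index to each result object in a payload.
// Assumes payload.results maps collection names to arrays of results.
var fs = require('fs');

var payload = JSON.parse(fs.readFileSync('payload.json', 'utf8'));
Object.keys(payload.results).forEach(function (collection) {
  payload.results[collection].forEach(function (item, i) {
    item.index = i; // mimics the old Kimono Web index field
  });
});
fs.writeFileSync('payload.json', JSON.stringify(payload, null, 2));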

Nodejs URL modifications for SEO purposes

I have a working app built on Node.js + Drywall + OpenShift; sorry, it's in Arabic. Basically, I'm looking to improve the service, but I've hit a major roadblock. The site is a classifieds site, and I need to optimize it for SEO; however, my links to ads look like this...
http://yobyobi.com/ads/show/55c9ff9dcf68970612ba2d38
55c9ff9dcf68970612ba2d38 is the ad ID in my MongoDB. I also have a record combining the date and the title of the ad, "Sun-Nov-22-2015-8-pm-2007-camry-for-sale". The goal is to make the URL pretty and understandable by search engines. The end result I want to accomplish is one of the following:
yobyobi.com/ads/show/55c9ff9dcf68970612ba2d38/Sun-Nov-22-2015-8-pm-2007-camry-for-sale
yobyobi.com/ads/show/Sun-Nov-22-2015-8-pm-2007-camry-for-sale/55c9ff9dcf68970612ba2d38
yobyobi.com/ads/show/Sun-Nov-22-2015-8-pm-2007-camry-for-sale/
Now, option 3 would be ideal, but it would slow down my application if I have to search by ad title instead of ad ID. It's similar to what Stack Overflow does (attached pic):
Stack Overflow example
Code
app.get('/ads/show/:id', require('./views/account/ads/index').read);
The above line returns the ad for me with all the details, including the title I want to use, but the problem is that I cannot change the route URL after I receive the title.
I'm not sure whether this module, called "named-routes", would help with what I'm trying to do.
Has anyone run across this problem? If so, can you share some insight on how best to tackle it?
Thanks in advance,
Well, the solution was dead simple; do the following:
Add * as a wildcard, like this:
app.get('/ads/show/:id/*', require('./views/account/ads/index').read);
Now, when you create links to the post, attach anything where the * is, and it will show the same ad without breaking the page.
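To go one step further and get Stack Overflow-style canonical URLs, you can compare the slug segment against the stored one and 301-redirect when it doesn't match. A rough sketch, assuming a Mongoose model Ad with a precomputed slug field (both are placeholders for your own schema):
// Sketch: serve the ad by ID, but 301-redirect to the canonical slug URL.
// `Ad` is a hypothetical Mongoose model with a precomputed `slug` field.
app.get('/ads/show/:id/:slug?', function (req, res, next) {
  Ad.findById(req.params.id, function (err, ad) {
    if (err) { return next(err); }
    if (!ad) { return res.status(404).send('Ad not found'); }
    if (req.params.slug !== ad.slug) {
      // Wrong or missing slug: redirect so search engines see one URL per ad.
      return res.redirect(301, '/ads/show/' + ad._id + '/' + ad.slug);
    }
    res.render('ads/show', { ad: ad });
  });
});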
Cheers

How long does Google take to crawl a new page, and can we influence Google's crawler?

I want to submit my site to Google. How much time does it take to crawl a new post on the website?
Also, is there a way to feed the post to the Google crawler as soon as it is created?
Google has three modes of entering a website into its results: discover, crawl, index.
In order to 'discover' your site, it must be made aware of its existence, normally through back-links. If your site is brand new, you can use the submit-URL form, but this isn't really a trusted method. You're better off signing up for a Google Webmaster Tools account and submitting your site. An additional step is to submit an XML sitemap of your site. If you are publishing to your site in a blogging/posting way, you can also consider PubSubHubbub.
From there on, crawl frequency is normally based on site popularity (as measured by ye olde PageRank). Depth of crawl (crawl-budget) is also determined by PR.
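As a concrete example of the sitemap step above: after publishing a new post, you can ping Google with your sitemap URL. A minimal Node.js sketch (the sitemap location is a placeholder; this ping endpoint is the one Google documented at the time):
// Sketch: ping Google with an updated sitemap after publishing a post.
// Assumes https://example.com/sitemap.xml is your sitemap URL.
var http = require('http');

var sitemap = encodeURIComponent('https://example.com/sitemap.xml');
http.get('http://www.google.com/ping?sitemap=' + sitemap, function (res) {
  console.log('Google ping status: ' + res.statusCode);
});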
There are a couple of ways to help "feed" the Google crawler a URL.
The first way is to submit a URL here: www.google.com/webmasters/tools/submit-url/
The second way is to go to your Google Webmaster Tools, click "Fetch as GoogleBot",
and then input the URL you want to add:
http://i.stack.imgur.com/Q3Iva.png
The URL will then appear similar to this:
http://example.site | Web | Success | URL submitted to index | 1/22/12 2:51 AM
As for how long it takes for a question on here to appear on Google, many factors play into this.
If the owners of the site use Google Webmaster Tools, the following setting is available:
http://i.stack.imgur.com/RqvOi.png
For a fast crawl, you should submit your XML sitemap in Google Webmaster Tools and manually crawl and index your web page URLs through Google Webmaster fetch.
I used this crawl-and-index method too, and this practice gave me the best results.
This is a great resource that really breaks down all the factors that affect a crawl budget and how to optimize your website to increase it. Cleaning up your broken links and removing outdated content, for example, can work wonders. https://prerender.io/crawl-budget-seo/ 
I acknowledged the error in my response in a comment on the original question a long time ago. Now I am updating this post to keep future readers from being misled as I was. Please see the notes from other users below; they are correct: Google does not make use of the revisit-after meta tag. I am keeping the original response text here so that anyone looking for a similar answer will find it along with this note confirming that this meta tag IS NOT VALID! Hope this helps someone.
You may use HTML meta tag as follows:
<meta name="revisit-after" content="1 day">
Adjust the time period as necessary. There is no guarantee that robots will return within the given time frame, but this is how you tell robots how often a given page is likely to change.
The revisit-after meta tag is used to tell search engines when to come back next.

How do I block web scraping without blocking well-behaved bots?

I'm building an e-commerce website with a large database of products. Of course, it is nice when Google indexes all the products of the website. But what if some competitor wants to web-scrape the website and grab all the images and product descriptions?
I was observing some websites with similar lists of products, and they place a CAPTCHA, so "only humans" can read the list of products. The drawback is that it is invisible to Google, Yahoo, and other "well-behaved" bots.
You can discover the IP addresses that Google and the others are using by checking visitor IPs with whois (on the command line or on a web site). Then, once you've accumulated a stash of legitimate search engines, allow them into your product list without the CAPTCHA.
If you're worried about competitors using your text or images, how about a watermark or customized text?
Let them take your images and you'd have your logo on their site!
Since a potential screen-scraping application can spoof the user agent and HTTP referrer (for images) in the header and use a time schedule similar to a human browser's, it is not possible to completely stop professional scrapers. But you can check for these things nevertheless and prevent casual scraping.
I personally find Captchas annoying for anything other than signing up on a site.
One technique you could try is the "honey pot" method: it can be done either by mining log files or via some simple scripting.
The basic process is that you build your own "blacklist" of scraper IPs by looking for IP addresses that view 2+ unrelated products in a very short period of time. Chances are these IPs belong to machines. You can then do a reverse lookup on them to determine whether they are nice (like GoogleBot or Slurp) or bad.
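A rough Express sketch of that tracking step (the 10-second window, the 2-product threshold, the showProduct handler, and the in-memory store are all assumptions for illustration; production code would mine logs or use a shared store):
// Sketch: flag IPs that hit 2+ distinct product pages within 10 seconds.
var hits = {}; // ip -> { products: [ids], first: timestamp }

function scraperWatch(req, res, next) {
  var ip = req.ip;
  var now = Date.now();
  var entry = hits[ip];
  if (!entry || now - entry.first > 10000) {
    entry = hits[ip] = { products: [], first: now };
  }
  if (entry.products.indexOf(req.params.id) === -1) {
    entry.products.push(req.params.id);
  }
  if (entry.products.length >= 2) {
    // Candidate for the blacklist: reverse-look-up this IP before
    // deciding whether it is a nice bot or a scraper.
    console.log('possible scraper: ' + ip);
  }
  next();
}

app.get('/products/:id', scraperWatch, showProduct);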
Blocking web scrapers is not easy, and it's even harder to avoid false positives.
Anyway, you can add some netranges to a whitelist and not serve any CAPTCHA to them.
All the well-known crawlers (Bing, Googlebot, Yahoo, etc.) always use specific netranges when crawling, and all those IP addresses resolve to specific reverse lookups.
A few examples:
Google IP 66.249.65.32 resolves to crawl-66-249-65-32.googlebot.com
Bing IP 157.55.39.139 resolves to msnbot-157-55-39-139.search.msn.com
Yahoo IP 74.6.254.109 resolves to h049.crawl.yahoo.net
So, let's say that '*.googlebot.com', '*.search.msn.com', and '*.crawl.yahoo.net' addresses should be whitelisted; a sketch of that check follows.
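A Node.js sketch of the reverse-lookup check (the suffix list comes from the examples above; the forward re-lookup guards against spoofed PTR records):
// Sketch: verify a "nice bot" by reverse DNS, then forward-confirm.
var dns = require('dns');

var trusted = ['.googlebot.com', '.search.msn.com', '.crawl.yahoo.net'];

function isTrustedBot(ip, callback) {
  dns.reverse(ip, function (err, hostnames) {
    if (err || !hostnames || !hostnames.length) { return callback(false); }
    var host = hostnames[0];
    var ok = trusted.some(function (suffix) {
      return host.endsWith(suffix); // requires Node 4+
    });
    if (!ok) { return callback(false); }
    // Forward-confirm: the name must resolve back to the same IP,
    // otherwise the PTR record could be spoofed.
    dns.lookup(host, function (err2, address) {
      callback(!err2 && address === ip);
    });
  });
}

isTrustedBot('66.249.65.32', function (ok) { /* skip CAPTCHA if ok */ });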
There are plenty of whitelists you can implement out on the internet.
That said, I don't believe CAPTCHA is a solution against advanced scrapers, since services such as deathbycaptcha.com or 2captcha.com promise to solve any kind of CAPTCHA within seconds.
Please have a look at our wiki, http://www.scrapesentry.com/scraping-wiki/; we wrote many articles there on how to prevent, detect, and block web scrapers.
Perhaps I'm over-simplifying, but if your concern is about server performance, then providing an API would lessen the need for scrapers and save you bandwidth and processor time.
Other thoughts listed here:
http://blog.screen-scraper.com/2009/08/17/further-thoughts-on-hindering-screen-scraping/
