MOSS 2007 Crawl - sharepoint

I'm trying to get crawl to work on two separate farms, but I can't get it to work on either one. Each farm has two WFEs, with an additional WFE configured as an Index server. There is one more server dedicated to Query and two clustered SQL 2005 back-end servers for the database. I have tried solutions from at least 50 different websites that I found through a search engine, without success. I have configured (extended) my Web App to use http://servername:12345 as the default zone and http://abc.companyname.com as the custom and intranet zones. When I enter each of those into the content source and then try to run a crawl, I get a couple of errors in the crawl log:
http://servername:12345 returns:
"Could not connect to the server. Please make sure the site is accessible."
http://abc.companyname.com returns:
"Deleted by the gatherer. (The start address or content source that contained this item was deleted and hence this item was deleted.)"
However, I can click both URLs and the page is accessible.
Any ideas?
More info:
I wiped the slate clean, so to speak, and ran another crawl to provide an updated sample.
My content sources are as such:
http://servername:33333
http://sharepoint.portal.fake.com
sps3://servername:33333
My current crawl log errors are:
sps3://servername:33333
Error in PortalCrawl Web Service.
http://servername:33333/mysites
Content for this URL is excluded by the server because a no-index attribute.
http://servername:33333/mysites
Crawled
sts3://servername:33333/contentdbid={62a647a...
Crawled
sts3://servername:33333
Crawled
http://servername:33333
Crawled
http://sharepoint.portal.fake.com
The Crawler could not communicate with the server. Check that the server is available and that the firewall access is configured correctly.
I double checked for typos above and I don't see any so this should be an accurate reflection.

One thing to remember is that crawling SharePoint sites is different from crawling file shares or non-SharePoint websites.
A few other quick pointers:
The sps3: protocol is for crawling user profiles for People Search. You can disregard anything the crawler says about it until you're ready for user profiles.
Your crawl account is supposed to have access to your entire farm. If you see permissions errors, find the KB article that tells you how to reset your crawl account (it's a specific stsadm.exe command). If you're trying to crawl another farm's content, then you'll have to work something else out to grant your crawl account access. I think this is your biggest issue presently.
The crawler (running from the index server) will attempt to visit the public URL. I've had inter-server communication issues before; make sure all three servers can ping each other, and make sure the index server can reach the public URL (open IE on the index server and check it out). If you have problems, it's time to dirty up your index server's hosts file. This is something SharePoint does for you anyway, so don't feel too bad doing it. If you've set up anything aside from Integrated Windows Authentication, you'll have to work harder to get your crawler working.
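If opening IE on the index server is awkward, the same reachability check can be roughed out in a script. This is only a minimal Python sketch, assuming Python is available on the index server; the function names are made up for illustration, and a successful TCP connection says nothing about authentication or crawl permissions.

```python
import socket
from urllib.parse import urlsplit

# Assumption: only http/https start addresses are checked here.
DEFAULT_PORTS = {"http": 80, "https": 443}

def host_and_port(start_address):
    """Split a start address such as http://servername:33333 into (host, port)."""
    parts = urlsplit(start_address)
    if parts.hostname is None:
        raise ValueError("no host name in " + repr(start_address))
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    if port is None:
        raise ValueError("no port given and no default known for " + repr(start_address))
    return parts.hostname, port

def can_connect(host, port, timeout=5.0):
    """Return True if the name resolves and a TCP connection opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts alike.
        return False
```

Running `can_connect(*host_and_port("http://sharepoint.portal.fake.com"))` on the index server and getting False would point at name resolution or a firewall rather than at the crawl account.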
Anyway, there's been a lot of back and forth in the responses, so I'm just shotgunning a bunch of suggestions out there; maybe one of them is on target.

I'm a little confused about your farm topology. A machine installed as just a WFE cannot be an indexer. A machine installed as "complete" can be an indexer, query and/or a WFE...
Also, instead of changing the default content access account, you may want to add a crawl rule instead (once everything is up and running).
Can you see if anything helpful is in the %commonprogramfiles%/microsoft shared/web server extensions/12/logs on your indexer?
The log file may be a bit verbose; you can search for "started" or "full" and that will usually get you to the line in the log where your crawl started.
Also, on your SQL machine, you may be able to get more information from the MSScrawlurlhistory table.
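The keyword search through the log can also be done with a few lines of script if the file is too large to scroll through. The following is a minimal Python sketch; the function name is hypothetical, and it makes no assumption about the log format beyond it being line-based text.

```python
def find_keyword_lines(lines, keywords=("started", "full")):
    """Return (line_number, text) pairs for lines containing any keyword, ignoring case."""
    lowered = [keyword.lower() for keyword in keywords]
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.lower()
        if any(keyword in text for keyword in lowered):
            hits.append((number, line.rstrip("\n")))
    return hits
```

Passing an open file works, for example `find_keyword_lines(open(path, errors="replace"))`, where `path` is one of the files in the logs folder mentioned above.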

Can you create a content source for http://www.cnn.com and start a full crawl? Do you get the same error(s)?
Also, we may want to take this offline; let me know if you want to do that.
I'm not sure if there is a way to send private messages via Stack Overflow, though.

It sounds like most of your issues are related to Kerberos. If you don't have the infrastructure update applied, then SharePoint will not be able to use Kerberos auth to web sites with non-default ports (anything other than 80/443). That's also why (I would bet) you cannot access CA from server 5 when it's on server 4. If you don't have the SPNs set up correctly, then CA will only be accessible from the machine it is installed on. If you had installed SharePoint using port 80 as the default URL, you'd be able to do the local SharePoint crawl without any hitches. But by design, the local SharePoint sites crawl uses the default URL to access the SharePoint sites. Check out http://codefrob.spaces.live.com/blog/cns!7C69E7B2271B08F6!363.entry for a little more detail on how to get Kerberos & SharePoint to work well together.
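To see quickly which start addresses fall into the non-default-port case, something like the following minimal Python sketch could be used. It is purely a URL check, assuming 80 and 443 are the defaults in question; the function name is made up for illustration, and it cannot tell you whether the infrastructure update or the SPNs are in place.

```python
from urllib.parse import urlsplit

# Assumption: "default" means 80 for http and 443 for https.
DEFAULT_PORTS = {"http": 80, "https": 443}

def non_default_port_addresses(addresses):
    """Return the http/https addresses whose explicit port differs from the scheme default."""
    flagged = []
    for address in addresses:
        parts = urlsplit(address)
        default = DEFAULT_PORTS.get(parts.scheme)
        if default is None:
            continue  # not http/https (for example sps3://), out of scope for this check
        if parts.port is not None and parts.port != default:
            flagged.append(address)
    return flagged
```

With the content sources listed in the question, only http://servername:33333 would be flagged; http://sharepoint.portal.fake.com uses the default port and the sps3:// address is skipped.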

In the Services on Server section check the properties for the search crawl account to make sure it is set up, and that it has permissions to access those sites.

Thanks for the new input!
So I came back from my weekend and I wanted to go through your pointers and try every one and then report back about how they didn't work and then post the results that I got. Funny thing happened, though.
I went to my Indexer (servername5) and I tried to connect to Central Admin and the main portal from Internet Explorer. Neither worked. So I went into IIS on the Indexer to try to browse to the main portal from within IIS. That didn't work either and I received an error telling me that something else was using that port. So I saw my old website from the previous build and I deleted it from IIS along with the corresponding Application Pool. Then I started the App Pool for the web site from the new build and browsed to the website. Success. Then I browsed to the website from the browser on my own PC. Success again. Then I ran a crawl by the full URL, not the servername, like so:
http://sharepoint.portal.fake.com
Success again. It crawled the entire portal including the subsites just like I wanted. The "Items in index" populated quickly and I could tell I was rolling.
I still cannot access the Central Admin site hosted on servername4 from servername5. I'm not sure why not but I don't know that it matters much at this point.
Where does this leave me? What was the fix?
I'm still not sure. Maybe it was the rebuild. Maybe as soon as I rebuilt the server farm I had everything I needed to get it to work but it just wouldn't work because of the previous website still in IIS. (It's funny how sloppy a SharePoint un-install can be. Manual deletion of content databases, web sites, and application pools seems necessary, and that probably shouldn't be the case.)
In any event, it's working now on my "test" farm so the key is to get it working on the production farm. I'm hopeful that it won't be so difficult after this experience.
Thanks for the help from everyone!

Related

Datasources only found on 1 site in IIS 7

The title is not quite correct, but here is the problem situation:
Set up multiple sites on the same IIS 7 server
Installed CF10 and it works fine on all sites
CFIDE Datasources can only be found for 1 site, not all of them, even though they still work on all sites
To see CF datasources (using RDS), the URL is sitename/CFIDE/administrator/datasources/index.cfm. Each site in IIS 7 has the CFIDE directory mapped to it as far as I know. It appears in the site folder structure for all my sites as a virtual directory. I used the Web Server Configuration Tool to remove and re-add ColdFusion to all my sites.
The problem is that applications using RDS can only find datasources for one of my sites. It uses the URL given above sitename/CFIDE/administrator/datasources/index.cfm to find the datasources of the site. RDS is not picking up the datasources for any of the other sites.
I tried manually going to sitename2/CFIDE/administrator/datasources/index.cfm (sitename2 being the name of a different site in IIS to the one that's working) and I just get this error:
"The page isn't redirecting properly
Firefox has detected that the server is redirecting the request for this address in a way that will never complete."
Can anyone suggest how to fix this so the URL will resolve for each site? Otherwise my RDS feature has broken, which is not good. If I test the sites themselves, they all work fine and can access my datasources just fine. So something is up with the RDS feature.
I've sorted it. Looks like it was a password thing. I had to remove the password authentication requirement and re-apply it.

first time using SSRS web server, setting up website

It's my first time using this, and as a newbie I have many questions. Any help is appreciated.
My ultimate goal is to have reports, created from a database, that other end users can access on a website so they can view and filter the report data online in a shared way, with some user control settings.
I have already made my reports in Visual Studio, linked to databases.
I have also set up Reporting Services Configuration Manager so that I can access the SSRS home page and the site settings at http://'127.0.0.1'/Reports/Pages/Folder.aspx
Now my question is, how will the other end users be able to get onto the website and get access to the reports I created with SSRS? Do I upload the reports in .rdl on my report site manually or do I deploy it from VS? How do I turn my '127.0.0.1/Reports' into a public site for other users' access? Or do I have to create it using SharePoint?
Thanks so much, I need some guidance to head in the right direction! :)
Now my question is, how will the other end users be able to get onto the website and get access to the reports I created with SSRS?
Users will need 2 things from you to access the site: the server name/address, and a means of authenticating to it. By default, authentication is handled via Windows domain auth (which you can change, with varying degrees of effort...).
Do I upload the reports in .rdl on my report site manually or do I deploy it from VS?
It actually makes no difference in the end; do whichever you find easier. (There are also plenty of other ways to deploy reports, such as through PowerShell!)
How do I turn my '127.0.0.1/Reports' into a public site for other users' access?
Well, you're halfway there. At this point, you could probably open up your firewall (port 80, maybe 443 depending on your config) and have people connect to your computer via IP or hostname. For example, if your computer's IP was 12.34.56.78, they could visit 12.34.56.78/Reports/ and access the site. If you have a means of creating a URL and pointing it to your SSRS server, you might need to open the configuration manager again and bind that URL to SSRS.

SharePoint MOSS 2007: extranet's external FBA login stops working, can't trace cause

We have an extranet site. The Central Admin server is internal, with an external DMZ server that hosts the extended site. The external site connects to LDAP via FBA.
Every so often, the login.aspx page comes up without any core.css formatting or fonts, and users are unable to log in. When they try to submit their login they're re-prompted.
What has worked in the past is overwriting the web.config (containing connections to our LDAP server) with a known good copy. Recently, in addition to the overwrite, we've had to run an IISRESET.
The "default" internal site, connected to the local domain, works fine and without interruption.
The web.config 'fix' worked for a couple years, but recently it has been happening more frequently, almost daily. Last time it occurred, I noted that the 'bad' web.config's Modify date hadn't changed. A file compare shows them to be identical.
Has anyone seen something like this before? The only site customizations are the standard web.config connections to LDAP.
Much thanks,
Scott

SharePoint 2010 Search not working

I have installed and configured SharePoint 2010 to run on the same box as the SQL Server it's running from, on Windows Server 2008 R2. Everything is working fine except the search. I have uploaded several documents and tagged several items (documents, tasks, announcements, etc.); however, whenever I search the site using the default search, I get nothing returned no matter what I search on. I simply get "We did not find any results for [search term]". I know there is setup needed if you wish to use "FAST search", but do I have to do anything to get the standard default search to work?
Found the answer on SharePoint.SE:
After installing the system you need to configure your indexing job.
Navigate to CA > Service Applications > Search.
You will see a link to your Content Sources. If you edit that it will give you the opportunity to setup a schedule for both Full and Incremental indexing.
You can kick off a full crawl; once it completes, you will have results if everything is configured correctly.
It works for people search too. If you edit this content source, you will see the sps protocol, which is for user profiles.
To make people search work, in Central Administration > Manage Service Applications, make sure to provide a valid domain\user account as administrator in the User Profile Service Application.
To add on to Chensformers' answer, the account has to have "Retrieve People Data for Search Crawlers" enabled. It has to have this under the Administrators button, even if Permissions is set to Full Control! Quite misleading.

Setup Sharepoint search?

For some reason my search in the sharepoint site does not work.
I have set up the SSP, the scopes, the crawls, everything, but it still does not work.
Can someone explain to me how to set up the search? Maybe I did something wrong in the process.
It's not the simplest thing in the world to set up, as it's comprised of a number of components.
You need to check each one to determine where your problem is.
Start from the crawl, and work your way forward to the search production on the page.
So check the following:
Check that some servers have been set up to index pages. (You can see this under Services on Server in the Central Administration pages.)
Make sure they're all running correctly. (Not in a half started state.)
Check your crawl log in your SSP to see if it is indexing anything.
(Index different types of content, like file shares, web sites, and SharePoint itself, and check each one.)
(Note that you need a special plugin to index PDFs.)
Check your index is copied to the front end server where it is used.
If it's not, it may be because this hasn't been configured. (Check Services running on servers again.)
Then check your site collection setup, and ensure you have a search site configured.
Ensure the site collection search details are configured to use the search site.
Finally check the user doing the searching actually has access to the content being indexed.
Doing all of that should give you some idea of where the problem is.
In addition to Bravax's answer, it's worth checking that you are not getting stung by the local loopback check.
I had a similar problem and ended up using Search Server Express, which is free (see my answer from this link: sharepoint 2010 foundation search not working).
I have installed Search Server Express 2010 on top of SPF, which works great. It has additional features and works well with SharePoint Foundation. Here is a link for upgrade and configuration: http://www.mssharepointtips.com/tip.asp?id=1086
You need to add the website to the content source, then run a full crawl to index the data.

Resources