Mediawiki / Excel: Hyperlink from Excel to a non-existent wiki page gives a 404 - how can I fix or work around this?

I suspect this could be something faulty with Excel (although I keep an open mind), but I wondered if anyone knew how I could get around this apparent bug:
I wish to create Excel spreadsheets which link to pages in a local wiki (running MW 1.14.0, full details below) where those pages don't yet all exist.
The idea is that over time we will fill in the details on those pages, but we would like to create the links now (because copies of the Excel files will get sent out to various internal users, and it will not be feasible to track them down and add links later once the pages are created).
The problem is that when I create such a hyperlink in Excel and then go to follow the hyperlink, I get a message back indicating that the page does not exist. The full text of the message is:
"Unable to open http://. The Internet site reports that the item you requested could not be found. (HTTP/1.0 404)"
This happens on our site, and in fact also if you link to a non-existent page on Wikipedia (e.g. http://en.wikipedia.org/wiki/Swed53rf). Whereas if you put such a link into a browser you get the correct response (which is to be taken to a page indicating that there is no such page, but that you can create it by following the usual link).
Is there some setting on Apache that I might need to configure / override to make sure it returns a valid server response to Excel?
Creating links to existing pages works fine. I appreciate that in theory we could go around creating all the pages that are required, but some of the people involved in the project (creating the initial Excel files) do not or cannot use our wiki, and it would be better if this just worked as it appears it should, rather than having to add steps to work around it in this way.
I also wondered if it was anything to do with the short URL reformatting. Our wiki, like Wikipedia, has short URLs, e.g.:
http://server/w/index.php?title=User:Joe_Blogs/Sandbox
can be reached from
http://server/wiki/User:Joe_Blogs/Sandbox
but including hyperlinks to the full name versions of the pages does not resolve the issue.
The version of Excel being used is Excel 2003 (SP3)
I have discovered that this also happens with Word 2003 (I imagine they share the same code). However, the desired behaviour occurs with Lotus Notes (a miracle, as it's rubbish in so many other ways!)
I have not done any significant development on Apache, but I could consider some form of custom page that redirects to the non-existent wiki page if MediaWiki changes were deemed too complex/tricky (although I'm not particularly sure where I'd start with this idea, I'm guessing some sort of URL parameter to accept the destination page name might be a possible approach).
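Something like this rough sketch is what I have in mind (the goto.php name and its parameter are entirely made up; it simply answers with a 302 redirect to whatever page name it is given, so the wiki's 404 never reaches Excel):

<?php
// goto.php - hypothetical redirect stub; name and parameter invented.
// Always answers 302, so the client is taken on to the wiki page
// whether or not that page exists yet.
$page = isset( $_GET['page'] ) ? $_GET['page'] : 'Main_Page';
// Encode the page name, but keep slashes in subpage names intact.
$target = 'http://server/wiki/' . str_replace( '%2F', '/', rawurlencode( $page ) );
header( 'Location: ' . $target, true, 302 );
exit;

The Excel hyperlinks would then point at e.g. http://server/w/goto.php?page=User:Joe_Blogs/Sandbox instead of at the page itself.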
Any helpful suggestions gratefully received!!
[FYI: I have posted a question on MWUsers forum (www.mwusers.com) too after Googling this to no avail! I'll update the forum response there if I get an answer here or vice versa]
Many thanks,
Neil
Running on Ubuntu Server 8.10
Product Version:
MediaWiki 1.14.0
PHP 5.2.4-2ubuntu5.6 (apache2handler)
MySQL 5.0.51a-3ubuntu5.4
Installed extensions:
CategoryTree (Version r44056)
Renameuser
ImageMap (Version r35980)
ParserFunctions (Version 1.1.1)
StringFunctions (Version 2.0.2)

Not sure how to get Excel to let you go to a page which turns out to be a 404, but as a temporary workaround, you can hack out MediaWiki's 404 reporting on missing pages...
In MediaWiki 1.14 or 1.15 releases this will be in Article::view() in includes/Article.php:
if( $return404 ) {
    $wgRequest->response()->header( "HTTP/1.x 404 Not Found" );
}
Note that the latest dev code is a little different, but you can find it where it sends the same header in the same file. :)
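In other words, the quick hack is simply to comment that header out (a sketch against the 1.14 code above; be aware that missing pages will then come back as plain 200s, so proxies and robots will no longer see them as missing):

if( $return404 ) {
    // Commented out so a missing page returns 200 instead of 404,
    // letting Excel follow the link through to the "create this page" view.
    // $wgRequest->response()->header( "HTTP/1.x 404 Not Found" );
}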

Wikipedia returns a 404 with a redirect which gets you to the page you want; my guess would be that Excel's rendering engine is not following the redirect.
You could try capturing the conversation in Wireshark, both with a browser and with Excel. That might show you what's happening differently.
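If Wireshark feels heavyweight, a quick check from PHP shows much the same thing; get_headers() performs the request and returns the raw response headers, including the status line and any Location: headers a browser would follow silently:

<?php
// Inspect the raw response headers for a missing page.
// The Wikipedia URL is just the example from the question.
$headers = get_headers( 'http://en.wikipedia.org/wiki/Swed53rf' );
print_r( $headers );
// Expect a status line such as "HTTP/1.0 404 Not Found"; by default
// get_headers() follows redirects and includes the headers of every hop.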
Surely once you roll out the new pages, the links would start working, though?

Related

How can I get a website report showing links from each page

I want to get a report which specifies all the links on each page of a website. I tried using different software, but the problem is that they just give all the links without showing exactly which links are on each page. Also, the website I am trying to report on is very unstructured, so it's not possible to classify links based on the forward slashes in the URL. For example, links starting with https://example.com/blog will not give me all the links inside the 'https://example.com/blog' page, because links inside the 'https://example.com/blog' page can be links that don't start with 'https://example.com/blog/'.
What can I do about this?
Thanks.
In Google Analytics, there is no such concept as the next page.
Rather, it only knows the previous page.
It is due to the disconnected nature of the web.
You can, however, use the previous page to trace back to get the data you want.
Instead of looking for all links inside https://example.com/blog, you will be looking for all links where the previous page is https://example.com/blog.
More detailed explanation
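If Analytics data isn't available, a minimal crawler sketch along these lines would also give a per-page listing (example.com is the placeholder from the question; one page shown, error handling omitted):

<?php
// Fetch one page and list every link found on it, so the report is
// organised per page rather than as one flat list of URLs.
$page = 'https://example.com/blog';
$html = file_get_contents( $page );

$doc = new DOMDocument();
// Suppress warnings: real-world HTML is rarely well-formed.
@$doc->loadHTML( $html );

echo "Links found on $page:\n";
foreach ( $doc->getElementsByTagName( 'a' ) as $a ) {
    echo '  ' . $a->getAttribute( 'href' ) . "\n";
}

Repeating this over a queue of discovered pages builds the full report.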

SEO: Google fetch returns blank page (but rendered HTML seems correct)

I usually find all the answers to my questions, but this time I could not find any. This is actually the first time I have posted on Stack Overflow!
Here is my problem.
The root of my website, "www.example.com", returns a blank (though not empty) page when I use Google Webmaster Tools to fetch my website. When I look at the rendered HTML, it is exactly what it should be, but the preview of the page is just blank.
All the other pages of my website like "www.example.com/sample_page.html" seem to give the proper preview though. I have even tried to make a redirection (htaccess) of the root domain "www.example.com" to "www.example.com/sample_page.html" but it also gives a blank preview.
I serve cached HTML files, so it has nothing to do with JavaScript being enabled or anything of that sort. Furthermore, as I said, the rendered HTML seems OK; it's just the preview that does not return anything.
Any hint is greatly appreciated, as I have been trying to find a solution for a few weeks now.

View. Show values as Links. Strange behaviour

The XPage (listPostits.xsp) has a "View" container control, where one of the columns is set to "show values in this column as links".
Now, here comes "Strange behaviour".
When I work with this application on my own (developer) PC (Win XP, Chrome or IE), Domino generates a link which can't really be processed:
/servername/db/postit/postit.nsf/listPostits.xsp/onePostit.xsp?documentId=many_numbers&action=editDocument
Namely, the listPostits.xsp portion shouldn't be there! This portion is the name of the XPage the View control is in.
When I work with the application from another PC (Mac, Firefox), I get the correct link (the same as above but without the XPage name in between):
/servername/db/postit/postit.nsf/onePostit.xsp?documentId=many_numbers&action=editDocument
Update: let us leave aside for the moment the differences in the generated links between the two machines. The first question is: why is the extra portion inserted into the automatically generated link?
After playing around I think I might have found the reason for this strange behaviour, namely the "Substitution" rules on the server side. One of them substitutes "*/postit/all" with "/db/postit/postit.nsf/listPostits.xsp".
If I switch it off, the links are generated properly. Still, it's pretty strange to me that these settings influence the way Domino generates links. I thought it applied them on the fly and that they had nothing to do with how links are generated inside the application.
So help is now needed regarding the Web Site Rule topic, but for that, I guess, I have to create another question. In case somebody has some good info on this, please share it with me. I'm a bit confused at the moment :)
Final Update: Spent some more hours of testing and the results confirmed the initial idea.
If I open the page with the standard URL, i.e.
http://servername/db/postit/postit.nsf/listPostits.xsp then everything is fine and the links are generated properly. When I open the same page with the short URL http://servername/postit/all, however, the server adds the substituted URL (db/postit/postit.nsf/listPostits.xsp) to every single link it generates automatically as the link to open/edit the underlying document.
Is it a bug or a feature? I don't know.
As a workaround (because I want to keep simple URLs for the application) I have to generate the links manually.

Drupal: Cannot save any nodes of certain content types, used to work

Whenever I save or create a node of a certain user-defined type, I end up back in the edit window instead of switching to the first tab, labeled View. All my fields (body, title etc.) are as they were, and no message appears, neither directly on the page nor in the watchdog database log. The validation is working, though, because I see the 'required' messages as soon as I try to save without a title, for example.
The strange thing is that when I create a new content type, or use the predefined Story and Page types, I can edit and create nodes. In those cases, Drupal answers with a 302 redirect, whereas with the problematic content types, only a 200 HTTP status is returned.
The issue doesn't seem to be related to either JavaScript (on or off, no difference), Browser (tried Chrome and Firefox) or WYSIWYG (used input formats with and without).
I'm using Drupal 6.22 and the CCK. I have about 7 content types, some of them with fields. I am not using Rules, but a multitude of modules, all of which are up-to-date. I will post a list if this issue can't be solved otherwise.
I have spent the last few hours trying to figure this out, both by looking at my installation (settings, database) and by searching Google & Co.
Any ideas?
The situation arose because Drupal translated both the Upload and the Save buttons to the same word, Speichern. The FileField issue tracker contains the corresponding thread: http://drupal.org/node/684426
The ImageField and Locale modules, together with a language such as German or Finnish, were partly responsible for the trouble.
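A quick way to see the collision directly is to query the stored translations from a Drupal 6 bootstrap (table and column names assumed from a standard Drupal 6 schema; back up before changing anything):

<?php
// List the German translations of 'Upload' and 'Save'; if both come
// back as 'Speichern', the two buttons have collapsed into one label.
$result = db_query("SELECT s.source, t.translation
                    FROM {locales_source} s
                    JOIN {locales_target} t ON t.lid = s.lid
                    WHERE t.language = 'de'
                      AND s.source IN ('Upload', 'Save')");
while ($row = db_fetch_object($result)) {
  print $row->source . ' => ' . $row->translation . "\n";
}

Giving 'Upload' a distinct translation via Administer > Site building > Translate interface then resolves the clash.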

How do I move old content down in the search engine rankings?

There is some precedent for search-engine-ranking-related questions on StackOverflow, so please don't close this question. It's programming-related to the extent that HTML META tags can be called "programming".
Here's the problem:
We make FogBugz, the software project planning and bug tracking suite.
Either we did a great job with our old documentation or a crummy job with our new documentation, but for most of the popular searches on FogBugz terms, documentation for our old versions comes up.
Here's an example. For context, our current FogBugz version is FogBugz 7. The top two results for that search are for FogBugz 5, which is positively ancient.
As best I can tell, there are several options for getting these results out of the top slots, but each has problems:
A NOINDEX tag, but what happens if someone is actually searching for help on an old version?
Finding the incoming links to the old documentation and placing a NOFOLLOW on them to deprive the old docs of PageRank. The problem here is that it's really fiddly to find the links to the content, rather than changing the content itself.
The unavailable_after tag, which is just a time-delayed NOINDEX, with the same problem of removal rather than demotion.
I just want these old documentation versions to stop competing with our current versions, without being completely unavailable.
An approach I used in the past (3 years ago)
Change the URLs of your old documentation, and change your own links to point to the new URL, e.g. abc.com/docs/fogzbugz/v5/xyz becomes abc.com/docs/fogzbugz/ancient/v5/xyz.
At the old URLs, implement a 301 redirect to your new v7 content, e.g. a request to abc.com/docs/fogzbugz/v5/GettingStarted.html is redirected to abc.com/docs/fogzbugz/v7/GettingStarted.html (a sketch of this follows below).
In this way, existing links from external sites will take browsers to the latest documentation, and inform indexing robots that the page has moved.
Google will find the new links to your old documentation by indexing your site, but there will be no external links, thus reducing its PageRank.
Google will also find the links to your new documentation, and as more sites link to it, its PageRank will increase and it will take priority.
This worked for me on a small scale (100 or so pages) site, and visitor attempts to view the old content rapidly dropped off.
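For the redirect in the second step, assuming the old pages are served through PHP (on a static Apache setup a Redirect directive does the same job), a stub left at the old URL could be as simple as:

<?php
// Hypothetical stub at the old v5 URL: answer with a permanent (301)
// redirect so browsers land on the v7 page and search engines transfer
// the old URL's ranking signals to the new one.
header( 'HTTP/1.1 301 Moved Permanently' );
header( 'Location: http://abc.com/docs/fogzbugz/v7/GettingStarted.html' );
exit;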
If a user does land on a v5 page, how about the MSDN approach of explicitly stating the version that the page describes, and providing links to the equivalent topic in the v6 and v7 docs?
I would suggest that external links to older versions get redirected to the latest version - with some sort of note that if you really needed version 5 the link is here.
I think a lot of the problem deals with the fact that search engines give something a high rank if a lot of people are linking to a specific page. Unless you can get all the people linking to your old documentation, to link to your new documentation, then you are going to have a problem with the older documents being rated artificially high. In order to overcome this, you might need to change the way you handle documentation pages. One good way would be to always show the newest information on a particular topic, and then only by clicking on a link on the page, do you get to the older versions. Optimally, this would be the same page, with a different parameter, to state which version you want to get documentation for.
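As a rough sketch of that last idea (the docs.php name, parameter names and file layout are all invented for illustration):

<?php
// docs.php?topic=GettingStarted&version=5
// One canonical URL per topic: the newest version is the default,
// and older versions are only reachable via an explicit parameter.
$topic   = isset($_GET['topic']) ? basename($_GET['topic']) : 'GettingStarted';
$version = isset($_GET['version']) ? (int) $_GET['version'] : 7; // 7 = current

$file = "docs/fogzbugz/v$version/$topic.html";
if (!is_file($file)) {
    $file = "docs/fogzbugz/v7/$topic.html"; // unknown version: fall back
}
readfile($file);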
What about trying the MSDN approach? You assign a version tag to your pages. When this page is displayed, its version number is displayed as well. Users will be able to see immediately that this information is deprecated.
You may need to write some stubs for new version pages, like "This problem has been resolved in the current version", so that users don't think you did nothing in 5 years. Some writing work, some interlinking, but it's doable for a limited number of problematic pages.
