My question pertains specifically to the two pages below, but also relates more generally to methods for using clean URLs without an .htaccess file.
http://www.decitectural.com/
and
http://www.decitectural.com/about/
The pages above are hosted on Amazon's S3, which does not allow the use of .htaccess files. As a result, I have found no easy way to create a clean URL rewrite scheme that sends all requests to an index file which, in turn, interprets the URL using JavaScript and loads the correct page (with AJAX or, as is the case with decitectural, with simple div visibility toggling).
To work around this, I usually edit the Amazon S3 bucket properties and set both the index document and the error document to the index.html file. That way, index.html is served even when an invalid path (such as /about/) is requested. This has, for the most part, been a functioning solution... that is, until I realized that the error document is served with a 404 status code, which stops Google from indexing the page.
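For reference, the same configuration can be applied with the AWS CLI instead of the console (the bucket name is a placeholder):

aws s3 website s3://my-bucket/ --index-document index.html --error-document index.html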
This has led me to seek an alternative solution. As a temporary fix, I am currently creating the /about/ directory on the server with a duplicate of the index.html file in it. This works, but it is obviously not a real solution to the problem.
I would appreciate any advice on how to set up a clean URL routing scheme on S3 or in any instance where an .htaccess file can't be used.
Here are a few solutions: Pretty URLs without mod_rewrite, without .htaccess
Also, you could run a script that creates the files dynamically from an array or a database, so that it generates all of your URLs:
/index.html
/about/index.html
/contact/index.html
...
Then hook the script into every edit, run it from cron, or run it manually. Not the best in terms of performance, but hey, it should work.
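For example, a minimal sketch in PHP (the $routes array is hypothetical; in practice you would load it from your database):

<?php
// generate.php: copy the single-page index.html into one directory
// per clean URL, so /about/ etc. resolve to real files on S3.
$routes = ['about', 'contact']; // hypothetical; load from your database

foreach ($routes as $route) {
    $dir = __DIR__ . '/' . $route;
    if (!is_dir($dir)) {
        mkdir($dir, 0755, true);
    }
    copy(__DIR__ . '/index.html', $dir . '/index.html');
}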
I think you are going about it the wrong way. S3 gives you complete control of the page structure of your site. If you want your link to be "/about", just upload a file called "about", and you're done. (Set the headers so that the browser knows it's HTML.)
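With the AWS CLI, for instance, that upload could look like this (bucket name is a placeholder):

aws s3 cp about.html s3://my-bucket/about --content-type text/html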
Yes, it will break if someone links to "/about/" or "/about.html". But pretty much any site will break if you mess with its links in odd ways. You will have to be vigilant when linking to your own site, because you won't have any rewrite rules to clean up for you. But you should have automation doing that anyway.
Related
We have an FAQ page at /faq (tab style) where every question should have its own 'ghost' URL/page, so users could visit e.g.:
/faq/question-1
/faq/question-2
/faq/question-3
The problem is that question-1, question-2 and question-3 are not actual pages, just sections on /faq. For SEO, aesthetics, and usability reasons we do not want to work with ?q= or #.
I've searched and tried every .htaccess thread I came across, but without result.
Is there a way we can show the page /faq when visiting /faq/question-1, and keep the URL /faq/question-1, with mod_rewrite? (We cannot hardcode it, because we do not know all future question slugs.) So basically something that tells the server: if the first URL part is /faq/, ignore everything that comes after it, but keep the URL.
Thanks
This is a trivial rewriting task and it is unclear why this should not work for you:
RewriteEngine on
# Rewrite any /faq/... request internally to /faq; the URL in the
# browser stays untouched. (The END flag requires Apache 2.4 or later.)
RewriteRule ^/?faq/.+ /faq [END]
Since you claim that you "tried every .htaccess thread you came across" and this clearly works, the question is: why does it not work in your setup? But since you did not tell us anything about your setup, we cannot really offer more help...
Here are some general hints, though, which you should go through:
Where did you implement the rules you tried? In the http server's host configuration or in a distributed configuration file?
If you are using a distributed configuration file (".htaccess") then how did you make sure such files are interpreted by your http server and how did you test that?
Did you check your http server's error log file for hints?
Did you make sure that you are not actually looking at cached responses? So did you really test with a fresh anonymous browser window using a "deep reload"?
Since the CMS you are using requires its own rewriting rules, where did you add the rules you tried? Remember: the order is important!
Currently I am not able to use .htaccess (I have no control over the server config at the moment), so I have to live with /index.php/ in my URLs. That's mostly fine, but there is one issue.
If I go to webserver.com/site/index.php, everything works fine. But if I go to webserver.com/site/ (without index.php), the site loads but all the links are broken (they resolve relative to /site/ instead of /site/index.php/).
I've tried various ways to build the links with url() and route(), but I can't get anything to work short of hard-coding the full URL for every link.
Any ideas?
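One sketch of a possible fix, assuming Laravel (url() and route() are Laravel helpers; adjust if your framework differs): force the root URL used by the URL generator, so that every generated link includes index.php regardless of how the page was reached.

<?php
// app/Providers/AppServiceProvider.php (a sketch, assuming Laravel)
namespace App\Providers;

use Illuminate\Support\Facades\URL;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function boot()
    {
        // Anchor all generated URLs at .../index.php so links resolve
        // the same way from /site/ and from /site/index.php/.
        URL::forceRootUrl(config('app.url') . '/index.php');
    }
}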
In my typo3temp folder I keep finding a file called javascript_a1cb3a5978.js. It seems to be a JavaScript file that TYPO3 generates to encrypt email addresses. A trojan is always appended to its code. I delete the file from the TYPO3 cache, but as soon as the page is requested in the browser, the file is generated again.
I downloaded the site and scanned it with Security Essentials. I also tried searching for eval, but there are far too many occurrences in the whole TYPO3 folder. I found nothing in index.php, and nothing in the .htaccess either. The file permissions for the site should be OK.
Do you have any ideas where this code is being appended?
Check typo3conf/localconf.php, the typo3conf/temp_* files, and typo3conf/extTables.php.
Deactivate every extension and update your TYPO3 installation. Check your TypoScript. I guess you should shut down your website and analyse how the attacker injected that code.
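A few shell commands can help narrow the search; the patterns below are just common obfuscation markers, not a guaranteed detector:

# Search for typical obfuscated-injection patterns instead of bare "eval":
grep -rn "eval(base64_decode" .
grep -rn "gzinflate(base64_decode" .
# Recently modified PHP files often reveal the injected code:
find . -name "*.php" -mtime -7 -ls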
I've been asked to figure out how the Concrete5 system works for an employer, and I can't figure something out.
I have Concrete5 installed in a directory on the server called /realprofessionals. When the Concrete5 system makes new pages, it gives them their own absolute paths, for instance:
http://www.wmcpartners.com/realprofessionals/footer
However, it hasn't actually made a folder in the /realprofessionals directory called footer. So how does that work? How can http://www.wmcpartners.com/realprofessionals/footer be a working link?
Short answer: All page requests are actually going through the one and only index.php file. Page content is stored in the database, not in files on the server.
Long answer:
Concrete5 (and most PHP-based CMSs, for that matter) works like this: all requests are routed through the index.php file. This routing is enforced by some mod_rewrite rules in the .htaccess file. The rules say, "for any request, don't actually go to that page; instead, go to index.php and pass the rest of the requested path as $_GET parameters." Then the index.php code (or other code included by index.php) determines the requested page from the path that Apache put into the $_GET parameters (as per the mod_rewrite rules in .htaccess) and retrieves the appropriate content from the database.
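A generic sketch of that pattern (not Concrete5's actual files; the path parameter name and the page table are illustrative):

# .htaccess
RewriteEngine On
# If the request does not match a real file or directory...
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# ...route it to index.php and pass the original path as a GET parameter.
RewriteRule ^(.*)$ index.php?path=$1 [QSA,L]

<?php
// index.php: look up the requested path and emit the matching content.
// A real CMS would query its database here instead of an array.
$pages = [
    ''       => '<h1>Home</h1>',
    'footer' => '<h1>Footer</h1>',
];

$path = trim($_GET['path'] ?? '', '/');

if (isset($pages[$path])) {
    echo $pages[$path];
} else {
    http_response_code(404);
    echo '<h1>Not found</h1>';
}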
Storing content in the database, as opposed to in files on the server, has several advantages. For example, you can re-use the same HTML template (header, footer, sidebar) on every page, and if you change the template, the change is automatically reflected on every page that uses it. It also makes it easier to shuffle pages around and to give them whatever URLs you want (e.g. no ".php" extension at the end, or /2010/11/date/based/paths/for/blog/posts).
The disadvantage, of course, is that every request requires many database queries, but for most sites (those without zillions of page views) the trade-off is well worth it, and various types of caching can help reduce the performance hit.
Jordan's answer is excellent. I would add that you probably don't see index.php in the URL because you've enabled pretty URLs (type 'pretty' into Concrete5's search box to check).
Anyhow, the best way to programmatically add links to internal pages is:
<a href="<?=$this->url('page-name');?>">
page name
</a>
It works both on localhost and online, with or without pretty URLs.
(To find the page name, go to Dashboard / Full Sitemap / page-name / Properties / Page Paths and Location.)
I'd like to know how websites create URLs that contain other domain names, like these on trafficestimate.com.
I'm guessing it's some .htaccess stuff to redirect domain names to a dynamic page?
Thanks
Your URL contains a GET request. When someone calls the page http://google.com/search with the parameters hl=en, safe=off, etc., the page can process those parameters. For instance, safe=off means that you want to get back any search result, and q=site:... is your search string; Google will look it up in its database and give you the results. So when you call this URL, there is probably no .htaccess processing involved. However, you can process the URL and the GET request with .htaccess and, for example, redirect the user to another page.
Maybe you can describe a bit further what exactly you are trying to do or want to know; that would make explaining easier.
EDIT: After reading Gumbo's comment I looked at the Google results page. So maybe your question is about the trafficestimate URLs. They look like http://trafficestimate.com/example.org. This is really a good case for .htaccess: using it, they can take the URL and rewrite it internally to http://www.trafficestimate.com/websites/?domain=example.org. There you again have a GET request, and an application builds the page.
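The rule could look roughly like this (a guess at their setup; the pattern is illustrative):

RewriteEngine on
# Do not rewrite the handler itself, only the pretty domain URLs.
RewriteCond %{REQUEST_URI} !^/websites/
# Rewrite /example.org internally to the handler's GET parameter.
RewriteRule ^([a-z0-9.-]+\.[a-z]{2,})$ /websites/?domain=$1 [L,NC]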
Some URL rewriting is probably involved; otherwise they would have to create an actual file for every possible request.
Using Apache’s mod_rewrite in a .htaccess file is one option. But since the server identifies itself as “Microsoft-IIS/7.5”, they are probably using ISAPI_Rewrite instead, a mod_rewrite derivative for Microsoft’s IIS.