htaccess rewrite url parameter with defined values - .htaccess

I need to rewrite URLs like this
www.url.com?host_interface=220?222?770
into this (this is the one the user should see):
www.url.com/host_interface-usb,firewire,pcie.html
For this example I need to replace these parameters:
220 = usb
222 = firewire
770 = pcie
To make the URL above more readable, I replaced "%2C" with "?":
? = %2C
So, it actually looks like this
220%2C222%2C770
Do I need to translate this too?
Besides the parameter values, the
?host_interface=
needs to be rewritten to
host_interface-
and the URL also needs the ending
.html
So in the end I want to put several mappings like
xyz = abx
in the .htaccess so that all the parameters get rewritten.
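
The mapping described above is essentially a lookup table applied to the comma-separated IDs. A minimal Python sketch of that string transformation, assuming only the three ID/name pairs given in the question; the function name build_pretty_path is hypothetical, and this illustrates the translation itself rather than an .htaccess rule:

```python
from urllib.parse import unquote

# ID-to-name pairs taken from the question; extend as needed.
ID_TO_NAME = {"220": "usb", "222": "firewire", "770": "pcie"}

def build_pretty_path(param: str, raw_value: str) -> str:
    """Turn e.g. ('host_interface', '220%2C222%2C770') into a pretty path."""
    ids = unquote(raw_value).split(",")          # %2C decodes to ","
    names = [ID_TO_NAME.get(i, i) for i in ids]  # unknown IDs pass through unchanged
    return f"/{param}-{','.join(names)}.html"

print(build_pretty_path("host_interface", "220%2C222%2C770"))
# /host_interface-usb,firewire,pcie.html
```

Decoding %2C first means the same table works whether the separator arrives encoded or as a plain comma.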

Related

File-name not visible in the URL

I can see the file name in the URL in lower environments like SIT and UAT, but in the Production environment some junk value replaces the file name. Any help would be great.
The file name is replaced with a junk value like this -> "bWFzdGVyfGltYWdlc3w4OTM1fGltYWdlL3BuZ3xpbWFnZXMvaDk4L2g4My84ODA0MTAxMDk1NDU0LnBuZ3xjMWY2OTZmOGQ5ZGM2MTIxMmQxMmUwODI5ZGQwYTg5YzNhMjIyYjQzMTJlMzc1MTU0ZmUyZWFjOGE5MjUyMGFj"
I assume you are asking about the media URL.
In hybris, the SEO-friendly URL format is called prettyURL. It can be enabled by setting media.legacy.prettyURL = true in local.properties.
With prettyURL disabled, the URL looks something like this:
/medias/fileName.jpg?context=NAYDCL3IGAZC6ZTPN4XGU4DHHI5DU4LXMVZHI6JRGIZTINI.....
Above, the context request parameter holds the base64-encoded media details.
With prettyURL enabled, the URL looks something like this:
/medias/sys_master/images/h98/h83/8804101095454/yourFileName.jpg
Now verify that media.legacy.prettyURL has the same value in all environments. By default, prettyURL is disabled (media.legacy.prettyURL = false).
Refer to the LocalMediaWebURLStrategy class and help.hybris for more detail.
This is not a junk value; it is base64-encoded text. The original value contains characters that are not allowed in a URL, so the system encodes it automatically. Decoded, it reads:
master|images|8935|image/png|images/h98/h83/8804101095454.png|c1f696f8d9dc61212d12e0829dd0a89c3a222b4312e375154fe2eac8a92520ac
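
To check this yourself, the value from the question can be decoded with any base64 tool. A short Python sketch using only the standard library; the variable names are illustrative, and the meaning of the individual fields is not asserted here:

```python
import base64

# The "junk" value from the question, split across lines for readability.
encoded = (
    "bWFzdGVyfGltYWdlc3w4OTM1fGltYWdlL3BuZ3xpbWFnZXMvaDk4L2g4My84ODA0"
    "MTAxMDk1NDU0LnBuZ3xjMWY2OTZmOGQ5ZGM2MTIxMmQxMmUwODI5ZGQwYTg5YzNh"
    "MjIyYjQzMTJlMzc1MTU0ZmUyZWFjOGE5MjUyMGFj"
)

decoded = base64.b64decode(encoded).decode("ascii")
fields = decoded.split("|")  # the decoded text is pipe-separated

for field in fields:
    print(field)
```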

How to make apache treat query string as file name?

I mirrored a site to local server with wget and the file names locally look like this:
comments
comments?id=123
Locally these are static files that show unique content.
But when I access the second file in the browser, it keeps showing the content of the file comments and treats ?id=123 as a query string, so it never shows the content of the file comments?id=123.
It loads the correct file if I manually encode the ? as %3F in the browser address bar and type:
comments%3Fid=123
Is there a way to fix this? Maybe make Apache stop treating ? as the query separator and treat it as a file name character? Or add a URL rewrite that changes ? into %3F?
Edit: Indeed, a ? in file names and requests causes too many problems. I ended up using the wget option --restrict-file-names=windows, which converts the ? into an @ when saving the file name.
The short answer is "don't do that."
The longer answer is that ? is a reserved character in URLs, so using it as part of a filename is going to cause problems forever. The recommended solution is to pick a different character for those filenames. There are many to choose from: just avoid ?, & and # and you'll probably be fine.
If you insist on keeping the file name (or if you don't have an option) try:
RewriteCond %{QUERY_STRING} (.*)
RewriteRule (.*) $1%%3F%1 [NE]
However, this is going to fire any time you have a query string, which is likely not what you want.
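
If the links that point at the mirrored files can be rewritten instead, percent-encoding the ? in each link sidesteps the problem without touching Apache. A small Python sketch using the standard urllib.parse.quote; the helper name encode_mirrored_name is hypothetical:

```python
from urllib.parse import quote

def encode_mirrored_name(file_name: str) -> str:
    """Percent-encode a local file name so '?' is not read as a query separator."""
    # '/' and '=' are kept readable; '?' is not in the safe set, so it becomes %3F.
    return quote(file_name, safe="/=")

print(encode_mirrored_name("comments?id=123"))  # comments%3Fid=123
```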

Config for find url in html content

Can anybody help me to configure Sphinx for best matching url (part of url) in html content?
My config:
index base_index
{
docinfo = extern
mlock = 0
morphology = none
min_word_len = 3
charset_type = utf-8
charset_table = 0..9, A..Z->a..z, a..z
enable_star = 1
blend_chars = _, -, #, /, .
html_strip = 0
}
I use SphinxAPI on backend (PHP) with SPH_MATCH_EXTENDED mode.
I don't understand how the search works. If I search for "domain.com" I get 37 results. If I search for "www.domain.com" I get 643 results. Why? "domain.com" is a substring of "www.domain.com", so in theory the first query should return more results.
FreeBSD 9.2. Sphinx 2.1.2
16 distributed indexes (147Gb)
This is a bit late, but here are my thoughts anyway.
It looks like when you search www.domain.com, sphinx is actually looking for www domain and com respectively. If you're searching for just domain.com, it's just looking for domain and com. This is probably the reason why www.domain.com returns more results, because www appears more frequently throughout the index.
Since you're searching URLs, I would set up stopwords depending on how you want to search. For me, I would make www com org and basically all top-level domains stopwords. You might want to leave the top-level domains and just make www a stopword. This would allow you to weight com higher than net in a search result.
If you set up your stopwords right, when someone searches domain.com Sphinx actually just looks for hits of domain in the index, whether it be domain.com or domain.org or domain.net.
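
The effect of such a stopword list can be modelled outside Sphinx. The following Python sketch is a simplified illustration only: it splits on every character outside 0-9 and a-z and ignores blend_chars and min_word_len, so it is not how Sphinx tokenizes with the config above. The stopword set and the name search_terms are assumptions:

```python
import re

# Hypothetical stopword list; which words to include is a design choice.
STOPWORDS = {"www", "com", "org", "net"}

def search_terms(query: str) -> list[str]:
    """Split a query on non-alphanumeric characters and drop stopwords."""
    tokens = [t for t in re.split(r"[^0-9a-z]+", query.lower()) if t]
    return [t for t in tokens if t not in STOPWORDS]

print(search_terms("www.domain.com"))  # ['domain']
print(search_terms("domain.org"))      # ['domain']
```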

.htaccess condition that works on many conditions inside

I want to try something like an if in .htaccess:
I want to redirect each ?sp=SOMETHING to a different ?p=NNN (some number).
I have 100 ?sp= pages,
and I don't want to evaluate 100 rules on each page load.
If there is another method to solve it, I'd be happy to know.
if(RewriteCond %{HTTP_HOST ^?sp=}{
RewriteRule ^?sp=bar ?p=5
RewriteRule ^?sp=foo ?p=9
RewriteRule ^?sp=tin ?p=15
}
There is no logic relating the ?sp= values to the ?p= values.
Update: I don't have access to the server config.
This can be done with the RewriteMap directive (iff you have access to the server configuration, as pointed out in a comment. No idea why they thought that needed to be restricted...). For example:
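RewriteMap sp_to_s txt:/path/to/map.txt
# RewriteRule never sees the query string, so capture the sp= value in a RewriteCond
RewriteCond %{QUERY_STRING} (?:^|&)sp=([^&]+)
RewriteRule ^ %{REQUEST_URI}?p=${sp_to_s:%1|0} [R=302,L]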
(the 0 is the default value if none of the pairs in the map match).
Here's a sample map.txt:
bar 5
foo 9
tin 15
There are more ways to use the map feature; see the documentation for mod_rewrite for details.
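
For readers without access to the server configuration, the same lookup-with-default idea can be prototyped in application code. A minimal Python sketch, assuming the map.txt contents shown above; load_map is a hypothetical helper, not part of mod_rewrite:

```python
def load_map(text: str) -> dict[str, str]:
    """Parse 'key value' lines; blank lines and # comments are skipped."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0].startswith("#"):
            continue
        mapping[parts[0]] = parts[1]
    return mapping

MAP_TEXT = """bar 5
foo 9
tin 15"""

sp_to_p = load_map(MAP_TEXT)
print(sp_to_p.get("foo", "0"))      # 9
print(sp_to_p.get("missing", "0"))  # 0, the default when no pair matches
```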

URLs with symbol "%" at the end make http error, how to prevent it with htaccess?

I have a question about some of the URLs in my access_log. Some external sites link to me with URLs like http://domain.com/url_name.htm% (yes, with a trailing %).
My server then returns an HTTP error. I need to redirect these broken URLs to the correct ones, and I thought of doing it in .htaccess.
I only need to detect the % symbol as the last character of the URL and redirect to the URL without it.
http://domain.com/url_name.htm% --> http://domain.com/url_name.htm
How can I do this? I tried some samples written for the ? symbol, but I had no luck.
Thanks!
I already found the problem...
It seems that some malformed URLs never reach the vhost, so these requests never read the .htaccess.
The only way to solve this is to add the ErrorDocument 400 directive in httpd.conf. It is not the best option for servers with several vhosts, because all of them will get the same behaviour, but I think it is the only way in this case.
Quotation from Apache documentation:
Although most error messages can be overridden, there are certain circumstances where the internal messages are used regardless of the setting of ErrorDocument. In particular, if a malformed request is detected, normal request processing will be immediately halted and the internal error message returned. This is necessary to guard against security problems caused by bad requests.
Thanks anyway!!
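
The request counts as malformed because, in URI syntax, a % must be followed by two hexadecimal digits. A small Python sketch that flags such URLs, for example when scanning an access log; the function name has_malformed_escape is hypothetical:

```python
import re

# Matches a '%' that is not followed by two hex digits, i.e. an invalid escape.
BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")

def has_malformed_escape(url: str) -> bool:
    """Return True if the URL contains a '%' that does not start a valid escape."""
    return BAD_ESCAPE.search(url) is not None

print(has_malformed_escape("http://domain.com/url_name.htm%"))  # True
print(has_malformed_escape("http://domain.com/url_name.htm"))   # False
print(has_malformed_escape("http://domain.com/a%20b.htm"))      # False
```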
This page is super helpful about the .htaccess rules.
http://www.helicontech.com/isapi_rewrite/doc/RewriteRule.htm
I have also seen a few solutions to this that use a small PHP script. For example, this one passes the # as | and swaps it back:
.htaccess
RewriteRule old.php redirect.php?url=http://example.com/new.php|hash [R=301,QSA,L]
redirect.php
<?php
$new_url = str_replace("|", "#", $_GET['url']);
// The status code is the third argument of header(); the second is the $replace flag.
header("Location: " . $new_url, true, 301);
die;
?>
