Normally I can write a Czech string into the form:
But after validation (and also when I send the collected string to the database) the string comes back in some other charset:
The h:outputText values (jméno, příjmení) are still shown correctly; the h:inputText values are not.
Where should I look for the problem?
UPDATE: HTTP response headers:
SOLUTION:
create a filter that calls request.setCharacterEncoding("UTF-8") in Filter#doFilter()
check that all XML files have UTF-8 configured
add <f:view contentType="text/html" encoding="UTF-8"/> to the main XHTML file
add these lines to hibernate.cfg.xml:
<property name="hibernate.connection.characterEncoding">utf8</property>
<property name="hibernate.connection.useUnicode">true</property>
Given the symptoms, the UTF-8 data is being redisplayed using an ISO-8859-x encoding. The č (LATIN SMALL LETTER C WITH CARON, U+010D) exists in UTF-8 as the bytes 0xC4 and 0x8D. According to the ISO-8859-1 codepage layout, those bytes represent the characters Ä and [nothing] respectively, which is exactly what you're seeing.
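As a quick sanity check, here is a minimal Java sketch (not from the application in question) that reproduces this corruption by decoding UTF-8 bytes as ISO-8859-1:

import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) {
        String original = "jméno, příjmení";
        // Encode the text as UTF-8 bytes, then (wrongly) decode those bytes
        // as ISO-8859-1 -- this simulates a request/response charset mismatch.
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        String garbled = new String(utf8Bytes, StandardCharsets.ISO_8859_1);
        System.out.println(garbled); // prints the mangled form with Ã/Å sequences
    }
}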
This particular problem can have many causes. As Facelets by itself already uses UTF-8 by default, both to process HTTP POST request parameters and to write the HTTP response, there should be nothing you need to fix or change on the Java/JSF side.
However, when you're manually grabbing a request parameter before JSF creates/restores the view (e.g. in a custom filter), it may be too late for Facelets to set the right request character encoding. You'd need to add the following line to the custom filter before continuing the chain, or to a new filter which is mapped before the filter causing the trouble:
request.setCharacterEncoding("UTF-8");
Also, when you've explicitly or implicitly changed Facelets' default character encoding, for example by <?xml version="1.0" encoding="ISO-8859-1"?> or <f:view encoding="ISO-8859-1">, then Facelets will use ISO-8859-1 instead. You'd need to replace that by UTF-8, or remove it altogether.
If that's not it, then the database side is the major suspect. On that side I can see two possible causes:
The DB table is not using UTF-8.
The JDBC driver is not using UTF-8.
How exactly to solve it depends on the DB server used. Usually you need to specify the charset when you CREATE the table, but you can usually also change it afterwards using ALTER. As to the JDBC driver, this is usually solved by explicitly specifying the charset as a connection URL parameter. For example, in the case of MySQL:
jdbc:mysql://localhost:3306/db_name?useUnicode=yes&characterEncoding=UTF-8
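(Note that when this JDBC URL is placed in an XML file such as hibernate.cfg.xml, the & must be escaped as &amp;.) For illustration, a rough MySQL sketch of the DB-side fixes — the table and column names here are made up; adapt them to your schema:

-- Specify the charset when creating the table:
CREATE TABLE person (
    id INT AUTO_INCREMENT PRIMARY KEY,
    jmeno VARCHAR(100),
    prijmeni VARCHAR(100)
) DEFAULT CHARACTER SET utf8;

-- Or convert an existing table and its column data afterwards:
ALTER TABLE person CONVERT TO CHARACTER SET utf8;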
See also:
Unicode - How to get the characters right?
Unicode input retrieved via PrimeFaces input components become corrupted
Try this solution: http://ibnaziz.wordpress.com/2008/06/10/spring-utf-8-conversion-using-characterencodingfilter/
In my case it helped (with Russian).
In web.xml, add Spring's character encoding filter:
<filter>
    <filter-name>encodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>encodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
I had exactly the same problem with form validation, and I solved it with Sergey's answer.
BUT your filter needs to be in the first position in your web.xml. Moving my filter from third to first position solved my problem.
Hope it helps.
(PrimeFaces 3.2, JSF 2.1.2 on JBoss 7.1)
When I was still using PrimeFaces v2.2.1, I was able to type Unicode input such as Chinese with a PrimeFaces input component such as <p:inputText> or <p:editor>, and retrieve the input in good shape in a managed bean method.
However, after I upgraded to PrimeFaces v3.1.1, all those characters become mojibake or question marks. Only Latin input arrives intact; it's the Chinese, Arabic, Hebrew, Cyrillic, etc. characters which become malformed.
How is this caused and how can I solve it?
Introduction
Normally, JSF/Facelets will already have set the request parameter character encoding to UTF-8 by default by the time the view is created/restored. But if any request parameter is accessed before the view is created/restored, then it's too late to set the proper character encoding: the request parameters are namely parsed only once.
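A hypothetical sketch of this pitfall (the parameter name and method are made up for illustration): once any request parameter has been read, a later setCharacterEncoding() call no longer has any effect on that request:

import javax.servlet.ServletRequest;

public class EncodingPitfallSketch {

    // Hypothetical: something (a filter, listener, or factory) touches a
    // request parameter before the JSF view is created/restored.
    static void beforeViewIsCreated(ServletRequest request) throws Exception {
        // Reading ANY parameter forces the container to parse the whole POST
        // body right now, using its current default encoding (often ISO-8859-1).
        String value = request.getParameter("someParam");

        // Too late: the parameters were already decoded with the wrong
        // charset, and they will not be re-parsed.
        request.setCharacterEncoding("UTF-8");
    }
}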
PrimeFaces encoding fail
That this failed in PrimeFaces 3.x after upgrading from 2.x is caused by the new isAjaxRequest() override in PrimeFaces' PrimePartialViewContext, which checks a request parameter:
@Override
public boolean isAjaxRequest() {
    return getWrapped().isAjaxRequest()
        || FacesContext.getCurrentInstance().getExternalContext().getRequestParameterMap().containsKey("javax.faces.partial.ajax");
}
By default, isAjaxRequest() (the one of Mojarra/MyFaces, which the above PrimeFaces code obtains by getWrapped()) checks a request header as follows; this does not affect the request parameter encoding, as request parameters won't be parsed when only a request header is obtained:
if (ajaxRequest == null) {
    ajaxRequest = "partial/ajax".equals(ctx.
        getExternalContext().getRequestHeaderMap().get("Faces-Request"));
}
However, isAjaxRequest() may be called by any phase listener, system event listener, or application factory before the view is created/restored. So when you're using PrimeFaces 3.x, the request parameters will be parsed before the proper character encoding is set, and hence with the server's default encoding, which is usually ISO-8859-1. This messes up everything.
Solutions
There are several ways to fix it:
Use a servlet filter which sets ServletRequest#setCharacterEncoding() to UTF-8. Setting the response encoding by ServletResponse#setCharacterEncoding() is, by the way, unnecessary, as it is not affected by this issue.
#WebFilter("/*")
public class CharacterEncodingFilter implements Filter {
#Override
public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws ServletException, IOException {
request.setCharacterEncoding("UTF-8");
chain.doFilter(request, response);
}
// ...
}
You only need to take into account that HttpServletRequest#setCharacterEncoding() only sets the encoding for POST request parameters, not for GET request parameters. For GET request parameters you'd still need to configure it at server level.
If you happen to use the JSF utility library OmniFaces, such a filter is already provided out of the box: the CharacterEncodingFilter. Just install it as below in web.xml, as the first filter entry:
<filter>
    <filter-name>characterEncodingFilter</filter-name>
    <filter-class>org.omnifaces.filter.CharacterEncodingFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>characterEncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
Reconfigure the server to use UTF-8 instead of ISO-8859-1 as its default encoding. In the case of GlassFish, that would be a matter of adding the following entry to <glassfish-web-app> in the /WEB-INF/glassfish-web.xml file:
<parameter-encoding default-charset="UTF-8" />
Tomcat doesn't support such a setting. It has the URIEncoding attribute on the <Connector> entry in server.xml, but this applies to GET requests only, not to POST requests.
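For reference, the Tomcat setting would look like the sketch below in server.xml (attribute values other than URIEncoding are the common defaults; adjust to your setup) — again, this only covers GET request parameters:

<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8" />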
Report it as a bug to PrimeFaces. Is there really any legitimate reason to check whether the HTTP request is an ajax request by checking a request parameter, instead of a request header as you would do for standard JSF and for example jQuery? The PrimeFaces core.js JavaScript is doing exactly that. It would be better if it had set it as a request header on the XMLHttpRequest.
Solutions which do NOT work
Perhaps you'll stumble upon the below "solutions" somewhere on the Internet while investigating this problem. Those solutions won't ever work in this specific case. Explanation follows.
Setting XML prolog:
<?xml version='1.0' encoding='UTF-8' ?>
This only tells the XML parser to use UTF-8 to decode the XML source before building the XML tree around it. The XML parser actually being used by Facelets is SAX, during JSF view build time. This part has nothing whatsoever to do with HTTP request/response encoding.
Setting HTML meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
The HTML meta tag is ignored when the page is served over HTTP via a http(s):// URI. It's only used when the page is saved by the client as an HTML file on the local disk and then reopened via a file:// URI in the browser.
Setting HTML form accept charset attribute:
<h:form accept-charset="UTF-8">
Modern browsers ignore this. It has an effect only in Microsoft Internet Explorer, and even then MSIE does it wrongly. Never use it. All real web browsers will instead use the charset attribute specified in the Content-Type header of the response. Even MSIE will do it the right way as long as you do not specify the accept-charset attribute.
Setting JVM argument:
-Dfile.encoding=UTF-8
This is only used by the Oracle(!) JVM to read and parse the Java source files.
I'm having a problem with charset encoding in my web application (JSF 1.2, Spring and Tomcat 7), and I've run out of ideas of what to test to see where it is going wrong.
Whenever I submit something like 'çã' I get 'Ã§Ã£': that means my data, POSTed as UTF-8, is being converted to ISO-8859-1 somewhere in the JSF life cycle.
I know that the wrong conversion is UTF-8 to ISO-8859-1 because it produces the same output as:
System.out.println(new String("çã".getBytes("UTF-8"), "ISO-8859-1"));
I believe that the wrong conversion happens somewhere in the JSF life cycle (can it be before?), because I set up a validator in my managed bean:
public void debugValidator(FacesContext context, UIComponent component,
        Object object) throws ValidationException {
    System.out.println("debug validator:");
    System.out.println(object);
    System.out.println("\n");
    throw new ValidationException("DEBUG: " + object.toString());
}
and its message returns as: "DEBUG: Ã§Ã£"
I have <?xml version="1.0" encoding="UTF-8"?> as the first line in all my .xhtml pages.
I'm using Facelets, which according to BalusC's article uses UTF-8 by default.
So it shouldn't be needed, but I set up Spring's CharacterEncodingFilter in my web.xml anyway, to set the request character encoding to UTF-8.
I put URIEncoding="UTF-8" in Tomcat's server.xml file, just to be sure.
It is not my browser's fault: it prints the same thing in the console, and my environment is all UTF-8.
Do you have any idea what more I can test? What could be my wrong assumption?
Thanks in advance!
BalusC's answer helped me to better understand the problem, but what solved it for me was putting the Character Encoding Filter as the FIRST filter in the chain (putting it above all the others in the web.xml file).
This is the filter I used:
<!-- filter enforcing charset UTF-8 - must be first filter in the chain! -->
<filter>
    <filter-name>characterEncodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>utf-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>characterEncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
Apparently, the data was read before the encoding was set by the filter.
I got the hint from this page: http://tech.top21.de/techblog/20100421-solving-problems-with-request-parameter-encoding.html
Thanks everybody!
The symptoms indicate that the browser has sent the data using the ISO-8859-1 encoding instead of UTF-8. This in turn means that the HTTP response Content-Type header has not been set with the proper charset attribute. In Firebug, for example, you can find it out as follows:
You're right that Facelets uses UTF-8 by default. But very early versions of Facelets weren't programmed to use UTF-8 by default; see also, among others, issue 46 and issue 53. Facelets is currently at 1.1.15.B1.
As to your attempts to fix it: the presence of the XML prolog is not strictly necessary, and its encoding isn't used in any way to set the response encoding; it's only used by the XML parser to decode the input stream into characters. Spring's filter is also not necessary, but the fact that adding it didn't solve the problem is enough evidence that it's the client which has sent the data as ISO-8859-1.
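For reference, a correctly configured response carries a Content-Type header along these lines, and it's the charset attribute you want to verify:

Content-Type: text/html;charset=UTF-8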
Check if your form has enctype="multipart/form-data". Multipart requests are parsed by separate code, which may not honor the request character encoding set by a filter. See this question for more information.