Unicode input retrieved via PrimeFaces input components becomes corrupted - jsf

When I was still using PrimeFaces v2.2.1, I was able to type Unicode input such as Chinese into PrimeFaces input components such as <p:inputText> and <p:editor>, and retrieve the input intact in a managed bean method.
However, after I upgraded to PrimeFaces v3.1.1, all those characters become Mojibake or question marks. Only Latin input arrives fine; Chinese, Arabic, Hebrew, Cyrillic and other non-Latin characters arrive malformed.
How is this caused and how can I solve it?

Introduction
Normally, JSF/Facelets already sets the request parameter character encoding to UTF-8 by default when the view is created/restored. But if any request parameter has been requested before the view is created/restored, then it's too late to set the proper character encoding, because the request parameters are parsed only once.
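The "parsed only once" behavior can be sketched without a servlet container. The LazyRequest class below is a hypothetical stand-in, not the real Servlet API, and it assumes ISO-8859-1 as the server default. It only illustrates why setting the encoding after the first parameter access has no effect:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a servlet request; not the real Servlet API. */
final class LazyRequest {
    private final String rawBody;            // URL-encoded form body as sent by the browser
    private String encoding = "ISO-8859-1";  // assumed server default
    private Map<String, String> parameters;  // stays null until the first parameter access

    LazyRequest(String rawBody) {
        this.rawBody = rawBody;
    }

    void setCharacterEncoding(String encoding) {
        this.encoding = encoding;
    }

    String getParameter(String name) {
        if (parameters == null) {
            parameters = parse(); // happens once, with whatever encoding is set right now
        }
        return parameters.get(name);
    }

    private Map<String, String> parse() {
        Map<String, String> result = new HashMap<>();
        for (String pair : rawBody.split("&")) {
            String[] nameValue = pair.split("=", 2);
            String value = nameValue.length > 1 ? nameValue[1] : "";
            try {
                result.put(URLDecoder.decode(nameValue[0], encoding),
                           URLDecoder.decode(value, encoding));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String body = "name=%C3%A7%C3%A3&javax.faces.partial.ajax=true"; // "çã" URL-encoded as UTF-8

        LazyRequest tooLate = new LazyRequest(body);
        tooLate.getParameter("javax.faces.partial.ajax"); // early access triggers parsing
        tooLate.setCharacterEncoding("UTF-8");            // has no effect anymore
        System.out.println("\u00E7\u00E3".equals(tooLate.getParameter("name"))); // false

        LazyRequest inTime = new LazyRequest(body);
        inTime.setCharacterEncoding("UTF-8");             // set before any parameter access
        System.out.println("\u00E7\u00E3".equals(inTime.getParameter("name")));  // true
    }
}
```

A real container parses differently in detail, but the ordering problem is the same: whoever touches a parameter first decides the encoding.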
PrimeFaces encoding fail
That it failed in PrimeFaces 3.x after upgrading from 2.x is caused by the new isAjaxRequest() override in PrimeFaces' PrimePartialViewContext which checks a request parameter:
@Override
public boolean isAjaxRequest() {
    return getWrapped().isAjaxRequest()
        || FacesContext.getCurrentInstance().getExternalContext().getRequestParameterMap().containsKey("javax.faces.partial.ajax");
}
By default, isAjaxRequest() (the Mojarra/MyFaces implementation, which the PrimeFaces code above obtains via getWrapped()) checks a request header as follows. This does not affect the request parameter encoding, because request parameters are not parsed when a request header is read:
if (ajaxRequest == null) {
    ajaxRequest = "partial/ajax".equals(ctx.
        getExternalContext().getRequestHeaderMap().get("Faces-Request"));
}
However, isAjaxRequest() may be called by any phase listener, system event listener or application factory before the view has been created/restored. So, when you're using PrimeFaces 3.x, the request parameters are parsed before the proper character encoding has been set, and hence with the server's default encoding, which is usually ISO-8859-1. This corrupts every non-ASCII character.
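To see what the wrong default does to a submitted value, here is a minimal sketch using only the JDK. The class and method names are made up for illustration. It decodes the same URL-encoded value once as UTF-8 and once as ISO-8859-1, and shows that the damage is reversible as long as the garbled string has not been altered further:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class EncodingMismatchDemo {

    /** Decodes a URL-encoded value using the given request encoding. */
    static String decode(String urlEncoded, String charset) {
        try {
            return URLDecoder.decode(urlEncoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    /** Reverses the damage when UTF-8 bytes were decoded as ISO-8859-1. For diagnosis only. */
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The Cyrillic word "Привет" as a browser would submit it from a UTF-8 page.
        String sent = "%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82";
        String expected = "\u041F\u0440\u0438\u0432\u0435\u0442";

        String right = decode(sent, "UTF-8");
        String wrong = decode(sent, "ISO-8859-1");

        System.out.println(expected.equals(right));         // true
        System.out.println(wrong.length());                 // 12, one char per byte
        System.out.println(expected.equals(repair(wrong))); // true
    }
}
```

The repair step is useful for confirming which two encodings are involved. It is not a fix; the fix is to set the request encoding in time.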
Solutions
There are several ways to fix it:
Use a servlet filter which calls ServletRequest#setCharacterEncoding() with UTF-8. Setting the response encoding via ServletResponse#setCharacterEncoding() is, by the way, unnecessary, as the response is not affected by this issue.
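import java.io.IOException;
import javax.servlet.*;
import javax.servlet.annotation.WebFilter;

@WebFilter("/*")
public class CharacterEncodingFilter implements Filter {

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws ServletException, IOException {
        // Must run before anything reads a request parameter.
        request.setCharacterEncoding("UTF-8");
        chain.doFilter(request, response);
    }

    // ...
}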
Keep in mind that ServletRequest#setCharacterEncoding() only sets the encoding for POST request parameters (the request body), not for GET request parameters (the query string). For GET request parameters you'd still need to configure it at server level.
If you happen to use the JSF utility library OmniFaces, such a filter is already provided out of the box: the CharacterEncodingFilter. Just register it in web.xml as the first filter entry, as below:
<filter>
<filter-name>characterEncodingFilter</filter-name>
<filter-class>org.omnifaces.filter.CharacterEncodingFilter</filter-class>
</filter>
<filter-mapping>
<filter-name>characterEncodingFilter</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>
Reconfigure the server to use UTF-8 instead of ISO-8859-1 as default encoding. In case of Glassfish, that would be a matter of adding the following entry to <glassfish-web-app> of the /WEB-INF/glassfish-web.xml file:
<parameter-encoding default-charset="UTF-8" />
Tomcat doesn't support it. It has the URIEncoding attribute on the <Connector> element in server.xml, but this applies to GET requests only, not to POST requests.
Report it as a bug to PrimeFaces. Is there really any legitimate reason to detect an ajax request by checking a request parameter instead of a request header, as standard JSF and for example jQuery do? PrimeFaces' core.js JavaScript sets that parameter. It would be better if it set a request header on the XMLHttpRequest instead.
Solutions which do NOT work
Perhaps you'll stumble upon the below "solutions" somewhere on the Internet while investigating this problem. Those solutions will never work in this specific case. An explanation follows each.
Setting XML prolog:
<?xml version='1.0' encoding='UTF-8' ?>
This only tells the XML parser to use UTF-8 to decode the XML source before building the XML tree around it. The XML parser actually used by Facelets during JSF view build time is SAX. This part has absolutely nothing to do with HTTP request/response encoding.
Setting HTML meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
The HTML meta tag is ignored when the page is served over HTTP via a http(s):// URI. It's only used when the client saves the page as an HTML file on the local disk file system and then reopens it via a file:// URI in the browser.
Setting HTML form accept charset attribute:
<h:form accept-charset="UTF-8">
Modern browsers ignore this. It only has effect in Microsoft Internet Explorer, and even then it does it wrongly. Never use it. All real web browsers will instead use the charset attribute specified in the Content-Type header of the response. Even MSIE will do it the right way as long as you do not specify the accept-charset attribute.
Setting JVM argument:
-Dfile.encoding=UTF-8
This is only used by the Oracle(!) JVM to read and parse the Java source files.

Related

Why UTF8 character encoding not work in JSF? [duplicate]

I have the same problem as Set request character encoding of JSF input submitted values to UTF-8 in GlassFish, the submitted values arrive as Mojibake. However, the answer is targeted at GlassFish and I'm using JBoss AS 7.
I've already specified the JDBC connection URL to use UTF-8:
jdbc:mysql://localhost:3306/mydb?useUnicode=yes&characterEncoding=UTF-8
And at the top of my JSF page:
<?xml version='1.0' encoding='UTF-8' ?>
How can I solve the same problem in JBoss AS 7? Or better, in a more generic way so that it works in all servers?
The question which you linked to has already excluded the DB encoding from being the cause because the problem already occurs during printing/redisplaying the submitted value before saving in DB. Thus, the problem is in HTTP request encoding.
Your JDBC connection URL with the charset specified,
jdbc:mysql://localhost:3306/mydb?useUnicode=yes&characterEncoding=UTF-8
only tells the MySQL JDBC driver to use UTF-8 to decode values in SQL queries before sending them to the DB. This is not only completely beyond JSF's scope, but it is also not the cause of your problem, provided that you're absolutely positive that you have the same problem as in the linked question.
Your XML prolog with the charset specified,
<?xml version='1.0' encoding='UTF-8' ?>
only tells the XML parser to use UTF-8 to decode the XML source before building the XML tree around it. The XML parser actually being used is SAX as internally used by Facelets during JSF view build time. This part has completely nothing to do with HTTP request/response encoding and is thus very unlikely the cause of your problem.
None of them sets the HTTP request encoding, which is what you need. The question which you linked to already shows how to do that for the GlassFish server. In your case, however, you're using the JBoss AS server. The GlassFish-specific setting is then inapplicable and JBoss doesn't support anything like that. You'd need to bring in a custom servlet filter to do the job. E.g.
@WebFilter("/*")
public class CharacterEncodingFilter implements Filter {

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws ServletException, IOException {
        request.setCharacterEncoding("UTF-8");
        chain.doFilter(request, response);
    }

    // ...
}
In standalone.xml, add the attribute url-charset="UTF-8" to the http-listener tag with name="default", and add the attribute default-encoding="UTF-8" to the servlet-container tag.
Adding this to JBOSS_HOME/standalone/configuration/standalone.xml solved it for me:
<system-properties>
<property name="org.apache.catalina.connector.URI_ENCODING" value="UTF-8"/>
<property name="org.apache.catalina.connector.USE_BODY_ENCODING_FOR_QUERY_STRING" value="true"/>
</system-properties>
Got it from https://developer.jboss.org/message/643825#643825

POST parameters using wrong encoding in JSF 1.2

I'm having a problem with charset encoding in my web application (JSF 1.2, Spring and Tomcat 7), and I've run out of ideas about what to test to see where it is going wrong.
Whenever I submit something like 'çã' I get 'Ã§Ã£': that means my data POSTed as UTF-8 is being converted to ISO-8859-1 somewhere in the JSF life cycle.
I know that the wrong conversion is UTF-8 to ISO-8859-1 because it's the same output as for:
System.out.println(new String("çã".getBytes("UTF-8"), "ISO-8859-1"));
I believe that the wrong conversion is somewhere in the JSF life cycle (can it be before?) because I set up a validator in my managed bean:
public void debugValidator(FacesContext context, UIComponent component,
        Object object) throws ValidationException {
    System.out.println("debug validator:");
    System.out.println(object);
    System.out.println("\n");
    throw new ValidationException("DEBUG: " + object.toString());
}
and its message returns as: "DEBUG: Ã§Ã£"
I have in all my .xhtml pages the first line as <?xml version="1.0" encoding="UTF-8"?>.
I'm using Facelets, which according to BalusC's article uses UTF-8 by default.
So it shouldn't be needed, but I set up Spring's CharacterEncodingFilter in my web.xml anyway to set the request character encoding to UTF-8.
I put URIEncoding="UTF-8" in Tomcat's server.xml file, just to be sure.
It is not my browser's fault, it prints the same thing in the console, and my environment is all UTF-8.
Do you have any idea of what more I can test? What could be my wrong assumption?
Thanks in advance!
BalusC's answer helped me to better understand the problem, but what solved it for me was putting the Character Encoding Filter as the FIRST filter in the chain (putting it above all the others in the web.xml file).
This is the filter I used:
<!-- filter enforcing charset UTF-8 - must be first filter in the chain! -->
<filter>
<filter-name>characterEncodingFilter</filter-name>
<filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
<init-param>
<param-name>encoding</param-name>
<param-value>utf-8</param-value>
</init-param>
<init-param>
<param-name>forceEncoding</param-name>
<param-value>true</param-value>
</init-param>
</filter>
<filter-mapping>
<filter-name>characterEncodingFilter</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>
Apparently, the data was read before the parameter was set by the filter.
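Why the filter order matters can be sketched with a toy filter chain. Everything here (FakeRequest, the two filters, the assumed ISO-8859-1 default) is hypothetical and only mimics the behavior described above; it is not the Servlet or Spring API:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

final class FilterOrderDemo {

    /** Hypothetical request: decodes its body once, on first parameter access. */
    static final class FakeRequest {
        private final byte[] body;
        private Charset encoding = StandardCharsets.ISO_8859_1; // assumed container default
        private String value;                                   // cached after the first read

        FakeRequest(String text) {
            this.body = text.getBytes(StandardCharsets.UTF_8);  // browser submits UTF-8 bytes
        }

        void setCharacterEncoding(Charset encoding) {
            this.encoding = encoding;
        }

        String getParameter() {
            if (value == null) {
                value = new String(body, encoding);
            }
            return value;
        }
    }

    /** Plays the role of the character encoding filter. */
    static final Consumer<FakeRequest> ENCODING_FILTER =
            request -> request.setCharacterEncoding(StandardCharsets.UTF_8);

    /** Plays the role of any other filter that happens to read a parameter. */
    static final Consumer<FakeRequest> PARAMETER_READING_FILTER =
            request -> request.getParameter();

    /** Runs the filters in order, then returns what the application would see. */
    static String submit(String text, List<Consumer<FakeRequest>> chain) {
        FakeRequest request = new FakeRequest(text);
        for (Consumer<FakeRequest> filter : chain) {
            filter.accept(request);
        }
        return request.getParameter();
    }

    public static void main(String[] args) {
        String input = "\u00E7\u00E3"; // "çã"
        String first = submit(input, Arrays.asList(ENCODING_FILTER, PARAMETER_READING_FILTER));
        String last = submit(input, Arrays.asList(PARAMETER_READING_FILTER, ENCODING_FILTER));
        System.out.println(input.equals(first)); // true
        System.out.println(input.equals(last));  // false
    }
}
```

With the encoding filter first, the value arrives intact; with another parameter-reading filter ahead of it, the value is already decoded with the default encoding by the time the encoding is set.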
I got the hint from this page: http://tech.top21.de/techblog/20100421-solving-problems-with-request-parameter-encoding.html
Thanks everybody!
The symptoms indicate that the browser has sent the data using ISO-8859-1 encoding instead of UTF-8. This in turn means that the HTTP response Content-Type header has not been set with the proper charset attribute. In for example Firebug, you can verify this by inspecting the response headers of the page.
You're right that Facelets uses UTF-8 by default. But very early versions of Facelets weren't programmed to use UTF-8 by default. See also among others issue 46 and issue 53. Facelets is currently at 1.1.15.B1.
As to your attempts to fix it, the presence of the XML prolog is not strictly necessary and its encoding isn't used in any way to set the response encoding, it's only used by the XML parser to decode the inputstream to characters. Spring's filter is also not necessary, but that it didn't solve the problem after you added it is enough evidence that it's the client who has sent the data as ISO-8859-1.
Check, if your form has enctype="multipart/form-data".
See this question for more information.
