@ResponseBody reports UTF-8 charset in HTTP response but uses actually ISO-8859-1. Ha

  • @ResponseBody reports UTF-8 charset in HTTP response but uses actually ISO-8859-1. Ha

    I ran into a problem where our Spring MVC application claims to return data in UTF-8 but actually uses ISO-8859-1 when accessed from a browser. The server reports the encoding correctly when the HTTP request has no Accept header (or the header is present but has no value), but not when the header is present and has a value. I'd like the server to both report and use UTF-8. There are workarounds, but I'm wondering whether our server is incorrectly configured.


    I have this example controller:

    Code:
    import java.io.IOException;
    import java.io.OutputStream;
    
    import javax.servlet.http.HttpServletResponse;
    
    import org.springframework.stereotype.Controller;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.ResponseBody;
    
    @Controller
    public class TestController {
    	@RequestMapping("/logintest")
    	public @ResponseBody String test() {
    		return "-ä-語-";
    	}
    
    	@RequestMapping("/logintest2")
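    	public void test2(HttpServletResponse response) throws IOException {
    		// Encode explicitly as UTF-8 instead of relying on the platform default charset.
    		byte[] content = "-ä-語-".getBytes("UTF-8");
    		// Set the headers before writing the body so they cannot be lost if the response commits early.
    		response.setContentType("text/html; charset=UTF-8");
    		response.setContentLength(content.length);
    		OutputStream os = response.getOutputStream();
    		os.write(content);
    	}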
    }
    Now I open up Konsole, set its encoding to ISO-8859-1 and type in bash (formatted a bit for clarity):

    > echo "GET /app/logintest HTTP/1.1
    Host: localhost:8080
    Accept: foo/bar
    " | nc localhost 8080

    HTTP/1.1 200 OK
    Server: Apache-Coyote/1.1
    Content-Type: foo/bar;charset=UTF-8
    Content-Length: 5
    Date: Thu, 13 Oct 2011 11:49:50 GMT

    -ä-?-
    The server responds that it's serving content type foo/bar (which is funny, but I think unrelated) in encoding UTF-8. However, since the 'ä' character shows up as just one character and not two, and my terminal encoding is ISO-8859-1, it looks like the server's claimed and actual encodings conflict. This is confirmed by the fact that the Content-Length is 5, not 8. Furthermore, if I pipe the output to a file and open it in a hex editor, I see that the 'ä' character has hex value E4, just like in ISO-8859-1.
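
    The byte counts above can be reproduced outside the server. The following is a minimal sketch, assuming the test string consists of '-', 'ä', '-', '語', '-' (which is what the reported lengths of 5 and 8 imply). The class and method names are made up for illustration, and the string is written with Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.Charset;

// Illustrative only: compares the encoded size of the test string in both charsets.
class EncodingLengthDemo {
    // '-', U+00E4 (ä), '-', U+8A9E (語), '-'; assumed to match the controller's string.
    static final String TEXT = "-\u00e4-\u8a9e-";

    // Encodes the text with the named charset. Characters the charset cannot
    // represent are replaced by the charset's replacement byte ('?' for ISO-8859-1).
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    // Renders bytes as space-separated upper-case hex, like a hex editor would.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] latin1 = encode(TEXT, "ISO-8859-1");
        byte[] utf8 = encode(TEXT, "UTF-8");
        System.out.println("ISO-8859-1: " + latin1.length + " bytes: " + hex(latin1));
        System.out.println("UTF-8:      " + utf8.length + " bytes: " + hex(utf8));
    }
}
```

    Under that assumption the ISO-8859-1 form is 5 bytes (2D E4 2D 3F 2D, with 語 degraded to '?') and the UTF-8 form is 8 bytes (2D C3 A4 2D E8 AA 9E 2D), matching the two Content-Length values seen in the responses.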

    On the other hand, if I omit the Accept header from the request...

    > echo "GET /app/logintest HTTP/1.1
    Host: localhost:8080
    " | nc localhost 8080

    HTTP/1.1 200 OK
    Server: Apache-Coyote/1.1
    Content-Type: text/plain;charset=ISO-8859-1
    Content-Length: 5
    Date: Thu, 13 Oct 2011 11:52:14 GMT

    -ä-?-
    This is better since now the server at least reports the encoding correctly, though ISO-8859-1 doesn't support Chinese characters. (The same happens if the Accept header is present but has no value, i.e. "Accept: ")

    As a comparison, logintest2 returns data using the correct encoding:

    > echo "GET /app/logintest2 HTTP/1.1
    Host: localhost:8080
    Accept: foo/bar
    " | nc localhost 8080

    HTTP/1.1 200 OK
    Server: Apache-Coyote/1.1
    Content-Type: text/html;charset=UTF-8
    Content-Length: 8
    Date: Thu, 13 Oct 2011 11:59:20 GMT

    -ä-語-


    After finding out there's a problem I wanted to figure out if there's a problem with our server's configuration. I cd'ed to the directory of our project to search for xml files that define something related to encodings. I used search terms a bit more general than what I expected to find, e.g. iso-8859 instead of iso-8859-1. I typed:
    Code:
    > grep -i iso-8859 $(find -name \*.xml)
    > grep -i iso8859 $(find -name \*.xml)
    > grep -i windows-125 $(find -name \*.xml)
    > grep -i latin $(find -name \*.xml)
    These didn't find anything. On the other hand,
    Code:
    > grep -i 'utf' $(find -name \*.xml)|grep -v '^./target' | grep -v '<?xml version'
    returned three hits (plus some unrelated ones):

    web.xml has:
    Code:
    <filter>
        <filter-name>CharacterEncodingFilter</filter-name>
        <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
        <init-param>
    	<param-name>encoding</param-name>
    	<param-value>UTF-8</param-value>
        </init-param>
        <init-param>
    	<param-name>forceEncoding</param-name>
    	<param-value>true</param-value>
        </init-param>
    </filter>
    spring-context.xml has:
    Code:
    <bean id="viewResolver" class="org.springframework.web.servlet.view.freemarker.FreeMarkerViewResolver">
    	<property name="cache" value="true"/>
    	<property name="prefix" value=""/>
    	<property name="suffix" value=".ftl"/>
    	<property name="requestContextAttribute" value="request"/>
    	<property name="contentType" value="text/html;charset=UTF-8"></property>
    </bean>
    ./pom.xml has:
    Code:
    <properties>
      	<org.springframework.version>3.0.5.RELEASE</org.springframework.version>
    	<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    As pointed out in other places, CharacterEncodingFilter shouldn't have anything to do with this since it relates to HTTP requests, not responses. ViewResolver seems to apply only to files with suffix .ftl (we use those to create ModelAndView objects), but here we're just returning data directly without FreeMarker templates. The last hit seems to define the source file encoding, which is fine since I'm able to hard-code Chinese characters into the source.




    There are some other posts over the web that seem related, with various workarounds.

    The post Spring MVC response encoding issue seems to describe the same issue.

    The forum post Cannot set character encoding using @ResponseBody and HttpMessageConverter may describe the same problem.

    The forum post @ResponseBody and UTF-8 seems to describe the same problem, but some of the answers seem to answer a different question: how to set the charset reported by the server.

    The forum post Can not change response Charset encoding, it is always set to ISO-8859-1 describes a problem where the server reports ISO-8859-1 as the response encoding. In my case the server reports the encoding I'd like but it doesn't actually use it.

    The bug report @ResponseBody doesn't obey CharacterEncodingFilter seems to be related to setting the reported encoding (not the actually used one).



    I'm running my project in SpringSource Tool Suite 2.7.2. My project has Maven dependencies to files with names such as spring-web-3.0.5.RELEASE.jar, so I assume my Spring Framework is version 3.0.5. The server type is "VMware vFabric tc Server v2.5,2.6". I'm using Kubuntu Linux 11.04.

    (For the next two weeks I may not be able to follow this thread actively. My colleagues may, however.)

  • #2
    Can you describe the exact behavior you're looking for?

    Here are a few pointers that may help:
    1. StringHttpMessageConverter writes using whatever character encoding is requested (e.g. "Accept=text/plain;charset=UTF-8").
    2. If none is requested (e.g. "Accept=text/plain"), the converter checks and uses the default character encoding if set via StringHttpMessageConverter's defaultCharacterEncoding property.
    3. Otherwise the converter falls back on its DEFAULT_CHARSET constant, which is "ISO-8859-1".
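
    The three-step fallback described in the pointers above can be sketched in plain Java. This is a simplified model of that description and not Spring's actual source; the class, the method names and the media type parsing are hypothetical and cover only the simple "type/subtype;charset=NAME" form.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Hypothetical model of the charset fallback order described above.
class CharsetResolutionSketch {
    // Final fallback named in the reply above.
    static final Charset FALLBACK = Charset.forName("ISO-8859-1");

    // Returns the charset parameter of a media type string such as
    // "text/plain;charset=UTF-8", or null if there is none.
    static Charset requestedCharset(String mediaType) {
        if (mediaType == null) {
            return null;
        }
        String[] parts = mediaType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                return Charset.forName(param.substring("charset=".length()).trim());
            }
        }
        return null;
    }

    // 1. requested charset, 2. configured default (may be null), 3. ISO-8859-1.
    static Charset resolve(String acceptedMediaType, Charset configuredDefault) {
        Charset requested = requestedCharset(acceptedMediaType);
        if (requested != null) {
            return requested;
        }
        if (configuredDefault != null) {
            return configuredDefault;
        }
        return FALLBACK;
    }
}
```

    In this model a request with "Accept: foo/bar" and no configured default resolves to ISO-8859-1, which would be consistent with the 5-byte body reported in the first post.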

    Another option is to return ResponseEntity from your method instead of annotating it with @ResponseBody. Setting the Content-Type header of the ResponseEntity dictates the content type and character encoding to be used.

    Yet one more option in Spring 3.1 is to use the "produces" condition, which influences which content type is used to write to the response. See ResponseController.java in the mvc-showcase.



    • #3
      Thank you for your reply. I'd like the server to both report and use UTF-8 in its response. (It's ok if the server can serve the same content using other encodings, too, but it isn't required.) Currently the server claims to use UTF-8 when it actually uses ISO-8859-1 (if the Accept HTTP request header has a value, which it does when using a regular browser). The desired response looks like:

      HTTP/1.1 200 OK
      Server: Apache-Coyote/1.1
      Content-Type: text/html;charset=UTF-8
      Content-Length: 8
      Date: Thu, 13 Oct 2011 11:59:20 GMT

      -ä-語-
      I tried using the StringHttpMessageConverter bean, but it doesn't seem to help at all. I inserted this code into spring-context.xml:

      Code:
      <bean class="org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter">
      	<property name="messageConverters">
      		<array>
      			<bean class="org.springframework.http.converter.StringHttpMessageConverter">
      				<property name="supportedMediaTypes" value="text/html;charset=UTF-8" />
      			</bean>
      		</array>
      	</property>
      </bean>
      The ResponseEntity trick seems to work and it might be the one I'll go with. The "produces" condition you mentioned might be the most desirable one, but I suspect we're not going to change the version of the framework if there are other workarounds.

      I'm wondering though if Spring framework is working as it's supposed to. Why does it say it's serving UTF-8 when it isn't? Is it possible that this is a bug that should be filed?
      Last edited by jkk; Oct 27th, 2011, 09:01 AM. Reason: Added desired HTTP response



      • #4
        Your web.xml has a CharacterEncodingFilter and it's set to enforce a UTF-8 character encoding. So that explains why the response can say UTF-8 while the actual content written by the StringHttpMessageConverter is not. Probably what happens is the converter sets the content type and then the filter overrides it.
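
        What a client sees when the declared and actual charsets disagree can be shown without a server. The sketch below only simulates the effect (encode with one charset, decode with the one named in the header); it does not reproduce the filter or the converter, and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;

// Illustrative only: body bytes written in one charset, decoded by the client in another.
class CharsetMismatchDemo {
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");
    static final Charset UTF8 = Charset.forName("UTF-8");

    // Encodes with the charset actually used for the body and decodes with the
    // charset declared in the Content-Type header.
    static String asSeenByClient(String text, Charset actual, Charset declared) {
        byte[] body = text.getBytes(actual);
        return new String(body, declared);
    }

    public static void main(String[] args) {
        String original = "-\u00e4-"; // '-', ä, '-'

        // Body is ISO-8859-1 but labelled UTF-8: the lone E4 byte is not valid UTF-8,
        // so the decoder substitutes U+FFFD (the replacement character).
        String mismatched = asSeenByClient(original, LATIN1, UTF8);
        System.out.println(mismatched.equals(original));       // false
        System.out.println(mismatched.indexOf('\uFFFD') >= 0); // true

        // Body and label agree: the text survives intact.
        System.out.println(asSeenByClient(original, UTF8, UTF8).equals(original)); // true
    }
}
```

        This is why a browser that trusts the Content-Type header would render the 'ä' as a replacement character, while a terminal set to ISO-8859-1 shows it correctly.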

        The "supportedMediaTypes" property of StringHttpMessageConverter is used if the request doesn't have an Accept header. It sounds like in your case there is an Accept header and the character encoding in it is not UTF-8. ResponseEntity should work and so should the produces condition.



        • #5
          Thank you for your reply. I only used StringHttpMessageConverter as a test and we don't normally use it so I'm not sure if that was causing problems. ResponseEntity works well for us so that's what we'll go with.
