Bugzilla – Bug 522
Raw POST request body gets lost if no query is contained
Last modified: 2007-06-05 11:05:28
When I post data conforming to a query-like structure to a Helma action, the key/value pairs are resolved in req.data:

    # POST http://localhost:8080/post
    Please enter content (application/x-www-form-urlencoded) to be POSTed:
    n=1&foo=bar

    // Root/post.hac:
    res.writeln(req.data.n); // "1"
    res.writeln(req.data.foo); // "bar"

However, if the data does not conform to this structure, there is currently no way (at least none that I know of) to get the request content:

    # POST http://localhost:8080/post
    Please enter content (application/x-www-form-urlencoded) to be POSTed:
    Hello, World!

Here, the only option I have found so far is req.servletRequest.inputStream, which should provide access to the content; unfortunately, it does not return anything:

    // Root/post.hac
    var result = [];
    while (req.servletRequest.inputStream.readLine(result, 0, 1024)) {}
    res.writeln(result.join("")); // ""

Is there any way the raw content of a request's body could be made accessible?

(Sidenote: PHP provides the "php://input" quirk [1] to achieve this. Obviously, we are not alone here.)

--
[1] http://at.php.net/manual/en/language.variables.external.php#30485
The problem seems to be that the request is sent with content type application/x-www-form-urlencoded, which triggers the Helma servlet to parse the body as form parameters, but then chokes because it's not the key1=value1&key2=value2 format it expects. This explains why the input stream is consumed and you don't get anything from the req.servletRequest.inputStream. (There's also a req.servletRequest.reader, which is a java.io.BufferedReader and might be more convenient to read, btw).

The solution I'm likely to implement is to store any unnamed data in an x-www-form-urlencoded request into some place like req.data.http_body. For example:

    post body "foo=bar&xxx": req.data.http_body would be set to "xxx"
    post body "foo+bar":     req.data.http_body would be set to "foo bar"
    post body "foo=bar":     req.data.http_body wouldn't be set at all

Any other suggestions?
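The splitting proposed above can be illustrated with a small standalone sketch. This is not Helma's actual parser: the function names splitFormBody and decodePart are hypothetical, "unnamed" is assumed to mean any "&"-separated part without an "=", and joining several unnamed parts with "&" is likewise an assumption, since the comment does not say how they would be combined.

```javascript
// Hypothetical sketch of the proposed behavior, not Helma's real implementation.
// Named pairs (key=value) become properties; parts without "=" are collected
// into http_body. Joining multiple unnamed parts with "&" is an assumption.
function splitFormBody(body) {
    var data = {};
    var unnamed = [];
    var parts = body.split("&");
    for (var i = 0; i < parts.length; i++) {
        var part = parts[i];
        if (part === "") continue;          // skip empty segments such as "a=1&&b=2"
        var eq = part.indexOf("=");
        if (eq === -1) {
            unnamed.push(decodePart(part)); // no "=": treat as unnamed data
        } else {
            data[decodePart(part.substring(0, eq))] = decodePart(part.substring(eq + 1));
        }
    }
    if (unnamed.length > 0) {
        data.http_body = unnamed.join("&");
    }
    return data;
}

// Form decoding: "+" means space, then percent-decoding.
// Note: decodeURIComponent throws a URIError on malformed "%" sequences.
function decodePart(s) {
    return decodeURIComponent(s.replace(/\+/g, " "));
}
```

With this sketch, splitFormBody("foo=bar&xxx") yields foo "bar" and http_body "xxx", splitFormBody("foo+bar") yields http_body "foo bar", and splitFormBody("foo=bar") leaves http_body unset, matching the three cases listed in the comment.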
(In reply to comment #1)
> The problem seems to be that the request is sent with content type
> application/x-www-form-urlencoded, which triggers the helma servlet to parse
> the body as form parameters, but then chokes because it's not the
> key1=value1&key2=value2 format it expects. This explains why the input stream
> is consumed and you don't get anything from the req.servletRequest.inputStream.

That's what I observed, too: as soon as I posted such key/value pairs in the HTTP body, Helma recognized the data.

> (There's also a req.servletRequest.reader, which is a java.io.BufferedReader
> and might be more convenient to read, btw).

I tried the reader at first but got a java.lang.IllegalStateException, which according to the API docs is due to "the getInputStream() method has already been called for this request" [1].

> The solution I'm likely to implement is to store any unnamed data in an
> x-www-form-urlencoded request into some place like req.data.http_body. For
> example, with post body "foo=bar&xxx", req.data.http_body would be set to
> "xxx", with body "foo+bar", req.data.http_body would be set to "foo bar", while
> for body "foo=bar" req.data.http_body wouldn't be set at all.

I like this solution, it sounds good. Or could this issue be solved by changing the content type of the very request?

Ciao,
tobi

--
[1] http://java.sun.com/products/servlet/2.2/javadoc/javax/servlet/ServletRequest.html#getReader()
> I tried the reader at first but got a java.lang.IllegalStateException which
> according to the API docs is due to "the getInputStream() method has already
> been called for this request" [1].

Yes, that's because the x-www-form-urlencoded content type tricks Helma into reading the request body using the input stream.

> Or could this issue be solved by changing the content type of the very request?

Yes. If you set the content type to something like "text/plain" or "text/xml", Helma won't touch the input stream and you'll be able to do your own reading using the InputStream or Reader.
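Reading the body yourself could look roughly like the following sketch. It assumes the request was sent with a content type such as "text/plain", so that Helma leaves the stream untouched, and it relies only on java.io.BufferedReader.readLine() returning null at end of input. The helper names readAll and fakeReader are hypothetical; fakeReader is just a stand-in so the snippet can be tried outside Helma.

```javascript
// Hypothetical helper: drains a reader-like object (anything with a readLine()
// method that returns null at end of input, e.g. a java.io.BufferedReader).
// readLine() strips line terminators, so lines are re-joined with "\n";
// the original terminators ("\r\n" vs "\n") are not preserved.
function readAll(reader) {
    var lines = [];
    var line;
    while ((line = reader.readLine()) != null) {
        lines.push(String(line)); // String() also converts a Java string to a JS string
    }
    return lines.join("\n");
}

// Stand-in for req.servletRequest.reader, for trying the helper outside Helma.
function fakeReader(text) {
    var lines = text.split("\n");
    var pos = 0;
    return {
        readLine: function () {
            return pos < lines.length ? lines[pos++] : null;
        }
    };
}

// Inside a Helma action (assumption: request sent as text/plain):
//     res.writeln(readAll(req.servletRequest.reader));
```

Note that a servlet request allows either getReader() or getInputStream() to be used, not both, which is the cause of the IllegalStateException quoted above.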
Considering that all this also applies to GET parameters, and to make it a bit easier to understand that this is about "left-over" data in the query string or post data, I propose to use the following names:

    req.data.http_get_remainder
    req.data.http_post_remainder

What do people think? Is adding more stuff to req.data problematic, even if we use the http_ prefix?
I committed the http_post_remainder/http_get_remainder thing.
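With that change, the action from the original report might be written along these lines. This is only a sketch: it assumes http_post_remainder holds the decoded left-over data as described for http_body earlier in this thread, and it wraps the action body in a hypothetical function, postAction, so it can be exercised with stand-in req and res objects outside Helma.

```javascript
// Hypothetical rewrite of Root/post.hac as a plain function.
// Assumption: req.data.http_post_remainder is only set when the POST body
// contained data that was not in key=value form.
function postAction(req, res) {
    var body = req.data.http_post_remainder;
    if (body != null) {
        res.writeln(body);
    } else {
        res.writeln("(no unnamed POST data)");
    }
}
```

In a real .hac file the function wrapper would be dropped, and req and res would be the objects Helma provides.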