Bug 522 - Raw POST request body gets lost if no query is contained
Status: RESOLVED FIXED
Product: Helma
Component: Web Support
Version: CVS trunk
Platform: Other Windows
Importance: P1 normal
Reported: 2007-05-11 14:58
Modified: 2007-06-05 11:05


Description From 2007-05-11 14:58:03
When I post data conforming to a query-like structure to a Helma action I get
the key/value pairs resolved in req.data:

# POST http://localhost:8080/post
Please enter content (application/x-www-form-urlencoded) to be POSTed:
n=1&foo=bar

// Root/post.hac:
res.writeln(req.data.n);   // "1"
res.writeln(req.data.foo); // "bar"

However, if the data does not conform to this structure, there is currently no
way (at least none that I know of) to get the request content:

# POST http://localhost:8080/post
Please enter content (application/x-www-form-urlencoded) to be POSTed:
Hello, World!

Here, the only option I found so far is using req.servletRequest.inputStream
which would provide access to the content; unfortunately, it does not return
anything:

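// Root/post.hac
var buffer = java.lang.reflect.Array.newInstance(java.lang.Byte.TYPE, 1024);
var result = [], length;
// readLine() fills a Java byte[] and returns -1 at the end of the stream
while ((length = req.servletRequest.inputStream.readLine(buffer, 0, 1024)) != -1) {
   result.push(String(new java.lang.String(buffer, 0, length)));
}
res.writeln(result.join("")); // ""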

Is there any way the raw content of a request's body could be made accessible?

(Sidenote: PHP provides the "php://input" quirk [1] to achieve this. Obviously,
we are not alone here.)

--
[1] http://at.php.net/manual/en/language.variables.external.php#30485
------- Comment #1 From 2007-05-31 11:49:18 -------
The problem seems to be that the request is sent with content type
application/x-www-form-urlencoded, which triggers the Helma servlet to parse
the body as form parameters. The parser then chokes because the body is not in
the key1=value1&key2=value2 format it expects. This explains why the input
stream is already consumed and you don't get anything from
req.servletRequest.inputStream.
(There's also a req.servletRequest.reader, which is a java.io.BufferedReader
and might be more convenient to read, btw).

The solution I'm likely to implement is to store any unnamed data in an
x-www-form-urlencoded request into some place like req.data.http_body. For
example, with post body "foo=bar&xxx", req.data.http_body would be set to
"xxx", with body "foo+bar", req.data.http_body would be set to "foo bar", while
for body "foo=bar" req.data.http_body wouldn't be set at all.

Any other suggestions?
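
A rough sketch of the splitting idea described above, in plain JavaScript. This
is not Helma's actual parser: splitFormBody and decode are made-up names, and
joining several unnamed parts with "&" is an assumption on my part.

```javascript
// Sketch only, not the Helma implementation.
// Splits an x-www-form-urlencoded body into named parameters and a
// "remainder" made of all parts that carry no "=" sign.
function splitFormBody(body) {
   var params = {};
   var unnamed = [];
   var parts = body.split("&");
   for (var i = 0; i < parts.length; i++) {
      var part = parts[i];
      if (part === "") continue; // skip empty segments such as "a=1&&b=2"
      var eq = part.indexOf("=");
      if (eq < 0) {
         // No "=": unnamed data, collected for the remainder
         unnamed.push(decode(part));
      } else {
         params[decode(part.substring(0, eq))] = decode(part.substring(eq + 1));
      }
   }
   return {
      params: params,
      // Left undefined when every part was a proper key=value pair
      remainder: unnamed.length > 0 ? unnamed.join("&") : undefined
   };
}

// "+" encodes a space in form data; decodeURIComponent handles %XX escapes
// (and throws a URIError on malformed escapes, which this sketch ignores).
function decode(s) {
   return decodeURIComponent(s.replace(/\+/g, " "));
}
```

With the bodies from the examples above, "foo=bar&xxx" gives the remainder
"xxx", "foo+bar" gives "foo bar", and "foo=bar" leaves the remainder unset.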
------- Comment #2 From 2007-05-31 13:28:53 -------
(In reply to comment #1)
> The problem seems to be that the request is sent with content type
> application/x-www-form-urlencoded, which triggers the Helma servlet to parse
> the body as form parameters. The parser then chokes because the body is not in
> the key1=value1&key2=value2 format it expects. This explains why the input
> stream is already consumed and you don't get anything from
> req.servletRequest.inputStream.

That's what I observed, too: as soon as I posted such key/value pairs in the
HTTP body Helma recognized the data.

> (There's also a req.servletRequest.reader, which is a java.io.BufferedReader
> and might be more convenient to read, btw).

I tried the reader at first but got a java.lang.IllegalStateException which
according to the API docs is due to "the getInputStream() method has already
been called for this request" [1].

> The solution I'm likely to implement is to store any unnamed data in an
> x-www-form-urlencoded request into some place like req.data.http_body. For
> example, with post body "foo=bar&xxx", req.data.http_body would be set to
> "xxx", with body "foo+bar", req.data.http_body would be set to "foo bar", while
> for body "foo=bar" req.data.http_body wouldn't be set at all.

I like this solution, it sounds good.

Or could this issue be solved by changing the content type of the request itself?

Ciao,
tobi

--
[1] http://java.sun.com/products/servlet/2.2/javadoc/javax/servlet/ServletRequest.html#getReader()
------- Comment #3 From 2007-05-31 13:56:03 -------
> I tried the reader at first but got a java.lang.IllegalStateException which
> according to the API docs is due to "the getInputStream() method has already
> been called for this request" [1].

Yes, that's because the x-www-form-urlencoded content type tricks Helma into
reading the request body using the input stream.

> Or could this issue be solved by changing the content type of the request itself?

Yes. If you set the content-type to something like "text/plain" or "text/xml",
Helma won't touch the input stream and you'll be able to do your own reading
using the InputStream or Reader.
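
With a content type such as "text/plain", reading the body through the reader
could look roughly like this. It is a minimal sketch: readBody is a made-up
helper name, and it only assumes an object with a readLine() method that
returns null at the end of the input, as java.io.BufferedReader does.

```javascript
// Sketch only: collects everything a BufferedReader-like object returns.
// readLine() strips line terminators, so the lines are re-joined with "\n"
// and the original line endings are not preserved.
function readBody(reader) {
   var lines = [];
   var line;
   while ((line = reader.readLine()) != null) {
      lines.push(String(line)); // convert a Java string to a JS string
   }
   return lines.join("\n");
}
```

In an action this would presumably be called as
readBody(req.servletRequest.reader), provided nothing has read the input
stream before.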
------- Comment #4 From 2007-05-31 16:36:57 -------
Considering that all this also applies to GET parameters, and to make it a bit
easier to understand that this is about "left-over" data in the query string or
post data, I propose to use the following names:

req.data.http_get_remainder
req.data.http_post_remainder

What do people think? Is adding more stuff to req.data problematic, even if we
use the http_ prefix?
------- Comment #5 From 2007-06-05 11:05:28 -------
I committed the http_post_remainder/http_get_remainder thing.
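
For illustration, an action could pick up the left-over POST data along these
lines. The property name is the one proposed in comment #4; the exact behaviour
of the committed code is not shown in this report, so treat the details as
assumptions. writeRawPost is a hypothetical helper, and req and res stand for
the usual Helma request and response objects.

```javascript
// Sketch only: writes the unnamed POST data, if there is any.
// Assumes req.data.http_post_remainder is left unset when the body
// consists of proper key=value pairs only.
function writeRawPost(req, res) {
   var body = req.data.http_post_remainder;
   if (body == null) {
      res.writeln("(no unnamed POST data)");
   } else {
      res.writeln(body);
   }
}
```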