What is Megaphone?

The Megaphone project is about enhancing open source chat software. Specifically, the goal is to allow ejabberd to support 1,000,000 simultaneous users. See The Plan page for more details on how I plan to solve this problem. See the About this Blog page for more details on why I created this blog.

Friday, January 13, 2012

Parsing HTTP

Previously...
  • I decided to keep using the ejabberd BOSH server.
  • I started messing around with nodejs TCP servers.
  • I switched ECM over to using a straight nodejs TCP server.

Once ECM started sending straight HTTP, I needed to change megaphone to be able to parse it.  I was planning on using erlang:decode_packet to do this, but I was running into a problem.  The input looked like this:

POST /http-bind/ HTTP/1.1
Host: ubuntu2
User-Agent: Pidgin 2.6.6 (libpurple 2.6.6)
Content-Encoding: text/xml; charset=utf-8
Content-Length: 225
...

The first call to decode_packet returned

{http_request,'POST', {abs_path,"/http-bind/"}, {1,1}}

with the rest being everything from "Host:" onward.  Note that each line is terminated by a carriage-return/linefeed pair.

Thank you DOS for leaving us with that legacy, but I digress.

The second call, with the binary now starting at the second line, was returning:

{http_error,"Host: ubuntu2\r\n"}

which was not exactly what I expected.  After a bit of puttering about I discovered the following from the erlang documentation for decode_packet:

The protocol type http should only be used for the first line when a HttpRequest or a HttpResponse is expected. The following calls should use httph to get HttpHeader's until http_eoh is returned that marks the end of the headers and the beginning of any following message body.  
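To make that concrete, here is a minimal sketch of the two-phase parse the documentation describes: one call with http for the request line, then repeated calls with httph until http_eoh shows up. The module name http_parse_sketch and the function names parse/1 and headers/2 are made up for illustration and are not part of megaphone; it also assumes the whole request is already sitting in one binary.

```erlang
-module(http_parse_sketch).
-export([parse/1]).

%% Parse the request line with type `http`, then hand the remainder
%% to headers/2, which uses type `httph`.
parse(Bin) when is_binary(Bin) ->
    case erlang:decode_packet(http, Bin, []) of
        {ok, {http_request, Method, Uri, Version}, Rest} ->
            case headers(Rest, []) of
                {ok, Headers, Body} ->
                    {ok, {Method, Uri, Version}, Headers, Body};
                Other ->
                    Other
            end;
        {ok, {http_error, Line}, _Rest} ->
            {error, {bad_request_line, Line}};
        {more, _Length} ->
            more;
        {error, Reason} ->
            {error, Reason}
    end.

%% Collect headers until http_eoh marks the end of the header block.
headers(Bin, Acc) ->
    case erlang:decode_packet(httph, Bin, []) of
        {ok, {http_header, _Bit, Field, _Reserved, Value}, Rest} ->
            headers(Rest, [{Field, Value} | Acc]);
        {ok, http_eoh, Body} ->
            {ok, lists:reverse(Acc), Body};
        {ok, {http_error, Line}, _Rest} ->
            {error, {bad_header, Line}};
        {more, _Length} ->
            more;
        {error, Reason} ->
            {error, Reason}
    end.
```

Well-known header names such as Host come back as atoms ('Host'), while unrecognized ones come back as strings, so a lookup like proplists:get_value('Host', Headers) should work on the result.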

Of course I knew this all along and I was just testing the readers (OK, I guess it's just singular) to see if they would catch this point.  

So I was testing you.  Yeah.  It was not that I myself needed to RTFM.

At any rate, the next step is to get megaphone_socket to maintain some state in between calls so that it will keep the data from call to call and use the correct data for each invocation of decode_packet.
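One way that state might look, purely as a sketch: carry the current decode type (http or httph), the unconsumed bytes, and whatever has been parsed so far, and feed each new chunk from the socket through it. The module http_state_sketch, the functions new/0 and feed/2, and the shape of the state tuple are all hypothetical; this is not how megaphone_socket actually does it.

```erlang
-module(http_state_sketch).
-export([new/0, feed/2]).

%% State is {Phase, Buffer, Request, Headers}:
%%   Phase   - decode type for the next call (http, then httph)
%%   Buffer  - bytes received but not yet consumed
%%   Request - {Method, Uri, Version} once the first line is parsed
%%   Headers - headers seen so far, newest first
new() ->
    {http, <<>>, undefined, []}.

%% Append newly received data to the buffer and parse as far as possible.
feed(Data, {Phase, Buf, Req, Hdrs}) ->
    step({Phase, <<Buf/binary, Data/binary>>, Req, Hdrs}).

step({Phase, Buf, Req, Hdrs} = State) ->
    case erlang:decode_packet(Phase, Buf, []) of
        {ok, {http_request, M, U, V}, Rest} ->
            %% First line done: every later call uses httph.
            step({httph, Rest, {M, U, V}, Hdrs});
        {ok, {http_header, _Bit, Field, _Reserved, Value}, Rest} ->
            step({httph, Rest, Req, [{Field, Value} | Hdrs]});
        {ok, http_eoh, Rest} ->
            %% Rest is the start of the message body.
            {done, Req, lists:reverse(Hdrs), Rest};
        {ok, {http_error, Line}, _Rest} ->
            {error, {http_error, Line}};
        {ok, Other, _Rest} ->
            {error, {unexpected, Other}};
        {more, _Length} ->
            %% Not enough data yet: keep the state for the next call.
            {more, State};
        {error, Reason} ->
            {error, Reason}
    end.
```

The caller would keep the state returned in {more, State} and pass it back in with the next chunk, so a header that is split across two TCP reads still gets parsed with the right type.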
