What is Megaphone?

The Megaphone project is about enhancing open source chat software. Specifically, the goal is to allow ejabberd to support 1,000,000 simultaneous users. See The Plan page for more details on how I plan to solve this problem. See the About this Blog page for more details on why I created this blog.

Saturday, January 14, 2012

Problem with Processes

Previously...
  • I started messing around with nodejs TCP servers.
  • I switched ECM over to using a straight nodejs TCP server.
  • I learned about parsing HTTP packets from a buffer.

The erlang:decode_packet function expects the caller to maintain state: the arguments passed to it must change depending on where in the packet the caller is. According to the Erlang documentation, decode_packet/3 has the following form:

    erlang:decode_packet(Type,Bin,Options) -> {ok,Packet,Rest} | {more,Length} | {error,Reason}

When calling decode_packet to decode the first line of an HTTP packet, the "Type" parameter is supposed to be "http". For subsequent calls, which decode the header lines, this parameter is supposed to be "httph".
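To make the switch from "http" to "httph" concrete, here is a minimal sketch that parses a request head held entirely in one binary. The module name http_decode_sketch and the functions parse/1 and headers/3 are made up for illustration and are not part of ejabberd or megaphone_socket. The sketch also assumes the whole head is already in memory, which is exactly the assumption that breaks down when data arrives across several calls to recv.

```erlang
-module(http_decode_sketch).
-export([parse/1]).

%% Hypothetical helper: parse a request line plus headers from one binary.
%% Returns {ok, Request, Headers, Body}, more, or {error, Reason}.
parse(Bin) ->
    %% First call: Type must be http to decode the request line.
    case erlang:decode_packet(http, Bin, []) of
        {ok, {http_request, Method, Uri, Version}, Rest} ->
            headers(Rest, {Method, Uri, Version}, []);
        {ok, {http_error, Line}, _Rest} ->
            {error, {bad_request_line, Line}};
        {ok, Other, _Rest} ->
            {error, {unexpected, Other}};
        {more, _Length} ->
            more;
        {error, Reason} ->
            {error, Reason}
    end.

%% Subsequent calls: Type must be httph to decode one header per call.
headers(Bin, Request, Acc) ->
    case erlang:decode_packet(httph, Bin, []) of
        {ok, {http_header, _Bit, Name, _Reserved, Value}, Rest} ->
            headers(Rest, Request, [{Name, Value} | Acc]);
        {ok, http_eoh, Body} ->
            %% End of headers: whatever is left is the body.
            {ok, Request, lists:reverse(Acc), Body};
        {ok, {http_error, Line}, _Rest} ->
            {error, {bad_header, Line}};
        {more, _Length} ->
            more;
        {error, Reason} ->
            {error, Reason}
    end.
```

Here the "state" lives in which function is currently running. A socket wrapper that returns to its caller between reads cannot rely on that, so it has to remember the current Type some other way.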

This causes a problem for megaphone_socket because that module must now maintain state information between calls to recv. My first instinct is to use gen_fsm, since modeling state is exactly what that module is designed to do, but there is a problem with that approach: memory consumption.

From some simple-minded tests that I've done, Erlang consumes somewhere between 4 and 7 kB each time you create a new process. With one process per connection, 1,000,000 connections translates to between 4 and 7 GB. Unless there is a way to reduce that consumption, I would probably be better off using some other approach.
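A rough way to reproduce this kind of measurement is sketched below. It is not the test that produced the 4 to 7 kB figure; the module name proc_mem_sketch and its functions are made up for illustration. It spawns idle processes, asks erlang:process_info/2 how much memory each one holds, and scales the average up to a given number of connections. The numbers will vary with the Erlang release, the word size, and what each process actually does.

```erlang
-module(proc_mem_sketch).
-export([per_process_bytes/1, projected_gb/2]).

%% Spawn N idle processes and return the average number of bytes each
%% one holds, as reported by erlang:process_info(Pid, memory).
per_process_bytes(N) when is_integer(N), N > 0 ->
    Pids = [spawn(fun idle/0) || _ <- lists:seq(1, N)],
    Total = lists:sum([memory_of(Pid) || Pid <- Pids]),
    %% Clean up the test processes once they have been measured.
    lists:foreach(fun(Pid) -> exit(Pid, kill) end, Pids),
    Total div N.

memory_of(Pid) ->
    {memory, Bytes} = erlang:process_info(Pid, memory),
    Bytes.

%% An idle process just blocks waiting for a message.
idle() ->
    receive
        stop -> ok
    end.

%% Scale a per-process cost up to a number of connections, in GB
%% (taking 1 GB as 1.0e9 bytes).
projected_gb(BytesPerProcess, Connections) ->
    BytesPerProcess * Connections / 1.0e9.
```

For example, proc_mem_sketch:projected_gb(4000, 1000000) works out to 4.0, which is the lower end of the estimate above.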

Since the call from ejabberd_http to megaphone_socket includes the "Socket" that it wants to read from, I can use that to index into some sort of structure to store the state of the connection and then go from there.
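One possible shape for such a structure is an ETS table keyed by the socket. This is only a sketch of the idea, not a decision about how megaphone_socket will work; the module name conn_state_sketch and all of its functions are made up for illustration. It stores just the next decode_packet Type for each socket. Keep in mind that an ETS table is owned by the process that creates it and disappears when that process exits, so some long-lived process still has to call init/0, and each entry costs some memory of its own.

```erlang
-module(conn_state_sketch).
-export([init/0, next_type/1, advance/1, forget/1]).

-define(TABLE, conn_state_sketch).

%% Create the shared table. The calling process becomes its owner.
init() ->
    ?TABLE = ets:new(?TABLE, [named_table, public, set]),
    ok.

%% Type to pass to erlang:decode_packet/3 for this socket.
%% A socket with no entry has not had its first line decoded yet.
next_type(Socket) ->
    case ets:lookup(?TABLE, Socket) of
        [{Socket, Type}] -> Type;
        [] -> http
    end.

%% Record that the first line has been decoded, so later calls use httph.
advance(Socket) ->
    true = ets:insert(?TABLE, {Socket, httph}),
    ok.

%% Drop the entry when the connection closes or the request is finished.
forget(Socket) ->
    true = ets:delete(?TABLE, Socket),
    ok.
```

A recv implementation could call next_type(Socket) before decoding, advance(Socket) after the first line, and forget(Socket) when the headers end, so no per-connection process is needed to remember where it is in the packet.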

The next step is to figure out how to do this without using a separate process.
