This is the bocoup web log with posts from Al, Boaz, Rick, Sam, Nate, Nick & Pete. You should also make sure to check out code.bocoup.com, where we keep the finished versions of ideas we kick around here.


Apr 12, 2010

Javascript Enumerable.Map() with WebWorkers

For those with short attention spans, here’s how you call the function:

map(enumerable, mapFunction, callback, numWorkers);

I wanted an easy way to divide up a parallelizable task with Web Workers, so I created a Worker-enabled map function for arrays and objects. It works just like the map function in your favorite functional language, except that it executes asynchronously and hands the result to a callback. And, of course, it can run several times faster on multicore machines.

The function creates a pool of workers (32 by default) and divides the work up among them. It reassembles the results into a new object of the same type as the original. Array order is preserved.
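
The division of work can be pictured without starting any workers. The following is only a minimal sketch and not part of the original function: `assignRoundRobin` and `emptyLike` are hypothetical helpers that show which keys each worker in the pool would receive, and how an empty result of the same type as the input can be created.

```javascript
// Hypothetical helper (not in the original post): lists the keys each worker
// would receive when tasks are dealt out round-robin.
function assignRoundRobin(data, numWorkers) {
  var keys = [];
  for (var k in data) { keys.push(k); }
  // Never create more workers than there are items
  var poolSize = Math.min(numWorkers || 32, keys.length);
  var buckets = [];
  for (var i = 0; i < poolSize; i++) { buckets.push([]); }
  for (var j = 0; j < keys.length; j++) {
    buckets[j % poolSize].push(keys[j]);
  }
  return buckets;
}

// Hypothetical helper: an empty container of the same type as the input.
function emptyLike(data) {
  return new data.constructor();
}

var buckets = assignRoundRobin(["a", "b", "c", "d", "e"], 2);
// buckets[0] holds the keys "0", "2", "4"; buckets[1] holds "1", "3"
```

Because each result is written back under its original key, the order in which workers reply does not affect the order of the final array.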

Here’s the function:

function map(data, mapper, callback, numWorkers) {
 
  // Support arrays & objects: count the enumerable keys
  var length = 0;
  for (var d in data) { length++; }
  var result = new data.constructor();
 
  // Nothing to map: report the empty result instead of waiting forever
  if (length === 0) { callback(result); return; }
 
  numWorkers = Math.min(numWorkers || 32, length);
  var workers = [];
  var messagesReceived = 0;
 
  // Create the workers
  for (var i = 0; i < numWorkers; i++) {
    workers[i] = new Worker("mapper.js");
    workers[i].addEventListener('message', function(e) {
      result[e.data.key] = e.data.value;
      // Check if we have finished the job.  This should probably be more robust.
      if (++messagesReceived === length) {
        // Shut the pool down so idle workers don't linger
        for (var w = 0; w < workers.length; w++) { workers[w].terminate(); }
        callback(result);
      }
    }, false);
  }
 
  // Just send out all the tasks.  The messages get queued by the browser.
  // It would probably be better to queue up two or three tasks per worker (to minimize downtime)
  // and add tasks to the queues as results come back.
  var nextItem = 0;
  for (var key in data) {
    workers[nextItem++ % numWorkers].postMessage({key: key, value: data[key], mapper: "(" + String(mapper) + ")(value)"});
  }
 
}
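
The comments in the function suggest keeping only two or three tasks queued per worker and handing out more as results come back. Here is one way that might look. It is a rough sketch, not a tested replacement: the name `mapQueued`, the `options` object and its `createWorker`, `perWorker` and `numWorkers` fields are all assumptions made for illustration. `createWorker` exists only so the scheduling logic can be exercised with a stand-in object; in a browser it defaults to `new Worker("mapper.js")`.

```javascript
// Sketch: keep at most `perWorker` tasks in flight per worker and hand out a
// new task whenever a result comes back. All option names are hypothetical.
function mapQueued(data, mapper, callback, options) {
  options = options || {};
  var createWorker = options.createWorker ||
    function () { return new Worker("mapper.js"); };
  var perWorker = options.perWorker || 2;

  var keys = [];
  for (var k in data) { keys.push(k); }
  var result = new data.constructor();
  if (keys.length === 0) { callback(result); return; }

  var numWorkers = Math.min(options.numWorkers || 32, keys.length);
  var source = "(" + String(mapper) + ")(value)";
  var next = 0;      // index of the next key to hand out
  var received = 0;  // number of results received so far
  var workers = [];

  // Send the next pending task, if any, to the given worker
  function feed(worker) {
    if (next < keys.length) {
      var key = keys[next++];
      worker.postMessage({key: key, value: data[key], mapper: source});
    }
  }

  // Store a result, then either finish or refill the worker that replied
  function onResult(worker, e) {
    result[e.data.key] = e.data.value;
    if (++received === keys.length) {
      workers.forEach(function (w) { w.terminate(); });
      callback(result);
    } else {
      feed(worker);
    }
  }

  for (var i = 0; i < numWorkers; i++) {
    (function (worker) {
      workers.push(worker);
      worker.addEventListener("message", function (e) {
        onResult(worker, e);
      }, false);
    })(createWorker());
  }

  // Prime every worker with up to `perWorker` tasks
  for (var round = 0; round < perWorker; round++) {
    workers.forEach(feed);
  }
}
```

A call would look like mapQueued(data, mapper, callback, {numWorkers: 4, perWorker: 3}). The trade-off is a little more bookkeeping in exchange for slow items no longer piling up behind one unlucky worker.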

And here is the worker code. The worker is pretty bare bones, as you might have assumed.

// Minion
onmessage = function(e) {
    // `value` must keep this name: the mapper string ends in "(value)"
    var value = e.data.value;
    postMessage({key: e.data.key, value: eval(e.data.mapper)});
};
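
Because the mapper crosses the worker boundary as source text, it can only use its argument and whatever is defined in the worker's own scope; variables it closed over on the page do not travel with it. The sketch below reproduces that round trip without a worker. `serialize` and `runAsWorkerWould` are hypothetical names standing in for the string-building in map and for the body of the worker's onmessage handler.

```javascript
// Hypothetical stand-in for the worker: evaluates the mapper source against
// a local variable named `value`, as the onmessage handler does.
function runAsWorkerWould(mapperSource, value) {
  return eval(mapperSource);
}

// Builds the same string that map() posts to each worker
function serialize(mapper) {
  return "(" + String(mapper) + ")(value)";
}

var twice = function (x) { return x * 2; };
var doubled = runAsWorkerWould(serialize(twice), 21); // 42

// A mapper that relies on a closed-over variable loses it in transit.
var closureFailed = (function () {
  var factor = 3;
  var scale = function (x) { return x * factor; };
  try {
    runAsWorkerWould(serialize(scale), 2);
    return false;
  } catch (err) {
    // `factor` is not visible where the source text is evaluated
    return err instanceof ReferenceError;
  }
})();
```

In practice this means a mapper should be written as a self-contained function of its single argument.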

Next, I’m thinking I’ll do a WebWorker implementation of MapReduce. I would probably use the syntax from CouchDB (emit, in other words) in the interest of standardization, but I am far from an expert on these things and would love to hear any feedback.



Comments:

5 Comments

  1. Posted April 12, 2010 at 3:33 pm | Permalink

    the improvements you made since I last saw this are definitely very cool

  2. Posted April 12, 2010 at 5:20 pm | Permalink

    Nice! I’m pretty excited to see what shiny new things we’ll be reaping from what appears to be a building momentum behind workers.

    To add something useful: As you indicate in your code, right now it’s rather hard to write a good feedback mechanism to really tune performance. The UAs are all doing what’s optimal for them, and the spec is (intentionally, for now) underdefined wrt things like determining the maximum number of simultaneous workers. You might already follow the WHATWG mailing list and be well aware of this stuff, but, if not, you might find this collection of threads interesting:
    http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-December/024396.html

    Nice to see this post. Hopefully this also means you guys will stop teasing us with that Hive stuff soon, too =)

  3. Posted April 12, 2010 at 6:06 pm | Permalink

    Looking over this I see I forgot one vital piece of information: the map function takes a single argument, which is the value (not two arguments, which you might be expecting if you’re thinking of Google MapReduce).

  4. Posted April 13, 2010 at 5:19 am | Permalink

    Very cool!

    This reminds me of Metaworker – “A javascript work parallelizer/distributor library for both HTML5 web workers and server-side nodejs” – http://github.com/Maciek416/metaworker.

    I’ve also played with workers and created the pmrpc library – a library for RPC-style communication with web workers (and iframes/windows) – http://bit.ly/JMtkm.

  5. Posted April 15, 2010 at 6:49 am | Permalink

    Metaworker looks pretty cool, and it is similar to what I was planning to implement as far as map reduce goes, except that my fallback would be single-threaded execution, not server-side workers. Also, I’m kind of thinking that since you need the worker js files anyway, you may as well write the logic there instead of passing functions in. Still, maybe I’ll fork MW instead of rolling from scratch.

    Thanks for the tip.




All code on this website is Open Source