I guess my comment below about the overhead of serialisation in container-local mode is wrong? Nevertheless, having a local thread implementation gives some benefits. For example, I use it to choose whether the thread sleeps when there is no request in the queue or spins, repeatedly checking the queue for a request, so that the request queue processing itself adds no delay.
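To make the sleep-versus-spin choice concrete, here is a minimal sketch of a worker that spins on an empty queue for a bounded number of checks and then falls back to short sleeps. It is only an illustration, not the actual implementation discussed here: the names `worker`, `run_demo`, `spin_limit` and `idle_sleep` are hypothetical, and the threshold values are arbitrary.

```python
import queue
import threading
import time


def worker(requests, handled, stop, spin_limit=1000, idle_sleep=0.001):
    """Poll `requests`; spin while recently busy, sleep once idle for a while.

    spin_limit and idle_sleep are illustrative values, not tuned ones.
    """
    empty_checks = 0
    while not stop.is_set():
        try:
            item = requests.get_nowait()  # non-blocking check of the queue
        except queue.Empty:
            empty_checks += 1
            if empty_checks >= spin_limit:
                # Idle for long enough: yield the CPU instead of spinning.
                time.sleep(idle_sleep)
            continue
        empty_checks = 0  # a request arrived, so go back to spinning
        handled.append(item)  # stand-in for the real request processing


def run_demo(n=5):
    """Feed n requests to a single worker and return them in handled order."""
    requests = queue.Queue()
    handled = []
    stop = threading.Event()
    thread = threading.Thread(target=worker, args=(requests, handled, stop))
    thread.start()
    for i in range(n):
        requests.put(i)
    # Wait (with a timeout) until the worker has drained the queue.
    deadline = time.monotonic() + 5.0
    while len(handled) < n and time.monotonic() < deadline:
        time.sleep(0.001)
    stop.set()
    thread.join()
    return handled
```

Calling `run_demo(5)` starts one worker, submits five requests and returns them in the order they were handled. A larger `spin_limit` favours latency at the cost of CPU time; a smaller one does the opposite.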