Clarifying Orleans Messaging Guarantees

There has been some confusion around Orleans messaging guarantees that I want to take a moment to clarify.  In past talks on Halo 4 and Orleans I mistakenly said that Orleans provides At Least Once messaging guarantees.  However, this is not the default mode.  By default, Orleans delivers messages At Most Once.

It's also worth pointing out that section 3.10 of the paper Orleans: Distributed Virtual Actors for Programmability and Scalability says “Orleans provides at-least-once message delivery, by resending messages that were not acknowledged after a configurable timeout,” which describes the non-default, configurable behavior.  This, along with some of my talks, has led to some of the confusion.

In Orleans, when messages are sent between grains, the default message passing is request/response.  If a message is acknowledged with a response, it is guaranteed to have been delivered.  Internally, Orleans does best-effort delivery.  In doing so it may retry certain internal operations; however, this does not change the overall application-level messaging guarantee of At Most Once.  This is similar to TCP: TCP may retry internally, but the application code using the protocol will receive the message once or zero times.
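To make the once-or-zero-times behavior concrete, here is a minimal simulation in Python. It is not Orleans code; LossyChannel, Receiver, and send_once are hypothetical names invented for this sketch, and the drop rate is arbitrary. The sender makes a single attempt and either gets a response or a timeout (modeled as None).

```python
import random


class LossyChannel:
    """Hypothetical transport that may drop a request or its response."""

    def __init__(self, drop_rate, seed=0):
        self.drop_rate = drop_rate
        self.rng = random.Random(seed)

    def delivered(self):
        # True when this hop succeeds, False when the message is lost.
        return self.rng.random() >= self.drop_rate


class Receiver:
    """Hypothetical receiver that counts how often it processed a message."""

    def __init__(self):
        self.received = 0

    def handle(self, message):
        self.received += 1
        return f"ack:{message}"


def send_once(channel, receiver, message):
    """Send one request with no retry; return the response, or None on timeout."""
    if not channel.delivered():
        return None  # request lost: the receiver never saw it
    response = receiver.handle(message)
    if not channel.delivered():
        return None  # response lost: the receiver DID process the message
    return response
```

A response proves delivery. A timeout proves nothing either way, because the request or the response may have been the one that was lost. In both cases the receiver processed the message at most once.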

Orleans can be configured to retry automatically upon timeout, up to a maximum number of retries.  To get At Least Once messaging you would need to implement infinite retries.  Enabling retries is not the recommended configuration, since in some failure scenarios it can create a storm of failed retries in the system.  Instead, it is recommended that application-level logic handle retries when necessary.
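As a rough illustration of application-level retries, the caller can wrap a request in a bounded retry loop with backoff. This is a sketch in Python, not the Orleans API; CallTimeout and call_with_retries are hypothetical names, and the attempt limit and delays are arbitrary assumptions.

```python
import time


class CallTimeout(Exception):
    """Hypothetical error raised when a request gets no response in time."""


def call_with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on CallTimeout with exponential backoff.

    Retries are bounded, so this alone does not give At Least Once delivery;
    after max_attempts the timeout is re-raised for the caller to handle.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except CallTimeout:
            if attempt == max_attempts:
                raise
            # Back off before retrying to avoid piling onto a struggling system.
            sleep(base_delay * 2 ** (attempt - 1))
```

Because a timed-out request may still have been processed, any operation retried this way can run more than once, so it should be safe to repeat.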

In the Halo Services we ran Orleans in the default mode, At Most Once message delivery.  This guarantee was sufficient for some services, like the Halo Presence Service.  However, the Halo Statistics Service needed to process every message to guarantee that player data was correct.  So in addition to using Orleans to process the data, we used Azure Service Bus to durably store statistics and enable retries, ensuring that all statistics data was processed.  The Orleans grains processing player stats were designed to receive messages at least once, which led us to design idempotent operations for updating player statistics.
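One common way to make an update idempotent under at-least-once delivery is to record the IDs of messages already applied and skip repeats. The Python sketch below illustrates that general idea only; it is not the actual Halo implementation, and PlayerStats, message_id, and kills are hypothetical names. It keeps the applied IDs in memory, whereas a real service would need to persist them together with the stats.

```python
class PlayerStats:
    """Hypothetical per-player state that tolerates redelivered messages."""

    def __init__(self):
        self.kills = 0
        self.applied_ids = set()  # IDs of messages already folded into the stats

    def apply(self, message_id, kills):
        """Apply a stats message once; return False if it was a duplicate."""
        if message_id in self.applied_ids:
            return False  # redelivery: state is left unchanged
        self.applied_ids.add(message_id)
        self.kills += kills
        return True
```

For example, applying the same message twice leaves the total unchanged:

```python
stats = PlayerStats()
stats.apply("game-1", 7)
stats.apply("game-1", 7)  # duplicate delivery, ignored
print(stats.kills)        # prints 7
```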

I hope this helps clarify Orleans messaging guarantees.  This has also been documented on the Orleans GitHub Wiki.
