Upcoming Changes to AMS Drain Behavior

Hello everyone,

We want to inform you about an upcoming change to dedicated server drain behavior in AMS. No action is required for your game or fleet configuration, but you may choose to reduce the drain timeout setting on your fleets after this change takes effect.

What is changing

After receiving a drain signal, a dedicated server (DS) which has been claimed, and is therefore in the “in session” state, will no longer be transitioned to the “draining” state. This will leave it subject to the session timeout, rather than the drain timeout. Dedicated servers in any other state will still transition to “draining” and be subject to the drain timeout.
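
As a rough illustration of the new behavior, the Python sketch below models what happens to a DS when its watchdog receives a drain signal. The state names come from this post. The enum and function names are hypothetical and do not reflect the actual watchdog implementation, and states not named here are omitted.

```python
from enum import Enum


class DsState(Enum):
    # Only the states named in this post; any others are omitted.
    READY = "ready"
    IN_SESSION = "in session"
    DRAINING = "draining"


def state_after_drain_signal(state: DsState) -> DsState:
    """Return the state a DS moves to when its watchdog is told to drain."""
    if state is DsState.IN_SESSION:
        # New behavior: a claimed DS keeps its state and stays subject
        # to the session timeout.
        return DsState.IN_SESSION
    # Every other state still transitions to "draining" and becomes
    # subject to the drain timeout.
    return DsState.DRAINING
```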

Why we are making this change

Currently, when the watchdog receives a drain command, it:

  1. Sends a drain message to all of the DS it is managing
  2. Transitions the internal state of all of the DS to “draining”, making them subject to the drain timeout
  3. Sends an update to Fleet Command, setting the state of all its DS to “draining”
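
The three steps above can be sketched in Python as follows. This is a simplified model for illustration only: the `DedicatedServer` class, the `handle_drain_current` function, and the two callbacks are hypothetical names, not part of any AMS API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DedicatedServer:
    ds_id: str
    state: str  # e.g. "ready", "in session", "draining"


def handle_drain_current(
    servers: List[DedicatedServer],
    send_drain_message: Callable[[str], None],
    report_to_fleet_command: Callable[[Dict[str, str]], None],
) -> None:
    """Model of the current (pre-change) drain handling."""
    # 1. Send a drain message to every managed DS.
    for ds in servers:
        send_drain_message(ds.ds_id)
    # 2. Mark every DS as "draining", regardless of its previous state.
    for ds in servers:
        ds.state = "draining"
    # 3. Report the updated states to Fleet Command.
    report_to_fleet_command({ds.ds_id: ds.state for ds in servers})
```

Note that step 2 does not look at the previous state, which is why an “in session” DS currently ends up under the drain timeout.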

The watchdog receives a drain command whenever the fleet needs to scale down and that watchdog is among those selected for drain (selection is by least utilized). In practice, this design has the following ramifications for DS that are “in session”:

  • It’s common and expected for a watchdog to get a drain signal while one or more of its DS are “in session”
  • With the current design, giving an in-session DS time to finish its game session requires the drain timeout to be equal to or longer than the session timeout (which can be many minutes). That drain timeout then applies to all DS in the “draining” state, including ones that were previously “ready” and should ideally exit almost immediately.
  • Metrics for DS states suddenly change from a mix of “ready” and “in session” (and possibly other states) to all “draining”. This can make it appear as though there was a sudden drop in sessions being served. It also makes it hard to see whether “draining” DS, in aggregate, are responding appropriately to the drain signal, because the expected behavior for in-session DS is likely different from that of other DS.
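
To make the timeout trade-off concrete, here is a small Python sketch comparing which timeout governs a DS before and after the change. The function name and the timeout values (a 30-minute session timeout and a 60-second drain timeout) are illustrative assumptions, not recommended settings.

```python
def governing_timeout_s(
    state_before_drain: str,
    session_timeout_s: int,
    drain_timeout_s: int,
    new_behavior: bool,
) -> int:
    """Return the timeout (in seconds) a DS is subject to after a drain signal."""
    if new_behavior and state_before_drain == "in session":
        # After the change, a claimed DS stays under the session timeout.
        return session_timeout_s
    # Otherwise the DS is "draining" and under the drain timeout.
    return drain_timeout_s


SESSION_TIMEOUT_S = 30 * 60  # illustrative value

# Current design: the drain timeout must cover a full session, so a
# previously "ready" DS is also allowed up to 30 minutes.
old_ready = governing_timeout_s("ready", SESSION_TIMEOUT_S, SESSION_TIMEOUT_S, new_behavior=False)

# After the change: the drain timeout can be short without cutting sessions off.
new_ready = governing_timeout_s("ready", SESSION_TIMEOUT_S, 60, new_behavior=True)
new_in_session = governing_timeout_s("in session", SESSION_TIMEOUT_S, 60, new_behavior=True)
```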

We’ve observed that this leads to an unintuitive user experience, confusion, and poorly configured fleets, rather than a pit of success for users. By keeping DS that are serving a session in the “in session” state, we aim to provide a more intuitive experience for fleet configuration and metrics.

When is this change coming

This change will apply to new VMs started in AMS after August 28, 2024.

Please don’t hesitate to reply with any questions you have around this change.