If you’ve been playing (or trying to play) Diablo 2 Resurrected these past few weeks, you’re probably well aware of all the server-related issues going on. While Blizzard has published a number of updates in hopes of resolving them, it appears that some code carried over from the original game may, in fact, be the culprit. Blizzard has now explained the reasoning behind the Diablo 2 server issues and outlined its plans to fix them.
In a statement issued by Blizzard’s Community Manager, the team details the reasoning behind the Diablo 2 server issues, stating that some of the “legacy code” from the original 2000 release of Diablo 2 isn’t able to keep up with modern player behavior.
In staying true to the original game, we kept a lot of legacy code. However, one legacy service in particular is struggling to keep up with modern player behavior.
This service, with some upgrades from the original, handles critical pieces of game functionality, namely game creation/joining, updating/reading/filtering game lists, verifying game server health, and reading characters from the database to ensure your character can participate in whatever it is you’re filtering for. Importantly, this service is a singleton, which means we can only run one instance of it in order to ensure all players are seeing the most up-to-date and correct game list at all times. We did optimize this service in many ways to conform to more modern technology, but as we previously mentioned, a lot of our issues stem from game creation.
We mention “modern player behavior” because it’s an interesting point to think about. In 2001, there wasn’t nearly as much content on the internet around how to play Diablo II “correctly” (Baal runs for XP, Pindleskin/Ancient Sewers/etc for magic find, etc). Today, however, a new player can look up any number of amazing content creators who can teach them how to play the game in different ways, many of them including lots of database load in the form of creating, loading, and destroying games in quick succession. Though we did foresee this–with players making fresh characters on fresh servers, working hard to get their magic-finding items–we vastly underestimated the scope we derived from beta testing.
Additionally, overall, we were saving too often to the global database: There is no need to do this as often as we were. We should really be saving you to the regional database, and only saving you to the global database when we need to unlock you–this is one of the mitigations we have put in place. Right now we are writing code to change how we do this entirely, so we will almost never be saving to the global database, which will significantly reduce the load on that server, but that is an architecture redesign which will take some time to build, test, then implement.
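To make the save change easier to picture, here is a minimal Python sketch of the idea described above: routine saves go to a regional store, and the global store is only written when a character is unlocked. This is not Blizzard’s code. The class name, the method names, and the use of plain dictionaries in place of real databases are all assumptions made for illustration.

```python
class CharacterStore:
    """Toy two-tier save path (hypothetical; dicts stand in for real databases)."""

    def __init__(self):
        self.regional = {}      # stand-in for the regional database
        self.global_db = {}     # stand-in for the global database
        self.global_writes = 0  # counts how often the global store is touched

    def save(self, name, state):
        # Routine saves only touch the regional store.
        self.regional[name] = dict(state)

    def unlock(self, name):
        # Only when the character is unlocked is the latest regional
        # state copied to the global store.
        if name in self.regional:
            self.global_db[name] = dict(self.regional[name])
            self.global_writes += 1


store = CharacterStore()
for level in (10, 11, 12):
    store.save("sorceress", {"level": level})  # three regional saves
store.unlock("sorceress")                      # one global write
print(store.global_writes, store.global_db["sorceress"])
```

In this toy version, three saves followed by one unlock produce a single write to the global store, which is the kind of load reduction the statement is describing.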
With the causes identified, the team has begun work on reducing server load, including rate limiting, a login queue, and breaking critical functionality into smaller services.
Rate limiting: We are limiting the number of operations to the database around creating and joining games, and we know this is being felt by a lot of you. For example, for those of you doing Pindleskin runs, you’ll be in and out of a game and creating a new one within 20 seconds. In this case, you will be rate limited at a point. When this occurs, the error message will say there is an issue communicating with game servers: this is not an indicator that game servers are down in this particular instance, it just means you have been rate limited to reduce load temporarily on the database, in the interest of keeping the game running. We can assure you this is just mitigation for now–we do not see this as a long-term fix.
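For readers curious what this kind of rate limiting can look like in practice, below is a minimal Python sketch of a token bucket, a common technique that allows a short burst of actions and then throttles to a steady rate. Blizzard has not said which algorithm it uses, so the technique, the class name, and the numbers are all assumptions chosen for illustration.

```python
import time


class TokenBucket:
    """Hypothetical per-player limiter for game creation (not Blizzard's code)."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity                    # maximum burst size
        self.refill_per_second = refill_per_second  # steady-state rate
        self.clock = clock                          # injectable for testing
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if the action may proceed, False if rate limited."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Illustrative numbers only: a burst of 3 games, then one new game every 2 seconds.
bucket = TokenBucket(capacity=3, refill_per_second=0.5)
results = [bucket.allow() for _ in range(4)]
print(results)
```

A player who creates games faster than the bucket refills gets a refusal once the burst allowance is used up, which matches the behavior described for rapid Pindleskin runs.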
Login Queue Creation: This past weekend was a series of problems, not the same problem over and over again. Due to a revitalized playerbase, the addition of multiple platforms, and other problems associated with scaling, we may continue to run into small problems. To diagnose and address them swiftly, we need to make sure the “herding”–large numbers of players logging in simultaneously–stops. To address this, we have people working on a login queue, much like you may have experienced in World of Warcraft. This will keep the population at the safe level we have at the time, so we can monitor where the system is straining and address it before it brings the game down completely. Each time we fix a strain, we’ll be able to increase the population caps. This login queue has already been partially implemented on the backend (right now, it looks like a failed authentication in the client) and should be fully deployed in the coming days on PC, with console to follow after.
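The login queue described above can be sketched in a few lines of Python: players are admitted while the online population is under a cap, everyone else waits in order, and raising the cap lets more of the queue in. This is a simplified illustration, not Blizzard’s implementation. The class, its methods, and the cap values are hypothetical.

```python
from collections import deque


class LoginQueue:
    """Hypothetical first-come, first-served login queue with a population cap."""

    def __init__(self, population_cap):
        self.population_cap = population_cap
        self.online = set()     # players currently admitted
        self.waiting = deque()  # players waiting, in arrival order

    def request_login(self, player):
        """Return 0 if the player is admitted, else their position in the queue."""
        if player in self.online:
            return 0
        if player in self.waiting:
            return self.waiting.index(player) + 1
        # Admit directly only if there is room and nobody is already waiting.
        if len(self.online) < self.population_cap and not self.waiting:
            self.online.add(player)
            return 0
        self.waiting.append(player)
        return len(self.waiting)

    def logout(self, player):
        self.online.discard(player)
        self._admit_waiting()

    def raise_cap(self, new_cap):
        # Caps only go up here, mirroring "each time we fix a strain".
        self.population_cap = max(self.population_cap, new_cap)
        self._admit_waiting()

    def _admit_waiting(self):
        while self.waiting and len(self.online) < self.population_cap:
            self.online.add(self.waiting.popleft())


queue = LoginQueue(population_cap=2)
positions = [queue.request_login(p) for p in ("ann", "bob", "cat", "dan")]
print(positions)            # the first two are admitted, the rest are queued
queue.raise_cap(3)          # a fixed bottleneck lets one more player in
print(sorted(queue.online), list(queue.waiting))
```

Holding the population at a known level is what lets the team see where the system strains before it fails outright, and raising the cap is then a single, controlled step.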
Breaking out critical pieces of functionality into smaller services: This work is both partially in progress for things we can tackle in less than a day (some have been completed already this week) and also planned for larger projects, like new microservices (for example, a GameList service that is only responsible for providing the game list to players). Once critical functionality has been broken down, we can look into scaling up our game management services, which will reduce the amount of load.
We have people working incredibly hard to manage incidents in real-time, diagnosing issues, and implementing fixes–not just on the D2R team, but across Blizzard. This game means so much to all of us. A lot of us on the team are lifelong D2 players–we played during its initial launch back in 2000, some are part of the modding community, and so on. We can assure you that we will keep working until the game experience feels good to us not only as developers, but as players and members of the community ourselves.
For PC players, the login queue visual was implemented in last Friday’s update. Console players can expect to see it sometime this week, as the patch is making its way through certification.
There was a PC patch on Friday that added in a queue visual onto the game while we started queueing players during high traffic windows.
We noted that consoles would have this within a week. It’s looking like it will be the first half once it gets through all the first party certifications. As of now, console is kind of flying blind and it unfortunately is leading to timeouts for users which knocks them into offline. There are moments where players are getting in as we have a team around the clock turning on the faucets for console players to feed online but they are difficult windows before the timeouts.
This weekend has led to an even more massive increase in players connecting into one specific region and is thus causing players to move to other regions and create queues there. Again, we have a team on this 24/7 and calls going on 24/7 as they work on troubleshooting and keeping things going with the databases.
It’s not ideal. I know as even I have had difficulty playing myself and I’m hoping we can have a further update for you all here soon. Again, apologies on this everyone.
Hopefully the Diablo 2 server issues are resolved sooner rather than later, though given the nature of the issues, we don’t expect a definite fix anytime soon. Let’s hope that’s not the case, as many players have been struggling to play the game for the last few weeks.