Server Overload and Failed Launches

Readers, at the end of the day the reason I am writing this article is simply out of ignorance and confusion. I work in the tech industry and have written network-based applications. I've even written a video server. I understand how all this works. And that is why I am confused: why is it that year after year we continue to see the same thing over and over, server overload on launch day?

The first time I truly experienced a day-one server overload was Half-Life 2. Steam is hailed nowadays as the savior of PC gaming, but it "officially" launched alongside Half-Life 2, and what a disastrous launch it was. I remember the day, freshman year of high school. I had already downloaded a pre-install with the hope that the second I got home I'd grab a quick little fix and, boom, I'd be playing the most anticipated PC game of all time.

But that's not what happened. What ensued were server issues that kept my game from either fully downloading or properly verifying, which meant I stayed up all night for no reason. It happened again years later with the launch of Battlefield 1943 on Xbox 360. I had stayed home sick and needed something to do. I noticed the game was being released that day and joyfully realized I had something to fill the day between bouts of vomiting. Or so I thought. Again, server issues. Can't play. Game essentially broken.

So, at a fundamental level I understand how everything works. Take, for example, a web server. A web server hosts certain files and scripts. When I request a webpage, I send a request to the server, the server's processor (CPU) handles the request (typically grabbing an HTML file), and it sends the file back. It might also run some additional work on the CPU, like PHP scripts. Basically, this is a very simple operation. Check out my beautiful drawing below.

[Drawing: Server Overload]
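
To make that drawing concrete, here is a minimal sketch of the request-in, file-out loop in Python (chosen purely for illustration; a real site would be running Apache, nginx, or similar, and the filename here is made up):

    # Minimal sketch of the request/response cycle described above.
    # Python's built-in http.server, one process handling one request at a time.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class FileHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            try:
                # The "CPU does some work" step: here it is just reading a file.
                with open("index.html", "rb") as f:
                    body = f.read()
                self.send_response(200)
                self.send_header("Content-Type", "text/html")
                self.end_headers()
                self.wfile.write(body)   # send the file back to the requester
            except FileNotFoundError:
                self.send_error(404)

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), FileHandler).serve_forever()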

The problem for web servers is that many people may be asking the server to do a task at the exact same time. If the server can't finish requests and send data back out as fast as new requests are coming in, you get a bottleneck.
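
A quick back-of-envelope calculation with made-up numbers shows why that rate mismatch matters:

    # Made-up numbers, just to show the mismatch.
    arrival_rate = 2000   # requests coming in per second
    service_rate = 500    # requests the server can finish per second

    backlog_growth = arrival_rate - service_rate   # 1500 requests pile up every second
    print(f"Backlog after one minute: {backlog_growth * 60} requests")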

In a video server, a user requests a file. I retrieve the file and chop it up into little packets, add a wrapper around each one for syncing at the other end, and send them out one by one as a steady stream. All servers come down to requests being processed.
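
I can't reproduce the video server I wrote here, but the chop-and-wrap step might look roughly like this sketch (the header format is invented for illustration, not what any real streaming protocol uses):

    import struct

    CHUNK_SIZE = 1400  # bytes of video per packet, small enough for a typical UDP payload

    def packetize(path):
        """Chop a file into chunks and wrap each one with a tiny sync header."""
        with open(path, "rb") as f:
            seq = 0
            while True:
                chunk = f.read(CHUNK_SIZE)
                if not chunk:
                    break
                # Invented wrapper: 4-byte sequence number + 2-byte length,
                # so the receiver can reorder packets and spot gaps.
                header = struct.pack("!IH", seq, len(chunk))
                yield header + chunk
                seq += 1

    # Streaming is then just sending these out one by one, e.g. over a UDP socket.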

Imagine a funnel that we all pour water into from our individual hoses (no dirty jokes please). That funnel might be able to handle one or two of us pouring into it, but if we keep adding more and more water from other sources, eventually it will overflow. Again, I have accurately drawn this for you below in case you aren't following.

[Drawing: What happens in an overload]

And how do you stop this overflow? You get a bigger funnel, get more funnels, make the funnel more efficient, or, lastly, you shoot some of the people pouring water into the funnel so they stop. It is the same concept with servers. Requests pile up, the server can't handle them, and it stops letting them in. The fix is the same: a bigger, faster server, more servers, more efficient code, or turning some requests away.
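
In code, "doesn't let them in" usually means a bounded queue that starts turning requests away once it fills up. A rough sketch (the size and names are made up):

    from collections import deque

    MAX_PENDING = 100        # the size of the "funnel"
    pending = deque()

    def accept_request(request):
        """Return True if the request is queued, False if we overflow and reject it."""
        if len(pending) >= MAX_PENDING:
            return False     # overloaded: this is the error screen you see on launch day
        pending.append(request)
        return True

    def process_one():
        """A worker loop drains the funnel one request at a time."""
        if pending:
            request = pending.popleft()
            # ... do the actual work for this request here ...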

A crappy website might have a single, small server because it doesn't expect that many people making requests. That's why, if an unpopular website suddenly becomes very popular by posting something cool that goes viral, its server can't handle the flood of new requests. And it explodes. Actually, it typically just locks up and restarts itself, then gets hit by a bunch of requests again, locks up, restarts, and so on. Essentially the server is "down." That small website didn't prepare for that amount of traffic, so the server overload is understandable. But we are talking about triple-A games that every gamer on the planet wants to play. You can plan for the traffic to be gigantic.

Now, I started with web servers because they are very easy to understand. I've never actually written net code for a game, but it all works on the same logic. I load a level in Call of Duty and every action I perform gets sent to a server, which then dishes that action out to the other players. And it does the same for them.

So, when I shoot a bullet on my TV at player 2, that action is sent to the server and then on to player 2 in China, and his screen renders the bullet being shot. And it hopefully hits him in the face and kills him. Hence, there is sometimes lag: the bullet I shot may take 5 seconds to get to China, and by then a ton of other actions have taken place, everything has to re-sync, and basically you get mad and throw your controller. But lag is beyond the scope of this article.
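
I've never written game net code, as I said, but the relay described above boils down to something like this sketch (the message format and names are mine, not any real engine's):

    import json

    connected_players = {}   # player_id -> that player's network connection

    def handle_action(sender_id, action):
        """Receive one action from a player and dish it out to everyone else."""
        message = json.dumps({"from": sender_id, "action": action}).encode()
        for player_id, conn in connected_players.items():
            if player_id != sender_id:
                conn.sendall(message)   # each client then renders the action locally

    # e.g. handle_action(1, {"type": "shoot", "target": 2, "weapon": "pistol"})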

This is where we get to the point of this article, and why I am confused. Again, for all I know I am completely missing something, and I'd love for someone to fill me in. On the day of Diablo 3's launch, Blizzard had a certain number of servers allocated for players to use. The math here is simple. You run a beta test, so you know how much server capacity each player needs for the game to run smoothly. You then try to determine how many players are most likely to be playing at any one point at launch. From that estimate you calculate the number of servers needed for everyone to play smoothly.
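
With made-up numbers, that estimate is only a few lines of arithmetic:

    import math

    # Capacity planning with made-up numbers.
    players_per_server = 3000            # concurrent players one server handled in the beta
    expected_peak_players = 1_200_000    # guess based on pre-orders and past launches
    safety_margin = 1.25                 # extra headroom in case the guess is low

    servers_needed = math.ceil(expected_peak_players / players_per_server * safety_margin)
    print(f"Servers to provision for launch: {servers_needed}")   # 500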

This is where we come to our issue. Time and time again this "estimation" fails, and I don't know why. We are pretty accurate at predicting the number of people who will play the game: you have pre-order numbers, past history, and so on. In the case of Grand Theft Auto V, you actually have the physical number of copies purchased, because the online portion launched a full month later. This isn't a small website suddenly killed by unexpected viral traffic. No, this is a launch that is foreseen and, supposedly, calculated to a T.

So, the only thing I can guess is that servers cost money. If you overestimate the number of players who are going to play at launch, then you aren't using all of your servers and you are essentially wasting money.

But is the money saved really worth the risk of completely alienating your fan base? Grand Theft Auto V was riding an absolute high. I loved the game. Critics adored it (it has a 97/100 on Metacritic). It was crowned the fastest-selling game of all time. Fast forward a month to the launch of Grand Theft Auto V's online portion (GTA Online), and it's a different story. I understood the typical launch-day issues, but then day 2 came, and then 3, and then 4... It wasn't until a full two weeks later that the game's servers got to a point I would call "stable" enough to be playable.

Now, when I think of GTA V, the fact that the single-player was so perfect will always sit next to this soiled launch. Even though in the end it isn't that big of a deal, it still leaves a bad taste in my mouth.

What is really driving me nuts is that this keeps happening, over and over again. Within the last year and a half we have had Diablo 3, FF XIV, SimCity, and GTA Online all unplayable at launch. And these aren't little multiplayer indie games that didn't see it coming (see the web-server analogy, where a small website gets overloaded by unforeseen popularity); these are triple-A games. Everyone knew GTA V and Diablo 3 would sell. It's a given. So why not just buy more servers for launch and, if you overestimate, roll them back? Money lost, possibly, but don't you think the damage done to your fan base greatly outweighs the money lost on overestimating? I have friends who say they will never buy a SimCity game again strictly because of the soiled launch of the most recent one.

Can anyone explain it to me like I'm 5? I understand bugs and glitches at launch, but those aren't related to server overload; they are simply logic errors in the game that weren't caught before launch. I want to know what really happens during a failed launch involving server overload and what happens in the days following. Is it really this simple, and publishers and developers just keep messing up? Or is there something more complicated going on? Someone let me know.

Author: Chris Fadeley
I am a UF alumnus and a computer engineer. I know virtually every useless fact about video games ever. I like computers and potatoes.

2 Comments on "Server Overload and Failed Launches"

  1. Ryan Atkinson, November 8, 2013 at 7:43 am

    Part of it is that their system isn't tested and optimized. Another part is that the system goes from 0 to 60 in a matter of minutes; the sheer number of requests per second is silly. There are two parts to an MMO server: the game world server and the logon server. The logon server is what is glitchy most of the time. It is made to process 100 or so requests per minute, because that is what normal traffic will be, but on launch day you have tens of thousands trying to get access all at the same time. Another part is that on first startup, new data has to be written to the drives for each new player/account, and data writing and retrieval is one of the slowest processes in a computer. Couple just these two issues together and you can see why launch days are usually a problem. And of course, there are always freak errors with something as complex as an online game that just cannot be foreseen.

    • Chris Fadeley, November 8, 2013 at 2:13 pm

      You make some good points. That does make a lot of sense. Thanks!
