Note: if you got a blank page, I’m sorry – I was updating a link (the Blizz video mentioned below) and stooged it. I think this backup was the final draft. Let me know if you see any glaring errors! — Grimm
I’m going to do something I very rarely do. I’m going to take a poke at Blizzard over the quality of its software.
Now, you may remember a while back when some l33t raiding guild took out Arthas as a world first, only to have it revoked and them temp-banned afterwards. The reason? They used an exploit which they had noticed earlier in the week. It was a bug, it got fixed, and someone else went on to do the world-first Arthas kill without using an exploit to do so.
A lot of the uproar around the exploit was that “Blizzard didn’t catch the bug” and “Ensidia paid the price.” We called bullshit then, and I stand by that. Ensidia was an elder guild, well-acquainted with how things work. They probably figured that it would be easy to get forgiveness, and they were wrong. Boo hoo.
I also expressed doubt that QA could really be blamed for anything here. To know to LOOK for a bug, you first need to know that part of the requirements is “the platform shall disintegrate, and bombs will not rebuild it.” Well, that’s a silly software requirement, but in hindsight, a perfectly legitimate one. Nobody conceived that such a requirement would be needed. I really won’t even blame the design team for failing to communicate that requirement – it was a dumb luck find, that’s all. It’s kind of like the requirement that a spacecraft fly in space, and not explode while doing so.
So I’m no stranger to defending the ranks of QA against unfair attack by those unaware of the general software development process. This is why it feels so odd to be reversing that, to some extent, right now.
If you are, or know, a Herbalist that is at level 85, you will have, at some point, sworn at, or heard someone swearing at, the phased herbs. Particularly Twilight Highlands, but, really, anywhere there is phased landscape (looking at you, Uldum), you will find herbs and ores that show on the mini-map, but phase out when you get close enough to harvest them.
This is a well-known bug; we’ve seen it before. And it is precisely for that reason that I am annoyed at the quality of this software. If it’s a well-known bug, why does it recur? Find it, fix it once, fine, that’s how the process is supposed to work. But when the same software people with the same quality people allow the same bug to recur elsewhere, again and again, I start to suspect that some pretty fundamental stuff is not happening.
Generally speaking, in the software development world, bugs found and fixed should be added to a suite of test cases called “regression tests”. This is not all that regression testing is used for, but it is a significant part of it[1]. In this case, of course, Twilight Highlands did not exist when the first out-of-phase herbs were found in Icecrown. However, they presumably use the same “core” classes. Changes to the core “herb” object in the code base should propagate out to all subclasses of that object. This includes test cases. Every instance of an herb in Cataclysm should have been tested against this, and passed. And it should have passed before QA even saw it. This is a basic, brain-dead bozo code maintenance function.
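To make that concrete, here is a minimal sketch of what such a regression test could look like. Everything in it is an assumption for illustration: the `Herb` class, its subclasses, the phase numbers, and the method names are hypothetical and say nothing about how Blizzard's code is actually organized. The point is only that a check written once against the base class can be run automatically against every subclass, including ones added years later.

```python
from dataclasses import dataclass


@dataclass
class Herb:
    """Hypothetical base class for a herb node (not Blizzard's real code)."""
    name: str
    zone: str
    phase: int  # the phase this node actually exists in

    def shows_on_minimap(self, player_phase: int) -> bool:
        # The fix lives once, in the base class: only show nodes that
        # exist in the player's current phase.
        return self.phase == player_phase

    def can_harvest(self, player_phase: int) -> bool:
        return self.phase == player_phase


class IcecrownHerb(Herb):
    """Stand-in for herbs in the zone where the bug was first seen."""


class TwilightHighlandsHerb(Herb):
    """Stand-in for herbs added later; inherits the fix and the test."""


def all_herb_classes(base=Herb):
    """Yield the base class and every subclass, however deeply nested."""
    yield base
    for sub in base.__subclasses__():
        yield from all_herb_classes(sub)


def regression_phased_herbs(phases=(1, 2, 3)):
    """Return failures: nodes shown on the minimap that cannot be harvested."""
    failures = []
    for cls in all_herb_classes():
        for node_phase in phases:
            node = cls(name="test herb", zone="test zone", phase=node_phase)
            for player_phase in phases:
                shown = node.shows_on_minimap(player_phase)
                if shown and not node.can_harvest(player_phase):
                    failures.append((cls.__name__, node_phase, player_phase))
    return failures
```

Because the test discovers subclasses on its own, a new zone's herb class is covered the moment it is defined, and a subclass that overrides the minimap behavior badly shows up as a failure before a human tester ever sees it.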
So what happened? We can only guess. None of the scenarios make me feel particularly happy.
Scenario 1: No such practice as regression testing exists within Blizzard. If you look at the Blizzard 20th Anniversary video, you see the founders talking about how they picked potential employees. In a lot of cases, professional skills were not a priority. The entire early culture of that company appears to be based on “work harder, not smarter”. I very much doubt they had a QA department back then, and, in fact, one of the artists (Samwise? Maybe.) mentioned that he got dragged into doing QA at some point. I do find this scenario unlikely. Morhaime had sense enough to find “grownups” to handle the business end of things back then, so I suspect that at some point he also went out and found a QA director of some sort. It’s possible they did not. Success often breeds arrogance, especially when that success comes in spite of dire warnings to the contrary. But, honestly, Morhaime seems a bit more intelligent than that.
Scenario 2: regression testing of ‘herb’ instances would not uncover this. I find this hard to believe. I can swallow that the phasing issue is not “owned” by the core “herb” class, but I cannot believe that if every herb instance in every zone in every phase was tested, this would not have been caught. Granted, that’s a lot of testing. But they “own” the entire automation harness for this software. If anything screams for automation, it’s this. Every build should have gone through this regression. So, I’ll stick to my guns. If the testing were being done, the bug would have been found. Which leads to the next two scenarios.
Scenario 3: the testing is not being done. I am sad to say that this stands firmly in second place as a likely candidate. Unfortunately, this points to an ineffective, incompetent, uncaring, overburdened, or nonexistent QA team. I’ve been involved with most of these sorts of teams. Some are more disturbing than others. For example, an incompetent, ineffective, or uncaring QA team that has somehow managed to survive in that environment says a lot of alarming things about Management. An overburdened team does, too, but they are not as alarming.
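For a sense of how cheap the "every herb in every zone in every phase" sweep is once it is automated, here is a rough sketch. The spawn table, the phase numbers, and the function names are all made up for illustration; a real harness would query the game's own data rather than a hard-coded dictionary. The function under test (`minimap_shows`) is passed in, so the same sweep can run against every build.

```python
from itertools import product

# Hypothetical spawn table: (zone, herb) -> phases in which the node exists.
# The phase numbers are invented for this sketch.
SPAWNS = {
    ("Icecrown", "Icethorn"): {1, 2},
    ("Twilight Highlands", "Twilight Jasmine"): {2},
    ("Uldum", "Whiptail"): {1},
}

PHASES = (1, 2, 3)


def can_harvest(zone, herb, player_phase, spawns=SPAWNS):
    """Ground truth: a node is harvestable only in a phase where it exists."""
    return player_phase in spawns.get((zone, herb), set())


def sweep(minimap_shows, spawns=SPAWNS, phases=PHASES):
    """Check every (zone, herb, phase) combination.

    Returns each combination where the minimap shows a node that the
    player cannot actually harvest.
    """
    mismatches = []
    for (zone, herb), player_phase in product(sorted(spawns), phases):
        shown = minimap_shows(zone, herb, player_phase)
        if shown and not can_harvest(zone, herb, player_phase, spawns):
            mismatches.append((zone, herb, player_phase))
    return mismatches
```

A minimap that ignores phasing, for example `sweep(lambda zone, herb, phase: True)`, gets flagged for every phase in which a node does not exist, while a phase-aware one comes back clean. The number of combinations grows with the size of the world, but a machine does not get bored checking them.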
(Note: when I say “incompetent” I am not implying “stupid”. If they grew that QA team “organically” out of, say, beta testers – and please do see that film about where they got beta testers – they may have people who do not exist inside of a normal professional QA space, and they just don’t know what to do other than pull levers and push buttons faster and harder. Training and leadership can fix that.)
Scenario 4: QA reported the bug, but they haven’t gotten around to fixing it. This is a common scenario, and speaks volumes about workload and priorities. Just consider: this is a large quality-of-life hit to the customer, a common source of frustration, and a thorn in the CMs’ collective side. You would think they would want to get right on it. So why haven’t they?
It’s possible that the defects that they are currently addressing are far worse. Patching exploits, crash to desktop, that sort of thing. The implication is that there are a lot of this sort of bug to keep them busy, which is not good news, but it could be worse.
It’s also possible that the defect correction team has been pared down to a minimum to move people onto 4.1 new development. It’s possible that this is standard practice. It is also standard practice when one is running behind on a project and needs bodies.
The disturbing aspect of this is that we are not seeing the actions of a team that is hellbent on quality over features, which goes somewhat counter to what we have been told about Blizzard – “It’ll ship when it’s ready”, “In the end they won’t remember that it’s late, they’ll remember that it was great”, and so forth. All pretty words on a poster unless you execute on the maxim.
The last scenario – issue reported, not yet fixed – is both the most likely and the least disturbing. The rest of the scenarios all spin progressively more dire tales of a sick corporate culture that is slowly self-destructing. I am somewhat familiar with companies that have sick development cultures; such a culture is no fun to be involved in, and it tends to reinforce its own sickness.
Things like phased herbs, evade bugs, crashing on raid bosses, and so forth all appear trivial when taken separately, but as a body they paint a different picture. Right now, one could conclude a lot about how Blizzard treats its customers by seeing how it addresses customer-facing defects.
The big question for me will be how 4.1 shapes up in terms of product quality. If all of these phasing issues are rolled up into that patch, and no new ones are created (I’m looking at you, Hyjal), then we can assume that the best-case scenario is in play, and we can worry a bit less about what is yet to come. On the other hand, if these issues are not addressed, and more are added, it may be a sign that software quality in WoW – if not at Blizzard in general – has taken a back seat to deadlines. This is what a lot of people feared when Activision stepped in, and what we were told would not happen.
I personally hope that the worst of what we are seeing is the normal resource shuffle that takes place as one project is de-emphasized in favor of others that are either in crunch mode, or just of greater importance. I don’t like it, but …
What does this mean to me as a customer? Honestly, I’ve always been one to jump ship if I get the impression that I’m not being treated squarely. I’ve been fine with a bug here and there in the past if I had insight into what might be going on. There is no transparency here, of course, and that makes it difficult to guess what is happening, as well as the motives behind how they are handling it. Truthfully, when I start to get the impression that a company is chumping me, I move on and give someone else a shot.
What is happening here? Are we seeing signs of a company that has set aside its values in favor of a paycheck? Are we seeing signs of an overstressed staff that can’t begin to meet its workload? Are we seeing a poor development process finally collapsing under the weight of its own incompetence? Or was this a triumph?
Bonus Scenario 5: you should have seen the ones that didn’t get away – Let me turn my Dwarven Rapper Hat around to the front for a second and present the opposite of what I’ve been driving at, just for contrast.
I don’t know if you noticed, but Cata was incredibly stable.
Remember BC? Remember the entire world server going down every time someone entered Hellfire Citadel? Remember not being able to enter an instance in Northrend? The past two expansions had some pretty impressive bugs. Hell, some patches in Vanilla caused some pretty horrific server-wide crashes.
But I can’t remember any of that with Cata.
Remember what I was saying about prioritizing? Well, there you go. What if what we’re seeing is the direct effect of a highly effective and successful triage effort by the Cata development team? Sure, the Twilight Jasmine has a few issues. But it would be what I would classify as moderate severity. And, thinking back, all I have seen so far have been of that level or lesser impact. I really can’t say I’ve seen anything that had serious impact. Nothing I’d classify as High or Urgent severity (using metrics I am familiar with here).
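For anyone who hasn't lived inside a defect tracker, triage of this kind can be sketched in a few lines. The severity scale below borrows the Urgent, High, and Moderate labels used above and adds a Low level as an assumption; the `Defect` record and the `capacity` parameter are likewise hypothetical, not anything Blizzard is known to use.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Illustrative scale; a lower number means fix sooner."""
    URGENT = 1
    HIGH = 2
    MODERATE = 3
    LOW = 4  # assumed level, not mentioned in the post


@dataclass
class Defect:
    title: str
    severity: Severity


def triage(defects, capacity):
    """Split a backlog into (fix_now, deferred).

    Defects are ordered by severity; `capacity` is how many fixes the
    team can take on this cycle. Ties keep their reported order because
    Python's sort is stable.
    """
    ordered = sorted(defects, key=lambda d: d.severity)
    return ordered[:capacity], ordered[capacity:]


backlog = [
    Defect("Phased herbs show on minimap", Severity.MODERATE),
    Defect("World server crash on instance entry", Severity.URGENT),
    Defect("Raid boss exploit", Severity.HIGH),
]
fix_now, deferred = triage(backlog, capacity=2)
```

With room for only two fixes, the crash and the exploit go first and the phased herbs wait, which is exactly the pattern a successful triage effort would produce from the outside.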
I’ll be watching the 4.1 and possibly 4.2 rollouts for insight into what this development team is up to, priority-wise, and maybe I can discern what is afoot. At some point, I may have enough information to tell if we’re being jerked around, and decide what I’m going to do about it.
That, however, is not a decision I have to make for many months.
[1] Significant in importance, not size.