There’s no good reason for Andy Ellis to be as calm as he is.
Walking into a sleek glass-walled conference room on a cold and wet December day, Ellis has the easy confidence and serenity you might find in a semi-retired professional golfer. It is not the kind of demeanor often associated with CSOs, and certainly not with the CSO of a company that handles a non-trivial portion of the Internet’s traffic on any given day.
Just across the lobby from that conference room, in the Cambridge, Mass., headquarters of Akamai, is another glass room. Larger, darker, and protected by both man and machine, that room holds the company’s network operations command center (NOCC). The room adheres to the established aesthetic for such rooms: long, curved walls covered with large screens showing a variety of graphs and maps, and rows of cubicles staffed by young analysts. It also holds several hundred million reasons for Ellis to be worried, in the form of the data feeds coming in from Akamai’s worldwide network and lighting up those screens on the wall.
But, like Ellis, the Akamai NOCC is preternaturally calm. Despite the fact that Akamai operates one of the Internet’s larger content delivery networks and has customers and data centers around the world, about half of the desks in the Cambridge SOC are unoccupied and the phones are silent. A typical SOC is a hive of activity, regardless of the time of day, with phones ringing constantly, critical alerts flashing on monitors in big red boxes, and dozens of analysts bouncing around. Silence is normally a bad sign. Silence means something broke. Here, silence means the system is operating as designed. Akamai’s CDN is designed not only to deliver content as quickly as possible, but also to ensure that the network can heal itself and route around most of the issues that arise.
“When you’re solving all the problems you can solve with software, you never have to worry that seconds matter. There’s no single data center that should matter in our network,” Ellis said. “There are some things that if they go down, you say, Oh that’s interesting. But it’s not because it causes a disruption. It becomes a problem to solve.”
Disruptions are anathema to a business like Akamai. More than half of the 500 largest global companies are consumers of Akamai’s content-delivery and security services, and the company’s network handles roughly three trillion web interactions each day. The network Akamai has built to deliver all of this comprises nearly a quarter-million servers and sees traffic peaks of 30 Tbps and as many as 1.7 trillion DNS requests per day. Reliability and security are the twin pillars upon which the business rests. If the network isn’t reliable, then customers go away. If the network isn’t secure, then customers go away. Ensuring the security of the network has been Ellis’s job since 2002, which is several lifetimes in CSO years. Yet, in an industry infamous for inducing stress and burnout, he has found a different road.
It wasn’t the direct route, though. A gifted student, Ellis went to college at a school down by the river, just a few blocks from Akamai’s offices, to study computer science and math, but soon discovered that high school was one thing and MIT was something else. He left Cambridge and spent a few years trying to figure things out, years that included a stint issuing employee uniforms at Disneyland (“It was my job to put a smile on their faces when they went out. Crack a joke, whatever it took.”) and a year working as a wine steward at a Vermont inn. He eventually found his way back to MIT via the Air Force ROTC program, and after graduation joined the nascent 609th Information Warfare Squadron. The U.S. military was making a big investment in offensive and defensive infosec, and Ellis found the craft suited him. He spent three years in the Air Force, but when his time was up, his interest in security was just beginning. There probably aren’t many (any?) other CSOs with that specific background, but Ellis believes a varied set of experiences and skills is invaluable in his field.
“The people I think who are most successful in security are the ones who went and did something else. Even if what they did was summer jobs. If you understand success and failure, that gives you a leg up in the security world because you can now empathize with your business partner, because you can really understand what pressures they’re under at that point,” he said.
Pressure is relative, of course. There’s the daily pressure of working in a high-level security position, there’s the pressure that builds when something breaks and needs to be fixed immediately, and there’s the pressure of trying to make sure that things don’t break in the first place. Ellis knows all of those feelings well and one of the ways he has worked to alleviate them for himself and others is by instilling a security mindset throughout the company. Product managers and developers often talk about “baking security in” or “building security in from the start”, but that’s not what Ellis means. For him, it’s about getting executives, managers, developers, and others to see what the risks are in a given situation before making a decision.
“I sometimes feel that the most important part of my job is not making these changes but convincing other people that they want to make the change before I make the change. That’s probably ninety percent of my job now. The biggest piece is, they have to believe in the risk,” he said.
That’s much easier said than done and so security teams sometimes throw up their hands and become the sin eaters for their organizations, taking on all of the risks and suffering the consequences when something goes wrong. The people and business units that build and launch products and services need to have ownership of the risks those offerings create so they understand what’s at stake, Ellis said.
“The whole model that a lot of security practitioners have which says, ‘We’re the conscience of the business, our job is to keep the business from taking bad risks,’ that’s a really flawed model. I own no risk. There’s no risk I actually own for the business. I don’t have a P&L and I’m not responsible for the success of the business, so I can’t own risk,” he said.
“If somebody is about to launch a product and they come to me and say, ‘Hey I’m going to launch this product, is it safe to do so?’, I’m almost lost already at that point because what they’re really saying is, ‘I’m going to launch this product. Please spend some time thinking about risks so I don’t have to.’ They should have done that already. We moved to a model where they come and say, ‘Hey is this safe?’ and we say, ‘I don’t know, you tell us. Let us guide you on this journey about thinking about risk.’”
Thinking about risk isn’t always fun, but when the alternative is a free external security assessment courtesy of Fancy Bear, it’s a wise time investment. Time is something that Ellis thinks about quite a lot. Do we have enough time to solve all our pressing problems? Are we spending too much time on small ones? Is it time to rethink some of our processes? Some of those questions are always there, lurking in the background, and some have answers one day and not the next. But that’s the job and it’s a job that Ellis relishes, both for its certainties and its fluidity.
It’s not a job that keeps him up at night, though.
“As a chief security officer, if you wake up in the morning and the first thing you’re thinking about is your company’s risk, there’s a good chance you’re burning yourself out and you’re doing it wrong,” he said. “I’m not this crazy paranoid person who does nothing but panic about risk. If I’m staying up at night worrying about a risk and the business executive who owns the risk area isn’t worrying about it, then I’ve done something wrong already.”
In the security world, mistakes tend to be amplified, and the consequences can be painful, long-lasting, and sometimes public. From the outside, the job of CSO can look like a never-ending string of meetings designed to address every potential problem and avoid any possible mistakes. That’s a one-way ticket to Crazytown, though. Ellis said he’s learned to accept that no product or service will be perfectly secure, and communicating that message to other executives, architects, and developers has become one of his more important tasks.
“If you have zero problems, you are over-investing in security, right? You pay me to tell you all the things you could do and hear ways you might be able to solve them. Some of the things we invest our time in is figuring out which things just aren’t even worth solving,” he said. “What we’ve done that we’ve found has been more successful is that rather than coming to somebody and saying hey we’ve done the prioritization for you, this is the most important thing, we say, we’ve done loose prioritization and these are the five most important things. We think you can afford to be working on two of them right now. Maybe we’re wrong, tell us if we’re wrong on that.
“By taking our hubris out, rather than saying we’re the ones who know which one thing you should do, we partner with them and say, which one do you think is most important? Because it’s your product.”
Being wrong comes with the territory, Ellis knows, and some days you’re wrong more than you’re right. But as he sits looking through the glass wall at the NOCC a few yards away, full of potential problems, the lessons he’s learned along the way remain with him.
“When [the Disneyland employees] came back in, they were often really cranky. They just spent eight hours dealing with mass humanity. We had to take that and see if we could flip them back to a smile so they could go home,” he said.
Solve the problems you can. There will be more tomorrow.