Cloud computing done the Netflix way

Last week's SIG meeting was one of the most interesting we've had in our more than three-year history. Its title was "Cloud Computing the Netflix Way," and we had two Netflix guest speakers: Adrian Cockcroft, director of architecture for cloud systems, and Jason Chan, Netflix's security architect.

If you're not familiar with its infrastructure, Netflix has, over the past few years, migrated almost entirely from an on-premises data-center environment to a cloud-based setup hosted on Amazon Web Services.

As a follow-up to my recent CIO.com article titled "Cloud Computing Calls for Rebuilding Enterprise IT," these two presentations are a nearly perfect complement. Learning what Netflix has done is an excellent primer on what, in my view, most enterprises will go through from here on.

First, Netflix started its journey with a traditional enterprise environment and a traditional data-center infrastructure. It found that the infrastructure was too fragile for its needs, and the traditional operations model didn't respond fast enough to the needs of the business. Netflix changed its approach because it recognized that the future of its business required a different way of doing things.

Second, companies are starting to look more and more like Netflix in terms of offering online services as a core part of their business. Think of GM and its OnStar service. If you've taken a Virgin America flight and seen the future of in-cabin entertainment, do you think the airline isn't collecting and analyzing that data to tune its offerings to individual clients? What Netflix is doing, company after company is doing as well. So, for at least a portion of their applications, most companies are starting to resemble Netflix. And one thing is for sure -- managing the new type of applications with the practices and processes associated with existing applications is a recipe for disaster.

Netflix's business is growing rapidly and experiences very uneven demand. In this kind of environment, Netflix didn't want to experience service interruptions due to its inability to build data centers fast enough.

Then, if your application is composed of many services that are failure-prone, and your application architecture is designed to tolerate the failure of any individual service, it makes sense to deliberately shut down portions of your production environment to see if the application is actually robust. Netflix famously does this with what it calls its "chaos monkey," in which different service environments are randomly taken offline to confirm that the Netflix environment can continue operating in the face of resource failure. One thing that came out of the presentations is that Netflix has many monkeys, not just one. They do different things, but they all focus on validating the robustness of the environment when confronted with resource failure.
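To make the idea concrete, here is a minimal Python sketch of that kind of failure injection. It is a toy simulation, not Netflix's actual tooling: the `Instance` and `Service` classes, the `chaos_round` function, and the service names are all hypothetical and exist only for illustration. Each round takes one randomly chosen live instance offline, then checks whether every service can still answer.

```python
import random


class Instance:
    """One hypothetical replica of a service."""

    def __init__(self, name):
        self.name = name
        self.alive = True


class Service:
    """A service backed by several replicas (illustrative model only)."""

    def __init__(self, name, replicas):
        self.name = name
        self.instances = [Instance(f"{name}-{i}") for i in range(replicas)]

    def call(self):
        # A call succeeds as long as at least one replica is still alive.
        return any(inst.alive for inst in self.instances)


def application_healthy(services):
    """Treat the application as up only if every service can still answer."""
    return all(svc.call() for svc in services)


def chaos_round(services, rng):
    """Take one randomly chosen live instance offline.

    Returns the name of the instance that was killed, or None if no live
    instance remains.
    """
    live = [inst for svc in services for inst in svc.instances if inst.alive]
    if not live:
        return None
    victim = rng.choice(live)
    victim.alive = False
    return victim.name


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a run is repeatable
    # Assumed topology: two replicated services and one single-instance service.
    services = [Service("catalog", 3), Service("playback", 3), Service("billing", 1)]
    for round_no in range(1, 4):
        victim = chaos_round(services, rng)
        status = "still up" if application_healthy(services) else "DOWN"
        print(f"round {round_no}: killed {victim}, application {status}")
```

In this sketch a service with several replicas survives the loss of any one of them, while a single-instance service is a single point of failure. Surfacing that kind of weakness before a real outage does is the purpose of the exercise.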

Of course, if the concepts of release to production -- and release itself -- are called into question, so too is the role of operations. Netflix does not have a separate operations group for its cloud infrastructure -- every developer is responsible for putting his or her code into production and is called when something breaks. Cockcroft has caused a bit of a ruckus in the cloud community by calling this "NoOps," as opposed to "DevOps," which many operations-focused folks feel is the future of large-scale cloud computing applications.

To my mind, the notion of fine-grained services in continuous deployment puts to rest the concept of a separate operations group responsible for putting applications into production and keeping them running. I believe Cockcroft is somewhat overstating the situation, as there are people tracking the service monitoring and ensuring that any performance and latency issues get addressed. The larger point is that the new model of applications requires a radical rethinking of application architectures, different ways of moving fine-grained services through their individual lifecycles, different ways of monitoring an "application," and different ways of ensuring robustness. As I said last week, cloud computing requires rebuilding enterprise IT for a completely new operating model.

Perhaps the most interesting thing about Netflix is how it approached the overall proposition of using a cloud computing environment. It didn't focus on how to make the cloud support its established application architectures and IT processes. Instead, it evaluated its applications and operations to understand how the new environment would affect the compute infrastructure and redesigned the applications to address that. If your organization is looking to move aggressively into cloud computing and is willing to examine what is required to really leverage a cloud environment, the Netflix story is a critical example to understand.

Bernard Golden is the vice president of Enterprise Solutions for enStratus Networks, a cloud management software company. He is the author of three books on virtualization and cloud computing, including Virtualization for Dummies. Follow Bernard Golden on Twitter @bernardgolden. Follow everything from CIO.com on Twitter @CIOonline

More information: Arnnet.com