Amazon, Netflix, Standard Cloud APIs and the Inevitable Lock-in

A few weeks ago Adrian Cockcroft (Cloud Architect @ Netflix) wrote another very interesting post on his blog. Adrian warms up the discussion by sharing his experience of the reasons why you may want to use public cloud services. While a lot of people (including myself) sometimes advocate these concepts, there is nothing like hearing them first hand from the people who are actually running a business on this model. I like to hear/read Adrian for this reason. It's no secret that Netflix uses Amazon AWS to run their business, and this is the second part of Adrian's post. Admittedly, it is the part that intrigued me the most.

The remaining part of his post is basically a public ask (or hope) to see AWS API compatible clouds (or clones), possibly built around the OpenStack stack (no pun intended). He doesn't seem to be shy about sharing his pessimism about OpenStack's success (correct me if I am wrong, Adrian) but this isn't going to be the core of the post I am writing. Only time will tell who will be successful in doing what.

Going back to Adrian's "ask" I believe there are a number of reasons why he would like to see an AWS clone. Again Adrian is welcome to set the record straight if I got the wrong understanding.

One of the reasons is somewhat logical and it boils down to risk mitigation, additional resiliency and problem avoidance. I came to learn from another very interesting piece by Adrian that Netflix has a number of policies for backup and data retention. These include backing up data on S3, copying them to different AWS availability zones, and eventually replicating them in different AWS regions. It makes perfect sense for Netflix to go a step further and duplicate these data at different service providers for an additional level of risk mitigation. This is after all what this slide was trying to convey in his interesting pitch (highly recommended if you haven't watched it yet):

I'd speculate that another good reason why Adrian would like to see alternative public clouds based on clones of the AWS APIs is this: Netflix would like to have choices. Simple. What's wrong with that? I wouldn't expect anything less if I were them. Some would argue that Netflix doesn't want to be locked into Amazon. I think the matter is a lot more complex and, in fact, I am not sure I agree (entirely) with that. I don't even know whether avoiding a certain level of lock-in is possible at all (more on this later).

Warning: I am not trying to sell vCloud to Adrian Cockcroft or anyone else. By the way, I believe Adrian knows more about vCloud than I do.

Having said that, this is a hot topic. Adrian's blog post (along with all the comments on the thread) reminded me of a couple of old blog posts I wrote last year. They are "Open standards, open source, OpenStack and the TCPIP of Cloud APIs" and "vSphere, vCloud and the Meaning of Being Open", where I was trying to describe VMware's strategy in terms of API standardization and choice of service providers. This is an oversimplified picture, from one of those blog posts, that focuses on the point I am trying to make: a common API that works across different service providers.

This picture primarily shows access to different service providers using the same interface but the story doesn't stop here. Since vCloud Director is a product you can buy, you can even build your own private cloud if you want to. I regularly use, as a consumer of cloud services, a couple of internal labs (that mimic private clouds) as well as the public Stratogen cloud and another public cloud I am piloting with another big telco in Europe. I do have my choices.

Here I am not specifically talking about the effort of making the vCloud APIs an industry standard. Lately, I came to the (personal) conclusion that a standard API is a function of its adoption and not a function of a theoretical agreement. I am instead talking about the choice of service providers the vCloud stack would be able to guarantee to consumers. After all, it's one stack instantiated many times by different organizations (either private or public). I am not sure if it's a standard (yet), but it is certainly very consistent. And this is where I can hear you claiming: "it's a lock-in". And this is where I would argue: "is a certain minimum level of lock-in avoidable anyway?"

Let's try to get into a bit more detail and explore the options this industry (more particularly consumers and providers of cloud services) has.

API lock-in

First of all, what on earth is a lock-in? How do you define it? A lock-in, to me at least, is a function of the time it takes to move to an alternative solution. In the context we are discussing here, a lock-in is a function of how much time and effort it would take to rewrite your software (for example the Netflix software) to talk to a different cloud interface. Adrian at some point says it wouldn't be (too) difficult for Netflix to do that, but the mere fact that he is looking for an AWS clone tells me he doesn't want to get to that point (my speculation).

At this point, does it make any difference if the APIs you are writing your solution against are the vCloud APIs, the AWS APIs or the future OpenStack native APIs (these are APIs that expose the OpenStack personality, not the AWS clone interface)? I don't think so. Lock-in isn't so much about what you are writing against (be it the vCloud APIs, the OpenStack APIs, or the Amazon AWS APIs); it is rather about how difficult it is to move away from it.

At the end of the day, as a consumer, you don't have control over any of those APIs anyway. So it doesn't make any difference at all.
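The idea that lock-in is measured by the cost of moving, rather than by which API you picked, can be sketched in code. What follows is only an illustration under stated assumptions: `CloudProvider`, `InMemoryProvider` and `deploy` are hypothetical names invented for this example, and no real vCloud, AWS or OpenStack call is made. The application talks to one small interface, so switching providers would mean writing one new adapter instead of rewriting the application.

```python
from abc import ABC, abstractmethod
from itertools import count


class CloudProvider(ABC):
    """Hypothetical interface: the only surface the application touches."""

    @abstractmethod
    def start_instance(self, name: str) -> str:
        """Start an instance and return its provider-specific id."""

    @abstractmethod
    def stop_instance(self, instance_id: str) -> None:
        """Stop the instance with the given id."""


class InMemoryProvider(CloudProvider):
    """Stand-in for a real provider adapter; it calls no real SDK."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.running = {}      # instance id -> instance name
        self._ids = count(1)   # monotonically increasing id source

    def start_instance(self, name: str) -> str:
        instance_id = f"{self.prefix}-{next(self._ids)}"
        self.running[instance_id] = name
        return instance_id

    def stop_instance(self, instance_id: str) -> None:
        self.running.pop(instance_id, None)


def deploy(provider: CloudProvider, names: list[str]) -> list[str]:
    """Application code: depends on the interface, not on a provider."""
    return [provider.start_instance(name) for name in names]


if __name__ == "__main__":
    # The same application code runs against two different "providers".
    print(deploy(InMemoryProvider("provider-a"), ["web", "db"]))
    print(deploy(InMemoryProvider("provider-b"), ["web", "db"]))
```

Note that a layer like this does not remove the lock-in, it only moves it: the application is now tied to the hypothetical `CloudProvider` interface and to whatever features that interface leaves out.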

If you are a service provider you are pretty much in the same situation if you intend to use vCloud Director or OpenStack. Unless you decide to take OpenStack, fork it and do with it whatever you want. In that case it's a different kind of lock-in, and not necessarily a better one. Good luck with that.

Sure, if you are big enough you may be able to contribute to the main OpenStack project and see what you need / want implemented sooner rather than later but, frankly, if you are an organization of such a size, chances are that you have a say in the roadmap of a proprietary product too. I have seen that first hand.

All in all, using available third party software products (be they vCloud Director or OpenStack) to build clouds has the advantage of allowing consumers to connect to different service providers. Having said that, if users decide to consume services from these service providers, they are essentially locking themselves into that specific interface/API. Whatever that interface is.

I am not getting into the federation and hybrid cloud discussion here because it would only be useful to discuss why choosing one interface over the other could be better. Not the point of this post anyway.

Service Provider lock-in

The other option to see more openness (or the perception thereof) would be to keep Amazon AWS as your "gold standard" and pray for other service providers to implement a clone of their APIs (using OpenStack or any other tool). This is, to me, the worst of both worlds since both consumers and providers certainly have no control whatsoever over the AWS APIs (similarly to how you'd have no control over the vCloud APIs or the potential OpenStack native APIs). In addition to that you'd have to deal with the complexity of creating and consuming a clone that is fundamentally a reverse-engineering hack and that will suffer the generic problems of copying someone else's interfaces.

This is especially true when these interfaces are changing at the speed of light (given the pace Amazon is innovating introducing new cloud services) and also given the fact that the AWS interfaces appear to be pretty complex to track.

In reality, Adrian was asking for a clone of only a subset of the features provided by AWS. However, based on my past experience working for a company that was trying to be the overlay interface to everything, typically the only thing that works (somewhat) well across different virtualized platforms and interfaces is turning virtual machines on and off. I bet Netflix needs something more compelling than that to consider another service provider that claims to be compatible with the Amazon APIs. OK, I am exaggerating, but you see (hopefully) my point. If Amazon were to facilitate this cloning process or, better yet, if Amazon were to provide (read: sell) its own technology enablement stack to service providers, the story would be very different. But I don't think any service provider will be successful in implementing an AWS clone if Amazon doesn't want that to happen.
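The "lowest common denominator" effect behind that exaggeration can be shown with a few lines of set arithmetic. This is a toy sketch: the feature names and the two clones (`clone_one`, `clone_two`) are made up for illustration and do not describe the coverage of any real product. What a consumer can use portably is the intersection of what every provider implements, and that intersection shrinks quickly.

```python
# Hypothetical feature sets; every name here is illustrative only.
native_features = {
    "start_vm", "stop_vm", "object_storage",
    "load_balancer", "queue", "autoscaling",
}
clone_features = {
    "clone_one": {"start_vm", "stop_vm", "object_storage"},
    "clone_two": {"start_vm", "stop_vm", "load_balancer"},
}


def portable_features(native, clones):
    """Return the features usable on the native cloud AND on every clone."""
    common = set(native)
    for features in clones.values():
        common &= features  # keep only what this clone also implements
    return common


print(sorted(portable_features(native_features, clone_features)))
# prints ['start_vm', 'stop_vm']
```

With these made-up inputs, each clone covers half of the native feature set, yet the portable subset is down to starting and stopping virtual machines.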

If I were evaluating this option as a consumer, I would just give up on the idea of consuming a clone of Amazon... and I would just consume native Amazon AWS resources. Sure, you are limiting yourself to a single service provider (AWS), but I think it is better to be locked into Amazon than to have choices... that don't work very well. Because, in the end, we all need to be pragmatic, don't we?

Conclusions

In conclusion I just want to reiterate that you are making a bet and you can't really avoid a certain level of lock-in. It's just a fact of (IT) life. In the last 15 years I have come across a lot of vendors that were selling openness and freedom of choice. At the end of the day they were just trying to sell another control point. They don't call it a lock-in, as that makes the whole sales process a bit harder, but it is what it is.

This post is not meant to bash Amazon or OpenStack. As a matter of fact I am bashing vCloud at least as much. It's just a reality check on what's going on and how I see these things progressing for both consumers and providers of (IaaS) cloud services.

My message? Make your bet and keep your fingers crossed.

Perhaps I will be proven wrong. Oh well, it's just my usual (less than) 2 cents.

Massimo.