vCHS Meets vCO (and Boris Becomes a Hero!)

There are so many things I’d like to show and talk about here that I wanted to dumb down the title as much as possible.

The trigger

This all started (a long time ago) by realizing that with AWS you can deploy an EC2 instance and SSH/RDP into it in a matter of a couple of minutes with just a few clicks. With vCHS (and vCD for that matter) you can deploy an instance in pretty much the same time, but then you have to configure separately the Edge Gateway to get the proper NAT and Firewall rules in place (see this blog post for vCD networking configuration samples if you are new to vCD). While VMware is continuously improving the portal and the user experience I wanted to find a way to address this problem in “consumer space” (i.e. what a user can do without the need for vCHS engineering to improve the service).

If there is one thing that I learned playing with the vCHS APIs is that you can change the behavior of the portal (see the “How to change the computer name of the VM” section in this blog post). So while vCHS doesn’t have a wizard that allows you to deploy a VM and customize its surrounding network and security settings all in one, one can build a string of API calls to make that happen in a single transaction. Hold tight.

The excitement (so to speak)

This is where I started. But I didn’t stop there.

I wanted to bolt a bit of hybrid buzz on top of this concept. Wouldn’t it be *awesome* if you could execute that multi-calls API transaction from the vSphere web client so that a vSphere admin can deploy an “Internet VM” in vCHS as if it was a local deployment?

Back in December VMware released the vCHS vSphere Client plugin which is awesome, but that plugin cannot be extended nor customized. In addition, as of today, the vCHS vSphere Client plugin can only deploy VMs but it cannot (yet) configure network and security services in vCHS.

Long story short: enter vCenter Orchestrator.

With vCenter Orchestrator I could achieve both things:

  • I could concatenate API calls that would allow, in a single transaction, to deploy a VM and configure the network and security services associated to it.

  • I could expose that transaction in the vSphere client by virtue of the existing vCO and vSphere web client integration.

So this was the first sketch of what I had in mind. The journey begins:

The Roadblocks

This journey wasn’t a piece of cake. I must admit that.

I am going to focus this blog post on the result, not so much on the “sausage making” aspect of it. But, in complete transparency, I think it is good to touch on a couple of aspects that are particularly problematic.

The first set of roadblocks I found is related to the fact that the vCD plugin for vCO was designed around the assumption that the user is a cloud admin (aka vCloud Director Cloud Admin) and not a cloud tenant (aka vCloud Director Org Admin). Since vCHS consumers are always cloud tenants (even when they subscribe to a Dedicated Cloud), a lot of the out of the box vCD plugin workflows in vCO won’t work. Even workflows that are supposed to be available to tenants (e.g. configure an Edge Gateway NAT rule) have been created using API shortcuts that are only available to cloud admins (and not tenants).

I wouldn’t define this as “poor design” though. I would define this as a design for a totally different use case compared to what I am trying to achieve here.

The second roadblock I found ties back to one of my old rants re the cost of building clouds. I have experienced first hand the complexity of aligning a number of different products with different versions and compatibility requirements. This is the second sketch I made on the journey that, while not being detailed and exhaustive, can give you an idea of the problem:

So much for the sausage making part.

Big shout out to Aleksandar Lazarov in the vCO engineering team for fixing on the fly those roadblocks (in the vCD plugin and workflows) that allowed me to build the prototype below. Please note that those fixes haven’t been made public so: “kids, don’t try this at home” (it won’t work with currently shipping code).

The Storyboard (Use Case)

I’ll use a storyboard to give you an idea of what sort of scenario this prototype can address. The scenario I am going to describe is one among a million that you can think of where these technologies could apply. It goes like this:

Boris works in the IT department of a big Enterprise shop. He owns vSphere there. Lately he has noticed a surge in casual requests from peers in the IT department (and even outside) for transient VMs that colleagues require to do test, development, products evaluations and so forth.

These people often need to work closely with remote third party partners (that would need to get access to these VMs too). Boris is scratching his head because he has a lot of issues with fulfilling those requests.

  • Boris doesn’t have a lot of capacity left in his private vSphere environment. Where is he going to put those VMs?

  • Boris mentioned to the security team how external partners could get access to those VMs sitting on his vSphere infrastructure and they had a good laugh at him and his ask. How can he solve this problem?

  • Boris doesn’t have a private cloud project going on. There are talks about it but right now he must manually fulfill those requests.

Boris also thinks that “cloud” is a buzzword but this is another story.

The Solution (Philosophy)

Boris decides to subscribe to vCHS to gain immediate access to additional capacity in an OPEX model.

Boris isn’t going to give these internal and external folks access to vCHS directly. Instead Boris will get the request (via whatever channel) and create those VMs using a vCO workflow that he can launch from where he spend 18 hours a day (that is, the vSphere web client).

Boris doesn’t have a lot of public IP addresses subscribed with vCHS and wants to keep them to a minimum anyway. He will give users access to those VMs via NAT ports mapping that users will suggest.

Note: in addition to not consuming public IP addresses, this port mapping also increases (exponentially) the security of exposing SSH and/or RDP access (e.g. SSH attacks usually target port 22 and not arbitrary ports you choose when NATting).

For example:

  • User1 will get RDP access to a Windows VM on port 999 of the Edge Gateway IP.

  • User2 will get RDP access to a Windows VM on port 998 of the Edge Gateway IP.

  • User3 will get SSH access to a Linux VM on port 997 of the Edge Gateway IP.

  • User4 will get HTTPS access to a Linux VM on port 996 of the Edge Gateway IP.

  • etc

Essentially, the users requesting the VMs will only need to provide 4 pieces of information to Boris:

  • The VM template they want (Windows, Linux etc)

  • The name of the VM

  • The external TCP port

  • The internal TCP port

  • An email address to communicate the provisioning happened (and how to connect to the VM)

In return the user will get an email with a summary of the public IP and external port he/she will need to use to connect to the VM.

The Solution (Architecture and Flow)

This is what Boris setup to accomplish the above:

This is what happens when the user calls Boris to get a VM. Boris gets the message (phone, email, chat at the coffee break, whatever) and Boris runs the vCO workflow from within the vSphere web client:

And this is the happy user that consumes his VM per his specifications:

The Solution (Screenshots)

You can appreciate a more complete and detailed overview of the prototype with the video in the section below but these are a few screenshots of how the process looks like.

First and foremost, below is a picture of the vCO workflow. The heart of what I have been doing.

Another shout out is due here. This would have not been possible without the help of Andrea Siviero. We had so much fun building the LiquidDC utility that I thought he would have had more fun with vCO (that he masters). Thanks Andrea.

This is, at a high level, what the workflow does:

  • It programmatically reads the public IP of the Gateway [Get GW IP]

  • It deploys a vApp [Instantiate a vApp]

  • It programmatically reads the IP of the VM in the vApp deployed above [GetVAppIP]

  • It configures a NAT rule [Add a NAT rule EDITED]

  • It configures a FW rule [Add a firewall rule]

  • It creates a string of parameters (IPs and Ports) used to inform the user on how to connect to the VM [Create Output]

  • It sends an email to the user with the info above [Send notification]

Boris will call this workflow from the vSphere client. Note that Boris has created a ghost Datacenter in the vCenter hierarchy. This is pure “marketing”.

The vCO workflow is associated to any Datacenter object (so it would work if you right-click on DemoLab as well in the picture below) but Boris thought it was cool doing it this way. Mind you.

Boris will fill the workflow inputs with the parameters that a user (e.g. Massimo) asked him to use.

In particular what happened is that Massimo gave Boris a call asking begging for a Windows 2008 VM he would like to RDP into using port 999. Boris is then filling the input parameters of the workflow as follows:

Boris has an idea of how the workflow is doing because the vSphere web client tracks vCO workflows execution. At this point Boris can forget about Massimo. Boris can continue working on his stuff and go home sooner rather than later.

According to the vCO client, the workflow took 116 seconds to run:

The last step of the workflow sends an email to Massimo. In fact Massimo has received the following email:

Massimo, with tears in his eyes, will then just open an RDP client on his Mac and will type:

And boom! Boris is his hero now!

The Solution (Live)

This is a brief 7 minutes video that brings to life this storyboard and it goes through this ideal discussion between Boris and Massimo (Boris’ internal customer).

Conclusions

Deploying an Internet reachable VM on demand in 116 seconds from the vSphere client.

[Assuming we can get rid of all the problems I described at the beginning of the post in terms of products version mismatches] building a solution like this takes no longer than 1 or 2 days.

While this is very far from being a true hybrid cloud, I have seen “cloud” projects taking 2 years and 2M$, while still delivering half of what you have seen here in 10 minutes.

Say what you want. I think this is amazing.

The workflow I built is pretty basic (no error control or anything). If you want to do serious things you may spend more time to make it more solid.

You can even think of re-using more complex (and advanced) vApp deployment workflows like Christophe’s example here and complementing them with other workflow components (FW and NAT rules).

I am fairly sure you can think of at least another couple of hundreds use cases you can use this integration for.

Interesting times ahead for vSphere admins. You can all be heroes in your organization.

Massimo.

10 comments to vCHS Meets vCO (and Boris Becomes a Hero!)

  • Brilliant post. Lesson Learnt: All vSphere admin should become vCO expert soon

  • Great demo! The beauty of this is it is just as applicable to any VMware vCloud Powered or vCloud Datacenter partner.
    -Phil

  • […]  From here it’s just a matter of jumping in and using the service!  My buddy Massimo did a great entry on actually using the client to perform some incredibly powerful cloud orchestration.  This entry […]

  • I like this a lot, i assume the vCO workflow could just as easily be exposed in vCAC 6.x in the advanced service designer.

    Is there any logic in the workflow to check to see if the user selected port is already in use, i.e. Boris selects port 99 for his external NAT, does vCO check to see if this is already in use

    • Massimo

      Thanks Andy. Yes exposing this through vCAC was one of the next thing I had in mind to show and never had time to write a blog post about it. But absolutely it was the natural follow on. On the line of “you don’t even want to call or send Boris an e-mail, now you have a self service portal to “ask” for such a VM”. etc etc etc

      No there is no control whatsoever in that minimal workflow. If you intend to use it for any production related work you should have a lot more polished workflow with controls etc. This was really just the proof of a concept, literally.

      Thanks!

  • Hi, Can I have access to the source code of those workflows? Would you have one that can just setup the IP in vCHS just based on a provided IP or just read it from vCHS for a vAPP that does not use any vCNS.

    • Massimo

      Hi Eiad.

      Please send me an email off line and I will give you what I have built. What it does is fairly basic though. It reads the existing public IP assigned to the GW and it creates a FW / NAT rule to associate it to the internal IP of the VM just deployed. If you have multiple IPs on the Edge GW or if you want to request a new public IP programmatically (available in the latest vCD versions) you’d need to modify the logic.