Ah, the cloud: that weird and wonderful variation of technology that causes lots of headaches as it travels down the road to mass adoption.
One of the biggest headaches is that virtualized or cloud devices may not behave the way our tried-and-true physical networking devices do. Sounds like an easy concept, but you’d be amazed at how often this gets overlooked.
Here’s an example (company names withheld):
A big service provider was about 5 days away from a big customer signing on for a cloud hosting contract worth an estimated multiple millions. They were expected to deliver the solution to the customer based on pre-approved requirements around capacity and throughput. Sounds pretty commonplace, and considering this is not a fly-by-night operation, there was no reason to assume that anything was on the path to meltdown.
Until it turned out they had forgotten to test network throughput on the virtual servers. When they finally did, it came in at a fraction of what it needed to be. So let’s recap: a big customer expecting a perfect environment in a few days, and infrastructure not able to deliver. Insert expletive here.
This is not a case where they can simply call the internet provider for a fix, as it has nothing to do with the actual network capabilities. The problem lies in the implementation, and in the failure to properly account for the network effects caused by virtual NICs. Let me explain.
When you feed a network connection into a virtualized server (let’s say with 4 VMs running on it), it’s inherently going to behave differently than it would on a normal, single server. With virtualization, you now have fun factors like virtual NICs and other virtualized processes, all of which can cut into network throughput. It’s the same as when you take a comprehensive security device and power on all the additional capabilities: the throughput is going to go to, well, it’s not going to be good. Try to blame the OEM all you want, but the reality is that you should’ve expected some form of network throughput degradation. You can’t put four 300 lb guys in a Mini Cooper and expect it to hit top speed.
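To put rough numbers on that, here’s a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function name is made up, and the 10 Gbps link and the 30% overhead figure are invented, not measurements from the story above. Real overhead depends on the hypervisor, the virtual NIC drivers, and the workload, so treat this as a way to frame the question, not an answer.

```python
def estimated_per_vm_gbps(link_gbps: float, vm_count: int,
                          overhead_fraction: float) -> float:
    """Rough per-VM throughput when VMs share one physical link evenly.

    overhead_fraction is an ASSUMED share of link capacity lost to
    virtual NICs and other virtualization processing (0.0 up to 1.0).
    """
    if vm_count < 1:
        raise ValueError("vm_count must be at least 1")
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0.0, 1.0)")
    usable_gbps = link_gbps * (1.0 - overhead_fraction)
    return usable_gbps / vm_count


# Hypothetical numbers: one 10 Gbps link, 4 VMs, 30% assumed overhead.
per_vm = estimated_per_vm_gbps(link_gbps=10.0, vm_count=4,
                               overhead_fraction=0.30)
print(f"Estimated throughput per VM: {per_vm:.2f} Gbps")
```

With those made-up inputs, each VM gets about 1.75 Gbps, a long way from the 10 Gbps printed on the box. The even split is itself a simplification; the point is only that the number you promise should come from the virtualized setup, not the physical spec sheet.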
So what happened? Well, luckily they were able to push the deal back slightly and get it all fixed after working with all the right partners. But again, it’s not the fault of the OEMs here. The culprit was not doing the right testing.
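The testing that got skipped doesn’t have to be fancy. Here is a minimal sketch of an acceptance check in Python, assuming you have already measured throughput from inside the VMs with whatever tool you trust. The function name, the sample values, and the 900 Mbps requirement are all hypothetical.

```python
from typing import Sequence


def meets_requirement(measured_mbps: Sequence[float],
                      required_mbps: float) -> bool:
    """Return True only if the worst measured sample meets the requirement."""
    if not measured_mbps:
        # No samples means the test was never run, which is the real failure.
        raise ValueError("no throughput measurements provided")
    return min(measured_mbps) >= required_mbps


# Hypothetical samples taken from inside the VMs, in Mbps.
samples = [310.0, 295.5, 402.1]
required = 900.0  # hypothetical contractual requirement, in Mbps

print("PASS" if meets_requirement(samples, required) else "FAIL")
```

This one prints FAIL, which is exactly the kind of news you want weeks before the customer signs, not days. Judging by the worst sample rather than the average is a deliberately conservative choice.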
Cloud is inherently more complicated, and there are lots of squiggly details that are going to trip people up. There is no way everyone can be an expert, so the best thing to do is really take your time and do the research. Read up on best practices, talk to OEMs, make sure you have the right people on the project, and GO SLOW.
Cloud can help companies become more nimble and agile, but don’t expect the adoption path to be either of those things.