When considering an on-premises or hybrid IT
infrastructure, you will no doubt come into contact with the hyperconvergence hype train. It promises
reduced costs, greater flexibility, and free puppies for everyone - but what is a hyperconverged
infrastructure, what are its benefits, and does it deserve a place in your budget? Stick
with me, and let's find out. Welcome back to the Pro Tech Show - the place for tech, tips,
and advice for IT pros and decision-makers. To understand what hyperconvergence offers we need to compare
it to a traditional datacentre infrastructure. A traditional datacentre infrastructure is
sometimes called a three-tier infrastructure. Those three tiers are the network tier, the server tier,
and the storage tier. Let's start simple, though - here's a standalone server. It's a single box that
contains computational resources (processors and memory) and storage resources (hard drives). This
is no different to your desktop PC or even your laptop or phone. Everything you need is together in
one package. When you scale out your infrastructure, though, this approach becomes inefficient. You have to
worry about managing loads of separate pieces of storage spread across loads of discrete units.
Expansion is limited by the physical constraints of the individual boxes themselves, and you end up
wasting space by over-provisioning your storage to make sure you don't run out in each of your little
silos.
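Here's a quick back-of-the-envelope sketch in Python to put rough numbers on that waste. The figures are invented purely for illustration, but they show how per-silo headroom adds up much faster than headroom on a shared pool would.

```python
# Back-of-the-envelope comparison of siloed vs pooled storage provisioning.
# All figures are invented for illustration only.
servers = 10
provisioned_per_server_tb = 2.0   # bought per box "just in case"
used_per_server_tb = 0.7          # what each box actually consumes

siloed_purchase = servers * provisioned_per_server_tb   # 20.0 TB bought
total_used = servers * used_per_server_tb               # 7.0 TB of real data

# A shared pool only needs headroom on the aggregate, not on every silo.
pooled_purchase = total_used * 1.3                      # 30% headroom overall

print(f"Siloed: buy {siloed_purchase:.1f} TB to hold {total_used:.1f} TB")
print(f"Pooled: buy {pooled_purchase:.1f} TB to hold the same {total_used:.1f} TB")
```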
To solve this, organisations started separating their storage from their computational resource into a separate Storage Area Network or SAN. Your servers now do the computation and connect
back to a centrally managed pool of storage. As well as being much more flexible and scalable,
this makes it easy to do things like replicate your data to other locations and back it up more
efficiently, because you can replicate or back up the SAN itself rather than having to manage backup
and replication on every server individually. This infrastructure tiering, which was once reserved for
large datacentres, got pushed to the masses when virtualisation arrived. Virtualisation decouples
your computation from the underlying hardware, allowing the logical servers that run your
business to move around between physical host servers that simply provide processor and memory
capacity for them. This flexibility to roam across hosts has huge efficiency and resilience benefits,
but it also requires a separate storage tier so these logical virtual servers
can access the storage from any host. There's no point being able to seamlessly glide
your computation between different host servers if you're going to be tethered to one of
them anyway for access to your storage. So you arrive at the traditional three-tier datacentre
infrastructure that many of us are familiar with. The network, server, and storage tiers talk to
each other, but they are managed independently as three separate entities. Often, by three separate
teams. A converged infrastructure is basically the same thing but packaged up and productised by
a single vendor. Cisco, for example, offer a FlexPod converged infrastructure that is comprised of
Cisco network switches, Cisco servers running something like vSphere as the hypervisor, and NetApp
as the storage layer. It's all bundled up and sold as a pre-validated and pre-configured unit. The idea
is you buy your rack of converged infrastructure, plug it in, and off you go. If you need more
capacity you buy another rack from the same vendor. The attractive thing about this approach is
that you've got a single vendor to go to if anything goes wrong. You're not bouncing between
hardware, hypervisor, storage, and network vendors all blaming each other; and you avoid any arguments
about whether a particular component is compatible or supported. There are some downsides as well. In
theory, a converged infrastructure is very scalable, but in practice scaling it can be prohibitively
expensive. The entire thing is vendor-locked, and if you just need to add a little bit of capacity here or
there you might find yourself over a barrel. The vendor may insist that you can only add capacity
by purchasing an entire rack full of kit, which would be massive overkill; and of course there's
only one vendor you can get it from, so it's not going to be cheap! If you feel tempted to add a
little kit of your own, that may invalidate support for your entire datacentre stack - negating the
benefit of going for a converged infrastructure in the first place. So what is hyperconverged and
how is it different? Traditionally, the server and storage tiers of your three-tier infrastructure
used physically different hardware. The storage used dedicated arrays with storage controllers and
a fibre channel network linking it all together. Over time, we've seen a move away from specialised
storage hardware to more generalised server and network hardware. Expensive fibre channel storage
networks are in many cases being replaced by the iSCSI protocol that runs across a standard IP
network. Specialised storage controllers with dedicated hardware for things like RAID have
seen their logic move into software that runs on commodity server hardware. Handling storage in
software instead of hardware allows for a lot of flexibility, and new features can be downloaded
rather than having to buy and replace physical kit. You may have heard this described as Software
Defined Storage or SDS. So if your server layer is running on commodity x86 hardware, connected
to IP network switches... and your storage layer is running on commodity x86 hardware, connected to
IP network switches... why not put them together? This, in a nutshell, is a hyperconverged
infrastructure, or HCI. At first glance it may look like we've simply gone back to having
multiple standalone servers that contain both computation and storage. Physically, that's exactly
right, but logically speaking it behaves more like a three-tier infrastructure. The computation
is using virtualisation, so logical servers can migrate between physical hosts at will.
Unlike with a standalone server, there is no hard coupling between the computation resources
and the storage resources in the same metal box. The actual data is distributed and replicated
between servers just like nodes in a SAN. A virtual machine could feasibly be running on one
physical server and using storage from another. To all intents and purposes it's the same as a
three-tier infrastructure, but the hypervisor and SAN nodes are sharing the same physical hardware.
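If it helps to picture it, here's a deliberately simplified Python sketch of that idea - every block gets written to more than one host, so a VM can run anywhere and still reach its data over the IP network. The placement logic is a toy I've made up for illustration; it's not how any particular HCI product actually does it.

```python
import itertools

# Toy model of a hyperconverged storage layer: each block is stored on
# several hosts, so no single box "owns" the data. Purely illustrative -
# not any vendor's real placement algorithm.
HOSTS = ["host1", "host2", "host3", "host4"]
REPLICAS = 2  # keep two copies of every block

def place_blocks(block_ids, hosts=HOSTS, replicas=REPLICAS):
    """Assign each block to `replicas` distinct hosts, round-robin style."""
    placement = {}
    offsets = itertools.cycle(range(len(hosts)))
    for block in block_ids:
        start = next(offsets)
        placement[block] = [hosts[(start + i) % len(hosts)] for i in range(replicas)]
    return placement

for block, owners in place_blocks(range(8)).items():
    print(f"block {block}: replicas on {owners}")

# A VM running on host1 can happily read a block whose replicas sit on
# host2 and host3 - the read just crosses the IP network, much as it
# would have crossed the network to reach a SAN.
```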
The first obvious benefit of this approach is that by collapsing your server and storage tiers you
save on hardware. That's fewer metal boxes to buy, power, cool, and fit somewhere. Scaling out your
infrastructure is in theory a nice and simple affair. Your capacity now comes in discrete units
that include all of your computation and your storage. So if you have five hyperconverged host
servers and you need 20% more capacity, you buy another hyperconverged host and
everything increases by 20%.
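As a trivial bit of arithmetic (per-node figures invented for the example), that linear scaling looks something like this:

```python
# Idealised linear scale-out: every node contributes a fixed slice of each
# resource, so adding one node grows everything in lockstep.
# Per-node figures are made up for the example.
NODE = {"cores": 32, "ram_gb": 512, "storage_tb": 20}

def cluster_capacity(node_count):
    return {resource: amount * node_count for resource, amount in NODE.items()}

before = cluster_capacity(5)   # five hyperconverged hosts
after = cluster_capacity(6)    # buy one more

for resource in NODE:
    growth = (after[resource] - before[resource]) / before[resource]
    print(f"{resource}: {before[resource]} -> {after[resource]} (+{growth:.0%})")
# Every resource grows by the same 20% - that's the appeal.
```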
You'll often get simplified administration as well. This varies quite a bit between vendors, of course. Some wrap everything up inside a single management
interface and abstract away a lot of the underlying detail. Others are a bit more of a DIY
affair. With the more packaged products you can manage
the infrastructure from a single point rather than managing servers and storage separately. Maybe
you only need one team now, instead of two. So: less kit to buy, fewer things to manage,
all the performance and resilience of a three-tier infrastructure... Is there
a catch? Yeah. There's always a catch. Linear scalability sounds like a good idea because
it's easy to understand and easy to purchase. The problem, of course, is that a lot
of applications don't scale linearly. Virtual desktop infrastructure or VDI is
generally a good fit for this sort of thing. Each new user means a new desktop, which means a
repeatable chunk of processor, memory, and storage. If you double the number of users, you double
your processor, memory, and storage requirements; so you double the number of hyperconverged
hosts and it all works out quite nicely. But what about a file server? With file
servers you'll usually see storage growth over time, but processor and memory utilisation tends
to remain fairly static by comparison. That's something that doesn't lend itself very nicely
to hyperconverged scaling. If you want to add more storage capacity you could be forced to
buy additional computation resources as well, because it all comes in a single box. Quite how
inflexible this is will depend on the vendor and the units they offer, but you can easily find yourself
spending more on hardware because you can't simply tack on a bit of storage. Instead you're buying
processors and memory you don't actually need, because "hyperconverged".
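To see how quickly that bites, here's a rough sketch with invented numbers for a storage-heavy workload - the storage requirement dictates the node count, and the spare processors come along for the ride whether you want them or not.

```python
import math

# Non-linear demand: storage keeps growing, compute stays flat.
# Per-node and demand figures are invented purely for illustration.
NODE_CORES = 32
NODE_STORAGE_TB = 20

cores_needed = 40         # roughly static year on year
storage_needed_tb = 180   # keeps creeping up as the file server fills

nodes_for_cores = math.ceil(cores_needed / NODE_CORES)              # 2 nodes
nodes_for_storage = math.ceil(storage_needed_tb / NODE_STORAGE_TB)  # 9 nodes

nodes_to_buy = max(nodes_for_cores, nodes_for_storage)
idle_cores = nodes_to_buy * NODE_CORES - cores_needed

print(f"Nodes needed for compute: {nodes_for_cores}")
print(f"Nodes needed for storage: {nodes_for_storage}")
print(f"Nodes you actually buy:   {nodes_to_buy} (leaving {idle_cores} cores idle)")
```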
Another downside is complexity. Reduced complexity is supposed to be a benefit of hyperconverged infrastructures, but
that only really counts when it's working. There are fewer things to manage, yes; but if you think
about it, each of those things is now individually more complex, because it combines
the computation and the storage in every unit. You can no longer take a virtualisation host down
for maintenance without also taking a storage node down with it. If a Hyper-V
host bluescreens, so does part of your SAN. Typically, you should be able to tolerate
such a failure, because the cluster will be deployed in a highly available topology where other
nodes pick up the slack; but the point remains. When it's all working well there's less to manage.
When you have an issue to deal with, though, it can start to feel a bit like a house of cards. So
ultimately whether hyperconvergence is right for you or not is going to depend on a number of
factors. If your infrastructure scales linearly, hyperconvergence could save you money on hardware. If your
scaling needs are non-linear, it could cost you more in hardware. You might want to
mix and match - you might be better served by the flexibility to design separate server and
storage tiers in your main datacentre, but the plug-and-play simplicity of hyperconvergence
in a space-constrained branch site. There is no one-size-fits-all answer, and you need to consider
the performance characteristics of your workload as well. Whilst I can't give you a simple "yes"
or "no" answer on that, hopefully this has helped you figure out where to start. Let me know in the
comments if you're going hyperconverged and in what scenarios you find it works best for you. If
you found this video useful, give it a like before you go, and don't forget to subscribe for future
videos. Thanks for watching, guys. See you next time!