How Hyperconverged Systems Change Storage Provisioning for VMs
The arguments for hyperconvergence are compelling: Ease of deployment, reduced operational effort and lower acquisition costs are commonly espoused benefits. Under the hood, hyperconverged infrastructure (HCI) implements technologies conceptually similar to the traditional combination of virtualization and storage, but adds a software layer on top to provide easy management, administration and monitoring functions. That software layer introduces a few design trade-offs that admins should consider when planning an HCI deployment.
What Is a Hyperconverged System?
Hyperconverged systems typically configure individual servers with large amounts of storage and network bandwidth, in addition to the usual processing and memory components. The thinking, of course, is to simplify management of IT resources by colocating the storage with the compute, avoiding the additional infrastructure required to implement a separate storage area network. Doing so carries a number of performance implications.
1. Disk Subsystem
Costs for enterprise-grade solid-state storage devices continue to fall, and it is reasonably economical to add several per server. With sustained throughput ratings approaching 500 megabytes per second now quite common, it doesn't take very many drives to saturate a 10 gigabit network. This point becomes all the more important when we discuss network performance. Driving those storage devices also takes some effort on the part of the processors, although that can be mitigated with a good storage controller, at additional cost, of course.
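To put rough numbers on that claim, here is a minimal back-of-the-envelope sketch in Python. The 500 MB/s per-drive throughput and the 10 Gb/s link speed are illustrative assumptions, not measurements of any specific hardware:

```python
# Back-of-the-envelope check: how many SSDs saturate a 10 GbE link?
# Both figures below are illustrative assumptions, not vendor specifications.

SSD_THROUGHPUT_MB_S = 500        # assumed sustained throughput per enterprise SSD
NETWORK_GBIT_S = 10              # assumed network link speed (10 GbE)

# Convert the link speed to megabytes per second (1 byte = 8 bits).
network_mb_s = NETWORK_GBIT_S * 1000 / 8   # roughly 1250 MB/s

drives_to_saturate = network_mb_s / SSD_THROUGHPUT_MB_S
print(f"A {NETWORK_GBIT_S} GbE link carries roughly {network_mb_s:.0f} MB/s")
print(f"About {drives_to_saturate:.1f} drives at {SSD_THROUGHPUT_MB_S} MB/s saturate it")
```

In other words, two or three drives are enough to fill the pipe, which is why the network keeps reappearing as the constraint in the sections that follow.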
The nature of your application requirements shapes the storage configuration as well. Workloads built on ephemeral virtual machines, where individual VMs are expendable in some sense, are conducive to a simplified storage architecture in which storage is not synchronized between servers. A relatively small amount of local storage can be set aside (and replicated between nodes) to provision VMs, while the remainder is excluded from replication and reserved for active VMs and snapshots.
To guard against individual disk failures, apply traditional RAID configurations, generally using the aforementioned storage controller. In that scenario, the complete failure of an entire server in the HCI would result in the loss of the VMs running on it, which would then be rapidly recreated on the remaining servers. Where workloads cannot tolerate such failures, the storage system must necessarily become more complex.
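As a rough illustration of how such a split might be planned, the following sketch carves a single node's raw capacity into a small replicated provisioning pool and a larger non-replicated pool for active VMs and snapshots, after subtracting RAID overhead. The drive count, drive size, RAID level and pool fraction are all hypothetical assumptions chosen for illustration:

```python
# Hypothetical capacity split for a single HCI node, for illustration only.
# Assumptions: 8 x 1.92 TB SSDs in RAID 5 (one drive's worth of parity),
# with 10% of the usable space replicated across nodes for VM provisioning.

DRIVES_PER_NODE = 8
DRIVE_TB = 1.92
RAID_PARITY_DRIVES = 1            # RAID 5: one drive's capacity lost to parity
REPLICATED_FRACTION = 0.10        # small pool synchronized between nodes

raw_tb = DRIVES_PER_NODE * DRIVE_TB
usable_tb = (DRIVES_PER_NODE - RAID_PARITY_DRIVES) * DRIVE_TB

replicated_pool_tb = usable_tb * REPLICATED_FRACTION   # templates, provisioning images
local_pool_tb = usable_tb - replicated_pool_tb         # active VMs and snapshots

print(f"Raw capacity per node:        {raw_tb:.1f} TB")
print(f"Usable after RAID 5:          {usable_tb:.1f} TB")
print(f"Replicated provisioning pool: {replicated_pool_tb:.1f} TB")
print(f"Local (non-replicated) pool:  {local_pool_tb:.1f} TB")
```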
2. Network Performance
Many applications cannot tolerate the ephemeral nature of expendable VMs, and instead require high-availability configurations that re-establish operations on a different server in the event of an equipment failure. To accomplish that in a hyperconverged environment, local storage must be replicated synchronously to one or more other servers, which means network bandwidth can easily become a limiting factor for VM performance. That isn't just replicating application data as transactions are committed; it can also mean replicating the in-memory state of the VM.
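As a very rough way to see how quickly synchronous replication eats into a 10 Gb/s link, the sketch below estimates replication traffic from an assumed number of protected VMs, an assumed average write rate per VM and an assumed replication factor; every number is a hypothetical planning input, not a measurement:

```python
# Rough estimate of synchronous-replication traffic on the HCI network.
# Every figure here is a hypothetical planning assumption.

VM_COUNT = 40                      # HA-protected VMs on one node
WRITE_RATE_MB_S_PER_VM = 20        # assumed average write rate per VM
REPLICATION_FACTOR = 2             # each write kept on two nodes in total
NETWORK_MB_S = 10 * 1000 / 8       # 10 GbE expressed in MB/s (~1250 MB/s)

# Only the extra copies travel over the network.
replication_mb_s = VM_COUNT * WRITE_RATE_MB_S_PER_VM * (REPLICATION_FACTOR - 1)
utilization = replication_mb_s / NETWORK_MB_S

print(f"Replication traffic: {replication_mb_s:.0f} MB/s "
      f"({utilization:.0%} of a 10 GbE link)")

# Note: this ignores VM memory-state replication, deduplication traffic and
# ordinary application traffic, all of which compete for the same links.
```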
Configuring other typical enterprise storage features, such as deduplication, consumes additional network bandwidth as well. That model also impacts capacity planning, because administrators must deploy a significant amount of additional compute capacity (possibly as much as double) to ensure resiliency to failures.
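To see why the resiliency requirement can approach double the deployed capacity, the sketch below works through the arithmetic for a small cluster that must survive the loss of one node; the cluster size and failure-tolerance policy are assumptions for illustration:

```python
# How much spare capacity does failure tolerance cost?
# The cluster size and failure-tolerance policy are illustrative assumptions.

NODES = 4                     # HCI nodes in the cluster
FAILURES_TO_TOLERATE = 1      # must survive the loss of one node

# To keep running after a node failure, the surviving nodes must be able to
# absorb the failed node's workload, so each node can only be partially loaded.
usable_fraction = (NODES - FAILURES_TO_TOLERATE) / NODES
overprovisioning = 1 / usable_fraction

print(f"Each node can be loaded to about {usable_fraction:.0%} of its capacity")
print(f"Effective overprovisioning factor: {overprovisioning:.2f}x")

# In a two-node cluster, or wherever every VM's storage is fully mirrored to a
# second node, the same arithmetic yields a factor of 2.0x: double the capacity.
```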
3. Processing and Memory
Both storage and networking place additional burdens on processor and memory components. Storage and network controllers that off-load those functions from the processors are recommended. Remember: It only takes a few solid-state drives to consume a 10Gb network. A secondary network dedicated to storage communication within the HCI is also highly recommended, as it removes contention between storage replication traffic and application traffic. Off-loading those functions to dedicated controllers leaves more processing power available for application workloads, so investments in virtualization licensing can be leveraged as fully as possible.
Scaling Up with Hyperconverged Infrastructure
What happens when we think about scaling up hyperconverged infrastructure? HCI typically deploys with homogeneous servers that have identical processing, memory, storage and networking components. As an environment grows, that approach forgoes an opportunity to optimize costs. A common trend is that an application's compute requirements scale disproportionately to its storage requirements, so deploying additional HCI nodes with a proportional amount of both may not be required to satisfy application demands.
For instance, a heavily compute-intensive application may not require additional storage. Conversely, a Big Data application will tax the storage infrastructure far more than the compute. Pay attention to both of those trends as part of your capacity planning. When admins must take a very long-term view, remember that the cost of processing capacity improves at a different rate than that of storage or networking. That probably doesn't make much of a tangible economic difference for smaller deployments, but at scale, neglecting the trend will prove costly.
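A simple way to quantify the cost of homogeneous scaling is to compare how many identical nodes a compute-heavy growth scenario forces you to buy against how much of their storage you actually use. The per-node specifications and demand figures below are hypothetical:

```python
import math

# Illustration of stranded capacity when compute and storage demand grow at
# different rates but capacity must be added in identical, homogeneous nodes.
# All per-node specifications and demand figures are hypothetical.

NODE_CORES = 32                   # processor cores per HCI node
NODE_STORAGE_TB = 13.4            # usable storage per HCI node

extra_cores_needed = 256          # projected compute growth
extra_storage_needed_tb = 20      # projected storage growth

nodes_for_compute = math.ceil(extra_cores_needed / NODE_CORES)
nodes_for_storage = math.ceil(extra_storage_needed_tb / NODE_STORAGE_TB)
nodes_to_buy = max(nodes_for_compute, nodes_for_storage)

stranded_storage_tb = nodes_to_buy * NODE_STORAGE_TB - extra_storage_needed_tb

print(f"Nodes required for compute growth: {nodes_for_compute}")
print(f"Nodes required for storage growth: {nodes_for_storage}")
print(f"Nodes actually purchased:          {nodes_to_buy}")
print(f"Storage deployed but not needed:   {stranded_storage_tb:.1f} TB")
```

In this hypothetical, the compute requirement drives the purchase of eight nodes while the storage on most of them sits idle, which is exactly the situation a disaggregated design can avoid.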
With a basic understanding of the trade-offs I've outlined here, teams can make more informed decisions when deploying hyperconverged infrastructure. This somewhat technical discussion has omitted an important factor: personnel time and effort. The core value proposition of HCI is that it simplifies management; however, it does so at the expense of scalability. In smaller environments, HCI reduces or eliminates the need for teams of IT specialists, each focused on storage, compute, networking and associated technologies. As application demands grow, equipment costs start to dominate the total cost of ownership equation; at that point, separating out the storage, compute and networking functions becomes more cost-effective. Bottom line: HCI prioritizes easy management of the virtualization ecosystem and reduced personnel costs, possibly at the expense of highly optimized price/performance at scale.