Service providers turn to mainframes as pragmatic means of delivering workload performance

May 3, 2021
by James Sanders


Introduction


As comparisons go, you can do worse than calling a general-purpose datacenter CPU the electronic equivalent of a Swiss Army knife. It can be adapted to handle just about anything – and for decades, it has been. In pursuit of performance, workloads have increasingly been directed to purpose-built silicon, such as using GPU-accelerated systems for AI/machine learning (ML) workloads. This trend extends beyond the use of GPUs, however: Differentiated systems, including the venerable mainframe, are of increasing importance to service providers as they aim to deliver fit-for-purpose solutions for their customers.

The 451 Take

Service providers approach technology with an outcome-based mindset. There is no room for ego when handling the workloads of your clients; this is not an exercise in technology for the sake of technology. The consequences of this cut both ways – using relatively staid mainframe systems is the correct choice when the workload in question demands mainframe-level high availability and processing speed. That said, the engineering time required to fully leverage the capabilities of a given piece of hardware is not insignificant.

Finding accelerants for your workloads on fire


Deriving value from data sets requires processing, and it is not enough that data simply be processed; it must be processed relatively quickly, depending on the type of data involved and the desired outcome. As noted in the introduction, ML model training is an inherently iterative process: when new data becomes available, re-training the model on the expanded data set – not just the delta of new data – is necessary to derive value from it.
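As a minimal sketch of this re-training pattern (assuming a scikit-learn-style workflow and synthetic placeholder data, neither of which appears in the report), the model below is refit on the combined corpus rather than on the new records alone:

```python
# Illustrative only: synthetic data and scikit-learn stand in for a real pipeline.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(seed=0)

# Previously collected corpus and a newly arrived batch of records
X_old, y_old = rng.normal(size=(10_000, 20)), rng.integers(0, 2, size=10_000)
X_new, y_new = rng.normal(size=(1_000, 20)), rng.integers(0, 2, size=1_000)

# Re-train on the expanded data set, not just the delta of new data
X_all = np.concatenate([X_old, X_new])
y_all = np.concatenate([y_old, y_new])

model = LogisticRegression(max_iter=1_000).fit(X_all, y_all)
print(model.score(X_all, y_all))  # sanity check on the refit model
```

The point of the sketch is the data handling, not the model choice: the full, expanded corpus is assembled and the fit is repeated from scratch, which is why training time and accelerator capacity scale with the cumulative data set rather than with the increment.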

Model training is (generally) not a real-time event. Data collected years or decades prior can be useful for models related to speech-to-text translation, and these data sets must be sufficiently large to generate a model of functional use. GPUs are used for these tasks, given their architectural advantage of extremely high parallelism.

451 Research's Infrastructure Evolution 2020 study finds support for this trend – 46% of service providers across a variety of categories indicated production use of GPUs for client workloads, with a further 26% planning to implement within 12 months. This is a mainstream, well-understood use of technology: NVIDIA's CUDA library for general-purpose GPU compute was introduced over a decade ago, and GPU-accelerated cloud VM offerings have expanded steadily over the past several years. GPUs are sought for the outcomes they deliver for workloads that demand this level of performance.
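For illustration, the sketch below shows general-purpose GPU compute driven from Python. It assumes a CUDA-capable GPU and the CuPy library, which is not named in the report; any CUDA-backed array library would make the same point about offloading highly parallel work to the GPU:

```python
# Illustrative only: CuPy (a NumPy-compatible, CUDA-backed array library) is an
# assumed choice; the report mentions CUDA but no specific Python toolkit.
import numpy as np
import cupy as cp

size = 4096
a_cpu = np.random.default_rng(0).random((size, size), dtype=np.float32)

a_gpu = cp.asarray(a_cpu)          # copy the matrix into GPU memory
b_gpu = a_gpu @ a_gpu              # matrix multiply runs as CUDA kernels
cp.cuda.Stream.null.synchronize()  # wait for the asynchronous GPU work to finish

b_cpu = cp.asnumpy(b_gpu)          # copy the result back to host memory
print(b_cpu.shape, float(b_cpu.sum()))
```

The dense matrix multiplication here is the kind of embarrassingly parallel arithmetic that underlies model training, which is why the same hardware is attractive for both.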

Taking technology to a teleological extreme


There is no reason this trend should be limited to GPUs – different workloads bring different processing requirements, and purpose-built hardware is designed and marketed toward solving these problems. Perhaps the oldest among these (by definition, or at a minimum, by etymology) are mainframes – while counted as 'server hardware' in the survey data above, mainframes are effectively workload-specific accelerated systems. Fifteen percent of the service providers in the survey indicated operating a mainframe environment in production, with growth on the horizon: 20% plan to implement within 12 months, 14% plan to implement within 24 months and a further 8% indicate that mainframes are in discovery or proof of concept.

Even leaving open the possibility that adoption or management of mainframes – or any technology – may be aspirational on the part of service providers (claiming 'we can do X, and we're doing it for customer Y' confers flexibility), there is a visible increase in interest in mainframes. Notably, IBM's Q1 2021 earnings indicated 49% year-over-year revenue growth in its IBM Z mainframe business.

One example driver of this trend is a higher number of credit card transactions – a commonly cited high-frequency 'essential' workload handled by mainframes. This could be a result of the COVID-19 pandemic, as businesses deprioritized cash payments due to sanitation concerns. Similar increases in digital transactions from e-commerce and microtransactions may also serve to increase the overall number of transactions.

While general-purpose CPUs are versatile, enterprise workloads should – ideally – be executed on the hardware best optimized for the job. Certainly, in a post-Moore's Law landscape, the use of accelerators is an appealing way to extract greater performance. These benefits are swaying service providers: 24% of respondents indicate using FPGAs to accelerate workloads, with 26% planning to implement within 12 months. Fully utilizing differentiated hardware does come with a caveat: the engineering time required to fully leverage the capabilities of a given piece of hardware is not insignificant.

Furthermore, Arm-based servers – such as the Ampere Altra or the custom Graviton2-based instances in AWS – may prove compelling for compute performance or competitive pricing, as 22% of respondents indicate using Arm-based servers in production and 24% are planning to implement within the next 12 months. While there may be some headwinds to migrating from x86-64 to Arm, open source software and partner offerings are catching up to provide multiplatform support. (Interestingly, NextRoll cited re-platforming its applications to Amazon Linux 2 as more involved than moving from x86-64 to Arm.)

Implications for a quantum future


Clear delineations exist among workload accelerators. In addition to the examples of GPUs for AI/ML model training and mainframes for high-availability financial workloads, GPUs are likewise adept at video encoding, and Google deploys custom silicon to encode video for YouTube.

Quantum computing, which is likely to take on increasing importance in compute accelerator portfolios over the next several years, may also deliver domain-specific benefits in near-term systems, depending on the workload in question. For example, quantum architectures that provide fast gates may be better suited to finance applications, while architectures that provide long coherence times may be better suited to chemistry applications.