The new memory addressing scheme in IBM Power10 will change the way we look at AI applications

IBM's lead engineer Bill Starke poses the question: "What if there is a new kind of workload and I want to gather much more memory than could ever fit in a single device? This is a completely new application: what if I want one computer to talk to a petabyte of memory? But today, no one knows how to build such a system."

With a concept called distributed memory disaggregation, implemented in the latest IBM Power10 processors, he says, "you can really use huge, huge amounts of memory and allow whole new types of workloads to run on computers running in a single cloud."

In the new Power10 processors, IBM has rethought its approach to working with RAM: one processor core in one server can now address physical memory installed in another server in the same cluster. This is made possible by architectural features that let multiple cores share a common memory address space and work on a common task in parallel. Previously, such tasks would have required asynchronous accelerators such as GPUs, ASICs, or FPGAs, many more threads, and a separate orchestration scheme.
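From the programmer's point of view, the promise is that remotely installed memory simply shows up in the ordinary flat address space. The following minimal C sketch is our own illustration of that model, not IBM's API: a plain threaded program walks one large allocation, and under disaggregation that allocation could in principle be backed partly by DIMMs on other cluster nodes. The buffer size here is deliberately small so the sketch actually runs.

```c
/* Hypothetical sketch, not IBM's API: under memory disaggregation an
 * allocation bigger than one node's DIMMs would still look like plain
 * local memory, so ordinary threaded code keeps working unchanged. */
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

#define NTHREADS  8
/* Kept small (256 MiB) so the sketch runs anywhere; imagine it being
 * terabytes, transparently backed by memory on other cluster nodes. */
#define BUF_BYTES (256ULL << 20)

static unsigned char *buf;
static unsigned long long sums[NTHREADS];

static void *touch(void *arg) {
    size_t id = (size_t)arg;
    size_t chunk = BUF_BYTES / NTHREADS;
    unsigned long long sum = 0;
    /* Every thread reads its slice through the same flat address space;
     * there is no explicit "local vs. remote" API anywhere. */
    for (size_t i = id * chunk; i < (id + 1) * chunk; i++)
        sum += buf[i];
    sums[id] = sum;
    return NULL;
}

int main(void) {
    buf = malloc(BUF_BYTES);
    if (!buf) { perror("malloc"); return 1; }
    for (size_t i = 0; i < BUF_BYTES; i++)
        buf[i] = (unsigned char)i;

    pthread_t t[NTHREADS];
    for (size_t i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, touch, (void *)i);

    unsigned long long total = 0;
    for (size_t i = 0; i < NTHREADS; i++) {
        pthread_join(t[i], NULL);
        total += sums[i];
    }
    printf("checksum: %llu\n", total);
    free(buf);
    return 0;
}
```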

Diagram of AI use in IBM Power10

Today, GPU accelerators are tied to their own server. They perform massively parallel algorithmic tasks much faster than a general-purpose processor, whose concurrency is limited by the number of available cores. Although IBM has had a hand in developing accelerator technology, most recently backing the OpenCAPI architecture for accelerator interfaces, it does not hold the same market share as Intel (which bought Altera in 2015), Nvidia, or Xilinx.

With the new approach to memory architecture, many types of resource-intensive computing will no longer require asynchronous accelerators, including GPUs. The AI block will work according to the classic SMP scheme, which is friendlier for programmers than developing for a GPU or ASIC. AI tasks can be integrated into the general flow of program code, which still leaves the option of moving any large-scale AI processing to separate accelerators (GPU or ASIC). This returns AI tasks to a sequential, synchronized, symmetrical workflow on the CPU.
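To see what "SMP-friendly" means in practice, here is a small, hypothetical OpenMP sketch of our own (not IBM's toolchain): an inference-style matrix-vector product runs inline in the normal program flow, parallelized across CPU cores, with no device buffers, transfers, or kernel launches.

```c
/* Hypothetical illustration, not IBM's toolchain: an inference-style
 * matrix-vector product executed inline on the CPU cores (classic SMP)
 * instead of being offloaded to a GPU with buffer copies and kernels. */
#include <stdio.h>
#include <stdlib.h>

static void matvec(const float *m, const float *x, float *y,
                   int rows, int cols) {
    /* Rows are independent, so the loop spreads across cores; there is
     * no device memory, no transfers, no separate kernel source. */
    #pragma omp parallel for
    for (int r = 0; r < rows; r++) {
        float acc = 0.0f;
        for (int c = 0; c < cols; c++)
            acc += m[r * cols + c] * x[c];
        y[r] = acc;
    }
}

int main(void) {
    const int rows = 1024, cols = 1024;
    float *m = malloc(sizeof *m * rows * cols);
    float *x = malloc(sizeof *x * cols);
    float *y = malloc(sizeof *y * rows);
    if (!m || !x || !y) { perror("malloc"); return 1; }

    for (int i = 0; i < rows * cols; i++) m[i] = 0.001f;
    for (int i = 0; i < cols; i++)        x[i] = 1.0f;

    matvec(m, x, y, rows, cols);   /* just another synchronous call */
    printf("y[0] = %f\n", y[0]);   /* expect about 1.024 */

    free(m); free(x); free(y);
    return 0;
}
```

Compiled with `gcc -fopenmp`, the product runs in parallel; without the flag the pragma is simply ignored and the same code runs serially. Either way it remains one synchronous call site in the program, which is the point of the SMP model.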

IBM intends to take full advantage of its Red Hat division by making OpenShift, its commercial Kubernetes platform, the workload deployment mechanism. This means that a Power10 cluster will be able to organize highly parallelized workloads (perhaps not massively parallel, but still fairly complex) into a single group of tasks managed through Kubernetes rather than through some external engine.

Ron Amadeo
25/08/2020

