AI Drives the Software-Defined Heterogeneous Computing Era

Dr. Sakyasingha Dasgupta

The rapid development of artificial intelligence (AI) applications has created enormous demand for high-performance and energy-efficient computing systems.

However, traditional homogeneous architectures based on von Neumann processors struggle to meet the requirements of AI workloads, which often involve massive parallelism, large data volumes, and complex computations. Heterogeneous computing architectures, which integrate different processing units with specialized capabilities, have emerged as promising solutions for AI applications. In our view, AI is driving the next era of software-defined heterogeneous computing, enabling better solutions for complex problems.

Heterogeneous Computing in an AI Context

Heterogeneity in computing architectures refers to the situation where different types of computing devices, such as general-purpose processors, reconfigurable devices, accelerators, and sensors, are integrated and interconnected in various ways to form complex systems. These systems can span multiple levels of granularity, from a single chip to a data center to a distributed processing network. Heterogeneity can involve different instruction sets, memory models, programming models, and communication protocols working together.

The primary motivation for heterogeneity is to optimize the trade-offs between performance, power consumption, cost, and flexibility. AI applications like real-time video analysis or speech recognition may require high computational intensity and low latency, so they can benefit from specialized hardware accelerators that deliver high performance per watt. On the other hand, some AI tasks, such as natural language processing or recommendation systems, may involve diverse and dynamic data sources and models. These tasks can benefit from reconfigurable processor architectures that adapt to rapidly evolving AI workloads and data characteristics.

Software Challenges for Heterogeneous Computing

However, heterogeneous computing architectures also pose significant software development and optimization challenges.

These challenges include:

  • Partitioning and distributing a workload among different processing units
  • Managing data movement and communication among different memory hierarchies and interconnects
  • Exploiting parallelism and concurrency at different levels of granularity
  • Balancing trade-offs between performance, energy, accuracy, and reliability
  • Ensuring the portability and scalability of the software across different platforms and devices

Software tools and frameworks that can abstract the complexity and heterogeneity of the underlying hardware and provide high-level programming models and interfaces are essential in meeting these challenges.
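To make the first two challenges concrete, here is a minimal Python sketch of placing a chain of operations on two devices while accounting for the cost of moving data between them. It is an illustration only: the function name `place_chain`, the device names, and all cost values are hypothetical, and a real toolchain would work on full graphs with measured costs.

```python
def place_chain(op_costs, transfer_cost):
    """Choose a device for each op in a linear chain.

    op_costs: list of dicts mapping device name -> compute cost of that op.
    transfer_cost: cost paid whenever consecutive ops run on different devices.
    Returns (total_cost, placement) found by dynamic programming.
    """
    devices = list(op_costs[0])
    # best[d] = (cheapest cost of a plan ending on device d, that plan)
    best = {d: (op_costs[0][d], [d]) for d in devices}
    for costs in op_costs[1:]:
        new_best = {}
        for d in devices:
            # Cheapest way to arrive at device d from any previous device.
            arrive_cost, path = min(
                (
                    (prev_cost + (0 if prev_dev == d else transfer_cost), prev_path)
                    for prev_dev, (prev_cost, prev_path) in best.items()
                ),
                key=lambda item: item[0],
            )
            new_best[d] = (arrive_cost + costs[d], path + [d])
        best = new_best
    return min(best.values(), key=lambda item: item[0])


# Hypothetical per-op costs (arbitrary units), not measurements.
OP_COSTS = [
    {"cpu": 1, "npu": 5},  # pre-processing
    {"cpu": 8, "npu": 2},  # convolution block
    {"cpu": 8, "npu": 2},  # convolution block
    {"cpu": 1, "npu": 5},  # post-processing
]

cheap_link = place_chain(OP_COSTS, transfer_cost=1.5)
costly_link = place_chain(OP_COSTS, transfer_cost=10)
```

With a cheap interconnect the plan moves the heavy operations to the accelerator and back; with an expensive one it keeps everything on a single device. That is the trade-off between partitioning and data movement in miniature.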

These software tools and frameworks are themselves beginning to use AI techniques to optimize performance and energy consumption - especially for resource-constrained edge devices. For example, machine learning can model the behavior and characteristics of different processing units and predict the optimal system configuration. AI planning can explore the parameter space and generate efficient execution plans. Reinforcement learning can take feedback and adapt to dynamic environments and changing workloads.
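As a small illustration of the first idea, the sketch below fits a simple linear latency model per device from profiling samples and uses it to predict which device should handle a given input size. Everything here is an assumption made for the example: the sample values are invented, the names `fit_line`, `train`, and `predict_best` are hypothetical, and a production system would likely use richer features and models.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Invented profiling samples: (input size in MB, latency in ms).
SAMPLES = {
    "cpu": [(1, 2.0), (2, 4.0), (4, 8.0)],
    "npu": [(1, 3.5), (2, 4.0), (4, 5.0)],
}


def train(samples):
    """Fit one (intercept, slope) latency model per device."""
    return {dev: fit_line(*zip(*points)) for dev, points in samples.items()}


def predict_best(models, size_mb):
    """Return the device with the lowest predicted latency, plus all predictions."""
    predicted = {
        dev: intercept + slope * size_mb
        for dev, (intercept, slope) in models.items()
    }
    return min(predicted, key=predicted.get), predicted


models = train(SAMPLES)
small_choice, _ = predict_best(models, size_mb=1)
large_choice, _ = predict_best(models, size_mb=8)
```

In this toy data the accelerator has a higher fixed overhead but scales better, so the learned models send small inputs to the CPU and large ones to the NPU.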

EdgeCortix's MERA Software and Compiler Framework for Heterogeneous Computing

At EdgeCortix, we have developed MERA as a Machine-learning Enhanced Runtime Acceleration software and compiler framework designed for heterogeneous computing environments, especially for edge AI solutions. MERA aims to simplify and automate the software development process for heterogeneous systems by providing a unified programming interface, a smart compiler that can generate optimized code targeting a combination of processors, and a runtime that can dynamically manage the execution and adaptation of heterogeneous systems. MERA enables software-defined heterogeneous computing solutions by configuring software and the underlying hardware to match the demands of an application.

EdgeCortix edge AI inference products and technology include hardware, IP, and software in one workflow for AI developers

Leveraging MERA's programming interface, machine learning developers can write application code in a high-level language such as Python or C++, without worrying about the details of each target processor or device. MERA's compiler then analyzes the code and automatically partitions it into segments that can run on different devices, individually or in combination. The compiler applies optimization techniques such as loop unrolling, vectorization, task parallelization, memory management, and data layout transformation to generate efficient code for each device. The compiler also generates an executable file containing each target device's code segments and metadata.
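The partitioning step can be pictured with a short Python sketch. This is not MERA's API or algorithm; it is a generic illustration in which the set of accelerator-supported operations, the operation names, and the functions `assign_device` and `partition` are all hypothetical. Consecutive operations that land on the same device are grouped into one segment.

```python
from itertools import groupby

# Hypothetical set of ops the accelerator can run; anything else falls back to the CPU.
NPU_SUPPORTED = {"conv2d", "relu", "maxpool", "matmul"}


def assign_device(op):
    """Pick a target device for a single op."""
    return "npu" if op in NPU_SUPPORTED else "cpu"


def partition(ops):
    """Group consecutive ops with the same target device into segments."""
    return [(device, list(group)) for device, group in groupby(ops, key=assign_device)]


# A hypothetical detection pipeline, listed in execution order.
pipeline = ["resize", "conv2d", "relu", "maxpool", "nms", "topk"]
segments = partition(pipeline)

for device, segment_ops in segments:
    print(f"{device}: {segment_ops}")
```

Each segment would then be compiled for its own device, and the boundaries between segments are where data has to cross from one processor to another.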

MERA's compiler and runtime are responsible for deploying and executing the executable across a heterogeneous processor mix, selecting the best combination of devices - CPUs, neural processing units (NPUs), or even programmable logic - for each application. They can also monitor workload and environment conditions such as model type, model precision (for example, INT8 or FP16), data size or resolution, and power consumption, and dynamically adjust the configuration and behavior of the heterogeneous system accordingly. For example, the compiler and runtime can switch between different devices or code segments based on model precision or performance criteria, and can partition a deep neural network model, or several such models, into optimal groups of operations for distribution across the available heterogeneous processors.
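A simplified picture of precision-aware and power-aware device selection is sketched below in Python. It is a toy model under stated assumptions, not a description of how MERA decides: the `Device` class, the `select_device` function, and every latency and power figure are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    precisions: frozenset  # numeric formats the device can execute
    latency_ms: float      # hypothetical estimated latency for the model
    power_w: float         # hypothetical power draw while running it


def select_device(devices, precision, power_budget_w):
    """Return the lowest-latency device that supports the precision within the power budget."""
    candidates = [
        d for d in devices
        if precision in d.precisions and d.power_w <= power_budget_w
    ]
    if not candidates:
        raise ValueError(f"no device supports {precision} within {power_budget_w} W")
    return min(candidates, key=lambda d: d.latency_ms)


# Invented device descriptions, for illustration only.
DEVICES = [
    Device("cpu", frozenset({"INT8", "FP16", "FP32"}), latency_ms=40.0, power_w=5.0),
    Device("npu", frozenset({"INT8", "FP16"}), latency_ms=4.0, power_w=10.0),
    Device("fpga", frozenset({"INT8"}), latency_ms=9.0, power_w=7.0),
]

fast_int8 = select_device(DEVICES, "INT8", power_budget_w=12.0)
frugal_int8 = select_device(DEVICES, "INT8", power_budget_w=8.0)
fp32_only = select_device(DEVICES, "FP32", power_budget_w=12.0)
```

The same INT8 model lands on a different device when the power budget tightens, and an FP32 model falls back to the only processor that supports that format.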

MERA also supplies a library of pre-optimized kernels and a set of patterns it recognizes from standard AI operations (such as convolutions, pooling, activations, and masked or multi-head attention) that can run optimally on the computing engines of the heterogeneous processors. The MERA library uses different scheduling and memory allocation techniques that, during compilation, can iteratively optimize a given target metric, such as latency or compute utilization, for a given configuration.
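Pattern recognition of this kind can be sketched in a few lines of Python. The pattern table, the kernel names, and the `match_kernels` function below are hypothetical and are not taken from the MERA library; the sketch only shows the general idea of replacing known operation sequences with a single pre-optimized kernel, preferring the longest match.

```python
# Hypothetical pattern table: op sequence -> name of a fused, pre-optimized kernel.
PATTERNS = {
    ("conv2d", "batchnorm", "relu"): "fused_conv_bn_relu",
    ("conv2d", "relu"): "fused_conv_relu",
    ("matmul", "softmax"): "fused_matmul_softmax",
}


def match_kernels(ops):
    """Greedily replace known op sequences with fused kernels, longest pattern first."""
    patterns = sorted(PATTERNS, key=len, reverse=True)
    result, i = [], 0
    while i < len(ops):
        for pattern in patterns:
            if tuple(ops[i:i + len(pattern)]) == pattern:
                result.append(PATTERNS[pattern])
                i += len(pattern)
                break
        else:
            # No pattern starts here; keep the op as a standalone kernel.
            result.append(ops[i])
            i += 1
    return result


model_ops = ["conv2d", "batchnorm", "relu", "maxpool", "conv2d", "relu"]
kernels = match_kernels(model_ops)
print(kernels)
```

Trying longer patterns first matters: otherwise the three-operation sequence could never match, because the shorter `conv2d`, `relu` pattern would not apply to it and the operations would be left unfused.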

The MERA software and compiler framework supports a wide range of software-defined heterogeneous compute engines. One example is EdgeCortix's SAKURA-I edge inference processor with its runtime-configurable dedicated NPU (based on EdgeCortix's Dynamic Neural Accelerator technology), which can pair with a general-purpose host processor (with Arm, x86, or RISC-V cores) along with an additional discrete FPGA accelerator. MERA enables a machine learning-based end application to be accelerated across this mixture of processing engines under low-power conditions while maximizing performance.

Designers developing with the EdgeCortix MERA software can achieve high-performance, low-power AI inference on EdgeCortix's SAKURA-I edge AI processor without writing low-level code or manually tuning hardware parameters. Moreover, when new types of processors or other third-party devices with a supported interface are added, the existing MERA library can be extended to support the new instruction set architecture.

Created for Software-Defined Heterogeneous Computing

As Moore’s Law sunsets, we see diminishing performance gains from transistor shrinkage, and homogeneous multicore solutions don't solve all computing problems, especially where efficiency is a deciding criterion. Semiconductor engineers are increasingly focused on architectural improvements, moving toward heterogeneous computing where multiple processors (CPUs, GPUs, ASICs, FPGAs, and NPUs) work together to improve performance.

New types of software and compilers are critical to accelerated AI performance. In this context, we introduced EdgeCortix's MERA software and compiler framework, created for deploying real-time, high-performance edge AI solutions. In this new era of software-defined heterogeneous computing, such software is a necessary and flexible tool for developing and deploying AI applications at scale.
