neuASIC 7nm Platform for Machine Learning ASIC Design

The neuASIC platform is a fundamentally new approach to building ASICs for AI and neural network applications.

Until now, hardware accelerators for machine learning have been built primarily with GPUs and FPGAs. The machine learning ASIC platform (MLAP) segment of the market has been underserved because artificial intelligence (AI) and machine learning algorithms change so frequently. These algorithms typically undergo substantial change as they are adapted to the end application, which makes a static, full-custom ASIC platform problematic. An ASIC is widely recognized as delivering the best power and performance and the lowest total cost of ownership, but frequent algorithm changes have kept these benefits out of reach for most.

ASIC Performance Example

ASIC performance vs. time chart

Source: “The Future of Machine Learning Hardware”, Phillip Jama, Sept. 2016

Drawing upon its work over the past three years on the design of AI/2.5D systems and recently announced production qualification for a deep learning ASIC, eSilicon has fundamentally changed the MLAP market segment with the neuASIC™ platform.

Through customized, targeted IP offered in 7nm FinFET technology and a modular design methodology, the neuASIC platform removes the restrictions imposed by changing AI algorithms. The platform includes a library of AI-targeted functions that can be quickly combined and configured to create custom AI algorithm accelerators. With the use of a Design Profiler and AI Engine Explorer, eSilicon-developed and third-party IP can be configured as AI “tiles” via an ASIC Chassis Builder, allowing early power, performance and area (PPA) analysis of various candidate architectures. The neuASIC platform also uses a sophisticated knowledge base to ensure optimal PPA.
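The early PPA comparison of candidate architectures described above can be pictured with a small sketch. This is not the Design Profiler, AI Engine Explorer, or ASIC Chassis Builder; the `Tile` class, the `explore` function, and every number below are hypothetical placeholders chosen only to show the idea of enumerating tile combinations against an area budget and ranking them by throughput per watt.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Tile:
    """Hypothetical tile description; all figures are placeholders."""
    name: str
    area_mm2: float
    power_w: float
    tops: float  # throughput contribution, tera-operations per second


def summarize(config):
    """Sum area, power, and throughput over (tile, count) pairs."""
    area = sum(t.area_mm2 * n for t, n in config)
    power = sum(t.power_w * n for t, n in config)
    tops = sum(t.tops * n for t, n in config)
    return area, power, tops


def explore(tiles, max_count, area_budget_mm2):
    """Enumerate tile counts; return feasible configs, best TOPS/W first."""
    results = []
    for counts in product(range(max_count + 1), repeat=len(tiles)):
        area, power, tops = summarize(list(zip(tiles, counts)))
        # Skip configs with no compute or that exceed the area budget.
        if tops == 0 or area > area_budget_mm2:
            continue
        names = {t.name: n for t, n in zip(tiles, counts)}
        results.append((tops / power, names, area, power, tops))
    results.sort(key=lambda r: r[0], reverse=True)
    return results


# Placeholder tiles for illustration only; not eSilicon data.
EXAMPLE_TILES = [
    Tile("conv_engine", area_mm2=2.0, power_w=1.0, tops=4.0),
    Tile("transpose_mem", area_mm2=1.0, power_w=0.5, tops=0.0),
]

if __name__ == "__main__":
    for eff, names, area, power, tops in explore(EXAMPLE_TILES, 2, 4.0):
        print(f"{names}: {tops} TOPS, {power} W, {area} mm2, {eff:.2f} TOPS/W")
```

A real flow would rely on characterized IP data and far richer models (timing, memory bandwidth, floorplan); the linear sums here are a deliberate simplification.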

neuASIC Platform Architecture

Machine Learning ASIC Platform

The elements of the neuASIC IP library include functions found in most AI designs, resulting in a core architecture that is both optimized and durable with respect to AI algorithm changes. Specific algorithm modifications can be accommodated through minor chip revisions that integrate appropriate AI “tiles”, through modifications of the 2.5D package that integrate appropriate memory components, or through a combination of both.

eSilicon-developed AI-targeted “tiles” include subsystems such as convolution engines, in which multiply-accumulate (MAC) blocks are tightly coupled with AI-optimized memory subsystems to minimize area and power. Special structures have been developed for data transfer across memory subsystems. The library also includes transpose memory, among other functions, as well as the physical interface (PHY) to the HBM stack. Approximately 100 engineers at eSilicon are working on the design and silicon hardening of this AI IP.
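For readers unfamiliar with the operations these tiles accelerate, the following is a minimal software sketch of a convolution built from multiply-accumulate steps, plus a matrix transpose of the kind a transpose memory provides in hardware. It says nothing about how the eSilicon tiles are implemented; the function names are illustrative, and the "convolution" is the unflipped sliding-window form (cross-correlation) commonly used in neural networks.

```python
def mac(acc, a, b):
    """One multiply-accumulate step: acc + a * b."""
    return acc + a * b


def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Each output element is produced by kh * kw MAC operations.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc = mac(acc, image[r + i][c + j], kernel[i][j])
            row.append(acc)
        out.append(row)
    return out


def transpose(matrix):
    """Swap rows and columns, so row-wise data can be read column-wise."""
    return [list(col) for col in zip(*matrix)]


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 0], [0, 1]]
    print(conv2d_valid(image, kernel))
    print(transpose([[1, 2, 3], [4, 5, 6]]))
```

The inner loops show why coupling MAC blocks closely with memory matters: every output value needs a fresh window of input data, so data movement tends to dominate the cost.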

neuASIC Platform AI Tile

AI ASIC platform tile diagram

A typical AI design requires access to large amounts of memory. This is usually accomplished with a combination of customized memory structures on the AI chip itself and off-chip access to dense 3D memory stacks called high-bandwidth memory (HBM). Access to these HBM stacks is accomplished through a technology called 2.5D integration, which employs a silicon substrate to tightly integrate the chip with HBM stacks in a sophisticated multi-chip package. The current standard for this interface is HBM2. The development of customized on-chip memory and 2.5D integration are eSilicon core competencies, and both are required for a successful AI design.
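To give a sense of scale for the off-chip memory path, the sketch below computes theoretical peak HBM bandwidth. The figures are assumptions drawn from commonly cited HBM2 parameters (a 1024-bit interface per stack and up to 2 Gbit/s per pin), not from eSilicon; the function name is illustrative, and sustained bandwidth in a real design will be lower.

```python
def peak_bandwidth_gbytes_per_s(bus_width_bits, pin_rate_gbit_s, stacks=1):
    """Theoretical peak bandwidth in GB/s: width * per-pin rate / 8 bits per byte."""
    return bus_width_bits * pin_rate_gbit_s / 8 * stacks


if __name__ == "__main__":
    # Assumed HBM2 parameters: 1024-bit interface, 2 Gbit/s per pin.
    print(peak_bandwidth_gbytes_per_s(1024, 2.0))            # one stack
    print(peak_bandwidth_gbytes_per_s(1024, 2.0, stacks=4))  # four stacks
```

Under these assumptions a single stack peaks at 256 GB/s, which is why the wide, short connections made possible by 2.5D integration are attractive for memory-hungry AI designs.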

eSilicon built the industry’s first AI ASIC. We are currently engaged with several tier-one system providers and high-profile startups to deploy the neuASIC platform and its associated IP. Initial applications will focus on the data center and information optimization, human/machine interaction and autonomous vehicles.