November 29, 2023

Cloud-Native Gets An Edge with COM-HPC Server Modules and Manycore Arm SoCs

PICMG executive member ADLINK Technology is redefining the performance per watt equation in computer-on-module technology with COM-HPC Server Type modules based on Ampere Computing’s “cloud-native” manycore Arm processors. ADLINK Technology’s business development manager for embedded modules, Richard Pinnow, and Joe Speed, Head of Edge at Ampere Computing, explain.

PICMG: What is a “cloud-native processor” and why does industry need it?

SPEED, AMPERE: Ampere’s founders are the people who created processors for the cloud business at Intel. When you look at cloud, the basic exercise is I take each core of a CPU and sell it to a different customer. This is a gross oversimplification, but you get the idea.

To do this you need privacy – you need to know that one customer can’t see into another customer’s business. You need what’s called “freedom from interference,” so customers can run their own workloads with predictably fast response times regardless of what activity others are doing on the same processor.

Everything is virtualized, so each core is virtualized running in its own OS instance or, if they bought a bunch of cores, their application is containerized running different parts of their application on different cores. You need to be able to do this in an energy-efficient way because it’s not just about having more cores-per-socket, it’s about having more cores-per-rack-per-data center using less power.

I started working with Ampere years ago when I was at ADLINK and for me it was all about robotics, industrial, software-defined vehicles, and autonomous driving. So, I can get a predictably good response time, right? For me that’s like deterministic, real-time, low jitter, right? Freedom from interference? I can have my autonomous driving perception pipeline feeding sensor fusion data to localization, path planning, and control algorithms and I can pin all of those to separate cores and even run different kinds of software on different bundles of cores and have this very deterministic real time behavior.
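As a rough illustration of the core pinning described above, the following minimal sketch assigns each pipeline stage its own core and pins a process to it. It assumes a Linux host, where Python exposes `os.sched_setaffinity`; the stage names and the helpers `assign_cores` and `pin_current_process` are hypothetical and not taken from any Ampere or ADLINK software.

```python
import os

# Hypothetical pipeline stages; the names are illustrative only.
STAGES = ["perception", "localization", "path_planning", "control"]


def assign_cores(stages, available_cores):
    """Map each stage to its own core, failing if there are too few cores."""
    cores = sorted(available_cores)
    if len(cores) < len(stages):
        raise ValueError("not enough cores to give each stage its own core")
    return dict(zip(stages, cores))


def pin_current_process(core):
    """Pin the calling process to one core; return False if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # affinity API is only exposed on some platforms (e.g. Linux)
    os.sched_setaffinity(0, {core})  # pid 0 means the calling process
    return True


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        available = os.sched_getaffinity(0)
    else:
        available = set(range(os.cpu_count() or 1))
    try:
        plan = assign_cores(STAGES, available)
    except ValueError as err:
        plan = {}
        print(f"cannot build a one-stage-per-core plan: {err}")
    for stage, core in plan.items():
        print(f"{stage} -> core {core}")
```

In a real system each stage would run as its own process or thread and call something like `pin_current_process` at startup; deployments commonly combine this with kernel-level core isolation, which is outside the scope of this sketch.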

So for me taking cloud-native to the edge just belongs.

PICMG: What’s driving demand for cloud-native technology?

PINNOW, ADLINK: It’s definitely scalability, cost efficiency, and time-to-market or agility.

Customers often require, from the application point of view, the ability to scale, and these services need to be very adaptive. Cloud-native processing allows this scalability, especially when we’re talking about the emphasis on microservices and containers, which allows for very dynamic scaling no matter the underlying hardware. And that’s not just focused on the traditional server market – it applies to edge devices as well.

In terms of cost efficiency, cloud-native technologies enable customers to pay only when resources are allocated, and the flexibility to allocate and deallocate resources on demand can lead to significant cost savings. The third pillar is time-to-market: it’s very important to rapidly develop PoCs and so on, and cloud-native processing really makes the best use of practices like continuous integration and continuous deployment pipelines. This is getting more and more important for edge devices as well, as it enables faster development cycles and quicker time to market. So the embedded market is adopting the concepts driven by cloud-native processing.

SPEED, AMPERE: There are stats that say 75% of all data is created at the edge, so you have to move the compute to where the data is. If 75% of data is being generated at the edge, you can sure eat a lot of cores pretty quickly, right? And the thing that was really eye opening for a lot of the cloud providers was computer vision because that one is especially greedy in terms of compute resources and communication bandwidth. Backhauling video to the cloud for processing is an expensive proposition. And for a lot of these use cases, by the time it hits the cloud, runs your analysis, and the decision comes back, it’s too late.

And a lot of these use cases talk about “connected” devices, but as you know, we should think of these as “usually connected” or “occasionally connected” devices. If it’s doing a safety function then it has to always work whether the Internet connection is working or not.

We’re Arm-based and we believe in the Arm architecture. A lot of the success of Arm comes from mobile devices that are power constrained by definition. The performance of most Arm products kind of falls off where we begin, and then you get into overlapping with x86, which is hot and hungry compared to what we do on embedded and the edge. With the Ampere Altra we’ve done some recent benchmarks running YOLOv8 AI streams on our chip and it does 4x as many frames per second per watt compared to a top-of-the-line Xeon D processor, so you can support four times as many cameras on the same power budget.
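The step from "4x frames per second per watt" to "four times as many cameras" is simple arithmetic, sketched below. Only the 4x ratio comes from the interview; the baseline efficiency, the power budget, and the per-camera frame rate are hypothetical placeholders chosen to make the numbers easy to follow.

```python
def cameras_supported(power_budget_w, fps_per_watt, fps_per_camera):
    """Whole number of camera streams that fit within a power budget."""
    total_fps = power_budget_w * fps_per_watt
    return int(total_fps // fps_per_camera)


# Hypothetical inputs; only the 4x efficiency ratio is from the interview.
BASELINE_FPS_PER_WATT = 1.0
EFFICIENCY_RATIO = 4.0
POWER_BUDGET_W = 60.0
FPS_PER_CAMERA = 15.0

baseline_cameras = cameras_supported(
    POWER_BUDGET_W, BASELINE_FPS_PER_WATT, FPS_PER_CAMERA
)
efficient_cameras = cameras_supported(
    POWER_BUDGET_W, BASELINE_FPS_PER_WATT * EFFICIENCY_RATIO, FPS_PER_CAMERA
)

print(f"baseline: {baseline_cameras} cameras, 4x efficiency: {efficient_cameras} cameras")
```

With these placeholder values the budget covers 4 cameras at the baseline efficiency and 16 at four times the efficiency, so the camera count scales with frames per second per watt as long as the power budget and per-camera frame rate stay fixed.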

It’s an architecture that’s extremely energy-efficient and then you can scale up to where you’ve got 32, 64, 96, 128, or even 192 cores with our newest product, with each core being very energy efficient and having freedom from interference. They are single-threaded cores and they’re running at a fixed frequency so there’s none of this frequency scaling. That’s the deterministic part so that you get this predictable, low-latency performance. And then as you start loading more cores, everything runs at a fixed, predictable speed.

PICMG: What embedded edge application would make use of 128 cores?

PINNOW, ADLINK: Embedded applications that require 128 cores would typically involve tasks that demand extremely high levels of parallel compute. Some examples are edge devices such as advanced robotics, autonomous vehicles, and industrial automation systems that really benefit from, for instance, real-time computer vision, natural language processing, machine learning, and so on.

These systems need to analyze and react to very complex environments rapidly and do all those tasks in parallel.

For instance, an autonomous drone might need to process multiple video streams to perform object detection on incoming camera data, simultaneously manage flight controls, and make navigation decisions, all in real time. Having 128 cores enables you to assign cores to very specific tasks and do all this in parallel without the “bad neighbor effect” where one core or schedule is impacting the calculation of another task or application. A lot of people are using discrete GPU solutions on the market because they allow you to do a lot of tasks in parallel. Here, you basically have the same with 128 cores of unparalleled resolution, and you can select which core is doing what, when. It’s great.

SPEED, AMPERE: When I was the Field CTO at ADLINK we helped launch this thing called “Project Cassini.” Project Cassini is about cloud server virtualization for embedded systems. My friend Girish Shirasat, who was leading the software-defined vehicle efforts over at Arm, had an idea, “What if we put that in a car? It would be Cassini on wheels.” So we used Ampere’s processor to build the developer platform for this software-defined vehicle platform.

Autonomous driving is an obvious workload for that, but what’s happening is all these automakers are working on future silicon they’ll get in a few years but that doesn’t help them develop today. So we took all this work around Project Cassini and worked out how to make it functionally safe and how to put it into cars. What happened is we developed what’s now the Ampere Altra Developer Platform – a 32- to 128-core Arm workstation for developers of Scalable Open Architecture for Embedded Edge (SOAFEE) – which is a big software-defined vehicle program for automakers and automotive tech companies. The Ampere Altra Developer Platform by ADLINK is the reference platform for that.

Figure 1. ADLINK performed eight weeks of testing on the Ampere Altra COM-HPC developer kit, including thermal shock and vibration to MIL-SPEC and validation out to +85 °C and -45 °C using a fanless heatsink.

Alternatively, take application code that’s been written for Raspberry Pi and things of that ilk. Jeff Geerling and Patrick Kennedy of Serve the Home took a telco AI edge server with an Ampere processor from our friends at Supermicro and benchmarked it against a Raspberry Pi cluster. One of our chips was equal to 100 Raspberry Pi 4s in performance, but the interesting thing is this system with redundant 800 W power supplies was still 22% more energy efficient than the Raspberry Pi 4. We have companies working with us where they need to move so many GB per W at 1 W per core. It’s kind of a brilliant fit for those things.

PICMG: The Ampere Altra is a COM-HPC Server Type module and currently the only ADLINK product that supports a “cloud-native” processor, correct? Why that particular PICMG specification and form factor?

PINNOW, ADLINK: Yes, at the moment, the Ampere offering is solely available on COM-HPC and it’s a perfect match for edge computing systems that prioritize energy efficiency and require a lot of scalability. It easily outperforms other platforms, with roughly three times lower energy consumption. And from the I/O point of view, the CPU provides a lot of PCI Express interfaces, much higher memory capacities, and higher bandwidth interfaces, and COM-HPC imposes no constraints on getting all those signals down to the carrier without a significant loss in signal integrity.

But there’s still a lot of flexibility to interchange from one Ampere Altra COM-HPC module to another or even to an x86 module in case it is needed. And not just the hardware is standardized, but the firmware is as well. I’m talking, for instance, about the IPMI interface you use to remotely manage COM-HPC devices, regardless of whether you’re talking to Arm silicon or x86 silicon. The demand driving this market is that application software is getting more and more independent from the underlying architecture.

Figure 2. The ADLINK Technology, Inc. Ampere Altra COM-HPC Server Type module provides 64 PCIe Gen 4 lanes and six DDR4 channels.

PICMG: COM-HPC was explicitly intended not to be x86-centric. But we haven’t seen many non-x86-based COM-HPC modules make it to market yet. Why do you think that is?

PINNOW, ADLINK: x86 solutions are and will be complemented by more and more Arm offerings. This is market driven. We cannot avoid this. And the Ampere Altra is a good example.

Our entry-level and super-high-end COMs are Arm based already, today, right? Now we’re seeing this Ampere Altra being the most powerful COM solution we have at the moment. A very good question is how fast it will cover the traditionally x86-dominated mid-performance market. It depends on how fast and reliably a customer can port their existing code to execute regardless of the underlying hardware, which is a key trend. But containers, hypervisors, and flexible application frameworks are enablers to support this journey. And we see this happening already today. So I think it’s just a matter of time until we see more and more Arm flavors in the mid-performance range as well.

For more information, visit www.ipi.wiki/products/com-hpc-ampere-altra to find some of ADLINK’s Ampere-based products as well as carrier reference designs, schematics, Ethernet OCP cards and design files, and even bills of materials.

Richard Pinnow is business development manager for embedded modules at ADLINK.

Joe Speed is head of Edge at Ampere.

ADLINK Technology, Inc.
www.adlinktech.com/en

Ampere Computing
https://amperecomputing.com

October 9, 2023

PICMG COM-HPC 1.2 “Mini” Brings PCIe 5.0, USB4 & 10 GbE to Far Edge

WAKEFIELD, Mass., October 9, 2023. PICMG, a leading consortium for developing open embedded computing specifications, has announced the release of the COM-HPC 1.2 “Mini” specification. Measuring just 95 mm x 70 mm, COM-HPC Mini is nearly half the size of the next-smallest COM-HPC form factor and provides a cost-effective, lower power module for autonomous mobile robots, drones, mobile 5G test and measurement equipment, and other far edge applications.

A single, rugged 400-pin connector allows COM-HPC Mini to support communications interfaces such as:

  • 16x PCIe lanes (PCIe 4.0 or PCIe 5.0)
  • 2x 10 Gbps NBASE-T Ethernet ports
  • 8x SuperSpeed lanes (for USB4/Thunderbolt, USB 3.2, or DDI)
  • 8x USB 2.0
  • 2x SATA ports (shared with PCIe lanes)
  • 1x eDP
  • 2x DDI

The 1.2 specification defines a separate FFC connector for MIPI CSI, while its 400 pins also support signals such as Boot SPI and eSPI, UART, CAN, Audio, FUSA, and power management signals. A signal voltage reduction from 3.3 V to 1.8 V on most pins is in line with reduced I/O voltage on the latest low-power CPUs. The input power is limited to a maximum of 107 W at a wide input voltage of 8 V to 20 V, leaving plenty of headroom for performance processors.
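To put the power figures in perspective, the sketch below estimates the input current implied by a 107 W ceiling across the 8 V to 20 V range using I = P / V. This is back-of-the-envelope arithmetic only; the 12 V midpoint is an arbitrary illustrative value, and the specification's actual per-pin current limits are not stated in this announcement.

```python
def input_current_a(power_w, voltage_v):
    """Input current in amperes for a given power draw and supply voltage."""
    if voltage_v <= 0:
        raise ValueError("supply voltage must be positive")
    return power_w / voltage_v


MAX_POWER_W = 107.0           # maximum input power from the announcement
V_MIN, V_MAX = 8.0, 20.0      # input voltage range from the announcement

# 12 V is an arbitrary intermediate value chosen for illustration.
for voltage in (V_MIN, 12.0, V_MAX):
    current = input_current_a(MAX_POWER_W, voltage)
    print(f"{voltage:4.1f} V supply -> {current:.2f} A at {MAX_POWER_W:.0f} W")
```

At full power the module would draw roughly 13.4 A from an 8 V supply but only 5.35 A from a 20 V supply, which is why carrier designs that target the upper end of the power range may prefer the higher input voltages.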

“The COM-HPC size A started at 95 mm x 120 mm, but the market loves the Mini size as well as the performance you get with COM-HPC,” says Christian Eder, Director of Product Marketing at congatec and Chairman of the COM-HPC Working Group at PICMG. “The whole trend of making things smaller and more power-saving was a reaction to market trends, and it will continue.”
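The footprint comparison can be checked with a short calculation. This sketch assumes that the 95 mm x 120 mm Size A quoted above is the "next-smallest" form factor the announcement refers to.

```python
def area_mm2(width_mm, length_mm):
    """Board footprint area in square millimetres."""
    return width_mm * length_mm


MINI_MM = (95, 70)      # COM-HPC Mini dimensions from the announcement
SIZE_A_MM = (95, 120)   # COM-HPC Size A dimensions from the quote above

mini_area = area_mm2(*MINI_MM)
size_a_area = area_mm2(*SIZE_A_MM)
ratio = mini_area / size_a_area

print(f"Mini: {mini_area} mm^2, Size A: {size_a_area} mm^2, ratio: {ratio:.1%}")
```

Under that assumption the Mini occupies 6,650 mm² against 11,400 mm² for Size A, or about 58% of the Size A footprint.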

Mini’s smaller footprint also provides mechanical advantages, such as a 15 mm stack height from the top of a carrier board to the top of a heat spreader stacked on a COM-HPC Mini module. This 5 mm reduction compared to other COM-HPC variants means COM-HPC Mini modules must use soldered memory, which makes them inherently rugged through resistance to shock and vibration and provides direct thermal coupling to heat spreaders.

“The new revision of the specification allows COM-HPC to address additional high-performance applications that require a smaller footprint,” says Doug Sandy, CTO of PICMG. “COM-HPC 1.2 is a great solution that completes the spectrum of solutions of COM Express through COM-HPC Server Modules.”

“The COM-HPC Mini specification leverages the high-speed capabilities and SI performance of existing COM-HPC interconnect solutions,” says Matthew Burns, Global Director of Technical Marketing at Samtec. “Dropping one 400-pin connector enables small form-factors without sacrificing the data throughput demanded at the Far Edge.”

PICMG members ADLINK, congatec, Samtec, SECO, and others have either already released or plan to release COM-HPC 1.2 products in the near future.

The COM-HPC 1.2 specification can be downloaded today for $750 from the PICMG website at www.picmg.org/product/com-hpc-module-base-specification-revision-1-2. A COM-HPC 1.2 Carrier Design Guide is scheduled for release in early 2024.

For more information, visit www.picmg.org/openstandards/com-hpc.

About PICMG

Founded in 1994, PICMG is a not-for-profit 501(c) consortium of companies and organizations that collaboratively develop open standards for high performance industrial, Industrial IoT, military & aerospace, telecommunications, test & measurement, medical, and general-purpose embedded computing applications. There are over 130 member companies that specialize in a wide range of technical disciplines, including mechanical and thermal design, single board computer design, high-speed signaling design and analysis, networking expertise, backplane and packaging design, power management, high availability software, and comprehensive system management.

Key standards families developed by PICMG include COM-HPC, COM Express, CompactPCI, AdvancedTCA, MicroTCA, AdvancedMC, CompactPCI Serial, SHB Express, MicroSAM, and HPM (Hardware Platform Management). www.picmg.org.

February 5, 2014

Why use PICMG Standards?

Our standards reflect the capabilities of our members, which have led to thousands of compliant products in a multi-billion dollar industry.