KServe
vAccel
Container images have become the standard unit of software packaging and deployment. They’re everywhere: in the cloud, on edge devices, and even in AI inference pipelines. Yet, despite the ubiquity of OCI (Open Container Initiative) registries and image formats, there hasn’t been a clean, lightweight C library for fetching and unpacking OCI images.
If you’ve ever tried to debug a PyTorch program on an ARM64 system using Valgrind, you might have stumbled on something really odd: “Why does it take so long?” And if you’re like us, you would probably try to run it locally, on a Raspberry Pi, to see what’s going on… And the madness begins!
Valgrind
That’s what we thought when setting up a BERT-based hate speech classifier. This was part of a broader experiment using vAccel, our hardware acceleration abstraction for AI inference across the Cloud-Edge-IoT continuum.
If you’re into open-source and tech meetups, FOSSCOMM is the event to be at. This year, it was held in Thessaloniki, organized by the open-source community of the University of Macedonia (UoM), and they absolutely crushed it with the setup and vibe!
To facilitate the use of vAccel, we provide bindings for popular languages besides C. Essentially, the vAccel C API can be called from any language that can interface with C libraries. Building on this, we are thrilled to present support for Go.
C
Following up on a successful VM boot on a Jetson AGX Orin, we continue exploring the capabilities of this edge device, focusing on the cloud-native aspect of application deployment.
In 2022, NVIDIA released the Jetson Orin modules, specifically designed for extreme computation at the Edge. The NVIDIA Jetson AGX Orin modules deliver up to 275 TOPS of AI performance with power configurable between 15W and 60W.
Running applications that need hardware acceleration in public clouds remains a challenge, both for end-users and for service providers. The main reasons are the complicated hardware abstractions that acceleration devices expose and the equally complex software stacks that drive these devices.