Initial documentation site for cuframes:
- Landing page (src/pages/index.mdx) — hero, quick example (publisher +
subscriber), comparison table vs naive/DeepStream, honest "early but
production-tested" status
- /docs/intro — full overview
- /docs/getting-started/{install,first-publisher,first-subscriber}
- /docs/concepts/{frame-vs-packet-ring,ownership-modes,sync-vmm-stream}
with mermaid diagrams
- /docs/integration/{ffmpeg-demuxer,ffmpeg-filter,python}
- /docs/reference/{api-c,api-cpp,protocol} — full v4 wire protocol spec
incl. VMM_FDS message, magic 0xCC7C1DCE bump diff
- /docs/faq — comparison vs DeepStream/GStreamer, license, multi-host
limitations
- i18n/ru/ — parallel RU translation (technical terms kept in Latin script, declined with an apostrophe)
Build:
- Docusaurus 3.10.1 + theme-mermaid + search-local
- Follows dagstack-* docs convention (canonical: dagstack-plugin-system-docs)
- Apache-2.0 license; cuframes lib itself remains LGPL-2.1+
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
| title | sidebar_position |
|---|---|
| First publisher | 2 |
First publisher
A minimal publisher that exposes a CUDA-resident ring of 4 NV12 frames at 1920×1080 and writes 10 frames into it. Each frame is filled with a single-byte pattern via cudaMemsetAsync, so a subscriber can later verify the contents end-to-end.
This is a stripped-down version of spike/smoke_v04/smoke_pub.c in the cuframes repo.
Source
/* first_publisher.c — publish 10 NV12 1920x1080 frames, then exit. */
#include <cuframes/cuframes.h>
#include <cuda_runtime.h>
#include <stdint.h>   /* uint8_t */
#include <stdio.h>
#include <time.h>
int main(int argc, char **argv) {
    const char *key = argc > 1 ? argv[1] : "mykey";
    cuframes_publisher_config_t cfg = {0};
    cfg.key = key;
    cfg.width = 1920;
    cfg.height = 1080;
    cfg.format = CUFRAMES_FORMAT_NV12;
    cfg.ownership = CUFRAMES_OWNERSHIP_LIBRARY;
    cfg.ring_size = 4;
    cfg.policy = CUFRAMES_POLICY_DROP_OLDEST;
    cfg.cuda_device = 0;
    cuframes_publisher_t *pub = NULL;
    int r = cuframes_publisher_create(&cfg, &pub);
    if (r != CUFRAMES_OK) {
        fprintf(stderr, "create: %s\n", cuframes_strerror(r));
        return 1;
    }
    cudaStream_t stream;
    if (cudaStreamCreate(&stream) != cudaSuccess) {
        fprintf(stderr, "cudaStreamCreate failed\n");
        cuframes_publisher_destroy(pub);
        return 1;
    }
    for (int i = 0; i < 10; i++) {
        void *ptr = NULL;
        if ((r = cuframes_publisher_acquire(pub, &ptr)) != CUFRAMES_OK) break;
        /* NV12 = Y plane + interleaved UV plane = width*height*3/2 bytes */
        cudaMemsetAsync(ptr, (uint8_t)i, 1920 * 1080 * 3 / 2, stream);
        r = cuframes_publisher_publish(pub, stream, cuframes_now_ns());
        if (r != CUFRAMES_OK) break;
        struct timespec ts = {.tv_nsec = 40000000}; /* 40 ms per frame = 25 fps */
        nanosleep(&ts, NULL);
    }
    cudaStreamDestroy(stream);
    cuframes_publisher_destroy(pub);
    return r == CUFRAMES_OK ? 0 : 1;
}
Walk-through
cuframes_publisher_config_t cfg = {0}; — always zero-initialise. The struct has a _reserved[4] field that must stay zero for forward ABI compatibility.
cfg.key = "mykey" — uniquely names the publisher within the host. It becomes the path component of the Unix socket (/run/cuframes/mykey.sock) and of the POSIX SHM segment (/dev/shm/cuframes-mykey). Two publishers cannot share a key — the second one gets CUFRAMES_ERR_ALREADY_EXISTS.
cfg.format = CUFRAMES_FORMAT_NV12 plus width/height — frame geometry is fixed for the lifetime of the publisher. Subscribers see exactly these dimensions.
cfg.ownership = CUFRAMES_OWNERSHIP_LIBRARY — the library allocates the CUDA ring buffer itself. The alternative, CUFRAMES_OWNERSHIP_EXTERNAL, lets you hand in pre-allocated device pointers (typically from an FFmpeg AVHWFramesContext pool). For details see Concepts → Ownership modes.
cfg.ring_size = 4 — number of frame slots. 2 is the minimum, 4 a reasonable default, 16 the cap. With DROP_OLDEST policy a slow consumer simply misses frames; the publisher never blocks.
cuframes_publisher_acquire(pub, &ptr) — returns a CUDA device pointer to the next writable slot. Valid only until the matching publish() call.
cudaMemsetAsync(ptr, ..., stream) — fill the frame on a CUDA stream of your choice. You do not have to synchronize before calling publish(). The library issues cuStreamSynchronize(stream) inside publish() to flush pending GPU writes, then atomically publishes the sequence number. Subscribers see the data via hardware coherence on a same-GPU DtoD copy — no CUDA events needed. Full rationale: Concepts → Sync: stream sync, not CUDA events.
cuframes_publisher_publish(pub, stream, pts_ns) — make the slot visible to subscribers. The pts_ns is opaque to the library; the recommended source is cuframes_now_ns() (CLOCK_MONOTONIC in nanoseconds).
Cleanup — cuframes_publisher_destroy() closes the socket, unlinks the SHM segment and releases the CUDA pool.
Compile
gcc -O2 -I/usr/local/include -I/usr/local/cuda/include \
-o first_publisher first_publisher.c \
-L/usr/local/lib -lcuframes \
-L/usr/local/cuda/lib64 -lcudart -lcuda
If you built cuframes without cmake --install, point -I and -L at your build/ tree (-I./include -L./build/libcuframes).
Run
./first_publisher mykey
While running, the process owns:
- /run/cuframes/mykey.sock — the handshake / control socket
- /dev/shm/cuframes-mykey — the shared metadata header (SHM)
Both are removed on clean shutdown. If the publisher crashes, stale files may remain; the next start re-creates them.
Next
Open a second terminal and wire up a First subscriber that reads these frames and validates the pattern. For the full API surface see Reference → C API.