cuframes/docs/launch/ffmpeg-devel-rfc.md
gx 612843bd39 docs: launch drafts (Frigate discussion + FFmpeg-devel RFC + Show HN)
3 drafts for upstream visibility (Stage E):
- docs/launch/frigate-integration-issue.md: Discussion on blakeblackshear/frigate
- docs/launch/ffmpeg-devel-rfc.md: RFC patch + cover letter for the ffmpeg-devel ML
- docs/launch/hn-show-post.md: Show HN draft (Stage F)
- docs/launch/README.md: order, checklist, pre-flight notes

See issue #3.
2026-05-19 02:04:42 +01:00


FFmpeg-devel RFC submission

Status: DRAFT, to be reviewed before sending.

Where: ffmpeg-devel@ffmpeg.org (subscribe: https://ffmpeg.org/mailman/listinfo/ffmpeg-devel)

How: generate the patch with git format-patch and send it with git send-email, together with a cover letter. FFmpeg does not take GitHub pull requests; this draft follows the mailing-list patch workflow.


Submission steps

# 1. Configure git send-email (once)
git config --global sendemail.smtpserver smtp.gmail.com
git config --global sendemail.smtpserverport 587
git config --global sendemail.smtpencryption tls
git config --global sendemail.smtpuser YOUR-EMAIL
# Password: via a git credential helper (git send-email does not read
# ~/.netrc on its own) or the interactive prompt

# 2. In the ffmpeg-patched fork, on the n7.1-cuframes branch:
cd /path/to/ffmpeg-patched
git log --oneline n7.1..n7.1-cuframes  # should list exactly one commit

# 3. Generate the .patch files
git format-patch -1 --cover-letter --subject-prefix='RFC PATCH' \
    --output-directory=/tmp/cuframes-rfc \
    n7.1..n7.1-cuframes

# 4. Edit /tmp/cuframes-rfc/0000-cover-letter.patch:
#    - Replace *** SUBJECT HERE *** with the subject line below
#    - Replace *** BLURB HERE *** with the cover-letter body below

# 5. Dry-run
git send-email --dry-run --to=ffmpeg-devel@ffmpeg.org /tmp/cuframes-rfc/*.patch

# 6. Actual send
git send-email --to=ffmpeg-devel@ffmpeg.org /tmp/cuframes-rfc/*.patch
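Before step 5 it can help to check the generated series mechanically. The sketch below is only an illustration: `check_series` is a hypothetical helper name, not part of git or FFmpeg tooling. It assumes the cover letter keeps git's default file name `0000-cover-letter.patch`, and it flags leftover `*** SUBJECT HERE ***` / `*** BLURB HERE ***` placeholders and patches without a `Signed-off-by:` trailer.

```shell
#!/bin/sh
# check_series DIR  (hypothetical helper, not a git command)
# Returns non-zero if any patch in DIR still contains a cover-letter
# placeholder, or if a non-cover patch lacks a Signed-off-by trailer.
check_series() {
    dir=$1
    status=0
    found=0
    for f in "$dir"/*.patch; do
        [ -e "$f" ] || continue          # glob matched nothing
        found=1
        if grep -q -e '\*\*\* SUBJECT HERE \*\*\*' \
                   -e '\*\*\* BLURB HERE \*\*\*' "$f"; then
            echo "placeholder left in $f"
            status=1
        fi
        case $f in
            */0000-cover-letter.patch) ;;  # cover letter: skip sign-off check
            *)
                if ! grep -q '^Signed-off-by: ' "$f"; then
                    echo "missing Signed-off-by in $f"
                    status=1
                fi
                ;;
        esac
    done
    if [ "$found" -eq 0 ]; then
        echo "no .patch files in $dir"
        status=1
    fi
    return "$status"
}
```

One possible use is to gate the dry-run on it: `check_series /tmp/cuframes-rfc && git send-email --dry-run --to=ffmpeg-devel@ffmpeg.org /tmp/cuframes-rfc/*.patch`.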

Subject line

[RFC PATCH 0/1] libavformat/cuframesdec: zero-copy CUDA frame ingest via IPC

Cover-letter body

Hi all,

This RFC adds a new demuxer "cuframes" to libavformat that ingests already-
decoded video frames residing in CUDA device memory, produced by another
process via the libcuframes IPC layer [1].

# Why

In multi-consumer GPU video pipelines (CCTV with multiple analytics
services, multi-stream transcoding farms, ML inference + recording on the
same source) every consumer typically runs its own NVDEC session. With 16
cameras × 25 fps × N consumers, this multiplies NVDEC sessions, context
switches and host<->device PCIe traffic for what is logically the same
decoded frame.

cuframes addresses this by letting one process decode (e.g. via FFmpeg's
existing CUDA hwaccel) and publish the decoded frames into a small CUDA
ring buffer; other processes import the buffer via cudaIpcOpenMemHandle
and consume the same VRAM allocation without redecoding or copying.

The libavformat demuxer in this RFC is the consumer side: it exposes the
remote ring buffer as a regular AVFormat input source, so any downstream
FFmpeg filter chain or muxer can use it transparently.

# Scope of this patch

  libavformat/cuframesdec.c   — new demuxer
  libavformat/allformats.c    — registration
  configure                   — --enable-libcuframes option

The demuxer currently copies each NV12 frame to host memory with
cudaMemcpy2DAsync and outputs it as rawvideo, so in v0.1 only the
inter-process hand-off is copy-free. A planned v0.2 follow-up will emit
frames directly as a CUDA AVHWFramesContext (true zero-copy into a
CUDA-aware filter chain); see [2].

# Out-of-tree library

libcuframes (the producer side, the IPC handshake, the ring-buffer
allocator) lives out-of-tree at [1], licensed LGPL-2.1+ to match FFmpeg.
The demuxer links against libcuframes via pkg-config.

This mirrors the model used by other libavformat components that wrap
third-party libraries (libsmbclient, librist, libsrt, etc.).

# Testing

- Unit smoke tests in the libcuframes repo (1 publisher × 4 subscribers ×
  2000 frames @ 120 fps — 0 torn frames, 0 gaps).
- E2E test against a real RTSP IP camera (Dahua HEVC 1920×1080, 25 fps,
  100/100 frames, avg_fps=25.03).
- ~24h production deployment serving Frigate (object detection) and a
  custom analytics pipeline from a single decoder, single NVDEC session.

# Prior art and what this is not

There is no in-tree mechanism for sharing decoded GPU frames between
unrelated FFmpeg processes. Existing alternatives are:
  - CUDA hwdownload + hwupload (defeats the purpose — round-trips via PCIe)
  - DeepStream Gst-nvstreammux (NVIDIA, closed, GStreamer-only)
  - Vendor-locked NVENC/NVDEC pooling helpers

cuframes is intentionally minimal: ring buffer + handshake + IPC handles.
No transcoding logic, no policy.

# Limitations / known issues for review

  - NVIDIA GPUs only (CUDA IPC is vendor-specific).
  - Linux only (POSIX SHM + AF_UNIX sockets).
  - Producer and consumer must share the same CUDA device (CUDA IPC limit).
  - NV12 only in v0.1; other pixel formats are roadmap items.
  - Driver ≥ 525, CUDA toolkit ≥ 12.0 (≥ 13.0 recommended).

# Feedback wanted

  1. Is the libavformat demuxer the right home for this, or would a
     hwcontext_cuda extension + a thin demuxer be a better split?
  2. Are folks open to an out-of-tree library dependency under
     --enable-libcuframes, given the precedent of librist/libsrt?
  3. Naming: "cuframes" vs "cudaipcframes" vs something else?

Happy to iterate. Patch follows.

[1] https://git.goldix.org/gx/cuframes  (LGPL-2.1+)
[2] https://git.goldix.org/gx/cuframes/issues/2  (v0.2 zero-copy plan)

Signed-off-by: <YOUR NAME> <YOUR EMAIL>

Notes for review

  • The subject prefix is [RFC PATCH] because this is a design discussion, not "merge this now". If you get constructive feedback and prepare a revision, the next one will be [PATCH v2].
  • Sign-off is mandatory; otherwise the patch will be rejected at the tooling level.
  • Do not mention "production-ready", "battle-tested", or "30 days of uptime": the ffmpeg-devel list is highly allergic to marketing tone. Numbers are fine; epithets are not.
  • Do not CC maintainers uninvited; those who are interested will reply. To speed things up you can CC Timo Rothenpieler (CUDA hwaccel maintainer), but only after the first revision and only if the thread stays silent.
  • Likely objections:
    • "Why not Vulkan video?" Vulkan video has no cross-process sharing API comparable to CUDA IPC. Vulkan external memory works with DMA-BUF on Linux but requires DRM device sharing, which is also non-trivial and is material for a separate RFC.
    • "Why a new demuxer, not a filter?" Because the producer already lives outside this FFmpeg process; a demuxer is where AVFormat reads from an external source. A filter pulls from an upstream AVStream, and here there is no upstream.
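The revision and sign-off notes above can be sketched as one command wrapper. This is a sketch under assumptions, not a prescribed workflow: `make_revision` is a hypothetical helper name, it is meant to be run inside the fork's work tree, and it relies only on standard `git format-patch` options (`--reroll-count` for the `vN` prefix, `--signoff` for the trailer).

```shell
#!/bin/sh
# make_revision BASE OUTDIR VERSION  (hypothetical helper)
# Regenerates the series BASE..HEAD as "[PATCH vN ...]" with a cover
# letter and a Signed-off-by trailer taken from the committer identity.
make_revision() {
    base=$1
    outdir=$2
    version=$3
    git format-patch --cover-letter --signoff \
        --reroll-count="$version" \
        --output-directory="$outdir" \
        "$base"..HEAD
}
```

For example, `make_revision n7.1 /tmp/cuframes-rfc-v2 2` would write `v2-`-prefixed patch files. When sending them, `git send-email --in-reply-to=<message-id>` is one way to thread the revision under the original RFC.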

Alternative path: ffmpeg-user (lighter)

If going to -devel with a patch straight away feels too heavy, you can start with an awareness email to ffmpeg-user@ffmpeg.org:

Subject: ANNOUNCE: libcuframes — zero-copy CUDA frame sharing for FFmpeg pipelines

[3 paragraphs: what / why / link to repo]

A patch for libavformat will be sent to the -devel list after feedback from users.

This is a soft launch: less risk of rejection and a better chance of finding early adopters who will later support the RFC. I recommend taking this step first.