cuframes

Author	SHA1	Message	Date
gx	7f4bdfcaab	Merge pull request 'Python bindings (pybind11) - Phase 0 v1' (#7) from feat/python-bindings into main	2026-06-13 21:34:29 +01:00
gx	afc2dd7fff	python: DLPack + health stats + CUDA stream + docs (tasks #199-#202) #199 DLPack export: - frame.dlpack_y() / .dlpack_uv(): explicit multi-plane access for NV12 - frame.__dlpack__() / __dlpack_device__(): protocol for torch/cupy - The capsule deleter correctly holds a refcount on frame_keep_alive and releases the shape/strides arrays. The CUDA pointer is owned by the frame. #200 Health/stats counters: - frames_received, timeouts, errors: per-call counters - last_seq, gap_count: proxy for the drop count (NEWEST_ONLY mode) - last_frame_pts_ns - stats(): snapshot dict for MQTT health publish - counted in the pybind layer because the C API does not expose ring_occupancy #201 Per-subscriber CUDA stream + thread-safety: - consumer_stream kwarg in subscribe(): int (cudaStream_t pointer) - subscriber.consumer_stream property - Thread-safety contract in the CuframesSubscriber docstring - next_frame() passes consumer_stream_ to cuframes_subscriber_next #202 Smoke test + docs: - 10/10 pytest passed (extended with 2 more tests for consumer_stream) - docs/python.md (~250 lines): quick start, API reference, integration with PyTorch/CuPy, reconnect-loop pattern, per-stream usage, pitch alignment, thread-safety, error taxonomy, backpressure, Phase 0 limitations Verify build + tests: cmake -B build-python -DBUILD_PYTHON_BINDINGS=ON cmake --build build-python -j pytest python/tests/ -v # 10/10 Closes Phase 0 issue gx/cuframes#6. Unblocks goldix-smart-home/yolo-world-detector Phase 1. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-13 21:33:21 +01:00
gx	5d1eaedb38	python: CuframesSubscriber + CuframesFrame wrapper (task #198) Implements the subscriber-side wrapper over the cuframes_subscriber_* and cuframes_frame_* C API. What was added: - CuframesFrame: owning RAII wrapper over cuframes_frame_t* - properties: cuda_ptr, format, width, height, pitch_y, pitch_uv, seq, pts_ns, released - release() is idempotent - context manager (__enter__/__exit__): releases on exit - after release(), property access raises CuframesError - CuframesSubscriber: owning RAII wrapper over cuframes_subscriber_t* - constructor with key/consumer_name/mode/cuda_device/connect_timeout_ms - next_frame(timeout_ms) → CuframesFrame - close() is idempotent - context manager - GIL released on blocking calls (create, next_frame) - subscribe(): module-level factory shortcut Architectural decisions: - GIL release via py::gil_scoped_release on subscriber_create and _next, so that other Python threads can run while waiting for a frame - consumer_stream is passed as nullptr in Phase 0 (default stream); the per-subscriber stream comes in task #201 - The frame holds a raw pointer to the subscriber, refcounted on the Python side; if the subscriber is destroyed first, frame.release() becomes a no-op Smoke tests extended to 8: added checks of the exposed API and of error mapping when subscribing to a nonexistent publisher. Verify: pytest tests/test_smoke.py (8/8 passed). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-13 21:23:42 +01:00
gx	7b6d43efeb	python: fix exception hierarchy - do not call .attr("__class__") py::exception<T>(...) already returns the Python class object. The extra .attr("__class__") yielded the metaclass (type), so the issubclass() check returned False for every sub-exception. Verify: pytest tests/test_smoke.py (5/5 passed). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-13 21:19:03 +01:00
gx	a7da4ea728	python: skeleton pybind11 bindings (issue #6 task #197) Skeleton of the `cuframes` Python package: - python/pyproject.toml: scikit-build-core config - python/CMakeLists.txt: pybind11 module via FetchContent - python/src/_native.cpp: module entry, error taxonomy, enum mirrors (PixelFormat, SubscriberMode), version - python/cuframes/__init__.py: re-export of the public API - python/tests/test_smoke.py: smoke tests without a real subscribe - python/README.md: status + build instructions - CMakeLists.txt: includes python/ when BUILD_PYTHON_BINDINGS=ON The real subscriber/frame wrapper follows in the next commits (tasks #198-#202). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-13 12:59:04 +01:00
gx	655649f4d8	cmake: use PROJECT_SOURCE_DIR instead of CMAKE_SOURCE_DIR When cuframes is built as a subproject of a parent CMake project (add_subdirectory), CMAKE_SOURCE_DIR points to the parent's root, not to cuframes. As a result, target_include_directories for cuframes received a wrong path and compilation failed with fatal error: cuframes/cuframes.h: No such file or directory PROJECT_SOURCE_DIR resolves to the directory of the project() call, so it always points to the cuframes root regardless of how the project is included. The standalone build behaves as before, since both paths are identical.	2026-06-03 04:27:24 +01:00
Claude Opus	78824c4ed1	docker: +mosquitto-clients in the runtime image Needed by the loop-publisher.sh wrapper in the cctv stack for heartbeat and alert MQTT publishing. Adds 4.5 MB; the runtime image is now ~590 MB. Without it the wrapper fails silently on mqtt_alert/mqtt_state (but the retry loop still works). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-02 17:59:56 +01:00
gx	4862247fe2	v0.4: VMM + POSIX FD - namespace decoupling (no pid share required) Replaces cudaMalloc + cudaIpcGetMemHandle with cuMemCreate (VMM) + cuMemExportToShareableHandle(POSIX_FILE_DESCRIPTOR). FDs are passed to the consumer via sendmsg(SCM_RIGHTS) during the handshake. Frigate (s6-overlay does not allow PID sharing) and any other consumer work WITHOUT a pid namespace share; only a volume mount of the unix socket at /run/cuframes and an IPC share for the /dev/shm header are needed. Sync: cudaEventRecord + IPC events → cuStreamSynchronize in do_publish. The producer waits ~1 ms for the stream to flush, then does atomic_store(seq). The consumer reads seq with memory_order_acquire and copies DtoD without an event wait; HW coherence is guaranteed on a single GPU. ABI break (agreed with the user): - magic 0xCC7C1DCC → 0xCC7C1DCE (old consumers fail cleanly) - protocol V3 → V4 - libcuframes.so.0 SOVERSION stays, but .so.0.3.0 → .so.0.4.0 - EXTERNAL ownership removed (VMM requires cuMemCreate-allocated memory; an arbitrary cudaMalloc pointer cannot be exported as a POSIX FD) - cuframes-rtsp-source switched to LIBRARY mode + one D2D memcpy into the acquired slot (small overhead: the publisher already did such a D2D copy from the FFmpeg hwframe pool into the EXTERNAL pool before) Size: granularity is 2 MB on the 5090 → NV12 1920×1080 (~3.1 MB) rounds up to 4 MB, +1 MB per slot × 16 × 4 cameras = +64 MB VRAM. Tolerable. The packet ring (cuframes_packets://) is NOT affected: it is a separate SHM with its own magic and works as before. PoC + smoke in spike/: - vmm_fd_pingpong/: minimal cuMemCreate+FD round-trip - smoke_v04/: full publisher+subscriber, 100/100 frames without pid share Base image: Dockerfile.runtime → CUDA 12.4 (was 13.0). Matches the prod pipeline + Frigate base; otherwise libcudart conflicts at load.
Compose stack (localhost-infra repo), parallel commit: - removed pid: container:cuframes-pub-parking from subscribers - image tags: gx/cuframes:0.4, gx/cuda-grid-pipeline:phase8, gx/frigate:cuframes-v0.4 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 20:13:31 +01:00
gx	d646f5a4e4	v0.3.3: consumer post-sync verify even for v0.3 per-slot events Bug: cudaEventRecord(event[slot]) overwrites the previous state on every publish. When the producer wraps the ring (~640 ms at ring=16), event[slot] is re-recorded for the new content. The consumer's pending cudaStreamWaitEvent is satisfied by the new signal, so the consumer reads slot[slot_idx] believing it holds target_seq but actually gets the seq+ring_size content (stale-by-1-wrap drift). After 50k+ wraps in a long-running pipeline (9 h uptime) the drift accumulates: the output stream has 60-70% duplicate frames (vs 10% right after a restart). Symptom: the TV picture periodically freezes for 1-2 s. Encoder fps=25 is stable (content duplicates with the same PTS advance), but motion is choppy at a real 8-9 fps. Fix: unconditional post-sync verify (atomic re-read of slot.seq after the event wait). If a producer wrap occurred, slot.seq != target_seq → continue to the new target_seq. Cheap (one atomic load); correctness > perf. Verified: after deploying with a fresh pipeline, an 18-s sample shows 4% duplicates (vs 8.4% with the same setup but without the fix). Proper v0.4 fix: per-slot + per-publish event pool with a unique handle per cycle. The current v0.3.3 is a sufficient mitigation for the current production scale. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> v0.3.3	2026-05-24 20:27:00 +01:00
gx	becfbebc78	cuframes-rtsp-source: + --policy + --ack-timeout-ms CLI flags Opt-in for the STRICT_WAIT policy (the default remains DROP_OLDEST). Use case for STRICT_WAIT: frame integrity is critical (e.g. recording, frame-accurate analytics). The producer waits for an ack from all subscribers before wrapping the ring → no torn frames. Trade-off: a slow consumer delays everyone (default 200 ms timeout, after which the subscriber is dropped from the bitmap). Use case for DROP_OLDEST (default): low-latency real-time display (TV grid). The producer wraps freely; v0.3 per-slot CUDA events close the race without waiting. Validation: policy=wait + ack-timeout-ms<=0 would hold a dead consumer forever, so it triggers a warning and is forced to the 200 ms safe default. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> v0.3.2	2026-05-24 08:47:14 +01:00
gx	656e36e9b0	v0.3.1: per-subscriber monitor thread - fix bitmap leak Bug: handshake_subscriber assigned a bit and activated the slot but did NOT track client_fd. When a subscriber container exited, the socket closed on the client side but the producer did not detect it → the bit stayed set forever → after 32 connections: subscribe_create('cam-X'): too many subscribers (max 32). Symptom in production: each pipeline recreate accumulated 1 stale subscriber. After 4-5 recreate operations the publishers stopped accepting new pipelines → "too many subscribers" crash loop. Fix: after a successful handshake, spawn a detached pthread that monitors the socket via blocking recv(). recv() returns 0 (EOF) when the other side closes; the monitor then clears the bit (subscriber_bitmap &= ~(1<<bit)), sets state[bit] = 0, closes the fd, and exits. Cost: 1 thread per active subscriber. Max 32 threads, a small overhead. Threads are detached, no join needed. Stress test: 5x pipeline recreate without a single "too many subscribers" error. Before: 2-3 recreates → bitmap overflow. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> v0.3.1	2026-05-24 08:00:41 +01:00
gx	8c7abbc4e8	v0.3: per-slot CUDA events - closes the TOCTOU race without crutches Protocol bump V2→V3: + shm header: cudaIpcEventHandle_t slot_event_handles[CUFRAMES_MAX_RING] + producer creates ring_size events (instead of one global) + producer.do_publish records event[slot] (instead of pub->event) + consumer opens all slot events on subscribe + consumer waits on event[slot_idx] specifically (instead of the global producer_event) Backward compat: - Legacy pub->event is kept and ipc_event_handle is still exported, so v0.2 consumers see it and work the old way (with the post-sync verify hack from `517107d`). - The v0.3 consumer auto-detects proto_version >= 3 and falls back to legacy if cudaIpcOpenEventHandle on a slot fails (graceful degradation). Effect (15-s sample on Phase 7 single-cam, motion): v0.1 production: dup runs 34.7%, max 14 frames (560 ms freeze) v0.2.1 fix: dup runs 10%, max 6, 0 back-jumps detected v0.3 per-slot: dup runs 1.9%, max 5, 3 back-jumps (likely encoder static-content artifacts, not a real race) shm header size: 7424 → 8448 bytes (+1024 for slot_event_handles). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> v0.3.0	2026-05-22 09:23:53 +01:00
gx	517107d741	libcuframes: fix TOCTOU race in consumer slot read Bug: the producer signals a single global cudaEvent for the whole ring (one per producer). The consumer waits on this event after slot_seq validation, but the event corresponds to the LAST published frame, not to slot[target_seq]. If the producer wraps the ring during the event wait (ring=6 = 240 ms window), the slot already contains next-generation data and the consumer returns a torn/stale frame. Symptom in production: the video stream periodically shows a momentary back-jump: the camera OSD timestamp jitters and moving cars briefly teleport backwards. Cluster md5 analysis does NOT catch it (frame contents are still unique, just from the wrong epoch). Fix: post-sync verify. After cudaStreamWaitEvent / cudaEventSynchronize, re-check slots[slot_idx].seq == target_seq. If the producer overwrote it, continue the outer loop with a new target_seq. This closes the race window between slot validation and the event sync return. Still open: - downstream GPU access after the frame fill (consumer side): the producer may wrap during this. Mitigation: STRICT_WAIT policy in the publisher + ack discipline in the consumer (the cuframes_release_frame ack already works). - a bigger ring size lowers the wrap frequency (240 ms → 1.2 s at ring=30). Test: after deploying to cuda-grid-pipeline (Phase 7 single cam), the camera OSD clock no longer jitters (it used to jitter every ~16 s). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> v0.2.1	2026-05-21 22:27:39 +01:00
gx	4d54173bb2	roadmap: vf_cuda_grid split out into a separate product, gx/vf-cuda-grid	2026-05-19 20:39:47 +01:00
gx	52fb2ad722	benchmarks: actual measured VRAM + network bandwidth (tcpdump-based) VRAM breakdown (nvidia-smi pmon): - 4 publishers = 4.4 GB (FHD + 2688x1520 ring buffers + NVDEC) - cctv-backend = 1.0 GB - frigate embeddings_manager = 1.6 GB - frigate detector:onnx = 0.6 GB - Total cuframes-stack = ~7.7 GB Network (10-s tcpdump capture from the camera subnet to R9): - Measured: 31.5 Mbps (everything, including go2rtc on-demand, ONVIF) - cuframes core: ~16 Mbps (4 publishers × main HEVC) - ONVIF/RTSP keepalives: ~1-2 Mbps - Without cuframes, the same 4 cam × 3 consumer setup would be ~45-50 Mbps Source: production deploy 2026-05-19 measurement.	2026-05-19 19:22:53 +01:00
gx	3779175737	docs(benchmarks): production v0.2 deploy metrics (4 cam × 3 consumer) Real-world numbers from the production deploy on 2026-05-19: - RTSP to cameras: 12 → 4 (−67%) - NVDEC sessions: 8 → 4 (−50%) - Camera bandwidth: 34 → 16 Mbps (−54%) - PCIe D2H copies: 346 MB/s → ~0 (−100% via zero-copy CUDA IPC) - Frigate direct RTSP: 8 → 0 (−100%) Plus live nvidia-smi metrics, what was saved vs. what was not, and a projection table for other setups (8/16 cam × 2/3/4 consumer). For promotional material: public-facing claims based on the measured deploy.	2026-05-19 19:07:16 +01:00
gx	98d1bb5296	release: v0.2.0 - encoded packet ring - CHANGELOG: [Unreleased] → [0.2.0] - 2026-05-19 - CMakeLists VERSION 0.1.0 → 0.2.0 (both root + libcuframes) - CUFRAMES_VERSION_MINOR: 1 → 2 in include/cuframes/cuframes.h See issue #2 (closed) + PR #4 (merged). v0.2.0	2026-05-19 17:49:14 +01:00
gx	5536d23992	Merge pull request 'v0.2: encoded packet ring' (#4) from v0.2-encoded-packets into main	2026-05-19 17:47:10 +01:00
gx	2b94742df4	ci: retry + explicit Node 20 version check in bootstrap Symptom (run #1826 failed on u4-runner): the bootstrap step silently installed Node 12 (Ubuntu default) instead of Node 20 from NodeSource → actions/checkout@v4 cannot be parsed (ES2022 static blocks). Cause: curl ... setup_20.x on a slow network (u4 over VPN) times out/fails silently, and apt install falls back to the default Ubuntu nodejs (Node 12), without an error. Fix: - curl --retry 3 --retry-delay 5 --connect-timeout 30 - retry loop around the NodeSource setup (3 attempts) - explicit verification that the major version is >= 18 after install; fail with exit 1 if Node < 18 was installed Applies to both jobs (cmake-build and filter-build). Related: PR #4 (v0.2), run #1826 fail.	2026-05-19 17:31:33 +01:00
gx	fca07bf669	test+docs: packet ring stress test + Frigate dual-input guide (v0.2 Step 6) Tests: - libcuframes/tests/test_packet_ring.c, 2 scenarios: 1) normal flow: 1 pub × 1 sub × 2000 packets, varied sizes, GOP=30, payload integrity check (seq in the first 8 bytes + pattern). PTS monotonicity, first KEY seq, no data errors. 2) slow consumer (10 ms delay): publisher at 200 fps, the subscriber must detect OVERRUN and the library resyncs on a keyframe; verifies received >10 even with a very slow consumer. - libcuframes/tests/CMakeLists.txt: add_test packet_ring_basic. Docs: - CHANGELOG.md: new [Unreleased] section with full v0.2 highlights and explicitly declared limitations (sub-stream, audio, codec change → v0.3). - docs/integrations/frigate.md: new section "v0.2: dual-input (detect + record via a single RTSP)" with config example, requirements, trade-offs. Related: #2, PR #4. Step 6 (final) before removing draft status.	2026-05-19 17:08:17 +01:00
gx	8cd96721ff	feat(rtsp-source): packet ring publishing (v0.2 Step 4) - cuframes::Publisher (C++ wrapper): added enable_packets(), set_codec_extradata(), publish_packet() methods. - cuframes-rtsp-source: new CLI flag --enable-packet-ring. When set, after opening the stream it calls pub.enable_packets(codec_id) + set_codec_extradata from vstream->codecpar->extradata. - In the main loop: after av_read_frame and before avcodec_send_packet, the packet is published to the packet ring with pts/dts converted from stream_tb to ns, and AV_PKT_FLAG_KEY/CORRUPT/DISCONTINUITY → CUFRAMES_PKT_FLAG_*. Test: cuframes-rtsp-source --rtsp rtsp://... --key cam1 --enable-packet-ring # frame consumers keep working via cuframes:// (as in v0.1) # record consumers can take packets via cuframes_packets:// (Step 5) Related: #2, PR #4.	2026-05-19 16:45:29 +01:00
gx	4cb0321a6f	feat(api): public C API for packet ring (v0.2 Step 3) Public functions in include/cuframes/cuframes.h: - cuframes_publisher_enable_packets(opts): activates the ring on an existing publisher; default sizing (64 slots, 8MiB data, 2MiB max). - cuframes_publisher_set_codec_extradata(data, size): SPS/PPS bytes. - cuframes_publisher_publish_packet(data, size, pts, dts, flags) - cuframes_subscriber_enable_packets(): opens the packet shm on the subscriber side. - cuframes_subscriber_next_packet(pkt_out, timeout_ms) with 1 ms polling. - cuframes_packet_data/size/pts/dts/flags/seq accessors. - cuframes_subscriber_release_packet() - cuframes_subscriber_get_codec_params() Internal: - producer.c: extended struct cuframes_publisher (has_pkt_ring, max_packet_size, pkt_ring); cleanup in destroy(); enable_packets() bumps proto_version=2 in the frames header. - consumer.c: extended struct cuframes_subscriber (has_pkt_ring, pkt_ring, last_packet_seq, packet_obj); single-packet pattern (like frame_obj: busy flag, buffer reuse). enable_packets() starts from last_keyframe_seq-1 for late subscriber resync. On PACKET_OVERRUN it automatically resyncs to last_keyframe and returns ERR to the caller to signal a discontinuity. Related: #2, PR #4.	2026-05-19 16:27:05 +01:00
gx	bd7fd95fef	feat(libcuframes): packet ring buffer implementation (v0.2 Step 2) Implementation of the encoded packet ring per docs/protocol.md §10. Files: - internal.h: cuframes_pkt_slot_t (64b packed), cuframes_pkt_header_t (0x1040 fixed header), cuframes_pkt_ring_t handle, constants for default sizing, packet flags, helper inline functions for slot/data pointer arithmetic. - packet_ring.c (new, ~290 LOC): create/open/publish/read/destroy. Stale recovery is symmetric to the frames SHM (pid liveness check). The seqlock pattern protects the subscriber against overrun mid-read (post-check of seq after the copy). Wraparound memcpy helpers for the variable-length data ring. - utils.c: cuframes_internal_pkt_shm_name helper + strerror entries. - cuframes.h: 4 new error codes (PACKET_OVERSIZED, NO_PACKET_RING, NO_CODEC_PARAMS, PACKET_OVERRUN). - CMakeLists.txt: src/packet_ring.c added to sources. The API is internal (cuframes_internal_pkt_ring_*); the publicly exposed functions will come in Step 3 (cuframes.h API extension). Related: #2 (v0.2), PR #4 (draft).	2026-05-19 16:11:42 +01:00
gx	ad75aa9624	docs(protocol): v0.2 - encoded packet ring spec (§10) Full wire-protocol spec for the encoded packet ring: - Separate SHM /dev/shm/cuframes-<key>-packets (variable-length) - Backward-compat with v1: proto_version=2 publishers accept v1 subscribers - HELLO_REQ/HELLO_RESP extension via reserved bytes, without breaking the v1 layout - Codec extradata (SPS/PPS) in the shared header - Late subscriber → keyframe-aligned start (initial_packet_seq) - Seqlock pattern to protect against overrun mid-read - API extension: publish_packet, next_packet, get_codec_params - 4 new error codes (OVERSIZED, NO_PACKET_RING, NO_CODEC_PARAMS, PACKET_OVERRUN) Related: #2	2026-05-19 16:04:00 +01:00
gx	264b9d59db	roadmap: future ideas - gst-cuframes-src + vf_cuda_grid Two ideas added to a new "Future ideas" section (no ETA): - gst-cuframes-src: GStreamer source element for DeepStream / regular GStreamer pipelines. The counterpart of the FFmpeg demuxer for a different stack. - vf_cuda_grid: FFmpeg filter with runtime grid composition entirely on the GPU. Replaces the custom C++ GridComposer in cctv-processor (see gx/cctv#22). Turns cuframes into a GPU-native video routing platform. Both ideas are awaiting planning; scope is v0.5+.	2026-05-19 15:58:49 +01:00
gx	d2bae7d0fd	ci: clone ffmpeg-patched via GITHUB_SERVER_URL (for the VPN runner) The hard-coded URL git.goldix.org does not work on u4-runner: gitea is reachable there only over VPN (10.8.0.6:3222). Use the runner's variable instead: on R9 = 192.168.88.23:3222, on u4 = 10.8.0.6:3222.	2026-05-19 02:55:14 +01:00
gx	eb3c058341	ci: smoke test workflow to verify the u4 runner over VPN	2026-05-19 02:12:38 +01:00
gx	612843bd39	docs: launch drafts (Frigate discussion + FFmpeg-devel RFC + Show HN) 3 drafts for upstream visibility (Etap E): - docs/launch/frigate-integration-issue.md: Discussion on blakeblackshear/frigate - docs/launch/ffmpeg-devel-rfc.md: RFC patch + cover letter for the ffmpeg-devel ML - docs/launch/hn-show-post.md: Show HN draft (Etap F) - docs/launch/README.md: order, checklist, pre-flight notes See issue #3.	2026-05-19 02:04:42 +01:00
gx	bcc1d29ae8	ci: clone FFmpeg from the local gitea fork (instead of the unstable upstream github clone) git clone github.com/FFmpeg/FFmpeg dropped after 11 min on a weak internet connection (RPC HTTP/2 CANCEL). The local gx/ffmpeg-patched n7.1-cuframes branch already has the patch applied, so the clone is instant with no internet round-trip. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-19 00:40:40 +01:00
gx	fbe1d18c39	docs: troubleshooting guide + production notes - docs/troubleshooting.md: 13 sections on real pitfalls we ran into: cudaIpcOpenEventHandle invalid device context (pid namespace), s6-overlay vs pid share, scale_cuda missing (cuda-llvm + stdbit.h glibc 2.36), libcuframes not found install paths, ffbuild/ missing source, GMP no working compiler (long-long reliability), zlib.net deprecated URL, RTSP/RTP UDP docker NAT, gitea actions Node version - docs/architecture.md: Appendix A "Production deployment notes" with real observations after a 24h+ run: what was confirmed, what we reworked, what we missed - docs/requirements.md: production deployment matrix + Docker namespace requirements table (cross-container CUDA IPC requires 5 conditions) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-19 00:37:13 +01:00
gx	022a198c33	ci: same Node 20 bootstrap for the filter-build job (as in cmake-build) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-19 00:05:59 +01:00
gx	611918ce7a	ci: install Node 20 from NodeSource (apt nodejs = Node 12, too old for actions/checkout@v4) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 21:56:33 +01:00
gx	00fb3e9528	ci: preinstall node+git in the CUDA container (actions/checkout requires node) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 21:47:25 +01:00
gx	4a6a6f4a6c	ci: gitea Actions workflows (build, release) + README badges - .gitea/workflows/build.yml, on push/PR: * cmake build on the CUDA 12.4 devel image (Ubuntu 22.04 base) * compile-only smoke (no GPU needed): libcuframes.so + tools + examples * install-prefix layout verify (headers + libs in the right paths) * filter/: clone FFmpeg n7.1 + apply patch + build minimal patched ffmpeg, verify cuframes demuxer registered - .gitea/workflows/release.yml, on tag v: build runtime Docker image, push to git.goldix.org/gx/cuframes:<version> * build source tarball cuframes-<version>.tar.gz as an artifact - README.md badges: build status, release version, license Runner: gitea act_runner v0.4.1 on R9-88.23; labels ubuntu-22.04 / ubuntu-24.04 are available via docker.gitea.com/runner-images. The CUDA devel image is nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04 (already cached on the runner host). The stress test (requires a GPU) is intentionally NOT in CI because the runner has no GPU. Run it separately on a dev machine via ctest. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 21:43:55 +01:00
gx	12708618d4	docs: reference integrations + examples - docs/integrations/frigate.md: full production-tested guide: Dockerfile, docker-compose, config.yml, troubleshooting (s6+pid, scale_cuda, hwaccel issues), build steps - docs/integrations/cctv-cpp.md: C++ pattern: IFrameSource interface + CuframesSource skeleton + CMake setup + runtime requirements - examples/frigate-compose/: reference compose stack (cuframes-pub + Frigate) with config.yml stub, .env.example, README - examples/python-consumer/: ctypes-based skeleton for AI/ML pipelines (until the v0.3 native pybind11 bindings) - docs/integration.md: turned into an index page that links to the specific guides The reorganization simplifies onboarding: the user picks a guide by integration type (Frigate/C++/Python/FFmpeg) and immediately sees real code. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 21:37:35 +01:00
gx	a3ba3a95b2	docs: ROADMAP + CHANGELOG v0.1.0 + BENCHMARKS - ROADMAP.md: structured v0.1✅ / v0.2📋 (encoded packet sharing + FFmpeg upstream PR + scale-cuda alt) / v0.3 (Python bindings, Jetson, multi-GPU) / v1.0 (stable ABI) - CHANGELOG.md: full v0.1.0 release notes — features, tested config, production deployment, known limitations - BENCHMARKS.md: measurements (stress 1×pub×4×sub, E2E real camera, prod multi-consumer 24h, VRAM cost per resolution, cuframes vs N×NVDEC) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> v0.1.0	2026-05-18 21:11:37 +01:00
gx	601806a5f8	build: add cmake install rules for libcuframes cmake --install now correctly places libcuframes.so/.a in lib/ and headers in include/cuframes/. Needed for downstream builders (FFmpeg patched build, deb packaging). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-18 12:52:16 +01:00
gx	99ab0e0524	Merge pull request 'feat(filter): FFmpeg 7.1 cuframes:// input demuxer (PoC v1)' (#1) from feat/ffmpeg-demuxer into main Reviewed-on: #1	2026-05-17 09:08:09 +01:00
gx	99df68f69c	feat(filter): FFmpeg 7.1 cuframes:// input demuxer Out-of-tree patch + sources for an FFmpeg demuxer that lets any FFmpeg-based consumer (Frigate, custom recorders, re-streamers) read "cuframes://<key>" as an ordinary URL, without its own NVDEC. Contents: - filter/cuframesdec.c: implementation (libavformat-style) - filter/ffmpeg-7.1-cuframes-demuxer.patch: patch for FFmpeg n7.1 (Makefile / allformats.c / configure) - filter/README.md: build instructions + CLI smoke test + Frigate plan v1 limitations (intentional): - NV12 only - GPU → CPU copy via cudaMemcpy2DAsync (zero-copy AVHWFramesContext is v2) CLI smoke test 2026-05-17 (host build of FFmpeg + libcuframes, publisher on camera 192.168.88.98 1920x1080 HEVC 25fps): ffmpeg -f cuframes -i cuframes://cam-ff -c:v copy -f null - → frame=100 fps=25 q=-1.0 speed=1x ✓ → "cuframes: connected to 'cam-ff' — 1920x1080 NV12" Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 09:02:12 +01:00
gx	f10413580d	docs: cross-container CUDA IPC requires sharing both the --ipc and --pid namespaces A real test on 192.168.88.98 (1920x1080 HEVC, 25fps) showed that for separate consumer containers ipc=container:X is not enough: pid=container:X is also required, otherwise cudaIpcOpenEventHandle fails with invalid device context. The CUDA driver validates the IPC peer via /proc/<pid>/... E2E verified on a real camera: publisher (separate container) -> consumer (docker exec): 250 frames, 0 gaps publisher (separate container) -> consumer (separate, with pid+ipc): 200, 0 gaps Updated: - docs/integration.md compose snippet, verification, troubleshooting section - docker-compose.example.yml: added pid: container:cuframes-cam-test - README.md quickstart: added --pid to the subscriber docker run Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-15 06:37:09 +01:00
gx	44dab75e08	docs+docker: integration guide and runtime image for the Frigate/cctv stack docs/integration.md: detailed guide for integrating into an existing CCTV docker-compose: critical requirements (ipc=shareable/container, a common shared volume for the socket), an example CuframesSource for cctv-processor, verification checklist, troubleshooting (timeout, ipc namespace mismatch, high latency). Noted: in v0.1 frigate-decode cannot be removed without patching FFmpeg; that is v0.2 scope. docker/Dockerfile.runtime: multi-stage build (devel → runtime), copies libcuframes.so + cuframes-rtsp-source + sub_count into /usr/local. The image is ~700 MB (vs ~7 GB for the dev image). Smoke test: the binaries start, ldd sees all required libs. docker-compose.example.yml: reference docker-compose with the correct ipc mode and volume mounts, for copying into your own projects. .dockerignore: excludes build/ and build-*/ from the COPY context. README updated: status v0.1 done, quickstart with a real docker run, link to the integration guide.	2026-05-14 23:47:56 +01:00
gx	a21812d3f6	tools+examples+test: end-to-end pipeline ready (Steps 9-10) cuframes-rtsp-source: standalone bridge between RTSP/file and cuframes IPC. Decodes on CUDA (nvdec), copies D2D into a pre-allocated pool (EXTERNAL ownership), publishes via cuframes. --realtime paces file input, --loop loops it. An alternative to the FFmpeg filter until v0.2 (the filter requires patching FFmpeg, which conflicts with Frigate's bundled build). examples/sub_count: reference subscriber on the raw C API: counts frames, tracks gaps, exits cleanly on disconnect/timeout/SIGINT. test_stress (4 subscribers × 2000 frames @ 120fps): PASS on RTX 5090. 0 torn frames for all consumers (including 2 slow ones with a 5ms sleep). Smoke-tested: testsrc 25fps → cuframes-rtsp-source → cuframes IPC → sub_count (separate process) → 200/200 frames, 0 gaps, avg_fps=25.2.	2026-05-14 23:39:01 +01:00
gx	2530057507	hpp: C++ RAII wrapper (header-only, Step 7) A thin layer over the C API: - cuframes::Error: exception thrown on errors, code() for details - cuframes::Publisher: RAII wrapper for the publisher (LIBRARY + EXTERNAL constructors) - cuframes::Subscriber + cuframes::FrameRef: RAII frame with auto-release - cuframes::AsyncSubscriber: with std::function callbacks - cuframes::Frame: read-only view (for the callback) - cuframes::calc_frame_size(), now_ns(): utilities Smoke test (in dev container): $ g++ -std=c++17 ... -lcuframes -lcudart smoke.cpp $ ./smoke version: 0.1.0 FullHD NV12 frame: 3317760 bytes (pitch_y=2048, pitch_uv=2048)	2026-05-14 23:23:35 +01:00
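The frame size in the smoke-test output above can be reproduced by hand. The sketch below is not the library's `calc_frame_size()`; it is a hypothetical Python helper that assumes an NV12 frame is a full-height Y plane followed by a half-height interleaved UV plane, each stored with its own row pitch. The 256-byte alignment in `align_up` is also an assumption, chosen only because it turns a 1920-byte row into the reported pitch of 2048.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def nv12_frame_size(height: int, pitch_y: int, pitch_uv: int) -> int:
    """Bytes for a pitched NV12 frame (hypothetical helper, not the cuframes API).

    Y plane: `height` rows of `pitch_y` bytes.
    UV plane: half as many rows (rounded up) of `pitch_uv` bytes.
    """
    uv_rows = (height + 1) // 2
    return pitch_y * height + pitch_uv * uv_rows


# Assumed alignment of 256 bytes; the commit only reports the resulting pitch.
pitch = align_up(1920, 256)
size = nv12_frame_size(1080, pitch, pitch)
print(pitch, size)
```

With the reported pitches, 2048 × 1080 + 2048 × 540 gives the 3317760 bytes printed by the smoke test.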
gx	46c2b94939	libcuframes v0.1: producer + consumer (sync + async) + tests Implements Steps 3-6 of Phase 1 according to docs/protocol.md. libcuframes/src/: - internal.h (660 lines): shared structs (byte-exact protocol.md layout) + _Static_assert on offsets/sizes - utils.c: error strings, frame size calc, now_ns, key validation - protocol.c: TLV framing for the Unix socket with poll-based timeout - producer.c (~700 lines), Step 3: * LIBRARY mode: cudaMalloc pool, IPC handle export * EXTERNAL mode: register user-provided pointers * cudaIpcEventHandle_t for cross-process sync (R1/R2) * Unix socket accept thread, handshake state machine * Bit allocation 1..31, name collision check (Y5) * STRICT_WAIT policy: timeout with dead-subscriber eviction - consumer.c (~400 lines), Step 4: * Synchronous next() with poll-based wait * cudaStreamWaitEvent on the consumer stream (R1/R2) * Opaque cuframes_frame_t with accessor functions (Y6) * NEWEST_ONLY and STRICT_ORDER modes * ACK via atomic_fetch_or on the bitmap - consumer_async.c, Step 5: thread + callback wrapper over the sync API libcuframes/tests/: - test_pingpong.cu: single producer × single consumer, 200 frames @ 60fps, verified via kernel-on-consumer-stream (the correct test for sync semantics, see spike-v2) - test_multi.cu: 1 producer × 3 consumers via fork() Build: - Top-level CMakeLists.txt with options - libcuframes/CMakeLists.txt: shared + static library, c_std_11 - Suppress -Waddress-of-packed-member (a known safe warning on x86_64) Results (inside the cuframes-dev container, RTX 5090): - pingpong_basic PASS 4.5s 200 frames, 0 torn - multi_consumer PASS 4.1s 1 × 3 consumers, all PASS Phase 1 Step 6 done. Next: Step 7 (C++ wrapper), Step 9 (FFmpeg filter).	2026-05-14 23:21:30 +01:00
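The per-subscriber bit allocation and bitmap ACK mentioned in the commit above can be illustrated with a small model. This is only a sketch under stated assumptions, not the code in producer.c or consumer.c: all names are hypothetical, a lock stands in for C11 `atomic_fetch_or`, and the rule that a slot becomes reusable once every active subscriber bit is set is an assumed reading of the ACK scheme.

```python
import threading


class AckBitmap:
    """Lock-based stand-in for an atomic 32-bit ACK bitmap (hypothetical)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._bits = 0

    def fetch_or(self, mask: int) -> int:
        """Set the bits in mask and return the previous value, like atomic_fetch_or."""
        with self._lock:
            previous = self._bits
            self._bits |= mask
            return previous

    def load(self) -> int:
        with self._lock:
            return self._bits


def allocate_bit(in_use: int) -> int:
    """Return the lowest free bit index in 1..31, the range named in the commit."""
    for bit in range(1, 32):
        if not in_use & (1 << bit):
            return bit
    raise RuntimeError("no free subscriber bit")


def slot_reusable(acks: int, active_mask: int) -> bool:
    """Assumption: a slot may be reused once every active subscriber has ACKed."""
    return acks & active_mask == active_mask


# Two subscribers join and receive bits 1 and 2.
active = 0
first = allocate_bit(active)
active |= 1 << first
second = allocate_bit(active)
active |= 1 << second

slot = AckBitmap()
slot.fetch_or(1 << first)
print(slot_reusable(slot.load(), active))   # only one subscriber has ACKed
slot.fetch_or(1 << second)
print(slot_reusable(slot.load(), active))   # both have ACKed
```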
gx	dc478c7cda	docs: system requirements (hardware, software, build, Docker, k8s) docs/requirements.md (220 lines): - Hardware: NVIDIA GPU CC ≥7.5 (Turing+), Linux x86_64, VRAM/RAM/CPU minimum - Software host: kernel ≥5.4, driver ≥525/555, glibc ≥2.31, Ubuntu/Debian/RHEL - Build deps: CUDA Toolkit ≥12.0, GCC 11+, CMake 3.20+, FFmpeg 4.4+ - Docker: nvidia-container-toolkit, --gpus, --ipc=shareable, --shm-size=2gb - Cross-container CUDA IPC: variant A (--ipc=container:X), variant B (host), k8s via emptyDir + shareProcessNamespace - Out-of-scope: AMD/Intel/macOS/Windows/WSL2/Jetson/multi-GPU/multi-host - Quick-check commands (nvidia-smi, uname, ldd, df /dev/shm) - Tested matrix (Phase 0): RTX 5090, driver 595, CUDA 13.0.88, Ubuntu 24.04 README.md updated: - Short table of minimum vs recommended - List of unsupported platforms - Links to all docs/ files (architecture, protocol, requirements, benchmarks)	2026-05-14 23:11:30 +01:00
gx	6608f5d2f6	docs(protocol): bit-exact wire protocol specification (R4) Closes the last RED flag from the arch review. What is covered (§-sections): 1. Resources & lifecycle (socket / shm / IPC handles cleanup, crash recovery) 2. Shared memory byte-by-byte layout (offsets, packing, atomics) 2.1 frame meta (64 bytes) 2.2 slot descriptor (192 bytes) 2.3 subscriber slot (128 bytes) 3. Unix socket TLV protocol (8 message types, framing) 4. State machines (subscriber-side, publisher-side per-subscriber) 5. ACK protocol with cudaEventRecord / cudaStreamWaitEvent 6. Versioning rules (proto_version vs lib_version, reserved fields) 7. Conformance test skeleton (offset checks, sizeof checks, handshake) 8. Open for v0.2 (TLS, multi-format, ROCm) 9. Reference impl pointer (libcuframes/src/protocol.c, Phase 1) After the v0.2 release the wire protocol is frozen; breaking changes = bump proto_version. Until v0.2: experimental. Resolves all 4 items from arch review section R4: ✓ SHM layout (annotated struct + ASCII layout) ✓ Socket protocol (state machine + message framing) ✓ Versioning rules ✓ Lifecycle / cleanup (incl. CUDA IPC handle leak on crash) Ready for Step 2 (Phase 1 implementation).	2026-05-14 23:04:46 +01:00
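As a generic illustration of the TLV (type-length-value) framing named in §3 above, here is a minimal sketch of encoding and incrementally decoding messages from a byte stream. The header layout used here (2-byte type, 4-byte payload length, little-endian) and the function names are assumptions for illustration only; the real field widths and the 8 message types are defined in docs/protocol.md and are not reproduced here.

```python
import struct

# Hypothetical header: u16 message type + u32 payload length, little-endian.
HEADER = struct.Struct("<HI")


def encode_tlv(msg_type: int, payload: bytes) -> bytes:
    """Prefix the payload with its type and length."""
    return HEADER.pack(msg_type, len(payload)) + payload


def decode_tlv(buffer: bytes):
    """Split a receive buffer into complete (type, payload) messages.

    Returns (messages, remainder); the remainder holds a trailing partial
    message that should be kept until more bytes arrive from the socket.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        msg_type, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        messages.append((msg_type, buffer[offset + HEADER.size:end]))
        offset = end
    return messages, buffer[offset:]


# Two complete messages followed by the first two bytes of a third header.
stream = encode_tlv(1, b"hello") + encode_tlv(2, b"") + b"\x03\x00"
messages, remainder = decode_tlv(stream)
print(messages, remainder)
```

Keeping the remainder rather than discarding it matters on a stream socket, where a single read may end in the middle of a header or payload.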
gx	98a60b7730	header v2: address arch review R3 + Y4/Y5/Y6/Y7/Y9 R3 (publisher API does not work with FFmpeg's hwframe pool): - Added an ownership_mode field: LIBRARY (default, the current API) or EXTERNAL. - New function publisher_create_external(cuda_ptrs[], ptr_count, frame_size) for the case where CUDA memory is allocated upstream (FFmpeg AVHWFramesContext). - New publish_external(cuda_ptr): publishes one of the pre-registered handles. - The FFmpeg filter is now zero-copy: the filter receives an AVFrame, the library already holds an IPC handle for that pointer (registered in create), and publish is an atomic seq bump. R1/R2 closure reflected in the API: - publish() now takes a cudaStream_t: the library does cudaEventRecord instead of a stream sync. - next() now takes consumer_stream: the library does cudaStreamWaitEvent before returning the frame. Cross-process sync via cudaIpcEventHandle_t. Y6 (opaque frame via a handle, not a struct with _internal_*): - cuframes_frame_t is now opaque (typedef struct, not defined). - Accessor functions: cuda_ptr, format, size, pitch_y, pitch_uv, seq, pts_ns. - ABI-stable when fields are added in minor releases. Y7 (redundant try_next): - Removed subscriber_try_next. next(.., timeout_ms=0) is non-blocking, with CUFRAMES_ERR_WOULD_BLOCK. Y5 (consumer_name uniqueness): - Documented that a duplicate name → ALREADY_EXISTS. - Added CUFRAMES_ERR_TOO_MANY for the >32 subscribers case. Y9 (pts_ns clock): - Documented that it is MONOTONIC on the publisher side; the consumer must sanity-check for an epoch reset on publisher restart. Also: - the meta block (cuframes_frame_meta_t) is no longer public; meta is available through accessors on the opaque frame. - _reserved[4] in configs for forward-compat without breaking the ABI. - Added cuframes_protocol_version(): the wire protocol major version, separate from the lib version. Ready for Step 2 (docs/protocol.md + implementation).	2026-05-14 23:02:50 +01:00
gx	fe330ca279	arch: close open question §6.6: events as default for cross-process sync See spike-v2 (commit `ad54305`) + arch review 2026-05-15. cudaStreamSynchronize-only works in practice on single-host single-GPU (0 torn in 4 PoC scenarios), but NVIDIA Programming Guide §3.2.8 gives no contractual guarantee. Switching to cudaIpcEventHandle_t as the default; stream-sync remains an optional fallback. Net: +20µs mean latency, 3× lower max latency (predictable tail), future-proof for multi-GPU.	2026-05-14 23:00:40 +01:00
gx	ad543054fc	spike-v2: validate sync semantics (R1/R2 architectural review) The architectural review (2026-05-15) pointed out that cudaStreamSynchronize-only on the producer side is not sufficient for cross-process visibility: NVIDIA Programming Guide §3.2.8 requires cudaIpcEventHandle_t. Phase 0 PoC v1 did not exercise this case because of cudaMemcpy, which has implicit barriers. spike-v2 reproduces the correct scenario: the consumer launches verify_kernel on a SEPARATE stream (real-world use case: PyTorch / OpenCV CUDA), and the pattern includes a row-based component to catch partial-frame tearing. Ran 4 scenarios × 1500/600 frames: A-fhd60 (stream sync, FHD@60): 0 torn, p99=267µs, max=14.7ms B-fhd60 (event sync, FHD@60): 0 torn, p99=344µs, max=5.2ms A-4k30 (stream sync, 4K@30): 0 torn, p99=606µs, max=4.4ms B-4k30 (event sync, 4K@30): 0 torn, p99=437µs, max=3.7ms All 4 showed 0 torn frames. R1 does not actually reproduce on single-host single-GPU, but NVIDIA does not contractually guarantee this. Decision: events as default (R1/R2 resolved). Architecture.md §6.6 closed. Tradeoff: mean latency +20µs, max latency 3× lower (predictable tail) + future-proof for multi-GPU. Also Dockerfile.dev: CUDA updated to 13.0.3 (12.4 does not exist with devel-ubuntu24.04). Related to PR review: R1, R2, R3 (R3, R4 in the following commits).	2026-05-14 23:00:13 +01:00
gx	c2c2a9751a	phase0: benchmark results, PASSED on RTX 5090 (Blackwell sm_120) Basic (1 producer × 1 consumer): p50=75µs p95=146µs p99=152µs (target was <5ms; we are 33× below it) 500 frames, 0 torn, 0 skipped, zero-copy verified Multi-consumer (1 × 3): p99 for all 3: 151-152µs (identical = proof of zero-copy without contention) 300 frames each, 0 torn, 0 skipped Acceptance criteria: GREEN. Moving on to Phase 1 (libcuframes API). Sync via cudaStreamSynchronize is sufficient for v0.1; CUDA IPC event handles overlap is deferred to v0.2. Raw measurement logs are saved in docs/measurements/phase0-consumer-*.log for verification (4 files from 2 scenarios). Also fixed an unused variable warning in pingpong_consumer.cu.	2026-05-14 22:02:49 +01:00

53 Commits