gx
|
46c2b94939
|
libcuframes v0.1: producer + consumer (sync + async) + tests
Implements Steps 3-6 of Phase 1 according to docs/protocol.md.
libcuframes/src/:
- internal.h (660 lines) — shared structs (byte-exact protocol.md layout)
+ _Static_assert on offsets/sizes
- utils.c — error strings, frame size calc, now_ns, key validation
- protocol.c: TLV framing for the Unix socket with poll-based timeout
- producer.c (~700 lines) — Step 3:
* LIBRARY mode: cudaMalloc pool, IPC handle export
* EXTERNAL mode: register user-provided pointers
* cudaIpcEventHandle_t for cross-process sync (R1/R2)
* Unix socket accept thread, handshake state machine
* Bit allocation 1..31, name collision check (Y5)
* STRICT_WAIT policy: timeout with dead-subscriber eviction
- consumer.c (~400 lines) — Step 4:
* Synchronous next() with poll-based wait
* cudaStreamWaitEvent on the consumer stream (R1/R2)
* Opaque cuframes_frame_t с accessor functions (Y6)
* NEWEST_ONLY and STRICT_ORDER modes
* ACK via atomic_fetch_or on the bitmap
- consumer_async.c (Step 5): thread + callback wrapper over the sync API
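A minimal sketch of how the subscriber bit allocation (1..31) and the bitmap ACK could fit together. This is an illustration, not the code in producer.c or consumer.c: the names slot_t, bit_alloc_t, alloc_bit and ack_frame are hypothetical, and the bitmap is assumed to be a 32-bit atomic word (in the real library it would presumably live in memory shared between processes).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-frame slot: bit i set means subscriber i has ACKed. */
typedef struct {
    _Atomic uint32_t ack_bitmap;
} slot_t;

/* Hypothetical producer-side state: bit i set means subscriber bit i is taken. */
typedef struct {
    uint32_t used_bits;
} bit_alloc_t;

/* Hand out the lowest free bit in the range 1..31, or -1 if none is left. */
int alloc_bit(bit_alloc_t *a)
{
    for (int bit = 1; bit <= 31; ++bit) {
        uint32_t mask = UINT32_C(1) << bit;
        if (!(a->used_bits & mask)) {
            a->used_bits |= mask;
            return bit;
        }
    }
    return -1;
}

/* Consumer side: set this subscriber's bit with a single atomic OR.
 * Returns 1 if, after this ACK, every subscriber in active_mask has ACKed. */
int ack_frame(slot_t *s, int bit, uint32_t active_mask)
{
    assert(bit >= 1 && bit <= 31);
    uint32_t mask = UINT32_C(1) << bit;
    uint32_t prev = atomic_fetch_or(&s->ack_bitmap, mask);
    return ((prev | mask) & active_mask) == active_mask;
}
```

Because atomic_fetch_or returns the previous value, each consumer can tell whether its own ACK completed the set without taking a lock.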
libcuframes/tests/:
- test_pingpong.cu: single producer × single consumer, 200 frames @ 60 fps,
verified with a kernel on the consumer stream (the correct test
for sync semantics, see spike-v2)
- test_multi.cu: 1 producer × 3 consumers via fork()
Build:
- Top-level CMakeLists.txt with options
- libcuframes/CMakeLists.txt: shared + static library, c_std_11
- Suppress -Waddress-of-packed-member (a known-safe warning on x86_64)
Results (inside the cuframes-dev container, RTX 5090):
- pingpong_basic PASS 4.5s 200 frames, 0 torn
- multi_consumer PASS 4.1s 1 × 3 consumers, all PASS
Phase 1 Step 6 done. Next: Step 7 (C++ wrapper), Step 9 (FFmpeg filter).
|
2026-05-14 23:21:30 +01:00 |
|