Description
This seems to happen for bottom-up raw frames from dshow. The returned NumPy array is backed by an invalid buffer description, and the process later crashes when the array is fully read, for example via .copy(). Any suggestion on how to resolve this issue?
Observed frame properties
From the failing frame:
def on_image(image: VideoFrame):
    print(
        "format=", image.format.name,
        "size=", (image.width, image.height),
        "line_size=", image.planes[0].line_size,
        "buffer_ptr=", hex(image.planes[0].buffer_ptr),
        "buffer_size=", image.planes[0].buffer_size,
    )
    buffer = image.to_ndarray(format="bgr24")
    frame = buffer.copy()  # process crashes here
and I got:
format= bgr24 size= (3840, 2160) line_size= -11520 buffer_ptr= 0x147a14d9380 buffer_size= 24883200
Environment
- OS: Windows
- Input: DirectShow (dshow)
- Frame format: bgr24
- Resolution: 3840x2160
For 3840x2160 in bgr24, the expected row size is:
3840 * 3 = 11520 bytes
So line_size = -11520 indicates a bottom-up packed frame with negative stride.
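To make the arithmetic and the meaning of a negative stride concrete, here is a small NumPy-only sketch. It does not touch PyAV or a capture device; the tiny stand-in frame and the variable names are assumptions made for illustration.

```python
import numpy as np

WIDTH, HEIGHT, CHANNELS = 3840, 2160, 3   # values reported above

row_bytes = WIDTH * CHANNELS              # bytes per packed bgr24 row
frame_bytes = row_bytes * HEIGHT          # bytes for the whole frame

# A tiny stand-in frame, so the example runs without a capture device.
top_down = np.arange(4 * 2 * CHANNELS, dtype=np.uint8).reshape(4, 2, CHANNELS)

# Reversing the row axis gives a view with a negative row stride: NumPy
# walks backwards through the same memory, as a bottom-up frame requires.
bottom_up_view = top_down[::-1]
row_stride = bottom_up_view.strides[0]    # negative: -(2 * CHANNELS)

# Materializing the view produces an independent, positively strided array.
materialized = np.ascontiguousarray(bottom_up_view)
```

The computed frame_bytes matches the reported buffer_size, so the size metadata looks consistent; the open question is how the negative stride is turned into an array.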
Important observation
If I insert an FFmpeg vflip filter before the callback, the crash disappears.
Behavior I observed:
- DirectShow bgr24 frame without vflip: crashes
- Same device and payload, then use to_ndarray(format="rgb24") and flip the channels with rgb = bgr[:, :, ::-1]: works
- Same device and payload with vflip: works
This suggests the problem is not the stream metadata itself, but the memory layout of the original bottom-up packed frame. vflip likely materializes a new frame with a normal top-down layout.
Expected behavior
VideoFrame.to_ndarray() should safely handle packed rgb24 / bgr24 frames with negative linesize, either by:
- allocating and copying rows manually in the correct order, or
- avoiding the no-copy path when line_size < 0
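The first option could look roughly like the sketch below. This is not PyAV's implementation: copy_rows_top_down is a hypothetical helper, and it assumes the whole plane is available as a bytes-like object in memory order, whereas a real fix would have to address the memory from the plane pointer (which, with a negative line_size, refers to the last row in memory).

```python
import numpy as np

def copy_rows_top_down(buf, width, height, line_size, channels=3):
    """Hypothetical helper: return a top-down (height, width, channels) copy.

    `buf` holds the whole plane in memory order. `line_size` follows the
    FFmpeg convention, so a negative value means image row 0 is the last
    row in memory.
    """
    stride = abs(line_size)
    row_bytes = width * channels
    if stride < row_bytes or len(buf) < stride * height:
        raise ValueError("buffer too small for the described frame")
    rows = np.frombuffer(buf, dtype=np.uint8, count=stride * height)
    rows = rows.reshape(height, stride)[:, :row_bytes]  # drop any row padding
    if line_size < 0:
        rows = rows[::-1]                               # visit rows in image order
    # copy() yields an owned, C-contiguous array, so the reshape is safe.
    return rows.copy().reshape(height, width, channels)

# Usage: 2 rows x 2 pixels, stored bottom-up (negative line_size).
memory = bytes(range(12))
frame = copy_rows_top_down(memory, width=2, height=2, line_size=-6)
```

Because the result is always a fresh copy, nothing downstream ever sees a negative stride, at the cost of one extra frame-sized copy per call.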
Actual behavior
The process crashes when the ndarray is read, often at:
buffer.copy()
Additional note
Native FFmpeg does not support RGB/BGR buffers on Windows. Therefore, I used this patch for the pyav-ffmpeg build:
diff --git a/libavdevice/dshow.c b/libavdevice/dshow.c
index 6e97304..6a55584 100644
--- a/libavdevice/dshow.c
+++ b/libavdevice/dshow.c
@@ -56,6 +56,10 @@
# define AMCONTROL_COLORINFO_PRESENT 0x00000080 // if set, indicates DXVA color info is present in the upper (24) bits of the dwControlFlags
#endif
+#define DSHOW_MEDIASUBTYPE_RGB565 0xe436eb7b
+#define DSHOW_MEDIASUBTYPE_RGB555 0xe436eb7c
+#define DSHOW_MEDIASUBTYPE_RGB24 0xe436eb7d
+#define DSHOW_MEDIASUBTYPE_RGB32 0xe436eb7e
static enum AVPixelFormat dshow_pixfmt(DWORD biCompression, WORD biBitCount)
{
@@ -76,10 +80,33 @@ static enum AVPixelFormat dshow_pixfmt(DWORD biCompression, WORD biBitCount)
case 32:
return AV_PIX_FMT_0RGB32;
}
+ case DSHOW_MEDIASUBTYPE_RGB565:
+ return AV_PIX_FMT_RGB565;
+ case DSHOW_MEDIASUBTYPE_RGB555:
+ return AV_PIX_FMT_RGB555;
+ case DSHOW_MEDIASUBTYPE_RGB24:
+ return AV_PIX_FMT_BGR24;
+ case DSHOW_MEDIASUBTYPE_RGB32:
+ return AV_PIX_FMT_0RGB32;
}
return avpriv_pix_fmt_find(PIX_FMT_LIST_RAW, biCompression); // all others
}
+static int dshow_is_bottomup_rgb(DWORD biCompression)
+{
+ switch (biCompression) {
+ case BI_RGB:
+ case BI_BITFIELDS:
+ case DSHOW_MEDIASUBTYPE_RGB565:
+ case DSHOW_MEDIASUBTYPE_RGB555:
+ case DSHOW_MEDIASUBTYPE_RGB24:
+ case DSHOW_MEDIASUBTYPE_RGB32:
+ return 1;
+ default:
+ return 0;
+ }
+}
+
static enum AVColorRange dshow_color_range(DXVA2_ExtendedFormat *fmt_info)
{
switch (fmt_info->NominalRange)
@@ -1581,7 +1608,7 @@ dshow_add_device(AVFormatContext *avctx,
par->codec_type = AVMEDIA_TYPE_VIDEO;
par->width = fmt_info->width;
par->height = fmt_info->height;
- par->codec_tag = bih->biCompression;
+ par->codec_tag = fmt_info->pix_fmt == AV_PIX_FMT_NONE ? bih->biCompression : 0;
par->format = fmt_info->pix_fmt;
if (bih->biCompression == MKTAG('H', 'D', 'Y', 'C')) {
av_log(avctx, AV_LOG_DEBUG, "attempt to use full range for HDYC...\n");
@@ -1594,7 +1621,7 @@ dshow_add_device(AVFormatContext *avctx,
par->chroma_location = fmt_info->chroma_loc;
par->codec_id = fmt_info->codec_id;
if (par->codec_id == AV_CODEC_ID_RAWVIDEO) {
- if (bih->biCompression == BI_RGB || bih->biCompression == BI_BITFIELDS) {
+ if (dshow_is_bottomup_rgb(bih->biCompression)) {
par->bits_per_coded_sample = bih->biBitCount;
if (par->height < 0) {
par->height *= -1;