add opencv benchmark #711

Open · wants to merge 7 commits into main

Conversation

Dan-Flores (Contributor):

This PR updates the changes in #674, which added a benchmark for decoding with the OpenCV library.

The main differences are:

  • Removed iteration through available backends
  • Removed API/ABI compatibility checks
  • Added FFMPEG as the default backend for OpenCV
python benchmarks/decoders/benchmark_decoders.py --decoders opencv,torchcodec_public:seek_mode=exact,torchcodec_public:seek_mode=approximate,torchaudio --min-run-seconds 40
[-------- video=/home/danielflores3/github/Dan-Flores/benchmarks/decoders/../../test/resources/nasa_13013.mp4 h264 480x270, 13.013s 29.97002997002997fps -------]
                                              |  decode 10 uniform frames  |  decode 10 random frames  |  first 1 frames  |  first 10 frames  |  first 100 frames
1 threads: ------------------------------------------------------------------------------------------------------------------------------------------------------
      OpenCV[backend=FFMPEG]                  |            38.2            |            38.0           |       15.0       |        16.4       |        32.3      
      TorchAudio                              |           187.6            |           201.4           |       11.2       |        13.2       |        49.0      
      TorchCodecPublic:seek_mode=exact        |            51.9            |            45.9           |       18.0       |        14.6       |        47.3      
      TorchCodecPublic:seek_mode=approximate  |            51.4            |            41.9           |       10.9       |        12.1       |        35.6      

Times are in milliseconds (ms).

NicolasHug (Member) left a comment:

Thanks for the PR @Dan-Flores! It looks good, I shared a few comments below. As we just discussed offline, it might be worth checking the output frames for validity, to make sure that all the decoders we're benchmarking are returning similar frames.
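
A quick way to do that check might look like the sketch below (a hypothetical helper, assuming each decoder returns a list of uint8 RGB tensors of the same shape; the tolerance is an assumption to absorb small colorspace-conversion differences):

    import torch

    def assert_frames_match(frames_a, frames_b, atol=2.0):
        # Hypothetical validation helper, not part of the benchmark code.
        assert len(frames_a) == len(frames_b)
        for a, b in zip(frames_a, frames_b):
            torch.testing.assert_close(
                a.to(torch.float32), b.to(torch.float32), atol=atol, rtol=0.0
            )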

self._print_each_iteration_time = False

def decode_frames(self, video_file, pts_list):
    import cv2

NicolasHug (Member):

It's best to only import in __init__ and then store the module as a self.cv2 attribute. Otherwise, we'd be benchmarking the import statement when calling the decoding method, and it may add noise to the results.
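
A minimal sketch of that pattern (the class name and the sequential read loop are assumptions for illustration, not the benchmark's actual logic):

    class OpenCVDecoder:
        def __init__(self):
            import cv2  # import once here so the import cost isn't measured per call

            self.cv2 = cv2  # cache the module as an attribute

        def decode_frames(self, video_file, pts_list):
            cap = self.cv2.VideoCapture(video_file)  # reuse the cached module
            frames = []
            while True:
                ok, frame = cap.read()  # pts_list handling omitted for brevity
                if not ok:
                    break
                frames.append(frame)
            cap.release()
            return frames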

current_frame += 1
cap.release()
assert len(frames) == len(approx_frame_indices)
return frames

NicolasHug (Member):

I suspect what we're getting as output from OpenCV is numpy arrays. For a fair comparison with the other decoders, we should convert them to PyTorch tensors. I think torch.from_numpy is what we'd want to use, as it returns a view and is cheap.
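
A small standalone illustration of why torch.from_numpy is cheap (not benchmark code):

    import numpy as np
    import torch

    arr = np.zeros((270, 480, 3), dtype=np.uint8)  # stand-in for a decoded frame
    t = torch.from_numpy(arr)  # no copy: the tensor is a view over the numpy buffer
    arr[0, 0, 0] = 255
    assert t[0, 0, 0] == 255  # in-place changes to the array show up in the tensor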

Dan-Flores (Contributor, Author):

I confirmed OpenCV returns numpy arrays, so I added a call to torch.from_numpy. While validating that the frames were correct, I had to make a few more adjustments to the color and array order to get the same result as the other decoders. I can remove them if we decide they are not needed for the benchmark.
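
Roughly, that conversion looks like the sketch below (the helper name and the HWC-to-CHW permute are assumptions made for illustration, not the PR's actual code):

    import cv2
    import torch

    def opencv_frame_to_tensor(frame_bgr):
        # OpenCV decodes to an HWC uint8 numpy array in BGR channel order.
        frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
        # from_numpy is a cheap view; permute HWC -> CHW to match the other decoders.
        return torch.from_numpy(frame_rgb).permute(2, 0, 1)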

import cv2

frames = [
    cv2.resize(frame, (width, height))

NicolasHug (Member):

We should note that OpenCV doesn't apply antialias, while the rest of the decode_and_resize() methods apply antialias by default. That makes these methods less comparable.

I would suggest not exposing decode_and_resize for OpenCV if we can, to prevent any confusion; if we need to expose it for technical reasons, then let's at least add a comment pointing out this antialiasing discrepancy.

CC @scotts as it's relevant to the whole "resize inconsistency chaos"
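
For reference, the discrepancy looks roughly like this (a sketch; torchvision's v2 resize is used only as an example of a resize that antialiases by default, not as the benchmark's actual resize path):

    import cv2
    import numpy as np
    import torch
    from torchvision.transforms import v2

    frame = np.zeros((270, 480, 3), dtype=np.uint8)  # stand-in for a decoded HWC frame
    tensor_chw = torch.from_numpy(frame).permute(2, 0, 1)

    # OpenCV: plain bilinear interpolation, no antialiasing filter.
    resized_cv = cv2.resize(frame, (256, 256), interpolation=cv2.INTER_LINEAR)

    # torchvision: antialias is applied by default when downscaling.
    resized_tv = v2.functional.resize(tensor_chw, [256, 256], antialias=True)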

Dan-Flores (Contributor, Author):

decode_and_resize() is needed to generate data for the README, so I've left a comment here.

Dan-Flores (Contributor, Author):

cc @NicolasHug - It seems OpenCV only supports certain frame resolutions, so the benchmark on nasa_13013.mp4 was using a smaller resolution and was not a fair comparison (see image below). I was unfortunately unable to modify the VideoCapture's resolution; OpenCV's documentation on these properties suggests that modifying them can fail unexpectedly.

[Screenshot 2025-06-06: frame comparison showing the resolution mismatch]
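
For context, the attempt would look roughly like the sketch below (whether cv2.VideoCapture.set() has any effect depends on the backend, and for file-based captures it often silently does not):

    import cv2

    cap = cv2.VideoCapture("test/resources/nasa_13013.mp4")
    # set() may return True yet change nothing, or fail outright, depending on
    # the backend; this is the unpredictability the OpenCV docs warn about.
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)
    print(cap.get(cv2.CAP_PROP_FRAME_WIDTH), cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    cap.release()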

I ran the benchmark using a generated mandelbrot video at 1920x1080, which OpenCV is able to support.
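
One way to generate such a clip, assuming ffmpeg is on the PATH (an illustration using ffmpeg's built-in mandelbrot lavfi source; the output filename and parameters are made up, and this is not necessarily how the benchmark's video was produced):

    import subprocess

    subprocess.run(
        [
            "ffmpeg",
            "-f", "lavfi",
            "-i", "mandelbrot=size=1920x1080:rate=30",  # synthetic mandelbrot source
            "-t", "10",  # 10-second clip
            "-pix_fmt", "yuv420p",
            "mandelbrot_1920x1080.mp4",
        ],
        check=True,
    )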
