-
Notifications
You must be signed in to change notification settings - Fork 37
add opencv benchmark #711
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
add opencv benchmark #711
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @Dan-Flores ! It looks good, I shared a few comments below. As we just discussed offline, it might be worth checking the output frames for validity, to make sure that the all the decoders we're benchmarking are returning similar frames.
self._print_each_iteration_time = False | ||
|
||
def decode_frames(self, video_file, pts_list): | ||
import cv2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's best to only import in __init__
and then store the module as a self.cv2
attibute. Otherwise, we'd be benchmarking the import
statement when calling the decoding method, and it may add noise to the results.
current_frame += 1 | ||
cap.release() | ||
assert len(frames) == len(approx_frame_indices) | ||
return frames |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect what we're getting as output from opencv are numpy arrays. For a fair comparison with the other decoders, we should convert them to pytorch. I think torch.from_numpy
is what we'd want to use for a fair comparison, as it returns a view and it's cheap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I confirmed opencv returned numpy arrays, so I added a call to torch.from_numpy
. While validating that the frames were correct, I had to make a few more adjustments to the color and array order to get the same result as other decoders. I can remove them if we decide they are not needed for the benchmark.
import cv2 | ||
|
||
frames = [ | ||
cv2.resize(frame, (width, height)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should note that openCV doesn't apply antialias, while the rest of the decode_and_resize()
methods apply antialias by default. That makes these methods less comparable.
I would suggest not to expose decode_and_resize
for openCV if we can, so as to prevent any confusion, but if we need to expose it for technical reasons then let's at least add a comment pointing out this discrepancy about antialiasing.
CC @scotts as it's relevant to the whole "resize inconsistency chaos"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
decode_and_reize()
is needed to generate data for the README, so I've left a comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @Dan-Flores ! It looks good, I shared a few comments below. As we just discussed offline, it might be worth checking the output frames for validity, to make sure that the all the decoders we're benchmarking are returning similar frames.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @Dan-Flores ! It looks good, I shared a few comments below. As we just discussed offline, it might be worth checking the output frames for validity, to make sure that the all the decoders we're benchmarking are returning similar frames.
cc @NicolasHug - It seems OpenCV only supports certain frame resolutions, so the benchmark on ![]() I ran the benchmark using a generated mandelbrot video at 1920x1080, which OpenCV is able to support. |
This PR updates the changes in #674, which added a benchmark for decoding with the OpenCV library.
The main differences are: