Video Understanding Models in AI

Shafik Quoraishee
1 min read · Dec 14, 2024


Video understanding models

Today I did a deep dive on video understanding models. I believe true video understanding is the next milestone in AI. Most people in the space understand how image understanding models work, and have for a while. But the truth of a scene is in what happens over time.

Video understanding is a much more complex subject than image understanding. Not only are you dealing with static image data, which can have a ton of frame-level contextual variation, but you're also dealing with how the frame sequence evolves over time, so a model has to understand the order and motion across frames and not just the content of each one.
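To make that difference concrete, here is a minimal sketch in plain Python. It is not one of the demos from the video, and it is not how real video models work internally. The synthetic clip and the helper names (`make_clip`, `pooled_brightness`, `horizontal_motion`) are all made up for illustration. The idea: play the same frames forward and backward. A summary that scores each frame on its own and averages the scores cannot tell the two clips apart, while a summary that compares consecutive frames can.

```python
from statistics import mean

def make_clip(num_frames=8, size=16):
    """Synthetic grayscale clip (hypothetical data): a bright 4x4 square moving left to right."""
    clip = []
    for t in range(num_frames):
        frame = [[0.0] * size for _ in range(size)]
        for row in range(6, 10):
            for col in range(t, t + 4):
                frame[row][col] = 1.0
        clip.append(frame)
    return clip

def pooled_brightness(clip):
    """Image-style summary: score each frame alone, then average the scores.
    Averaging over frames throws away their order."""
    return mean(mean(mean(row) for row in frame) for frame in clip)

def centroid_x(frame):
    """Brightness-weighted mean column index of a single frame."""
    total = sum(sum(row) for row in frame)
    weighted = sum(col * value for row in frame for col, value in enumerate(row))
    return weighted / total

def horizontal_motion(clip):
    """Temporal summary: average frame-to-frame shift of the centroid along x."""
    xs = [centroid_x(frame) for frame in clip]
    return mean(b - a for a, b in zip(xs, xs[1:]))

forward = make_clip()
backward = forward[::-1]  # exactly the same frames, in reversed order

print("pooled (forward, backward):", pooled_brightness(forward), pooled_brightness(backward))
print("motion (forward, backward):", horizontal_motion(forward), horizontal_motion(backward))
```

The pooled value is identical for both clips, but the motion value flips sign, which is the kind of information that only exists across frames. Real models learn far richer temporal features than a moving centroid, but the frame-only summary has the same blind spot at any scale.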

Here, I go into some of the models, the benchmarks and evaluations, along with demos I've written in Python for building functional video understanding applications. Again, there is a bit of depth in this video, but hopefully it's still balanced enough to keep you engaged. Thanks for watching!


Written by Shafik Quoraishee

I'm an Engineer, currently working at the New York Times. In my spare time I'm also a computational biology and physics enthusiast. Hope you enjoy my work!
