In this post, we will see how to load a video using PyTorch, followed by a rant on how we perform evaluations of video models. We will try several methods to load a video and convert it to a PyTorch tensor: VideoClips (from torchvision), torchvision.io (using PyAV), decord, my own implementation using ffmpeg, and FFCV. We will use the Kinetics-400 dataset as an example. You can find the dataset here. All the code is available in this repository. I modified FFCV so that it can handle videos; you can find the fork here.
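As a rough illustration of the torchvision.io route, here is a minimal sketch that decodes a file and returns a fixed-length clip as a float tensor. The helper names `uniform_frame_indices` and `load_clip`, the uniform sampling strategy, and the default clip length are assumptions made for this example, not necessarily what the post's repository does.

```python
def uniform_frame_indices(num_frames, clip_len):
    """Pick clip_len frame indices spread evenly over a video of num_frames frames."""
    if num_frames <= 0 or clip_len <= 0:
        raise ValueError("num_frames and clip_len must be positive")
    if clip_len == 1:
        return [0]
    # Spacing that places the first index at 0 and the last at num_frames - 1.
    step = (num_frames - 1) / (clip_len - 1)
    return [round(i * step) for i in range(clip_len)]


def load_clip(path, clip_len=8):
    """Decode a video file and return a (C, T, H, W) float tensor in [0, 1]."""
    # Imported lazily so the index helper above works without torch installed.
    import torch
    from torchvision.io import read_video

    # read_video returns (video, audio, info); video is uint8 with shape (T, H, W, C).
    frames, _, _ = read_video(path, pts_unit="sec")
    idx = torch.tensor(uniform_frame_indices(frames.shape[0], clip_len))
    clip = frames[idx]  # (clip_len, H, W, C)
    # Channels-first layout, scaled to [0, 1].
    return clip.permute(3, 0, 1, 2).float() / 255.0
```

Note that this sketch decodes the whole file before subsampling, which is simple but wasteful for long videos; avoiding that cost is one reason to compare different loaders.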
Some links and papers that I have found interesting this week. If you have any comments, please let me know.
In this post, we will study the applicability of design of experiments (DoE) to machine learning (ML) experiments. To do so, we will use a machine learning paper as a case study. I assume that the reader is familiar with RNNs. For a simple introduction to factorial designs with replication, I consider these slides a great resource. Some starting code in Python for factorial designs can be found here. All the code necessary to reproduce these experiments can be found here.
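To make "factorial design with replication" concrete, here is a minimal sketch that enumerates every combination of factor levels and repeats each combination a fixed number of times. The factor names and levels (`cell`, `hidden_size`) are hypothetical placeholders, not the factors studied in the post.

```python
from itertools import product


def full_factorial(factors, replicates=1):
    """List every run of a full factorial design with replication.

    factors maps a factor name to its list of levels; each combination
    of levels is repeated `replicates` times.
    """
    names = list(factors)
    runs = []
    for levels in product(*(factors[name] for name in names)):
        for rep in range(replicates):
            run = dict(zip(names, levels))
            run["replicate"] = rep  # could double as the random seed of the run
            runs.append(run)
    return runs


# Hypothetical 2x2 design with 3 replicates per cell.
factors = {"cell": ["lstm", "gru"], "hidden_size": [64, 128]}
design = full_factorial(factors, replicates=3)
print(len(design))  # 2 * 2 * 3 = 12 runs
```

Replication is what allows the variance between runs of the same configuration to be estimated, so that differences between configurations can be compared against noise.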
Object tracking is an important task in computer vision. In this post we consider the object tracking methodology known as tracking-by-detection, where the object detection results for each frame are given as input and the goal is to associate the detections to find the trajectories of the objects. We cannot expect every object to be detected in every frame: there may be false detections, and some objects may be occluded by others. These factors make data association a difficult task.
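The association step can be sketched with a simple greedy matcher based on bounding-box overlap (IoU). This is only an illustration under stated assumptions: boxes are `(x1, y1, x2, y2)` tuples, the `min_iou` threshold of 0.3 is arbitrary, and greedy matching stands in for whatever association method the post actually uses.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Greedily match the last box of each track to the new detections."""
    pairs = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same object
        if ti in used_t or di in used_d:
            continue
        matches.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    unmatched_tracks = [i for i in range(len(tracks)) if i not in used_t]
    unmatched_dets = [i for i in range(len(detections)) if i not in used_d]
    return matches, unmatched_tracks, unmatched_dets


tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (50, 50, 60, 60)]
matches, lost, new = associate(tracks, detections)
```

An unmatched track may correspond to a missed detection or an occluded object, while an unmatched detection may be a new object or a false detection, which is exactly what makes the problem hard.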