Minimizing Human Effort in Interactive Tracking by Incremental Learning of Model Parameters


The past decade has seen explosive growth in video data. The ability to easily annotate and track objects in videos has the potential for tremendous impact across multiple application domains. For example, in computer vision, annotated video data is an extremely valuable source of information for the training and evaluation of object detectors (video provides a continuous view of how an object's appearance changes due to viewpoint effects). In sports, video-based analytics is becoming increasingly popular. In behavioral science, video has been used to assist the coding of children's behavior (e.g., for studying infant attachment, typical development, and autism).

In this work, we propose an interactive tracking system that is designed to minimize the amount of annotation required to obtain high-precision tracking results. We achieve this by leveraging user annotations to incrementally learn instance-specific model parameters of the tracking cost function. This is in contrast to the common practice of hand-tuning the model parameters on a training set and applying the same fixed parameters to any new test data, an approach that is both time-consuming (due to hand-tuning) and suboptimal in accuracy on individual tracking instances. We cast the problem of learning the optimal model parameters as one of learning a structured prediction model in a maximum-margin framework. Our key insight is that the incremental nature of an interactive tracking process is particularly well-suited for efficient maximum-margin learning of model parameters.
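To make the idea concrete, the following is a minimal sketch of how a user correction can drive an incremental max-margin update of the weights of a linear tracking cost function. The feature values, learning rate, and update rule (a structured-perceptron-style margin update) are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch: each time the user corrects a track, nudge the
# cost-function weights so the corrected track becomes cheaper than the
# tracker's (wrong) output by at least a unit margin.
import numpy as np

def margin_update(w, feat_true, feat_pred, lr=0.1):
    """Structured-perceptron-style update on a linear cost w . features.
    feat_true: features of the user-corrected track (should cost less).
    feat_pred: features of the tracker's current best track."""
    cost_true = w @ feat_true
    cost_pred = w @ feat_pred
    # Margin violated: the corrected track is not cheaper by >= 1
    if cost_true + 1.0 > cost_pred:
        w = w - lr * (feat_true - feat_pred)
    return w

# Toy example with 3 made-up features (e.g., appearance, motion, overlap)
w = np.zeros(3)
feat_true = np.array([0.2, 0.1, 0.3])  # user-corrected track
feat_pred = np.array([0.9, 0.8, 0.7])  # tracker's erroneous output
for _ in range(20):  # one update per (repeated) user correction
    w = margin_update(w, feat_true, feat_pred)
```

After enough corrections the learned weights make the user-preferred track strictly cheaper under the cost function, which is the mechanism by which annotations reduce future tracking errors on the same instance.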


Arridhana Ciptadi and James M. Rehg. Minimizing Human Effort in Interactive Tracking by Incremental Learning of Model Parameters. In Proc. IEEE Intl. Conf. on Computer Vision (ICCV 2015), Santiago, Chile, December 2015.

Paper | Poster | BibTeX | DOI: 10.1109/ICCV.2015.498



To be released


The authors would like to thank Dr. Daniel Messinger and The Early Play and Development Laboratory at the University of Miami for providing the videos used in the Infant-Mother Interaction Dataset. Portions of this work were supported in part by NSF Expedition Award number 1029679 and the Intel Science and Technology Center in Embedded Computing.


The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without explicit permission of the copyright holder.