Kari Pulli, NVIDIA

Talk Title: Mobile Visual Computing

Abstract:

This talk presents some recent work at NVIDIA Research on mobile visual computing. We give an overview of Tegra, the NVIDIA family of mobile processors, and in particular look into its visual processing capabilities. We then discuss several recent research projects: 2D registration of moving scenes for computational photography, 3D tracking and modeling for augmented reality, wearable light-field displays, and interactive editing of a live scene on a camera viewfinder, which provides a WYSIWYG experience for computational photography.

Biography:

Dr. Kari Pulli is Senior Director of Research at NVIDIA. Kari joined NVIDIA Research in 2011 to work in imaging and other mobile applications. He heads the Mobile Visual Computing Research team, which works on topics related to cameras, imaging, and vision on mobile devices such as tablets, smartphones, and cars. Previously he was at Nokia (1999-2004 in Oulu, Finland; 2004-06 a visiting scientist at MIT CSAIL; 2006-11 at Nokia Research Center Palo Alto). He was one of the 3 Nokia Fellows in 2010 (6th in Nokia history), and a member of the CEO's Technology Council. Kari worked on standardizing mobile graphics APIs at Khronos (OpenGL ES, OpenVG) and JCP (M3G), co-wrote a book on Mobile 3D Graphics, and led a research group working on mobile augmented reality and computational photography (including the FCam architecture for computational cameras).

Kari has a B.Sc. from the University of Minnesota, M.Sc. and Lic. Tech. from the University of Oulu (Finland), and Ph.D. from the University of Washington (Seattle), all in Computer Science / Engineering; MBA from the University of Oulu; and he worked as a research associate at Stanford University as the technical lead of the Digital Michelangelo Project.

For more details and publication list see here.

Thursday, October 31st

Nuno Vasconcelos, UCSD

Talk Title: Understanding Video of Crowded Environments

Abstract:

Classical work in computer vision has emphasized the study of individual objects, e.g. object recognition or tracking. More recently, it has been realized that most of these approaches do not scale well to scenes that depict crowded environments. These are scenes with many objects, which are imaged at low resolution, and interact in complex ways. Solving vision problems in these environments requires the ability to model and reason about a crowd as a whole. I will review recent work in my lab in this area, including the design of statistical models for the appearance and dynamics of crowd video with multiple flows, and their application to the solution of problems such as crowd counting, dynamic background subtraction, and anomaly detection.

Biography:

Dr. Nuno Vasconcelos is a Professor at the Electrical and Computer Engineering Department of the University of California, San Diego. He heads the Statistical Visual Computing Laboratory, which investigates problems in computer vision, statistical signal processing, machine learning, and multimedia.

Prof. Vasconcelos's current interests include object recognition, detection and tracking; activity recognition; semantic representations for images and video; saliency and attention; biological models of vision; surveillance of crowded scenes; dynamic models of video; cost-sensitive learning; boosting and detector cascades; cross-modal approaches to image retrieval; and various machine learning problems. Before joining UCSD, Prof. Vasconcelos was a member of the research staff at the Compaq Cambridge Research Laboratory, which later became the HP Cambridge Research Laboratory. He received a PhD from MIT in 2000. He is a Senior Area Editor of the IEEE Signal Processing Letters and has been an area chair of various major conferences in vision and machine learning. Prof. Vasconcelos is the recipient of a 2005 NSF CAREER award, a Hellman Fellowship, and several best paper awards, and has co-authored more than 100 peer-reviewed publications.

Friday, November 1st

Ajay Divakaran, SRI International

Talk Title: Tracking People, Vehicles and Vessels Across Multiple Cameras

Abstract:

We will cover the vehicle and people tracking aspects of the situation awareness systems we have developed over the past several years at SRI International, formerly Sarnoff Corporation. We will present our latest results on real-time people tracking and handoff in moderately crowded conditions. Real-time tracking in such conditions presents the challenge of maintaining persistent tracks despite high and frequent occlusion of targets of interest. We will show demonstration videos of systems that we have delivered to the customer. On the vehicle tracking front, we have developed a unique vehicle fingerprinting approach that enables both real-time vehicle tracking from camera to camera and real-time detection of vehicles of interest on a watch-list. Such tracking presents the challenges of view-invariant fingerprinting and matching of vehicles, which we have achieved in systems that we have tested in field conditions. We have also tackled fingerprinting of water-borne vessels and demonstrated success in field conditions. Such fingerprinting goes beyond vehicle fingerprinting because vessels are much more diverse in shape and size, are not constrained to move on roads and hence are highly variable in pose, and move on water, which forms a moving background. Finally, we will touch upon avenues for further research.

Biography:

Dr. Ajay Divakaran is Technical Manager at SRI International's Center for Vision Technologies. He leads the Vision and Multi-Sensor group in SRI International's Vision and Learning Laboratory. As technical manager, he is responsible for the proposal and execution of contract research projects in computer vision as well as multi-sensor systems that combine various modalities.

Divakaran is currently the principal investigator for a number of SRI research projects. His work includes multimodal modeling and analysis of affective, cognitive, and physiological aspects of human behavior, interactive virtual reality-based training, tracking of individuals in dense crowds and multi-camera tracking, technology for automatic food identification and volume estimation, and audio analysis for event detection in open-source video. He has developed several innovative technologies for multimodal systems in both commercial and government programs during the course of his career.

Prior to joining SRI in 2008, Divakaran worked at Mitsubishi Electric Research Labs for 10 years, where he was the lead inventor of the world's first sports highlights playback-enabled DVR. He also oversaw a wide variety of product applications for machine learning.

Divakaran was named a Fellow of the IEEE in 2011 for his contributions to multimedia content analysis. He developed techniques for recognition of agitated speech as part of his work on automatic sports highlights extraction from broadcast sports video. He established a sound experimental and theoretical framework for human perception of action in video sequences as lead inventor of the MPEG-7 video standard motion activity descriptor. He serves on Technical Program Committees of key multimedia conferences, and served as an associate editor of IEEE Transactions on Multimedia from 2007 to 2010. He has authored two books and has more than 100 publications to his credit, as well as more than 40 issued patents.

Divakaran received his M.S. and Ph.D. degrees in electrical engineering from Rensselaer Polytechnic Institute. His B.E. in electronics and communication engineering is from the University of Jodhpur in India.
