AIC Seminar Series
Learning hierarchical invariant spatio-temporal features for action recognition
Notice: Hosted by Hung Bui
Date: 2011-11-17 at 16:00
Location: EJ228 (SRI E building)
Previous work on action recognition has focused on
adapting hand-designed local features, such as SIFT or
HOG, from static images to the video domain. In this paper,
we propose using unsupervised feature learning as a
way to learn features directly from video data. More
specifically, we present an improvement of the Independent
Subspace Analysis algorithm to learn invariant spatio-temporal
features from unlabeled video data. We discovered that,
despite its simplicity, this method performs surprisingly well
when combined with deep learning techniques such as
stacking and convolution to learn hierarchical representations.
By replacing expert hand-designed features with our machine-learned features,
we achieve classification results superior to all previously published
results on the Hollywood2, UCF, KTH and YouTube action recognition datasets. Further benefits of
this method, such as the ease of training and the efficiency of
training and prediction, will also be discussed. You can download
our code and learned spatio-temporal features here: http://ai.stanford.edu/~quocle/
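For attendees unfamiliar with Independent Subspace Analysis, the following is a minimal NumPy sketch of the standard ISA activation: linear filter responses are squared, summed within fixed groups (subspaces), and passed through a square root. It is not the speaker's code and does not include learning, stacking, or convolution. The function names, the patch size, and the eps constant are illustrative assumptions, and the filters W are random here rather than learned from video.

```python
import numpy as np


def subspace_pooling_matrix(n_filters, subspace_size):
    """Build a fixed 0/1 matrix that groups adjacent filters into subspaces.

    Hypothetical helper: row i has ones over the filters in subspace i.
    """
    if n_filters % subspace_size != 0:
        raise ValueError("n_filters must be a multiple of subspace_size")
    n_subspaces = n_filters // subspace_size
    return np.kron(np.eye(n_subspaces), np.ones((1, subspace_size)))


def isa_features(x, W, V, eps=1e-8):
    """Compute ISA pooled responses for a batch of flattened patches.

    x: (n_samples, d) flattened spatio-temporal patches
    W: (n_filters, d) first-layer linear filters (learned in practice)
    V: (n_subspaces, n_filters) fixed pooling matrix
    eps: small constant (an assumption here) to keep the sqrt well behaved
    """
    squared = (x @ W.T) ** 2           # squared linear filter responses
    return np.sqrt(squared @ V.T + eps)  # pool within each subspace


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Five toy video patches of 4 x 4 pixels over 3 frames, flattened.
    patches = rng.standard_normal((5, 4 * 4 * 3))
    W = rng.standard_normal((8, patches.shape[1]))  # random stand-in filters
    V = subspace_pooling_matrix(n_filters=8, subspace_size=2)
    print(isa_features(patches, W, V).shape)  # (5, 4)
```

Because the responses are squared before pooling, each output is unchanged when the sign of the input patch is flipped, which is one simple instance of the invariance the abstract refers to.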
Please arrive at least 10 minutes early, as you will need to sign in by
following the instructions posted by the lobby phone at Building E. SRI is located
at 333 Ravenswood Avenue in Menlo Park. Visitors may park in the parking
lots off Fourth Street. Detailed directions to SRI, as well as maps, are
available from the Visiting AIC web page.
There are two entrances to SRI International located on Ravenswood Ave.
Please check the Building E entrance signage.