TY - GEN
T1 - Recognizing human-vehicle interactions from aerial video without training
AU - Lee, Jong Taek
AU - Chen, Chia Chih
AU - Aggarwal, J. K.
PY - 2011
Y1 - 2011
N2 - We propose a novel framework to recognize human-vehicle interactions from aerial video. In this scenario, the object resolution is low, the visual cues are vague, and the detection and tracking of objects are consequently less reliable. Methods that require accurate object tracking or exact matching of event definitions are better avoided. To address these issues, we present a temporal-logic-based approach that does not require training from event examples. At the low level, we employ dynamic programming to perform fast model fitting between the tracked vehicle and the rendered 3-D vehicle models. At the semantic level, given the localized event region of interest (ROI), we verify the time series of human-vehicle relationships against the pre-specified event definitions in a piecewise fashion. With special interest in recognizing a person getting into and out of a vehicle, we have tested our method on a subset of the VIRAT Aerial Video dataset [11] and achieved superior results. Our framework can be easily extended to recognize other types of human-vehicle interactions.
AB - We propose a novel framework to recognize human-vehicle interactions from aerial video. In this scenario, the object resolution is low, the visual cues are vague, and the detection and tracking of objects are consequently less reliable. Methods that require accurate object tracking or exact matching of event definitions are better avoided. To address these issues, we present a temporal-logic-based approach that does not require training from event examples. At the low level, we employ dynamic programming to perform fast model fitting between the tracked vehicle and the rendered 3-D vehicle models. At the semantic level, given the localized event region of interest (ROI), we verify the time series of human-vehicle relationships against the pre-specified event definitions in a piecewise fashion. With special interest in recognizing a person getting into and out of a vehicle, we have tested our method on a subset of the VIRAT Aerial Video dataset [11] and achieved superior results. Our framework can be easily extended to recognize other types of human-vehicle interactions.
UR - http://www.scopus.com/inward/record.url?scp=80054914238&partnerID=8YFLogxK
U2 - 10.1109/CVPRW.2011.5981794
DO - 10.1109/CVPRW.2011.5981794
M3 - Conference contribution
AN - SCOPUS:80054914238
SN - 9781457705298
T3 - IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
SP - 53
EP - 60
BT - 2011 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2011
PB - IEEE Computer Society
T2 - 2011 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2011
Y2 - 20 June 2011 through 25 June 2011
ER -