Hollywood 3D
Leaderboard
Below is the performance of various techniques reported on the Hollywood 3D dataset (note that these results have not been independently verified). If you wish your technique to be added to the leaderboard, email S.Hadfield{at}surrey.ac.uk with the name of the technique, the reference to the publication, and, if possible, a link to the PDF.
CCR = Correct Classification Rate; all remaining columns are Average Precision, given as the mean and per action class.

| Algorithm | CCR | Mean | NoAction | Run | Punch | Kick | Shoot | Eat | Drive | UsePhone | Kiss | Hug | StandUp | SitDown | Swim | Dance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HOS' [5] | - | 36.9 | 21.2 | 63.1 | 54.2 | 19.9 | 31.0 | 24.2 | 60.8 | 22.3 | 31.3 | 32.4 | 50.0 | 18.1 | 43.0 | 44.9 |
| Multi-view neural networks [7] | 35.71 | 30.79 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| Disp-Pyr{1,3} [4] | 36.09 | 30.52 | 11.83 | 47.89 | 27.71 | 22.93 | 49.38 | 7.48 | 59.84 | 14.75 | 41.42 | 17.09 | 50.02 | 10.03 | 29.44 | 37.54 |
| Enriched-IPs [6] | 32.8 | 30.1 | 11.8 | 49.5 | 28.0 | 20.5 | 37.4 | 8.8 | 61.7 | 14.9 | 46.3 | 14.2 | 52.8 | 10.7 | 23.1 | 41.8 |
| MVRELM [3] | 33.44 | 29.86 | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| Disparity-IPs [6] | 35.7 | 28.7 | 11.6 | 53.2 | 34.4 | 17.4 | 36.3 | 7.3 | 63.5 | 14.4 | 34.9 | 16.6 | 39.8 | 9.8 | 31.3 | 30.9 |
| SAE-MD(Av) [2] | 30.13 | 26.11 | 12.77 | 50.44 | 38.01 | 7.94 | 35.51 | 7.03 | 59.62 | 23.92 | 16.40 | 7.02 | 34.23 | 6.95 | 29.48 | 36.26 |
| HoG/HoF/HoDG + 3.5D-Harris [1] | 21.8 | 14.1 | 13.7 | 27.0 | 5.7 | 4.8 | 16.6 | 5.6 | 69.6 | 7.6 | 10.2 | 12.1 | 9.0 | 5.6 | 7.5 | 7.5 |
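The Mean column of the leaderboard appears to be the unweighted average of the 14 per-class Average Precision values. The page does not state this; it is an assumption that can be checked against the reported numbers. The sketch below does so for the HOS' [5] row. The names `hos_ap` and `mean_ap` are illustrative only and are not part of the dataset's released code.

```python
# Per-class Average Precision for the HOS' [5] row, copied from the leaderboard.
hos_ap = {
    "NoAction": 21.2, "Run": 63.1, "Punch": 54.2, "Kick": 19.9,
    "Shoot": 31.0, "Eat": 24.2, "Drive": 60.8, "UsePhone": 22.3,
    "Kiss": 31.3, "Hug": 32.4, "StandUp": 50.0, "SitDown": 18.1,
    "Swim": 43.0, "Dance": 44.9,
}


def mean_ap(per_class):
    """Unweighted mean of the per-class Average Precision values (assumed definition)."""
    return sum(per_class.values()) / len(per_class)


# Rounded to one decimal place, this matches the reported Mean of 36.9.
print(round(mean_ap(hos_ap), 1))
```

When submitting a new result, the same check is a quick way to confirm that the mean and per-class figures are consistent with each other.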
[1] Hadfield, S. and Bowden, R. Hollywood 3D: Recognizing Actions in 3D Natural Scenes. In Proceedings, Conference on Computer Vision and Pattern Recognition (CVPR), pg. 3398-3405, 2013.
[2] Konda, K. and Memisevic, R. Learning to combine depth and motion. Indian Conference on Computer Vision, Graphics and Image Processing, 2014.
[3] Iosifidis, A. and Tefas, A. and Pitas, I. Multi-view Regularized Extreme Learning Machine for Human Action Recognition. In Artificial Intelligence: Methods and Applications volume 8554, pg. 84-94, Springer International Publishing, 2014.
[4] Iosifidis, A. and Tefas, A. and Nikolaidis, N. and Pitas, I. Human action recognition in stereoscopic videos based on bag of features and disparity pyramids.
[5] Hadfield, S. and Lebeda, K. and Bowden, R. Natural action recognition using invariant 3D motion encoding. In Proceedings, European Conference on Computer Vision (ECCV), Springer Lecture Notes in Computer Science volume 8690, pg. 758-771, 2014. (Code below)
[6] Mademlis, I. and Iosifidis, A. and Tefas, A. and Nikolaidis, N. and Pitas, I. Stereoscopic Video Description for Human Action Recognition. In the IEEE Symposium Series on Computational Intelligence (SSCI), 2014.
[7] Iosifidis, A. and Tefas, A. and Pitas, I. Human action recognition based on bag of features and multi-view neural networks. In Proceedings International Conference on Image Processing (ICIP), pg. 1510-1514, 2014.
Data and Code
To access the data and code, please enter your details below.
Calibrations
The stereo calibrations for each sequence in the dataset (as described in this paper) are available here.