This task is more challenging than action classification as it deals with more fine-grained action classes and needs to capture de- tailed structure information to discriminate co-occurring ac- tion classes. In experiments, we choose two action detec- tion benchmarks to...