Multimodal movie genre classification using recurrent neural network

Published in Multimedia Tools and Applications, 2022

Authors

Tina Behrouzi, Ramin Toosi, Mohammad Ali Akhaee

Abstract

Genre is one of the features of a movie that defines its structure and target audience. A rapidly growing number of streaming companies are interested in deriving movies’ genres automatically. Genre categorization of trailers is challenging because genre is a conceptual attribute: it is not physically present within any single frame and can only be perceived from the trailer as a whole. Moreover, a movie may belong to several genres at the same time. Multi-label learning algorithms have not advanced as significantly as single-label classification models, which makes genre categorization even more complicated. In this paper, we propose a novel multimodal deep recurrent model for movie genre classification. A new structure based on the Gated Recurrent Unit (GRU) is designed to derive spatio-temporal features from movie frames. The video features are then concatenated with the audio features to predict the final genres of the movie. The proposed design outperforms state-of-the-art models in both accuracy and computational cost, and substantially improves the performance of the movie genre classification system.
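
The pipeline outlined in the abstract (a recurrent pass over frame features, concatenation with audio features, and a multi-label output) can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the standard GRU cell, the random untrained weights, the feature dimensions, the 0.5 decision threshold, and the names `GRUCell` and `predict_genres` are all assumptions made for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """A standard GRU cell (hypothetical stand-in for the paper's GRU-based structure)."""

    def __init__(self, input_dim, hidden_dim, rng):
        # Index 0: update gate, 1: reset gate, 2: candidate state.
        self.W = rng.normal(0.0, 0.1, (3, hidden_dim, input_dim))
        self.U = rng.normal(0.0, 0.1, (3, hidden_dim, hidden_dim))
        self.b = np.zeros((3, hidden_dim))
        self.hidden_dim = hidden_dim

    def step(self, x, h):
        z = sigmoid(self.W[0] @ x + self.U[0] @ h + self.b[0])  # update gate
        r = sigmoid(self.W[1] @ x + self.U[1] @ h + self.b[1])  # reset gate
        h_cand = np.tanh(self.W[2] @ x + self.U[2] @ (r * h) + self.b[2])
        return (1.0 - z) * h + z * h_cand

def predict_genres(frame_feats, audio_feat, gru, W_out, b_out, threshold=0.5):
    """Run the GRU over per-frame features, fuse with audio, score each genre."""
    h = np.zeros(gru.hidden_dim)
    for x in frame_feats:                    # temporal pass over the frames
        h = gru.step(x, h)
    fused = np.concatenate([h, audio_feat])  # video + audio feature fusion
    # Independent sigmoid per genre: several genres can be active at once.
    probs = sigmoid(W_out @ fused + b_out)
    return probs, probs >= threshold

# Assumed, illustrative sizes; real features would come from trained extractors.
FRAME_DIM, AUDIO_DIM, HIDDEN_DIM, NUM_GENRES = 16, 8, 12, 4
rng = np.random.default_rng(0)
gru = GRUCell(FRAME_DIM, HIDDEN_DIM, rng)
W_out = rng.normal(0.0, 0.1, (NUM_GENRES, HIDDEN_DIM + AUDIO_DIM))
b_out = np.zeros(NUM_GENRES)

frame_feats = rng.normal(size=(10, FRAME_DIM))  # 10 frames of placeholder features
audio_feat = rng.normal(size=AUDIO_DIM)         # placeholder audio feature vector
probs, labels = predict_genres(frame_feats, audio_feat, gru, W_out, b_out)
```

Using one sigmoid per genre rather than a softmax across genres reflects the multi-label setting: each genre is scored independently, so a trailer can be assigned more than one genre.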