B. Bigot, I. Ferrané, J. &. Pinquier, and R. André-obrecht, Speaker role recognition to help spontaneous conversational speech detection, Proceedings of the 2010 international workshop on Searching spontaneous conversational speech - SSCS '10, 2010.

. Collobert-r, Deep learning for efficient discriminative parsing, AISTATS, 2011.

. Damnati-g.-&-charlet-d, Multi-view approach for speaker turn role labeling in TV broadcast news shows, 2011.

R. Dufour, Y. Estève, and P. Deléglise, Characterizing and detecting spontaneous speech: Application to speaker role recognition, Speech Communication, vol.56, pp.1-18, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01433222

B. Feng, J. Bai, Z. Chen, X. Huang, and B. Xu, Anchor Shot Detection with Deep Neural Network, Advances in Multimedia Information Processing ? PCM 2014, pp.304-312, 2014.

. Giraudel-a, M. Carré, V. Mapelli, J. Kahn, and . Galibert-o.-&-quintard-l, The repere corpus : a multimodal corpus for person recognition, LREC, 2012.

B. Hutchinson, B. Zhang, and M. Ostendorf, Unsupervised broadcast conversation speaker role labeling, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010.

K. P. Boulianne, G. &. Ouellet-p, and . Dumouchel-p, Factor analysis simplified, ICASSP, 2005.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Communications of the ACM, vol.60, issue.6, pp.84-90, 2017.

Y. Liu, Initial study on automatic identification of speaker role in broadcast news speech, Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers on XX - NAACL '06, 2006.

J. Ngiam, . Khosla-a, N. J. Kim-m, H. &. Lee, and . Y. Ng-a, Multimodal deep learning, ICML, 2011.

M. Rouvier, P. Bousquet, and B. Favre, Speaker diarization through speaker embeddings, 2015 23rd European Signal Processing Conference (EUSIPCO), 2015.
URL : https://hal.archives-ouvertes.fr/hal-01194233

M. Rouvier, S. Delecraz, B. Favre, M. &. Bendris, and F. Bechet, Multimodal embedding fusion for robust speaker role recognition in video broadcast, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2015.
URL : https://hal.archives-ouvertes.fr/hal-01475413

M. Rouvier, . Dupuy-g, . Gay-p, M. T. Khoury-e, and . Meignier-s, An open-source state-of-the-art toolbox for broadcast news diarization, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01433449

. Rouvier-m.-&-favre-b, Speaker adaptation of dnn-based asr with i-vectors : Does it actually adapt models to speakers ?, 2014.

W. W. Yaman, S. &. Precoda-k, and . Richey-c, Automatic identification of speaker role and agreement/disagreement in broadcast conversation, ICASSP, 2011.

. Zhang-b, . Hutchinson-b, and . Wu-w.-&-ostendorf-m, Extracting Phrase Patterns with Minimum Redundancy for Unsupervised Speaker Role Classification, NAACL, 2010.

A. De-la-conférence-conjointe and J. , , vol.1, 2016.