A bayesian framework for word segmentation: Exploring the effects of context, Cognition, vol.112, issue.1, pp.21-54, 2009. ,
A nonparametric bayesian approach to acoustic model discovery, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers, vol.1, pp.40-49, 2012. ,
Variational inference for acoustic unit discovery, Procedia Computer Science, vol.81, pp.80-86, 2016. ,
Unsupervised learning of spoken language with visual context, Advances in Neural Information Processing Systems, pp.1858-1866, 2016. ,
Cognitive science in the era of artificial intelligence: A roadmap for reverse-engineering the infant language-learner, Cognition, vol.173, pp.43-59, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01888694
Speech recognition: A model and a program for research, IRE transactions on information theory, vol.8, issue.2, pp.155-159, 1962. ,
Perception of the speech code, Psychological review, vol.74, issue.6, p.431, 1967. ,
wav2vec: Unsupervised pre-training for speech recognition, 2019. ,
Learning problem-agnostic speech representations from multiple self-supervised tasks, 2019. ,
An unsupervised autoregressive model for speech representation learning, 2019. ,
Mockingjay: Unsupervised speech representation learning with deep bidirectional transformer encoders, 2019. ,
Learning hierarchical discrete linguistic units from visually-grounded speech, International Conference on Learning Representations, 2020. ,
Unsupervised learning of disentangled and interpretable representations from sequential data, Advances in neural information processing systems, pp.1878-1889, 2017. ,
A factorial deep markov model for unsupervised disentangled representation learning from speech, ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.6540-6544, 2019. ,
Disentangled sequential autoencoder, 2018. ,
Unsupervised domain adaptation for robust speech recognition via variational autoencoderbased data augmentation, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp.16-23, 2017. ,
Unsupervised adaptation with interpretable disentangled representations for distant conversational speech recognition, 2018. ,
Your classifier is secretly an energy based model and you should treat it like one, International Conference on Learning Representations, 2020. ,
Black box variational inference, 2013. ,
Structured inference networks for nonlinear state space models, Thirty-first aaai conference on artificial intelligence, 2017. ,
Kernel methods in machine learning, The annals of statistics, pp.1171-1220, 2008. ,
Unsupervised speech representation learning using wavenet autoencoders, speech, and language processing, vol.27, pp.2041-2053, 2019. ,
Generating sentences from a continuous space, 2015. ,
Sequence-tosequence speech recognition with time-depth separable convolutions, 2019. ,
Transformers with convolutional context for asr, 2019. ,
Biva: A very deep hierarchy of latent variables for generative modeling, Advances in neural information processing systems, pp.6548-6558, 2019. ,
Composing graphical models with neural networks for structured representations and fast inference, Advances in neural information processing systems, pp.2946-2954, 2016. ,
Variational message passing with structured inference networks, 2018. ,
Hidden markov model variational autoencoder for acoustic unit discovery, pp.488-492, 2017. ,
Auto-encoding variational bayes, 2013. ,
Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, Proceedings of the 23rd international conference on Machine learning, pp.369-376, 2006. ,
Contrastive multiview coding, 2019. ,
Data-efficient image recognition with contrastive predictive coding, 2019. ,
Librispeech: an asr corpus based on public domain audio books, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5206-5210, 2015. ,
The design for the wall street journalbased csr corpus, Proceedings of the workshop on Speech and Natural Language, pp.357-362, 1992. ,
Neural autoregressive distribution estimation, The Journal of Machine Learning Research, vol.17, issue.1, pp.7184-7220, 2016. ,
Wavenet: A generative model for raw audio, 2016. ,
Samplernn: An unconditional end-to-end neural audio generation model, 2016. ,
Melnet: A generative model for audio in the frequency domain, 2019. ,
Waveglow: A flow-based generative network for speech synthesis, ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.3617-3621, 2019. ,