VALL-E: Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

22 Feb 2024 in Seminar on Text-to-Speech

VALL-E paper summary

Goal

Voicebox

21 Feb 2024 in Seminar on Text-to-Speech

Voicebox paper summary

Matthew Le, Apoorv Vyas, Bowen Shi, Brian Karrer, Leda Sari, Rashel Moritz, Mary Williamson, Vimal Manohar, Yossi Adi, Jay Mahadeokar and Wei-Ning Hsu
"Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale"
Accepted by NeurIPS 2023
[Paper] [Demo] [Unofficial Code]

MQTTS: A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech

15 Feb 2024 in Seminar on Text-to-Speech

MQTTS paper summary

Li-Wei Chen, Shinji Watanabe, Alexander Rudnicky
Accepted by AAAI 2023
[Paper] [Demo] [Code]

CrossSpeech

07 Feb 2024 in Seminar on Text-to-Speech

CrossSpeech paper summary

Ji-Hoon Kim, Hong-Sun Yang, Yoon-Cheol Ju, Il-Hwan Kim and Byeong-Yeol Kim
"CrossSpeech: Speaker-Independent Acoustic Representation for Cross-Lingual Speech Synthesis"
Accepted by ICASSP 2023
[Paper] [Demo] [Code X]

CONCSS: Contrastive-based Context Comprehension for Dialogue-appropriate Prosody in Conversational Speech Synthesis

01 Feb 2024 in Seminar on Text-to-Speech

CONCSS paper summary

Yayue Deng, Jinlong Xue, Yukang Jia, Qifei Li, Yichen Han, Fengping Wang, Yingming Gao, Dengfeng Ke, Ya Li
Accepted by ICASSP 2024
[Paper] [Demo]

PRML Lab. Speech Team
