Seminar
Read papers, summarize them, and share the summaries
2024
- VALL-E: Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
- Voicebox
- MQTTS: A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech
- CrossSpeech
- CONCSS: Contrastive-based Context Comprehension for Dialogue-appropriate Prosody in Conversational Speech Synthesis
- GenerSpeech
- VarianceFlow: High-Quality and Controllable Text-to-Speech Using Variance Information via Normalizing Flow
- DSE-TTS
- UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
- Flow matching: summary of the equations
- CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training
- P-Flow
- Matcha-TTS
- M2-CTTS: End-to-End Multi-Scale Multi-Modal Conversational Text-to-Speech Synthesis
2023
- Mega-TTS
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling
- Encoding Speaker-specific Latent Speech Feature for Speech Synthesis
- Fine-Grained Emotional Control of Text-to-Speech: Learning to Rank Inter- and Intra-Class Emotion Intensities
- Diff-TTS
- VITS [ICML 2021]
- Denoising Diffusion Probabilistic Models
- Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
- AdaSpeech
- Meta-StyleSpeech
- Variational Inference with Normalizing Flows
- Glow: Generative Flow with Invertible 1×1 Convolutions
- Emo-Q
- FastSpeech 2
- FluentSpeech
- [Author], [Paper title], [Journal/Conference], [year]