About me

I am a PhD student in Computational Linguistics and hold an MS in Computer Science & Engineering from The Ohio State University, advised by Dr. Micha Elsner and Dr. Andrew Perrault. My research focuses on speech synthesis, reinforcement learning, and generative AI. I have hands-on experience with stable diffusion models, scalable diffusion models with transformers, and GANs.

Updates

  • 05/2024: I am working as an Applied Scientist Intern at Amazon Prime Video, developing an end-to-end speech-to-speech synthesis model that can generate audio in a specified emotion and speaking style with natural speech quality.
  • 07/2023: Our paper received an ACL 2023 Area Chair Award (Linguistic Theories, Cognitive Modeling, and Psycholinguistics).
  • 05/2023: Our paper on exploring how GANs learn phonological representations was accepted to ACL 2023 main conference.
  • 03/2022: I was awarded the Summer Graduate Research Award from the Center for Cognitive and Brain Sciences at OSU! I'm super grateful to my advisor, Dr. Micha Elsner!

Research Experiences

Controllable Text to Speech Model (2024 - present)

  • Develop a lightweight text-to-speech model that generates high-quality, natural-sounding speech in the style of a given speaker (gender, pitch, speaking style, etc.).
  • Implement an LLM as a reinforcement learning agent to control the text-to-speech model (a minimal sketch follows below).
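
A minimal sketch of the controller idea, assuming the LLM agent selects discrete style tokens that condition a frozen TTS model and is updated with REINFORCE; the toy policy network, the `synthesize_and_score` reward stub, and all shapes here are hypothetical placeholders rather than the actual system:

```python
import torch
import torch.nn as nn

# Hypothetical setup: a policy (standing in for the LLM agent) picks a discrete
# style token (e.g., a pitch/speed/emotion bin) that conditions a frozen TTS model.
NUM_STYLE_TOKENS = 16

class StylePolicy(nn.Module):
    """Toy stand-in for an LLM policy mapping a prompt embedding to style-token logits."""
    def __init__(self, prompt_dim=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(prompt_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, NUM_STYLE_TOKENS),
        )

    def forward(self, prompt_emb):
        return torch.distributions.Categorical(logits=self.net(prompt_emb))

def synthesize_and_score(style_token):
    """Placeholder for: run the TTS model with this style token, then score the
    resulting audio with a style/quality critic. Returns a scalar reward."""
    return torch.rand(())  # random reward, for illustration only

policy = StylePolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

# One REINFORCE step: sample a style token, get a reward, push up its log-probability.
prompt_emb = torch.randn(1, 128)          # stand-in for an encoded style request
dist = policy(prompt_emb)
action = dist.sample()
reward = synthesize_and_score(action)
loss = -(dist.log_prob(action) * reward).mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()
```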

EMOCLONE: Speech Emotion Cloning (2024)

  • Developed a variational autoencoder-based end-to-end speech-to-speech model using adversarial learning techniques (a minimal sketch of the VAE backbone follows below).
  • Applied reinforcement learning methods to fine-tune the VAE model for controlling emotion expression, speaker voice, and language settings. [Demo page]
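
A minimal sketch of a VAE backbone of this kind, assuming the model operates on fixed-size mel-spectrogram segments; the `SpeechVAE` name, layer sizes, and loss weighting are illustrative only, and the adversarial and RL components are omitted:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpeechVAE(nn.Module):
    """Toy VAE over flattened mel-spectrogram segments (illustrative sizes only)."""
    def __init__(self, input_dim=80 * 64, latent_dim=64, hidden=512):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden), nn.ReLU())
        self.to_mu = nn.Linear(hidden, latent_dim)
        self.to_logvar = nn.Linear(hidden, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(), nn.Linear(hidden, input_dim)
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        return self.decoder(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar, beta=1.0):
    recon = F.mse_loss(x_hat, x, reduction="mean")            # reconstruction term
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL divergence term
    return recon + beta * kl

model = SpeechVAE()
x = torch.randn(8, 80 * 64)               # batch of flattened mel segments
x_hat, mu, logvar = model(x)
loss = vae_loss(x, x_hat, mu, logvar)
loss.backward()
```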

Optimizing Speech Model Using Advanced RL techniques (2023-2024)

  • Investigating and developing advanced reinforcement learning techniques to enhance diffusion-based speech synthesis models.

Neural discovery of abstract inflectional structure (2023-2024)

  • Develop an RL approach to the tradeoff between memory and prediction in morphological production.
  • NSF-BCS-2217554; Principal Investigators: Dr. Micha Elsner and Dr. Andrea Sims

Explore How GANs Learn Phonological Representations (2021-2022)

  • Train two convolutional neural network (CNN) models on English and French words in an unsupervised manner.
  • Visualize and interpret the intermediate layers of the CNNs to explore what linguistic representations they learn from raw speech data and how acoustic features of nasalization are encoded in different layers (see the sketch after this list).
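
As an illustration of the layer-inspection step, the sketch below uses PyTorch forward hooks to capture intermediate activations for later visualization; the small 1-D CNN here is a generic stand-in for the actual GAN components, and all layer sizes are arbitrary:

```python
import torch
import torch.nn as nn

# Generic 1-D CNN standing in for the actual model; layer sizes are illustrative.
model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=25, stride=4), nn.ReLU(),
    nn.Conv1d(16, 32, kernel_size=25, stride=4), nn.ReLU(),
    nn.Conv1d(32, 64, kernel_size=25, stride=4), nn.ReLU(),
)

activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()  # store this layer's output for later analysis
    return hook

# Register a hook on every convolutional layer.
for idx, layer in enumerate(model):
    if isinstance(layer, nn.Conv1d):
        layer.register_forward_hook(save_activation(f"conv{idx}"))

waveform = torch.randn(1, 1, 16000)  # one second of fake 16 kHz audio
_ = model(waveform)

for name, act in activations.items():
    print(name, tuple(act.shape))    # inspect or plot these per-layer feature maps
```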

Presentation

Jingyi Chen, Micha Elsner. Exploring How Generative Adversarial Networks Learn Phonological Representations. ACL 2023 main conference. [Code] [Paper]

Contact

Contact Email: [LAST_NAME].9220@osu.edu

Don’t hesitate to reach out if you’re curious about my research, eager to dive into exciting topics, or keen on brewing up some potential collaborations – I’m just an email away!