The Generative AI Org at Meta builds industry-leading LLM foundation models to enable many Meta products and open source. As an Applied Research Scientist in Generative AI, you have the opportunity to help push state-of-the-art research, build industry-leading foundation models, and bring them to products to impact billions of users.

GenAI Audio's mission is to build industry-leading Speech & Audio foundation models to enable Meta products. Our current focus is on controllable generative speech synthesis, to give our AI agents human-like voices.

The current status:

We have built a Speech foundation model that can generate high-quality voice given 10-20 seconds of the target speaker's voice, and can support deep cloning with 30-60 minutes of voice recordings.

Research Scientist, Generative AI - Speech & Audio Responsibilities

Independently design and implement algorithms, train state-of-the-art speech recognition models on large data, and evaluate their performance.
Perform research to tackle unsolved real-world problems and push the state of the art in speech recognition.
Develop machine learning models for both server-based and embedded systems.
Contribute research that can be applied to Meta product development.

Minimum Qualifications

PhD in the field of Artificial Intelligence, a related field, or equivalent practical experience. Degree must be completed prior to joining Meta.
Research experience in one or more of these areas: speech recognition, speech processing, machine learning, deep learning, or related fields.
Experience with development and implementation of speech recognition algorithms/systems and model training.
Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
Knowledge of deep learning and neural networks.
Experience working with machine learning libraries like PyTorch, TensorFlow, etc.
Familiarity with scripting languages such as Python and shell scripts.
Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment.

Preferred Qualifications

Experience with developing scalable machine learning models in at least one of the following areas: automatic speech recognition, speech synthesis, LLM, or relevant areas.
Experience with large-scale model training, implementing algorithms, and evaluating speech systems.
Proven track record of achieving significant results as demonstrated by publications at leading workshops, journals or conferences such as ICASSP, INTERSPEECH, or similar.
Experience taking ideas from research to production.
Experience solving complex problems and comparing alternative solutions, tradeoffs, and diverse points of view to determine a path forward.
Experience working and communicating cross-functionally in a team environment.