About AssemblyAI

At AssemblyAI, we’re building at the forefront of Speech AI, creating powerful models for speech-to-text and speech understanding available through a straightforward API. With more than 200,000 developers building on our API and over 5,000 paying customers, AssemblyAI is helping unlock and support the next generation of powerful, meaningful products built with AI.

Progress in AI is moving at an unprecedented pace, and our team of AI research experts is focused on keeping our customers on the cutting edge with production-ready models that are continually updated as we improve accuracy, latency, and what’s possible with Speech AI. Our models consistently rank highest in industry benchmarks for accuracy, outperforming models from Google and Amazon and producing up to 30% fewer hallucinations than OpenAI’s Whisper. Our models power more than 2 billion end-user experiences each day, helping companies better understand customer feedback, run more productive meetings with automated meeting notes, and improve childhood literacy via ed tech tools.

We’ve raised funding from leading investors including Accel, Insight Partners, Y Combinator’s AI Fund, Patrick and John Collison, Nat Friedman, and Daniel Gross. We’re a remote team working to build one of the next great AI companies, and we’re looking for driven, talented people to help us get there!

About the role:

We are looking for a Senior Research Engineer to join our Research team. The core competencies of AssemblyAI include building best-in-class AI models, writing optimized inference code, and creating infrastructure to support deep learning work. This role involves both research engineering and applied research in Speech and Natural Language Processing (NLP) within a collaborative team environment. Specific responsibilities include enabling cutting-edge performance of our training software stack on large TPU clusters, writing code for state-of-the-art model training, and optimizing code for both training and inference. The ideal candidate will have a strong background in research engineering and deep learning with JAX, along with experience writing high-quality code. With a fine-grained understanding of deep learning fundamentals, training hardware, and distributed systems, this person will contribute to all research projects in addition to the specific ones assigned to them.

What You’ll Do:

Train large-scale Speech AI models, including speech-focused multi-modal LLMs and ASR models with billions of parameters.
Write and optimize training code for maximum efficiency and memory utilization.
Stay up-to-date on the latest AI research and share insights across the company.
Collaborate with the technology leadership team to prioritize the research and engineering agenda, define project scopes, and lead their execution.

What You’ll Need:

2+ years of professional experience in deep learning research & development.
Hands-on experience with JAX/TPUs and distributed training.
Excellent knowledge of GPU and TPU hardware.
Strong Python programming skills.
Expertise in deep learning architectures and techniques such as CNNs, Transformers, SSMs, self-supervised learning, etc.
Experience with training large Transformer models on distributed systems.
Strong written and verbal communication skills for technical matters.

Nice to Have:

Experience in training/fine-tuning (multimodal) Large Language Models.
Experience in LLM inference using frameworks like vLLM or JetStream.

You may be a great fit if you are:

Highly self-motivated and keen to have real-world impact.
A detail-oriented, analytical, and creative problem solver.
Meticulous in all aspects (code, data, modeling, ...).
Able to work effectively in a team-oriented, collaborative environment.

Location Requirements:

At this time, we are only able to hire for this role in the USA, Canada, UK, Germany and Switzerland.

Pay Transparency:

AssemblyAI strives to recruit and retain exceptional talent from diverse backgrounds while ensuring pay equity for our team. Our salary ranges are based on paying competitively for our size, stage and industry, and are one part of many compensation, benefits and other reward opportunities we provide.

There are many factors that go into salary determinations, including relevant experience, skill level and qualifications assessed during the interview process, and maintaining internal equity with peers on the team. The range shared below is a general expectation for the function as posted, but we are also open to considering candidates who may be more experienced than outlined in the job description. In this case, we will communicate any updates in the expected salary range.

Lastly, the provided range is the expected salary for candidates in the U.S. For candidates outside the U.S., the range may differ, and any difference will be communicated to candidates.

Salary range: $185K - $260K

Working at AssemblyAI

We are a small but mighty group of startup veterans and experienced AI researchers with over 20 years of expertise in Machine Learning, Speech Recognition, and NLP. As a fully remote team, we’re looking for people who are ambitious and curious and who lead with integrity. We’re still in the early days of AI and of AssemblyAI’s journey, and we’re looking for teammates who won’t just fit in, but will help us define and build our company culture.

We’re committed to creating a space where our employees can bring their full selves to work and have equal opportunity to succeed. No matter your race, gender identity or expression, sexual orientation, religion, origin, ability, age, or veteran status, if joining this mission speaks to you, we encourage you to apply!

Keep Exploring AssemblyAI:

Check us out on YouTube!

Learn more about AI models for speech recognition

Core Transcription | Audio Intelligence | LeMUR | Try the Playground

Our $50M Series C fundraise