AI Research Scientist - Multimodal Machine Learning

At Reality Labs Research (RL-R), our goal is to explore, innovate, and design novel interfaces and hardware subsystems for the next generation of virtual, augmented, and mixed reality experiences. We are driving research towards a vision of an always-on augmented reality device that can enable high-quality, contextually relevant interactions across a range of complex, dynamic, real-world tasks in natural environments. To achieve this goal, our team draws on and fundamentally advances methods and knowledge from artificial intelligence, multimodal machine learning, computer vision, and signal processing. We are looking for a skilled and motivated researcher with expertise in multimodal machine learning and large-scale multimodal representation learning. More broadly, the chosen candidate will work with a diverse and highly interdisciplinary team of researchers and engineers and will have access to cutting-edge technology, resources, and testing facilities.

AI Research Scientist - Multimodal Machine Learning Responsibilities

Lead, collaborate, and execute on research that pushes forward the state of the art in multimodal reasoning and generation.

Work collaboratively with other research scientists to develop novel solutions and models in service of contextualized AI for augmented reality.

Curate large datasets across many modalities (e.g., vision, audio, IMU, sEMG).

Lead experimental design, implementation, evaluation, and reporting of results to enable new capabilities for multimodal LLMs.

Mentor other team members. Play a significant role in healthy cross-functional collaboration.

Minimum Qualifications

PhD degree or equivalent experience in Computer Science, Mathematics, Engineering or a related field

Publication record in machine learning, AI, computer science, statistics, applied mathematics, data science, or related technical fields

6+ years of experience in one or more of the following areas: Deep Learning, NLP, Multimodal, Computer Vision, Speech

6+ years of experience developing end-to-end ML pipelines with a focus on dataset preprocessing, model development and evaluation, software integration, and real-time deployment

Experience writing software (Python and C/C++) and executing complex machine learning experiments

Research experience in one or more of these areas: NLP, computer vision, multimodal learning, speech

Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.

Preferred Qualifications

Publication record in AI conferences like NeurIPS, EMNLP, ACL, ICCV, CVPR, ICML, NAACL, ICLR, Interspeech, ICASSP.

Programming experience in Python and hands-on experience with frameworks such as PyTorch.

Experience with developing machine learning models at scale from inception to deployment.