About Guide Labs
At Guide Labs, we are building AI systems that are engineered, from the start, to be modular, easily audited, steered, and understood by humans. Conventional model development optimizes for narrow performance measures and defers key interpretability (and reliability) concerns until after the model has been pre-trained. This approach is fundamentally broken. Guide Labs reimagines the entire model development process: the training dataset, the model architecture, the loss function, the pre-training and post-training algorithms, all the way to how users interact with the model.
Team
We are interpretability researchers and engineers who have been responsible for major advances in the field. We have set out to rethink how machine learning models, AI agents, and AI systems more broadly are developed.
A Note About Experience
We do not care about formal credentials. If you share our vision, have evidence that demonstrates you can be effective in this role, and would like to get involved, please reach out. If some of the description below doesn't line up perfectly with your profile or past experience, we still encourage you to apply!
About The Role
We are building generative models, across modalities, that are engineered to be modular and easy to audit. As a research scientist, you will be responsible for conducting research, empirical or theoretical, towards developing AI systems that are modular and easy to understand, steer, and audit. We are looking for people who prioritize impact over traditional academic publications. We expect to publish as we see fit; however, we seek researchers who are excited about translating interpretability research into useful products and insights for end users.
You’ll work to reimagine the entire model-building stack, from training and fine-tuning all the way to deployment. We are looking for researchers across multiple disciplines:
- Generative modeling
- Reinforcement learning
- General interpretability (e.g., feature attribution, concepts, training data attribution, counterfactual explanations, chain-of-thought, self-explanations)
- Spurious correlations and model debugging
- Optimization
- Human-computer interaction, and others
Responsibilities
- Design and implement machine learning algorithms to enhance interpretability, modularity, and steerability.
- Reimagine interpretability needs for generative models, and run experiments to demonstrate the reliability of algorithms that claim to meet those needs.
- Work to scale interpretable models to billions of parameters.
- Design experiments, datasets, and algorithms that can effectively help answer research questions relating to interpretability needs.
- Run large-scale experiments to test the reliability of a training pipeline.
- Build infrastructure for running experiments and visualizing the results.
- Interpret and communicate experimental results.
Our Stack
- We are not dogmatic; we believe in using the right tool for the job. That said, most of our stack is currently: PyTorch (with Lightning, Hydra, and Weights & Biases), JAX, XLA/MLIR, cloud platforms (AWS, GCP, Azure), and HPC/Kubernetes systems.
You are
- Enthusiastic about translating algorithmic insights into practice and see research and engineering as two sides of the same coin.
- Able to translate research questions into testable experiments, design and run those experiments, and interpret the results.
- Enthusiastic about engineering and about moving research ideas into products.
- Very comfortable in Python, and able to pick up other languages as needed.
- Interested in contributing ideas, figures, and writing to research papers, blog posts, and other technical documents.
- Passionate about engineering best practices.
- Interested in working as part of a collaborative team rather than on solo efforts.
Compensation Range: $150K - $200K