1-year contract (could extend)
Mostly remote in Seattle, WA; San Francisco, CA; San Diego, CA; or New York, NY
Large client in the IT industry
Job Description:
This role is for a Research Scientist who will design and implement robust processes to evaluate and improve LLM-based models. More specifically, they will help ensure that we build models that accelerate the developer work cycle and improve how software is built.
This researcher will:
· Work closely with engineers, PMs, designers, and data scientists to align on research goals and build scalable end-to-end evaluation pipelines.
· Report on their research and make recommendations to stakeholders.
· Work autonomously and mentor other researchers to support the goals of the research organization.
Requirements:
Minimum Qualifications
· 4+ years of experience in an applied research setting, or similar.
· Proven experience developing benchmarks and metrics to evaluate AI models, preferably with a focus on code generation or natural language processing (NLP).
· Familiarity with software engineering principles (unit tests, CI/CD, version control, etc.).
· Strong programming skills in one or more programming languages, including Python.
· 2+ years of experience with machine learning and data science algorithms and tools (e.g., PyTorch).
· Ability to produce meaningful visualizations and reports, and effectively present results to different audiences and across multiple media formats.
· Ability to work independently as well as collaboratively with minimal direction.
· Organized, highly attentive to detail, and able to manage time well.
· Experience working in industry.
The ideal candidate will also possess the following qualifications:
Preferred Qualifications
· Advanced degree (MS/PhD) in data analytics, statistics, or any engineering or computer science field.
· Experience with large language models (LLMs) and knowledge of state-of-the-art code generation techniques.
· Publication(s) in the fields of artificial intelligence, machine learning or data science.
· Familiarity with code generation-specific benchmarks and metrics.
· Ability to modify or fine-tune pre-trained models for specialized tasks.
· Familiarity with code scanning tools.
Skills Required: User Experience