REMOTE – (Senior) Genetic Data Scientist – Statistical Geneticist


Job Overview

* This position is based in the Boston area with HQ in South San Francisco. Candidates who live outside the Boston area should still apply for the remote position.

The Opportunity

Key to insitro’s approach to rethinking drug development is linking in vitro cellular phenotypes with patient phenotypes. As a Statistical Geneticist, you will lead the development of cutting-edge statistical approaches and workflows to analyze large-scale human cohorts with genetic and multi-dimensional, multi-modality phenotypic data. These data will come from public sources, as well as from private datasets generated in-house and obtained through our collaborations. You will use genetic association and other statistical genetics techniques that integrate genetic, genomic and clinical data to guide in vitro disease model development. You will also work closely with a cross-functional team of life scientists, bioengineers and machine learning scientists to integrate human-level data with our high-throughput in-house in vitro genomic and phenotypic data to identify therapeutic targets and develop drugs that have high efficacy and low toxicity.

You will be joining the founding team of a biotech startup that has long-term stability due to significant funding, yet is still very much in formation. A lot can change in this early and exciting phase, providing many opportunities for significant impact. You will work closely with a very talented team, learn a broad range of skills, and help shape insitro’s culture, strategic direction, and outcomes. Join us, and help make a difference to patients!

About You

* Ph.D. in statistics, genetics, computational biology or a related discipline, or equivalent practical experience
* Demonstrated ability to use and develop cutting-edge methods for analyzing human genetic data
* Hands-on experience with genetic association testing (eQTL mapping, GWAS, EWAS, PheWAS, etc.) and/or post-association analysis using summary statistics (fine mapping, colocalization, LD score regression, meta-analysis, etc.)
* Experience mining modern, large-scale genetic databases (e.g., ExAC/gnomAD, UK Biobank, UK10K, EBI GWAS Catalog, 1KG)
* Strong fundamentals in applied multivariate statistics; experience working with statistical models for complex datasets, effectively measuring goodness of fit and estimating confidence
* Proficiency in working with large-scale datasets in Python; experience with R, C/C++ or other compiled, statically typed languages is a plus
* Ability to communicate effectively and collaborate with people of diverse backgrounds and job functions
* Passion for making a difference in the world

Nice to Have

* Experience working with multi-dimensional phenotypes
* Experience working with genomic data from different modalities (DNA sequencing, RNA-seq, proteomics, DNA accessibility assays, etc.)
* Experience integrating genetic studies from diverse populations, including correcting for population structure, ethnicity or differences between genotyping platforms
* Experience applying quantitative trait loci (QTL) approaches to clinical data, including imaging data
* Familiarity with cloud computing services (e.g., AWS or GCP) and workflow management tools or batch scheduling systems (e.g., SLURM)
* Proficiency in a Linux environment (including shell/Bash scripting), experience with database languages (e.g., SQL) and experience with version control practices and tools (e.g., Git)

Benefits at insitro

* Excellent medical, dental, and vision coverage
* Open vacation policy
* Team lunches (catered daily)
* Commuter benefits
* Paid parental leave

About insitro

insitro is a data-driven drug discovery and development company using machine learning and high-throughput biology to transform the way that drugs are discovered and delivered to patients. The company is applying state-of-the-art technologies from bioengineering to create massive data sets that enable the power of modern machine learning methods to be brought to bear on key bottlenecks in pharmaceutical R&D. The resulting predictive models are used to accelerate target selection, to design and develop effective therapeutics, and to inform clinical strategy.

insitro was launched in 2018 with a Series A of $100M funded by top investors including a16z, Arch Venture Partners, Foresite Capital, GV, and Third Rock Ventures. In 2019 the company announced a collaboration with Gilead Sciences in the area of NASH and, in mid-2020, announced a Series B financing of $143M including current investors and new investors Canada Pension Plan Investment Board (CPP Investments), T. Rowe Price, BlackRock, Casdin Capital and other leading investors. The company is located in South San Francisco, CA. For more information about insitro, please visit the company’s website at www.

