(Senior) Data Engineer, DevOps


Job Overview

The Opportunity

Data Engineering plays a key role in insitro’s approach to rethinking drug development. The Data Engineering DevOps team ensures that the infrastructure powering our biological data factory’s robots, instruments, and machine learning platform is reliable, scalable, and manageable. You will work closely with a cross-functional team of scientists, bioengineers, and data scientists to identify areas where data engineering can make a difference, developing data architectures and systems on cutting-edge, high-throughput platforms that enable our scientists to be maximally productive. You will design, implement, and deploy cloud infrastructure, including managed databases, application servers, data warehouses, and interactive/batch computing environments. You will also work as part of a team to rigorously design our data platform, identify key architectural performance improvements, and join an on-call rotation to ensure that insitro’s platform runs at maximum productivity.

You will be joining the founding team of a biotech startup that has long-term stability due to significant funding, yet is still very much in formation. A lot can change in this early and exciting phase, providing many opportunities for significant impact. You will work closely with a very talented team, learn a broad range of skills, and help shape insitro’s culture, strategic direction, and outcomes.
Join us, and help make a difference to patients!

About You

* 2-3 years of experience provisioning AWS cloud services (experience with GCP and Azure is also relevant)
* Experience with cloud configuration and resource management tools such as Terraform
* Experience architecting reliable infrastructure platforms, including monitoring and alerting, load balancing, scalable services, and multi-region deployment
* Experience with at least one high-end distributed data processing environment (Hadoop, Spark, etc.)
* Experience with batch computing systems such as AWS Batch or SLURM
* Experience with container build and deployment systems like Docker, Kubernetes, or ECS
* Ability to communicate effectively and collaborate with people of diverse backgrounds and job functions
* Proficiency in a Linux environment (including shell scripting and Python programming), experience with database languages (e.g., SQL, NoSQL), and experience with version control practices and tools (Git, Mercurial, etc.)
* Passion for making a difference in the world

Nice to Have

* Experience with biological data
* Experience managing medium-sized data sets (100TB+) in object storage systems like S3
* Experience defining infrastructure that meets compliance requirements (GDPR, HIPAA, etc.)
* Experience with data processing pipelines
* Experience deploying and monitoring machine learning models in a production environment

Benefits at insitro

* Excellent medical, dental, and vision coverage
* Open vacation policy
* Team lunches (catered daily)
* Commuter benefits
* Paid parental leave

About insitro

insitro is an exciting startup company that aims to take a new approach to drug development: one with big data and machine learning at its core. We plan to build on the ground-breaking innovations that have occurred in life sciences to develop large data sets that are designed from the start to allow machine learning to address fundamental bottlenecks in the drug development process.
Our goal is to cure more people, sooner, and at a much lower cost.

We are fortunate to have strong support from top investors in both biotech and tech: ARCH Ventures, Foresite Capital, A16Z, GV, and Third Rock Ventures. We are building a remarkable team that embodies a new type of culture, one based on a true partnership between scientists, engineers, and data scientists. Together we are working to define the problems, design experiments, analyze the data, and derive the insights that will lead us to new therapeutics. Join us, and help make a difference to patients!
