We’re assembling some of the industry’s best and brightest AWS cloud and software engineering talent and pairing that with cutting-edge science and technology to push the boundaries of data, IoT, and AI/ML platforms to accelerate the discovery of transformative medicines.
This newly formed team is responsible for creating an integrated data pipeline, API, and microservices ecosystem using serverless architectures, languages like Python, and AWS capabilities like Athena, DynamoDB, Firehose, Kinesis, Glue, SageMaker, and Lambda. As a member of the team, you’ll be hands-on exploring new capabilities, building rapid prototypes, iterating on longer-lasting solutions, improving our automation capabilities, and establishing our operating frameworks.
Essential Job Duties
* Build and maintain serverless data pipelines, derived datasets, and data discovery platforms at scale using AWS cloud services
* Build and maintain serverless APIs and microservices to simplify data mastering, access, interrogation, cleansing, and exchange
* Develop and implement tests to ensure data quality across all integrated data sources
* Contribute to software development efficiencies by advancing our agile development practices, automated build & test frameworks, privacy-by-design frameworks, and secure coding expertise
* Collaborate directly with technical peers and non-technical end users to understand requirements, invent solutions, and create value quickly
Qualifications
* Bachelor’s degree in Computer Science or a related discipline required; Master’s degree or PhD in a related discipline preferred
* 3+ years of relevant cloud data engineering or software engineering experience
* Experience using AWS cloud services for data processing, storage, computation, monitoring, event processing, machine learning, and messaging, such as Glue, Kinesis, Kinesis Data Firehose, S3, Athena, DynamoDB, Redshift, Neptune, SageMaker, API Gateway, Lambda, and others
* Experience creating, versioning, and supporting RESTful services and APIs at scale
* Experience ingesting and integrating data from many sources using streams, flat files, APIs, and databases
* Expert-level command of Python and SQL, demonstrated by the ability to understand and apply advanced concepts
* Experience selecting and using the optimal database type (relational, graph, columnar, document, …) based on requirements
* Understanding of contemporary data file formats like Parquet
* Experience preparing data for use in a research setting a plus
* Experience with full stack development a plus
* Experience in life sciences industry a plus