Senior Data Engineer

Stanford University

Job Overview

School of Medicine, Stanford, California, United States
Information Technology Services
Post Date: Jul 24, 2020
Requisition #: 86951

We enter the world with a set of genes that determine many factors of our health. From that moment forward, every aspect of our lives affects our health. There is unequivocal evidence that social, environmental, and behavioral factors are by far the strongest predictors of health and disease. Founded in 2015, the Stanford Center for Population Health Sciences (PHS) has a mission to improve population health by bringing together diverse disciplines and data to better understand and address the social and environmental determinants of health.

One of PHS’s strategic goals is to spur and support cutting-edge transdisciplinary population research that advances our mission. Toward this end, we have created a world-class data ecosystem that hosts over 100 datasets, supporting 800 researchers and ~1,500 research projects. The PHS data team is responsible for managing the PHS data ecosystem, advancing critical research projects and initiatives, developing novel data products and services, and acquiring new datasets that stimulate population health research. The Senior Data Engineer will be responsible for administering, monitoring, and maintaining the PHS Google Cloud environment. She/he will work as a technical bridge between PHS and our partners, who provide PHS researchers with the technical infrastructure that allows them to analyze high-risk data in a secure environment.

To advance PHS’s goals, the data team is expanding its data portfolio to include additional high-value datasets (focusing on datasets that include social determinants). The Senior Data Engineer will manage a portfolio of high-value datasets, which will involve data cleaning, curation, annotation, and transformation into common data models.

The integrity, transparency, and reproducibility of science increasingly depend on the accuracy and precision of decisions made during data curation and on the degree to which those decisions are annotated in a way that enables the data to be both used and reused. The Senior Data Engineer must excel at curating the data and at communicating the decisions made, and how to use the data, to the hundreds of researchers who will be using the canonical datasets. PHS has a large portfolio of exciting research and data partnerships around the world, and working groups with different health focuses (e.g., air pollution and health, health disparities, mental health) that vary in scope, complexity, and topic. She/he will serve as the data lead and provide technical support to some of the working groups, particularly teams requiring machine learning or NLP methods.

Duties include:
+ Install, configure, and support relational database management systems (RDBMS) and related software to resolve highly complex and/or unique issues without precedent and/or structure. Work on databases using more advanced database administration concepts and modern (cloud) tools.
+ Design, develop, and maintain highly available databases.
+ Take responsibility for database design and performance optimization, backup and recovery strategies and implementation, as well as monitoring the overall health of the database environment.
+ Execute complete database solutions from evaluation to implementation.
+ Provide ongoing system administration and technical infrastructure support.
+ Review the physical and logical design of databases for optimal database structures, performance tuning, security, and database backup/recovery.
+ Plan and implement proactive and reactive performance analysis, monitoring, troubleshooting, and capacity planning.
+ Evaluate and test marketplace tools and utilities, including system integration and automation, which enhance server functionality and promote the development of RDBMS applications.
+ Develop and maintain efficient and appropriate connectivity solutions between various campus databases to ensure necessary data is available as needed.
+ Create data flow and data lifecycle documentation.
+ Develop and enforce database standards and/or new development protocols.
+ Experience in Extract, Transform, Load (ETL) processes and building pipelines.

* – Other duties may also be assigned

DESIRED QUALIFICATIONS:
+ 5+ years of experience working with large datasets in a cloud environment
+ 3+ years of previous analytics projects on large Electronic Healthcare Records, medical claims, or datasets of comparable size, sensitivity, and complexity
+ Proficiency in the Google Cloud suite of tools, such as BigQuery, Compute Engine, Dataprep, Healthcare API, AI Platform, etc.
+ Proficiency in Python packages for data science and statistics
+ Strong knowledge of relational databases and SQL
+ Strong skills in statistics, probability, and machine learning
+ Experience with Cloud Services and Cloud Automation solutions
+ Experience working with automated deployments and source code/configuration management tools (such as GitHub)
+ Great communication skills, and experience telling stories with data
+ Familiarity with medical terms and healthcare billing systems
+ Previous experience with Natural Language Processing is a plus
+ Advanced cloud skills
+ Data annotation and data science
+ ML skills

EDUCATION & EXPERIENCE (REQUIRED):
+ Bachelor’s degree and eight years of relevant experience or a combination of education and relevant experience.

KNOWLEDGE, SKILLS AND ABILITIES (REQUIRED):
+ Demonstrated experience in the support requirements of large database applications/systems, including performance analysis and tuning of high-volume transaction systems.
+ Knowledge of hierarchical database systems.
+ Extensive knowledge of systems platform.
+ Extensive knowledge and experience in the design and implementation of complex database architecture.
+ Extensive experience in programming environments.
+ Ability to develop a vision for future computing needs and to develop appropriate plans to meet these needs.
+ Ability to work well independently or as a team member.
+ Ability to maintain knowledge of new and current technology and trends.
+ Ability to collaborate with vendors and system service providers, provide administration for system solutions, and act as a liaison.
+ Extensive experience with database security protocol.

PHYSICAL REQUIREMENTS*:
+ Constantly perform desk-based computer tasks.
+ Frequently sit, grasp lightly/fine manipulation.
+ Occasionally stand/walk, use a telephone.
+ Rarely write by hand, lift/carry/push/pull objects that weigh up to 10 pounds.

* – Consistent with its obligations under the law, the University will provide reasonable accommodation to any employee with a disability who requires accommodation to perform the essential functions of his or her job.

WORKING CONDITIONS:
+ Evening and weekend work required on occasion

WORK STANDARDS:
+ Interpersonal Skills: Demonstrates the ability to work well with Stanford colleagues and clients and with external organizations.
+ Promote Culture of Safety: Demonstrates commitment to personal responsibility and value for safety; communicates safety concerns; uses and promotes safe behaviors based on training and lessons learned.
+ Subject to and expected to comply with all applicable University policies and procedures, including but not limited to the personnel policies and other policies found in the University’s Administrative Guide, tanford.edu.

Additional Information
+ Schedule: Full-time
+ Job Code: 4762
+ Employee Status: Regular
+ Grade: K
+ Requisition ID: 86951
