Principal Distributed Systems Engineer, AI Infrastructure

NVIDIA

Job Overview

NVIDIA is hiring a Principal Distributed Systems Engineer to architect, lead, and develop AI infrastructure and deep learning platforms! You will need strong programming skills and a deep understanding of cloud technologies, distributed storage and compute systems, and distributed systems architecture. You will also need excellent communication and planning skills, along with technical leadership skills. Together, we will build the software 2.0 cloud platform for one of the most challenging problems of our time: autonomous vehicles. Then we will apply it to other applications such as medical imaging, data science, genomics, and more.

What you’ll be doing:

* Architect and build scalable, distributed services that help power the AI infrastructure for deep learning platforms.
* Design and build infrastructure and microservices that help index, mine, transform, and compose petabyte-sized deep learning datasets.
* Design the next generation of dataset management services for real and synthetic/simulated datasets.
* Enable smart data selection, one of the key ingredients for successful machine learning!
* Collaborate with multiple AI teams to understand their requirements and build a platform infrastructure that improves productivity.
* Be a technical leader on various projects across the platform, and a major contributor to the entire platform’s architecture.
* Collaborate with applied AI researchers and leaders to build future-proof infrastructure.

What we need to see:

* Master’s degree in Computer Architecture, Computer Science, Electrical Engineering, or a related field.
* 14+ years of work or research experience in distributed systems development and design.
* Solid technical foundation in distributed computing and storage, including significant experience with most of the following: server systems, storage, I/O, networking, and systems software.
* Advanced programming skills to build distributed storage and compute systems, backend services, microservices, and web technologies.
* Specialist-level programming ability in Go and/or C/C++.
* Ability to switch effectively between long-term strategic management and near-term tactical management.
* High motivation and strong interpersonal skills, with the ability to work successfully with multi-functional teams, principals, and architects, and to coordinate effectively across organizational boundaries and geographies.
* A track record of successful technical leadership and large-scale architecture that impacted critical projects.

Ways to stand out from the crowd:

* Advanced programming expertise in Scala.
* Experience with Kubernetes and Docker.
* Enthusiasm to collaborate on and build supporting development infrastructure such as CI/CD and DevOps.
* A go-getter attitude to investigate and understand technical requirements.

With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology industry’s most desirable employers. We have some of the most brilliant and talented people in the world working with us, and our engineering teams are growing fast in some of the hottest state-of-the-art fields: Deep Learning, Artificial Intelligence, and Autonomous Vehicles. If you’re a creative and autonomous computer scientist with a real passion for distributed systems and parallel computing, we want to hear from you.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.
