Ingest and aggregate data from both internal and external data sources to build our world-class datasets.
Build large-scale batch and real-time data pipelines with data processing frameworks like Scio, Storm, or Spark on the Google Cloud Platform.
Use best practices in continuous integration and delivery.
Help drive optimization, testing, and tooling to improve data quality.
Collaborate with other engineers, ML experts, and stakeholders, taking advantage of the learning and leadership opportunities that arise every day.
Work in multi-functional agile teams to continuously experiment, iterate, and deliver on new product objectives.
Work from our offices in New York, with some travel to other Spotify office locations. We offer a relocation package if you do not live in the area.
Who You Are
You’ve worked with high-volume, heterogeneous data, preferably with distributed systems such as Hadoop, Bigtable, and Cassandra.
You have experience with one or more higher-level JVM-based data processing frameworks such as Beam, Dataflow, Crunch, Scalding, Storm, Spark, or something we didn’t list, but not just Pig, Hive, BigQuery, or other SQL-like abstractions.
You know how to write distributed services in Java, Scala, or Python.
You are knowledgeable about data modeling, data access, and data storage techniques.
You understand the value of collaboration within teams and can build relationships with a diverse set of stakeholders.