About Us
At BOI, we pioneer AI strategy and applied AI, partnering with the world’s most ambitious businesses to unlock the transformative potential of AI. We don’t just help organizations imagine what’s possible; we build the AI products that turn vision into reality.
Our work ranges from developing the world’s first AI-native fashion and retail models for a global retail giant to creating an autonomous innovation engine for a leading CPG company. As a cutting-edge consultancy, we collaborate with Fortune 500 companies on high-impact international projects, helping them innovate, grow, and lead in the autonomous age.
We’re also the creators of Autonomous, the world’s largest gathering of AI innovators, where thought leaders and visionaries come together to redefine industries and shape the future of technology.
If you’re passionate about driving transformative change, joining our founding engineering team, and leading projects at the forefront of AI innovation, we’d love to meet you.
About the Role
We are hiring a Data Engineer to design, build, and manage scalable data pipelines that support our AI-powered tools and applications, including agentic tools that adapt to user behavior. You will harmonize and transform data from disparate sources so it is ready for use in foundation model integrations. The ideal candidate has prior experience in a consulting or agency environment and thrives in project-based settings.
This is a hands-on role where you will build and implement systems from the ground up. You will write production-level code while defining the processes and best practices that support future team growth.
Responsibilities
- Develop and manage ETL pipelines to extract, transform, and load data from various internal and external sources into harmonized datasets.
- Design, optimize, and maintain databases and data storage systems (e.g. PostgreSQL, MongoDB, Azure Data Lake, or AWS S3).
- Implement scalable workflows for automating data harmonization and transformation using tools like Apache Airflow or AWS Glue.
- Collaborate with AI Application Engineers to prepare data for use in foundation model workflows (e.g. embeddings and retrieval-augmented generation setups).
- Ensure data integrity, quality, and security across all pipelines and workflows.
- Monitor, debug, and optimize data pipelines for performance and reliability.
- Support data needs for the development of adaptive agentic tools, including data structuring and optimization.
- Stay current on best practices in cloud data solutions and emerging technologies in data engineering.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
- A minimum of 6 years of professional experience in data engineering, with a focus on building and managing ETL pipelines.
- Proven experience working in a consulting or agency environment on project-based work.
- Advanced proficiency in Python, SQL, and data transformation libraries like pandas or PySpark.
- Strong experience with cloud data platforms (e.g. AWS Glue, Azure Synapse, BigQuery).
- Hands-on experience with data pipeline orchestration tools like Apache Airflow or Prefect.
- Solid understanding of database design and optimization for relational and non-relational databases.
- Familiarity with API integration for ingesting and processing data.
- Advanced English skills, both written and verbal, with the ability to communicate effectively in an international team.
Preferred Qualifications
- Experience with vector database systems (e.g. Pinecone, Weaviate) and RAG workflows.
- Knowledge of data preparation requirements for foundation models.
- Strong understanding of data governance, privacy, and security best practices.