Staff Software Engineer, Backend - ML Platform | Amsterdam
Uber
Software Engineering, Data Science
Amsterdam, Netherlands
Posted on Saturday, May 13, 2023
About The Role
The Machine Learning Training team is part of Uber's AI/ML Platform Team (Michelangelo). Our mission is to make it easy to train, tune, and build high-quality models at Uber. We build our own ML training software stack and solve problems at all layers of the stack, including iteration speed, compute efficiency, observability, fault tolerance, and correctness. On top of the core training stack, we build services, libraries, and frameworks, e.g. automatic hyperparameter/architecture optimization, to accelerate the model development process. Check out [1,2] for more information.
Our team moves at a fast pace and gives individuals a high degree of autonomy and agency to effect change. We welcome kind and brilliant people to our team, wherever they come from.
What You’ll Do
- Build elastic, scalable, and fault-tolerant distributed machine learning libraries and systems used to power machine learning development productivity across Uber.
- Define and drive ML development best practices to build high-quality models.
- Act as a tech lead for a group of engineers in the broader Uber ML/AI Platform Team (Michelangelo) to improve the ML Platform ecosystem for our users.
- Work closely with Uber's ML community (with ML Engineers, Data Scientists, and Researchers) to scope and build new abstractions for scalable machine learning.
Basic Qualifications
- 8+ years of relevant production software engineering experience designing and working with distributed systems or frameworks e.g. Spark, Ray, Kafka, Kubernetes, and/or Flink.
- Experience in building scalable and fault-tolerant distributed systems and/or large-scale machine learning systems.
- Experience driving highly ambiguous problems end-to-end, with the willingness and independence to pick up whatever knowledge is missing to get the job done.
- Communication and problem-solving skills for working with multidisciplinary teams across different locations.
- Strong technical leadership with experience in mentoring junior engineers.
Preferred Qualifications
- Contributions to AI frameworks such as PyTorch, TensorFlow, JAX, or XGBoost.
- Papers at top-tier venues such as NeurIPS, ICML, MLSys, ICLR, JMLR.