About the talk
Modern recommender systems require diverse data processing and feature engineering at tremendous scale, and they usually employ heavy, complex deep learning models that need expensive GPU clusters to shorten training time. Developing an effective, performant recommendation system remains a challenge for most data scientists and engineers.
In this session, we first survey the recommendation system landscape. We then walk through the challenges endemic to building a recommendation system, focusing on the cost-prohibitive nature of training a new system from scratch. Next, we propose an end-to-end solution that makes developing a recommendation system much more cost-, resource-, and time-efficient. This solution includes practical ways to optimize parallel data processing with Spark, along with hyperparameter optimization and model selection with SigOpt. We then apply this method on commodity CPU clusters to demonstrate how this combination of tooling boosts pipeline efficiency for typical end-to-end recommendation systems such as DLRM and DIEN. Finally, we’ll conclude with a discussion of the future of this process and where further research would be valuable.
Jian Zhang is a senior software engineering manager at Intel. Recently, he and his team have focused primarily on implementing and optimizing end-to-end AI solutions on distributed CPU clusters, democratizing AI models to improve scalability and usability on commodity hardware. He has 10+ years of experience in software development and optimization, ranging from open source virtualization systems such as Xen and KVM and distributed storage systems such as Swift and Ceph to big data and AI systems. Jian holds a master's degree in Computer Science and Engineering.