Project Hydrogen: Unifying State-of-the-art Big Data and AI in Apache Spark

Xingbo Jiang

Databricks, Software Engineer

Xingbo Jiang is a software engineer at Databricks, where he investigates use cases for Spark Core and Spark SQL. Xingbo is an active contributor to Apache Spark. His areas of interest include distributed systems, databases, and data warehouses.


Big data and AI are joined at the hip: the best AI applications require massive amounts of constantly updated training data to build state-of-the-art models. AI has always been one of the most exciting applications of big data and Apache Spark. Increasingly, Spark users want to integrate Spark with distributed deep learning and machine learning frameworks built for state-of-the-art training. This talk introduces a new project that substantially improves the performance and fault recovery of distributed deep learning and machine learning frameworks on Spark.