Work

eBay

Shanghai

MTS 2 Software Engineer, Cloud Data Platform

Jun 2019 - Feb 2026

Led the construction of the next-generation lakehouse based on Apache Iceberg
- Scaled Apache Iceberg in production by developing internal customizations, managing Spark-Iceberg integration, and contributing bug fixes and features back to the community.
- Developed a custom migration solution that moved 300+ Spark/Hive tables to Iceberg without copying data or disrupting business workloads, while meeting GDPR and other compliance requirements.
- Developed a Data Lake Optimizer that watches Iceberg commit events and triggers compaction jobs on demand to merge small files, reducing metadata overhead and read amplification, improving query stability and performance, and saving cluster resources.
- Designed an Iceberg-based CDC ingestion solution that addresses query performance degradation under streaming updates, meeting near-real-time analytics needs.
Developed and maintained a high-availability Spark platform, supporting PB-level data processing and analysis requirements
- Managed the full lifecycle of internal Spark distributions, including custom feature backporting, proactive performance tuning, and the development of standardized production troubleshooting frameworks.
- Continuously evolved the job monitoring and diagnosis platform, which parses Spark event logs to extract task execution and diagnostic information and persists it in Elasticsearch to back query and diagnostic services, improving self-service troubleshooting efficiency.
- Led the large-scale rollout of Spark Adaptive Query Execution (AQE) on the production platform, handling data skew without manual intervention; reported critical issues and contributed bug fixes back to the Spark community.
- Built a centralized Spark configuration service that supports canary release and rollback of multiple versions and configuration changes across multiple clusters, serving diverse business requirements from a unified platform.
- Developed an automated pipeline for Spark major version upgrades on top of the centralized configuration service and the diagnostic platform, smoothly upgrading 10,000+ jobs from 2.4 to 3.1 and then to 3.5; fixed issues found during the upgrades and contributed the fixes back to the community.

Vipshop

Shanghai

Senior Software Engineer, Machine Learning Platform

Aug 2017 - Jun 2019

Led the architecture and development of an end-to-end Machine Learning Platform, providing an integrated workspace for data scientists and engineers to develop, train, and deploy models.
Built a high-performance drag-and-drop pipeline editor that abstracted complex execution logic, enabling non-technical users to build production-grade ML workflows.
Developed stream processing jobs to prepare real-time data for model training, greatly improving model freshness.

Intel

Shanghai

Software Engineer, Gearpump

Jan 2015 - Aug 2017

Developed Gearpump, a next-generation real-time big data engine based on Akka; built Kafka connectors, Storm compatibility layers, and transaction APIs, and integrated Gearpump with Apache Beam; contributed the project to the Apache Software Foundation.

Software Engineer, Intel Hadoop Distribution

July 2013 - Jan 2015

Developed storm-benchmark, a specialized benchmarking suite to profile Apache Storm performance at scale, uncovering architectural bottlenecks.
Contributed to mapreduce-nativetask, which improved MapReduce performance by up to 30% and was merged into Hadoop trunk.

Open Source Contributions

Projects

- Apache Iceberg (Top 10 contributors by commits): github.com/apache/iceberg
- Apache Spark (Long time active contributor): github.com/apache/spark
- Apache Beam (committer): github.com/apache/beam
- Gearpump (Core developer): github.com/gearpump/gearpump
- awesome-streaming (over 3k stars): github.com/manuzhang/awesome-streaming

Skills

- Programming Languages: Scala, Java, Python, SQL, Shell, C/C++, JavaScript, Haskell
- Distributed Frameworks: Iceberg, Spark, Hadoop, Akka, Beam, Storm, Kafka, Cassandra
- Tools: Docker/Kubernetes, Ubuntu, CentOS, Git, IntelliJ, Jupyter/JupyterLab, Vim, Emacs

Education

East China Normal University

Shanghai

Undergraduate, Software Engineering

Sep 2009 - July 2013

Tianlun(Manu) Zhang
