Work | MTS 2 Software Engineer, Cloud Data Platform (Jun 2019 - Feb 2026)
Led the construction of the next-generation lakehouse based on Apache Iceberg
-
Scaled Apache Iceberg in production by developing internal customizations, managing Spark-Iceberg integration,
and contributing bug fixes and features back to the community.
-
Developed a custom migration solution that moved 300+ Spark/Hive tables to Iceberg tables
without copying data or disrupting business workloads, while meeting GDPR and other compliance requirements.
-
Developed a Data Lake Optimizer that watches Iceberg commit events and triggers compaction jobs on demand to merge small files,
reducing metadata overhead and read amplification, improving query stability and performance, and saving cluster resources.
-
Designed an Iceberg-based CDC data ingestion solution that addresses query performance degradation under streaming updates,
meeting near-real-time data analysis needs.
Developed and maintained a high-availability Spark platform, supporting PB-level data processing and analysis requirements
-
Managed the full lifecycle of internal Spark distributions, including custom feature backporting, proactive performance tuning,
and the development of standardized production troubleshooting frameworks.
-
Continuously evolved the job monitoring and diagnosis platform, which parses Spark EventLogs to extract task execution and diagnostic information
and persists it in Elasticsearch to support query and diagnostic services, improving self-service troubleshooting efficiency.
-
Led the large-scale rollout of Spark AQE (Adaptive Query Execution) on the production platform, handling data skew scenarios without manual intervention;
reported critical issues and contributed bug fixes to the Spark community along the way.
-
Built a centralized Spark configuration service that supports canary releases and rollbacks of multiple versions and configuration changes across multiple clusters,
satisfying diverse business requirements under a unified platform.
-
Developed an automated Spark major-version upgrade pipeline based on the centralized configuration service and diagnostic platform,
achieving a smooth upgrade of 10,000+ jobs from Spark 2.4 to 3.1 and then to 3.5; fixed issues found during the upgrade and contributed the fixes back to the community.
Senior Software Engineer, Machine Learning Platform (Aug 2017 - Jun 2019)
-
Led the architecture and development of an end-to-end Machine Learning Platform,
providing an integrated workspace for data scientists and engineers to develop,
train, and deploy models.
-
Built a high-performance drag-and-drop pipeline editor that abstracted complex execution logic,
enabling non-technical users to build production-grade ML workflows.
-
Developed stream processing jobs to prepare real-time data for model training,
greatly improving model freshness.
Software Engineer, Gearpump (Jan 2015 - Aug 2017)
-
Developed Gearpump, a next-generation real-time big data engine based on Akka;
responsible for developing Kafka connectors, Storm compatibility layers, and transaction APIs, and for integrating with Apache Beam;
contributed the project to the Apache Software Foundation.
Software Engineer, Intel Hadoop Distribution (Jul 2013 - Jan 2015)
-
Developed storm-benchmark, a specialized benchmarking suite to profile Apache Storm performance at scale,
uncovering architectural bottlenecks.
-
Contributed to mapreduce-nativetask, which boosted MapReduce performance by up to 30%
and was merged into Hadoop trunk.
|
Skills | Programming Languages: Scala, Java, Python, SQL, Shell, C/C++, JavaScript, Haskell
Distributed Frameworks: Iceberg, Spark, Hadoop, Akka, Beam, Storm, Kafka, Cassandra
Tools: Docker/Kubernetes, Ubuntu, CentOS, Git, IntelliJ, Jupyter/JupyterLab, Vim, Emacs
|