Quarterly Reading 0x13

2017-07-29

I started writing this about two weeks ago, but the first section, Exactly-once in Kafka, ended up as a full post. The topic has kept its momentum this week, but I will refrain from saying more since there is a lot to catch up on.

Releases

Apache Beam published its first stable release, 2.0.0. The version number jumped directly from 0.6.0 to line up with the Cloud Dataflow 2.0 SDK, which is based on Beam 2.0.0.
The Apache Storm community released 1.0.4, a maintenance release along the 1.0.x line. Note that Storm also has a 1.1.x line with Streaming SQL and Kafka 0.10 support. I guess they are trying to stick to Semantic Versioning, which says to increment the:
1. MAJOR version when you make incompatible API changes,
2. MINOR version when you add functionality in a backwards-compatible manner, and
3. PATCH version when you make backwards-compatible bug fixes.
The downside is that two minor lines now have to be maintained for bug fixes.
Apache Flink has made its fourth major release, 1.3.0, shortly followed by the bug-fix release 1.3.1. Some notable new features are incremental checkpointing for RocksDB, side outputs, and support for retractions in Table API/SQL. There is also a bug-fix version, 1.2.1, for the 1.2 series. Going forward,

Users can expect Flink releases now in a 4 month cycle. At the beginning of the 1.3 release cycle, the community decided to follow a strict time-based release model.
Our Gearpump has also made a bug-fix release, 0.8.4.
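
The Semantic Versioning rules quoted in the Storm item above can be sketched in a few lines. This is only an illustration: the `Version` class, the `bump` method and the change labels `"incompatible"`, `"feature"` and `"bugfix"` are names I made up, not part of any project's tooling, and pre-release or build tags are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """A plain MAJOR.MINOR.PATCH version (hypothetical helper for illustration)."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        # Only "X.Y.Z" is handled; pre-release and build tags are out of scope.
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change: str) -> "Version":
        # The labels below are assumptions that map onto the three quoted rules.
        if change == "incompatible":   # incompatible API change -> MAJOR
            return Version(self.major + 1, 0, 0)
        if change == "feature":        # backwards-compatible functionality -> MINOR
            return Version(self.major, self.minor + 1, 0)
        if change == "bugfix":         # backwards-compatible bug fix -> PATCH
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown change type: {change}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


# Illustrative version numbers only.
print(Version.parse("1.0.3").bump("bugfix"))   # 1.0.4
print(Version.parse("1.0.4").bump("feature"))  # 1.1.0
```

Under this scheme a bug fix that applies to two live minor lines has to be released once per line, which is the maintenance cost mentioned above.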

Google Cloud Big Data

The Google Cloud Big Data blog has a series titled After Lambda: Exactly-once processing in Cloud Dataflow.

They have also announced:

Service-based shuffling, which brings up to 5x performance improvements
Cloud Dataproc 1.2, which makes it faster and easier to run Apache Spark (2.2.0) and Apache Hadoop (2.8.0)

Java land

With Jigsaw finally completed, Java 9 is around the corner (Sep. 21st). The Azul blog has published a summary of the changes required when upgrading to Java 9.
Wondering how to take advantage of Single Instruction, Multiple Data (SIMD) in Java? As of Java 8, there is no way to use SIMD intrinsics in Java directly.
Brendan Gregg has made a Java Package Flame Graph to visualize which packages take up the most CPU time.
Kotlin is trending now that Android has made it an official language. Why not Scala, which also has Android support? Here is how one Scala developer takes it, drawing an analogy to skiing.

That's it, but I still have more than 50 articles in my Pocket (a read-it-later app). "Read-it-later" is actually becoming "never read". That has made me start a new experiment, read-it-now, to improve my reading quality.

Rather than saving an article to Pocket to read it later, I’d like to read it now, take notes and put down my thoughts in a GitHub issue.

ManuZhang's Blog

Static Blog generated in Scala
