Monthly Reading 0xB


It seems everyone in Big Data is doing a weekly / bi-weekly / monthly review. Here is my subscription list

Now I'm writing up a review based on reviews.


A bunch of maintenance releases from the open source community.



  • 7-big-data-tools-to-ditch-in-2017. Both MapReduce and its streaming counterpart Storm make the list. Wait, are you sure you want to ditch Java?

  • Scalable Stream Processing: A Survey of Storm, Samza, Spark and Flink

    give an overview over the state of the art of stream processors for low-latency Big Data analytics and conduct a qualitative comparison of the most popular contenders

  • A 4-minute read on How Kafka’s Storage Internals Work. I'm a big fan of such short, informative stories.

  • Jay Kreps has a new masterpiece, Sharing is Caring: Multi-tenancy in Distributed Data Systems

    You see hundreds of blog posts on benchmarking infrastructure systems—showing millions of requests per second on vast clusters—but far fewer about the work of scaling a system to hundreds or thousands of engineers and use cases. It’s just a lot harder to quantify multi-tenancy than it is to quantify scalability.

    Couldn't agree more! A multi-tenancy benchmark, anyone? I'll leave you here to think more about this topic.