2016-09-04
In the last reading, I missed an important bug fix in Apache Storm 1.0.2, STORM-1728: TransactionalTridentKafkaSpout error, which I ran into myself while writing a TridentKafka pipeline. In 1.0.1, you can only use OpaqueTridentKafkaSpout or write your own. Okay, let's see what's new.
Kafka Streams' fault tolerance mechanism resumes processing from where it left off. There are cases, however, where users want to reprocess data from scratch for testing or to address bugs. That is no easy task, since you need to clean up committed offsets and state manually. Luckily, Kafka provides an application reset tool. Data Reprocessing with Kafka Streams: Resetting a Streams Application has a thorough introduction to what's happening behind the scenes.
Stephan Ewen, CTO of data Artisans, and Neha Narkhede, CTO of Confluent, jointly posted Apache Flink and Apache Kafka Streams. Better together? No, it is a comparison and guideline for users.
In summary, while there certainly is an overlap between Kafka Streams and Flink, they live in different parts of a company, largely due to differences in their architecture and thus we see them as complementary systems.
I see them forming an alliance. Do you?
The Lightbend blog already has a good summary of Akka updates for August. I'd like to highlight those that interested me most.
The Akka team introduced Alpakka, the Akka Streams integration project, to build an ecosystem of connectors, similar to Apache Camel.
Although it is a streaming library, Akka Streams is not yet able to window data based on time. Adam Warski from SoftwareMill wrote about how to implement data windowing in Akka Streams. I also like his introduction to windowing.
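To make the idea of time-based windowing concrete, here is a minimal sketch of tumbling (fixed-size, non-overlapping) event-time windows in plain Java. It is not Akka Streams code and not the implementation from Adam Warski's post; the class name `TumblingWindows`, its methods, and the sample data are all made up for illustration. Each event goes into the window whose start is its timestamp rounded down to a multiple of the window size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only: groups events into tumbling windows.
final class TumblingWindows {

    // Start of the window that contains the given timestamp.
    static long windowStart(long timestampMillis, long sizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, sizeMillis);
    }

    // Assigns each value to its window, keyed by window start (sorted by time).
    static Map<Long, List<String>> assign(long[] timestamps, String[] values, long sizeMillis) {
        Map<Long, List<String>> windows = new TreeMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            long start = windowStart(timestamps[i], sizeMillis);
            windows.computeIfAbsent(start, k -> new ArrayList<>()).add(values[i]);
        }
        return windows;
    }

    public static void main(String[] args) {
        long[] timestamps = {1_000L, 9_999L, 10_000L, 25_000L};
        String[] values = {"a", "b", "c", "d"};
        // 10-second windows: prints {0=[a, b], 10000=[c], 20000=[d]}
        System.out.println(assign(timestamps, values, 10_000L));
    }
}
```

A real streaming implementation also has to decide when a window is complete and what to do with late events, which this batch-style sketch leaves out.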
PayPal Scaled To Billions Of Transactions Daily Using Just 8 VMs with Akka and Scala.
There is far more interesting stuff, so check out the original post.
How do you compare Strings in Java?
It compares strings by the first differing character, falling back to the length difference when they are identical up to the end of the shorter string
Have you heard of a second implementation? If not, you may like How the JVM compares your strings using the craziest x86 instruction you've never heard of. Note from the comments that
Just because there is an instruction for it doesn't mean it's faster than some other simpler approach
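The comparison rule quoted above (first differing character, then length difference) can be sketched in a few lines of Java. This is only an illustration of the documented `String.compareTo` behavior, not the JDK source and not the intrinsic discussed in the linked article; the class name `LexCompare` is made up for this example.

```java
// Hypothetical helper, for illustration only: mirrors the rule behind String.compareTo.
final class LexCompare {

    static int compare(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            char ca = a.charAt(i);
            char cb = b.charAt(i);
            if (ca != cb) {
                // The first differing character decides the result.
                return ca - cb;
            }
        }
        // Identical up to the end of the shorter string: fall back to length difference.
        return a.length() - b.length();
    }

    public static void main(String[] args) {
        System.out.println(compare("apple", "apricot")); // -2 ('p' minus 'r')
        System.out.println(compare("storm", "storms"));  // -1 (length difference)
        System.out.println(compare("kafka", "kafka"));   // 0
    }
}
```

For these inputs the sketch returns the same values as `"apple".compareTo("apricot")` and friends, which makes it easy to check against the real method.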
LinkedIn shared the real-time bidding pipeline that backs its sponsored content. It is a nice introduction to Real-Time Bidding (RTB).
It seems there are far more interesting things than you are able to work on, so how do you decide?
That's all for the last two weeks. I have done weekly, monthly, and now biweekly readings. What's next?