I recently read an article about Cloud-Native vs Reactive development, which claims that Cloud-Native looks a lot like Reactive. The article asserts several points that could make the pursuit of both Cloud-Native and Reactive seem unreachably complex.
Most of that message seems targeted at C-level executives, with sweeping statements that are quite inaccurate. It seems appropriate to clarify the actual technical realities around Cloud-Native and Reactive, stated more in terms of business drivers, goals, and general concerns.
I hope my input will help C-level executives, and the technical experts supporting them, to avoid the danger zone of “streaming all the things,” “NoSql all the things,” and “reactive all the things.” Those approaches are certainly important, and possibly worthy of use in any given service, but don’t believe that you will miss out on the benefits of the Cloud unless everything in your digital asset portfolio works that way.
In an effort to bring balance and accuracy to an otherwise super-hyped technology stack, I address, in the subsections that follow, each of the assertions made about Cloud-Native and Reactive that most concerned me.
Streaming data has become quite important and useful. It can make the difference between acting late on slow business intelligence and acting in near real time on fast business intelligence. Where it is critical to act rapidly on data just in time, there is currently no better way to do so than with streaming.
Yet, to state that in the cloud it’s always streaming is just not accurate. There is still persistent state in most business systems and subsystems (services). Call this persistent state data at rest. To garner more value from each subsystem that persists data, it’s critical to inform other subsystems of what has occurred in a given subsystem that could cause the others to react. That’s important because, otherwise, how would the other subsystems know about happenings elsewhere?
Informing other subsystems of happenings in your subsystem can be accomplished in several ways, one of which is streaming. When data is streaming, call it data in motion. Yet, what streaming data is can be interpreted in different ways. Generally we think of streaming as a push-based approach, where other subsystems are proactively notified of various occurrences while the consumer itself remains mostly passive until a notification arrives. Some of the most popular “streaming” solutions are in reality poll-based, where the interested subsystems ask the stream topic for whatever new happenings are currently available.
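The distinction can be sketched with a tiny in-memory topic that supports both styles. The class and method names below are invented for illustration, not taken from any specific streaming product:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// A toy in-memory "topic" contrasting push-based and poll-based consumption.
public class Topic {
    private final List<String> log = new ArrayList<>();                 // records retained by the stream
    private final List<Consumer<String>> listeners = new ArrayList<>();

    // Push style: subscribed consumers are passive; the topic calls them.
    public void subscribe(Consumer<String> listener) {
        listeners.add(listener);
    }

    public void publish(String record) {
        log.add(record);
        for (Consumer<String> listener : listeners) {
            listener.accept(record);                                    // proactive notification
        }
    }

    // Poll style: the consumer actively asks for records past its own offset.
    public List<String> poll(int fromOffset) {
        int start = Math.min(fromOffset, log.size());
        return new ArrayList<>(log.subList(start, log.size()));
    }
}
```

Note that the poll-based consumer tracks its own offset, which is also why retained (persisted) records can be re-read after an outage, a point that matters later in this discussion.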
Another important point to grasp is that there are, in fact, very intelligent and type-safe ways to help the developers of consuming subsystems “parse it,” where “it” is the streaming data. Beware of technology stacks that have largely ignored this essential developer need.
This is a sad commentary on the part of those who promote Reactive, as if ‘your developers don’t cut it.’ Perhaps such messages explain why many enterprises have resisted adopting Reactive even when it makes so much sense to do so. After all, if you must purge much of your existing expertise and then rebuild in an industry that is already strapped for software developers and architects, why would you purposely do so and endanger your core business to chase a hard-to-reach technology stack? C-level executives, team managers and leads, and even developers with hard-won experience should rest assured that existing Java skills can be leveraged in a Reactive architecture and software development environment.
What is more, Java developers should use toolsets that purposely preserve their current experience. They should be able to define Java interfaces as messaging protocols, and then implement objects that fulfill the defined protocols. Along with this, these experienced developers should be afforded modern tooling that supports the vastly concurrent and parallel hardware that has been commodity since approximately 2003.
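As a rough sketch of that idea, assuming invented names rather than any particular toolkit’s API, a plain Java interface serves as the protocol while the implementing object treats each call as a queued message rather than executing it inline:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// The protocol: an ordinary Java interface whose methods are, conceptually, messages.
interface OrderProtocol {
    void placeOrder(String orderId);
}

// A message-driven object: calls enqueue work into a mailbox, which is drained
// one message at a time, so the handler logic needs no explicit locking.
public class OrderService implements OrderProtocol {
    private final Queue<Runnable> mailbox = new ArrayDeque<>();
    private final List<String> accepted = new ArrayList<>();

    @Override
    public void placeOrder(String orderId) {
        mailbox.add(() -> accepted.add(orderId));   // enqueue instead of executing inline
    }

    // A real toolkit would drain the mailbox on a scheduler thread.
    public void processMessages() {
        Runnable next;
        while ((next = mailbox.poll()) != null) {
            next.run();
        }
    }

    public List<String> accepted() {
        return List.copyOf(accepted);
    }
}
```

The point of the sketch is that the developer still writes familiar Java interfaces and classes; the concurrency machinery lives in the tooling, not in the business code.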
If someone tells you that “the skill set at the traditional Java shop is not conducive to the architecture models of modern, cloud, and data-centric applications,” challenge it. They have over-complicated things.
Referring back to the first statement, this is simply not so. Data is persisted identically or similarly to how it has been for decades. To make statements to the contrary is misleading and even dangerous. There are still databases, even relational databases (RDBMSs), in the Cloud. There are also what are known as “No SQL” (or NoSql) stores. Each of these is leveraged for different reasons.
You should not consider it inferior to use an RDBMS when circumstances call for one. In some cases an RDBMS may not be able to meet the scalability needs of a specific set of use cases. At other times it’s just perfect for the task at hand, and arbitrarily ignoring it as a technology choice could lead to much unnecessary complexity.
Think for a moment about this assertion that “you’re dealing with data that are not persisted.” This makes my eye twitch. If data in the Cloud is not persisted, what happens when there is an outage of any kind in the system or subsystems? Of course data is persisted, because persistence is the only way to protect one of your most valuable assets, your data.
Yes, even data in streams is very often persisted, possibly even mostly so. Again, if a stream of data is not persisted, how will your business intelligence analytics continue processing data after some outage somewhere in the critical path? Quite often, even most of the time, streaming data is persisted with a retention period that may range from minutes, hours, days, or weeks, all the way to “forever.”
This of course is not to say that all streaming data is persistent. Persistence is sometimes unnecessary, as when a stream’s consumers need only samples of data every so often. It may also not be possible to persist some streams of data due to throughput and performance requirements. Although having all the data helps any given consumer be more accurate in its calculations and other processing goals, other processing pipelines won’t be negatively impacted by losing some data. In such cases, when an outage causes a relatively minimal amount of data to be lost, the overall consumer processing goals are not compromised.
The important lesson to take from this is to understand when stream persistence is necessary and when it is not, or even when it is detrimental to the overall throughput and scalability goals. Most business stakeholders and technologists can reason about this and make sound decisions that achieve the best business outcome.
True, streams are streams. If you think of a stream of water and yourself being in a fixed position with a fixed line of sight, you get only one opportunity to see a specific section of the stream. What you just saw is now downstream and out of your visibility scope.
However, as noted in the previous subsection, it is common for the strongest streaming options to offer retention periods, such that if for some reason a critical portion of data was not seen, it may be recalled for later consumption. That’s why you need to think about the properties of the consumer itself and whether or not it needs persisted data.
This leads to another point, about “a SQL query” being unavailable. That may be true for the stream itself, but it is possible, and even common, for a stream of data to be projected into a store that enables developers to query it using SQL. One such means of querying live streaming data is known as continuous query, and it has been available for quite a while.
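A minimal sketch of such a projection, using an invented "accountId:amount" event format and a plain in-memory Java map standing in for the SQL read model a real system would target:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Folds a stream of account events into a read model that can then be queried.
public class BalanceProjection {
    private final Map<String, Integer> balances = new LinkedHashMap<>();

    // Each event is "accountId:amount"; applying it updates the read model.
    public void apply(String event) {
        String[] parts = event.split(":");
        balances.merge(parts[0], Integer.parseInt(parts[1]), Integer::sum);
    }

    // The query side: the equivalent of SELECT balance FROM balances WHERE id = ?
    public int balanceOf(String accountId) {
        return balances.getOrDefault(accountId, 0);
    }

    public static BalanceProjection fromStream(List<String> events) {
        BalanceProjection projection = new BalanceProjection();
        events.forEach(projection::apply);
        return projection;
    }
}
```

A continuous query takes this one step further: instead of querying the projection on demand, the query itself stays running and emits updated results as new events arrive.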
This is often the message that emanates from product companies that don’t offer any other kind of solution. Unfortunately this is true of several leading technology stacks. Yet, there are offerings known as “lift and shift” that give you the benefit of preserving your legacy while still gaining Cloud benefits.
Although I won’t first and foremost direct others to create monoliths, creating a monolith is not the worst thing that your development teams could do. A well-modularized monolith is far better than a large array of strongly coupled microservices. The real peril here comes not from monolithic architectures and designs, but from those with weak architecture and poor modularity, which constitute what has become known as the Big Ball of Mud. The name should be self-describing.
Even so, one of the biggest risks that teams face is having experience only with creating Big Balls of Mud, and then imagining that developing with microservices will somehow “fix” their architecture and designs.
As a guideline, use microservices when the rates of change across several business functionalities are quite different. Does trying to coordinate changes across various software updates cause contention across functional teams? What about scalability and throughput? If some business components vary in their scalability and throughput requirements, it can work out best to break those off into smaller microservices, because their scalability and throughput requirements can then be addressed independently from those that do not have the same stringent demands.
Just as it is important to understand when streaming data should be persisted and when it need not be, it’s important to understand when to use microservices and when to use monoliths, or to preserve preexisting ones. Both can be very useful, depending on the service-level agreements around each and the experience of the teams available to develop the various parts of the whole system solution.
Depending on the poll, Java is either the most popular programming language in the world or the second most popular. Java is not going away any time soon. Java SE releases are rolling out at a formerly unheard-of rate, and improvements are being made on a continual basis. Does Java as a programming language have issues? Of course, but as Bjarne Stroustrup has said:
There are only two kinds of [programming] languages: the ones people complain about and the ones nobody uses.
Java is the first kind, and it has been proven time and again that Java is the language used when software must be delivered. Even if you choose an alternative to Java, it will very likely be a JVM language, one that is hosted by the Java Virtual Machine. That is a testament to the extremely high worldwide investment in Java and to its reliability as a delivery platform.
Java not only has a future, but a very good one. With Java add-on tools that support a Java-Native Reactive programming model, businesses building on the quality of Java can continue to make strides in the direction of Cloud-Native and Reactive.
I hope you take a few moments to educate yourself on our open source Java-Native Reactive Platform. We assure you that our Reactive toolset does not require you to abandon hard-won experience with the Java language and platform. We have made it simple to use standard Java interfaces as protocols implemented in message-driven objects. Have your Java architects and developers go through this tutorial to jump-start Reactive in just minutes.