service machines

Sunday, November 17, 2013

Service Composition: Modularity for SOA and Event-Driven Applications, Part I

Modularity is a cornerstone of good application design. As systems become more distributed, we’re faced with unique challenges to achieving effective modularity. How do you organize, encapsulate, and version loosely-coupled services?

In this series of posts, I will cover how modular architectures were built for two diverse Java-based applications: a highly reliable SOA tax processing platform that interfaces with legacy systems; and a low-latency, event-based system for FX currency trading. Modularity was achieved using OSGi, Service Component Architecture (SCA), and Fabric3 as the runtime stack.

This post will start with a brief overview of the technologies involved in creating these modular systems and proceed to a detailed discussion of how they were used to build the SOA tax processing platform for a European government. In a subsequent post, we will cover how the same modularity techniques were applied to successfully deliver the low-latency FX trading architecture to a major bank.

From OSGi to Service Composition

There is no one technology that offers a complete modularity solution. That’s because modularity provides a number of features and exists at a number of levels in an application.

In terms of features, modularity:

Reduces complexity by segmenting code into discrete units
Provides a mechanism for change by allowing application components to be versioned
Promotes reuse by defining contracts between subsystems

Modularity is also present at different levels of an application:

Application Modularity

While much of the above diagram will be familiar to Java developers, it’s worth defining what we mean by service composition and architectural modularity. Service-Orientation (organizing application logic into contract-based units) and Dependency Injection (popularized by frameworks such as Spring and Guice, among others) are the foundation of modern architectural modularity. In a nutshell, both help to decouple subsystems, thereby making an application more modular.

What’s missing is architectural encapsulation and composition. For example, applications often need to expose coarse-grained services that are themselves composed of multiple finer-grained services. Traditional integration platforms, ESBs, and Java EE lack facilities for doing this in a simple, effective manner. You may have seen this with the proliferation of services exposed via an Enterprise Service Bus (ESB) or Spring application contexts that contain hundreds or even thousands of beans.

As with Object-Orientation, what’s needed is a way to group collections of services together for better management and a mechanism to encapsulate the implementation details of particular services:

Service Modularity

Fabric3 provides such a composition mechanism that works well for both SOA and event-driven designs. I’ll now turn to how this was achieved in a tax-processing system and an FX trading platform.

I’ve deliberately chosen these two examples because each application has a different set of requirements. The tax system is what many would label a SOA integration platform: it receives asynchronous requests for tax data, interfaces with a number of legacy systems, processes the results, and sends a response to the requesting party. The FX system, in contrast, is concerned with extreme (microsecond) latencies: it receives streams of market data, processes them, and in turn provides derived foreign exchange pricing feeds to client systems.

SOA Modularity

The tax system architecture looks like this:

Tax System Architecture

Tax information requests arrive via a custom reliable messaging layer at a gateway service, which transactionally persists each request and initiates processing. Processing takes place in a number of steps using a series of complex rules and interactions with multiple legacy systems. When data has been received and processed, a response is sent via the messaging layer to the requestor.

The principal modularity challenge faced when designing the system was to separate the core processing (a state machine that transitions a request through various stages) from the rules evaluation and logic that connects to the legacy systems.
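To make the separation concrete, here is a minimal sketch of a core state machine that knows only the order of stages and delegates the actual work to handlers supplied by other modules. The stage names, `StageHandler`, and `RequestStateMachine` are hypothetical and chosen for illustration; the post does not list the real system's stages.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical processing stages, in the order a request passes through them. */
enum Stage { RECEIVED, RULES_EVALUATED, LEGACY_DATA_RETRIEVED, RESPONDED }

/** Contract the core depends on; rules and integration modules would supply implementations. */
interface StageHandler {
    void handle(String requestId);
}

/** Core processing: transitions a request through the stages without knowing how the work is done. */
final class RequestStateMachine {
    private final Map<Stage, StageHandler> handlers = new EnumMap<>(Stage.class);
    private Stage current = Stage.RECEIVED;

    /** Registers the handler that performs the work required to reach the given stage. */
    void onEnter(Stage stage, StageHandler handler) {
        handlers.put(stage, handler);
    }

    Stage current() {
        return current;
    }

    /** Runs the next stage's handler (if any), then moves the request into that stage. */
    Stage advance(String requestId) {
        if (current == Stage.RESPONDED) {
            throw new IllegalStateException("Request already completed");
        }
        Stage next = Stage.values()[current.ordinal() + 1];
        StageHandler handler = handlers.get(next);
        if (handler != null) {
            handler.handle(requestId);
        }
        current = next;
        return current;
    }
}
```

Because the state machine sees only the `StageHandler` contract, the rules and integration logic can change behind it without the core being touched.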

A key goal of modularizing the various subsystems was to provide a straightforward versioning mechanism. For example, tax rules typically change every tax year. Consequently, existing rules had to be preserved (to handle requests for data involving previous tax years) alongside the current year rules. Modularizing the rules allowed for them to be updated without affecting other parts of the system.
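One way to picture rules versioned by tax year is a registry that keeps every version and selects the one in force for the year a request refers to. This is only a sketch under assumptions: `TaxRules`, `TaxRulesRegistry`, and the calculation signature are invented for illustration and are not taken from the actual system.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical rules contract, as it might be declared in an API module. */
interface TaxRules {
    long taxDue(long taxableIncome);
}

/** Keeps all rule versions side by side and selects one by tax year. */
final class TaxRulesRegistry {
    private final TreeMap<Integer, TaxRules> byFirstYear = new TreeMap<>();

    /** Registers a rules version that applies from the given tax year onward. */
    void register(int firstTaxYear, TaxRules rules) {
        byFirstYear.put(firstTaxYear, rules);
    }

    /** Returns the latest version registered at or before the given tax year. */
    TaxRules forYear(int taxYear) {
        Map.Entry<Integer, TaxRules> entry = byFirstYear.floorEntry(taxYear);
        if (entry == null) {
            throw new IllegalArgumentException("No rules registered for tax year " + taxYear);
        }
        return entry.getValue();
    }
}
```

Adding the rules for a new year is then a matter of registering one more implementation; requests about earlier years continue to resolve to the versions that were in force at the time.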

The parts of the application that interfaced with the legacy systems to retrieve tax data were also isolated in a module. As with the rules, this allows the way external systems are called to change without impacting the rest of the system. Modularity served an additional practical purpose: the code to interface with the legacy systems was complex and tedious. By segmenting that complexity, the overall system was made easier to understand and maintain.

What did this modularity translate to in practice? The development environment was set up as a Maven multi-module build. The base API modules contain Java interfaces for various services. Individual modules for core processing, rules, and integration depend on the relevant API modules:

Tax System Modules

The multi-module build enforces development-time modularity. For example, the rules module cannot reference classes in the integration module. OSGi is used for runtime code modularity. The API modules export packages containing the service interfaces while each dependent module imports the API interfaces it requires.

OSGi Imports and Exports

The tax system uses service composition to enforce modularity at the service level. The core processing, rules, and integration subsystems are all composed of multiple fine-grained services. The integration subsystem in particular exposes a single interface for receiving requests from the core processing module. This request is then passed through a series of services that invoke legacy systems using Web Services (WS-*):

Tax System Integration Module

Service composition is handled in Fabric3 by using SCA composites. Similar to a Spring application context, a composite specifies a set of components and their wiring. In this example, we use XML to define the composite (the next version of Fabric3 will also support a Java-based DSL):

The Integration Module Composite

As its name implies, a Composite provides a way to compose coarser-grained services from private, finer-grained ones. In the above example, the service element promotes, or exposes, the TaxSystem service as the public interface of the composite.

Service Promotion
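As a plain-Java analogy for promotion (not the SCA composite syntax and not a Fabric3 API), the sketch below exposes one public contract and keeps the finer-grained services behind it. Only the `TaxSystem` name comes from this post; its method and the `RequestTranslator` and `LegacyGateway` services are hypothetical.

```java
/** The promoted contract: the only type other modules depend on. */
interface TaxSystem {
    String retrieveTaxData(String taxpayerId);
}

/** Hypothetical private service that maps a request to a legacy format. */
interface RequestTranslator {
    String toLegacyRequest(String taxpayerId);
}

/** Hypothetical private service that calls a legacy system. */
interface LegacyGateway {
    String invoke(String legacyRequest);
}

/** Plays the role of the composite: one coarse-grained service built from private ones. */
final class IntegrationComposite implements TaxSystem {
    private final RequestTranslator translator;
    private final LegacyGateway gateway;

    IntegrationComposite(RequestTranslator translator, LegacyGateway gateway) {
        this.translator = translator;
        this.gateway = gateway;
    }

    @Override
    public String retrieveTaxData(String taxpayerId) {
        // Callers never see the translation step or the gateway.
        return gateway.invoke(translator.toLegacyRequest(taxpayerId));
    }
}
```

In a composite the same effect is declared rather than hand-coded: the wiring lives in the composite definition, and only the promoted service is visible from outside.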

When this is done, client services in the core processing module can reference the TaxSystem integration composite as a single service:
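The sketch below suggests what such a client might look like. It is an assumption-laden illustration: `RequestProcessor` and the `retrieveTaxData` method are hypothetical, and the `@Reference` annotation declared here is a local stand-in so that the example compiles on its own (SCA defines an annotation of the same name in `org.oasisopen.sca.annotation`).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Local stand-in for an injection marker; declared here only to keep the sketch self-contained. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
@interface Reference {
}

/** The single service exposed by the integration composite (method is hypothetical). */
interface TaxSystem {
    String retrieveTaxData(String taxpayerId);
}

/** Hypothetical core-processing component that depends only on the promoted contract. */
final class RequestProcessor {
    private final TaxSystem taxSystem;

    RequestProcessor(@Reference TaxSystem taxSystem) {
        this.taxSystem = taxSystem;
    }

    String process(String taxpayerId) {
        // No knowledge of the services inside the integration composite is needed here.
        return "processed:" + taxSystem.retrieveTaxData(taxpayerId);
    }
}
```

Because the dependency is a single interface, a test can pass in a stub, for example `new RequestProcessor(id -> "stub-" + id)`.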

With composites in place, the tax system successfully delivered a consistent modular design from the code layer to its service architecture:

Service Architecture Modularity

After more than a year in production, the investment in this modular design paid off. The integration module was re-written to take advantage of new, significantly different legacy system interfaces without the need to refactor the other subsystems.

****

In the next post, we will cover how service composition was used to modularize a low-latency, event-based FX trading platform. In this case, service composition was employed to simplify the system architecture and provide a mechanism for writing custom plugins while maintaining sub-millisecond performance.

Thursday, September 26, 2013

Wiring-in-the-Large: The Missing Technology for Java Cloud Applications

Have you ever wondered why dependency injection in most Java frameworks is only for local, in-process services as opposed to distributed services?

I recently came across Paul Maritz’s keynote (skip to minute 32) at the 2013 EMC World conference, which made me think about this question in the context of cloud platforms. The keynote is an excellent and well thought-out statement of how Pivotal is positioning itself in the emerging cloud platform market. One of his most interesting points is that with the proliferation of mobile and interconnected devices (often referred to as the Internet of Things), we are seeing a new class of applications emerge that intake, process, and distribute large amounts of data.

Maritz highlights his talk with some useful anecdotal evidence: a single transatlantic flight produces nearly 30TB of data that needs to be recorded, processed, and analyzed by a new breed of applications.

Cloud Fabrics

These types of applications cannot effectively be built on traditional Java EE application server architectures. Instead, they will run on cloud fabrics: dynamic, interconnected infrastructure that is highly adaptable.

The dynamic nature of cloud fabrics places new requirements on existing Java frameworks and containers. For example, VM instances may be created or migrated to meet increased demand. In this setting, machine (and hence service endpoint) addresses may change. This makes the static architectures often associated with Java EE application server clusters and message brokers difficult to manage and scale.

Cloud fabrics are built on hardware virtualization where physical compute resources are abstracted via software. Virtualization needs to be extended up the stack to Java programming models so that applications can be run more efficiently.

Spring: The Service Virtualization Pioneer

Spring was an early pioneer in this respect. It virtualized many parts of the Java EE app server by replacing container APIs for obtaining local service references (JNDI) with dependency injection. This made it possible to run Spring application code outside a Java EE container, for example, in unit tests.

As cloud fabrics gain adoption, we’ll see a need to extend Spring’s wiring capabilities to distributed services: wiring-in-the-large. Just as applications should not need container APIs to obtain references to local services, they should not need APIs to call remote services or send messages to endpoints. If remote services are instead wired to application code, the fabric infrastructure can transparently propagate endpoint address changes as VMs are migrated or created in response to varying workload:
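A minimal sketch of the idea, assuming a hypothetical `QuoteService` contract and a stand-in transport function: the application holds only the service interface, while the infrastructure owns the endpoint address and can swap it at any time. None of these names come from Fabric3 or SCA.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BiFunction;

/** Hypothetical remote service contract seen by application code. */
interface QuoteService {
    String quote(String symbol);
}

/** Wire owned by the infrastructure; the application never sees the endpoint address. */
final class QuoteServiceWire implements QuoteService {
    private final AtomicReference<String> endpoint;
    // Stand-in for a real transport: (endpoint address, request) -> reply.
    private final BiFunction<String, String, String> transport;

    QuoteServiceWire(String initialEndpoint, BiFunction<String, String, String> transport) {
        this.endpoint = new AtomicReference<>(initialEndpoint);
        this.transport = transport;
    }

    /** Called by the infrastructure when the target is migrated or a new instance is created. */
    void endpointChanged(String newEndpoint) {
        endpoint.set(newEndpoint);
    }

    @Override
    public String quote(String symbol) {
        // Each call uses whichever address is current at that moment.
        return transport.apply(endpoint.get(), symbol);
    }
}
```

Application code that received the `QuoteService` keeps calling `quote` unchanged before and after `endpointChanged` is invoked.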

An additional benefit of wiring-in-the-large is communication virtualization. Application code no longer relies on transport-specific APIs to send messages or invoke remote services. The cloud fabric is instead responsible for injecting code with proxies that manage communication:

This will allow the cloud fabric to adopt and adjust the most appropriate messaging technology without requiring code-level changes. In addition to greatly simplifying code, communication virtualization also makes it possible to produce more portable cloud applications.

Wiring-in-the-Large in Practice

So what does wiring-in-the-large look like in practice? The good thing is that many of its concepts predate the emergence of modern cloud computing. The OASIS SCA standards give us a simple and familiar way to wire remote services that fit well with cloud fabrics:

The above services can be connected via JMS, ZeroMQ, AMQP, MQTT or some other communications technology – it’s either up to the SCA runtime or deployment configuration to choose one. Application code will look the same:
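As an illustration of transport-independent application code, the sketch below uses a hypothetical `Transport` abstraction standing in for whichever messaging technology is selected at deployment. `OrderService`, `OrderServiceProxy`, and `OrderClient` are invented names; the example does not use any real JMS, ZeroMQ, AMQP, or MQTT API.

```java
/** Hypothetical service contract used by application code. */
interface OrderService {
    void submit(String order);
}

/** Hypothetical abstraction over the messaging technology chosen at deployment time. */
interface Transport {
    void send(String destination, String payload);
}

/** Proxy the runtime would inject; it is the only place that touches the transport. */
final class OrderServiceProxy implements OrderService {
    private final Transport transport;
    private final String destination;

    OrderServiceProxy(Transport transport, String destination) {
        this.transport = transport;
        this.destination = destination;
    }

    @Override
    public void submit(String order) {
        transport.send(destination, order);
    }
}

/** Application code: identical regardless of which transport backs the proxy. */
final class OrderClient {
    private final OrderService orders;

    OrderClient(OrderService orders) {
        this.orders = orders;
    }

    void placeOrder(String order) {
        orders.submit(order);
    }
}
```

Swapping the messaging technology means constructing the proxy with a different `Transport`; `OrderClient` is not recompiled or edited.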

The Fabric3 runtime (a conformant SCA implementation) provides transparent dynamic endpoint propagation in the way we discussed above.

What’s Next?

Fabric3 support for wiring-in-the-large currently requires that applications be deployed to a Fabric3 container. The Fabric3 community is working on removing this restriction so that cloud services can be accessed in a ubiquitous manner, literally from any JVM or Java runtime. Here’s an example:

This API can be integrated into frameworks such as Spring and Guice to provide transparent injection of remote services. Basically, application code will no longer need to deal with specific transport APIs, or in the case of Spring, templates.

****

Returning to Maritz’s picture of next-generation applications that consume, process and distribute data at a massive scale, wiring-in-the-large will hopefully play the same modernizing role that local dependency injection did for Java EE in the corporate datacenter. If you want more detail on the art of the possible, check out Fabric3.

Friday, August 30, 2013

Fabric3 2.0: Bridging Service-Oriented and Event-Based Architectures

I'm excited to announce the release of Fabric3 2.0. This milestone introduces a number of new features. In particular, a major theme of the 2.0 release is creating a platform that brings service-oriented and event-based design closer together.

In this post, I will briefly touch on how Fabric3 does this and also identify some of the other important features in this release.

Event-Based Systems

The area we invested in most heavily for the Fabric3 2.0 release is improving support for building event-based systems. Martin Thompson has given an excellent presentation on event-based architectures. Peter Lawrey and Martin Fowler have also written very nice synopses here and here.

Fabric3 already has a rich dependency-injection model for event-sourced (pub/sub) interactions based on the concept of channels. In this release we have significantly refactored the Fabric3 internals to accommodate custom channels and low-latency requirements.

Now it is possible to plug in alternative channel implementations based on application requirements. One out-of-the-box channel type we have added is the ring buffer channel based on the LMAX Disruptor. This provides extremely low-latency, lock-free pub/sub capabilities:
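For readers unfamiliar with ring buffers, here is a minimal single-producer, single-consumer sketch of the underlying idea. It is neither the Disruptor nor the Fabric3 channel implementation, both of which are considerably more sophisticated; the `RingBuffer` class and its methods are written for this illustration only.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Minimal lock-free ring buffer for exactly one producer thread and one consumer thread. */
final class RingBuffer<T> {
    private final Object[] slots;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next sequence to read
    private final AtomicLong tail = new AtomicLong(); // next sequence to write

    RingBuffer(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new Object[capacity];
        mask = capacity - 1; // lets us replace modulo with a bitwise AND
    }

    /** Producer side: returns false instead of blocking when the buffer is full. */
    boolean offer(T event) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;
        }
        slots[(int) (t & mask)] = event;
        tail.set(t + 1); // volatile write publishes the slot to the consumer
        return true;
    }

    /** Consumer side: returns null when no event is available. */
    @SuppressWarnings("unchecked")
    T poll() {
        long h = head.get();
        if (h == tail.get()) {
            return null;
        }
        int index = (int) (h & mask);
        T event = (T) slots[index];
        slots[index] = null; // release the reference so it can be collected
        head.set(h + 1); // volatile write frees the slot for the producer
        return event;
    }
}
```

The array is allocated once and reused, and neither side takes a lock: each thread writes only its own sequence counter and reads the other's.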

We also plan to add more specialty channel types in the future, for example, one based on coalescing buffers that is capable of dropping older unprocessed messages as newer ones arrive.
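The coalescing idea can be sketched as a buffer that keeps only the newest unprocessed value per key. Keying messages (for example, by currency pair) is an assumption made for this illustration, and `CoalescingBuffer` is a hypothetical class, not a planned Fabric3 API.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Keeps only the most recent unprocessed value for each key. */
final class CoalescingBuffer<K, V> {
    private final ConcurrentHashMap<K, V> latest = new ConcurrentHashMap<>();

    /** Stores a value; returns true if an older unprocessed value for the key was dropped. */
    boolean publish(K key, V value) {
        return latest.put(key, value) != null;
    }

    /** Removes and returns the newest unprocessed value for the key, or null if there is none. */
    V take(K key) {
        return latest.remove(key);
    }

    /** Number of keys that currently have an unprocessed value. */
    int pending() {
        return latest.size();
    }
}
```

A slow consumer therefore never works through a backlog of stale prices: however many updates arrive between two calls to `take`, it sees only the latest one.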

Ring buffer channels can be bound to transports such as ZeroMQ using simple configuration. This makes it possible to develop performant architectures using straightforward Java with no boilerplate API code:

For a complete example, check out the FastQuote trading platform sample.

Low-Latency Systems

One knock against middleware in general and dependency injection-based platforms in particular is that they introduce overhead during processing by proxying references to injected instances. Fabric3 has always had the ability to optimize wiring to perform direct (i.e. proxy-less) injection for local services. This means a local service invocation is a Java method invocation with no runtime intervention. Using this approach, we are able to achieve microsecond processing times for fairly complex tasks such as foreign exchange rate streaming.

In Fabric3 2.0 we took this concept further and applied it to channel injection. In Fabric3, channels are strongly-typed and represented by Java interfaces. At runtime, Fabric3 creates a proxy implementing the interface, which is then injected on a component that publishes events to the channel. By default, Fabric3 uses JDK proxies. For many applications, JDK proxies are sufficient.

However, for low-latency systems, JDK proxies introduce relatively significant overhead. Perhaps more importantly, they create garbage as invocation parameters are wrapped in an array. When messages are processed at high volume, the garbage resulting from the creation of parameter arrays can lead to excessive GC pauses, which negatively impact latency.

Our solution to this problem is to use bytecode generation (via the ASM library) to implement the channel interface dynamically. The generated implementation directly calls into the Fabric3 channel infrastructure without using reflection or wrapping parameters. This results in garbage-free invocations that are as fast as handwritten code. In the next release we plan to go further and combine this with support for zero-copy message passing to ZeroMQ.
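The difference between the two dispatch styles can be sketched without ASM. Below, a JDK proxy receives its argument as a boxed value inside an `Object[]`, while a handwritten class stands in for the kind of implementation a bytecode generator could emit. `PriceChannel`, `PriceSink`, and the other names are hypothetical and are not Fabric3 types.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

/** Hypothetical strongly-typed channel interface. */
interface PriceChannel {
    void publish(double price);
}

/** Stand-in for the channel infrastructure: remembers the last price it was handed. */
final class PriceSink {
    double last;

    void accept(double price) {
        last = price;
    }
}

/** Handwritten equivalent of a generated implementation: a direct call, no array, no boxing. */
final class DirectPriceChannel implements PriceChannel {
    private final PriceSink sink;

    DirectPriceChannel(PriceSink sink) {
        this.sink = sink;
    }

    @Override
    public void publish(double price) {
        sink.accept(price);
    }
}

final class ChannelProxies {
    private ChannelProxies() {
    }

    /** JDK proxy: the handler receives the argument as a boxed Double inside an Object[]. */
    static PriceChannel jdkProxy(PriceSink sink) {
        InvocationHandler handler = (proxy, method, args) -> {
            sink.accept((Double) args[0]); // unbox before passing to the sink
            return null;
        };
        return (PriceChannel) Proxy.newProxyInstance(
                PriceChannel.class.getClassLoader(),
                new Class<?>[] {PriceChannel.class},
                handler);
    }
}
```

Both variants deliver the same value to the sink; the proxy path simply does more work per call, and unless the JIT manages to eliminate it, each call leaves a short-lived array and boxed value behind for the garbage collector.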

High Performance Logging

After working with a number of clients, we found a major bottleneck in many applications was logging. Part of the problem is that applications often log too much information. In fact, logging in the traditional sense is probably not necessary for many event-based systems, but that’s a subject for another blog.

Another culprit is the logging subsystem itself: most third-party logging implementations perform poorly in highly concurrent scenarios even when they support asynchronous writes. This is because the logging implementations are often subject to contention when placing messages in a queue that is read by a background thread persisting to disk. In addition, we have seen logging implementations produce a fair amount of garbage, leading to unacceptable GC pauses.

To alleviate these problems, we re-wrote the Fabric3 logging framework to take advantage of the Disruptor for asynchronous writes and bytecode generation to provide high-performance, garbage-free operation. You can read more about it here.

Standards: SCA Conformance and JAX-RS 2.0 Support

One of Fabric3’s advantages with respect to proprietary Enterprise Service Bus products and integration platforms is that it is standards-based from the ground up. In this release we expanded our standards support to cover 100% conformance with the OASIS SCA Assembly, Policy, WS Binding and Spring Specifications. This includes all mandatory and optional features. In this context it is also noteworthy that we upgraded our support for RESTful Web Services to JAX-RS 2.0.

Get Started with Fabric3

To get started with Fabric3 2.0, try the sample applications, which contain all you need to get up and running with a cluster installation in less than five minutes.

Also, watch this space as I discuss these new features in depth and how they are being employed by our users to build innovative applications.