In my last post I described an XML-in XML-out API with just one function in it, called call_function.
I also gave a list of engineering challenges that need to be met if we are to leverage the full power of object-oriented design in analytics systems. Let's go through each in turn, throwing in a couple more for good measure, and see how a call_function approach to a core library API addresses these concerns:
- ABI (in)compatibility: It is a C interface despite C++ being used internally. C is the "lowest common denominator" which provides the greatest degree of cross-platform portability.
- Transparency and auditing: There is a single pipe in and out. Every call and its results can be captured and logged in a format that both humans and machines can read, with a guarantee of 100% coverage.
- Regression testing: Run a test suite and log the calls and results. Replay them in a different version of the library or on a different platform or both, and log the results. XML is easy to parse, which means implementing a result comparator and surrounding test framework is very quick.
- Persistence: Serialization was the original motivation for this style of API in the first place. All objects can be saved and loaded. The only twist is that objects constructed from other objects need the full dependency tree to be tracked (which is straightforward).
- Distributed computing: The compact (well, as compact as XML can be) serialized form of the calls and results is ideal for network transport to cloud or grid nodes and back.
- API extensibility: Different calling environments impose different constraints on the extensibility of an API. For example, optional arguments in some languages must follow mandatory ones, and only some languages have the notion of overloading. The fundamental structure in the XML strings is that of name-value pairs, which is flexible enough to accommodate any calling environment or language.
- Technical support: While most acute for the software vendor, the need to support production deployments is always there. This API design enables any calculation to be reproduced exactly at some later time and/or in some separate location. In practice, you can often diagnose problems just by eyeballing the XML.
- Portability: C++, if kept close to the standard, is portable across platforms. With the above C interface, and provided that any generated higher-level APIs are header-only, you can support any platform for which compilers are available.
- Service-oriented architectures: XML blobs are ideal SOA message payloads, which means that a library with the above style of interface is very readily integrated into an enterprise's messaging backbone.
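To make the auditing and regression-testing points a little more concrete, here is a minimal Python sketch of a capture-and-replay harness built around a single XML-in XML-out entry point. Everything in it is an assumption made for illustration: the `call_function` shown is a toy stand-in rather than the real library, the `<call>`, `<arg>` and `<result>` element names are a made-up schema, and `logged_call`, `canonical` and `replay` are hypothetical helper names. The comparator canonicalizes both strings first, so differences in attribute order or whitespace do not count as regressions.

```python
import xml.etree.ElementTree as ET


def call_function(request_xml):
    """Toy stand-in for the library's single entry point (hypothetical schema)."""
    request = ET.fromstring(request_xml)
    # Arguments travel as name-value pairs, so their order does not matter.
    args = {arg.get("name"): arg.get("value") for arg in request.findall("arg")}
    result = ET.Element("result", {"function": request.get("name")})
    if request.get("name") == "add":
        result.set("value", repr(float(args["x"]) + float(args["y"])))
    else:
        result.set("error", "unknown function")
    return ET.tostring(result, encoding="unicode")


def logged_call(request_xml, log, backend=call_function):
    """Forward a request to the backend and record the request/response pair."""
    response_xml = backend(request_xml)
    log.append((request_xml, response_xml))
    return response_xml


def canonical(xml_text):
    """Normalize XML so attribute order and stray whitespace are ignored."""
    return ET.canonicalize(xml_text, strip_text=True)


def replay(log, backend):
    """Re-run every logged request against backend and collect the mismatches."""
    mismatches = []
    for request_xml, expected_xml in log:
        actual_xml = backend(request_xml)
        if canonical(actual_xml) != canonical(expected_xml):
            mismatches.append((request_xml, expected_xml, actual_xml))
    return mismatches


if __name__ == "__main__":
    log = []
    request = (
        '<call name="add">'
        '<arg name="x" value="1.5"/>'
        '<arg name="y" value="2.25"/>'
        "</call>"
    )
    logged_call(request, log)
    # Replaying against the same backend should report no differences.
    print(replay(log, call_function))
```

In a real setup the `backend` passed to `replay` would be a different build of the library, or the same build on another platform, and the log would be written to disk rather than held in a list. A production comparator would probably also want a numerical tolerance instead of exact string equality.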
This pure-XML approach is a very low-level view of the world. In my next post, I'll be moving up the stack and looking at higher-level APIs and related issues.