In an earlier post in this series, I listed a few problems which go with the territory of building cross-platform object-oriented libraries in general, and some with C++ in particular. Here they are again: ABI (in)compatibility, regression testing, transparency and auditing, persistence, distributed/multi-process computing, API extensibility and the provision of technical support.
In my previous post I gave an XML format for serializing an object in terms of the instructions that built the object in the first place.
In this post I want to assert that such an XML format deserves to take centre stage in the API of a core analytics library and be the single conduit through which all calls are made, whether they construct objects or not. I am talking here about a core library API with (essentially) just one function: a string-in-string-out function that takes a string representation of the call and returns a string representation of the result.
Here is what it might look like in C:
result_container call_function( library_handle_t c,
The library_handle_t type is our link to the library instance, which tracks all the relevant objects (and their names). In practice, we might wish to put in another layer, to distinguish a multi-user library from one particular view of its state, but the above code captures the essential structure.
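To make the shape of such an entry point concrete, here is a minimal sketch. Only the names call_function, result_container and library_handle_t come from the text above; the second parameter (a NUL-terminated XML string), the fields of result_container, and the helpers library_open, library_close and result_free are my assumptions for illustration. The body is a stub that merely wraps the call in a result element, where a real library would parse the XML and dispatch to the requested function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical definitions: the post names these types but does not define them. */
struct library_state {
    unsigned calls;  /* stand-in for the registry of named objects */
};
typedef struct library_state *library_handle_t;

typedef struct {
    int   ok;    /* nonzero on success */
    char *text;  /* heap-allocated string form of the result */
} result_container;

/* Assumed lifecycle helpers (not from the post). */
library_handle_t library_open(void) {
    return calloc(1, sizeof(struct library_state));
}

void library_close(library_handle_t h) { free(h); }

void result_free(result_container *r) {
    free(r->text);
    r->text = NULL;
}

/* The single entry point: string in, string out.
   This stub only echoes the call inside a <result> element. */
result_container call_function(library_handle_t c, const char *call_xml) {
    result_container r = {0, NULL};
    assert(c != NULL && call_xml != NULL);
    c->calls++;
    /* sizeof counts the terminating NUL of the literal as well. */
    size_t n = strlen(call_xml) + sizeof("<result></result>");
    r.text = malloc(n);
    if (r.text != NULL) {
        snprintf(r.text, n, "<result>%s</result>", call_xml);
        r.ok = 1;
    }
    return r;
}
```

A caller opens a handle, passes each call as a string, reads r.text, and releases the result with result_free. Everything the library can do is reachable through that one function.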
Of course, this is not particularly convenient for someone authoring code explicitly, but there is nothing to stop higher-level APIs being generated, given, say, an XML representation of the library's functions. For an example of just such an API, see Appendix A in this article.
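As a rough illustration of what a generated wrapper could look like (this is not the API from the appendix mentioned above), suppose the library description lists a function square taking one double and returning a double. A generator might then emit something like lib_square below. The element names call, arg, result and error, and the stand-in raw_call that plays the part of the library's entry point, are hypothetical and exist only so that the sketch compiles on its own.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the library's string-in-string-out entry point.
   Hypothetical: it understands only <call name="square">. */
static void raw_call(const char *call_xml, char *out, size_t out_len) {
    double x = 0.0;
    if (sscanf(call_xml, "<call name=\"square\"><arg>%lf</arg></call>", &x) == 1)
        snprintf(out, out_len, "<result>%.17g</result>", x * x);
    else
        snprintf(out, out_len, "<error>unknown call</error>");
}

/* What a generator might emit for square(x: double) -> double.
   Returns 0 on success, -1 if the reply is not a <result>. */
int lib_square(double x, double *result) {
    char call_xml[128];
    char reply[128];
    /* Marshal the typed argument into the string form of the call. */
    snprintf(call_xml, sizeof call_xml,
             "<call name=\"square\"><arg>%.17g</arg></call>", x);
    raw_call(call_xml, reply, sizeof reply);
    /* Unmarshal the string reply back into a typed value. */
    return sscanf(reply, "<result>%lf</result>", result) == 1 ? 0 : -1;
}
```

The caller writes lib_square(3.0, &y) and never sees the XML. Because the wrapper is mechanical, it can be regenerated for any target language without touching the core library.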
What about performance? Surely the conversion of data, particularly numerical data, to a string representation and back on every call imposes an unacceptable overhead for a high-performance numerical computation library? This is a common gut reaction, but acting on it turns out to be premature optimization. Such overhead (measured in microseconds for most calls) is negligible, because only a small amount of numerical data passes through the API, and the cost of converting it is small relative to the typical cost of the computation itself.
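Anyone who wants to check the conversion cost on their own platform can time it directly. The sketch below is my own and not a benchmark from this series: round_trip pushes one double through text and back, and mean_round_trip_us averages the cost over n conversions using the standard clock function. It assumes a C library whose printf and strtod round correctly, in which case 17 significant digits recover an IEEE 754 double exactly. The figures it reports will vary with compiler, C library and hardware.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Convert one double to text and back again. */
double round_trip(double x) {
    char buf[32];  /* ample for "%.17g" output of any double */
    snprintf(buf, sizeof buf, "%.17g", x);
    return strtod(buf, NULL);
}

/* Mean CPU time, in microseconds, of one round trip over n conversions. */
double mean_round_trip_us(int n) {
    volatile double sink = 0.0;  /* keeps the loop from being optimized away */
    clock_t start = clock();
    for (int i = 0; i < n; ++i)
        sink = sink + round_trip(1.0 / (i + 1));
    clock_t stop = clock();
    return 1e6 * (double)(stop - start) / CLOCKS_PER_SEC / n;
}
```

Comparing mean_round_trip_us(1000000) with the running time of a typical library call, multiplied by the number of values that call actually passes across the API, gives a direct estimate of the relative overhead.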
In my next post, I will revisit the list of concerns given above and see how each is addressed by this approach.