How Universal Algorithmic Differentiation™ results in more robust market risk management, optimal deployment of capital and, ultimately, greater profit.
by Russell Goyder, PhD, Director of Quantitative R&D, FINCAD
In the past several years, leading quant teams have been utilizing Algorithmic Differentiation (AD) to accelerate the risk management process. The technique has been popularized in the finance space only recently. Several decades old, AD has been applied to a variety of fields – oceanography, physics, geology, meteorology, engineering and many others.
If the concept of AD, and how it speeds up risk measurement or increases accuracy, is foreign to you, you're reading the right article. Ultimately, we'll not only explain AD, but also an approach known as Universal Algorithmic Differentiation (UAD™), which takes the next step in the evolution of real-time, analytic risk.
Out of the Stone Age
For most financial organizations, using finite difference methods (known less formally as bump and grind, or bumping) to calculate Greeks and other sensitivities is the status quo. The biggest problem with bumping is that it's slow. Very slow. As slow as a turtle, one might say. Every sensitivity requires a full revaluation of the portfolio. This means that firms have to sacrifice intra-day risk reporting and pre-trade risk, relying instead on overnight snapshots of their exposure.
Many organizations have thrown a lot of hardware at this problem to speed up the bumping process, but investment in high-performance GPUs hasn't fully solved the problem. At the end of the day, even if you strap a rocket onto a turtle, it's still a turtle. It is faster . . . but a turtle nonetheless.
Using AD, however, the speed-up is in the realm of 100x – 1000x. Managing exposure is no longer an overnight activity, but a pre-trade one.
Teams with complex multi-asset, multi-currency portfolios that need to bump often cut corners to reduce the run-time of overnight runs, for example by not bumping every quote, or by bumping whole curves at once with a parallel shift, twist, or other aggregate bump.
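To make the cost of bumping concrete, here is a minimal sketch in Python. The toy portfolio (one unit cash flow per year, each discounted at its own quoted rate) and the function names are illustrative assumptions, not taken from any production system. The point is only the call count: one base valuation plus one full revaluation per quote.

```python
import math

def portfolio_value(quotes):
    """Toy portfolio (an assumption): a unit cash flow at year i+1,
    discounted at the continuously compounded rate quotes[i]."""
    return sum(math.exp(-(i + 1) * q) for i, q in enumerate(quotes))

def bump_sensitivities(value_fn, quotes, bump=1e-4):
    """One-sided finite differences: bump each quote in turn and revalue."""
    base = value_fn(quotes)
    valuations = 1  # the base valuation
    sensitivities = []
    for i in range(len(quotes)):
        bumped = list(quotes)
        bumped[i] += bump
        sensitivities.append((value_fn(bumped) - base) / bump)
        valuations += 1  # one full revaluation per bumped quote
    return sensitivities, valuations

quotes = [0.01 + 0.001 * i for i in range(50)]
sensitivities, valuations = bump_sensitivities(portfolio_value, quotes)
print(f"{len(quotes)} quotes required {valuations} portfolio valuations")
```

With thousands of quotes and a portfolio that takes minutes to value, this linear growth in valuations is what pushes risk runs overnight.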
From Flashlight to Floodlight
If you can't afford to bump every quote, it is natural to ask, what do you bump?
From a risk management perspective, it's like trying to navigate a dark and dangerous landscape with a small flashlight. With AD, you don't have to decide which quotes to calculate portfolio sensitivity for. You see everything. Essentially, you trade your flashlight for a floodlight, yielding a complete view of the risk landscape. Sensitivities to every relevant quote, including intermediate ones, are available for a fraction of the cost.
Jesper Andresen, head of quantitative research at Danske Bank, was quite right when he said, “The real benefit of [adjoint] AD is not just that I can do things quicker, but the fact that I can start looking at problems I haven't dared look at before.”
There are a variety of tangible use-cases for AD, including the ability to hedge every exposure in your portfolio, re-project risk to form an alternative view of your exposure, and understand how your risk profile changes under different market scenarios.
The first and last use-cases are fairly well-known, so let's focus on risk re-projection.
With risk re-projection, you can transform sensitivities from one set of instruments into equivalent sensitivities for another set of instruments.
Imagine you are managing a long-only fund and want to control your rate exposure, but your mandate prohibits the use of contingent liabilities like swaps. You can measure your rates exposure with a model built from the (Libor, OIS, etc.) swap market, but what you really need is to calculate exposure with respect to the medium-term notes that you're actually allowed to trade.
By transforming your sensitivities from swap quotes into sensitivities to medium-term note yields, you can build an effective interest rate hedge.
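At its core, re-projection is one more application of the chain rule. The sketch below assumes you already have the sensitivities of value to each swap quote and a Jacobian describing how each swap quote moves with each note yield. The numbers, and the function name reproject, are made up for illustration; how such a Jacobian is obtained in practice is not covered here.

```python
def reproject(swap_sensitivities, jacobian):
    """Chain rule: dV/dy_j = sum over i of dV/ds_i * ds_i/dy_j,
    where jacobian[i][j] holds ds_i/dy_j."""
    n_swaps = len(swap_sensitivities)
    n_notes = len(jacobian[0])
    return [
        sum(swap_sensitivities[i] * jacobian[i][j] for i in range(n_swaps))
        for j in range(n_notes)
    ]

# Hypothetical inputs: sensitivities to two swap quotes, and the
# sensitivity of each swap quote to each of two note yields.
swap_sensitivities = [100.0, 250.0]
jacobian = [[0.9, 0.1],
            [0.2, 0.8]]

note_sensitivities = reproject(swap_sensitivities, jacobian)
print(note_sensitivities)
```

The result is the same exposure, expressed in terms of instruments the fund is actually allowed to trade.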
Exact, Analytic Risk
So far, this has all been about speed, so let's talk about accuracy.
Bumping is an approximate technique, and therefore subject to numerical noise. Among many questions, you're faced with deciding how big a bump should be, whether a basis point is sufficient and whether it should be relative or absolute.
With AD, this tuning goes away. You get exact (analytic) sensitivity. Additionally, when AD is used within calibration, things become more stable. There is no more fine-tuning of risk calculations and your quants are freed up to move on to more productive activities, saving valuable time and resources.
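The dependence on bump size is easy to see in a small experiment. The sketch below uses a single discount factor with an assumed 10-year maturity and compares a one-sided bump against the analytic derivative for several bump sizes. The function names and numbers are illustrative only.

```python
import math

MATURITY = 10.0  # years; an illustrative assumption

def discount_factor(rate):
    return math.exp(-rate * MATURITY)

def exact_sensitivity(rate):
    """Analytic derivative of the discount factor with respect to the rate."""
    return -MATURITY * discount_factor(rate)

def bumped_sensitivity(rate, bump):
    """One-sided finite difference approximation."""
    return (discount_factor(rate + bump) - discount_factor(rate)) / bump

rate = 0.03
errors = {}
for bump in (1e-2, 1e-4, 1e-6, 1e-13):
    errors[bump] = abs(bumped_sensitivity(rate, bump) - exact_sensitivity(rate))
    print(f"bump={bump:g}  absolute error={errors[bump]:.3e}")
```

Large bumps suffer from truncation error, while for very small bumps floating-point round-off typically starts to dominate, so there is no single bump size that is right for every instrument. The analytic derivative has no such parameter to tune.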
Jargon: Chain Rule
To understand AD and get a basic orientation in this growing field, it's important to understand the jargon. You'll hear terms from both mathematics and computer science.
Starting in mathematics, you'll hear about the chain rule. Thinking back to high school mathematics, the chain rule is a method of calculating the derivative of a composite relationship, one formed by composing a collection of constituent relationships.
In the case of the relationship between a portfolio and any one of its risk factors, each relationship can be viewed as a collection of simple operations linked together to form a chain. Applying differential calculus to complex equations can be difficult, but when it's applied one link at a time, each step is easy and the results simply multiply together.
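As a small worked example (the notation here is ours, chosen for illustration), take the value of a notional N received at time T and discounted at a continuously compounded rate r. The valuation breaks into three simple links, and the sensitivity to r is the product of the derivatives of each link:

```latex
\begin{aligned}
u &= -rT, \qquad d = e^{u}, \qquad V = N d, \\
\frac{\partial V}{\partial r}
  &= \frac{\partial V}{\partial d}\,
     \frac{\partial d}{\partial u}\,
     \frac{\partial u}{\partial r}
   = N \cdot e^{u} \cdot (-T)
   = -T N e^{-rT}.
\end{aligned}
```

No single step requires more than the derivative of a product or an exponential.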
The name given to encoding the chain rule within a computer is Algorithmic Differentiation. AD comes in two flavors: adjoint (or reverse mode) and tangent (or forward mode).
The difference between the two is that adjoint works best for problems where there are a large number of inputs and few outputs, while tangent works best for problems where there are a large number of outputs and few inputs.
For calculations in finance, it is almost always the case that there are many inputs (quotes, in this case) and few outputs (the values whose sensitivities you're calculating). Hence the tendency, in the context of financial applications, to see adjoint AD (AAD) written simply as AD.
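To see why the number of inputs matters, here is a minimal sketch of tangent (forward) mode using so-called dual numbers, each carrying a value and its derivative with respect to one chosen input. The Dual class and the toy portfolio are assumptions made for illustration. Note that the valuation must be repeated once per input, which is exactly the cost adjoint mode avoids when inputs are many and outputs are few.

```python
import math

class Dual:
    """A value together with its derivative with respect to one seeded input."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)
    __rmul__ = __mul__

def dual_exp(x):
    e = math.exp(x.value)
    return Dual(e, e * x.deriv)

def portfolio_value(quotes):
    """Toy portfolio (an assumption): unit cash flows at years 1..n."""
    total = Dual(0.0)
    for i, q in enumerate(quotes):
        total = total + dual_exp((-(i + 1.0)) * q)
    return total

def tangent_gradient(quote_values):
    """Forward mode: one full pass per input, seeding one input at a time."""
    gradient = []
    for k in range(len(quote_values)):
        seeded = [Dual(q, 1.0 if i == k else 0.0)
                  for i, q in enumerate(quote_values)]
        gradient.append(portfolio_value(seeded).deriv)
    return gradient

quote_values = [0.01 + 0.001 * i for i in range(50)]
gradient = tangent_gradient(quote_values)
```

Each sensitivity is exact to floating-point precision, but 50 inputs still cost 50 passes. Adjoint mode obtains all of them from a single forward valuation followed by one reverse sweep.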
Enabling AD in Existing Systems
How should you go about getting AD for your risk analytics? The answer is: two ways. You can use tools which employ techniques such as operator overloading and, in fewer cases, source code transformation; or you can do it by hand.
The first method, operator overloading, is a fairly common “bolt-onto-your-code” approach. When valuation is run with operator overloading, the calculation is recorded in a data structure called a tape, so that the sensitivity calculation can be performed in reverse afterward. The problem with this process is that it's slow and takes a lot of memory. The Numerical Algorithms Group notes just how intensive this is: “Even for relatively simple codes, the tape can be several GBs [gigabytes], and for production codes will typically exceed the capacity for even large-memory machines.”
Memory problems tend to be so severe with operator overloading that you often require more than just a tape. Additional techniques such as checkpointing are typically required. Unfortunately, these techniques don't make things run faster, but rather enable them to run in the first place, given the extreme memory requirements.
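The tape-based approach described above can be sketched in a few dozen lines. This is a toy illustration under our own assumptions (the Tape and Var classes and the toy portfolio are hypothetical), not how any particular AD tool is implemented. Every arithmetic operation appends an entry to the tape, and the sensitivities come from walking the tape in reverse.

```python
import math

class Tape:
    """Records, for every operation, its parents and local derivatives."""
    def __init__(self):
        self.entries = []  # entries[i] = tuple of (parent_index, local_derivative)

    def new_var(self, value, parents=()):
        self.entries.append(parents)
        return Var(self, len(self.entries) - 1, value)

class Var:
    def __init__(self, tape, index, value):
        self.tape, self.index, self.value = tape, index, value

    def _lift(self, other):
        # Constants are also recorded on the tape
        return other if isinstance(other, Var) else self.tape.new_var(float(other))

    def __add__(self, other):
        other = self._lift(other)
        return self.tape.new_var(self.value + other.value,
                                 ((self.index, 1.0), (other.index, 1.0)))
    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        return self.tape.new_var(self.value * other.value,
                                 ((self.index, other.value), (other.index, self.value)))
    __rmul__ = __mul__

def var_exp(x):
    e = math.exp(x.value)
    return x.tape.new_var(e, ((x.index, e),))

def gradient(output, inputs):
    """Reverse sweep: propagate adjoints from the output back along the tape."""
    adjoint = [0.0] * len(output.tape.entries)
    adjoint[output.index] = 1.0
    for i in range(output.index, -1, -1):
        for parent, local in output.tape.entries[i]:
            adjoint[parent] += adjoint[i] * local
    return [adjoint[v.index] for v in inputs]

tape = Tape()
quotes = [tape.new_var(0.01 + 0.001 * i) for i in range(50)]
value = tape.new_var(0.0)
for i, q in enumerate(quotes):
    value = value + var_exp((-(i + 1.0)) * q)  # toy portfolio (an assumption)

sensitivities = gradient(value, quotes)
tape_length = len(tape.entries)
print(f"{len(quotes)} sensitivities from one reverse sweep; tape holds {tape_length} entries")
```

All 50 sensitivities come from a single reverse sweep, but even this trivial valuation records several tape entries per cash flow. In a production valuation with many millions of operations, the same bookkeeping is what drives the memory consumption discussed above.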
The second method of achieving AD for your risk system is brute force: coding it by hand. Roughly speaking, adding AD to your existing system will take as long as it took to produce your analytics in the first place. This is assuming, however, that you have access to personnel who can perform manual coding on a properly structured codebase.
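To give a flavor of what coding it by hand involves, the sketch below pairs a toy valuation with a hand-derived reverse sweep. The function and the toy portfolio are illustrative assumptions. In a real library, every valuation routine would need an adjoint counterpart written and maintained in the same way.

```python
import math

def value_and_adjoints(quotes):
    """Toy portfolio (an assumption): unit cash flows at years 1..n,
    returned together with hand-coded sensitivities to every quote."""
    # Forward sweep: compute the value, keeping the intermediates
    # that the reverse sweep will need.
    discount_factors = [math.exp(-(i + 1) * q) for i, q in enumerate(quotes)]
    value = sum(discount_factors)

    # Reverse sweep: start from d(value)/d(value) = 1 and apply the
    # chain rule backward through each operation above.
    value_bar = 1.0
    df_bars = [value_bar for _ in discount_factors]      # value = sum(df)
    quote_bars = [df_bar * df * (-(i + 1))                # df = exp(-(i+1) * q)
                  for i, (df, df_bar) in enumerate(zip(discount_factors, df_bars))]
    return value, quote_bars

quotes = [0.01 + 0.001 * i for i in range(50)]
value, quote_bars = value_and_adjoints(quotes)
```

No tape is needed, which keeps memory use low, but the reverse sweep has to be derived by hand and kept in step with every change to the forward calculation.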
Quants Aren't Architects
Building optimal and scalable software is a demanding field in its own right. Asking quants to not only bring financial and mathematical intuition and numerical software engineering to bear on problems, but also to bring software architecture skills, is a large request that comes at a significant price point.
The net result of all this is that comprehensive support for AD in a firm's analytics is beyond reach for all but a small handful of financial institutions with years of time and capital to spend on development.
As valuation and risk analytics vendors, we have been tackling this hurdle for clients since 2006, when we explicitly built AD into F3, our answer to the industry's need for fast, accurate, transparent, holistic and flexible analytics. The short answer to AD in F3, which we call Universal Algorithmic Differentiation, is AD-enabled building blocks (and the AD-enabled structure which combines them), which F3 uses to generate pricers.
Generating pricers from fundamental AD-enabled building blocks means that AD is completely comprehensive for every combination of product, model, and valuation method, including user-defined hybrids and advanced structures. A truly Universal implementation of AD.
Dr. Russell Goyder is director of quantitative research and development at risk and analytics solutions company FINCAD. He manages the quant team and oversees the delivery of analytics functionality in FINCAD products, from initial research to the deployment of production code. Before joining the Vancouver, Canada-based company in 2006, Goyder worked as a consultant at The MathWorks. He holds a PhD in physics from the University of Cambridge.