What is Grid Computing?
The idea behind grid computing is to make multiple machines, which may be in different physical locations, behave as if they were one large virtual machine. A variety of technologies are used to make this happen. Clusters of machines can increase the resources available at one physical location, but going beyond that requires peer-to-peer communication tools and the internet so that clusters of machines at different physical locations can work together. Grid computing is precisely that: a single scheduling process uses peer-to-peer communication to control multiple clusters of machines at different locations.
A compute cluster is a technology that allows a calculation to be done using multiple CPUs at a single site. This is normally done to improve performance by making more CPUs available for the calculation. Clustering technology can be used independently or as a component of a grid. The two technologies are complementary, and a good starting point for someone who wants to use them is to focus on clustering first and migrate to a distributed solution later. A number of products can be used to build a compute cluster, including a Microsoft product called Compute Cluster Server that works with Visual Studio 2005.
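To make the idea of a single scheduling process concrete, here is a minimal sketch in Python. The Cluster and GridScheduler classes, the site names, and the "most free CPUs" placement rule are all illustrative assumptions rather than the behavior of any particular product, and the network communication between sites is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Cluster:
    """A group of CPUs at one physical site (hypothetical model)."""
    site: str
    total_cpus: int
    busy_cpus: int = 0

    @property
    def free_cpus(self) -> int:
        return self.total_cpus - self.busy_cpus


@dataclass
class GridScheduler:
    """One scheduling process that treats every cluster as part of one pool."""
    clusters: list
    placements: dict = field(default_factory=dict)

    def total_free_cpus(self) -> int:
        return sum(c.free_cpus for c in self.clusters)

    def submit(self, job_id: str, cpus_needed: int) -> Optional[str]:
        """Place the job on the cluster with the most free CPUs.

        Returns the chosen site, or None if no single cluster can host it.
        """
        best = max(self.clusters, key=lambda c: c.free_cpus)
        if best.free_cpus < cpus_needed:
            return None
        best.busy_cpus += cpus_needed
        self.placements[job_id] = best.site
        return best.site


grid = GridScheduler([Cluster("new_york", 8), Cluster("london", 16)])
print(grid.submit("risk_report", 10))    # london
print(grid.submit("batch_revalue", 8))   # new_york
print(grid.submit("too_big", 12))        # None
```

The caller submits work to one scheduler and never needs to know which site runs it. A real grid would add queuing, failure handling, and the communication layer that this sketch omits.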
Software developers who use FINCAD Developer to build web-based systems may consider using compute clusters or grid computing as part of their solution. This article is not specifically focused on the FINCAD Developer product; instead, it discusses some of the issues and technologies involved in building larger software systems. The focus is on identifying independent calculations and distributing them to multiple CPUs to enhance performance, rather than on the low-level details of what happens inside the calculation routine. A popular method for building web systems is to use XML and web services as supporting technologies, producing systems with generic, interoperable interfaces. As these systems grow and require more processing power, cluster computing and grid computing can play a role in ensuring that they meet performance goals. In particular, this article describes the roles of cluster computing and grid computing as a means of overcoming performance issues in large web-based applications.
Advantages / Challenges
One advantage of grid computing is that it allows computer resources to be shared across networks. This can both increase the computational power available to programs and reduce the number of machines an organization needs. It allows a large number of low-cost machines to be linked together, rather than spending a large amount of money on a single machine or supercomputer with greater processing capability. It also makes applications easier to scale, since additional machines can be added to the grid. A number of large investment banks and securities firms have already used grid computing to build more scalable systems that reduce their IT costs by allowing computer resources to be pooled and shared by more than one group within the organization. One of the main challenges in building a system that uses grid computing is to identify calculations or processes that can be done independently and to manage the distribution of these tasks or jobs to multiple machines.
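The challenge of identifying independent calculations and distributing them can be sketched as follows. This is a minimal Python illustration, not a production design: value_position is a hypothetical stand-in for an expensive pricing routine, the position data is made up, and a local thread pool stands in for the machines of a cluster or grid.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def value_position(position: dict) -> float:
    """Hypothetical stand-in for an expensive pricing call.

    Each position is valued from its own inputs only, so no job depends
    on the result of any other job.
    """
    return position["quantity"] * position["price"]


def value_portfolio(positions: list, executor: Executor) -> float:
    """Fan the independent valuation jobs out to workers, then combine."""
    # executor.map returns results in input order while the jobs
    # themselves may run concurrently.
    values = list(executor.map(value_position, positions))
    return sum(values)


positions = [
    {"id": "bond_a", "quantity": 100, "price": 101.5},
    {"id": "swap_b", "quantity": 10, "price": 2500.0},
    {"id": "option_c", "quantity": 250, "price": 4.0},
]

# A local thread pool stands in for cluster nodes in this sketch.
with ThreadPoolExecutor(max_workers=3) as pool:
    total = value_portfolio(positions, pool)

print(total)  # 36150.0
```

Because value_portfolio accepts any Executor, the same calling code could in principle be pointed at a different back end. The important design step is the one the code takes for granted: confirming that each job really is independent of the others.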
Once you have been running applications on dedicated clusters of CPUs for a reasonable period of time, you will be ready for the next stage: developing a single scheduling process that controls several clusters of machines using peer-to-peer communications. A well-designed grid can improve performance if some applications running on the grid are idle when resources are needed for other applications. For example, you might have one application that recalculates the financial risk associated with a specified portfolio on demand, typically during the work day. With grid computing, this application might share resources with another application that does batch calculations and runs in the middle of the night. Since these applications would typically run at different times, pooling the machines together and treating them as one large virtual machine will almost always make more resources available to each application. Figuring out which computing clusters would work well together if combined is the big challenge in going from clustering solutions to grid computing. Combining clusters that need resources at the same time will simply decrease performance, since the network communications required to control the grid add overhead and the applications will spend most of their time competing for the resources in the pool.
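A small worked example may help show why the timing of demand matters. The sketch below compares the CPU-hours of unmet demand when two applications each own a dedicated cluster against the case where the clusters are pooled. All figures (8 CPUs per cluster, a demand of 12 CPUs, one CPU of coordination overhead) are invented for illustration and are not measurements.

```python
def unmet_demand(demand_by_hour, capacity):
    """CPU-hours of work that could not be served in the hour requested."""
    return sum(max(0, d - capacity) for d in demand_by_hour)


def compare_pooling(demand_a, demand_b, cpus_a, cpus_b, overhead_cpus=0):
    """Return (unmet CPU-hours with dedicated clusters, unmet when pooled).

    overhead_cpus models capacity lost to grid coordination traffic;
    it is an illustrative assumption, not a measured figure.
    """
    dedicated = unmet_demand(demand_a, cpus_a) + unmet_demand(demand_b, cpus_b)
    combined = [a + b for a, b in zip(demand_a, demand_b)]
    pooled = unmet_demand(combined, cpus_a + cpus_b - overhead_cpus)
    return dedicated, pooled


HOURS = range(24)
risk_day = [12 if 9 <= h < 17 else 0 for h in HOURS]   # on-demand, work day
batch_night = [12 if h < 4 else 0 for h in HOURS]      # overnight batch

# Demand at different times: pooling removes the shortfall.
complementary = compare_pooling(risk_day, batch_night, 8, 8, overhead_cpus=1)
# Demand at the same time: pooling only adds overhead.
competing = compare_pooling(risk_day, risk_day, 8, 8, overhead_cpus=1)

print(complementary)  # (48, 0)
print(competing)      # (64, 72)
```

With complementary schedules the pooled grid serves all demand, while two applications that peak together end up slightly worse off than they were on dedicated clusters.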
Standard Methods for Building Grids
As more and more companies begin to develop and use grid computing solutions, it becomes necessary to establish standard methods for building grids so that applications built by different people can work together more easily. An organization called the Global Grid Forum (GGF) was formed to establish a standard for building a grid solution. They refer to their standard as the Open Grid Services Architecture, and you can learn more about it at their website, http://www.ggf.org/. Another group, the Enterprise Grid Alliance (EGA), was formed for the same reason. This second group is more focused on business applications that run on a grid than on scientific high-performance computing applications. Their website is http://www.gridalliance.org/. If you are building a grid computing solution, you may find it advantageous down the road to follow one of these standards. Doing so makes it more likely that your solution will be compatible with other grid applications that you may want to develop or use in the future.
The software world is changing as legacy stand-alone applications are replaced by solutions that use a service-oriented architecture, in which components are loosely coupled and interchangeable. This new way of doing things is quite complementary to the way that grid computing works. Many web service applications use this development method, and I believe this is one of the reasons that groups such as GGF and EGA have taken the interests of web service application developers into account when establishing new standards for grid computing.
One of many tools that you could choose for implementing a grid is Digipede. It is of interest to developers who have built .NET web service applications. Digipede is also one of a small number of grid computing tools that has worked to make grid computing solutions accessible to spreadsheets. A new version of Microsoft Excel will be released next year that will provide support for multi-threading in spreadsheets. It is expected that tools such as Compute Cluster Server and Digipede will take advantage of this change and use it to increase the performance of large spreadsheets. The Digipede framework allows you to distribute and execute .NET objects across a network of computers. This provides a scalable method for increasing available computing resources and allows for faster execution of tasks such as Monte Carlo simulation and portfolio valuation.
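Monte Carlo simulation distributes well because the paths can be split into independent batches whose partial results are merged at the end. The following Python sketch illustrates that pattern only; it does not use or imitate the Digipede API. The function names, the option parameters, the one-seed-per-chunk scheme, and the use of a local thread pool in place of grid nodes are all assumptions made for the example.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_chunk(seed, n_paths, s0, strike, rate, vol, years):
    """Simulate one independent batch of terminal prices.

    Returns (sum of call payoffs, number of paths) so that partial
    results from different workers can be merged exactly.
    """
    rng = random.Random(seed)  # each chunk gets its own seeded generator
    drift = (rate - 0.5 * vol ** 2) * years
    scale = vol * math.sqrt(years)
    payoff_sum = 0.0
    for _ in range(n_paths):
        terminal = s0 * math.exp(drift + scale * rng.gauss(0.0, 1.0))
        payoff_sum += max(terminal - strike, 0.0)
    return payoff_sum, n_paths


def price_call(n_chunks, paths_per_chunk, executor, **params):
    """Send chunks to workers, then merge their partial sums."""
    futures = [
        executor.submit(simulate_chunk, seed, paths_per_chunk, **params)
        for seed in range(n_chunks)
    ]
    results = [f.result() for f in futures]
    total_payoff = sum(p for p, _ in results)
    total_paths = sum(n for _, n in results)
    discount = math.exp(-params["rate"] * params["years"])
    return discount * total_payoff / total_paths


params = dict(s0=100.0, strike=100.0, rate=0.05, vol=0.2, years=1.0)

# A local thread pool stands in for grid nodes in this sketch.
with ThreadPoolExecutor(max_workers=4) as pool:
    price = price_call(8, 5_000, pool, **params)

print(round(price, 2))
```

Returning sums and counts instead of per-chunk averages keeps the merge step correct even if chunks have different sizes. Seeding each chunk separately makes a run repeatable regardless of which worker handles which chunk.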
Building clustering and grid computing solutions is very complex, and there is a lot to learn before you can get started. This article provides a high-level overview of some of the technologies involved and the decisions that need to be made when using them. It includes some practical advice that may help provide direction as you learn more. As these technologies become more established and better understood, the tools will evolve and make implementing grids and clusters easier. These technologies may still be in their infancy, but the foundation and standards for building better tools have largely been established, and I expect that their use will increase significantly in the next few years as the migration from stand-alone systems to service-oriented applications continues.
References
Presentation documents from the 2006 Microsoft Financial Services Developer Conference in New York.
Rich Ciapala, "Develop TurboCharged Apps for Windows Compute Cluster Server", MSDN online article.
"Wall Street Plugs into Grid Computing", online article available through the http://www.sun.com/ website.
"Enterprise Grid Computing vs Clustering", online article from the AvarSYS website.