GridGain provides a computational grid framework. These days, CPUs are gaining performance by adding more cores rather than more MHz. While this may continue to fulfill Moore’s Law, developing multi-core applications is becoming an increasingly complex and difficult problem. GridGain alleviates this by providing a powerful framework to address these issues. Here is an exclusive interview with Nikita Ivanov, Grid Computing and Research Advisory, at GridGain.

Nikita Ivanov

With so many technologies out there, what does GridGain offer?

GridGain is a Java-based open source computational grid framework. Compute grids solve a basic parallel computing problem: if a task takes an unacceptably long time to execute on a single computing resource, you can split it into sub-tasks, execute them in parallel on separate computing resources, aggregate the results, and get the final result back in a fraction of the time.
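The split, execute, and aggregate cycle described above can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use the GridGain API, the class and method names (`SplitAggregateSketch`, `split`, `sumInParallel`) are hypothetical, and a local thread pool stands in for the separate computing resources of a real grid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; local threads stand in for grid nodes.
class SplitAggregateSketch {

    // Split: cut the range 1..n into at most `parts` sub-tasks, each summing one slice.
    static List<Callable<Long>> split(long n, int parts) {
        List<Callable<Long>> jobs = new ArrayList<>();
        long chunk = Math.max(1, (n + parts - 1) / parts);
        for (long start = 1; start <= n; start += chunk) {
            final long from = start;
            final long to = Math.min(n, start + chunk - 1);
            jobs.add(() -> {
                long sum = 0;
                for (long i = from; i <= to; i++) {
                    sum += i;
                }
                return sum;
            });
        }
        return jobs;
    }

    // Execute the sub-tasks in parallel, then aggregate the partial sums.
    static long sumInParallel(long n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            long total = 0;
            for (Future<Long> partial : pool.invokeAll(split(n, workers))) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("sub-task did not complete", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumInParallel(1_000_000L, 4));
    }
}
```

On a real grid the sub-tasks would be shipped to remote nodes rather than to threads, but the shape of the program stays the same: a split step, independent jobs, and a final aggregation.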

GridGain Architecture

Surprisingly, there are very few products that solve exactly this problem in the Java space, and maybe just a handful overall.

GridGain offers a consistent programming model whether you are running a couple of test nodes on a laptop or in a large data center with hundreds upon hundreds of servers. One of the key characteristics that we built into GridGain from the ground up is a simplified programming model. Grid computing in general is bloated with unnecessary complexity and cumbersome programming APIs. Look at EJB2 vs. Spring/Hibernate stacks – that’s exactly the difference we are making with GridGain.

Another key problem that GridGain addresses is that, despite all the complexity of the established grid computing products, they still have no proper integration with the JEE stack. It is still a pain to use them if you are writing Java enterprise software. With GridGain, you are right “at home”. Integration with Spring and all major application servers, a familiar deployment model, resource injection and annotations, peer class loading, and AOP-based grid enabling all make GridGain a very natural tool for the enterprise Java developer.

What are some areas where GridGain is heavily used?

I think it’s the pretty usual mix of various financial applications, biotech, banking, telcos, and insurance. But what’s more important for us is that we are getting a steadily growing stream of usage from a different type of client – clients that would never have thought about using grid computing before, but with GridGain they do, and they use it now: a small hosting company analyzing log files, development shops distributing JUnit tests to improve build times, etc.

GridGain has in many ways redefined how people look at traditional parallel computing. Its unique combination of ease of use, full integration with the JEE stack, and advanced features provides a very different experience for developers when it comes to grid computing.

What is the difference between your technology and, say, a database, filesystem, or messaging service?

The difference is rather obvious in my opinion: it is the problem GridGain solves that differentiates it from the technologies you have mentioned.

How is this different from a caching system? Is caching a competing technology?

No, not really. They are very complementary. You almost always find both data and compute grids in a system (if either one is used).

I often say that compute grids parallelize processing of data, while data grids parallelize storage of data. These are two very different technologies: they solve different problems and they have very different usage patterns in enterprise systems.

It is important to note that some data grid products have some rudimentary functionality for remote execution. However, it is usually far behind what computational grids require.

As I mentioned above, the combination of data and compute grids is essential, and we at GridGain took a different approach to it: instead of building yet another data grid implementation, we decided to concentrate on building a compute grid (where we believe we can bring the most value) and on providing deep integration with all the leading data grid products out there, so that our customers have plenty of options to make the right choice. We already provide integration with Oracle Coherence and GigaSpaces. In the next releases, we’ll extend it and add JBoss Cache, OSCache and Ehcache integration.

Is GridGain a compute or data grid?

As I mentioned before, GridGain is a compute grid framework. We integrate natively with two leading data grids, Oracle Coherence (formerly Tangosol Coherence) and GigaSpaces. We’ll provide out-of-the-box integration with more open source caching products in the forthcoming releases. As I noted above, many data grid products have some rudimentary compute grid functionality, but it is really an afterthought and very simplistic.

In my previous company we built a combined product with data and compute grids together. Although there’s nothing impossible in this approach, with GridGain we decided to concentrate on doing one thing and doing it better than anyone else: providing a Java-based computational grid framework. Many data grid products have grown out of distributed caching. This field is very mature and we don’t think there’s much sense in creating yet another data grid framework.

What is the scalability of the GridGain platform?

It’s almost impossible to answer this question objectively or accurately :-) But seriously, scalability of distributed systems is as far away from being an exact science as possible. Cameron Purdy has a few great slides in his “10 Ways To Botch Scalability in Enterprise Systems” presentation that discuss it very succinctly, specifically how misleading certain aspects can be when dealing with the scalability and throughput of a distributed system.

About the only good approximation you can get is by running your application on the grid in an environment as close to production as possible. And even in this situation, the variation from real-life results can be significant.

GridGain, as a product, provides a significant amount of direct support and features to aid scalability. You can dynamically add and remove nodes from the grid, and you can have a very intelligent and dynamic split. GridGain also supports collision resolution, which allows for fine-grained distribution control and preemption logic, as well as fully customizable failover.

Furthermore, with our unique SPI-based integration we can fully blend into the hosting environment, if required, and use the underlying host protocols, avoiding duplicate implementation. For example, with Coherence as a hosting environment, we fully blend with the Coherence cluster discovery and cluster communication protocols, essentially becoming as scalable as Coherence itself in this regard. That gives very predictable scalability metrics to customers who are running us on top of Coherence.
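To make the idea of failover concrete, here is a minimal sketch in plain Java of one possible policy: if a job fails on one node, retry it on the next. It is not GridGain's failover SPI. The `Node` type, the `executeWithFailover` method, and the "try nodes in order" policy are all assumptions chosen for illustration; a customizable failover mechanism would let you replace exactly this policy.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration of a failover policy; not the GridGain API.
class FailoverSketch {

    // Stand-in for a grid node: a name plus the work it performs on an input.
    static final class Node {
        final String name;
        final Function<Integer, Integer> work;

        Node(String name, Function<Integer, Integer> work) {
            this.name = name;
            this.work = work;
        }
    }

    // Assumed policy: try each node in order and fail over to the next on error.
    static int executeWithFailover(List<Node> nodes, int input) {
        RuntimeException lastFailure = null;
        for (Node node : nodes) {
            try {
                return node.work.apply(input);
            } catch (RuntimeException e) {
                lastFailure = e; // remember the failure and move on to the next node
            }
        }
        throw new IllegalStateException("job failed on all nodes", lastFailure);
    }

    // Demo: the first node is "down", so the job completes on the second one.
    static int demo() {
        Node broken = new Node("node-1", x -> { throw new RuntimeException("node down"); });
        Node healthy = new Node("node-2", x -> x * 2);
        return executeWithFailover(Arrays.asList(broken, healthy), 21);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Other policies, such as retrying on the least loaded node or giving up after a fixed number of attempts, would only change the loop in `executeWithFailover`.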

What platforms does GridGain support or integrate with?

First and foremost, we are a Java-based product: it’s written in Java, but most importantly it is written for Java developers. From the get-go, we took a no-nonsense approach to design and followed convention over configuration wherever we could. Java developers will feel instantly at home when they work with GridGain, and that was the goal from the beginning.

From my past experience of building cross-platform (Java and .NET) middleware, I’ve learned that having native APIs in different languages is highly overrated. Developers will very rarely find it convenient or productive to call a .NET product from Java, or Java middleware from PHP, for example. Moreover, modern WS-* access solves most of these problems rather naturally.

Going even further, Java 6 will provide very nice integration with Java-based scripting languages like Groovy, which provides yet another natural platform-level extension to GridGain.

How much does it cost?

Well, nothing :-) It is open source software under the LGPL license, which should suit practically everyone. We often say: if you can use JBoss (licensed under the same LGPL license), you can use GridGain.

GridGain Systems provides commercial services for GridGain software. These include two levels of support subscription, training, and consulting. We are able to provide these services pretty much throughout the world, as we are conveniently located in the US and Europe. All prices for services are on the website; we are absolutely open about them so that every customer can decide for themselves, and there’s no sales pressure.

What is next?

As with any young and dynamically growing product, we have a lot in the pipeline. We have gotten off to an unexpectedly positive start, with almost 3,000 downloads in just a few months, and that number is growing rapidly. Our usage has grown almost exponentially in the same short period of time, and we are now taking a careful look at how we add new features and improve existing ones.

We will continuously improve our integration story. In GridGain 1.6 we are shipping new integration with GlassFish and, most notably, the JUnit testing framework. In upcoming releases we will provide tighter integration with data caching products, with features such as affinity split and data and computation co-location.

The next significant release, GridGain 2.0, will include a comprehensive web-based management console, which is certainly missing from our product right now. It will be available to our support subscribers.
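One way to picture data and computation co-location is to route each job to the node that already stores the data it needs. The sketch below, in plain Java, assumes a simple hash-based placement rule and groups keys by their owning node, which is one plausible reading of an "affinity split". The names (`AffinitySketch`, `ownerOf`, `affinitySplit`) and the placement rule are assumptions for illustration, not GridGain or data grid APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of affinity-based job routing; not the GridGain API.
class AffinitySketch {

    // Assumed placement rule: a key is stored on node (hash mod nodeCount).
    static int ownerOf(String key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    // Group keys by owning node, so each node receives one job
    // that touches only the keys it already holds locally.
    static Map<Integer, List<String>> affinitySplit(List<String> keys, int nodeCount) {
        Map<Integer, List<String>> jobsByNode = new TreeMap<>();
        for (String key : keys) {
            jobsByNode.computeIfAbsent(ownerOf(key, nodeCount), n -> new ArrayList<>()).add(key);
        }
        return jobsByNode;
    }

    public static void main(String[] args) {
        System.out.println(affinitySplit(Arrays.asList("a", "b", "c", "d"), 3));
    }
}
```

The benefit of this kind of split is that the computation travels to the data instead of the data being pulled across the network to the computation.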

And as always, with each incremental release of GridGain we improve on our core functionality, stability, performance and usability – some of the core characteristics that set GridGain apart from other products.

Is there anything you’d like to add?

Try GridGain and it may change your idea about grid computing forever! Even if you never thought that high performance or grid computing was something you might need, that may have been just the perception of complexity and weight that current solutions carry.

GridGain is different. If you like Spring or Seam or other modern JEE frameworks, you will like GridGain pretty much instantly. It is built on the same principles, although it solves very different problems. The fundamental idea behind parallel computing is simple and powerful even for everyday development. It just needed the right implementation, and GridGain is certainly one.