Matching Business Intelligence with Cloud Computing

By Randolph West and Brendon Bezuidenhout

With Cloud Computing fast becoming the next big thing in IT, the jury is still out on how it can best be matched up with Business Intelligence (BI) to deliver real and sustainable benefits to business.

The focus of BI is the gathering, modelling, transformation and analysis of data (often from disparate systems) in order to provide a “big picture” to a business. Traditionally, BI has lagged behind the business it is meant to support. Reporting and data analysis can take a long time to process, and may run outside of business hours in order to reduce the impact on day-to-day operations. Businesses must wait to understand what is going on, and rely on old information to make decisions.

Enter Cloud Computing. The main benefits include lower cost, multiple redundant sites, and scalable provisioning of resources, which in turn allow for business continuity, disaster recovery and on-demand performance improvements.

It is entirely feasible for Business Intelligence functions, such as data modelling, transformation and analysis, to be spread over a distributed system (or “grid”), thereby decreasing the time it takes to return value to the business. This in turn reduces the time it takes for a business to react to changing conditions.
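
As a rough illustration of spreading analysis over several workers, the sketch below aggregates partitions of sales data concurrently and then merges the partial results. It is a minimal sketch under stated assumptions: the thread pool merely stands in for the nodes of a grid, and the function names and sample data are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def summarise_partition(rows):
    """Aggregate one partition: total sales amount per region."""
    totals = Counter()
    for region, amount in rows:
        totals[region] += amount
    return totals


def summarise(partitions, max_workers=4):
    """Aggregate partitions concurrently, then merge the partial totals."""
    merged = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each worker stands in for one node of the grid (an assumption).
        for partial in pool.map(summarise_partition, partitions):
            merged.update(partial)  # Counter.update adds the amounts together
    return dict(merged)


# Hypothetical sample data: each inner list is one partition of the data set.
partitions = [
    [("north", 100), ("south", 50)],
    [("north", 25), ("east", 75)],
]
result = summarise(partitions)
```

The same split-then-merge shape applies whether the workers are threads on one machine or machines in a data centre; only the way work is dispatched changes.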

But what of the data gathering?

Much is being written about cloud computing. If we leave BI out of the equation for a moment, businesses are moving their internal systems to scalable, often virtualised, third-party data centres operated as a service over the Internet.

In fact, due to the growing volume of data, requirements around compliance (including data retention), and associated costs of scaling systems, more and more businesses are likely to “head to the cloud” to satisfy their needs and reduce costs. Replacing capital expenditure with operational expenditure looks good on the balance sheet, and more of the IT infrastructure is outsourced. The business doesn’t have to worry about their data.

Or do they?

With cloud computing, the business does not necessarily own their data anymore. If the business is not in control of the hardware, they are not in control of their sensitive data. They now rely on (at least) one third-party service provider – a major challenge facing cloud computing customers. Even the mighty Google, the poster child of cloud computing, suffers from major outages affecting its online business offerings from time to time.

The business may also find that the cloud is not conducive to some in-house systems. For example, they may be operating a system with legacy code (a mainframe for instance), and porting it to the cloud would be too expensive and time-consuming. Even where systems would be easy to migrate, a business may choose not to do so for privacy reasons.

When one considers that scenario, it is easy to see how the BI process actually slows down. A business is not likely to push all of their systems into the cloud, least of all to one service provider. BI data gathering mechanisms will have to take this into account. If one of the links between two third-party providers is down, how does that affect the process? If a service provider goes out of business as a result of an economic downturn, what happens to the data? What if the entire BI process is outsourced to a single provider that has a massive outage? How about an attack or outage on the Internet’s physical infrastructure? Is it then just collateral damage that the business suffers?
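
One way a data gathering step might take this into account is to treat every provider as independently fallible, collect what is reachable, and report which sources were missing so that the resulting report can be flagged as incomplete. The sketch below is illustrative only; the source names and fetch functions are hypothetical stand-ins for real provider connections.

```python
def gather(sources):
    """Collect rows from every reachable source and list the ones that failed."""
    rows, unavailable = [], []
    for name, fetch in sources.items():
        try:
            rows.extend(fetch())
        except ConnectionError:
            # The provider or the link to it is down: note it and carry on.
            unavailable.append(name)
    return rows, unavailable


def fetch_crm():
    """Hypothetical provider that responds normally."""
    return [{"source": "crm", "customers": 120}]


def fetch_billing():
    """Hypothetical provider that is suffering an outage."""
    raise ConnectionError("provider unreachable")


rows, unavailable = gather({"crm": fetch_crm, "billing": fetch_billing})
```

Here `unavailable` names the sources that could not be reached, which lets the business see that a report was built on partial data instead of silently trusting it.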

There are more challenges facing cloud computing in general, and Business Intelligence specifically. As businesses move to the cloud, they put an additional burden on the Internet itself. Bandwidth requirements increase. High availability and redundancy are effectively outsourced to the Internet Service Providers and the telecommunications providers. Telkom, Neotel and the cellular providers are all expected to provide uptime of close to 100%, along with the ISPs.

Consider this: as users become used to the speed of a faster Internet, with immediate feedback from social media sites like Twitter, Facebook and FriendFeed, and from search engines like Google, Yahoo! and Bing, they expect zero latency from cloud computing as well. If Google, which is free, is fast, then the Software-as-a-Service we’re paying for must be fast too. This expectation will in turn apply to BI reporting, regardless of where it, or the data on which it depends, is hosted.

So we sit with a problem: as more people put their data in the cloud, they expect it to be fast. On the other hand, by putting this data in the cloud, they are slowing down the entire system.

How do we fix this?

BI from the ground up: Build real-time data modelling and analysis into our systems from the beginning. Business Intelligence should not be an afterthought. If we’re moving to the cloud, let’s make our systems support BI functionality while we’re porting them.

Open standards: If we expect systems to talk to each other over the Internet, we need to adopt open standards, from security to the data itself.

The backbone: Increase local and international bandwidth. Lower the cost of international bandwidth (Seacom is helping with that). Lower the cost of inter-network data exchange. The time is fast approaching when mobile carriers will simply be ISPs, and ISPs will provide data centres to facilitate cloud computing.

Bring in the cloud: There are grid computing solutions available that larger businesses can bring in-house, with a backup solution at a disaster recovery site elsewhere.
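
To make the first suggestion above more concrete, the sketch below shows what building analysis into a system from the beginning might look like: totals are updated as each transaction is recorded, so a report never has to wait for an overnight batch run. The class and method names are hypothetical, and a real system would also need persistence and concurrency control.

```python
class RunningSales:
    """Keeps per-region totals current as each transaction is recorded."""

    def __init__(self):
        self.totals = {}
        self.count = 0

    def record(self, region, amount):
        """Store one transaction and update the aggregates immediately."""
        self.totals[region] = self.totals.get(region, 0) + amount
        self.count += 1

    def average(self):
        """Average transaction value across all regions so far."""
        total = sum(self.totals.values())
        return total / self.count if self.count else 0.0


sales = RunningSales()
sales.record("north", 100)
sales.record("south", 50)
sales.record("north", 30)
```

After these three transactions, `sales.totals` and `sales.average()` already reflect the latest data without any separate reporting job.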

As business looks to adopt a consolidated approach to systems by entering Cloud Computing, so Business Intelligence should be integrated from the ground up. Real-time BI becomes possible when businesses appreciate the value of a “big picture” outcome. Departments will share systems and data, and grid computing offers a one-stop shop for consolidation.

