|
Business Intelligence (BI) systems increasingly provide the data that drives both strategic and tactical decisions for an enterprise. Many businesses have already invested heavily in aggregating data from diverse systems and applications to create a whole-enterprise view that reflects the daily state of the business and supports more effective, informed decisions. As businesses use BI systems more heavily, the challenge for IT is to keep the underlying databases accurate so that decisions are made on trustworthy data.
Businesses need many types of complex and dynamic data integration to remain competitive and improve profitability. One example is the compilation of whole customer profiles that are kept updated in near real-time. This consolidated, at-your-fingertips information about customers and their activities is created by pulling data from a number of systems into a view that is essential for targeting sales in real-time; for example, it can enable up-sell campaigns during customer service calls. The customer profile can also serve as an input to fraud detection systems. A second example is the consolidation of data across multiple portfolio accounting systems to support risk mitigation and portfolio analysis. Further examples include financial modeling for cost allocation of services, reconciliation checking of trades and transfers, and compliance and audit reporting. As the operational warehouse matures, new applications for the data typically emerge, based on the business’s experience with the information that is available.
A BI approach is just one solution to enterprise-wide data integration. To many, its benefits are ready accessibility and scalability. It does, however, have its challenges. The information retrieved from a BI system is only as trustworthy as the data put into it. Generally, bad information is not the result of bad source data.
More often, bad information is the result of pulling out-of-sync data into the system due to problems in the data processing flow. Problems in this flow, referred to as Extract-Transform-Load (ETL), not only produce bad or inaccurate information but also undermine information availability, process auditability, and agility in responding to changing business needs. Bad information can lead to bad decisions, which can negatively affect the course of the business. The full scope of problems also leads to high costs for IT, potential issues in governance and compliance, and an inability to provide a whole-enterprise view that is kept current with business needs.
Bad data, or data that is not current, has another, broader impact: it leads to distrust of the system. When the BI system cannot be fully relied on, its uses rapidly become limited. Errors require complex and time-consuming adjustments and re-running of the downstream analysis. Ultimately, the business loses faith in the data. Perhaps the biggest hit is that strategic business planning is no longer strategic, or comes too late to capitalize on changing market opportunities.
A well-run BI system reliably presents current data from all of the available data types and becomes an integral part of the business applications. The longer-term result is that the business continually finds better uses for the data because it trusts it. Achieving trustworthy data requires that IT have complete, in-depth visibility into, and full control over, the fragile data flow processes.
Scripts, Custom Code, and Islands of Automation
In typical environments, the ETL processing flows are set up and maintained by the enterprise’s IT group. The group works in conjunction with other departments to identify the needed information and its role in the overall data warehouse.
There can be hundreds, if not thousands, of data sources, including legacy databases, application databases, departmental data marts, inbound data from partners, and many other sources of information scattered throughout the IT landscape. Each source often has its own issues with access method, data content and quality, update and arrival schedules, and transformation requirements. To consolidate and integrate this information, each portion of the ETL process is often performed by a different type of technology, most of which have little, if any, management or administrative control.
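To make the out-of-sync risk concrete, the following is a minimal Python sketch of a freshness gate that blocks a load until every source has delivered a recent enough extract. It is an illustration only, not a description of any particular product: the `SourceStatus` record, its fields, the source names, and the 24-hour freshness windows are all assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SourceStatus:
    """Hypothetical record describing one upstream data source."""
    name: str
    access_method: str      # illustrative label, e.g. "jdbc" or "sftp"
    last_arrival: datetime  # when the most recent extract landed
    max_age: timedelta      # how old an extract may be before a load


def stale_sources(sources, now):
    """Return the names of sources whose latest extract is older than allowed."""
    return [s.name for s in sources if now - s.last_arrival > s.max_age]


def ready_to_load(sources, now):
    """A load is considered safe only when no source is stale."""
    return not stale_sources(sources, now)


# Illustrative data: one source arrived on schedule, one is overdue.
now = datetime(2024, 1, 2, 6, 0)
sources = [
    SourceStatus("legacy_accounts", "jdbc",
                 datetime(2024, 1, 2, 1, 0), timedelta(hours=24)),
    SourceStatus("partner_trades", "sftp",
                 datetime(2023, 12, 31, 23, 0), timedelta(hours=24)),
]

if not ready_to_load(sources, now):
    print("Load blocked; stale sources:", stale_sources(sources, now))
```

In this sketch the load is held back because the hypothetical `partner_trades` feed is older than its window. A check of this kind is only useful if it is applied consistently across every source, which is difficult when each portion of the flow is run by a separate script or tool.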
|