Saturday, March 1, 2014

ETL - Performing Predictive Analytics in the Flow

Until recently, ETL tools were seen only as a way to build the enterprise data warehouse (EDW), which is then used for various kinds of analytics later on. Tools like IBM SPSS can make predictions with a stated level of confidence, leading to smarter decisions. With changing industry dynamics and the growth of Big Data, real-time, "now" insight is taking precedence over insight delivered at some later point.
Today, predictive analytics typically follows a traditional batch flow: data is extracted from the EDW into separate marts, predictive models are applied to it, and the results are fed to decision makers. The analytic model is built once and then applied to large amounts of data in batch. This requires repeated I/O and transformation before the results reach the end application. A more efficient method is to run the analytic models as part of the import (or export) of new data into (from) the warehouse.
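To make the contrast with the batch flow concrete, here is a minimal Python sketch of scoring records inside the load pipeline itself, so each record is read and written only once. It is an illustration only, not how DataStage or SPSS implement it: the `score_churn` function is a hypothetical stand-in for a pre-built model, and the field names and the in-memory `warehouse` list are assumptions made up for the example.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, float]


def score_churn(record: Record) -> float:
    """Hypothetical stand-in for a model that was built once, offline."""
    # Toy linear rule, clipped to the range [0, 1].
    raw = 0.1 * record["support_calls"] - 0.02 * record["tenure_months"] + 0.3
    return max(0.0, min(1.0, raw))


def transform(records: Iterable[Record]) -> Iterator[Record]:
    """Ordinary ETL transformation: derive tenure in months from days."""
    for r in records:
        yield {**r, "tenure_months": r["tenure_days"] / 30.0}


def score_in_flow(records: Iterable[Record],
                  model: Callable[[Record], float]) -> Iterator[Record]:
    """Apply the model to each record while it is still moving through the flow."""
    for r in records:
        yield {**r, "churn_score": model(r)}


def load(records: Iterable[Record], target: List[Record]) -> int:
    """Write records to the target (a list standing in for a warehouse table)."""
    count = 0
    for r in records:
        target.append(r)
        count += 1
    return count


# Hypothetical source rows arriving from an operational system.
source = [
    {"customer_id": 1, "tenure_days": 900, "support_calls": 0},
    {"customer_id": 2, "tenure_days": 60, "support_calls": 5},
]

warehouse: List[Record] = []
loaded = load(score_in_flow(transform(source), score_churn), warehouse)
```

Because the stages are chained generators, a record is extracted, transformed, scored and loaded in a single pass, with no intermediate mart to write out and read back.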

Integrating ETL and predictive analytics tools opens up an entirely different set of business opportunities; the existing InfoSphere DataStage and SPSS integration already provides this capability. With this approach, an analytical model can be applied to data as it is ingested into the warehouse or mart, and the output can be stored directly in the resulting tables. Once the statistical model output is available in the data warehouse or data mart, business applications such as reporting tools and marketing campaigns can use it readily, without a separate analytic step.
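As a rough illustration of storing model output directly in the resulting table, the sketch below uses Python's built-in sqlite3 module as a stand-in for the mart. The table name `customer_mart`, its columns, the `score` function and the 0.8 threshold are all assumptions invented for the example, not part of any IBM product.

```python
import sqlite3


def score(amount: float) -> float:
    """Hypothetical model: larger purchases get a higher response score."""
    return round(min(1.0, amount / 1000.0), 3)


# An in-memory database stands in for the data mart.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_mart ("
    "customer_id INTEGER PRIMARY KEY, amount REAL, response_score REAL)"
)

# New rows are scored at ingest time and stored together with their score.
incoming = [(1, 250.0), (2, 1200.0), (3, 800.0)]
conn.executemany(
    "INSERT INTO customer_mart VALUES (?, ?, ?)",
    [(cid, amt, score(amt)) for cid, amt in incoming],
)
conn.commit()

# A campaign tool can now select targets with a plain query,
# without running a separate analytic step.
targets = [
    row[0]
    for row in conn.execute(
        "SELECT customer_id FROM customer_mart "
        "WHERE response_score >= 0.8 ORDER BY customer_id"
    )
]
```

The point of the design is that the score is just another column: any reporting tool that can query the mart can use it right away.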

-Ritesh 
Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions
