Q. Why do we need ETL tools?
Ans. Think of GE. The company has over 100 years of history and a presence in almost every industry. Over those years its management style has changed from bookkeeping to SAP. This transition did not happen in a single day. Along the way the company used a wide array of technologies: hardware ranging from mainframes to PCs, data storage ranging from flat files to relational databases, and programming languages ranging from COBOL to Java. As a result, different businesses, or to be precise different sub-businesses within a business, ended up running different applications on different hardware and different architectures. Technologies were introduced as and when they were invented and as and when they were required.
This led directly to a scenario in which the HR department of the company runs on Oracle Applications, Finance runs SAP, some part of the process chain is supported by mainframes, some data is stored in Oracle, some on mainframes, some in VSAM files, and the list goes on. If one day the company needs a consolidated report of its assets, there are two ways to produce it.
First, completely manually: generate separate reports from the different systems and integrate them by hand.
Second, fetch all the data from the different systems and applications, build a Data Warehouse, and generate reports as required.
The second approach is clearly the better one, because the consolidation is built once and can be rerun, instead of being redone by hand for every report.
Fetching the data from different systems, making it coherent, and loading it into a Data Warehouse requires some kind of extraction, cleansing, integration, and load. ETL stands for Extraction, Transformation and Load.
ETL tools provide the facility to extract data from different non-coherent systems, cleanse it, merge it, and load it into target systems.
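To make the extract, cleanse, merge, and load steps concrete, here is a minimal hand-written sketch in Python. It is only an illustration of the idea, not how any ETL tool works internally. The HR flat file, the finance `assets` table, the `employee_assets` warehouse table, and all function names are hypothetical and invented for this example.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source (HR data) with some dirty rows.
HR_CSV = """emp_id,name,department
1, alice ,HR
2,Bob,Finance
3,,Finance
"""


def extract_hr(text):
    """Extract: read rows from a flat file (CSV)."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_finance(conn):
    """Extract: read rows from a relational source."""
    return conn.execute("SELECT emp_id, asset_value FROM assets").fetchall()


def cleanse(rows):
    """Transform: trim whitespace, normalise names, drop rows with no name."""
    cleaned = []
    for row in rows:
        name = row["name"].strip().title()
        if name:
            cleaned.append({
                "emp_id": int(row["emp_id"]),
                "name": name,
                "department": row["department"].strip(),
            })
    return cleaned


def merge(hr_rows, finance_rows):
    """Transform: attach the total asset value to each employee."""
    totals = {}
    for emp_id, value in finance_rows:
        totals[emp_id] = totals.get(emp_id, 0) + value
    return [
        (r["emp_id"], r["name"], r["department"], totals.get(r["emp_id"], 0))
        for r in hr_rows
    ]


def load(warehouse, rows):
    """Load: write the merged rows into the target (warehouse) table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS employee_assets "
        "(emp_id INTEGER PRIMARY KEY, name TEXT, department TEXT, total_assets REAL)"
    )
    warehouse.executemany("INSERT INTO employee_assets VALUES (?, ?, ?, ?)", rows)
    warehouse.commit()


def run_etl():
    # Hypothetical relational source (finance data), built in memory for the demo.
    finance = sqlite3.connect(":memory:")
    finance.execute("CREATE TABLE assets (emp_id INTEGER, asset_value REAL)")
    finance.executemany(
        "INSERT INTO assets VALUES (?, ?)", [(1, 1200.0), (1, 300.0), (2, 950.0)]
    )
    warehouse = sqlite3.connect(":memory:")
    hr_rows = cleanse(extract_hr(HR_CSV))
    load(warehouse, merge(hr_rows, extract_finance(finance)))
    return warehouse.execute(
        "SELECT emp_id, name, department, total_assets "
        "FROM employee_assets ORDER BY emp_id"
    ).fetchall()


if __name__ == "__main__":
    for row in run_etl():
        print(row)
```

Even in this toy case, each new source or cleansing rule means more hand-written code, which is the kind of effort an ETL tool is meant to save.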
Q. What is Informatica?
Ans. Informatica is a tool that supports all the steps of the Extraction, Transformation and Load process. Nowadays Informatica is also used as an integration tool.
Informatica is an easy-to-use tool. It has a simple visual interface, like forms in Visual Basic. You just drag and drop different objects (known as transformations) and design a process flow for data extraction, transformation, and load. These process flow diagrams are known as mappings. Once a mapping is made, it can be scheduled to run as and when required. In the background, the Informatica server takes care of fetching data from the source, transforming it, and loading it into the target systems or databases.
Informatica can communicate with all major data sources (mainframe, RDBMS, flat files, XML, VSAM, SAP, etc.) and can move and transform data between them. It can move huge volumes of data very effectively, often better than even bespoke programs written only for a specific data movement. It can throttle transactions, that is, do big updates in small chunks to avoid long locks and a full transaction log. It can effectively join data from two distinct data sources (even an XML file can be joined with a relational table). In all, Informatica has the ability to effectively integrate heterogeneous data sources and convert raw data into useful information.
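The idea of throttling transactions, doing a big update in small committed chunks, can be sketched by hand as follows. This is a minimal illustration using Python and SQLite, not Informatica's own mechanism. The `orders` table, its `status` column, and the function names are hypothetical.

```python
import sqlite3


def build_demo_db(n_rows):
    """Create a hypothetical in-memory table with n_rows 'open' orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO orders (status) VALUES (?)", [("open",)] * n_rows
    )
    conn.commit()
    return conn


def update_in_chunks(conn, chunk_size=1000):
    """Archive all open orders, committing after every chunk_size rows.

    Each batch is its own short transaction, so locks are held briefly
    and no single transaction grows very large.
    Returns the total number of rows updated.
    """
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE orders SET status = 'archived' "
            "WHERE id IN (SELECT id FROM orders WHERE status = 'open' LIMIT ?)",
            (chunk_size,),
        )
        conn.commit()  # end the transaction for this chunk
        if cur.rowcount == 0:
            break  # nothing left to update
        total += cur.rowcount
    return total


if __name__ == "__main__":
    demo = build_demo_db(2500)
    print(update_in_chunks(demo, chunk_size=1000))
```

The trade-off is that the update as a whole is no longer atomic: if the job fails midway, the chunks already committed stay committed, so the job has to be safe to rerun.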
Before we actually start working in Informatica, let's get an idea of the company that owns this wonderful product.
Some facts and figures about Informatica Corporation:
Founded in 1993, based in Redwood City, California
1400+ Employees; 3450+ Customers; 79 of the Fortune 100 Companies
NASDAQ Stock Symbol: INFA; Stock Price: $18.74 (09/04/2009)
Revenues in fiscal year 2008: $455.7M
Informatica Developer Networks: 20000 Members
In short, Informatica is the world's leading ETL tool, and it is rapidly gaining market share as an Enterprise Integration Platform.