ETL testing plays a crucial role in any Business Intelligence (BI) application. To ensure quality and business acceptance, the BI application must be thoroughly tested before it goes into operation.
The main goal of ETL testing is to verify that the Extract, Transform & Load functionality is aligned with business requirements and performance standards.
Before diving into ETL Testing with Informatica, it is important to understand the concepts of ETL and Informatica itself.
What You Will Learn:
- Foundations of ETL, Informatica & ETL testing.
- Understanding ETL testing in the context of Informatica.
- Classifying ETL testing in Informatica.
- Sample test cases for Informatica ETL testing.
- Advantages of using Informatica as an ETL tool.
- Tips & Tricks for Informatica ETL testing.
In the field of computing, Extract, Transform, Load (ETL) refers to a database process, particularly in data warehousing, that performs:
- Data Extraction – obtains data from homogeneous or heterogeneous data sources.
- Data Transformation – changes the data into the required format.
- Data Loading – transfers and stores the data in a permanent location for long-term use.
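For intuition, these three steps can be sketched in a few lines of Python. This is only an illustrative toy pipeline: the file name, table name, and columns are invented, and SQLite stands in for a real data warehouse target.

```python
import csv
import sqlite3

def extract(path):
    # Extract: read raw records from a flat-file source
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    # Transform: normalize the price field into the required numeric format
    return [{**r, "price": float(r["price"])} for r in rows]

def load(rows, conn):
    # Load: persist the transformed records into the permanent target table
    conn.execute("CREATE TABLE IF NOT EXISTS product (name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO product (name, price) VALUES (:name, :price)", rows
    )
    conn.commit()
```

In a real ETL tool such as Informatica, each of these steps is configured graphically rather than hand-coded, but the testing concerns (completeness, correctness, performance) map onto the same three stages.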
Informatica PowerCenter ETL Testing Tool:
Informatica PowerCenter is a robust ETL tool developed by Informatica Corporation. It serves as a unified enterprise data integration platform that can access, discover, and integrate data from a wide variety of business systems, in any format, and deliver it throughout the organization. Through Informatica PowerCenter, workflows can be created to execute end-to-end ETL operations.
Download and Install Informatica PowerCenter:
To install and configure Informatica PowerCenter 9.x, follow the step-by-step instructions provided in the following link:
=> Informatica PowerCenter 9 Installation and Configuration Guide
Understanding ETL testing specific to Informatica:
ETL testers often have specific questions about what aspects to test in Informatica and how comprehensive the test coverage needs to be.
Let’s take a detailed look at how to perform ETL testing specific to Informatica.
The key areas that should be covered in Informatica ETL testing are:
- Testing the functionality of Informatica workflow and its components, including all transformations used in the underlying mappings.
- Checking data completeness: ensuring that the projected data is loaded into the target without any truncation or data loss.
- Verifying that data is loaded into the target within the estimated time limits, evaluating the workflow’s performance.
- Ensuring that the workflow prevents loading invalid or unwanted data into the target.
Classification of ETL Testing in Informatica:
For better understanding and ease of testing, ETL testing in Informatica can be divided into two main parts:
#1) High-level testing
#2) Detailed testing
In high-level testing:
- Verification of the validity of Informatica workflow and related objects.
- Confirmation of successful completion of the workflow execution.
- Validation of the execution of all the required sessions/tasks in the workflow.
- Confirmation of data loaded into the desired target directory with the expected filename (if the workflow generates a file), and so on.
In summary, high-level testing includes basic sanity checks.
In the case of detailed testing in Informatica, a more in-depth validation is performed to ensure that the logic implemented in Informatica produces expected results and meets performance requirements:
- Validation of output data at the field level, confirming that each transformation operates correctly.
- Verification of the record count at each processing stage and ultimately in the target.
- Close monitoring of elements such as the source qualifier and the target in the session's source/target statistics.
- Ensuring that the runtime of the Informatica workflow aligns with the estimated run time.
In conclusion, detailed testing involves rigorous end-to-end validation of the Informatica workflow and the flow of data related to it.
Here’s an example:
We have a flat file containing information about various products. It includes details such as the product’s name, description, category, expiry date, and price.
Our requirement is to extract each product record from the file, generate a unique product ID for each record, and load it into the target database table. We also need to exclude products that belong to category ‘C’ or have an expiry date earlier than the current date.
Assuming our flat file (source) looks like this:
According to the requirements stated above, our database table (target) should have the following structure:
Table name: Tbl_Product
Prod_ID (Primary Key) | Product_name | Prod_description | Prod_category | Prod_expiry_date | Prod_price |
---|---|---|---|---|---|
1001 | ABC | This is product ABC. | M | 8/14/2017 | 150 |
1002 | DEF | This is product DEF. | S | 6/10/2018 | 700 |
1003 | PQRS | This is product PQRS. | M | 5/23/2019 | 1500 |
Let’s say we have developed an Informatica workflow to meet the requirements of our ETL process.
The underlying Informatica mapping will read data from the flat file, process it through a router transformation to discard rows based on category and expiry date, and utilize a sequence generator to assign unique primary key values for the Prod_ID column in the Product Table.
Finally, the records will be loaded into the Product table, which serves as the target for our Informatica mapping.
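The router-plus-sequence-generator logic described above can be approximated in plain Python. This is only a behavioral sketch of the mapping, not Informatica code; the column names follow the sample target table and the start value 1001 matches the scenario.

```python
from datetime import date

def run_mapping(source_rows, today, start_value=1001):
    """Mimic the mapping: drop category 'C' and expired products,
    then assign sequential Prod_ID values starting at start_value."""
    seq = start_value
    target = []
    for row in source_rows:
        # Router transformation: discard rows in category 'C'
        # or with an expiry date earlier than the current date
        if row["Prod_category"] == "C" or row["Prod_expiry_date"] < today:
            continue
        # Sequence generator: assign the next unique primary key value
        target.append({"Prod_ID": seq, **row})
        seq += 1
    return target
```

With the five sample source rows, two records are suppressed and the surviving three are loaded with Prod_ID values 1001 through 1003, matching the target table shown above.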
Here are some sample test cases for the explained scenario:
You can use these test cases as templates for your Informatica testing project and adjust them as per the functionality of your workflow.
#1) Test Case ID: T001
Test Case Purpose: Validate Workflow – [workflow_name]
Test Procedure:
- Go to workflow manager
- Open the workflow
- Select Workflows menu -> Click on validate
Input Value/Test Data:
Sources and targets are available and connected
Sources: [all source instance names]
Mappings: [all mapping names]
Targets: [all target instance names]
Session: [all session names]
Expected Results:
Message in the workflow manager status bar: “Workflow [workflow_name] is valid”
Actual Results:
Message in the workflow manager status bar: “Workflow [workflow_name] is valid”
Remarks: Pass
Tester Comments:
#2) Test Case ID: T002
Test Case Purpose: Verify successful execution of the workflow
Test Procedure:
- Go to workflow manager
- Open the workflow
- Right-click in the workflow designer and select “Start workflow”
- Check the status in the Workflow Monitor
Input Value/Test Data:
Same as the test data for T001
Expected Results:
Message in the output window in Workflow manager: Task Update: [workflow_name] (Succeeded)
Actual Results:
Message in the output window in Workflow manager: Task Update: [workflow_name] (Succeeded)
Remarks: Pass
Tester Comments: Workflow succeeded
Note: You can easily view the workflow run status (failed/succeeded), along with its start time and end time, in the Workflow Monitor. Once the workflow completes, the status is automatically reflected there.
#3) Test Case ID: T003
Test Case Purpose: Validate the number of records loaded into the target table
Test Procedure:
- After the successful execution of the workflow, navigate to the target table in the database
- Check the number of rows in the target database table
Input Value/Test Data:
5 rows in the source file
Target: Database table – [Tbl_Product]
SQL query to run in SQL server: Select count(1) from [Tbl_Product]
Expected Results:
3 rows selected
Actual Results:
3 rows selected
Remarks: Pass
Tester Comments:
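A completeness check like T003 is easy to automate. The sketch below uses SQLite as a stand-in for the actual target database, and the table name is taken from the example; in practice you would point it at your real connection.

```python
import sqlite3

def assert_row_count(conn, table, expected):
    # Completeness check: the target row count must equal the expected count.
    # The table name is interpolated directly, so it must come from trusted
    # test configuration, never from user input.
    actual = conn.execute(f"SELECT COUNT(1) FROM {table}").fetchone()[0]
    assert actual == expected, f"{table}: expected {expected} rows, got {actual}"
    return actual
```

For the scenario above, the expected count is 3 (5 source rows minus the 2 suppressed by the router).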
#4) Test Case ID: T004
Test Case Purpose: Check the functionality of the sequence generator in populating the [primary_key_column_name e.g. Prod_ID] column
Test Procedure:
- After the successful execution of the workflow, navigate to the target table in the database
- Check the unique sequence generated in the Prod_ID column
Input Value/Test Data:
Value for Prod_ID left blank for every row in the source file
Sequence Generator mapped to Prod_ID column in the mapping
Sequence generator start value set as 1001
Target: Database table – [Tbl_Product] opened in SQL Server
Expected Results:
Values from 1001 to 1003 populated against every row in the Prod_ID column
Actual Results:
Values from 1001 to 1003 populated against every row in the Prod_ID column
Remarks: Pass
Tester Comments:
#5) Test Case ID: T005
Test Case Purpose: Verify the accuracy of the router transformation in suppressing records based on category and expiry date.
Test Procedure:
- After the successful execution of the workflow, navigate to the target table in the database
- Run a SQL query on the target table to check if the desired records have been suppressed.
Input Value/Test Data:
5 rows in the source file
Target: Database table – [Tbl_Product]
SQL query to run in SQL Server: Select * from Tbl_Product where Prod_category = 'C' or Prod_expiry_date < GETDATE();
Expected Results:
No rows selected
Actual Results:
No rows selected
Remarks: Pass
Tester Comments: (if any)
#6) Test Case ID: T006
Test Case Purpose: Evaluate the performance of the workflow by recording the runtime.
Test Procedure:
- Open the Workflow Monitor and navigate to the run that was performed as part of T002.
- Record the start time and end time of the workflow.
- Calculate the total runtime by subtracting the start time from the end time.
Input Value/Test Data:
Workflow has run successfully
Start time of the workflow in the monitor
End time of the workflow in the monitor.
Expected Results:
2 minutes 30 seconds
Actual Results:
2 minutes 15 seconds
Remarks: Pass
Tester Comments: The test is considered 'Pass' if the actual runtime is within +/- 10% of the expected runtime.
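The +/- 10% tolerance rule in T006 reduces to simple arithmetic: with an expected runtime of 2 min 30 s (150 s), any actual runtime between 135 s and 165 s passes, so the observed 2 min 15 s (135 s) is exactly at the boundary and passes. A small helper makes the rule explicit:

```python
def runtime_within_tolerance(expected_seconds, actual_seconds, tolerance=0.10):
    # Pass if the actual runtime deviates from the expected runtime
    # by no more than the given tolerance (10% by default, as in T006)
    return abs(actual_seconds - expected_seconds) <= tolerance * expected_seconds
```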
#7) Test Case ID: T007
Test Case Purpose: Validate data at the target table’s column level to ensure there is no data loss.
Test Procedure:
- After the successful execution of the workflow, go to the SQL Server.
- Run a SQL query on the target table to check for any data loss.
Input Value/Test Data:
Workflow has run successfully
One sample record from the source flat file.
SQL Query: Select Top 1 * from Tbl_Product;
Expected Results:
Prod_ID (Primary Key) | Product_name | Prod_description | Prod_category | Prod_expiry_date | Prod_price |
---|---|---|---|---|---|
1001 | ABC | This is product ABC. | M | 8/14/2017 | 150 |
Actual Results:
Prod_ID (Primary Key) | Product_name | Prod_description | Prod_category | Prod_expiry_date | Prod_price |
---|---|---|---|---|---|
1001 | ABC | This is product ABC. | M | 8/14/2017 | 150 |
Remarks: Pass
Tester Comments:
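The field-level comparison in T007 generalizes to any record: compare each column of a source record against the corresponding target record and report mismatches. This is a generic sketch; the column names passed in would come from your own mapping specification.

```python
def compare_record(source_row, target_row, columns):
    """Field-level check (as in T007): report any column whose target
    value differs from the source value, as (source, target) pairs.
    An empty result means no data loss for this record."""
    return {
        col: (source_row[col], target_row[col])
        for col in columns
        if source_row[col] != target_row[col]
    }
```

Running this over every loaded record (or a representative sample) turns the one-row spot check of T007 into systematic coverage.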
Benefits of Using Informatica as an ETL tool:
Informatica is a widely recognized and successful ETL tool due to the following advantages:
- High success rate for achieving desired results
- Capability to enable Lean Integration
- Competitive pricing compared to other ETL tools
- Integrated job scheduler, eliminating the need for a separate third-party scheduler
- Easy training and availability, contributing to its popularity
Recommended reading =>> Top ETL Test Automation Tools
Some useful Tips to assist you in Informatica ETL testing:
- Generate test data before executing test scenarios.
- Ensure the test data aligns with the corresponding test case.
- Cover all scenarios: no data submitted, invalid data submitted, and valid data submitted as inputs to the Informatica workflow.
- Verify that all required data is completely loaded into the target. You can use test case T003 described above as a sample.
- Thoroughly test the accuracy of data transformations implemented in Informatica, adhering to business rules.
- Create a checklist for each transformation in the Informatica mapping to verify its output data. This makes it easier to pinpoint bugs when a transformation misbehaves.
Conclusion:
In conclusion, we have explored some sample test cases that can be used as templates for ETL testing in Informatica. As mentioned earlier, these test cases can be adapted, added, or modified based on the specific requirements of your project.
Informatica PowerCenter is a solid foundation for data integration activities across various environments. It enables script-free automated testing of data migration to test, development, or production environments, making it one of the most popular ETL tools today.
Recommended reading => ETL vs. DB Testing – A Closer Look at ETL Testing Need
About the author: This guest article was written by Priya K., a professional with over 4 years of hands-on experience in developing and supporting Informatica ETL applications.
Feel free to post any queries or comments about this ETL tool.