Introduction to Informatica Scenario Based Interview Questions And Answer
Informatica is an essential tool used in data warehousing, which helps manage large amounts of data and report data analysis. Below are some questions which will be helpful when you attend an interview on Informatica.
So you have finally found your dream job in Informatica but are wondering how to crack the Informatica interview and what could be the probable Informatica Scenario Based Interview Questions. Every interview is different, and the scope of a job is different too. Keeping this in mind, we have designed the most common Informatica Scenario based on Interview Questions and Answers to help you get success in your interview.
Some of the most important Informatica Scenario Based Interview Questions that are frequently asked in an interview are as follows:
1. How to remove duplicate records in Informatica? Explain the different ways to do it?
There are many ways of eliminating duplicates:
1. If there are duplicates in the source database, a user can use the property in source qualifier. A user must go to the Transformation tab and checkmark the ‘Select Distinct’ option. Also, a user can use SQL override for the same purpose. The user can go to the Properties tab and in SQL query tab write the distinct query.
2. A user can use Aggregator and select ports as key to getting distinct values. If a user wishes to find duplicates in the entire column, then all ports should be selected as a group by key.
3. The user can also use Sorter with Sort distinct property to get distinct values.
4. Expression and filter transformation can also be used to identify and remove duplicate data. If data is not sorted, then it needs to be sorted first.
5. When a property in Lookup transformation is changed to use Dynamic cache, a new port is added to the transformation. This cache is updated as and when data is read. If a source has duplicate records, the user can look in Dynamic lookup cache and then the router selects only one distinct record.
2. What is the difference between Source qualifier and filter transformation?
Source qualifier transformation is used to represent rows that Integration service reads in a session. It is an active transformation. Using source qualifier, the following tasks can be fulfilled:
1. When two tables from the same source database with primary key – foreign key transformation relationship is there, then the sources can be linked to one source qualifier transformation.
2. Filtering rows when Integration service adds a where clause to the user’s default query.
3. When a user wants an outer join instead of an inner join, then join information is replaced by metadata specified in SQL query.
4. When sorted ports are specified, the integration service uses the order by clause to the default query.
5. If a user chooses to find a distinct value, then integration service uses select distinct to the specified query.
When the data we need to filter is not a relational source, the user should use Filter transformation. It helps the user to meet the specified filter condition to let go or pass through. It will directly drop the rows that do not meet the condition, and multiple conditions can be specified.
3. Design a mapping to load the last 3 rows from a flat-file into the target?
Suppose the flat file in consideration has below data:
Step1: Assign row numbers to each record. Generate row numbers using expression transformation by creating a variable port and incrementing it by 1. After this assign this variable port to output port. After expression transformation, the ports will be as –
Create a dummy output port for the same expression transformation and assign 1 to that port. This dummy port will always return 1 for each row.
Finally, the transformation expression will be as follows:
The output of this transformation will be :
Column A O_count Dummy_output
Aanchal 1 1
Priya 2 1
Karishma 3 1
Snehal 4 1
Nupura 5 1
Step 2: Pass the above output to an aggregator and do not specify any group by the condition. A new output port should be created as O_total_records in the aggregator and assign O_count port to it. The aggregator will return the last row. This step’s final output will have a dummy port with value as 1 and O_total_records will have a total number of records in the source. The aggregator output will be: O_total_records, Dummy_output
Step 3: Pass this output to joiner transformation and apply a join on dummy port. The property sorted input should be checked in joiner transformation. Only then the user can connect both expression and aggregator transformation to joiner transformation. Joiner transformation condition will be as follows:
Dummy_output (port from aggregator transformation) = Dummy_output (port from expression transformation)
The output of joiner transformation will be
Column A o_count o_total_records
Aanchal 1 5
Priya 2 5
Karishma 3 5
Snehal 4 5
Nupura 5 5
Step 4: After the joiner transformation we can send this output to filter transformation and specify filter condition as O_total_records (port from aggregator)-O_count(port from expression) <=2
The filter condition, as a result, will be
O_total_records – O_count <=2
The final output of filter transformation will be :
Column A o_count o_total_records
Karishma 3 5
Snehal 4 5
Nupura 5 5
4. How to load only NULL records into the target? Explain using mapping flow?
Consider the below data as a source
Emp_Id Emp_Name Salary City Pincode
619101 Aanchal Singh 20000 Pune 411051
619102 Nupura Pattihal 35000 Nagpur 411014
NULL NULL 15000 Mumbai 451021
The target table also has a table structure as a source. We will have two tables containing NULL values and others that would not contain NULL values.
The mapping can be as:
SQ –> EXP –> RTR –> TGT_NULL/TGT_NOT_NULL
EXP – Expression transformation create an output port
O_FLAG= IIF ( (ISNULL(emp_id) OR ISNULL(emp_name) OR ISNULL(salary) OR ISNULL(City) OR ISNULL(Pincode)), ‘NULL’,’NNULL’)
RTR – Router transformation two groups
Group 1 connected to TGT_NULL ( Expression O_FLAG=’NULL’)
Group 2 connected to TGT_NOT_NULL ( Expression O_FLAG=’NNULL’)
5. Explain how the performance of joiner condition can be increased?
The performance of the joiner condition can be increased by following some simple steps.
1. The user must perform joins whenever possible. When for some tables, this is not possible, then a user can create a stored procedure and then join the tables in the database.
2. Data should be sorted before applying to join whenever possible.
3. When data is unsorted, then a source with fewer rows should be considered a master source.
4. For sorted joiner transformation, a source with less duplicate key values should be considered a master source.
This has been a guide to List of Informatica Scenario Based Interview Questions and answers so that the candidate can crackdown these Interview Questions easily. You may also look at the following articles to learn more –