ETL stands for Extracting, Transforming, and Loading. It is a tool that has been used by some of the major companies in the world to extract, transfer and load data which makes it easy to transfer data from any outside system to a place where it is required. Naturally, it makes it very desirable for business as most of the business happens virtually, or at least the data is stored online.
It has gone very popular these days and that is why people are finding career opportunities in this field as well. There are many candidates, fresher, or professionals that are seeking some opportunities for ETL testing jobs. If you are one of them then this article is for you.
You do know that getting a job is not as easy as it may sound. You have to have the adequate qualification and experiment among other things and you have to get through the interview. The interview would be the only thing that you have to worry about. No matter how good you are, you might be a little bit nervous about the interview, which is completely fine. You should know that you have to prepare fully for the upcoming interview.
The interview is a crucial step and you can’t take it for granted. So you have to put all your efforts to clear the interview before it’s too late. Well, one thing that can help you out with the interview is the most commonly asked ETL testing interview questions. If you know what kind of questions they ask generally then you can easily prepare for the upcoming interview and you would be able to pass through the interview and get your dream job.
Here we have a list of the best ETL Testing interview questions that you would want to know about. So go ahead and take a look at the following ETL testing interview questions and see how much you already know and how much more you have to work on your interview preparations.
ETL Testing Interview Questions
Q1.What is ETL?Ans-In data warehousing architecture, ETL is an important component, which manages the data for any business process. ETL stands for Extract, Transform and Load. Extract does the process of reading data from a database. Transform does the converting of data into a format that could be appropriate for reporting and analysis. While, load does the process of writing the data into the target database.
Q2.Explain what are the ETL testing operations includes?Ans-ETL testing includes–Verify whether the data is transforming correctly according to business requirementsVerify that the projected data is loaded into the data warehouse without any truncation and data lossMake sure that ETL application reports invalid data and replaces with default valuesMake sure that data loads at expected time frame to improve scalability and performance
Q3.Mention what are the types of data warehouse applications and what is the difference between data mining and data warehousing?Ans-The types of data warehouse applications are–Info ProcessingAnalytical ProcessingData MiningData mining can be define as the process of extracting hidden predictive information from large databases and interpret the data while data warehousing may make use of a data mine for analytical processing of the data in a faster way. Data warehousing is the process of aggregating data from multiple sources into one common repository
Q4.What are the various tools used in ETL?Ans-Cognos Decision StreamOracle Warehouse BuilderBusiness Objects XISAS business warehouseSAS Enterprise ETL server
Q5.What is fact? What are the types of facts?Ans-It is a central component of a multi-dimensional model which contains the measures to be analysed. Facts are related to dimensions.–Types of facts are–Additive FactsSemi-additive FactsNon-additive Facts
Q6.Explain what factless fact schema is and what is Measures?Ans-A fact table without measures is known as Factless fact table. It can view the number of occurring events. For example, it is used to record an event such as employee count in a company.–The numeric data based on columns in a fact table is known as Measures
Q7.Explain what is transformation?Ans-A transformation is a repository object which generates, modifies or passes data. Transformation are of two types Active and Passive
Q8.Explain the use of Lookup Transformation?Ans-The Lookup Transformation is useful for–Getting a related value from a table using a column valueUpdate slowly changing dimension tableVerify whether records already exist in the table
Q9.Explain what is partitioning, hash partitioning and round robin partitioning?Ans-To improve performance, transactions are sub divided, this is called as Partitioning. Partioning enables Informatica Server for creating of multiple connection to various sources–The types of partitions are–Round-Robin Partitioning:–By informatica data is distributed evenly among all partitionsIn each partition where the number of rows to process are approximately same this partioning is applicableHash Partitioning:–For the purpose of partitioning keys to group data among partitions Informatica server applies a hash functionIt is used when ensuring the processes groups of rows with the same partitioning key in the same partition need to be ensured
Q10.Mention what is the advantage of using DataReader Destination Adapter?Ans-The advantage of using the DataReader Destination Adapter is that it populates an ADO recordset (consist of records and columns) in memory and exposes the data from the DataFlow task by implementing the DataReader interface, so that other application can consume the data.
ETL Testing Interview Questions Capgemini
Q11.Using SSIS ( SQL Server Integration Service) what are the possible ways to update table?Ans-To update table using SSIS the possible ways are:–Use a SQL commandUse a staging tableUse CacheUse the Script TaskUse full database name for updating if MSSQL is used
Q12.In case you have non-OLEDB (Object Linking and Embedding Database) source for the lookup what would you do?Ans-In case if you have non-OLEBD source for the lookup then you have to use Cache to load data and use it as source
Q13.In what case do you use dynamic cache and static cache in connected and unconnected transformations?Ans-Dynamic cache is used when you have to update master table and slowly changing dimensions (SCD) type 1For flat files Static cache is used
Q14.Explain what are Cubes and OLAP Cubes?Ans-Cubes are data processing units comprised of fact tables and dimensions from the data warehouse. It provides multi-dimensional analysis.–OLAP stands for Online Analytics Processing, and OLAP cube stores large data in muti-dimensional form for reporting purposes. It consists of facts called as measures categorized by dimensions.
Q15.Explain what is tracing level and what are the types?Ans-Tracing level is the amount of data stored in the log files. Tracing level can be classified in two Normal and Verbose. Normal level explains the tracing level in a detailed manner while verbose explains the tracing levels at each and every row.
Q16.Explain what is Grain of Fact?Ans-Grain fact can be defined as the level at which the fact information is stored. It is also known as Fact Granularity
Q17.Explain what is data source view?Ans-A data source view allows to define the relational schema which will be used in the analysis services databases. Rather than directly from data source objects, dimensions and cubes are created from data source views.
Q18.Explain what is the difference between OLAP tools and ETL tools ?Ans-The difference between ETL and OLAP tool is that–ETL tool is meant for the extraction of data from the legacy systems and load into specified data base with some process of cleansing data.–Example: Data stage, Informatica etc.–While OLAP is meant for reporting purpose in OLAP data available in multi-directional model.–Example: Business Objects, Cognos etc.
Q19.What is Bus Schema?Ans-For the various business process to identify the common dimensions, BUS schema is used. It comes with a conformed dimensions along with a standardized definition of information
Q20.Explain what is data purging?Ans-Data purging is a process of deleting data from data warehouse. It deletes junk data’s like rows with null values or extra spaces.
ETL Testing Interview Questions Infosys
Q21.Explain what are Schema Objects?Ans-Schema objects are the logical structure that directly refer to the databases data. Schema objects includes tables, views, sequence synonyms, indexes, clusters, functions packages and database links
Q22.Explain these terms Session, Worklet, Mapplet and Workflow ?Ans-Mapplet : It arranges or creates sets of transformationWorklet: It represents a specific set of tasks givenWorkflow: It’s a set of instructions that tell the server how to execute tasksSession: It is a set of parameters that tells the server how to move data from sources to target
Q23.What Are The Various Test Procedures Used To Check Whether The Data Is Loaded In The Backend, Performance Of The Mapping, And Quality Of The Data Loaded In Informatica?Ans-The best procedure to take a help of debugger where we monitor each and every process of mappings and how data is loading based on conditions breaks.
Q24.What Is The Difference Between Joiner And Lookup?Ans-joiner is used to join two or more tables to retrieve data from tables(just like joins in sql).Look up is used to check and compare source table and target table .(just like correlated sub-query in sql).
Q25.If A Flat File Contains 1000 Records How Can I Get First And Last Records Only?Ans-By using Aggregator transformation with first and last functions we can get first and last record.
Q26.How Do You Calculate Fact Table Granularity?Ans-Granularity, is the level of detail in which the fact table is describing, for example if we are making time analysis so the granularity maybe day based – month based or year based.
Q27.How you can extract SAP data using Informatica?Ans-With the power connect option you extract SAP data using informaticaInstall and configure the PowerConnect toolImport the source into the Source Analyzer. Between Informatica and SAP Powerconnect act as a gateaway. The next step is to generate the ABAP code for the mapping then only informatica can pull data from SAPTo connect and import sources from external systems Power Connect is used
Conclusion:So these are some of the best ETL testing interview questions that you can find. These questions have been picked by some experts and professionals who want to help the aspiring candidates. Try to answer these questions first on your own and then if you fail you can look up to their answers. You can also write the answers or record them to see how impressive you sound and know how to present an answer in the interview.That would be all that we have to offer you and we hope you found our article helpful. Go ahead and practice these questions and best of luck with your interview.