Informatica Interview FAQs - 1
By PenchalaRaju.Yanamala
Q. What is the difference between a mapping parameter and a mapping variable?
ANS: A mapping parameter represents a constant value that you can define before
running a session. A mapping parameter retains the same value throughout the
entire session. When you use a mapping parameter, you declare and use the
parameter in a mapping, then define the value of the parameter in a parameter file
for the session. Unlike a mapping parameter, a mapping variable represents a
value that can change throughout the session. The Informatica Server saves the
value of a mapping variable to the repository at the end of the session run and uses
that value the next time you run the session.
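As a hedged illustration of the typical use, a mapping variable can drive incremental extraction. SETMAXVARIABLE is a real Informatica expression function, but the variable, table, and column names below ($$LastRunTime, UPDATE_TS) are made up for this sketch:

    -- Source Qualifier filter (source filter or SQL override WHERE clause):
    UPDATE_TS > TO_DATE('$$LastRunTime', 'MM/DD/YYYY HH24:MI:SS')

    -- In an Expression transformation, advance the variable as rows pass:
    SETMAXVARIABLE($$LastRunTime, UPDATE_TS)

At the end of a successful run the server saves the highest value seen to the repository, so the next run extracts only the newer rows.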
Q. What are session parameters?
Ans: Session parameters, like mapping parameters, represent values you might
want to change between sessions, such as database connections or source files.
The Server Manager also allows you to create user-defined session parameters.
The following are user-defined session parameters:
Database connections: You can give database connections here.
Source file name: Use this parameter when you want to change the name or
location of a session source file between session runs.
Target file name: Use this parameter when you want to change the name or
location of a session target file between session runs.
Reject file name: Use this parameter when you want to change the name or
location of a session reject file between session runs.
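A hedged sketch of a parameter file supplying such user-defined session parameters. The built-in name prefixes ($DBConnection, $InputFile, $OutputFile, $BadFile) are real conventions, but the folder, session, and path names are made up, and the header format varies by version:

    [MyFolder.s_m_load_customers]
    $DBConnectionSource=ORA_SRC_DEV
    $InputFileCust=/data/src/customers.dat
    $OutputFileCust=/data/tgt/customers.out
    $BadFileCust=/data/bad/customers.bad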
Q. When do we use the Debugger?
Ans: If a session fails or the expected data does not arrive in the target table, we use
the Debugger. By using the Debugger we can find where exactly the fault lies.
Example: if we declare a mapping parameter, we can use that parameter
until the session completes; but if we declare a mapping variable, its value
can change between sessions. Use a mapping variable in a Transaction
Control transformation.
If mapping parameters and variables are not defined, we cannot pass values across
different mappings. We define mapping parameters and variables in the Mapping
Designer.
Thanks Deepak. But does it take null values or default values to run the
mapping?
Mapping parameter: defines a constant value; it cannot change during the
session.
Mapping variable: defines a value that can change throughout the session.
A mapping parameter normally gets its value through the parameter file (.par
file), so a different value can be passed each time the session runs; within a
run the value stays fixed. The parameter is referenced in the mapping by the
relevant name with the $$ prefix.
A mapping variable, on the other hand, can change during the run; the server
saves its final value to the repository for the next run.
A mapping parameter is a static value that you define before running the session,
and it retains the same value till the end of the session. Define the parameter in a
parameter file (.par) and use it in a mapping or mapplet; when the PowerCenter
Server runs the session, it evaluates the value from the parameter file and retains
the same value throughout the session. When the session runs again, it reads the
file again for the value.
2. Current mapping.
Ans: In a complex mapping, you can remove some transformations and present
what remains as your current mapping.
3. Master and detail tables: which do we load first?
Ans: First we load data into the master table, because we load data into the detail
table with reference to the master table; the primary key/foreign key relationship
requires this order.
4. Session partitions.
5. Daily activities.
Ans: After I get to the office, I connect to the client machine and develop
mappings according to the business logic by referring to the business
documents. If I have any doubt regarding the logic, I ask my team lead or the
business analyst. At the end of the day I report to my team lead on the work I
completed that day.
Ans: source -> source qualifier -> expression transformation -> filter
transformation -> target. In the expression transformation we create two variable
ports and an output port. Because variable ports evaluate top to bottom,
Old_Value still holds the previous row's value when the flag is computed:
Old_Value (variable) = New_Value
New_Value (variable) = EMPNO
Flag (output) = IIF(Old_Value = New_Value, 1, 0)
Now in the filter transformation we give the condition Flag = 1.
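A worked trace of this pattern over three sample rows (the EMPNO values are made up; the input is assumed to be sorted on EMPNO, otherwise consecutive comparison misses duplicates):

    Row  EMPNO  Old_Value  New_Value  Flag
    1    7369   NULL       7369       0
    2    7369   7369       7369       1   <- duplicate, passes the Flag=1 filter
    3    7499   7369       7499       0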
Ans: source -> source qualifier -> expression -> filter -> target. In the expression
we create a variable port and an output port that together number the rows:
Exp_Var (variable) = Exp_Var + 1
Exp_Out (output) = Exp_Var
In the filter we give a condition selecting the records we want. For example, if we
want the 2nd, 7th, and 13th records, we give the condition Exp_Out = 2 OR
Exp_Out = 7 OR Exp_Out = 13.
With PowerCenter, you receive all product functionality, including the ability to
register multiple servers, share metadata across repositories, and partition data.
A PowerCenter license lets you create a single repository that you can configure
as a global repository, the core component of a data warehouse.
PowerMart includes all features except distributed metadata, multiple registered
servers, and data partitioning. Also, the various options available with
PowerCenter (such as PowerCenter Integration Server for BW, PowerConnect
for IBM DB2, PowerConnect for IBM MQSeries, PowerConnect for SAP R/3,
PowerConnect for Siebel, and PowerConnect for PeopleSoft) are not available
with PowerMart.
Q. What is a repository?
The repository is the database of metadata that PowerMart and PowerCenter
create and use. We create and maintain the repository with the Repository
Manager client tool. With the Repository Manager, we can also create folders to
organize metadata and groups to organize users.
Q. What are the different kinds of repository objects, and what do they contain?
Q. What is metadata?
Designing a data mart involves writing and storing a complex set of instructions.
You need to know where to get data (sources), how to change it, and where to
write the information (targets). PowerMart and PowerCenter call this set of
instructions metadata. Each piece of metadata (for example, the description of a
source table in an operational database) can contain comments about it.
Folders let you organize your work in the repository, providing a way to separate
different types of metadata or different projects into easily identifiable areas.
A shared folder is one whose contents are available to all other folders in the
same repository. If you plan on using the same piece of metadata in several
projects (for example, a description of the CUSTOMERS table that provides data
for a variety of purposes), you might put that metadata in the shared folder.
A mapping specifies how to move and transform data from sources to targets.
Mappings include source and target definitions and transformations.
Transformations describe how the Informatica Server transforms data. Mappings
can also include shortcuts, reusable transformations, and mapplets. Use the
Mapping Designer tool in the Designer to create mappings.
Sessions and batches store information about how and when the Informatica
Server moves data through mappings. You create a session for each mapping
you want to run. You can group several sessions together in a batch. Use the
Server Manager to create sessions and batches.
Target definitions. Detailed descriptions for database objects, flat files, Cobol
files, or XML files to receive transformed data. During a session, the Informatica
Server writes the resulting data to session targets. Use the Warehouse Designer
tool in the Designer to import or create target definitions.
The need to share data is just as pressing as the need to share metadata. Often,
several data marts in the same organization need the same information. For
example, several data marts may need to read the same product data from
operational sources, perform the same profitability calculations, and format this
information to make it easy to review.
If each data mart reads, transforms, and writes this product data separately, the
throughput for the entire organization is lower than it could be. A more efficient
approach would be to read, transform, and write the data to one central data
store shared by all data marts. Transformation is a processing-intensive task, so
performing the profitability calculations once saves time.
Therefore, this kind of dynamic data store (DDS) improves throughput at the level
of the entire organization, including all data marts. To improve performance
further, you might want to capture incremental changes to sources. For example,
rather than reading all the product data each time you update the DDS, you can
improve performance by capturing only the inserts, deletes, and updates that
have occurred in the PRODUCTS table since the last time you updated the DDS.
The DDS has one additional advantage beyond performance: when you move
data into the DDS, you can format it in a standard fashion. For example, you can
prune sensitive employee data that should not be stored in any data mart. Or you
can display date and time values in a standard format. You can perform these
and other data cleansing tasks when you move data into the DDS instead of
performing them repeatedly in separate data marts.
Each local repository in the domain can connect to the global repository and use
objects in its shared folders. A folder in a local repository can be copied to other
local repositories while keeping all local and global shortcuts intact.
Write lock. Created when you create or edit a repository object in a folder for
which you have write permission.
Execute lock. Created when you start a session or batch, or when the
Informatica Server starts a scheduled session or batch.
Fetch lock. Created when the repository reads information about repository
objects from the database.
Q. After creating users and user groups, and granting different sets of
privileges, I find that none of the repository users can perform certain
tasks, even the Administrator.
Q. I do not want a user group to create or edit sessions and batches, but I
need them to access the Server Manager to stop the Informatica Server.
To permit a user to access the Server Manager to stop the Informatica Server,
you must grant them both the Create Sessions and Batches, and Administer
Server privileges. To restrict the user from creating or editing sessions and
batches, you must restrict the user's write permissions on a folder level.
Alternatively, the user can use pmcmd to stop the Informatica Server with the
Administer Server privilege alone.
Q. How does read permission affect the use of the command line program,
pmcmd?
To use pmcmd, you do not need to view a folder before starting a session or
batch within the folder. Therefore, you do not need read permission to start
sessions or batches with pmcmd. You must, however, know the exact name of
the session or batch and the folder in which it exists.
With pmcmd, you can start any session or batch in the repository if you have the
Session Operator privilege or execute permission on the folder.
Q. My privileges indicate I should be able to edit objects in the repository,
but I cannot edit any metadata.
You may be working in a folder with restrictive permissions. Check the folder
permissions to see if you belong to a group whose privileges are restricted by the
folder owner.
When you use event-based scheduling, the Informatica Server starts a session
when it locates the specified indicator file. To use event-based scheduling, you
need a shell command, script, or batch file to create an indicator file when all
sources are available. The file must be created or sent to a directory local to the
Informatica Server. The file can be of any format recognized by the Informatica
Server operating system. The Informatica Server deletes the indicator file once
the session starts.
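A hedged sketch of such a script; the directory paths, file names, and the availability check are all assumptions:

    #!/bin/sh
    # Check that the nightly source extract has landed, then create
    # the indicator file the Informatica Server is configured to watch for.
    if [ -f /data/src/orders_`date +%Y%m%d`.dat ]; then
        touch /u01/infa/indicator/orders.ind
    fi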
Use the following syntax to ping the Informatica Server on a UNIX system:
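(The syntax was not carried over into this document; as a hedged sketch, the exact command name and flags vary by version, but in the old command-line mode it was roughly:)

    pmcmd pingserver [hostname:]portno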
Use the following syntax to stop the Informatica Server on a UNIX system:
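(Similarly hedged; stopping the server also required the repository username and password:)

    pmcmd stopserver username password [hostname:]portno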
Q. What is a transformation?
A transformation is a repository object that generates, modifies, or passes data.
Each transformation has rules for configuring and connecting in a mapping. For
more information about working with a specific transformation, refer to the
chapter in this book that discusses that particular transformation.
You can create transformations to use once in a mapping, or you can create
reusable transformations to use in multiple mappings.
The Informatica Server queries the lookup table based on the lookup ports in the
transformation. It compares Lookup transformation port values to lookup table
column values based on the lookup condition. Use the result of the lookup to
pass to other transformations and the target.
Update strategy flags a record for update, insert, delete, or reject. We use this
transformation when we want to exert fine control over updates to a target, based
on some condition we apply. For example, we might use the Update Strategy
transformation to flag all customer records for update when the mailing address
has changed, or flag all employee records for reject for people no longer working
for the company.
Within a session. When you configure a session, you can instruct the
Informatica Server to either treat all records in the same way (for example, treat
all records as inserts), or use instructions coded into the session mapping to flag
records for different database operations.
The lookup table can be a single table, or we can join multiple tables in the same
database using a lookup query override. The Informatica Server queries the
lookup table or an in-memory cache of the table for all incoming rows into the
Lookup transformation.
If your mapping includes heterogeneous joins, we can use any of the mapping
sources or mapping targets as the lookup table.
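A hedged sketch of such a lookup SQL override joining two tables in the lookup database; all table and column names are made up, and the column aliases must match the lookup ports:

    SELECT C.CUSTOMER_ID   AS CUSTOMER_ID,
           C.CUSTOMER_NAME AS CUSTOMER_NAME,
           R.REGION_NAME   AS REGION_NAME
    FROM   CUSTOMERS C, REGIONS R
    WHERE  C.REGION_ID = R.REGION_ID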
Q. What is a Lookup transformation and what are its uses?
Get a related value. For example, if our source table includes employee ID, but
we want to include the employee name in our target table to make our summary
data easier to read.
Perform a calculation. Many normalized tables include values used in a
calculation, such as gross sales per invoice or sales tax, but not the calculated
value (such as net sales).
Update slowly changing dimension tables. We can use a Lookup transformation
to determine whether records already exist in the target.
For example, if we have a large input file that we separate into three sessions
running in parallel, we can use a Sequence Generator to generate primary key
values. If we use different Sequence Generators, the Informatica Server might
accidentally generate duplicate key values. Instead, we can use the same
reusable Sequence Generator for all three sessions to provide a unique value for
each target row.
(1). Joiner Transformation
* In order to join two sources, there must be at least one matching port.
1. Normal join: discards all rows from both the master and detail sources that do
not match the join condition.
2. Master outer join: keeps all rows from the detail source and only the matching
rows from the master source; unmatched master rows are discarded.
(2). Normal and master outer joins are faster than detail outer and full outer
joins.
Q) We have 10 sources. Using the Joiner T/R, how many joins should you use
to join them?
ANS: n - 1 joiners for n sources, i.e., 9 Joiner transformations for 10 sources.
Q) We have an Oracle source; from this source we drag two tables, EMP and
DEPT, and there is no common column for these two tables. How could you
join them?
ANS: No, we cannot join without having at least one common port.
ANS: Normal join, because a normal join returns only the records that match the
condition. A detail outer join returns the records that match the condition plus the
unmatched master records, so performance decreases; similarly, a master outer
join returns the records that match the condition plus the unmatched detail
records, so performance decreases.
ANS: Yes.
Q) What is the use of the sorted input option? In which T/R can you use it?
ANS: Sorted input tells the transformation that incoming rows are already sorted
on the key ports, so the server does not need to cache all input rows before
processing, which improves performance. It is available in the Aggregator and
Joiner transformations.
(Source Qualifier) We can use it to join data from relational or flat file sources in
a mapping, and it represents the records the PowerCenter Server reads when it
runs a session. It performs various tasks, such as joining data from the same
source database, filtering rows, and selecting distinct values.
4. Expression Transformation:
A: It is a passive transformation, because it does not change the number of rows
as it passes them from source to target.
(Stored Procedure Transformation)
* Run a stored procedure every time a row passes through the mapping, using a
connected or unconnected Stored Procedure transformation.
* Run a stored procedure once during the mapping, such as pre- or post-session,
using an unconnected Stored Procedure transformation.
With Type 2 SCD, you always create another version of the dimension record and
mark the existing version as history. To accommodate this, you need to create
extra metadata for your dimension table, including an effective date column and
an expiration date column. These columns are used to differentiate a current
version from a historical version as follows:
Effective date column stores the effective date of the version; also known as start
date
Expiration date column stores the expiration date of the version; also known as
end date
Expiration date value of the current version is always set to NULL or a default
date value
You also need to decide which columns you want to store historic data for when
the values are to be changed. These columns are defined as trigger columns and
should be described as part of your metadata.
Once you define your dimension GEO_DIM_TYPE2, you can use it in your
mapping to load data into it. GEO_SRC is the sample source table here from
which data are to be loaded into GEO_DIM_TYPE2.
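A hedged sketch of what GEO_DIM_TYPE2 might look like; the document does not give the actual layout, so every column here is an assumption based on the description above:

    CREATE TABLE GEO_DIM_TYPE2 (
        GEO_KEY   NUMBER PRIMARY KEY, -- surrogate key, one per version
        GEO_ID    NUMBER,             -- natural (business) key
        CITY      VARCHAR2(50),       -- trigger column: a change creates a new version
        STATE     VARCHAR2(50),       -- trigger column
        EFF_DATE  DATE,               -- effective (start) date of the version
        EXP_DATE  DATE                -- expiration (end) date; NULL = current version
    );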
To load Type 2 slowly changing dimension, you need to transform data extracted
from the source properly before you load them into the target. You achieve this
by creating a mapping, such as the one displayed in Figure A-6. In this mapping,
data is first extracted from GEO_SRC, transformed by a series of operators, and
finally loaded into GEO_DIM_TYPE2.
Or, Type 2 can also be tracked with a flag or a date:
Flag: the old record is marked false and the new record is marked true.
Date: each change is recorded along with the date on which it was made.
You can use a Command task to call the shell scripts, in the following ways:
1. Standalone Command task. You can use a Command task anywhere in the
workflow or worklet to run shell commands.
2. Pre- and post-session shell command. You can call a Command task as the
pre- or post-session shell command for a Session task; for more information
about specifying pre-session and post-session shell commands, see the product
documentation.
(2) In what scenario is ETL coding preferred over database-level SQL/PL/SQL
coding?
We should go for an ETL tool when we have to extract data from multiple
source systems (flat files, Oracle, COBOL, etc.) in one pass, where plain SQL
or PL/SQL cannot fit.
(5) If a lookup on the target table is used, can we update the rows without an
Update Strategy transformation?
We can update the target without using an Update Strategy transformation by
setting the session property "Treat source rows as" to Update, provided the
source is a database.
(6) How do you capture changes in data if the source system does not have the
option of storing a date/time field?
(9) How do you tell the Aggregator stage that the input data is already sorted?
By enabling the Sorted Input option in the Aggregator transformation's
properties.
Can anyone please explain why and where exactly we use Lookup
transformations?
A three-tier data warehouse contains three tiers: bottom tier, middle tier, and
top tier.
The bottom tier deals with retrieving related data or information from various
information repositories using SQL.
The middle tier contains two types of servers:
1. ROLAP server
2. MOLAP server
The top tier deals with the presentation or visualization of the results.
We can use control-table updates and ipf files for capturing incremental data
(delta data) from a source. The control table maintains details such as from which
timestamp (previous) to which timestamp (current) we have taken the data.
If the session takes data every day (a daily run), then the delta will be one day:
the previous timestamp will be yesterday's run time (P1) and the current
timestamp will be the time of today's run (C1), so in today's run we get one day
of data.
The next day, when the job runs, C1 becomes P2 and today's run time becomes
C2, so we will not miss any records inserted in the source systems. This
continues run after run.
In the case of weekly runs, the delta will be one week.
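A hedged SQL sketch of this control-table approach; the table and column names (CTL_LOAD, ORDERS, UPDATE_TS) are assumptions:

    -- Control table CTL_LOAD(JOB_NAME, PREV_TS, CURR_TS) remembers the window.
    -- 1. At the start of the run, slide the window forward:
    UPDATE CTL_LOAD
    SET    PREV_TS = CURR_TS,
           CURR_TS = SYSDATE
    WHERE  JOB_NAME = 'ORDERS_DELTA';

    -- 2. Extract only the delta that falls inside the window:
    SELECT *
    FROM   ORDERS O
    WHERE  O.UPDATE_TS >  (SELECT PREV_TS FROM CTL_LOAD WHERE JOB_NAME = 'ORDERS_DELTA')
    AND    O.UPDATE_TS <= (SELECT CURR_TS FROM CTL_LOAD WHERE JOB_NAME = 'ORDERS_DELTA');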
At what levels can we use them (Update Strategy)?
1. Mapping level.
2. Session level.
In real time, if you want to update the existing record with the same source
data, you can go for session-level update logic.
If you want to apply a different set of rules for updating or inserting a record,
even when that record already exists in the warehouse table, you can go for a
mapping-level Update Strategy transformation.
That is, you might use a Router transformation to perform the different
activities.
EX: if employee 'X1234' is getting a bonus, update the allowance with 10%
less; if not, insert the record with the new bonus into the warehouse table.
(17). Suppose we have some 10,000-odd records in the source system; when
we load them into the target, how do we ensure that none of the 10,000
records loaded contain garbage values?
How do we test it, given that we can't check every record because the
number of records is huge?
You should have proper test conditions in your ETL jobs to validate all the
important columns before they are loaded into the target. Always have proper
reject handling to capture records containing garbage values.
or
Go into the Workflow Monitor after the session shows the status Succeeded,
right-click, go into the properties, and you can see the number of source rows,
successful target rows, and rejected rows.
What relationships are used in data modeling?
1. One-to-One
2. One-to-Many
3. Many-to-One
4. Many-to-Many
The fact table gets data from the dimension tables because it contains the
primary keys of the dimension tables as foreign keys, used to get summarized
data for each record.
(20) What are the various test procedures used to check whether the data is loaded
in the backend, the performance of the mapping, and the quality of the data
loaded in INFORMATICA?
The best procedure is to take the help of the Debugger, where we can monitor
each and every step of the mapping and see how data is loaded, using condition
breakpoints.
Or
You can check the session log files for the total number of records added, the
number of records updated, the number of rejected records, and the errors
related to them; this is the answer the interviewer expects.
Vendor-defined. Third-party application vendors create vendor-defined
metadata extensions. You can view and change the values of vendor-defined
metadata extensions, but you cannot create, delete, or redefine them.
User-defined. You create user-defined metadata extensions yourself. You can
create, edit, delete, and view user-defined metadata extensions, and you can
also change the values of user-defined extensions.
I think we cannot look up through a source qualifier, as we use the source to
look up the tables, so it is not possible, I think. Or:
You cannot lookup from a source qualifier directly. However, you can override
the SQL in the source qualifier to join with the lookup table to perform the lookup.
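A hedged sketch of such a Source Qualifier SQL override, joining the source to the lookup table in the same database (the EMP/DEPT names echo the earlier example; the join itself is an assumption):

    SELECT E.EMPNO, E.ENAME, E.SAL, D.DNAME
    FROM   EMP E, DEPT D
    WHERE  E.DEPTNO = D.DEPTNO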
Parameter files are used to supply values to parameters for the execution of a
task; the file can be modified/updated to change a parameter later. Say, for
example, xyz="RAJAT" is defined in the parameter file: now wherever XYZ is
used in the mapping, the value is automatically taken as RAJAT. If we want to
change that, we can change it to any other varchar or int value in the file.
This file is referenced in the session properties, where the physical path on the
server is given so it can be read.
1. PowerMart Designer
2. Server
3. Server Manager
4. Repository
5. Repository Manager
(A) For a cached lookup, the entire lookup table is read into a buffer (the cache),
and the incoming rows are compared against these cached rows.
Whereas for an uncached lookup, the lookup queries the lookup table and
fetches rows for every input row.
Parameter file:
how do you create a parameter file and how do you use it in a mapping?
Explain with an example.
Place your parameter file in the server's "srcfiles" directory with data in it. In
the Mapping Designer window of PowerCenter Designer, click on "Mappings"
and then "Parameters and Variables", and add all the parameters there one by
one.
Now the variables, prefixed with "$$", will be available in your mapping; each
variable in turn picks its value from the parameter file. Do not forget to give the
"parameter filename" in the "Properties" tab of the task in Workflow Manager.
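A hedged end-to-end sketch; the folder, workflow, session, and parameter names are made up, and the [folder.WF:workflow.ST:session] header style applies to workflow-based versions:

    # /u01/infa/srcfiles/load_sales.par
    [SALES_FLD.WF:wf_load_sales.ST:s_m_load_sales]
    $$COUNTRY_CODE=US
    $$AS_OF_DATE=12/31/2008

The mapping can then reference the parameter anywhere an expression is allowed, for example in a Filter condition: COUNTRY = '$$COUNTRY_CODE'.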
4. Verbose data: along with verbose initialization, logs each and every record
processed by the Informatica Server.
A reusable transformation can only be used within its folder, but a shortcut can
be used anywhere in the repository and will point to the actual transformation.
14. What are the locks with respect to mappings?
17. Explain the session recovery process.
The commit interval is the interval at which the Informatica Server commits data
into the target. When you recover a failed session, the server resumes loading
from the last commit point, which is why the commit interval matters for
recovery.
18. How do you run a workflow without using a GUI, i.e., Workflow Manager,
Workflow Monitor, and pmcmd?
pmcmd is not GUI. It is a command you can use within unix script to run the
workflow.
or
Unless the job is scheduled, you cannot manually run a workflow without using a
GUI.
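For completeness, a hedged sketch of running a workflow from a UNIX script with pmcmd (the service, domain, folder, and workflow names are made up, and the exact flags vary by version):

    pmcmd startworkflow -sv IntSvc_Dev -d Domain_Dev -u Administrator -p password \
        -f SALES_FLD -wait wf_load_sales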