Talend Use Cases

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 38

TALEND

USE CASES
CASE 1
Description: The records which are coming once needs in first Target File and the remaining records
from the source should go to the second Target File.

Source Target 1 Target 2

A A B
B D C
C B
D C
B B
C
B

Data Files/Tables to be used:

Steps to create the job

Step1: Read the Source File using tFileInputDelimited Component.

Step2: Group by the data based on the column value and calculate the count for each value using
tAggregateRow Component.
Step3: Filter the data based on the value of count where the value of Count = 1 using tFilterRow
Component

Step4: Take the output of the Step3 and remove the column Count using tFilterColumns component
Step5: Write the output of the Step4 to a file using tFileOutputDelimited Component

Step6: Take the Reject Link of Step3 with the records having Count > 1 and remove the Count Column
using tFilterColumns Component.

Step7: Join the Output of Step6 with the main file using tJoin Component.
Step8: Write the output of Step7 to second output file using tFileOutputDelimited.

Final Job Screen Shot:


CASE 2
Description: The distinct records needs in First Target File and the remaining duplicates needs in the
second Target File

Source Target 1 Target 2

A A B
B B C
C C B
D D
B
C
B

Data Files/Tables to be used:

Steps to create the job

Step1: Read the Source File using tFileInputDelimited Component.


Step2: Use tUniq Component to generate two outputs, one for Unique and another for duplicates.

Step3: Write the outputs of Step2 to two different files using tFileOutputDelimited.
Final Job Screen Shot:
CASE 3
Description: The second column from the Source needs to be reversed without changing the order of
the first Column

Source Target
101 D
101 A
102 C
102 B
103 B
103 C
104 A
104 D

Data Files/Tables to be used:

Steps to create the job

Step1: Read the Source File using tFileInputDelimited Component.

Step2: Use tMap Component, select COL1 and generate a DUMMY Column with sequence numbers.
Step3: Repeat the Step2 for COL2 and sort the COL2 in descending order before sending to tMap using
tSort Component.

Step4: Join the output of Step 2 and Step 3 using tJoin Component.

Step5: Write the output to file using tFileOutputDelimited Component.


Final Job Screen Shot:
CASE 4
Description: For each value of First Column, the Second Column values needs to be combined.

Source Target

10 A 10 ABC
20 K 20 KY
10 B 30 L
30 L 40 M
40 M
10 C
20 Y

Data Files/Tables to be used:

Steps to create the job

Step1: Read the Source File using tFileInputDelimited Component.


Step2: Sort the data in ascending order based on COL1 using tSort Component.

Step3: Using tDenormalizeSortedRow Component, group the data on COL1 and merge the values of
COL2.

Step4: Write the output of Step3 to a file using tFileOutputDelimited.

Final Job Screen Shot:


CASE 5
Description: For each DEPTNO column present in the Source we need the first record and last record in
the output.

Source Target

ENO ENAME DEPTNO


1 A 10 ENO ENAME DEPTNO
2 B 10 1 A 10
3 C 10 4 D 10
4 D 10 5 E 20
5 E 20 7 G 20
6 F 20 8 H 30
7 G 20 9 I 30
8 H 30
9 I 30

Data Files/Tables to be used:

Steps to create the job

Step1: Read the Source File using tFileInputDelimited Component.


Step2: Remove the duplicates based on DEPTNO Column using tUniq Component.

Step3: Repeat the Step2, sort the data in Descending order of ID before removing the duplicate using
tUniq Component.

Step4: Combine the data from Step2 and Step3 using tUniq Component.
Step5: Write the output to a file using tFileOutputDelimited Component.

Final Job Screen Shot:


CASE 6
Description: Create a job to load SCD dimension based on User Defined design

Data Tables to be used:

Source
ENO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
7839 RAJ PRESIDENT NULL 17-NOV-81 5000 NULL 10
7698 BLAKE MANAGER 7819 1-MAY-81 2850 NULL 30
7782 CLARK CLERK 7839 9-JUN-81 2450 NULL 10
7566 JONES MANAGER 7839 2-APR-81 2975 NULL 20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7844 TURNER SALESMAN 7698 8-SEP-81 1500 0 30
7900 JAMES CLERK 7698 3-DEC-81 950 NULL 30
7521 RAJ SALES 1234 22-FEB-81 1250 500 30
7902 FORD ANALYST 7566 3-DEC-81 3000 NULL 20
7369 SMITH CLERK 7902 17-DEC-80 800 NULL 20
7788 SCOTT ANALYST 7566 09-DEC-82 3000 NULL 20
7876 ADAMS CLERK 7788 12-JAN-83 1100 NULL 20
7934 MILLER CLERK 7782 23-JAN-82 1300 NULL 10
7123 SCOTT CLERK 7839 17-NOV-81 4500 100 10

Target Data:
SKEY ENO ENAME SAL INS_DATE UPD_DATE
1 7839 RAJ 500 2017-04-11 22:43:46 NULL
0
2 7698 BLAKE 320 2017-04-11 22:43:46 2017-04-11 01:51:57
0
3 7782 CLARK 245 2017-04-11 22:43:46 NULL
0
4 7566 JONES 297 2017-04-11 22:43:46 NULL
5
5 7654 MARTIN 125 2017-04-11 22:43:46 NULL
0
6 7499 ALLEN 160 2017-04-11 22:43:46 NULL
0
7 7844 TURNER 150 2017-04-11 22:43:46 NULL
0
8 7900 JAMES 950 2017-04-11 22:43:46 NULL
9 7521 RAJ 125 2017-04-11 22:43:46 NULL
0
10 7902 FORD 300 2017-04-11 22:43:46 NULL
0
11 7369 SMITH 800 2017-04-11 22:43:46 NULL
12 7788 SCOTT 300 2017-04-11 22:43:46 NULL
0
13 7876 ADAMS 110 2017-04-11 22:43:46 NULL
0
14 7934 MILLER 130 2017-04-11 22:43:46 NULL
0
15 7256 SCOTT 234 2017-04-12 01:51:57 NULL
0
16 7698 BLAKE 200 2017-04-12 01:51:57 NULL
0

Steps to create the job

1. Link the Source table first (so that it would be added as Main input) to tMap and then the
Target (so that it would be added as Lookup input)
2. Bring in the Target table instances (as Out components) into the Job thrice
3. Link the tMap to all the three Target instances and mark the rows as INS, UPDINS, UPDUPD
respectively
4. Designed Job should look like below

5. Under Target component properties set the property Action on data to Insert for INS and
UPDINS instances while you need to set the same property to Update for UPDUPD target
instance
6. Now let’s configure tMap as shown below:
CASE 7
Description: Create a job to load SCD dimension based on Built-In SCD components

Data Tables to be used: Please use the same source table (emp) created for Use Case # 6

Source:
ENO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
7839 RAJ PRESIDENT NULL 17-NOV-81 5000 NULL 10
7698 BLAKE MANAGER 7819 1-MAY-81 2850 NULL 30
7782 CLARK CLERK 7839 9-JUN-81 2450 NULL 10
7566 JONES MANAGER 7839 2-APR-81 2975 NULL 20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7844 TURNER SALESMAN 7698 8-SEP-81 1500 0 30
7900 JAMES CLERK 7698 3-DEC-81 950 NULL 30
7521 RAJ SALES 1234 22-FEB-81 1250 500 30
7902 FORD ANALYST 7566 3-DEC-81 3000 NULL 20
7369 SMITH CLERK 7902 17-DEC-80 800 NULL 20
7788 SCOTT ANALYST 7566 09-DEC-82 3000 NULL 20
7876 ADAMS CLERK 7788 12-JAN-83 1100 NULL 20
7934 MILLER CLERK 7782 23-JAN-82 1300 NULL 10
7123 SCOTT CLERK 7839 17-NOV-81 4500 100 10

Target:
ENO ENAME JOB MGR HIREDATE SAL COM DEPTNO SKE scd_start scd_en
M Y d
7839 RAJ PRESIDENT NUL 17-NOV- 500 NULL 10 1 2017-04-12 01:57:57 NULL
L 81 0
7698 BLAKE MANAGER 7819 1-MAY-81 200 NULL 30 2 2017-04-12 01:57:57 NULL
0
7782 CLARK CLERK 7839 9-JUN-81 245 NULL 10 3 2017-04-12 01:57:57 NULL
0
7566 JONES MANAGER 7839 2-APR-81 297 NULL 20 4 2017-04-12 01:57:57 NULL
5
7654 MARTIN SALESMAN 7698 28-SEP- 125 1400 30 5 2017-04-12 01:57:57 NULL
81 0
7499 ALLEN SALESMAN 7698 20-FEB- 160 300 30 6 2017-04-12 01:57:57 NULL
81 0
7844 TURNER SALESMAN 7698 8-SEP-81 150 0 30 7 2017-04-12 01:57:57 NULL
0
7900 JAMES CLERK 7698 3-DEC-81 950 NULL 30 8 2017-04-12 01:57:57 NULL
7521 RAJ SALES 1234 22-FEB- 125 500 30 9 2017-04-12 01:57:57 NULL
81 0
7902 FORD ANALYST 7566 3-DEC-81 300 NULL 20 10 2017-04-12 01:57:57 NULL
0
7369 SMITH CLERK 7902 17-DEC- 800 NULL 20 11 2017-04-12 01:57:57 NULL
80
7788 SCOTT ANALYST 7566 09-DEC- 300 NULL 20 12 2017-04-12 01:57:57 NULL
82 0
7876 ADAMS CLERK 7788 12-JAN-83 110 NULL 20 13 2017-04-12 01:57:57 NULL
0
7934 MILLER CLERK 7782 23-JAN-82 130 NULL 10 14 2017-04-12 01:57:57 NULL
0
7256 SCOTT SALES 9876 18-OCT- 234 150 20 15 2017-04-12 01:57:57 NULL
81 0

Steps to create the job:


CASE 8
Description: Create a set of jobs to pass the values to pass values from child to parent to final.

Data Tables to be used: Please use the same source table (emp) created for Use Case # 6

Source
ENO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
7839 RAJ PRESIDENT NULL 17-NOV-81 5000 NULL 10
7698 BLAKE MANAGER 7819 1-MAY-81 2850 NULL 30
7782 CLARK CLERK 7839 9-JUN-81 2450 NULL 10
7566 JONES MANAGER 7839 2-APR-81 2975 NULL 20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7844 TURNER SALESMAN 7698 8-SEP-81 1500 0 30
7900 JAMES CLERK 7698 3-DEC-81 950 NULL 30
7521 RAJ SALES 1234 22-FEB-81 1250 500 30
7902 FORD ANALYST 7566 3-DEC-81 3000 NULL 20
7369 SMITH CLERK 7902 17-DEC-80 800 NULL 20
7788 SCOTT ANALYST 7566 09-DEC-82 3000 NULL 20
7876 ADAMS CLERK 7788 12-JAN-83 1100 NULL 20
7934 MILLER CLERK 7782 23-JAN-82 1300 NULL 10
7123 SCOTT CLERK 7839 17-NOV-81 4500 100 10

Steps to create the job

1. Create a Job named Child with Source, tMap, tLogRow, tBufferOutput as shown below:

2. tMap configuration
3. Make sure you rename tLogRow to ‘ChildData’
4. Create a second job named Final with tRowGenerator, tMap, tLogRow as shown below

5. tRowGenerator configuration
6. tMap configuration

7. Make sure you rename the tLogRow to FinalData


8. Create a third job named Parent with tRunJob, tMap, tLogRow, tRunJob as shown below

9. Drag and drop the above created Child and Final jobs from Repository to Job canvas
10. Click on ‘Copy Child Job Schema’ under ‘Child’ tRunJob component
11. tMap configuration

12. Rename tLogRow to ParentData


13. Final tRunJob component configuration

14. Execute the Parent job only that triggers both Child and Final jobs as well pass the values from
Child to Parent to Final
15. Output as below
CASE 9
Description: Create a job that reads the database table Invoices and generates 3 different output files
for different regions (Without tForEach Component)

Data Files/Tables to be used:

Steps to create the job

Step1: Create 3 Database Connections for each Regions (Region1, Region2 and Region3)
Step2: Write the data to the respective output files using tFileOutputDelimited Component
Final Job Screen Shot:
CASE 10
Description: Create a job that reads the database table Invoices and generates 3 different output files
for different regions (With tForEach Component)

Data Files/Tables to be used:

Steps to create the job

Step1: Using tForEach Component create 3 Values (region1, region2 & region3)

Step2: Using Database Component read the data from INVOICES Table for each values present in Step1.

Step3: Using tFileOutputDelimited Component write the data to the output file.
Final Job Screen Shot:
CASE 11
Description: Create a job that reads the database table Invoices and generates multiple output files for
all the different regions

Data Files/Tables to be used:

Steps to create the job

Step1: Extract the distinct values of Regions from the Invoices table using the Database Component.

Step2: Pass the distinct values of Regions to tFlowToIterate Component.


Step3: Generate the Values for each Regions using tForEach Component.

Step4: Read the Invoices Table for all the values from Step3.

Step5: Write the output file using tFileOutputDelimited Component.

Final Job Screen Shot:


END

You might also like