PostgreSQL – SELF JOIN
In PostgreSQL, a SELF JOIN is a powerful technique that allows us to join a table with itself. This type of join is particularly useful for comparing rows within the same table, such as establishing hierarchical relationships or identifying duplicate records.
Unlike other joins, there is no specific keyword for self joins; instead, they are written using INNER JOIN, LEFT JOIN, or RIGHT JOIN along with table aliases. This article explains the concept of self joins in PostgreSQL and demonstrates their usage through detailed examples.
What is a SELF JOIN?
A SELF JOIN is used to join a table to itself. It lets us write queries that compare rows within a single table. Aliases differentiate the two instances of the same table, so relationships stored inside one table, such as hierarchical reporting structures, can be queried easily.
Syntax
SELECT column_list
FROM table_name T1
INNER JOIN table_name T2 ON join_predicate;
or,
SELECT column_list
FROM table_name T1
LEFT JOIN table_name T2 ON join_predicate;
or,
SELECT column_list
FROM table_name T1
RIGHT JOIN table_name T2 ON join_predicate;
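As a minimal sketch of the INNER JOIN form, the snippet below fills in the placeholders with a small table. The category table, its columns, and its rows are hypothetical and exist only for this illustration; they are not part of the sample database used in the rest of the article.

```sql
-- Hypothetical table for illustration only: each category may point to a parent category.
CREATE TABLE category (
    category_id INT PRIMARY KEY,
    name        VARCHAR(255) NOT NULL,
    parent_id   INT REFERENCES category (category_id)
);

INSERT INTO category (category_id, name, parent_id) VALUES
    (1, 'Books', NULL),
    (2, 'Fiction', 1),
    (3, 'Poetry', 1);

-- c and p are two aliases for the same table:
-- c plays the role of the child row, p the role of its parent row.
SELECT
    c.name AS category,
    p.name AS parent
FROM category c
INNER JOIN category p ON p.category_id = c.parent_id;
```

The aliases are required here: if the same table name appears twice in FROM without distinct aliases, PostgreSQL rejects the query because the table name is specified more than once.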
Setting Up the Sample Database
To show self joins effectively, we will create a sample database named company and a table named employee to represent the company hierarchy.
Step 1: Create the Database
Create a database named “company” with the below command:
CREATE DATABASE company;
Step 2: Create the Employee Table
Add an “employee” table to the database using the below command. It stores each employee’s ID, first name, last name, and manager’s ID, which together describe the company hierarchy:
CREATE TABLE employee (
    employee_id INT PRIMARY KEY,
    first_name VARCHAR (255) NOT NULL,
    last_name VARCHAR (255) NOT NULL,
    manager_id INT,
    FOREIGN KEY (manager_id)
        REFERENCES employee (employee_id)
        ON DELETE CASCADE
);
Step 3: Insert Employee Data
Now add some employee data to the table using the below command:
INSERT INTO employee (employee_id, first_name, last_name, manager_id) VALUES
(1, 'Sandeep', 'Jain', NULL),
(2, 'Abhishek', 'Kelenia', 1),
(3, 'Harsh', 'Aggarwal', 1),
(4, 'Raju', 'Kumar', 2),
(5, 'Nikhil', 'Aggarwal', 2),
(6, 'Anshul', 'Aggarwal', 2),
(7, 'Virat', 'Kohli', 3),
(8, 'Rohit', 'Sharma', 3);
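In this setup, the manager_id column indicates the employee’s manager. If manager_id is NULL, the employee does not report to anyone. In the sample data, Sandeep Jain is at the top of the hierarchy; Abhishek Kelenia and Harsh Aggarwal report to him; Raju Kumar, Nikhil Aggarwal, and Anshul Aggarwal report to Abhishek Kelenia; and Virat Kohli and Rohit Sharma report to Harsh Aggarwal.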
Example 1: Who Reports to Whom
Now that the database is set up, we can query who reports to whom by using the same “employee” table twice. The query joins the employee table with itself: the alias e represents each employee, and the alias m represents that employee’s manager. Pairing each employee row with its manager row displays the hierarchical relationship. Because this is an INNER JOIN, an employee whose manager_id is NULL has no matching manager row and is left out of the result.
Query:
SELECT
    e.first_name || ' ' || e.last_name AS employee,
    m.first_name || ' ' || m.last_name AS manager
FROM
    employee e
INNER JOIN employee m ON m.employee_id = e.manager_id
ORDER BY
    manager;
Output
[Image: PostgreSQL SELF JOIN Example 1 output]
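The INNER JOIN in Example 1 returns only employees who have a manager, so Sandeep Jain (whose manager_id is NULL) is missing from the employee column. If the top of the hierarchy should be listed as well, one option is a LEFT JOIN, sketched below against the same employee table. The '(no manager)' label and the ordering by employee_id are arbitrary choices made for this sketch.

```sql
-- Same self join as Example 1, but LEFT JOIN keeps every row of e,
-- including employees whose manager_id is NULL.
SELECT
    e.first_name || ' ' || e.last_name AS employee,
    -- When no manager row matches, the concatenation yields NULL;
    -- COALESCE replaces it with a placeholder label (an assumption of this sketch).
    COALESCE(m.first_name || ' ' || m.last_name, '(no manager)') AS manager
FROM
    employee e
LEFT JOIN employee m ON m.employee_id = e.manager_id
ORDER BY
    e.employee_id;
```

With the sample data this returns all eight employees, with Sandeep Jain paired with the placeholder label instead of a manager name.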
Example 2: Finding Films with the Same Runtime
This example uses the film table from the sample DVD rental database used in previous articles. We perform a self join on the film table to identify all pairs of films that share the same runtime.
Query:
SELECT
    f1.title AS Film_1,
    f2.title AS Film_2,
    f1.length AS Runtime
FROM
    film AS f1
INNER JOIN film AS f2
    ON f1.film_id <> f2.film_id
    AND f1.length = f2.length;
Output
[Image: PostgreSQL SELF JOIN Example 2 output]
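Note that the <> comparison in the query above lists every matching pair twice, once as (A, B) and once as (B, A). If each pair should appear only once, a common variation is to compare the IDs with < instead. The sketch below assumes the same film table and columns (film_id, title, length) as the query above; the ORDER BY is added only to make the result easier to read.

```sql
-- Each unordered pair of films with equal runtime appears exactly once,
-- because only the combination with the smaller film_id on the left passes the filter.
SELECT
    f1.title AS film_1,
    f2.title AS film_2,
    f1.length AS runtime
FROM
    film AS f1
INNER JOIN film AS f2
    ON f1.film_id < f2.film_id
    AND f1.length = f2.length
ORDER BY
    f1.length,
    f1.title,
    f2.title;
```

The < comparison also excludes a film being paired with itself, so no separate <> check is needed.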
Conclusion
In summary, self joins in PostgreSQL are used to compare rows within the same table. By giving the table two aliases, we can join it with itself to query hierarchical data, as in the employee and manager example, or to find related records, as in the films that share a runtime. A self join needs no special keyword; it is an ordinary INNER JOIN, LEFT JOIN, or RIGHT JOIN in which both sides refer to the same table.
FAQs
What is a self join in PostgreSQL?
A self join in PostgreSQL is a technique used to join a table to itself, allowing for the comparison of rows within that table.
How do I use aliases in a self join?
Aliases are used in a self join to differentiate between the two instances of the same table, allowing you to refer to each instance as a unique entity in the query.
Is a self join the same as other types of joins?
No. A self join uses the same syntax as other joins (like INNER JOIN, LEFT JOIN, or RIGHT JOIN), but it specifically refers to joining a table to itself.