
PostgreSQL – FIRST_VALUE Function

Last Updated : 18 Nov, 2024

The FIRST_VALUE() function in PostgreSQL is a window function that retrieves the first value within an ordered set of rows, often within a specific partition. This feature is highly useful for data analysis and reporting by allowing targeted access to specific data points.

In this article, we will explain the syntax, usage, and practical examples of using the FIRST_VALUE() function in PostgreSQL, focusing on how it can improve data analysis by retrieving initial values within sorted or partitioned datasets.

Why Use the FIRST_VALUE() Function in PostgreSQL?

The FIRST_VALUE() function in PostgreSQL is particularly valuable when working with grouped or partitioned data, for example to show, alongside every row, the entry with the lowest value in its group. It is widely used in analytics for tasks like tracking minimum or initial values across datasets.

Syntax

FIRST_VALUE ( expression )
OVER (
[PARTITION BY partition_expression, ... ]
ORDER BY sort_expression [ASC | DESC], ...
[rows_range_clause]
)

Key Terms

  • expression: The value to return, evaluated against the first row of the window frame within the sorted partition. This can be a column, a computed expression, or a subquery returning a single value.
  • PARTITION BY clause: Divides rows into separate partitions within which the FIRST_VALUE() function operates independently. Useful for performing calculations or comparisons within specific groups of data.
  • ORDER BY clause: Specifies the sorting criteria for rows within each partition. Determines the order in which rows are processed by the FIRST_VALUE() function.
  • rows_range_clause: Optional clause that limits the range of rows within each partition that the function operates on. It defines a window frame within the partition for more precise control over result set boundaries.
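As a rough illustration of how these clauses fit together, the sketch below runs a FIRST_VALUE() query from Python. It makes two assumptions that are not part of the article: the `scores` table and its data are hypothetical, and an in-memory SQLite database (3.25 or later, via the standard `sqlite3` module) stands in for PostgreSQL so the example runs without a server. The SQL statement itself uses only syntax that PostgreSQL also accepts.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scores (team TEXT, player TEXT, points INTEGER);  -- hypothetical table
INSERT INTO scores VALUES
  ('red',  'Ana', 12), ('red',  'Bo', 30),
  ('blue', 'Cy',   7), ('blue', 'Di',  5);
""")

# PARTITION BY splits the rows per team, ORDER BY sorts each partition,
# and FIRST_VALUE picks the player from the first row of each partition.
top_scorers = conn.execute("""
SELECT team, player, points,
       FIRST_VALUE(player) OVER (
           PARTITION BY team
           ORDER BY points DESC
       ) AS top_scorer
FROM scores
ORDER BY team, points DESC
""").fetchall()

for row in top_scorers:
    print(row)
```

Every row keeps its own columns and additionally carries the top scorer of its team, which is the main difference from a GROUP BY aggregate that collapses each group to one row.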

Examples of PostgreSQL FIRST_VALUE Function

Let us take a look at some of the examples of the FIRST_VALUE Function in PostgreSQL to better understand its functionality and flexibility in data analysis. Here, we will explain its usage in both basic and partitioned query structures to highlight different applications.

Example 1: Basic Usage of FIRST_VALUE

Suppose we have two tables, Animal_groups and Mammals, and we want to use the FIRST_VALUE() function to retrieve the mammal with the lowest lifespan.

Step 1: Create Sample Tables and Data

CREATE TABLE Animal_groups (
animal_id serial PRIMARY KEY,
animal_name VARCHAR (255) NOT NULL
);

CREATE TABLE Mammals (
mammal_id serial PRIMARY KEY,
mammal_name VARCHAR (255) NOT NULL,
lifespan DECIMAL (11, 2),
animal_id INT NOT NULL,
FOREIGN KEY (animal_id) REFERENCES Animal_groups (animal_id)
);

Step 2: Insert Data

INSERT INTO Animal_groups (animal_name)
VALUES
('Terrestrial'),
('Aquatic'),
('Winged');

INSERT INTO Mammals(mammal_name, animal_id, lifespan)
VALUES
('Cow', 1, 10),
('Dog', 1, 7),
('Ox', 1, 13),
('Wolf', 1, 11),
('Blue Whale', 2, 80),
('Dolphin', 2, 5),
('Sea Horse', 2, 3),
('Octopus', 2, 8),
('Bat', 3, 4),
('Flying Squirrels', 3, 1),
('Petaurus', 3, 2);

Step 3: Query Using FIRST_VALUE()

To find the mammal with the lowest lifespan across all animal groups, use the FIRST_VALUE() function with an ORDER BY clause and no PARTITION BY, as in the following query.

Query:

SELECT
mammal_id,
mammal_name,
animal_id,
lifespan,
FIRST_VALUE(mammal_name)
OVER(
ORDER BY lifespan
) AS lowest_lifespan
FROM
Mammals;

Output

[Image: PostgreSQL FIRST_VALUE Function Example]

Explanation:

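This query returns every mammal alongside the name of the mammal with the lowest lifespan in the entire table. Here, FIRST_VALUE(mammal_name) OVER(ORDER BY lifespan) sorts all rows by lifespan in ascending order and returns mammal_name from the first row, so the lowest_lifespan column holds the same name, Flying Squirrels (lifespan 1), on every row.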
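The behaviour described above can be checked with a short script. This is a sketch under stated assumptions rather than the article's own setup: it loads the sample rows from Step 2 into an in-memory SQLite database (3.25 or later) through Python's `sqlite3` module, and the table definition is simplified for SQLite (`INTEGER PRIMARY KEY` in place of `serial`, no foreign key).

```python
import sqlite3

# SQLite stands in for PostgreSQL here (assumption); schema simplified accordingly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Mammals (
    mammal_id   INTEGER PRIMARY KEY,
    mammal_name TEXT NOT NULL,
    lifespan    NUMERIC,
    animal_id   INTEGER NOT NULL
);
INSERT INTO Mammals (mammal_name, animal_id, lifespan) VALUES
  ('Cow', 1, 10), ('Dog', 1, 7), ('Ox', 1, 13), ('Wolf', 1, 11),
  ('Blue Whale', 2, 80), ('Dolphin', 2, 5), ('Sea Horse', 2, 3), ('Octopus', 2, 8),
  ('Bat', 3, 4), ('Flying Squirrels', 3, 1), ('Petaurus', 3, 2);
""")

# No PARTITION BY: the whole table is one partition, sorted by lifespan.
rows = conn.execute("""
SELECT mammal_name, lifespan,
       FIRST_VALUE(mammal_name) OVER (ORDER BY lifespan) AS lowest_lifespan
FROM Mammals
ORDER BY lifespan
""").fetchall()

for row in rows:
    print(row)
```

All eleven rows come back, and the third column repeats the name taken from the row with the smallest lifespan.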

Example 2: Partitioned by Animal Groups

The following statement uses the FIRST_VALUE() function with PARTITION BY to return all mammals and, for each animal group, the mammal with the lowest lifespan.

Query:

SELECT
mammal_id,
mammal_name,
animal_id,
lifespan,
FIRST_VALUE(mammal_name)
OVER(
PARTITION BY animal_id
ORDER BY lifespan
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS lowest_lifespan
FROM
Mammals;

Output

[Image: PostgreSQL FIRST_VALUE Function Example]

Explanation:

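This query returns all mammals along with the mammal having the lowest lifespan within the same animal group. Here, PARTITION BY animal_id splits the rows into one partition per animal group, ORDER BY lifespan sorts each partition in ascending order, and FIRST_VALUE(mammal_name) returns the name from the first row of each partition: Dog for group 1 (Terrestrial), Sea Horse for group 2 (Aquatic), and Flying Squirrels for group 3 (Winged).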
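The partitioned query can be tried out the same way. The sketch below is an illustration under assumptions, not the article's original setup: it uses an in-memory SQLite database (3.25 or later) through Python's `sqlite3` module as a stand-in for PostgreSQL, with the schema simplified for SQLite, and it adds a join to Animal_groups so that each group is shown by name.

```python
import sqlite3

# SQLite stands in for PostgreSQL here (assumption); schema simplified accordingly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Animal_groups (
    animal_id   INTEGER PRIMARY KEY,
    animal_name TEXT NOT NULL
);
CREATE TABLE Mammals (
    mammal_id   INTEGER PRIMARY KEY,
    mammal_name TEXT NOT NULL,
    lifespan    NUMERIC,
    animal_id   INTEGER NOT NULL REFERENCES Animal_groups (animal_id)
);
INSERT INTO Animal_groups (animal_name) VALUES ('Terrestrial'), ('Aquatic'), ('Winged');
INSERT INTO Mammals (mammal_name, animal_id, lifespan) VALUES
  ('Cow', 1, 10), ('Dog', 1, 7), ('Ox', 1, 13), ('Wolf', 1, 11),
  ('Blue Whale', 2, 80), ('Dolphin', 2, 5), ('Sea Horse', 2, 3), ('Octopus', 2, 8),
  ('Bat', 3, 4), ('Flying Squirrels', 3, 1), ('Petaurus', 3, 2);
""")

# One partition per animal group; the first row of each partition,
# ordered by lifespan, supplies lowest_lifespan for the whole group.
rows = conn.execute("""
SELECT g.animal_name, m.mammal_name, m.lifespan,
       FIRST_VALUE(m.mammal_name) OVER (
           PARTITION BY m.animal_id
           ORDER BY m.lifespan
           RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS lowest_lifespan
FROM Mammals AS m
JOIN Animal_groups AS g ON g.animal_id = m.animal_id
ORDER BY m.animal_id, m.lifespan
""").fetchall()

# Collapse to one entry per group to make the per-partition result easy to read.
lowest_by_group = {group: lowest for group, _, _, lowest in rows}
print(lowest_by_group)
```

The query still returns one row per mammal; the dictionary is only a compact view of the value that FIRST_VALUE() repeats within each partition.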

Important Points About PostgreSQL FIRST_VALUE Function

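  • If multiple rows have the same value for the ORDER BY expression, which of them FIRST_VALUE() treats as the first row is not guaranteed; it depends on the order in which the rows happen to reach the window function. Add a tie-breaking column (for example, the primary key) to the ORDER BY clause to make the result deterministic.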
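  • The rows_range_clause (e.g., ROWS BETWEEN or RANGE BETWEEN) limits the rows considered within each partition. With an ORDER BY clause, the default frame runs from the start of the partition through the current row and its peers, so it always contains the partition's first row. For FIRST_VALUE(), specifying the frame is therefore generally not necessary, unlike for LAST_VALUE(); it only matters when the frame does not start at UNBOUNDED PRECEDING, which changes which row counts as first.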
  • The ordering of rows within partitions directly affects the result of FIRST_VALUE(). Ensure the ORDER BY clause accurately reflects the desired sorting to get meaningful results.
  • NULL values affect FIRST_VALUE() results, as they are sorted last in ascending order and first in descending order. Use NULLS FIRST or NULLS LAST in the ORDER BY clause for precise control.
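Two of the points above, tie-breaking and the effect of the frame, can be illustrated with a small sketch. The assumptions are the same as elsewhere and are not taken from the article: the `items` table is hypothetical, and an in-memory SQLite database (3.25 or later) accessed through Python's `sqlite3` module stands in for PostgreSQL. NULL ordering is deliberately left out because SQLite's default NULL placement differs from PostgreSQL's.

```python
import sqlite3

# SQLite stands in for PostgreSQL here (assumption); items is a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (name TEXT, val INTEGER);
INSERT INTO items VALUES ('b', 1), ('a', 1), ('c', 2);  -- 'a' and 'b' tie on val
""")

rows = conn.execute("""
SELECT name, val,
       -- Default frame: starts at the partition's first row, so the result
       -- is the same on every row; 'name' breaks the tie on val.
       FIRST_VALUE(name) OVER (ORDER BY val, name) AS first_default_frame,
       -- Frame starting at the current row: the "first" row of the frame
       -- is the current row itself.
       FIRST_VALUE(name) OVER (
           ORDER BY val, name
           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
       ) AS first_from_current
FROM items
ORDER BY val, name
""").fetchall()

for row in rows:
    print(row)
```

Without `name` in the ORDER BY, either 'a' or 'b' could legitimately be reported as the first value, since both share the lowest `val`.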

Conclusion

The FIRST_VALUE function in PostgreSQL enables effective data analysis, especially for tracking initial values within ordered or grouped data. This function, when combined with clauses like PARTITION BY and ORDER BY, provides excellent flexibility in SQL queries.

The FIRST_VALUE function is particularly useful for summarizing datasets, allowing us to quickly access specific insights like the lowest or earliest value across different groups or partitions, enhancing the power of analytical SQL operations in PostgreSQL.



