How to aggregate on multiple columns using SQL or Spark SQL

I have the following table:

Expected output is:

The aggregation involves two columns. Is this supported in SQL?

Answer

In Spark SQL you can do it like this:

or in one select:

This example uses the higher-order aggregate function.

aggregate(expr, start, merge, finish) – Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. The final state is converted into the final result by applying a finish function.
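
To make the description above concrete, here is a minimal pure-Python sketch of how aggregate folds an array into a single state and then applies finish. It only models the semantics and is not Spark's implementation. The Python function, the sample arrays, and the (a, b) pairs are assumptions for illustration and are not the data from the question.

```python
from functools import reduce


def aggregate(expr, start, merge, finish=lambda acc: acc):
    """Pure-Python model of aggregate(expr, start, merge, finish)."""
    # Fold merge over the array elements, starting from the initial state.
    state = reduce(merge, expr, start)
    # Convert the final state into the final result.
    return finish(state)


# Sum the elements, then scale the final state in finish.
total = aggregate([1, 2, 3], 0, lambda acc, x: acc + x, lambda acc: acc * 10)
print(total)  # 60

# The state can carry more than one value, e.g. a (sum_a, sum_b) pair
# built from an array of hypothetical (a, b) rows.
rows = [(1, 10), (2, 20), (3, 30)]
ratio = aggregate(
    rows,
    (0, 0),
    lambda acc, r: (acc[0] + r[0], acc[1] + r[1]),
    lambda acc: acc[1] / acc[0],
)
print(ratio)  # 10.0
```

The second call suggests why this kind of function suits a computation over two columns: each array element can hold both column values, and the state can track as many running values as the computation needs.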

User contributions licensed under: CC BY-SA