I have a table with the following schema (postgresql 14):
x
message sentiment classification
any text positive mobile, communication
message
are only string, phrases.
sentiment
is a string, only one word
classification
are string but can have 1 to many word comma separated
I would like to create a json field with these columns, like this:
{"msg":"any text", "sentiment":"positive","classification":["mobile,"communication"]}
Also, if possible, is there a way to consider the classification this way:
{"msg":"any text", "sentiment":"positive","classification 1":"mobile","classification 2" communication"}
Advertisement
Answer
The first part of question is easy – Postgres provides functions for splitting string and converting to json:
with t(message, sentiment, classification) as (values
('any text','positive','mobile, communication')
)
select row_to_json(x.*)
from (
select t.message
, t.sentiment
, array_to_json(string_to_array(t.classification, ', ')) as classification
from t
) x
The second part is harder – your want json to have variable number of attributes, mixed of grouped and nongrouped data. I suggest to unwind all attributes and then assemble them back (note the numbered
CTE is actually not needed if your real table has id – I just needed some column to group by):
with t(message, sentiment, classification) as (values
('any text','positive','mobile, communication')
)
, numbered (id, message, sentiment, classification) as (
select row_number() over (order by null)
, t.*
from t
)
, extracted (id,message,sentiment,classification,index) as (
select n.id
, n.message
, n.sentiment
, l.c
, l.i
from numbered n
join lateral unnest(string_to_array(n.classification, ', ')) with ordinality l(c,i) on true
), unioned (id, attribute, value) as (
select id, concat('classification ', index::text), classification
from extracted
union all
select id, 'message', message
from numbered
union all
select id, 'sentiment', sentiment
from numbered
)
select json_object_agg(attribute, value)
from unioned
group by id;