hey, wondering if anyone has any decent ideas to solve this conundrum in CBN
Given the JSON object below, I need to parse it into additional.fields, with the keys being unique instances of "match_key" and duplicate keys merged into repeated fields. How do you conquer this in CBN? I've included a snippet of some Python code I wrote that gets the desired result.
{
"data_objs": [
{
"match_key": "test3",
"match_string": "string.com"
},
{
"match_key": "test2",
"match_string": "string.xyz"
},
{
"match_key": "test1",
"match_string": "string.tech"
},
{
"match_key": "test1",
"match_string": "string.tech"
},
{
"match_key": "RuleName",
"match_string": "string.tech"
}
],
"domain": "somewhere.com",
"issue_org": "Let's Encrypt"
}
Python code:
import json
from datetime import datetime

result_object_udm = {
"metadata": {
"event_timestamp": datetime.now().strftime("%Y-%m-%dT%H:%M:%S.%fZ"),
"event_type": "GENERIC_EVENT",
"product_name": "TEST_PRODUCT",
},
"additional": {
"fields": [
]
}
}
unique_matches = []
result_store = []
for rule in data1['matched_rules']:
if rule['match_key'] not in unique_matches:
unique_matches.append(rule['match_key'])
result_object = {
"key": rule['match_key'],
"value": {
"list_value": {
"values": [
{
"string_value": rule['match_value']
}
]
}
}
}
result_store.append(result_object)
else:
result_store[-1]['value']['list_value']['values'].append({
"string_value": rule['match_value']
})
for result in result_store:
result_object_udm['additional']['fields'].append(result)
# print(json.dumps(result_store, indent=4))
I've got as far as line 35 in my CBN parser before becoming confused about how to track which keys have been seen before and thus need merging. While I can confirm whether a key is unique, I'm not sure how to access the last element of an array so I can update its list_value.values array with the corresponding value.
Any ideas for alternative methods are also appreciated.
Hi @as-1 ,
There could be simpler ideas, but one is to concatenate the keys (not merge them) into a token, say "keys"=>"%{keys}|%{current_key}", then use a conditional with =~ (regex match) to check whether the [current_key] value matches the %{keys} token, which has a regex alternation form, like this:
filter {
json {
source => "message"
array_function => "split_columns"
}
mutate {replace => {"keys"=>""}}
for index1,item1 in data_objs {
mutate {convert => {"index1"=>"string"}} #Convert index to string to be used if needed in replace statements
#mutate {replace => {"obj1"=>""}}
mutate {replace => {"current_key"=>"%{item1.match_key}"}}
statedump {"label" => " countStart"}
if [index1]=="0" {
mutate {replace => {"keys"=>"%{current_key}"}}
} else {
if [current_key] !~ keys {
mutate {replace => {"exists"=>"False"}}
mutate {replace => {"keys"=>"%{keys}|%{current_key}"}}
} else {
mutate {replace => {"exists"=>"True"}}
}
}
statedump {"label" => " countEnd"}
}
statedump {
label => "end"
}
}
You will get a de-duplicated field "keys" and an indicator field "exists" that you can use in your logic.
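To make the token logic concrete, here is the same approach simulated in plain Python. This is an illustration only, not CBN: `re.fullmatch` stands in for CBN's `=~` so that, for example, a seen key "test1" is not falsely found inside a new key "test12".

```python
import re

# Simulate the CBN trick: build a pipe-separated "keys" token and test
# membership of each new key with a regex match against that token.
data_objs = [
    {"match_key": "test3", "match_string": "string.com"},
    {"match_key": "test2", "match_string": "string.xyz"},
    {"match_key": "test1", "match_string": "string.tech"},
    {"match_key": "test1", "match_string": "string.tech"},
    {"match_key": "RuleName", "match_string": "string.tech"},
]

keys = ""
for index1, item1 in enumerate(data_objs):
    current_key = item1["match_key"]
    if index1 == 0:
        keys = current_key
        exists = "False"
    elif not re.fullmatch(keys, current_key):
        # Key not seen before: record it in the token.
        exists = "False"
        keys = keys + "|" + current_key
    else:
        # Key already present in the alternation.
        exists = "True"

print(keys)  # test3|test2|test1|RuleName
```

The duplicate "test1" sets exists to "True" on its second appearance, and the final token carries each key exactly once.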
You could add another condition at line 18 to exclude null keys, i.e. AND [key]!="", as a null key will impact the regex and make all subsequent current_keys match it (r"a|" or r"|a" will match anything due to the null branch in the regex).
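The null-branch pitfall is easy to verify; Python's re module is used here purely for illustration:

```python
import re

# An empty branch in an alternation matches the empty string, and a
# search for the empty string succeeds at any position, so the pattern
# effectively matches every input.
assert re.search(r"a|", "zzz") is not None  # null branch matches "zzz"
assert re.search(r"|a", "zzz") is not None  # same with the branch first
assert re.search(r"a", "zzz") is None       # without it, no match
```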
Can you take it from there, or could you share the desired outcome and I can finish the parser?
I tried running the Python code, but I think it is missing some fields of "data1".
I updated my parser to accommodate the null values:
filter {
json {
source => "message"
array_function => "split_columns"
}
mutate {replace => {"keys"=>""}}
for index1,item1 in data_objs {
mutate {convert => {"index1"=>"string"}} #Convert index to string to be used if needed in replace statements
#mutate {replace => {"obj1"=>""}}
mutate {replace => {"current_key"=>"%{item1.match_key}"}}
statedump {"label" => " countStart"}
if [current_key] == "" {
mutate {replace => {"nullKey"=>"True"}}
mutate {replace => {"current_key"=>"Null"}}
}else{mutate {replace => {"nullKey"=>"False"}}}
if [index1]=="0" {
mutate {replace => {"keys"=>"%{current_key}"}}
} else {
if [current_key] !~ keys {
mutate {replace => {"exists"=>"False"}}
mutate {replace => {"keys"=>"%{keys}|%{current_key}"}}
} else {
mutate {replace => {"exists"=>"True"}}
}
}
statedump {"label" => " countEnd"}
}
statedump {
label => "end"
}
}
This will work with this edge case:
{
"data_objs": [
{
"match_key": "",
"match_string": "null_String"
}, {
"match_key": "test3",
"match_string": "string.com"
},
{
"match_key": "test2",
"match_string": "string.xyz"
},
{
"match_key": "test1",
"match_string": "string.tech"
},
{
"match_key": "test1",
"match_string": "string.tech"
},
{
"match_key": "RuleName",
"match_string": "string.tech"
}
],
"domain": "somewhere.com",
"issue_org": "Let's Encrypt"
}
hey, apologies for the lack of updates on this, and thank you for your input.
The desired outcome here is to have each unique "key" created as an additional field of type list_value, where its values are whatever the match strings are for that key.
so additional.fields['MyRuleName1'] = [
"url1",
"url2",
]
etc etc
so the model becomes: one "match_key" can have many "match_strings"
this is because the underlying source is a custom WAF solution that a client has built
I think you would need to create a placeholder object for each unique match key,
then iterate again to filter the data into the corresponding match key.
I will have another look at this with fresh eyes this week.
For reference as well, the data1 in the Python code was a typo: create an object with the schema of data_objs, call it data1, and the code will then run.
I have since noticed that even there I'm not properly tracking and sorting as expected; I'm just filtering data into the last unique match, which doesn't guarantee that the data isn't of another unique match type. So I probably need to do another lookup to find the exact index of the matching key and merge the value into the object at that index.
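One way to avoid that index lookup entirely is to group values by key in a dict first and only then build the UDM field list. A minimal sketch, using the data1/matched_rules schema from the snippet above (the sample values here are stand-ins):

```python
# Group match values per key; Python dicts preserve insertion order
# (3.7+), so duplicates merge into the right entry rather than whatever
# happens to be last in result_store.
data1 = {
    "matched_rules": [
        {"match_key": "test3", "match_value": "string.com"},
        {"match_key": "test1", "match_value": "string.tech"},
        {"match_key": "test2", "match_value": "string.xyz"},
        {"match_key": "test1", "match_value": "string.other"},
    ]
}

grouped = {}
for rule in data1["matched_rules"]:
    grouped.setdefault(rule["match_key"], []).append(rule["match_value"])

fields = [
    {"key": k, "value": {"list_value": {"values": [{"string_value": v} for v in vals]}}}
    for k, vals in grouped.items()
]
```

With this shape there is no "last element" to track: the second "test1" lands in the existing "test1" bucket regardless of where it appears in the input.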