How to merge multiple JSON files and average values

I am trying to merge several JSON files that contain this kind of data:

#file1:
{
  "user1": {
    "server1": 7.2,
    "server2": 10.3
  },
  "user2": {
    "server1": 15
  }
}
#file2:
{
  "user1": {
    "server1": 8.5,
    "server3": 20.5
  },
  "user3": {
    "server1": 28
  }
}

And so on (12 files per hour).

Each file is a list of users with their server usage in percent, refreshed every 5 minutes (users may disappear and reappear depending on resource usage).

I am on Linux and generate these files with various shell tools such as jq.

My goal is to obtain, every hour, a merged file with the average load over all available data. A value missing from a file counts as 0, so each sum is divided by the total number of files. Considering just the two files above, the result would look like this:

#result:
{
  "user1": {
    "server1": 7.85,
    "server2": 5.15,
    "server3": 10.25
  },
  "user2": {
    "server1": 7.5
  },
  "user3": {
    "server1": 14
  }
}

I have tried several queries with the reduce filter, but nothing works as expected.

Maybe the best approach would be to parse every file, extract the data, store each user in a separate file, and then merge everything?

Thanks for your analysis and examples.



Solution 1:[1]

Here's a solution for an arbitrary number of input files. It first divides each value by the number of input files, and then adds them all up retaining the keys:

jq -s '
  .[][][] /= length
  | reduce (.[] | to_entries[]) as $u ({};
      reduce ($u.value | to_entries[]) as $s (.;
        .[$u.key][$s.key] += $s.value
      )
    )
' file*
{
  "user1": {
    "server1": 7.85,
    "server2": 5.15,
    "server3": 10.25
  },
  "user2": {
    "server1": 7.5
  },
  "user3": {
    "server1": 14
  }
}

Demo
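
To see what the first step does on its own: .[][][] /= length divides every leaf value of the slurped array by the number of input files, and the reduce then just sums those scaled values per user and server. For the two sample files (assuming they are saved as file1 and file2), the intermediate document looks like this:

jq -s '.[][][] /= length' file1 file2
[
  {
    "user1": {
      "server1": 3.6,
      "server2": 5.15
    },
    "user2": {
      "server1": 7.5
    }
  },
  {
    "user1": {
      "server1": 4.25,
      "server3": 10.25
    },
    "user3": {
      "server1": 14
    }
  }
]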


The division can also be moved into the addition by capturing the number of files in a variable at the start:

jq -s '
  length as $c
  | reduce (.[] | to_entries[]) as $u ({};
      reduce ($u.value | to_entries[]) as $s (.;
        .[$u.key][$s.key] += $s.value / $c
      )
    )
' file*

Demo
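
To produce the hourly merged file the question asks for, this filter can be wrapped in a small script. The snapshot naming scheme (usage_*.json) and the output file name below are assumptions; adapt them to however your 5-minute files are actually written:

#!/bin/sh
# Sketch: average all 5-minute snapshots of the current hour into one file.
hour=$(date +%Y%m%d%H)
jq -s '
  length as $c
  | reduce (.[] | to_entries[]) as $u ({};
      reduce ($u.value | to_entries[]) as $s (.;
        .[$u.key][$s.key] += $s.value / $c
      )
    )
' usage_*.json > "merged_${hour}.json"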

Solution 2:[2]

One approach:

jq -n \
   --argjson d1 "$( cat file1.json )" \
   --argjson d2 "$( cat file2.json )" \
'
   $d1 * $d2 |
   with_entries(
      .key as $user |
      .value |= with_entries(
         .key as $server |
         .value = ( $d1[ $user ][ $server ] + $d2[ $user ][ $server ] ) / 2
      )
   )
'

Demo on jqplay

$d1 * $d2 creates the right structure, but with the wrong values. We then fix up the values.
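
For the two sample files, $d1 * $d2 on its own yields the union of all user/server keys, with each value simply taken from $d2 whenever both files contain it, which is why the values still need fixing:

jq -n --argjson d1 "$( cat file1.json )" --argjson d2 "$( cat file2.json )" '$d1 * $d2'
{
  "user1": {
    "server1": 8.5,
    "server2": 10.3,
    "server3": 20.5
  },
  "user2": {
    "server1": 15
  },
  "user3": {
    "server1": 28
  }
}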


Another approach:

jq -s '
   length as $n |
   reduce ( .[] | to_entries[] ) as $_ ( { };
      $_.key as $user |
      reduce ( $_.value | to_entries[] ) as $_ ( .;
         $_.key as $server |
         .[ $user ][ $server ] += $_.value / $n
      )
   )
' file1.json file2.json

Demo on jqplay

This one works with more than two files!
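
All of these approaches rely on one jq detail: indexing a missing entry yields null, and null acts like 0 in addition, so a user/server pair that appears in only some of the files is still summed correctly and divided by the total number of files. A quick check of that behavior:

jq -n 'null + 15, ({} | .["user2"]["server1"] += 7.5)'
15
{
  "user2": {
    "server1": 7.5
  }
}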

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: pmf
Solution 2: