Json Object in Log-File to Dataframe

I haven't worked with RStudio much, and when I did, I used CSV files.

Now I have a log file that contains multiple JSON objects, one per line. It looks like this:

{"count": 3, "material": wood, "systemnumber": 284, "cost": 45}
{"count": 4, "material": steel, "systemnumber": 360, "cost": 67}
{"count": 10, "material": plastic, "systemnumber": 210, "cost": 15}
...
...

Now I would like a dataframe that looks like this:

|  |count|material|systemnumber|cost
|1 |3    |wood    |284         |45
|2 |4    |steel   |360         |67
|3 |10   |plastic |210         |15
...
...

The normal import doesn't work, and neither does fromJSON, since it is a log file.

How can I convert it?



Solution 1:[1]

That's not valid JSON. A JSON file can contain only one root value. What you posted stores one JSON document per line, a format typically used for logs and big data files because it allows appending new records without having to read and rewrite the entire file. It also allows sending JSON records over a stream (e.g. a network stream).

There's no standard for this; it's just something people started doing more than a decade ago. You'll find this described as streaming JSON, newline-delimited JSON, JSON-per-line, JSON Lines, etc. All of these refer to the same technique, even when some people try to present it as some kind of standard.
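To make the difference concrete, here is a minimal sketch in R. It assumes the jsonlite package is installed and that string values such as "wood" are quoted in the real log (the sample in the question shows them unquoted, which no JSON parser will accept). The names log_lines, records and parsed_whole are made up for illustration.

```r
library(jsonlite)

# Hypothetical sample records; string values are quoted so each line is valid JSON.
log_lines <- c(
  '{"count": 3, "material": "wood", "systemnumber": 284, "cost": 45}',
  '{"count": 4, "material": "steel", "systemnumber": 360, "cost": 67}',
  '{"count": 10, "material": "plastic", "systemnumber": 210, "cost": 15}'
)

# Each line is a complete JSON document, so it parses on its own.
records <- lapply(log_lines, fromJSON)

# The lines glued together have several root objects, so parsing the
# whole text as one document fails; the error is caught and mapped to NULL.
whole_file <- paste(log_lines, collapse = "\n")
parsed_whole <- tryCatch(fromJSON(whole_file), error = function(e) NULL)

# Stack the per-line records into one data frame, one row per record.
df_manual <- do.call(rbind, lapply(records, as.data.frame))
```

Parsing line by line like this works for small files, but it is only meant to show the idea; a dedicated streaming reader is the more practical route.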

jsonlite handles this through the stream_in and stream_out functions. This is described in Streaming JSON input/output. Here are some snippets from the article's example:

> mydata <- stream_in(url("http://jeroen.github.io/data/diamonds.json"))
opening url input connection.
 Imported 53940 records. Simplifying...
closing url input connection.
> summary(mydata)
     carat            cut               color             clarity         
 Min.   :0.2000   Length:53940       Length:53940       Length:53940      
 1st Qu.:0.4000   Class :character   Class :character   Class :character  
 Median :0.7000   Mode  :character   Mode  :character   Mode  :character  
 Mean   :0.7979                                                           
 3rd Qu.:1.0400                                                           
 Max.   :5.0100                                                           
     depth           table           price             x         
 Min.   :43.00   Min.   :43.00   Min.   :  326   Min.   : 0.000  
 1st Qu.:61.00   1st Qu.:56.00   1st Qu.:  950   1st Qu.: 4.710  
 Median :61.80   Median :57.00   Median : 2401   Median : 5.700  
 Mean   :61.75   Mean   :57.46   Mean   : 3933   Mean   : 5.731  
 3rd Qu.:62.50   3rd Qu.:59.00   3rd Qu.: 5324   3rd Qu.: 6.540  
 Max.   :79.00   Max.   :95.00   Max.   :18823   Max.   :10.740  
       y                z         
 Min.   : 0.000   Min.   : 0.000  
 1st Qu.: 4.720   1st Qu.: 2.910  
 Median : 5.710   Median : 3.530  
 Mean   : 5.735   Mean   : 3.539  
 3rd Qu.: 6.540   3rd Qu.: 4.040  
 Max.   :58.900   Max.   :31.800
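
Applied to a local log file like the one in the question, the same call could look roughly like the sketch below. The file is a temporary stand-in written by the example itself, the name log_path is hypothetical, and the string values are assumed to be quoted in the real log.

```r
library(jsonlite)

# Hypothetical log file standing in for the one in the question.
log_path <- tempfile(fileext = ".log")
writeLines(c(
  '{"count": 3, "material": "wood", "systemnumber": 284, "cost": 45}',
  '{"count": 4, "material": "steel", "systemnumber": 360, "cost": 67}',
  '{"count": 10, "material": "plastic", "systemnumber": 210, "cost": 15}'
), log_path)

# stream_in() reads one JSON record per line from the connection
# and simplifies the records into a data frame.
df <- stream_in(file(log_path), verbose = FALSE)

print(df)
```

For a real log, replace log_path with the path to the file. Lines that are not valid JSON (for example literal "..." placeholders) would have to be removed first.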

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Panagiotis Kanavos