JSON Object in Log File to Dataframe
I haven't worked with RStudio much, and when I did, I used CSV files.
Now I have a log file that contains multiple JSON objects. It looks like this:
{"count": 3, "material": wood, "systemnumber": 284, "cost": 45}
{"count": 4, "material": steel, "systemnumber": 360, "cost": 67}
{"count": 10, "material": plastic, "systemnumber": 210, "cost": 15}
...
...
Now I would like a dataframe that looks like this:
| |count|material|systemnumber|cost|
|---|---|---|---|---|
|1|3|wood|284|45|
|1|4|steel|360|67|
|1|10|plastic|210|15|
...
...
The normal import doesn't work, and neither does fromJSON, since it is a log file.
How can I convert it to a data frame?
Solution 1:[1]
That's not valid JSON: a JSON file can contain only one root value. What you posted stores one JSON document per line, a format typically used for logs and big data files because new records can be appended without reading and rewriting the entire file. It also allows sending JSON records over a stream (e.g. a network stream).
There's no standard for this; it's just something people started doing more than a decade ago. You'll find it described as streaming JSON, newline-delimited JSON, JSON-per-line, JSON Lines, etc. All of these refer to the same technique, even when some people try to present it as some kind of standard.
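To make the difference concrete, here is a minimal sketch that parses two records by hand. The sample lines are made up for illustration, and the string values are quoted (unlike in the question) because each line has to be valid JSON on its own. Parsing the whole text as a single document fails, while parsing line by line works:

```r
library(jsonlite)

# Hypothetical sample records, one JSON document per line
lines <- c(
  '{"count": 3, "material": "wood", "systemnumber": 284, "cost": 45}',
  '{"count": 4, "material": "steel", "systemnumber": 360, "cost": 67}'
)

# Treating the text as ONE JSON document fails: it has two root values
whole <- tryCatch(
  fromJSON(paste(lines, collapse = "\n")),
  error = function(e) NULL
)

# Parsing each line as its own document works
records <- lapply(lines, fromJSON)

# Turn each record into a one-row data frame and stack the rows
df_manual <- do.call(
  rbind,
  lapply(records, as.data.frame, stringsAsFactors = FALSE)
)
print(df_manual)
```

This is only meant to show what the format is; for real log files a streaming reader avoids the per-line overhead.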
jsonlite handles this through the stream_in and stream_out functions, as described in Streaming JSON input/output. Using some snippets from the article's example:
> mydata <- stream_in(url("http://jeroen.github.io/data/diamonds.json"))
opening url input connection.
Imported 53940 records. Simplifying...
closing url input connection.
> summary(mydata)
carat cut color clarity
Min. :0.2000 Length:53940 Length:53940 Length:53940
1st Qu.:0.4000 Class :character Class :character Class :character
Median :0.7000 Mode :character Mode :character Mode :character
Mean :0.7979
3rd Qu.:1.0400
Max. :5.0100
depth table price x
Min. :43.00 Min. :43.00 Min. : 326 Min. : 0.000
1st Qu.:61.00 1st Qu.:56.00 1st Qu.: 950 1st Qu.: 4.710
Median :61.80 Median :57.00 Median : 2401 Median : 5.700
Mean :61.75 Mean :57.46 Mean : 3933 Mean : 5.731
3rd Qu.:62.50 3rd Qu.:59.00 3rd Qu.: 5324 3rd Qu.: 6.540
Max. :79.00 Max. :95.00 Max. :18823 Max. :10.740
y z
Min. : 0.000 Min. : 0.000
1st Qu.: 4.720 1st Qu.: 2.910
Median : 5.710 Median : 3.530
Mean : 5.735 Mean : 3.539
3rd Qu.: 6.540 3rd Qu.: 4.040
Max. :58.900 Max. :31.800
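Applied to a local log file like the one in the question, the same call might look like the sketch below. The file is a temporary one created just for this example, and the string values are quoted on the assumption that the real log contains valid JSON on each line; if the values really are unquoted, the lines would need to be repaired before parsing.

```r
library(jsonlite)

# Hypothetical stand-in for the real log file: one JSON object per line
log_path <- tempfile(fileext = ".log")
writeLines(c(
  '{"count": 3, "material": "wood", "systemnumber": 284, "cost": 45}',
  '{"count": 4, "material": "steel", "systemnumber": 360, "cost": 67}',
  '{"count": 10, "material": "plastic", "systemnumber": 210, "cost": 15}'
), log_path)

# stream_in() reads the connection line by line and simplifies
# the records into a data frame; it opens and closes the connection itself
df <- stream_in(file(log_path), verbose = FALSE)
print(df)
```

With a real file, replace log_path with the path to the log.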
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Panagiotis Kanavos |
