'Open a text file containing a table with no clear separator
I have to open a text file which contain data that I want to open as a tibble or dataframe. The data are stored in a tbale with white space as separator (I suppose). The probleme is that whitespace is also used in the column "PI". I have no idea how to open it without doing any complicated script to scan all of it and parse it with regex. Do you have any tips ? I think it was designed to have visualy aligned column when we open it as a text file.
here is a sample of the file.
DAC PI SPROF_DATE_UPDATE DATA_MODE WMO CYCLE PROFILE_DATE LON LAT MPD_RAW MPD_ADJ Z_RAW Z_ADJ
aoml STEPHEN RISER , KENNETH JOHNSON 2021-05-12 18:52:52 R 5903612 038 2012-07-04 01:15:50 8.651 -41.466 -89.313473 NaN 5.439052 NaN
aoml STEPHEN RISER , KENNETH JOHNSON 2020-12-03 02:03:33 R 5903717 138 2014-02-03 18:07:45 -164.533 -69.958 134.866767 NaN 7.884637 NaN
aoml STEPHEN RISER , KENNETH JOHNSON 2021-05-12 22:44:39 R 5903717 139 2014-02-10 20:05:49 -164.302 -70.061 99.848464 NaN 5.803396 NaN
aoml STEPHEN RISER , KENNETH JOHNSON 2021-05-12 22:44:53 R 5903717 140 2014-02-17 22:06:53 -164.146 -70.246 107.535752 NaN 6.260275 NaN
Solution 1:[1]
- Read the file as lines of text
- Replace multiple consecutive spaces with a custom and unique delimiter
- Paste everything into a character vector
- read from that data using the custom delimiter
- Additional re-formatting like trimming and type conversion
library(tidyverse)
read_lines("data.txt") %>%
map_chr(~ .x %>% str_replace_all("( )+", ";")) %>%
paste0(collapse = "\n") %>%
read_delim(delim = ";", col_types = cols(.default = "character")) %>%
mutate(across(everything(), str_trim)) %>%
type_convert()
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | danlooo |
