Extracting Long/Lat from a shapefile and calculating nearest distance between two sets of coordinates

I have two sets of data that I'm working with. The first is a list of addresses and their respective longitude and latitude coordinates:

library(dplyr)
library(tidygeocoder)
library(tigris)
library(rgeos)
library(sf)

    addresses <-
      structure(
        list(
          id = c(
            107234063L,
            106950145L,
            107256562L,
            107277550L,
            106952865L,
            106858955L,
            104019143L,
            102264960L,
            101690658L,
            107259458L
          ),
          streetno = c(12700L, 2016L, 311L, 3405L,
                       2400L, 711L, 2400L, 406L, 14002L, 1502L),
          streetname = c(
            "Stafford",
            "Dunlavy",
            "Branard",
            "Shepherd",
            "Fountain View",
            "William",
            "Braeswood",
            "Hawthorne",
            "Hempstead",
            "Quitman"
          ),
          city = c(
            "Stafford",
            "Houston",
            "Houston",
            "Houston",
            "Houston",
            "Houston",
            "Houston",
            "Houston",
            "Houston",
            "Houston"
          ),
          state = c("TX", "TX", "TX",
                    "TX", "TX", "TX", "TX", "TX", "TX", "TX"),
          zip5 = c(
            77477L,
            77006L,
            77006L,
            77018L,
            77057L,
            77002L,
            77030L,
            77006L,
            77040L,
            77009L
          ),
          vaddress = c(
            "12700 Stafford Rd",
            "2016 Dunlavy St",
            "311 Branard St",
            "3405 N Shepherd Dr",
            "2400 Fountain View Dr",
            "711 William St",
            "2400 N Braeswood Blvd",
            "406 Hawthorne St",
            "14002 Hempstead Rd",
            "1502 Quitman St"
          ),
          countyname = c(
            "Harris",
            "Harris",
            "Harris",
            "Harris",
            "Harris",
            "Harris",
            "Harris",
            "Harris",
            "Harris",
            "Harris"
          ),
          hdname = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
          complete_address = c(
            "12700 Stafford Rd, Stafford, TX 77477",
            "2016 Dunlavy St, Houston, TX 77006",
            "311 Branard St, Houston, TX 77006",
            "3405 N Shepherd Dr, Houston, TX 77018",
            "2400 Fountain View Dr, Houston, TX 77057",
            "711 William St, Houston, TX 77002",
            "2400 N Braeswood Blvd, Houston, TX 77030",
            "406 Hawthorne St, Houston, TX 77006",
            "14002 Hempstead Rd, Houston, TX 77040",
            "1502 Quitman St, Houston, TX 77009"
          ),
          longlat = structure(
            list(
              singlelineaddress = c(
                "12700 Stafford Rd, Stafford, TX 77477",
                "2016 Dunlavy St, Houston, TX 77006",
                "311 Branard St, Houston, TX 77006",
                "3405 N Shepherd Dr, Houston, TX 77018",
                "2400 Fountain View Dr, Houston, TX 77057",
                "711 William St, Houston, TX 77002",
                "2400 N Braeswood Blvd, Houston, TX 77030",
                "406 Hawthorne St, Houston, TX 77006",
                "14002 Hempstead Rd, Houston, TX 77040",
                "1502 Quitman St, Houston, TX 77009"
              ),
              lat = c(
                29.634542,
                29.74779,
                29.737116,
                29.817532,
                29.742098,
                29.76709,
                29.698315,
                29.742435,
                29.850826,
                29.783348
              ),
              long = c(
                -95.545334,
                -95.40214,
                -95.383736,
                -95.41044,
                -95.48542,
                -95.35345,
                -95.415306,
                -95.38407,
                -95.52886,
                -95.35297
              )
            ),
            row.names = c(NA,-10L),
            class = c("tbl_df", "tbl", "data.frame")
          )
        ),
        row.names = 353:362,
        class = "data.frame"
      )

The next set of coordinates comes from downloading the following shapefile:

texas_hd <- state_legislative_districts("TX", house = "lower")

So what I'm looking to do is twofold:

  1. Extract the long/lat coordinates from the texas_hd file

  2. Create a new column in addresses that maps each row's long/lat coordinates to the nearest SLDLST code from texas_hd

So, for example, the address 2016 Dunlavy St, Houston, TX 77006 should map to SLDLST 134.



Solution 1:[1]

Please find below one possible solution to answer your two requests.

Reprex

Question 1

  • Option 1 - Code for "raw" coordinate extraction
library(sf)
library(tigris)
library(tidyr)
library(dplyr)

texas_hd <- state_legislative_districts("TX", house = "lower")

texas_hd_coords <- texas_hd %>% 
  sf::st_coordinates() %>% 
  as.data.frame() %>% 
  dplyr::select(X,Y) %>% 
  dplyr::rename(long = X, lat = Y)
  • Output for option 1 - "Raw" coordinate extraction
head(texas_hd_coords)
#>        long      lat
#> 1 -97.81123 30.23370
#> 2 -97.81115 30.23384
#> 3 -97.81105 30.23400
#> 4 -97.81096 30.23416
#> 5 -97.81037 30.23516
#> 6 -97.80993 30.23587
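
Option 1 drops the index columns that st_coordinates() returns alongside X and Y, so the vertices can no longer be traced back to a district. As a minimal sketch, assuming the geometries are MULTIPOLYGONs (in which case the L3 column holds the row number of the feature each vertex came from), those indices can be used to tag every vertex with its district code. The toy_hd object, the square() helper and the two square districts are made up for illustration and stand in for texas_hd.

```r
library(sf)

# Hypothetical helper: a closed square ring wrapped as a MULTIPOLYGON.
square <- function(xmin, ymin, size = 0.1) {
  ring <- rbind(
    c(xmin, ymin),
    c(xmin + size, ymin),
    c(xmin + size, ymin + size),
    c(xmin, ymin + size),
    c(xmin, ymin)  # repeat the first vertex to close the ring
  )
  st_multipolygon(list(list(ring)))
}

# Hypothetical stand-in for texas_hd with two districts.
toy_hd <- st_sf(
  SLDLST = c("134", "147"),
  geometry = st_sfc(square(-95.5, 29.7), square(-95.4, 29.7), crs = 4269)
)

# For MULTIPOLYGON geometries st_coordinates() returns X, Y, L1, L2, L3;
# L3 is the index of the feature (row) that each vertex belongs to.
coords <- as.data.frame(st_coordinates(toy_hd))

# Use the feature index to attach the district code to every vertex.
toy_hd_coords <- data.frame(
  SLDLST = toy_hd$SLDLST[coords$L3],
  long = coords$X,
  lat = coords$Y
)

head(toy_hd_coords)
```

If the geometries were plain POLYGONs instead, the feature index would be in L2 rather than L3, so it is worth checking the geometry type first.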
  • Option 2 - Code to get a list of coordinates by polygon

I am not sure what you mean by "Extract the long/lat coordinates from the texas_hd file", but this option makes more sense to me. Here, the result texas_hd_coords_by_polys is a list whose length equals the number of polygons in the sf object texas_hd (i.e. 150), so you can access the coordinates of a specific polygon.

texas_hd_coords_by_polys <- texas_hd %>%
  {lapply(st_geometry(.), sf::st_coordinates)} %>% 
  {lapply(., as.data.frame)} %>% 
  {lapply(., dplyr::select,X,Y)} %>% 
  {lapply(., dplyr::rename, long = X, lat = Y)}
  • Output for option 2: list of coordinates by polygon
class(texas_hd_coords_by_polys)
#> [1] "list"
length(texas_hd_coords_by_polys)
#> [1] 150

# Example to access the first coordinates of the polygon #135 from the list
# 'texas_hd_coords_by_polys'.
head(texas_hd_coords_by_polys[[135]])
#>        long      lat
#> 1 -95.46224 32.12897
#> 2 -95.46163 32.12910
#> 3 -95.46111 32.12899
#> 4 -95.46079 32.12904
#> 5 -95.46076 32.12924
#> 6 -95.46070 32.12955
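
Note that the position of a polygon in this list is its row number in texas_hd, which is not necessarily its district code. A small sketch of one way to look coordinates up by SLDLST instead, using a made-up two-district object (toy_hd, built with a hypothetical ring() helper) in place of texas_hd:

```r
library(sf)

# Hypothetical helper: coordinates of a closed square ring.
ring <- function(xmin, ymin, size = 0.1) {
  rbind(
    c(xmin, ymin),
    c(xmin + size, ymin),
    c(xmin + size, ymin + size),
    c(xmin, ymin + size),
    c(xmin, ymin)  # repeat the first vertex to close the ring
  )
}

# Hypothetical stand-in for texas_hd; rows are deliberately not in code order.
toy_hd <- st_sf(
  SLDLST = c("147", "134"),
  geometry = st_sfc(
    st_polygon(list(ring(-95.4, 29.7))),
    st_polygon(list(ring(-95.5, 29.7))),
    crs = 4269
  )
)

# One data frame of vertices per district, as in option 2.
coords_by_district <- lapply(st_geometry(toy_hd), function(g) {
  xy <- as.data.frame(st_coordinates(g))
  data.frame(long = xy$X, lat = xy$Y)
})

# Name the list elements by district code so they can be looked up by SLDLST.
names(coords_by_district) <- toy_hd$SLDLST

head(coords_by_district[["134"]])
```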

Question 2

  • Code

NB: Not knowing exactly what you want in the final result, the last line of code drops most of the columns that come from texas_hd, keeping only STATEFP and SLDLST. If necessary, you can keep any of the others by removing them from select(-c(...)).

addresses <- addresses %>% 
  dplyr::as_tibble() %>% 
  tidyr::unnest(., cols = c(longlat)) %>% 
  sf::st_as_sf(., coords = c("long", "lat"), crs = 4326) %>% 
  sf::st_transform(., crs = 4269) %>% 
  sf::st_join(., texas_hd, join = st_nearest_feature) %>% 
  dplyr::select(-c(GEOID, NAMELSAD, LSAD, LSY, MTFCC, FUNCSTAT, ALAND, AWATER, INTPTLAT, INTPTLON))
  • Output
addresses

#> Simple feature collection with 10 features and 13 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -95.54533 ymin: 29.63454 xmax: -95.35297 ymax: 29.85083
#> Geodetic CRS:  NAD83
#> # A tibble: 10 x 14
#>           id streetno streetname    city     state  zip5 vaddress              countyname hdname complete_address            singlelineaddre~             geometry STATEFP SLDLST
#>        <int>    <int> <chr>         <chr>    <chr> <int> <chr>                 <chr>       <int> <chr>                       <chr>                     <POINT [°]> <chr>   <chr> 
#>  1 107234063    12700 Stafford      Stafford TX    77477 12700 Stafford Rd     Harris          0 12700 Stafford Rd, Staffor~ 12700 Stafford ~ (-95.54533 29.63454) 48      027   
#>  2 106950145     2016 Dunlavy       Houston  TX    77006 2016 Dunlavy St       Harris          0 2016 Dunlavy St, Houston, ~ 2016 Dunlavy St~ (-95.40214 29.74779) 48      134   
#>  3 107256562      311 Branard       Houston  TX    77006 311 Branard St        Harris          0 311 Branard St, Houston, T~ 311 Branard St,~ (-95.38374 29.73712) 48      147   
#>  4 107277550     3405 Shepherd      Houston  TX    77018 3405 N Shepherd Dr    Harris          0 3405 N Shepherd Dr, Housto~ 3405 N Shepherd~ (-95.41044 29.81753) 48      148   
#>  5 106952865     2400 Fountain View Houston  TX    77057 2400 Fountain View Dr Harris          0 2400 Fountain View Dr, Hou~ 2400 Fountain V~  (-95.48542 29.7421) 48      133   
#>  6 106858955      711 William       Houston  TX    77002 711 William St        Harris          0 711 William St, Houston, T~ 711 William St,~ (-95.35345 29.76709) 48      142   
#>  7 104019143     2400 Braeswood     Houston  TX    77030 2400 N Braeswood Blvd Harris          0 2400 N Braeswood Blvd, Hou~ 2400 N Braeswoo~ (-95.41531 29.69832) 48      134   
#>  8 102264960      406 Hawthorne     Houston  TX    77006 406 Hawthorne St      Harris          0 406 Hawthorne St, Houston,~ 406 Hawthorne S~ (-95.38407 29.74244) 48      147   
#>  9 101690658    14002 Hempstead     Houston  TX    77040 14002 Hempstead Rd    Harris          0 14002 Hempstead Rd, Housto~ 14002 Hempstead~ (-95.52886 29.85083) 48      139   
#> 10 107259458     1502 Quitman       Houston  TX    77009 1502 Quitman St       Harris          0 1502 Quitman St, Houston, ~ 1502 Quitman St~ (-95.35297 29.78335) 48      148 

Created on 2022-02-02 by the reprex package (v2.0.1)
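
The title also asks for the distance to the nearest district, which the join above does not return. As a hedged sketch: st_nearest_feature() gives the row index of the nearest polygon for each point, and st_distance() with by_element = TRUE then measures each point against its matched polygon. A point that lies inside a district gets a distance of zero. The toy districts, the toy points and the ring() helper below are invented for illustration; with the real data, texas_hd and the sf version of addresses would take their place.

```r
library(sf)

# Hypothetical helper: coordinates of a closed square ring.
ring <- function(xmin, ymin, size = 0.1) {
  rbind(
    c(xmin, ymin),
    c(xmin + size, ymin),
    c(xmin + size, ymin + size),
    c(xmin, ymin + size),
    c(xmin, ymin)  # repeat the first vertex to close the ring
  )
}

# Hypothetical districts: two adjacent squares standing in for texas_hd.
toy_hd <- st_sf(
  SLDLST = c("134", "147"),
  geometry = st_sfc(
    st_polygon(list(ring(-95.5, 29.7))),
    st_polygon(list(ring(-95.4, 29.7))),
    crs = 4269
  )
)

# Hypothetical points: inside the first square, inside the second,
# and east of both squares.
toy_pts <- st_as_sf(
  data.frame(
    id = 1:3,
    long = c(-95.45, -95.35, -95.25),
    lat = c(29.75, 29.75, 29.75)
  ),
  coords = c("long", "lat"),
  crs = 4269
)

# Row index of the nearest district for each point.
nearest <- st_nearest_feature(toy_pts, toy_hd)
toy_pts$SLDLST <- toy_hd$SLDLST[nearest]

# Distance from each point to its matched district; for a geographic CRS
# such as NAD83, sf reports this in metres.
toy_pts$dist_m <- as.numeric(
  st_distance(toy_pts, toy_hd[nearest, ], by_element = TRUE)
)

toy_pts
```

Points with a distance of zero fall inside their district, so filtering on dist_m > 0 is one way to spot addresses that were only matched by proximity.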

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1