EOF within quoted string; warning message

I have a dataframe (df1) with ~2800 rows and one (1) column (let's call it column 'X'). I have another dataframe (df2) with ~2,170,000 rows and 51 columns, and it contains column X as well.

I was trying to see how many of the ~2800 X's (df1) are in the larger dataframe (df2), and I received this:

Warning message:
In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
  EOF within quoted string

I noticed that only about half of the rows were loaded into my R session. A quick search showed that I could disable quoting by adding quote = "" at the end of the read command, which bypasses the warning and loads the full data:

df <- read.table("C:/Users/.../.txt", header = TRUE, sep = "|", fill = TRUE, quote = "")

With quoting disabled, I had my full 2.17 million rows instead of about half. I ran an anti_join command on the data both before and after I disabled quotes: before, I had ~1300 matches of the 2800, and after, I had ~300 matches of the 2800.
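
To make the row-count difference concrete, here is a minimal sketch on a tiny made-up file. The sample lines and the object names (`lines`, `tmp`, `with_quotes`, `no_quotes`) are assumptions for illustration only, not the real data. It shows how a single unbalanced quote character can swallow the rest of a file under the default quoting, and how quote = "" keeps one row per physical line.

```r
# Hypothetical sample data: the record starting with A2 holds an unbalanced double quote.
lines <- c("X|name",
           "A1|alpha",
           "A2|5\" pipe",
           "A3|gamma",
           "A4|delta")
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Default quoting: the stray quote opens a quoted string that never closes,
# so the remaining lines are absorbed into a single field
# (this is what triggers "EOF within quoted string").
with_quotes <- suppressWarnings(
  read.table(tmp, header = TRUE, sep = "|", fill = TRUE,
             stringsAsFactors = FALSE)
)

# Quoting disabled: quote characters are treated as ordinary text,
# so every physical line becomes its own row.
no_quotes <- read.table(tmp, header = TRUE, sep = "|", fill = TRUE,
                        quote = "", stringsAsFactors = FALSE)

nrow(with_quotes)  # fewer rows than the file has data lines
nrow(no_quotes)    # one row per data line
```

Note that with quote = "" any quote characters in the file stay inside the values, so a key that was written as a quoted string keeps its literal quotes.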

Any ideas as to why disabling quotes would lead to fewer matches? I was expecting to match roughly 95% of the data (df1) within the larger dataframe (df2). I can see how disabling quotes might change the number of matches, but I am looking for a workaround that retains at least the matches I had before disabling quotes. Thank you in advance! I'm including my full code below as well:

# packages
library(dplyr)
library(writexl)
library(readr)

# load
setwd('C:/Users/...')
df_1 <- read.csv("C:/Users/.../df1.csv", header = TRUE, na.strings = c("", "NA"))
df_2 <- read.table("C:/Users/.../df2.txt", header = TRUE, sep = "|", fill = TRUE, quote = "")

# anti join
df_missing <- anti_join(df_1, df_2, by="X")
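
One check that may help here, offered as a sketch rather than a confirmed diagnosis: anti_join returns the rows of df_1 that have no match in df_2, so the number of rows in df_missing counts non-matches, not matches. The base R snippet below counts both directly with %in%. It also normalises the key first, because with quote = "" any literal quote characters stay in the values. The small data frames and the helper `clean_key` are hypothetical and exist only for illustration.

```r
# Hypothetical stand-ins for df_1 and df_2; the values are invented for illustration.
df_1 <- data.frame(X = c("A1", "A2", "A3", "A4"), stringsAsFactors = FALSE)
df_2 <- data.frame(X = c("\"A1\"", " A2", "A3", "B9"), stringsAsFactors = FALSE)

# Hypothetical helper: trim surrounding whitespace, then drop
# leading and trailing quote characters from a key.
clean_key <- function(x) {
  x <- trimws(as.character(x))
  gsub("^[\"']+|[\"']+$", "", x)
}

# TRUE where a df_1 key is present in df_2 (the rows semi_join would keep);
# the negation marks the rows anti_join would return.
raw_matched   <- df_1$X %in% df_2$X
clean_matched <- clean_key(df_1$X) %in% clean_key(df_2$X)

sum(raw_matched)     # matches on the raw keys
sum(!raw_matched)    # non-matches on the raw keys (the anti_join row count)
sum(clean_matched)   # matches after normalising both keys
```

If the cleaned keys match noticeably more often than the raw ones, stray quotes or whitespace in column X are a likely cause of the difference.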


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
