Merged Azure Spark Delta table in Dataverse not showing expected value
I created a simple Synapse Delta Lake table via:
CREATE TABLE IF NOT EXISTS db1.tbl1 (
id INT NOT NULL,
name STRING NOT NULL
)
USING DELTA
I've merged rows of data into it multiple times with different test values for 'name'. If I select the rows, I see my most recent merge as expected, e.g.:
+---+-------+
| id|   name|
+---+-------+
|  1|   adam|
|  2|   bob8|
|  3|charles|
+---+-------+
I also have a Synapse pipeline built with the 'Copy Data tool'. Its source reads from the ADLS container and folder that hold the Parquet files for the Synapse Delta Lake table (using a * wildcard for the files), and its sink is the Dataverse table, configured with an 'alternate key name' of the 'id' field for the 'upsert' to Dataverse.
I've tested this multiple times, and the 'name' value in the Dataverse table seems to almost randomly take a value from one of my merges. Does anyone know what I'm doing wrong? The way I'm defining my source seems suspicious, since it just uses a wildcard for the files, but I don't know how else to do it. A PySpark SQL select on the table shows the correct, most recently merged rows, so I'm probably doing something wrong when sinking this to my Dataverse table.
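A possible explanation for the wildcard suspicion, offered as an assumption rather than a confirmed diagnosis: a Delta table keeps the Parquet files written by earlier merges in the same folder, and only the transaction log under _delta_log records which files belong to the current version. A * wildcard would therefore pick up superseded files as well, so several values for the same 'id' could reach the upsert. The sketch below uses only the Python standard library to replay the JSON commits and list the files that are still active. It ignores Parquet checkpoints, so it only suits small tables whose log has not been cleaned up, and the file names in demo() are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def active_delta_files(table_dir):
    """Replay the JSON commits in _delta_log and return the relative paths
    of the data files that make up the latest table version."""
    log_dir = Path(table_dir) / "_delta_log"
    active = set()
    # Commit files are zero-padded version numbers, so sorting by name
    # replays them in version order. Non-numeric names are skipped.
    commits = sorted(p for p in log_dir.glob("*.json") if p.stem.isdigit())
    for commit in commits:
        with commit.open() as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                action = json.loads(line)
                if "add" in action:
                    active.add(action["add"]["path"])
                elif "remove" in action:
                    active.discard(action["remove"]["path"])
    return sorted(active)


def demo():
    """Fake log: commit 0 adds one file, commit 1 (a merge) rewrites it."""
    with tempfile.TemporaryDirectory() as tmp:
        log_dir = Path(tmp) / "_delta_log"
        log_dir.mkdir()
        (log_dir / f"{0:020d}.json").write_text(
            json.dumps({"add": {"path": "part-0000-a.parquet"}}) + "\n"
        )
        (log_dir / f"{1:020d}.json").write_text(
            json.dumps({"remove": {"path": "part-0000-a.parquet"}}) + "\n"
            + json.dumps({"add": {"path": "part-0000-b.parquet"}}) + "\n"
        )
        # Both Parquet files would still sit in the folder, but only the
        # second one is part of the current table version.
        return active_delta_files(tmp)
```

If this is the cause, it may be simpler to have a notebook read the table through Spark and write the current snapshot to a separate staging folder as plain Parquet, then point the copy source at that folder instead of the Delta table's own folder.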
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
