'Improve insert performance for a C# application into SQL database
I need to improve the performance of my insert in my C# application. I first go out and get data from a view. Then I go through a FOREACH
loop to insert into a table. I have over 200,000 records that I am working with and it takes an ridiculous amount of time to perform this task. I know the SaveChanges
is a round trip to the database but I'm not sure how to get around this. Is there something I can do to improve the time?
var values = db.TodaysAirs.ToList();
foreach (TodaysAir x in values)
{
//check to see if this is a new value or one that needs to be updated
var checkForNew = db.TodaysAirValues
.Where(m => m.ID == x.ID);
//new record
if (checkForNew.Count() == 0)
{
TodaysAirValue newRecord = new TodaysAirValue();
newRecord.ID = x.ID;
newRecord.Logger_Id = x.Logger_Id;
newRecord.SiteName = x.SiteName;
newRecord.Latitude = x.Latitude;
newRecord.Longitude = x.Longitude;
newRecord.Hour = x.Hour;
newRecord.Parameter = x.Parameter;
newRecord.Stan = x.Stan;
newRecord.Units = x.Units;
newRecord.InstrumentType = x.InstrumentType;
newRecord.NowCast = x.NowCast;
newRecord.AQIValue = x.AQIValue;
newRecord.HealthCategory = x.HealthCategory;
newRecord.Hr24Avg = x.Hr24Avg;
newRecord.Hr24Max = x.Hr24Max;
newRecord.Hr24Min = x.Hr24Min;
newRecord.SID = DateTime.Now;
db.TodaysAirValues.Add(newRecord);
db.SaveChanges();
// CallJenkinsJob();
}
}
Solution 1:[1]
The goal should be to run a single raw SQL statement that will look something very much like this:
INSERT INTO TodaysAirValues
(ID, Logger_id, SiteName, Latitude, Longitude, Hour, Parameter,
Stan, Units, InstrumentType, NowCast, AQIValue, HealthCategory,
Hr24Avg, Hr24Max, Hr24Min, SID)
SELECT ta.ID, ta.Logger_id, ta.SiteName, ta.Latitude, ta.Longitude,
ta.Hour, ta.Parameter, ta.Stan, ta.Units, ta.InstrumentType,
ta.NowCast, ta.AQIValue, ta.HealthCategory, ta.Hr24Avg,
ta.Hr24Max, ta.Hr24Min, current_timestamp
FROM TodaysAirs ta
LEFT JOIN TodaysAirValues tav ON tav.ID = ta.ID
WHERE tav.ID IS NULL
This might not have all the table or column names exactly right, if there are any differences from the EF mapping with the database. You might also get it to go even a little faster using NOT EXISTS()
rather than the LEFT JOIN WHERE NULL
technique.
I also see this:
if the Count is greater than 0 it checks to see if any changes where made and if so update the record.
In that case, you can still get this down to just two SQL commands if you precede (run this additional command first!) the INSERT above with an UPDATE that looks something like this:
UPDATE tav
SET tav.ID=ta.DI, tav.Logger_id=ta.Logger_id, tav.SiteName=ta.SiteName,
tav.Latitude=ta.Latitude, tav.Longitude=ta.Longitude, tav.Hour=ta.Hour,
tav.Parameter=ta.Parameter, tav.Stan=ta.Stan, tav.Units=ta.Units,
tav.InstrumentType=ta.InstrumentType, tav.NowCast=ta.NowCast,
tav.AQIValue=ta.AQIValue, tav.HealthCategory=ta.HealthCategory,
tav.Hr24Avg=ta.Hr24Avg,tav.Hr24Max=ta.Hr24Max, tav.Hr24Min=ta.Hr24Min,
tav.SID=ta.SID -- possibly current_timestamp here instead
FROM TodaysAirs ta
INNER JOIN TodaysAirValues tav ON tav.ID = ta.ID
WHERE (
-- compare here to decide if the record needs to update or not
)
Unfortunately I don't have enough info about what you want that look like to give you the full code.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |