Some rough notes about my investigations into different techniques to perform batch database updates.
It’s not efficient to send individual queries from an application to the database server. What techniques can be used from .net to send batch updates?
ETL
Most ETL processes probably uses either Bulk Insert, BCP, SSIS, or some fun third party product.
ADO.NET Options
Array and function
Use an Sql function to convert to table variable, and join on the table variable.
Old but nice explanantion of this technique – http://odetocode.com/code/365.aspx
Pass delimited list of “things” you wish to update. e.g. customer id’s “1|2|14|42”.
Table variable
Construct a table variable in code using SqlDbType.Structured and specify the table type
Pass to stored procedure or inline sqlCreate a User Defined Table Type
http://msdn.microsoft.com/en-us/library/bb675163.aspx
ADO.NET batch size
Set an UpdateBatchSize. Configure, update, insert, delete behaviour
http://msdn.microsoft.com/en-us/library/aadf8fk2.aspx
SqlBulkCopy
Like the name suggests copy only. Sql Server only.
Microsoft describe functionality as a managed equivalent to the bcp utility.
http://www.sqlteam.com/article/use-sqlbulkcopy-to-quickly-load-data-from-your-client-to-sql-server
ORM Tools
Generally, ORM tools are not built to handle batch processing. As an illustration, Rhino ETL exists, rather than the author using NHibernate as you might expect – http://ayende.com/Blog/archive/2008/01/16/Rhino-ETL-2.0.aspx
As a rule of thumb, the techniques available in ORM tools are suitable for “small” batch updates that occur in an application, not for creating ETL processes.
NHibernate
Can pass HQL across or set batch size – http://stackoverflow.com/questions/780940/batch-update-in-nhibernate
Linq 2 Sql
Nothing out of the box. Normally fire insert/update/delete statement for each object you wish to update. Can create extension methods to get batch like behaviour. http://www.aneyfamily.com/terryandann/post/2008/04/Batch-Updates-and-Deletes-with-LINQ-to-SQL.aspx
Simple.Data
Need to explore:
Can use dynamic api – db.Foo.UpdateByX(X: “Bar”, Y: “Quux”, Z: 42) updates all Foo where X = “Bar”, sets Y and Z.
Entity Framework
Nothing out of the box. Can specify SQL to run – http://stackoverflow.com/questions/1781175/entity-framework-4-multiple-object-deleteremoveall