Passing array parameters to a stored procedure
I've got a process that grabs a bunch of records (thousands) and operates on them, and when I'm done, I need to mark a large number of them as processed. I can indicate this with a big list of IDs. I'm trying to avoid the "updates in a loop" pattern, so I'd like to find a more efficient way to send this bag of IDs into a MS SQL Server 2008 stored proc.
Proposal #1 - Table Valued Parameters. I can define a table type w/ just an ID field and send in a table full of IDs to update.
Proposal #2 - XML parameter (varchar) with OPENXML() in proc body.
Proposal #3 - List parsing. I'd rather avoid this, if possible, as it seems unwieldy and error-prone.
Any preference among these, or any ideas I've missed?
I'm pulling them down along with "payload" data via another stored proc. I don't need to update all that data, though -- just update a flag on certain records.
The best ever articles on this matter are by Erland Sommarskog:
- Arrays and Lists in SQL Server 2008 - Using Table-Valued Parameters
- Arrays and Lists in SQL Server 2005 and Beyond - When TVPs Do Not Cut It
- Arrays and Lists in SQL Server 2000 and Earlier
He covers all the options and explains them pretty well.
Sorry for the shortness of the answer, but Erland's article on Arrays is like Joe Celko's books on trees and other SQL treats :)
There is a great discussion of this on Stack Overflow that covers many approaches. The one I prefer for SQL Server 2008+ is to use table-valued parameters. This is essentially SQL Server's solution to your problem: passing a list of values to a stored procedure.
The advantages of this approach are:
- make one stored procedure call with all your data passed in as one parameter
- table input is structured and strongly typed
- no string building/parsing or handling of XML
- can easily use table input to filter, join, or whatever
However, take note: if you call a stored procedure that uses TVPs via ADO.NET or ODBC and look at the activity with SQL Server Profiler, you will notice that SQL Server receives several `INSERT` statements to load the TVP, one for each row in the TVP, followed by the call to the procedure. This is by design. This batch of `INSERT`s needs to be compiled every time the procedure is called, and constitutes a small overhead. However, even with this overhead, TVPs still blow away other approaches in terms of performance and usability for the majority of use cases.

If you want to learn more, Erland Sommarskog has the full skinny on how table-valued parameters work and provides several examples.
Here is another example I concocted:
```sql
CREATE TYPE id_list AS TABLE (
    id int NOT NULL PRIMARY KEY
);
GO

CREATE PROCEDURE [dbo].[tvp_test]
(
    @param1        INT,
    @customer_list id_list READONLY
)
AS
BEGIN
    SELECT @param1 AS param1;

    -- join, filter, do whatever you want with this table
    -- (other than modify it)
    SELECT * FROM @customer_list;
END;
GO

DECLARE @customer_list id_list;

INSERT INTO @customer_list (id)
VALUES (1), (2), (3), (4), (5), (6), (7);

EXECUTE [dbo].[tvp_test]
    @param1 = 5,
    @customer_list = @customer_list;
GO

DROP PROCEDURE dbo.tvp_test;
DROP TYPE id_list;
GO
```
When I run this I get an error:

```
Msg 2715, Level 16, State 3, Procedure tvp_test, Line 4 [Batch Start Line 4]
Column, parameter, or variable #2: Cannot find data type id_list.
Parameter or variable '@customer_list' has an invalid data type.
Msg 1087, Level 16, State 1, Procedure tvp_test, Line 13 [Batch Start Line 4]
Must declare the table variable "@customer_list".
```
@Damian - Did the `CREATE TYPE` statement at the beginning run successfully? What version of SQL Server are you running?
In the SP code you have this statement inline: `SELECT @param1 AS param1;`. What's its purpose? You don't make use of `@param1`, so why did you put it as a parameter in the SP header?
@EAmez - It was just an arbitrary example. The point is `@customer_list` not `@param1`. The example simply demonstrates that you can mix different parameter types.
The entire subject is discussed in the definitive article by Erland Sommarskog: "Arrays and Lists in SQL Server". Take your pick of which version to choose.
Summary, for pre-SQL Server 2008, where TVPs trump the rest:
- CSV, split how you like (I generally use a Numbers table)
- XML and parse (better with SQL Server 2005+)
- Create a temporary table on the client
The article is worth reading anyway to see other techniques and thinking.
Edit: late answer for huge lists elsewhere: Passing array parameters to a stored procedure
I know I am late for this party, but I had such a problem in the past, having to send up to 100K bigint numbers, and did a few benchmarks. We ended up sending them in binary format, as an image - that was faster than everything else for up to 100K numbers.
Here is my old (SQL Server 2005) code:
```sql
SELECT
    Number * 8 + 1 AS StartFrom,
    Number * 8 + 8 AS MaxLen
INTO dbo.ParsingNumbers
FROM dbo.Numbers
GO

CREATE FUNCTION dbo.ParseImageIntoBIGINTs (@BIGINTs IMAGE)
RETURNS TABLE
AS
RETURN
(
    SELECT CAST(SUBSTRING(@BIGINTs, StartFrom, 8) AS BIGINT) AS Num
    FROM dbo.ParsingNumbers
    WHERE MaxLen <= DATALENGTH(@BIGINTs)
)
GO
```
The following code is packing integers into a binary blob. I am reversing the order of bytes here:
```csharp
static byte[] UlongsToBytes(ulong[] ulongs)
{
    int ifrom = ulongs.GetLowerBound(0);
    int ito = ulongs.GetUpperBound(0);
    int len = (ito - ifrom + 1) * 8;
    byte[] ret = new byte[len];
    int retind = 0;

    for (int i = ifrom; i <= ito; i++)
    {
        ulong v = ulongs[i];
        // big-endian: most significant byte first
        ret[retind++] = (byte)(v >> 56);
        ret[retind++] = (byte)(v >> 48);
        ret[retind++] = (byte)(v >> 40);
        ret[retind++] = (byte)(v >> 32);
        ret[retind++] = (byte)(v >> 24);
        ret[retind++] = (byte)(v >> 16);
        ret[retind++] = (byte)(v >> 8);
        ret[retind++] = (byte)v;
    }

    return ret;
}
```
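To connect this back to the original question, here is a sketch of how the parsed IDs could drive a single set-based update. The procedure, table, and column names (`MarkProcessed`, `MyRecords`, `processed`, `ID`) are invented for illustration; only `ParseImageIntoBIGINTs` comes from the code above.

```sql
-- Hypothetical usage: @ids is the binary blob produced by UlongsToBytes,
-- sent from the client as a single IMAGE parameter.
CREATE PROCEDURE dbo.MarkProcessed (@ids IMAGE)
AS
UPDATE t
SET    t.processed = 1
FROM   dbo.MyRecords AS t
JOIN   dbo.ParseImageIntoBIGINTs(@ids) AS p
    ON p.Num = t.ID;
GO
```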
I'm torn between referring you to SO or answering it here, 'cos this is almost a programming question. But since I've already got a solution I use ... I'll post that ;)
The way this one works is you feed a comma-delimited string (a simple split; it doesn't do CSV-style splits) into the stored procedure as a varchar(4000), then feed that list into this function and get a handy table back out: a table of just varchars.
This allows you to send in the values of just the ids that you want processed, and you can do a simple join at that point.
Alternatively, you could do something with a CLR DataTable and feed that in, but that's a bit more overhead to support, and everyone understands CSV lists.
```sql
USE [Database]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER FUNCTION [dbo].[splitListToTable]
    (@list nvarchar(MAX), @delimiter nchar(1) = N',')
RETURNS @tbl TABLE (value varchar(4000) NOT NULL)
AS
/*
    http://www.sommarskog.se/arrays-in-sql.html
    This guy is apparently THE guy in SQL arrays and lists
    Need an easy non-dynamic way to split a list of strings on input
    for comparisons

    Usage like thus:

    DECLARE @sqlParam VARCHAR(MAX)
    SET @sqlParam = 'a,b,c'

    SELECT *
    FROM (
        SELECT 'a' AS col1, '1' AS col2 UNION
        SELECT 'a' AS col1, '2' AS col2 UNION
        SELECT 'b' AS col1, '3' AS col2 UNION
        SELECT 'b' AS col1, '4' AS col2 UNION
        SELECT 'c' AS col1, '5' AS col2 UNION
        SELECT 'c' AS col1, '6' AS col2
    ) x
    WHERE EXISTS (
        SELECT value
        FROM splitListToTable(@sqlParam, ',')
        WHERE x.col1 = value
    )
*/
BEGIN
    DECLARE @endpos   int,
            @startpos int,
            @textpos  int,
            @chunklen smallint,
            @tmpstr   nvarchar(4000),
            @leftover nvarchar(4000),
            @tmpval   nvarchar(4000)

    SET @textpos = 1
    SET @leftover = ''

    WHILE @textpos <= datalength(@list) / 2
    BEGIN
        SET @chunklen = 4000 - datalength(@leftover) / 2
        SET @tmpstr = @leftover + substring(@list, @textpos, @chunklen)
        SET @textpos = @textpos + @chunklen

        SET @startpos = 0
        SET @endpos = charindex(@delimiter, @tmpstr)

        WHILE @endpos > 0
        BEGIN
            SET @tmpval = ltrim(rtrim(substring(@tmpstr, @startpos + 1,
                                                @endpos - @startpos - 1)))
            INSERT @tbl (value) VALUES (@tmpval)
            SET @startpos = @endpos
            SET @endpos = charindex(@delimiter, @tmpstr, @startpos + 1)
        END

        SET @leftover = right(@tmpstr, datalength(@tmpstr) / 2 - @startpos)
    END

    INSERT @tbl (value) VALUES (ltrim(rtrim(@leftover)))
    RETURN
END
```
Well, I was specifically trying to avoid the comma-delimited list so that I wouldn't have to write something like that, but since it's already written, I guess I'd have to throw that solution back into the mix. ;-)
I say tried and true is easiest. You can spit out a comma-separated list in C# in seconds, toss it into this function (after getting it into your sproc) quickly enough, and you hardly have to think about it. And I know you said you didn't want to use a function, but I think it's the simplest way (maybe not the most efficient).
I regularly receive sets of thousands to tens of thousands of rows sent from our application to be processed by various SQL Server stored procedures.

To meet the performance demands, we use TVPs, but you must implement your own abstraction of the DbDataReader to overcome some performance issues in its default mode of processing. I will not go into the hows and whys, as they are out of scope for this question.

I did not consider XML processing, as I have not found an XML implementation which remains performant with more than 10,000 "rows".

List processing can be handled by single-dimension and two-dimension tally (numbers) table processing. We have successfully used these in various areas, but well-managed TVPs are more performant when there are more than a couple hundred "rows".

As with all choices regarding SQL Server processing, you have to make your selection based on the usage model.
I finally got a chance to do some table-valued parameters, and they work great, so I'm going to paste a whole lotta code that shows how I'm using them, with a sample from some of my current code. (Note: we use ADO.NET.)
Also note: I'm writing some code for a service, and I've got lots of predefined code bits in the other class, but I'm writing this as a console app so I can debug it, so I ripped all this from the console app. Excuse my coding style (like hardcoded connection strings), as it was sort of "build one to throw away". I wanted to show how I use a `List<customObject>` and push it into the database easily as a table that I can use in the stored procedure. C# and T-SQL code below:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using a;

namespace a.EventAMI {
    class Db {
        private static SqlCommand SqlCommandFactory(string sprocName, SqlConnection con) {
            return new SqlCommand {
                CommandType = CommandType.StoredProcedure,
                CommandText = sprocName,
                CommandTimeout = 0,
                Connection = con
            };
        }

        public static void Update(List<Current> currents) {
            const string CONSTR = @"just a hardwired connection string while I'm debugging";

            SqlConnection con = new SqlConnection( CONSTR );
            SqlCommand cmd = SqlCommandFactory( "sprocname", con );

            cmd.Parameters.Add( "@CurrentTVP", SqlDbType.Structured ).Value =
                Converter.GetDataTableFromIEnumerable( currents, typeof( Current ) ); //my custom converter class

            try {
                using ( con ) {
                    con.Open();
                    cmd.ExecuteNonQuery();
                }
            } catch ( Exception ex ) {
                ErrHandler.WriteXML( ex );
                throw;
            }
        }
    }

    class Current {
        public string Identifier { get; set; }
        public string OffTime { get; set; }
        public DateTime Off() { return Convert.ToDateTime( OffTime ); }

        private static SqlCommand SqlCommandFactory(string sprocName, SqlConnection con) {
            return new SqlCommand {
                CommandType = CommandType.StoredProcedure,
                CommandText = sprocName,
                CommandTimeout = 0,
                Connection = con
            };
        }

        public static List<Current> GetAll() {
            List<Current> l = new List<Current>();

            const string CONSTR = @"just a hardcoded connection string while I'm debugging";

            SqlConnection con = new SqlConnection( CONSTR );
            SqlCommand cmd = SqlCommandFactory( "sprocname", con );

            try {
                using ( con ) {
                    con.Open();
                    using ( SqlDataReader reader = cmd.ExecuteReader() ) {
                        while ( reader.Read() ) {
                            l.Add( new Current {
                                Identifier = reader[0].ToString(),
                                OffTime = reader[1].ToString()
                            } );
                        }
                    }
                }
            } catch ( Exception ex ) {
                ErrHandler.WriteXML( ex );
                throw;
            }

            return l;
        }
    }
}
```

------------------- the converter class -------------------

```csharp
using System;
using System.Collections;
using System.Data;
using System.Reflection;

namespace a {
    public static class Converter {
        public static DataTable GetDataTableFromIEnumerable(IEnumerable aIEnumerable) {
            return GetDataTableFromIEnumerable( aIEnumerable, null );
        }

        public static DataTable GetDataTableFromIEnumerable(IEnumerable aIEnumerable, Type baseType) {
            DataTable returnTable = new DataTable();

            if ( aIEnumerable != null ) {
                //Creates the table structure looping on the first element of the list
                object baseObj = null;
                Type objectType;

                if ( baseType == null ) {
                    foreach ( object obj in aIEnumerable ) {
                        baseObj = obj;
                        break;
                    }
                    objectType = baseObj.GetType();
                } else {
                    objectType = baseType;
                }

                PropertyInfo[] properties = objectType.GetProperties();
                DataColumn col;

                foreach ( PropertyInfo property in properties ) {
                    col = new DataColumn { ColumnName = property.Name };

                    if ( property.PropertyType == typeof( DateTime? ) ) {
                        col.DataType = typeof( DateTime );
                    } else if ( property.PropertyType == typeof( Int32? ) ) {
                        col.DataType = typeof( Int32 );
                    } else {
                        col.DataType = property.PropertyType;
                    }

                    returnTable.Columns.Add( col );
                }

                //Adds the rows to the table
                foreach ( object objItem in aIEnumerable ) {
                    DataRow row = returnTable.NewRow();

                    foreach ( PropertyInfo property in properties ) {
                        Object value = property.GetValue( objItem, null );
                        if ( value != null )
                            row[property.Name] = value;
                        else
                            row[property.Name] = "";
                    }

                    returnTable.Rows.Add( row );
                }
            }

            return returnTable;
        }
    }
}
```

```sql
USE [Database]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROC [dbo].[Event_Update]
    @EventCurrentTVP Event_CurrentTVP READONLY
AS
/****************************************************************
    author   cbrand
    date
    descrip  I'll ask you to forgive me the anonymization I've made
             here, but hope this helps
    caller   such and thus application
****************************************************************/
BEGIN TRAN Event_Update

DECLARE @DEBUG INT
SET @DEBUG = 0 /* test using @DEBUG <> 0 */

/*
    Replace the list of outstanding entries that are still currently
    disconnected with the list from the file.
    This means remove all existing entries (faster to truncate and
    insert than to delete on a join and insert, yes?)
*/
TRUNCATE TABLE [database].[dbo].[Event_Current]

INSERT INTO [database].[dbo].[Event_Current]
           ([Identifier]
           ,[OffTime])
SELECT [Identifier]
      ,[OffTime]
FROM @EventCurrentTVP

IF (@@ERROR <> 0 OR @DEBUG <> 0)
BEGIN
    ROLLBACK TRAN Event_Update
END
ELSE
BEGIN
    COMMIT TRAN Event_Update
END
```

```sql
USE [Database]
GO
CREATE TYPE [dbo].[Event_CurrentTVP] AS TABLE(
    [Identifier] [varchar](20) NULL,
    [OffTime]    [datetime] NULL
)
GO
```
Also, I'll take constructive criticism on my coding style if you have that to offer (to all readers that come across this question), but please keep it constructive ;) ... If you really want me, find me in the chatroom here. Hopefully with this chunk of code one can see how they can use the `List<Current>` as I have it defined as a table in the db and a `List<T>` in their app.

I would either go with proposal #1 or, as an alternative, create a scratch table that just holds processed IDs. Insert into that table during processing, then once finished, call a proc similar to the below:
```sql
BEGIN TRAN

UPDATE dt
SET    processed = 1
FROM   dataTable dt
JOIN   processedIds pi ON pi.id = dt.id;

TRUNCATE TABLE processedIds

COMMIT TRAN
```
You'll do many inserts, but they'll be to a small table, so it should be fast. You could also batch your inserts using ADO.NET or whatever data adapter you're using.
The question title includes the task of transmitting data from an application into the stored procedure. That part is excluded by the question body, but let me try to answer it too.
In context of sql-server-2008 as specified by the tags there is another great article by E. Sommarskog Arrays and Lists in SQL Server 2008. BTW I found it in the article Marian referred to in his answer.
Instead of just giving the link, I quote its list of content:
- Introduction
- Background
- Table-Valued Parameters in T-SQL
- Passing Table-Valued Parameters from ADO .NET
- Using a List
- Using a DataTable
- Using a DataReader
- Final Remarks
- Using Table-Valued Parameters from Other APIs
- ODBC
- OLE DB
- ADO
- LINQ and Entity Framework
- JDBC
- PHP
- Perl
- What If Your API Does Not Support TVPs
- Performance Considerations
- Server-side
- Client-side
- Primary Key or Not?
- Acknowledgements and Feedback
- Revision History
Beyond the techniques mentioned there, I have the feeling that in some cases bulk copy and bulk insert deserve a mention to cover the general case.
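As a minimal sketch of that last point (the file path, temp table name, and one-ID-per-line file format are all hypothetical), the client could dump the IDs to a file the server can read and load them in one shot:

```sql
-- Hypothetical: the client writes one ID per line to a file,
-- which BULK INSERT loads into a temp table in a single statement.
CREATE TABLE #ids (id bigint NOT NULL);

BULK INSERT #ids
FROM 'C:\temp\ids.txt'
WITH (ROWTERMINATOR = '\n');

-- then join against #ids to flag the processed rows
```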
For MS SQL 2016 and later
With MS SQL 2016 they introduced a new function, STRING_SPLIT(), to parse multiple values.
This can solve your problem easily.
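For instance, a minimal sketch (the procedure, table, and column names `MarkProcessed`, `MyRecords`, `processed`, and `ID` are invented for illustration; `STRING_SPLIT` returns its pieces in a column named `value`):

```sql
-- @idList arrives as e.g. '1,2,3,4'
CREATE PROCEDURE dbo.MarkProcessed (@idList varchar(MAX))
AS
UPDATE t
SET    t.processed = 1
FROM   dbo.MyRecords AS t
JOIN   STRING_SPLIT(@idList, ',') AS s
    ON t.ID = CAST(s.value AS int);
GO
```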
For older MS SQL versions

If you are using an older version, then follow these steps:

First, make one function:
```sql
ALTER FUNCTION [dbo].[UDF_IDListToTable]
(
    @list      [varchar](MAX),
    @Seperator CHAR(1)
)
RETURNS @tbl TABLE (ID INT)
WITH EXECUTE AS CALLER
AS
BEGIN
    DECLARE @position INT
    DECLARE @NewLine  CHAR(2)
    DECLARE @no       INT

    SET @NewLine = CHAR(13) + CHAR(10)

    IF CHARINDEX(@Seperator, @list) = 0
    BEGIN
        INSERT INTO @tbl VALUES ( @list )
    END
    ELSE
    BEGIN
        SET @position = 1
        SET @list = @list + @Seperator

        WHILE CHARINDEX(@Seperator, @list, @position) <> 0
        BEGIN
            SELECT @no = SUBSTRING( @list,
                                    @position,
                                    CHARINDEX(@Seperator, @list, @position) - @position )

            IF @no <> ''
                INSERT INTO @tbl VALUES ( @no )

            SET @position = CHARINDEX(@Seperator, @list, @position) + 1
        END
    END

    RETURN
END
```
After making this, just pass your string to this function with the separator.
I hope this is helpful to you. :-)
License under CC-BY-SA with attribution
Content dated before 6/26/2020 9:53 AM
Larry Coleman 10 years ago
How are you getting the big list of IDs?