Are stored procedures a bad practice at one of the world's largest IT software consulting firms?
I'm working on a project at one of the world's top 3 IT consulting firms, and was told by a DBA that the company's best practices state that stored procedures are not a "best practice". This is so contrary to everything I've learned.
Stored procedures give you code reuse and encapsulation (two pillars of software development), security (you can grant/revoke permissions on an individual stored proc), protection from SQL injection attacks, and also help with speed (although that DBA said that, starting with SQL Server 2008, even regular SQL queries are compiled if they are run often enough).
We're developing a complex app using an Agile software development methodology. Can anyone think of good reasons why they wouldn't want to use stored procs? My guess was that the DBAs didn't want to maintain them, but there seem to be way too many negatives to justify such a design decision.
What code reuse do they add? What if your client uses a different database? Then you have to trash all of those SPs and start from scratch. They don't protect you from SQL injection, and the speed benefit is minimal in most cases.
Highly related, from another point of view: http://dba.stackexchange.com/questions/2450/what-are-the-arguments-against-or-for-putting-application-logic-in-the-database
Bear in mind that most large I.T. consulting firms have a motive to maximise billable hours while keeping their arse covered. The old-timers with any clout in these firms also tend to be players and bureaucrats rather than techies. I'd take things like that from a consultancy firm with a grain of salt - I've gotten consultancy firms out of the shit on more than one occasion by fixing up their 'best' practice.
@Rig Code reuse is added just as it is for functions in any language - by wrapping code in a reusable container. Stored procedures certainly do protect you from SQL injection, so long as you don't execute a string you've built. To say the speed benefit is minimal seems simply uneducated; most cases won't fall into the same performance category, but show a wide disparity.
@GaretClaborn But it is much more likely that you'll re-architect the application layer than years and years of historical database data - on any non-trivial application, anyway. And if you do, you will spend months porting stored-procedure-rich code. There is little benefit to adding one more dependency to your project except in edge-case situations. Those do exist, but the majority of the time it's just adding one more hurdle to project agility and code reuse.
Coming from a background where we used SPs almost exclusively, I can tell you the benefit of moving away from them and using an ORM like Entity Framework. Far too many times, business logic gets encapsulated within the procedure. While you can version procs with some work and/or third-party tools, it's not as easy as it would be within a framework like TFS or Git. The database code an ORM emits is agnostic of your RDBMS provider, so you can switch out RDBMS providers at a later date with less of a headache.
PL/SQL can't have unit tests, meaning you should never EVER put any code or procedure in PL/SQL. Only store data there. Just because you can doesn't mean you should; it's like the "include" function in PHP... it's there, but don't use it.
That's strange, I've created unit tests for PL/SQL procedures. I guess it's a good thing I didn't know it was impossible. :-)
@ewahner Hi, I'm curious about "business logic encapsulated within the proc". We have an SP which consumes a set of filters (devices, OS) and calculates CLV under those constraints. It's about 1,300 lines long, and I'm wondering if it's possible to use an ORM to speed it up. E.g. to calculate revenue (business logic) the SP first inserts all filters into a temp table (A), then inserts the transaction table into another temp table (B), LEFT JOINs B with A and removes all null records to apply the filters, then uses aggregation to get revenue/fee/etc. from different columns. It's slow.
In my experience working on very large projects, you have to be very clear on where business logic lives. If you allow an environment where individual developers can put business logic in the business object layer or in a stored procedure as they see fit, a large application becomes VERY difficult to understand and maintain.
Stored procedures are great for speeding up certain DB operations. My architectural decision is to leave all logic in the business layer of the application and employ stored procedures in a targeted manner to improve performance where benchmarking indicates it is warranted.
Well mentioned. I'm re-writing a product for a company at the moment, and business logic has crept into all the stored procedures. It's become extremely difficult to see which rules live in the UI, the business tiers, or the stored procedures themselves. Not to forget that stored procedures also give you vendor lock-in.
If you don't use a proprietary db, vendor lock-in is not really that much of an issue, but it does indeed tie you to a solution.
I don't see things quite that simply. To me, it's ALL business logic. The database, with or without stored procedures, provides certain services and makes certain guarantees. Ideally it should be impossible for incorrect application code to put the database into an inconsistent state. If stored procedures are needed to maintain that consistency, I use them.
@kevin cline: Define "inconsistent state". I agree that DB features such as referential integrity are valuable and greatly reduce the prospect of an application error causing serious damage. However, generally speaking, the definition of "consistent data" depends on correct execution of business rules.
Add my million to Mayo's million. Distributed business logic takes you off the highway of good practice, straight into the lane of lunacy.
+1 Business logic seeping into the DAL is a great concern when using stored procedures.
There is no silver bullet to keeping business logic out of any layer. What about default values on fields, triggers, and other types of constraints?
@JeffO: the database is a store of data; it does not enforce anything. The logic layer enforces everything.
@ChristopherMahan, I would NEVER want to use a database you design. That is the worst possible practice from a database perspective. Databases are often updated directly, at the database level. It is short-sighted to think someone will use the business layer to update a million records, or to do other things that happen over time. Imports do not typically go through the business layer (yep, I want to process my 21-million-record import one record at a time in the business layer). Fraud is much easier when you don't have constraints at the database level. Bad data is almost 100% certain.
@HLGEM, when you process your 21 million records, you remember to do a select into, right, so it doesn't populate the transaction log. And try 900 million records (with 600 fields) and see if you can finish that night with all the triggers. At some point, you gotta trust the people who can do direct data manipulation. Also, if everyone is DBO, you have other issues.
@JeffO: Drive UI validation from business rules defined in the business layer. I have done that for years. Now you can do that with IDataErrorInfo in the .Net Framework. Using EF Code First, metadata for the DB structure also derives from the business layer.
This should not be the selected answer... There are projects where people think "all database calls must be stored procedures, and ad-hoc SQL is strictly verboten."
This is such a narrow-minded programmer point of view. The first part is all right, the second isn't. http://dba.stackexchange.com/a/2452/9749
Silly @ChristopherMahan, if you're really going to insert 21,000,000 records all at once, you're probably better off using your database server's bulk load facility and not using insert statements at all.
There ARE NO GUARANTEES that additional apps are going to dutifully use your business logic layer to update the database. That being the case, and it being just as true that sooner or later there WILL be additional apps using your database, you're smart to enable the database to maintain data integrity and consistency. That means foreign keys and check constraints. If it also means using stored procedures, then use stored procedures. SPROCS are powerful tools. I don't really get why so many younger programmers are apparently scared of databases.
I agree with @kevincline - the idea of distinct and clear-cut "business logic layer" and "data access layer" are archaic at best and dangerous at worst - software should just be viewed as interconnecting components that are abstracted and decoupled where necessary - you don't need "names" for "layers." Whether your "logic" goes in a sproc or in a codebase depends entirely on your goals and infrastructure.
@Craig In that case, make the database its own "project" and put an api layer over it which encompasses the project. Having multiple apps connect to the same DB with different lifecycles, different requirements, different business logic is a recipe for disaster IMO
@JoePhillips I've been designing and building n-tier software systems for a long time. Of course you have a data layer/API in front of the database, and your app accesses the database through that layer. If you're lucky, you talk other departments into also using that API to access the database, but you have no guarantee of this. The database will also outlive your API--fact. If the database can protect itself, you stand a much better chance of maintaining the integrity of your data over time.
@Craig I think we just see the DB as being different things. I see two scenarios: (1) the DB is its own app and offers an API to gain access; (2) the DB is a piece of a single app, and that app has full ownership of the DB. You seem to be saying there is a scenario (3) where neither app really owns the DB and nobody really has full control of their app. Bad design IMO (but very common, I'm certain).
@JoePhillips I'm saying that decades of real world experience bear out that the app is transitory, but the database lives on. Chances are that the database will live beyond the supported lifetime of whatever programming language or framework you used for your app (especially if you chose Ruby on Rails). Sooner or later, another system, and then another, will need access to the database. Technical debt will accrue, the app will be discarded and a new one will be developed. The data is the point. The C-suite and board don't care much about the app. The app and its API are just a means to an end.
@EricJ. enforcing business logic rules is exactly what Kevin Cline refers to, the mechanism of the DB like primary keys, foreign keys, triggers, cascading etc help you to enforce business logic. This idea of consolidating business logic is archaic and monolithic oriented, it makes no sense at all, this multitier/n-tier view is wrong, what you really have is a distributed system, every actor in the system has to enforce business logic. The frontend should do so to help users, backend in different services SOA or uServices enforce in their single responsibility side of business.
@kisai Business logic only in the DB creates a dependence to that DB. I think we agree that's a non-starter. I've seen companies fail because they did this. The UI (GUI, website, API endpoint) can't be trusted, so while it's an end-user convenience to implement some rules in the UI (e.g. input validation), that can't be the only place they're implemented. I think we both agree on that. At a minimum, then, logic must be in the middle tier. Replicating some rules in the UI or DB can be beneficial but they must always be enforced in the middle. Good distributed systems are still n-Tier.
Stored procedures give you code reuse, and encapsulation (two pillars of software development),
Only if you use them correctly in the context in which they are supposed to be used. The same claim can be made about functions (in structured programming) or methods (in object-oriented programming), and yet we see 1K-line functions and mega-ass objects.
Artifacts don't give you those benefits. The proper usage of those artifacts is what gives those benefits.
security (you can grant/revoke permissions on an individual stored proc),
Yes. This is a good point and one of the main reasons I like stored procedures. They provide finer-grained access control than can typically be achieved with just views and user accounts.
protect you from SQL injection attacks,
That is not specific to SPs, since you can achieve the same level of protection with parameterized SQL statements and input scrubbing. I would use SPs in addition to those, however, as a matter of "security in depth".
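As a minimal illustration of the parameterized-statement point - sketched here in Python with sqlite3 and a made-up table, not tied to any particular vendor - the driver binds the value, so quote characters in the input are treated as data, not SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, balance REAL)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100.0)")

# Attacker-controlled input:
user_input = "alice' OR '1'='1"

# Parameterized statement: the quote characters are part of the bound
# value, so the query looks for a customer literally named that string.
rows = conn.execute(
    "SELECT * FROM accounts WHERE name = ?", (user_input,)
).fetchall()
print(rows)  # [] -- the injection attempt matches nothing
```

The same placeholder mechanism is what a stored procedure call with typed parameters gives you; either way, the protection comes from binding, not from where the SQL text lives.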
and also help with speed (although that DBA said that starting with SQL Server 2008 that even regular SQL queries are compiled if they are run enough times).
This is highly database-vendor specific, but in general your DBA is right. SQL statements (either static or parametrized) do get compiled. SPs help if you want/need to aggregate and compute data in a way that you cannot with simple SQL statements, but that is tightly integrated with SQL and does not warrant the round-trip to the app server.
A good example is querying data into a temporary cursor (or cursors) from which to run another SQL statement. You can do it programmatically in the app server, or you can save the multiple round-trips by doing it in the db.
This should not be the norm, however. If you have many of those cases, that is a sign of bad database design (or you are pulling data from not-so-compatible database schemas across departments.)
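The round-trip point above can be sketched in application code (Python with sqlite3 and a hypothetical schema): a single aggregate statement lets the engine do the computation, instead of shipping every row to the app and summing there:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (dept TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("toys", 10.0), ("toys", 5.0), ("books", 7.5)])

# One statement, one round-trip: the engine aggregates in place.
totals = dict(conn.execute(
    "SELECT dept, SUM(amount) FROM orders GROUP BY dept"))
assert totals == {"toys": 15.0, "books": 7.5}
```

Whether that statement lives in the app or in a stored procedure, the win is the same: the data stays close to the engine until it has been reduced.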
We're developing a complex app using Agile software development methodology.
Agility has to do with software engineering processes and requirements management, not technologies.
Can anyone think of good reasons why they wouldn't want to use stored procs?
The question is wrong, and equivalent to asking "are there any good reasons not to use GOTO?" I side with Niklaus Wirth more than with Dijkstra on this subject. I can understand where Dijkstra's sentiment came from, but I do not believe it is 100% applicable in all cases. Same with stored procs and any technology.
A tool is good when used well for its intended purpose, and when it is the best tool for the particular task. Using it otherwise is not an indication that the tool is wrong, but that the wielder doesn't know what he/she is doing.
The proper question is "what type of stored procedure usage patterns should be avoided." Or, "under what conditions should I (or should not) use stored procedures". Looking for reasons not to use a technology is simply putting the blame on the tool as opposed to placing the engineering responsibility squarely where it belongs - in the engineer.
In other words, it is a cop-out or a statement of ignorance.
My guess was that the DBAs didn't want to maintain those stored procs, but there seem to be way too many negatives to justify such a design decision.
What they are doing then is projecting the results of their bad engineering decisions on the tools they used poorly.
What to do in your case?
My experience is, when in Rome, do as the Romans do.
Don't fight it. If the people at your company want to label stored procs a bad practice, let them. Be advised, however, that this can be a red flag in their engineering practices.
Typical labeling of things as bad practice is usually done in organizations with tons of incompetent programmers. By black-listing certain things, the organization tries to limit the damage inflicted internally by their own incompetence. I shit you not.
Generalizations are the mother of all screw-ups. Saying that stored procs (or any type of technology) are a bad practice - that's a generalization. Generalizations are cop-outs for the incompetent. Engineers do not work with blatant generalizations. They do analysis on a case-by-case basis, perform trade-off analyses, and execute engineering decisions and solutions according to the facts at hand, in the context in which they are supposed to solve a problem.
Good engineers do not label things as bad practice in such generalizing ways. They look at the problem, select the tools that are appropriate, and make trade-offs. In other words, they do engineering.
My opinion on how not to use them
Don't put complex logic beyond data gathering (and perhaps some transformations) in them. It is ok to put some data massaging logic in them, or to aggregate the result of multiple queries with them. But that's about it. Anything beyond that would qualify as business logic which should reside somewhere else.
Don't use them as your sole mechanism of defense against SQL injection. You leave them there in case something bad makes it to them, but there should be a slew of defensive logic in front of them - client-side validation/scrubbing, server-side validation/scrubbing, possibly transformation into types that make sense in your domain model, and finally getting passed to parametrized statements (which could be parametrized SQL statements or parametrized stored procs.)
Don't make databases the only place containing your stored procs. Your stored procs should be treated just as you treat your C# or Java source code. That is, source control the textual definition of your stored procs. People rant that stored procs can't be source controlled - bullcrap, they just don't know what the bloody hell they are talking about.
My opinion on how/where to use them
Your application requires data that needs to be transposed or aggregated from multiple queries or views. You can offload that from the application into the db. Here you have to do a performance analysis, since a) database engines are more efficient than app servers at doing these things, but b) app servers are (sometimes) easier to scale horizontally.
Fine-grained access control. You do not want some idiot running cartesian joins in your db, but you cannot just forbid people from executing arbitrary SQL statements just like that, either. A typical solution is to allow arbitrary SQL statements in development and UAT environments, while forbidding them in systest and production environments. Any statement that must make it to systest or production goes into a stored procedure, code-reviewed by both developers and DBAs.
Any valid need to run a SQL statement not in a stored proc goes through a different username/account and connection pool (with the usage highly monitored and discouraged.)
- In systems like Oracle, you can get access to LDAP, or create symlinks to external databases (say, calling a stored proc on a business partner's db via VPN.) It's an easy way to write spaghetti code, but that's true of all programming paradigms, and sometimes you have specific business/environment requirements for which this is the only solution. Stored procs help encapsulate that nastiness in one place, close to the data and without having to traverse to the app server.
Whether you run this on the db as a stored proc or on your app server depends on the trade-off analysis that you, as an engineer, have to make. Both options have to be analyzed and justified with some type of analysis. Going one way or the other by simply accusing the other alternative of being "bad practice" is just a lame engineering cop-out.
- In situations where you simply cannot scale up your app server (i.e. no budget for new hardware or cloud instances) but there is plenty of capacity on the db back-end (this is more typical than many people care to admit), it pays to move business logic to stored procs. Not pretty, and it can lead to anemic domain models... but then again... trade-off analysis, the thing most software hacks suck at.
Whether that becomes a permanent solution or not is specific to the constraints observed at that particular moment.
Hope it helps.
Good answer, but was this intended to be ironic? "Generalizations are the mother of all screw ups."
Yep and nay. That comment of mine was intended for this particular sentence referred to by the OP in his original question (*stored procedures are not a "best practice"*.) A coarse description of stored procedures as best or bad practice is a generalization. Ignoring the context in which they can be good **OR** bad can (and will often) lead to screw-ups when architecting or designing solutions ;)
+1 for "Typical labeling of things as bad practice is usually done in organizations with tons of incompetent programmers." - been there, lived through that, including being told to my face by a dev manager that he thought I had a great solution for one tricky problem, but if he was seen to allow me to implement it then it would open the floodgates for the muppets.
A little long-winded, though a well-thought-out answer. I hope the "when in Rome, do as the Romans do" is taken to heart.
Answer would be improved without the snark. Or if you knew what a 'best practice' is. `A best practice is a method or technique that has been generally accepted as superior to any alternatives because it produces results that are superior to those achieved by other means or because it has become a standard way of doing things` This entire answer describes how Stored Procedures are *not* a best practice. You have to think about their pros and cons and use them accordingly. But then you take a dump on other people for saying the same thing.
@Shane You are right. However I believe what this answer is trying to convey is the tendency of some groups of engineers to excuse their lack of knowledge or analysis by calling on the bad practice card. The answer could see some improvement for the more inexperienced of us, though.
The rationale is that relying on a stored procedure layer limits portability and ties you to a certain DB. Added maintenance costs are also cited as a reason. I also wanted to comment on this point you made:
(stored procedures) protect you from SQL injection attacks
It's actually the parametrized querying that protects you, which you can just as easily do in plain-text SQL querying.
And if your stored proc is using any type of dynamic sql along with a string parameter, you're right back where you started.
The difference is that access permission can be set for stored procedures on a per-procedure basis; for parameterised SQL queries you have to rely on programmers' sanity not to do `+ "blablabla"`, because you have to allow plain SQL, and that's where control ends.
I've never understood the "ties you to a certain DB" argument. How often do you take your program and migrate it to an entirely different database?
@MasonWheeler - +1 every time. In any sufficiently large project, your app ends up being written against the foibles of a given DB product. Converting to another DB becomes a major job no matter what because the new DB will have different oddities!
@MichaelKohne, I would say never, in my experience. It depends on what segment of the industry you are in. In enterprise systems it is rare to change database vendors. In the COTS world, it happens all the time, or the design must account for multiple vendor possibilities. The web world is somewhat divided, depending on what the web site actually does and how critical the db is to the success of the site.
Some of the reasons that I agree stored procs aren't a best practice.
- Business and application logic should be in the code not in the database. Putting logic in the DB is mixing up concerns.
- You can't test stored procs as seamlessly as code in your conventional unit test projects with the rest of the application logic.
- I don't find stored procs conducive to test-first programming when I am writing code.
- Stored procs aren't as easy to debug as application code when you are debugging your program in your IDE.
- Versioning / source control of SPs vs. normal code
You can just as easily do test-first programming on stored procedures.
Hmmm, well... 1) The usage of db stored procedures does not necessarily imply that business logic is being put in them. 2) Stored procs are some of the easiest things to unit test. 3) Stored procs are not necessarily conducive to test-first practices, true, but not everything that is computable can be test-first'ed. 4) Debugging shouldn't be an issue, since stored procs should contain nothing more than easy-to-verify SQL statements and cursors. Also, debugging should take place by first testing and debugging the SQL statements in code, and **then** moving them into stored procs... just IMO, btw.
You're obviously not a DB dev. Source control, IDEs - it's damn easy to debug an SP if you're using TOAD or a similar IDE, same with versioning.
2) On unit testing stored procs: I don't know about other unit test frameworks, but at least with MS Test (VisualStudio.TestTools.UnitTesting), running any of the Assert methods on the stored proc requires a DB connection, which by definition makes it more of an integration test than a unit test. And a stored proc may reference state at a global, database level. These may not be fake-able or have interfaces.
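A pragmatic middle ground - sketched here with Python's sqlite3 and a made-up schema - is to run the SQL under test against a throwaway in-memory database. Strictly speaking it is an integration test, but it stays fast and depends on no shared, database-level state:

```python
import sqlite3

# The query under test -- a stand-in for logic that might otherwise
# live in a stored procedure. Schema and names are made up.
OVERDRAWN_SQL = "SELECT name FROM accounts WHERE balance < 0 ORDER BY name"

def make_test_db():
    # Fresh throwaway database per test: no global state survives.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT, balance REAL)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("alice", 50.0), ("bob", -10.0), ("carol", 0.0)])
    return conn

def test_finds_only_negative_balances():
    conn = make_test_db()
    names = [row[0] for row in conn.execute(OVERDRAWN_SQL)]
    assert names == ["bob"]

test_finds_only_negative_balances()
```

The same pattern works with a real server's stored procedures, at the cost of a per-test setup/teardown of the schema and data.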
+1 In addition, stored procedure languages (pl/sql,t-sql,plpgsql, etc) are very clunky and verbose. It's much easier for me to use a scripting language to make a database connection and handle the business logic outside of the database.
Stored procedures give you code reuse, and encapsulation (two pillars of software development),
Yes, but at the cost of being able to meet other agile design goals. They're more difficult to maintain, for one thing. If the project I'm on is any indication, you'll likely end up with multiple, incompatible SPs that do essentially the same job, with no benefit.
protect you from SQL injection attacks,
No. They do not. I can't even begin to guess where this idea might have come from, as I hear it said often, and it's simply not true. It may mitigate certain types of SQL injection attacks, but if you're not using parametrized queries in the first place, that won't matter. I can still ';DROP TABLE Accounts; --
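That payload is easy to demonstrate (sketched in Python with sqlite3 and a hypothetical table): when the query text is built by concatenation, the attacker's quote closes the string, the semicolon ends the intended statement, and the DROP runs. (sqlite3's `execute()` refuses multiple statements, so `executescript()` stands in for drivers that allow them.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT)")
conn.execute("INSERT INTO accounts VALUES ('alice')")

payload = "'; DROP TABLE accounts; --"

# String concatenation: the payload becomes part of the SQL text, so
# the script is really two statements followed by a comment.
conn.executescript("SELECT * FROM accounts WHERE name = '" + payload)

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
print(tables)  # [] -- the accounts table is gone
```

Wrapping the same concatenated string in a stored procedure changes nothing; only binding the value as a parameter does.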
and also help with speed (although that DBA said that starting with SQL Server 2008 that even regular SQL queries are compiled if they are run enough times).
They're also typically compiled when you use prepared, parametrized statements (at least with several of the DBs I've used). By the time your application begins executing the query (or especially when you execute the same prepared query multiple times), any performance advantage that you think your SP has is completely moot.
The only reason to use a stored procedure, IMHO, is when you must make a complex, multi-staged query that pulls from multiple collated sources. SPs should not contain low-level decision logic, and they should never simply encapsulate an otherwise simple query. There are no benefits and only many drawbacks.
Listen to your DBA. He knows what's up.
Red Gate have a product SQL Source Control for SQL Server, but I agree, pushing logic into stored procs is an excellent way to ensure that you have important logic not under any sort of version control.
@greyfade - *"I have yet to see source control for SPs"* - are you kidding me? A store proc is just a bloody text file that you upload in your database engine (which takes it, **compiles** it and installs it for execution.) Every place I've worked that has stored procs, we store the store proc source code in, say, CVS, clearcase or whichever SCM that was in use. Saying that store procs cannot be source-controlled (because they are in the db) is like saying my application source code (Java, C# or whatever) cannot be source controlled because it is compiled and deployed in production.
@luis.espinal: I did not say they *could not* be in source control. I merely said that I did not know of a tool specifically for maintaining the history of SPs, implying maintaining that history within the database. Please don't rant at me just because you misread something.
All our stored procs are under source control; just because you have seen bad practices in the past doesn't mean that they are inherent in using stored procs.
How about I just strike all mention of source control from my answer, since neither of you seem to be able to understand what I'm saying and immediately assume the worst? I dislike that you equate my statement that I'm "unaware" of good source control for SPs with a blanket statement that it's impossible.
@luis.espinal is it typical that the source of a stored procedure can be retrieved later from the database? If so, you can just have a tool that pulls it out regularly, and have other tooling in place to recreate an installation from scratch. Do that once in a while to ensure it is accurate.
@ThorbjørnRavnAndersen - oh yeah, very typical in every place (good place, that is) that I've worked that had stored procedures. The operative word is "good"; a place that doesn't version SPs is a time-bomb IMO. I've worked mostly in Oracle shops, where we would use a command-line tool based off DBMS_METADATA or select text from user_source, or GUI-based extraction facilities in Toad or SQL Explorer. Pretty much that's how things got versioned (and also how installs were created from scratch, as you described.)
All three of the companies I have worked for used stored procedures for their application logic with SQL Server. I have not really seen things done the other way. But to me they are a big mess. There typically aren't very good error-handling or code-reuse facilities with stored procedures.
Let's say you have a stored procedure that returns a dataset you want to use - how can you use it in future stored procedures? The mechanisms SQL Server offers for that are not very good. INSERT ... EXEC only works to one or two levels of nesting (I forget now). Or you have to pre-define a work table and have it keyed by process. Or you need to pre-create a temp table and have the procedure populate it. But what if two people call a temp table the same thing in two different procedures that they never planned to use at the same time? In any normal programming language, you can just return an array from a function, or point to an object/global structure shared between them (except in functional languages, where you'd return the data structure rather than changing a global structure...)
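The "just return an array" contrast can be sketched in a few lines of application code (Python with sqlite3, made-up schema): one function returns the result set, and any caller composes with it - no temp tables, no nesting limits:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("alice", 20.0), ("bob", 35.0)])

def big_orders(conn, threshold):
    # Returns a plain list of rows -- the "dataset".
    return conn.execute(
        "SELECT customer, amount FROM orders WHERE amount > ?",
        (threshold,)).fetchall()

def big_order_customers(conn, threshold):
    # Trivially reuses the first function's result.
    return [customer for customer, _ in big_orders(conn, threshold)]

print(big_order_customers(conn, 25.0))  # ['bob']
```
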
How about code re-use? If you start putting common expressions into UDFs (or even worse sub queries) you will slow the code to a halt. You can't call a stored procedure to perform a calculation for a column (unless you use a cursor, pass the column values in one by one, and then update your table/dataset somehow). So basically to get the most performance, you need to cut/paste common expressions all over the place which is a maintenance nightmare... With a programming language you can create a function to generate the common SQL and then call it from everywhere when building the SQL string. Then if you need to adjust the formula you can make the change in a single place...
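The "generate the common SQL from a function" approach reads like this (Python, hypothetical column names): a helper returns the shared expression, and every query that needs the formula embeds the helper's output, so there is a single place to change it:

```python
import sqlite3

def net_amount_expr():
    # Single definition of the business formula; change it here and
    # every query that embeds it picks up the change.
    return "(price * quantity) - discount"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (price REAL, quantity INTEGER, discount REAL)")
conn.execute("INSERT INTO sales VALUES (10.0, 3, 5.0)")

# The fragment is reused across queries without cut/paste.
total = conn.execute(
    f"SELECT SUM({net_amount_expr()}) FROM sales").fetchone()[0]
print(total)  # 25.0
```

Note the helper emits a fixed expression, never user input, so this composition is safe; values still go through bound parameters.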
How about error handling? SQL Server has many errors that immediately stop the stored procedure from executing, and some that even force a disconnection. Since 2005 there is TRY/CATCH, but there are still a number of errors that cannot be caught. You also get the same code duplication in the error-handling code, and you really cannot pass exceptions around, or bubble them up to higher levels, as easily as in most programming languages.
Also just speed. A lot of operations on datasets are not SET oriented. If you try to do row oriented stuff, either you are going to use a cursor, or you are going to use a "cursor" (when developers often query each row one by one and store the contents into @ variables just like a cursor...Even though this is often slower than a FORWARD_ONLY cursor). With SQL Server 2000 I had something that was running for 1 hour before I killed it. I rewrote that code in Perl and it finished in 20 minutes. When a scripting language that is 20-80x slower than C smokes SQL in performance, you definitely have no business writing row oriented operations in SQL.
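The set-versus-row distinction can be shown in miniature (Python with sqlite3, made-up table): one set-oriented statement does the work inside the engine, where the row-at-a-time "cursor style" issues a statement per row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balances (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO balances VALUES (?, ?)",
                 [(i, 100.0) for i in range(5)])

# Set-oriented: one statement updates every row inside the engine.
conn.execute("UPDATE balances SET amount = amount + 5.0")

# Row-oriented "cursor style" -- same effect, but one statement per
# row; on large tables this is dramatically slower:
# for (row_id,) in conn.execute("SELECT id FROM balances").fetchall():
#     conn.execute("UPDATE balances SET amount = amount + 5.0 "
#                  "WHERE id = ?", (row_id,))

total = conn.execute("SELECT SUM(amount) FROM balances").fetchone()[0]
print(total)  # 525.0
```

When the work genuinely is row-oriented and can't be expressed as a set operation, that's exactly when pulling it out of SQL into application code (as with the Perl rewrite above) tends to win.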
Now SQL Server does have CLR integration, and a lot of these issues go away if you use CLR stored procedures. But many DBAs don't know how to write .NET programs, or turn off the CLR due to security concerns, and stick with Transact-SQL... Also, even with the CLR you still have the problem of sharing data between multiple procedures efficiently.
Also in general the hardest thing to scale out is the database. If all your business logic is in the database, then when the database becomes too slow you are going to have issues. If you have a business layer you can just add more caching and more business servers to boost performance. Traditionally another server to install windows/linux and run .NET/Java is much cheaper than buying another database server and licensing more SQL Server. SQL Server does have more clustering support now though, originally it did not really have any. So if you do have a lot of money, you can add clustering or even do some log shipping to make multiple read only copies. But overall this will cost much more than just having a write behind cache or something.
Also look at the Transact-SQL facilities. String manipulation? I'll take the Java String/Tokenizer/Scanner/Regex classes any day. Hash tables/linked lists/etc.? I'll take the Java Collections framework, and the same goes for .NET... Both C# and Java are far more evolved languages than Transact-SQL... heck, coding in Transact-SQL makes me jealous of C...
On the plus side, stored procedures are more efficient for working with a big dataset and applying multiple queries/criteria to shrink it down before returning it to the business layer. If you have to send a bunch of huge datasets to the client application and break down the data at the client, it will be much less efficient than just doing all the work on the server.
Also stored procedures are good for security. You can cut all the access to the underlying tables and only allow access through the stored procedures. With some modern techniques like XML you can have stored procedures that do batch updates. Then all access is controlled through the stored procedures so as long as they are secure/correct the data can have more integrity.
The SQL injection argument doesn't really apply anymore now that we have parameterized queries on the programming-language side. And even before parameterized queries, a little replace("'", "''") worked most of the time (although there are still tricks for getting past the end of the string to get what you want).
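The parameterized-query point is easy to demonstrate. A minimal sketch using Python's stdlib sqlite3 module (the syntax differs from SQL Server, but the principle is identical): the value travels separately from the SQL text, so quote characters in the input cannot change the statement's structure.

```python
# Sketch: a parameterized query neutralizes a classic injection string.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

malicious = "x' OR '1'='1"

# The placeholder keeps the value out of the SQL text entirely.
hits = conn.execute("SELECT name FROM users WHERE name = ?",
                    (malicious,)).fetchall()
print(hits)  # [] -- the injection attempt matched nothing

# A legitimate value still matches normally.
print(conn.execute("SELECT name FROM users WHERE name = ?",
                   ("alice",)).fetchall())  # [('alice',)]
```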
Overall I think SQL and Transact-SQL are great languages for querying/updating data. But for coding any kind of logic, or doing string manipulation (or heck, file manipulation... you'd be surprised what you can do with xp_cmdshell...), please no. I'm hoping my next place mostly avoids stored procedures. From a code-maintainability point of view they are a nightmare. And what happens if you want to switch platforms? (Although really, if you paid for Oracle/DB2/Sybase/SQL Server/etc., you might as well get everything you can out of them by using every proprietary extension that helps you...)
Also surprisingly often the business logic is not the same. In the ideal world you would put all the logic in stored procedures and share that between applications. But quite often the logic differs based on the applications and your stored procedures end up becoming overly complex monoliths that people are afraid to change and do not understand all the implications of. Whereas with a good object oriented language you can code a data access layer which has some standard interface/hooks that each application can override to their own needs.
I can't help but offer food for thought on the whole set-oriented vs. procedural issue, though. I have seen database cursors used in all kinds of cases where that approach was just nuts. I have personally replaced explicit cursor-based SQL (Oracle PL/SQL, in that particular case) with a set-oriented query and saw the results come back in under a second, instead of 8 minutes. It took me 30 minutes to dissect that 1,000 line cursor code and "get" it. The resulting SQL query was concise, elegant, simple. People underestimate the power of their database servers far too often, and far too quickly.
This was the official line when I worked for one of the Big Five a few years back. The rationale was that since SPs are tied to particular implementations (PL/SQL vs T/SQL vs ...), they unnecessarily limit technology choices.
Having lived through the migration of one large system from T-SQL to PL/SQL, I can understand the argument. I think it's a bit of a canard though - how many places really move from one database to another on a whim?
@DaveE: For an enterprise solution you're probably right. If you're creating packaged software, as soon as you ship on MSSQL your biggest prospect will want it to run on Oracle.
@Eric: too true. Where I'm at now, we use tons of SPs and tell folks 'no' if they don't want MSSQL. It's nice to be able to do that.
It is not as much moving one system from one database to another, but having one system being able to use whatever database system the customer already has. Big databases are expensive.
@EricJ: yes, but once they see what the cost would do to their commission, the request kinda goes away.
@ThorRA: That's why it's nice the mgt here will say 'no' to the sale. I've worked places where the database was regarded as an "implementation detail" and having Sales pitch one of those over the wall is no fun.
Maybe I'm missing something, but if you are going to have to change the stored procedures because of switching databases, are you going to have to change the code just as much if you don't use stored procedures?
@Kevin: If you don't care about performance, you may hope that you won't have to change much.
This kind of thinking would prevent you from developing in any language on any platform, wouldn't it?
@Quant_dev, if you are accessing a database you must care about performance. Attitudes like yours are why I see so many badly designed and poorly performing databases. Performance is not optional, and that is why database-specific code is often preferred: it is almost always more performant.
How do you version stored procedures on the server?
If you redeploy stored procedures to the server from version control, you blow out the stored execution plan.
Stored procedures should not be modifiable directly on the server; otherwise, how do you know what's really running _right now_? If they are not, the deployment tool needs write access to the database to deploy stored procedures, and you'd have to deploy on every build (the exec plan might need to be different).
While stored procedures are non-portable, neither is SQL in general (ever seen Oracle date handling? Uggghhh).
So, if you want portability, build an internal data-access API. You can call it like function calls, and internally you can build it in whatever lingo you want, with parameterized queries, and it can be version-controlled.
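A hedged sketch of such an internal data-access API, in Python with stdlib sqlite3 standing in for whatever database the shop actually uses (class and table names are invented): callers invoke plain functions, while the SQL dialect and parameter style live in one version-controlled module.

```python
# Hypothetical sketch: a thin data-access API that hides the SQL dialect.
import sqlite3

class UserRepository:
    def __init__(self, conn):
        self.conn = conn

    def find_by_name(self, name):
        # parameterized internally; callers never touch SQL text
        cur = self.conn.execute(
            "SELECT id, name FROM users WHERE name = ?", (name,))
        return cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

repo = UserRepository(conn)
print(repo.find_by_name("alice"))  # [(1, 'alice')]
```

Porting to another database then means swapping the repository's internals, not hunting down SQL scattered through the application.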
*How do you version stored procedures on the server?* - you version-control the stored proc source code. When it's time to deploy, you grab the stored procs (from a given baseline) and you (or your DBA) deploy to production. Redeployment (be it to testing or production) certainly blows out the stored exec plan, but that happens regardless of whether you source-control your SPs.
Granted, we do the same. When we redeploy, though, we do the SP deploys manually, only for the ones needed. Automated would redeploy them all. Not optimal.
@Jeff what do you mean any sql statement? Changing the stored procedure, sure. But if there are 300 SP on the server, do you change the 2 you updated, or all 300 in an automated deploy?
Your build and deploy scripts should know which SPs have changed and update only those.
@BarryBrown It doesn't work if people have access to the server directly and can change the stored procedures. I'd have to have a process that watches the SPs, or have a check before each use...
If you have people just changing sprocs willy-nilly on servers without committing their changes to source control, you have a *process* problem that is almost certainly affecting your imperative code development, as well, even if you're not aware that it is.
Oh, I _know_ there is a process problem, but I came into the team with the system already set up this way, and there's no changing it now.
One of the things I've done in the past has been to put a development instance of the database server on individual developer's workstations, or if that wasn't possible, then at least have "dev" and "production" instances of the databases, and all the DDL and DML scripts, as well as sample data and load scripts lived under their own directory in the source tree, and the database was routinely built from those scripts using the MAKE file. Devs were able to use nmake to build single stored procs, as well. If they didn't put it under source control, it would disappear on them, and they knew it.
...I didn't mean to sound disparaging in my earlier comment, with the "..., even if you're not aware..." phrase. What I meant to convey was that if that sort of thing is happening with stored procedures, it's probably happening in other parts of the project, as well. I personally kind of dislike integrated source control in IDE's, in part because I think it makes people kind of lazy in terms of thinking about what it *really means* to the team and project as a whole to make changes and commit those changes to the source control repository. Those things should not be "automatic" in my opinion.
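One way to make "update only the SPs that changed" concrete is to compare hashes. A hypothetical Python sketch (names and SQL are invented): the deploy script hashes each SP script in source control against the hashes recorded at the last deploy, so only changed procedures get redeployed (and only their plans recompiled).

```python
# Hypothetical sketch: hash-based change detection for SP deployment.
import hashlib

def sp_hash(sql_text):
    return hashlib.sha256(sql_text.encode("utf-8")).hexdigest()

# hashes recorded at the previous deploy
deployed = {"usp_GetOrders": sp_hash("SELECT * FROM orders")}

# current scripts in the working copy
workspace = {
    "usp_GetOrders": "SELECT * FROM orders",          # unchanged
    "usp_GetUsers": "SELECT * FROM users WHERE 1=1",  # new or changed
}

to_deploy = [name for name, sql in workspace.items()
             if deployed.get(name) != sp_hash(sql)]
print(to_deploy)  # ['usp_GetUsers']
```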
This is so contrary to everything I've learned.
You might need to get out more. [grin] Seriously, stored procs have been on the decline for at least 10 years. Pretty much ever since n-tier replaced client-server. The decline has only been accelerated by the adoption of OO languages like Java, C#, Python, etc.
That's not to say stored procs don't still have their proponents and advocates - but this is a long-running discussion and debate. It's not new, and will likely still be going on for quite some time; IMO, the opponents of stored procs are clearly winning.
Stored procedures give you code reuse, and encapsulation (two pillars of software development)
Very true. But, so does a decently architected OO layer.
security (you can grant/revoke permissions on an individual stored proc)
While you can do that, few do it because of the serious limitations. Security at the DB level isn't granular enough to make context-aware decisions. Because of performance and management overhead, it's unusual to have per-user connections as well - so you still need some level of authorization in your app code. You can use role based logins, but you will need to create them for new roles, maintain which role you're running as, switch connections to do "system level" work like logging, etc. And, in the end, if your app is owned - so is your connection to the DB.
protect you from SQL injection attacks
No more so than doing parametrized queries. Which you need to be doing anyway.
and also help with speed (although that DBA said that starting with SQL Server 2008 that even regular SQL queries are compiled if they are run enough times).
I think that started in MSSQL 7 or 2000. There's been a lot of debates, measurements, and misinformation on stored proc vs inline SQL performance - I lump it all under YAGNI. And, if you do need it, test.
We're developing a complex app using Agile software development methodology. Can anyone think of good reasons why they wouldn't want to use stored procs?
I can't think of many reasons you would want to. Java/C#/any third-generation language is much more capable than T-SQL at encapsulation, reuse, flexibility, etc. Most of which comes free with a half-decent ORM.
Also, given the advice to "distribute as needed, but not more" - I think the burden of proof these days is on SP advocates. A common reason for going stored proc heavy is that T-SQL is easier than OO, and the shop has better T-SQL devs than OO. Or, that the DBA stops at the database layer, and stored procs are the interface between dev and DBA. Or, you're shipping a semi-custom product and the stored procs can be user-customized. Absent some considerations like that, I think the default for any Agile SW project these days is going to be ORM.
There is a _LOT_ to gain performance-wise if you do not have to transfer gigantic datasets out of the database to do simple stuff. Measure, and optimize if needed.
Precisely. Stored procedures can be used like a scalpel. It's an absolute guarantee that I/O inside the database server has more bandwidth than I/O between the database server and your middle tier. And you're not going to write faster, more efficient data join code in your middle tier than the database engine devs wrote in the database server. If you're transferring 1,000,000 rows of data to your middle tier to do a join, which I've certainly seen, you just ought to be flogged... It's like the guys who claim you should "write your own rollback code." Insanity.
FWIW, you don't need a stored proc to do a join on the database side. And if you're using a cursor for procedural logic, you've likely lost the performance war already. Giving up stored procedures is certainly not the same as giving up SQL or set based solutions.
Totally true, and I was actually arguing in favor of SQL more than arguing specifically for sprocs. But then having SQL embedded in your imperative code isn't necessarily the key to happiness, either, right? Which often leads into the whole ORM debate, which then leads to me pointing to performance comparisons between ORM-driven db access vs just learning how to use SQL. I've both seen and heard of systems where, say, Oracle consultants recommended keeping all the load possible off the database server, leading to heavy (and grossly expensive!) middleware with *abysmal* performance.
I want to cover both some pro and con issues with stored procs. We use them extensively with LedgerSMB, and our rule, with a few very specific exceptions, is "if it's a query, make it a stored proc."
Our reason for doing this was to facilitate cross-language query reuse. There isn't a better way to do this honestly.
In the end the question is always on the details. Well used, stored procedures make things a lot easier, and poorly used they make things a lot harder.
So on to the con side.
1. As traditionally used, stored procedures are brittle. Used alone, they create the possibility of adding bugs to your code in places you didn't expect, for no reason other than that the call syntax changed. There is far too much coupling between layers, and this causes problems.
2. Yes, it is possible to have intra-sproc SQL injection if you do any dynamic SQL. It is a bad thing to be over-confident here, so you need significant security experience in this area.
3. Changes to interfaces are somewhat problematic with stored procedures for reason #1 above, and this can become a very big nightmare if large numbers of client apps are involved.
The above are pretty hard to deny. They happen. Everyone, pro-SP and anti-SP alike, has probably had horror stories about these. The problems aren't unsolvable, but if you don't pay attention to them you can't solve them. (In LedgerSMB we use a service locator to dynamically build SP calls at run-time, avoiding the above issues entirely. While we are PostgreSQL-only, something similar could be done for other databases.)
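A hedged sketch of the service-locator idea in Python (the procedure and argument names are invented, and LedgerSMB's actual PostgreSQL implementation differs): the call string is built at run-time from the procedure's declared argument list, so callers pass named arguments and a change in call syntax doesn't silently break every call site.

```python
# Hypothetical sketch: building SP calls dynamically from a catalog of
# declared argument names (a real system could read these from the
# database's system catalogs instead of hard-coding them).

CATALOG = {
    "save_invoice": ["id", "amount", "customer"],
}

def build_call(proc, **kwargs):
    args = CATALOG[proc]                      # authoritative argument order
    placeholders = ", ".join("?" for _ in args)
    sql = f"SELECT * FROM {proc}({placeholders})"
    params = tuple(kwargs.get(a) for a in args)  # reordered to match
    return sql, params

sql, params = build_call("save_invoice", amount=99.5, id=7, customer="acme")
print(sql)     # SELECT * FROM save_invoice(?, ?, ?)
print(params)  # (7, 99.5, 'acme')
```

Because the caller's keyword arguments are matched to the catalog by name, adding or reordering a procedure's parameters only requires updating the catalog entry.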
On to the positives. Assuming you can solve the above problems, you get:
1. The possibility of enhanced clarity in set operations. This is particularly true if your query is very large or very flexible. This also leads to enhanced testability.
2. If I have a service locator already working in this area, I find stored procedures speed up the pace of development, because they free the application developer from db concerns and vice versa. This has some difficulties in doing right, but it is not that hard to do.
Oh and a few things you should almost never do in an SP:
- non-transactional logic. You sent the email that the order was shipped, but the transaction rolled back... or now you are waiting for the email server to come online before you can continue... or, worse, you roll back your transaction just because you can't reach the email server...
- lots of little queries loosely strung together, sprinkled with procedural logic...
Agree strongly, re: keeping non-transactional junk out of stored procedures. In that Email example, the Email message should be dropped into a queue and serviced asynchronously, anyway. Talk about setting yourself up for a massive performance hit and funky behavior under load, making your database transactions dependent on the response of your mail server? Yikes!