I sat down to write a defense of stored procedures with specific regard to DAL/ORM solutions, but found myself painted into a corner.

Seven years ago, Ignia discarded four years of ongoing development into a class library we called "SQL Interface" in favor of stored procedures.  SQL Interface was a primitive Data Access Layer (DAL) and, to a lesser extent, Object Relation Mapper (ORM); given a couple parameters and a specific naming convention, it would map fields or variables to one or more database tables and automatically handle all CRUD (Create, Read, Update, Delete) operations.  Needless to say, it saved a ton of time and I had quite an attachment to it.

Today, a lot of DAL/ORM solutions are pursuing a similar approach.  Most notably, SubSonic and Microsoft's latest contribution, LINQ.  Both of these have cursory support for stored procedures (sprocs), admittedly, but both require as much if not more configuration than hand-coding database access with the EntLib DAAB; in other words, it's notably faster and easier to use dynamic SQL.  And, really, that makes sense; tables, logically, mirror relational objects better than stored procedures. As a result, there are few solutions (such as .netTiers) which offer full stored procedure support – and even those often rely on dynamic SQL inside the stored procedures through use of the execute statement.

So why the concern?  What did stored procedures offer us and why do we cling to them?  After reviewing the classic benefits, perhaps not as much as some (including myself) claim.

Defense #1: Performance

The performance benefit of stored procedures is often contested and as frequently misunderstood.  The typical argument against this benefit is based on the fact that Microsoft SQL Server caches the execution plan of ad hoc queries.  This is sometimes illustrated by stress/performance tests that demonstrate that executing n number of consecutive calls to a stored procedure is no faster than executing the same number of calls to dynamic SQL.  And this is true.  Seasoned testers, however, will immediately notice the flaw in this logic.  The cache for stored procedures is saved as part of the compiled code and persisted permanently; the cache for dynamic SQL, however, is stored in memory.  If your application is submitting the same twenty calls consistently, you'll probably find little performance benefit to stored procedures.  If, however, your server fields thousands of queries intermixed with one another that vary significantly in syntax then stored procedures will usually perform considerably faster.

However, when it comes to DAL/ORM solutions there's a catch.  Most of the operations that a DAL/ORM is performing are straight selects against a single table to pull either all records or a single record by its primary key (PKC).  These are queries that require minimal optimization and whose execution plan can be quickly compiled by the server.  The few cases where a DAL/ORM is doing something more complex is usually in cases of searches - which are a classic example of where stored procedures require either dynamic SQL or the "With Recompile" statement, both of which put it on par with dynamic SQL. 

For custom queries that truly take advantage of SQLs more complex features, such as grouping, joining, aggregating, etc stored procedures still provide considerable performance benefit - but in an DAL/ORM these are the exception, not the rule, and most DAL/ORM solutions provide some method of handling these cases.

Defense #2: Security

There are two primary arguments for security in stored procedures.  The first pertains to parameterization and the ability to avoid SQL injection hacks; this is a very real concern, but one that can easily be handled at the DAL/ORM side since the code is usually generated.  The second pertains to object level access.  Ad hoc SQL and thereby dynamic SQL rely on nearly full access to a database.  If a web server is compromised then the hacker has full access to your data.  Oops.  Stored procedures help ensure that only data intended for external consumption can be accessed.  For example, an e-commerce site may allow credit card data to be submitted via a set/insert/update sproc, but ensure that only the last four digits are returned during read operations; if your web server is compromised, this data is still protected. 

This is a serious issue.  I consider it the most significant benefit of stored procedures.  However, the very nature of a DAL/ORM violates this principle.  Even if stored procedures are used, most code generation tools will automatically provide full read/write access to each individual object - regardless of whether or not it's needed.  A few provide the ability to disable a level of access for specific objects, but they rarely provide column level security - and in the credit card example, even column level security wouldn't satisfy the requirement.

The bottom line?  This will ultimately depend on the nature of your data and your application.  Unfortunately this is one case where (regardless of the stored procedure vs. dynamic SQL debate) you are going to need to avoid a generated DAL/ORM, at least for highly privileged information - at least based on the capabilities of most DAL/ORM solutions commonly used today.

Conclusions

There are other benefits to stored procedures over dynamic or ad hoc SQL.  Erland Sommarkog's definitive article "The Curse and Blessing of Dynamic SQL" is one of the most comprehensive articles I've found on this matter.  There are dozens of others - as well as countless counter arguments from dynamic SQL enthusiasts.  My conclusion is that this is highly dependent on the particular purpose of your database; if your database is simply a persistent object cache that will be accessed by a generated Data Access Layer then stored procedures provide only minimal benefit.  If, however, you are working with sensitive data or require fast performance of complex queries then I consider stored procedures a necessity.

Saturday, April 14, 2007 1:59 PM
Filed Under [ SQL, ORM/DAL, ]

Comments

Gravatar
# re: Sprocs with regard to DALs
Posted by Hedley Robertson on 4/11/2008 8:40 AM
Heh, this is something I have been struggling with quite recently. Lately I have taken issue with some of the dynamic practices that the Rails community / philosophy encourages and pushed quite a bit of logic into the db, mostly in the form of constraints and triggers. I have two code paths into my db now, the main ActiveRecord ORM that most public facing Rails pages leverage, and a 'backend' set of triggers and long running procedures that I am calling with some background DRb schedulers. Thanks to PL/Ruby I don't even need to change languages when I write my Sprocs (PostgreSql is amazingly easy to extend with new language support)

http://www.robbyonrails.com/articles/2005/09/27/the-bitter-sweet-taste-of-agnostic-database-schemas

Here Robby talks about the role of Application DB vs Integration DB. My current requirements are more in the Integration DB department, thus my desire and move towards a sane database. Yes, I lose some agility... I cannot respond to changes as quickly. And, OMG, I have to violate DRY here and there, but my data is sane and will stay that way, period.

Personally I agree with your analysis, although I would ignore point #1 until it became a problem. The approach depends on the requirements. Is the database merely a persistence store for one web application to leverage? In this case I would just let my ORM manage security, it isn't that hard to prevent the SQL injections if you have one code path. Developer hours are more expensive then CPU hrs so if the requirements allow me get away with an ORM and dynamic Sql, sweet! If I feel like the database is ever, at any point, going to be used in a way that promotes it past 'glorified web persistence store', then I start looking at the db as an application I am now forced to maintain alongside my main code branch.
Gravatar
# re: Sprocs with regard to DALs
Posted by Jeremy on 4/11/2008 11:25 AM
Totally agreed re: Integration (DAL) vs. Application (ORM) being the driving factor in determining which approach to use. I actually posted about that in a more recent post, but not in such succinct terms; that's a good way of generally segmenting the decision tree.

The security issue, however, can't be fully addressed by the DAL. I consider SQL injections a non-issue (as you say, they're easy to manage). However, web servers are generally considered the weakest link in the security chain; if someone compromises your web server they can easily bypass your DAL. For our sweepstakes, for example, the web application doesn't have read access to any PII; it can only submit data. In this regard, we drive security at the lowest possible level and treat all clients as untrusted.

But the main point remains: depending on this business requirements this level of security may be a non-issue. For example, on a social networking site any data that can be submitted can also be read (except, possibly, the password) and, therefore, the web application requires this level of access and must manage security. Further, in that case your weakest link becomes the web application as opposed to the web server (which is expected).

Post Comment






 

Please add 5 and 1 and type the answer here:
*