There’s currently a very interesting discussion taking place on Tobin Harris’ blog about the usage of Repositories. Although I added my initial thoughts to the post, I had a few additional thoughts on my journey home from work, and I’ve had a bit of a re-think on some of my views. Let me see if I can explain…
NB: I’ve repeated some of my comments from Tobin’s blog, so I apologise if you’ve read some of this already!
Should a repository return IQueryable?
I’ve mentioned this in a couple of my previous posts, and also on Tobin’s blog, but I’ve generally found that exposing IQueryable from your repositories adds complexity, because callers then have to cope with the differing limitations of the underlying provider.
In my experiments with L2S and the Entity Framework, I’ve noticed that the support for IQueryable varies somewhat between provider implementations, perhaps due to the bloated interface of IQueryable. Other issues I’ve noticed in my travels include the lack of common IQueryable provider support for Expression.Invoke, a technique used in some approaches to combining predicates. This technique cannot be used with the Entity Framework, for instance (although a general solution to that can be found here). All this makes exposing IQueryable from a repository needlessly complicated.
I’m sold on the idea that the primary purpose of the repository is to obtain a reference to the root of an aggregate. A repository should therefore behave like a collection of Entities. Contrary to this, repositories that return IQueryable are returning a query object. I think this probably goes against the original purpose of a repository.
Steve Burman also highlighted a key consideration as part of the discussion on Tobin’s blog: ideally, SQL execution should never occur outside the boundary of a repository. By returning an un-enumerated IQueryable object, we expose ourselves to exactly that, and open ourselves up to potential abuse.
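To make Steve’s point concrete, here is a rough sketch of the difference. It is not tied to any real provider: an in-memory list stands in for the database, and the CustomerRepository class, its Query and FindByName methods and the QueriesExecuted counter are all names I’ve made up for illustration. With a real provider such as L2S, the "read" below would be a SQL round trip.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string Name { get; set; }
}

// Hypothetical repository, for illustration only. An in-memory list
// stands in for the database so the sketch runs without a LINQ provider.
public class CustomerRepository
{
    private readonly List<Customer> _store;

    // Counts how many times the "database" has actually been read.
    public int QueriesExecuted { get; private set; }

    public CustomerRepository(IEnumerable<Customer> customers)
    {
        _store = customers.ToList();
    }

    // Iterator method: the body (and the counter increment) only runs
    // when a caller starts enumerating the result.
    private IEnumerable<Customer> ReadStore()
    {
        QueriesExecuted++;
        foreach (var customer in _store)
            yield return customer;
    }

    // Leaky version: nothing is read here. The read happens later,
    // wherever the caller decides to enumerate the query.
    public IQueryable<Customer> Query()
    {
        return ReadStore().AsQueryable();
    }

    // Contained version: the read happens inside the repository and the
    // caller receives an already materialised list.
    public IList<Customer> FindByName(string name)
    {
        return ReadStore().Where(c => c.Name == name).ToList();
    }
}
```

Calling repository.Query().Where(c => c.Name == "bob") leaves QueriesExecuted at zero; each later ToList() or foreach on that query bumps it by one, in code the repository knows nothing about. FindByName always does its single read before it returns.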
Where do specifications fit in with the Repository Pattern?
I generally believe specifications to be a domain concept, containing things like GoldCustomerSpecification and the like. In general I put most of my specifications alongside other domain artifacts, but only where the specification represents a domain concept. In the case of the GoldCustomerSpecification, it belongs in the domain since there is a “gold customer” in our ubiquitous language. This may be “all customers who’ve spent over £250 in the last month are considered to be gold customers” or something to that effect. I see these kinds of specifications as value objects, since they seem to fit Evans’ description of a value object:
An object that represents a descriptive aspect of the domain with no conceptual identity is called a VALUE OBJECT. VALUE OBJECTS are instantiated to represent elements of the design that we care about only for what they are, not who or which they are.
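As a minimal sketch of what I mean by a specification being a value object, consider the gold customer rule above. The shape of ISpecification (a single IsSatisfiedBy method), the SpendInLastMonth property and the Threshold parameter are assumptions made for this example, not a definitive design.

```csharp
using System;

public class Customer
{
    // Hypothetical property: total spend over the last month, in pounds.
    public decimal SpendInLastMonth { get; set; }
}

// Assumed shape of a specification: one predicate method.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

// A value object: immutable, no identity, and two instances with the
// same threshold are interchangeable.
public sealed class GoldCustomerSpecification
    : ISpecification<Customer>, IEquatable<GoldCustomerSpecification>
{
    public decimal Threshold { get; }

    public GoldCustomerSpecification(decimal threshold = 250m)
    {
        Threshold = threshold;
    }

    // "Spent over £250 in the last month" from the ubiquitous language.
    public bool IsSatisfiedBy(Customer candidate)
    {
        return candidate.SpendInLastMonth > Threshold;
    }

    // Equality is by value, not by reference.
    public bool Equals(GoldCustomerSpecification other)
    {
        return other != null && other.Threshold == Threshold;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as GoldCustomerSpecification);
    }

    public override int GetHashCode()
    {
        return Threshold.GetHashCode();
    }
}
```

We care about this object only for the rule it describes, which is why equality compares the threshold rather than the reference.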
There are also some specifications, however, that are better suited to live outside the domain. An example of this is search criteria objects. These application layer objects tend to be formed from a combination of domain specifications (like GoldCustomerSpec) with other ad hoc criteria (like name=“bob”). Steve Burman has proposed a good mechanism for creating ad hoc specifications which can be found here.
Colin Jack reaffirmed my belief that these kinds of specifications are better left outside of the domain, and summarised that “ad hoc specs just speak to the language of the implementation not the language of the domain and so (in my view) are best used outside the domain.” I agree entirely; specifications like CustomerNameSpecification (things like name=“bob”) don’t form a part of our ubiquitous language, and therefore shouldn’t be part of our domain.
Are generic repositories worthwhile?
Another question that was raised was whether, with the use of specifications in our repositories, we even need custom repositories any more. Instead of having ICustomerRepository, IOrderRepository etc. could we just have IRepository<Customer> and IRepository<Order> instead?
I’ve been using generic repositories in my more recent work, and these repositories have taken the form:
public interface IRepository<TAggregateRoot>
    where TAggregateRoot : class, IAggregateRoot
{
    /// <summary>
    /// Find entities by specification.
    /// </summary>
    IList<TAggregateRoot> Find(ISpecification<TAggregateRoot> criteria);

    // More stuff here…
}
For me, a great benefit of generic repositories like the one above (using LINQ) is that I can immediately write the following for ANY entity that is an aggregate root:
repository.Find(criteria);
Since the criteria can be any specification, or composite of multiple specifications (domain specifications or otherwise), they are a very powerful tool.
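To illustrate, here is a rough sketch of a domain specification combined with an ad hoc one and passed to a generic repository. It is deliberately simplified: the repository filters an in-memory list rather than translating the criteria through LINQ to a database, and IsSatisfiedBy, AndSpecification, InMemoryRepository and the Customer properties are names assumed for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IAggregateRoot { }

// Assumed shape of a specification: one predicate method.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

public interface IRepository<TAggregateRoot>
    where TAggregateRoot : class, IAggregateRoot
{
    IList<TAggregateRoot> Find(ISpecification<TAggregateRoot> criteria);
}

public class Customer : IAggregateRoot
{
    public string Name { get; set; }
    public decimal SpendInLastMonth { get; set; }
}

// Domain specification: part of the ubiquitous language.
public class GoldCustomerSpecification : ISpecification<Customer>
{
    public bool IsSatisfiedBy(Customer c) => c.SpendInLastMonth > 250m;
}

// Ad hoc specification: search criteria that live outside the domain.
public class CustomerNameSpecification : ISpecification<Customer>
{
    private readonly string _name;
    public CustomerNameSpecification(string name) { _name = name; }
    public bool IsSatisfiedBy(Customer c) => c.Name == _name;
}

// Composite: satisfied only when both inner specifications are.
public class AndSpecification<T> : ISpecification<T>
{
    private readonly ISpecification<T> _left;
    private readonly ISpecification<T> _right;

    public AndSpecification(ISpecification<T> left, ISpecification<T> right)
    {
        _left = left;
        _right = right;
    }

    public bool IsSatisfiedBy(T candidate) =>
        _left.IsSatisfiedBy(candidate) && _right.IsSatisfiedBy(candidate);
}

// In-memory stand-in for a real LINQ-backed repository.
public class InMemoryRepository<TAggregateRoot> : IRepository<TAggregateRoot>
    where TAggregateRoot : class, IAggregateRoot
{
    private readonly List<TAggregateRoot> _items;

    public InMemoryRepository(IEnumerable<TAggregateRoot> items)
    {
        _items = items.ToList();
    }

    public IList<TAggregateRoot> Find(ISpecification<TAggregateRoot> criteria) =>
        _items.Where(criteria.IsSatisfiedBy).ToList();
}
```

With this in place, "gold customers called bob" is just new AndSpecification<Customer>(new GoldCustomerSpecification(), new CustomerNameSpecification("bob")) handed to Find, and the same repository type works unchanged for any other aggregate root.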
Colin raised some very valid questions about generic repositories, and their ability to encapsulate all details of persistence, not just querying. Specifically, how do you say that:
- aggregate X is never deleted, or that it’s not deleted, it’s archived,
- aggregate Y is built up from multiple data sources,
- query Z is very expensive and needs to be done using SQL.
In my repository implementations (such as a LINQ to SQL implementation) I can quite readily add custom persistence-related logic, such as “aggregate X is never deleted”, or “it’s not deleted, it’s archived”, and this persistence logic exists solely in the implementation of the repository. This logic can be implemented in the same way as standard L2S, i.e. by using the LINQ to SQL designer to graphically configure SPROCs to execute in response to Insert/Update/Delete operations on our data model classes.
I’ve also created a mechanism in my repositories to implement fetching strategies (using a technique similar to the one suggested here), so specific entities can be pre-fetched where required. This allows me to optimise my data retrieval for specific use-cases.
Some further twists
Colin’s questions got me thinking. Perhaps we could have our cake, and eat it too? We could specify custom repositories that inherit behaviour from our default implementation. That way you can implement optimisations as and when you feel they are necessary. For example, the following repository may have an override for Find() that has some specific performance tweak:
public class CustomerRepository : DefaultRepository<Customer>, ICustomerRepository
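Fleshed out a little, that idea might look something like the sketch below. The default repository here is an in-memory stand-in, and the "performance tweak" (a lookup keyed by name) is a purely hypothetical example of an optimisation. The IsSatisfiedBy method, the Items field and the Name property on CustomerNameSpecification are also assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IAggregateRoot { }

// Assumed shape of a specification: one predicate method.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

public interface IRepository<TAggregateRoot>
    where TAggregateRoot : class, IAggregateRoot
{
    IList<TAggregateRoot> Find(ISpecification<TAggregateRoot> criteria);
}

public class Customer : IAggregateRoot
{
    public string Name { get; set; }
}

public class CustomerNameSpecification : ISpecification<Customer>
{
    public string Name { get; }
    public CustomerNameSpecification(string name) { Name = name; }
    public bool IsSatisfiedBy(Customer candidate) => candidate.Name == Name;
}

// Default behaviour shared by every aggregate root: a linear scan.
public class DefaultRepository<TAggregateRoot> : IRepository<TAggregateRoot>
    where TAggregateRoot : class, IAggregateRoot
{
    protected readonly List<TAggregateRoot> Items;

    public DefaultRepository(IEnumerable<TAggregateRoot> items)
    {
        Items = items.ToList();
    }

    public virtual IList<TAggregateRoot> Find(ISpecification<TAggregateRoot> criteria)
    {
        return Items.Where(criteria.IsSatisfiedBy).ToList();
    }
}

public interface ICustomerRepository : IRepository<Customer> { }

// Custom repository: inherits the default and overrides only what it
// needs to optimise.
public class CustomerRepository : DefaultRepository<Customer>, ICustomerRepository
{
    private readonly ILookup<string, Customer> _byName;

    public CustomerRepository(IEnumerable<Customer> customers) : base(customers)
    {
        _byName = Items.ToLookup(c => c.Name);
    }

    public override IList<Customer> Find(ISpecification<Customer> criteria)
    {
        // Hypothetical tweak: name searches use the lookup, not a scan.
        if (criteria is CustomerNameSpecification byName)
            return _byName[byName.Name].ToList();

        // Everything else falls back to the default behaviour.
        return base.Find(criteria);
    }
}
```

Callers that depend on ICustomerRepository or IRepository<Customer> never see the difference; the optimisation stays inside the repository.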
Now this is where it gets interesting. My generic repository interface also defines method signatures like:
void Insert(TAggregateRoot item);
void Delete(TAggregateRoot item);
Of course I may not ALWAYS want to allow inserts or deletes on my entities. Rather than throw a NotImplementedException for these cases, a better option is to use the Interface Segregation Principle to separate “updates” from “queries”. Perhaps use an ILoader interface and an IPersister interface to define this. Now here’s the crunch point…with generic repositories, are we even still using the repository pattern at all?!
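As a rough sketch of that ILoader / IPersister split: I’ve assumed both interfaces are generic and guessed at their members, so treat the signatures as illustrative rather than settled. The InMemoryRepository and ReportService classes are hypothetical too.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of a specification: one predicate method.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

// "Queries" side of the split.
public interface ILoader<T>
{
    IList<T> Find(ISpecification<T> criteria);
}

// "Updates" side of the split.
public interface IPersister<T>
{
    void Insert(T item);
    void Delete(T item);
}

public class Customer
{
    public string Name { get; set; }
}

public class CustomerNameSpecification : ISpecification<Customer>
{
    private readonly string _name;
    public CustomerNameSpecification(string name) { _name = name; }
    public bool IsSatisfiedBy(Customer candidate) => candidate.Name == _name;
}

// One implementation can offer both roles; an aggregate that must never
// be inserted or deleted would simply not implement IPersister<T>.
public class InMemoryRepository<T> : ILoader<T>, IPersister<T>
{
    private readonly List<T> _items = new List<T>();

    public IList<T> Find(ISpecification<T> criteria) =>
        _items.Where(criteria.IsSatisfiedBy).ToList();

    public void Insert(T item) => _items.Add(item);

    public void Delete(T item) => _items.Remove(item);
}

// A read-only consumer depends on ILoader<T> alone, so it cannot call
// Insert or Delete even by accident.
public class ReportService
{
    private readonly ILoader<Customer> _customers;

    public ReportService(ILoader<Customer> customers)
    {
        _customers = customers;
    }

    public int CountMatching(ISpecification<Customer> criteria) =>
        _customers.Find(criteria).Count;
}
```

No method ever needs to throw NotImplementedException here, because a type only advertises the operations it genuinely supports.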
It’d be great to get some thoughts on this!