What purpose does the Repository Pattern have?

There’s currently a very interesting discussion taking place on Tobin Harris’ blog about the usage of Repositories. Although I added my initial thoughts to the post, I had a few additional thoughts on my journey home from work, and I’ve had a bit of a re-think on some of my views. Let me see if I can explain…

NB: I’ve repeated some of my comments from Tobin’s blog, so I apologise if you’ve read some of this already!

 

Should a repository return IQueryable?

I’ve mentioned this in a couple of my previous posts, and also on Tobin’s blog, but I’ve generally found that exposing IQueryable from your repositories adds complexity, because callers then have to manage the differing limitations of the underlying provider.

In my experiments with L2S and the Entity Framework, I’ve noticed that the support for IQueryable varies somewhat between provider implementations, perhaps due to the bloated interface of IQueryable. Other issues I’ve noticed in my travels include the lack of common IQueryable provider support for Expression.Invoke – a technique used in some approaches to combining predicates. This technique cannot be used by the Entity Framework for instance (although a general solution to that can be found here). All this makes exposing IQueryable from a repository needlessly complicated.

I’m sold on the idea that the primary purpose of the repository is to obtain a reference to the root of an aggregate. A repository should therefore behave like a collection of Entities. Contrary to this, repositories that return IQueryable are returning a query object. I think this probably goes against the original purpose of a repository.

Steve Burman also highlighted a key consideration as part of the discussion on Tobin’s blog: ideally, SQL execution should never occur outside the boundary of a repository. By returning an un-enumerated IQueryable object, we expose ourselves to exactly that, and open ourselves up to potential abuse.
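To make the concern concrete, here is a minimal sketch that uses LINQ to Objects as a stand-in for a real provider. The `Customer` class and both repository classes are hypothetical names invented for this illustration; they are not taken from any of the posts mentioned above.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string Name { get; set; } = "";
}

// Hypothetical repository that hands a query object to its callers.
public class LeakyCustomerRepository
{
    private readonly List<Customer> _store;

    public LeakyCustomerRepository(List<Customer> store) { _store = store; }

    // Nothing is executed here. The caller decides when (and how often)
    // the query actually runs, outside the repository boundary.
    public IQueryable<Customer> FindByName(string name)
    {
        return _store.AsQueryable().Where(c => c.Name == name);
    }
}

// Hypothetical repository that enumerates inside its own boundary.
public class ClosedCustomerRepository
{
    private readonly List<Customer> _store;

    public ClosedCustomerRepository(List<Customer> store) { _store = store; }

    // ToList() forces execution before the method returns.
    public IList<Customer> FindByName(string name)
    {
        return _store.Where(c => c.Name == name).ToList();
    }
}
```

With the first repository, a change to the underlying store made after `FindByName` returns is still visible when the caller finally enumerates the result; with the second, the result is fixed at the point the repository returned it. Against a database, that late enumeration is where the SQL would run.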

 

Where do specifications fit in with the Repository Pattern?

I generally believe specifications to be a domain concept, containing things like GoldCustomerSpecification and the like. In general I put most of my specifications alongside other domain artifacts, but only where the specification represents a domain concept. In the case of the GoldCustomerSpecification, it belongs in the domain since there is a “gold customer” in our ubiquitous language. This may be “all customers who’ve spent over £250 in the last month are considered to be gold customers” or something to that effect. I see these kinds of specifications as value objects, since they seem to fit Evans’ description of a value object:

An object that represents a descriptive aspect of the domain with no conceptual identity is called a VALUE OBJECT. VALUE OBJECTS are instantiated to represent elements of the design that we care about only for what they are, not who or which they are.
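As a rough sketch of what such a domain specification might look like, the following encodes the “spent over £250 in the last month” rule. The `ISpecification<T>` shape with a single `IsSatisfiedBy` method, the `Order` and `Customer` classes, and the `asOf` constructor parameter are all assumptions made for the example, not the interfaces used in my own code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed minimal specification contract.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

public class Order
{
    public Order(DateTime placedOn, decimal total)
    {
        PlacedOn = placedOn;
        Total = total;
    }

    public DateTime PlacedOn { get; }
    public decimal Total { get; }
}

public class Customer
{
    public List<Order> Orders { get; } = new List<Order>();
}

// Immutable and identity-free: two instances built with the same date
// describe exactly the same rule, which is what makes it read like a value object.
public sealed class GoldCustomerSpecification : ISpecification<Customer>
{
    private const decimal SpendThreshold = 250m;
    private readonly DateTime _asOf;

    public GoldCustomerSpecification(DateTime asOf) { _asOf = asOf; }

    public bool IsSatisfiedBy(Customer candidate)
    {
        // Sum the orders placed in the month leading up to _asOf.
        DateTime windowStart = _asOf.AddMonths(-1);
        decimal recentSpend = candidate.Orders
            .Where(o => o.PlacedOn > windowStart && o.PlacedOn <= _asOf)
            .Sum(o => o.Total);
        return recentSpend > SpendThreshold;
    }
}
```

Passing the reference date in through the constructor, rather than reading the clock inside the specification, keeps the rule deterministic and easy to test.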

There are also some specifications, however, that are better suited to live outside the domain. An example of this is search criteria objects. These application layer objects tend to be formed from a combination of domain specifications (like GoldCustomerSpec) with other ad hoc criteria (like name=”bob”). Steve Burman has proposed a good mechanism for creating ad hoc specifications, which can be found here.

Colin Jack reaffirmed my belief that these kinds of specifications are better left outside of the domain, and summarised that “ad hoc specs just speak to the language of the implementation not the language of the domain and so (in my view) are best used outside the domain.” I agree entirely; specifications like CustomerNameSpecification (things like name=”bob”) don’t form a part of our ubiquitous language, and therefore shouldn’t be part of our domain.
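The split between the two kinds of specification can be sketched as below. This is only an illustration under assumptions: the `ISpecification<T>` contract, the `AndSpecification<T>` combinator, and the simplified `Customer` with a precomputed `SpendLastMonth` property are hypothetical, and this is not the mechanism Steve proposed.

```csharp
using System;

// Assumed minimal specification contract.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

public class Customer
{
    public string Name { get; set; } = "";
    public decimal SpendLastMonth { get; set; }
}

// Domain layer: "gold customer" is part of the ubiquitous language.
public sealed class GoldCustomerSpecification : ISpecification<Customer>
{
    public bool IsSatisfiedBy(Customer candidate)
    {
        return candidate.SpendLastMonth > 250m;
    }
}

// Application layer: an ad hoc search criterion such as name="bob".
public sealed class CustomerNameSpecification : ISpecification<Customer>
{
    private readonly string _name;

    public CustomerNameSpecification(string name) { _name = name; }

    public bool IsSatisfiedBy(Customer candidate)
    {
        return string.Equals(candidate.Name, _name, StringComparison.OrdinalIgnoreCase);
    }
}

// Combines any two specifications; satisfied only when both are.
public sealed class AndSpecification<T> : ISpecification<T>
{
    private readonly ISpecification<T> _left;
    private readonly ISpecification<T> _right;

    public AndSpecification(ISpecification<T> left, ISpecification<T> right)
    {
        _left = left;
        _right = right;
    }

    public bool IsSatisfiedBy(T candidate)
    {
        return _left.IsSatisfiedBy(candidate) && _right.IsSatisfiedBy(candidate);
    }
}
```

A search service could then build `new AndSpecification<Customer>(new GoldCustomerSpecification(), new CustomerNameSpecification("bob"))` without the name criterion ever appearing in the domain assembly.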

 

Are generic repositories worthwhile?

Another question that was raised was whether, with the use of specifications in our repositories, we even need custom repositories anymore. Instead of having ICustomerRepository, IOrderRepository etc., could we just have IRepository<Customer> and IRepository<Order> instead?

I’ve been using generic repositories in my more recent work, and these repositories have taken the form:

    public interface IRepository<TAggregateRoot>
        where TAggregateRoot : class, IAggregateRoot
    {
        /// <summary>
        /// Find entities by specification.
        /// </summary>
        IList<TAggregateRoot> Find(ISpecification<TAggregateRoot> criteria);

        // More stuff here…
    }


For me, a great benefit of generic repositories like the one above (using LINQ) is the ability to instantly gain from being able to write the following for ANY entity that is an aggregate root:

     repository.Find(criteria); 

Since the criteria can be any specification, or composite of multiple specifications (domain specifications or otherwise), they are a very powerful tool.
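For a feel of how this plays out, here is a small in-memory sketch. My real implementations sit on top of LINQ providers; the `InMemoryRepository<T>` class, the `IsSatisfiedBy`-based `ISpecification<T>` contract, and the `Order` and `LargeOrderSpecification` types below are assumptions made purely so the example is self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface IAggregateRoot { }

// Assumed minimal specification contract.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

public interface IRepository<TAggregateRoot>
    where TAggregateRoot : class, IAggregateRoot
{
    IList<TAggregateRoot> Find(ISpecification<TAggregateRoot> criteria);
}

// Hypothetical in-memory stand-in for a LINQ-backed repository.
public class InMemoryRepository<TAggregateRoot> : IRepository<TAggregateRoot>
    where TAggregateRoot : class, IAggregateRoot
{
    private readonly List<TAggregateRoot> _items;

    public InMemoryRepository(IEnumerable<TAggregateRoot> items)
    {
        _items = items.ToList();
    }

    // The same Find works for every aggregate root and every specification.
    public IList<TAggregateRoot> Find(ISpecification<TAggregateRoot> criteria)
    {
        return _items.Where(item => criteria.IsSatisfiedBy(item)).ToList();
    }
}

public class Order : IAggregateRoot
{
    public decimal Total { get; set; }
}

public class LargeOrderSpecification : ISpecification<Order>
{
    public bool IsSatisfiedBy(Order candidate)
    {
        return candidate.Total >= 100m;
    }
}
```

Nothing in `InMemoryRepository<T>` knows about orders; supporting a new aggregate root only requires a new specification.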

Colin raised some very valid questions about generic repositories, and their ability to encapsulate all details of persistence not just querying. Specifically, how do you say that:

  • aggregate X is never deleted, or that it’s not deleted, it’s archived,
  • that aggregate Y is built up from multiple data sources,
  • that query Z is very expensive and needs to be done using SQL.

In my repository implementations (such as a LINQ to SQL implementation) I can quite readily add custom persistence-related logic, such as “aggregate X is never deleted” or “it’s not deleted, it’s archived”, and this persistence logic exists solely in the implementation of the repository. This logic can be implemented in the same way as with standard L2S, i.e. by using the LINQ to SQL designer to graphically configure SPROCs to execute in response to Insert/Update/Delete operations on our data model classes.

I’ve also created a mechanism in my repositories to implement fetching strategies (using a technique similar to the one suggested here), so specific entities can be pre-fetched where required. This allows me to optimise my data retrieval for specific use-cases.

 

Some further twists

Colin’s questions got me thinking. Perhaps we could have our cake, and eat it too? We could specify custom repositories that inherit behaviour from our default implementation. That way you can implement optimizations as and when you feel they are necessary. For example, the following repository may have an override for Find() that has some specific performance tweak:

public class CustomerRepository : DefaultRepository<Customer>, ICustomerRepository
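Fleshed out a little, the idea might look like the sketch below. Everything beyond the three type names in the declaration above is assumed for illustration: the in-memory `DefaultRepository<T>`, the `NameStartsWithSpecification` class, and the `OptimisedFindCalls` counter, which merely marks the spot where a real optimisation (hand-written SQL, caching and so on) would go.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed minimal contracts.
public interface ISpecification<T> { bool IsSatisfiedBy(T candidate); }
public interface IRepository<T> { IList<T> Find(ISpecification<T> criteria); }
public interface ICustomerRepository : IRepository<Customer> { }

public class Customer
{
    public string Name { get; set; } = "";
}

public class NameStartsWithSpecification : ISpecification<Customer>
{
    private readonly string _prefix;
    public NameStartsWithSpecification(string prefix) { _prefix = prefix; }

    public bool IsSatisfiedBy(Customer candidate)
    {
        return candidate.Name.StartsWith(_prefix, StringComparison.Ordinal);
    }
}

// Default behaviour shared by every repository.
public class DefaultRepository<T> : IRepository<T>
{
    private readonly List<T> _items;

    public DefaultRepository(IEnumerable<T> items) { _items = items.ToList(); }

    public virtual IList<T> Find(ISpecification<T> criteria)
    {
        return _items.Where(item => criteria.IsSatisfiedBy(item)).ToList();
    }
}

// Specialised repository: inherits the default and overrides only what it must.
public class CustomerRepository : DefaultRepository<Customer>, ICustomerRepository
{
    public CustomerRepository(IEnumerable<Customer> items) : base(items) { }

    // Placeholder counter so the override is observable in this sketch.
    public int OptimisedFindCalls { get; private set; }

    public override IList<Customer> Find(ISpecification<Customer> criteria)
    {
        OptimisedFindCalls++;          // a real performance tweak would live here
        return base.Find(criteria);
    }
}
```

Callers that depend on `ICustomerRepository` pick up the override automatically, while every other aggregate keeps using the default implementation.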

Now this is where it gets interesting. My generic repository interface also defines method signatures like:

void Insert(T item);
void Delete(T item);
 

Of course I may not ALWAYS want to allow inserts and deletes on my entities. Rather than throw a NotImplementedException for these cases, a better option is to use the Interface Segregation Principle to separate “updates” from “queries”. Perhaps use an ILoader interface and an IPersister interface to define this. Now here’s the crunch point… with generic repositories, are we even still using the repository pattern at all?!
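One possible shape for that split is sketched here. The generic `ILoader<T>` and `IPersister<T>` signatures, the `Price` type, and the in-memory repositories are all assumptions for the example; I haven’t settled on what these interfaces should really contain.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed minimal specification contract.
public interface ISpecification<T> { bool IsSatisfiedBy(T candidate); }

// Query side only.
public interface ILoader<T>
{
    IList<T> Find(ISpecification<T> criteria);
}

// Update side only.
public interface IPersister<T>
{
    void Insert(T item);
    void Delete(T item);
}

public class Price { public decimal Amount { get; set; } }
public class Customer { public string Name { get; set; } = ""; }

// Matches everything; handy for the usage below.
public class AnySpecification<T> : ISpecification<T>
{
    public bool IsSatisfiedBy(T candidate) { return true; }
}

// Read-only data: there is simply no Insert or Delete to call.
public class PriceRepository : ILoader<Price>
{
    private readonly List<Price> _items;
    public PriceRepository(IEnumerable<Price> items) { _items = items.ToList(); }

    public IList<Price> Find(ISpecification<Price> criteria)
    {
        return _items.Where(p => criteria.IsSatisfiedBy(p)).ToList();
    }
}

// Read-write aggregate: opts in to both interfaces.
public class CustomerRepository : ILoader<Customer>, IPersister<Customer>
{
    private readonly List<Customer> _items = new List<Customer>();

    public IList<Customer> Find(ISpecification<Customer> criteria)
    {
        return _items.Where(c => criteria.IsSatisfiedBy(c)).ToList();
    }

    public void Insert(Customer item) { _items.Add(item); }
    public void Delete(Customer item) { _items.Remove(item); }
}
```

With this arrangement an attempt to delete a price fails at compile time rather than at run time, because `PriceRepository` never promised an `IPersister<Price>` in the first place.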

It’d be great to get some thoughts on this!

About craigcav

Craig Cavalier works as a Software Developer for Liquid Frameworks in Houston Tx, developing field ticketing and job management solutions for industrial field service companies.

Posted on October 22, 2008, in Design Patterns. 4 Comments.

  1. @Craig

    This is a great summary of the discussion, plus some extra bits. Fetching strategies are something I need to get my head around more.

    Would be cool to see sample code for your fetching strategies.

    It seems the fetching strategy is dependent on the use-case – the needs of the particular usage context. If a repository exposes queries that are reusable, then you’d have to allow for different strategies through your repository interface?

    I guess I can see the appeal of having a generic repository, and then constructing queries in a per-use-case service class (or controller), as the service class can then control fetching and optimisation. It just means that data access code is no longer confined to the repository.

    I’ve seen the idea of taking a generic repository and specializing it on a per aggregate root basis. This way you get some stuff for free, and the flexibility to add additional features.

  2. @Tobin

    Thanks for your feedback! I’m hoping to add a couple of further posts to my customer search service series, one of which I’m planning will include a discussion about fetching strategies – I’ll try to make the full solution available, or at least provide a large chunk of sample code if you’re interested?

    I have tried to tackle fetching strategies in as transparent a way as possible. By default, no fetching strategies are applied to my repositories (YAGNI) and my Linq repositories use Lazy Loading to populate any collection properties if/when they are required.

    As you point out, fetching strategies are dependent on use-case. The way I’m currently managing this with generic repositories is based on a couple of assumptions:

    1) My LINQ repositories (such as IRepository) are associated with a DataContext or ObjectContext, whose recommended lifetime is a single unit of work (session per request).
    2) Each application service method represents a single use-case (such as finding customers), and represents a single unit of work also.
    3) Any service that is dependent on a repository will have this repository injected into it (DI).

    Based on these assumptions I can use an IOC tool (I use StructureMap, but any other tool could be used instead) to inject a repository that is configured with appropriate load options to pre-fetch entities. Since my repositories and contexts only have a lifetime spanning a single unit of work, differing load options can be applied for any other application service method.

    All of my fetching strategies belong in my persistence assemblies, alongside my repository implementations, but as you point out, it means that not ALL data access code is confined to the repository.

    I’m in two minds as to whether I see this as a problem. As far as SRP and the O/C principle go, I see this as absolutely correct – why should I have to alter my repository implementation (and risk breaking it) to add a fetching strategy optimisation? In addition, the repository still encapsulates all the data access from the calling classes – the repository is merely configured with the correct strategies using IOC, so the calling classes have no awareness of the underlying data access. I do however think that all this may in fact be over-cooking the issue, especially considering specialised repositories already manage this effectively!

    What do you think?

  3. Really nicely put together post.

    > By using the LINQ to SQL designer to graphically configure SPROCs
    > to execute in response to Insert/Update/Delete operations on our
    > data model classes.

    I see what you’re saying but I don’t use SProcs for any of this sort of logic; instead I’d want either an explicit Archive method on the repository or a delete method that you can look into to see the behavior.

    > Colin’s questions got me thinking. Perhaps we could have our
    > cake, and eat it too?

    That’s exactly what I’ve done in the past, actually maybe I should blog about it at some point but yeah you’ve hit the nail on the head:

    public interface CustomerRepository : ReadWriteRepository
    {
    }

    If you do this then it’s the ReadWriteRepository that has the Find(ISpecification criteria) method. Maybe we also have a ReadOnlyRepository, go nuts. Key thing is, as you say, these classes implement very specific interfaces so you’ve got ISP:

    public class CustomerRepository : ReadWriteRepository, ISupportDeletionOf
    {
        public void Delete(Customer toDelete)
        {
            // TODO: decide if this is allowed
            this.RepositoryHelper.Delete(toDelete);
        }
    }

    Notice my funky fresh interface naming there, and if there are issues with this code it’s because I wrote it in Notepad! 🙂

    You can see that whilst my base class delegates to another class, RepositoryHelper, I can also do it when needed in my repositories. Nice and DRY.

    Also we may want to load in readonly/reference/static data, for example our prices come in via a feed but we want to load them. Are they really aggregates? Well, they fit some of the requirements so let’s have a repository:

    public class PricingRepository : ReadonlyRepository
    {
        public void GetBySomeVeryInterestingAndDomainSpecificQuery(…)
        {

        }
    }

    Obviously the result is a repository with no Delete/Add methods.

    Will blog about this at some stage…

    > Fetching Strategies

    Interesting, I liked Udi’s idea but it didn’t fit for me as the interface based approach didn’t gel with repositories returning aggregates. Will need to look at it again at some stage but you could also say that you just use two totally separate entities in those cases (all but one read-only), dunno.

  4. Hey Cav,

    I’ve just found your series of Blogs on doing DDD using EF. I particularly liked seeing some real code examples as unfortunately a lot of what I read about DDD is pretty abstract with little in the way of concrete examples – especially ones that use the EF!

    You mentioned that you might be able to make some sample code available. Would that be possible?

    Many thanks, and it would be great to see some more posts from you,

    Jason
