Category Archives: Design Patterns

What purpose does the Repository Pattern have?

There’s currently a very interesting discussion taking place on Tobin Harris’ blog about the usage of Repositories. Although I added my initial thoughts to the post, I had a few additional thoughts on my journey home from work, and I’ve had a bit of a re-think on some of my views. Let me see if I can explain…

NB: I’ve repeated some of my comments from Tobin’s blog, so I apologise if you’ve read some of this already!


Should a repository return IQueryable?

I’ve mentioned this in a couple of my previous posts, and also on Tobin’s blog, but I’ve generally found that exposing IQueryable from your repositories adds complexity, because callers then have to cope with the differing limitations of the underlying provider.

In my experiments with LINQ to SQL (L2S) and the Entity Framework, I’ve noticed that the support for IQueryable varies somewhat between provider implementations, perhaps due to the bloated interface of IQueryable. Another issue I’ve noticed in my travels is that not every IQueryable provider supports Expression.Invoke, a technique used in some approaches to combining predicates. The Entity Framework cannot use this technique, for instance (although a general solution to that can be found here). All this makes exposing IQueryable from a repository needlessly complicated.
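For anyone who hasn’t met it, this is roughly what the Expression.Invoke approach to combining predicates looks like. It’s only a minimal sketch: PredicateCombiner and its And method are names I’ve made up for illustration, and I’ve only run this style of code against LINQ to Objects, so it says nothing about how a particular provider will cope with the resulting expression tree.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical helper (not part of any library) illustrating Expression.Invoke.
public static class PredicateCombiner
{
    public static Expression<Func<T, bool>> And<T>(
        Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        // Invoke the right-hand lambda with the left-hand lambda's parameter.
        // This InvocationExpression node is the part some providers cannot translate.
        var invokeRight = Expression.Invoke(right, left.Parameters.Cast<Expression>());

        // Build: x => left.Body && right(x)
        return Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(left.Body, invokeRight),
            left.Parameters);
    }
}
```

Compiling the combined expression gives a delegate that is true only when both of the original predicates are true.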

I’m sold on the idea that the primary purpose of the repository is to obtain a reference to the root of an aggregate. A repository should therefore behave like a collection of Entities. Contrary to this, repositories that return IQueryable are returning a query object. I think this probably goes against the original purpose of a repository.

Steve Burman also highlighted a key consideration as part of the discussion on Tobin’s blog: ideally, SQL execution should never occur outside of the boundary of a repository. By returning an un-enumerated IQueryable object, we expose ourselves to exactly that, and open ourselves up to potential abuse.
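To make the point about un-enumerated queries concrete, here’s a small sketch. DemoCustomerRepository and its ExecutionCount counter are purely hypothetical, and an in-memory iterator stands in for the SQL that a real provider would run.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public Customer(string name) { Name = name; }
    public string Name { get; private set; }
}

// Hypothetical repository; the iterator below stands in for a database query.
public class DemoCustomerRepository
{
    private readonly List<Customer> _rows = new List<Customer>
    {
        new Customer("alice"),
        new Customer("bob")
    };

    // Incremented each time the "database" is actually read.
    public int ExecutionCount { get; private set; }

    private IEnumerable<Customer> ReadRows()
    {
        ExecutionCount++;
        foreach (var row in _rows)
        {
            yield return row;
        }
    }

    // Deferred: nothing is read until the caller enumerates the result.
    public IQueryable<Customer> QueryAll()
    {
        return ReadRows().AsQueryable();
    }

    // Eager: the read happens here, inside the repository boundary.
    public IList<Customer> ListAll()
    {
        return ReadRows().ToList();
    }
}
```

With QueryAll(), ExecutionCount stays at zero until the caller enumerates the query, and goes up again every time they do. With ListAll(), the one and only read happens inside the repository.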


Where do specifications fit in with the Repository Pattern?

I generally believe specifications to be a domain concept, covering things like GoldCustomerSpecification. In general I put most of my specifications alongside other domain artifacts, but only where the specification represents a domain concept. In the case of the GoldCustomerSpecification, it belongs in the domain since there is a “gold customer” in our ubiquitous language. This may be “all customers who’ve spent over £250 in the last month are considered to be gold customers” or something to that effect. I see these kinds of specifications as value objects, since they seem to fit Evans’ description of a value object:

An object that represents a descriptive aspect of the domain with no conceptual identity is called a VALUE OBJECT. VALUE OBJECTS are instantiated to represent elements of the design that we care about only for what they are, not who or which they are.

There are also some specifications, however, that are better suited to live outside the domain. An example of this is search criteria objects. These application layer objects tend to be formed from a combination of domain specifications (like GoldCustomerSpec) with other ad hoc criteria (like name=”bob”). Steve Burman has proposed a good mechanism for creating ad hoc specifications, which can be found here.

Colin Jack reaffirmed my belief that these kinds of specifications are better left outside of the domain, and summarised that “ad hoc specs just speak to the language of the implementation not the language of the domain and so (in my view) are best used outside the domain.” I agree entirely; specifications like CustomerNameSpecification (things like name=”bob”) don’t form a part of our ubiquitous language, and therefore shouldn’t be part of our domain.
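As a rough illustration of the two kinds of specification, here is a minimal sketch. The ISpecification<T> shape, the Customer properties and the £250 threshold are my own assumptions for the example rather than the real interfaces from my projects.

```csharp
public class Customer
{
    public string Name { get; set; }
    public decimal SpendLastMonth { get; set; }
}

// Assumed shape of a specification: a single yes/no question about a candidate.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

// Domain layer: "gold customer" is part of the ubiquitous language.
public class GoldCustomerSpecification : ISpecification<Customer>
{
    private const decimal MinimumMonthlySpend = 250m;

    public bool IsSatisfiedBy(Customer candidate)
    {
        return candidate.SpendLastMonth > MinimumMonthlySpend;
    }
}

// Application layer: an ad hoc search criterion with no domain meaning of its own.
public class CustomerNameSpecification : ISpecification<Customer>
{
    private readonly string _name;

    public CustomerNameSpecification(string name)
    {
        _name = name;
    }

    public bool IsSatisfiedBy(Customer candidate)
    {
        return candidate.Name == _name;
    }
}
```

The first class would sit alongside the other domain artifacts, while the second would live with the search screens that need it.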


Are generic repositories worthwhile?

Another question that was raised was whether, with the use of specifications in our repositories, we even need custom repositories anymore. Instead of having ICustomerRepository, IOrderRepository etc., could we just have IRepository<Customer> and IRepository<Order> instead?

I’ve been using generic repositories in my more recent work, and these repositories have taken the form:

    public interface IRepository<TAggregateRoot>
        where TAggregateRoot : class, IAggregateRoot
    {
        /// <summary>
        /// Find entities by specification.
        /// </summary>
        IList<TAggregateRoot> Find(ISpecification<TAggregateRoot> criteria);

        // More stuff here…
    }


For me, a great benefit of generic repositories like the one above (using LINQ) is that I can immediately query ANY entity that is an aggregate root, simply by passing a specification to Find().


Since the criteria can be any specification, or composite of multiple specifications (domain specifications or otherwise), they are a very powerful tool.
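The kind of call I have in mind can be sketched like this. Everything apart from the IRepository and ISpecification names is an assumption made for the example: the delegate-based Specification<T> helper, its And method and the InMemoryRepository are hypothetical stand-ins for a LINQ-backed implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IAggregateRoot { }

public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

// Hypothetical helper: wraps a delegate so specifications can be declared inline.
public class Specification<T> : ISpecification<T>
{
    private readonly Func<T, bool> _predicate;

    public Specification(Func<T, bool> predicate) { _predicate = predicate; }

    public bool IsSatisfiedBy(T candidate) { return _predicate(candidate); }

    // Composite: satisfied only when both specifications are satisfied.
    public Specification<T> And(ISpecification<T> other)
    {
        return new Specification<T>(c => IsSatisfiedBy(c) && other.IsSatisfiedBy(c));
    }
}

public interface IRepository<TAggregateRoot>
    where TAggregateRoot : class, IAggregateRoot
{
    IList<TAggregateRoot> Find(ISpecification<TAggregateRoot> criteria);
}

// In-memory stand-in for a real, LINQ-backed repository.
public class InMemoryRepository<TAggregateRoot> : IRepository<TAggregateRoot>
    where TAggregateRoot : class, IAggregateRoot
{
    private readonly List<TAggregateRoot> _items = new List<TAggregateRoot>();

    public void Add(TAggregateRoot item) { _items.Add(item); }

    public IList<TAggregateRoot> Find(ISpecification<TAggregateRoot> criteria)
    {
        return _items.Where(criteria.IsSatisfiedBy).ToList();
    }
}

public class Customer : IAggregateRoot
{
    public string Name { get; set; }
    public decimal SpendLastMonth { get; set; }
}

public static class Example
{
    // Combines a domain rule with an ad hoc criterion in a single Find() call.
    public static IList<Customer> GoldCustomersNamedBob(IRepository<Customer> customers)
    {
        var gold = new Specification<Customer>(c => c.SpendLastMonth > 250m);
        var namedBob = new Specification<Customer>(c => c.Name == "bob");
        return customers.Find(gold.And(namedBob));
    }
}
```

The same InMemoryRepository<T> and Find() would work unchanged for an Order or any other aggregate root, which is exactly the appeal.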

Colin raised some very valid questions about generic repositories, and their ability to encapsulate all details of persistence not just querying. Specifically, how do you say that:

  • aggregate X is never deleted, or that it’s not deleted but archived,
  • that aggregate Y is built up from multiple data sources,
  • that query Z is very expensive and needs to be done using SQL.

In my repository implementations (such as a LINQ to SQL implementation) I can quite readily add custom persistence-related logic, such as “aggregate X is never deleted” or “it’s not deleted, it’s archived”, and this persistence logic exists solely in the implementation of the repository. This logic can be implemented in the same way as standard L2S, i.e. by using the LINQ to SQL designer to graphically configure SPROCs to execute in response to Insert/Update/Delete operations on our data model classes.

I’ve also created a mechanism in my repositories to implement fetching strategies (using a technique similar to the one suggested here), so specific entities can be pre-fetched where required. This allows me to optimise my data retrieval for specific use-cases.


Some further twists

Colin’s questions got me thinking. Perhaps we could have our cake, and eat it too? We could specify custom repositories that inherit behaviour from our default implementation. That way you can implement optimizations as and when you feel they are necessary. For example, the following repository may have an override for Find() that has some specific performance tweak:

CustomerRepository : DefaultRepository<Customer>, ICustomerRepository

Now this is where it gets interesting. My generic repository interface also defines method signatures like:

void Insert(T item);
void Delete(T item);

Of course I may not ALWAYS want to allow inserts or deletes on my entities. Rather than throw a NotImplementedException in these cases, a better option is to use the Interface Segregation Principle to separate “updates” from “queries”, perhaps with an ILoader interface and an IPersister interface. Now here’s the crunch point… with generic repositories, are we even still using the repository pattern at all?!
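Here’s a minimal sketch of how that split might look. The generic ILoader<T> and IPersister<T> shapes and the two store classes are assumptions of mine for illustration, not a finished design.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes for the two halves of the repository interface.
public interface ILoader<T>
{
    IList<T> FindAll();
}

public interface IPersister<T>
{
    void Insert(T item);
    void Delete(T item);
}

// Read-write store: implements both halves.
public class InMemoryStore<T> : ILoader<T>, IPersister<T>
{
    private readonly List<T> _items = new List<T>();

    public IList<T> FindAll() { return _items.ToList(); }
    public void Insert(T item) { _items.Add(item); }
    public void Delete(T item) { _items.Remove(item); }
}

// Read-only store: there is no Insert or Delete to call,
// so nothing needs to throw NotImplementedException.
public class ReadOnlyStore<T> : ILoader<T>
{
    private readonly List<T> _items;

    public ReadOnlyStore(IEnumerable<T> items) { _items = items.ToList(); }

    public IList<T> FindAll() { return _items.ToList(); }
}
```

Code that only needs to read can then depend on ILoader<T> alone, and the compiler stops anyone calling Insert or Delete on an aggregate that doesn’t support them.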

It’d be great to get some thoughts on this!


Pluggable Dependencies demo part 1

I’ve been itching to build a “modern” application using an Agile approach with TDD since being inspired by Rob Conery’s MVC Storefront application, so I figured the small example I discussed in my last post would be a good starting point for me to take my first baby steps.

To briefly re-cap and pad out the initial requirement, the idea is to build a small app where a person (and their address details) can be stored. There will be a minimum requirement that a person must have a full name (forename and surname) and an address for that person to be “published” to a list of people on the system. Draft person records can be created with incomplete details, and stored until such a time when they can be completed and published.

Please keep in mind as you read this that this will be the first time I’ve *attempted to* use TDD in anger, so please feel free to confront me about any questionable decisions I make, and I’ll try and make time to adjust the app/discuss the issues further!

Right, without further ado I’ll begin!

The first unit test

In attempting to follow the TDD mantra, the first step is to code out my first test. I figure I’d start with trying to return a list of people (or Persons), and since I know from my last post that I want to program against an abstraction, I know this list will be of type IPerson. Since I also know I wish to filter this list, I want my return type to be IQueryable. So this leaves me with:

[TestMethod] // MSTest attribute, assumed from the Assert.IsInstanceOfType(value, type) call
public void Repository_ShouldReturn_Persons_AsQueryable()
{
    IDataContext rep = new InMemoryRepository();
    var query = from persons in rep.Repository<IPerson>()
                select persons;

    Assert.IsInstanceOfType(query, typeof(IQueryable<IPerson>));
    Assert.IsTrue(query.Count() > 0);
}

Of course, since none of these objects exist yet, the code will not build. With a little help from Resharper it’s easy to create some empty interfaces for IDataContext and IPerson and a basic implementation of InMemoryRepository to enable this code to build. These interfaces and classes now sit in the test project, within the InMemoryRepositoryTests code file, and are marked as internal. At this point, this doesn’t matter. There’ll be a time to sort this out later in the process.

Now at this point, I run my first test, and of course, it fails: my repository method throws a NotImplementedException. At this point I really want to cheat a little, since I already have an implementation I’d like to use for this method, but I know I would probably be burnt at the stake for such a blatant abuse of TDD, so I grit my teeth and continue. The TDD process tells me I should be adding the simplest implementation possible to make the test pass, so I write:

public IQueryable<T> Repository<T>()
{
    IList<T> items = new List<T>();

    return items.AsQueryable();
}

…and it fails again, this time with the message "Assert.IsTrue failed." Of course, there is no data being returned. Duh! This is easy to fix: I know the IQueryable<IPerson> I return needs to contain at least one person, so I add the following lines just above the return statement:

IPerson person = new Person();
items.Add((T)person); // assumed second line: the cast is needed because items is an IList<T>

This time, I run my test, and watch it pass!

Now I’m allowed to refactor. This is once again where Resharper comes in really handy; a few Alt+Enters and I’ve separated my objects and interfaces into separate files. I can then move IPerson and Person into a separate project called BusinessObjects (for want of a better name) and move IDataContext and InMemoryRepository into their own project (aptly named DataAccess).

A few more tweaks are needed (like changing the protection level of the classes from internal to public, and altering namespaces) and I hit my next point requiring consideration: since I am making reference to the Person object in the InMemoryRepository class, there is a one-way project dependency from my DataAccess project to my BusinessObjects project, and it appears that the dependency is avoidable. Since I want to avoid this dependency, I refactor the implementation to allow items to be inserted into an internally maintained list of objects:

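private readonly List<object> _inMemoryDataStore = new List<object>();

public IQueryable<T> Repository<T>()
{
    // Select only the stored objects that can be treated as a T.
    var query = from objects in _inMemoryDataStore
                where typeof(T).IsAssignableFrom(objects.GetType())
                select objects;
    return query.Select(o => (T)o).AsQueryable();
}

public void Insert<T>(T item)
{
    // The store holds plain objects, so DataAccess no longer needs to know about Person.
    _inMemoryDataStore.Add(item);
}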

I then add the Insert definition to my IDataContext so I can initialise the test with some data. Now that my code builds again, I can run my test again and watch it pass!

Phew! That’s all I’m going to cover in this post. In later posts I will try to cover off returning Address details related to a person, implementing repositories that persist data to a database, implementing a “draft” repository, and switching between the pluggable dependencies.

Program against an abstraction, not an implementation

I was asked by another developer today how I would go about enabling an application to record “temporary” incomplete data for a particular object (let’s say, a person) where usually that object would require particular attributes to contain values, and perhaps pass some other validation (let’s say, the person must have address details). Now my first thought was that I’d require different objects for the temporary person record and the final, published person record, and that each could have different levels of validation applied. At this point, my mind wandered to a blog post I’d read somewhere before…

In the end, I answered his question by pointing him to the blog that I had read. This blog discusses the Repository Pattern and goes on to propose a super-flexible repository interface that allows different implementations of the repository to be swapped out, the primary goal being the ability to run unit tests in isolation (without hitting the database). The blog also emphasises that the power of an abstraction is its transparent pluggability; the ability to swap in a “draft repository” is now made possible, and changes made in draft can be persisted to a different storage location than the live data.

It’s not a big leap from here to see how different implementations of an IPerson interface could have differing validation logic depending on the context (draft or published). By programming against the abstraction of the repository interface, and against abstractions of the returned types, we can (with minimal effort) swap out different implementations to provide the required functionality. I’d quite like to have a go at implementing this technique myself to see what I can come up with. Perhaps in future posts I’ll explore this technique further.
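As a first stab at what I mean, here’s a minimal sketch of two IPerson implementations with different validation rules. The property names, the IsValid method and the PersonBase class are assumptions for the example, and I’ve flattened the address to a single string to keep it short.

```csharp
public interface IPerson
{
    string Forename { get; set; }
    string Surname { get; set; }
    string Address { get; set; }
    bool IsValid();
}

// Shared state; each context supplies its own validation rule.
public abstract class PersonBase : IPerson
{
    public string Forename { get; set; }
    public string Surname { get; set; }
    public string Address { get; set; }

    public abstract bool IsValid();
}

// Draft context: incomplete details are acceptable.
public class DraftPerson : PersonBase
{
    public override bool IsValid()
    {
        return true;
    }
}

// Published context: forename, surname and address are all required.
public class PublishedPerson : PersonBase
{
    public override bool IsValid()
    {
        return !string.IsNullOrWhiteSpace(Forename)
            && !string.IsNullOrWhiteSpace(Surname)
            && !string.IsNullOrWhiteSpace(Address);
    }
}
```

Code written against IPerson doesn’t need to know which of the two it has been handed; a draft repository would return DraftPerson instances and a live one PublishedPerson instances.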