.NET lacks a really good way to compare collections. To make matters worse, Google tends to be almost no help at all. Engineers are relegated to writing the same boilerplate code repeatedly for a simple, mundane task. I recently ran into this task twice in immediate succession, so I finally decided to create something reusable.
In the first instance I had to compare the lists in two list boxes: your typical “Available List” and “Selected List,” where you can drag and drop from one to the other. The database only cares about what has been added to and deleted from the “Selected List,” and ultimately, as a programmer, it is your task to decipher this from the user’s interactions. In the second instance, I had two completely heterogeneous data sets that represented the same data: one was a local database table of stocks held in my portfolio, and the other was a collection of stocks I actually held, provided by my broker’s API. I needed to ensure that these collections were synchronized and, if not, handle each discrepancy appropriately.
When approaching this solution, I first needed to identify my goals to make sure my component was both simple and reusable (the result of years in requirements-driven design). The solution should:
- Utilize IEnumerable<T> as the source for collections (Design pattern borrowed from LINQ)
- Be usable without needing to create a new class (Such as one that implements IComparer)
- Be able to compare collections of different types
- Be as strongly typed as possible
- Be simple, with overloads where necessary
Just to touch a little on the above points: IEnumerable<T> can represent a collection of any type, as long as that collection can be enumerated, which gives you a great amount of flexibility. It’s important to follow this paradigm, since what we’re ultimately doing is supplementing LINQ functionality. The IComparer interface is a bit of an artifact and feels passé given the advancements in recent C# versions; one might even venture to say that using it now would be bad design. What you’re really looking for is a function that compares two objects. With IComparer you’re forced into creating an object for this task, even though the comparison maintains no state. Before anonymous methods and lambda expressions, passing in an IComparer instance was the cleanest way of passing a reusable comparison as an argument. There are other ways to design this, but without getting into all that, we’re going to use lambda expressions. Lambda expressions are a great candidate for custom, on-demand comparisons without needing to create a custom class.
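To make the IComparer point concrete, here is a minimal sketch of the same case-insensitive check written both ways. The names IgnoreCaseComparer, ComparisonStyles, ContainsWithComparer and ContainsWithLambda are made up for illustration and are not part of the Compare extension described in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class-based comparison: a whole type for logic that holds no state.
public sealed class IgnoreCaseComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        return string.Compare(x, y, StringComparison.OrdinalIgnoreCase);
    }
}

public static class ComparisonStyles
{
    // The caller must supply an object that implements IComparer<string>.
    public static bool ContainsWithComparer(IEnumerable<string> items, string value, IComparer<string> comparer)
    {
        return items.Any(item => comparer.Compare(item, value) == 0);
    }

    // The caller supplies the comparison inline as a lambda; no extra class is needed.
    public static bool ContainsWithLambda<T>(IEnumerable<T> items, T value, Func<T, T, bool> isEqual)
    {
        return items.Any(item => isEqual(item, value));
    }
}
```

With the lambda version, a call site might read ComparisonStyles.ContainsWithLambda(names, "bob", (a, b) => string.Equals(a, b, StringComparison.OrdinalIgnoreCase)), which keeps the comparison next to the code that needs it.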
Performing Simple Comparisons
Let’s start with our first scenario: two list boxes containing names of users. The first contains a list of Available Users, and the second a list of Selected Users. You want to be able to move items from the list of Available Users to the list of Selected Users. When the user has finished making their selections, they’ll press Save. During Save we need to determine what they added and removed from the list of Selected Users. Consider the following WPF Window (Some code removed for brevity):
private List<String> _savedUsers = new List<string>();

void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
    AvailableUsers.Users.AddRange(new[] { "Bob", "Sally", "Jim", "Joe", "Mike", "Candace" });
}

private void btnSave_Click(object sender, RoutedEventArgs e)
{
    List<string> added = SelectedUsers.Users.Where(x => !_savedUsers.Contains(x)).ToList();
    List<string> removed = _savedUsers.Where(x => !SelectedUsers.Users.Contains(x)).ToList();
    _savedUsers.AddRange(added);
    _savedUsers.RemoveAll(x => removed.Contains(x));
    MessageBox.Show(String.Format(
        "Added: {0}\nRemoved: {1}",
        String.Join(",", added.ToArray()),
        String.Join(",", removed.ToArray())
    ));
}
Note: The listboxes are specialized components that expose a List<String> property named Users that maps to the underlying ItemsSource property. _savedUsers represents our persisted data store.
This is a very basic solution that’s not terribly easy to read or maintain. You can’t expect that junior developers or developers that aren’t fluent in lambda expressions will come up with such a concise solution. We need something simpler. Consider the following solution:
private void btnSave_Click(object sender, RoutedEventArgs e)
{
    //The saved list is the left (original) side and the current selection is the right side,
    //so Added holds newly selected users and Removed holds users that were deselected.
    CompareResult<String,String> compare = _savedUsers.Compare(SelectedUsers.Users);
    List<string> added = compare.Added.ToList();
    List<string> removed = compare.Removed.ToList();
    _savedUsers.AddRange(added);
    _savedUsers.RemoveAll(x => removed.Contains(x));
    MessageBox.Show(String.Format(
        "Added: {0}\nRemoved: {1}",
        String.Join(",", added.ToArray()),
        String.Join(",", removed.ToArray())
    ));
}
The Compare method is an extension to IEnumerable<T>. It returns a CompareResult object with two type parameters, one for the element type of each list being compared. The CompareResult object has the following properties:
- Removed - Items that are only contained in the left list (originating object) and not the right list (argument passed to Compare)
- Added - Items that are only contained in the right list (argument passed to Compare) and not in the left list (originating object)
- Equal - Items that are contained in both lists and are equal.
- Different - Items that are contained in both lists but are different.
- IsSame - True if both collections are identical
- TotalDifferences - The number of differences between the two collections.
Not only does the Compare extension give us code that’s slightly easier to read and write, we also get a lot more information about the differences between the two collections. You’ll notice that there are two properties representing items that are contained in both collections: Equal and Different. The Compare method lets you check which items each collection contains as well as whether the matching items are equal.
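For the plain same-type case, the Added and Removed lists can also be approximated with LINQ's built-in Except method. This is only a rough sketch for comparison: the SimpleDiff class and its method names are hypothetical, and Except uses set semantics, so duplicate items are collapsed and there is no equivalent of the Equal and Different properties.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SimpleDiff
{
    // Items present in the current list but missing from the saved list.
    public static List<string> Added(IEnumerable<string> saved, IEnumerable<string> current)
    {
        return current.Except(saved).ToList();
    }

    // Items present in the saved list but missing from the current list.
    public static List<string> Removed(IEnumerable<string> saved, IEnumerable<string> current)
    {
        return saved.Except(current).ToList();
    }
}
```

This covers additions and deletions of simple values, but it cannot tell an edited item from a replaced one, which is where the richer CompareResult becomes useful.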
Checking for Edits as well as Additions and Deletions
Assume our users were User objects as opposed to strings, and our UI allowed you to add, delete, and edit users before hitting save. First, we need to retrieve the data from the database and store it. That will serve as our original collection we’ll compare against to see what has changed. We’ll need another sandbox collection where we’ll let the user make changes without committing to the database. When the user presses the Save button, we’ll compare the sandbox collection with the original collection.
private Users _usersInDatabase;

public UserWindow()
{
    InitializeComponent();
    //"Fetch" users from database.
    _usersInDatabase = Users.GetAll(); //Original Collection
    UserList.Users = Users.GetAll(); //Sandbox Collection
}

private void btnSave_Click(object sender, RoutedEventArgs e)
{
    //First lambda is isEqual (same Name), second is isSame (same ID).
    CompareResult<User,User> changes =
        _usersInDatabase.Compare(UserList.Users, (x, y) => x.Name == y.Name, (x, y) => x.ID == y.ID);
    if (changes.IsSame)
        return;
    foreach (User deleted in changes.Removed)
        deleted.Delete(); //Delete from DB
    foreach (User added in changes.Added)
        added.Insert(); //Add to DB
    foreach (User edited in changes.Different.Values)
        edited.Update(); //Update in DB
    MessageBox.Show(String.Format(
        "Added: {0}\nEdited: {1}\nDeleted: {2}",
        String.Join(", ", changes.Added.Select(x => x.Name).ToArray()),
        String.Join(", ", changes.Different.Select(x => x.Key.Name + "->" + x.Value.Name).ToArray()),
        String.Join(", ", changes.Removed.Select(x => x.Name).ToArray())
    ));
}
The Different property is actually a Dictionary with the Key being a member of the original collection and the Value being a member of the altered collection that is different.
There are two delegates we can pass into Compare: IsEqual and IsSame. IsEqual checks the equality of the values; IsSame checks the identity of the values. In the above example, IsSame checks the IDs of the user objects. This identifies them as the Same. IsEqual checks the Name property. If the Name properties match, they are Equal.
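The distinction between identity and equality can be sketched for a single item. The User class below is an assumed shape (an ID and a Name), and EditDetection.Classify is a hypothetical helper written only to illustrate the two checks; it is not how Compare itself is implemented.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of a user: an identity (ID) plus an editable value (Name).
public sealed class User
{
    public int ID { get; private set; }
    public string Name { get; private set; }

    public User(int id, string name)
    {
        ID = id;
        Name = name;
    }
}

public static class EditDetection
{
    // Classifies one original user against the edited collection.
    public static string Classify(User original, IEnumerable<User> edited)
    {
        // Identity check (the role of isSame): find the counterpart by ID.
        User match = edited.FirstOrDefault(u => u.ID == original.ID);
        if (match == null)
            return "Removed";

        // Equality check (the role of isEqual): has the value changed?
        return match.Name == original.Name ? "Equal" : "Different";
    }
}
```

A user whose ID is still present but whose Name changed is classified as Different, while a user whose ID no longer appears is classified as Removed.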
Comparing Collections of Different Types
A less common scenario is the desire to compare collections of different types. This is most useful when synchronizing data from heterogeneous data sources, where the data is the same but is represented using different types.
List<String> names = new List<String>(new[] { "Bob", "Jim", "Sally" });
List<User> users = new List<User>(new[] {
    new User(1, "Bob"),
    new User(2, "Bill"),
    new User(3, "Jim"),
    new User(4, "Sally"),
    new User(5, "Jane")
});
//Users whose Name has no match in names: Bill and Jane
List<User> notInNames = users.Compare(names, (x, y) => x.Name == y).Removed.ToList();
A note of caution here: the implementation of the Compare method is not optimized for comparing large collections. If you’re synchronizing large datasets you’ll likely want to implement a custom synchronization method using an algorithm tailored to the nature of your data.
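One possible direction for larger collections is to compare by a hashable key rather than with an arbitrary delegate. The sketch below is an assumption about what such a tailored method could look like, not part of the Compare extension: KeyedDiff, Diff, leftKey and rightKey are hypothetical names, and it only works when both sides can be projected to a common key type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyedDiff
{
    // Finds items whose key appears on only one side, using hash lookups
    // instead of scanning the other collection for every item.
    public static void Diff<TLeft, TRight, TKey>(
        IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TKey> leftKey,
        Func<TRight, TKey> rightKey,
        out List<TLeft> removed,
        out List<TRight> added)
    {
        // Materialize once so lazy sources are not enumerated repeatedly.
        List<TLeft> lefts = left.ToList();
        List<TRight> rights = right.ToList();

        // Build a key set for each side; lookups are roughly constant time.
        HashSet<TKey> leftKeys = new HashSet<TKey>(lefts.Select(leftKey));
        HashSet<TKey> rightKeys = new HashSet<TKey>(rights.Select(rightKey));

        // Only in left: the key is missing from the right side.
        removed = lefts.Where(x => !rightKeys.Contains(leftKey(x))).ToList();
        // Only in right: the key is missing from the left side.
        added = rights.Where(x => !leftKeys.Contains(rightKey(x))).ToList();
    }
}
```

The trade-off is flexibility: a key selector cannot express every comparison that a Func<TLeft, TRight, bool> can, but it avoids scanning one collection once for every item in the other.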
Conclusion
We’ve introduced a method named Compare that can compare collections of almost any type by following the LINQ paradigm. Compare returns a CompareResult object that provides powerful, detailed information about the differences between the two collections.
using System;
using System.Collections.Generic;
using System.Linq;

namespace System.Linq
{
    public static class IEnumerableExtensions
    {
        public static CompareResult<T, T> Compare<T>(this IEnumerable<T> left, IEnumerable<T> right)
        {
            //EqualityComparer<T>.Default copes with null items, unlike x.Equals(y).
            return Compare(left, right, (x, y) => EqualityComparer<T>.Default.Equals(x, y));
        }

        public static CompareResult<T, TRight> Compare<T, TRight>(this IEnumerable<T> left, IEnumerable<TRight> right, Func<T, TRight, bool> isEqual)
        {
            return Compare(left, right, isEqual, isEqual);
        }

        public static CompareResult<TLeft, TRight> Compare<TLeft, TRight>(this IEnumerable<TLeft> leftList, IEnumerable<TRight> rightList, Func<TLeft, TRight, bool> isEqual, Func<TLeft, TRight, bool> isSame)
        {
            CompareResult<TLeft, TRight> results = new CompareResult<TLeft, TRight>();
            results.Removed.AddRange(leftList.Where(x => rightList.Count(y => isSame(x, y)) == 0));
            results.Added.AddRange(rightList.Where(x => leftList.Count(y => isSame(y, x)) == 0));
            foreach (TLeft left in leftList)
            {
                //Unmatched items are already in Removed. Testing with Any, rather than comparing
                //FirstOrDefault to null, keeps this correct when TRight is a value type.
                if (!rightList.Any(x => isSame(left, x)))
                    continue;
                TRight right = rightList.First(x => isSame(left, x));
                //Note: Dictionary.Add throws if the left list contains duplicate items.
                if (!isEqual(left, right))
                    results.Different.Add(left, right);
                else
                    results.Equal.Add(left, right);
            }
            return results;
        }
    }
}
using System;
using System.Collections.Generic;

namespace System.Linq
{
    public class CompareResult<TLeft,TRight>
    {
        #region Fields
        private List<TLeft> _onlyInLeftList = new List<TLeft>();
        private List<TRight> _onlyInRightList = new List<TRight>();
        private Dictionary<TLeft, TRight> _different = new Dictionary<TLeft, TRight>();
        private Dictionary<TLeft, TRight> _equal = new Dictionary<TLeft, TRight>();
        #endregion

        #region Properties
        public Dictionary<TLeft, TRight> Equal
        {
            get { return _equal; }
        }

        public Dictionary<TLeft, TRight> Different
        {
            get { return _different; }
        }

        /// <summary>
        /// Items in the left list not also in the right list
        /// </summary>
        public List<TLeft> Removed
        {
            get { return _onlyInLeftList; }
        }

        /// <summary>
        /// Items in the right list not in the left list
        /// </summary>
        public List<TRight> Added
        {
            get { return _onlyInRightList; }
        }

        public bool IsSame
        {
            get { return TotalDifferences == 0; }
        }

        public int TotalDifferences
        {
            get { return _onlyInLeftList.Count + _onlyInRightList.Count + _different.Count; }
        }
        #endregion

        #region Constructor
        public CompareResult()
        {
        }
        #endregion
    }
}


July 14, 2010 at 5:53 am
Nice post!
A nice way to keep your comparisons DRY.
One refactoring suggestion…
from
results.Removed.AddRange(leftList.Where(x => rightList.Count(y => isSame(x, y)) == 0));
results.Added.AddRange(rightList.Where(x => leftList.Count(y => isSame(y, x)) == 0));
to
results.Removed.AddRange(leftList.Where(x => rightList.Any(y => isSame(x, y)) == false));
results.Added.AddRange(rightList.Where(x => leftList.Any(y => isSame(y, x)) == false));
Which reveals the intention more and is better in performance(sometimes).
August 15, 2011 at 5:27 pm
Hello, I’ve really enjoyed this post. I need you to send me this project. I’ll be very grateful.