Monday, July 27, 2009

Entity Framework N-Tier Anti-Patterns and DevForce

Danny Simmons, dev manager for Microsoft’s Entity Framework, wrote an article published in the June MSDN Magazine called “Entity Framework: Anti-Patterns To Avoid In N-Tier Applications”. It’s a wonderful place to discover some of the pitfalls of developing your own n-Tier infrastructure. You will learn that such development won’t come cheap. My take-away: “don’t be foolish; don’t write this yourself.” Of course fools abound and you’re welcome to join them.

I arrive at the same conclusion from reading Julie Lerman’s essential book, Programming Entity Framework.

Danny could have shouted: “Don’t do that!”  It would just piss you off. Instead, he begins as if this were something you might do and he’s only going to guide you in doing it: “I will try to set a foundation on which you can build for success in this part of your applications”. Yeah, right.

I encourage you to read his article. Look at the six “anti-patterns” he identifies – the shortcuts that you are likely to take in your naive attempt. See for yourself not only the deep and dangerous traps … but also the sheer difficulty and complexity of avoiding those traps.

Go ahead, read it. I’ll be here when you get back.

So what do you do now? You have an application that crosses tiers. I know you do. If it’s a Silverlight application, you know you do.

My answer … wait for it … is DevForce. We solved this one seven years ago and have been enriching our product over many years. Try it and you will be relieved of all of Danny’s anti-patterns (and all of the gyrations described by Julie too). I’m not suggesting it addresses every n-Tier problem; nothing does. But you won’t succumb to these traps and you won’t waste your time – or your employer’s time – fighting your way through the cross-tier application jungle.

I cannot close without just a bit of criticism. I can’t give all of Danny’s anti-patterns equal weight and I would add some different ones of my own. In one case, I just think he’s wrong. 

Under the heading “Two Tiers Pretending to be Three”, he seems to say that it is imprudent to attempt cross-tier queries and updates.  He is literally saying that he thinks it is improper for the Entity Framework to do so but, in the context of the rest of the article, you are led to believe that it is imprudent for you to do so.

It is certainly difficult. But we still have to do it in some fashion or another. That’s why there is an ADO.NET Data Services and a RIA Services. These technologies, in their somewhat crippled way, facilitate client side queries and updates. DevForce does an even better job. My point is that they fill a need and they are capable of doing so without wrecking your application.

Danny distorts the original question. He mutates from “Why can't you make the Entity Framework serialize queries across tiers?” to “Why not just expose the database directly?” This is not what people are asking for. They want to be able to compose a query on the client – compose it in entity terms, not database terms – and send that query over the wire in the full expectation that they will get entities back.

Of course he is right to caution against “introducing a mid-tier that is simply a thin proxy for the database”. But this warning, like the others, is just more of the same advice: do a good job of writing the middle-tier – don’t hack it together – and it’s going to be hard. Point taken. Use DevForce … or RIA Services or CSLA if you prefer.

Danny tries to predicate his argument on Fowler’s “First Law of Distributed Object Design”. That won’t fly. Worse, Danny, like many others, misrepresents Fowler’s “law”. Let’s clear that up.

The short statement of the law is “Don’t distribute your objects”. What the heck does that mean? You’ll have to look at the chapter in Patterns of Enterprise Application Architecture (highly recommended) … or read the article he wrote in Dr. Dobb’s that covers the same ground. You will see that it is a jeremiad against calling methods on a remote object.

Fine. I get it. Don’t call Employee.CalculateSalary() and have that be a cross-tier call. Fowler is inveighing against a design which would present the fiction of an object instance that is half on one tier and half on another. It is distributed in the sense that it has one foot on the server and another on the client. Don’t do that. I agree.

But we are interested in an entirely different notion of “distributed objects”. The notion I have in mind is more akin to “mobile objects”. 

When we query across tiers we intend to retrieve persistent state and reconstitute an object with that state on the client. That object can have behavior to go with the state; it should have behavior.

Make no mistake. We are executing locally on the client when we invoke CalculateSalary() on it; there is no ambiguity about this at all. Now it also may be (as it is in DevForce and RIA Services and CSLA) that the same type is available on the middle tier; that means we can reconstitute an instance with persistent state there – on the server – and invoke CalculateSalary() on it there – on the server. There is no ambiguity in this case either. The instance exists on a single tier and the method executes on the same tier as the instance. No instance straddles tiers … or even pretends to straddle tiers. No instance violates Fowler’s “law”.

This is what we expect from a cross-tier query. It is a perfectly reasonable expectation. Danny has announced that Entity Framework will not support that kind of query. That’s ok too. That’s a choice. Let’s not dress it up into some kind of principle.

My rant is over. Please read what Danny has to say … in this article and elsewhere. You will always be glad that you did.

Thursday, July 23, 2009

DevForce Predicate Builder

One in the “DevForce is the Shiznit” series of boastful posts about our product in which I describe a cool feature that may even interest those who have yet to discover the wonders of DevForce.

The time comes when you want to construct a LINQ “Where” clause programmatically. It should be easy. It turns out to be more challenging … until you use the DevForce PredicateBuilder.

The DevForce PredicateBuilder shares a common purpose and name with the Albahari brothers’ PredicateBuilder described here and here. I would be remiss if I failed to note their good work and inspiration. I’ll cover important differences in our solutions towards the end of this post.

February 2011 Update:

Much has happened since this post was written. PredicateBuilder has been joined by a number of related components such as PredicateDescription and SortDescription. Learn more at the DevForce Resource Center (DRC).

Intersoft is also introducing their UXGridView as I write. UXGridView can be bound to their QueryDescriptor component in the ViewModel that brings QBE functionality to the grid. The QueryDescriptor is backed by a DevForce "Data Provider" that uses these features. Bill Gower wrote a tutorial about it.

Imagine a product search interface. The user can enter words in a “Name Search” text box. Your program should find and display every product that contains any of the words entered by the user. You don’t know how many words the user might enter. What do you do?

The solution would be easy if you knew the user would enter exactly one word.

Let’s illustrate with the Northwind database. Assume “Manager” is some apparatus for producing an IQueryable object tied to some persistence apparatus; when we call ToList on the query object, the apparatus uses the query to fetch data from the database.

var word = "Sir";
var q = Manager.Products
.Where(p => p.ProductName.Contains(word));
var results = q.ToList();// returns 3 products

Of course you don’t know how many words the user will enter. You want to be prepared for more than one so you write this too-simple helper method that returns an array of words from the text entered in the text box:

private IEnumerable<String> GetWords(string phrase) {
return phrase.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
}

Now all you have to do is replace the “Where” clause with a sequence of OR clauses. You’ll want to construct it by iterating over the words. Go ahead and write it. I’ll wait.

Having trouble? I’ll give you the user’s input: “Sir Cajun Louisiana”. Did that help?

You will probably come up with the following:

var q = Manager.Products
.Where(p =>
p.ProductName.Contains("Sir") ||
p.ProductName.Contains("Cajun") ||
p.ProductName.Contains("Louisiana")
);

var results = q.ToList(); // returns 6 products

This is ultimately what the lambda expression must look like.

Of course you cannot demand that the user enter exactly three words any more than you can insist she enter exactly one. You want to construct the lambda dynamically based on the actual number of words entered. Sadly there is no obvious way of constructing a lambda expression dynamically.
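To give a feel for what "no obvious way" means, here is a rough sketch of assembling that OR chain by hand with the raw System.Linq.Expressions API. It is my illustration, not DevForce code. To keep it runnable without a database it filters a plain list of product-name strings rather than Product entities, and the ContainsAny helper name and the sample names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class HandBuiltPredicate
{
    // Builds: name => name.Contains(w1) || name.Contains(w2) || ...
    // (hypothetical helper; works on strings, not on a Product entity)
    public static Expression<Func<string, bool>> ContainsAny(IEnumerable<string> words)
    {
        var name = Expression.Parameter(typeof(string), "name");
        var contains = typeof(string).GetMethod("Contains", new[] { typeof(string) });

        Expression body = null;
        foreach (var word in words)
        {
            // One "name.Contains(word)" node per word
            Expression test = Expression.Call(name, contains, Expression.Constant(word));
            // Chain each new test onto the previous ones with a short-circuit OR
            body = body == null ? test : Expression.OrElse(body, test);
        }

        // With no words at all, fall back to a predicate that matches nothing
        return Expression.Lambda<Func<string, bool>>(
            body ?? Expression.Constant(false), name);
    }

    public static void Main()
    {
        // Sample data standing in for product names
        var names = new[] { "Sir Rodney's Scones", "Chef Anton's Cajun Seasoning", "Chai" };

        var predicate = ContainsAny(new[] { "Sir", "Cajun", "Louisiana" });
        var hits = names.AsQueryable().Where(predicate).ToList();

        Console.WriteLine(hits.Count); // 2
    }
}
```

It works, but every node is built with a factory call and a reflected MethodInfo, and none of it reads like the lambda you had in mind. That is the gap a predicate builder is meant to close.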

The DevForce PredicateBuilder can help you build predicates dynamically.

What’s a “predicate”?

A “predicate” is a function that examines some input and returns true or false.

The code fragment, “p => p.ProductName.Contains("Sir")”, is a predicate that examines a product and returns true if the product’s ProductName contains the “Sir” string.

The CLR type of the predicate in our example is:

Func<Product, bool>

Which we can generalize to:

Func<T, bool>

We almost have what we want. When the compiler sees an example of this kind of thing it immediately resolves it into an anonymous delegate. We don’t want the delegate. We need a representation that retains our intent and postpones the resolution into a delegate until the last possible moment. We need an expression tree that we can analyze and morph if necessary. We want a Predicate Expression:

Expression<Func<Product, bool>>

This is exactly what the DevForce “Where” extension method demands.

public static IEntityQuery<T> Where<T>(
this IEntityQuery<T> source1, Expression<Func<T,bool>> predicate)
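A quick way to see the difference between the delegate and the Predicate Expression is to declare the same lambda both ways. This is only a sketch: it tests plain strings instead of Product entities (my stand-in so that it runs anywhere), and the variable names are invented for the illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class DelegateVersusExpression
{
    public static void Main()
    {
        // Compiled straight to a delegate: callable, but opaque
        Func<string, bool> asDelegate = name => name.Contains("Sir");

        // Same lambda text, kept as a tree that can be inspected and rewritten
        Expression<Func<string, bool>> asExpression = name => name.Contains("Sir");

        Console.WriteLine(asDelegate("Sir Rodney's Scones"));   // True

        // The tree remembers that the body is a method call on the parameter
        Console.WriteLine(asExpression.Body.NodeType);          // Call
        Console.WriteLine(asExpression.Parameters[0].Name);     // name

        // Resolution into a delegate is postponed until we ask for it
        Func<string, bool> compiled = asExpression.Compile();
        Console.WriteLine(compiled("Chai"));                    // False
    }
}
```

The only difference in the source is the declared type on the left-hand side; the compiler decides from that whether to emit a delegate or an expression tree.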

The methods of the static PredicateBuilder class combine two or more Predicate Expressions into a single Predicate Expression that we can pass to this “Where” method.

Let’s stick with the example and see one of those methods in action. Let’s write a little method to produce an IEnumerable of Predicate Expressions, one expression for each given word.

private IEnumerable<Expression<Func<Product, bool>>> ProductNameTests(
IEnumerable<String> words) {
foreach (var each in words) {
var word = each;
yield return p => p.ProductName.Contains(word);
}
}

The result is an IEnumerable of Predicate Expressions about the Product entity. The body is an iterator that returns a Predicate Expression for each word. The expression is exactly the same as the first predicate we wrote when we knew only one word.

If we give it the three-word input in our example, we’ll get an IEnumerable of three Predicate Expressions, each looking for one of the words in the product’s ProductName.

We want to OR these Predicate Expressions together so we will use this static method of PredicateBuilder:

public static Expression<Func<T, bool>> Or<T>(
params Expression<Func<T, bool>>[] expressions)

You see it takes an array (a params array to be precise) of Predicate Expressions. We will convert the output of our ProductNameTests into an array before giving it to this PredicateBuilder method. The final code looks like so:

var words = GetWords("Sir Cajun Louisiana");
var tests = ProductNameTests(words).ToArray();
if (0 == tests.Length) return;
var productNamePredicate = PredicateBuilder.Or(tests);
var q = Manager.Products.Where(productNamePredicate);
var results = q.ToList(); // returns 6 products

In plain language:

  • Split the user’s search text into separate words

  • Generate an array of Predicate Expressions that look for each word in the ProductName

  • Skip the query if there are no clauses … because there are no words

  • Ask “PredicateBuilder.Or” to combine the tests into a single Predicate Expression

  • Run it to get results.

Other PredicateBuilder Methods


There are 6 interesting methods.

Method    Syntax by example
Or        p1.Or(p2)
Or        PredicateBuilder.Or(p1, p2, p3 .. pn)
And       p1.And(p2)
And       PredicateBuilder.And(p1, p2, p3 .. pn)
True      PredicateBuilder.True()
False     PredicateBuilder.False()

“p” = Predicate Expression.

All expressions must be of the same type (e.g., Product).

Examples:

Expression<Func<Product, bool>> p1, p2, p3, p4, bigP;

// Sample predicate expressions

p1 = p => p.ProductName.Contains("Sir");

p2 = p => p.ProductName.Contains("Cajun");

p3 = p => p.ProductName.Contains("Louisiana");

p4 = p => p.UnitPrice > 20;

bigP = p1.Or(p2); // Name contains "Sir" or "Cajun"

// Name contains any of the 3
bigP = p1.Or(p2).Or(p3);

bigP = PredicateBuilder.Or(p1, p2, p3); // Name contains any of the 3

bigP = PredicateBuilder.Or(tests); // OR together some tests

bigP = p1.And(p4); // "Sir" and price > 20

// Name contains "Cajun" and "Louisiana" and the price > 20

bigP = PredicateBuilder.And(p2, p3, p4);

bigP = PredicateBuilder.And(tests); // AND together some tests

// Name contains either “Sir” or “Louisiana” AND price > 20
bigP = p1.Or(p3).And(p4);

bigP = PredicateBuilder.True<Product>().And(p1);// same as p1

bigP = PredicateBuilder.False<Product>().Or(p1);// same as p1

// Not useful

bigP = PredicateBuilder.True<Product>().Or(p1);// always true

bigP = PredicateBuilder.False<Product>().And(p1);// always false

Observations

Notice that one each of the OR and the AND methods is a Predicate Expression extension method; these make it easier to compose predicates when the number of Predicate Expressions is known at design time.

Put a breakpoint on any of the “bigP” lines and ask the debugger to show you the result as a string. Here is the Immediate Window output for “bigP = p1.Or(p3).And(p4);”

{p => ((p.ProductName.Contains("Sir") || p.ProductName.Contains("Louisiana")) &&
(p.UnitPrice > Convert(20)))}

The True and False methods return Predicate Expression constants that help you jumpstart your chaining of PredicateBuilder expressions. Two of the combinations are not useful.

Compared to Albahari Brothers’ Predicate Builder

The Albahari brothers covered similar ground with their PredicateBuilder described here. Why duplicate their work?

Actually, we are not. We would have been happy to use their PredicateBuilder (which is open source) … if it worked in Silverlight. But it doesn’t. Moreover, it relies on a peculiar trick that adds mystery with no apparent benefit.

Why doesn’t it work in Silverlight? Because their implementation depends upon private reflection … which is forbidden in Silverlight.

Why do they require private reflection? Because they postpone resolution of the modified Expression tree until the query is resolved. In order to “lazily resolve” the expression tree, they have to carry unresolved expressions around inside the modified trees. The only way to do this is if the inner, unresolved expressions are closures … and closures are private.

The DevForce Predicate Builder resolves combined expressions immediately. Write “p1.Or(p2)” and you immediately get back the new expression

 p => p.ProductName.Contains("Sir") || p.ProductName.Contains("Cajun")

With the Albahari Predicate Builder you’d get something sort of like:

 p => p1.Invoke(p) || p.ProductName.Contains("Cajun")

“p1” is the closure I’m talking about.

Which brings me to the other apparent strangeness here. What is “Invoke” doing in there? I don’t want to call “invoke” … ever.

Don’t worry. “Invoke” is never actually called. “Invoke” is a placeholder in the expression tree. The Albaharis have hijacked the “Invoke” method and are using it as a marker that means “when you finally resolve this expression, replace the invoke with the mini-expression inside the closure ‘p1’.”

They have also cleverly hooked their own expression “pre-compiler” into the expression object so that, when something tries to use this LINQ expression, this “pre-compiler” gets to fix up all the “Invoke” markers before that something gets its chance to evaluate the expression.

Bet you didn’t know you could do that. It’s a good example of the Chain of Responsibility pattern.

I don’t want to go any deeper than this. You can learn more by reading up on their LinqKit and you can see where they discovered how to do it from Tomas Petricek.

I am only going this far so I can explain that

  1. They are postponing Expression tree resolution until the LINQ query is actually consumed
  2. The trick to doing so is to embed a closure of the left expression in the new Expression tree and mark it with an “Invoke” method
  3. The ultimate resolution of the LINQ query requires penetrating that closure
  4. Closures are private so you can’t penetrate them in Silverlight
  5. Therefore we can’t use their PredicateBuilder to dynamically construct LINQ in Silverlight

Thankfully, the DevForce PredicateBuilder resolves these dynamically constructed LINQ expressions immediately so there is never an embedded closure. Nor does it need to hijack the “Invoke” method as a marker.
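For the curious, here is one way "immediate resolution" can be done. This is a sketch of the general technique, not the DevForce source. It rewrites the right-hand predicate so that it uses the left-hand predicate's parameter and then ORs the two bodies into a brand new lambda. The names OrNow and ParameterSwap are mine, it tests strings rather than Product entities, and it leans on the public ExpressionVisitor class that ships with .NET 4 and later.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class ImmediateCombine
{
    // Replaces every use of one lambda parameter with another
    private sealed class ParameterSwap : ExpressionVisitor
    {
        private readonly ParameterExpression _source;
        private readonly ParameterExpression _target;

        public ParameterSwap(ParameterExpression source, ParameterExpression target)
        {
            _source = source;
            _target = target;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            return node == _source ? _target : base.VisitParameter(node);
        }
    }

    // Hypothetical name, chosen to avoid confusion with any real Or method
    public static Expression<Func<T, bool>> OrNow<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        var parameter = left.Parameters[0];

        // Make the right body talk about the same parameter as the left body
        var rightBody = new ParameterSwap(right.Parameters[0], parameter).Visit(right.Body);

        // The result is an ordinary lambda: no Invoke node, no embedded closure
        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, rightBody), parameter);
    }
}

public static class Program
{
    public static void Main()
    {
        Expression<Func<string, bool>> p1 = s => s.Contains("Sir");
        Expression<Func<string, bool>> p2 = s => s.Contains("Cajun");

        var either = p1.OrNow(p2);
        Console.WriteLine(either.Body.NodeType); // OrElse

        // Sample data standing in for product names
        var names = new[] { "Sir Rodney's Scones", "Chef Anton's Cajun Seasoning", "Chai" };
        Console.WriteLine(names.AsQueryable().Where(either).Count()); // 2
    }
}
```

Because the combined tree is complete the moment OrNow returns, whatever consumes the query later sees nothing unusual and needs no fix-up step.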

Importantly, there is no loss of expressiveness, performance, or capability by immediate resolution. I have no idea why they introduced this complication. The closures and “Invokes” add unnecessary mystery in my opinion … without apparent benefit.

To be clear, you still can use their PredicateBuilder to dynamically construct LINQ queries in .NET code. Their PredicateBuilder works fine with DevForce LINQ queries in regular .NET … as long as you remember to use AsExpandable (as you would for a dynamically composed Entity Framework LINQ query).

I prefer to use the DevForce version in both .NET and Silverlight myself.

Wednesday, July 22, 2009

Simplify the Prism Event Aggregator Protocol

I read with fascination Jeremy’s  “Braindump on the Event Aggregator Pattern” and recommend it to you, gentle reader. You’ll find concise coverage of the intent and the issues that confront anyone who would build his/her own EA.

Halfway in, Jeremy launches a critique of EA as implemented in Prism. It reminded me of my initial reaction to Prism's EA which was "man, this seems clunky."  Ouch!

One forgets in time and just accepts all the extra motion as "just the way it is". After a while you don't even see it anymore. Then someone – a Jeremy – comes along and wags his finger at it.

Now I like Prism's reliance on strongly typed EA events. Much better than the string “Event Topics” of CAB. But, he is right. I shouldn't have to publish an event by writing:

  // Publish an event with no payload (have to fake it); the event type is the message
  _eventAggregator.GetEvent<CacheClearEvent>().Publish(null);

  // Publish an event with a strongly typed payload
  _eventAggregator.GetEvent<EntityNotificationEvent>().Publish(eventPayload);

Aside: I don't mind defining strongly-typed CacheClearEvent and EntityNotificationEvent. But Prism forces me to define an EntityNotificationEventPayload class to support the second event.  This smelled wrong ... but I persevered.

Nor should I have to subscribe by writing:

  _eventAggregator.GetEvent<CacheClearEvent>().Subscribe(Clear);
  _eventAggregator.GetEvent<EntityNotificationEvent>().Subscribe(SetCustomer);

The issue here is that it takes two steps to publish and subscribe. First I have to get the event object from the EA and then I tell it what I want to do. This bugs Jeremy. And now it bugs me.

My initial reaction was "let's just clean this up with some extension methods."

I set aside, for the moment, his desire to eliminate the subscription line altogether; one step at a time, OK?

Ah ... but as soon as I started working on those extensions, I realized why the Prism team hadn't done this themselves. I saw how the team got hung up.

The Prism EA designers made a fundamental decision that the strongly-typed event must have a separate, strongly-typed payload. You see this in the signature of the base class for all such events, CompositePresentationEvent<TPayload>. If you want to define a pub/sub event in Prism you must inherit from this class.

This leads you to the following two extension method signatures:

  public static void Subscribe<TEventType, TPayload>(
    this IEventAggregator aggregator, Action<TPayload> action)
      where TEventType : CompositePresentationEvent<TPayload>

  public static void Publish<TEventType,TPayload>(
    this IEventAggregator aggregator, TPayload payload)
      where TEventType : CompositePresentationEvent<TPayload>

These seem ok until you try to use them. You end up with this:

  _eventAggregator.Publish<CacheClearEvent, object>(null);
  _eventAggregator.Publish<EntityNotificationEvent, EntityNotificationEventPayload>(eventPayload);

  _eventAggregator.Subscribe<CacheClearEvent, object>(Clear);
  _eventAggregator.Subscribe<EntityNotificationEvent, EntityNotificationEventPayload>(SetCustomer);

I’m not sure I’ve gained much clarity. Those type parameters are plain ugly.

While I can add more extension methods to smooth the way for the events that take no payload (e.g., CacheClearEvent), I'm stuck with this syntax for the more interesting events that take a payload. Maybe you can finesse them away; I can’t find a way.

This led me to ask "what if an event that needed a payload was itself the payload?" I realized I could bring this off with the existing Prism EA ... if I adopted a rather strange convention.

Here is my new CacheClearMessage for example.

  public class CacheClearMessage : CompositePresentationEvent<CacheClearMessage> { }

Notice how it inherits from CompositePresentationEvent<TPayload> as required. But it cleverly references itself as the payload.

My new EntityNotificationMessage looks like this:

  public class EntityNotificationMessage
    : CompositePresentationEvent<EntityNotificationMessage> {

    public EntityNotificationMessage() {}
    public EntityNotificationMessage(Type entityType, object entityId) {
      EntityType = entityType;
      EntityId = entityId;
    }
    public object EntityId { get; private set; }
    public Type EntityType { get; private set; }
    // more stuff
  }

Notice that it contains its own payload which happens to be info about the subject of the entity notification. I no longer need my EntityNotificationEventPayload class which I delete from my project (yippee!)

Notice I can instantiate the message without a payload too. Prism requires a parameterless ctor in order to register the event; you wouldn't actually use this one.

Now I add two extension methods that look like this:

  public static void Publish<TMessageType>(
    this IEventAggregator aggregator, TMessageType message)
      where TMessageType : CompositePresentationEvent<TMessageType>

  public static void Subscribe<TMessageType>(
    this IEventAggregator aggregator, Action<TMessageType> action)
      where TMessageType : CompositePresentationEvent<TMessageType>

My client code looks much cleaner now:

  _eventAggregator.Publish<CacheClearMessage>();
  _eventAggregator.Publish(entityNotificationMessage);
 
  _eventAggregator.Subscribe<CacheClearMessage>(Clear);
  _eventAggregator.Subscribe<EntityNotificationMessage>(SetCustomer);

I get type inference in only one usage (one of the Publish calls); maybe the explicitness is not so bad. At least there is only one type parameter.
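Here is a sketch of how those extension methods might be fleshed out. To keep it runnable on its own I have replaced the Prism types with stripped-down stand-ins of the same names (EventBase, CompositePresentationEvent<TPayload>, IEventAggregator, EventAggregator); they mimic only the get-the-event-then-publish-or-subscribe shape and are not the real Prism classes. The parameterless Publish overload and the new() constraints are my assumptions about how the no-data case could be supported.

```csharp
using System;
using System.Collections.Generic;

// --- Simplified stand-ins for the Prism types (not the real implementations) ---
public abstract class EventBase { }

public class CompositePresentationEvent<TPayload> : EventBase
{
    private readonly List<Action<TPayload>> _handlers = new List<Action<TPayload>>();

    public void Subscribe(Action<TPayload> action) { _handlers.Add(action); }

    public void Publish(TPayload payload)
    {
        foreach (var handler in _handlers.ToArray()) handler(payload);
    }
}

public interface IEventAggregator
{
    TEventType GetEvent<TEventType>() where TEventType : EventBase, new();
}

public class EventAggregator : IEventAggregator
{
    private readonly Dictionary<Type, EventBase> _events = new Dictionary<Type, EventBase>();

    // One shared event instance per event type, created on first request
    public TEventType GetEvent<TEventType>() where TEventType : EventBase, new()
    {
        EventBase found;
        if (!_events.TryGetValue(typeof(TEventType), out found))
        {
            found = new TEventType();
            _events[typeof(TEventType)] = found;
        }
        return (TEventType)found;
    }
}

// --- The self-referencing message convention ---
public class CacheClearMessage : CompositePresentationEvent<CacheClearMessage> { }

public static class EventAggregatorExtensions
{
    public static void Publish<TMessageType>(
        this IEventAggregator aggregator, TMessageType message)
        where TMessageType : CompositePresentationEvent<TMessageType>, new()
    {
        // The registered instance is the channel; the argument is the payload
        aggregator.GetEvent<TMessageType>().Publish(message);
    }

    // Assumed convenience overload for messages that carry no data
    public static void Publish<TMessageType>(this IEventAggregator aggregator)
        where TMessageType : CompositePresentationEvent<TMessageType>, new()
    {
        aggregator.Publish(new TMessageType());
    }

    public static void Subscribe<TMessageType>(
        this IEventAggregator aggregator, Action<TMessageType> action)
        where TMessageType : CompositePresentationEvent<TMessageType>, new()
    {
        aggregator.GetEvent<TMessageType>().Subscribe(action);
    }
}

public static class Program
{
    public static void Main()
    {
        IEventAggregator aggregator = new EventAggregator();
        var cleared = 0;

        aggregator.Subscribe<CacheClearMessage>(message => cleared++);
        aggregator.Publish<CacheClearMessage>();       // no payload to supply
        aggregator.Publish(new CacheClearMessage());   // type inferred from the argument

        Console.WriteLine(cleared); // 2
    }
}
```

One oddity of the convention shows up clearly here: the instance the aggregator hands back acts as the channel, while the instance you publish is just data. Two objects of the same type play two different roles.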

--

I'm not as freaked out by the subscriptions but I get Jeremy's point. I should be able to identify the Clear and SetCustomer methods as methods to be subscribed to. I shouldn't have to explicitly import the EventAggregator and wire the methods to it.

I'm not sure what the best way to get around that is just yet (and he doesn't seem to have settled on an answer either). So I'll just stop right here for now.

Tonight, as I post, I'm feeling that this was a good refinement. I see no benefit in forcing separation of the event class and the payload class. Maybe the Prism designers will educate me. Maybe I'll wake up tomorrow with a hangover and wish I'd left well enough alone.

Your thoughts are welcome.

Friday, July 17, 2009

Tom DeMarco Recants

Developers under thirty may not know the name, Tom DeMarco, but if you ever drew a paycheck from a large organization, you’ve felt his influence. When your boss said “You can’t control what you can’t measure”, he was channeling Mr. DeMarco.

I can think of no single individual who has had a more pervasive and decisive say in how we manage software development. Corporate CIOs, IT directors, and senior architects have listened to him prescribe “best practices” for more than 30 years. So when he says “I was wrong all along” … it’s like hearing Robert McNamara confess his tragic mistakes. You feel you knew it all along … but it hits hard to hear him say it.

In “Software Engineering: An Idea Whose Time Has Come and Gone?”, Mr. DeMarco confronts his life’s legacy -- his insistence that precise planning and intense monitoring are essential to project success -- and condemns it.

Let’s not go overboard. He isn’t against measurement (and neither am I).  If you can measure it … and your measure bears a well-understood relationship to the good or evil you want to assess … and the cost of measurement is reasonable … then measure it.

But putting control and measurement first utterly obscures what really matters: the potential value of the project and the forces that can determine the timing and success of the project. Indeed, planning and measurement can … and often do … actively impede success.

Re-blogging is an etiquette violation. I’m doing it anyway because I fear readers unfamiliar with Mr. DeMarco will miss the many gems in his short and sweet mea culpa. Some juicy quotes to entice you.

Strict control is something that matters a lot on relatively useless projects and much less on useful projects

The more you focus on control, the more likely you’re working on a project that’s striving to deliver something of relatively minor value.

First we need to select projects where precise control won’t matter so much. Then we need to reduce our expectations for exactly how much we’re going to be able to control them, no matter how assiduously we apply ourselves to control.

Consistency and predictability are still desirable, but they haven’t ever been the most important things. The more important goal is transformation.

You say to your team leads, for example, “I have a finish date in mind, and I’m not even going to share it with you. When I come in one day and tell you the project will end in one week, you have to be ready to package up and deliver what you’ve got as the final product.”

Before we get all smug about agile and the disaster that is “Waterfall Design” … I want you to think a solid few minutes about whether you are susceptible to putting engineering practices in front of something more important.

His critique seems to target only an overweening attention to planning and control. That just happens to be his personal road to perdition. He aims at a wider target.

Look at the title again. He asks whether “Software Engineering” is an idea whose time has come and gone. If you find that TDD or DDD or BDD … are what you talk about first, are you making the same category of mistake? How can these practices – whatever their merits – be more important than the value of the project and whether it is delivering on its promise? If, for example, you believe that there isn’t great software without unit tests … have you fallen into the same trap?

Imagine the courage it takes to come to such conclusions about your life’s work. I salute Mr. DeMarco. And I hasten to add that his writing was always more nuanced and more useful than his own harsh self-criticism suggests. He remains well worth reading … more so in the light he shines from this vantage point.

Aside: I learned about this article from a Kent Beck tweet. You’ll find the endlessly fascinating Kent Beck blog here.