Did the Entity Framework team do this by design?
Saturday, May 8, 2010 at 11:04AM
lessCode

I’ve been playing around quite a bit recently with Entity Framework, in the context of a WCF RIA Services Silverlight application, and just today stumbled upon a quite elegant solution to a performance issue I had feared would not be easy to solve without writing a bunch of code.

The application in question is based around a SQL Server database which contains a large number of high-resolution images stored in binary form, which are associated with other entities via a foreign key relationship, like this: image

Pretty straightforward. Each Thing can have zero or many Image entities associated with it. Now, lets say we want to present a list of Thing entities to the user. With a standard WCF RIA Services domain service, we might implement the query method like this:

public IQueryable<Thing> GetThings() {
return ObjectContext.Things.Include("Images").OrderBy(t => t.Title);
}

 

Unfortunately, this query will perform quite poorly if there are many Things referencing many large Images, because all the Images for all the Things will cross the wire down to the client. When I try this for a database containing a single Thing with four low-resolution Images, Fiddler says the following about the query:

 image

If we had a large number of Thing entities, and the user never navigates to those entities to view their images, we’d be transferring a lot of images in order to simply discard them, unviewed. If we leave out the Include(“Images”) extension from the query, we won’t transfer the image data, but also the client will not be aware that there are in fact any Images associated with the Things, and we’d have to make subsequent queries back to the service to retrieve the image data separately.

What we’d like to be able to do is include a collection of the image Ids in the query results that go to the client, but leave out the actual image bytes. Then, we can write a simple HttpHandler that’s capable of pulling a single image out of the database and serving it up as an image resource. At the same time we can also instruct the browser to cache these image resources, which will even further reduce our bandwidth consumption. Here’s what that handler might look like:

public class ImageHandler : IHttpHandler {
    #region IHttpHandler Members

    public bool IsReusable {
        get { return true; }
    }

    public void ProcessRequest(HttpContext context) {
        Int32 id;

        if (context.Request.QueryString["id"] != null) {
            id = Convert.ToInt32(context.Request.QueryString["id"]);
        }
        else {
            throw new ArgumentException("No id specified");
        }

        using (Bitmap bmp = ConvertToBitmap(GetImageBytes(id))) {
            context.Response.Cache.SetValidUntilExpires(true);
            context.Response.Cache.SetExpires(DateTime.Now.AddMonths(1));
            context.Response.Cache.SetCacheability(HttpCacheability.Public);
            bmp.Save(context.Response.OutputStream, ImageFormat.Jpeg);
            bmp.Dispose();
        }
    }

    #endregion

    private Bitmap ConvertToBitmap(byte[] bmp) {
        if (bmp != null) {
            TypeConverter tc = TypeDescriptor.GetConverter(typeof(Bitmap));
            var b = (Bitmap) tc.ConvertFrom(bmp);
            return b;
        }
        return null;
    }

    private byte[] GetImageBytes(Int32 id) {
        var entities = new Entities();
        return entities.Images.Single(i => i.Id == id).Data;
    }
}
Note that the handler queries Entity Framework on the server side to load the image bytes from the database, given the image Id that comes from the Url’s query string.
 
So, back the real problem. How do we avoid sending the image bytes back to the client when RIA Services queries for the Things and requests that the images be Included?
 
One way to achieve this would be to remove the Data property from the Image entity in the entity model. This won’t, of course, affect the database, but now since there is no way to access the image bytes, an Image will consist only of an Id. However, this means that we’d have to change our handler’s GetImageBytes method to retrieve the image from the database with lower-level database calls, bypassing Entity Framework. 

It seems like there’s no clean way to achieve what we want, but in fact there is. If you look at the properties available to you on the Data property you can see that the accessibility of these properties can be changed:

 

image

By default entity properties are Public, and RIA Services will dutifully serialize Public properties for us. But if we change the accessibility to Internal, RIA Services chooses not to do so, which make sense. Since the property is Internal, it’s still visible to every class in the same assembly. Therefore, as long as our ImageHandler is part of the same project/application as the entity model and the domain service, it will still have access to the image bytes via Entity Framework, and the code above will work unmodified.

After making this small change in the property editor (and regenerating the domain service), on the client side we no longer see a byte[] as a member of the Image entity:

image

When I now run my single-entity-with-four-images example, Fiddler gives us a much better result:

image

Article originally appeared on lesscode.net (http://www.lesscode.net/).
See website for complete article licensing information.