Thursday, July 24, 2014

8 Biggest Mistakes of Application Architecture

We spend a lot of time arguing of the minutiae of software development. Exceptions or return codes, dynamic or static typing, OO or functional. IMO these have much less to do with the quality of software produced than the architecture, yet we seem to spend less time discussing that. This is the big picture stuff that makes or breaks a product (and it's owners) and in my experience, terrible architecture seems to be the norm in the enterprise world.

The following is a rant about some of the worst anti-patterns I've seen in enterprise software. Each one I have experienced at several companies, though (usually) not all at once. Most of them are cargo cult practices, some can be traced back to official Microsoft guidance, some have no reason at all but just seem to exist anyway.


1. Logical structure != Physical structure


I ran into this one just today. We have two services, a and b, that are different but related yet the share most/all of the same internal libraries. Let's pretend there are good reasons why these services need to be deployed on different machines, because sometimes there are. This does not mean they need to be in separate .csproj files.

It's just as easy, perhaps even easier, to deploy a single application to multiple machines with different configurations.

The physical deployment of an application does not have to be a 1:1 mapping with the solution structure.



2. Development is not Production


On a related note, it is often claimed that a developers machine should resemble production closely. I can't think of anything worse. The purpose of a development environment is to DEVELOP, anything that aids this endeavor is welcome, anything that inhibits it is not.

I want to be developing. I don't care about how many configurations the application will be split into in production. I don't want to manage IIS application pools. I don't want to set up SSL certificates. I don't even want to have to go to the login screen every time I recompile. These are all production issues and you shouldn't have to deal with it unless your configuring a production environment.

A good rule of thumb here is how long it takes a brand new developer to be setup. The answer should be minutes, not hours or days. I actually (very briefly) worked at a company that considered it normal for new developers to spend a week setting up there environment. Companies like this tend to receive the double blow of a week of downtime and a high turnover rate.

It's been 15 years since this appeared at number 2 on the Joel Test. I'm honestly amazed how many companies still screw it up.



3. The .csproj Fetish


There are many good reasons to separate code into different projects. An MVC app and a WCF service might share common functionality in a library project (or maybe not, see 1). But there are just as many good reasons NOT to separate code into a million tiny assemblies.

Recently I came across a solution that had 8 projects just to handle pdf conversion. HtmlToPdfConverter, JpegToPdfConverter, etc. Each project had a single class in a single file. The classes combined were barely long to deserve splitting up, let alone into a separate project each.

All this achieves is to slow down build and debug times. MSBuild slows down dramatically as you increase the number of projects. Try feeding your source tree directly into csc and see just how c# code can be compiled.

Long before IDE's held all our code together things were organized into folders. Fortunately, IDE's are capable if using this technological wizardry, giving us a very simple mechanism of grouping related functionality. Folders exist, use them!



4. Repositories on Repositories


ORMs frequently get a bad wrap for generating n+1 queries, but in the world of enterprise architecture I all to often see a different cause: The Repository Pattern.

ORMs actually come with a repository pattern built in. In nHibernate (which I'll use as my example) the repository is the ISession interface. It's simple and flexible, it can handle nearly every type of query you need and the ones it can't should probably not by performed by an ORM anyway.

An architect or developer will come along and build there own repository on top of this one, but with less features and no flexibility. Just in case you want to change your database one day. How many of you have ever actually seen this happen?

The problem with having data access behind a repository is that it has no context of what data is required for the task at hand. Does it need all the children of that entity? Should it load a parent entity? What should it filter by? It ends with a million permutations of a Find method that the ORM repository is perfectly capable of expressing.

Why does this cause n+1 errors? Because it's abstracted the tool with the best way of resolving them.

Advance Idiots will return a model with lazy loading. Essentially creating an ORM on an ORM.



5. The Service Layer.


This one only applies to web apps.

I'm not talking about a logical service layer here, but a real physical one, usually with WCF as the glue. I think the root cause of this is the same as number 1, architects assuming the the physical structure and the logical structure need to completely match.

Numbers everyone should know are unfortunately numbers that architects don't seem to know. Getting data from RAM is a lot faster than getting it from a network. The service is also likely to make network connections of it's own, usually to a database. The end response time is cumulative, having to make multiple service calls quickly adds up. It won't even show up in a profiler.

The most frequent justification I hear for this is scalability. Having a service layer on a different machine than the web server will somehow, magically, help the application scale in a way that having multiple web servers (maybe even a farm) will not. If the applications weren't spending so much time serializing objects and waiting for synchronous remote procedure calls then maybe they would scale a bit more linearly.

The other justification I hear a lot is that service oriented architectures are considered good architecture. I'll probably save this for another rant, but a service layer and SOA are not synonyms, they are completely unrelated to one another.

Advanced Idiots will combine this with the repository pattern and have n+1 issues spanning multiple network hops.

Super advanced idiots will have services that call services that call services. Displaying a single piece of data to a user can cross a dozen network/process boundaries.

Super amazeball idiots will instantiate a service layer in it's own process every time a method needs to be called on it. Seriously, I've seen this in a production environment handling millions of dollars a month.



6. Replacing new with Resolve


This isn't an attack on IOC containers, I think they're great, but in the wrong hands they are disastrous. The concept is simple enough, that's why every man and his dog have created a toy IOC container, unfortunately a lot of people don't seem to get to the end of tutorial 1.

An IOC container is designed to wire up structural dependencies. If a class name ends with Service, Factory or Handler it is a good candidate to go in the container. If the methods of a class contain adjectives then it is a good candidate to go in the container. If a class name ends with Model, DTO, Request or Response then it probably should NOT go in the container. If a class frequently appears in a list it should probably NOT go in the container.

If an application is full of calls to Resolve<T>() then it was built by someone who doesn't understand the basic concepts of IOC.

Advanced Idiots will combine this with service layers (see number 5). I've actually seen IOC containers used to resolve factories that call web services that resolve POCO objects. I've actually an architect that defended this as being good architecture. I've actually left work at lunch time before the urge to kill became too great...



7. CRUD


So many applications start out life as a simple forms-over-data interface. But using CRUD as the basis for an application architecture is doomed from the start. CRUD is built around the data model and non-developers are horrible at thinking about the data model, they really shouldn't have to.

There is simple crud like active record, but this is about the more advanced CRUD operations. CRUD DAO's, CRUD Repositories, CRUD View Models, CRUD DTO's. So much work for such a limited pattern.

The root of this, of course, is the developers innate desire to generalize common patterns and come up with elaborate use of generics to reduce the amount of code we have to write to solve the problem wrong.

Software should be modeled around users and the actions they perform, not rows in a database.



8. Deployment


The seems to be a rule that the more a company talk about being agile the worse their deployment procedures are. I would argue that being able to deploy frequently (dare I say continuously) is one of the defining features that show your agility as an organization.

If you can not have a bug fix in a customers hands within minutes of bug fixed (excluding testing) then your organization is not agile. Telling users that they have to wait for next months deploy to have a critical issue fixed is not agile. Being unable to put up and tear down test environments on a whim is not agile.

In far to many companies doing a production release is a nerve wracking affair that can consume several days. Number 1, 2 and 5 are the major causes of deployment headaches.



Conclusion


The cause of every one of these issues has been the same, someone with architect in their title. We really should reconsider what the role of an architect is and if the should be involved with technical decisions, or involved at all.