NDepend In Review

Back in September of this year, I was contacted by Patrick Smacchia, the lead developer from the NDepend team.  If you're not familiar with NDepend, it's a static analysis tool for .NET.  Patrick had asked me if I would be interested in reviewing the software.  I don't normally write software reviews on my blog, but on this occasion, I was intrigued.  I'm a long time user of ReSharper in my day to day work and ReSharper offers some amount of static analysis of your code whilst you write it.  I'd also been watching certain open source projects that had appeared since the introduction of the Roslyn compiler technology such as RefactoringEssentials, Code Cracker & SonarAnalyzer.  I've been thinking a lot about static analysis of code recently, especially since I discovered Connascence and started using that as a way to reason about code.

I had previously heard of the NDepend tool, but had never used it, so when Patrick contacted me and asked if I'd like to review his static analysis tool, I was more than happy to do so.

Full Disclosure:
NDepend is a commercial tool with a trial period.  Patrick agreed to provide me with a complimentary professional license for the software if I wrote a review of the software on my blog.  I agreed on the condition that the review would be my own, honest opinion of the software, good or bad.  What follows is that personal opinion.

This is a fairly long blog post so if you want the TL;DR skip to the end and the "Conclusion" section, but I suggest that you do read this whole post to better understand my complete experience using the NDepend product.


Getting Started

The first thing to getting up and running with NDepend was to download the software from the NDepend website.  This was easy enough, simply by following the relevant links from the NDepend homepage.  The version of NDepend that is current as I write this is v2017.3.2.  As I already had a license, I was able to enter my license key that had been previously supplied to me via email and begin the download.  NDepend comes as a .zip archive file rather than an MSI or other installer program.  This was somewhat unusual as a lot of software is delivered via a setup installer package these days, but I'm an old-school, command line kind of guy and I liked that the program was supplied as a .zip file that you simply extract into a folder and go.  After having done that, there's a couple of ways you can launch NDepend.  There's a few executable files in the extracted folder, so it's not immediately obvious what you execute first, but NDepend does have a helpful getting started guide on their website that's easy enough to follow along with.

For usage on a development machine, you're likely to be using either the stand alone Visual NDepend IDE (VisualNDepend.exe) or, perhaps more likely, the Visual Studio plugin for interacting with NDepend (installed by running the NDepend.VisualStudioExtension.Installer.exe installer).  As well as being used on a developer's machine, NDepend can also run on your build machine, and this is most likely to be integrated into your build process via the NDepend command line tool (NDepend.Console.exe). 

NDepend also ships with an executable called NDepend.PowerTools.exe and its source code in the NDepend.PowerTools.SourceCode folder.  This is a kind of optional extra utility which is run from the command line and contains a number of predefined metrics that can be run against some .NET code.  NDepend also provides an API with which we can integrate and use the NDepend functionality from our own code.  Many of the metrics of the NDepend Powertools are also used within the "main" NDepend tool itself, but as the source code is supplied to the Powertools utility, we can see exactly how those metrics are calculated and exactly how the various statistics around the code that NDepend is analysing is gathered from the NDepend API.  In essence, as well as providing some handy metrics of its own, the NDepend Powertools also serves as kind of demonstration code for how to use the NDepend API.


Launching NDepend For The First Time

After installing the Visual Studio plugin and launching Visual Studio, I loaded in a solution of source code that I wanted to analyse.  I opted to initially launch NDepend via the installed Visual Studio plugin as I figured that is the place I'd be interacting with it most.  NDepend installs itself as a extra menu option in the Visual Studio menu bar, much like other extensions such as ReSharper does.  When you first view this menu on a new solution, most of the menu options are greyed out.  Initially I was slightly confused by this but soon realised that as well as you having a Visual Studio solution and one or more project files for your own code, so too does NDepend have its own project files, and you need to create one of these first before you can perform any analysis of the current loaded solution.  NDepend helpfully allows this with one simple click on the "Attach New NDepend project to current VS Solution" menu item.  Once selected, you're asked which assemblies within your solution you wish to have analysed and then NDepend will go off and perform an initial analysis of your code.

NDepend uses both your solution's source code as well as the compiled assemblies to perform its analysis.  Most of the analysis is done against the compiled code and this allows NDepend to provide statistics and metrics of the IL (or Common Intermediate Language) that your code will compile to as well as statistics and metrics of other code areas.  NDepend uses the solution's source code during analysis in order to gather metrics relating to things not present inside the IL code such as code comments and also to gather statistics allowing NDepend to locate code elements such as types, methods and fields when these are shown as part of a query or search (see later for further details).

Once the analysis is complete, you'll get a pop-up window informing you of the completion of the initial analysis and asking you what to do next.  You'll probably also get a new browser tab open up in your default browser containing the NDepend Analysis report.   It was at this point, I was initially quite confused.  Having multiple things pop-up after the analysis was complete was a little startling and quite overwhelming.  One thing I'd have rather seen is for the browser based report to not be loaded initially, but to have been a button to be clicked on ("View Analysis Report" on something similar) within the pop-up window.  This way, only one "pop-up" is appearing, which to me is a lot less jarring.

The analysis popup allows you to view the NDepend dashboard, the interactive graph or the NDepend Code rules.  I hadn't read the online documentation, so it's probably mostly my own fault, but I'm the kind of person who just likes to dive straight in to something and try to figure things out for myself, but at this point, I was quite stuck.  I didn't quite know what to do next.  The help text on the popup indicates that the Dashboard is the best place to start, and so I selected that and was greeted with the following screen:

Now, this is an incredibly informative dashboard, with an awful lot of information contained within, but for me (and again, quite possibly my own fault for not reading the documentation more thoroughly) this was even more confusing.  There's a few things on the dashboard that seem to be obvious and make sense, such as the number of lines of code, the number of lines of comments and the percentage compared to actual code, but a lot of the other information around the debt level, quality gates and rules didn't seem to make much sense to me at all.  I decided I'd look at a few of the other options that were available in the popup, however that initial popup had now disappeared, so I had to go hunting through the NDepend menus.  Under the "Graph" menu, I found the "View Dependency Graph" option which seemed to coincide with one of the buttons I'd seen on the initial post-analysis pop-up.  Sure enough, opting to view the dependency graph showed me a nice UI of the various assemblies within my solution and how they were all related.

The Dependency Graph is a very nice way to get an overview of your solution.  It should be familiar to developers who have used the Enterprise SKU of Visual Studio and have used the various Architecture features, specifically the Code Map.  What's great about NDepend's Dependency Graph is the ability to alter the box size for the assemblies based upon various metrics for the assembly.  By default, it's based upon lines of code for the assembly, but can be changed to Cyclomatic Complexity, In/Out Edges, two types of coupling and overall "ranking" to name but a few.  The thickness of the arrows can also be altered based upon various metrics of the code such as namespaces, types, methods etc.  This gives the Dependency Graph view a great amount of power in seeing an overview of you solution, and specifically to see where complexity and potential issues with the code may lie. 

An alternative view of the dependency data of your solution can be found with the dependency matrix view.  This shows assemblies by name duplicated across two axes and showing the assembly metrics where the two assemblies intersect.

The assembly matrix view is very handy to use on larger solutions that may contain dozens of different assemblies as the Dependency Graph view of such a solution can be quite busy and difficult to follow due to being overloaded with information.  Another very useful visual representation of solution and project metrics is the Code Metrics View (aka the Treemap Metrics View).  This again shows all assemblies in the solution but this time as a series of coloured boxes.  Each box is further sub-divided into more coloured boxes of various sizes based upon the chosen metric.

The entire collection of boxes can be viewed at assembly, namespace, method, type or even field level with the size of the main boxes being determined by a similar set of metrics as the Dependency Graph, albeit with more options to choose from for method level metrics.  The colour of the boxes can be determined based upon a similar set of metrics as the size and is perhaps most frequently used to determine code coverage of the chosen element.  All of which gives a huge combination of options by which to gain a 100-foot view of the overall solution.  What's most interesting here is that the entire view can be based upon a specific "filter" or "query" performed over the entire codebase.

A "filter" or "query" over the codebase?  Well, this is where something called CQLinq comes in, and it's the very heart of the NDepend product.


Understanding NDepend

It took me a while to overcome the initial confusion and feeling of being overwhelmed by NDepend when I first started using it, however, after some time, I started to understand how the product hangs together, and it all starts with the foundation of everything else that NDepend offers, be that dependency graphs, code metrics, quality rules or other features.  The foundation of all of the impressive metrics and data that you can get from NDepend is CQLinq.

CQLinq stands for Code Query Linq and is at the heart of NDepend's features and functionality and is, essentially, a domain-specific language for code analysis.  It's based upon Linq, which is the Language Integration Query functionality that's been part of the .NET Framework since version 3.5.  For any developer who has used Linq to query in-memory objects, you'll know just have powerful and flexible a tool Linq can be.  It's even more powerful when Linq is used with a Linq Provider that can target some collection of external data such as a database.   Linq allows us to perform incredibly powerful filters, aggregations, projections and more, all from either a SQL-like syntax, or a very intuitive fluent chaining syntax.  For example:

var totalOutstanding = Customers.Where(c => c.Type == 1).Sum(c => c.Balance);

shows a simple but powerful query allowing the developer to, in a single line of code, get the total sum of all balances for a given type of customer from an arbitrary sized collection of customers, based upon properties of the customer objects within that collection.  This is just one example of a simple Linq query, and far more sophisticated queries are possible with Linq without adding too much additional complexity to the code.  The data here, of course, is the collection of customers, but what if that data was your source code?   That's where CQLinq comes in.

CQLinq takes your source code and turns it into data, allowing you, the developer, to construct a Linq query that can filter, aggregate and select metrics and statistics based upon viewing your code as data.  Here's a fairly simple CQLinq query:

from a in Application.Assemblies 
where a.NbLinesOfCode >= 0 
orderby a.NbLinesOfCode descending 
select new { a, a.NbLinesOfCode }

Here we're saying, look in all the assemblies within the solution that have greater than 0 lines of code.  We order those assemblies by the highest count of lines of code descending then we select a projected anonymous type showing the assembly itself and the count of its lines of code.  The result is a list similar to the following:

MyApp.Admin			8554
MyApp.Core			6112
MyApp.Data			4232
MyApp.DataAccess	3095
MyApp.Services		2398

So, even with this simple query, we can see the expressive power of a simple CQLinq query and the aggregate data we can obtain from our source code.

Now, the real power of CQLinq lies in the custom objects and properties that it makes available for you to use.  In the query above we can see the first line is selecting an a from Application.AssembliesApplication is an object provided by NDepend, and Assemblies is a collection of objects associated with the Application.  Further in the query, we can see that the variable a, that represents an assembly, has a property called NbLinesOfCode.  This custom property is also provided by NDepend and allows us to know how many lines of actual code the assembly contains.

NDepend, and CQLinq, provide a large number of custom objects (NDepend calls them custom domains) and properties out of the box giving insight into all sorts of different metrics and statistics on the source code that has been analysed by NDepend.  Custom objects/domains such as Assemblies, Types, Methods, Namespaces and custom properties such as IsClass, DepthOfInheritance, CouldBePrivate, CyclomaticComplexity, NbLinesOfCode, MethodsCalled, MethodsCallingMe, to give only the very briefest of overviews, contain the real power of CQLinq, and ultimately, NDepend.

Once you come to appreciate that underneath the surface of all of the other features of NDepend - such as the large array of metrics shown on the NDepend Dashboard and the NDepend Analysis Report - is one or more CQLinq queries slicing and dicing the data of your source code, everything starts to become that bit more clear!  (Or at least, it did with me!)


Code Rules, Quality Gates & Metrics

Having understood that most of the power of NDepend stems from the data provided by CQLinq queries, we can revisit the various metrics on the Dashboard.  Here, we can see that amongst the most obvious metrics such as lines of code and the number of assemblies, types and namespaces etc. there are more interesting metrics regarding Quality Gates, Rules and Issues.

If we click on the numbers next to the Quality Gates that have either failed, warned or passed, NDepend opens up a Queries and Rules explorer window and a Queries and Rules Edit window.

Within the Queries and Rules Explorer, we can see a treeview of a number of different rules, grouped together by category.  These are code quality rules that ship out of the box with NDepend, and include such categories as Code Smells, Architecture, Naming Conventions and many others.  The icons next to the various category groups show us whether our source code that has been analysed has failed (i.e. violates a rule), is considered a warning, or passes for each rule defined.  If we click on a rule category within the treeview, we see the individual rules from that category in the window to the right.  This shows us how much of our code has matched with the fail/warn/pass states.  Clicking on the individual rule name will cause the Queries and Rules Edit window to show the details for the chosen rule.  Note that, by default and for me at least, the Queries and Rules Explorer Window was docking along the bottom of the Visual Studio main window, whilst the Queries and Rules Edit Window was docking to the right of the main Visual Studio window (as a new tab along with my Solution Explorer).  It was initially confusing as I hadn't noticed one window being updated when interacting with the other window, but once you're aware of this, it becomes much easier to understand.  One other quirk of the Queries and Rules Explorer window is that there doesn't appear to be any way to search for a specific rule within the many rules contained in the various category groups and I found myself sometimes manually expanding each of the category groups in turn in order to find a rule I'd previously been examining.  It would be great if the ability to search within this window was introduced in a future version of NDepend.

The Queries and Rule Edit window is divided into a bottom section that shows the various parts of your source code (the assembly, type or method) grouped in a treeview that are matched by the chosen rule.  In the top section, we can see the rule name and a textual description of the rule itself.  For example, when looking at one of the Code Smells category rules, Avoid Types Too Big, we can see that the description states, "This rule matches types with more than 200 lines of code. Only lines of code in JustMyCode methods are taken account.  Types where NbLinesOfCode > 200 are extremely complex to develop and maintain.".

Hang on.  NbLinesOfCode?   We've seen that before.  Sure enough, clicking the "View Source Code" button at the top of the top section of the Queries and Rules Edit window changes the top section view into something else.  A CQLinq query!

Note that if the Queries and Rules Edit window is docked to the right in its default position, the overall window can appear a little cramped.  For me, this made it a little unclear exactly what I was looking at until I undocked the window and expanded its size.

Each and every one of the more than 200 rules that ship out of the box with NDepend are based upon a CQLinq query.  What's more is that each and every one of these queries can be edited.  And you can create your own category groups and rules by defining your own CQLinq queries.  Think about that for a moment as this is really what NDepend is providing you with.  Lots and a lots of pre-defined query and examination power right out of the box, but mostly the ability to have a full-featured, fluent and highly intuitive Linq-To-Code (for want of a better expression) provider that you can make full use of to perform all manner of examinations on your own source code.

Here's the pre-defined CQLinq query for determining which methods are in need of refactoring:

warnif count > 0 from m in JustMyCode.Methods where 
  m.NbLinesOfCode > 30 ||           
  m.CyclomaticComplexity > 20 ||    
  m.ILCyclomaticComplexity > 50 ||  
  m.ILNestingDepth > 5 ||           
  m.NbParameters > 5 ||             
  m.NbVariables > 8 ||              
  m.NbOverloads > 6                 

select new { m, m.NbLinesOfCode, m.NbILInstructions, m.CyclomaticComplexity, 
             m.ILCyclomaticComplexity, m.ILNestingDepth, 
             m.NbParameters, m.NbVariables, m.NbOverloads }

As can be seen above, the query makes extensive use of the built-in computed properties, all of which are very intuitively named, to gather a list of methods throughout the solution, and specifically methods that are from "JustMyCode" meaning that NDepend helpfully filters out all methods that belong to external frameworks and libraries, that fall afoul of the rules of this query and therefore are prime candidates for refactoring.  And, of course, if you don't like that NDepend only considers methods with more than 30 lines of code to be in need of refactoring, you can always edit the query and change that value to any number you like that better fits you and your team.

When running NDepend from the Visual Studio plug-in you get the ability to click on any of the results from the currently selected query/rules in the Queries and Rules Edit window and be taken directly to the type or method detected by the rule.   This is a really powerful and easy way to navigate through your code base to find all of specific areas of code that may be in need of attention.  I did, however, notice that not all matched code elements can be easily navigated to (see below for more detail).

NDepend also has extensive tooltip windows that can pop-up giving all manner of additional information about whatever you've selected and is currently in context.  For example, hovering over results in the Query and Rules Edit window shows a large yellow tooltip containing even more information about the item you're hovering over:

The "View Source Code" button within the Queries and Edit window acts as a toggle switch between viewing the source code of the CQLinq query and the textual description of the query.  The textual description often also includes text indicating "How To Fix" any of your code that violates the rule that the query relates to.  This can be a very handy starting point for identifying the first steps to take in refactoring and improving your code.

When editing CQLinq queries, the Queries and Edit window provides full intellisense whilst editing or writing query syntax, as well as even more helpful tooltip windows that give further help and documentation on the various CQLinq objects and properties used within the query.

NDepend allows individual rules to be designated as "Critical" rules and this means that any code found that falls afoul of a critical rule will result in a "failure" rather than a "warning".  These are the rules that you can define (and many are predefined for you, but you can always change them) which should not be violated and often such rules are included as part of NDepend's Quality Gates.

Quality Gates are the way that NDepend can determine an overall status of whether the code rules upon which the quality gate is based has passed or failed.  A Quality Gate is something that, like the code rules themselves, can used as-is out of the box, be modified from an existing one or be completely defined by yourself.  Quality gates can rely on one or more code rules in order to determine if the analysed source code is considered good enough to be released to production, so whereas the individual rules themselves will return a list of types, methods or other code artifacts that have matched the query, quality gates will return a pass, warn or fail status that can be used as part of an automated build process in order to fail the build, similar to how compilation errors will fail the compilation whereas warning won't.  Very powerful stuff.  Quality gates are defined using the same CQLinq syntax we've seen before:

// <QualityGate Name="Percentage Code Coverage" Unit="%" />
failif value < 70%
warnif value < 80%

Note that the first comment line actually defines this query as a quality gate and the first two lines of the query show the relative values from the query that should trigger the different return statuses of warning vs failure.

As well as code rules and quality gates, NDepend contains other metrics such as a SQALE Debt Ratio and other debt ratings.  The debt here is the technical debt that may exist within the code base.  Technical Debt is defined as the implied cost of rework/refactoring of parts of the solution in order to fix elements of code that are considered to be of lower quality.  The SQALE Ratio is a well defined, largely industry accepted mechanism for calculating the level of debt within a software project and is expressed as a percentage of the estimated technical-debt, compared to the estimated effort it would take to rewrite the code element from scratch.  NDepend uses the SQALE ratio to show a "Debt Rating", expressed as a letter from A (the least amount of debt) through to E (the most amount of debt), that is attached to the overall solution as well as individual elements of code such as types and methods and this debt rating can be seen on many of the tooltips such as the one shown when hovering over Query and Rules Edit window results as well as on the NDepend dashboard and Analysis Report.  There's a lot more metrics provided by NDepend including a very interesting Abstractness vs Instability metric that's shown on the front page of the Analysis Report.  Speaking of which...


Back to the beginning

So, having taken a diversion to understanding the core foundations of what makes NDepend tick, we can go back to that dashboard and analysis report that we saw at the beginning just after we created a new NDepend project and ran an analysis on our code for the very first time.

The dashboard and report contain many of the same metrics but whilst the dashboard is a window inside Visual Studio (or the separate Visual NDepend IDE), the report is an entirely standalone HTML report and so can be sent to colleagues who may not have the requisite development tools on their machines and be viewed by them.  This also means that the Analysis report is the perfect artifact to be produced as part of your automated build process.  Many teams will often have a dashboard of some description that shows statistics and metrics on the build process and the quality of the code that's been built.  NDepend's analysis report can be easily integrated into such a dashboard, or even hosted on a website domain as it's a static HTML file with some linked images.

The analysis report also contains static snapshot versions of the various graphical metrics representations that can be viewed inside NDepend such as the Dependency Graph, Dependency Matrix, Treemap Metric View and another interesting graph which plots Abstractness vs Instability.

This graph is an interesting one and can give a good idea of which areas of the analysed solution are in need of some care and attention but for very different reasons.

The graph plots each assembly within the solution against a diagonal baseline that runs from the top left corner of the graph to the bottom right.  Starting at the bottom right, as we move along the x-axis, towards the left, we get more "instable" and moving up the y-axis towards the top, we get more "abstract".  Abstractness is how much that project depends upon abstract classes or interfaces rather than concrete classes.  This indicates that the project is somewhat easier to change and is more malleable.  Instability is how easy such change can be made without breaking the code by analysing the various inter-dependencies within that code.  This is a combination of two other metrics, those of afferent coupling and efferent coupling.  Afferent coupling is the number of types that depend upon the given assembly whilst efferent coupling is the number of external types that this assembly depends upon.  We want our assemblies to remain inside the diagonal green area of the graph.  If an assembly's code is high in abstractness but also has a very low amount of either internal or external coupling, we're headed towards the "zone of uselessness".  This isn't a place we want to be, but correcting this may be reasonably easy since code being well within this area means that it doesn't do an awful lot and so could probably be fairly easily removed.  If, however, an assembly's  code is both overly rigid in its dependence on concrete types and also rigid in its instability (i.e has a high amount of internal and/or external coupling) we're headed towards the "zone of pain".  This is an area that we really don't want to be in since code being well within this area means that it's probably doing a lot - maybe even too much - and it's going to be very difficult to change it without breaking things.

Keeping an eye on each assembly's place within the Abstractness vs Instability graph can help to ensure your code stays at both the correct level of abstraction without being too abstract and also ensuring that the code remains fairly well decoupled allowing easier modification.  Metrics such as the abstractness vs instability graph and the treemap metric view are some of my favourite features of NDepend as they present their data in a very easily consumable format that can be viewed and understood "at a glance" whilst being based on some very complex underlying data.


But wait, it gets better

Similar to the Queries and Edit window that allows us to define our own code queries and rules as well as modifying existing ones, NDepend includes a very powerful Search function. 

This function uses the same kind of interface as the Query and Rules Edit window, but allows arbitrary queries to be performed against the analysed code with the matching results showing immediately within the window.  NDepend's search functionality can be optionally constrained to methods, types, assemblies or namespaces, but by default will search across all code elements.  Searches can be based on simple text matching, regular expressions or even full CQLinq queries.  This makes NDepend's search functionality even more powerful than something like the "Go to anything/everything" functionality that's been a part of ReSharper for a long time and is now also a part of Visual Studio itself as we can not only search for pieces of our code based on textual matches, but we can also use the highly expressive power of NDepend's CQLinq syntax, objects and properties to search for code based upon such things as method size, code lacking unit tests, code with no callers or too many callers.  As with the Queries and Rules edit window results, we can double-click on any result and be navigated directly to the matched line of code.  One quirk I did notice when using the Search functionality is that, when searching for matching elements inside of a method - i.e. fields, or other certain elements of code such as interfaces, you must click on the containing method or type that's shown as part of the search results and not the actual interface or field name itself.  Clicking on these will cause NDepend to display various errors as to why it can't navigate directly to that element.  I suspect this is due to the fact that NDepend's analysis works on the compiled intermediate language and not directly against the source code, but since this is a code search function, it would be nice if such navigation were possible.

As well as NDepend's ability to analyse the source code of your solution, it can leverage other tools' artifacts and data to improve its own analysis.  One such major piece of data that NDepend can utilise is code coverage data.  There are a number of tools that can calculate the code coverage of your solution.  This is the amount of code within your solution that is "covered" by unit tests, that is code that is executed as part of one or more unit tests.  Code coverage is built into Visual Studio itself at the highest SKU level, Enterprise, but is also provided by a number of other tools such as JetBrains' dotCover, and NCover.  NDepend can import and work with the code coverage output data of all three of the aforementioned tools, although the files from the dotCover tool have to be provided to NDepend in a special format.  Once some code coverage data is added to an NDepend project, several additional metrics relating to code coverage become available within the complete quite of NDepend metrics, such as identifying code that should have a minimum level of coverage and code whose coverage percentage should never decrease over time as well as other interesting metrics such as the amusingly titled C.R.A.P. metric.  This is the "Change Risk Analyzer and Predictor" metric and gives methods a score based upon their level of code coverage versus their complexity with more complex methods requiring higher code coverage.  Having the ability to integrate code coverage into NDepend's metrics suite is another very nice touch and allows the full set of metrics for your solution to be centralised in a single location.  This also helps to keep automated builds fairly simple without too many moving parts whilst also providing an extensive amount of data on the build.

We've mentioned a few times how NDepend can be integrated into your automated build process.  There's nothing new in having such tools be able to be integrated with such a fundamental part of any software team's daily processes, but whereas some tools simply give you the ability to run them from a command line and then leave you to figure out the exact return values from the executable and other output artifacts and how they might be integrated, NDepend provides some very comprehensive online documentation (even with links to that documentation from the NDepend menus inside of Visual Studio) giving very precise instructions for integrating NDepend into a large number of modern Continuous Integration tools, such as Visual Studio Team Services, TeamCity, Jenkins, FinalBuilder & SonarQube.  This documentation not only includes extensive step-by-step guides to the integration, but also tutorial and walkthrough videos.  This is a very nice touch.

Finally, we've touched upon this above, but NDepend has the ability to perform its analysis in a temporal way.  This essentially means that NDepend can analyse the same solution at different times and then compare different sets of analysis results.  This allows the production of "trends" within the analysis data and is incredibly powerful and possibly one of the most useful features of the NDepend product.  We saw earlier how the Analysis report and the NDepend dashboard contain metrics for such things as Debt rating of the solution, but upon closer examination, we can see that the debt rating and percentage is shown as either an increase or a decrease since the last time that the solution's debt was calculated:

Having such data available to us is incredibly powerful.  We can now track these metrics over time to see if our solution is generally improving or decreasing in quality.  Something that is very important for any software development team who must maintain a solution over a long period of time.   NDepend allows the setting of a "baseline" set of metrics data against which new sets of recently analysed data can be compared.  This gives us a powerful ability to not only compare our most recent set of analysis results with the immediately prior set of results, which if NDepend's analysis is integrated into a continuous build process could be very recently indeed, but also being able to compare our most recent set of results with those of a day ago, a week or a month ago, or even this time last year.  With this we can see not only how our code changes over small time frames, but over larger units of time too.  This ability to view metrics over time frames is helpfully built into many of the interfaces and windows within NDepend, so for example, the Dashboard contains a very useful filter at the top of the window allowing us to set the time range for the charts shown within the dashboard.  NDepend includes the ability to generate a full set of metrics around trends within the solution, so we can for example, track such things as how many "issues" that an earlier NDepend analysis identified have been fixed as well as how many new issues have been introduced, and many of the built-in rules that NDepend defines are specifically based upon changes to the code over time.  For example, there is a complete category group, "Code Smells Regression" that contains many rules starting with the name "From now...".  These specific rules will be broken if code quality for the specific attribute measured falls over time, from one analysis run to the next.  This helps to ensure that code is not only improved in quality, but stays that way.  NDepend doesn't stop there with its ability to view changes over time, and the CQLinq query language includes a large amount of properties that specifically make use of the ability to examine such change in the code over time.  Objects/Domains such as Issues and IssuesInBaseline, and properties such as NewerVersion on a method can allow us to write arbitrary queries comparing parts of our code over time.  So, for example, the following query shows us which methods have increased in cyclomatic complexity:

from m in Methods
where m.NewerVersion().CyclomaticComplexity > m.OlderVersion().CyclomaticComplexity
select m

This is exceptionally powerful data to have on a software project developed and maintained by a commercial team as it allows very informed decisions to be made regarding how much effort within a given amount of work (say, an agile sprint) should be dedicated not to adding additional features and functionality but to maintenance of the existing code that's already there.  Moreover, not only do we know how much effort we should expend on code maintenance but also we know exactly where that effort should be directed to have maximum impact in improving code quality - a critical factor in the success of any software project.



So, we've looked at NDepend and examined some of its powerful static analysis features and we've seen how such a tool could be integrated into a broader software development life-cycle, but should you actually use NDepend for your static analysis needs?

NDepend is a commercial tool and its license comes in two flavours.  A Developer license is required for each developer that will use NDepend on their own development machine, and this license is currently priced at 399 euros.  If you want to use NDepend as part of your Continuous Integration process on a build machine, you'll need a separate license for that which is currently priced at 799 euros.  All licenses are for one year and can be renewed with a 40% discount on the initial purchase price.  All ongoing licenses are eligible for continued upgrades to the product.  All licenses are perpetual fallback licenses meaning that you can stop renewing and still keep using your current version, although you'll lose the ability to upgrade to newer versions of the product.

As a software tool, it's expensive but not exorbitantly so.  For any commercial software development team that cares about the quality of their code, and especially a team tasked with maintaining a software product over time, NDepend is not really expensive at all when compared to the cost of other tools that typical software development teams will invest in.  In such an environment, NDepend is well worth the money.  NDepend has the ability to fit in with most modern team's processes and workflows and has the power to be able to be configured and tweaked to any given team's requirements regarding what constitutes good vs bad code and the level of "quality" that the code has to adhere to before it's considered good enough for production release.  This gives the tool incredible power and flexibility.  That said, I think it's perhaps too expensive a tool for individual software developers who may wish to purchase just for themselves.  Also, to the best of my knowledge, there's no free version of NDepend available for open source projects, as there are of other tools such as ReSharper, OzCode, TeamCity, AppVeyor and many more.  For this reason, I'd love to see another SKU or edition of the NDepend product that is more affordable for individual developers.  Perhaps an "NDepend Lite" that ships only as a Visual Studio plug-in and removes some functionality such as the ability to edit CQLinq queries and the tracking of metrics over time and having a more affordable price for individual developers.

Setup of NDepend is very easy, being delivered as a simple .zip archive file.  The integration of the Visual Studio plugin is also very simple to install and use, and registering your software license is straight-forward and uncomplicated so you can be up and running with the NDepend product in a matter of minutes.  In a way, this is deceptive as once you're back in Visual Studio (or even the Visual NDepend stand alone IDE) it can be difficult to know exactly where to start.  Again, this is largely my own fault for trying to dive straight in without referring to the documentation.  And it must be said that the NDepend's documentation is exceptionally good.  Although newer versions of NDepend try to improve ease of accessibility into the software, the newly added "guide tool-tips" are often not necessarily easily discoverable until you stumble onto them resulting in a bit of a chicken-and-egg situation.  It's hard to really criticise the product for this, though, as by its very nature, it is a highly technical and complex product.  Having done things the hard way, my recommendation would be to spend some time reading the (very good) documentation first and only then diving into the product.

Although the initial learning curve can be quite steep, usage of the product is very easy, especially once you get familiar with the various functions, menu options and windows. And if you're stuck at any point, the excellent documentation and handy tool-tips are always only a mouse click (or hover) away.  Also, NDepend is quick.  Very quick.  Once analysis of a solution is completed - using my test solution, NDepend was able to perform its analysis in less than half the time it takes to fully compile and build the solution - the complete code-base can be queried and searched in near real-time.  And as well as providing a plethora of highly informative and profoundly useful metrics on the analysed code, NDepend also contains such functionality as a general-purpose "code-search-on-steroids" that give other tools offering similar functionality (i.e ReSharper) a run for their money.

NDepend isn't perfect, and I did discover a few quirks with the product as I used it.  One quirk seemed to be a bit of a false positive which was a broken built-in rule that stated that your own methods shouldn't be called "Dispose".  The section of code that was identified as breaking this rule did indeed have a Dispose() method, but it was necessary due to interface implementation.  In this case, it wasn't the IDisposable interface that the class was implementing, but rather the IHttpModule interface that forced me to implement the Dispose() method, yet NDepend didn't seem to like that and flagged that as a broken rule.  NDepend also has a rule about the names of methods not being too long, however, I was finding that my unit tests were frequently falling afoul of this rule as most of them had quite long method names.  This is quite normal for unit tests, and is often considered good or best-practice to name unit tests in such a way with a very expressive name.  You can always decide not to include the test assemblies in the NDepend analysis, but then you may miss out on other NDepend rules that would be very helpful, such as warning when a method has too many lines of code.  It would be nice to be able to tell NDepend to omit certain files or assemblies from certain rules or categories of rules similar to how you can tell ReSharper to ignore certain inspections via comments within your code.  You can omit code from queries at the moment, however, in order to achieve this you have to manually edit the CQLinq of the query, which isn't the best or simplest way.  On the plus side, NDepend have said that they will support a [SuppressWarning] attribute as well as custom views of code within queries, making ignoring certain code much easier in a future version of the product.

Another quirk I found was that navigating to certain code elements from either the search results or results of code rules in the Queries and Rules Edit window can sometimes be problematic.  One rule states "Avoid interfaces too big" and so the results shown in the window are all matched interfaces that the rule has deemed too big.  However, trying to navigate to the interface by double-clicking on the matching result gives an error with a reason of "N/A because interface", which is quite confusing and somewhat unintuitive.  It would be nice if navigation to code via double-clicking on query results was more universal, meaning that all results for all code elements would navigate to the corresponding section of source code when clicked.

An omission from NDepend's functionality that I'd love to see included out-of-the-box would be the ability to identify "code clones" - sections of code that are duplicated in various different parts of the solution.  Whilst this isn't a "metric" as such, it's certainly one of the first ports of call that I turn to when looking to refactor and improve an existing large code base.  In incumbent code-bases, there is frequently a fair amount of duplicated code that exists and removing as much duplication as you possibly can not only reduces the number of lines of code (so long as readability and functionality are maintained, less code is always better than more code) but it also helps in the refactoring of methods against other metrics such as cyclomatic complexity and other factors.  NDepend does include a "Search for Duplicate Code" powertool although in my experience of comparing its output to that of the "Analyze solution for Code Clones" option in Visual Studio Enterprise, I found that it didn't detect as many instances of duplicated code.

This is almost certainly due to the difference in how the two tools work - the NDepend feature looks for methods that make many calls to a "set" of other methods, whilst the Visual Studio function will look for duplicates of actual code itself.  The NDepend feature works in the way it does no doubt due to examining the compiled code rather than the raw source code.  However, in the test solution that I was using, which I knew contained many sections of identically duplicated code inline within many different methods,  NDepend failed to identify a lot of the duplicated code (not to mention taking significantly longer to perform the analysis - a whopping 42 minutes for a 38KLOC solution), whilst the Visual Studio feature detected them all in less than 5 minutes.  Also, as the NDepend functionality is only provided from the command line invoked Powertools, it's not easy to immediately navigate to the offending lines of code by double-clicking as we can with features that are part of the Visual Studio plug-in.  The good news, though, is that in speaking to Patrick, the NDepend lead developer, he tells me that a first-class in-built function for detecting code duplicates is on the roadmap for a future version of the NDepend product.  Hopefully, this functionality shouldn't be too long in coming as NDepend is a very actively developed product, receiving frequent updates.

One other feature I'd love to see in a future version of NDepend would be some dashboard metrics or query rules around identifying connascence within a software solution. Admittedly, some of the levels of connascence are dynamic and so can't really be identified without running the actual code, however, the first five types of connascence can definitely be identified via static analysis and it'd be great if NDepend could include identification of these out-of-the-box.

Overall, I would definitely recommend NDepend if code quality is important to you.  For any commercial software team charged with maintaining a code-base over a period of time, quality of that code-base is of critical importance to the overall success of the software and the ability and ease of continued development.  A tool like NDepend provides great power and insight into the quality of the code-base both right now as well as how that quality fluctuates over time.  NDepend's ability to both aggregate quality metrics into a single, simple debt percentage or rating as well as its ability to identify the many, individual specific issues that the code-base suffers from and how those issues are introduced or resolved over time is a huge help for any team in knowing not only exactly when to focus effort on quality improvement but where to focus that effort.

DDD South West 6 In Review

image(4) This past Saturday 25th April 2015 saw the 6th annual DDD South West event, this year being held at the Redcliffe Sixth Form Centre in image(2)Bristol.  This was my very first DDD South West event, having travelled south to the two DDD East Anglia events previously, but never to the south west for this one.

I’d travelled down south on the Friday evening before the event, staying in a Premier Inn in Gloucester.  This enabled me to only have a relatively short drive on the Saturday to get to Bristol and the DDD South West event.  After a restful night’s sleep in Gloucester, I started off on the journey to Bristol, arriving at one of the recommended car parks only a few minutes walk away from the DDDSW venue.

Upon arrival at the venue, I checked myself in and proceeded up the stairs to what is effectively the Sixth Form “common room”.  This was the main hall for the DDDSW event and where all the attendees would gather, have teas, coffees & snacks throughout the day.

image(7) Well, as is customary, the first order of business is breakfast!  Thanks to the generous sponsors of the event, we had ample amounts of tea, coffee and delicious danish pastries for our breakfast!  (Surprisingly, these delicious pastries managed to last through from breakfast to the first (and possibly second) tea-break of the day!)

image(10)Well, after breakfast there was a brief introduction from the organisers as to the day’s proceedings.  All sessions would be held in rooms  on the second floor of the building and all breaks, lunch and the final gathering for the customary prize draw would be held in the communal common room.  This year’s DDDSW had 4 main tracks of sessions with a further 5th track which was the main sponsors track.  This 5th track only had two sessions throughout the day whilst the other 4 had 5 sessions in each.

The first session of the day for me was “Why Service Oriented Architecture?” by Sean Farmar

image(9)Sean starts his talk by mentioning how "small monoliths" of applications can, over time and after many tweaks to functionality, become large monoliths and can become a maintenance nightmare which is both a high risk to the business and can lead to changes that are difficult to make and can have unforeseen side-effects.  When we’ve created a large monolith of an application, we’re frequently left with a “big ball of mud”.

Sean talks about one of his first websites that he created back in the early 1990’s.   It had around 5000 users, which by the standards of the day was a large number.  Both the internet and the web have grown exponentially since then, so 5000 users is very small by today’s standards.  Sean states that we can take those numbers and “add two noughts to the end” to get a figure for a large number of users today.  Due to this scaling of the user base, our application needs to scale too, but if we start on the path of creating that big ball of mud, we’ll simply create it far quicker today than we’ve ever done in the past.

Sean continues to state that after we learn from our mistakes with the monolithic big ball of mud, we usually move to web services.  We break a large monolith into much smaller monoliths, however, these webservices need to then talk both to each other as well to the consumers of the webservice. For example, the sales webservice has to talk to the user webservice which then possibly has to talk to the credit webservice in order to verify that a certain user can place an order of a specific size.  However, this creates dependencies between the various web services and each service becomes coupled in some way to one or more other services.  This coupling is a bad thing which prevents the individual web services from being able to exist and operate without the other webservices upon which it depends.

From here, we often look to move towards a Service Oriented Architecture (SOA).  SOA’s core tenets are geared around reducing this coupling between our services.

Sean mentions the issues with coupling:

Afferent (dependents) & Efferent (depends on) – These are the things that a given service depends upon and the other services that, in turn, depend upon the first service.
Temporal (time, RPC) – This is mostly seen in synchronous communications – like when a service performs a remote procedure call (RPC) to another service and has to wait for the response.  The time taken to deliver the response is temporal coupling of those services.
Spatial (deployment, endpoint address) – Sean explains this by talking about having numerous copies of (for example) a database connection string in many places.  A change to the database connection string can cause redeployments of complete parts of the system.

After looking at the problems with coupling, Sean moves on to looking at some solutions for coupling:  If we use XML (or even JSON) over the wire, along with XSD (or JSON Schema) we can define our messages and the transport of our messages using industry standards allowing full interoperability.  To overcome the temporal coupling problems, we should use a publisher/subscriber (pub/sub) communication mechanism.  Publishers do not need to know the exact receivers of a message, it’s the subscribers responsibility to listen and respond to messages that it is interested in when the publisher publishes the message.  To overcome the spatial issues, we can most often use a central message queue or service bus.  This allows publishers and subscribers to communicate with each other without hard references to the location of the publisher or subscriber on the network, they both only need to communicate to the single message bus endpoint.  This frees our application code from ever knowing who (or where) we  are “talking to” when sending a command or event message to some other service within the system, pushing these issues down to being an infrastructure rather than an application level concern.  Usage of a message bus also gives us durability (persistence) of our messages meaning that even if a service is down and unavailable when a particular event is raised, the service can still receive and respond to the event when it becomes available again at a later time. 

arch Sean then shows us a diagram of a  typical n-tier architecture system.  He mentions how “wide” the diagram is and how each “layer” of the application spans the full extent of that part of the system (i.e. the UI later is a complete layer than contains all of the UI for the entire system).  All of these wide horizontal layers are dependent upon the layer above or beneath it.

Within a SOA architecture, we attempt to take this n-tier design and “slice” the diagram vertically.  Therefore each of our smaller services each contain all of the layers - a service endpoint, business logic, data access layer and database - each in thin, focused vertical slices for specific focused areas of functionality.

arch2 Sean remarks that if we're going to build this kind of system, or modify an existing n-tier system into these vertical slices of services, we must start at the database layer and separate that out.  Databases have their own transactions, which in a large monolithic DB can lock the whole DB, locking up the entire system.  This must be avoided at all costs.

Sean continues to talk about how our services should be designed.  Our services should be very granular.  i.e. we shouldn't have an "UpdateUser" method that performs creation and updates of all kinds of properties of a "User" entity, we should have separate "CreateUser", "UpdateUserPassword", "UpdateUserPhoneNumber" methods instead.  The reason is that, during maintenance, constantly extending an "UpdateUser" method will force it to take more and more arguments and parameters and will grow extensively in lines of code as it tries to handle more and more properties of a “user” entity and it thus become unwieldy.  A simpler "UpdateUserPassword" is sufficiently granular enough that it'll probably never need to change over its lifetime and will only ever require 1 or 2 arguments/parameters to the method. 

Sean then asks how many arguments our methods should take.  He says his own rule of thumb for maximum arguments to any method is 2.  Once you find yourself needing 3 arguments, it's time to re-think and break up the method and create another new one.   By slicing the system vertically we do end up with many many methods, however, each of these methods are very small, very simple and are very specific with individual specific concerns.

Next we look at synchronous vs asynchronous calls.  Remote procedure calls (RPC) will usually block and wait as one service waits for a reply from another.  This won’t scale in production to millions of users.  We should use the pub/sub mechanism which allows for asynchronous messaging allowing services that require data from other services to not have to wait and block while the other service provides the data, it can subscribe to a message queue and be notified of the data when it's ready and available.

Sean goes on to indicate that things like a user’s address can be used by many services, however, it’s all about the context in which that piece of data is used by that service.  For this reason it’s ok for our system to have many different representations of, effectively, the same piece of data.  For example, to an accounting service, a user’s address is merely a string that gets printed onto a letter or an invoice and it has no further meaning beyond that.  However, to a shipping service, the user’s address can and probably will affect things like delivery timescales and shipping costs.

Sean ends his talk by explaining that, whilst a piece of data can be represented in different ways by different parts of the system, only one service ever has control to write that data whereas all other services that may need that data in their own representation will only ever be read-only.


image (15) The next session was Richard Dalton’s “Burnout”.  This was a fascinating session and is quite an unusual talk to have at these DDD events, albeit a very important talk to have, IMHO.  Richard’s session was not about a new technology or method of improving our software development techniques as many of the other sessions at the various DDD events are, but rather this session was about the “slump” that many programmers, especially programmers of a certain age, can often feel.  We call this “burnout”.

Richard started by pointing out that developer “burnout” isn’t a sudden “crash-and-burn” explosion that suddenly happens to us, but rather it’s more akin to a candle - a slow burn that gradually, but inevitably, burns away.  Richard wanted to talk about how burnout affected him and how it can affect all of us, and importantly, what can we do to overcome the burnout if and when it happens to us.  His talk is about “keeping the fire alive” – that motivation that gets you up in the morning and puts a spring in your step to get to work, get productive and achieve things.

Richard starts by briefly running through the agenda of his talk.  He says he’ll talk about the feelings of being a bad programmer, and the “slump” that you can feel within your career, he’ll talk about both the symptoms and causes of burnout, discuss our expectations versus the reality of being a software developer along with some anti-patterns and actions.

We’re shown a slide of some quite shocking statistics regarding the attrition rate of programmers.  Computer Science graduates were surveyed to see who was still working as a programmer after a certain length of time.  After 6 years, the amount of CS graduates still working as a programmer is 57%, however after 20 years, this number is only 19%.  It’s clear that the realistic average lifespan of a programmer is perhaps only around 20-30 years.

Richard continues by stating that there’s really no such thing as a “computer programmer” anymore – there no longer a job titled as such.  We’re all “software developers” these days and whilst that obviously entails programming of computers, it also entails far more tasks and responsibilities.  Richard talks about how his own burnout started and he first felt it was at least partially caused by his job and his then current employer.  Although a good and generous employer, they were one of many companies who claimed to be agile, but really only did enough to be able to use the term without really becoming truly agile.  He left this company to move to one that really did fully embrace the agile mantra however due to lots of long-standing technical debt issues, agile didn’t really seem to be working for them.  Clearly, the first job was not the cause (or at least not the only cause) of Richard’s burnout.  He says how every day was a bad day, so much so that he could specifically remember the good days as they were so rare and few and far between.

He felt his work had become both Dull and Overwhelming.  This is where the work you do is entirely unexciting with very little sense of accomplishment once performed, but also very overwhelming which was often manifested by taking far longer to accomplish some relatively simple task than should really have been taken, often due to “artificial complexity”.  Artificial complexity is the complexity that is not inherent within the system itself, but rather the complexity added by taking shortcuts in the software design in the name of expediency.  This accrues technical debt, which if not paid off quickly enough, leads to an unwieldy system which is difficult to maintain.  Richard also states how from this, he felt that he simply couldn’t make a difference.  His work seemed almost irrelevant in the grand scheme of things and this leads to frustration and procrastination.  This all eventually leads to feelings of self-doubt.

Richard continues talking about his own career and it was at this point he moved to Florida in the US where he worked for 1.5 years.  This was a massive change, but didn’t really address the burnout and when Richard returned he felt as though the entire industry had moved on significantly in those 1.5 years when he was away, whilst he himself had remained where he was before he went.  Richard wondered why he felt like this.  The industry had indeed changed in that time and it’s important to know that our industry does change at a very rapid pace.  Can we handle that pace of change?  Many more developers were turning to the internet and producing blogs of their own and the explosion of quality content for software developers to learn from was staggering.  Richard remarks that in a way, we all felt cleverer after reading these blogs full of useful knowledge and information, but we all feel more stupid as we feel that others know far more than we do.  What we need to remember is that we’re reading the blogs showing the “best” of each developer, not the worst.

We move on to actually discuss “What is burnout?”  Richard states that it really all starts with stress.  This stress is often caused by the expectation vs. reality gap – what we feel we should know vs. what we feel we actually do know.  Stress then leads to a cognitive decline.  The cognitive decline leads to work decline, which then causes further stress.  This becomes a vicious circle feeding upon itself, and this all starts long before we really consider that we may becoming burnt out.  It can manifest itself as a feeling of being trapped, particularly within our jobs and this leads itself onto feeling fatigued.  From here we can become irritable, start to feel self-doubt and become self-critical.  This can also lead to feeling overly negative and assuming that things just won’t work even when trying to work at them.  Richard uses a phrase that he felt within his own slump - “On good days he thought about changing jobs.  On bad days he thought about changing career”!  Richard continues by stating that often the Number 1 symptom of not having burnout is thinking that you do indeed have it.  If you think you’re suffering from burnout, you probably aren’t but when you do have it, you’ll know.

Now we’re moving on to look at what actually leads to burnout?  This often starts with a set of unclear expectations, both in our work life, but in our general life as a software developer.  It can also come from having too many responsibilities, sleep and relaxation issues and a feeling of sustained pressure.  This often all occurs within the overarching feelings of a weight of expectation versus the reality of what can be achieved.

Richard states that it was this raised expectation of the industry itself (witness the emergence of agile development practices, test-driven development practices and a general maturing of many companies’ development processes and practices in a fairly short space of time) and the disconnect with reality, which Richard felt simply didn’t live up to the expectations that ultimately lead to him feeling a great amount of stress.  For Richard, it was specifically around what he felt was a “bad” implementation of agile software development which actually created more pressure and artificial work stress.  The implementation of a new development practice that is supposed to improve productivity naturally raises expectations, but when it goes wrong, it can widen the gap between expectation and reality causing ever more stress.  He does further mention that this trigger for his own feelings of stress may or may not be what could cause stress in others.

Richard talks about some of the things that we do as software developers that can often contribute to the feelings of burnout or of increasing stress.  He discusses how software frameworks – for example the recent explosion of JavaScript frameworks – can lead to an overwhelming amount of choice.  Too much choice then often leads to paralysis and Richard shares a link to an interesting video of a TED talk that confirms this.  We then move on to discuss software side projects.  They’re something that many developers have, but if you’re using a side-project as a means to gain fulfilment when that fulfilment is lacking within your work or professional life, it’s often a false solution.  Using side-projects as a means to try out and learn a new technology is great, but they won’t fix underlying fulfilment issues within work.  Taking a break from software development is great, however, it’s often only a short-term fix.  Like a candle, if there’s plenty of wax left you can extinguish the candle then re-light it later, however, if the candle has burned to nothing, you can’t simply re-ignite the flame.  In this case, the short break won’t really help the underlying problem.

Richard proceeds to the final section of his talk and asks “what can we do to combat burnout?”  He suggests we must first “keep calm and lower our expectations!”.  This doesn’t mean giving up, it means continuing to desire the professionalism within both ourselves and the industry around us, but acknowledging and appreciating the gap that exists between expectation and reality.  He suggests we should do less and sleep more. Taking more breaks away from the world of software development and simply “switching off” more often can help recharge those batteries and we’ll come back feeling a lot better about ourselves and our work.  If you do have side-projects, make it just one.  Many side-projects is often as a result of starting many things but finishing none.  Starting only one thing and seeing it through to the finish is a far better proposition and provides for a far greater sense of accomplishment.  Finally, we look at how we can deal with procrastination.  Richard suggests one of the best ways to overcome it in work is to pair program.

Finally, Richard states that there’s no shame in burnout.  Lots of people suffer from it even if they don’t call it burnout, whenever you have that “slump” of productivity it can be a sign that it’s time to do something about it.  Ultimately, though, we each have to find our own way through it and do what works for us to overcome it.


image (19) The final talk before lunch was on the sponsor’s track, and was “Native Cross-Platform mobile apps with C# & Xamarin.Forms” by Peter Major.  Peter first states his agenda with this talk and that it’s all about Xamarin, Xamarin.Forms and what they both can and can’t do and also when you should use one over the other.

Peter starts by indicating that building mobile apps today is usually split between taking a purely “native” approach – where we code specifically for the target platform and often need multiple teams of developers for each platform we’ll be supporting – versus a “hybrid” approach which often involves using technologies like HTML5 and JavaScript to build a cross-platform application which is then deployed to each specific platform via the use of a “container” (i.e. using tools like phonegap or Apache’s Cordova).

Peter continues by looking at what Xamarin is and what is can do for us.  Xamarin allows us to build mobile applications targeting multiple platforms (iOS, Android, Windows Phone) using C# as the language.  We can leverage virtually all of the .NET or Mono framework to accomplish this.  Xamarin provides “compiled-to-native” code for our target platforms and also provides a native UI for our target platforms too, meaning that the user interface must be designed and implemented using the standard and native design paradigms for each target platform.

Peter then talks about what Xamarin isn’t.  It’s not a write-once, run-anywhere UI, and it’s not a replacement for learning about how to design effective UI’s for each of the various target platforms.  You’ll still need to know the intricacies for each platform that you’re developing for.

Peter looks at Xamarin.iOS.   He states that it’s AOT (Ahead-Of-Time) compiled to an ARM assembly.  Our C# source code is pre-compiled to IL which in turn is compiled to a native ARM assembly which contains the MONO framework embedded within it.  This allows us as developers to use virtually the full extent of the .NET / Mono framework.  Peter then looks at Xamarin.Android.  This is slightly different to Xamarin.iOS as it’s still compiled to IL code, but then the IL code is JIT (Just-In-Time) compiled inside of a MONO Virtual Machine within the Android application.  It doesn’t run natively inside the Dalvik runtime on Android.  Finally, Peter looks at Xamarin.WindowsPhone.  This is perhaps the simplest to understand as the C# code is compiled to IL and this IL can run (in a Just-In-Time manner) directly against the Windows Phone’s own runtime.

Peter then looks at whether we can use our favourite SDK’s and NuGet Packages in our mobile apps.  Generally, the answer is yes.  SDK’s such as Amazon’s Kinesis for example are fully usable, but NuGet packages need to target PCL’s (Portable Class Libraries) if they’re to be used.

Peter asks whether applications built with Xamarin run slower than pure native apps, and the answer is that they generally run at around the same speed.  Peter shows some statistics around this however, he does also state that the app will certainly be larger in size than a natively written app.  Peter indicates, though, that Xamarin does have a linker and so it will build your app with a cut-down version of the Mono Framework so that it’ll only include those parts of the framework that you’re actually using.

We can use pretty much all C# code and target virtually all of the .NET framework’s classes when using Xamarin with the exception of any dynamic code, so we can’t target the dynamic language runtime or use the dynamic keyword within our code.  Because of this, usage of certain standard .NET frameworks such as WCF (Windows Communication Foundation) should be done very carefully as there can often be dynamic types used behind the scenes.

Peter then moves on to talk about the next evolution with Xamarin, Xamarin.Forms.  We’re told that Xamarin.Forms is effectively an abstraction layer over the disparate UI’s for the various platforms (iOS, Android, Windows Phone).  Without Xamarin.Forms, the UI of our application needs to be designed and developed to be specific for each platform that we’re targeting, even if the application code can be shared, but with Xamarin.Forms the amount of platform specific UI code is massively reduced.  It’s important to note that the UI is not completely abstracted away, there's still some amount of specific code per platform, but it's a lot less than when using "standard" Xamarin without Xamarin.Forms.

Developing with Xamarin.Forms is very similar to developing a WPF (Windows Presentation Foundation) application.  XAML is used for the UI mark-up, and the premise is that it allows the developer to develop by feature and not by platform.  Similarly to WPF, the UI can be built up using code as well as XAML mark-up, for example:

Content = new StackPanel().AddChildren(new Button() { Content = "Normal" });

Xamarin.Forms works by taking our mark-up that defines the placement of Xamarin.Forms specific “controls” and user interface elements and converting them using a platform-specific “renderer” to a native platform control.  By default, using the standard build-in renderers means that our apps won’t necessarily “look" like the native apps you’d find on the platform.  You can customize specific UI elements (i.e. a button control) for all platforms, or you can make the customisation platform specific.  This is achieved with a custom Renderer class that inherits from the EntryRenderer and adds the required customisations that are specific to the platform that is being targeted.

Peter continues to tell us that Xamarin.Forms apps are best developed using the MVVM pattern.  MVVM is Model-View-ViewModel and allows a good separation of concerns when developing applications, keeping the application code separate from the user interface code.  This mirrors the best-practice for development of WPF applications.  Peter also highlights the fact that most of the built-in controls will provide two-way data binding right out of the box.  Xamarin.Forms has "attached properties" and triggers.  You can "watch" a specific property on a UI element and in response to changes to the property, you can alter other properties on other UI elements.  This provides a nice and clean way to effectively achieve the same functionality as the much old (and more verbose) INotifyPropertyChanged event pattern provides.

Peter proceeds to talk about how he performs testing of his Xamarin and Xamarin.Forms apps.  He says he doesn’t do much unit testing, but performs extensive behavioural testing of the complete application instead.  For this, he recommends using Xamarin’s own Calabash framework for this.

Peter continues by explaining how Xamarin.Forms mark-up contains built-in simple behaviours so, for example, you can check a textbox's input is numeric without needing to write your own code-behind methods to perform this functionality.  It can be as simple as using mark-up similar to this:

<Entry Placeholder="Sample">

Peter remarks about speed of Xamarin.Forms developed apps and concludes that they are definitely slower than either native apps or even normal Xamarin developed apps.  This is, unfortunately, the trade-off for the improved productivity in development.

Finally, Peter concludes his talk by summarising his views on Xamarin.Forms.  The good:  One UI Layout and very customizable although this customization does come with a fair amount of initial investment to get platform-specific customisations looking good.  The bad:  Xamarin.Forms does still contain some bugs which can be a development burden.  There’s no XAML “designer” like there is for WPF apps – it all has to be written in a basic mark-up editor. Peter also states how the built-in Xamarin.Forms renderers can contain some internal code that is difficult to override, thus limiting the level of customization in certain circumstances.  Finally, he states that Xamarin.Forms is not open source, which could be a deciding factor for adoption by some developers.


IMG_20150425_131838 After Peter’s talk it was time for lunch!  Lunch at DDDSW was very fitting for the location in which we were in, the South-West of England.  As a result, lunch consisted of a rather large pasty of which we could choose between Steak or Cheese & Onion varieties, along with a packet of crisps, and a piece of fruit (a choice of apples, bananas or oranges) along with more tea and coffee!  I must say, this was a very nice touch – especially having some substantial hot food and certainly made a difference from a lot of the food that is usually served for lunch at the various DDD events (which is generally a sandwich with no hot food options available).

IMG_20150425_131849 After scoffing my way through the large pasty, my crisps and the fruit – after which I was suitably satiated – I popped outside the building to make a quick phone call and enjoy some of the now pleasant and sunny weather that had overcome Bristol.

IMG_20150425_131954 After a pleasant stroll around outdoors during which I was able to work off at least a few of the calories I’d just consumed, I headed back towards the Redcliffe Sixth Form Centre for the two remaining sessions of the afternoon.

I headed back inside and headed up the stairs to the session rooms to find the next session.  This one, similar to the first of the morning was all about Service Oriented Architecture and designing distributed applications.

image (1) So the first of the afternoon’s sessions was “Introduction to NServiceBus and the Particular Platform” by Mauro Servienti.  Mauro’s talk was to be an introduction to designing and building distributed applications with a SOA (Service Oriented Architecture) and how we can use a platform like NServiceBus to easily enable that architecture.

Mauro first starts with his agenda for the talk.  He’ll explain what SOA is all about, then he’ll move on to discuss long running workflows in a distributed system and how state can be used within.  Finally, he’ll look at asynchronous monitoring of asynchronous processes for those times when something may go wrong and allow us to see where and when it did.

Mauro starts by explaining the core tenets of NServiceBus.  Within NServiceBus, all boundaries are explicit.  Services are constrained and cannot share things between them.  Services can share schema and a contract but never classes.  Services are also autonomous, and service compatibility is based solely upon policy.

NServiceBus is built around messages.  Messages are divided into two types, commands and events. Each messages is an atomic piece of information and is used to drive the system forward in some way.  Commands are imperative messages and are directed to a well known receiver.  The receiver is expected (but not compelled) to act upon the command.  Events are also messages that are an immutable representation of something that has already happened.  They are directed to anyone that is interested.  Commands and events are messages with a semantic meaning and NServiceBus enforces the semantic of commands and events - it prevents trying to broadcast a Command message to many different, possibly unknown, subscribers and enforces this kind of “fire-and-forget” publishing only to Event messages.

We’re told about the two major messaging patterns.  The first is request and response.  Within the request/response pattern, a message is sent to a known destination - the sender knows the receiver perfectly but the receiver doesn't necessarily know the sender.  Here, there is coupling between the sender and the receiver.  The other major message pattern is publish and subscribe (commonly referred to as pub/sub).  This pattern has constituent parts of the system become “actors”, and each “actor” in the system can act on some message that is received.  Command messages are created and every command also raises an event message to indicate that the command was requested.  These event messages are published and subscribers to the event can subscribe and receive these events without having to be known to the command generator.  Events are  broadcast to anyone interested and subscribers can subscribe, listen and act on the event, or not act on the event.  Within a pub/sub system, there is much less coupling between the system’s constituent parts, and the little coupling that exists is inverted, that is, the subscriber knows where the publish is, not the other way round.

In a pub/sub pattern, versioning is the responsibility of the publisher.  The publisher can publish multiple versions of the same event each time an event is published.  This means that we can have numerous subscribers, each of which can be listening for, and acting upon different versions of the same event message.  As a developer using NServiceBus, your job is primarily to write message handlers to handle the various messages passing around the system.  Handlers must be stateless.  This helps scalability as well as concurrency.  Handlers live inside an “Endpoint” and are hosted somewhere within the system.  Handlers are grouped by "services" which is a logical concept within the business domain (i.e. shipping, accounting etc.).  Services are hosted within Endpoints, and Endpoint instances run on a Windows machine, usually as a Windows Service.

NServiceBus messages are simply classes.  They must be serializable to be sent over the wire.  NServiceBus messages are generally stored and processed within memory, but can be made durable so that if a subscriber fails and is unavailable (for example, the machine has crashed or gone down) these messages can be retrieved from persistent storage once the machine is back online.

NServiceBus message handlers are also simply classes, which implement the IHandleMessages generic interface like so:

public class MyMessageHandler : IHandleMessages<MyMessage>

So here we have a class defined to handle messages implemented by the class MyMessage.

NServiceBus endpoints are defined within either the app.config or the web.config files within the solution:

    <add Assembly="MyMessages" Endpoint="MyMessagesEndpoint" />

Such configuration settings are only required on the Sender of the message.  There is no need to configure anything on the message receiver.

NServiceBus has a BusConfiguration class.  You use it to define which messages are defined as commands and which are defined as events.  This is easily performed with code such as the following:

var cfg = new BusConfiguration();

    .DefiningCommandsAs( t => t.Namespace != null && t.Namespace.EndsWith( ".Commands" ) )
    .DefiningEventsAs( t => t.Namespace != null && t.Namespace.EndsWith( ".Events" ) );

using ( var bus = Bus.Create( cfg ).Start() )

Here, we’re declaring that the Bus will use in-memory persistence (rather than any disk-based persistence of messages), and we’re saying that all of our command messages are defined within a namespace that ends with the string “.Commands” and that all of our event messages are defined within a namespace ending with the string “.Events”.

Mauro then shows all of this theory with some code samples. He has an extensive set of samples that show all virtually all aspects of NServiceBus and this solution is freely available on GitHub at the following URL:  https://github.com/mauroservienti/NServiceBus.Samples

Mauro goes on to state that when sending and recieving commands, the subscriber will usually work with concrete classes when handling messages for that specific command, however, when sending or receiving event messages, the subscriber will work with interfaces rather than concrete classes.  This is a best practice and helps greatly with versioning.

NServiceBus allows you to use your own persistence store for persisting messages.  A typical store used is RavenDB, but virtually anything can be used.  There's only two interfaces that need to be implemented by a storage provider, and many well-known databases and storage mechanisms (RavenDB, NHibernate/SQL Server etc.) have integrations with NServiceBus such that they can be used as persistent storage. NServiceBus can also use third-party message queues.  MSMQ, RabbitMQ, SQL Server, Azure ServiceBus etc. can all be used.  By default NServiceBus uses the built-in Windows MSMQ for the messaging.

Mauro goes on to talk about state.  He asks, “What if you need state during a long-running workflow of message passing?”  He explains how NServiceBus accomplishes this using “Sagas”.    Sagas are durable, stateful and reliable, and they guarantee state persistence across message handling.  They can express message and state correlation and they empower "timeouts" within the system to make decisions in an asynchronous world – i.e. they allow a command publisher to be notified after a specific "timeout" of elapsed time as to whether the command did what was expected or if something went wrong.   Mauro demonstrates this using his NServiceBus sample code.

Mauro explains how the business endpoints are responsible for storing the business state used at each stage (or step) of a saga.  The original message that kicks off a saga only stores the "orchestration" state of the saga (for example, an Order management service could start a  saga that uses an Order Creation service, a Warehouse Service and a Shipping service that creates an order, picks the items to pack and then finally ships them).

The final part of Mauro’s talk is about monitoring and how we can monitor the various messages and flow through all of the messages passing around an inherently asynchronous system.  He states that auditing is a key feature, and that this is required when we have many asynchronous messages floating around a system in a disparate fashion.  NServiceBus provides some "behind-the-scenes" part of the software called "ServiceControl".  ServiceControl sits in the background of all components within a system that are publishing or subscribing to NServiceBus messages and it keeps it's own copy of all messages sent and received within that entire system.  It therefore allows us to have a single place where we can get a complete overview of all of the messages from the entire system along with their current state.

serviceinsight-sagaflow The company behind NServiceBus also provides separate software called “ServiceInsight”, which Mauro quickly demonstrates to us showing how it provides a holistic overview and monitoring of the entire message passing process and the instantiation and individual steps of long-running sagas.  It displays all of this data in a user interface that looks not dissimilar to a SSIS (SQL Server Integration Service) workflow diagram. 

Mauro states that handling asynchronous messages can be hard.  In a system built with many disparate messages, we cannot ever afford to lose a single message.  To prevent message loss, Mauro says that we should never use try/catch blocks within our business code.  He states that NServiceBus will automatically "add" this kind error handling within the creation, generation and sending of messages.  We need to consider transient failures as well as business errors.  NServiceBus will perform it’s own retries for transient failures of messages but business errors must be handled by our own code.  Eventually, transient errors in sent messages that fail to be delivered after the configured amount of maximum retries are placed into a special error message queue by NServiceBus itself, and this allows us to handle these failed messages in this queue as special cases.  To this end, Particular Software also have a separate piece of software called "ServicePulse" which allows monitoring of the entire the infrastructure.  This includes all message endpoints to see if they’re up and available to send/receive messages and well as full monitoring of the failed message queue.

IMG_20150425_155100image (3) After Mauro’s talk it was time for another break.  Unlike the earlier breaks throughout the day, this one was a bit special.  As well as the usual teas and coffees that were available all day long, this break treated all of the attendees to some lovely cream teas!  This was a very pleasant surprise and ensured that all conference attendees were incredibly well-fed throughout the entire conference.  Kudos to the organisers, and specifically the sponsors who allowed all this to be possible.


After our lovely break with the coffee and cream teas, it was on to the second session of the afternoon and indeed, the final session of the whole DDD event.  The final session was entitled “Monoliths to Microservices : A Journey”, presented by Sam Elamin.

IMG_20150425_160220 Sam works for Just Eat, and his talk is all about how he has gone on a journey within his work to move from large, monolithic applications to re-implementing the required functionality in a more leaner, distributed system composed largely of micro-services.

Sam firsts mentions the motivation behind his talk: Failure.  He describes how we learn from our failures, and states that we need to talk about our failures more as it’s only from failure that we can actually really improve. 

He asks, “Why do we build monoliths?”  As developers, we know it will become painful over time but we build these monolithic systems because we’re building a system very quickly in order to ship it fast.  People then use our system and we, over time, add more and more features into it. We very rarely, if ever, get the opportunity to go back and break things down into better structured code and implement a better architecture.  Wanting to spend time performing such work is often a very hard sell to the business as we’re talking to them about a pain that they don’t feel.  It’s only the developers who feel the pain during maintenance work.

Sam then states that it’s not all a bed of roses if we break down our systems into smaller parts.  Breaking down a monolithic application into smaller components reduces the complexity of each individual component but that complexity isn’t removed from the system.  It’s moved from within the the individual components to the interactions between the components.

Sam shares a definition of "What is a microservice?"  He says that Greg Young once said, "It’s anything you can rewrite in a week".  He states that a micro service should be a "business context", i.e. a single business responsibility and discrete piece of business functionality.

But how do we start to move a monolithic application to a smaller, microservices-based application?  Well, Sam tells us that he himself started with DDD (Domain Driven Design) for the design of the application and to determine bounded contexts – which are the distinct areas of services or functionality within the system.  These boundaries would then communicate, as the rest of the system communicated, with messages in a pub/sub (Publisher/Subscriber) mechanism, and each conceptual part of the system was entirely encapsulated by an interface – all other parts of the system could only communicate through this interface.

Sam then talks about something that they hadn’t actually considered when the first started on the journey: Race Hazards.  Race Hazards, or Race Conditions as they can also be known, within a distributed message-based architecture are when there are failures in the system due to messages being lost or being recieved out of order and the inability of the system to deal with this.  Testing for these kind of failures is hard as asynchronous messages can be difficult to test by their very nature. 

Along the journey, Sam discovered that things weren’t proceeding as well as expected.  The boundaries within the system were unclear and there was no clear ownership of each bounded context within the business.  This is something that is really needed in order for each context to be accurately defined and developed.  It’s also really important to get a good ubiquitous language - which is a language and way of talking about the system that is structured around the domain model and used by all team members to connect all the activities of the team with the software - correct so that time and effort is not wasted trying to translate between code "language" and domain language.

Sam mentioned how the teams’ overly strict code review process actually slowed them down.  He says that Code Reviews are usually used to address the symptom rather than the underlying problem which is not having good enough tests of the software, services and components.  He says this also applies to ensuring the system has the necessary amount of monitoring, auditing and metrics implemented within to ensure speedy diagnosis of problems.

Sam talks about how, in a distributed system of many micro-services, there can be a lot of data duplication.  One area of the system can deal with it’s own definition of a “customer”, whilst another area of the system deals with it’s own definition of that same “customer” data.  He says that businesses fear things like data duplication, but that it really shouldn't matter in a distributed system and it's often actually a good thing – this is frequently seen in systems that implement CQRS patterns, eventual consistency and correct separation of concerns and contexts due to implementation of DDD.  Sam states that, for him, code decoupling is vastly favourable to code duplication – if you have to duplicate some code in two different areas of the system in order to correctly provide a decoupled environment, then that’s perfectly acceptable and that introducing coupling simply to avoid code duplication is bad.

He further states that business monitoring (in production monitoring of the running application and infrastructure) is also favourable to acceptance tests.  Continual monitoring of the entire production system is the most useful set of metrics for the business, and that with metrics comes freedom.  You can improve the system only when you know what bits you can easily replace, and only when you know what bits actually need to be replaced for the right reasons (i.e. replacing one component due to low performance where the business monitoring has identified that this low-performance component is a genuine system bottleneck).   Specifically, business monitoring can provide great insights not just into the system’s performance but also the businesses performance and trends, too.  For example, monitoring can surface data such as spikes in usage.  From here we can implement alerts based upon known metrics – i.e. We know we get around X number of orders between 6pm and 10pm on a Friday night, if this number drops by Y%, then send an alert.

Sam talks about "EventStorming" (a phrase coined by Alberto Brandolini) with the business/domain experts.  He says he would get them all together in a room and talk about the various “events” and the various “commands” that exist within the business domain, but whilst avoiding any vocabulary that is technical. All language used is expressed within the context of the business domain. (i.e. Order comes in, Product is shipped etc).  He states that using Event Storming really helped to move the development of system forward and really helped to define both the correct boundaries of the domain contexts, and helped to define the functionality that each separate service within the system would provide.

Finally, Sam says the downside of moving to microservices are that it's a very time-consuming approach and can be very expensive (both in terms of financial cost and time cost) to define the system, the bounded contexts and the individual commands and events of the system.  Despite this, it’s a great approach and using it within his own work, Sam has found that it’s provided the developers within his company with a reliable, scalable and maintainable system, and most importantly it’s provided the business with a system that supports their business needs both now and into the future.


After Sam’s session was over, we all reconvened in the common room and communal hall for the final section of the day.  This was the prize-draw and final wrap-up.

The organisers first thanked the very generous sponsors of the event as without them the event simply would not have happened.  Moreover, we wouldn’t have been anywhere nearly as well fed as we were! 

There were a number of prize draws, and the first batch was a prize from each of the in-house sponsors who had been exhibiting at the event.  Prizes here ranged from a ticket to the next NDC conference to a Raspberry Pi kit.

After the in-house sponsors had given away their individual prizes, there was a “main” prize draw based upon winners drawn randomly from the event feedback that was provided about each session by each conference attendee.  Amongst the prizes were iPad mini’s, a Nexus 9 tablet, technical books, laser pens and a myriad of software licenses.  I sat as the winner’s names were read out, watching as each person as called and the iPad’s and Nexus 9 were claimed by the first few people who were drawn as a winner.  Eventually, my own name was read out!  I was very happy and went up to the desk to claim my prize.  Unfortunately, the iPad’s and Nexus 9 were already gone, but I managed to get myself a license for PostSharp Ultimate!IMG_20150504_143116

After this, the day’s event was over.  There was a customary geek dinner that was to take place at a local Tapas restaurant later in the evening, however, I had a long drive home from Bristol back to the North-West ahead of me so I was unable to attend the geek dinner after-event.

So, my first DDD South-West was over, and I must say it was an excellent event.  Very well run and organised by the organisers and the staff of the venue and of course, made possible by the fantastic sponsors.  I’d had a really great day and I can’t wait for next year’s DDDSW event!

DDD North 2014 In Review

Outside the Entrance This past Saturday, 18th October 2014, saw another DDD (Developer, Developer, Developer) event.  This one was the 4th annual DDD North event, this year held at the University Of Leeds.

Communal Area After arriving and signing in, I proceeded through the corridors to the communal area where we were all greeted with a cup of coffee (or tea) and a nice Danish pastry!  It’s always a nice surprise to get a nice cake with your morning coffee, so although I wasn’t really hungry as I’d recently eaten a large breakfast, I decided that a Danish Pastry covered in sweet, sweet icing was too much of a temptation to be able to refuse!Danish Pastries  After this delightful breakfast, I headed down the corridor for the first of the day’s sessions.

The first session of the day is Liam Westley’sAn Actor’s Life For Me” which talks about parallel processing with multiple threads using the Task Parallel Library and utilising the Actor Model.  Liam introduces the Actor model and states it was first described by Carl Hewitt as early as 1973.  The dilemma we have for parallel processing is due to shared state, causing us to lock around areas of memory where multiple threads may try to access that state.  The Actor model solves this by not having shared state within the system, instead having each process take stateless data that is not shared and outputting stateless data to the next process in the processing pipeline.  Liam uses an analogy of making a cup of tea and the steps involved in that whilst also getting an itch that needs scratching whilst making that cup of tea.  The itch (and thus the scratch) can happen during any of the tea-making steps, thus increasing the combinations of how alternating between making tea and scratching can grow exponentially.Liam Westley's Actor Pattern

Liam talks about how CPU’s have been multi-threaded and multi-core for many years now, first arriving around the same time as .NET v1.0, whilst in the same time frame, our developer tools haven’t really kept up.  .NET 1.0 pretty much gave us raw access to how windows handles threads using the TheadPool, which meant managing multiple threads and sharing state between them was very difficult.  .NET 2.0 gave us a SynchronizationContext, but multi-threaded programming was still very hard.  Eventually, we got the much simplified Async & Await keywords, but now we have the Task Parallel Library which provides us with the Actor pattern.  This basically allows us to write our code in individual “blocks” which are essentially black boxes sharing no state with any other block.  We can then chain these blocks together into a processing pipeline, giving us the ability to perform some computational process without sharing state.

Liam then shows us a demo of a console application which produces an MD5 hash for a number of large files in a folder.  The first  iteration of the demo shows this happening without using the Task Parallel Library (TPL) and so performs no parallel processing and simply processes each file, one at a time on a single thread, taking some time to complete.  The second iteration Liam shows us uses the TPL, but still only works in a single-threaded manner by wrapping the hash calculation function as a TPL ActionBlock.  This iteration does the same as the single-threaded version, as again, no parallel processing is occurring.  The final iteration runs in a multi threaded manner by simply setting the block configuration (ExecutionDataFlowBlockOptions) property of MaxDegreeOfParallelism.  What’s really amazing about these ActionBlocks is that they inherently and implicitly handle all input and output buffering and queuing by themselves. This means we can add many blocks into the processing pipeline at a faster rate than they can be executed, and the TPL will handle the queuing for us.

20141018_095624 Liam next talks about separating the processing and calculating of the file hashes by performing these in a TransformBlock rather than an ActionBlock, and only using ActionBlocks to print the hash value to the UI.  The output of the TransformBlock (the hash value and the filename) is passed to the ActionBlock in the processing pipeline.

Liam then introduces the BufferBlock.  This acts as a propagator between other blocks and a FIFO queue of data.  Liam talks about how, in our example, we can add a BufferBlock in front of all of the TransformBlocks which will effectively evenly distribute the “load” as we provide the files to be hashed between the TransformBlocks. 

Next, Liam shows how we can use the LinkTo method which allows us to filter the passing of blocks along the processing pipeline, as the LinkTo method allows us to pass a predicate to perform the filtering.  This could be used (for example) to hash files of different types by different TransformBlocks (i.e. an MP3 file is processed differently than an MP4 file etc.).  Liam also introduces the TransformManyBlock which takes an IEnumerable of things to process.  This means we no longer have to have our own loop through each of the files to be processed, instead, we can simply pass in the contents of the folder’s files as a complete IEnumerable collection.

Finally, Liam mentions both the BroadcastBlock and the BatchBlock.  The Broadcast block is effectively a pub/sub mechanism as used in Message Buses etc. which allows fanning-out of the messages and broadcasting to other blocks.  The BatchBlock allows batching of inputs before passing the messages along the processing pipeline.

All in all, Liam’s talk was very informative and shows just how far we’ve come in our ability to relatively easily and simply perform parallel processing in a multi-threaded manner, taking advantage of all of the cores available to us on a modern day machine.  Liam’s demo code has been made available on GitHub for those interested in learning more.


20141018_110411 The next talk is Ian Cooper’sNot Just Layers! – What can pipelines and events do for you?”, which is a talk about Data Flow Architectures, and specifically Pipelines and Events.  Ian first talks about general software architecture and how processes evolve from basic application of a skill through to adoption of genuine craftsmanship and best-practices.  Software Architecture has many styles, but a single style can be explained as a series of component and connectors.  Components are the individual parts of an architecture that does something and the connectors are how multiple components talk to each other.

Ian states that Data Flow architectures are more driven by behaviour rather than state, and says that functional languages (such as F#) are better suited to behaviourally modelled architecture, whereas object oriented (OO) languages like C# are better suited to solve state driven processes and architectures.

Ian uses the KWIC (Keyword in context) algorithm, which is how Unix indexes text in its man pages, as the reference for the session.

Ian talks about pipes and filters, and states that it’s a flow of data processing along a pipeline of specific stages.  A push pipeline “pushes” tasks along the pipeline, the pipeline usually consisting of a pump at the front, which pushes data into the pipeline, with a series of filters which are the processing tasks and with each preceding filter responsible for pushing the data to the succeeding filter in the pipeline.  There’s also usually a sink at the end that provides the final end result.  There’s also Pull pipelines, of which .NET’s LINQ is an example, which have each filter further along the pipeline doing the pulling of the data from the previous filter, rather than the previous filter pushing the data on.

20141018_113104 Ian mentions how pipes and filters architecture is similar to a batch sequence architecture (See below for the subtle difference between them).  He talks about how errors that may happen in a long-running sequence that need the entire processing stream to be undo are better suited to a batch sequence architecture than a pipes and filter architecture, due to the more disconnected nature of the pipes and filter architecture.

Ian talks about parallel execution and the potential pub/sub problem of consumers awaiting data and not knowing when the entire workload is completed.  If individual steps are either faster or slower than the preceding or succeeding steps in the chain, this can cause problems with either no data, or too much data to process.  The solution to this problem is to introduce a “buffer” in between steps within the chain.  Such things as Message Queues (i.e. MSMQ, RabbitMQ etc) or in-memory caching mechanisms (such as those provided by tools like Redis) can offer this.

20141018_113427 Ian then show us an in-memory demo of a program using the pipes and filters architecture.  Ian states that, ideally, filters in a pipeline shouldn’t really know about other filters, but its okay for them to be aware of an abstraction of a new filter that’s next in the pipeline, but not the concrete instance of that filter.  Ian uses the KWIC algorithm for the demo code.  Ian shows the same demo using the manual pipeline and filters, and also a LINQ implementation.  The LINQ example has its filters implemented as fluent method calls simply chained together (i.e. TextLines.Shift(x=>x).RemoveNoise(x => x).Sort() etc.).  Ian then show the same example as written in F#.  This shows the pipeline, using F#’s pipeline operator “|>” is even simpler to see from the code that implements it.

Ian shows us the demo code using a message queue (using MSMQ behind the scenes), this shows a pull based pipeline where each filter down the chain pulls messages from a message queue to which messages are posted by the preceding filter in the pipeline chain.  Ian also shows us the pipeline running in a parallel manner, using the Task Parallel Library.  Each filter has distinct Inputs and Outputs defined as BlockingCollection<T> allowing the data to flow in and out, but to be blocked on the individual thread if the next filter in the pipeline isn’t ready to receive that data.

Finally, Ian talks about Batch Sequences and how they differ slightly from a pipes and filters architecture.  He talks about how you did Batch Sequencing many years ago with magnetic tapes being passed from one reel-to-reel processing machine to the next!  The main difference between Batch Sequence and Pipes and Filters is that in a batch sequence, each filter has to complete the entire workload of data before passing everything as output to the next filter in the chain.  By contrast, pipes and filters will have its filter only process one small piece of work or one individual piece of data before passing it down the processing chain.  This means that true pipes and filters is much better suited to being parallelized than a batch sequence architecture.


20141018_125418_LLS The next session is Richard Tasker’sBDD and why you should be doing it”.  Richard starts by introducing BDD (Behaviour Driven Development) and where it originated.  It was first proposed by Dan North as a “solution” to some of the failings of TDD such as: Where do you start with TDD? What to test and what not to test? and How much to test in one go?

Richard starts by talking about his first exposures to understanding BDD.  This started with writing expressive names for standard unit tests.  This helps understand what the test is testing and thus, what the code is doing.  I.e. the expression of a behaviour of the code.  It’s from here that we can see how we can make the mental leap from testing and exercising small methods of of code, but a more user-centric behaviour of the overall application.

Richard shows a series of Database Entity Relationship diagrams as the first mechanism he used to design an application used to model car parts and their relation to vehicles.  This had to go through a number of iterations to fully realise the entities involved and their relationships to each other and it wasn’t the most effective way to achieve the overall design.  Using a series of User Stories which could be turned into BDD tests was the way forward.

Richard next introduced the MoSCoW method as the way in which he started writing his BDD tests.  Using this method combined with the new style of user story templates emphasises the behaviour and business function.  Instead of writing “As a <type of user> I want <some functionality> so that <some benefit>”, we instead write, “In order to <achieve some value>, as a <type of user>, I should have <some functionality>”.  The last part of the user story gets the relevant must/should/could/won’t wording in order to help achieve effective prioritization with the customer.

Cynefin_as_of_1st_June_2014 Richard then introduces SpecFlow as his BDD tool of choice.  He shows a simple demo of a single SpecFlow acceptance test, backed by a number of standard unit tests.  Richard says that you probably don’t want to do this for every individual tiny part of your application as this can lead to an abundance of unit tests and further lead to a test maintenance burden.  To help solve this, Richard talks about Decision Frameworks, of which a popular one is called “Cynefin”.   It defines states of Obvious, Chaotic, Complex and Complicated.  Each area of the application and discrete pieces of functionality can be assessed to see which of the four Cynefin states they may fall into.  From here, we can decide how many or how few BDD Acceptance tests are best utilised for that feature to deliver the best return on investment.  Richard says that Acceptance tests are often best used in Complicated & Complex states, but are often less useful in Obvious & Chaotic states.

Richard closes his session with “why” we should be doing BDD.  He talks about many of the benefits of adopting BDD and says that it is a great helper for teams that are new to TDD.  Richard says that BDD helps to reduce communication barriers between the developers and other technical professionals and the perhaps less technical business stakeholders and that BDD also helps with prioritizing which features should be implemented before others.  BDD also helps with naming things and defining the specific behaviours of our application in a more user-oriented way and also helps to define the meaning of “done”. 


20141018_131051_LLS After Richard’s talk, it was lunchtime.  Lunch was served in the same communal area where we’d all gathered earlier at breakfast time and consisted to a rather nice sandwich, a bag of crisps and a drink.  It was nice that all three ingredients could be chosen by each individual attendee from a selection available.

20141018_131444_LLS After enjoying this very nice lunch, I decided to skip the Grok talks (these are short, 10 minute talks that generally happen over lunchtime at the various DDD conferences) and get some fresh air outside.  That didn’t last too long, as I found the Pack Horse pub just down the road from the area of the university used for the conference.  This is a pub belonging to a small local microbrewery called The Burley Street Brewhouse.  I decided I had to go in and sneak a cheeky pint of bitter as a lunchtime treat.  It was indeed a lovely pint and afterwards, I headed back to the university and to the DDD North conference.  I went back in via an entrance close to the communal area still housing some conference attendees and realised that a number of sandwiches and crisps were still available for any attendee that wanted 2nd helpings!  I was still a bit peckish after my liquid refreshment (and knowing that I wouldn’t be eating until quite late in the evening at the after conference Geek Dinner) I decided to go for seconds!  After enjoying my second helpings, I headed off for the first session of the afternoon.


20141018_143120_LLS The first afternoon session is Andrew MacDonald’sCQRS & Event Sourcing”.  Andrew first talks about the how & why of starting development in a brand new project.  Andrew has his own development project, treevue.com, for which he decided to try out CQRS and event sourcing as they were two new interesting techniques that Andrew believed could help with the development of his software.  treevue.com is a web product which offers virtual data rooms.  Andrew talks about the benefits of CQRS & Event sourcing such as allowing a truly abstracted data storage model, providing domain driven design without noise and that separating reads and writes to the data model via CQRS could open up new possibilities for the software.  Andrew states that it’s not appropriate for everything and quotes Udi Dahan who said that most people who have used CQRS shouldn’t have done so!

CQRS is Command Query Responsibility Segregation and allows commands (processes that alter our data) to be separate from and entirely distinct from Queries (processes that only read our data but don’t change it).  The models behind each of these can be entirely different, even when referring to the same domain entities, so a data model for reading (for example) a Customer type can have a different design when reading than when writing.

Architectures Compared_thumbAndrew talks about the overall architecture of a system that employs CQRS vs. one that doesn’t.  Without CQRS, reads and writes flow through the same layers of our application.  With CQRS, we can have entirely different architectures for reading vs. writing.  Usually the writing architecture is similar to the entire non-CQRS architecture, flowing through many layers including data access, validation layers etc., but often the reading architecture uses a much flatter set of layers to read the data as concerns such as validation are generally not required in this context.  The two separate reading and writing stacks can often even connect to separate databases which provide “eventual consistency” with each other.  This also means reading and writing can scale independently of each other, and given that many apps read far more than write, this can be invaluable.

image19 Andrew then introduces Event Sourcing which, whilst separate and different from CQRS, does play well with it.  Andrew shows a typical relational model of a purchase order with multiple purchase order line item types related to it and a separate shipping info type attached.  This model only allows us to see the state of the order and its data as it stands right now.  Event sourcing shows the timeline of events against the purchase order as each alteration to the entity is stored separately in an event queue/database.  i.e. A line item is added with an (incorrect) quantity of 4.  But corrected with a later event deducting 2 from the line item, leaving a line item with a correct quantity of 2.  This provides us with the ability to not only see how the data looks “right now”, but to be able to create the entire state of the entity model at any given point in time.

Andrew then proceeds to talk about Azure’s role in his treevue app and how he’s utilised Azure’s Table Storage as a first class citizen.  He then shows us a quick demo and some code using EventProcessors and CommandProcessors which effectively implement the CQRS pattern. 

Finally, Andrew shows how he uses something called a “snapshot” when reading domain aggregates, which is effectively a caching layer used to improve performance around building the domain aggregate models from the various events that make up a specific state of the model as at a certain point in time.  This is particularly important when running applications in the cloud and using such technology as Azure Table Storage, as this will only serve back a maximum of 1000 rows per query before you, as the developer, have to make further requests for more data.  Andrew points out that the demo code is available on GitHub for those interested in diving deeper and learning more from his own implementation.


20141018_154117_LLS The final session for today is David Whitney’sLessons Learnt running a public API”.  David is a freelance consultant who has worked for many companies writing large public API’s.  The company used for reference during David’s talk is the work he did with Just Giving.  David states how the project to build the Just Giving API grew so large that the API eventually became the company’s biggest revenue stream.

David’s talk is a fast-paced set of tips, tricks and lessons that he has personally learned over the many years working with clients developing their large public-facing API’s.

David starts with stating that your API is your public facing contract to the world, and that it will live or die by the strength of it’s documentation.  If it’s bad, people will write bad implementations, and you can’t blame them when that happens.  Documentation for APIs can either be created first, which then drives the design of the API, or it can be performed the other way around, where you write the API first and document it afterwards.  Either approach is viable, so long as documentation does indeed exist and is sufficiently comprehensive to allow your consumers to build quality implementations of your API.  David says it’s often best to host the docs with the API itself so that if you hit the API endpoint with a web browser as a human user, you’ll serve up the API documentation.

David states that the DTO’s returned from API calls should provide “examples” of themselves.  This is a simple mechanism that lets users “discover” your API and helps them to understand just how they should use it.  Code such as this:

public interface IProvideAnExampleOf<TMyself>
    ExampleOf<TMyself>[] BuildExample();

public class ExampleOf<T>
    public string Description { get; set; }
    public T Example { get; set; }

    public ExampleOf(string description, T example)
        Description = description;
        Example = example;

will enable your API to provide examples of itself to your users.  David states that anything you can do to help your API consumers will greatly cut down the inevitable avalanche of help requests that will hit you.

Following on from individual examples, it’s good to have your API and it’s documentation provide “recipes” for how to use large sections of your API and how to call discrete service endpoints in a coherent chain in order to achieve a specific outcome.  Recipes help your users to “fall into the pit of success”.  Providing things like a complete web application, ideally written in multiple languages, that exercises various parts of your API is even better.

David next talks about versioning of your API, and says that it’s something you have to ensure you have a policy on from Day 1.  Retrofitting versioning is very hard and often leads to broken or awkward implementations.  Adding version numbers to the URI is perhaps the easiest to achieve, but it’s not really the best approach.  It’s far better to add the API version in the HTTP header.

He continues by talking about modifying existing API calls.  Don’t.  Just don’t do it at any cost!  If you really must, you can add additional data to the return values of your API endpoints, but you must never change or remove anything that’s already there.  You must also never rename anything.  If you need to do any of this, use a new version.  This leads into Content Types, and here David states that you’ll really need to provide all the different content types that people will realistically use.  Whilst many web developers today see JSON as the de-facto standard, many companies – especially large enterprises – are still using XML as their de-facto standard.  Your API is going to have to support both.  David also mentions that JSONP is another, growing, standard that you may well have to support, but be careful if you do as you’ll need to be mindful of possible errors caused by CORS (Cross Origin Resource Sharing) which is the ability of resources such as JavaScript to be able to be called from domains other than the one where the resource is hosted.

David talks about the importance of making Statistics for your API available and public.  You need to ensure you’re gathering performance and other statistics on every method call.  One possibility is returning some statistics back to the consumer directly in the HTTP response header after every request to your API, such as the server name that serviced the consumer’s request.  This is especially useful if you’ve got a large server farm and need help debugging service call issues.  Also you should ensure you publically expose your statistics in a dashboard via status updates, uptime pages and more.  For one, it’ll help you deflect any criticisms that your performance is broken, and it’ll provide consumers with confidence that your API is up, that it stays up and that you’re on top of maintaining this.  (Unless, of course, your performance really is broken in which case that same fancy dashboard will help you have visibility into diagnosing and correcting the issue!).  David next mentions the importance of a good staging server for user testing.  Don’t simply expose an internal “test” server that you may have cobbled together.  David relates first hand experience of just how difficult it can be getting users to stop using your “test” server after you’ve allow them access!

20141018_162628_LLS The next part of the session focuses on the overall approach to design of your API.  David stresses that it’s good to go back and read the original documentation on RESTful architecture, written by Roy Fielding as a doctoral dissertation back in the year 2000.  Further, it’s important to lean on existing conventions – always return canonical URI’s rather than relative ones and always supply ID’s and URI’s when returning data that refers to any domain or service entity.  As well as ensuring you follow existing standards, it’s also important to investigate new, emerging standards too.  Standards such as HAL (Hypertext Application Language) and JSON API can ensure that should such standards quickly become mainstream, you can adapt your API to support them.

David continues his session by talking about the cardinal sins of API design.  First thing you must never do is this:

    "PageType": 1,
    "SomeText": "This is some text"

What, exactly, is PageType 1?  We’re talking, of course, about magic numbers.  Don’t do it.  This forces your consumers to go off and look it up in the documentation, and whilst that documentation should definitely exist, there’s no reason why you can’t provide a more meaningful value to your consumer.  You have to think like a consumer at all times and try to imagine the applications they’re going to build using your API.  Also, don’t ever ask a user for data that your API itself can’t supply – i.e. Don’t ever request some specific identifier for a resource if you don’t provide that identifier when returning that resource in other requests.  Build your services RESTfully, don’t build XML-RPC with SOAP envelopes.  Be resource oriented, and always ensure you use the correct HTTP verbs for all of your services actions – especially understand the difference between POST & PUT.

Make sure you understand multi-tenancy and how that will impact the design and implementation of your API.  Good load balancers and proxies can balance based on request headers, so it’s really easy and useful to provide multi-tenancy in this manner.  Also ensure you use a good sandbox environment for testing and don’t forget to implement good rate limiting!   Users and consumers will make mistakes in their code and you don’t want them to take down your service when they do.

David talks about error handling and says you should validate everything you can when requests are made to your API.  Try to return errors in batches if possible, and always make sure that error messages are useful and readable.  Similar the magic numbers above, don’t return only an arcane error code to your consumers and force them to have to cross reference it from deep within your documentation.

20141018_163740_LLS David moves onto authentication for your API and states that this is an area that can get a bit painful.  Basic HTTP Auth will get you going, and can be sufficient if your API is (and will remain) fairly small scale, however, if your API is large or likely to grow to a larger scale – and especially if your API will be used by users via third-parties, you’ll quickly grow out of Basic Auth and need something more robust.  He says that OpenAuth is the best worst alternative.  It provides good security but can be painful to implement.  Fortunately, there are many third-party providers out there to whom you can outsource your authorisation concerns.

David then discusses providing support for your API to your users.  He says the best approach is to simply put it all out there in the public domain.  This provides transparency which is a good thing, but can also encourage a “self-service” model where people within the community will start to help provide answers and solutions to other community members.  Something as simple as a Google Group or a tag on Stack Overflow can get you started.

David closes his session by stating that, as your API grows over time, always ensure that you’re never attempting to serve only a single customer.  Keep your API clean and generic and it will remain useful to all consumers, rather than compromising that usefulness for just a minority of users.  And finally, if your API is or will become a first-class product for your business, just as the Just Giving API became for them, make sure you have a full product team within your business to deal with its day to day operation and its ongoing maintenance and development.  It’s all too easy to think that the API isn’t strictly a “product” due to its highly technical and slightly opaque nature, however, doing so would be a mistake.


20141018_173357_LLS After David’s session, we all congregated in the main lecture theatre for the wrap up presentation from Andy Westgarth, one of the conference organisers.  This involved thanking the very generous sponsors of the event as without them there simply wouldn’t be a DDD conference, and it also involved a prize giving session – the prizes consisting of books, T-shirts, some Visual Studio headphones and a main prize of a Surface Pro 3!

After the excellent day, I headed to the pub which was very conveniently located immediately across the road from the venue entrance.  I had a few hours to kill until the Geek Dinner which was to be held later that evening at Pizza Express in Leed’s Corn Exchange.  I enjoyed a couple of pints of Leeds Pale Ale before heading off to the Pizza Express venue for my dinner.

20141018_224309_LLS The Geek Dinner was attended by approximately 40 people and a fantastic time was had by all.  I was sat close one of the day’s earlier speakers, Andrew MacDonald, and we had a good old chin wag about past projects, work, and life as a software developer in general.

Overall, the DDD North 2014 event and the Geek Dinner afterwards was a fantastic success, and a great time was had by all.  Andy promised that there’d be another one in 2015, which will be held back up in the North-East of England due to the alternating location of DDD North, so here’s looking forward to another wonderful DDD North conference in 2015.