DDD North 2017 In Review

On Saturday, 14th October 2017, the 7th annual DDD North event took place, this time at the University of Bradford.

One nice element of the DDD North conferences (as opposed to the various other DDD conferences around the UK) is that I'm able to drive to the conference on the morning of the event and drive home again after the event has finished.  This time, the journey was merely 1 hour 20 minutes by car, so I didn't have to get up too early in order to make the journey.  On the Saturday morning, after having a quick cup of coffee and some toast at home, I headed off towards Bradford for the DDD North event.

After arriving at the venue and parking my car in one of the ample free car parks available, I headed to the Richmond Building reception area where the conference attendees were gathering.  After registering my attendance and collecting my conference badge, I headed into the main foyer area to grab some coffee and breakfast.  The catering has always been particularly good at the DDD North conferences and this time round was no exception.  Being a vegetarian nowadays, I can no longer avail myself of a sausage or bacon roll, both of which were available, however on this occasion there were also veggie sausage breakfast rolls available too.  A very nice and thoughtful touch!  And delicious, too!

After some lovely breakfast and a couple of cups of coffee, it was soon time to head off to the first session of the day.  This one was to be Colin Mackay's User Story Mapping For Beginners.

Colin informs us that his talk will be very hands-on, and so he hands out sticky notes and markers to some of the session attendees but, unfortunately, runs out before being able to supply everyone.

Colin tells us about story mapping and shares a quote from Martin Fowler:

"Story mapping is a technique that provides the big picture that a pile of stories so often misses"

Story mapping is essentially a way of arranging our user stories, written out on sticky notes, into meaningful "groups" of stories, tasks, and sections of application or business functionality.  Colin tells us that it's a very helpful technique for driving out a "ubiquitous language" and shares an example of how he was able to understand a sales person's usage of the phrase "closing off a customer" to mean closing a sale, rather than assuming it to mean that the customer no longer had a relationship with the supplier.  Colin also states that a document cannot always tell you the whole story.  He shares a picture from his own wedding which was taken in a back alley behind the wedding venue.  He says how the picture appears unusual for a wedding photo, but the photo doesn't explain that there'd been a fire alarm in the building and all the wedding guests had to gather in the back alley at this time and so they decided to take a photo of the event!  He also tells us how User Story Mapping is great for sparking conversations and helps to improve prioritisation of software development tasks.

Colin then gets the attendees that have the sticky notes and the markers to actually write out some user story tasks based upon a person's morning routine.  He states that this is an exercise from the book User Story Mapping by Jeff Patton.  Everyone is given around 5 minutes to do this and afterwards, Colin collects the sticky notes and starts to stick them onto a whiteboard.  Whilst he's doing this, he tells us that there are three levels of tasks within User Story Mapping.  First, there's "Sea" level.  These are the user goals and each task within is atomic - i.e. you can't stop in the middle of it and do something else.  Next is Summary level, which is often represented by a cloud or a kite; this level shows greater context and is made up of many different user goals.  Finally, we have the Sub-functions, represented by a fish or a clam.  These are the individual tasks that go to make up a user goal.  So an example might have a user goal (sea level) of "Take a Shower" and the individual tasks could be "Turn on shower", "Set temperature", "Get in shower", "Wash body", "Shampoo hair" etc.

After an initial arrangement of sticky notes, we have our initial User Story Map for a morning routine.  Colin then says we can start to look for alternatives.  The body of the map is filled with notes representing details and individual granular tasks; there are also variations and exceptions here, and we'll need to re-organise the map as new details are discovered so that the complete map makes sense.  In a software system, the map becomes the "narrative flow" and is not necessarily in a strict order as some tasks can run in parallel.  Colin suggests using additional stickers or symbols that can be added to the sticky notes to represent which teams will work on which parts of the map.

Colin says it's good to anthropomorphise the back-end systems within the overall software architecture as this helps with conversations and allows non-technical people to better understand how the component parts of the system work together.  So, instead of saying that the web server will communicate with the database server, we could say that Fred will communicate with Bob or that Luke communicates with Leia.  Giving the systems names greatly helps.

We now start to look at the map's "backbone".  These are the high level groups that many individual tasks will fit into.  So, for our morning routine map, we can group tasks such as "Turn off alarm" and "Get out of bed" as a grouping called "Waking up".  We also talk about scope creep.  Colin tells us that, traditionally, more sticky notes being added to a board even once the software has started to be built is usually referred to as scope creep, however, when using techniques such as User Story Mapping, it often just means that your understanding of the overall system that's required is getting better and more refined.

Once we've built our initial User Story Map, it's easy to move individual tasks within a group of tasks in a goal below a horizontal line which we can draw across the whiteboard.  These tasks can then represent a good minimum viable product: we simply move those tasks in a group that we deem to be more valuable, and thus required for the MVP, whilst leaving the "nice to have" tasks in the group on the other side of the line.  In doing this, it's perfectly acceptable to replace a task with a simpler task as a temporary measure, which would then be removed and replaced with the original "proper" task for work beyond MVP.  After deciding upon our MVP tasks, we can simply rinse and repeat the process, taking individual tasks from within groups and allocating them to the next version of the product whilst leaving the less valuable tasks for a future iteration.

Colin says how this process results in producing something called "now maps" as they represent what we have, or where we're at, currently, whereas what we'll often produce are "later maps" - the maps that represent some aspect of where we want to be in the future.  Now maps are usually produced when you're first trying to understand the existing business processes that will be modelled into software.  From here, you can produce later maps showing the iterations of the software as it will be produced and delivered in the future.  Colin also mentions that we should always be questioning all of the elements of our maps, asking questions such as "Why does X happen?", "What are the pain points around this process?", "What's good about the process?" and "What would make this process better?".  It's by continually asking such questions, refining the actual tasks on the map, and continually reorganising the map that we can ultimately create great software that really adds business value.

Finally, Colin shares some additional resources where we can learn more about User Story Mapping and related processes in general.  He mentions the User Story Mapping book by Jeff Patton along with The Goal by Eli Goldratt, The Phoenix Project by Gene Kim, Kevin Behr and George Spafford and finally, Rolling Rocks Downhill by Clarke Ching.

After Colin's session is over, it's time for a quick coffee break before the next session.  The individual rooms are a little distance away from the main foyer area where the coffee is served, and I realised my next session was in the same room I was already sat in!  Therefore, I decided I'd simply stay in my seat and await the next session.  This one was to be David Whitney's How Stuff Works...In C# - Metaprogramming 101.

David's talk is going to be all about how some of the fundamental frameworks that we use as .NET developers every day work, and how they're full of "metaprogramming".  Throughout his talk, he's going to decompose an MVC (Model View Controller) framework, a unit testing framework and an IoC (Inversion of Control) container framework to show how they work and, specifically, to examine how they operate on the code that we write that uses and consumes these frameworks.

To start, David explains what "metaprogramming" is.  He shares the Wikipedia definition which, in typical Wikipedia fashion, is somewhat obtuse.  However, the first statement does sum it up:

"Metaprogramming is a programming technique in which computer programs have the ability to treat programs as their data."

This simply means that metaprograms are programs that operate on other source code; metaprogramming is essentially about writing code that looks at, inspects and works with your own software's source code.

David says that in C#, metaprogramming is mostly done by using classes within the System.Reflection namespace and making heavy use of things such as the Type class therein, which allows us to get all kinds of information about the types, methods and variables that we're going to be working with.  David shows a first trivial example of a metaprogram, which enumerates the list of types in an assembly by using a call to the Assembly.GetTypes() method:

public class BasicReflector
{
	public Type[] ListAllTypesFromSamples()
	{
		return GetType().Assembly.GetTypes();
	}

	public MethodInfo[] ListMethodsOn<T>()
	{
		return typeof(T).GetMethods();
	}
}

He asks why you'd want to do this?  Well, it's because many of the frameworks we use (MVC, Unit Testing etc.) are essentially based on this ability to perform introspection on the code that you write in order to use them.  We often make extensive use of the Type class in our code, even when we're not necessarily aware that we're doing meta-programming, but the Type class is just one part of a rich "meta-model" for performing reflection and introspection over code.  A meta-model is essentially a "model of your model".  The majority of types within the System.Reflection namespace that provide this meta-model have names ending in "Info", so types such as TypeInfo, MethodInfo, MemberInfo and ConstructorInfo can all be used to give us highly detailed information and data about our code.
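
As a quick illustration of that meta-model (my own example, not one of David's), the sketch below uses ConstructorInfo and ParameterInfo to describe the constructors of any given type:

using System;
using System.Linq;

public static class MetaModelDemo
{
    // Enumerate the constructors of a type and describe the parameters each one expects.
    public static void DescribeConstructors(Type type)
    {
        foreach (var ctor in type.GetConstructors())      // ConstructorInfo
        {
            var parameters = ctor.GetParameters()         // ParameterInfo[]
                .Select(p => $"{p.ParameterType.Name} {p.Name}");
            Console.WriteLine($"{type.Name}({string.Join(", ", parameters)})");
        }
    }
}

Calling DescribeConstructors(typeof(Uri)), for example, prints a line for each public constructor of the Uri class.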

As an example, a unit testing framework at its core is actually trivially simple.  It essentially just finds code and runs it.  It examines your code for classes decorated with a specific attribute (i.e. [TestFixture]) and invokes methods that are decorated with a specific attribute (i.e. [Test]).  David says that one of his favourite coding katas is to write a basic unit testing framework in less than an hour as this is a very good exercise for "Meta Programming 101".

We look at some code for a very simple Unit Testing Framework, and there's really not a lot to it.  Of course, real world unit testing frameworks contain many more "bells-and-whistles", but the basic code shown below performs the core functionality of a simple test runner:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

namespace ConsoleApp1
{
    public class Program
    {
        public static void Main(string[] args)
        {
            var testFinder = new TestFinder(args);
            var testExecutor = new TestExecutor();
            var testReporter = new TestReporter();
            var allTests = testFinder.FindTests();
            foreach (var test in allTests)
            {
                TestResult result = testExecutor.ExecuteSafely(test);
                testReporter.Report(result);
            }
        }
    }

    public class TestFinder
    {
        private readonly Assembly _testDll;

        public TestFinder(string[] args)
        {
            var assemblyname = AssemblyName.GetAssemblyName(args[0]);
            _testDll = AppDomain.CurrentDomain.Load(assemblyname);
        }

        public List<MethodInfo> FindTests()
        {
            var fixtures = _testDll.GetTypes()
                .Where(x => x.GetCustomAttributes()
                    .Any(c => c.GetType()
                        .Name.StartsWith("TestFixture"))).ToList();
            var allMethods = fixtures.SelectMany(f => 
                f.GetMethods(BindingFlags.Public | BindingFlags.Instance));
            return allMethods.Where(x => x.GetCustomAttributes()
                .Any(m => m.GetType().Name.StartsWith("Test")))
                .ToList();
        }
    }

    public class TestExecutor
    {
        public TestResult ExecuteSafely(MethodInfo test)
        {
            try
            {
                var instance = Activator.CreateInstance(test.DeclaringType);
                test.Invoke(instance, null);
                return TestResult.Pass(test);
            }
            catch (Exception ex)
            {
                return TestResult.Fail(test, ex);
            }
        }
    }

    public class TestReporter
    {
        public void Report(TestResult result)
        {
            Console.Write(result.Exception == null ? "." : "x");
        }
    }

    public class TestResult
    {
        private Exception _exception = null;
        public Exception Exception { get => _exception;
            set => _exception = value;
        }

        public static TestResult Pass(MethodInfo test)
        {
            return new TestResult { Exception = null };
        }

        public static TestResult Fail(MethodInfo test, Exception ex)
        {
            return new TestResult { Exception = ex };
        }
    }
}

David then talks about the ASP.NET MVC framework.  He says that it is a framework that, in essence, just finds and runs user code, which sounds oddly similar to a unit testing framework!  Sure, there's additional functionality within the framework, but at a basic level, the framework simply accepts an HTTP request, finds the user code for the requested URL/Route and runs that code (this is the controller action method).  Part of running that code might be the invoking of a ViewEngine (i.e. Razor) to render some HTML which is sent back to the client at the end of the action method.  Therefore, MVC is merely meta-programming which is bound to HTTP.  This is a lot like an ASP.NET HttpHandler and, in fact, the very first version of ASP.NET MVC was little more than one of these.

David asks if we know why MVC was so successful.  It was successful because of Rails.  And why was Rails successful?  Well, because it had sensible defaults.  This approach is the foundation of the often used "convention over configuration" paradigm.  This allows users of the framework to easily "fall into the pit of success" rather than the "pit of failure" and therefore makes learning and working with the framework a pleasurable experience.  David shows some more code here, which is his super simple MVC framework.  Again, it's largely based on using reflection to find and invoke appropriate user code, and is really not at all dissimilar to the Unit Testing code we looked at earlier.  We have a ProcessRequest method:

public void ProcessRequest(HttpContext context)
{
	// Find the controller type and action method for this request, then invoke them via reflection.
	var controller = PickController(context);
	var method = PickMethod(context, controller);
	var instance = Activator.CreateInstance(controller);
	var response = method.Invoke(instance, null);
	context.Response.Write(response);
}

This is the method that orchestrates the entire HTTP request/response cycle of MVC.  And the other methods called by the ProcessRequest method use the reflective meta-programming and are very similar to what we've already seen.  Here's the PickController method, which we can see tries to find types whose names both start with a value from the route/URL and also end with "Controller".  We can also see that we use a sensible default of "HomeController" when a suitable controller can't be found:

private Type PickController(HttpContext context)
{
	// Take the first segment of the URL and look for a type named "<segment>Controller",
	// falling back to the HomeController as a sensible default.
	var segment = context.Request.Url.PathAndQuery.TrimStart('/').Split('/')[0];
	var types = AppDomain.CurrentDomain.GetAssemblies()
					.SelectMany(x => x.GetTypes()).ToList();
	var controller = types.FirstOrDefault(x => segment.Length > 0
					&& x.Name.EndsWith("Controller")
					&& x.Name.StartsWith(segment, StringComparison.OrdinalIgnoreCase))
					?? types.Single(x => x.Name.StartsWith("HomeController"));
	return controller;
}

Next, we move on to the deconstruction of an IoC Container Framework.  An IoC container framework is again a simple framework that works due to meta-programming and reflection.  At their core, IoC containers simply store a dictionary of mappings of interfaces to types, and they expose a method to register this mapping, as well as a method to create an instance of a type based on a given interface.  This creation is simply a recursive call ensuring that all objects down the object hierarchy are constructed by the IoC container using the same logic to find each object's dependencies (if any).  David shows us his own IoC container framework on one of his slides and it's only around 70 lines of code.  It almost fits on a single screen.  Of course, this is a very basic container and doesn't have all the real-world required features such as object lifetime management and scoping, but it does work and performs the basic functionality.  I haven't shown David's code here as it's very similar to the other meta-programming code we've already looked at, but there are a number of examples of simple IoC containers out there on the internet, some written in only 15 lines of code!
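
To give a flavour of the idea, though, here's a minimal sketch of my own (not David's code, and with no lifetime management or error handling): the core really is just a dictionary plus a recursive Resolve method:

using System;
using System.Collections.Generic;
using System.Linq;

public class TinyContainer
{
    // The heart of an IoC container: a map of abstractions to concrete types.
    private readonly Dictionary<Type, Type> _registrations = new Dictionary<Type, Type>();

    public void Register<TInterface, TImplementation>() where TImplementation : TInterface
    {
        _registrations[typeof(TInterface)] = typeof(TImplementation);
    }

    public TInterface Resolve<TInterface>() => (TInterface)Resolve(typeof(TInterface));

    private object Resolve(Type type)
    {
        // Use the registered mapping if there is one, otherwise assume the type is concrete.
        var concrete = _registrations.TryGetValue(type, out var mapped) ? mapped : type;

        // Recursively resolve each constructor dependency of the chosen type.
        var ctor = concrete.GetConstructors()
            .OrderByDescending(c => c.GetParameters().Length)
            .First();
        var args = ctor.GetParameters()
            .Select(p => Resolve(p.ParameterType))
            .ToArray();

        return Activator.CreateInstance(concrete, args);
    }
}

Usage is simply container.Register<IFoo, Foo>(); followed by container.Resolve<IFoo>(), with any constructor dependencies of Foo being resolved in exactly the same recursive way.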

After the demos, David talks about how we can actually use the reflection and meta-programming we've seen demonstrated in our own code as we're unlikely to re-write our MVC, Unit Testing or IoC frameworks.  Well, there's a number of ways in which such introspective code can be used.  One example is based upon some functionality for sending emails, a common enough requirement for many applications.  We look at some all too frequently found code that has to branch based upon the type of email that we're going to be sending:

public string SendEmails(string emailType)
{
	var emailMerger = new EmailMerger();
	if (emailType == "Nightly")
	{
		var nightlyExtract = new NightlyEmail();
		var templatePath = "\\Templates\\NightlyTemplate.html";
		return emailMerger.Merge(templatePath, nightlyExtract);
	}
	if (emailType == "Daily")
	{
		var dailyExtract = new DailyEmail();
		var templatePath = "\\Templates\\DailyTemplate.html";
		return emailMerger.Merge(templatePath, dailyExtract);
	}
	throw new NotImplementedException();
}

We can see we're branching conditionally based upon a string that represents the type of email we'll be processing, either a daily email or a nightly one.  However, by using reflective meta-programming, we can change the above code to something much more sophisticated:

public string SendEmails(string emailType)
{
	var strategies = new Email[] {new NightlyEmail(), new DailyEmail()};
	var selected = strategies.First(x => x.GetType().Name.StartsWith(emailType));
	var templatePath = "\\Templates\\" + selected.GetType().Name + ".html";
	return new EmailMerger().Merge(templatePath, selected);
}

Another way of using meta-programming within our own code is to perform automatic registrations for our DI/IoC Containers.  We often have hundreds or thousands of lines of manual registration, such as container.Register<IFoo, Foo>(); and we can simplify this by simply enumerating over all of the interfaces within our assemblies, looking for classes that implement each interface (and that possibly share the same name, minus the prefix) and automatically registering the interface and type with the IoC container.  Of course, care must be taken here as such an approach may actually hide intent and is somewhat less explicit.  In this regard, David says that with the great power available to us via meta-programming comes great responsibility, so we should take care to only use it to "make the obvious thing work, not make the right thing totally un-obvious".
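
As an illustration of that convention-scanning idea (my own sketch; I've left out the actual container registration call since the API differs between containers), the IFoo-to-Foo mappings can be discovered by reflecting over the loaded assemblies:

using System;
using System.Collections.Generic;
using System.Linq;

public static class ConventionRegistration
{
    // Discover IFoo -> Foo pairs by convention across all loaded assemblies.
    // The resulting dictionary can then be fed to whichever container is in use.
    public static Dictionary<Type, Type> DiscoverRegistrations()
    {
        var concreteTypes = AppDomain.CurrentDomain.GetAssemblies()
            .SelectMany(a => a.GetTypes())
            .Where(t => t.IsClass && !t.IsAbstract)
            .ToList();

        var registrations = new Dictionary<Type, Type>();
        foreach (var type in concreteTypes)
        {
            // Convention: class Foo implements an interface named IFoo.
            var matchingInterface = type.GetInterfaces()
                .FirstOrDefault(i => i.Name == "I" + type.Name);

            if (matchingInterface != null && !registrations.ContainsKey(matchingInterface))
            {
                registrations[matchingInterface] = type;
            }
        }

        return registrations;
    }
}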

Finally, perhaps one of the best uses of meta-programming in this way is to help protect code quality.  We can do this by using meta-programming within our unit tests to enforce some convention in our code that we care about.  One great example of this is to ensure that all classes within a given namespace have a specific suffix to their name.  Here's a very simple unit test that ensures that all classes in a Factories namespace have the word "Factory" at the end of the class name:

[Test]
public void MakeSureFactoriesHaveTheRightNamingConventions()
{
	var types = AppDomain.CurrentDomain
		.GetAssemblies()
		.SelectMany(a => a.GetTypes())
		.Where(x => x.Namespace == "MyApp.Factories");

	foreach (var type in types)
	{
		Assert.That(type.Name.EndsWith("Factory"));
	}
}

After David's session was over it was time for another quick coffee break.  As I had to change rooms this time, I decided to head back to the main foyer and grab a quick cup of coffee before immediately heading off to find the room for my next session.  This session was James Murphy's A Gentle Introduction To Elm.

James starts by introducing the Elm language.  Elm calls itself a "delightful language for reliable web apps".  It's a purely functional language that transpiles to JavaScript and is a domain-specific language designed for developing web applications.  Being a purely functional language allows Elm to make a very bold claim.  No run-time exceptions!

James asks "Why use Elm?".  Well, for one thing, it's not JavaScript!  It's also functional, giving it all of the benefits of other functional languages such as immutability and pure functions with no side effects.  Also, as it's a domain-specific language, it's quite small and is therefore relatively easy to pick up and learn.  As it boasts no run-time exceptions, this means that if your Elm code compiles, it'll run and run correctly.

James talks about the Elm architecture and the basic pattern of implementation, which is Model-Update-View.  The Model is the state of your application and its data.  The Update is the mechanism by which the state is updated, and the View is how the state is represented as HTML.  It's this pattern that provides reliability and simplicity to Elm programs.  It's a popular, modern approach to front-end architecture, and the Redux JavaScript framework was directly inspired by the Elm architecture.  A number of companies are already using Elm in production, such as Pivotal, NoRedInk, Prezi and many others.

Here's a simple example Elm file showing the structure using the Model-Update-View pattern.  The pattern should be understandable even if you don't know the Elm syntax:

import Html exposing (Html, button, div, text)
import Html.Events exposing (onClick)

main =
  Html.beginnerProgram { model = 0, view = view, update = update }

type Msg = Increment | Decrement

update msg model =
  case msg of
    Increment ->
      model + 1

    Decrement ->
      model - 1

view model =
  div []
    [ button [ onClick Decrement ] [ text "-" ]
    , div [] [ text (toString model) ]
    , button [ onClick Increment ] [ text "+" ]
    ]

Note that the Elm code is generating the HTML that will be rendered by the browser.  This is very similar to the React framework and how it also performs the rendering for the actual page's markup.  This provides for a strongly-typed code representation of the HTML/web page, thus allowing far greater control and reasoning around the ultimate web page's markup.

You can get started with Elm by visiting the project's home page at elm-lang.org.  Elm can be installed either directly from the website, or via the Node Package Manager (NPM).  After installation, you'll have elm-repl - a REPL for Elm, elm-make - the Elm compiler, elm-package - the Elm package manager and elm-reactor - the Elm development web server.  One interesting thing to note is that Elm has strong opinions about cleanliness and maintainability, so with that in mind, Elm enforces semantic versioning on all of its packages!

James shows us some sample Elm statements in the Elm REPL.  We see we can use all the standard and expected language elements: numbers, strings, defining functions etc.  We can also use partial application, pipe-lining and lists/maps, which are common constructs within functional languages.  We then look at the code for a very simple "Hello World" web page, using the Model-Update-View pattern that Elm programs follow.  James is using Visual Studio Code as his code editor here, and he informs us that there's mature tooling available to support Elm within Visual Studio Code.

We expand the "Hello World" page to allow user input via a textbox on the page, printing "Hello" and then the user's input.  Due to the continuous Model-Update-View loop, the resulting page is updated with every key press in the textbox, and this is controlled by the client-side JavaScript that has been transpiled from the Elm functions.  James shows this code running through the Elm Reactor development web server.  On very nice feature of Elm Reactor is that is contains built-in "time-travel" debugging, meaning that we can enumerate through each and every "event" that happens within our webpage. In this case, we can see the events that populate the "Hello <user>" text character-by-character.  Of course, it's possible to only update the Hello display text when the user has finished entering their text and presses the Enter key in the textbox, however, since this involves maintaining state, we have to perform some interesting work in our Elm code to achieve it.

James shows us how Elm can respond to events from the outside world.  He writes a simple function that will respond to system tick events to show an ever-updating current time display on the web page.  James shows how we can work with remote data by defining specific types (unions) that represent the data we'll be consuming; these types are then added to the Elm model that forms the state/data for the web page.  One important thing to note here is that we need to be able to represent not only the data but also the absence of any data, with specific types that represent the lack of data.  This is, of course, due to Elm being a purely functional language that does not support the concept of null.

The crux of Elm's processing is taking some input (in the form of a model and a message), performing the processing and responding with both a model and a message.  Each Elm file has an "init" section that deals with the input data.  The message that is included in that data can be a function, and could be something that would access a remote endpoint to gather data from a remote source.  This newly acquired data can then be processed in the "Update" section of the processing loop, ultimately for returning as part of the View's model/message output.  James demonstrates this by showing us a very simple API that he's written implementing a simple To-Do list.  The API endpoint exposes a JSON response containing a list of to-do items.  We then see how this API endpoint can be called from the Elm code by using a custom defined message that queries the API endpoint and pulls in the various to-do items, processes them and writes that data into the Elm output model which is ultimately nicely rendered on the web page.

Elm contains a number of rich packages out-of-the-box, such as an HTTP module.  This allows us to perform HTTP requests and responses using most of the available HTTP verbs with ease:

import Json.Decode (list, string)

items : Task Error (List String)
items =
    get (list string) "http://example.com/to-do-items.json"

Or:

corsPost : Request
corsPost =
    { verb = "POST"
    , headers =
        [ ("Origin", "http://elm-lang.org")
        , ("Access-Control-Request-Method", "POST")
        , ("Access-Control-Request-Headers", "X-Custom-Header")
        ]
    , url = "http://example.com/hats"
    , body = empty
    }

It's important to note, however, that not all HTTP verbs are available out-of-the-box and some verbs, such as PATCH, will need to be manually implemented.

James wraps up his session by talking about the further eco-system around the Elm language.  He mentions that Elm has its own testing framework, ElmTest, and that you can very easily achieve a very high amount of code coverage when testing in Elm due to it being a purely functional language.  Also, adoption of Elm doesn't have to be an all-or-nothing proposition.  Since Elm transpiles to JavaScript, it can play very well with existing JavaScript applications.  This means that Elm can be adopted in a piecemeal fashion, with only small sections of a larger JavaScript application being replaced by their Elm equivalent, perhaps to ensure high code coverage or to benefit from improved robustness and reduced possibility of errors.

Finally, James talks about how to deploy Elm applications when using Elm in a real-world production setting.  Most often, Elm deployment is performed using WebPack, a JavaScript module bundler.  This often takes the form of shipping a single small HTML file containing the necessary script inclusions for it to bootstrap the main application.

After James' session was over, it was time for lunch.  All the attendees made their way back to the main foyer area where a delicious lunch of a selection of sandwiches, fruit, crisps and chocolate was available to us.  As is customary at the various DDD events, there were to be a number of grok talks taking place over the lunch period.  As I'd missed the grok talks at the last few DDD events I'd attended, I decided that I'd make sure I caught a few of the talks this time around.

I missed the first few talks as the queue for lunch was quite long and it took a little while to get all attendees served, however, after consuming my lunch in the sunny outdoors, I headed back inside to the large lecture theatre where the grok talks were being held.  I walked in just to catch the last minute of Phil Pursglove's talk on Azure's CosmosDB, which is Microsoft's globally distributed, multi-model database.  Unfortunately, I didn't catch much more than that, so you'll have to follow the link to find out more. (Update:  Phil has kindly provided a link to a video recording of his talk!)

The next grok talk was Robin Minto's OWASP ZAP FTW talk.  Robin introduces us to OWASP, which is the Open Web Application Security Project and exists to help create a safer, more secure web.  Robin then mentions ZAP, which is a security testing tool produced by OWASP.  ZAP is the Zed Attack Proxy and is a vulnerability scanner and intercepting proxy to help detect vulnerabilities in your web application.  Robin shows us a demo application he's built containing deliberate flaws, Bob's Discount Diamonds.  This is running on his local machine.  He then shows us a demo of the OWASP ZAP tool and how it can intercept all of the requests and responses made between the web browser and the web server, analysing those responses for vulnerabilities and weaknesses.  Finally, Robin shows us that the OWASP ZAP software contains a handy "fuzzer" capability which allows it to replay requests using lists of known data or random data - i.e. can replay sending login requests with different usernames/passwords etc.

The next grok talk was an introduction to the GDPR by John Price.  John introduces the GDPR, the new EU-wide General Data Protection Regulation, which effectively replaces the older Data Protection Act in the UK.  GDPR, in a nutshell, means that users of data (mostly companies who collect a person's data) need to ask permission from the data owner (the person to whom that data belongs) for the data and for what purpose they'll use that data.  Data users have to be able to prove that they have a right to use the data that they've collected.  John tells us that adherence to the GDPR in the UK is not affected by Brexit as it's already enshrined in UK law and has been since April 2016, although it's not really been enforced up to this point.  It will start to be strictly enforced from May 2018 onwards.  We're told that, unlike the previous Data Protection Act, violations of the regulations carry very heavy penalties, of up to 20 million Euros or 4% of a company's turnover.  There will be some exceptions to the regulations, such as police and military, but also exceptions for private companies too, such as a mobile phone network provider giving up a person's data due to an "immediate threat to life".  Some consent can be implied, so for example, entering your car's registration number into a web site for the purposes of getting an insurance quote is implied permission to use the registration number that you've provided, but the restriction is that the data can only be used for the specific purpose for which it was supplied.  GDPR will force companies to declare if data is sent to third parties.  If this happens, the company initially taking the data and each and every third-party that receives that data have to inform the owner of the data that they are in possession of it.  GDPR is regulated by the Information Commissioner's Office in the UK.  Finally, John says that the GDPR may make certain businesses redundant.  He gives the example of credit reference agencies.  Their whole business model is built on non-consensual usage of data, so it will be interesting to see how GDPR affects industries like these.

After John's talk, there was a final grok talk; however, I needed a quick restroom break before the main sessions of the afternoon, so I headed off before making my way back to the room for the first of the afternoon's sessions.  This was Matt Ellis's How To Parse A File.

Matt starts his session by stating that his talk is all about parsing files, but he immediately says, "But, don't do it!"  He tells us that it's a solved problem and we really shouldn't be writing code to parse files by hand ourselves; we should just use one of the many excellent libraries out there instead.  Matt does discuss why you might decide you really need to parse files yourself.  Perhaps you need better speed and efficiency, or maybe it's to reduce dependencies or to parse highly specific custom formats.  It could even be parsing of things that aren't even files, such as HTTP headers, standard output etc.  From here, Matt mentions that he works for JetBrains and that the notion of simply parsing a file is a good segue into talking about some of the features that can be found inside many of JetBrains' products.

Matt starts by looking at the architecture of many of JetBrains' IDEs and developer tools such as ReSharper.  They're built with a similar architecture and they all rely on a layer that they call the PSI layer.  The PSI layer is responsible for parsing, lexing and understanding the user code that the tool is working on.  Matt says that he's going to use the Unity framework to show some examples throughout his session and that he's going to attempt to build up a syntax tree for his Unity code.  We first look at a hand-rolled parser; this one is attempting to understand the code by examining one character at a time.  It's a very laborious approach and prone to error, so this is an approach to parsing that we shouldn't use.  Matt tells us that the best approach, which has been "solved" many times in the past, is to employ the services of a lexer.  This is a processor that turns the raw code into meaningful tokens based upon the words and vocabulary of the underlying code or language and gives structure to those tokens.  It's from the output of the lexer that we can more easily and robustly perform the parsing.  Lexers are another solved problem, and many lexers already exist for popular programming languages, such as lex, CsLex, FsLex, flex, JFlex and many more.

Lexers generate source code, but it's not human-readable code.  It's similar to how .NET language code (C# or VB.NET) is first compiled to Intermediate Language prior to being JIT'ed at runtime.  The code output from the lexer is read by the parser and, from there, the parser can try to understand the grammar and structure of the underlying code via syntactical analysis.  This often involves the use of Regular Expressions in order to match specific tokens or sets of tokens.  This works particularly well as Regular Expressions can be translated into a state machine and, from there, translated into a transition table.  Parsers understand the underlying code that they're designed to work on, so for example, a parser for C# would know that in a class declaration there would be the class name, which would be preceded by a token indicating the scope of the class (public, private etc).  Parsing is not a completely solved problem.  It's more subjective, so although solutions exist, they're more disparate and specific to the code or language that they're used for and therefore, they're not a generic solution.
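
As a toy illustration of that regular-expression-to-state-machine idea (my own example, not from Matt's talk), here's the identifier pattern [A-Za-z][A-Za-z0-9]* expressed as a tiny two-state transition function:

using System.Linq;

public static class IdentifierDfa
{
    // States: 0 = start, 1 = inside an identifier, -1 = reject.
    private static int Next(int state, char c)
    {
        if (state == 0) return char.IsLetter(c) ? 1 : -1;
        if (state == 1) return char.IsLetterOrDigit(c) ? 1 : -1;
        return -1;
    }

    // The input matches the pattern if, after consuming every character, we end in state 1.
    public static bool IsIdentifier(string input) =>
        input.Length > 0 && input.Aggregate(0, Next) == 1;
}

Generated lexers work on the same principle, just with far bigger transition tables built from the whole token grammar.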

Matt tells us how parsing can be done either top-down or bottom-up.  Top-down parsing starts at the highest-level construct of the language, for example at a namespace or class level in C#, and it then works its way down to the lower-level constructs from there - through methods and the code and locally scoped variables in those methods.  Bottom-up parsing works the opposite way around, starting with the lower-level constructs of the language and working back up to the class or namespace.  Bottom-up parsers can be beneficial over top-down ones as they have the ability to utilise shift-reduce algorithms to simplify code as it's being parsed.  Parsers can even be "parser combinators".  These are parsers built from other, simpler, parsers, where the input to the next parser in the chain is the output from the previous parser in the chain, more formally known as recursive-descent parsing.  .NET's LINQ acts in a similar way to this.  Matt tells us about FParsec, a parser combinator library for F#, along with a C# parser combinator library called Sprache, which itself relies heavily on LINQ:

Parser<string> identifier =
    from leading in Parse.WhiteSpace.Many()
    from first in Parse.Letter.Once()
    from rest in Parse.LetterOrDigit.Many()
    from trailing in Parse.WhiteSpace.Many()
    select new string(first.Concat(rest).ToArray());

var id = identifier.Parse(" abc123  ");

Assert.AreEqual("abc123", id);

Matt continues by asking us to consider how parsers will deal with whitespace in a language.  This is not always as easy as it sounds as some languages, such as F# or Python, use whitespace to give semantic meaning to their code, whilst other languages such as C# use whitespace purely for aesthetic purposes.  In dealing with whitespace, we often make use of a filtering lexer.  This is a simple lexer that specifically detects and removes whitespace prior to parsing.  The difficulty then is that, for languages where whitespace is significant, we need to replace the removed whitespace after parsing.  This, again, can be tricky as the parsing may alter the actual code (i.e. in the case of a refactoring) so we must again be able to understand the grammar of the language in order to re-insert whitespace into the correct places.  This is often accomplished by building something known as a Concrete Parse Tree, as opposed to the more usual Abstract Syntax Tree.  Concrete Parse Trees work in a similar way to a C# Expression Tree, breaking down code into a hierarchical graph of individual code elements.
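
A rough sketch of the filtering lexer idea (my own illustration, not JetBrains' PSI code) might look something like this: whitespace tokens are hidden from the parser but remembered, along with their positions, so they can be re-attached afterwards:

using System.Collections.Generic;

public enum TokenKind { Whitespace, Identifier, Symbol, Number }

public class Token
{
    public Token(TokenKind kind, string text, int position)
    {
        Kind = kind;
        Text = text;
        Position = position;
    }

    public TokenKind Kind { get; }
    public string Text { get; }
    public int Position { get; }
}

public class FilteringLexer
{
    private readonly IEnumerable<Token> _rawTokens;

    // Whitespace tokens are kept to one side (with their original positions) so a
    // rewriting tool can re-insert them once the parser has done its work.
    public List<Token> SkippedWhitespace { get; } = new List<Token>();

    public FilteringLexer(IEnumerable<Token> rawTokens)
    {
        _rawTokens = rawTokens;
    }

    // The parser only ever sees the significant tokens.
    public IEnumerable<Token> SignificantTokens()
    {
        foreach (var token in _rawTokens)
        {
            if (token.Kind == TokenKind.Whitespace)
            {
                SkippedWhitespace.Add(token);
                continue;
            }

            yield return token;
        }
    }
}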

Matt tells us about other uses for lexers, such as the ability to determine specific declarations in the language.  For example, in F#, typing 2. would represent a floating point number, whereas typing [2..0] would represent a range.  When the user is only halfway through typing, how can we know if they require a floating point number or a range?  There are also such things as comments within comments, for example: /* This /* is */ valid */  This is something that lexers can be good at, as such matching is difficult to impossible with regular expressions.

The programs that use lexers and parsers can often have very different requirements, too.  Compilers using them will generally want to compile the code, and so they'll work on the basis that the program code that they're lexing/parsing is assumed correct, whilst IDE's will take the exact opposite approach.  After all, most of the time whilst we're typing, our code is in an invalid state.  For those programs that assume the code is in an invalid state most of the time, they often use techniques such as error detection and recovery.  This is, for example, to prevent your entire C# class from being highlighted as invalid within the IDE just because the closing brace character is missing from the class declaration.  They perform error detection on the missing closing brace, but halt highlighting of the error at the first "valid" block of code immediately after the matching opening brace.  This is how the Visual Studio IDE is able to only highlight the missing closing brace as invalid and not the entire file full of otherwise valid code.  In order for this to be performant, lexers in such programs will make heavy use of caching to prevent having to continually lex the entire file with every keystroke.

Finally, Matt talks about how JetBrains often need to also deal with "composable languages".  These are things like ASP.NET MVC's Razor files, which are predominantly comprised of HTML mark-up, but which can also contain "islands" of C# code.  For this, we take a similar approach to dealing with whitespace in that the file is lexed for both languages, HTML and C# and the HTML is temporarily removed whilst the C# code is parsed and possibly altered.  The lexed tokens from both the C# and the preserved HTML are then re-combined after the parsing to re-create the file.

After Matt's session, there was one final break before the last session of the day.  Since there was, unfortunately, no coffee served at this final break, I made my way directly to the room for my next and final session, Joe Stead's .NET Core In The Real World.

Joe starts his talk by announcing that there's no code or demos in his talk, and that his talk will really just be about his own personal experience of attempting to migrate a legacy application to .NET Core.  He says that he contemplated doing a simple "Hello World" style demo for getting started with .NET Core, but that it would give a false sense of .NET Core being simple.  In the real-world, and when migrating an older application, it's a bit more complicated than that.

Joe mentions the .NET Standard and reminds us that it's a different thing than .NET Core.  .NET Core does adhere to the .NET Standard and Joe tells us that .NET Standard is really just akin to Portable Class Libraries Version 2.0.

Joe introduces the project that he currently works on at his place of employment.  It's a system that started life in 2002 and was originally built with a combination of Windows Forms applications and ASP.NET Web Forms web pages, sprinkled with Microsoft AJAX JavaScript.  The system was in need of being upgraded in terms of the technologies used, and so in 2012, they migrated to KnockoutJS for the front-end websites, and in 2013, to further aid with the transition to KnockoutJS, they adopted the NancyFX framework to handle the web requests.  Improvements in the system continued and by 2014 they had started to support the Mono Framework and had moved from Microsoft SQL Server to a PostgreSQL database.  This last lot of technology adoptions was to support their growing demand for a Linux version of their application from their user base.  The adoptions didn't come without issues, however, and by late 2014, they started to experience serious segfaults in their application.  After some diagnosis, during which they never did fully get to the bottom of the root cause of the segfaults, they decided to adopt Docker in 2015 as a means of mitigating the segfault problem.  If one container started to display problems associated with segfaults, they could kill the container instance and create a new one.  By this point, they were in 2015 and decided that they'd start to look into .NET Core.  It was only in beta at this time, but they were looking for a better platform than Mono that might provide some much needed stability and consistency across operating systems.  And since they were on such a roll with changing their technology stacks, they decided to move to Angular2 on the front-end, replacing KnockoutJS in 2016 as well!

By 2017, they'd adopted .NET Core v1.1 along with RabbitMQ and Kubernetes.  Joe states that the reason for .NET Core adoption was to move away from Mono.  By this point, they were not only targeting Mono, but a custom build of Mono that they'd had to fork in order to try to fix their segfault issues.  They needed much more flexible deployments such as the ability to package and deploy multiple versions of their application using multiple different versions of the underlying .NET platform on the same machine.  This was problematic in Mono, as it can be in the "full" .NET Framework, but one of the benefits of .NET Core is the ability to package the run-time with your application, allowing true side-by-side versions of the run-time to exist for different applications on the same machine.

Joe talks about some of the issues encountered when adopting and migrating to .NET Core.  The first issue was missing APIs.  .NET Core 1.0 and 1.1 were built against .NET Standard 1.x and so many APIs and namespaces were completely missing.  Joe also found that many NuGet packages that his solution was dependent upon were not yet ported across to .NET Core.  Joe recalls that testing of the .NET Core version of the solution was a particular challenge as few other people had adopted the platform and the general response from Microsoft themselves was that "it's coming in version 2.0!".  What really helped save the day for Joe and his team was that .NET Core itself and many of the NuGet packages were open source.  This allowed them to fork many of the projects that the NuGet packages were derived from and help with transitioning them to support .NET Core.  Joe's company even employed a third party to work full time on helping to port NancyFX to .NET Core.

Joe now talks about the tooling around .NET Core in the early days of the project.  We examine how Microsoft introduced a whole new project file structure, moving away from the XML representation in .csproj files and moving to a JSON representation with project.json.  Joe explains how they had to move their build script and build tooling to the FAKE build tool as a result of the introduction of project.json.  There were also legal issues around using the .NET Core debugger assemblies in tools other than Microsoft's own IDEs, something that the JetBrains Rider IDE struggled with.  We then look at tooling in the modern world of .NET Core: project.json has gone away and Microsoft has reverted back to .csproj files, although they're now much more simplified and improved.  This allows the use of MSBuild again, however, FAKE itself now has native support for .NET Core.  The dotnet CLI tool has improved greatly and the legal issues around the use of the .NET Core debugging assemblies have been resolved, allowing third-party IDEs such as JetBrains Rider to use them again.

Joe also mentions how .NET Core now, with the introduction of version 2.0, is much better than the Mono Framework when it comes to targeting multiple run-times.  He also mentions issues that plagued their use of libcurl on the Mac platform when using .NET Core 1.x, but these have now been resolved in .NET Core 2.0 as it now uses the native macOS implementation rather than trying to abstract that and use its own implementation.

Joe moves on to discuss something that's not really specific to .NET Core, but is a concern when developing code to be run on multiple platforms.  He shows us the following two lines of code:

TimeZoneInfo.FindSystemTimeZoneById("Eastern Standard Time");
TimeZoneInfo.FindSystemTimeZoneById("America/New_York");

He asks which is the "correct" one to use.   Well, it turns out that they're both correct.  And possibly incorrect!   The top line works on Windows, and only on Windows, whilst the bottom line works on Linux, and not on Windows.  It's therefore incredibly important to understand such differences when targeting multiple platforms with your application.  Joe also says how, as a result of discrepancies such as the timezone issue, the tooling can often lie.  He recalls a debugging session where one debugging window would show the value of a variable with one particular date time value, and another debugging window - in the exact same debug session - would interpret and display the same variable with an entirely different date time value.  Luckily, most of these issues are now largely resolved with the stability that's come from recent versions of .NET Core and the tooling around it.
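
One pragmatic way of coping with the difference (my own sketch, not something from Joe's talk) is to try one identifier and fall back to the other:

using System;

public static class TimeZones
{
    // Try the Windows identifier first, then fall back to the IANA identifier
    // used on Linux and macOS, so the same code runs on both platforms.
    public static TimeZoneInfo FindEasternTime()
    {
        try
        {
            return TimeZoneInfo.FindSystemTimeZoneById("Eastern Standard Time"); // Windows
        }
        catch (TimeZoneNotFoundException)
        {
            return TimeZoneInfo.FindSystemTimeZoneById("America/New_York");      // Linux/macOS
        }
    }
}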

In wrapping up, Joe says that, despite the issues they encountered, moving to .NET Core was the right thing for him and his company.  He does say, though, that for other organisations, such a migration may not be the right decision.  Each company, and each application, needs to be evaluated for migration to .NET Core on its own merits.  For Joe's company, the move to .NET Core allowed them to focus attention elsewhere after migration.  They've since been able to adopt Kubernetes.  They've been able to improve and refactor code to implement better testing and many more long overdue improvements.  In the last month, they've migrated again from .NET Core 1.1 to .NET Core 2.0, which was a relatively easy task after the initial .NET Core migration; this one only involved the upgrading of a few NuGet packages and that was it.  The move to .NET Core 2.0 also allowed them to re-instate lots of code and functionality that had been temporarily removed, thanks to the new, vastly increased API surface area of .NET Core 2.0 (really, .NET Standard 2.0).

After Joe's session, it was time for all the attendees to gather in the main foyer area of the university building for the final wrap-up and prize draws.  After thanking the sponsors, the venue, and the organisers and volunteers, without whom of course, events such as DDD simply wouldn't be able to take place, we moved on to the prize draw.  Unfortunately, I wasn't a winner, however, the day had been brilliant.

Another wonderful DDD event had been and gone, but a great day was had by all.  We were told that the next DDD event was to be DDD Dublin, held sometime around March 2018.  So there's always that to look forward to.

DDD East Anglia 2017 In Review

This past Saturday, 16th September 2017, the fourth DDD East Anglia event took place in Cambridge.  DDD East Anglia is a relatively new addition to the DDD event line-up, but its fourth event sees it going from strength to strength.

I'd made the long journey to Cambridge on the Friday evening and stayed in a local B&B to be able to get to the Hills Road College where DDD East Anglia was being held on the Saturday.  I arrived bright and early, registered at the main check-in desk and then proceeded to the college's recital room just around the corner from the main building for breakfast refreshments.

After some coffee, it was soon time to head back to the main college building and up the stairs to the room where the first session of the day would commence.  My first session was to be Joseph Woodward’s Building A Better Web API With CQRS.

Joseph starts his session by defining CQRS.  It's an acronym, standing for Command Query Responsibility Segregation.  Fundamentally, it's a pattern for splitting the "read" models from your "write" models within a piece of software.  Joseph points out that we should beware when googling for CQRS as Google seems to think it's a term relating to cars!

CQRS was first coined by Greg Young and it's very closely related to a prior pattern called CQS (Command Query Separation), originally coined by Bertrand Meyer, which states that every method should either be a command which performs an action, or a query which returns data to the caller, but never both.  CQS primarily deals with such separations at a very micro level, whilst CQRS primarily deals with the separations at a broader level, usually along the seams of bounded contexts.  Commands will mutate state and will often be of a "fire and forget" nature.  They will usually return void from the method call.  Queries will return state and, since they don't mutate any state, are idempotent and safe.  We learn that CQRS is not an architectural pattern, but is more of a programming style that simply adheres to the separation of commands and queries.

Joseph continues by asking what’s the problem with some of our existing code that CQRS attempts to address.   We look at a typical IXService (where X is some domain entity in a typical business application):

public interface ICustomerService
{
     void MakeCustomerPreferred(int customerId);
     Customer GetCustomer(int customerId);
     CustomerSet GetCustomersWithName(string name);
     CustomerSet GetPreferredCustomers();
     void ChangeCustomerLocale(int customerId, Locale newLocale);
     void CreateCustomer(Customer customer);
     void EditCustomerDetails(CustomerDetails customerDetails);
}

The problem here is that the interface ends up growing and growing and our service methods are simply an arbitrary collection of queries, commands, actions and other functions that happen to relate to a Customer object.  At this point, Joseph shares a rather insightful quote from a developer called Rob Pike who stated:

“The bigger the interface, the weaker the abstraction”

And so with this in mind, it makes sense to split our interface into something a bit smaller.  Using CQRS, we can split out and group all of our "read" methods, which are our CQRS queries, and split out and group our "write" methods (i.e. Create/Update etc.) which are our CQRS commands.  This will simply become two interfaces in the place of one, an ICustomerReadService and an ICustomerWriteService.
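
Sketched out using the same method signatures as the interface above (Customer, CustomerSet, Locale and CustomerDetails are the same placeholder types as before), the split might look something like this:

public interface ICustomerReadService
{
     Customer GetCustomer(int customerId);
     CustomerSet GetCustomersWithName(string name);
     CustomerSet GetPreferredCustomers();
}

public interface ICustomerWriteService
{
     void MakeCustomerPreferred(int customerId);
     void ChangeCustomerLocale(int customerId, Locale newLocale);
     void CreateCustomer(Customer customer);
     void EditCustomerDetails(CustomerDetails customerDetails);
}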

There are good reasons for separating our concerns along the lines of reads vs writes, too.  Quite often, since reads are idempotent, we'll utilise heavy caching to prevent us from making excessive calls to the database and ensure our application can return data in as efficient a manner as possible, whilst our write methods will always hit the database directly.  This leads on to the ability to have entirely different back-end architectures between our reads and our writes throughout the entire application.  For example, we can scale multiple read replica databases independently of the database that is the target for writes.  They could even be entirely different database platforms.

From the perspective of Web API, Joseph tells us how HTTP verbs and CQRS play very nicely together.  The HTTP verb GET is simply one of our read methods, whilst the verbs PUT, POST, DELETE etc. are all of our write concerns.  Further to this, Joseph looks at how we can often end up with MVC or WebAPI controllers that require services to be injected into them, and how our controller action methods often end up becoming bloated from having additional concerns, such as validation, embedded within them.  We then look at the command dispatcher pattern as a way of supporting our separation of reads and writes and also as a way of keeping our controller action methods lightweight.

There are two popular frameworks that implement the command dispatcher pattern in the .NET world: MediatR and Brighter.  Both frameworks allow us to define our commands using a plain old C# object (that implements specific interfaces provided by the framework) and also to define a "handler" to which the commands are dispatched for processing.  For example:

public class CreateUserCommand : IRequest
{
     public string EmailAddress { get; set; }
     // Other properties...
}

public class CreateUserCommandHandler : IAsyncRequestHandler<CreateUserCommand>
{
     private readonly IUserRepository _userRepository;
     private readonly IMapper _mapper;

     public CreateUserCommandHandler(IUserRepository userRepository, IMapper mapper)
     {
          _userRepository = userRepository;
          _mapper = mapper;
     }

     // Map the incoming command to an entity and persist it.
     public async Task Handle(CreateUserCommand command)
     {
          var user = _mapper.Map<CreateUserCommand, UserEntity>(command);
          await _userRepository.CreateUser(user);
     }
}

Using the above style of defining commands and handlers along with some rudimentary configuration of the framework to allow specific commands and handlers to be connected, we can move almost all of the required logic for reading and writing out of our controllers and into independent, self-contained classes that perform a single specific action.  This enables further decoupling of the domain and business logic from the controller methods, ensuring the controller action methods remain incredibly lightweight:

public class UserController : Controller
{
     private readonly IMediator _mediator;

     public UserController(IMediator mediator)
     {
          _mediator = mediator;
     }

     [HttpPost]
     public async Task Create(CreateUserCommand user)
     {
          await _mediator.Send(user);
     }
}

Above, we can see that the Create action method has been reduced down to a single line.  All of the logic of creating the entity is contained inside the handler class and all of the required input for creating the entity is contained inside the command class.
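
The read side of the application works in exactly the same way.  As a rough sketch of a CQRS query dispatched via MediatR (the names here are purely illustrative, and the exact handler interfaces vary between versions of the library):

public class GetUserQuery : IRequest<UserDetails>
{
     public int UserId { get; set; }
}

public class GetUserQueryHandler : IAsyncRequestHandler<GetUserQuery, UserDetails>
{
     private readonly IUserRepository _userRepository;

     public GetUserQueryHandler(IUserRepository userRepository)
     {
          _userRepository = userRepository;
     }

     // Simply fetch and return the requested data; no state is mutated.
     public Task<UserDetails> Handle(GetUserQuery query)
     {
          return _userRepository.GetUserDetails(query.UserId);
     }
}

The corresponding controller action just sends the query through the mediator and returns the result, keeping the read side of the controller every bit as lightweight as the write side:

[HttpGet]
public async Task<UserDetails> Get(int id)
{
     return await _mediator.Send(new GetUserQuery { UserId = id });
}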

Both the MediatR and Brighter libraries allow for request pre-processors and post-processors.  This allows defining another class, again deriving from specific interfaces/base classes within the framework, which will be invoked before the actual handler class or immediately afterwards.  Such pre-processing is often a perfect place to put cross-cutting concerns such as validation:

public class CreateUserCommandValidation : AbstractValidator<CreateUserCommand>
{
     public CreateUserCommandValidation()
     {
          RuleFor(x => x.EmailAddress).NotEmpty().WithMessage("Please specify an email address");
     }
}

The above code shows some very simple example validation, using the FluentValidation library, that can be hooked into the command dispatcher framework's request pre-processing so that the command object is validated before the handler is invoked and the entity is saved to the database.

Again, we've got a very nice and clean separation of concerns with this approach, with each specific part of the process (the input parameters, the validation and the actual creation logic) being encapsulated within its own class.

Both MediatR and Brighter also support pipeline behaviours (in MediatR's case via its IPipelineBehavior interface), which allow us to write code that hooks into arbitrary places along the processing pipeline.  This allows us to implement other cross-cutting concerns, such as logging, that are often required at multiple stages of the entire processing pipeline.
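
A rough sketch of what such a behaviour might look like with MediatR (the exact Handle signature differs between versions of the library, so treat this as illustrative rather than definitive):

public class LoggingBehaviour<TRequest, TResponse> : IPipelineBehavior<TRequest, TResponse>
{
     private readonly ILogger<LoggingBehaviour<TRequest, TResponse>> _logger;

     public LoggingBehaviour(ILogger<LoggingBehaviour<TRequest, TResponse>> logger)
     {
          _logger = logger;
     }

     public async Task<TResponse> Handle(TRequest request, CancellationToken cancellationToken, RequestHandlerDelegate<TResponse> next)
     {
          // Log before and after the rest of the pipeline (and ultimately the handler) runs.
          _logger.LogInformation("Handling {RequestType}", typeof(TRequest).Name);
          var response = await next();
          _logger.LogInformation("Handled {RequestType}", typeof(TRequest).Name);
          return response;
     }
}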

At this point, Joseph shares another quote with us.  This one's from Uncle Bob:

"If your architecture is based on frameworks then it cannot be based on your use cases"

From here, Joseph turns his talk to discuss how we might structure our codebases in terms of files and folders such that the separation of concerns within the business domain that the software is addressing is more clearly realised.  He talks about a relatively new style of laying out our projects called Feature Folders (aka Feature Slices).

This involves laying out our solutions so that instead of having a single top-level "Controllers" folder, as is common in almost all ASP.NET MVC web applications, we instead have multiple folders named such that they represent features or specific areas of functionality within our software.  We then have the requisite Controllers, Views and other folders underneath those.   This allows different areas of the software to be conceptually decoupled and kept separate from the other areas.  Whilst this is possible in ASP.NET MVC today, it's even easier with the newer ASP.NET Core framework, and a NuGet package called AddFeatureFolders already exists that enables this exact setup within ASP.NET Core.
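
A rough illustration of the kind of layout this produces (the feature names here are purely for illustration):

/Features
     /Customers
          CustomersController.cs
          Index.cshtml
          CreateCustomerCommand.cs
     /Orders
          OrdersController.cs
          Index.cshtml
          CreateOrderCommand.cs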

Joseph wraps up his talk by suggesting that we take a look at some of his own code on GitHub for the DDD South West website (Joseph is one of the organisers for the DDD South West events) as this has been written using the CQRS pattern along with feature folders for its layout.

IMG_20170916_102558After Joseph's talk it's time for a quick coffee break, so we head back to the recital room around the corner from the main building for some liquid refreshment.  This time also accompanied by some very welcome biscuits!

After our coffee break, it's time to head back to the main building for the next session.  This one was to be Bart Read's Client-Side Performance For Back-End Developers.

IMG_20170916_103032Bart's session is all about how to maximise performance of client-side script using tools and techniques that we might employ when attempting to troubleshoot and improve the performance of our back-end, server-side code.  Bart starts by stating that he's not a client-side developer, but is more of a full stack developer.  That said, as a full stack developer, one is expected to perform at least some client-side development work from time to time.  Bart continues that in other talks concerning client-side performance, the speakers tend to focus on page load times and the page lifecycle, which, whilst interesting and important, is a more technology-centric way of looking at the problem.  Instead, Bart says that he wants to focus on RAIL, which was originally coined by Google.  This is an acronym for Response, Animation, Idle and Load and is a far more user-centric way of looking at the performance (or perhaps even just perceived performance) issue.  In order to explore this topic, Bart states that he learnt JavaScript and built his own arcade game site, Arcade.ly, which uses extensive JavaScript and other resources as part of the site.

We first look at Response.  For this we need to build a very snappy User Interface so that the user feels that the application is responding to them immediately.  Modern web applications are far more like desktop applications written using either WinForms or WPF than ever and users are very used to these desktop applications being incredibly responsive, even if lots of processing is happening in the background.  One way to get around this is to use "fake" pages.  These are pages that load very fast, usually without lots of dynamic data on them, that are shown to the user whilst asynchronous JavaScript loads the "real" page in the background.  Once fully loaded, the real page can be gracefully shown to the user.

Next, we look at Animation.  Bart reminds us that animations help to improve the user's perception of the responsiveness of your user interface.  Even if your interface is performing some processing that takes a few milliseconds to run, loading and displaying an animation that the user can view whilst that processing is going on will greatly enhance the perceived performance of the complete application.  We need to ensure that our animations always run at 60 fps (frames per second); anything less than this will cause them to look jerky, which is not a good user experience.  Quite often, we need to perform some computation prior to invoking our animation and in this scenario we should ensure that the computation is ideally performed in less than 10 milliseconds.

Bart shares a helpful online utility called CanvasMark which provides benchmarking for HTML5 Canvas rendering.  This can be really useful for testing how the animations and graphics on your site perform on different platforms and browsers.

Bart then talks about using the Google Chrome Task Manager to monitor the memory usage of your page.  A lot of memory can be consumed by your page's JavaScript and other large resources.  Bart talks about his own arcade.ly site which uses 676MB of memory.  This might be acceptable on a modern day desktop machine, but it will not work so well on a more constrained mobile device.  He states that after some analysis of the memory consumption, most of the memory was consumed by raw audio that had been decompressed from the compressed audio files loaded to provide sound effects for the game.  By gracefully degrading the quality and size of the audio used by the site based upon the platform or device that is rendering the site, performance was vastly improved.

Another common pitfall is in how we write our JavaScript functions.  If we're going to be creating many instances of a JavaScript object, as can happen in a game with many individual items on the screen, we shouldn't attach functions directly to the JavaScript object as this creates many copies of the same function.  Instead, we should attach the function to the object prototype, creating a single copy of the function, which is then shared by all instances of the object and thus saving a lot of memory.  Bart also warns us to be careful of closures on our JavaScript objects as we may end up capturing far more than we actually need.

We now move onto Idle.  This is all about deferring work, as the main concern for our UI is to respond to the user immediately.  One approach to this is to use Web Workers to perform work at a later time.  In Bart's case, he says that he wrote his own Task Executor which creates an array of tasks and uses the built-in JavaScript setTimeout function to slowly execute each of the queued tasks.  By staggering the execution of the queued tasks, we prevent the potential for the browser to "hang" with a white screen whilst background processing is being performed, as can often happen if excessive tasks and processing are performed all at once.

Finally, we look at Load.  A key take away of the talk is to always use HTTP/2 if possible.  Just by switching this on alone, Bart says you'll see a 20-30% improvement in performance for free.  In order to achieve this, HTTP/2 provides us with request multiplexing, which bundles requests together meaning that the browser can send multiple requests to the server in one go.  These requests won't necessarily respond any quicker, but we do save on the latency overhead we would incur if sending each request separately.  HTTP/2 also provides server push functionality, stream priority and header compression.  It also has protocol encryption, which whilst not an official part of the HTTP/2 specification, is currently mandated by all browsers that support the HTTP/2 protocol, effectively making encryption compulsory.  HTTP/2 is widely supported across all modern browsers on virtually all platforms, with Opera Mini being the only browser without full support, and HTTP/2 is also fully supported within most of today's programming frameworks.  For example, the .NET Framework has supported HTTP/2 since version 4.6.0.  One other significant change when using HTTP/2 is that we no longer need to "bundle" our CSS and JavaScript resources.  This also applies to "spriting" of icons as a single large image.

Bart moves on to talk about loading our CSS resources and he suggests that one very effective approach is to inline the bare minimum CSS we would require to display and render our "above the fold" content with the rest of the CSS being loaded asynchronously.  The same applies to our JavaScript files, however, there can be an important caveat to this.  Bart explains how he loads some of his JavaScript synchronously, which itself negatively impacts performance, however, this is required to ensure that the asynchronously loaded 3rd-party JavaScript - over which you have no control - does not interfere with your own JavaScript as the 3rd-party JavaScript is loaded at the very last moment whilst Bart's own JavaScript is loaded right up front.  We should look into using DNS Prefetch to force the browser to perform DNS Lookups ahead of time for all of the domains that our site might reference for 3rd party resources.  This incurs a small one-off performance cost as the page first loads, but makes subsequent requests for 3rd party content much quicker.

Bart warns us not to get too carried away putting things in the HEAD section of our pages and instead we should focus on getting the "above the fold" content to be as small as possible, ideally it should be all under 15kb, which is the size of data that can fit in a single HTTP packet.  Again, this is a performance optimization that may not have noticeable impact on desktop browsers, but can make a huge difference on mobile devices, especially if they're using a slow connection.  We should always check the payload size of our sites and ensure that we're being as efficient as possible and not sending more data than is required.  Further to this, we should use content compression if our web server supports it.  IIS has supported content compression for a long time now, however, we should be aware of a bug that affects IIS version 8 and possibly version 9 which turns off compression for chunked content. This bug was fixed in IIS version 10.

If we're using libraries or frameworks in our page, ensure we only deliver the required parts.  Many of today's libraries are componentized, allowing the developer to only include the parts of the library/framework that they actually need and use.  Use Content Delivery Networks if you're serving your site to many different geographic areas, but also be aware that, if your audience is almost exclusively located in a single geographic region, using a CDN could actually slow things down.  In this case, it's better to simply serve up your site directly from a server located within that region.

Finally, Bart reiterates that it's all about latency.  It's latency that slows you down significantly and any performance optimizations that can be done to remove or reduce latency will improve the performance, or perceived performance, of your websites.

IMG_20170916_102542After Bart's talk, it's time for another coffee break.  We head back to the recital room for further coffee and biscuits and after a short while, it's time for the 3rd session of the day and the last one prior to lunch.  This session is to be a Visual Note Taking Workshop delivered by Ian Johnson.

As Ian's session was an actual workshop, I didn't take too many notes but instead attempted to take my notes visually using the technique of Sketch-Noting that Ian was describing.

Ian first states that Sketch-Noting is still mostly about writing words.  He says that most of us, as developers using keyboards all day long, have pretty terrible handwriting so we simply need to practice it more.  Ian suggests avoiding all-caps words and cursive writing, using a simple font and camel-cased lettering (although all caps is fine for titles and headings).  Start bigger to get the practice of forming each letter correctly, then write smaller and smaller as you get better at it.  You'll need this valuable skill since Sketch-Noting requires you to be able to write both very quickly and legibly.

At this point, I put my laptop away and stopped taking written notes in my text editor and tried to actually sketch-note the rest of Ian's talk, which gave us many more pointers and advice on how to construct our Sketch Notes.  I don't consider myself artistic in the slightest, but Ian insists that Sketch Notes don't really rely on artistic skill, but more on the skill of being able to capture the relevant information from a fast-moving talk.  I didn't have proper pens for my Sketch Note and had to rely solely on my biro, but here in all its glory is my very first attempt at a Sketch Note:

IMG_20170916_125616

IMG_20170916_130218After Ian's talk was over, it was time for lunch.  All the attendees reconvened in the recital room where we could help ourselves to the lunch kindly provided by the conference organizers and paid for by the sponsors.  Lunch was the usual brown bag affair consisting of a sandwich, some crisps, a chocolate bar, a piece of fruit and a can of drink.  I took my lunch bag and proceeded to wander just outside the recital room to a small green area with some tables.  It was at this point that the weather decided to turn against us and it started raining very heavily.  I made a hasty retreat back inside the recital room where it was warm and dry and proceeded to eat my lunch there.

There were some grok talks taking place over the lunch time but, considering the weather and the fact that the grok talks were taking place in the theatre room, which was the furthest point from the recital room, I decided against attending them and chose to remain warm and dry instead.

After lunch, it was time to head back to the main building for the next session, this one was to be Nathan Gloyn's Microservices - What I've Learned After A Year Building Systems.

IMG_20170916_135912Nathan first introduces himself and states that he's a contract developer.  As such, he's been involved in two different projects over the previous 12 months that have been developed using a microservices architecture.  We're first asked to consider the question of why we should use microservices at all.  In Nathan's experience so far, he says, Don't!  In qualifying that statement, Nathan states that microservices are ideal if you need only part of a system to scale, however, for the majority of applications, the benefits of adopting a microservices architecture don't outweigh the additional complexity that is introduced.

Nathan states that building a system composed of microservices requires a different way of thinking.  With more monolithic applications, we usually scale them by scaling out - i.e. we use the same monolithic codebase for the website and simply deploy it to more machines which all target the same back-end database.  Microservices don't really work like this, and need to be individually small and simple.  They may even have their own individual database just for the individual service.

Microservices are often closely related to Domain-driven Design's Bounded Contexts so it's important to correctly identify the business domain's bounded contexts and model the microservices after those.  Failure to do this runs the risk that you'll create a suite of mini-monoliths rather than true microservices.

Nathan reminds us that we are definitely going to need a messaging system for an application built with a microservice architecture.  It's simply not an option not to use one as virtually all user interactions will be performed across multiple services.  Microservices are, by their very nature, entirely distributed.  Even simple business processes can often require multiple services and co-ordination of those services.  Nathan says that it's important not to build any messaging into the UI layer as you'll end up coupling the UI to the domain logic and the microservice which is something to be avoided.  One option for a messaging system is NServiceBus, which is what Nathan is currently using, however many other options exist.   When designing the messaging within the system, it's very important to give consideration to versioning of messages and message contracts.  Building in versioning from the beginning ensures that you can deploy individual microservices independently rather than being forced to deploy large parts of the system - perhaps multiple microservices - together if they're all required to use the exact same message version.
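
As a very rough sketch of what building versioning into message contracts from the start might look like (these message types are hypothetical rather than taken from Nathan's talk), one common approach is to make newer versions purely additive, so that services still handling the older contract keep working and can be updated on their own schedule:

// Version 1 of the message contract.
public class OrderPlaced
{
     public Guid OrderId { get; set; }
}

// Version 2 adds new data without removing or changing anything from version 1,
// so services that only understand OrderPlaced can still be deployed independently.
public class OrderPlacedV2 : OrderPlaced
{
     public string CurrencyCode { get; set; }
}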

We next look at the difference between "fat" versus "thin" services.  Thin services generally only deal with data that they "own", if the thin service needs other data for processing, they must request it from the other service that owns that data.  Fat services, however, will hold on to data (albeit temporarily) that actually "belongs" to other services in order to perform their own processing.  This results in coupling between the services, however, the coupling of fat and thin services is different as fat services are coupled by data whereas thin services are coupled by service.

With microservices, cross-cutting concerns such as security and logging become more important than ever.  We should always ensure that security is built in to every service from the very beginning and is treated as a first class citizen of the service.  Enforcing the use of HTTPS across the board (i.e. even when running in development or test environments as well as production) helps to make security the default choice.
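
As a minimal sketch of what "HTTPS by default" can look like in an ASP.NET Core based service (one option among many; the same idea applies whichever framework each service uses):

public void ConfigureServices(IServiceCollection services)
{
     // Registering RequireHttpsAttribute as a global filter means every controller
     // rejects plain HTTP requests, rather than relying on per-controller attributes.
     services.AddMvc(options =>
     {
          options.Filters.Add(new RequireHttpsAttribute());
     });
}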

We then look at how our system's source code can be structured for a microservices based system.  It's possible to use either one source control repository or multiple and there's trade-offs against both options.  If we use a single repository, that's really beneficial during the development phase of the project, but is not so great when it comes to deployment.  On the other hand, using multiple repositories, usually separated by microservice, is great for deployment since each service can be easily integrated and deployed individually, but it's more cumbersome during the development phase of the project.

It's important to remember that each microservice can be written using its own technology stack and that each service could use an entirely different stack to the others.  This can be beneficial if you have different teams with different skill sets building the different services, but it's important to remember that you'll need to constantly monitor each of the technology stacks that you use for security vulnerabilities and other issues that may arise or be discovered over time.  Obviously, the more technology stacks you're using, the more time-consuming this will be.

It's also important to remember that even when you're building a microservices based system, you will still require shared functionality that will be used by multiple services.  This can be built into each service or can be separated out to become a microservice in its own right depending upon the nature of the shared functionality.

Nathan talks about user interfaces to microservice based systems.  These are often written using a SPA framework such as Angular or React.  They'll often go into their own repository for independent deployment, however, you should be very careful that the front-end user interface part of the system doesn't become a monolith in itself.  If the back-end is nicely separated into microservices based on the domain's bounded contexts, the front-end should be broken down similarly too.

Next we look at testing of a microservice based system.  This can often be a double-edged sword as it's fairly easy to test a single microservice with its known good (or bad) inputs and outputs, however, much of the real-world usage of the system will be interactions that span multiple services so it's important to ensure that you're also testing the user's path through multiple services.  This can be quite tricky and there's no easy way to achieve this.  It's often done using automated integration testing via the user interface, although you should also ensure you do test the underlying API separately to ensure that security can't be bypassed.

Configuration of the whole system can often be problematic with a microservice based system.  For this reason, it's usually best to use a separate configuration management system rather than trying to implement things like web.config transforms for each service.  Tools like Consul or Spring Cloud Config are very useful here.

Data management is also of critical importance.  It should be possible to change data within the system's data store without requiring a deployment.  Database migrations are a key tool in helping with this.  Nathan mentions both Entity Framework Migrations and FluentMigrator as two good choices.  He offers a suggestion for things like column renames and suggests that instead of a migration that renames the column, create a whole new column instead.  That way, if the change has to be rolled back, you can simply remove the new column, leaving the old column (with the old name) in place.  This allows other services that may not be being deployed to continue to use the old column until they're also updated.
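
As a rough FluentMigrator-style sketch of that approach (the table and column names here are made up):

[Migration(201710140001)]
public class AddCustomerFullNameColumn : Migration
{
     public override void Up()
     {
          // Add a brand new column rather than renaming the existing one, so services
          // that still read the old column keep working until they are updated.
          Alter.Table("Customer").AddColumn("FullName").AsString(255).Nullable();
     }

     public override void Down()
     {
          // Rolling back simply removes the new column; the old column was never touched.
          Delete.Column("FullName").FromTable("Customer");
     }
}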

Nathan then touches on multi-tenancy within microservice based systems and says that if you use the model of a separate database per tenant, this can lead to a huge explosion of databases if your microservices are using multiple databases for themselves.  It's usually much more manageable to have multi-tenancy by partitioning tenant data within a single database (or the database for each microservice).

Next, we look at logging and monitoring within our system.  Given the distributed nature of a microservice based system, it's important to be able to log and understand the complete user interaction even though logging is done separately by each individual microservice.  To facilitate understanding the entire end-to-end interaction, we can use a CorrelationID.  It's simply a unique identifier that travels through all services, passed along in each message and written to the logs of each microservice.  When we look back at the complete set of logs, combined from the disparate services, we can use the CorrelationID to correlate the log messages into a cohesive whole.  With regard to monitoring, it's also critically important to monitor the entire system and not just the individual services.  It's far more important to know how healthy the entire system is rather than each service, although monitoring services individually is still valuable.
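
As a minimal sketch of the idea (the message and handler types here are hypothetical): every message carries the same CorrelationID, each service writes it to its own logs, and any further messages the service sends pass the same value along.

public class PlaceOrderMessage
{
     public Guid CorrelationId { get; set; }
     public Guid OrderId { get; set; }
}

public class PlaceOrderHandler
{
     private readonly ILogger<PlaceOrderHandler> _logger;

     public PlaceOrderHandler(ILogger<PlaceOrderHandler> logger)
     {
          _logger = logger;
     }

     public void Handle(PlaceOrderMessage message)
     {
          // The CorrelationId is included in every log entry so that logs from
          // different services can later be stitched together into one interaction.
          _logger.LogInformation("Processing order {OrderId} [CorrelationId: {CorrelationId}]",
               message.OrderId, message.CorrelationId);

          // ...do the work, copying message.CorrelationId onto any messages sent from here...
     }
}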

Finally, Nathan shares some details regarding custom tools.  He says that, as a developer on a microservice based system, you will end up building many custom tools.  These will be tools for tasks such as bulk data loading, where getting the data into the system requires processing by a number of different services and the data cannot simply be loaded directly into the database.  He says that despite the potential downsides of working on such systems, building the custom tools can often be some of the more enjoyable parts of building the complete system.

After Nathan's talk it was time for the last coffee break of the day, after which it was time for the final day's session.  For me, this was Monitoring-First Development by Benji Weber.

IMG_20170916_152036Benji starts his talk by introducing himself and says that he's mostly a Java developer but that he's done some .NET and also writes JavaScript as well.  He works at an "ad-tech" company in London.  He wants to first start by talking about Extreme Programming as this is a style of programming that he uses in his current company.  We look at the various practices within Extreme Programming (aka XP) as many of these practices have been adopted within wider development styles, even for teams that don't consider themselves as using XP.  Benji says that, ultimately, all of the XP practices boil down to one key thing - Feedback.  They're all about getting better feedback, quicker.  Benji's company uses full XP with a very collaborative environment, collectively owning all the code and the entire end-to-end process from design/coding through to production, releasing small changes very frequently.

As part of this style adopted by the teams, this has led to the adoption of something they term Monitoring-Driven Development.  This is simply the idea that monitoring of all parts of a software system is the core way to get feedback on that system, both when the system is being developed and as the system is running in production.  Therefore, all new feature development starts by asking the question, "How will we monitor this system?" and then ensuring that the ability to deeply monitor all aspects of the system being developed is a front and centre concern throughout the development cycle.

To illustrate how the company came to adopt this methodology, Benji shares three short stories with us.  The first started with an email to the development team with the subject of "URGENT".  It was from some sales people trying to demonstrate some part of the software and they were complaining that the graphs on the analytics dashboard weren't loading for them.  Benji states that this was a feature that was heavily tested in development so the error seemed strange.  After some analysis into the problem, it was discovered that data was the root cause of the issue and that the development team had underestimated the way in which the underlying data would grow due to users doing unanticipated things on the system, which the existing monitoring that the team had in place didn't highlight.  The second story involves the discovery that 90% of the traffic their software was serving from the CDN was HTTP 500 server errors!  Again, after analysis it was discovered that the problem lay in some JavaScript code that a recently released new version of Internet Explorer was interpreting differently from the old version, causing the client-side JavaScript to continually make requests to a non-existent URL.  The third story involves a report from an irate client that the adverts being served up by the company's advert system were breaking the client's own website.  Analysis showed that this was again caused by a JavaScript issue and a line of code, self = this;, that was incorrectly written using a globally-scoped variable, thereby overwriting variables that the client's own website relied upon.  The common theme throughout all of the stories was that the behaviour of the system had changed, even though no code had changed.  Moreover, all of the problems that arose from the changed behaviour were first discovered by the system's users and not the development team.

Benji references Google's own Site Reliability Engineering book (available to read online for free) which states that 70% of the reasons behind things breaking is because you've changed something.  But this leaves a large 30% of the time where the reasons are something that's outside of your control.  So how did Benji approach improving his ability to detect and respond to issues?  He started by looking at causes vs problems and concluded that they didn't have enough coverage of the problems that occurred.  Benji tells us that it's almost impossible to get sufficient coverage of the causes since there's an almost infinite number of things that could happen that could cause the problem.

To get better coverage of the problems, they universally adopted the "5 whys" approach to determining the root issues.  This involves starting with the problem and repeatedly asking "why?" of each cause to determine the root cause.  An example: monitoring is hard.  Why is it hard, when we don't have the same issue when using Test-Driven Development during coding?  Because TDD follows a Red - Green - Refactor cycle, so you can't really write untestable code, and so on.

IMG_20170916_155043So Benji decided to try to apply the Test-Driven Development principles to monitoring.  Before even writing the feature, they start by determining how the feature will be monitored, then only after determining this, they start work on writing the feature ensuring that the monitoring is not negatively impacted.  In this way, the monitoring of the feature becomes the failing unit test that the actual feature implementation must make "go green".

Benji shares an example of how this is implemented and says that the "failing test" starts with a rule defined within their chosen monitoring tool, Nagios.  This rule could be something like "ensure that more adverts are loaded than reported page views", whereby the user interface is checked for specific elements or a specific response rendering.  This test will show as a failure within the monitoring system as the feature has not yet been implemented, however, upon implementation of the feature, the monitoring test will eventually pass (go green) and there we have a correct implementation of the feature driven by the monitoring system to guide it.  Of course, this monitoring system remains in place with these tests increasing over time and becoming an early warning system should any part of the software, within any environment, start to show any failures.  This ensures that the development team are the first to know of any potential issues, rather than the users being first to know.

Benji says they use a pattern called the Screenplay pattern for their UI based tests.  It's an evolution of the Page Objects pattern and allows highly decoupled tests as well as bringing the SOLID principles to the tests themselves.  He also states that they make use of Feature Toggles not only when implementing new features and functionality but also when refactoring existing parts of the system.  This allows them to test new monitoring scenarios without affecting older implementations.  Benji states that it's incredibly important to follow a true Red - Green - Refactor cycle when implementing monitoring rules and that you should always see your monitoring tests failing first before trying to make them pass/go green.

Finally, Benji says that adopting a monitoring-driven development approach ultimately helps humans too.  It helps in future planning and development efforts as it builds awareness of how and what to think about when designing new systems and/or functionality.

IMG_20170916_165131After Benji's session was over, it was time for all the attendees to gather back in the theatre room for the final wrap-up by the organisers and the prize draw.  After thanking the various people involved in making the conference what it is (sponsors, volunteers, organisers etc.) it was time for the prize draw.  There were some good prizes up for grabs, but alas, I wasn’t to be a winner on this occasion.  The DDD East Anglia 2017 event had been a huge success and it was all the more impressive given that the organisers shared the story that their original venue had pulled out only 5 weeks prior to the event!  The new venue had stepped in at the last minute and held an excellent conference which was  completely seamless to the attendees.  We would never have known of the last minute panic had it not been shared with us.  Here's looking forward to the next DDD East Anglia event next year.

DDD 12 In Review

IMG_20170610_085641On Saturday 10th June 2017 in sunny Reading, the 12th DeveloperDeveloperDeveloper event was held at Microsoft’s UK headquarters.  The DDD events started at the Microsoft HQ back in 2005 and after some years away and a successful revival in 2016, this year’s DDD event was another great occasion.

I’d travelled down to Reading the evening before, staying over in a B&B in order to be able to get to the Thames Valley Park venue bright and early on the Saturday morning.  I arrived at the venue and parked my car in the ample parking spaces available on Microsoft’s campus and headed into Building 3 for the conference.  I collected my badge at reception and was guided through to the main lounge area where an excellent breakfast selection awaited the attendees.  Having already had a lovely big breakfast at the B&B where I was staying, I simply decided on another cup of coffee, but the excellent breakfast selection was very well appreciated by the other attendees of the event.

IMG_20170610_085843

I found a corner in which to drink my coffee and double-check the agenda sheets that we’d been provided with as we entered the building.   There’d been no changes since I’d printed out my own copy of the agenda a few nights earlier, so I was happy with the selections of talks that I’d chosen to attend.   It’s always tricky at the DDD events as there are usually at least 3 parallel tracks of talks and invariably there will be at least one or two timeslots where multiple talks that you’d really like to attend will overlap.

IMG_20170611_104806Occasionally, these talks will resurface at another DDD event (some of the talks and speakers at this DDD in Reading had given the same or similar talks in the recently held DDD South West in Bristol back in May 2017) and so, if you’re really good at scheduling, you can often catch a talk at a subsequent event that you’d missed at an earlier one!

As I finished off my coffee, more attendees turned up and the lounge area was now quite full.  It wasn’t too much longer before we were directed by the venue staff to make our way to the relevant conference rooms for our first session of the day.

Wanting to try something a little different, I’d picked Andy Pike’s Introducing Elixir: Self-Healing Applications at ZOMG scale as my first session.

Andy started his session by talking about where Elixir, the language, came from.  It’s a new language that is built on top of the Erlang virtual machine (BEAM VM) and so has its roots in Erlang.  Like Erlang, Elixir compiles down to the same bytecode that the VM ultimately runs.  Erlang was originally created at the Ericsson company in order to run their telecommunications systems.  As such, Erlang is built on top of a collection of middleware and libraries known as OTP or Open Telecom Platform.  Due to its very nature, Erlang, and thus Elixir, has very strong concurrency, fault tolerance and distributed computing at its very core.  Although Elixir is a “new” language in terms perhaps of its adoption, it’s actually been around for approximately 5 years now and is currently on version 1.4 with version 1.5 not too far off in the future.

Andy tells of how Erlang is used by WhatsApp to run its global messaging system.  He provides some numbers around this.  WhatsApp have 1.2 billion users, process around 42 billion messages per day and can manage to handle around 2 million connections on each server!  Those are some impressive performance and concurrency figures and Andy is certainly right when he states that very few other, if any, platforms and languages can boast such impressive statistics.

Elixir is a functional language and its syntax is heavily inspired by Ruby.  The Elixir language was first designed by José Valim, who was a core contributor in the Ruby community.  Andy talks about the Elixir REPL that ships in the Elixir installation which is called “iex”.  He shows us some slides of simple REPL commands showing that Elixir supports all the same basic intrinsic types that you’d expect of any modern language, integers, strings, tuples and maps.  At this point, things look very similar to most other high level functional (and perhaps some not-quite-so-functional) languages, such as F# or even Python.

Andy then shows us something that appears to be an assignment operator, but is actually a match operator:

a = 1

Andy explains that this is not assigning 1 to the variable ‘a’, but is “matching” a against the constant value 1.  If ‘a’ has no value, Elixir will bind the right-hand side operand to the variable, then perform the same match.  An alternative pattern is:

^a = 1

which performs the matching without the initial binding.  Andy goes on to show how this pattern matching can work to bind variables to values in a list.  For example, given the code:

success = { :ok, 42 }
{ :ok, result } = success

this will match the tuple held in the variable ‘success’ against the pattern on the left-hand side, binding the value of 42 to the variable ‘result’.  We’re told how the colon in front of ok makes it an “atom”.  This is similar to a constant, where the variable’s name is its own value.

IMG_20170610_095341Andy shows how Elixir’s code is grouped into functions and functions can be contained within modules.  This is influenced by Ruby and how it also groups its code.  We then move on to look at lists.  These are handled in a very similar way to most other functional languages in that a list is merely seen as a “head” and a “tail”.  The “head” is the first value in the list and the “tail” is the entire rest of the list.  When processing the items in a list in Elixir, you process the head and then, perhaps recursively, call the same method passing the list “tail”.  This allows a gradual shortening of the list as the “head” is effectively removed with each pass through the list.  In order for such recursive processing to be performant, Elixir includes tail-call optimisation which allows the compiler to avoid growing the call stack with each successive call to the method.  This is possible when the last line of code in the method is the recursive call.

Elixir also has guard clauses built right into the language.  Code such as:

def what_is(x) when is_number(x) and x > 0 do…

helps to ensure that code is more robust by only being invoked when ‘x’ is not only a number but is also greater than zero.  Andy states that, between the usage of such guard clauses and pattern matching, you can probably eliminate around 90-95% of all conditionals within your code (i.e. if x then y).

Elixir is very expressive in the characters it allows within function names, so functions can (and often do) have things like question marks in their name.  It’s a convention of the language that functions that return a Boolean value should end in a question mark, something shared with Ruby also, e.g. String.contains? "elixir of life", "of".  And of course, Elixir, like most other functional languages, has a pipe operator (|>) which allows the piping of the result of one function call into the input of another function call, so instead of writing:

text = "THIS IS MY STRING"
text = String.downcase(text)
text = String.reverse(text)
IO.puts text

Which forces us to continually repeat the “text” variable, we can instead write the same code like this:

text = "THIS IS MY STRING"
|> String.downcase
|> String.reverse
|> IO.puts

Andy then moves on to show us an element of the Elixir language that I found particularly intriguing, doctests.  Elixir makes function documentation a first class citizen within the language and not only does the doctest code provide documentation for the function – Elixir has a function, h, that when passed another function name as a parameter, will display the help for that function – but it also serves as a unit test for the function, too!  Here’s a sample of an Elixir function containing some doctest code:

defmodule MyString do
   @doc ~S"""
   Converts a string to uppercase.

   ## Examples
       iex> MyString.upcase("andy")
       "ANDY"
   """
   def upcase(string) do
     String.upcase(string)
   end
end

The doctest code not only shows the textual help text that is shown if the user invokes the help function for the module (i.e. “h MyString”) but the examples contained within the help text can be executed as part of a doctest unit test for the MyString module:

defmodule MyStringTest do
   use ExUnit.Case, async: true
   doctest MyString
end

The above code uses the doctest code inside the MyString module to invoke each of the provided “example” calls and assert that the output is the same as that defined within the doctest!

After taking a look into the various language features, Andy moves on to talk about the real power of Elixir which it inherits from its Erlang heritage – processes.  It’s processes that provide Erlang, and thus Elixir, with the ability to scale massively, provide it with fault-tolerance and its highly distributed features.

When Elixir functions are invoked, they can effectively be “wrapped” within a process.  This involves spawning a process that contains the function.  Processes are not the same as Operating System processes, but are much more lightweight and are effectively only a C struct that contains a pointer to the function to call, some memory and a mailbox (which will hold messages sent to the function).  Processes have a Process ID (PID) and will, once spawned, continue to run until the function contained within terminates or some error or exception occurs.  Processes can communicate with other processes by passing messages to those processes.  Here’s an example of a very simple module containing a single function and how that function can be called by spawning a separate process:

defmodule HelloWorld do
     def greet do
          IO.puts "Hello World"
     end
end

HelloWorld.greet                     # This is a normal function call.
pid = spawn(HelloWorld, :greet, [])  # This spawns a process containing the function

Messages are sent to processes by invoking the “send” function, providing the PID and the parameters to send to the function:

send pid, { :greet, "Andy" }

This means that invoking functions in processes is almost as simple as invoking a local function.

Elixir uses the concept of schedulers to actually execute processes.  The Beam VM will supply one scheduler per core of CPU available, giving the ability to run highly concurrently.  Elixir also uses supervisors as part of the Beam VM which can monitor processes (or even monitor other supervisors) and can kill processes if they misbehave in unexpected ways.  Supervisors can be configured with a “strategy”, which allows them to deal with errant processes in specific ways.  One common strategy is one_for_one which means that if a given process dies, a single new one is restarted in its place.

Andy then talks about the OTP heritage of Elixir and Erlang and from this there is a concept of a “GenServer”.  A GenServer is a module within Elixir that provides a consistent way to implement processes.  The Elixir documentation states:

A GenServer is a process like any other Elixir process and it can be used to keep state, execute code asynchronously and so on. The advantage of using a generic server process (GenServer) implemented using this module is that it will have a standard set of interface functions and include functionality for tracing and error reporting. It will also fit into a supervision tree.

The GenServer provides a common set of interfaces and API’s that all processes can adhere to, allowing common conventions such as the ability to stop a process, which is frequently implemented like so:

GenServer.cast(pid, :stop)

Andy then talks about “nodes”.  Nodes are separate actual machines that can run Elixir and the Beam VM and these nodes can be clustered together.  Once clustered, a node can start a process not only on its own node, but on another node entirely.  Communication between processes, irrespective of the node that the process is running on, is handled seamlessly by the Beam VM itself.  This provides Elixir solutions with great scalability, robustness and fault-tolerance.

Andy mentions how Elixir has its own package manager called “hex” which gives access to a large range of packages providing lots of great functionality.  There’s “Mix”, which is a build tool for Elixir, OTP Observer for inspection of IO, memory and CPU usage by a node, along with “ETS”, which is an in-memory key-value store, similar to Redis, just to name a few.

Andy shares some book information for those of us who may wish to know more.  He suggests “Programming Elixir” and “Programming Phoenix”, both part of the Pragmatic Programmers series and also “The Little Elixir & OTP Guidebook” from Manning.

Finally, Andy wraps up by sharing a story of a US-based Sports news website called “Bleacher Report”.  The Bleacher Report serves up around 1.5 billion pages per day and is a very popular website both in the USA and internationally.  Their entire web application was originally built on Ruby on Rails and they required approximately 150 servers in order to meet the demand for the load.  Eventually, they re-wrote their application using Elixir.  They now serve up the same load using only 5 servers.  Not only have they reduced their servers by an enormous factor, but they believe that with 5 servers, they’re actually over-provisioned as the 5 servers are able to handle the load very easily.  High praise indeed for Elixir and the BeamVM.  Andy has blogged about his talk here.

After Andy’s talk, it was time for the first coffee break.  Amazingly, there were still some breakfast sandwiches left over from earlier in the morning, which many attendees enjoyed.  Since I was still quite full from my own breakfast, I decided a quick cup of coffee was in order before heading back to the conference rooms for the next session.  This one was Sandeep Singh’s Goodbye REST; Hello GraphQL.

IMG_20170610_103642

Sandeep’s session is all about the relatively new technology of GraphQL.  It’s a query language for your API comprising a server-side runtime for processing queries along with a client-side framework providing an in-browser IDE, called GraphiQL.  One thing that Sandeep is quick to point out is that GraphQL has nothing to do with Graph databases.  It can certainly act as a query layer over the top of a graph database, but can just as easily query any kind of underlying data such as RDBMS’s through to even flat-file data.

Sandeep first talks about where we are today with API technologies.  There’s many of them, XML-RPC, REST, ODATA etc. but they all have their pros and cons.  We explore REST in a bit more detail, as that’s a very popular modern day architectural style of API.  REST is all about resources and working with those resources via nouns (the name of the resource) and verbs (HTTP verbs such as POST, GET etc.), there’s also HATEOAS if your API is “fully” compliant with REST.

Sandeep talks about some of the potential drawbacks with a REST API.   There’s the problem of under-fetching.  This is seen when a given application’s page is required to show multiple resources at once, perhaps a customer along with recently purchased products and a list of recent orders.  Since this is three different resources, we would usually have to perform three distinct different API calls in order to retrieve all of the data required for this page, which is not the most performant way of retrieving the data.  There’s also the problem of over-fetching.  This is where REST API’s can take a parameter to instruct them to include additional data in the response (i.e. /api/customers/1/?include=products,orders), however, this often results in data that is additional to that which is required.  We’re also exposing our REST endpoint to potential abuse as people can add arbitrary inclusions to the endpoint call.  One way to get around the problems of under or over fetching is to create ad-hoc custom endpoints that retrieve the exact set of data required, however, this can quickly become unmaintainable over time as the sheer number of these ad-hoc endpoints grows.

IMG_20170610_110327GraphQL isn’t an architectural style, but is a query language that sits on top of your existing API.  This means that the client of your API now has access to a far more expressive way of querying your API’s data.  GraphQL responses are, by default, JSON but can be configured to return XML instead if required; the input queries themselves are structured similarly to JSON.  Here’s a quick example of a GraphQL query and the kind of data the query might return:

{
	customer {
		name,
		address
	}
}
{
	"data" : {
		"customer" : {
			"name" : "Acme Ltd.",
			"address" : ["123 High Street", "Anytown", "Anywhere"]
		}
	}
}

GraphQL works by sending the query to the server where it’s translated by the GraphQL server-side library.  From here, the query details are passed on to code that you have written on the server in order that the query can be executed.  You’ll write your own types and resolvers for this purpose.  Types provide the GraphQL queries with the types/classes that are available for querying – this means that all GraphQL queries are strongly-typed.  Resolvers tell the GraphQL framework how to turn the values provided by the query into calls against your underlying API.  GraphQL has a complete type system within it, so it supports all of the standard intrinsic types that you’d expect such as strings, integers, floats etc. but also enums, lists and unions.
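
As a rough sketch of what a type and resolver can look like using graphql-dotnet, one of the .NET implementations Sandeep mentions (the Customer class and repository here are hypothetical):

public class Customer
{
     public int Id { get; set; }
     public string Name { get; set; }
}

public class CustomerType : ObjectGraphType<Customer>
{
     public CustomerType()
     {
          // Expose the fields that GraphQL queries are allowed to ask for.
          Field(c => c.Id);
          Field(c => c.Name);
     }
}

public class RootQuery : ObjectGraphType
{
     public RootQuery(ICustomerRepository customers)
     {
          // The resolver tells GraphQL how to satisfy a "customer" query by
          // calling down into the existing data access / API code.
          Field<CustomerType>(
               "customer",
               arguments: new QueryArguments(new QueryArgument<IntGraphType> { Name = "id" }),
               resolve: context => customers.GetById(context.GetArgument<int>("id")));
     }
}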

Sandeep looks at the implementations of GraphQL and explains that it started life as a NodeJS package.  Its heritage is therefore in JavaScript, however, he also states that there are many implementations in numerous other languages and technologies such as Python, Ruby, PHP, .NET and many, many more.  Sandeep says how GraphQL can be very efficient: you’re no longer under or over fetching data, and you’re retrieving the exact fields you want from the tables that are available.  GraphQL also allows you to version your queries and types by adding deprecated attributes to fields and types that are no longer available.

IMG_20170610_110046We take a brief look at the GraphiQL GUI client which is part of the GraphQL client side library.  It displays 3 panes within your browser showing the schemas, available types and fields and a pane allowing you to type and perform ad-hoc queries.  Sandeep explains that the schema and sample tables and fields are populated within the GUI by performing introspection over the types configured in your server-side code, so changes and additions there are instantly reflected in the client.  Unlike REST, which has obvious verbs around data access, GraphQL doesn't really have these.  You need introspection over the data to know how you can use that data, however, this is a very good thing.  Sandeep states how introspection is at the very heart of GraphQL – it is effectively reflection over your API – and it’s this that leads to the ability to provide strongly-typed queries.

We’re reminded that GraphQL itself has no concept of authorisation or any other business logic as it “sits on top of” the existing API.  Such authorisation and business logic concerns should be embedded within the API or a lower layer of code.  Sandeep says that the best way to think of GraphQL is like a “thin wrapper around the business logic, not the data layer”.

IMG_20170610_111157GraphQL is not just for reading data!  It has full capabilities to write data too and these are known as mutations rather than queries.  The general principle remains the same and a mutation is constructed using JSON-like syntax and sent to the server for resolving to a custom method that will validate the data and persist it to the data store by invoking the relevant API endpoint.  Sandeep explains how read queries can be nested, so you can send one query to the server that contains syntax to perform two queries against two different resources.  GraphQL has a concept of "loaders".  These can batch up actual queries to the database to prevent issues when asking for such things as all Customers including their Orders.  Doing something like this normally results in an N+1 issue whereby Orders are retrieved by issuing a separate query for each customer, resulting in degraded performance.  GraphQL loaders work by enabling the rewriting of the underlying SQL that can be generated for retrieving the data so that all Orders are retrieved for all of the required Customers in a single SQL statement.  i.e. Instead of sending queries like the following to the database:

SELECT CustomerID, CustomerName FROM Customer
SELECT OrderID, OrderNumber, OrderDate FROM Order WHERE CustomerID = 1
SELECT OrderID, OrderNumber, OrderDate FROM Order WHERE CustomerID = 2
SELECT OrderID, OrderNumber, OrderDate FROM Order WHERE CustomerID = 3

We will instead send queries such as:

SELECT CustomerID, CustomerName FROM Customer
SELECT OrderID, OrderNumber, OrderDate FROM Order WHERE CustomerID IN (1,2,3)

IMG_20170610_110644Sandeep then looks at some of the potential downsides of using GraphQL.  Caching is somewhat problematic as you can no longer perform any caching at the network layer, since each query is now completely dynamic.  Despite the benefits, there are also performance considerations if you intend to use GraphQL as your underlying data needs to be correctly structured in order to work with GraphQL in the most efficient manner.  GraphQL Loaders should also be used to ensure N+1 problems don’t become an issue.  There are security considerations too.  You shouldn’t expose anything that you don’t want to be public since everything through the API is available to GraphQL and you’ll need to be aware of potentially malicious queries that attempt to retrieve too much data.  One simple solution to such queries is to use a timeout.  If a given query takes longer than some arbitrarily defined timeout value, you simply kill the query, however this may not be the best approach.  Another approach taken by big websites currently using such functionality is to whitelist all the acceptable queries.  If a query is received that isn’t in the whitelist, it doesn’t get run.  You also can’t use HTTP codes to indicate status or contextual information to the client.  Errors that occur when processing the GraphQL query are contained within the GraphQL response text itself, which is returned to the client with a 200 HTTP success code.  You’ll need to have your own strategy for exposing such data to the user in a friendly way.  Finally, Sandeep explains that, unlike other querying technologies such as ODATA, GraphQL has no intrinsic ability to paginate data.  All data pagination must be built into your underlying API and business layer; GraphQL will merely pass the paging data – such as page number and size – onto the API and expect the API to correctly deal with limiting the data returned.

IMG_20170610_103123After Sandeep’s session, it’s time for another coffee break.  I quickly grabbed another coffee from the main lounge area of the conference, this time accompanied by some rather delicious biscuits, before consulting my Agenda sheet to see which room I needed to be in for my next session.  Turned out that the next room I needed to be in was the same one I’d just left!  After finishing my coffee and biscuits I headed back to Chicago 1 for the next session and the last one before the lunch break.  This one was Dave Mateer’s Fun With Twitter Stream API, ELK Stack, RabbitMQ, Redis and High Performance SQL.

Dave's talk is really all about performance and how to process large sets of data in the most efficient and performant manner.  Dave's talk is going to be very demo heavy, and so to give us a large data set to work with, Dave starts by looking at Twitter and, specifically, its Stream API.  Dave explains that the full Twitter firehose, which is reserved only for Twitter's own use, currently has 1.3 million tweets per second flowing through it.  As a consumer, you can get access to a deca-firehose, which contains 1/10th of the full firehose (i.e. 130,000 tweets per second), but this costs money.  However, Twitter does expose a freely available Stream API, although it's limited to 50 tweets per second.  This is still quite a sizable amount of data to process.

IMG_20170610_120247Dave starts by showing us a demo of a simple C# console application that uses the Twitter Stream API to gather real-time tweets for the hashtag #scotland, which are then echoed out to the console.  Our goal is to get the tweet data into a SQL Server database as quickly and as efficiently as possible.  Dave says that, to simulate a larger quantity of data, he's going to read in a number of pre-saved text files containing tweet data that he's previously collected.  These files represent around 6GB of raw tweet data, containing approximately 1.9 million tweets.  He then extends the demo to start saving the tweets into SQL Server.  Dave mentions that he's using Dapper to access SQL Server and that he previously tried such things using Entity Framework, which was admittedly some time in the past, but found it a painful experience and not very performant.  Dave likes Dapper as it's a much simpler abstraction over the database and therefore much more performant.  It's also a lot easier to optimize your queries when using Dapper, as the abstraction is much thinner and the implementation details aren't hidden away from you.
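
As a flavour of what a Dapper-based insert looks like (a minimal sketch using my own assumed table and tweet shape, not Dave's actual code), a parameterised INSERT is a one-liner against an open connection:

using System;
using System.Data.SqlClient;
using Dapper;

public class Tweet
{
    public long Id { get; set; }
    public string ScreenName { get; set; }
    public string Text { get; set; }
    public DateTime CreatedAt { get; set; }
}

public static class TweetWriter
{
    // Assumes a dbo.Tweet table with matching columns exists.
    public static void Save(string connectionString, Tweet tweet)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Execute(
                "INSERT INTO dbo.Tweet (Id, ScreenName, Text, CreatedAt) VALUES (@Id, @ScreenName, @Text, @CreatedAt)",
                tweet); // Dapper maps the object's properties to the named parameters
        }
    }
}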

IMG_20170610_121958Next, Dave shows us a Kibana interface.  As well as writing to SQL Server, he's also saving the tweets to a log file using Serilog and then using Logstash to ship those logs to Elasticsearch, allowing the raw log data to be viewed and explored with Kibana (together known as the ELK stack).  Dave then shows us how easy it is to really leverage the power of tools like Kibana by creating a dashboard for all of the data.
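
For anyone unfamiliar with that part of the pipeline, a Serilog logger writing structured tweet events to a rolling file (which Logstash can then tail and forward to Elasticsearch) can be configured along these lines.  This is a sketch of the general approach rather than Dave's exact configuration, and it assumes the Serilog and Serilog.Sinks.File packages:

using Serilog;

public static class TweetLogging
{
    public static void Configure()
    {
        Log.Logger = new LoggerConfiguration()
            .MinimumLevel.Information()
            .WriteTo.File("logs/tweets-.log", rollingInterval: RollingInterval.Day)
            .CreateLogger();
    }

    public static void LogTweet(long id, string screenName, string text)
    {
        // Structured properties become fields that Kibana can filter and aggregate on.
        Log.Information("Tweet {TweetId} from {ScreenName}: {Text}", id, screenName, text);
    }
}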

From here, Dave begins to talk about performance and just how fast we can process the large quantity of tweet data.  The initial run of the application, which simply read each tweet from the file and performed an INSERT via Dapper to get the data into the SQL Server database, was able to process approximately 420 tweets per second.  This isn't bad, but it's not a great level of performance.  Dave digs out SQL Server Profiler to see where the bottlenecks are within the data access code, and this shows that there are some expensive reads – the data is normalized so that the user that a tweet belongs to is stored in a separate table and looked up when needed.  It's decided that adding indexes on the relevant columns used for the lookup data might speed up the import process.  Sure enough, after adding the indexes and re-running the tweet import, we improve from 420 to 1,600 tweets per second.  A good improvement, but not an order of magnitude improvement.  Dave wants to know if we can change the architecture of our application and go even faster.  What if we want to achieve a 10x increase in performance?

Dave states that, since his laptop has multiple cores, we should start by changing the application architecture to make better use of parallel processing across all of the available cores in the machine.  We set up an instance of the RabbitMQ message queue, allowing us to read the tweet data in from the files and send it to a RabbitMQ queue.  The queue is explicitly declared as durable and each message is marked as persistent in order to ensure we have the ability to continue where we left off in the event of a crash or server failure.  From here, we can have multiple instances of another application that pull messages off the queue, leveraging RabbitMQ's ability to effectively isolate each of the client consumers, ensuring that the same message is not sent to more than one consumer.  Dave then sets up Redis.  This will be used for the "lookups" that are required when adding tweet data.  Users (and other lookup data) are added to the database first, then all of that data is cached in Redis, which is an in-memory key/value store often used for caching scenarios.  As tweets are pulled from the RabbitMQ queue, the required ID/key lookups for Users and other data are done against the Redis cache rather than by querying SQL Server, thus improving performance.  Once processed and the relevant values looked up from Redis, Dave uses SQL Server Bulk Copy to get the data into SQL Server itself.  SQL Server Bulk Copy provides a significant performance benefit over standard INSERT T-SQL statements.  For Dave's purposes, he bulk copies the data into temporary tables within SQL Server and then, at the end of the import, runs a single T-SQL statement to copy the data from the temporary tables to the real tables.
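
The final hop, bulk copying into a staging table and then flushing it to the real table in one statement, might look roughly like this (my own sketch with hypothetical table and column names, not Dave's code):

using System.Data;
using System.Data.SqlClient;

public static class TweetBulkLoader
{
    // Assumes a dbo.TweetStaging table exists with the same columns as the supplied DataTable.
    public static void BulkLoad(string connectionString, DataTable tweets)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // One bulk copy instead of thousands of individual INSERT statements.
            using (var bulkCopy = new SqlBulkCopy(connection) { DestinationTableName = "dbo.TweetStaging" })
            {
                bulkCopy.WriteToServer(tweets);
            }

            // Flush the staging table into the real table in a single statement.
            using (var command = new SqlCommand(
                "INSERT INTO dbo.Tweet (Id, ScreenName, Text, CreatedAt) " +
                "SELECT Id, ScreenName, Text, CreatedAt FROM dbo.TweetStaging; " +
                "TRUNCATE TABLE dbo.TweetStaging;", connection))
            {
                command.ExecuteNonQuery();
            }
        }
    }
}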

IMG_20170610_125524Having re-architected the solution in this manner, Dave then runs his import of 6GB of tweet data again.  As he's now able to take advantage of the multiple CPU cores available, he runs 3 console applications in parallel to process all of the data.  Each console application completes its job within around 30 seconds, and whilst they're running, Dave shows us the Redis dashboard, which indicates that Redis is receiving around 800,000 hits to the cache per second!  The result, ultimately, is that the application's performance has increased from processing around 1,600 tweets per second to around 20,000!  An impressive improvement indeed!

Dave then looks at some potential downsides of such re-architecture and performance gains.  He shows how he's actually getting duplicated data imported into SQL Server, which is likely due to race conditions and concurrency issues at one or more points within the processing pipeline.  Dave then quickly shows us how he's worked around this problem with some rather ugly looking C# code within the application (using temporary generic List and Dictionary structures to pre-process the data).  Due to the added complexity that this performance improvement brings, Dave argues that sometimes slower is actually faster: if you don't really need so much raw speed, you can remove things like Redis from the architecture, slowing down the processing (albeit to a still acceptable level) but allowing a lot of simplification of the code.

IMG_20170610_130715After Dave’s talk was over, it was time for lunch.  The attendees made their way to the main conference lounge area where we could choose between lunch bags of sandwiches (meat, fish and vegetarian options available) or a salad option (again, meat and vegetarian options).

Being a recently converted vegetarian, I opted for a lovely Greek Salad and then made my way to the outside area along with, it would seem, most of the other conference attendees.

IMG_20170610_132208It had turned into a glorious summer's day in Reading by now, and the attendees, along with some speakers and other conference staff, were enjoying the lovely sunshine outdoors while we ate our lunch.  We didn't have too long to eat, though, as there were a number of Grok talks taking place inside one of the major conference rooms over lunchtime.  After finishing my lunch (food and drink were not allowed in the session rooms themselves) I headed back towards Chicago 1 where the Grok Talks were to be held.

I’d managed to get there early enough in order to get a seat (these talks are usually standing room only) and after settling myself in, we waited for the first speaker.  This was to be Gary Short with a talk about Markov, Trump and Countering Radicalisation.

Gary starts by asking, “What is radicalisation?”.  It's really just persuasion – being able to persuade people to hold an extreme view of some information.  Counter-radicalisation is being able to persuade people to hold a much less extreme belief.  This is hard to achieve, and Gary says that it's due to such things as Cognitive Dissonance and the “Backfire” Effect (closely related to Confirmation Bias) – two subjects that he doesn't have time to go into in a 10 minute grok talk, but he suggests we Google them later!  So, how do we process text to look for signs of radicalisation?  Well, we start with focus groups and corpora of known good text, as well as data from previously de-radicalised people (answers to questions like “What helped you change?” etc.)  Gary says that Markov chains are used to process the text.  They're a way of “flowing” through a group of words in such a way that the next word is chosen based upon statistical data about the current word and what's likely to follow it.  Finally, Gary shows us a demo of some Markov chains in action with a C# console application that generates random tweet-length sentences based upon analysis of a corpus of text from Donald Trump.  His application is called TweetLikeTrump and is available online.
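
To show the shape of the technique, here is a toy word-level Markov chain of my own (not Gary's TweetLikeTrump code): the chain is just a dictionary from each word to the words observed to follow it, sampled at random to generate new text:

using System;
using System.Collections.Generic;

public class MarkovChain
{
    private readonly Dictionary<string, List<string>> _transitions = new Dictionary<string, List<string>>();
    private readonly Random _random = new Random();

    // Record which word follows which across the whole corpus.
    public void Train(string corpus)
    {
        var words = corpus.Split(new[] { ' ', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        for (var i = 0; i < words.Length - 1; i++)
        {
            if (!_transitions.TryGetValue(words[i], out var followers))
                _transitions[words[i]] = followers = new List<string>();
            followers.Add(words[i + 1]);
        }
    }

    // Walk the chain, picking each next word in proportion to how often it followed the current one.
    public string Generate(string startWord, int maxWords = 20)
    {
        var current = startWord;
        var output = new List<string> { current };
        while (output.Count < maxWords && _transitions.TryGetValue(current, out var followers))
        {
            current = followers[_random.Next(followers.Count)];
            output.Add(current);
        }
        return string.Join(" ", output);
    }
}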

The next Grok talk is Ian Johnson's Sketch Notes.  Ian starts by defining Sketch Notes.  They're visual notes that you can create to capture the information from things like conferences and events.  He states that he's not an artist, so don't think that you can't get started doing sketch noting yourself.  Ian talks about how to get started with Sketch Notes.  He says it's best to start by simply practising your handwriting!  Find a way of quickly and clearly writing down text using pen & paper, and then practise it over and over.  Ian shared his own structure for a sketch note: he places the talk title and the date in the top left and right corners, and the event title and the speaker's name / Twitter handle in the bottom corners.  He goes on to talk more about the component parts of a sketch note.  Use arrows to create a flow between the ideas and text that capture the individual concepts from the talk.  Use colour and decoration to emphasise specific and important points – but try to do this when there's a “lull” in the talk and you don't necessarily have to be concentrating 100%, as it's important to remember that the decoration is not as important as the content itself.  Ian shares a specific example: if the speaker mentions a book, Ian will draw a little book icon and simply write the title of the book next to it.  He says drawing people is easy and can be done with a simple circle for the head and no more than 4 or 5 lines for the rest of the body, and placing the person's arms in different positions helps to indicate expression.  Finally, Ian says that if you make a mistake in the drawing (and you almost certainly will do, at least when starting out) make a feature of it!  Draw over the mistake to create some icon that might be meaningful in the context of the talk, or create a fancy stylised bullet point and repeat that “mistake” to make it look intentional!  Ian has blogged about his talk here.

Next up is Christos Matskas's grok talk on Becoming an awesome OSS contributor.  Christos starts by asking the audience who is using open source software.  After a little thought, virtually everyone in the room raises their hand, as we're all using open source software in one way or another.  This is usually because we'll be working on projects that use at least one NuGet package, and NuGet packages are mostly open source.  Christos shares the start of his own journey into becoming an OSS contributor, which started with him messing up a NuGet package restore on his first day at a new contract.  This led to him documenting the fix he applied, which was eventually seen by the NuGet maintainers, and he was invited to write some documentation for them.  Christos talks about major organisations using open source software including Apple, Google and Microsoft, as well as the US Department of Defense and the City of Munich, to name but a few.  Getting involved in open source software yourself is a great way to give back to the community that we're all invariably benefiting from.  It's a great way to improve your own code and to improve your network of peers and colleagues.  It's also great for your CV/resume.  In the US, it's almost mandatory that you have code on a GitHub profile.  Even in the UK and Europe, having such a profile is not mandatory, but it is a very big plus to your application when you're looking for a new job.  You can also get free software tools to help you with your coding: companies such as JetBrains, Redgate and many, many others will frequently offer their software products for free or at a heavily discounted price to open source projects.  Finally, Christos shares a few websites that you can use to get started in contributing to open source, such as up-for-grabs.net, www.firsttimersonly.com and the Twitter account @yourfirstPR.

The final grok talk is Rik Hepworth's Lability.  Lability is a PowerShell module, available via GitHub, that uses PowerShell's Desired State Configuration (DSC) feature to facilitate the automated provisioning of complete development and testing environments using Windows Hyper-V.  The tool extends the PowerShell DSC commands with metadata that the Lability tool understands, configuring not only the virtual machines themselves (i.e. the host machine, networking etc.) but also the software that is deployed and configured on those virtual machines.  Lability can be used to automate such provisioning not only on Azure itself, but also on a local development machine.

IMG_20170610_141738After the grok talks were over, I had a little more time available in the lunch break to grab another quick drink before once again examining the agenda to see where to go for the first session of the afternoon, and penultimate session of the day.  This one was Ian Russell’s Strategic Domain-Driven Design.

Ian starts his talk by mentioning the “bible” for Domain-Driven Design: the book of the same name by Eric Evans.  Ian asks the audience who has read the book, to which quite a few people raise their hands.  He then asks who has read past Chapter 14, at which point a lot of people put their hands down.  Ian states that, if you've never read the book, the best way to get the most value from it is to read the last third of the book first and only then return to read the first two-thirds!

So what is Domain-Driven Design?  It’s an abstraction of reality, attempting to model the real-world we see and experience before us in a given business domain.  Domain-Driven Design (DDD) tries to break down the domain into what is the “core” business domain – these are the business functions that are the very reason for a business’s being – and other domains that exist that are there to support the “core” domain.  One example could be a business that sells fidget spinners.  The actual domain logic involved with selling the spinners to customers would be the core domain, however, the same company may need to provide a dispute resolution service for unhappy customers.  The dispute resolution service, whilst required by the overall business, would not be the core domain but would be a supporting or generic domain.

DDD has a ubiquitous language.  This is the language of the business domain, not the technical domain.  Great care should be taken not to use technical terms when discussing domains with business stakeholders, and the terms within the language should be ubiquitous across all domains and across the entire business.  Having this ubiquitous language reduces the chance of ambiguity and ensures that everyone can relate to specific component parts of the domain using the same terminology.  DDD has sub-domains, too.  These are the core domain – the main business functionality, the supporting domains – which exist solely to support the core, and generic domains – such as the dispute resolution domain, which supports the core domain but is generic enough that it could apply to other businesses too.  DDD has bounded contexts.  These are like sub-domains but don't necessarily map directly to them.  They are explicit boundaries that separate one area of the business from another.  Importantly, bounded contexts can be developed in software independently of each other.  They could be written by different development teams and could even use entirely different architectures and technologies in their construction.

Ian talks about driving out the core concepts by creating a “shared kernel”.  These are the concepts that exist in the overlaps between bounded contexts.  Such concepts don't have to be shared between the bounded contexts, and they may well differ – the concept of an “account” within the finance bounded context might mean something quite different from an “account” within the shipping bounded context, for example.  Ian then talks about the concept of an “anti-corruption layer” as part of each bounded context.  This allows a bounded context to communicate with concepts from the shared kernel but, where those concepts do differ between contexts, the anti-corruption layer translates them and prevents incorrect implementations of a concept from leaking into the context (a toy sketch of this translation follows this paragraph).  Ian mentions domain events next.  He says that these are something that are not within Eric Evans' book but are often documented in other DDD literature.  Domain events are essentially just “things that occur” within the domain.  For example, a new customer registering on the company's website is a domain event.  Events can be created by users, by the passage of time, by external systems, or even by other domain events.
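
A minimal sketch of the anti-corruption layer idea (entirely hypothetical types of my own, not code from Ian's talk): the shipping context never consumes the finance context's account directly, it translates it at the boundary into its own notion of a delivery account and discards anything that doesn't belong in its model:

// Finance bounded context's view of an account (hypothetical).
public class FinanceAccount
{
    public int AccountId { get; set; }
    public string LegalEntityName { get; set; }
    public decimal CreditLimit { get; set; }
    public string BillingAddress { get; set; }
}

// Shipping bounded context's own model (hypothetical).
public class DeliveryAccount
{
    public int AccountId { get; set; }
    public string RecipientName { get; set; }
    public string DeliveryAddress { get; set; }
}

// Anti-corruption layer: translates at the boundary so that finance concepts
// (credit limits, legal entities) never leak into the shipping model.
public class ShippingAccountTranslator
{
    public DeliveryAccount Translate(FinanceAccount financeAccount)
    {
        if (financeAccount == null) throw new System.ArgumentNullException(nameof(financeAccount));

        return new DeliveryAccount
        {
            AccountId = financeAccount.AccountId,
            RecipientName = financeAccount.LegalEntityName,
            DeliveryAddress = financeAccount.BillingAddress // shipping may later replace this with a dedicated delivery address
        };
    }
}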

All of this is strategic domain-driven design.  It's the modelling and understanding of the business domain without ever letting technological considerations interfere with that modelling and understanding.  It's simply good architectural practice, and an understanding of how the different parts of the business domain interact and communicate with one another.

Ian suggests that there are three main techniques for achieving strategic domain-driven design: Behaviour-Driven Development (BDD), prototyping and event storming.  BDD involves writing acceptance tests for our software application using a domain-specific language within our application code that allows the tests to use the ubiquitous language of the domain, rather than the explicitly technical language of the software solution.  BDD facilitates engagement with business stakeholders in the development of the acceptance tests that form part of the software solution.  One common way to do this is to run “three amigos” sessions, which allow developers, QA/testers and domain experts to write the BDD tests together, usually in the standard Given-When-Then style (a toy example in code follows this paragraph).  Prototyping consists of producing images and “wireframes” that give an impression of how a completed software application could look and feel.  Prototypes can be low-fidelity, just simple static images and mock-ups, but it's better if you can produce high-fidelity prototypes, which allow varying levels of interaction.  Tools such as Balsamiq and InVision, amongst others, can help with the production of high-fidelity prototypes.  Event storming is a particular format of meeting or workshop in which developers and domain experts collaborate on the production of a large paper artefact containing numerous sticky notes of varying colours.  The sticky notes' colours represent various artefacts within the domain such as events, commands, users, external systems and others.  Sticky notes are added, amended, removed and moved around the paper on the wall by all meeting attendees.  The resulting sticky notes tend to cluster naturally into the various bounded contexts of the domain, allowing the overall domain design to emerge.  If you run your own event storming session, Ian suggests starting by trying to drive out the domain events first, and for each event, attempting to first work backwards to find the cause or causes of that event, then work forwards, investigating what the event causes – perhaps further events or the requirement for user intervention etc.
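
As a flavour of the Given-When-Then shape, here is a toy acceptance test written with plain xUnit and entirely hypothetical domain types (not SpecFlow/Gherkin and not code from the talk), reusing the fidget spinner and dispute resolution example from earlier.  Note how the test reads in the ubiquitous language of the domain rather than in technical terms:

using System;
using Xunit;

public enum DisputeStatus { Open, Resolved }

public class Order
{
    public Guid Id { get; } = Guid.NewGuid();
    public bool Delivered { get; private set; }
    public void MarkAsDelivered() => Delivered = true;
}

public class Dispute
{
    public Dispute(Guid orderId, string reason) { OrderId = orderId; Reason = reason; }
    public Guid OrderId { get; }
    public string Reason { get; }
    public DisputeStatus Status { get; } = DisputeStatus.Open;
}

public class Customer
{
    public Order PlaceOrder(string productName) => new Order();

    public Dispute RaiseDispute(Order order, string reason)
    {
        if (!order.Delivered) throw new InvalidOperationException("Only delivered orders can be disputed.");
        return new Dispute(order.Id, reason);
    }
}

public class DisputeResolutionAcceptanceTests
{
    [Fact]
    public void An_unhappy_customer_can_raise_a_dispute_against_a_delivered_order()
    {
        // Given a customer with a delivered order
        var customer = new Customer();
        var order = customer.PlaceOrder("Fidget Spinner");
        order.MarkAsDelivered();

        // When the customer raises a dispute about the order
        var dispute = customer.RaiseDispute(order, "Item arrived damaged");

        // Then the dispute is open and linked to the order
        Assert.Equal(DisputeStatus.Open, dispute.Status);
        Assert.Equal(order.Id, dispute.OrderId);
    }
}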

IMG_20170610_145159Ian rounds off his talk by sharing some of his own DDD best practices.  We should strive for creative collaboration between developers and domain experts at all stages of the project, and foster an environment that allows exploration and experimentation in order to find the best model for the domain, which may not be the first model that is found.  Ian states that determining the explicit context boundaries is far more important than finding a perfect model, and that the focus should always be primarily on the core domain, as this is the area that will bring the greatest value to the business.

After Ian’s talk was over, it was time for another coffee break, the last of the day.  I grabbed a coffee and once again checked my agenda sheet to see where to head for the final session of the day.   This last session was to be Stuart Lang’s Async in C#, the Good, the Bad and the Ugly.

IMG_20170610_153023Stuart starts his session by asking the audience to wonder why he's doing a session on Async in C# today.  It's 2017 and async/await has been around for over 6 years!  Simply put, Stuart says that, whilst async/await code is fairly ubiquitous these days, there are still many mistakes made in some implementations, and the finer points of asynchronous code are not as well understood.  We learn that async is really an abstraction, much like an ORM tool: if you use it in a naïve way, it's easy to get things wrong.

Stuart mentions async's good parts.  We can perform non-blocking waits for background I/O operations, thus allowing our user interfaces to remain responsive.  But then come the bad parts.  It's not always entirely clear which code is actually asynchronous: if we call an async method in someone else's library, there's no guarantee that what we're really executing is asynchronous code, even if the method returns a Task<T>.  Async can sometimes lead to the need to duplicate code, for example when your code has to provide both asynchronous and synchronous versions of each method.  Also, async can't be used everywhere.  It can't be used in a class constructor or inside a lock statement (a common workaround for the lock case is sketched below).
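
For the lock limitation specifically, a common workaround (my own illustration, not from Stuart's talk) is to replace the lock with a SemaphoreSlim, which, unlike lock, can be awaited:

using System.Threading;
using System.Threading.Tasks;

public class CachedValueProvider
{
    // await cannot appear inside a lock block, so use a SemaphoreSlim(1, 1) as an async-friendly mutex.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private string _cachedValue;

    public async Task<string> GetValueAsync()
    {
        await _gate.WaitAsync();
        try
        {
            if (_cachedValue == null)
                _cachedValue = await LoadValueAsync(); // awaiting here would not compile inside lock { }
            return _cachedValue;
        }
        finally
        {
            _gate.Release();
        }
    }

    // Placeholder for some real asynchronous I/O.
    private Task<string> LoadValueAsync() => Task.FromResult("loaded");
}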

IMG_20170610_155205One of the worst side-effects of poor Async implementation is deadlocks.  Stuart states that, when it comes to Async, doing it wrong is worse than not doing it at all!  Stuart shows us a simple Async method that uses some Task.Delay methods to simulate background work.  We then look at what the Roslyn compiler actually translates the code to, which is a large amount of code that effectively implements a complete state machine around the actual method’s code. 

IMG_20170610_160038Stuart then talks about the synchronization context.  This is an often misunderstood part of async that allows awaited code, which may resume on a different thread, to synchronize with other threads.  For example, if awaited code needs to update an element on the user interface in a WinForms or WPF application, it will need to marshal that change back to the UI thread, as it can't be performed by the thread-pool thread that the awaited code would otherwise resume on.  Stuart then talks about blocking on async code, for example:

var result = MyAsyncMethod().Result;

We should try to never do this!  Doing so can deadlock: the continuations within the async method may be trying to resume on the same synchronization context that is already blocked by the “outer” code performing the .Result call on the async method.

Stuart then shows us some sample code that runs as an ASP.NET page with the Async code being called from within the MVC controller.  He outputs the threads used by the Async code to the browser, and we then examine variations of the code to see how each awaited piece of code uses either the same or different threads.  One way of overcoming the blocking issue when using .Result at the end of an Async method call is to write code similar to:

var result = Task.Run(() => MyAsyncMethod().Result).Result;

It's messy code, but it works.  However, code like this should be heavily commented, because if someone removes that outer Task.Run(), the code will start blocking again and will fail miserably.

Stuart then talks about the vs-threading library which provides a JoinableTaskFactory.   Using code such as:

jtf.Run(() => MyAsyncMethod())

ensures that awaited code resumes on the same thread that's blocked.  Stuart shows the output from the same ASP.NET demo when using the JoinableTaskFactory, and all of the various awaited blocks of code can be seen to run and resume on the same thread.

IMG_20170610_163019Finally, Stuart shares some of the best practices around deadlock prevention.  He asserts that the ultimate goal for prevention has to be an application that can provide Async code from the very “bottom” level of the code (i.e. that which is closest to I/O) all the way up to the “top” level, where it’s exposed to other client code or applications.
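
A minimal sketch of what “async all the way” means in practice (generic illustrative code, not Stuart's demo): every layer awaits the layer below it and returns a Task, so nothing ever calls .Result or .Wait() and there is nothing left to deadlock on:

using System.Net.Http;
using System.Threading.Tasks;

// Bottom layer: closest to the I/O, naturally asynchronous.
public class PriceGateway
{
    private static readonly HttpClient Http = new HttpClient();

    public Task<string> FetchRawPriceAsync(string symbol) =>
        Http.GetStringAsync($"https://example.com/prices/{symbol}"); // hypothetical endpoint
}

// Middle layer: awaits the layer below and stays async.
public class PriceService
{
    private readonly PriceGateway _gateway = new PriceGateway();

    public async Task<decimal> GetPriceAsync(string symbol)
    {
        var raw = await _gateway.FetchRawPriceAsync(symbol);
        return decimal.Parse(raw);
    }
}

// Top layer: an MVC action or console entry point, still async, with no .Result anywhere.
public class PriceController
{
    private readonly PriceService _service = new PriceService();

    public async Task<string> Get(string symbol)
    {
        var price = await _service.GetPriceAsync(symbol);
        return $"{symbol}: {price}";
    }
}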

IMG_20170610_164756After Stuart's talk is over, it's time for the traditional prize/swag giving ceremony.  All of the attendees start to gather in the main conference lounge area and await the attendees from the other sessions, which are finishing shortly afterwards.  Once all sessions have finished and the full set of attendees has gathered, the main organisers of the conference take time to thank the event sponsors.  There are quite a few of them and, without them, there simply wouldn't be a DDD conference.  For this, the sponsors get a rapturous round of applause from the attendees.

After the thanks, there's a small prize giving ceremony, with many of the prizes being given away by the sponsors themselves.  As is usually the case, I don't win a prize – but given that there were only around half a dozen prizes, I'm not alone!

It only remained for the organisers to announce the next conference in the DDD calendar which, although it doesn't have a specific date at the moment, will take place in October 2017.  This one is DDD North.  There's also a surprise announcement of a DDD in Dublin to be held in November 2017, which will be a “double capacity” event, catering for around 600 – 800 conference delegates.  Now there's something to look forward to!

So, another wonderful DDD conference was wrapped up, and there’s not long to wait until it’s time to do it all over again!

UPDATE (20th June 2017):

As last year, Kevin O'Shaughnessy was also in attendance at DDD12, and he has blogged his own review and write-up of the various sessions that he attended.  Many of the sessions Kevin attended were different from the ones I attended, so check out his blog for a fuller picture of the day's many great sessions.