DDD North 2017 In Review
On Saturday, 14th October 2017, the 7th annual DDD North event took place, this time at the University of Bradford.
One nice element of the DDD North conferences (as opposed to the various other DDD conferences around the UK) is that I'm able to drive to the conference on the morning of the event and drive home again after the event has finished. This time, the journey was merely 1 hour 20 minutes by car, so I didn't have to get up too early in order to make the journey. On the Saturday morning, after having a quick cup of coffee and some toast at home, I headed off towards Bradford for the DDD North event.
After arriving at the venue and parking my car in one of the ample free car parks available, I headed to the Richmond Building reception area where the conference attendees were gathering. After registering my attendance and collecting my conference badge, I headed into the main foyer area to grab some coffee and breakfast. The catering has always been particularly good at the DDD North conferences and this time round was no exception. Being a vegetarian nowadays, I can no longer avail myself of a sausage or bacon roll, both of which were available, however on this occasion there were also veggie sausage breakfast rolls available. A very nice and thoughtful touch! And delicious, too!
After some lovely breakfast and a couple of cups of coffee, it was soon time to head off to the first session of the day. This one was to be Colin Mackay's User Story Mapping For Beginners.
Colin informs us that his talk will be very hands-on, and so he hands out some sticky notes and markers to some of the session attendees, but unfortunately, runs out of stock of them before being able to supply everyone.
Colin tells us about story mapping and shares a quote from Martin Fowler:
"Story mapping is a technique that provides the big picture that a pile of stories so often misses"
Story mapping is essentially a way of arranging our user stories, written out on sticky notes, into meaningful "groups" of stories, tasks, and sections of application or business functionality. Colin tells us that it's a very helpful technique for driving out a "ubiquitous language" and shares an example of how he was able to understand a sales person's usage of the phrase "closing off a customer" to mean closing a sale, rather than assuming it to mean that the customer no longer had a relationship with the supplier. Colin also states that a document cannot always tell you the whole story. He shares a picture from his own wedding which was taken in a back alley behind the wedding venue. He says how the picture appears unusual for a wedding photo, but the photo doesn't explain that there'd been a fire alarm in the building and all the wedding guests had to gather in the back alley at the time, so they decided to take a photo of the event! He also tells us how User Story Mapping is great for sparking conversations and helps to improve prioritisation of software development tasks.
Colin then gets the attendees that have the sticky notes and the markers to actually write out some user story tasks based upon a person's morning routine. He states that this is an exercise from the book User Story Mapping by Jeff Patton. Everyone is given around 5 minutes to do this and afterwards, Colin collects the sticky notes and starts to stick them onto a whiteboard. Whilst he's doing this, he tells us that there are three levels of tasks within User Story Mapping. At the very top level, there's "Sea" level. These are the user goals and each task within is atomic - i.e. you can't stop in the middle of it and do something else. Next is Summary level, which is often represented by a cloud or a kite; this level shows greater context and is made up of many different user goals. Finally, we have the Sub-functions, represented by a fish or a clam. These are the individual tasks that go to make up a user goal. So an example might have a user goal (sea level) of "Take a Shower" and the individual tasks could be "Turn on shower", "Set temperature", "Get in shower", "Wash body", "Shampoo hair" etc.
After an initial arrangement of sticky notes, we have our initial User Story Map for a morning routine. Colin then says we can start to look for alternatives. The body of the map is filled with notes representing details and individual granular tasks; there are also variations and exceptions here, and we'll need to re-organise the map as new details are discovered so that the complete map makes sense. In a software system, the map becomes the "narrative flow" and is not necessarily in a strict order as some tasks can run in parallel. Colin suggests using additional stickers or symbols that can be added to the sticky notes to represent which teams will work on which parts of the map.
Colin says it's good to anthropomorphise the back-end systems within the overall software architecture as this helps with conversations and allows non-technical people to better understand how the component parts of the system work together. So, instead of saying that the web server will communicate with the database server, we could say that Fred will communicate with Bob or that Luke communicates with Leia. Giving the systems names greatly helps.
We now start to look at the map's "backbone". These are the high level groups that many individual tasks will fit into. So, for our morning routine map, we can group tasks such as "Turn off alarm" and "Get out of bed" as a grouping called "Waking up". We also talk about scope creep. Colin tells us that, traditionally, more sticky notes being added to a board even once the software has started to be built is usually referred to as scope creep, however, when using techniques such as User Story Mapping, it often just means that your understanding of the overall system that's required is getting better and more refined.
Once we've built our initial User Story Map, it's easy to move individual tasks within a group of tasks in a goal below a horizontal line which we can draw across the whiteboard. These tasks can then represent a good minimum viable product: we simply move those tasks in a group that we deem to be more valuable, and thus required for the MVP, whilst leaving the "nice to have" tasks in the group on the other side of the line. In doing this, it's perfectly acceptable to replace a task with a simpler task as a temporary measure, which would then be removed and replaced with the original "proper" task for work beyond MVP. After deciding upon our MVP tasks, we can simply rinse and repeat the process, taking individual tasks from within groups and allocating them to the next version of the product whilst leaving the less valuable tasks for a future iteration.
Colin says how this process results in producing something called "now maps", as they represent what we have, or where we're at, currently, whereas "later maps" represent some aspect of where we want to be in the future. Now maps are usually produced when you're first trying to understand the existing business processes that will be modelled into software. From here, you can produce later maps showing the iterations of the software as it will be produced and delivered in the future. Colin also mentions that we should always be questioning all of the elements of our maps, asking questions such as "Why does X happen?", "What are the pain points around this process?", "What's good about the process?" and "What would make this process better?". It's by continually asking such questions, refining the actual tasks on the map, and continually reorganising the map that we can ultimately create great software that really adds business value.
Finally, Colin shares some additional resources where we can learn more about User Story Mapping and related processes in general. He mentions the User Story Mapping book by Jeff Patton along with The Goal by Eli Goldratt, The Phoenix Project by Gene Kim, Kevin Behr and George Spafford and finally, Rolling Rocks Downhill by Clarke Ching.
After Colin's session is over, it's time for a quick coffee break before the next session. The individual rooms are a little distance away from the main foyer area where the coffee is served, and I realised my next session was in the same room in which I was already sat! Therefore, I decided to simply stay in my seat and await the next session. This one was to be David Whitney's How Stuff Works...In C# - Metaprogramming 101.
David's talk is going to be all about how some of the fundamental frameworks that we use as .NET developers every day work, and how they're full of "metaprogramming". Throughout his talk, he's going to decompose an MVC (Model View Controller) framework, a unit testing framework and an IoC (Inversion of Control) container framework to show how they work and, specifically, to examine how they operate on the code that we write that uses and consumes these frameworks.
To start, David explains what "metaprogramming" is. He shares the Wikipedia definition which, in typical Wikipedia fashion, is somewhat obtuse. However, the first statement does sum it up:
"Metaprogramming is a programming technique in which computer programs have the ability to treat programs as their data."
This simply means that meta-programs are programs that operate on other source code; meta-programming is essentially about writing code that looks at, inspects and works with your own software's source code.
David says that in C#, meta-programming is mostly done by using classes within the System.Reflection namespace and making heavy use of things such as the Type class therein, which allows us to get all kinds of information about the types, methods and variables that we're going to be working with. David shows a first trivial example of a meta-program, which enumerates the list of types by using a call to the Assembly.GetTypes() method:
public class BasicReflector
{
    public Type[] ListAllTypesFromSamples()
    {
        return GetType().Assembly.GetTypes();
    }

    public MethodInfo[] ListMethodsOn<T>()
    {
        return typeof(T).GetMethods();
    }
}
He asks why you would want to do this? Well, it's because many of the frameworks we use (MVC, unit testing etc.) are essentially based on this ability to perform introspection on the code that you write in order to use them. We often make extensive use of the Type class in our code, even when we're not necessarily aware that we're doing meta-programming, but the Type class is just one part of a rich "meta-model" for performing reflection and introspection over code. A meta-model is essentially a "model of your model". The majority of classes within the System.Reflection namespace that provide this meta-model have names ending in "Info", so classes such as TypeInfo, MethodInfo, MemberInfo and ConstructorInfo can all be used to give us highly detailed information and data about our code.
As an example, a unit testing framework at its core is actually trivially simple. It essentially just finds code and runs it. It examines your code for classes decorated with a specific attribute (i.e. [TestFixture]) and invokes methods that are decorated with a specific attribute (i.e. [Test]). David says that one of his favourite coding katas is to write a basic unit testing framework in less than an hour, as this is a very good exercise for "Meta-Programming 101".
We look at some code for a very simple Unit Testing Framework, and there's really not a lot to it. Of course, real world unit testing frameworks contain many more "bells-and-whistles", but the basic code shown below performs the core functionality of a simple test runner:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

namespace ConsoleApp1
{
    public class Program
    {
        public static void Main(string[] args)
        {
            var testFinder = new TestFinder(args);
            var testExecutor = new TestExecutor();
            var testReporter = new TestReporter();

            var allTests = testFinder.FindTests();
            foreach (var test in allTests)
            {
                TestResult result = testExecutor.ExecuteSafely(test);
                testReporter.Report(result);
            }
        }
    }

    public class TestFinder
    {
        private readonly Assembly _testDll;

        public TestFinder(string[] args)
        {
            var assemblyname = AssemblyName.GetAssemblyName(args[0]);
            _testDll = AppDomain.CurrentDomain.Load(assemblyname);
        }

        public List<MethodInfo> FindTests()
        {
            var fixtures = _testDll.GetTypes()
                .Where(x => x.GetCustomAttributes()
                    .Any(c => c.GetType().Name.StartsWith("TestFixture")))
                .ToList();

            var allMethods = fixtures.SelectMany(f =>
                f.GetMethods(BindingFlags.Public | BindingFlags.Instance));

            return allMethods.Where(x => x.GetCustomAttributes()
                    .Any(m => m.GetType().Name.StartsWith("Test")))
                .ToList();
        }
    }

    public class TestExecutor
    {
        public TestResult ExecuteSafely(MethodInfo test)
        {
            try
            {
                var instance = Activator.CreateInstance(test.DeclaringType);
                test.Invoke(instance, null);
                return TestResult.Pass(test);
            }
            catch (Exception ex)
            {
                return TestResult.Fail(test, ex);
            }
        }
    }

    public class TestReporter
    {
        public void Report(TestResult result)
        {
            Console.Write(result.Exception == null ? "." : "x");
        }
    }

    public class TestResult
    {
        private Exception _exception = null;

        public Exception Exception
        {
            get => _exception;
            set => _exception = value;
        }

        public static TestResult Pass(MethodInfo test)
        {
            return new TestResult { Exception = null };
        }

        public static TestResult Fail(MethodInfo test, Exception ex)
        {
            return new TestResult { Exception = ex };
        }
    }
}
David then talks about the ASP.NET MVC framework. He says that it is a framework that, in essence, just finds and runs user code, which sounds oddly similar to a unit testing framework! Sure, there's additional functionality within the framework, but at a basic level, the framework simply accepts an HTTP request, finds the user code for the requested URL/route and runs that code (this is the controller action method). Part of running that code might be the invoking of a ViewEngine (i.e. Razor) to render some HTML which is sent back to the client at the end of the action method. Therefore, MVC is merely meta-programming which is bound to HTTP. This is a lot like an ASP.NET HttpHandler and, in fact, the very first version of ASP.NET MVC was little more than one of these.
David asks if we know why MVC was so successful. It was successful because of Rails. And why was Rails successful? Well, because it had sensible defaults. This approach is the foundation of the often-used "convention over configuration" paradigm. It allows users of the framework to easily "fall into the pit of success" rather than the "pit of failure" and therefore makes learning and working with the framework a pleasurable experience. David shows some more code here, which is his super simple MVC framework. Again, it's largely based on using reflection to find and invoke appropriate user code, and is really not at all dissimilar to the unit testing code we looked at earlier. We have a ProcessRequest method:
public void ProcessRequest(HttpContext context)
{
    var controller = PickController(context);
    var method = PickMethod(context, controller);
    var instance = Activator.CreateInstance(controller);

    var response = method.Invoke(instance, null);
    HttpContext.Current.Response.Write(response);
}
This is the method that orchestrates the entire HTTP request/response cycle of MVC. And the other methods called by the ProcessRequest method use the reflective meta-programming and are very similar to what we've already seen. Here's the PickController method, which we can see tries to find types whose names both start with a value from the route/URL and also end with "Controller". We can also see that we use a sensible default of "HomeController" when a suitable controller can't be found:
private Type PickController(HttpContext context)
{
    var url = context.Request.Url;
    Type controller = null;

    var types = AppDomain.CurrentDomain.GetAssemblies()
        .SelectMany(x => x.GetTypes()).ToList();

    controller = types.FirstOrDefault(x => x.Name.EndsWith("Controller")
                     && url.PathAndQuery.StartsWith(x.Name))
                 ?? types.Single(x => x.Name.StartsWith("HomeController"));

    return controller;
}
Next, we move on to the deconstruction of an IoC container framework. An IoC container framework is again a simple framework that works due to meta-programming and reflection. At its core, a container simply stores a dictionary of mappings from interfaces to types, and it exposes a method to register such a mapping, as well as a method to create an instance of a type based on a given interface. This creation is simply a recursive call, ensuring that all objects down the object hierarchy are constructed by the IoC container using the same logic to find each object's dependencies (if any). David shows us his own IoC container framework on one of his slides and it's only around 70 lines of code. It almost fits on a single screen. Of course, this is a very basic container and doesn't have all the real-world required features such as object lifetime management and scoping, but it does work and performs the basic functionality. I haven't shown David's code here as it's very similar to the other meta-programming code we've already looked at, but there are a number of examples of simple IoC containers out there on the internet, some written in only 15 lines of code!
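To give a flavour of just how small such a container can be (this is my own minimal sketch, not David's code from the talk), the following shows the registration dictionary and the recursive resolution described above, with no lifetime management or scoping at all:

using System;
using System.Collections.Generic;
using System.Linq;

// A deliberately tiny IoC container: just a registration dictionary
// and recursive constructor injection.
public class TinyContainer
{
    private readonly Dictionary<Type, Type> _registrations = new Dictionary<Type, Type>();

    public void Register<TInterface, TImplementation>() where TImplementation : TInterface
    {
        _registrations[typeof(TInterface)] = typeof(TImplementation);
    }

    public TInterface Resolve<TInterface>()
    {
        return (TInterface)Resolve(typeof(TInterface));
    }

    private object Resolve(Type type)
    {
        // Look up the concrete type registered for an interface;
        // unregistered concrete types simply resolve to themselves.
        var concrete = _registrations.TryGetValue(type, out var mapped) ? mapped : type;

        // Take the greediest public constructor and recursively resolve each of its parameters.
        var constructor = concrete.GetConstructors()
            .OrderByDescending(c => c.GetParameters().Length)
            .First();
        var arguments = constructor.GetParameters()
            .Select(p => Resolve(p.ParameterType))
            .ToArray();

        return constructor.Invoke(arguments);
    }
}

Used as container.Register<IFoo, Foo>(); followed by container.Resolve<IFoo>();, the recursion takes care of constructing whatever dependencies Foo's own constructor declares.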
After the demos, David talks about how we can actually use the reflection and meta-programming we've seen demonstrated in our own code as we're unlikely to re-write our MVC, Unit Testing or IoC frameworks. Well, there's a number of ways in which such introspective code can be used. One example is based upon some functionality for sending emails, a common enough requirement for many applications. We look at some all too frequently found code that has to branch based upon the type of email that we're going to be sending:
public string SendEmails(string emailType)
{
    var emailMerger = new EmailMerger();

    if (emailType == "Nightly")
    {
        var nightlyExtract = new NightlyEmail();
        var templatePath = "\\Templates\\NightlyTemplate.html";
        return emailMerger.Merge(templatePath, nightlyExtract);
    }

    if (emailType == "Daily")
    {
        var dailyExtract = new DailyEmail();
        var templatePath = "\\Templates\\DailyTemplate.html";
        return emailMerger.Merge(templatePath, dailyExtract);
    }

    throw new NotImplementedException();
}
We can see we're branching conditionally based upon a string that represents the type of email we'll be processing, either a daily email or a nightly one. However, by using reflective meta-programming, we can change the above code to something much more sophisticated:
public string SendEmails(string emailType)
{
    var strategies = new Email[] { new NightlyEmail(), new DailyEmail() };
    var selected = strategies.First(x => x.GetType().Name.StartsWith(emailType));

    var templatePath = "\\Templates\\" + selected.GetType().Name + ".html";

    return new EmailMerger().Merge(templatePath, selected);
}
Another way of using meta-programming within our own code is to perform automatic registrations for our DI/IoC containers. We often have hundreds or thousands of lines of manual registration, such as container.Register<IFoo, Foo>();, and we can simplify this by enumerating over all of the interfaces within our assemblies, looking for classes that implement each interface (and that are perhaps named with the same prefix), and automatically registering the interface and type with the IoC container. Of course, care must be taken here as such an approach may actually hide intent and is somewhat less explicit. In this regard, David says that with the great power available to us via meta-programming comes great responsibility, so we should take care to only use it to "make the obvious thing work, not make the right thing totally un-obvious".
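As a rough illustration of what that convention-based scanning might look like (again, my own sketch rather than code from the talk; the register delegate stands in for whatever registration API your particular container exposes), we can pair each class with a same-named interface and register them together:

using System;
using System.Linq;
using System.Reflection;

public static class ConventionRegistration
{
    // Scans an assembly for concrete classes named "Foo" that implement an
    // interface named "IFoo", and hands each pairing to the supplied
    // registration delegate (e.g. (i, c) => container.Register(i, c)).
    public static void RegisterByConvention(Assembly assembly, Action<Type, Type> register)
    {
        var classes = assembly.GetTypes()
            .Where(t => t.IsClass && !t.IsAbstract);

        foreach (var implementation in classes)
        {
            var matchingInterface = implementation.GetInterfaces()
                .FirstOrDefault(i => i.Name == "I" + implementation.Name);

            if (matchingInterface != null)
            {
                register(matchingInterface, implementation);
            }
        }
    }
}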
Finally, perhaps one of the best uses of meta-programming in this way is to help protect code quality. We can do this by using meta-programming within our unit tests to enforce some attribute of our code that we care about. One great example of this is to ensure that all classes within a given namespace have a specific suffix to their name. Here's a very simple unit test that ensures that all classes in a Factories namespace have the word "Factory" at the end of the class name:
[Test]
public void MakeSureFactoriesHaveTheRightNamingConventions()
{
    var types = AppDomain.CurrentDomain
        .GetAssemblies()
        .SelectMany(a => a.GetTypes())
        .Where(x => x.Namespace == "MyApp.Factories");

    foreach (var type in types)
    {
        Assert.That(type.Name.EndsWith("Factory"));
    }
}
After David's session was over it was time for another quick coffee break. As I had to change rooms this time, I decided to head back to the main foyer and grab a quick cup of coffee before immediately heading off to find the room for my next session. This session was James Murphy's A Gentle Introduction To Elm.
James starts by introducing the Elm language. Elm calls itself a "delightful language for reliable web apps". It's a purely functional language that transpiles to JavaScript and is a domain-specific language designed for developing web applications. Being a purely functional language allows Elm to make a very bold claim. No run-time exceptions!
James asks "Why use Elm?". Well, for one thing, it's not JavaScript! It's also functional, giving it all of the benefits of other functional languages such as immutability and pure functions with no side effects. Also, as it's a domain-specific language, it's quite small and is therefore relatively easy to pick up and learn. As it boasts no run-time exceptions, this means that if your Elm code compiles, it'll run and run correctly.
James talks about the Elm architecture and the basic pattern of implementation, which is Model-Update-View. The Model is the state of your application and its data. The Update is the mechanism by which the state is updated, and the View is how the state is represented as HTML. It's this pattern that provides reliability and simplicity to Elm programs. It's a popular, modern approach to front-end architecture, and the Redux JavaScript framework was directly inspired by the Elm architecture. A number of companies are already using Elm in production, such as Pivotal, NoRedInk, Prezi and many others.
Here's a simple example Elm file showing the structure using the Model-Update-View pattern. The pattern should be understandable even if you don't know the Elm syntax:
import Html exposing (Html, button, div, text)
import Html.Events exposing (onClick)

main =
    Html.beginnerProgram { model = 0, view = view, update = update }

type Msg = Increment | Decrement

update msg model =
    case msg of
        Increment ->
            model + 1

        Decrement ->
            model - 1

view model =
    div []
        [ button [ onClick Decrement ] [ text "-" ]
        , div [] [ text (toString model) ]
        , button [ onClick Increment ] [ text "+" ]
        ]
Note that the Elm code is generating the HTML that will be rendered by the browser. This is very similar to the React framework and how it also performs the rendering for the actual page's markup. This provides for a strongly-typed code representation of the HTML/web page, thus allowing far greater control and reasoning around the ultimate web page's markup.
You can get started with Elm by visiting the project's home page at elm-lang.org. Elm can be installed either directly from the website, or via the Node Package Manager (NPM). After installation, you'll have elm-repl - a REPL for Elm, elm-make - the Elm compiler, elm-package - the Elm package manager, and elm-reactor - the Elm development web server. One interesting thing to note is that Elm has strong opinions about cleanliness and maintainability, so with that in mind, Elm enforces semantic versioning on all of its packages!
James shows us some sample Elm statements in the Elm REPL. We see we can use all the standard and expected language elements: numbers, strings, defining functions etc. We can also use partial application, pipelining and lists/maps, which are common constructs within functional languages. We then look at the code for a very simple "Hello World" web page, using the Model-Update-View pattern that Elm programs follow. James is using Visual Studio Code as his code editor here, and he informs us that there's mature tooling available to support Elm within Visual Studio Code.
We expand the "Hello World" page to allow user input via a textbox on the page, printing "Hello" and then the user's input. Due to the continuous Model-Update-View loop, the resulting page is updated with every key press in the textbox, and this is controlled by the client-side JavaScript that has been transpiled from the Elm functions. James shows this code running through the Elm Reactor development web server. On very nice feature of Elm Reactor is that is contains built-in "time-travel" debugging, meaning that we can enumerate through each and every "event" that happens within our webpage. In this case, we can see the events that populate the "Hello <user>" text character-by-character. Of course, it's possible to only update the Hello display text when the user has finished entering their text and presses the Enter key in the textbox, however, since this involves maintaining state, we have to perform some interesting work in our Elm code to achieve it.
James shows us how Elm can respond to events from the outside world. He writes a simple function that will respond to system tick events to show an ever-updating current time display on the web page. James shows how we can work with remote data by defining specific types (unions) that represent the data we'll be consuming, and these types are then added to the Elm model that forms the state/data for the web page. One important thing to note here is that we need to be able to represent not only the data, but also the absence of any data, with specific types that represent the lack of data. This is, of course, due to Elm being a purely functional language that does not support the concept of null.
The crux of Elm's processing is taking some input (in the form of a model and a message), performing the processing and responding with both a model and a message. Each Elm file has an "init" section that deals with the input data. The message that is included in that data can be a function, and could be something that would access a remote endpoint to gather data from a remote source. This newly acquired data can then be processed in the "Update" section of the processing loop, ultimately for returning as part of the View's model/message output. James demonstrates this by showing us a very simple API that he's written, implementing a simple To-Do list. The API endpoint exposes a JSON response containing a list of to-do items. We then see how this API endpoint can be called from the Elm code by using a custom-defined message that queries the API endpoint and pulls in the various to-do items, processes them and writes that data into the Elm output model, which is ultimately nicely rendered on the web page.
Elm contains a number of rich packages out-of-the-box, such as an HTTP module. This allows us to perform HTTP requests and responses using most of the available HTTP verbs with ease:
import Http exposing (get, Error)
import Json.Decode exposing (list, string)
import Task exposing (Task)

items : Task Error (List String)
items =
    get (list string) "http://example.com/to-do-items.json"
Or:
corsPost : Request
corsPost =
    { verb = "POST"
    , headers =
        [ ("Origin", "http://elm-lang.org")
        , ("Access-Control-Request-Method", "POST")
        , ("Access-Control-Request-Headers", "X-Custom-Header")
        ]
    , url = "http://example.com/hats"
    , body = empty
    }
It's important to note, however, that not all HTTP verbs are available out-of-the-box and some verbs, such as PATCH, will need to be manually implemented.
James wraps up his session by talking about the further ecosystem around the Elm language. He mentions that Elm has its own testing framework, ElmTest, and that you can very easily achieve a very high amount of code coverage when testing in Elm due to it being a purely functional language. Also, adoption of Elm doesn't have to be an all-or-nothing proposition. Since Elm transpiles to JavaScript, it can play very well with existing JavaScript applications. This means that Elm can be adopted in a piecemeal fashion, with only small sections of a larger JavaScript application being replaced by their Elm equivalent, perhaps to ensure high code coverage or to benefit from improved robustness and reduced possibility of errors.
Finally, James talks about how to deploy Elm applications when using Elm in a real-world production application. Most often, Elm deployment is performed using WebPack, a JavaScript module bundler. This often takes the form of shipping a single small HTML file containing the necessary script inclusions for it to bootstrap the main application.
After James' session was over, it was time for lunch. All the attendees made their way back to the main foyer area where a delicious lunch of a selection of sandwiches, fruit, crisps and chocolate was available to us. As is customary at the various DDD events, there were to be a number of grok talks taking place over the lunch period. As I'd missed the grok talks at the last few DDD events I'd attended, I decided that I'd make sure I caught a few of the talks this time around.
I missed the first few talks as the queue for lunch was quite long and it took a little while to get all attendees served, however, after consuming my lunch in the sunny outdoors, I headed back inside to the large lecture theatre where the grok talks were being held. I walked in just to catch the last minute of Phil Pursglove's talk on Azure's CosmosDB, which is Microsoft's globally distributed, multi-model database. Unfortunately, I didn't catch much more than that, so you'll have to follow the link to find out more. (Update: Phil has kindly provided a link to a video recording of his talk!)
The next grok talk was Robin Minto's OWASP ZAP FTW talk. Robin introduces us to OWASP, which is the Open Web Application Security Project and exists to help create a safer, more secure web. Robin then mentions ZAP, which is a security testing tool produced by OWASP. ZAP is the Zed Attack Proxy and is a vulnerability scanner and intercepting proxy to help detect vulnerabilities in your web application. Robin shows us a demo application he's built containing deliberate flaws, Bob's Discount Diamonds. This is running on his local machine. He then shows us a demo of the OWASP ZAP tool and how it can intercept all of the requests and responses made between the web browser and the web server, analysing those responses for vulnerabilities and weaknesses. Finally, Robin shows us that the OWASP ZAP software contains a handy "fuzzer" capability which allows it to replay requests using lists of known data or random data - i.e. can replay sending login requests with different usernames/passwords etc.
The next grok talk was an introduction to the GDPR by John Price. John introduces the GDPR, which is the new EU-wide General Data Protection Regulation and effectively replaces the older Data Protection Act in the UK. GDPR, in a nutshell, means that users of data (mostly companies who collect a person's data) need to ask permission from the data owner (the person to whom that data belongs) for the data and for what purpose they'll use that data. Data users have to be able to prove that they have a right to use the data that they've collected. John tells us that adherence to the GDPR in the UK is not affected by Brexit as it's already enshrined in UK law and has been since April 2016, although it's not really been enforced up to this point. It will start to be strictly enforced from May 2018 onwards. We're told that, unlike the previous Data Protection Act, violations of the regulations carry very heavy penalties, usually starting at 20 million Euros or 4% of a company's turnover. There will be some exceptions to the regulations, such as for the police and military, but also exceptions for private companies too, such as a mobile phone network provider giving up a person's data due to an "immediate threat to life". Some consent can be implied, so for example, entering your car's registration number into a web site for the purposes of getting an insurance quote is implied permission to use the registration number that you've provided, but the restriction is that the data can only be used for the specific purpose for which it was supplied. GDPR will force companies to declare if data is sent to third parties. If this happens, the company initially taking the data, and each and every third party that receives that data, have to inform the owner of the data that they are in possession of it. GDPR is regulated by the Information Commissioner's Office in the UK. Finally, John says that the GDPR may make certain businesses redundant. He gives the example of credit reference agencies, whose whole business model is built on non-consensual usage of data, so it will be interesting to see how GDPR affects industries like these.
After John's talk, there was a final grok talk however, I needed a quick restroom break before the main sessions of the afternoon, so headed off for my restroom break before making my way back to the room for the first of the afternoon's sessions. This was Matt Ellis's How To Parse A File.
Matt starts his session by stating that his talk is all about parsing files, but he immediately says, "But, don't do it!" He tells us that it's a solved problem and we really shouldn't be writing code to parse files by hand for ourselves; we should just use one of the many excellent libraries out there instead. Matt does discuss why you might decide you really need to parse files for yourself. Perhaps you need better speed and efficiency, or maybe it's to reduce dependencies or to parse highly specific custom formats. It could even be parsing for things that aren't files at all, such as HTTP headers, standard output etc. From here, Matt mentions that he works for JetBrains and that the introduction of simply parsing a file is a good segue into talking about some of the features that can be found inside many of JetBrains' products.
Matt starts by looking at the architecture of many of JetBrains' IDEs and developer tools such as ReSharper. They're built with a similar architecture and they all rely on a layer that they call the PSI layer. The PSI layer is responsible for parsing, lexing and understanding the user code that the tool is working on. Matt says that he's going to use the Unity framework to show some examples throughout his session and that he's going to attempt to build up a syntax tree for his Unity code. We first look at a hand-rolled parser, this one attempting to understand the code by examining one character at a time. It's a very laborious approach and prone to error, so this is an approach to parsing that we shouldn't use. Matt tells us that the best approach, which has been "solved" many times in the past, is to employ the services of a lexer. This is a processor that turns the raw code into meaningful tokens based upon the words and vocabulary of the underlying code or language and gives structure to those tokens. It's from the output of the lexer that we can more easily and robustly perform the parsing. Lexers are another solved problem, and many lexers already exist for popular programming languages, such as lex, CsLex, FsLex, flex, JFlex and many more.
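To make the idea of lexing a little more concrete (a sketch of my own rather than anything Matt showed), here's a trivial hand-rolled tokeniser in C# that turns an input string into a stream of number, identifier and symbol tokens:

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class Token
{
    public string Kind { get; }
    public string Value { get; }
    public Token(string kind, string value) { Kind = kind; Value = value; }
}

public static class TinyLexer
{
    // Each token kind is matched by a regular expression anchored (\G) at the
    // current position; whitespace is recognised but not emitted as a token.
    private static readonly (string Kind, Regex Pattern)[] Rules =
    {
        ("number",     new Regex(@"\G\d+(\.\d+)?")),
        ("identifier", new Regex(@"\G[A-Za-z_][A-Za-z0-9_]*")),
        ("symbol",     new Regex(@"\G[{}();=+\-*/]")),
        ("whitespace", new Regex(@"\G\s+"))
    };

    public static IEnumerable<Token> Tokenise(string source)
    {
        var position = 0;
        while (position < source.Length)
        {
            var matched = false;
            foreach (var (kind, pattern) in Rules)
            {
                var match = pattern.Match(source, position);
                if (!match.Success) continue;

                if (kind != "whitespace")
                {
                    yield return new Token(kind, match.Value);
                }
                position += match.Length;
                matched = true;
                break;
            }
            if (!matched)
            {
                throw new Exception($"Unexpected character '{source[position]}' at position {position}");
            }
        }
    }
}

Tokenising "total = 42;" yields an identifier, a symbol, a number and another symbol - exactly the kind of token stream that a parser then consumes to build a syntax tree.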
Lexer generator tools like the ones just mentioned generate source code for the lexer, but it's not human-readable code. It's similar to how .NET language code (C# or VB.NET) is first compiled to Intermediate Language prior to being JIT'ed at runtime. The token output from the lexer is read by the parser, and from there the parser can try to understand the grammar and structure of the underlying code via syntactical analysis. This often involves the use of Regular Expressions in order to match specific tokens or sets of tokens, which works particularly well as Regular Expressions can be translated into a state machine and, from there, into a transition table. Parsers understand the underlying code that they're designed to work on, so for example, a parser for C# would know that in a class declaration there would be the class name, preceded by a token indicating the scope of the class (public, private etc.). Parsing is not a completely solved problem. It's more subjective, so although solutions exist, they're more disparate and specific to the code or language that they're used for and, therefore, they're not a generic solution.
Matt tells us how parsing can be done either top-down or bottom-up. Top-down parsing starts at the highest level construct of the language, for example at a namespace or class level in C#, and it then works its way down to the lower level constructs from there - through methods and the code and locally scoped variables in those methods. Bottom-up parsing works the opposite way around, starting with the lower level constructs of the language and working back up to the class or namespace. Bottom-up parsers can be beneficial over top-down ones as they have the ability to utilise shift-reduce algorithms to simplify code as it's being parsed. Parsers can even be "parser combinators". These are parsers built from other, simpler, parsers, where the input to the next parser in the chain is the output from the previous parser in the chain, more formally known as recursive-descent parsing. .NET's LINQ acts in a similar way to this. Matt tells us about FParsec, a parser combinator library for F#, along with a C# parser combinator library called Sprache, which itself relies heavily on LINQ:
Parser<string> identifier =
    from leading in Parse.WhiteSpace.Many()
    from first in Parse.Letter.Once()
    from rest in Parse.LetterOrDigit.Many()
    from trailing in Parse.WhiteSpace.Many()
    select new string(first.Concat(rest).ToArray());

var id = identifier.Parse(" abc123 ");

Assert.AreEqual("abc123", id);
Matt continues by asking us to consider how parsers will deal with whitespace in a language. This is not always as easy as it sounds, as some languages, such as F# or Python, use whitespace to give semantic meaning to their code, whilst other languages such as C# use whitespace purely for aesthetic purposes. In dealing with whitespace, we often make use of a filtering lexer. This is a simple lexer that specifically detects and removes whitespace prior to parsing. The difficulty then is that, for languages where whitespace is significant, we need to replace the removed whitespace after parsing. This, again, can be tricky as the parsing may alter the actual code (i.e. in the case of a refactoring), so we must again be able to understand the grammar of the language in order to re-insert whitespace in the correct places. This is often accomplished by building something known as a Concrete Parse Tree, as opposed to the more usual Abstract Syntax Tree. Concrete Parse Trees work in a similar way to a C# Expression Tree, breaking down code into a hierarchical graph of individual code elements.
Matt tells us about other uses for lexers, such as the ability to determine specific declarations in the language. For example, in F#, typing 2. would represent a floating point number, whereas typing [2..0] would represent a range. When the user is only halfway through typing, how can we know if they require a floating point number or a range? There are also things such as comments within comments, for example: /* This /* is */ valid */. This is something that lexers can be good at, as such matching is difficult to impossible with regular expressions.
The programs that use lexers and parsers can often have very different requirements, too. Compilers using them will generally want to compile the code, and so they'll work on the basis that the program code that they're lexing/parsing is assumed correct, whilst IDE's will take the exact opposite approach. After all, most of the time whilst we're typing, our code is in an invalid state. For those programs that assume the code is in an invalid state most of the time, they often use techniques such as error detection and recovery. This is, for example, to prevent your entire C# class from being highlighted as invalid within the IDE just because the closing brace character is missing from the class declaration. They perform error detection on the missing closing brace, but halt highlighting of the error at the first "valid" block of code immediately after the matching opening brace. This is how the Visual Studio IDE is able to only highlight the missing closing brace as invalid and not the entire file full of otherwise valid code. In order for this to be performant, lexers in such programs will make heavy use of caching to prevent having to continually lex the entire file with every keystroke.
Finally, Matt talks about how JetBrains often need to also deal with "composable languages". These are things like ASP.NET MVC's Razor files, which are predominantly comprised of HTML mark-up, but which can also contain "islands" of C# code. For this, we take a similar approach to dealing with whitespace in that the file is lexed for both languages, HTML and C# and the HTML is temporarily removed whilst the C# code is parsed and possibly altered. The lexed tokens from both the C# and the preserved HTML are then re-combined after the parsing to re-create the file.
After Matt's session, there was one final break before the last session of the day. Since there was, unfortunately, no coffee served at this final break, I made my way directly to the room for my next and final session, Joe Stead's .NET Core In The Real World.
Joe starts his talk by announcing that there's no code or demos in his talk, and that his talk will really just be about his own personal experience of attempting to migrate a legacy application to .NET Core. He says that he contemplated doing a simple "Hello World" style demo for getting started with .NET Core, but that it would give a false sense of .NET Core being simple. In the real-world, and when migrating an older application, it's a bit more complicated than that.
Joe mentions the .NET Standard and reminds us that it's a different thing than .NET Core. .NET Core does adhere to the .NET Standard and Joe tells us that .NET Standard is really just akin to Portable Class Libraries Version 2.0.
Joe introduces the project that he currently works on at his place of employment. It's a system that started life in 2002 and was originally built with a combination of Windows Forms applications and ASP.NET Web Forms web pages, sprinkled with Microsoft AJAX JavaScript. The system was in need of being upgraded in terms of the technologies used, and so in 2012 they migrated to KnockoutJS for the front-end websites, and in 2013, to further aid with the transition to KnockoutJS, they adopted the NancyFX framework to handle the web requests. Improvements in the system continued, and by 2014 they had started to support the Mono Framework and had moved from Microsoft SQL Server to a PostgreSQL database. This last lot of technology adoptions was to support the growing demand from their user base for a Linux version of their application. The adoptions didn't come without issues, however, and by late 2014 they had started to experience serious segfaults in their application. After some diagnosis, during which they never did fully get to the bottom of the root cause of the segfaults, they decided to adopt Docker in 2015 as a means of mitigating the segfault problem: if one container started to display problems associated with segfaults, they could kill the container instance and create a new one. At this point, in 2015, they also decided that they'd start to look into .NET Core. It was only in beta at this time, but they were looking for a better platform than Mono that might provide some much-needed stability and consistency across operating systems. And since they were on such a roll with changing their technology stacks, they decided to move to Angular2 on the front-end, replacing KnockoutJS, in 2016 as well!
By 2017, they'd adopted .NET Core v1.1 along with RabbitMQ and Kubernetes. Joe states that the reason for .NET Core adoption was to move away from Mono. By this point, they were not only targeting Mono, but a custom build of Mono that they'd had to fork in order to try to fix their segfault issues. They needed much more flexible deployments such as the ability to package and deploy multiple versions of their application using multiple different versions of the underlying .NET platform on the same machine. This was problematic in Mono, as it can be in the "full" .NET Framework, but one of the benefits of .NET Core is the ability to package the run-time with your application, allowing true side-by-side versions of the run-time to exist for different applications on the same machine.
Joe talks about some of the issues encountered when adopting and migrating to .NET Core. The first issue was missing APIs. .NET Core 1.0 and 1.1 were built against .NET Standard 1.x, and so many APIs and namespaces were completely missing. Joe also found that many NuGet packages that his solution was dependent upon had not yet been ported across to .NET Core. Joe recalls that testing of the .NET Core version of the solution was a particular challenge as few other people had adopted the platform, and the general response from Microsoft themselves was that "it's coming in version 2.0!". What really helped save the day for Joe and his team was that .NET Core itself and many of the NuGet packages were open source. This allowed them to fork many of the projects that the NuGet packages were derived from and help with transitioning them to support .NET Core. Joe's company even employed a third party to work full-time on helping to port NancyFX to .NET Core.
Joe now talks about the tooling around .NET Core in the early days of the project. We examine how Microsoft introduced a whole new project file structure, moving away from the XML representation in the .csproj files to a JSON representation with project.json. Joe explains how they had to move their build scripts and build tooling to the FAKE build tool as a result of the introduction of project.json. There were also legal issues around using the .NET Core debugger assemblies in tools other than Microsoft's own IDE's, something that the JetBrains Rider IDE struggled with. We then look at tooling in the modern world of .NET Core: project.json has gone away and things have reverted back to .csproj files, although they're much simplified and improved. This allows the use of MSBuild again, however FAKE itself now has native support for .NET Core. The dotnet CLI tool has improved greatly and the legal issues around the use of the .NET Core debugging assemblies have been resolved, allowing third-party IDE's such as JetBrains Rider to use them again.
Joe also mentions how .NET Core now, with the introduction of version 2.0, is much better than the Mono Framework when it comes to targeting multiple run-times. He also mentions issues that plagued their use of libcurl on the Mac platform when using .NET Core 1.x, but these have now been resolved in .NET Core 2.0, as .NET Core 2.0 now uses the native macOS implementation rather than trying to abstract that and use its own implementation.
Joe moves on to discuss something that's not really specific to .NET Core, but is a concern when developing code to be run on multiple platforms. He shows us the following two lines of code:
TimeZoneInfo.FindSystemTimeZoneById("Eastern Standard Time");
TimeZoneInfo.FindSystemTimeZoneById("America/New_York");
He asks which is the "correct" one to use. Well, it turns out that they're both correct. And possibly incorrect! The top line works on Windows, and only on Windows, whilst the bottom line works on Linux, and not on Windows. It's therefore incredibly important to understand such differences when targeting multiple platforms with your application. Joe also says how, as a result of discrepancies such as the timezone issue, the tooling can often lie. He recalls a debugging session where one debugging window would show the value of a variable with one particular date time value, and another debugging window - in the exact same debug session - would interpret and display the same variable with an entirely different date time value. Luckily, most of these issues are now largely resolved with the stability that's come from recent versions of .NET Core and the tooling around it.
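One common way to cope with this difference (my own hedged sketch, not something Joe showed) is to try one identifier and fall back to the other, since TimeZoneInfo.FindSystemTimeZoneById throws a TimeZoneNotFoundException when given an identifier the current platform doesn't recognise:

using System;

public static class TimeZones
{
    // Tries the Windows identifier first and falls back to the IANA identifier,
    // so the same code works on Windows, Linux and macOS. (Libraries such as
    // TimeZoneConverter solve this mapping more generally.)
    public static TimeZoneInfo FindEasternTime()
    {
        try
        {
            return TimeZoneInfo.FindSystemTimeZoneById("Eastern Standard Time");
        }
        catch (TimeZoneNotFoundException)
        {
            return TimeZoneInfo.FindSystemTimeZoneById("America/New_York");
        }
    }
}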
In wrapping up, Joe says that, despite the issues they encountered, moving to .NET Core was the right thing for him and his company. He does say, though, that for other organisations, such a migration may not be the right decision. Each company, and each application, needs to be evaluated for migration to .NET Core on its own merits. For Joe's company, the move to .NET Core allowed them to focus attention elsewhere after migration. They've since been able to adopt Kubernetes. They've been able to improve and refactor code to implement better testing and many more long-overdue improvements. In the last month, they've migrated again from .NET Core 1.1 to .NET Core 2.0, which was a relatively easy task after the initial .NET Core migration; this one only involved the upgrading of a few NuGet packages and that was it. The move to .NET Core 2.0 also allowed them to re-instate lots of code and functionality that had been temporarily removed, thanks to the new, vastly increased API surface area of .NET Core 2.0 (really, .NET Standard 2.0).
After Joe's session, it was time for all the attendees to gather in the main foyer area of the university building for the final wrap-up and prize draws. After thanking the sponsors, the venue, and the organisers and volunteers, without whom, of course, events such as DDD simply wouldn't be able to take place, we moved onto the prize draw. Unfortunately, I wasn't a winner, however, the day had been brilliant.
Another wonderful DDD event had been and gone, but a great day was had by all. We were told that the next DDD event was to be DDD Dublin, held sometime around March 2018. So there's always that to look forward to.