I’m now an MCSD in Application Lifecycle Management!

Well, after previously saying that I’d give the pursuit of further certifications a bit of a rest, I’ve gone and acquired yet another Microsoft Certification.  This one is Microsoft Certified Solutions Developer – Application Lifecycle Management.

It all started around the beginning of January this year when Microsoft sent out an email with a very special offer.  Register via Microsoft’s Virtual Academy and you would be sent a 3-for-1 voucher for selected Microsoft exams.  Since the three exams required to achieve the Microsoft Certified Solutions Developer – Application Lifecycle Management certification were included within this offer, I decided to go for it.  I’d pay for only the first exam and get the other two for free!

So, having acquired my voucher code, I proceeded to book myself in for the first of the 3 exams.  “Administering Visual Studio Team Foundation Server 2012” was the first exam which I’d scheduled for the beginning of February.  Although I’d had some previous experience of setting up, configuring and administering Team Foundation Server, that was with the 2010 version of the product.  I realised I needed to both refresh and update my skills.  Working on a local copy of TFS 2012 and following along with the “Applying ALM with Visual Studio 2012 Jumpstart” course on Microsoft’s Virtual Academy site, as well as studying with the excellent book, “Professional Scrum Development with Microsoft Visual Studio 2012” that is recommended as a companion/study guide for the MCSD ALM exams, I quickly got to work.

I sat and passed the first exam in early February this year.  Feeling energised by this, I quickly returned to the Prometric website to book the second of the three exams, “Software Testing with Visual Studio 2012”, which was scheduled for March of this year.  I’d mistakenly thought this was all about unit testing within Visual Studio, and whilst some of that was included in this course, it was really all about Visual Studio’s “Test Manager” product.  The aforementioned Virtual Academy course and the book covered all of this course’s content, however, so continued study with those resources along with my own personal tinkering helped me tremendously.  When the time came I sat the exam and, amazingly, passed with full marks!

So, with 2 exams down and only 1 to go, I decided to plough on and scheduled my third and final exam for late in April.  This final exam was “Delivering Continuous Value with Visual Studio 2012 Application Lifecycle Management” and was perhaps the most abstract of all of the exams, focusing on agility, project management and best practices around the “softer” side of software development.  Continued study with the aforementioned resources was still helpful, however, when the time came to sit the exam, I admit that I felt somewhat underprepared for this one.  But sit the exam I did, and whilst I ended up with my lowest score from all three of the exams, I still managed to score enough to pass quite comfortably.

So, with all three exams sat and passed, I was awarded the “Microsoft Certified Solutions Developer – Application Lifecycle Management” certification.  I’ll definitely slow down with my quest for further certifications now… Well, unless Microsoft send me another tempting email with a very “special” offer included!

SSH over SSL with BitBucket & GitHub.

I’ve recently decided to switch to using SSH (Secure Shell) access for all of my repositories on both BitBucket & GitHub.  I was previously using HTTPS access, however this frequently means that you end up with hard-coded usernames and passwords inside your Mercurial and Git configuration files.  Not the most secure approach.

I switched over to using SSH Keys for access to both BitBucket and GitHub and I immediately ran into a problem.  SSH access is, by default, done over Port 22, however this is not always available for use.  In a corporate environment, or over public Wi-Fi, this port is frequently blocked.  Fortunately, GitHub and BitBucket both allow using SSH over the port that is used for SSL (HTTPS) traffic instead, as this is almost never blocked (both Port 80 (HTTP) and 443 (HTTPS) are required for web browsing).

Setting this up is usually easy enough, but there can be a few slightly confusing parts to ensuring your SSH Keys are entered in the correct format and making sure you’re using the correct URI to access your repositories.  I found BitBucket that little bit easier to configure, and initially struggled with GitHub.  I believe this is primarily because GitHub is more geared towards Unix and OpenSSH users rather than Windows and PuTTY users.

Setting Up The Keys

The first step is to ensure that the SSH Key is in the correct format to be added to either your GitHub or BitBucket account.  If you’re using PuTTYGen to generate your SSH keys, the easiest way is to simply copy & paste the key from the PuTTYGen window:

puttygen

In my own experience, I’ve found that BitBucket is slightly more forgiving of the exact format of the SSH Key.  I’d previously opened my private SSH Key files (.ppk file extension) in Notepad and copied and pasted from there.  When viewed this way, the SSH Key is rendered in an entirely different format as shown below:

puttygenkeytext

It seems that BitBucket will accept copying and pasting the “public” section from this file (identified as the section between the lines “Public-Lines: 6” and “Private-Lines: 14”) however GitHub won’t.  Copying and pasting from the PuTTYGen window, though, will consistently work with both BitBucket & GitHub.

Configuring Client Access

The next step is to configure your client to correctly access your BitBucket and GitHub repositories using SSH over the HTTPS/SSL port.  Personally, I’ve been using TortoiseHG for some time now for my Mercurial repositories, but recently I’ve decided to switch to Atlassian’s Sourcetree as it allows me to work with both Mercurial & Git repositories from the same UI.  (I’m fairly comfortable with Mercurial from the command line, too, but never really got around to learning Git from the command line.  Maybe I’ll come back to it one day!)

BitBucket has a very helpful page in their documentation that details the URI that you’ll need to use in order to correctly use SSH over Port 443.  It’s a bit different from the standard SSH URI that you get from the BitBucket repository’s “home page”.

normalssh2

Note the altssh.bitbucket.org domain rather than the standard bitbucket.org one!  You’ll also need to add the port to the end of the domain as shown in the image.

Configuring the client access for GitHub was a little bit trickier.  Like BitBucket, GitHub has a page in their documentation relating to using SSH over SSL, however, this assumes you’re using the ssh command line tool, something that’s there by default in Unix/Linux but not there on Windows (although a 3rd party implementation of OpenSSH does exist).  The GitHub help page suggests changing your SSH configuration to “point” at the ssh.github.com host name and run over Port 443.  That’s easy if you’re using the command line OpenSSH client, but if you’re using something like Tortoise or in my case, SourceTree, that’s not so easy.

The way to achieve this is to forget about fiddling with configuration files, and just ensure that you use the correct URI, correctly formed, in order to establish a connection to GitHub with SSH over SSL.  The standard SSH URL provided by GitHub on any of your GitHub repository homepages suggests that the URL should follow this kind of format:

git@github.com:craigtp/craigtp.github.io.git

That’s fine for “normal” SSH access where SSH connects over the standard, default port of 22.  You’ll need to change that URL if you want to use SSH over the SSL port (Port 443).  The first thing to notice is that the colon within the URL above separates the domain from the username.  Ordinarily, colons in URLs separate the domain from the port number to be used, however here we’re going to add the port number separated by a colon from the domain and move the username part to be separated after the port number by a slash.  We also need to change the actual domain from git@github.com to git@ssh.github.com.

ghsshssl

Therefore our SSH over SSL URL becomes:

git@ssh.github.com:443/craigtp/craigtp.github.io.git

instead of:

git@github.com:craigtp/craigtp.github.io.git

(Obviously, replace the craigtp.github.io.git part of the URL with the relevant repository name!)

It’s a simple enough change, but one that’s not entirely obvious at first.
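The rewrite described above is mechanical enough to sketch in a few lines of Python.  This is only an illustration of the string change, not something either service provides: the function name `to_ssh_over_443` and the host table are my own naming, and it only handles the `user@host:path` form shown above.

```python
# Hosts that accept SSH on port 443, as described in this post.
ALT_SSH_HOSTS = {
    "github.com": "ssh.github.com",
    "bitbucket.org": "altssh.bitbucket.org",
}


def to_ssh_over_443(url: str) -> str:
    """Rewrite 'user@host:path' into 'user@althost:443/path'."""
    user_and_host, sep, path = url.partition(":")
    if not sep or "@" not in user_and_host:
        raise ValueError(f"not a user@host:path style URL: {url!r}")
    user, host = user_and_host.split("@", 1)
    if host not in ALT_SSH_HOSTS:
        raise ValueError(f"no known port-443 host for {host!r}")
    # The colon now introduces the port; the path follows after a slash.
    return f"{user}@{ALT_SSH_HOSTS[host]}:443/{path}"


print(to_ssh_over_443("git@github.com:craigtp/craigtp.github.io.git"))
```

If a client rejects the bare form, it may be worth trying the same string with an explicit `ssh://` prefix in front of it.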

DDD North 2013 In Review

dddnorthlogo

On Saturday 12th October 2013, in a slightly wet and windy Sunderland, the 3rd DDD North Developer conference took place.  DDD North events are free one-day conferences for .NET and the wider development community, run by developers for developers.  This was the 3rd DDD North, and my 3rd DDD event in general (I’d missed the first DDD North, but did get to attend DDD East Anglia earlier this year) and this year’s DDD North was better than ever.

 

The day started when I arrived at the University Of Sunderland campus.  I was travelling from Newcastle after having travelled to the North-East on the Friday evening beforehand.  I’m lucky in that I have in-laws in Newcastle so was staying with them for the duration of the weekend making the journey to Sunderland fairly easy.  Well, not that easy.  I don’t really know Sunderland so I’d had to use my Sat-Nav which was great until we got close to the City Centre at which point my Sat-Nav took me on an interesting journey around Sunderland’s many roundabouts! :)

 

I eventually arrived at the Sir Tom Cowie Campus at the University of Sunderland and parked my car, thanks to the free (and ample) car parking provided by the university.


I’d arrived reasonably early for the registration, which opened at 8:30am, however there was still a small queue which I dutifully joined to wait to be signed in.  Once I was signed in, it was time to examine the goodie bag that had been handed to me upon entrance to see what was inside.  There was some promotional material from some of the great sponsors of the event as well as a pen (very handy, as I always forget to bring pens to these events!) along with other interesting swag (the pen-cum-screwdriver was a particularly interesting item).

 

The very next task was to find breakfast!  Again, thanks to some of the great sponsors of DDDNorth, the organisers were able to put on some sausage and bacon breakfast rolls for the attendees.  This was a very welcome addition to the catering that was provided last time around at DDD North.

 


Once the bacon roll had been acquired, I was off to find perhaps the most important part of the morning’s requirements.  Caffeine.  Now equipped with a bacon roll and a cup of coffee, I was ready for the long but very exciting day of sessions ahead of me.

 

DDD North is somewhat larger than DDD East Anglia (although the latter will surely grow over time) so whereas DDD East Anglia had 3 parallel tracks of sessions, DDD North has 5!  This can frequently lead to difficulties in deciding which session to attend but it is really testament to the variety and quality of the sessions at DDD North.  So, having made the difficult choices of which sessions to attend, I headed off to the room for my first session.

 

The first session up was Phil Trelford’s F# Eye 4 the C# Guy.  This session was one of three sessions during the day dedicated to F#.  Phil’s session was aimed at developers currently using C# and he starts off by saying that, although F# offers some advantages over C#, there’s no “one true language” and it’s often the correct approach to use a combination of languages (both C# and F#) within a single application.  Phil goes on to talk about the number and variety of companies that are currently using and taking advantage of the features of F#.  F# was used within Halo 3 for the multi-player component which uses a self-improving machine learning algorithm to monitor, rate and intelligently match players with similar abilities together in games.  This same algorithm was also tweaked and later used within the Bing search engine to match adverts to search queries.  Phil also shares with us a quotation from a company called Kaggle who were previously predominantly a C# development team and who moved a lot of their C# code to F# with great success.  They said that their F# was “consistently shorter, easier to read, easier to refactor and contained far fewer bugs” compared to the equivalent C# code.

 

Phil talks about the features of the F# language next.  It’s statically typed and multi-paradigm.  Phil states that it’s not entirely a functional language, but is really “functional first” and is also object-oriented.  It’s also completely open source!  Phil’s next step is to show a typical class in C#, the standard Person class with Name and Age properties:

 

public class Person
{
    private string _Name;
    private int _Age;

    public Person(string name, int age)
    {
        _Name = name;
        _Age = age;
    }

    public string Name
    {
        get { return _Name; }
        set { _Name = value; }
    }

    public int Age
    {
        get { return _Age; }
        set { _Age = value; }
    }

    public override string ToString()
    {
        return string.Format("{0} {1}", _Name, _Age);
    }
}

 

Phil’s point here is that although this is a simple class with only two properties, the amount of times that the word “name” or “age” is repeated is excessive.  Phil calls this the “Local Government Pattern” as everything has to be declared in triplicate! :)  Here’s the same class, with the same functionality, but written in F#:

 

namespace People

type Person (name, age) = 
    member person.Name = name
    member person.Age = age

    override person.ToString() = 
        sprintf "%s %d" name age

 

Much shorter, and with far less repetition.  But it can get far better than that.  Here’s the same class again (albeit minus the .ToString() override) in a single line of F#:

 

type Person = { Name: string; Age: int }

 

Phil continues his talk to discuss how, being a fully-fledged, first class citizen of a language in the .NET world, F# code and components can fully interact with C# components, and vice-versa.  F# also has the full extent of the .NET Framework at its disposal, too.  Phil shows some more F# code, this one being something called a “discriminated union”:

 

type Shape = 
      | Circle of float 
      | Square of float 
      | Rectangle of float * float 

I’d come across discriminated unions before, but as an F# newbie, I only barely understood them.  Something that really helped me at least, as a C# guy, was when Phil explained the IL that is generated from the code.  In the above example, the Shape class is defined as an abstract base class and the Circle, Square and Rectangle classes are concrete implementations of the abstract Shape class!  Although thinking of these unions as base and derived classes isn’t strictly true when thinking of F# and its functional-style paradigm, it certainly helped me in mentally mapping something in F# back to the equivalent concept in C# (or a more OOP-style language).

 

Phil continues by mentioning some of the best ways to get up to speed with the F# language.  One of Phil’s favourite methods for complete F# newbies, is the F# Koans GitHub repository.  Based upon the Ruby Koans, this repository contains “broken” F# code that is covered by a number of unit tests.  You run the unit tests to see them fail and your job is to “fix” the broken code, usually by “filling in the blanks” that are purposely left there, thereby allowing the test to pass.  Each time you fix a test, you learn a little more about the F# syntax and the language constructs.  I’ve already tried the first few of these and they’re a really good mechanism for a beginner to use to get to grips with F#.  Phil states that he uses these Koans to train new hires in F# for the company he works for.  Phil also gives a special mention to the tryfsharp.org website which also allows newbies to F# to play with the language.  What’s special about tryfsharp.org is that you can try out the F# language entirely from within your web-browser, needing no other installed software on your PC.  It even contains full IntelliSense!

 

Phil’s talk continues with a discussion of meta-programming and F#’s “quotations”.  These are similar to C#’s Expressions but more powerful.  They’re a more advanced subject (and worthy of a talk all of their own no doubt) but effectively allow you to represent F# code in an expression tree which can be evaluated at runtime.  From here, we dive into BDD and testing of F# code in general.  Phil talks about a BDD library (his own, called TickSpec) and how even text-based BDD test definitions are much more terse within F# rather than the equivalent C# BDD definitions (See the TickSpec homepage for some examples of this).  Not only that, but Phil shows a remarkable ability to be able to debug his BDD text-based definitions within the Visual Studio IDE, including setting breakpoints, running the program in debug mode and breaking in his BDD text file!  He also tells a story of how he was able, with a full suite of unit and BDD tests wrapped around the code, to convert a 30,000+ line C# code base into a 200 line F# program that not only perfectly replicated the behaviour of the C# program, but was actually able to deliver even more – all within less than 1/10th of the lines of code!

 

Phil shows us his “Cellz” spread sheet application written in F# next.  He says it’s only a few hundred lines of code and is a good example of a medium sized F# program.  He also states that his implementation of the code that parses and interprets user-defined functions within the spread sheet “cell” is sometimes as little as one line of code!  We all ponder as to whether Excel’s implementations are as succinct! :)  As well as Cellz, there are a number of other projects of Phil’s that he tells us about.  One is a mocking framework, similar to C#’s Moq library, which of course, had to be called Foq!  There is also a “Mario” style game that we are shown that was created with the FunScript set of type providers allowing JavaScript to be created from F# code.  Phil also shows us a PacMan clone, running in the browser, created with only a few hundred lines of F# code.

 

Nearing the end of Phil’s talk, he shows us some further resources for our continued education, pointing out a number of books that cover the F# language.  Some specific recommendations are “Programming F#” as well as Phil’s own book (currently in early-access), “F# Deep Dives” which is co-authored by Tomas Petricek (whom I’d seen give an excellent talk on F# at DDD East Anglia).  Finally, Phil mentions that, although F# is a niche language with far fewer F# programmers than C# programmers, it’s a language that can command some impressive salaries! :)  Phil shows us a slide that indicates the UK average salary of F# programmers is almost twice that of a C# programmer.  So, there may not be as much demand for F# at the moment, but with that scarcity comes great rewards! :)

 

Overall, Phil’s talk was excellent and very enlightening.  It certainly helped me as a predominantly C# developer to get my head around the paradigm shift that is functional programming.  I’ve only scratched the surface so far, but I’m now curious to learn much more.

 


After a quick coffee break back in the main hall of the campus (during which time I was able to snaffle a sausage baguette which had been left over from the morning breakfast!), I headed off to one of the largest rooms being used during the entire conference for my next session.  This one was Kendall Miller’s Scaling Systems: Architectures That Grow.

 

 

Kendall opens his talk by saying that the entire session will be technology agnostic.  He says that what he’s about to talk about are concepts that can apply right across the board and across the complete technology spectrum.  In fact, the concepts that Kendall is about to discuss regarding scalability, in terms of how to achieve it and the things that can prevent you achieving it, are not only technology agnostic, but they haven’t changed in over 30 years of computing!

 

Kendall first asks, “What is scalability?”  Scaling is the ability for a system to cope under a certain demand.  That demand is clearly different for different systems.  Kendall shows us some slides that differentiate between the “big boys” such as Amazon, Microsoft, Twitter etc., who are scaling to anything between 30-60 million unique visitors per day and those of us mere mortals that only need to scale to a few thousand or even hundred users per day.  If we have a website that needs to handle 25,000 unique visitors per day, we can calculate that this is approximately 125,000 pages per day.  In the USA, there’s around 11 “high traffic” hours (these are the daytime hours, but spread across the many time zones of North America).  This gives us a requirement of around 12,000 pages/hour, and that divides down to only 3.3 pages per second.  This isn’t such a large amount to expect of our webserver and, in the grand scheme of things, is effectively “small fry” and should be easily achievable in any technology.  If we’re thinking about how we need our own systems to scale, it’s important to understand what we’re aiming for.  We may actually not need all that much scalability!  Scalability costs money, so we clearly don’t need to aim for scalability to millions of daily visitors to our site if we’re only likely to ever attract a few thousand.
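The arithmetic is worth being able to re-run with your own numbers, so here is a small sketch of it in Python.  The function name is my own, and the figure of 5 pages per visitor is only implied by the talk (125,000 pages from 25,000 visitors) rather than stated.

```python
def pages_per_second(visitors_per_day: int,
                     pages_per_visitor: float,
                     busy_hours: float) -> float:
    """Average page rate during the busy part of the day."""
    pages_per_day = visitors_per_day * pages_per_visitor
    pages_per_hour = pages_per_day / busy_hours
    return pages_per_hour / 3600  # 3600 seconds in an hour


# 25,000 visitors, 5 pages each (implied), 11 high-traffic hours.
rate = pages_per_second(25_000, 5, 11)
print(f"{rate:.1f} pages per second")
```

Without the intermediate rounding to 12,000 pages/hour the result comes out at roughly 3.2 rather than 3.3 pages per second, which doesn’t change the conclusion at all.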

 

We then ask, “What is availability?”  Availability is having a request being completed in a given amount of time.  It’s important to think about the different systems that this can apply to and the relative time that users of those systems will expect for a request to be completed.  For example, simply accessing a website from your browser is a request/response cycle that’s expected to be completed within a very short amount of time.  Delays here can turn people away from your website.  Contrast this with (for example) sending an email.  Here, it’s expected that email delivery won’t necessarily be instantaneous and the ability of the “system” in question to respond to the user’s request can take longer.  Of course, it’s expected that the email will eventually be delivered otherwise the system couldn’t be said to be “available”!

 

Regarding websites, Kendall mentions that in order to achieve scalability we need only concern ourselves with the dynamic pages.  Our static content should be inherently scalable in this day and age as scaling static content has long been a “solved problem”.  Geo-located CDN’s can help in this regard and have been used for a long time.  Kendall tells us that achieving scalability is simple in principle, but obviously much harder to implement in practice.  That said, once we understand the principles required for scalability, we can seek to ensure our implementations adhere to them.

 

There’s only 3 things required to make us scale.  And there’s only 1 thing that prevents us from scaling!

 

Kendall then introduces the 4 principles we need to be aware of:  ACD/C. 

 

This acronym is explained as Asynchronicity, Caching, Distribution & Consistency.  The first three are the principles which, when applied, give us scalability.  The last one, Consistency (or at least the need for our systems to remain in a consistent internal state) is the one that will stand in the way of scalability.  Kendall goes on to elaborate on each of the 4 principles, but he also re-orders them in the order in which they should be applied when attempting to implement scalability in a system that perhaps has none already.  We need to remember that scalability isn’t finite and that we need to ensure we work towards a scalability goal that makes sense for our application and its demands.

 

Kendall first introduces us to our own system’s architecture.  All systems have this architecture he says…!   Must admit, it’s a fairly popular one:

 

arch1

 

Kendall then talks about the principles we should apply, and the order in which we should apply them to an existing system in order to add scalability.

 

The first principle to add to a system is Caching.  Caching is almost always the easiest to start with to introduce some scalability in a system/application that needs it.  Caching is saving or storing the results of earlier work so that it can be reused at some later point in time.  After all, the very best performing queries are those ones that never have to be run!  Sometimes, caching alone can prevent around 99% of processing that really needn’t be done (i.e. a request for a specific webpage may well serve up the same page content over a long period of time, thus multiple requests within that time-scale can serve up the cached content).  Caching should be applied in front of everything that is time consuming and it’s easiest to apply in a left-to-right order (working from adding a cache in front of the web server, through to adding one in front of the application server, then finally the database server).

 

Once in place, the caches can use very simple strategies, as these can be incredibly effective despite their simplicity.  Microsoft’s Entity Framework uses a strategy that removes all cached entries as soon as a user commits a write (add/update/delete) to the database.  Whilst on the surface it may seem excessive to eradicate all of the cache, it’s really not, as in the vast majority of systems, reads from the database outnumber writes by an order of magnitude.  For this reason, the cache is still incredibly effective and is still extensively re-used in real-world usage.  We’re reminded that applications ask lots of repeated questions.  Stateless applications even more so, but the answers to these questions rarely change.  Authoritative information, such as the logged on user’s name, is expensive to repeatedly query for as it’s required so often.  Such information is the prime candidate to be cached.
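To convince myself how little code the “throw everything away on any write” strategy needs, I put together the following Python sketch.  It is not how Entity Framework (or any real library) is implemented; the class name, the dictionary standing in for the database and the sample data are all made up for illustration.

```python
class FlushOnWriteCache:
    """Read-through cache that is emptied completely on every write."""

    def __init__(self, store: dict):
        self._store = store      # stands in for the database
        self._cache = {}
        self.store_reads = 0     # how many reads actually reached the store

    def read(self, key):
        if key not in self._cache:
            self.store_reads += 1
            self._cache[key] = self._store.get(key)
        return self._cache[key]

    def write(self, key, value):
        self._store[key] = value
        self._cache.clear()      # any write invalidates everything


cache = FlushOnWriteCache({"user:1": "Alice"})   # hypothetical data
for _ in range(100):
    cache.read("user:1")         # only the first read hits the store
print(cache.store_reads)
cache.write("user:1", "Bob")     # flushes the whole cache
print(cache.read("user:1"), cache.store_reads)
```

A hundred reads cost a single trip to the store, and the write costs just one extra trip afterwards, which is why such a blunt strategy still pays off when reads heavily outnumber writes.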

 

An interesting point that Kendall makes here is to question the conventional wisdom that “the fewest lines of code is the fastest”.  He says that very often, that’s not really the case as very few lines of code in a method that is doing a lot of work implies that much of your processing is being off-loaded to other methods or classes that are doing your work for you.  This can often slow things down, especially if those other methods and/or classes are not specifically built to utilise cached data.  Very often, having more lines of code in a method can actually be the faster approach as your method is in total control of all of the processing work that needs to be done.  You’re doing all of the work yourself and so can ensure that the processing uses your newly cached data rather than expecting to have to read (or re-read it) from disk or database.

 

Distribution is the next thing to tackle after Caching.  Distribution is spreading the load around multiple servers and having many things doing your work for you rather than just one.  It’s important to note that the less state that’s held within your system, the better (and wider) you can distribute the load.  If we think of session state in a web application, such state will often prevent us from being able to fulfil user requests by any one of many different webservers.  We’re often in a position where we’ll require at least “Server Affinity” (also known as “sticky sessions”) to ensure that each specific user’s requests are always fulfilled by the same server in a given session.  Asynchronous code can really help here as it means that processing can be offloaded to other servers to be run in the background whilst the processing of the main work can continue to be performed in the foreground without having to wait for the response from the background processes.

 

Distribution is hardest when it comes to the database.  Databases, and indeed other forms of storage, are fundamentally state and scaling state is very difficult.  This is primarily due to the need to keep that state consistent across its distributed load.  This is the same consistency, or the requirement of consistency, that can hinder all manner of scalability and is one of the core principles.  One technique of scaling your storage layer is to use something called “Partitioned Storage Zones”.  These are similar to the server affinity (or sticky sessions) used on the web server when state needs to be maintained except that storage partitioning is usually more permanent.  We could have 5 separate database servers and split out (for example) 50 customers across those 5 database servers with 10 customers on each server.  We don’t need to synchronize the servers as any single given customer will only ever use the one server to which they’ve been permanently assigned.
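The 50 customers / 5 servers example can be sketched in Python as below.  This is my own minimal take on the idea rather than anything shown in the talk: the class name, the round-robin assignment and the server names are assumptions, and a real system would need to persist the assignments rather than hold them in memory.

```python
from collections import Counter


class PartitionedStorage:
    """Permanently assigns each customer to exactly one storage server."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._assignment = {}    # customer -> server, never changed once set

    def server_for(self, customer):
        if customer not in self._assignment:
            # Round-robin new customers across the available servers.
            index = len(self._assignment) % len(self._servers)
            self._assignment[customer] = self._servers[index]
        return self._assignment[customer]


storage = PartitionedStorage(f"db{n}" for n in range(1, 6))   # hypothetical names
customers = [f"customer-{n}" for n in range(1, 51)]
load = Counter(storage.server_for(c) for c in customers)
print(load)   # each of the 5 servers ends up with 10 customers
```

Because a customer always resolves to the same server, no cross-server synchronisation is needed for that customer’s data.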

 

After distribution comes Asynchronicity.  Asynchronicity (or Async for short) is always the hardest to implement and so is the last one to be attempted in order to provide scalability.  Async is the decoupling of operations to ensure that the minimum amount of work is performed within the “critical path” of the system.  The critical path is the processing that occurs to fulfil a user’s request end-to-end.  A user request to a web server for a given resource will require processing of the request, retrieval and processing of data before returning to the user.  If the retrieval and processing of data requires significant and time-consuming computation, it would be better if the user was not “held up” whilst waiting for the computation to complete, but for the response to be sent to the user in a more expedient fashion, with the results of the intensive computation delivered to the user at a later point in time.  Work should always be “queued” in this manner so that load is smoothed out across all servers and applications within the system.
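As a rough illustration of taking work off the critical path, here is a small Python sketch using only the standard library.  The names (`handle_request`, `report-1`) and the sum-of-squares “slow work” are placeholders of mine; in a real system the queue would more likely be a separate message broker than an in-process `queue.Queue`.

```python
import queue
import threading

jobs = queue.Queue()
results = {}


def worker():
    """Background worker: does the expensive part, off the critical path."""
    while True:
        job_id, numbers = jobs.get()
        results[job_id] = sum(n * n for n in numbers)   # the "slow" work
        jobs.task_done()


def handle_request(job_id, numbers):
    """Critical path: enqueue the work and respond immediately."""
    jobs.put((job_id, numbers))
    return f"accepted {job_id}"


threading.Thread(target=worker, daemon=True).start()
print(handle_request("report-1", range(1000)))
jobs.join()                      # only this demo waits; a real caller would not
print(results["report-1"])
```

The response to the user is produced before the computation has run, and the result is picked up later once the worker has finished with it.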

 

One interesting Async technique, which is used by Amazon for their “recommendation” engine, is “Speculative Execution”.  This is some asynchronous processing that happens even though the user may never have explicitly requested such processing or may never even be around to see the results of such processing.  This is a perfectly legitimate approach and, whilst seemingly contrary to the notion of not doing any work unless you absolutely have to, “speculative execution” can actually yield performance gains.  It’s always done asynchronously so it’s never blocking the critical path of work being performed, and if the user does eventually require the results of the speculative execution, it’ll be pre-computed and cached so that it can be delivered to the user incredibly quickly.  Another good async technique is “scheduled requests”.  These are simply specific requests from the user for some computation work to be done, but the request is queued and the processing is performed at some later point in time.  Some good examples of these techniques are an intensive report generation request from the user that will have its results available later, or a “nightly process” that runs to compute some daily total figures (for example, the day’s financial trading figures).  When requested the next day, the previous day’s figures do not need to be computed in real-time at all and the system can simply retrieve the results from cache or persistent storage.  This obviously improves the user’s perception of the overall speed of the system.  Amazon uses an interesting trick that actually goes against async in that they actually “wait” for an order’s email confirmation to be sent before displaying the order confirmation web page to the user.  It’s one of only a few areas of Amazon’s site that specifically isn’t async and is very intentionally done this way as the user’s perception of an order being truly finalized is of receiving the confirmation email in their inbox!

 

Kendall next talks about the final principle, which of the 4 principles is the one that actually prevents scalability, or at least complicates it significantly.  It’s the principle of Consistency.  Consistency is the degree to which all parties within the system observe some state that exists within the system at the same time.  Of course, the other principles of distribution and asynchronicity that help to provide scalability will directly impact the consistency of a system.  With this in mind, we need to recognize that scalability and scaling is very much about compromise.

 

There are numerous consistency challenges when scaling a system.  Singleton data structures (such as a numbering system that must remain contiguous) are particularly challenging as having multiple separate parts of a system that can generate the next number in sequence would require locking and synchronicity around the number generation in order to prevent the same number being used twice.  Kendall also talks about state that can be held at two separate endpoints of a process, such as a layer that reads some data from a database, and how this must be shared consistently – changes to the database after the data has been read should ideally be communicated to the layer that previously read the data.  Within the database context, this consistency extends to ensuring multiple database servers are kept consistent in the data that they hold and queries across partitioned datasets must be kept in sync.  All of these consistency challenges will force compromises within the system.  However, consistency can be achieved if the other 3 principles (Caching, Distribution & Async) are themselves implemented in a consistent manner and work towards the same goals.
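As a small illustration of the contiguous numbering problem, here’s a sketch of my own using a hypothetical SequenceGenerator class.  It only solves the problem within a single process, where an atomic increment is enough to stop two threads ever receiving the same number:

```csharp
using System.Threading;

// Hypothetical example type: not from the talk.
public sealed class SequenceGenerator
{
    private long _last;

    // Interlocked.Increment is atomic, so concurrent callers never see the same
    // value and no numbers are skipped.
    public long Next()
    {
        return Interlocked.Increment(ref _last);
    }
}
```

As soon as the numbers have to be handed out by several servers, this is no longer sufficient.  The counter has to live in one shared place, and every caller must synchronise on it, which is exactly why such singleton structures get in the way of scaling.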

 

Finally, Kendall discusses how we can actually implement all of these concepts within a real-world system.  The key to this is to test your existing system and gather as many timings and metrics as you possibly can.  Remember, scaling is about setting a realistic target that makes sense for your application.  Once armed with metrics and diagnostic data, we can set specific targets that our scalability must reach.  This could be something like, “all web pages must return to the user within 500ms”.  You would then start to implement, working from left to right within your architecture, and implementing the principles in the order of simplicity and which will provide the biggest return on investment. Caching first, then Distribution, finally Async.  But, importantly, when you hit your pre-defined target, you stop.  You’re done.

 


After another coffee break back in the main hall, during which time I was able to browse through the various stalls set up by the conference’s numerous sponsors, chat with some of the folks running those stalls, and even grab myself some of the swag that was spread around, it was time for the final session before lunch.  This one was Matthew Steeples’ “You’ve Got Your Compiler In My Service”.

 

Matthew’s talk was about the functionality and features that the upcoming Microsoft Roslyn project will offer to .NET developers.  Roslyn is a “compiler-as-a-service”.  This means that the C# compiler offered by Roslyn will be available to be interacted with via other C# code.  Traditionally, compilers – and the existing C# compiler is no exception – are effectively “black boxes” and operate in one direction only.  Raw source code is fed in at one end, and after some “magic” happens in the middle, compiled executable binary code comes out from the other end.  In the case of the C# compiler, it’s actually IL code that gets output, ready to be JIT’ed by the .NET runtime.  But once that IL is output, there’s really no simple way to return from the IL back to the original source code.  Roslyn will change that.

 

Roslyn represents a deconstruction of the existing C# compiler.  It exposes all of the compiler’s functionality publicly, allowing a developer to use Roslyn to construct new C# code with C# code!  Traditional compilers will follow a series of steps to convert the raw text-based source code into something that the compiler can understand in order to convert it into working machine code.  These steps can vary from one compiler to another, but generally start with a step that breaks down the text into individual words and characters, identifying the language keywords that the compiler understands as being part of the language, as well as user-defined variable names and other tokens.  This is known as “lexical analysis”.  This is followed by “syntax analysis” (often called “parsing”), which is the understanding of (and verification against) the syntactical rules of the language.  Next comes the “semantic analysis” which is the checking of the semantics of the language’s expressions (for example, ensuring that the expression within an if statement’s condition evaluates to a boolean).  Finally, after all of this analysis, “code generation” can take place.

 

Roslyn, on the other hand, takes a different approach, and effectively turns the compilers of both the C# and VB languages into a large object model, exposing an API that programmers can easily interact with (for example, an object called “CatchClause” exists within the Roslyn.Compiler namespace that effectively represents the “catch” statement from within the try..catch block).

 

Creating code via Roslyn is achieved by creating a top-level object known as a Syntax Tree.  Syntax Trees contain a vast hierarchy of child objects, literally as a tree data structure, rooted at a Compilation Unit (a compilation unit represents a single source file or “module” of code).  Each compilation unit, in turn, contains further objects and nodes that represent (for example) a complete C# class, starting with the class declaration itself including its scope and modifiers, drilling down to the methods (and their scoping and modifiers) and ultimately the individual lines of code contained within.  These syntax trees ultimately represent an entire C# (or VB!) program and can either be declared and created within other C# code, or parsed from raw text.  Specifically, Syntax Trees have three important attributes.  They represent all of the source code in full fidelity, meaning every keyword, every variable name, every operator.  In fact, they’ll represent everything right down to the whitespace.  The second important attribute of a Syntax Tree is that, due to the first attribute, they’re completely reversible.  This means that code parsed from a raw text file into the SyntaxTree object model can be turned back into exactly the same raw text source code.  The third and final attribute is that of immutability.  Once created, Syntax Trees cannot be changed.  This means they’re completely thread-safe.

 

Syntax Trees break down all source code into only three types of object: Nodes, Tokens and Trivia.  Nodes are syntactic constructs of the language like declarations, statements, clauses and expressions.  Nodes generally also act as parent objects for other child objects and nodes within the Syntax Tree.  Tokens are the individual language grammar keywords but can also be identifiers, literals and punctuation.  Tokens have properties that represent (for example) their type (a token representing a string literal in code will have a property that represents the fact that the literal is of type string) as well as other meta-data for the token, but tokens can never be parents of other objects within the Syntax Tree.  Finally, trivia is everything else within the source code, and is primarily concerned with largely insignificant text such as whitespace, comments, pre-processor directives etc.
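To get a feel for why keeping the trivia makes a syntax tree fully reversible, here’s a deliberately tiny toy of my own.  It is not Roslyn, and the ToyToken and ToyLexer names are made up for illustration.  Each token simply remembers the whitespace that preceded it, so concatenating the tokens gives back the exact original text:

```csharp
using System.Collections.Generic;
using System.Text;

// Toy model only: Roslyn's real tokens and trivia are far richer than this.
public sealed class ToyToken
{
    public string LeadingTrivia { get; }
    public string Text { get; }

    public ToyToken(string leadingTrivia, string text)
    {
        LeadingTrivia = leadingTrivia;
        Text = text;
    }
}

public static class ToyLexer
{
    private static bool IsWordChar(char c)
    {
        return char.IsLetterOrDigit(c) || c == '_';
    }

    public static List<ToyToken> Lex(string source)
    {
        var tokens = new List<ToyToken>();
        int i = 0;
        while (i < source.Length)
        {
            // Collect whitespace as trivia rather than throwing it away.
            int start = i;
            while (i < source.Length && char.IsWhiteSpace(source[i])) i++;
            string trivia = source.Substring(start, i - start);

            // A token is either a run of word characters or one punctuation character.
            start = i;
            if (i < source.Length)
            {
                if (IsWordChar(source[i]))
                {
                    while (i < source.Length && IsWordChar(source[i])) i++;
                }
                else
                {
                    i++;
                }
            }

            // Trailing whitespace ends up on a final token with empty text.
            tokens.Add(new ToyToken(trivia, source.Substring(start, i - start)));
        }
        return tokens;
    }

    public static string ToFullString(IEnumerable<ToyToken> tokens)
    {
        var builder = new StringBuilder();
        foreach (var token in tokens)
        {
            builder.Append(token.LeadingTrivia).Append(token.Text);
        }
        return builder.ToString();
    }
}
```

Because nothing is discarded, ToyLexer.ToFullString(ToyLexer.Lex(source)) returns the same string that went in, odd spacing and all.  Note also that ToyToken has no setters, which mirrors the immutability point above in a very small way.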

 

The following bit of C# code shows how we can use Roslyn to parse a literal text representation of a simple “Hello World” application:

 

var tree = SyntaxTree.ParseText(@"
    using System;
     
    namespace HelloRoslyn
    {
        class Program
        {
            static void Main(string[] args)
            {
                Console.WriteLine(""Hello World"");
            }
        }
    }
");

Once this code has been executed, the tree variable will hold a complete syntax tree that represents the entire program as defined in the string literal.  Once created, the tree variable’s syntax tree can be executed (i.e. the “Hello World” program can be run), it can be turned into IL (Intermediate Language), or turned back into the same source code!

 

The following C# code is the equivalent of the code above, except that here we’re not just parsing from the raw source code text, we’re actually creating and building up the syntax tree by hand using the built-in Roslyn objects that represent the various facets of the C# language:

 

using System;
using Roslyn.Compilers.CSharp;

namespace HelloRoslyn
{
  class Program
  {
    static void Main()
    {
      // The factory returns a syntax node rather than a string, so let the compiler infer the type.
      var program = Syntax.CompilationUnit(
        usings: Syntax.List(Syntax.UsingDirective(name: Syntax.ParseName("System"))),
        members: Syntax.List<MemberDeclarationSyntax>(
          Syntax.NamespaceDeclaration(
            name: Syntax.ParseName("HelloRoslyn"),
            members: Syntax.List<MemberDeclarationSyntax>(
              Syntax.ClassDeclaration(
                identifier: Syntax.Identifier("Program"),
                members: Syntax.List<MemberDeclarationSyntax>(
                  Syntax.MethodDeclaration(
                    returnType: Syntax.PredefinedType(Syntax.Token(SyntaxKind.VoidKeyword)),
                    modifiers: Syntax.TokenList(Syntax.Token(SyntaxKind.StaticKeyword)),
                    identifier: Syntax.ParseToken("Main"),
                    parameterList: Syntax.ParameterList(),
                    bodyOpt: Syntax.Block(
                      statements: Syntax.List<StatementSyntax>(
                        Syntax.ExpressionStatement(
                          Syntax.InvocationExpression(
                            Syntax.MemberAccessExpression(
                              kind: SyntaxKind.MemberAccessExpression,
                              expression: Syntax.IdentifierName("Console"),
                              name: Syntax.IdentifierName("WriteLine"),
                              operatorToken: Syntax.Token(SyntaxKind.DotToken)),
                            Syntax.ArgumentList(
                              arguments: Syntax.SeparatedList(
                                Syntax.Argument(
                                  expression: Syntax.LiteralExpression(
                                    kind: SyntaxKind.StringLiteralExpression,
                                    token: Syntax.Literal("\"Hello world\"", "Hello world")
                                  )
                                )
                              )
                            )
                          )
                        )
                      )
                    )
                  )
                )
              )
            )
          )
        ));
    }
  }
}

Phew!  That’s quite some code there to create the Syntax Tree for a simple “Hello World” console application!  Although Roslyn can be quite verbose, and building up syntax trees in code can be incredibly cumbersome, the functionality offered by Roslyn is incredibly powerful.  So, why on earth would we need this kind of functionality?

 

Well, one current simple usage of Roslyn is to create a “plug-in” for the Visual Studio IDE.  This plug-in can interact with the source code editor window to dynamically interrogate the current user edited source and perform alterations.  These could be refactoring and code generation, similar to the functionality that’s currently offered by the ReSharper or JustCode tools.  Of course, those tools can perform a myriad of interactions with the code editor windows of the Visual Studio IDE, however they probably currently have to implement their own parsing and translation engine over the code that’s edited by the user.  Roslyn makes this incredibly easy to accomplish within your own plug-in utilities.  Other usages of Roslyn include the ability for an application to dynamically “inject” code into itself.  At this point Matthew shows us a demo of a simple Windows Forms application with a simple textbox on the form.  He proceeds to type out a C# class declaration into the form’s textbox.  He ensures that this class declaration implements a specific interface that the Windows Forms application already knows about.  Once entered, the running WinForms app can take the raw text from the textbox, and using Roslyn, convert this text into a Syntax Tree.  This Syntax Tree can then be invoked as actual code, as though it were simply a part of the running application.  In this case, Matthew’s example has an interface that defines a single “GetDate” method that returns a string.  Matthew types his class into the WinForms textbox and returns the current Date and Time in the current locale.  This is then executed and invoked by the running application and the result is displayed on the Form.  Matthew then shows how the code within the textbox can be easily altered to return the same Date and Time but in the UTC time zone.  One click of a button and the new code is parsed, interpreted and invoked using Roslyn to immediately show the new result on the Windows Form.

 

Roslyn, as a new C# compiler, is itself written in C#.  One of the current complexities with the Roslyn toolkit is that the current C# compiler, which is written in C++, doesn’t entirely conform to the C# specification.  This makes it fairly tricky to reproduce the compiler in accordance with the C# specification, and the current dilemma is whether Roslyn should embrace the C# specification entirely (thus making it slightly incompatible with the existing C# compiler) or whether to faithfully reproduce the existing C# compiler’s behaviour even though it doesn’t strictly conform to the specification.

 

Matthew wraps up his talk with a summary of the Roslyn compiler’s abilities, which are extensive and powerful despite it still only being a CTP (Community Technology Preview) of the final functionality, and offers the link to the area on MSDN where you can download Roslyn and learn all about this new “compiler-as-a-service” which will, eventually, become a standard part of Visual Studio and C# (and VB!) development in general.

 


After Matthew’s talk it was time for lunch.  Lunch at DDD North this year was just as great as last year.  We all wandered off to the main entrance hall where the staff of the venue were frantically trying to put out as many bags with a fantastic variety of sandwiches, fruit and chocolate bars as they could before the hordes of hungry developers came along to whisk them away.  The catering really was excellent as it was possible to pre-order specific lunches for those with specific dietary requirements, as well as ensuring there was a wide range of vegetarian options available too.

 

I examined the available options, which took a little while as I, too, have specific dietary requirements in that I’m a fussy bugger as I don’t like mayonnaise!  It took a little while to find a sandwich that didn’t come loaded with mayo, but after only a short while, I found one.  And a lovely sandwich it was too!  Along with my crisps, chocolate and fruit, I found a place to sit down and quietly eat my lunch whilst contemplating the quantity and quality of the information I’d learned so far.

 


During the lunch break, there were a number of “grok talks” taking place in the largest of the lecture theatres that were being used for the conference (this was the same theatre where Kendall Miller had given his talk earlier).  Whilst I always try to take in at least one or two (if not all) of the grok talks that take place during the DDD (and other) conferences, unfortunately on this occasion I was too busy stuffing my face, wandering around the main hall and browsing the many sponsors’ stands as well as chatting away to some old and new friends that I’d met up with there.  By the time I realised the grok talks were taking place, it was too late to attend.

 

After a lovely lunch, it was time for the first of the afternoon’s sessions, one of two remaining in the day.  This session saw us gathering in one of the lecture halls only to find that the projector had decided to stop working.  The DDD volunteers tried frantically to get the thing working again, but ultimately, it proved to be a futile endeavour.  Eventually, we were told to head across the campus to the other building that was being used for the conference and to a “spare room”, apparently reserved for such an eventuality.

 

After a brisk, but slightly soggy walk across the campus forecourt (the weather at this point was fairly miserable!) we entered the David Goldman Informatics Centre and trundled our way to the spare room.  We quickly sat ourselves down and the speaker quickly set himself up as we were now running slightly behind schedule.  So, without further ado, we kicked off the first afternoon session which was MongoDB For C# Developers, given by Simon Elliston Ball.

 

Simon’s talk was an introduction to the MongoDB No-SQL database and specifically how we as C# developers can utilise the functionality provided by MongoDB.  Mongo is a document-oriented database and stores its data as a collection of key/value pairs within a document.  These documents are then stored together as collections within a database.  A document can be thought of as a single row in a RDBMS database table, the collection of documents can be thought of as the table itself, and multiple collections are grouped together as a database, although this analogy isn’t strictly correct.  This is very different from the relational structure you can find in today’s popular database systems such as Microsoft’s SQL Server, Oracle, MySQL & IBM’s DB2 to name just a few of them.  Document oriented databases usually store their data represented in JSON format, and in the case of MongoDB, it uses a flavour of JSON known as BSON which is Binary JSON.  An example JSON document could be something as simple as:

 

{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25
}

 

However, the same document could be somewhat more complex, like this:

 

{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": 10021
    },
    "phoneNumbers": [
        {
            "type": "home",
            "number": "212 555-1234"
        },
        {
            "type": "fax",
            "number": "646 555-4567"
        }
    ]
}

 

This gives us an ability that RDBMS databases don’t have and that’s the ability to nest multiple values for a single “key” in a single document.  RDBMS’s would require multiple tables joined together by a foreign key in order to represent this kind of data structure, but for document-oriented databases, this is fairly standard.  Furthermore, MongoDB is a schema-less database which means that documents within the same collection don’t even need to have the same structure.  We could take our two JSON examples from above and safely store them within the exact same collection in the same database!  Of course, we have to be careful when we’re reading them back out again, especially if we’re trying to deserialize the JSON into a C# class.  Importantly, as MongoDB uses BSON rather than JSON, it can offer strong typing of the values that are assigned to keys.  Within the .NET world, the MongoDB client framework allows decorating POCO classes with annotations that will aid in the mapping between the .NET data types and the BSON data types.
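To show what “being careful” might look like when reading differently shaped documents into one C# class, here’s a rough sketch of my own.  It deliberately uses the System.Text.Json serializer rather than the MongoDB client library (so it’s an analogy, not the driver’s API), and the Person, Address, PhoneNumber and PersonReader types are hypothetical.  Anything that’s missing from the simpler document just comes back as null and has to be checked for:

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical POCOs matching the two JSON examples above.
public class Address
{
    public string StreetAddress { get; set; }
    public string City { get; set; }
    public string State { get; set; }
    public int PostalCode { get; set; }
}

public class PhoneNumber
{
    public string Type { get; set; }
    public string Number { get; set; }
}

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
    public Address Address { get; set; }                 // absent in the simple document
    public List<PhoneNumber> PhoneNumbers { get; set; }  // absent in the simple document
}

public static class PersonReader
{
    // The JSON keys are camelCase, the C# properties are PascalCase.
    private static readonly JsonSerializerOptions Options =
        new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

    public static Person Read(string json)
    {
        return JsonSerializer.Deserialize<Person>(json, Options);
    }

    // Callers must cope with the parts that some documents simply don't have.
    public static string DescribeCity(Person person)
    {
        return person.Address == null ? "(no address)" : person.Address.City;
    }
}
```

Both example documents can be read with the same PersonReader.Read call.  It’s only when the code reaches for Address or PhoneNumbers that the difference in shape shows up.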

 

So, given this incredible flexibility of a document-oriented database, what are the downsides?  Well, there are no joins within MongoDB.  This means we can’t join documents (or records) from one collection with another as you could do with different tables within a RDBMS system.  If your data is very highly relational, a document-oriented database is probably not the right choice, but a lot of data structures can be represented by documents.  MongoDB allows an individual document to be up to 16MB in size, and given that we can have multiple values for a given key within the document, we can probably represent an average hierarchical data/object graph using a single document.

 

Simon makes a comparison between MongoDB and another popular document-oriented database, RavenDB.  Simon highlights how RavenDB, being the newer document-oriented database, offers ACID-compliance and transactions that stretch over multiple documents.  He states that MongoDB’s transactions are only per document.  MongoDB’s replication supports a Master-Slave configuration, but Raven’s replication is Master-Master.  On the other hand, MongoDB supports being used from within many different languages with native client libraries for JavaScript, Java, Python, Ruby, .NET, Scala, Erlang and many more.  RavenDB is effectively .NET only (at least as far as native client libraries go), however RavenDB does offer a REST-based API and is thus callable from any language that can reach a URI.

 

Simon continues by telling us about how we can get to play with MongoDB as C# developers.  The native C# MongoDB client library is distributed as a NuGet package which is easily installable from within any Visual Studio project.  The NuGet package contains the client library which enables easy access to a MongoDB Server instance from .NET as well as containing types that provide the aforementioned annotations to decorate your POCO classes to enable easy mapping of your .NET types to the MongoDB BSON types.  Once installed, accessing some data within a MongoDB database can be performed quite easily:

 

var client = new MongoClient(connectionString);
var server = client.GetServer(); 
var database = server.GetDatabase("MyDatabase");
var collection = database.GetCollection("MyCollection");

 

One of the nice things with MongoDB is that we don’t have to worry about explicitly closing or disposing of the resources that we’ve acquired with the above code.  Once these objects fall out of scope, the MongoDB client library will automatically close the database connection and release the connection back to the connection pool.  Of course, this can be done explicitly too, but it’s nice to know that failure to do so won’t leak resources.

 

Simon explains that all of Mongo’s operations are as “lazy” as they possibly can be, thus in the code above, we’re only going to hit the database to retrieve the documents from “MyCollection” once we start iterating over the collection variable.  The code above shows a simple query that simply returns all of the documents within a collection.  We can compose more complex queries in a number of ways, but perhaps the way that will be most familiar to C# developers is with a LINQ-style query:

 

// EQ takes a lambda that selects the member, followed by the value to compare it against
var readQuery = Query<Person>.EQ(p => p.PersonID, 2);
Person thePerson = personCollection.FindOne(readQuery);

This style of query allows retrieving a strongly-typed “Person” object using a Lambda expression passed to the EQ function of the Query object.  The resulting configured query object is then passed to the .FindOne method of the collection to allow retrieval of one specific Person object based upon the predicate of the query.  The newer versions of MongoDB support most of the available LINQ operators and expressions and collections can easily be exposed to the client code as an IQueryable:

 

var query =
   from person in personCollection.AsQueryable()
   where person.LastName == "Smith"
   select person;

foreach (var person in query)
// ....[snip]....

 

We can also create cursors to iterate over an entire collection of documents using the MongoCursor object:

 

MongoCursor<Person> personCursor = personCollection.FindAll();
personCursor.Skip = 100;
personCursor.Limit = 10;

foreach(var person in personCursor)
// .....[snip]....

Simon further explains how Mongo’s Update operations are trivially simple to perform too, often merely requiring the setting of the object properties, and calling the .Save method against the collection, passing in the updated object:

 

person.LastName = "Smith";
personCollection.Save(person);

Simon tells us that MongoDB supports something known as “write concerns”.  This mechanism allows us to return control to our code only after the master database and all slave servers have been successfully updated with our changes.  Without these write concerns, control will return to our code before the changes have persisted across all database servers, returning control to our code after only the master server has been updated whilst the slaves continue to update asynchronously in the background.  Unlike most RDBMS systems, UPDATEs to MongoDB will, by default, only ever affect one document, and this is usually the first document that the update query finds.  If you wish to perform a multi document update, you must explicitly tell MongoDB to perform such an update.

 

As stated earlier, documents are limited to 16MB in size however MongoDB provides a way to store a large “blob” of data (for example, if you needed to store a video file) using a technology called GridFS.  GridFS sits on top of MongoDB and allows you to store a large amount of binary data in “chunks”, even if this data exceeds the 16MB document limit.  Large files are committed to the database with a simple command such as:

 

database.GridFS.Upload(filestream, "mybigvideo.wmv");

 

This will upload the large video file to the database, which will break down the file into many small chunks.  Querying and retrieving this data is as simple as retrieving a normal document, and the database and the database driver are responsible for re-combining all of the chunks of the file to allow you to retrieve the file correctly with no further work required on the developer’s behalf.
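The chunking idea itself is simple enough to sketch in plain C#.  The following is only my own toy illustration of splitting and re-combining a blob, using a hypothetical Chunker class and a chunk size chosen by the caller.  It says nothing about how GridFS actually stores its chunks or what size it uses:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy illustration only: not the GridFS implementation.
public static class Chunker
{
    // Splits the data into consecutive chunks of at most chunkSize bytes.
    public static List<byte[]> Split(byte[] data, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var chunks = new List<byte[]>();
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int length = Math.Min(chunkSize, data.Length - offset);
            var chunk = new byte[length];
            Array.Copy(data, offset, chunk, 0, length);
            chunks.Add(chunk);
        }
        return chunks;
    }

    // Re-combines the chunks, in order, into the original data.
    public static byte[] Join(IEnumerable<byte[]> chunks)
    {
        return chunks.SelectMany(chunk => chunk).ToArray();
    }
}
```

The important property is that Join(Split(data, n)) gives back exactly the bytes that went in, as long as the chunks are kept in order.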

 

MongoDB supports GeoSpatial functionality which allows querying location and geographic data for results that are “near” or within a certain distance of a specific geographic location:

 

var database = server.GetDatabase("MyDatabase");
var collection = database.GetCollection("MyCollection");
var query = Query.EQ("Landmarks.LandMarkType", new BsonString("Statue"));
double lat = 54.9117468;  // latitude
double lon = -1.3737675;  // longitude
var earthRadius = 6378.0; // km
var rangeInKm = 100.0; // km
var options = GeoNearOptions
              .SetMaxDistance(rangeInKm / earthRadius /* to radians */)
              .SetSpherical(true);
// GeoNear takes x (longitude) then y (latitude); 10 is the maximum number of results
var results = collection.GeoNear(query, lon, lat, 10, options);

The above code sample would find up to 10 documents that have a LandMarkType of Statue and which are also within 100 kilometres of our defined Latitude and Longitude position.
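The “to radians” comment in the sample is worth a quick explanation.  On a sphere, arc length equals radius multiplied by angle, so dividing a distance by the radius of the Earth gives the angle in radians.  Here’s that arithmetic as a small sketch of my own, with a hypothetical GeoMath helper and the same approximate Earth radius as the sample:

```csharp
// Hypothetical helper: not part of the MongoDB client library.
public static class GeoMath
{
    public const double EarthRadiusKm = 6378.0; // same approximation as the sample above

    // Arc length = radius * angle, therefore angle (radians) = distance / radius.
    public static double KmToRadians(double distanceKm)
    {
        return distanceKm / EarthRadiusKm;
    }

    public static double RadiansToKm(double radians)
    {
        return radians * EarthRadiusKm;
    }
}
```

So a range of 100 km works out at roughly 0.0157 radians, which is the value that ends up being passed as the maximum distance.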

 

MongoDB also supports the ability to query and transform data using a “MapReduce” algorithm.  MapReduce is a very powerful way in which a large set of data can be filtered, sorted (the “map” part) and summarised (the “reduce” part) using hand-crafted map and reduce functions.  These functions are written in JavaScript and are interpreted by the MongoDB database engine, which contains a full JavaScript interpreter and execution engine.  Using this MapReduce mechanism, a developer can perform many of the same kinds of complicated “grouping” and aggregation queries that RDBMS systems perform.  For example, the following sample query would iterate over the collection within the database and count how many documents contain each key (i.e. each field name):

 

var map =
    "function() {" +
    "    for (var key in this) {" +
    "        emit(key, { count : 1 });" +
    "    }" +
    "}";

var reduce =
    "function(key, emits) {" +
    "    var total = 0;" +
    "    for (var i in emits) {" +
    "        total += emits[i].count;" +
    "    }" +
    "    return { count : total };" +
    "}";

var mr = collection.MapReduce(map, reduce);
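If the JavaScript is a little hard to picture, the same map and reduce steps can be mimicked in memory with LINQ.  This is just my own analogy, using a hypothetical KeyCounter class over plain dictionaries rather than real MongoDB documents.  The SelectMany plays the part of the “emit” calls in the map function, and the Sum plays the part of the reduce function:

```csharp
using System.Collections.Generic;
using System.Linq;

// In-memory analogy only: this does not talk to MongoDB.
public static class KeyCounter
{
    public static Dictionary<string, int> CountKeys(
        IEnumerable<IDictionary<string, object>> documents)
    {
        return documents
            // "map": emit (key, count = 1) for every key in every document
            .SelectMany(doc => doc.Keys.Select(key => new { Key = key, Count = 1 }))
            // group the emitted values by key, as the database does between map and reduce
            .GroupBy(emitted => emitted.Key)
            // "reduce": sum the counts for each key
            .ToDictionary(group => group.Key, group => group.Sum(emitted => emitted.Count));
    }
}
```

Fed the two example “person” documents from earlier, this would report that firstName, lastName and age each appear twice, whilst address and phoneNumbers appear only once.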

Finally, Simon wraps up his talk by telling us about a Glimpse plug-in that he’s authored himself which can greatly help to understand exactly what is going on between the client-side code that talks to the MongoDB client library and the actual requests that are sent to the server, as well as being able to inspect the resulting responses.

 

After a short trip back across the campus to grab a coffee in the other building that contains the main entrance hall, as well as an array of crisps, chocolate and fruit (these were the “left-overs” from the lunch bags of earlier in the afternoon!) to keep us developers well fed and watered, I trundled back across the campus to the same David Goldman Informatics Centre building I’d been in previously to watch the final session of the day.  This session was another F# session (F# was a popular subject this year) called “You’ve Learned The Basics Of F#, What’s Next?” and given by Ian Russell.

 

The basis of Ian’s talk was to examine two specific features of F# that Ian thought offered a fantastic amount of productivity over other languages, and especially over other .NET languages.  These two features were Type Providers and the MailboxProcessor.

 

First up, Ian takes a look at Type Providers.  Ian starts by explaining that Type Providers, first introduced in F# 3.0, provide type inference over third party data.  What this essentially means is that a type provider for something like (say) a database can give the F# IDE type inference over what types you’ll be working with from the database as soon as you’ve typed in the line of code that specifies the connection string!  Take a look at the sample code below:

 

open System.Linq
open Microsoft.FSharp.Data.TypeProviders
    
type SqlConnection =
    SqlDataConnection<ConnectionString = @"Data Source=.\sql2008r2;Initial Catalog=chinook;Integrated Security=True">

let db = SqlConnection.GetDataContext()

let table =
    query { for r in db.Artist do
            select r }

 

The really important line of code from the sample above is this one:

 

query { for r in db.Artist do

Note the db.Artist part.  There’s no type within the code that defines what Artist is.  The FSharp Data Type Provider has asynchronously and in the background of the IDE quietly opened the SQL Server connection as soon as the connection string was specified in the code.  It’s examined the database referred to in the connection string and it has automatically generated the types based upon the tables and their columns within the database!

 

Ian highlights the fact that F#’s SQL Server type provider requires no mapping code to go from F# types in code to SQL Server entities.  The equivalent C# code using Entity Framework would be significantly more verbose.

 

Ian also shows how it’s easy to take the “raw” types captured by the type provider and wrap them up into a nicer pattern, in this case a repository:

 

type ChinookRepository () =
    member x.GetArtists () =
        use context = SqlConnection.GetDataContext()
        query { for g in context.Artist do
                select g }
        |> Seq.toList

let artists =
    ChinookRepository().GetArtists()

 

Ian explains how F# supports a “query” syntax that is very similar to (but much better than) C# and LINQ’s query syntax, i.e.:

 

from x in y select new { TheID = x.Id, TheName = x.FirstName }

 

The reason that F#’s query syntax is far superior is that F# allows you to define your own query syntax keywords.  For example, you can define your own keyword, “top” which would implement “Select Top X” style functionality.  This effectively allows you to define your own DSL (Domain-Specific Language) within F#!

 

After the data type provider, Ian goes on to show us how the same functionality of early-binding and type inference to a third-party data source works equally well with local CSV data in a file.  He shares the following code with us:

 

open FSharp.Data

let csv = new CsvProvider<"500-uk.csv">()

let data =
    csv.Data
    |> Seq.iter (fun t -> printf "%s %s\n" t.``First Name`` t.``Last Name``)

 

This code shows how you can easily express the columns from the CSV that you wish to work with by simply specifying the column name as a property of the type.  The actual type of this data is inferred from the data itself (numeric, string etc.), although you can always explicitly specify the types should you desire.  Ian also shows how the exact same mechanism can even pull down data from an internet URI and infer strong types against it:

 

open FSharp.Data

let data = WorldBankData.GetDataContext()

data.Countries.``United Kingdom``.Indicators.``Central government debt, total (% of GDP)``
|> Seq.maxBy fst

 

The above code shows how simple and easy it is to consume data from the World Bank’s online data store in a strongly-typed, type-inferred way.

 

This is all made possible thanks to the FSharp.Data library which is available as a NuGet package and is fully open-source and available on GitHub.  This library has the type providers for the World Bank and Freebase online data sources already built-in along with generic type providers for dealing with any CSV, JSON or XML file.  Ian tells us about a type provider that’s currently being developed to generically work against any REST service and will type infer the required F# objects and properties all in real-time simply from reading the data retrieved by the REST service.  Of course, you can create your own type providers to work with your own data sources in a strongly-typed, eagerly-inferred magical way! 

 

After this quick lap around type providers, Ian moves on to show us another well-used and very useful feature of F#, the MailboxProcessor.  A MailboxProcessor is also sometimes known as an “Agent” (this name is frequently used in other functional languages) and effectively provides a dedicated message queue.  The MailboxProcessor consists of a lightweight message queue (the mailbox) and a message handler (the processor).  For code interacting with the MailboxProcessor, it’s all asynchronous: code can post messages to the message queue asynchronously (or synchronously if you prefer).  Internally, however, the MailboxProcessor itself will only process its messages in a strictly synchronous manner and in a strict FIFO (First In, First Out) order, one message at a time.  This helps to maintain consistency of the queue.  Because the MailboxProcessor exposes its messages asynchronously (but maintains strict synchronicity internally), we don’t need to acquire locks when we’re dealing with the messages going in or coming out.  So, why is the MailboxProcessor so useful?

 

Well, Ian shows us a sample chat application that consists of simply posting messages to a MailboxProcessor.  The entire functionality of the chat application is contained within a single type/class:

 

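open System.Text

// "Agent" is a conventional alias for F#'s MailboxProcessor
type Agent<'T> = MailboxProcessor<'T>

type ChatMessage =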
  | GetContent of AsyncReplyChannel<string>
  | SendMessage of string

let agent = Agent<_>.Start(fun agent ->
  let rec loop messages = async {

    // Pick next message from the mailbox
    let! msg = agent.Receive()
    match msg with
    | SendMessage msg ->
        // Add message to the list & continue
        return! loop (msg :: messages)

    | GetContent reply ->
        // Generate HTML with messages
        let sb = new StringBuilder()
        sb.Append("<ul>\n") |> ignore
        for msg in messages do
          sb.AppendFormat(" <li>{0}</li>\n", msg) |> ignore
        sb.Append("</ul>") |> ignore
        // Send it back as the reply
        reply.Reply(sb.ToString())
        return! loop messages }
  loop [] )


agent.Post(SendMessage "Welcome to F# chat implemented using agents!")
agent.Post(SendMessage "This is my second message to this chat room...")

agent.PostAndReply(GetContent)
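
The listing above works with the bare agent, whereas the web server code later on talks to a room object that has SendMessage and AsyncGetContent members.  That wrapper isn’t reproduced in my notes, so the following is only my own sketch of what such a ChatRoom type might look like.  It reuses the same message type and loop; the two member names are taken from the later listing, while GetContent and everything else here is an assumption:

```fsharp
open System.Text

// "Agent" is a conventional alias for F#'s MailboxProcessor
type Agent<'T> = MailboxProcessor<'T>

type ChatMessage =
    | GetContent of AsyncReplyChannel<string>
    | SendMessage of string

// Hypothetical wrapper that hides the agent behind a small API
type ChatRoom() =
    let agent =
        Agent<ChatMessage>.Start(fun inbox ->
            let rec loop (messages: string list) : Async<unit> =
                async {
                    // Messages are handled one at a time, in FIFO order
                    let! msg = inbox.Receive()
                    match msg with
                    | SendMessage text ->
                        // Newest message goes to the front of the list
                        return! loop (text :: messages)
                    | GetContent reply ->
                        // Render the messages (newest first) as an HTML list
                        let sb = StringBuilder("<ul>\n")
                        for text in messages do
                            sb.Append(" <li>").Append(text).Append("</li>\n") |> ignore
                        sb.Append("</ul>") |> ignore
                        reply.Reply(sb.ToString())
                        return! loop messages
                }
            loop [])

    // Fire-and-forget: queue a new chat message
    member x.SendMessage(text: string) = agent.Post(SendMessage text)

    // Ask for the rendered content without blocking the caller
    member x.AsyncGetContent() = agent.PostAndAsyncReply(GetContent)

    // Blocking variant, handy in scripts
    member x.GetContent() = agent.PostAndReply(GetContent)

let room = ChatRoom()
room.SendMessage "Welcome to F# chat implemented using agents!"
room.SendMessage "This is my second message to this chat room..."
let html = room.GetContent()
```

Because the agent handles one message at a time, the list of messages never needs a lock, even if many callers post at once.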

 

Wrapped up in a single type (ChatRoom), this agent encapsulates all of the functionality required to “post” and “receive” messages from a MailboxProcessor – effectively mimicking the back and forth chat messages of a chat room.  Further code shows how this can be exposed over a webpage by utilising a HttpListener with another type:

 

let root = @"C:\Temp\Demo.ChatServer\"
let cts = new CancellationTokenSource()

HttpListener.Start
("http://localhost:8082/", (fun (request, response) -> async {
  match request.Url.LocalPath with
  | "/post" ->
      // Send message to the chat room
      room.SendMessage(request.InputString)
      response.Reply("OK")
  | "/chat" ->
      // Get messages from the chat room (asynchronously!)
      let! text = room.AsyncGetContent()
      response.Reply(text)
  | s ->
      // Handle an ordinary file request
      let file =
        root + (if s = "/" then "chat.html" else s.ToLower())
      if File.Exists(file) then
        let typ = contentTypes.[Path.GetExtension(file)]
        response.Reply(typ, File.ReadAllBytes(file))
      else
        response.Reply(sprintf "File not found: %s" file) }),
   cts.Token)

cts.Cancel()

 

This code shows how an F# type can be written to create a server which listens on a specific HTTP address and port and accepts messages to URL endpoints as part of the HTTP payload.  These messages are stored within the internal MailboxProcessor and subsequently retrieved to display on the webpage.  We can imagine two (or more) separate users with the same webpage open in their browsers and each person’s messages being echoed back to themselves as well as being shown in every other user’s browser.

 

Ian has actually coded up such a web application, with a slightly nicer UI, and rounds off his demonstrations of the power of the MailboxProcessor by firing up two separate browsers on the same machine (mimicking two different users) and showing how chat messages from one user instantly and easily appear in the other user’s browser.  Amazingly, there’s a minimum of JavaScript involved in this demo, and even the back-end code that maintains the list of users and the list of messages is no more than a few screens full!

 

Ian wrapped up his talk by recapping the power of both Type Providers and the MailboxProcessor, and how both techniques build upon your existing F# knowledge and make the consumption and processing of data incredibly easy.

 


After Ian’s talk it was time for the final announcements of the day and the prize give away!  We all made our way back to the main building, and to the largest room, the Tom Cowie Lecture Theatre.

 

After a short while all of the DDD North attendees, along with the speakers and sponsors, had assembled in the lecture theatre.  The main organiser of DDD North, Andy Westgarth, gave a short speech thanking the attendees and the sponsors.  I’d like to offer my thanks to the sponsors here also, because as Andy said, if it wasn’t for them there wouldn’t be a DDD North.  After Andy’s short speech a number of the sponsors took to the microphone to both offer their thanks to the organisers of the event and to give away some prizes!  One of the first was Rachel Hawley who had been manning the Telerik stand all day, and who led the call for applause and thanks for Andy and his team.  After Rachel had given away a prize, Steve from Tinamous was up to thank everyone involved and to give away more prizes.  After Steve had given away his prize, Andy mentioned that Steve had generously put some money behind the bar for the after-event Geek Dinner that was taking place later in the evening and that everyone’s first drink was on him!   Thanks Steve!

 

Steve was followed by representatives from the NDC Conference, a representative from Sage and various other speakers and sponsor staff, all giving a quick speech to thank the organisers and to say how much they’d enjoyed sponsoring such a great community event as DDD North.

 

Of course, each of these sponsors had prizes to give away.  Each time, Andy would offer a bag of the feedback forms which we’d submitted at the end of each session and the sponsor would draw out a winning entry.  As is usual for me, I didn’t win anything.  Lots of people did, however, and there were some great prizes on offer, including a stack of various books, some Camtasia software licenses and a complete copy of Visual Studio Premium with MSDN!

 

Andy then gave a final closing speech, thanking everyone again and telling us that, although there’s no confirmed date or location for the next DDD North, it will definitely happen and it’ll be in a North West location, as the intention is to alternate each time between a location in the North East and one in the North West in order to cover the entire “north” of England.

 

And with that, another fantastic DDD North event was over…   Except that it wasn’t.  Not quite yet!   Make It Sunderland and Sunderland Software City had agreed to host a “drinks reception” at the Sunderland Software City offices!  The organisers of DDD North had laid on a free bus transfer service for the short ride from Sunderland University to the Sunderland Software City offices, closer to Sunderland city centre.  Since I was in the car, I made the short 10-minute drive to the offices myself.  Of course, being in the car meant that my drinking was severely limited.

 

Around 80 of the 300+ attendees from DDD North made the trip to the drinks reception and we were treated to a small bar with two hand-pulled ales from the Maxim Brewery.  One was the famous Double Maxim and the other, Swedish Blonde.  Two fine ales and they were free all night long, for as long as the cask lasted (or at least for the 2 hours that the drinks reception event lasted)!

 

Being a big fan of real ales, it was at this point that I was kicking myself for having brought the car with me to DDD North.  I could have relatively easily taken the Metro train service from Newcastle to Sunderland, but alas, I was not to know this fantastic drinks reception would be so great or that there would be copious amounts of real-ale on offer.  In hindsight though, it was probably for the best that my ability to drink the endless free ale was curtailed!  :)


 

I made my way to a comfy seating area and was joined by Phil Trelford, who had given the first talk of the day that I attended and who I had been chatting with off and on throughout the day, and also Sean Newham.  Later we were joined by another guy whose name I forget (sorry).  We chatted about various things and had a really fun time.  It was here that Phil showed us an F# Type Provider that he and a friend had written in a moment of inspiration.  It mimics the old “Choose Your Own Adventure” style books from the 1980s by offering up the entire story within the Visual Studio IDE!

 

Not only were we supplied with free drinks for the evening, we were also supplied with a seemingly endless amount of nibbles, hors d'oeuvres and tiny desserts and cakes.  These were brought to us with alarming frequency and never seemed to end!   Not that I’m complaining… Oh no.  They were delicious, but there was a real fear that the sheer amount of these lovely nibbles would ruin everyone’s appetite for the impending Geek Dinner.

 

There’s a tradition at DDD events to have a “geek dinner” after the event where attendees that wish to hang around can all go to a local restaurant and have their evening dinner together.  I’d never been to one of these geek dinners before, but on this occasion, I was able to attend.  Andy had selected a Chinese buffet restaurant, the Panda Oriental Buffet, mainly because it was a very short walk from the Sunderland Software City offices, and also presumably because they use Windows Azure to host their website!

 

After the excellent drinks reception was finished, we all wandered along the high street in Sunderland city centre to the restaurant.  It took a little while for us all to be seated, but we were all eventually in and were able to enjoy some nice Chinese food and continue to chat with fellow geeks and conference attendees.  I managed to speak with a few new faces, some guys who worked at Sage in Newcastle, some guys who worked at Black Marble in Yorkshire and a few other guys who’d travelled from Leeds.

 

After the meal, and with a full belly, I bid goodbye to my fellow geeks and set off back towards my car, which I’d left parked outside the Sunderland Software City offices, to head back to what was my home for that weekend: my in-laws’ place in Newcastle, a relatively short drive (approx. 30-40 minutes) away.

 

And so ended another great DDD event.  DDD North 2013 was superb.  The talks and the speakers were superb, and Andy and his team of helpers had, once again, arranged a conference with superb organisation. So, many thanks to those involved in putting on this conference, and of course, thanks to the sponsors without whom there would be no conference.  Here’s looking forward to another great DDD North in 2014.    I can’t wait!

I’m now a Microsoft Certified Solutions Developer!

Ever since my last post about Microsoft Certification, I’ve been slowly beavering away to study and take the two remaining exams that would allow me to be recognised as a Microsoft Certified Solutions Developer: Web Applications.  I had passed the second exam back in April, and this past Saturday, I took the final exam required to gain the Microsoft Certified Solutions Developer (MCSD) certificate.  I’m pleased to say that I passed.

The MCSD: Web Applications certification is an interesting one and has allowed me to brush up my study and skills on a number of languages and technologies despite me using most of these day in and day out in my day job!

The first exam (70-480 – Programming in HTML5, JavaScript & CSS3) that I took back in February required some study of HTML5 and CSS3, although I’m using both of these technologies in work.  Additional study certainly helps, though, as I was probably only using a small amount of these technologies as I’m mainly working on legacy code.  Passing this exam, as well as being one of the three exams required for the MCSD certificate, came with its own certificate – a Microsoft Specialist certificate.

The second exam (70-486 - Developing ASP.NET MVC 4 Web Applications) was relatively simple.  This exam was all about building ASP.NET MVC 4 applications and this was an area that I had a good amount of knowledge and experience of.  I’m using ASP.NET MVC frequently within my day job when trying to improve our legacy code base one small piece at a time, and I’m also using ASP.NET MVC in some of my own software that I write for fun in my spare time.

The third and final exam (70-487 - Developing Windows Azure and Web Services) was the really interesting one of the three as this was all about web services (primarily using ASP.NET MVC Web API) but specifically using Windows Azure as the deployment platform.  I had only really started to scratch the surface of ASP.NET MVC Web API and, although I was aware of Windows Azure’s existence, I’d never used it at all.  Around 8 weeks prior to taking this exam, I started to look into studying more of ASP.NET MVC Web API and tried to implement some of its functionality in my own spare-time projects.  I was also somewhat fortunate that Microsoft happened to be running a free trial promotion with Windows Azure whereby you could receive $200 of free credit, lasting 1 month, to spend on any Azure services you liked.  This enabled me to not only read about and study Windows Azure from books, articles and blogs but to also get some real-world hands-on experience with the platform.  This was a big boon and I believe it helped me immensely when it came to taking this final exam.

So, having now acquired the MCSD certificate, where to from here?  Well, I’m not sure.  I’m very happy with the certification I’ve been able to obtain thus far, so I’ll enjoy that for the time being and think about my next move later in the year.

DDD East Anglia Conference Write-Up

This past Saturday, 29th June, saw the inaugural DDD East Anglia conference.  This is the latest addition to the DeveloperDeveloperDeveloper events that take place all over the UK and sometimes around the world!  I was there, this being only my second ever DDD event that I’d attended.

 

DDD East Anglia was set on the grounds of the extensive Cambridge University in a building called “The Hauser Forum”.  For those attendees like myself who were unfamiliar with the area or the university campus, it was a little tricky to find exactly where to go.  The DDDEA team did make a map available on the website, but the designated car park that we were to use was cordoned off when I arrived!  After some driving around the campus, and with the help of a fellow attendee who was in the same situation as myself, I managed to find another car park that could be used.  I must admit that better signposting for both the car parking and the actual Hauser Forum building would have helped tremendously here.  As you can imagine, Cambridge University is not a small place and it’s fairly easy to get lost amongst the myriad of buildings on campus.

 

The event itself had 3 parallel tracks of sessions, with 5 sessions throughout the day in each track.  As is often the case with most DDD Events, once the agenda was announced, I found myself in the difficult position of having to choose one particular session over another as there were a number of timeslots where multiple sessions that I’d really like to attend were running concurrently.  As annoying as this can sometimes be, it’s testament to the quality and diversity of the sessions available at the various DDD events.  East Anglia’s DDD was no different.

 

As I’d arrived slightly late (due to the car parking shenanigans) I quickly signed in and went along to seminar room 1 where the introduction was taking place.  After a brief round of introductions, we were off and running with the first session.  There had been some last minute changes to the published agenda, so without knowing quite where I wanted to be, I didn’t have to move from Seminar Room 1 and found myself in Dave Sussman’s session entitled, “SignalR: Ready For Real-Time”.

 

Dave started out by talking about how SignalR is built upon the notions of persistent connections and hubs on the server-side, and that hubs are built on top of persistent connections and simply offer a higher level of abstraction.  SignalR, at its most basic, is a server-side hub (or persistent connection) through which all messages and data flow, and this hub then broadcasts that data back to each connected client.  Other than this, and for the client-side of the equation, Dave tells us that SignalR is effectively one big jQuery module!

 

One of the complexities that SignalR wraps up and abstracts away from the developer is the requirement to determine the best communication protocol to use in a given situation.  SignalR uses Web Sockets as the default method of communication, if the client supports such a protocol.  Web Sockets are a relatively new protocol that provides true duplex communication between client and server.  This facilitates cool functionality such as server-side push to the client.  However, if Web Sockets are not available, SignalR will seamlessly “downgrade” the connection protocol to HTTP long-polling – which uses a standard HTTP connection that is kept alive for a long time in order to receive the response from the server.

 

Dave starts to show us a demo, which fails to work the first time.  Dave had planned this however, and proceeded to tell us about one of the most common problems in getting a simple SignalR demo application up and running: Adding the call to .MapHubs (a requirement to register the routes of the Hubs that have been defined on the server-side) after all of the other route registration has been done.  This causes SignalR to fail to generate some dynamic JavaScript code that is required for the client.  The resolution is simply to place the call to .MapHubs before any calls to the other MVC route registrations.

Dave tells us that SignalR doesn’t have to be in the browser.  We can create many other types of application (Console apps, WinForms apps etc.) that connect to the HubConnection object over HTTP, and then we can create a proxy in the client application that can send and receive the required messages over HTTP to the SignalR hub on the server!  Also, although it’s seen as an ASP.NET addition, SignalR on the client isn’t dependent upon the ASP.NET runtime.  You can use it, via JavaScript, in a simple standalone HTML page with just the inclusion of a few JavaScript files and a bit of your own JavaScript to create and interact with a jQuery $.hubConnection, which allows sending and receiving messages and data to the SignalR server-side Hub.

 

Although SignalR’s most used and default function on the Hub is the ability to broadcast a received message back to all connected clients (as in the ubiquitous “chat” sample application), SignalR also has the ability to have the server send messages to only one or more specific clients.  This is done by sending a given message to a specific client based upon that client’s unique ID, which the Hub keeps in an internal static list of connected clients.  Clients can also be “grouped” if needed so that multiple clients (but not all connected clients) can receive a certain message.  There’s no inherent state in the server-side Hub.  Therefore, things like the server-side collection of connected clients are declared as static to ensure state is maintained.  The Hub class itself is re-instantiated with each message that needs processing.  Methods of the Hub class are Tasks, so clients can send a message to the server hub and be notified sometime later when the task is completed (for example, if performing some long running operation such as a database backup).

 

We’re told that there’s no built-in persistence with SignalR, and we should be aware that messages can (and sometimes do!) get lost in transit.  SignalR can, however, be configured to run over a message bus (for example, Windows Azure Service Bus) and this can provide the persistence and improved guarantee of delivery of messages.

 

Finally, although the “classic” demo application for SignalR is that of a simple “chat” application with simple text being passed back and forth between client and server, SignalR is not restricted to sending text.  You can send any object!  The only requirement here is that the object be able to be serialized for communication over the wire.  The objects that are passed across the wire are serialized, by default, as JSON (internally using Newtonsoft’s Json.NET library).

 

After a quick break, during which there was tea, coffee and some lovely Danish pastries available, I was back in the same Seminar Room 1 to attend Mark Rendle’s session entitled, “The densest, fastest-moving talk ever”.  Mark’s session wasn’t really about any single topic, but consisted of Mark writing and developing a simple browser-based TODO List application that utilised quite a number of interesting technologies.  These were Mark’s own Simple.Web and Simple.Data, along with some TypeScript, AngularJS and Bootstrap!

 

As a result of the very nature of this session, which contained no slides, demos or monologue regarding a specific technology, it was incredibly difficult to take notes on this particular talk.  It was simply Mark, doing his developer thing on a big screen.  All code.  Of course, throughout the session, Mark would talk about what he was doing and give some background as to the what and the why.  As I’m personally unfamiliar with TypeScript and AngularJS, it was at times difficult to follow what Mark was doing and why he made the choices he did when choosing to utilise one or more of these technologies.  Mark’s usage of his own Simple.Web and Simple.Data frameworks was easier to understand, and although I’ve not used either of these frameworks before, they both looked incredibly useful and lightweight, allowing you to get basic database reading & writing within a simple web-based application up and running quite quickly.

 

After 30 minutes of intense coding including what appeared to be an immense amount of set-up and configuration of AngularJS routing, Mark can show us his application which is displaying his TODO items (from his previously prepared SQL Server database) in a lovely Bootstrap-styled webpage.  We’re only reading data at the moment, with no persistence back to the DB, but Mark spends the next 30 minutes plugging that in and putting it all together (with even more insane AngularJS configuration!).  By the end of the session, we do indeed have a rudimentary TODO List application!

 

I must admit that I feel I would have got a lot more from this session if I already knew more about the frameworks that Mark was using, specifically AngularJS which appears to be a rather extensive framework that can do everything you’d want to do in client-side JavaScript/HTML when building a web application.  Nonetheless, it was fun and enjoyable to watch Mark pounding out code.  Also, Mark’s inimitable and very humorous style of delivery made this session a whirlwind of information but really fun to attend.

 

Another break followed after Mark’s session with more tea, coffee and a smorgasbord of chocolate-based snacks positioned conveniently on tables just outside of each seminar room (more on the food later!).  Once the break was over, it was time for the final session of the morning before the lunch break.  This one was Rob Ashton’s “Outside-In Testing of MVC”.

 

Rob’s opening gambit in this session is to tell us that his talk isn’t really about testing MVC, but it’s about testing general web applications and will include some MVC within it.  A slight bait and switch, and clearly Rob’s style of humour.  He’s mostly a Ruby developer these days so there’s little wonder there’s only a small amount of MVC within the session!  That said, the general tone of the talk is to explore ways of testing web applications from the outermost layer – the User Interface – and how to achieve that in a way that’s fast, scalable and non-brittle.  To that extent, it doesn’t really matter what language the web application under test is written in!

 

Rob talks about TDD and how people trying to get started with TDD often get it wrong.  This is very similar to what Ian Cooper talked about in his “TDD, where did it all go wrong?” talk that he’s given recently in a number of places.  I attended Ian’s talk at a recent local user group.  Rob says he doesn’t focus so much on “traditional” TDD, and that having complete tests that start at the UI layer and test a discrete piece of functionality in a complete end-to-end way is very often the best of all testing worlds.  Of course, the real key to being able to do this is to keep those tests fast.

 

Rob says he’s specifically avoiding definitions in his talk.  He just wants to talk about what he does and how it really helps him when he does these very things.  To demonstrate, he starts with the example of starting a brand new project.  He tells us that if we’re working in brownfield application development, we may as well give up all hope!  :(

 

Rob says that we start with a test.  This is a BDD style test, and follows the standard “Given, When, Then” format.  Rob uses CoffeeScript to write his tests as its strict handling of white-space forces him to keep his tests short, to the point, and easily readable, but we can use any language we like for the tests, even C#.

 

Rob says he’s fairly dismissive of the current set of tools often used for running BDD tests, such as Cucumber.  He says it can add a lot of noise to the test script that’s really unnecessary and often causes the test script wording to become more detached and abstracted from what it is the test should actually be doing in relation to the application itself.  So we are asked the question, “What do we need to run our test?” – Merely a web browser and a web server!

 

In order to keep the tests fast, we must use a “headless” browser.  These are implementations of browser functionality but without the actual UI and chrome of a real web browser.  One such headless browser is PhantomJS.  Using such a tool allows us to run a test that hits our webpage, performs some behaviour – adds text to a textbox, clicks a button etc. – and verifies the result of those actions, all from the command line.  Rob is quick to suggest that we shouldn’t use PhantomJS directly as then our tests will be tightly coupled to the framework we’re running them within.  Rob suggests using WebDriver (part of the Selenium suite of web browser automation tools) in conjunction with PhantomJS as that will provide a level of abstraction, so that the tests are not tightly coupled to the browser (or headless browser) being used.  This level of abstraction is what allows the actual test scripts themselves to be written in any language of our choosing.  It just needs to be a language that can communicate with the WebDriver API.

 

Rob then proceeds to show us a demo of running multiple UI tests in a console window.  These tests are loading a real webpage, interacting with that page – often involving submission of data to the server to be persisted in some way – and asserting that some action has happened as a result of that interaction.  They’re testing the complete end-to-end process.  The first thing to note is that these tests are fast, very fast!  Rob is spitting out some simple diagnostic timings with each test, and each test is completing in approx. 500ms!

 

Rob goes on to suggest ways of ensuring that, when we write our tests, they’re not brittle and too closely tied to the specific IDs or layout of the page elements within the page that we’re testing.  He mentions one of the best tools to come from the Ruby world, Capybara.  Rob says that there’s a .NET version of Capybara called Coypu although it’s not quite as feature-complete as Capybara.  Both of these tools aim to allow intelligent automation of the browser testing process and help make tests readable, robust, fast to write with less duplication and less tightly coupled to the UI.  They help to prevent brittle tests that are heavily bound to UI elements.  For example, when instructed to fill in a “username” textbox, the tools try multiple ways to find it: first looking for the specific ID, then, if the ID is not found, intelligently looking for a <label for=”username”> and using the textbox associated with the label.  If that’s not found, the tool will then intelligently try to find a textbox that happens to be “near” to where some static text saying “Username” may be on the page.
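
To make that fallback idea a little more concrete, here is a small sketch of my own in F#.  It is emphatically not how Capybara or Coypu are implemented; it uses a made-up, toy model of a page (the Element type and the findTextBox function are purely hypothetical) just to show the “try the ID, then the label, then nearby text” ordering:

```fsharp
open System

// A toy model of a page: just enough structure to show the fallback order.
type Element =
    | TextBox of elementId: string
    | Label of forId: string * text: string
    | StaticText of text: string

// Case-insensitive text comparison
let sameText (a: string) (b: string) =
    String.Equals(a, b, StringComparison.OrdinalIgnoreCase)

// Returns the ID of the textbox to fill in, or None if nothing matches.
let findTextBox (locator: string) (page: Element list) : string option =
    // 1. A textbox whose ID matches the locator exactly
    let byId () =
        page
        |> List.tryPick (function
            | TextBox elementId when elementId = locator -> Some elementId
            | _ -> None)

    // 2. A label with matching text; use the textbox it points at
    let byLabel () =
        page
        |> List.tryPick (function
            | Label (forId, text) when sameText text locator -> Some forId
            | _ -> None)

    // 3. A textbox that directly follows some matching static text
    let byNearbyText () =
        page
        |> List.pairwise
        |> List.tryPick (function
            | StaticText text, TextBox elementId when sameText text locator -> Some elementId
            | _ -> None)

    // Each later strategy only runs if the earlier ones found nothing
    byId ()
    |> Option.orElseWith byLabel
    |> Option.orElseWith byNearbyText

let page =
    [ StaticText "Username"; TextBox "txt_01"
      Label ("pwd", "Password"); TextBox "pwd" ]

let foundById = findTextBox "pwd" page          // Some "pwd"
let foundByLabel = findTextBox "Password" page  // Some "pwd"
let foundByText = findTextBox "Username" page   // Some "txt_01"
let notFound = findTextBox "Email" page         // None
```

The point is that the test only ever says “Username” or “Password”; how the element is actually located can change along with the page without the test having to change.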

 

Rob suggests not to bother with “fast” unit tests.  He says to make your UI tests faster!  You’ll run them more frequently, and a fast UI test means a fast UI.  If we achieve this, we’re not just testing the functionality, but by ensuring we have a suite of UI tests that run very fast, we will by virtue of that, have an actual application and UI that runs very fast.  This is a win-win situation!

 

Rob proceeds to build up a demo application that we can use to run some tests against.  He does this to show us that he’s not going to concern himself with databases or persistence at this point – he’s only storing things in an in-memory collection.  Decisions about persistence and storage should come at the very end of the development as by then, we’ll have a lot more information about what that persistence layer needs to be (e.g. a document database, SQL Server for more complex queries etc.).  This helps to keep the UI tests fast!

 

Rob then proceeds to give us a few tips on MVC-specific development, and also about how to compose our unit tests when we have to step down to that level.  He says that our Controllers should be very lightweight and that we shouldn’t bother testing them.  You’ve got UI tests that cover that anyway.  He states that, “If your controller has more than one IF statement, then it shouldn’t!”.  Controllers should be performing the minimal amount of work.  Rob says that if a certain part of the UI or page (say a register/signup form) has complex validation logic, we should test that validation in isolation in its own test(s).   Rob says that ActionFilters are bad.  They’re hard to test properly (usually needing horrible mocking of HttpContext etc.) and they often hide complexity and business logic.  This logic is better placed in the model.  We should also endeavour to have our unit-level tests not touch any part of the MVC framework.  If we do need to do that, have a helper method that abstracts that away and allows the test code to not directly touch MVC at all.

 

To close, Rob gives us the “key takeaways” from his talk:  Follow your nose, focus on the pain and keep the feedback loop fast.  Slides for Rob’s talk are available here.

 

After Rob’s talk, it was time for lunch.  This was provided by the DDDEA team, and consisted of sandwiches, crisps, a drink and even more chocolate-based confectionery.  There was also the token gesture of a piece of fruit, I suppose to give the impression that there were some healthy items within there!

 

There was even the ability to sit outside of the main room in the Hauser Forum and eat lunch in an al-fresco style.  It was a beautiful day, so many attendees did just that.  The view from the tables on this balcony was lovely.

 

As is often the case at events such as these, there were a number of informal “grok” talks that took place over the lunchtime period.  These are usually 10-minute talks from any member of the audience that cares to get up and talk about a subject that interests them or that they’re passionate about.

 

Since I was so busy stuffing my face with the lovely lunch that was so kindly provided, I had missed the first of the grok talks.  I managed to miss most of the second grok talk, too, which was given by Dan Maharry about his experiences writing technical books.  As I only caught the very end of Dan’s talk, I saw only one slide upon which were the wise words, "Copy editors are good. Ghost-writers are bad."  Dan did conclude that whilst writing a technical manual can be very challenging at times, it is worth it when, three months after completing your book, you receive a large package from the publishers with 20-30 copies of your book in there with your own name in print!

 

The last grok talk, which I did catch, was given by Richard Dutton on life as a Software Team Lead at Red Bull Racing.  Richard spoke about the team itself and what kind of software they produce to help the Formula 1 team build a better and faster car.  Richard answered the question of “What’s it like to work in F1?”.  He said there are long hours and high pressure but it’s great to travel and see how the software you write affects the cars and the race first hand.

 

The Red Bull Racing development team is about 40 people strong.  About half of these have a MATLAB background rather than .NET/C#.  Richard’s main role is developing software for data distribution and analysis.  He writes software that is used on the race pit walls as well as being used back at HQ.  They can get a data reading from the car back to the IT systems in the HQ within 4 seconds from anywhere in the world!  The main data they capture is GPS data, Telemetry data and Timing data.  Within each of these categories of data, there can be thousands of individual data points that are captured.

 

Richard spoke about the team’s development methodology and said that they do “sort-of” agile, but not true agile.  It’s short sprints that align with the F1 race calendar.  There are debriefs after each race.  When 2 races are back-to-back on consecutive weekends, there’s only around 6 hours of development time between these two races!

 

The main language used is C# on .NET 4.5 (with VS2010 & VS2012, TFS and a centralized build system), and they mostly develop WPF applications (with some legacy WinForms stuff in there as well).  There’s also a lot of MATLAB.  They still have to support Windows XP as an OS as well as more modern platforms like Windows Phone 8.

 

After the lunch time and grok talk sessions were over, it was back to the scheduled agenda.  There were two more sessions left for the day, and my first talk of the afternoon was Ashic Mahtab’s “Why use DDD, CQRS and Event Sourcing?”

 

Ashic starts by ensuring that everyone is familiar with the terminology of DDD, CQRS and Event Sourcing.  He gives us the 10 second elevator pitch of what each acronym/technology is to ensure we know.  He says that he’s not going to go into detail of what these things are, but rather why and when you should use them.  For the record, DDD is Domain-Driven Design, CQRS is Command Query Responsibility Segregation and is about having two different models for reading vs. writing of data, and Event Sourcing is about not writing a lot of changes to a database record in one go, but to write the different stages of changes to the record over time.  The current state of the record is then derived by combining all the changes (like delta differences).
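
To make the event sourcing idea a little more concrete, here is a minimal sketch of deriving the current state of a record by replaying its changes.  This is my own illustration rather than anything Ashic showed: the customer events and the CustomerState type are hypothetical names, and it assumes C# 9 or later for records.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical events for a customer record; each one captures a single small change.
public abstract record CustomerEvent;
public sealed record CustomerRegistered(string Name, string Email) : CustomerEvent;
public sealed record EmailChanged(string NewEmail) : CustomerEvent;
public sealed record NameChanged(string NewName) : CustomerEvent;

// The current state is never stored directly; it is derived from the events.
public sealed record CustomerState(string Name, string Email)
{
    public static readonly CustomerState Empty = new CustomerState("", "");

    // Apply one event to produce the next state (the previous state is left untouched).
    public CustomerState Apply(CustomerEvent e) => e switch
    {
        CustomerRegistered r => new CustomerState(r.Name, r.Email),
        EmailChanged c => this with { Email = c.NewEmail },
        NameChanged n => this with { Name = n.NewName },
        _ => throw new ArgumentException($"Unknown event type: {e.GetType().Name}")
    };

    // Replay the whole stream, oldest first, to arrive at the current state.
    public static CustomerState Replay(IEnumerable<CustomerEvent> events) =>
        events.Aggregate(Empty, (state, e) => state.Apply(e));
}
```

Replaying a CustomerRegistered("Ann", "ann@example.com") event followed by an EmailChanged("ann@work.example") event gives a state whose name is still "Ann" but whose email is the newer address.  The events themselves are never modified, only appended to.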

 

He says that very often applications and systems are designed as “one big model”.  This usually doesn’t work out so well in the end.  Ashic talks about the traditional layered top-down N-Tier architecture and suggests that this is a bad model to follow these days.  Going through many layers makes no sense, and this is demonstrated especially well when looking at reading vs. writing data – something directly addressed by CQRS.  Having your code go through a layer that (for example) enforces referential integrity when only needing to read data rather than writing it is unnecessary as referential integrity can never be violated when reading data, only when data is being written.

 

Ashic continues his talk by discussing the notion of a ubiquitous language and that very often, a ubiquitous language isn’t ubiquitous.  Different people call things by different names.  This is often manifested between disparate areas of an enterprise.  The business analysts may call something by one name, whilst the IT staff may call the same thing by a different name.  We need to use a ubiquitous language, but we also need to understand that it’s often only ubiquitous within a “bounded context”.  A bounded context is a ring-fenced area where a single “language” can be used by everyone within that area of the enterprise and is ubiquitous within that context.  This provides a “delimited applicability of a particular model, gives team members a clear and shared understanding of what has to be consistent and what can develop independently”.

 

Ashic goes on to talk about the choices of the specific technology stack and how those choices can impact many areas of a project.  A single technology stack, such as the common WISA Microsoft-based stack for web applications (the Windows/Microsoft equivalent of the even more common LAMP stack) can often reduce IT expenditure within an enterprise, but those cost savings can be offset by the complexity of developing part of a complete system using a technology that’s not an ideal fit for the purpose.  An example may be using SQL Server to store documents or binary data, when a document-oriented database would be a much more appropriate solution.

 

Ashic tells a tale of his current client, who have only one big model for their solution, comprising around 238 individual projects in a single Visual Studio solution.  A simple feature change that was only 3 lines of code required the entire solution to be redeployed.  This in turn required testing/QA, compliance verification and other related disciplines to be re-performed across the entire solution even though only a tiny portion had actually changed.  The “one big model” had forced them into this situation, whereas multiple, separate models communicating between each other by passing messages in a service-oriented approach would have facilitated a much smaller footprint for deployment, and thus a smaller sized application that needed testing and verification.

 

Ashic tells us that event sourcing over a RESTful API is a good thing.  Although there’s the possibility of the client application dealing with slightly stale data, it’ll never be “wrong” data as the messages passed over this architecture are immutable.  Also, if you’re using event sourcing, there’s no need to concern yourself about auditing and logging, as all individual changes are effectively audited anyway by virtue of the event sourcing mechanism!  Ashic advises caution when applying event sourcing, and consideration should be given to where not to apply it.  If all you’re doing in a certain piece of functionality is retrieving a simple list of countries from a database table that perhaps contains only one or two columns, it’s overkill and will only cause you further headaches if applied.

 

He states that versioning of data is difficult in a relational database model.  You can achieve rudimentary versioning with a version column on your database tables, or a separate audit table, however this is rarely the best approach, nor does it deliver the best performance or design.  Event sourcing can significantly help in this regard, too, as rather than versioning an entire record (say a “customer” record which may consist of dozens of fields), you’re versioning a very small and specific amount of data (perhaps only a single field of the customer record).  The event sourcing message that communicates the change in the one field (or a small number of fields) effectively becomes the version itself as multiple changes to that field(s) will be sent over several different immutable messages.

 

The talk continues with an examination of many of the various tools and techniques that we use today: dependency injection, object mapping (with AutoMapper, for example) and aspect-oriented programming.  Ashic ponders whether these things are really good practice.  He questions not so much the techniques that (for example) a dependency-injection container performs, but whether we need the container itself at all.  He says that before DI Containers came along, we simply used the factory pattern and wrote our own classes to perform the very same functionality.  Perhaps where such techniques can be written by ourselves, we should avoid leaning upon third-party libraries.  After all, dependency injection can often be accomplished in as little as 15 lines of code!  For something a little more complicated such as aspect-oriented programming, Ashic uses the decorator pattern instead.  It’s tried and trusted, and doesn’t re-write the IL of your compiled binaries - something which makes debugging very difficult - like many AoP frameworks do.
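
As a rough sketch of what that might look like (my own example, not code from the talk), here is a logging decorator wired up by a hand-written factory instead of a DI container or an AoP framework.  IOrderService, OrderService, LoggingOrderService and OrderServiceFactory are all hypothetical names invented for the illustration.

```csharp
using System;
using System.Collections.Generic;

public interface IOrderService
{
    decimal PlaceOrder(string product, int quantity);
}

// The "real" implementation, containing only business logic.
public sealed class OrderService : IOrderService
{
    private readonly decimal _unitPrice;

    public OrderService(decimal unitPrice) { _unitPrice = unitPrice; }

    public decimal PlaceOrder(string product, int quantity) => _unitPrice * quantity;
}

// Decorator: adds logging around any IOrderService without touching its code or its IL.
public sealed class LoggingOrderService : IOrderService
{
    private readonly IOrderService _inner;
    private readonly Action<string> _log;

    public LoggingOrderService(IOrderService inner, Action<string> log)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _log = log ?? throw new ArgumentNullException(nameof(log));
    }

    public decimal PlaceOrder(string product, int quantity)
    {
        _log($"Placing order: {quantity} x {product}");
        var total = _inner.PlaceOrder(product, quantity);
        _log($"Order placed, total {total}");
        return total;
    }
}

// A hand-written factory doing the job of a DI container for this one object graph.
public static class OrderServiceFactory
{
    public static IOrderService Create(Action<string> log) =>
        new LoggingOrderService(new OrderService(unitPrice: 2.50m), log);
}
```

Calling OrderServiceFactory.Create(Console.WriteLine) hands back an IOrderService that logs before and after each order, and because the decorator is ordinary compiled code you can step through it in the debugger like anything else.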

 

Ashic concludes his talk by restating the general theme.  Don’t use “one big model” to design your system.  Create bounded contexts, each with their own model and use a service-oriented architecture to pass messages between these model “islands”.  The major drawback to using this approach to the design of your system is that there’s a fair amount of modelling work to do upfront to ensure that you properly map the domain and can correctly break that down into multiple discrete models that make sense.

 

After Ashic’s talk, there was the first afternoon break.  Each of the three seminar rooms shuffled out into the main hall area to be greeted with a lovely surprise.  There was a table full of delicious local cheeses, pork pies, grapes and artisan bread, lovingly laid on by Rachel Hawley and the generous folks at Gibraltar Software (Thanks guys! I think you can safely say that the spread went down very well with the attendees!)

 

So after we had all graciously stuffed our faces with the marvellous ploughman’s platter, and wet our whistles with more tea and coffee, it was time for the final session of the day.

 

The final session for me was Tomas Petricek’s “F# Domain-Specific Languages”.

 

Tomas starts out by mentioning that F# is a growing language with a growing community across the world - user groups, open source projects etc.  It’s also increasingly being used in a wide variety of companies across many different areas. Credit Suisse use F# for financial processing (perhaps the most obvious market for F#) but the language is also used by companies like Kaggle for machine learning and also companies like GameSys for developing the server-side components used within such software as Facebook games.

 

Tomas then demos a sample 3D domain-specific language (or DSL) that composes multiple 3D cylinders, cones and blocks to build ever more elaborate structures from the component parts.  He shows building a 3D “castle” structure using these parts that combines multiple functions from the domain-specific language, underpinned by F# functions.  He shows that the syntax of the DSL contains very little F# code, only requiring a small number of F# constructs when we come to combine the functions to create larger, more complex functions.

 

After this demo, Tomas moves on to show us how a DSL for European Call and Put stock options may look.  He explains what call and put options are (Call options are an agreement to buy something at a specific price in the future and Put options are an agreement to sell something at a specific price in the future) and he then shows some F# that wraps functions that model these two options.

 

Whilst writing this code, Tomas reminds us that everything in F# is statically typed, and that values are immutable by default.  He talks about how we would proceed to define a domain-specific language for any domain that we may wish to model and create a language for.  He says that we should always start by examining the data that we’ll be working with in the domain.  It’s important to identify the primitives that we’ll be using.  In the case of Tomas’ stock option DSL, his primitives are the call and put options.  It’s from these two primitive functions that further, more complex functions can be created by simply combining these functions in certain ways.  The call and put functions calculate the possible gains and/or losses for each of the two options (call or put) based upon a specific current actual price. Tomas is then able to “pipeline” the data that is output from these functions into a “plot” function to generate a graph that allows us to visualize the data.  He then composes a new function which effectively “merges” the two existing functions before pipelining the result again to create another graph that shows the combined result set on a single graph.  From this we can visualize data points that represent the best possible price at which to either buy or sell our options.

 

Tomas tells us that F# is great for prototyping as you’re not constrained by any systems or frameworks: you can simply write your functions that accept simple primitive input data, process that data, then output the result.  Further functions are then simply composed of those more basic functions, and this allows for very quick testing of a given hypothesis or theory.

 

For some sample F# code, Tomas models the domain first like so:

type Option =
| EuropeanPut of decimal
| EuropeanCall of decimal
| Combine of Option * Option

This is simply modelling the domain and the business language used within that domain.  The actual functionality to implement this business language is defined later.

 

Tomas then decides that the definition can actually be rewritten to something equivalent but slightly better like so:

type OptionKind = Put | Call

type Option =
| European of OptionKind * decimal
| Combine of Option * Option

He can then combine these two put/call options/functions like so:

let Strangle lowPrice highPrice =
    Combine
       (  European(Put, lowPrice),
          European(Call, highPrice)  )

Strangle is the name of a specific type of option in the real world of stock options and this option is a combination of call and put options that are combined in a very specific way.  The function called Strangle is now defined and is representative of the domain within which it is used.  This makes it a perfect part of the domain-specific language.

 

Tomas eventually moves on to briefly showing us a DSL for pattern detection.  He shows a plotted graph that can go up and down across the x axis and how we can use F#-defined DSL-specific functions to detect that movement up or down.  We start by defining the “primitives”.  That could be the amount of the movement (say, expressed in pixels or some other arbitrary unit we decide to use), and then a “classifier”.  The classifier tells us in which direction the movement is (i.e. up or down).  With these primitives defined, we can create functions that detect this movement based upon a certain amount of points that are plotted on our graph.  Although Tomas didn’t have time to write the code for this as we watched (we were fairly deep into the talk at this point with only a few minutes left), he showed the code he had prepared earlier running live on the monitor in front of us.  He showed how he could create multiple DSL functions, all derived from the primitives, that could determine trends of the movement of the plotted graph over time.  These included detection of:  Movement of the graph upwards, Movement of the graph downwards and even Movement of the graph in a specific bell curve-style (i.e. a small downwards movement, immediately followed by an upwards movement).  For each of these combined functions, Tomas was able to apply them to the graph in real-time, by simply “wiring up” the graph output - itself a DSL function, in this case a recursive one that simply returned itself with new data with every invocation – to the detection functions of the DSL.

 

At this point, Tomas was out of time, however what we had just seen was an incredibly impressive display of the expressiveness, the terseness, and the power of F# and how domain-specific languages created using F# can be both very feature rich and functional (pardon the pun!) with the minimum of code.

 

At this point, the conference was almost over.  We all left our final sessions and re-gathered in the main hall area to finish off the lovely cheese (amazingly there was still some left over from earlier on!) and wait whilst the conference organisers and Hauser Forum staff rearranged the seminar rooms into one big room in which the closing talk from the conference organisers would be given.

 

After a few minutes, we all shuffled inside the large room, and listened as the DDD organisers thanked the conference sponsors, the speakers, the university staff and finally the attendees for making the conference the great success that it was. And it was indeed a splendid event.  There were then some prizes and various items of swag to be given away (I didn’t win anything :( ) but I’d had a fantastic day at a very well organised and well run event and I’d learned a lot, too.  Thanks to everyone involved in DDDEA, and I hope it’s just as good next year!

Another Microsoft Certification acquired!

Ever since gaining my MCTS (Microsoft Certified Technology Specialist) and MCPD (Microsoft Certified Professional Developer) certificates at the end of 2011 and the early part of 2012 I’ve had an appetite to acquire more.  Life seemed to get in the way of this during 2012 so that was, unfortunately, a quiet year on the certification front.

 

Well, it’s now 2013, and Microsoft have recently revamped a lot of their certification offerings.  A new type of certification that they’ve introduced is that of a Microsoft Specialist.  The Microsoft Specialist certification seems to be a replacement for the old MCTS (Microsoft Certified Technology Specialist) and is effectively a certificate awarded for showing competence in a specific piece of Microsoft Technology, of which there are quite a number.

 

During the latter part of 2012 and the early part of 2013, Microsoft were running a promotion to take a free exam.  This was exam 70-480 – Programming in HTML5, JavaScript & CSS3.  Successfully passing this exam would award the exam taker the certification of Microsoft Specialist – Programming in HTML5, JavaScript & CSS3.

 

Well, towards the end of February of this year, I sat and successfully passed the exam acquiring the certification of Microsoft Specialist – Programming in HTML5, JavaScript & CSS3.

 

This is one of three exams that, when all three are successfully passed, will gain the new style Microsoft Certified Solutions Developer – Web Applications certificate.  I guess the rest of this year’s certification journey has just been mapped out!

Razor’s Conditional Attributes Bit Me!

When ASP.NET MVC 4 was released, Microsoft upgraded the Razor view engine that ships with ASP.NET MVC to version 2 and with it came a number of improvements. One of these improvements was a feature called “conditional attributes”.

 

Conditional Attributes are a new feature that allows you to shortcut “boilerplate” null check code when rendering an attribute to a HTML element. If you have a model property or a local variable that is used to output the “value” of a HTML element’s attribute that evaluates to NULL, the Razor engine will now automatically discard rendering the entire (empty) attribute.

 

Thus, whereas we’d previously have to do something like this:

<div @{ if(@Model.ClassName != null) { <text>class="@Model.ClassName"</text> } }>Content</div>

to ensure that, if @Model.ClassName was null, we wouldn’t render the entire class attribute, the new Conditional Attributes feature allows us to do this:

<div class="@Model.ClassName">Content</div>

and the Razor parser is smart enough to not render the class="" literal attribute text if @Model.ClassName evaluates to null.  So we don’t get this:

<div class="">Content</div>

But instead we get much cleaner markup like this:

<div>Content</div>

 

Razor’s conditional attributes also work similarly with boolean values, so you can for example, cleverly output checked attributes on an input element defined as a checkbox like so:

<input type="checkbox" checked="@IsChecked">

If @IsChecked evaluates to true, the checked attribute is rendered with a value which is the same name as the attribute:

<input type="checkbox" checked="checked">

However, if @IsChecked evaluates to false, the entire attribute is not rendered.  Andrew Nurse, a developer on Microsoft’s Razor team, has a great blog post about this and the other new features in Razor v2.

 

So, this is all well and good, however, there is a huge gotcha that you need to be aware of surrounding conditional attributes!  I discovered this when upgrading a project originally built in ASP.NET MVC 3 (which had Razor v1 and thus didn’t have the conditional  attributes feature) to ASP.NET MVC 4.  This previously working project suddenly developed bugs that weren’t there before.  Upon inspection, it was due to Razor’s new conditional attributes feature that introduced a breaking change in my code.

 

Basically, I had an ASP.NET MVC strongly-typed View that displayed a grid of data.  Part of the model for this view was an object used to hold some basic data relating to how the user had configured the grid.  This was simply non-sensitive data such as the number of records per page, the column name upon which the grid was sorted and the sort order (ascending or descending).  This was output to the View as a number of hidden input fields within the view’s form, such that they could be posted back to the server upon each page request:

<input type="hidden" name="pagesize" value="@Model.PagingInfo.ItemsPerPage" />
<input type="hidden" name="sortname" value="@Model.PagingInfo.SortName" />
<input type="hidden" name="sortasc" value="@Model.PagingInfo.SortAscending" />

The problem was within that last line.  The @Model.PagingInfo.SortAscending is a boolean that evaluates to true if the user wants to sort ascending, or false if the user wants to sort descending.  In ASP.NET MVC 3, this would work just fine with the resulting output looking something like:

<input type="hidden" name="sortasc" value="false" />

when the user had elected to sort descending.  However, upon upgrading the project to ASP.NET MVC 4, Razor v2’s conditional attributes feature saw that the @Model.PagingInfo.SortAscending model property was a boolean and that it evaluated to false, and decided not to render the value attribute at all, thus my output became:

<input type="hidden" name="sortasc" />

When the user had selected to sort in an ascending manner, and the value of the @Model.PagingInfo.SortAscending property evaluated to true, the output was even more strange:

<input type="hidden" name="sortasc" value="value" />

This was the “cleverness” of the Razor parser kicking in and outputting the attribute’s name as its value when my boolean property evaluated to true.  This makes lots of sense when we’re outputting a series of checkboxes and we want one of them to be checked, which requires the checked="checked" attribute to be added to the checked element, but not so much sense when we actually want to output the string “true” or “false” as a value attribute’s value in a hidden input form field!

 

So whilst the output each time was valid HTML markup with no errors being displayed, this clearly affected the functionality of my page.  POSTs of this page, which would cause the page to redisplay (for example, when the user selected a different sort column or a different sort order) would always result in the grid sorting in a descending manner, irrespective of the user’s choice.

 

This was due to the ASP.NET MVC model binder finding no suitable value with which to bind the sortasc parameter of the controller method that was invoked when the page was posted back to the server:

public ViewResult List(string searchterm, string sortname, bool sortasc = false, int id = Page, int pagesize = PageSize)

Of course, the sortasc parameter’s default value was then always used, resulting in the “grid is always descending” behaviour!

 

This was an interesting bug to hunt down within my code, and was also a particularly annoying one too, as the code had worked perfectly in ASP.NET MVC 3.  However, once it was discovered how and why this bug reared its head, it was also simple enough to fix.

 

The fix for this is to simply append .ToString() to all boolean variables/model properties that are used purely to render a true or a false attribute’s value on a HTML element.

 

Thus, my above code was fixed quite simply like so:

<input type="hidden" name="pagesize" value="@Model.PagingInfo.ItemsPerPage" />
<input type="hidden" name="sortname" value="@Model.PagingInfo.SortName" />
<input type="hidden" name="sortasc" value="@Model.PagingInfo.SortAscending.ToString()" />

The addition of the .ToString() forces the evaluation of the boolean and its resulting conversion to a string prior to the Razor engine’s parser being able to work its “conditional attribute” cleverness.  It simply results in the boolean’s string value being output as the attribute’s value every time, like this:

<input type="hidden" name="sortasc" value="True" />

So, whilst this issue didn’t manifest itself for me until I upgraded an older ASP.NET MVC 3 project to ASP.NET MVC 4, it’s quite feasible that a developer could write code like this from scratch in MVC 4 and expect the Razor parser to simply output the value of the boolean as the attribute value.  There’s an open case for this in the ASP.NET Web Stack issue tracker on CodePlex and whilst there is a simple enough workaround for the problem, it’s the “breaking change” nature of the issue that is most concerning.

 

Let’s be careful out there and remember to .ToString() our booleans!

DeveloperDeveloperDeveloper North 2 Conference Write-up

 

This past weekend, on Saturday 13th October, the 2nd Developer Developer Developer North conference was held at the University of Bradford.  I attended the conference, which was my first Developer Developer Developer (DDD) event ever, and it was a cracker!

 

DeveloperDeveloperDeveloper events are a series of conferences held around the UK and in some locations abroad focused primarily at Microsoft/.NET Developers.  The conferences are free to attend and are made possible by the support of a wonderful set of sponsors.

 

The University Of Bradford was a great venue for the conference.  It had plenty of space and rooms available to accommodate the DDDNorth event which had 5 parallel tracks of talks and 5 sessions throughout the entire day.  Each session was an hour in length with 15 minute breaks in-between the 2 morning sessions and the 3 afternoon sessions.  There was a catered lunch provided free of charge to attendees during the generous 1.5 hour lunch time.

 

As there were 5 parallel tracks of sessions, it was often a difficult choice to pick just one session to attend.  This was especially true for myself within the 2nd session time-slot, where I really wanted to attend all 5 parallel talks!  Unfortunately, I had to only pick one.

 

The first talk I attended was Garry Shutler's "10 Practices that make me the developer I am today".  This was a talk aimed at more "entry level" developers, but I thought I'd attend to see if there were a few nuggets of wisdom that I perhaps didn't know.

 

Garry told us that standards matter, although what they are, doesn't.  Using StyleCop to help enforce standards across your team can help to keep consistency and there's even integration into ReSharper to help with this.  Garry also tells us of the importance of Code Reviews, although much of their value comes when they're done at a "story" (as in, an Agile User-Story) level rather than at a more granular level.  We should learn constantly as no-one else cares about our own personal learning (for both our current and future jobs) and we shouldn't wait for our employers to do this for us.  We should learn new languages, especially ones that are significantly different from those that we use every day.  It's a big investment, but worth it as concepts and paradigms in one language can help us understand similar concepts in other languages.  To help with our learning, we should leverage the experience of other developers around us.  It's only obvious once you know!  Testing and Automation are a huge help in getting a faster development feedback loop and allow us to effectively "go faster" in our development by correcting our course more frequently.  Within our code, we should trust no-one by ensuring we always implement preconditions to prevent such annoyances as pesky "Null Reference Exceptions" when we were expecting an object to be passed to us, and we should also log excessively, not just errors and exceptions, but everything.  This is a big help when trying to debug issues on production environments where a real debugger can't be used.

 

The next talk of the day was Liam Westley's "Async C# 5.0 - Patterns for Real World Use" which was a great talk about the Asynchronous programming features introduced into C# 5.0 with the async and await keywords.  Liam's talk specifically focused on the WhenAny and WhenAll methods that perform functionality when working with sets of tasks, and used a concept of copying or downloading music files of varying formats to demonstrate the versatility of the various Async methods.

 

Liam tells us that we should return a Task instead of a void, where we would have previously done so.  This gives the caller information about what's happened (or happening) during our method.  He then goes on to tell us about the use of the WhenAll method for dealing with lists (i.e. List<Task<string>>) of tasks.  Further processing can happen when all tasks have completed, as all tasks are important here.  Next up, Liam tells us about the very useful and versatile WhenAny method which can be used in numerous ways.  One of which was maintaining a limited batch of a specific number of files when copying/downloading large numbers of files.  The WhenAny method detects completions, removes them from the batch and replaces them with a new file copy task, thereby effectively throttling the downloads to a certain batch size.  WhenAny was also used to show its usefulness in redundancy.  This can be used for competing services where the first to return wins. For example, downloading multiple versions/formats of the same music file, and automatically playing the one that's downloaded first. Other files will continue to download in the background, but can be cancelled if required, needing only a single Cancellation Token for all outstanding tasks.  This technique can also be used for early bailout when we want to cancel outstanding tasks based upon notifications from one or more completed tasks.  The final interesting usage of WhenAny was for interleaving. Liam's demo here showed music files and their associated md5 hashes being downloaded.  Once both the music file and the associated .md5 hash file had downloaded, a hash check could be computed on the file pair.  Very clever stuff indeed!  Liam ended his talk with a mention of a Microsoft white-paper that he recommended us all to download for further reading, "Task-Based Asynchronous Pattern" by Stephen Toub.  Oh, and a rickroll.
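
The throttling use of WhenAny is probably the easiest of these to sketch.  What follows is my own minimal reconstruction of the idea rather than Liam's demo code: the Throttled class and its RunAsync method are hypothetical names, while Task.WhenAny is the real framework method.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class Throttled
{
    // Runs 'work' for every item, keeping at most 'batchSize' tasks in flight at once.
    public static async Task<List<TResult>> RunAsync<TItem, TResult>(
        IEnumerable<TItem> items, Func<TItem, Task<TResult>> work, int batchSize)
    {
        if (batchSize < 1) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var results = new List<TResult>();
        var inFlight = new List<Task<TResult>>();

        foreach (var item in items)
        {
            // Batch is full: wait for any one task to finish and free up its slot.
            if (inFlight.Count >= batchSize)
            {
                var finished = await Task.WhenAny(inFlight);
                inFlight.Remove(finished);
                results.Add(await finished); // awaiting again unwraps the result (or rethrows)
            }

            inFlight.Add(work(item));
        }

        // Drain whatever is still running once the input is exhausted.
        while (inFlight.Count > 0)
        {
            var finished = await Task.WhenAny(inFlight);
            inFlight.Remove(finished);
            results.Add(await finished);
        }

        return results;
    }
}
```

With a batch size of 3 and a list of file URLs, this would start three downloads, then start a new one each time any of the running ones completes.  Results come back in completion order rather than input order, which suits the "copy a lot of files" scenario but would need changing if ordering mattered.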

 

After a short break, the following talk was Gemma Cameron's "BDD - Look Ma! No frameworks".  Gemma promised an interactive session with this one, all geared around Behaviour-Driven Development (BDD) and design, and we weren't disappointed.  A very interesting talk that caused all the attendees of the talk to have a real good think about how we would document the behaviour of buying an apple!

 

Gemma started by asking us to think about why we test. Testing creates good code because it creates a good design for our code.  When we test first, we're forced into making our production code testable.  We should beware of retro-fitting tests to existing code that perhaps was not written using a test-first approach.  This has the potential to "bake-in" bad code by fitting a (passing) unit test around it!  Gemma goes on to say that BDD has often been called "TDD Done Right" and in many ways this is true.  BDD is about us as developers asking "Why?" rather than asking "How?". As developers, we're good at solving problems and thinking about how we might implement a solution, but it's very easy for us to lose sight of why a feature is implemented when viewed from a business requirement perspective.  BDD helps and, if done correctly, forces us to consider the why of a feature's business requirements, because those requirements are documented right inside our BDD tests.  Writing our test code in an expressive language that is understandable not just by developers but also by business analysts, project managers and product owners brings developers closer to product owners and includes all the other roles in between.

 

BDD isn't really about unit tests, it's about working from the top-down instead of from the bottom-up. This means we start with the abstract feature requirements (the why?) and gradually drill down into the specifics (the how?) of how we'll implement that within our code.  A popular starting point for organising our BDD tests is to use the GWT (Given When Then) syntax, for example: Given [initial context], when [event occurs], then [ensure some outcomes]. Gemma does point out that although GWT can be a useful starting point, it's often restrictive as it doesn't always lend itself to best expressing our requirements.

 

Gemma's session continued with all attendees attempting to put this into practice by writing a test that correctly and sufficiently expresses our requirements around purchasing an apple!  Our first collective attempt, which followed the GWT syntax, was something along the lines of:

public void BuyAnApple()
{
   GivenAppleCosts50();
   AndThatIHaveOneAppleInMyBasket();
   WhenICheckout();
   ThenTotalIs50();
}

Note our use of the GWT syntax for expressing our requirements.  After some further discussion and reflection, we eventually arrived at a much better test:

public void ShopKeeperSellsAnApple()
{
   AppleCosts50();
   CustomerHasAppleInTheirBasket();
   WhenICheckout();
   ThenTotalIs50();
}

Note that here we’ve dispensed with the formal GWT syntax, instead preferring a more natural language to express our requirements and actual real-world behaviour.
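For a flavour of what "no frameworks" can look like when the steps actually do something, here's a small hand-rolled sketch.  It's in JavaScript rather than the C# we used in the session, and the shop model (createShop and its step names) is entirely hypothetical, invented just so the steps have something to drive.

```javascript
// Hypothetical shop model, invented purely for this sketch.
function createShop() {
  const prices = {};
  const basket = [];
  let total = null;
  return {
    // Each step reads as a plain-English phrase and hides the "how".
    appleCosts(pence) { prices.apple = pence; return this; },
    customerHasAppleInTheirBasket() { basket.push("apple"); return this; },
    customerChecksOut() {
      total = basket.reduce((sum, item) => sum + prices[item], 0);
      return this;
    },
    totalIs(expected) {
      if (total !== expected) {
        throw new Error("Expected total " + expected + " but got " + total);
      }
      return this;
    },
  };
}

// The test itself is nothing more than the behaviour, spelled out.
function shopKeeperSellsAnApple() {
  createShop()
    .appleCosts(50)
    .customerHasAppleInTheirBasket()
    .customerChecksOut()
    .totalIs(50);
}

shopKeeperSellsAnApple(); // throws if the behaviour is not met
```

The point is that the step names carry the documentation; the plumbing underneath is ordinary code with no BDD tooling at all.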

 

In reference to the title of her talk, Gemma was quite down on the use of frameworks to help with the process of writing BDD tests, preferring instead to "hand-code" all of the test syntax.  Frameworks can be helpful, but there's the potential to focus (or lean) too much on the tool or framework rather than getting the behaviour and requirements documented in the common language.  It's this language that we need to work on improving, as this will act as our product documentation both for ourselves as developers and for the business people.  It's this language that's invaluable in letting us know what we did, and why, when we return to that code in 6 months' time!

 

After this we had a very nice lunch with sandwiches, fruit and chocolate, all provided by the generous sponsors of the event.  There were a number of Grok Talks during lunch.  These are short 10-15 minute ad-hoc talks given by various attendees of the event.  Unfortunately, I didn't get to attend any of the Grok Talks as I was far too busy stuffing my face!  :)

 

The first session after lunch was Rob Ashton's "Javascript Sucks And It Doesn't Matter!".  Rob's talk was advertised as being controversial, and he didn't disappoint!  Pretty much straight out of the blocks Rob tells us that he doesn't use semi-colons in his JavaScript code, something that he's sure would really annoy Douglas Crockford, the man who's trying to get us all to write more "correct" JavaScript.  Rob goes on to say it's specifically because it'd get up Douglas Crockford's nose that he doesn't use semi-colons!

 

Rob tells us that JavaScript is a "broken" language in that it's dynamic, allows much flexibility with its syntax (witness Rob eschewing semi-colons), and will happily perform the strangest type-coercion in order to allow you to add apples to oranges.  Rob goes on to say that, despite JavaScript's fast-and-loose approach, this doesn't really matter as we've got some very nice tools at our disposal to help "keep JavaScript sane".  JSLint & JSHint are both very useful code-quality tools that allow us to have our JavaScript inspected for potential problems including unsafe comparisons, accidentally declaring global variables (rather than locally-scoped ones), un-strict code and stylistic correctness amongst other things.  We can even run these tools from the command line as part of a continuous integration process with the use of node.js, which is effectively JavaScript for the server.
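As a quick illustration of the "apples to oranges" coercion (my own examples, not ones from Rob's slides), the snippet below shows a few results that tend to surprise people, along with the strict comparison that avoids the coercion altogether.  The variable names are just for this sketch.

```javascript
// + concatenates when either operand is a string...
const concatenated = "5" + 3;      // "53"
// ...but - only works on numbers, so the string is converted instead.
const subtracted = "5" - 3;        // 2
// Loose equality coerces both sides before comparing.
const looseResult = (0 == "");     // true
// Strict equality compares type as well as value, so nothing is coerced.
const strictResult = (0 === "");   // false

console.log(concatenated, subtracted, looseResult, strictResult);
```

The loose comparison on the third line is the sort of "unsafe comparison" that a linting tool can be asked to flag.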

 

Rob says that our JavaScript code should be tested just as much as we would test our C# code, and to this end, a tool called Zombie comes in very handy.  Zombie is a "headless browser" which is itself written in JavaScript and allows us to put our JavaScript code through its paces from a command-line-driven, continuous testing process.  Rob says that we should try to avoid automating a real browser, as we would do with Selenium, as this is a much slower process.  Testing with Zombie is fast and provides that quick feedback loop that we get with a speedy continuous testing process, allowing a more rapid develop/debug cycle.

 

Of course, it's not all about leaning on the tools to prevent bad JavaScript code from being written.  Rob highlights the fact that we need good discipline to write good code.  At this point, Rob also wanted to address the thorny issue of TypeScript.  TypeScript is a new language from Microsoft that is a superset of JavaScript.  It's advertised as "application-scale" JavaScript, compiles to native plain-old JavaScript, and attempts to add some static typing and better object-oriented support to the JavaScript language.  Rob suggests that TypeScript is largely "application-scale" Marketing rather than anything else and points out that much of what TypeScript gives us can be achieved without the need for a new "language".

 

We all know that JavaScript files and functions can be included within other JavaScript files, simply by referencing them, however, this quickly gets unwieldy and unmaintainable.  Moreover, the ordering of how JavaScript files are loaded is often very important.  This reminds me of my own struggles with developing in "Classic" ASP/VBScript many years ago and just how easy it can be to tie yourself up in knots when including one file within another (which usually, in turn, includes further files) and managing that complex chain of inclusion.  Tools such as RequireJS can help to bring order to this inclusion madness and attempt to make JavaScript more modular by making it easier to ensure the relevant functionality has been loaded and brought into scope before attempting to execute code that will rely upon it.  Other tools such as CommonJS offer similar functionality but also go further by offering additional functionality in the areas of Modules, Packages and Promises (which greatly simplify asynchronous programming).
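To show the underlying idea (and only the idea), here's a toy module registry.  This is not RequireJS's or CommonJS's actual API; defineModule and requireModule are hypothetical names for this sketch.  Each module declares what it depends upon, and the loader makes sure those dependencies have been run before the module that needs them, regardless of the order in which things were registered.  It makes no attempt to handle circular dependencies.

```javascript
// Hypothetical registry: module name -> { deps, factory, exports, loaded }.
const registry = new Map();

// Register a module along with the names of the modules it depends on.
function defineModule(name, deps, factory) {
  registry.set(name, { deps, factory, exports: undefined, loaded: false });
}

// Resolve a module, running its dependencies' factories first (once each).
function requireModule(name) {
  const mod = registry.get(name);
  if (!mod) {
    throw new Error("Unknown module: " + name);
  }
  if (!mod.loaded) {
    const resolvedDeps = mod.deps.map(requireModule);
    mod.exports = mod.factory(...resolvedDeps);
    mod.loaded = true;
  }
  return mod.exports;
}

// Registration order no longer matters: "app" is defined before "maths".
defineModule("app", ["maths"], (maths) => ({ run: () => maths.double(21) }));
defineModule("maths", [], () => ({ double: (n) => n * 2 }));

console.log(requireModule("app").run()); // 42
```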

 

A simple example of how RequireJS helps to achieve its goal is seen here:

var otherFile = require("./otherFile.js");
otherFile.doSomething();

A similar concept to this is available within TypeScript, too:

import otherFile = module("otherFile");

This simplification and improvement of managing required/included files is slated to be included within ECMAScript Version 6, currently code-named "Harmony".  ECMAScript is the standardized version of the JavaScript language, ratified by ECMA (European Computer Manufacturers Association).  However, when Version 6 will finally be released is anyone's guess!

 

The last session of the day was Ian Cooper’s “Event-Driven Architecture”.  This was a fairly heavyweight talk from Ian that gave us a deep-dive into the concepts and best-practices around architecting an application in an event-driven (or service oriented) approach.

 

Ian started by reminding us of the 4 tenets of service orientation:

 

  • Boundaries are explicit
  • Services are autonomous
  • Services share schema and contract, not class
  • Service compatibility is determined based on policy

 

Event-driven, or service oriented, architecture is a set of design principles that, much like object-orientation, help us to architect an application that is composed of individual services.  Services are autonomous “mini-applications” that perform some discrete function.  The entire application is composed of many of these services that will talk to each other via message passing.  There are explicit boundaries between these services and each service passes all of the data required to the next service in the chain in order for that service to perform its function.  Services are effectively “black-boxes” that share nothing of their internal workings or state, and assert their requirements, constraints and capabilities via a public schema or contract.

 

Ian goes on to discuss the various types of inter-service communication that are available with Service Oriented Architecture.  The simplest type is “Request-Reply”.  This is something we’re all familiar with as it’s exactly how the world-wide web works.  We (the client) request something.  We wait whilst the server composes and sends back to us its reply.  A slightly better approach that avoids the need for the client to “wait” for the response is something known as “Request-Reaction”.  Here, the requestor (client) no longer has to wait for the data to be sent back.  After an initial acknowledgement of the request by the server, the client can receive the data at a later point in time.  Here, the requestor usually “polls” for the result, but can instead be informed when the result is available (i.e. similar to a callback function).
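Here's a rough sketch of the “Request-Reaction” shape with polling.  Everything in it is hypothetical (createServer, the ticket numbers, and the processPending method that stands in for the server getting round to the work at some later time), and it all runs in a single process purely so the flow is easy to follow.

```javascript
// Hypothetical server: acknowledges immediately, does the work later.
function createServer() {
  const pending = [];
  const results = new Map();
  let nextTicket = 1;
  return {
    // Request: queue the work and acknowledge straight away with a ticket.
    request(payload) {
      const ticket = nextTicket++;
      pending.push({ ticket, payload });
      return ticket;
    },
    // Stands in for the server processing its backlog at some later point.
    processPending() {
      while (pending.length > 0) {
        const { ticket, payload } = pending.shift();
        results.set(ticket, payload.toUpperCase());
      }
    },
    // Reaction: the client asks later whether its result is ready.
    poll(ticket) {
      return results.has(ticket)
        ? { ready: true, value: results.get(ticket) }
        : { ready: false };
    },
  };
}

const server = createServer();
const ticket = server.request("hello");
console.log(server.poll(ticket).ready); // false: acknowledged but not yet processed
server.processPending();
console.log(server.poll(ticket).value); // "HELLO"
```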

 

The next type of communication is “Inversion of communication”.  This allows a main service to push all of its events or messages to an external central message queue.  Consumers who are interested in those events can then simply “listen” to that queue.  This helps to reduce the need for systems or services to “know” about each other, thereby reducing coupling between autonomous services even further.  This is helpful as part of the overall architecture as the more one system needs to “know” about another one, the harder it is to integrate those systems into a cohesive whole.  In this “event publishing” scenario, the publisher of the event doesn’t need to know anything about the consumers of those events!  Ian used an example of a hotel’s internal systems with a main “reservation” system publishing a “Reservation Made!” event or message to the external message queue.  A “Room Cleaning” service, watching the central message queue for “Reservation Made!” events, can subsequently pick up that message and arrange for room service personnel to clean the room pending the hotel guest’s arrival.
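The hotel example can be sketched in a few lines.  This is only an in-memory stand-in for a real message queue, and the names (createMessageQueue, the "ReservationMade" message type, the room number and date) are all made up for illustration.

```javascript
// Hypothetical in-memory message queue; a real system would use a proper broker.
function createMessageQueue() {
  const listeners = new Map(); // message type -> array of handlers
  return {
    subscribe(type, handler) {
      if (!listeners.has(type)) {
        listeners.set(type, []);
      }
      listeners.get(type).push(handler);
    },
    publish(message) {
      // The publisher never sees who (if anyone) is listening.
      (listeners.get(message.type) || []).forEach((handler) => handler(message));
    },
  };
}

const queue = createMessageQueue();
const cleaningSchedule = [];

// "Room Cleaning" service: only knows about the queue and the message schema.
queue.subscribe("ReservationMade", (message) => {
  cleaningSchedule.push({ room: message.room, cleanBy: message.arrivalDate });
});

// "Reservation" service: publishes the event and carries on.
queue.publish({ type: "ReservationMade", room: 101, arrivalDate: "2012-10-13" });

console.log(cleaningSchedule);
```

Notice that the reservation side never references the cleaning side; adding a third interested service is just another subscribe call.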

 

Ian continued by talking about Messages and Events.  What is an event?  Well, an event is simply a message.  Messages are the data that passes into and out of our services, in a format or schema defined by the service, and allow the service to perform functions based upon that data.  Messages can be either thin or fat (see below for further information) and are usually communicated along a non-durable channel (which essentially means the messages are not persisted to disk) for high throughput.  Messages are passed along Channels and Queues.  Channels allow the passing of messages between services and usually operate in a real-time manner.  Some channels can act as Queues, and queues will often persist messages (thus acting as a durable channel) allowing long-term storage and delaying of messages.  A central message queue would most likely take this approach to its handling of messages.  Crucially, channels should only operate on one type of message, using separate channels when different message types need to be passed around.

 

Often, channels and queues will operate in conjunction with related services such as a routing service or a transformation service.  Routing services will ensure messages are routed to the correct destination – often determined by a business process or workflow (see orchestration details below), whilst transformation services will ensure messages can be converted (transformed) from one type to another.  This can often involve adding additional data to the message, removing extraneous data from the message that’s no longer required, or it could simply mean changing the schema of the message from one schema to another.

 

Ian then talked about the concept of Reference Data which is the data used by the services themselves.  This data can be either private data, which is used by the service itself internally, or public data, which is the data that services will pass around within their messages.  Often the public data can be delivered in two different ways.  Ian talked about thin messages and fat messages.  A thin message doesn’t include all of the data that subsequent services may require from the originating service in order to do their jobs.  They are given a small amount of data (keeping the message “thin”) and told where they can go to retrieve further data that they may require.  This usually involves going back to the originating service with a request for that additional data.  On the other hand, messages can be “fat” and include all of the possible data that subsequent services may require in order for them to perform their functions.  They may even have more data available to them than they need.  There are pros and cons to both message types.  Thin messages create a need for services to perform extra communication in order to retrieve additional data and this in turn requires services to “know more” about each other.  Fat messages avoid this extra communication but create bigger messages and may introduce security issues by publicly exposing data that may not be required by the downstream services that will operate on the messages.
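The trade-off is easier to see side by side.  In this sketch the reservation record, its field names and the lookupReservation function are all hypothetical; the lookup stands in for a call back to the originating service.

```javascript
// Hypothetical data held privately by the originating reservation service.
const reservations = {
  42: { id: 42, guest: "A. Guest", room: 101, arrivalDate: "2012-10-13", cardLast4: "1234" },
};

// Fat message: everything a downstream service might need (and more).
function toFatMessage(reservation) {
  return { type: "ReservationMade", ...reservation };
}

// Thin message: an identifier plus where to ask for the rest.
function toThinMessage(reservation) {
  return { type: "ReservationMade", id: reservation.id, detailsFrom: "reservation-service" };
}

// Stand-in for a thin-message consumer calling back to the originating service.
function lookupReservation(id) {
  return reservations[id];
}

const thin = toThinMessage(reservations[42]);
const fat = toFatMessage(reservations[42]);

console.log(lookupReservation(thin.id).room); // 101, but only after an extra round trip
console.log(fat.room);                        // 101, already in the message
console.log("cardLast4" in fat);              // true: data a room cleaner never needed
```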

 

Ian continued by talking about Sagas.  Sagas are like long-running transactions that pass through multiple services.  Ian was keen to point out that actual transactions should never cross service boundaries. This would defeat the concept of services being autonomous and having explicit boundaries.  It’s quite possible, though, that business processes and workflows will indeed be composed of several discrete steps performed in a specific order, where each of these steps has its functionality provided by a separate service.  The alternative approach to a service-spanning transaction is to raise additional messages (such as a “Reservation Failure” event) that interested consumers must listen for and respond to appropriately.

 

When many different services are required as part of a long-running process, we utilise an orchestration service to manage the “flow” of the messages.  Orchestration will ensure that the correct events or messages are passed to the appropriate service at the appropriate time, and helps to define the steps of a business process.  An orchestration service effectively knows all about the various services involved in a long-running “story” so that the services themselves don’t have to!
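Pulling the saga and orchestration ideas together, here's a small sketch of an orchestrator that walks a reservation through several steps and raises a failure message rather than relying on a cross-service transaction.  The step names, the runReservationSaga function and the "ReservationCompleted" message are my own inventions for illustration; only the “Reservation Failure” idea comes from the talk.

```javascript
// Hypothetical orchestrator: it alone knows the order of the steps.
function runReservationSaga(steps, reservation, publish) {
  const completed = [];
  for (const step of steps) {
    const ok = step.handle(reservation);
    if (!ok) {
      // No cross-service transaction to roll back: raise a message instead.
      publish({ type: "ReservationFailure", failedAt: step.name, completed, reservation });
      return false;
    }
    completed.push(step.name);
  }
  publish({ type: "ReservationCompleted", completed, reservation });
  return true;
}

// Each step stands in for a separate, autonomous service.
const steps = [
  { name: "reserveRoom", handle: () => true },
  { name: "takePayment", handle: (r) => r.cardValid },
  { name: "scheduleCleaning", handle: () => true },
];

const published = [];
runReservationSaga(steps, { room: 101, cardValid: false }, (m) => published.push(m));

console.log(published[0].type, published[0].failedAt); // ReservationFailure takePayment
```

Because the failure message lists the steps that had already completed, any service that did work earlier in the story has what it needs to respond appropriately.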

 

Ian had warned us at the beginning that there was a lot of information to cover in his talk, and at this point we had, unfortunately, run out of time.  It was time for all of the attendees to gather in the main hall for the grand prize draws.  We’d each been given a raffle ticket earlier in the day, and now was the time that we would see if we’d won one of the many, many prizes being given away at the end of the day.  Unfortunately, I didn’t win a thing – although the raffle tickets with numbers either side of mine were called out!  This didn’t matter, though, as I’d had a brilliant day at a very well run event and listened to some impressive speakers talking about incredibly interesting subjects.

 

Overall, I really enjoyed my first DDD event and I can’t wait until next year to be able to attend DDD North (and hopefully some of the other DDD events around the UK) again!

Website Security Presentation

At my place of work, we have “Thursday Tech Talks” that take place every Thursday afternoon over lunch.  The talks vary in subject matter, ranging from specific technologies through to more general subjects that are relevant to software development and IT in general.  Sandwiches and Muffins are kindly provided by the company so the talks are very well attended - who doesn’t want free sandwiches and muffins?  :)

 

The talks are given by members of staff, and each staff member can offer to talk about a subject that interests them on a company-wide Trello board that we have for just this purpose.  The Trello entries are then voted upon by other staff members, and every week, the person with the highest votes gets to give a presentation to other members of the company who wish to attend.

 

Well, this past week, it was my turn.  I’d opted to give a talk on Website Security and general security concepts and goals.  We had quite a large turn out for what was my very first time presenting (No pressure then!), and it was both flattering and petrifying giving my inaugural presentation to such a large audience, but it was also very exciting to do.

 

As mentioned at the talk, I’m making my slides and notes available for all to view and download.  I’ve used Google Docs Presentations (surprisingly good these days!) for the slides and notes, so they’re viewable online by visiting the following link:

 

Website Security Presentation – Slides & Notes

 

The entire presentation can be downloaded in a variety of formats (PDF, PPTX (Microsoft PowerPoint) etc.) by going to the File > Download As menu.

 

Enjoy!

JavaScript / jQuery IntelliSense in Visual Studio 2012

I blogged a while ago about a rather ugly and hacky way in which you could get the goodness of jQuery (and general JavaScript) IntelliSense in the Razor editor in Visual Studio 2010’s IDE.

 

This basically involved placing code similar to the following into every MVC view where you wanted IntelliSense to show up:

 

@* Stupid hack to get jQuery intellisense to work in the VS2010 IDE! *@
@if (false)
{
   <script src="../../Scripts/jquery-1.6.2-vsdoc.js" type="text/javascript"></script>
}

 

Well, since the release of Visual Studio 11 Beta, and the recent release of Visual Studio 2012 RC (Visual Studio 2012 is now the formal name of Visual Studio 11) we now no longer have to perform the above hack and clutter up our MVC views in order to enjoy the benefits of IntelliSense.

 

In Visual Studio 2012 (hereafter referred to as VS2012) this has been achieved by allowing an additional file to be placed within the solution/project which will contain a list of “references” to other JavaScript files that all MVC views will reference and honour.

 

The first step to configuring this is to open up VS2012’s Options dialog by selecting TOOLS > OPTIONS from the main menu bar:

 

image

 

Once there, you’ll want to navigate to the Text Editor > JavaScript > IntelliSense > References options:

 

image

 

The first thing to change here is the Reference Group drop-down list, where you should select Implicit (Web).  Doing this shows the list of references and included files within the Implicit (Web) group, as shown below the drop-down. Implicit (Web) includes a number of .js files that are included with VS2012 itself (and are located within your VS2012 install folder), but it also includes the following, project-specific entry:

 

~/Scripts/_references.js

 

Of course, this is all configurable, so you can easily add your own file in here in your own specific location or change the pre-defined _references.js, but since ASP.NET MVC is based around convention over configuration, let’s leave the default as it is!  Click OK and close the options dialog.

 

Now, what’s happened so far is that as part of the pre-defined Implicit (Web) reference group, VS2012 will look for a file called _references.js within a web project’s ~/Scripts/ folder (the ~/ implying the root of our web application) and use the references that are defined within that file as other files that should be referenced from within each of our MVC views automatically.

 

So, the next step is to add this file to one of our MVC projects in the relevant location, namely the ~/Scripts/ folder.  Right-click on the Scripts folder and select Add > New Item:

 

image

 

Once the Add New Item dialog is displayed, we can add a new JavaScript File, making sure that we name the file exactly as the default pre-defined Implicit (Web) reference group expects the file to be named:

 

image

 

 

The format of the _references.js file follows the JScript Comments Reference format that has been in Visual Studio since VS2008.  It’s shown below:

 

/// <reference path="path to file to include" />

 

You can add as many or as few “references” within the _references.js file as you need.  Bear in mind, though, that the more files you add in here, the more it may negatively impact the performance of the editor/IDE as it’ll have far more files that it has to load and parse in order to determine what IntelliSense should be displayed.   A sample _references.js file is shown below:

 

image

 

The format/syntax of the references within this file can take a number of forms.  You can directly reference other JavaScript files without needing a path if they’re in the same folder as the _references.js file (as the example above shows):

 

/// <reference path="jquery-1.6.3.js" />

 

You can use relative paths which are relative from the folder where the _references.js file is located:

 

/// <reference path="SubfolderTest/jquery-1.6.3.js" />

 

And you can also use paths that are relative to your web project’s “root” folder by using the special ASP.NET ~ (tilde) syntax:

 

/// <reference path="~/Scripts/SubfolderTest/jquery-1.6.3.js" />

 

Once this is configured and saved, you will now have lovely IntelliSense within your MVC Views without needing additional hacky script references from within the view itself.  See the screen shot below:

 

image

 

Yep, that’s the entirety of the view that you can see there (no @if(false) nonsense!), and that’s lovely jQuery IntelliSense being displayed as soon as you start typing $( !

 

Update (10/Feb/2013): Changed the screenshots to use a nicer elliptical highlighting and tweaked some references ever-so-slightly (i.e. change jQuery from 1.6.2 to 1.6.3)