DDD North 2020 In Review


Saturday 04 Apr 2020 at 16:35
Conferences  |  conferences development dotnet ddd

On Saturday 29th February 2020, the 9th annual DDD North conference took place, held as it was last year at the University of Hull.


Having missed out on last year's conference, I was very much looking forward to this one, and I was even attending as a speaker, giving the same talk as I had at DDD East Anglia the previous year.

I'd travelled eastwards to Hull on the evening before the event and stayed in a local B&B. On the Saturday morning, I was able to get up bright and early, drive to the event, and be there just in time for registration to open.

After registering at the main desk and informing the organisers that I was there to speak, I was given my speaker's polo shirt and directed to the speaker's room on the first floor of the same building. After quickly changing into my new purple shirt, I grabbed my bag and headed downstairs for a coffee. My own session wasn't until after lunch, in the penultimate slot of the day, so I was able to catch most of the other session slots throughout the day. After filling my coffee cup, I headed off to find my first session of the day.

A more flexible way to store your data with MongoDB

Kev Smith


Kev starts by giving us an introduction to the origins of MongoDB. The database was first created by two developers, Eliot Horowitz and Dwight Merriman, who had previously worked together at DoubleClick. Eventually, they also founded a company called 10gen, which was the vehicle for providing support for the MongoDB database. With the success of MongoDB, 10gen was ultimately renamed to MongoDB Inc.

MongoDB is a non-relational, NoSQL database engine. It's a document database. MongoDB supports strong typing of properties within documents and offers a large number of types, out-of-the-box, for developers to use. It stores documents in a binary format called BSON, which is a binary representation of a JSON object.

MongoDB is a schemaless database. Developers can create schemas, if desired, and these can be used for data validation when attempting to write data into a MongoDB database, however, a schema is not required. This means that each document inside a MongoDB collection (a collection is analogous to a relational-database table) can have an entirely different schema.
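
As a minimal sketch of what that optional validation looks like (the collection and field names here are my own illustration, not from Kev's talk), a validator can be attached when creating a collection:

// create a collection with an optional $jsonSchema validator
db.createCollection("people", {
    validator: {
        $jsonSchema: {
            bsonType: "object",
            required: ["name", "email"],
            properties: {
                name:  { bsonType: "string" },
                email: { bsonType: "string" }
            }
        }
    }
})

Documents that fail the validator are rejected on write, whereas a collection created without one will accept documents of any shape.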

Unlike a relational database, MongoDB doesn't allow joins between collections. This is a common restriction within document databases. MongoDB does allow a mechanism called a "lookup" as part of its built-in aggregation framework, however, this is an expensive operation causing entire additional documents to be loaded for each source document, so is best avoided if possible. MongoDB's document-centric approach is designed so that you can keep all necessary and related data together in one single document.

When we insert documents in MongoDB, we will receive a response indicating if the write was successful or not. As part of the write request, we can specify a "write concern" for the operation. This tells MongoDB to only report a successful write if our write concern is met. Write concerns allow specifying such things as a minimum number of nodes in a MongoDB cluster that must have the data written to disk before a successful write is reported to the client. For example, in a 3-node cluster, we can specify that writes are only considered successful if at least 2 of the 3 nodes have persisted the data to disk.

MongoDB allows writing many documents in one request/response round-trip with the insertMany command, and we can also add the ordered property to this command to ensure that all documents are inserted in the order specified in the operation. All MongoDB documents have a property called _id which acts as the unique identifier for that document.
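
As a rough sketch of what that looks like in the mongo shell (the document contents and write concern values here are purely illustrative):

db.events.insertMany(
    [
        { "_id": 1, "name": "DDD North", "date": ISODate("2020-02-29") },
        { "_id": 2, "name": "Another event", "date": ISODate("2020-06-01") }
    ],
    {
        "ordered": true,                        // insert in the given order, stopping at the first error
        "writeConcern": { "w": 2, "j": true }   // report success only once 2 nodes have journaled the write
    }
)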

MongoDB has a built-in text search which Kev tells us is OK, but not as good as a Lucene-based search. If we're using MongoDB via the cloud-based Atlas offering, we'll get the Lucene-based search add-on included, and it'll be correctly set up and configured for us automatically.
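
For the plain built-in search, the usual pattern (collection and field names assumed for illustration) is to create a text index and then query it with the $text operator:

// create a text index over the "description" field
db.events.createIndex({ "description": "text" })

// then search it using the $text operator
db.events.find({ "$text": { "$search": "mongodb conference" } })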

When we're reading documents from MongoDB, we can "project" to new document shapes as part of the query. This allows us to retrieve only a subset of specific document properties from the original document. As part of all read queries, we'll automatically get the _id property returned in the output even if we don't explicitly ask for it. We can turn this off, however. If using the .find() command to query for documents, multiple clauses within the command will be combined using a logical AND, for example, this query:

db.events.find(
    {
        "date": {
            "$gte": ISODate("2020-01-01"),
            "$lt": ISODate("2020-12-31")
        }
    }
)

will ensure that we're only returning documents whose date is on or after 2020-01-01 and before 2020-12-31.
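
And, as mentioned above, a second argument to .find() lets us project to a smaller document shape, including suppressing the _id property (field names here are my own illustration):

db.events.find(
    { "date": { "$gte": ISODate("2020-01-01") } },
    { "name": 1, "date": 1, "_id": 0 }    // return only name and date, suppressing the automatic _id
)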

We can use the .updateOne() command to update a single document. We can optionally add an upsert attribute to the command to perform an "upsert" operation, which creates a new document if a matching one does not exist, or updates the document if a matching one does exist.
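
A minimal sketch of an upsert in the mongo shell (the filter and fields are my own illustration):

db.events.updateOne(
    { "name": "DDD North" },                          // filter to match an existing document
    { "$set": { "venue": "University of Hull" } },    // fields to set
    { "upsert": true }                                // insert a new document if no match is found
)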

By default, updates in MongoDB only apply to one document. We need to be explicit if we want our update commands to potentially affect multiple documents at once. For matching against specific, single documents, we can pass in the _id attribute. This works for deletes also.
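
Sketched out (with illustrative field names and a made-up _id value), that looks like:

// update every document matching the filter, not just the first
db.events.updateMany(
    { "year": 2020 },
    { "$set": { "reviewed": true } }
)

// target one specific document by its unique _id (works for deletes too)
db.events.deleteOne({ "_id": ObjectId("5e5a1c9b8f1b2c0007a1d9e3") })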

MongoDB includes an "aggregation pipeline" which allows us to transform and summarise data into new shapes. It is based upon the concept of a data-processing pipeline and consists of a series of specific stages:

$match -> $unwind -> $group -> $project

The $match stage will match documents based upon specific criteria. It's effectively a "find". The $unwind stage allows creating multiple documents from each individual input document and is used when the input document contains arrays of data. The $group stage allows for the grouping of the resulting documents from the previous stage in the pipeline based on some specific criteria. Finally, the $project stage allows projecting to a new result document with its own shape.
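
Putting those stages together, a pipeline might look something like the following (the events collection and its speakers array are assumptions of mine for illustration):

db.events.aggregate([
    { "$match":   { "date": { "$gte": ISODate("2020-01-01") } } },    // filter the input documents
    { "$unwind":  "$speakers" },                                       // emit one document per array element
    { "$group":   { "_id": "$speakers", "talks": { "$sum": 1 } } },    // group by speaker, counting talks
    { "$project": { "speaker": "$_id", "talks": 1, "_id": 0 } }        // reshape the final output documents
])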

Kev moves on to look at transactions. He says that, since v4 of MongoDB, we do have transactions and these are a standard part of the database engine. Kev does say, however, that we shouldn't really use transactions to wrap writes to multiple documents, and that we should always endeavour to keep our transactions scoped to a single document. If we find ourselves feeling the need to wrap multiple documents in a single transaction, it's usually a good indicator that the documents are modelled incorrectly.
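
For completeness, a multi-document transaction in the mongo shell looks roughly like the following sketch (it requires a replica set deployment, and the database, collection and fields here are purely illustrative):

var session = db.getMongo().startSession()
session.startTransaction()
try {
    var events = session.getDatabase("test").getCollection("events")
    events.insertOne({ "name": "New event" })
    events.updateOne({ "name": "Old event" }, { "$set": { "archived": true } })
    session.commitTransaction()    // both writes become visible together
} catch (error) {
    session.abortTransaction()     // neither write is applied
    throw error
}

As Kev says, though, reaching for this pattern is often a sign the documents should be remodelled so the related data lives in a single document.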

The MongoDB engine includes indexes for documents. Collections can have both primary and secondary indexes. An index can be based on a single document attribute, can be a compound index based on multiple document attributes, or can be a multikey index based upon an attribute that contains multiple values from an array.
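
In the mongo shell, those three index types might look like this (field names are my own illustration):

db.events.createIndex({ "date": 1 })                 // single-field index on one attribute
db.events.createIndex({ "date": 1, "name": 1 })      // compound index over multiple attributes
db.events.createIndex({ "speakers": 1 })             // multikey index, as "speakers" holds an array of values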

We look back at writes to a MongoDB cluster and learn that within any cluster there is a single "primary" node and potentially many "secondary" nodes. All writes to the database go to the primary node, with that data then being replicated by MongoDB to any and all secondary nodes.

We can also shard data, which allows for splitting a large dataset across multiple nodes. Shards can be created based on hashed, ranged or zonal properties. Hashed shards use a hashed index of a specified document attribute to partition data across the nodes in the cluster. Range-based sharding involves dividing data into contiguous ranges based upon some specified document attribute (e.g. surnames could be split into ranges of A-M and N-Z). Finally, zone-based sharding involves grouping other sharded data into zones based upon the shard key, allowing isolation of specific subsets of data. Sharding ensures all writes and reads are directed to the appropriate node for where that data will reside. It's important to note, however, that once a database has been sharded, it cannot be un-sharded.
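
As a rough sketch of those options in the mongo shell (the database, collections and shard keys are my own examples):

// sharding must first be enabled for the database
sh.enableSharding("conferences")

// hashed sharding: partition documents by a hash of the shard key
sh.shardCollection("conferences.events", { "_id": "hashed" })

// ranged sharding: partition documents into contiguous ranges of the shard key (e.g. surnames A-M, N-Z)
sh.shardCollection("conferences.attendees", { "surname": 1 })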

At this point, Kev shows us some demos with the MongoDB daemon (mongod) running on his laptop and using the Mongo shell (mongo) to execute some commands and queries directly against the local database. This includes creating a new database, creating new collections and inserting and retrieving data from those collections.

Finally, Kev walks us through some misconceptions that MongoDB has suffered from over the years and addresses those concerns. One popular misconception is that "Mongo loses data", which was frequently levelled at MongoDB in the early days; however, since v1.2 of MongoDB (released way back in December 2009), default write concerns have changed to ensure that acknowledged writes are written to disk on at least one node. And we can always strengthen those write concerns ourselves to further mitigate the risk of data loss.

Another popular misconception is "Mongo gets hacked", and it's true that there have been many reported incidents of people's MongoDB databases being hacked, however, these were almost all databases placed onto the public internet with no authentication restrictions applied, allowing anyone who finds the database to connect to it and interact with it. As such, this isn't really an issue with MongoDB, but with incredibly lax security practices. Modern versions of MongoDB are restricted to only allowing connections from localhost by default and come with plenty of warnings to ensure that appropriate authentication credentials are configured.

After Kev's session was over it was time for a quick break and some refreshment refills back in the communal building where the registration had taken place earlier. And after suitable caffeination, it was time to head off to the next session.

Deploying globally resilient websites

Callum Whyte


Callum starts by setting the scene of how he was tasked with building a "Global AI community" website in conjunction with Microsoft. The site, being worldwide in scope and availability and being part of Microsoft, had to be fast all over the world and had to run on Azure.

We first take a look at some of the traditional ways of scaling a website to handle global load and see that, at the simplest, we can put a load balancer in front and balance load across multiple machines in the backend. One problem with this approach is that the load balancer becomes a single point of failure. It's also likely that both the load balancer and the machines serving up the website are all in the same data centre. We also need to consider how clients will be balanced across the backend machines, as, if there's state required by the websites, we'll likely have to use sticky sessions to ensure clients always get routed to the same backend machine for a given session. Azure can provide the above solution pretty much seamlessly and even includes the ability to define "slots" within an App Service Web App. These slots facilitate blue/green deployments of new versions of the website.

The next step beyond this is balancing between sets of machines in two different regions in the cloud. We would have a load balancer and backend webserver machines in each region and direct traffic to a region via DNS. We can then easily take this one step further and deploy into different regions in different areas of the world. This is geo-redundancy and protects against natural disasters in a specific area of the world. This also helps with latency as we can direct clients to their geographically closest data centre. One downside of this approach is that we can no longer use sticky sessions since we're now balancing at the DNS level, rather than the load balancer level.

To solve this problem, Callum moves on to introduce a new service within Azure called Azure Front Door. This service is available in all Azure regions across the world, and is a globally deployed gateway which handles all routing, load balancing and other scaling concerns from a single place. Front Door underpins other Azure services such as API Gateway.

It has an optional content-delivery network (CDN) and Web Application Firewall (WAF) built in and can provide SSL termination at the edge. All traffic using Front Door runs through Microsoft's own internal network, and this means that Microsoft can control all of the network hops to get from the client to the server that will ultimately serve the request. You can use Front Door and gain the benefits of internal network routing even if your ultimate server isn't in Azure. Your traffic will be routed through the internal Microsoft network, using private DNS for the internal network hops, in order to get to the closest public egress point to reach the required server on public DNS.

Callum shows us a demo of configuring a simple website and Azure Front Door to control routing and load balancing. The serving up of the backend application from the closest configured region is done automatically by Azure Front Door, and for deployments, we can simply deploy over the top of an existing application version. This can be performed in a rolling manner, taking down and upgrading each of the backend applications in turn so as to ensure no downtime for clients.

Front Door handles all of this and takes care of removing server/application instances from the available "pool" before upgrades, returning them to the pool once they're upgraded and ready to serve traffic again. It's also possible to implement a true staging environment that can be deployed to and have Front Door manage switching between the staging and production environments in a true blue/green deployment manner.

After Callum's session, it was time for another refreshment break in the main building. Once suitably refreshed, we headed off to find the required building for the final session of the morning before lunch.

Container-based web app development made easy on Azure App Service

Andrew Westgarth


There had been a last minute change to the schedule as one of the speakers could, unfortunately, no longer make it. Instead, we had the pleasure of welcoming back to DDD North the man who had started it all in the first place, Andy Westgarth. Now working for Microsoft and living in Seattle, Andy had been on a business trip to Europe immediately prior to the weekend of this DDD North event. With some time to spare, Andy came along to fill the gap left by the speaker who couldn't attend.

Andy's talk was all about Azure's App Service and, specifically, how we can leverage the service using containers such as Docker. App Service is a high-productivity service in Azure and is fully managed by Azure. Azure App Service supports .NET Core, Python, PHP, Java and Node.js. These are the "first-class citizens" within the service, but more are supported. App Service primarily runs on Linux. It has many deployment options, such as a simple zip deploy, deploying from a git repository, GitHub integration and even deployment from a Mercurial repository (but only on Windows). There are .war and .jar options for Java, along with OneDrive, Dropbox and FTP integrations.

For containers, we can pull them in from Docker Hub, Azure Container Registry or a private container registry. App Service does support Windows containers (running Windows Server 2019), however, Linux containers are greatly preferred. The smallest size of a Windows container is around 300MB, which Andy tells us is actually smaller than many Java images. App Service, being managed by Azure, can't have any dependencies installed on it, so these have to be provided inside the containers. Andy tells us to be careful if using Windows containers and to pay close attention to the "layering" within the container itself. Not removing redundant layers when building a container image is the number one contributor to excessive image size.

Andy shows us some demos of building some sample images, uploading them to Docker Hub and then, having configured Azure App Service to deploy from Docker Hub, the image is automatically deployed to App Service once completely uploaded to Docker Hub.

Andy talks more about the differences between Windows and Linux containers. Windows containers use Hyper-V isolation which means that the Operating System kernel is inside the container. Linux containers operate in a "shared kernel" mode, so the kernel is shared between both the host and the containers. This is a slightly less secure mode as attacks on the host can more easily escalate to containers (and vice-versa). We learn that Kubernetes has forthcoming support for Hyper-V isolation for containers and that Hyper-V isolation can also be used for LCOW (Linux Containers on Windows): https://docs.microsoft.com/en-us/virtualization/windowscontainers/deploy-containers/linux-containers

One downside to Azure App Service is that there's no current support for multi-container orchestrations, however, this functionality could be provided in the future.

We look at deployment slots, which we can leverage to maintain uptime. We can even set traffic percentages to be directed to different slots (i.e. only send 20% of users to a secondary slot, with 80% of users directed to the primary slot). When configuring deployment slots in the Azure Portal, there's a "Deployment Slot Setting" checkbox in the Add/Edit Application settings. Checking this ensures environment and application configuration settings are not swapped when swapping slots. If unchecked, the application settings are swapped between slots along with the application code itself, which is probably not what you want! Andy reminds us that we shouldn't have lots of different slots in a single instance of an App Service, since all slots share the backend compute resources. If we want to use slots to mimic development, staging or production environments, it's far better to have different App Service instances, each in their own separate App Service Plan.

Finally, Andy tells us how App Services include a "hybrid connections" capability. This means that you can deploy an application to Azure App Service whilst keeping the database on-premises. This is an alternative to setting up something like ExpressRoute to connect your on-premises network with the Azure network.

Once Andy's session was over, it was time to head back to the main building once more for lunch.

Lunch


After Andrew's talk, it was time for lunch. The lunch provided on this occasion was excellent, with a large selection of freshly prepared food, both hot and cold, offered to the attendees. This included plenty of options for the veggies amongst us. This was very different from the usual brown bag lunch of a sandwich and some crisps that is so often offered at such events. Don't get me wrong, I have nothing against the brown bag lunch, and appreciate that the ability to provide lunch is down to the funding provided by event sponsors, but this particular lunch was a very pleasant surprise, and very well received by all.


An Introduction To Domain-Driven Design

Craig Phillips

After lunch was over, it was that time of the afternoon when it was my turn to speak. I was delivering the same talk as I'd given at DDD East Anglia last year, an Introduction To Domain-Driven Design.

Obviously, I don't have any notes from giving my own talk, but a fellow speaker, Ian Johnson, captured it in some brilliant sketchnotes (thanks Ian!).

After my session was over, I headed back to the speaker's room to unwind, grab some refreshments and pack away some of my equipment. I finally made it back to the separate building to catch the last session of the day, albeit having missed the first few minutes.

Logging and Alerting with Application Insights and Azure Log Analytics

Steve Spencer


Steve's talk looks at Azure's Application Insights, how we can use it as a sink for our application logs, and how we can query the resulting logs to get intelligent information about our application's performance.

We look at the query language used to query the Application Insights logs. This is called the Kusto Query Language (KQL) and is similar in structure to SQL, but does contain some differences. We can learn all about the language on the Microsoft Docs site, but a simple KQL query might look something like the following:

MyEvents
| where TimeGenerated > ago(7d)
| summarize count() by IPAddress, bin(TimeGenerated, 1h)
| render timechart 

We can see that KQL includes some handy helpers when querying logs. Our where clause is based on a timestamp that's included in the event, and the ago(7d) criterion tells the query to only include events whose TimeGenerated property falls within the last 7 days. The summarize clause is roughly equivalent to a GROUP BY clause in SQL and this, too, provides some helpers, for example the bin function, which rounds values to a specific increment, aiding the grouping and summarizing. Finally, the render operator in the KQL query allows the query results to be visualized in a number of different ways, including various charts, tables, cards etc.

One handy tip for KQL queries is that a query can use the parse_json function, which allows querying within, and pulling specific properties out of, a complex JSON object stored inside a single Application Insights log column.
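
As a rough example, assuming a column called Properties that holds a JSON payload (these names are my own illustration, not from Steve's talk), such a query might look like:

MyEvents
| extend payload = parse_json(Properties)    // turn the JSON string column into a dynamic object
| where toint(payload.statusCode) >= 500     // filter on a property inside that object
| project TimeGenerated, payload.requestPath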

Steve looks at Azure Monitor. He tells us that every service inside of Azure has a dependency on Azure Monitor, meaning that Azure itself can leverage Azure Monitor to ensure the health of its own services. Of course, we can use Azure Monitor too in order to create alerts based upon specific events. Azure Monitor works with KQL queries, similar to Application Insights logs, to continually query and monitor our applications based upon the logs and diagnostic events that we emit from our application. Azure Monitor allows for the creation of dashboards that we can inspect to see real-time data from our applications, and allows for the configuration of alerts and automated actions based on defined criteria.

Steve shows some demos of deploying a simple application into Azure that generates some Application Insights logs and then shows how we can create alerts within Azure Monitor based off queries on those logs with specific criteria for triggering the alert.

Steve gives us links to a number of blog posts that he's written going into further detail on the setup and configuration of both Application Insights and Azure Monitor.

After Steve's session was over, it was time to head to the main lecture hall for the final wrap-up of the day.


When all the attendees had gathered in the main lecture hall, the organisers thanked everyone involved for making the event what it was. It had been another excellent DDD event; the events in the DDD calendar go from strength to strength each year, and DDD North is no exception.

There was the obligatory prize-giving session, during which I didn't win anything; however, lots of additional freebies were given away to almost anyone who wanted one, including a rather fetching Azure raccoon plushie!


We were assured that DDD North would be back again next year, at around the same time of the year and in the same location, but for now, another successful DDD North event was wrapped up and it was yet another excellent conference.
