.NET Oxford Meetup XVII: Cosmos DB


Our .NET Oxford meetup this month was all about Microsoft's globally-distributed database, Cosmos DB - and we were very pleased to welcome James Broome and Mike Larah from Endjin to tell us all about it! It also turned out that Endjin, being a remote-working team, had decided to spend the day in sunny Oxford for their team catch-up. This was perfect, as it meant that we also had the entire team joining us for the meetup!

It was a fantastic turnout, with attendees pretty much filling the room - despite the heat and lack of air conditioning!


Intro Talk

I wanted to start off the intro talk by thanking Endjin not only for joining us that evening - but also for the amazing work they've been doing recently sponsoring tech events, and helping make events like the DDD conferences possible. I've been to quite a few of the DDD conferences now, and have even spoken at a couple, and without companies like Endjin willing to provide sponsorship, these events just wouldn't be able to continue. Fantastic work Endjin!

For this meetup, we introduced nametag stickers printed from the Meetup.com RSVP list. I pinched this idea from DevOps Oxford, which I attended the week before. I thought this was a nice way of allowing people to easily "sign in" when entering - making it obvious who hadn't RSVPd, and also giving visibility of no-shows (ie. stickers left over), whilst of course also helping everyone get to know each other's names. For a lot of our meetups, it's not the end of the world if guests turn up without RSVPs, but when we hit (or are close to) limited capacity (eg. for Jon Skeet's meetup in October), we obviously need to be much stricter with checking RSVPs. And the nametag sticker idea worked really nicely.

After talking about the above, and thanking our sponsors (see below), I then moved onto the news-items and prize-draws, before handing over to James and Mike for the main talk.

The slides for the intro talk can be found here, and the Reveal.js source code can be found here.

News

TypeScript 3.0

The first news item was about the TypeScript 3.0 release. Obviously a major release, so I didn't go into all the features - but if you're interested in reading more, see their announcement blog post for details.

F# 4.5 Preview

The second news item was about the F# 4.5 Preview announcement. Out of interest, I asked everyone who uses F#. In a room packed full of developers, literally just one hand went up! I then asked if anyone would be interested in us getting someone in for a talk on F#, and a lot of people seemed keen, so we'll have to get this sorted for one of our meetups next year!

(link)

Azure Service Fabric Mesh

The last news item was about Azure Service Fabric Mesh going into public preview. From what I gather, this is a serverless Service Fabric. I haven't really done anything directly with Service Fabric, but watching a Channel 9 video on Service Fabric Mesh, it felt very Kubernetes-like - except without having to worry about the nodes (ie. VMs). There is also a Kubernetes Virtual Kubelet provider for Service Fabric Mesh, which offers the best of both worlds!

(link)

Prize Draws

After the news, we then moved onto the prize draws, using my usual WPF Prize Draw app. Massive congratulations to the winners, and a massive thank you to our awesome prize draw sponsors ...

Jetbrains

Congratulations to Jonathan Hughes for winning a year-long Jetbrains product licence!

Manning Books

Congratulations to John Christian for winning a Manning ebook of his choice! The winner has the choice of any of the awesome Manning ebooks from their website. John hasn't yet got back in touch with his choice, but I'll update here when he does.

Remember that we have our special Manning coupon code (ug367) which gives all of our members a 36% discount on any of their e-books! They've also asked me to share a link to some of their new courses for their LiveVideo system.

Oz-Code

Congratulations to Tim Ford for winning the Oz-Code licence! For those that don't know, this is a Visual Studio extension that puts your debugger on steroids! Take a look at their website for videos of their features.

If you haven't checked it out, then definitely download the trial and have a play. All our members get a free 3 month trial licence (see below) or 50% off a full licence! To claim, you can visit this link to pick up your licence!



Primary Sponsors - Corriculo Recruitment

When I arrived, one thing that jumped out was that our main sponsors (Corriculo) were wearing awesome looking .NET Oxford t-shirts!!! I just wish I had remembered to take a picture! They had mentioned previously that they were thinking of doing this, but I was really surprised and impressed with how good the t-shirts looked! My first question was obviously "where's ours?!" ;)

Corriculo have been our primary sponsors from the start, and have always done an amazing job helping us out. Not only helping us financially, paying for the venue and providing lots of liquid refreshments - but also helping us out in many other ways too. One example of this, and this is something I spoke about in the intro talk too, is the October meetup with Jon Skeet. Unfortunately, after we booked the venue, and the max RSVP count of 110 had filled up in under 24 hours - the Story Museum got in touch saying they'd made a mistake and we couldn't have the room any more! The first I heard of this was in an email from Corriculo, telling us this had happened, but also telling us that they'd spent time researching alternative venues, and found a room just around the corner which could hold 120! So what could have been a really stressful email wasn't at all, thanks to the fantastic work that Corriculo did. It was really nice that when I said this in the intro talk, everyone gave Corriculo a massive round of applause. They definitely deserve it!

Secondary Sponsor - Everstack

Our secondary sponsor is my own company, Everstack, which provides a lot of my own time for organising and managing .NET Oxford. Everstack provides software development and consultation services - specialising in .NET and the Cloud.


The Talk: "Analysing flight data in real time using Cosmos DB, Azure Functions, and Power BI"

In some of the talks where there are two speakers - one speaker does the first half, and the other does the second. In this talk, James and Mike swapped over more frequently, with James focusing on the slides and explaining high-level concepts, then Mike swapping in for the demos. I thought this worked quite nicely, and meant that each logical section of the talk had a demo associated with it after the concept had been explained. It was also interesting to see that Mike was doing the demos from a VM running in Azure. Very cool that you can spin up VMs like this, with no noticeable latency! I'm just glad the venue's internet connection held out!


James started off introducing Cosmos DB, explaining that it's a "fast and scalable, globally-distributed multi-model, NoSQL database service" (yes, I pinched that quote from one of their slides). Basically this means that like most things in Azure, it's a service that can be created effortlessly - and this immediately gives you a globally distributed database. The multi-model bit means that you can use it for different types of data model - eg. document, graph, key-value, table, and column-family. I was particularly interested in the graph model, as I've played with Neo4j in the past, and found graph databases really powerful for certain scenarios. There are also various different APIs for the data models - for example, you can use the MongoDB API - which means that if your application currently uses MongoDB, you can switch to Cosmos without changing any of your code! Likewise, for graph databases, you can query using the generic Gremlin API.

Another point that James covered in his intro was that you can easily put your data where your users are. One thing that I hear quite a lot about Cosmos, and which also came up quite a lot in this talk, is its guaranteed low latency. Being able to distribute your data to datacentres as close as possible to the user really helps keep this latency as low as possible, keeping your applications extremely fast and responsive regardless of where your users are based.

A good point made was that Cosmos doesn't just have to be for larger projects. It can be for small projects too, and can grow as your project grows. James said that he pretty much uses Cosmos for all his projects now - even small ones. We're not all writing the next big thing, so it's nice to see that it still makes sense using this kind of tech for the smaller projects.


Mike then took over with a demo showing this in action. He showed creating a Cosmos service in the Azure portal, and how you can choose different regions within seconds. His demo showed some code which ingests live flight data from the FlightAware API, which is what their product uses (also introduced in James' intro). Unfortunately this code isn't available to share, as it's part of their product, but Mike did mention afterwards when I asked him that they might be able to split the example out so it's sharable. I'll update this post with a link if they do. In their product, they use a continuously running webjob to run this code and ingest the flight data to be transformed and saved into Cosmos. He walked through the code, showing the client being set up, and explained some options they had chosen. For example, setting the indexing mode to lazy to opt for eventual consistency in the data, etc.

Graph API

Whilst Endjin do not use the Graph model themselves for their product, they did decide to demo it to show what it can do. They used a 3rd party GitHub project to show this in action. This uses the Open Flights API to query airport and route data, and stores that data in a Cosmos DB graph database, which can then be queried with the Gremlin query language. You can even use the SQL API, but obviously the Gremlin language is more suited to this kind of data.

They also showed that the Portal had a pretty decent graph visualiser to visualise your graph data. Having used Neo4j, and loving their Neo4j Browser - it was really nice to see a way of visualising graph data in Cosmos DB too.
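To give a feel for why airport and route data suits a graph model, here is a small Python sketch of my own. It isn't the GitHub project from the demo, and it doesn't use Gremlin or Cosmos DB at all; it just treats airports as vertices and routes as directed edges in memory. The function names and the handful of routes are made up purely for illustration.

```python
from collections import defaultdict


def build_route_graph(routes):
    """Build a directed graph: airport code -> set of directly reachable airports."""
    graph = defaultdict(set)
    for origin, destination in routes:
        graph[origin].add(destination)
    return graph


def one_stop_destinations(graph, origin):
    """Airports reachable with exactly one stop that have no direct route from origin."""
    direct = graph.get(origin, set())
    via_one_stop = set()
    for stop in direct:
        # Follow a second "route" edge out of each directly reachable airport.
        via_one_stop |= graph.get(stop, set())
    return via_one_stop - direct - {origin}


# Illustrative routes only (not real schedule data).
routes = [
    ("LHR", "JFK"),
    ("LHR", "CDG"),
    ("CDG", "FCO"),
    ("CDG", "JFK"),
    ("JFK", "LAX"),
]
graph = build_route_graph(routes)
print(sorted(one_stop_destinations(graph, "LHR")))
```

A question like "where can I get to with one change?" is a two-hop traversal, which is the kind of query a graph query language is designed to express directly.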

Elastic Scaling

Another area they covered was Elastic Scaling. This is where the underlying resources required can be scaled out (and also back in again) as required. An example might be an ecommerce company that needs additional resources during the Christmas period when under heavy traffic, but doesn't need (or want to pay for!) those additional resources when those peak periods have ended. If you were doing this on-premises, you'd have to buy new hardware, and you'd be stuck with it, even when you no longer need it. Elastic scaling means you can scale almost infinitely to meet your requirements.

Mike demoed this by switching back to his previous FlightAware demo that was still running and showing that it was now failing. This is because he had limited throughput when creating it. He then adjusted the settings in the portal to allow automatic elastic scaling, and it then started working again even with the increased load.
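The failure Mike showed is what you'd expect when the load exceeds the provisioned throughput: in Cosmos DB, requests over the provisioned RU budget are rejected (with an HTTP 429 "too many requests" status). The toy Python model below is my own sketch of that behaviour, not anything from the demo. The class name, the one-second window, and the RU figures are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProvisionedContainer:
    """Toy model: a container with a fixed request-unit budget per one-second window."""
    ru_per_second: float
    used_in_window: float = 0.0

    def try_request(self, ru_cost: float) -> bool:
        """Return True if the request fits the remaining budget, False if throttled."""
        if self.used_in_window + ru_cost > self.ru_per_second:
            return False  # the real service would answer with HTTP 429 here
        self.used_in_window += ru_cost
        return True


def simulate(container, request_costs):
    """Count accepted and throttled requests within a single window."""
    accepted = throttled = 0
    for cost in request_costs:
        if container.try_request(cost):
            accepted += 1
        else:
            throttled += 1
    return accepted, throttled


# 100 writes at a hypothetical 10 RU each, all arriving in the same second.
burst = [10.0] * 100
print(simulate(ProvisionedContainer(ru_per_second=400), burst))
print(simulate(ProvisionedContainer(ru_per_second=1000), burst))
```

With the smaller budget most of the burst is throttled; raising the budget lets the same burst through, which is the effect of scaling the throughput up in the portal.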

Pricing

Obviously one of the big questions that comes up is the pricing. The short story is - it's complicated! It also sounded like it can depend very heavily on your queries. Later on in the talk, Mike showed that in the Portal, you can submit a query against your data, and see the cost of that query in request units (which is the currency in Cosmos DB). He showed a couple of ways in which you can optimise your queries, and showed the dramatic difference that made in RUs for that query!

It is really useful to know that you can see the associated cost when executing your queries. And it sounds like you can also choose for this cost in RUs to be included in the query responses, so you can add monitoring to this.
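As a rough idea of what that monitoring could look like: Cosmos DB reports the cost of each request in an `x-ms-request-charge` response header. The Python sketch below is my own and doesn't call Cosmos DB; it simply reads that header from a dictionary of response headers and keeps a running average per query. The `ChargeMonitor` class and the query names are hypothetical.

```python
from collections import defaultdict

REQUEST_CHARGE_HEADER = "x-ms-request-charge"


def request_charge(headers):
    """Extract the RU charge from response headers (header names are case-insensitive)."""
    normalised = {name.lower(): value for name, value in headers.items()}
    value = normalised.get(REQUEST_CHARGE_HEADER)
    return float(value) if value is not None else 0.0


class ChargeMonitor:
    """Hypothetical helper that tracks the average RU cost per named query."""

    def __init__(self):
        self._totals = defaultdict(float)
        self._counts = defaultdict(int)

    def record(self, query_name, headers):
        self._totals[query_name] += request_charge(headers)
        self._counts[query_name] += 1

    def average(self, query_name):
        count = self._counts[query_name]
        return self._totals[query_name] / count if count else 0.0


monitor = ChargeMonitor()
# Made-up charges standing in for two executions of the same query.
monitor.record("flights_in_region", {"x-ms-request-charge": "2.5"})
monitor.record("flights_in_region", {"X-MS-Request-Charge": "3.5"})
print(monitor.average("flights_in_region"))
```

Logging an average like this per query makes it easier to spot when a change to a query or to the partitioning strategy makes it noticeably more or less expensive.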

Detailed information on pricing can be found here, and there's also a price calculator.

Consistency Models and Low Latency

Another topic they covered was Consistency Models. The choice of which model you use obviously heavily depends on your use-case. Endjin opted for eventual consistency to give better performance and availability, keeping the latency as low as possible. They also pointed out that you can override this on a per-request basis to control this based on individual use-cases.

Speaking of low latency, they also spoke about how Microsoft guarantees millisecond latency worldwide (or your money back). They do this by trying to keep your data as close to your users as possible. Which is of course one of the advantages of using a globally-distributed database like Cosmos.

Endjin didn't want to just take Microsoft's word for it though - they wanted to ensure this was the case using their own use-case. So they performed load testing using their codebase, and graphed out the results. To see the graphs, take a look at slide 28 in their slides. One point to mention is that in these load tests, they saw zero failures.

Geospatial Queries

Obviously with Endjin's use-case being about flight data - geo-location is a key factor. So they also spoke about some of the geospatial functionality in Cosmos, demoing the ST_WITHIN query predicate to further improve performance and cost in the queries shown earlier.
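For anyone who hasn't used geospatial predicates before, ST_WITHIN answers the question "is this point inside this polygon?". The Python sketch below is my own illustration of that test using a standard ray-casting check, not how Cosmos DB implements it. It treats coordinates as flat (longitude, latitude) pairs, which is an assumption that's only reasonable for small areas, and the bounding box is made up.

```python
def point_in_polygon(point, ring):
    """Ray-casting test: is the (longitude, latitude) point inside the polygon ring?

    The ring is a list of (longitude, latitude) vertices, GeoJSON-style, with the
    first vertex repeated at the end to close it.
    """
    x, y = point
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        # Only edges that straddle the point's latitude can be crossed by the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# A rough, made-up box around Oxfordshire, in (longitude, latitude) order.
box = [(-1.5, 51.5), (-1.0, 51.5), (-1.0, 52.0), (-1.5, 52.0), (-1.5, 51.5)]
oxford = (-1.2577, 51.7520)
london = (-0.1276, 51.5072)
print(point_in_polygon(oxford, box), point_in_polygon(london, box))
```

Pushing this kind of filter into the query means only the documents inside the area of interest are returned, rather than fetching everything and filtering in application code.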

Azure Functions and Cosmos Change Feeds

The next section covered Cosmos Change Feeds, allowing your application to react to events in Cosmos when your data changes. Mike demoed creating a boiler-plate Azure Function in the Portal with a change-feed trigger. This was done with a few clicks in the portal. This gave a placeholder function which was successfully triggered on data change. Mike then copied in some code that took the event data, and transformed it so that it used the clamped longitude and latitude as the partition key instead, then wrote it back. He then went back to the example query he showed earlier (see the pricing section above), and showed how much cheaper it was using a different partitioning strategy that more closely matches the query requirement.

This is of course just one example use-case for Change Feeds. Having the ability to have serverless Azure Functions triggered from Cosmos DB events opens up a huge range of possibilities.
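I don't have Mike's code, but the transformation can be sketched roughly as follows in Python. I'm assuming here that "clamped" means snapping each coordinate down to a fixed-size grid cell, so that nearby positions share a partition key. The field names (`position`, `gridCell`), the key format, and the one-degree cell size are all hypothetical.

```python
import math


def grid_partition_key(longitude, latitude, cell_degrees=1.0):
    """Snap a position to a grid cell and return a key shared by everything in that cell."""
    lon_index = math.floor(longitude / cell_degrees)
    lat_index = math.floor(latitude / cell_degrees)
    return f"{lon_index}:{lat_index}"


def repartition(document, cell_degrees=1.0):
    """Return a copy of a changed document with a grid-cell partition key added.

    Assumes the document holds a GeoJSON-style point, i.e. coordinates in
    [longitude, latitude] order.
    """
    longitude, latitude = document["position"]["coordinates"]
    copy = dict(document)
    copy["gridCell"] = grid_partition_key(longitude, latitude, cell_degrees)
    return copy


# Two made-up flight positions near London end up in the same cell.
flight_a = {"id": "a", "position": {"type": "Point", "coordinates": [-0.1276, 51.5072]}}
flight_b = {"id": "b", "position": {"type": "Point", "coordinates": [-0.4543, 51.4700]}}
print(repartition(flight_a)["gridCell"], repartition(flight_b)["gridCell"])
```

The idea is that a query for flights in a particular area then only needs to touch the partitions for the grid cells covering that area, instead of fanning out across every partition.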

PowerBI

Unfortunately, the Demo Gods of course had to interject at some point. The PowerBI demonstration was a video in one of their slides, which sadly didn't render on the projector, just showing a black screen. James has kindly provided a link to the video, which can be found here.

SLAs and Security Compliance

James wrapped up by talking about Cosmos SLAs and Security Compliance. For the SLA bit, I'll pinch a quote from their slides ...

"Only service with financially-backed SLAs for millisecond latency at the 99th percentile, 99.99% HA and guaranteed throughput and consistency"

Which sounds pretty good to me!

They also mentioned that Cosmos defaults to encryption at rest, and spoke about performing risk-assessment when moving to the Cloud, which they have a lot of information about on their website.

Questions

In the break, when chatting with James and Mike about timings, they said they were unsure how much time to allow for questions, and whether they'd actually get any. Well, as I expected, the .NET Oxford crowd certainly didn't disappoint, as there were plenty! I should have made a note of the questions, so I could include them here - but by that point I had all but melted due to the heat! A massive thank you to both our members for being so awesomely inquisitive, and also to James and Mike for putting up with the bombardment and drilling ;)

As well as our usual prize draw in the intro - Endjin also brought along a few Amazon Echo Dots to give away, and they chose to pick out a few of the top questions for their prize giveaway. Congrats to the winners!




Pub

As usual, we headed to the Old Tom afterwards. Always great to catch up with regulars, as well as chat to new faces! Also, as usual, I forgot to take any photos at the pub afterwards! Maybe next time!


Upcoming Meetups

Below are our upcoming meetups for this year. Not all have been announced on Meetup.com yet, but if you subscribe to the meetup, you'll get email notifications as they are announced.

September 11th: "Pilot Decision Management" and "Chatting with your Data":

In September we have two talks - Clifford Agius will be talking about Pilot Decision Management, and how pilots' training relates to, and can benefit, programmers. Then we also have .NET Oxford co-founder, Matt Nield, talking about using bots to access your data.

(Meetup.com event page)

October 9th: Jon Skeet - "C# 8: The story so far":

October's meetup is with Jon Skeet himself, talking all about the upcoming C# 8. For this meetup, there'll be a slight change of venue - it'll be in the St Aldates Conference Centre, which is just a few doors down from our usual venue.

(Meetup.com event page)

November: Performance in the JavaScript Era:

November will be with Benjamin Howarth, talking about various different aspects of performance in an era where much more functionality has been pushed to the frontend. This isn't just a JavaScript talk though - it covers a lot of aspects that are important to us .NET developers too!

December: More Lightning Talks!

We thought that December would be a nice place to slot in another lightning talk event. It may be a long way off, but if you do want to do one, feel free to get in touch! First timers are most welcome too!

