TL;DR: YAML can be a wolf in sheep’s clothing when used for CI infrastructure. Alternatives include moving to build systems like CAKE/FAKE, or using TeamCity and its Kotlin DSL. Both approaches have their own pros and cons. NUKE, of which I’m the author, provides a solution that combines the strengths of both: the flexibility of a build system, and CI-specific features like parallelization and build queue optimizations.
If you dive into the DevOps world, chances are high that you’ll meet YAML around the next corner. For some tools, like Docker and Kubernetes, I think it’s a good match. However, for CI infrastructure it often becomes a nightmare, and I’m not alone in feeling this way. Recently, a tweet by Jeff Fritz started a debate about YAML in DevOps, in which the general agreement can be summarized as:
Many of us only use YAML reluctantly. Personally, I would even say that the idea of Configuration as Code is a lie, because it doesn’t feel like coding (more about this in the next section). Yet, almost every CI/CD service out there is YAML-first:
Some try to steer clear of YAML, only to replace it with other suboptimal solutions. For instance, Jenkins uses Groovy-flavoured configuration files, which aren’t really great either.
By the way, Azure Pipelines actually provides some better tooling around editing YAML, but apparently, it doesn’t really help:
But let’s try getting more to the bottom of this.
What’s wrong with YAML
While I see how YAML configuration can be attractive, I truly believe that for CI/CD purposes, this is only because the sample pipelines in talks and blog posts are often just that: samples. YAML has a bad reputation for several reasons. Some of the most important ones, in particular for CI/CD, are:
- It imposes long feedback loops. Typically, the only way to test your configuration is to commit your changes to the repository. Then the CI server needs to pick up the changes, and finally you might need to wait until an agent is available to trigger a new build. This adds up to a lot of time, especially given the following pitfalls.
- It’s error-prone. We might indent too much or too little, mistype a well-known property, or forget to escape properly without even knowing. Almost any input is valid YAML, so mistakes rarely surface as syntax errors, and there is no proper syntax highlighting. There are schema files that enable rudimentary code completion, but as a C# developer, this still feels clunky.
- It’s not refactoring-safe. This is a matter of tooling again. Whenever we’re dealing with IDs and their references, the best choice we have is Search & Replace. This should only be the last resort.
- It’s declarative. Not imperative. Not everyone needs that, but usually, there’s a time when you want to iterate over a collection, filter items, write some more complex conditions, and other funky stuff. YAML is just the wrong format for that.
- It causes vendor lock-in. Each CI system has its very own format. Switching to a different CI system becomes non-trivial, as we have to rewrite the complete configuration.
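To make the point about imperative logic more concrete, here is a minimal sketch of the kind of filtering and conditional logic that is awkward in YAML but trivial in a general-purpose language. Everything in it is hypothetical and only for illustration: the `Project` record, the `BuildLogic` helper, and the `BRANCH` environment variable are my own assumptions and not part of any CI system or build tool. It assumes .NET 5 or later for top-level statements and records.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical project list; a real build would read this from the solution.
var projects = new List<Project>
{
    new("Core", IsTestProject: false, IsPackable: true),
    new("Core.Tests", IsTestProject: true, IsPackable: false),
    new("Cli", IsTestProject: false, IsPackable: true),
    new("Samples", IsTestProject: false, IsPackable: false),
};

// Hypothetical convention: the branch name arrives via an environment variable.
var branch = Environment.GetEnvironmentVariable("BRANCH") ?? "main";

foreach (var name in BuildLogic.SelectPackable(projects, branch))
    Console.WriteLine($"Packing {name}");

static class BuildLogic
{
    // Pack only on the main branch, and only packable non-test projects.
    public static IReadOnlyList<string> SelectPackable(IEnumerable<Project> projects, string branch) =>
        branch == "main"
            ? projects.Where(p => p.IsPackable && !p.IsTestProject).Select(p => p.Name).ToList()
            : new List<string>();
}

// Hypothetical model type, not taken from any library.
record Project(string Name, bool IsTestProject, bool IsPackable);
```

The compiler checks every property name here, and renaming `IsPackable` updates all usages, which is exactly what Search & Replace in YAML cannot guarantee.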
One more important fact is that many YAML configurations define inline Bash or PowerShell scripts. Typically, those make it hard to use any kind of IDE tooling. However, in JetBrains IDEs we can use language injections:
Seems like JetBrains has at least partially fixed YAML! 🤓
Modern Configuration as Code
Hilarious. Calling it modern almost sounds like a second attempt to make it actually attractive. Following the same discussion from earlier, Jeff Fritz is on to something:
In fact, TeamCity has implemented this approach since 2016. We can use the Kotlin DSL to implement our complete build pipeline, which TeamCity then converts internally to its own runner format, which is XML. I’m absolutely not a Java or Kotlin developer, but writing Kotlin scripts is actually pretty decent and discoverable when using IntelliJ IDEA. We get all the IDE features like syntax highlighting, code completion, navigation, and refactorings:
In my opinion, this is much better than YAML. We don’t have to commit our configuration just to realize that we missed an indentation or mistyped a reference. Our IDE will tell us right away whether something is semantically broken or we’re good. As a bonus, whenever we feel lost in the Kotlin DSL, we can fall back to using the UI wizards and let TeamCity show us the particular configuration as Kotlin code:
Getting into Build Systems
If you’re a .NET developer, I can understand if you’re feeling reluctant to use Kotlin DSL. After all, I’m a huge fan of the philosophy to use the same language for build implementation as for the rest of the project [1]. Following this philosophy, and using build systems such as FAKE, CAKE, or BullsEye, …
Great thing, right? We gain the benefit of being loosely coupled from the CI system, so we don’t experience vendor lock-in and can easily switch if we need to. Another plus is that we can easily execute the build locally, which makes troubleshooting much easier. However, this approach also comes with a drawback: we deliberately avoid using features of the CI system like parallelization of tasks or build queue optimization. We basically gain portability and ease of use at the cost of the value the CI system provides.
Merging Approaches
Can there be a way to get the best of both worlds without any of the disadvantages? For sure. What it boils down to is what Damian Hickey suggested a little earlier already:
This means that we use both a CI configuration and a build system, where the CI configuration defines multiple steps, each invoking the build system with a separate target. Typically, we get much better log output this way, and also allow the CI system to gather statistical data. However, even if the CI configuration is quite simple, writing it ourselves still has the potential to break things. For instance, when a target or input parameter name gets changed, we need to update the configuration. Secondly, it’s hard to share state between different target invocations. For instance, one target might calculate an in-memory list, which should be reported in the next target. How would that work?
Integration with NUKE
NUKE is a build system that I’m working on, which is similar to CAKE and FAKE. One unique aspect of NUKE is that it can generate the CI configuration from the C# build implementation itself. It currently supports Azure Pipelines, AppVeyor, GitHub Actions, TeamCity, GitLab, and JetBrains Space. Let’s start with a simpler example and see how we can use GitHub Actions for our CI build:
[GitHubActions(
    "continuous",
    GitHubActionsImage.UbuntuLatest,
    GitHubActionsImage.MacOsLatest,
    On = new[] { GitHubActionsTrigger.Push },
    InvokedTargets = new[] { nameof(Test), nameof(Pack) },
    ImportSecrets = new[] { nameof(SlackWebhook), nameof(GitterAuthToken) })]
partial class Build : NukeBuild
{
    public static int Main() => Execute<Build>(x => x.Pack);

    [Parameter("Gitter Auth Token")] readonly string GitterAuthToken;
    [Parameter("Slack Webhook")] readonly string SlackWebhook;

    Target Test => /* ... */
    Target Pack => /* ... */
}
In the example above, we’re adding the GitHubActionsAttribute to our build class (Line 1) to define a new workflow called continuous. The workflow invokes the Test and Pack targets (Line 6) on 2 different images (Line 3-4) whenever we push new changes (Line 5). Additionally, we import the secrets GitterAuthToken and SlackWebhook (Line 7). Note that everything is refactoring-safe! Images and triggers are defined via enumerations. The targets and parameters are referenced with the nameof operator. If we rename them, our CI configuration will change as well. Finally, here’s the generated YAML file based on our attribute:
# <auto-generated />
name: continuous
on: [push]
jobs:
  ubuntu-latest:
    name: ubuntu-latest
    runs-on: ubuntu-latest
    steps:
      - uses: actions/[email protected]
      - name: Run './build.cmd Test Pack'
        run: ./build.cmd Test Pack
        env:
          SlackWebhook: ${{ secrets.SlackWebhook }}
          GitterAuthToken: ${{ secrets.GitterAuthToken }}
  macOS-latest:
    name: macOS-latest
    runs-on: macOS-latest
    steps:
      - uses: actions/[email protected]
      - name: Run './build.cmd Test Pack'
        run: ./build.cmd Test Pack
        env:
          SlackWebhook: ${{ secrets.SlackWebhook }}
          GitterAuthToken: ${{ secrets.GitterAuthToken }}
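To illustrate the general idea of turning an attribute into a workflow file, here is a minimal, self-contained sketch based on reflection. It is not how NUKE is implemented: `WorkflowAttribute`, `DemoBuild`, and `WorkflowGenerator` are hypothetical names I made up for this example, and the emitted YAML is deliberately reduced to the name, trigger, and one run step per image. It assumes .NET 5 or later for top-level statements.

```csharp
using System;
using System.Reflection;
using System.Text;

// Print the workflow derived from the attribute on DemoBuild.
Console.Write(WorkflowGenerator.Generate(typeof(DemoBuild)));

// Hypothetical attribute for illustration; NOT NUKE's GitHubActionsAttribute.
[AttributeUsage(AttributeTargets.Class)]
sealed class WorkflowAttribute : Attribute
{
    public WorkflowAttribute(string name, params string[] images)
    {
        Name = name;
        Images = images;
    }

    public string Name { get; }
    public string[] Images { get; }
    public string[] InvokedTargets { get; set; } = Array.Empty<string>();
}

[Workflow("continuous", "ubuntu-latest", "macOS-latest",
    InvokedTargets = new[] { nameof(DemoBuild.Test), nameof(DemoBuild.Pack) })]
sealed class DemoBuild
{
    // Stand-ins for build targets; nameof keeps the attribute in sync on rename.
    public void Test() { }
    public void Pack() { }
}

static class WorkflowGenerator
{
    // Reads the attribute via reflection and emits one job per image.
    public static string Generate(Type buildType)
    {
        var workflow = buildType.GetCustomAttribute<WorkflowAttribute>()
            ?? throw new InvalidOperationException("No [Workflow] attribute found.");
        var command = "./build.cmd " + string.Join(" ", workflow.InvokedTargets);

        var yaml = new StringBuilder();
        yaml.Append($"name: {workflow.Name}\n");
        yaml.Append("on: [push]\n");
        yaml.Append("jobs:\n");
        foreach (var image in workflow.Images)
        {
            yaml.Append($"  {image}:\n");
            yaml.Append($"    runs-on: {image}\n");
            yaml.Append("    steps:\n");
            yaml.Append($"      - name: Run '{command}'\n");
            yaml.Append($"        run: {command}\n");
        }
        return yaml.ToString();
    }
}
```

The point of the sketch is the direction of the dependency: the YAML is an output of the compiled build class, so renaming a target changes the generated file instead of silently breaking it.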
A more complex example is the following usage of the TeamCityAttribute that generates a configuration for TeamCity:
[TeamCity(
    TeamCityAgentPlatform.Windows,
    VcsTriggeredTargets = new[] { nameof(Pack), nameof(Test) },
    NightlyTriggeredTargets = new[] { nameof(Test) },
    ManuallyTriggeredTargets = new[] { nameof(Publish) })]
partial class Build : NukeBuild
{
    AbsolutePath TestResultDirectory => OutputDirectory / "test-results";

    [Partition(2)] readonly Partition TestPartition;
    IEnumerable<Project> TestProjects => TestPartition.GetCurrent(Solution.GetProjects("*.Tests"));

    Target Test => _ => _
        .DependsOn(Compile)
        .Produces(TestResultDirectory / "*.trx")
        .Produces(TestResultDirectory / "*.xml")
        .Partition(() => TestPartition)
        .Executes(() =>
        {
            // Test invocation
        });
}
This time we won’t look at the generated code, but point out individual features:
- Nightly builds are defined via NightlyTriggeredTargets property (Line 4). Again, we can reference the Test target with the nameof operator. Internally, NUKE will generate a scheduled trigger.
- Manual builds are defined via ManuallyTriggeredTargets (Line 5). In this case, we choose that the Publish target is represented as a deployment build configuration.
- Parallelization can be achieved in three easy steps. Firstly, we declare a TestPartition object along with its size (Line 10). Secondly, we assign the partition to the Test target by calling Partition(() => TestPartition) (Line 17). This causes TeamCity to use a composite build configuration with multiple sub-configurations according to the partition size. In the last step, we use the partition to get the current slice of test projects for the currently running sub-configuration (Line 11).
- Publishing and consuming artifacts is just a matter of calling Produces(...) and Consumes(...) (Line 15-16). This can be used to forward data to a subsequent target, or to provide file downloads through the TeamCity UI.
- Build queue optimization is more of an implicit feature in TeamCity that comes along with the separation of different targets. Whenever a target has already been executed and could be reused, for instance when the affecting files haven’t changed, TeamCity will happily do that and save you resources.
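The partitioning idea can be sketched independently of any build system. The following is an illustration under my own assumptions, not NUKE’s actual algorithm: `PartitionSketch.GetSlice` is a hypothetical helper that distributes items round-robin, so that each sub-configuration picks up only its own share of the test projects. It assumes .NET 5 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical test project names.
var testProjects = new[] { "A.Tests", "B.Tests", "C.Tests", "D.Tests", "E.Tests" };

// Each CI sub-configuration would run with its own part number (1-based).
for (var part = 1; part <= 2; part++)
{
    var slice = PartitionSketch.GetSlice(testProjects, part, total: 2);
    Console.WriteLine($"Partition {part}/2: {string.Join(", ", slice)}");
}

static class PartitionSketch
{
    // Round-robin slicing: item i goes to partition (i % total) + 1.
    public static IReadOnlyList<T> GetSlice<T>(IEnumerable<T> items, int part, int total)
    {
        if (total < 1) throw new ArgumentOutOfRangeException(nameof(total));
        if (part < 1 || part > total) throw new ArgumentOutOfRangeException(nameof(part));
        return items.Where((_, index) => index % total == part - 1).ToList();
    }
}
```

With five projects and two partitions, the first slice contains A.Tests, C.Tests, and E.Tests, and the second contains B.Tests and D.Tests. Round-robin is just one possible strategy; slicing by measured test duration would balance the load better.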
Here are a few illustrations of how things look in TeamCity, including the Run Build dialog, which automatically exposes all parameters declared in the build class:
If you want to learn more about NUKE and its CI integration, check out the documentation.
One remaining issue is to allow different build steps to share state on a .NET process level. For instance, changes to a field of type List<Data> should be available in the next build step. A possible solution is to deserialize and serialize the build object before and after a build is invoked. TeamCity actually provides a great extension point with the .teamcity directory, which is automatically published as a hidden artifact. Other CI systems have not been evaluated for this functionality yet.
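As a rough sketch of that serialization idea, the following example persists state to a JSON file at the end of one step and restores it at the start of the next. It is only an illustration and not an existing NUKE feature: `BuildState`, `StateStore`, and the file location are hypothetical, and a real implementation would have to place the file somewhere the CI system carries over between steps. It uses System.Text.Json and assumes .NET 5 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical location; a CI system would need to forward this file between steps.
var stateFile = Path.Combine(Path.GetTempPath(), "build-state.json");

// Step 1 (first process): compute a list and persist it after the step ran.
var state = new BuildState();
state.ChangedFiles.Add("src/Core/Program.cs");
state.ChangedFiles.Add("README.md");
StateStore.Save(stateFile, state);

// Step 2 (later process): restore the state before the step runs.
var restored = StateStore.Load(stateFile);
Console.WriteLine($"Restored {restored.ChangedFiles.Count} entries");

// Hypothetical state container standing in for fields of the build class.
sealed class BuildState
{
    public List<string> ChangedFiles { get; set; } = new();
}

static class StateStore
{
    public static void Save(string path, BuildState state) =>
        File.WriteAllText(path, JsonSerializer.Serialize(state));

    // Falls back to an empty state when no earlier step has written the file.
    public static BuildState Load(string path) =>
        File.Exists(path)
            ? JsonSerializer.Deserialize<BuildState>(File.ReadAllText(path)) ?? new BuildState()
            : new BuildState();
}
```

JSON is a convenient choice here because the file stays human-readable in the build artifacts, but any serializer that round-trips the relevant fields would do.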
Conclusion
As far as I’m concerned, extending build systems to generate CI configuration files is an interesting way to provide a better developer experience when building in different environments. Publishing artifacts, nightly builds, parallelization, build queue optimizations, and other gems are just an attribute away. We don’t need to know CI systems inside out, and there is much less effort involved when we switch between them.
Skip the YAML pain and try NUKE!
[1] Credits to Gary Ewan Park