To draw a direct comparison: when I look at the examples in the GitHub repository, all I can think is "I would never want this to be a source of truth in my codebase". While I get frustrated with whitespace in YAML and the difficulty of reading complex JSON configuration, if I need a way to programmatically load complex data I would almost always rather use those two as a base and write a 'config loader' in the language I'm already using for my project, instead of introducing another syntax into the mix.
Here's a public example - using Jsonnet to parametrize all core resources of a bare metal Kubernetes cluster: [1]. This in turn uses cluster.libsonnet [2], which sets up Calico, Metallb, Rook, Nginx-Ingress-Controller, Cert-Manager, CoreDNS, ...
Note that this top-level file aims to be the _entire_ source of truth for that particular cluster. I know of people who are reusing some of the dependent lib/*libsonnet code in their own deployments, which shows that this is not just abstraction for the sake of abstraction.
Jsonnet isn't perfect, but it allows for actual building of abstraction layers in configuration, guaranteed pure evaluation, and not a single line of text templated or repeated YAML.
[1] - https://cs.hackerspace.pl/hscloud/-/blob/cluster/kube/k0.lib...
[2] - https://cs.hackerspace.pl/hscloud/-/blob/cluster/kube/cluste...
Much like layered Dockerfiles, mature configuration often comes from several places: env vars, configuration appropriate for check-in to git (no secrets), secrets configuration, and of course the old environment-specific configuration.
All of that merges to "The Configuration".
Also, these seem close to templating languages.
I've done this several times with a "stacked map" implementation (much like the JSP key lookups that went through page / session / application scopes, or something even more convoluted for Spring Webflow).
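As an illustrative sketch (the layer names here are hypothetical), Python's `collections.ChainMap` gives you such a stacked map almost for free: a key is looked up through successive scopes and the first layer that defines it wins.

```python
from collections import ChainMap

# Hypothetical layers, ordered from most to least specific:
# environment overrides, then checked-in config, then defaults.
env_overrides = {"db_host": "prod-db.internal"}
checked_in = {"db_host": "localhost", "db_port": 5432}
defaults = {"db_port": 5432, "timeout_s": 30}

config = ChainMap(env_overrides, checked_in, defaults)

print(config["db_host"])    # first layer that defines the key wins
print(config["timeout_s"])  # falls through to the defaults layer
```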
I'm not sure which is the lesser evil.
Actually, I think (as always), that it depends. For something simple like a config file for an app, JSON/YAML is usually fine.
But for something more complex, like IaC (Infrastructure as Code) definitions, I think "proper" programming languages might be more beneficial. I had a look at Pulumi just yesterday, and I very much like the idea of writing a simple C#/TypeScript app to deploy my infrastructure, compared to something like HCL (HashiCorp Configuration Language) or bash scripts that wrap the Azure/AWS CLI tooling.
Honestly, I would prefer anything that mimics popular languages, to lower the bar for reading.
I've played with it briefly, along with the Kubernetes plugin, and it was a nice experience.
Better configuration doesn't mean more ways to treat config like code, or data like config, or, god forbid, data like code. It means treating config like config and code like code. GitOps just makes me sad. Truth should only flow in one direction. The first time I had to write a script that used the GitHub API to auto-update a code repo, I died a little inside.
Doesn't something like this go a little way toward solving that problem?
Heavy lifting needs to be done with code. If your config layer is growing, I would look for why that is and how you could push the complexity to the code or data and “boil down” the config until it can be represented with just keys and values.
Growing config means there are areas of your application that aren't being properly encapsulated. But once something is enshrined as config, it usually never gets treated as a legitimate application concern worthy of a data model and a UI for changing it. Devs just keep adding to it, and before you know it you need a whole team just to deal with it.
Static languages like JSON and YAML are fine for toy configurations, but they don't scale to even the most basic real-world configuration tasks. Consider any reasonably sized Kubernetes project that someone wants to make available for others to install in their clusters. The project probably has thousands of lines of complex configuration, much of which will change subtly from one installation to another. Rather than distributing a copy of the configs along with detailed instructions on how to manually adapt them for each use case, it becomes naturally expedient to parameterize the configuration.
The most flat-footed solution involves text-based templates (a la jinja, mustache, etc.), which is pretty much what Helm has done for a long time. But text-based templates are tremendously cumbersome: you have to make sure your templates always render syntactically correct, ideally human-readable output, which is difficult because YAML is whitespace-sensitive and text templates aren't designed to make it easy to control whitespace.
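To make the whitespace problem concrete, here's a hypothetical Python sketch showing how a naive text template silently breaks YAML structure as soon as an inserted value spans multiple lines:

```python
# Hypothetical illustration: naive text templating into YAML. The template
# itself knows nothing about indentation.
template = "config:\n  script: {body}\n  replicas: 3\n"

# A single-line value renders fine.
ok = template.format(body="echo hello")

# A multi-line value lands at column 0, silently changing the YAML
# structure: the second line is no longer part of 'script'.
broken = template.format(body="echo hello\necho world")
print(broken)
```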
A similarly naive solution is to simply encode a programming language into the YAML. Certain YAML forms encode references (e.g., `{"Ref": "<identifier>"}` is equivalent to dereferencing a variable in source code). Another program evaluates this implicit language at runtime. This is the CloudFormation approach, and it also gives you some crude reuse while leaving much to be desired.
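As a toy illustration of that idea (a sketch, not CloudFormation's actual semantics), the implicit language can be evaluated by walking the parsed tree and substituting each `{"Ref": ...}` node:

```python
# Sketch: walk a parsed YAML/JSON tree and replace {"Ref": name} nodes
# with values from an environment, mimicking the CloudFormation idea.
def resolve(node, env):
    if isinstance(node, dict):
        if set(node) == {"Ref"}:  # the whole node is a reference
            return env[node["Ref"]]
        return {k: resolve(v, env) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve(v, env) for v in node]
    return node

doc = {"Bucket": {"Ref": "BucketName"}, "Tags": [{"Ref": "Team"}]}
resolved = resolve(doc, {"BucketName": "logs", "Team": "infra"})
print(resolved)  # {'Bucket': 'logs', 'Tags': ['infra']}
```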
After stumbling through a few of these silly permutations, it becomes evident that this reuse problem isn't different from the reuse problems that standard programming languages solve; what is different is that we don't want our configuration to have access to system APIs, including I/O, and we may also want to guard against non-halting programs (which is to say that we may not want our language to be Turing-complete). An expression-based configuration language becomes a natural fit.
After using an expression-based configuration language, you realize that it's pretty difficult to make sure that your JSON/YAML output has the right "shape" such that it will be accepted by Kubernetes or CloudFormation or whatever your target is, so you realize the need for static type annotations and a type checker.
Note that at no point are we trying to implement the Fibonacci sequence, and in fact we prefer not to be able to implement it at all, because we expressly prefer a language that is guaranteed to halt (though this isn't a requirement for all use cases, I believe it does satisfy the range of use cases we're discussing, and the principle of least power suggests that we should prefer it to Turing-complete solutions).
That said, this one language does not look powerful enough for that. So I'm not sure where it can be used.
I mean, it's used to configure all of NixOS, so I'm not sure if that's true.
Let me add another post-it of "try NixOS in an environment" to my TODO list...
It is sold as "You use this to generate configuration in other formats like JSON"... but why? Why would I want to use some language other than the target format to configure things? Why am I making my configuration a 2 step process? And even if I bought all of those reasons, why wouldn't I just use a general purpose language instead? Why have some esoteric language dialect whose only purpose is... making configuration files?
I'd much rather use Bash, Python, Perl, JavaScript, TypeScript, Groovy, Java, Kotlin, C++, C, Rust, Erlang, PHP, Awk, Pascal, Go, Nim, Nix, VB, Haxe, CoffeeScript, etc. Really, take your pick. Any well-established language seems like a much better approach than something like this.
    systemd.services.tarsnapback = {
      startAt = "*-*-* 05:20:00";
      path = [ pkgs.coreutils ];
      environment = {
        HOME = "/home/XXXX";
      };
      script = ''${pkgs.tarsnap}/bin/tarsnap -c -f "$(uname -n)-$(date +%Y-%m-%d_%H-%M-%S)" "$HOME/ts" '';
      serviceConfig.User = "XXXX";
    };
1: Quick reference if you aren't familiar with systemd timers: https://wiki.archlinux.org/index.php/Systemd/Timers
You're throwing away all the organizational learning and preexisting systemd documentation, and forcing something different on the world. `man systemd.timer` contains no mention of `startAt`; what you have there is something inherently different from systemd.
And what if I want more complex rules, like a combination of intervals and time from boot?
NixOS gives you this option, and I choose not to. Fortunately nobody is forcing you to use this (or forcing me to not use it).
> You're throwing away all the organizational learning and preexisting systemd documentation, and forcing something different on the world. `man systemd.timer` contains no mention of `startAt`
Not quite throwing it all away, because you can easily observe the output of this before making it live. Yes, systemd.timer contains no mention of startAt, because, as you correctly observed, this is something inherently different from systemd. startAt is used by other configuration options to specify items running at specific calendar times, so it's reasonably consistent within NixOS itself.
Reading the NixOS documentation is quite simple (and it shows the currently configured value for you):
    % nixos-option systemd.services.tarsnapback.startAt
    Value:
      [ "*-*-* 05:20:00" ]
    Default:
      [ ]
    Type:
      "string or list of strings"
    Example:
      "Sun 14:00:00"
    Description:
      ''
        Automatically start this unit at the given date/time, which
        must be in the format described in
        <citerefentry><refentrytitle>systemd.time</refentrytitle>
        <manvolnum>7</manvolnum></citerefentry>. This is equivalent
        to adding a corresponding timer unit with
        <option>OnCalendar</option> set to the value given here.
      ''
> what you have there is something inherently different from systemd.

That's kind of the point. If it were inherently the same as systemd, there would be no point to it. Systemd timers are quite boilerplate-heavy (compare to e.g. a crontab entry), so when I'm not using NixOS, I often end up copying an existing timer and modifying it.
> And what if I want more complex rules, like a combination of intervals and time from boot?
Add a time from boot of 120 seconds with this:
    systemd.timers.tarsnapBack.timerConfig = { OnBootSec = "120"; };
For things that actually use all the bells and whistles of systemd, you'll need to specify all the various details.

[edit] For a nice hyperlinked search of options, see also:
https://search.nixos.org/options?query=startAt&from=0&size=3...
1) This doesn't have to be a two-step process. Specialized tools like kubecfg for Jsonnet will directly take a Jsonnet top-level config and instantiate it, traverse the tree, and apply the configuration intelligently to your Kubernetes Cluster.
2) General purpose languages are at a disadvantage, because most of them are impure. Languages that limit all filesystem imports to be local to a repository and disallow any I/O ensure that you can safely instantiate configuration on CI hosts, in production programs, etc. The fact that languages like Jsonnet also ship as a single binary (or simple library) that requires no environment setup makes them super easy to integrate into any stack.
3) Configuration languages tend to be functional, lazily evaluated and declarative, vastly simplifying building abstractions that feel more in-line with your data. This allows for progressive building of abstraction, from just a raw data representation, through removal of repeated fields, to anything you could imagine makes sense for your application.
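The "removal of repeated fields" step can be pictured with a plain recursive overlay merge, which is roughly what Jsonnet's object inheritance gives you natively (a Python sketch with hypothetical field names, not Jsonnet's actual semantics):

```python
def overlay(base, override):
    # Recursively merge 'override' on top of 'base', similar in spirit to
    # Jsonnet object composition (Jsonnet additionally has lazy
    # self-reference, which this sketch omits).
    out = dict(base)
    for key, value in override.items():
        if isinstance(out.get(key), dict) and isinstance(value, dict):
            out[key] = overlay(out[key], value)
        else:
            out[key] = value
    return out

# A shared base removes fields that would otherwise be repeated
# in every environment's config.
base_service = {"replicas": 2, "resources": {"cpu": "100m", "mem": "128Mi"}}
prod = overlay(base_service, {"replicas": 5, "resources": {"mem": "512Mi"}})
print(prod)  # {'replicas': 5, 'resources': {'cpu': '100m', 'mem': '512Mi'}}
```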
Related reading: https://landing.google.com/sre/workbook/chapters/configurati...
Nix as an example:
    nix-repl> { foo = 5 / 0; bar = 5; }
    error: division by zero, at (string):1:9

    nix-repl> { foo = 5 / 0; bar = 5; }.bar
    5
vs. Python, as an obvious example of a language with eager evaluation:

    >>> { "foo": 5 / 0, "bar": 5 }["bar"]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ZeroDivisionError: division by zero
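For illustration, the same field-level laziness can be mimicked in eager Python with thunks, i.e. zero-argument lambdas that are only evaluated when called (a hypothetical sketch; Nix does this for you implicitly):

```python
# Each field is a thunk; building the dict evaluates nothing.
lazy = {"foo": lambda: 5 / 0, "bar": lambda: 5}

# Forcing only 'bar' succeeds, even though forcing 'foo' would raise.
print(lazy["bar"]())  # 5

try:
    lazy["foo"]()  # only now does the division by zero happen
except ZeroDivisionError:
    print("foo only fails when actually demanded")
```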
This lazy evaluation allows for a very nice construct in Jsonnet:

    local widget = {
      id:: error "id must be set",
      name: "widget-%d" % [self.id],
      url: "https://factory.com/widget/%d" % [self.id],
    };
    {
      widgetStandard: widget { id: 42 },
      widgetSpecial: widget { name: "foo", url: "https://foo.com" },
    }
When the consuming code only expects a widget to have 'name' and 'url' fields, you can either have both automatically derived from a single top-level ID, or override them, even fully skipping the ID if it's not needed. (A `::` in Jsonnet marks a hidden field, i.e. one that will not be emitted when generating YAML/JSON/..., but can still be referenced by other fields.)
The reason you don’t use regular languages for this task is because you want to enforce termination (programs can’t run forever without halting, allowing someone to DoS your system) or reproducibility (the config program doesn’t evaluate to different JSON depending on some outside state because the program did I/O). If your use case involves users who can be trusted not to violate these principles, then a standard programming language can work fine, but this frequently isn’t the case.
Nickel is Turing-complete. See the fib example.
> or reproducibility
Nickel doesn't force reproducibility
So again, why Nickel and not a GP programming language?
A configuration file is uniquely suited to a pure and lazy language.
Pure, because all the advantages of a pure language remain while none of the downsides do; the result of evaluating the function is your configuration data. You don't need arbitrary I/O or ordering to generate configuration files.
Lazy, because configuration files are naturally declarative, but you don't want to evaluate tons of things you have declared but never use.
I should have been more clear: I was listing potential reasons why you might not use a standard programming language. "Not wanting Turing-completeness" is a reason to use a non-Turing-complete DSL. I wasn't suggesting that Nickel was appropriate for this particular use case, but many of the other languages in this category are (e.g., Starlark, Dhall).
> Nickel doesn't force reproducibility
Scanning the docs, I don't see anything about Nickel allowing I/O, so I believe you're mistaken.
> However, sometimes the situation does not fit in a rigid framework: as for Turing-completeness, there may be cases which mandates side-effects. An example is when writing Terraform configurations, some external values (an IP) used somewhere in the configuration may only be known once another part of the configuration has been evaluated and executed (deploying machines, in this context). Reading this IP is a side-effect, even if not called so in Terraform's terminology.
> Nickel permits side-effects, but they are heavily constrained: they must be commutative, a property which makes them not hurting parallelizability. They are extensible, meaning that third-party may define new effects and implement externally the associated effect handlers in order to customize Nickel for specific use-cases.
This answers your question about why Nickel is preferable to general-purpose programming languages: the side-effects are more limited. Further, it reads to me like the "side-effects" are something that the owner of the runtime opts into, by extending the sandbox with callables that can perform side-effects, as opposed to untrusted code being able to perform side-effects in any Nickel sandbox.