PuppetConf 2017: Hiera 5: The Full Data Enchilada - Henrik Lindberg, Puppet

Post on 21-Jan-2018

Finding Waldo…

(“the full data enchilada“ - sorry for the original title, I must have been hungry)

About Me

• Swedish

• Live on the island of Gozo, Malta

• Father of 3

• Author of the Puppet 4 language, Hiera 5, the Puppet type system, and Task Plans.

• On the Puppet Core team

helindbe @hel

Agenda

• What is Hiera?

• What does Hiera do?

• Authoring Data

• Differences between Hiera 3, (4), and 5

• Writing backends

What is Hiera?

[haɪrɑ] or [haɪərɑ]

“hiera" - the gem

“hiera" - the command line tool

“hiera" - the function

What Hiera is

• A key-value store abstraction with multiple, extensible backends (backend API).

• A key lookup resolution mechanism searching multiple key-value stores (~ query)

• A hierarchical data organization

• A data composition mechanism (defaults, override, merge, unique - etc.)

What do we use Hiera for?

• Explicit lookup

• Automatic Parameter Lookup (APL)

Why a new Hiera?

Hiera 3 does this thing…

• Similar code in every backend - COPY PASTA - very hard to fix general things when the logic is in each and every backend.

• Easy to make mistakes and leak memory in a backend

• Global & static architecture - must restart after changes

• Global config pointing into environments - all environments must change at the same time - yeah right, when you have thousands of forever-changing environments…

• Search uses a cartesian product of levels and backends based on file suffix - lots of trickery and lots of file stats

• Different backend versions not supported in different environments.

• No explanation support - need to trace/debug - must restart server

• Use of dynamic variables (because of lack of suitable features) makes it impossible to have efficient caching to speed up performance.

• Has its own backend loading system

• Circular dependency on puppet - made it very hard to fix certain types of issues

Hiera Versions

                     Hiera 3 (deprecated)                  Hiera 5

where                a gem                                 in puppet

CLI                  hiera                                 puppet lookup

hiera.yaml version   3                                     5 (supports 3 and 4)

explicit lookup      hiera() hiera_array() hiera_hash()    lookup()

backend API          complicated                           simple, using function API

APL options          no                                    lookup_options in data, explicit and APL the same!

explain support      no                                    --explain

advanced paths       no                                    globs, mapped paths

Explicit lookup

# get value, and…
# …fail if not present
$x = lookup('key')

# …verify data type
$x = lookup('key', Array)

# …return default if not present
$x = lookup('key', Array, 'first', ['blue'])

# …options hash
$x = lookup('key', { <options> })

Many Lookup Options

name                 the key to look up

value_type           the return type to assert

merge                merge options

default_value        if not found use this

default_values_hash  if not found pick lookup here

override             look here first
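A minimal sketch of the options-hash form, using the option names from the list above. The key mymodule::colors and its default are made up for illustration. Because a default_value is given, the lookup should not fail when no data exists for the key.

```puppet
# Hypothetical key; replace it with one from your own data.
$colors = lookup({
  'name'          => 'mymodule::colors',  # the key to look up
  'value_type'    => Array[String],       # assert the returned type
  'merge'         => 'unique',            # combine arrays found at all levels
  'default_value' => ['blue'],            # used only if the key is not found
})
notice($colors)
```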

default via code block

$x = lookup('mykey') |$key| {
  # calculate the value (the block receives the key that was not found)
  compute_it($key)
}

Merge Behaviour

first   the first found (default)

unique  for Array (v3 = “array merge”)

hash    merge hash, highest prio key wins (no recursion)

deep    merge hash, recursive, higher prio wins on conflict, arrays made unique

Deep Options

knockout_prefix string to match for removal (undef = no knockout)

sort_merged_arrays sorts arrays (false)

merge_hash_arrays if hashes in arrays should be merged (false)
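As a sketch, the deep options can be passed together with the strategy in the merge option of an explicit lookup. The key name and the '--' prefix are illustrative assumptions, not values from the talk.

```puppet
# Hypothetical key 'mymodule::users'; the default keeps this runnable without data.
$users = lookup('mymodule::users', {
  'value_type'    => Hash,
  'merge'         => {
    'strategy'           => 'deep',
    'knockout_prefix'    => '--',   # values carrying this prefix are removed
    'sort_merged_arrays' => true,   # default is false
    'merge_hash_arrays'  => true,   # default is false
  },
  'default_value' => {},
})
notice($users)
```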

APL

class mymodule::myclass($myparam) {
  #...
}
include 'mymodule::myclass'

# Automatically looks up:
# 'mymodule::myclass::myparam'
# when value is not given.

Explicit lookup vs. APL

• APL = “Inversion of control” - “Push don’t Pull”

• Much easier to test

• Can be overridden!

• Parameterized classes are documented - your arbitrary keys are not…

• Use APL in your APIs
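The contrast can be sketched with two small classes. The class names, the key my_port, and the 8080 default are hypothetical and only there to make the difference visible.

```puppet
# Pull: the class fetches an arbitrary, undocumented key itself.
class mymodule::pull {
  $port = lookup('my_port', Integer, 'first', 8080)
  notice("pull: ${port}")
}

# Push (APL): the value arrives through a typed, documented parameter.
# Hiera is asked for 'mymodule::push::port'; 8080 applies if no data exists.
class mymodule::push(Integer $port = 8080) {
  notice("push: ${port}")
}

include mymodule::pull
include mymodule::push
```

The second class can be tested by passing port directly, and a caller can override the value without touching any data.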

APL and options

• All lookup options can be set in the data!

• Control “deep merge” etc per key!

• Any backend can return a Hash for the key “lookup_options” with a map of “key” => <options-hash>

• All “lookup_options” are merged

• You can supply defaults in a module for example
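Since any backend can return lookup_options, one way to ship per-key merge defaults with a module is a small data_hash function written in the Puppet language. This is a sketch under assumptions: the function, class, and parameter names are hypothetical, and the function would still have to be referenced from the module's hiera.yaml (data_hash: mymodule::defaults).

```puppet
# Hypothetical file: mymodule/functions/defaults.pp
function mymodule::defaults(Hash $options, Puppet::LookupContext $ctx) {
  {
    # A module default for an APL parameter (made-up class and parameter).
    'mymodule::myclass::packages' => ['vim'],

    # Per-key options: merge this key as a unique array,
    # for explicit lookup and APL alike.
    'lookup_options'              => {
      'mymodule::myclass::packages' => { 'merge' => 'unique' },
    },
  }
}
```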

APL and lookup_options !

Authoring Data

where to stick (possibly pieces of) Waldo…

Three Layers

Global Environment Module

Global Layer

• For operational use

• Across all environments

• Overrides environment and modules

• Ok to use a deprecated version 3 hiera.yaml

• In Hiera 3, the only layer (with nasty tricks of referencing into each environment).

Environment Layer

• The typical place to store data

• Across all modules in the env

• Overrides modules

• Use a hiera version 5 hiera.yaml

Module Layer

• Regular hierarchy for overridable and merge-able values

• a default_hierarchy only consulted when not found in regular hierarchy

• Only keys for the module’s namespace

• Must use a hiera 5 version hiera.yaml

the module’s default_hierarchy

A config

Global Environment Module (diagram, with modules a:: and b::)

Strategy ‘first’ (diagram sequence over the Global, Environment, and Module layers):

• Not found: lookup(‘waldo’) - not found, error

• Not found: lookup(‘b::waldo’) - not found, error

• Simple find: lookup(‘b::waldo’) - returned

• Find in module: lookup(‘b::waldo’)

• Find in defaults: lookup(‘b::key’) - not found, then returned

A merging strategy (diagram): lookup(‘b::key’)

Inside a layer

Levels with named entries, ordered from highest prio to lowest prio.

Search order

Each level in turn, from highest prio to lowest prio.

---
version: 5
defaults:  # Used for any hierarchy level that omits these keys.
  datadir: data  # This path is relative to hiera.yaml's directory.
  data_hash: yaml_data  # Use the built-in YAML backend.

hierarchy:
  - name: "Per-node data"  # Human-readable name.
    path: "nodes/%{trusted.certname}.yaml"  # File path, relative to datadir.
    # ^^^ IMPORTANT: include the file extension!

  - name: "Per-datacenter business group data"  # Uses custom facts.
    path: "location/%{facts.whereami}/%{facts.group}.yaml"

  - name: "Global business group data"
    path: "groups/%{facts.group}.yaml"

  - name: "Per-datacenter secret data (encrypted)"
    lookup_key: eyaml_lookup_key  # Uses non-default backend.
    path: "secrets/%{facts.whereami}.eyaml"
    options:
      pkcs7_private_key: "/etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem"
      pkcs7_public_key: "/etc/puppetlabs/puppet/eyaml/public_key.pkcs7.pem"

  - name: "Per-OS defaults"
    path: "os/%{facts.os.family}.yaml"

  - name: "Common data"
    path: "common.yaml"

a level

Inside a layer

Multiple paths/globs etc. per level, from highest prio to lowest prio (backend only called if file exists)

---
version: 5
defaults:  # Used for any hierarchy level that omits these keys.
  datadir: data  # This path is relative to hiera.yaml's directory.
  data_hash: yaml_data  # Use the built-in YAML backend.

hierarchy:
  - name: "Per-node, datacenter, and business group data"
    paths:
      - "nodes/%{trusted.certname}.yaml"
      - "location/%{facts.whereami}/%{facts.group}.yaml"
      - "groups/%{facts.group}.yaml"

  - name: "Per-datacenter secret data (encrypted)"
    lookup_key: eyaml_lookup_key  # Uses non-default backend.
    path: "secrets/%{facts.whereami}.eyaml"
    options:
      pkcs7_private_key: "/etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem"
      pkcs7_public_key: "/etc/puppetlabs/puppet/eyaml/public_key.pkcs7.pem"

  - name: "Defaults per os and common"
    paths:
      - "os/%{facts.os.family}.yaml"
      - "common.yaml"

multiple

Different ways to reference data files / sources

Key (data type): expected value

path (String): One file path.

paths (Array): Any number of file paths. This acts like a sub-hierarchy: if multiple files exist, Hiera searches all of them, in the order in which they’re written.

glob (String), globs (Array): One (or several) shell-like glob patterns, which might match any number of files. If multiple files are found, Hiera searches all of them in alphanumerical order (ignoring the order in which multiple globs were given).

uri (String), uris (Array): One (or several) URIs that are not checked for existence. One call to the backend is performed for every given URI.

mapped_paths (Array or Hash): A fact that is a collection (array or hash) of values. Hiera expands these values to produce an array of paths.

mapped_paths: [services, tmp, "service/%{tmp}/common.yaml"]
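Roughly what that expansion amounts to, written as ordinary Puppet code for illustration only; Hiera performs the expansion internally. The $services array is a stand-in for the services fact, and its values are made up.

```puppet
# Stand-in for the 'services' fact (assumed here to be an array of strings).
$services = ['web', 'db']

# One path per element; $tmp plays the role of the temporary variable
# named as the second element of mapped_paths.
$paths = $services.map |$tmp| { "service/${tmp}/common.yaml" }
notice($paths)
```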

Tips and Tricks

• Have keys with ‘.’ in them? Quote the key when looking it up to prevent the built-in “dig” behaviour: lookup("'my.dotted.key'")
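A short sketch of the difference. The key and the fallback string are hypothetical; the default_value only keeps the example from failing when no data is present.

```puppet
# Unquoted: the key is split on '.', so Hiera looks up 'my'
# and then digs into 'dotted' and 'key'.
$dug = lookup('my.dotted.key', { 'default_value' => 'no such value' })

# Quoted: the whole string is treated as a single key.
$literal = lookup("'my.dotted.key'", { 'default_value' => 'no such value' })

notice($dug)
notice($literal)
```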

Writing Backends

Three kinds of backends

data_hash

Produces all of the key => value pairs at once as a Hash

Good for small to moderate data volume and where most of the data is always used. Limitation: static in nature.

Reading a json file

function mymodule::myjson(Hash $options, Puppet::LookupContext $ctx) {
  # read the file named by the level's path and parse it as JSON
  $options['path'].file.parsejson()
}

lookup_key

Produces values per key - called multiple times.

Slightly more complex because of the added flexibility/power, but still not complicated to implement.

A “prefixer” added to our hiera.yaml

- name: "Using example with prefix"
  path: "examples/%{trusted.certname}.yaml"
  lookup_key: mymodule::myjson_with_prefix  # the example function
  options:
    prefix: "Yo, Waldo! The value is: "

transforming values by key…

function mymodule::myjson_with_prefix(
  Variant[String, Numeric] $key,     # the key being looked up
  Hash                     $options, # the options from hiera.yaml
  Puppet::LookupContext    $ctx      # the context/helper
) {
  # parse the file once; the context caches the parsed result
  $hash = $ctx.cached_file_data($options['path']) |$content| {
    $content.parsejson()
  }
  $val = $hash[$key]
  case $val {
    String  : { "${options['prefix']}${val}" }
    NotUndef: { $val }
    default : { $ctx.not_found() }
  }
}

data_dig

Like lookup_key but is responsible for any digging into the key.

because lookup(“users.jane_doe.pager_nbr”) would be terrible if there are thousands of users…
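A minimal sketch of what a data_dig function could look like, assuming the function receives the key already split into segments (for example ['users', 'jane_doe', 'pager_nbr']). The function name and the inline data are hypothetical; a real backend would fetch only the requested entry from its source instead of holding everything in a hash.

```puppet
# Hypothetical file: mymodule/functions/users_dig.pp
function mymodule::users_dig(
  Array[Variant[String, Numeric]] $segments, # the key, split on '.'
  Hash                            $options,  # the options from hiera.yaml
  Puppet::LookupContext           $ctx       # the context/helper
) {
  # Stand-in data, for illustration only.
  $data = { 'users' => { 'jane_doe' => { 'pager_nbr' => '555-0100' } } }

  # The function does the digging itself, one segment at a time.
  $val = dig($data, *$segments)
  case $val {
    NotUndef: { $val }
    default : { $ctx.not_found() }
  }
}
```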

Puppet::LookupContext object

Method: what it does

not_found(): Immediately returns from the function and tells hiera there is no value for the key

interpolate(value): Performs hiera style interpolation on the given string value

environment_name(), module_name(): Produce the name of the environment or module whose hiera.yaml this function is part of

cache(key, value), cache_all(hash): Adds values to a cache.

cached_value(key), cache_has_key(key), cached_entries(): Retrieves value(s) from the cache

cached_file_data(path) |$content| {...}: Reads and caches the contents of a file, or the transformed content of a file

explain() || { 'message' }: Emits an “explain” message if --explain mode is on
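A sketch of how the cache and explain helpers might be combined in a lookup_key function. The function name and the computed value are made up, and the method names (in particular cache_has_key) follow the LookupContext API as I understand it, so check them against your Puppet version.

```puppet
# Hypothetical file: mymodule/functions/cached_lookup.pp
function mymodule::cached_lookup(
  Variant[String, Numeric] $key,
  Hash                     $options,
  Puppet::LookupContext    $ctx
) {
  if $ctx.cache_has_key($key) {
    # Only evaluated when the lookup runs in explain mode.
    $ctx.explain() || { "Returning cached value for ${key}" }
    $ctx.cached_value($key)
  } else {
    # Stand-in for an expensive computation or remote call.
    $value = "computed-${key}"
    $ctx.cache($key, $value)
    $value
  }
}
```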

Ideas for backends

• DRY up data - use lookup inside backend to compose values from lookups - earlier not possible for arrays and hashes.

• Computed values - given input from hiera.yaml (maybe even a key), other values can be derived

• Provide different data sets (in a module; for example “standalone” vs. a “client server” configuration) that can be integrated. While a module cannot directly supply global keys, it can provide the data/backend that does so if added to the env’s hiera.yaml!

Cage Fight !

• Faster

• Cleaner

• More powerful, yet Simpler

• Explain Support

3 vs 5 - deprecated bad magic

• Nothing good came from using these hiera 3 magic variables:

$calling_module
$calling_class
$calling_class_path

• Hiera 3 could use these as a hacky predecessor of module data, but anything you were doing with them is better accomplished with the module layer. You can continue using these in a version 3 hiera.yaml file, but you’ll need to remove them once you update your global config to version 5.

• If you used them to split up data into multiple files (per module etc.), use the ‘glob’ pattern instead.

3 vs 5

• Use lookup() instead of hiera_xxx()

• Use lookup() + include() instead of hiera_include()

• Use lookup CLI instead of hiera CLI

• No global merge/deep-merge setting (was: horrible!) - use lookup options.

• Move to using hiera 5 backends!

• The ‘data binding terminus’ (advanced hackery) is no longer used - write a backend instead

• Hiera 5 is faster (much thanks to caching) and with greatly reduced risk of memory leaks due to mistakes in backends

• You can call lookup() from within backend functions! Can do what hiera 3 alias never could (hiera 3 - limited to strings).

• The lookup_key function opens up for advanced data composition - merge multiple (different) keys into one etc.
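The first two points can be sketched as follows. The key names are hypothetical, and, like the hiera_xxx() calls they replace, these lookups fail if the key is not found and no default is given.

```puppet
# Hiera 3: $servers = hiera_array('ntp::servers')
$servers = lookup('ntp::servers', Array[String], 'unique')

# Hiera 3: $users = hiera_hash('users')
$users = lookup('users', Hash, 'hash')

# Hiera 3: hiera_include('classes')
lookup('classes', Array[String], 'unique').include
```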

And the winner is…

Questions?