Dynamic Terraform Provider
I wanted to create a simple proof-of-concept method of storing and retrieving data.
User X --push--> Database --read--> User Y
This would be used as a sort of dictionary for users to query information being published by others.
The exact nature and source of the data made Terraform an obvious choice for this.
I explored the most basic option:
- A basic boilerplate HTTP call:
data "http" "example" {
  url    = "https://my-db.example.com"
  method = "POST"

  request_headers = {
    Accept = "application/json"
  }

  request_body = jsonencode({
    some_data = "goes here"
  })
}
However, controlling the lifecycle of this could be very difficult: how would we trigger a deletion in the API when the resource is destroyed?
So I started thinking about writing a provider, which could look like:
resource "mydataprovider_item" "this" {
  name       = "Me"
  other_data = "that"
}
This could be fairly simple - just a basic resource with methods to send POST, PATCH and DELETE requests to a simple CRUD API, backed by a database.
~Dynamicismability~ Being Dynamic
I wanted the code (of both the application and the API/database) to be relatively dynamic and indifferent to whichever fields the application needed.
The lifecycle of Terraform providers is the following:
- Write code
- Push to GitHub
- GitHub CI/CD builds and creates a release, along with the signed artifacts for the build
- HashiCorp registry is triggered to scan the new release
- New version appears in registry and is downloadable by Terraform
To add these custom fields in code would require:
- The user of the platform to clone the upstream Terraform provider code
- Modify the code (manually or via some scripting) to inject their fields
- Set up GPG keys and add them to GitHub CI/CD
- Register the new provider with HashiCorp's registry and go through the above process for releasing a version.
Not only is this an incredibly arduous task, it also exposes their implementation to the world - which they may well not want to do.
Another option is to create a local Terraform provider registry (not something that I'm completely against.. shameless plug): the API could act as a registry and provide the binary download. But this would mean the API would need to recompile a Golang binary whenever the field definitions are changed by the user, which would likely be incredibly brittle and, frankly, horrendous.
So… on to exploring only slightly less horrendous options.
Ideally, I'd be able to modify the resource's attributes at runtime. However, Terraform providers specify their schema as code, which poses some issues:
/// Snippet from https://github.com/DockStudios/terraform-provider-terrareg/blob/main/internal/terrareg/module.go
Attributes: map[string]schema.Attribute{
    // ID attribute required for unit testing
    "id": schema.StringAttribute{
        Computed:            true,
        MarkdownDescription: "Full ID of the module",
        PlanModifiers: []planmodifier.String{
            stringplanmodifier.UseStateForUnknown(),
        },
    },
    "namespace": schema.StringAttribute{
        Required:            true,
        MarkdownDescription: "Namespace of the module",
    },
    "name": schema.StringAttribute{
        Required:            true,
        MarkdownDescription: "Module name",
    },
    ...
}
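For contrast, here is a minimal, stdlib-only sketch of the shape we're after: building that attribute map at runtime from fetched definitions instead of hard-coding it. The `AttributeSpec` struct is just a stand-in for the framework's `schema.StringAttribute` and friends, and the `FieldDefinition` list is a hypothetical API response - both are my own illustrative names, not part of any real SDK.

```go
package main

import "fmt"

// AttributeSpec is a stand-in for the framework's schema.Attribute
// types (e.g. schema.StringAttribute) - just enough to show the shape.
type AttributeSpec struct {
	Type     string
	Required bool
}

// FieldDefinition models one user-defined field, as it might be
// returned by a hypothetical endpoint on our API.
type FieldDefinition struct {
	Name     string
	Type     string
	Required bool
}

// buildAttributes constructs the attribute map at runtime from the
// fetched field definitions, rather than hard-coding it in the source.
func buildAttributes(fields []FieldDefinition) map[string]AttributeSpec {
	attrs := map[string]AttributeSpec{
		// The "id" attribute stays static, as in the snippet above
		"id": {Type: "string", Required: false},
	}
	for _, f := range fields {
		attrs[f.Name] = AttributeSpec{Type: f.Type, Required: f.Required}
	}
	return attrs
}

func main() {
	fields := []FieldDefinition{
		{Name: "name", Type: "string", Required: true},
		{Name: "other_data", Type: "string", Required: false},
	}
	for name, spec := range buildAttributes(fields) {
		fmt.Printf("%s: %+v\n", name, spec)
	}
}
```

In a real provider, the loop body would construct `schema.StringAttribute` (or the appropriate typed attribute) values instead.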
For context, there are two critical aspects of a provider that matter here: the above schema definition and the provider configuration. When a provider is used, a provider block is (optionally) supplied, which looks like:
provider "myprovider" {
  some = "config"
  goes = "here"
}
In our case, we will need to utilise this to pass the URL of our API (and perhaps some other bits, such as an API key for authentication).
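A concrete provider block for our case might look something like the following - the `url` and `api_key` attribute names are just placeholders for whatever the provider ends up defining:

```hcl
provider "mydataprovider" {
  url     = "https://my-db.example.com"
  api_key = var.my_api_key
}
```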
When Terraform initialises the provider, it runs a Configure method of the provider, along the lines of:
var _ provider.Provider = &MyProvider{}
var _ provider.ProviderWithFunctions = &MyProvider{}

type MyProvider struct {
    version string
}

type MyProviderModel struct {
    // Specify the provider model attributes, as per example above
    Some types.String `tfsdk:"some"`
    Goes types.String `tfsdk:"goes"`
}

// Here we have the configuration function, which is our entrypoint to obtaining these values
func (p *MyProvider) Configure(ctx context.Context, req provider.ConfigureRequest, resp *provider.ConfigureResponse) {
    var data MyProviderModel

    // Obtain the user-provided config from the provider.ConfigureRequest,
    // appending any errors to the response diagnostics
    resp.Diagnostics.Append(req.Config.Get(ctx, &data)...)
    ...
}
We’ll need to try to obtain these provider model values during the generation of the schema, to be able to query our API.
To define the schema, the original Terraform Plugin SDK (the earlier iteration of Terraform provider implementations) provides a very static approach:
/// Snippet from https://github.com/DockStudios/terraform-provider-jmon/blob/main/jmon/resource_environment.go
func resourceEnvironment() *schema.Resource {
    return &schema.Resource{
        CreateContext: resourceEnvironmentCreate,
        ReadContext:   resourceEnvironmentRead,
        // UpdateContext: resourceEnvironmentUpdate,
        DeleteContext: resourceEnvironmentDelete,
        Schema: map[string]*schema.Schema{
            "name": &schema.Schema{
                Type:     schema.TypeString,
                Required: true,
                ForceNew: true,
            },
        },
        Importer: &schema.ResourceImporter{
            StateContext: schema.ImportStatePassthroughContext,
        },
    }
}
Within this, at runtime, resourceEnvironment is called, which returns a schema.Resource containing a schema.Schema. It takes no arguments and is called early on in provider instantiation by Terraform.
The newer Terraform Plugin Framework provides some additional options:
func (r *ModuleResource) Schema(ctx context.Context, req resource.SchemaRequest, resp *resource.SchemaResponse) {
    resp.Schema = schema.Schema{
        ...
    }
}
During provider initialisation we have the context, schema request and schema response.
Let's dump the context values and see what's available:
func (r *ExampleDynamicResource) Schema(ctx context.Context, req resource.SchemaRequest, resp *resource.SchemaResponse) {
    log.Printf("CONTEXT %+v", ctx)
    ...
}
2024-11-08T07:43:15.293Z [DEBUG] provider.terraform-provider-resoreg: 2024/11/08 07:43:15 CONTEXT context.Background.WithValue(transport.connectionKey, *net.UnixConn).WithValue(peer.peerKey, Peer{Addr: '', LocalAddr: '/var/folders/58/p2bkk3cs03b07bwg8mnx2z6c0000gn/T/plugin392428091', AuthInfo: 'tls'}).WithCancel.WithValue(metadata.mdIncomingKey, metadata.MD).WithValue(grpc.serverKey, *grpc.Server).WithValue(grpc.streamKey, *transport.Stream).WithValue(logging.loggerKey, *hclog.intLogger).WithValue(logging.loggerKey, *hclog.LoggerOptions).WithValue(logging.loggerKey, logging.LoggerOpts).WithValue(logging.loggerKey, *hclog.intLogger).WithValue(logging.loggerKey, *hclog.intLogger).WithValue(logging.loggerKey, *hclog.LoggerOptions).WithValue(logging.loggerKey, logging.LoggerOpts).WithValue(logging.loggerKey, logging.LoggerOpts).WithValue(logging.loggerKey, logging.LoggerOpts).WithValue(logging.loggerKey, logging.LoggerOpts).WithValue(logging.loggerKey, logging.LoggerOpts).WithValue(logging.loggerKey, logging.LoggerOpts).WithValue(logging.loggerKey, logging.LoggerOpts).WithValue(logging.loggerKey, logging.LoggerOpts).WithValue(logging.loggerKey, logging.LoggerOpts).WithValue(logging.loggerKey, logging.LoggerOpts).WithCancel.WithValue(tf6serverlogging.ContextKeyDownstreamRequestStartTime, 2024-11-08 07:43:15.292355 +0000 GMT m=+0.025564492).WithCancel.WithValue(logging.loggerKey, logging.LoggerOpts).WithValue(logging.loggerKey, *hclog.intLogger)
Clearly, at least at this point in the execution, it's mostly used for logging context; the few other context values are just details of the connection from the Terraform binary to the provider.
The SchemaResponse is simply the type through which the provider's Schema function returns the specification of the schema.
The last hope is the SchemaRequest, which may