Decouple config spec and type spec #43

opensorceror · 2019-01-03T19:04:10Z

It would be helpful to decouple typesafe config files from tscfg's type spec. This would help ensure that the original typesafe config is not modified in any way with special syntax that is not recognized by Typesafe Config or any of its other wrappers.

Example:

Before
application.conf:

endpoint {
  path: "string"
  url: "string | http://example.net"
  serial: "int?"
  interface {
    port: "int | 8080"
  }
}

Instead of specifying tscfg types in this typesafe config file itself, we can decouple the type spec by introducing an optional additional file.

After
application.conf:

endpoint {
  path: "string"
  url: "http://example.net"
  serial: ""
  interface {
    port: "8080"
  }
}

tscfg.conf:

endpoint {
  serial: "int?"
  interface {
    port: int
  }
}

Moreover, since all values will now be strings by default unless explicitly converted (#42), the string type specification becomes redundant. The result is a much simpler typespec.

The user can then supply this optional typespec file like so:
java -jar tscfg-x.y.z.jar --spec example.spec.conf --typespec example.typespec.conf

If the user does not specify the --typespec parameter, tscfg can look for a default tscfg.conf file in the same directory. If one does not exist, tscfg can proceed with default type conversions (i.e., all String).

Let me know what you think!

carueda · 2019-01-04T00:49:13Z

Sorry, I'm not seeing the need for these extra elements. The "decoupling" you mention is precisely the underlying model that is already in place.

The input to tscfg is a config spec -- as such, all the (optional) "extended" semantics on top of the original type syntax, as well as any optional annotations (as explained in the readme) are processed toward generating the wrapper. Then the wrapper is to be used on actual configuration input. So, the general scenario is that config specs are different from the actual config inputs. However, as explained in the readme, a regular config input could also be used as a spec, in which case:

The tool determines the type of each field according to the given value in the input configuration. Used in this way, all fields are considered optional, with the given value as the default. But this wouldn't be flexible enough! To allow the specification of required fields, explicit types, and default values, a string with a simple syntax as follows can be used [...]

In this sense, you may have noted that all the spec examples in the code are named with a "spec" fragment (**.spec.conf), which is not required at all, but used as a convention to make explicit the intention that they are actually specs. (BTW, keeping the extension .conf just helps with the associated syntax highlighting in editors/IDEs, which is desirable given that the underlying syntax continues to be the one supported by Typesafe Config.)

carueda · 2019-01-04T00:56:09Z

So, continuing with your "application.conf" example, I actually call this application.spec.conf (in some of my real applications) to capture the spec of my application, while application.conf would be the name for the actual configuration input for my application at runtime.

opensorceror · 2019-01-04T14:40:46Z

The input to tscfg is a config spec -- as such, all the (optional) "extended" semantics on top of the original type syntax, as well as any optional annotations (as explained in the readme) are processed toward generating the wrapper. Then the wrapper is to be used on actual configuration input

On the surface, this does seem sufficiently generic, but I have a not-so-uncommon case that I think would benefit from the de-coupling. Let me elaborate.

Use case:
I have an application.conf file that contains generic settings:

application.conf:

include "application.preprod.conf"

project {
  mlJob {
    memory = "size | 50G"
    lorem = "ipsum"
  }
}

And I have an application.preprod.conf file that contains settings specific to our preprod environment. These settings override generic settings when the application is running on preprod:

preprod {
  project {
    mlJob {
      memory = "size | 100G"
      lorem = "opium"
      three = "four"
    }
  }
}

The problem is:
If I now supply the application.conf file to tscfg to generate the wrapper, the wrapper contains the generic settings (i.e., default values). However, at runtime, if the application is running on preprod I need to load the preprod settings instead, with fallback on the generic settings. I'm doing this like so:

import com.typesafe.config.ConfigFactory

val cfg = ConfigFactory.load()

// On preprod
val preprodConfig = cfg.getConfig("preprod").withFallback(cfg)

val tscfgMapped = ExampleCfg(preprodConfig)

However, preprodConfig now contains a memory key with value size | 100G. If I supply this to the wrapper, the wrapper throws an exception:

Invalid value at 'memory': Could not parse size-in-bytes number 'size | 100'
    at com.typesafe.config.impl.SimpleConfig.parseBytes(SimpleConfig.java:889)
    at com.typesafe.config.impl.SimpleConfig.getBytes(SimpleConfig.java:290)
    at ca.bell.networkbigdata.typesafe.ExampleCfg$Project$MlJob$.apply(ExampleCfg.scala:22)
    at ca.bell.networkbigdata.typesafe.ExampleCfg$Project$.apply(ExampleCfg.scala:29)
    at ca.bell.networkbigdata.typesafe.ExampleCfg$.apply(ExampleCfg.scala:36)
    at ca.bell.networkbigdata.typesafe.TypesafeTest$.main(TypesafeTest.scala:22)
    at ca.bell.networkbigdata.typesafe.TypesafeTest.main(TypesafeTest.scala)

Understandably, this is because the wrapper tries to invoke typesafe's getBytes method on the supplied value, which fails because it doesn't recognize the tscfg-specific syntax size | 100G.
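For reference, the same failure can be reproduced with Typesafe Config alone, without the generated wrapper. This is only a minimal sketch, assuming the Typesafe Config library is on the classpath; the object name BytesRepro and the inline config strings are made up for illustration and are not part of tscfg:

```scala
import com.typesafe.config.{ConfigException, ConfigFactory}

object BytesRepro {
  // Spec-style value: to Typesafe Config the tscfg syntax is just an ordinary string.
  val specStyle = ConfigFactory.parseString("""memory = "size | 100G" """)

  // Plain runtime value that Typesafe Config can read as a size in bytes.
  val plain = ConfigFactory.parseString("memory = 100G")

  def main(args: Array[String]): Unit = {
    try {
      // Same call that appears in the stack trace above (SimpleConfig.getBytes).
      specStyle.getBytes("memory")
    } catch {
      case e: ConfigException.BadValue =>
        println(s"rejected: ${e.getMessage}")
    }

    // The plain value is accepted.
    println(plain.getBytes("memory"))
  }
}
```

The point of the sketch is only that the "size | ..." string is meaningful to tscfg at generation time, not to Typesafe Config at runtime.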

How do I get the correct size-in-bytes conversion at runtime for the preprod configuration?

With decoupling, this problem wouldn't occur because the typesafe config itself would not contain the tscfg type spec. It would just contain the values, which I could override easily with any valid value at runtime.

Note: I'm not actually doing a size-in-bytes conversion; the use case has been simplified to illustrate my point and support my case.

The only workaround I've currently found is to remove all tscfg-specific syntax after generating the wrapper, and add it back every time before I re-generate the wrapper. But this is tedious and I think the decoupling would remove the need for this.

carueda · 2019-01-05T00:41:15Z

I think you are actually not separating the two concerns you want to decouple: you are trying to use the same files for both the wrapper generation and their loading at runtime.

The typical use of the tool involves the following aspects:

The build-time aspect: that is, your configuration schema specification and the corresponding generated wrapper. Typically (though this is certainly not required), a single **.spec.conf file is used for the specification, as one often wants a single place as the "true" source of the config schema.
The runtime aspect: that is, the concrete configuration (or multiple configurations to handle different environments, etc.) with actual values for the configuration attributes.

With your example, I would have something like the following:

For build-time:

application.spec.conf:

project {
  mlJob {
    memory: "size | 50G"
    lorem: "string | ipsum"
    three: string  # just something required for illustration
  }
}

For runtime:

application.preprod.conf:

project {
  mlJob {
    memory = 100G
    lorem = "opium"
    three = "four"
  }
}

application.conf:

include "application.preprod.conf"

project {
  mlJob {
    memory = 64G
    lorem = "foobar"
  }
}

Note: there are of course multiple possible variations of the above depending on how you load your concrete configurations at runtime. This is just to focus on the difference between the two core concerns. (Also, I slightly changed some values just to highlight places where they would be explicitly needed in case of actually overriding the defaults in the spec.)
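One such variation, sketched in Scala, layers the environment-specific values over the generic ones with withFallback instead of relying on include ordering. This is only an illustration under assumptions: the object name RuntimeLoading is hypothetical, the inline strings stand in for the two runtime files above, and ExampleCfg is the generated wrapper name used earlier in this thread:

```scala
import com.typesafe.config.{Config, ConfigFactory}

object RuntimeLoading {
  // Stand-in for application.preprod.conf (plain values, no tscfg syntax).
  val preprod: Config = ConfigFactory.parseString(
    """project {
      |  mlJob {
      |    memory = 100G
      |    lorem = "opium"
      |    three = "four"
      |  }
      |}""".stripMargin)

  // Stand-in for the generic part of application.conf.
  val generic: Config = ConfigFactory.parseString(
    """project {
      |  mlJob {
      |    memory = 64G
      |    lorem = "foobar"
      |  }
      |}""".stripMargin)

  // Preprod values take precedence; generic values fill in whatever preprod omits.
  val effective: Config = preprod.withFallback(generic).resolve()

  def main(args: Array[String]): Unit = {
    println(effective.getString("project.mlJob.lorem"))
    println(effective.getBytes("project.mlJob.memory"))
    // With the generated wrapper on the classpath, the next step would be:
    // val cfg = ExampleCfg(effective)
  }
}
```

Since neither runtime source contains spec syntax, the wrapper only ever sees values that Typesafe Config can parse.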

One more comment: we have been looking at an "application" use case in this discussion. But it is relevant to note there's also the "library" use case, in which the developer of course won't know exactly what specific configuration values will be used at runtime. In any case, the general "best practice" is to capture the specification of the configuration schema in a separate resource, from which to generate the code for the wrapper. Then, the runtime aspect will pretty much just follow the regular Typesafe Config mechanisms for loading the concrete configuration, from which to construct the wrapper instance.

opensorceror · 2019-01-08T15:43:04Z

This makes perfect sense, thank you for the detailed explanation and the example! I followed your suggestion, and ended up creating a separate **.spec.conf file. I'll close this issue now, since it's not really an issue.

Thank you!

carueda · 2019-01-09T20:25:43Z

Just added a FAQ entry in the readme, which points to a new wiki page: https://github.com/carueda/tscfg/wiki/workflow

opensorceror · 2019-01-10T15:20:26Z

Awesome, thanks!

opensorceror closed this as completed Jan 8, 2019

abonander mentioned this issue Jun 5, 2019

UX improvement: using real config file as spec and with --all-required flag #47

Closed
