Metavault Configuration
The component name of metaVault is DFAKTO_METAVAULT. Every environment variable that configures this component must be prefixed with this value.
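As an illustration of the prefixing rule, the sketch below builds an environment variable name from the component prefix and a configuration path. The double underscore separator is an assumption borrowed from the usual .NET environment variable convention; the source does not state which separator beVault expects, and `env_var_name` is a hypothetical helper, not part of the product.

```python
# Hypothetical helper: builds an environment variable name for metaVault.
# ASSUMPTION: sections are joined with "__" (the common .NET convention).
# The documentation only states that the DFAKTO_METAVAULT prefix is required.
PREFIX = "DFAKTO_METAVAULT"


def env_var_name(*path: str, prefix: str = PREFIX, separator: str = "__") -> str:
    """Join the component prefix and the configuration path into one name."""
    if not path:
        raise ValueError("at least one configuration key is required")
    return separator.join((prefix, *path))


if __name__ == "__main__":
    print(env_var_name("Authentication", "RequireHttpsMetadata"))
```

If your deployment uses a different separator, pass it explicitly through the `separator` argument.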
Authentication
This section is used to configure the Identity provider used to manage the users and log in to the different components of the solution. See Identity provider configuration to help configure an identity provider for beVault.
The default Scope is: “profile email OpenID”
RequireHttpsMetadata defines the protocol used to communicate with the OAuth authentication provider (Keycloak). “True”, which is the default, requires HTTPS. Local deployments may want to set it to “false”.
Servers
To create an environment in beVault, and therefore a database to store your data vault, you need to configure at least one server. You can add as many servers as you need. This is useful, for example, to host your testing environment on a different server from your production one.
We recommend having all your environments on the same type of database to avoid issues with your custom SQL code (information mart, hard rules and data quality controls).
Supported target databases
Here are the currently supported databases where you can deploy your Data Vault:
PostgreSQL
Snowflake
SQL Server
IbmDb2 (in alpha)
Configuration
The configuration may differ a little depending on the type of database you target.
Name: the host where the DBMS resides, for example localhost
allowCustomEnvironmentDatabaseName : Whether users are allowed to choose a database name when creating an environment. If set to false, the database name will have the following format: project_environment
DatabaseType : The type of database. Expected values:
PostgreSQL (for versions 12 → 14)
PostgreSQL15 (for versions 15 → …)
SQLServer
Snowflake
Username : The name of the database user. This user needs read and write access to the database, as well as the right to create tables, schemas, and databases (if the user has no right to create databases, you will have to create them manually).
Password : The password of the database user
ReadOnlyUsername : Username of a user with the stg_reader, ref_reader, dv_reader and im_reader roles
ReadOnlyPassword : Password of the read-only user
ConnectionStringSuffix : (1.6.8+) Additional parameters that will be appended to the connection string
EngineParameters: A list of parameters to pass to the Datavault Engine. These are specific to the database flavor. See below for details per database.
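The two rules above that are easiest to get wrong are the choice of DatabaseType for PostgreSQL and the default database name. The sketch below restates them as code. Both function names are hypothetical, and the naming function simply applies the project_environment format as written, without any normalisation beVault may perform.

```python
def postgres_database_type(major_version: int) -> str:
    """Pick the DatabaseType value for a PostgreSQL major version."""
    if major_version >= 15:
        return "PostgreSQL15"
    if 12 <= major_version <= 14:
        return "PostgreSQL"
    # Versions below 12 are not listed in the documentation.
    raise ValueError(f"PostgreSQL {major_version} is not a documented version")


def default_database_name(project: str, environment: str) -> str:
    """Name used when allowCustomEnvironmentDatabaseName is false."""
    return f"{project}_{environment}"


if __name__ == "__main__":
    print(postgres_database_type(14))
    print(default_database_name("sales", "test"))
```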
Engine parameters
PostgreSQL
Parameter name | Expected value | Effect |
---|---|---|
FORCE_LOWERCASE | True/False | All identifiers to be generated and used for the database are changed to lowercase. This option is used to mimic the behavior of previous versions of beVault. Note: If you migrate a beVault from version 2.X to version 3.0.0, you need to activate this option. |
Snowflake
Parameter name | Expected value | Effect |
---|---|---|
DATAWAREHOUSE | Warehouse name | Snowflake has the concept of a “Warehouse”, which is the entity doing the work when querying the database. This parameter specifies the warehouse to use. If not set, Snowflake selects the role’s default warehouse. |
Step Functions
This section provides the information needed to connect to the orchestrator, which is either AWS Step Functions or, most likely, dFakto states.
AuthenticationKey and AuthenticationSecret : Credentials used to authenticate securely with the orchestrator. The values need to be retrieved from the orchestrator.
serviceUrl: URL where the orchestrator is running (not required when using AWS Step Functions)
roleArn and AWSRegion : If the orchestrator is dFakto states, these can be left at their default values.
RegisterRetryDelay: Delay in seconds between two attempts to register an activity.
TaskTimeoutSeconds: METAVAULT CONFIG - When the user pushes a new version of the data vault, state machines whose purpose is to load the data vault are generated and sent to the orchestrator. This field sets the maximum duration of each state of a state machine.
HeartbeatTimeoutSeconds: METAVAULT CONFIG - Same as the previous field, except that it sets the maximum time the orchestrator will wait between heartbeats from the workers. Can most likely be left at the default value.
DefaultMaxConcurrency: WORKERS CONFIG - (1.5.1+) Maximum number of tasks processed at the same time for a given activity. Some workers can have a hard-coded value and ignore the default configuration.
DefaultHeartbeatDelay: WORKERS CONFIG - (1.5.1+) Default delay in seconds between two heartbeats sent to the server while processing a task. Some workers can have a hard-coded value and ignore the default configuration.
EnvironmentName: WORKERS CONFIG - For the other workers, this can be set to any value. The value is used to prefix the activities the worker connects to. For example, if the value chosen is “Prod”, the worker will connect to the Prod-gzip activity.
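The EnvironmentName prefixing rule can be sketched as follows. The helper name `activity_name` is hypothetical; the hyphen separator is taken from the Prod-gzip example above.

```python
def activity_name(environment_name: str, activity: str) -> str:
    """Prefix an activity with the worker's EnvironmentName."""
    return f"{environment_name}-{activity}"


if __name__ == "__main__":
    # With EnvironmentName set to "Prod", the gzip worker uses "Prod-gzip".
    print(activity_name("Prod", "gzip"))
```

Workers configured with different EnvironmentName values therefore connect to different activities, which keeps environments sharing one orchestrator separate.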
Logs
By default, all applications send a reasonable amount of logs to the console. The configuration can be updated using the Serilog configuration section.
Here is the configuration of the logs. The most useful field to set is probably the path field, which sets where the logs will be stored on the disk.
For the other options :
MinimumLevel: Indicates the minimum level of the logs to store. From low to high, the levels are Verbose, Debug, Information, Warning, Error and Fatal.
rollOnFileSizeLimit: Indicates whether a new log file is created when the current one reaches its size limit.
fileSizeLimitBytes: Indicates the size limit of a log file. Once this size is reached, a new file is created if rollOnFileSizeLimit is set to true.
retainedFileCountLimit: Indicates how many log files are kept. Once this limit is reached, the oldest log file is replaced.
formatter: The formatter decides the format of the logs (text, JSON, …)
For more options, see https://github.com/serilog/serilog-settings-configuration
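To make the effect of MinimumLevel concrete, here is a small model of the level ordering. It is an illustration in Python rather than Serilog code; the class and function names are hypothetical, and only the ordering of the six levels comes from the text above.

```python
from enum import IntEnum


class LogLevel(IntEnum):
    """Log levels ordered from low to high."""
    Verbose = 0
    Debug = 1
    Information = 2
    Warning = 3
    Error = 4
    Fatal = 5


def is_stored(event_level: LogLevel, minimum_level: LogLevel) -> bool:
    """An event is kept when its level is at or above MinimumLevel."""
    return event_level >= minimum_level


if __name__ == "__main__":
    minimum = LogLevel.Information
    kept = [level.name for level in LogLevel if is_stored(level, minimum)]
    print(kept)
```

Lowering MinimumLevel to Debug or Verbose increases the volume of stored logs, so it is usually reserved for troubleshooting.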
Git
{
"git": {
"EndOfLine": "\n"
}
}
The git section of the config contains only one field, EndOfLine. This field can most likely be left untouched. It may need to be changed if a .git folder is copied from a Unix system to a Windows system or vice versa, because the two systems encode the end-of-line character differently.
Query
deploymentTimeoutSeconds: The number of seconds a single query can run against the database during (or during the analysis of) the deployment. Note that a deployment typically executes hundreds of small queries, so this should be pretty small. Defaults to 30 (seconds).
defaultWorkerTimeoutSeconds: The number of seconds, by default, a single query can run against the database during the execution of the Metavault workers (BulkImportExport or DatavaultQuery). Defaults to 7200 (seconds, i.e. 2 hours). A single workflow step in the orchestrator may override this timeout using the worker parameter TimeoutSeconds.
migrationTimeoutSeconds: The number of seconds a single query can run against the database during a migration. This was especially relevant for the 3.0 migration. Defaults to 600 (seconds).
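The relationship between defaultWorkerTimeoutSeconds and the per-step TimeoutSeconds parameter can be sketched as a simple fallback. This is a hedged illustration, not the worker's actual code: `effective_worker_timeout` is a hypothetical name, and representing step parameters as a dictionary is an assumption.

```python
DEFAULT_WORKER_TIMEOUT_SECONDS = 7200  # documented default (2 hours)


def effective_worker_timeout(
    step_parameters: dict,
    default_seconds: int = DEFAULT_WORKER_TIMEOUT_SECONDS,
) -> int:
    """Return the step's TimeoutSeconds if present, else the configured default."""
    override = step_parameters.get("TimeoutSeconds")
    if override is None:
        return default_seconds
    return int(override)


if __name__ == "__main__":
    print(effective_worker_timeout({}))
    print(effective_worker_timeout({"TimeoutSeconds": 900}))
```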
Other
Web server config
HTTP Strict Transport Security (HSTS) is a simple and widely supported standard to protect visitors by ensuring that their browsers always connect to a website over HTTPS.
MaxAge: The length of time the browser should remember that a site must only be accessed using HTTPS.
Preload: Secure connections can be enforced at a higher level, even before a website is visited for the first time, by means of the HSTS preload list. This is a list, managed by Google, of domain names that support HSTS by default.
ForwardedHeadersOptions controls which proxied headers are applied to incoming requests. You can most likely leave it at the default value, All.
The accepted values are :
All : Process X-Forwarded-For, X-Forwarded-Host and X-Forwarded-Proto.
None : Do not process any forwarded headers.
XForwardedFor : Process X-Forwarded-For, which identifies the originating IP address of the client.
XForwardedHost : Process X-Forwarded-Host, which identifies the original host requested by the client.
XForwardedProto : Process X-Forwarded-Proto, which identifies the protocol (HTTP or HTTPS) the client used to connect.
For more information, see https://docs.microsoft.com/en-us/aspnet/core/host-and-deploy/proxy-load-balancer?view=aspnetcore-5.0
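The accepted values behave like combinable flags, with All covering the three individual headers. The sketch below models that relationship in Python. It mirrors the list above rather than the ASP.NET Core source, and the numeric values are illustrative assumptions.

```python
from enum import Flag


class ForwardedHeaders(Flag):
    """Model of the accepted values; numeric values are illustrative."""
    NONE = 0
    X_FORWARDED_FOR = 1
    X_FORWARDED_HOST = 2
    X_FORWARDED_PROTO = 4
    ALL = X_FORWARDED_FOR | X_FORWARDED_HOST | X_FORWARDED_PROTO


HEADER_NAMES = {
    ForwardedHeaders.X_FORWARDED_FOR: "X-Forwarded-For",
    ForwardedHeaders.X_FORWARDED_HOST: "X-Forwarded-Host",
    ForwardedHeaders.X_FORWARDED_PROTO: "X-Forwarded-Proto",
}


def processed_headers(setting: ForwardedHeaders) -> list[str]:
    """List the HTTP headers that a given setting causes to be processed."""
    return [name for flag, name in HEADER_NAMES.items() if flag in setting]


if __name__ == "__main__":
    print(processed_headers(ForwardedHeaders.ALL))
    print(processed_headers(ForwardedHeaders.X_FORWARDED_PROTO))
```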
Prometheus integration
Prometheus (https://prometheus.io/) is an open-source systems monitoring and alerting toolkit.
DefaultContextLabel: Metrics recorded are grouped into “Contexts”, for example a database context or application context. Metrics names should be unique per context. The default is “Application”.
Enabled: Allows recording of all metrics to be enabled/disabled, default is true.
ApdexTrackingEnabled: Allows enabling/disabling of calculating the Apdex score on the overall response times. Defaults to true. The Apdex (Application Performance Index) is used to monitor end-user satisfaction. It is an open industry standard that estimates the end user’s satisfaction level on an application’s response time through a score between 0 and 1.
apdexTSeconds: The Apdex T seconds value used in calculating the score on the samples collected.
IgnoredHttpStatusCode: Allows specific HTTP status codes to be ignored when reporting on response-related information, e.g., you might not want to monitor 404 status codes.
IgnoredRoutesRegexPatterns: A list of regex patterns used to ignore matching routes from metrics tracking.
Oauth2TrackingEnabled: Allows recording of all OAuth2 Client tracking to be enabled/disabled. Defaults to true.
MetricsEndPointEnabled: Allows enabling/disabling of the /metrics endpoint. When disabled, requests to it result in a 404 status code. The default is true.
MetricsTextEndpointEnabled: Allows enabling/disabling of the /metrics-text endpoint. When disabled, requests to it result in a 404 status code. The default is true.
EnvironmentInfoEndpointEnabled: Allows enabling/disabling of the /env endpoint. When disabled, requests to it result in a 404 status code. The default is true.
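To show how the Apdex T value shapes the score, here is a minimal sketch of the standard Apdex formula: responses at or below T count as satisfied, responses between T and 4T count as tolerating (weighted one half), and slower responses count as frustrated. The function name is hypothetical, and the sketch does not reproduce how the metrics library samples response times.

```python
def apdex_score(response_times_seconds: list[float], t_seconds: float) -> float:
    """Compute (satisfied + tolerating / 2) / total for the given samples."""
    if not response_times_seconds:
        raise ValueError("at least one sample is required")
    satisfied = sum(1 for t in response_times_seconds if t <= t_seconds)
    tolerating = sum(
        1 for t in response_times_seconds if t_seconds < t <= 4 * t_seconds
    )
    return (satisfied + tolerating / 2) / len(response_times_seconds)


if __name__ == "__main__":
    samples = [0.1, 0.3, 0.6, 1.0, 3.0]
    # With T = 0.5: 2 satisfied, 2 tolerating, 1 frustrated.
    print(apdex_score(samples, t_seconds=0.5))
```

A smaller apdexTSeconds makes the score stricter, because fewer responses fall inside the satisfied and tolerating ranges.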
Sentry
Sentry (https://sentry.io/) is an application monitoring platform.
DSN: where to send events, so the events are associated with the correct project.
IncludeRequestPayload: whether we should send the request body to Sentry. This is done so that the request data can be read at a later point in case an error happens while processing the request.
SendDefaultPii: Whether we should report the user who made the request.
MinimumBreadcrumbLevel: Configures the lowest level a message must have to become a breadcrumb. Breadcrumbs are the last (by default 100) logs that were sent before the event was fired to Sentry.
MinimumEventLevel: A LogLevel which indicates the minimum level a log message must have to be sent to Sentry as an event. By default, this value is Error.
AttachStackTrace: Configures whether Sentry should generate and attach stack traces to capture message calls.
Debug: Turns debug mode on or off. If debug is enabled, Sentry will attempt to print out useful debugging information if something goes wrong with sending the event. The default is always false. It's generally not recommended to turn it on in production, though turning debug mode on will not cause any safety concerns.
DiagnosticsLevel: Debug by default.
DefaultTags: Default tags to add to all events.
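The interaction between MinimumBreadcrumbLevel and MinimumEventLevel can be pictured with a simplified model: messages at or above the event level are sent as events together with the breadcrumbs collected so far, and messages at or above the breadcrumb level are kept in a bounded buffer. This is an illustrative sketch under those assumptions, not the Sentry SDK's implementation, and all names in it are hypothetical.

```python
from collections import deque
from enum import IntEnum


class LogLevel(IntEnum):
    Debug = 1
    Information = 2
    Warning = 3
    Error = 4


class BreadcrumbModel:
    """Simplified model of breadcrumb and event thresholds."""

    def __init__(self, minimum_breadcrumb_level, minimum_event_level,
                 max_breadcrumbs=100):
        self.minimum_breadcrumb_level = minimum_breadcrumb_level
        self.minimum_event_level = minimum_event_level
        # Bounded buffer: the oldest breadcrumb is dropped when full.
        self.breadcrumbs = deque(maxlen=max_breadcrumbs)
        self.events = []

    def log(self, level, message):
        if level >= self.minimum_event_level:
            # The event carries a snapshot of the breadcrumbs seen so far.
            self.events.append((message, list(self.breadcrumbs)))
        if level >= self.minimum_breadcrumb_level:
            self.breadcrumbs.append(message)


if __name__ == "__main__":
    model = BreadcrumbModel(LogLevel.Information, LogLevel.Error)
    model.log(LogLevel.Debug, "ignored")
    model.log(LogLevel.Information, "loading started")
    model.log(LogLevel.Error, "loading failed")
    print(model.events)
```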