Snapshots

The Snapshots submodule lets users manage snapshot groups. A snapshot group is a set of points in time that is used by the information marts, enrichment, and data quality tests.

image-20250930-093655.png


Create a snapshot group

To create a snapshot group, click the “Create Snapshots” button in the upper-right corner of the screen. This opens a modal where the user can enter the descriptive data of the snapshot group.

image-20250930-093719.png


Each snapshot group has the following fields:

  • Snapshot Name (mandatory): the name of the snapshot group

  • Description: the description of the snapshot group

  • Type (mandatory): the type of snapshot group (see the sections below)

    • Reference

    • Schedule

    • Load

  • Arguments: depending on the type selected, additional information may be required (see the sections below)


Once the fields are filled in, click the “Save” button. When the snapshot group is deployed to an environment (see Versions), a state machine is created that populates the snapshot referential with all the dates derived from the configuration made here.

Reference

A reference snapshot group is a set of points in time read from another table of the data vault (typically a referential in the ref schema).

image-20250930-093806.png
  • Reference Table Name: the physical name of the referential containing the list of points in time (e.g., ref.snapshot_referential)

  • Reference Date Column Name: The name of the column that contains the date of the point in time

  • Reference Display Column Name: The name of the column that contains the display name of each point in time
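
For illustration, a referential that could back a reference snapshot group might look like the sketch below. The table name ref.fiscal_period_end, both column names, and the date/text column types are hypothetical and chosen only for this example; the actual referential in a given project may be shaped differently.

```sql
-- Hypothetical referential: table name, column names and types are illustrative only.
CREATE SCHEMA IF NOT EXISTS ref;

CREATE TABLE IF NOT EXISTS ref.fiscal_period_end (
  period_end_date date NOT NULL PRIMARY KEY,  -- would be entered as Reference Date Column Name
  period_label    text NOT NULL               -- would be entered as Reference Display Column Name
);

-- One row per point in time that the snapshot group should contain.
INSERT INTO ref.fiscal_period_end (period_end_date, period_label) VALUES
  (DATE '2025-03-31', 'FY2025 Q1'),
  (DATE '2025-06-30', 'FY2025 Q2'),
  (DATE '2025-09-30', 'FY2025 Q3')
ON CONFLICT (period_end_date) DO NOTHING;
```

With such a table, the three arguments would be ref.fiscal_period_end, period_end_date, and period_label respectively.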

Schedule

A schedule snapshot group generates a new point in time at a fixed frequency (e.g., one new snapshot date every month).

image-20250930-093858.png
  • Schedule (optional): cron expression of the schedule on which the PIT table is loaded. Warning: this schedule must also be set manually on the state machine generated for this snapshot group. Format documentation: https://crontab.guru/

  • Display Format: Format to be applied to generate the display name of the snapshot. For Postgres, you can check the format documentation on this page: https://www.postgresql.org/docs/9.6/functions-formatting.html

  • Lifespan: Number of days after which the snapshot is deleted.
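
To preview how a Display Format pattern would render, the pattern can be tried directly in PostgreSQL. The sketch below assumes the pattern is applied with to_char, which is what the linked formatting page documents; the sample date and the three patterns are illustrative only.

```sql
-- Renders one sample snapshot date with three candidate display patterns.
SELECT
  d                          AS snapshot_date,
  to_char(d, 'YYYY-MM')      AS year_month,      -- 2025-09
  to_char(d, 'DD Mon YYYY')  AS day_month_year,  -- 30 Sep 2025
  to_char(d, 'YYYY "Q"Q')    AS year_quarter     -- 2025 Q3 (double-quoted text is output literally)
FROM (VALUES (TIMESTAMP '2025-09-30 00:00:00')) AS t (d);
```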

The state machine generated for this type of snapshot group creates a new snapshot each time it is executed. The schedule entered here must therefore also be configured on the state machine. Do not execute the state machine manually, as this would create snapshots outside the set schedule.

Load

A load snapshot group generates a new snapshot every time the associated state machine is executed. It requires no additional arguments.

image-20251003-205843.png

Edit a snapshot group

To edit a snapshot group, click the edit button. A modal opens with the descriptive information of the group (see Create a snapshot group).

image-20250930-094050.png


Delete a snapshot group

To delete a snapshot group, click the trash can button. You will be asked to confirm the deletion.

image-20250930-094140.png

Use a snapshot group

As explained earlier, a state machine is generated for each snapshot group when a new version is deployed. The state machine is named as follows: [project]-[environment]-[snapshot name]

image-20250930-094940.png

Once the state machine has been executed, the snapshot dates are available in the table ref.snapshot_dates.

image-20230317-145809.png
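
To check which dates a given snapshot group currently holds, a query along the following lines can be used. It is a minimal sketch that assumes the name column of ref.snapshot_dates contains the snapshot group name (as the 'default' filter in the information mart example suggests); only the name and snapshot_date columns are referenced, since the other columns of the table are not described here.

```sql
-- Lists the snapshot dates of one snapshot group, most recent first.
-- Assumption: ref.snapshot_dates.name identifies the snapshot group.
SELECT
  snapshot_dates.name
  ,snapshot_dates.snapshot_date
FROM ref.snapshot_dates
WHERE snapshot_dates.name = 'default'
ORDER BY snapshot_dates.snapshot_date DESC;
```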

This table can then be used in the information mart scripts to retrieve the values as they were at given points in time, for example:

SELECT
  h_car.bk
  ,vsh_car_crm_info.*
  ,snapshot_dates.snapshot_date
FROM dv.h_car
CROSS JOIN ref.snapshot_dates
INNER JOIN dv.vsh_car_crm_info ON vsh_car_crm_info.hk = h_car.hk
      AND snapshot_dates.snapshot_date BETWEEN vsh_car_crm_info.load_dts AND vsh_car_crm_info.load_end_dts
WHERE snapshot_dates.name = 'default'
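
To see how the BETWEEN condition in the query above picks one satellite version per snapshot date, the following self-contained sketch replaces the three tables with inline sample rows. All values, and the color column, are made up for illustration; only the join logic mirrors the query above.

```sql
-- Inline stand-ins for dv.h_car, dv.vsh_car_crm_info and ref.snapshot_dates (sample data only).
WITH h_car (hk, bk) AS (
  VALUES ('hk1', 'CAR-001')
),
vsh_car_crm_info (hk, color, load_dts, load_end_dts) AS (
  VALUES
    ('hk1', 'red',  TIMESTAMP '2025-01-01 00:00:00', TIMESTAMP '2025-06-30 23:59:59'),
    ('hk1', 'blue', TIMESTAMP '2025-07-01 00:00:00', TIMESTAMP '9999-12-31 00:00:00')
),
snapshot_dates (name, snapshot_date) AS (
  VALUES
    ('default', TIMESTAMP '2025-03-31 00:00:00'),
    ('default', TIMESTAMP '2025-09-30 00:00:00')
)
SELECT
  h_car.bk
  ,vsh_car_crm_info.color
  ,snapshot_dates.snapshot_date
FROM h_car
CROSS JOIN snapshot_dates
INNER JOIN vsh_car_crm_info ON vsh_car_crm_info.hk = h_car.hk
      AND snapshot_dates.snapshot_date BETWEEN vsh_car_crm_info.load_dts AND vsh_car_crm_info.load_end_dts
WHERE snapshot_dates.name = 'default'
ORDER BY snapshot_dates.snapshot_date;
```

The result contains one row per snapshot date: the 2025-03-31 snapshot is matched to the 'red' version and the 2025-09-30 snapshot to the 'blue' version. Because BETWEEN is inclusive on both ends, a snapshot date that falls exactly on a boundary shared by two versions would match both of them.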