Mode
Important Capabilities
Capability | Status | Notes |
---|---|---|
Detect Deleted Entities | ✅ | Optionally enabled via stateful_ingestion.remove_stale_metadata |
Platform Instance | ✅ | Enabled by default |
Table-Level Lineage | ✅ | Supported by default |
This plugin extracts Charts, Reports, and associated metadata from a given Mode workspace. It is in beta and has only been tested against a PostgreSQL database.
Report
The /api/{account}/reports/{report} endpoint is used to retrieve the following report information.
- Title and description
- Last edited by
- Owner
- Link to the Report in Mode for exploration
- Associated charts within the report
Chart
The /api/{workspace}/reports/{report}/queries/{query}/charts endpoint is used to retrieve the following chart information.
- Title and description
- Last edited by
- Owner
- Link to the chart in Mode
- Datasource and lineage information from Report queries.
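As a rough illustration of how the two endpoint templates above are filled in, the sketch below builds the report and chart URLs from a workspace name, a report token, and a query token. The helper names and the sample tokens are hypothetical, and the sketch assumes that {account} and {workspace} both refer to the same workspace name; it only formats URLs and does not call the Mode API.

```python
from urllib.parse import quote

# Default host taken from the connect_uri option of this source.
DEFAULT_CONNECT_URI = "https://app.mode.com"


def report_url(workspace: str, report: str, connect_uri: str = DEFAULT_CONNECT_URI) -> str:
    """Fill in the /api/{account}/reports/{report} template."""
    base = connect_uri.rstrip("/")
    return f"{base}/api/{quote(workspace, safe='')}/reports/{quote(report, safe='')}"


def charts_url(
    workspace: str, report: str, query: str, connect_uri: str = DEFAULT_CONNECT_URI
) -> str:
    """Fill in the /api/{workspace}/reports/{report}/queries/{query}/charts template."""
    return f"{report_url(workspace, report, connect_uri)}/queries/{quote(query, safe='')}/charts"


# Hypothetical report and query tokens, for illustration only.
print(report_url("datahub", "abc123"))
print(charts_url("datahub", "abc123", "q1"))
```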
The following properties for a chart are ingested in DataHub.
Chart Information
Name | Description |
---|---|
Filters | Filters applied to the chart |
Metrics | Fields or columns used for aggregation |
X | Fields used in X-axis |
X2 | Fields used in second X-axis |
Y | Fields used in Y-axis |
Y2 | Fields used in second Y-axis |
Table Information
Name | Description |
---|---|
Columns | Column names in a table |
Filters | Filters applied to the table |
Pivot Table Information
Name | Description |
---|---|
Columns | Column names in a table |
Filters | Filters applied to the table |
Metrics | Fields or columns used for aggregation |
Rows | Row names in a table |
CLI based Ingestion
Install the Plugin
pip install 'acryl-datahub[mode]'
Starter Recipe
Check out the following recipe to get started with ingestion! See below for full configuration options.
For general pointers on writing and running a recipe, see our main recipe guide.
source:
  type: mode
  config:
    # Coordinates
    connect_uri: https://app.mode.com
    # Credentials
    token: token
    password: pass
    # Options
    workspace: "datahub"
    default_schema: "public"
    owner_username_instead_of_email: False
    api_options:
      retry_backoff_multiplier: 2
      max_retry_interval: 10
      max_attempts: 5
sink:
  # sink configs
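The same recipe can also be expressed as a Python dictionary and run programmatically. The sketch below is a minimal example under a few assumptions: the helper names are hypothetical, the sink is assumed to be a datahub-rest sink at http://localhost:8080, and the acryl-datahub package with the mode extra is installed wherever run_mode_ingestion is called.

```python
def build_mode_recipe(token: str, password: str, workspace: str = "datahub") -> dict:
    """Return the starter recipe as a plain dictionary."""
    return {
        "source": {
            "type": "mode",
            "config": {
                "connect_uri": "https://app.mode.com",
                "token": token,
                "password": password,
                "workspace": workspace,
                "default_schema": "public",
                "owner_username_instead_of_email": False,
                "api_options": {
                    "retry_backoff_multiplier": 2,
                    "max_retry_interval": 10,
                    "max_attempts": 5,
                },
            },
        },
        # Assumption: a local DataHub REST sink; replace with your own sink config.
        "sink": {"type": "datahub-rest", "config": {"server": "http://localhost:8080"}},
    }


def run_mode_ingestion(recipe: dict) -> None:
    """Run the recipe through DataHub's Pipeline API."""
    # Imported lazily so the recipe can be built without acryl-datahub installed.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()
```

A typical call would read the credentials from the environment, for example run_mode_ingestion(build_mode_recipe(os.environ["MODE_TOKEN"], os.environ["MODE_PASSWORD"])), where the two environment variable names are placeholders rather than names the connector looks for.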
Config Details
- Options
- Schema
Note that a "." is used to denote nested fields in the YAML recipe.
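To make the dotted notation concrete, the sketch below expands dotted field names such as api_options.max_attempts into the nested structure that the YAML recipe expects. The function name is hypothetical and is not part of DataHub; it only illustrates the naming convention.

```python
from typing import Any, Dict


def expand_dotted(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Turn {"a.b": 1} into {"a": {"b": 1}}, mirroring nested YAML keys."""
    nested: Dict[str, Any] = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Walk down, creating intermediate mappings as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


config = expand_dotted(
    {
        "workspace": "datahub",
        "api_options.max_attempts": 5,
        "stateful_ingestion.enabled": True,
    }
)
print(config)
```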
Field | Description |
---|---|
password ✅ string(password) | Mode password for authentication. |
token ✅ string | Mode user token. |
connect_uri string | Mode host URL. Default: https://app.mode.com |
default_schema string | Default schema to use when schema is not provided in an SQL query. Default: public |
ingest_embed_url boolean | Whether to ingest the embed URL for Reports. Default: True |
owner_username_instead_of_email boolean | Use username for owner URN instead of email. Default: True |
platform_instance_map map(str,string) | A holder for platform -> platform_instance mappings to generate correct dataset URNs. |
tag_measures_and_dimensions boolean | Tag measures and dimensions in the schema. Default: True |
workspace string | |
env string | The environment that all assets produced by this connector belong to. Default: PROD |
api_options ModeAPIConfig | Retry/Wait settings for Mode API to avoid "Too many Requests" error. See Mode API Options below. Default: {'retry_backoff_multiplier': 2, 'max_retry_interva... |
api_options.max_attempts integer | Maximum number of attempts to retry before failing. Default: 5 |
api_options.max_retry_interval One of integer, number | Maximum interval to wait when retrying. Default: 10 |
api_options.retry_backoff_multiplier One of integer, number | Multiplier for exponential backoff when waiting to retry. Default: 2 |
stateful_ingestion StatefulStaleMetadataRemovalConfig | Base specialized config for Stateful Ingestion with stale metadata removal capability. |
stateful_ingestion.enabled boolean | Whether or not to enable stateful ingestion. Enabled if a pipeline_name is set and either a datahub-rest sink or datahub_api is specified; otherwise disabled. Default: False |
stateful_ingestion.remove_stale_metadata boolean | Soft-deletes the entities present in the last successful run but missing in the current run with stateful_ingestion enabled. Default: True |
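The behavior of stateful_ingestion.remove_stale_metadata can be pictured as a set difference between two runs. The sketch below is conceptual only and is not the connector's implementation; the function name and the sample URNs are made up for illustration.

```python
from typing import Iterable, Set


def find_stale_urns(previous_run: Iterable[str], current_run: Iterable[str]) -> Set[str]:
    """Entities seen in the last successful run but missing from the current run."""
    return set(previous_run) - set(current_run)


# Illustrative URNs only; real URNs are produced by the connector.
previous = {"urn:li:chart:(mode,a)", "urn:li:chart:(mode,b)", "urn:li:chart:(mode,c)"}
current = {"urn:li:chart:(mode,a)", "urn:li:chart:(mode,c)", "urn:li:chart:(mode,d)"}

# Under this model, only chart "b" would be a candidate for soft deletion.
print(find_stale_urns(previous, current))
```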
The JSONSchema for this configuration is inlined below.
{
"title": "ModeConfig",
"description": "Base configuration class for stateful ingestion for source configs to inherit from.",
"type": "object",
"properties": {
"env": {
"title": "Env",
"description": "The environment that all assets produced by this connector belong to",
"default": "PROD",
"type": "string"
},
"platform_instance_map": {
"title": "Platform Instance Map",
"description": "A holder for platform -> platform_instance mappings to generate correct dataset urns",
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"stateful_ingestion": {
"$ref": "#/definitions/StatefulStaleMetadataRemovalConfig"
},
"connect_uri": {
"title": "Connect Uri",
"description": "Mode host URL.",
"default": "https://app.mode.com",
"type": "string"
},
"token": {
"title": "Token",
"description": "Mode user token.",
"type": "string"
},
"password": {
"title": "Password",
"description": "Mode password for authentication.",
"type": "string",
"writeOnly": true,
"format": "password"
},
"workspace": {
"title": "Workspace",
"type": "string"
},
"default_schema": {
"title": "Default Schema",
"description": "Default schema to use when schema is not provided in an SQL query",
"default": "public",
"type": "string"
},
"owner_username_instead_of_email": {
"title": "Owner Username Instead Of Email",
"description": "Use username for owner URN instead of Email",
"default": true,
"type": "boolean"
},
"api_options": {
"title": "Api Options",
"description": "Retry/Wait settings for Mode API to avoid \"Too many Requests\" error. See Mode API Options below",
"default": {
"retry_backoff_multiplier": 2,
"max_retry_interval": 10,
"max_attempts": 5
},
"allOf": [
{
"$ref": "#/definitions/ModeAPIConfig"
}
]
},
"ingest_embed_url": {
"title": "Ingest Embed Url",
"description": "Whether to Ingest embed URL for Reports",
"default": true,
"type": "boolean"
},
"tag_measures_and_dimensions": {
"title": "Tag Measures And Dimensions",
"description": "Tag measures and dimensions in the schema",
"default": true,
"type": "boolean"
}
},
"required": [
"token",
"password"
],
"additionalProperties": false,
"definitions": {
"DynamicTypedStateProviderConfig": {
"title": "DynamicTypedStateProviderConfig",
"type": "object",
"properties": {
"type": {
"title": "Type",
"description": "The type of the state provider to use. For DataHub use `datahub`",
"type": "string"
},
"config": {
"title": "Config",
"description": "The configuration required for initializing the state provider. Default: The datahub_api config if set at pipeline level. Otherwise, the default DatahubClientConfig. See the defaults (https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/graph/client.py#L19).",
"default": {},
"type": "object"
}
},
"required": [
"type"
],
"additionalProperties": false
},
"StatefulStaleMetadataRemovalConfig": {
"title": "StatefulStaleMetadataRemovalConfig",
"description": "Base specialized config for Stateful Ingestion with stale metadata removal capability.",
"type": "object",
"properties": {
"enabled": {
"title": "Enabled",
"description": "Whether or not to enable stateful ingest. Default: True if a pipeline_name is set and either a datahub-rest sink or `datahub_api` is specified, otherwise False",
"default": false,
"type": "boolean"
},
"remove_stale_metadata": {
"title": "Remove Stale Metadata",
"description": "Soft-deletes the entities present in the last successful run but missing in the current run with stateful_ingestion enabled.",
"default": true,
"type": "boolean"
}
},
"additionalProperties": false
},
"ModeAPIConfig": {
"title": "ModeAPIConfig",
"type": "object",
"properties": {
"retry_backoff_multiplier": {
"title": "Retry Backoff Multiplier",
"description": "Multiplier for exponential backoff when waiting to retry",
"default": 2,
"anyOf": [
{
"type": "integer"
},
{
"type": "number"
}
]
},
"max_retry_interval": {
"title": "Max Retry Interval",
"description": "Maximum interval to wait when retrying",
"default": 10,
"anyOf": [
{
"type": "integer"
},
{
"type": "number"
}
]
},
"max_attempts": {
"title": "Max Attempts",
"description": "Maximum number of attempts to retry before failing",
"default": 5,
"type": "integer"
}
},
"additionalProperties": false
}
}
}
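To give a feel for how the three ModeAPIConfig fields interact, the sketch below computes a wait schedule under one common exponential-backoff formula: the wait starts at retry_backoff_multiplier, doubles after each failed attempt, and is capped at max_retry_interval. This formula and the function name are assumptions for illustration; the exact formula and the time unit used by the connector are not stated in this page.

```python
from typing import List


def backoff_schedule(
    retry_backoff_multiplier: float = 2,
    max_retry_interval: float = 10,
    max_attempts: int = 5,
) -> List[float]:
    """Waits between attempts, assuming a doubling backoff capped at the maximum."""
    waits = []
    # There is no wait after the final attempt, so max_attempts yields max_attempts - 1 waits.
    for attempt in range(1, max_attempts):
        wait = retry_backoff_multiplier * 2 ** (attempt - 1)
        waits.append(min(wait, max_retry_interval))
    return waits


# With the defaults from the schema above: [2, 4, 8, 10]
print(backoff_schedule())
```

Under this assumed formula, raising max_retry_interval lengthens only the later waits, while raising retry_backoff_multiplier scales every wait until the cap is reached.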
See Mode's Authentication documentation for how to generate the token and password.
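Once generated, the token and password are typically sent as HTTP Basic credentials. The sketch below builds such a header with the standard library; it assumes the token acts as the username and the password as the secret, and the function name is hypothetical. Check Mode's Authentication documentation for the authoritative scheme.

```python
import base64
from typing import Dict


def basic_auth_header(token: str, password: str) -> Dict[str, str]:
    """Build an HTTP Basic Authorization header from a token/password pair."""
    credentials = f"{token}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


# Placeholder credentials matching the starter recipe values.
headers = basic_auth_header("token", "pass")
print(sorted(headers))
```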
Code Coordinates
- Class Name: datahub.ingestion.source.mode.ModeSource
- Browse on GitHub
Questions
If you've got any questions on configuring ingestion for Mode, feel free to ping us on our Slack.