CLI Reference

In this tool, the -d flag refers to the Synapse ID of a folder, found under the Files tab, that contains a manifest and data. This folder is the “Top Level Folder”. Providing a dataset_id is optional, but if you want to pull existing annotations with the -a flag and the manifest is file-based, you must provide a dataset_id.

Generate a new manifest as a Google Sheet

schematic manifest -c /path/to/config.yml get -dt <your data type> -s

Generate an existing manifest from Synapse

schematic manifest -c /path/to/config.yml get -dt <your data type> -d <your synapse "Top Level Folder" folder id> -s
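To prepopulate the manifest with existing annotations, the same command can be combined with the -a flag. The sketch below is illustrative only: the data type (Biospecimen) and folder ID (syn12345678) are hypothetical placeholders, and the script prints the assembled command for review instead of running it.

```shell
#!/bin/sh
# Hypothetical values for illustration; substitute your own.
config="/path/to/config.yml"
data_type="Biospecimen"
dataset_id="syn12345678"

# -d names the "Top Level Folder", -a pulls existing annotations,
# and -s asks for a link to the manifest instead of a dataframe.
cmd="schematic manifest -c $config get -dt $data_type -d $dataset_id -a -s"

# Print the command for review; to run it, use: sh -c "$cmd"
echo "$cmd"
```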

Validate a manifest

schematic model -c /path/to/config.yml validate -dt <your data type> -mp <your csv manifest path>

Submit a manifest as a file

schematic model -c /path/to/config.yml submit -mp <your csv manifest path> -d <your synapse "Top Level Folder" id> -vc <your data type> -mrt file_only

In depth guide

schematic

Command line interface to the schematic backend services.

schematic [OPTIONS] COMMAND [ARGS]...

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

--version: Show the version and exit.

manifest

Sub-commands with Manifest Generation utilities/methods.

schematic manifest [OPTIONS] COMMAND [ARGS]...

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-c, --config <config>: Specify the path to the config.yml using this option. This is a required argument.

Environment variables

SCHEMATIC_CONFIG: Provide a default for -c
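As a small sketch of how this variable might be used, the script below exports SCHEMATIC_CONFIG once and then assembles a command that leaves out -c. The config path and the data type (Biospecimen) are placeholders, and the command is only printed, not executed.

```shell
#!/bin/sh
# Set the default config path once for this shell session (placeholder path).
export SCHEMATIC_CONFIG="/path/to/config.yml"

# With the variable set, -c is expected to fall back to it,
# so the option is omitted here. The data type is hypothetical.
cmd="schematic manifest get -dt Biospecimen -s"

echo "SCHEMATIC_CONFIG=$SCHEMATIC_CONFIG"
echo "$cmd"
```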

download

Download a manifest from the asset store (Synapse).

schematic manifest download [OPTIONS]

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-d, --dataset_id <dataset_id>: Specify the synID of a dataset folder on Synapse. If a manifest already exists in that folder, it will be pulled with its existing annotations for further annotation/modification.

-nmn, --new_manifest_name <new_manifest_name>: Specify the new name to download the manifest file as.
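A possible invocation using both options is sketched below. The folder ID and the new file name are made-up placeholders, and the script prints the command without contacting Synapse.

```shell
#!/bin/sh
# Hypothetical values for illustration; substitute your own.
config="/path/to/config.yml"
dataset_id="syn12345678"
new_name="biospecimen_manifest"

# -d selects the dataset folder; -nmn sets the name of the downloaded file.
cmd="schematic manifest -c $config download -d $dataset_id -nmn $new_name"

# Print the command for review; to run it, use: sh -c "$cmd"
echo "$cmd"
```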

get

Running CLI with manifest generation options.

schematic manifest get [OPTIONS]

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-t, --title <title>: Specify the title of the manifest (or title prefix of multiple manifests) that will be created at the end of the run. You can either explicitly pass the title of the manifest here or provide it in the config.yml file as a value for the (manifest > title) key.

-dt, --data_type <data_type>: Specify the component(s) (data type) from the data model that is to be used for generating the metadata manifest file. To make all available manifests enter ‘all manifests’. You can either explicitly pass the data type here or provide it in the config.yml file as a value for the (manifest > data_type) key.

-p, --path_to_data_model <path_to_data_model>

-d, --dataset_id <dataset_id>: Specify the synID of a dataset folder on Synapse. If a manifest already exists in that folder, it will be pulled with its existing annotations for further annotation/modification.

-s, --sheet_url: This is a boolean flag. If the flag is provided when the command line utility is executed, the result will be a link/URL to the metadata manifest file. If not, a pandas dataframe of the manifest is produced instead.

-o, --output_csv <output_csv>: Path to where the CSV manifest template should be stored.

-oxlsx, --output_xlsx <output_xlsx>: Path to where the Excel manifest template should be stored.

-a, --use_annotations: This is a boolean flag. If the flag is provided when the command line utility is executed, the template will be prepopulated with existing annotations from Synapse.

-js, --json_schema <json_schema>: Specify the path to the JSON Validation Schema for this argument. You can either explicitly pass the .json file here or provide it in the config.yml file as a value for the (model > location) key.

-av, --alphabetize_valid_values <alphabetize_valid_values>: Specify whether to alphabetize valid attribute values in ascending (a) or descending (d) order. Optional.

-dml, --data_model_labels <data_model_labels>

Choose how labels are set in the data model. display_label: use the display name as the label if it is valid (contains no blacklisted characters); otherwise fall back to class_label. class_label (default): use the standard class or property label. Do not change from the default unless there is a real need; using ‘display_label’ can have consequences if not used properly.

Options: class_label | display_label
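The options above can be combined to write a titled CSV template to disk. This is a sketch only: the data type, title, and output path are hypothetical, and the values contain no spaces so that plain string assembly stays safe.

```shell
#!/bin/sh
# Hypothetical values for illustration; substitute your own.
config="/path/to/config.yml"
data_type="Biospecimen"
title="example_manifest"
output_csv="./example_manifest.csv"

# -t sets the manifest title, -o the path where the CSV template is stored,
# and -av a sorts valid values in ascending order.
cmd="schematic manifest -c $config get -dt $data_type -t $title -o $output_csv -av a"

# Print the command for review; to run it, use: sh -c "$cmd"
echo "$cmd"
```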

migrate

Running CLI with manifest migration options.

schematic manifest migrate [OPTIONS]

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-ps, --project_scope <project_scope>: Specify a comma-separated list of projects where manifest entities will be migrated to tables.

-ap, --archive_project <archive_project>: Specify a single project where legacy manifest entities will be stored after migration to table.

-p, --jsonld <jsonld>: Specify the path to the JSON-LD data model (schema) using this option. You can either explicitly pass the schema here or provide a value for the (model > input > location) key.

-re, --return_entities: This is a boolean flag. If flag is provided when command line utility is executed, entities that have been transferred to an archive project will be returned to their original folders.

-dr, --dry_run: This is a boolean flag. If the flag is provided when the command line utility is executed, a dry run will be performed. No manifests will be re-uploaded and no entities will be migrated, but archival folders will still be created. Migration information for testing purposes will be logged at the INFO level.
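Before a real migration, a dry run over the intended projects may be a useful first step. The sketch below assumes hypothetical project IDs and only prints the command. Note that, per the -dr description, archival folders are still created during a dry run.

```shell
#!/bin/sh
# Hypothetical project IDs for illustration; substitute your own.
config="/path/to/config.yml"
project_scope="syn11111111,syn22222222"   # comma-separated list, no spaces
archive_project="syn33333333"

# -ps lists the projects to migrate, -ap the archive project, -dr requests a dry run.
cmd="schematic manifest -c $config migrate -ps $project_scope -ap $archive_project -dr"

# Print the command for review; to run it, use: sh -c "$cmd"
echo "$cmd"
```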

model

Sub-commands for Metadata Model related utilities/methods.

schematic model [OPTIONS] COMMAND [ARGS]...

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-c, --config <config>: Specify the path to the config.yml using this option. This is a required argument.

Environment variables

SCHEMATIC_CONFIG: Provide a default for -c

submit

Running CLI with manifest validation (optional) and submission options.

schematic model submit [OPTIONS]

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-mp, --manifest_path <manifest_path>: Specify the path to the metadata manifest file that you want to submit to a dataset on Synapse. This is a required argument.

-d, --dataset_id <dataset_id>: Specify the synID of the dataset folder on Synapse to which you intend to submit the metadata manifest file. This is a required argument.

-vc, --validate_component <validate_component>: The component or data type from the data model which you can use to validate the data filled in your manifest template.

-hb, --hide_blanks: This is a boolean flag. If the flag is provided when the command line utility is executed, annotations with blank values will be hidden from a dataset’s annotation list in Synapse. If not, annotations with blank values will be displayed.

-mrt, --manifest_record_type <manifest_record_type>

Specify how the manifest should be stored on Synapse. Options are ‘file_only’, ‘file_and_entities’, ‘table_and_file’ and ‘table_file_and_entities’. ‘file_and_entities’ will store the manifest as a csv and create Synapse files for each row in the manifest. ‘table_and_file’ will store the manifest as a table and a csv on Synapse. ‘file_only’ will store the manifest as a csv only on Synapse. ‘table_file_and_entities’ will perform the ‘file_and_entities’ and ‘table_and_file’ options in combination. The default value is ‘table_file_and_entities’.

Options: table_and_file | file_only | file_and_entities | table_file_and_entities

-rr, --restrict_rules: This is a boolean flag. If the flag is provided when the command line utility is executed, the validation suite will only run with in-house validation rules, and Great Expectations rules and suite will not be utilized. If not, the Great Expectations suite will be utilized and all rules will be available.

-fa, --file_annotations_upload, -no-fa, --no-file_annotations_upload: This is a boolean flag. Defaults to True. If False, annotations will not be added to files during submission.

-ps, --project_scope <project_scope>: Specify a comma-separated list of projects to search through for cross manifest validation.

-ds, --dataset_scope <dataset_scope>: Specify a dataset to validate against for filename validation.

-tm, --table_manipulation <table_manipulation>

Specify how manifest tables should be stored on Synapse when a table with the same name already exists. Options are ‘replace’ and ‘upsert’. ‘replace’ will remove the rows and columns from the existing table and store the new rows and columns, preserving the name and synID. ‘upsert’ will add the new rows to the table and preserve the existing rows and columns in the existing table. The default value is ‘replace’. Upsert-specific requirements: ‘upsert’ should be used for initial table uploads if users intend to upsert into them at a later time. Using ‘upsert’ at creation will generate the metadata necessary for upsert functionality. Upsert functionality requires primary keys to be specified in the data model and manifest as <component>_id. Currently, --table_column_names display_name must be used with table upserts.

Options: replace | upsert

-dml, --data_model_labels <data_model_labels>

Choose how labels are set in the data model. display_label: use the display name as the label if it is valid (contains no blacklisted characters); otherwise fall back to class_label. class_label (default): use the standard class or property label. Do not change from the default unless there is a real need; using ‘display_label’ can have consequences if not used properly.

Options: class_label | display_label

-tcn, --table_column_names <table_column_names>

One of class_label, display_label or display_name. The default is class_label. When true, annotations and table columns will be uploaded with the display name formatting, with blacklisted characters removed. To use for tables, use in conjunction with the use_schema_label flag.

Options: class_label | display_label | display_name

-ak, --annotation_keys <annotation_keys>

Store attributes using the class label (default) or using the display label. Attribute display names in the schema must include only characters that are accepted by Synapse. Annotation names may contain only letters, numbers, ‘_’ and ‘.’.

Options: class_label | display_label
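The upsert requirements described under -tm can be put together into a single submission command. The sketch below is an assumption-laden example: the manifest path, folder ID, and component name are hypothetical, and the command is printed so it can be checked before anything is submitted.

```shell
#!/bin/sh
# Hypothetical values for illustration; substitute your own.
config="/path/to/config.yml"
manifest_path="./manifest.csv"
dataset_id="syn12345678"
component="Biospecimen"

# -tm upsert keeps existing rows; the -tm description states that
# --table_column_names display_name is currently required with upserts.
cmd="schematic model -c $config submit -mp $manifest_path -d $dataset_id"
cmd="$cmd -vc $component -mrt table_and_file -tm upsert -tcn display_name"

# Print the command for review; to run it, use: sh -c "$cmd"
echo "$cmd"
```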

validate

Running CLI for manifest validation.

schematic model validate [OPTIONS]

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-mp, --manifest_path <manifest_path>: Specify the path to the metadata manifest file that you want to submit to a dataset on Synapse. This is a required argument.

-dt, --data_type <data_type>: Specify the component (data type) from the data model that is to be used for validating the metadata manifest file. You can either explicitly pass the data type here or provide it in the config.yml file as a value for the (manifest > data_type) key.

-js, --json_schema <json_schema>: Specify the path to the JSON Validation Schema for this argument. You can either explicitly pass the .json file here or provide it in the config.yml file as a value for the (model > input > validation_schema) key.

-rr, --restrict_rules: This is a boolean flag. If the flag is provided when the command line utility is executed, the validation suite will only run with in-house validation rules, and Great Expectations rules and suite will not be utilized. If not, the Great Expectations suite will be utilized and all rules will be available.

-ps, --project_scope <project_scope>: Specify a comma-separated list of projects to search through for cross manifest validation.

-ds, --dataset_scope <dataset_scope>: Specify a dataset to validate against for filename validation.

-dml, --data_model_labels <data_model_labels>

Choose how labels are set in the data model. display_label: use the display name as the label if it is valid (contains no blacklisted characters); otherwise fall back to class_label. class_label (default): use the standard class or property label. Do not change from the default unless there is a real need; using ‘display_label’ can have consequences if not used properly.

Options: class_label | display_label
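For a validation run restricted to the in-house rules, the command might look like the sketch below. The data type and manifest path are hypothetical, and the script only prints the command.

```shell
#!/bin/sh
# Hypothetical values for illustration; substitute your own.
config="/path/to/config.yml"
data_type="Biospecimen"
manifest_path="./manifest.csv"

# -rr skips the Great Expectations suite and uses only in-house validation rules.
cmd="schematic model -c $config validate -dt $data_type -mp $manifest_path -rr"

# Print the command for review; to run it, use: sh -c "$cmd"
echo "$cmd"
```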

schema

Sub-commands for Schema related utilities/methods.

schematic schema [OPTIONS] COMMAND [ARGS]...

convert

Running CLI to convert data model specification in CSV format to data model in JSON-LD format.

Note: This command is not currently configured to build off of a base model, so the --base_schema argument has been removed for now.

schematic schema convert <options> <DATA_MODEL_CSV>

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-dml, --data_model_labels <data_model_labels>

Choose how labels are set in the data model. display_label: use the display name as the label if it is valid (contains no blacklisted characters); otherwise fall back to class_label. class_label (default): use the standard class or property label. Do not change from the default unless there is a real need; using ‘display_label’ can have consequences if not used properly.

Options: class_label | display_label

-o, --output_jsonld <OUTPUT_PATH>: Path where the generated JSON-LD file should be written.

Arguments

<DATA_MODEL_CSV>: Required argument

generate-jsonschema

Command to generate jsonschema files for validation for component(s) of the data model.

schematic schema generate-jsonschema <options>

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-dms, --data_model_source <data_model_source>: Path to the data model file or url to the raw jsonld data model.

-od, --output_directory <output_directory>: Path to directory where jsonschema file(s) should be stored.

-dt, --data_type <data_type>: Specify the component (data type) from the data model that is to be used. Not providing a data type here will generate jsonschema files for all components in the data model.

-dml, --data_model_labels <data_model_labels>

Choose how labels are set in the data model. display_label: use the display name as the label if it is valid (contains no blacklisted characters); otherwise fall back to class_label. class_label (default): use the standard class or property label. Do not change from the default unless there is a real need; using ‘display_label’ can have consequences if not used properly.

Options: class_label | display_label
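To generate a jsonschema file for a single component, the options above might be combined as in this sketch. The data model path, output directory, and component name are hypothetical. Leaving out -dt would, per its description, cover all components.

```shell
#!/bin/sh
# Hypothetical values for illustration; substitute your own.
data_model_source="./data_model.jsonld"
output_directory="./schemas"
data_type="Biospecimen"

# -dms points at the data model, -od at the output directory,
# and -dt limits generation to one component.
cmd="schematic schema generate-jsonschema -dms $data_model_source -od $output_directory -dt $data_type"

# Print the command for review; to run it, use: sh -c "$cmd"
echo "$cmd"
```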

viz

Sub-commands for Visualization methods.

schematic viz [OPTIONS] COMMAND [ARGS]...

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-c, --config <config>: Specify the path to the config.yml using this option. This is a required argument.

Environment variables

SCHEMATIC_CONFIG: Provide a default for -c

attributes

Gets attributes

schematic viz attributes [OPTIONS]

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-dml, --data_model_labels <data_model_labels>

Choose how labels are set in the data model. display_label: use the display name as the label if it is valid (contains no blacklisted characters); otherwise fall back to class_label. class_label (default): use the standard class or property label. Do not change from the default unless there is a real need; using ‘display_label’ can have consequences if not used properly.

Options: class_label | display_label

tangled_tree_layers

Get the components that belong in each layer of the tangled tree visualization.

schematic viz tangled_tree_layers [OPTIONS]

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-ft, --figure_type <figure_type>

Required. Specify the type of schema visualization to make. Either ‘component’ or ‘dependency’.

Options: component | dependency

-dml, --data_model_labels <data_model_labels>

Choose how labels are set in the data model. display_label: use the display name as the label if it is valid (contains no blacklisted characters); otherwise fall back to class_label. class_label (default): use the standard class or property label. Do not change from the default unless there is a real need; using ‘display_label’ can have consequences if not used properly.

Options: class_label | display_label

tangled_tree_text

Get text to be placed on the tangled tree visualization.

schematic viz tangled_tree_text [OPTIONS]

Options

-v, --verbosity <LVL>: Either CRITICAL, ERROR, WARNING, INFO or DEBUG

-ft, --figure_type <figure_type>

Required. Specify the type of schema visualization to make. Either ‘component’ or ‘dependency’.

Options: component | dependency

-tf, --text_format <text_format>

Required. Specify the type of text to gather for tangled tree visualization, either ‘highlighted’ or ‘plain’.

Options: highlighted | plain

-dml, --data_model_labels <data_model_labels>

Choose how labels are set in the data model. display_label: use the display name as the label if it is valid (contains no blacklisted characters); otherwise fall back to class_label. class_label (default): use the standard class or property label. Do not change from the default unless there is a real need; using ‘display_label’ can have consequences if not used properly.

Options: class_label | display_label
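Both required options of tangled_tree_text can be supplied as in the sketch below, with -c given to the viz group before the sub-command. The config path is a placeholder, and the command is printed, not run.

```shell
#!/bin/sh
# Placeholder config path; substitute your own.
config="/path/to/config.yml"

# -ft picks the figure type and -tf the text format; both are required.
cmd="schematic viz -c $config tangled_tree_text -ft component -tf plain"

# Print the command for review; to run it, use: sh -c "$cmd"
echo "$cmd"
```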