A transformer converts a set of columns and arbitrary values into a single transformed column. Each transformer has an input type and an output column type.
Custom transformers can be implemented in Python or PySpark. See the implementation docs for a detailed guide.
- kind: transformer
  name: <string>  # transformer name (required)
  path: <string>  # path to the implementation file, relative to the cortex root (default: implementations/transformers/<name>.py)
  output_type: <column_type>  # the type of column that will be generated by this transformer (required)
  input: <input_type>  # the input type of the transformer (required)
See Data Types for details about input and column types.
- kind: transformer
  name: normalize
  output_type: FLOAT_COLUMN
  input:
    num: INT_COLUMN|FLOAT_COLUMN
    mean: FLOAT
    stddev: FLOAT
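To make the `normalize` configuration concrete, here is a minimal sketch of what a matching Python implementation might look like. The function name `transform_python` and the convention that `input` arrives as a dictionary keyed like the `input` schema are assumptions made for illustration; the implementation docs are the authoritative reference for the actual interface.

```python
# Hypothetical implementation file for the `normalize` transformer.
# Assumption: the function receives a dict whose keys mirror the
# `input` schema in the configuration (num, mean, stddev).

def transform_python(input):
    """Return the standard score of `num` given `mean` and `stddev`."""
    num = input["num"]        # value from an INT_COLUMN or FLOAT_COLUMN
    mean = input["mean"]      # FLOAT value
    stddev = input["stddev"]  # FLOAT value, assumed to be non-zero
    return (num - mean) / stddev


if __name__ == "__main__":
    # (5 - 3.0) / 2.0
    print(transform_python({"num": 5, "mean": 3.0, "stddev": 2.0}))  # prints 1.0
```

The division always yields a float, which is consistent with the declared `output_type` of `FLOAT_COLUMN` even when `num` comes from an `INT_COLUMN`.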
Cortex includes common transformers that can be used out of the box (see transformers.yaml). To use built-in transformers, use the cortex namespace in the transformer name (e.g. cortex.normalize).