Aggregators

An aggregator reduces a set of columns, together with any additional input values, to a single value. Each aggregator has an input type and an output type. Aggregators run before transformers.
Custom aggregators can be implemented in Python or PySpark. See the implementation docs for a detailed guide.
Config
- kind: aggregator
  name: <string>  # aggregator name (required)
  path: <string>  # path to the implementation file, relative to the cortex root (default: implementations/aggregators/<name>.py)
  output_type: <output_type>  # the output type of the aggregator (required)
  input: <input_type>  # the input type of the aggregator (required)
See Data Types for details about input and output types.
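To make the required fields and the default path concrete, here is a minimal Python sketch that checks an aggregator config and fills in the default path described above. The helper name resolve_aggregator_config and the REQUIRED_FIELDS constant are hypothetical and are not part of Cortex. They only illustrate the rules stated in the config comments.

```python
# Hypothetical helper, not part of Cortex: illustrates the documented
# required fields and the default implementation path.
REQUIRED_FIELDS = ("name", "output_type", "input")


def resolve_aggregator_config(config):
    """Return a copy of an aggregator config with the default path filled in."""
    if config.get("kind") != "aggregator":
        raise ValueError("kind must be 'aggregator'")

    missing = [field for field in REQUIRED_FIELDS if field not in config]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))

    resolved = dict(config)
    # Documented default: implementations/aggregators/<name>.py
    resolved.setdefault(
        "path", "implementations/aggregators/{}.py".format(config["name"])
    )
    return resolved
```

A config that omits path resolves to implementations/aggregators/<name>.py, while an explicit path is left unchanged.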
Example
- kind: aggregator
  name: bucket_boundaries
  path: bucket_boundaries.py
  output_type: [FLOAT]
  input:
    num: FLOAT_COLUMN|INT_COLUMN
    num_buckets: INT
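As a rough illustration of the computation such an aggregator might perform, the sketch below takes a numeric column (modeled here as a plain Python list) and a bucket count, and returns a list of floats, matching the [FLOAT] output type above. It assumes equal-width buckets, which is only one possible choice, and it does not use the Cortex implementation interface. The actual function signature expected in bucket_boundaries.py is described in the implementation docs.

```python
# Illustrative only: plain Python, equal-width buckets assumed.
# This is not the Cortex aggregator interface.
def bucket_boundaries(values, num_buckets):
    """Return num_buckets + 1 equal-width boundaries spanning the values."""
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")

    numbers = [float(v) for v in values]  # accepts INT or FLOAT inputs
    if not numbers:
        raise ValueError("values must not be empty")

    low, high = min(numbers), max(numbers)
    width = (high - low) / num_buckets

    boundaries = [low + i * width for i in range(num_buckets)]
    boundaries.append(high)  # use the exact maximum as the final boundary
    return boundaries


print(bucket_boundaries([0, 10, 5, 2], 4))  # [0.0, 2.5, 5.0, 7.5, 10.0]
```

The whole column is reduced to one list value, which is what distinguishes an aggregator from a per-row transformation.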
Built-in Aggregators
Cortex includes common aggregators that can be used out of the box (see aggregators.yaml). To use a built-in aggregator, prefix its name with the cortex namespace (e.g. cortex.mean).