Create a `requirements.txt` file to specify the dependencies needed by `predictor.py`. Cortex will automatically install them into your runtime once you deploy.

Create a `cortex.yaml` file and add the configuration below. A `RealtimeAPI` provides a runtime for inference and makes your `predictor.py` implementation available as a web service that can serve real-time predictions.

Running `cortex deploy` takes your Predictor implementation along with the configuration from `cortex.yaml` and creates a web API. Monitor the status of your APIs with `cortex get`, and show additional information about a single API with `cortex get <api_name>`.

You can use `curl` to test your API (it will take a few seconds to generate the text).

The `--env` flag specifies the name of the CLI environment to use. CLI environments contain the information necessary to connect to your cluster. The default environment is `local`, and when the cluster was created, a new environment named `aws` was created to point to the cluster. You can change the default environment with `cortex env default <env_name>`.

If you make a change to `predictor.py` or your `cortex.yaml`, you can update your API by re-running `cortex deploy`.

For example, modify `predictor.py` to set the length of the generated text based on a query parameter, then run `cortex deploy` to perform a rolling update of your API. You can monitor the update with `cortex get`.

Add a `compute` field to your API configuration and run `cortex deploy` to update your API with this configuration. Use `cortex get` to check the status of your API; once it's live, prediction requests should be faster.

Set `max_surge` to `0` in the `update_strategy` configuration.

When you're finished, use `cortex delete` to delete each API. Running `cortex delete` will free up cluster resources and allow Cortex to scale down to the minimum number of instances you specified during cluster creation. It will not spin down your cluster.
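To make the configuration steps above concrete, here is a sketch of what a `cortex.yaml` with the `compute` and `update_strategy` fields might look like. The API name, predictor path, and resource values are illustrative assumptions, not taken from the original, and field names may vary between Cortex versions:

```yaml
# Hypothetical cortex.yaml sketch; names and values are illustrative.
- name: text-generator        # assumed API name
  kind: RealtimeAPI
  predictor:
    type: python
    path: predictor.py
  compute:                    # the compute field discussed above
    cpu: 1
    gpu: 1                    # e.g. request a GPU to speed up predictions
  update_strategy:
    max_surge: 0              # avoid extra instances during rolling updates
```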
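The query-parameter change to `predictor.py` described above could be sketched as follows. The parameter name `length`, the payload shape, and the stub "model" are assumptions so the sketch stays self-contained; check the Predictor interface for your Cortex version before relying on the exact `predict` signature:

```python
class PythonPredictor:
    def __init__(self, config):
        # A real implementation would load the text-generation model here;
        # this stub "generates" text by truncating the prompt so the sketch
        # runs without any model dependencies.
        self.generate = lambda prompt, length: (prompt + " ...")[:length]

    def predict(self, payload, query_params):
        # Read the generation length from a query parameter, falling back
        # to a default when the parameter is absent.
        length = int(query_params.get("length", 50))
        return self.generate(payload.get("text", ""), length)
```

A request such as `?length=20` would then control the length of the generated text, with no change to the request body.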