Model Analyzer CLI
Use the -h or --help flag to view a description of the Model Analyzer's
command line interface.
Options like -q, --quiet and -v, --verbose are global and apply to all
Model Analyzer subcommands.
Model Analyzer Modes
The -m or --mode flag is global and is accessible to all subcommands. It tells Model Analyzer the context
in which it is being run. Model Analyzer currently supports two modes.
Online Mode
This is the default mode. When in this mode, Model Analyzer will operate to find
the optimal model configuration for an online inference scenario. By default in
online mode, the best model configuration will be the one that maximizes
throughput. If a latency budget is specified to the profile subcommand via
--latency-budget, then the best model configuration will be the one with the highest throughput within the given budget.
In online mode, the profile and report subcommands will generate summaries specific to online inference. See the example online summary and online detailed report.
Offline Mode
Offline mode (--mode=offline) tells Model Analyzer to find the
optimal model configuration for an offline inference scenario. By default
in offline mode, the best model configuration will be the one that maximizes throughput.
A minimum throughput can be specified to the profile subcommand
via --min-throughput to ignore any configuration that does not exceed a minimum number of inferences per second.
In offline mode, the profile and report subcommands will generate reports specific to offline inference. See the example offline summary and offline detailed report.
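The selection rule described for the two modes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Model Analyzer's actual implementation: the Measurement record, the pick_best helper, and the sample numbers are all hypothetical, latency is assumed to be in milliseconds, and both comparisons are assumed to be inclusive.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Measurement:
    """Hypothetical record for one profiled model configuration."""
    config_name: str
    throughput: float   # inferences per second
    latency_ms: float   # latency in milliseconds (assumed unit)


def pick_best(measurements: Iterable[Measurement],
              latency_budget_ms: Optional[float] = None,
              min_throughput: Optional[float] = None) -> Optional[Measurement]:
    """Return the highest-throughput measurement that passes the filters.

    latency_budget_ms mirrors the idea behind --latency-budget (online mode);
    min_throughput mirrors the idea behind --min-throughput (offline mode).
    Both comparisons are assumed to be inclusive.
    """
    candidates = [
        m for m in measurements
        if (latency_budget_ms is None or m.latency_ms <= latency_budget_ms)
        and (min_throughput is None or m.throughput >= min_throughput)
    ]
    # default=None avoids an exception when no configuration qualifies.
    return max(candidates, key=lambda m: m.throughput, default=None)


# Made-up numbers, purely for illustration.
SAMPLE = [
    Measurement("config_a", throughput=400.0, latency_ms=8.0),
    Measurement("config_b", throughput=650.0, latency_ms=14.0),
    Measurement("config_c", throughput=900.0, latency_ms=31.0),
]
```

With these sample values, calling pick_best(SAMPLE) with no filters selects config_c, while a 15 ms latency budget selects config_b, because config_c is over budget even though its throughput is higher.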
Model Analyzer Subcommands
The Model Analyzer's functionality is split across three separate subcommands. Each
subcommand has its own CLI and config options. Some options are required for
more than one subcommand (e.g. --export-path). See the Configuring Model
Analyzer section for more details on configuring each of these
subcommands.
Subcommand: profile
The profile subcommand begins by loading the "latest" checkpoint (if available) in
the checkpoint directory. It will then run model inferences using perf
analyzer, and collect metrics like throughput, latency and memory usage for
any measurements not present in the checkpoint.
Next, it sorts the models specified in the CLI or config YAML, using any objectives specified in the config YAML. Finally, it constructs summary PDFs using the top model configs for each model, as well as across models, if requested (see the Reports section for more details).
The profile subcommand can be run multiple times with different configurations if
the user would like to sort and filter the results using different objectives or
under different constraints.
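The checkpoint behavior described earlier in this section (only measurements missing from the checkpoint are collected) amounts to a set difference. The following is a minimal sketch, assuming a hypothetical key of (model config name, concurrency, batch size) per measurement. The real checkpoint format is not described here and will differ.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical key identifying one measurement:
# (model config name, client request concurrency, static batch size).
MeasurementKey = Tuple[str, int, int]


def pending_measurements(requested: Iterable[MeasurementKey],
                         checkpointed: Set[MeasurementKey]) -> List[MeasurementKey]:
    """Return the requested measurements that are not already in the
    checkpoint, preserving the requested order."""
    return [key for key in requested if key not in checkpointed]


# Illustrative values only.
requested = [
    ("resnet50_libtorch_config_1", 2, 8),
    ("resnet50_libtorch_config_1", 4, 8),
    ("resnet50_libtorch_config_2", 2, 8),
]
checkpointed = {("resnet50_libtorch_config_1", 2, 8)}

# Only the two measurements missing from the checkpoint remain to be run.
to_run = pending_measurements(requested, checkpointed)
```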
Use the following command to see the usage and argument descriptions for the subcommand.
$ model-analyzer profile -h
Depending on the command line or YAML config options provided, the profile
subcommand will either perform a manual or automatic search over perf analyzer
and model config file parameters. For each combination of model config
parameters (e.g. max batch size, dynamic batching, and instance count), it will
run tritonserver and perf analyzer instances with all the specified run
parameters (client request concurrency and static batch size). It will also
save the protobuf (.pbtxt) model config files corresponding to each combination
in the output model repository. Model Analyzer collects various metrics at
fixed time intervals during these perf analyzer runs. Each perf analyzer run
generates a single measurement, which corresponds to a row in the output
tables. After completing the runs for all configurations for each model, the
Model Analyzer will save the measurements it has collected into the checkpoint
directory. See the Checkpointing section for more details on checkpoints.
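To make the "one measurement per combination" idea concrete, the sketch below enumerates a small sweep in Python. It assumes an exhaustive (manual-style) search, and the parameter lists are illustrative values rather than defaults of the tool. The variable names are hypothetical and do not correspond to Model Analyzer internals.

```python
from itertools import product

# Illustrative sweep values (assumptions, not defaults of the tool).
instance_counts = [1, 2]             # model config parameter
max_queue_delays_us = [100]          # model config parameter
concurrencies = [2, 4, 8, 16, 32]    # run parameter
batch_sizes = [8, 16, 64]            # run parameter

# One model config variant per combination of model config parameters.
model_configs = list(product(instance_counts, max_queue_delays_us))

# One perf analyzer run (one measurement, one table row) per
# (model config, concurrency, batch size) combination.
runs = [
    {
        "instance_count": count,
        "max_queue_delay_us": delay,
        "concurrency": concurrency,
        "batch_size": batch_size,
    }
    for (count, delay), concurrency, batch_size
    in product(model_configs, concurrencies, batch_sizes)
]

print(len(model_configs), "model configs,", len(runs), "measurements")
```

Here two model config variants combined with five concurrency values and three batch sizes give 30 measurements for a single model, which shows how quickly a sweep grows as lists are extended.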
Examples
Some example profile commands are shown here. For a full example see the quick start section.
- Run auto config search on a model called resnet50_libtorch located in /home/model_repo
$ model-analyzer profile -m /home/model_repo --profile-models resnet50_libtorch
- Run auto config search on 2 models called resnet50_libtorch and vgg16_graphdef located in /home/model_repo and save checkpoints to checkpoints
$ model-analyzer profile -m /home/model_repo --profile-models resnet50_libtorch,vgg16_graphdef --checkpoint-directory=checkpoints
- Run auto config search on a model called resnet50_libtorch located in /home/model_repo, but change the repository where model config variants are stored to /home/output_repo
$ model-analyzer profile -m /home/model_repo --output-model-repository-path=/home/output_repo --profile-models resnet50_libtorch
- Run profile over manually defined configurations for the models classification_malaria_v1 and classification_chestxray_v1 located in /home/model_repo using the YAML config file
$ model-analyzer profile -f /path/to/config.yaml
The contents of config.yaml are shown below.
model_repository: /home/model_repo
run_config_search_disable: True
concurrency: [2, 4, 8, 16, 32]
batch_sizes: [8, 16, 64]
profile_models:
  classification_malaria_v1:
    model_config_parameters:
      instance_group:
        - kind: KIND_GPU
          count: [1, 2]
      dynamic_batching:
        max_queue_delay_microseconds: [100]
  classification_chestxray_v1:
    model_config_parameters:
      instance_group:
        - kind: KIND_GPU
          count: [1, 2]
      dynamic_batching:
        max_queue_delay_microseconds: [100]
- Apply objectives and constraints to sort and filter results in summary plots and tables using a YAML config file.
$ model-analyzer profile -f /path/to/config.yaml
The contents of config.yaml are shown below.
checkpoint_directory: ./checkpoints/
export_path: ./export_directory/
analysis_models:
  resnet50_libtorch:
    objectives:
      - perf_throughput
    constraints:
      perf_latency_p99:
        max: 15
  vgg16_graphdef:
    objectives:
      - gpu_used_memory
    constraints:
      perf_latency_p99:
        max: 15
Note: The checkpoint directory should be removed between consecutive runs of
the model-analyzer profile command when you do not want to include the results
from a previous profile.
Subcommand: analyze
Note: This subcommand has been deprecated and is slated for removal. Its functionality has been subsumed into the profile subcommand.
The analyze subcommand allows the user to create summaries and data tables
from the measurements taken using the profile subcommand. The YAML config file
can be used to set constraints and objectives used to sort and filter the
measurements, and order the model configs and models according to the metrics
collected. Use the following command to see the usage and argument descriptions
for the subcommand.
$ model-analyzer analyze -h
The analyze subcommand begins by loading the "latest" checkpoint available in
the checkpoint directory. Next, it sorts the models specified in the CLI or
config YAML, provided they contain measurements in the checkpoint, using the
objectives specified in the config YAML. Finally, it constructs summary PDFs
using the top model configs for each model, as well as across models, if
requested (See the Reports section for more details). The
analyze subcommand can be run multiple times with different configurations if
the user would like to sort and filter the results using different objectives or
under different constraints.
Examples
- Create summary and results for model resnet50_libtorch from the latest checkpoint in directory checkpoints.
$ model-analyzer analyze --analysis-models resnet50_libtorch --checkpoint-directory=checkpoints
- Create summaries and results for models resnet50_libtorch and vgg16_graphdef from the same checkpoint as above and export them to a directory called export_directory
$ model-analyzer analyze --analysis-models resnet50_libtorch,vgg16_graphdef -e export_directory --checkpoint-directory=checkpoints
- Apply objectives and constraints to sort and filter results in summary plots and tables using a YAML config file.
$ model-analyzer analyze -f /path/to/config.yaml
The contents of config.yaml are shown below.
checkpoint_directory: ./checkpoints/
export_path: ./export_directory/
analysis_models:
  resnet50_libtorch:
    objectives:
      - perf_throughput
    constraints:
      perf_latency_p99:
        max: 15
  vgg16_graphdef:
    objectives:
      - gpu_used_memory
    constraints:
      perf_latency_p99:
        max: 15
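How objectives and constraints like those in this config could interact is sketched below in Python. It is an illustration under assumptions, not the tool's implementation: the satisfies and rank helpers and the measurement values are hypothetical, bounds are treated as inclusive, and it is assumed that higher perf_throughput and lower gpu_used_memory are considered better.

```python
from typing import Dict, List

# Assumed direction of each objective: True means "higher is better".
HIGHER_IS_BETTER = {"perf_throughput": True, "gpu_used_memory": False}


def satisfies(metrics: Dict[str, float],
              constraints: Dict[str, Dict[str, float]]) -> bool:
    """Check every {'metric': {'max': x, 'min': y}} bound (inclusive)."""
    for name, bounds in constraints.items():
        value = metrics[name]
        if "max" in bounds and value > bounds["max"]:
            return False
        if "min" in bounds and value < bounds["min"]:
            return False
    return True


def rank(measurements: List[Dict[str, float]], objective: str,
         constraints: Dict[str, Dict[str, float]]) -> List[Dict[str, float]]:
    """Drop measurements that violate a constraint, then sort best-first."""
    kept = [m for m in measurements if satisfies(m, constraints)]
    return sorted(kept, key=lambda m: m[objective],
                  reverse=HIGHER_IS_BETTER[objective])


# Made-up measurements for illustration only.
measurements = [
    {"perf_throughput": 300.0, "perf_latency_p99": 9.0, "gpu_used_memory": 1200.0},
    {"perf_throughput": 520.0, "perf_latency_p99": 14.5, "gpu_used_memory": 1900.0},
    {"perf_throughput": 700.0, "perf_latency_p99": 22.0, "gpu_used_memory": 2400.0},
]
constraints = {"perf_latency_p99": {"max": 15}}

by_throughput = rank(measurements, "perf_throughput", constraints)
by_memory = rank(measurements, "gpu_used_memory", constraints)
```

The third measurement is filtered out by the latency constraint in both rankings. The two objectives then order the remaining measurements in opposite directions, which is why the choice of objective matters as much as the constraint.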
Subcommand: report
The report subcommand allows the user to create detailed reports on one or
more of the model configs that were profiled. Use the following command to see
the usage and argument descriptions for the subcommand.
$ model-analyzer report -h
Instead of showing only the top measurements from each config, as the summary reports do, Model Analyzer compiles and displays all the measurements for a given config in the detailed report (see the Reports section for more details).
Examples
- Generate detailed reports for the model configs of resnet50_libtorch called resnet50_libtorch_config_1 and resnet50_libtorch_config_2. Read from checkpoints and write to export_directory.
$ model-analyzer report --report-model-configs resnet50_libtorch_config_1,resnet50_libtorch_config_2 --checkpoint-directory checkpoints -e export_directory
- Generate detailed report for resnet50_libtorch_config_2 with a custom plot using a YAML config file
$ model-analyzer report -f /path/to/config.yaml
The contents of config.yaml are shown below.
checkpoint_directory: ./checkpoints/
export_path: "./export_directory"
report_model_configs:
  resnet50_libtorch_config_2:
    plots:
      throughput_v_memory:
        title: Throughput vs GPU Memory
        x_axis: gpu_used_memory
        y_axis: perf_throughput
        monotonic: True