You can upload existing logs to your Vertex AI TensorBoard instance, whether they come from training locally, from training outside of Vertex AI, from a colleague, from example logs, or from a different Vertex AI TensorBoard instance. Logs can be shared among multiple Vertex AI TensorBoard instances.
Vertex AI TensorBoard offers the Google Cloud CLI and the Vertex AI SDK for Python for uploading TensorBoard logs. You can upload logs from any environment that can connect to Google Cloud.
Vertex AI SDK for Python
Continuous monitoring
For continuous monitoring, call aiplatform.start_upload_tb_log at the beginning of the training.
The SDK opens a new thread for uploading. This thread monitors for new data in the directory, and uploads it to your Vertex AI TensorBoard experiment.
When training completes, call end_upload_tb_log to end the uploader thread.
Note that after you call start_upload_tb_log(), the uploader thread is kept alive even if an exception is thrown. To ensure the thread gets shut down, put any code after start_upload_tb_log() and before end_upload_tb_log() in a try statement, and call end_upload_tb_log() in finally.
Python
- tensorboard_experiment_name: The name of the TensorBoard experiment to upload to.
- logdir: The directory location to check for TensorBoard logs.
- tensorboard_id: The TensorBoard instance ID. If not set, the tensorboard_id in aiplatform.init is used.
- project: Your project ID. You can find your project ID in the Google Cloud console welcome page.
- location: The location where your TensorBoard instance is located.
- experiment_display_name: The display name of the experiment.
- run_name_prefix: If present, all runs created by this invocation will have their name prefixed by this value.
- description: A string description to assign to the experiment.
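The start, try, finally pattern described above could be wrapped in a small context manager. This is a minimal sketch, not the official sample: tb_upload_session and train_with_upload are hypothetical helper names, train stands in for your own training function, and the experiment name and log directory are placeholders. The sdk argument is expected to be the google.cloud.aiplatform module; it is passed in explicitly so the helper can be exercised without a Google Cloud connection.

```python
from contextlib import contextmanager


@contextmanager
def tb_upload_session(sdk, tensorboard_experiment_name, logdir, **optional):
    """Start the uploader thread and make sure it is ended on exit.

    `sdk` is expected to be the google.cloud.aiplatform module (or any
    object exposing start_upload_tb_log / end_upload_tb_log).
    `optional` may carry the other documented parameters, for example
    tensorboard_id, project, location, or run_name_prefix.
    """
    sdk.start_upload_tb_log(
        tensorboard_experiment_name=tensorboard_experiment_name,
        logdir=logdir,
        **optional,
    )
    try:
        yield
    finally:
        # Runs both on normal completion and when the body raises.
        sdk.end_upload_tb_log()


def train_with_upload(train):
    """Hypothetical entry point; `train` is your own training callable."""
    # Imported lazily so the helper above can be imported without the SDK.
    from google.cloud import aiplatform

    with tb_upload_session(
        aiplatform,
        tensorboard_experiment_name="my-experiment",  # placeholder
        logdir="logs/",  # placeholder
    ):
        train()
```

Because end_upload_tb_log() sits in the finally clause, the uploader thread is ended even when the training code raises, and the exception still propagates to the caller.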
One-time logging
Upload TensorBoard logs
Call aiplatform.upload_tb_log to perform a one-time upload of TensorBoard logs. This uploads existing data in the logdir and then returns immediately.
Python
- tensorboard_experiment_name: The name of the TensorBoard experiment.
- logdir: The directory location to check for TensorBoard logs.
- tensorboard_id: The TensorBoard instance ID. If not set, the tensorboard_id in aiplatform.init is used.
- project: Your project ID. You can find your project ID in the Google Cloud console welcome page.
- location: The location where your TensorBoard instance is located.
- experiment_display_name: The display name of the experiment.
- run_name_prefix: If present, all runs created by this invocation will have their name prefixed by this value.
- description: A string description to assign to the experiment.
- verbosity: Level of statistics verbosity, an integer. Supported values: 0 (no upload statistics are printed) and 1 (upload statistics are printed while uploading data; this is the default).
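Since a one-time upload only sends data that already exists in the logdir, it can help to confirm that event files are present before calling the SDK. The following is a rough sketch under stated assumptions: find_event_files and upload_existing_logs are hypothetical helpers, the check only applies to a local directory (not a Cloud Storage path), and it assumes the conventional events.out.tfevents.* file naming used by TensorBoard writers. As before, sdk is expected to be the google.cloud.aiplatform module.

```python
from pathlib import Path


def find_event_files(logdir):
    """Return TensorBoard event files found anywhere under a local logdir."""
    # Assumption: event files follow the usual events.out.tfevents.* naming.
    return sorted(Path(logdir).rglob("events.out.tfevents.*"))


def upload_existing_logs(sdk, tensorboard_experiment_name, logdir,
                         verbosity=1, **optional):
    """Upload a local logdir once; returns the number of event files found."""
    if verbosity not in (0, 1):
        raise ValueError("verbosity must be 0 or 1")
    event_files = find_event_files(logdir)
    if not event_files:
        raise FileNotFoundError(f"no TensorBoard event files under {logdir!r}")
    # `optional` may carry tensorboard_id, project, location, and so on.
    sdk.upload_tb_log(
        tensorboard_experiment_name=tensorboard_experiment_name,
        logdir=str(logdir),
        verbosity=verbosity,
        **optional,
    )
    return len(event_files)
```

With the real SDK this might be called as upload_existing_logs(aiplatform, "my-experiment", "logs/"), where both string values are placeholders.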
Upload profile logs
Call aiplatform.upload_tb_log to upload TensorBoard profile logs to an experiment.
Python
- experiment_name: The name of the TensorBoard experiment.
- logdir: The directory location to check for TensorBoard logs.
- project: Your project ID. You can find your project ID in the Google Cloud console welcome page.
- location: The location where your TensorBoard instance is located.
- run_name_prefix: For profile data, this is the run prefix. The directory format within LOG_DIR should match the following: /RUN_NAME_PREFIX/plugins/profile/YYYY_MM_DD_HH_SS/
- allowed_plugins: A list of additional plugins to allow. For uploading profile data, this should include "profile".
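The directory layout requirement above is easy to get wrong, so a pre-flight check before the upload may be useful. This sketch makes several assumptions: profile_sessions and upload_profile_logs are hypothetical helpers, the check works on a local logdir only, it does not validate the timestamp format of the session directory names, and the tensorboard_experiment_name keyword is taken from the one-time upload parameter list earlier on this page. The sdk argument is expected to be the google.cloud.aiplatform module.

```python
from pathlib import Path


def profile_sessions(logdir, run_name_prefix):
    """List session directories under LOG_DIR/RUN_NAME_PREFIX/plugins/profile/."""
    profile_root = Path(logdir) / run_name_prefix / "plugins" / "profile"
    if not profile_root.is_dir():
        return []
    return sorted(p for p in profile_root.iterdir() if p.is_dir())


def upload_profile_logs(sdk, tensorboard_experiment_name, logdir,
                        run_name_prefix, **optional):
    """Upload profile data once; returns the session directory names found."""
    sessions = profile_sessions(logdir, run_name_prefix)
    if not sessions:
        raise FileNotFoundError(
            f"expected {run_name_prefix}/plugins/profile/<session>/ under {logdir!r}"
        )
    sdk.upload_tb_log(
        tensorboard_experiment_name=tensorboard_experiment_name,
        logdir=str(logdir),
        run_name_prefix=run_name_prefix,
        allowed_plugins=["profile"],  # required for profile data
        **optional,  # for example project or location
    )
    return [p.name for p in sessions]
```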
gcloud CLI
- (Optional) Create a dedicated virtual environment to install the Vertex AI TensorBoard uploader Python CLI.
  python3 -m venv PATH/TO/VIRTUAL/ENVIRONMENT
  source PATH/TO/VIRTUAL/ENVIRONMENT/bin/activate
  PATH/TO/VIRTUAL/ENVIRONMENT: your dedicated virtual environment.
- Install the Vertex AI TensorBoard package through Vertex AI SDK.
  pip install -U pip
  pip install google-cloud-aiplatform[tensorboard]
- Upload TensorBoard logs
  - Time Series and Blob Data
    tb-gcp-uploader --tensorboard_resource_name \
      TENSORBOARD_RESOURCE_NAME \
      --logdir=LOG_DIR \
      --experiment_name=TB_EXPERIMENT_NAME \
      --one_shot=True
  - Profile Data
    tb-gcp-uploader \
      --tensorboard_resource_name TENSORBOARD_RESOURCE_NAME \
      --logdir=LOG_DIR \
      --experiment_name=TB_EXPERIMENT_NAME \
      --allowed_plugins="profile" --run_name_prefix=RUN_NAME_PREFIX \
      --one_shot=True
- TENSORBOARD_RESOURCE_NAME: The TensorBoard resource name used to fully identify the Vertex AI TensorBoard instance.
- LOG_DIR: The location of the event logs, which reside either in the local file system or in Cloud Storage.
- TB_EXPERIMENT_NAME: The name of the TensorBoard experiment, for example test-experiment.
- RUN_NAME_PREFIX: For profile data, this is the run prefix. The directory format within LOG_DIR should match the following: /RUN_NAME_PREFIX/plugins/profile/YYYY_MM_DD_HH_SS/
By default, the uploader CLI runs indefinitely, monitoring LOG_DIR for changes and uploading newly added logs. --one_shot=True disables this behavior. Run tb-gcp-uploader --help for more information.
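If you drive the uploader from a Python script rather than a shell, the command lines above can be assembled as an argument list. This is only a sketch: build_uploader_command and run_uploader are hypothetical helpers, and they use only the flags shown on this page. It assumes tb-gcp-uploader is already installed and on your PATH.

```python
import subprocess


def build_uploader_command(tensorboard_resource_name, logdir, experiment_name,
                           one_shot=True, run_name_prefix=None):
    """Build the tb-gcp-uploader argument list from the flags shown above."""
    cmd = [
        "tb-gcp-uploader",
        "--tensorboard_resource_name", tensorboard_resource_name,
        f"--logdir={logdir}",
        f"--experiment_name={experiment_name}",
    ]
    if run_name_prefix is not None:
        # Profile data: allow the profile plugin and pass the run prefix.
        cmd += ["--allowed_plugins=profile", f"--run_name_prefix={run_name_prefix}"]
    if one_shot:
        # Without this flag the uploader keeps monitoring the logdir.
        cmd.append("--one_shot=True")
    return cmd


def run_uploader(cmd):
    """Run the uploader; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues; leaving one_shot=False keeps the default behavior, so run_uploader would block while the uploader monitors the directory.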