Google Cloud Video Intelligence Streaming API enables real-time streaming analysis for live media and archived data. Supported features include:
- Live Label Detection
- Live Shot Change Detection
- Live Explicit Content Detection
- Live Object Tracking
The AIStreamer ingestion library provides a set of open source interfaces and example code for connecting to the Google Cloud Video Intelligence Streaming API. The library supports:
- File Streaming
- HTTP Live Streaming (HLS): an HTTP-based media streaming and communication protocol.
- Real Time Streaming Protocol (RTSP): a network control protocol for streaming media servers. It is used in conjunction with Real-time Transport Protocol (RTP) and RTP Control Protocol (RTCP).
- Real Time Messaging Protocol (RTMP): a protocol for streaming audio, video and data over the Internet.
AIStreamer ingestion library provides a Docker example. Please refer to individual documentation:
- Live Streaming: Instructions for supporting live streaming protocols (HLS, RTSP and RTMP) in Google Cloud Video Intelligence.
- File/Data Streaming: Instructions for supporting file/data streaming in Google Cloud Video Intelligence.
- Docker & K8s: Instructions for using our Docker example and Kubernetes deployment.
- Live Label Detection: Instructions for streaming label detection.
- Live Shot Change Detection: Instructions for streaming shot change detection.
- Live Explicit Content Detection: Instructions for streaming explicit content detection.
- Live Object Tracking: Instructions for streaming object tracking.
AIStreamer ingestion library includes the following three directories:
- client: Python & C++ client libraries for connecting to Cloud Video Intelligence.
- env: Docker example for AIStreamer ingestion.
- proto: Proto definitions and gRPC interface for Cloud Video Intelligence.
The open source AIStreamer ingestion library is based on the following Google-owned and third-party open source libraries.
- Bazel: A build and test tool with multi-language support.
- gRPC: A high performance, open-source universal RPC framework.
- Protobuf: Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data.
- rules_protobuf: Bazel rules for building protocol buffers and gRPC services.
- glog: C++ implementation of the Google logging module.
- gflags: C++ library that implements commandline flags processing.
- FFmpeg: A complete, cross-platform solution to record, convert and stream audio and video.
- GStreamer: Another cross-platform multimedia processing and streaming framework.
Google Cloud Video Intelligence Streaming API supports the following features, defined in video_intelligence_streaming.proto.
// Streaming video annotation feature.
enum StreamingFeature {
  // Unspecified.
  // Label detection. Detect objects, such as dog or flower.
  // Shot change detection.
  // Explicit content detection.
  // Object tracking.
}
The AIStreamer ingestion client sends StreamingAnnotateVideoRequest messages to the Google Cloud Video Intelligence Streaming API servers. The first StreamingAnnotateVideoRequest message must contain only a StreamingVideoConfig and must not include input_content. Live annotation results can optionally be stored in a customer-specified Google Cloud Storage (GCS) bucket; this storage option is disabled by default.
// The top-level message sent by the client for the `StreamingAnnotateVideo`
// method. Multiple `StreamingAnnotateVideoRequest` messages are sent.
// The first message must only contain a `StreamingVideoConfig` message.
// All subsequent messages must only contain `input_content` data.
message StreamingAnnotateVideoRequest {
  // *Required* The streaming request, which is either a streaming config or
  // video content.
  oneof streaming_request {
    // Provides information to the annotator, specifying how to process the
    // request. The first `StreamingAnnotateVideoRequest` message must only
    // contain a `video_config` message.
    StreamingVideoConfig video_config = 1;
    // The video data to be annotated. Chunks of video data are sequentially
    // sent in `StreamingAnnotateVideoRequest` messages. Except the initial
    // `StreamingAnnotateVideoRequest` message containing only
    // `video_config`, all subsequent `StreamingAnnotateVideoRequest`
    // messages must only contain the `input_content` field.
    bytes input_content = 2;
  }
}
// Provides information to the annotator that specifies how to process the
// request.
message StreamingVideoConfig {
  // Requested annotation feature.
  StreamingFeature feature = 1;
  // Config for requested annotation feature.
  oneof streaming_config {
    StreamingShotChangeDetectionConfig shot_change_detection_config = 2;
    // Config for LABEL_DETECTION.
    StreamingLabelDetectionConfig label_detection_config = 3;
    StreamingExplicitContentDetectionConfig explicit_content_detection_config =
        4;
    StreamingObjectTrackingConfig object_tracking_config = 5;
  }
  // Streaming storage option. By default: storage is disabled.
  StreamingStorageConfig storage_config = 30;
}
// Config for streaming storage option.
message StreamingStorageConfig {
  // Enable streaming storage. Default: false.
  bool enable_storage_annotation_result = 1;
  // GCS URI to store all annotation results for one client. The client should
  // specify this field as the top-level storage directory. Annotation results
  // of different sessions will be put into different sub-directories denoted
  // by project_name and session_id. All sub-directories will be auto-generated
  // by the program and will be made accessible to the client in the response
  // proto. URIs must be specified in the following format:
  // `gs://bucket-id/object-id`.
  // `bucket-id` should be a valid GCS bucket created by the client, and the
  // bucket permissions must also be configured properly. `object-id` can be
  // an arbitrary string that makes sense to the client. Other URI formats
  // will return an error and cause a GCS write failure.
  string annotation_result_storage_directory = 3;
}
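The message ordering (one config-only request first, then content-only requests) and the `gs://bucket-id/object-id` format can be sketched in Python. This is a minimal illustration under stated assumptions, not the library's client code: the dataclasses are hypothetical stand-ins for the generated protobuf classes, `check_storage_uri` and `request_stream` are names invented for this example, the `feature` value is a placeholder string, and the regular expression is only an assumed client-side pre-check (the server's validation may be stricter).

```python
import re
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class StreamingStorageConfig:
    """Stand-in for the proto message of the same name (not the generated class)."""
    enable_storage_annotation_result: bool = False
    annotation_result_storage_directory: str = ""


@dataclass
class StreamingVideoConfig:
    """Stand-in; `feature` is a plain placeholder string in this sketch."""
    feature: str
    storage_config: Optional[StreamingStorageConfig] = None


@dataclass
class StreamingAnnotateVideoRequest:
    """Models the oneof: exactly one of the two fields is set."""
    video_config: Optional[StreamingVideoConfig] = None
    input_content: Optional[bytes] = None


# Assumed pre-check: "gs://", a non-empty bucket id, "/", a non-empty object id.
_GCS_URI = re.compile(r"^gs://[^/]+/.+$")


def check_storage_uri(uri: str) -> bool:
    """Return True if `uri` has the shape gs://bucket-id/object-id."""
    return bool(_GCS_URI.match(uri))


def request_stream(
    config: StreamingVideoConfig, chunks: Iterable[bytes]
) -> Iterator[StreamingAnnotateVideoRequest]:
    """Yield the config-only request first, then one request per video chunk."""
    storage = config.storage_config
    if storage is not None and storage.enable_storage_annotation_result:
        if not check_storage_uri(storage.annotation_result_storage_directory):
            raise ValueError("expected a URI like gs://bucket-id/object-id")
    # First message: configuration only, no video bytes.
    yield StreamingAnnotateVideoRequest(video_config=config)
    # All later messages: video bytes only.
    for chunk in chunks:
        yield StreamingAnnotateVideoRequest(input_content=chunk)
```

Because `request_stream` is a generator, the URI check runs when the first request is pulled from it, not when the function is called.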
AIStreamer ingestion client receives StreamingAnnotateVideoResponse from Google Cloud Video Intelligence Streaming API servers.
// `StreamingAnnotateVideoResponse` is the only message returned to the client
// by `StreamingAnnotateVideo`. A series of zero or more
// `StreamingAnnotateVideoResponse` messages are streamed back to the client.
message StreamingAnnotateVideoResponse {
  // If set, returns a [google.rpc.Status][] message that
  // specifies the error for the operation.
  google.rpc.Status error = 1;
  // Streaming annotation results.
  StreamingVideoAnnotationResults annotation_results = 2;
}
// Streaming annotation results corresponding to a portion of the video
// that is currently being processed.
message StreamingVideoAnnotationResults {
  // Shot annotation results. Each shot is represented as a video segment.
  repeated VideoSegment shot_annotations = 1;
  // Label annotation results.
  repeated LabelAnnotation label_annotations = 2;
  // Explicit content detection results.
  ExplicitContentAnnotation explicit_annotation = 3;
  // Object tracking results.
  repeated ObjectTrackingAnnotation object_annotations = 4;
}
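One way a client might consume the response stream is to check `error` on every message before reading `annotation_results`. The following Python sketch illustrates that pattern only. The dataclasses are hypothetical stand-ins for the generated protobuf classes, annotation entries are treated as opaque objects, and `summarize` is a name invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class Status:
    """Stand-in for google.rpc.Status (only the fields used here)."""
    code: int
    message: str


@dataclass
class StreamingVideoAnnotationResults:
    """Stand-in; individual annotations are opaque objects in this sketch."""
    shot_annotations: List[object] = field(default_factory=list)
    label_annotations: List[object] = field(default_factory=list)
    explicit_annotation: Optional[object] = None
    object_annotations: List[object] = field(default_factory=list)


@dataclass
class StreamingAnnotateVideoResponse:
    """Stand-in for the response message: an error or a batch of results."""
    error: Optional[Status] = None
    annotation_results: Optional[StreamingVideoAnnotationResults] = None


def summarize(responses: Iterable[StreamingAnnotateVideoResponse]) -> Dict[str, int]:
    """Count annotations per kind; raise on the first response carrying an error."""
    counts = {"shots": 0, "labels": 0, "explicit": 0, "objects": 0}
    for response in responses:
        if response.error is not None:
            raise RuntimeError(f"{response.error.code}: {response.error.message}")
        results = response.annotation_results
        if results is None:
            continue  # nothing to report in this message
        counts["shots"] += len(results.shot_annotations)
        counts["labels"] += len(results.label_annotations)
        counts["objects"] += len(results.object_annotations)
        if results.explicit_annotation is not None:
            counts["explicit"] += 1
    return counts
```

Raising on the first error is a design choice for the sketch; a real client might instead log the error and decide whether to reconnect.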
The AIStreamer ingestion client uses a bidirectional streaming gRPC interface to talk to the Google Cloud Video Intelligence Streaming API servers. The bidirectional gRPC streaming interface is defined as StreamingVideoIntelligenceService.
// Service that implements streaming Google Cloud Video Intelligence API.
service StreamingVideoIntelligenceService {
  // Performs video annotation with bidirectional streaming: emitting results
  // while sending video/audio bytes.
  // This method is only available via the gRPC API (not REST).
  rpc StreamingAnnotateVideo(stream StreamingAnnotateVideoRequest)
      returns (stream StreamingAnnotateVideoResponse);
}
The AIStreamer ingestion client must use two threads (a sender thread and a receiver thread) to support the bidirectional streaming gRPC interface. For Python and C++ examples related to AIStreamer, see the client directory. For basic gRPC concepts and how they work, see the gRPC documentation.
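The sender/receiver split can be illustrated with a small Python sketch. It does not call gRPC and is not taken from the client directory: the two queues stand in for the two directions of the stream, and `fake_server` and `run_session` are hypothetical names used only to make the example runnable on its own.

```python
import queue
import threading
from typing import Iterable, List

_END = object()  # sentinel marking the end of a stream direction


def fake_server(requests: queue.Queue, responses: queue.Queue) -> None:
    """Hypothetical stand-in for the remote service: one reply per request."""
    while True:
        item = requests.get()
        if item is _END:
            responses.put(_END)
            return
        responses.put(f"annotated {len(item)} bytes")


def run_session(chunks: Iterable[bytes]) -> List[str]:
    """Send chunks on one thread while collecting replies on another."""
    requests: queue.Queue = queue.Queue(maxsize=8)  # bounded, so the sender can block
    responses: queue.Queue = queue.Queue()
    received: List[str] = []

    def sender() -> None:
        # Sender thread: pushes video chunks, then signals end of input.
        for chunk in chunks:
            requests.put(chunk)
        requests.put(_END)

    def receiver() -> None:
        # Receiver thread: drains replies until the server closes the stream.
        while True:
            item = responses.get()
            if item is _END:
                return
            received.append(item)

    threads = [
        threading.Thread(target=fake_server, args=(requests, responses)),
        threading.Thread(target=sender),
        threading.Thread(target=receiver),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received
```

Keeping the receiver on its own thread means results can be read while video bytes are still being sent, which is the point of the bidirectional interface. In a real client the queues and `fake_server` would be replaced by the gRPC stream itself.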