Add separate example DAGs and system tests for google cloud speech (#…
ephraimbuddy committed May 10, 2020
1 parent 9bb91ef commit 493b685
Showing 11 changed files with 328 additions and 125 deletions.
@@ -0,0 +1,69 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import os

from airflow import models
from airflow.providers.google.cloud.operators.speech_to_text import CloudSpeechToTextRecognizeSpeechOperator
from airflow.providers.google.cloud.operators.text_to_speech import CloudTextToSpeechSynthesizeOperator
from airflow.utils import dates

GCP_PROJECT_ID = os.environ.get("GCP_PROJECT_ID", "example-project")
BUCKET_NAME = os.environ.get("GCP_SPEECH_TO_TEXT_TEST_BUCKET", "gcp-speech-to-text-test-bucket")

# [START howto_operator_speech_to_text_gcp_filename]
FILENAME = "gcp-speech-test-file"
# [END howto_operator_speech_to_text_gcp_filename]

# [START howto_operator_text_to_speech_api_arguments]
INPUT = {"text": "Sample text for demo purposes"}
VOICE = {"language_code": "en-US", "ssml_gender": "FEMALE"}
AUDIO_CONFIG = {"audio_encoding": "LINEAR16"}
# [END howto_operator_text_to_speech_api_arguments]

# [START howto_operator_speech_to_text_api_arguments]
CONFIG = {"encoding": "LINEAR16", "language_code": "en_US"}
AUDIO = {"uri": "gs://{bucket}/{object}".format(bucket=BUCKET_NAME, object=FILENAME)}
# [END howto_operator_speech_to_text_api_arguments]

default_args = {"start_date": dates.days_ago(1)}

with models.DAG(
    "example_gcp_speech_to_text",
    default_args=default_args,
    schedule_interval=None,  # Override to match your needs
    tags=['example'],
) as dag:
    text_to_speech_synthesize_task = CloudTextToSpeechSynthesizeOperator(
        project_id=GCP_PROJECT_ID,
        input_data=INPUT,
        voice=VOICE,
        audio_config=AUDIO_CONFIG,
        target_bucket_name=BUCKET_NAME,
        target_filename=FILENAME,
        task_id="text_to_speech_synthesize_task",
    )
    # [START howto_operator_speech_to_text_recognize]
    speech_to_text_recognize_task2 = CloudSpeechToTextRecognizeSpeechOperator(
        config=CONFIG,
        audio=AUDIO,
        task_id="speech_to_text_recognize_task"
    )
    # [END howto_operator_speech_to_text_recognize]

    text_to_speech_synthesize_task >> speech_to_text_recognize_task2
@@ -0,0 +1,67 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import os

from airflow import models
from airflow.providers.google.cloud.operators.text_to_speech import CloudTextToSpeechSynthesizeOperator
from airflow.utils import dates

GCP_PROJECT_ID = os.environ.get("GCP_PROJECT_ID", "example-project")
BUCKET_NAME = os.environ.get("GCP_TEXT_TO_SPEECH_BUCKET", "gcp-text-to-speech-test-bucket")

# [START howto_operator_text_to_speech_gcp_filename]
FILENAME = "gcp-speech-test-file"
# [END howto_operator_text_to_speech_gcp_filename]

# [START howto_operator_text_to_speech_api_arguments]
INPUT = {"text": "Sample text for demo purposes"}
VOICE = {"language_code": "en-US", "ssml_gender": "FEMALE"}
AUDIO_CONFIG = {"audio_encoding": "LINEAR16"}
# [END howto_operator_text_to_speech_api_arguments]

default_args = {"start_date": dates.days_ago(1)}

with models.DAG(
    "example_gcp_text_to_speech",
    default_args=default_args,
    schedule_interval=None,  # Override to match your needs
    tags=['example'],
) as dag:

    # [START howto_operator_text_to_speech_synthesize]
    text_to_speech_synthesize_task = CloudTextToSpeechSynthesizeOperator(
        project_id=GCP_PROJECT_ID,
        input_data=INPUT,
        voice=VOICE,
        audio_config=AUDIO_CONFIG,
        target_bucket_name=BUCKET_NAME,
        target_filename=FILENAME,
        task_id="text_to_speech_synthesize_task",
    )
    text_to_speech_synthesize_task2 = CloudTextToSpeechSynthesizeOperator(
        input_data=INPUT,
        voice=VOICE,
        audio_config=AUDIO_CONFIG,
        target_bucket_name=BUCKET_NAME,
        target_filename=FILENAME,
        task_id="text_to_speech_synthesize_task2",
    )
    # [END howto_operator_text_to_speech_synthesize]

    text_to_speech_synthesize_task >> text_to_speech_synthesize_task2
@@ -16,41 +16,29 @@
 # specific language governing permissions and limitations
 # under the License.

-"""
-Example Airflow DAG that runs speech synthesizing and stores output in Google Cloud Storage
-This DAG relies on the following OS environment variables
-* GCP_PROJECT_ID - Google Cloud Platform project for the Cloud SQL instance.
-* GCP_SPEECH_TEST_BUCKET - Name of the bucket in which the output file should be stored.
-"""
-
 import os

 from airflow import models
-from airflow.providers.google.cloud.operators.speech_to_text import CloudSpeechToTextRecognizeSpeechOperator
 from airflow.providers.google.cloud.operators.text_to_speech import CloudTextToSpeechSynthesizeOperator
 from airflow.providers.google.cloud.operators.translate_speech import CloudTranslateSpeechOperator
 from airflow.utils import dates

 GCP_PROJECT_ID = os.environ.get("GCP_PROJECT_ID", "example-project")
-BUCKET_NAME = os.environ.get("GCP_SPEECH_TEST_BUCKET", "gcp-speech-test-bucket")
+BUCKET_NAME = os.environ.get("GCP_TRANSLATE_SPEECH_TEST_BUCKET", "gcp-translate-speech-test-bucket")

-# [START howto_operator_text_to_speech_gcp_filename]
+# [START howto_operator_translate_speech_gcp_filename]
 FILENAME = "gcp-speech-test-file"
-# [END howto_operator_text_to_speech_gcp_filename]
+# [END howto_operator_translate_speech_gcp_filename]

 # [START howto_operator_text_to_speech_api_arguments]
 INPUT = {"text": "Sample text for demo purposes"}
 VOICE = {"language_code": "en-US", "ssml_gender": "FEMALE"}
 AUDIO_CONFIG = {"audio_encoding": "LINEAR16"}
 # [END howto_operator_text_to_speech_api_arguments]

-# [START howto_operator_speech_to_text_api_arguments]
+# [START howto_operator_translate_speech_arguments]
 CONFIG = {"encoding": "LINEAR16", "language_code": "en_US"}
 AUDIO = {"uri": "gs://{bucket}/{object}".format(bucket=BUCKET_NAME, object=FILENAME)}
-# [END howto_operator_speech_to_text_api_arguments]
-
-# [START howto_operator_translate_speech_arguments]
 TARGET_LANGUAGE = 'pl'
 FORMAT = 'text'
 MODEL = 'base'
@@ -60,13 +48,11 @@
 default_args = {"start_date": dates.days_ago(1)}

 with models.DAG(
-    "example_gcp_speech",
+    "example_gcp_translate_speech",
     default_args=default_args,
     schedule_interval=None,  # Override to match your needs
     tags=['example'],
 ) as dag:
-
-    # [START howto_operator_text_to_speech_synthesize]
     text_to_speech_synthesize_task = CloudTextToSpeechSynthesizeOperator(
         project_id=GCP_PROJECT_ID,
         input_data=INPUT,
@@ -76,16 +62,6 @@
         target_filename=FILENAME,
         task_id="text_to_speech_synthesize_task",
     )
-    # [END howto_operator_text_to_speech_synthesize]
-
-    # [START howto_operator_speech_to_text_recognize]
-    speech_to_text_recognize_task = CloudSpeechToTextRecognizeSpeechOperator(
-        project_id=GCP_PROJECT_ID, config=CONFIG, audio=AUDIO, task_id="speech_to_text_recognize_task"
-    )
-    # [END howto_operator_speech_to_text_recognize]
-
-    text_to_speech_synthesize_task >> speech_to_text_recognize_task
-
     # [START howto_operator_translate_speech]
     translate_speech_task = CloudTranslateSpeechOperator(
         project_id=GCP_PROJECT_ID,
@@ -97,6 +73,14 @@
         model=MODEL,
         task_id='translate_speech_task'
     )
+    translate_speech_task2 = CloudTranslateSpeechOperator(
+        audio=AUDIO,
+        config=CONFIG,
+        target_language=TARGET_LANGUAGE,
+        format_=FORMAT,
+        source_language=SOURCE_LANGUAGE,
+        model=MODEL,
+        task_id='translate_speech_task2'
+    )
     # [END howto_operator_translate_speech]
-
-    text_to_speech_synthesize_task >> translate_speech_task
+    text_to_speech_synthesize_task >> translate_speech_task >> translate_speech_task2
81 changes: 81 additions & 0 deletions docs/howto/operator/gcp/speech_to_text.rst
@@ -0,0 +1,81 @@
.. Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements. See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership. The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License. You may obtain a copy of the License at

.. http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied. See the License for the
   specific language governing permissions and limitations
   under the License.

Google Cloud Speech to Text Operators
=====================================

Prerequisite Tasks
------------------

.. include:: _partials/prerequisite_tasks.rst

.. _howto/operator:CloudSpeechToTextRecognizeSpeechOperator:

CloudSpeechToTextRecognizeSpeechOperator
----------------------------------------

Recognizes speech in audio input and returns text.

For parameter definitions, see
:class:`~airflow.providers.google.cloud.operators.speech_to_text.CloudSpeechToTextRecognizeSpeechOperator`.
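
To illustrate what working with the recognized text can look like downstream, here is a minimal sketch that pulls transcripts out of a recognition result. It assumes the result is already available as a plain dict shaped like the Speech-to-Text ``RecognizeResponse`` JSON (``results``, then ``alternatives``, then ``transcript``). The helper ``extract_transcripts`` and the sample payload are hypothetical and are not part of Airflow or the Google client library.

```python
from typing import Any, Dict, List


def extract_transcripts(response: Dict[str, Any]) -> List[str]:
    """Return the first alternative's transcript for each result that has one.

    Hypothetical helper for illustration; not part of Airflow.
    """
    transcripts = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            # The API orders alternatives with the most probable one first.
            transcripts.append(alternatives[0].get("transcript", ""))
    return transcripts


# Made-up payload, for illustration only.
sample_response = {
    "results": [
        {"alternatives": [{"transcript": "Sample text for demo purposes", "confidence": 0.98}]},
        {"alternatives": []},
    ]
}
print(extract_transcripts(sample_response))
```

Results without any alternatives are skipped, so the returned list can be shorter than ``results``.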

Arguments
"""""""""

The ``config`` and ``audio`` arguments must be dicts or objects of the corresponding classes from the
``google.cloud.speech_v1.types`` module.

For more information, see: https://googleapis.github.io/google-cloud-python/latest/speech/gapic/v1/api.html#google.cloud.speech_v1.SpeechClient.recognize

.. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_speech_to_text.py
    :language: python
    :start-after: [START howto_operator_speech_to_text_api_arguments]
    :end-before: [END howto_operator_speech_to_text_api_arguments]
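
As a minimal sketch of the dict form of these two arguments, the snippet below builds them in plain Python from a bucket and object name. The helper ``build_recognize_args`` is hypothetical and exists only for illustration; the keys mirror the ones used in the example DAG, and the language code is passed as a BCP-47 tag.

```python
from typing import Dict, Tuple


def build_recognize_args(
    bucket: str,
    object_name: str,
    language_code: str,
    encoding: str = "LINEAR16",
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (config, audio) dicts for an audio file stored in Cloud Storage.

    Hypothetical helper for illustration; not part of Airflow.
    """
    if not bucket or not object_name:
        raise ValueError("bucket and object_name must be non-empty")
    config = {"encoding": encoding, "language_code": language_code}
    # The audio is referenced by its Cloud Storage URI rather than sent inline.
    audio = {"uri": f"gs://{bucket}/{object_name}"}
    return config, audio


config, audio = build_recognize_args(
    bucket="gcp-speech-to-text-test-bucket",
    object_name="gcp-speech-test-file",
    language_code="en-US",
)
print(config)
print(audio)
```

The resulting dicts have the same shape as ``CONFIG`` and ``AUDIO`` in the example DAG and could be passed as the operator's ``config`` and ``audio`` arguments.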

The file name used in the ``audio`` URI is a simple string:

.. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_speech_to_text.py
    :language: python
    :start-after: [START howto_operator_speech_to_text_gcp_filename]
    :end-before: [END howto_operator_speech_to_text_gcp_filename]

Using the operator
""""""""""""""""""

.. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_speech_to_text.py
    :language: python
    :dedent: 4
    :start-after: [START howto_operator_speech_to_text_recognize]
    :end-before: [END howto_operator_speech_to_text_recognize]

Templating
""""""""""

.. literalinclude:: ../../../../airflow/providers/google/cloud/operators/speech_to_text.py
    :language: python
    :dedent: 4
    :start-after: [START gcp_speech_to_text_synthesize_template_fields]
    :end-before: [END gcp_speech_to_text_synthesize_template_fields]

Reference
---------

For further information, look at:

* `Client Library Documentation <https://googleapis.github.io/google-cloud-python/latest/speech/>`__
* `Product Documentation <https://cloud.google.com/speech/>`__