Custom Task Template Example

Keywords: AWS Augmented AI, a2i, human in loop, HIL, Task Template

The key to customizing A2I is defining the Task Template, because the Task Template defines how the HIL task input data maps to the output data and, most importantly, the UI that workers see.

1. Preparation

First, manually create a Private Workforce Team in the Ground Truth section of the AWS Console, and invite at least one worker to join the team.
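The deployment script later in this section needs the ARN of this team (its worker_team_arn setting). As a minimal sketch, assuming your credentials are allowed to call SageMaker's list_workteams API, the ARN could be looked up by team name instead of being copied from the console. The helper name find_workteam_arn and the team name in the usage comment are illustrative and not part of the original scripts.

```python
from typing import Optional


def find_workteam_arn(sm_client, team_name: str) -> Optional[str]:
    """Return the ARN of the work team named ``team_name``, or None if absent.

    ``sm_client`` is expected to be a boto3 SageMaker client, i.e. the object
    returned by ``boto3.client("sagemaker")``.
    """
    kwargs = {"MaxResults": 100}
    while True:
        response = sm_client.list_workteams(**kwargs)
        for team in response.get("Workteams", []):
            if team["WorkteamName"] == team_name:
                return team["WorkteamArn"]
        # follow pagination until there are no more pages
        next_token = response.get("NextToken")
        if not next_token:
            return None
        kwargs["NextToken"] = next_token


# Usage (requires AWS credentials; the team name is a placeholder):
# import boto3
# arn = find_workteam_arn(boto3.client("sagemaker"), "my-private-team")
```

The client is passed in as an argument so the lookup can be exercised with a stub object before pointing it at a real account.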

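2. Develop the Task Template

Prepare the Task Input data. It needs to contain the input / output of your computation, along with any reference information the computation needs. For example: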

{
    "Pairs": [
        {"row": "Sanhe", "prediction": 0.1544346809387207},
        {"row": "Vignesh", "prediction": 0.4938497543334961},
        {"row": "Sharanya", "prediction": 0.23486430943012238}
    ]
}
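In practice this payload is usually generated from your computation's output rather than written by hand. A minimal sketch, assuming the rows and predictions arrive as two parallel lists; build_task_input is a hypothetical helper, and only the Pairs / row / prediction key names come from the example above.

```python
import json
from typing import Dict, List, Sequence


def build_task_input(
    rows: Sequence[str],
    predictions: Sequence[float],
) -> Dict[str, List[dict]]:
    """Zip rows and predictions into the {"Pairs": [...]} structure."""
    if len(rows) != len(predictions):
        raise ValueError("rows and predictions must have the same length")
    return {
        "Pairs": [
            {"row": row, "prediction": float(prediction)}
            for row, prediction in zip(rows, predictions)
        ]
    }


task_input = build_task_input(["Sanhe", "Vignesh"], [0.15, 0.49])

# The JSON string form can be saved as task.json for local preview, or passed
# as InputContent when starting a human loop.
task_input_json = json.dumps(task_input, indent=4)
```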

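Develop the UI HTML template so that this information is laid out in a human-friendly way.

Run the local development script below to render the HTML UI locally, open it in a browser to preview it, and iterate until you are satisfied.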
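The template file itself is not reproduced in this section. As a rough sketch of what such a template could look like for the Pairs input above, the Python snippet below holds a small Liquid template in a string and can write it to disk. The crowd-form and crowd-checkbox elements and the crowd-html-elements.js script come from the Crowd HTML Elements that A2I supports; the table layout, the checkbox names, and the file name in the usage note are assumptions made for illustration.

```python
from pathlib import Path

# A small Liquid template: one table row per item in task.input.Pairs, with a
# checkbox the reviewer ticks when the prediction looks correct.
TASK_TEMPLATE = """\
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>

<crowd-form>
  <h3>Review each prediction and tick the ones that are correct</h3>
  <table>
    <tr><th>Row</th><th>Prediction</th><th>Correct?</th></tr>
    {% for pair in task.input.Pairs %}
    <tr>
      <td>{{ pair.row }}</td>
      <td>{{ pair.prediction }}</td>
      <td><crowd-checkbox name="correct-{{ forloop.index0 }}"></crowd-checkbox></td>
    </tr>
    {% endfor %}
  </table>
</crowd-form>
"""


def write_template(path: Path) -> Path:
    """Write the example template to ``path`` and return the path."""
    path.write_text(TASK_TEMPLATE, encoding="utf-8")
    return path
```

Calling write_template(Path("task-example.liquid")) would produce a file that a local Liquid renderer can consume for preview; the file name is a placeholder.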

# -*- coding: utf-8 -*-

"""
This is a utility script that lets you debug your AWS Augmented AI Task
Template locally.

Copyright (c) 2021-2022, Sanhe Hu.
License: MIT (see LICENSE for details)

Pre-requisite:

- ``Python >= 3.7``, ``python_liquid``, ``python_box``

.. code-block:: bash

    pip install python_liquid
    pip install python_box
"""

import json
from pathlib import Path
from box import Box
from liquid import Template

dir_here = Path(__file__).parent
path_template = Path(dir_here, "task-backup.liquid")
path_data = Path(dir_here, "task.json")
path_html = Path(dir_here, "task.html")

# read the liquid template
template = Template(path_template.read_text(encoding="utf-8"))

# read the task input data
input_data = json.loads(path_data.read_text(encoding="utf-8"))

# wrap the task data in a Box so it supports dot notation (task.input.Pairs)
task = Box({"input": input_data})

# render the template with the task data
content = template.render(task=task)

# write the rendered HTML to a file
path_html.write_text(content, encoding="utf-8")
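The script above stops once task.html has been written. To cover the browser preview step as well, a small addition like the following could be appended to it. It uses only the standard library webbrowser module; the function names are illustrative.

```python
import webbrowser
from pathlib import Path


def to_file_uri(path: Path) -> str:
    """Convert a local path to an absolute file:// URI."""
    return path.resolve().as_uri()


def open_preview(path: Path) -> bool:
    """Open the rendered HTML in the default browser.

    Returns whatever ``webbrowser.open`` reports (False if no browser could
    be launched).
    """
    return webbrowser.open(to_file_uri(path))


# Usage, at the end of the render script:
# open_preview(path_html)
```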

3. Deploy the entire Human Review Workflow

Use the A2I-as-code script to deploy the Task Template and the Flow Definition. Then run the start_human_loop function to trigger an HIL, log in to the Workforce UI to review and submit the task, and finally check the output in the S3 bucket.

# -*- coding: utf-8 -*-

"""
Augmented AI as Code script. Lets you quickly spin up the resources required
to play with A2I, and then easily clean up all resources with one click.

Prerequisite:

- Create private human workforce:
    - go to https://console.aws.amazon.com/sagemaker/groundtruth?#/labeling-workforces, create a private team
    - invite new workers, validate their email

Ref:

- boto3 Sagemaker: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html
- boto3 a2i: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker-a2i-runtime.html
- boto3 iam: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html
"""

import json
import boto3
import uuid
from pathlib import Path

dir_here = Path(__file__).parent


# ------------------------------------------------------------------------------
# Configuration
# ------------------------------------------------------------------------------
class Config:
    aws_profile = None
    aws_region = "us-east-1"
    project_name = "a2i_poc"
    hil_output_loc = "s3://aws-data-lab-sanhe-for-everything/2022-02-23-a2i"  # DO NOT end with "/"
    worker_team_arn = "arn:aws:sagemaker:us-east-1:669508176277:workteam/private-crowd/sanhe-labeling-workforce"

    # NOTE: the ARN properties below read the module-level ``account_id``,
    # which is resolved via STS right after the boto3 session is created.

    @property
    def project_name_slug(self):
        return self.project_name.replace("_", "-")

    @property
    def a2i_execution_role_name(self):
        return f"{self.project_name_slug}-a2i-execution-role"

    @property
    def a2i_execution_role_arn(self):
        return f"arn:aws:iam::{account_id}:role/{self.a2i_execution_role_name}"

    @property
    def a2i_execution_role_console_url(self):
        return f"https://console.aws.amazon.com/iamv2/home?region={self.aws_region}#/roles/details/{self.a2i_execution_role_name}?section=permissions"

    @property
    def a2i_execution_role_policy_name(self):
        return f"{self.a2i_execution_role_name}-inline-policy"

    @property
    def task_ui_name(self):
        return f"{self.project_name_slug}-task-ui"

    @property
    def task_ui_arn(self):
        return f"arn:aws:sagemaker:{self.aws_region}:{account_id}:human-task-ui/{self.task_ui_name}"

    @property
    def task_ui_console_url(self):
        return f"https://console.aws.amazon.com/a2i/home?region={self.aws_region}#/worker-task-templates/{self.task_ui_name}"

    @property
    def flow_definition_name(self):
        return f"{self.project_name_slug}-flow-def"

    @property
    def flow_definition_arn(self):
        return f"arn:aws:sagemaker:{self.aws_region}:{account_id}:flow-definition/{self.flow_definition_name}"

    @property
    def flow_definition_console_url(self):
        return f"https://console.aws.amazon.com/a2i/home?region={self.aws_region}#/human-review-workflows/{self.flow_definition_name}"

    @property
    def hil_output_uri(self):
        return f"{self.hil_output_loc}/{self.flow_definition_name}"


config = Config()

boto_ses = boto3.session.Session(
    profile_name=config.aws_profile,
    region_name=config.aws_region,
)

# create every client from the same session so that the configured profile
# and region apply to all of them
sm_client = boto_ses.client("sagemaker")
a2i_client = boto_ses.client("sagemaker-a2i-runtime")
iam_client = boto_ses.client("iam")
sts_client = boto_ses.client("sts")

account_id = sts_client.get_caller_identity()["Account"]


class Tag:
    project_name = dict(Key="ProjectName", Value=config.project_name)


def create_a2i_execution_role():
    # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html#IAM.Client.create_role
    response = iam_client.create_role(
        RoleName=config.a2i_execution_role_name,
        AssumeRolePolicyDocument=json.dumps({
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "sagemaker.amazonaws.com"
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        }),
        Tags=[Tag.project_name],
    )

    # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html#IAM.Client.put_role_policy
    response = iam_client.put_role_policy(
        RoleName=config.a2i_execution_role_name,
        PolicyName=config.a2i_execution_role_policy_name,
        PolicyDocument=json.dumps({
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Action": [
                        "s3:ListBucket",
                        "s3:GetObject",
                        "s3:PutObject",
                        "s3:DeleteObject"
                    ],
                    "Resource": [
                        "arn:aws:s3:::*"
                    ]
                }
            ]
        }),
    )

    print(f"Successfully created {config.a2i_execution_role_arn}")
    print(f"Preview at {config.a2i_execution_role_console_url}")


def delete_a2i_execution_role():
    # the inline policy has to be removed before the role can be deleted
    # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html#IAM.Client.delete_role_policy
    response = iam_client.delete_role_policy(
        RoleName=config.a2i_execution_role_name,
        PolicyName=config.a2i_execution_role_policy_name,
    )

    # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html#IAM.Client.delete_role
    response = iam_client.delete_role(
        RoleName=config.a2i_execution_role_name
    )

    print(f"Successfully deleted {config.a2i_execution_role_arn}")
    print(f"Verify at {config.a2i_execution_role_console_url}")


def create_human_task_ui():
    # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_human_task_ui
    liquid_template = Path(dir_here, "task-backup.liquid").read_text(encoding="utf-8")
    response = sm_client.create_human_task_ui(
        HumanTaskUiName=config.task_ui_name,
        UiTemplate={
            "Content": liquid_template
        },
        Tags=[Tag.project_name, ]
    )

    print(f"Successfully created {config.task_ui_arn}")
    print(f"Preview at {config.task_ui_console_url}")


def delete_human_task_ui():
    # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.delete_human_task_ui
    response = sm_client.delete_human_task_ui(
        HumanTaskUiName=config.task_ui_name
    )

    print(f"Successfully deleted {config.task_ui_arn}")
    print(f"Verify at {config.task_ui_console_url}")


def create_flow_definition():
    # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_flow_definition
    response = sm_client.create_flow_definition(
        FlowDefinitionName=config.flow_definition_name,
        HumanLoopConfig={
            "WorkteamArn": config.worker_team_arn,
            "HumanTaskUiArn": config.task_ui_arn,
            "TaskTitle": config.task_ui_name,
            "TaskDescription": f"{config.task_ui_name} description",
            "TaskCount": 2,  # number of distinct workers who review each task
            "TaskTimeLimitInSeconds": 3600,
        },
        OutputConfig={
            "S3OutputPath": config.hil_output_uri,
        },
        RoleArn=config.a2i_execution_role_arn,
        Tags=[Tag.project_name, ],
    )

    print(f"Successfully created {config.flow_definition_arn}")
    print(f"Preview at {config.flow_definition_console_url}")


def delete_flow_definition():
    # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.delete_flow_definition
    response = sm_client.delete_flow_definition(
        FlowDefinitionName=config.flow_definition_name
    )

    print(f"Successfully deleted {config.flow_definition_arn}")
    print(f"Verify at {config.flow_definition_console_url}")


def start_human_loop():
    input_content = {
        "Pairs": [
            {"row": "Key_0: Value_0", "prediction": 0.1544346809387207},
            {"row": "Key_1: Value_1", "prediction": 0.4938497543334961},
            {"row": "Key_2: Value_2", "prediction": 0.23486430943012238},
            {"row": "Avantor Team 1", "prediction": 0.23486430943012238},
        ]
    }
    # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker-a2i-runtime.html
    response = a2i_client.start_human_loop(
        HumanLoopName=str(uuid.uuid4()),
        FlowDefinitionArn=config.flow_definition_arn,
        HumanLoopInput={
            "InputContent": json.dumps(input_content)
        }
    )

    print(f"Started human loop {response['HumanLoopArn']}")


if __name__ == "__main__":
    # Uncomment ONE step at a time and re-run the script.
    # Create order: execution role -> task UI -> flow definition -> human loop.
    # Clean up in the reverse order.

    # create_a2i_execution_role()
    # delete_a2i_execution_role()

    # create_human_task_ui()
    # delete_human_task_ui()

    # create_flow_definition()
    # delete_flow_definition()

    # start_human_loop()
    pass
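The last step, checking the output in S3, can also be scripted. The sketch below assumes the describe_human_loop API of the sagemaker-a2i-runtime client and the A2I output file layout (a JSON document with a humanAnswers list whose items carry answerContent). The helper names, the polling interval, and the timeout are assumptions and are not part of the script above.

```python
import json
import time
from typing import List, Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split "s3://bucket/key" into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def wait_for_human_loop(
    a2i_client,
    human_loop_name: str,
    poll_seconds: float = 30,
    timeout_seconds: float = 3600,
) -> str:
    """Poll until the human loop completes, then return its output S3 URI."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = a2i_client.describe_human_loop(HumanLoopName=human_loop_name)
        status = response["HumanLoopStatus"]
        if status == "Completed":
            return response["HumanLoopOutput"]["OutputS3Uri"]
        if status in ("Failed", "Stopped"):
            raise RuntimeError(f"human loop ended with status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"human loop still {status} after timeout")
        time.sleep(poll_seconds)


def read_human_answers(s3_client, output_s3_uri: str) -> List[dict]:
    """Download the output JSON and return each worker's answerContent."""
    bucket, key = split_s3_uri(output_s3_uri)
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    document = json.loads(body)
    return [answer["answerContent"] for answer in document.get("humanAnswers", [])]
```

To use it, pass the HumanLoopName that was given to start_human_loop, together with clients created from the same boto3 session (for example boto_ses.client("sagemaker-a2i-runtime") and boto_ses.client("s3")).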

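4. How to write a complex custom Task Template

A Task Template is essentially HTML + JavaScript, so writing one is front-end work. The page shows some information to the human reviewer, offers some interactive controls, and lets the reviewer enter data. That is all a Task Template really is. To learn how to build complex interactions for HIL with HTML, it is recommended to read the following three documents in order: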