{ "cells": [ { "cell_type": "markdown", "source": [ "# Amazon Augmented AI Learning Lab\n" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "## 1. Overview\n", "\n", "Amazon Augmented AI (Amazon A2I) is a service that brings human review of ML predictions to all developers by removing the heavy lifting associated with building human review systems or managing large numbers of human reviewers.\n", "\n", "Many ML applications require humans to review low-confidence predictions to ensure the results are correct. For example, extracting information from scanned mortgage application forms can require human review due to low-quality scans or poor handwriting. Building human review systems can be time-consuming and expensive because it involves implementing complex processes or workflows, writing custom software to manage review tasks and results, and managing large groups of reviewers.\n", "\n", "**In this lab, we will learn how to build a robust, scalable, customizable Human in Loop system**.\n", "\n", "There are two major components:\n", "\n", "1. Infrastructure\n", "2. HIL Application\n", "\n", "**Terminology**\n", "\n", "- A2I: Augmented AI, the AWS service name\n", "- HIL: Human in Loop, a feature in A2I\n", "- Task Template: defines the HTML UI of a HIL task\n" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "## 2. Infrastructure - Create Labeling Workforce Private Team\n", "\n", "First, we create a [labeling workforce private team](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-management.html) and invite our employees to do the Human in Loop tasks. 
Then, the labeling workers can use the web app to log in to their workspace and do HIL tasks.\n", "\n", "[Amazon SageMaker Ground Truth](https://aws.amazon.com/sagemaker/data-labeling/?sagemaker-data-wrangler-whats-new.sort-by=item.additionalFields.postDateTime&sagemaker-data-wrangler-whats-new.sort-order=desc) greatly simplifies creating and managing this HIL GUI system. Traditionally, without Amazon SageMaker Ground Truth, you would need to develop your own web application with a GUI and manage login credentials, data access policies, and workers yourself.\n", "\n", "1. Go to the AWS SageMaker console -> Ground Truth sub menu -> Labeling workforces -> Private -> Create Private Team:" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "2. Configure the private team. You can follow the detailed instructions below:" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "- Private Team Creation = Create a private team with [AWS Cognito](https://aws.amazon.com/cognito/)\n", "- Team details:\n", " - Team name = ``my-labeling-team``\n", "- Add Workers:\n", " - Invite new workers by email, Email address: ``alice@example.com``\n", " - Organization name: ``my-org``\n", " - Contact email: ``admin@example.com``\n", "- Enable Notifications: we don't need this for learning.\n", "- Click the \"Create Private Team\" button." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "3. Now you can enter your Private Team console. There is a sign-in URL that your workers can use to log in to the HIL workspace. 
If you want to add more workers to your team, you can invite more people by clicking the \"Invite new workers\" button." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "4. As a worker, once you have logged in to the HIL workspace, you will see the following GUI. At this point, no HIL tasks are available yet." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "## 3. Infrastructure - Create Human in Loop Workflow\n", "\n", "In this section, we will create all the AWS resources required for the Human in Loop workflow, including:\n", "\n", "1. An S3 bucket to store the HIL data\n", "2. An IAM Role for Human Review Workflow execution\n", "3. A Human Review Workflow definition that defines the metadata about this workflow\n", "4. 
A Task template that defines the HIL task HTML UI\n" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 20, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: pathlib_mate in /Users/sanhehu/venvs/python/3.8.11/dev_exp_share_venv/lib/python3.8/site-packages (from -r requirements.txt (line 1)) (1.0.3)\r\n", "Requirement already satisfied: smart_open in /Users/sanhehu/venvs/python/3.8.11/dev_exp_share_venv/lib/python3.8/site-packages (from -r requirements.txt (line 2)) (5.2.1)\r\n", "Requirement already satisfied: boto3 in /Users/sanhehu/venvs/python/3.8.11/dev_exp_share_venv/lib/python3.8/site-packages (from -r requirements.txt (line 3)) (1.22.9)\r\n", "Requirement already satisfied: boto_session_manager in /Users/sanhehu/venvs/python/3.8.11/dev_exp_share_venv/lib/python3.8/site-packages (from -r requirements.txt (line 4)) (0.0.4)\r\n", "Requirement already satisfied: s3pathlib in /Users/sanhehu/venvs/python/3.8.11/dev_exp_share_venv/lib/python3.8/site-packages (from -r requirements.txt (line 5)) (1.0.12)\r\n", "\u001B[31mERROR: Could not find a version that satisfies the requirement box (from versions: none)\u001B[0m\r\n", "\u001B[31mERROR: No matching distribution found for box\u001B[0m\r\n", "\u001B[33mWARNING: You are using pip version 21.2.4; however, version 22.2.2 is available.\r\n", "You should consider upgrading via the '/Users/sanhehu/venvs/python/3.8.11/dev_exp_share_venv/bin/python -m pip install --upgrade pip' command.\u001B[0m\r\n", "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "# install additional python libraries\n", "%pip install -r requirements.txt" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "**Below is an automation script**. 
With a minimal set of parameters (the project name you want to use, your AWS credentials, etc.), you can deploy the necessary AWS resources without writing any code." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 3, "outputs": [], "source": [ "# import standard library\n", "import typing as T\n", "import os\n", "import json\n", "import time\n", "import uuid\n", "import subprocess\n", "import dataclasses\n", "\n", "# import 3rd party library\n", "from box import Box\n", "from liquid import Template\n", "from pathlib_mate import Path\n", "from s3pathlib import S3Path, context\n", "from boto_session_manager import BotoSesManager, AwsServiceEnum\n", "\n", "dir_here = Path(os.getcwd()).absolute()\n", "path_task_template = dir_here / \"task.liquid\"\n", "path_task_ui_html = dir_here / \"task.html\"\n", "\n", "\n", "@dataclasses.dataclass\n", "class Lab:\n", " # constant attributes\n", " project_name: str = dataclasses.field()\n", " labeling_team_arn: str = dataclasses.field()\n", " bsm: BotoSesManager = dataclasses.field(default=None)\n", " path_task_template: Path = dataclasses.field(default=path_task_template)\n", " path_task_ui_html: Path = dataclasses.field(default=path_task_ui_html)\n", "\n", " # derived attributes\n", " s3_client: T.Any = dataclasses.field(default=None)\n", " iam_client: T.Any = dataclasses.field(default=None)\n", " sm_client: T.Any = dataclasses.field(default=None)\n", " a2i_client: T.Any = dataclasses.field(default=None)\n", "\n", " _workspace_portal_signin_url_cache: str = dataclasses.field(default=None)\n", "\n", " def __post_init__(self):\n", " if self.bsm is None:\n", " self.bsm = BotoSesManager()\n", " context.attach_boto_session(self.bsm.boto_ses)\n", "\n", " self.s3_client = self.bsm.get_client(AwsServiceEnum.S3)\n", " self.iam_client = self.bsm.get_client(AwsServiceEnum.IAM)\n", " self.sm_client = self.bsm.get_client(AwsServiceEnum.SageMaker)\n", " 
self.a2i_client = self.bsm.get_client(AwsServiceEnum.AugmentedAIRuntime)\n", "\n", " @property\n", " def project_name_slug(self) -> str:\n", " return self.project_name.replace(\"_\", \"-\")\n", "\n", " @property\n", " def common_tags(self) -> T.List[T.Dict[str, str]]:\n", " return [\n", " dict(\n", " Key=\"ProjectName\",\n", " Value=self.project_name_slug,\n", " )\n", " ]\n", "\n", " # --- Create S3 bucket to store HIL data\n", " @property\n", " def bucket_name(self) -> str:\n", " return f\"{self.bsm.aws_account_id}-{self.bsm.aws_region}-{self.project_name_slug}\"\n", "\n", " @property\n", " def bucket_console_url(self) -> str:\n", " return f\"https://s3.console.aws.amazon.com/s3/buckets/{self.bucket_name}?region={self.bsm.aws_region}&tab=objects\"\n", "\n", " def step_1a_create_s3_bucket(self) -> dict:\n", " print(\"Create s3 bucket to store HIL data\")\n", " print(f\" Preview at {self.bucket_console_url}\")\n", " # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.create_bucket\n", " response1 = self.s3_client.create_bucket(\n", " Bucket=self.bucket_name,\n", " CreateBucketConfiguration=dict(\n", " LocationConstraint=self.bsm.aws_region,\n", " ),\n", " )\n", "\n", " # grant CORS permission so HIL UI can access artifacts in S3 bucket\n", " # ref: https://docs.aws.amazon.com/sagemaker/latest/dg/sms-cors-update.html\n", " # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.put_bucket_cors\n", " response2 = self.s3_client.put_bucket_cors(\n", " Bucket=self.bucket_name,\n", " CORSConfiguration={\n", " \"CORSRules\": [\n", " {\n", " \"AllowedHeaders\": [],\n", " \"AllowedMethods\": [\"GET\"],\n", " \"AllowedOrigins\": [\"*\"],\n", " \"ExposeHeaders\": [\"Access-Control-Allow-Origin\"],\n", " }\n", " ]\n", " },\n", " )\n", "\n", " print(f\" Successful created s3://{self.bucket_name}\")\n", " return response1\n", "\n", " def step_1b_delete_s3_bucket(self) -> dict:\n", " 
print(\"Delete HIL data s3 bucket\")\n", " print(f\" Preview at {self.bucket_console_url}\")\n", "\n", " s3dir = S3Path(self.bucket_name)\n", " s3dir.delete_if_exists()\n", " # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.create_bucket\n", " response = self.s3_client.delete_bucket(\n", " Bucket=self.bucket_name,\n", " )\n", " print(f\" Successful deleted s3://{self.bucket_name}\")\n", " return response\n", "\n", " # --- Create IAM Role for HIL\n", " @property\n", " def flow_execution_role_name(self) -> str:\n", " return f\"{self.project_name_slug}-flow-role\"\n", "\n", " @property\n", " def flow_execution_role_policy_name(self) -> str:\n", " return f\"{self.project_name_slug}-flow-role-in-line-policy\"\n", "\n", " @property\n", " def flow_execution_role_arn(self) -> str:\n", " return f\"arn:aws:iam::{self.bsm.aws_account_id}:role/{self.flow_execution_role_name}\"\n", "\n", " @property\n", " def flow_execution_role_console_url(self) -> str:\n", " return f\"https://console.aws.amazon.com/iamv2/home?region={self.bsm.aws_region}#/roles/details/{self.flow_execution_role_name}?section=permissions\"\n", "\n", " def step_2a_create_flow_execution_role(self) -> dict:\n", " print(\"Create IAM role for Human review workflow\")\n", " print(f\" Preview at {self.flow_execution_role_console_url}\")\n", " # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html#IAM.Client.create_role\n", " response1 = self.iam_client.create_role(\n", " RoleName=self.flow_execution_role_name,\n", " AssumeRolePolicyDocument=json.dumps({\n", " \"Version\": \"2012-10-17\",\n", " \"Statement\": [\n", " {\n", " \"Effect\": \"Allow\",\n", " \"Principal\": {\n", " \"Service\": \"sagemaker.amazonaws.com\"\n", " },\n", " \"Action\": \"sts:AssumeRole\"\n", " }\n", " ]\n", " }),\n", " Tags=self.common_tags,\n", " )\n", "\n", " # ref: 
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html#IAM.Client.put_role_policy\n", " response2 = self.iam_client.put_role_policy(\n", " RoleName=self.flow_execution_role_name,\n", " PolicyName=self.flow_execution_role_policy_name,\n", " PolicyDocument=json.dumps({\n", " \"Version\": \"2012-10-17\",\n", " \"Statement\": [\n", " {\n", " \"Effect\": \"Allow\",\n", " \"Action\": [\n", " \"s3:ListBucket\",\n", " \"s3:GetObject\",\n", " \"s3:PutObject\",\n", " \"s3:DeleteObject\"\n", " ],\n", " \"Resource\": [\n", " f\"arn:aws:s3:::{self.bucket_name}*\"\n", " ]\n", " }\n", " ]\n", " }),\n", " )\n", "\n", " print(f\" Successful created {self.flow_execution_role_arn}\")\n", " return response1\n", "\n", " def step_2b_delete_flow_execution_role(self) -> dict:\n", " print(\"Delete Human review workflow IAM role\")\n", " print(f\" Preview at {self.flow_execution_role_console_url}\")\n", " # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html#IAM.Client.delete_role_policy\n", " response = self.iam_client.delete_role_policy(\n", " RoleName=self.flow_execution_role_name,\n", " PolicyName=self.flow_execution_role_policy_name,\n", " )\n", "\n", " # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html#IAM.Client.delete_role\n", " response = self.iam_client.delete_role(\n", " RoleName=self.flow_execution_role_name,\n", " )\n", "\n", " print(f\" Successful delete {self.flow_execution_role_arn}\")\n", " return response\n", "\n", " # --- Create Task\n", " @property\n", " def task_template_name(self) -> str:\n", " return f\"{self.project_name_slug}\"\n", "\n", " @property\n", " def task_template_arn(self) -> str:\n", " return f\"arn:aws:sagemaker:{self.bsm.aws_region}:{self.bsm.aws_account_id}:human-task-ui/{self.task_template_name}\"\n", "\n", " @property\n", " def task_template_console_url(self) -> str:\n", " return 
f\"https://console.aws.amazon.com/a2i/home?region={self.bsm.aws_region}#/worker-task-templates/{self.task_template_name}\"\n", "\n", " def step_3a_create_hil_task_template(self) -> dict:\n", " print(\"Create Human in Loop task template\")\n", " print(f\" Preview at {self.task_template_console_url}\")\n", " # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_human_task_ui\n", " liquid_template = self.path_task_template.read_text(encoding=\"utf-8\")\n", " response = self.sm_client.create_human_task_ui(\n", " HumanTaskUiName=self.task_template_name,\n", " UiTemplate={\n", " \"Content\": liquid_template\n", " },\n", " Tags=self.common_tags,\n", " )\n", "\n", " print(f\" Successful created {self.task_template_name}\")\n", " return response\n", "\n", " def step_3b_delete_hil_task_template(self) -> dict:\n", " print(\"Delete Human in Loop task template\")\n", " print(f\" Verify at {self.task_template_console_url}\")\n", " # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.delete_human_task_ui\n", " response = self.sm_client.delete_human_task_ui(\n", " HumanTaskUiName=self.task_template_name\n", " )\n", "\n", " print(f\" Successful delete {self.task_template_arn}\")\n", " return response\n", "\n", " # --- Create Human review workflow\n", " @property\n", " def flow_definition_name(self) -> str:\n", " return f\"{self.project_name_slug}\"\n", "\n", " @property\n", " def flow_definition_arn(self) -> str:\n", " return f\"arn:aws:sagemaker:{self.bsm.aws_region}:{self.bsm.aws_account_id}:flow-definition/{self.flow_definition_name}\"\n", "\n", " @property\n", " def flow_definition_console_url(self) -> str:\n", " return f\"https://console.aws.amazon.com/a2i/home?region={self.bsm.aws_region}#/human-review-workflows/{self.flow_definition_name}\"\n", "\n", " @property\n", " def s3dir_hil_input(self) -> S3Path:\n", " return 
S3Path.from_s3_uri(f\"s3://{self.bucket_name}/hil/input\").to_dir()\n", "\n", " @property\n", " def s3dir_hil_output(self) -> S3Path:\n", " return S3Path.from_s3_uri(f\"s3://{self.bucket_name}/hil/output\").to_dir()\n", "\n", " def step_4a_create_flow_definition(self) -> dict:\n", " print(\"Create Human review workflow definition, it may takes 30 sec ~ 1 minute\")\n", " print(f\" Preview at {self.flow_definition_console_url}\")\n", " # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_flow_definition\n", " response = self.sm_client.create_flow_definition(\n", " FlowDefinitionName=self.flow_definition_name,\n", " HumanLoopConfig={\n", " \"WorkteamArn\": self.labeling_team_arn,\n", " \"HumanTaskUiArn\": self.task_template_arn,\n", " \"TaskTitle\": self.task_template_name,\n", " \"TaskDescription\": f\"{self.task_template_name} description\",\n", " \"TaskCount\": 1, # if it\n", " \"TaskTimeLimitInSeconds\": 3600,\n", " },\n", " OutputConfig={\n", " \"S3OutputPath\": self.s3dir_hil_output.to_file().uri,\n", " },\n", " RoleArn=self.flow_execution_role_arn,\n", " Tags=self.common_tags,\n", " )\n", "\n", " print(f\" Successful created {self.flow_definition_arn}\")\n", " return response\n", "\n", " def step_4b_delete_flow_definition(self) -> dict:\n", " print(\"Delete Human review workflow definition, it may takes 30 sec ~ 1 minute\")\n", " print(f\" Preview at {self.flow_definition_console_url}\")\n", " # ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.delete_flow_definition\n", " response = self.sm_client.delete_flow_definition(\n", " FlowDefinitionName=self.flow_definition_name\n", " )\n", "\n", " print(f\" Successful delete {self.flow_definition_arn}\")\n", " return response\n", "\n", " # --- Start Human in Loop\n", " @property\n", " def labeling_workforce_console_url(self) -> str:\n", " return (\n", " 
f\"https://{self.bsm.aws_region}.console.aws.amazon.com/sagemaker/\"\n", " f\"groundtruth?region={self.bsm.aws_region}#/labeling-workforces\"\n", " )\n", "\n", " @property\n", " def workspace_portal_signin_url(self) -> str:\n", " if self._workspace_portal_signin_url_cache is None:\n", " self._workspace_portal_signin_url_cache = \"https://\" + self.sm_client.describe_workteam(\n", " WorkteamName=\"my-labeling-team\"\n", " )[\"Workteam\"][\"SubDomain\"]\n", " return self._workspace_portal_signin_url_cache\n", "\n", " def get_hil_console_url(self, hil_id: str) -> str:\n", " return (\n", " f\"https://{self.bsm.aws_region}.console.aws.amazon.com/a2i/home?\"\n", " f\"region={self.bsm.aws_region}#/human-review-workflows/\"\n", " f\"{self.flow_definition_name}/human-loops/{hil_id}\"\n", " )\n", "\n", " def start_human_loop(self, input_data: dict):\n", " print(\"Start human loop ...\")\n", " print(f\" You can enter the labeling portal from {self.workspace_portal_signin_url}\")\n", " response = self.a2i_client.start_human_loop(\n", " HumanLoopName=str(uuid.uuid4()),\n", " FlowDefinitionArn=self.flow_definition_arn,\n", " HumanLoopInput={\n", " \"InputContent\": json.dumps(input_data),\n", " }\n", " )\n", " hil_arn = response[\"HumanLoopArn\"]\n", " hil_id = hil_arn.split(\"/\")[-1]\n", " hil_console_url = self.get_hil_console_url(hil_id)\n", " print(f\" You can preview HIL status at {hil_console_url}\")\n", "\n", " # --- Develop Task Template\n", " def render_template_and_preview(self, input_data: dict):\n", " # read liquid template\n", " template = Template(self.path_task_template.read_text())\n", "\n", " # convert task data to box, so it support dot notation\n", " task = Box({\"input\": input_data})\n", "\n", " # render template\n", " content = template.render(task=task)\n", "\n", " # write template to html file\n", " self.path_task_ui_html.write_text(content)\n", "\n", " # open html in browser\n", " subprocess.run([\"open\", self.path_task_ui_html.abspath])" ], 
"metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "Now let's run this automation script.\n", "\n", "Firstly, you have to:\n", "\n", "- give a ``project_name``, this will become a common prefix in all AWS resources naming convention.\n", "- copy and paste the labeling team ARN that you created in the previous step to here, you can find the ARN in [labeling team console](https://console.aws.amazon.com/sagemaker/groundtruth?#/labeling-workforces)\n", "- give this script necessary AWS credential using [boto_session_manager](https://pypi.org/project/boto-session-manager/) library.\n", "\n", "Then uncomment the following code line by line and execute:\n", "\n", "- ``lab.step_1a_create_s3_bucket()``\n", "- ``lab.step_2a_create_flow_execution_role()``\n", "- ``lab.step_3a_create_hil_task_template()``\n", "- ``lab.step_4a_create_flow_definition()``" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 4, "outputs": [], "source": [ "lab = Lab(\n", " project_name=\"a2i-poc\",\n", " labeling_team_arn=\"arn:aws:sagemaker:us-east-2:669508176277:workteam/private-crowd/my-labeling-team\",\n", " # bsm=BotoSesManager(), # use this if you use AWS Sagemaker Notebook Instance\n", " bsm=BotoSesManager(profile_name=\"aws_data_lab_sanhe_us_east_2\"), # use this is you are on your local laptop\n", ")" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": 21, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Create s3 bucket to store HIL data\n", " Preview at https://s3.console.aws.amazon.com/s3/buckets/669508176277-us-east-2-a2i-poc?region=us-east-2&tab=objects\n" ] }, { "ename": "BucketAlreadyOwnedByYou", "evalue": "An error occurred (BucketAlreadyOwnedByYou) when calling the CreateBucket operation: Your previous request to create the named bucket succeeded and you already own it.", 
"output_type": "error", "traceback": [ "\u001B[0;31m---------------------------------------------------------------------------\u001B[0m", "\u001B[0;31mBucketAlreadyOwnedByYou\u001B[0m Traceback (most recent call last)", "Input \u001B[0;32mIn [21]\u001B[0m, in \u001B[0;36m\u001B[0;34m()\u001B[0m\n\u001B[0;32m----> 1\u001B[0m \u001B[43mlab\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mstep_1a_create_s3_bucket\u001B[49m\u001B[43m(\u001B[49m\u001B[43m)\u001B[49m\n", "Input \u001B[0;32mIn [3]\u001B[0m, in \u001B[0;36mLab.step_1a_create_s3_bucket\u001B[0;34m(self)\u001B[0m\n\u001B[1;32m 74\u001B[0m \u001B[38;5;28mprint\u001B[39m(\u001B[38;5;124mf\u001B[39m\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124m Preview at \u001B[39m\u001B[38;5;132;01m{\u001B[39;00m\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mbucket_console_url\u001B[38;5;132;01m}\u001B[39;00m\u001B[38;5;124m\"\u001B[39m)\n\u001B[1;32m 75\u001B[0m \u001B[38;5;66;03m# ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.create_bucket\u001B[39;00m\n\u001B[0;32m---> 76\u001B[0m response1 \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43ms3_client\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mcreate_bucket\u001B[49m\u001B[43m(\u001B[49m\n\u001B[1;32m 77\u001B[0m \u001B[43m \u001B[49m\u001B[43mBucket\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mbucket_name\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 78\u001B[0m \u001B[43m \u001B[49m\u001B[43mCreateBucketConfiguration\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[38;5;28;43mdict\u001B[39;49m\u001B[43m(\u001B[49m\n\u001B[1;32m 79\u001B[0m \u001B[43m 
\u001B[49m\u001B[43mLocationConstraint\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mbsm\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43maws_region\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 80\u001B[0m \u001B[43m \u001B[49m\u001B[43m)\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 81\u001B[0m \u001B[43m\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 83\u001B[0m \u001B[38;5;66;03m# grant CORS permission so HIL UI can access artifacts in S3 bucket\u001B[39;00m\n\u001B[1;32m 84\u001B[0m \u001B[38;5;66;03m# ref: https://docs.aws.amazon.com/sagemaker/latest/dg/sms-cors-update.html\u001B[39;00m\n\u001B[1;32m 85\u001B[0m \u001B[38;5;66;03m# ref: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.put_bucket_cors\u001B[39;00m\n\u001B[1;32m 86\u001B[0m response2 \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39ms3_client\u001B[38;5;241m.\u001B[39mput_bucket_cors(\n\u001B[1;32m 87\u001B[0m Bucket\u001B[38;5;241m=\u001B[39m\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mbucket_name,\n\u001B[1;32m 88\u001B[0m CORSConfiguration\u001B[38;5;241m=\u001B[39m{\n\u001B[0;32m (...)\u001B[0m\n\u001B[1;32m 97\u001B[0m },\n\u001B[1;32m 98\u001B[0m )\n", "File \u001B[0;32m~/venvs/python/3.8.11/dev_exp_share_venv/lib/python3.8/site-packages/botocore/client.py:415\u001B[0m, in \u001B[0;36mClientCreator._create_api_method.._api_call\u001B[0;34m(self, *args, **kwargs)\u001B[0m\n\u001B[1;32m 412\u001B[0m \u001B[38;5;28;01mraise\u001B[39;00m \u001B[38;5;167;01mTypeError\u001B[39;00m(\n\u001B[1;32m 413\u001B[0m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;132;01m%s\u001B[39;00m\u001B[38;5;124m() only accepts keyword arguments.\u001B[39m\u001B[38;5;124m\"\u001B[39m \u001B[38;5;241m%\u001B[39m py_operation_name)\n\u001B[1;32m 414\u001B[0m \u001B[38;5;66;03m# The \"self\" in this scope is referring to the BaseClient.\u001B[39;00m\n\u001B[0;32m--> 
415\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_make_api_call\u001B[49m\u001B[43m(\u001B[49m\u001B[43moperation_name\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mkwargs\u001B[49m\u001B[43m)\u001B[49m\n", "File \u001B[0;32m~/venvs/python/3.8.11/dev_exp_share_venv/lib/python3.8/site-packages/botocore/client.py:745\u001B[0m, in \u001B[0;36mBaseClient._make_api_call\u001B[0;34m(self, operation_name, api_params)\u001B[0m\n\u001B[1;32m 743\u001B[0m error_code \u001B[38;5;241m=\u001B[39m parsed_response\u001B[38;5;241m.\u001B[39mget(\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mError\u001B[39m\u001B[38;5;124m\"\u001B[39m, {})\u001B[38;5;241m.\u001B[39mget(\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mCode\u001B[39m\u001B[38;5;124m\"\u001B[39m)\n\u001B[1;32m 744\u001B[0m error_class \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mexceptions\u001B[38;5;241m.\u001B[39mfrom_code(error_code)\n\u001B[0;32m--> 745\u001B[0m \u001B[38;5;28;01mraise\u001B[39;00m error_class(parsed_response, operation_name)\n\u001B[1;32m 746\u001B[0m \u001B[38;5;28;01melse\u001B[39;00m:\n\u001B[1;32m 747\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m parsed_response\n", "\u001B[0;31mBucketAlreadyOwnedByYou\u001B[0m: An error occurred (BucketAlreadyOwnedByYou) when calling the CreateBucket operation: Your previous request to create the named bucket succeeded and you already own it." 
] } ], "source": [ "# lab.step_1a_create_s3_bucket()\n", "# lab.step_1b_delete_s3_bucket()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "lab.step_2a_create_flow_execution_role()\n", "# lab.step_2b_delete_flow_execution_role()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "lab.step_3a_create_hil_task_template()\n", "# lab.step_3b_delete_hil_task_template()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "lab.step_4a_create_flow_definition()\n", "# lab.step_4b_delete_flow_definition()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "Now that the HIL infrastructure is fully deployed, we can start solving some sample business problems with HIL." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "## 4. Learn A2I from Examples\n", "\n", "The AWS A2I team maintains a public repository, [amazon-a2i-sample-task-uis](https://github.com/aws-samples/amazon-a2i-sample-task-uis). It has many HIL task HTML templates, but no sample data (text / pdf / image / audio / video files). You still need to write a lot of code and configure resources in the AWS console to run those samples.\n", "\n", "In this tutorial, we provide the automation script, so you can focus on the.\n", "\n", "**How to Use**\n", "\n", "In the current directory, there is a ``usecases`` folder with many subfolders. Each subfolder represents a single use case. Below are several Jupyter Notebook cells, each containing the experimental script for a single use case. Each cell performs the following steps:\n", "\n", "1. Update the task template to the chosen use case\n", "2. 
Upload necessary artifacts to S3 bucket\n", "3. Start the HIL\n", "4. Show helpful information so you can inspect, preview your HIL task" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "markdown", "source": [ "### Image Use Case\n", "\n", "#### Bounding Box" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 57, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Preview artifacts at https://console.aws.amazon.com/s3/object/669508176277-us-east-2-a2i-poc?prefix=hil/input/cat-and-dog.jpeg\n", "Delete Human in Loop task template\n", " Verify at https://console.aws.amazon.com/a2i/home?region=us-east-2#/worker-task-templates/a2i-poc\n", " Successful delete arn:aws:sagemaker:us-east-2:669508176277:human-task-ui/a2i-poc\n", "Create Human in Loop task template\n", " Preview at https://console.aws.amazon.com/a2i/home?region=us-east-2#/worker-task-templates/a2i-poc\n", " Successful created a2i-poc\n", "Start human loop ...\n", " You can enter the labeling portal from https://3zqu42gydr.labeling.us-east-2.sagemaker.aws\n", " You can preview HIL status at https://us-east-2.console.aws.amazon.com/a2i/home?region=us-east-2#/human-review-workflows/a2i-poc/human-loops/ac04a258-7452-46c5-8f01-39bbfa94c842\n" ] } ], "source": [ "def bounding_box_use_case():\n", " lab.path_task_template = dir_here / \"usecases\" / \"images\" / \"bounding-box\" / \"task.liquid\"\n", " path_artifact = dir_here / \"usecases\" / \"images\" / \"bounding-box\" / \"cat-and-dog.jpeg\"\n", " s3path_artifact = lab.s3dir_hil_input / \"cat-and-dog.jpeg\"\n", " s3path_artifact.upload_file(path_artifact.abspath, overwrite=True)\n", " print(f\"Preview artifacts at {s3path_artifact.console_url}\")\n", " input_data = {\n", " \"taskObject\": s3path_artifact.uri\n", " }\n", "\n", " lab.step_3b_delete_hil_task_template()\n", " lab.step_3a_create_hil_task_template()\n", " time.sleep(3)\n", " 
lab.start_human_loop(input_data)\n", "\n", "\n", "bounding_box_use_case()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "#### Bounding Box Hierarchical" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 13, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Preview artifacts at https://console.aws.amazon.com/s3/object/669508176277-us-east-2-a2i-poc?prefix=hil/input/cat-and-dog.jpeg\n", "Delete Human in Loop task template\n", " Verify at https://console.aws.amazon.com/a2i/home?region=us-east-2#/worker-task-templates/a2i-poc\n", " Successful delete arn:aws:sagemaker:us-east-2:669508176277:human-task-ui/a2i-poc\n", "Create Human in Loop task template\n", " Preview at https://console.aws.amazon.com/a2i/home?region=us-east-2#/worker-task-templates/a2i-poc\n", " Successful created a2i-poc\n", "Start human loop ...\n", " You can enter the labeling portal from https://3zqu42gydr.labeling.us-east-2.sagemaker.aws\n", " You can preview HIL status at https://us-east-2.console.aws.amazon.com/a2i/home?region=us-east-2#/human-review-workflows/a2i-poc/human-loops/a1e4dcbc-074f-4c8e-985b-1914055bc71d\n" ] } ], "source": [ "def bounding_box_hierarchical_use_case():\n", " lab.path_task_template = dir_here / \"usecases\" / \"images\" / \"bounding-box-hierarchical\" / \"task.liquid\"\n", " path_artifact = dir_here / \"usecases\" / \"images\" / \"bounding-box-hierarchical\" / \"lisa.png\"\n", " s3path_artifact = lab.s3dir_hil_input / \"cat-and-dog.jpeg\"\n", " s3path_artifact.upload_file(path_artifact.abspath, overwrite=True)\n", " print(f\"Preview artifacts at {s3path_artifact.console_url}\")\n", " input_data = {\n", " \"taskObject\": s3path_artifact.uri\n", " }\n", "\n", " lab.step_3b_delete_hil_task_template()\n", " lab.step_3a_create_hil_task_template()\n", " time.sleep(3)\n", " lab.start_human_loop(input_data)\n", "\n", "\n", 
"bounding_box_hierarchical_use_case()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### Text Use Case\n", "\n", "#### Key Phrase Extraction" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 34, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Delete Human in Loop task template\n", " Verify at https://console.aws.amazon.com/a2i/home?region=us-east-2#/worker-task-templates/a2i-poc\n", " Successful delete arn:aws:sagemaker:us-east-2:669508176277:human-task-ui/a2i-poc\n", "Create Human in Loop task template\n", " Preview at https://console.aws.amazon.com/a2i/home?region=us-east-2#/worker-task-templates/a2i-poc\n", " Successful created a2i-poc\n", "Start human loop ...\n", " You can enter the labeling portal from https://us-east-2.console.aws.amazon.com/sagemaker/groundtruth?region=us-east-2#/labeling-workforces\n", " You can preview HIL status at https://us-east-2.console.aws.amazon.com/a2i/home?region=us-east-2#/human-review-workflows/a2i-poc/human-loops/30cf4969-8ec3-4db0-98b1-fe9edd7d9d1a\n" ] } ], "source": [ "def key_phrase_extraction_use_case():\n", " lab.path_task_template = dir_here / \"usecases\" / \"text\" / \"key-phrase-extraction\" / \"task.liquid\"\n", " input_data = {\n", " \"taskObject\": \"Excellent bag and fast shipping! Bag arrived right on time and packaged very well. The bag itself is good quality! 
Was a bit skeptical ordering this bag off amazon but it's 100% authentic and the best price!\"\n", " }\n", "\n", " lab.step_3b_delete_hil_task_template()\n", " lab.step_3a_create_hil_task_template()\n", " time.sleep(3)\n", " lab.start_human_loop(input_data)\n", "\n", "\n", "key_phrase_extraction_use_case()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### ML Use Case\n", "\n", "#### Credit Card Application\n", "\n", "Suppose you are a bank with an ML model that automatically approves or denies credit card applications. However, you want a HIL to review applications that have a high credit line or a low ML confidence score.\n", "\n", "A sample application and its ML prediction look like this:\n", "\n", "![](./usecases/ml/credit-card-application/Human-Review-App.png)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 18, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Delete Human in Loop task template\n", " Verify at https://console.aws.amazon.com/a2i/home?region=us-east-2#/worker-task-templates/a2i-poc\n", " Successful delete arn:aws:sagemaker:us-east-2:669508176277:human-task-ui/a2i-poc\n", "Create Human in Loop task template\n", " Preview at https://console.aws.amazon.com/a2i/home?region=us-east-2#/worker-task-templates/a2i-poc\n", " Successful created a2i-poc\n", "Start human loop ...\n", " You can enter the labeling portal from https://3zqu42gydr.labeling.us-east-2.sagemaker.aws\n", " You can preview HIL status at https://us-east-2.console.aws.amazon.com/a2i/home?region=us-east-2#/human-review-workflows/a2i-poc/human-loops/cb60d9d9-462f-4bcb-be23-9944f12a1083\n" ] } ], "source": [ "def credit_card_application_use_case():\n", " lab.path_task_template = dir_here / \"usecases\" / \"ml\" / \"credit-card-application\" / \"task.liquid\"\n", " input_data = {\n", " \"application\": [\n", " {\"key\": 
\"application_id\", \"value\": \"a-127\"},\n", " {\"key\": \"application_date\", \"value\": \"2022-01-03\"},\n", " {\"key\": \"ssn\", \"value\": \"111-22-3333\"},\n", " {\"key\": \"name\", \"value\": \"James Bond\"},\n", " {\"key\": \"month_income\", \"value\": 15000},\n", " ],\n", " \"ml_prediction\": \"approve\",\n", " }\n", "\n", " lab.step_3b_delete_hil_task_template()\n", " lab.step_3a_create_hil_task_template()\n", " time.sleep(3)\n", " lab.start_human_loop(input_data)\n", "\n", "\n", "credit_card_application_use_case()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "## 5. Develop Your Own Task Template\n", "\n", "In the previous section, we deployed the ``task.liquid`` file to AWS as a HIL Task Template, then triggered a sample HIL task to preview how it looks in a web browser. In other words, even a simple test requires creating many AWS resources, including a Labeling Workforce, S3 bucket, IAM Role, Flow Definition, Task Template, and more. **This is a common pain point for many Amazon Augmented AI users**.\n", "\n", "**In this lab, we create a Python tool that renders the Task Template HTML with input data locally, without triggering a HIL**. You then only need to focus on template development and data manipulation.\n", "\n", "A2I task templates are written in [Liquid](https://shopify.github.io/liquid/basics/types/), an HTML template language created by Shopify. 
We use the [python-liquid](https://pypi.org/project/python-liquid/) library to render the HTML locally.\n", "\n", "The following resources are useful for learning how to write a good Task Template:\n", "\n", "- [Tutorial: Create Custom Worker Task Templates](https://docs.aws.amazon.com/sagemaker/latest/dg/a2i-custom-templates.html): learn the basic concepts of A2I task templates.\n", "- [Crowd HTML Elements Reference](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-ui-template-reference.html): usage of every available crowd HTML tag.\n", "- [amazon-a2i-sample-task-uis](https://github.com/aws-samples/amazon-a2i-sample-task-uis): sample task UIs for inspiration.\n", "\n", "Review the code snippet below to learn how to develop a Task Template locally." ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": 19, "outputs": [], "source": [ "dir_use_case = dir_here / \"usecases\" / \"ml\" / \"credit-card-application\"\n", "# tell the tool where the task template liquid file is\n", "lab.path_task_template = dir_use_case / \"task.liquid\"\n", "# tell the tool where you want to store the rendered HTML\n", "lab.path_task_ui_html = dir_use_case / \"task.html\"\n", "# define the input data that matches your task template\n", "input_data = {\n", " \"application\": [\n", " {\"key\": \"application_id\", \"value\": \"a-127\"},\n", " {\"key\": \"application_date\", \"value\": \"2022-01-03\"},\n", " {\"key\": \"ssn\", \"value\": \"111-22-3333\"},\n", " {\"key\": \"name\", \"value\": \"James Bond\"},\n", " {\"key\": \"month_income\", \"value\": 15000},\n", " ],\n", " \"ml_prediction\": \"approve\",\n", "}\n", "# render the template with the input data and preview the result\n", "lab.render_template_and_preview(input_data)" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } } ], 
"metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 0 }