Information regarding Amazon SageMaker Ground Truth

Views: 66 | Published: 2021/4/3 19:37:01 | Tags: amazon-web-services, amazon-sagemaker


Question

I'm trying to run a simple Ground Truth labeling job with a private workforce for text classification. Since I'm new to AWS Ground Truth, I have some questions:

If I use a private workforce, what is the maximum number of people I can allocate to the labeling job? Does the pricing depend on the number of people in the private workforce?

I have a labeled dataset (text classification) that I uploaded to an S3 bucket. If I upload additional unlabeled data to it, will AutoML label the raw data? If not, how can I use the already labeled dataset to label new raw data?

The Ground Truth documentation says that at least 1,000 objects need to be labeled by humans. Does that mean 1,000 objects across all classes, or 1,000 objects per class? If I manually label 1,000+ objects, how many more objects will AutoML label, or what is the maximum number of objects AutoML can label?

Recommended answer

I'm the product manager for Amazon SageMaker Ground Truth, and I would be happy to answer your questions. Here are my responses:

[1] Your private labeling workforce can be as large or as small as you would like. The pricing does not depend on the size of your labeling workforce.

[2] You can learn more about how to bring a "partially" labeled dataset here: https://docs.aws.amazon.com/sagemaker/latest/dg/sms-reusing-data.html#sms-reusing-data-newdata
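As a rough illustration of what a partially labeled input could look like, here is a minimal sketch that writes a JSON Lines manifest in which already-labeled texts carry a label attribute and unlabeled texts carry only `source`. The attribute name `category`, the helper names, and the reduced set of metadata fields are assumptions modeled on Ground Truth's text-classification output format; check the exact required fields against the documentation linked above.

```python
import json


def manifest_line(text, label_attr="category", class_index=None, class_name=None):
    """Return one JSON Lines manifest entry for a single text object.

    label_attr is a hypothetical label attribute name; the metadata below is a
    simplified subset of what a Ground Truth job writes (assumption).
    """
    entry = {"source": text}
    if class_index is not None:
        # Already-labeled object: attach the label and its metadata.
        entry[label_attr] = class_index
        entry[f"{label_attr}-metadata"] = {
            "class-name": class_name,
            "human-annotated": "yes",
            "type": "groundtruth/text-classification",
            "confidence": 1.0,
        }
    return json.dumps(entry)


def build_manifest(labeled, unlabeled, label_attr="category"):
    """Combine labeled (text, class_index, class_name) tuples and unlabeled texts."""
    lines = [manifest_line(t, label_attr, i, n) for t, i, n in labeled]
    lines += [manifest_line(t, label_attr) for t in unlabeled]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_manifest([("great product", 0, "positive")], ["needs review"]), end="")
```

The resulting text would be uploaded to S3 as the input manifest; using the same label attribute name in the new job is what lets the existing labels be picked up, per the linked page.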

You can also use the ML model trained from a previous labeling job. Learn more here: https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-ground-truth-using-a-pre-trained-model-for-faster-data-labeling/
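A hedged sketch of how a previous job's model might be passed to a new job: the boto3 `create_labeling_job` call accepts a `LabelingJobAlgorithmsConfig` structure with the keys `LabelingJobAlgorithmSpecificationArn` and `InitialActiveLearningModelArn`. The helper name and the ARN values below are placeholders, not real resources, and the function only builds the dictionary without calling AWS.

```python
def build_algorithms_config(algorithm_spec_arn, initial_model_arn=None):
    """Build the LabelingJobAlgorithmsConfig dict for create_labeling_job.

    initial_model_arn is the model produced by an earlier auto-labeling job;
    when omitted, the new job starts without a pre-trained model.
    """
    config = {"LabelingJobAlgorithmSpecificationArn": algorithm_spec_arn}
    if initial_model_arn:
        config["InitialActiveLearningModelArn"] = initial_model_arn
    return config


if __name__ == "__main__":
    # Placeholder ARNs for illustration only (assumptions, not real resources).
    cfg = build_algorithms_config(
        "arn:aws:sagemaker:REGION:ACCOUNT:labeling-job-algorithm-specification/text-classification",
        "arn:aws:sagemaker:REGION:ACCOUNT:model/EXAMPLE-MODEL",
    )
    print(cfg)
```

The returned dictionary would be passed as `LabelingJobAlgorithmsConfig=cfg` alongside the other required `create_labeling_job` arguments, which are omitted here.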

[3] To clarify, you need 1,000 dataset objects to start an auto-labeling job, but some of these 1,000 objects can be auto-labeled (the percentage depends on your data and use case). The 1,000 objects are counted across all your classes; that is, there is no additional requirement beyond having 1,000 text dataset objects.
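To make the "across classes, not per class" point concrete, here is a small sketch of a pre-flight check. The function and constant names are hypothetical, and the per-class counts are made-up illustration data; the only rule encoded is the 1,000-object total stated in the answer.

```python
MIN_OBJECTS_FOR_AUTO_LABELING = 1000  # threshold stated in the answer


def meets_auto_labeling_minimum(class_counts):
    """Return True when the total number of objects reaches the minimum.

    class_counts maps a class name to an (estimated) object count; only the
    sum matters, so individual classes may hold far fewer than 1,000 objects.
    """
    return sum(class_counts.values()) >= MIN_OBJECTS_FOR_AUTO_LABELING


if __name__ == "__main__":
    # Illustrative counts: no single class has 1,000 objects, but the total does.
    print(meets_auto_labeling_minimum({"positive": 600, "negative": 350, "neutral": 50}))
```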

You can learn more about the mechanics of auto-labeling from this blog post: https://aws.amazon.com/blogs/machine-learning/annotate-data-for-less-with-amazon-sagemaker-ground-truth-and-automated-data-labeling/
