Interesting NLP/machine-learning style project -- analyzing privacy policies

Views: 216 Posted: 2020/4/27 3:42:35 Tags: language-agnostic artificial-intelligence nlp machine-learning


Problem description

I wanted some input on an interesting problem I've been assigned. The task is to analyze hundreds, and eventually thousands, of privacy policies and identify their core characteristics. For example: do they collect the user's location? Do they share or sell data with third parties?

I've talked to a few people, read a lot about privacy policies, and thought about this myself. Here is my current plan of attack:

First, read a lot of privacy policies and find the major "cues", or indicators, that a certain characteristic is met. For example, if hundreds of privacy policies contain the same line, "We will take your location.", that line could be a cue with 100% confidence that the policy includes taking the user's location. Other cues would give much smaller degrees of confidence about a characteristic. For example, the presence of the word "location" might increase the likelihood that the user's location is stored by 25%.
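The cue idea above could be sketched roughly as follows. This is only an illustration: the cue patterns, the weights, and the `cue_confidence` helper are all hypothetical, and combining cues with a noisy-OR rule (each matching cue independently "fails" with probability 1 minus its weight) is an assumption, not something the post specifies.

```python
import re

# Hypothetical cues for the "collects location" characteristic.
# The weights are illustrative guesses, not measured values.
LOCATION_CUES = [
    (re.compile(r"\bwe will take your location\b", re.IGNORECASE), 1.00),
    (re.compile(r"\blocation\b", re.IGNORECASE), 0.25),
    (re.compile(r"\bgps\b", re.IGNORECASE), 0.40),
]


def cue_confidence(policy_text, cues):
    """Combine the weights of all matching cues with a noisy-OR rule.

    Assumption: cues are treated as independent evidence, so the chance
    that every matching cue is wrong is the product of (1 - weight).
    """
    miss_probability = 1.0
    for pattern, weight in cues:
        if pattern.search(policy_text):
            miss_probability *= 1.0 - weight
    return 1.0 - miss_probability


if __name__ == "__main__":
    sample = "We may store your location and GPS coordinates."
    print(round(cue_confidence(sample, LOCATION_CUES), 2))
```

One design note: hand-assigned weights like these are hard to justify as the cue list grows, which is a reason to estimate them from labeled policies instead.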

The idea would be to keep developing these cues, and their appropriate confidence levels, to the point where I could categorize all privacy policies with a high degree of confidence. An analogy could be made to email spam filters that use Bayesian filtering to identify which mail is likely commercial and unsolicited.
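To make the spam-filter analogy concrete, here is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing, written with the standard library only. The `NaiveBayes` class, the label names, and the four training sentences are hypothetical stand-ins; a real project would need many hand-labeled policy excerpts per characteristic.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocabulary = set()

    def train(self, text, label):
        self.doc_counts[label] += 1
        counts = self.word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            self.vocabulary.add(word)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Work in log space: log prior plus log likelihood of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical hand-labeled excerpts, for illustration only.
TRAINING = [
    ("we collect your precise location from your device", "collects_location"),
    ("your gps location is stored on our servers", "collects_location"),
    ("we only keep your email address and name", "no_location"),
    ("billing details are used to process payments", "no_location"),
]

classifier = NaiveBayes()
for excerpt, label in TRAINING:
    classifier.train(excerpt, label)

if __name__ == "__main__":
    print(classifier.predict("we collect your gps location"))
```

A bag-of-words model like this ignores word order, so a sentence such as "we do not collect your location" would likely be misclassified; handling negation would need richer features.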

I wanted to ask whether you guys think this is a good approach to this problem. How exactly would you approach a problem like this? Furthermore, are there any specific tools or frameworks you'd recommend using? Any input is welcome. This is my first time doing a project that touches on artificial intelligence, specifically machine learning and NLP.
