Interesting NLP/machine-learning style project -- analyzing privacy policies

Views: 36 Posted: 2022/1/2 17:55:40 Tags: language-agnostic artificial-intelligence nlp machine-learning

Problem description

I would like some input on an interesting problem I've been assigned. The task is to analyze hundreds, and eventually thousands, of privacy policies and identify their core characteristics. For example: do they collect the user's location? Do they share data with, or sell it to, third parties?

I've talked to a few people, read a lot about privacy policies, and thought about this myself. Here is my current plan of attack:

First, read a lot of privacy policies and find the major "cues" or indicators that a certain characteristic is met. For example, if hundreds of privacy policies contain the same line, "We will take your location.", that line could be a cue with 100% confidence that the policy includes taking the user's location. Other cues would give much smaller degrees of confidence about a certain characteristic. For example, the presence of the word "location" might increase the likelihood that the user's location is stored by 25%.
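As a rough illustration of the cue idea, here is a minimal Python sketch. The cue list, the weights, and the function name `cue_confidence` are hypothetical, and the noisy-OR rule used to combine matched cues (1 minus the product of the misses) is only one assumed way to turn per-cue confidences into a single score; the question itself does not specify how cues should be combined.

```python
import re

# Hypothetical cues for the "collects location" characteristic.
# Each entry is (compiled pattern, confidence weight in [0, 1]).
LOCATION_CUES = [
    (re.compile(r"we will take your location", re.IGNORECASE), 1.0),
    (re.compile(r"\blocation\b", re.IGNORECASE), 0.25),
]


def cue_confidence(policy_text, cues):
    """Combine the weights of all matched cues with a noisy-OR.

    Assumption: cues are treated as independent evidence, so the
    combined confidence is 1 - product(1 - weight) over matched cues.
    """
    miss = 1.0
    for pattern, weight in cues:
        if pattern.search(policy_text):
            miss *= 1.0 - weight
    return 1.0 - miss


if __name__ == "__main__":
    print(cue_confidence("We will take your location.", LOCATION_CUES))
    print(cue_confidence("We may use location data.", LOCATION_CUES))
    print(cue_confidence("We store your email address.", LOCATION_CUES))
```

With these weights, a policy containing the exact sentence scores 1.0, one that only mentions "location" scores 0.25, and one with no matching cue scores 0.0. Hand-tuned weights like these would still need to be checked against manually labeled policies.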

The idea would be to keep developing these cues, and their appropriate confidence levels, to the point where I could categorize all privacy policies with a high degree of confidence. An analogy could be made to email spam filters that use Bayesian filtering to identify which mail is likely commercial and unsolicited.
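To make the spam-filter analogy concrete, the following is a small sketch of a multinomial naive Bayes text classifier with add-one smoothing, written in plain Python. The class name `NaiveBayes`, the labels `location` and `no_location`, and the four training sentences are made up for illustration; a real classifier would be trained on many hand-labeled policy excerpts and would likely use an established library instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, text, label):
        words = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def log_scores(self, text):
        """Return log P(label) + sum of log P(word | label) per label."""
        total_docs = sum(self.doc_counts.values())
        words = tokenize(text)
        scores = {}
        for label, counts in self.word_counts.items():
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy, made-up training excerpts (illustrative only).
nb = NaiveBayes()
nb.train("we collect your precise location from your device", "location")
nb.train("we may use gps location data to improve services", "location")
nb.train("we do not collect any information about where you are", "no_location")
nb.train("we only store your email address and name", "no_location")

print(nb.classify("your device location may be collected"))  # prints "location"
```

Unlike hand-assigned cue weights, the per-word probabilities here are estimated from labeled examples, which is how Bayesian spam filters are typically built.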

I wanted to ask whether you think this is a good approach to this problem. How exactly would you approach a problem like this? Furthermore, are there any specific tools or frameworks you'd recommend using? Any input is welcome. This is my first time doing a project that touches on artificial intelligence, specifically machine learning and NLP.
