How can I most accurately associate Google Cloud Platform project costs with App Engine activity?
I support a data management solution built on Google Cloud Platform. As our product matures, more and more teams and individuals are adopting it, meaning more people are storing and searching for data and racking up costs. We need to better understand how much each of these users/workflows are costing us so that we can eventually start charging them for using our services.
I already have billing data for the Google Cloud Platform project that our solution runs on exported to BigQuery. I've observed that 70-80 percent of our Google Cloud Platform bill for the project in question is attributed to App Engine (as a product), so I'm currently focusing on splitting App Engine costs. Here's a condensed view of App Engine costs for the project for one day (from BigQuery):
Row product resource_type start_time end_time cost usage_amount usage_unit
1 App Engine Simple Searches 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 0.1473 3946.0 requests
2 App Engine Flex Instance RAM 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 0.6816 3.710851743744E14 byte-seconds
3 App Engine Search Document Storage 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 0.505028 8.0921704558464E15 byte-seconds
4 App Engine Code and Static File Storage 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 0.0 5.96811043008E13 byte-seconds
5 App Engine Datastore Entity Writes 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 0.085804 67669.0 requests
6 App Engine Other Search Ops 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 0.0 1732.0 requests
7 App Engine Out Bandwidth 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 0.273014 3.516638423E9 bytes
8 App Engine Datastore Read Ops 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 1.494541 2540902.0 requests
9 App Engine Search Document Indexing 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 0.05012 3.7645832E7 bytes
10 App Engine Datastore Storage 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 1.72891 2.7716055728688E16 byte-seconds
11 App Engine Flex Instance Core Hours 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 5.0496 345600.0 seconds
12 App Engine Task Queue Storage 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 0.0 5.14512E8 byte-seconds
13 App Engine Datastore Small Ops 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 0.0 16166.0 requests
14 App Engine Backend Instances 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 206.080588 1.4870202339153E7 seconds
15 App Engine Frontend Instances 2017-08-20 07:00:00 UTC 2017-08-20 08:00:00 UTC 1.35596 198429.126958 seconds
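Before digging into per-user attribution, it helps to see which line items dominate. A minimal sketch that rolls the rows above up into cost shares per resource_type (values copied from the table; this is local arithmetic, not a BigQuery call):

```python
# Rank resource_types by their share of the day's App Engine cost.
# Rows mirror the billing-export sample above: (resource_type, cost).
rows = [
    ("Simple Searches", 0.1473),
    ("Flex Instance RAM", 0.6816),
    ("Search Document Storage", 0.505028),
    ("Code and Static File Storage", 0.0),
    ("Datastore Entity Writes", 0.085804),
    ("Other Search Ops", 0.0),
    ("Out Bandwidth", 0.273014),
    ("Datastore Read Ops", 1.494541),
    ("Search Document Indexing", 0.05012),
    ("Datastore Storage", 1.72891),
    ("Flex Instance Core Hours", 5.0496),
    ("Task Queue Storage", 0.0),
    ("Datastore Small Ops", 0.0),
    ("Backend Instances", 206.080588),
    ("Frontend Instances", 1.35596),
]

total = sum(cost for _, cost in rows)
shares = sorted(((cost / total, rt) for rt, cost in rows), reverse=True)
for share, rt in shares[:3]:
    print(f"{rt}: {share:.1%}")
```

Backend Instances alone comes out to roughly 95% of the day's cost, which is what makes that one line item the attribution problem.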
Question 1: By the way, for anybody familiar with Google Cloud Platform billing exports: an entry with start_time 2017-08-20 07:00:00 UTC and end_time 2017-08-20 08:00:00 UTC reflects costs incurred on 2017-08-20, not 2017-08-19, right?
Now, I understand that associating these App Engine costs with App Engine activity is not going to be an exact mapping (Google Cloud Platform does not bill per action, and there will be fixed and, I assume, shared resource costs; please correct me if I'm wrong!), but I'd still like a sensible estimate. My first attempt involved checking Google's logged estimated cost per request, so I created a sink for the App Engine request logs and waited for the numbers to roll in. However, the total estimated cost across all requests on a given day using this approach is very low:
SELECT SUM(protoPayload.cost) AS cost_total
FROM [my-data-management-solution:request_log.appengine_googleapis_com_request_log_20170820];
yields
Row cost_total
1 3.2711573326337837
That barely accounts for 1.5% of the total App Engine costs!
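That ratio is easy to verify locally from the two numbers above (the request-log cost sum and the total of the cost column in the billing table):

```python
# Sanity check: summed per-request cost estimates vs. the billing-export total.
request_log_cost_total = 3.2711573326337837  # SUM(protoPayload.cost) from the query above
app_engine_daily_total = 217.452465          # sum of the cost column in the table above

fraction = request_log_cost_total / app_engine_daily_total
print(f"{fraction:.1%}")
```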
Question 2: Which resource_type(s) (from the Google Cloud Platform billing export) do the request log cost estimates correspond or contribute to?
About 95% of my App Engine costs are attributed to the Backend Instances resource_type. I did some cursory research into what backend instances are (including this video, which claims that Google is moving away from the backend/frontend instance distinction altogether). I assume (or may have read) that Google relies on proprietary algorithms to spin up, shut down, and otherwise manage these instances. As such…
Question 3 (the big question): How can I get some visibility into how much individual user/workflow actions (limiting the scope to actions via App Engine is OK) contribute to total App Engine costs, or at minimum to App Engine Backend Instances costs, for a Google Cloud project? Is it possible without something like regressing costs against user activity and building an ML model? Is it reasonable at all to expect insight into how this black box works (from both the scaling and the pricing perspectives), or to assume that App Engine costs are somewhat directly correlated with user activity?
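For what it's worth, one pragmatic (if crude) answer I've considered is proportional allocation: split the Backend Instances line item across users by each user's share of total request processing time from the user-attributed request logs. This is a heuristic sketch, not Google's actual scaling or pricing model, and the user names and latencies below are made up for illustration:

```python
from collections import defaultdict

# Heuristic: allocate the Backend Instances cost in proportion to each
# user's share of total request processing time. This assumes instance
# cost scales roughly with serving time, which the autoscaler does not
# guarantee; idle instance time gets smeared across all users.
backend_instances_cost = 206.080588  # from the billing export above

# (user_id, latency_seconds) pairs; illustrative values only.
requests = [
    ("alice", 0.120), ("alice", 0.340),
    ("bob",   1.900),
    ("carol", 0.080), ("carol", 0.060),
]

seconds_by_user = defaultdict(float)
for user, latency in requests:
    seconds_by_user[user] += latency

total_seconds = sum(seconds_by_user.values())
cost_by_user = {u: backend_instances_cost * s / total_seconds
                for u, s in seconds_by_user.items()}
for user, cost in sorted(cost_by_user.items()):
    print(f"{user}: ${cost:.2f}")
```

The obvious caveat is that during quiet periods mostly-idle instances still accrue cost, so light users get charged for capacity that heavy users' traffic patterns provoked; whether that is acceptable depends on how fair the chargeback needs to be.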
Additional Information
- Our data management solution uses its own concept of identity, and I'm not expecting Google to magically figure it out. I can currently link request_log items to users by parsing Stackdriver logs, and I'll work out the user-workflow associations or get them from another tool.
- Just in case: is there anything that does some of this out of the box? One StackOverflow comment mentioned Potamus, but the repository is no longer available, and there's hardly any information out there about what it did to begin with.
- If App Engine cost splitting isn't a big deal, how about other products like Cloud Storage? That will be my next target, although the challenge of associating Cloud Storage costs (both the actual, potentially negligible, storage costs and the more expensive I/O costs) with App Engine activity seems even less tractable at this point.
I fully understand your interest in resource usage; hope this helps!
You can create (and manage) labels via the Resource Manager API in the GCP Cloud Console. Label entities can be associated with teams/cost centers, users, environments, etc., to gain clarity into resource usage. The linked resource provides further detail: GCP_Using Labels
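To turn labels into an actual cost split, you can group the exported billing rows by a label value. A sketch under the assumption that each exported row carries a labels mapping (in the real BigQuery export, labels arrive as repeated key/value records, so you would flatten those first):

```python
from collections import defaultdict

# Roll up billing-export rows by a hypothetical "cost-center" label.
# The row shape here is illustrative, not the exact export schema.
rows = [
    {"cost": 12.50, "labels": {"cost-center": "search-team"}},
    {"cost": 3.20,  "labels": {"cost-center": "ingest-team"}},
    {"cost": 0.75,  "labels": {}},  # unlabeled resources still show up
]

cost_by_center = defaultdict(float)
for row in rows:
    center = row["labels"].get("cost-center", "(unlabeled)")
    cost_by_center[center] += row["cost"]

for center, cost in sorted(cost_by_center.items()):
    print(f"{center}: ${cost:.2f}")
```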
You can also create visual representations of billing data for further analysis using the BigQuery export and Data Studio. The linked Medium article is a great overview: Medium_Visualizing GCP Billing Data using BQ and Data Studio
Cheers, Amber