Publishing data in a data warehouse


Question

Are there best practices or well known methods for publishing/announcing (via metadata etc) what data has been loaded, verified and is currently available for reporting in a data warehouse?

I've seen several in-house systems for doing this - some pretty fragile.

Are there some well-known concepts or good search terms I could look for?

Recommended answer

I'm not sure exactly what you're looking for here, but what exactly are the users waiting for?

If it's for the system to be available again after a well-defined and consistent daily ETL process runs, then it's easy to send an email, re-enable your reporting application, update a status icon on your intranet site etc.
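
As a rough illustration of the "update a status icon" idea, the last step of the ETL job could write a small status document that an intranet page or reporting tool polls. This is only a sketch: the JSON file, its field names, and the functions `publish_status` and `read_status` are hypothetical and not part of any particular tool.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def publish_status(status_path: Path, available: bool, note: str = "") -> dict:
    """Write a small JSON status document that other tools can poll."""
    status = {
        "available": available,
        "updated_at": datetime.now(timezone.utc).isoformat(),
        "note": note,
    }
    # Write to a temporary file first, then rename it over the real one,
    # so a reader never sees a half-written status file.
    tmp_path = status_path.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(status))
    tmp_path.replace(status_path)
    return status


def read_status(status_path: Path) -> dict:
    """Return the last published status, or 'unavailable' if none exists."""
    if not status_path.exists():
        return {"available": False, "updated_at": None, "note": "no status published"}
    return json.loads(status_path.read_text())
```

The ETL job would call `publish_status(path, False)` when it starts and `publish_status(path, True, "daily ETL finished")` when it completes; sending an email or re-enabling the reporting application could hang off the same final step.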

On the other hand, if they are waiting for a very specific data set ("is the Q4 sales data for the widget division in the south-east Asia region available yet?") then things are much more difficult because everyone is interested in something different. It's not even really a technical decision because knowing when source data is complete and correct is a business question that may have a different answer for each source system or data set. In our environment, daily reports are fully automated but monthly or yearly ones are not, mostly because there are often inconsistent events or processes that mean we still need a human being to confirm that the reports can be run.

I'm sure you could use metadata to build some kind of dashboard that shows when certain data was loaded, but it would be extremely specific to your situation and your users so I don't know if there's any general solution or pattern. I imagine it would be very dependent on your business processes, reporting schema (for the metadata) and reporting tools.
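
One possible shape for such metadata is a small audit table that records when each data set was loaded and, where a human has to confirm it, who signed it off. The sketch below uses SQLite only to stay self-contained; the table `load_audit`, its columns, and the helper functions are all assumptions made for illustration, not a standard pattern or an existing schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS load_audit (
    dataset     TEXT NOT NULL,   -- e.g. 'sales'
    period      TEXT NOT NULL,   -- e.g. '2023-Q4'
    loaded_at   TEXT NOT NULL,   -- when the ETL load finished
    verified_by TEXT,            -- NULL until a person signs off
    PRIMARY KEY (dataset, period)
)
"""


def record_load(conn, dataset, period, loaded_at):
    """Record (or re-record) a finished load; a reload clears any sign-off."""
    conn.execute(
        "INSERT OR REPLACE INTO load_audit (dataset, period, loaded_at, verified_by) "
        "VALUES (?, ?, ?, NULL)",
        (dataset, period, loaded_at),
    )


def sign_off(conn, dataset, period, user):
    """Mark a loaded data set as verified by a named person."""
    cur = conn.execute(
        "UPDATE load_audit SET verified_by = ? WHERE dataset = ? AND period = ?",
        (user, dataset, period),
    )
    if cur.rowcount == 0:
        raise ValueError(f"{dataset}/{period} has not been loaded")


def is_available(conn, dataset, period, needs_sign_off):
    """Loaded is enough for automated reports; others also need a sign-off."""
    row = conn.execute(
        "SELECT verified_by FROM load_audit WHERE dataset = ? AND period = ?",
        (dataset, period),
    ).fetchone()
    if row is None:
        return False
    return row[0] is not None or not needs_sign_off
```

With this layout a daily report would call `is_available(..., needs_sign_off=False)`, while a monthly or yearly report would pass `needs_sign_off=True` and stay unavailable until someone calls `sign_off`. A dashboard could then be a plain query over `load_audit`. Which data sets need a sign-off remains a business decision, as noted above.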
