How to chain map and filter functions in the correct order
I really like chaining Array.prototype.map, filter and reduce to define a data transformation. Unfortunately, in a recent project that involved large log files, I could no longer get away with looping through my data multiple times...
My goal:
I want to create a function that chains .filter and .map methods by, instead of mapping over an array immediately, composing a function that loops over the data once. I.e.:
const DataTransformation = () => ({
  map: fn => (/* ... */),
  filter: fn => (/* ... */),
  run: arr => (/* ... */)
});

const someTransformation = DataTransformation()
  .map(x => x + 1)
  .filter(x => x > 3)
  .map(x => x / 2);

// returns [ 2, 2.5 ] without creating [ 2, 3, 4, 5] and [4, 5] in between
const myData = someTransformation.run([ 1, 2, 3, 4]);
My attempt:
Inspired by this answer and this blogpost I started writing a Transduce function.
const filterer = pred => reducer => (acc, x) =>
  pred(x) ? reducer(acc, x) : acc;

const mapper = map => reducer => (acc, x) =>
  reducer(acc, map(x));

const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
  map: map => Transduce(mapper(map)(reducer)),
  filter: pred => Transduce(filterer(pred)(reducer)),
  run: arr => arr.reduce(reducer, [])
});
The problem:
The problem with the Transduce snippet above is that it runs "backwards"... The last method I chain is the first to be executed:
const someTransformation = Transduce()
  .map(x => x + 1)
  .filter(x => x > 3)
  .map(x => x / 2);

// Instead of [ 2, 2.5 ] this returns []
// starts with (x / 2) -> [0.5, 1, 1.5, 2]
// then filters (x > 3) -> []
const myData = someTransformation.run([ 1, 2, 3, 4]);
Or, in more abstract terms:
Go from:
Transducer(concat).map(f).map(g) == (acc, x) => concat(acc, f(g(x)))
To:
Transducer(concat).map(f).map(g) == (acc, x) => concat(acc, g(f(x)))
Which is similar to:
mapper(f) (mapper(g) (concat))
I think I understand why it happens, but I can't figure out how to fix it without changing the "interface" of my function.
The question:
How can I make my Transduce method chain filter and map operations in the correct order?
Notes:
- I'm only just learning about the naming of some of the things I'm trying to do. Please let me know if I've incorrectly used the Transduce term or if there are better ways to describe the problem.
- I'm aware I can do the same using a nested for loop:
const push = (acc, x) => (acc.push(x), acc);

const ActionChain = (actions = []) => {
  const run = arr =>
    arr.reduce((acc, x) => {
      for (let i = 0, action; i < actions.length; i += 1) {
        action = actions[i];

        if (action.type === "FILTER") {
          if (action.fn(x)) {
            continue;
          }

          return acc;
        } else if (action.type === "MAP") {
          x = action.fn(x);
        }
      }

      acc.push(x);
      return acc;
    }, []);

  const addAction = type => fn =>
    ActionChain(push(actions, { type, fn }));

  return {
    map: addAction("MAP"),
    filter: addAction("FILTER"),
    run
  };
};

// Compare to regular chain to check if
// there's a performance gain
// Admittedly, in this example, it's quite small...
const naiveApproach = {
  run: arr =>
    arr
      .map(x => x + 3)
      .filter(x => x % 3 === 0)
      .map(x => x / 3)
      .filter(x => x < 40)
};

const actionChain = ActionChain()
  .map(x => x + 3)
  .filter(x => x % 3 === 0)
  .map(x => x / 3)
  .filter(x => x < 40);

const testData = Array.from(Array(100000), (x, i) => i);

console.time("naive");
const result1 = naiveApproach.run(testData);
console.timeEnd("naive");

console.time("chain");
const result2 = actionChain.run(testData);
console.timeEnd("chain");

console.log("equal:", JSON.stringify(result1) === JSON.stringify(result2));
- Here's my attempt in a stack snippet:
const filterer = pred => reducer => (acc, x) =>
  pred(x) ? reducer(acc, x) : acc;

const mapper = map => reducer => (acc, x) => reducer(acc, map(x));

const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
  map: map => Transduce(mapper(map)(reducer)),
  filter: pred => Transduce(filterer(pred)(reducer)),
  run: arr => arr.reduce(reducer, [])
});

const sameDataTransformation = Transduce()
  .map(x => x + 5)
  .filter(x => x % 2 === 0)
  .map(x => x / 2)
  .filter(x => x < 4);

// It's backwards:
// [-1, 0, 1, 2, 3]
// [-0.5, 0, 0.5, 1, 1.5]
// [0]
// [5]
console.log(sameDataTransformation.run([-1, 0, 1, 2, 3, 4, 5]));
before we know better
> I really like chaining ...
I see that, and I'll appease you, but you'll come to understand that forcing your program through a chaining API is unnatural, and more trouble than it's worth in most cases.
> const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
>   map: map => Transduce(mapper(map)(reducer)),
>   filter: pred => Transduce(filterer(pred)(reducer)),
>   run: arr => arr.reduce(reducer, [])
> });
>
> I think I understand why it happens, but I can't figure out how to fix it without changing the "interface" of my function.
The problem is indeed with your Transduce constructor. Your map and filter methods are stacking map and pred on the outside of the transducer chain, instead of nesting them inside.
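To make "outside" versus "inside" concrete, here is a minimal sketch that reuses the question's mapper and filterer. The helper names inc, gt3, half, stackedOutside and nestedInside are made up for illustration. It writes out by hand the reducer that the question's Transduce ends up building for .map(inc).filter(gt3).map(half), next to the nesting we actually want.

```javascript
// Reducer transformers, as defined in the question.
const mapper = f => reducer => (acc, x) => reducer(acc, f(x));
const filterer = pred => reducer => (acc, x) =>
  pred(x) ? reducer(acc, x) : acc;
const push = (acc, x) => (acc.push(x), acc);

// Illustrative step functions (names are assumptions, not from the question).
const inc = x => x + 1;
const gt3 = x => x > 3;
const half = x => x / 2;

// What the question's Transduce builds: every new step wraps the reducer
// built so far, so the LAST chained step is outermost and runs first.
const stackedOutside = mapper(half)(filterer(gt3)(mapper(inc)(push)));

// What we want: the FIRST chained step outermost, later steps nested inside.
const nestedInside = mapper(inc)(filterer(gt3)(mapper(half)(push)));

const backwards = [1, 2, 3, 4].reduce(stackedOutside, []);
const forwards = [1, 2, 3, 4].reduce(nestedInside, []);

console.log(backwards); // []
console.log(forwards);  // [ 2, 2.5 ]
```

The fix is therefore not in mapper or filterer but in where each new step gets inserted: it has to go next to the final reducer, not around the whole chain.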
Below, I've implemented your Transduce API that evaluates the maps and filters in correct order. I've also added a log method so that we can see how Transduce is behaving.
const Transduce = (f = k => k) => ({
  map: g =>
    Transduce(k =>
      f ((acc, x) => k(acc, g(x)))),
  filter: g =>
    Transduce(k =>
      f ((acc, x) => g(x) ? k(acc, x) : acc)),
  log: s =>
    Transduce(k =>
      f ((acc, x) => (console.log(s, x), k(acc, x)))),
  run: xs =>
    xs.reduce(f((acc, x) => acc.concat(x)), [])
})

const foo = nums => {
  return Transduce()
    .log('greater than 2?')
    .filter(x => x > 2)
    .log('  square:')
    .map(x => x * x)
    .log('    less than 30?')
    .filter(x => x < 30)
    .log('      pass')
    .run(nums)
}

// keep square(n), forall n of nums
//   where n > 2
//   where square(n) < 30
console.log(foo([1,2,3,4,5,6,7]))
// => [ 9, 16, 25 ]
untapped potential
> Inspired by this answer ...
In reading that answer I wrote, you overlooked the generic quality of Trans as it was written there. Here, our Transduce only attempts to work with Arrays, but really it can work with any type that has an empty value ([]) and a concat method. These two properties make up a category called Monoids, and we'd be doing ourselves a disservice if we didn't take advantage of a transducer's ability to work with any type in this category.
Above, we hard-coded the initial accumulator [] in the run method, but this should probably be supplied as an argument, much like we do with iterable.reduce(reducer, initialAcc).
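As a rough sketch of that idea, the variant below changes only run so that the caller passes the initial accumulator. The (init, xs) signature is an assumption made for illustration, not something from the original code. Because the final reducer only calls acc.concat(x), any starting value with a suitable concat method will do.

```javascript
// Same continuation-based Transduce as above, trimmed to map and filter.
const Transduce = (f = k => k) => ({
  map: g =>
    Transduce(k => f((acc, x) => k(acc, g(x)))),
  filter: g =>
    Transduce(k => f((acc, x) => g(x) ? k(acc, x) : acc)),
  // Hypothetical signature: the initial accumulator is an argument
  // instead of a hard-coded [].
  run: (init, xs) =>
    xs.reduce(f((acc, x) => acc.concat(x)), init)
});

const squaresAboveTwo = Transduce()
  .filter(x => x > 2)
  .map(x => x * x);

// The same pipeline collects into whatever accumulator it is given.
const asArray = squaresAboveTwo.run([], [1, 2, 3, 4]);
const asString = squaresAboveTwo.run('', [1, 2, 3, 4]);

console.log(asArray);  // [ 9, 16 ]
console.log(asString); // '916'
```

The string case works because String.prototype.concat converts its arguments to strings before appending them.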
Aside from that, both implementations are essentially equivalent. The biggest difference is that the Trans provided in the linked answer is itself a monoid, but Transduce here is not. Trans neatly implements composition of transducers in the concat method, whereas Transduce (above) has composition mixed within each method. Making it a monoid allows us to reason about Trans the same way we do all other monoids, instead of having to understand it as some specialized chaining interface with unique map, filter, and run methods.
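To illustrate what "is a monoid" means in practice, here is a small sketch that spot-checks the identity and associativity laws for Trans on one input. It uses the Trans definition that appears later in this answer. The run helper and the sample transducers a, b and c are assumptions added for the check, and a check on a single input is a sanity test, not a proof.

```javascript
// Trans monoid, as defined later in this answer.
const Trans = f => ({
  runTrans: f,
  concat: ({ runTrans: g }) => Trans(k => f(g(k)))
});
Trans.empty = () => Trans(k => k);

const mapper = f =>
  Trans(k => (acc, x) => k(acc, f(x)));
const filterer = f =>
  Trans(k => (acc, x) => f(x) ? k(acc, x) : acc);

// Hypothetical helper: run a Trans over an array, collecting into an array.
const run = (t, xs) =>
  xs.reduce(t.runTrans((acc, x) => acc.concat(x)), []);

// Sample transducers, for illustration only.
const a = mapper(x => x + 1);
const b = filterer(x => x % 2 === 0);
const c = mapper(x => x * 10);
const xs = [1, 2, 3, 4, 5];

// Identity: concatenating with empty on either side changes nothing.
const plain = run(a, xs);
const leftIdentity = run(Trans.empty().concat(a), xs);
const rightIdentity = run(a.concat(Trans.empty()), xs);

// Associativity: the grouping of concat calls does not matter.
const leftAssoc = run(a.concat(b).concat(c), xs);
const rightAssoc = run(a.concat(b.concat(c)), xs);

console.log(plain, leftIdentity, rightIdentity); // each is [ 2, 3, 4, 5, 6 ]
console.log(leftAssoc, rightAssoc);              // each is [ 20, 40, 60 ]
```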
I would advise building from Trans instead of making your own custom API.
have your cake and eat it too
So we learned the valuable lesson of uniform interfaces and we understand that Trans is inherently simple. But, you still want that sweet chaining API. OK, ok...
We're going to implement Transduce one more time, but this time we'll do so using the Trans monoid. Here, Transduce holds a Trans value instead of a continuation (Function).
Everything else stays the same: foo takes one tiny change and produces identical output.
// generic transducers
const mapper = f =>
  Trans(k => (acc, x) => k(acc, f(x)))

const filterer = f =>
  Trans(k => (acc, x) => f(x) ? k(acc, x) : acc)

const logger = label =>
  Trans(k => (acc, x) => (console.log(label, x), k(acc, x)))

// magic chaining api made with Trans monoid
const Transduce = (t = Trans.empty()) => ({
  map: f =>
    Transduce(t.concat(mapper(f))),
  filter: f =>
    Transduce(t.concat(filterer(f))),
  log: s =>
    Transduce(t.concat(logger(s))),
  run: (m, xs) =>
    transduce(t, m, xs)
})

// when we run, we must specify the type to transduce
//   .run(Array, nums)
// instead of
//   .run(nums)
Here is the final implementation. Of course, you could skip defining a separate mapper, filterer, and logger, and instead define those directly on Transduce. I think this reads nicer, though.
// Trans monoid
const Trans = f => ({
  runTrans: f,
  concat: ({runTrans: g}) =>
    Trans(k => f(g(k)))
})

Trans.empty = () =>
  Trans(k => k)

const transduce = (t, m, xs) =>
  xs.reduce(t.runTrans((acc, x) => acc.concat(x)), m.empty())

// complete Array monoid implementation
Array.empty = () => []

// generic transducers
const mapper = f =>
  Trans(k => (acc, x) => k(acc, f(x)))

const filterer = f =>
  Trans(k => (acc, x) => f(x) ? k(acc, x) : acc)

const logger = label =>
  Trans(k => (acc, x) => (console.log(label, x), k(acc, x)))

// now implemented with Trans monoid
const Transduce = (t = Trans.empty()) => ({
  map: f =>
    Transduce(t.concat(mapper(f))),
  filter: f =>
    Transduce(t.concat(filterer(f))),
  log: s =>
    Transduce(t.concat(logger(s))),
  run: (m, xs) =>
    transduce(t, m, xs)
})

// this stays the same, except that run now also takes the target type
const foo = nums => {
  return Transduce()
    .log('greater than 2?')
    .filter(x => x > 2)
    .log('  square:')
    .map(x => x * x)
    .log('    less than 30?')
    .filter(x => x < 30)
    .log('      pass')
    .run(Array, nums)
}

// output is exactly the same
console.log(foo([1,2,3,4,5,6,7]))
// => [ 9, 16, 25 ]
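Since run now takes the target type as its first argument, the same pipeline can be pointed at something other than Array. The sketch below is a minimal illustration of that, under one assumption: instead of patching Array.empty, it passes small descriptor objects (ArrayM and TextM, both hypothetical names) that only provide the empty() function that transduce actually calls.

```javascript
// Trans monoid and transduce, as in the final implementation above.
const Trans = f => ({
  runTrans: f,
  concat: ({ runTrans: g }) => Trans(k => f(g(k)))
});
Trans.empty = () => Trans(k => k);

const transduce = (t, m, xs) =>
  xs.reduce(t.runTrans((acc, x) => acc.concat(x)), m.empty());

const mapper = f =>
  Trans(k => (acc, x) => k(acc, f(x)));
const filterer = f =>
  Trans(k => (acc, x) => f(x) ? k(acc, x) : acc);

const Transduce = (t = Trans.empty()) => ({
  map: f => Transduce(t.concat(mapper(f))),
  filter: f => Transduce(t.concat(filterer(f))),
  run: (m, xs) => transduce(t, m, xs)
});

// Hypothetical type descriptors: transduce only needs m.empty().
const ArrayM = { empty: () => [] };
const TextM = { empty: () => '' };

const pipeline = Transduce()
  .filter(x => x > 2)
  .map(x => x * x)
  .filter(x => x < 30);

const nums = [1, 2, 3, 4, 5, 6, 7];
const asArray = pipeline.run(ArrayM, nums);
const asText = pipeline.run(TextM, nums);

console.log(asArray); // [ 9, 16, 25 ]
console.log(asText);  // '91625'
```

One caveat: the final reducer calls acc.concat(x) with a plain element, and Array.prototype.concat spreads array arguments, so elements that are themselves arrays would be flattened by one level.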
wrap up
So we started with a mess of lambdas and then made things simpler using a monoid. The Trans monoid provides distinct advantages in that the monoid interface is known and the generic implementation is extremely simple. But we're stubborn, or maybe we have goals to fulfill that are not set by us, so we decide to build the magic Transduce chaining API. We do so using our rock-solid Trans monoid, which gives us all the power of Trans but also keeps complexity nicely compartmentalised.
dot chaining fetishists anonymous
Here's a couple other recent answers I wrote about method chaining
- Is there any way to make a functions return accessible via a property?
- Chaining functions and using an anonymous function
- Pass result of functional chain to function