How to chain map and filter functions in the correct order

Question

I really like chaining Array.prototype.map, filter and reduce to define a data transformation. Unfortunately, in a recent project that involved large log files, I could no longer get away with looping through my data multiple times...

My goal:

I want to create a function that chains .filter and .map methods but, instead of mapping over the array immediately, composes a single function that loops over the data once. I.e.:

const DataTransformation = () => ({ 
    map: fn => (/* ... */), 
    filter: fn => (/* ... */), 
    run: arr => (/* ... */)
});

const someTransformation = DataTransformation()
    .map(x => x + 1)
    .filter(x => x > 3)
    .map(x => x / 2);

// returns [ 2, 2.5 ] without creating [ 2, 3, 4, 5] and [4, 5] in between
const myData = someTransformation.run([ 1, 2, 3, 4]); 

My attempt:

Inspired by this answer and this blogpost I started writing a Transduce function.

const filterer = pred => reducer => (acc, x) =>
    pred(x) ? reducer(acc, x) : acc;

const mapper = map => reducer => (acc, x) =>
    reducer(acc, map(x));

const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
    map: map => Transduce(mapper(map)(reducer)),
    filter: pred => Transduce(filterer(pred)(reducer)),
    run: arr => arr.reduce(reducer, [])
});

The problem:

The problem with the Transduce snippet above is that it runs "backwards": the last method I chain is the first to be executed.

const someTransformation = Transduce()
    .map(x => x + 1)
    .filter(x => x > 3)
    .map(x => x / 2);

// Instead of [ 2, 2.5 ] this returns []
//  starts with (x / 2)       -> [0.5, 1, 1.5, 2] 
//  then filters (x > 3)      -> []
const myData = someTransformation.run([ 1, 2, 3, 4]);

Or, in more abstract terms:

Go from:

Transducer(concat).map(f).map(g) == (acc, x) => concat(acc, f(g(x)))

To:

Transducer(concat).map(f).map(g) == (acc, x) => concat(acc, g(f(x)))

Which is similar to:

mapper(f) (mapper(g) (concat))
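
To make the two nestings concrete, here is a small sketch. It reuses mapper from the attempt above; push, inc, and dbl are made-up helpers, picked only so the order of application shows up in the output.

```javascript
// mapper exactly as in the attempt above
const mapper = map => reducer => (acc, x) => reducer(acc, map(x));

// Hypothetical helpers, used only to make the order visible
const push = (acc, x) => (acc.push(x), acc);
const inc = x => x + 1;
const dbl = x => x * 2;

// What Transduce(push).map(inc).map(dbl) builds today:
// the later call wraps the earlier one, so dbl sees the raw element first
const chained = mapper(dbl)(mapper(inc)(push));

// The nesting that would give the intended left-to-right order
const intended = mapper(inc)(mapper(dbl)(push));

console.log([1, 2, 3].reduce(chained, []));  // [ 3, 5, 7 ]  i.e. inc(dbl(x))
console.log([1, 2, 3].reduce(intended, [])); // [ 4, 6, 8 ]  i.e. dbl(inc(x))
```

So the first method in the chain has to end up as the outermost wrapper, which is the opposite of what wrapping the accumulated reducer on every call produces.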

I think I understand why it happens, but I can't figure out how to fix it without changing the "interface" of my function.

The question:

How can I make my Transduce method chain filter and map operations in the correct order?


Notes:

  • I'm only just learning about the naming of some of the things I'm trying to do. Please let me know if I've incorrectly used the Transduce term or if there are better ways to describe the problem.
  • I'm aware I can do the same using a nested for loop:

const push = (acc, x) => (acc.push(x), acc);
const ActionChain = (actions = []) => {
  const run = arr =>
    arr.reduce((acc, x) => {
      for (let i = 0, action; i < actions.length; i += 1) {
        action = actions[i];

        if (action.type === "FILTER") {
          if (action.fn(x)) {
            continue;
          }

          return acc;
        } else if (action.type === "MAP") {
          x = action.fn(x);
        }
      }

      acc.push(x);
      return acc;
    }, []);

  const addAction = type => fn => 
    ActionChain(push(actions, { type, fn }));

  return {
    map: addAction("MAP"),
    filter: addAction("FILTER"),
    run
  };
};

// Compare to regular chain to check if 
// there's a performance gain
// Admittedly, in this example, it's quite small...
const naiveApproach = {
  run: arr =>
    arr
      .map(x => x + 3)
      .filter(x => x % 3 === 0)
      .map(x => x / 3)
      .filter(x => x < 40)
};

const actionChain = ActionChain()
  .map(x => x + 3)
  .filter(x => x % 3 === 0)
  .map(x => x / 3)
  .filter(x => x < 40)


const testData = Array.from(Array(100000), (x, i) => i);

console.time("naive");
const result1 = naiveApproach.run(testData);
console.timeEnd("naive");

console.time("chain");
const result2 = actionChain.run(testData);
console.timeEnd("chain");
console.log("equal:", JSON.stringify(result1) === JSON.stringify(result2));

  • Here's my attempt in a stack snippet:

const filterer = pred => reducer => (acc, x) =>
  pred(x) ? reducer(acc, x) : acc;

const mapper = map => reducer => (acc, x) => reducer(acc, map(x));

const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
  map: map => Transduce(mapper(map)(reducer)),
  filter: pred => Transduce(filterer(pred)(reducer)),
  run: arr => arr.reduce(reducer, [])
});

const sameDataTransformation = Transduce()
  .map(x => x + 5)
  .filter(x => x % 2 === 0)
  .map(x => x / 2)
  .filter(x => x < 4);
  
// It's backwards:
// [-1, 0, 1, 2, 3]
// [-0.5, 0, 0.5, 1, 1.5]
// [0]
// [5]
console.log(sameDataTransformation.run([-1, 0, 1, 2, 3, 4, 5]));

Solution

before we know better

"I really like chaining ..."

I see that, and I'll appease you, but you'll come to understand that forcing your program through a chaining API is unnatural, and more trouble than it's worth in most cases.

const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
  map: map => Transduce(mapper(map)(reducer)),
  filter: pred => Transduce(filterer(pred)(reducer)),
  run: arr => arr.reduce(reducer, [])
});

"I think I understand why it happens, but I can't figure out how to fix it without changing the "interface" of my function."

The problem is indeed with your Transduce constructor. Your map and filter methods are stacking map and pred on the outside of the transducer chain, instead of nesting them inside.

Below, I've implemented your Transduce API so that it evaluates the maps and filters in the correct order. I've also added a log method so that we can see how Transduce is behaving.

const Transduce = (f = k => k) => ({
  map: g =>
    Transduce(k =>
      f ((acc, x) => k(acc, g(x)))),
  filter: g =>
    Transduce(k =>
      f ((acc, x) => g(x) ? k(acc, x) : acc)),
  log: s =>
    Transduce(k =>
      f ((acc, x) => (console.log(s, x), k(acc, x)))),
  run: xs =>
    xs.reduce(f((acc, x) => acc.concat(x)), [])
})

const foo = nums => {
  return Transduce()
    .log('greater than 2?')
    .filter(x => x > 2)
    .log('	square:')
    .map(x => x * x)
    .log('		less than 30?')
    .filter(x => x < 30)
    .log('			pass')
    .run(nums)
}

// keep square(n), forall n of nums
//   where n > 2
//   where square(n) < 30
console.log(foo([1,2,3,4,5,6,7]))
// => [ 9, 16, 25 ]
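
As a quick check against the example from the question, the sketch below runs the same Transduce (with log left out for brevity) on [1, 2, 3, 4]. The trace array and the labels pushed into it are hypothetical additions, there only to record the order in which the steps see each element.

```javascript
// Transduce as defined above, minus the log method
const Transduce = (f = k => k) => ({
  map: g => Transduce(k => f((acc, x) => k(acc, g(x)))),
  filter: g => Transduce(k => f((acc, x) => g(x) ? k(acc, x) : acc)),
  run: xs => xs.reduce(f((acc, x) => acc.concat(x)), [])
})

// Hypothetical trace of every callback invocation, in call order
const trace = []

const someTransformation = Transduce()
  .map(x => (trace.push(`inc ${x}`), x + 1))
  .filter(x => (trace.push(`gt3 ${x}`), x > 3))
  .map(x => (trace.push(`half ${x}`), x / 2))

const myData = someTransformation.run([1, 2, 3, 4])

console.log(myData) // [ 2, 2.5 ]
console.log(trace)
// [ 'inc 1', 'gt3 2', 'inc 2', 'gt3 3',
//   'inc 3', 'gt3 4', 'half 4', 'inc 4', 'gt3 5', 'half 5' ]
```

The trace shows each element passing through every step before the next element is read, so the input is traversed once and no intermediate arrays such as [2, 3, 4, 5] are built. One caveat: acc.concat(x) copies the accumulator on every step, so for very large inputs a mutating push (as in the question's default reducer) may be the cheaper choice.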


untapped potential

"Inspired by this answer ..."

In reading that answer I wrote, you overlooked the generic quality of Trans as it was written there. Here, our Transduce only attempts to work with Arrays, but really it can work with any type that has an empty value (for Arrays, []) and a concat method. Types with these two properties are called monoids, and we'd be doing ourselves a disservice if we didn't take advantage of a transducer's ability to work with any type in this category.

Above, we hard-coded the initial accumulator [] in the run method, but this should probably be supplied as an argument, much like we do with iterable.reduce(reducer, initialAcc).
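
A minimal sketch of that idea, assuming a variant named TransduceWithInit (a hypothetical name, not part of the code above) whose run takes the initial accumulator as its first argument. Anything with a concat method can then serve as the accumulator, for example a string:

```javascript
// Hypothetical variant of Transduce: run receives the initial accumulator
const TransduceWithInit = (f = k => k) => ({
  map: g => TransduceWithInit(k => f((acc, x) => k(acc, g(x)))),
  filter: g => TransduceWithInit(k => f((acc, x) => g(x) ? k(acc, x) : acc)),
  run: (init, xs) => xs.reduce(f((acc, x) => acc.concat(x)), init)
})

// Keep odd numbers, then multiply by 10
const pipeline = TransduceWithInit()
  .filter(x => x % 2 === 1)
  .map(x => x * 10)

// Same pipeline, two different accumulator types
console.log(pipeline.run([], [1, 2, 3, 4, 5])) // [ 10, 30, 50 ]
console.log(pipeline.run('', [1, 2, 3, 4, 5])) // '103050'
```

The string case works because String.prototype.concat converts its arguments to strings before appending them.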

Aside from that, both implementations are essentially equivalent. The biggest difference is that, in the linked answer, Trans itself is a monoid, but Transduce here is not. Trans neatly implements composition of transducers in the concat method, whereas Transduce (above) has composition mixed into each method. Making it a monoid allows us to reason about Trans the same way we do all other monoids, instead of having to understand it as some specialized chaining interface with unique map, filter, and run methods.

I would advise building from Trans instead of making your own custom API.
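
For a feel of what building from Trans looks like without any chaining wrapper, here is a short sketch. Trans, mapper, filterer, and transduce match the final implementation further down; ArrayMonoid is a hypothetical stand-in object used here so that the built-in Array does not have to be patched, and inc, gt3, and half are illustrative names.

```javascript
// Trans monoid and helpers, as in the final implementation further down
const Trans = f => ({
  runTrans: f,
  concat: ({runTrans: g}) => Trans(k => f(g(k)))
})
Trans.empty = () => Trans(k => k)

const mapper = f => Trans(k => (acc, x) => k(acc, f(x)))
const filterer = f => Trans(k => (acc, x) => f(x) ? k(acc, x) : acc)

const transduce = (t, m, xs) =>
  xs.reduce(t.runTrans((acc, x) => acc.concat(x)), m.empty())

// Hypothetical stand-in for the Array monoid (only empty() is needed here)
const ArrayMonoid = { empty: () => [] }

const inc = mapper(x => x + 1)
const gt3 = filterer(x => x > 3)
const half = mapper(x => x / 2)

// Composition is just concat, applied left to right
const left = inc.concat(gt3).concat(half)
// Grouping does not matter (associativity) ...
const right = inc.concat(gt3.concat(half))
// ... and Trans.empty() changes nothing (identity)
const withEmpty = Trans.empty().concat(left).concat(Trans.empty())

console.log(transduce(left, ArrayMonoid, [1, 2, 3, 4]))      // [ 2, 2.5 ]
console.log(transduce(right, ArrayMonoid, [1, 2, 3, 4]))     // [ 2, 2.5 ]
console.log(transduce(withEmpty, ArrayMonoid, [1, 2, 3, 4])) // [ 2, 2.5 ]
```

Checking a single input like this only illustrates the identity and associativity laws; it does not prove them.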


have your cake and eat it too

So we learned the valuable lesson of uniform interfaces and we understand that Trans is inherently simple. But, you still want that sweet chaining API. OK, ok...

We're going to implement Transduce one more time, but this time we'll do so using the Trans monoid. Here, Transduce holds a Trans value instead of a continuation (Function).

Everything else stays the same – foo takes 1 tiny change and produces an identical output.

// generic transducers
const mapper = f =>
  Trans(k => (acc, x) => k(acc, f(x)))

const filterer = f =>
  Trans(k => (acc, x) => f(x) ? k(acc, x) : acc)

const logger = label =>
  Trans(k => (acc, x) => (console.log(label, x), k(acc, x)))

// magic chaining api made with Trans monoid
const Transduce = (t = Trans.empty()) => ({
  map: f =>
    Transduce(t.concat(mapper(f))),
  filter: f =>
    Transduce(t.concat(filterer(f))),
  log: s =>
    Transduce(t.concat(logger(s))),
  run: (m, xs) =>
    transduce(t, m, xs)
})

// when we run, we must specify the type to transduce
//   .run(Array, nums)
// instead of
//   .run(nums)

Here is the final implementation. Of course, you could skip defining a separate mapper, filterer, and logger, and instead define those directly on Transduce. I think this reads nicer, though.

// Trans monoid
const Trans = f => ({
  runTrans: f,
  concat: ({runTrans: g}) =>
    Trans(k => f(g(k)))
})

Trans.empty = () =>
  Trans(k => k)

const transduce = (t, m, xs) =>
  xs.reduce(t.runTrans((acc, x) => acc.concat(x)), m.empty())

// complete Array monoid implementation
Array.empty = () => []

// generic transducers
const mapper = f =>
  Trans(k => (acc, x) => k(acc, f(x)))
  
const filterer = f =>
  Trans(k => (acc, x) => f(x) ? k(acc, x) : acc)
  
const logger = label =>
  Trans(k => (acc, x) => (console.log(label, x), k(acc, x)))

// now implemented with Trans monoid
const Transduce = (t = Trans.empty()) => ({
  map: f =>
    Transduce(t.concat(mapper(f))),
  filter: f =>
    Transduce(t.concat(filterer(f))),
  log: s =>
    Transduce(t.concat(logger(s))),
  run: (m, xs) =>
    transduce(t, m, xs)
})

// this stays the same, apart from the .run(Array, nums) call
const foo = nums => {
  return Transduce()
    .log('greater than 2?')
    .filter(x => x > 2)
    .log('	square:')
    .map(x => x * x)
    .log('		less than 30?')
    .filter(x => x < 30)
    .log('			pass')
    .run(Array, nums)
}

// output is exactly the same
console.log(foo([1,2,3,4,5,6,7]))
// => [ 9, 16, 25 ]
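
Since run now takes the target type, the same chain can fold into something other than an Array. The sketch below assumes a Sum monoid, which is a hypothetical type made up for this illustration and not part of the code above. Every value reaching the end of the chain must itself be a Sum, hence the final .map(Sum).

```javascript
// Trans-based Transduce, as above (log omitted for brevity)
const Trans = f => ({
  runTrans: f,
  concat: ({runTrans: g}) => Trans(k => f(g(k)))
})
Trans.empty = () => Trans(k => k)

const transduce = (t, m, xs) =>
  xs.reduce(t.runTrans((acc, x) => acc.concat(x)), m.empty())

const mapper = f => Trans(k => (acc, x) => k(acc, f(x)))
const filterer = f => Trans(k => (acc, x) => f(x) ? k(acc, x) : acc)

const Transduce = (t = Trans.empty()) => ({
  map: f => Transduce(t.concat(mapper(f))),
  filter: f => Transduce(t.concat(filterer(f))),
  run: (m, xs) => transduce(t, m, xs)
})

// Hypothetical Sum monoid: empty is Sum(0), concat adds the wrapped numbers
const Sum = n => ({
  value: n,
  concat: other => Sum(n + other.value)
})
Sum.empty = () => Sum(0)

// Sum of square(n), for all n of nums where n > 2 and square(n) < 30
const total = Transduce()
  .filter(x => x > 2)
  .map(x => x * x)
  .filter(x => x < 30)
  .map(Sum) // wrap each surviving value so acc.concat(x) adds it
  .run(Sum, [1, 2, 3, 4, 5, 6, 7])

console.log(total.value) // 50  (9 + 16 + 25)
```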


wrap up

So we started with a mess of lambdas and then made things simpler using a monoid. The Trans monoid provides distinct advantages in that the monoid interface is known and the generic implementation is extremely simple. But we're stubborn or maybe we have goals to fulfill that are not set by us – we decide to build the magic Transduce chaining API, but we do so using our rock-solid Trans monoid which gives us all the power of Trans but also keeps complexity nicely compartmentalised.


dot chaining fetishists anonymous

Here's a couple other recent answers I wrote about method chaining
