太空飞船的推进AI:将3D船降落在位置= 0且角度= 0 [英] AI of spaceship's propulsion: land a 3D ship at position=0 and angle=0

查看:97
本文介绍了太空飞船的推进AI:将3D船降落在位置= 0且角度= 0的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

对于太空游戏来说,如何操纵可同时在3D中平移和旋转的宇宙飞船是一个非常困难的问题.

该宇宙飞船的n喷射器放置在不同的位置和方向.

i个射流相对于宇宙飞船CM的变换常数= Ti.

  • 变换是位置和方向的元组(四元数或矩阵3x3或更小) ,欧拉角).
  • 变换也可以用单个4x4矩阵表示.

换句话说,所有喷气式飞机都粘在船上,无法旋转.

喷气机只能在其轴线(绿色)的方向上向飞船施加力.
由于胶水的作用,轴与飞船一起旋转.

所有喷射器都可以在一定大小(标量,fi)上施加力(矢量,Fi):
i射流只能在min_i<= fi <=max_i范围内施加力(Fi = axis x fi).
min_imax_i都是具有已知值的常数.

要清楚,min_ifimax_i的单位是牛顿.
例如.如果范围未覆盖0,则表示无法关闭喷射器.

宇宙飞船的质量= m,惯性张量= I.
太空船的当前变换= Tran0,速度= V0,angularVelocity = W0.

宇宙飞船的物理物体遵循众所周知的物理规则:-
Torque=r x F
F=ma
angularAcceleration = I^-1 x Torque
linearAcceleration = m^-1 x F

I在每个方向上都是不同的,但为简单起见,在每个方向上都具有相同的值(类似球形).因此,可以将I视为标量,而不是矩阵3x3.

问题

如何控制所有喷射器(全部为fi)以位置= 0和角度= 0降落舰艇?
类似数学的规范:查找fi(time)的函数,该函数花费最短的时间到达position=(0,0,0)orient=identity,最终的angularVelocityvelocity = 0.

更具体地说,解决此问题的技术或相关算法的名称是什么?

我的研究(1维)

如果宇宙是一维的(因此没有旋转),则该问题将很容易解决.
(感谢 Gavin Lock https://stackoverflow.com/a/40359322/3577745 )

首先,找到值MIN_BURN=sum{min_i}/mMAX_BURN=sum{max_i}/m.

其次,以相反的方式思考,假设x=0(位置)和v=0t=0
然后使用x''=MIN_BURNx''=MAX_BURN创建两个抛物线.
(假定二阶导数在一段时间内保持不变,所以它是抛物线.)

剩下的唯一工作是将两个抛物线连接在一起.
红色虚线是他们连接的地方.

x''=MAX_BURN的时间内,所有fi=max_i.
x''=MIN_BURN的时间内,所有fi=min_i.

对于1D来说确实很好用,但是在3D中,问题要困难得多.

注意:
真的很感谢能指导我正确方向的粗略指导.
我不需要完美的AI,例如它可能需要比最佳时间多一些的时间.
我考虑了1个多星期,仍然没有发现任何线索.

其他尝试/观点

  • 我认为像神经网络这样的机器学习不适合这种情况.
  • 边界约束最小二乘优化可能有用,但我不知道如何使我的两个超抛物线适合这种形式的问题.
  • 这可以通过使用许多迭代来解决,但是如何解决?
  • 我已经搜索过NASA的网站,但找不到任何有用的信息.
  • 该功能可能存在于太空工程师"游戏中.
  • 由Logman评论: 机械工程知识可能会有所帮助.
  • 由AndyG评论: 这是一个运动计划问题,其中迅速探索随机树(RRT)以及围绕 Lyapunov方程

    • 在大多数情况下,(待设计的)人工智能会连续运行(即每个时间步都被调用).
    • 因此,随着AI的调整,Tran0通常是 几乎相同,V0W0通常是 与0的差别不大,例如. |Seta0|<30 degree|W0|<5 degree per time-step.
    • 我认为基于此假设的AI在大多数情况下都可以正常工作.尽管不是完美的解决方案,但它可以被视为正确的解决方案(我开始认为,如果没有这种假设,这个问题可能会太棘手了.)
    • 我微弱地认为,这种假设可能会启用一些使用线性"近似值的技巧.


    第二个替代问题-调12变量"(更简单)

    上面的问题也可能如下所示:-

    我想使用最少的时间步长将所有六个values和六个values'(一阶导数)调整为0.

    下面的表格显示了AI可能面临的情况:-

    乘数表存储原始问题中的inertia^-1 * rmass^-1.

    乘数范围是恒定的.

    每个时间步长,将要求AI为每个第i+1次喷射选择一个值fi的元组.该值必须在[min_i,max_i]范围内. 示例. AI可以从表中选择(f0=1,f1=0.1,f2=-1).

    然后,调用方将使用fi Multiplier表相乘得到values''.
    Px'' = f0*0.2+f1*0.0+f2*0.7
    Py'' = f0*0.3-f1*0.9-f2*0.6
    Pz'' = ....................
    SetaX''= ....................
    SetaY''= ....................
    SetaZ''= f0*0.0+f1*0.0+f2*5.0

    此后,调用方将使用公式values' += values''更新所有values'.
    Px' += Px''
    .................
    SetaZ' += SetaZ''

    最后,调用者将使用公式values += values'更新所有values.
    Px += Px'
    .................
    SetaZ += SetaZ'

    每个时间步仅询问一次AI.

    AI的目的是返回fi的元组(对于不同的时间步长可能会有所不同),以使PxPyPzSetaXSetaYPx'Py'Pz'SetaX'SetaY'SetaZ' = 0(或非常接近),
    通过尽可能减少时间步长.

    我希望提供对该问题的另一种观点将使其变得更容易.
    这不是完全相同的问题,但我认为可以解决此问题的解决方案可以使我非常接近原始问题的答案.

    这个替代问题的答案可能非常有用.



    第三个替代问题-调整6个变量"(最简单)

    这是先前替代方法的有损简化版本.

    唯一的区别是世界现在是2D,Fi也是2D(x,y).

    因此,我必须使用尽可能少的时间步长来仅调整PxPySetaZPx'Py'SetaZ' = 0.

    这个最简单的替代问题的答案被认为是有用的.

    解决方案

    我将尽量保持简短而甜美.

    一种通常用于解决模拟中这些问题的方法是快速探索随机树.为了使我的职位至少具有一点可信度,我承认我已经研究了这些知识,运动计划是我研究实验室的专业领域(概率运动计划).

    要阅读的典型论文是史蒂文·拉瓦尔(Steven LaValle)的快速探索随机树:一种用于路径规划的新工具,至今已发表了100万篇论文,并且都以某种方式对其进行了改进.

    首先,我将介绍RRT的最基本描述,然后再描述在具有动态约束的情况下RRT的变化.事后我会摆弄给你:

    术语

    空格"

    您的飞船状态可以通过其3维位置(x,y,z)和其3维旋转(alpha,beta,gamma)来描述(我使用这些希腊名称,因为它们是欧拉角).

    状态空间是您的飞船可以居住的所有可能的位置和旋转角度.当然,这是无限的.

    冲突空间是所有无效"状态.即实际上不可能的职位.在这些状态下,您的飞船与某些障碍物发生碰撞(对于其他物体,这也将包括与自身的碰撞,例如规划一条链的长度).缩写为C-Space.

    可用空间是不是冲突空间的任何东西.

    通用方法(没有动力学约束)

    对于没有动力约束的物体,该方法非常简单:

    1. 采样状态
    2. 找到该州最近的邻居
    3. 试图规划邻居与州之间的路线

    我将简要讨论每个步骤

    对状态进行采样

    在最基本的情况下对状态进行采样意味着为状态空间中的每个条目选择随机值.如果我们对您的太空飞船进行了此操作,则会在所有可能的值中进行x,y,z,alpha,beta,gamma的随机采样(均匀随机采样).

    当然,与自由空间相比,通常更多的空间是障碍空间(因为您通常将所讨论的对象限制在要在其内部移动的某些环境"内).因此,通常要做的是获取环境的边界立方体并在其中采样位置(x,y,z),现在我们有更多的机会在自由空间中进行采样.

    在RRT中,您将在大部分时间中进行随机采样.但是实际上您很有可能会选择下一个样本作为目标状态(尝试一下,从0.05开始).这是因为您需要定期测试以查看从开始到目标的路径是否可用.

    找到最近的邻居处于采样状态

    您选择了一些固定的整数> 0.我们将该整数称为k.您k最近的邻居在州空间中.这意味着您有一些距离度量,可以告诉您状态之间的距离.最基本的距离指标是欧几里得距离,它仅考虑物理距离,并不在意旋转角度(因为在最简单的情况下,您可以在单个时间步中旋转360度).

    最初,您只会拥有起始位置,因此它将是最近的邻居列表中的唯一候选人.

    规划国家之间的路线

    这称为本地规划.在现实世界中,您知道要去的地方,并且一路上需要躲避其他人和移动的物体.我们不会担心这里的那些事情.在我们的计划世界中,我们假设宇宙是静态的,但对我们而言.

    最常见的假设是在采样状态与其最近邻之间进行线性插值.邻居(即已经在树中的节点)沿着此线性插值一点一点地移动,直到达到采样配置,或者行进某个最大距离(回想您的距离度量).

    这里发生的是您的树正朝着样本生长.当我说您一步一步地"走时,我的意思是您定义了一些增量" (一个非常小的值),并沿线性插值在每个 timestep 内移动很多.在每个点上,您都要检查新状态是否与某些障碍物发生碰撞.如果遇到障碍,则将最后一个有效的配置保留在树中(不要忘记以某种方式存储边缘!)因此,对于本地计划者而言,您需要的是:

    • 冲突检查
    • 如何内插"在两个州之间(对于您的问题,您无需担心,因为我们会做一些不同的事情).
    • 用于时间步长的物理模拟(欧拉积分相当普遍,但不如类似方法稳定 Runge-Kutta .幸运的是,您已经有了一个物理模型!

    动态约束的修改

    当然,如果我们假设您可以在状态之间进行线性插值,则会违反您为飞船定义的物理原理.因此,我们对RRT进行了如下修改:

    • 我们对随机的控件进行采样,而不是对随机状态进行采样,并在固定的时间段内(或直到发生冲突)应用所述控件.

    之前,当我们对随机状态进行采样时,我们真正要做的是选择(在状态空间中)移动的方向.现在我们有了约束,我们随机采样了控件,实际上这是同一件事,只是我们保证不会违反约束.

    在固定时间间隔(或直到发生碰撞)上应用控件后,将一个节点添加到树中,控件存储在边缘.您的树将快速生长以探索空间.该控制应用程序替代了树状状态和采样状态之间的线性插值.

    对控件进行采样

    您有n个喷头,它们分别具有可以施加的最小和最大作用力.在每个射流的最小和最大力 内采样.

    我将控件应用于哪个节点?

    您可以随意选择,也可以偏向选择以选择最接近目标状态的节点(需要距离度量).随着时间的推移,这种偏见将尝试使节点增长到更接近目标的位置.

    现在,采用这种方法,您不太可能完全达到目标,因此您需要定义一些足够接近"的定义.即,您将使用距离度量来找到距目标状态最近的邻居,然后测试它们的足够近".这种足够接近"指标可以不同于您的距离指标.如果您使用的是欧几里德距离,但是目标配置也要正确旋转非常重要,那么您可能需要修改足够近"距离.衡量角度差异的指标.

    什么是足够接近"?完全取决于您.这也是您需要调整的东西,一百万篇论文试图使您与第一本书更加接近.

    结论

    这种随机采样听起来很荒谬,但是您的树会长出来以非常快地探索您的自由空间.在RRT上观看一些youtube视频以进行路径规划.我们不能保证有一种叫做概率完整性"的东西.具有动态约束,但通常是足够好".有时可能不存在解决方案,因此您需要在其中放置一些逻辑以在一段时间后停止生长树(例如20,000个样本)

    更多资源:

    从这些开始,然后开始研究他们的引文,然后开始研究谁在引用他们.

    This is a very difficult problem about how to maneuver a spaceship that can both translate and rotate in 3D, for a space game.

    The spaceship has n jets placing in various positions and directions.

    Transformation of i-th jet relative to the CM of spaceship is constant = Ti.

    • Transformation is a tuple of position and orientation (quaternion or matrix 3x3 or, less preferable, Euler angles).
    • A transformation can also be denoted by a single matrix 4x4.

    In other words, all jet are glued to the ship and cannot rotate.

    A jet can exert force to the spaceship only in direction of its axis (green).
    As a result of glue, the axis rotated along with the spaceship.

    All jets can exert force (vector,Fi) at a certain magnitude (scalar,fi) :
    i-th jet can exert force (Fi= axis x fi) only within range min_i<= fi <=max_i.
    Both min_i and max_i are constant with known value.

    To be clear, unit of min_i,fi,max_i is Newton.
    Ex. If the range doesn't cover 0, it means that the jet can't be turned off.

    The spaceship's mass = m and inertia tensor = I.
    The spaceship's current transformation = Tran0, velocity = V0, angularVelocity = W0.

    The spaceship physic body follows well-known physic rules :-
    Torque=r x F
    F=ma
    angularAcceleration = I^-1 x Torque
    linearAcceleration = m^-1 x F

    I is different for each direction, but for the sake of simplicity, it has the same value for every direction (sphere-like). Thus, I can be thought as a scalar instead of matrix 3x3.

    Question

    How to control all jets (all fi) to land the ship with position=0 and angle=0?
    Math-like specification: Find function of fi(time) that take minimum time to reach position=(0,0,0), orient=identity with final angularVelocity and velocity = zero.

    More specifically, what are names of technique or related algorithms to solve this problem?

    My research (1 dimension)

    If the universe is 1D (thus, no rotation), the problem will be easy to solve.
    ( Thank Gavin Lock, https://stackoverflow.com/a/40359322/3577745 )

    First, find the value MIN_BURN=sum{min_i}/m and MAX_BURN=sum{max_i}/m.

    Second, think in opposite way, assume that x=0 (position) and v=0 at t=0,
    then create two parabolas with x''=MIN_BURN and x''=MAX_BURN.
    (The 2nd derivative is assumed to be constant for a period of time, so it is parabola.)

    The only remaining work is to join two parabolas together.
    The red dash line is where them join.

    In the period of time that x''=MAX_BURN, all fi=max_i.
    In the period of time that x''=MIN_BURN, all fi=min_i.

    It works really well for 1D, but in 3D, the problem is far more harder.

    Note:
    Just a rough guide pointing me to a correct direction is really appreciated.
    I don't need a perfect AI, e.g. it can take a little more time than optimum.
    I think about it for more than 1 week, still find no clue.

    Other attempts / opinions

    • I don't think machine learning like neural network is appropriate for this case.
    • Boundary-constrained-least-square-optimisation may be useful but I don't know how to fit my two hyper-parabola to that form of problem.
    • This may be solved by using many iterations, but how?
    • I have searched NASA's website, but not find anything useful.
    • The feature may exist in "Space Engineer" game.
    • Commented by Logman: Knowledge in mechanical engineering may help.
    • Commented by AndyG: It is a motion planning problem with nonholonomic constraints. It could be solved by Rapidly exploring random tree (RRTs), theory around Lyapunov equation, and Linear quadratic regulator.
    • Commented by John Coleman: This seems more like optimal control than AI.

    Edit: "Near-0 assumption" (optional)

    • In most case, AI (to be designed) run continuously (i.e. called every time-step).
    • Thus, with the AI's tuning, Tran0 is usually near-identity, V0 and W0 are usually not so different from 0, e.g. |Seta0|<30 degree,|W0|<5 degree per time-step .
    • I think that AI based on this assumption would work OK in most case. Although not perfect, it can be considered as a correct solution (I started to think that without this assumption, this question might be too hard).
    • I faintly feel that this assumption may enable some tricks that use some "linear"-approximation.


    The 2nd Alternative Question - "Tune 12 Variables" (easier)

    The above question might also be viewed as followed :-

    I want to tune all six values and six values' (1st-derivative) to be 0, using lowest amount of time-steps.

    Here is a table show a possible situation that AI can face:-

    The Multiplier table stores inertia^-1 * r and mass^-1 from the original question.

    The Multiplier and Range are constant.

    Each timestep, the AI will be asked to pick a tuple of values fi that must be in the range [min_i,max_i] for every i+1-th jet.
    Ex. From the table, AI can pick (f0=1,f1=0.1,f2=-1).

    Then, the caller will use fi to multiply with the Multiplier table to get values''.
    Px'' = f0*0.2+f1*0.0+f2*0.7
    Py'' = f0*0.3-f1*0.9-f2*0.6
    Pz'' = ....................
    SetaX''= ....................
    SetaY''= ....................
    SetaZ''= f0*0.0+f1*0.0+f2*5.0

    After that, the caller will update all values' with formula values' += values''.
    Px' += Px''
    .................
    SetaZ' += SetaZ''

    Finally, the caller will update all values with formula values += values'.
    Px += Px'
    .................
    SetaZ += SetaZ'

    AI will be asked only once for each time-step.

    The objective of AI is to return tuples of fi (can be different for different time-step), to make Px,Py,Pz,SetaX,SetaY,SetaZ,Px',Py',Pz',SetaX',SetaY',SetaZ' = 0 (or very near),
    by using least amount of time-steps as possible.

    I hope providing another view of the problem will make it easier.
    It is not the exact same problem, but I feel that a solution that can solve this version can bring me very close to the answer of the original question.

    An answer for this alternate question can be very useful.



    The 3rd Alternative Question - "Tune 6 Variables" (easiest)

    This is a lossy simplified version of the previous alternative.

    The only difference is that the world is now 2D, Fi is also 2D (x,y).

    Thus I have to tune only Px,Py,SetaZ,Px',Py',SetaZ'=0, by using least amount of time-steps as possible.

    An answer to this easiest alternative question can be considered useful.

    解决方案

    I'll try to keep this short and sweet.

    One approach that is often used to solve these problems in simulation is a Rapidly-Exploring Random Tree. To give at least a little credibility to my post, I'll admit I studied these, and motion planning was my research lab's area of expertise (probabilistic motion planning).

    The canonical paper to read on these is Steven LaValle's Rapidly-exploring random trees: A new tool for path planning, and there have been a million papers published since that all improve on it in some way.

    First I'll cover the most basic description of an RRT, and then I'll describe how it changes when you have dynamical constraints. I'll leave fiddling with it afterwards up to you:

    Terminology

    "Spaces"

    The state of your spaceship can be described by its 3-dimension position (x, y, z) and its 3-dimensional rotation (alpha, beta, gamma) (I use those greek names because those are the Euler angles).

    state space is all possible positions and rotations your spaceship can inhabit. Of course this is infinite.

    collision space are all of the "invalid" states. i.e. realistically impossible positions. These are states where your spaceship is in collision with some obstacle (With other bodies this would also include collision with itself, for example planning for a length of chain). Abbreviated as C-Space.

    free space is anything that is not collision space.

    General Approach (no dynamics constraints)

    For a body without dynamical constraints the approach is fairly straightforward:

    1. Sample a state
    2. Find nearest neighbors to that state
    3. Attempt to plan a route between the neighbors and the state

    I'll briefly discuss each step

    Sampling a state

    Sampling a state in the most basic case means choosing at random values for each entry in your state space. If we did this with your space ship, we'd randomly sample for x, y, z, alpha, beta, gamma across all of their possible values (uniform random sampling).

    Of course way more of your space is obstacle space than free space typically (because you usually confine your object in question to some "environment" you want to move about inside of). So what is very common to do is to take the bounding cube of your environment and sample positions within it (x, y, z), and now we have a lot higher chance to sample in the free space.

    In an RRT, you'll sample randomly most of the time. But with some probability you will actually choose your next sample to be your goal state (play with it, start with 0.05). This is because you need to periodically test to see if a path from start to goal is available.

    Finding nearest neighbors to a sampled state

    You chose some fixed integer > 0. Let's call that integer k. Your k nearest neighbors are nearby in state space. That means you have some distance metric that can tell you how far away states are from each other. The most basic distance metric is Euclidean distance, which only accounts for physical distance and doesn't care about rotational angles (because in the simplest case you can rotate 360 degrees in a single timestep).

    Initially you'll only have your starting position, so it will be the only candidate in the nearest neighbor list.

    Planning a route between states

    This is called local planning. In a real-world scenario you know where you're going, and along the way you need to dodge other people and moving objects. We won't worry about those things here. In our planning world we assume the universe is static but for us.

    What's most common is to assume some linear interpolation between the sampled state and its nearest neighbor. The neighbor (i.e. a node already in the tree) is moved along this linear interpolation bit by bit until it either reaches the sampled configuration, or it travels some maximum distance (recall your distance metric).

    What's going on here is that your tree is growing towards the sample. When I say that you step "bit by bit" I mean you define some "delta" (a really small value) and move along the linear interpolation that much each timestep. At each point you check to see if you the new state is in collision with some obstacle. If you hit an obstacle, you keep the last valid configuration as part of the tree (don't forget to store the edge somehow!) So what you'll need for a local planner is:

    • Collision checking
    • how to "interpolate" between two states (for your problem you don't need to worry about this because we'll do something different).
    • A physics simulation for timestepping (Euler integration is quite common, but less stable than something like Runge-Kutta. Fortunately you already have a physics model!

    Modification for dynamical constraints

    Of course if we assume you can linearly interpolate between states, we'll violate the physics you've defined for your spaceship. So we modify the RRT as follows:

    • Instead of sampling random states, we sample random controls and apply said controls for a fixed time period (or until collision).

    Before, when we sampled random states, what we were really doing was choosing a direction (in state space) to move. Now that we have constraints, we randomly sample our controls, which is effectively the same thing, except we're guaranteed not to violate our constraints.

    After you apply your control for a fixed time interval (or until collision), you add a node to the tree, with the control stored on the edge. Your tree will grow very fast to explore the space. This control application replaces linear interpolation between tree states and sampled states.

    Sampling the controls

    You have n jets that individually have some min and max force they can apply. Sample within that min and max force for each jet.

    Which node(s) do I apply my controls to?

    Well you can choose at random, or your can bias the selection to choose nodes that are nearest to your goal state (need the distance metric). This biasing will try to grow nodes closer to the goal over time.

    Now, with this approach, you're unlikely to exactly reach your goal, so you need to define some definition of "close enough". That is, you will use your distance metric to find nearest neighbors to your goal state, and then test them for "close enough". This "close enough" metric can be different than your distance metric, or not. If you're using Euclidean distance, but it's very important that you goal configuration is also rotated properly, then you may want to modify the "close enough" metric to look at angle differences.

    What is "close enough" is entirely up to you. Also something for you to tune, and there are a million papers that try to get you a lot closer in the first place.

    Conclusion

    This random sampling may sound ridiculous, but your tree will grow to explore your free space very quickly. See some youtube videos on RRT for path planning. We can't guarantee something called "probabilistic completeness" with dynamical constraints, but it's usually "good enough". Sometimes it'll be possible that a solution does not exist, so you'll need to put some logic in there to stop growing the tree after a while (20,000 samples for example)

    More Resources:

    Start with these, and then start looking into their citations, and then start looking into who is citing them.

    这篇关于太空飞船的推进AI:将3D船降落在位置= 0且角度= 0的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆