我怎样才能最好地“并行化"?蛮力攻击中一组四个嵌套的for()循环? [英] How can I best "parallelise" a set of four nested for()-loops in a Brute-Force attack?

查看:202
本文介绍了我怎样才能最好地“并行化"?蛮力攻击中一组四个嵌套的for()循环?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下家庭作业:
我需要使用以下掩码强行破解4字符密码

@ @

(其中 @ -是数字字符, % -是字母字符)

使用OpenMP在多个线程中.

这是一段代码,但是我不确定它是否在做正确的事情:

int i, j, m, n;

const char alph[26] = "abcdefghijklmnopqrstuvwxyz";
const char num[10] = "0123456789";

#pragma omp parallel for private(pass) schedule(dynamic) collapse(4)
for (i = 0; i < 26; i++)
    for (j = 0; j < 26; j++)
        for (m = 0; m < 10; m++)
            for (n = 0; n < 10; n++) {
                pass[0] = alph[i];
                pass[1] = alph[j];
                pass[2] = num[m];
                pass[3] = num[n];

                /* Working with pass here */

            }

我的问题是:
如何正确指定"parallel for"指令,以便在多个内核之间分配密码短语的范围?

非常感谢您的帮助.

解决方案

您的代码非常正确,除了使用alph而不是num.如果您能够在循环中定义pass变量,那么将省去很多麻烦.

完整的MWE可能如下所示:

//Compile with, e.g.: gcc -O3 temp.c -std=c99 -fopenmp
#include <stdio.h>
#include <unistd.h>
#include <string.h>

int PassCheck(char *pass){
  usleep(50); //Sleep for 100 microseconds to simulate work
  return strncmp(pass, "qr34", 4)==0;
}

int main(){
  const char alph[27] = "abcdefghijklmnopqrstuvwxyz";
  const char num[11]  = "0123456789";
  char goodpass[5] = "----"; //Provide a default password to indicate an error state

  int i, j, m, n;

  #pragma omp parallel for collapse(4)
  for (i = 0; i < 26; i++)
  for (j = 0; j < 26; j++)
  for (m = 0; m < 10; m++)
  for (n = 0; n < 10; n++){
    char pass[4];
    pass[0] = alph[i];
    pass[1] = alph[j];
    pass[2] = num[m];
    pass[3] = num[n];
    if(PassCheck(pass)){
      //It is good practice to use `critical` here in case two
      //passwords are somehow both valid. This won't arise in
      //your code, but is worth thinking about.
      #pragma omp critical
      {
        memcpy(goodpass, pass, 4);
        goodpass[4] = '\0';
        //#pragma omp cancel for //Escape for loops!
      }
    }
  }

  printf("Password was '%s'.\n",goodpass);

  return 0;
}

动态计划

在此处使用dynamic计划可能毫无意义.您应该期望每个密码平均花费大约相同的时间来检查.因此,循环的每次迭代将花费大约相同的时间.因此,无需使用动态调度,因为您的循环将保持均匀分布.

视觉噪音

请注意,循环嵌套是堆叠的,而不是缩进的.您经常会在代码中看到这种情况,因为其中有许多嵌套循环,因此往往会减少视觉噪音.

早休息

#pragma omp cancel for从OpenMP 4.0开始可用;但是,我在这种情况下使用此警告,因此已将其注释掉.如果您能够使其正常运行,那么一旦找到正确的密码,所有的工作都将被浪费掉,并且平均而言,该密码将位于搜索空间的一半,这将使您的运行时间减少一半. /p>

生成猜测密码的位置

其中一位评论员建议搬家,例如pass[0],使其不在最内层循环中.这是一个坏主意,因为这样做会阻止您使用collapse(4).结果,您可以并行化外循环,但是存在这样的风险,即它的迭代计数无法平均地除以线程数,从而导致很大的负载不平衡.另外,您可以并行化内部循环,这使您面临相同的问题,并且每次循环结束时都要花费高同步成本.

为什么usleep?

usleep函数使代码运行缓慢.这是故意的;由于工作量非常小,它提供了有关并行性效果的反馈.

如果删除usleep,则代码在单个内核上的完成时间为0.003s,在4个内核上的完成时间为0.004s.您不能说并行性甚至在起作用.保留usleep可以在单个内核上提供8.950s,在四个内核上提供2.257s,这充分证明了并行性的有效性.

自然,一旦确定并行性正常运行,您将删除此行.

此外,任何实际的暴力破解者都可能在PassCheck函数内部计算昂贵的哈希函数.在此处包含usleep()可以使我们模拟该功能并进行高级设计实验,而不必首先使用该功能.

I have the following homework task:
I need to brute force 4-char passphrase with the following mask

%%@@

( where @ - is a numeric character, % - is an alpha character )

in several threads using OpenMP.

Here is a piece of code, but I'm not sure if it is doing the right thing:

int i, j, m, n;

const char alph[26] = "abcdefghijklmnopqrstuvwxyz";
const char num[10] = "0123456789";

#pragma omp parallel for private(pass) schedule(dynamic) collapse(4)
for (i = 0; i < 26; i++)
    for (j = 0; j < 26; j++)
        for (m = 0; m < 10; m++)
            for (n = 0; n < 10; n++) {
                pass[0] = alph[i];
                pass[1] = alph[j];
                pass[2] = num[m];
                pass[3] = num[n];

                /* Working with pass here */

            }

So my question is :
How to correctly specify the "parallel for" instruction, in order to split the range of passphrases between several cores?

Help is much appreciated.

解决方案

Your code is pretty much right, except for using alph instead of num. If you're able to define the pass variable within the loop, that'll save you many a headache.

A full MWE might look like:

//Compile with, e.g.: gcc -O3 temp.c -std=c99 -fopenmp
#include <stdio.h>
#include <unistd.h>
#include <string.h>

int PassCheck(char *pass){
  usleep(50); //Sleep for 100 microseconds to simulate work
  return strncmp(pass, "qr34", 4)==0;
}

int main(){
  const char alph[27] = "abcdefghijklmnopqrstuvwxyz";
  const char num[11]  = "0123456789";
  char goodpass[5] = "----"; //Provide a default password to indicate an error state

  int i, j, m, n;

  #pragma omp parallel for collapse(4)
  for (i = 0; i < 26; i++)
  for (j = 0; j < 26; j++)
  for (m = 0; m < 10; m++)
  for (n = 0; n < 10; n++){
    char pass[4];
    pass[0] = alph[i];
    pass[1] = alph[j];
    pass[2] = num[m];
    pass[3] = num[n];
    if(PassCheck(pass)){
      //It is good practice to use `critical` here in case two
      //passwords are somehow both valid. This won't arise in
      //your code, but is worth thinking about.
      #pragma omp critical
      {
        memcpy(goodpass, pass, 4);
        goodpass[4] = '\0';
        //#pragma omp cancel for //Escape for loops!
      }
    }
  }

  printf("Password was '%s'.\n",goodpass);

  return 0;
}

Dynamic scheduling

Using a dynamic schedule here is probably pointless. Your expectation should be that each password will take, on average, about the same amount of time to check. Therefore, each iteration of the loop will take about the same amount of time. Therefore, there is no need to use dynamic scheduling because your loops will remain evenly distributed.

Visual noise

Note that the loop nest is stacked, rather than indented. You'll often see this in code where there are many nested loops as it tends to reduce visual noise.

Breaking early

#pragma omp cancel for is available as of OpenMP 4.0; however, I got a warning using it in this context, so I've commented it out. If you are able to get it working, that'll reduce your run-time by half since all effort is wasted once the correct password has been found and the password will, on average, be located half-way through the search space.

Where the guessed password is generated

One of the commentors suggests moving, e.g. pass[0] so that it is not in the innermost loop. This is a bad idea as doing so will prevent you from using collapse(4). As a result you could parallelize the outer loop, but you run the risk that its iteration count cannot be evenly divided by the number of threads, resulting in a large load imbalance. Alternatively, you could parallelize the inner loop, which exposes you to the same problem plus high synchronization costs each time the loop ends.

Why usleep?

The usleep function causes the code to run slowly. This is intentional; it provides feedback on the effect of parallelism, since the workload is so small.

If I remove the usleep, then the code completes in 0.003s on a single core and 0.004s on 4 cores. You cannot tell that the parallelism is even working. Leaving usleep in gives 8.950s on a single core and 2.257s on 4 cores, an apt demonstration of the effectiveness of the parallelism.

Naturally, you would remove this line once you're sure that parallelism is working correctly.

Further, any actual brute-force password cracker would likely be computing an expensive hash function inside the PassCheck function. Including usleep() here allows us to simulate that function and experiment with high-level design without having to the function first.

这篇关于我怎样才能最好地“并行化"?蛮力攻击中一组四个嵌套的for()循环?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆