Map edgelist with string node labels to integer labels


Problem description

I have a huge graph in edge-list format with strings as node labels, and I wonder what the "best" way to map the strings to integers is. The input file looks like this:

Mike Andrew
Mike Jane
John Jane

The output (i.e., the mapped file) should be:

1 2
1 3
4 3

Pasted below is a skeleton in C that reads the input file. Could somebody please advise me on how to proceed?

#include <stdio.h>

int LoadFile(const char * filename) {
  FILE *fp = NULL;
  char node1[10];
  char node2[10];
  int idx = 0;

  fp = fopen(filename, "r");
  if (fp == NULL) {
    perror("Error");
    return -1;  /* do not read from a NULL stream */
  }

  /* %9s keeps each label within the 10-byte buffers; arrays are passed without & */
  while (fscanf(fp, "%9s %9s", node1, node2) == 2) {
    idx++;
  }

  fclose(fp);

  return idx;
}

int main(void) {
  int n = LoadFile("./test.txt");
  printf("Number of edges: %d\n", n);
  return 0;
}

Recommended answer

You need a naive map implementation (one that maps strings to integers).

  • Define a structure like the one below to store the strings.

    typedef struct {
        unsigned int hashed;  /* number of strings stored so far */
        char **map;           /* map[i] is the string whose label is i + 1 */
    } hash;

  • Define a function that inserts the string into the map if it does not already exist and returns the index of the string in the map.

    int insertInMap(hash *map, char *entry)

    Store the returned index in the edge structure.

    edges[i].first = insertInMap(&map, first_string);
    edges[i].second = insertInMap(&map, second_string);

    Sample code:

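    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        unsigned int first;
        unsigned int second;
    } edge;

    typedef struct {
        unsigned int hashed;  /* number of strings stored so far */
        char **map;           /* map[i] is the string whose label is i + 1 */
    } hash;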
    
    
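    /* Return the 1-based label of entry, inserting it first if it is new. */
    int insertInMap(hash *map, char *entry)
    {
      unsigned int i;
      for (i = 0; i < map->hashed; i++)
      {
        if (strcmp(map->map[i], entry) == 0)
          return i + 1;
      }
      /* Warning: no boundary check; the caller must keep map->map large enough */
      map->map[map->hashed++] = strdup(entry);
      return map->hashed;
    }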
    
    
    edge *LoadFile(const char * filename) {
      FILE *fp = NULL;
      char node1[10];
      char node2[10];
      int idx = 0;

      edge *edges;
      hash map;

      /* Allocation results are not checked in this sample */
      int numEdges = 10;
      edges = malloc(numEdges * sizeof(edge));

      /* Each edge can add up to two new labels, so reserve two slots per edge */
      map.map = malloc(2 * numEdges * sizeof(char*));
      map.hashed = 0;

      fp = fopen(filename, "r");
      if (fp == NULL) {
        perror("Error");
        free(edges);
        free(map.map);
        return NULL;
      }

      /* %9s keeps each label within the 10-byte buffers */
      while (fscanf(fp, "%9s %9s", node1, node2) == 2) {
        if (idx >= numEdges)
        {
          numEdges *= 2;
          edges = realloc(edges, numEdges * sizeof(edge));
          map.map = realloc(map.map, 2 * numEdges * sizeof(char*));
        }
        edges[idx].first = insertInMap(&map, node1);
        edges[idx].second = insertInMap(&map, node2);
        idx++;
      }

      fclose(fp);

      return edges;
    }
    

    Print the edges later.
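
That last step could look something like the sketch below. It assumes the caller knows how many edges were read; LoadFile as posted returns only the array, so the `count` parameter and the function name `printEdges` are hypothetical additions rather than part of the answer's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    unsigned int first;
    unsigned int second;
} edge;

/* Hypothetical helper: write one "first second" pair per line to out.
 * Returns the number of edges actually written. */
size_t printEdges(FILE *out, const edge *edges, size_t count)
{
    size_t i;

    assert(out != NULL && (edges != NULL || count == 0));
    for (i = 0; i < count; i++) {
        if (fprintf(out, "%u %u\n", edges[i].first, edges[i].second) < 0)
            break;  /* stop on a write error */
    }
    return i;
}
```

One way to obtain the count would be to give LoadFile an extra output parameter that receives idx before it returns; that change is likewise only a suggestion.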

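
Because the graph is described as huge, the linear scan in insertInMap may become the bottleneck: every lookup compares against all labels seen so far. One possible alternative, not part of the answer above, is a real hash table. The sketch below uses open addressing with linear probing and an FNV-1a hash; the names `label_table`, `tableInit`, `tableInsert` and `hashLabel`, as well as the choice of hash function and load factor, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical open-addressing table: label -> 1-based integer id. */
typedef struct {
    char **keys;        /* NULL marks an empty slot */
    unsigned int *ids;  /* id stored alongside each key */
    size_t capacity;    /* number of slots, always a power of two */
    size_t count;       /* number of labels stored so far */
} label_table;

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t hashLabel(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Find the slot holding key, or the empty slot where it would go. */
static size_t findSlot(const label_table *t, const char *key)
{
    size_t i = hashLabel(key) & (t->capacity - 1);
    while (t->keys[i] != NULL && strcmp(t->keys[i], key) != 0)
        i = (i + 1) & (t->capacity - 1);  /* linear probing */
    return i;
}

/* Prepare an empty table; capacity must be a power of two. Returns 0 on success. */
int tableInit(label_table *t, size_t capacity)
{
    size_t i;

    assert(capacity >= 2 && (capacity & (capacity - 1)) == 0);
    t->keys = malloc(capacity * sizeof *t->keys);
    t->ids = malloc(capacity * sizeof *t->ids);
    t->capacity = capacity;
    t->count = 0;
    if (t->keys == NULL || t->ids == NULL) {
        free(t->keys);
        free(t->ids);
        return -1;
    }
    for (i = 0; i < capacity; i++)
        t->keys[i] = NULL;
    return 0;
}

/* Double the capacity and move every stored label to its new slot. */
static int tableGrow(label_table *t)
{
    label_table bigger;
    size_t i;

    if (tableInit(&bigger, t->capacity * 2) != 0)
        return -1;
    for (i = 0; i < t->capacity; i++) {
        if (t->keys[i] != NULL) {
            size_t j = findSlot(&bigger, t->keys[i]);
            bigger.keys[j] = t->keys[i];
            bigger.ids[j] = t->ids[i];
        }
    }
    bigger.count = t->count;
    free(t->keys);
    free(t->ids);
    *t = bigger;
    return 0;
}

/* Return the 1-based id of key, inserting it if new; 0 means out of memory. */
unsigned int tableInsert(label_table *t, const char *key)
{
    size_t i;
    char *copy;

    /* Keep the table at most half full so probing always finds a free slot. */
    if ((t->count + 1) * 2 > t->capacity && tableGrow(t) != 0)
        return 0;
    i = findSlot(t, key);
    if (t->keys[i] != NULL)
        return t->ids[i];
    copy = malloc(strlen(key) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, key);
    t->keys[i] = copy;
    t->ids[i] = (unsigned int)++t->count;
    return t->ids[i];
}
```

With this in place, the two insertInMap calls in the reading loop could be swapped for tableInsert calls on a table initialised once before the loop. Ids are still handed out in order of first appearance, so the sample input would map to the same numbers as before.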
