C/C++ argv memory management


Question

The standard form of a C/C++ program is:

int main(int argc, char *argv[]){}

I wonder how the argv data is arranged in memory when main() is called. I found the function copy_argv() in the Node.js repository; it works as if memory were arranged like this:

argv_area|NULL|argv_data_area

Does the OS really lay out argv's memory this way?

Since this is OS-dependent, please restrict the discussion to 64-bit Linux.

Answer

The original argv is normally handled as a single contiguous block of char * values, followed immediately by another block of char * values for the environment (the envp in the int main(int argc, char **argv, char **envp) variant of main(), also pointed to by environ). These are then followed by the argument strings and environment strings themselves.

The argument list and environment are probably not created by malloc() per se — the arguments and environment are set up by the execve() system call.

About three years ago, I was experimenting with 'find argv[0] from a function other than main' and wrote the code shown below. It still works on Mac OS X Mavericks (10.9.4; the original tested version was Snow Leopard 10.6) and Ubuntu 14.04. (There are better, but platform-specific, ways to get argv[0] from a function, and that is a separate SO question. I would not use this technique in practice, but it does work on some common platforms.)

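#include "posixver.h"   /* author's private header (not shown); presumably selects a POSIX feature-test level such as _XOPEN_SOURCE */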
#include <inttypes.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>     /* putenv(), setenv() */

extern char **environ;  /* Should be declared in <unistd.h> */

/*
** The object of the exercise is: given just environ (since that is all
** that is available to a library function) attempt to find argv[0] (and
** hence argc).
**
** On some platforms, the layout of memory is such that the number of
** arguments (argc) is available, followed by the argument vector,
** followed by the environment vector.
**
**          argv                            environ
**            |                                |
**            v                                v
** | argc | argv0 | argv1 | ... | argvN | 0 | env0 | env1 | ... | envN | 0 |
**
** This applies to:
** -- Solaris 10 (32-bit, 64-bit SPARC)
** -- MacOS X 10.6 (Snow Leopard, 32-bit and 64-bit)
** -- Linux (RHEL 5 on x86/64, 32-bit and 64-bit)
**
** Sadly, this is not quite what happens on the other two Unix
** platforms.  The value preceding argv0 seems to be a 0.
** -- AIX 6.1          (32-bit, 64-bit)
** -- HP-UX 11.23 IA64 (32-bit, 64-bit)
**       Sub-standard POSIX support (no setenv()) and C99 support (no %zd).
**
** NB: If putenv() or setenv() is called to add an environment variable,
** then the base address of environ changes radically, moving off the
** stack onto heap, and all bets are off.  Modifying an existing
** variable is not a problem.
**
** Spotting the change from stack to heap is done by observing whether
** the address pointed to by environ is more than 128 K times the size
** of a pointer from the address of a local variable.
**
** This code is nominally incredibly machine-specific - but actually
** works remarkably portably.
*/

typedef struct Arguments
{
    char   **argv;
    size_t   argc;
} Arguments;

static void print_cpp(const char *tag, int i, char **ptr)
{
    uintptr_t p = (uintptr_t)ptr;
    printf("%s[%d] = 0x%" PRIXPTR " (0x%" PRIXPTR ") (%s)\n",
            tag, i, p, (uintptr_t)(*ptr), (*ptr == 0 ? "<null>" : *ptr));
}

enum { MAX_DELTA = sizeof(void *) * 128 * 1024 };

static Arguments find_argv0(void)
{
    static char *dummy[] = { "<unknown>", 0 };
    Arguments args;
    uintptr_t i;
    char **base = environ - 1;
    uintptr_t delta = ((uintptr_t)&base > (uintptr_t)environ) ? (uintptr_t)&base - (uintptr_t)environ : (uintptr_t)environ - (uintptr_t)&base;
    if (delta < MAX_DELTA)
    {
        for (i = 2; (uintptr_t)(*(environ - i) + 2) != i && (uintptr_t)(*(environ - i)) != 0; i++)
            print_cpp("test", i, environ-i);
        args.argc = i - 2;
        args.argv = environ - i + 1;
    }
    else
    {
        args.argc = 1;
        args.argv = dummy;
    }

    printf("argc    = %zu\n", args.argc);   /* %zu is the matching conversion for size_t */
    for (i = 0; i <= args.argc; i++)
        print_cpp("argv", i, &args.argv[i]);

    return args;
}

static void print_arguments(void)
{
    Arguments args = find_argv0();
    printf("Command name and arguments\n");
    printf("argc    = %zu\n", args.argc);   /* %zu is the matching conversion for size_t */
    for (size_t i = 0; i <= args.argc; i++)
        printf("argv[%zu] = %s\n", i, (args.argv[i] ? args.argv[i] : "<null>"));
}

static int check_environ(int argc, char **argv)
{
    size_t n = argc;
    size_t i;
    unsigned long delta = (argv > environ) ? argv - environ : environ - argv;
    printf("environ = 0x%lX; argv = 0x%lX (delta: 0x%lX)\n", (unsigned long)environ, (unsigned long)argv, delta);
    for (i = 0; i <= n; i++)
        print_cpp("chkv", i, &argv[i]);
    if (delta > (unsigned long)argc + 1)
        return 0;

    for (i = 1; i < n + 2; i++)
    {
        printf("chkr[%zu] = 0x%lX (0x%lX) (%s)\n", i, (unsigned long)(environ - i), (unsigned long)(*(environ - i)),
                (*(environ-i) ? *(environ-i) : "<null>"));
        fflush(0);
    }
    i = n + 2;
    printf("chkF[%zu] = 0x%lX (0x%lX)\n", i, (unsigned long)(environ - i), (unsigned long)(*(environ - i)));
    i = n + 3;
    printf("chkF[%zu] = 0x%lX (0x%lX)\n", i, (unsigned long)(environ - i), (unsigned long)(*(environ - i)));
    return 1;
}

int main(int argc, char **argv)
{
    printf("Before setting environment\n");
    if (check_environ(argc, argv))
        print_arguments();

    //putenv("TZ=US/Pacific");
    setenv("SHELL", "/bin/csh", 1);

    printf("After modifying environment\n");
    if (check_environ(argc, argv) == 0)
        printf("Modifying environment messed everything up\n");
    print_arguments();

    putenv("CODSWALLOP=nonsense");

    printf("After adding to environment\n");
    if (check_environ(argc, argv) == 0)
        printf("Adding environment messed everything up\n");
    print_arguments();

    return 0;
}

Example output from Mac OS X:

Before setting environment
environ = 0x7FFF584D04C8; argv = 0x7FFF584D0498 (delta: 0x6)
chkv[0] = 0x7FFF584D0498 (0x7FFF584D06B0) (./find_argv0)
chkv[1] = 0x7FFF584D04A0 (0x7FFF584D06BD) (macedonian)
chkv[2] = 0x7FFF584D04A8 (0x7FFF584D06C8) (obelisk)
chkv[3] = 0x7FFF584D04B0 (0x7FFF584D06D0) (mental breakdown)
chkv[4] = 0x7FFF584D04B8 (0x7FFF584D06E1) (testing: 1, 2, 3)
chkv[5] = 0x7FFF584D04C0 (0x0) (<null>)
chkr[1] = 0x7FFF584D04C0 (0x0) (<null>)
chkr[2] = 0x7FFF584D04B8 (0x7FFF584D06E1) (testing: 1, 2, 3)
chkr[3] = 0x7FFF584D04B0 (0x7FFF584D06D0) (mental breakdown)
chkr[4] = 0x7FFF584D04A8 (0x7FFF584D06C8) (obelisk)
chkr[5] = 0x7FFF584D04A0 (0x7FFF584D06BD) (macedonian)
chkr[6] = 0x7FFF584D0498 (0x7FFF584D06B0) (./find_argv0)
chkF[7] = 0x7FFF584D0490 (0x5)
chkF[8] = 0x7FFF584D0488 (0x0)
test[2] = 0x7FFF584D04B8 (0x7FFF584D06E1) (testing: 1, 2, 3)
test[3] = 0x7FFF584D04B0 (0x7FFF584D06D0) (mental breakdown)
test[4] = 0x7FFF584D04A8 (0x7FFF584D06C8) (obelisk)
test[5] = 0x7FFF584D04A0 (0x7FFF584D06BD) (macedonian)
test[6] = 0x7FFF584D0498 (0x7FFF584D06B0) (./find_argv0)
argc    = 5
argv[0] = 0x7FFF584D0498 (0x7FFF584D06B0) (./find_argv0)
argv[1] = 0x7FFF584D04A0 (0x7FFF584D06BD) (macedonian)
argv[2] = 0x7FFF584D04A8 (0x7FFF584D06C8) (obelisk)
argv[3] = 0x7FFF584D04B0 (0x7FFF584D06D0) (mental breakdown)
argv[4] = 0x7FFF584D04B8 (0x7FFF584D06E1) (testing: 1, 2, 3)
argv[5] = 0x7FFF584D04C0 (0x0) (<null>)
Command name and arguments
argc    = 5
argv[0] = ./find_argv0
argv[1] = macedonian
argv[2] = obelisk
argv[3] = mental breakdown
argv[4] = testing: 1, 2, 3
argv[5] = <null>
After modifying environment
environ = 0x7FFF584D04C8; argv = 0x7FFF584D0498 (delta: 0x6)
chkv[0] = 0x7FFF584D0498 (0x7FFF584D06B0) (./find_argv0)
chkv[1] = 0x7FFF584D04A0 (0x7FFF584D06BD) (macedonian)
chkv[2] = 0x7FFF584D04A8 (0x7FFF584D06C8) (obelisk)
chkv[3] = 0x7FFF584D04B0 (0x7FFF584D06D0) (mental breakdown)
chkv[4] = 0x7FFF584D04B8 (0x7FFF584D06E1) (testing: 1, 2, 3)
chkv[5] = 0x7FFF584D04C0 (0x0) (<null>)
chkr[1] = 0x7FFF584D04C0 (0x0) (<null>)
chkr[2] = 0x7FFF584D04B8 (0x7FFF584D06E1) (testing: 1, 2, 3)
chkr[3] = 0x7FFF584D04B0 (0x7FFF584D06D0) (mental breakdown)
chkr[4] = 0x7FFF584D04A8 (0x7FFF584D06C8) (obelisk)
chkr[5] = 0x7FFF584D04A0 (0x7FFF584D06BD) (macedonian)
chkr[6] = 0x7FFF584D0498 (0x7FFF584D06B0) (./find_argv0)
chkF[7] = 0x7FFF584D0490 (0x5)
chkF[8] = 0x7FFF584D0488 (0x0)
test[2] = 0x7FFF584D04B8 (0x7FFF584D06E1) (testing: 1, 2, 3)
test[3] = 0x7FFF584D04B0 (0x7FFF584D06D0) (mental breakdown)
test[4] = 0x7FFF584D04A8 (0x7FFF584D06C8) (obelisk)
test[5] = 0x7FFF584D04A0 (0x7FFF584D06BD) (macedonian)
test[6] = 0x7FFF584D0498 (0x7FFF584D06B0) (./find_argv0)
argc    = 5
argv[0] = 0x7FFF584D0498 (0x7FFF584D06B0) (./find_argv0)
argv[1] = 0x7FFF584D04A0 (0x7FFF584D06BD) (macedonian)
argv[2] = 0x7FFF584D04A8 (0x7FFF584D06C8) (obelisk)
argv[3] = 0x7FFF584D04B0 (0x7FFF584D06D0) (mental breakdown)
argv[4] = 0x7FFF584D04B8 (0x7FFF584D06E1) (testing: 1, 2, 3)
argv[5] = 0x7FFF584D04C0 (0x0) (<null>)
Command name and arguments
argc    = 5
argv[0] = ./find_argv0
argv[1] = macedonian
argv[2] = obelisk
argv[3] = mental breakdown
argv[4] = testing: 1, 2, 3
argv[5] = <null>
After adding to environment
environ = 0x7FB1EA403B60; argv = 0x7FFF584D0498 (delta: 0x9ADC19927)
chkv[0] = 0x7FFF584D0498 (0x7FFF584D06B0) (./find_argv0)
chkv[1] = 0x7FFF584D04A0 (0x7FFF584D06BD) (macedonian)
chkv[2] = 0x7FFF584D04A8 (0x7FFF584D06C8) (obelisk)
chkv[3] = 0x7FFF584D04B0 (0x7FFF584D06D0) (mental breakdown)
chkv[4] = 0x7FFF584D04B8 (0x7FFF584D06E1) (testing: 1, 2, 3)
chkv[5] = 0x7FFF584D04C0 (0x0) (<null>)
Adding environment messed everything up
argc    = 1
argv[0] = 0x107730040 (0x10772FEC0) (<unknown>)
argv[1] = 0x107730048 (0x0) (<null>)
Command name and arguments
argc    = 1
argv[0] = <unknown>
argv[1] = <null>
