How would I go about parsing the Java class file constant pool?


Question

According to https://en.wikipedia.org/wiki/Java_class_file#General_layout - the Java constant pool of a class file begins 10 bytes into the file.

So far, I've been able to parse everything before that (magic to check if it's a classfile, major/minor versions, constant pool size) but I still don't understand exactly how to parse the constant pool. Like, are there opcodes for specifying method refs and other things?

Is there any way I can reference each hex value before text is represented in hex to find out what the following value is?

Should I go about it by splitting each set of entries at NOPs (0x00) and then parsing each byte that isn't a text value?

For example, how can I work out exactly what each of these values represents?

Solution

The only documentation for class files you need is The Java® Virtual Machine Specification, especially Chapter 4, "The class File Format" (https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html), and, if you are going to parse more than the constant pool, Chapter 6, "The Java Virtual Machine Instruction Set".

The constant pool consists of variable-length items. The first byte of each item (its tag) determines its type, which in turn dictates its size; a Utf8 item additionally carries an explicit length. Most items consist of one or two indices pointing to other items. Simple parsing code that doesn't need any third-party library may look like this:

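import java.io.IOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// The members below belong inside a class of your choice.
// Magic number that starts every class file
public static final int HEAD=0xcafebabe;
// Constant pool types (the tag byte of each item)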
public static final byte CONSTANT_Utf8               = 1;
public static final byte CONSTANT_Integer            = 3;
public static final byte CONSTANT_Float              = 4;
public static final byte CONSTANT_Long               = 5;
public static final byte CONSTANT_Double             = 6;
public static final byte CONSTANT_Class              = 7;
public static final byte CONSTANT_String             = 8;
public static final byte CONSTANT_FieldRef           = 9;
public static final byte CONSTANT_MethodRef          =10;
public static final byte CONSTANT_InterfaceMethodRef =11;
public static final byte CONSTANT_NameAndType        =12;
public static final byte CONSTANT_MethodHandle       =15;
public static final byte CONSTANT_MethodType         =16;
public static final byte CONSTANT_InvokeDynamic      =18;

static void parseRtClass(Class<?> clazz) throws IOException, URISyntaxException {
    URL url = clazz.getResource(clazz.getSimpleName()+".class");
    if(url==null) throw new IOException("can't access bytecode of "+clazz);
    parse(ByteBuffer.wrap(Files.readAllBytes(Paths.get(url.toURI()))));
}
static void parseClassFile(Path path) throws IOException {
    ByteBuffer bb;
    try(FileChannel ch=FileChannel.open(path, StandardOpenOption.READ)) {
        bb=ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
    }
    parse(bb);
}
static void parse(ByteBuffer buf) {
    if(buf.order(ByteOrder.BIG_ENDIAN).getInt()!=HEAD) {
        System.out.println("not a valid class file");
        return;
    }
    int minor=buf.getChar(), ver=buf.getChar();
    System.out.println("version "+ver+'.'+minor);
    for(int ix=1, num=buf.getChar(); ix<num; ix++) {
        String s; int index1=-1, index2=-1;
        byte tag = buf.get();
        switch(tag) {
            default:
                System.out.println("unknown pool item type "+buf.get(buf.position()-1));
                return;
            case CONSTANT_Utf8: decodeString(ix, buf); continue;
            case CONSTANT_Class: case CONSTANT_String: case CONSTANT_MethodType:
                s="%d:\t%s ref=%d%n"; index1=buf.getChar();
                break;
            case CONSTANT_FieldRef: case CONSTANT_MethodRef:
            case CONSTANT_InterfaceMethodRef: case CONSTANT_NameAndType:
                s="%d:\t%s ref1=%d, ref2=%d%n";
                index1=buf.getChar(); index2=buf.getChar();
                break;
            case CONSTANT_Integer: s="%d:\t%s value="+buf.getInt()+"%n"; break;
            case CONSTANT_Float: s="%d:\t%s value="+buf.getFloat()+"%n"; break;
            case CONSTANT_Double: s="%d:\t%s value="+buf.getDouble()+"%n"; ix++; break;
            case CONSTANT_Long: s="%d:\t%s value="+buf.getLong()+"%n"; ix++; break;
            case CONSTANT_MethodHandle:
                s="%d:\t%s kind=%d, ref=%d%n"; index1=buf.get(); index2=buf.getChar();
                break;
            case CONSTANT_InvokeDynamic:
                s="%d:\t%s bootstrap_method_attr_index=%d, ref=%d%n";
                index1=buf.getChar(); index2=buf.getChar();
                break;
        }
        System.out.printf(s, ix, FMT[tag], index1, index2);
    }
}
private static String[] FMT= {
    null, "Utf8", null, "Integer", "Float", "Long", "Double", "Class",
    "String", "Field", "Method", "Interface Method", "Name and Type",
    null, null, "MethodHandle", "MethodType", null, "InvokeDynamic"
};

private static void decodeString(int poolIndex, ByteBuffer buf) {
    int size=buf.getChar(), oldLimit=buf.limit();
    buf.limit(buf.position()+size);
    StringBuilder sb=new StringBuilder(size+(size>>1)+16)
        .append(poolIndex).append(":\tUtf8 ");
    while(buf.hasRemaining()) {
        byte b=buf.get();
        if(b>0) sb.append((char)b);
        else
        {
            int b2 = buf.get();
            if((b&0xf0)!=0xe0)
                sb.append((char)((b&0x1F)<<6 | b2&0x3F));
            else
            {
                int b3 = buf.get();
                sb.append((char)((b&0x0F)<<12 | (b2&0x3F)<<6 | b3&0x3F));
            }
        }
    }
    buf.limit(oldLimit);
    System.out.println(sb);
}

Don't be confused by the getChar() calls; I used them as a convenient way of reading an unsigned short, instead of getShort()&0xffff.
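To see why the two idioms are interchangeable, here is a minimal sketch that reads the same two bytes both ways. The class name UnsignedShortDemo and the method readBothWays are made up for this illustration and are not part of the code above.

```java
import java.nio.ByteBuffer;

public class UnsignedShortDemo {
    /** Reads a u2 value twice, once per idiom, from a 4-byte buffer holding the value two times. */
    public static int[] readBothWays(byte[] fourBytes) {
        ByteBuffer buf = ByteBuffer.wrap(fourBytes); // a ByteBuffer is big-endian by default
        int viaChar = buf.getChar();                 // char is an unsigned 16-bit type
        int viaShort = buf.getShort() & 0xffff;      // short is signed, so mask off the sign extension
        return new int[] {viaChar, viaShort};
    }

    public static void main(String[] args) {
        // 0xFFFE is -2 as a signed short but 65534 as an unsigned u2.
        byte[] data = {(byte) 0xFF, (byte) 0xFE, (byte) 0xFF, (byte) 0xFE};
        int[] r = readBothWays(data);
        System.out.println(r[0] + " " + r[1]);                // prints "65534 65534"
        System.out.println(ByteBuffer.wrap(data).getShort()); // prints "-2", the unmasked reading
    }
}
```

The difference only shows up for values of 0x8000 and above, which is easy to miss when testing with small class files.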

The code above simply prints the indices of references to other pool items. To decode the items, you may first store the data of all items in a random-access data structure, i.e. an array or List, because items may refer to items with a higher index number. And mind that the indices start at 1 and that Long and Double items occupy two index slots.
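The two-pass idea can be sketched as follows. This is only an illustration under a few assumptions: the names PoolResolver, readPool and className are invented here, the sketch uses DataInputStream instead of the ByteBuffer code above (its readUTF decodes the same length-prefixed modified UTF-8 that a Utf8 item uses), and it also accepts tags 17, 19 and 20, which come from later versions of the specification than the constants listed in the answer.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class PoolResolver {
    /** First pass: stores every item in an array indexed by pool index; slot 0 stays unused. */
    public static Object[] readPool(DataInputStream in) throws IOException {
        if (in.readInt() != 0xcafebabe) throw new IOException("not a valid class file");
        in.readUnsignedShort(); // minor version
        in.readUnsignedShort(); // major version
        int count = in.readUnsignedShort();
        Object[] pool = new Object[count];
        for (int ix = 1; ix < count; ix++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: pool[ix] = in.readUTF(); break;           // Utf8: u2 length + modified UTF-8
                case 3: pool[ix] = in.readInt(); break;           // Integer
                case 4: pool[ix] = in.readFloat(); break;         // Float
                case 5: pool[ix] = in.readLong(); ix++; break;    // Long occupies two slots
                case 6: pool[ix] = in.readDouble(); ix++; break;  // Double occupies two slots
                case 7: case 8: case 16: case 19: case 20:        // items holding one u2 index
                    pool[ix] = new int[] {tag, in.readUnsignedShort()};
                    break;
                case 9: case 10: case 11: case 12: case 17: case 18: // items holding two u2 values
                    pool[ix] = new int[] {tag, in.readUnsignedShort(), in.readUnsignedShort()};
                    break;
                case 15:                                          // MethodHandle: u1 kind + u2 index
                    pool[ix] = new int[] {tag, in.readUnsignedByte(), in.readUnsignedShort()};
                    break;
                default:
                    throw new IOException("unknown pool item type " + tag);
            }
        }
        return pool;
    }

    /** Second pass: follows a Class item (tag 7) to the Utf8 item that holds its name. */
    public static String className(Object[] pool, int index) {
        int[] ref = (int[]) pool[index];
        if (ref[0] != 7) throw new IllegalArgumentException("item " + index + " is not a Class item");
        return (String) pool[ref[1]];
    }

    public static void main(String[] args) throws IOException {
        // Assumes this class is loadable as a resource next to itself.
        InputStream raw = PoolResolver.class.getResourceAsStream("PoolResolver.class");
        if (raw == null) throw new IOException("can't access bytecode of " + PoolResolver.class);
        try (DataInputStream in = new DataInputStream(raw)) {
            Object[] pool = readPool(in);
            for (int ix = 1; ix < pool.length; ix++) {
                if (pool[ix] instanceof int[] && ((int[]) pool[ix])[0] == 7) {
                    System.out.println(ix + ":\tClass " + className(pool, ix));
                }
            }
        }
    }
}
```

Because resolution happens only after the whole pool has been read, a Class item at index 1 that points to a Utf8 item at index 2 resolves without any special handling.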
