Slow AES GCM encryption and decryption with Java 8u20


Problem description


I am trying to encrypt and decrypt data using AES/GCM/NoPadding. I installed the JCE Unlimited Strength Policy Files and ran the (simple-minded) benchmark below. I've done the same using OpenSSL and was able to achieve more than 1 GB/s encryption and decryption on my PC.

With the benchmark below I'm only able to get 3 MB/s encryption and decryption using Java 8 on the same PC. Any idea what I am doing wrong?

import java.util.Random;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AesGcmBenchmark {
    public static void main(String[] args) throws Exception {
        final byte[] data = new byte[64 * 1024];
        final byte[] encrypted = new byte[64 * 1024];
        final byte[] key = new byte[32];
        final byte[] iv = new byte[12];
        final Random random = new Random(1);
        random.nextBytes(data);
        random.nextBytes(key);
        random.nextBytes(iv);

        System.out.println("Benchmarking AES-256 GCM encryption for 10 seconds");
        long javaEncryptInputBytes = 0;
        long javaEncryptStartTime = System.currentTimeMillis();
        final Cipher javaAES256 = Cipher.getInstance("AES/GCM/NoPadding");
        byte[] tag = new byte[16];
        long encryptInitTime = 0L;
        long encryptUpdate1Time = 0L;
        long encryptDoFinalTime = 0L;
        while (System.currentTimeMillis() - javaEncryptStartTime < 10000) {
            // A fresh IV per message; GCM must never reuse an IV under the same key.
            random.nextBytes(iv);
            long n1 = System.nanoTime();
            javaAES256.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new GCMParameterSpec(16 * Byte.SIZE, iv));
            long n2 = System.nanoTime();
            javaAES256.update(data, 0, data.length, encrypted, 0);
            long n3 = System.nanoTime();
            // The remaining output of doFinal is the 16-byte authentication tag.
            javaAES256.doFinal(tag, 0);
            long n4 = System.nanoTime();
            javaEncryptInputBytes += data.length;

            // Accumulate totals (the original assigned with '=', keeping only the last iteration).
            encryptInitTime += n2 - n1;
            encryptUpdate1Time += n3 - n2;
            encryptDoFinalTime += n4 - n3;
        }
        long javaEncryptEndTime = System.currentTimeMillis();
        System.out.println("Time init (ns): "     + encryptInitTime);
        System.out.println("Time update (ns): "   + encryptUpdate1Time);
        System.out.println("Time do final (ns): " + encryptDoFinalTime);
        System.out.println("Java calculated at " + (javaEncryptInputBytes / 1024 / 1024 / ((javaEncryptEndTime - javaEncryptStartTime) / 1000)) + " MB/s");

        System.out.println("Benchmarking AES-256 GCM decryption for 10 seconds");
        long javaDecryptInputBytes = 0;
        long javaDecryptStartTime = System.currentTimeMillis();
        // Decrypt the last message produced above, using its IV.
        final GCMParameterSpec gcmParameterSpec = new GCMParameterSpec(16 * Byte.SIZE, iv);
        final SecretKeySpec keySpec = new SecretKeySpec(key, "AES");
        long decryptInitTime = 0L;
        long decryptUpdate1Time = 0L;
        long decryptUpdate2Time = 0L;
        long decryptDoFinalTime = 0L;
        while (System.currentTimeMillis() - javaDecryptStartTime < 10000) {
            long n1 = System.nanoTime();
            javaAES256.init(Cipher.DECRYPT_MODE, keySpec, gcmParameterSpec);
            long n2 = System.nanoTime();
            int offset = javaAES256.update(encrypted, 0, encrypted.length, data, 0);
            long n3 = System.nanoTime();
            // The tag is fed in as the final 16 bytes of the ciphertext.
            javaAES256.update(tag, 0, tag.length, data, offset);
            long n4 = System.nanoTime();
            javaAES256.doFinal(data, offset);
            long n5 = System.nanoTime();
            javaDecryptInputBytes += data.length;

            decryptInitTime += n2 - n1;
            decryptUpdate1Time += n3 - n2;
            decryptUpdate2Time += n4 - n3;
            decryptDoFinalTime += n5 - n4;
        }
        long javaDecryptEndTime = System.currentTimeMillis();
        System.out.println("Time init (ns): " + decryptInitTime);
        System.out.println("Time update 1 (ns): " + decryptUpdate1Time);
        System.out.println("Time update 2 (ns): " + decryptUpdate2Time);
        System.out.println("Time do final (ns): " + decryptDoFinalTime);
        System.out.println("Total bytes processed: " + javaDecryptInputBytes);
        System.out.println("Java calculated at " + (javaDecryptInputBytes / 1024 / 1024 / ((javaDecryptEndTime - javaDecryptStartTime) / 1000)) + " MB/s");
    }
}

EDIT: I leave it as a fun exercise to improve this simple-minded benchmark.

I've tested some more using the ServerVM, removed the nanoTime calls and introduced warmup, but as I expected none of this improved the benchmark results. Throughput stays flat at 3 megabytes per second.

Solution

Micro-benchmarking aside, the performance of the GCM implementation in JDK 8 (at least up to 1.8.0_25) is crippled.

I can consistently reproduce the 3MB/s (on a Haswell i7 laptop) with a more mature micro-benchmark.

From a code dive, this appears to be due to a naive multiplier implementation and no hardware acceleration for the GCM calculations.
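
To illustrate what a "naive multiplier" means here, the following is a minimal sketch of the bit-by-bit GF(2^128) multiplication that GHASH is built on, following the textbook algorithm in NIST SP 800-38D. It is not the JDK source; the class name `GhashMultiply` and the two-`long` block representation are assumptions made for illustration. Every 16-byte block costs 128 loop iterations of shifts and XORs, which is the kind of work a carry-less multiply instruction or a table-driven implementation avoids.

```java
// Illustrative only: a hypothetical class, not code taken from the JDK.
final class GhashMultiply {
    // Reduction constant R = 11100001 || 0^120 (NIST SP 800-38D), upper 64 bits.
    private static final long R_HI = 0xE100000000000000L;

    /**
     * Multiplies x and y in GF(2^128) using GCM's bit ordering.
     * Each operand is {upper 64 bits, lower 64 bits} of a big-endian block.
     */
    public static long[] multiply(long[] x, long[] y) {
        long zHi = 0L, zLo = 0L;       // accumulator Z
        long vHi = y[0], vLo = y[1];   // running value V
        for (int i = 0; i < 128; i++) {
            // Bit i of x, counting from the leftmost bit of the block.
            long word = (i < 64) ? x[0] : x[1];
            if (((word >>> (63 - (i & 63))) & 1L) != 0) {
                zHi ^= vHi;
                zLo ^= vLo;
            }
            // V = V >> 1, reduced by R when a bit falls off the right end.
            boolean carry = (vLo & 1L) != 0;
            vLo = (vLo >>> 1) | (vHi << 63);
            vHi >>>= 1;
            if (carry) {
                vHi ^= R_HI;
            }
        }
        return new long[] {zHi, zLo};
    }

    public static void main(String[] args) {
        long[] a = {0x0123456789ABCDEFL, 0xFEDCBA9876543210L};
        long[] b = {0x0F1E2D3C4B5A6978L, 0x1122334455667788L};
        long[] product = multiply(a, b);
        System.out.println(Long.toHexString(product[0]) + " " + Long.toHexString(product[1]));
    }
}
```

In GCM's representation the multiplicative identity is the block with only its leftmost bit set, so `multiply(a, {0x8000000000000000L, 0L})` returns `a` unchanged, which makes a convenient sanity check.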

By comparison AES (in ECB or CBC mode) in JDK 8 uses an AES-NI accelerated intrinsic and is (for Java at least) very quick (in the order of 1GB/s on the same hardware), but the overall AES/GCM performance is completely dominated by the broken GCM performance.
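
One way to see this gap on your own machine is to time the same cipher in CBC and GCM mode side by side. The sketch below is a rough comparison rather than a rigorous benchmark: the class and method names are hypothetical, it uses an all-zero 128-bit key (so it runs without the unlimited-strength policy files) and a counter-based IV, and it has no warm-up phase.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.spec.AlgorithmParameterSpec;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: a hypothetical helper for a rough mode-to-mode comparison.
final class ModeComparison {
    /** Encrypts 64 KiB messages for roughly durationMillis and returns MiB/s. */
    public static double encryptMBps(String transformation, int ivLength, long durationMillis) {
        try {
            byte[] data = new byte[64 * 1024];
            byte[] out = new byte[data.length + 16];   // room for a GCM tag
            SecretKeySpec key = new SecretKeySpec(new byte[16], "AES"); // benchmark-only key
            Cipher cipher = Cipher.getInstance(transformation);
            long counter = 0L;
            long bytes = 0L;
            long durationNanos = durationMillis * 1_000_000L;
            long start = System.nanoTime();
            while (System.nanoTime() - start < durationNanos) {
                // A new IV each time: the JDK rejects GCM encryption that reuses key and IV.
                byte[] iv = new byte[ivLength];
                ByteBuffer.wrap(iv).putLong(counter++);
                AlgorithmParameterSpec spec = transformation.contains("GCM")
                        ? new GCMParameterSpec(128, iv)
                        : new IvParameterSpec(iv);
                cipher.init(Cipher.ENCRYPT_MODE, key, spec);
                cipher.doFinal(data, 0, data.length, out, 0);
                bytes += data.length;
            }
            double seconds = (System.nanoTime() - start) / 1e9;
            return bytes / (1024.0 * 1024.0) / seconds;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("AES/CBC/NoPadding MiB/s: " + encryptMBps("AES/CBC/NoPadding", 16, 5000));
        System.out.println("AES/GCM/NoPadding MiB/s: " + encryptMBps("AES/GCM/NoPadding", 12, 5000));
    }
}
```

The absolute numbers depend on the JDK build, the CPU and JIT warm-up, so treat the ratio between the two lines as the interesting part.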

There are plans to implement hardware acceleration, and there have been third-party submissions to improve the performance, but these haven't made it into a release yet.

Something else to be aware of is that the JDK GCM implementation also buffers the entire plaintext on decryption until the authentication tag at the end of the ciphertext is verified, which cripples it for use with large messages.
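
The buffering can be observed through the return values of `update` and `doFinal`. The sketch below (hypothetical class name, all-zero key and IV purely for demonstration) encrypts a message, decrypts it again, and reports how many plaintext bytes each call released. With a provider that buffers as described, `update` would be expected to release nothing and `doFinal` everything, but the split is provider-dependent, so the code only reports it.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: a hypothetical probe of where decrypted bytes are released.
final class GcmDecryptBuffering {
    /** Returns {bytes released by update(), bytes released by doFinal()}. */
    public static int[] decryptOutputSplit(int plaintextLength) {
        try {
            // All-zero key and IV: acceptable for a one-off demo, never for real data.
            SecretKeySpec key = new SecretKeySpec(new byte[16], "AES");
            GCMParameterSpec spec = new GCMParameterSpec(128, new byte[12]);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");

            cipher.init(Cipher.ENCRYPT_MODE, key, spec);
            byte[] ciphertext = cipher.doFinal(new byte[plaintextLength]); // ciphertext + tag

            cipher.init(Cipher.DECRYPT_MODE, key, spec);
            byte[] out = new byte[ciphertext.length];
            int fromUpdate = cipher.update(ciphertext, 0, ciphertext.length, out, 0);
            int fromDoFinal = cipher.doFinal(out, fromUpdate);
            return new int[] {fromUpdate, fromDoFinal};
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] split = decryptOutputSplit(64 * 1024);
        System.out.println("released by update:  " + split[0]);
        System.out.println("released by doFinal: " + split[1]);
    }
}
```

Whatever the split, the two numbers add up to the plaintext length. For large messages, memory use therefore grows with message size on any provider that holds everything back until `doFinal`.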

Bouncy Castle has (at the time of writing) faster GCM implementations (and OCB, if you're writing open source software or are not encumbered by software patent laws).


Updated July 2015 - 1.8.0_45 and JDK 9

JDK 8+ will get an improved (and constant time) Java implementation (contributed by Florian Weimer of RedHat) - this has landed in JDK 9 EA builds, but apparently not yet in 1.8.0_45. JDK9 (since EA b72 at least) also has GCM intrinsics - AES/GCM speed on b72 is 18MB/s without intrinsics enabled and 25MB/s with intrinsics enabled, both of which are disappointing - for comparison the fastest (not constant time) BC implementation is ~60MB/s and the slowest (constant time, not fully optimised) is ~26MB/s.
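
When comparing runs with and without intrinsics, it helps to confirm what the running JVM actually has switched on. The sketch below reads HotSpot flag values through the `com.sun.management` diagnostic bean. The flag names `UseAESIntrinsics` and `UseGHASHIntrinsics` are the HotSpot options I would expect to be relevant, but whether a given build exposes them is an assumption, so unknown flags are reported rather than treated as errors. The class name is hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Illustrative only: a hypothetical helper for inspecting HotSpot flags at runtime.
final class IntrinsicFlags {
    /** Returns the flag's current value, or "unknown" if this JVM does not expose it. */
    public static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Thrown when the flag does not exist or the bean is unavailable.
            return "unknown";
        }
    }

    public static void main(String[] args) {
        for (String flag : new String[] {"UseAES", "UseAESIntrinsics", "UseGHASHIntrinsics"}) {
            System.out.println(flag + " = " + flagValue(flag));
        }
    }
}
```

Boolean HotSpot flags are toggled on the command line with `-XX:+Name` or `-XX:-Name`, so the same benchmark can be run twice and the reported values used to label each result.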


Updated Jan 2016 - 1.8.0_72:

Some performance fixes landed in JDK 1.8.0_60 and performance on the same benchmark now is 18MB/s - a 6x improvement from the original, but still much slower than the BC implementations.
