
message: Benchmark and usage of message serialization/deserialization.


Speed/Space Benchmarks of Java Object Serialization (V1)

1. Summary

Many articles online already compare the performance of Java serialization systems, but I think some of those comparisons have problems:

  • The sample data structures are too simple
  • The test program does not generalize over the serialization systems, so they are not measured fairly
  • Only a few serialization systems are covered
  • The test program is hard to extend with additional serialization systems

This article and the accompanying test program were written to offer another option, one you can use as a basis for your own evaluation of Java serialization systems.

1.1. Serialization systems involved

  • JDK built-in
  • Protobuf
  • Hessian2
  • Kryo
  • Fastjson
  • Jackson
  • Gson

1.2. Results of interest

  • Serialization speed
  • Deserialization speed (pending)
  • Space taken by the serialized bytes

1.3. Generalization

Protocol Buffers requires message definitions to be compiled ahead of time, and JDK built-in serialization requires serialized objects to implement java.io.Serializable. The other frameworks serialize arbitrary plain Java objects dynamically at runtime. To compare all of them on the same baseline, the following generalization constraints are defined.

1.3.1. Structure generalization

  • All domain objects used in the tests must have the same structure as the messages predefined in the .proto file, and converters must be provided to convert between the domain objects and the Protobuf messages in both directions.
  • All domain objects used in the tests must implement java.io.Serializable.

1.3.2. Input generalization

Within one round of benchmarking, every serialization framework receives exactly the same input data and runs the same number of loops.

1.3.3. How to build and run

Build the test program

  • Enter the build directory: cd master
  • Run a full build: mvn clean install

Run the test program

  • Enter the benchmark directory: cd benchmark
  • Start the run: java -jar target/benchmark-<version>.jar
  • Check the output and logs in ${user.home}/benchmark.log

2. Test program design

2.1. Test sample objects

To keep the tests varied and to evaluate space and time performance more thoroughly, the test sample objects should satisfy the following requirements.

  • C1-01 The content of the test sample objects is generated randomly
  • C1-02 The properties must include at least integer, string, floating-point and enum types
  • C1-03 At least one collection type is used
  • C1-04 The test sample objects should have a structure in which they refer to each other
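Requirements C1-01 to C1-04 can be illustrated with a small sketch. The class, enum and field names below are hypothetical and are not the project's actual TestingModels types. The parent/child back-reference is only one possible reading of C1-04, and frameworks that cannot handle cyclic graphs may need a different shape.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sample model; names are illustrative, not the project's real classes.
class SampleModels {

    enum Color { RED, GREEN, BLUE }

    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        int id;                                   // C1-02: integer
        String name;                              // C1-02: string
        double weight;                            // C1-02: floating point
        Color color;                              // C1-02: enum
        List<Node> children = new ArrayList<>();  // C1-03: collection
        Node parent;                              // C1-04: children refer back to their parent
    }

    /** C1-01: every property is filled from the supplied random source. */
    static Node randomTree(Random random, int childCount) {
        Node root = randomNode(random);
        for (int i = 0; i < childCount; i++) {
            Node child = randomNode(random);
            child.parent = root;
            root.children.add(child);
        }
        return root;
    }

    private static Node randomNode(Random random) {
        Node node = new Node();
        node.id = random.nextInt(1_000_000);
        node.name = Long.toHexString(random.nextLong());
        node.weight = random.nextDouble();
        Color[] colors = Color.values();
        node.color = colors[random.nextInt(colors.length)];
        return node;
    }

    public static void main(String[] args) {
        Node root = randomTree(new Random(), 3);
        System.out.println(root.children.size() + " children, root color " + root.color);
    }
}
```

Passing the Random in from outside lets a caller fix the seed, so that every serialization system sees the same generated input within one round.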

2.2. Serialized object

A serialized object is a wrapper around a plain Java object (POJO) and satisfies the following requirements.

  • C2-01 Accepts a POJO as its initialization object
  • C2-02 Provides a no-argument method returning byte[] that gives access to the serialized bytes
  • C2-03 Provides a no-argument method returning int that gives the length of the byte array
  • C2-04 Provides a no-argument method returning String that gives the UTF-8 string form of the serialized bytes
  • C2-05 Provides a no-argument method returning String that gives the Base64 string form of the serialized bytes
  • C2-06 Provides a no-argument method, returning the same type as the initialization object, that deserializes the serialized bytes
  • C2-07 The method required by C2-06 must not directly return the object passed in under C2-01
  • C2-08 A serialized object should be immutable
    • It provides no mutators such as set* or add* that change the object's state
    • The method defined by C2-02 should return a defensive copy of the internal byte array
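To make the contract concrete, here is a deliberately tiny sketch that wraps a String and "serializes" it as UTF-8 bytes. The class and method names are hypothetical, and the UTF-8 encoding is a toy stand-in for a real serialization framework. It only shows how C2-01 to C2-08 map onto code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Toy illustration of the C2 contract; not one of the benchmarked serializers.
final class Utf8SerializedString {

    private final byte[] bytes;   // C2-08: final state, no set*/add* methods

    Utf8SerializedString(String pojo) {   // C2-01
        this.bytes = pojo.getBytes(StandardCharsets.UTF_8);
    }

    byte[] toBytes() {            // C2-02, returning a defensive copy (C2-08)
        return bytes.clone();
    }

    int length() {                // C2-03
        return bytes.length;
    }

    String toUtf8() {             // C2-04
        return new String(bytes, StandardCharsets.UTF_8);
    }

    String toBase64() {           // C2-05
        return Base64.getEncoder().encodeToString(bytes);
    }

    String deserialize() {        // C2-06 and C2-07: rebuilt from the bytes, not the original reference
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Utf8SerializedString wrapped = new Utf8SerializedString("hello");
        System.out.println(wrapped.length() + " bytes, Base64 " + wrapped.toBase64());
        // prints "5 bytes, Base64 aGVsbG8="
    }
}
```

Because toBytes() hands out a copy, a caller that modifies the returned array cannot change what deserialize() later produces.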

2.3. Benchmark objects

2.3.1. Space benchmark objects

The space benchmark is the simpler one: generate random test samples and record the space each serialization system takes for them.

2.3.2. Speed benchmark objects

To compare the serialization systems fairly, the following constraints are defined for speed benchmark objects.

  • C3-01 Provides a method that accepts one Object argument, the object to be serialized, and one int argument, the number of loops
  • C3-02 Timing starts and ends with the method defined by C3-01; both the total elapsed time and the average time per serialization are calculated
  • C3-03 The execution order of the speed benchmark objects can be adjusted freely at runtime
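The three constraints can be sketched as follows. All names here (Benchmark, Timing, of, time) are hypothetical and may not match the project's real interface. The two serializers are stand-ins chosen so that the example needs only the JDK, and shuffling is just one way to vary the order at runtime.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

class SpeedBenchmarkSketch {

    /** C3-01: one Object to serialize and one int loop count. */
    interface Benchmark {
        String name();
        void run(Object target, int loops);
    }

    static final class Timing {
        final String name;
        final double totalMillis;
        final double averageMicros;
        Timing(String name, double totalMillis, double averageMicros) {
            this.name = name;
            this.totalMillis = totalMillis;
            this.averageMicros = averageMicros;
        }
    }

    /** Builds a benchmark that calls the given serializer once per loop. */
    static Benchmark of(String name, Function<Object, byte[]> serializer) {
        return new Benchmark() {
            public String name() { return name; }
            public void run(Object target, int loops) {
                for (int i = 0; i < loops; i++) {
                    serializer.apply(target);
                }
            }
        };
    }

    /** C3-02: time the whole call, then derive the average per serialization. */
    static Timing time(Benchmark benchmark, Object target, int loops) {
        long start = System.nanoTime();
        benchmark.run(target, loops);
        long nanos = System.nanoTime() - start;
        return new Timing(benchmark.name(), nanos / 1_000_000.0, nanos / 1_000.0 / loops);
    }

    static byte[] jdkSerialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        List<Benchmark> order = new ArrayList<>();
        order.add(of("jdk", SpeedBenchmarkSketch::jdkSerialize));
        order.add(of("toString-utf8", o -> String.valueOf(o).getBytes(StandardCharsets.UTF_8)));
        Collections.shuffle(order);   // C3-03: the order is decided at runtime
        for (Benchmark benchmark : order) {
            Timing t = time(benchmark, "sample", 10_000);
            System.out.printf("%s: total %.3f ms, average %.3f us%n",
                    t.name, t.totalMillis, t.averageMicros);
        }
    }
}
```

A real measurement would also need warm-up rounds so that JIT compilation does not distort the first benchmark in the list; that is left out here to keep the sketch short.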

3. Test program implementation

3.1. SerializedObject

  • SerializedObject is the base class of all serialized objects and is implemented according to the requirements in 2.2.
  • Subclasses of SerializedObject tell the base class how to serialize the wrapped object into a byte array.
  • Subclasses of SerializedObject tell the base class how to deserialize the byte array back into an object.
  • Subclasses of SerializedObject may implement beforeSerilize() to initialize the tools needed during serialization.
  • Checked exceptions caught by SerializedObject during serialization are wrapped in a SerializationException and rethrown.
  • SerializedObject provides factory methods to create its subclasses, whose constructors are all package-private.
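The division of work described above is a template-method arrangement, which might look roughly like the sketch below. Only the names SerializedObject, SerializationException and beforeSerilize() come from the text; the hook names doSerialize/doDeserialize, the jdk(...) factory and the JdkSerializedObject subclass are assumptions made for illustration, and the real class has more accessors than shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

abstract class SerializedObject<T> {

    /** Unchecked wrapper for checked exceptions raised while (de)serializing. */
    static final class SerializationException extends RuntimeException {
        SerializationException(String message, Throwable cause) { super(message, cause); }
    }

    private final T source;
    private byte[] bytes;   // assigned exactly once, by the factory

    SerializedObject(T source) { this.source = source; }   // package-private constructor

    /** Hypothetical factory for the JDK-based subtype. */
    static <T extends Serializable> SerializedObject<T> jdk(T pojo) {
        SerializedObject<T> wrapper = new JdkSerializedObject<>(pojo);
        wrapper.serializeOnce();
        return wrapper;
    }

    private void serializeOnce() {
        try {
            beforeSerilize();   // spelling follows the method name given in the text
            bytes = doSerialize(source);
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new SerializationException("serialization failed", e);
        }
    }

    protected void beforeSerilize() throws Exception { }            // optional hook
    protected abstract byte[] doSerialize(T pojo) throws Exception;
    protected abstract T doDeserialize(byte[] data) throws Exception;

    byte[] toBytes() { return bytes.clone(); }

    T deserialize() {
        try {
            return doDeserialize(bytes);
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new SerializationException("deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        SerializedObject<String> wrapped = SerializedObject.jdk("hello");
        System.out.println(wrapped.toBytes().length + " bytes, round trip: " + wrapped.deserialize());
    }
}

final class JdkSerializedObject<T extends Serializable> extends SerializedObject<T> {

    JdkSerializedObject(T pojo) { super(pojo); }

    @Override
    protected byte[] doSerialize(T pojo) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(pojo);
        }
        return buffer.toByteArray();
    }

    @Override
    @SuppressWarnings("unchecked")
    protected T doDeserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (T) in.readObject();
        }
    }
}
```

Running the hooks from the factory rather than from the base constructor avoids calling overridable methods before the subclass is fully constructed.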

3.1.1. Hessian2

Hessian2SerializedObject requires extra configuration that specifies custom serialization and deserialization strategies. The configuration is placed under the META-INF directory.

3.2. Benchmark Interface

Benchmark is the speed benchmark interface.

  • Benchmark defines the method that executes a single benchmark run, in accordance with 2.3.2.
  • Benchmark executions are intercepted by ProfilingAspect, which measures the elapsed time.
  • ProfilingAspect records the total elapsed time in milliseconds and the average elapsed time of a single call in microseconds.
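The text does not say which AOP mechanism ProfilingAspect is built on. As a rough analogue that needs no extra library, the sketch below uses a JDK dynamic proxy instead of an aspect to intercept the benchmark call and report the same two figures. The Benchmark shape and the profiled(...) helper are hypothetical.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

class ProfilingProxySketch {

    /** Hypothetical benchmark shape, following C3-01. */
    interface Benchmark {
        void run(Object target, int loops);
    }

    /** Returns a proxy that times every run(...) call made on the delegate. */
    static Benchmark profiled(Benchmark delegate, String name) {
        return (Benchmark) Proxy.newProxyInstance(
                Benchmark.class.getClassLoader(),
                new Class<?>[] {Benchmark.class},
                (proxy, method, args) -> {
                    if (method.getDeclaringClass() != Benchmark.class) {
                        return method.invoke(delegate, args);   // toString, hashCode, equals
                    }
                    long start = System.nanoTime();
                    try {
                        return method.invoke(delegate, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause();   // surface the benchmark's own failure
                    } finally {
                        long nanos = System.nanoTime() - start;
                        int loops = (Integer) args[1];
                        System.out.printf("%s: total %.3f ms, average %.3f us per call%n",
                                name, nanos / 1_000_000.0, nanos / 1_000.0 / loops);
                    }
                });
    }

    public static void main(String[] args) {
        Benchmark plain = (target, loops) -> {
            for (int i = 0; i < loops; i++) {
                String.valueOf(target).getBytes();   // stand-in for a real serializer
            }
        };
        profiled(plain, "toString-bytes").run("sample", 10_000);
    }
}
```

Keeping the timing outside the Benchmark implementations means each implementation contains only the serialization loop, so the measurement code is identical for every framework.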

3.3. SpeedBenchmarks

  • Assembles all known implementations of the Benchmark interface.
  • Runs each Benchmark implementation 1,000, 5,000, 20,000, 50,000 and 200,000 times.
  • Defines the thread pool that executes the benchmarks and manages its lifecycle.
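A minimal sketch of such a driver is shown below. The loop counts are the ones listed above; everything else is an assumption, including the use of ObjIntConsumer in place of the Benchmark interface and the choice of a single-threaded pool (picked here so that benchmarks never compete with each other for CPU time).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.ObjIntConsumer;

class SpeedBenchmarksSketch {

    static final int[] LOOP_COUNTS = {1_000, 5_000, 20_000, 50_000, 200_000};

    /** Runs every benchmark once per loop count and returns the number of rounds executed. */
    static int runAll(List<ObjIntConsumer<Object>> benchmarks, Object sample) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        int rounds = 0;
        try {
            for (int loops : LOOP_COUNTS) {
                for (ObjIntConsumer<Object> benchmark : benchmarks) {
                    // Wait for each round to finish before submitting the next one.
                    pool.submit(() -> benchmark.accept(sample, loops)).get();
                    rounds++;
                }
            }
            return rounds;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark run interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("benchmark failed", e.getCause());
        } finally {
            pool.shutdown();   // lifecycle: the pool never outlives the run
        }
    }

    public static void main(String[] args) {
        List<ObjIntConsumer<Object>> benchmarks = new ArrayList<>();
        benchmarks.add((target, loops) -> {
            for (int i = 0; i < loops; i++) {
                String.valueOf(target).getBytes();   // stand-in for a real serializer
            }
        });
        System.out.println(runAll(benchmarks, "sample") + " rounds executed");
        // prints "5 rounds executed"
    }
}
```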

3.4. Generated code

Protocol Buffers message classes have to be generated ahead of time by static compilation. In addition, the test program uses lombok to avoid verbose code. If classes or libraries appear to be missing after you import the code into an IDE, go to the master directory, run mvn clean install, and then re-import the code.

3.5. Testing Models

TestingModels is the sample test data generator; it randomly generates the sample objects and enum values under test. The sample model types are generated by the lombok compiler. Whether or not the project has been compiled, the original source files can be found under message/testing-models/src/main/lombok.

3.6. Package io.demo.message.domain.proto

The package io.demo.message.domain.proto contains two kinds of classes:

  • Message classes generated by the Protobuf compiler, which can be found under message/testing-models/target/generated-sources/protobuf/java after compilation.
  • Converter classes that transform between the Protobuf message classes and the test sample classes.

4. How to extend the test program

