This blog is about Java (advanced Java topics like Reflection, Byte Code transformation, Code Generation), Maven, Web technologies, Raspberry Pi and IT in general.

Saturday, September 6, 2014

Class Transformation with ASM

Have you ever asked yourself how class transformation works? Great, then you are reading the right blog post ;-)

Probably not everybody knows what class transformation is, so I will explain what it is before I explain how it works.

What is class transformation?

There is actually a quite simple answer to this question: the Java bytecode gets modified in some way. But let me explain it in more detail. When you compile a Java file, a class file is generated. The class file represents the Java source file as Java binary code, so it is much smaller and optimized for execution. The methods consist of Java opcodes. The JVM executes these opcodes sequentially, and each opcode pushes values onto or pops values from the operand stack. Class transformation means that the Java bytecode which represents a Java class is modified, so opcodes can be inserted or removed. But not only the opcodes inside a method can be modified. Everything can be changed: any program can be transformed into anything else, as long as the result is still valid Java bytecode. Otherwise the Java bytecode verifier will reject the class when it is loaded.
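To make "Java binary code" a bit more concrete, here is a minimal sketch that reads the first three fields of a class file: the magic number, the minor version and the major version. The class name ClassFileHeader and the method readHeader are made up for this illustration, and java.lang.Object is used only because its class file is always available.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassFileHeader {

    /** Returns {magic, minorVersion, majorVersion} read from the start of a class file. */
    public static int[] readHeader(InputStream classFile) {
        try (DataInputStream in = new DataInputStream(classFile)) {
            int magic = in.readInt();           // 4 bytes, always 0xCAFEBABE
            int minor = in.readUnsignedShort(); // 2 bytes
            int major = in.readUnsignedShort(); // 2 bytes, e.g. 52 for Java 8
            return new int[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // assumption: the class file of java.lang.Object can be read as a resource
        int[] header = readHeader(Object.class.getResourceAsStream("Object.class"));
        System.out.printf("magic=0x%X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```

Everything after this header (constant pool, fields, methods with their opcodes) is what a class transformation reads and rewrites.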

Where is class transformation used?

Why would you transform a class if you could just write the Java class the way you need it? Actually, if you can accomplish your work without class transformation, then don't use class transformation. Just write the Java code accordingly. I think the most common use of class transformation is to instrument Java code at runtime. Imagine you have a big program with performance problems and there are no performance tools like VisualVM or JProfiler. What would you do to find the methods which take long to execute?
You would have to insert code into each method to measure its execution duration. If there are thousands of methods, this would be quite boring work, especially since you need to remove the code again for production and probably add it back later to analyze the execution durations again. With class transformation you can automate exactly this boring work. You don't write the duration measurement code in your Java files. Instead you read the existing class files and insert the needed opcodes into each method of each class.
Actually, many performance tools work like this. Java also allows modifying already loaded classes, with some restrictions. So these tools get the binary code of the classes, rewrite the classes on a binary level, and Java loads the modified classes. Then the tools can generate analyses and pretty diagrams from the data the instrumented classes collect.
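The hook the JDK offers for this kind of runtime instrumentation is the java.lang.instrument package. The following is only a sketch of the mechanism: the names LoggingAgent and RecordingTransformer are invented for this post, and the transformer merely records which classes pass through it instead of rewriting them.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class LoggingAgent {

    /** A transformer that only records class names and leaves the bytecode untouched. */
    public static class RecordingTransformer implements ClassFileTransformer {
        private final List<String> seen = new CopyOnWriteArrayList<>();

        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            seen.add(className);
            // returning null tells the JVM "no change"; a real tool would return the rewritten bytes
            return null;
        }

        public List<String> seenClasses() {
            return seen;
        }
    }

    /** Called by the JVM before main() when the agent is started with -javaagent. */
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        instrumentation.addTransformer(new RecordingTransformer());
    }
}
```

To try it, the class would be packaged into a jar whose manifest names it as Premain-Class, and the program would be started with -javaagent pointing at that jar. A profiler would do the same, but return modified bytes from transform.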

ASM

ASM is a great library for transforming classes. It consists of three main parts:
  • ClassReader: reads a binary class
  • ClassWriter: writes a binary class
  • ClassVisitor: transforms the binary class; the ClassReader calls the visit methods of your implementation
To be able to write class transformations you need at least a rudimentary understanding of the Java opcodes and of how a stack-based language works.
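If "stack-based" sounds abstract, a toy interpreter may help. This is not the JVM and has nothing to do with ASM. It is just a sketch with three invented instructions (PUSH, ADD, MUL) that behave roughly like the JVM's bipush, iadd and imul: operands are pushed onto a stack, and an operation pops its inputs and pushes its result.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TinyStackMachine {

    public enum Op { PUSH, ADD, MUL }

    public static final class Instruction {
        final Op op;
        final int operand; // only used by PUSH

        Instruction(Op op, int operand) {
            this.op = op;
            this.operand = operand;
        }
    }

    public static Instruction push(int value) { return new Instruction(Op.PUSH, value); }
    public static Instruction add() { return new Instruction(Op.ADD, 0); }
    public static Instruction mul() { return new Instruction(Op.MUL, 0); }

    /** Executes the instructions in order and returns the value left on top of the stack. */
    public static int run(Instruction... program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Instruction instruction : program) {
            switch (instruction.op) {
                case PUSH: stack.push(instruction.operand); break;
                case ADD:  stack.push(stack.pop() + stack.pop()); break;
                case MUL:  stack.push(stack.pop() * stack.pop()); break;
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // 1 + 2 * 3 written in stack order: push the operands first, then apply the operations
        System.out.println(run(push(1), push(2), push(3), mul(), add())); // prints 7
    }
}
```

The transformation code later in this post works the same way: it first emits instructions that put the arguments on the stack and then emits the call that consumes them.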

There are tools which show you the bytecode of any class and generate the ASM code that creates this class. Therefore you don't need to write all the opcodes by hand. Just write a Java class which represents the result of your transformation, then look at the generated code and adapt it to your needs. In theory that sounds very easy, but I still had to read the ASM documentation and the Java opcode documentation to make even a quite simple transformation work.

Example

This example, which can be found completely on GitHub, does two things.
  • Wraps all static Logger variables
  • Logs all method calls
For more details read the code. It's heavily documented, and reading it makes more sense than anything else. Have fun! :-)
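Both transformations insert calls to two small helper classes, LoggerWrapper and MethodLogger, which live in the GitHub project. What follows is only a guess at what such helpers could look like. The method names and signatures match the ones the transformation calls, but the bodies (and the extra describe method) are assumptions made for this post. In the real project they are top-level classes in the package at.rseiler.concept. They are nested here only to keep the sketch in one file.

```java
import java.util.Arrays;
import java.util.logging.Logger;

public class TransformationHelpers {

    /** Hypothetical stand-in for at.rseiler.concept.LoggerWrapper. */
    public static final class LoggerWrapper {
        private LoggerWrapper() {
        }

        /** Gets the original logger from the stack and returns the logger to store in the field. */
        public static Logger logger(Logger logger) {
            System.out.println("wrapping logger " + logger.getName());
            return logger; // assumption: a real wrapper might return a decorated logger instead
        }
    }

    /** Hypothetical stand-in for at.rseiler.concept.MethodLogger. */
    public static final class MethodLogger {
        private MethodLogger() {
        }

        /** Formats a call as methodName[arg1, arg2, ...]. */
        public static String describe(String name, Object... args) {
            return name + Arrays.toString(args);
        }

        /** Same signature as the call the transformation inserts at the start of each method. */
        public static void log(String name, Object... args) {
            System.out.println("called " + describe(name, args));
        }
    }
}
```

The important part is the shape of the signatures: logger takes a Logger and returns a Logger, so it can be slotted in right before the putstatic, and log takes the method name plus an Object array, which is exactly what the transformation builds on the stack.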

The code on GitHub, with syntax highlighting (yeah!): https://github.com/rseiler/concept-class-transformation-with-asm/blob/master/src/main/java/at/rseiler/concept/Main.java


package at.rseiler.concept;

import org.objectweb.asm.*;
import org.objectweb.asm.commons.AdviceAdapter;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.lang.reflect.Method;
import java.util.logging.Logger;

import static org.objectweb.asm.Opcodes.ASM4;
import static org.objectweb.asm.Opcodes.INVOKESTATIC;

/**
 * A demo how to do byte code transformation with ASM.
 * 
 * The program will load the HelloWorld class file and manipulate the byte code:
 * 
 * 1. Wraps static {@link Logger} into the {@link LoggerWrapper#logger(Logger)}
 * 2. Adds at the beginning of each method a call to {@link MethodLogger#log(String, Object...)}
 * 
 * 1.
 * private static final Logger logger1 = Logger.getLogger(HelloWorld.class.getName());
 * will be transformed into:
 * private static final Logger logger1 = LoggerWrapper.logger(Logger.getLogger(HelloWorld.class.getName()));
 * 
 * 2.
 * public String foo(String arg) {
 * return bar("foo", arg);
 * }
 * will be transformed into:
 * public String foo(String arg) {
 * MethodLogger.log("foo", arg);
 * return bar("foo", arg);
 * }
 * 
 * 
 * You shouldn't rely on the ASM version packed into the JDK for production code!
 * If a new Java version ships, it could contain a new version of ASM (or remove ASM), which will break your code.
 * Therefore you must repackage ASM into your own namespace, to prevent version conflicts, and ship it with your library.
 * 
 * Because this is non-production code and I am lazy, I didn't do it.
 * 
 * IMPORTANT: If you try to run the program on a JVM other than JDK 8, it will probably fail.
 *
 * @author reinhard.seiler@gmail.com
 */
public class Main {

    public static void main(String[] args) throws Exception {
        // creates the ASM ClassReader which will read the class file (the stream is closed after reading)
        ClassReader classReader;
        try (FileInputStream in = new FileInputStream(new File("HelloWorld.class"))) {
            classReader = new ClassReader(in);
        }
        // creates the ASM ClassWriter which will create the transformed class
        ClassWriter classWriter = new ClassWriter(ClassWriter.COMPUTE_MAXS);
        // creates the ClassVisitor to do the byte code transformations
        ClassVisitor classVisitor = new MyClassVisitor(ASM4, classWriter);
        // reads the class file and apply the transformations which will be written into the ClassWriter
        classReader.accept(classVisitor, 0);

        // gets the bytes from the transformed class
        byte[] bytes = classWriter.toByteArray();
        // writes the transformed class to the file system - to analyse it (e.g. javap -verbose)
        try (FileOutputStream out = new FileOutputStream(new File("HelloWorld$$Transformed.class"))) {
            out.write(bytes);
        }

        // inject the transformed class into the current class loader
        ClassLoader classLoader = Main.class.getClassLoader();
        Method defineClass = ClassLoader.class.getDeclaredMethod("defineClass", String.class, byte[].class, int.class, int.class);
        defineClass.setAccessible(true);
        Class<?> helloWorldClass = (Class<?>) defineClass.invoke(classLoader, null, bytes, 0, bytes.length);

        // creates an instance of the transformed class
        Object helloWorld = helloWorldClass.newInstance();
        Method hello = helloWorldClass.getMethod("hello");
        // calls the hello method
        hello.invoke(helloWorld);
    }

    private static class MyClassVisitor extends ClassVisitor {

        public MyClassVisitor(int i, ClassVisitor classVisitor) {
            super(i, classVisitor);
        }

        @Override
        public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions) {
            if (cv == null) {
                return null;
            }

            MethodVisitor mv = super.visitMethod(access, name, desc, signature, exceptions);
            // <clinit> defines the static block in which the assignment of static variables happens.
            // E.g. private static final Logger logger = Logger.getLogger(HelloWorld.class.getName());
            // The assignment of the logger variable happens in <clinit>.
            if ("<clinit>".equals(name)) {
                return new StaticBlockMethodVisitor(mv);
            } else {
                // all other methods (static and non-static)
                return new MethodLogger(mv, access, name, desc);
            }
        }

        class StaticBlockMethodVisitor extends MethodVisitor {
            StaticBlockMethodVisitor(MethodVisitor mv) {
                super(ASM4, mv);
            }

            @Override
            public void visitFieldInsn(int opcode, String owner, String name, String desc) {
                // checks for: putstatic // Field *:Ljava/util/logging/Logger;
                if ("Ljava/util/logging/Logger;".equals(desc)) {
                    // adds before the putstatic opcode the call to LoggerWrapper#logger(Logger) to wrap the logger instance
                    super.visitMethodInsn(INVOKESTATIC, "at/rseiler/concept/LoggerWrapper", "logger", "(Ljava/util/logging/Logger;)Ljava/util/logging/Logger;");
                }
                // do the default behaviour: add the putstatic opcode to the byte code
                super.visitFieldInsn(opcode, owner, name, desc);
            }
        }

        class MethodLogger extends AdviceAdapter {

            private final int access;
            private final String name;
            private final String desc;

            protected MethodLogger(MethodVisitor mv, int access, String name, String desc) {
                super(ASM4, mv, access, name, desc);
                this.access = access;
                this.name = name;
                this.desc = desc;
            }

            @Override
            protected void onMethodEnter() {
                // checks if the method is static.
                // For an instance method "this" is stored in local variable 0 (loaded with ALOAD_0) and the arguments follow in 1, 2, ...
                // There is no "this" for a static method. Therefore its arguments start at local variable 0.
                // If we want to access the arguments we need to differentiate between static and non-static methods.
                boolean isStatic = (access & ACC_STATIC) != 0;

                int length = Type.getArgumentTypes(desc).length;

                // pushes the method name on the stack
                super.visitLdcInsn(name);
                // pushes the count of arguments on the stack
                // could be optimized if we would use iconst_0, iconst_1, ..., iconst_5 for 0 to 5.
                super.visitIntInsn(BIPUSH, length);
                // creates an object array with the count of arguments
                super.visitTypeInsn(ANEWARRAY, "java/lang/Object");

                // stores the arguments in the array
                for (int i = 0; i < length; i++) {
                    // duplicates the reference to the array. Because the AASTORE opcode consumes the stack element with the reference to the array.
                    super.visitInsn(DUP);
                    // could be optimized
                    super.visitIntInsn(BIPUSH, i);
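                    // puts the value of the current argument on the stack
                    // NOTE: ALOAD only works for reference types; primitive arguments (which would also need boxing)
                    // and the two local variable slots taken by long/double are not handled in this demo
                    super.visitVarInsn(ALOAD, i + (isStatic ? 0 : 1));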
                    // stores the value of the current argument in the array
                    super.visitInsn(AASTORE);
                }

                // calls the MethodLogger#log(String, Object...) method with the corresponding arguments - which we created just before
                super.visitMethodInsn(INVOKESTATIC, "at/rseiler/concept/MethodLogger", "log", "(Ljava/lang/String;[Ljava/lang/Object;)V");
            }
        }

    }

}
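
The listing injects the transformed class by calling the protected ClassLoader.defineClass method via reflection. Newer JDKs restrict this kind of reflective access to JDK internals, which is one reason why the program may fail there. A possible alternative is a tiny class loader of your own that exposes defineClass. This is only a sketch and not part of the GitHub project. The name ByteArrayClassLoader is made up.

```java
public class ByteArrayClassLoader extends ClassLoader {

    public ByteArrayClassLoader(ClassLoader parent) {
        super(parent);
    }

    /** Defines a class from raw class file bytes; the class name is taken from the bytes themselves. */
    public Class<?> define(byte[] classBytes) {
        // passing null as name lets the JVM read the name from the class file
        return defineClass(null, classBytes, 0, classBytes.length);
    }
}
```

With this loader the injection step in main could be written as new ByteArrayClassLoader(Main.class.getClassLoader()).define(bytes). Because the parent is the loader of Main, the helper classes referenced by the transformed class should still be found, as long as they are on the class path.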