lMingyul


Usage Series - Introduction to Arthas

At our company, we usually diagnose faults by adding log statements at key points in the code and then swapping the rebuilt Jar package into the container. This process is cumbersome:

  • First, the logs have to be comprehensive. If a key piece of information was not logged, you have to add the log statement, rebuild, replace the package, and restart the service again, which wastes a lot of time.
  • Second, not every environment allows package replacement and service restarts.

So I have recently been looking for a better way to diagnose faults, and Arthas, Alibaba's open-source Java diagnostic tool, looks quite good. It can show and record method call parameters, return values, call paths, call durations, call counts, success counts, failure counts, and more. These are my notes from learning the tool.

What is Arthas#

Official introduction:
Arthas is an online monitoring and diagnostic product that allows real-time viewing of application load, memory, GC, and thread status information from a global perspective. It can diagnose business problems without modifying application code, including viewing method call parameters, exceptions, monitoring method execution time, class loading information, etc., greatly improving the efficiency of online problem troubleshooting.

Operating Environment#

  • Requires JDK 6 or later
  • Written in Java and cross-platform: Linux (the main target), Mac, and Windows

Features#

  • Runs as an interactive command-line console
  • Supports Tab key auto-completion

Initial Use#

Since our company's services mainly run in containers, the notes below focus on using the tool in a Linux environment.

Download the Package#

Our company environment is an intranet, so the containers cannot download the installation package from GitHub directly. To avoid network problems, I download the package manually from GitHub and copy it into the service container.

Download the full installation package from the GitHub releases page: https://github.com/alibaba/arthas/releases


Unzip in the Container#

# Create a directory dedicated to arthas, as many files will be generated after unzipping
mkdir arthas

# Unzip to the newly created arthas directory
unzip -d arthas arthas-bin.zip


Uninstall#

Once the problem has been located, it is important to clean up afterwards, so the way to uninstall the tool is recorded here as well.

The tool is uninstalled by running the following three commands:

rm -rf arthas

rm -rf ~/.arthas/

# Remove only the Arthas log directory rather than everything under ~/logs
rm -rf ~/logs/arthas

Run#

First, start a long-running Java process. The official installation package ships with a Jar for practice: math-game.jar. (Our own services already run continuously; the official demo is used here only for these notes.)

# Start this Java program; if you have your own service, you can skip this step
java -jar math-game.jar
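
If math-game.jar is not at hand, any long-running Java program can serve as a practice target. The sketch below is a hypothetical stand-in, loosely modeled on what the demo output in these notes suggests (a primeFactors method and an illegalArgumentCount field). The class name PracticeGame and all of its details are my own assumptions, not the official math-game source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical practice target; NOT the official math-game source.
class PracticeGame {
    private final Random random = new Random();
    // Counts rejected inputs, so there is a field to observe with watch.
    private int illegalArgumentCount = 0;

    /** Returns the prime factors of number in ascending order; rejects values below 2. */
    List<Integer> primeFactors(int number) {
        if (number < 2) {
            illegalArgumentCount++;
            throw new IllegalArgumentException("number is: " + number + ", need >= 2");
        }
        List<Integer> factors = new ArrayList<>();
        for (int divisor = 2; (long) divisor * divisor <= number; divisor++) {
            while (number % divisor == 0) {
                factors.add(divisor);
                number /= divisor;
            }
        }
        if (number > 1) {
            factors.add(number); // whatever remains is itself prime
        }
        return factors;
    }

    int getIllegalArgumentCount() {
        return illegalArgumentCount;
    }

    /** One round: factor a random number; some inputs are invalid so exceptions occur too. */
    void run() {
        int number = random.nextInt(20000) - 5000; // below 2 roughly a quarter of the time
        try {
            System.out.println(number + " = " + primeFactors(number));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PracticeGame game = new PracticeGame();
        while (true) { // never exits, so there is always a live process to attach to
            game.run();
            Thread.sleep(1000);
        }
    }
}
```

Saved as PracticeGame.java, it can be compiled with javac PracticeGame.java and started with java PracticeGame. The Arthas commands in these notes would then target PracticeGame instead of demo.MathGame.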

Then start Arthas:

# 1. Start
java -jar arthas-boot.jar

# 2. arthas-boot lists the Java processes it found. Enter the index shown in brackets for the process you want to attach to (not its PID) and press Enter. Only one process is listed here, so the index is 1.
1

# When the Arthas logo appears, Arthas has attached to the selected process.



Common Commands#

help#

# Enter help to list the available Arthas commands with a short description of each.
[arthas@421554]$ help
 NAME         DESCRIPTION                                                                                                                                                          
 help         Display Arthas Help                                                                                                                                                  
 auth         Authenticates the current session                                                                                                                                    
 keymap       Display all the available keymap for the specified connection.                                                                                                       
 sc           Search all the classes loaded by JVM                                                                                                                                 
 sm           Search the method of classes loaded by JVM                                                                                                                           
 classloader  Show classloader info                                                                                                                                                
 jad          Decompile class                                                                                                                                                      
 getstatic    Show the static field of a class                                                                                                                                     
 monitor      Monitor method execution statistics, e.g. total/success/failure count, average rt, fail rate, etc.                                                                   
 stack        Display the stack trace for the specified class and method                                                                                                           
 thread       Display thread info, thread stack                                                                                                                                    
 trace        Trace the execution time of specified method invocation.                                                                                                             
 watch        Display the input/output parameter, return object, and thrown exception of specified method invocation                                                               
 tt           Time Tunnel                                                                                                                                                          
 jvm          Display the target JVM information                                                                                                                                   
 memory       Display jvm memory info.                                                                                                                                             
 perfcounter  Display the perf counter information.                                                                                                                                
 ognl         Execute ognl expression.                                                                                                                                             
 mc           Memory compiler, compiles java files into bytecode and class files in memory.                                                                                        
 redefine     Redefine classes. @see Instrumentation#redefineClasses(ClassDefinition...)                                                                                           
 retransform  Retransform classes. @see Instrumentation#retransformClasses(Class...)                                                                                               
 dashboard    Overview of target jvm's thread, memory, gc, vm, tomcat info.                                                                                                        
 dump         Dump class byte array from JVM                                                                                                                                       
 heapdump     Heap dump                                                                                                                                                            
 options      View and change various Arthas options                                                                                                                               
 cls          Clear the screen                                                                                                                                                     
 reset        Reset all the enhanced classes                                                                                                                                       
 version      Display Arthas version                                                                                                                                               
 session      Display current session information                                                                                                                                  
 sysprop      Display and change the system properties.                                                                                                                            
 sysenv       Display the system env.                                                                                                                                              
 vmoption     Display, and update the vm diagnostic options.                                                                                                                       
 logger       Print logger info, and update the logger level                                                                                                                       
 history      Display command history                                                                                                                                              
 cat          Concatenate and print files                                                                                                                                          
 base64       Encode and decode using Base64 representation                                                                                                                        
 echo         write arguments to the standard output                                                                                                                               
 pwd          Return working directory name                                                                                                                                        
 mbean        Display the mbean information                                                                                                                                        
 grep         grep command for pipes.                                                                                                                                              
 tee          tee command for pipes.                                                                                                                                               
 profiler     Async Profiler. https://github.com/jvm-profiling-tools/async-profiler                                                                                                
 vmtool       jvm tool                                                                                                                                                             
 stop         Stop/Shutdown Arthas server and exit the console.                                                                                                                    
 jfr          Java Flight Recorder Command

dashboard#

dashboard displays a real-time data panel for the current system. Without it, we can generally only check how the system is running through Linux's built-in top command.

Type dashboard and press Enter to display information about the current process. Press Ctrl+C or enter q to stop.
CleanShot-2023-04-30-01-01-50@2x

The output is roughly divided into three sections:

  • The top section shows thread information
  • The middle section shows JVM memory information
  • The bottom section shows information about the Java runtime environment

For the meaning of each column, please refer to the official documentation.
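
The same three kinds of information are also exposed by the JDK's standard java.lang.management API. The sketch below prints a one-shot snapshot from inside a program. MiniDashboard is a hypothetical name, and this only illustrates the three categories; it is not a description of how Arthas collects its data.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.RuntimeMXBean;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: a one-shot, text-only imitation of the three dashboard sections.
class MiniDashboard {
    private static final long MB = 1024L * 1024L;

    /** Builds a three-line snapshot: threads, heap memory, runtime. */
    static String snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();

        StringBuilder out = new StringBuilder();
        out.append("threads: live=").append(threads.getThreadCount())
           .append(" daemon=").append(threads.getDaemonThreadCount()).append('\n');
        out.append("heap: used=").append(heap.getUsed() / MB).append("M")
           .append(" committed=").append(heap.getCommitted() / MB).append("M\n");
        out.append("runtime: java.version=").append(System.getProperty("java.version"))
           .append(" uptime=").append(runtime.getUptime() / 1000).append("s\n");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```

Unlike dashboard, this has to be compiled into the application, which is exactly the redeploy step that attaching Arthas avoids.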

thread#

View information about the current threads and their stacks.

thread without arguments, display the first page of thread information#

thread

By default, threads are sorted in descending order of the CPU time they consumed during the sampling interval (the CPU delta), and only the first page is shown.


thread -n N, display the N busiest threads and print their stacks#

thread -n N


thread --all, display all matching threads#

# Display all matching thread information. Sometimes you need to obtain all JVM thread data for analysis.
thread --all


thread id, display the running stack of the specified thread#

[arthas@421554]$ thread 1
"main" Id=1 TIMED_WAITING
    at java.lang.Thread.sleep(Native Method)
    at java.lang.Thread.sleep(Thread.java:342)
    at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:386)
    at demo.MathGame.main(MathGame.java:17)

thread -b, find the thread that is currently blocking other threads#

thread -b
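
The math-game demo has no lock contention, so thread -b has nothing to report there. A small program that deliberately blocks one thread on another makes a better practice target. The sketch below is a hypothetical example (the class name BlockingDemo and the thread names are my own): one thread takes a monitor and keeps it, and a second thread blocks while trying to enter the same monitor.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical practice target for thread -b.
class BlockingDemo {
    private static final Object LOCK = new Object();

    /** Starts a daemon thread that takes LOCK and keeps it until release is counted down. */
    static Thread startHolder(final CountDownLatch release) {
        final CountDownLatch acquired = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (LOCK) {
                acquired.countDown();
                awaitQuietly(release);
            }
        }, "lock-holder");
        holder.setDaemon(true);
        holder.start();
        awaitQuietly(acquired); // return only once the lock is really held
        return holder;
    }

    /** Starts a daemon thread that tries to enter the same monitor and therefore blocks. */
    static Thread startWaiter() {
        Thread waiter = new Thread(() -> {
            synchronized (LOCK) {
                // entered only after the holder lets go
            }
        }, "lock-waiter");
        waiter.setDaemon(true);
        waiter.start();
        return waiter;
    }

    /** Polls until the thread reaches the wanted state or the timeout expires. */
    static boolean awaitState(Thread thread, Thread.State wanted, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (thread.getState() != wanted && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return thread.getState() == wanted;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        startHolder(new CountDownLatch(1)); // never released, so the waiter stays blocked
        Thread waiter = startWaiter();
        System.out.println("lock-waiter blocked: " + awaitState(waiter, Thread.State.BLOCKED, 5000));
        awaitQuietly(new CountDownLatch(1)); // park main forever so the JVM stays alive
    }
}
```

With this program running and Arthas attached, thread -b should point at lock-holder as the thread blocking others. According to the official documentation, thread -b only detects blocking caused by the synchronized keyword, which is why the sketch uses a synchronized block rather than a java.util.concurrent lock.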

watch#

Observes calls to the specified method. You can see the method's parameters, return value, and thrown exceptions, and you can inspect other variables by writing OGNL expressions.

Observe the parameters, this object, and return value when the function call returns#

# By default, watch evaluates {params, target, returnObj}, so the command below shows the parameters, the this object, and the return value when the method returns. -x sets how many levels of object properties (sub-objects) are expanded in the output; the default is 1 and the maximum is 4.

[arthas@421554]$ watch demo.MathGame primeFactors -x 2
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 28 ms, listenerId: 2
method=demo.MathGame.primeFactors location=AtExit
ts=2023-05-10 00:25:42; [cost=0.106133ms] result=@ArrayList[
    @Object[][
        @Integer[1],
    ],
    @MathGame[
        random=@Random[java.util.Random@254989ff],
        illegalArgumentCount=@Integer[85442],
    ],
    @ArrayList[
        @Integer[103],
        @Integer[1667],
    ],
]

# Change the depth to 3
[arthas@421554]$ watch demo.MathGame primeFactors -x 3
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 26 ms, listenerId: 3
method=demo.MathGame.primeFactors location=AtExit
ts=2023-05-10 00:26:17; [cost=0.34344ms] result=@ArrayList[
    @Object[][
        @Integer[1],
    ],
    @MathGame[
        random=@Random[
            serialVersionUID=@Long[3905348978240129619],
            seed=@AtomicLong[97774455668942],
            multiplier=@Long[25214903917],
            addend=@Long[11],
            mask=@Long[281474976710655],
            DOUBLE_UNIT=@Double[1.1102230246251565E-16],
            BadBound=@String[bound must be positive],
            BadRange=@String[bound must be greater than origin],
            BadSize=@String[size must be non-negative],
            seedUniquifier=@AtomicLong[-3282039941672302964],
            nextNextGaussian=@Double[0.0],
            haveNextNextGaussian=@Boolean[false],
            serialPersistentFields=@ObjectStreamField[][isEmpty=false;size=3],
            unsafe=@Unsafe[sun.misc.Unsafe@5d099f62],
            seedOffset=@Long[24],
        ],
        illegalArgumentCount=@Integer[85459],
    ],
    @ArrayList[
        @Integer[7],
        @Integer[21313],
    ],
]

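Observe both before the method call and after the method returns#

# -b observes before the call (location=AtEnter) and -s observes after a normal return (location=AtExit).
# At AtEnter the return value is still null. In the output below, the parameter is 163405 on entry but 1 on exit, so the value changed while the method ran.
[arthas@421554]$ watch demo.MathGame primeFactors "{params,target,returnObj}" -x 2 -b -s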
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 39 ms, listenerId: 6
method=demo.MathGame.primeFactors location=AtEnter
ts=2023-05-10 00:30:00; [cost=0.036373ms] result=@ArrayList[
    @Object[][
        @Integer[163405],
    ],
    @MathGame[
        random=@Random[java.util.Random@254989ff],
        illegalArgumentCount=@Integer[85576],
    ],
    null,
]
method=demo.MathGame.primeFactors location=AtExit
ts=2023-05-10 00:30:00; [cost=1.4721136326180643E10ms] result=@ArrayList[
    @Object[][
        @Integer[1],
    ],
    @MathGame[
        random=@Random[java.util.Random@254989ff],
        illegalArgumentCount=@Integer[85576],
    ],
    @ArrayList[
        @Integer[5],
        @Integer[11],
        @Integer[2971],
    ],
]

Observe the properties in the current object#

To view a property of the current object, use the target keyword. target refers to the current object (this), and target.field_name accesses a specific field of it. In the output below, illegalArgumentCount goes from 85676 to 85677 between a normal exit and a call that exits with an exception.

[arthas@421554]$ watch demo.MathGame primeFactors 'target.illegalArgumentCount'
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 31 ms, listenerId: 7
method=demo.MathGame.primeFactors location=AtExit
ts=2023-05-10 00:33:50; [cost=0.081212ms] result=@Integer[85676]
method=demo.MathGame.primeFactors location=AtExceptionExit
ts=2023-05-10 00:33:51; [cost=0.102672ms] result=@Integer[85677]

trace#

Shows the call path inside a method and the time spent at each node along that path. Use it when a service call takes too long and you need to find out which internal call is slow.

[arthas@421554]$ trace demo.MathGame run
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 55 ms, listenerId: 9
`---ts=2023-05-11 00:24:38;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@74a14482
    `---[0.54719ms] demo.MathGame:run()
        +---[20.76% 0.113574ms ] demo.MathGame:primeFactors() #24
        `---[53.28% 0.29155ms ] demo.MathGame:print() #25

  • In the output, #24 means that primeFactors() was called at line 24 of the source file.
  • In the output, #25 means that print() was called at line 25 of the source file.
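
The percentages in the trace output are consistent with each child's time divided by the total time of run(). The sketch below redoes that arithmetic with the numbers from the output above; TraceShare and share are hypothetical names used only for this illustration.

```java
import java.util.Locale;

// Hypothetical helper that reproduces the percentages shown in the trace output.
class TraceShare {
    /** Share of the total cost spent in one node, formatted with two decimals. */
    static String share(double nodeMs, double totalMs) {
        return String.format(Locale.ROOT, "%.2f%%", 100.0 * nodeMs / totalMs);
    }

    public static void main(String[] args) {
        double total = 0.54719;         // demo.MathGame:run()
        double primeFactors = 0.113574; // demo.MathGame:primeFactors()
        double print = 0.29155;         // demo.MathGame:print()

        System.out.println("primeFactors: " + share(primeFactors, total));                  // 20.76%
        System.out.println("print:        " + share(print, total));                         // 53.28%
        System.out.println("unlisted:     " + share(total - primeFactors - print, total));  // 25.96%
    }
}
```

The remaining 25.96% is not attributed to any listed call. My reading is that it covers the code in run() itself plus anything trace did not list, so it is worth keeping in mind when the listed children do not add up to the total.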

stack#

Prints the call path that leads to the specified method. When a method is called from many places and you need to know which caller triggered a particular execution, use this command to trace back.

[arthas@421554]$ stack demo.MathGame primeFactors
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 38 ms, listenerId: 12
ts=2023-05-11 00:32:43;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@74a14482
    @demo.MathGame.primeFactors()
        at demo.MathGame.run(MathGame.java:24)
        at demo.MathGame.main(MathGame.java:16)
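
For comparison, the in-code way to get the same kind of call path is to capture a stack trace inside the method, which is the log-and-redeploy routine that stack avoids. A minimal sketch, with hypothetical class and method names that only mirror the shape of the demo's call chain:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names; only the shape of the call chain mirrors the demo.
class CallPath {
    /** Returns "Class.method" for each frame above this method, innermost first. */
    static List<String> callers() {
        StackTraceElement[] frames = new Throwable().getStackTrace();
        List<String> path = new ArrayList<>();
        // frames[0] is callers() itself, so start at index 1
        for (int i = 1; i < frames.length; i++) {
            path.add(frames[i].getClassName() + "." + frames[i].getMethodName());
        }
        return path;
    }

    static List<String> primeFactors() {
        return callers();
    }

    static List<String> run() {
        return primeFactors();
    }

    public static void main(String[] args) {
        for (String frame : run()) {
            System.out.println("at " + frame);
        }
    }
}
```

The first entries printed are primeFactors and then run, the same innermost-first order that the stack output uses.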

jad#

Decompiles a class that is already loaded in the JVM back into source code, which makes it easier to check the business logic that is actually running online. The decompiled code is syntax-highlighted.

jad demo.MathGame


