Analysis Tool Suite

In addition to a CPU cache simulator, other predefined analysis tools are available that operate on memory address traces. Which set of tools is used can be selected with the -tool parameter. New, custom tools can also be created, as described in Creating New Analysis Tools.

Cache Simulator

This is the default tool. Here is an exmample of running it on an offline trace:

$ bin64/drrun -t drmemtrace -offline -- ~/test/pi_estimator

Estimation of pi is 3.142425985001098

$ bin64/drrun -t drmemtrace -indir drmemtrace.*.dir

Cache simulation results:

Core #0 (1 thread(s))

L1I stats:

Hits: 258,433

Misses: 1,148

Miss rate: 0.44%

L1D stats:

Hits: 93,654

Misses: 2,624

Prefetch hits: 458

Prefetch misses: 2,166

Miss rate: 2.73%

Core #1 (1 thread(s))

L1I stats:

Hits: 8,895

Misses: 99

Miss rate: 1.10%

L1D stats:

Hits: 3,448

Misses: 156

Prefetch hits: 26

Prefetch misses: 130

Miss rate: 4.33%

Core #2 (1 thread(s))

L1I stats:

Hits: 4,150

Misses: 101

Miss rate: 2.38%

L1D stats:

Hits: 1,578

Misses: 130

Prefetch hits: 25

Prefetch misses: 105

Miss rate: 7.61%

Core #3 (0 thread(s))

LL stats:

Hits: 1,414

Misses: 2,844

Prefetch hits: 824

Prefetch misses: 1,577

Local miss rate: 66.79%

Child hits: 370,667

Total miss rate: 0.76%

TLB Simulator

To simulate TLB devices instead of caches, pass TLB to -tool:

$ bin64/drrun -t drmemtrace -tool TLB -- ~/test/pi_estimator

Estimation of pi is 3.142425985001098

---- <application exited with code 0> ----

TLB simulation results:

Core #0 (1 thread(s))

L1I stats:

Hits: 252,412

Misses: 401

Miss rate: 0.16%

L1D stats:

Hits: 87,132

Misses: 9,127

Miss rate: 9.48%

LL stats:

Hits: 9,315

Misses: 213

Local miss rate: 2.24%

Child hits: 339,544

Total miss rate: 0.06%

Core #1 (1 thread(s))

L1I stats:

Hits: 8,709

Misses: 20

Miss rate: 0.23%

L1D stats:

Hits: 3,544

Misses: 55

Miss rate: 1.53%

LL stats:

Hits: 15

Misses: 60

Local miss rate: 80.00%

Child hits: 12,253

Total miss rate: 0.49%

Core #2 (1 thread(s))

L1I stats:

Hits: 1,622

Misses: 21

Miss rate: 1.28%

L1D stats:

Hits: 689

Misses: 35

Miss rate: 4.83%

LL stats:

Hits: 3

Misses: 53

Local miss rate: 94.64%

Child hits: 2,311

Total miss rate: 2.24%

Core #3 (0 thread(s))

Reuse Distance

To compute reuse distance metrics:

$ bin64/drrun -t drmemtrace -tool reuse_distance -reuse_distance_histogram -- ~/test/pi_estimator

Estimation of pi is 3.142425985001098

---- <application exited with code 0> ----

Reuse distance tool aggregated results:

Total accesses: 349632

Unique accesses: 196603

Unique cache lines accessed: 4235

Reuse distance mean: 14.64

Reuse distance median: 1

Reuse distance standard deviation: 104.10

Reuse distance histogram:

Distance Count Percent Cumulative

0 153029 44.36% 44.36%

1 101294 29.37% 73.73%

2 14116 4.09% 77.82%

3 14248 4.13% 81.95%

4 8894 2.58% 84.53%

5 2733 0.79% 85.32%

...

==================================================

Reuse distance tool results for shard 29327 (thread 29327):

Total accesses: 335084

Unique accesses: 187927

Unique cache lines accessed: 4148

Reuse distance mean: 14.77

Reuse distance median: 1

Reuse distance standard deviation: 106.02

Reuse distance histogram:

Distance Count Percent Cumulative

0 147157 44.47% 44.47%

1 96820 29.26% 73.72%

2 13613 4.11% 77.84%

3 13834 4.18% 82.02%

4 8666 2.62% 84.64%

5 2552 0.77% 85.41%

...

3658 29 0.01% 100.00%

3851 1 0.00% 100.00%

Reuse distance threshold = 100 cache lines

Top 10 frequently referenced cache lines

cache line: #references #distant refs

0x7f2a86b3fd80: 27980, 0

0x7f2a86b3fdc0: 18823, 0

0x7f2a88388fc0: 16409, 111

0x7f2a8838abc0: 15176, 6

0x7f2a883884c0: 9930, 20

0x7f2a88388480: 7944, 20

0x7f2a88388500: 7574, 20

0x7f2a88398d00: 7390, 100

0x7f2a86b3fd40: 6668, 0

0x7f2a88388440: 5717, 20

Top 10 distant repeatedly referenced cache lines

cache line: #references #distant refs

0x7f2a885a4180: 246, 132

0x7f2a87504ec0: 202, 128

0x7f2a875044c0: 323, 126

0x7f2a885a4480: 220, 126

0x7f2a87504f00: 293, 124

0x7f2a86fd7e00: 289, 124

0x7f2a875049c0: 221, 124

0x7f2a875053c0: 270, 122

0x7f2a86db9c00: 269, 122

0x7f2a875047c0: 201, 122

==================================================

Reuse distance tool results for shard 29328 (thread 29328):

Total accesses: 12216

Unique accesses: 7251

Unique cache lines accessed: 319

Reuse distance mean: 12.98

Reuse distance median: 1

Reuse distance standard deviation: 38.19

Reuse distance histogram:

Distance Count Percent Cumulative

0 4965 41.73% 41.73%

1 3758 31.59% 73.32%

2 411 3.45% 76.78%

3 348 2.93% 79.70%

4 179 1.50% 81.21%

5 152 1.28% 82.48%

...

Reuse Time

A reuse time tool is also provided, which counts the total number of memory accesses (without considering uniqueness) between accesses to the same address:

$ bin64/drrun -t drmemtrace -tool reuse_time -- ~/test/pi_estimator

Estimation of pi is 3.142425985001098

---- <application exited with code 0> ----

Reuse time tool aggregated results:

Total accesses: 88281

Total instructions: 261315

Mean reuse time: 433.47

Reuse time histogram:

Distance Count Percent Cumulative

1 27893 32.84% 32.84%

2 10948 12.89% 45.73%

3 5789 6.82% 52.54%

...

==================================================

Reuse time tool results for shard 29482 (thread 29482):

Total accesses: 84194

Total instructions: 250854

Mean reuse time: 450.01

Reuse time histogram:

Distance Count Percent Cumulative

1 26677 32.86% 32.86%

2 10508 12.95% 45.81%

3 5427 6.69% 52.50%

...

==================================================

Reuse time tool results for shard 29483 (thread 29483):

Total accesses: 3411

Total instructions: 8805

Mean reuse time: 86.36

Reuse time histogram:

Distance Count Percent Cumulative

1 1014 31.56% 31.56%

2 363 11.30% 42.86%

3 308 9.59% 52.44%

Event Counts

To simply see the counts of instructions and memory references broken down by thread use the basic counts tool:

$ bin64/drrun -t drmemtrace -tool basic_counts -- ~/test/pi_estimator

Estimation of pi is 3.142425985001098

---- <application exited with code 0> ----

Basic counts tool results:

Total counts:

134566 total (fetched) instructions

13838 total unique (fetched) instructions

423 total non-fetched instructions

0 total prefetches

30919 total data loads

13122 total data stores

0 total icache flushes

0 total dcache flushes

3 total threads

1134 total timestamp + cpuid markers

0 total idle markers

0 total wait markers

5 total kernel transfer markers

10 total function id markers

0 total function return address markers

30 total function argument markers

5 total function return value markers

0 total physical address + virtual address marker pairs

0 total physical address unavailable markers

75 total system call number markers

0 total blocking system call markers

12 total other markers

0 total encodings

Thread 64951 counts:

130900 (fetched) instructions

13469 unique (fetched) instructions

423 non-fetched instructions

0 prefetches

29844 data loads

12594 data stores

0 icache flushes

0 dcache flushes

1072 timestamp + cpuid markers

0 idle markers

0 wait markers

5 kernel transfer markers

4 function id markers

0 function return address markers

12 function argument markers

2 function return value markers

0 physical address + virtual address marker pairs

0 physical address unavailable markers

56 system call number markers

0 blocking system call markers

4 other markers

0 encodings

Thread 64958 counts:

1861 (fetched) instructions

1009 unique (fetched) instructions

0 non-fetched instructions

0 prefetches

538 data loads

262 data stores

0 icache flushes

0 dcache flushes

30 timestamp + cupid markers

0 idle markers

0 wait markers

0 kernel transfer markers

2 function id markers

0 function return address markers

6 function argument markers

1 function return value markers

0 physical address + virtual address marker pairs

0 physical address unavailable markers

9 system call number markers

0 blocking system call markers

4 other markers

0 encodings

Thread 64959 counts:

1805 (fetched) instructions

978 unique (fetched) instructions

0 non-fetched instructions

0 prefetches

537 data loads

266 data stores

0 icache flushes

0 dcache flushes

32 timestamp + cpuid markers

0 idle markers

0 wait markers

0 kernel transfer markers

4 function id markers

0 function return address markers

12 function argument markers

2 function return value markers

0 physical address + virtual address marker pairs

0 physical address unavailable markers

10 system call number markers

0 blocking system call markers

4 other markers

0 encodings

The non-fetched instructions are x86 string loop instructions, where subsequent iterations do not incur a fetch. They are included in the trace as a different type of trace entry to support core simulators in addition to cache simulators.

Opcode Mix

The opcode_mix tool uses the non-fetched instruction information along with the preserved libraries and binaries from the traced execution to gather more information on each executed instruction than was stored in the trace. To run on online traces, pass the -instr_encodings option. The results are broken down by the opcodes used in DR's IR, where for x86 mov is split into a separate opcode for load and store but both have the same public string "mov":

$ bin64/drrun -t drmemtrace -offline -- ~/test/pi_estimator

Estimation of pi is 3.142425985001098

$ bin64/drrun -t drmemtrace -tool opcode_mix -indir drmemtrace.*.dir

Opcode mix tool results:

267271 : total executed instructions

36432 : mov

31075 : mov

24715 : add

22579 : test

22539 : cmp

12137 : lea

11136 : jnz

10568 : movzx

10243 : jz

9056 : and

8064 : jnz

7279 : jz

5659 : push

4528 : sub

4357 : pop

4001 : shr

3427 : jnbe

2634 : mov

2469 : shl

2344 : jb

2291 : ret

2178 : xor

2164 : call

2111 : pcmpeqb

1472 : movdqa

...

Human-Readable View

The view tool prints out the contents of the trace for human viewing, including disassembling instructions in AT&T, Intel, Arm, or DR format (to see disassembly for online traces, pass the -instr_encodings option). The -skip_refs and -sim_refs flags can be used to set a start point and end point for the disassembled view. Note that these flags compute the number of trace entry records which are skipped or displayed which is distinct from the number of instruction records; to skip instruction records use -skip_instrs or -skip_to_timestamp (this is faster than -skip_refs).

The tool displays loads and stores, as well as metadata marker entries for timestamps, on which core and thread the subsequent instruction sequence was executed, and kernel and system call transfers (these correspond to signal or event handler interruptions of the regular execution flow).

In its first two columns, the tool displays the trace record ordinal and the instruction fetch ordinal.

$ $ bin64/drrun -t drmemtrace -tool view -indir drmemtrace.*.dir -sim_refs 20

Output format:

<--record#-> <--instr#->: <Wrkld.Tid> <record details>

------------------------------------------------------------

1 0: W0.T3256418 <marker: version 6>

2 0: W0.T3256418 <marker: filetype 0x240>

3 0: W0.T3256418 <marker: cache line size 64>

4 0: W0.T3256418 <marker: chunk instruction count 1024>

5 0: W0.T3256418 <marker: page size 4096>

6 0: W0.T3256418 <marker: timestamp 13312410768080478>

7 0: W0.T3256418 <marker: W0.T3256418 on core 7>

8 1: W0.T3256418 ifetch 3 byte(s) @ 0x00007fc205a61940 48 89 e7 mov %rsp, %rdi

9 2: W0.T3256418 ifetch 5 byte(s) @ 0x00007fc205a61943 e8 b8 0c 00 00 call $0x00007fc205a62600

10 2: W0.T3256418 write 8 byte(s) @ 0x00007fff9a9e3528 by PC 0x00007fc205a61943

11 3: W0.T3256418 ifetch 1 byte(s) @ 0x00007fc205a62600 55 push %rbp

12 3: W0.T3256418 write 8 byte(s) @ 0x00007fff9a9e3520 by PC 0x00007fc205a62600

13 4: W0.T3256418 ifetch 3 byte(s) @ 0x00007fc205a62601 48 89 e5 mov %rsp, %rbp

14 5: W0.T3256418 ifetch 2 byte(s) @ 0x00007fc205a62604 41 57 push %r15

15 5: W0.T3256418 write 8 byte(s) @ 0x00007fff9a9e3518 by PC 0x00007fc205a62604

16 6: W0.T3256418 ifetch 2 byte(s) @ 0x00007fc205a62606 41 56 push %r14

17 6: W0.T3256418 write 8 byte(s) @ 0x00007fff9a9e3510 by PC 0x00007fc205a62606

18 7: W0.T3256418 ifetch 2 byte(s) @ 0x00007fc205a62608 41 55 push %r13

19 7: W0.T3256418 write 8 byte(s) @ 0x00007fff9a9e3508 by PC 0x00007fc205a62608

20 8: W0.T3256418 ifetch 2 byte(s) @ 0x00007fc205a6260a 41 54 push %r12

View tool results:

8 : total instructions

An example of thread switches:

------------------------------------------------------------

46 0: W0.T3264758 <marker: timestamp 13312413437398055>

47 0: W0.T3264758 <marker: W0.T3264758 on core 2>

48 1: W0.T3264758 ifetch 3 byte(s) @ 0x00007f4ea89e4940 48 89 e7 mov %rsp, %rdi

49 2: W0.T3264758 ifetch 5 byte(s) @ 0x00007f4ea89e4943 e8 b8 0c 00 00 call $0x00007f4ea89e5600

50 2: W0.T3264758 write 8 byte(s) @ 0x00007ffd93a0cf18 by PC 0x00007f4ea89e4943

...

2854543 2149665: W0.T3264758 ifetch 5 byte(s) @ 0x00007f4ea7c87f8c b8 0e 00 00 00 mov $0x0000000e, %eax

2854544 2149666: W0.T3264758 ifetch 2 byte(s) @ 0x00007f4ea7c87f91 0f 05 syscall

------------------------------------------------------------

2854545 2149666: W0.T3264760 <marker: timestamp 13312413438835999>

2854546 2149666: W0.T3264760 <marker: W0.T3264760 on core 11>

2854547 2149667: W0.T3264760 ifetch 3 byte(s) @ 0x00007f4ea7d0b099 48 85 c0 test %rax, %rax

2854548 2149668: W0.T3264760 ifetch 2 byte(s) @ 0x00007f4ea7d0b09c 7c 18 jl $0x00007f4ea7d0b0b6

...

Here is an example of a signal handler interrupting the regular flow, with metadata showing that the signal was delivered just after an untaken conditional branch:

801343 601827: W0.T1159769 ifetch 2 byte(s) @ 0x00007fc2c3aa5c70 75 57 jnz $0x00007fc2c3aa5cc9 (untaken)

801344 601827: W0.T1159769 <marker: kernel xfer from 0x7fc2c3aa5c72 to handler>

801345 601827: W0.T1159769 <marker: timestamp 13335923552684013>

801346 601827: W0.T1159769 <marker: W0.T1159769 on core 7>

801347 601828: W0.T1159769 ifetch 1 byte(s) @ 0x00007fc2c03fa259 55 push %rbp

801348 601828: W0.T1159769 write 8 byte(s) @ 0x00007fff8044e930 by PC 0x00007fc2c03fa259

801349 601829: W0.T1159769 ifetch 3 byte(s) @ 0x00007fc2c03fa25a 48 89 e5 mov %rsp, %rbp

801350 601830: W0.T1159769 ifetch 3 byte(s) @ 0x00007fc2c03fa25d 89 7d fc mov %edi, -0x04(%rbp)

801351 601830: W0.T1159769 write 4 byte(s) @ 0x00007fff8044e92c by PC 0x00007fc2c03fa25d

801352 601831: W0.T1159769 ifetch 4 byte(s) @ 0x00007fc2c03fa260 48 89 75 f0 mov %rsi, -0x10(%rbp)

801353 601831: W0.T1159769 write 8 byte(s) @ 0x00007fff8044e920 by PC 0x00007fc2c03fa260

801354 601832: W0.T1159769 ifetch 4 byte(s) @ 0x00007fc2c03fa264 48 89 55 e8 mov %rdx, -0x18(%rbp)

801355 601832: W0.T1159769 write 8 byte(s) @ 0x00007fff8044e918 by PC 0x00007fc2c03fa264

801356 601833: W0.T1159769 ifetch 4 byte(s) @ 0x00007fc2c03fa268 83 7d fc 1a cmp -0x04(%rbp), $0x1a

801357 601833: W0.T1159769 read 4 byte(s) @ 0x00007fff8044e92c by PC 0x00007fc2c03fa268

801358 601834: W0.T1159769 ifetch 2 byte(s) @ 0x00007fc2c03fa26c 75 0f jnz $0x00007fc2c03fa27d (untaken)

801359 601835: W0.T1159769 ifetch 6 byte(s) @ 0x00007fc2c03fa26e 8b 05 c0 3e 00 00 mov <rel> 0x00007fc2c03fe134, %eax

801360 601835: W0.T1159769 read 4 byte(s) @ 0x00007fc2c03fe134 by PC 0x00007fc2c03fa26e

801361 601836: W0.T1159769 ifetch 3 byte(s) @ 0x00007fc2c03fa274 83 c0 01 add $0x01, %eax

801362 601837: W0.T1159769 ifetch 6 byte(s) @ 0x00007fc2c03fa277 89 05 b7 3e 00 00 mov %eax, <rel> 0x00007fc2c03fe134

801363 601837: W0.T1159769 write 4 byte(s) @ 0x00007fc2c03fe134 by PC 0x00007fc2c03fa277

801364 601838: W0.T1159769 ifetch 1 byte(s) @ 0x00007fc2c03fa27d 90 nop

801365 601839: W0.T1159769 ifetch 1 byte(s) @ 0x00007fc2c03fa27e 5d pop %rbp

801366 601839: W0.T1159769 read 8 byte(s) @ 0x00007fff8044e930 by PC 0x00007fc2c03fa27e

801367 601840: W0.T1159769 ifetch 1 byte(s) @ 0x00007fc2c03fa27f c3 ret (target 0x7fc2c3a5af90)

801368 601840: W0.T1159769 read 8 byte(s) @ 0x00007fff8044e938 by PC 0x00007fc2c03fa27f

801369 601841: W0.T1159769 ifetch 7 byte(s) @ 0x00007fc2c3a5af90 48 c7 c0 0f 00 00 00 mov $0x0000000f, %rax

801370 601842: W0.T1159769 ifetch 2 byte(s) @ 0x00007fc2c3a5af97 0f 05 syscall

801371 601842: W0.T1159769 <marker: system call 15>

801372 601842: W0.T1159769 <marker: timestamp 13335923552684023>

801373 601842: W0.T1159769 <marker: W0.T1159769 on core 7>

801374 601842: W0.T1159769 <marker: syscall xfer from 0x7fc2c3a5af99>

801375 601842: W0.T1159769 <marker: timestamp 13335923552684029>

801376 601842: W0.T1159769 <marker: W0.T1159769 on core 7>

801377 601843: W0.T1159769 ifetch 4 byte(s) @ 0x00007fc2c3aa5c72 48 83 c4 48 add $0x48, %rsp

Here is an illustration of what a trace would look like when an uncompleted instruction (mov) is interrupted by an asynchronous signal and the instruction is not removed (The following trace is for demonstration purpose only):

46914793 33950158: W0.T3767811 ifetch 5 byte(s) @ 0x000055f52911b75e c4 42 f8 f5 e8 bzhi %r8, %rax, %r13

46914794 33950159: W0.T3767811 ifetch 5 byte(s) @ 0x000055f52911b763 c4 62 f8 f5 c3 bzhi %rbx, %rax, %r8

46914795 33950160: W0.T3767811 ifetch 4 byte(s) @ 0x000055f52911b768 48 c1 eb 33 shr $0x33, %rbx

46914796 33950161: W0.T3767811 ifetch 3 byte(s) @ 0x000055f52911b76c 48 01 f3 add %rsi, %rbx

46914797 33950162: W0.T3767811 ifetch 5 byte(s) @ 0x000055f52911b76f b8 04 00 00 00 mov $0x00000004, %eax

(The line below is for demonstration purpose only and will not appear in a real drmemtrace.)

46914798 33950163: W0.T3767811 ifetch 6 byte(s) @ 0x000055f52911b774 89 85 68 ff ff ff mov %eax, -0x98(%rbp)

46914799 33950163: W0.T3767811 <marker: kernel xfer from 0x55f52911b774 to handler>

46914800 33950163: W0.T3767811 <marker: signal #27>

46914801 33950163: W0.T3767811 <marker: timestamp 13373506292215933>

46914802 33950163: W0.T3767811 <marker: W0.T3767811 on core 4139>

46914803 33950164: W0.T3767811 ifetch 1 byte(s) @ 0x000055f5298f7500 55 push %rbp

46914804 33950164: W0.T3767811 write 8 byte(s) @ 0x00007fd98bd2f210 by PC 0x000055f5298f7500

When the uncompleted mov instruction is removed, a dynamorio::drmemtrace::TRACE_MARKER_TYPE_UNCOMPLETED_INSTRUCTION marker is placed to indicate an instruction has been removed:

46914793 33950158: W0.T3767811 ifetch 5 byte(s) @ 0x000055f52911b75e c4 42 f8 f5 e8 bzhi %r8, %rax, %r13

46914794 33950159: W0.T3767811 ifetch 5 byte(s) @ 0x000055f52911b763 c4 62 f8 f5 c3 bzhi %rbx, %rax, %r8

46914795 33950160: W0.T3767811 ifetch 4 byte(s) @ 0x000055f52911b768 48 c1 eb 33 shr $0x33, %rbx

46914796 33950161: W0.T3767811 ifetch 3 byte(s) @ 0x000055f52911b76c 48 01 f3 add %rsi, %rbx

46914797 33950162: W0.T3767811 ifetch 5 byte(s) @ 0x000055f52911b76f b8 04 00 00 00 mov $0x00000004, %eax

46914798 33950163: W0.T3767811 <marker: uncompleted instruction, encoding 0x898568ff>

46914799 33950163: W0.T3767811 <marker: kernel xfer from 0x55f52911b774 to handler>

46914800 33950163: W0.T3767811 <marker: signal #27>

46914801 33950163: W0.T3767811 <marker: timestamp 13373506292215933>

46914802 33950163: W0.T3767811 <marker: W0.T3767811 on core 4139>

46914803 33950164: W0.T3767811 ifetch 1 byte(s) @ 0x000055f5298f7500 55 push %rbp

46914804 33950164: W0.T3767811 write 8 byte(s) @ 0x00007fd98bd2f210 by PC 0x000055f5298f7500

View Function Calls

The func_view tool records function argument and return values for function names specified at tracing time. See Tracing Function Calls for more information.

$ bin64/drrun -t drmemtrace -offline -record_function 'fib|1' -- ~/test/fib 5

Estimation of pi is 3.142425985001098

$ bin64/drrun -t drmemtrace -tool func_view -indir drmemtrace.*.dir

0x7fc06d2288eb => common.fib!fib(0x5)

0x7fc06d22888e => common.fib!fib(0x4)

0x7fc06d22888e => common.fib!fib(0x3)

0x7fc06d22888e => common.fib!fib(0x2)

0x7fc06d22888e => common.fib!fib(0x1) => 0x1

0x7fc06d22889d => common.fib!fib(0x0) => 0x1

=> 0x2

0x7fc06d22889d => common.fib!fib(0x1) => 0x1

=> 0x3

0x7fc06d22889d => common.fib!fib(0x2)

0x7fc06d22888e => common.fib!fib(0x1) => 0x1

0x7fc06d22889d => common.fib!fib(0x0) => 0x1

=> 0x2

=> 0x5

0x7fc06d22889d => common.fib!fib(0x3)

0x7fc06d22888e => common.fib!fib(0x2)

0x7fc06d22888e => common.fib!fib(0x1) => 0x1

0x7fc06d22889d => common.fib!fib(0x0) => 0x1

=> 0x2

0x7fc06d22889d => common.fib!fib(0x1) => 0x1

=> 0x3

=> 0x8

Function view tool results:

Function id=0: common.fib!fib

15 calls

15 returns

Cache Line Histogram

The top referenced cache lines are displayed by the histogram tool:

$ bin64/drrun -t drmemtrace -tool histogram -- ~/test/pi_estimator

Estimation of pi is 3.142425985001098

---- <application exited with code 0> ----

Cache line histogram tool results:

icache: 1134 unique cache lines

dcache: 3062 unique cache lines

icache top 10

0x7facdd013780: 30929

0x7facdb789fc0: 27664

0x7facdb78a000: 18629

0x7facdd003e80: 18176

0x7facdd003500: 11121

0x7facdd0034c0: 9763

0x7facdd005940: 8865

0x7facdd003480: 8277

0x7facdb789f80: 6660

0x7facdd003540: 5888

dcache top 10

0x7ffcc35e7d80: 4088

0x7ffcc35e7d40: 3497

0x7ffcc35e7e00: 3478

0x7ffcc35e7f40: 2919

0x7ffcc35e7dc0: 2837

0x7facdbe2e980: 2452

0x7facdbe2ec80: 2273

0x7ffcc35e7e80: 2194

0x7facdb6625c0: 2016

0x7ffcc35e7e40: 1997

Invariant Checker

The invariant_checker tool performs sanity checks on a trace, focusing on program counter continuity and guarantees around kernel control transfer interruptions. It optionally checks for restricted behavior that technically is legal but is not expected to happen in the target trace, helping to identify tracing problems and suitability for use of a trace for core simulation (see Core Simulation Support).

System Call Mix

The system call mix tool counts the frequency of system calls in a trace. It works for both online and offline traces. It uses the dynamorio::drmemtrace::TRACE_MARKER_TYPE_SYSCALL markers that store the system call number. This works only with traces that have these markers; that is, offline traces must have dynamorio::drmemtrace::OFFLINE_FILE_TYPE_SYSCALL_NUMBERS in their file type.

$ bin64/drrun -t drmemtrace -indir drmemtrace.ls.*.dir -tool syscall_mix

Syscall mix tool results:

count : syscall_num

17 : 9

16 : 1

8 : 3

7 : 262

6 : 257

5 : 0

5 : 10

3 : 12

2 : 217

2 : 16

2 : 17

2 : 137

2 : 21

1 : 158

1 : 302

1 : 334

1 : 218

1 : 231

1 : 318

1 : 11

1 : 273

Record Filter

The record filter tool modifies a target trace. It contains several varieties of filters which selectively remove records from the tool. The filters currently provided include:

  • Removing records of types specified by the -filter_trace_types option. The types are identified by their dynamorio::drmemtrace::trace_type_t enum numeric value.
  • Remove marker records of marker types specified by the -filter_marker_types option. The types are identified by their dynamorio::drmemtrace::trace_marker_type_t enum numeric value.
  • Running a simple data cache filter and removing hits. The cache is enbabled and its size specified by the -filter_cache_size option.
  • Trimming the start (via -trim_before_timestamp) and end (via -trim_after_timestamp) of a trace. Any now-empty shards are deleted entirely.

A filter can be applied only to the start of a trace using the -filter_stop_timestamp option.

Example of removing function markers:

$ bin64/drrun -t drmemtrace -indir mytracedir -tool basic_counts

...

9009 total function id markers

5006 total function return address markers

6007 total function argument markers

4003 total function return value markers

...

$ bin64/drrun -t drmemtrace -tool record_filter -filter_marker_types 4,5,6,7 -indir mytracedir -outdir newdir

Output 1280800 entries from 1304825 entries.

$ bin64/drrun -t drmemtrace -indir newdir -tool basic_counts

...

0 total function id markers

0 total function return address markers

0 total function argument markers

0 total function return value markers

...