GTPin
GTReplay

GEN emulator that replays traces generated by GTPin

========================================================================================

Introduction

========================================================================================

GTReplay is a GEN emulator that replays special traces called gLITs (Long Instruction Trace for GEN). A gLIT is generated on a GEN device, and GTReplay replays it on the CPU side. The picture below shows a schematic flow. The original GEN binary kernel is instrumented by GTPin to generate a gLIT while running on the GEN device. The gLIT contains all the information required to replay the trace within the emulator. It is a multi-threaded trace: it contains a record of each dispatch of the kernel to any of the GPU hardware threads. Once a gLIT is created, one can feed it into GTReplay and replay it on the CPU. The replay process is deterministic: the order of kernel dispatches to specific Execution Units and specific hardware threads, their IDs, and the order of memory accesses are preserved. The trace is generated once and can be replayed multiple times.

One can develop kernel profiling and analysis tools on top of GTReplay. The analysis tools communicate with GTReplay via the so-called Tool API. The Tool API allows traversing the kernel binary, inspecting GEN instructions, registering callbacks before and/or after any kernel instruction, and observing the current state of each hardware thread. Profiling and analysis tools can be written in C/C++. Any number of analysis tools can run concurrently on the same kernel replay.

[Figure: GTReplay-scheme.jpg, schematic flow of gLIT generation on the GEN device and replay on the CPU]


GTReplay is available, along with a set of sample analysis tools. It enables users to develop their own analysis tools.

Tutorial sections:

Reference sections:

========================================================================================

Capabilities

========================================================================================

Supported operating systems

GTReplay supports the following operating systems:

GTReplay is provided in both 32-bit and 64-bit forms. GTReplay is cross-platform: one can generate gLIT traces on Linux and replay them on Windows, and vice versa.

Supported GEN hardware

GTReplay supports the following HW platforms:

Intel Integrated Graphics:

Supported capabilities

What is GTReplay good for?

With GTReplay a user can easily develop profiling and analysis tools of any complexity. The tools are written in a high-level language (C/C++). Any instruction of the kernel can be instrumented and observed, and the register state of any HW thread is available at any point.

With GTReplay one can develop memory and cache models, model the traffic via different HW samplers and fixed functions, etc.

A user can debug kernels by using either the built-in printing capabilities or by developing their own debugger.

GTReplay can capture any data available at the EU scope while executing a program. It can capture such data at the lowest granularity possible: the single EU assembly instruction. You can create an unlimited variety of analysis tools using the GTReplay technology.

More details on the existing tools can be found in GTReplay Sample Tools.

Profiling data granularity

gLIT traces are collected separately, for:

A user can limit GTPin profiling to specific kernels and shaders, and to specific Enqueue/Draw commands.

Profiling scope

A user can limit gLIT generation to specific Thread Group IDs (see GTPin: Defining the Profiled Thread Group IDs).

========================================================================================

Installation

========================================================================================
GTReplay is part of the GTPin release package. It is located within the Profilers\GTReplay folder.

What's included within the package

GTReplay package has the following directory structure:

GTReplay
|--common
|--examples
|--ia32
|--intel64
|--utils

========================================================================================

How To Create gLIT and Run GTReplay

========================================================================================

Creating gLIT traces

To create gLITs, run GTPin with the Gentrace tool. The Gentrace DLL is located within the ia32 or intel64 sub-folder. As with the general GTPin tracing tools, Gentrace should be run in two phases: pre-processing and trace-gathering.

To run the pre-processing phase of the gentrace tool (in its default configuration) use the following command:

Profilers/Bin/gtpin -t Profilers\GTReplay\intel64\gentrace.dll --phase 1 -- app

NOTE: You can run this phase only once per application.

To run the trace-gathering phase of the gentrace tool (in its default configuration), use the following command:

Profilers/Bin/gtpin -t Profilers\GTReplay\intel64\gentrace.dll --phase 2 -- app
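
The two phases can also be chained from a small driver script. The following is a minimal Python sketch and is not part of the GTPin package: the function names gentrace_commands and run_gentrace are hypothetical, the default paths are copied from the commands above, and app_args stands for the command line of the profiled application.

```python
import subprocess

# Paths copied verbatim from the commands above; adjust them to your installation.
GTPIN = "Profilers/Bin/gtpin"
GENTRACE = r"Profilers\GTReplay\intel64\gentrace.dll"


def gentrace_commands(app_args, gtpin=GTPIN, tool=GENTRACE):
    """Return the phase 1 and phase 2 command lines as argument lists."""
    return [
        [gtpin, "-t", tool, "--phase", str(phase), "--", *app_args]
        for phase in (1, 2)
    ]


def run_gentrace(app_args):
    """Run pre-processing, then trace gathering, stopping on the first failure."""
    for cmd in gentrace_commands(app_args):
        subprocess.run(cmd, check=True)
```

Calling run_gentrace(["app"]) would launch both phases in order. To launch only the trace-gathering phase, take the second entry of gentrace_commands(["app"]) and pass it to subprocess.run.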

How to understand Gentrace results

When you run the Gentrace tool in its default configuration for pre-processing (phase 1), the tool generates a directory called GTPIN_PROFILE_GENTRACE0. In addition, the tool creates the following two files in the current directory:

This file is an input to the trace gathering phase. It has the following format:

BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1 4456448

where, for each kernel, the maximum number of required trace records is provided.
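
A file in this format is easy to inspect from a script. The sketch below is illustrative only: it assumes one kernel per line with the name and the record count separated by whitespace, as in the sample above, and the function name parse_record_limits is hypothetical. It takes the file contents as a string, so no file name is assumed.

```python
def parse_record_limits(text):
    """Map each kernel name to its maximum number of required trace records."""
    limits = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        name, count = fields
        limits[name] = int(count)
    return limits


# The sample line shown above.
sample = "BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1 4456448\n"
limits = parse_record_limits(sample)
```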

This file contains informational data only, and has the following format:

BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1     262144  OpenCL 0  0
BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1     131072  OpenCL 0  1
BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1     262144  OpenCL 0  2
BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1     131072  OpenCL 0  3
BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1     131072  OpenCL 0  4
BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1     262144  OpenCL 0  5
BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1     131072  OpenCL 0  6
BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1     131072  OpenCL 0  7
BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1     131072  OpenCL 0  8
BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1     262144  OpenCL 0  9
BitonicSort___CS_asmf641279bbb4bc39f_simd32_7a6d7d0b064fb7e1     131072  OpenCL 0  10

where each line corresponds to a single kernel for a single Draw/Enqueue command. The fields have the following meaning (from left to right):

When the Gentrace tool is run for the trace-gathering phase (phase 2), it generates the directory GTPIN_PROFILE_GENTRACE1. GTPin saves the profiling results in the folder GTPIN_PROFILE_GENTRACE1\Session_Final. The traces for each kernel are saved in a separate sub-folder that has the same name as the kernel. Each Draw/Enqueue command has a separate trace, which is saved in a corresponding sub-directory. The trace is provided in a compressed binary format.
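
The resulting layout (one sub-folder per kernel, one sub-directory per Draw/Enqueue command) can be enumerated with a few lines of Python. This is a sketch under the assumption that the Session_Final folder contains only the structure described above. The function name list_traces is hypothetical, and nothing is assumed about how the per-command sub-directories are named.

```python
from pathlib import Path


def list_traces(session_dir):
    """Map each kernel sub-folder name to the sorted names of its trace sub-directories."""
    session = Path(session_dir)
    traces = {}
    for kernel_dir in sorted(p for p in session.iterdir() if p.is_dir()):
        traces[kernel_dir.name] = sorted(
            d.name for d in kernel_dir.iterdir() if d.is_dir()
        )
    return traces
```

For example, list_traces(Path("GTPIN_PROFILE_GENTRACE1") / "Session_Final") would return a dictionary keyed by kernel name, which can then be used to pick the trace location passed to GTReplay.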

Running GTReplay

To run GTReplay, use the following command line:

Profilers\GTReplay\intel64\gtreplay.exe [-t tool1] [-t tool2] [-t tool3] [GTReplay arguments] -- path-to-the-location-of-the-trace

The arguments and parameters that can be provided to GTReplay are listed in GTReplay Configuration.
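
When several tools are attached to the same replay, it can be convenient to assemble the command line programmatically. The following Python sketch mirrors the command-line shape shown above: one -t option per tool, then any GTReplay arguments, then -- and the trace location. The function name gtreplay_command and the tool names in the usage note are hypothetical, and the executable path is copied from the command above.

```python
# Path copied verbatim from the command above; adjust it to your installation.
GTREPLAY = r"Profilers\GTReplay\intel64\gtreplay.exe"


def gtreplay_command(trace_path, tools=(), extra_args=(), exe=GTREPLAY):
    """Build the GTReplay argument list: tools, then other arguments, then the trace."""
    cmd = [exe]
    for tool in tools:
        cmd += ["-t", tool]  # one -t option per analysis tool
    cmd += list(extra_args)  # GTReplay arguments, passed through unchanged
    cmd += ["--", trace_path]
    return cmd
```

For instance, gtreplay_command("path-to-the-location-of-the-trace", tools=["tool1", "tool2"]) yields a list that can be passed directly to subprocess.run.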

========================================================================================

GTReplay Configuration

========================================================================================

GTReplay supports several configuration parameters. The most useful parameters are:


========================================================================================

GTReplay Sample Tools

========================================================================================
The list of the ready-to-use sample tools:




  Copyright (C) 2013-2025 Intel Corporation
SPDX-License-Identifier: MIT