TCF Agent Porting Guide

Copyright (c) 2009-2019 Wind River Systems, Inc. and others. Made available under the EPL 2.0
Agent portion made available under your choice of EPL 2.0 or EDL v1.0 dual-license.

Direct comments, questions to the tcf-dev@eclipse.org mailing list

Introduction

TCF Agent is a lightweight reference implementation of the TCF protocol that supports basic debugging and other TCF services. It is written in C and can be used for remote debugging of software written for Linux, Windows XP or VxWorks. See TCF Getting Started for instructions on how to get the source code and build the agent.

Customizing and Porting TCF Agent

It is important to understand the concurrency model used by the agent code before making any changes. Most of the agent code is event driven: it has a main loop that retrieves events from an event queue and executes them sequentially by calling event handlers on a single thread. A single-threaded, event-driven design provides a good level of concurrency (equivalent to cooperative multithreading) while greatly reducing the need for synchronization: each event dispatch cycle can be viewed as a fully synchronized atomic operation.
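
The dispatch cycle described above can be illustrated with a minimal, self-contained sketch. All names here (ExampleEvent, post_example_event, run_example_loop) are hypothetical and are not the agent's actual API; the sketch only shows the general shape of a single-threaded event queue.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical event record: a handler plus its argument. */
typedef void ExampleHandler(void * arg);

typedef struct ExampleEvent {
    ExampleHandler * handler;
    void * arg;
    struct ExampleEvent * next;
} ExampleEvent;

static ExampleEvent * queue_head = NULL;
static ExampleEvent * queue_tail = NULL;

/* Append an event to the queue. In this sketch it is only ever called
 * from the loop thread, so no locking is needed. */
void post_example_event(ExampleHandler * handler, void * arg) {
    ExampleEvent * ev = (ExampleEvent *)malloc(sizeof(ExampleEvent));
    assert(ev != NULL);
    ev->handler = handler;
    ev->arg = arg;
    ev->next = NULL;
    if (queue_tail != NULL) queue_tail->next = ev;
    else queue_head = ev;
    queue_tail = ev;
}

/* Run handlers one at a time until the queue is empty.
 * Returns the number of events dispatched. */
int run_example_loop(void) {
    int dispatched = 0;
    while (queue_head != NULL) {
        ExampleEvent * ev = queue_head;
        queue_head = ev->next;
        if (queue_head == NULL) queue_tail = NULL;
        ev->handler(ev->arg);   /* the handler may post further events */
        free(ev);
        dispatched++;
    }
    return dispatched;
}

/* Example handler: does one small step, then re-posts itself
 * instead of looping, so other events can run in between. */
void countdown(void * arg) {
    int * n = (int *)arg;
    if (--*n > 0) post_example_event(countdown, n);
}
```

Posting countdown with a counter and then calling run_example_loop() runs the handler once per remaining count. Because only one handler runs at a time, none of the queue operations need a lock.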

Event-driven code should avoid long-running or potentially blocking operations in event handlers, since they can stop all event processing for an indefinite time. Such operations should use asynchronous APIs (like POSIX Asynchronous I/O) or should be performed by background threads. Treat background threads with extreme caution: agent data structures are not intended for multithreaded access. The scope of a background thread should be limited to a single module, and the thread should not call agent public APIs. Instead, it should communicate with the rest of the code by posting events.
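
As an illustration of this rule, the following sketch runs a slow operation on a background thread and reports completion by posting an event to the main thread. It assumes POSIX Threads are available; every name in it (post_example_event, dispatch_one_event, SlowResult and so on) is hypothetical and not part of the agent API. The event queue is the only data shared between the two threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef void ExampleHandler(void * arg);

typedef struct ExampleEvent {
    ExampleHandler * handler;
    void * arg;
    struct ExampleEvent * next;
} ExampleEvent;

/* The queue is the only data shared between threads. */
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t queue_cond = PTHREAD_COND_INITIALIZER;
static ExampleEvent * queue_head = NULL;
static ExampleEvent * queue_tail = NULL;

/* Thread-safe post: may be called from any thread. */
void post_example_event(ExampleHandler * handler, void * arg) {
    ExampleEvent * ev = (ExampleEvent *)malloc(sizeof(ExampleEvent));
    assert(ev != NULL);
    ev->handler = handler;
    ev->arg = arg;
    ev->next = NULL;
    pthread_mutex_lock(&queue_lock);
    if (queue_tail != NULL) queue_tail->next = ev;
    else queue_head = ev;
    queue_tail = ev;
    pthread_cond_signal(&queue_cond);
    pthread_mutex_unlock(&queue_lock);
}

/* Wait for one event, then run its handler on the calling thread. */
void dispatch_one_event(void) {
    ExampleEvent * ev;
    pthread_mutex_lock(&queue_lock);
    while (queue_head == NULL) pthread_cond_wait(&queue_cond, &queue_lock);
    ev = queue_head;
    queue_head = ev->next;
    if (queue_head == NULL) queue_tail = NULL;
    pthread_mutex_unlock(&queue_lock);
    ev->handler(ev->arg);
    free(ev);
}

/* Result record, private to the worker until it is posted. */
typedef struct SlowResult { long sum; int done; } SlowResult;

/* Runs on the event thread: the only place allowed to touch shared state. */
void slow_work_done(void * arg) {
    ((SlowResult *)arg)->done = 1;
}

/* Runs on the background thread: performs the long operation on private
 * data and reports completion by posting an event. */
void * slow_worker(void * arg) {
    SlowResult * res = (SlowResult *)arg;
    long i;
    for (i = 1; i <= 1000; i++) res->sum += i;
    post_example_event(slow_work_done, res);
    return NULL;
}

/* Start the worker, then handle its completion event on this thread. */
int run_slow_operation(SlowResult * res) {
    pthread_t thread;
    res->sum = 0;
    res->done = 0;
    if (pthread_create(&thread, NULL, slow_worker, res) != 0) return -1;
    dispatch_one_event();
    pthread_join(thread, NULL);
    return 0;
}
```

The worker never calls into the rest of the program directly. The completion handler runs on the event thread, so it can update shared data structures without further locking.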

An event is essentially a function pointer (a call-back) that points to an event handler, plus a data pointer. Call-backs are also used throughout the agent code to subscribe listeners for various state change notifications. Using events and call-backs as a design principle is also known as inversion of control. Note that, in general, inversion of control is not compatible with the traditional multithreaded programming model that uses mutexes to protect shared data from race conditions.
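
Listener subscription through call-backs might look like the following sketch. The listener type, the fixed-size table and the function names are assumptions made for illustration; the agent's own listener interfaces differ from module to module.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical listener type: a call-back plus a client data pointer. */
typedef void ExampleStateListener(int new_state, void * client_data);

#define MAX_EXAMPLE_LISTENERS 8

static struct {
    ExampleStateListener * func;
    void * client_data;
} listeners[MAX_EXAMPLE_LISTENERS];
static int listener_cnt = 0;

/* Subscribe a listener. Returns 0 on success, -1 if the table is full. */
int add_example_listener(ExampleStateListener * func, void * client_data) {
    assert(func != NULL);
    if (listener_cnt >= MAX_EXAMPLE_LISTENERS) return -1;
    listeners[listener_cnt].func = func;
    listeners[listener_cnt].client_data = client_data;
    listener_cnt++;
    return 0;
}

/* The module that owns the state calls every subscribed listener.
 * Control is inverted: clients do not poll, they get called. */
void notify_state_changed(int new_state) {
    int i;
    for (i = 0; i < listener_cnt; i++) {
        listeners[i].func(new_state, listeners[i].client_data);
    }
}

/* Example listener: remembers the last state it was told about. */
void remember_state(int new_state, void * client_data) {
    *(int *)client_data = new_state;
}
```

A client would call add_example_listener(remember_state, &seen) once and is then called back on every notify_state_changed(), always on the event thread.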

Most of the TCF agent configuration is done at compile time. Conditional compilation statements in the source code assume that both the agent and the inferior code will run on the same OS platform and the same CPU type that were used to build the agent. Building an agent that can run on one machine while controlling execution of code on another machine might be possible, but is not fully supported at this time.

The header file tcf/config.h contains macro definitions that control agent configuration. All C files in the agent code include tcf/config.h before other header files. Individual services or features can be enabled or disabled by changing definitions in the file. Macro values can also be overridden by using the -D option on the C compiler command line. The agent Makefile contains additional logic that makes it even more convenient to build different agent configurations.

It should be much easier to port the agent if you don't need all TCF services. For example, for RSE integration you only need File System, System Monitor and Processes services, so you can disable all other services by editing tcf/config.h.
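
The usual way to make a configuration macro overridable from the compiler command line is to define a default only when the macro is not already defined. The sketch below applies that pattern to an RSE-style configuration. The macro names are assumptions chosen to resemble service names; check tcf/config.h for the names actually used.

```c
#include <assert.h>
#include <string.h>

/* Each setting gets a default only if the build did not define it already,
 * for example with -DSERVICE_Breakpoints=1 on the compiler command line.
 * The macro names are illustrative assumptions. */
#if !defined(SERVICE_FileSystem)
#  define SERVICE_FileSystem  1
#endif
#if !defined(SERVICE_SysMonitor)
#  define SERVICE_SysMonitor  1
#endif
#if !defined(SERVICE_Processes)
#  define SERVICE_Processes   1
#endif
#if !defined(SERVICE_Breakpoints)
#  define SERVICE_Breakpoints 0
#endif

/* Code that belongs to a disabled service is compiled out entirely. */
#if SERVICE_Breakpoints
void ini_example_breakpoints(void) { /* service start-up would go here */ }
#endif

/* Report whether a service was compiled in.
 * Unknown names count as disabled. */
int example_service_enabled(const char * name) {
    assert(name != NULL);
    if (strcmp(name, "FileSystem") == 0) return SERVICE_FileSystem;
    if (strcmp(name, "SysMonitor") == 0) return SERVICE_SysMonitor;
    if (strcmp(name, "Processes") == 0) return SERVICE_Processes;
    if (strcmp(name, "Breakpoints") == 0) return SERVICE_Breakpoints;
    return 0;
}
```

With no -D options, this builds the three services needed for RSE and leaves the breakpoint code out of the binary.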

It is better to create a separate directory with alternative versions of tcf/config.h, tcf/framework/context.h, tcf/framework/context.c, Makefile, etc., instead of editing the original files. The idea is that the Makefile will search that directory first, and if a file is not found there, it will search the original agent sources. See org.eclipse.tcf.agent/examples/daytime/readme.txt for an example of a custom TCF agent. Another example is the tcf server (also known as value-add), which builds the agent C code in a different configuration for running on the host.

Of course, if changes are generic enough to be useful for other ports, it is better to change the code in the main directory. Please consider contributing your source code changes back to eclipse.org.

Porting TCF Agent to a New OS Platform

In order to improve portability, instead of using non-portable native OS calls, agent code uses POSIX APIs whenever possible. When a POSIX API is not available for a particular platform and can be easily emulated, the emulation is done in the mdep.h/mdep.c files. For example, mdep.h/mdep.c contain an emulation of POSIX Threads for Win32, since the API is not available with the Microsoft C compiler. API emulation does not have to be complete; it only needs to implement the functionality that is used by the agent.
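
A partial emulation in this spirit could look like the sketch below. It is not the code from mdep.h/mdep.c: it assumes that mapping a POSIX mutex onto a Win32 critical section is good enough for the caller, and it emulates only the four calls the example needs. The ExampleCounter type and its functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#if defined(_MSC_VER)
/* No <pthread.h> with the Microsoft compiler: emulate the small subset
 * used below on top of Win32 critical sections. */
#include <windows.h>
typedef CRITICAL_SECTION pthread_mutex_t;
typedef void pthread_mutexattr_t;   /* attributes are not supported */

static int pthread_mutex_init(pthread_mutex_t * m, const pthread_mutexattr_t * attr) {
    (void)attr;
    InitializeCriticalSection(m);
    return 0;
}
static int pthread_mutex_lock(pthread_mutex_t * m) { EnterCriticalSection(m); return 0; }
static int pthread_mutex_unlock(pthread_mutex_t * m) { LeaveCriticalSection(m); return 0; }
static int pthread_mutex_destroy(pthread_mutex_t * m) { DeleteCriticalSection(m); return 0; }
#else
#include <pthread.h>
#endif

/* Portable code from here on is written against the POSIX names only. */
typedef struct ExampleCounter {
    pthread_mutex_t lock;
    int value;
} ExampleCounter;

int example_counter_init(ExampleCounter * c) {
    assert(c != NULL);
    c->value = 0;
    return pthread_mutex_init(&c->lock, NULL);
}

/* Add delta under the lock and return the new value. */
int example_counter_add(ExampleCounter * c, int delta) {
    int result;
    pthread_mutex_lock(&c->lock);
    c->value += delta;
    result = c->value;
    pthread_mutex_unlock(&c->lock);
    return result;
}

int example_counter_destroy(ExampleCounter * c) {
    return pthread_mutex_destroy(&c->lock);
}
```

The counter code compiles unchanged on both kinds of platform, and all platform knowledge stays in the block at the top.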

When it is not possible or not feasible to use portable POSIX APIs, the agent code contains conditional compilation statements that use well known macros like WIN32, __CYGWIN__, __MINGW32__, etc. Such places might require editing when porting to a new OS. Global system dependencies are encapsulated in the org.eclipse.tcf.agent/agent/system subdirectories, and pulled in conditionally by the build system. Beyond that, individual services may need to be enabled or disabled based on the OS. For example, see the conditionals in agent/tcf/services/sysmon.c, which implements the System Monitor service. Since many services are optional for an initial port of the TCF Agent, it is often sufficient to look only at a subset of services.
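
A typical conditional of this kind is sketched below. It uses only the compiler-defined macros named above; the EXAMPLE_ macros and both functions are hypothetical. A port to a new OS would add its own branch before the fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#if defined(__CYGWIN__)
#  define EXAMPLE_OS_NAME "Cygwin"
#  define EXAMPLE_DIR_SEP '/'
#elif defined(_WIN32) || defined(WIN32) || defined(__MINGW32__)
#  define EXAMPLE_OS_NAME "Windows"
#  define EXAMPLE_DIR_SEP '\\'
#else
/* A port to a new OS would add its own branch above this fallback. */
#  define EXAMPLE_OS_NAME "POSIX-like"
#  define EXAMPLE_DIR_SEP '/'
#endif

/* Name of the OS family this file was compiled for. */
const char * example_os_name(void) {
    return EXAMPLE_OS_NAME;
}

/* Join a directory and a file name using the target's path separator.
 * Returns the length of the full path, as snprintf does. */
int example_join_path(char * buf, size_t size, const char * dir, const char * name) {
    assert(buf != NULL && dir != NULL && name != NULL);
    return snprintf(buf, size, "%s%c%s", dir, EXAMPLE_DIR_SEP, name);
}
```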

Porting TCF Agent to a New CPU Type

Most of the CPU dependencies are encapsulated in the org.eclipse.tcf.agent/agent/machine subdirectories, and pulled in conditionally by the build system. Beyond that, searching TCF agent source code for __i386__ is a good way to find all places where the source code depends on CPU type.

There are several files in the code that might need changes in order to support a new CPU type:

tcf/framework/context.c
The module provides low-level debugger functionality: attach/detach, suspend/resume, single step, memory read/write. It uses OS debug APIs to do its job. Most of the code does not depend on the CPU type. However, single stepping is not always directly supported by the OS, and its implementation needs to be reviewed and changed to support a new CPU type.
tcf/services/dwarfexpr.c
The module implements evaluation of DWARF expressions. The module is used only if the agent is built to support the ELF executable file format and the DWARF debugging data format. There is no need to change the module if ELF or DWARF support is not required. DWARF expressions can have references to CPU registers. Register access code needs to be changed to support a new CPU type. Note that different compilers can use different numbers to identify the same registers of the same CPU.
tcf/services/registers.c
The module implements the Registers service. The code has a static variable "regs_index" that contains a table of CPU registers. The table holds names, offsets and sizes of CPU registers. Offset and size define the location of register data in the REG_SET structure, which represents a snapshot of the register values of an execution context. The definition of the variable needs to be changed to support a new CPU type.
tcf/services/stacktrace.c
The module implements the Stack Trace service. The module contains the "trace_stack" function, which creates a stack trace by walking the stack of an execution context. The stack trace data format is defined by two struct declarations: StackTrace and StackFrame. The data structure is generic, but the code that creates it is CPU dependent. An alternative version of the "trace_stack" function needs to be provided to support a new CPU type.
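
To make the register table idea concrete, here is a minimal sketch of a name/offset/size table over a register snapshot. The CPU is imaginary, and ExampleRegSet, ExampleRegDef and example_regs_index are stand-ins invented for illustration. They are not the agent's REG_SET or regs_index definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical register snapshot for an imaginary 32-bit CPU. */
typedef struct ExampleRegSet {
    unsigned int r0, r1, r2, r3;
    unsigned int sp;
    unsigned int pc;
} ExampleRegSet;

/* One table entry: where a named register lives inside the snapshot. */
typedef struct ExampleRegDef {
    const char * name;
    size_t offset;
    size_t size;
} ExampleRegDef;

#define EXAMPLE_REG(field) \
    { #field, offsetof(ExampleRegSet, field), sizeof(((ExampleRegSet *)0)->field) }

/* This table is what a port to a new CPU type would replace. */
static const ExampleRegDef example_regs_index[] = {
    EXAMPLE_REG(r0), EXAMPLE_REG(r1), EXAMPLE_REG(r2), EXAMPLE_REG(r3),
    EXAMPLE_REG(sp), EXAMPLE_REG(pc),
    { NULL, 0, 0 }
};

/* Look up a register definition by name. Returns NULL if not found. */
const ExampleRegDef * find_example_reg(const char * name) {
    const ExampleRegDef * def;
    assert(name != NULL);
    for (def = example_regs_index; def->name != NULL; def++) {
        if (strcmp(def->name, name) == 0) return def;
    }
    return NULL;
}

/* Copy a register value out of a snapshot.
 * Returns 0 on success, -1 if the name is unknown or buf is too small. */
int read_example_reg(const ExampleRegSet * regs, const char * name,
                     void * buf, size_t buf_size) {
    const ExampleRegDef * def = find_example_reg(name);
    if (def == NULL || def->size > buf_size) return -1;
    memcpy(buf, (const char *)regs + def->offset, def->size);
    return 0;
}
```

Because the generic code only uses offsets and sizes from the table, supporting another CPU in this sketch means changing the snapshot structure and the table, not the lookup or read functions.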

Adding Support For a New Executable File Format

For source-level debugging, the TCF agent needs to understand the executable file format. Source-level debugging is supported by providing two services: Symbols and Line Numbers. The services are optional, and if they are disabled, no support for an executable file format is needed. At this time the agent supports ELF (Executable and Linking Format) and PE (Portable Executable) files. ELF is a very popular format in Unix-like and embedded systems, and PE is used in Windows operating systems.
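
The two formats can be told apart by their leading bytes: an ELF file starts with 0x7F 'E' 'L' 'F', and a PE file starts with a DOS header ("MZ") whose 32-bit little-endian field at offset 0x3C points to the "PE\0\0" signature. The sketch below checks for both. It only illustrates the file layouts, it is not how the agent selects a symbol reader, and the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    EXAMPLE_FMT_UNKNOWN,
    EXAMPLE_FMT_ELF,
    EXAMPLE_FMT_PE
} ExampleFileFormat;

/* Guess the executable format from the first bytes of a file image. */
ExampleFileFormat detect_example_format(const unsigned char * buf, size_t size) {
    assert(buf != NULL);
    /* ELF: magic bytes 0x7F 'E' 'L' 'F' at offset 0. */
    if (size >= 4 && memcmp(buf, "\177ELF", 4) == 0) return EXAMPLE_FMT_ELF;
    /* PE: DOS header "MZ", then the signature offset stored at 0x3C. */
    if (size >= 0x40 && buf[0] == 'M' && buf[1] == 'Z') {
        size_t pe_offs = (size_t)buf[0x3c] | ((size_t)buf[0x3d] << 8) |
                         ((size_t)buf[0x3e] << 16) | ((size_t)buf[0x3f] << 24);
        if (pe_offs <= size - 4 && memcmp(buf + pe_offs, "PE\0\0", 4) == 0) {
            return EXAMPLE_FMT_PE;
        }
    }
    return EXAMPLE_FMT_UNKNOWN;
}
```

Support for a new format would add a further check of the same kind, plus the format-specific symbol and line number readers.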

ELF support in the agent was developed from scratch, has no external dependencies, and is available in source form as part of the agent source code. The code might require changes to support a particular flavor of ELF. Probably the trickiest part of the code is the interface to the system loader. The agent needs to know when an ELF file is loaded into or removed from target memory so it can update symbol tables and breakpoints. For that it plants an internal (not visible to clients) breakpoint (aka eventpoint) inside the system loader code. The breakpoint allows the agent to intercept control every time an ELF file is loaded or unloaded.

PE support in the agent is implemented by using DBGHELP.DLL. This DLL is included in the Windows operating system. However, older versions of the DLL might not provide all necessary functionality. To obtain the latest version of DBGHELP.DLL, go to http://www.microsoft.com/whdc/devtools/debugging/default.mspx and download Debugging Tools for Windows.

Support for a completely new file format would require developing alternative versions of the symbols_xxx.c and linenumbers_xxx.c files. See tcf/services/symbols_elf.c and tcf/services/linenumbers_elf.c for example implementations of the services.

Adding Support For a New Debug Data Format

For source-level debugging, the TCF agent needs to understand the debug data format. Debug data usually resides in a section of an executable file, so the file format must be supported as well; see Adding Support For a New Executable File Format. At this time the agent supports DWARF and PE (Portable Executable) debug data formats. DWARF support is implemented as part of the agent source code, and PE data is accessed using DBGHELP.DLL, which is included in the Windows operating system.

Adding Support For a New Communication Transport

The current agent code uses TCP/IP as the transport protocol to open communication channels. The agent code can be easily modified to support other transports, like UDP, USB, etc.

Files tcf/framework/channel_tcp.h and tcf/framework/channel_tcp.c provide support for TCP/IP transport. To support another protocol one would need to develop similar code using TCP support as an example.

Adding a new transport would also require modifying the functions channel_server() and channel_connect() in tcf/framework/channel.c.
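
One common way to structure such a dispatch is a table keyed by transport name, sketched below. This is not the code in tcf/framework/channel.c: the descriptor type, the function names and the stubbed connect functions (which just return a placeholder number instead of opening anything) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical transport descriptor: a name plus its connect function. */
typedef struct ExampleTransport {
    const char * name;
    int (*connect_func)(const char * address);
} ExampleTransport;

/* Stubs standing in for real transport code. They only return a
 * placeholder channel number so that the dispatch can be observed. */
int example_tcp_connect(const char * address) { (void)address; return 1; }
int example_serial_connect(const char * address) { (void)address; return 2; }

static const ExampleTransport example_transports[] = {
    { "TCP", example_tcp_connect },
    { "Serial", example_serial_connect }   /* the newly added transport */
};

/* Select a transport by name and delegate to it.
 * Returns -1 if the transport is not supported. */
int example_channel_connect(const char * transport, const char * address) {
    size_t i;
    assert(transport != NULL);
    for (i = 0; i < sizeof(example_transports) / sizeof(example_transports[0]); i++) {
        if (strcmp(example_transports[i].name, transport) == 0) {
            return example_transports[i].connect_func(address);
        }
    }
    return -1;
}
```

In this arrangement, adding a transport means writing its connect function and adding one table entry, while callers keep using the same entry point.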