DWARF tree for fully-qualified name construction

The Windows debuggers expect PDB symbol names to be fully qualified.
I.e., if a class Foo has a constructor, its name should be emitted as
`Foo::Foo`, not simply `Foo` as is the case today. Linux debuggers like
GDB dynamically reconstruct the symbol tree at runtime each time a
program is debugged. Windows debuggers, on the other hand, do not; they
expect the name to be fully qualified from the outset. Failing this, the
constructor function `Foo` would have the same name as the class `Foo`
in the PDB, and WinDbg will get confused about what to dump (e.g. using
`dt Foo`) and arbitrarily pick the largest item, which might be the
constructor. Therefore you end up dumping the wrong thing and being
completely unable to inspect the contents of a `Foo` object.

This commit aims to fix that by introducing a DWARF tree during the
conversion process which allows us to efficiently reconstruct such fully
qualified names during the conversion.

A note about DWARF: the DWARF format does not explicitly record the
parent of any given DIE record. It is instead implicit in how the
records are laid out. Any record may have a "has children" flag, and if
it does, then the records following it are its children, terminated by a
special NULL record, popping back up one level of the tree.
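
To illustrate the layout described above, here is a minimal sketch of
recovering parent links from a flat record stream. `FlatDie` and
`computeParents` are hypothetical names used only for illustration; they
are not the DIECursor or DWARF_InfoData code touched by this commit, and
real DIEs carry abbreviation codes and attributes that are omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flattened DIE record (illustration only).
struct FlatDie {
    std::string name;  // empty for NULL terminator records
    bool hasChildren;  // the "has children" flag
    bool isNull;       // true for the NULL record that ends a child list
};

constexpr int kNoParent = -1;

// Returns, for each record, the index of its parent. Top-level records
// and NULL terminators get kNoParent.
std::vector<int> computeParents(const std::vector<FlatDie>& dies) {
    std::vector<int> parents(dies.size(), kNoParent);
    std::vector<int> open;  // records whose child lists are still open
    for (std::size_t i = 0; i < dies.size(); ++i) {
        if (dies[i].isNull) {
            // A NULL record closes the innermost open child list,
            // popping back up one level of the tree.
            assert(!open.empty() && "unbalanced NULL record");
            open.pop_back();
            continue;
        }
        parents[i] = open.empty() ? kNoParent : open.back();
        if (dies[i].hasChildren)
            open.push_back(static_cast<int>(i));
    }
    return parents;
}
```

For a stream such as CU, ns, Foo, Foo (constructor), NULL, NULL, main,
NULL, the constructor's parent is the class `Foo` and the parent of
`main` is the CU.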

The DIECursor already recognized this structure but did not capture it
in memory for later use.

In order to construct fully-qualified names for functions, enums,
classes, etc. (i.e. taking into account namespaces, nesting, etc), we
need a way to efficiently look up a node's parent. Thus the DWARF tree
was born.

At a high level, we take advantage of the fact that the DWARF sections
were already scanned in two passes. We hook into the first pass (where
the typeIDs were being reserved) and build the DWARF tree.

Then, in the second pass (where the CV symbols get emitted), we walk up
the tree to figure out the correct fully-qualified symbol names.
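
As a rough sketch of the second-pass lookup, assuming each node stores a
pointer to its parent: `Node`, `isScope`, and `qualifiedName` are
hypothetical names, not the commit's types. The real helper,
`formatFullyQualifiedProcName`, writes into a caller-supplied buffer;
`std::string` is used here for brevity. Skipping the compile unit and
unnamed scopes is an assumption of this sketch.

```cpp
#include <string>

// Hypothetical tree node (illustration only).
struct Node {
    std::string name;
    const Node* parent;  // nullptr for the root
    bool isScope;        // true for namespaces and classes/structs
};

// Builds "outer::inner::name" by walking parent pointers to the root.
std::string qualifiedName(const Node& node) {
    std::string result = node.name;
    for (const Node* p = node.parent; p != nullptr; p = p->parent) {
        // Assumption: ancestors that are not naming scopes (such as the
        // compile unit) and unnamed scopes contribute nothing.
        if (!p->isScope || p->name.empty())
            continue;
        result = p->name + "::" + result;
    }
    return result;
}
```

With a constructor `Foo` nested in class `Foo` inside namespace `ns`,
this yields `ns::Foo::Foo`, while a free function directly under the
compile unit keeps its plain name.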

NOTE: The first phase of this work focuses on subroutines only. Later
work will enable support for structs/classes/enums.

On the subroutine front, I also added a flag to capture whether a DIE is
a "declaration" or definition (based on the DW_AT_declaration
attribute). This is needed to consolidate function decl+defn into one
PDB symbol, as otherwise WinDbg will get confused. This also matches
what the MSVC toolset produces.
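
One way the decl+defn consolidation could look is sketched below. This
is an assumption about the approach, not the code in this commit:
`ProcDie`, `effectiveName`, and `symbolsToEmit` are hypothetical. The
sketch relies on the DWARF convention that an out-of-line definition may
point back at its declaration through DW_AT_specification.

```cpp
#include <string>
#include <vector>

// Hypothetical subroutine record (illustration only).
struct ProcDie {
    std::string name;              // may be empty on an out-of-line definition
    bool isDeclaration;            // mirrors DW_AT_declaration
    const ProcDie* specification;  // mirrors DW_AT_specification, or nullptr
};

// A definition without its own name borrows it from its declaration.
const std::string& effectiveName(const ProcDie& die) {
    if (die.name.empty() && die.specification != nullptr)
        return die.specification->name;
    return die.name;
}

// Emits one name per definition; pure declarations produce no symbol.
std::vector<std::string> symbolsToEmit(const std::vector<const ProcDie*>& dies) {
    std::vector<std::string> names;
    for (const ProcDie* die : dies) {
        if (die->isDeclaration)
            continue;
        names.push_back(effectiveName(*die));
    }
    return names;
}
```

Given a declaration of `Foo`, its definition, and a free function
`main`, only two symbols come out: one for `Foo` and one for `main`.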

A few other related additions:

- Added helper to format a fully qualified function name by looking up
  the tree added in this commit.
- Added helper to print the DWARF tree for debugging purposes and a flag
  to control it.
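
A debug dump of such a tree can be as simple as an indented pre-order
walk. The sketch below is illustrative only: `TreeNode` and `dumpTree`
are hypothetical, and the output format of the real `dumpDwarfTree` is
not shown in this message.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tree node (illustration only).
struct TreeNode {
    std::string label;
    std::vector<TreeNode> children;
};

// Appends one line per node to `out`, indented two spaces per level.
void dumpTree(const TreeNode& node, int depth, std::string& out) {
    out.append(static_cast<std::size_t>(depth) * 2, ' ');
    out += node.label;
    out += '\n';
    for (const TreeNode& child : node.children)
        dumpTree(child, depth + 1, out);
}
```

Writing into a string rather than straight to stdout keeps the helper
easy to test; a caller can print the result when the debug flag is set.
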
alexbudfb committed Mar 24, 2023
1 parent 2e4c1bf commit 62f975d
Showing 6 changed files with 443 additions and 110 deletions.
10 changes: 8 additions & 2 deletions src/PEImage.h
@@ -178,11 +178,16 @@ class PEImage : public LastError

template<typename SYM> const char* t_findSectionSymbolName(int s) const;

// File handle to PE image.
int fd;

// Pointer to in-memory buffer containing loaded PE image.
void* dump_base;

// Size of `dump_base` in bytes.
int dump_total_len;

// codeview
// codeview fields
IMAGE_DOS_HEADER *dos;
IMAGE_NT_HEADERS32* hdr32;
IMAGE_NT_HEADERS64* hdr64;
@@ -200,7 +205,8 @@ class PEImage : public LastError
std::unordered_map<std::string, SymbolInfo> symbolCache;

public:
//dwarf
// dwarf fields
// List of DWARF section descriptors.
#define EXPANDSEC(name) PESection name;
SECTION_LIST()
#undef EXPANDSEC
3 changes: 3 additions & 0 deletions src/cv2pdb.cpp
@@ -897,6 +897,7 @@ void CV2PDB::checkGlobalTypeAlloc(int size, int add)
}
}

// Get the CodeView type descriptor for the given type ID.
// CV-only. Returns NULL for DWARF-based images.
const codeview_type* CV2PDB::getTypeData(int type)
{
@@ -913,6 +914,7 @@ const codeview_type* CV2PDB::getTypeData(int type)
return (codeview_type*)(typeData + offset[type - BASE_USER_TYPE]);
}

// CV-only. Never called for DWARF.
const codeview_type* CV2PDB::getUserTypeData(int type)
{
type -= BASE_USER_TYPE + globalTypeHeader->cTypes;
@@ -2116,6 +2118,7 @@ int CV2PDB::appendTypedef(int type, const char* name, bool saveTranslation)
return typedefType;
}

// CV-only.
void CV2PDB::appendTypedefs()
{
if(Dversion == 0)
21 changes: 17 additions & 4 deletions src/cv2pdb.h
@@ -169,17 +169,23 @@ class CV2PDB : public LastError
bool addDWARFLines();
bool addDWARFPublics();
bool writeDWARFImage(const TCHAR* opath);
DWARF_InfoData* findEntryByPtr(byte* entryPtr) const;

// Helper to just print the DWARF tree we've built for debugging purposes.
void dumpDwarfTree() const;

bool addDWARFSectionContrib(mspdb::Mod* mod, unsigned long pclo, unsigned long pchi);
bool addDWARFProc(DWARF_InfoData& id, const std::vector<RangeEntry> &ranges, DIECursor cursor);
void formatFullyQualifiedProcName(const DWARF_InfoData* proc, char* buf, size_t cbBuf) const;

int addDWARFStructure(DWARF_InfoData& id, DIECursor cursor);
int addDWARFFields(DWARF_InfoData& structid, DIECursor cursor, int off, int flStart);
int addDWARFArray(DWARF_InfoData& arrayid, DIECursor cursor);
int addDWARFFields(DWARF_InfoData& structid, DIECursor& cursor, int off, int flStart);
int addDWARFArray(DWARF_InfoData& arrayid, const DIECursor& cursor);
int addDWARFBasicType(const char*name, int encoding, int byte_size);
int addDWARFEnum(DWARF_InfoData& enumid, DIECursor cursor);
int getTypeByDWARFPtr(byte* ptr);
int getDWARFTypeSize(const DIECursor& parent, byte* ptr);
void getDWARFArrayBounds(DWARF_InfoData& arrayid, DIECursor cursor,
void getDWARFArrayBounds(DIECursor cursor,
int& basetype, int& lowerBound, int& upperBound);
void getDWARFSubrangeInfo(DWARF_InfoData& subrangeid, const DIECursor& parent,
int& basetype, int& lowerBound, int& upperBound);
@@ -278,7 +284,14 @@ class CV2PDB : public LastError

// DWARF
int codeSegOff;
std::unordered_map<byte*, int> mapOffsetToType;

// Lookup table for type IDs based on the DWARF_InfoData::entryPtr
std::unordered_map<byte*, int> mapEntryPtrToTypeID;
// Lookup table for entries based on the DWARF_InfoData::entryPtr
std::unordered_map<byte*, DWARF_InfoData*> mapEntryPtrToEntry;

// Head of list of DWARF DIE nodes.
DWARF_InfoData* dwarfHead = nullptr;

// Default lower bound for the current compilation unit. This depends on
// the language of the current unit.