[clang-tools-extra][ExtractAPI] create clang-symbolgraph-merger #65894
Draft
Arsenic-ATG wants to merge 2 commits into llvm:main from Arsenic-ATG:arcpatch-D158646
Create and use extractapi::RecordLocation instead of the conventional clang::PresumedLoc to track the location of an APIRecord. This reduces the dependency of APISet on SourceManager and would help anyone who wants to create an APISet from a JSON-serialized symbol graph. These changes also add extractapi::CommentLine, which is similar to RawComment::CommentLine but uses RecordLocation instead of PresumedLoc. Differential Revision: https://reviews.llvm.org/D157810
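To make the motivation concrete, here is a minimal sketch of a location type that owns plain values and therefore needs no SourceManager. `RecordLocationSketch` and `stripFileScheme` are hypothetical names for illustration only; the sole detail taken from the patch is the `(Line, Col, Filename)` constructor order and the `file://` prefix handling. The `isValid` rule is an assumption.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for extractapi::RecordLocation. It stores plain
// values, so it can be built directly from parsed JSON fields.
class RecordLocationSketch {
public:
  RecordLocationSketch() = default;
  RecordLocationSketch(unsigned Line, unsigned Col, std::string Filename)
      : Line(Line), Col(Col), Filename(std::move(Filename)) {}

  // Assumption: a location is usable once it has a line and a file name.
  bool isValid() const { return Line != 0 && !Filename.empty(); }
  unsigned getLine() const { return Line; }
  unsigned getColumn() const { return Col; }
  const std::string &getFilename() const { return Filename; }

private:
  unsigned Line = 0;
  unsigned Col = 0;
  std::string Filename;
};

// Drop a leading "file://" scheme from a symbol graph "uri" value and
// return the remaining path; other strings are returned unchanged.
inline std::string stripFileScheme(std::string URI) {
  const std::string Scheme = "file://";
  if (URI.compare(0, Scheme.size(), Scheme) == 0)
    URI.erase(0, Scheme.size());
  return URI;
}
```

A caller that has already read `uri`, `line` and `character` from a symbol's `location` object could build the value with `RecordLocationSketch(Line, Col, stripFileScheme(URI))`, with no compiler instance involved.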
Create a clang tool to merge all the JSON symbol graphs emitted by the --emit-symbol-graph or -extract-api options into one unified JSON symbol graph file. Differential Revision: https://reviews.llvm.org/D158646
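As a rough illustration of what "merging" can mean here, the sketch below combines several graphs and keeps one copy of each symbol, keyed by its USR. This is an assumed policy, not the patch's implementation (the patch rebuilds an APISet through a visitor); `SymbolSketch`, `RelationshipSketch`, `GraphSketch` and `mergeGraphs` are hypothetical names.

```cpp
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, heavily reduced view of a symbol graph.
struct SymbolSketch {
  std::string USR; // value of identifier.precise
  std::string Name;
};

struct RelationshipSketch {
  std::string Source, Target, Kind;
};

struct GraphSketch {
  std::vector<SymbolSketch> Symbols;
  std::vector<RelationshipSketch> Relationships;
};

// Assumed policy: the first occurrence of a USR wins, and relationships are
// deduplicated on the (source, target, kind) triple.
GraphSketch mergeGraphs(const std::vector<GraphSketch> &Inputs) {
  GraphSketch Merged;
  std::set<std::string> SeenUSRs;
  std::set<std::tuple<std::string, std::string, std::string>> SeenRels;
  for (const GraphSketch &G : Inputs) {
    for (const SymbolSketch &S : G.Symbols)
      if (SeenUSRs.insert(S.USR).second)
        Merged.Symbols.push_back(S);
    for (const RelationshipSketch &R : G.Relationships)
      if (SeenRels.insert(std::make_tuple(R.Source, R.Target, R.Kind)).second)
        Merged.Relationships.push_back(R);
  }
  return Merged;
}
```

Keying on the USR is a natural choice because the same declaration seen from two translation units carries the same USR, while names alone may collide.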
Arsenic-ATG added the clang (Clang issues not falling into any other category) and clang-tools-extra labels on Sep 10, 2023
@llvm/pr-subscribers-clang

Changes

Create a clang tool to merge all the JSON symbol graphs emitted by the --emit-symbol-graph or -extract-api options into one unified JSON symbol graph file.

Differential Revision: https://reviews.llvm.org/D158646

Patch is 127.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/65894.diff

16 Files Affected:
diff --git a/clang-tools-extra/CMakeLists.txt b/clang-tools-extra/CMakeLists.txt
index 6a3f741721ee6c7..a4052e0894076ef 100644
--- a/clang-tools-extra/CMakeLists.txt
+++ b/clang-tools-extra/CMakeLists.txt
@@ -13,6 +13,7 @@ if(CLANG_INCLUDE_TESTS)
   endif()
 endif()
 
+add_subdirectory(clang-symbolgraph-merger)
 add_subdirectory(clang-apply-replacements)
 add_subdirectory(clang-reorder-fields)
 add_subdirectory(modularize)
diff --git a/clang-tools-extra/clang-symbolgraph-merger/CMakeLists.txt b/clang-tools-extra/clang-symbolgraph-merger/CMakeLists.txt
new file mode 100644
index 000000000000000..a071a8a11693337
--- /dev/null
+++ b/clang-tools-extra/clang-symbolgraph-merger/CMakeLists.txt
@@ -0,0 +1,3 @@
+include_directories(include)
+add_subdirectory(lib)
+add_subdirectory(tool)
diff --git a/clang-tools-extra/clang-symbolgraph-merger/include/clang-symbolgraph-merger/SymbolGraph.h b/clang-tools-extra/clang-symbolgraph-merger/include/clang-symbolgraph-merger/SymbolGraph.h
new file mode 100755
index 000000000000000..a613f833ffad73b
--- /dev/null
+++ b/clang-tools-extra/clang-symbolgraph-merger/include/clang-symbolgraph-merger/SymbolGraph.h
@@ -0,0 +1,48 @@
+#ifndef SYMBOLGRAPH_H
+#define SYMBOLGRAPH_H
+
+#include "clang/Basic/LangStandard.h"
+#include "clang/ExtractAPI/API.h"
+#include "clang/ExtractAPI/AvailabilityInfo.h"
+#include "clang/ExtractAPI/DeclarationFragments.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/JSON.h"
+#include
+#include
+
+namespace sgmerger {
+
+// see https://github.com/apple/swift-docc-symbolkit/blob/main/openapi.yaml
+struct SymbolGraph {
+
+  struct Symbol {
+    Symbol(const llvm::json::Object &SymbolObj);
+
+    llvm::json::Object SymbolObj;
+    std::string AccessLevel;
+    clang::extractapi::APIRecord::RecordKind Kind;
+    clang::extractapi::DeclarationFragments DeclFragments;
+    clang::extractapi::FunctionSignature FunctionSign;
+    std::string Name;
+    std::string USR;
+    clang::extractapi::AvailabilitySet Availabilities;
+    clang::extractapi::DocComment Comments;
+    clang::extractapi::RecordLocation Location;
+    clang::extractapi::DeclarationFragments SubHeadings;
+
+    // underlying type in case of Typedef
+    clang::extractapi::SymbolReference UnderLyingType;
+  };
+
+  SymbolGraph(const llvm::StringRef JSON);
+  llvm::json::Object SymbolGraphObject;
+  llvm::json::Object Metadata;
+  llvm::json::Object Module;
+  std::vector<Symbol> Symbols;
+  llvm::json::Array Relationships;
+};
+
+} // namespace sgmerger
+
+#endif /* SYMBOLGRAPH_H */
diff --git a/clang-tools-extra/clang-symbolgraph-merger/include/clang-symbolgraph-merger/SymbolGraphMerger.h b/clang-tools-extra/clang-symbolgraph-merger/include/clang-symbolgraph-merger/SymbolGraphMerger.h
new file mode 100755
index 000000000000000..179cadafd877825
--- /dev/null
+++ b/clang-tools-extra/clang-symbolgraph-merger/include/clang-symbolgraph-merger/SymbolGraphMerger.h
@@ -0,0 +1,45 @@
+#ifndef SYMBOLGRAPHMERGER_H
+#define SYMBOLGRAPHMERGER_H
+
+#include "clang-symbolgraph-merger/SymbolGraph.h"
+#include "clang-symbolgraph-merger/SymbolGraphVisitor.h"
+#include "clang/Basic/LangStandard.h"
+#include "clang/ExtractAPI/API.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/TargetParser/Triple.h"
+#include
+
+namespace sgmerger {
+
+using SymbolMap = llvm::DenseMap;
+
+class SymbolGraphMerger : public SymbolGraphVisitor<SymbolGraphMerger> {
+public:
+  SymbolGraphMerger(const clang::SmallVector<SymbolGraph> &SymbolGraphs,
+                    const std::string &ProductName = "")
+      : ProductName(ProductName), Lang(clang::Language::Unknown),
+        SymbolGraphs(SymbolGraphs) {}
+  bool merge();
+  bool visitMetadata(const llvm::json::Object &Metadata);
+  bool visitModule(const llvm::json::Object &Module);
+  bool visitSymbol(const SymbolGraph::Symbol &Symbol);
+  bool visitRelationship(const llvm::json::Object &Relationship);
+
+private:
+  std::string Generator;
+
+  // stuff required to construct the APISet
+  std::string ProductName;
+  llvm::Triple Target;
+  clang::Language Lang;
+
+  SymbolMap PendingSymbols;
+  SymbolMap VisitedSymbols;
+
+  const clang::SmallVector<SymbolGraph> &SymbolGraphs;
+};
+
+} // namespace sgmerger
+
+#endif /* SYMBOLGRAPHMERGER_H */
diff --git a/clang-tools-extra/clang-symbolgraph-merger/include/clang-symbolgraph-merger/SymbolGraphVisitor.h b/clang-tools-extra/clang-symbolgraph-merger/include/clang-symbolgraph-merger/SymbolGraphVisitor.h
new file mode 100755
index 000000000000000..6e2042784147a62
--- /dev/null
+++ b/clang-tools-extra/clang-symbolgraph-merger/include/clang-symbolgraph-merger/SymbolGraphVisitor.h
@@ -0,0 +1,68 @@
+#ifndef SYMBOLGRAPHVISITOR_H
+#define SYMBOLGRAPHVISITOR_H
+
+#include "clang-symbolgraph-merger/SymbolGraph.h"
+#include "clang/ExtractAPI/API.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/JSON.h"
+#include
+
+namespace sgmerger {
+
+// Visits a symbol graph object and record the extracted info to API
+template <typename Derived> class SymbolGraphVisitor {
+public:
+  bool traverseSymbolGraph(const SymbolGraph &SG) {
+    bool Success = true;
+    Success = (getDerived()->visitMetadata(SG.Metadata) &&
+               getDerived()->visitModule(SG.Module) &&
+               getDerived()->traverseSymbols(SG.Symbols) &&
+               getDerived()->traverseRelationships(SG.Relationships));
+
+    return Success;
+  }
+
+  bool traverseSymbols(const std::vector<SymbolGraph::Symbol> &Symbols) {
+    bool Success = true;
+    for (const auto &Symbol : Symbols)
+      Success = getDerived()->visitSymbol(Symbol);
+    return Success;
+  }
+
+  bool traverseRelationships(const llvm::json::Array &Relationships) {
+    bool Success = true;
+    for (const auto &RelValue : Relationships) {
+      if (const auto *RelObj = RelValue.getAsObject())
+        Success = getDerived()->visitRelationship(*RelObj);
+    }
+    return Success;
+  }
+
+  bool visitMetadata(const llvm::json::Object &Metadata);
+  bool visitModule(const llvm::json::Object &Module);
+  bool visitSymbol(const SymbolGraph::Symbol &Symbol);
+  bool visitRelationship(const llvm::json::Object &Relationship);
+
+  std::unique_ptr<clang::extractapi::APISet> getAPISet() {
+    return std::move(API);
+  }
+
+protected:
+  std::unique_ptr<clang::extractapi::APISet> API;
+
+public:
+  SymbolGraphVisitor(const SymbolGraphVisitor &) = delete;
+  SymbolGraphVisitor(SymbolGraphVisitor &&) = delete;
+  SymbolGraphVisitor &operator=(const SymbolGraphVisitor &) = delete;
+  SymbolGraphVisitor &operator=(SymbolGraphVisitor &&) = delete;
+
+protected:
+  SymbolGraphVisitor() : API(nullptr) {}
+  ~SymbolGraphVisitor() = default;
+
+  Derived *getDerived() { return static_cast<Derived *>(this); };
+};
+
+} // namespace sgmerger
+
+#endif /* SYMBOLGRAPHVISITOR_H */
diff --git a/clang-tools-extra/clang-symbolgraph-merger/lib/CMakeLists.txt b/clang-tools-extra/clang-symbolgraph-merger/lib/CMakeLists.txt
new file mode 100755
index 000000000000000..5f0bcc65c4762e2
--- /dev/null
+++ b/clang-tools-extra/clang-symbolgraph-merger/lib/CMakeLists.txt
@@ -0,0 +1,14 @@
+set(LLVM_LINK_COMPONENTS Support)
+
+add_clang_library(clangSymbolGraphMerger
+  SymbolGraphMerger.cpp
+  SymbolGraph.cpp
+  )
+
+clang_target_link_libraries(clangSymbolGraphMerger
+  PRIVATE
+  clangBasic
+  clangToolingCore
+  clangToolingInclusions
+  clangExtractAPI
+)
diff --git a/clang-tools-extra/clang-symbolgraph-merger/lib/SymbolGraph.cpp b/clang-tools-extra/clang-symbolgraph-merger/lib/SymbolGraph.cpp
new file mode 100755
index 000000000000000..030a9bda99db08e
--- /dev/null
+++ b/clang-tools-extra/clang-symbolgraph-merger/lib/SymbolGraph.cpp
@@ -0,0 +1,243 @@
+#include "clang-symbolgraph-merger/SymbolGraph.h"
+#include "clang/ExtractAPI/API.h"
+#include "clang/ExtractAPI/AvailabilityInfo.h"
+#include "clang/ExtractAPI/DeclarationFragments.h"
+#include "llvm/Support/Allocator.h"
+#include "llvm/Support/JSON.h"
+#include "llvm/Support/VersionTuple.h"
+#include
+#include
+#include
+#include
+
+using namespace sgmerger;
+using namespace llvm;
+using namespace llvm::json;
+using namespace clang::extractapi;
+
+namespace {
+
+APIRecord::RecordKind getSymbolKind(const Object &Kind) {
+
+  if (auto Identifier = Kind.getString("identifier")) {
+    // Remove language prefix
+    auto Id = Identifier->split('.').second;
+    if (Id.equals("func"))
+      return APIRecord::RK_GlobalFunction;
+    if (Id.equals("var"))
+      return APIRecord::RK_GlobalVariable;
+    if (Id.equals("enum.case"))
+      return APIRecord::RK_EnumConstant;
+    if (Id.equals("enum"))
+      return APIRecord::RK_Enum;
+    if (Id.equals("property"))
+      return APIRecord::RK_StructField;
+    if (Id.equals("struct"))
+      return APIRecord::RK_Struct;
+    if (Id.equals("ivar"))
+      return APIRecord::RK_ObjCIvar;
+    if (Id.equals("method"))
+      return APIRecord::RK_ObjCInstanceMethod;
+    if (Id.equals("type.method"))
+      return APIRecord::RK_ObjCClassMethod;
+    if (Id.equals("property"))
+      return APIRecord::RK_ObjCInstanceProperty;
+    if (Id.equals("type.property"))
+      return APIRecord::RK_ObjCClassProperty;
+    if (Id.equals("class"))
+      return APIRecord::RK_ObjCInterface;
+    if (Id.equals("protocol"))
+      return APIRecord::RK_ObjCProtocol;
+    if (Id.equals("macro"))
+      return APIRecord::RK_MacroDefinition;
+    if (Id.equals("typealias"))
+      return APIRecord::RK_Typedef;
+  }
+  return APIRecord::RK_Unknown;
+}
+
+VersionTuple parseVersionTupleFromJSON(const Object *VTObj) {
+  auto Major = VTObj->getInteger("major").value_or(0);
+  auto Minor = VTObj->getInteger("minor").value_or(0);
+  auto Patch = VTObj->getInteger("patch").value_or(0);
+  return VersionTuple(Major, Minor, Patch);
+}
+
+RecordLocation parseSourcePositionFromJSON(const Object *PosObj,
+                                           std::string Filename = "") {
+  assert(PosObj);
+  unsigned Line = PosObj->getInteger("line").value_or(0);
+  unsigned Col = PosObj->getInteger("character").value_or(0);
+  return RecordLocation(Line, Col, Filename);
+}
+
+RecordLocation parseRecordLocationFromJSON(const Object *LocObj) {
+  assert(LocObj);
+
+  std::string Filename(LocObj->getString("uri").value_or(""));
+  // extract file name from URI
+  std::string URIScheme = "file://";
+  if (Filename.find(URIScheme) == 0)
+    Filename.erase(0, URIScheme.length());
+
+  const auto *PosObj = LocObj->getObject("position");
+
+  return parseSourcePositionFromJSON(PosObj, Filename);
+}
+
+DocComment parseCommentsFromJSON(const Object *CommentsObj) {
+  assert(CommentsObj);
+  const auto *LinesArray = CommentsObj->getArray("lines");
+  DocComment Comments;
+  if (LinesArray) {
+    for (auto &LineValue : *LinesArray) {
+      const auto *LineObj = LineValue.getAsObject();
+      auto Text = LineObj->getString("text").value_or("");
+
+      // parse range
+      const auto *BeginLocObj = LineObj->getObject("start");
+      RecordLocation BeginLoc = parseSourcePositionFromJSON(BeginLocObj);
+      const auto *EndLocObj = LineObj->getObject("end");
+      RecordLocation EndLoc = parseSourcePositionFromJSON(EndLocObj);
+      Comments.push_back(CommentLine(Text, BeginLoc, EndLoc));
+    }
+  }
+  return Comments;
+}
+
+AvailabilitySet parseAvailabilitiesFromJSON(const Array *AvailablityArray) {
+  if (AvailablityArray) {
+    SmallVector<AvailabilityInfo> AList;
+    for (auto &AvailablityValue : *AvailablityArray) {
+      const auto *AvailablityObj = AvailablityValue.getAsObject();
+      auto Domain = AvailablityObj->getString("domain").value_or("");
+      auto IntroducedVersion = parseVersionTupleFromJSON(
+          AvailablityObj->getObject("introducedVersion"));
+      auto ObsoletedVersion = parseVersionTupleFromJSON(
+          AvailablityObj->getObject("obsoletedVersion"));
+      auto DeprecatedVersion = parseVersionTupleFromJSON(
+          AvailablityObj->getObject("deprecatedVersion"));
+      AList.emplace_back(AvailabilityInfo(Domain, IntroducedVersion,
+                                          DeprecatedVersion, ObsoletedVersion,
+                                          false));
+    }
+    return AvailabilitySet(AList);
+  }
+  return nullptr;
+}
+
+DeclarationFragments parseDeclFragmentsFromJSON(const Array *FragmentsArray) {
+  DeclarationFragments Fragments;
+  if (FragmentsArray) {
+    for (auto &FragmentValue : *FragmentsArray) {
+      Object FragmentObj = *(FragmentValue.getAsObject());
+      auto Spelling = FragmentObj.getString("spelling").value_or("");
+      auto FragmentKind = DeclarationFragments::parseFragmentKindFromString(
+          FragmentObj.getString("kind").value_or(""));
+      StringRef PreciseIdentifier =
+          FragmentObj.getString("preciseIdentifier").value_or("");
+      Fragments.append(Spelling, FragmentKind, PreciseIdentifier);
+    }
+  }
+  return Fragments;
+}
+
+FunctionSignature parseFunctionSignaturesFromJSON(const Object *SignaturesObj) {
+  FunctionSignature ParsedSignatures;
+  if (SignaturesObj) {
+    // parse return type
+    const auto *RT = SignaturesObj->getArray("returns");
+    ParsedSignatures.setReturnType(parseDeclFragmentsFromJSON(RT));
+
+    // parse function parameters
+    if (const auto *ParamArray = SignaturesObj->getArray("parameters")) {
+      for (auto &Param : *ParamArray) {
+        auto ParamObj = *(Param.getAsObject());
+        auto Name = ParamObj.getString("name").value_or("");
+        auto Fragments = parseDeclFragmentsFromJSON(
+            ParamObj.getArray("declarationFragments"));
+        ParsedSignatures.addParameter(Name, Fragments);
+      }
+    }
+  }
+  return ParsedSignatures;
+}
+
+std::vector<SymbolGraph::Symbol>
+parseSymbolsFromJSON(const Array *SymbolsArray) {
+  std::vector<SymbolGraph::Symbol> SymbolsVector;
+  if (SymbolsArray) {
+    for (const auto &S : *SymbolsArray)
+      if (const auto *Symbol = S.getAsObject())
+        SymbolsVector.push_back(SymbolGraph::Symbol(*Symbol));
+  }
+  return SymbolsVector;
+}
+
+} // namespace
+
+SymbolGraph::Symbol::Symbol(const Object &SymbolObject)
+    : SymbolObj(SymbolObject) {
+
+  AccessLevel = SymbolObj.getString("accessLevel").value_or("unknown");
+  Kind = getSymbolKind(*(SymbolObject.getObject("kind")));
+
+  // parse Doc comments
+  if (const auto *CommentsArray = SymbolObject.getObject("docComment"))
+    Comments = parseCommentsFromJSON(CommentsArray);
+
+  // parse Availabilityinfo
+  if (const auto *AvailabilityArray = SymbolObj.getArray("availability"))
+    Availabilities = parseAvailabilitiesFromJSON(AvailabilityArray);
+
+  // parse declaration fragments
+  if (const auto *FragmentsArray = SymbolObj.getArray("declarationFragments"))
+    DeclFragments = parseDeclFragmentsFromJSON(FragmentsArray);
+
+  // parse function signatures if any
+  if (const auto *FunctionSignObj = SymbolObj.getObject("functionSignature"))
+    FunctionSign = parseFunctionSignaturesFromJSON(FunctionSignObj);
+
+  // parse identifier
+  if (const auto *IDObj = SymbolObj.getObject("identifier"))
+    USR = IDObj->getString("precise").value_or("");
+
+  // parse Location
+  if (const auto *LocObj = SymbolObject.getObject("location"))
+    Location = parseRecordLocationFromJSON(LocObj);
+
+  // parse name and subheadings.
+  if (const auto *NamesObj = SymbolObj.getObject("names")) {
+    Name = NamesObj->getString("title").value_or("");
+    if (const auto *SubHObj = NamesObj->getArray("subHeading"))
+      SubHeadings = parseDeclFragmentsFromJSON(SubHObj);
+  }
+
+  // parse underlying type in case of Typedef
+  auto UType = SymbolObject.getString("type");
+  if (UType.has_value()) {
+    auto UTypeUSR = UType.value();
+    // FIXME: this is a hacky way for Underlying type to be
+    // serialized into the final graph. Get someway to extract the
+    // actual name of the underlying type from USR
+    UnderLyingType = SymbolReference(" ", UTypeUSR);
+  }
+}
+
+SymbolGraph::SymbolGraph(const llvm::StringRef JSON) {
+  Expected<llvm::json::Value> SGValue = llvm::json::parse(JSON);
+  if (SGValue) {
+    assert(SGValue && SGValue->kind() == llvm::json::Value::Object);
+    if (const auto *SGObject = SGValue->getAsObject()) {
+      SymbolGraphObject = *SGObject;
+      if (const auto *MetadataObj = SGObject->getObject("metadata"))
+        Metadata = *MetadataObj;
+      if (const auto *ModuleObj = SGObject->getObject("module"))
+        Module = *ModuleObj;
+      if (const auto *RelArray = SGObject->getArray("relationships"))
+        Relationships = *RelArray;
+
+      Symbols = parseSymbolsFromJSON(SGObject->getArray("symbols"));
+    }
+  }
+}
diff --git a/clang-tools-extra/clang-symbolgraph-merger/lib/SymbolGraphMerger.cpp b/clang-tools-extra/clang-symbolgraph-merger/lib/SymbolGraphMerger.cpp
new file mode 100755
index 000000000000000..71facea3e6ba8bc
--- /dev/null
+++ b/clang-tools-extra/clang-symbolgraph-merger/lib/SymbolGraphMerger.cpp
@@ -0,0 +1,290 @@
+#include "clang-symbolgraph-merger/SymbolGraphMerger.h"
+#include "clang/AST/DeclObjC.h"
+#include "clang/ExtractAPI/API.h"
+#include "clang/ExtractAPI/AvailabilityInfo.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Casting.h"
+#include
+
+using namespace llvm;
+using namespace llvm::json;
+using namespace clang;
+using namespace clang::extractapi;
+using namespace sgmerger;
+
+namespace {
+ObjCInstanceVariableRecord::AccessControl
+getAccessFromString(const StringRef AccessLevel) {
+  if (AccessLevel.equals("Private"))
+    return ObjCInstanceVariableRecord::AccessControl::Private;
+  if (AccessLevel.equals("Protected"))
+    return ObjCInstanceVariableRecord::AccessControl::Protected;
+  if (AccessLevel.equals("Public"))
+    return ObjCInstanceVariableRecord::AccessControl::Public;
+  if (AccessLevel.equals("Package"))
+    return ObjCInstanceVariableRecord::AccessControl::Package;
+  return ObjCInstanceVariableRecord::AccessControl::None;
+}
+
+Language getLanguageFromString(const StringRef LangName) {
+  if (LangName.equals("c"))
+    return Language::C;
+  if (LangName.equals("objective-c"))
+    return Language::ObjC;
+  if (LangName.equals("C++"))
+    return Language::CXX;
+
+  return Language::Unknown;
+}
+
+template <typename Lambda>
+bool addWithContainerRecord(APIRecord::RecordKind Kind, APIRecord *TargetRecord,
+                            Lambda Inserter) {
+  switch (Kind) {
+  case APIRecord::RK_ObjCInterface: {
+    if (ObjCInterfaceRecord *Container =
+            dyn_cast_or_null<ObjCInterfaceRecord>(TargetRecord))
+      Inserter(Container);
+  } break;
+  case APIRecord::RK_ObjCProtocol: {
+    if (ObjCProtocolRecord *Container =
+            dyn_cast_or_null<ObjCProtocolRecord>(TargetRecord))
+      Inserter(Container);
+  } break;
+  case APIRecord::RK_ObjCCategory: {
+    if (ObjCCategoryRecord *Container =
+            dyn_cast_or_null<ObjCCategoryRecord>(TargetRecord))
+      Inserter(Container);
+  } break;
+  default:
+    retur...
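One detail of the traversal helpers in SymbolGraphVisitor.h: inside the loops, `Success` is assigned the result of each visit, so only the last visit decides the return value. The sketch below shows one possible alternative that visits every element and still reports any failure. `VisitorSketch` and `CountingVisitor` are hypothetical names, and symbols are reduced to strings to keep the example self-contained; it is not the patch's code.

```cpp
#include <string>
#include <vector>

// Hypothetical CRTP base, modelled loosely on SymbolGraphVisitor.
template <typename Derived> class VisitorSketch {
public:
  bool traverseSymbols(const std::vector<std::string> &Symbols) {
    bool Success = true;
    for (const std::string &Symbol : Symbols)
      // Call the visitor first, then fold the result in, so a failure
      // neither stops the walk nor gets overwritten by a later success.
      Success = getDerived()->visitSymbol(Symbol) && Success;
    return Success;
  }

protected:
  VisitorSketch() = default;
  ~VisitorSketch() = default;

  Derived *getDerived() { return static_cast<Derived *>(this); }
};

// Example derived visitor: counts visits and rejects empty names.
class CountingVisitor : public VisitorSketch<CountingVisitor> {
public:
  bool visitSymbol(const std::string &Symbol) {
    ++Visited;
    return !Symbol.empty();
  }

  unsigned Visited = 0;
};
```

Whether the merger should continue after a failed visit or stop at the first one is a design decision for the patch author; the sketch only shows how to keep the failure visible in the return value.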