LCOM Metric Analyzer is a tool that evaluates the quality of object-oriented code by identifying classes that lack cohesion. These classes can be broken down into smaller, more manageable units for improved clarity and maintenance. Cohesion is measured using the various LCOM (Lack of Cohesion of Methods) metrics, and source code is processed using the ROSE compiler infrastructure.
Given the following source code for the class in c4.adb:
package body c4 is
   procedure m1 is
   begin
      a1 := 1;
      a2 := 2;
   end m1;
   procedure m2 is
   begin
      a1 := 1;
      m3;
   end m2;
   procedure m3 is
   begin
      a3 := 3;
   end m3;
   procedure m4 is
   begin
      a4 := 4;
      m3;
   end m4;
end c4;
Our tool converts it into an LCOM graph, where each method is a box, each attribute is an ellipse, and arrows indicate an attribute/method access. Note how methods can call other methods, with arrows between boxes indicating the calls.
This graph is analyzed by the tool using the definitions of LCOM to compute the lack of cohesion of methods for the class.
LCOM1 | LCOM2 | LCOM3 | LCOM4 | LCOM5 |
---|---|---|---|---|
5 | 4 | 3 | 1 | 11/12 |
This tool has been tested to work with Ubuntu 20.04 and Ubuntu 22.04, but in principle, any Linux distribution supported by ROSE should work.
For ease of use, a dockerfile has been provided. It can be built using the following command:
docker build -t lcom -f dockerfile .
Manual installation instructions follow:
Install the required dependencies using your package manager:
sudo apt-get update
sudo apt-get install -y bison byacc cmake dbus flex fontconfig g++ git gnat gprbuild libtool libx11-xcb1 make python3 wget
Set up the environment by choosing where to install tools.
Add these to your .bashrc:
# Set these to an install location of your preference for each tool.
export GNAT_HOME="/GNAT/2019"
# Set these based on where you place the associated git repositories.
export BOOST_REPO="/boost_1_83_0"
export ROSE_REPO="/rose"
export GTEST_REPO="/gtest-parallel"
export LCOM_HOME="/ROSE-LCOM-Tools"
# These paths make it possible for all tools to be found during the build and run process.
# They should generally remain unchanged.
export PATH="$GNAT_HOME/bin:$PATH"
# export LD_LIBRARY_PATH="$GNAT_HOME/lib64:$GNAT_HOME/lib:$LD_LIBRARY_PATH"
export BOOST_HOME="$BOOST_REPO/install"
export LD_LIBRARY_PATH="$BOOST_HOME/stage/lib":$LD_LIBRARY_PATH
export BOOST_LIB="$BOOST_HOME/stage/libexport"
export ROSE_HOME="$ROSE_REPO/install_tree"
export BOOST_ROOT="$BOOST_HOME"
export ASIS_ADAPTER="$ROSE_REPO/build_tree/src/frontend/Experimental_Ada_ROSE_Connection/parser/asis_adapter"
To install GNAT, you can use one of two methods: (1) an automated install using a community-created tool or (2) a manual install via a GUI.
git clone --depth 1 https://github.com/AdaCore/gnat_community_install_script.git
pushd gnat_community_install_script
wget -O gnat-community-2019-20190517-x86_64-linux-bin https://community.download.adacore.com/v1/0cd3e2a668332613b522d9612ffa27ef3eb0815b?filename=gnat-community-2019-20190517-x86_64-linux-bin
sh install_package.sh ./gnat-community-2019-20190517-x86_64-linux-bin $GNAT_HOME
popd
or
wget -O gnat-community-2019-20190517-x86_64-linux-bin https://community.download.adacore.com/v1/0cd3e2a668332613b522d9612ffa27ef3eb0815b?filename=gnat-community-2019-20190517-x86_64-linux-bin
chmod +x gnat-community-2019-20190517-x86_64-linux-bin
./gnat-community-2019-20190517-x86_64-linux-bin
# Follow the setup instructions, installing to the location specified by $GNAT_HOME.
wget -O asis.tar.gz https://community.download.adacore.com/v1/52c69e7295dc301ce670334f8150193ecbec580d?filename=asis-2019-20190517-18AB5-src.tar.gz
tar -xvzf asis.tar.gz
pushd asis-2019-20190517-18AB5-src
sed -i 's/for Library_Kind use \"static\";/for Library_Kind use \"dynamic\";/g' asis.gpr
make all install prefix=$GNAT_HOME
popd
rm asis.tar.gz
Build and install BOOST using the GNAT compiler.
wget https://boostorg.jfrog.io/artifactory/main/release/1.83.0/source/boost_1_83_0.tar.bz2
tar -xvf boost_1_83_0.tar.bz2
pushd $BOOST_REPO
mkdir -p tools/build/src/
echo "using gcc : 8.3.1 : $GNAT_HOME/bin/g++ ; " >> tools/build/src/user-config.jam
bash bootstrap.sh
./b2 -j$(nproc)
./b2 install --prefix=$BOOST_HOME
popd
rm boost_1_83_0.tar.bz2
Build ROSE with Ada language support with the following commands.
NOTE: The Ada representation in ROSE is not yet finalized, so newer versions of ROSE may be incompatible. Our tool is confirmed to work with ROSE version 0.11.145.3.
git clone --depth 1 https://github.com/rose-compiler/rose.git $ROSE_REPO
cd $ROSE_REPO
./build
mkdir -p build_tree
pushd build_tree
export LD_LIBRARY_PATH="$GNAT_HOME/lib64:$GNAT_HOME/lib:$LD_LIBRARY_PATH"
../configure --prefix=$ROSE_HOME --enable-languages=c,c++ --enable-experimental_ada_frontend --without-swi-prolog --without-cuda --without-java --without-python --with-boost=$BOOST_HOME --verbose --with-DEBUG=-ggdb --with-alloc-memset=2 --with-OPTIMIZE="-O0 -march=native -p -DBOOST_TIMER_ENABLE_DEPRECATED" --with-WARNINGS="-Wall -Wextra -Wno-misleading-indentation -Wno-unused-parameter" CXX=$GNAT_HOME/bin/g++ CC=$GNAT_HOME/bin/gcc
make core -j$(nproc)
make install-core -j$(nproc)
make check-core -j$(nproc)
# Build the ROSE AST DOT graph generator.
pushd exampleTranslators
make -j$(nproc)
popd
popd
# NOTE: Restart your terminal after this to clear the changes to LD_LIBRARY_PATH. Adding GNAT to the path adds out-of-date libraries as well, and may throw errors when running certain commands. However, it is required for the build process.
gtest-parallel is used by allTest.py to run all test cases in parallel.
git clone --depth 1 https://github.com/google/gtest-parallel.git $GTEST_REPO
The recommended process uses cmake. A Makefile is also provided.
Start by cloning the repo into the location specified by $LCOM_HOME:
git clone --depth 1 https://github.com/LLNL/ROSE-LCOM-Tools.git $LCOM_HOME
cd $LCOM_HOME
Now choose a build process, either using cmake or running the full build command.
rm -rf build/ # Remove the build directory to ensure a fresh build (rarely needed)
cmake -S . -B build # Create the build directory with autogenerated makefiles
cmake --build build --parallel $(nproc) # Build all LCOM tools in parallel
pushd build && ctest; popd # Run GTests sequentially
# Alternatively run GTests in parallel
$GTEST_REPO/gtest-parallel build/lcom-unittest
As an alternative to a proper build system, you can also build directly with the following commands:
mkdir -p build
# Main LCOM tool.
g++ -o build/lcom.out src/lcom.cpp -Iinclude -I${ROSE_HOME}/include/rose -I${BOOST_HOME}/include -lrose -lboost_date_time -lboost_thread -lboost_filesystem -lboost_program_options -lboost_regex -lboost_system -lboost_serialization -lboost_wave -lboost_iostreams -lboost_chrono -ldl -lm -lquadmath -lasis_adapter -lstdc++fs -pthread -L${ROSE_HOME}/lib -L${BOOST_HOME}/lib -L${ASIS_ADAPTER}/lib -Wl,-rpath ${BOOST_HOME}/lib -Wl,-rpath ${ASIS_ADAPTER}/lib -Wl,-rpath=${ROSE_HOME}/lib
# LCOM DOT graph generator for visualizations.
g++ -o build/lcom-dot.out src/lcom-dot.cpp -Iinclude -I${ROSE_HOME}/include/rose -I${BOOST_HOME}/include -lrose -lboost_date_time -lboost_thread -lboost_filesystem -lboost_program_options -lboost_regex -lboost_system -lboost_serialization -lboost_wave -lboost_iostreams -lboost_chrono -ldl -lm -lquadmath -lasis_adapter -lstdc++fs -pthread -L${ROSE_HOME}/lib -L${BOOST_HOME}/lib -L${ASIS_ADAPTER}/lib -Wl,-rpath ${BOOST_HOME}/lib -Wl,-rpath ${ASIS_ADAPTER}/lib -Wl,-rpath=${ROSE_HOME}/lib
This tool has been tested with additional sources not included in this repo. You can prepare them for use with this tool using the following commands:
bash acats.sh &
bash osc.sh
ROSE is designed to support Ada 95, so we use the associated ACATS version, 2.6.
acats.sh can be used to download and process the source into testcases/acats.
A collection of open source code that uses the Ada 95 language.
Project | Ada lines of code |
---|---|
Ada Exploiting | 1,675 |
Ada Structured Library | 48,258 |
ALIRE: Ada LIbrary REpository | 50,999 |
Ada 95 Booch Components | 34,073 |
Simple components for Ada | 463,660 |
Fuzzy sets for Ada | 695,430 |
GNAT Studio | 845,908 |
Libadalang-tools | 140,481 |
LinXtris | 5,341 |
PHCpack | 2,492,729 |
PNG_IO | 4,214 |
SHA-1 | 498 |
Ada KALINDA OS | 20,383 |
To download the source into testcases/osc, run osc.sh. This will download all of the projects in parallel.
Some basic C++ programs. These were used to verify functionality of the LCOM tool on basic C++ code.
git clone https://github.com/amngupta/simple-cpp-programs.git testcases/simple-cpp-programs
Run bash allTest.sh to run tests.
- Copy any source code into testcases/<your directory here> for evaluation.
- Many options can be specified. View them with python3 scripts/allTest.py -h
- Run python3 scripts/allTest.py <task>, specifying which tasks you want to run. The following tasks are available:
  - build
  - wipe_output
  - make_dot_graphs
  - gen_lcom
  - make_lcom_dot_graphs
  - combine_csv
  - run_gtests (NOTE: Will not work without cmake.)
Multiple tasks, each separated by a space, can be selected to run in one test. If no task is specified, all of them will run.
Abstract syntax tree traversal is the most complex step.
At a high level, it works using ROSE's AstTopDownProcessing visitor traversal pattern, a recursive pattern that accesses each node from the top down.
This ensures every node is evaluated.
An inherited attribute is copied down to child nodes, but all inherited attributes are static in our approach.
Whenever one of our target node types is seen (e.g., Class=SgAdaPackageSpec*, Method=SgFunctionDeclaration*, Attribute=SgInitializedName*), we process its relationship with the other nodes.
The class is the simplest. When a class node is seen, it is added to the list of classes.
When a method is seen, it is added to the list of methods and associated with its parent class. The parent class is identified by traversing up through parent scopes until a matching class type is seen.
When an attribute is seen, it is added to the list of attributes and associated with its parent method. The parent method is identified by traversing up through parent scopes until a matching method type is seen.
When a method is called, the call is associated with the calling method, which is found by traversing up through parent scopes until a matching method type is seen.
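The parent-scope lookup described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Node` class, its `kind` strings, and the `collect` function are hypothetical stand-ins for ROSE AST nodes and the tool's visitor, not part of ROSE or of this tool's API.

```python
class Node:
    """Hypothetical, simplified AST node (not a ROSE type)."""
    def __init__(self, kind, name):
        self.kind = kind          # "class", "method", "attr_ref", "call", or "other"
        self.name = name
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

def enclosing(node, kind):
    """Walk up parent scopes until a node of the requested kind is found."""
    cur = node.parent
    while cur is not None and cur.kind != kind:
        cur = cur.parent
    return cur

def collect(root):
    """Visit every node from the top down and record the relationships."""
    classes, methods, accesses, calls = [], {}, {}, {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.kind == "class":
            classes.append(node.name)
        elif node.kind == "method":
            owner = enclosing(node, "class")
            if owner is not None:                 # methods outside a class are ignored
                methods[node.name] = owner.name
        elif node.kind in ("attr_ref", "call"):
            owner = enclosing(node, "method")
            if owner is not None:                 # accesses outside a method are ignored
                table = accesses if node.kind == "attr_ref" else calls
                table.setdefault(owner.name, set()).add(node.name)
        stack.extend(reversed(node.children))
    return classes, methods, accesses, calls

# A small tree modelled on part of c4.adb.
root = Node("other", "file")
c4 = root.add(Node("class", "c4"))
m2 = c4.add(Node("method", "m2"))
m2.add(Node("attr_ref", "a1"))
m2.add(Node("call", "m3"))
m3 = c4.add(Node("method", "m3"))
m3.add(Node("attr_ref", "a3"))
root.add(Node("attr_ref", "stray"))               # not inside any method

classes, methods, accesses, calls = collect(root)
```

Here `stray` is dropped because no enclosing method is found, matching the assumption that accesses made outside of a method's scope are ignored.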
Attributes and methods can be renamed in Ada and C++, complicating the process of identifying methods that share attributes. When a renaming, pointer, etc. is seen, it is stored in a renaming map, which can be used to look up the root attribute or method associated with the call.
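A renaming map of this kind can be pictured as a dictionary from each alias to the name it renames. The sketch below is an assumption about the general idea, not the tool's actual data structure; `resolve` and the alias names are hypothetical.

```python
def resolve(name, renames):
    """Follow renaming links until the root attribute or method is reached."""
    seen = set()
    while name in renames and name not in seen:   # 'seen' guards against cycles
        seen.add(name)
        name = renames[name]
    return name

# Hypothetical aliases: ptr_alias renames alias_a1, which renames a1.
renames = {"alias_a1": "a1", "ptr_alias": "alias_a1"}
root_of_alias = resolve("ptr_alias", renames)
root_of_plain = resolve("a3", renames)
```

Two methods that access `ptr_alias` and `a1` respectively would then be treated as sharing the same root attribute.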
Ada records are used to contain fields, allowing multiple attributes to be tied to a single object. If a record is accessed, it needs to be seen as overlapping with any access to any underlying fields as well. Since record fields can themselves be records, it is possible that a field will be created and later accessed several records down. To accommodate this, we use a tree data structure, where each node is an attribute and children are fields. Every method access to a specific attribute is associated with the corresponding node. To find overlapping method accesses, traverse up from each leaf node to the root, connecting all methods found along each leaf-to-root traversal.
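The leaf-to-root idea can be sketched as follows. The `FieldNode` class, the dotted-path convention, and the function names are hypothetical choices made for this illustration; the tool's own tree may be organized differently.

```python
from itertools import combinations

class FieldNode:
    """One attribute (or record field); children are its fields."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        self.methods = set()      # methods that accessed exactly this node

def record_access(roots, path, method):
    """Record that `method` accessed the attribute at a dotted path such as "r.inner.x"."""
    parts = path.split(".")
    node = roots.setdefault(parts[0], FieldNode(parts[0]))
    for part in parts[1:]:
        if part not in node.children:
            node.children[part] = FieldNode(part, node)
        node = node.children[part]
    node.methods.add(method)

def leaves(node):
    if not node.children:
        yield node
    for child in node.children.values():
        yield from leaves(child)

def overlapping_pairs(roots):
    """Connect all methods found along each leaf-to-root traversal."""
    pairs = set()
    for root in roots.values():
        for leaf in leaves(root):
            on_path, cur = set(), leaf
            while cur is not None:
                on_path |= cur.methods
                cur = cur.parent
            pairs.update(frozenset(p) for p in combinations(sorted(on_path), 2))
    return pairs

roots = {}
record_access(roots, "r", "m1")           # whole record
record_access(roots, "r.inner.x", "m2")   # nested field
record_access(roots, "r.inner.y", "m3")   # sibling nested field
record_access(roots, "r.other", "m4")
pairs = overlapping_pairs(roots)
```

In this example `m1` overlaps with every other method because it touches the whole record, while `m2` and `m3` do not overlap with each other because they touch sibling fields.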
Once the ROSE AST has been traversed and the relationships between classes, methods, and attributes are captured, LCOM1-5 are calculated using the standard definitions summarized below. Additional normalized metrics are computed by dividing each LCOM metric by the lowest possible cohesion for a class with the given number of methods, where 1 is the least cohesive and 1/#methods is the most cohesive.
- LCOM1: The number of pairs of methods that do not share attributes.
- LCOM2: The number of pairs of methods that do not share attributes minus the number of pairs of methods that do share attributes.
- LCOM3: The number of connected components in the graph that represents each method as a node and the sharing of at least one attribute as an edge.
- LCOM4: The number of connected components in the graph that represents each method as a node and the sharing of at least one attribute as an edge. Edges between methods also form when one method calls another within the same class.
- LCOM5: The sum of non-module attributes accessed by a class, defined by the formula (a-k*l)/(l-k*l), where a is the number of attribute accesses, l is the number of attributes, and k is the number of methods.
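As a worked example, the definitions above can be applied by hand to the c4 class shown earlier. The Python below is a minimal sketch of that arithmetic, not the tool's C++ implementation; the dictionaries are transcribed from c4.adb and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Attribute accesses and intra-class calls transcribed from c4.adb.
accesses = {"m1": {"a1", "a2"}, "m2": {"a1"}, "m3": {"a3"}, "m4": {"a4"}}
calls = {"m2": {"m3"}, "m4": {"m3"}}

def components(nodes, edges):
    """Count connected components with a small union-find."""
    parent = {n: n for n in nodes}
    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n
    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(n) for n in nodes})

def lcom(accesses, calls):
    methods = sorted(accesses)
    pairs = list(combinations(methods, 2))
    sharing = [(u, v) for u, v in pairs if accesses[u] & accesses[v]]
    p, q = len(pairs) - len(sharing), len(sharing)
    call_edges = [(u, v) for u, callees in calls.items() for v in callees]
    a = sum(len(attrs) for attrs in accesses.values())   # attribute accesses
    l = len(set().union(*accesses.values()))             # distinct attributes
    k = len(methods)                                     # methods
    return {
        "LCOM1": p,
        "LCOM2": p - q,
        "LCOM3": components(methods, sharing),
        "LCOM4": components(methods, sharing + call_edges),
        "LCOM5": Fraction(a - k * l, l - k * l),          # undefined when k == 1
    }

result = lcom(accesses, calls)
print(result)
```

Only m1 and m2 share an attribute (a1), so 5 of the 6 method pairs share nothing. With a = 5, l = 4, and k = 4, LCOM5 is (5-16)/(4-16) = 11/12, which matches the table for c4.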
Ada has no specific class construct in the language, so identifying an object that could be used as a class is non-obvious. Anything with a distinct scope in Ada could reasonably be used as a class. We elected to use the same eligible local program units chosen by GNATmetric as our class types: packages, functions/procedures/subprograms, and protected objects. GNATmetric also evaluates tasks. In this situation, entries are methods. However, there is no clear way for an entry to reference an attribute, so LCOM is not meaningful here.
To determine if our tool produces the correct results, we must first report our assumptions.
- Attributes that are never accessed do not count toward the attribute count used to calculate LCOM5.
- All methods within a class are considered, even if they have no accessed attributes.
- If an attribute is accessed multiple times within a method, it is seen as only a single access. This can affect LCOM5.
- An array is seen as a single attribute for the purposes of LCOM.
- When DotBehavior is set to LeftOnly, access to a record field counts as an access to the object as a whole. In this situation, we do not track individual fields as unique attributes.
- When DotBehavior is set to Full, access to a record field counts as a single access to the specific field. If a field access overlaps with another (potentially parent) field or record access, it is counted as a shared access.
- Attribute accesses and method calls made outside of a method's scope are ignored.
- Methods outside of a class are ignored.
- Data outside of the currently evaluated file is often inaccessible. What is analyzed by the LCOM tool is limited to the contents of the AST generated by ROSE. For instance, analyzing point_complex.adb can find the declarations in gcomplex.ads but misses the attribute accesses made in the method definitions found in gcomplex.adb because they are not in the AST.
This tool has been tested for functionality with Ada and C++. In principle, it should be compatible with all languages that work with ROSE, although coverage of language features has only been thoroughly investigated for Ada and C++.
Issues with the tool are likely to be reported in the form of warning/error/fatal logs, unexpected program termination, or assertion failure. These can be tracked down in logs using the following RegEx:
\[(Fatal|Error|Warn)\]|(terminate called after)|(Assertion )
Issue reports for this tool should include the relevant logs (at trace level) and source files.
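The RegEx above can be applied with any grep-like tool. As one possible sketch, the Python below scans log lines for it; the sample log lines are made up for illustration and are not real output from this tool.

```python
import re

# The pattern given above, unchanged.
ISSUE_PATTERN = re.compile(r"\[(Fatal|Error|Warn)\]|(terminate called after)|(Assertion )")

def find_issues(lines):
    """Return (line_number, text) pairs for log lines that match the pattern."""
    return [(n, line.rstrip("\n"))
            for n, line in enumerate(lines, start=1)
            if ISSUE_PATTERN.search(line)]

# Hypothetical log content, invented for this example.
sample_log = [
    "[Info] starting analysis\n",
    "[Warn] something looked odd\n",
    "terminate called after throwing an instance of 'std::runtime_error'\n",
    "done\n",
]
issues = find_issues(sample_log)
```

To scan a real log, pass an open file object instead of `sample_log`.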
Work could be done to improve support for other languages. While other languages are handled by ROSE, this tool has only been tested extensively on Ada and C++ code. Languages are complex and this tool may require significant additional work to handle complex language features in a reasonable way. However, the extension of this tool to accommodate C++ was relatively straightforward, suggesting additional language support may be as simple as adding special cases for specialized AST node types.
Currently, all attribute accesses and method calls are associated only with the class that is their immediate scope parent. It would be interesting to see how LCOM changes if we associated each access with all parent classes, to handle nested classes. In an examination of other LCOM tools, there was no clear consensus on whether or not to support this. If this is implemented, an associated test case should be made for child.adb to ensure it works properly.
When a method calls another method, it indirectly accesses all of the attributes within that called method. We should see what happens to LCOM if we recursively identify these attributes and associate them back to the calling method. This should already be handled by LCOM4 but may impact the results of LCOM5.
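One way this experiment might look is sketched below, using the c4 example. This describes a possible approach only, not behavior the tool currently has; `transitive_attributes` is a hypothetical helper.

```python
def transitive_attributes(accesses, calls):
    """Attributes each method reaches directly or through the methods it calls."""
    def reach(method, visiting):
        attrs = set(accesses.get(method, ()))
        for callee in calls.get(method, ()):
            if callee not in visiting:            # guard against recursive call cycles
                attrs |= reach(callee, visiting | {callee})
        return attrs
    return {m: reach(m, {m}) for m in accesses}

# Attribute accesses and intra-class calls transcribed from c4.adb.
accesses = {"m1": {"a1", "a2"}, "m2": {"a1"}, "m3": {"a3"}, "m4": {"a4"}}
calls = {"m2": {"m3"}, "m4": {"m3"}}
expanded = transitive_attributes(accesses, calls)
```

For c4, m2 and m4 would each gain a3 through their call to m3. The number of attribute accesses a would rise from 5 to 7, so under this change LCOM5 would become (7-16)/(4-16) = 3/4 instead of 11/12.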
LCOM5 currently counts the appearance of an attribute node as a single attribute access. However, when that attribute is a record, it has multiple fields associated with it, each of which can be seen as a separate attribute access for the purposes of LCOM5. It may be worth tracking the number of underlying fields associated with each record access to report a more accurate LCOM5 metric.
Tagged types are supported by our tool, but they currently only work when a method specifies a single tagged type as a parameter. While this is the most common configuration, it is also possible to have multiple tagged types as parameters, essentially giving multiple classes ownership of a single method. This situation is not currently supported and may take a significant amount of refactoring to support correctly.
Ada - Integrate with Ada Analysis Toolkit
The Ada analysis toolkit is a useful visualization tool to evaluate a codebase. LCOM could be integrated into this tool to color-code methods by LCOM and display their relationships in the code.
- Fix NPrint::p() in node-print.hpp to use a hierarchy of supported print functions.
- Fix Cache in lcom.hpp to improve performance.
- LCOM Lecture Slides (archive.org backup): Definition of LCOM with visuals and comparisons
- Cohesion metrics: Description of LCOM
- Class Cohesion Metrics for Software Engineering: A Critical Review: Overview of cohesion state of the art
- Refactoring Effect on Cohesion Metrics: Evaluation of how effective cohesion methods are as refactoring aids
- YALCOM: Yet Another LCOM Metric
- LCOM: Java LCOM implementation
- jpeek: Alternative Java LCOM implementation
- lcom: Python-based LCOM implementation
- LCOM4go: Golang-based LCOM4 implementation
- JavaScript: https://github.com/FujiHaruka/eslint-plugin-lcom
- C#: https://github.com/teo-tsirpanis/lcom-calculator
- PHP: https://github.com/phpmetrics/PhpMetrics/blob/master/doc/metrics.md
- Java: https://github.com/rodhilton/jasome/blob/master/src/main/java/org/jasome/metrics/calculators/LackOfCohesionMethodsCalculator.java
- Java: https://github.com/imgios/DARTS/blob/main/src/main/testSmellDetection/structuralRules/LackOfCohesionOfTestSmellStructural.java
- Java: https://github.com/mauricioaniche/ck
Installing: See README at https://github.com/cqfn/jpeek
Running:
find -name "*.java" > sources.txt
javac @sources.txt
java -jar other-tools/jpeek-0.32.2-jar-with-dependencies.jar --sources testcases/paper --target ./other-tools/jpeek --overwrite --metrics LCOM,LCOM2,LCOM3,LCOM4,LCOM5
Installing:
cd other-tools/LCOM
sudo apt-get install maven -y
mvn package
Running:
java -jar other-tools/LCOM/target/LCOM.jar -i testcases/paper -o other-tools/LCOM-MC
Installing: See README at https://github.com/yahoojapan/lcom4go
go install --ldflags "-s -w" --trimpath github.com/yahoojapan/lcom4go/cmd/lcom4@latest
Running:
$(go env GOPATH)/bin/lcom4
Installing:
pip3 install lcom
Running:
~/.local/bin/lcom testcases/paper
find testcases/paper/ -name "*.java" -print0 | xargs -0 javac && java -jar other-tools/jpeek-0.32.2-jar-with-dependencies.jar --sources testcases/paper --target ./other-tools/jpeek --overwrite --metrics LCOM,LCOM2,LCOM3,LCOM4,LCOM5
java -jar other-tools/LCOM/target/LCOM.jar -i testcases/paper -o other-tools/LCOM-MC
test=<Test Name Here>
$(go env GOPATH)/bin/lcom4 testcases/paper/$test/$test.go
# NOTE: Nothing is returned if LCOM = 1
~/.local/bin/lcom testcases/paper