documentation added

ArDoCo · Oct 9, 2024 · 65773a9
1 parent b1b57c1
Showing 27 changed files with 455 additions and 220 deletions.
diff --git a/.gitignore b/.gitignore
@@ -34,3 +34,4 @@ build/
 /redis/redis.conf
 /redis/redis.env
 /.env
+/redis/redis.acl
diff --git a/README.md b/README.md
@@ -0,0 +1,2 @@
+# ArDoCo REST
+Details concerning the architectural decisions as well as the API responses can be found [here](architecture_decisions.md).
diff --git a/architecture_decisions.md b/architecture_decisions.md
@@ -2,7 +2,7 @@
 ### Useful links:
 System health check: http://localhost:8080/actuator/health
 
-Swagger UI with API endpoints: http://localhost:8080/swagger-ui/index.htm
+Swagger UI with API endpoints: http://localhost:8080/swagger-ui/index.html
 
 Redis insight: http://localhost:5540/
 
@@ -11,8 +11,30 @@ Redis insight: http://localhost:5540/
 
 ## Controller
 
-### Purpose
-This is the outermost layer of the REST API and is responsible for the HTTP input and forwarding it to the service layer.
+This is the outermost layer of the REST API. It is responsible for handling the HTTP input and forwarding it to the service layer.\
+Each ArDoCo runner is represented by one controller: sad-sam, sad-code, sam-code, sad-sam-code.\
+Each controller has 4 endpoints:
+- runPipeline: starts the ArDoCo pipeline for the runner and returns a unique id which can be used to retrieve the result
+- runPipelineAndWait: starts the ArDoCo pipeline for the runner and waits up to 60 seconds for the result; otherwise it 
+simply returns the id with which the result can be queried
+- getResult: returns the result for the given id, if it already exists
+- waitForResult: waits up to 60 seconds for the result of the given id
+
+The endpoints that start the pipeline each proceed in a similar way:
+1. convert the input MultipartFiles into Files
+2. generate a unique id (an md5 hash) from the input files, the projectName and the runner/tracelink-type (sad-sam, sad-code, ...) to 
+later identify the result
+3. set up the runner using the input Files
+4. forward the runner to the service layer, which runs the pipeline asynchronously if no result is in the database yet, 
+or otherwise simply gets the result from the database
+5. case runPipeline: return the unique id
+6. case runPipelineAndWait: return the unique id, together with the result if it is ready within 60 seconds
+
+## Notes on the Architecture
+Handling the result from calling the service is similar for all controllers. This shared behaviour can be found in the AbstractController class.\
+Moreover, the endpoints to retrieve the results are the same for each controller (meaning you can also use the getResult endpoint of sad-sam to query the result of a sam-code pipeline).
+Currently, all controllers have these 2 methods so that each can function on its own, but in order to minimize code duplication and to make the API more intuitive
+it might make more sense to put them in a separate controller whose only task is to retrieve the result.
 
 ### Remarks
 - So far, the API doesn't allow users to define additional Configs (in the Controller classes)
@@ -26,26 +48,41 @@ file is empty or not.
 
 ## Service
 ### Purpose:
-This layer is responsible for processing the input and making the needed calls to ArDoCo to get a result.
+This layer is responsible for processing the input and making the needed calls to ArDoCo to run the pipeline in 
+order to retrieve a result.
+
+### Architectural Remarks
+The controllers already set up the runner and then pass it to the runPipeline() and 
+runPipelineAndWaitForResult() methods. This has the advantage that runPipeline() and runPipelineAndWaitForResult() have the same
+signature, which maximizes code reusability without adding too much complexity. Moreover, there is a unified interface 
+for the services which the controllers can use. This works because the runners are part of an inheritance hierarchy 
+in ArDoCo. However, the current approach has the disadvantage that ArDoCo is already invoked in the controller and not only 
+in the service layer, and that the runner is always set up for the runPipeline methods regardless of whether the result is already 
+in the database or not.\
+Another option would be to invoke ArDoCo only in the service layer. This means that the runner would have to be 
+set up there as well. But since the setup methods of the runners each require different parameters, there can't be 
+a unified interface containing the startPipeline() methods without introducing a lot of complexity through generics.
 
 ### Remarks
 
-- The output directory, which is required by ardoco when running any pipeline, is internally set to a temporary directory 
-and is not made available to the outside, since the result will be returned in form of a response entity
-
-- only the direct interaction with ardoco is asynchronous. Handling the input file (including conversion and 
-checking whether its file type is correct) is done before, since like this the user can get quicker feedback that
-sth. went wrong.
-
 - The ids of the ongoing asynchronous calls are stored in a concurrentHashmap. This has the advantage that
 when a user calls getResult to potentially receive the result, it can first be checked in the concurrentHashmap whether
 the asynchronous call of ardoco has finished yet instead of unnecessarily doing a database call. 
 Additionally storing the Completable Futures in the hashmap allows wait for ardoco without constantly
 querying the database for a result.
 
+## Remarks on Interacting with ArDoCo
+
+- The output directory, which is required by ArDoCo when running any pipeline, is internally set to a temporary directory
+  and is not made available to the outside, since the result is returned in the form of a response entity
+
+- Only the direct interaction with ArDoCo is asynchronous. Handling the input files (including conversion and
+  checking whether the file type is correct) is done beforehand, so that the user gets quicker feedback that
+  something went wrong.
+
 ## Hashing (Generating the ProjectID)
-Only the files are used to create the hash, the configs not, meaning that in case only the configs change, the same
-hash is generated. In the future, the configs might need to be hashed as well.
+Only the files, the projectName and the controller/tracelink-type are used to create the hash, not the configs, meaning 
+that if only the configs change, the same hash is generated. In the future, the configs might need to be hashed as well.
 A md5 hash is used to ensure to get a hash space great enough to ensure that the probability of collisions is almost 0.
 
 The hashes are used as keys in the database. Since entries are automatically deleted after 24h and the hash space is
@@ -59,12 +96,23 @@ Time To Live of 24h, so that the database never gets to large because of stored
 (because the client's request has been too long ago).
 
 To be able to change the database used smoothly, repositories implement a DatabaseAccessor Interface, which is used 
-by the classes which use the database. 
+by the classes which use the database (e.g. the services). 
+
+## Converting the Tracelinks to JSON
+The found tracelinks are converted into a raw JSON string directly after the pipeline has finished and are stored in the database as raw JSON,
+to avoid converting the result multiple times in case a user queries the finished result more than once.
+The TraceLinkConverter class provides functionality to convert the different types of tracelinks into JSON.
+
+## Exception Handling
+Exceptions are handled centrally by the GlobalExceptionHandler, which produces an ErrorResponse for
+the user in case an exception is thrown that is not caught elsewhere. This central handling of exceptions standardizes
+the way the system deals with errors.
 
 ## API Response schemas
 The API has 2 response schemas: 
 1. **Schema for expected behaviour** \
     - #### Sad-Code
+        Example:
    ```json
    {
         "requestId": "SadCodeResult:bigBlueButtonF2BD94533508F2F2DE4130AB43403B63",
@@ -83,36 +131,76 @@ The API has 2 response schemas:
    }
    ```
     - #### Sam-Code
+        Example:
+    ```json
+    {
+        "requestId": "SamCodeResult:bigBlueButton2B867FE03AF1FE8DE3C1DEE7F1D9CB4E",
+        "status": "OK",
+        "message": "The result is ready.",
+        "traceLinkType": "SAM_CODE",
+        "traceLinks": [
+        {
+        "modelElementId": "_9wZIcFkHEeyewPSmlgszyA",
+        "modelElementName": "FSESL",
+        "codeElementId": "acm005843jsd",
+        "codeElementName": "bbb-fsesl-client/src/main/java/org/freeswitch/esl/client/manager/DefaultManagerConnection.java"
+        },
+        {
+        "modelElementId": "_nwrCMFwPEeyiuNx_RO7j-Q",
+        "modelElementName": "FreeSWITCH",
+        "codeElementId": "acm005938jsd",
+        "codeElementName": "bbb-fsesl-client/src/main/java/org/freeswitch/esl/client/outbound/example/SimpleHangupPipelineFactory.java"
+        }]
+    }
+   ```
     - #### Sad-Sam
+        Example:
+   ```json
+    {
+        "requestId": "SadSamResult:bigBlueButton6AA76050BA630F6D8A6E099A30D1053C",
+        "status": "OK",
+        "message": "The result is ready.",
+        "traceLinkType": "SAD_SAM",
+        "traceLinks": [
+        {
+        "sentenceNumber": 3,
+        "modelElementUid": "_0e5u8FkHEeyewPSmlgszyA",
+        "confidence": 1
+        },
+        {
+        "sentenceNumber": 4,
+        "modelElementUid": "_s0aIcFkHEeyewPSmlgszyA",
+        "confidence": 0.8
+        }]
+    }
+    ```
     - #### Sad-Sam-Code
-       Example: 
+        Example: 
 ```json
 {
-  "projectId": "SadCodeResult:bigbluebutton67C34469A21A66DC94FD39531C9C1E6C",
+  "requestId": "SadSamCodeResult:bigBlueButton8E0E764E3B368781CF0DDDC67F19ABC0",
   "status": "OK",
   "message": "The result is ready.",
-  "samSadTraceLinks": [
-    [
-      "presentations",
-      "FileTypeConstants"
-    ],
-    [
-      "BigBlueButton",
-      "MeetingsResponse"
-    ]]
+  "traceLinkType": "SAD_SAM_CODE",
+  "traceLinks": [
+    {
+      "sentenceNumber": 25,
+      "codeCompilationUnit": "akka-bbb-apps/src/main/scala/org/bigbluebutton/core/util/jhotdraw/PathData.java"
+    },
+    {
+      "sentenceNumber": 49,
+      "codeCompilationUnit": "akka-bbb-apps/src/main/scala/org/bigbluebutton/core/util/jhotdraw/PathData.java"
+    }]
 }
    ```
 Note: Depending on the invoked endpoint and on the concrete result, some parameters (esp traceLinks) might be null
 
-2. Schema for when an error occurred
+2. **Schema for when an error occurred**
 Example:
 ```json
 {
-  "timestamp": "18-09-2024 20:45:13",
+  "timestamp": "09-10-2024 12:58:42",
   "status": "UNPROCESSABLE_ENTITY",
-  "message": "File not found."
+  "message": "No result with key randomID123 found."
 }
-```
-Note: Exceptions are centrally handled by the GlobalExceptionHandler which produces a such an Error message for 
-the user in case an exception is thrown which is not caught elsewhere. This central handling of exceptions standartizes
-the way how the system deals with errors.
+```
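
Review note on the Service remarks in `architecture_decisions.md`: the idea of keeping the ongoing asynchronous calls as CompletableFutures in a concurrent map, so that `waitForResult` can block on the future instead of polling the database, could look roughly like the sketch below. `PendingRuns`, its method names and the use of a plain `Supplier<String>` as a stand-in for the ArDoCo pipeline are assumptions made for illustration; they are not taken from the repository.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper, not part of the repository.
final class PendingRuns {
    // request id -> future of the raw JSON result
    private final ConcurrentHashMap<String, CompletableFuture<String>> running = new ConcurrentHashMap<>();

    /** Starts the pipeline once per id; a second call with the same id reuses the stored future. */
    CompletableFuture<String> start(String requestId, Supplier<String> pipeline) {
        return running.computeIfAbsent(requestId, id -> CompletableFuture.supplyAsync(pipeline));
    }

    /** True while a run for this id is known and has not completed yet. */
    boolean isRunning(String requestId) {
        CompletableFuture<String> future = running.get(requestId);
        return future != null && !future.isDone();
    }

    /** Waits up to the given timeout; empty means "unknown id" or "not finished in time". */
    Optional<String> waitFor(String requestId, long timeout, TimeUnit unit) {
        CompletableFuture<String> future = running.get(requestId);
        if (future == null) {
            return Optional.empty(); // a real service would fall back to the database here
        }
        try {
            return Optional.ofNullable(future.get(timeout, unit));
        } catch (TimeoutException e) {
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        } catch (ExecutionException e) {
            throw new IllegalStateException("Pipeline failed for " + requestId, e);
        }
    }
}
```

In this sketch, entries are never removed from the map. A real service would presumably store the finished result in the database and drop the entry afterwards.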
diff --git a/src/main/java/io/github/ardoco/rest/api/api_response/ArdocoResultResponse.java b/src/main/java/io/github/ardoco/rest/api/api_response/ArdocoResultResponse.java
@@ -11,7 +11,6 @@ public class ArdocoResultResponse {
     private TraceLinkType traceLinkType;
 
     @JsonRawValue
-    //@JsonDeserialize(using = ArdocoResultResponseDeserializer.class)
     private String traceLinks;
 
     public ArdocoResultResponse() {}

diff --git a/src/main/java/io/github/ardoco/rest/api/api_response/TraceLinkType.java b/src/main/java/io/github/ardoco/rest/api/api_response/TraceLinkType.java
@@ -1,5 +1,10 @@
 package io.github.ardoco.rest.api.api_response;
 
+/**
+ * Enum which represents the different types of tracelinks / runner names.
+ * It is used primarily in controllers and services to differentiate the types of trace links
+ * during processing and determine the appropriate handler for each request based on the type.
+ */
 public enum TraceLinkType {
     SAD_CODE("SadCodeResult:", "sad-code"),
     SAM_CODE("SamCodeResult:", "sam-code"),
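
Review note: the key prefixes in this enum match the `requestId` values in the documented examples (for instance `SadCodeResult:bigBlueButtonF2BD...`), which look like prefix + project name + a 32-character uppercase MD5 hex string. A minimal sketch of how such an id could be derived is given below. The class `RequestIds`, the order in which the inputs are fed into the digest, and passing file contents as byte arrays are assumptions; the repository may compose the hash differently.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Hypothetical helper, not part of the repository.
final class RequestIds {
    /**
     * Builds an id of the form keyPrefix + projectName + MD5 hex (uppercase).
     * The digest covers the file contents, the project name and the key prefix.
     */
    static String requestId(String keyPrefix, String projectName, List<byte[]> fileContents) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            for (byte[] content : fileContents) {
                md5.update(content);
            }
            md5.update(projectName.getBytes(StandardCharsets.UTF_8));
            md5.update(keyPrefix.getBytes(StandardCharsets.UTF_8));
            // 16 digest bytes -> 32 hex characters, zero-padded on the left
            String hex = String.format("%032X", new BigInteger(1, md5.digest()));
            return keyPrefix + projectName + hex;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }
}
```

Because the configs are not part of the digest, two runs that differ only in their configs map to the same id, which matches the limitation described in the Hashing section.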

diff --git a/src/main/java/io/github/ardoco/rest/api/controller/AbstractController.java b/src/main/java/io/github/ardoco/rest/api/controller/AbstractController.java
@@ -17,7 +17,14 @@
 import java.util.List;
 import java.util.Optional;
 
-
+/**
+ * The {@code AbstractController} class provides foundational methods for handling various REST responses
+ * for traceability link recovery (TLR) processes. It is designed to work with specific types of trace links
+ * and delegate processing to the {@link AbstractRunnerTLRService}.
+ * <p>
+ * This abstract class is intended to be extended by specific controller implementations that can handle
+ * different types of trace links, represented by the {@link TraceLinkType} enum.
+ */
 public abstract class AbstractController {
 
     protected final TraceLinkType traceLinkType;
@@ -26,13 +33,29 @@ public abstract class AbstractController {
 
     private static final Logger logger = LogManager.getLogger(AbstractController.class);
 
+    /**
+     * Constructs a new {@code AbstractController} with the specified service and trace link type.
+     *
+     * @param service the service responsible for trace link recovery operations
+     * @param traceLinkType the type of trace link this controller manages
+     */
     public AbstractController(AbstractRunnerTLRService service, TraceLinkType traceLinkType) {
         this.traceLinkType = traceLinkType;
         this. service = service;
     }
 
 
-    // build result for runPipeline
+    /**
+     * Handles the process of running a pipeline and building a response based on the result status.
+     *
+     * @param runner the {@link ArDoCoRunner} instance to execute
+     * @param requestId the unique request ID associated with the pipeline run
+     * @param inputFiles the list of input files for the pipeline run
+     * @return a {@link ResponseEntity} containing the {@link ArdocoResultResponse} with the status and result message
+     * @throws FileNotFoundException if any of the input files cannot be found
+     * @throws FileConversionException if there's an error converting any file during the pipeline process
+     * @throws HashingException if hashing the files for the request ID fails
+     */
     protected ResponseEntity<ArdocoResultResponse> handleRunPipeLineResult(ArDoCoRunner runner, String requestId, List<File> inputFiles)
             throws FileNotFoundException, FileConversionException, HashingException {
         Optional<String> result = service.runPipeline(runner, requestId, inputFiles);
@@ -45,7 +68,14 @@ protected ResponseEntity<ArdocoResultResponse> handleRunPipeLineResult(ArDoCoRun
         return new ResponseEntity<>(response, response.getStatus());
     }
 
-    // build result for getResult
+    /**
+     * Handles the retrieval of a result and builds an appropriate response based on the result status.
+     *
+     * @param requestId the unique request ID for retrieving the result
+     * @return a {@link ResponseEntity} containing the {@link ArdocoResultResponse} with the status and result message
+     * @throws ArdocoException if an error occurs while fetching the result
+     * @throws IllegalArgumentException if the provided requestId is invalid
+     */
     protected ResponseEntity<ArdocoResultResponse> handleGetResult(String requestId) throws ArdocoException, IllegalArgumentException  {
         Optional<String> result = service.getResult(requestId);
         ArdocoResultResponse response;
@@ -57,7 +87,16 @@ protected ResponseEntity<ArdocoResultResponse> handleGetResult(String requestId)
         return new ResponseEntity<>(response, response.getStatus());
     }
 
-    // build result for waitForResult
+
+    /**
+     * Handles the process of waiting for a result to become available, building a response based on the status.
+     *
+     * @param requestId the unique request ID for retrieving the result
+     * @return a {@link ResponseEntity} containing the {@link ArdocoResultResponse} with the status and result message
+     * @throws ArdocoException if an error occurs while waiting for the result
+     * @throws IllegalArgumentException if the provided requestId is invalid
+     * @throws TimeoutException if waiting for the result times out
+     */
     protected ResponseEntity<ArdocoResultResponse> handleWaitForResult(String requestId) throws ArdocoException, IllegalArgumentException, TimeoutException {
         Optional<String> result = service.waitForResult(requestId);
         ArdocoResultResponse response;
@@ -69,7 +108,15 @@ protected ResponseEntity<ArdocoResultResponse> handleWaitForResult(String reques
         return new ResponseEntity<>(response, response.getStatus());
     }
 
-    // build result for runPipelineAndWaitForResult
+    /**
+     * Handles the process of running a pipeline and waiting for the result, building a response accordingly.
+     *
+     * @param runner the {@link ArDoCoRunner} instance to execute
+     * @param requestId the unique request ID associated with the pipeline run
+     * @param inputFiles the list of input files for the pipeline run
+     * @return a {@link ResponseEntity} containing the {@link ArdocoResultResponse} with the status and result message
+     * @throws ArdocoException if an error occurs during the pipeline process or waiting for the result
+     */
     protected ResponseEntity<ArdocoResultResponse> handleRunPipelineAndWaitForResult(ArDoCoRunner runner, String requestId, List<File> inputFiles) throws ArdocoException{
         Optional<String> result = service.runPipelineAndWaitForResult(runner, requestId, inputFiles);
         ArdocoResultResponse response;
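
Review note: all four handler methods share the same shape, namely an `Optional<String>` from the service that is turned into a response whose status depends on whether a result is present. The sketch below illustrates that pattern without Spring. The `Response` class is a simplified stand-in for `ArdocoResultResponse`, and the status codes and the "not yet available" message are assumptions; only the message "The result is ready." comes from the documented examples.

```java
import java.util.Optional;

// Hypothetical, Spring-free illustration of the Optional-to-response mapping.
final class ResultResponses {
    /** Simplified stand-in for ArdocoResultResponse. */
    static final class Response {
        final String requestId;
        final int httpStatus;
        final String message;
        final String traceLinks; // raw JSON, or null if no result is available yet

        Response(String requestId, int httpStatus, String message, String traceLinks) {
            this.requestId = requestId;
            this.httpStatus = httpStatus;
            this.message = message;
            this.traceLinks = traceLinks;
        }
    }

    /** A present result yields 200 with the raw JSON; an absent one yields 202 and no trace links. */
    static Response fromResult(String requestId, Optional<String> result) {
        return result
                .map(json -> new Response(requestId, 200, "The result is ready.", json))
                .orElseGet(() -> new Response(requestId, 202, "The result is not yet available.", null));
    }
}
```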

diff --git a/src/main/java/io/github/ardoco/rest/api/controller/ArDoCoForSadCodeTLRController.java b/src/main/java/io/github/ardoco/rest/api/controller/ArDoCoForSadCodeTLRController.java
@@ -87,9 +87,9 @@ public ResponseEntity<ArdocoResultResponse> runPipelineAndWaitForResult(
 
 
     @Operation(
-            summary = "Queries whether the ArDoCoResult is already there.",
-            description = "Queries whether the SadCodeTraceLinks is already there using the id which was returned by tue runPipeline method. " +
-                    "In case the result is not yet ready, the user gets informed about that as well via an appropriate message"
+            summary = "Queries the TraceLinks for a given resultID, and returns them if they are ready",
+            description = "Queries whether the TraceLinks are ready using the id, which was returned by the runPipeline method. " +
+                    "In case the result is not yet ready, the user gets informed about it via an appropriate message and receives the unique id to query the result later"
     )
     @ApiResponses(value = {
             @ApiResponse(responseCode = "200", description = "the sadCodeTraceLinks found by ardoco", content = @Content(mediaType = "application/json", schema = @Schema(implementation = ArdocoResultResponse.class))),
@@ -104,9 +104,9 @@ public ResponseEntity<ArdocoResultResponse> getResult(
 
 
     @Operation(
-            summary = "Queries the SadCodeTraceLinks and returns them when they are ready.",
-            description = "Queries the SamSadTraceLinks and returns them when the previously started pipeline (using the runPipeline Method) has finished." +
-                    "In case it is not ready yet, it performs busy-waiting, meaning it waits until the result ready "
+            summary = "Waits up to 60s for the TraceLinks and returns them when they are ready.",
+            description = "Queries the TraceLinks and returns them when the previously started pipeline (using the runPipeline Method) has finished. " +
+                    "In case the result is not there within 60s of waiting, the user gets informed about it via an appropriate message"
     )
     @ApiResponses(value = {
             @ApiResponse(responseCode = "200", description = "the sadCodeTraceLinks found by ardoco", content = @Content(mediaType = "application/json", schema = @Schema(implementation = ArdocoResultResponse.class))),