Update README.md
dmeoli authored Feb 28, 2024
1 parent 6c2e0aa commit 39ff2a7
Showing 1 changed file with 12 additions and 12 deletions.
24 changes: 12 additions & 12 deletions README.md
@@ -3,19 +3,19 @@
WS4J provides a pure Java API for several published semantic relatedness/similarity algorithms for, in theory, any
WordNet instance. You can immediately use WS4J on [Princeton's English WordNet 3.0](https://wordnet.princeton.edu/)
lexical database through [MIT Java WordNet Interface 2.4.0](https://projects.csail.mit.edu/jwi/), which is the fastest
-Java library for interfacing to WordNet.
+Java library for interfacing with WordNet.

The codebase is mostly a Java re-implementation of [WordNet::Similarity](http://wn-similarity.sourceforge.net/)
written in Perl, using the same data files as seen in src/main/resources, with some test cases for verifying the same
-logic. WS4J designed to be thread safe.
+logic. WS4J is designed to be thread-safe.

## Relatedness/Similarity Algorithms

The semantic relatedness/similarity metrics available are:

- [HSO](http://search.cpan.org/dist/WordNet-Similarity/lib/WordNet/Similarity/hso.pm):
[Hirst & St-Onge, 1998](https://scholar.google.com/scholar?q=Lexical+chains+as+representations+of+context+for+the+detection+and+correction+of+malapropisms) -
-The Hirst & St-Onge measure is based on an idea that two lexicalized concepts are semantically close if their WordNet
+The Hirst & St-Onge measure is based on the idea that two lexicalized concepts are semantically close if their WordNet
synsets are connected by a path that is not too long and that "does not change direction too often":

HSO(s1, s2) = const_C - path_length(s1, s2) - const_k * num_of_changes_of_directions(s1, s2);
@@ -28,7 +28,7 @@ The semantic relatedness/similarity metrics available are:

- [LESK](http://search.cpan.org/dist/WordNet-Similarity/lib/WordNet/Similarity/lesk.pm):
[Banerjee & Pedersen, 2002](https://scholar.google.com/scholar?q=An+Adapted+Lesk+Algorithm+for+Word+Sense+Disambiguation+Using+WordNet) -
-Lesk (1985) proposed that the relatedness of two words is proportional to the extent of overlaps of their dictionary
+Lesk (1985) proposed that the relatedness of two words is proportional to the extent of overlaps in their dictionary
definitions. This Lesk measure is based on adapted Lesk from Banerjee and Pedersen (2002) extended this notion to use
WordNet as the dictionary for the word definitions:

@@ -57,16 +57,16 @@ The semantic relatedness/similarity metrics available are:

- [JCN](http://search.cpan.org/dist/WordNet-Similarity/lib/WordNet/Similarity/jcn.pm):
[Jiang & Conrath, 1997](https://scholar.google.com/scholar?q=Semantic+similarity+based+on+corpus+statistics+and+lexical+taxonomy) -
-The Jiang & Conrath measure uses the notion of information content, but in the form of the conditional probability of
-encountering an instance of a child-synset given an instance of a parent synset:
+The Jiang & Conrath measure uses the notion of information content but in the form of the conditional probability of
+encountering an instance of a child synset given an instance of a parent synset:

JCN(s1, s2) = 1 / jcn_distance where jcn_distance(s1, s2) = IC(s1) + IC(s2) - 2 * IC(LCS(s1, s2)); when it's 0,
jcn_distance(s1, s2) = -Math.log_e((freq(LCS(s1, s2).root) - 0.01) / freq(LCS(s1, s2).root)) so that we can have a
non-zero distance which results in infinite similarity;

- [LIN](http://search.cpan.org/dist/WordNet-Similarity/lib/WordNet/Similarity/lin.pm):
[Lin, 1998](https://scholar.google.com/scholar?q=An+information-theoretic+definition+of+similarity) - The Lin measure
-idea is similar to JCN with small modification:
+idea is similar to JCN with a small modification:

LIN(s1, s2) = 2 * IC(LCS(s1, s2) / (IC(s1) + IC(s2)).
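
As a rough illustration of the HSO, JCN, and LIN formulas quoted in this README hunk, the following Java sketch evaluates them from precomputed inputs. It is not WS4J's implementation: the class and method names are hypothetical, `constC` and `constK` are passed in because the README does not state their values, and the zero-denominator guard in `lin` is an assumption of this example. The LIN formula is read here as 2 * IC(LCS(s1, s2)) / (IC(s1) + IC(s2)). Path lengths, information-content (IC) values, and the root frequency are assumed to have been obtained elsewhere.

```java
public class MeasureFormulasSketch {

    /** HSO(s1, s2) = const_C - path_length - const_k * num_of_changes_of_directions. */
    static double hso(double constC, double constK, int pathLength, int directionChanges) {
        return constC - pathLength - constK * directionChanges;
    }

    /**
     * jcn_distance = IC(s1) + IC(s2) - 2 * IC(LCS). When that is 0, fall back to
     * -ln((freq(root) - 0.01) / freq(root)), as the README describes, so that the
     * distance is non-zero. Assumes rootFreq > 0.01.
     */
    static double jcnDistance(double icS1, double icS2, double icLcs, double rootFreq) {
        double distance = icS1 + icS2 - 2 * icLcs;
        if (distance == 0.0) {
            distance = -Math.log((rootFreq - 0.01) / rootFreq);
        }
        return distance;
    }

    /** JCN(s1, s2) = 1 / jcn_distance(s1, s2). */
    static double jcn(double icS1, double icS2, double icLcs, double rootFreq) {
        return 1.0 / jcnDistance(icS1, icS2, icLcs, rootFreq);
    }

    /** LIN(s1, s2) = 2 * IC(LCS) / (IC(s1) + IC(s2)). */
    static double lin(double icS1, double icS2, double icLcs) {
        double denominator = icS1 + icS2;
        // Assumption of this sketch: return 0 rather than divide by zero.
        return denominator == 0.0 ? 0.0 : 2 * icLcs / denominator;
    }

    public static void main(String[] args) {
        // Made-up inputs, purely for illustration.
        System.out.println("HSO = " + hso(8.0, 1.0, 3, 1));
        System.out.println("JCN = " + jcn(2.0, 4.0, 1.5, 100.0));
        System.out.println("LIN = " + lin(2.0, 4.0, 1.5));
    }
}
```

Keeping the formulas as pure functions of their numeric inputs separates the arithmetic from WordNet lookup, which makes the zero-distance branch of JCN easy to check in isolation.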

@@ -75,7 +75,7 @@ The descriptions above are extracted either from each paper or from

## Prerequisites

-By default, requirement for compilation are:
+By default, the requirements for compilation are:

- JDK 8+
- Maven
@@ -100,7 +100,7 @@ and a simple demo class:

`src/main/java/edu/uniba/di/lacam/kdde/ws4j/demo/SimilarityCalculationDemo.java`

-which can be run through jar-with-dependencies from root folder by typing into terminal:
+which can be run through jar-with-dependencies from the root folder by typing into the terminal:

```
$ java -jar target/ws4j-1.0.2-jar-with-dependencies.jar
@@ -115,11 +115,11 @@ When using WS4J jar package from other projects add the [JitPack](https://jitpac
</repository>
</repositories>

-and declare this github repo as a dependency:
+and declare this GitHub repo as a dependency:

<dependencies>
<dependency>
-<groupId>com.github.DonatoMeoli</groupId>
+<groupId>com.github.dmeoli</groupId>
<artifactId>WS4J</artifactId>
<version>x.y.z</version>
</dependency>
@@ -128,7 +128,7 @@ and declare this github repo as a dependency:
## Running the tests

To run JUnit test cases:

```
$ mvn test
```