
Automatically exported from code.google.com/p/juju


OAlm/juju

Latest commit: 431ea0f by Olli Alm, Feb 17, 2013 (1 commit in history)


Juju

Juju is an information extraction framework.

Terms

Gram, Token, Sentence

Filter, Weighter

Installation

Juju's dependencies are managed with Maven (see pom.xml).

SBT

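Add the following lines to your build.sbt. The resolver points at your local Maven repository, so the Juju artifact has to be installed there first (for example by running mvn install in the Juju source tree).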

resolvers += "Local Maven Repository" at "file://"+Path.userHome.absolutePath+"/.m2/repository"

libraryDependencies += "fi.metropolia.ereading" % "Juju" % "0.0.1-SNAPSHOT"

Examples

A simple keyphrase extractor with default weighting (based on Wikipedia's corpus)

import fi.metropolia.mediaworks.juju.syntax.parser.DocumentBuilder;
import fi.metropolia.mediaworks.juju.document.Document;
import fi.metropolia.mediaworks.juju.extractor.keyphrase.KeyphraseExtractor;

String input = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc vitae dui lacus.";

Document document = DocumentBuilder.parseDocument(input, "fi"); // "en" is also available
KeyphraseExtractor extractor = new KeyphraseExtractor(document);

return extractor.process();

Calling process() returns a Map<Grams, Double>. Each Grams key represents a word, and the Double value is its frequency/weight.
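Since the result is an ordinary map from keys to weights, a common next step is to pick the highest-weighted entries. The following is only a minimal sketch of that idea and is not part of Juju: the class name KeyphraseRanking and the method topN are hypothetical, and the sample data uses plain String keys with made-up weights in place of Juju's Grams so that the example runs without Juju on the classpath.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Juju.
public class KeyphraseRanking {

    // Returns up to n entries of the map, ordered from highest to lowest weight.
    // K stands in for the key type of the map returned by process().
    public static <K> List<Map.Entry<K, Double>> topN(Map<K, Double> weights, int n) {
        List<Map.Entry<K, Double>> entries = new ArrayList<>(weights.entrySet());
        // Sort by the Double value, largest first.
        entries.sort(Map.Entry.<K, Double>comparingByValue().reversed());
        // Clamp n so that negative or oversized values are handled safely.
        int limit = Math.max(0, Math.min(n, entries.size()));
        return new ArrayList<>(entries.subList(0, limit));
    }

    public static void main(String[] args) {
        // Made-up weights for illustration; real values would come from extractor.process().
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("lorem", 0.42);
        weights.put("ipsum", 1.37);
        weights.put("dolor", 0.91);

        for (Map.Entry<String, Double> entry : topN(weights, 2)) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```

Because topN is generic in the key type, the same call should work on a Map<Grams, Double>, assuming only that the map behaves like a standard java.util.Map.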
