forked from bixo/bixo

This is a replacement for ant + maven with Gradle. Bixo is an open source web mining toolkit that runs as a series of Cascading pipes on top of Hadoop. By building a customized Cascading pipe assembly, you can quickly create specialized web mining applications.

===============================
Introduction
===============================

Bixo is an open source Java web mining toolkit that runs as a series of Cascading
pipes. It is designed to be used as a tool for creating customized web mining apps.
By building a customized Cascading pipe assembly, you can quickly create a workflow
using Bixo that fetches web content, parses and analyzes it, and publishes the results.

Bixo borrows heavily from the Apache Nutch project, as well as many other open source
projects at Apache and elsewhere.

Bixo is released under the Apache License, Version 2.0.

===============================
Building
===============================

See http://openbixo.org/documentation/building-bixo/ for full details.

You need Apache Ant 1.7 or higher, or Gradle if you use the Gradle build.

To get a list of valid targets:

% cd <project directory>
% ant -p

or 

% gradle -q tasks

To clean and build a jar (which also runs all tests):

% ant clean jar

or

% gradle clean build

Note that "ant clean test jar" currently fails because of a bug in the Maven Ant Tasks
plugin used to manage dependencies.

To run a build without executing the tests:

% gradle -x test build
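
As a rough sketch, the build commands above could be wrapped in a small shell
helper that prints the command for a chosen tool. The function name
bixo_build_cmd and its "yes"/"no" skip-tests argument are illustrative
assumptions, not part of the Bixo build. The helper only prints the command, so
you can inspect it before running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bixo): print the build command for a tool.
# Usage: bixo_build_cmd <gradle|ant> [yes|no]
#   The second argument asks to skip tests; in this sketch it only affects
#   Gradle, since the README gives no Ant command that skips tests.
bixo_build_cmd() {
    tool="$1"
    skip_tests="${2:-no}"
    case "$tool" in
        gradle)
            if [ "$skip_tests" = "yes" ]; then
                # Build while excluding the test task.
                echo "gradle -x test build"
            else
                # Clean, build the jar, and run all tests.
                echo "gradle clean build"
            fi
            ;;
        ant)
            # Clean and build the jar; this also runs all tests.
            echo "ant clean jar"
            ;;
        *)
            echo "unknown build tool: $tool" >&2
            return 1
            ;;
    esac
}
```

For example, "bixo_build_cmd gradle yes" prints the Gradle command that skips
the tests, which you can then run from the project directory.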

To create Eclipse project files:

% ant eclipse

or

% gradle eclipse

Then, from Eclipse follow the standard procedure to import an existing Java project into your Workspace.

