
Library and test program for playing with bayesian rule sets.


Hongyuy/rulelib

 
 


These are tools to experiment with representations and approaches to handling
rulesets.

Each rule has a support and a bit vector (or GMP bignum) that marks the
samples to which the rule applies.
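
As an illustration of this representation (not the library's C code), the
sketch below packs a truth table into a Python integer, which plays the role
of the bignum. It assumes that "support" means the number of samples the rule
applies to; make_rule and the dictionary layout are hypothetical names chosen
for this example.

```python
# Sketch only: a rule as a name plus a truth table packed into an int.
# Bit i is set when the rule applies to sample i.
# Assumption: support is the count of set bits.

def make_rule(name, truthtable):
    """Build a rule from a string of ASCII '0'/'1' characters."""
    bits = 0
    for i, ch in enumerate(truthtable):
        if ch == "1":
            bits |= 1 << i
    support = bin(bits).count("1")  # number of samples the rule applies to
    return {"name": name, "bits": bits, "support": support}


# Hypothetical rule that applies to samples 0, 2 and 3 out of 5.
rule = make_rule("example_rule", "10110")
print(rule["name"], rule["support"])
```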

A ruleset is a collection of rules and a captures vector/bignum associated
with each rule indicating which samples get captured by which rule.
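
One way to picture the captures vectors is sketched below. It assumes the
rules are tried in order and that a sample is captured by the first rule that
applies to it; the README does not state this, so treat it as an assumption.
compute_captures is a hypothetical helper, not a function from rulelib.c.

```python
# Sketch only: derive per-rule captures from per-rule truth tables.
# Assumption: a sample is captured by the first rule in the list that
# applies to it, so later rules only see the samples still uncaptured.

def compute_captures(rule_bits, n_samples):
    """Return (captures per rule, mask of samples no rule captured)."""
    remaining = (1 << n_samples) - 1  # every sample starts uncaptured
    captures = []
    for bits in rule_bits:
        cap = bits & remaining        # samples this rule newly captures
        captures.append(cap)
        remaining &= ~cap             # remove them from the pool
    return captures, remaining


# Three hypothetical rules over five samples.
caps, uncaptured = compute_captures([0b00110, 0b01100, 0b10001], 5)
print([bin(c) for c in caps], bin(uncaptured))
```

Under this assumption, reordering the rules changes the captures vectors even
though each rule's own truth table stays the same.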

makedata.py: Transforms data sets into a form easily read by a C program.
	Assumes input files in the *.TAB and *.Y formats and produces a .out
	file of <rule, truthtable> tuples, where entry i in the truthtable
	is either an ASCII 1 or an ASCII 0 indicating whether the rule applies
	to the ith sample in the training data.

	Usage: (from within python)
	from makedata import *
	get_freqitemsets("basename")

	where basename would be something like adult2_train and the files
	adult2_train.TAB and adult2_train.Y exist. The function will output
	adult2_train.out
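
	To make the <rule, truthtable> idea concrete, here is a small reader
	sketch. The exact .out layout is not specified above, so this assumes
	one tuple per line: the rule text (without spaces) followed by
	whitespace-separated 0/1 entries. parse_out_line is a hypothetical
	helper and the sample line is made up for illustration.

```python
# Sketch only: parse one <rule, truthtable> tuple.
# Assumption: "rule 0 1 1 0 ..." on a single line, rule text has no spaces.

def parse_out_line(line):
    """Split a line into (rule name, list of 0/1 ints)."""
    fields = line.split()
    if not fields:
        raise ValueError("empty line")
    name, entries = fields[0], fields[1:]
    if any(e not in ("0", "1") for e in entries):
        raise ValueError("truth table entries must be 0 or 1")
    return name, [int(e) for e in entries]


# Made-up example line; entry i says whether the rule applies to sample i.
name, table = parse_out_line("{hypothetical_feature=yes} 1 0 1 1")
print(name, table)
```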


analyze.c:	Driver program that:
	1. Calls rules_init to read in the data produced by makedata
	2. Runs [-i iterations] (default 10) iterations, each of which:
		a. Creates a random ruleset of [-s size] (default 3)
		b. Performs size^2 adjacent swaps
		c. Performs size delete/add pairs
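
	The loop above can be sketched as follows. This is an illustration of
	the control flow only, not a port of analyze.c: rules are stood in for
	by integer indices, and the choice of random swap and delete/add
	positions is an assumption. run and adjacent_swap are hypothetical
	names.

```python
# Sketch only: the shape of the driver loop, with rules as integer ids.
# Assumption: swap and delete/add positions are picked uniformly at random.
import random


def adjacent_swap(ruleset, i):
    """Swap the rules at positions i and i + 1."""
    ruleset[i], ruleset[i + 1] = ruleset[i + 1], ruleset[i]


def run(n_rules, size=3, iterations=10, seed=0):
    """Return the last ruleset built; requires n_rules >= size."""
    rng = random.Random(seed)
    ruleset = []
    for _ in range(iterations):
        # a. random ruleset of the requested size, without repeats
        ruleset = rng.sample(range(n_rules), size)
        # b. size^2 adjacent swaps
        for _ in range(size * size):
            if size > 1:
                adjacent_swap(ruleset, rng.randrange(size - 1))
        # c. size delete/add pairs: drop one rule, insert an unused one
        for _ in range(size):
            del ruleset[rng.randrange(len(ruleset))]
            unused = [r for r in range(n_rules) if r not in ruleset]
            ruleset.insert(rng.randrange(len(ruleset) + 1), rng.choice(unused))
    return ruleset


final = run(n_rules=10, size=3, iterations=5)
print(final)
```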

rulelib.c:	Library of routines for manipulating rules and rulesets.
	See rule.h for the exported function prototypes.


Compile options:

This package compiles both with and without the GMP library.  Without it,
bit vector operations are coded manually as arrays of long longs. With -D GMP,
we store the vectors as bignums.
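
The non-GMP layout can be pictured as below: the same bit vector split into
64-bit words, with AND and bit counting done word by word. This is a Python
illustration of the idea, assuming 64-bit words; to_words, and_words and
count_ones are hypothetical names, not functions from rulelib.c.

```python
# Sketch only: a bit vector as a list of 64-bit words, mimicking an
# array of long longs. Word 0 holds samples 0..63, word 1 holds 64..127.
WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1


def to_words(bits, n_samples):
    """Split an integer bit vector into ceil(n_samples / 64) words."""
    n_words = (n_samples + WORD_BITS - 1) // WORD_BITS
    return [(bits >> (WORD_BITS * w)) & WORD_MASK for w in range(n_words)]


def and_words(a, b):
    """Word-by-word AND of two vectors of equal length."""
    return [x & y for x, y in zip(a, b)]


def count_ones(words):
    """Total number of set bits across all words."""
    return sum(bin(w).count("1") for w in words)


# Two hypothetical vectors over 71 samples, so each needs two words.
a = to_words((1 << 70) | 0b1011, 71)
b = to_words((1 << 70) | 0b0110, 71)
both = and_words(a, b)
print(len(both), count_ones(both))
```

With a bignum the same AND is a single operation on one value; the word
array version makes the per-word loop explicit.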

Sample usage, run from the rulelib directory. First generate the data:
$ python makedata.py titanic_train
$ python makedata.py adult2_train
then run the driver:
$ ./analyze titanic_train.out titanic_train.label -s 30 -i 100
$ ./analyze adult2_train.out adult2_train.label -s 30 -i 100

