
Releases: markkho/msdm

v0.11 Release

17 Oct 22:20
  • Major fix to A* implementation in 7a52fa7
  • Additional table support
  • ImplicitDistribution implementation
  • Implementation of Options Framework (Sutton, Precup & Singh, 1999)
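An option in the Sutton, Precup & Singh (1999) framework is a triple: an initiation set I, an intra-option policy π, and a termination condition β. The sketch below illustrates that triple in plain Python; the class and attribute names are illustrative assumptions, not msdm's actual interface.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Option:
    # Illustrative sketch of the (I, pi, beta) triple from
    # Sutton, Precup & Singh (1999); not msdm's actual interface.
    initiation_set: frozenset                   # states where the option may start
    policy: Callable[[Any], Any]                # intra-option policy: state -> action
    termination_prob: Callable[[Any], float]    # beta: state -> P(terminate)

    def can_initiate(self, state) -> bool:
        return state in self.initiation_set

# A trivial "go right until the wall" option on a 1-D corridor with cells 0..4
go_right = Option(
    initiation_set=frozenset(range(4)),
    policy=lambda s: "right",
    termination_prob=lambda s: 1.0 if s == 4 else 0.0,
)
```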

v0.10 Release

11 Jan 22:22

Summary of changes/additions:

  • Implemented a Table class that provides a dict- and numpy-like interface backed by a numpy array
  • MarkovDecisionProcess and PartiallyObservableMDP algorithms now return Results objects whose attributes (e.g., state_value, action_value, policy) are Tables (note: this is a breaking change)
  • For all MDPs and derived problem classes, is_terminal has been changed to is_absorbing
  • FunctionalPolicy and TabularPolicy classes introduced
  • PolicyIteration, ValueIteration, and MultichainPolicyIteration have been (re-)implemented
  • Tests have been streamlined
  • Organization of core modules has been streamlined
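The core idea behind the new Table class (key-based lookup over a flat array backend) can be sketched in a few lines. This is only a minimal illustration of the concept; msdm's actual Table is richer, and a plain list stands in here for the numpy array backend.

```python
class MiniTable:
    # Minimal sketch of a dict-like table over a flat array backend;
    # illustrates the idea behind msdm's Table class, not its real API.
    def __init__(self, keys, values):
        self._index = {k: i for i, k in enumerate(keys)}  # key -> array position
        self._values = list(values)                       # flat backing array

    def __getitem__(self, key):
        return self._values[self._index[key]]

    def keys(self):
        return self._index.keys()

    def items(self):
        return ((k, self._values[i]) for k, i in self._index.items())

# e.g., a state-value table returned by a planner
state_value = MiniTable(["s0", "s1"], [0.0, 1.0])
```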

v0.9 Release

21 Sep 11:40

Summary of changes/additions:

  • RMAX implementation
  • Fix TD Learning bug
  • Fix TabularMDP.reachable_states
  • New tests

v0.8 Release

05 Aug 02:41

Summary of changes/additions:

  • LAOStar error handling
  • New DictDistribution methods
  • New condition, chain, and is_normalized methods in FiniteDistribution
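Conditioning a finite distribution means restricting it to outcomes satisfying a predicate and renormalizing. The function below is a sketch of that idea over a plain `{outcome: probability}` dict; it is not msdm's FiniteDistribution.condition implementation.

```python
def condition(dist, predicate):
    # Restrict a {outcome: prob} dict to outcomes satisfying the predicate,
    # then renormalize. Sketch of the idea behind FiniteDistribution.condition.
    kept = {x: p for x, p in dist.items() if predicate(x)}
    total = sum(kept.values())
    return {x: p / total for x, p in kept.items()}

# Condition a fair die on rolling an even number
die = {i: 1 / 6 for i in range(1, 7)}
even = condition(die, lambda x: x % 2 == 0)
```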

v0.7 Release

05 Dec 20:41

Summary of changes/additions:

  • POMDP solvers:
    • FSCBoundedPolicyIteration (new)
    • FSCGradientAscent (minor changes)
  • Planning algorithms
    • Major refactor of LAOStar to support event listener pattern (note interface changes)
    • Minor refactor of LRTDP to support event listener pattern
  • Core classes
    • Fix to TabularPolicy.from_q_matrices calculation of softmax distribution
    • Minor changes to core POMDP implementation
  • New domains
    • GridMDP base class and plotting tools
    • WindyGridWorld MDP
  • General cleanup
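The from_q_matrices fix above concerns turning Q-values into a softmax action distribution. A numerically stable version of that calculation (subtracting the max before exponentiating) looks like the following sketch; the function name and signature are illustrative, not msdm's code.

```python
import math

def softmax_policy_row(q_row, temperature=1.0):
    # Convert one row of a Q matrix into action probabilities via a
    # numerically stable softmax: subtract the max Q-value before
    # exponentiating so large Q-values don't overflow. Illustrative only.
    m = max(q_row)
    exps = [math.exp((q - m) / temperature) for q in q_row]
    z = sum(exps)
    return [e / z for e in exps]

probs = softmax_policy_row([1.0, 2.0, 3.0])
```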

v0.6

03 Nov 22:44

Minor changes

v0.5 Release

30 Oct 17:25

This release mainly includes interfaces, algorithms, and test domains for tabular partially observable Markov decision processes (POMDPs).

Summary of changes:

  • Core POMDP classes:
    • PartiallyObservableMDP
    • TabularPOMDP
    • BeliefMDP
    • POMDPPolicy
    • ValueBasedTabularPOMDPPolicy
    • AlphaVectorPolicy
    • FiniteStateController
    • StochasticFiniteStateController
  • Domains:
    • HeavenOrHell
    • LoadUnload
    • Tiger
  • Algorithms:
    • PointBasedValueIteration
    • QMDP
    • FSCGradientAscent
  • JuliaPOMDPs wrapper
  • Fixes to Policy Iteration and Value Iteration
  • Updated README.md
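A BeliefMDP recasts a POMDP as an MDP over belief states, driven by the Bayesian update b'(s') ∝ O(o | s', a) · Σ_s T(s' | s, a) · b(s). The sketch below shows that computation on a Tiger-style example; the function signatures are assumptions for illustration, not msdm's interface.

```python
def belief_update(belief, action, observation, transition, observe):
    # Bayesian belief update: b'(s') is proportional to
    # O(obs | s', a) * sum_s T(s' | s, a) * b(s).
    # Sketch of the computation underlying a BeliefMDP; the callable
    # signatures here are assumptions, not msdm's interface.
    successors = {s2 for s in belief for s2 in transition(s, action)}
    new_belief = {}
    for s_next in successors:
        pred = sum(belief[s] * transition(s, action).get(s_next, 0.0) for s in belief)
        new_belief[s_next] = observe(s_next, action).get(observation, 0.0) * pred
    z = sum(new_belief.values())  # assumes the observation has nonzero probability
    return {s: p / z for s, p in new_belief.items()}

# Tiger-style example: two hidden states; listening doesn't move the tiger,
# and the observation matches the true state 85% of the time.
T = lambda s, a: {s: 1.0}
O = lambda s, a: {s: 0.85, ("left" if s == "right" else "right"): 0.15}
b1 = belief_update({"left": 0.5, "right": 0.5}, "listen", "left", T, O)
```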

v0.4 Release

21 Oct 20:49 (4aafea6)

New Features

  • QLearning, SARSA, Expected SARSA, DoubleQLearning
  • Policy Iteration
  • Entropy Regularized Policy Iteration
  • Works with Python 3.9
  • QuickMDP and QuickTabularMDP constructors
  • Construction of TabularMDPs from matrices
  • New domains: CliffWalking, GridMDP generic class, Russell & Norvig gridworld example
  • Gridworld plotting of action values
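All four of the new temporal-difference learners above are built around the same kind of tabular update; for Q-learning it is Q(s,a) ← Q(s,a) + α(r + γ·max_a' Q(s',a') − Q(s,a)). The sketch below shows one such update step in isolation; the function name and signature are illustrative, not msdm's QLearning API.

```python
from collections import defaultdict

def q_learning_step(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.95):
    # One tabular Q-learning update:
    #   Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    # Sketch of the update rule only; a full learner wraps this in an
    # episode loop with an exploration policy.
    target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q

Q = defaultdict(float)  # all Q-values start at 0.0
q_learning_step(Q, "s0", "right", 1.0, "s1", ["left", "right"])
```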

Refactoring of core

05 Apr 19:50 (554bcd2)

Major overhaul of core and tabular methods:

  • States/actions are assumed to be hashable (e.g., Gridworld now uses frozendict; no built-in hashing functions; dictionaries are the main way to create maps)
  • The distribution classes have been streamlined (Multinomial has been removed and DictDistribution is the main way to represent categorical distributions; .sample() takes a random number generator)
  • Policy classes have been simplified
  • More thorough type hints
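The new sampling convention (passing an explicit random number generator to .sample()) makes runs reproducible. The class below sketches that pattern over a plain probability dict; it approximates the idea behind DictDistribution, not its actual implementation.

```python
import random

class MiniDictDistribution:
    # Sketch of a dict-backed categorical distribution whose sample()
    # takes an explicit random number generator, as in msdm's refactored
    # DictDistribution (interface approximate, not msdm's real code).
    def __init__(self, probs):
        self.probs = dict(probs)

    def sample(self, rng: random.Random):
        outcomes = list(self.probs)
        weights = [self.probs[o] for o in outcomes]
        return rng.choices(outcomes, weights=weights, k=1)[0]

d = MiniDictDistribution({"heads": 0.5, "tails": 0.5})
flip = d.sample(random.Random(0))  # seeded rng makes the draw reproducible
```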

Minor additions to algorithms

04 Apr 14:58
v0.2

Add Makefile