Commit d7f11cc: Update README.md
0xC000005 authored Sep 25, 2024 (1 parent: 94a458a)
Showing 1 changed file (README.md) with 61 additions and 1 deletion.

# Revolution
![Repin_17October](https://github.com/social-ai-uoft/revolutions/assets/29427196/3488423c-29a9-4f8a-970b-ac9fd299a92e)
![Image_20240908163948](https://github.com/user-attachments/assets/c50309d7-4df5-4247-b50c-662d37d0d6d0)

## Architecture
```mermaid
classDiagram
    class MultiAgentEnv {
        +teams
        +agents
        +pairings
    }
    class Game {
        +game_player_1
        +game_player_2
    }
    class Team {
        +team_players
    }
    class Player {
        +model
        +replay_buffer
    }
    class DQN_dynamic
    class DoubleDQN_dynamic {
        +dqn
        +target_dqn
    }
    class ActorCritic {
        +actor
        +critic
    }
    class PPO {
        +policy
        +policy_old
        +buffer
    }
    class RolloutBuffer
    nn_Module <|-- DQN_dynamic
    nn_Module <|-- ActorCritic
    MultiAgentEnv --> "*" Team : contains
    MultiAgentEnv --> "*" Player : contains
    MultiAgentEnv --> "*" Game : creates
    Team --> "*" Player : contains
    Game --> "2" Player : references
    Player --> "0..1" DoubleDQN_dynamic : may have
    Player --> "0..1" PPO : may have
    DoubleDQN_dynamic --> "2" DQN_dynamic : contains
    PPO --> ActorCritic : contains
    PPO --> RolloutBuffer : contains
```
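The relationships in the diagram can be sketched as plain Python. This is a minimal sketch with hypothetical names mirroring the diagram, not the repository's actual API; in the repo, the `DQN_dynamic` and `ActorCritic` models are `nn.Module` subclasses:

```python
# Hypothetical sketch of the class relationships in the diagram above.
# All names are illustrative stand-ins for the repo's actual classes.
from dataclasses import dataclass, field
from typing import Optional

class DQNDynamic:  # stands in for DQN_dynamic (an nn.Module in the repo)
    pass

@dataclass
class DoubleDQNDynamic:  # DoubleDQN_dynamic: online network + target network
    dqn: DQNDynamic = field(default_factory=DQNDynamic)
    target_dqn: DQNDynamic = field(default_factory=DQNDynamic)

@dataclass
class PPO:  # PPO: current policy, frozen old policy, rollout storage
    policy: Optional[object] = None      # ActorCritic in the repo
    policy_old: Optional[object] = None
    buffer: list = field(default_factory=list)  # RolloutBuffer in the repo

@dataclass
class Player:  # each Player owns a model and a replay buffer
    name: str
    model: Optional[object] = None  # DoubleDQNDynamic, PPO, or None
    replay_buffer: list = field(default_factory=list)

@dataclass
class Team:
    team_players: list

@dataclass
class Game:  # a Game references exactly two players
    game_player_1: Player
    game_player_2: Player

@dataclass
class MultiAgentEnv:  # top-level container: teams, agents, pairings
    teams: list
    agents: list = field(default_factory=list)
    pairings: list = field(default_factory=list)

    def make_games(self):
        # Pair agents off two at a time into Games (illustrative pairing rule).
        self.pairings = [Game(a, b)
                         for a, b in zip(self.agents[::2], self.agents[1::2])]
        return self.pairings
```

For example, `p.model = DoubleDQNDynamic()` gives a player an online/target network pair, mirroring the `Player --> "0..1" DoubleDQN_dynamic` edge in the diagram.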

## Changelog


**[Version 5.0](https://github.com/social-ai-uoft/revolutions/tree/version_5.0)**
* **Episodic Memory Empowers Agents:** Agents now retain memories of past interactions, specific to each opponent, enabling them to develop sophisticated, adaptive strategies.
* **Replay Buffer Improvement:** Agents now keep a separate replay buffer of past interactions for each opponent, so they learn opponent-specific state transitions and can predict an opponent's behavior more explicitly.
* **Richer State Information:** Agents make more informed decisions by considering additional factors such as relative team performance and opponent identities.
* **NPC Opponents:** Test your agents against a variety of non-player characters (NPCs) with predefined behavioral patterns.
* **Battle of the Sexes Integration:** The reward function now internally includes both the Battle of the Sexes and the Prisoner's Dilemma.
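The per-opponent replay buffers described above can be sketched as one bounded buffer per opponent id. This is an illustrative sketch, not the repository's actual buffer class:

```python
# Hypothetical sketch of per-opponent replay buffers: one bounded deque per
# opponent id, so the agent learns opponent-specific state transitions.
# Names and signatures are illustrative, not the repo's actual API.
import random
from collections import defaultdict, deque

class PerOpponentReplayBuffer:
    def __init__(self, capacity_per_opponent=10_000):
        # A new bounded buffer is created lazily for each unseen opponent.
        self.buffers = defaultdict(lambda: deque(maxlen=capacity_per_opponent))

    def push(self, opponent_id, state, action, reward, next_state, done):
        # Store one transition under the opponent it was observed against.
        self.buffers[opponent_id].append((state, action, reward, next_state, done))

    def sample(self, opponent_id, batch_size):
        # Sample only from interactions with this specific opponent.
        buf = list(self.buffers[opponent_id])
        return random.sample(buf, min(batch_size, len(buf)))
```

Because sampling is keyed by opponent, a training step can fit the model to transitions from the current opponent only, rather than mixing behavior from every opponent into one buffer.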

**[Version 4.0](https://github.com/social-ai-uoft/revolutions/tree/version_4.0)**
* **Reactive Training:** Agents now adapt their strategies dynamically in response to opponents' predefined actions. We have verified that agents with PPO and episodic memory are in fact able to learn these adaptive strategies.
* **Episodic Memory:** Agents can now associate each opponent with that opponent's previous action within an episode, which significantly improves agent performance.
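The within-episode episodic memory described above can be sketched as a small map from opponent id to that opponent's most recent action, appended to the agent's observation and cleared at each episode reset. All names here are illustrative assumptions, not the repo's actual implementation:

```python
# Hypothetical sketch of within-episode episodic memory: remember each
# opponent's last action, expose it as an extra observation feature, and
# forget everything when the episode resets. Illustrative names only.
class EpisodicMemory:
    NO_HISTORY = -1  # sentinel feature value for "not seen this episode"

    def __init__(self):
        self.last_action = {}

    def remember(self, opponent_id, action):
        # Record the opponent's most recent action.
        self.last_action[opponent_id] = action

    def augment(self, observation, opponent_id):
        # Append the remembered action (or the sentinel) to a feature list.
        return observation + [self.last_action.get(opponent_id, self.NO_HISTORY)]

    def reset(self):
        # Called at the start of every episode: memory is episodic, not global.
        self.last_action.clear()
```

Conditioning the observation on the opponent's last action is what lets a policy react differently to, say, a defector and a cooperator within the same episode.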