SynapticSage/cscg_toolkit

Action-augmented cloned HMMs for higher-order sequence learning and cognitive map formation

Overview

This repository implements Clone-Structured Cognitive Graphs (CSCG), a probabilistic framework for learning cognitive maps from sequential observations and actions. CSCGs build on cloned Hidden Markov Models (CHMMs): sparse HMMs in which each hidden state deterministically emits a single observation, and multiple "clones" share each observation, so the same input can map to different latent states depending on sequential context. CSCGs extend CHMMs by augmenting transitions with actions, enabling spatial, temporal, and relational learning as well as vicarious evaluation (planning without execution) through message-passing inference.

Cognitive maps emerge naturally from latent higher-order sequence learning: organisms learn space by treating it as a sequence.


Repository Structure

cscg_toolkit/
├── README.md                    # This file
├── LICENSE                      # MIT License
├── julia/                       # Julia implementation (reference)
│   ├── Project.toml
│   ├── src/                     # Core CSCG library
│   ├── test/                    # Unit & integration tests
│   ├── scripts/                 # Example scripts
│   └── test_data/               # Test fixtures
├── jax/                         # JAX implementation (active development)
│   ├── chmm_jax/                # Core package
│   ├── tests/                   # Test suite
│   ├── examples/                # Usage examples
│   └── README.md                # JAX-specific docs
├── papers/                      # Reference papers and summaries
│   ├── pdf/                     # Original papers (PDFs)
│   ├── md/                      # Markdown conversions
│   └── summaries/               # Deep technical summaries with LaTeX
└── python/                      # Python reference implementation (legacy)
    ├── chmm_actions.py
    ├── intro.ipynb
    └── README.md

Core Concepts

What is a Cloned HMM?

A Cloned Hidden Markov Model is a sparse HMM where each hidden state emits a single observation deterministically. Multiple hidden states ("clones") can emit the same observation, enabling context-dependent representations:

Observations:       x_0             x_1             x_2       ...
                     ↓               ↓               ↓
Hidden states:  [z_0,z_1,z_2]   [z_3,z_4,z_5]   [z_6,z_7,z_8]  ...  (3 clones per observation)

Key advantages:

  • Context-dependent representations: Same observation splits into different latent states based on sequential context
  • Variable-order dependencies: Efficiently learns higher-order sequential structure without exponential state explosion
  • Computational efficiency: Block-structured transitions exploit emission sparsity
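
As a concrete illustration of the clone layout above, here is a minimal Python sketch. It assumes the common convention that the clones of observation x occupy a contiguous block of state indices; the function names are illustrative, not part of this toolkit's API.

import numpy as np

n_obs = 3      # observation alphabet size |Σ|
n_clones = 3   # clones per observation (M)

def clone_indices(x):
    # Clones of observation x occupy the contiguous index block
    # [x * n_clones, (x + 1) * n_clones).
    return np.arange(x * n_clones, (x + 1) * n_clones)

def emission(z):
    # Deterministic emission: hidden state z always emits observation z // n_clones.
    return z // n_clones

assert list(clone_indices(1)) == [3, 4, 5]   # the three clones of x = 1
assert emission(4) == 1                      # clone z = 4 emits x = 1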

What is CSCG?

Clone-Structured Cognitive Graphs extend cloned HMMs by augmenting state transitions with actions (a toy data-generating example follows the list below). This enables:

  • Spatial/temporal learning: Cognitive maps emerge from sequential experience with actions
  • Vicarious evaluation: Message-passing inference enables planning without execution
  • Flexible structure: Learns graphs from severely aliased observations (same visual input at multiple locations)
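
To make "severely aliased observations" concrete, the hypothetical toy environment below generates the kind of data a CSCG trains on: a random walk through a small room in which several locations emit the same observation, so only the action-conditioned sequence can tell them apart. The room layout and the random_walk helper are illustrative assumptions, not part of the toolkit.

import numpy as np

rng = np.random.default_rng(0)

# 3x3 room whose four corners share observation 0 (aliased): a single
# glance cannot distinguish them, but action sequences can.
room = np.array([[0, 1, 0],
                 [1, 2, 1],
                 [0, 1, 0]])
moves = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # up, down, left, right

def random_walk(n_steps, start=(1, 1)):
    r, c = start
    obs, acts = [room[r, c]], []
    for _ in range(n_steps):
        a = int(rng.integers(4))
        dr, dc = moves[a]
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:  # walking into a wall leaves the agent in place
            r, c = nr, nc
        acts.append(a)
        obs.append(room[r, c])
    return np.array(obs), np.array(acts)

x, a = random_walk(50)  # training sequence: aliased observations plus actions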

Algorithm Summary

The Baum-Welch algorithm takes a simpler form for cloned HMMs due to emission sparsity:

  1. Forward pass: α(n+1)ᵀ = α(n)ᵀ T(xₙ, aₙ, xₙ₊₁). Computes α(n) = P(x₁:ₙ, a₁:ₙ₋₁, zₙ) using only M×M blocks.

  2. Backward pass: β(n) = T(xₙ, aₙ, xₙ₊₁) β(n+1). Computes β(n) = P(xₙ₊₁:N, aₙ:N₋₁ | zₙ) using only M×M blocks.

  3. E-step: ξᵢₖⱼ(n) = [α(n) ∘ T(i, aₙ, j) ∘ β(n+1)ᵀ] / [α(n)ᵀ T(i, aₙ, j) β(n+1)]. Expected transition counts from clone-set i via action k to clone-set j.

  4. M-step: T(i,k,j) = Σₙ ξᵢₖⱼ(n) / Σₖ′,ⱼ′,ₙ ξᵢₖ′ⱼ′(n). Normalizes expected counts into probability distributions.

Key insight: Only the blocks T(xₙ, aₙ, xₙ₊₁) that actually appear in the observed sequence are computed, so each forward-backward step touches a single M×M block. This yields O(M²T) cost per EM sweep instead of the O(M²|Σ|²T) a dense HMM over M|Σ| states would require.
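
The block structure is easiest to see in code. Below is a minimal numpy sketch of the forward pass, assuming the contiguous clone-block layout shown earlier and an action-conditional transition tensor of shape (n_actions, M|Σ|, M|Σ|); the names are illustrative, not this toolkit's API.

import numpy as np

def forward_log_likelihood(T, pi, x, a, n_clones):
    # T[k, i, j] = P(a_n = k, z_{n+1} = j | z_n = i); pi is the prior over states.
    # x: observations (length N); a: actions (length N - 1).
    loc = lambda o: slice(o * n_clones, (o + 1) * n_clones)
    alpha = pi[loc(x[0])].copy()       # only clones of x[0] carry mass
    log_lik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for n in range(len(x) - 1):
        # Only the M x M block T(x_n, a_n, x_{n+1}) contributes at step n.
        block = T[a[n], loc(x[n]), loc(x[n + 1])]
        alpha = alpha @ block
        z = alpha.sum()                # rescale to avoid numerical underflow
        log_lik += np.log(z)
        alpha /= z
    return log_lik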

Alternatives:

  • Viterbi training: Hard assignment (argmax) instead of soft expectations for faster convergence
  • Pseudocount smoothing: Add κ > 0 to all counts to regularize and prevent zero probabilities (see the M-step sketch after this list)
  • Gradient descent (future): Planned enhancement for end-to-end integration with neural networks
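
As referenced above, here is a hedged sketch of the M-step with pseudocount smoothing, assuming the expected counts are accumulated in a tensor with the same (action, state, state) layout as the forward-pass example (an illustrative assumption, not the toolkit's internal representation):

import numpy as np

def m_step(counts, kappa=0.01):
    # counts[k, i, j]: expected transitions from clone i via action k to clone j
    smoothed = counts + kappa          # κ > 0 keeps every transition reachable
    # Normalize jointly over (action, next clone) for each current clone i,
    # matching T(i,k,j) = Σₙ ξᵢₖⱼ(n) / Σₖ′,ⱼ′,ₙ ξᵢₖ′ⱼ′(n).
    return smoothed / smoothed.sum(axis=(0, 2), keepdims=True)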

Installation

This repository contains multiple implementations; see julia/Project.toml for the Julia environment (the reference implementation), jax/README.md for the JAX package (under active development), and python/README.md for the legacy Python reference.


Contributing

This is a research project. Contributions are welcome!

Priority areas:

  • Gradient descent training via Flux.jl
  • GPU acceleration
    • PyTorch
    • JAX
  • Performance optimizations
  • Additional test coverage
  • Documentation improvements

Workflow:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass (julia --project=julia test/runtests.jl)
  5. Submit a pull request

References

This implementation is based on the following research:

Foundational Papers

  • Dedieu, A., Gothoskar, N., Swingle, S., Lehrach, W., Lázaro-Gredilla, M., & George, D. (2019). Learning higher-order sequential structure with cloned HMMs. arXiv:1905.00507
    • Provides theoretical convergence guarantees; demonstrates a 10% improvement over LSTMs on language modeling
  • George, D., Rikhye, R.V., Gothoskar, N., Guntupalli, J.S., Dedieu, A., & Lázaro-Gredilla, M. (2021). Clone-structured graph representations enable flexible learning and vicarious evaluation of cognitive maps. Nature Communications, 12(1), 2392
    • Extends CHMMs with actions; explains hippocampal splitter cells, route encoding, and remapping phenomena
  • Raju, R.V., Guntupalli, J.S., Zhou, G., Lázaro-Gredilla, M., & George, D. (2022). Space is a latent sequence: Structured sequence learning as a unified theory of representation in the hippocampus. arXiv:2212.01508 (published in Science Advances, 2024)
    • Explains more than a dozen hippocampal phenomena with a single mechanism: latent higher-order sequence learning
  • Kansky, K., et al. (2017). Schema networks: Zero-shot transfer with a generative causal model of intuitive physics. arXiv:1706.04317
  • Lázaro-Gredilla, M., et al. (2018). Beyond imitation: Zero-shot task transfer on robots by learning concepts as cognitive programs. arXiv:1812.02788 (Science Robotics)

Technical Summaries

Comprehensive paper summaries, with full LaTeX derivations, are available in papers/summaries/; markdown conversions of the original papers live in papers/md/.


License

MIT License - see LICENSE for details


Last updated: 2025-11-02
