Projects

SLM-RL Search

Trained a 4B-parameter model with RL to write and execute its own search strategies, achieving a 3.7x improvement over the baseline.

RL, vLLM, GRPO, Python

Multi-Agent LLM Collaboration

Multi-agent architecture with custom reflection and inter-agent communication. Fine-tuned LLaMA-3 8B with LoRA, reaching an 85% success rate across 7 coordination tasks.

LLaMA, LoRA, Multi-Agent, HuggingFace

PyCxsim

Open-source Python package for running multi-agent simulations with a real-time visual interface.

Python, Open Source, Multi-Agent, Simulation