Problem Description

The RL code implements a Gymnasium-compatible environment (DoomEnv) for training reinforcement-learning agents in DOOM Retro. It captures game state (observations) via shared memory or screen capture, sends actions to control the game, and computes rewards. The catch: the environment relies on DOOM Retro running in the background, and without it, it falls back to capturing the full monitor (sketched below), which may not reflect the actual game state. The RL training and evaluation scripts also need refinement for stability and performance.
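As a rough illustration of that fallback, here is a minimal full-monitor capture using mss (listed as a dependency in the setup below). The actual FrameCache internals aren't shown in this post, so treat this as a sketch of one plausible implementation:

```python
import numpy as np
import mss

def capture_full_monitor():
    """Sketch of a full-monitor fallback; FrameCache may differ internally."""
    with mss.mss() as sct:
        monitor = sct.monitors[1]            # monitor 1 = primary display
        shot = sct.grab(monitor)             # raw BGRA screenshot
        frame = np.asarray(shot)[:, :, :3]   # drop the alpha channel
        return frame[:, :, ::-1]             # BGR -> RGB
```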
Key challenges:
- Dependency on the DOOM Retro process.
- Shared-memory integration for real-time state.
- Action-space design and reward shaping (see the sketch below).
- Training convergence and evaluation metrics.

This is WIP code, so bugs and incomplete features are expected.
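For the reward-shaping challenge, a minimal sketch of a delta-based RewardManager follows. The weights are illustrative assumptions, not values from the actual repo:

```python
class RewardManager:
    """Hypothetical delta-based shaping; coefficients are assumptions."""

    def calculate_reward(self, health, prev_health, ammo, prev_ammo, kills, prev_kills):
        reward = 0.0
        reward += 1.0 * (kills - prev_kills)        # big bonus for each new kill
        reward += 0.01 * (health - prev_health)     # penalize damage, reward health pickups
        reward -= 0.005 * max(prev_ammo - ammo, 0)  # small cost per shot fired
        reward -= 0.001                             # per-step cost to discourage stalling
        return reward
```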
Here's the core DoomEnv class (from doom_env.py). It's a Gymnasium environment that interfaces with DOOM Retro.
```python
import gymnasium as gym
from gymnasium import spaces
import numpy as np
import time

from controller.doom_controller import DoomController
from observation.observation_builder import ObservationBuilder
from rewards.reward_manager import RewardManager
from utils.shm_reader import ShmReader
from frame_cache import FrameCache


class DoomEnv(gym.Env):
    def __init__(self):
        super().__init__()
        self.action_space = spaces.Discrete(8)  # Example: 8 actions (move, turn, shoot, etc.)
        self.observation_space = spaces.Box(low=0, high=255, shape=(84, 84, 3), dtype=np.uint8)
        self.controller = DoomController()
        self.obs_builder = ObservationBuilder()
        self.reward_manager = RewardManager()
        self.shm_reader = ShmReader()
        self.frame_cache = FrameCache()
        self.prev_health = 100
        self.prev_ammo = 50
        self.prev_kills = 0

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.controller.reset_game()
        obs = self._get_observation()
        info = {}
        return obs, info

    def step(self, action):
        self.controller.send_action(action)
        time.sleep(0.1)  # Frame delay
        obs = self._get_observation()
        reward = self._calculate_reward()
        done = self._is_done()
        truncated = False
        info = {}
        return obs, reward, done, truncated, info

    def _get_observation(self):
        # Try shared memory first
        if self.shm_reader.is_available():
            frame = self.shm_reader.get_frame()
        else:
            # Fallback to screen capture
            frame = self.frame_cache.capture_frame()
        # Process frame (resize, grayscale, etc.)
        processed = self.obs_builder.build_observation(frame)
        return processed

    def _calculate_reward(self):
        # Without shared memory, fall back to neutral defaults (reward stays flat)
        available = self.shm_reader.is_available()
        current_health = self.shm_reader.get_health() if available else 100
        current_ammo = self.shm_reader.get_ammo() if available else 50
        current_kills = self.shm_reader.get_kills() if available else 0
        reward = self.reward_manager.calculate_reward(
            current_health, self.prev_health,
            current_ammo, self.prev_ammo,
            current_kills, self.prev_kills,
        )
        self.prev_health = current_health
        self.prev_ammo = current_ammo
        self.prev_kills = current_kills
        return reward

    def _is_done(self):
        return self.shm_reader.get_health() <= 0 if self.shm_reader.is_available() else False

    def close(self):
        self.controller.close()
```

And the smoke test (scripts/test_env.py) that exercises one reset() and one step():

```python
import sys
from os.path import dirname, abspath

# Make the project root importable when running from scripts/
sys.path.insert(0, dirname(dirname(abspath(__file__))))

from env.doom_env import DoomEnv

env = DoomEnv()
obs, _ = env.reset()
print(obs.shape)
obs, reward, done, truncated, info = env.step(0)
print("step working")
```
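The ShmReader is the other key piece, but its layout for /dev/shm/doomretro_rl isn't shown in this code. The sketch below assumes a hypothetical format of three little-endian int32 fields (health, ammo, kills) followed by a raw 84x84x3 frame; adjust it to whatever DOOM Retro's RL patch actually writes:

```python
import mmap
import os

import numpy as np

SHM_PATH = "/dev/shm/doomretro_rl"  # path from the setup section below

class ShmReader:
    """Hypothetical layout: 3 x int32 (health, ammo, kills), then a raw frame."""
    HEADER_BYTES = 3 * 4
    FRAME_SHAPE = (84, 84, 3)  # assumed; must match what the game writes

    def is_available(self):
        return os.path.exists(SHM_PATH)

    def _snapshot(self):
        # Map the whole file and take a copy so values stay consistent
        with open(SHM_PATH, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return bytes(m)

    def _stats(self):
        return np.frombuffer(self._snapshot(), dtype="<i4", count=3)

    def get_health(self):
        return int(self._stats()[0])

    def get_ammo(self):
        return int(self._stats()[1])

    def get_kills(self):
        return int(self._stats()[2])

    def get_frame(self):
        pixels = np.frombuffer(self._snapshot()[self.HEADER_BYTES:], dtype=np.uint8,
                               count=int(np.prod(self.FRAME_SHAPE)))
        return pixels.reshape(self.FRAME_SHAPE)
```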
Required Data/Setup

- DOOM Retro binary: built from the main repo (requires SDL2, CMake).
- IWAD file: e.g., freedoom1.wad or doom.wad.
- Python environment: virtualenv with gymnasium, stable-baselines3, numpy, mss, pynput, torch, pillow (see the import check after this list).
- Shared memory: DOOM Retro must be running with RL support (writes to /dev/shm/doomretro_rl).
- No external data files are needed for basic testing, but training uses trajectories/clips.
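A quick way to confirm the Python environment matches the list above (note that pillow imports as PIL):

```python
import importlib

# Package names from the setup list; pillow's import name is PIL
for pkg in ("gymnasium", "stable_baselines3", "numpy", "mss", "pynput", "torch", "PIL"):
    importlib.import_module(pkg)
    print(f"{pkg}: OK")
```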
To run DOOM Retro:

```
./build/doomretro -iwad /path/to/doom.wad
```
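Since the env currently assumes DOOM Retro is already running, one option (an untested sketch, not part of the current code) is to launch and own the process from Python, e.g. in DoomEnv.__init__/close:

```python
import subprocess

# Sketch: launch DOOM Retro so the env does not depend on a manually
# started process; paths are placeholders from the command above.
proc = subprocess.Popen(["./build/doomretro", "-iwad", "/path/to/doom.wad"])
# ... run episodes ...
proc.terminate()
proc.wait(timeout=5)
```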
Obtained Output

Running python scripts/test_env.py without DOOM Retro active (so the env falls back to monitor capture):
```
Connecting to existing DOOM window...
DOOM Retro window not found!
WARNING: DOOM window not found. Is DOOM Retro running?
DOOM Retro window not found!
[frame_cache] DOOM window not found — capturing full monitor.
(84, 84, 12)
step working
```
Interpreting the output:

- obs.shape: (84, 84, 12): the observation is a resized frame with 12 channels, likely several stacked RGB frames (e.g., 4 x 3 = 12). Note that this does not match the declared observation_space of (84, 84, 3), which RL libraries will reject.
- "step working": the step() method executed without error.
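If the 12 channels really are four stacked RGB frames (an assumption), the declared space should be updated to match what _get_observation returns, e.g.:

```python
import numpy as np
from gymnasium import spaces

# Assumption: ObservationBuilder stacks 4 RGB frames (4 * 3 = 12 channels).
# The declared space must match the actual observations.
observation_space = spaces.Box(low=0, high=255, shape=(84, 84, 12), dtype=np.uint8)
assert observation_space.contains(np.zeros((84, 84, 12), dtype=np.uint8))
```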
Expected Output

With DOOM Retro running:

- obs.shape of (84, 84, 3) or similar (a processed game frame).
- Rewards calculated from changes in health/ammo/kills.
- Shared memory used instead of monitor capture.
- No warnings about the DOOM window not being found.
- Full episodes running with proper done/truncated flags.

The environment should then integrate seamlessly with RL libraries like Stable Baselines3 for training agents that play DOOM; a minimal wiring sketch is below.
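A minimal Stable Baselines3 sketch, assuming the observation-space mismatch above is fixed first (hyperparameters are illustrative defaults, not tuned values):

```python
from stable_baselines3 import PPO
from stable_baselines3.common.env_checker import check_env

from env.doom_env import DoomEnv

env = DoomEnv()
check_env(env)  # catches API issues, e.g. observations not matching observation_space

model = PPO("CnnPolicy", env, verbose=1)  # CnnPolicy suits image observations
model.learn(total_timesteps=100_000)
model.save("doom_ppo")
```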
