
gym-super-mario-bros: An OpenAI Gym environment for Super Mario Bros. & Supe ...


Open-source software name:

gym-super-mario-bros

Open-source software URL:

https://gitee.com/devilmaycry812839668/gym-super-mario-bros

Open-source software description:

gym-super-mario-bros


An OpenAI Gym environment for Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on The Nintendo Entertainment System (NES) using the nes-py emulator.

Installation

The preferred installation of gym-super-mario-bros is from pip:

pip install gym-super-mario-bros

Usage

Python

You must import gym_super_mario_bros before trying to make an environment. This is because gym environments are registered at runtime. By default, gym_super_mario_bros environments use the full NES action space of 256 discrete actions. To constrain this, gym_super_mario_bros.actions provides three action lists (RIGHT_ONLY, SIMPLE_MOVEMENT, and COMPLEX_MOVEMENT) for the nes_py.wrappers.JoypadSpace wrapper. See gym_super_mario_bros/actions.py for a breakdown of the legal actions in each of these three lists.

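from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

# Create the environment and restrict it to the SIMPLE_MOVEMENT action list.
env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

# Start with done = True so the environment is reset on the first iteration.
done = True
for step in range(5000):
    if done:
        state = env.reset()
    # Take a uniformly random action from the restricted action space.
    state, reward, done, info = env.step(env.action_space.sample())
    env.render()

env.close()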

NOTE: gym_super_mario_bros.make is just an alias to gym.make for convenience.

NOTE: remove calls to render in training code for a nontrivial speedup.

Command Line

gym_super_mario_bros features a command line interface for playing environments using either the keyboard or uniform random movement.

gym_super_mario_bros -e <the environment ID to play> -m <`human` or `random`>

NOTE: by default, -e is set to SuperMarioBros-v0 and -m is set to human.

Environments

These environments allow 3 attempts (lives) to make it through the 32 stages in the game. The environments only send reward-able game-play frames to agents; no cut-scenes, loading screens, etc. are sent from the NES emulator to an agent, nor can an agent perform actions during these instances. If a cut-scene cannot be skipped by hacking the NES's RAM, the environment will lock the Python process until the emulator is ready for the next action.

| Environment        | Game | ROM        |
|--------------------|------|------------|
| SuperMarioBros-v0  | SMB  | standard   |
| SuperMarioBros-v1  | SMB  | downsample |
| SuperMarioBros-v2  | SMB  | pixel      |
| SuperMarioBros-v3  | SMB  | rectangle  |
| SuperMarioBros2-v0 | SMB2 | standard   |
| SuperMarioBros2-v1 | SMB2 | downsample |

Individual Stages

These environments allow a single attempt (life) to make it through a single stage of the game.

Use the template

SuperMarioBros-<world>-<stage>-v<version>

where:

  • <world> is a number in {1, 2, 3, 4, 5, 6, 7, 8} indicating the world
  • <stage> is a number in {1, 2, 3, 4} indicating the stage within a world
  • <version> is a number in {0, 1, 2, 3} specifying the ROM mode to use
    • 0: standard ROM
    • 1: downsampled ROM
    • 2: pixel ROM
    • 3: rectangle ROM

For example, to play 4-2 on the downsampled ROM, you would use the environment id SuperMarioBros-4-2-v1.
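The template above can be turned into a small helper that validates its inputs before building the id. This is only a sketch: `stage_env_id` is a hypothetical name and is not part of gym_super_mario_bros; the accepted ranges are taken from the list above.

```python
def stage_env_id(world: int, stage: int, version: int = 0) -> str:
    """Build an id of the form SuperMarioBros-<world>-<stage>-v<version>.

    Hypothetical helper for illustration; not provided by the package.
    """
    if world not in range(1, 9):
        raise ValueError(f"world must be in 1..8, got {world}")
    if stage not in range(1, 5):
        raise ValueError(f"stage must be in 1..4, got {stage}")
    if version not in range(0, 4):
        raise ValueError(f"version must be in 0..3, got {version}")
    return f"SuperMarioBros-{world}-{stage}-v{version}"


# World 4, stage 2, downsampled ROM (version 1).
print(stage_env_id(4, 2, 1))  # SuperMarioBros-4-2-v1
```

The resulting string would then be passed to gym_super_mario_bros.make in the same way as SuperMarioBros-v0 in the Usage section.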

Random Stage Selection

The random stage selection environment randomly selects a stage and allows a single attempt to clear it. Upon a death and subsequent call to reset, the environment randomly selects a new stage. This is only available for the standard Super Mario Bros. game, not Lost Levels (at the moment). To use these environments, append RandomStages to the SuperMarioBros id. For example, to use the standard ROM with random stage selection use SuperMarioBrosRandomStages-v0. To seed the random stage selection use the seed method of the env, i.e., env.seed(1), before any calls to reset.

Step

This section describes the reward and the info dictionary returned by the step method.

Reward Function

The reward function assumes the objective of the game is to move as far right as possible (increase the agent's x value), as fast as possible, without dying. To model this game, three separate variables compose the reward:

  1. v: the difference in agent x values between states
    • in this case this is instantaneous velocity for the given step
    • v = x1 - x0
      • x0 is the x position before the step
      • x1 is the x position after the step
    • moving right ⇔ v > 0
    • moving left ⇔ v < 0
    • not moving ⇔ v = 0
  2. c: the difference in the game clock between frames
    • the penalty prevents the agent from standing still
    • c = c1 - c0
      • c0 is the clock reading before the step
      • c1 is the clock reading after the step
    • no clock tick ⇔ c = 0
    • clock tick ⇔ c < 0
  3. d: a death penalty that penalizes the agent for dying in a state
    • this penalty encourages the agent to avoid death
    • alive ⇔ d = 0
    • dead ⇔ d = -15

r = v + c + d

The reward is clipped into the range (-15, 15).
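The reward computation described above can be sketched in plain Python. This is an illustration under stated assumptions, not the environment's actual implementation: `step_reward` is a hypothetical name, the clock is assumed to count down so the clock term is taken as the reading after the step minus the reading before it (zero or negative), and the clipping bounds are treated as inclusive.

```python
DEATH_PENALTY = -15               # d when the agent dies
REWARD_MIN, REWARD_MAX = -15, 15  # clipping range, assumed inclusive


def step_reward(x0: int, x1: int, c0: int, c1: int, died: bool) -> int:
    """Illustrative reward r = v + c + d, clipped to the reward range.

    x0, x1: x position before and after the step
    c0, c1: clock reading before and after the step
    died:   whether the agent died during the step
    """
    v = x1 - x0                       # > 0 when moving right, < 0 when moving left
    c = c1 - c0                       # <= 0, assuming the clock only counts down
    d = DEATH_PENALTY if died else 0  # death penalty
    r = v + c + d
    return max(REWARD_MIN, min(REWARD_MAX, r))


print(step_reward(40, 43, 400, 400, False))  # 3: moved right, no clock tick
print(step_reward(40, 40, 400, 399, False))  # -1: stood still during a clock tick
print(step_reward(40, 42, 400, 400, True))   # -13: moved right by 2, then died
```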

info dictionary

The info dictionary returned by the step method contains the following keys:

| Key      | Type | Description                                           |
|----------|------|-------------------------------------------------------|
| coins    | int  | The number of collected coins                         |
| flag_get | bool | True if Mario reached a flag or ax                    |
| life     | int  | The number of lives left, i.e., {3, 2, 1}             |
| score    | int  | The cumulative in-game score                          |
| stage    | int  | The current stage, i.e., {1, ..., 4}                  |
| status   | str  | Mario's status, i.e., {'small', 'tall', 'fireball'}   |
| time     | int  | The time left on the clock                            |
| world    | int  | The current world, i.e., {1, ..., 8}                  |
| x_pos    | int  | Mario's x position in the stage (from the left)       |
| y_pos    | int  | Mario's y position in the stage (from the bottom)     |
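As a rough illustration of how these keys might be consumed, the sketch below formats an info dictionary into a one-line progress summary, for example for logging during training. The function name `describe_progress` and the sample values are hypothetical; only the key names and types come from the table above.

```python
def describe_progress(info: dict) -> str:
    """Format the documented info keys into a single log line."""
    outcome = "cleared" if info["flag_get"] else "in progress"
    return (
        f"World {info['world']}-{info['stage']} ({outcome}): "
        f"x={info['x_pos']}, time={info['time']}, "
        f"coins={info['coins']}, score={info['score']}, "
        f"lives={info['life']}, status={info['status']}"
    )


# Made-up values for illustration; a real dictionary comes from env.step(...).
sample_info = {
    "coins": 3, "flag_get": False, "life": 2, "score": 1200,
    "stage": 1, "status": "small", "time": 350, "world": 1,
    "x_pos": 512, "y_pos": 79,
}
print(describe_progress(sample_info))
```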

Citation

Please cite gym-super-mario-bros if you use it in your research.

@misc{gym-super-mario-bros,
  author = {Christian Kauten},
  howpublished = {GitHub},
  title = {{S}uper {M}ario {B}ros for {O}pen{AI} {G}ym},
  URL = {https://github.com/Kautenja/gym-super-mario-bros},
  year = {2018},
}
