Open-source software name: gym-super-mario-bros

Open-source software URL: https://gitee.com/devilmaycry812839668/gym-super-mario-bros

Open-source software description:

gym-super-mario-bros

An OpenAI Gym environment for Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on The Nintendo Entertainment System (NES) using the nes-py emulator.

Installation

The preferred installation of gym-super-mario-bros is from pip:

    pip install gym-super-mario-bros

Usage

Python

You must import `gym_super_mario_bros` before trying to make an environment:

    from nes_py.wrappers import JoypadSpace
    import gym_super_mario_bros
    from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

    # Create the environment and restrict it to the SIMPLE_MOVEMENT action list
    env = gym_super_mario_bros.make('SuperMarioBros-v0')
    env = JoypadSpace(env, SIMPLE_MOVEMENT)

    done = True
    for step in range(5000):
        # Start a new episode whenever the previous one has ended
        if done:
            state = env.reset()
        # Take a random action
        state, reward, done, info = env.step(env.action_space.sample())
        env.render()

    env.close()

NOTE: remove calls to `render` in training code for a nontrivial speedup.

Command Line
    gym_super_mario_bros -e <the environment ID to play> -m <`human` or `random`>

NOTE: by default,

Environments

These environments allow 3 attempts (lives) to make it through the 32 stages in the game. The environments only send reward-able game-play frames to agents; no cut-scenes, loading screens, etc. are sent from the NES emulator to an agent, nor can an agent perform actions during these instances. If a cut-scene cannot be skipped by hacking the NES's RAM, the environment will lock the Python process until the emulator is ready for the next action.
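
The multi-life behaviour described above can be exercised with a small episode loop. The following is a minimal sketch, not part of the library: `run_episode` and `choose_action` are hypothetical names, and the loop assumes the classic Gym API from the usage snippet, in which `env.step(action)` returns `(state, reward, done, info)`.

```python
def run_episode(env, choose_action, max_steps=5000):
    """Play one episode and return (total_reward, last_info).

    Assumes env.reset() returns the initial state and env.step(action)
    returns (state, reward, done, info), as in the usage snippet.
    """
    state = env.reset()
    total_reward = 0.0
    info = {}
    for _ in range(max_steps):
        state, reward, done, info = env.step(choose_action(state))
        total_reward += reward
        if done:
            # The environment reports the episode as over; stop stepping.
            break
    return total_reward, info
```

With the wrapped `env` from the usage snippet, `run_episode(env, lambda state: env.action_space.sample())` would play one episode with random actions.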
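Individual Stages

These environments allow a single attempt (life) to make it through a single stage of the game. Use the template `SuperMarioBros-<world>-<stage>-v<version>`, where `<world>` is a number in {1, ..., 8}, `<stage>` is a number in {1, ..., 4}, and `<version>` identifies the ROM variant.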
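
The template can be filled in programmatically. The helper below is a hypothetical sketch (`stage_env_id` is not part of the library): the world and stage ranges follow the `info` table in this document, and because the valid version numbers are not listed here, the helper only assumes that a version is a non-negative integer.

```python
def stage_env_id(world, stage, version):
    """Build an ID of the form SuperMarioBros-<world>-<stage>-v<version>."""
    if world not in range(1, 9):
        raise ValueError("world must be in 1..8")
    if stage not in range(1, 5):
        raise ValueError("stage must be in 1..4")
    if not isinstance(version, int) or version < 0:
        # Assumption: versions are non-negative integers.
        raise ValueError("version must be a non-negative integer")
    return f"SuperMarioBros-{world}-{stage}-v{version}"
```

The resulting string would then be passed to `gym_super_mario_bros.make`, in the same way as `'SuperMarioBros-v0'` in the usage snippet.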
For example, to play 4-2 on the downsampled ROM, you would use an environment id of the form `SuperMarioBros-4-2-v<version>`, where `<version>` is the version number of the downsampled ROM.

Random Stage Selection

The random stage selection environment randomly selects a stage and allows a single attempt to clear it. Upon a death and subsequent call to `reset`, the environment randomly selects a new stage.

Step

Info about the rewards and info returned by the `step` method.

Reward Function

The reward function assumes the objective of the game is to move as far right as possible (increase the agent's x value), as fast as possible, without dying. To model this game, three separate variables, v, c, and d, compose the reward:

    r = v + c + d

The reward is clipped into the range (-15, 15).
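
As an illustration of how the three terms might combine, here is a hedged sketch. The definitions of v, c, and d are not given in this text, so the code assumes that v is the change in x position over a step, c is the (non-positive) change in the game clock, and d is a death penalty of -15; the function name `mario_reward` is hypothetical, and the clipping bounds are treated as inclusive.

```python
def mario_reward(x_before, x_after, clock_before, clock_after, died):
    """Sketch of r = v + c + d, clipped to [-15, 15], under stated assumptions."""
    v = x_after - x_before          # assumed: rightward progress during the step
    c = clock_after - clock_before  # assumed: <= 0, penalises time passing
    d = -15 if died else 0          # assumed: penalty applied on death
    r = v + c + d
    return max(-15, min(15, r))
```

Under these assumptions, moving 3 pixels right while the clock ticks once gives `mario_reward(40, 43, 400, 399, False) == 2`.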
The `info` dictionary returned by `step` contains the following keys:

| Key | Type | Description |
|---|---|---|
| coins | int | The number of collected coins |
| flag_get | bool | True if Mario reached a flag or ax |
| life | int | The number of lives left, i.e., {3, 2, 1} |
| score | int | The cumulative in-game score |
| stage | int | The current stage, i.e., {1, ..., 4} |
| status | str | Mario's status, i.e., {'small', 'tall', 'fireball'} |
| time | int | The time left on the clock |
| world | int | The current world, i.e., {1, ..., 8} |
| x_pos | int | Mario's x position in the stage (from the left) |
| y_pos | int | Mario's y position in the stage (from the bottom) |
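
The keys above can be read like any Python dictionary. The following sketch formats them into a one-line progress summary, for example for logging during training; `describe_info` is a hypothetical helper, not part of the library, and it assumes every key in the table is present.

```python
def describe_info(info):
    """Return a one-line summary built from the keys in the table above."""
    outcome = "cleared" if info["flag_get"] else "in progress"
    return (
        f"World {info['world']}-{info['stage']} ({outcome}): "
        f"x={info['x_pos']}, y={info['y_pos']}, time={info['time']}, "
        f"lives={info['life']}, coins={info['coins']}, "
        f"score={info['score']}, status={info['status']}"
    )
```

In the usage loop, this could be called as `print(describe_info(info))` after each `env.step`.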
Please cite gym-super-mario-bros if you use it in your research.

    @misc{gym-super-mario-bros,
      author = {Christian Kauten},
      howpublished = {GitHub},
      title = {{S}uper {M}ario {B}ros for {O}pen{AI} {G}ym},
      URL = {https://github.com/Kautenja/gym-super-mario-bros},
      year = {2018},
    }