
DI-engine: OpenDILab Decision Intelligence Engine https://github.com/opendilab/DI-engine


Open-source software name:

DI-engine

Open-source software address:

https://gitee.com/opendilab/DI-engine

Open-source software introduction:



Updated on 2022.03.24 DI-engine-v0.3.0 (beta)

Introduction to DI-engine (beta)

DI-engine doc | Chinese documentation

DI-engine is a generalized decision intelligence engine. It supports various deep reinforcement learning algorithms (link):

  • Most basic DRL algorithms, such as DQN, PPO, SAC, R2D2
  • Multi-agent RL algorithms like QMIX, MAPPO
  • Imitation learning algorithms (BC/IRL/GAIL), such as GAIL, SQIL, Guided Cost Learning
  • Exploration algorithms like HER, RND, ICM
  • Offline RL algorithms: CQL, TD3BC
  • Model-based RL algorithms: MBPO

DI-engine aims to standardize different RL environments and applications. It also supports various training pipelines and customized decision AI applications.

DI-engine also includes system-level optimizations and design choices for efficient and robust large-scale RL training.

Have fun with exploration and exploitation.

Installation

You can simply install DI-engine from PyPI with the following command:

pip install DI-engine

If you use Anaconda or Miniconda, you can install DI-engine with conda from the opendilab channel through the following command:

conda install -c opendilab di-engine

For more information about installation, you can refer to installation.
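After either install command, it can be useful to confirm that the package is visible to the Python interpreter you intend to use. The following is a minimal sketch that relies only on the standard library; the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name used in the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "DI-engine" is the distribution name from the pip command above.
    version = installed_version("DI-engine")
    if version is None:
        print("DI-engine is not installed in this environment")
    else:
        print(f"DI-engine {version} is installed")
```

Running this inside the target virtualenv or conda environment shows which version, if any, that environment would import.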

Our Docker Hub repo can be found here. We provide a base image and env images with common RL environments:

  • base: opendilab/ding:nightly
  • atari: opendilab/ding:nightly-atari
  • mujoco: opendilab/ding:nightly-mujoco
  • smac: opendilab/ding:nightly-smac
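If you script your setup, the image tags listed above can be kept in one place and turned into `docker pull` commands. This is only an illustrative sketch: the mapping copies the tags from the list, while the names `DING_IMAGES` and `docker_pull_command` are hypothetical and not part of DI-engine.

```python
from typing import Dict, List

# Image tags copied from the list above; the dictionary name is hypothetical.
DING_IMAGES: Dict[str, str] = {
    "base": "opendilab/ding:nightly",
    "atari": "opendilab/ding:nightly-atari",
    "mujoco": "opendilab/ding:nightly-mujoco",
    "smac": "opendilab/ding:nightly-smac",
}


def docker_pull_command(image_key: str) -> List[str]:
    """Build the `docker pull` argument list for one of the listed images."""
    if image_key not in DING_IMAGES:
        raise ValueError(
            f"unknown image {image_key!r}; choose from {sorted(DING_IMAGES)}"
        )
    return ["docker", "pull", DING_IMAGES[image_key]]


if __name__ == "__main__":
    # Print the commands instead of running them, so this works without Docker.
    for key in DING_IMAGES:
        print(" ".join(docker_pull_command(key)))
```

The returned list can be passed to `subprocess.run` on a machine where Docker is installed.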

The detailed documentation is hosted on doc | Chinese documentation.

Quick Start

3 Minutes Kickoff

3 Minutes Kickoff (colab)

3 Minutes Kickoff, Chinese version (kaggle)

How to migrate a new RL Env | Chinese version

Bonus: train an RL agent with one line of code:

ding -m serial -e cartpole -p dqn -s 0
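The commands in this README follow one pattern, which can be assembled programmatically when launching several runs. The sketch below assumes, based only on how the flags are used in this document, that `-m` selects the pipeline mode, `-e` the environment, `-p` the policy, `-c` a config file and `-s` the random seed; check the CLI's own help output before relying on that reading. The function name `build_ding_command` is hypothetical.

```python
import shlex
from typing import List, Optional


def build_ding_command(
    mode: str,
    seed: int,
    *,
    env: Optional[str] = None,
    policy: Optional[str] = None,
    config: Optional[str] = None,
) -> List[str]:
    """Assemble a `ding` argument list in the order used in this README."""
    cmd = ["ding", "-m", mode]
    if config is not None:
        cmd += ["-c", config]  # assumed meaning: path to a config file
    if env is not None:
        cmd += ["-e", env]  # assumed meaning: environment name
    if policy is not None:
        cmd += ["-p", policy]  # assumed meaning: policy (algorithm) name
    cmd += ["-s", str(seed)]  # assumed meaning: random seed
    return cmd


if __name__ == "__main__":
    # Reproduce the one-line command above for three different seeds.
    for seed in range(3):
        print(shlex.join(build_ding_command("serial", seed, env="cartpole", policy="dqn")))
```

Building the argument list first and printing it keeps the sketch runnable without DI-engine installed; passing the list to `subprocess.run` would start the actual training.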

Supporters

↳ Stargazers

Stargazers repo roster for @opendilab/DI-engine

↳ Forkers

Forkers repo roster for @opendilab/DI-engine

Feature

Algorithm Versatility

| No | Algorithm | Label | Doc and Implementation | Runnable Demo |
|----|-----------|-------|------------------------|---------------|
| 1 | DQN | discrete | DQN doc, DQN Chinese doc, policy/dqn | python3 -u cartpole_dqn_main.py / ding -m serial -c cartpole_dqn_config.py -s 0 |
| 2 | C51 | discrete | policy/c51 | ding -m serial -c cartpole_c51_config.py -s 0 |
| 3 | QRDQN | discrete | policy/qrdqn | ding -m serial -c cartpole_qrdqn_config.py -s 0 |
| 4 | IQN | discrete | policy/iqn | ding -m serial -c cartpole_iqn_config.py -s 0 |
| 5 | Rainbow | discrete | policy/rainbow | ding -m serial -c cartpole_rainbow_config.py -s 0 |
| 6 | SQL | discrete, continuous | policy/sql | ding -m serial -c cartpole_sql_config.py -s 0 |
| 7 | R2D2 | dist, discrete | policy/r2d2 | ding -m serial -c cartpole_r2d2_config.py -s 0 |
| 8 | A2C | discrete | policy/a2c | ding -m serial -c cartpole_a2c_config.py -s 0 |
| 9 | PPO/MAPPO | discrete, continuous, MARL | policy/ppo | python3 -u cartpole_ppo_main.py / ding -m serial_onpolicy -c cartpole_ppo_config.py -s 0 |
| 10 | PPG | discrete | policy/ppg | python3 -u cartpole_ppg_main.py |
| 11 | ACER | discrete, continuous | policy/acer | ding -m serial -c cartpole_acer_config.py -s 0 |
| 12 | IMPALA | dist, discrete | policy/impala | ding -m serial -c cartpole_impala_config.py -s 0 |
| 13 | DDPG/PADDPG | continuous, hybrid | policy/ddpg | ding -m serial -c pendulum_ddpg_config.py -s 0 |
| 14 | TD3 | continuous, hybrid | policy/td3 | python3 -u pendulum_td3_main.py / ding -m serial -c pendulum_td3_config.py -s 0 |
| 15 | D4PG | continuous | policy/d4pg | python3 -u pendulum_d4pg_config.py |
| 16 | SAC | continuous | policy/sac | ding -m serial -c pendulum_sac_config.py -s 0 |
| 17 | PDQN | hybrid | policy/pdqn | ding -m serial -c gym_hybrid_pdqn_config.py -s 0 |
| 18 | MPDQN | hybrid | policy/pdqn | ding -m serial -c gym_hybrid_mpdqn_config.py -s 0 |
| 19 | HPPO | hybrid | policy/ppo | ding -m serial_onpolicy -c gym_hybrid_hppo_config.py -s 0 |
| 20 | QMIX | MARL | policy/qmix | ding -m serial -c smac_3s5z_qmix_config.py -s 0 |
| 21 | COMA | MARL | policy/coma | ding -m serial -c smac_3s5z_coma_config.py -s 0 |
| 22 | QTran | MARL | policy/qtran | ding -m serial -c smac_3s5z_qtran_config.py -s 0 |
| 23 | WQMIX | MARL | policy/wqmix | ding -m serial -c smac_3s5z_wqmix_config.py -s 0 |
| 24 | CollaQ | MARL | policy/collaq | ding -m serial -c smac_3s5z_collaq_config.py -s 0 |
| 25 | GAIL | IL | reward_model/gail | ding -m serial_gail -c cartpole_dqn_gail_config.py -s 0 |
| 26 | SQIL | IL | entry/sqil | ding -m serial_sqil -c cartpole_sqil_config.py -s 0 |
| 27 | DQFD | IL | policy/dqfd | ding -m serial_dqfd -c cartpole_dqfd_config.py -s 0 |
| 28 | R2D3 | IL | R2D3 Chinese doc, policy/r2d3 | python3 -u pong_r2d3_r2d2expert_config.py |
| 29 | Guided Cost Learning | IL | reward_model/guided_cost | python3 lunarlander_gcl_config.py |
| 30 | TREX | IL | reward_model/trex | python3 mujoco_trex_main.py |
| 31 | HER | exp | reward_model/her | python3 -u bitflip_her_dqn.py |
| 32 | RND | exp | reward_model/rnd | python3 -u cartpole_ppo_rnd_main.py |
| 33 | ICM | exp | ICM Chinese doc, reward_model/icm | python3 -u cartpole_ppo_icm_config.py |
| 34 | CQL | offline | policy/cql | python3 -u d4rl_cql_main.py |
| 35 | TD3BC | offline | policy/td3_bc | python3 -u mujoco_td3_bc_main.py |
| 36 | MBPO | mbrl | model/template/model_based/mbpo | python3 -u sac_halfcheetah_mopo_default_config.py |
| 37 | PER | other | worker/replay_buffer | rainbow demo |
| 38 | GAE | other | rl_utils/gae | ppo demo |

discrete means discrete action space, a label used only for the normal DRL algorithms (1-19)

continuous means continuous action space, a label used only for the normal DRL algorithms (1-19)

hybrid means hybrid (discrete + continuous) action space (1-19)

dist means distributed training (collector-learner parallel) RL algorithm

MARL means multi-agent RL algorithm

exp means RL algorithm which is related to exploration and sparse reward

IL means Imitation Learning, including Behaviour Cloning, Inverse RL, Adversarial Structured IL

offline means offline RL algorithm

mbrl means model-based RL algorithm

other means an algorithm from another sub-direction, usually used as a plug-in in the whole pipeline

P.S: The .py file in Runnable Demo can be found in dizoo
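The labels above work as tags, so one natural use is to look up which algorithms carry a given tag. The sketch below is purely illustrative: it copies a handful of rows from the table into a dictionary and filters them. The names `ALGORITHM_LABELS` and `algorithms_with_label` are hypothetical and are not part of DI-engine.

```python
from typing import Dict, FrozenSet, List

# A few rows copied from the table above: algorithm name -> set of labels.
ALGORITHM_LABELS: Dict[str, FrozenSet[str]] = {
    "DQN": frozenset({"discrete"}),
    "R2D2": frozenset({"dist", "discrete"}),
    "TD3": frozenset({"continuous", "hybrid"}),
    "QMIX": frozenset({"MARL"}),
    "GAIL": frozenset({"IL"}),
    "CQL": frozenset({"offline"}),
}


def algorithms_with_label(label: str) -> List[str]:
    """Return the names of all listed algorithms tagged with `label`, sorted."""
    return sorted(
        name for name, labels in ALGORITHM_LABELS.items() if label in labels
    )


if __name__ == "__main__":
    for tag in ("discrete", "hybrid", "offline"):
        print(tag, "->", algorithms_with_label(tag))
```

An algorithm with several labels, such as TD3 here, appears under each of them.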

Environment Versatility

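| No | Environment | Label | Visualization | Code and Doc Links |
|----|-------------|-------|---------------|--------------------|
| 1 | atari | discrete | (image) | code link, env tutorial, Chinese env guide |
| 2 | box2d/bipedalwalker | continuous | (image) | dizoo link, Chinese env guide |
| 3 | box2d/lunarlander | discrete | (image) | dizoo link, Chinese env guide |
| 4 | classic_control/cartpole | discrete | (image) | dizoo link, Chinese env guide |
| 5 | classic_control/pendulum | continuous | (image) | dizoo link, Chinese env guide |
| 6 | competitive_rl | discrete, selfplay | (image) | dizoo link, Chinese env guide |
| 7 | gfootball | discrete, sparse, selfplay | (image) | dizoo link, Chinese env guide |
| 8 | minigrid | discrete, sparse | (image) | dizoo link, Chinese env guide |
| 9 | mujoco | continuous | (image) | dizoo link, Chinese env guide |
| 10 | PettingZoo | discrete, continuous, marl | (image) | dizoo link, Chinese env guide |
| 11 | overcooked | discrete, marl | (image) | dizoo link, env tutorial |
| 12 | procgen | discrete | (image) | dizoo link, Chinese env guide |
| 13 | pybullet | continuous | (image) | dizoo link, Chinese env guide |
| 14 | smac | discrete, marl, selfplay, sparse | (image) | dizoo link, Chinese env guide |
| 15 | d4rl | offline | (image) | dizoo link, Chinese env guide |
| 16 | league_demo | discrete, selfplay | (image) | dizoo link |
| 17 | pomdp atari | discrete | | dizoo link |
| 18 | bsuite | discrete | (image) | dizoo link, env tutorial |
| 19 | ImageNet | IL | (image) | dizoo link, Chinese env guide |
| 20 | slime_volleyball | discrete, selfplay | (image) | dizoo link, env tutorial, Chinese env guide |
| 21 | gym_hybrid | hybrid | (image) | dizoo link, Chinese env guide |
| 22 | GoBigger | hybrid, marl, selfplay | (image) | opendilab link, env tutorial, Chinese env guide |
| 23 | gym_soccer | hybrid | (image) | dizoo link, Chinese env guide |
| 24 | multiagent_mujoco | continuous, marl | (image) | dizoo link, Chinese env guide |
| 25 | bitflip | discrete, sparse | (image) | dizoo link, Chinese env guide |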

discrete means discrete action space

continuous means continuous action space

hybrid means hybrid (discrete + continuous) action space

MARL means multi-agent RL environment

sparse means environment which is related to exploration and sparse reward

offline means offline RL environment

IL means Imitation Learning or Supervised Learning Dataset

selfplay means environment that allows agent VS agent battle

P.S. Some environments in Atari, such as MontezumaRevenge, are also of the sparse reward type.

Feedback and Contribution

We appreciate all feedback and contributions that improve DI-engine, in both algorithms and system design. CONTRIBUTING.md provides the necessary information.

Citation

@misc{ding,
    title={{DI-engine: OpenDILab} Decision Intelligence Engine},
    author={DI-engine Contributors},
    publisher = {GitHub},
    howpublished = {\url{https://github.com/opendilab/DI-engine}},
    year={2021},
}

License

DI-engine is released under the Apache 2.0 license.

