Open-source software name: edl
Open-source software address: https://gitee.com/paddlepaddle/edl
Open-source software introduction:

Motivation
Computing resources on clouds such as Amazon AWS and Baidu Cloud are multi-tenant. Deep learning model training and inference with elastic resources will become common on the cloud. We propose Elastic Deep Learning (EDL), which makes training and inference of deep learning models on the cloud easier and more efficient.

EDL is now an incubation-stage project of the LF AI Foundation.

Installation
The EDL package supports Python 2.7/3.6/3.7. You can install it with Docker:
docker pull hub.baidubce.com/paddle-edl/paddle_edl:latest-cuda9.0-cudnn7
nvidia-docker run --name paddle_edl hub.baidubce.com/paddle-edl/paddle_edl:latest-cuda9.0-cudnn7 /bin/bash

Latest Release (0.3.1)
Quick start Demo
pip install paddle-serving-server-gpu
cd example/distill/resnet
wget --no-check-certificate https://paddle-edl.bj.bcebos.com/distill_teacher_model/ResNeXt101_32x16d_wsl_model.tar.gz
tar -zxf ResNeXt101_32x16d_wsl_model.tar.gz
python -m paddle_serving_server_gpu.serve \
    --model ResNeXt101_32x16d_wsl_model \
    --mem_optim \
    --port 9898 \
    --gpu_ids 1
python -m paddle.distributed.launch --selected_gpus 0 \
    ./train_with_fleet.py \
    --model=ResNet50_vd \
    --data_dir=./ImageNet \
    --use_distill_service=True \
    --distill_teachers=127.0.0.1:9898
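The student job expects the teacher service to be listening on 127.0.0.1:9898 already. As a minimal sketch (not part of EDL or Paddle Serving), the hypothetical helper `wait_for_port` below polls that address before training is launched. It assumes Python 3 and uses only the standard library; a successful connection shows that something is listening, not that the teacher model has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; it is not an EDL or Paddle Serving API.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Connecting only proves that a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: retry until the overall deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("127.0.0.1", 9898)` could be called in a small wrapper script, and the training command started only if it returns True.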
About Knowledge Distillation in EDL
Release 0.2.0

Checkpoint-based elastic training on multiple GPUs
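The general idea behind checkpoint-based elastic training is that a job saves its state periodically, so that a restarted job (for example after the number of pods changes) resumes from the last saved epoch instead of starting over. The sketch below illustrates only that idea with a toy state and a JSON file. It does not use EDL's or Paddle's checkpoint API, and every name in it (`load_checkpoint`, `save_checkpoint`, `train`, the `weights` field) is hypothetical.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if not os.path.exists(path):
        return {"epoch": 0, "weights": 0.0}
    with open(path) as f:
        return json.load(f)


def save_checkpoint(path, state):
    """Write to a temp file, then rename, so a crash cannot leave a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # atomic when both paths are on one filesystem


def train(path, total_epochs, stop_after=None):
    """Resume from the last checkpoint; stop_after mimics an interrupted job."""
    state = load_checkpoint(path)
    done = 0
    while state["epoch"] < total_epochs:
        if stop_after is not None and done >= stop_after:
            break
        state["weights"] += 1.0  # stand-in for one epoch of real training
        state["epoch"] += 1
        save_checkpoint(path, state)
        done += 1
    return state
```

Calling `train(path, 5, stop_after=2)` and then `train(path, 5)` runs two epochs, "restarts", and completes the remaining three rather than repeating the first two.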
ResNet50 experiments on a single machine in Docker
cd example/demo/collective
node_ips="127.0.0.1"
python -u -m paddle_edl.demo.collective.job_server_demo \
    --node_ips ${node_ips} \
    --pod_num_of_node 8 \
    --time_interval_to_change 900 \
    --gpu_num_of_node 8
# set the ImageNet data path
export PADDLE_EDL_IMAGENET_PATH=<your path>
# set the checkpoint path
export PADDLE_EDL_FLEET_CHECKPOINT_PATH=<your path>

mkdir -p resnet50_pod
unset http_proxy https_proxy

# running under edl
export PADDLE_RUNING_ENV=PADDLE_EDL
export PADDLE_JOB_ID="test_job_id_1234"
export PADDLE_POD_ID="not set"
python -u -m paddle_edl.demo.collective.job_client_demo \
    --log_level 20 \
    --package_sh ./resnet50/package.sh \
    --pod_path ./resnet50_pod \
    ./train_pretrain.sh
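The client demo is configured through several exported variables, and a missing one is easy to overlook. As a hedged sketch, the hypothetical helper `missing_vars` below reports which of them are unset or empty before the job is started. The variable names are copied from the commands above; treating all five as required is an assumption, and the helper is not part of EDL.

```python
import os

# Names copied from the commands above; the "required" set is an assumption.
REQUIRED_VARS = (
    "PADDLE_EDL_IMAGENET_PATH",
    "PADDLE_EDL_FLEET_CHECKPOINT_PATH",
    "PADDLE_RUNING_ENV",
    "PADDLE_JOB_ID",
    "PADDLE_POD_ID",
)


def missing_vars(environ=None):
    """Return the names in REQUIRED_VARS that are unset or empty.

    Hypothetical helper; pass a dict to check something other than os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

An empty list from `missing_vars()` means every listed variable has a non-empty value in the current environment.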
The whole example is here

Community
FAQ
License
Contribution