
GraphVite: GraphVite is a general graph embedding engine for high-speed, large-scale embedding ...


Open-source software name:

GraphVite

Open-source software URL:

https://gitee.com/mirrors/GraphVite

Open-source software introduction:


GraphVite - graph embedding at high speed and large scale


Docs | Tutorials | Benchmarks | Pre-trained Models

GraphVite is a general graph embedding engine, dedicated to high-speed and large-scale embedding learning in various applications.

GraphVite provides complete training and evaluation pipelines for 3 applications: node embedding, knowledge graph embedding, and graph & high-dimensional data visualization. It also includes 9 popular models, along with their benchmarks on a range of standard datasets.

Node Embedding | Knowledge Graph Embedding | Graph & High-dimensional Data Visualization

Here is a summary of the training time of GraphVite alongside the best open-source implementations for the 3 applications. All times are measured on a server with 24 CPU threads and 4 V100 GPUs.

Training time of node embedding on Youtube dataset.

Model    | Existing Implementation | GraphVite | Speedup
DeepWalk | 1.64 hrs (CPU parallel) | 1.19 mins | 82.9x
LINE     | 1.39 hrs (CPU parallel) | 1.17 mins | 71.4x
node2vec | 24.4 hrs (CPU parallel) | 4.39 mins | 334x

Training / evaluation time of knowledge graph embedding on FB15k dataset.

Model  | Existing Implementation       | GraphVite          | Speedup
TransE | 1.31 hrs / 1.75 mins (1 GPU)  | 13.5 mins / 54.3 s | 5.82x / 1.93x
RotatE | 3.69 hrs / 4.19 mins (1 GPU)  | 28.1 mins / 55.8 s | 7.88x / 4.50x

Training time of high-dimensional data visualization on MNIST dataset.

Model    | Existing Implementation  | GraphVite | Speedup
LargeVis | 15.3 mins (CPU parallel) | 13.9 s    | 66.8x
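
The speedup column in the tables above is the existing implementation's time divided by GraphVite's time. The following is a minimal sketch of that arithmetic, using a few rows copied from the tables; the helper names are illustrative and not part of GraphVite. Because the published times are rounded, the recomputed values can differ slightly from the reported ones.

```python
# Unit conversion factors to seconds.
SECONDS = {"hrs": 3600.0, "mins": 60.0, "s": 1.0}

def to_seconds(value, unit):
    """Convert a (value, unit) pair such as (1.64, "hrs") to seconds."""
    return value * SECONDS[unit]

def speedup(existing, graphvite):
    """Speedup = existing implementation time / GraphVite time."""
    return to_seconds(*existing) / to_seconds(*graphvite)

# (existing time, GraphVite time, reported speedup), copied from the tables.
rows = {
    "DeepWalk": ((1.64, "hrs"), (1.19, "mins"), 82.9),
    "LINE": ((1.39, "hrs"), (1.17, "mins"), 71.4),
    "node2vec": ((24.4, "hrs"), (4.39, "mins"), 334.0),
    "LargeVis": ((15.3, "mins"), (13.9, "s"), 66.8),
}

for model, (existing, graphvite, reported) in rows.items():
    ratio = speedup(existing, graphvite)
    print(f"{model}: recomputed {ratio:.1f}x, reported {reported}x")
```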

Requirements

Generally, GraphVite works on any Linux distribution with CUDA >= 9.2.

The library is compatible with Python 2.7 and 3.6/3.7.

Installation

From Conda

conda install -c milagraph -c conda-forge graphvite cudatoolkit=$(nvcc -V | grep -Po "(?<=V)\d+\.\d+")

If you only need embedding training without evaluation, you can use the following alternative with minimal dependencies.

conda install -c milagraph -c conda-forge graphvite-mini cudatoolkit=$(nvcc -V | grep -Po "(?<=V)\d+\.\d+")
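
Both commands fill in the cudatoolkit version by extracting the "major.minor" number that follows the letter V in the output of nvcc -V. As a rough illustration of what that grep expression selects, here is the same lookbehind pattern in Python, with the dot escaped so that it matches a literal period. The function name and the sample text are assumptions for illustration; the exact nvcc output varies between CUDA releases.

```python
import re

def cuda_toolkit_version(nvcc_output):
    """Return the "major.minor" CUDA version found in `nvcc -V` output, or None."""
    # Digits, a literal dot, digits, immediately preceded by the letter "V".
    match = re.search(r"(?<=V)\d+\.\d+", nvcc_output)
    return match.group(0) if match else None

# Illustrative sample of `nvcc -V` output (not taken from a real run).
sample = """nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 10.1, V10.1.243
"""

print(cuda_toolkit_version(sample))  # 10.1
```

With this sample, the conda command would receive cudatoolkit=10.1.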

From Source

Before installation, make sure you have conda installed.

git clone https://github.com/DeepGraphLearning/graphvite
cd graphvite
conda install -y --file conda/requirements.txt
mkdir build
cd build && cmake .. && make && cd -
cd python && python setup.py install && cd -

On Colab

!wget -c https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
!chmod +x Miniconda3-latest-Linux-x86_64.sh
!./Miniconda3-latest-Linux-x86_64.sh -b -p /usr/local -f
!conda install -y -c milagraph -c conda-forge graphvite \
    python=3.6 cudatoolkit=$(nvcc -V | grep -Po "(?<=V)\d+\.\d+")
!conda install -y wurlitzer ipykernel

import site
site.addsitedir("/usr/local/lib/python3.6/site-packages")
%reload_ext wurlitzer

Quick Start

Here is a quick-start example of the node embedding application.

graphvite baseline quick start

Typically, the example takes no more than 1 minute. You will obtain some output like

Batch id: 6000
loss = 0.371041

------------- link prediction --------------
AUC: 0.899933

----------- node classification ------------
macro-F1@20%: 0.242114
micro-F1@20%: 0.391342

Baseline Benchmark

To reproduce a baseline benchmark, you only need to specify the keywords of the experiment, e.g. the model and dataset.

graphvite baseline [keyword ...] [--no-eval] [--gpu n] [--cpu m] [--epoch e]

You may also set the number of GPUs and the number of CPUs per GPU.

Use graphvite list to get a list of available baselines.
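
If you launch baselines from a script, the usage line above maps onto an argument list in a straightforward way. The sketch below only assembles that list from the flags shown in this section; baseline_command is a hypothetical helper, not part of GraphVite, and the keywords are the ones from the quick-start example.

```python
def baseline_command(keywords, no_eval=False, gpu=None, cpu=None, epoch=None):
    """Build the argument list for `graphvite baseline` (hypothetical helper)."""
    args = ["graphvite", "baseline", *keywords]
    if no_eval:
        args.append("--no-eval")
    # Optional numeric flags are only added when a value is given.
    for flag, value in (("--gpu", gpu), ("--cpu", cpu), ("--epoch", epoch)):
        if value is not None:
            args += [flag, str(value)]
    return args

cmd = baseline_command(["quick", "start"], gpu=1, cpu=4)
print(" ".join(cmd))  # graphvite baseline quick start --gpu 1 --cpu 4
```

On a machine with GraphVite installed, the resulting list can be passed to subprocess.run(cmd, check=True).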

Custom Experiment

Create a yaml configuration scaffold for graph, knowledge graph, visualization or word graph.

graphvite new [application ...] [--file f]

Fill in the necessary entries in the configuration following the instructions. You can then run the configuration with

graphvite run [config] [--no-eval] [--gpu n] [--cpu m] [--epoch e]

High-dimensional Data Visualization

You can visualize your high-dimensional vectors with a simple command line in GraphVite.

graphvite visualize [file] [--label label_file] [--save save_file] [--perplexity n] [--3d]

The file can be either a numpy dump *.npy or a text matrix *.txt. For the save file, we recommend the png format; pdf is also supported.
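
As a sketch of how such input files might be produced, the snippet below writes the same matrix as a numpy dump and as a text matrix using NumPy. The random data and file names are placeholders, and the text layout (one whitespace-separated row per vector, the np.savetxt default) is an assumption about what the command accepts.

```python
import os
import tempfile

import numpy as np

# Placeholder data: 100 random 64-dimensional vectors.
rng = np.random.default_rng(0)
vectors = rng.standard_normal((100, 64)).astype(np.float32)

out_dir = tempfile.mkdtemp()
npy_path = os.path.join(out_dir, "vectors.npy")
txt_path = os.path.join(out_dir, "vectors.txt")

np.save(npy_path, vectors)     # numpy dump (*.npy)
np.savetxt(txt_path, vectors)  # text matrix (*.txt), one row per vector

# Both files should load back with the same shape.
print(np.load(npy_path).shape, np.loadtxt(txt_path).shape)
```

Either file could then be passed as the [file] argument, for example graphvite visualize vectors.npy --save vectors.png.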

Contributing

We welcome all contributions, from bug fixes to new features. Please let us know if you have any suggestions for the library.

Development Team

GraphVite is developed by MilaGraph, led by Prof. Jian Tang.

Authors of this project are Zhaocheng Zhu, Shizhen Xu, Meng Qu and Jian Tang. Contributors include Kunpeng Wang and Zhijian Duan.

Citation

If you find GraphVite useful for your research or development, please cite the following paper.

@inproceedings{zhu2019graphvite,
    title={GraphVite: A High-Performance CPU-GPU Hybrid System for Node Embedding},
    author={Zhu, Zhaocheng and Xu, Shizhen and Qu, Meng and Tang, Jian},
    booktitle={The World Wide Web Conference},
    pages={2494--2504},
    year={2019},
    organization={ACM}
}

Acknowledgements

We would like to thank Compute Canada for supporting GPU servers. We especially thank Wenbin Hou for useful discussions on C++ and GPU programming techniques.

