Node2vec
Implementation of the node2vec algorithm.
Install / Use
/learn @eliorc/Node2vecREADME
Node2Vec
Python3 implementation of the node2vec algorithm Aditya Grover, Jure Leskovec and Vid Kocijan. node2vec: Scalable Feature Learning for Networks. A. Grover, J. Leskovec. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2016.
Maintenance
I no longer have time to maintain this, if someone wants to pick the baton let me know
Installation
pip install node2vec
Usage
import networkx as nx
from node2vec import Node2Vec
# Create a graph
graph = nx.fast_gnp_random_graph(n=100, p=0.5)
# Precompute probabilities and generate walks - **ON WINDOWS ONLY WORKS WITH workers=1**
node2vec = Node2Vec(graph, dimensions=64, walk_length=30, num_walks=200, workers=4) # Use temp_folder for big graphs
# Embed nodes
model = node2vec.fit(window=10, min_count=1, batch_words=4) # Any keywords acceptable by gensim.Word2Vec can be passed, `dimensions` and `workers` are automatically passed (from the Node2Vec constructor)
# Look for most similar nodes
model.wv.most_similar('2') # Output node names are always strings
# Save embeddings for later use
model.wv.save_word2vec_format(EMBEDDING_FILENAME)
# Save model for later use
model.save(EMBEDDING_MODEL_FILENAME)
# Embed edges using Hadamard method
from node2vec.edges import HadamardEmbedder
edges_embs = HadamardEmbedder(keyed_vectors=model.wv)
# Look for embeddings on the fly - here we pass normal tuples
edges_embs[('1', '2')]
''' OUTPUT
array([ 5.75068220e-03, -1.10937878e-02, 3.76693785e-01, 2.69105062e-02,
... ... ....
..................................................................],
dtype=float32)
'''
# Get all edges in a separate KeyedVectors instance - use with caution could be huge for big networks
edges_kv = edges_embs.as_keyed_vectors()
# Look for most similar edges - this time tuples must be sorted and as str
edges_kv.most_similar(str(('1', '2')))
# Save embeddings for later use
edges_kv.save_word2vec_format(EDGES_EMBEDDING_FILENAME)
Parameters
node2vec.Node2vec
-
Node2Vecconstructor:graph: The first positional argument has to be a networkx graph. Node names must be all integers or all strings. On the output model they will always be strings.dimensions: Embedding dimensions (default: 128)walk_length: Number of nodes in each walk (default: 80)num_walks: Number of walks per node (default: 10)p: Return hyper parameter (default: 1)q: Input parameter (default: 1)weight_key: On weighted graphs, this is the key for the weight attribute (default: 'weight')workers: Number of workers for parallel execution (default: 1)sampling_strategy: Node specific sampling strategies, supports setting node specific 'q', 'p', 'num_walks' and 'walk_length'. Use these keys exactly. If not set, will use the global ones which were passed on the object initialization`quiet: Boolean controlling the verbosity. (default: False)temp_folder: String path pointing to folder to save a shared memory copy of the graph - Supply when working on graphs that are too big to fit in memory during algorithm execution.seed: Seed for the random number generator (default: None). Deterministic results can be obtained if seed is set andworkers=1.
-
Node2Vec.fitmethod: Accepts any key word argument acceptable by gensim.Word2Vec
node2vec.EdgeEmbedder
EdgeEmbedder is an abstract class which all the concrete edge embeddings class inherit from.
The classes are AverageEmbedder, HadamardEmbedder, WeightedL1Embedder and WeightedL2Embedder which their practical definition could be found in the paper on table 1
Notice that edge embeddings are defined for any pair of nodes, connected or not and even node with itself.
-
Constructor:
keyed_vectors: A gensim.models.KeyedVectors instance containing the node embeddingsquiet: Boolean controlling the verbosity. (default: False)
-
EdgeEmbedder.__getitem__(item)method, better known asEdgeEmbedder[item]:item- A tuple consisting of 2 nodes from thekeyed_vectorspassed in the constructor. Will return the embedding of the edge.
-
EdgeEmbedder.as_keyed_vectorsmethod: Returns agensim.models.KeyedVectorsinstance with all possible node pairs in a sorted manner as string. For example, for nodes ['1', '2', '3'] we will have as keys "('1', '1')", "('1', '2')", "('1', '3')", "('2', '2')", "('2', '3')" and "('3', '3')".
Caveats
- Node names in the input graph must be all strings, or all ints
- Parallel execution not working on Windows (
joblibknown issue). To run non-parallel on Windows passworkers=1on theNode2Vec's constructor
TODO
- [x] Parallel implementation for walk generation
- [ ] Parallel implementation for probability precomputation
Related Skills
proje
Interactive vocabulary learning platform with smart flashcards and spaced repetition for effective language acquisition.
YC-Killer
2.7kA library of enterprise-grade AI agents designed to democratize artificial intelligence and provide free, open-source alternatives to overvalued Y Combinator startups. If you are excited about democratizing AI access & AI agents, please star ⭐️ this repository and use the link in the readme to join our open source AI research team.
best-practices-researcher
The most comprehensive Claude Code skills registry | Web Search: https://skills-registry-web.vercel.app
groundhog
398Groundhog's primary purpose is to teach people how Cursor and all these other coding agents work under the hood. If you understand how these coding assistants work from first principles, then you can drive these tools harder (or perhaps make your own!).
