[D] For GNNs, are gradients normally tracked through neighborhood aggregation operations (e.g. max, mean)?
I am writing a GNN from scratch to demonstrate to myself that I understand all the required concepts.
I am a bit confused about whether neighborhood aggregation operations like the mean and max of neighbor embeddings require gradients to be tracked through them. In my code I currently perform these operations inside a with torch.no_grad(): block, because if I don't, each epoch takes forever.
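For context, my understanding is that torch.no_grad() detaches everything computed inside it from the autograd graph, so the mean/max of neighbor embeddings ends up with no grad_fn. A quick check of what I mean (shapes are made up, just for illustration):

    import torch

    feat = torch.randn(5, 8, requires_grad=True)

    with torch.no_grad():
        agg = feat[[1, 2, 3], :].mean(0)
    print(agg.requires_grad, agg.grad_fn)  # False None -- detached from the graph

    agg = feat[[1, 2, 3], :].mean(0)
    print(agg.requires_grad, agg.grad_fn)  # True <MeanBackward...> -- gradients tracked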
Here is my code for those operations:
    def neighborhood_aggregation(self, adj_lists, feat, agg_method):
        # adj_lists is a dict of neighbors for every node in the graph,
        # e.g. adj_lists = {0: {1, 4, 5, 6}, 1: {2, 4, 5}, ...}
        # (node 0 has neighbors 1, 4, 5, 6)
        with torch.no_grad():
            # construct aggregated neighborhood embeddings
            dim = list(feat.size())
            n_nodes = dim[0]
            feat_dim = dim[1]
            # aggregated embeddings for all nodes in the graph
            aggregated_embed = torch.Tensor(n_nodes, feat_dim)
            embed_element_vec = torch.arange(feat_dim)
            for node_id, neighbor_node_ids in adj_lists.items():
                neighborhood_embedding = feat[list(neighbor_node_ids), :]
                if agg_method == 'mean':
                    aggregated_neighborhood_embedding = torch.mean(neighborhood_embedding, 0)
                elif agg_method == 'pool':
                    aggregated_neighborhood_embedding = torch.max(neighborhood_embedding, 0)[0]
                else:
                    raise KeyError('Aggregator type {} not recognized.'.format(agg_method))
                aggregated_embed[node_id, embed_element_vec] = aggregated_neighborhood_embedding
            return aggregated_embed
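If the answer is that gradients should flow, I guess the minimal change is just dropping the no_grad block; something like this sketch is what I'd try (same logic, but collecting per-node results in a list and stacking, so no pre-allocated buffer is needed; it assumes every node id 0..n_nodes-1 is a key in adj_lists and has at least one neighbor):

    def neighborhood_aggregation(self, adj_lists, feat, agg_method):
        # Same adj_lists format as above: {node_id: {neighbor ids}, ...}
        aggregated = []
        for node_id in range(feat.size(0)):
            # Indexing a requires_grad tensor keeps the result in the autograd graph.
            neighborhood_embedding = feat[list(adj_lists[node_id]), :]
            if agg_method == 'mean':
                aggregated.append(torch.mean(neighborhood_embedding, 0))
            elif agg_method == 'pool':
                aggregated.append(torch.max(neighborhood_embedding, 0)[0])
            else:
                raise KeyError('Aggregator type {} not recognized.'.format(agg_method))
        # torch.stack is differentiable, so gradients flow back into feat.
        return torch.stack(aggregated, 0)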
Note: The above code works, and I am getting very good results with it. It's just that I am not sure whether what I am doing is correct. If it is wrong, I was thinking I need a 3D tensor of shape [n_nodes, n_neighbors, embed_dim] in place of aggregated_embed (created with requires_grad=False) and then perform the mean/max on that tensor, which would track gradients; see the sketch below.
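To make that 3D idea concrete, here is roughly what I was picturing. This is a minimal sketch with all names hypothetical; it assumes neighbor lists are padded up to the maximum degree, with a mask so the padding doesn't affect the mean/max, and that every node has at least one neighbor:

    def neighborhood_aggregation_padded(self, adj_lists, feat, agg_method):
        n_nodes, feat_dim = feat.size()
        max_deg = max(len(nbrs) for nbrs in adj_lists.values())

        # Pad each neighbor list to max_deg; mask marks the real entries.
        idx = torch.zeros(n_nodes, max_deg, dtype=torch.long)
        mask = torch.zeros(n_nodes, max_deg)
        for node_id, nbrs in adj_lists.items():
            nbrs = list(nbrs)
            idx[node_id, :len(nbrs)] = torch.tensor(nbrs)
            mask[node_id, :len(nbrs)] = 1.0

        # [n_nodes, max_deg, feat_dim]; gathering rows of feat is differentiable.
        neigh = feat[idx]

        if agg_method == 'mean':
            # Masked mean: zero out padding, divide by the true degree.
            summed = (neigh * mask.unsqueeze(-1)).sum(1)
            return summed / mask.sum(1, keepdim=True)
        elif agg_method == 'pool':
            # Masked max: set padding to -inf so it never wins the max.
            neigh = neigh.masked_fill(mask.unsqueeze(-1) == 0, float('-inf'))
            return neigh.max(1)[0]
        else:
            raise KeyError('Aggregator type {} not recognized.'.format(agg_method))

The gather feat[idx] and the masked mean/max are all ordinary differentiable ops, so as far as I can tell this version would track gradients whenever feat does.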
Thanks for any help.
submitted by /u/Muunich