Implementation Details of Tree-Diffusion: Architecture and Training for Inverse Graphics

ziapirzada3 weeks ago

Table of Links

Abstract and 1. Introduction

2. Background & Related Work

3. Method

3.1 Sampling Small Mutations

3.2 Policy

3.3 Value Network & Search

3.4 Architecture

4. Experiments

4.1 Environments

4.2 Baselines

4.3 Ablations

5. Conclusion, Acknowledgments and Disclosure of Funding, and References

Appendix

A. Mutation Algorithm

B. Context-Free Grammars

C. Sketch Simulation

D. Complexity Filtering

E. Tree Path Algorithm

F. Implementation Details

F Implementation Details

We implement our architecture in PyTorch [1]. For our image encoder we use the NF-ResNet26 [4] implementation from the open-sourced library by Wightman [38]. Images are of size 128 × 128 × 1 for CSG2D and 128 × 128 × 3 for TinySVG. We pass the current and target images as a stack of image planes into the image encoder. Additionally, we provide the absolute difference between current and target image as additional planes.
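
The input layout described above can be sketched without any framework. This is a minimal illustration, not the authors' code: the function names and the plane order (current, then target, then difference) are assumptions, and images are represented as plain nested lists. In PyTorch the same stacking would typically be a `torch.cat` along the channel dimension.

```python
def stack_planes(current, target):
    """Stack current, target, and |current - target| planes.

    Each image is a list of channel planes; each plane is a list of rows.
    The plane order used here is an assumption, not taken from the paper.
    """
    if len(current) != len(target):
        raise ValueError("current and target must have the same channel count")
    # Element-wise absolute difference, computed one channel plane at a time.
    diff = [
        [[abs(c - t) for c, t in zip(c_row, t_row)]
         for c_row, t_row in zip(c_plane, t_plane)]
        for c_plane, t_plane in zip(current, target)
    ]
    return current + target + diff


def num_input_planes(channels):
    """Planes fed to the encoder under this layout: current + target + difference."""
    return 3 * channels


# Tiny 2 x 2 single-channel example (a CSG2D image would be 128 x 128 x 1).
current = [[[0.0, 1.0], [1.0, 0.0]]]
target = [[[1.0, 1.0], [0.0, 0.0]]]
stacked = stack_planes(current, target)
```

Under this layout, a one-channel CSG2D pair yields 3 input planes and a three-channel TinySVG pair yields 9.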

For the autoregressive (CSGNet) baseline, we trained the model to output ground-truth programs from target images, and provided a blank current image. For tree diffusion methods, we initialized the search and rollouts using the output of the autoregressive model, which counted as a single node expansion. For our re-implementation of Ellis et al. [11], we flattened the CSG2D tree into shapes being added from left to right. We then randomly sampled a position in this shape array, compiled the output up until the sampled position, and trained the model to output the next shape using constrained grammar decoding.
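
The next-shape training setup for the re-implementation can be sketched as follows. This is a hedged illustration only: `sample_next_shape_example`, the `render` callable, and the string shape encoding are hypothetical, and allowing position 0 (an empty canvas) is an assumption.

```python
import random


def sample_next_shape_example(shapes, render, rng):
    """Build one (rendered prefix, next shape) training pair.

    shapes: the flattened left-to-right shape list of one program.
    render: hypothetical compiler from a shape prefix to an image.
    rng:    a random.Random instance, passed in for reproducibility.
    """
    if not shapes:
        raise ValueError("program must contain at least one shape")
    # Sample the index of the shape the model should predict next.
    pos = rng.randrange(len(shapes))
    # Compile everything before that index into the current canvas.
    canvas = render(shapes[:pos])
    return canvas, shapes[pos]


# Toy stand-in for a renderer: the "image" is just the tuple of drawn shapes.
toy_render = tuple
program = ["circle_a", "quad_b", "circle_c"]
canvas, next_shape = sample_next_shape_example(program, toy_render, random.Random(0))
```

In the re-implementation the model emits the next shape as tokens under constrained grammar decoding, which the sketch abstracts as returning `shapes[pos]`.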

This is a departure from the pointer network architecture in their work. We think that the lack of prior shaping, the departure from a graphics-specific pointer network, and the absence of reinforcement-learning fine-tuning lead to a performance difference between their results and our re-implementation. We note that our method does not require any of these additional features, and thus the comparison is fairer.

For tree diffusion search, we used a beam size of 64, with a maximum node expansion budget of 5000 nodes.
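
To show how a beam size and a node expansion budget interact, here is a generic beam search sketch. It is not the authors' search procedure: `expand`, `score`, and `is_goal` are hypothetical callables (in the paper's setting they would correspond roughly to policy proposals, a value estimate, and a match against the target image), and the sketch does no deduplication of candidates.

```python
import heapq


def beam_search(start, expand, score, is_goal, beam_size=64, max_expansions=5000):
    """Beam search that stops once the node expansion budget is used up.

    Lower score is better. Returns (best node found, expansions used).
    """
    beam = [start]
    best = start
    expansions = 0
    while beam and expansions < max_expansions:
        candidates = []
        for node in beam:
            if expansions >= max_expansions:
                break  # budget exhausted partway through this beam
            expansions += 1
            for child in expand(node):
                if is_goal(child):
                    return child, expansions
                candidates.append(child)
        # Keep only the beam_size best-scoring candidates, best first.
        beam = heapq.nsmallest(beam_size, candidates, key=score)
        if beam and score(beam[0]) < score(best):
            best = beam[0]
    return best, expansions


# Toy problem: reach 10 from 0 using +1 and +3 steps.
result, used = beam_search(
    start=0,
    expand=lambda n: [n + 1, n + 3],
    score=lambda n: abs(n - 10),
    is_goal=lambda n: n == 10,
)
```

With the defaults above, the beam holds at most 64 candidates per step and the search gives up after 5000 expansions, returning the best-scoring node seen so far.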

:::info
Authors:

(1) Shreyas Kapur, University of California, Berkeley (srkp@cs.berkeley.edu);

(2) Erik Jenner, University of California, Berkeley (jenner@cs.berkeley.edu);

(3) Stuart Russell, University of California, Berkeley (russell@cs.berkeley.edu).

:::

:::info
This paper is available on arXiv under the CC BY-SA 4.0 DEED license.

:::
