GMNGeoffrey commented Oct 21, 2025

My experience is that template processing takes a huge fraction of the overall runtime. It's also all on CPU, which is a waste if you're using GPU machines.

Note that this uses the features.pkl file that the existing code already saves. For more stability across numpy versions, etc., we could instead use npz files with numpy.savez/numpy.load, since features.pkl just holds a dictionary of numpy arrays. I wanted to first gauge the reaction to this general idea, though.
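
To make the npz alternative concrete, here is a minimal sketch of round-tripping a dictionary of numpy arrays through an `.npz` archive. It is not the code in this PR: the helper names `save_features_npz` and `load_features_npz` are hypothetical, the feature dictionary is toy data, and it assumes every value is a plain numeric array. Any object-dtype array would need converting to a fixed-width dtype first, because loading with `allow_pickle=False` rejects pickled objects.

```python
import os
import tempfile

import numpy as np


def save_features_npz(features, path):
    """Write a flat dict of numpy arrays to a compressed .npz archive.

    Hypothetical helper for illustration; keys become the array names
    inside the archive.
    """
    np.savez_compressed(path, **features)


def load_features_npz(path):
    """Read the archive back into a plain dict of numpy arrays."""
    # allow_pickle=False refuses to deserialize pickled Python objects,
    # so this only works if all stored arrays have non-object dtypes.
    with np.load(path, allow_pickle=False) as archive:
        # Index every entry while the file is open so the arrays are
        # fully read before the archive is closed.
        return {name: archive[name] for name in archive.files}


# Toy data standing in for a real feature dictionary (assumed shapes/dtypes).
features = {
    "aatype": np.zeros((5, 21), dtype=np.float32),
    "residue_index": np.arange(5, dtype=np.int32),
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "features.npz")
    save_features_npz(features, path)
    restored = load_features_npz(path)
```

Compared with pickle, this stores only array data in numpy's own format, which is the source of the extra stability across numpy versions mentioned above.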

See discussion in #895

google-cla bot commented Oct 21, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

GMNGeoffrey (Author) commented

I also have additional changes that use an npz file instead, allow running just the featurization code (e.g., so you can do that on a CPU-only machine first), and split the run with precomputed features out into a separate script so it doesn't try to look for all the db files. Happy to upstream those, but as I noted, I wanted to get some feedback on the general idea first.

I'll sort out the CLA on our side
