A sneak peek at TorchVision v0.11
The last couple of weeks have been super busy in "PyTorch Land" as we frantically prepare the release of PyTorch v1.10 and TorchVision v0.11. In this 2nd instalment of the series, I'll cover some of the upcoming features that are currently included in the release branch of TorchVision.
Disclaimer: Though the upcoming release is packed with numerous enhancements and bug/test/documentation improvements, here I'm highlighting new "user-facing" features in domains I'm personally interested in. After writing the blog post, I also noticed a bias towards features I reviewed, wrote or whose development I followed closely. Covering (or not covering) a feature says nothing about its importance. Opinions expressed are solely my own.
New Models
The new release is packed with new models:
Kai Zhang has added an implementation of the RegNet architecture along with pre-trained weights for 14 variants which closely reproduce the original paper.
I've recently added an implementation of the EfficientNet architecture along with pre-trained weights for variants B0-B7 provided by Luke Melas-Kyriazi and Ross Wightman.
New Data Augmentations
A few new Data Augmentation techniques have been added to the latest version:
Samuel Gabriel has contributed TrivialAugment, a new simple but highly effective strategy that seems to provide superior results to AutoAugment.
I've added the RandAugment method to the auto-augmentations.
I've provided an implementation of the Mixup and CutMix transforms in the references. These will be moved into transforms in the next release once their API is finalized.
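To make the Mixup idea concrete, here is a minimal framework-free sketch. It is not the TorchVision reference implementation: the function names `mixup` and `sample_lambda` are hypothetical, the "images" are plain Python lists, and the `alpha=0.2` default is only an illustrative assumption. The core of the technique is blending two samples and their one-hot labels with the same weight drawn from a Beta distribution.

```python
import random


def mixup(x_a, y_a, x_b, y_b, lam):
    """Blend two samples and their one-hot labels with weight lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def sample_lambda(alpha=0.2, rng=random):
    """Draw the mixing weight from Beta(alpha, alpha) (alpha is an assumed default)."""
    return rng.betavariate(alpha, alpha)


# Two flattened toy "images" with one-hot labels for a 3-class problem.
img_a, label_a = [0.0, 0.5, 1.0, 1.0], [1.0, 0.0, 0.0]
img_b, label_b = [1.0, 0.5, 0.0, 0.0], [0.0, 0.0, 1.0]

lam = sample_lambda(alpha=0.2)
mixed_img, mixed_label = mixup(img_a, label_a, img_b, label_b, lam)
```

Because the label is mixed with the same weight as the input, the mixed label still sums to 1 and can be fed to a loss that accepts soft targets. CutMix follows the same labelling idea but pastes a rectangular patch of one image onto the other instead of blending every pixel.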
New Operators and Layers
A number of new operators and layers have been included:
Victor Fomin has contributed the backward implementations of bilinear and bicubic interpolation with the anti-alias option for CPUs and GPUs.
Kai Zhang and I have refactored common building blocks of models and written re-usable implementations for the Squeeze-Excitation and Conv-Norm-Activation layers.
I've updated our references to support Label Smoothing, which was recently introduced by Joel Schlosser and Thomas J. Fan in PyTorch core.
I've included the option to perform Learning Rate Warmup, using the latest LR schedulers developed by Ilqar Ramazanli.
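For readers unfamiliar with the last two items, the following is a small framework-free sketch of what Label Smoothing and linear Learning Rate Warmup compute. It is an illustration under stated assumptions, not the PyTorch implementation: the helper names (`smooth_labels`, `cross_entropy`, `warmup_lr`) and the default values (`smoothing=0.1`, `start_factor=0.1`) are hypothetical, and the smoothing shown is one common formulation that mixes the one-hot target with a uniform distribution over all classes.

```python
import math


def smooth_labels(target_index, num_classes, smoothing=0.1):
    """Mix the one-hot target with a uniform distribution over all classes."""
    uniform = smoothing / num_classes
    dist = [uniform] * num_classes
    dist[target_index] += 1.0 - smoothing
    return dist


def cross_entropy(logits, target_dist):
    """Cross-entropy between softmax(logits) and a (possibly soft) target."""
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return -sum(t * (v - log_z) for v, t in zip(logits, target_dist))


def warmup_lr(base_lr, step, warmup_steps, start_factor=0.1):
    """Ramp linearly from start_factor * base_lr to base_lr over warmup_steps."""
    if step >= warmup_steps:
        return base_lr
    factor = start_factor + (1.0 - start_factor) * step / warmup_steps
    return base_lr * factor


# Smoothed target for class 1 out of 4, and the loss for some toy logits.
target = smooth_labels(target_index=1, num_classes=4, smoothing=0.1)
loss = cross_entropy([0.2, 2.0, -1.0, 0.5], target)

# Learning rate during the first few steps of a 5-step warmup.
schedule = [warmup_lr(0.5, step, warmup_steps=5) for step in range(7)]
```

Smoothing keeps the model from being rewarded for pushing the true-class probability all the way to 1, while warmup avoids large, unstable updates in the first iterations by starting from a fraction of the target learning rate.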
Other improvements
Here are some other notable improvements added in the release:
Alexander Soare and Francisco Massa have developed an FX-based utility which allows extracting arbitrary intermediate features from model architectures.
Nikita Shulga has added support for CUDA 11.3 to TorchVision.
Zhongkai Zhu has fixed the dependency issues of JPEG lib (this issue has caused major headaches to many of our users).
In-progress & Next-up
There are lots of exciting new features under development which didn't make it into this release. Here are a few:
Moto Hira, Parmeet Singh Bhatia and I have drafted an RFC, which proposes a new mechanism for Model Versioning and for handling metadata associated with pre-trained weights. This will enable us to support multiple pre-trained weights for each model and attach associated information such as labels, preprocessing transforms, etc. to the models.
I'm currently working on using the primitives added by the "Batteries Included" project in order to improve the accuracy of our pre-trained models. The target is to achieve best-in-class results for the most popular pre-trained models provided by TorchVision.
Philip Meier and Francisco Massa are working on an exciting prototype for TorchVision's new Dataset and Transforms API.
Prabhat Roy is working on extending PyTorch Core's AveragedModel class to support the averaging of the buffers in addition to parameters. The lack of this feature is commonly reported as a bug, and adding it will enable numerous downstream libraries and frameworks to remove their custom EMA implementations.
Aditya Oke wrote a utility which allows plotting the results of Keypoint models on the original images (the feature didn't make it into the release as we got swamped and couldn't review it in time).
I'm building a prototype FX-utility which aims to detect Residual Connections in arbitrary Model architectures and modify the network to add regularization blocks (such as StochasticDepth).
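As a rough illustration of the parameter-and-buffer averaging mentioned in the list above, here is a framework-free sketch of an exponential moving average (EMA). It does not reproduce the AveragedModel class: the `ema_update` helper, the dictionary-based "model state", the entry names and the `decay=0.9` value are all hypothetical, chosen only to show why averaging parameters alone leaves buffers such as normalization running statistics out of sync.

```python
def ema_update(averaged, current, decay=0.9):
    """Update `averaged` in place: avg = decay * avg + (1 - decay) * current."""
    for name, value in current.items():
        averaged[name] = decay * averaged[name] + (1.0 - decay) * value
    return averaged


# Hypothetical model state, split into learned parameters and tracked
# buffers (for example the running mean of a normalization layer).
params = {"conv.weight": 1.0}
buffers = {"bn.running_mean": 0.0}

# The averaged copies start from the initial state.
ema_params = dict(params)
ema_buffers = dict(buffers)

for step in range(3):
    # Pretend one training step changed both the weights and the running stats.
    params["conv.weight"] += 1.0
    buffers["bn.running_mean"] += 0.5
    ema_update(ema_params, params)
    ema_update(ema_buffers, buffers)  # the part that parameter-only averaging skips
```

If the last call were omitted, the averaged weights would be paired with statistics that were never averaged, which is the kind of mismatch custom EMA implementations typically work around.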
Finally there are a few new features in our backlog (PRs coming soon):
Nicolas Hug is working to add the RAFT model for Optical Flow.
I hope you found the above summary interesting. Any ideas on how to adapt the format of the blog series are very welcome. Hit me up on LinkedIn or Twitter.