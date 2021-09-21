



Posted by Sayak Paul (MLE at Carted, and GDE) and Morgan Roff (Google)

We’re pleased to share the work completed this year by Google Summer of Code students working on TensorFlow Hub. If you’re a student interested in writing open source code, you will likely be interested in Google’s Summer of Code program.

Through this program, students propose project ideas to open source organizations and, if selected, receive a stipend to work with them on the project over the summer. Students learn directly from mentors within the organization of their choice, and the organization benefits from the students’ contributions. This year, 17 students successfully completed projects with the TensorFlow organization. This article focuses on some of the work done on TensorFlow Hub.

We are Sayak and Morgan, two of the mentors for the TensorFlow Hub (TF Hub) projects. Here we share what the students learned about building and publishing state-of-the-art models and training them on large benchmark datasets, what we learned as mentors, and how rewarding Summer of Code was for each of us and for the community.

We had the opportunity to mentor two students, Aditya Kane and Vasudev Gupta. Aditya implemented several variants of RegNet, including those based on this paper, and trained them on the ImageNet-1k dataset. Vasudev ported the pre-trained wav2vec2 weights from this paper to TensorFlow, which required implementing the model architecture from scratch. He then fine-tuned these pre-trained checkpoints on the LibriSpeech dataset, making his work more customizable and relevant to the community.

Because both projects involved model training at large scale, it was especially important to follow good engineering practices during implementation. These include code modularization, unit testing, good design patterns, and optimization. The models were trained on Cloud TPUs to reduce training time, so a great deal of effort went into the data input pipeline to maximize accelerator utilization.
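To illustrate why the input pipeline matters for accelerator utilization, here is a minimal, framework-free sketch of prefetching: a background thread prepares the next batches while the current one is being processed, so the compute step rarely waits for data. This is not the students’ pipeline. The names `prefetch`, `load_batches`, and `train_step` are hypothetical, and the sleeps only simulate loading and compute time. In TensorFlow, `tf.data.Dataset.prefetch` plays a similar role.

```python
import queue
import threading
import time


def prefetch(batches, buffer_size=2):
    """Yield items from `batches`, preparing them ahead of time on a background thread."""
    buffer = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the input

    def producer():
        try:
            for batch in batches:
                buffer.put(batch)  # blocks while the buffer is full
        finally:
            buffer.put(done)  # always signal the end so the consumer never hangs

    threading.Thread(target=producer, daemon=True).start()
    while True:
        batch = buffer.get()
        if batch is done:
            return
        yield batch


def load_batches(num_batches, load_seconds=0.01):
    """Hypothetical stand-in for reading and preprocessing one batch at a time."""
    for step in range(num_batches):
        time.sleep(load_seconds)  # simulated I/O and preprocessing cost
        yield [step] * 4


def train_step(batch, compute_seconds=0.01):
    """Hypothetical stand-in for one accelerator step."""
    time.sleep(compute_seconds)  # simulated compute cost
    return sum(batch)


# Loading of the next batch overlaps with the train step on the current one.
results = [train_step(batch) for batch in prefetch(load_batches(5))]
```

The bounded queue is the key design choice here: it caps memory use while keeping a couple of batches ready. The sketch does not re-raise errors from the loader thread, which a real pipeline would need to handle.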

All these factors contributed to the complexity of the projects. Thanks to the Summer of Code program, students have the opportunity to tackle these tasks with the help of experienced mentors. This gives students insight into the organization and lets them interact with people with a wide range of skills who work together to make large-scale projects possible. We would like to thank the students for handling this engineering work gracefully and listening to our feedback.

Vasudev and Aditya have provided TensorFlow Hub with important pre-trained models, along with tutorials on their use (Wav2Vec, RegNetY) and TensorFlow implementations for those who want to dig deeper. In their own words:

The last couple of months have been full of learning and coding. GSoC helped me get into the speech domain and motivated me to explore more of the TensorFlow ecosystem. I thank my mentors for their continuous and timely feedback. I look forward to contributing further to the TensorFlow community and other great open source projects. -Vasudev Gupta

Details of RegNets and Wav2Vec2

ResNet is still widely used as a benchmark architecture for image understanding tasks, about six years after it was first published. Many modern self-supervised and semi-supervised learning frameworks continue to use ResNet-50 as their backbone architecture. However, ResNets scale less well in large-data regimes, and their training and inference latency often increases as they grow. In contrast, RegNets were developed specifically as a scalable architectural framework that maintains low latency while delivering high performance on standard image recognition tasks. Aditya’s models are published on TF Hub, and the code and tutorials are published on GitHub.

Self-supervised learning is an important area of machine learning research. Many recent success stories have focused on NLP and computer vision, and Vasudev’s project set out to explore speech. Last year, a group of researchers released the wav2vec2 framework for learning representations from speech in a self-supervised manner, which benefits downstream tasks such as speech-to-text.

With wav2vec2, you can now pre-train speech models without labeled data and fine-tune them for downstream tasks such as speaker recognition. Vasudev’s model is available on TF Hub, along with a new tutorial on fine-tuning and the code on GitHub.

Summary

We would like to express our sincere gratitude to all the students, mentors, and organizers who made the Summer of Code a success despite many challenges this year. We encourage you to check out these models and share what you’ve built, either in social media posts tagged with #TFHub or through the Community Spotlight Program. If you have questions about these new models or want to know more, you can ask at discuss.tensorflow.org.

