[N] Test a Distilled GPT-2’s generative capabilities

At Hugging Face, we recently started distilling models, beginning with DistilBERT, a distilled version of BERT. We have now distilled the small version of GPT-2 as well; the resulting DistilGPT-2 has the following characteristics:

81.9M parameters vs 124M for GPT-2/small (66% of the parameters)

Weighs 336 MB on disk vs 523 MB for GPT-2/small (64% of the disk size)

On both CPU and GPU, the average forward pass of DistilGPT-2 takes 51% of the time of GPT-2/small, i.e. roughly twice as fast (see the timing sketch after this list).

The absolute increase in perplexity on WikiText-103 is 3.5 points (15.0 -> 18.5).
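To make the forward-pass comparison concrete, here is a minimal sketch of how one might time the two models with the transformers library. The checkpoint names "distilgpt2" and "gpt2" refer to the models on the Hugging Face Hub; the prompt, the number of runs, and the helper function are illustrative assumptions, and exact numbers will vary with hardware.

```python
# Hypothetical timing sketch: average forward-pass latency of DistilGPT-2
# vs GPT-2/small using the transformers library and PyTorch.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def mean_forward_ms(model_name: str, n_runs: int = 50) -> float:
    """Average forward-pass time in milliseconds over n_runs."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()
    inputs = tokenizer("Distillation makes transformers", return_tensors="pt")
    with torch.no_grad():
        model(**inputs)  # warm-up pass so one-time initialisation is excluded
        start = time.perf_counter()
        for _ in range(n_runs):
            model(**inputs)
    return (time.perf_counter() - start) / n_runs * 1000

for name in ("distilgpt2", "gpt2"):
    print(f"{name}: {mean_forward_ms(name):.1f} ms per forward pass")
```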

We have added it to our Write With Transformer app, as well as to our two repos: transformers (along with a tutorial on how to distill transformers and example scripts!) and swift-coreml-transformers. We have successfully run it on an iPhone 7, and on an iPhone X with the Neural Engine it is 38% faster than GPT-2.
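If you want to test its generative capabilities yourself, here is a minimal sketch of sampling from the "distilgpt2" checkpoint with the transformers library; the prompt and the sampling parameters (top-k/top-p values, max length) are illustrative choices, not settings from the post.

```python
# Hypothetical generation sketch for trying out DistilGPT-2.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")
model.eval()

prompt = "In a shocking finding, scientists discovered"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=50,        # total length including the prompt
        do_sample=True,       # sample instead of greedy decoding
        top_k=50,             # keep only the 50 most likely next tokens
        top_p=0.95,           # nucleus sampling
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```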

submitted by /u/jikkii