[D] Decoding with the Transformer in inference mode for time series data

With the Transformer model from “Attention Is All You Need”, you have to feed in the actual target sequence during training. Obviously, this cannot be done at inference time. Usually, greedy decoding or beam search is used at inference to generate the target sequence iteratively. However, from my understanding (could be wrong), both beam search and greedy decoding generally work in conjunction with a softmax function, and that softmax is taken over a discrete vocabulary. How would we use the Transformer in inference mode for a time series forecasting task, where the targets are continuous values? What is the best way to generate the target values for the decoder? Could beam search still work?
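
To make the question concrete, here is a minimal sketch of what iterative decoding could look like when the model emits a single real value per step instead of a softmax over a vocabulary. It is not taken from the paper or from any library: `predict_next`, `autoregressive_forecast`, and `toy_model` are hypothetical names, and `toy_model` is only a stand-in for a trained Transformer. With a point-valued output, "greedy" decoding reduces to feeding each prediction back in as the next decoder input. Beam search needs a score to rank competing candidates, so it would presumably only apply if the model output a distribution, for example over discretized value bins.

```python
from typing import Callable, List, Sequence


def autoregressive_forecast(
    predict_next: Callable[[Sequence[float], Sequence[float]], float],
    history: Sequence[float],
    horizon: int,
) -> List[float]:
    """Roll the decoder forward one step at a time, feeding predictions back in."""
    # Assumption: the last observed value seeds the decoder, playing the role
    # that a start-of-sequence token plays in text generation.
    decoder_input: List[float] = [history[-1]]
    forecast: List[float] = []
    for _ in range(horizon):
        # predict_next is a hypothetical wrapper around a trained model:
        # it receives the encoder input and the decoder input built so far.
        next_value = predict_next(history, decoder_input)
        forecast.append(next_value)
        # The prediction replaces the ground-truth target used during training.
        decoder_input.append(next_value)
    return forecast


def toy_model(history: Sequence[float], decoder_input: Sequence[float]) -> float:
    # Hypothetical stand-in for a trained Transformer: continue the last slope.
    slope = history[-1] - history[-2]
    return decoder_input[-1] + slope


print(autoregressive_forecast(toy_model, [1.0, 2.0, 3.0], horizon=3))
# prints [4.0, 5.0, 6.0]
```

Because each step consumes the previous prediction, errors can compound over the horizon, which is part of why the choice of decoding strategy matters here.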

submitted by /u/svpadd3