
Kyle Polich discusses sequence-to-sequence models. The following are the key points from the podcast:

  • Many ML approaches are limited to fixed-size inputs and fixed-size outputs.
  • Natural language does not fit that constraint: summarizing a paper or translating between languages involves inputs and outputs of varying length.
  • What a word means depends on the context.
  • The algorithm learns an internal state representation of the input.
  • The encoder/decoder architecture has obvious promise for machine translation, and has been successfully applied this way. Encoding an input down to a small number of hidden nodes that can be decoded back into a matching string forces the network to learn an efficient representation of the essence of the strings.
  • The model figures out the best way to encode the words.
  • Seq2seq translation quality is measured with the BLEU score.
  • The generated translation is compared against several human translations of the same sentence (see the BLEU example after this list).
  • Typical architecture: Input → Embedding layer → LSTM layer → Dense layer, which maps each output to a word (a sketch follows this list).
  • Image captioning is also done via sequence-to-sequence models.
  • Relevant link on seq2seq
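
As a concrete illustration of the encoder/decoder idea and the Input → Embedding → LSTM → Dense stack mentioned above, here is a minimal sketch in tf.keras. The vocabulary sizes, embedding dimension, and hidden size are illustrative assumptions, not values from the podcast.

```python
from tensorflow.keras import layers, Model

SRC_VOCAB = 8000   # assumed source-language vocabulary size
TGT_VOCAB = 8000   # assumed target-language vocabulary size
EMBED_DIM = 128    # assumed embedding size
HIDDEN = 256       # assumed LSTM state size

# Encoder: Input -> Embedding -> LSTM; only the final states are kept,
# acting as the learned internal representation of the source sentence.
enc_in = layers.Input(shape=(None,), name="source_tokens")
enc_emb = layers.Embedding(SRC_VOCAB, EMBED_DIM)(enc_in)
_, state_h, state_c = layers.LSTM(HIDDEN, return_state=True)(enc_emb)

# Decoder: Input -> Embedding -> LSTM (initialized with the encoder state)
# -> Dense, which maps each decoder step to a distribution over target words.
dec_in = layers.Input(shape=(None,), name="target_tokens")
dec_emb = layers.Embedding(TGT_VOCAB, EMBED_DIM)(dec_in)
dec_seq = layers.LSTM(HIDDEN, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])
probs = layers.Dense(TGT_VOCAB, activation="softmax")(dec_seq)

model = Model([enc_in, dec_in], probs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```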
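
And here is a small example of BLEU scoring as described above: a candidate translation is compared against multiple human reference translations. It uses NLTK's sentence_bleu; the sentences are made up for illustration.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Two hypothetical human reference translations of the same sentence.
references = [
    "the cat sat on the mat".split(),
    "there is a cat on the mat".split(),
]
# Hypothetical machine translation to be scored.
candidate = "the cat is on the mat".split()

# Smoothing avoids a zero score when some higher-order n-grams are missing.
score = sentence_bleu(references, candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```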