The Transformer

The following are the learnings from the podcast:

- The word "bank" has different meanings in different contexts: it could be a river bank or a financial institution.
- The Transformer is an encoder-decoder architecture that makes word embeddings more robust to context. It is a modern NLP technique.
- "Attention Is All You Need" is the paper that revolutionized this space. From its abstract: "The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration."
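To make the "bank" point concrete, here is a minimal sketch (assuming the Hugging Face `transformers` package and the `bert-base-uncased` checkpoint, neither of which is mentioned in the episode) showing that a Transformer encoder assigns the word "bank" different embeddings depending on its context.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bank_embedding(sentence):
    # Run the encoder and take the hidden state of the "bank" token.
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index("bank")]

river = bank_embedding("He sat on the bank of the river.")
money = bank_embedding("She deposited cash at the bank.")

# The two context-dependent vectors for the same word are noticeably different.
print(torch.cosine_similarity(river, money, dim=0).item())
```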

Named Entity Recognition

Kyle Polich discusses NER in this podcast. My learnings are:

- What counts as an entity in an unstructured dataset depends on the context and the task the ML algorithm is trying to accomplish.
- spaCy is a Python package that can do NER (see the sketch after this list).
- NER is used in chatbot applications and semantic search applications.
- A lot of NER packages are good but not great.
- Market research is one use case: parse the brands that were mentioned.
- Wikipedia has a lot of markup, which makes it easy to do NER on.
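A minimal sketch of NER with the spaCy package mentioned in the episode (assumes the small English model `en_core_web_sm` is installed; the example sentence is my own):

```python
import spacy

# Load a small pretrained English pipeline that includes an NER component.
nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple mentioned the iPhone during its keynote in Cupertino.")

# Each detected entity carries a label such as ORG, PRODUCT, or GPE.
for ent in doc.ents:
    print(ent.text, ent.label_)
```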

The Death of a Language

Kyle interviews Zane and Leena about the Endangered Languages Project. My learnings are:

- The project is taking in 3.5 hours of audio content from an endangered language called "Ladin".
- It creates phonetic transcriptions from audio samples of human languages.
- The model has so far produced decent levels of vowel identification.
- The team is currently working on phoneme segmentation and larger consonant categories.
- From the project blurb: "In this project, we are trying to speed up the process of language documentation by building a model that produces phonetic transcriptions from audio samples of human languages."

Sequence to Sequence Models

Kyle Polich discusses sequence-to-sequence models. The following are the points from the podcast:

- Many ML approaches suffer from a fixed-input, fixed-output constraint.
- Natural language does not have fixed-length inputs and outputs: summarizing a paper or translating between languages has no fixed input-output length.
- What a word means depends on its context.
- There is an internal state representation that the algorithm is learning.
- The encoder/decoder architecture has obvious promise for machine translation and has been successfully applied this way (a minimal sketch follows this list).
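The sketch below (in PyTorch, which the episode does not prescribe) shows the encoder/decoder idea: a GRU encoder compresses a variable-length input into an internal state, and a GRU decoder unrolls that state into an output of a different length. The vocabulary sizes and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab=1000, tgt_vocab=1000, hidden=128):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, hidden)
        self.tgt_embed = nn.Embedding(tgt_vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encoder: the final hidden state is the learned internal representation.
        _, state = self.encoder(self.src_embed(src_ids))
        # Decoder: unrolls from that state; output length need not match input length.
        dec_out, _ = self.decoder(self.tgt_embed(tgt_ids), state)
        return self.out(dec_out)

model = Seq2Seq()
src = torch.randint(0, 1000, (1, 7))   # 7 input tokens
tgt = torch.randint(0, 1000, (1, 4))   # 4 output tokens
print(model(src, tgt).shape)           # torch.Size([1, 4, 1000])
```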

Simultaneous Translation

Kyle Polich talks with Liang Huang about his work at Baidu on simultaneous translation. The following are the points covered in the podcast:

- Most advertised cross-language translation vendors, such as Skype, do not do simultaneous translation. They wait for the speaker to finish and then translate; Skype does consecutive translation, not simultaneous translation.
- Simultaneous translation trades off accuracy against latency: you cannot wait too long to produce the translation.
- The work uses a prefix-to-prefix method of translating (sketched below).
- What's the dataset used?
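A toy sketch of a prefix-to-prefix schedule, in the spirit of the "wait-k" policy associated with this line of work (the exact policy is my assumption, not stated in the podcast): the system starts emitting after the first k source words arrive, then alternates between reading one source word and writing one target word. `translate_step` is a hypothetical stand-in for an incremental translation model.

```python
def translate_step(source_prefix, target_prefix):
    # Placeholder: a real system would run a decoder conditioned on both prefixes.
    return f"t{len(target_prefix) + 1}"

def wait_k_translate(source_stream, k=2):
    source_prefix, target_prefix = [], []
    for word in source_stream:
        source_prefix.append(word)  # READ one source word as it arrives
        if len(source_prefix) >= k:
            # WRITE one target word once k source words have been seen
            target_prefix.append(translate_step(source_prefix, target_prefix))
    # Finish the remaining target words once the speaker is done.
    while len(target_prefix) < len(source_prefix):
        target_prefix.append(translate_step(source_prefix, target_prefix))
    return target_prefix

print(wait_k_translate(["s1", "s2", "s3", "s4", "s5"], k=2))
```

The point of the schedule is the accuracy/latency trade-off mentioned above: a larger k gives the model more source context (better accuracy) at the cost of a longer delay before the first translated word.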