Here’s another post I co-authored with Chris McCormick on how to quickly and easily create a SOTA text classifier by fine-tuning BERT in PyTorch. It’s incredibly useful to take a look at this transfer learning approach if you’re interested in creating a high performance NLP model.
Please check out the post I co-authored with Chris McCormick on BERT Word Embeddings here. In it, we take an in-depth look at the word embeddings produced by BERT, show you how to create your own in a Google Colab notebook, and tips on how to implement and use these embeddings in your production pipeline. Check it out!
In a previous post we looked at root-finding methods for single variable equations. In this post we’ll look at the expansion of Quasi-Newton methods to the multivariable case and look at one of the more widely-used algorithms today: Broyden’s Method.
How do you find the roots of a continuous polynomial function? Well, if we want to find the roots of something like:
I wouldn’t expect DropConnect to appear in TensorFlow, Keras, or Theano since, as far as I know, it’s used pretty rarely and doesn’t seem as well-studied or demonstrably more useful than its cousin, Dropout. However, there don’t seem to be any implementations out there, so I’ll provide a few ways of doing so. Continue reading “DropConnect Implementation in Python and TensorFlow”
Suppose we are given a dataset of outcomes from some distribution parameterized by . How do we estimate ?
For example, given a bent coin and a series of heads and tails outcomes from that coin, how can we estimate the probability of the coin landing heads? Continue reading “MLE, MAP, and Naive Bayes”
Getting Useful Information Out of Unstructured Text
Let’s say that you’re interested in performing a basic analysis of the US M&A market over the last five years. You don’t have access to a database of transactions and don’t have access to tombstones (public advertisements announcing the minimal details of a closed deal, e.g. ABC acquires XYZ for $500mm). What you do have is access to is a large corpus of financial news articles that contain within them – somewhere – the basic transactional details of M&A deals.
What you need to do is design a system that takes in this large database and outputs clean fields containing M&A transaction details. In other words, map an excerpt like this: Continue reading “Shallow Parsing for Entity Recognition with NLTK and Machine Learning”