## MLE, MAP, and Naive Bayes

Suppose we are given a dataset $X$ of outcomes from some distribution parameterized by $\Theta$. How do we estimate $\Theta$?

For example, given a bent coin and a series of heads and tails outcomes from that coin, how can we estimate the probability of the coin landing heads?
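For the coin example, both estimates named in the title have closed forms. What follows is a minimal sketch, assuming independent flips and, for MAP, a Beta(a, b) prior on the heads probability; the function names, the prior parameters, and the sample flips are made up for illustration.

```python
def mle_heads(flips):
    """Maximum likelihood estimate of P(heads): the observed fraction of heads."""
    heads = sum(1 for f in flips if f == "H")
    return heads / len(flips)


def map_heads(flips, a=2.0, b=2.0):
    """MAP estimate of P(heads) under an assumed Beta(a, b) prior with a, b >= 1.

    The posterior is Beta(heads + a, tails + b); its mode is the MAP estimate.
    """
    heads = sum(1 for f in flips if f == "H")
    tails = len(flips) - heads
    return (heads + a - 1) / (heads + tails + a + b - 2)


flips = list("HHTHHHTH")  # hypothetical outcomes: 6 heads, 2 tails
print(mle_heads(flips))   # 0.75
print(map_heads(flips))   # 0.7
```

With a uniform prior (a = b = 1) the MAP estimate reduces to the MLE, which is one way to see that the prior is what pulls the estimate toward 0.5 here.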

## Text Classification at Data Science Hackathon with DataKind

Last weekend I attended a DataKind data science hackathon. It was a lot of fun and a great way to meet people in the space and share some ideas. If it sounds the least bit interesting, I encourage you to join a DataKind event. Here’s what my team worked on, which should give a good indication of what you might do over the course of the weekend. My code is here: the project folder, supervised classification (the most interesting part), and topic modeling.

## A Few Nice Coding Challenges

A recent interview process required passing some coding challenges.

When I first started programming I spent a decent amount of time on Project Euler, but since then I have rarely done these crack-the-interview coding challenges. I find project-based work more interesting, I work mostly with data, and – based on what I understand from experienced interviewers – facility with brain teasers and coding challenges correlates less with good programming than time spent programming does. Anyway, I spent a few afternoons working through coding challenges on Codility to get a feel for the types of questions that get asked of software engineering candidates.

## Understanding Facebook Ads: Pros and Cons

I recently did some A/B testing work through the Facebook advertising platform, and gave a quick presentation on the pros and cons of the platform. Here’s a summary.

PRO

• Microtargeting
• Optimization
• Inexpensive, low ceilings
• Demonstrated to work at scale, sophisticated distribution

CON

• Click bots
• Opaque

To clarify my perspective on the platform, some background on the work we did:

We ran some A/B tests through the platform targeting a specific population, and evaluated the differences in resulting engagement for statistical significance. I assure you, nothing fancy.
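The post does not say which significance test was used. As one common choice for comparing engagement rates between two ad variants, here is a minimal sketch of a two-proportion z-test; the function name and the counts are hypothetical, not results from the tests described above.

```python
import math


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Two-sided z-test for a difference between two engagement rates."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal tail, via the complementary
    # error function.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts for illustration only.
z, p = two_proportion_z_test(clicks_a=120, n_a=10_000, clicks_b=165, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The normal approximation behind this test is only reasonable when each variant has a fair number of both engaged and non-engaged users; with very small counts an exact test would be the safer choice.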

## Decorators and Metaprogramming in Python

Decorators

Decorators are intuitive and extremely useful. To demonstrate, we’ll look at a simple example. Let’s say we’ve got some function that sums all the integers from 0 to n:

```python
def sum_0_to_n(n):
    # Accumulate n + (n - 1) + ... + 1 by counting n down to zero.
    count = 0
    while n > 0:
        count += n
        n -= 1
    return count
```

and we’d like to time the performance of this function. Of course we could just modify the function like so:
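The original listing is not included in this excerpt. A minimal sketch of such a modification, assuming `time.perf_counter` as the clock, is shown first; it is followed by a hypothetical `timed` decorator that gets the same measurement without touching the function body. The names `sum_0_to_n_timed` and `timed` are illustrative, not taken from the post.

```python
import functools
import time


def sum_0_to_n_timed(n):
    """The same loop with timing code mixed directly into the function body."""
    start = time.perf_counter()
    count = 0
    while n > 0:
        count += n
        n -= 1
    print(f"sum_0_to_n_timed took {time.perf_counter() - start:.6f}s")
    return count


def timed(func):
    """Hypothetical decorator: report how long each call to func takes."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        print(f"{func.__name__} took {time.perf_counter() - start:.6f}s")
        return result
    return wrapper


@timed
def sum_0_to_n(n):
    count = 0
    while n > 0:
        count += n
        n -= 1
    return count


print(sum_0_to_n(1000))  # 500500, preceded by the timing line
```

The decorator keeps the timing concern in one place, so it can be reused on any other function by adding `@timed` above its definition.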

## Shallow Parsing for Entity Recognition with NLTK and Machine Learning

Getting Useful Information Out of Unstructured Text

Let’s say that you’re interested in performing a basic analysis of the US M&A market over the last five years. You don’t have access to a database of transactions or to tombstones (public advertisements announcing the minimal details of a closed deal, e.g. ABC acquires XYZ for \$500mm). What you do have access to is a large corpus of financial news articles that contain within them – somewhere – the basic transactional details of M&A deals.

What you need to do is design a system that takes in this large corpus and outputs clean fields containing M&A transaction details. In other words, map an excerpt like this: