Big Data, Small Batches

By: CJ, 16 July 2018

Welcome to data-brew! This inaugural post should give you a sense of the site's purpose and structure. You might find that I hit you up for money toward the end. Sorry.


We are in the age of big data. When you use a social media app like Facebook or a website like Amazon, data generated by your clicks, views, likes, and posts is fed into sophisticated machine learning algorithms designed and deployed by data scientists. These algorithms find complex patterns in the massive quantities of information they process, allowing your preferences and purchasing behavior—even your political leanings or sexual orientation—to be predicted. Targeted content and advertising can then be sent your way. This is profoundly creepy (for instance), but it's also an extremely powerful way to make accurate, useful predictive tools.

We are also at a point where homebrewers have turned an experimental, scientific eye to their hobby. Projects like Brulosophy and Experimental Homebrewing use split batch experiments to question homebrew gospel, finding that many common rules, principles, and best practices are without basis (so don't bother with carapils), while frequently overlooked factors play major roles in beer flavor.


What would happen if we combined these two trends, using data science techniques to enhance the analytical vision we apply to homebrewing? What can we learn if we throw big data at small batches? Let's collect data on hundreds, thousands, or millions of homebrew batches. Let's use machine learning and statistical methods to find patterns in that information and turn our knowledge into tools we can use to make better beer. Don't let Facebook and Amazon have all the fun.

Who is this Asshole?

If you're interested, my name is CJ and I live in Nashville, TN with my lovely and amazing wife. I started homebrewing in the summer of 2015 and brew mostly English and Belgian ales.

I have no formal training in data science or related fields like software engineering, statistics, or business intelligence. My academic background is actually in the humanities (I have a PhD in philosophy), but I've taught myself enough to build this site and the tools therein. Part of the reason I made the site is to give myself an end-to-end data science project on which to learn, so this is a work in progress. Forgive the mess. Still, I drank a Westvleteren XII once, so I must know what I'm doing.


Site Structure

The site is still under construction, so you might notice some changes in the near and distant future. Hopefully, the sections of the site are self-explanatory. The terms of use are here. They're pretty standard, I think. Basically, use at your own risk and don't blame me for anything. If you just want to use the calculator applications, go here. I hope to add more as time goes on.

I'd be happy to hear from anyone interested in the project. There's a contact form here, which can also be used for general inquiries. If you want to submit brew data, you can just email it to me. Try to include as much information as you can. Eventually, I'll stop being lazy and build a form through which people can automatically submit batch data directly to a database. Recipe databases like Brewtoad and Brewer's Friend are great, but often lack crucial information (like measured FG) required to build predictive tools. I think we can do better.

The blog section will include a few different categories of post. First, there will be updates about new developments. If something major changes (a new tool is added, I switch to wine coolers and shut down the site, etc.), I'll post a notice there. Second, I plan to add explanatory/informative posts about both brewing and programming/data stuff. When I uncover something interesting in the batch data, I'll let you know. Third, I'll periodically post brew day rundowns and tasting information on my own batches. I read a number of homebrew blogs and I love the brew day posts! It's always interesting to me to see how other homebrewers approach their process, and fair's fair. Finally, when I'm feeling bold I might also post some 'editorial' content, in which I pontificate on beer or brewing in some obnoxious way.

You'll notice there's a donate link under the 'contribute' section of the menu. (This is the part of the post where I beg you for money.) I'll need to pay for things like domain registration and web hosting, and I'd like to avoid putting ads on the site, so I'd appreciate any help you can give to offset the costs. (End of begging.)

For the Nerds

For those of you interested, the toolkit I'm using is pretty standard for projects of this type. Web-scraping, data cleaning, exploratory data analysis/visualization, and machine learning are done in Python and R. At the moment, I'm deploying the resulting models by translating them into JavaScript, though that will probably change in the future. I'm using Django for the site's backend. For the frontend, I started with an open-source Bootstrap theme available here, which I extended and adjusted a bit.
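If you're wondering what "translating a model into JavaScript" might look like, here's a minimal sketch, assuming the model is a plain linear regression. It is not the site's actual code: the function name `linear_model_to_js`, the feature names, and the coefficients are all hypothetical placeholders I made up for illustration, not fitted values.

```python
import json


def linear_model_to_js(func_name, intercept, coefficients):
    """Return JavaScript source for a function that evaluates a linear model.

    `coefficients` maps feature name -> weight. The generated JS function
    takes a single object whose keys are those feature names.
    Assumes all numbers are finite (Python's 'inf'/'nan' are not valid JS).
    """
    # Start with the intercept, then add one "weight * x[feature]" term each.
    terms = [repr(float(intercept))]
    for name, weight in coefficients.items():
        # json.dumps gives a safely quoted JS string literal for the key.
        terms.append(f"{float(weight)!r} * x[{json.dumps(name)}]")
    body = " + ".join(terms)
    return f"function {func_name}(x) {{\n  return {body};\n}}"


if __name__ == "__main__":
    # Placeholder numbers only; a real model would supply fitted values.
    js_source = linear_model_to_js(
        "predictFG",
        intercept=0.5,
        coefficients={"og": 0.25, "mash_temp_f": 0.002},
    )
    print(js_source)
```

The idea is that the model gets fit offline in Python or R, and only the learned numbers ship to the browser as an ordinary function, so the calculators can run client-side without calling back to the server. Anything fancier than a linear model would need a more involved translation.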