Can Robots Bake Cookies?

Claire Longo
4 min read · Mar 24, 2023


My adventures baking a recipe from ChatGPT.

Not my cookies :-P

Thumbprint cookies have always been my favorite. They are perfect with a cup of espresso to start the day. Because I love AI, and I love baking, I wanted to bake a recipe generated directly from ChatGPT. I was curious how the cookies would turn out. I wanted to know if we could truly trust a robot to tell us how to make a good cookie. Read on to see if we can! (I promise there is some real math to discuss after the cookies are baked.)

First, I asked ChatGPT for a recipe for chewy thumbprint cookies. The recipe passed the sanity check. It had similar ingredients and instructions to what I’ve seen online.

However, I live at high altitude. Because baking at altitude requires some adjustments, I’ve never been able to get the perfect cookie. They always come out too puffy or too crunchy for my liking (I love a flat, chewy cookie). So I asked ChatGPT to adapt the recipe for high altitude. It responded by adjusting the baking temperature and baking time. Thanks, ChatGPT! Let’s see if this works…

So with the recipe in hand, I gathered the ingredients.

Then I got baking. I made sure to follow the instructions exactly, so we can truly answer the question: “can robots make good cookies?”

ChatGPT thumbprint cookies for high altitude.

While my robot cookies may not be as aesthetically pleasing as I hoped, they certainly were delicious, and chewy just like I asked for!

Robot Cookies

So why is Generative AI so good at baking? I can break this down into two observations:

The Data:

A quick Google search for thumbprint cookie recipes returns many results to crawl through. When I ask ChatGPT instead, I like to think of its answer as an average of the results I’d get if I Googled it. This is because ChatGPT is trained on a large set of text data from across the web; the training dataset contains over 45 terabytes of text. That’s a lot of cookie recipes.

The Model’s Architecture:

The architecture of a neural network refers to the mathematical structure of how it is designed. Neural networks are made up of layers. Each layer is a mathematical transformation that encodes patterns discovered in the data, and those encoded patterns are then used to make predictions on new data. So the architecture of a neural network is defined by how these layers are constructed and connected.
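To make "layers" a little more concrete, here is a minimal sketch in plain Python (with made-up weights and toy sizes; this is nothing like ChatGPT's actual architecture) of two connected layers passing data forward:

```python
import random

def dense_layer(inputs, weights, biases):
    """One fully connected layer: a weighted sum per unit, then a ReLU nonlinearity."""
    return [max(0.0, sum(x * w for x, w in zip(inputs, unit)) + b)
            for unit, b in zip(weights, biases)]

random.seed(0)

# A tiny two-layer network: 4 inputs -> 8 hidden units -> 2 outputs.
w1 = [[random.gauss(0, 1) for _ in range(4)] for _ in range(8)]
b1 = [0.0] * 8
w2 = [[random.gauss(0, 1) for _ in range(8)] for _ in range(2)]
b2 = [0.0] * 2

x = [0.5, -0.2, 0.1, 0.9]                     # one toy input example
hidden = dense_layer(x, w1, b1)               # first layer encodes patterns in the input
output = [sum(h * w for h, w in zip(hidden, unit)) + b
          for unit, b in zip(w2, b2)]         # second layer maps them to a prediction

print(len(hidden), len(output))  # 8 2
```

How the layers are sized and wired together is exactly what "architecture" means.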

ChatGPT is a Generative model, meaning it is structurally designed to output new text based on the information given as the input text. Generative models learn to predict the next word in a sequence based on the previous words and context.
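As a toy illustration of "predict the next word from the previous words" (a simple bigram counter over a made-up corpus, nowhere near ChatGPT's actual model), the idea fits in a few lines of Python:

```python
from collections import Counter, defaultdict

# Toy corpus: the "model" learns which word tends to follow which.
corpus = "press the dough bake the cookies cool the cookies eat the cookies".split()

# Count how often each word follows each previous word (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent next word given the previous word."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # cookies
```

Real generative models condition on far more than one previous word, but the core task is the same: given the context so far, predict what comes next.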

This is done through a neural network architecture called the “transformer”. Transformers are particularly effective for Natural Language Processing tasks because they pair an encoder with a decoder. The encoder receives the input and compresses its patterns into internal representations (often called “hidden states”). The decoder then uses those encoded patterns as context to generate new output.
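A rough sketch of the attention operation at the heart of a transformer, in plain Python (heavily simplified: one attention step over a toy sequence, no learned weights, no multiple heads):

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: each query scores every key, the scores
    become weights, and each output is a weighted mix of the values."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
    return outputs

# Self-attention over a toy "sequence" of 3 token vectors, 2 dimensions each:
# the sequence attends to itself, so every position sees the whole context.
seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(seq, seq, seq)
print(len(out), len(out[0]))  # 3 2
```

This is why each generated word can depend on the entire input and everything generated so far, rather than only on the immediately preceding word.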

This allows the model to capture, interpret, and mimic the structure of natural language. This is why interactions with ChatGPT feel conversational, and it retains the context during the conversation.

The Generative process of ChatGPT can be mathematically represented as:

P(y | x) = ∏_{i=1}^{n} P(y_i | x, y_1, …, y_{i-1})

where x is the input sequence, y is the output sequence of length n, and y_1, …, y_{i-1} are the previously generated tokens that serve as context. So this equation can be read as “the probability of output y given input x is the product of the probability of each new token given the input and everything generated so far.”
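The factorization described above can be checked numerically with a toy example (the per-token probabilities here are made up purely for illustration):

```python
# Toy illustration of the product of per-token conditionals.
# Each entry plays the role of P(token_i | input and all previous tokens).
step_probs = [
    0.9,  # e.g. P("preheat" | prompt)
    0.8,  # e.g. P("the" | prompt, "preheat")
    0.7,  # e.g. P("oven" | prompt, "preheat", "the")
]

# P(y | x) is the product of the per-token conditional probabilities.
p_sequence = 1.0
for p in step_probs:
    p_sequence *= p

print(round(p_sequence, 3))  # 0.504
```

Note how the whole-sequence probability shrinks as the sequence grows: each additional token multiplies in another factor below 1.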

This robot knows how to bake a good cookie!
