Week notes — 5th September 2024

Jimmy Tidey
3 min read · Sep 5, 2024


I’ve finished the lectures for Stanford’s Deep Learning for Natural Language Processing course. The program also requires a final project deploying BERT; I’m working out the right way to tie that project into my broader research interests.

My goal in doing CS224N is to get a deeper design understanding of AI. This weeknote discusses some of the lessons I took from the course, viewed from a design research and UXR perspective.

My notes and assignments are here.

The history

One lesson from the course is the deep history of NLP. It’s easy to forget that it has been a research area for decades, making slow and steady headway before vertiginous progress took hold from about 2013. As early as 1954, researchers thought they’d cracked machine translation; they were very much wrong.

Then, in 2013, Word2Vec came out and demonstrated a vectorised approach to representing words. In 2016, Google Translate started using a neural translation model based, in part, on that vectorised approach. It was a significant moment because Google Translate’s previous system embodied state-of-the-art non-neural statistical methods and decades of research into dependency parsers.

In 2017, the Transformer architecture cracked the problem of handling long-range context in neural networks. BERT demonstrated the power of Transformers in 2018, and in 2019 another Transformer, GPT-2, came out, just six years after Word2Vec. GPT-2 isn’t as polished as the generative AI we’re used to now, but it’s the same type of beast. GPT-2 and BERT are ‘foundation models’: single models that are highly effective at many classic NLP tasks. They swept away the menagerie of handcrafted approaches previously used for coreference resolution, entity extraction, topic modelling, code generation, question answering, and so on.
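To give a flavour of what ‘vectorised’ means in practice: Word2Vec represents each word as a list of numbers, so that similarity between words becomes simple geometry. Here’s a toy sketch in Python; the three-dimensional vectors below are invented for illustration, whereas real Word2Vec embeddings are learned from text and have hundreds of dimensions:

```python
# Toy illustration of the 'vectorised approach': words as vectors,
# similarity as geometry. These vectors are made up for the example.
import numpy as np

embeddings = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.7, 0.9]),
    "cat":   np.array([0.1, 0.9, 0.5]),
}

def cosine_similarity(a, b):
    # 1.0 means the vectors point the same way; values near 0 mean unrelated.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # higher
print(cosine_similarity(embeddings["king"], embeddings["cat"]))    # lower
```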

Stanford, and Chris Manning himself (who leads the course), were part of that history, and going through the CS224N lectures brought a visceral sense of the uneven pace of progress.

Understanding that history can help bring another perspective to designing for AI products. While many of the older models have been displaced, the technical language, approaches to benchmarking and conceptualisation of the problem space are still informed by NLP’s history.

Maths & Code

Many products will use an API to interact with Gemini or ChatGPT, and product designers and UXRs might not need to know what’s going on under the hood. But I think it’s still useful to do some hands-on coding with an LLM running locally, and to understand some of the underlying maths too.
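As a minimal sketch of what that hands-on coding can look like, here’s GPT-2 (discussed above) running locally via Hugging Face’s transformers library; the prompt and settings are just illustrative:

```python
# Minimal local text generation with GPT-2 via Hugging Face transformers.
# Assumes `pip install transformers torch`; the model weights download
# once, after which everything runs offline on your own machine.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

result = generator(
    "Design research in the age of AI",
    max_new_tokens=40,       # cap the length of the generated continuation
    num_return_sequences=1,  # one sample is enough for a quick look
)
print(result[0]["generated_text"])
```

Poking at a small model like this makes the probabilistic, sometimes wobbly nature of generation far more tangible than calling a polished API ever does.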

There may be some domains where there are better solutions than calling the Gemini or ChatGPT APIs, perhaps specific combinations of multimodal AI, or working with data that isn’t text or images. Further, there are still reasons to implement an LLM yourself even where an API is available, perhaps most prominently to create a stable, ‘explainable’ model, or for applications that cannot rely on an internet connection.

As a designer or UXR, you may not be implementing the model, but it’s still useful to be able to understand and discuss the technical architecture.

Reading the research

I’m not at a level where I could evaluate a highly technical paper, but understanding the maths and the history to a reasonable level is enough to contextualise the key claims and methods in research papers. That’s handy if you want to design or research products and features with AI affordances.

Where next?

NLP is an area where I had some previous experience, so it was a logical jumping-off point for developing deeper AI skills. For my next step, I’d like to explore more diverse areas; right now, I’m looking at how AI is applied in simulations for mechanical engineering. It connects to my interest in engineering, has fascinating implications, and is a refreshing contrast with NLP.
