Week 1, Lecture 1 - Introduction

Date: 9/26/22

  • Course content is delivered in modules and in lecture (which goes over the module content in an interactive manner).
  • During lecture, Percy and Dorsa will walk us through the lecture slides from the module page.

Course content

  • How can we go from code → self-driving cars? We need a plan of attack. Here we introduce an AI paradigm consisting of three components:
    1. Modeling
    2. Inference
    3. Learning

picture 2

  • Modeling: i.e., how we go from a complex problem (e.g., lane changing) to a clean mathematical representation that we can work with.

picture 3

  • However, modeling is inherently lossy: we lose information due to the complexities of the world. Not all of the richness of the real world can be captured, so there is an art to modeling: what does one keep versus ignore?

  • Inference: answer questions with respect to the model. Example: given the city model, what is the shortest path? Essentially a mathematical exercise where we use algorithms to solve the problem (a minimal sketch appears below).

picture 4
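To make inference concrete, here is a minimal sketch of shortest-path inference on a toy city model. This is not the course's code: the graph, place names, and edge costs below are invented, and Dijkstra's algorithm is just one standard choice:

    import heapq

    def shortest_path(graph, start, goal):
        # Dijkstra's algorithm: find the minimum-cost path in a weighted graph.
        frontier = [(0, start, [start])]  # (cost so far, node, path taken)
        visited = set()
        while frontier:
            cost, node, path = heapq.heappop(frontier)
            if node == goal:
                return cost, path
            if node in visited:
                continue
            visited.add(node)
            for neighbor, edge_cost in graph[node]:
                if neighbor not in visited:
                    heapq.heappush(frontier, (cost + edge_cost, neighbor, path + [neighbor]))
        return None  # no path exists

    # Toy "city model": nodes are locations, edge weights are travel times (made up).
    city = {
        "home":   [("store", 4), ("park", 2)],
        "park":   [("store", 1), ("office", 7)],
        "store":  [("office", 3)],
        "office": [],
    }
    print(shortest_path(city, "home", "office"))  # (6, ['home', 'park', 'store', 'office'])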

  • Learning: the real world is too complex for us to write down a full, accurate model by hand. So instead, we can show the computer data, and the model is learned implicitly from that data.
    • i.e., we write down a model architecture (a model “family”), and the values (e.g., edge costs) are found by learning; see the sketch below.

picture 5
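As a toy illustration of “fix the model family, learn the values” (a generic gradient-descent sketch with made-up data, not anything from the lecture): suppose the model family says travel time = w * distance, and we learn the single parameter w from observed trips:

    # The model family is fixed (time = w * distance); only the parameter w
    # (the "edge value") is learned from data. The data points are made up.
    data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (distance, observed travel time)

    w = 0.0              # initial guess for the parameter
    learning_rate = 0.05
    for epoch in range(100):
        for distance, observed in data:
            predicted = w * distance
            gradient = 2 * (predicted - observed) * distance  # d/dw of squared error
            w -= learning_rate * gradient

    print(round(w, 2))  # close to 2, the slope implied by the data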

  • Note: you probably think of neural networks when we talk about modeling, but in actuality it is much broader; the idea of learning is not tied to a particular model family (e.g., neural networks). Rather, it is a philosophy of how to produce models.

  • Rest of course: use this paradigm to understand modeling strategies, from low-level to high-level.

  • Machine learning: data → model.

  • State-based models: procedural; take steps from state to state to “search” for a solution.

  • Variable-based models: assignment; assign values to variables in a graph of constraints/dependencies (e.g., a Bayesian network, essentially a PGM); see the sketch below.
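A hedged sketch of a variable-based model: tiny map coloring posed as a constraint satisfaction problem and solved by backtracking (the regions, adjacencies, and colors are invented for illustration):

    # Assign a color to each region so no two adjacent regions share a color.
    REGIONS = ["A", "B", "C", "D"]
    ADJACENT = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")}  # made-up map
    COLORS = ["red", "green", "blue"]

    def consistent(region, color, assignment):
        # A color is allowed if no already-assigned neighbor uses it.
        return all(assignment.get(other) != color
                   for other in REGIONS
                   if (region, other) in ADJACENT or (other, region) in ADJACENT)

    def backtrack(assignment):
        if len(assignment) == len(REGIONS):
            return assignment
        region = next(r for r in REGIONS if r not in assignment)
        for color in COLORS:
            if consistent(region, color, assignment):
                result = backtrack({**assignment, region: color})
                if result is not None:
                    return result
        return None  # dead end: no color works for this region

    print(backtrack({}))  # {'A': 'red', 'B': 'green', 'C': 'blue', 'D': 'red'}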

Course plan chart

picture 6


AI History

Next module.

  • Topics in this course have long and complex history.
  • Turing test: a machine is deemed intelligent if it can fool a human interrogator into thinking it is human.
  • Three stories of the birth of AI.
  • Birth of AI (1956): John McCarthy organized the Dartmouth conference that summer.
    • Proposed challenge: build artificial intelligence with computers. Idea: “Every aspect of learning or any other feature of intelligence can be so precisely described that a machine can be made to simulate it.”
    • Produced some good results, like Samuel’s checkers program.
    • AI summer: intelligence was expected to be solved within about 10 years.
    • But of course: underwhelmed… no AGI produced!
    • Problems:
      • Not enough compute
      • Didn’t understand the complexity of the problem (no NP-completeness theory yet!)
  • 1. Symbolic AI: knowledge-based systems (1970s+):
    • Idea is to encode explicit domain knowledge in the form of rules (a toy sketch of such a rule system appears after this list):
    if [premises] then [conclusion]
    • Commercial success: medical diagnosis systems
      • Knowledge helped work around the limited-compute problem
      • But the world is too complex to model completely with this approach.
  • 2. Neural AI: neural models (1943, 1980s):
    • McCulloch and Pitts: made a mathematical model of neurons. What can it learn?
      • Note: a neurophysiologist and a logician working well before the age of modern computing: no intention to build AGI.
    • Lots of advances until 1969, when Minsky and Papert showed that XOR is not computable by a single-layer (linear) perceptron. Killed NNs for the time being (see the XOR sketch after this list).
    • However, the field picked back up again with the rise of connectionism (explaining cognitive phenomena using neural networks).
      • 1980: Fukushima’s neocognitron CNN architecture.
      • 1986: Rumelhart, Hinton, and Williams popularize backprop.
      • 1989: LeCun applies backprop learning to Fukushima’s CNN architecture.
    • Didn’t really pick up speed until the 2000s and 2010s, when compute (GPUs) arrived and networks could be made “deep”er.
      • The AlexNet moment (2012)!
  • 3. Statistical modeling:
    • Bayesian networks, SVMs, etc.
    • Provides theoretical grounding
    • Goes back to Gauss (before AI)
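As referenced under symbolic AI above, here is a minimal sketch of a knowledge-based system: forward chaining over if-then rules (the rules and facts are invented; this is not any real expert system):

    # Keep applying "if premises then conclusion" rules until no new facts appear.
    rules = [
        ({"fever", "cough"}, "flu"),
        ({"flu", "fatigue"}, "rest_recommended"),
    ]
    facts = {"fever", "cough", "fatigue"}

    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)  # the rule "fires": add its conclusion
                changed = True

    print(facts)  # now includes 'flu' and 'rest_recommended'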
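And on the XOR point above: a linear (single-layer) model cannot represent XOR, but a tiny two-layer network with hand-picked weights can. This is the standard textbook construction, not code from the lecture:

    # XOR(a, b) = OR(a, b) AND NOT AND(a, b), built from threshold units.
    def step(z):
        return 1 if z > 0 else 0

    def xor(a, b):
        h1 = step(a + b - 0.5)      # hidden unit 1: OR(a, b)
        h2 = step(a + b - 1.5)      # hidden unit 2: AND(a, b)
        return step(h1 - h2 - 0.5)  # output: h1 AND NOT h2

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", xor(a, b))  # truth table: 0, 1, 1, 0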

Next module…


Ethics and responsibility

  • High-level principle: ensure AI is developed to benefit and not harm society.
  • However, it is tough to operationalize these principles in the real world.
  • Two axes: intent vs. impact.

picture 7

  • There are many levels of abstraction, so it can get quite complex to determine what is right/wrong.
    • Downstream impact.
  • Accidents:
    • Good intent, bad results
      • Comes from the gap between the model and the real world: we compress reality into a lossy model and then deploy that lossy version back in the real world. The lost information can cause problems!
      • “Misalignment between real-world objective and system’s objective.”
    • Optimizing the wrong objective function
      • e.g., the Facebook echo chamber: optimizing for user feedback/engagement can cause political isolation.
    • Fairness: performance disparities between different groups.
    • Robustness: spurious correlations
      • Example: train a classifier to detect a collapsed lung. But x-rays of collapsed lungs usually show a chest drain device (the treatment), so the model really just learns to find the chest drain rather than a pathological indicator.
    • Security: adversaries (sticker on stop sign for autonomous driving).
  • Think about the task setup: e.g., don’t make gender classification algorithm.
  • Feedback loops: bad algorithmic paradigms (as explained above) and bad data make for a bad loop.
  • Data is of course controversial: dataset biases, imbalanced classes, legal battles over copyright ownership, labor problems for labeling data.

So what do we do?

  • Transparency: document uses, the model, etc.
    • e.g., model cards
  • Choosing the right problems

Summary:

picture 3