Tech Refactored

Summer Staff Favorites: The Alignment Problem with Brian Christian

July 15, 2022 Nebraska Governance and Technology Center Season 2 Episode 48

Tech Refactored is on a short summer vacation. We can't wait to bring you Season Three of our show, beginning in August 2022, but as we near 100 total episodes our team needs a beat to rest and recharge. While we're away, please enjoy some summer staff favorites. The following episode was originally posted in February of 2022.

Bestselling author Brian Christian joins the podcast to discuss machine learning and his latest book, The Alignment Problem. Brian and Gus cover the alignment problem - when what we intend machines to learn isn't what they actually do - and the many challenges and concerns surrounding artificial intelligence and machine learning. This episode was the first featuring a speaker from this semester's speaker series, part of the Nebraska Governance and Technology Center's Fellows Program. Coming later this season are Christopher Ali on rural broadband and Anita Allen on race and privacy.

00:00:00 Gus Herwitz: This is Tech Refactored. I'm your host, Gus Herwitz, the Menard Director of the Nebraska Governance and Technology Center at the University of Nebraska. Today we're joined by Brian Christian, the author of The Most Human Human, which was named a Wall Street Journal bestseller, a New York Times Editors' Choice, and a New Yorker favorite book of the year.

00:00:39 Gus Herwitz: He's also the author, with Tom Griffiths, of Algorithms to Live By, a number one Audible bestseller, an Amazon Best Science Book of the Year, an MIT Technology Review Best Book of the Year, and, personally, one of my favorite books. And, uh, we're going to be talking about his third and latest book, The Alignment Problem, which has recently been published in the United States and is forthcoming around the world in the coming years.

00:01:05 Gus Herwitz: And Brian is also a visiting scholar at the University of California, Berkeley. Brian, thanks for joining us.

00:01:12 Brian Christian: It's my pleasure. Thanks for having me.

00:01:14 Gus Herwitz: So we're talking about your book, The Alignment Problem: Machine Learning and Human Values. And I, I just have to start by asking what is the alignment problem?

00:01:24 Brian Christian: So the alignment problem is an idea that goes back to the famous MIT cyberneticist Norbert Wiener, um, who already back in 1960 was having these very prescient concerns about our relationship to technology.

00:01:40 Brian Christian: And he has a famous quote that says, "If we use, to achieve our purposes, a mechanical agency with whose operation we cannot interfere once we've set it going, then we had better be quite certain that the purpose we put into the machine is the thing that we truly desire." And that is the essence of what has come to be known as the alignment problem.

00:01:59 Brian Christian: And this is an idea specific to the field of machine learning, namely software systems that are trained implicitly by examples rather than being explicitly programmed. The question is, do these systems really learn the things we think they're learning and are they going to behave in the way that we expect and will they do what we want?

00:02:21 Brian Christian: And so that has come to be known as the alignment problem: are their objectives aligned with our intentions? And why should we be, or how concerned should we be, about this as a problem? I think it has very quickly gone from being a philosophically salient question to being, I think, one of the most pressing and urgent practical questions facing us in the world.

00:02:51 Brian Christian: We have really seen in the last decade in particular, although it starts earlier than that, uh, a breathtaking adoption of machine learning systems in just about every aspect of our lives, whether it's the newsfeed algorithms that are filtering and amplifying, you know, the speech within our country, whether it's the autonomous vehicles increasingly on our streets, whether it's the risk assessment algorithms that are more and more part of the criminal justice system, determining things like pretrial detention and parole. Machine learning systems are, I think, really coming to replace not only human expertise and human judgment, but also cases where we were previously using sort of explicit, declarative software.

00:03:46 Brian Christian: We now use the sort of implicit, automatically learned software. And so, yeah, to answer your question, I mean, I think the alignment problem is very much already here. We see it in cases where there are wildly disparate error rates for facial recognition, depending on your gender and your skin color.

00:04:07 Brian Christian: There are autonomous vehicles that are getting into collisions because they can't recognize someone crossing the street. And this is really just the tip of the iceberg. And so, yeah, I think this set of questions around whether these systems are really doing what we think they're doing, and really learning what we think they're learning, has in the span of, you know, about five or six years gone from the fringes of the field to today comprising, I think, arguably the central question of the field of AI.

00:04:38 Gus Herwitz: So I like how you, um, frame that. We've used computer software now for decades, and historically it's been programmed. So you use a spreadsheet, you use Microsoft Excel, and we understand what cell A1 plus cell B1 equals cell C1 means; we understand what those operations are, and they're being explicitly defined.

00:05:00 Gus Herwitz: And with machine learning, we're asking a machine to look at our spreadsheets, effectively. It's much broader than spreadsheets, but to say, oh, column A and column B, they seem to be combining with this addition function to give us column C. So we're just gonna say column C is addition, or it learns what addition is by looking at columns A and B. Which means

00:05:25 Gus Herwitz: at least two things. First, we're not expressly telling it what addition is; it's figuring it out. And also, it's figuring it out based upon the examples that we're giving it. So it might be learning things different from what we intended it to learn.
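
To make the spreadsheet analogy concrete, here is a minimal sketch (not from the conversation) of a model inferring "column C" purely from examples of columns A and B, using ordinary least squares in Python; the data and variable names are invented for illustration.

```python
import numpy as np

# Hypothetical "spreadsheet": columns A and B, and the observed column C.
# We never tell the model that C = A + B; it has to infer that from examples.
A = np.array([1.0, 2.0, 5.0, 7.0, 10.0])
B = np.array([3.0, 4.0, 1.0, 2.0,  6.0])
C = A + B  # the examples we show the learner

# Fit C ~ w1*A + w2*B via least squares.
X = np.column_stack([A, B])
weights, *_ = np.linalg.lstsq(X, C, rcond=None)
print(weights)  # ~[1.0, 1.0]: the model has in effect "discovered" addition

# But it only learned what the examples imply; on inputs unlike the examples,
# or if the true rule were different elsewhere, there are no guarantees.
print(X @ weights)  # reproduces C on the training examples
```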

00:05:41 Brian Christian: That's right. And, you know, in your example, it learns addition, which is probably what we were hoping for, but there are other cases where it's not even clear, at least initially, what the system has learned. So you see this in robotics, for example: if you stack a set of blocks and you ask the, you know, robot to do that, and now it encounters a slightly different configuration of blocks.

00:06:08 Brian Christian: So there's some ambiguity. Is it supposed to do literally the exact same motor movements that it did the first time? Is it supposed to move the block to the same location in absolute space, or in relative space, relative to the other block? There's a lot of latent ambiguity, even in a seemingly simple task like that. And so this presents a problem, and you can sort of imagine how that only scales up when you're talking about something like, you know, driving through a city or whatever.

00:06:38 Gus Herwitz: Yeah. So let's take an example. I've watched far too many videos on YouTube about this, and I should just say, if you go to YouTube and search for machine learning videos, you can find all sorts of fascinating stuff, but consider programming an AI to complete a game like Super Mario Brothers. There are a few things going on there, a few ways that the machine can think about what it's doing: complete the game in as short a time as possible, or with as many points as possible.

00:07:09 Gus Herwitz: Those are two possible, uh, things, but the question that always puzzles me is what happens if the machine learning algorithm has figured out its way through the first level, uh, level 1-1 of Super Mario Brothers, the one that everyone knows really well. What happens if you change the position of a pipe, just move it over by one block, or add one enemy? How is that going to affect the machine learning algorithm?

00:07:37 Brian Christian: Well, there's a particular failure case that machine learning systems are prone to, which is called overfitting. And so this would be a case where, you know, you might find, for example, that the system that has learned to play level 1-1 is in effect memorizing the button inputs.

00:07:58 Brian Christian: And so suddenly you add one additional enemy or you move one pipe and the inputs totally break, because now you're running into the enemy or now you're missing the jump. And so what you really want is a program that can, what is called, generalize, and be, quote unquote, "robust" to those sorts of things.

00:08:20 Brian Christian: And so your research team, instead of training it always on the same level, might train it on some kind of procedurally generated level where the pipes are always moving around a little bit or the enemy placement is randomized. And that can start to give you confidence that your system is flexible enough, or that it can learn strategies that are more generally applicable. But then you throw it in a water level where the physics doesn't even work the same way, and presumably all bets are off all over again.

00:08:55 Brian Christian: And so that is the kind of thing that keeps machine learning researchers up at night: you know, what is the implied distribution of different levels that this thing has learned in? And are we actually going to deploy it in a situation that stays within that comfort zone, or is it going to encounter something it's never seen?
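
As a purely illustrative sketch of the overfitting failure described above - not any real Mario-playing system - the toy below contrasts an agent that memorizes a fixed button sequence for one level with a policy that reacts to what it observes; the level format and both agents are invented.

```python
# Toy 1-D "platformer", invented to illustrate memorization versus
# generalization. The agent walks left to right and must JUMP exactly on the
# squares that contain a gap, otherwise it falls.
LENGTH = 20

def run(level_gaps, actions):
    """Return True if the action sequence clears the level."""
    for pos in range(LENGTH):
        need_jump = pos in level_gaps
        did_jump = actions[pos] == "jump"
        if need_jump != did_jump:
            return False  # fell into a gap, or jumped when it shouldn't have
    return True

# "Overfit" agent: it has memorized the exact button presses for one level.
train_level = {3, 8, 15}
memorized = ["jump" if p in train_level else "run" for p in range(LENGTH)]

# "Generalizing" policy: it reacts to what it observes (is there a gap here?),
# which is the kind of behavior randomized training is meant to encourage.
def policy(pos, level_gaps):
    return "jump" if pos in level_gaps else "run"

# Move one gap by a single square, like nudging one pipe in level 1-1.
shifted_level = {3, 9, 15}

print(run(train_level, memorized))    # True: perfect on the training level
print(run(shifted_level, memorized))  # False: the memorized inputs break
print(run(shifted_level, [policy(p, shifted_level) for p in range(LENGTH)]))  # True
```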

00:09:14 Gus Herwitz: So this set of concerns, the generalizability sort of concerns, that goes to questions about how much should we be relying on artificial intelligence or machine learning in general situations?

00:09:32 Gus Herwitz: So the Tesla example, or the autonomous vehicles example: it has a very low error rate when it's driving in familiar conditions; you put it in unfamiliar conditions and you don't know what it's going to do. There's an entirely different set of concerns, though, that we might, uh, have with machine learning, which is that it reveals stuff about us that we might not like, or, to go to the title of your book, machine learning doesn't share human values.

00:10:04 Gus Herwitz: So you have the example early in the book, uh, where you talk about language math. Could you just explain what language math is and we can discuss it a little?

00:10:15 Brian Christian: Yeah. So there was a very riveting and intriguing development in computational linguistics in the early 2010s that focused on what are called word embeddings. And the basic idea of a word embedding is that you use machine learning to create a kind of high-dimensional space. It's actually, you know, hundreds of dimensions, but for mere human visualization purposes we can think of it as three dimensions: a space in which you sort of locate each word in the language at some point in space.

00:10:54 Brian Christian: And the idea is that their spatial relationships have some bearing on the similarities in terms of the patterns with which you see those words used in the language. So similar words are going to appear clustered together, and that sort of thing. Well, there was a particular breakthrough in the study of word embeddings when someone realized that these points in space could be used to do essentially word math, as you described it.

00:11:22 Brian Christian: So you can have something like, you know, swim minus swam plus ran, and it'll say "run." So these grammatical features can be captured by the word math. You can also capture gender as a feature within the word math. So the most famous example is king minus man plus woman, and it returns a point in space which happens to be near the word queen.
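
For readers who want to try the "word math" themselves, here is a minimal sketch using gensim and a set of pretrained GloVe vectors; it assumes gensim is installed and the "glove-wiki-gigaword-50" vectors can be downloaded, and any other pretrained embedding exposing most_similar() would work the same way.

```python
# Word-vector arithmetic: add and subtract word vectors, then look for the
# nearest words to the resulting point in the embedding space.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # a modest one-time download

# king - man + woman  ->  a point in space near "queen"
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# swim - swam + ran  ->  a point near "run" (the grammatical analogy)
print(vectors.most_similar(positive=["swim", "ran"], negative=["swam"], topn=3))
```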

00:11:54 Gus Herwitz: That alone, this frankly sounds like intelligence. This sounds really remarkable. And you can just imagine talking to elementary school kids; perhaps kindergartners and first graders would have trouble with this, and you could see first, second, third graders starting to understand what's going on.

00:12:13 Gus Herwitz: And then you get more sophisticated and start getting into the philosophy of language- okay maybe a fifth or sixth grader, not philosophy of language level, but start playing really fun, interesting word games.

00:12:24 Brian Christian: Yeah. I, I have been thinking about, there's sort of an untapped potential here for really creative video games where you could imagine having some word that you're starting with and some target word, and your goal is to kind of add and subtract words until you get near the, the point in space that you're trying to get to.

00:12:45 Brian Christian: Um, yeah, and I think, you know, there's a lot of potential. So we've seen this used in things like machine translation, where you can use the word embeddings, uh, which are kind of remarkably similar from one language to another, as a medium for translating words from one language to another. You can also use word embeddings for things like search result relevance.

00:13:11 Brian Christian: You know, you search for something and Google says, okay, well, the exact phrase that you used doesn't appear on this page, but the words on this page are very near the point in this embedding space where your query exists, and so this is probably relevant to you, even if it's a different set of words. So it's very powerful, and we've seen it within the span of just a few years go from peer-reviewed white papers to actual products, such as Google Search and Google Translate.

00:13:40 Brian Christian: However, there is a pretty big problem as you've foreshadowed, which is that these word embeddings capture, not only what we would sort of consider to be the objective or, you know, proper features of the language, but they also include stereotypes.

00:13:58 Brian Christian: So there was a group of researchers in Boston that discovered that if you do doctor minus man plus woman, you get back a point in space that's close to the word nurse. And we would object and say, you know, now wait a minute, there's a difference here between king and queen, which is not the difference between a doctor and a nurse; the model appears to have captured this stereotype. Mm-hmm.

00:14:26 Brian Christian: And so what are we going to do about that? And I think there's a very interesting research dimension to this question of how do we try to tease apart the things in the language that we want the model to learn and don't want the model to learn, but there's also a real cautionary tale for the adoption and deployment of these systems.

00:14:45 Brian Christian: So Amazon, for example, was using a system kind of like this to help them filter resumes for job applicants. At the very least, they'd built something internally and they were trialing it. Uh, and the idea was that you have some engineering position, or whatever position, and you get more resumes than you could go through; it's overwhelming.

00:15:10 Brian Christian: And so you use one of these word embedding models to say, okay, which of these resumes tend to contain the most, you know, quote unquote relevant terms, as measured by spatial proximity to the resumes of the engineers that you've hired in the past? Well, you find all sorts of problematic gender associations here, where the initial version of the model, uh, was penalizing job applicants who had the word "women's" on their resume, because it had discovered that few of the actual engineers that they had hired had the word "women's" on their resume.

00:15:45 Brian Christian: So, uh, if you went to a women's college or you played women's field hockey, or, you know, whatever, it would down-rank your resume for this engineering position. And to make a long story short, they tried a number of different techniques to try to de-bias the model. The model was picking up not only the word "women's," but was also picking up differences, like between sports that were more commonly played by women and those by men. It was picking up turns of phrase that were more typical of male applicants. The example that I remember was "executed" and "captured," these sort of militaristic verbs for business: we executed the strategy and captured the market. It would give you bonus points for talking in that way.

00:16:34 Brian Christian: And ultimately the team decided they could not confidently purge all of these problematic associations from the model, and they scrapped the model. But I think this is a great example of how uncareful use of machine learning, um, in the real world could not only exhibit some of the same stereotypes and biases that people have, but essentially put them on autopilot.

00:17:00 Brian Christian: You know, if people weren't paying attention, this could have become part of the hiring pipeline. And then suddenly no female engineers are getting their resume viewed by the recruiters. And no one really knows why. So this is an example of a feedback loop where a system like that might not merely reflect the bias that exists in society, but might actually, uh, perpetuate it or exacerbate it.

00:17:25 Gus Herwitz: And it's also a great example of the concept - we use this term all the time - of deep learning. We think, oh, there's gender bias, and the machine is learning that Amazon historically has hired more men, so it's learning from that. So intuitively you think, okay, there's a checkbox on the application, male or female, and it's preferencing applications from male candidates. And no - you could try to scrub that information off of the application entirely, not give the ML algorithm access to demographic information, obviously, but there are a lot of other indicia that we might not even recognize that are being discovered. I've got a couple of follow-up questions just on this example.

00:18:10 Gus Herwitz: I'll start with: in a sense, isn't this a really useful thing, or almost even a good thing? The algorithm has identified this bias that we might not even have been aware that we had. So sure, we don't want to put the, uh, algorithm on the front lines making the decisions, but perhaps we want to run our decisions through these algorithms, uh, and then throw a test set at them to discover, oh, we have these biases that we didn't know about. Is that something that we can be doing?

00:18:45 Brian Christian: Yes. In fact, I think that there is a real boon here for computational social science that uses some of these machine learning tools as instruments for quantifying some of what might previously have been considered qualitative aspects of a culture. We've seen some really interesting results of people using these embeddings and tracking: do these word embeddings correlate with things like the implicit association test, which is a well-known, uh, measure of implicit bias in the social sciences?

00:19:20 Brian Christian: And it turns out there's a very tight correlation between the word embeddings and the IAT. People have asked, does the word embedding system capture some of the quote unquote "veridical bias" in the world? Is the term steel worker, let's say, correlated to one gender over another to the degree that, according to, you know, US Department of Labor statistics, that gender is overrepresented in that profession?

00:19:53 Brian Christian: And it turns out that they- the correlation is quite strong there. And I think this is very powerful because it gives you an instrument for tracking the direction in which society is heading. It might be very difficult for us to say with any degree of certainty, "is the news media getting more or less sexist in 2022 relative to 2021?" And we might cherry-pick a couple different headlines and we could argue about it.

00:20:28 Brian Christian: But I think what's really powerful about these models is that you could throw essentially the entire internet into the model and ask: does the 2022 corpus produce word embeddings that have a greater gender skew than embeddings produced by throwing last year's internet into the model? Mm-hmm. And that gives you a way to actually watch society change and to track that. And so this has been of great interest to social scientists and linguists.
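
A rough sketch of the kind of measurement being described: project occupation words onto a "she minus he" direction and compare the average skew across two sets of embeddings (say, ones trained on last year's and this year's text). The helper function, word lists, and tiny hand-made vectors below are all invented for illustration; a real study would use embeddings trained on the two corpora and a more careful metric such as WEAT.

```python
import numpy as np

def gender_skew(vectors, occupation_words):
    """Average |projection| of occupation words onto the (she - he) direction.

    `vectors` is any mapping from word to numpy vector (e.g. gensim KeyedVectors).
    Larger values mean the occupation words sit, on average, further from
    gender-neutral in this embedding.
    """
    direction = vectors["she"] - vectors["he"]
    direction = direction / np.linalg.norm(direction)
    scores = [abs(float(np.dot(vectors[w] / np.linalg.norm(vectors[w]), direction)))
              for w in occupation_words]
    return sum(scores) / len(scores)

# Toy, hand-made 2-D "embeddings" purely to show the mechanics; real embeddings
# would come from models trained on, e.g., the 2021 and 2022 corpora.
toy_2021 = {"he": np.array([1.0, 0.0]), "she": np.array([0.0, 1.0]),
            "engineer": np.array([0.9, 0.1]), "nurse": np.array([0.1, 0.9])}
toy_2022 = {"he": np.array([1.0, 0.0]), "she": np.array([0.0, 1.0]),
            "engineer": np.array([0.7, 0.3]), "nurse": np.array([0.3, 0.7])}

for year, vecs in [("2021", toy_2021), ("2022", toy_2022)]:
    # In this toy, the skew shrinks from 2021 to 2022.
    print(year, round(gender_skew(vecs, ["engineer", "nurse"]), 3))
```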

00:20:57 Gus Herwitz: Uh, I'm just imagining - and by the way that you're describing this, I assume someone is doing this - Google has its N-grams, which is based upon massive data sets of scanned-in text from books.

00:21:10 Gus Herwitz: I, I just hope that someone is, uh, going back decade by decade or year by year, and trying to map out changing values using machine learning algorithms. That would be so remarkably cool.

00:21:23 Brian Christian: There was a team of researchers, I think this was a group at Stanford if I'm not mistaken, that did something like this decade by decade, where they took every decade in the 20th century and then looked at, uh, different aspects of cultural bias on a decade-by-decade basis. And I remember in particular they were looking at, uh, terms that were correlated in this embedding space with Asian Americans. And there was a very striking shift in terms of the nature of the bias in the early 20th century compared to the late 20th century.

00:22:02 Brian Christian: And so there were, you know, stereotypes in both, but they were dramatically different. And so there is a way that we can sort of put our society under the microscope and take these things that would seemingly be kind of ineffable, or hard to make inarguable, or hard to make quantitative, and actually put them under a kind of microscope that makes them totally clear and explicit and even numerical. I think that's pretty remarkable.

00:22:33 Gus Herwitz: So it's easy to look at stories like this and say, wow, machine learning is biased, this is a problem, we shouldn't use these systems. But to just be politically incorrect about it, isn't the reality that the machine learning is recognizing patterns that we have as humans, and our own biases? And is it wrong for us to blame these algorithms instead of blaming ourselves, um, for what the algorithms are revealing about us?

00:23:05 Brian Christian: I think that's a great question. The way that I tend to think about it is that machine learning by its nature assumes that the future will be like the past. And so if a company hired predominantly male engineers during the dot-com boom, and that's the data set that you're using, then the machine learning algorithm has, in effect, no aspirations of doing anything different than that. Whereas we, as a society - every generation grows up to find the previous generation's moral views kind of archaic and barbaric, and future generations will inevitably feel the same way about us.

00:23:47 Brian Christian: And so I think there is, uh, a danger with machine learning that, yes, the biases that it inherits are generally speaking a reflection of the society that created it. But as Princeton computer scientist Arvind Narayanan put it, you know, you think about all the government systems and financial systems that were built in the seventies and eighties using COBOL and Pascal and these programming languages that no one even knows how to program in anymore, but they're still being used in the banking sector or in various applications.

00:24:25 Brian Christian: Wouldn't it be horrifying to think that some machine learning system based on a training set from 10 years ago is still running some aspect of the world 50 years from now? And so, yeah, I think that's- that's the kind of thing that we should be on guard against.

00:24:41 Gus Herwitz: Yeah. So at some level, the optimal outcome might be to, uh, put humans into the loop of the training data, really endogenize this - and I don't just mean have humans looking over the shoulder of the AI, but having humans learning from what the machine learning is learning about us, and having that update our priors, and that then gets fed back into the system. And it's all a closed-loop, uh, system.

00:25:09 Brian Christian: Absolutely. I think that that's, that's exactly the kind of thing that we need.

00:25:13 Gus Herwitz: Well, uh, we are speaking with Brian Christian. We're going to take a short break and we will be back in a moment to, uh, continue our discussion.

00:25:26 Lysandra Marquez: Hi listeners. I'm Lysandra Marquez, and I'm one of the producers of Tech Refactored. I hope you're enjoying this episode of our show. One of my favorite things about being one of the producers of Tech Refactored is coming up with episode ideas and meeting all of our amazing guests. We especially love it when we get audience suggestions. Do you have an idea for Tech Refactored? Is there some thorny tech issue you'd love to hear us break down? Visit our website or tweet us at UNL underscore NGTC to submit your ideas to the show. And don't forget, the best way to help us continue making content like this episode is word of mouth.
So ask your friends if they have an idea too. Now back to this episode of Tech Refactored.

00:26:20 Gus Herwitz: We are back talking with author Brian Christian about his book, The Alignment Problem: Machine Learning and Human Values. Unsurprisingly, we are talking about machine learning and whether it is aligned with human values. Brian, I want to ask: how much of the alignment problem is really just a problem of scale, and the idea that we're trying to use single algorithms to solve large problems?

00:26:48 Gus Herwitz: So for instance, the Google search algorithm - that's the main search algorithm that we use, and if there are biases, that means those biases are going to be overrepresented and oversampled. In the corporate world, for instance, we try to address this not by saying every corporate director needs to be a perfect representative of all humanity; we instead say we want to have a number of corporate directors, each of whom is representative of individual aspects of the stakeholders for the corporation. Is this really just a problem of trying to over-determine, uh, solutions to problems with a single algorithm?

00:27:28 Brian Christian: I think this question of scale is very, very important. And the word that comes to my mind thinking about this is monoculture: in biology, if you have a monoculture, then an entire species might be vulnerable to a pathogen if there's not enough genetic diversity, for example. And there is a funny way, in computer science generally, of this tendency towards a monoculture. Right now, most computers in the world run one of three operating systems and most mobile devices run one of two operating systems. And so if a bad actor finds an exploit in one of those operating systems, suddenly 2 billion devices are vulnerable overnight, and then we all have to scramble to get the latest patch, mm-hmm, you know, et cetera.

00:28:23 Brian Christian: And there is a very weird way in which I think the tech industry has tended towards this, put all of your eggs in one basket and watch that basket very closely, rather than having a more kind of diverse approach, which is in some ways the lesson that I take away from the biological sciences. So thinking about this in terms of machine learning, I think there's, there's definitely a risk that we create a kind of decision-making monoculture.

00:28:55 Brian Christian: So currently, for example, take things like pretrial detention: you are charged with a crime; your trial date is set some number of weeks in the future. Are you released pending trial, or are you detained before your trial? These decisions are increasingly informed by risk assessment algorithms that are machine learning systems. But if you just think about the human judges making these determinations, some judges generally are, you know, stricter or more punitive, others are more lenient. There are differences like that, and to some extent it creates an arbitrary aspect of, oh no, I got Judge Smith, he's going to recommend detention, but Judge Jones would've released me, or something like that.

00:29:45 Brian Christian: Um, and so you could argue that the very diversity is itself a source of inequality or unfairness. Or, on the other hand, it is very useful to people who study criminal justice that there are these individual differences, because you can look at a defendant and you can say, okay, we know that Judge Smith would've put this person in jail, but Judge Jones released them, and then we know whether they went on to be rearrested or not.

00:30:14 Brian Christian: And so this gives us almost some counterfactual evidence about the kinds of people that the stricter judge is detaining. Well, maybe they don't need to be detaining all of those people, because we know what happens when they go in front of the other judge and do get released. Well, that's a very important sanity check, I think, on the system as a whole. And you lose that if you create a single decision-making algorithm that is uniformly applied to all people; if there's some kind of blind spot, then you may never know, because there's no counterfactual. So that's the kind of thing that I think should give us some significant pause.

00:30:58 Gus Herwitz: So I think that that would fall into the category of examples or problems that we, uh, think of as wicked problems, where each judge is trying to make a well-informed, reasonable decision, and there might not be a right decision for each judge, which is part of why we have, uh, a large judiciary with lots of different judges and views and values that get expressed.

00:31:23 Gus Herwitz: So one question I could ask - and I'm gonna do the terrible host thing and overload these questions - one, uh, question I could ask is: is machine learning just ill-equipped to deal with these wicked problems? But then there's another class of problems where there might not be an equilibrium solution. So one of the examples that I talk about with my students is crash tests and crash test dummies. Historically, the federal government, when it was, uh, assessing the crashworthiness of cars, had the dummy modeled after a male driver, because historically that was the typical driver.

00:32:02 Gus Herwitz: And that means cars were being designed to be much safer for male drivers than female drivers. And we've realized this, but it turns out that male and female bodies on average are different in size, so a safe car for men and for women might actually be different. So this is a situation where there might not be an equilibrium car design that is optimally safe for both men and women; or the car that is optimally safe for both men and women might actually be less safe, for those specific classes of drivers, than the optimally safe car for men or the optimally safe car for women.

00:32:40 Gus Herwitz: Which is a, a long-winded way of, uh, asking is, is this another sort of problem that machine learning is ill-suited to, uh, uh, solving?

00:32:49 Brian Christian: I'm tempted to say that there might be possibilities here for machine learning, if applied at the right scale and with the right scope. So in terms of criminal justice, one of my favorite papers in that space takes as its presumption that every judge has some kind of idiosyncratic way of kind of tallying up the different things that they care about. Maybe they're strictly optimizing for public safety. Maybe they realize that if you put this person in jail, then their family is gonna lose their source of income, or blah, blah, blah.

00:33:35 Brian Christian: And the judge is weighing these other considerations that go beyond purely the safety consideration - and this shows that judges are not merely attempting to maximize some predefined measure of, uh, success, but are in some ways creating the value judgment, or doing the kind of hard philosophical part of how you juxtapose seemingly incompatible or incommensurable desires or outcomes. So what you could do, rather than creating one machine learning system to rule them all that's going to somehow determine the relative values of these things and apply to everybody, is give each judge their sort of own personal machine learning system that is modeled after their decision-making style, but does so in a more consistent way.

00:34:37 Brian Christian: And there are some, I think, very interesting results there, where you can basically make a model that is doing the value judgment of that particular person but is more consistent and less noisy and never has low blood sugar, et cetera. So you can reduce some of the variability while maintaining the kind of idiosyncratic character of that judge's values. I think that's really cool; I think that's the kind of thing that we should consider. For some reason, we tend only to think about these things as, you know, the federal government is provisioning tool X and it's just gonna apply in every state and every jurisdiction. But yeah, something more personal might actually be the way to go in that situation.

00:35:23 Gus Herwitz: So turning to a slightly different take on the topic: we use this term artificial intelligence, and we also use the term machine learning, and there's a broader concept that is used in the AI community of artificial general intelligence. Is any of this really intelligence, or are we really just back to the old, uh, trope that computer scientists call "artificial intelligence" any problem that hasn't been solved?

00:35:53 Brian Christian: Yes, this is a famous adage, and opinions obviously differ here, but I'll give you my view, which is: if you go back to the history of machine learning, the history of neural networks, it really predates the concept of AI itself. AI is sort of thought to emerge at the Dartmouth conference in the 1950s, but neural networks were conceived of by Warren McCulloch and Walter Pitts in the early 1940s. This was before we even really had stored-program computers -

00:36:29 Gus Herwitz: ...and they were thinking about, uh, the actual neural brain - how neurons and things came together.

00:36:35 Brian Christian: Yeah. So, I mean, Warren McCulloch was a neurologist who, you know, was studying the actual nervous system and was a medical doctor. And it was this collaboration between him and Walter Pitts who was sort of a teenage logic prodigy. It's really an incredible story, and Warren McCulloch in effect became his foster father.

00:36:51 Brian Christian: It's just a really remarkable story of scientific collaboration, but the paper that they wrote in 1943 creates this simplified mathematical model of a neuron, just based on the science that they knew at the time, which was: we know that these individual neuron cells are wired up such that they have a number of different inputs but a single output, and the way that it works is, when the, you know, combined signal of all of their inputs exceeds some threshold, then they will emit a pulse; otherwise they won't do anything.

00:37:34 Brian Christian: And they thought, okay, let's turn this into sort of more of a Boolean logic thing where you're adding up numbers - if they're greater than zero, then you, you know, send a one down the channel - and let's see what we can do with this as kind of a mathematical model. And this has a rich history full of many stops and starts, but what's significant to me is that this idea was basically correct. The deep learning revolution that really began in 2011, 2012 was in some ways nothing more than the final vindication of the first idea that anyone ever had in the early 1940s.

00:38:15 Brian Christian: It's just that now we had enough computing power and enough training data to make it work. And so to me, that's one data point that we're on the path to "real intelligence," as opposed to just some engineering hack: that this was kind of the first idea anyone ever had, and it was directly inspired by the biology of what we knew at the time about the brain.
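
For readers who want to see the model concretely, here is a minimal sketch of a McCulloch-Pitts-style threshold unit as described above: sum the weighted inputs, compare against a threshold, and emit a 1 or a 0. The weights and thresholds shown (yielding AND and OR) are the standard textbook illustration, not anything specific to the episode.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Classic threshold unit: fire (1) if the weighted sum of the inputs
    meets the threshold, otherwise stay silent (0)."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# With all weights 1, a threshold of 2 gives logical AND and 1 gives OR.
for a in (0, 1):
    for b in (0, 1):
        print(a, b,
              "AND:", mcculloch_pitts([a, b], [1, 1], threshold=2),
              "OR:",  mcculloch_pitts([a, b], [1, 1], threshold=1))
```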

00:38:38 Brian Christian: And the second piece of data for me that's very significant is - without going totally into the details, although we can if you want - there is a technique in AI developed in the 1980s and 90s called temporal difference reinforcement learning, which is how systems can learn based on the difference between one prediction and the next. And this was being used for, you know, Backgammon and things like that that the AI community was interested in in the early nineties. But it turns out, in a parallel development, there had been a lot of advancement during the 1980s in the study of dopamine, uh, the dopamine system, and it was becoming a lot more clear that the dopamine system had something to do with reward, but it was not exactly reward.

00:39:24 Brian Christian: And it was kind of this outstanding mystery among the neuroscience community: what exactly is the dopamine system doing? Well, when the AI community took one look at those data, they said, "Oh, this is temporal difference reinforcement learning. We just worked the math out a few years ago because we were working on Backgammon." And it turns out, to this day, that this is the accepted story for the role of the dopamine system.

00:39:48 Brian Christian: And that, for me, is another data point that there really is a way in which the artificial intelligence agenda that we're on at the moment is, in my view, not merely solving cool problems with interesting engineering hacks, but rather actually hitting some philosophical pay dirt - discovering some of the same solutions for learning and intelligence and decision making that evolution found. So to me, I think we're onto something.
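
For the mathematically curious, the core of the temporal difference idea Brian describes is the TD error, the difference between successive predictions: delta = r + gamma * V(s') - V(s). Below is a tiny TD(0) sketch on a made-up five-state random walk; the environment, constants, and code are invented for illustration.

```python
import random

# A made-up 5-state chain: start in the middle, step left or right at random;
# the episode ends at either edge, and only the right edge pays a reward of 1.
N_STATES, START, GAMMA, ALPHA = 5, 2, 1.0, 0.1
V = [0.0] * N_STATES  # value estimates, learned from TD errors alone

for _ in range(5000):
    s = START
    while True:
        s_next = s + random.choice([-1, 1])
        done = s_next < 0 or s_next >= N_STATES
        r = 1.0 if s_next >= N_STATES else 0.0
        target = r if done else r + GAMMA * V[s_next]
        td_error = target - V[s]   # the difference between successive predictions
        V[s] += ALPHA * td_error
        if done:
            break
        s = s_next

print([round(v, 2) for v in V])  # roughly [0.17, 0.33, 0.5, 0.67, 0.83]
```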

00:40:23 Gus Herwitz: So this brings me to, uh, what will probably be our last topic, but it's also an opportunity for me to mention your previous book, which is one of my favorites: Algorithms to Live By, which, uh, is an exploration of human cognitive psychology and algorithms - the relationship between computer science, how computers might operate and how we might design them, and how the human brain operates.

00:40:47 Gus Herwitz: And one of the examples that you discussed early in that book is the exploration-exploitation tradeoff: the idea that we need to spend some amount of time learning about our environment, or our world, or anything that we're trying to do, and at some point we need to switch from learning to exploiting what we, um, have learned. And we see that coming up in this book as well, with the role of curiosity and the game Montezuma's Revenge. Can you just tell us a bit about Montezuma's Revenge and, uh, the values that we find embedded here?

00:41:22 Brian Christian: Yeah. So Montezuma's Revenge is an Atari game from the early days - I think early to mid 1980s. And in it, you play this explorer that's kind of loosely modeled off of Indiana Jones, called Panama Joe, and Panama Joe has to escape this temple that's full of these deadly traps. And you have to, you know, jump over pits of fire and swing on ropes and collect these keys that you then use to unlock different rooms of this temple.

00:41:53 Brian Christian: And this game has come to be pretty famous within the story of, uh, AI in the last 10 years, because it proved so difficult for AI systems to beat. In particular, there was a team from DeepMind in 2015 that got a paper onto the cover of Nature, where they had built this reinforcement learning system that could play almost every Atari game ever made at a superhuman level, and it was pretty jaw-dropping at the time. But the catch was, it could not score a single point in the game Montezuma's Revenge. And so this led to a lot of research interest into what's going on with this particular video game that makes it so hard, and what we might do to actually make progress.

00:42:46 Brian Christian: And so the, the short answer is that the thing that makes it really hard is what's called sparse rewards. Traditional AI systems, reinforcement learning systems, are trained to maximize the number of points that they get in the game. And basically to your point about exploration and exploitation, these systems begin by just mashing buttons at random and learning what stuff seems to score points, and then you can start doing more of that. But in Montezuma's Revenge, almost everything you do simply kills you. And so it's very, very difficult to score any points at all. You really have to know what you're doing and, you know, swing on the rope and jump over the fire and go down the ladder and blah, blah, blah, blah, blah.

00:43:29 Brian Christian: So how would an AI system even know that it was on the right track? I think it's very interesting, because it gets to this question of both why we, when we play the game, understand that you're supposed to go get the key and then come back up, but deeper than that, why we are playing the video game to begin with.

00:43:49 Brian Christian: And I think the reason we're playing it to begin with is to sort of see what's on the other side of the locked door. You know, what, what is in the next room? Can we get out of this temple? And if so, what's outside of the temple? There's this intrinsic motivation that we have, which is not about scoring points.

00:44:08 Brian Christian: You know, getting a key gives you a hundred points, but who cares? The point from any human player's perspective is to explore the space. And so what you can do is program the AI system to have that kind of novelty drive. And this is a tangent, but part of what I find so fascinating about this is that the formal model of this novelty drive was borrowed by the AI researchers directly from child psychology. So they, you know, they called up their colleagues who do developmental cognitive science, and they said, what's your best working model for, you know, explaining infant exploratory behavior?

00:44:53 Brian Christian: And they plug that into the AI system and suddenly it's playing the game the way that a human would, it's sort of, you know, crossing the bridge to see what's on the other side, so to speak. Mm-hmm. And finally they were able to beat this game. So I think it's really a triumph, both of the interdisciplinary perspective that suddenly, you know, as AI systems come to resemble more and more closely animal and human behavior, AI researchers can actually turn to their colleagues in those departments and not have to reinvent the wheel.

00:45:26 Brian Christian: And it's also I- I think a triumph for thinking about the complexity of our motivation for something even as simple as playing a video game. We're not actually playing to maximize our points. We're playing for some combination of, you know, novelty and surprise and mastery and all sorts of things. So it really reveals the, the richness, I think, of human motivation.
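
In its simplest form, the "novelty drive" described above can be sketched as a count-based exploration bonus added to the game's reward, so rarely visited states look temporarily attractive even when no points are scored. This generic bonus is a stand-in, not the specific intrinsic-motivation model the researchers borrowed from developmental psychology; the constants and state names below are arbitrary.

```python
from collections import defaultdict
import math

visit_counts = defaultdict(int)
BETA = 0.5  # how strongly to reward novelty (arbitrary constant)

def reward_with_curiosity(state, extrinsic_reward):
    """Augment the game's (often zero) score with a novelty bonus that
    shrinks as a state is visited more often."""
    visit_counts[state] += 1
    bonus = BETA / math.sqrt(visit_counts[state])
    return extrinsic_reward + bonus

# In a sparse-reward game like Montezuma's Revenge, extrinsic_reward is almost
# always 0, so early on the agent is driven almost entirely by novelty.
print(reward_with_curiosity("room_1_door", 0.0))  # ~0.5 the first visit
print(reward_with_curiosity("room_1_door", 0.0))  # ~0.35 the second visit
```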

00:45:46 Gus Herwitz: Well, thinking about novelty, surprise, and mastery: my last question for you is, what's next? Are you working on another book yet? Or, uh, what are you continuing to think about?

00:45:58 Brian Christian: Yeah. I mean, I'm thinking a lot about - there are a few things that were left on the cutting room floor of the book, and I've continued to sort of pull those threads. One thing, for example, is what do you do when you have more than one user? So we kind of touched on this a little bit earlier in the conversation, but it's in the category of, I think, wicked problems as you describe them: you can't simply average across different users. Mm-hmm. Um, and you see this in driving, you know - if you have 50% of your users swerve right around the caution cone and 50% swerve left, you don't wanna just average it out and plow straight into it.

00:46:38 Brian Christian: So how do we try to navigate situations like that, where you have different, heterogeneous preferences? That's really interesting to me. Buried within AI systems is kind of an implicit model of human psychology, and in some ways it's a giant placeholder of "insert cognitive science here." So I'm interested in progress towards, you know, putting something a little bit less provisional into those systems, and how cognitive science might inform that. I've also been thinking a lot about the concept of trust in computing. And I think this is something that comes up both in machine learning, obviously, but even in more sort of classical forms of computing. You know, you go to a website and it says, put in your bank password.

00:47:22 Brian Christian: And how do you know that it's really your bank, you know? Well, there's a green lock icon next to the URL. Okay, what does that mean? So you click it and it says, you know, this is a digital certificate, it's a 2,256 bit thing from DigiCert Inc. And you're like, well, what is that? So trying to kind of peel back those layers of the infrastructure of trust in computing - I've been pulling that thread a lot lately, so, mm-hmm, that may end up turning into something.

00:47:49 Gus Herwitz: Fascinating topics, all. I look forward to, uh, hearing what you say about them and, uh, continuing to read your work. Thank you. We've been speaking with Brian Christian about his book The Alignment Problem: Machine Learning and Human Values. You can find it on Amazon or wherever you buy your books, I guess. Uh, if you need recommendations for that, you can go to Google and its machine learning algorithm will help you out. Thank you, Brian.

00:48:13 Gus Herwitz: And thank you to our listeners. I've been your host, Gus Herwitz. This has been Tech Refactored. If you want to learn more about what we're doing here at the Nebraska Governance and Technology Center or submit an idea for a future episode, you can go to our website at ngtc.unl.edu, or you can follow us on Twitter at UNL underscore NGTC.

00:48:33 Gus Herwitz: If you enjoyed the show, don't forget to leave us a rating and review wherever you listen to your podcasts. That will help the algorithms figure out how to recommend similar podcasts to you. Our podcast is, uh, produced by Elsbeth Magilton and Lysandra Marquez and Colin McCarthy created and recorded our theme.

00:48:50 Gus Herwitz: This podcast is part of the Menard Governance and Technology programming series. Until next time, keep your machines learning.