In the second of a special series on generative artificial intelligence, we hear from two companies involved in the AI revolution: one of the biggest and oldest names in computing, Microsoft, and a young startup making waves in this booming industry, Hugging Face.
Speakers: Natasha Crampton, Chief Responsible AI Officer, Microsoft; Thomas Wolf, Chief Science Officer and Co-Founder, Hugging Face.
Co-host: Benjamin Larsen, Lead, Artificial Intelligence and Machine Learning, Centre for the Fourth Industrial Revolution, World Economic Forum.
The World Economic Forum's Centre for the Fourth Industrial Revolution: https://centres.weforum.org/centre-for-the-fourth-industrial-revolution/home
Join the World Economic Forum Podcast Club
Podcast transcript
This transcript has been generated using speech recognition software and may contain errors. Please check its accuracy against the audio.
Robin Pomeroy, host, Radio Davos: Welcome to Radio Davos, the podcast from the World Economic Forum that looks at the biggest challenges and how we might solve them. This week, on the second in our special series on artificial intelligence, we hear from two companies involved in the AI revolution, one of the biggest and oldest names in computing...
Natasha Crampton, Chief Responsible AI Officer, Microsoft: Microsoft's long taking the view that we need both responsible organisations like ourselves to exercise self-restraint, and we also need regulation.
Robin Pomeroy: And the other, a young start-up making waves in this booming industry.
Thomas Wolf, Chief Science Officer and Co-Founder, Hugging Face: Two days ago we released HuggingChat which is basically an open-source alternative to ChatGPT.
Robin Pomeroy: The co-founder of Silicon Valley darling Hugging Face tells us what it's like to be riding the AI wave.
Thomas Wolf: It's crazy. There were millions of views on the release post; we see hundreds of thousands of people using it.
Robin Pomeroy: And the head of responsible AI at Microsoft tells us the industry wants and needs regulation, but we should resist calls to pause AI development.
Natasha Crampton: There's no question here that we will need new laws and norms and standards to build confidence in this new technology. Rather than pausing important research and development work that is underway right now, including as to the safety of these models, I really think we should focus on a plan of action.
Robin Pomeroy: Want to know where the world is heading with AI? Then subscribe to Radio Davos wherever you get your podcasts to get this series or visit wef.ch/podcasts.
I'm Robin Pomeroy at the World Economic Forum and with episode two of our series on generative AI...
Thomas Wolf: AI becoming this kind of common good of humanity where everyone can understand it and use it in a positive and a beneficial way.
Robin Pomeroy: This is Radio Davos.
Robin Pomeroy: Welcome to Radio Davos and the second in our special series on generative artificial intelligence. And I'm joined today by my colleague Benjamin Larsen, who leads artificial intelligence and machine learning at the Centre for the Fourth Industrial Revolution at the World Economic Forum. Hi, Ben, how are you doing?
Benjamin Larsen: Thanks for having me. I'm doing great. It's good to be here.
Robin Pomeroy: You're in San Francisco where I was a few weeks ago. How's the weather there?
Benjamin Larsen: The weather currently is rather cloudy. You know, we have that gloomy San Francisco spring.
Robin Pomeroy: So it's weather with character, like the city itself.
Benjamin Larsen: Definitely.
Robin Pomeroy: Okay. So you're here to kind of introduce this episode with me. We're going to talk about the interviews that are going to play out, two interviews today. But before we do that, just remind us what I was doing, and what you were doing there in San Francisco at the end of April. What was that meeting about?
Benjamin Larsen: So in late April of this year, we convened the Responsible AI Leadership Global Summit on Generative AI at the World Economic Forum's Centre for the Fourth Industrial Revolution in the Presidio of San Francisco. During that summit, which was hosted in partnership with AI Commons, we really sought to guide technical experts and policymakers on the responsible development and governance of generative AI systems.
Robin Pomeroy: And I managed to get some interviews there with some of the people attending. There were people from companies, from academia, from institutions, and we've got two of them today. So one of them, they were both in the opening montage, is the Chief Responsible AI Officer at Microsoft, a company that probably needs little or no introduction from us. One of the big computing companies that is at the forefront of AI. They've invested in OpenAI, the company that makes ChatGPT. And then the second interview you will hear is with a perhaps lesser known company called Hugging Face. I speak to Thomas Wolf, who's the co-founder of this company, which has been around for five or six years or so. What's Hugging Face?
Benjamin Larsen: Hugging Face is a company and an open-source community that specialises in natural language processing and machine learning. What they do is provide tools and libraries through their platform to make it easier for developers to work with NLP [natural language processing] models, and the tools and systems created by the team at Hugging Face and elsewhere are used by a lot of different research organisations, including some of the very largest companies out there: Facebook AI Research, Google Research, DeepMind, Amazon, Apple, etc.
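[To make that concrete, here is a minimal, illustrative sketch, not from the episode, of the kind of workflow those libraries enable. It assumes the open-source `transformers` library is installed; the example text is arbitrary.]

```python
# A minimal sketch of using Hugging Face's open-source `transformers` library:
# download a community-shared model from the Hugging Face Hub and run it locally.
from transformers import pipeline

# "sentiment-analysis" loads a small default English classifier from the Hub.
classifier = pipeline("sentiment-analysis")
print(classifier("Radio Davos makes AI easy to follow."))
# Typical output: [{'label': 'POSITIVE', 'score': 0.99...}]
```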
Robin Pomeroy: Perhaps you could also help me with some jargon busting. I'm going to fire a couple of things at you that are mentioned in these interviews. One of them is open source. What do we mean by open source?
Benjamin Larsen: Perhaps I'll situate it within the larger area because the area of AI development and release is a very large topic. And some people, they see the issue as sort of a choice between two options. So we have open source on one hand and then you have closed source on the other hand.
Open source really means that many people can work on an AI system together. They can make sure that the system meets everyone's needs. One particular project here would be the BigScience BLOOM project, which was a global effort at developing a large language model, and that effort brought together thousands of researchers internationally, as well as national resources to train the actual model. And while open source allows for a more proactive approach, where people can contribute on an open basis, there are also some risks, some downstream risks in terms of misuse by bad actors.
Robin Pomeroy: So another bit of jargon we can bust here together is this expression black box, the black box question. Could you explain to anyone who's never heard of that and is not in this field what's meant by that?
Benjamin Larsen: Definitely. So you can imagine perhaps that you have an AI system that makes important decisions like approving loan applications or predicting medical diagnoses and so on. And the problem here is that sometimes some of these AI systems can make decisions without clearly explaining how or why they reached those conclusions. So it's a 'black box' where you put in data and it gives you an answer, but you don't know how it got there, basically.
So this lack of transparency, it raises concerns because people, they want to understand the reasoning behind the decisions and that can make it difficult to trust and verify, for example, the fairness, accuracy and potential biases of an AI system.
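[As a purely illustrative sketch of that gap, not from the episode and with invented toy data, a small loan-approval classifier returns a decision but offers only coarse hints about its reasoning.]

```python
# Illustrative "black box" sketch with made-up data; not a real lending model.
from sklearn.ensemble import RandomForestClassifier

# Features per applicant: [income (k), debt (k), years employed]; label 1 = approve, 0 = decline.
X = [[55, 10, 4], [30, 25, 1], [80, 5, 10], [25, 30, 0], [60, 20, 6], [20, 15, 2]]
y = [1, 0, 1, 0, 1, 0]

model = RandomForestClassifier(random_state=0).fit(X, y)

applicant = [[40, 18, 3]]
print("decision:", model.predict(applicant))        # the system gives an answer...
print("importances:", model.feature_importances_)   # ...but only coarse hints about the "why"
```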
Robin Pomeroy: And so all of these issues are things that were discussed at that meeting, that are being discussed by everyone involved in AI. And those are the issues that will come up when people talk about regulation. What do you think we should be listening out for when we're interested in regulation?
Benjamin Larsen: So I think front and centre at the moment is of course the incoming EU AI Act, which is a horizontal and risk-based approach to regulation, and it's by far the most comprehensive regulation that's out there. So in all likelihood a precedent will be set by how the AI Act affects the industry.
And we can also look at regulations such as the Digital Services Act, which has already set up a regulator that will begin auditing some models and systems, as well as looking at their potential impact.
And then we saw the recent Senate hearing in the US, where Sam Altman, CEO of OpenAI, provided his testimony. And we see that in the US, regulators are slowly waking up to the importance of AI regulations. And that has to be viewed alongside other initiatives such as the Biden administration's new actions to promote responsible innovation that protects Americans' rights and safety. And then we also have an AI Bill of Rights released by the White House in the US, and we have the Risk Management Framework as well, which provides a process that integrates security, privacy and cyber supply chain risk management activities.
Robin Pomeroy: Very, very busy discussions about regulation. And of course we had these discussions in San Francisco a month ago. Where's the World Economic Forum going now on this?
Benjamin Larsen: Yes. What emerged from the summit is really a set of recommendations that will be released in June, aiming to have a broad and actionable impact that varying stakeholders can look to and learn from as we take these conversations forward. Separately, we are also moments away from launching a new initiative on generative AI, and a lot of details will follow in the near future.
Robin Pomeroy: We'll be covering that on these podcasts as well. Let's listen to both the interviews right now. Benjamin Larsen, thanks very much for joining us.
Benjamin Larsen: Thank you, Robin.
Natasha Crampton: I'm Natasha Crampton, Microsoft's Chief Responsible AI Officer.
Robin Pomeroy: Natasha, what is a responsible AI officer?
Natasha Crampton: Great question. So in a nutshell, my job is to put our six AI principles that we've adopted at Microsoft to work across the company. So what that means in practice is that we've established a governance structure, we've established policies, we train our engineers.
Our six principles, our north star, are fairness, privacy and security, reliability and safety, inclusiveness, accountability and transparency.
One of our innovations is that we've taken all of those six principles and double-clicked down on them and written actionable guidelines for our engineering teams, so that when we are shipping out new generative AI products, whether that's the new Bing or the copilots that we've built for Outlook and Teams and Excel, our engineers are taking responsibility considerations in from the outset and making sure that those systems are going to live up to our values.
Robin Pomeroy: Is that enough to ensure, to reassure the public and to reassure the world that these systems will be safe? Or do you think there also needs to be some kind of guidance from governance bodies, from governments or from some kind of regulation, which is central to the discussions that are happening here in San Francisco?
Natasha Crampton: Microsoft’s long taken the view that we need both responsible organisations like ourselves to exercise self-restraint and put in place the best practices that we can to make sure that AI systems are safe and trustworthy and reliable. And we also need regulation.
There's no question here that we will need new laws and norms and standards to build confidence in this new technology and also to give everyone the protection under the law.
While we would love it to be the case that all companies decided to adopt the most responsible practices that they can, that is not a realistic assumption. And we think it's important that there is this baseline protection for everyone and that will help to build trust in the technology.
Robin Pomeroy: Tell us something about yourself. How did you come to be a Chief Responsible AI Officer? What kind of career path leads to that?
Natasha Crampton: Sure. So I've always had an interest in the intersection between law and technology and society. So at university I trained as both a lawyer and as a technologist. And after several years in law firms, I made my way to Microsoft, and I started in the New Zealand and Australian subsidiaries. And I think working in those subsidiaries was really formative for my job here. You're very many miles away from headquarters at that point. You need to learn to listen very closely to the diverse customer voices that you hear when you're interacting directly with customers and also regulators, other parts of society. You're right there at the coalface of those interactions.
So I learned that you had to listen and you had to localise. I think it really taught me that technology, like many other things in life, is in fact local.
And so I bring those experiences to my job now in making sure that when we are working on important issues like fairness and transparency, we're really bringing a diverse set of voices to the table. And it makes me remember that I'm in a role where I need to think about the impact of AI on society, and that is not just society in the US or Europe. We really need to bring a global perspective to this work.
Robin Pomeroy: And was there a moment in your life when you realised the power of AI?
Natasha Crampton: So last summer two things were happening at once.
I had early access to an early version of GPT-4 and I was able to prompt it with a bunch of queries. And honestly the results really wowed me. They were quite unlike anything that I'd seen in the past, and I think advances that I'd been expecting to see further into the distance were there, here and now.
The second thing that I was doing at that time was working on a product integration. So we were integrating Dall-E, which is OpenAI's model that takes words, descriptions, and turns them into images. Now, personally I’m much more of a wordsmith than I am an artist, and so I found it quite joyful and delightful to be able to say things that I could describe well with words generated into art that I would personally never be capable of producing myself.
Robin Pomeroy: Can you give us some idea of that? When you were playing around with that first iteration of GPT-4, what were the prompts and what were the responses you were getting that made you say, wow?
Natasha Crampton: Well, one thing I did was that I prompted an early version of GPT-4 to produce a bill that could regulate AI based on an impact assessment methodology. And I got an output that was a very decent first draft.
Now, I'm a lawyer by training, and so I could also spot errors in the way that it had been pulled together. But it was striking just both in its structure and its content as to how much it was capable of.
Robin Pomeroy: Let's compare it to asking a junior colleague to prepare this for you. How long would you have expected them to take to produce that document? And would you have been pleased with that outcome from a junior colleague?
Natasha Crampton: Look, I think a junior colleague probably would have taken a number of hours to produce what was produced in that momentary time frame that I was using GPT-4 for. Now, I think I would have been impressed. And, you know, I think it's likely that the junior colleague would have taken certain approaches and sort of excelled in areas where GPT-4 didn't.
So, you know, I think the net of it is that this technology can be a very good first draft partner in high stakes scenarios. Of course, it's very, very important to be judicious about it. And I also think it's important to understand where the technology is very, very good and how to combine the best of that technology with the best of humans.
And that's really certainly where Microsoft starts - that this technology is essentially a co-pilot for doing these tasks. It's not the case that we can outsource writing legislation to GPT-4 or any other equivalent technology.
Robin Pomeroy: I'm quite interested in this concept of the black box, maybe you could explain what that is. How would you explain black box to someone who's never heard that term?
Natasha Crampton: So when I hear concerns about the black box nature of AI, what I really understand by that is that, you know, concerns that AI is a bit mysterious. We don't really understand how it works. We don't really understand how it's made and what its risk profile really is.
And so I think addressing that set of concerns really involves getting a deeper understanding of how these systems have been built. What is the data that they're being trained on? What is the actual function that they are serving? What are the mitigations that have been put into place? Because we really need to understand all of those dimensions in order to be able to have informed policy conversations, informed conversations as members of society about how we want to use these technologies.
Now, sometimes the response to the black box type concern is the suggestion that we should just open up the hood for all to see the mechanics of how these systems are built. And I think we actually need a more nuanced approach than that.
You know, Microsoft has the transparency principle, which we're committed to upholding. But one thing that we've learnt over time with that principle, which is very important, is that actually different stakeholders have different needs when it comes to transparency, and it's actually not helpful to the general member of the public to have a very sort of in-depth, under-the-hood look at how the system works in a very, very deep, you know, scientific way.
What is actually more helpful is to understand the building blocks, the core building blocks, of that technology, and for a user and member of the general public to then be empowered to ask the right questions, so that if they are using the technology they are aware of its capabilities and limitations, and if the technology is being used in a decision-making scenario, they understand enough in order to be able to make sure that they are able to assert their rights in respect of such decisions.
So I think overall, you know, the black box concern is fundamentally a transparency concern. And then we need to remember that transparency is not going to be effective if we try and take a one size fits all approach. We actually need to adapt different approaches to transparency depending on stakeholder needs.
Robin Pomeroy: And is there a risk that AI development is moving too fast? Famously some people have suggested that we need to pause or slow down.
Natasha Crampton: So there I think, rather than pausing important research and development work that is underway right now, including as to the safety of these models, which I strongly believe needs to be pursued, I really think we should focus on a plan of action.
We should focus on making sure that tech companies and policymakers are coming together to make sure that there is a common understanding of how the technology works.
We should also be bringing our best ideas to the table about the practices that are effective today to make sure that we're identifying and measuring and mitigating risks. And we should also be bringing our best ideas to the table about the new laws and norms and standards that we need in this space.
For Microsoft's part, I feel that we are ready to meet this moment because of all the work that we've done leading up to it. So the responsible AI programme that I lead today has in fact been going on for more than six years at this point. And over that time we have consistently been working towards operationalising our commitments across the company.
So I believe we are now in a position to move both thoughtfully and nimbly. But it's very critical that at this moment we have these broad societal conversations about the technologies so that together we can chart a path forward.
Robin Pomeroy: You've been working on this for several years. The fact that the general public is suddenly aware of ChatGPT in particular, but other advances, it's not taken you as a person by surprise that suddenly there's all this conversation going on. This is an ongoing conversation that's been happening for years. But I guess it's accelerating now and the pressures are accelerating, would that be fair to say?
Natasha Crampton: I think it's a broader conversation, and that's an essential part of what we need to be doing in this moment.
You know, at Microsoft we've been releasing generative AI systems for several years now. The first generative AI system that we released was a product called GitHub Copilot. This is a system that allows you to, in plain English, insert descriptions of code that you'd like to generate. And the system takes your natural language and converts it into code. And developers, from the least experienced to the most experienced, find this to be a tool that is just extremely productivity enhancing, and it's been very well received by them.
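[To illustrate the workflow she describes, here is a hypothetical exchange, not an actual Copilot output: the developer writes a plain-English prompt as a comment, and a code assistant suggests an implementation.]

```python
# Prompt typed by the developer:
# "Return the n-th Fibonacci number using an iterative loop."

# A suggestion a code assistant might produce (illustrative only):
def fibonacci(n: int) -> int:
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fibonacci(10))  # 55
```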
But those earliest editions of Copilot we released almost two years ago now, and what we've been able to do is to take the learnings from those early systems and continue to apply them to systems like the new Bing that we've built or the new copilots that we've built for Outlook and Teams and Word and all of the Microsoft Office products that many people use on a day-to-day basis.
So we have had this ongoing incremental experience with these systems. But I do think now is the time to have the broad societal conversation, and that just can't be technologists alone. We need to hear from governments, from civil society, from academia. And this great broad tent that’s now been opened by making available products that are easy to understand and easy to use by anyone, I think it's a really important step for the current moment.
Robin Pomeroy: And finally, is there a book and/or a podcast and/or a film, any of those things, you'd recommend?
Natasha Crampton: On the book, I think Azeem Azhar's Exponential, or if you're in the US I believe it's called The Exponential Age, is a really helpful book for this moment. Azeem digs into four general purpose technologies: computing, biology, renewable energy and manufacturing. And what he exposes is this gap between the advances of the technology, which he would argue are happening on an exponential basis, and the ability of societal institutions to keep up.
What I think is really valuable about the book is that it connects social and political and economic and technological trends in a way that's helpful for the current moment because we really need to be having this multifaceted conversation about the implications of AI. And I think it does a nice job of connecting those dots.
Ultimately, he concludes that even with these exponential advances in technology, human ingenuity is such that we're always able to not just control that technology, but shape it in the direction that we want it to serve us. And I think that's a really important conclusion to draw at the current time.
Robin Pomeroy: Azeem actually co-hosted a podcast with me in Davos; people can listen back to that. And we talked about this word exponential, which is in everything he does, this challenging idea of the speed of change of things.
Are there other things you were going to recommend to us?
Natasha Crampton: I might also recommend a podcast by our Microsoft CTO, Kevin Scott. I know there are many policymakers and business leaders looking to have an understanding, a deeper understanding of the technology in this moment.
This podcast is called Behind the Tech with Kevin Scott. I think what he does a nice job of is meeting with some really fascinating guests, from AI experts to musicians to neuroscientists, to hear about the frontiers of technology. And he tells not just their own personal stories, but helps to provide a very accessible and engaging introduction to the technology as well. So I think that's something that I commend to your listeners if they're interested in digging a bit deeper into the technology and the stories of the people who are making it.
Robin Pomeroy: Brilliant, I'll check that one out. Natasha Crampton, thanks very much for joining us.
Natasha Crampton: Thanks for having me.
Robin Pomeroy: You are listening to Radio Davos and our special series on generative artificial intelligence. That was Natasha Crampton, Chief Responsible AI Officer at Microsoft, speaking to me at the World Economic Forum's Responsible AI Leadership Summit. Our next guest is co-founder and chief science officer of a much younger company. I'll let him introduce himself.
Thomas Wolf: So I'm Thomas Wolf. I'm a co-founder and CSO of Hugging Face, which is a platform for models, data sets, demos, in AI.
Robin Pomeroy: So people who have not heard of Hugging Face, how would you describe what it was and what it's become? It looks like you've had some big news in the last days even.
Thomas Wolf: Recently, like two days ago, we released HuggingChat which is basically an open source alternative to ChatGPT and it kind of illustrates quite well what Hugging Face is about.
It's about democratising good AI, and that means for us that open source, transparency, auditability of the models you use are very important. So we started actually as an open-source company with a framework called Transformers that was designed to give very easy access to all the state-of-the-art AI models.
And then based on this open-source code base, we actually built a platform to share models, but also to share datasets and demos. And this platform is the Hugging Face Hub. It's now used by thousands of companies. There are something like 200,000 models and datasets and demos on it.
And the idea is that we can push for good practices by using this platform. So you have dataset sheets, you have model cards that give a lot of details into how each model was made and what its limitations are. And obviously you can access all these models. So it's really built for transparency.
And I guess the third big aspect is the community that started to build around our tools, and it's also this community that creates most of these models. So HuggingChat is powered by community-created models.
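[As a sketch of how that transparency can be inspected programmatically, not from the episode: the open-source `huggingface_hub` library exposes a repository's declared tags and its model card. `bigscience/bloom` is used here as an example public repository, and the calls require network access.]

```python
# Illustrative sketch: read the metadata and model card of a public Hub repository.
from huggingface_hub import HfApi, ModelCard

api = HfApi()
info = api.model_info("bigscience/bloom")
print(info.tags)              # licence, languages and task tags declared by the authors

card = ModelCard.load("bigscience/bloom")
print(card.text[:500])        # the human-readable card: intended use, limitations, training data
```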
Robin Pomeroy: Could you explain to the layperson the difference between having an open model and a closed model? And why is that important?
Thomas Wolf: Yes, sure. I think having an open model is super important. Basically, I think the core of this is the notion of trust that you want to build in these tools. And there's a trust in where they will fail or trust in where they will work and also trust in all the biases that you may have in these tools or how they could or not be misused.
And if the model is open, you actually don't have to believe just the people who made it. You can yourself audit the model, read how it was made, dive into training data, raise issues. So we have a lot of discussion pages on our models where people can flag models, raise questions and issues.
And so it's basically just like the difference between, you know, being able to read the code base or being able to see the internals of something like a car where you can open and look at the engine or a car where everything is locked and you actually have to believe and trust the people who made the car that it's perfect.
Robin Pomeroy: It's been going [HuggingChat] for a couple of days, hasn't it? How's it going? I mean, just literally…
Thomas Wolf: Oh, it's crazy. That's definitely one of our biggest releases to date. There were millions of views on the release posts. We see like hundreds of thousands of people using it. So I think we're very happy about it.
Robin Pomeroy: So what brought you to this? As I understand it, a few years ago, Hugging Face was a chat bot app for teenagers. Is my Wikipedia research right, in 2016?
Thomas Wolf: Yes, that's right. We started as a game company six years ago. So today we have this HuggingChat, a ChatGPT alternative, but six years ago the technology for chat was not there yet. And so what we did actually was to open source some of the research we were doing, to share some of the models we had trained. And in 2019 we had actually this first framework to share models, and this grew really super fast. Basically in a matter of weeks people were using it, everybody was starting to use it. And that's when we decided to pivot from kind of the game approach to more the open-source, community-based approach.
Robin Pomeroy: Was there a point when you realised that a chat bot app could do more than just be a little bit of fun? Did it dawn on you or did you always know that actually it could be a tool that could be used for so many things?
Thomas Wolf: That's a good question. I think we have an interesting comeback today on this, but quite quickly I think we realised that AI was going to be something very big, with what the models were starting to be able to do. And I think the first realisation was even before we started open sourcing models: we participated in a competition at this big conference called NeurIPS in 2017, just one year after we were created. And we presented one of these models that was trained with this deep learning approach. And this model was by far the winner of the competition, like really super high above all the other competitors. And that's where we kind of had this insight that, yes, this model was actually much more powerful than we thought previously.
Robin Pomeroy: And so the model behind HuggingChat is a large language model, trained on a huge dataset. Tell us how you did that.
Thomas Wolf: So what is interesting here is that we're really building this platform to share models, but the one that powers HuggingChat is not a model that we had to train ourselves. It's a model that's been trained by a community called OpenAssistant. It's a grassroots community, basically just like a Discord server. And they've decided themselves to write instructions, to build a dataset themselves that you can access openly online, and, starting from one of the open-source foundation models that was out there, they decided to train it on their datasets to make this OpenAssistant chat. And so in the future we could see HuggingChat hosting many other chat models. That's the first one. I think it's one of the best ones today, but there's still room for improvement.
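[For readers who want to look at that community-built data themselves, here is a minimal sketch, not from the episode, using the open-source `datasets` library. `OpenAssistant/oasst1` is, to the best of our knowledge, the identifier of the publicly released OpenAssistant conversations on the Hub, and the field name below is an assumption based on that release.]

```python
# Illustrative sketch: load the crowd-written OpenAssistant conversation data from the Hub.
from datasets import load_dataset

data = load_dataset("OpenAssistant/oasst1", split="train")  # requires network access
print(len(data))                  # number of messages in the conversation trees
print(data[0]["text"][:200])      # one crowd-written message (field name assumed)
```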
Robin Pomeroy: So there's a few of these out now that people can play around with, these large language model chatbots. Are they all going to be the same, do you think, or will there be notable differences? I'm guessing, because I just played around with it for the first time last night, very, very briefly. And I was really trying to wonder, is this different from ChatGPT? It must be different, but do you know how it will be different?
Thomas Wolf: It's a very good question actually. A lot of my work this month is around that. I think we should have better ways to compare all of these models that are out there. We kind of tend to believe a bit the hype that we see, but it does not always correlate with the performance that you would like to see on this or that application.
So we're working on better evaluation this month and I hope if we do this podcast again in one month, we'll have more answers to this question.
Robin Pomeroy: So here we're talking about regulation, at this event at the World Economic Forum. Where do you stand on how we should pursue regulation or governance?
Thomas Wolf: It's a good question. I think just like any new technology there should be some regulation.
AI is quite complex because there is some data regulation, there is model regulation, there's some usage regulation. So there are many things that intersect here. But I think we start to see very interesting regulation. I like what's taking shape in Europe, for instance.
I think the most interesting thing will be if we tailor this domain by domain. These models can do so many things, but their performance will vary a lot depending on the domain. The impact on society will vary a lot depending on the domain.
If you're in health, if you have a health advice chatbot, you want this to give good advice. If it's a chatbot that's used to help writers, you know, when they have a blank white page and want to start writing something, well, the regulation can be pretty light. Anything, any new ideas, any brainstorm is fine.
So I think the importance will be that we take a look at all these different applications and we try to tailor something that makes sense for each field.
Robin Pomeroy: What about the more kind of upstream, for the model itself? Is there any way of regulating that, which some people would call for? To embed, within a large language model, something that will stop it doing bad things? Or is that just an impossible dream?
Thomas Wolf: Yes. Connected to what I said, I think it's a strange idea, right? It's like saying computers or code should be regulated generally. I think outside of what you want to use them for, it's really hard to make some very, you know, wide thing that says, hey, this type of computer should be forbidden. And we don't see that for a good reason, because they are just so versatile in what they can do. And we see the same with models, right? One model that can generate fake news may be just perfectly fine for a writer and obviously not fine for like a newspaper, and vice versa. Right? If you're a writer, you actually don't want factual correctness all the time. So there are many questions, and you can have opposite answers depending on the field.
Robin Pomeroy: So where do you see Hugging Face going? You've launched this just days ago. You said to talk to me in a month and I’ll tell you more about it. But what is your dream for where you could take this company?
Thomas Wolf: Our two main dreams would be to see AI becoming this kind of common good of humanity where everyone can have access to it, everyone can understand it and use it in a positive and a beneficial way. So that's kind of the very long term place where we would like to be.
Shorter term, I'm quite worried about more concentration of power in AI. I think if we have just really one dominant actor, it's never really good for a field. So I would like to see a lot of diversity of actors. A lot of actors bring their own perspectives: some of them from the US, some of them from Europe, some of them from developing countries bring their own questions. In AI you also have this mix of ethical and moral values that actually end up being embedded in the model. And you would actually really like to have this diversity of actors that bring diverse perspectives. So that's really what we try to push today: a full community, and not just one or two islands where this is created behind closed doors.
Robin Pomeroy: Where are you seeing your technology and your tools being used? Some kind of applications that you weren't expecting, that people have picked up your technologies and they're integrating into what they do? What's most interested you about the users of your product?
Thomas Wolf: It's a very wide-ranging question, also because in open source you give a lot of things for free without really monitoring what people actively use them for.
What we see a lot is people doing creative writing or creative applications with the models. In the gaming industry I think it's very interesting to see this new way to have a game where you can fully interact with non-player characters.
The main field where I'm quite excited to see this technology is finance. I think it's quite interesting what you can do there; the Bloomberg model that was released recently is a very interesting one that was also based on an open-source model as a starting point.
In healthcare, I think it's a much more complex field, but I think there are some very interesting things you could do to speed up research there.
And the last one where I would like to see more is physics and basically climate science.
Robin Pomeroy: Climate science?
Thomas Wolf: Yes, it would be very interesting to see this model being used more there.
Robin Pomeroy: The real utopian view is that generative AI can solve all the world's problems, on the good side. But I'm just wondering how could it help solve climate change? In one way it might be hurting the climate because of all the power used by this huge amount of computing, which requires energy. But where do you see this type of technology helping us solve these big problems, something like climate? What tangible things could it do?
Thomas Wolf: So in this case it's not especially ChatGPT. I'm talking more about these very good quality predictive models; ChatGPT's model is basically one that's very good at predicting the next word.
But obviously in this case, it's more like fluid dynamics. If you can have some very good predictive models, you can speed up this computation by a lot. And today they are very, very costly, they're very complex. And if you can put deep learning in these computations, you actually speed them up a lot.
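[As a generic illustration of that idea, a surrogate-modelling sketch not from the episode: a small neural network can be fitted once to the input-output behaviour of an expensive computation and then queried far more cheaply. The "simulation" below is a made-up stand-in.]

```python
# Illustrative surrogate-model sketch; the "expensive" function is a toy stand-in for a slow solver.
import numpy as np
from sklearn.neural_network import MLPRegressor

def expensive_simulation(x):
    # Stand-in for a costly physics computation (e.g. one fluid-dynamics step).
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 1))
y = expensive_simulation(X).ravel()

# Fit once on simulator outputs, then reuse the cheap surrogate for new queries.
surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0).fit(X, y)
print(surrogate.predict([[0.3]]))               # fast approximate answer
print(expensive_simulation(np.array([[0.3]])))  # the slow "ground truth" for comparison
```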
But the energy cost of training these models is a big question, right? That's something we tackled head on with the large model BLOOM that we trained. And in this case, what we did was to use a very, very specific data centre that was actually super efficient, in that it's also used to warm the university that it's hosted in. So it's almost a negative impact. It's also based in France, where you have an energy cocktail that is mostly nuclear, 80%. But I think choosing your data centre, or choosing to try to minimise the cost of training these models, is a very important thing.
Robin Pomeroy: Which university is being warmed by a data centre?
Thomas Wolf: It's Paris-Saclay in the south of Paris. I think we should do that a lot more. We should use these data centres - all the heat they provide - to warm our cities.
Robin Pomeroy: What do you say to policymakers who have a hard time grasping the potential, both positive and negative, of generative AI? When you meet a policymaker, what do you say to them? What are you trying to convince them about where things are at right now?
Thomas Wolf: It's also a very wide question. I think, in many ways, these models are a bit like computers, they're so versatile. It's a bit like being scared about computers, right? They're like, oh yeah, these things are connected together, maybe you can do bad things with them. So obviously you can do bad things with computers, right? But obviously you can also do good things.
What we try to do, at our level, I would say, is a lot of nudging. So try to kind of push the whole community in a beneficial direction by rewarding good ethical acts and good positions. We try to create this kind of positive mindset in the community so that people, of their own will, kind of do the good thing.
In terms of politics, I think it's really at the application level. I think that you want to think, okay, in this specific field there is a big danger, and here we would like to prevent that. And that might be at the model level, but much more likely it will be at the deployment level. How do you reach these types of users? Which kind of interface do you use to communicate with them? Do you have a human in the loop with your model, or is your model directly in contact with humans? So there are a lot of questions about how you deploy and use these models. And just like computers, right, that will be where actually you want to make some guardrails.
And I think the idea that you can build all guardrails really inside the model, I'm still quite doubtful about it.
Robin Pomeroy: So as you mentioned, the European Union, it seems to be going down that line, doesn't it, of defining these are the higher risk areas that will need those guardrails and these other areas we don't need so much regulation. So I guess that could be adopted, a similar thing in the U.S. or elsewhere. But where do you stand on a more overarching international body? People have mentioned aviation is covered by international treaties, or nuclear power by an International Atomic Energy Agency. Do you see any benefit of that or would that be just another layer of bureaucracy we could all do without?
Thomas Wolf: I think if it's in specific fields, that would make sense. Actually, you know, when you are talking about airlines, it's actually a very specific field, it's commercial airlines. You don't have the same for other types of aviation.
So I think once you have nailed down a specific field where we feel there is a specific danger that we want to prevent, or specific sets, maybe quite a range, but they all bear a commonality, then it makes sense to make some coordination between the various policies you want. And I would really welcome that. I think regulation at this level would be super positive and super interesting.
But something that just generally covers AI feels to me like, you know, just like something that generally covers computers. It's like, yeah, but will that be able to lead to some concrete recommendations, or will it just be, you know, spread over too wide a domain to be really effective?
Robin Pomeroy: Tell me about yourself. Was there a point in your life where you just realised computers, or technology, were something that you wanted to pursue? Was there a moment when you saw or heard about something and you thought, wow, that's what I want to do?
Thomas Wolf: Not so much that I can relate to AI. I was always interested by computers, by the fact that they could do things by themselves, that they could react and there could be some kind of very interesting experience in interacting with them. But for a long time, I was not really convinced it was a very serious career. So that's why I became first a physicist, doing a PhD in statistical physics, and then a lawyer working on intellectual property law for six years.
And at some point I was like, okay, this is still really what I found very interesting all the time as a hobby. Maybe let's try to make it a day to day job for a few months, which became a few years with Hugging Face.
Robin Pomeroy: So you were an intellectual property lawyer? That's a big question, isn't it, in generative AI, when you ask it to create an image that looks like something, and there are already lawsuits about this from companies that produce images, who say their images are being used and repurposed.
Thomas Wolf: It's a very big question. I think there are two huge questions in AI that are not discussed nearly enough. One is the training data. How do we give access? How do we know? How do you have some rights on the training data? How does it influence the model behaviours?
And the other is the productions of the model: who has some rights to them? What if they are really an exact copy of something that a human has made and that was in the training dataset?
So these are super interesting questions. And that's also why, when we just consider AI as code, we kind of miss all these data questions, which are even more important than the model questions.
Robin Pomeroy: I used Dall-E to create a picture of my daughter in a kind of a fantasy situation. It's a great picture, but I'm wondering, do I own it? Is that my picture now? Or does OpenAI own it, or the dataset where it was generated from. Is it clear yet?
Thomas Wolf: No, I think nothing's clear. I think, just like every complex question there, the solution will be a mix of deeply thinking about what it means and also kind of what the power dynamic in the world is today, knowing that AI companies do have a lot of power at this moment, which we see in the rights lawsuits moving forward at the moment.
Robin Pomeroy: What's the next big thing we can expect? A lot of people who weren't paying attention were suddenly amazed when they got access to ChatGPT. I think it's seen as a game changer. Is that just going to carry on, as more and more people start accessing that and your own chatbot, or is there going to be another big thing that you can see on the horizon, where people are just going to say, oh wow, this is another advance?
Thomas Wolf: I think we'll see two main things. The first one is we'll see more modalities. Dall-E, ChatGPT were quite separate. We can see a future where all these become one model or one type of model that can understand images, maybe sound, and that can also produce images and sound. So I think this is a more technical point of view.
Robin Pomeroy: So that would be instead of just a pure text thing that the chat bots are now, it would be more sophisticated with lots of different things you can do with it straightaway.
Thomas Wolf: Yes, this steady increase in competencies and skills that it can do. And more generally I think just what we witnessed this year especially was that these things are really moving from research to being deployed.
So we'll see these AI assistants more and more everywhere, and typically in a few years you can imagine a world where basically in every, you know, slightly interactive object, you will have actually something that's rather smart and understands what you want, rather than just being dumb. So your GPS will be able to understand really what you say in natural language, and your computer as well would be able to understand a lot more of the context and will have a lot less of this failure where the things we interact with don't really get what we want. And so I could see this being kind of widespread in our world, where every human-made object will be roughly smarter than it is today.
Robin Pomeroy: When will we see that, do you think? When will my phone just know what I want and take me where I need to go?
Thomas Wolf: I think technology is almost ready today.
Robin Pomeroy: Could be anytime.
Thomas Wolf: Could be anytime.
Robin Pomeroy: Finally, can I ask you, is there a book and or a film and or a podcast and or anything that you would say, read this or listen to this, either to understand AI or just because you love that book and that film and that podcast?
Thomas Wolf: I'm mostly reading books and watching films in Dutch; that's my personal quest to understand this culture a bit more and more.
Robin Pomeroy: You're learning Dutch, right? How's that going?
Thomas Wolf: It's going well.
Robin Pomeroy: Are you using AI to learn Dutch?
Thomas Wolf: No, I think, yes, Duolingo is already quite impressive.
No, I like this thing of having personal challenges, basically. Still trying to see that you can learn something that's actually quite difficult to learn.
Robin Pomeroy: Here's a philosophical question then. I've learnt languages as well and struggled for years and all the effort you put in. Would you prefer it if you could just download that information into your brain, or is the struggle of learning and the experiences you have while you're learning a language, is that important?
Thomas Wolf: Yes. It's interesting because all these new models can do crazy stuff, right? If I ask ChatGPT to rewrite something I painfully wrote in English, it can write it better than I do. And that questions a little bit, and it will question more and more, what are we proud of as humans? What are we proud of in the things that we do? Which ones matter or not?
And I think the fact that you put some effort into doing something will stay something rewarding. You know, you put a lot of effort into learning to play piano, into learning to maybe paint, and maybe an AI can do that better, but still, these are human skills that you've learned… Just like watching: I think increasingly we'll see the difference between watching a real human do something and watching that on television or somewhere where it might have been created fully by computers. And we'll probably see very soon films that are mostly created by computers from nothing, just special effects.
And so we'll see, I think, more and more this big difference between seeing a human doing something that's actually difficult to do and just watching that on the TV. So I hope actually AI will bring us more together in some way, like we will bring more human, direct human interaction.
Robin Pomeroy: There might be that kind of perverse move back to that real world thing. I guess vinyl records had a resurgence because everything was available to download. Who expected that to happen?
Thomas Wolf: You make a big difference between seeing an artist perform music and just listening to something, because it's some unique experience.
Robin Pomeroy: Tom, thanks very much for joining us. Thanks.
Thomas Wolf: Thank you.
Robin Pomeroy: Thomas Wolf is Chief Science Officer at Hugging Face, you also heard Natasha Crampton, Chief Responsible AI Officer at Microsoft. They spoke to me at the World Economic Forum’s Responsible AI Leadership Summit at the end of April 2023.
Please subscribe to Radio Davos wherever you get your podcasts and please leave us a rating or review. And join the conversation on the World Economic Forum Podcast club -- look for that on Facebook.
This episode of Radio Davos was written and presented by me, Robin Pomeroy, with Ben Larsen. Studio production was by Gareth Nolan.
Don’t miss our next episode on AI next week, meanwhile, thanks to you for listening and goodbye.