AI Notes

Vendor - AI as a service

Relies on vast amounts of data to train the algorithms that power the models that produce the predictions or decisions

GDPR does not specifically regulate AI, as GDPR is a principle-based law that's technology agnostic, but it does regulate the processing of personal data

...agnostic, it does regulate the processing of personal data, and so it becomes really important to think about the principles enshrined in the GDPR whenever we're using personal data to power the processing activities that support AI technology and its development.

The focus of today's webinar is really going to be thinking about how we determine what role you may be playing when you're using AI or when you're providing AI as a service, and more specifically whether you're acting as a controller or a processor. Now, there are various different phases and stages of processing that are relevant to the development and use of AI technology, so often people will be wearing various different hats, and it can become quite tricky to figure out exactly what role you're playing. But actually making that determination is really crucial, because we know that the compliance responsibilities that you have under the GDPR will very much be determined by whether or not you're acting as a controller or a processor. So making that upfront determination is really crucial to then being able to understand and map what other compliance obligations you might have to comply with.

Some of those other compliance obligations are something we're going to explore in the future webinar series. We have a number of interesting topics that we're going to look at, going beyond just the essentials which we're going to talk about today: we're going to look at ethics and explainability and how you can develop a compliance strategy. But today, to kick-start things off, we really wanted to think about whether or not you're acting as a controller or a processor. And sorry, my Amazon delivery has arrived in the background, so I'm now going to hand things over to Richard, who's going to kick-start by demystifying what AI is and talk through some key principles before we delve into the controller/processor analysis.

Great, thanks Flick. So before we dive into our GDPR
discussions today, as Flick says, we'll just set up a little bit of context. The first question is obviously: what is artificial intelligence? That's a very broad term that is used to describe a range of different technologies, including things like machine learning, neural networks, deep learning and the like, and there are a lot of definitions for it. But what we're interested in is what AI means in the context of data protection, and this is a fairly dry definition that was provided by the International Working Group on Data Protection in Telecommunications, which is that AI is the theory and development of computer systems able to perform tasks normally requiring human intelligence.

OK, so that's pretty broad and it captures a really wide range of use cases, and on the left-hand side of the slide we've listed some of the most prominent examples of where AI is being deployed today. Looking at that list, we have content filtering: you might be using AI to filter spam, or perhaps to review and moderate content on a platform. Image detection and classification is a huge area at the moment, so using AI to recognise objects and people in images and videos. We've got a number of different purposes here: image labelling, facial recognition, tracking object movements, or even sophisticated things like inferring certain characteristics such as a person's emotional state. A good example of this use case is using AI in autonomous vehicles.

The third one on our list is natural language processing, another big area: using AI to recognise speech and maybe to transcribe or translate that speech, or using AI as part of a virtual voice assistant to understand speech commands. Similarly, another huge field is processing text to optimise writing, for example to improve somebody's grammar.

Then there are two other areas we can think about. Recommendation algorithms have been around for some time: that's using AI to analyse patterns of behaviour and previous interactions or
transactions to recommend products or features for a user, and for advertising and marketing purposes. And the last one we've described here is classifying risks, so using AI to generate a credit risk report or perhaps to predict fraud at the user or individual transaction level.

Depending on the use case, there are going to be different considerations in terms of data protection and your obligations under the GDPR. For example, these last two examples, recommendation algorithms and classifying risk, really involve more direct examples of profiling individuals and making predictions and decisions about those individuals, which obviously have a clear and direct impact on people.

The other important thing to consider, which Flick previously mentioned, is the data itself. You might be obtaining and using data from a variety of different sources as part of your AI deployment, and this will also have an impact on your obligations under data protection law. For example, you might be obtaining data directly yourself, for example by capturing photographs or videos from the real world, and using that directly sourced data to train models. Similarly, you might be obtaining data from publicly available sources and using that to train your models, or maybe you're buying in or licensing data from third-party sources.

But the other very important source of data is your own platform or application, so your private data sets, which are really valuable to better understand how users are engaging with your platform and what decisions and actions they are taking. This data set might be part of a walled garden if you're keeping and retaining that data (the Googles and the Facebooks of this world), or perhaps you're sharing that data with an AI vendor who might also use the data to improve and train their own global models, which is something we'll come on to later. And then lastly on the slide we also have usage data and metadata: similarly, data about how people
are interacting with the platform, the telemetry data that's being sent back. That can be equally useful in terms of marketing and sales as well.

So suffice to say there is an increasing number of different examples where AI is being used in our daily life. It's increasingly important, and regulators are very much aware of this. A number of regulators in Europe have identified AI as an important focus area, and some have also issued specific guidance on the responsible use of AI in the context of data protection. For example, the UK ICO has issued some very detailed guidance and developed a framework for auditing AI compliance, and it has also identified AI as one of its top three strategic priorities. At the EU level we still don't have any specific AI guidance from the EDPB, but there are some AI-related examples in its other guidelines, such as the guidelines on data protection by design and default. The EDPB has also published a long list of guidance that it intends to put out as part of its work programme for 2021 and 2022, and AI features on that very long list as well, so we should be expecting guidance at that level.

Another important thing to mention here is that we have a focus from the data protection regulators, but the European Commission has also developed a strategy for AI and is intending to introduce a new legislative proposal for AI this year. That won't be specific to data protection, but it's part of a broader legal and ethical framework for AI in Europe, applying to both developers and users of AI, so that is something to watch out for as well.

With that said, let's now turn to our discussion on the GDPR. As Flick says, today we're going to be focusing on that key question: what is your data processing role when you are using AI? This is really fundamental because it obviously informs all of your obligations under the GDPR, and it flows into everything else that we're going to be discussing during our webinar series. So when you're
deploying, using and benefitting from AI, are you acting as a data controller or a data processor when you're doing that? These are the GDPR terms, but there are equivalent concepts under other data protection laws too: Brazil's LGPD also talks about controllers and processors, and California's CCPA talks about businesses and service providers. We're going to assume you're familiar with these concepts, but the key question is always going to be: are you determining the purposes and means of processing when you're using AI, or are you simply processing on behalf of another? And as Flick mentioned, you've got to think about the specific context, so you could be wearing different hats depending on the context, and you could be a controller or a processor of the same data.

Here's a brief reminder of why controllership matters and why this question is so important. This is a list of the responsibilities for controllers and the responsibilities for processors, and suffice it to say the list on the left, the controller responsibilities, is longer and far more comprehensive, so you are directly responsible for a great number of obligations under the GDPR. But for AI, the key ones to pick out are that you are going to be responsible for honouring data subject rights, which includes providing transparency to data subjects and explaining AI to them, but also for complying with key principles like privacy by design and default, and for completing DPIAs if your use of AI may pose a high risk for the individuals concerned. That's clearly going to inform your compliance strategy and your liability exposure as well.

Apart from these very plain key responsibilities under the law, there's also a broader commercial question and strategic considerations to think about in whether you're positioning as a controller or a processor, and I'm going to turn over to Flick now, who's just going to run with that thought.

Cool, thanks Richard. Yeah, so as Richard has hinted at,
clearly there is the issue that if you assume a controller role then you're also assuming a significantly longer list of compliance responsibilities that you have to comply with, whereas if you are a processor of the particular data set you are really assuming a much more limited role from a compliance perspective, because you are only ever allowed to process the personal data on the instructions of the controller. That has a number of trickle-down impacts. It means, for example, that you can't retain data, or you're not supposed to retain data, beyond the life of the agreement; you're supposed to be deleting or returning it at the request of the controller.

I flag that deletion point because often that's the trigger for making people question: hang on a second, when we're using data to develop our products or to train our models, are we actually acting as a processor? Because actually we want to be able to retain the data that we've collected in the context of, for example, providing our technology to a customer; we want to retain it and use it and replay it to keep training our models, because it's really, really useful data. At that point it usually triggers a big question: hang on a second, if we want to retain it, we may be acting as a controller when we want to use it for other purposes beyond just providing the particular feature or service to our customer.

So that's another reason why it becomes a really important strategic upfront consideration. If you're a vendor who wants to use data to train your models and improve your AI technology, have you actually incorrectly positioned yourself as a pure processor, and therefore forced yourself into a more restricted contractual role that would in fact prevent you from using the data for broader business purposes, including to train and improve your models? So that's why this becomes a really important thing to think
through up front. Conversely, if you've decided up front that you are a controller of the data, that can create some commercial sensitivity if you're going to market as a controller. Often, if you're dealing with customers, they're pretty familiar with dealing with a processor: they may have a standard data processing agreement that they're ready to roll out and use with you as a vendor, or they may be willing to accept your DPA, and they understand the process for dealing with a processor. But if they are now having to deal with you as a controller, that can often spark some commercial sensitivities, largely because there's some misunderstanding about how to paper that, and also because it means there are going to be more compliance considerations that the customer of the vendor is going to have to think through: they're now going to have to establish legal grounds to be able to share the data with you to use it for controller purposes.

So this needs to be thought about and carefully positioned in your contractual documentation, because if, for example, you say you're a processor, you have to stick in that lane; you could be contractually boxed in and you wouldn't be able to use the data for broader purposes. Conversely, if you're willing to go to market as a controller, you need to have a very good privacy story to explain why you're a controller and how you're going to protect the data, and to get your customers comfortable with that. So moving on to the next slide please, Richard.

So how do you identify, then, whether you're a controller or a processor? Before we dig into this, I think it's worth taking a quick step back and considering the different phases involved in the development of AI. There's usually an initial data preparation stage where the data is being gathered and prepared, and then that data would effectively be used to train the AI algorithms using
that data, and that training process is then used to generate a model. When we talk about data models, we're really talking about mathematical algorithms that are trained using data, usually with a human expert having input into that, to enable the model to identify patterns and connections between the different data points. Importantly, that data model is then applied to a particular use case, and at this point we're in the deployment phase, so it may then be used for the purposes that the AI technology was designed for: usually to provide predictions or classifications, to assist with human decision-making, or to make an automated decision itself. So the development of those models is really crucial.

So what sort of decisions can you make in those different phases as a controller? Bear in mind, we talked about this a little bit earlier: a controller is the entity that decides how and why data is processed. A processor cannot do that; it can only act on the instructions of the controller. The types of decisions that would typically be made by a controller include things like the target output for the models, the feature selection, and the source and nature of the training data. They would also typically decide on the kinds of machine learning algorithms that would be used to create the models, and key model parameters such as how complex a decision tree can be or how many models will be included. They would also make key decisions over the evaluation metrics and loss functions, so how do you trade off between false positives and false negatives? And equally, a controller would usually be determining the process for testing and updating the models: how often that testing happens, what kinds of data need to be used for it, and how ongoing performance will be assessed. So they are the key decision makers here with respect to the personal data that would be used, and we'll go
through the different phases that I just talked about in a second and apply that in context.

Conversely, as I mentioned, the processor has a much more limited ability to make key decisions over how and why that data is processed, but it does have the ability to make certain decisions. Typically a processor would be able to decide what types of security measures would be used to protect the data. They could also make decisions about what types of IT systems and methods would be used to process the data, so the technical means of processing. In the context of AI that might include the specific implementation of the generic algorithms that are used, so what kind of programming language and code libraries might be used. They may also be able to make decisions over how the data and models are stored, or how they retrieve, transfer, delete or dispose of that data.

But importantly, and I always raise this, a processor should never be making decisions over how long they retain the data for, because, remember, they should only be retaining it under the instructions of the controller. It's really the controller who should be deciding how long that data is retained for, and the processor has to delete it when asked by the controller. But a processor could take decisions over the measures used to optimise the learning algorithms, the type of computing resources that might be used, and also the architectural details of how models will be deployed, so the choice of virtual machines or microservices and APIs. Those are all decisions that can be made by the processor.

Just so you're aware, these are all examples provided in the ICO AI auditing framework. The UK regulator, as Richard briefly mentioned before, has produced some really helpful guidance, and these are all examples of the types of decisions referenced in that guidance, which I think is a helpful one to go back to, to think about:
hang on a second, with this particular processing activity, who's really making the decisions here about how and why data is processed? Next slide please, Richard.

So taking those principles, let's think of some real-life examples. As I mentioned, there would typically be a development or training phase, which is where the data that you've collected and prepared is being used to create a model by identifying different patterns and connections between the different data points. At that point, typically the AI vendor would be processing data as a controller, because it's clearly determining and making key decisions over what data is or isn't required to be used to train that model and how the processing will happen. So for the development or training of AI models, it will most certainly be the case that the vendor, or the entity that's using the data for that purpose, would be a controller.

That's something to really bear in mind if you're a vendor that's collecting data from your customer: am I going to need, or do I want, to be able to use that to train my models later down the line? If you think you might want to leverage that data for your broader training and development purposes, then that should trigger a question: hang on a second, am I acting as a controller for that purpose, and do I need to make sure that I've carved out the necessary rights and permissions in the customer contract to be able to do that? So something to bear in mind there.

Another typical example would be where AI is being provided as a service: AI prediction as a service. The technology is being deployed as a service, and to break down that kind of example, this would be a scenario in which the AI vendor develops its own models and then allows the customer to send queries to them via, for example, an API, and the customer would then get responses back from the model. For example, the customer might be using the model to understand what objects are in a particular image: in the context of a self-driving car, if there's a load of images being collected by the camera on the car, then the model may be used to review those images and produce outputs that classify the objects in the image. At that point, typically the service provider here, the vendor, would be acting as a processor, to the extent that the data that is being run through the model is really being used to make predictions and classifications on behalf of the customer. That assumes that the vendor is never doing anything else with the data; it's just being replayed through the models to produce the outputs.

However, as I mentioned above, if that vendor then wants to use that data to create and improve its models, at that point we would be looking at a controller role. So again you could be wearing two hats there, depending on what you're wanting to do with the data.

Interestingly, the ICO guidance on this has also indicated that if, for example, that vendor was pretty crucially involved in, or had an influence over, how and why those predictions were being made, or the development of the model that was being used to provide that service to the customer, then there may actually be some element of joint controllership that arises there. I think that would be relatively rare, but it's just something to keep in mind: if you're starting as a vendor to have more influence over the essential elements of how and why the processing is happening to make those predictions and classifications
with the customer, then there could be some element of joint controllership there.

Another typical scenario would be that a vendor is leveraging customer data to create a customised model for the customer's own use. So the customer provides a whole load of data and wants the vendor to create a specific model to deal with a specific issue or a particular use case. At that point, assuming again that the vendor is really only ever using the data on the instructions of the customer, we would usually say that would be a processor role. But again, you've always got to be careful that there's no other wider use of the data to do things like training and developing the AI model, because then we would also be looking at a controller hat.

The final model is one in which the AI vendor provides tools for, say, machine learning that enable the customer to build and run their own models. The customer would be choosing the data and really just using the tool and the infrastructure provided by the vendor to develop their own AI technology, and at that point I think we're looking at more of a processor role.

Now these are just some examples; these aren't definitive conclusions. Again, it's always going to depend on the particular context, but I think it's very likely that in most scenarios people could be wearing multiple hats depending on what they're trying to do with the data.

I think another common example is the development of things like global models, and when I say global models I mean data models that a vendor may be wanting to use across its customer base. They may have almost like a give-to-get model, whereby they say to the customer: look, we need your data to train this global model so that customers get the benefit of the predictions and the insights, and that relies on every customer providing us with their data to enable us to train and develop that
model, which you then get the benefit of and use to get certain outputs. Typically in that scenario there would be more of a controller role for the vendor, but it would very much depend on how that role was constructed in the contracts. In some cases we've seen vendors trying to position themselves as a processor in that role and making it work, but it's a tricky one to balance, because you would effectively have to get instructions from each and every customer to build that global model. That is not always an easy fit with the processor role. It also boxes you in, in terms of how and why you can use the data, because you constantly have to go back to the multiple customers that you have, who would be the controllers. So it's a more natural fit in that global model scenario to say that you are a controller of all the data that is being used to train the model and to produce the outputs.

So it's not always clear-cut, and there needs to be very careful consideration of exactly how you're using the data, how long you want to retain it for, and what the contracts that you have in place with your customers say, before you assume that you can automatically use data for broader purposes.

That has taken us perfectly to the top of the hour. We promised that we would try to keep this webinar series focused and about 30 minutes long, and I think we have achieved that today, which I'm pleased about. Just a final wrap-up to say the slides will be available on the Fieldfisher Silicon Valley YouTube channel, so if you want to go and listen to this again you'll be able to find the slides and the full recording on our YouTube channel. Also, Richard and I are
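
Note to self: the vendor scenarios from this first webinar can be condensed into a small lookup. The Python below is only a rough sketch of the speakers' rules of thumb, not a definitive test; the names `Scenario`, `STARTING_ROLE` and `likely_roles` are made up for these notes, joint controllership is not modelled, and the real answer always depends on the context and the contracts.

```python
from enum import Enum


class Scenario(Enum):
    """Hypothetical labels for the vendor scenarios discussed in the webinar."""
    TRAIN_OWN_MODELS = "vendor develops or trains its own models"
    PREDICTION_AS_A_SERVICE = "customer queries the vendor's model via an API"
    CUSTOM_MODEL = "vendor builds a model on the customer's instructions"
    ML_TOOLING = "customer builds its own models with the vendor's tools"
    GLOBAL_MODEL = "vendor trains one model on data from many customers"


# Starting-point role per scenario, as described by the speakers.
# These are rules of thumb, not conclusions: context and contracts matter.
STARTING_ROLE = {
    Scenario.TRAIN_OWN_MODELS: "controller",
    Scenario.PREDICTION_AS_A_SERVICE: "processor",
    Scenario.CUSTOM_MODEL: "processor",
    Scenario.ML_TOOLING: "processor",
    Scenario.GLOBAL_MODEL: "controller",
}


def likely_roles(scenario, reuses_data_for_own_purposes=False):
    """Return the hats a vendor is likely wearing, sorted alphabetically.

    reuses_data_for_own_purposes: True if the vendor also retains or reuses
    the customer's data, for example to train and improve its own models.
    """
    roles = {STARTING_ROLE[scenario]}
    if reuses_data_for_own_purposes:
        # Using the data beyond the customer's instructions adds a controller hat.
        roles.add("controller")
    return sorted(roles)


print(likely_roles(Scenario.PREDICTION_AS_A_SERVICE))
# ['processor']
print(likely_roles(Scenario.PREDICTION_AS_A_SERVICE, reuses_data_for_own_purposes=True))
# ['controller', 'processor']
```

The second call shows the "two hats" point: the same prediction service is a processor activity for the customer's queries, and a controller activity as soon as the data is reused for the vendor's own training.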

 

Video 2

 

...this webinar on data protection issues in artificial intelligence. Now, I suppose the moral of today's story is to be careful what you say to Alexa. I don't know if any of you are like me, but at home I am one of those people who tends to say please and thank you to Alexa whenever I ask her to do anything for me, much to the amusement of my wife and children. But I have that niggling doubt at the back of my mind about what will happen if the robots take over the world: will they remember that I was polite to them and therefore spare me?

Now, today's webinar is part of a series of webinars that the Fieldfisher team is running in conjunction with our Silicon Valley office, and further details about the other webinars that we're doing will be provided at the end of this presentation. In today's webinar we're going to cover three core areas: we're going to look at what artificial intelligence actually is and how it works; we're going to look at some of the core data protection issues that arise in the context of artificial intelligence; and then we're going to look at some of the practical challenges that arise when dealing with data protection in AI.

To assist me on today's webinar, I am delighted to say that I'm supported by my colleagues, my partner Leonie Power and Robert Fed, who is a senior associate in our team, both of whom are AI and data experts, so we are very lucky to have them with us today.

You may be aware, if you've been following the news in the artificial intelligence space, that there were some breaking developments over the past week: there was a leaked draft of a regulation to regulate AI in the EU, and the official version of that was just published yesterday. Now, I have to say, because this is all a new development and encompasses issues that are wider than just pure data protection, that's not going to be within the scope of our presentation today. But for those of you that are interested, you may just like to know that the new
regulation is basically looking at three core types of AI: types of AI that it considers ought to be prohibited; AI that is linked to high-risk systems, which will be subject to a lot of requirements around transparency, documentation and the quality of data used to train the AI systems; and then all other types of AI systems, which again will have transparency requirements attaching to them. Alongside that, you can see that some of the thinking around this new regulation has clearly drawn on the GDPR: there's going to be the creation of a new European AI Board, very similar to the European Data Protection Board that we have under the GDPR, and there will be significant penalties of up to 4% of annual worldwide turnover for breaches of the new AI regulation, again very similar to what we see under the GDPR. But it's only just been proposed, and it has the whole legislative process to work through, so it may be another few years before we see this law actually coming into effect. So for today we're going to focus on data protection issues.

So, to kick it all off: if you're anything like me, you have probably been attending calls, joining webinars, speaking with vendors or technology providers, and they're all talking about AI, and you may have at the back of your mind this notion of what AI is, something to do with making computers behave like humans, without really understanding quite how it works. Then you'll hear people throw in other terms like machine learning, and it all starts to get quite confusing. So the purpose of this first part is to look at what AI actually is and how it works.

Just to put some context on it, if you don't already realise it, the odds are that you are using AI on a daily basis already. Up on the screen here are just three examples of common uses of AI. On the left here we have Google search; you can see here I've typed in what is our
physical in letting agents, and AI has, sorry, Google has used its artificial intelligence to work out that what I was probably trying to type is "what is artificial intelligence", and below that it has also suggested a number of other links that I might be interested in seeing to learn more about artificial intelligence. So how does it know how to do that? Well, of course, it's seeing all the searches that people have entered into Google beforehand, it's learning from that and from the types of things that people are interested in, and it's using that to propose better search results.

In the middle of the screen you can see a screenshot from my iPhone. For those of you who have iPhones (and Google does something similar), if you go into the albums on your phone it will group lots of photos by face, and you can see here the groupings of photos of me, from back in the days when I didn't have any facial hair up to the present day. It has somehow taken my face in all of these images and compiled them together on my phone so I can go back and see all the photos of me, and it will similarly do that for my wife, my kids and so on. Again, how does it do that? It's using AI, it's using face detection; there's some very clever stuff going on in the background.

And then on the right-hand side here you can see a picture of a Tesla car. Increasingly you are finding that AI is being built into vehicles, from very basic uses, to make sure for example that cars stay in lane when they're driving on the road, through to some of the more sophisticated things that you see the likes of Tesla trying to do, where they are building autonomous vehicles that will ultimately drive themselves, with the goal ultimately of making the roads safer, easing congestion and so on.

But of course those are the very positive uses of AI. On the flip side, science fiction has taught us to be quite wary of AI as well, and this is where I go
back to my point about being polite to Alexa; that's just my tip for today. On the left here you can see an example of HAL 9000, which was that slightly ominous AI system in 2001: A Space Odyssey. Some of you may remember him refusing to open the bay doors to let one of the astronauts back into the spaceship, a slightly murderous use of AI. And over on the right-hand side we have a poster for Terminator. Again, I think lots of people, when they think about the worst possible scenarios, imagine Terminator machines roaming the planet and taking over humanity. So these kinds of fears have been with people for a long time, and it's why we have data protection rules, and it's why we have the EU looking at creating a new regulation to ensure that the AI systems that we develop serve humanity, that they are developed in ethical ways that are respectful of our information, and that they basically operate in the ways that we want them to and not in the ways that we don't.

That's all well and good; we've all got a sense of AI in everyday life, but it doesn't necessarily mean that we really know what AI is. So what is AI? Well, we've put a definition on the screen here. This comes from the International Working Group on Data Protection in Telecommunications, and what they say is that AI is the theory and development of computer systems able to perform tasks normally requiring human intelligence, and then they give examples: visual perception, speech recognition, decision-making and translation between languages. In other words, it's trying to get computers to learn to mimic human intelligence.

If you go back 20 or 30 years, you may have seen some of the early examples of this. If you were a bit of a computer nerd like me and ever used to play chess on your computer, you may have found that the computer was very hard to beat. Now, in the very
early days of those kinds of programmes, where they created chess algorithms, what literally used to happen was that programmers would sit down with chess experts and try to create rules for the computer about how to play chess. So they would teach it: if you're at this particular point in the game, this might be a good next move to make. What that meant was you had to manually code tens, hundreds, thousands of rules to teach a computer to play chess effectively, and the computer wasn't really learning; we were just giving it a set of rules to follow. Was that artificial intelligence? Well, that was one example of artificial intelligence. Outwardly it looked like the computer was doing something intelligent; the reality was that it probably wasn't, it was just following the rules that had been hard-coded into it.

Nowadays, when people talk about artificial intelligence, they tend to use the term synonymously with the concept of machine learning, and that's where these things can get a little bit confusing, because you may have heard of machine learning and things like deep learning and neural networks and wondered what on earth these things are. The way to think of it is like a series of those Russian matryoshka dolls, where each is a subset of the other. AI refers to the overall objective of trying to get computer systems to behave in ways that mimic human intelligence. Machine learning, deep learning and neural networks are all technologies used to achieve that overall objective. Machine learning is basically the process of getting machines to learn from data sets to produce particular outputs, particular predictions: to basically teach themselves how to improve. Deep learning is a subset of machine learning; it's a more advanced form of machine learning, and it works on what are called neural networks. Neural networks are essentially systems that are designed to mimic the human
brain. Essentially what you do is you create software or hardware neurons that interconnect, and they act very much like the neurons of the brain: they receive inputs and fire outputs to other neurons in the network, which feed them on in turn, and so on. The overall result is that this huge layering of neurons creates something akin to a human brain, which can be used to do very sophisticated machine learning, and that very sophisticated machine learning is typically what we refer to as deep learning. So deep learning occurs on neural networks. Ordinary machine learning is not quite as sophisticated, and I'll explain an example of how machine learning works in a moment.

Now, you may be wondering: why is AI significant? We've already seen examples of AI in use, but you may be thinking: if AI is just about mimicking human intelligence, then why do we need machines to do that? Why don't we just use humans? I think the short answer is that AI can do it quicker and better than humans can, and machines are capable of doing things that humans can't achieve. There's a great example here from Google's DeepMind AI system, AlphaGo Zero. What they did was use a machine learning technique called reinforcement learning, and literally within a few hours (admittedly with all the computing resources that Google has to throw at a system like this) it was able to teach itself chess, within the space of a day, to the point where it could beat essentially every other system on the planet. What was interesting about it was that the other chess champion systems that existed had been built using other forms of machine learning technology, whereas AlphaGo Zero had essentially taught itself. It was given some very basic rules, but otherwise it taught itself how to play chess, and in a matter of a few hours it became undefeatable. And if you think how
long it takes a human chess master to learn chess, it is a process of years. Anybody who's watched The Queen's Gambit on Netflix will have seen just how much effort and skill goes into it, and computers get there at a pace that we just can't match.

So let's look at how machine learning works. What I'm going to talk through here is not an example of deep learning; this is ordinary machine learning. It's what we refer to as supervised machine learning, which in practice is probably one of the most common forms of machine learning that you see. The important thing to understand is that computers are not inherently clever. If you were to give a computer an image of a cat and a dog, it wouldn't be able to tell you the difference between the two; you have to train it to do that. So what you do is provide your AI system with what we call training data. We get a load of images of cats and a load of images of dogs and we label them: we say this is a dog and this is a cat. And to help the AI system out, we define the features that distinguish a cat from a dog. As humans, we can look at a cat and a dog and just instinctively recognise that one is a cat and one is a dog, but with computers you have to tell them what the differences are so they can learn. So what we say here is that dogs are larger, they have dull claws and they're a bit scruffy, and cats by contrast are smaller, they have sharp claws and they tend to be quite tidy. Then you feed all those hundreds, thousands, tens of thousands, millions of images into the AI algorithm and you say, this is an example of a cat and this is an example of a dog. What it starts to do is match those images against the features that you've described, and it starts to attach a weight to each of those features, and it works out which of these features are more important for distinguishing a dog from a cat.
So then, once it's gone through a process of learning, you present it with another image. Here's an image of a fairly scruffy cat. You give it to the machine learning algorithm, and it looks at it and says: well, this thing is fairly scruffy, you've told me that dogs are scruffy, therefore it's a dog. But actually this is very clearly a cat, so an engineer may provide some feedback to the AI algorithm saying, no, actually this is not a dog, it's a cat. What the AI system will then do is adjust the weight that it attaches to scruffiness in identifying a dog and say: OK, I assumed being scruffy was really important, but maybe it's not that important; being larger or having dull claws is more important to identifying a cat or a dog. You do that enough times and give it enough features, and it learns how to attach the weights to each of those features, and eventually it becomes pretty good at recognising them. Next time you give it, say, the picture here of a leopard, and despite the fact that a leopard is a large cat, it still recognises that, despite being large, it is a cat. So ultimately, over time, it starts to learn.

So what can you take from that? Well, you can learn from it that computers are not inherently clever. They don't just know these things; they have to be trained. That's why we have training sets, that's why we have engineers, that's why we teach them about the features. Some of the more advanced AI models that you see, the deep learning neural networks, will start to teach themselves: they will identify features themselves without humans having to train them on that. You can just give them a whole bunch of data and those deep learning systems will start to separate it out and work out what the features are. But basically computers are not inherently clever, and you have to be very careful, because if you give them biased data you will get biased models. There have been examples, for example, of technology
companies that have been trying to use AI systems to identify candidates who are likely to be better engineers, and they do that by giving the AI systems details about the existing engineers they have. The problem with that is that if you are a technology company where the majority of your engineers tend to be male, what you may just be teaching your AI system is that men are better at being engineers, which of course is very biased and incorrect. So what may happen is you end up with a situation where the AI system starts to reject female candidates. That's not what you want, so you have to be very careful with the data that you use to train AI systems.

And when AI goes wrong, it can go very, very wrong. I've got an image here on the right-hand side from the 1980s film WarGames, where you may remember a teenage hacker almost accidentally starts off a nuclear war based on some AI responses from military systems. But a real example of that: maybe you hear stories about some of the self-driving cars. For example, there was a tragic case in the US involving Tesla, where one of their vehicles did not recognise that it was passing a lorry. I think the story was that it mistook the side of the lorry for open sky and, in trying to overtake another car, moved into the lorry. So AI systems aren't perfect. They do make mistakes. They will get better with time, that's what they do, but when it goes wrong it can have some very serious consequences. This is why we need to create regulations, to make sure that these things are developed in safe, ethical and privacy-respectful ways.

It does of course raise the overarching question of what happens when a human no longer understands the algorithm. Ultimately, what all AI systems do is create mathematical models that predict outcomes. They're very sophisticated algorithms. The human knows how to create the AI system, but ultimately doesn't really understand how the AI does what it does once it's created that model, and that raises
some interesting questions, but those are probably ethical questions beyond the scope of today's presentation. So with that I am now going to hand you over. Oh no, sorry, I've got two more points, just very briefly.

How do facial recognition technologies use AI? A quick example here, very much like what we were talking about before: if you load an image of a face into an AI system, it will start to extract features of that face. As we were discussing earlier: what are the distinguishing features of that face? It will use those so that in future, when you present it with a photograph of the same individual, it can compare the features of that individual's face against the features it already knows and work out who that individual is. So if you go back to that iOS album of me earlier, that was how my phone did that.

And how do virtual voice assistants use AI? In very simple terms, a virtual voice assistant is actually little more than a glorified search engine. What happens is that when you speak to your Siri or Alexa, it uses natural language processing, a form of AI, which converts the spoken voice into a transcript. That transcript is then parsed by computer systems to work out the semantics of what was said, and it works out what instruction you just gave it (what's on at my local cinema, or what's the traffic like on the way to work), and then it will conduct a sort of search engine search to find the results for that, before ultimately returning those results to you. That is essentially how those work. So with that, having probably taken up more time than I should telling you how AI works, I'm going to pass over now to Leonie, who's going to talk to you about some of the core privacy risks.
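
The supervised learning walkthrough in this section (labelled cats and dogs, hand-picked features, and a weight on each feature that is adjusted whenever the system gets an answer wrong) can be sketched in a few lines of Python. This is only an illustrative toy, not anything shown in the webinar: the three features, the six made-up training examples, and the function names `predict` and `train` are all assumptions, and the update rule used is the classic perceptron rule, which is one simple way of "attaching weights" to features.

```python
# Toy sketch (assumed example): each animal is described by three yes/no features.
# Hypothetical feature order: [is_large, has_dull_claws, is_scruffy]
# Label 1 means "dog", label 0 means "cat".

def predict(weights, bias, features):
    """Weighted vote of the features: a positive score is classified as a dog."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0

def train(examples, epochs=20, learning_rate=1):
    """Perceptron rule: after each wrong answer, nudge the weights of the
    features that were present towards the correct label."""
    weights = [0, 0, 0]
    bias = 0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)  # 0 when correct
            for i, x in enumerate(features):
                weights[i] += learning_rate * error * x
            bias += learning_rate * error
    return weights, bias

# Made-up labelled training data, standing in for the labelled images.
training_data = [
    ([1, 1, 1], 1),  # large, dull claws, scruffy: dog
    ([1, 1, 0], 1),  # a tidy dog
    ([0, 1, 1], 1),  # a small, scruffy dog
    ([0, 0, 0], 0),  # small, sharp claws, tidy: cat
    ([0, 0, 1], 0),  # a scruffy cat
    ([1, 0, 0], 0),  # a large cat, such as a leopard
]

weights, bias = train(training_data)
print("learned weights:", weights, "bias:", bias)
print("scruffy cat classified as:", "dog" if predict(weights, bias, [0, 0, 1]) else "cat")
print("leopard classified as:", "dog" if predict(weights, bias, [1, 0, 0]) else "cat")
```

In this toy data the scruffy cat and the leopard are the "feedback" cases: each time one of them is misclassified, the weight on scruffiness or size is reduced, so the learned weights end up relying mostly on the claws. It also shows the bias point made above: the model can only learn whatever pattern the training examples contain.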

Privacy risks

 

The first question is: why are data protection laws relevant to AI in the first place? At its most basic, AI involves data processing, just of very large amounts of data, some of which will be personal data. So it's essentially like any other form of data processing; it's really just about computers crunching some algorithms and some data. What makes AI unique is the sophistication of those algorithms and the volumes of data typically involved, and ultimately, as Phil says, humans just don't understand how machine-learning-trained algorithms work; we just know that they do. If you could move to the next slide, please. Just one previous one; I think we've gone too far ahead there. Yeah.

So, looking then at the legal and regulatory landscape for AI: obviously, to the extent that personal data is processed, the GDPR will be relevant, as well as local data protection laws. The ePrivacy Directive will also be relevant to the extent that the AI implementation involves accessing or storing information on a device, and that's whether that information is personal or not. Then we've got the European Convention on Human Rights, because we've got to remember that data protection aims to protect individuals' rights and freedoms with regard to the processing of their personal data. Of course that includes the right to privacy, but it also includes other rights beyond privacy, such as the right to non-discrimination. So data protection by design and by default means that you must take into account the risk to the rights and freedoms of data subjects generally, and not just in a privacy context, and that's why anti-discrimination laws will also be relevant. Phil mentioned earlier the draft EU regulation on AI. That will apply, as he said, to all AI systems, but with a particular focus on prohibited systems and high-risk systems, so again it's broader than just data protection. Then we've got the NIS Directive
requirements, and they are likely in many cases to cover AI cloud computing services, so even if an adversarial attack in an AI context does not necessarily involve personal data, it may still be a NIS incident. And then finally we've got potentially sector-specific and technology-specific local laws, and they're likely to depend on the AI technologies being used in the context of the processing. An example might be licensing laws for some types of AI technologies, such as facial recognition technologies. Next slide, please.

OK, so let's look more specifically at how the GDPR applies to AI. The key point to understand here is that the underlying data protection questions for even the most complex AI project are much the same as with any new project. So the questions are: is the data being used fairly, lawfully and transparently? Do people understand how their data is being used? How is it being kept secure? So the full gamut of data protection issues will be relevant, but there are ones that cause particular challenges in an AI context, and I've outlined those on the slide.

If we look at accountability, organisations are required to account for the risks arising from the processing of personal data. AI implementations generally involve a higher degree of risk to rights and freedoms than other processing of personal data, and in the vast majority of cases the use of AI will involve a type of processing that's likely to result in a high risk to individuals' rights and freedoms, and will therefore trigger a legal obligation to carry out a data protection impact assessment. Then there is the question of what your lawful basis is for the relevant data processing. It's likely to be the case that you would have different lawful bases for your AI development versus your AI deployment phases, and there will be challenges in relying on certain lawful bases in an AI context; I'll talk about that in a few minutes. From a fairness perspective, it's about understanding
what the reasonable expectations of individuals will be in an AI context, but it's also about ensuring things like statistical accuracy, because that will impact specifically on fairness, and the likelihood of bias within an AI system will also impact on fairness. And in looking at fairness, organisations might also want to consider offering opt-out choices, and some of the ethical AI considerations that Phil alluded to.

From a purpose limitation perspective, that particular GDPR principle again poses key challenges, because it stipulates that personal data must be collected for specified, explicit and legitimate purposes and not further processed in a manner that's incompatible with those purposes. So data that's collected for a particular purpose can't simply be redeployed to train your AI models without seeking consent from the relevant individuals.

From a security perspective, again you have all of the known security risks, but AI makes known risks worse and more challenging to control. This is because of additional complexity, as well as things like heavy reliance on third-party code and relationships and the need to integrate with third-party components. There are also new types of risk, such as adversarial attacks on machine learning models, and I'll touch on that briefly in a few minutes. Personnel involved in AI may also be from a wide range of backgrounds and may not necessarily appreciate the broader security compliance requirements, and data protection more generally. In many cases you're going to be training on large data sets, which will involve training data being copied and imported from its original location, so it's essential that you consider the risks in that particular context. AI often uses open source code, so in many cases implementing AI will require changes to an organisation's software stack, and that will introduce additional security risks, as the ICO's guidance on AI notes.

Then, from a data minimisation perspective, there is an inherent conflict between
the need for data minimisation on the one hand and the need to allow machine learning to conclude what information is necessary from large data sets on the other. But again, the ICO guidance on AI makes clear that there are techniques that can ensure organisations only process what they need to process, and it recommends that organisations consider those technical measures very specifically.

From an individual rights perspective, in many cases personal data that's fed into an AI system, in other words data that becomes training data, will be subject to pre-processing to change it from one form to another, to make it into training data. That would still be personal data, because it's still likely to be possible to use it to single out an individual, for example through a series of their unique purchases. But because it's been subject to pre-processing, it will be harder to link that data to the individual: in many cases the identifiers will have been removed and the contact details will have been removed, so it's much more challenging to deal with individual rights. One particular right poses specific challenges in an AI context, and that's the Article 22 right in the GDPR not to be subject to decision making based solely on automated processing that has a legal or similarly significant effect. We'll look briefly at that in more detail on a subsequent slide. Next slide, please, Phil.

OK, so very briefly, in terms of accountability and carrying out your data protection impact assessment, one of the key challenges that arises in this context will be around the description of the processing, because it's likely to be highly complex and technical. What the ICO guidance suggests is that you consider having two different versions of your DPIA: one for a specialist audience, and one that contains a more high-level description of the processing, which is useful for explaining the processing to individuals and to internal stakeholders. It's also
necessary as part of the DPIA to demonstrate necessity and proportionality, in other words that there is no less intrusive way of achieving the objective you're seeking to achieve. It's also important as part of your DPIA to explain any relevant variation or margins of error, and it's important to document trade-offs. There will be a number of trade-offs arising in an AI context. One example is data minimisation on the one hand versus the need to ensure statistical accuracy on the other. Another is ensuring AI explainability on the one hand versus increasing the risk of privacy attacks on the other, the more you make the model transparent to potential attackers. So you need to document those trade-offs, and the rationale for them, within your DPIA. You identify and assess any existing or potential trade-offs when you design, or indeed when you procure, the AI system; you consider available technical approaches to minimise the need for trade-offs; and you have clear lines of accountability for final trade-off decisions and review them on a regular basis. The final point to note is that the DPIA, like all DPIAs, should be a living document, but particularly in an AI context the DPIA needs to address the idea of concept drift. In other words, if the demographics of the target population shift, or people change their behaviour, then you need to consider whether the DPIA also needs to be revisited. So there are some specific issues that arise from a DPIA perspective. Next slide, please, Phil.

I'm not going to go into too much detail on this particular slide, because my colleagues in Silicon Valley are running a separate series on AI and they go into this in quite some detail. Suffice to say that one of the key principles in the GDPR is that processing must be fair and lawful, and controllers must ensure transparency of data processing. From a lawfulness perspective, in an AI context, like with any other data
processing, you have to consider the legal basis for that processing, and you must break down and separate each processing purpose and identify an appropriate legal basis for each one. As I've mentioned, it's likely to be the case that you would have different lawful bases in the development versus the deployment phase. I've outlined on this slide the three most likely legal bases on which organisations could rely in an AI context.

Consent is likely to be an appropriate legal basis in a number of contexts, and indeed consent will be required in certain cases. For example, if you're processing biometric data in order to uniquely identify an individual, that's special category data and therefore triggers the need to comply with a special category condition in Article 9, and the most likely condition that's appropriate is consent. But there are some challenges in relying on consent, because from a GDPR perspective there must be a genuine choice, and the more things that an organisation wants to do from an AI perspective, the more difficult it is to ensure that consent is specific and informed. And if relying on consent during deployment, organisations must be ready to accommodate withdrawal of that consent.

In terms of reliance on contractual necessity, any processing that relies on this basis must be objectively necessary to deliver the service. So while, for example, in a virtual voice assistant context you might be able to rely on that basis in order to execute a voice command, it's unlikely that you will be able to rely on it for service improvement, and whether or not you can rely on this basis in order to personalise content depends very much on the circumstances.

In terms of reliance on legitimate interests, it's key to note that you cannot rely on this basis if the use of the data would be unexpected or cause unnecessary harm, and obviously the risk of that happening is far higher in an AI context than in other processing contexts. I've just
included on that slide also a link to the ICO's guidance that it produced in conjunction with the Alan Turing Institute. The key issues arising in relation to transparency in an AI context are addressed in that particular guidance, and of course the challenge will be to explain in a concise and easy-to-understand manner what's happening in the context of the particular data processing operations. You'll see on the slide also that I've included a box that suggests that consent may be needed in other contexts if there are specific rules triggered. For Article 9, we've mentioned special category data. We'll come in a minute to talk about automated decision making, and it's likely that you would require explicit consent to the extent that that automated decision making involves a legal or similarly significant effect on the individual. Also, to the extent that you're accessing information on a device, that triggers the cookie rules of Article 5(3) of the ePrivacy Directive, and therefore consent would be required. But as I say, my Silicon Valley colleagues go into this in quite a bit more detail. Next slide, please.

OK, so we've mentioned some of the challenges that arise in a security context from an AI perspective, and I mentioned that many of the risks are risks that we already know about, but that they're exacerbated in an AI context. But AI also introduces potentially new security risks. One of those risks is known as a model inversion attack. It describes the scenario in which an attacker has access to some personal data belonging to specific individuals that are included in the training data for a particular model, and because they have access to that data and they have access to the model, they can infer further personal data about those same individuals by observing the inputs and outputs of the model. If we take the example of facial recognition systems, they are often designed to allow third parties to query the model, so when the
model's given the image of a person whose face it recognises, it basically returns its best guess as to the name of that person and an associated confidence rate. What attackers can potentially do is probe the model by submitting many different randomly generated facial images, and by observing the outputs, so the names and the confidence scores, they could potentially reconstruct the face images associated with those individuals that have been included in the training data. You can see on the left one of those reconstructed versions, and while it's imperfect, researchers have found that the reconstructions can be matched by human reviewers to the individuals in the training data with 95% accuracy. That's an example taken from the ICO's guidance on AI, and I've included a link to that guidance in the slide. Next slide, please, Phil.

This slide just illustrates another potential new security risk, known as a membership inference attack. Essentially it allows a malicious actor to deduce whether a given individual is present in the training data of an AI model. So basically attackers have the target model, and they use the target model in conjunction with information they already have about the individual to work out whether that individual was part of the training data. They can't necessarily find out additional information about the individual, but they can find out whether they were in the original training set. That's not necessarily always particularly significant, but if, for example, the model is trained using data from a vulnerable or sensitive population, like those with dementia or those with HIV, then revealing that someone is part of that population can give rise to significant privacy risks. Next slide, please, Phil.

So finally, just to look briefly in more detail at Article 22(1) of the GDPR. I mentioned specifically some of the challenges that arise in relation to respecting individual rights. One
particular individual right that poses significant risks in an AI context is Article 22(1). What that article says is that individuals have the right not to be subject to decision making that is based solely on automated processing, including profiling, which has legal or similarly significant effects on that individual. Now, in many AI implementations that is exactly what the AI system is designed to do: it's designed to produce predictions, and based on those predictions certain decisions will be taken. So if Article 22 is triggered, in other words if there is automated decision making going on that potentially has these legal or similarly significant effects, what Article 22 says is that there are only certain legal bases on which that processing can be carried out. Either you need the explicit consent of the relevant individual, or the processing must be necessary for the performance of a contract or for taking steps to enter into the contract, or the processing must be authorised by Union or Member State law which introduces suitable measures to safeguard the individual's rights and freedoms.

Now, even if you have the appropriate legal basis, even if for example you have the explicit consent of the individual to undertake this type of decision making, certain safeguards still need to be built into the system by virtue of Article 22. Those safeguards demand that you have some process whereby the individual can obtain human intervention, can express his or her point of view, and can contest the relevant decision that's taken. So it's important, and again the ICO guidance makes this absolutely clear, that to the extent that you are obliged to insert a human into the process, it is a human with genuine decision-making power, not a token human intervention, and that they do have the power to overturn the decision. One of the key things also is that you make sure that you mitigate risks like automation bias,
and automation bias is basically that, at the end of the algorithmic process, humans tend to trust that process and assume that its output is correct. But as Phil has pointed out, AI can get things wrong, so it's important to ensure appropriate processes and procedures are in place to avoid that automation bias, to the extent that you do insert humans into the process. So that brings me to the end of the risk section, and I'll now hand over to Rob, who'll talk a little bit about some of the practical issues. Thank you.

Thanks, Leonie. Next slide, please. So the first problem we're going to address is that of negotiating data protection agreements between vendors and their customers, where the vendor is providing an AI service. As you may have noticed already, in non-AI processing there can be some complexities about whether a vendor is a controller or a processor, or potentially a bit of both, and this conundrum is certainly no easier when AI is involved. We'll come on to discuss that a bit on the next slide. The bottom line is that if a vendor is processing data for AI purposes, and that's only to benefit the specific customer that contributed the data, then there's a stronger argument that the vendor will be a processor. But if the vendor is processing data to benefit itself or other customers as well, for example for general product improvement purposes, or if it's targeting users, then that vendor is much more likely to be a controller.

So how do we deal with this in a data protection negotiation? Let's look at it from a product improvement perspective, as that's a very common point of negotiation. We've seen a few different strategies for dealing with this in practice. The first one is just to disclose the fact that the vendor is a controller and that the data will be processed for product improvement purposes. That's the more legally accurate description, but it will meet with some contractual
resistance from customers. Often customers get into the mindset that all vendors are processors, and they don't want to consider anything outside that box. So if we're going down this route, maybe we could make it a bit more tolerable to the customer if we gave them an opt-in, or potentially an opt-out, to allow some sort of control over whether that data will be used for product improvement more generally. So that's the first approach.

The second approach, perhaps the more traditional alternative, is for the vendor to position itself as a processor. Unfortunately, though, this is a bit like fitting a square peg into a round hole, because the vendor is going to need to get instructions from the customer to process that data for those purposes in order to meet the requirements under Article 28 of the GDPR, and that's where again you're likely to get that kind of resistance from customers and potentially hold up the deal.

So what's the answer? Well, a third option could be anonymisation. That's going to require a bit of investment and potentially some technical wizardry, but if the vendor is able to fully anonymise the data, then that data will no longer be personal data, so it would fall outside the GDPR, but sometimes trying to
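
As a rough illustration of the membership inference attack described earlier in this section, the sketch below shows the simplest version of the idea: a model that is noticeably more confident about records it was trained on leaks who was in its training set. Everything here is assumed for illustration: the toy "model" (confidence falls off with distance from the nearest training record, standing in for an overfitted model), the made-up numeric records, the function names `make_model` and `infer_membership`, and the 0.9 threshold. Real attacks, including those discussed in the ICO guidance, are considerably more sophisticated.

```python
import math

def make_model(training_records):
    """Toy stand-in for an overfitted model: confidence is 1.0 on a record it
    was trained on and decays with distance from the nearest training record."""
    def confidence(record):
        nearest = min(math.dist(record, seen) for seen in training_records)
        return math.exp(-nearest)
    return confidence

def infer_membership(model, record, threshold=0.9):
    """Attacker's guess: unusually high confidence suggests the record was
    part of the training data. The threshold is an assumed value."""
    return model(record) >= threshold

# Made-up numeric records, e.g. (age, systolic blood pressure).
training_records = [(34.0, 120.0), (61.0, 145.0), (47.0, 132.0)]
model = make_model(training_records)

# The attacker already holds these two records and only queries the model.
known_member = (61.0, 145.0)
non_member = (52.0, 138.0)
print("member guess for known_member:", infer_membership(model, known_member))
print("member guess for non_member:", infer_membership(model, non_member))
```

The attacker learns nothing new about the individual's attributes here, only whether the record was in the training set, which is exactly why the risk depends on what the training population reveals about its members.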

 

The importance of AI Governance

 

Make an assessment of the risk that businesses have using ChatGPT solutions. My background is as a lawyer. I'm an expert in data privacy regulations; I advised the French Prime Minister's administration during the negotiation of the GDPR as an expert, and I've been doing this for a very long time. The interesting thing here is that we will look at two different problems. The first one is how personal data is collected by OpenAI for the training of ChatGPT, so that's one first element that we need to assess. The second element is how personal data might be processed through the use of ChatGPT. So those are really two different elements. Why is it important? Because for EU businesses, or businesses that have EU customers, GDPR compliance is a very important element. If ChatGPT is not compliant, or cannot manage to get compliant within a reasonable amount of time, then it will be very difficult for EU businesses to use the tool, and so there will be a huge loss of competitiveness. So this is quite an important problem to take into consideration. Already Italy has raised a GDPR compliance problem, and France also; there are many complaints that are currently being processed. So we'll take a look at the problems that exist with OpenAI. First element: problems related to the collection of personal data. Through the training of OpenAI's ChatGPT there is a lot of data collection, because the data model is trained

 

ChatGPT and OpenAI

 

It helps clients throughout the world reach their third stage of digital transformation success. The topic of today's live stream is ChatGPT and OpenAI: what are the data privacy and legal issues? In other words, we're going to talk about the dark side of ChatGPT and OpenAI. We will also talk about the positive things, but we want to really focus on some of the hidden risks and things that people may not be thinking about as they embrace and get excited about ChatGPT, OpenAI, and other AI technologies that are becoming quite in vogue and mainstream lately.

I will introduce our guest in just a moment, but before we do that, a couple of logistical things. First of all, this live stream interview is going to be edited and polished, with additional content added, to become part of next week's episode of Transformation Ground Control, which is a weekly podcast that I host. It's released every Wednesday on LinkedIn, YouTube, Facebook, and Twitter, where it streams, and on Wednesdays you can also find those same episodes in audio format on Google, Spotify, Apple, Amazon, and the other podcast platforms where you listen to podcasts. So be sure to subscribe to the podcast if you haven't; it's called Transformation Ground Control, and you can find it on podcast platforms all over the place as well as streaming on the platforms I mentioned.

Secondly, we are going to start with some questions that I have for our guest here today, to talk about some of the legalities, data privacy, and dark side issues of ChatGPT and OpenAI, but I'd also love to hear your questions as well. I want to make sure that we get to audience questions throughout the conversation, so at any point while we're talking, we've got an eye on the streams that we're streaming to and we're watching the chat, so please drop in the chat any questions you have along the way. And speaking of the chat stream, if you don't mind just
dropping in the chat, wherever you're watching today, where in the world you're joining from, which city and country you're in. This has a global reach and a global audience that we're typically talking to every Tuesday, and we'd love to hear where everyone's joining from today, especially because this topic is very much an international topic, and there are legalities and nuances that are probably different in different parts of the world. So please drop in the chat where you're from. So again, the topic today: ChatGPT and OpenAI, what are the data privacy and legal issues? The best person I could think of to have this conversation with is someone that we've had multiple times on the podcast and on this live stream. It is Marcus Harris from capital law. So Marcus, thank you for being here today; it's good to see you again.

I appreciate you inviting me on to talk about this. Like you said, this is an evolving world of real innovation and impact, and I think the legal implications are pretty critical to have an understanding of. I look at this from two perspectives. One is just from a consumer perspective: what the legal implications are, laying out those risks, and looking at how this is going to impact you and what kind of legal constraints there are with respect to AI. And then I look at it from an enterprise perspective: what are the benefits, what are the use cases, how do you mitigate risk, and how do you exploit this technology to really gain some efficiencies on the corporate ERP, big data, big software side.

Yeah, absolutely. And like I said in the intro of this discussion, this is such a hot topic right now, and there's so much excitement and curiosity around it. OpenAI and ChatGPT are things that are mentioned in mass media and pop culture. A recent episode of South Park, which is a popular US animated comedy, even had a whole episode
dedicated to ChatGPT. So you start to look at these signals that people are really interested in this topic, and we thought it would be kind of cool to talk about it. We want to temper some of the enthusiasm about the technology, not to suggest it's not a good technology or that it won't totally transform the way we do business, but we also need to recognise the risks and the potential dark side of ChatGPT and OpenAI. And by the way, I'll keep talking about ChatGPT and OpenAI, but really this whole conversation relates to any generative AI model that's out there. Google has their own version, Bard. Microsoft has Copilot, which is a little bit different, but they've introduced that as part of their Office 365 suite, or I think they're beta testing it now. Microsoft is also an investor in OpenAI. Musk is investing in his own platform for an OpenAI type of model. So we're talking about ChatGPT and OpenAI, but we'll use that as a universal term to describe what we're discussing here, to cover all sorts of AI.

Just to set the context for the discussion, we won't go into a tonne of detail on what ChatGPT is and what OpenAI is, but we do have some resources on our YouTube channel that you can go to if you want to learn more. In fact, I'll ask our marketing team to drop in the chat some links to recent discussions. I put out a brief video just yesterday on my YouTube channel that gives an overview of ChatGPT, and a few weeks ago on this same live stream we had a discussion around what ChatGPT is, what it can do, and the ways it's affecting businesses and the way we do business. So we'll drop those links in the chat so you can learn more about what the platform is. Today we want to focus on, like I said, the dark side of this, and just to set the context I'll open it up by talking about a poll that I
published on LinkedIn just yesterday, actually. So we haven't gotten a tonne of results or the complete results yet, but we have 116 votes on this poll. In this poll, which I posted to my network, I asked the question: what will be the biggest positive or negative impact of ChatGPT and OpenAI? Just to give you some context of what people are thinking: 44% said it would make businesses more efficient, 34% said data privacy and legal issues, 15% mentioned loss of jobs, and 7% said something else (other, comment below), and we'll get to some of those comments in a moment. The reason I put this poll out there is because I wanted to get a feel for how excited people are about the technology versus how much people recognise or are concerned about the dark side. Certainly the 44% who said that it's going to make businesses more efficient seems reasonable, but it was interesting to see that 34% cited data privacy and legal issues as the main thing on their mind. So maybe I'll use that as a way to set the context for my first question, which
is what are some of the legal concerns or unknowns as it relates to ChatGPT and OpenAI? Obviously data privacy and legal issues are one of them; maybe we could start there and then mention anything else that you think is a concern from your perspective as an attorney.

Yeah, I think, just generally with respect to OpenAI and ChatGPT, this is probably an inflexion point from a technology standpoint that I don't think we've seen before. To me this is as revolutionary as anything that has come before; it just has the ability to transform the way we interact with people, and then there are the levels of efficiency. Of course, with that kind of power, as they say, comes great responsibility, right? We're going to talk about the dark side, but I do want to talk about some of the efficiencies and some of the positive things about this, because I think they're just enormous. One of the issues with legal regulation, with the law as it applies to technology, has always been the fact that technology is always three to four steps ahead of where existing laws, regulations, and any kind of regulatory framework are. So we're always in a scenario where we are catching up. I think one of the fundamental issues from a legal perspective, and there are so many that we could talk about this for hours, is that these tools, these systems, are being trained on a corpus of text, of speech, of data, and you don't necessarily have an understanding of what that body of information is made up of. Where are they getting it, for example? And I think that creates a fundamental risk for a variety of reasons. One is: what is the accuracy or the reliability of the output? We have an old saying in our industry, garbage in, garbage out, right? You put bad data into it and you get bad data out. So if there is no control or no regulation
or no interest in the accuracy of what this thing is training on, then there's really no guarantee as to the accuracy or reliability of what you're going to get out: the output, the content that you have created through this tool. The reliability of that creates substantive issues from a liability standpoint and from an IP infringement standpoint. I do think that is going to be somewhat of a self-regulating issue, because you're going to have different competing products, and if one is less reliable and the results are not as good as another, then people are going to gravitate towards a different product. But I think inherent in that is just a lot of substantive risk: what data is being utilised, how good the product is that you're getting out of it, how you utilise it, and then, downstream, how you protect the ultimate end user. So for example, if you're a software vendor that's incorporating AI-generated content into your products, what kinds of reps and warranties, indemnities, and limitations of liability are you going to take on with this unknown corpus of data?

From a legal perspective, there has been recent litigation, primarily in the IP ownership context. You've got ChatGPT, and there's another entity, I think it's called DALL-E, where it takes visual representations and creates modifications of them. There has been at least one notable lawsuit, and a couple of others, one filed by Getty, which has all these photographs. The AI tool has gone in and modified pictures, or used them as a basis for the generated output, that are Getty Images, and Getty says, hey, that's an infringement of our intellectual property rights in those photos. That's interesting and problematic: how, as a company, do you make representations about infringement, take on indemnity obligations, and limit your liability when you don't know what the basis for the data has been?

Yeah, that's really interesting. I wasn't aware
of that lawsuit involving Getty, but I could see how that could be a challenge. When I've thought about data privacy and IP or intellectual property types of issues with these AI platforms, I guess I have thought a bit more from the end user perspective. If I'm an employee at a company and I use ChatGPT to have it analyse some sort of sensitive or confidential company information, what happens to that data, and that sort of thing. But this is interesting: you're talking about the other side, which I hadn't thought about, which is the model itself using information that's already out there, and then who owns that data and what the IP implications of that are. What about when you are an employee? If I'm an employee at a company or an organisation and I use ChatGPT, let's just say I'm testing it out in my day-to-day job, what are some examples of issues that I might face or be exposed to as it relates to intellectual property and/or confidential information? Maybe you could help us understand how that works and what that might look like from a legal perspective.

Yeah, well, I've spent some time in my practice looking at the rules and regulations that these entities put out there in the context of terms and conditions and what their obligations are to you. And to be quite honest with you, they do a very good job of saying, hey, whatever you give to us is not necessarily going to be treated as confidential material, and we're going to be able to use that material in order to iterate on our product to make it better. So essentially, whatever you're putting into the AI tool that you're using, you have to make an assumption that it is not going to be treated with any kind of confidentiality, it's not going to be treated as a trade secret, and it's going to be open and available to the world. In fact, Samsung, I think just in
the last couple of weeks, has had a ChatGPT issue where somebody at Samsung disclosed proprietary or confidential information via ChatGPT. Now, I think that's a huge concern, because they don't take responsibility, ChatGPT or any of these other open AI platforms, when you're uploading your input into them. It's the Wild West, essentially. You should have no expectation of privacy or confidentiality, or really very much recourse to get that information back, and I think that's a huge risk for any company. You can't have some guy in a cube on the other side of the world trying to do a more efficient job by having ChatGPT, for example, generate output that creates substantive issues of liability, infringement, and exposure for enterprise customers.

Yeah, that's really interesting. This just occurred to me as you're talking there, Marcus: it reminds me a little bit of the early 2000s when Napster was becoming a thing. For those of you watching or listening who don't remember Napster, it was a platform that basically allowed you to share music, and it became an issue because obviously there were copyright issues. I remember Metallica leading the charge on this: Metallica sued Napster because they were creating this platform that allowed people to essentially steal their music, and they weren't getting money for what was rightfully theirs. Over time they won that fight, and Napster is no longer around, I don't think, or maybe they've been wrapped into some other company, but they're not as relevant as they were at one point 20 years ago. They were sort of a flash in the pan; they came and went because of some of those issues. I don't know that it's necessarily the same kind of issue from a legal perspective, but are these AI platforms or vendors at risk
because they're providing a platform that exposes potentially confidential information and shares that confidential information, or is it just a risk that we in organisations, as users of that tool, have to deal with? What are your thoughts on that?

I think from the platform's perspective, like I said, they've got these pretty robust terms of use that have been generated by teams of attorneys that really do mitigate their liability and their risk of lawsuits and recourse, and that is being tested in the court system now with a variety of different lawsuits. Certainly I think if that corpus of material that they use to train their AI model contains proprietary or confidential information that has been accessed without authorisation, or is being used without express authorisation, that creates a risk in the platform, not only to the owners of the platform, the providers of the product, but even to the potential end users. Think about that for a second as a user of ChatGPT: if you are now generating code where that AI tool has been trained on someone's proprietary code that's confidential or protected by an intellectual property right, the output that you have generated is now infringing, and who's responsible for that? The AI provider attempts to wash their hands of it, and we'll see how successful they are in doing that as these court cases move on. But then what is your liability as the person that generated that code? The AI provider is going to point to you and say, well, you represented that you had the right to do what you did; you put the input in, and you're responsible for what comes out. That's not necessarily true. So we are at the vanguard of a lot of these legal issues, and it's super interesting. I think there's certainly going to be some trailblazing legal precedent that comes out of these cases that's going to define the
parameters of how you are going to use these systems in future. Some of the other issues associated with this too: who owns this output is another fundamental question, one that really has been clarified, at least to some extent, by the Copyright Office in the last month or so. The Copyright Office came down on this by saying that if you are inputting information and the output really has no human interaction, control, or discretion, it is not protectable from a copyright perspective; it's not protectable as intellectual property in the eyes of the Copyright Office. So if you were to go in and say, OK, let's do a new painting in the style of Van Gogh and Monet, and now you want to apply for copyright protection, the Copyright Office is going to ask you certain questions about it, and if there is no human interaction or human control over what has come out, they're going to reject it from a protection standpoint. That is in contrast to, and this is their example, a graphic novel that has text generated by a person, or maybe partially generated by an AI tool but then edited and modified by a person, with images generated by AI. There are going to be copyrightable elements to that, and that work would be protectable, though maybe not every element of that work. So they are coming down with clarity; certainly the Copyright Office is providing guidance to consumers and courts as to what is protectable and what is not.

Yeah, that's fascinating. I was not aware that the Copyright Office of the US government is already providing those parameters. It leads me to another question, which is that you see a lot of governments throughout the world, as you just described with the US government, responding fairly quickly, I mean as fast as governments can move, I suppose, to this threat or the uncertainty around ChatGPT and OpenAI. In fact, some countries, like Italy for example, just a couple
of weeks ago completely banned ChatGPT and said it's a platform that is illegal in their country, and I think China just a few days ago did something similar. So governments in some cases are taking more extreme measures and trying to shut it down or trying to completely control or limit the exposure. As an attorney you probably hate this question, because I'm asking you to predict the future in a world of uncertainty, but do you think that trend will continue, with the US or global governments getting more and more involved in trying to regulate it, similar to what they're doing right now with cryptocurrency and other things that could be perceived as threats to the status quo? What would your prediction be on that front?

I would expect it to continue in certain jurisdictions, and for me that really is the wrong approach, because I think with just outright banning something without fully realising the benefits of it, there's a real risk that your technology industry is going to be left behind. Once you let the genie out of the bottle, so to speak, you can't really put it back in, and these things don't necessarily have borders, right? If you've got ChatGPT generally available, people are going to find a way to use it. It certainly is providing efficiencies and benefits; from a profitability standpoint, people are going to be clamouring to get their hands on it. I think the right approach is regulation and developing a legal framework that is going to mitigate whatever perceived risks are associated with it. Yes, I think it's certainly right to be concerned. I don't know whether we need what we call sui generis protection, which is a kind of bespoke, brand new type of compliance system for this, or if we can leverage existing laws. I don't know, maybe
there's a mix of both; maybe some bespoke regulations need to be put in place to address whatever concerns there are with OpenAI. But I certainly think that a framework that already exists in law can be applied to this in a substantive way to mitigate risk. Look at the implications with respect to banking and healthcare; those are enormous implications, right? You've got faulty data that's got biases in it, you've got inaccurate data; you're talking about people losing their lives, and you've got financial crises that could be perpetuated through AI issues. So the risks are certainly enormous. We talked about accuracy of the data earlier: who's responsible, essentially, for policing that? Who is making sure that the data that is put in is accurate and the output is reliable? As far as I know, and I could be wrong, I'm sure somebody here in the chat can correct me, there are no systematic regulations today that are applicable to AI. What that means, from an administrative standpoint, is that this is administratively burdensome, but as users, particularly in the corporate context, you've got to regularly assess whether that data is accurate. You need to know what type of corpus is being utilised with the particular product that you're using. Is it a private data system? Are they just going out on the Internet scraping data that's generally available? That's going to be problematic, potentially. You have to have an understanding of what those risks are, and then you can craft policies internally to mitigate those risks. You referenced Napster, and I think that's a good analogy, a good comparison. I look at this almost from an open source context, right? What are the risks to your enterprise from open source
software, and what kind of programme have you put in place to minimise the risks associated with the liability caused by open source software? It's similar in some ways when you're looking at unfettered use by your employees and consultants of open AI tools to make them more efficient. How are you going to moderate that? How are you going to mitigate the risks? What policies and regulations, as an organisation, are you going to put in place to govern that kind of thing?

Yeah, you mentioned a couple of things there that I want to come back to in a sec with follow-up questions. One is what you just said about policies you might put in place as an organisation; I'll ask that later in the discussion today, and maybe we'll dive into that. I also want to come back to a couple of other points you mentioned. Before we do that, a couple of things. Marcus, if you're wondering why no one is saying anything in the chat, it's because the chat stream isn't working in the green room, the studio we're in right now. So I have LinkedIn open on my phone, and YouTube as well, and I'm going to have to manually pull the questions. We had the same problem last week; I'm not sure why it's an issue. So unfortunately you won't see the questions in advance like you normally would. But before I get to the question here from the audience, I'm just going to go over where people are joining from. We have I Anna from Trinidad and Tobago, thank you for being here today; Subash Sheesh from India; Ashley from Atlanta; Wesley from Colorado; Lars from Charlotte, NC, just as a few examples. I'm trying to cherry-pick some different countries that people are joining from today. Thank you all for being here, really appreciate it. One of the questions we have actually just came in on LinkedIn, and it was related to what you were talking about a second ago, Marcus. I
just need to find it, because this isn't ideal on my phone and I lost it. Where is it? Bear with me one second. Here it is. This is from the mall on LinkedIn: AI-generated IP. You were talking about that a minute ago as far as what the US Copyright Office is doing. Do you have any other thoughts beyond that? I know you're not a global attorney, your focus is obviously on US-based law, but what are your thoughts on that? Anything else you would add to that thread?

Yeah, I think that's certainly an interesting question. Is it patentable? Is there a patentability component to these things? I've really thought about this in the context of the copyright component, and I certainly think that if there is novel output being created, it very well could qualify for patentability. Now whether that qualifies for patent protection in the United States, I think the hypothetical is just incomplete, so it's hard to say. I talked about something called sui generis protection earlier, and it may get to a point where you're using machines to create innovative technology, and there may need to be some sort of programme put in place to protect that which doesn't rely on the component that the Copyright Office requires, which is human intervention, a human component to it. So I could definitely see legislation that would expand or clarify the patentability of technology being developed through open AI sources. I think that could be something that comes down the pike, but like I said, it's really hard to say without having specific models, things like that. It's an interesting question.

Yeah, for sure. Here's another comment. I'm actually going to pull this comment off of my LinkedIn poll that I mentioned earlier, where I asked what the biggest positive or negative impact of ChatGPT and OpenAI would be.
In this thread, one person, Jonathan on LinkedIn, chose other, and when you choose other I ask people to put in the comments what they think the real issues are. I'm going to read this comment; even though it's a bit longer, I'll try to condense it a bit, but it brings up a really good question and something that I don't know that a lot of people have really thought about, just like a lot of people haven't really thought about the confidentiality and IP issues with ChatGPT. Jonathan says in response to the poll that the items on the list don't even begin to cover the complexity of issues we face. What about faking people with fake explosive messages, fake videos, propaganda, disinformation, irresponsible know-how, like kids gaining dangerous chemistry know-how, or the floodgates of security threats and exploits by generating code? He goes on to list a bunch of things that you could use AI for: persuading people into scams, fake family members calling you for cash where you can't tell that it's not a family member. I know internally at Third Stage, just like a lot of companies, we get a lot of phishing scams or attempted phishing scams. Could someone replicate me, create my face or voice, and make it sound like I'm saying, hey Marcus, I need you to wire me $1,000,000 right now? Do you see that being an issue, or have you seen any issues like that recently?

I think that's really an issue from an ethical perspective, right? We talked about regulation and a legal framework, and I think that's something that would certainly need to be regulated, in my view. Whether the existing framework of laws and patchwork of regulations could be applicable to that, I'm sure they can to some extent. But here you've got this explosive technology, and you're pretty available on social media, right? People know what you look like, they know what your voice
sounds like, and I think that someone's ability to fabricate a deep clone of your image and your voice, the cadence with which you speak, your intonations, would be pretty easy for one of these open AI platforms to replicate. Then all of a sudden we've got this deepfake version of you calling somebody, asking them for money or whatever it is. The ability to clone your voice and mimic speech cadence and tone is really off the charts. If you look at how a lot of those interactions take place from a commercial standpoint today, you go onto a website and there's a chatbot that interacts with you, and it's this kind of halting, stop-and-start interaction. That's all going to change fundamentally. There's the ability to use that for nefarious purposes, which is the deepfake fraudulent schemes, causing havoc in financial markets, political craziness. But then you look at something as innocuous as Expedia.com, which just integrated a ChatGPT API into their website, where you can go online and say, hey, I want to go to Puerto Rico for the weekend, can you tell me what flights are going there and what the price ranges are, and it just tells you, like a person. So that's one of the positive, efficient uses of it, but the flipside of that is pretty dark and pretty scary.

Yeah, and for a real-life example of how someone could potentially use this technology in the nefarious ways you're describing, Marcus, I would invite everyone who is listening in, if you're on TikTok and you haven't already, to go check out an account on there called Deep Tom Cruise. I've mentioned it before on this podcast. It's a fascinating account; it's actually pretty funny. It's apparently a facial recognition and generative AI software company that Tom Cruise is
somehow involved with. He actually invested in it; he has an equity stake in the company, from what I understand, so he gave them permission to create this account called Deep Tom Cruise. It's sort of like a younger version of Tom Cruise, maybe 20 or 30 years ago; it kind of looks like him in his 30s, or what he looked like 20 years ago, even though he still kind of looks the same, which is irritating to me as a middle-aged man. But if you go to this account, you cannot tell, or I cannot tell, that it's not him. If you watch the videos, they're pretty funny. To your point, Marcus, the mannerisms are just like him: he's kind of quirky in the way he talks, his personality is a little quirky, and he's pretty funny in the videos. But you cannot tell. I tried really hard to notice some sort of deficiency or inconsistency, and you cannot tell: his voice, his mannerisms, his look, everything. So I think if you look at that, it's pretty fascinating but also kind of freaky, because I fell for it when I first saw it. I thought it was Tom Cruise. And you can see someone using that to their advantage.

You extrapolate that to scenarios where you've got a deepfake of that quality of Vladimir Putin declaring nuclear war on the United States. I mean, those have huge implications, right? If you take it to a more mundane level, I get a tonne of robocalls, both on my cell phone and on my work phone, and I have a hard time every once in a while determining whether I'm talking to a real person or some sort of artificial intelligence, and that artificial intelligence is not even that good. I catch myself initiating a conversation and then realise they're just trying to sell me an extended car warranty or Medicare or whatever it is. I don't like those calls. But I mean, it's going to really, really blow those things out of the water, and
I think, you know, we talk about existing regulations. Well, what regulations do you have? Do-not-call lists and that kind of thing. It's become incredibly easy for these people to inundate you with scams, and the likelihood that you're going to fall victim to those now increases exponentially as well, which is certainly problematic.

So I guess, asking a question I already asked you, but asking it in the context of what you just said: who's responsible for that from a legal perspective? In the future, how do you think courts and governments throughout the world might settle that? Is it my fault if I fall for something like that? Obviously it's the criminal's fault, whoever creates the fraud; there's obviously liability there. But how at fault or at risk are we as people who might fall for it, and/or the organisations or platforms that enable that sort of activity? How do you allocate the liability in this whole equation, from your perspective?

Well, I think that the owners of these AI tools have a responsibility, right? I think there's an effort on their part to meet it, and as long as they are meeting a certain standard of reasonableness and trying to prevent nefarious use of their tools, really, you can only do what you can do, right? If they're not being reckless and not being negligent, then I think the liability that they have is probably going to be on the lower end of the scale. I think you're going to get into a situation where there's a vulnerability in the AI platform that someone is able to exploit, kind of hack, and now starts to create something that can be utilised in a bad way. But then you're always going to have a situation where you've got criminal organisations, or people with ill intent, developing technology that's substantively similar, and it's going to be focused on, you
know, fraudulent activity. And how do you regulate that? I mean, you can't regulate the criminals out of being criminals, right? You can have laws that hold them liable and responsible for those things. But certainly I think you've seen these efforts. I was watching a YouTube video the other day where someone was using, I think it was ChatGPT, and they asked how to make a bomb. That's against one of its parameters of use, and it said, "Hey, I can't tell you that, sorry." So they're putting those kinds of safeguards into the system, but there are certainly workarounds; people are certainly creative, and they're going to get around those types of safeguards even once they're put in place, certainly.

Yeah, and you sort of allude to something that we started to talk about earlier, which is the inputs into these AI models and who's the arbiter of truth. I look to social media right now and the way governments throughout the world, including the US government, where you and I are based, have in my opinion sort of overreached on moderation of discussion and speech on social media. That's just my personal opinion; I don't want to open a can of worms for people to agree or disagree with me on that. But it makes me wonder: if the US, which is known for its freedom of speech, doesn't have freedom of speech in the way we used to, and the government has sort of cracked down to determine what it thinks is true and what it thinks is false or misinformation, I guess I worry a little bit that governments and other powerful bodies, even the AI platforms themselves, the AI vendors themselves, may start to moderate, or may start to build their own personal biases into these AI models. It just creates a bunch of questions around what is fact, what is opinion, what's the quote-unquote right answer. You mentioned that
earlier, that these AI models are going out and gathering information that may or may not be true, and using that information that may or may not be true to create outputs that they're feeding to their customers. So, any thoughts on that? How do you think that unfolds? I know I'm getting outside the legal realm there, to talk more about societal stuff.

But I think it's incredibly problematic. I mean, I think bias exists everywhere, right? If you've got bias in your data and you've got bias in your team, and the output is then dealt with as true by unsuspecting consumers or users of the technology, I think the implications are vast and incredibly problematic. You think about things like diagnosing certain types of healthcare issues, and you think about whole populations that are more susceptible to this because of their race or their identity or whatever it is. If there's bias in that area, it's going to substantively impact how people are diagnosed with disease, potentially, right? Or if you think of financial or banking utilisation of these tools: who's going to get a mortgage, in what neighbourhood, and those types of things. Certainly the bias factor can be strong. And how do you regulate that, especially if you're concerned about over-regulation? Having images of Tom Cruise is one thing, and certainly with his consent it's a totally different thing, but how do you stop people from utilising your images? There are certain laws in place to deal with that now, but I think there's got to be some level of regulation as to the quality of the content, the quality of the output. I think for now self-regulation is really kind of where you're at with this, and see where it goes.

Yeah, absolutely. I'm fascinated by that, because I think anytime you have something as
powerful as OpenAI's ChatGPT, and something that has caught on so quickly, much faster than even cryptocurrency. I mean, cryptocurrency is sort of a mild version of this in terms of being an alternative new platform, a new way of doing things, and you see how governments throughout the world have reacted to that: they've tried, and they're still trying, to find ways to regulate cryptocurrency. I think anytime you have something like this, governments throughout the world just can't help themselves; that's their job, I guess, in some ways, to rein it in. So you wonder how they're going to rein this in, what they're going to do, and whether that diminishes the value or the benefits to society, or whatever the case may be.

Yeah, well, that's always the risk. I mean, if you're an Italian right now, you don't even have the benefit of utilising this technology, because the regulation has been so extreme on one side that it's been totally banned. To me that doesn't seem like a well-reasoned approach to dealing with the issues. I think self-regulation initially, see how it goes, and then government interference to the extent that it's necessary or needed, and maybe only in certain industries or regulatory applications of this where it would be necessary.

Right. So here's a follow-up comment to one of your comments earlier, Marcus. This is from Frederick on LinkedIn. Frederick says, "It looks like Samsung found out by an internal audit that people had done infringements. ChatGPT didn't go out and post to Twitter with the information." So in other words, ChatGPT didn't expose that information necessarily or make it public, but the internal audit exposed the fact that people had shared that confidential information. It doesn't change the fact that now that confidential information is part of the OpenAI data set, the data model, and it could be used. Obviously ChatGPT is not going to post that information on
Twitter, but it is information that was confidential, it might have been intellectual property, and it is now part of the OpenAI model. I think that's an important point there.

Yeah, I think it's important too. It wasn't a data breach necessarily, right? It wasn't as if it had been exposed on the Internet in an open and conspicuous way, though it had been exposed within the corpus of this data set and is being utilised potentially by all the users of ChatGPT now. What is being exposed, what's not being exposed, and how detrimental it is to Samsung in reality is hard to say, but it can't be a positive thing in any way, shape or form, right? And this is why, when we talk about government regulation and self-regulation of these tools by the owners of these tools, I think as an organisation, say you take a Samsung, you've got to have internal policies, you've got to have internal regulations that are going to govern what your employees can and cannot do. My understanding is that Samsung had shut down any internal use of ChatGPT, and then, like I said, there was some pushback internally, or they reassessed that position and thought it maybe wasn't the best position, and then look at what happened, right? But I think as an attorney, and certainly as a former in-house attorney, you really have to look at this with a careful eye and look at the benefits and risks. You would like your company to take advantage of the efficiencies, and the business people are going to do that, but you've got to assess the risk and put in a regulatory scheme internally that mitigates risks but also increases use of the tool, if you can. And that's not an easy thing to do, right?

Yeah, absolutely. Now here's a question from LinkedIn. This is Caroline on LinkedIn, although I'm wondering if it's really Caroline. Do we know? I don't know, it might be a deepfake version of Caroline, but it's allegedly a person named Caroline. I'm kidding, Caroline. But she says
"OpenAI's privacy policy for their paid premium API indicates that they do not use our proprietary data to tune their data model, nor do they store our data. Other than their privacy policy, how can we be assured that our sensitive data is secure?" So I guess first of all, I'm not aware of what their privacy policy is, to be honest, so I can't validate that that is what it says, that they won't use it to tune their data model or store their data. I guess that begs the question of what's the difference between what they do store and use in their data model versus what they don't store. I'm not sure what the differentiators are. Is that something you're familiar with or have any thoughts on?

Well, I've looked at the ChatGPT/OpenAI terms of use, and my understanding is that it's pretty vague, and there's a lot of nuance and grey area that would allow them to utilise any information that you submit to at least modify the platform or make it more efficient. But certainly I think the bright-line rule here for me as an attorney, regardless of even what the terms of use say, is this: if you're going to be submitting information that is confidential or proprietary or trade secret into this system, that is a fool's errand, right? That is not something that you should be doing, or allowing or encouraging your employees to do as a company, and there should be some restriction internally that you impose as a company to prevent that from happening. To me, the likelihood that that information is going to be utilised in some way is very high, and the risk to the company, you don't know what it is. In the Samsung example you have a scenario where you've got information that's part of the corpus. It is being utilised in some way, shape or form. How substantively, I don't know. I mean, talk to
Getty Images. They're certainly not happy that their proprietary copyrighted photographs have been incorporated into the corpus and are being utilised to generate images. What's the infringement from a real practical perspective? In some ways it doesn't matter, because of the statutory liability, and I think the total damages that they're seeking are something like $12 million. So the risks and the perceived risks maybe aren't equal, aren't the same, but certainly from a liability and risk perspective you've got to manage and mitigate it.

Yeah, absolutely, great point. A lot of what we're talking about here today seems to come back to a common theme, which is that we don't really know. We're talking about a lot of uncertainty and a lot of speculation about what might happen in the future. Some of this has already happened, but I'd say there's probably 90% or more of this that we haven't really figured out or settled yet as it relates to ChatGPT.

All right, you know, one of the things that we haven't talked about that is directly related to the Samsung incident and the utilisation of this information is: what do you do as an organisation when you have a huge data set of user information? Let's say you're a company that's utilising AI, or you're using software that incorporates AI, to analyse end-user data, confidential information, proprietary information, personally identifiable information, and that's being utilised in the data set. What are your obligations with respect to GDPR and other privacy regulations, and what are not only your responsibilities but your liabilities with respect to doing that? Certainly with GDPR, I would say that you've got to be transparent, you've got to get consent, and you need to certainly let people know if there's automated decision-making with respect to their personally
identifiable information. But the liability associated with utilising all that data: it could be something as innocuous as using OpenAI to get market analysis, or just user data, general statistics on how they're using your products. Now you're utilising personally identifiable information and you're putting it into the AI system, and that's a problematic thing.

Yeah, you mention a great point with GDPR in Europe, which is the data privacy law that was enacted a few years ago in the EU. You've got GDPR regulations that sort of limit what can or can't be shared or used in this way. What kind of confidence do you have that these OpenAI models, ChatGPT, Google's Bard and any other AI platform like this are compliant with GDPR and other privacy laws?

Well, I'm not so sure that that's the way you utilise these tools. If you're going to be putting personally identifiable information into the tool, you're the one that needs to be compliant with the regulations that govern your use of your customers' personally identifiable information or data. So I'm not so sure that it's really a question of how compliant ChatGPT is, necessarily, and what it allows you to do with personally identifiable information. Can it identify it, does it preclude the use of that information? That's one question. But if you've got some sort of AI, even an OpenAI application or tool, that you're utilising in connexion with your business, and you're trying to use that tool to analyse your customer data, what obligations do you have, and do you have an understanding of how that data is being used within that toolset? Again, are you putting personally identifiable information into a corpus of data that's now going to be analysed by this OpenAI system? That's certainly a big deal and a problem, something that you need to be aware of, right?

Yup, and
great point. Caroline on LinkedIn, by the way, just for those who are wondering, did confirm that it really is her, so it must be true then. If she came back electronically and told me that, I've got to believe it, right? Now I've got to be sceptical anytime I see anyone on social media, or even a video; I've got to be sceptical whether it's really them. Here's a question from Sam Graham on LinkedIn. Sam says, "Sometimes perception is reality. If enough people believe false information that ChatGPT publishes, is there a danger of that causing actions that we wouldn't want? For example, if it said that vitamin C was bad for us, the results would not be good." What are your thoughts there?

Well, yeah, I think that is problematic, because that's about the accuracy and the truth of what the output is. If you've got bias and bad data that is replicating within that system, you're getting output that's just not correct. So how do you police against that? How do you make sure that people's confidence in the accuracy of the data is where it needs to be? The implications associated with that are pretty enormous, right? I mean, you've got facts coming out saying something isn't good for you when in fact it is. Perception is reality, right?

So I guess just to put a bow on this, or wrap this all together: we talked about a lot of uncertainty, we talked about what we think might happen in the future, we talked about the landscape of what some of the risks are. But what closing recommendations would you have for organisations that are concerned about the potential downside of ChatGPT? You mentioned before policies that organisations might put in place. I mean, what would you do if you're giving us advice as an attorney and corporate counsel, which you actually are, you're Third Stage's corporate counsel, but if you were counsel to others on the call here today, what would you suggest they do to navigate this uncertainty?

Yeah, I think
it's somewhat similar to some of the things that we say when you're trying to adopt ERP software or technology in general, OK? You can't do this out of a FOMO mentality, right? You can't be an organisation that says, "Hey, we want to implement AI, incorporate it into our systems, to make use of the efficiencies that are there." You have to have a solid and legitimate business case for utilising that piece of technology. Just like any other technology, it's got to make sense for your business, right? Then, once you have determined that there is a business case to use a particular AI tool, you want to make sure that you have an understanding of how the vendor or the provider of that tool trains their AI product. What data is it using? Is it general data on the Internet? Is it proprietary information that only they have access to? What's the quality of that information? All of those are key components that you need to make a determination on, right? Then you've got to, in my view, do the necessary due diligence to understand what kind of intellectual property rights you need to utilise the AI tool, OK? What kind of intellectual property representations and warranties do you need to flow down to any end users or customers that are going to use products that are associated with, generated by, or incorporate the AI technology? Then, once you've got that framework in place as to what your risks and liabilities are and how you're going to mitigate them, what's the use policy for utilising the AI platform? I think you've got to think about that in two ways. There's going to be a different appetite for risk and risk tolerance if you're utilising AI internally, like Samsung, OK, or you're utilising it externally, like Microsoft, right, or Google or something. Those are two fundamentally different use cases and they carry fundamentally different risk mitigation models, and you might have this kind of
tool that you're using internally to increase your own efficiencies, and then one you're using externally in order to generate more revenue and provide better customer experiences or whatever. Those are really the fundamental things. Then there are other implications with respect to mergers and acquisitions. If you're acquiring companies, you need to understand what kind of AI components they might have embedded in their products, how they're utilising those, and what the risk to your organisation is once you complete that merger or acquisition. And then again I'll come back to how we talked about reps and warranties from a contractual perspective that you're going to flow down to your end users. But what about documents in general that you've signed? What's your limitation of liability? What kind of reps and warranties have you made to other companies regarding the use of intellectual property, your ownership of intellectual property, and does that need to be modified? I mean, this is not a mature area, right? Even when you look at the terms of use of these AI tools that are out there, they're not drafted in a way that accommodates every kind of potential scenario. So you as a user have to take on this level of due diligence to make sure that you've got the proper scheme in place internally to mitigate the risk and to maximise the use and efficiency of these tools, to take advantage of them, right?

Yeah, well said. I hadn't thought about the difference between internal-only use and the public-facing or customer-facing use of these tools, and recognising the need to treat those differently and mitigate risk differently depending on what the purpose is or what the use of the product is. I'll give you an example just from my personal experience. I know that there are parts of our team at Third Stage that are using ChatGPT, and I didn't tell them to. It's not a policy that you need to use it; we didn't decide to roll it out
company-wide or anything like that. It's a technology that anyone can go use, just like Google. I don't tell people to use Google, but they do; they use it when they need to go find something, and similarly people are using ChatGPT in that way. But we've started to reinforce the caution that needs to be applied here, both in terms of information that might get leaked and in terms of not relying too much on ChatGPT. Just like with Google, we're not going to Google best practices for how to make a digital transformation successful. That's not what we do; what we do is based on our experience. We might augment that experience with information we find there, but we also have to recognise that just because it's on the Internet, just because it's on ChatGPT, doesn't necessarily mean it's true, so you have to take it with a grain of salt. So education is a big part of it too: making sure that, in addition to having policies in place, people are educated on what these risks are and are careful with that information.

Absolutely. I mean, the risks are pretty big, right?

Yeah, absolutely. Well, we didn't talk about this at the beginning of the interview; I think we were both so excited to jump into this topic that we glossed over it. But maybe tell us just quickly a little bit about your law practice and what it is you do personally, as well as what Taft does, and maybe let us know how we can get a hold of you, for anyone who's interested in chatting with you more about this topic or wants to bounce ideas off you.

Yeah, absolutely. You can go to Taft Law; I'm on there, Marcus Harris. I'm on LinkedIn, and there's a YouTube channel. I think searching "Marcus Harris attorney" is the easiest way to find me, because there's a couple of athletes that rank a lot higher than I do. So, but
regardless, my practice is really focused on intellectual property and technology issues, general intellectual property issues, and certainly enterprise software related issues, from litigation to contract drafting and negotiating. So we're at the forefront of a lot of these data privacy issues, and certainly issues just like this that are starting to come up more and more in our practice and can really impact our clients in a big way. The approach that we have to all of this is not the accounting approach where you just say no. It's really an approach of let's get to yes: let's figure out how to mitigate the risks in a manageable way to get you, as a business or consumer, to utilise what you need to use, as long as you're aware of what you're getting yourself into. That's the way we like to approach it.

Yeah, absolutely, well said. That's why you're such a perfect guest for this topic, because of your intellectual property focused background as well as your software technology background and focus. I imagine in the next few months you're going to have a lot more to talk about as you get more engaged in resolving some of these issues for more and more of the organisations and clients that you work with. So we might have to do a touchpoint later in the year to see where we are and any new updates on this whole ChatGPT and OpenAI thing. But thank you very much for being here today, Marcus; really appreciate your time. yeah the audience chatting or

NEW VIDEO

We're now talking about AI, the pros and cons, like never before. But do we even see what we think we are looking at? The advances in AI are now exponential, so fast that some AI experts now predict artificial intelligence will become more intelligent than humans by the end of this decade. A growing chorus of tech thinkers are warning we are not prepared, and that includes Geoffrey Hinton, a leading machine learning pioneer who is known as the godfather of AI. He says it's time to put the brakes on AI while we still can.

We're scientists, right? We're exploring what happens when you train large neural nets on computers, and that's just reality, that we ended up here. It's one of those things where there's no way that people weren't going to explore it. The issue is, now that we've discovered it works better than we expected a few years ago, what do we do to mitigate the long-term risks of things more intelligent than us taking control?

A big question there. I want to bring in now Lindsey Gorman. She's a senior fellow for emerging technologies at the Alliance for Securing Democracy in Washington, DC. Lindsey, it's good to see you again. Geoffrey Hinton wants us to slow down; he has joined a growing chorus of AI experts who are calling for a moratorium on AI development and deployment. Where do you stand on this?

Well, I think that he speaks to a very real concern, that AI systems are progressing rapidly, more quickly than anyone really expected a couple of years ago. The idea that we could train large language models such as ChatGPT and have them display what we think of as intelligent tasks and capability is not something we really saw in sight, and that caused a little bit of a panic in the community, to say, how are we vetting this technology, where is it going to get ahead of humans? I don't think we're there yet. Certainly there are these sort of capabilities that look like intelligence and may be moving in that direction, of course, but I don't think we're necessarily at the
point where AI is taking over from humans. But there are some real concerns with how these systems are going to be used, and how they are already being used, when it comes to the information environment, when it comes to job displacement, and when it comes to extending wage inequality.

You know, Italy, as soon as ChatGPT really exploded, put a ban on the chatbot, which was then later lifted. Is that the right approach when we're talking about maybe trying to get a better handle on managing AI?

Well, I think you have to hand it to Italy in that it very aggressively deployed its existing legal architectures to this new technology. Specifically, it applied the general data protection framework, GDPR, the EU's landmark data privacy and data protection legislation, to say that ChatGPT hadn't actually justified and really demonstrated the need to scrape all the data that was used to train the model on the Internet, including in Italy. That was really the reason for, and the legal basis for, that block: to provide that justification for why are we scraping this data to develop the system without justifying the need. But I think it does really speak to this question of how much are we able to apply existing legal frameworks such as the GDPR, as in the case of Italy, and how much do we really need to develop new regulatory frameworks to address these new applications and new uses and new risks with these large language models. I think the answer is a little bit of both.

And of course when we talk about regulation, we're not talking about something that's going to happen overnight. In the meantime we're going to have a big election in the United States; I'm thinking about the 2024 presidential election, and I'm wondering what we'll see before that. Now, last week you retweeted a post with a video that is 100% generated by AI. It's a video by the Republican National Committee against President Biden. It's a 30-second clip, and you see it right
here, that shows China attacking Taiwan, the US banking system collapsing, and it shows the US border with Mexico being overrun by migrants. Now, the video looks real, and you say that this is a big problem. Why?

Well, researchers at our organisation and many others have for years been warning and raising the alarm about the spectre of deepfakes, and the possibility that political actors, or even foreign actors looking to interfere in democratic elections, could use these completely fake images, video, or even text, as we're now seeing, to manipulate voters into certain preferences, into certain candidates, and into certain worldviews. This is about the general information environment, whether that's Chinese propaganda or Russian disinformation, but it also comes to a flashpoint when we're talking about elections. I think 2024 may be the first election where deepfakes and AI-generated images and video play a much more significant role than they have in the past. It's always been this kind of alarmist worry that something is going to flip the mind of a voter, maybe on the eve of the election, and we don't have time to prove that it's actually faked. It just becomes so much easier to create these videos, and so they really need to be labelled as such so that we can tell what's real and what's not.

And you know, U.S.
intelligence tells us that Russia used social media to meddle in the 2016 and the 2020 elections. It's 2023 and there is still no regulation of social media. Can Washington deal with artificial intelligence?

Well, that is really the million-dollar question. I think there's no reason to suspect that our foreign adversaries are going to sit this election out, as they haven't for the previous elections, and whether they'll be able to manipulate AI, or maybe they don't even need to, I think is the question. Now, there have been some efforts on behalf of the social media platforms to prohibit the use of manipulated content and manipulated videos right on the eve of an election, maybe in the two weeks leading up to it. I think we're very likely to see similar policies put in place in 2024 if we don't get broader regulation, and I think those policies really should include the requirement to label manipulated, deepfaked and AI-generated images and video, because now anyone can make these images with Midjourney, with DALL-E, and it's not something that only computer science labs are able to generate.

Yes, good point. Maybe they should insist on putting watermarks on these videos. I want to ask you about what we're seeing here in the European Union. We know it's led the way with legislation to protect data privacy on the Internet. Now the EU is drawing up legislation that would make AI companies disclose any copyrighted material that's used to train their chatbots, for example. In your opinion, would this be a way to control AI?

I don't know if it's a way to control AI so much as a way to preserve intellectual property, because there's a real concern: if an AI system is developed using proprietary information, whether that's on the corporate side or on the artistic side, who really owns the results of that? If an AI generates new poetry and new art, but it's actually working by spoofing and copying and kind of predicting what a famous poet or a famous artist would be writing or
creating or drawing, then it really gets to the question of ownership: who has created this? So I think this effort by the EU is a really strong attempt to get at this question of copyright infringement and, as I said earlier, to apply some of the existing frameworks that we have around copyright and around data to this new AI era. So I don't know if I see it as a way to control AI. I think there will be broader regulation when we think about it, and the EU is doing this as well, on a kind of risk-based framework for AI harms. That's going to take a little bit more time, but it absolutely makes sense to apply the existing tools that we have to make sure that there isn't intellectual property, artistic and creative infringement in ownership as these systems develop and become more popular.

Lindsay Gorman, as always, we appreciate your time and your insights tonight. Thank you.

Thanks.

 

GDPR and the ICO

Machine learning, AI, algorithms and data to try and drive better product experiences. Maybe you're starting to see things that are concerning you and you just want to learn a little bit more, or maybe you're coming along because you want to understand: what does this mean for me personally, where should I try and focus, what should I try and do? There's a range of different reasons that might bring you into this room. What I'll try and do in this talk is deal with some of those things that might be going through your mind. I know there's no Q&A at the end of this, but as I walk out, if there's anything that I didn't cover, grab me, or if you want to email me you can email me afterwards as well; my email's up there.

But just before we dive in, just to understand who's in the room: how many people in the room would classify themselves as C-suite, so CEO, CFO, CTO, somebody who is a decision maker? OK, one, two. How many people are within the data science community, so data scientists? The vast majority. How many are in data engineering, so you're not doing the development of models, you're trying to set up the pipelines and the infrastructure, or you do both? OK, cool, that's really good. How many people sit in product or marketing? OK, great. How many people sit in compliance, so data protection officers? One or two. OK, cool, I feel your pain, guys. And how many people actually know what the Information Commissioner's Office is? That's who I work for. Hands up. OK, about half the room.

Right, so the Information Commissioner is an actual person. It's Elizabeth Denham; she's the Commissioner and she is independent of government, and it is her job to uphold information rights. So that's all of your rights, not in your corporate positions but as citizens. It's to uphold the rights of the tens of millions of people in the UK whose data rights matter, and to help them leverage their rights, to make sure that they can exercise them. That is certainly who she represents,
and the way that we do that is a couple of things. We have some sticks: there are fines that we can issue, there are compulsory audits that we can undertake, we could come and knock on your door and investigate, and we can even, if we think there is a big enough reason to do this, issue stop notices. So there's a range of different ways that we can exercise some powers to try and influence how people's rights are being dealt with and how their data is being managed. But there's another side to this, which is more the carrot, which is: how do we work with industry, with startups, with SMEs, with large tech corporations, to try and engineer a better information rights environment for everyone, where business can succeed, where you can make a profit, where you can have a great time doing your jobs, but at the same time your customers, those citizens, can be sure that their rights are protected as defined within our society? UK democracy, European democracy: we've all agreed these are the sorts of rules and laws that we need to adhere to. How do we make sure that actually happens? At the Information Commissioner's Office, that's our job.

The team I represent is fairly new. It's only been around for nine months, a new directorate within the ICO, and I've only been here for three months. Just three months ago I was at the BBC. I love the BBC, and I'll just mention this because my ex-colleague Ben did a really amazing talk earlier on, and I agreed with everything he said; he and the team there are brilliant. But I wanted to talk about why I left and why that is relevant to the discussion around AI and where this story is going to take us. I had an amazing job. I really loved it. I was heading emerging technology. It was fun: the job was really just to go and think about and explore new technologies, virtual reality, augmented reality. I majored on machine learning, data science and AI, trying to think about how you start to leverage some of those tools and
techniques and drive those into products and services for the organisation, how you build up organisational and human capacity and understanding of what's going on, but also, it being the BBC, and echoing what Ben said earlier, think about what this means for members of the public, the audience. So it was a fun, fun job.

Why leave? One of the reasons was that I had this little worm in my head which was burrowing away, this unanswered question: how are we going to take all of these conversations that have been happening over the last three years around ethics and responsibility, around the fact that we're all against bias, we don't want bias, we don't want to see people discriminated against, we want to make sure that we're being responsible, we want to make sure that the development of machine learning isn't held back by some of these issues? I wasn't sure how you could take those conversations, which were happening at quite an ephemeral, philosophical level around principles and values and this thing called ethics, and actually productionise them: bring them down into the practical reality of developing models, using those models in a product context and trying to ship those. And at the same time, how do you make sure that what you are building, even if you've done a great job, is good for society overall? We want to live in a healthy society.

I just really hope we crack that problem, because the potential in machine learning and AI is massive. I could have picked a few different indicators of this. I could put up a stat from McKinsey or Deloitte or PwC that said global GDP is going to be massively improved by the development of machine learning and AI over the next decade, or two decades, or three decades. Pick your report, pick your stat: it's all saying there's going to be impact and it can be massive. Or look at these sorts of developments, where some of these techniques, probabilistic compute techniques, have been used to really improve the degree to which we can make a
difference to people's lives, real differences, whether that is cancer detection or whether it is actually just helping you navigate around the city better, if you're using Citymapper here in London or some other app. All of these different techniques offer a huge amount of utility, public service good and benefit for society. It's great.

And yet, unless you've been under a stone, and as a group in this room you haven't, given the self-selection of being here, you'll know that there is also a feeling that what is going on right now is causing harm. It's a cartoon, after all, right? It's not that bad. But you can recognise that this is a conversation that's been happening out in the wider world, through the media and through the sorts of discussions that have been happening with policymakers and others. It is happening, and there are instances where real harm is happening. There are instances where there is evidence that the use of machine learning, probabilistic compute and personal data is driving harm. How do you respond to that?

So the Information Commissioner's Office is a regulator. It's our job to regulate information rights, so we have to try and understand what to do about this. Equally so for all of you: if you want healthy businesses, if you want to succeed, whether you're a data scientist or a data engineer, whether you're in compliance or in leadership, you're working for an organisation and you want it to do well. If you're not careful, at best you end up with headlines like this: 'The sneaky ways that companies manipulate you to buy more online'. I'm sure most organisations who are using some data science to drive their products would say: that's not the intention, that's not what we do, this is a misrepresentation of what happens. Actually we're building recommenders, or building pricing models, or we're trying to engineer a better product experience. So if I want to buy some Nike trainers, I want to be able to get to the types of Nike trainers I want really quickly. I want to get rid of
all of the other noise. Having some information about me, building a classifier around the sorts of things I'm interested in, would be really good. Hey, that's what we're trying to do; we're not trying to manipulate your online experiences. But it's not just retail, is it? It's also banking, it's also the media. There are lots of different sectors that would equally say this isn't our intention. So at best, right now, if you're lucky, this is the level of headline. If it's more egregious, if there has been an actual harm, if you've had an issue, at worst you can end up with headlines like this. So this is Facebook and Cambridge Analytica. No need to go into the details, you all know what happened there, but this is the sort of headline that really erodes trust in your brand and in what you are doing, and also in your role as individuals who are data scientists and part of this community within those organisations. It limits your ability to do what you want to do and need to do. It limits your ability to engineer and innovate. So you need to avoid, we all need to avoid, something like this.

At the time, and like I said I've only been in the ICO three months, so this is a case that my colleagues investigated, the limit was half a million that the ICO could fine any organisation. With GDPR that's changed: it's 4% of your global revenue. Like I said, there are compulsory audits that can be done and stop notices that can be issued, so the range of powers that can be leveraged has increased. I mean, the world doesn't stop: equally, as the powers of the regulators have increased, the ability of organisations to move around and respond has also increased, so there's a constant conversation about what effective regulation is. But no matter what you think about that, whether you think organisations can price themselves out of these sorts of issues or not, just reflect on this: this has material consequences. It affects your recruitment ability, it affects your ability to retain talent,
it has loads of consequences amongst your customer base, in how they perceive your brand and the degree to which they'll trust you. The research we published last week evidences that.

And then, just to stop labouring the point, if you were to ask me what my current team spends all of its time on, here is the list of things that are priorities for us right now. So cyber security; thinking about how you design online services that work for children, how you build in the right protections, especially when so much is data-driven; how you make sure that the role of facial recognition technology in our society is balanced against our social norms and what the law says. These are questions that we're tackling right now, that we're investigating and looking into.

Another thing: at this point I wish I had that Intel chime, you know, like 'Intel Inside', if you remember it from the 90s or whatever. It's all 'AI inside'. AI, as a really loose definition, is a collection of compute technologies that might be, at one end of the spectrum, just simple decision trees and, at the other end, deep learning models. Most of these issues that we are investigating and looking into rely on personal data and they rely on some level of probabilistic compute. So AI in a general sense is powering so much of what we're seeing around us, and that makes it really important for us to understand how we respond.

So we've just issued a code, and you can go onto the ICO's website for it, really saying that for any service that is likely to be accessed by a child, here are 16 principles that should dictate how that experience should be delivered to that child. It's really starting to ask that question, and answer it, around how you make sure that the Internet is a safe place for children to navigate. It's not about excluding children from the Internet; it's how do you make sure that, if it's very likely children are accessing services, you've
thought about this? So you can go on the website and look up the Age Appropriate Design Code, just Google it or Bing it or whatever your search engine is, and you should be able to find more details on that.

But we can't be in this room without me mentioning that word, ethics. It is singularly the most easy thing to define and the most difficult. You've had countless definitions of it today, not just in this room but across the conference; in lots of different rooms people have referred to it and given you some working definition. Again, you can Google this and just ask the interweb what the definition of ethics is. You could actually go to Google, Facebook, Microsoft and others, and they've done lots of work on this; you've heard other organisations talk about it. So why don't I just pollute the data set and give you one more definition? This is my personal one. Ethics, for me, is the gap between what the technology enables us to do and what the law or society believes is the right thing to do. So how do we navigate those two points on the spectrum? Actually, to be a bit more precise, that's not an accurate definition, because what the law says and what society expects of us are not the same thing. So maybe a better definition is about the gaps between these.

First, what the technology enables us to do. You know, there's a point where this is all magic, and we all get really excited and the hype sort of increases, but there is some power to it.

Then what the law allows us to do. That's the simple question: is what we are building, what I'm building and working on, legal? Is it legal under the GDPR, is it legal under the Equality Act, is it legal under the Human Rights Act, is it legal under the different frameworks that you might be regulated under? If you're in fintech, what does the FCA say about what we're doing? There's lots of regulation out there; how do you go about checking that? I care about personal data, so is it legal under the legislation that the ICO is mandated to enforce? And then
finally, as a society, what do we think we ought to do? Is this moral? Does this fit with my personal values and principles? Does it fit with the values and principles of my organisation? As a community, are we all heading in the right direction? So, is it moral?

My team's job, and I apologise for the not-so-subtle transition, is to close the gap between those things, but really it's to close the gap between those discussions, because the world isn't static; it changes. What was morally acceptable in the United Kingdom in the 60s and 70s isn't acceptable now. I'm a second-generation immigrant; my parents came from Pakistan and Kashmir. When I was growing up in Sheffield they had lots of racist invective thrown at them, and I had it when I was younger. We do not, as a society, accept that as a social norm now. Social norms will change. For data scientists that's really important, because how do you cope with the fact that your customer base, your user base, the people that you are profiling or trying to deliver services to, their context is going to constantly shift? At what point do you account for concept drift? How will you do all of these things? These become really important.

But you have to start with this whole framework in mind, really thinking about: OK, what does the technology enable us to do? Let's separate out the hype and really focus on what is possible now, what we're aiming to do, where it is applicable, where we should use it. Then the test: how do you make sure that what you're doing right now to be successful as a business is the correct thing to do? I'll explain the AI audit framework at that point, which should help you with that. And then finally, it is my team's job to shape the next debate, but all of you are also shaping the next debate, just in your day-to-day jobs. So how do you do that in a way that is communicated, both for this group and for wider society? We kind of need to think through what the mechanisms are for us to do that together. Now
just to focus on that middle one, what does the law say? We shouldn't shy away from a central fact: there are tensions between what the law says should be done and the way that the development of machine learning and probabilistic compute is currently going. I'll just run through some of these.

The law says minimise the amount of data that you collect and make sure you're accurate, and yet so much of the innovation and the impact that we've seen happen over the last few years has really relied on gathering as much data as possible. That is in direct tension; we have to try to square that off and figure out what we do about it.

The law says be really clear and purposeful about the information that you've gathered, the personal data, and what you're going to use it for: purpose limitation. Be really clear about that. But we also know that often the data that you collect, and the example here is crash logs but it could be anything, when combined with other data can help you draw inferences and detect patterns. So how do you square off the fact that you might have got consent under a very particular use case, and you limited the purpose in asking whoever your data subject is, but equally, for you as a data science team, you want that slightly greater freedom to be able to navigate through this and draw out inferences and detect patterns that are there to be detected? How do you square that off? That's a tension. Just recognise that legally that is a tension; you might be on the wrong side of the law if you're not careful.

Transparency and fairness: lots of discussion about that. How do you actually make sure that the sort of information that you provide is impactful and meaningful to whoever it is that you're providing it to? I don't want to go too far into that because it's been discussed quite a lot today. But alongside that, for us as the regulator: did you recognise the context in which you were delivering that assessment? How did you manage the trade-offs that you almost inevitably will have to manage
in delivering that transparency or that accuracy, or that understanding, or that explainability? What were you willing to trade, and did you do it consciously? What the law says is that you need to make sure that you are transparent and that you're providing meaningful explanations. But we know that if you're on the deep learning end of the spectrum, that might be a challenge. If you just try to explain what the model does, even data scientists and other practitioners would struggle with that, so what is actually useful in that context?

And then automated decision making. Lots has been made of the clauses in the law, in GDPR, on automated decision making, and one of the best ways to tackle that is a human in the loop. Again, up until three or four months ago I was in media tech, and I really cared about how you build decision-support tooling for members of staff and colleagues; that's the language I used, decision support. At what point does that decision-support tool become meaningless, so that really the person is just a token person in that cycle, in that equation? If we determined that they were just a token person, that would work against the organisation that was saying, actually, we had a human in the loop. So what we are really interested in is: what does having a meaningful human in the loop actually mean? How do you deal with the decision fatigue of the sort of people who might be involved in having to do some of that governance and those checks and balances? How do you navigate this tension?

I'm going to stop making you feel uncomfortable about all of the issues and come to what a regulator could do; after one more slide we'll go to that stuff. There are two scenarios here. Scenario one, on the left, is my colleagues at the ICO come and investigate you. Now, this is a picture from the Cambridge Analytica investigation; they had the FBI-style windbreakers. I did actually ask when I joined, can I get one as part of my onboarding package,
and they were like, no, you can't, because I'm not in enforcement or investigations. But at some point, if my colleagues feel there is an issue and they come knocking on your door, that's bad; it's a collective failure if we all get to that point. The other side is: well, actually, how do we move the conversation on, and try to engineer and create a blueprint for what an effective framework for developing machine learning is, one that recognises and protects citizens' rights while on the other hand allows business to innovate?

So, to give you a snapshot of the framework, there are two parts to it. The top part, in green, if you're in an organisation of any size, should not feel unfamiliar to you. The exact terminology might be different, but: how do you plan to deliver training to the people who really need to know about this stuff? What is leadership's thinking around risk management, understanding the decisions they make? How do you as a data science community make sure that you're not left holding the baby because everyone else has just pushed the decision down to you? How do you make sure that it works upwards and downwards? How do you make sure that your auditing and documentation are up to scratch? We're not saying you need to develop brand new processes for this. Our work is exploring what the differences are within an AI context that you need to understand, and how you go about understanding, adapting and upgrading what you have.

The reason is, and I should have said this, that this framework is being designed to help my assurance and investigations colleagues, so that the next time they have to go and investigate an organisation that is using any form of personal data with probabilistic compute, AI or machine learning, they know what additional tools they need to augment what they already have. As I was saying on the previous slide, we don't want to keep that in house. We're open-sourcing this, so we're sharing this framework with you. The top level is step one, but there is a whole site that you can go to and you can read more about our work
and the provisional guidance we're asking for consultation on. We are doing this open source because we think that if we raise the bar as a community, it's better for everyone: upstream, you will avoid the problem in the first place, and my colleagues won't have a job to do. That's the ideal scenario for us, that we don't need to investigate.

But yeah, just back to this: the top level is the delta between what you already have and what is specific around machine learning and AI. The bottom half really starts exploring specific risk areas associated with AI, and here are the sorts of conversations and questions that you had this morning. How do you tackle those? If you've been relying on, say, checklists, how can you be sure that the checklist model that you're applying is going to get you through the legal compliance checks that you need? There were earlier lectures and talks about synthetic data. If you intend to use synthetic data, how can you be sure that the use of synthetic data is still not going to leave you in a legal gap, and how are you going to navigate that?

These issues are the bottom half of the framework, which is reiterated and expanded here: the sorts of questions that we are researching. My colleague Dr Reuben Binns, whose day job is in the Oxford computer science department, is seconded into the ICO for two years. He is developing the framework, he's doing a lot of this research, and he is sharing it and blogging it. So if you want to know what our current thinking is around automated decision making, we've got a blog out there which previews some of our thinking, and you can come and get engaged. If you had a question around how do I balance the trade-off between accuracy and privacy, or accuracy and explainability, some of my colleagues did a project where we used citizen juries to ask people what they cared about, and I'll preview a little result for you. In the health context, when they were asked, would you trade away explainability of how this decision was made for accuracy,
the answer was yes, because they felt that having accuracy in a health scenario, for cancer detection for example, was really important, and they trusted the institutions that do that: they trusted the doctors, they trusted the hospital, they trusted the NHS. When the same question was presented in a policing or judicial scenario, they were not willing to trade away explainability. They felt it was much more important to have an understanding of how that decision was made, because it could affect you personally but it could also affect society more broadly, and they felt it was more important as citizens that they had that understanding. The reason that matters to you is that figuring out how you navigate all of this is going to be very context-specific, and you need to know what we think when we look at this question. We will be using this sort of approach to try and determine: did you get the balance right? We're trying to take away some of the myth and the mystique around this. We can't make it hard science, but can we get rid of as much of the grey for you, so that you understand you're doing the right things?

We've also been doing lots of work around the different rights and how they might get impacted. The next blog that's going to be issued is going to look at the other trade-offs and really expand on that. And really this is leading to my closing slide, which is: where next for this work? Because what I've shown you is just a very top level. There are the blogs, half a dozen that we've already published, with our initial thinking, and we're going to work through all of these questions over this coming period. Actually, this is your opportunity. This is not being broadcast out to you; we're not just doing this in isolation and pushing it out. This is a consultation.
This is where, if you have opinions, you can either just put them in the comment section on the website or, if you don't want to do that publicly, you can email us and we can have a bilateral conversation about your views. We're really interested to see what you think about the majority of the framework. One example of what's missing from it right now: if I'm not a Google or a Facebook or an Apple or Microsoft or IBM, with the full stack, and I'm likely to be using third-party data or third-party models, what does that mean for me? What does it mean if machine learning is part of my supply chain? What does it mean for me to be using third-party dependencies to build my product? So we're going to go and research and explore that, and draft guidance and clarification on that. So this is your opportunity to feed in those sorts of really top issues that you've got, and we will try and respond. That's going to carry on through till about October. We'll then go into a slightly more formal consultation period towards the end of this year, and then early next year we'll publish the actual guidance.

I'll stop there because I think I'm well out of time, but if you want to go and find out more you can go to this website, and if you want to email us you can use that email address. Thank you.
