AI Notes
Vendor - AI as a service
Relies on vast amounts of data to train the algorithms that
power the models that produce the predictions or decisions.
GDPR does not specifically regulate AI, as GDPR is principle-based and
technology-agnostic, but it does regulate the processing of personal data.
... agnostic, it does regulate the processing of personal data, and so it
becomes really important to think about the principles enshrined in the GDPR
whenever we're using personal data to power the processing activities that
support AI technology and its development. The focus of today's webinar is
really going to be thinking about how we determine what role you may be playing
when you're using AI or providing AI as a service and, more specifically,
whether you're acting as a controller or processor.

Now, there are various different phases and stages of processing that are
relevant to the development and use of AI technology, and so often people will
be wearing various different hats, and it can become quite tricky to figure out
exactly what role you're playing. But making that determination is really
crucial, because we know that the compliance responsibilities that you have
under the GDPR will very much be determined by whether or not you're acting as
a controller or processor. So making that upfront determination is really
crucial to then being able to understand and map what other compliance
obligations you might have to comply with.

Some of those other compliance obligations are something we're going to explore
in the future webinar series. We have a number of interesting topics that we're
going to look at, going beyond just the essentials, which we're going to talk
about today, and then we're going to look at ethics and explainability and how
you can develop a compliance strategy. But today, to kick-start things off, we
really need to think about whether or not you're acting as a controller or
processor. And sorry, my Amazon delivery has arrived in the background, so I'm
now going to hand things over to Richard, who's going to kick-start by
demystifying what AI is and talk through some key principles before we delve
into the controller/processor analysis.

Great, thanks Flick. So before we dive
into our GDPR discussions today, as Flick says, we'll just set out a little bit
of context. So the first question is obviously: what is artificial
intelligence? That's a very broad term that is used to describe a range of
different technologies, including things like machine learning, neural
networks, deep learning and the like, and there are a lot of definitions for
it. But what we're interested in is what AI means in the context of data
protection, and this is a fairly dry definition that was provided by the
International Working Group on Data Protection in Telecommunications, which is
that AI is "the theory and development of computer systems able to perform
tasks normally requiring human intelligence".

OK, so that's pretty broad and it captures a really wide range of use cases,
and on the left-hand side of the slide we've listed some of the most prominent
examples of where AI is being deployed today. Looking at that list, we have
content filtering: you might be using AI to filter spam or perhaps to review
and moderate content on a platform. Image detection and classification: that's
a huge area at the moment, so using AI to recognise objects and people in
images and videos. We've got a number of different purposes here: image
labelling, facial recognition, tracking object movements, or even sophisticated
things like inferring certain characteristics, like a person's emotional state.
A good example of this use case is using AI in autonomous vehicles.

The third one on our list is natural language processing, another big area:
using AI to recognise speech and maybe to transcribe or translate that speech,
or using AI as part of a virtual voice assistant to understand speech commands.
Similarly, another huge field is processing text to optimise writing, for
example to improve somebody's grammar.

Then there are two other areas we can think about. Recommendation algorithms:
these have been around for some time. That's using AI to analyse patterns of
behaviour and previous interactions or transactions to recommend products or
features for a user, and for advertising and marketing purposes. The last one
we've described here is classifying risks: using AI to generate a credit risk
report or perhaps to predict fraud at the user or individual transaction level.

Depending on the use case, there are going to be different considerations in
terms of data protection and your obligations under the GDPR. For example,
these last two examples, recommendation algorithms and classifying risk, really
involve more direct examples of profiling individuals and making predictions
and decisions about those individuals, which obviously have a clear, direct
impact on people.

The other important thing to consider, which
Flick previously mentioned, is the data itself. You might be obtaining and
using data from a variety of different sources as part of your AI deployment,
and this will also have an impact on your obligations under data protection
law. For example, you might be obtaining data directly yourself, for example by
capturing photographs or videos from the real world and using that directly
sourced data to train models. Similarly, you might be obtaining data from
publicly available sources and using that to train your models, or maybe you're
buying in or licensing data from third-party sources. But the other very
important source of data is your own platform or application: your private data
sets, which are really valuable to better understand how users are engaging
with your platform and what decisions and actions they are taking. This data
set might be part of a walled garden if you're keeping and retaining that data
(the Googles and the Facebooks of this world), or perhaps you're sharing that
data with an AI vendor who might also use the data to improve and train their
own global models, which is something we'll come on to later. Lastly on the
slide we also have usage data and metadata: similarly, data about how people
are interacting with the platform, the telemetry data that's being sent back.
That can be equally useful in terms of marketing and sales as well.

Suffice to say, there is an increasing number of different examples where AI is
being used in our daily life. It's increasingly important, and regulators are
very much aware of this. A number of regulators in Europe have identified AI as
an important focus area, and some have also issued specific guidance on the
responsible use of AI in the context of data protection. For example, the UK
ICO has issued some very detailed guidance and developed a framework for
auditing AI compliance, and it has also identified AI as one of its top three
strategic priorities. At the EU level, we still don't have any specific AI
guidance from the EDPB, but there are some AI-related examples in its other
guidelines, such as the guidelines on data protection by design and default.
The EDPB has also published a long list of guidance that it intends to put out
as part of its work programme for 2021 and 2022, and AI features on that very
long list as well, so we should be expecting guidance at the EU level.

Another important thing to mention here is that we have a focus from the data
protection regulators, but the European Commission has also developed a
strategy for AI and is intending to introduce a new legislative proposal for AI
this year. That won't be specific to data protection, but it's part of a
broader legal and ethical framework for AI in Europe, applying to both
developers and users of AI. So that is something to watch out for as well.

With that said, let's now
turn to our discussion on the GDPR. As Flick said, today we're going to be
focusing on that key question: what is your data processing role when you are
using AI? This is really fundamental because it obviously informs all of your
obligations under the GDPR, and it flows into everything else that we're going
to be discussing during our webinar series. So when you're deploying or using
and benefitting from AI, are you acting as a data controller or a data
processor? These are the GDPR terms, but there are equivalent concepts under
other data protection laws too: Brazil's LGPD also talks about controllers and
processors, and the California CCPA talks about businesses and service
providers. We're going to assume you're familiar with these concepts, but the
key question is always going to be: are you determining the purposes and means
of processing when you're using AI, or are you simply processing on behalf of
another? As Flick mentioned, you've got to think about the specific context.
You could be wearing different hats depending on the context, and you could be
a controller or a processor of the same data.

Here's a brief reminder of why controllership matters and why this question is
so important. This is a list of the responsibilities for controllers and the
responsibilities for processors. Suffice it to say, the list on the left, the
controller responsibilities, is longer and far more comprehensive, so you are
directly responsible for a great number of obligations under the GDPR. For AI,
the key ones to pick out are that you are going to be responsible for honouring
data subject rights, which includes providing transparency to data subjects and
explaining AI to them, but also complying with key principles like privacy by
design and default, and completing a DPIA if your use of AI may pose a high
risk for the individuals concerned. That's clearly going to inform your
compliance strategy and your liability exposure as well. Apart from these very
plain key responsibilities under the law, there are also broader commercial
questions and strategic considerations to think about when positioning yourself
as a controller or processor, and I'm going to turn over to Flick now, who's
just going to run with that thought.

Cool, thanks Richard. Yeah, so
as Richard has hinted at, clearly there is the issue that if you assume a
controller role, then you're also assuming the significantly longer list of
compliance responsibilities that you have to comply with. Whereas if you are a
processor of the particular data set, you are really assuming a much more
limited role from a compliance perspective, because you are only ever allowed
to process the personal data on the instructions of the controller. That has a
number of trickle-down impacts. It means, for example, that you can't retain
data, or you're not supposed to retain data, beyond the life of the agreement:
you're supposed to be deleting or returning it at the request of the
controller.

I flag that deletion point because often that's the trigger for making people
question: hang on a second, when we're using data to develop our products or to
train our models, are we actually acting as a processor? Because actually we
want to be able to retain the data that we've collected in the context of, for
example, providing our technology to a customer. We want to retain it and use
it and replay it to keep training our models, because it's really, really
useful data. At that point it usually triggers a big question: hang on a
second, if we want to retain it, we may be acting as a controller when we want
to use it for other purposes beyond just providing the particular feature or
service to our customer. So that's another reason why this becomes a really
important strategic upfront consideration. If you're a vendor who wants to use
data to train your models and improve your AI technology, have you actually
incorrectly positioned yourself as a pure processor, and therefore forced
yourself into a more restricted contractual role that would in fact prevent you
from using the data for broader business purposes, including to train and
improve your models? That's why this becomes a really important thing to think
through up front.

Conversely, if you've decided up front that you are a controller of the data,
then that can create some commercial sensitivity if you're going to market as a
controller. Often, if you're dealing with customers, they're pretty familiar
with dealing with a processor. They may have a standard data processing
agreement that they're ready to roll out and use with you as a vendor, or they
may be willing to accept your DPA, and they understand the process for dealing
with a processor. But if they are now having to deal with you as a controller,
that can often spark some commercial sensitivities, largely because there's
some misunderstanding about how to paper that, and also because it means there
are going to be more compliance considerations that the customer of the vendor
is going to have to think through: they're now going to have to establish legal
grounds to be able to share the data with you for you to use it for controller
purposes.

So your positioning needs to be thought about carefully and reflected in your
contractual documentation, because if, for example, you say you're a processor,
you have to stick in that lane, and you could be contractually boxed in so that
you wouldn't be able to use the data for broader purposes. Conversely, if
you're willing to go to market as a controller, you need to have a very good
privacy story to explain why you're a controller and how you're going to
protect the data, and to get your customers comfortable with that. So moving on
to the next slide please, Richard.
So how do you identify, then, whether you're a controller or a processor?
Before we dig into this, I think it's worth taking a quick step back and
considering the different phases involved in the development of AI. There's
usually an initial data preparation stage, where the data is being gathered and
prepared. That data would then be used to train the AI algorithms, and that
training process is used to generate a model. When we talk about data models,
we're really talking about mathematical algorithms that are trained using data,
usually with a human expert's input, to enable that model to identify patterns
and connections between the different data points. Importantly, that data model
is then applied to a particular use case, and at this point we're in the
deployment phase. So it may then be used for the purposes that the AI
technology was designed for, usually in order to provide predictions or
classifications to assist with human decision making, or to make an automated
decision itself. So the development of those models is really crucial.

So what sort of decisions can you make in those different phases as a
controller? Now bear in mind, and we talked about this a little bit earlier, a
controller is the entity that decides how and why data is processed. A
processor cannot do that: it can only act on the instructions of the
controller. The types of decisions that would typically be made by a controller
include things like the target output for the models, the feature selection,
and the source and nature of the training data. They would also typically
decide on the kinds of machine learning algorithms that would be used to create
the models, and key model parameters such as how complex a decision tree can be
or how many models will be included. They would also make key decisions over
the evaluation metrics and loss functions, so how do you trade off between
false positives and false negatives? Equally, a controller would usually be
determining the process for testing and updating the models: how often that
testing happens, what kinds of data need to be used for that, and how ongoing
performance will be assessed. So they are the key decision makers here with
respect to the personal data that would be used, and we'll go through the
different phases that I just talked about in a second and apply that in
context.

Conversely, as I mentioned, the
processor has a much more limited ability to make key decisions over how and
why that data is processed, but they do have the ability to make certain
decisions. Typically a processor would be able to decide what types of security
measures would be used to protect the data. They could also make decisions
about what types of IT systems and methods would be used to process the data,
so the technical means of processing. In the context of AI, that might include
the specific implementation of the generic algorithms that are used, so what
kind of programming language and code libraries might be used. They may also be
able to make decisions over how the data and models are stored, or how they
retrieve, transfer, delete or dispose of that data. But importantly, and I
always raise this, a processor should never be making decisions over how long
they retain the data for, because, remember again, they should only be
retaining it under the instructions of the controller. It's really the
controller who should be deciding how long that data is retained for, and the
processor has to delete it when asked by the controller. But a processor could
take decisions over the measures used to optimise the learning algorithms, the
type of computing resources that might be used, and also the architectural
details of how models will be deployed, so the choice of virtual machines,
microservices and APIs. Those are all decisions that can be made by the
processor.

Just so you're aware, these are all examples provided in the ICO AI auditing
framework. The UK regulator, as Richard briefly mentioned before, has produced
some really helpful guidance, and these are all examples of the types of
decisions referenced in that guidance, which I think is a helpful one to go
back to, to think: hang on a second, with this particular processing activity,
who's really making the decisions here about how and why data is processed?
Next slide please, Richard.

So taking those principles, let's think of some
real-life examples. As I mentioned, there would typically be a development or
training phase, which is where the data that you've collected and prepared is
being used to create a model by identifying different patterns and connections
between the different data points. At that point, typically the AI vendor would
be processing data as a controller, because it's clearly determining and making
key decisions over what data is or isn't required to be used to train that
model and how the processing will happen. So for the development or training of
AI models, it will most certainly be the case that the vendor, or the entity
that's using the data for that purpose, would be a controller. That's something
to really bear in mind when you're thinking: hang on a second, if I'm a vendor
that's collecting data from my customer, am I going to need, or do I want, to
be able to use that to train my models later down the line? If you think you
might want to leverage that data for your broader training and development
purposes, then that should trigger a question: hang on a second, am I acting as
a controller for that purpose, and do I need to make sure that I've carved out
the necessary rights and permissions in the customer contract to be able to do
that? So something to bear in mind there.

Another typical example would be where AI is being provided as an AI prediction
as a service, so the technology is being deployed as a service. To break down
that kind of example, this would be a scenario in which the AI
vendor develops its own models and then allows the customer to send queries to
them via, for example, an API, and then they would get responses back from the
model. For example, the customer might be using the model to understand what
objects are in a particular image. So in the context of, for example, a
self-driving car, if there's a load of images being collected by the camera on
the car, then this model may be used to review those images and produce
outputs, like classifying the objects in the image. At that point, typically
the service provider here, the vendor, would be acting as a processor, to the
extent that the data that is being run through the model for that purpose is
really being used to make predictions and classifications on behalf of the
customer. That would assume that the vendor is never doing anything else with
the data: it is just being replayed through the models to produce the outputs.
However, as I mentioned above, if that vendor then wants to use that data to
create and improve its models, at that point we would be looking at a
controller role. So again, you could be wearing two hats there, depending on
what you're wanting to do with the data.

Interestingly, the ICO guidance on this has also indicated that if, for
example, that vendor was actually pretty crucially involved in, or had an
influence over, how and why those predictions were being made, or the
development of the model that was being used to provide that service to the
customer, then there may actually be some element of joint controllership that
arises there. I think that would be relatively rare, but it's just something to
keep in mind: if you're starting, as a vendor, to have more influence over the
essential elements of how and why the processing is happening to make those
predictions and classifications with the customer, then there could be some
element of joint controllership there.

Then there's another model. Another typical scenario would be that a vendor is
leveraging customer data to create a customised model for the customer's own
use. So this would be where the customer provides a whole load of data and they
want the vendor to create a specific model to deal with a specific issue or a
particular use case. At that point, assuming again that the vendor is really
only ever using the data on the instructions of the customer, then we would
usually say that would be a processor role. But again, you've always got to be
careful that there's no other wider use of the data to do things like training
and development of the AI model, because then we would also be looking at a
controller hat.

The final model is one in which the AI vendor provides tools for, say, machine
learning that enable the customer to build and run their own models. The
customer would be choosing the data and really just using the tools and the
infrastructure provided by the vendor to develop their own AI technology, and
at that point I think we're looking at more of a processor role. Now these are
just some
examples; these aren't definitive conclusions. Again, it's always going to
depend on the particular context, but I think it's very likely that in most
scenarios people could be wearing multiple hats depending on what they're
trying to do with the data.

I think another common example is the development of things like global models.
When I say global models, I mean data models that a vendor may want to use
across its customer base. They may have almost a give-to-get model whereby they
say to the customer: look, we need your data to train this global model so that
customers get the benefit of the predictions and the insights, and that relies
on every customer providing us with their data to enable us to train and
develop that model, which you then get the benefit of and use to get certain
outputs. Typically in that scenario there would be more of a controller role
for the vendor, but it would very much depend on how that role was constructed
in the contracts. In some cases we've seen vendors trying to position
themselves as a processor in that role and making it work, but it's a tricky
one to balance, because you would effectively have to get instructions from
each and every customer to build that global model. That is not always an easy
fit with the processor role, and it also boxes you in in terms of how and why
you can use the data, because you constantly have to go back to the multiple
customers that you have, who would be the controllers. So it's a more natural
fit in that sort of global model scenario to say that you are a controller of
all the data that is being used to train the model and to produce the outputs.

So it's not always clear-cut, and there needs to be very careful consideration
of exactly how you're using the data, how long you want to retain it for, and
what the contracts that you have in place with your customers say, before you
assume that you can automatically use data for broader purposes.

That has taken us perfectly to the top of the hour. We promised that we would
try and keep this webinar series focused and around 30 minutes long, and I
think we have achieved that today, which I'm pleased about. Just a final
wrap-up to say the slides will be available on the Fieldfisher Silicon Valley
YouTube channel, so if you want to go and listen to this again, you'll be able
to find the slides and the full recording on our YouTube channel. Also, Richard
and I are
Video 2
... this webinar on data protection issues in artificial intelligence. Now, I
suppose the moral of today's story is to be careful what you say to Alexa. I
don't know if any of you are like me, but at home I am one of those people who
tends to say please and thank you to Alexa whenever I ask her to do anything
for me, much to the amusement of my wife and children. But I have that sort of
niggling doubt at the back of my mind about what will happen if the robots take
over the world: will they remember that I was polite to them and therefore
spare me?

Now, today's webinar is part of a series of webinars that the Fieldfisher team
is running in conjunction with our Silicon Valley office, and further details
about the other webinars that we're doing will be provided at the end of this
presentation. In today's webinar we're going to cover three core areas. We're
going to look at what artificial intelligence actually is and how it works;
we're going to look at some of the core data protection issues that arise in
the context of artificial intelligence; and then
we're going to look at some of the practical challenges that arise when dealing
with data protection in AI. Now, to assist me on today's webinar, I am
delighted to say that I'm supported by my colleagues: my partner Leonie Power,
and Robert Fett, who is a senior associate in our team, both of whom are AI and
data experts, and so we are very lucky to have them with us today. You may be
aware, if you've been following certain news in the artificial intelligence
space, that there were some breaking developments over the past week: there was
a leaked draft of a regulation to regulate AI in the EU, and the official
version of that was just published yesterday. Now I have to say, because this
is a very new development and encompasses issues that are wider than just pure
data protection, that's not going to be within the scope of our presentation
today. But for those of you that are interested, you may just like to know that
the new regulation is basically looking at three core types of AI: types of AI
that it considers ought to be prohibited; AI that is linked to high-risk
systems, which will be subject to a lot of requirements around transparency,
documentation and the quality of data used to train the AI systems; and then
all other types of AI systems, which again will have transparency requirements
attaching to them. Alongside that, you can see that some of the thinking around
this new regulation has clearly had an eye on the GDPR: there's going to be the
creation of a new European AI Board, very similar to the European Data
Protection Board that we have under the GDPR, and there will be significant
penalties, up to 4% of annual worldwide turnover, for breaches of the new AI
regulation, again very similar to what we see under the GDPR. But it's only
just been proposed; it has the whole legislative process to work through, so it
may be another few years before we see this law actually coming into effect. So
for today we're going to focus on data protection issues. So, to
kick it all off: if you're anything like me, you have probably been attending
calls, joining webinars, speaking with vendors or technology providers, and
they're all talking about AI, and you may at the back of your mind have this
notion of what AI is (it's something to do with making computers behave like
humans) without really understanding quite how it works. Then you'll hear
people throw in other terms like machine learning, and it all starts to get
quite confusing. So what we want to do in this first part is to look at what AI
actually is and how it works.

Now just to put in some context: you may not already realise it, but the odds
are that you are using AI on a daily basis already. Up on the screen here are
just three examples of common uses of AI. On the left here we have Google
search. You can see here I've typed in "what is our physical in letting agents"
and AI has, sorry, Google has used its artificial intelligence to work out that
probably what I was trying to type is "what is artificial intelligence", and
below that it has also suggested a number of other links that I might be
interested in seeing to learn more about artificial intelligence. So again, how
does it know how to do that? Well, of course, it's seeing all the searches that
people have entered into Google beforehand, it's learning from that and from
the types of things that people are interested in, and it's using that to
propose better search results.

In the middle of the screen you can see a screenshot from my iPhone. For those
of you who have iPhones (and Google does something similar), if you go into the
albums on your phone, it will group lots of photos by face, and you can see
here the groupings of photos of me, back from the days when I didn't have any
facial hair up to the present day. It has somehow taken my face in all of these
images and compiled them together on my phone, so I can go back and see all the
photos of me, and similarly it will do this for my wife, my kids and so on.
Again, how does it do that? It's using AI, it's using face detection, some very
clever stuff going on in the background.

Then on the right-hand side here you can see a picture of a Tesla car.
Increasingly you are finding that AI is being built into vehicles to assist,
from very basic uses, making sure for example that cars stay in lane when
they're driving on the road, through to some of the more sophisticated things
that you see the likes of Tesla trying to do, where they are building
autonomous vehicles that will ultimately drive themselves, with the goal
ultimately of making the roads safer, easing congestion and so on.

But of course those are the very positive uses of AI. On the flip side of it,
science fiction has taught us to be also quite wary of AI as well this is where
I go back to my point about being polite to Alexa that's just my tip for today
On the left here you can see an example: HAL 9000, which was that slightly
ominous AI system in 2001: A Space Odyssey. Some of you may remember him
refusing to open the bay doors to let one of the astronauts back into the
spaceship, a slightly murderous use of AI. And over on the right-hand side we
have a poster for Terminator. Again, I think lots of people, when they think
about the worst possible scenarios, imagine Terminator machines roaming the
planet and taking over humanity. So these kinds of fears have fazed people for a
long time, and it's why we have data protection rules, and it's why we have the
AI, sorry, we have the EU looking at creating an AI regulation to ensure that
the AI systems that we develop serve humanity, that they are developed in
ethical ways that are respectful of information, and basically that they operate
in the ways that we want them to and not in the ways that we don't. That's all
well and good. We've all gotten a sense of AI in everyday life, but it doesn't
necessarily mean that we actually really know what AI is. So what is AI? Well,
we've put a definition on the screen here. This comes from the International
Working Group on Data Protection in Telecommunications, and what they say is
that AI is the theory and development of computer systems able to perform tasks
normally requiring human intelligence, and then they give examples: visual
perception, speech recognition, decision making and translation between
languages. In other words, it's trying to get computers to learn to mimic human
intelligence. If you go back 20 or 30 years, you may have seen some of the early
examples of this. If you were a bit of a computer nerd like me and ever used to
play chess on your computer, you may have found that the computer was very hard
to beat. Now, in the very early days of those kinds of programs, where they
created chess algorithms, literally what used to happen was that programmers
would sit down with chess experts and try to create rules for the computer about
how to play chess. So they would teach it that if you're at this particular
point in the game, this might be a good next move to make.
Now, what that meant was you had to manually code tens, hundreds, thousands of
rules to teach a computer to play chess effectively, and the computer wasn't
really learning; we were just giving it a set of rules to follow. Was that
artificial intelligence? Well, it was one example of artificial intelligence:
outwardly it looked like the computer was doing something intelligent, but the
reality was that it probably wasn't; it was just following the rules that had
been hard-coded into it. Nowadays, when people talk about artificial
intelligence they tend to use the term synonymously with the concept of machine
learning, and that's where these things can get a little bit confusing, because
you may have heard of machine learning and things like deep learning and neural
networks and wondered what on earth these things are. Well, the way to think of
it is like a series of those Russian matryoshka dolls, where each is a subset of
the other. AI refers to the overall objective of trying to get computer systems
to behave in ways that mimic human intelligence. Machine learning, deep learning
and neural networks are all technologies used to achieve that overall objective.
Machine learning is basically the process of getting machines to learn from data
sets to produce particular outputs, particular predictions, to basically teach
themselves how to improve.
Deep learning is a kind of subset of machine learning; it's a more advanced form
of machine learning, and it works on what are called neural networks. Neural
networks are essentially systems that are designed to mimic the human brain.
Essentially what you do is you create software or hardware neurons that
interconnect together, very much like the neurons of a brain; they receive
inputs and fire outputs to other neurons in the network, which feed them on in
turn, and so on. The overall result is that this huge layering of neurons
together creates something akin to a human brain, which can be used to do very
sophisticated machine learning, and that very sophisticated machine learning is
typically what we refer to as deep learning. So deep learning occurs on neural
networks; ordinary machine learning is not quite as sophisticated, and I'll
explain an example of how machine learning works in a moment. Now, you may be
wondering why AI is significant. Well, we've already
seen examples of AI in use, but you may be thinking: well, if AI is just about
mimicking human intelligence, then why do we need machines to do that? Why don't
we just use humans to do it? I think the short answer is that AI can do it
quicker and better than humans can; it's capable of doing things that humans
can't achieve. There's a great example here from Google's DeepMind AI system,
AlphaGo Zero. What they did was use a machine learning technology called
reinforcement learning, and literally within a few hours, admittedly with all
the computing resources that Google has to throw at a system like this, it was
able to teach itself chess, within the space of a day, to a point where it
essentially could beat every other system on the planet. What was kind of
interesting about it was that the other chess champion systems that existed had
basically been using other forms of machine learning technology. AlphaGo Zero
had essentially taught itself: it was given some very basic rules, but otherwise
it taught itself how to play chess, and in a matter of a few hours became
undefeatable. If you think how long it takes a human chess master to learn
chess, it is a process of years; anybody who's watched The Queen's Gambit on
Netflix will see just how much effort and skill goes into it, and computers
achieve it at a pace that we just can't match. So let's
look at how machine learning works. Now, what I'm going to talk to here is not
an example of deep learning; this is ordinary machine learning. It's what we
refer to as supervised machine learning, which is in practice probably one of
the most common forms of machine learning that you see. The important thing to
understand is that computers are not inherently clever. If you were to give a
computer an image of a cat and a dog, it wouldn't be able to tell you the
difference between the two; you have to train it to do that. So what you do is
you provide your AI system with what we call training data. We get a load of
images of cats and a load of images of dogs and we label them: we say this is a
dog and this is a cat. And to help the AI system out, we define features of what
it is that distinguishes a cat from a dog. As humans, we can look at a cat or a
dog and just instinctively recognise that one is a cat and one is a dog, but
with computers you have to tell them what the differences are so they can learn.
So what we say here is that dogs are larger, they have dull claws and they're a
bit scruffy, and cats by contrast are smaller, they have sharp claws and they
tend to be quite tidy. Then you feed all those hundreds, thousands, tens of
thousands, millions of images into the AI algorithm, telling it this is an
example of a cat and this is an example of a dog, and what it starts to do is
match those images against the features that you've described. It starts to
attach a weight to each of these features, and it works out which of these
features are more important for distinguishing a dog from a cat. So then,
once it's gone through a process of learning, you present it with another image.
Here's an image of a fairly scruffy cat. You give it to the machine learning
algorithm, and it looks at it and says: well, this thing is fairly scruffy,
you've told me that dogs are scruffy, therefore it's a dog. Actually, this is
very clearly a cat, so an engineer may provide some feedback to the AI algorithm
saying no, actually this is not a dog, it's a cat. What the AI system will then
do is adjust the weight that attaches to scruffiness in identifying a dog and
say: well, OK, I assumed being scruffy was really important, but maybe it's not
that important; being larger and having dull claws is more important to
identifying a cat or a dog. You do that enough times and give it enough
features, and it learns how to attach the weights to each of those features, and
eventually it becomes pretty good at recognising them. Next time you give it
something like the picture here, a leopard, and despite the fact that a leopard
is a large cat, it still recognises that, despite being large, it is a cat. So
ultimately, over time, it starts to learn. So what can you take from that? Well,
you can learn from that
that computers are not inherently clever. They don't just know these things;
they have to be trained. That's why we have training sets, that's why we have
engineers, that's why we teach them about the features. Some of the more
advanced AI models that you see, the deep learning neural nets, will start to
teach themselves: they will identify features themselves without humans having
to train them on that. You can just give them a whole bunch of data, and those
deep learning systems will start to separate it out and work out what the
features are. But basically computers are not inherently clever, and you have to
be very careful, because if you give them biased data you will get biased
models. There have been examples of technology companies that have been trying
to use AI systems to identify candidates who are likely to be better engineers,
and they do that by giving the AI systems details about the existing engineers
they have. The problem with that is that if you are a technology company where
the majority of your engineers tend to be male, what you may just be teaching
your AI system is that men are better at being engineers, which of course is
very biased and incorrect. So what may happen is you end up with a situation
where the AI system starts to reject female candidates. That's not what you
want, so you have to be very careful with the data that you use to train AI
systems. And when AI goes wrong it can go very, very wrong. I've got an image
here on the right-hand side from the 1980s film WarGames, where you may remember
a teenage hacker almost accidentally starts off a nuclear war based on some AI
responses from military systems. But a real example of that: you may hear
stories about some of the self-driving cars, for example the tragic case in the
US with Tesla, where one of their vehicles did not recognise that it was passing
a lorry. I think the story was that it mistook the side of the lorry for open
sky and, in trying to overtake another car, moved into the lorry. So AI systems
aren't perfect. They do make mistakes. They will get better with time, that's
what they do, but when it goes wrong it can have some very serious consequences.
This is why we need to create regulations to make sure that they are developed
in safe, ethical and privacy-respectful ways. And it does of course raise the
overarching question of what happens when a human no longer understands the
algorithm. Ultimately, what all AI systems do is create mathematical models that
predict outcomes. Now, they're very sophisticated algorithms: a human knows how
to create the AI system, but ultimately doesn't really understand how the AI
does what it does once it's created that model, and that raises some interesting
questions. But those are probably ethical questions beyond the scope of today's
presentation.
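To make the cat-and-dog walkthrough from earlier in the talk a little more concrete, here is a minimal sketch of supervised learning with feature weights. It is an illustration only, not how any production system works: the three yes/no features (larger, dull claws, scruffy) come from the talk, but the perceptron-style update rule, the tiny data set and every function name are assumptions chosen to keep the example short.

```python
# Toy supervised learner. Each animal is three yes/no features from the talk,
# in the order (larger, dull_claws, scruffy). The perceptron update rule and
# the tiny data set are illustrative assumptions, not a real system.

def predict(weights, bias, features):
    """Weighted sum of the features; a positive score means 'dog'."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return "dog" if score > 0 else "cat"

def train(examples, weights, bias, epochs=10):
    """Nudge the weights whenever the current guess disagrees with the label."""
    for _ in range(epochs):
        for features, label in examples:
            target = 1 if label == "dog" else 0
            guess = 1 if predict(weights, bias, features) == "dog" else 0
            error = target - guess  # +1, 0 or -1
            weights = [w + error * f for w, f in zip(weights, features)]
            bias += error
    return weights, bias

typical_dog = (1, 1, 1)   # larger, dull claws, scruffy
typical_cat = (0, 0, 0)   # smaller, sharp claws, tidy
scruffy_cat = (0, 0, 1)
leopard = (1, 0, 0)       # large, but sharp claws and tidy

# Stage 1: learn from the labelled "typical" examples only.
labelled = [(typical_dog, "dog"), (typical_cat, "cat")]
weights, bias = train(labelled, [0, 0, 0], 0)
first_guess = predict(weights, bias, scruffy_cat)

# Stage 2: the engineer's correction becomes one more labelled example.
labelled.append((scruffy_cat, "cat"))
weights, bias = train(labelled, weights, bias)
second_guess = predict(weights, bias, scruffy_cat)
leopard_guess = predict(weights, bias, leopard)

print(first_guess, second_guess, leopard_guess, weights)
```

In this toy run the scruffy cat is first labelled a dog; after the correction the weight on scruffiness falls to zero while size and claws keep their weight, and both the scruffy cat and the leopard come out as cats. With only three binary features the model is far cruder than anything trained on real images, which is the point of the sketch rather than a limitation to fix.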
So with that I am now going to hand you over... oh no, sorry, I've got two more
points, just very briefly. How do facial recognition technologies use AI? A
quick example here, very much like what we were talking about before: if you
load an image of a face into an AI system, it will start to extract features
about that face, so, as we were talking about earlier, what are the
distinguishing features of that face? It will use them so that in future, when
you present it with a photograph of the same individual, it can compare the
features of that individual's face against the features it already knows and
work out who that individual is. So if you go back to that iOS album of me
earlier, that was how my phone did that. And how do virtual voice assistants use
AI? Well, in very simple terms, a virtual voice assistant is actually little
more than a glorified search engine. What happens is that when you speak to your
Siri or Alexa, it uses natural language processing, a form of AI, which converts
the spoken voice into a transcript. That transcript is then parsed by computer
systems to work out the semantics of what was said, and it works out what the
instruction was that you just gave it, you know, "what's on at my local cinema"
or "what's the traffic like on the way to work", and then it will conduct a sort
of search-engine search to find the results for that, before ultimately
returning those results to you. And that is essentially how those work. So,
having probably taken up more time than I should telling you how AI works, I'm
going to pass over now to Leonie, who's going to talk to you about some of the
core privacy risks.
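The face-grouping step described a little earlier (extract features from a face, then compare them with the features the system already knows) can be sketched roughly as below. This is a hedged illustration, not how any phone vendor implements it: the three-number feature vectors are invented stand-ins for what a real feature extractor would produce, and the cosine-similarity comparison, the 0.9 threshold and the names `identify` and `known_faces` are all assumptions.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two feature vectors: 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(face, known_faces, threshold=0.9):
    """Return the best-matching known name, or None if nothing is close enough."""
    best_name, best_score = None, threshold
    for name, reference in known_faces.items():
        score = cosine_similarity(face, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical features already extracted from photos the system has seen.
known_faces = {
    "speaker": [0.9, 0.1, 0.3],
    "spouse": [0.2, 0.8, 0.5],
}

new_photo = [0.85, 0.15, 0.35]   # close to the stored "speaker" features
stranger = [0.1, 0.1, 0.9]       # not close to anyone stored

match = identify(new_photo, known_faces)
no_match = identify(stranger, known_faces)
print(match, no_match)
```

The threshold is the design choice worth noticing: set it too low and strangers get filed under a known name, set it too high and the same person with and without facial hair ends up in two albums.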
Privacy risks
The first question is: why are data protection laws relevant to AI in the first
place? At its very basic, AI involves data processing, just of very large
amounts of data, some of which will be personal data. So it's essentially like
any other form of data processing; it's really just about computers crunching
some algorithms and some data. But what makes AI unique is the sophistication of
those algorithms and the volumes of data that are typically involved, and
ultimately, as Phil says, humans just don't understand how algorithms trained by
machine learning work; we just know that they do. If you could move to the next
slide, please. Just one previous one, I think we've gone too far ahead there.
Yeah. So, looking then at the legal
and regulatory landscape for AI: obviously, to the extent that personal data is
processed, the GDPR will be relevant, as well as local data protection laws. The
ePrivacy Directive will also be relevant to the extent that the AI
implementation involves the accessing or storing of information on a device,
and that's whether that information is personal or not. Then we've got the
European Convention on Human Rights, because we've got to remember that data
protection aims to protect individuals' rights and freedoms with regard to the
processing of their personal data. Of course that includes the right to privacy,
but it also includes other rights beyond privacy, such as the right to
non-discrimination. So data protection by design and by default means that you
must take into account the risk to the rights and freedoms of data subjects
generally, and not just in a privacy context, and that's why anti-discrimination
laws will also be relevant. Phil mentioned earlier the draft EU regulation on
AI. That will apply, as he said, to all AI systems, but with a particular focus
on prohibited systems and high-risk systems, so again it's broader than just
data protection. Then we've got the NIS Directive requirements, and they are
likely in many cases to cover AI cloud computing services, so even if an
adversarial attack in an AI context does not necessarily involve personal data,
it may still be a NIS incident. And then finally we've got potentially sector-
and technology-specific local laws, and they're likely to depend on the AI
technologies being used in the context of the processing. An example might be
licensing laws for some types of AI technologies, such as facial recognition
technologies. Next slide, please. OK, so let's look more specifically then at
how GDPR applies to AI. Now,
the key point to understand here is that the underlying data protection
questions for even the most complex AI project are much the same as with any new
project. So the questions are: is the data being used fairly, lawfully and
transparently? Do people understand how their data is being used? How is it
being kept secure? So the full gamut of data protection issues will be relevant,
but there are ones that cause particular challenges in an AI context, and I've
outlined those on the slide. If we look at accountability: from the
accountability perspective, organisations are required to account for the risks
arising from the processing of personal data. AI implementations generally
involve a higher degree of risk to rights and freedoms than other processing of
personal data, and in the vast majority of cases the use of AI will involve a
type of processing that's likely to result in a high risk to individuals' rights
and freedoms, and will therefore trigger a legal obligation to carry out a data
protection impact assessment. From a lawfulness perspective, the question is
what your lawful basis is for the relevant data processing. It's likely to be
the case that you would have different lawful bases for your AI development
versus your AI deployment phases, and there will be challenges in relying on
certain lawful bases in an AI context; I'll talk about that in a few minutes.
From a fairness perspective, it's about understanding what the reasonable
expectations of individuals will be in an AI context, but it's also about
ensuring things like statistical accuracy, because that will impact specifically
on fairness, and the likelihood of bias within an AI system will also impact on
fairness. In looking at fairness, organisations might also want to consider
having opt-out choices and some of the ethical AI considerations that Phil
alluded to. From a purpose limitation perspective, that particular GDPR
principle again poses key challenges, because it stipulates that personal data
must be collected for specified, explicit and legitimate purposes and not
further processed in a manner that's incompatible with those purposes. So data
that's collected for a particular purpose can't simply be redeployed to train
your AI models without seeking consent from the relevant individuals. From a
security perspective, again, you
have all of the known security risks, but AI makes known risks worse and more
challenging to control. This is because of additional complexity, as well as
things like heavy reliance on third-party code relationships and the need to
integrate with third-party components, and there are also new types of risk,
such as adversarial attacks on machine learning models; I'll touch on that
briefly in a few minutes. Personnel involved in AI may also be from a wide range
of backgrounds and may not necessarily appreciate the broader security
compliance requirements, and data protection more generally. In many cases
you're going to be training on large data sets, which will involve training data
being copied and imported from its original location, so it's essential that you
consider the risks in that particular context. AI often uses open source code,
so in many cases implementing AI will require changes to an organisation's
software stack, and that will introduce additional security risks, as the ICO
has called out in its AI guidance. Then, from a data minimisation perspective,
there is an inherent conflict between the need for data minimisation on the one
hand and the need to allow machine learning to conclude what information is
necessary from large data sets on the other. But again, the ICO guidance on AI
makes clear that there are techniques that can ensure organisations only process
what they need to process, and it recommends that organisations consider those
technical measures very specifically. From an individual rights perspective, in
many cases personal data that's fed into an AI system, in other words that
becomes training data, will be subject to pre-processing to change it from one
form to another to make it into training data. Now, that would still be personal
data, because it's still likely to be possible to use that data to single out an
individual, for example through a series of their unique purchases. But because
it's been subject to pre-processing, it will be harder to link that data to the
individual: in many cases the identifiers will have been removed and the contact
details will have been removed, so it's much more challenging to deal with
individual rights. One particular right poses specific challenges in an AI
context, and that's the Article 22 right in the GDPR not to be subject to
decision making based solely on automated processing that has a legal or
similarly significant effect. We'll look briefly at that in more detail in a
subsequent slide. Next slide, please, Phil. OK, so very briefly then, in terms of
accountability and carrying out your data protection impact assessment, one of
the key challenges that arises in this context will be around the description of
the processing, because it's likely to be highly complex and technical. What the
ICO guidance suggests in this context is that you consider having two different
versions of your DPIA: one that's for a specialist audience, and one that
contains a more high-level description of the processing that's useful for
explaining the processing to individuals and to internal stakeholders. It's also
necessary as part of the DPIA to demonstrate necessity and proportionality, in
other words that there is no less intrusive way of achieving the objective that
you're seeking to achieve. It's important also as part of your DPIA to explain
any relevant variation or margins of error, and it's important to document
trade-offs. There will be a number of trade-offs arising in an AI context. An
example of a trade-off is data minimisation on the one hand versus the need to
ensure statistical accuracy on the other. Another example is ensuring AI
explainability on the one hand versus increasing the risk of privacy attacks on
the other, the more that you make the model transparent to potential attackers.
So you need to document those trade-offs and the rationale for them within your
DPIA. You identify and assess any existing or potential trade-offs when you
design, or indeed when you procure, the AI system; you consider available
technical approaches to minimise the need for trade-offs; and you also have
clear lines of accountability for final trade-off decisions and review them on a
regular basis. The final point to note in this context is that the DPIA, like
all DPIAs, should be a living document, but particularly in an AI context the
DPIA needs to address this idea of concept drift. In other words, if the
demographics of the target population shift or people change their behaviour,
then you need to consider whether the DPIA also needs to be revisited. So there
are some specific issues that arise from a DPIA perspective. Next slide, please,
Phil. I'm not going to go into too much detail on
this particular slide, because my colleagues in Silicon Valley are running a
separate series on AI and they go into this in quite some detail. Suffice to say
that one of the key principles in the GDPR is that processing must be fair and
lawful, and controllers must ensure transparency of data processing. From a
lawfulness perspective, in an AI context, like with any other data processing,
you have to consider the legal basis for that processing, and you must break
down and separate each processing purpose and identify an appropriate legal
basis for each one. As I've mentioned, it's likely to be the case that you would
have different lawful bases in the development versus the deployment phase. Now,
I've outlined on this slide the three most likely legal bases on which
organisations could rely in an AI context. Consent is likely to be an
appropriate legal basis in a number of contexts, and indeed consent will be
required in certain cases. For example, if you're processing biometric data in
order to uniquely identify an individual, that's special category data and
therefore triggers the need to comply with a special category condition in
Article 9, and the most likely condition that's appropriate is consent. But
there are some challenges in relying on consent, because from a GDPR perspective
there must be a genuine choice, and the more things that an organisation wants
to do from an AI perspective, the more difficult it is to ensure that consent is
specific and informed. And if relying on consent during deployment,
organisations must be ready to accommodate withdrawal of that consent. In terms
of reliance on contractual necessity, any processing that relies on this basis
must be objectively necessary to deliver the service. So while, for example, in
a virtual voice assistant context you might be able to rely on that basis in
order to execute a voice command, it's unlikely that you will be able to rely on
it for service improvement, and whether or not you can rely on this basis in
order to personalise content depends very much on the circumstances. In terms of
reliance on legitimate interests, it's key to note that you cannot rely on this
basis of processing if the use of the data would be unexpected or cause
unnecessary harm, and obviously the risk of that happening is far higher in an
AI context than in other processing contexts. I've just included on that slide
also a link to the ICO's guidance that it produced in conjunction with the Alan
Turing Institute. The key issues arising in relation to transparency in
an AI context are addressed in that particular guidance, and of course the
challenge will be to explain in a concise and easy-to-understand manner what's
happening in the context of the particular data processing operations. You'll
see on the slide also that I've included a box that suggests that consent may be
needed in other contexts if there are specific rules triggered. Like Article 9:
we've mentioned special category data. But we'll come in a minute to talk about
automated decision making, and it's likely that it would require explicit
consent to the extent that that automated decision making involves a legal or
similarly significant effect on the individual. Also, to the extent that you're
accessing information on a device, that triggers the cookie rules of Article
5(3) of the ePrivacy Directive, and therefore consent would be required. But as
I say, my Silicon Valley colleagues go into this in quite a bit more detail.
Next slide, please. OK, so we've mentioned some of the challenges that arise in a
security context from an AI perspective, and I mentioned that many of the risks
are the risks that we already know about, but that they're exacerbated in an AI
context. But AI also introduces potentially new security risks. One of those
risks is known as a model inversion attack, and what that basically is is a
description of the scenario in which an attacker has access to some personal
data belonging to specific individuals that are included in the training data
for a particular model, and because they have access to that data and they have
access to the model, they can infer further personal data about those same
individuals by observing the inputs and outputs of the model. If we take an
example of facial recognition systems, they are often designed to allow third
parties to query the model. So when the model is given the image of a person
whose face it recognises, it basically returns its best guess as to the name of
that person and an associated confidence rate. What attackers can potentially do
is probe the model by submitting many different randomly generated facial
images, and by observing the outputs, so the names and the confidence scores,
they could potentially reconstruct the face images associated with those
individuals that have been included in the training data.
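The probing idea just described can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: `ToyFaceModel` is a hypothetical stand-in for a face-recognition service (it is not any real API), "faces" are four-number vectors rather than images, and the attacker uses a simple random-probe-then-hill-climb search. Real attacks and real models are far more involved; the sketch only shows why returning a name plus a confidence score can leak what the model was trained on.

```python
import math
import random

class ToyFaceModel:
    """Hypothetical service: answers with its best-guess name and a confidence."""

    def __init__(self, templates):
        self._templates = templates  # private, derived from the training data

    def query(self, image):
        name, distance = min(
            ((n, math.dist(image, t)) for n, t in self._templates.items()),
            key=lambda pair: pair[1],
        )
        return name, 1.0 / (1.0 + distance)  # closer match, higher confidence


def invert(model, target, dims, probes=200, steps=2000, seed=0):
    """Rebuild an approximation of the target's template from query outputs only."""
    rng = random.Random(seed)

    def score(image):
        name, confidence = model.query(image)
        return confidence if name == target else 0.0

    # Phase 1: submit random images and keep the one the model likes best.
    best = max(([rng.random() for _ in range(dims)] for _ in range(probes)),
               key=score)
    best_score = score(best)

    # Phase 2: keep any small random change that raises the confidence.
    for _ in range(steps):
        candidate = [v + rng.gauss(0, 0.05) for v in best]
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best


# Invented "training data" that the attacker never sees directly.
TEMPLATES = {"alice": [0.2, 0.8, 0.5, 0.3], "bob": [0.9, 0.1, 0.4, 0.7]}
model = ToyFaceModel(TEMPLATES)

reconstruction = invert(model, "alice", dims=4)
error = math.dist(reconstruction, TEMPLATES["alice"])
print(reconstruction, error)
```

The attacker code only ever calls `model.query`, yet the vector it ends up with sits close to the private template. Coarsening or withholding the confidence score, or limiting how many queries one caller can make, are the kinds of mitigations this toy makes easy to reason about.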
And so you can see on the left one of those reconstructed versions, and while
it's imperfect, researchers have found that they can be matched by human
reviewers to the individuals in the training data with 95% accuracy. That's an
example that's taken from the ICO guidance on AI, and I've included a link also
to that guidance in the slide. Next slide, please, Phil. This slide just
illustrates another potential new security risk, known as a membership inference
attack, and essentially what it does is allow a malicious actor to deduce if a
given individual is present in the training data of an AI model. So basically
attackers have the target model, and they use the target model in conjunction
with information they already have about the individual to work out whether that
individual was part of the training data. Now, they can't necessarily find out
additional information about the individual, but they can find out whether they
were in the original training set. That's not necessarily always particularly
significant, but if for example the model is trained using vulnerable or
sensitive data, so from a vulnerable or sensitive population, like those with
dementia or those with HIV for example, then revealing that someone is part of
that population can give rise to significant privacy risks. Next slide, please,
Phil. So finally then, just to look briefly in more detail at
Article 22(1) of the GDPR. I mentioned specifically some of the challenges that
arise in relation to respecting individual rights. One particular individual
right that poses significant risks in an AI context is Article 22(1), and what
that article says is that individuals have the right not to be subject to
decision making that is based solely on automated processing, including
profiling, which has legal or similarly significant effects on that individual.
Now, in many AI implementations that is exactly what the AI system is designed
to do: it's designed to produce predictions, and based on those predictions
certain decisions will be taken. So if Article 22 is triggered, in other words
if there is automated decision making going on that potentially has these legal
or similarly significant effects, what Article 22 says is that there are only
certain legal bases on which that processing can be carried out. Either you need
the explicit consent of the relevant individual, or the processing must be
necessary for the performance of a contract or for taking steps to enter into
the contract, or the processing must be authorised by Union or Member State law
which introduces suitable measures to safeguard the individual's rights and
freedoms. Now, even if you have the appropriate legal basis, even if for example
you have the explicit consent of the individual to undertake this type of
decision making, certain safeguards still need to be built into the system by
virtue of Article 22, and those safeguards demand that you basically have some
process whereby the individual can obtain human intervention, can express his or
her point of view and/or contest the relevant decision that's taken. And so it's
important, and again the
ICO guidance makes this absolutely clear, that to the extent that you are
obliged to insert a human into the process, it is a human with genuine
decision-making power, not a token human intervention, and that they do have the
power to overturn the decision. One of the key things also is that you make sure
that you mitigate risks like automation bias. Automation bias basically is that,
at the end of the algorithmic process, humans tend to trust that process and
that the output of that process is correct, but as Phil has pointed out, AI can
get things wrong. So it's important to ensure appropriate processes and
procedures are in place to avoid that automation bias, to the extent that you do
insert humans into the process. So that brings me to the end of the risk
section, and I'll now hand over to Rob, who'll talk a little bit about some of
the
practical issues thank you thanks early the next slide please so first at
school problem we're going to address is that of negotiating data protection
agreements between vendors and their customers where the vendors providing an
AI service so as you may have noticed already in non AI processing there can
be some complexities about whether a vendor is a controller or processor or
potentially both and this conundrum is certainly no easier when AI is
involved but we'll come on to discuss that a bit on the next slide so
the bottom line is if a vendor is processing data for AI purposes and that's
only to benefit the specific customer that contributed the data then there's a
stronger argument that the vendor will be a processor but if the vendor is
processing data to benefit itself or other customers as well for example for
general product improvement purposes or if it's targeting users then that
vendor is much more likely to be a controller so how do we deal with this in a
data protection negotiation well if we look at it from a product
improvement perspective that's a very common point of negotiation and we've
seen a few different strategies about how to deal with this in practise so the
first one is just to disclose the fact that the vendor is a controller
and the data will be processed for product improvement purposes that's a more
legally accurate description but it will meet with some contractual resistance
from customers often customers kind of get into the mindset that all vendors are
processors and they don't want to consider anything outside that box so if we're
going down this route maybe we could make it a bit more tolerable to the vendor
sorry to the customer if we gave them an opt in or potentially an opt
out to allow some sort of control about whether that data will be used
for product improvement more generally so that's the first approach the second
approach perhaps the more traditional alternative is for the vendor to
position itself as a processor unfortunately though this is a bit like fitting a
square peg into a round hole because the vendor is going to need to get
instructions from the customer to process that data for those purposes in order
to meet the requirements under article 28 of the gdpr and that's where again
you're likely to get that kind of resistance from customers and potentially
hold up the deal so
what's the answer well a third option could be anonymization that's going to
require a bit of investment and potentially some technical wizardry but if the
vendor is able to fully anonymize the data then that data will no longer be
personal data so would come outside but sometimes trying to
The importance of AI Governance
Make an assessment of the risks that businesses have using
ChatGPT solutions and my background is as a lawyer I'm an expert in data
privacy regulations I advised the French Prime Minister's administration during
the negotiation of the gdpr as an expert and I've been doing this for a very
long time the interesting thing here is we will look at two different problems
the first one is how personal data is collected by OpenAI for the training of
ChatGPT so that's one first element that we need to assess and the second
element is how personal data might be processed through the use of ChatGPT so
those are really two different elements and why is it important it's because for
EU businesses or businesses that have EU customers gdpr compliance is a
very important element if ChatGPT is absolutely not compliant or cannot
manage to get compliant within a reasonable amount of time then the problem is
that for EU businesses to use the tool it will be very difficult and so there
will be a huge loss of competitiveness so this is quite an important problem to
take into consideration so already Italy raised a gdpr compliance problem France
also there are many complaints at the moment that are currently being
processed so we'll take a look at the problems that exist with OpenAI
so first element problems related to the collection of personal data so through
the training of OpenAI's ChatGPT there is a lot of data collection because the
data model is trained
Chat GPT and Open AI
It helps clients throughout the world reach their third
stage of digital transformation success and the topic of today's live stream is
ChatGPT and OpenAI what are the data privacy and legal issues so in other
words we're going to talk about the dark side of ChatGPT and OpenAI we will
also talk about the positive things but we also want to really focus on some of
the hidden risks and things that people may not be thinking about as they
embrace and get excited about ChatGPT OpenAI and other AI
technologies that are becoming quite in vogue and mainstream here lately I will
introduce our guest here in just a moment but before we do that a couple of
logistical things first of all this live stream interview is going to be edited
and polished and added to additional content to become part of next week's
episode of Transformation Ground Control which is a weekly podcast that I host
it's released every Wednesday on LinkedIn YouTube Facebook and Twitter where it
streams every Wednesday and on Wednesdays you can also find those same episodes
in audio format on Google Spotify Apple Amazon et cetera podcast platforms
wherever you listen to podcasts you can find it there so be sure to subscribe to
the podcast if you haven't it's called Transformation Ground Control you can
find it on podcast platforms all over the place as well as streaming on the
platforms I mentioned secondly we are going to start with some questions that
I have for our guest here today to talk about some of the legalities and data
privacy and dark side issues of ChatGPT and OpenAI but I'd also love
to hear your questions as well so I want to make sure that we get to audience
questions here throughout the conversation so at any point while we're talking
we've got an eye here on the streams that we're streaming to and so we're
watching the chat stream here so please drop in the chat any questions you have
along the way and speaking of the chat stream here if you don't mind just
dropping in the chat wherever you're watching today where in the world you're
joining from which city and country
this is a global reach and a global audience that we typically are talking to
every Tuesday and we'd love to hear where everyone's joining from here today
especially because this topic is very much an international topic and
there's legalities and nuances that are probably different in different parts
of the world so we'd love to hear where people are joining from here today so
please drop in the chat where you're joining from so again topic today ChatGPT
and OpenAI what are the data privacy and legal issues the best person I could think
of to have this conversation is someone that we've had multiple times on the
podcast on this live stream it is Marcus Harris from capital law so Marcus
thank you for being here today so it's good to see you again I appreciate you
inviting me on to talk about this like you said I mean this is an evolving world
of real innovation and impact and I think the legal implications are pretty
critical to have an understanding of especially I mean I look at this from two
perspectives one is just from a consumer perspective wanting to deal with the
legal implications and kind of lay out those risks and look at how this is going
to impact you what kind of legal constraints there are with respect to AI and
then I look at it from an enterprise perspective and figure out you know what
are the benefits what are the use cases how you mitigate risk how to exploit
this technology to really gain some efficiencies on the corporate ERP big data
big software side yeah absolutely and like I said in the intro of this discussion
this is such a hot topic right now and there's so much excitement around it so
much curiosity around it OpenAI and ChatGPT are things that are mentioned in
mass media and pop culture a recent episode of South Park which is a popular US
animated comedy even had a whole episode dedicated to ChatGPT so you
start to look at these signals that people are really interested in this topic
and so we thought it would be kind of cool to talk about you know
we want to temper some of the enthusiasm about the technology not to
suggest it's not a good technology or that it won't totally transform the way
we do business but we also need to recognise the risks and the potential dark
side of ChatGPT and OpenAI and by the way I'll keep talking about ChatGPT and
OpenAI but really this whole conversation relates to any AI model or generative
AI that's out there so Google has their own version Bard Microsoft has Copilot
which is something that is a little bit different but they've introduced that
as part of their Office 365 suite or I think they're beta testing it now
Microsoft is also an investor in OpenAI Musk is investing in his own platform
for sort of an OpenAI type of model so we're talking about ChatGPT and OpenAI
but we'll sort of use that as a universal term to describe what we're
discussing here to analyse all sorts of AI just to start I guess you know one
thing maybe just
to set the context for the discussion here we won't go into a tonne of detail
on what ChatGPT is and what OpenAI is but we do have some resources on our
YouTube channel that you can go to if you want to learn more about it in fact
I'll ask our marketing team to drop in the chat here some links to recent
discussions I put out a brief video just yesterday on my YouTube channel
that just gives an overview of ChatGPT and also a few weeks ago on this
same live stream we had a discussion around what ChatGPT is and what it can
do and the ways it's affecting businesses and the way we do business so we'll drop
those links in the chat so you can learn more about what the platform is today
we want to focus on like I said the dark side of this and just
to set the context I'll kind of open it up by talking about a poll that I
published on LinkedIn just yesterday actually so we haven't gotten a tonne
of results or the complete results yet but we have 116 votes on this poll I put
out and in this poll that I posted to my network I asked the
question of what will be the biggest positive or negative impact of ChatGPT
and OpenAI and just to give you some context of what people are thinking about
it 44% said it would make businesses more efficient 34% said data privacy
and legal issues 15% mentioned loss of jobs and then 7% said something else
other comment below and we'll get to some of those comments here in a moment
but the reason I put this poll out there is cause I wanted to get a feel for
you know how excited people are about the technology versus how much people
recognise the dark side or are concerned about the dark side and certainly the
44% who said that it's going to make businesses more efficient that seems
reasonable but then it was interesting to see that 34% cited data privacy and
legal issues
as sort of the main thing that was on their mind so I guess maybe to start
there maybe just use that as a way to set the context for my first question
which is what are some of the legal concerns or unknowns as it relates to
ChatGPT and OpenAI
and obviously data privacy and legal issues being one of them maybe we could
start there and then mention anything else that you think are kind of
concerns from your perspective as an attorney yeah I mean I think you know just
generally with respect to OpenAI and ChatGPT I mean this is probably one of the
inflexion points from a technology standpoint that I don't think we've seen
before I mean to me this is as revolutionary as you know anything that has come
before it just has the ability to transform the way we interact with people
and then just the levels of efficiency of course with that kind of power as
they say comes great responsibility right and one of the concerns that I
have we're going to talk about the dark side but I do want to talk about
some of the efficiencies and some of the positive things about this cause I
think they're just enormous one of the issues with legal regulation or the law
as it applies to technology has always been the fact that technology is always
you know three to four steps ahead of where existing laws and regulations and
any kind of regulatory framework exist so we're always in a scenario where
we are catching up and I think you know one of the fundamental issues from a
legal perspective and there's so many OK and we can talk about this for hours
but one of the things that is problematic is these tools these systems
are being trained on you know a corpus of text of speech of data and you don't
have necessarily an understanding of what makes up that body of
information where are they getting it for example and I think that creates a
fundamental risk a fundamental problem for a variety of reasons one is what is
the accuracy or the reliability of the output we have an old saying in our
industry you know it's garbage in garbage out right you put bad data into it
and you get bad data out so if there is no control or no regulation or no
interest in the accuracy
of what this thing is training on then there's really no guarantee as to the
accuracy or reliability of what you're going to get out the output of the
content that you have created through this tool and the reliability of that
creates substantive issues from just a liability standpoint and from an IP
infringement standpoint I do think really that is going to be somewhat of a
self regulating issue because you know you're going to have different competing
products and if one is less reliable and the results are not as good as another
then people are actually going to gravitate towards a
different product but I think inherent in that is just a lot of
substantive risk what data is being utilised and how good is the product
that you're getting out of it and how you utilise it and then downstream how do
you protect the ultimate end user so for example if you're a software vendor
that's incorporating AI generated content into your products what kinds of reps
and warranties indemnities and limitations of liability are you going to take
on with this unknown corpus of data from a legal perspective there has been
recent litigation primarily in the IP ownership context so you've got ChatGPT
and there's another entity I think it's called DALL-E where
it takes visual representations and creates modifications of that and
there has been at least one notable lawsuit and a couple of others one filed by
Getty which has all these photographs where the AI tool has gone in and modified
pictures or used as a basis for the generated output those Getty images and
Getty says hey you know that's an infringement of our intellectual property
rights in those photos it's interesting and problematic in how as a company do
you make representations about infringement take on indemnity obligations and
limit your liability
when you don't know what the basis for the data has been yeah that's really
interesting I wasn't aware of that lawsuit involving Getty but I could see
how that could be a challenge and I think when I've thought about data privacy
and IP or intellectual property types of issues with these AI platforms I guess
I have thought a bit more from the end user perspective you know
if I'm an employee at a company and I use ChatGPT to have it analyse
some sort of sensitive company information or confidential company
information you know what happens to that data and that sort of thing but this
is interesting you're kind of talking about the other side which I hadn't
thought about which is the model itself using information that's already out
there and then who owns that data and what are the IP implications of that what
about you know when you are an employee if I'm an employee at a company or an
organisation and I use ChatGPT let's just say I'm trying to use it I'm
testing it out in my day-to-day job what are some examples or some issues that
I might face or be exposed to as it relates to intellectual property and or
confidential information how does that maybe you could help us understand how
that works and what that might look like from a legal perspective yeah well
I've spent some time in my practise looking at the rules and regulations that
these entities put out there in the context of terms and conditions and what
their obligations are to you and to be quite honest with you I
mean they do a very good job of saying hey you know whatever you give to us
is not going to necessarily be treated as confidential material and we're going
to be able to use that material in order to iterate our product to make it
better so essentially you know whatever you're putting into the AI
tool that you're using you have to make an assumption that it is not going to
be treated with any kind of confidentiality it's not going to be treated as a
trade secret it's going to be open and available to the world and in fact
Samsung I think just in the last couple of weeks has had a ChatGPT issue where
somebody at Samsung disclosed proprietary or confidential information via
ChatGPT now I think that's a huge concern because they don't take
responsibility ChatGPT or any of these other OpenAI platforms when
you're uploading your input into them it's the Wild West essentially
you should have no expectation of privacy or confidentiality or really very
much recourse to get that information back and I think that's a huge risk for
any company and you can't have you know some guy in a cube on the other side of
the world trying to do a more efficient job but then having ChatGPT
for example create substantive issues of liability infringement
and exposure for enterprise customers in enterprise measures yeah
that's really interesting what about this just occurred to me as you're
talking there Marcus it kind of reminds me a little bit of the early 2000s
when Napster was becoming a thing and for those of you watching or listening
that don't remember Napster it was a platform that was developed that basically
allowed you to share music and it became an issue because obviously there were
copyright issues and bands like Metallica I remember Metallica being kind of
leading the charge on this but Metallica sued Napster because they were
creating this platform that allowed people to essentially steal their music and
they weren't getting money for what was rightfully theirs and ultimately
over time they sort of won that fight they won that battle and Napster is no
longer around I don't think or maybe they've been wrapped into some other
company but they're not as relevant as they were at one point 20 years ago they
were sort of a flash in the pan they sort of came and went because of some of
those issues I don't know that it's necessarily the same thing or from a legal
perspective it's the same kind of issue but are these AI platforms or
vendors are they at risk because they're providing a platform that exposes
potential confidential information shares that confidential
information or is it just sort of like we in organisations as users of that
tool it's just a risk that we just have to deal with what are your thoughts on
that or how do we I mean I think from the platform's perspective
like I said you know they've got these pretty robust terms of use that have been
generated by teams of attorneys that really do mitigate their liability and
their risk of lawsuits and recourse and you know that is being tested in the
court system now with a variety of different lawsuits so certainly I think if
that corpus of material that they use to train their AI model contains
proprietary or confidential information that has been accessed without
authorisation or is being used without express authorisation that creates a
risk in the platform not only to the owners of the platform the providers of
the product but even to the potential end users you know I mean think about
that for a second as a user of ChatGPT if you are now generating code
where that AI tool has been trained on someone's proprietary code that's
confidential or protected by an intellectual property right the output that you
have generated is now infringing and who's responsible for that the
AI provider attempts to wash their hands of it and we'll see how successful
they are in doing that as these court cases move on but then what is
your liability as the person that generated that code the AI provider is
going to point to you and say well you represented that you had the right to
do what you did you put it in you're responsible for what comes out that's
not necessarily true so I mean we are at the vanguard of a lot of these legal
issues and it's super interesting and I think there's certainly going
to be some trailblazing you know legal precedent that comes out of
these that's going to define the parameters of how you are going to use these
systems in future so some of the other issues associated with this too
I mean you know who owns this output is another fundamental question that
really has been clarified at least to some extent by the copyright office in
the last month or so and the copyright office came down on this by saying if
you are inputting information and the output really has no human interaction
or control or discretion that is not protectable from a copyright perspective
it's not protectable as intellectual property in the eyes of the copyright
office so you
know if you were to go in and say OK well let's do a new painting in the
style of Van Gogh and Monet and now you want to apply for copyright protection
the copyright office is going to ask you certain questions about it and if
there is no human interaction or human control over what has come out they're
going to reject it from a protection standpoint which is in contrast and this is
their example you know to a graphic novel that has text that is generated by a
person or maybe partially generated by an AI tool but then edited and
modified by a person with images that were generated by AI there's going to be
copyrightable elements to that and that work would be protectable maybe not
every element of that work so they are coming down with clarity certainly the
copyright office is providing guidance to consumers and courts as to what is
protectable and what is not yeah that's fascinating I was not aware
that the copyright office of the US government is already providing those
parameters and it leads me to another question here which is that you know you
see a lot of governments throughout the world as you just described the US
government responding fairly quickly I mean as fast as governments can move I
suppose they're responding quickly to this threat or the uncertainty
around ChatGPT and OpenAI in fact some countries like Italy for example
just a couple of weeks ago completely banned ChatGPT and said it's just a
platform that is illegal in their country and I think China just recently a few
days ago did something similar so governments in some cases are taking more
extreme measures and just sort of trying to shut it down or trying to
completely control or limit the exposure I know you know as
an attorney you probably hate this question because I'm asking
you to predict the future in a world of uncertainty but do you think
that trend will continue with US or with global governments
getting more and more involved in trying to regulate it similar to what they're
doing right now with cryptocurrency and other things that could be perceived as
threats to the status quo do you think that will continue what would
your prediction be on that front I would expect it to continue in certain
jurisdictions and I think for me that really is the wrong approach because I
think that you know just outright banning something without fully realising the
benefits of it there's a real risk in that that your technology
industry is going to be left behind and now once you let the genie out of the
bottle so to speak you can't really put it back in and you know these things
don't necessarily have borders right I mean you know if you've got ChatGPT
that's generally available out there people are going to find a way to use it
certainly if it's
providing efficiencies and benefits you know from a profitability standpoint
people are going to be clamouring to get their hands on it and I think the
right approach is regulation and developing a legal framework that is going to
mitigate whatever perceived risks are associated with it yes I think they're
right certainly in being concerned and I don't know whether we need what we
call sui generis protection which is kind of a bespoke brand new type of
compliance system for this or if we can leverage existing laws I don't know
maybe there's a mix of both maybe some bespoke regulations need to be put
in place to address whatever concerns there are with OpenAI but I certainly
think that the framework that exists you know in law can be applied to this
in a substantive way to mitigate risk but certainly I mean you look at
implications with respect to banking and healthcare you know those are enormous
implications right I mean you've got faulty data that's got biases in it you've
got inaccurate data I mean you're talking about you know people losing their
lives and you've got financial crises that could be perpetuated through
AI issues so I mean the risks are certainly enormous and you know we
talked about you know accuracy of the data earlier you know who's responsible
essentially for policing that who is making sure that the data that is put
in is accurate and the output is reliable you know as far as I
know and I could be wrong I'm sure somebody here in the chat can correct me as
far as I know there are no systematic regulations today that are
applicable to AI and I think what that means then is from an administrative
standpoint this is administratively burdensome but as users particularly in the
corporate context you've got to regularly assess whether that data
is accurate you need to know what type of corpus is being utilised
with that particular product that you're using I mean is it a
private data system or are they just going out on the Internet scraping data
that's generally available that's going to be problematic potentially you
need to know that you have to have an understanding of what those risks are and
then you can craft policies internally to mitigate those risks I look at
this you referenced Napster and I think that's a good analogy that's a
good comparison I look at this almost from kind of an open source context
right I mean what are the risks to your enterprise from open source software
and what kind of programme have you put in place to minimise the risks
associated with the liability caused by open source software it's
kind of similar in some ways when you're looking at you know unfettered use by
your employees and consultants of OpenAI tools to make them more efficient how
are you going to moderate that how are you going to mitigate the risks what
policies and
regulations as an organisation are you going to put in place to govern
that kind of thing yeah you mentioned a couple things there I want to come back
to in a sec with follow up questions one is what you just said about
policies you might put in place as an organisation I'll ask that later
in the discussion here today maybe we'll dive into that and then I also want
to come back to a couple of other points you mentioned before we do that just a
couple of things Marcus if you're wondering why no one's saying anything in the
chat it's because the chat stream isn't working in the green room that we're in
the studio we're in right now so I have manually on my phone I have
LinkedIn open and YouTube as well so I'm going to have to manually pull the
questions we had the same problem last week I'm not sure why it's being an issue
so unfortunately you won't see the questions in advance like you normally would
but before I get to the question here from the audience I'm just going to go
over to the audience here and where people are joining from we have I Anna from
Trinidad and Tobago thank you for being here today subash sheesh from
India Ashley from Atlanta Wesley from Colorado Lars from Charlotte NC just as a
few examples here I'm trying to cherry pick some different countries that
people are joining from here today thank you all for being here really
appreciate you being here one of the questions we have here on
LinkedIn actually just came in on LinkedIn it was sort of related to
what you were talking about a second ago Marcus I just need to find it because
this isn't ideal on my phone and I lost it where is it bear with me one second
here it is this is from the mall on linkedin AI generated IP you were sort of
talking about that a minute ago as far as what the US copyright office is
doing do you have any other thoughts beyond that I know you're not
a global attorney your focus is obviously on US law or US based law but what
are your thoughts on that anything else you would add to that thread yeah I
think that's certainly an interesting question I mean I think you know
is it patentable is there a patentability component to these things I mean
I've really thought about this in the context of the copyright component and I
certainly think that if there is novel you know output that is being created
you know it very well could qualify for patentability and patent protection
now whether that qualifies for patent protection in the United States I think
the hypothetical is just incomplete so it's hard to say so I talked about
something that was called sui generis protection earlier and you know we
may get to a point where you're using machines to
create innovative technology and there may need to be some sort of programme put
in place to protect that that doesn't really rely on this component that the
copyright office requires which is human intervention and a human component to
it so I could definitely see legislation that would expand or clarify the
patentability of technology that is being developed through OpenAI sources so I
think that could be something that comes down the pike but like I said it's
really hard to say without having specific models things like that right it's a
really interesting question yeah yeah for sure here's another comment I'm
actually going to pull this comment off of my LinkedIn poll that I mentioned
earlier where I asked you know what the biggest positive or negative impact of
ChatGPT and OpenAI would be in this thread one person Jonathan on LinkedIn
chose other and then when you choose other I ask people to put in the comments
what they think the real issues are I'm going to read this comment even though
it's a bit longer I'll try to condense it a bit here but it
brings up a really good question and something that I don't know that a lot of
people have really thought about just like a lot of people haven't really
thought about the confidentiality and IP issues with ChatGPT but Jonathan says
in response to the poll the items on the list don't even begin to understand the
complexity of issues we face what about faking people with fake explosive
messages fake videos propaganda disinformation irresponsible know how like kids
gaining dangerous chemistry know how or the floodgates of security threats and
exploits by generating code he goes on to just list a bunch of things that you
could use AI for you know persuading people into scams fake family members
calling you for cash you know you can't tell that it's not a family member I
know internally at Third Stage you know just like a lot of companies we get a
lot of phishing scams or attempted phishing scams and could someone replicate
me and create my face or voice and make it sound like I'm saying hey Marcus I
need you to wire me $1,000,000 right now you know what do you think do you see
that being an issue or have you seen any issues of that recently I think that's
really an issue for an ethical perspective right in there isn't necessarily you
know well well I mean we talked about regulation and a legal framework and I
think that's something that would certainly need to be regulated in my view and
whether the existing framework of laws and patchwork of regulations could be
applicable to that I'm sure they can to some extent but here you've
got this explosive technology that you know you're you're pretty available on
social media right I mean people know what you look like they know what your
voice sounds like and I think that someone's ability to fabricate a deep clone
of your image and your voice the cadence with which you speak occasions to hate
notations of trouble at work I think that would be pretty easy for one of these
OpenAI platforms to replicate and then all of a sudden you know we've got
you you know or this deep fake version of you calling somebody asking them for
money or whatever it is I mean I think the the ability to clone your voice
mimic speech cadence tones it's really off the charts and if you look at how a
lot of those interactions have taken place from a commercial standpoint today I
mean you know you've got you go onto a website and there's a chat bot that interacts
with you and this is kind of halting stop and start interaction that's all
going to change fundamentally and you know there's the ability to
use that for nefarious purposes which is you know the deepfake
fraudulent schemes you know causing havoc in financial markets political you
know the crazies but then you look at something as innocuous as Expedia.com
just integrated a ChatGPT API into their website where you can just go online
and say hey look I'm looking I want to go to Puerto Rico for the weekend can
you tell me what flights are going there and what the price ranges are and it just
tells you like a person I mean that's so that's on the positive you know
efficient uses of it but the flipside of that is pretty dark and pretty scary yeah
yeah and for a real life example of the power of how someone could potentially
use this technology in the nefarious ways you're describing Marcus I would
invite everyone who is listening in if you're on TikTok and if you haven't
already go check out an account on there called Deep Tom Cruise and it's I've
mentioned it before on this podcast it's a fascinating account it's actually
pretty funny but it's apparently a software facial recognition and facial AI
generative AI sort of software company that Tom Cruise is somehow involved
with he actually invested he invested in it he has a stake an equity stake in
the company from what I understand so he gave them permission to create this
account called Deep Tom Cruise and it's sort of like a younger version of Tom
Cruise maybe 20 or 30 years ago it kind of looks like maybe he's in his 30s
or what he looked like 20 years ago even though he still kind of looks the same
which is irritating as a middle-aged man but but if you go to this account you
cannot tell or I cannot tell that it's not him and if you if you watch all the
videos are pretty funny his looks his mannerisms to your point Marcus the
mannerisms are just like him he's kind of quirky in the way he talks and
his personality is a little quirky and he's pretty funny in the video but you
cannot tell I mean I try really hard to notice some sort of deficiency or
inconsistency and you cannot tell his voice his mannerisms his look
everything so I think if you go to that you look at that it's pretty fascinating
but also kind of freaky because I fell for it when I first saw it I thought it
was Tom Cruise and you can see someone using that to their advantage you
extrapolate that to scenarios where you've got you know a deepfake of that
quality of Vladimir Putin you know declaring nuclear war on the United States I
mean that those have huge implications right I mean if you take it to kind of
more mundane level I get a tonne of robocalls on my cell phone or on
my work phone and I have a hard time every once in a while determining whether
I'm talking to a real person or some sort of you know artificial intelligence
and that artificial intelligence is not even that good you know I
catch myself initiating a conversation and then realise it's just trying to sell
me an extended car warranty or you know whatever it is Medicare or something I
don't like those calls but so I mean it's going to really it's going to
really really blow those things out of the water and I think you know we
talk about existing regulations well you know what regulations do you have do
not call lists and that kind of thing I mean it's become incredibly easy for these
people to inundate you with scams and the likelihood that you're going to fall victim
to those now increases exponentially as well which is certainly problematic so
I guess using you know asking a question already asked you but asking it in the
context of what you just said who's responsible for that from a legal
perspective in the future how do you think courts and governments throughout
the world might settle that will they settle it like it is you know is it my
fault if I fall for something like that is it the person is it obviously it's
the criminals fault whoever creates the fraud or you know there's some there's
obviously liability there but how at fault or at risk are we as people who might
fall for it and or organisations or platforms that enable that sort of
activity you know who's how do you allocate the liability I guess in this whole
equation from your perspective well I think that the owners of these AI tools
have a responsibility right and I think there's an effort on their part to do that and
I think as long as they are meeting a certain standard of reasonableness and
trying to prevent nefarious use of their tools really you can only do what
you can do right and I think if they're not being reckless or not being
negligent then I think the liability that they have is probably going to be on
the lower end of the scale I think what you're going to get into is a
situation where there's a vulnerability in the AI platform that someone is able
to exploit kind of hack and you know now starts to create something that can be
utilised in a bad way but then you're always going to have a situation where
you know you're going to have criminal organisations or rogue people with ill intent
developing technology that's substantially similar and it's just
it's going to be focused on you know fraudulent activity and how do you
regulate that I mean you can't you can't regulate the criminals from being
criminals right I mean you can have laws that hold them liable responsible for
those things but certainly I think you know you've seen these efforts yeah I
was watching YouTube video the other day where you know someone's using I think
it was ChatGPT and they asked how to make a bomb and you know it's that's
against one of its parameters of use and it said hey I can't tell you that
sorry so right they're putting those kinds of safeguards into the system but
there are certainly you know I think workarounds people are certainly
creative and they're going to get around those types of things even if you
put those in place certainly yeah and you sort of allude to something that we
started to talk about earlier which is the inputs into these AI models and
who's the arbiter of truth and you know I look to social media right now and
the way governments throughout the world including the US government where you
and I are based the US government has in my opinion sort of overreached on its
moderation of discussion and speech on social media and that's just my personal
opinion I don't want to open a can of worms for people to agree or disagree with
me on that but that's my personal opinion so it makes me wonder if the US which is
known for its freedom of speech and that sort of thing we don't have freedom of
speech in the way we used to and the government has sort of cracked down to
determine what it thinks is true and what it thinks is false or misinformation
I guess I worry a little bit that governments and other powerful bodies
even the AI platforms themselves AI vendors themselves may start to moderate or
may start to create these biases to reflect their own personal biases in these
AI models and it just creates a bunch of questions around what what is fact
what is opinion what you know what's the quote unquote right answer you
mentioned that earlier about you know these AI models are going out and
gathering information that may or may not be true and it's using that
information that may or may not be true to create outputs that it's feeding to
its customers so do you have any thoughts on that I mean how do you think that unfolds I
know I'm getting outside the legal realm there and talking more about societal stuff
but I think it's incredibly problematic I mean I think bias exists
everywhere right and I mean if you've got bias in your data and you've got bias
in your team and you know the output then is dealt with as true by unsuspecting
consumers or users of the technology I think the implications are
vast and incredibly problematic I mean you think about things like you know
just diagnosing certain types of healthcare issues and you know that
there are whole populations that are more susceptible to those because of their
race or their identity or whatever it is that's clear but if there's bias in that
area it's going to substantively impact how people are diagnosed with
disease potentially right or if you think of financial or banking utilisation
of these tools you know who's going to get a mortgage in what neighbourhood and
those types of things you know I mean certainly the bias factor can be strong and you
know how do you regulate that especially if you're concerned about over
regulation you know having you know images of Tom Cruise is one thing and
certainly with his consent it's a totally different thing but you know how do
you how do you stop people from utilising your images and there's certain laws
in place to deal with that now but you know I think there's got to be some
level of regulation as to the quality of the content the quality of the output
and you know I think for now you know the self regulation is really kind of
where you're at with this and see where it goes yeah yeah absolutely yeah maybe
yeah I'm fascinated by that cause I think it's I think anytime you have
something as powerful as OpenAI ChatGPT and something that has caught on so
quickly much faster than even cryptocurrency I mean cryptocurrency is sort of a
mild version of of this in terms of the sort of an alternative new platform a
new way of doing things and you see how the governments throughout the world
have reacted to that they've tried and they're still trying to find ways to
regulate cryptocurrency and I think with this anytime you have something like
this governments just can't help themselves throughout the world that's their
job I guess in some ways is to rein it in and so you wonder how are they
going to rein this in what are they going to do and does that diminish the value
or does that diminish the benefits to society or whatever the case may be yeah
well that's always the risk right I mean if you're in Italy right now you
don't even have the benefit of utilising this technology because the
regulation's been so extreme on one side that you know it's been totally banned
and that to me doesn't seem like a well reasoned approach to dealing with
the issues but you know I think you know self regulation initially and see how
it goes and then you know government interference to the extent that it's
necessary or needed and maybe in only certain industries or regulatory
applications of this where it would be necessary right so here's a follow up
comment to one of your comments earlier Marcus this is from Frederick on
LinkedIn Frederick says it looks like Samsung found out by an internal audit
that people had done infringements ChatGPT didn't go out and post a Twitter
post with the information so in other words ChatGPT didn't expose that
information necessarily or make it public but the internal audit exposed the
fact that people had shared that confidential information doesn't change the
fact that now that confidential information is part of the OpenAI data set
data model and it could be used obviously ChatGPT is not going to post that
information on Twitter but it is information now that was confidential it might
have been intellectual property that is now part of the OpenAI model I think
that's an important point there yeah I think it's important too it
wasn't a data breach necessarily right I mean it wasn't as if they had been
exposed on the Internet in an open and conspicuous way though it had been exposed
kind of you know within the corpus of this data set and is being utilised
potentially by all the users of ChatGPT now and you know what is being
exposed what's not been exposed and how detrimental it is to Samsung in reality is
hard to say but it can't be a positive thing in any way shape or form right I mean
and this is why I think you know we talk about government regulation and self regulation
of these tools the owners of these tools I think as an organisation say you
take a Samsung you've got to have internal policy you've gotta have internal
regulations that are going to govern what your employees can and
cannot do my understanding is that Samsung had shut down any internal
use of ChatGPT and then like I said there was some pushback internally or they
reassessed that position thought it maybe wasn't the best position and then look
at what happened right but I think you know as an attorney and certainly as
a former in house attorney you really have to look at this with a careful
eye and look at the benefits and risks because you would like your company to
take advantage of the efficiencies and the business people are going to do that
but you've got to assess the risk and put in a regulatory scheme internally
that mitigates risks but also increases use of the tool if you can and that's not
an easy thing to do right yeah absolutely now here's a question from LinkedIn
this is Caroline on LinkedIn although I'm wondering if it's really Caroline
do we know I don't know it might be a deepfake version of Caroline but this
alleged person named Caroline I'm kidding but she says OpenAI's privacy
policy for their paid premium API indicates that they do not use our
proprietary data to tune their data model nor do they store our data other than
their privacy policy how can we be assured that our sensitive data is secure so
I guess first of all I didn't you know I'm not aware of that I'm not aware of
what their privacy policy is to be honest I'm not sure yet I can't validate
that that is what it says that they won't use it to tune their data model or
store their data I guess that begs the question of what what you know what's
the difference between what they do store and using their data model versus
what they don't store I'm not sure what the you know what the differentiators
are is that something you're familiar with or or have any thoughts on well you
know I've looked at the ChatGPT slash OpenAI terms of use and you know my
understanding is that it's pretty vague and there's a lot of nuance and grey
area that would allow them to utilise any information that you would
submit to be able to at least modify the platform or make it more efficient but
certainly I think the bright line rule here for me as an attorney is
you know regardless of even what the terms of use say if you're going to be
submitting information that is confidential or proprietary or trade secret into
this system that is a fool's errand right that is not something that you
should be doing or allowing as a company or encouraging your employees to do and
there should be some restriction internally that you impose as a company to
prevent that from happening because to me I think the likelihood that that
information is going to be utilised in some way is very high and I think the
risk to the company you don't know what it is I mean I think in the Samsung
example you have a scenario where you know you've got information that's part
of the corpus it is being utilised in some way shape or form how substantively
I don't know I don't know I mean you know talk to Getty Images and you know
they're certainly not happy that their proprietary copyrighted photographs have
been part of incorporated into the corpus and are being utilised to generate
images that's the infringement from a real practical perspective right
in some ways it doesn't matter because there's statutory
liability and I think that the total damages they're seeking is something like
$12 million so the risks and the perceived risks really maybe aren't equal or
aren't the same but certainly from a liability and risk perspective
you've got to manage and mitigate it yeah yeah absolutely yeah great point it's it's a
lot of what we're talking about here today seems to come back to a common theme
which is we don't really know you know we're talking about a lot of uncertainty
and a lot of speculation of what might happen in the future some of this has
already happened but I'd say you know there's probably 90% or more of of this
that we haven't really figured out or settled yet as it relates to ChatGPT
all right you know one of the things that we haven't talked about that is
directly related to the Samsung incident and the utilisation of this information
is what do you do as an organisation when you have a huge data set of user
information let's say you know you're a company that's utilising AI and you're using
a software product that incorporates AI to analyse you know end user
data confidential information proprietary information personally identifiable
information and that's being utilised in the data set what are your obligations
with respect to GDPR or similar privacy regulations and what are not
only your responsibilities but what are your liabilities with
respect to doing that and you know certainly I think under GDPR you know I would say
that you've got to be transparent and you've got to get consent and you need to
certainly let people know if there's an automated decision making with respect
to their personally identifiable information but the liability associated with
utilising all that data it could be something as innocuous as using
OpenAI to get market analysis or just user data in general statistics on how
they're using your products now you're utilising personally identifiable information
and you're putting it into the AI system and that's a problematic thing yeah you
you mention a great point with GDPR in Europe which is the data privacy law
that was enacted a few years ago in the EU and that's a great point I mean
you've got GDPR regulations that sort of limit you know what can or can't be
shared or used in this way what kind of confidence do you have
that these OpenAI models and ChatGPT and Google's Bard and any other AI
platform like this how confident would you be that they are compliant with
GDPR and other privacy laws well you know I'm not so sure that that's the
way you utilised these tools it's really you know if you're going to be putting
in the personally identifiable information into the tool you're the one that
needs to be compliant with the regulations that govern your use of your
customers personally identifiable information or data and so you know I'm not
so sure that it's really a question of you know how compliant is ChatGPT
necessarily and what does it allow you to do with personally identifiable
information can it identify it does it preclude the use of that information
that's one question but if you've got some sort of even OpenAI application or
tool that you're utilising in connexion with your business and you're trying to use that
tool to analyse your customer data what obligations do you have and do you
have an understanding of how that data is being used within that toolset you
again are you putting out personally identifiable information into a corpus of
data that's now going to be analysed by this OpenAI system that's certainly
a big deal and a problem something that you need to be aware of right yup and
great point Caroline on LinkedIn by the way just for those who are wondering she
did confirm that it really is her so it must be true then if she came back
electronically and told me I've gotta believe it right no I tend to be sceptical
anytime I see anyone on social media or even a video I tend to be sceptical whether
it's really them here's a comment or a question from Sam Graham on LinkedIn and
Sam says sometimes perception is reality if enough people believe false
information that ChatGPT publishes is there a danger of that causing actions
that we wouldn't want for example if it said that vitamin C was bad for us the
results would not be good what are your thoughts there well yeah I think that is
problematic because you know that's about the accuracy and the truth of
what the output is and you know if you've got bias and bad data that is
replicating within that system then you're getting output that's just not
correct so how do you how do you police against that how do you make sure that
you know people's confidence in the accuracy of the data is where it needs to
be the implications associated with that are pretty enormous right I mean you
got facts coming out there saying you know something isn't good for you when in
fact it is perception is reality right so I guess just to sort of put a bow on
this or wrap this all together we talked about a lot of uncertainty we talked
about what we think might happen in the future we talked about some of the the
landscape of what some of the risks are but what closing recommendations would
you have for organisations that are concerned about potential downside of
ChatGPT you mentioned before policies that organisations might put in place I mean
what would you do if you you know you're giving us advice as an attorney and
you're corporate counsel which you actually are Third Stage's corporate counsel
but if you were counsel to others on the call here today what would
you suggest they do to navigate this uncertainty yeah I think it's
somewhat similar to some of the things that we say when you're trying to adopt
ERP software or technology in general OK I mean you can't do this out of a
FOMO mentality right you can't be an organisation that says hey we want to
implement AI and incorporate it into our systems to make use of the efficiencies
that are there you have to have a solid and legitimate business case for
utilising that piece of technology just like any other technology it's got to
make sense for your business right and then once you have determined that there
is a business case to use a particular AI tool you want to make sure that you
have an understanding of how the vendor or the provider of that tool trains
their AI product what data is it is it using is it you know general data on the
Internet is it proprietary information that only they have access to how what's
the quality of that information all of those are key components that you need
to make a determination of right and then you gotta in my view do the necessary
due diligence to make an attempt to have an understanding of what kind of
intellectual property rights do you need to utilise the AI tool okay what kind
of intellectual property representations and warranties do you need to flow down
to any end users or customers that are going to use products that are
associated with or generated by or incorporate the AI technology then you
know once you've got kind of that framework in place as to what your risks and
liabilities are and how you're going to mitigate them what's the use
policy for utilising the AI platform and I think you've got to think about
that in two ways one is there's going to be a different appetite for risk and
risk tolerance if you're utilising AI internally like Samsung or you're
utilising it externally like Microsoft right or Google or something I mean
those are two fundamental different use cases and they carry fundamentally
different risk mitigation models and you might have this kind of tool that you're
using internally to increase your own efficiencies and then you're using it
externally in order to generate more revenue and provide better customer
experiences or whatever those are really the fundamental things but there's
other implications with respect to mergers and acquisitions I mean if you're
acquiring companies you need to understand what kind of AI components
they might have embedded in their products how they're utilising those and what the
risk to your organisation is once you complete that merger or acquisition and then
again I will come back to yeah how we talk about reps and warranties from a
contractual perspective you know that you're going to flow down to your end
users but what about just documents in general that you've signed you know
what's your limitation of liability what kind of reps and warranties have you
made to other companies regarding the use of intellectual property your
ownership of intellectual property and does that need to be modified yeah I
mean this is not a mature area right even when you look at terms of use of
these AI tools put out they're not drafted in a way that accommodates every kind
of you know potential scenario and so you as a user have to take on this level
of due diligence to make sure that you've got the proper scheme in place
internally to mitigate the risk and to maximise the use and efficiency of
these tools to take advantage of them right yeah it's well said I hadn't
thought about the difference between internal only use and the public facing or
customer facing use of these tools and recognising the need to treat those
differently and mitigate risk differently depending on what what the purpose is
or what the use of the product is I think in some cases like I give you an
example just from my personal experience you know I I know that there's parts
of our team at Third Stage that are using ChatGPT and I didn't tell them to
it's not a policy that you need to use it we didn't decide to roll it out
company wide or anything like that it's a technology that anyone can go use just
like you know Google I don't tell people to use Google but they do they use it
they use it when they need to go find something or looking for something and
similarly people are using ChatGPT in that way but we've started to
reinforce the caution that needs to be applied here both in terms of
information that might get leaked but also not relying too much
on ChatGPT because just like Google you know we're not going to Google
best practises for how to make a digital transformation successful that's not
what we do is based on our experience and that's where I think but we might
augment that experience with information we find there but we also have to
recognise that just cause it's on the Internet just cause it's on ChatGPT
doesn't necessarily mean it's true and so you have to take it with a grain of salt and
so like just education is a big part of it too and making sure that in addition to
having policies in place people are educated on what these risks are and you
know just be careful with that information absolutely I mean the risks
the risks are pretty big it could be right yeah absolutely well so we didn't
talk about this at the beginning of the interview we sort of I think we were
both so excited to jump into this topic we glossed over it but maybe tell us
just quickly a little bit about yourself your law practise and what it is you
do personally as well as what Taft does and maybe just let us know how we can get
ahold of you if anyone is interested in chatting with you more about this topic or
they're concerned about it or want to bounce ideas off you how to get ahold of you
yeah absolutely I mean like you can go to Taft Law I'm on there Marcus Harris
I'm on LinkedIn YouTube channel I mean I think you know Marcus Harris attorney
is the easiest way to find me because there's a
couple of athletes that rank a lot higher than I do so but regardless my
practise is really focused on intellectual property technology issues general
intellectual property issues and certainly enterprise software related issues
from litigation to contract drafting and negotiating and so we're at the forefront of
a lot of these data privacy issues and certainly issues just like this that are
starting to come up more and more in our practise and can really impact our clients
in a big way and you know the approach that we have to all of this is it's not
it's not the accounting approach where you just say no it's really an approach
where let's get to yes let's figure out how to mitigate the risks in a
manageable way to get you as a business or consumer to utilise what you need to
use but as long as you're aware of what you're getting yourself into
communicator let me know that's the way we like to approach it yeah yeah
absolutely yeah well said and you know that's why you're such a perfect
guest for this topic because of your intellectual property focused background
as well as your software technology background and focus as well so I imagine
in the next few months you're going to have a lot more to talk about as you get
more engaged in resolving some of these issues for more and more organisations
that you work with more and more clients that you work with so we might have to
do a touchpoint here later in the year to see where we are any new updates on
on this whole ChatGPT OpenAI thing but thank you very much for being here
today Marcus really appreciate your time yeah the audience chatting or
NEW VIDEO
we're now talking about AI the pros and
cons like never before but do we even see what we think we are looking at the
advances in AI are now exponential so fast that some AI experts now predict
artificial intelligence will become more intelligent than humans by the end of
this decade a growing chorus of tech thinkers are warning we are not prepared
and that includes Geoffrey Hinton a leading machine learning pioneer who is known
as the godfather of AI he says it's time to put the brakes on AI while we still
can we're scientists right we're exploring what happens when you train large
neural nets on computers and that's just reality that we ended up here it's one
of those things where there's no way that people weren't going to explore it
the issue is now that we've discovered it works better than we expected a few
years ago what do we do to mitigate the long term risks of things more
intelligent than us taking control a big question there I want to bring in now
Lindsay Gorman she's a senior fellow for emerging technologies at the Alliance
for Securing Democracy in Washington DC Lindsay it's good to see you again
Geoffrey Hinton he wants us to slow down he has joined a growing chorus of AI
experts who are calling for a moratorium on AI development and deployment where do
you stand on this well I think that he speaks to a very real concern that
AI systems are progressing rapidly more quickly than anyone really expected a
couple of years ago the idea that we could train large language models such as
ChatGPT and have them display what we think of as intelligent tasks and
capabilities is not something we really foresaw and that caused a little bit of a
panic in the community to say how are we governing this technology where is it
going is it going to get ahead of humans I don't think we're there yet certainly there are
these sort of capabilities that look like intelligence and maybe moving in that
direction of course but I don't think we're necessarily at the point where AI
is taking over humans but there are some real concerns with how these systems
are going to be used and how they are already being used when it comes to the
information environment when it comes to job displacement and when it comes
to extending wage inequality you know Italy as soon as ChatGPT really
exploded Italy put a ban on the chatbot which was then later lifted is that
the right approach when we're talking about maybe trying to get a better handle
on managing AI well I think you have to hand it to Italy in that it very
aggressively deployed its existing legal architectures to this new technology
and specifically it applied the General Data Protection Regulation GDPR the
EU's landmark data privacy and data protection legislation to say that
ChatGPT hadn't actually justified and really demonstrated the need to
scrape all the data on the Internet that was used to train the model including
in Italy and that was really the reason for and the legal basis
for that block to provide that justification for why are we scraping this
data to develop the system without justifying the need but I think
it does really speak to this question of how much are we able to apply
existing legal frameworks such as the GDPR as in the case of Italy and how
much do we really need to develop new regulatory frameworks to address these
new applications and new uses and new risks with these large language models
and I think the answer is a little bit of both and of course when we talk about
regulation we're not talking about something that's going to happen overnight
in the meantime we're going to have a big election in the United States I'm
thinking about the 2024 presidential election I'm wondering what will AI
mean for that now last week you retweeted a post with a video that is 100%
generated by AI it's a video by the Republican National Committee against
President Biden it's a 30-second clip and you see it right here that shows China
attacking Taiwan the US banking system collapsing and it shows the US border with
Mexico being overrun by migrants now the video it looks real and you say that
this is a big problem why well researchers at our organisation and many others for
years have been warning and raising the alarm about the spectre of deep fakes
and the possibility that political actors or even foreign actors looking to
interfere in democratic elections could use these completely fake images video
or even text as we're now seeing to manipulate voters into certain preferences and
into certain candidates and into certain worldviews and this is about the general
information environment whether that's Chinese propaganda or Russian
disinformation but it also comes to a flashpoint when we're talking about
elections and I think 2024 may be the first election where deep fakes and where
AI generated images and video play a much more significant role than they have
in the past it's always been this kind of alarmist worry that something is
going to flip the mind of a voter maybe on the eve of the election and we don't
have time to prove that it's actually faked and it has just become so much easier
to create these videos and so they really need to be labelled as such so that
we can tell what's real and what's not and you know U.S. intelligence tells us
that Russia used social media to meddle in the 2016 and the 2020 elections it's
2023 there's still no regulation of social media can Washington deal with
artificial intelligence well that is really the million-dollar question I think
there's no reason to suspect that our foreign adversaries are going to sit this
election out as they haven't for the previous elections and whether
they'll be able to manipulate AI or maybe they don't even need to I think is
the question now there have been some efforts too on behalf of the social
media platforms to prohibit the use of manipulated content and manipulated
videos right on the eve maybe the two weeks leading up to an election I think
we're very likely to see similar policies put in place in 2024 if we don't get
broader regulation and I think those policies really should include the
requirement to label manipulated and deep faked and AI generated images and
video because now anyone can make these images with Midjourney with DALL-E and
it's not something that only computer science labs are able to generate yes good
point maybe they should insist on putting watermarks on these videos let me
ask you about what we're seeing here in the European Union we know it's led the
way with legislation to protect data privacy on the Internet now the EU is
drawing up legislation that would make AI companies disclose any copyrighted
material that's used to train their chatbots for example in your
opinion would this be a way to control AI I don't know if it's a way to
control AI so much as a way to preserve intellectual property because
there's a real concern if an AI system is developed using proprietary
information whether that's on the corporate side or in the artistic side who
really owns the results of that if an AI generates new poetry and new art
but it's actually working by spoofing and copying and kind of predicting what a
famous poet or a famous artist would be writing or creating or drawing
then it really gets to the question of ownership who has created this and so I
think this effort by the EU is a really strong attempt to get at this question
of copyright infringement and as I said earlier apply some of the existing
frameworks that we have around copyright around data to this new AI era so I don't
know if I see it as a way to control AI I think there will be broader regulation
when we think about it and the EU is doing this as well on a kind of risk-based
framework for AI harms that's going to take a little bit
more time but it absolutely makes sense to apply kind of the existing tools
that we have to make sure that there isn't intellectual property and artistic
and creative infringement in ownership as these systems develop and become
more popular Lindsay Gorman as always Lindsay we appreciate your time and your
insights tonight thank you thanks
GDPR AND THE ICO
Machine learning AI algorithms and data to try and drive
better product experiences maybe you start to see things that are concerning
you and you just want to learn a little bit more or maybe you're coming along
because actually you want to understand what does this mean for me personally
where should I try and focus what should I try and do there's a range of
different reasons that might bring you into this room what I'll try and do in
this talk is try and deal with some of those things that might be
going through your mind I know there's no Q&A at the end of this but as I
walk out if there's anything that I don't cover you want to grab me grab me if
you want to e-mail me you can e-mail me afterwards as well my emails up there
but just before we dive in just understand who's in the room how many people in
the room would classify themselves as C-suite so CEO CFO CTO somebody who is
a decision maker perhaps OK one two how many people are within the data
science community so data scientists vast majority how many are in data
engineering so you're not doing the development of models you're trying to set up
the pipelines and the infrastructure or you do both OK cool that's really
good how many people sit in product or marketing OK great how many people sit in
compliance data protection officers one or two of you OK cool I feel your pain
guys OK so and how many people actually know what the Information
Commissioner's Office is that's who I work for put your hands up okay about half the
room right so the Information Commissioner is an actual person it's
Elizabeth Denham she's the commissioner and she is independent of government
and it is her job to uphold information rights so that's all of your rights
not in your corporate positions but as citizens and to uphold the rights
of the tens of millions of people in the UK whose data rights matter and to help
them leverage their rights right to make sure that they can exercise them that's
who she represents and the way that we do that is a couple of things we have
some sticks so there are fines that we can issue there are compulsory audits that
we can undertake we could come and knock on your door and investigate we can
even if we think there is a big enough reason to do this issue stop notices so
there's a range of different ways that we can exercise some powers to try and
influence how people's rights are being dealt with how their data is being
managed but there's another side to this which is more the carrot which is how
do we work with industry with startups with SMEs with large tech
corporations to try and engineer a better information rights environment for
everyone where business can succeed where you can all make a profit where you
can all have a great time doing your jobs but at the same time your customers
those citizens can be sure that their rights are protected as defined within
our society right so UK democracy European democracy we've all agreed
these are the sorts of rules and laws that we need to adhere to how do we make
sure that actually happens and so the Information Commissioner's Office that's
our job the team I represent is fairly new it's only been around for nine
months a new directorate within the ICO I've only been here for three months just
three months previously I was at the BBC I love the BBC I'll just
mention this because actually my ex colleague Ben did a really amazing talk
earlier on and I just thought I agreed with everything he said and he and the
team there are brilliant but I wanted to talk about why I left and why it is relevant
to the discussion around AI and where this story is going to take us I had an
amazing job I really loved it I was head of emerging technology it was fun right
the job was really just go and think about and explore new technologies
virtual reality augmented reality I majored on machine learning and data
science and AI and to try and think about how do you start to leverage some of
those tools and techniques and drive those into products build up products and services
for the organisation how do you build up organisational human capacity and
understanding of what's going on but also it being the BBC basically echoing what
Ben said earlier think about what this means for members of the public the
audience right so it was a fun fun job so why leave one of the reasons was that I
had this little worm in my head which was burrowing away this unanswered
question which is how are we going to take all of these conversations that have
been happening over the last three years around ethics and responsibility
around the fact that we're all against bias we don't want bias we don't want
to see people discriminated against we want to make sure that we're being
responsible we want to make sure that the development of machine learning isn't
held back by some of these issues I wasn't sure how you could take those
conversations that were happening at quite an ephemeral philosophical level
around principles and values and this thing called ethics and actually
productionise them bring them down into the practical reality of developing models
using those models in a product context trying to ship them and at the same time
how do you make sure that what you are building even if you've done a great job
is good for society overall right we want to live in a healthy society I just
really hope we crack that problem because the potential in machine learning and
AI is massive so you know I could have picked a few different indicators of
this right I could put up a stat from McKinsey or Deloitte or PwC that said
global GDP is going to be massively improved by the development of machine learning and AI
over the next decade or two decades or three decades pick your report pick your
stat right it's all saying there's going to be impact it's going to be massive or look
at these sorts of developments where some of these techniques probabilistic
compute techniques have been used to really improve the degree to which we can
make a difference to people's lives real differences number this contraction
where it is actually helping you just navigate around the city better if
you're using Citymapper here in London or some other app all of these different
techniques offer a huge amount of utility and public service good and benefit for
society it's great and yet if you haven't been under a stone and as a group
in this room you haven't you're a self-selecting group here you'll know that there
is also a feeling that what is going on right now is causing is cartoon after
alright it's not bad battery but can you recognise that actually that's a
conversation that's been happening out in the wider world right through the
media the sorts of discussions that have been happening with
policymakers and others it is happening and there are instances where real harm
is happening there are instances where there is evidence that the use of
machine learning probabilistic compute personal data is driving harm how do you
respond to that right so the Information Commissioner's Office is a regulator
it's our job to regulate information rights so we have to try and understand
what to do about this equally so do all of you if you want healthy businesses
if you want to succeed so whether you're a data scientist or a data engineer
whether you're compliance whether you're leadership you're working for an
organisation you want it to do well right if you're not careful at best you end
up with headlines like this right so the sneaky ways that companies manipulate
you to buy more online I'm sure most organisations would say who are using some
data science to drive their products that's not the intention that's not what
we do this is a misrepresentation of what happens actually we're building
recommenders or building pricing models or we're trying to engineer a better
product experience so if I want to buy some Nike trainers I want to be able to
get to the types of Nike trainers I want really quickly I want to get rid of
all of the other noise having some information about me building a classifier
around the sorts of things I'm interested in would be really good OK that's what
we're trying to do we're not trying to manipulate you through your online
experiences but it's not just retail is it it's also banking it's also the
media there's lots of different sectors that would equally say this isn't our
intention so at best right now if you're lucky this is the level of
headline actually if you've been more egregious if there has been an actual harm if
you've had an issue at worst you can end up with headlines like this so this is
Facebook and Cambridge Analytica no need to go into the details you all know
what happened there but this is the sort of headline that really erodes trust
in your brand and what you are doing and also your role as individuals who are
data scientists and part of this community within those organisations it limits your
ability to do what you want to do and need to do it limits your ability to
engineer and innovate right so you need to avoid we all need to avoid something
like this I mean like I said I've only been in the ICO three months
so this is a case that my colleagues investigated at the time the limit was half
a million that the ICO could fine any organisation with GDPR that's changed
it's 4% of your global revenue like I said there are compulsory audits that can be
done stop notices can be issued right so the range of powers that can be
leveraged has increased I mean the world doesn't stop right so equally as
the powers of the regulators have increased the ability of organisations to
move around and respond has also increased so there's a constant conversation
about what effective regulation is but no matter what you think about that
whether you think organisations can price themselves out of these sorts of
issues or not just reflect on this this has material consequences right so it
affects your recruitment ability that affects your ability to retain talent has
loads of consequences amongst your customer base how they perceive your brand
and the degree to which they'll trust you right there's research last week we
published that evidences that and then just to stop labouring the point if
you were to ask me what does my current team spend all of its time on
these are the list of things that are priorities for us right now so cyber
security thinking about how do you design online
services that work for children right how do you build in the right
protections especially when so much is data-driven how do you make sure that the
role of facial recognition technology in our society is balanced against our
social norms and what the law says these are questions that we're
tackling right now we're investigating looking into and the thing is I mean at
this point I wish I had that Intel chime you know like Intel Inside right if
you remember from the 90s or whatever it's all AI inside actually as a
really loose definition it's a collection of compute technologies that might be
on the one end of the spectrum just simple decision trees on the other end
deep learning models but most of these issues that we are investigating and
looking into rely on personal data and they rely on some level of probabilistic
compute so AI in a general sense is powering so much of what we're seeing
around us and that makes it really important for us to understand how do we
respond so we've just issued a code you can go onto the ICO's website and it's really
saying that for any service that might be likely to be accessed by a child here
are 16 principles that should dictate how that experience should be delivered to
that child right so it's really starting to ask that question and answer it
around how do you make sure that the Internet is a safe place for children to
navigate rather than exclude kids altogether from the Internet how do you make sure
that actually if it's very likely children are accessing your services you've thought
about this right so you can go on the website and look up the Age Appropriate Design
Code just Google it or Bing it or whatever your search engine and you should
be able to find more details on that but we can't be in this room without
mentioning that word ethics right it is singularly the most easy thing to define
and the most difficult right you've had countless definitions of it today not
just in this room but across the conference today in lots of different rooms
people have referred to it and given you some form of definition again you can
Google this and just ask the interweb what's the definition for ethics you could actually
go to Google Facebook Microsoft and others and they've done lots of work on
this you've heard other organisations talk about it so why don't I just pollute
the data set and give you one more definition right this is my personal one OK
ethics for me is the gap between what the technology enables us to do and what the
law or society believes is the right thing to do so how do we navigate those two
points on the spectrum actually to be a bit more precise that's not an
accurate definition because what the law says and what society expects of us are
not the same thing right so actually maybe a better definition is the gaps
between what the technology enables us to do you know there's a point where this is all
magic but actually we're going to do we all get really excited the hype sort of
increases but there's some power to it what the law allows us to do that simple
question is what we are building is what I'm building and working on legal is
it legal under the GDPR is it legal under the Equality Act is it legal under
the Human Rights Act is it legal under the different frameworks that you might
be legislated under so if you're in fintech what does the FCA say about what
we're doing right there's lots of regulation out there how do you go about
checking that I care about personal data so is it legal under the
legislation that the ICO oversees and then finally as a society what do
we think we ought to do is this moral does this fit with my personal values and
principles does it fit with the values and principles of my organisation as a
community are we all heading in the right direction so is it moral my team's job
and I apologise for the not so subtle transition is to close the gap between those
things but it's really to close the gap between those discussions because the
world isn't static it changes what was morally acceptable in the United Kingdom
in the 60s and 70s isn't acceptable now I'm a second generation immigrant my
parents came from Pakistan and Kashmir when I was growing up in Sheffield they had
lots of racist invective thrown at them and I had it when I was younger we do
not as a society accept that as a normal social norm now social norms will change
for data scientists that's really important because how do you cope with the
fact that your customer base your user base the people that you are profiling
or trying to deliver services to their context is going to constantly shift at
what point do you account for concept drift or how will you do all of these
things these become really important but you have to start with this whole
framework in mind really thinking about okay what does the technology enable us
to do let's separate out the hype and really focus on what is possible now
what we're aiming to do where it is applicable where should we
use it then the test how do you make sure that what you're doing right now to
be successful as a business how can you go about making sure that's the correct
thing to do and I'll explain the AI auditing framework in a moment that should help
you with that but then finally my team's job is also to shape the next debate but
all of you are also shaping the next debate just in your day-to-day jobs so how do
you do that in a way that is communicated both to this group and to wider
society we kind of need to think through what the mechanisms are to do that
together now just to focus on that middle one what does the law say we
shouldn't shy away from a central fact there are tensions between what the law
says should be done versus the way that the development of machine learning and probabilistic
compute is currently going right so I'll just run through some of
these right so the law says minimise the amount of data that you collect and
make sure it's accurate and yet so much of the innovation and the impact that
we've seen happen over the last few years has really relied on gathering as
much data as possible that is in direct tension we have to try and square that off and
figure out what do we do about it the law says be really clear and purposeful
about information that you've gathered personal data and what you're going to
use it for purpose limitation be really clear about that but we also know
that often the data that you collect and the example here is about crash logs but
it could be anything when combined with other data can help you draw inferences
right detect patterns so how do you square off the fact that you might have got
consent under a very particular use case and you limited the purpose in asking
whoever your data subject is but equally for you as a data science team you
want that slightly more freedom to be able to navigate through this and draw out
inferences and detect patterns that are there to be detected right how do you
square that off that's a tension just recognise that legally that is a tension you
might be on the wrong side of the law if you're not careful transparency and fairness
lots of discussion about that how do you actually make sure that the sort of
information that you provide is impactful and meaningful to whoever it is that you provide it to
I don't want to go too far into that because it's been discussed
quite a lot today but alongside that for us as the regulator did you recognise
the context in which you were delivering that assessment right how did you
manage the trade-offs that you almost inevitably will have to manage in delivering
that transparency or that accuracy or that understanding or that
explainability what are you willing to trade and did you do it consciously what the law
says is that actually you need to make sure that you are transparent and that
you're providing explanations have you done that but we know that if you're on the
deep learning end of the spectrum actually that might be a challenge if you
just try to explain what the model does even data scientists and other
practitioners would struggle with that so actually what is useful in that
context and then automated decision making lots has been made about the clauses in
the law in GDPR on automated decision making and actually
one of the best ways to tackle that is a human in the loop again up until three or four months
ago I was in media tech I really cared about how do you build decision support
tooling for members of staff and colleagues right that's the language I use
decision support but at what point does that decision support tool become
meaningless and really the person is just a token person in that cycle in that
equation if we determine that they were just a token person that would work
against the organisation that was really saying actually we had a human in the
loop so what we are really interested in is actually what does having a
meaningful human in the loop actually mean how do you deal with decision
fatigue of the sort of people who might be involved down the line to do some
of that governance and those checks and balances right how do you navigate this
tension I'm going to stop making you feel uncomfortable about all of the
issues and what a regulator could do after one more slide then we'll go
to that stuff but there are two scenarios here scenario one on the left is my
colleagues at the ICO come and investigate you now this is a picture from the
Cambridge Analytica investigation they had the FBI-style windbreakers I did actually ask when
I joined can I get one as part of my onboarding package they were like no you can't
because I'm not in enforcement or investigations but at some point if my colleagues feel
there is an issue and they come knocking on your door that's a collective
failure if we all get to that point the other side is well actually how do
we move the conversation on and try to engineer or create a blueprint for what an
effective framework for developing machine learning is that recognises and
protects citizens' rights while on the other hand allowing business to innovate so
to give you a snapshot of the framework there are two parts to it the top half in green if
you're an organisation of any size should not feel unfamiliar to you right the
exact terminology might be different but how do you plan to deliver training to
the people who really need to know about this stuff right what is leadership
doing around risk management understanding the decisions they make how
do you as a data science community make sure that you're not left holding the
baby because everyone else has just pushed the decision down to you how do you
make sure that it works upwards and downwards how do you make sure that your
auditing and documentation is up to scratch we're not saying you need to
develop brand new processes for this our work is exploring what the differences are
within an AI context that you need to understand and how do you go about understanding
and adapting and upgrading what you have the reason is I should have said this
actually this framework is being designed to help my assurance and
investigations colleagues so that the next time they have to go and investigate
an organisation that is using any form of data personal data or probabilistic
compute AI machine learning what additional tools do they need to augment what
they already have just as I was saying on the previous slide we don't want to
keep that in house we're open sourcing this so we're sharing this framework with
you this top level is step one but there is a whole site that you can go to and you
can read more about our work our vision and guidance we're asking for consultation we
are doing this open source because we think if we as a community raise the
bar for everyone upstream you will avoid the problem in the
first place my colleagues then don't have a job to do right that's the ideal scenario
for us that we don't need to investigate but yeah just back to this the top level
is the delta between what you already have and what is specific around machine
learning and AI the bottom half really starts exploring specific risk areas
associated with AI and here are the sorts of conversations and questions that you
had this morning how do you tackle those right let's say if you've been relying
on say checklists how do you understand how can you be sure that the checklist
model that you're applying is going to get you through the legal compliance
checks that you need there were earlier lectures and talks about synthetic data
if you intend to use synthetic data how can you be sure that use of synthetic
data is still not going to leave you in a legal gap how are you going to
navigate that these issues are the bottom half of the framework which is
reiterated and expanded here the sorts of questions that we are going to be doing
some research on my colleague Reuben Binns Dr Reuben Binns his day job is in the Oxford
computer science department he is seconded into the ICO for two years he is
developing the framework he's doing a lot of this research he's sharing it he's
blogging it so if you want to know what our current thinking is around
automated decision making we've got a blog out there which previews some of our
thinking you can come and get engaged if you had a question around how do I
balance the trade-off between accuracy and privacy or accuracy and explainability
some of my colleagues did a project where we used citizen juries to ask people
what did they care about and I'll preview a little result for you in the health
context when they were asked would you trade away explainability of how this
decision was made for accuracy the answer was yes because they felt that having
accuracy in a health setting said the constant attention for example was really
important and they trusted the institutions that do that right so they trusted
the doctors they trusted the hospital they trusted the NHS the same sort of
question presented in a policing or judicial scenario actually they were not
willing to trade away explainability they felt it was much more important to
have an understanding of how that decision was made because not only could it affect
you personally but it could affect society more broadly and they felt it was
more important as citizens that they had understanding the reason that matters
to you is trying to figure out how you navigate all of this is going to be
very context specific and you need to know what we think when we look at this
question and we will be using this sort of approach to try and determine did you
get the balance right we're trying to take away some of the myth or the mystique around
this we can't make it hard science but can we get rid of as much of the grey
for you so that you understand whether you're doing the right things we've also been
doing lots of work around the different rights and how they might get affected
or impacted the next blog that's going to be issued is going to look at the
other trade-offs and really expand on that and really this is leading to my
closing slide which is where next for this work because what I've shown you is
just a very top level the blogs half a dozen that we've already published
preview our initial thinking we're going to work through all of these questions
over the coming months actually this is your opportunity right this is not being
done as a broadcast out to you. We're not just doing this in isolation and
pushing it out. This is a consultation. This is where, if you have opinions,
you can either just put them in the comments section on the website or, if you
don't want to do that publicly, you can e-mail us and we can have a bilateral
conversation about your views. We're really interested to see what you think
about the majority of the framework. One example of what's missing from it
right now is: well, if I'm not a Google or Facebook or an Apple or Microsoft or
IBM, the full-stack companies, and I'm likely to be using third-party data or
third-party models, what does that mean for me, right? What does it mean if
machine learning is part of my supply chain? What does it mean for me to be
using third-party dependencies to build my product? So we're going to go and
research and explore that, and try to provide guidance and clarification on
that. So this is your opportunity to feed in those sorts of really top issues
that you've got, and we will try and respond. That's going to carry on through
till about October. We'll then go into a slightly more formal consultation
period towards the end of this year, and then early next year we'll publish
the actual guidance. I'll stop there because I think I'm well out of time, but
if you want to go and find out more you can go to this website, and if you
want to e-mail us you can use that e-mail address. Thank you.