Essay
Should Machine Learning Models be treated as computer programs under copyright law?
Should the user of Machine Learning Models get all the rights for the results?
Why should Machine Learning Models not be regulated under the same property law as computer
programs?
Why should the user of Machine Learning Models not gain rights to the product that is achieved from it?
Why is it unethical for users of Machine Learning Models to gain rights to their products?
Is it unethical to gain rights to the product, which is the result of Machine Learning Models?
Can a person gain rights to the product, which is the result of Machine Learning Models?
Introduction
Unit One: Directives under EU law that determine ownership of the results of ML models, and who holds it.
Unit Three: Why the output of ML models shouldn’t fall into the same category of property law as other
creations.
Conclusion
ML consists of tools for pattern recognition. Therefore, ML is basically limited to predicting a future that
looks mostly like the past.
According to LeCun, Chief AI Scientist at Facebook, “we are very far from building truly intelligent
machines. All you’re seeing now — all these feats of AI like self-driving cars, interpreting medical images,
beating the world champion at Go and so on — these are very narrow intelligences, and they’re really
trained for a particular purpose. They are situations where we can collect a lot of data.” Additionally, ML
has emerged as the method of choice for developing practical software for computer vision, speech
recognition, natural language processing, robot control, and other applications.
The design of this process is what distinguishes ML development from traditional computer program
design. In traditional programming, the software developer anticipates the desired response for all
possible inputs to the computer. In ML systems, by contrast, the model is trained by providing examples
of desired input-output behaviour. To illustrate: if we want a computer to perform a diagnosis, the
programmer traditionally needs to spell out the rules of that diagnosis in great detail. Similarly, if we
want the computer to play chess or drive a car, the programmer needs to write a completely different
program. In traditional computing, every algorithm has an input and an output: data goes into the
computer, the algorithm does what it is instructed to do with it, and out comes the result. ML turns this
around: in go the training data and the desired result, and out comes the ML model that turns one into
the other.21 It is based on a trial-and-error process. Furthermore, some have claimed that the same ML
model could learn to do all of these different things depending only on the data it is given.22 If the data
is about chess games, it learns to play chess. If the data is about x-rays, it learns to do diagnosis. To
non-technical readers, this may well sound as if the programmer made a wish to the computer system
and the genie in the machine came up with the desired solution.
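The contrast between hand-written rules and rules derived from examples can be made concrete with a deliberately tiny sketch. Everything in it is an illustrative assumption rather than a description of any real system: the fever threshold of 38.0, the function names, and the made-up training examples are all hypothetical, and the "learner" is the simplest possible one (it picks the cut-off that misclassifies the fewest examples).

```python
# Traditional programming: the programmer states the rule explicitly.
def rule_based_flag(temperature_c: float) -> bool:
    """Hand-coded rule (hypothetical): flag a fever at 38.0 C or above."""
    return temperature_c >= 38.0


# Machine learning: the rule is derived from labelled examples.
def train_threshold(examples: list[tuple[float, bool]]) -> float:
    """Return the cut-off that misclassifies the fewest training examples.

    Assumes at least two distinct input values are present.
    """
    values = sorted({x for x, _ in examples})
    # Candidate cut-offs: midpoints between neighbouring observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cutoff: float) -> int:
        return sum((x >= cutoff) != label for x, label in examples)

    return min(candidates, key=errors)


def learned_flag(cutoff: float, value: float) -> bool:
    """Apply a learned cut-off to a new input."""
    return value >= cutoff


# Made-up training data: (body temperature, "has fever" label).
fever_examples = [(36.5, False), (37.0, False), (37.4, False),
                  (38.2, True), (39.0, True), (40.1, True)]
fever_threshold = train_threshold(fever_examples)

# The same learner, given different made-up data, yields a different rule.
other_examples = [(1.0, False), (2.0, False), (8.0, True), (9.0, True)]
other_threshold = train_threshold(other_examples)
```

In the first function the human wrote the number 38.0; in the second case nobody wrote the cut-off at all, it follows from the examples, and swapping the examples changes the resulting rule without changing a line of the training code.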
Are ML models algorithms and should they be treated from a copyright perspective as algorithms? Are
they a new type of algorithm that requires copyright to look at them differently? Are they computer
programs?
These questions raise concerns about the suitability of the copyright system to adequately protect ML
models, particularly because all proprietary and open-source software licensing relies on the copyright
protection of the code. In open source licenses, if the license is applied to something that is not
protected by copyright (or related rights), the license is not triggered. Moreover, one could wonder
whether the absence of copyright protection over ML models could open the door to a foreclosure
scenario, leaving users at the mercy of potentially restrictive contracts.
Under utilitarian economic theories some have claimed that human programmers would need to obtain
rights to the results generated by AI, because programming would not happen if third parties could
immediately free ride.38 Others have claimed that programmers’ motivations to develop AI applications
need no further incentives as they are sufficiently taken into account “higher up the chain.”39 Based on
investment protection theories, some have stated that ML models need a new type of protection
because copyright law and patent law only allow for incomplete protection. However, as was the case 40
years ago, all these perspectives remain speculative, either because they lack empirical grounds41 or
because they merely restate abstract theoretical positions without further evidence.
Back in the day, pairing computer programs with literary works was internationally accepted, and hence
copyright seemed the best-suited regulatory tool against the fear of “effortless” software replication.43
Still, patent protection would be a better fit for apparatus using computer software or software-related
inventions.44 In the early days of the software industry, relief against mere copying was enough.
However, in the current state of ML models, ease of replication does not seem to apply to complex
models, which are tailor-made for one particular problem; there, the concerns relate to protection
against similar functionalities, where copyright plays no role.45 On the other hand, for simple ML
models, replication is not a problem but is rather encouraged in a collaborative innovation environment,
either for free or under OSS licenses.46 As stated earlier, there is a vast supply of simple ML models
under very permissive open source licenses such as Apache 2.0 and MIT. However, these licenses are
based on the legal right of copyright. If an ML model is not protectable under copyright, the license is
not triggered, and the model would be in the public domain.
Some have claimed that big software-based companies or AI companies are pursuing a counterintuitive
IP strategy, by aggressively patenting AI technologies while sharing them freely, as they experience
pressure to open-source their work to attract talent and increase use of their platforms (gaining data
from the users in return).47 Access to training data and trainable parameters is also relevant to
obtaining an ML model. Therefore, it seems legitimate to question whether copyright protection of
computer programs has or may have any practical relevance in the world of ML models.
An engineer curates and prepares domain-specific data, which is fed into ML models, which are
iteratively trained and continuously improved. An ML model can deduce from data which features and
patterns are important without a human explicitly encoding this knowledge.
The outputs of the training process can even surprise humans and highlight perspectives or details the
engineers had not thought of themselves. Some have called this “Software 2.0.”72 In Karpathy's
words, “Software 2.0” is code written in the form of “neural network weights” not by humans but by
machine learning methods such as backpropagation and stochastic gradient descent. Updating models
entails retraining algorithms with new data, which changes how the model behaves and performs.73
This special characteristic means that an ML model cannot be considered a regular algorithm. Yet
whether this is enough to qualify the ML model as a computer program under the Computer Programs
Directive depends on whether it meets the requirements for protection established by
the Directive.
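To make the "weights written by gradient descent" idea tangible, the following is a minimal sketch under stated assumptions, not a depiction of any production system. It fits a one-input linear model with plain stochastic gradient descent; the function names, learning rate, epoch count and both data sets are hypothetical and chosen only for illustration.

```python
def train(data, w=0.0, b=0.0, epochs=2000, lr=0.01):
    """Fit y = w * x + b by stochastic gradient descent on squared error.

    The weights (w, b) are never typed in by a person; they are nudged
    step by step towards values that reproduce the training examples.
    """
    for _ in range(epochs):
        for x, y in data:
            error = (w * x + b) - y
            w -= lr * error * x  # gradient of 0.5 * error**2 w.r.t. w
            b -= lr * error      # gradient of 0.5 * error**2 w.r.t. b
    return w, b


def predict(weights, x):
    """Inference code: identical no matter which data the model saw."""
    w, b = weights
    return w * x + b


# Made-up data following y = 2x + 1.
old_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
model_v1 = train(old_data)

# "Updating" the model: continue training on new made-up data (y = -x).
new_data = [(0.0, 0.0), (1.0, -1.0), (2.0, -2.0), (3.0, -3.0)]
model_v2 = train(new_data, *model_v1)
```

The source code of `train` and `predict` is the same before and after the update; only the two numbers in the model differ, yet `predict(model_v1, 2.0)` is close to 5 while `predict(model_v2, 2.0)` is close to -2. That is the sense in which retraining with new data changes how the model behaves.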
Thus, although some have said that building an ML model is like writing code with more sophisticated
tools such as backpropagation and stochastic gradient descent,89 it could be argued that an ML model is
a type of algorithm. More precisely, it could be said that it is an algorithm based upon a nonlinear
mathematical function, capable of generating output based on the patterns learned in the training
process.90 There is still no unanimity on whether this capability makes it a different type of algorithm,
because it could be seen as a “learning” (in the human sense of the word) algorithm.
At the European level, the Directive on the legal protection of computer programs from 1991, which was
later amended,91 seeks to harmonize Member States’ legislation by setting a minimum level of
protection. The Directive does not provide a legal definition of a computer program.92 It merely states
that the term computer program shall include computer programs incorporated in hardware as well as
their preparatory design material.93 Thus, the Computer Programs Directive reconciled the definition
provided by the WIPO Model Provisions on the Protection of Computer Programs, which define
“computer program” as “a set of instructions capable, when incorporated in a machine-readable
medium, of causing a machine having information-processing capabilities to indicate, perform or achieve
a particular function, task or result”,94 and it extended the concept to the preparatory design
material.95 Following the Directive’s approach, most Member States’ national laws do not give a
definition of a computer program.
Beyond the concept of work, the Directive provides for a concrete form of copyright protection for a
computer program, as it considers it “a literary work within the meaning of the Berne Convention,”97 as
long as the computer program is original in the sense that it is “the author’s own intellectual creation.”
From Art. 2 of the Computer Programs Directive it follows that copyright may be attributed not only to
natural persons but also to legal persons, where provided for by national legislation.99 An extensive
interpretation of the Directive could lead to the argument that, if a national law recognized an AI as a
legal person, the copyright over a computer program generated by an ML process could also be
attributed to the AI system. However, with the so-called creator doctrine as its cornerstone, the Berne
Convention clearly reserves authorship to “human beings”100, and this may also be implicit in most
national laws.
Some have claimed that ML systems cannot be truly viewed as human creative software programming,
as they do not generate human readable source code and because the role of a human programmer can
be minimized in particular development phases to merely providing a list of desired inputs and
outputs.102 Others have recalled the idea that authorship in the sense of the Berne Convention requires
both conception and execution of the creative plan for the work.
These ML instances need to be organized within a structure defined by domain knowledge, and they
need to be fed data that helps them complete their prearranged prediction task. In other words, the
conception of a ML model requires more than an idea, or a list of desires given to the computing system.
It must manifest a detailed creative plan, which will define the key elements of the work. All these tasks
are controlled by humans, even if in different degrees.
In all these scenarios, hardware and computing elements are key contributors to problem solving in any
ML system, as they are the ones digesting and analysing the data obediently, always in the way that the
human programmer instructs the computer to do.105 Brute computation capacity (super software)106
allows machines to outperform humans. Control remains human.107 However, the paradigm shift in
software design practices might have brought about challenges for claiming authorship of ML
models, depending on the architectures and training techniques employed.
In any case, if drawing the line requires waiting for a fair number of national court decisions in the years
to come, which will most likely deal with copyright infringement rather than copyright subsistence, this
could become a problem for reliance on copyright protection of ML models, as the European Parliament
pointed out.
A second option is to consider that ML models are just a type of complex algorithm, and thus, excluded
from copyright protection to the extent that they comprise ideas and principles.133 The latter is in line
with the basic assumption that protection of functional elements of a computer program would
ultimately lead to monopolisation of ideas and hinder innovation.1
The Computer Programs Directive adheres to the idea-expression dichotomy.135 Thus, copyright confers
protection for the code or the structure of a computer program but might not do so for the learning
algorithm itself.136 However, the delineation of whether a ML model is a regular algorithm or more than
that is still disputed.137 This is because ML models are not final products themselves. An ML model could
be regarded as a bit more than an algorithm but less than an application. It is an immediate output of
the machine learning process.
ML models have a double purpose: training and inferring. It is the inferring part (the ML model after
initial training) that provides a functional link or computational framework for generating predictions.
Thus, to the extent that an ML model can be expressed in a coded form, it could qualify as a computer
program. Yet this is only a preliminary step: once the model potentially qualifies as a work, we still need
to assess whether it meets the criteria for protection. Moreover, the scope of protection would be
limited to the concrete formulation or specific arrangement of the algorithms contained in the ML model,
that is, its inner structure. Accordingly, the most valuable part of the model, the functionality of that
arrangement, would not be protected and would remain in the public domain.
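As a rough technical illustration of what "expressed in a coded form" could look like, the sketch below separates the two ingredients of a trained model: a generic inference routine and a set of learned numbers. It is only a hedged example; the model layout, the parameter values and the function name `infer` are hypothetical, and real ML frameworks use their own, more elaborate formats.

```python
import json


def infer(model: dict, inputs: list[float]) -> float:
    """Generic inference step: weighted sum of the inputs plus a bias."""
    weighted = sum(w * x for w, x in zip(model["weights"], inputs))
    return weighted + model["bias"]


# Hypothetical result of a training run: nothing but numbers.
trained_model = {"weights": [0.5, -1.0], "bias": 2.0}

# The "coded form" of the model: a plain text serialization of those numbers.
serialized = json.dumps(trained_model)
restored = json.loads(serialized)

prediction = infer(restored, [2.0, 1.0])
```

Under these assumptions, the serialized string is one concrete arrangement of the parameters, while the behaviour it produces when passed to `infer` is the functionality. A differently arranged set of numbers, or a differently written `infer`, could yield the same predictions.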
A further option would be to consider whether an ML model can qualify as an independent work. The
criterion set by the CJEU in the BSA case brings us to the standard of originality, but since the originality
criterion is the same for all works, the recent Cofemel and Levola cases138 need to be considered here.
The CJEU emphasised that a “work” within the meaning of copyright must satisfy two conditions. First,
“the subject matter concerned must be original in the sense that it is the author’s own intellectual
creation”139, and second, it must be “the expression of the author’s own intellectual creation”140,
the object of which is identifiable with sufficient precision and objectivity. The Court also remarked that
for “something” to be considered a work of copyright, it requires the existence of an external expression
identified in a sufficiently accurate and objective manner.141 In short, the CJEU states that when it is
impossible to define the subject matter of protection objectively and accurately, leading to legal
uncertainty, that “something” cannot be classified as an intellectual work. The potential tautology and
ambiguity of this criterion cause problems when applied to ML models. An ML model is a learner that, on
the one hand, follows the instructions provided by a programmer and does what it is told to do but, on
the other hand and according to the same learning principles, has a certain capacity for optimization
that goes beyond the given instructions. Identifying what a work would be in this context under the
current criterion generates a contradiction for the copyright system in general.142 In other words,
identifying when an ML model has reached a sufficient creative level would imply that the originality
requirement is objective, that is, that it can be assessed by comparison with pre-existing creations. This
opens the door to questioning whether the criterion of minimum creative effort for protecting works of
the human intellect should be revised and re-evaluated, or whether works of a functional nature such as
computer programs should be removed from the copyright system once and for all.
Additionally, optimizations in the ML models during the learning process might have serious implications
for human authorship articulation.143 Even if the only applicable criterion for protection of a computer
program is that the work must be “its author’s own intellectual creation”, the existence of a human
author is a pre-condition for judging the creative level. In this regard, even if partially based on US authorship
standards, but nevertheless with Art. 2(6) of the Berne Convention as main reference, it seems useful to
look at Prof. Ginsburg’s proposal of authorless output144, adapted and applied to ML models and see
whether there is enough human autonomy in their development that allows for a valid claim of
authorship.
To the main question of this paper, whether EU copyright is fit for purpose, the tentative answer seems
to be yes, with certain nuances. First, even if it may be possible in some individual cases to “express” a
ML model in such a way that could qualify as a computer program, the effectiveness of this option is very
uncertain because the scope of protection would not cover the most valuable feature of the model, its
functionality. Furthermore, most of these cases would remain in the public domain because it will not be
possible to assign them authorship. As to the potential for foreclosure in the case of ML models published
under an open source license that would not be triggered, it seems very unlikely that this would happen.
Furthermore, the absence of copyright protection for ML models does not seem to affect the risk
of misappropriation or piracy. This is because factual control over many other elements, components
and parameters of the ML system is what is crucial for potential re-use. API-only access, as in the
case of OpenAI’s GPT-3 model, the use of technical protection measures, and access to training data
limited by database rights are some existing examples. Further research on these topics is still
needed. When turning back to the prerequisites of protection, it is obvious that the absence of copyright
protection over ML models can be seen as a confirmation that copyright law is an inappropriate way to
protect computer programs, as many scholars have claimed over the years, and this might be the right
time to rethink how software should and should not be protected. Moreover, any decision about IP
protection is also a decision about the grounds for competition, particularly in light of the digital
economy, with software at its core. As Prof. Ullrich stated, the purpose of protection may not be seen in
isolation, while the need to maintain the link between purpose and protection will increase as protection
is extended to ever more technological knowledge and as, by granting ever more exclusive rights,
the legislator seeks to implement its industrial policy in the field concerned.148 Therefore, any reaction
of IP law beyond interpretative guidance should be handled with care. For instance, proposals for the
creation of a sui generis right or an ancillary right for the protection of ML models seem to have no
justification.