Machine learning and the general repercussion on Brazilian Supreme Court:
applying the Victor robot to legal texts

Fabiano Hartmann¹, Debora Bonat²
¹ PhD, Law School-University of Brasília, Brasília, Brazil – ppgd.unb.br
https://unb.academia.edu/FabianoHartmann
² PhD, Law School-University of Brasília, Brasília, Brazil – ppgd.unb.br
https://deborabonat.academia.edu

Abstract. This paper presents the development of an instrumental solution to a
need that arose from an artificial intelligence project, later called the Victor
robot project. The Victor robot demands a methodological combination of the
reasoning of software engineering, computer science and law. Because the project
is unprecedented, all researchers must develop knowledge intensively while
working with different thought processes, languages and very specific legal texts
in a huge volume of data. In its second part, the paper presents some sui generis
features of general repercussion as a constitutional filter and as a possible
field of ontological development for machine learning, which are important for
understanding its potential application to the activity of the Supreme Court.
Finally, the article presents some steps of the project, which is still in
progress but is already considered the largest artificial intelligence project in
the Brazilian judiciary, a judiciary with 100 million cases in stock.
Keywords: Victor project. machine learning. methodology. general
repercussion. text classification. decision support.

1 Introduction¹

¹ Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons
License Attribution 4.0 International (CC BY 4.0).

The first goal of this paper is to present the development of an instrumental
solution to a need that arose from the work plan of an artificial intelligence
project between the University of Brasília and the Brazilian Supreme Court (STF),
titled "Research and development project on machine learning about judicial data
on general repercussion in the Supreme Court" and later called the Victor Robot
Project. The project demands knowledge and researchers from software engineering,
computer science and law. Because it is unprecedented, all researchers must
develop knowledge intensively while working with different thought processes,
languages and very specific methodologies. Hence the need for an integrative
methodology that allows advanced artificial intelligence instruments to be
implemented in the legal area. For this development, values and guidelines must
be clear; they are addressed here by the deductive method, drawing on findings
from Artificial Intelligence (AI), law and agile methods.

This work plan foresees the engagement of three different areas of knowledge
(software engineering, computer science and law) in the development of a single
innovative solution, in an environment with thousands of lawsuits and millions of
legal texts, all of it entirely unstructured data. This scenario poses a series of
challenges for areas that are traditionally built on diametrically different
rationalities but that can, in a convergent and synergistic way, develop the steps
(stages, phases) needed to reach a common central goal: a system based on Machine
Learning (ML) algorithms and, possibly, Deep Neural Networks, to be used in a
specific phase of the judicial process called "general repercussion", a Brazilian
adaptation (Brazil being a typical civil law country) of an approach drawn from
the solutions of the common law tradition.

The use of the system is expected to improve the management of the judicial
caseload in Brazil. To start the planned research work, a methodology was
customized: one that is more familiar to the software development area and still
incipient in legal studies. The first objective of this paper is to present and
justify these choices. Two of the project's main critical factors were a short
calendar and a strict budget; together with the innovative character of the
research and its dynamic need for correction and adaptation, they made the agile
scrum methodology extremely attractive. Nevertheless, the legal character of the
research demanded a few adjustments to the methodological framework. In the same
spirit as the adjustments needed to justify a specialized research methodology,
some conceptual boundaries had to be established to delimit the research field
properly.

Accordingly, this paper presents the concept of general repercussion and its
impact on the stock of cases, along with the strategies for applying machine
learning to identify and classify general repercussion themes. This line of
thinking is valid not only for lawyers but for all legal practitioners.
Activities in the logistics of legal administration that could be performed in a
fraction of the time and with a high level of accuracy stand to benefit, allowing
human talent to be concentrated in strategic areas. The benefits of research on
AI and law are still underestimated, since the possible combinations of human
knowledge and AI tools cannot yet be delimited with any precision. However, it is
possible to define in advance some topics that should guide AI research so as to
maximize social benefits with accuracy and speed. This research belongs to that
field.

2 A methodology for a legal text classification and decision support solution

2.1 Rating decision support solution with agile method

Two of the project's main critical factors were a short calendar and a strict
budget; together with the innovative character of the research and its dynamic
need for correction and adaptation, they made the agile methodology extremely
attractive. Nevertheless, the legal character of the research demanded a few
adjustments to the methodological framework².

A crucial issue for accuracy is to identify, or decide, the level of symmetry
between computer activity and what really happens in human reasoning. With this
perception, the story of the seven blind wise men makes sense again.
Specificities, contingencies and biases, although possibly existing or true (in
the sense of the Indian story), do not match the holistic, global or collective
concept. This perception is crucial to defining any working method between AI and
law³.

² Therefore, it is not the purpose of this paper to present: 1) the work plan of the
“Machine Learning of the general repercussions in the Brazilian Supreme Court” project;
2) the state of the art in artificial intelligence (AI) or AI and law. These subjects are
addressed here only insofar as they support the methodological definition. They are part
of the research in a broader sense and will be published in due course.
³ An in-depth historical view of the development process and of AI perspectives is given in
the paper "A review of artificial intelligence” by E.S. Brunette, R.C. Flemmer and
C.L. Flemmer, School of Engineering and Advanced Technology of Massey University,
New Zealand. Beyond the historical perspective, it records the work of Mikawa (2004)
on the variables of the human mind: “consciousness, preconsciousness and
unconsciousness. In this model most data processing is done in the non-conscious states.
He therefore proposed a system where the level of information processing changed, based
on visual information being received. In his model, external information processing is
conducted when the robot is awake. However, when the robot is in sleep mode, external
information processing is reduced and more internal information processing is conducted.”
(BRUNETTE, 2009, p. 386) [All content following the pages of the paper was uploaded by
Claire Flemmer on 25 March 2014.]

The synergy between artificial intelligence and law was identified by Professor
Edwina L. Rissland, of the Department of Computer Science at the University of
Massachusetts, in a well-known article from 2003. There, [16, 2003] situated AI
and Law as a distinct research field within AI. These studies opened up topics
beyond the typical legal ones (such as dogmatics, hermeneutics, legal
argumentation, theories of justice or the best decision), more specifically the
insights and the inner logic of legal praxis in a broader sense.

Within the general outlook of AI research, one can identify nuances and
limitations of existing techniques with respect to how law functions, as well as
elements that catalyze the development of new, sustainable approaches. Rissland's
work is strengthened by the consolidation of standard theories of legal
argumentation and by the refinement of models designed to analyze and evaluate
legal reasoning.

Extreme Programming (XP)⁴, a process characteristic of software engineering
created by Kent Beck, is a set of development practices guided by values. These
practices and values seek to address the most common difficulties in software
development: missed deadlines and overspending. Because it deals with tightly
delimited values and equally simple practices, XP shows considerable synergistic
potential when they are applied together. Better decisions, quick answers,
efficient communication and wise investment of effort are worthy goals of any
research, especially research with the characteristics pointed out above⁵. There
is an emphasis on interactions and on the personal aspects of the teams involved.
The process itself, as well as its tools, is one of development and innovation.
The work developed is important, as is customer collaboration, which was probably
a defining factor in the choice of methodology (with law, and the characteristics
of its legal language, set as the guideline above and working as a customer). This
metaphorical use of law as a customer allows the process to tolerate the
unexpected (typical of innovation research) and to give quick answers to
problems, enabling changes, tests and adjustments⁶: a typical Victor robot
routine!

⁴ It is not the intention of the present paper to develop a dissertation on the subject.
⁵ Tripp, Saltz and Turk recently set out the cautions for using the agile methodology,
which guided the customization for the present project: “We believe that agile in software
development is an “instance” of agility in projects. Agile can be used in many other project
contexts beyond software development, […] Some foundational questions include: • What
guidance can we provide to create and sustain better agile and lean behaviors and more
successful outcomes? • How can we incorporate other functions, such as architecture and
production support, into agile and lean frameworks? • How can organizations and cultures
restructure to support these philosophies? • What are the measurable outcomes of using agile
techniques? • What additional metrics might a team use to measure team performance? •
What are the measurable differences in outcomes when using traditional vs agile
techniques? • What are ways that we can create a repository of knowledge, experiences, cases
and empirical data that could be used by research and industry to leverage and expand our
understanding of and practical skills in agile techniques?” [18, p. 5470]

The agile method has the characteristics⁷ of a cycle (SDLC), which were very well
grouped by [13, p. 198-199]:

Table 1. Characteristics of the agile method

Number  Characteristic
1       Conduct a survey and assess the feasibility of the information systems
        development project.
2       Study and analyze the information systems that are running.
3       Determine the requests of the information system users.
4       Select the best solution or problem solving.
5       Determine the hardware and software.
6       Design a new information system.
7       Build a new information system.
8       Communicate and implement the new information system.
9       Maintain and repair / improve the new information system if necessary.

What should guide the application of the agile methodology is the verification of
its potential benefits compared with a traditional work methodology, and this
paper agrees with that premise. A series of studies point in this direction, and
the main reasons for its use have been verified empirically⁸.

The choice of the agile method demands rigor; otherwise the benefits may not
outweigh the difficulties and imprecisions. [18] synthesizes this need in three
criteria: 1) the establishment of metrics for individual practices, even when
they are qualitative rather than quantitative; 2) documentation and
communication; and 3) when applying the theory, matching the nature of the agile
method to the work environment. This convergence is mandatory.

⁶ "There are several team and environmental characteristics that drive the extent to which
agile methods and practices can achieve their full potential. Hence, when conceptualizing
the potential impact of agile methods, researchers must consider and document the
characteristics of the environment that enable agile practices to be successfully implemented."
[18, p. 5466]
⁷ SDLC (Software Development Life Cycle) “is the stages of work performed by system
analysts and programmers in building an information system.” [13, p. 198-199]
⁸ There are several studies that “Describe and measure the team and environmental
characteristics of the project, 2) Measure the use of multiple agile practices, either
qualitatively or quantitatively, and 3) Illustrate theoretically how and when the unique
nature of agile methods influences outcomes.” [18, p. 5466]

3 General repercussion elements for Machine Learning application

3.1 General repercussion as a filter

The Brazilian legal institute of general repercussion was established to operate
as a filter on appeals, so that the Court does not hear extraordinary appeals
whose constitutional question is irrelevant or of sole and exclusive interest to
the parties. A cause is understood to be relevant when it has real and undoubted
importance, standing out from other causes that involve the same subject matter.
Relevant issues are thus those of great value or interest, which is why they call
for guidance from the Federal Supreme Court.

With regard to the requirement of transcendence, the cause cannot be limited to
matters of individual interest. That is why legal scholars and case law reinforce
the need for transcendence: the analysis must go beyond individual interests and
involve a genuinely collective interest in the examination of the matter. In this
way, general repercussion allows that, in cases founded on the same controversy,
one or more appeals that adequately and fully represent the leading cases be
selected for analysis by the Supreme Court, in accordance with the Code of Civil
Procedure. The effects of the decision on a single appeal will bind the many
others on the same matter, which brings greater effectiveness and speed to
jurisdictional activity.

This scenario goes back to the adoption of the system of precedents by the
Brazilian procedural system. The flow of administrative tasks, both for directing
the judicial process itself and for developing services in support of judicial
activity, has consumed a great deal of effort, time, resources and other valuable
elements of the public service.

It is not new that these flows are becoming ever more precise and produce many
elements for efficiency and effectiveness metrics. There is therefore a good
substrate for applying approved and implemented Artificial Intelligence (AI) and
for measuring its gains. Viewed as a flowchart, that is, as the representation of
a sequence of events that maps some types of decision, the tasks often involve
treating, reading and understanding data, classifying it, and comparing it with
pre-existing parameters or indicating a possible new standard.

Terminologies may vary according to the representation in the flow, but the
activities usually pass through a recognition of the data (optical or other
sensory recognition), a structuring of this data (a form of storage and
organization), an optimization of the information relevant to some form of
classification, and a decision on the path to follow in the flow or, possibly, a
change of flow.

As a first step, it is possible to contribute, for example, to the development of
"knowledge", that is, with optical character recognition or similar technology
that recognizes text characters in images and transforms them into editable text.
Even handwritten text in images can be recognized. For the structuring of data,
AI can contribute, for example, with organization algorithms that form structures
oriented by their function, purpose and storage conditions (classic vectors,
lists, stacks, trees, etc.).

Both the form of organization and the methods used must be chosen according to a
methodology appropriate to the characteristics of the judicial data.
Classification is one of the most frequent features offered by AI. For example, a
multi-layered neural network architecture can perform data treatment and
structuring, with very acceptable parameters of accuracy, verification and
ethical validation, so as to function as a real sorter of input data.
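
To make the classification step concrete, the sketch below trains a tiny
bag-of-words text classifier. It is an illustration only: it uses multinomial
naive Bayes, a much simpler technique than the multi-layered neural network
mentioned above, and the class name, training sentences and labels are
hypothetical and not taken from the Victor project.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its runs of letters as tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter(labels)        # label -> number of documents
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no evidence and are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy corpus: two invented themes with two documents each.
texts = [
    "income tax on pension benefits",
    "tax exemption for retirement income",
    "public servant salary ceiling",
    "salary limit for public servants",
]
labels = ["tax", "tax", "salary", "salary"]

model = NaiveBayesTextClassifier().fit(texts, labels)
print(model.predict("retirement tax"))  # tax
print(model.predict("public salary"))   # salary
```

A real system would work on text extracted from case files and on far larger
vocabularies; the point here is only the sequence of tokenizing, counting and
scoring each candidate class.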

4 Text classification and decision support solution combined: the Victor robot

4.1 The beginning of Victor robot

The Victor robot was designed to act in the case management flow, at the stage
where a case is evaluated and matched to a selected general repercussion theme.
Owing to the quality of the data at the beginning of this flow, a court staff
member (without the help of ML) takes considerable time, close to 30 minutes, to
locate and organize the relevant data that must be read and interpreted to
perform the classification. This figure combines with another: approximately 400
new cases arrive at the beginning of this flow per business day. Together they
reveal a serious management problem that absorbs something close to 200
worker-hours per business day for organization and flow initiation
(pre-activity). The quality of the initial text and image data is also highly
variable, owing to the large number of sources that send cases to the Supreme
Court, the variation in electronic case systems and their differing digitization
capabilities.
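
The workload estimate follows directly from the two figures quoted above. The
short calculation below reproduces it; the function name is illustrative, and the
8-hour working day used for the staff equivalent is an assumption that does not
come from the paper.

```python
# Figures quoted in the text; both are approximations in the source.
MINUTES_PER_CASE = 30          # manual time to locate and organize one case
CASES_PER_BUSINESS_DAY = 400   # new cases entering the flow per business day

# Assumption for illustration only: length of a working day in hours.
HOURS_PER_WORKING_DAY = 8


def daily_worker_hours(minutes_per_case, cases_per_day):
    """Return the worker-hours consumed per business day by the pre-activity."""
    return minutes_per_case * cases_per_day / 60


hours = daily_worker_hours(MINUTES_PER_CASE, CASES_PER_BUSINESS_DAY)
print(f"{hours:.0f} worker-hours per business day")            # 200
print(f"{hours / HOURS_PER_WORKING_DAY:.0f} full-time staff")  # 25
```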

Briefly, the project aims to develop and apply the newest Artificial Intelligence
concepts and techniques, especially Machine Learning, to relevant needs in
processing, classifying procedural parts and classifying themes/classes in
general repercussion management at the Supreme Court, and to support the
decisions of the technical team. The objectives are to increase processing speed,
increase accuracy in the steps involved and free human resources to perform
activities that are more strategic to the Court.

The following stages of the Victor Project were planned: 1) Preparation and
structuring of the General Repercussions database for training of machine learning
models. 2) Evaluation of more efficient training algorithms and strategies for the
context of General Repercussions, including deep artificial neural networks. 3)
Prototyping and training of the chosen algorithms, including their evaluation. 4)
Preparation of the communication architecture for real-time process classification,
along with the interface for recording possible errors in model responses,
including integration with the STF solution park.

Table 2. Goals of R&D Victor

Number  Goal
1       Composition of research and development base
2       Preparation of General Repercussions (RG) database for analysis
3       Building an Optimal Ontology
4       Select rating methods
5       Optimize selected classifiers
6       Match classifiers for process ratings against RGs
7       End user interface
8       Evaluate inference methodology
9       Publication of Results
10      Create Bank of New General Repercussion Themes (TRGs)
11      Generate New TRG Classifier
12      Generate New PC (procedural parts) Classifier
13      Prototype training automation engine and use of new TRG classifiers
14      Detail STF Team TRG Classifier
15      Detail STF Team PC Classifier

Human activity was mapped, the most recurrent general repercussion themes were
identified, and the data were divided for testing, modeling and comparison
between the activity simulated by human legal experts and that of the machine.
Thus, starting from a preprocessing model, a system that classifies 28 general
repercussion classes is being developed; the classes were selected because each
is associated with a large number of similar lawsuits. The simulations run so far
indicate high accuracy.
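
The paper does not describe how the data were divided. One common option when
classes have different sizes is a stratified split, which keeps every class
represented in the held-out data. The sketch below is a minimal version under
that assumption; the function name, the 20% test fraction and the toy labels are
hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), sampling each class separately."""
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Hold out at least one example of every class.
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical labels: 10 cases of one theme and 5 of another.
labels = ["theme_a"] * 10 + ["theme_b"] * 5
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 12 3
```

With only one example in a class, that example goes to the test set and none is
left for training, so very rare themes would need separate handling.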

4.2 Some results of Victor robot

All goals (Table 2) are being executed and improved in an agile setting, and the
test environment has yielded some encouraging results:

Table 3. Status of the research steps (as reported in the accountability stages of the
Research Project at the University of Brasília)

Number  Step                                    Stage
1       Preparation and structuring of the      Stage completed in full. The data from
        General Repercussions database for      lawsuits were collected from the STF
        training of machine learning models.    database, preprocessed and entered into
                                                a structured database hosted on the
                                                research laboratory's servers.
2       Evaluation of more efficient            Several hypotheses on the use of
        algorithms and training strategies      preprocessing methods and classifiers
        for the context of General              were investigated. Applied to 28
        Repercussions, including deep           general repercussion themes, the best
        artificial neural networks.             results were achieved with the XGBoost
                                                model, with an average F1-score above
                                                0.90.
3       Preparation of the communication        The RGs classification feature has
        architecture for real-time process      been adapted to work within the STF
        classification, along with the          (STF-Digital) technology park.
        interface for recording possible
        errors in model responses
        interactively, including integration
        with the STF solution park.
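
For readers less familiar with the metric, the sketch below shows how an average
F1-score over several classes can be computed. The paper does not state how the
per-theme scores were averaged; macro-averaging (every class weighted equally) is
assumed here, and the labels are invented for the example.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Average the per-class F1 scores, giving every class equal weight."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1   # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical predictions over three themes.
y_true = ["A", "A", "B", "B", "C"]
y_pred = ["A", "B", "B", "B", "C"]
print(round(macro_f1(y_true, y_pred), 3))  # 0.822
```

Unlike plain accuracy, this average penalizes a model that performs well on the
frequent themes but poorly on the rare ones.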

During execution, the way text is extracted from PDF files changed, which
required reprocessing the entire database. Once this extraction is completed, the
machine learning remodeling process will begin.

5 Conclusions
In this way, the project intends to conduct intensive research in a limited time,
optimizing the available resources, with instruments that allow problems to be
identified and the necessary adjustments made. From the point of view of human
relations, although team members have different theoretical backgrounds, the
intention is to keep the team united, stable and productive. The chosen
methodology, with the modifications needed to serve the research work plan, is
therefore considered justified.

General repercussion allows that, in cases founded on the same controversy, one
or more appeals that adequately and fully represent the leading cases be selected
for analysis by the Supreme Court, in accordance with the Code of Civil
Procedure. The effects of the decision on a single appeal will bind the many
others on the same matter, which brings greater effectiveness and speed to
jurisdictional activity.

The Victor Project, although very recent, has been presenting interesting,
strategic and relevant results in the attempt to reduce the processing time of
court proceedings. As noted, several hypotheses on the use of preprocessing
methods and classifiers were investigated. Applied to 28 general repercussion
themes, the best results were achieved with the XGBoost model, with an average
F1-score above 0.90.

References

1. Blight, Karin Johansson. Artificial Intelligence, AI biases and risks, and the need for
AI-regulation and AI ethics: some examples. 2018. DOI: 10.13140/RG.2.2.23455.00160.
https://www.researchgate.net/publication/326377798_Artificial_Intelligence_AI_biases_and_risks_and_the_need_for_AIregulation_and_AI_ethics_some_examples_17_Nov_2018.
Accessed on 11/03/2019.
2. Brundage, Miles, et al. Scaling Up Humanity: The Case for Conditional Optimism
about Artificial Intelligence. In: EPRS, European Parliamentary Research Service.
Should we fear artificial intelligence? European Parliament. 2018.
http://www.europarl.europa.eu/RegData/etudes/IDAN/2018/614547/EPRS_IDA(2018)614547_EN.pdf.
Accessed on 11/03/2019.
3. Brunette, E.S.; Flemmer, R.C.; Flemmer, C.L. A review of artificial intelligence.
School of Engineering and Advanced Technology of Massey University. New Zealand.
2009. DOI: 10.1109/ICARA.2000.4804025.
4. Datta, Shoumen Palit Austin. The Elusive Quest for Intelligence in Artificial
Intelligence. MIT Auto-ID Labs. Massachusetts Institute of Technology – MIT.
https://dspace.mit.edu/bitstream/handle/1721.1/108000/Intelligence_AI.pdf?sequence=11.
Accessed on 07/02/2018.
5. Davies, Jim; Francis Jr., Anthony G. The Role of Artificial Intelligence Research
Methods in Cognitive Science. Institute of Cognitive Science, Carleton University.
Ottawa. 2013.
https://pdfs.semanticscholar.org/e159/ea04f23303091742e81a5ba25a4f62e40bc7.pdf.
Accessed on 07/02/2018.
6. Engle, Eric Allen. An Introduction to Artificial Intelligence and Legal Reasoning:
Using x Talk to Model the Alien Tort Claims Act and Torture Victim Protection Act.
Richmond Journal of Law & Technology. Volume XI, Issue 1. 2004.
http://jolt.richmond.edu/jolt-archive/v11i1/article2.pdf. Accessed on 07/02/2018.
7. Gray, Pamela. Artificial Legal Intelligence. Aldershot: Brookfield, USA. 1997.
8. Jahanzaib, Shabbir; Anwer, Tarique. Artificial Intelligence and its Role Near
Future. Journal of latex class files, vol. 14, n. 8, August 2015.
https://arxiv.org/pdf/1804.01396.pdf. Accessed on 25/02/2019.
9. Khmelevsky, Youry; Li, Xitong; Madnick, Stuart. Software development using agile
and scrum in distributed teams. Sloan School of Management. Massachusetts Institute of
Technology – MIT. 2017. http://web.mit.edu/smadnick/www/wp/2017-02.pdf.
Accessed on 07/02/2018.
10. Kubovic, Ondrej; Kosinár, Peter; Jánosik, Juraj. Can Artificial Intelligence
Power Future Malware? ESET White Paper. Enjoy Safer Technology. Available at
https://www.welivesecurity.com/wp-content/uploads/2018/08/Can_AI_Power_Future_Malware.pdf.
Accessed on 11/03/2019.
11. Maalel, Ahmed; Hadj-Mabrouk, Habib. Contribution of Case Based Reasoning (CBR)
in the Exploitation of Return of Experience. Application to Accident Scenarii in
Railroad Transport. Cornell University Library. 2012. https://arxiv.org/pdf/1203.0656.
Accessed on 07/02/2018.
12. Maini, Vishal; Sabri, Samer. Machine Learning for Humans. Published August 19, 2017.
Edited by Sachin Maini.
https://everythingcomputerscience.com/books/Machine%20Learning%20for%20Humans.pdf.
Accessed on 08/03/2019.
13. Permana, Putu Adi Guna. Scrum Method Implementation in a Software Development
Project Management. Bradford, UK: (IJACSA) International Journal of Advanced
Computer Science and Applications, Vol. 6, No. 9, 2015, p. 198-204. ISSN: 2156-5570
(Online). DOI: 10.14569/issn.2156-5570.
14. Prieto, Heloisa; Godfrey, John. O tempo tem histórias. In:
https://otempotemhistorias.wordpress.com. Accessed on 07/02/2018.
15. Richter, Michael M.; Aamodt, Agnar. Case-based reasoning foundations. The
Knowledge Engineering Review, Vol. 20:3, 203–207. 2006, Cambridge University
Press. DOI: 10.1017/S0269888906000695. Printed in the United Kingdom.
16. Rissland, E.L.; Ashley, K.D.; Branting. Case-based reasoning and law. The Knowledge
Engineering Review, Vol. 00:0, 1-4. 2005, Cambridge University Press. DOI:
10.1017/S0000000000000000000.
17. Stern, Simon. Introduction: Artificial Intelligence, Technology and the Law. Pre-paper
in the 8 University of Toronto Law Journal __ (2018).
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3092887. Accessed on 07/02/2018.
18. Tripp, John F.; Saltz, Jeffrey; Turk, Dan. Thoughts on Current and Future Research on
Agile and Lean: Ensuring Relevance and Rigor. Proceedings of the 51st Hawaii
International Conference on System Science. CC BY-NC-ND 4.0.
2018. p. 5465-5471. http://hdl.handle.net/10125/50570. ISBN: 978-0-9981331-1-9.
Accessed on 07/02/2018.
