SCS6105 Assignment 1 Group3

University of Nairobi
Course: Msc. Computer Science (Computational Intelligence)

Date: 24/10/2023
SCS6105 – Machine Learning
Assignment 1 (ML Trends)
GROUP 3
Group Members
Name                     Email               Phone Number    Reg Number
Clara Musenya Musyoka    [email protected]   0703755941      Not available yet
AGIRA CHRIS JAMES        [email protected]   0755591046
NGUU JOHN KIKUVI         [email protected]   0712523444      P60/41063/2021
1. What are the (globally) emerging and/or future use cases of ML-based AI?
a. Natural language processing (NLP)
NLP is a field of computer science that enables computers to understand text and speech the way
human beings naturally do. One key use case is the virtual assistant, which uses conversational AI
to generate responses and keep a conversation flowing naturally. In addition, virtual agents use
deep learning to improve over time. A recent AI chatbot that has taken the internet by storm is
ChatGPT. Launched on 30 November 2022 [3] by OpenAI, it converses in dialogue form and can write
articles, code snippets, and more. ChatGPT is built on large language models (LLMs): the GPT-3.5
series, which builds on GPT-3, and GPT-4. These models were trained on large volumes of internet
data (45 TB of text in the case of GPT-3 [6]), which enables them to understand prompts and
generate human-like responses on a wide variety of topics.
The ‘large’ in LLM refers to the number of values (“parameters”) that the model can adjust
autonomously as it learns [5]. GPT-3 has 175 billion parameters. Given an input text, it assigns a
probability to each token in a known vocabulary and uses those probabilities to choose the token
that comes next, a technique known as autoregressive language modeling [6].
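The idea of probabilistically choosing the next token from a known vocabulary can be illustrated
with a minimal sketch. The four-word vocabulary and the scores below are invented for illustration
and are not taken from any GPT model; a real LLM computes such scores with billions of parameters.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_distribution(vocab, logits):
    """Pair each vocabulary token with its predicted probability."""
    return dict(zip(vocab, softmax(logits)))


def sample_next_token(vocab, logits, rng):
    """Draw one token at random, weighted by the predicted probabilities."""
    return rng.choices(vocab, weights=softmax(logits), k=1)[0]


# Hypothetical vocabulary and scores for the prompt "The cat sat on the"
vocab = ["mat", "dog", "moon", "sofa"]
logits = [3.0, 0.5, 0.1, 2.0]

probs = next_token_distribution(vocab, logits)
most_likely = max(probs, key=probs.get)
sampled = sample_next_token(vocab, logits, random.Random(0))
```

Repeating the sampling step, each time appending the chosen token to the input, is what produces a
full generated sentence.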
b. Smart shopping with cartless/cashierless stores
The retail and e-commerce industry has made significant strides in using machine learning to
improve the consumer experience and reduce expenses. One prominent use case is Amazon's Just Walk
Out technology, which has found various real-world applications such as the Amazon Go stores, the
first of which opened to the public in 2018 [28].
c. Digital twins in various sectors
A digital twin is a virtual representation of an existing or yet-to-be-built physical object,
system, or process, complete with data and functionality. It is built by gathering and combining
data from sensors, IoT devices, and other sources to produce a precise, real-time model of the
physical thing. This virtual model can be used for monitoring, simulation, analysis, and
optimization with machine learning models. Application areas include predictive maintenance,
business optimization, performance monitoring, and inventory management. The worldwide digital twin
market is expected to be worth USD 110.1 billion by 2028, growing at a compound annual rate of
61.3% [13]. Tailored treatment and disease modeling are good examples: 66% of healthcare facilities
intend to invest in digital twins because they provide a safe environment for experimentation in
this critical industry [13].
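To put the quoted growth rate in perspective, compound growth follows V_n = V_0 * (1 + r)^n. The
sketch below works backwards from the 2028 figure. The five-year horizon is an assumption made
here for illustration; the text above states only the 2028 value and the annual rate.

```python
def future_value(base, rate, years):
    """Value after compounding `base` at `rate` per year for `years` years."""
    return base * (1 + rate) ** years


def implied_base(target, rate, years):
    """Starting value that would grow to `target` at the given compound rate."""
    return target / (1 + rate) ** years


TARGET_2028 = 110.1  # USD billion, as quoted in the text
ANNUAL_RATE = 0.613  # 61.3% per year, as quoted in the text
YEARS = 5            # assumption: a five-year forecast window

base = implied_base(TARGET_2028, ANNUAL_RATE, YEARS)
growth_multiple = (1 + ANNUAL_RATE) ** YEARS
```

Under that assumption, a 61.3% annual rate means the market grows roughly elevenfold over the
period, so the implied starting value is close to USD 10 billion.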
Digital twins also aid drug discovery. During COVID-19, Siemens helped companies produce vaccines
on a large scale using the digital twin approach, a task that would otherwise have taken years to
complete [27].
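The sensor-driven model described in this section can be sketched in a few lines. The pump, the
temperature threshold, and the window size below are all hypothetical, and a fixed rule stands in
for the machine learning model that a production digital twin would use for predictive
maintenance.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class PumpTwin:
    """Virtual counterpart of a hypothetical pump, updated from sensor readings."""

    max_safe_temp_c: float = 80.0  # assumed alert threshold
    window: int = 3                # number of recent readings to average
    temps_c: list = field(default_factory=list)

    def ingest(self, temp_c):
        """Mirror a new sensor reading into the twin's state."""
        self.temps_c.append(temp_c)

    def rolling_temp(self):
        """Average of the most recent readings (needs at least one reading)."""
        return mean(self.temps_c[-self.window:])

    def needs_maintenance(self):
        """Flag the physical pump when the recent average exceeds the threshold."""
        return self.rolling_temp() > self.max_safe_temp_c


twin = PumpTwin()
for reading in [70.0, 72.0, 75.0, 83.0, 86.0, 90.0]:
    twin.ingest(reading)

alert = twin.needs_maintenance()
```

The same structure extends to simulation: feeding the twin hypothetical readings shows how the
physical asset would be flagged without touching the real equipment.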
d. Healthcare
Machine learning is employed in many areas of healthcare, including disease diagnosis and
prediction, drug discovery and development, and improved medical imaging diagnostics. Its use in
the sector has improved patient outcomes while also lowering costs. Project InnerEye (released in
September 2020) is a recent application of ML in healthcare.
Project InnerEye [15] is a Microsoft Research effort in collaboration with the University of
Cambridge. InnerEye develops solutions for automatic, quantitative analysis of 3D medical images
using machine learning, and draws on previous work in computer vision and machine learning.
An example use case is accelerating radiotherapy planning for patients with brain tumors [16].
Creating a radiotherapy plan for each cancer patient takes a lot of time, because the tumor and
the healthy tissue must be mapped out so that the radiotherapy is targeted and the surrounding
organs are not damaged. InnerEye performs this 3D segmentation automatically.
e. Software development with AI
GitHub Copilot is an AI-powered coding assistant that uses a machine learning model descended from
GPT-3 (OpenAI Codex) to help developers write code by proposing code completions and giving
context-aware documentation. It has attracted a lot of attention because of its potential to boost
developer productivity and streamline the development process.
When GitHub Copilot for Individuals launched in June 2022, it generated on average more than 27%
of developers' code files. GitHub Copilot is now behind 46% of a developer's code on average
across all programming languages, rising to 61% for Java [29].
2. For any of the identified use cases (a minimum of two), address the following:
a. What learning paradigm is in use (supervised, unsupervised, reinforcement,
recommender etc.)? Justify your answer.
1. ChatGPT was developed by combining supervised and reinforcement learning techniques [7]. The
reinforcement learning component is what distinguishes ChatGPT. Its creators employed a technique
known as Reinforcement Learning from Human Feedback (RLHF), which puts human feedback in the
training loop to reduce harmful, untruthful, and/or biased outputs. The model was trained in the
three steps listed below.
1. Generative pre-training: the base model, built on the transformer architecture, was trained
on internet data.
2. Supervised fine-tuning: human AI trainers wrote out conversations in which they played both
the user and an AI assistant, and the model was fine-tuned on these demonstrations.
3. Reinforcement learning from human feedback: the model was then optimized with
reinforcement learning by training it against a reward model.
Fig 2.1 Steps of ChatGPT training. Source: https://openai.com/blog/chatgpt?ref=assemblyai.com
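The reward model used in step 3 can be illustrated with a small sketch. It assumes, as RLHF
descriptions commonly do, that human labelers compare pairs of responses and that the reward model
is trained to score the preferred response higher; the reward values below are invented, and this
is not presented as OpenAI's actual training code.

```python
import math


def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: small when the human-preferred response gets the higher reward.

    Equals -log(sigmoid(reward_chosen - reward_rejected)).
    """
    diff = reward_chosen - reward_rejected
    return math.log1p(math.exp(-diff))


# Hypothetical reward-model scores for two responses to the same prompt
agrees = preference_loss(2.0, -1.0)     # model ranks the preferred response higher
undecided = preference_loss(0.5, 0.5)   # model scores both responses equally
disagrees = preference_loss(-1.0, 2.0)  # model ranks the preferred response lower
```

Minimizing this loss over many human comparisons yields a reward model whose scores can then
guide the reinforcement learning step.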
2. Healthcare (Project InnerEye)
Supervised learning: for automatic segmentation of medical images, InnerEye employs supervised
learning algorithms such as convolutional neural networks (CNNs) [15]. Decision forests are another
technique, used to classify each voxel in an image as belonging to tumor or healthy tissue [16].
Both are supervised because they learn from images in which the regions have already been labeled.
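The idea of learning from labeled voxels and then classifying new ones can be shown with a toy
example. The sketch below uses a nearest-centroid rule on a single invented intensity value per
voxel; it is a deliberately simple stand-in and not the CNN or decision forest that InnerEye uses.

```python
from statistics import mean


def fit_centroids(intensities, labels):
    """Learn the mean intensity of each class from labeled training voxels."""
    return {
        cls: mean(x for x, y in zip(intensities, labels) if y == cls)
        for cls in set(labels)
    }


def classify(intensity, centroids):
    """Assign a voxel to the class whose mean intensity is closest."""
    return min(centroids, key=lambda cls: abs(intensity - centroids[cls]))


# Hypothetical normalized voxel intensities with expert-provided labels
train_x = [0.10, 0.15, 0.20, 0.80, 0.85, 0.90]
train_y = ["healthy", "healthy", "healthy", "tumor", "tumor", "tumor"]

centroids = fit_centroids(train_x, train_y)
predicted = [classify(v, centroids) for v in [0.12, 0.88, 0.30]]
```

The labels in `train_y` play the role of the expert annotations; without them the model would have
nothing to learn from, which is what makes the approach supervised.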
3. GitHub Copilot
GitHub Copilot, an AI-powered code completion tool that has lately made waves in the programming
community, is described as employing unsupervised learning to produce code suggestions based on
the context of the code being entered [30]. More precisely, this is self-supervised learning: the
model learns to predict the next token of code from the code itself, with no human-provided labels.
GitHub Copilot's model is based on deep learning. Source [31] describes it as combining
convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to analyze code and
recognize patterns; note, however, that GitHub attributes Copilot to OpenAI Codex, a
transformer-based model descended from GPT-3. The model has been trained on a huge dataset of code
repositories and is intended to recognize code patterns and recommend completions [31].
b. What legal and ethical concerns should be considered when adopting/implementing such use
cases?
ChatGPT legal and ethical concerns [8]:
1. Biased and inaccurate outputs. The models behind ChatGPT contain biases and can therefore
produce outputs that reflect them, such as racial and gender stereotypes. People have also
criticized the tool for producing inaccurate results. OpenAI acknowledges this as one of
ChatGPT's limitations, stating that “ChatGPT sometimes writes plausible-sounding but
incorrect or nonsensical answers” [4].
There is also little transparency about how the training data was obtained, which has
contributed to the tool being banned or restricted in several countries, such as Italy, China,
and Russia [8].
2. Privacy violations. ChatGPT stores users' conversations for further training, so their
content could later surface in outputs. Users have to disable this function manually or delete
their conversations. Many users may not know this and end up exposing personal or even
company information. In a security breach in March 2023, some users were able to see the
titles of other users' conversations [11].
3. Plagiarism and cheating. This has become a big issue in education, where students copy
content from ChatGPT and pass it off as their own work, or paraphrase AI-generated content
and present it as original [12]. Tools such as AI detectors have been introduced to limit
plagiarism.
4. Infringement of intellectual property rights. Because some of the data used to train
ChatGPT is copyrighted, its outputs may reproduce copyrighted material. ChatGPT is also
unable to provide citations, so authors may be unaware that their copyright has been violated.
Several lawsuits have been filed against OpenAI for copyright infringement, including one by
the creator of Game of Thrones, George R. R. Martin, who is one of 17 authors suing OpenAI
for using their intellectual work without permission [10].
Healthcare (Project InnerEye) ethical concerns:
1. Bias: as stated in the paper, a main concern is the lack of generalizability, since most ML
models are trained only on datasets from a single institution [16].
2. Privacy: the ML model was trained on data from eight clinical centers, a substantial amount of
patient data that could be exposed to cybercrime if not properly safeguarded.
Legal and ethical concerns for digital twins in healthcare [14]:
1. Privacy and security concerns. Digital twins have to collect patient data, which is very
sensitive and can cause harm if it falls into the hands of unauthorized people.
2. Limited or fragmented data. Hospitals lack the comprehensive data needed to train digital
twins properly. Electronic data is also scattered and hard to incorporate. Collecting real-time
data and merging it with other data types is not easy, which makes it challenging to build
an advanced health digital twin.
3. Integration with existing systems. Many healthcare systems are disconnected and lack
interoperability and integration with digital twin systems.
4. High implementation costs. The cost of implementing digital twins in healthcare might be
substantial and outweigh the benefits.
5. Regulatory considerations. Regulations on data privacy, patient consent, and ethical
research practices must be adhered to.
Legal and ethical issues with GitHub Copilot:
1. Accuracy of the suggestions provided by GitHub Copilot
Copilot may occasionally give the programmer incorrect output. While GitHub Copilot is a
powerful tool, it has limitations: as with any AI model, the quality of the generated code is
determined by the quality of the training data. As a result, Copilot may on occasion generate
inaccurate or insecure code, and developers should always check its suggestions before
incorporating them into their projects [32].
2. Copyright concerns
GitHub Copilot's machine learning model was trained on publicly available code, which raises
two important questions:
• Does GitHub require permission from the code's copyright holder to train Copilot on
their code?
• Given that Copilot generates suggestions using a large corpus of publicly available code,
some of which may be subject to strict copyleft licensing, does using Copilot constitute
the creation of a derivative work based on the original copyleft-licensed code?
Questions about Copilot's license compliance came to a head in October 2022. On 17 October,
programmer and lawyer Matthew Butterick announced that he and a team of lawyers were
investigating a possible lawsuit against Copilot over copyright claims, and a class-action suit
was subsequently filed [32].
3. What are examples of local (Africa, Kenya) use cases of ML-based AI?
I. Agriculture
Agriculture is one of the backbones of Kenya's economy, contributing 30% of GDP [18]. One
ML application is Eska, which is used to detect crop diseases and nutrient deficiencies in
soil. A farmer takes images of crops, and the app uses machine learning to diagnose the
plants' health and present the results [18].
Another example is a coffee traceability solution launched in Ethiopia, which gives farmers
supply chain tracking and better market positioning [19].
Apollo Agriculture: through Apollo's digital platform, farmers can access inputs, funding,
and markets. Farmers register in the application and provide some details, after which
Apollo Agriculture collects satellite imagery of their farms and uses ML-based AI to make
better decisions on the loans to grant [26].
Although it is not stated explicitly on the platform's website, Apollo Agriculture uses a
proprietary credit assessment algorithm [25]. Because this is a predictive algorithm, the
learning paradigm is most likely supervised learning.
II. Fintech
ML is used to assess the creditworthiness of people with no credit history by evaluating
alternative data sources such as mobile phone usage. Digital lenders use these data and ML
models to set loan limits and determine individuals' suitability [17]. For example, Loanbee
uses an ML model for credit scoring and even sells it to other organizations as a PaaS
solution [20].
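A minimal sketch of how mobile phone usage could feed a credit decision is given below. The
feature names, weights, threshold, and loan cap are all invented for illustration; they do not
describe Loanbee's or any other lender's actual model, and a real system would learn its weights
from historical repayment data.

```python
import math

# Hypothetical weights for a logistic scoring model (not from any real lender)
BIAS = -3.0
WEIGHTS = {
    "monthly_topups": 0.15,
    "mobile_money_txns": 0.05,
    "months_on_network": 0.04,
}


def repayment_probability(features):
    """Logistic score: estimated probability that the applicant repays."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1 / (1 + math.exp(-z))


def loan_limit(probability, max_limit=50_000, threshold=0.5):
    """Scale the loan limit with the score; decline below the threshold."""
    if probability < threshold:
        return 0
    return round(probability * max_limit)


active_user = {"monthly_topups": 10, "mobile_money_txns": 40, "months_on_network": 36}
sparse_user = {"monthly_topups": 1, "mobile_money_txns": 2, "months_on_network": 3}

p_active = repayment_probability(active_user)
p_sparse = repayment_probability(sparse_user)
limits = (loan_limit(p_active), loan_limit(p_sparse))
```

The applicant with a longer and more active usage history receives the higher score, which shows
how people without a formal credit history can still be assessed.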
III. Health
ML-powered health systems are used for remote consultation and diagnostic assistance in
underserved areas across Africa, resulting in earlier diagnosis and treatment. One example is
Dr. Elsa in Tanzania, an AI-powered app that doctors use to make diagnoses from medical
data. With the app, doctors take less time per diagnosis and can therefore treat more
patients. Patients also use it for health education, medical alerts, and analysis of their
mental status [21].
AfyaRekod in Kenya is used by medical personnel to collect patient information and provide
patients with appropriate medical attention. It was also used during the COVID-19 pandemic
to aid early diagnosis [17].
IV. Education
One example in education is Angaza Elimu, which uses ML models to give learners a
personalized learning experience through notes designed to fit their individual learning
needs and through tracking of individual performance. It also enables tutors to assess
students' capabilities and provide them with customized learning resources [17].
M-Shule is an SMS-based application that uses ML models to deliver learning, evaluations,
and data tools [17].
4. What are the factors accelerating or impeding adoption of ML-based AI in the local context?
Factors impeding adoption of AI:
• Data quality and accessibility
Although data exists, much of it is in forms that ML models cannot use because it is held in hard
copy rather than in digital formats. Kenya, for instance, is ranked 78th out of 94 countries, with
a score of 15%, in the Global Open Data Index 2016/17, which measures the availability of
government data to the public. It also takes seventh position in openness of government data [17].
• Skills gap
AI development requires people with AI competence, which can be gained through STEM courses.
STEM course enrollment in Kenya, however, remains extremely low. According to a Nation Newsplex
analysis of AI in Kenya, barely one in four university graduates has completed a STEM course, and
only one-third are enrolled in STEM courses [17].
• Regulatory environment
Unclear or restrictive regulations and compliance issues hinder the adoption of AI. Kenya lacks an
AI regulatory framework. In 2018 the government commissioned a blockchain and AI taskforce to give
direction on the use of AI in the country; it found that the lack of regulation created risks
around data privacy, weaponization, and human redundancy [17]. The taskforce recommended
implementing policies to promote the AI ecosystem, but the government is yet to establish them [17].
• Connectivity
Remote locations across Africa still have no reliable internet connection, which is a key
requirement for AI, and this limits adoption in those areas. For example, according to the GSMA
State of Mobile Internet Connectivity 2023 report, mobile internet use stands at 57%, while 38% of
people live within coverage but do not use it [22].
Causes of low connectivity in Kenya include the high cost of internet access and smartphones, low
levels of digital literacy in rural areas, and poor infrastructure [17].
• Investment in research
In Kenya, the government has not financed AI research; the research that has been funded was
supported by overseas organizations interested in AI adoption. This lack of investment prevents AI
from being applied to the most pressing local needs. The government committed to supporting such
research in its 2019 ICT policy, but little has been done so far [17].
Factors accelerating adoption of AI:
• Innovation hubs
International organizations have opened a growing number of innovation hubs in Kenya that aim to
accelerate the adoption of AI. Examples include the Microsoft Africa Research Institute (MARI),
which aims to bring together researchers, engineers, and designers to develop viable AI solutions
for Africa [17], and the IBM Research laboratory (Nairobi Think Lab), which conducts AI research
in critical areas such as education and healthcare [17]. Elsewhere on the continent, Google's AI
Research Centre in Africa is based in Accra, Ghana.
• Government support
AI adoption is most aggressive in African countries whose governments support technology
initiatives. One example is Kenya, where steps have been taken towards data protection and the
Ministry of ICT has set up a taskforce (the “Blockchain and Artificial Intelligence taskforce”)
[24] to explore how best the technology can be used to advance Kenya's development.
• Culture of research and innovation
Nearly all universities offer STEM courses, which continue to supply the market with the talent
and skills required for AI research and innovation. This can also be seen in the growing network
of technology hubs in Africa; Kenya, for example, has approximately 30 tech hubs [24].
References
1. “Natural Language Processing (NLP) - A Complete Guide.” DeepLearning.AI,
www.deeplearning.ai/resources/natural-language-processing/. Accessed 30
Oct. 2023.
2. DeepLearning.AI. “GPT-3 for All: GPT-3 NLP Model Is Available for Select Azure
Users.” 16 Aug. 2023, www.deeplearning.ai/the-batch/gpt-3-azure/.
3. “What Is CHATGPT and Why Does It Matter? Here’s What You Need to Know.”
ZDNET, www.zdnet.com/article/what-is-chatgpt-and-why-does-it-matter-heres-
everything-you-need-to-know/. Accessed 30 Oct. 2023.
4. “Chatgpt.” ChatGPT, openai.com/chatgpt. Accessed 30 Oct. 2023.
5. What Is a Large Language Model (LLM)? Techopedia Explains,
www.techopedia.com/definition/34948/large-language-model-llm. Accessed 30 Oct.
2023.
6. Cooper, Kindra. “OpenAI GPT-3: Everything You Need to Know [Updated].”
Springboard Blog, 6 Oct. 2023,
www.springboard.com/blog/data-science/machine-learning-gpt-3-open-ai/.
7. Ramponi, Marco. “How CHATGPT Actually Works.” AssemblyAI, 4 Aug. 2023,
www.assemblyai.com/blog/how-chatgpt-actually-works/.
8. Ryan, Eoghan. “Ethical Implications of Chatgpt.” Scribbr, 11 Sept. 2023,
www.scribbr.com/ai-tools/chatgpt-ethics/.
9. McCallum, Shiona. “CHATGPT Banned in Italy over Privacy Concerns.” BBC News,
BBC, 1 Apr. 2023, www.bbc.com/news/technology-65139406.
10. “Game of Thrones Creator and Other Authors Sue Chatgpt-Maker for ‘Theft.’”
Technology News | Al Jazeera, Al Jazeera, 21 Sept. 2023,
www.aljazeera.com/news/2023/9/21/openai-sued#:~:text=In%20papers%20filed
%20in%20federal,theft%20on%20a%20mass%20scale%E2%80%9D.
11. Browne, Ryan. “OpenAI CEO Admits a Bug Allowed Some CHATGPT Users to See
Others’ Conversation Titles.” CNBC, CNBC, 17 Apr. 2023,
www.cnbc.com/2023/03/23/openai-ceo-says-a-bug-allowed-some-chatgpt-to-see-
others-chat-titles.html.
12. Mark Allen Cu and Sebastian Hochman. “Scores of Stanford Students Used
CHATGPT on Final Exams.” The Stanford Daily, 24 Jan. 2023,
stanforddaily.com/2023/01/22/scores-of-stanford-students-used-chatgpt-on-final-
exams-survey-suggests/.
13. “Digital Twin Market Size, Share, Industry Report, Revenue Trends and Growth
Drivers.” MarketsandMarkets, www.marketsandmarkets.com/Market-Reports/digital-
twin-market-225269522.html. Accessed 30 Oct. 2023.
14. “Digital Twin in Healthcare: A Game-Changing Technology.” Relevant Software, 11
May 2023, relevant.software/blog/digital-twin-in-healthcare/.
15. “Project InnerEye - Democratizing Medical Imaging Ai.” Microsoft Research, 28 July
2023, www.microsoft.com/en-us/research/project/medical-image-analysis/.
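16. Oktay, Ozan, et al. “Project InnerEye Evaluation Shows How AI Can Augment and
Accelerate Clinicians' Ability to Perform Radiotherapy Planning 13 Times Faster.”
Microsoft Research, 30 Nov. 2020, www.microsoft.com/en-
us/research/blog/project-innereye-evaluation-shows-how-ai-can-augment-and-
accelerate-clinicians-ability-to-perform-radiotherapy-planning-13-times-faster/.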
17. Artificial Intelligence in Kenya - Paradigm Initiative,
paradigmhq.org/wp-content/uploads/2022/02/Artificial-Inteligence-in-Kenya-1.pdf.
Accessed 30 Oct. 2023.
18. “Kenyan Government Goes for AI Powered Platform for Crop Monitoring. •
Skillmine Opportunity.” Skillmine Opportunity, 10 June 2022,
opportunity.skillmine.africa/kenyan-government-goes-for-ai-powered-platform-for-
crop-monitoring/.
19. Writer, Staff. “Over 5 Million Ethiopian Farmers to Benefit from IBM’s ECX
Traceability System.” CIO Africa, 10 Nov. 2015, cioafrica.co/over-5-million-
ethiopian-farmers-to-benefit-from-ibms-ecx-traceability-system/.
20. “Exploring Credit Scoring Services Powered by Machine Learning – Steve Guoko -
Ai Kenya.” Ai Kenya - Democratizing Machine Intelligence, 4 Feb. 2020,
kenya.ai/exploring-credit-scoring-services-powered-by-machine-learning-steve-
guoko/.
21. “We Use AI to Support and Optimize Health Decisions.” Elsa Health | AI for Clinical
Decision Support, www.elsa.health/. Accessed 30 Oct. 2023.
22. “The State of Mobile Internet Connectivity Report 2023.” Mobile for Development,
GSMA, www.gsma.com/r/somic/. Accessed 30 Oct. 2023.
23. How Siemens Helped Scientists Produce COVID-19 Vaccines on a Massive Scale,
www.fastcompany.com/90731339/how-siemens-helped-scientists-produce-covid-19-
vaccines-on-a-massive-scale. Accessed 30 Oct. 2023.
24. Coming to Life: Artificial Intelligence in Africa - Atlantic Council,
www.atlanticcouncil.org/wp-content/uploads/2019/09/Coming-to-Life-Artificial-
Intelligence-in-Africa.pdf. Accessed 30 Oct. 2023.
25. Div Portal, divportal.usaid.gov/s/project/a0gt0000000rW8SAAU/remote-sensing-
and-machine-learning-for-smallholder-finance. Accessed 30 Oct. 2023.
26. Quenum, Adoni Conrad. “Kenya: AI-Based Agritech Apollo Agriculture Helps
Farmers Maximize Profits.” Actualité - We Are Tech, We are Tech, 4 May 2022,
www.wearetech.africa/en/fils-uk/solutions/kenya-ai-based-agritech-apollo-
agriculture-helps-farmers-maximize-profits.
27. How Siemens Helped Scientists Produce COVID-19 Vaccines on a Massive Scale,
www.fastcompany.com/90731339/how-siemens-helped-scientists-produce-covid-19-
vaccines-on-a-massive-scale. Accessed 30 Oct. 2023.
28. Picaro, Elyse Betters. “Amazon Go and Amazon Fresh: How the ‘just Walk out’ Tech
Works.” Pocket, 2 Sept. 2023, www.pocket-lint.com/what-is-amazon-go-where-is-it-
and-how-does-it-work/.
29. Zhao, Shuyin. “GitHub Copilot Now Has a Better AI Model and New Capabilities.”
GitHub, 14 Feb. 2023, https://github.blog/2022-09-07-research-quantifying-github-
copilots-impact-on-developer-productivity-and-happiness/. Accessed 30 Oct.
2023.
30. “Enhancing Code Generation with GitHub Copilot's Unsupervised Learning.” TS2
Space, 24 June 2023, https://ts2.space/en/enhancing-code-generation-with-
github-copilots-unsupervised-learning/. Accessed 30 Oct. 2023.
31. Frąckiewicz, Marcin. A Comprehensive Guide to GitHub Copilot’s Model
Architecture and Training Process, 7 Apr. 2023, https://ts2.space/en/a-
comprehensive-guide-to-github-copilots-model-architecture-and-training-process/.
Accessed 30 Oct. 2023.
32. Vincent, James. “The Lawsuit That Could Rewrite the Rules of AI Copyright.”
The Verge, 8 Nov. 2022,
https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-
class-action-lawsuit-ai-copyright-violation-training-data. Accessed 30 Oct. 2023.
