HCMUT Internship Report DoanTienThong
ARTIFICIAL INTELLIGENCE IN
INDUSTRY ENVIRONMENT
Major: Control and Automation Engineering
Throughout my internship, I have received support and assistance from many people, and I
would like to take this opportunity to express my heartfelt gratitude.
First and foremost, I would like to thank my internship supervisor, Mr. Nguyen Hoang Giap,
for his constant encouragement, invaluable guidance, and unwavering support throughout my
internship. Your insights and expertise have been instrumental in my professional development,
and I am truly grateful for the opportunity to learn from you.
I am also grateful to the entire team at Emage Development company, who welcomed me
warmly and provided me with a conducive environment to hone my skills and explore new
horizons.
I would like to extend my gratitude to the Faculty of Electrical and Electronics Engineering for
their invaluable guidance, support, and feedback on my internship report. Your expertise and
suggestions have significantly contributed to the quality of this report.
I am also thankful to the Ho Chi Minh City University of Technology for providing me with
the platform and resources to pursue this internship. The experience has been vital to my
academic and professional growth.
I am deeply grateful to my family for their unwavering love, encouragement, and patience
throughout my internship journey. Their support has been the backbone of my success, and
I cannot thank them enough for their belief in me.
Lastly, I would like to express my appreciation to my friends and fellow interns, who have
shared this journey with me. Your support, collaboration, and friendship have made this
experience both enjoyable and memorable.
Abstract
This internship report presents an overview of the author’s experiences, key accomplishments,
and learning outcomes during their internship at Emage Development company as an AI Engineer.
Throughout the internship, the intern was engaged in various projects and tasks related to
Artificial Intelligence, such as classification, object detection and segmentation. This hands-on
experience allowed the intern to develop and strengthen essential technical and interpersonal
skills, such as programming, using frameworks, problem-solving, teamwork, and effective
communication.
The report details the intern’s experiences and learning outcomes, along with the challenges
encountered and the strategies adopted to overcome them. Furthermore, the report highlights
the intern’s contributions to the organization and reflects on the impact of the internship on
their academic and professional growth.
In conclusion, the internship experience at Emage Development company has been invaluable
in bridging the gap between academic learning and industry practices. The skills and experience
gained while working here will help the intern excel in future career endeavors.
Contents
Acknowledgements
Abstract
Contents
List of Figures
1 Introduction
1.1 Introduction to Emage Development company
1.2 Internship Project
1.3 Internship Timeline
2 Internship Content
2.1 Research about Company’s Product
2.2 Project 1: Research about EfficientNet
2.3 Project 2: Template Matching using DL
3 Internship Summary
3.1 Internship Result
3.2 Experience Gained
References
List of Figures
Chapter 1
Introduction
This chapter gives brief information about Emage Development company: its establishment
history, development process and vision. It then gives an overview of the internship program
and the timeline of the work done here.
In 2016, Emage Vision started to look into developing AI solutions to help customers
improve their manufacturing solutions. In 2019, the Machine Learning software was adopted and
integrated into major customers’ manufacturing lines. By then, these solutions provided
the “eyes” and “brains” of industrial machines.
1.2. Internship Project
In 2020, Emage prototyped a humanoid. This enables us to incorporate “touch” into our
suite of solutions. By combining machine vision, AI and the humanoid, the company gives “eyes”,
“brain” and “touch” to help our customers in their quest to enlarge their footprint in smart
manufacturing.
Today, the company is headquartered in Singapore, with R&D centres in Russia, India and
Vietnam, and field operations in the USA and the Philippines. Our organisation is dedicated to
relentless innovation, delivering on our commitments, and supporting our customers.
• Research and apply other advanced deep learning techniques to solve real company
projects (such as template matching using deep learning, domain-adaptive models, etc.).
Chapter 2
Internship Content
This chapter will go into detail about the implementation of projects during the internship
as an AI Engineer at Emage Development company.
Today, most manufacturing processes rely on machine vision for product quality inspections.
Overkills and false rejects cost money, entail more resources and affect overall efficiency.
These machine-learning solutions can enhance your vision inspections and reduce false
rejects, resulting in better yields. This proven platform can help label, train and predict all your
images, with an accuracy of better than 95%.
AEON (Autonomous Equipment+Operation Networking) is our Neural Network and
Reinforcement Learning software that provides “autonomous” operations and Defect Classifications.
This platform will bring our customers a step closer to a fully automated manufacturing
environment.
OSPREY (Operation Specific Process Recovery Ecosystem) is our real-time trend analysis
utility tool that combines Machine Learning Defects Classification and an Alert Management
System to help process and manufacturing engineers troubleshoot and diagnose faster and more easily.
2.2. Project 1: Research about EfficientNet
Figure 2.2: Model Scaling. (a) is a baseline network example; (b)-(d) are conventional scaling
that only increases one dimension of network width, depth, or resolution. (e) Proposed
compound scaling method that uniformly scales all three dimensions with a fixed ratio.
Depth scaling increases the receptive field of the model: a deeper network can capture
richer and more complex features and generalizes well to new tasks. The challenge is that
vanishing gradients are one of the most common problems as the network gets deeper, so adding
more layers does not always help. For example, ResNet-1000 has similar accuracy to ResNet-101.
Wider networks tend to be able to capture more fine-grained features and are easier to
train. The challenge is that with extremely wide but shallow networks, accuracy saturates
quickly as the width grows.
Resolution (r)
With high-resolution images, the features are more fine-grained, so high-resolution images
should work better. For example, object detection tasks use image resolutions such as
300x300, 512x512, or 600x600. However, accuracy does not scale linearly with resolution.
Figure 2.3: Scaling Up a Baseline Model with Different Network Width (w), Depth (d), and
Resolution (r) Coefficients.
width scaling achieves much better accuracy under the same FLOPS cost.
φ is a user-specified coefficient that controls how many resources are available, whereas
α, β, and γ specify how to assign these resources to network depth, width, and resolution,
respectively.
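For reference, the compound scaling rule that these symbols belong to can be written as follows, using the formulation in [2]. Here d, w and r denote the depth, width and resolution multipliers applied to the baseline network:

```latex
\begin{aligned}
\text{depth: }      & d = \alpha^{\phi} \\
\text{width: }      & w = \beta^{\phi} \\
\text{resolution: } & r = \gamma^{\phi} \\
\text{s.t. }        & \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2, \qquad
                      \alpha \ge 1,\ \beta \ge 1,\ \gamma \ge 1
\end{aligned}
```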
Note:
• In a CNN, convolutional layers are the most compute-expensive part of the network. The
FLOPS of a regular convolution operation is almost proportional to d, w², and r².
• For example, doubling the depth doubles the FLOPS, while doubling the width or
resolution increases the FLOPS by almost four times.
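The note above can be checked with a few lines of Python. This is only a rough sketch: the function name `flops_multiplier` is hypothetical, and it assumes that FLOPS scale exactly as d · w² · r², whereas the relationship is only approximate in practice.

```python
def flops_multiplier(depth: float = 1.0, width: float = 1.0,
                     resolution: float = 1.0) -> float:
    """Approximate factor by which convolution FLOPS grow when the network
    depth, width and input resolution are multiplied by the given factors.

    Assumption: FLOPS is exactly proportional to d * w^2 * r^2.
    """
    return depth * width ** 2 * resolution ** 2


if __name__ == "__main__":
    print(flops_multiplier(depth=2.0))       # doubling depth      -> 2.0
    print(flops_multiplier(width=2.0))       # doubling width      -> 4.0
    print(flops_multiplier(resolution=2.0))  # doubling resolution -> 4.0
```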
The specific architecture of EfficientNet-B0 is shown in Figure 2.5. Its main building block
is MBConv: the Inverted Residual Block used in MobileNetV2, sometimes with a Squeeze and Excite
block added.
Figure 2.5: The architecture for the baseline network of EfficientNet-B0 is simple and clean,
making it easier to scale and generalize.
Scaling parameters: The model has 4 parameters to search for: α, β, γ, and φ. The
compound scaling method is applied to scale the baseline up in two steps:
1. Fix φ = 1, assuming that twice as many resources are available, and do a small grid search
for α, β and γ. For the baseline network B0, the optimal values turned out to be α = 1.2,
β = 1.1, and γ = 1.15, such that α · β² · γ² ≈ 2.
2. Now fix α, β and γ as constants (with the values found in step 1) and experiment with
different values of φ. The different values of φ produce EfficientNets B1-B7.
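The two steps above can be illustrated with a small Python sketch. It takes the α, β and γ values quoted in step 1 as constants and computes the resulting multipliers for a few values of φ. The function name `compound_scale` is hypothetical, and the integer values of φ are chosen only for illustration: they are not claimed to be the exact settings behind the published B1-B7 models.

```python
# Coefficients reported in the text for the B0 baseline (step 1).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi: float) -> dict:
    """Return depth/width/resolution multipliers for a given phi (step 2).

    The 'flops' entry uses the approximation FLOPS ~ d * w^2 * r^2.
    """
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
        "flops": (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi,
    }


if __name__ == "__main__":
    # The constraint from step 1: the product should be close to 2 (about 1.92).
    print(round(ALPHA * BETA ** 2 * GAMMA ** 2, 3))
    for phi in range(4):  # illustrative values of phi only
        factors = compound_scale(phi)
        print(phi, {name: round(value, 3) for name, value in factors.items()})
```

Because α · β² · γ² is close to 2, each unit increase of φ roughly doubles the estimated FLOPS.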
2.3. Project 2: Template Matching using DL
Figure 2.7: The baseline network architecture of QATM. The dashed arrows indicate the
replacement relationship
\[
L(s \mid t) = \frac{\exp\{\alpha \cdot \rho(f_t, f_s)\}}{\sum_{t' \in T} \exp\{\alpha \cdot \rho(f_{t'}, f_s)\}}
\]
This likelihood function can be interpreted as a soft-ranking of the current patch t compared
to all other patches in the template image in terms of matching quality. It can be alternatively
considered as a heated-up softmax embedding, which is the softmax activation layer with a
learnable temperature parameter, i.e. α in our context.
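A minimal Python sketch of this soft-ranking is given below. It is an illustration under stated assumptions rather than the implementation used in the project: ρ is taken to be cosine similarity, features are plain lists of floats, and the names `cosine_similarity` and `soft_ranking` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (assumed choice of rho)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def soft_ranking(template_feats, search_feat, alpha):
    """Likelihood of every template patch for one search patch s.

    Computes a softmax over all template patches t' of alpha * rho(f_t', f_s),
    i.e. a softmax with temperature parameter alpha.
    """
    scores = [alpha * cosine_similarity(f_t, search_feat) for f_t in template_feats]
    max_score = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - max_score) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


if __name__ == "__main__":
    template = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy template patch features
    search = [1.0, 0.1]                               # toy search patch feature
    for alpha in (1.0, 25.0):
        print(alpha, [round(p, 3) for p in soft_ranking(template, search, alpha)])
```

A larger α concentrates the likelihood on the best-matching template patch, while a small α spreads it almost evenly.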
In this way, we can define the QATM measure simply as the product of the likelihoods that
s is matched in T and t is matched in S, as shown in the equation below:
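Following the sentence above and [1], the measure can be written as QATM(s, t) = L(t|s) · L(s|t). The sketch below computes it for every pair of patches from a precomputed similarity matrix. It is a simplified illustration, not the project code: the function names and the list-of-lists layout `similarity[t][s] = ρ(f_t, f_s)` are assumptions made for this example.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    max_value = max(values)
    exps = [math.exp(v - max_value) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def qatm_scores(similarity, alpha):
    """Return the matrix of QATM scores for all (t, s) pairs.

    similarity[t][s] holds rho(f_t, f_s) for template patch t and search patch s.
    """
    n_t, n_s = len(similarity), len(similarity[0])

    # Likelihood normalised over all template patches t' (one softmax per column s).
    over_template = [[0.0] * n_s for _ in range(n_t)]
    for s in range(n_s):
        column = softmax([alpha * similarity[t][s] for t in range(n_t)])
        for t in range(n_t):
            over_template[t][s] = column[t]

    # Likelihood normalised over all search patches s' (one softmax per row t).
    over_search = [softmax([alpha * value for value in row]) for row in similarity]

    # QATM is the product of the two likelihoods.
    return [[over_template[t][s] * over_search[t][s] for s in range(n_s)]
            for t in range(n_t)]


if __name__ == "__main__":
    toy_similarity = [[0.9, 0.1],
                      [0.2, 0.8]]
    for row in qatm_scores(toy_similarity, alpha=10.0):
        print([round(score, 4) for score in row])
```

A pair scores close to 1 only when each patch is the best match for the other, which is why mutually consistent matches stand out.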
Figure 2.8: Some demos of the proposed method on the OTB public dataset
dataset OTB [3]. The results of the proposed method are shown in Figure 2.8, where the red
rectangle is the prediction of my method and the green one is the ground truth.
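One common way to quantify how well a predicted rectangle agrees with the ground truth is the intersection over union (IoU) of the two boxes. The sketch below is only an illustration and is not necessarily the metric used in this work: it assumes boxes are given as (x, y, width, height) tuples, and the function name `iou` is hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is assumed to be an (x, y, width, height) tuple, where (x, y)
    is the top-left corner.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h

    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (5, 0, 10, 10)     # e.g. the red rectangle
    ground_truth = (0, 0, 10, 10)  # e.g. the green rectangle
    print(round(iou(predicted, ground_truth), 3))
```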
Chapter 3
Internship Summary
• Gained deep knowledge of EfficientNet, a core network used in the company, and became
able to build EfficientNet from scratch and use it to solve problems in the company.
• Applied and improved the QATM baseline to solve real-world company tasks, and became
able to write an article based on that solution.
• Learned how to write a report for a manager and how to present solutions to the team
leader and manager.
During my internship at Emage Development, I worked with many colleagues in the company
and learned a lot from their experience. I took part in a range of AI Engineer tasks, which
gave me a clear direction and helped me on my career path.
Once again, thank you to everyone who helped me get the chance to work here.
References
[1] Jiaxin Cheng, Yue Wu, Wael AbdAlmageed, and Premkumar Natarajan. QATM: Quality-aware
template matching for deep learning. In Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition, pages 11553–11562, 2019.
[2] Mingxing Tan and Quoc Le. EfficientNet: Rethinking model scaling for convolutional
neural networks. In International Conference on Machine Learning, pages 6105–6114. PMLR,
2019.
[3] Yi Wu, Jongwoo Lim, and Ming-Hsuan Yang. Online object tracking: A benchmark. In
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.