FINAL WORK (4unidades) PDF
There is a lack of exploitation of software that covers tasks related
to determining the needs and conditions of sentiment
analysis in software engineering text.
Theoretical framework
SentiStrength is software oriented to the semantic content of the
terms of the message. According to Bock, it can be said in general
terms that it is a computer tool dedicated to the quantitative analysis
of texts, having the knowledge that the lexicon, the classes of words
and the associations of these words in the text can be quantified. (J.
K, 1986)
The SentiStrength software uses a lexical approach that exploits a list
of terms related to feelings and has rules for dealing with standard
linguistic and social media methods for expressing feelings, such as
emoticons, exaggerated punctuation, and deliberate misspellings.
(Thelwall, 2016)
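The lexical approach described above can be illustrated with a minimal sketch. It is not the SentiStrength algorithm or its term list: the lexicon, the emoticon table, the `score` function and the boosting rule are all hypothetical and only show how a term list can be combined with rules for emoticons, exaggerated punctuation and deliberate misspellings.

```python
import re

# Hypothetical mini-lexicon; a real sentiment term list is much larger.
LEXICON = {"good": 2, "great": 3, "love": 3, "bad": -2, "awful": -3, "hate": -3}
# Hypothetical emoticon strengths.
EMOTICONS = {":)": 2, ":(": -2}


def score(text):
    """Return (strongest positive, strongest negative) strength found in text."""
    pos, neg = 0, 0
    for token in text.lower().split():
        if token in EMOTICONS:
            value = EMOTICONS[token]
        else:
            # Deliberate misspelling: collapse letters repeated 3+ times
            # ("loooove" -> "love") after dropping trailing punctuation.
            word = re.sub(r"(.)\1{2,}", r"\1", token.strip("!?.,"))
            value = LEXICON.get(word, 0)
            # Exaggerated punctuation ("!!") strengthens a sentiment term by 1.
            if value and token.endswith("!!"):
                value += 1 if value > 0 else -1
        pos, neg = max(pos, value), min(neg, value)
    return pos, neg


print(score("I loooove this tool!! :)"))
print(score("This build is awful!!"))
```

Reporting the positive and negative strengths separately, instead of a single sum, keeps mixed messages visible; this is a design choice of the sketch.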
This tool can identify (by establishing the appropriate conditions) the
author of a text; define the peculiarities of the speech of the person
who made a certain writing; define the mood, sex, age, cultural
background, of the author. Sentiment analysis in software engineering
text has attracted immense interest recently. The poor performance
of general-purpose sentiment analysis tools, when operating on
engineering texts, has led to the recent emergence of domain-specific
sentiment analysis tools specially designed for software engineering
text. (Islam & Zibran, 2018)
Machine learning was born in the field of computer science, and more
specifically, artificial intelligence. It is a type of computer program
whose data processing is a kind of learning. In other words: the
machine is not programmed to respond in a certain way according to
the received inputs, but rather to extract behavior patterns from the
received inputs, and based on said learned or assimilated information,
perform the evaluation of new entries. The internal algorithms that
constitute the basis of this learning have a strong statistical and
algebraic component, with the consequent calculation capacity.
(Rabab'ah, M. et al., 2016)
Applied to texts, sentiment analysis detects the
positive, negative or neutral tendency of a given text. "Studies indicate
that sentiment analysis tools provide unreliable results when used out
of the box, as they are not designed to process SE data sets." (Bin
et al., 2018)
The purpose within engineering is focused on detecting the needs of
customers in the references on social networks and thus better adapt
to their demands. The benefits of sentiment analysis are numerous
and important. The ability to extract information from data from social
networks is a practice that organizations are already adopting
worldwide. Changes in sentiment on social media have been proven
to correspond to changes in the stock market. (Cardoso et al., 2019)
The great challenges that sentiment analysis still faces have to do with
the problems posed by the use, by the speaker or author of the text,
of irony and figurative language. SentiStrength uses algorithms
based on machine learning, so the degree of reliability depends on the
data with which the machine has been fed. SentiStrength-SE
achieves 73.85% precision and 85% recall, which are significantly
higher than those of a state-of-the-art sentiment analysis tool it was
compared with. (Islam & Zibran, 2017)
The arrival of Web 2.0 and the popularization of microblogging social
networks such as Twitter have catapulted this field of Artificial
Intelligence research towards the highest levels of interest and
notoriety due to the indisputable importance of being able to obtain
the degree of evaluation of thousands of people at every moment for
companies, organizations, governments and consumers. This large
amount of information together with the increase in the computing
power of computers have made possible the application of machine
learning techniques for the classification of texts based on their
sentiment polarity and have opened a door to what will undoubtedly
be one of the most important research and development areas of the
coming years.
Many studies are conducted to study the sentiments presented by
social media users regarding different topics. Sentiment Analysis
(SA) is a new field that deals with measuring the sentiment
presented in a given text. Due to its broad set of applications,
several SA tools are available. Most of them are designed for
English text. (Rabab'ah et al., 2016)
In recent years, increasing attention has been paid to the social
aspects of software engineering, including studies of emotions and
feelings experienced and expressed by software developers. Most
of these studies reuse existing sentiment analysis tools, such as
SentiStrength and NLTK. However, these tools have been trained in
product reviews and movie reviews and therefore their results may
not be applicable in the domain of software engineering. (Jongeling
et al., 2015)
The priority of the present investigation is to describe the
qualities and characteristics of the SentiStrength software, as well as
the relationship between the software application and
industry in general. The research is therefore correlational and
descriptive in scope.
There is a lack of exploitation of the software that includes tasks
related to the determination of the needs and conditions in the
analysis of sentiments in the text, within the industry.
The analysis of variance (ANOVA) technique, also known as factor
analysis and developed by Fisher in 1930, constitutes the basic tool
for studying the effect of one or more factors (each with two or more
levels) on the mean of a continuous variable. It is therefore the
statistical test to use when you want to compare the means of two or
more groups. This technique can also be generalized to study the
possible effects of factors on the variance of a variable.
The null hypothesis from which the different types of ANOVA start is
that the mean of the variable studied is the same in the different
groups, in contrast to the alternative hypothesis that at least two
means differ significantly. ANOVA allows you to compare multiple
means, but it does so by studying the variances.
The basic operation of an ANOVA consists of calculating the mean of
each of the groups and then comparing the variance of these means
(variance explained by the group variable, intervariance) versus the
average variance within the groups (that not explained by the group
variable, intravariance). Under the null hypothesis that the
observations of the different groups come from the same
population (they have the same mean and variance), the weighted
variance between groups will be the same as the average variance
within the groups. As the group means are further apart from each
other, the variance between means will increase and will no longer be
equal to the average variance within the groups. (Joaquín Amat
Rodrigo, 2015)
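The comparison of variance between groups against variance within groups can be sketched as follows. This is a minimal illustration of a one-way ANOVA F statistic using only the standard library; the function name `one_way_anova_f` and the sample data are assumptions made for the example, not part of the cited source.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F statistic for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between-group sum of squares: variance explained by the group variable.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: variance not explained by the group variable.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical data: three groups of three observations each.
f_value = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_value)
```

An F value close to 1 is what the null hypothesis predicts; the further the group means are from each other, the larger F becomes.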
The T test was initially designed to examine the differences between
two independent and small samples that have normal distribution and
homogeneity in their variances (in the original article, the author does
not define what is a large and / or small sample). Gosset emphasizes
the normality of the two samples as crucial in the development of the
test. (Alberto, 2015)
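As a minimal sketch of the test described above, the t statistic for two independent samples with a pooled variance (that is, assuming homogeneous variances) can be computed as follows. The function name `students_t` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def students_t(sample_a, sample_b):
    """Return the t statistic for two independent samples (pooled variance)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical small samples.
t_value = students_t([1, 2, 3], [2, 3, 4])
print(t_value)
```

The statistic is then compared against the t distribution with n_a + n_b - 2 degrees of freedom; that lookup is left out of the sketch.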
To illustrate the above, consider an investigation in which the t-test
was applied. A group of
university students drew a sample from the population of
Systems Engineering students at the University of Cartagena, which
is made up of a total of 350 active students. The research
sought to capture the feelings of the students in
relation to the measures established by the University for the
online classes brought about by the pandemic.
Once the sample was estimated, a data collection instrument was
designed with 7 questions (3 quantitative and 4 qualitative). To obtain
the representative sample of students, SentiStrength text sentiment
analysis was used. (CHANCHI et al., 2020)
In this example, our samples are the two kinds of question variables,
qualitative and quantitative, and our universe is the 350 students
interviewed from the University of Cartagena. The
variables (questions) used are represented next.
By way of conclusion, and returning to the example above,
that investigation presented the results obtained
from the questions posed. Regarding question 1,
related to the perception of the academic strategies proposed by the
University during the confinement period, the following graph was
obtained.
It can be seen from the results presented in the graph that 58.5% of
the students surveyed agree and completely agree with the strategies
developed by the University (13% completely agree and 45.5%
agree). Similarly, 35.1% of the students consider that the strategies
were acceptable and 6.5% of the students indicate that the strategies
were not adequate. (CHANCHI et al., 2020)
This product is of a business nature. It is a supply that, although not directly involved in
the production process, is essential for improving both that process and the final product,
based on a statistical analysis of the population of product
consumers. This study examines how to bring it to market, covering all the
relevant factors needed to achieve that objective.
Within the business sector, in both medium and large companies, the use of technological
tools has increased since the appearance of the internet. According to the National Institute
of Statistics (INE), 64.7% of companies that use computers do so for business purposes,
and 34.9% corresponds to purchases through electronic commerce by companies in 2019.
(Survey on the use of ICT and electronic commerce in companies. Year 2019 - First quarter
of 2020, 2020)
On this basis, about 65 out of 100 companies use technological tools such as SentiStrength,
with an annual increase of 3%.
According to a report from Chile, given the various benefits of sentiment
analysis software explained in the article, such tools are considered highly relevant
and necessary for competing in the business sector: 9 out of 10 large
companies use them, compared with only 4 out of 10 medium-sized companies (El Universo, 2011).
The same article mentions the range of prices for this type of software: "…
the value of a tool varies between US $ 5,000, for basic services to US $ 50,000 in Premium
services such as the one used by Microsoft in the US market." (2011)
According to the information collected within this study, the following situations could be
observed. Graph 1 shows that 75% (orange fraction) of the respondents did not know
of sentiment analysis software as a tool for industry, which points to
a possible market: since they do not know it, it has not yet been implemented as a
marketing tool. Meanwhile, 16.7% (blue fraction) had heard of the use of this
software at an industrial level.
Within graph 2, 33.3% (orange fraction) of those surveyed believe that the feelings
(emotions, comments, reactions, etc.) expressed in social networks are not relevant in the
perception of the product / brand, limiting the condition of need within the market. Similarly,
respondents who believe that feelings are relevant (blue fraction) and those who do not
know (yellow fraction) are placed in the same percentage.
In graph 3, 75% of the respondents (orange and blue fractions) believe that the
software is innovative or new, which favors its introduction to the market, while
25% (yellow fraction) already knew the software.
Within graph 4, it can be seen that 83.3% of the respondents were attracted to being able
to know through the sentiment analysis of this software the deficiencies of its production.
Graph 4. Did you know that this software can help
you discover the deficiencies within your
And finally, within graph 5, 100% of the respondents are interested in the benefits that this
software offers, opening great expectations of the demand that this software can generate,
mainly in the market of medium or small companies.
In relation to the above, the prices set by the giants of
the software industry were analyzed, to give an example and an idea
of the prices that dominate the market, which are represented below.
The different companies offer the same type of software, either as
an annual subscription with its updates, in three
editions (standard, plus or professional), or as a one-time installation
of portable software, also offered in three editions. (Online
Shop | MAXQDA, 2020)
Price offer
The prices of this software therefore range between 400.00 and 1800.00 USD,
according to the services, categories and updates offered.
The demand is not always fully satisfied, therefore, it must be established what
percentage of the unsatisfied demand exists within the computer industry. The
unsatisfied demand is known from the following equation:
Unsatisfied demand = Demand - Supply
To calculate demand, variables such as GDP, inflation and the exchange rate
are taken into account. Based on secondary sources, demand is
projected using data from the Mexican Free Software
Business Association (AMESOL), which gives the percentages by which
demand grew.
2016 11%
2017 11.3%
2018 13.4%
2019 14.2%
2020 14.8%
2021 15.6%
The current supply for the year 2021 is 15.6% and it is intended to start
entering the market with a production of 0.6%
We proceed to do the calculations:
Supply 2021 = 15.6%. Since our capacity is monthly,
we need to convert it to months.
\[ \text{Workshop capacity } (0.6\%) = \frac{\text{units}}{\text{month}} \times \frac{12\ \text{months}}{1\ \text{year}} \]
Finally:
\[ \text{Unsatisfied demand}_{2021} = \text{Demand}_{2021} - \text{Supply}_{2021} = 15.6\% - 6.9\% = 8.7\% \]
The unmet demand is 8.7% for text sentiment analysis software within the
business sector, per year of the total market.
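The unsatisfied-demand equation can be checked with a short sketch. The function name `unsatisfied_demand` is hypothetical; the two input percentages are the ones used in the final step of the calculation above.

```python
def unsatisfied_demand(demand_pct, supply_pct):
    """Unsatisfied demand = demand - supply, in percentage points."""
    # Rounded to one decimal to avoid floating-point noise in the difference.
    return round(demand_pct - supply_pct, 1)


print(unsatisfied_demand(15.6, 6.9))
```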
The price will not be higher than the competition, keeping prices competitive
and attractive to the market.
Current forms of payment will be added for the safety of the prospective
First, the product will be distributed through digital
platforms in 80% of cases, with the remaining 20% intended to be distributed by
land transport. Digital distribution will use a single platform
that shows the different systems for sale, as well as the forms of payment and
distribution, collecting user information and verifying that it is not false. The
distribution channel chosen was considered to be the most effective, safe and
economical, in addition to being the best suited to the type of product
The products will be distributed directly to the merchants who
require this service. Companies may request the service with the desired
characteristics, and the product will be delivered to the company promptly, following the
due process of delivery in time, place and form.
In this case, an intermediary will not be necessary because all purchases will
be made directly or with a sales agent who will install the software, and it can
also be requested by email to give details of the desired product.
The services are tied to the software. It is a product that is not yet
widely exploited within the software industry, so very few
companies know about it. Demand for the services we offer is high in some cases,
where the company requires them, and low in others, where the consumer
does not know about the service.
In conclusion, the service offered is novel in the market because the product has been little
exploited. However, the atypical situation the world is going through is bad for
business: the primary and secondary sources did not make the product known or
exploit the potential it deserves, and companies have less purchasing
power. This affects the market, so the beginning, or perhaps the whole
introduction of the service to the market, will be difficult. This can
be addressed by investing in advertising and making companies more familiar with the
service, while trying to be better than the competition. Demand is
intermediate because it has ups and downs, and the service does not yet seem to perform well
in the industry. But if we learn to exploit this service, or target worldwide
applications that improve the analysis of speech in text, it may prove to be successful.
Nothing ventured, nothing gained: every business involves a risk that can turn out positive or
negative for the company.
Unit III
Production within this plant will be by project, ranging from the use of
already defined software without special characteristics to the manufacture of customized
software tailored to the deficiencies and particularities of the company,
or to the areas on which it seeks to focus.
For already established programs, production would take 1 month
to complete the installation. Because demand comes from large corporations, it
would be very low; the maintenance and care of the software
programs within the plant would therefore be the main and vital occupation.
The infrastructure and tools would be limited going to the background, while the
greatest raw material and tool would be the trained personnel, for care and
Machinery such as hard drives, electricity, computer systems and the internet
would make up part of the raw material, while our computer systems engineers,
telecommunications engineers, electrical engineers and other personnel who support our
growth and maintenance would be part of the machinery.
C = 150,000 dollars
I = 300,000 dollars
i = 30%, t = 3
\[ CT = (300000)(150000) + \sum \frac{\dots}{(1 + 30)^{3}} \]
\[ CT = 4.500000001 \times 10^{10} \]
Relevant factors       Assigned   Scenario A          Scenario B
                       weight     Score   Weighted    Score   Weighted
Personnel              0.3        7       2.1         5       1.5
                       0.2        5       1.0         7       1.4
Offices                0.1        8       0.8         4       0.4
Proximity to market    0.3        8       2.4         5       1.5
                       0.1        5       0.5         7       0.7
Total                  1.0                6.8                 5.5
Our plant should focus mainly on scenario A, given the good prospects of
finding personnel due to the concentration of companies around it, as well as the
proximity of the market and the possibility of having larger and better located offices,
although with some deficiencies in transportation and in finding nearby raw materials for
the plant.
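The weighted-factor comparison behind this choice can be sketched as follows. The weights and scores are read from the scenario table above; the label "B" for the second scenario, the names `ROWS` and `weighted_total`, and the comments marking unnamed factors are assumptions of this sketch.

```python
# (weight, score in scenario A, score in scenario B), in table row order.
ROWS = [
    (0.3, 7, 5),  # Personnel
    (0.2, 5, 7),  # factor name not legible in the source
    (0.1, 8, 4),  # Offices
    (0.3, 8, 5),  # Proximity to market
    (0.1, 5, 7),  # factor name not legible in the source
]


def weighted_total(rows, column):
    """Sum of weight * score for one scenario column (1 = A, 2 = B)."""
    return round(sum(row[0] * row[column] for row in rows), 2)


total_a = weighted_total(ROWS, 1)
total_b = weighted_total(ROWS, 2)
best = "A" if total_a > total_b else "B"
print(total_a, total_b, best)
```

The weights add up to 1, so each total stays on the same scale as the individual scores.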
The distribution of the plant is, first of all, entirely virtual, carried out with electronic
software tools. It begins with the user specifying the requirements;
next, the requirements the client provides for the software are analyzed
and implemented; finally, the system is produced and tested. The cycle becomes
closed, since only the specifications or requirements that they request are
At the same time, another distribution is the one the user formulates within
the program request, as shown in the graph, reflecting the operations to be
carried out: how to present the problems to be solved, the logic and direction
of the business, the tools that provide data, such as social
networks, and finally the package the user wants to hire.
a) Provider: must contain digital processors, 500 TB hard drives, business capacity
c) Dimensions: the dimensions are relatively small, with a size of 1 meter high by 6
meters long, with all the machinery ready for operation.
d) Capacity: its capacity must process 500TB of information at a time, without suffering
j) Cost of freight and insurance: $ 30,000 for workers and machinery insurance.
The production flow begins with the collection of data to define the
software format, passes through the production area, and then moves on to the
quality and errors area, ending with the storage of data on high-caliber
processors, which are finally taken to the installation area.
• Reception: 5x5 m2
• General Management: 6x6 m2
• Administration: 8x6 m2
• Production manager: 8x8 m2
• Production area: 50x 30 m2
• Quality area: 15x10 m2
• Data storage: 7x5 m2
• Data collection: 15x8 m2
• Bathrooms: 10x10 m2
• Dining room: 10x10 m2
The objective of this economic analysis is to help the board determine whether the
financing decisions were the most appropriate, and in this way to determine
the future of the investments. The elements of the
analysis should be understood as those that allow the comparison of financial ratios and the different
analysis techniques that can be applied within the company.
Certain relevant factors must be considered in the production
budget and the initial capital, such as raw material, labor and energy, among others;
they are presented below:
Production factors        Capital
RAW MATERIAL              $1,000,000
LABOR                     $2,040,000
ENERGY (electric)         $960,000
PACKAGING                 $500,000
TOTAL                     $4,500,000

Production factors        Capital
TOTAL                     $500,000

Production factors        Capital
TAXES                     $,600,000
TOTAL                     $1,000,000
The initial investment includes the acquisition of all fixed or tangible and deferred
or intangible assets necessary to start the operations of the company, with the
exception of working capital.
Fixed investment. Fixed costs
Programs                  10,000
Machinery and equipment   500,000
Building rent             300,000
Furniture and fixtures    100,000
Vehicles                  500,000
Others                    50,000
Deferred investment. Variables
Studies and research
Organization expenses
Patents and licenses
Pre-operation interest
Depreciation and amortization.
The Tax depreciation of Fixed Assets is the deduction to which Taxpayers who
pay taxes in the Tax Administration Service are entitled, this deduction applies to
all Legal Persons, Individuals except those who are only with Wages and
Amortization is the expense for loss of value of fixed assets estimated each year.
Thus, each year the result decreases by that amount as one more expense for
the use of assets and, by that same amount, the value of the asset is reduced on
the balance sheet.
Within this company each asset will be amortized over 8 years, to
determine the loss over this period. Equal wear is
assumed every year, so amortization will be an annual expense for
8 years.
The initial purchase amounted to an investment of 1,110,000, and the
amortization after 8 years gave a loss of 570,000, which represents 51.35% of the
initial investment.
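The figures above can be verified with a short sketch, assuming equal (straight-line) wear each year as the text states. The function name `straight_line` and the variable names are hypothetical; the annual charge is derived from the totals given and is not stated in the source.

```python
def straight_line(total_loss, years):
    """Equal annual amortization charge over the given number of years."""
    return total_loss / years


initial_investment = 1_110_000
loss_after_8_years = 570_000

annual_charge = straight_line(loss_after_8_years, 8)
# Share of the initial investment lost to amortization, in percent.
share = round(100 * loss_after_8_years / initial_investment, 2)
print(annual_charge, share)
```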
The breakeven point is a term used to define the moment when a company covers
its fixed and variable costs; that is, when income and expenses are at the same
This is calculated according to the following formula:
The software has a production value of 11,500 and a market value of
between 3,850 and 4,000 pesos:
\[ pe = \frac{11{,}500}{4{,}000 - 3{,}850} = 76.6666 \]
Therefore, the breakeven point is reached with the sale of 77 software units per
year; more is profit, less is loss.
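The breakeven calculation can be reproduced with a minimal sketch that mirrors the expression used in this document (production value divided by the difference between the two market values). The function and parameter names are hypothetical.

```python
import math


def breakeven_units(production_value, upper_price, lower_price):
    """Breakeven quantity as computed in this document."""
    return production_value / (upper_price - lower_price)


pe = breakeven_units(11_500, 4_000, 3_850)
# Units are whole, so the breakeven quantity is rounded up.
units = math.ceil(pe)
print(pe, units)
```

Rounding up rather than to the nearest integer ensures the costs are fully covered at the reported quantity.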
The following is a sales estimate for the breakeven plot.
[Figure: Breakeven point. Series: income, production.]
Semana. (2011, January 31). Estos son los líderes mundiales del mercado del software. Últimas Noticias de Colombia Y El Mundo.
Author: Jorge Herbert Acero
How to put together your business plan?
A business plan is always moving; it is a living system. Both
extremes should be avoided: not documenting anything and excessively documenting
everything in a short time. Identify or publicize the initial product, then modify the
service or product to improve it and check whether it satisfies the customer enough to generate
sales. If we do not generate sales, we have not understood what our product is,
and the result could be bankruptcy.
The business model describes the fundamentals of how an organization creates,
develops and captures value. The Canvas model helps to understand processes and
to work with them in harmony. Problems can be identified
with different colors: red marks danger, and it is where we should worry because
it is where we are failing.
This process is made up of 9 blocks which explain and identify the value systems
that we need in the process, there are two things that we must always take into
account, one of them is how to sell more and the other is how to spend less.
The Canvas model is divided into HOW, WHAT and WHO, the components that
give the business a structure and show how to meet the
client's requirements.
Once we have the product, we have to keep evolving as the customer's
landscape changes, since everything grows, skills included, and the consumer will go with
whoever offers better services.
To build the business model from the 9 blocks, start with the customer's
needs, because that shows what your product can offer based on
what is needed; with this you can lay out the elements of your business plan.
BLOCK 1: The client is the most important element of a business plan, and the sales
process is the most delicate point in any business model.
This block discards the idea that anyone in the world is our client, which is
not true, and emphasizes having a more competitive product or service in terms of
price. The important thing for the company is to make sales and to give the
customer the satisfaction of what they need, because the customer is the heart
of the business model.
BLOCK 2: The value proposition aims to surprise the client with our
product. It is necessary to understand the differential that we will give the
product so that clients see us as an option that satisfies their needs and as a more
viable option in the market. Having a different offer means having better performance
or better quality, which is what you should look for. The differential in the product is
important to enter the market, so our proposals have to be innovative and not
compete on price with other companies, which could be larger
BLOCK 3: You have to find the processes to meet the demand: the
distribution channels used to deliver the product or service. At this point you must
define how the product will be delivered. Channels are the key to connecting
with the customer and play an important role in receiving feedback on satisfaction
with our product.
BLOCK 4: The relationship with the client is important for knowing whether
they are satisfied, which serves to improve the product. It promotes
constant communication with customers about the
product. This communication can be personalized or
automated, and it serves to acquire new customers, retain customers or increase customers.
BLOCK 5: In the flow of income, first identify how much the client is willing to
pay compared to the competition, and build a solid structure with respect to
the competition so as to know how to react in price wars,
since the price needs to be attractive; this depends on how you present it to be perceived.
BLOCK 6: The key resources block describes the challenges to face and the understanding
that sometimes infinite resources are not needed to develop a product. It asks what
resources are necessary to start developing the promised product. The key
resources are those needed to purchase materials for the development
of the product, within a budget limited to essential things. That is why we
have to identify what the key resources are, so as not to exceed
the limits.
BLOCK 7: This block asks what the minimum key activities are to be able to start
selling the product, analyzing only the activities needed to sell and to fulfill the
valuable activities.
BLOCK 8: People who are related to the business and can add value to it, as you can to
them. A network of alliances is important to develop value in the product.