This section first defines sustainability and describes the factors involved in the sustainability of OSE. It then proposes sustainability metrics for OSE and constructs an Evaluation Indicators System based on the hierarchical structure of the sustainability measurement model for natural ecosystems.
3.1. Definition of Sustainability
The term “sustainability” was first proposed by ecologists as “ecological sustainability.” It describes the balance between natural resources and their development and utilization. Studies have shown that natural ecosystems have a heuristic effect on OSE, and many aspects of OSE can be extended from natural ecosystems. For example, Dhungana et al. [10] compared the natural ecosystem with the OSE in terms of diversity, sustainability, and energy flow, pointing out that the OSE is highly similar to the natural ecosystem. Moreover, a sustainable system is defined as one that can increase or maintain its community of users or developers over a longer period of time and can withstand sudden changes. Li Bing et al. [22] discussed the composition, structure, model, and characteristics of OSE with reference to the natural ecosystem, proposed a knowledge chain structure for OSE analogous to the food chain of the natural ecosystem, and further proposed that sustainability refers to a process or state that can be maintained for a long time. In addition, Wan Jiangping et al. [5] proposed that the sustainability of OSE means that the system can continuously tap its inherent potential. Penzenstadler et al. [7] proposed that sustainability can be defined as maintaining the functionality of a system over a defined time span. In general, “sustainability” refers to the ability to maintain or support an objective thing permanently or indefinitely. In a dynamic sense, it means the ability to sustain or support the healthy existence and development of a system, persistently or indefinitely. For a sustainable ecosystem, interaction with the environment ultimately promotes systemic health and development, rather than degradation or extinction. Accordingly, we define open source ecosystem sustainability as the potential of the software ecosystem to permanently sustain or support its own dynamic health and evolution. This definition has two meanings: first, the system is now in a state of steady development; second, the system will continue to be in that state in the future.
3.2. Evaluation Indicators System for the Sustainability of OSE
The Evaluation Indicators System of OSE sustainability is a set of indicators that reflects the overall development status from different aspects. In the natural ecosystem, researchers propose different factors for assessing sustainability, such as openness [22,26], stability [27,28], integrity [29], productivity [30,31], regulation [22], resistance [4], diversity [26,31,32], drivers [4], organizational structure [33], a propensity for growth [2], and conformity with development trends [14].
Table 1 shows the meaning of these influencing factors. Inspired by the assessment of the sustainability of natural ecosystems, this article analyzes the factors affecting the sustainability of OSE. Based on the description of the sustainability assessment factors for natural ecosystems in Table 1, we find that some of these concepts are similar or overlap. Openness and integrity both emphasize the relationship between an ecosystem and its external or internal environment, so we merge them into openness. Stability covers avoidance, tolerance, and resilience. The most common indicators for assessing ecosystem stability are organizational structure and diversity; therefore, we group the related factors of regulation, resistance, diversity, and organizational structure under stability. Productivity and drivers represent the frequency of species activity within an ecosystem, so we define them as activity. The propensity for growth and the ability to conform to development trends represent the capacity for ecosystem development, so we define them as extensibility.
Among these factors, openness is the prerequisite for the sustainable development of ecosystems and represents the communication function of the entire ecosystem with the external environment; stability is the basis of sustainability and represents a relatively stable state in the long-term development of the system; activity assesses the driving power of sustainability and represents the frequency of species activity within the system, that is, the vitality of the system; extensibility reflects the possibility of future development and represents the system’s ability to develop into the future. These four factors also reflect the two meanings of sustainability in our definition. Openness and stability reflect the current state of steady development, while activity and extensibility reflect the capacity to continue developing. On this basis, we choose openness, stability, activity, and extensibility as indicators to measure the sustainability of OSE. Figure 1 shows the indicators for assessing ecosystem sustainability.
Based on these characteristics, and with reference to the PSR (Pressure-State-Response) model [34] for the sustainability of natural ecosystems, three levels (target, guideline, and evaluation) are established to analyze the sustainability of OSE. The target level states the main purpose of the evaluation system, which is to measure the sustainability of OSE. The guideline level contains the guiding principles followed to achieve that goal; this paper uses openness, stability, activity, and extensibility as the guideline-level indicators. The evaluation-level indicators are the influencing factors determined from the participants’ activities. In the ecosystem, the participants’ activities are a reflection of group collaboration. In the Internet environment, dispersed, independent participants with different backgrounds are openly convened by specific organizations or individuals, voluntarily form network teams based on common interests and perceptions, and then collaborate to complete high-quality, complex team work. Group collaboration is also an important mechanism for the operation of OSE. Therefore, the evaluation level determines the factors that influence the guideline-level indicators according to the various activities of the ecosystem participants, in order to measure the sustainability of OSE. This also makes the Evaluation Indicators System applicable to other OSE with a group collaboration mechanism, such as GitHub, Stack Overflow, SourceForge, and Topcoder. Figure 2 shows the Evaluation Indicators System for the sustainability of OSE. The following describes each indicator in detail.
3.2.1. Openness
Openness represents the ability of the entire ecosystem to communicate with the outside environment and the ability of users within the ecosystem to participate. Therefore, this paper evaluates the openness of OSE from these two perspectives. The communication between the ecosystem and the external environment is mainly reflected in the interaction between people and the ecosystem. Anyone can register, submit content, view questions, download code, or engage in other related activities. New users may join the community at any time, so communication with the outside is possible at any time. In addition, outbound links to other software ecosystems indicate how the ecosystem is connected to, and relied upon by, other ecosystems. For example, bugs in a GitHub project can be resolved quickly by posting questions to Stack Overflow, which accelerates the project development process. This speedy resolution indicates that there is communication and mutual promotion between the GitHub ecosystem and the Stack Overflow ecosystem. Therefore, this paper uses the number of new users and the number of outbound links as indicators of the ability of the ecosystem to communicate openly with the outside environment.
Internal communication in the ecosystem is primarily reflected in the interaction among users within OSE. The open source ecosystem provides users with interactive channels, such as mailing lists, bug libraries, and forums. The more active the responses on a mailing list, the more users are likely to contribute to and use the project. Bug fix time represents the speed at which problems are reported and solved in the ecosystem. The relevance of users to each other within the forum represents the degree of intimacy between users. The project related number is also an important index for evaluating the internal openness of the ecosystem. It refers to the number of connections between a software product and other projects and represents the exchange and sharing of information between projects. The relationships between projects can be roughly divided into three categories: similarity, co-use, and dependence. Similarity means that two projects have common functional modules; co-use means that multiple projects are related to a common goal; “project A depends on project B” means that project B is a library, framework, or other necessary component of project A. Therefore, this article uses the mailing list response, bug fix time, user relevance, and the project related number as indicators of internal communication within the ecosystem.
3.2.2. Stability
Stability means that the state of the ecosystem does not fluctuate strongly over time and instead remains stable. First, long-term contributors are significant to the stable development of the ecosystem. In OSE, only a small number of users actively participate in a project, solve project problems in a timely manner, accelerate project progress, and make long-term contributions to the stability of the ecosystem. In addition, the diversity of developer roles and the diversity of project attributes help enhance the resistance and recovery capabilities of the software ecosystem. Through the diversity of roles, when a developer is disrupted, the task can be temporarily taken over by other developers with similar roles; through the diversity of project attributes, if the only developer with a certain role in a team leaves, a replacement with the ability to fill that role can be drawn from a wider pool, so that the project development process can return to normal sooner. If developers form partnerships and a developer network emerges, fewer developers will be lost, and the stability of the ecosystem will be maintained. For example, developers on GitHub gather on the website to develop projects, so a developer network can naturally be built with projects as the links. Developers who work on the same project belong to the same social network, and the more projects two developers have in common, the closer they are. Therefore, the long-term contributors, the diversity of developer roles, the diversity of project attributes, and the partnership among developers can be used as important indicators for measuring ecosystem stability.
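The developer network described above, in which projects act as links between developers, can be illustrated with a minimal sketch. The function name `shared_project_counts`, the input layout (a mapping from project name to member list), and the sample data are all hypothetical and are not taken from the paper; the sketch only counts how many projects each pair of developers has in common, which the text treats as a signal of closeness.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Tuple


def shared_project_counts(
    project_members: Dict[str, List[str]]
) -> Dict[Tuple[str, str], int]:
    """Count the projects that each pair of developers has in common.

    project_members maps a project name to the developers working on it.
    Each key of the result is an alphabetically ordered developer pair.
    """
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for members in project_members.values():
        # De-duplicate and sort so each pair is counted once per project.
        for a, b in combinations(sorted(set(members)), 2):
            counts[(a, b)] += 1
    return dict(counts)


# Hypothetical data: two projects with overlapping contributors.
network = shared_project_counts(
    {"proj_a": ["alice", "bob"], "proj_b": ["alice", "bob", "carol"]}
)
print(network[("alice", "bob")])  # 2
```

A pair with a higher count would be a stronger edge in the network; how such edge weights feed into the partnership indicator is left open here.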
3.2.3. Activity
Activity is a comprehensive manifestation of the overall life and functioning status of the ecosystem. The internal members are the carriers of ecosystem vitality, and the constant sharing and exchange of information among them keeps the ecosystem in an active state of development. Therefore, this paper uses the efficiency of information transmission in the ecosystem as the evaluation criterion for activity. The efficiency of information transmission indicates the frequency and contribution of developers’ activities. For example, developers constantly update and add lines of code or solve problems in a project in a timely manner, passing this information to other developers so that they can better update the project. At the same time, the more often a project is downloaded, the more end users use the product, which also demonstrates the activity of the project.
3.2.4. Extensibility
Extensibility indicates the dynamics of ecosystem development and its capacity for growth. It is evaluated mainly in terms of three aspects: popularity, growth force, and service platform compatibility. Popularity shows the evolutionary trend of OSE. Taking programming languages as an example, if the languages used in the projects of an open source community are consistent with the languages that were popular at the time, the community is in line with the development trend, which also indicates that OSE has developed dynamically over time. Growth force means the growth ability of emerging languages in OSE. In recent years, emerging languages such as Swift and Python have risen, and their adoption indicates whether OSE can keep pace with the times. That is, the dynamic development of OSE can be measured by analyzing whether the number of projects related to emerging languages such as Swift and Python increases. Service platform compatibility is an important indicator for measuring the quality of software. If the software can run on multiple platforms, it demonstrates good extensibility. For example, many projects on GitHub can run on operating systems such as Windows, OS X, and Linux without being recompiled or changed for each platform. This means that such a project extends well and can be widely used on different platforms. The service platform compatibility indicator therefore measures whether a piece of software can provide services across platforms.
3.3. Sustainability Measurement Method
For the sustainability measurement of the Open Source Ecosystem, Sun Lianshan et al. [21] decompose the sustainability of the Open Source Ecosystem into three aspects, namely a commercial quality model, a product quality model, and a collaborative quality model, and conduct a qualitative analysis. Srba et al. [11,12] conducted a case study of the negative development of the Stack Overflow (SO) community, suggesting that the long-term sustainability of the Q&A ecosystem can be supported by a strong reputation mechanism and adaptive support for respondents. Weiss et al. [8] proposed the “Guidelines for Sustainability,” using software engineering methods to provide guidance for system sustainability measurement. At present, research on ecosystem sustainability is mainly theoretical and does not quantitatively analyze the sustainability of OSE. Therefore, based on the Evaluation Indicators System for the sustainability of OSE, this paper provides a method for the quantitative analysis of the sustainable state of ecosystems. According to the Evaluation Indicators System, the weight of each indicator is calculated, and a sustainability assessment model is proposed for a comprehensive assessment of the sustainability of the ecosystem in question.
3.3.1. Weight Calculation Method Based on Information Contribution Value
The evaluation of the sustainability of open source ecosystems is a multi-indicator quantitative comprehensive evaluation process. For the sustainability of open source ecosystems, there are certain differences in the degree of impact of different evaluation indicators. In order to reflect this difference, it is necessary to assign a certain weight to each evaluation indicator.
As the types of indicators in the indicators system are complex and their units and magnitudes differ considerably, the original data are mapped into the interval [0,1] by the Range Method [35]. The original data matrix is $X = (x_{ij})_{m \times n}$, and the normalized matrix is $X' = (x'_{ij})_{m \times n}$. The specific formula is:

$$x'_{ij} = \frac{x_{ij} - \min x_j}{\max x_j - \min x_j}$$

where $x_{ij}$ and $x'_{ij}$ respectively represent the original and standardized values of the $j$-th sample of the $i$-th indicator, and $\max x_j$ and $\min x_j$ respectively represent the maximum and minimum values of the $j$-th sample.
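The range standardization can be sketched in a few lines of Python. This is a minimal illustration under two assumptions that the text does not settle: the maximum and minimum are taken across the samples of each indicator (one row of the matrix), and an indicator whose samples are all equal is mapped to 0 to avoid division by zero. The function name `range_normalize` and the sample values are hypothetical.

```python
from typing import List


def range_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Map each indicator's samples into [0, 1] with range (min-max) scaling.

    rows[i][j] is the j-th sample of the i-th indicator (an m x n matrix).
    Assumption: the extremes are taken over the samples of one indicator.
    """
    normalized = []
    for values in rows:
        lo, hi = min(values), max(values)
        span = hi - lo
        if span == 0:
            # Assumption: a constant indicator has no spread; map it to 0.
            normalized.append([0.0 for _ in values])
        else:
            normalized.append([(v - lo) / span for v in values])
    return normalized


# Hypothetical raw data: two indicators observed over three samples.
print(range_normalize([[10.0, 20.0, 30.0], [5.0, 5.0, 5.0]]))
# [[0.0, 0.5, 1.0], [0.0, 0.0, 0.0]]
```

Scaling each indicator separately keeps indicators with very different units, such as bug fix time and number of new users, on a comparable footing before weights are computed.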
According to the definition of sustainability, the sustainable development of the SO ecosystem is dynamic rather than static. Therefore, the more the value of an indicator changes, the more that indicator contributes to sustainability, and we call this contribution the contribution value. The contribution value measures the degree of dispersion of an indicator: the greater the dispersion, the greater the contribution value, the greater the impact of the indicator on sustainability, and the larger the corresponding weight. In the definition of the contribution value, $C_i$ is the degree of dispersion of the contribution across the samples of the $i$-th indicator, and $p_{ij}$ is the contribution of the $j$-th sample of the $i$-th indicator. A parameter is introduced to eliminate the case where $p_{ij}$ is 0. The definition also involves the total contribution of all samples of the $i$-th indicator and an adjustment factor that ensures that $C_i$ lies in $[0,1]$.
The value of this parameter is determined as follows. Because the value of $p_{ij}$ depends on the value of $x'_{ij}$, the parameter is chosen by examining $x'_{ij}$, which gives two cases.
After the contribution value of each indicator is determined, the weight $w_i$ of the $i$-th indicator can be calculated according to the following formula:

$$w_i = \frac{C_i}{\sum_{l=1}^{m} C_l}$$

where $\sum_{l=1}^{m} C_l$ is the total contribution value of all the indicators.
According to Equations (2)–(4), we first calculate the weights of the indicators in the evaluation level. On this basis, we sum the contribution values of the evaluation-level indicators that belong to a guideline-level indicator to obtain the contribution value of that guideline-level indicator, written as $H_k$ ($k = 1, 2, \ldots$). Thus, the weight of the corresponding guideline-level indicator is

$$w_k = \frac{H_k}{\sum_{l} H_l}$$
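The two weighting steps can be sketched as follows. The sketch takes the contribution values as given, since it does not attempt to reproduce their definition, and assumes that each weight is a contribution value divided by the total at its level and that a guideline-level contribution is the plain sum of its members' contributions. The function names, the grouping of indicators, and all numbers are hypothetical; only the indicator names come from the text.

```python
from typing import Dict, List


def normalize_weights(contributions: Dict[str, float]) -> Dict[str, float]:
    """Turn contribution values into weights that sum to 1."""
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("total contribution value must be positive")
    return {name: value / total for name, value in contributions.items()}


def guideline_weights(
    contributions: Dict[str, float], groups: Dict[str, List[str]]
) -> Dict[str, float]:
    """Sum member contributions per guideline (H_k), then normalize."""
    h = {
        guideline: sum(contributions[name] for name in members)
        for guideline, members in groups.items()
    }
    return normalize_weights(h)


# Hypothetical contribution values for three evaluation-level indicators.
c = {"new_users": 0.25, "outbound_links": 0.25, "long_term_contributors": 0.5}
groups = {
    "openness": ["new_users", "outbound_links"],
    "stability": ["long_term_contributors"],
}
print(normalize_weights(c)["long_term_contributors"])  # 0.5
print(guideline_weights(c, groups))  # {'openness': 0.5, 'stability': 0.5}
```

Under these assumptions, the weight of a guideline equals the sum of the evaluation-level weights of the indicators it contains.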
3.3.2. Sustainability Assessment Model
The sustainability evaluation of Open Source Ecosystems should not stop at a one-sided analysis of a single indicator; it should also produce a quantitative, comprehensive result so that the development status of the ecosystem can be considered as a whole. Therefore, to assess both the development status of each indicator and the sustainability of the ecosystem, this paper integrates multiple indicators and proposes a Sustainability Assessment Model. In the expression of that model, $F$ is the composite index of the sustainability of the open source ecosystem; $w_i$ is the weight of the $i$-th guideline-level indicator; $w_{ij}$ is the weight of the $j$-th evaluation-level indicator in the $i$-th guideline level; $t$ is the total number of indicators in the guideline level; and $r$ is the number of evaluation-level indicators contained in the $i$-th guideline level. The larger the $F$ value, the better the sustainable development of the open source ecosystem.
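One way to read the symbol definitions above is as a two-level weighted sum: each guideline contributes its weight times the weighted sum of the standardized values of its evaluation-level indicators. The sketch below implements that reading as an assumption rather than as the paper's exact expression, which is not shown here. The function name `composite_index`, the dictionary layout, and all numbers are hypothetical.

```python
from typing import Dict


def composite_index(
    guideline_w: Dict[str, float],
    indicator_w: Dict[str, Dict[str, float]],
    values: Dict[str, float],
) -> float:
    """Assumed form: F = sum_i w_i * sum_j w_ij * v_ij.

    guideline_w[g] is the weight of guideline g (w_i).
    indicator_w[g][name] is the weight of an indicator within g (w_ij).
    values[name] is the standardized value of that indicator in [0, 1].
    """
    total = 0.0
    for guideline, w_i in guideline_w.items():
        inner = sum(
            w_ij * values[name]
            for name, w_ij in indicator_w[guideline].items()
        )
        total += w_i * inner
    return total


# Hypothetical weights and standardized indicator values.
f = composite_index(
    {"openness": 0.5, "stability": 0.5},
    {
        "openness": {"new_users": 0.5, "outbound_links": 0.5},
        "stability": {"long_term_contributors": 1.0},
    },
    {"new_users": 1.0, "outbound_links": 0.0, "long_term_contributors": 0.5},
)
print(f)  # 0.5
```

If the weights at each level sum to 1 and the values lie in [0, 1], this form keeps the index in [0, 1], which makes results comparable across time periods or ecosystems.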