Parallel Computational Technologies
16th International Conference, PCT 2022
Dubna, Russia, March 29–31, 2022
Revised Selected Papers

Communications in Computer and Information Science 1618
Editors
Leonid Sokolinsky, South Ural State University, Chelyabinsk, Russia
Mikhail Zymbler, South Ural State University, Chelyabinsk, Russia
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2022
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, expressed or implied, with respect to the material contained herein or for any errors or
omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
This volume contains a selection of the papers presented at the 16th International
Scientific Conference on Parallel Computational Technologies, PCT 2022. The PCT
2022 conference was held in Dubna, Russia, during March 29–31, 2022.
The PCT series of conferences aims at providing an opportunity to report and
discuss the results achieved by leading research groups in solving practical issues
using supercomputer and neural network technologies. The scope of the PCT series of conferences includes all aspects of the application of cloud, supercomputer, and neural network technologies in science and technology, such as applications, hardware and software, specialized languages, and packages.
The PCT series is organized by the Supercomputing Consortium of Russian
Universities and the Ministry of Science and Higher Education of the Russian
Federation. Originating in 2007 at the South Ural State University (Chelyabinsk,
Russia), the PCT series of conferences has now become one of the most prestigious
Russian scientific meetings on parallel programming, high-performance computing,
and machine learning. PCT 2022 in Dubna continued the series after Chelyabinsk
(2007), St. Petersburg (2008), Nizhny Novgorod (2009), Ufa (2010), Moscow (2011),
Novosibirsk (2012), Chelyabinsk (2013), Rostov-on-Don (2014), Ekaterinburg (2015),
Arkhangelsk (2016), Kazan (2017), Rostov-on-Don (2018), Kaliningrad (2019), Perm
(2020), and Volgograd (2021).
Each paper submitted to the conference was scrupulously evaluated by three reviewers based on relevance to the conference topics, scientific and practical contribution, experimental evaluation of the results, and presentation quality. The Program
Committee of PCT selected the 22 best papers to be included in this CCIS proceedings
volume.
We would like to thank the respected PCT 2022 platinum sponsors, namely Intel, RSC
Group, and Karma Group, and the conference partner, Special Technological Center, for
their continued financial support of the PCT series of conferences.
We would like to express our gratitude to every individual who contributed to
the success of PCT 2022. Special thanks to the Program Committee members and
the external reviewers for evaluating papers submitted to the conference. Thanks
also to the Organizing Committee members and all the colleagues involved in the
conference organization from the Joint Institute for Nuclear Research, the South Ural
State University (national research university), and Moscow State University. We thank
the participants of PCT 2022 for sharing their research and presenting their achievements
as well.
Finally, we thank Springer for publishing the proceedings of PCT 2022 in the
Communications in Computer and Information Science series.
Steering Committee
Berdyshev, V. I.: Krasovskii Institute of Mathematics and Mechanics, UrB RAS, Russia
Ershov, Yu. L.: United Scientific Council on Mathematics and Informatics, Russia
Minkin, V. I.: South Federal University, Russia
Moiseev, E. I.: Moscow State University, Russia
Savin, G. I.: Joint Supercomputer Center, RAS, Russia
Sadovnichiy, V. A.: Moscow State University, Russia
Chetverushkin, B. N.: Keldysh Institute of Applied Mathematics, RAS, Russia
Shokin, Yu. I.: Institute of Computational Technologies, RAS, Russia
Program Committee
Dongarra, J. (Co-chair): University of Tennessee, USA
Sokolinsky, L. B. (Co-chair): South Ural State University, Russia
Voevodin, Vl. V. (Co-chair): Moscow State University, Russia
Zymbler, M. L. (Academic Secretary): South Ural State University, Russia
Ablameyko, S. V.: Belarusian State University, Belarus
Afanasiev, A. P.: Institute for Systems Analysis, RAS, Russia
Akimova, E. N.: Krasovskii Institute of Mathematics and Mechanics, UrB RAS, Russia
Andrzejak, A.: Heidelberg University, Germany
Balaji, P.: Argonne National Laboratory, USA
Boldyrev, Yu. Ya.: St. Petersburg Polytechnic University, Russia
Carretero, J.: Carlos III University of Madrid, Spain
Gazizov, R. K.: Ufa State Aviation Technical University, Russia
Glinsky, B. M.: Institute of Computational Mathematics and Mathematical Geophysics, SB RAS, Russia
Goryachev, V. D.: Tver State Technical University, Russia
Organizing Committee
Koren’kov, V. V. (Chair): Joint Institute for Nuclear Research, Russia
Podgaynyi, D. V. (Deputy Chair): Joint Institute for Nuclear Research, Russia
Derenovskaya, O. Yu. (Secretary): Joint Institute for Nuclear Research, Russia
Antonov, A. S.: Moscow State University, Russia
Antonova, A. P.: Moscow State University, Russia
Supercomputer Simulation
VGL Rating
I. Afanasyev and S. Krymskii

1 Introduction
The ranking of modern supercomputing systems and computational platforms is an important problem of modern computer science. With the large variety of architectures that exist and are widely used nowadays, it is crucial to understand which systems are capable of solving a specific real-world problem faster, frequently taking into account the properties of the input data.
There exist multiple projects, such as Top500 [14], Graph500 [15], HPCG [13], and Algo500 [5], which aim to rank the performance of supercomputing systems based on algorithms used in different fields of application. The purpose of this study is to develop a ranking of modern shared-memory systems that is more representative than the existing ones. Our research extends the approach of ranking modern supercomputing architectures using a family of graph algorithms, which is important because graph algorithms are used in a wide range of applications: the solution of infrastructure and biological problems, the analysis of social and web networks, etc.
2 Related Work
At the moment of this writing, many solutions aimed at benchmarking and consequently ranking supercomputing systems exist. Examples of such solutions include the Top500 [14], Graph500 [15], and Green500 [7] lists, the Algo500 [5] project based on AlgoWiki [17], the HPCG [13] benchmark, and some others. Typically, these solutions are based on applying a specific, frequently used algorithm and its implementation, such as solving systems of linear equations or performing sparse matrix-vector multiplication, and using some performance metric to rank the various supercomputing systems. Such a variety of
existing ratings is explained by the fact that different algorithms used as a benchmarking core stress different parts of the supercomputing hardware (for example, Graph500 stresses the memory subsystem). At the same time, some approaches, such as Algo500, are more general, since they are capable of benchmarking supercomputing systems based on any algorithm described in the AlgoWiki project.
The project most closely related to ours is the Graph500 rating, which also uses implementations of the Shortest Paths and Breadth-First Search (BFS) graph algorithms launched on RMAT [6] graphs of different scales as its benchmarking core. However, this rating, in our opinion, has the following drawbacks:
– Only two graph algorithms are used. At the same time, there are many other graph algorithms with different properties, which usually demonstrate drastically different performance on the evaluated architectures and result in significantly different ratings based on these algorithms;
– Similarly, using only one type of synthetic input graphs leads to the same
problem, i.e. architectures can potentially be ranked drastically differently
when some other graph is used;
– Graph500 provides only generic MPI and OpenMP implementations, forcing
users to develop their own highly optimized implementations;
– Graph500 targets large supercomputing systems, while it is also interesting
to compare single-node (and single-GPU) systems.
Thus, we decided to build our own rating system on top of the VGL framework (the VGL rating is currently available at vgl-rating.parallel.ru). This rating is mostly aimed at benchmarking single-node systems while using graphs and algorithms with different properties, which allows creating a more general and balanced rating.
As a benchmarking core, we use our own VGL framework. There exist other CPU-based and GPU-based graph-processing systems, such as Gunrock [18], CuSha [9], Ligra [16], etc. However, as we will show in the following sections, VGL suits these purposes better since it is architecture-independent and supports a large variety of modern architectures.
The next three subsections describe each of these three features in detail.
The SSSP algorithm differs from the others due to (1) its larger computational complexity and (2) the fact that it processes weighted graphs.
The main comparative characteristics of the algorithms (including those already mentioned in the algorithm descriptions) are provided in Table 1. An analysis of these characteristics allows us to conclude that we selected a representative set of graph algorithms. Additional experiments we conducted demonstrated that the performance of graph algorithms solving other problems, including Maximum Flow, Strongly Connected Components, and Coloring, closely resembles that of one or several of the algorithms we used as the basis of our benchmark. Thus, these four algorithms form the basis for a representative rating reflecting the features of a wide range of graph algorithms. Moreover, our implementation makes it easy to extend the set of algorithms used in the rating (as will be shown in a later section), in case we need to add another drastically different algorithm in the future.
Table 1. Main comparative characteristics of the algorithms used in the rating basis.
Table 2. Graphs used in the project. Each graph is described by the triple Name, Number of Vertices, Number of Edges. The columns and rows of the table correspond to the different sizes and categories of these graphs.
Based on the data presented, we can conclude that the developed rating uses graphs of different classes, whose parameters significantly impact the performance of graph algorithm implementations on different architectures.
By default, all parameters have the same weight and, therefore, the same contribution to the final result; however, the user can select a weight for each parameter in order to give preference to one or another parameter. Each selected weight (from 0.0 to 1.0) indicates how strongly a particular parameter should affect the generated rating.
Two approaches to the formation of the rating were implemented. First, for
both approaches, all algorithms are launched on all graphs. Further, for each
pair {graph, algorithm}:
1. The performance values are sorted among all architectures; after sorting, each architecture receives a sequential number, i.e., an index in the sorted array. For each architecture, a value equal to the difference between the number of architectures and the sequential number of the architecture, multiplied by the weight, is added to the final rating.
2. The maximum performance value among all architectures is found, and all
results are divided by this value (normalization is performed). After that, for
each architecture, the normalized values multiplied by weights are added to
the final rating value.
Let us denote the set of graph types (social, infrastructure, etc.) used as the basis of the rating by $I$, the set of graphs by $G$, the set of used graph algorithms by $J$, the set of graph scales by $K$, and the set of tested architectures by $A$. In addition, let us denote the corresponding user-specified weights by $x_i$, $x_j$, and $x_k$.
In the first implementation, we first fix the graph $g \in G$ and the graph algorithm $j \in J$, and for all $a \in A$ we obtain an array of performance values $M_{gja}$ corresponding to the triples $\{g, j, a\}$. Then we sort these values, and each architecture gets the value $p_{gja}$ corresponding to the index of the value $M_{gja}$ in the sorted array. Let $N$ be the number of architectures; then the final rating is formed according to the following equation:

$$R_a = \sum_{\forall g \in G,\ \forall j \in J} (N - p_{gja}) \cdot x_i \cdot x_j.$$

In the second implementation, each performance value is normalized by the maximum value over all architectures, and the weighted normalized values are summed:

$$R_a = \sum_{\forall i \in I,\ \forall j \in J,\ \forall k \in K} \frac{M_{ijka}}{\max_{\forall t \in A} M_{ijkt}} \cdot x_i \cdot x_j \cdot x_k.$$
A possible drawback of the second approach is that one architecture may work much better than the others on a small number of fixed graphs and algorithms, while being slightly worse on all the others. In this case, this architecture can be ranked higher than the others, although it is less efficient on most graphs.
In the second approach, normalization by the maximum performance value obtained among all architectures for a given combination of input parameters $i, j, k$ is required to avoid situations where performance differences on different sets of input data are drastically different due to some properties of the input data. For example, the shortest paths algorithm performs many more iterations on a road graph than on social graphs (due to their different diameters). Without this normalization, the contribution of road graphs to the rating would be much lower than that of social ones, which should not be the case.
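As an illustration of these two schemes, here is a minimal Python sketch; the input layout (a dictionary keyed by (graph, algorithm, architecture) triples holding TEPS values) and all names are our assumptions, not part of the VGL implementation.

```python
from collections import defaultdict

def group_by_pair(perf):
    """Group raw results {(graph, algo, arch): TEPS} by {graph, algo} pair."""
    by_pair = defaultdict(dict)
    for (g, j, a), value in perf.items():
        by_pair[(g, j)][a] = value
    return by_pair

def rank_based_rating(perf, weights):
    """First approach: sort architectures by performance for every pair
    and award (N - position) points, multiplied by the pair's weights."""
    rating = defaultdict(float)
    for (g, j), results in group_by_pair(perf).items():
        ranked = sorted(results, key=results.get, reverse=True)
        n = len(ranked)
        for pos, arch in enumerate(ranked):
            rating[arch] += (n - pos) * weights.get(g, 1.0) * weights.get(j, 1.0)
    return dict(rating)

def normalized_rating(perf, weights):
    """Second approach: divide every result by the best result for the
    same pair (normalization), then sum the weighted normalized values."""
    rating = defaultdict(float)
    for (g, j), results in group_by_pair(perf).items():
        best = max(results.values())
        for arch, value in results.items():
            rating[arch] += (value / best) * weights.get(g, 1.0) * weights.get(j, 1.0)
    return dict(rating)

perf = {("road", "sssp", "V100"): 4.0, ("road", "sssp", "Xeon"): 5.0,
        ("social", "bfs", "V100"): 9.0, ("social", "bfs", "Xeon"): 3.0}
print(rank_based_rating(perf, {"bfs": 1.0, "sssp": 0.5}))
print(normalized_rating(perf, {"bfs": 1.0, "sssp": 0.5}))
```

Note how the second scheme rewards the margin of victory, while the first rewards only the position in the ranking, which is exactly the trade-off discussed above.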
During the course of this project, the VGL framework was extended with a
set of interfaces for automatically collecting performance data. These interfaces
execute the selected graph algorithms (PR, BFS, SSSP, HITS) on specified input
data on VGL-supported architectures. Afterwards, the interfaces automatically
send the results to the rating server, which in turn creates a rating based on the
ranking method described in Sect. 3.
Fig. 1. Scheme of the developed benchmarking system: the client side is implemented
via VGL scripts and interfaces, while the server part is responsible for data storage
and rating visualization.
As shown in Fig. 1, the developed system has client and server parts.
Client Part: On the architecture being benchmarked, the user launches a
Python script provided inside VGL, which downloads graphs from the Konect
collection [12], converts them into the internal VGL format, launches four graph
algorithms and then collects performance data in TEPS [15]. Afterwards, the
performance data is uploaded to the rating server.
Server Part: The server executes two scripts: the first is responsible for receiving data from the client and storing the received data in MongoDB [8], while the second is responsible for calculating the rating based on user-specified parameters and visualizing it as an HTML page.
Next, we will describe these two parts in detail, following the process of
benchmarking a specific architecture chosen by the user, submitting the obtained
benchmarking results and processing these results by the rating system.
First of all, the user launches the Python script submit.py on the client side,
which automatically performs the following actions.
At the beginning, the type of the architecture is determined: the presence of GPUs or vector engines of the SX-Aurora TSUBASA system [11], the vendor, type, and generation of the CPU, etc. Depending on the obtained values, the evaluated graph algorithm implementations are compiled according to this information (using specific compilers, optimization flags, etc.). To achieve this, we implemented in VGL a fairly large database of recommended compilation and optimization settings for many widely used supercomputing architectures.
Afterwards, all graphs needed for testing are downloaded from the Konect
collection, and synthetic graphs are generated using random graph generators
implemented in the VGL framework.
After downloading, all graphs are divided into groups by the categories
defined in Sect. 3.1. When generating a rating, the user will be able to spec-
ify influence weights for each of the groups.
The user can provide additional parameters to the submit.py script to launch graph algorithms on specific subsets of input graphs: Tiny, Tiny + Small, Tiny + Small + Medium, or Tiny + Small + Medium + Large (in other words, a gradually increasing set of graphs). These modes allow one to accelerate the benchmarking process, as well as to handle cases when certain large graphs cannot be stored in the memory of the evaluated architecture, which can happen on NVIDIA GPUs or personal computers, where memory is limited to around 16–32 GB. It is important to emphasize that if some graph is not used for testing, the obtained rating of the benchmarked architecture will be lower, as if the performance obtained on these graphs were equal to zero.
Once downloaded, the graphs are converted to an edge-list format and stored on disk as binary files. Then, the optimized routines of the VGL framework are used to load and convert these graphs into a specific optimized representation, namely, CSR, VectorCSR [4], segmented or clusterized CSR [19], etc. Using the optional parameters of the submit.py script, the user can select the specific graph storage format they think is most suitable for the evaluated architecture. By default, VGL also provides a recommendation database indicating which format should be used for a specific architecture (similarly to the compilation and optimization options).
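For illustration, a minimal sketch of building a plain CSR representation from an edge list is shown below; the function and array names are our assumptions, and the actual VGL storage formats (VectorCSR, segmented or clusterized CSR) are considerably more involved.

```python
import numpy as np

def edge_list_to_csr(src, dst, num_vertices):
    """Build CSR arrays (row offsets + column indices) from an edge list."""
    order = np.argsort(src, kind="stable")        # group edges by source vertex
    col_indices = dst[order]
    degrees = np.bincount(src, minlength=num_vertices)
    row_offsets = np.zeros(num_vertices + 1, dtype=np.int64)
    np.cumsum(degrees, out=row_offsets[1:])
    return row_offsets, col_indices

src = np.array([0, 0, 1, 2, 2, 2])
dst = np.array([1, 2, 2, 0, 1, 3])
offsets, cols = edge_list_to_csr(src, dst, 4)
# The neighbours of vertex 2 are cols[offsets[2]:offsets[3]] -> [0, 1, 3]
print(offsets, cols)
```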
Afterwards, all four algorithms are executed on all converted graphs, and the performance data is collected and saved as an array of dictionaries. Finally, this performance data is packed and sent to the rating server. An offline export of the performance data is implemented as an option. This is necessary when the benchmarked system does not have access to the Internet, which is a frequent situation for supercomputer nodes. In both cases, the generated array of dictionaries containing the performance results is converted to a stream of bytes using the pickle library and sent by the client to the server using the socket library, where the received stream of bytes is converted back to a Python dictionary.
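The submission step can be pictured with the following minimal sketch based on the pickle and socket libraries mentioned above; the host, port, and length-prefix framing are illustrative assumptions, not the actual VGL protocol.

```python
import pickle
import socket

def send_results(results, host="vgl-rating.parallel.ru", port=9999):
    """Serialize the array of performance dictionaries and send it to the
    rating server (port and framing are placeholder assumptions)."""
    payload = pickle.dumps(results)
    with socket.create_connection((host, port)) as sock:
        sock.sendall(len(payload).to_bytes(8, "big"))  # simple length prefix
        sock.sendall(payload)

def recv_results(conn):
    """Server side: read the length prefix, then the pickled payload."""
    size = int.from_bytes(conn.recv(8), "big")
    data = b""
    while len(data) < size:
        chunk = conn.recv(min(65536, size - len(data)))
        if not chunk:
            raise ConnectionError("client closed the connection early")
        data += chunk
    return pickle.loads(data)
```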
The rating server processes the received performance data in the following
way.
The received data is saved to the Mongo database in the format shown in Fig. 2. We decided to use the non-relational (NoSQL) MongoDB database because each collection object in MongoDB can contain different fields, whereas tables in SQL databases have a strongly typed schema. In our project, this makes it possible to add new information, i.e., new graphs, algorithms, and types of evaluated systems, during the development of the project, while remaining backward compatible with older data.
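A minimal pymongo sketch of this schema flexibility: documents with different sets of fields can be stored in the same collection without migrations. The database and collection names, as well as the document layout, are assumptions and do not reproduce the actual format from Fig. 2.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
runs = client["vgl_rating"]["submissions"]

# An older submission and a newer one with extra fields coexist in one
# collection; no schema migration is needed.
runs.insert_one({"arch": "Intel Xeon 6140", "algo": "bfs",
                 "graph": "soc-orkut", "teps": 2.1e9})
runs.insert_one({"arch": "NVIDIA V100", "algo": "sssp",
                 "graph": "road-usa", "teps": 0.8e9,
                 "gpu_model": "V100", "format": "VectorCSR"})
```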
After the data is saved, a specific method for calculating the rating (described in Sect. 3) is applied. The developed system is very flexible, and additional rating formulas can easily be provided. In the future, we plan to support data visualization based on various rating formulas according to the user's choice.
A web page (vgl-rating.parallel.ru) written in HTML, CSS, and JavaScript is used to visualize the results. The interaction of the Python scripts and the web pages is implemented using the Flask web framework.
Table 3. Rating results used to make observations. Each cell after the name of the architecture provides its rating value, which shows how often the given architecture is better than the others.

| Rating position | Overall rating | BFS algorithm | LARGE size |
|---|---|---|---|
| 1 | NEC SX-Aurora TSUBASA, 37.4 | NVIDIA GPU V100, 36.25 | NEC SX-Aurora TSUBASA, 32 |
| 2 | NVIDIA GPU V100, 31.23 | Intel Xeon 6140, 25.5 | NVIDIA GPU P100, 31 |
| 3 | NVIDIA GPU P100, 19.86 | Intel Xeon 6140, 22.75 | NVIDIA GPU V100, 16.25 |
| 4 | Intel Xeon 6140, 12 | Kunpeng 920, 16 | Kunpeng 920, 22 |
| 5 | Kunpeng 920, 8.81 | NVIDIA GPU P100, 15 | NEC SX-Aurora TSUBASA, 19 |
| 6 | Intel Xeon 6240, 5.45 | Intel Xeon 6126, 10.75 | Intel Xeon 6240, 7.75 |
| 7 | Intel Xeon 6126, 4.35 | Intel Xeon 6240, 9.25 | Intel Xeon 6126, 7.5 |
Based on these results, the following observations can be made:

1. Kunpeng 920 works faster on infrastructure graphs than Intel Xeon 6140, but slower on all the other graph types.
2. NVIDIA GPUs process social graphs faster than the NEC SX-Aurora TSUBASA v1.0 and all the other graph types slower.
3. NVIDIA GPUs work faster with the PR and SSSP algorithms than the NEC SX-Aurora TSUBASA v1.0 and slower with the other algorithms.
4. Intel Xeon 6140 is faster on the BFS algorithm than the NVIDIA GPUs P100 and V100 and slower on the other algorithms.
5. Kunpeng 920 is faster on BFS than Intel Xeon 6140 and slower on all the other algorithms.
6. NVIDIA GPUs are faster on Large graphs than the NEC SX-Aurora TSUBASA v1.0 and slower on the other graph sizes.
7. Kunpeng 920 and Intel Xeon 6140 are faster on Large graphs than the NEC SX-Aurora TSUBASA v1.0 and slower on the other graph sizes.
7 Conclusion
In this paper, we proposed a novel rating system that evaluates the performance
of target architectures based on the performance of multiple graph algorithms:
PR, SSSP, BFS and HITS. At the same time, our rating uses different types of
input graphs: infrastructure, social, rating, synthetic, which in aggregate makes
the proposed rating more representative than its existing counterparts.
The proposed rating system is implemented on top of the architecture-
independent VGL framework, which makes the benchmarking and submission
process as simple as running a single script provided in VGL.
Information about our rating is currently available on the vgl-rating.parallel.ru website. In addition, everyone can easily contribute to the VGL framework, freely available at vgl.parallel.ru, by implementing support for new architectures. We strongly believe that the proposed rating will be frequently used to
compare modern supercomputing architectures, gradually turning into a larger
project.
References
1. NEC SX-Aurora TSUBASA C/C++ compiler user’s guide. https://www.hpc.nec/
documents/sdk/pdfs/g2af01e-C++UsersGuide-016.pdf. Accessed 12 May 2020
2. Afanasyev, I.V.: Developing an architecture-independent graph framework for
modern vector processors and NVIDIA GPUs. Supercomput. Front. Innov. 7(4),
49–61 (2021). https://doi.org/10.14529/jsfi200404
3. Afanasyev, I.V., Voevodin, V.V., Komatsu, K., Kobayashi, H.: Distributed graph
algorithms for multiple vector engines of NEC SX-aurora TSUBASA systems.
Supercomput. Front. Innov. 8(2), 95–113 (2021)
4. Afanasyev, I.V., Voevodin, V.V., Komatsu, K., Kobayashi, H.: VGL: a high-
performance graph processing framework for the NEC SX-Aurora TSUBASA vec-
tor architecture. J. Supercomput. 77(8), 8694–8715 (2021). https://doi.org/10.
1007/s11227-020-03564-9
5. Antonov, A., Nikitenko, D., Voevodin, V.V.: Algo500-a new approach to the joint
analysis of algorithms and computers. Lobachevskii J. Math. 41(8), 1435–1443
(2020)
6. Chakrabarti, D., Zhan, Y., Faloutsos, C.: R-MAT: a recursive model for graph min-
ing. In: Proceedings of the 2004 SIAM International Conference on Data Mining,
pp. 442–446. SIAM (2004). https://doi.org/10.1137/1.9781611972740.43
7. Feng, W.C., Cameron, K.: The green500 list: encouraging sustainable supercom-
puting. Computer 40(12), 50–55 (2007)
8. Győrödi, C., Győrödi, R., Pecherle, G., Olah, A.: A comparative study: MongoDB
vs. MySQL. In: 2015 13th International Conference on Engineering of Modern
Electric Systems (EMES), pp. 1–6. IEEE (2015)
9. Khorasani, F., Vora, K., Gupta, R., Bhuyan, L.N.: CuSha: vertex-centric graph
processing on GPUs. In: Proceedings of the 23rd International Symposium on
High-Performance Parallel and Distributed Computing, pp. 239–252 (2014)
10. Kirk, D., et al.: Nvidia CUDA software and GPU parallel computing architecture.
In: ISMM, vol. 7, pp. 103–104 (2007)
11. Komatsu, K., Watanabe, O., Musa, A., et al.: Performance evaluation of a vector
supercomputer SX-Aurora TSUBASA. In: Proceedings of the International Confer-
ence for High Performance Computing, Networking, Storage, and Analysis, Dallas,
TX, USA, 11–16 November 2018, SC 2018, pp. 54:1–54:12. IEEE (2018). https://
doi.org/10.1109/SC.2018.00057
12. Kunegis, J.: Konect: the koblenz network collection. In: Proceedings of the 22nd
International Conference on World Wide Web, pp. 1343–1350 (2013)
13. Marjanović, V., Gracia, J., Glass, C.W.: Performance modeling of the HPCG
benchmark. In: Jarvis, S.A., Wright, S.A., Hammond, S.D. (eds.) PMBS 2014.
LNCS, vol. 8966, pp. 172–192. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-17248-4_9
14. Meuer, H.W.: The TOP500 project: looking back over 15 years of supercomputing experience (2008)
15. Murphy, R.C., Wheeler, K.B., Barrett, B.W., Ang, J.A.: Introducing the Graph 500. Cray Users Group (CUG) 19, 45–74 (2010)
16. Shun, J., Blelloch, G.E.: Ligra: a lightweight graph processing framework for shared
memory. In: ACM SIGPLAN Notices, vol. 48, pp. 135–146. ACM (2013)
17. Voevodin, V., Antonov, A., Dongarra, J.: AlgoWiki: an open encyclopedia of par-
allel algorithmic features. Supercomput. Front. Innov. 2(1), 4–18 (2015)
18. Wang, Y., Davidson, A., Pan, Y., et al.: Gunrock: a high-performance graph pro-
cessing library on the GPU. In: Proceedings of the 21st ACM SIGPLAN Sympo-
sium on Principles and Practice of Parallel Programming, pp. 1–12. ACM (2016).
https://doi.org/10.1145/2851141.2851145
19. Zhang, Y., Kiriansky, V., Mendis, C., Zaharia, M., Amarasinghe, S.P.: Optimizing
cache performance for graph analytics. arXiv abs/1608.01362 (2016)
HPC TaskMaster – Task Efficiency Monitoring System for the Supercomputer Center
P. Kostenetskiy et al.

1 Introduction
The most common examples of the inefficient usage of cluster resources are:
– allocation of insufficient or excessive resources for a task;
– running a non-parallel task on multiple CPU cores or GPUs;
– allocation of the compute node capacity without starting calculations.
The following requirements were defined for the design of the task performance monitoring system.
1. The system should collect the following data for each task:
– utilization of the specific CPU cores allocated for the task;
– utilization of the GPUs allocated for the task;
– GPU memory utilization;
– GPU power consumption;
– RAM utilization by the task;
– file system usage.
2. The system must analyze the collected data and use it to determine whether
the task worked effectively.
3. The system must provide users with access to the list of completed tasks and
reports on their completion using a web application.
The rest of this paper is organized as follows. A comparison of different
monitoring systems is carried out in Sect. 2. In Sect. 3, the architecture of the
system is described. The detection of inefficient user tasks is considered in Sect. 4.
User statistics are provided in Sect. 5. Finally, Sect. 6 shows the conclusions of
this work.
2 Related Work
The key feature of the HSE cluster is how it allocates resources for user tasks.
Instead of allocating the entire compute node for one task, the user is given a
certain number of processor cores and GPUs. As a result, several dozen tasks can
be performed on the compute node at once, thus optimizing cluster resources.
Due to this feature, ready-made solutions for monitoring system resources, such
as Nagios and Zabbix, are not suitable for this cluster. cHARISMa already has
a monitoring system of its own [4], however, it is designed to display only the
global usage across the whole cluster and its nodes.
Since one of the goals of HSE University is to provide cluster users with a secure system in the university environment, a new monitoring system was built using open-source monitoring tools. Chan [3], Wegrzynek [11], Kychkin [6], and Safonov [10] describe how a combination of programs such as Telegraf, InfluxDB, and Grafana allows one to quickly set up and run a cluster resource monitoring system. In [2,3], it is also described how the Slurm acct_gather plugin enables collecting metrics for Slurm tasks, which is precisely the data required for a task efficiency monitoring system. Since all these programs, except Telegraf, are already installed on cHARISMa, this approach can be used to monitor tasks on the cluster.
3 System Architecture
This section describes the monitoring infrastructure of the HPC TaskMaster
system, shown in Fig. 1.
The Slurm task scheduler is used to run tasks on the cluster. The main data of Slurm tasks is stored in the MySQL relational database by the background process slurmdbd (the Slurm database daemon), and the task metrics are written to the InfluxDB time series database using the acct_gather plugin. This plugin collects memory and file system usage (read/write) for each task.
The required metrics on the utilization of specific CPU cores and GPUs are collected with the Telegraf daemon, which has built-in plugins for these metrics. Thus, knowing the CPU and GPU IDs assigned to the task, the system can collect metrics for these components and, therefore, distinguish the utilization of different tasks on one node. Additional metrics are collected using plugins developed in Python.
The collected metrics are stored in the InfluxDB database. InfluxDB was chosen as the time-series database because of its support in Telegraf and in the Slurm acct_gather plugin, which allows one to store all the required metrics in one database.
Grafana is used as the tool for visualizing graphs on the cHARISMa cluster. Grafana provides great opportunities for configuring and formatting charts and also supports creating them through an API. This API makes it possible to automate the creation of graphs for each task: new graphs are created from JSON templates and, based on the available data about the task, are built in Grafana automatically when the user requests them. The created graphs are displayed on the system's website using iframe technology, where the user can interactively view the graphs for the period of task execution. The system creates graphs for both completed and running tasks; thereby, the user can observe the work of their task in real time.
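The dashboard-creation step could look roughly like the following sketch, which posts a JSON dashboard template to the standard Grafana HTTP API endpoint; the server address, token, and panel layout are placeholder assumptions, not the actual HPC TaskMaster templates.

```python
import requests

GRAFANA = "http://grafana.example.org:3000"   # placeholder address
TOKEN = "..."                                 # API token (placeholder)

def create_task_dashboard(task_id):
    """Render a JSON dashboard template for one task and post it to Grafana."""
    dashboard = {
        "uid": f"task-{task_id}",
        "title": f"Task {task_id}",
        "panels": [{"type": "timeseries", "title": "CPU core load",
                    "gridPos": {"x": 0, "y": 0, "w": 24, "h": 8}}],
    }
    resp = requests.post(
        f"{GRAFANA}/api/dashboards/db",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"dashboard": dashboard, "overwrite": True},
    )
    resp.raise_for_status()
    return resp.json()["url"]  # the URL embedded in an iframe on the website
```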
The advantage of using a combination of Telegraf, InfluxDB, and Grafana is the ability to install and configure these tools on any cluster. Moreover, these tools make the monitoring system quite flexible: additional data for the system can be collected using Telegraf's built-in plugins or newly developed ones.
It is important to note that the HPC TaskMaster system has a negligible impact on the performance of compute nodes; the installed Telegraf daemon uses only 0.03% of the overall CPU performance. In addition to Telegraf, another source of load on the computing cluster is InfluxDB. Installed on the head node, InfluxDB uses an average of 5 GB of storage per month. To free up storage, a retention policy that compresses metrics older than 6 months is used.
The HPC TaskMaster system is developed with Django, a Python web framework that offers a large number of packages and a wide range of tools for developing web applications, which makes it possible to build a monitoring system on top of Telegraf, InfluxDB, and Grafana. In addition, Django has a built-in administration panel through which the administrator can configure the monitoring system without making changes to the source code of the program.
The task performance monitoring system works according to the following
principles:
– metrics are collected on each compute node using Telegraf and stored in the InfluxDB database on the head node; metrics from the acct_gather plugin are also stored in InfluxDB;
– the system updates its local MySQL database by comparing its tasks with those from the Slurm database;
– while the task is running, aggregated metrics are collected for it from the InfluxDB database at a certain interval;
– when the task is completed, its aggregated metrics are collected for the last time;
– the collected aggregated metrics are analyzed by the system, and an inference about the efficiency of the task is generated.
Aggregated Metrics
To simplify the analysis, aggregated metrics $\Lambda^k = (\lambda_1^k, \ldots, \lambda_m^k)$ are calculated for each time series [5]. They include the minimum, maximum, average, median, and standard deviation. In addition to these, the tuple $\Lambda^k$ includes the average load of each node and the combined average load of the nodes.
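A minimal sketch of computing such a tuple for a single time series; the function name is our assumption, while the set of aggregates mirrors the list above.

```python
import statistics

def aggregate(series):
    """Aggregate one time series into (min, max, mean, median, stdev)."""
    return (min(series), max(series), statistics.fmean(series),
            statistics.median(series), statistics.pstdev(series))

cpu_core_load = [99.1, 99.4, 98.9, 99.6, 99.2]
print(aggregate(cpu_core_load))
```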
Table 2. Main parameters of a task.

| № | Parameter | Type |
|---|---|---|
| 1 | ID | Integer |
| 2 | Task name | String |
| 3 | Status | String |
| 4 | Launch command | String |
| 5 | Type of compute nodes | String |
| 6 | Number of compute nodes | Integer |
| 7 | Number of CPU cores | Integer |
| 8 | Number of GPUs | Integer |
| 9 | Exit code | Integer |
| 10 | User ID | Integer |
| 11 | Project ID | Integer |
| 12 | Start date and time | Date |
| 13 | End date and time | Date |
Tags
Since the task parameters are a heterogeneous set of data (integers, strings, dates), a system of tags, i.e., "labels" indicating the type of a task, its execution time, and other properties, is introduced to simplify their analysis. Table 3 contains a list of the tags currently available in the system. Additional tags can be developed and implemented into the system.
A tuple $T^k = (\tau_1^k, \ldots, \tau_n^k)$ is assigned to the task with ID $k$, where $n$ is the number of tags in the system. The element $\tau_i^k$ corresponds to the indicator of the $i$-th tag and takes the value 1 if all conditions are met and the tag is assigned to the task, and 0 otherwise.
Indicators
To determine whether a task is working inefficiently, it is necessary to evaluate the utilization of the components involved in the task. To do this, the concept of an indicator of problems is introduced.
Table 3. Tags available in the system.

| № | Tag |
|---|---|
| 1 | Jupyter-notebook task |
| 2 | LAMMPS task |
| 3 | VASP task |
| 4 | Allocation of resources for calculations |
| 5 | The task lasted less than a minute |
| 6 | The task was completed with an error |
Each indicator $l_j^k$ is computed from the corresponding aggregated metric $\lambda_j^k$ as

$$l_j^k = 1 - \frac{\lambda_j^k - a_j}{b_j - a_j}, \qquad l_j^k \in [0, 1], \qquad (1)$$

where $a_j$ and $b_j$ are admin-defined parameters referring to the minimum and maximum possible values of the $j$-th element of the aggregated metrics. Indicators are placed in the tuple of indicators $L^k = (l_1^k, \ldots, l_m^k)$.
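Equation (1) transcribes directly into code; the clamping of out-of-range values into [0, 1] is our assumption. With an admissible range of [0, 100] for GPU utilization, an average utilization of 71.62% yields an indicator of about 0.284, matching the values shown in Table 7.

```python
def indicator(metric, a, b):
    """Eq. (1): map an aggregated metric onto [0, 1]; a and b are the
    admin-defined minimum and maximum possible values of the metric."""
    l = 1.0 - (metric - a) / (b - a)
    return min(1.0, max(0.0, l))  # clamping is an assumption

print(indicator(71.62, 0.0, 100.0))  # -> 0.2838 (cf. Table 7)
```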
The list of currently available indicators is presented in Table 4. Additional
indicators can be developed and implemented into the system. The number of
indicators for a specific task depends on the number of cores, compute nodes
and GPUs used.
Table 4. Indicators available in the system.

| № | Indicator |
|---|---|
| 1 | Low average CPU usage |
| 2 | Low average CPU core usage |
| 3 | Low average GPU usage |
| 4 | Low GPU memory usage |
| 5 | The task was completed with an error |
4.3 Inferences
To help users interpret the results, the system has a set of inferences $\Phi = (\varphi_i)$. Inferences are the result of the analysis of a task.
Different requirements for tags and indicator values are set for each inference.
An inference is assigned to the task when all the conditions are met. Several
inferences can correspond to one task at once.
Let us denote the union of the tuple of indicators $L^k$ and the tuple of tags $T^k$ as $N^k$. Let $\Omega_i$ be the set of conditions that the inference $\varphi_i$ imposes on the elements of the tuple $N^k$. Then we can associate with each task the set $C^k = \{\varphi_i \mid N^k \text{ satisfies all conditions in } \Omega_i\}$.
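A toy sketch of this matching process, with rule conditions written as predicates over the tuple $N^k$; the thresholds and names are invented for illustration and are not the actual conditions used by HPC TaskMaster.

```python
def assign_inferences(features, rules):
    """Build the set C^k: inferences whose conditions all hold on N^k."""
    return {name for name, conditions in rules.items()
            if all(cond(features) for cond in conditions)}

# N^k as a dict: tag values (0/1) together with indicator values.
features = {"error_tag": 0, "cpu_load_indicator": 0.007,
            "gpu_util_indicator": 0.285}

rules = {
    "task is efficient": [lambda f: f["error_tag"] == 0,
                          lambda f: f["cpu_load_indicator"] < 0.3],
    "low average CPU usage": [lambda f: f["cpu_load_indicator"] > 0.7],
    "completed with an error": [lambda f: f["error_tag"] == 1],
}

print(assign_inferences(features, rules))  # -> {'task is efficient'}
```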
4.4 Example
Table 5. Main parameters of the example task.

| № | Parameter | Value |
|---|---|---|
| 1 | ID | 405408 |
| 2 | Task name | SimpleRun |
| 3 | Status | Successful |
| 4 | Exit code | 0 |
| 5 | Launch command | sbatch run_task.sh |
| 6 | User ID | 2000 |
| 7 | Project ID | 32 |
| 8 | Start date and time | November 11, 2021 10:13:28 |
| 9 | End date and time | November 12, 2021 13:19:09 |
| 10 | Type of compute nodes | type_a |
| 11 | Number of compute nodes | 4 |
| 12 | Number of CPU cores | 176 |
| 13 | Number of GPUs | 16 |
The aggregated metrics across all compute nodes for the example task are
shown in Table 6.
Table 6. Aggregated metrics across all compute nodes for the example task.

| № | Metric | Value, % |
|---|---|---|
| 1 | Avg. load of cores on comp. node cn-001 | 99.36 |
| 2 | Avg. load of cores on comp. node cn-002 | 99.11 |
| 3 | Avg. load of cores on comp. node cn-003 | 99.15 |
| 4 | Avg. load of cores on comp. node cn-004 | 99.51 |
| 5 | Avg. load of comp. nodes | 99.28 |
| 7 | Avg. utilization of GPUs on comp. node cn-001 | 71.62 |
| 8 | Avg. utilization of GPUs on comp. node cn-002 | 71.6 |
| 9 | Avg. utilization of GPUs on comp. node cn-003 | 71.15 |
| 10 | Avg. utilization of GPUs on comp. node cn-004 | 71.8 |
| 11 | Avg. utilization of GPUs | 71.54 |
Table 7 shows the aggregated metrics of the time series for compute node
cn-001. Data for compute nodes cn-002, cn-003, cn-004 are not shown to save
space.
Table 7. Indicator values for the example task.

| N | Indicator | Value |
|---|---|---|
| | Compute node cn-001 | |
| 1 | Core 1 | 0.011 |
| ... | ... | ... |
| 44 | Core 44 | 0.002 |
| 207 | GPU №0 utilization | 0.284 |
| ... | ... | ... |
| 210 | GPU №3 utilization | 0.284 |
| 223 | GPU №0 memory usage | 0.778 |
| ... | ... | ... |
| 226 | GPU №3 memory usage | 0.778 |
| ... | ... | ... |
| | Compute node cn-004 | |
| 205 | Core 1 | 0.011 |
| ... | ... | ... |
| 206 | Core 40 | 0.002 |
| 207 | GPU №0 utilization | 0.279 |
| ... | ... | ... |
| 208 | GPU №4 utilization | 0.28 |
| 209 | GPU №0 memory usage | 0.779 |
| ... | ... | ... |
| 210 | GPU №3 memory usage | 0.778 |
| | Summary | |
| 239 | Avg. load of cores on node cn-001 | 0.006 |
| 240 | Avg. load of cores on node cn-002 | 0.009 |
| 241 | Avg. load of cores on node cn-003 | 0.008 |
| 242 | Avg. load of cores on node cn-004 | 0.005 |
| 243 | Avg. load of nodes | 0.007 |
| 244 | Avg. utilization of GPUs on node cn-001 | 0.284 |
| 245 | Avg. utilization of GPUs on node cn-002 | 0.284 |
| 246 | Avg. utilization of GPUs on node cn-003 | 0.289 |
| 247 | Avg. utilization of GPUs on node cn-004 | 0.282 |
| 248 | Avg. utilization of GPUs | 0.285 |
Table 9. Inferences
Based on the tuple $N^{405408}$, the system will associate the set $C^{405408} = \{\varphi_1\}$ with task 405408, since the task was executed without errors and all resources were used.
An example of a task report with an inference of inefficient salloc usage is shown in Fig. 2.
5 User Statistics
System administrators have access to inference statistics for each cluster user for a selected period of time. An example of such statistics is shown in Fig. 3. Using this pie chart, administrators can understand which types of tasks are causing difficulties for a user. After determining the problem the user has encountered, the user can receive a personal consultation to solve it.
Statistics of the most active users of the cluster with the lowest percentage of effective tasks are compiled monthly; personal consultations are held on the basis of these statistics. By tracking trends in user efficiency from month to month, we can assess how the HPC TaskMaster system increases the efficiency of using cluster resources.
6 Conclusions
The developed task performance monitoring system, HPC TaskMaster, is a powerful tool that provides all the necessary information about tasks (main information, aggregated metrics, graphs, and inferences) in one place. This system helps users identify problems both in existing scientific applications and in applications of their own development, thereby simplifying work with the cluster and allowing users to perform scientific calculations faster and more efficiently in the future.
HPC TaskMaster is constantly evolving and improving. Among the future
directions for development are:
– monitoring the effectiveness of individual categories of applications using
machine learning tools;
References
1. Open Source/HPC TaskMaster. GitLab. https://git.hpc.hse.ru/open-source/hpc-taskmaster
2. Slurm Workload Manager - acct_gather.conf. https://slurm.schedmd.com/acct_gather.conf.html
3. Chan, N.: A resource utilization analytics platform using grafana and telegraf
for the Savio supercluster. In: ACM International Conference Proceeding Series.
Association for Computing Machinery (2019). https://doi.org/10.1145/3332186.
3333053
4. Kostenetskiy, P.S., Chulkevich, R.A., Kozyrev, V.I.: HPC resources of the higher
school of economics. J. Phys. Conf. Ser. 1740, 012050 (2021). https://doi.org/10.
1088/1742-6596/1740/1/012050
5. Kraeva, Y., Zymbler, M.: Scalable algorithm for subsequence similarity search in
very large time series data on cluster of phi KNL. In: Manolopoulos, Y., Stupnikov,
S. (eds.) DAMDID/RCDL 2018. CCIS, vol. 1003, pp. 149–164. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-23584-0_9
6. Kychkin, A., Deryabin, A., Vikentyeva, O., Shestakova, L.: Architecture of com-
pressor equipment monitoring and control cyber-physical system based on influx-
data platform. In: 2019 International Conference on Industrial Engineering,
Applications and Manufacturing, ICIEAM 2019 (2019). https://doi.org/10.1109/
ICIEAM.2019.8742963
7. Nikitenko, D., et al.: JobDigest - detailed system monitoring-based supercomputer
application behavior analysis. In: Voevodin, V., Sobolev, S. (eds.) Supercomputing.
Communications in Computer and Information Science, vol. 793, pp. 516–529. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-71255-0_42
8. Nikitenko, D.A., Voevodin, V.V., Zhumatiy, S.A.: Deep analysis of job state
statistics on Lomonosov-2 supercomputer. Supercomput. Front. Innov. 5(2), 4–10
(2018). https://doi.org/10.14529/jsfi180201
9. Rohl, T., Eitzinger, J., Hager, G., Wellein, G.: Likwid monitoring stack: a flexible
framework enabling job specific performance monitoring for the masses (2017).
https://doi.org/10.1109/CLUSTER.2017.115
10. Safonov, A., Kostenetskiy, P., Borodulin, K., Melekhin, F.: A monitoring system
for supercomputers of SUSU. In: Proceedings of Russian Supercomputing Days
International Conference, vol. 1482, pp. 662–666. CEUR-WS (2015)
11. Wegrzynek, A., Vino, G.: The evolution of the ALICE O2 monitoring system.
In: EPJ Web of Conferences, vol. 245 (2020). https://doi.org/10.1051/epjconf/
202024501042
Constructing an Expert System for Solving Astrophysical Problems Based on the Ontological Approach
A. Sapetina et al.

1 Introduction
Modern astrophysics studies the physical processes of the Universe, the evolution
of astronomical objects and their interaction. Mathematical models of evolving
astronomical objects and their mutual influence are constructed on the basis
of the observed information taking into account the gravitational and magnetic
fields. It should be noted that mathematical modeling is the primary theoretical
method for studying astrophysical processes. It becomes necessary to solve a large class of problems associated with the study of the structure, dynamics, and evolution of stellar systems, the Sun and stars, variable stars, multiple stellar systems, and the physics of the interstellar medium.
A large number of parallel codes have been developed for solving astrophysical problems. We distinguish the following groups of codes: codes based on Smoothed Particle Hydrodynamics [1–3] and grid codes [4–6], including codes using adaptive [7–9] and moving [10–12] meshes. Each implemented numerical method and code focuses on a certain type of problem and is often limited to the use of classical supercomputer architectures. There are also codes adapted to graphics accelerators [13–15] and to Intel Xeon Phi accelerators [16]. However, using any of these codes to solve a specific astrophysical problem requires significant adaptation. Currently, there are no universal systems for generating astrophysical codes. Nevertheless, attempts to create such systems exist, for example, at the University of Costa Rica [17] on the basis of the EXCALC package. An intelligent system for generating astrophysical codes has not yet been created, although there are attempts to develop such a system, including attempts based on the ontological approach [18] and its practical application [19].
In [20,21], we presented the concept of intelligent support for solving compute-intensive problems of mathematical physics using an ontology. Let us briefly list the main blocks of the proposed system and their purpose (Fig. 1). The main block of the system is a knowledge base, which includes the ontology of numerical methods and parallel algorithms, the ontology of parallel architectures and technologies, and inference rules. Based on these ontologies, an information-analytical web resource is built; it allows the user to study the objects included in the knowledge base, to view the connections between them, and also to add new objects to the base. The next block is an expert system, to whose input the user submits the specification of the problem to be solved. Based on this information, the inference engine builds a scheme for solving the problem using ontology objects from the knowledge base and inference rules formulated by experts. When the solution scheme is determined, the next step is to build a parallel program for solving the problem. In this step, modules from the software library are used. If there is no suitable module, the user will have to develop it themselves. Thus, a parallel code is generated taking into account the computational algorithm and the architecture of the selected computing system. The system also includes a simulation block, which allows one to determine the optimal number of computing cores for solving the problem.
To work with ontological models, inference engines are used; they allow one to check the correctness of the ontology, operating with the names of classes, properties, and entities. They can also be used to derive information that is not explicitly contained in the ontology, based on inference rules. There are several inference engines, the most famous of which are Pellet, HermiT, and FaCT++. These inference engines are installed as plugins for the Protege ontology editor [22].
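For illustration, such a reasoner can also be run programmatically; below is a minimal sketch using the owlready2 Python library with its bundled Pellet reasoner (the ontology file name is a placeholder).

```python
from owlready2 import get_ontology, sync_reasoner_pellet

# Load the ontology (placeholder file name) and run the Pellet reasoner:
# inferred class memberships are written back into the loaded ontology,
# and an inconsistent ontology raises an error.
onto = get_ontology("file://ontology.owl").load()
with onto:
    sync_reasoner_pellet(infer_property_values=True)

# Inspect what the reasoner inferred.
for cls in onto.classes():
    print(cls.name, list(cls.instances()))
```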
The goal of this work is to develop a crucial component for solving astrophysical problems using the ontological approach: the assignment of a group of inference rules that determine the choice of the numerical method and of the computing system architecture from those available in the ontology.
Fig. 1. Main blocks of the intelligent support system for solving compute-intensive
problems of mathematical physics.
In [20], we gave an example of choosing a chain of objects from the main classes of such an ontology to solve an astrophysical problem associated with the collision of galaxies. That work does not consider in detail how this chain should be built, including the questions of setting the group of rules on the basis of which the numerical method and the architecture of the computing system are selected from those available in the ontology.
In [24], we considered a conceptual model for constructing a scheme for solving a mathematical physics problem based on the ontological approach (Fig. 2). It highlights the main blocks of the problem specification (the user interface), the main blocks of the solution scheme, and the groups of rules that must be set for the automatic construction of the scheme. These are the groups of rules determining a system of equations, a numerical method, the implementation of a parallel algorithm, the properties of this algorithm, and parallel computing architectures and technologies. Essentially, these are the points where the user makes decisions and where intelligent support is needed to select the optimal solution, including from the point of view of parallel implementation. Therefore, for each subject area, it is necessary to develop a set of such rules that will allow the user to avoid mistakes when developing a parallel algorithm and a program for solving their problem.
Fig. 2. Scheme of relationships between the main blocks of the user interface (high-
lighted in blue), problem-solution scheme blocks (highlighted in yellow), and rule groups
(highlighted in green). (Color figure online)
THE WRITER AND MONEY
THE WRITER'S MARRIAGE
THE WOMAN WRITER
LITERARY SALONS
Writers formerly went to the café; they made their débuts there; they lived there; sometimes they died there, or very nearly. The great Moréas was perhaps the last to lead intrepidly, and all the way to the hospital, that independent and bohemian existence. It had its advantages, assuring the mind a freedom it could not find so complete anywhere else. It had its drawbacks, one of which, and not the least, was to separate men of letters almost completely from women, at least from women who do not frequent cafés, and they are the great majority. Another of these drawbacks is that one can hardly go to the café, and linger there, without drinking. The literature of that time therefore drank, and not without excess... The medical Faculty of our day observes that there exists "an alcoholism of society people" based on port and cocktails. There was, in that now prehistoric era, an alcoholism of literary men, based on absinthe and other violent and popular beverages.
No one will ever know why painters still go to the café while men of letters are abandoning it. It may be because there survives, in painting, more fantasy and more of the revolutionary spirit, if one takes the latter term in the sense of a certain reluctance to bow to a minimum of worldly conventions, and also of a marked taste for theoretical discussions. Theoretical discussions can hardly take place anywhere but at the café, and among men, or at least in the presence of ladies who are there only to wait patiently until their friend finally judges that it is time to go home to bed.
The café, for literature, above all for the very young literature, has been replaced by the dancing-bar, which is more expensive, and where one meets ladies who are also more expensive, though of a social level analogous to that of the persons who formerly accompanied their lords and masters to the brasserie; but above all by the salons.
There exist at this moment very few "literary" salons in the proper sense of the word, that is, salons where one man of letters, or several, hold the floor and lead the conversation. But there are many more than before in which young men of letters are admitted on an equal footing with people of society or of considerable fortune. This comes, as has been said, from the tendency of the ruling and conservative classes to annex literature as a force. The young men of letters make lady friends there, neither more nor less faithful than those their predecessors used to take to the café, but differing from them in their social rank, their manner of living and, in a few nuances, their way of viewing the problems of love. They have, moreover, owing to their habit of society and to their position, more authority; they demand not to be left entirely out of the conversation, even when it is a conversation "of ideas", which, at a pinch, can happen.
It results from this evolution of manners that the literature of former days, the literature of the café, had an excessive tendency to masculinize itself, and that the literature of today shows, in the opposite direction, a propensity to feminize itself, while asserting itself, in some manner, as antifeminine. It is in better tone, and more gallant; it is less romantic, less oratorical, more witty, light, and psychological; it seeks other kinds of superiority, and it also admits other kinds of mediocrity. It must not be believed that the literary cafés had no snobbery of their own: theirs was that of violence, of truculent coarseness and, toward the end, of an anarchic individualism... The more or less literary salons of our day have theirs, dictated by a few more or less young reviews which claim to express the finest of the fine and to have a taste that is not that of the common crowd (the snobbery of boredom, M. Henri Béraud has said, with harshness and without sufficient nuance), as well as that of decent opinions, not in morals, where people are very indulgent, but in politics.
The café was readily libertarian; the salon is conservative, though in a platonic and ineffectual way. It could not, indeed, go very far: for it receives not only men of letters and people of society, but politicians of the parties in power, who are also, for the mistress of the house, numbers "to be shown off". Often, moreover, material interests, "business" interests, have something to do with it. One always has some small favor to ask of a politician! Besides, it is generally agreed that he thinks less badly than one would have expected, that at bottom "he is one of us". One keeps the vague hope of winning him over completely. This error is excusable: in Paris, and in a Parisian milieu, the politician talks as people talk in Paris; he does not care to make enemies for himself. Once his back is turned, he begins again to think of his provincial electors, who themselves do not think like the habitués of that Parisian salon. He knows what must be said, and what must be left unsaid. In the end, it is not his electors he will betray; but the salon will not hold it against him for long, because, in spite of everything, one must "have" him.
The salon exerts no real influence on literature; it does not lead it, it points out no direction to it, for the reason that little thought is given to literature there, and conversations "of ideas" are rare there in our day. Moreover, apart from the writers of the fashionable little coteries of which I was speaking a moment ago, it is content to welcome the writers whom public favor has singled out through large print runs, or whom certain reviews have singled out through their publicity; it does not make reputations. It has, however, the advantage of constituting a meeting place for men of letters who until then knew one another only through their works, or not at all. It can also serve an academic candidacy.
Pamphile, who is only a neophyte, says little there, except to the women, in which he is quite right; and, with them, he does not talk literature. But that does not prevent him from having eyes and ears. He listens attentively, and knows how to look; he comes away, most often, with observations that amuse me. I am not at all astonished (such illusions belong to his age) that he should find himself disappointed to see that many authors do not resemble their works. Belphégor, so ardent and so incisive in his writings, appears to him in the form of a little fair-haired man, timid and gentle as an Eliacin who would rather split hairs than recite the lessons of the high priest Joad. He is astonished that Vergis, who published the two finest lyric and romantic novels of the end of romanticism, will no longer hear of anything but Buddhist philosophy; that Paulus, who shows so much wit in his books and in the theater, commonly pours himself out in jokes that would not even do honor to the Argus of the café du commerce of a small provincial town, but which are nonetheless received as marks of exceptional originality.
Finally, Pamphile has discovered Lépide, whose success, in this salon and in several others, remains a mystery to him. Lépide is dull, even gray, tedious, and says nothing on any subject that ever deserves to be remembered. One would think him born for diplomacy rather than for literature. But it is to literature that he applies his diplomacy. He writes; he composes works; but his works, tedious enough, always have, in addition, the fault of recalling those of some predecessor. His style is pure, but without character: a transparent and insipid water. One can retain nothing of it. Yet there he is, and the place accorded to him is a distinguished one, like his person, stamped with that truly worldly elegance which consists in presenting nothing remarkable. No one doubts that he is destined for the most brilliant future.
Pamphile, a little shocked, asks me the reason for it.
"There is none," I tell him. "There are only, in literature, salon reputations, just as there were, thirty years ago, café reputations, quite as little deserved. They are not the same ones, that is all. The café loved the loud-mouthed and took their noisy vulgarity for originality. Society loves self-effacing, discreet, obliging people. It adopts them; it will not oblige anyone to read their books, for that is not in its power; but it can push them all the way to the Académie."
"So Lépide will be of the Académie?"
"Why not? He is good company. That is a merit, and one cannot indefinitely say no to an amiable man whom one meets everywhere one goes, and about whom there is nothing to be said, either good or ill. Once dead, he will be as though he had never existed. His last reader, and perhaps his first, will be the man who replaces him under the Coupole. The unfortunate fellow will have trouble getting through it; but he will get through it if, in a discreet way, he manages to suggest that there are writers whose influence is personal and does not come from their works."
CHAPTER XX
THE WRITER AND THE ACADÉMIE