Get The Science of Science 1st Edition Dashun Wang Free All Chapters
The Science of Science
Albert-László Barabási
Northeastern University, Boston
University Printing House, Cambridge cb2 8bs, United Kingdom
A catalogue record for this publication is available from the British Library.
Karim Lakhani, David Lazer, Jessie Li, Zhen Lei, Jess Love, Stasa
Milojevic, Federico Musciotto, Willie Ocasio, Sandy Pentland, Alex
Petersen, Filippo Radicchi, Iyad Rahwan, Lauren Rivera, Matt Salganik,
George Santangelo, Iulia Georgescu, Ned Smith, Paula Stephan,
Toby Stuart, Boleslaw Szymanski, Arnout van de Rijt, Alessandro
Vespignani, John Walsh, Ludo Waltman, Kuansan Wang, Ting Wang,
Adam Waytz, Klaus Weber, Stefan Wuchty, Yu Xie, Hyejin Youn.
Among the many people who made this journey possible, our
special thanks go to two of them in particular, whose “bright ambience”
is especially omnipresent throughout the pages and over the
years. Brian Uzzi is not only always generous with his time, he also
has remarkable insight. Time and again he manages to take ideas we’re
struggling to present and effortlessly elevate them. We are also grateful
to Brian for championing the somewhat radical idea that physicists can
contribute to the social sciences. Our colleague James Evans has been a
close friend and collaborator, and several ideas discussed in this book
would not have been born without him, including but not limited to the
concluding remarks on engaging “all the science of science” for the
future development of the field.
We are extremely grateful for the generous research support we
have received over the years. In particular Riq Parra from AFOSR has
been a true believer from the very beginning, when few people knew
what we meant by the “science of science.” Many concepts discussed in
this book would not have been possible without his strong and continued
support. Dashun also wishes to express his special thanks to the
Kellogg School of Management, an institution that offered a level of
support and trust that most researchers only dream of.
Many unsung heroes contributed to this text, collectively logging
countless hours to help guide the book along. We have benefited
tremendously from the excellent and dedicated editorial assistance of
Carrie Braman, Jake Smith, and James Stanfill, as well as Enikő Jankó,
Hannah Kiefer, Alanna Lazarowich, Sheri Gilbert, Michelle Guo, and
Krisztina Eleki. Special thanks to Alice Grishchenko who took on the
brave redesign effort, diligently redrawing all the figures and giving a
visual identity to the book. Yian Yin and Lu Liu have been our go-to
people behind the scenes, springing into action whenever we needed
help, and contributing to a variety of essential tasks ranging from data
analysis to managing references.
x / Acknowledgements
quantify the patterns that characterize discovery and invention, and offer
lessons to improve science as a whole. In this book, we aim to introduce
this burgeoning field – its rich historical context, exciting recent developments,
and promising future applications.
We had three core audiences in mind as we wrote this book.
The primary audience includes any scientist or student curious about
the mechanisms that govern our passion, science. One of the founding
fathers of the science of science, Thomas Kuhn, a physicist turned
philosopher, triggered worldwide interest in the study of science back
in 1962 with the publication of The Structure of Scientific Revolutions.
Kuhn’s notion of “paradigm shift” is used today in almost every creative
activity, and continues to dominate the way we think about the
emergence and acceptance of new ideas in science. In many ways, the
science of science represents the next major milestone in this line of
thinking, addressing a series of questions that are dear to the heart of
every scientist but may well lie outside of the Kuhnian worldview:
When do scientists do their best work? What is the life cycle of scientific
creativity? Are there signals for when a scientific hit will occur in a
career? Which kinds of collaboration triumph and which are destined
for disaster? How can young researchers maximize their odds of success?
For any working scientist, this book can be a tool, providing data-driven
insight into the inner workings of science, and helping them
navigate the institutional and scholarly landscape in order to better
their career.
A broader impact of the science of science lies in its implications
for policy. Hence, this book may be beneficial to academic administrators,
who can use the science of science to inform evidence-based decision-making.
From department chairs to deans to vice presidents of
research, university administrators face important personnel and investment
decisions as they try to implement and direct strategic research.
While they are often aware of a profusion of empirical evidence on this
subject, they lack cohesive summaries that would allow them to extract
signals from potential noise. As such, this book may offer the knowledge
and the data to help them take better advantage of the useful insights
the science of science community has to offer. What does an h-index of
25 tell us about a physics faculty member seeking tenure? Which would
the department benefit from most: a junior or a senior hire? When
should we invest in hiring a superstar, and what can we expect their
impact to be?
3 / Introduction
our best work and what distinguishes us from one another. The Science
of Collaboration explores the advantages and pitfalls of teamwork,
from how to assemble a successful team to who gets the credit for the
team’s work. The Science of Impact explores the fundamental dynamics
underlying scientific ideas and their impacts. The Outlook part summarizes
some of the hottest frontiers, from the role of AI to bias and
causality. Each part begins with its own introduction which illuminates
the main theme using questions and anecdotes. These questions are then
addressed in separate chapters that cover the science relevant to each.
By analyzing large-scale data on the prevailing production and
reward systems in science, and identifying universal and domain-specific
patterns, science of science not only offers novel insights into the nature
of our discipline, it also has the potential to meaningfully improve our
work. With a deeper understanding of the precursors of impactful
science, it will be possible to develop systems and policies that more
reliably improve the odds of success for each scientist and science
investment, thus enhancing the prospects of science as a whole.
Part I THE SCIENCE OF CAREER
[Figure 1.1 plot, two panels: (a) overall number (logarithmic axis, 10^3 to 10^6) versus year, 1900–2000; (b) average number (0 to 4) versus year, 1900–2000.]
Figure 1.1 The growing number of scientists. (a) During the past century, both
the number of scientists and the number of papers have increased at an exponential
rate. (b) The number of papers coauthored by each scientist has been hovering
around two during the past 100 years, and increased gradually in the past 15 years.
This growth is a direct consequence of collaborative effects: Individual productivity
is boosted as scientists end up on many more papers as coauthors. Similar trends
were reported using data within a single field [5]. For physics, for example, the
number of papers coauthored by each physicist has been less than one during the
past 100 years, but increased sharply in the past 15 years. After Dong et al. [4] and
Sinatra et al. [5].
not feel their theory is fully articulated unless the introduction of the
paper spans a dozen pages. Meanwhile, a paper published in Physical
Review Letters, one of the most respected physics journals, has a strict
four-page limit, including figures, tables, and references. Also, when we
talk about individual productivity, we tend to count publications in
scientific journals. But in some branches of the social sciences and
humanities, books are the primary form of scholarship. While each
book is counted as one unit of publication, that unit is admittedly much
more time-consuming to produce.
And then there is computer science (CS). As one of the youngest
scientific disciplines (the first CS department was formed at Purdue
University in 1962), computer science has adopted a rather unique
publication tradition. Due to the rapidly developing nature of the field,
computer scientists choose conference proceedings rather than journals
as their primary venue to communicate their advances. This approach
has served the discipline well, given everything that has been accomplished
in the field – from the Internet to artificial intelligence – but it
can be quite confusing to those outside the discipline.
10 / The Science of Science
[Figure plot: distribution of the number of publications (logarithmic horizontal axis, 10^0 to 10^4) for all authors, shown with a lognormal distribution.]
Andre De Grasse in third, with running times 9.89 s and 9.91 s, respectively.
These numbers are awfully close, reflecting a well-known fact that
performance differences between individuals are typically bounded [16].
Similarly, Tiger Woods, even on his best day, only took down his closest
contenders by a few strokes, and the fastest typist may only type a few
words more per minute than a merely good one. The bounded nature of
performance reminds us that it is difficult, if not impossible, to significantly
outperform the competition in any domain. Yet, according to
Fig. 1.2, this boundedness does not hold for scientific performance.
Apparently, it is possible to be much better than your competitors when
it comes to churning out papers. Why is that?
If any of these steps fails, there will be no publication. Let us assume that
the odds of a person clearing hurdle $F_i$ from the list above are $p_i$. Then,
the publication rate of a scientist is proportional to the odds of clearing
each of the subsequent hurdles, that is, $N \sim p_1 p_2 p_3 p_4 p_5 p_6 p_7 p_8$. If
these odds are independent random variables, then the multiplicative
nature of the process predicts that $P(N)$ follows a lognormal distribution
of the form (1.1), because $\log N$ is then a sum of independent terms.
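The multiplicative argument above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name, the uniform range used for each odds value, and the sample size are hypothetical choices for illustration, not values from the text. It multiplies eight independent random odds per simulated scientist and compares the resulting rates with their logarithms.

```python
import math
import random
import statistics

def simulate_publication_rates(n_scientists=10000, n_hurdles=8, seed=0):
    """Draw N ~ p1 * p2 * ... * p8 for each simulated scientist.

    Each p_i is drawn independently and uniformly from [0.1, 1.0];
    that range is an arbitrary assumption chosen for illustration.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_scientists):
        product = 1.0
        for _ in range(n_hurdles):
            product *= rng.uniform(0.1, 1.0)
        rates.append(product)
    return rates

rates = simulate_publication_rates()
log_rates = [math.log(r) for r in rates]

# log N is a sum of independent terms, so it is roughly bell-shaped,
# while N itself is strongly right-skewed (mean well above median).
print("mean N   :", statistics.mean(rates))
print("median N :", statistics.median(rates))
print("mean log N :", statistics.mean(log_rates))
print("stdev log N:", statistics.stdev(log_rates))
```

In such a run the mean of N sits well above its median, which is the signature of a heavy right tail: a few simulated scientists clear every hurdle with high odds and end up far ahead of the typical one.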
To understand where the outliers come from, imagine that
Scientist A has the same capabilities as Scientist B in all factors, except
that A is twice as good at solving a problem ($F_2$), knowing when to stop
($F_4$), and determination ($F_7$). As a result, A’s productivity will be eight
times higher than B’s ($2 \times 2 \times 2 = 8$). In other words, for each paper published by
Jorge E. Hirsch in 2005 [26]. What is the h-index, and how do we calculate
it? Why is it so effective in gauging scientific careers? Does it predict the
future productivity and impact of a scientist? What are its limitations?
And how do we overcome these limitations? Answering these questions
is the aim of this chapter.
Therefore, if we define
$$m \equiv \frac{1}{1/c + 1/n}, \qquad (2.2)$$
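As a rough numerical sketch of the definition above, the snippet below reads (2.2) as m = 1 / (1/c + 1/n) and treats the h-index as growing linearly with career time at slope m, which is how the chapter describes the relationship. The function names and the sample values of c, n, and career length are hypothetical, and the units of c and n are left as the text defines them.

```python
def h_slope(c, n):
    """Slope m from (2.2), read here as m = 1 / (1/c + 1/n)."""
    if c <= 0 or n <= 0:
        raise ValueError("c and n must be positive")
    return 1.0 / (1.0 / c + 1.0 / n)

def expected_h(c, n, years):
    """h-index after `years`, assuming linear growth h = m * t."""
    return h_slope(c, n) * years

# Hypothetical values, chosen only to make the arithmetic easy to follow.
print(h_slope(4, 4))         # 2.0
print(expected_h(4, 4, 20))  # 40.0

# Raising either c or n raises m, as the text argues.
print(h_slope(8, 4) > h_slope(4, 4))  # True
print(h_slope(4, 8) > h_slope(4, 4))  # True
```

Because m combines c and n like two resistors in parallel, it is always smaller than both, so the weaker of the two quantities limits how fast h can grow.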
[Figure 2.1 plot: two panels on logarithmic axes, each with horizontal axis N.]
Figure 2.1 The h-index of Albert Einstein (a) and Peter Higgs (b). To calculate
the h-index, we plot the number of citations versus paper number, with papers listed
in order of decreasing citations. The intersection of the 45° line with the curve gives
h. The total number of citations is the area under the curve [26]. According to
Microsoft Academic Graph, Einstein has an h-index of 67, and Higgs 8. The top
three most cited papers by Einstein are: (1) Can quantum mechanical description of
physical reality be considered complete, Physical Review, 1935; (2) Investigations
on the theory of Brownian movement, Annalen der Physik, 1905; and (3) On the
electrodynamics of moving bodies, Annalen der Physik, 1905. The top three for
Higgs are: (1) Broken symmetries and the masses of gauge bosons, Physical Review
Letters, 1964; (2) Broken symmetries, massless particles and gauge fields, Physics
Letters, 1964; (3) Spontaneous symmetry breakdown without massless bosons,
Physical Review, 1966.
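The graphical procedure in the caption can be sketched in a few lines of Python. The function name and the example citation counts are hypothetical; the logic uses the standard definition of the h-index as the largest h such that h papers have at least h citations each, which corresponds to where the ranked citation curve crosses the 45° line.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # List papers in order of decreasing citations, as in the figure.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:  # curve is still on or above the 45-degree line
            h = rank
        else:
            break
    return h

example = [25, 8, 5, 3, 3, 1]  # hypothetical citation counts
print(h_index(example))  # 3
```

Note that the 25 citations of the top paper contribute no more to the result than 3 citations would, a property the chapter later lists among the limitations of the measure.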
Chapter 19). Yet, despite the model’s simplicity, the linear relationship
predicted by (2.3) holds up generally well for scientists with
long scientific careers [26].
This linear relationship (2.3) has two important implications:
(1) If a scientist’s h-index increases roughly linearly with time, then its
speed of growth is an important indicator of her eminence. In other
words, the differences between individuals can be characterized by
the slope, m. As (2.2) shows, m is a function of both n and c. So, if a
scientist has higher productivity (a larger n), or if her papers collect
more citations (higher c), she has a higher m. And the higher the m,
the more eminent is the scientist.
(2) Based on typical values of m, the linear relationship (2.3) also
offers a guideline for how a typical career should evolve. For
Q1: Given the value of a metric at a certain time t1, how well does it
predict the value of itself or of another metric at a future time t2?
[Figure 2.2 plot: four scatter-plot panels (a)–(d); surviving axis labels include h(t1), c(t1), and c(t2).]
Figure 2.2 Quantifying predictive power of the h-index. Scatter plots compare
the total number of citations, C, after t2 = 24 years vs. the value of the various
indicators at t1 = 12 years for each individual within the sample. Hirsch hypothesized that
C may grow quadratically with time, and hence used its square root when calculating
the total number of citations. By calculating the correlation coefficient, he found that
the h-index (a) and the number of citations at t1 (b) are the best predictors of the
future cumulative citations at t2. The number of papers correlates less (c), and the
number of citations per paper performs the worst (d). After Hirsch [31].
23 / The h-Index
(Fig. 2.2b), the total number of publications (Fig. 2.2c), and the average
number of citations per paper (Fig. 2.2d). He then asked: if we want to
select the candidates who will have the most total citations by year 24, which
one of the four indicators gives us the best chance? By measuring the
correlation coefficient between future cumulative citations at time t2 and
four different metrics calculated at time t1, he found that the h-index
and the number of citations at time t1 turn out to be the best predictors
(Fig. 2.2).
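The comparison described above can be sketched as a correlation calculation. Everything in this example is an illustrative assumption: the function name, the variable names, and especially the toy numbers, which are made up and are not Hirsch's data. It computes the Pearson correlation between each indicator at time t1 and the square root of cumulative citations at time t2, following the square-root convention mentioned in the caption of Fig. 2.2.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Made-up toy sample, one entry per scientist (NOT real data).
h_at_t1 = [5, 9, 12, 15, 20]
papers_at_t1 = [30, 12, 40, 25, 35]
citations_at_t2 = [150, 400, 900, 1300, 2500]

# Square root of cumulative citations, per the figure caption.
target = [math.sqrt(c) for c in citations_at_t2]

for name, metric in [("h-index", h_at_t1), ("paper count", papers_at_t1)]:
    print(name, round(pearson_r(metric, target), 3))
```

An indicator with a correlation closer to 1 ranks scientists at t1 in nearly the same order as their later cumulative citations, which is the sense of "best predictor" used in the text.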
While Fig. 2.2 shows that the h-index predicts cumulative
impact, in many cases it’s the future scientific output that matters the
most. For example, if we’re deciding who should get a grant, how many
more citations an applicant’s earlier papers are expected to collect in the
next few years is largely irrelevant. We’re concerned, instead, with
papers that the potential grantee has not yet written and the impact of
those papers. Which brings us to Q2:
Q2: How well do the different metrics predict future scientific output?
Highly cited papers. The main advantage of the h-index is that its
value is not boosted by a single runaway success. Yet this also means
that it neglects the most impactful work of a researcher. Indeed, once
a paper’s citations get above h, its relative importance becomes invisible
to the h-index. And herein lies the problem – not only do outlier
papers frequently define careers, they arguably are what define science
itself. Many remedies have been proposed to correct for this [34–39],
including the g-index (the highest number g of papers that
together received $g^2$ or more citations [40, 41]) and the o-index (the
geometric mean of the number of citations gleaned by a scientist’s
highest cited paper, $c^*$, and her h-index: $o = \sqrt{c^* h}$ [42]). Other
measures proposed to correct this bias include the a-index [36, 38];
h(2)-index [39]; hg-index [34]; $q^2$-index [37]; and more [35].
Inter-field differences. Molecular biologists tend to get cited more
often than physicists who, in turn, are cited more often than mathematicians.
Hence biologists typically have a higher h-index than physicists,
and physicists tend to have a higher h-index than
mathematicians. To compare scientists across different fields, we
must account for the field-dependent nature of citations [43]. This can
be achieved by the hg-index, which rescales the rank of each paper n
by the average number of papers written by authors in the same year
and discipline, $n_0$ [43], or the hs-index, which normalizes the h-index
by the average h of the authors in the same discipline [44].
Time dependence. As we discussed in Chapter 2.2, the h-index is time
dependent. When comparing scientists in different career stages, one
can use the m quotient (2.2) [26], or the contemporary h-index [45].
Collaboration effects. Perhaps the greatest shortcoming of the h-index
is its inability to discriminate between authors that have very different
coauthorship patterns [46–48]. Consider two scientists with similar
h indices. The first one is usually the intellectual leader of his/her
papers, mostly coauthored with junior researchers, whereas the second
one is mostly a junior author on papers coauthored with eminent
scientists. Or consider the case where one author always publishes
alone whereas the other one routinely publishes with a large number
of coauthors. As far as the h-index is concerned, all these scientists are
indistinguishable. Several approaches have been proposed to account for
the collaboration effect, including fractionally allocating credit in
multi-authored papers [48–50], and counting the different roles played
by each coauthor [51–54], for example by differentiating the first and
last authorships. Hirsch himself has also repeatedly acknowledged this
issue [46, 47], and proposed the hα-index to quantify an individual’s
scientific leadership for their collaborative outcomes [47]. Among all
the papers that contribute to the h-index of a scientist, only those
where he or she was the most senior author (the highest h-index among
all the coauthors) are counted toward the hα-index. This suggests that a
high h-index in conjunction with a high hα/h ratio is a hallmark of
scientific leadership [47].
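Three of the variants discussed above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names and the example data are hypothetical, the g-index here is capped at the number of papers, the o-index uses the single most cited paper, and `h_alpha` follows only the simplified description given in the text (counting the papers in the h-core on which the scientist is the most senior author), with seniority supplied as a precomputed flag.

```python
import math

def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

def g_index(citations):
    """Largest g such that the top g papers together have >= g**2 citations."""
    ranked = sorted(citations, reverse=True)
    g, running_total = 0, 0
    for rank, c in enumerate(ranked, start=1):
        running_total += c
        if running_total >= rank ** 2:
            g = rank
    return g

def o_index(citations):
    """Geometric mean of the top paper's citations and the h-index."""
    if not citations:
        return 0.0
    return math.sqrt(max(citations) * h_index(citations))

def h_alpha(papers):
    """Count h-core papers on which the scientist is the most senior author.

    `papers` is a list of (citations, is_most_senior_author) pairs; the flag
    is assumed to have been computed elsewhere from coauthors' h-indices.
    """
    h = h_index([c for c, _ in papers])
    core = sorted(papers, key=lambda p: p[0], reverse=True)[:h]
    return sum(1 for _, is_alpha in core if is_alpha)

# Hypothetical publication record.
papers = [(10, True), (8, False), (5, True), (4, False), (3, True)]
cites = [c for c, _ in papers]

print(h_index(cites))                    # 4
print(g_index(cites))                    # 5
print(h_alpha(papers))                   # 2
print(h_alpha(papers) / h_index(cites))  # 0.5
print(o_index(cites))
```

In this toy record the g-index exceeds the h-index because the top papers carry surplus citations that the h-index ignores, while the ratio of 0.5 indicates that the scientist led only half of the papers that make up the h-core.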
between authors and expert reviewers, suggesting that the status signaling
may be less of a concern for scientific manuscripts. Indeed, through
those rebuttals and revisions, an objective assessment of the work is
expected to prevail. Yet, as we see next, the status effect is rarely
eliminated.
Whether an author’s status affects the perceived quality of his/
her papers has been long debated in the scientific community. To truly
assess the role of status, we need randomized controlled experiments,
where the same manuscript undergoes two separate reviews, one in
which the author identities are revealed and another in which they are
hidden. For obvious ethical and logistical reasons, such an experiment
is difficult to carry out. Yet, in 2017, a team of researchers at Google
were asked to co-chair the program of the Tenth Association for
Computing Machinery International Conference on Web Search and
Data Mining (WSDM), a highly selective computer science conference
with a 15.6 percent acceptance rate. The researchers decided to use the
assignment as a chance to assess the importance of status for a paper’s
acceptance [62].
There are multiple ways to conduct peer review. The most
common is the “single-blind” review, in which the reviewers are fully
aware of the identity of the authors and the institution where they
work, but the authors of the paper are not privy to the reviewers’
identities. In contrast, in “double-blind” review, neither the authors nor
the reviewers know each other’s identity. For the 2017 WSDM conference,
the reviewers on the program committee were randomly split into a
single-blind and a double-blind group. Each paper was assigned to four
reviewers, two from the single-blind group and two from the double-
blind group. In other words, two groups of referees were asked to
independently judge the same paper, where one group was aware of
who the authors were, while the other was not.
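The experimental design described above can be sketched in code. This is a simplified illustration under stated assumptions: the function name, identifiers, and group sizes are hypothetical, and reviewers are drawn uniformly at random within each group, whereas a real conference would also account for expertise and conflicts of interest.

```python
import random

def assign_reviewers(paper_ids, reviewer_ids, per_group=2, seed=0):
    """Randomly split reviewers into two groups and assign each paper
    `per_group` reviewers from each group."""
    rng = random.Random(seed)
    shuffled = list(reviewer_ids)
    rng.shuffle(shuffled)

    # Random split of the program committee into two review conditions.
    half = len(shuffled) // 2
    single_blind = shuffled[:half]   # these reviewers see author identities
    double_blind = shuffled[half:]   # these reviewers do not
    if min(len(single_blind), len(double_blind)) < per_group:
        raise ValueError("not enough reviewers in each group")

    assignment = {}
    for paper in paper_ids:
        assignment[paper] = {
            "single_blind": rng.sample(single_blind, per_group),
            "double_blind": rng.sample(double_blind, per_group),
        }
    return single_blind, double_blind, assignment

# Hypothetical identifiers for illustration.
papers = [f"paper-{i}" for i in range(5)]
reviewers = [f"reviewer-{i}" for i in range(12)]
single, double, assignment = assign_reviewers(papers, reviewers)
print(assignment["paper-0"])
```

Because every paper is judged under both conditions, any systematic gap in scores between the two groups can be attributed to knowledge of the authors rather than to differences between papers.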
Given the Lord Rayleigh example, the results were not surprising:
Well-known authors – defined as having at least three papers
accepted by previous WSDM conferences and at least 100 computer
science papers in total – were 63 percent more likely to have their paper
accepted under single-blind review than under double-blind review. The
papers under review in these two processes were exactly the same;
therefore, the difference in acceptance rate can only be explained by
author identity. Similarly, authors from top universities had a 58 percent
increase in acceptance once their affiliation was known. Further, for
The Project Gutenberg eBook of Aatetoverit
This ebook is for the use of anyone anywhere in the United States
and most other parts of the world at no cost and with almost no
restrictions whatsoever. You may copy it, give it away or re-use it
under the terms of the Project Gutenberg License included with this
ebook or online at www.gutenberg.org. If you are not located in the
United States, you will have to check the laws of the country where
you are located before using this eBook.
Title: Aatetoverit
A social novel
Language: Finnish
A social novel
By
MAX KRETZER
Translated into Finnish by
Juho Ahava
Mrs. Schorn was just about to step out through the door when she
noticed that, besides her husband, a stranger was also on his way to
her. That held her back, but made her peer curiously from the light
into the half-dark vestibule.
"Quiet, Ami, behave yourself!" said Mrs. Schorn, but the animal, which
otherwise obeyed every command and soon made friends, would not be
calmed today in its eagerness to attack. Not settling down at all in
its fury, it tried to seize Rassmann's leg, and only when Schorn barked
forcefully, "Just you try, off with you!" and waved his hand
threateningly did the dog draw aside; but the growl that accompanied
its retreat showed clearly that it was only waiting for the next
opportunity to renew its attack.
"Yes, Wilhelm."
Schorn continued;