Raup and Crick. Measurement of Faunal Similarity in Paleontology
SEPM Society for Sedimentary Geology is collaborating with JSTOR to digitize, preserve and extend access to
Journal of Paleontology.
ABSTRACT-A probabilistic index of faunal similarity is proposed which compares the number of taxa
common to two faunas with the number that would be expected to be in common if the taxa were
distributed randomly. Departures of observed from expected numbers in common express the level
of similarity or dissimilarity. The frequency of taxa in the whole data set is used to adjust for the
differing probability of occurrence of taxa (cosmopolitan versus endemic). The new index can be used
to determine whether similarities or dissimilarities between faunas are statistically significant.
The index is tested with 1) modern biogeography of echinoids, 2) environmental distribution of
modern foraminifera in Santa Monica Bay, and 3) Ordovician biogeography of nautiloids. In each
case, the proposed index is more effective than traditional indexes of faunal similarity (Simpson,
Jaccard, and Dice coefficients) in addition to the advantage of making possible rigorous assessment
of statistical confidence. The index should also be useful in a biostratigraphic context. The computer
program used for calculating the index is available from the authors.
[TEXT-FIG. 1: four Venn-diagram cases, numbered 1 to 4. Numerical labels are not recoverable.]
indicates the total pool (N) of taxa which could occur in an assemblage and the two smaller circles (A and B) are assemblages drawn from this pool. The overlap zone (k) between the smaller circles represents the number of taxa found in both assemblages. In all cases, the Simpson Coefficient is 20 yet one's intuition suggests that the four cases do not indicate the same similarity in the sense of process (meaning ecological, temporal or geographic similarity).

In recognition of these and other difficulties, many authors have proposed alternate schemes. Two of the coefficients most commonly applied in paleontology are the Jaccard and Dice. Cheetham and Hazel (1969) have provided an excellent comparative review of these and about 20 other similarity coefficients. Other critical reviews of selected coefficients exist in the literature (see, for example, papers by Henderson and Heron, 1977, and Simberloff, 1978). Most of these authors have emphasized that it is important to have valid measures of faunal similarity because of the increasing use of multivariate statistical analysis of large masses of distributional (presence/absence) data. The multivariate analysis can be only as good as the matrix of similarity values that forms its input!

At the risk of oversimplification, one can
argue that most existing similarity coefficients suffer from two main problems. First, they have not been derived in a mathematically rigorous way: that is, they have been 'thought up' rather than built on sound mathematical principles. Their validity has all too often been tested by whether they seem to work in practice. Second, they have not been tied to clearly defined null hypotheses; as a result, statistically meaningful comparisons between values of a coefficient are impossible. It has been impossible to say whether two assemblages are similar (or dissimilar) at the 95% level of confidence, for example. The discussion by Simberloff (1978) includes a particularly good treatment of this point.

Henderson and Heron (1977) recognized and discussed many of the problems just described and made an attempt to produce a rigorous and statistically valid similarity measure. The present effort takes a slightly different tack in the hope of developing a yet more robust approach to the similarity question. Our approach is similar to that of Simberloff (1978), but our objectives and the resulting technique are substantially different.

THE APPROPRIATE NULL HYPOTHESIS

Suppose that taxa are sprinkled randomly in space and time and that species lists are made up from the taxa that happen, by chance, to fall in certain areas and in certain stratigraphic intervals. Most of the species lists will differ from one another just because of the vagaries of sampling but they will have an average similarity which is predictable from the numbers of taxa, areas, and stratigraphic intervals involved. As will be shown below, it is possible under the random sprinkling hypothesis to predict how many species should be expected to be shared ('k') and the expected variation in this number. The expected 'k' and its probable variation constitute the appropriate null hypothesis for assessing faunal similarity.

In the real world, the distribution of taxa in space and time is generally non-random. Trilobites are confined to a small portion of geologic history (the Paleozoic), echinoderms are confined to marine environments, and so on. When we use the temporal confinement of trilobites or the ecological confinement of echinoderms to make other interpretations, we are tacitly rejecting the null hypothesis of random sprinkling. In the trilobite and echinoderm examples the departure from chance expectations is so obvious that sophisticated statistical testing is unnecessary. But most cases of interest are more subtle and rigorous treatment is obligatory, which is to say that one cannot rely on intuition alone.

In actual cases, it makes no difference whether the null hypothesis of randomness is rejectable 10% or 90% of the time. We wish to use it only as a standard of comparison and as a means of assessing the probability that two assemblages had different ecologic, temporal, or geographic settings. When dealing with assemblages from radically different facies or from different continents, one would expect to be able to reject the null hypothesis most of the time. On the other hand, when dealing with assemblages from the same formation in a local area, one would expect not to reject the null hypothesis and to conclude that the compositional differences between assemblages are just the result of chance differences in sampling.

We propose to use a comparison between the observed number of taxa common to two assemblages (or faunas) and the probability distribution of the expected number of common taxa as a measure of the similarity of the two assemblages. Assemblages which are more similar than predicted by the null hypothesis will be interpreted as indicating a positive bias in the make-up of the assemblages. That is, ecologic, temporal, or geographic factors must have limited the taxa available for those assemblages. Conversely, assemblages less similar than predicted will be interpreted as indicating a negative bias.

Simberloff (1978) had a quite different objective. Working with modern species distributions in the Galapagos Islands, he was asking whether the total distribution represents a significant departure from the null hypothesis of random sprinkling. That is, he was asking whether the array of species lists is consistent with the proposition that differences in composition result solely from sampling error (in dispersal of species) and not from real biogeographic effects.

METHODS

Consider the Venn diagrams in Text-figure 1 and assume, as before, that the areas of the circles correspond to the numbers of species in
[Four plot panels, numbered 1 to 4; horizontal axis kexp.]
TEXT-FIG. 2-Curves showing solutions to equation (4) for all possible values of 'k' in the four cases illustrated in Text-figure 1. The point marked kobs is the number of taxa observed in common in Text-figure 1. Note that the relationships are not continuous functions: only integer values of 'k' are possible.
the total pool and two assemblages drawn from that pool. Assume further that all species in the pool have an equal chance of being chosen for each of the assemblages. This is a simplistic assumption because it is well known that species vary in their abundance so that some have a much higher probability of occurring in any given assemblage than others. But this is a convenient scenario with which to introduce a methodology and is the one used by Henderson and Heron (1977) and by Simberloff (1978).

The situation just presented can also be thought of in the classic context of an 'urn problem.' Assume that a large urn contains many balls, each numbered differently, and that this collection of balls constitutes the pool of species (N). Now draw 'A' balls at random from the urn to define assemblage A. Then replace them and draw 'B' balls to form assemblage B. The question is: how many of the same balls will be found in both A and B?

This problem was solved by Henderson and Heron (1977) by a logical series of steps culminating in their equation (4) and in a slightly different form by Simberloff (1978), (Null Hypothesis I). The solution presented here is more straightforward and more flexible than either of the previous efforts.

The total number of different 'A' assemblages that can be drawn from the pool (N) is the number of combinations of N things taken A at a time, or

\[ {}_{N}C_{A} = \frac{N!}{(N-A)!\,A!} \tag{1} \]
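As a quick numerical illustration of equation (1), the count of possible assemblages can be evaluated directly. This is only a sketch: the function name `count_assemblages` and the pool and assemblage sizes (20 and 5) are hypothetical values chosen for illustration, not figures from the paper.

```python
from math import comb, factorial


def count_assemblages(pool_size: int, assemblage_size: int) -> int:
    """Number of distinct assemblages of a given size drawable from the pool.

    This is equation (1): N! / ((N - A)! * A!).
    """
    return comb(pool_size, assemblage_size)


# Hypothetical example: a pool of N = 20 taxa and an assemblage of A = 5 taxa.
N, A = 20, 5
n_assemblages = count_assemblages(N, A)

# The same value written out with factorials, term by term as in equation (1).
by_factorials = factorial(N) // (factorial(N - A) * factorial(A))

print(n_assemblages, by_factorials)
```

Even for this small hypothetical pool the number of possible assemblages is in the thousands, which is why the chance of two random assemblages sharing many taxa has to be computed rather than guessed.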
TABLE 1-Probabilities calculated from equation (4) for the four cases shown in Text-figure 2: 'kobs' is the number of species observed to be in common and 'kexp' is the number expected to be in common on the assumption of random sprinkling of species.

                                           Cases:
                                           1      2      3      4
Probability that kexp is less than kobs:   .77    .07    .39    .005
Probability that kexp equals kobs:         .21    .26    .24    .012
Probability that kexp exceeds kobs:        .02    .67    .37    .983
Total:                                     1.00   1.00   1.00   1.000
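The three rows of Table 1 can be sketched for the equal-probability urn problem described above. The paper's equation (4) is not reproduced in this excerpt, so the code below uses the standard hypergeometric probability for the number of balls common to two draws, which is assumed (not confirmed here) to be equivalent. The function names and the values N = 50, A = 10, B = 8, kobs = 3 are hypothetical and are not the cases of Text-figure 1.

```python
from math import comb


def shared_taxa_pmf(pool_size: int, size_a: int, size_b: int) -> dict:
    """Return {k: P(k taxa shared)} when A and B are drawn at random
    from the same pool and every taxon is equally likely to be chosen."""
    total = comb(pool_size, size_b)
    k_min = max(0, size_a + size_b - pool_size)
    k_max = min(size_a, size_b)
    return {
        k: comb(size_a, k) * comb(pool_size - size_a, size_b - k) / total
        for k in range(k_min, k_max + 1)
    }


def table_rows(pool_size: int, size_a: int, size_b: int, k_obs: int):
    """Probabilities that kexp is less than, equal to, and greater than kobs."""
    pmf = shared_taxa_pmf(pool_size, size_a, size_b)
    less = sum(p for k, p in pmf.items() if k < k_obs)
    equal = pmf.get(k_obs, 0.0)
    more = sum(p for k, p in pmf.items() if k > k_obs)
    return less, equal, more


# Hypothetical case: pool of 50 taxa, assemblages of 10 and 8, 3 taxa shared.
less, equal, more = table_rows(50, 10, 8, 3)
print(f"kexp < kobs: {less:.3f}")
print(f"kexp = kobs: {equal:.3f}")
print(f"kexp > kobs: {more:.3f}")
```

As in Table 1, the three probabilities for any one case sum to 1, and a very small value in the first or third row is what marks an observed overlap as unusually low or unusually high under random sprinkling.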
cosmopolitan genera are more likely to occur and are thus more likely to be genera common to both members of a pair of assemblages. Local endemics (those with probabilities of 1/40, in the echinoid case) can occur in two assemblages but the probability of this event is low.

One would naturally like to be able to derive an equation equivalent to equation (4) which would predict values of kexp under the conditions described above. We have been unable to derive this equation. Therefore, we have had to rely on monte carlo simulations, just as Simberloff did for his purposes. Our method is as follows:

1) For each pair of assemblage sizes in the real world data set construct an imaginary pair of assemblages (A and B) by drawing species from the pool. This is accomplished by computer with a random number generator. Having made the two assemblages, the lists of genera are compared and the number of genera in common is recorded. This number is one outcome of sampling under the random sprinkling hypothesis: that is, one point in a kexp probability distribution.
2) The same procedure is repeated many times with the number of taxa shared by each pair of assemblages being recorded.
3) A frequency distribution of the results is an estimate of the probability distribution of kexp under the specified conditions of A and B.
4) The number of taxa actually shared (kobs) by assemblages of these sizes in the real world is compared with the monte carlo generated distribution and the INDEX OF SIMILARITY is computed as in the simplified case described earlier.

An actual example of this procedure is illustrated in Text-figure 3 for a pair of sampling areas in the echinoid data set: these areas had 19 and 36 genera, respectively. A frequency distribution of 50 simulated assemblage pairs is shown in Text-figure 3 along with the actual number observed in common (5). In this case, kobs falls nearly at the center of the simulated distribution and the null hypothesis cannot be rejected. But the percentage of simulations having 'k' less than or equal to kobs may be used as the INDEX OF SIMILARITY.

When this procedure was followed with the entire echinoid data set, most cases fell between the 5% tails of their distributions (as in Text-fig. 3).

[Plot labels: A = 36, B = 19; horizontal axis kexp, 0 to 19.]
TEXT-FIG. 3-Example of treatment of a comparison between two echinoid faunas. kobs is the number of genera actually observed to be in common. The curve shows the percent of simulations yielding each value of kexp. The ruled portion indicates the number of simulations having 'k' values less than or equal to the observed value.

A special problem arises where the smaller assemblage (B) is very small. In one echinoid pair, for example, assemblage B contained only two genera and thus the only possible values of 'k' are 0, 1, and 2. Fifty simulations produced the following k's:

k = 0    40  (80%),
k = 1     9  (18%),
k = 2     1  (2%).
Total    50

There were no genera actually common to the two areas (kobs = 0). Thus, using the procedure described earlier, computing the INDEX would yield a value of 0.80. But this implies a higher similarity than may exist. In other words, we do not know where the value of kobs falls within the 80%. Therefore, an arbitrary convention was adopted: the INDEX is computed on the basis of the midpoint of the string of simulated 'k' values which are equal to the observed 'k.' The INDEX in this case is recorded as 0.40. The same convention was followed throughout. In the case illustrated in Text-figure 3, the INDEX was recorded as 0.39. This was arrived at by summing the percentages of the simulations that gave 'k' values less than kobs (2 + 6 + 20 = 28) and adding one-half the percentage of simulations where 'k' equaled kobs (1/2 x 22 = 11).
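The simulation steps and the midpoint convention can be sketched as follows. This is not the authors' program, which is not reproduced in the text. It assumes that each genus is drawn without replacement with a chance proportional to the number of sampling areas in which it occurs, which is one plausible reading of "differing probability of occurrence". All names (`draw_assemblage`, `simulate_shared_counts`, `index_of_similarity`) and the toy weights are hypothetical.

```python
import random
from collections import Counter


def draw_assemblage(weights: dict, size: int, rng: random.Random) -> set:
    """Draw `size` distinct taxa; each pick is weighted by the taxon's
    frequency of occurrence (assumed weighting scheme, see lead-in).
    `size` must not exceed the number of taxa with positive weight."""
    remaining = dict(weights)
    chosen = set()
    for _ in range(size):
        taxa = list(remaining)
        pick = rng.choices(taxa, weights=[remaining[t] for t in taxa], k=1)[0]
        chosen.add(pick)
        del remaining[pick]  # no taxon can appear twice in one assemblage
    return chosen


def simulate_shared_counts(weights, size_a, size_b, n_sims=50, seed=0):
    """Steps 1 and 2: build many random pairs and record the taxa shared."""
    rng = random.Random(seed)
    return [
        len(draw_assemblage(weights, size_a, rng)
            & draw_assemblage(weights, size_b, rng))
        for _ in range(n_sims)
    ]


def index_of_similarity(simulated_ks, k_obs):
    """Steps 3 and 4 with the midpoint convention: the fraction of
    simulations with k < kobs plus half the fraction with k == kobs."""
    counts = Counter(simulated_ks)
    below = sum(c for k, c in counts.items() if k < k_obs)
    return (below + 0.5 * counts[k_obs]) / len(simulated_ks)


# The small-assemblage example from the text: 40, 9 and 1 simulations
# gave k = 0, 1 and 2, and no genera were actually shared (kobs = 0).
example = [0] * 40 + [1] * 9 + [2] * 1
print(index_of_similarity(example, 0))  # 0.4, the value recorded in the text

# Toy run with hypothetical genera whose weights are occurrence counts.
toy_weights = {f"genus_{i}": i + 1 for i in range(10)}
toy_ks = simulate_shared_counts(toy_weights, size_a=3, size_b=4, n_sims=50)
print(Counter(toy_ks))
```

Drawing one taxon at a time and removing it from the pool keeps each assemblage free of duplicates while still favoring widespread genera. Other weighted-sampling schemes would give somewhat different kexp distributions, so the choice should be stated whenever results are reported.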
The use of monte carlo methods calls for considerable computation time, much more than would be required if an analytical expression comparable to equation (4) were available. But the results are just as accurate, given enough simulations. In the echinoid case we used 50 simulations for each pair of assemblage sizes. 100 or 1,000 simulations per pair would have produced more precise distributions but 50 was chosen as the best compromise with the limitations of computer budgets. The important point is that the simulation technique does not sacrifice rigor unless the number of simulations is too low.

The computer program used for the echinoid and other analyses is available from the authors. It is a relatively expensive program to run. The cost depends on the number of assemblages and the variation in their sizes. The echinoid data set described here is unusually large and requires about 26 cpu minutes on an IBM 360/65 to produce the similarity matrix plus a complete record of the 19,750 simulations required for the job. A variety of techniques could be used to reduce the cost but they would sacrifice accuracy.

APPLICATIONS

The method described in this paper can be applied to any data set consisting of presence and absence of taxa. In other words, any situation which yields floral or faunal lists is appropriate. Each list may represent a single collecting locality or a composite of information from a group of geographically, ecologically or stratigraphically related localities.

In order to test the methodology, we will present three quite different examples: 1) global biogeography of living echinoid echinoderms, 2) distribution of benthic foraminifera in Santa Monica Bay, California, and 3) global biogeography of Ordovician nautiloid cephalopods. It should be emphasized that the proposed INDEX OF SIMILARITY, like all other similarity measures, is a purely descriptive tool. Its purpose is to measure similarities and differences between taxonomic lists and to assess the statistical significance of these similarities and differences. It does not interpret the results in the sense of telling us the biological or geological factors responsible for the similarities or differences.

Echinoid biogeography.-The data set used for this test was presented briefly above. The distributional data all come from Mortensen's Monograph of the Echinoidea (1928-51) which provides a consistent and authoritative taxonomic base. Of the forty geographic sampling areas used for the present study, most are relatively shallow water coastal or insular areas where distributions of taxa tend to reflect regional climate. The others have uniformly cold water faunas: the non-insular ocean areas of the North Pacific, South Pacific, North Atlantic, Central Atlantic, and South Atlantic and the Arctic and Antarctic Oceans.

As indicated earlier, the data set consists of 222 genera which range from local endemics to those found in as many as 20 of the 40 sampling areas. In keeping with the philosophy of the method, no data were discarded because of endemism or cosmopolitanism and no areas were excluded because of small sample size.

The basic computer program was run to assess similarity between the members of all possible pairs of the 40 generic lists. Fifty simulations were used for each pair of assemblage sizes. The output consisted of 1) the tabulated results of all simulations (number of genera in common) and 2) a matrix of values of the computed INDEX OF SIMILARITY. Various analyses were performed on the output, some of which will be described below.

Text-figure 4 shows how one of the sampling areas compares with the other thirty-nine. The reference area (marked by an 'X') on the west coast of Central America was chosen arbitrarily and other choices produce comparable results. Values of the INDEX OF SIMILARITY are contoured and show decrease in similarity with distance from the reference area. Contouring was straightforward; that is, extreme contortion of contour lines was not necessary. Furthermore, the resulting pattern is plausible and interpretable in biogeographic terms. The map shows clearly that the echinoids of the Eastern Pacific are much more similar to those of the Western Atlantic than to those of the Western Pacific and Indian Ocean regions. In fact, the presence of the Central American barrier is not evident in the pattern. (This would not be the case at the species level where virtually no echinoid species are common to both sides of the Isthmus of Panama.) While some details of the pattern may reflect sampling error, there is no reason to believe that Text-figure 4 is not a
TEXT-FIG. 5-Multivariate analysis of echinoid biogeographic data. The first two principal components (PCI and PCII) are plotted for the 40 sampling areas. PCI separates coastal areas of the Indo-Pacific from other areas and from the cold water, open ocean areas. PCII reflects water temperature.
sampling error. The orderliness of the contour lines strongly suggests, of course, that the Alaskan and Central American echinoids are in fact different in the sense that they do not represent random sprinkling from the same pool.

Text-figures 5 and 6 show 2-dimensional ordination plots of the first three principal component axes, representing 98.5 percent of the variation in the data set. The principal components, PCI, PCII, and PCIII, account for 51, 28, and 19.5 percent of the variation, respectively. The sampling areas which form tight, natural groups are shown as solid dots and the groups are labeled. Others are shown as open circles and identified individually.

Text-figure 5 is an ordination of PCI and PCII. PCI clearly separates the coastal areas of the Indo-Pacific region from those of the Eastern Pacific and the Atlantic. This separation is the result of the East Pacific Barrier (Ekman, 1953), an 1,810 km expanse of open ocean separating the islands of Outer Polynesia and the tropical/subtropical coast of America. Under ordinary oceanic conditions, echinoid larvae are not capable of crossing this barrier and faunas on either side of the barrier are significantly different below the family level. Separation of the South Australian and New Zealand regions from the tropical/subtropical regions of the Indo-Pacific illustrates the cold-temperate character of the South Australian and New Zealand faunas. Although geographically proximal, the echinoid faunas of the Sea of Japan and the Sea of Okhotsk are remarkably different. The echinoid fauna of the Sea of Japan consists of shallow water, warm-temperate genera derived from the subtropical Indo-Pacific via the warm Kuroshio Current while the echinoid fauna of the Sea of
TEXT-FIG. 6-Multivariate analysis of echinoid biogeographic data. The first and third principal components (PCI and PCIII) are plotted for the 40 sampling areas. PCIII serves to separate the coastal areas of the Eastern Pacific and Atlantic into distinct regions.
Okhotsk consists of cold-temperate genera derived from the north via the cold Oyashio Current. Any chance mixing of the faunas is reduced by a shallow submarine sill separating the two bodies of water. Deep water and high latitude faunas tend to cluster in the lower right corner of Text-figure 5. Text-figure 6 shows that the coastal areas of the Eastern Atlantic, Western Atlantic, and Eastern Pacific are separated along PCIII. Naturally, statistical significance cannot be assessed in the results of the multivariate analysis but the ordination plots yield considerable information of biogeographic interest.

Foraminifera in Santa Monica Bay.-In a superbly detailed study, Zalesny (1959) recorded and interpreted the distribution of the foraminifera in the bottom sediments of Santa Monica Bay, California. Seventy bottom samples were analyzed to produce distributional data on 96 foraminiferal species and subspecies. The samples covered an area of approximately 100 square kilometers in water depths ranging from 10 to 828 meters. All occurrence data (in terms of percentage abundance) were tabulated in the Zalesny paper. For the present study, these were converted to simple presence and absence of taxa and the INDEX OF SIMILARITY was computed for all pairs of the 70 sampling localities.

Text-figure 7 shows a contour map of raw similarity data comparable to that for echinoids (Text-fig. 4). The reference fauna (Zalesny's sample #3110) is in the left central part of the map and is indicated by an 'X.' Contours reflect decreasing similarity of the other 69 localities with respect to the reference locality. Numerical values of the INDEX OF SIMILARITY are not shown in this case but the location of each site is shown by a small symbol. Those indicated by triangles are the ones that are significantly similar to sample #3110, those indicated by open circles are significantly dissimilar, and the solid dots represent localities which are not significant in either direction. Also included are the bathymetric contours for 10, 50, and 100 fathoms. The shelf edge is well defined in Santa Monica Bay and the continental slope is steep. The shelf is indented by two major submarine canyons: Santa Monica Canyon (near the reference locality) and the Redondo Canyon (southeast corner of the mapped area).

TEXT-FIG. 7-Analysis of foraminiferal assemblages from Santa Monica Bay. Solid contours indicate variation of the INDEX OF SIMILARITY with respect to an arbitrary reference fauna (#3110). Triangles represent assemblages which are significantly similar to the reference fauna; open circles are assemblages significantly dissimilar to the reference fauna; solid circles are assemblages not significantly similar or dissimilar to the reference fauna.

The contours of faunal similarity follow the bathymetry with remarkable faithfulness. The shelf edge is clearly defined and both canyons are evident. The one major anomaly is the small 'bump' on the inner shelf produced by sample #3348. Similarity between this sample and the deep water reference fauna is substantially higher than is found between the other shelf faunas and the reference fauna. It is not, however, a statistically significant anomaly. In fact, when #3348 is compared with the three closest localities, statistically significant similarity is found! #3348 may therefore be a simple chance departure from the overall pattern of the contoured similarity surface or it may
result from a minor habitat difference between #3348 and other shelf sites. The latter suggestion is likely in view of the fact that Zalesny's sediment maps show that #3348 comes from a small patch of silt on the shelf surface otherwise covered with sand, gravel, or rock. All the deep water sampling sites are in silty sediments. It is not surprising, therefore, that the silt patch on the shelf should yield an assemblage with relatively high similarity to the deep-water reference fauna.

It should be emphasized that Text-figure 7 does not in itself require a bathymetric or sediment interpretation. As Zalesny (1959) points out, many other ecological parameters such as temperature and salinity parallel changes in depth and sediment type. The contoured INDEX OF SIMILARITY only provides a statistical framework for interpretation.

Text-figure 7 can be used also to investigate another aspect of faunal similarity. The null hypothesis of random sprinkling predicts that about 5% of the sites should appear to be significantly similar to the reference fauna and that about 5% should be significantly dissimilar and that these cases of apparent statistical significance should be randomly distributed over the area. In this instance, therefore, 10% or about seven of the 69 assemblages should show statistical significance in one direction or the other. In fact, 21 or 30% show statistical significance and their distribution is obviously non-random over the geographic area. This means that the null hypothesis of random sprinkling can be rejected when considering foraminiferal assemblages of the whole bay. This is not surprising in the Santa Monica Bay case and is certainly substantiated by Zalesny's detailed analysis of the distributions of individual taxa. It illustrates how the method being presented here can be used to explore the question of whether distribution of taxa is purely stochastic or whether it is biased by deterministic biological and/or physical factors. Even though the stochastic model can be rejected easily in this case, the INDEX based on the null hypothesis of random distribution is still a valuable aid to ecological interpretation.

Multivariate analysis was also carried out on the foraminiferal data. Bivariate ordination plots are eminently contourable and follow bathymetry.

Ordovician nautiloid biogeography.-This data set consists of 182 genera of Arenigian nautiloid cephalopods which range from endemics to those found in as many as 20 of 52 sampling areas. The data are taken from a broader study of Ordovician biogeography (Crick, 1978). The location of sampling areas
TEXT-FIG. 9-Multivariate analysis of Arenigian biogeographic data. A plot of the first two principal components (PCI and PCII) separates the major geographic elements of the early Ordovician.
is shown in Text-figure 8 on a reconstruction of Ordovician paleogeography developed by Scotese et al. (1979).

The faunal relationships were measured with the computer program used in the preceding examples. Contouring of the similarity values with respect to Siberia as an arbitrary reference area (Text-fig. 8) shows the expected decrease in similarity away from the reference area. Contouring the same data on a map of modern geography (not shown) reveals substantial anomalies which reflect the differences between modern and Ordovician geography. For example, the Bear Island fauna is significantly similar (at the 95% level) to faunas from Arctic Canada and Scotland but it is not significantly similar to Norway, Sweden, or Estonia. The latter three areas are closest to Bear Island at the present time but were separated (as part of Baltica) from Bear Island in the Ordovician.

Patterns in 2-dimensional ordinations of the principal component axes are not quite as easily interpreted as were comparable plots of Recent echinoid and foraminiferal data. This reflects loss of information about physical environments and a certain amount of geographic uncertainty. However, information on associated faunas and sediments, along with knowledge of tectonic setting, does make the multivariate plots understandable. Text-figure 9 shows a plot of PCI and PCII for the 52 sampling areas. Clusters showing the principal geographic regions (plates) are indicated. Plots including PCIII (not shown) show separation of two important Ordovician facies; the platform facies characterized by shelly faunas and the slope deposits (graptolitic facies). More detail on this aspect is given elsewhere (Crick, 1978).

DISCUSSION

The similarity measure presented here is somewhat cumbersome and expensive because of the simulation technique. The rewards may be worth the extra effort, however. These may be summarized as follows:

1) Distributional data are weighted on the basis of frequency so that widespread taxa do not have a disproportionate influence on measurement of similarity.
2) There is no need to discard taxa on the a priori grounds that they are too widespread or too localized.
3) The similarity or dissimilarity of any two faunas can be tested for statistical significance. Such tests are robust assuming that enough simulations have been run.
4) Because the evaluation of similarity does not presume any particular shape for the probability distribution of expected numbers of taxa in common, the results may be considered precise and not dependent upon generalizations drawn from computed variances of the probability distribution.
5) An entire faunal realm or data set can be investigated for significance of the observed departures from a random sprinkling (stochastic) model of taxon distribution.

The three examples that have been described do not include one in a biostratigraphic context but biostratigraphic applications should be straightforward and follow logically from the biogeographic/ecological cases used here. For example, the probable stratigraphic position of a new fossil assemblage could be assessed by comparing it with a large number of assemblages in a standard (possibly composite) sequence. This could be done in the fashion of the contour maps of Text-figures 4, 7, and 8 except that it would be a one-dimensional instead of a two-dimensional problem. The highest INDEX OF SIMILARITY would be centered on the assemblages in the standard sequence most similar to the new assemblage. This would not demand that a temporal correlation be made at that point, of course, because the similarity might be due to ecological or biogeographic proximity rather than temporal identity. But this is an ever-present problem in biostratigraphy which must be dealt with regardless of the method used to assess similarity. In the biostratigraphic context, tests of statistical significance could be performed in the manner of the echinoid and foraminiferal examples.

ACKNOWLEDGMENTS

This work was supported in part by the Earth Sciences Section, National Science Foundation, NSF Grant DES75-03870. We would also like to thank Richard K. Bambach and Alan H. Cheetham for helpful reviews of the manuscript.

REFERENCES

Cheetham, A. H. and J. E. Hazel. 1969. Binary (presence-absence) similarity coefficients. J. Paleontol. 43:1130-1136.
Crick, R. E. 1978. Ordovician nautiloid biogeography: a probabilistic and multivariate analysis. Ph.D. Dissertation, Univ. Rochester, 166 p.
Ekman, S. 1953. Zoogeography of the Sea. Sedgwick & Jackson Ltd., London, 417 p.
Henderson, R. A. and M. L. Heron. 1977. A probabilistic method of paleobiogeographic analysis. Lethaia 10:1-15.
Mortensen, T. 1928-1951. A Monograph of the Echinoidea. C. A. Reitzel, Copenhagen. 5 vols., 4469 p.
Rohlf, F. J., J. Kishpaugh and D. Kirk. 1971. NT-SYS. Numerical Taxonomy System of Multivariate Statistical Programs. Tech. Rep. State Univ. New York at Stony Brook, New York.
Scotese, C. R., R. K. Bambach, C. Barton, R. Van Der Voo and A. M. Ziegler. 1979. Paleozoic base maps. J. Geol. 87:217-277.
Simberloff, D. S. 1978. Using island biogeographic distributions to determine if colonization is stochastic. Am. Naturalist 112:713-726.
Simpson, G. G. 1943. Mammals and the nature of continents. Am. J. Sci. 241:1-31.
Simpson, G. G. 1947. Holarctic mammalian faunas and continental relationships during the Cenozoic. Geol. Soc. Am. Bull. 58:613-688.
Zalesny, E. R. 1959. Foraminiferal ecology of Santa Monica Bay, California. Micropaleontol. 5:101-126.

MANUSCRIPT RECEIVED FEBRUARY 17, 1979
REVISED MANUSCRIPT RECEIVED APRIL 12, 1979

The Field Museum of Natural History contributed $500 in support of this article.