Declustering and Debiasing: January 2007

See discussions, stats, and author profiles for this publication at: https://www.researchgate.
net/publication/228427948
Declustering and debiasing
Article · January 2007
CITATIONS READS
21 2,706
2 authors:
Michael J. Pyrcz Clayton Deutsch

University of Texas at Austin University of Alberta
78 PUBLICATIONS 1,115 CITATIONS 342 PUBLICATIONS 13,211 CITATIONS
SEE PROFILE SEE PROFILE
All content following this page was uploaded by Michael J. Pyrcz on 09 March 2015.
The user has requested enhancement of the downloaded file.

Declustering and Debiasing
M. J. Pyrcz ([email protected]) and C. V. Deutsch ([email protected])

University of Alberta, Edmonton, Alberta, CANADA
Abstract
Strategic project decisions are based on the distributions global variables, for example, total mineable
resource, or recoverable oil volume. These global variables distributions are very sensitive to rock type
proportions and the histograms of continuous variables. Representivity of the input one point statistics is
important in all spatial models.
The process of assembling representative one point statistics is complicated by sample clustering and
spatial bias in the data locations. Explanation is provided on the source of nonrepresentative sampling
and the need for declustering and debiasing. This work addresses some key implementation details on
declustering. Standard declustering is not always able to correct for sampling spatial bias. Two methods
for correcting bias in the one point statistics: “trend modeling for debiasing” and “debiasing by
qualitative data” are reviewed and demonstrated with a poly metallic data set.
Introduction
Great computational effort is exerted to build realistic geostatistical simulation models. The goodness of
these models is judged by their ability to reproduce input one-point statistics and continuity structures.
Geostatistical techniques slavishly reproduce input lithofacies proportions and the histogram of
petrophysical properties (Gringarten et al., pg. 1, 2000). There is very little intrinsic declustering or
debiasing within geostatistical simulation algorithms. Geostatistical simulation always weights the input
distribution. Gaussian simulation in particular ensures that the input distribution is approximately
reproduced. Clustered sampling that misrepresents the proportions within specific bins, or spatial bias
sampling that does not characterize the full range of the distribution, in the input distributions must be dealt
with explicitly.
The importance of representative input distributions must be evaluated with respect to the sensitivity of
the response variable to clustering or bias in the input statistics. Simulated models are only an intermediate
result. Management decisions focus on the results after the application of a transfer function.
Declustering methods are commonly applied in an automated fashion. This work addresses the
properties of a variety of declustering algorithms and methods of improving application. Issues such as
working with anisotropy, within facies and with multiple variables are addressed. It is essential to
understand the applicability and limitations of declustering algorithms since blindly applying declustering
may be worse than working with the naïve statistics.
Declustering is ineffective in cases with spatial bias. Debiasing tools such as “trend modeling for
debiasing” and “debiasing by qualitative data” should be brought into common practice for the purpose of
improving the inference of the one point statistic.
Nonrepresentative Sampling
It is natural that spatial data are collected in a nonrepresentative manner. Preferential sampling in
interesting areas is intentional and facilitated by geologic intuition, analogue data and previous samples.
This practice of collecting clustered or spatially biased spatial samples is encouraged by technical and
economical constraints, such as future production goals, accessibility and the costs of laboratory work.
The cost of uncertainty is not the same everywhere in the area of interest. For example, the cost of
uncertainty within a high grade region is much higher than the cost of uncertainty within clearly waste
material. Good delineation and a high level of certainty within the high grade materials allows for accurate
reserves estimation and optimum mine planning.
1
Future production goals may also encourage clustered or spatially biased sampling. It is common to
start mining in high grade regions. In this case it is desirable to delineate and characterize the high grade
regions.
Practical issues of accessibility can also cause spatially biased sampling. For example, the drilling
depth or available drilling stations may constrain sample selection. In the presence of a vertical trend,
limited depth of drilling may result in a subset of the underlying distribution not being sampled. There are
many possible scenarios under which accessibility would be a concern (see Figure 1).
Nonrepresentivity may also be introduced at the assaying stage. For example, when removing sections
of core for the purpose of permeability measurement, it is unlikely that a section of shale would be
subjected to expensive testing. Likewise, barren rock may not be sent for assays.
Conventional Statistics
Conventional statistics do not provide reasonable solutions to the problem of constructing representative
spatial distributions. A simple random sample from the population of interest would be representative, but
inappropriate in most cases. A sample is said to be representative when each unit of the population has the
same probability of being sampled. In conventional statistics this is accomplished by avoiding preferential
sampling or opportunity sampling. As explained above, there are many reasons that geologic samples are
collected in a biased manner.
Regular or random stratified sampling may be able to provide a good approximation of a representative
distribution. Sampling on a regular grid is rarely practical for the same accessibility and economic reasons
stated above. Regular sampling grids may be applied in preliminary resource investigation. These
sampling campaigns are often augmented by nonsystematic infill drilling. One approach would be to omit
the clustered infill samples for the purpose of building distributions. While this would more closely agree
with conventional statistical theory, throwing away expensive information is not very satisfying (Isaaks and
Srivastava, pg. 237-238, 1997).
Declustering
Declustering is well documented and widely applied (Deutsch, pg. 53-62, 2001; Isaaks and Srivastava, pg.
237 – 248, 1997; Goovaerts, pg. 77-82, 1997). There are various types of declustering methods, such as
cell, polygonal and kriging weight declustering. These methods rely on the weighting of the sample data,
in order to account for spatial representivity. Figure 2 shows the effect of weighting the histogram. Note
that weighting does not change the values: only the influence of each sample is changed.
There are two important assumptions in all declustering techniques: (1) the entire range of the true
distribution has been sampled or the data is not spatially biased, and (2) the nature of the clustering is
understood. Declustering may not perform well without these assumptions. The first assumption is
required since the weighting only adjusts the influence of each sample on the distribution and does not
change the actual sample value. Figure 3 shows an example where declustering could not work; there are
no low samples to give more weight to.
The second assumption is that the nature of the clustering is understood. If the data have no spatial
correlation, there would be no reason to apply declustering. Each sample, regardless of location, would be
a random drawing from the underlying distribution. Without an understanding of the spatial nature of the
data, declustering may be incorrectly applied.
There are a variety of methods to calculate the declustering weights. Polygonal, cell and kriging weight
methods will be discussed.
Polygonal Declustering
Polygonal declustering is commonly applied in other scientific disciplines, such as hydrology, for the
purpose of correcting for clustering in spatial data. The method is flexible and straightforward. The
polygonal declustering technique is based on the construction of polygons of influence about each of the
2
sample data. These polygons of influence are described by all midpoints between each neighbouring
sample data. Simple example data set with polygons of influence is shown in Figure 4.
For each polygon of influence the area is calculated and the weight assigned to each sample is
calculated as the proportion of polygon area to the entire area of interest (same as the sum of all polygon
areas).
area j
w' j = n
⋅n
∑ area
j =1
j
where n is the number of data.
Polygonal Declustering Sensitivity to Boundary
The area associated to peripheral samples is very sensitive to the boundary location. If the boundary is
located far from the data, then the peripheral samples will receive a large amount of weight, since the area
of their polygons of influence increase.
In general this great sensitivity to the boundary is perceived as weakness in polygonal declustering. A
common technique is to simply apply the boundary of the area of interest. This may be defined by geology,
leases etc. This approach may be reasonable depending on the problem setting. The second technique is to
assign a maximum distance of influence to the samples.
Polygonal Declustering Approximation
The application of polygonal declustering to a 3D data set requires the calculation of complicated solid
boundaries and volumes, which is computationally expensive. A close approximation can be rapidly be
calculated by discretizing the area of interest into a fine grid and assigning each node to the nearest data.
Polygonal Declustering and Anisotropy
Directional weights may be applied to the polygonal declustering algorithm in order to account for
anisotropy. This is demonstrated in Figure 5 for anisotropy ratios of 1, 2, 5, and 10:1.
Cell Declustering
The cell declustering technique is the most common method applied in Geostatistics. It is insensitive to the
boundary locations and for this reason is seen as more robust than polygonal declustering. A cell
declustering algorithm, DECLUS is standard in GSLIB.
Cell declustering applies the following steps:
• for each origin offset

1. overlay a regular grid over the data
2. assign an intermediate weight of the reciprocal of the number of data in the cell
3. standardize the weights by dividing by the number of cells with data
4. shift the grid
• average resulting weights over each origin offset
The method is demonstrated in Figure 6 for one origin offset.
3
For a given cell the weight of each sample is calculated as follows,
1
ni
w' j = ⋅n
number of cells with data
where ni is the number of samples in the cell in which sample j is located and n is the total number cells
with samples.
Cell Declustering and Cell Size
The weights assigned by cell declustering are sensitive to the cell size. If the cell size is set as very small
then every sample occupies its own cell and the result is equal weighting or the naïve sample distribution.
If the cell size is set as very large then all samples reside in the same cell and the result is once again equal
weighting.
A specific cell size will result in an unique set of weights. The question is, ‘which cell size identifies
the best weights?’. If there is a coarse grid with additional infill sampling then the coarsest sample spacing
is the best cell size (see Figure 7)
If this configuration is not present then a common procedure is to assign the cell size, which maximizes
or minimizes the declustered mean. This is demonstrated in Figure 8. This procedure is applied when the
samples are clearly clustered in low or high values (apply the cell size which renders the maximum or
minimum declustered mean respectively). The results are only accurate when there is a clear minimum or
maximum. One should not blindly assign the minimizing or maximizing cell size. It is shown in the next
section that such an assignment may results in poorer results than the naïve distribution in expected value.
Cell Declustering with the Minimizing and Maximizing Cell Size Warning
A large number of runs were carried out to determine whether in expected terms the minimum or maximum
result in an acceptable declustered mean (a declustered mean which closely approximates the true
underlying mean). 101 realizations were generated by sequential Gaussian simulation of a 50 x 50 space.
The continuity range was selected as small (range of 10) with respect to the simulation size so that the
realization mean and variance were consistently near 0.0 and 1.0 respectively (minimal ergodic
fluctuations). Regular samples where taken at 10 unit spacing and then infill samples were taken around a
specific percentile data value. By changing the specific percentile the level of clustering was changed.
The expected true mean, sample mean, 9 cell size, 10 cell size and minimum or maximum cell size were
compared for each percentile (see Figure 9). For cases with a percentile near the median (low magnitude of
clustering), application of the minimizing or maximizing cell size resulted in poorer results than the naïve
sample mean. The application of the coarsest regular sample spacing resulted in a best declustered mean in
expected terms. This confirms that there is a problem, in expected terms, with systematically applying the
cell size with min/max declustered mean, and that knowledge of the appropriate cell size provides better
results in expected terms.
Cell Declustering and Origin Offsets
Cell Declustering weights are also sensitive to the grid origin. To remove this sensitivity the procedure is
repeated with a specified number of origin offsets and the results are averaged.
The application of origin offsets has the following practical impacts:
1. smoothes out steps that would occur in the declustered mean and cell size relationship.
2. makes large cells sizes (> ½ data set size) unreliable
These effects are demonstrated by an exercise. Cell declustering was applied to the simple data set with
a variety of origin offsets. Figure 10 shows the smoothing effect of the application of origin offsets. The
greater the number of offsets the smoother the relationship between declustered mean and cell size. Also, it
4
can be seen that offsets cause the results to be unreliable when large cell sizes are applied. The data set
dimensions are 10x5 units. It would be expected that at a cell size of 10 that the declustered mean would
be equal to the naïve mean. With offsets this does not occur. The cause is demonstrated in Figure 11.
This reinforces the rule of thumb:
• Calculate the declustered mean for cells sizes 5% to 50% of the size of the area of interest and
apply the minimum number of offsets required to get a reasonably continuous relationship (around
5 is usually sufficient).
Cell Declustering with Multiple Variables
The common practice is to apply cell declustering to the primary variable and apply these weights to all
other collocated variables. This is intuitive since clustering should be related only to the data locations, not
the data values. If the cell size is chosen based on the declustered mean and cell size relationship, this
practice may be questioned since the declustered mean is dependent on the data values. Will the
maximizing or minimizing cell size be the same for each variable?
To explore this issue an exhaustive data set was generated with three collocated and correlated standard
normal variables. Table 1 below lists the properties of each variable.
Variogram Range Correlation to Var. 1

Variable 1 10 -
Variable 2 20 0.7
Variable 3 5 -0.8
Table 1 – Properties of the three correlated variables.
Samples (50) were drawn from the exhaustive data sets, and the sample variograms and correlations were
checked (see Figure 12). The sampling scheme was based on coarse grid (20 unit spacing) with some infill
clusters and random samples. The relationships of declustered mean vs. cell size are shown in Figure 13.
For all three variables there is a clear maximum or minimum at the same cell size. This has occurred
despite different variograms, and correlations. This exercise has supported the current practice of applying
the same cell declustering weights to all collocated variables.
Kriging Weight Declustering

Another technique applied for calculating declustering weights is to perform a kriging of the area of interest
and to sum the weights applied to each conditioning data and then to standardize these sums. This is
analogous to block kriging the area of interest.
This technique is similar to polygonal declustering in that it is sensitive to the boundaries. In addition,
this method is sensitive to search parameters. This method also integrates information on the spatial
continuity of the population, through the variogram. Since kriging weight declustering is a discretized
approximation method, the approximation improves as resolution of the kriged grid increases.
Disadvantages of Kriging Weight Declustering
In general, this approach is much more computationally intensive than the polygonal and cell declustering
techniques. Also, there may be artifacts in the weights due to the string effect. This string effect is
illustrated in Figure 14.
The conditioning data at the extents of the string receive greater weight. For the peripheral data the
weighting is greater, even at locations much closer to other data. This is caused by the implicit assumption
of kriging that area of interest is imbedded in an infinite domain.
5
Kriging Weight Declustering and Negative Weights
It is appropriate to include negative weights when calculating the sum of weights. It is possible that this
would result in a negative declustering weight. One way in which this would be realized would be if the
conditioning data is outside the area of interest and it is screened. In general, the conditioning data are
within the area of influence and a negative declustering weight would not occur.
Declustering and Anisotropy

Cell, polygonal and kriging weight declustering may be configured to account for anisotropy in the data.
An aspect ratio may be applied to the cells, the distance measure may be weighted in polygonal
declustering and anisotropic variograms may be applied in kriging weight declustering.
An example cross section is shown in Figure 15. There is a large horizontal to vertical anisotropy
continuity ratio. Wells with sample locations are added to the cross section (see Figure 16). Isotropic
declustering would assign similar weights to the samples in all three drill holes, since they are all nearly
equally spaced. Yet our knowledge of the spatial continuity tells us that the samples in drill hole 2 are
more redundant. Also, without accounting for anisotropy sample A in Figure 17 would receive greater
weight. Considering the high level of anisotropy this would not be correct. These cases can be addressed
by applying anisotropy to the declustering method.
Declustering and the Boundary

The cell technique is unique in that it does not consider the boundary. This is indicated as a strength, since
this results in a more robust assignment of declustering weights. While this is true, this may results in some
pathological problems.
It may not be reasonable to ignore the boundary in assigning weights to the example data set (see Figure
18). Clearly although the data is equally spaced it is not equally representative of the area of interest.
Figure 18 illustrates the differences between cell, polygonal and kriging weight declustering in assigning
declustering weights.
Cell declustering assigns equal weights to the data since it does not consider the boundary, while
polygonal and kriging weight declustering account for the boundary location. Also, kriging weight
declustering shows some artifacts due to screening in this case. Note that the data in the second row from
the bottom receive less weight due to screening.
Declustering Within Rock Types

A common procedure is to model each rock types properties separately and then to simulate continuous
variables (such as grade) within each rock type. A simple example is shown in Figure 19. This cross
section indicates two separate tabular facies sampled by two complete drill holes.
In this setting representative distributions of each rock type would be constructed by separating area of
interest into distinct rock types. Then weights would be applied to the naïve distributions for each rock
type. Figure 20 illustrates the separation of the area of interest into separate rock types.
In this setting the boundary is very important. This would disqualify the application of cell
declustering. When there are strings of data as shown in Figure 20 then the kriging weight declustering
method suffers from increasing events of the string effect. In this example the data nearest the contact
layers would receive disproportionate weight. Polygonal declustering is a good method to apply since
sensitivity to the boundary is required, and the polygonal method would not suffer from any artifacts.
Polygonal Declustering Within Rock Types
To illustrate the application of polygonal declustering with a rock type model, a synthetic example was
constructed. A random 2D data set was constructed with a uniform distribution in x and y and a normal
Gaussian property. The rock type model was constructed by smoothing an unconditional sequential
6
indicator simulation with 4 categories (see Figure 21). Then conventional polygonal declustering was
applied to the data set irrespective of the rock type (see of Figure 22} left for a map of declustered weights
and the polygons of influence). Also, polygonal declustering was performed constrained by the rock types
(see right of Figure 22) for the declustered weights and the polygons of influence. Considering the rock
types significantly improves the declustering weights.
When the entire range of the distribution is not sampled then it is necessary to apply debiasing
techniques.
Debiasing
There are two methodologies that may be applied to correct biased samples. The first method, “trend
modeling for debiasing”, separates the variable into a trend and residual. The second approach, “debiasing
by qualitative data”, corrects the distribution with a representative secondary data distribution and
calibration relationship to the primary variable of interest.
In the presence of a clear and persistent trend, trend modeling may be applied to ensure that the correct
distribution is reproduced. Trend modeling is well established (Goovaerts, pg. 126, 1997; Deutsch, pg.
182, 2001). The steps are as follows: (1) remove an appropriate trend model, (2) stochastically model
residuals, and (3) replace trend a posteriori. The resulting models reproduce the trend. An advantage of this
technique is that the simulation step may be simplified since cosimulation is not necessarily required.
While this technique will often debias the distribution, there is no direct control over the resulting
distribution. The result should be checked. Care should also be taken to build an appropriate trend model.
This requires that the mean of the residuals is close to 0.0 and the correlation between the trend and
residual is close to 0.0.
Another technique is to use soft data that is representative of the entire area of interest, and an
understanding of the relationship between the primary and soft secondary data to correct the primary
distribution (Deutsch et al., 1999). Then, this corrected distribution is applied as a reference distribution to
the subsequent simulation of the primary variable. The underlying relationship between the primary and
secondary data may be assessed from geologic modeling or the physics of the setting. This relationship
may not be directly observed due to a lack of data; nevertheless, a relationship between the secondary and
primary data, fˆx, y ( x, y) must be inferred for debiasing (see Figure 23).
The construction of the bivariate calibration is the difficult component of debiasing. There are a variety
of techniques for building this distribution. For example, the program SDDECLUS by Deutsch relies on
the user submitting data pairs which describe the bivariate relationship. This approach allows for the
greatest flexibility, since there is no constraint on the form of the bivariate calibration. For each paired
primary data a weight is assigned based on the secondary distribution
Another method is to calculate a series of conditional distributions of the primary given the secondary
data, fprimary|secondary, over the range of observed secondary value. This can be extrapolated over the range
of all secondary data by a trend. This is illustrated in Figure 24. The primary distribution is then calculated
by scaling the binned bivariate calibration by the secondary distribution. For the above bivariate
calibration this is illustrated in Figure 24. This is a discrete approximation of the solution of the secondary
distribution as expressed in Equation 1.
f y ( y ) = ∫ f y| x ( y | x ) ⋅ f x ( x )dx (1)
x
The trend method indirectly corrects the global distribution. This leads to models with precise trend
reproduction and indirect control over the distribution. The qualitative method focuses on directly
correcting the global distribution and retaining consistency by applying the secondary data as collocated
data in the simulation. The result is direct control over the reproduced distribution and indirect control over
trend reproduction.
The two techniques, also differ in the information which is integrated into the numerical model. In the
first method the simulation is augmented by information concerning the spatial behavior of the primary
variable. The second method relies on information concerning a more representative secondary data and
the relationship between the primary and secondary data. The information available may limit the ability to
7
apply either method. The method chosen also affects the resulting model of uncertainty. Each will
potentially decrease the overall model uncertainty. This is expected since each option involves adding
conditioning to the numerical model.
Debiasing Example
A realistic data set based on a 2D poly metallic “red” vein is presented. This sample set was gathered by
drilling. Some data were removed for checking and illustrating the method. For the sake of comparison, an
approximation of the true gold distribution was constructed by applying polygonal declustering to the
complete “red” data set (see Figure 26). In the complete data set the entire area of interest is well
delineated (see left of Figure 27 for location map of the complete data set) and polygonal declustering
results in a reasonable distribution. Since the true underlying distribution is not available, this distribution
will be assumed to be a good approximation of the underlying distribution.
There is a significant positive correlation between the gold grade and the thickness of the vein
(correlation coefficient = 0.6), so it was decided to apply gold as the primary variable and a smooth kriged
thickness map as the representative secondary data, see Figure 27.
Polygonal declustering was applied to the reduced data set. The resulting declustered distribution and
the voronoi polygons are shown in Figure 28. There is a great difference between the underlying mean
gold grade (0.69 g/t) and the declustered mean gold grade (1.25 g/t) and distributions do not have the same
shape. There is additional information that could aid in the inference of the correct distribution, such as
thickness that has a significant correlation to the primary variable, gold. There is also a clear trend in the
gold grades. This analogue information improves the distribution.
Debiasing by Qualitative Data
Debiasing by qualitative data with a bivariate trend was applied to correct the gold distribution. The
results are shown in Figure 29. The bivariate trend was set as a second order function with a linear segment
for gold grades greater than 5.0 g/t. This density calibration table was weighted by the thickness
(secondary) distribution, and the resulting corrected gold (primary) distribution is shown on the left of
Figure 29. Any estimated negative grades were set to zero. Sequential Gaussian simulation (SGSIM) was
performed with the debiased distribution as a reference distribution and the thickness map as a collocated
secondary data. A correlation coefficient of 0.72, was calculated from the density calibration table, was
applied to the secondary data. An omni-directional variogram with a nugget effect of 0.4 and an isotropic
spherical structure with a range of 140 units was used to represent the gold spatial continuity. No effort
was made to calculate and model a more specific variogram model since variogram inference is not a focus
of this work. Three realizations are shown in Figure 30.
The strong correlation between the primary data and the collocated secondary data has resulted in a
clear trend in the realizations. Some example simulated distributions are shown in Figure 31 and the
distribution of the realization means for 100 realizations are shown in Figure 32. The average of the
realization means is 0.84, which is higher than the average of the reference distribution (see Figure 26).
Nevertheless, the resulting distribution is closer to the reference true distribution in shape and statistics than
the declustering results.
Trend Modeling for Debiasing
Trend modeling for debiasing was also applied. A trend model was constructed from a moving window
average of all the gold samples in the complete data set. This model was scaled such that the mean of the
residuals was near 0. The gold samples, gold trend model and distribution of the residuals are shown in
Figure 33. Sequential Gaussian simulation was performed with the residuals and the trend model was
added a posteriori. Any negative estimates were set to 0. Three example realizations are shown in Figure
34. The trend is consistently reproduced in each realization. Some realization distributions are shown in
Figure 35 and the distribution of the realization means for 100 realizations are shown in Figure 36. The
mean of the realization means is 0.90. The resulting distributions are closer to the approximate true
distribution in shape and mean than the declustering results.
8
Conclusions
Nonrepresentative sampling is unavoidable in most geologic settings. Declustering techniques are widely
used and are generally effective for correcting for nonrepresentative data. It is important to understand the
appropriate methods and settings for the application of declustering. In settings where the underlying
distribution has not been adequately sampled, declustering may not be adequate and debiasing is required.
Debiasing relies on analogue information such as a trend in the primary variable or a well sampled
secondary variable and a calibration. Two debiasing methods, trend modeling for debiasing and debiasing
by qualitative data, have been demonstrated with a mining data set.
Acknowledgements
We are grateful to the industry sponsors of the Centre for Computational Geostatistics at the University of
Alberta and to NSERC and ICORE for supporting this research. Also, we would like to acknowledge
Julian Ortiz who contributed to work on declustering.
References
Deutsch, C.V. and A.G. Journel. 1998. GSLIB: Geostatistical Software Library: and User’s Guide,
2nd Ed. New York: Oxford University Press.
Deutsch, C.V., P. Frykman, and Y.L. Xie, 1999. Declustering with Seismic or “soft” Geologic
Data, Centre for Computational Geostatistics Report One 1998/1999, University of Alberta.
Deutsch, C. V., November 2001. Geostatistical Reservoir Modeling, in final stages of production at
Oxford University Press,
Goovaerts, P., 1997. Geostatistics for Natural Resources Evaluation. New York: Oxford University
Press.
Gringarten, E., P. Frykman, and C.V. Deutsch. December 3-6, 2000. Determination of Reliable
Histogram and Variogram Parameters for Geostatistical Modeling, AAPG Hedberg Symposium,
"Applied Reservoir Characterization Using Geostatistics", The Woodlands, Texas.
Isaaks, E. H. and R. M. Srivastava. 1989. An Introduction to Applied Geostatistics, New York:
Oxford University Press.
9
Figure 1 - Some examples of accessibility constraints illustrated on a cross section.
Figure 2 - The influence of weighting on a distribution. On the right the naïve distribution (dotted
line) is superimposed on the declustered distribution with the weights indicated.
Figure 3 An example underlying distribution (bold line) and the sample distribution (histogram).
The entire range of the true distribution has not been sampled.
10
Figure 4 - The polygon of influence.
Figure 5 – The effect of anisotropy on the polygons of influence. The horizontal distance was
weighted by factors 1, 2, 5, and 10.
11
Figure 6 – A simple illustration of the cell declustering technique.
Figure 7 – Optimum Cell Size Assignment
Figure 8 – The Relationship between Declustered Mean and Cell Size For the Simple Data Set
12
Figure 9 – A chart indicating the expected true, sample and declustered means (for cell size 9, 10 and
minimizing or maximizing), vs. the percentile of clustering. The consistent application of the coarsest
sample spacing as cell size results in a declustered mean closer to the true mean in expected terms.
Declustered Mean vs. Cell Size
6.8
Origin
Declustered Mean
6.6 Offsets
6.4 1
6.2 5
6 10
5.8
5.6
0 2 4 6 8 10
Cell Size
Figure 10 – The declustered mean vs. cell size relationship with a variety of origin offsets. Note that
the offsets smooth out steps in the declustered mean and cell size relationship. Also, with origin
offsets the declustered mean does not approach the naïve sample mean as the cell size becomes large.
13
Figure 11 – The effect of offsets with large cell sizes. If the origin is shifted the data may not all
reside within a single cell, instead the data is divided into 4 cells. Thus, the declustered mean is not
the naïve sample mean.
Figure 12 – For each data set, the exhaustive data set with the samples indicated, the variogram
model and experimental variogram of the samples and the scatter plot of the samples with the
variable 1 samples are shown.
14
Figure 13 – The declustered mean vs. cell size for the three variables.
15
Figure 14 – A string of data, with weights from the kriging weight declustering method indicated,
superimposed on the maps of the weights assigned to each data at all locations. The string effect
causes the outer data to receive greater weight (see maps for Data 1 and 6).
Figure 15 – An example cross section with strong anisotropy.
16
Figure 16 – An example cross section with wells and sample locations.
Figure 17 - An example cross section with wells and sample locations.
17
Figure 18 – An illustration of the difference in weight assignments due to a boundary for cell,
polygonal and kriging weight declustering. Cell declustering would equally weight the data, while
polygonal would larger weight to the data near the unsampled area. Kriging weight declustering
would result in weights similar to polygonal, but subject to screening and string effects.
Figure 19 – An example rock type model.
18
Figure 20 – An example model broken up into separate rock types.
Figure 21 – An example data set and facies model.
Figure 22 – Polygonal declustering weights and polygons, with and without facies.
19
Figure 23 - The calibration bivariate distribution, fˆx, y ( x, y) , and known marginal distribution of the
soft data variable, f x (x) .
Figure 24 - Calibration by bivariate trend. The points indicate the known primary and secondary
data. The arrow indicates a linear bivariate trend. The lines represent probability contours.
20
Figure 25 An illustration of the numerical integration of the conditional distribution along the
previously indicated linear bivariate trend.
Figure 26 – An approximation of the underlying distribution calculated by polygonal declustering of

the complete “red” data set.
21
Figure 27 – The original red.dat database (on the left) and the modified data base with kriged
thickness map.
Figure 28 – The resulting distribution from polygonal declustering of the modified red.dat data set
and a location map of the data set, with the associated voursior polygons.
22
Figure 29 – The density calibration table with collocated thickness data, the thickness distribution,
original gold distribution and the corrected gold distribution
Figure 30 – Three realizations of gold grade using the debiased distribution and collocated thickness
(secondary) data.
23
Figure 31 – The histogram of one realization and the cumulative distribution of 20 realizations.
Figure 32 – The histogram of realization means for 100 realizations.
Figure 33 – The reduced “red” data set with a gold trend, and the distribution of the residuals at the
data locations.
24
Figure 34 – Three realizations of gold grade resulting from addition of a stochastic residual and a
deterministic trend model.
Figure 35 – The histogram of one realization and the cumulative distribution of 20 realizations.
Figure 36 – The histogram of realization means for 100 realizations.
25
View publication stats

Declustering and Debiasing: January 2007

Uploaded by

Copyright:

Available Formats

Declustering and Debiasing: January 2007

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Declustering and Debiasing: January 2007

Uploaded by

Copyright:

Available Formats

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

Declustering and debiasing

Article · January 2007

Michael J. Pyrcz Clayton Deutsch

SEE PROFILE SEE PROFILE

The user has requested enhancement of the downloaded file.

M. J. Pyrcz ([email protected]) and C. V. Deutsch ([email protected])

where n is the number of data.

Polygonal Declustering Sensitivity to Boundary

Polygonal Declustering Approximation

Polygonal Declustering and Anisotropy

Cell declustering applies the following steps:

• for each origin offset

The method is demonstrated in Figure 6 for one origin offset.

Cell Declustering and Cell Size

Cell Declustering and Origin Offsets

The application of origin offsets has the following practical impacts:

This reinforces the rule of thumb:

Cell Declustering with Multiple Variables

Variogram Range Correlation to Var. 1

Table 1 – Properties of the three correlated variables.

Kriging Weight Declustering

Disadvantages of Kriging Weight Declustering

Declustering and Anisotropy

Declustering and the Boundary

Declustering Within Rock Types

Polygonal Declustering Within Rock Types

Debiasing by Qualitative Data

Trend Modeling for Debiasing

Figure 7 – Optimum Cell Size Assignment

Declustered Mean vs. Cell Size

Figure 15 – An example cross section with strong anisotropy.

Figure 17 - An example cross section with wells and sample locations.

Figure 19 – An example rock type model.

Figure 21 – An example data set and facies model.

Figure 26 – An approximation of the underlying distribution calculated by polygonal declustering of

Figure 32 – The histogram of realization means for 100 realizations.

Figure 36 – The histogram of realization means for 100 realizations.

View publication stats

You might also like