Sparse versus Dense Datasets - Purdue Agronomy


Dec 14, 2013


Sparse Versus Dense Spatial Data

R.L. (Bob) Nielsen

Professor of Agronomy

Purdue University

West Lafayette, IN 47907



Spatial data & GIS

Spatial data are the
fundamental components of
agricultural GIS.

Growers hope to minimize or
manage spatial yield
variability in order to increase
or maximize profitability.

The causes of yield variability must
therefore be determined, which requires the
acquisition of additional spatial data sets or
‘layers’ of information.

Spatial data sets can be ...

Dense

Many data points
per acre

e.g., grain yield
data sets often
consist of 300 to
600 data points
per acre

Sparse

Fewer data points
per acre

e.g., typical grid
soil sampling
results in an
average of 0.4
data point per
acre

GIS software …

Interpolates or fills in the spatial
'holes' in the data to create pretty
color maps that mysteriously become
the essence of truth for believers.

Dense data sets have fewer 'holes'
per acre than do sparse data sets

Thus, less interpolation is required

Thus, the resulting map is intuitively
more believable
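To make "filling in the holes" concrete, here is a minimal sketch of one common interpolation method, inverse-distance weighting (IDW). The slides do not say which method any particular GIS package uses, so IDW is an assumption chosen for illustration, and the function name, sample coordinates, and organic-matter values are all hypothetical.

```python
import math

def idw_estimate(samples, x, y, power=2.0):
    """Estimate a value at (x, y) by inverse-distance weighting.

    samples: list of (sample_x, sample_y, value) tuples.
    Nearer samples get larger weights (1 / distance**power).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # the point sits exactly on a sample
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical organic-matter readings (%), coordinates in feet.
samples = [(0.0, 0.0, 2.0), (330.0, 0.0, 4.0)]

# Halfway between two samples the estimate is simply their average,
# whatever the soil actually looks like at that spot.
midpoint_estimate = idw_estimate(samples, 165.0, 0.0)
print(midpoint_estimate)
```

The fewer samples there are, the more of the map consists of estimates like `midpoint_estimate` rather than measurements.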

Yield data are dense …

One sec. readings at 3 mph
equal to 1 data point every 4.4 ft

About 600 data points per acre
with a 6-row combine header
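The arithmetic behind these figures can be checked with a short sketch. The header width is an assumption: the slide gives only "6 row", so 30-inch rows (a 15 ft header) are assumed here, and the function names are illustrative.

```python
SQ_FT_PER_ACRE = 43_560
FT_PER_MILE = 5_280

def feet_per_reading(speed_mph, interval_s=1.0):
    """Distance travelled between yield-monitor readings."""
    return speed_mph * FT_PER_MILE / 3600.0 * interval_s

def points_per_acre(speed_mph, header_width_ft, interval_s=1.0):
    """Each reading covers a strip: travel distance x header width."""
    area_per_point = feet_per_reading(speed_mph, interval_s) * header_width_ft
    return SQ_FT_PER_ACRE / area_per_point

# 3 mph with one-second readings: about 4.4 ft between points.
spacing_ft = feet_per_reading(3.0)

# Assumed 6 rows x 30 in = 15 ft header: about 660 points per acre.
density = points_per_acre(3.0, header_width_ft=15.0)
print(round(spacing_ft, 1), round(density))
```

With 36-inch rows (an 18 ft header) the same calculation gives about 550 points per acre, so the slide's 600 is a reasonable round figure.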

Yield maps are believable …

Very little interpolation
required to create yield map.



Soil sample data are sparse …

Typical 2.5 acre sampling grid

Only 0.4 point per acre

Organic matter surface

Interpolated from o.m. values of
2.5 acre soil sample data



Soil surface color from
reclassified aerial IR

Soil o.m. surface map
interpolated from 2.5
acre samples

Mediocre correlation

Half-acre soil sampling

More intense sampling

Five times as many data points as before

Still sparse relative to aerial imagery
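A quick sketch of the densities being compared, assuming one sample per grid cell. The 600 points per acre for yield data is the figure quoted earlier in the slides; the function name is illustrative.

```python
def samples_per_acre(grid_cell_acres):
    """One soil sample per grid cell of the given size."""
    return 1.0 / grid_cell_acres

coarse = samples_per_acre(2.5)   # typical grid: 0.4 point per acre
fine = samples_per_acre(0.5)     # half-acre grid: 2 points per acre

# The half-acre grid has five times as many points as the 2.5-acre grid.
gain = fine / coarse

# Yield data at roughly 600 points per acre is still far denser.
yield_points_per_acre = 600
yield_vs_fine = yield_points_per_acre / fine

print(coarse, fine, round(gain, 1), round(yield_vs_fine))
```

Even after the fivefold increase, yield data carries on the order of 300 times as many points per acre as the half-acre soil grid.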



Soil surface color from
reclassified aerial IR

Soil o.m. surface map
interpolated from half
acre samples

Improved correlation

Consequence of sparse data

Poor interpolation of
spatial variability

Panels: 2.5 ac soil O.M. map; aerial image, reclassified; 0.5 ac soil O.M. map

The challenge …

In order to interpret yield maps
wisely, you will need far more data
layers than just soil nutrient levels
and soil types.

Many factors influence yield!

Acquiring these data will require
forethought, time, timeliness,
attention to detail, and (of course)

The good news

Some of the additional data sets you will
acquire will be dense and, therefore,
satisfactory for creating spatial maps


Soil EC

Aerial photography

Satellite imagery

The bad news

Some of the additional data sets you will
acquire will be sparse data sets, the maps
from which must be taken with the
proverbial ‘grain of salt’.

Soil nutrients

Plant populations

Stand uniformity

Plant height

Insect pressure

Disease pressure

Weed pressure

Soil compaction

Bottom Line:

Data collected by field scouting,
including soil nutrient sampling, are
often too sparse for GIS programs to
accurately interpolate spatial
variability.

Yet, more intensive data collection is
often cost- and time-prohibitive.

Example: Plant Counts in
Late Planted Soybean

Approx. 10 plant
population checks
per acre on a fairly
equal grid basis

292 total data
points on 30 ac.

Cost: Three hikers,
two GPS units, one

Directed sampling

Added another 80
population checks
on the fly as our
eyeballs dictated

372 data points

Cost: Included in
first day’s work

Revisited field, second day

GIS map did not agree
completely with our
eyeballs, so revisited

Added another 54
population checks

Total of 426 data
points on 30 ac.

Cost: Three hikers,
one GPS unit, one
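The running totals across the three rounds of sampling can be tallied with a short sketch. The counts and the 30-acre field size come from the slides; the function and label names are illustrative.

```python
FIELD_ACRES = 30

# (round label, population checks added in that round)
rounds = [
    ("original grid", 292),
    ("directed, on the fly", 80),
    ("revisit, second day", 54),
]

def cumulative_density(rounds, acres):
    """Return (label, running total, points per acre) for each round."""
    results = []
    total = 0
    for label, added in rounds:
        total += added
        results.append((label, total, total / acres))
    return results

for label, total, per_acre in cumulative_density(rounds, FIELD_ACRES):
    print(f"{label}: {total} points, {per_acre:.1f} per acre")
```

The density rises from about 9.7 to about 14.2 checks per acre, which is still far below the hundreds of points per acre in yield data.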

Soy population map

Based on original grid samples
(10 per acre)

< 50k

50 to 100k

100 to 150k

150 to 200k

> 200k

Original data

Original data plus
directed samples
on the fly

Including revisit

Minor, but potentially
useful improvements

Did add’nl
sampling help?



Our map of populations
(17 June)

Green vegetation index
(NDVI) from IR aerial
image (8 July)

Not perfect, but acceptable


Sample as densely as time and
money will allow.

From the perspective of crop scouting
or monitoring, you can never have
too much data!

Remember, you rarely have a visual
idea of what the true spatial pattern
is.

So, sometimes directed sampling is
not feasible.


Sample in as much of an equidistant
pattern as is logistically possible.

Better for GIS software, easier on the
person in the field.

Begin with a grid pattern, modify with
additional directed sampling as
suggested by other data layers or
your own eyes.
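As a starting point for such a grid, the sketch below lays out equidistant sample points for a rectangular field. The rectangular shape, the 1320 ft by 990 ft dimensions (exactly 30 acres), and the function name are assumptions for illustration; real fields are rarely this tidy, and directed samples would be added to the list by hand.

```python
import math

SQ_FT_PER_ACRE = 43_560

def grid_sample_points(width_ft, length_ft, samples_per_acre):
    """Lay out sample points at the centers of square grid cells.

    Returns the grid spacing in feet and a list of (x, y) points.
    """
    spacing = math.sqrt(SQ_FT_PER_ACRE / samples_per_acre)
    points = []
    y = spacing / 2
    while y < length_ft:
        x = spacing / 2
        while x < width_ft:
            points.append((x, y))
            x += spacing
        y += spacing
    return spacing, points

# Hypothetical 30-acre rectangle sampled at 10 points per acre.
spacing, points = grid_sample_points(1320.0, 990.0, samples_per_acre=10)
print(f"{spacing:.0f} ft between points, {len(points)} points")
```

At 10 points per acre the walking interval works out to 66 ft, which is easy to pace off in the field.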

Thanks for your attention!

Farming is a
gamble, so let’s
practice ….

Pick a card and
concentrate on it.

I will make your card disappear!

Did you

concentrate hard?

I believe your card is missing!