2.0 - Data Interpretation

Created by PCT Agcloud Support, Modified on Wed, 27 Dec, 2023 at 4:23 AM by PCT Agcloud Support

HISTOGRAMS AND COLOUR TABLES

The ability to create maps that display spatial variability is fundamental to precision agriculture. Maps provide a summary of the data and help visualise the spatial variability within a field. All mapped data should be graphically summarised with a histogram (frequency table that displays the distribution of data and shows the colours represented within the map).

A histogram is a valuable tool that highlights skew and other characteristics of the data that may not be obvious when looking at the raw data or map layer. The histogram also helps in creating a colour chart that is meaningful for the data. Adjusting the colour chart used to display map data may reveal a difference that was not highlighted by a previous colour chart.

There is no standardised colour scheme used to display spatial data in PA. Colour schemes vary from one data provider or software manufacturer to the next. It is therefore important to look at the range of values associated with the colours in a chart, as each colour will not always reflect the same value on a different map. All examples in these tutorials are based on a red-to-blue colour scheme where red represents the minimum data value and blue represents the maximum data value.
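To illustrate why the same colour can mean different values on different maps, here is a minimal sketch of a red-to-blue colour ramp. The function name `value_to_colour` and the straight-line blend between red and blue are assumptions made for illustration only; real GIS packages normally use richer ramps with intermediate colours.

```python
def value_to_colour(value, vmin, vmax):
    """Map a data value to an (R, G, B) tuple on a red-to-blue ramp.

    Red (255, 0, 0) is the minimum and blue (0, 0, 255) is the maximum.
    The linear blend is an illustrative assumption, not a standard.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Clamp so values outside the legend range take the end colours.
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    return (round(255 * (1 - t)), 0, round(255 * t))


# The same value gets a different colour when the legend range changes.
colour_on_map_a = value_to_colour(10, vmin=2, vmax=25)
colour_on_map_b = value_to_colour(10, vmin=0, vmax=100)
print(colour_on_map_a, colour_on_map_b)
```

Because the colour depends on the legend range as well as the value, two maps can only be compared by eye once their value ranges have been checked.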

The example below left is an EM map with EM values ranging from 2 to 25. In the accompanying histogram, the height of each colour bar indicates the total area within the field corresponding to each EM value. It is imperative that map layers are as accurate as possible if they are to be relied upon when making management decisions.
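The area-per-value histogram described above can be sketched as follows. This is a rough illustration that assumes the map is a grid of equally sized cells; the function name `area_histogram`, the cell area, the bin width and the EM readings are all made up for the example.

```python
def area_histogram(values, cell_area_ha, bin_width):
    """Return {bin_lower_edge: area_ha}, assuming equally sized grid cells."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    areas = {}
    for v in values:
        # Lower edge of the bin that contains v.
        edge = (v // bin_width) * bin_width
        areas[edge] = areas.get(edge, 0.0) + cell_area_ha
    return dict(sorted(areas.items()))


# Made-up EM readings, one per grid cell, spanning the 2 to 25 range.
em_values = [2, 3, 3, 7, 8, 12, 12, 13, 24, 25]
bin_width = 5
hist = area_histogram(em_values, cell_area_ha=0.01, bin_width=bin_width)
for edge, area in hist.items():
    print(f"{edge:>3} to {edge + bin_width:<3} {area:.2f} ha")
```

Each bin would then be drawn as one bar, coloured with the same colour the map uses for that range of values.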

Some spatial data sets, such as yield, require processing to remove any errors and outliers from the data set. GIS packages which have customised imports for yield data will generally offer some level of filtering to correct the data.

In the example below, raw yield is on the left and cleaned yield data is on the right ready for surfacing.

However, for optimum results, professional data processing is recommended. For more information on yield data processing refer to Yield Mapping on page 24.
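As a rough illustration of the kind of basic filtering mentioned above, the sketch below drops non-positive yield readings and readings far from the mean. The field names (`x`, `y`, `yield_t_ha`), the standard-deviation threshold and the sample values are assumptions for the example; this is not the method used by any particular GIS package, and it is no substitute for professional processing.

```python
import statistics


def filter_yield(points, n_sd=3.0):
    """Drop points with non-positive yield, then drop points more than
    n_sd standard deviations from the mean of the remaining values."""
    positive = [p for p in points if p["yield_t_ha"] > 0]
    yields = [p["yield_t_ha"] for p in positive]
    mean = statistics.mean(yields)
    sd = statistics.pstdev(yields)  # population SD, an assumption
    low, high = mean - n_sd * sd, mean + n_sd * sd
    return [p for p in positive if low <= p["yield_t_ha"] <= high]


# Made-up raw yield points: one zero reading and one implausible spike.
raw_yields = [3.1, 3.4, 2.9, 3.2, 3.0, 3.3, 0.0, 3.1, 12.5, 3.2]
raw = [{"x": i, "y": 0, "yield_t_ha": v} for i, v in enumerate(raw_yields)]

# A tighter threshold is used here because the sample is very small.
cleaned = filter_yield(raw, n_sd=2.0)
print(len(raw), "points in,", len(cleaned), "points kept")
```

A single pass like this only catches gross errors. Issues such as combine start and end delays or overlapping passes need dedicated handling.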

DATA INTERPOLATION

The interpolation of spatial data takes the individual data points and converts them to a grid format. This regular grid output is achieved by estimating each grid value from the surrounding data values. Common interpolation methods offered in GIS packages include Kriging, Spline, and Inverse Distance Weighting (IDW). Knowledge of spatial statistics is useful when generating surfaced map layers.
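To make the idea of estimating grid values from surrounding points concrete, here is a minimal sketch of inverse distance weighting, the simplest of the methods listed. The function names `idw` and `idw_grid`, the power of 2 and the sample points are assumptions for illustration. It uses every sample point for every estimate, whereas GIS packages typically limit the search to a neighbourhood.

```python
import math


def idw(points, x, y, power=2.0):
    """Estimate the value at (x, y) from (px, py, value) tuples,
    weighting each sample by 1 / distance**power."""
    num = 0.0
    den = 0.0
    for px, py, value in points:
        d = math.hypot(x - px, y - py)
        if d == 0:
            return value  # exactly on a sample point
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def idw_grid(points, xs, ys, power=2.0):
    """Return a list of rows (one per y) of interpolated values."""
    return [[idw(points, x, y, power) for x in xs] for y in ys]


# Made-up sample points: (easting, northing, value).
samples = [(0, 0, 10.0), (10, 0, 20.0), (0, 10, 14.0), (10, 10, 30.0)]
grid = idw_grid(samples, xs=[0, 5, 10], ys=[0, 5, 10])
for row in grid:
    print([round(v, 1) for v in row])
```

Changing `power` changes the surface: a higher power makes each estimate follow its nearest sample more closely, while a lower power produces a smoother result.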

The interpolated output from a given data set may vary significantly depending on the model employed. Each model assumes a relationship between data points which may or may not exist. The method of ‘smoothing’ should not be used to hide errors in a data set.

Erroneous points must be removed or corrected before any data interpolation is performed. Most yield data is highly variable and an accurate map representation will show many local maxima and minima that may make the interpretation difficult.

Since yield maps and other spatial data sets help to identify trends across a field, interpolation techniques are used to mask the highly localised variation and highlight the spatial trends.

BASIC STATISTICS

A GIS package should display basic summary statistics for a data set. These include data layer information (Your Name, Farm, Field Name, Season), minimum, maximum and mean values and the standard deviation.

Using relatively simple statistical calculations, the level of in-field variability can be quickly ascertained. Standard deviation describes the spread or ‘dispersion’ of data around the mean or expected value. The larger the standard deviation, the larger the variability in the data and the less useful the mean is as a descriptor of a typical area within the field.

Another measure of data variability is the coefficient of variation (CV). The CV normalises the variation in the data by expressing the standard deviation as a percentage of the mean. The higher the CV value, the greater the spatial variability. Because it is normalised, the CV can also be used to compare the variability between different data sets (e.g. yield from one crop year to the next).
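The summary statistics described above can be sketched in a few lines. The function name `summarise` and the yield values are made up for the example, and the use of the population standard deviation is an assumption (a package may use the sample form instead).

```python
import statistics


def summarise(values):
    """Return min, max, mean, standard deviation and CV (%) for a layer."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    sd = statistics.pstdev(values)  # population SD, an assumption
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "sd": sd,
        "cv_pct": 100.0 * sd / mean,
    }


# Made-up yield values (t/ha) for one field.
stats = summarise([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(stats)
```

For these values the mean is 5 t/ha and the standard deviation is 2 t/ha, giving a CV of 40%. Because the CV is relative to the mean, it can be compared between seasons even when the average yield differs.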

DATA ANALYSIS

Interpretation techniques may range from quick and simple eyeballing to a more rigorous statistical analysis using GIS software. Both approaches have their place in PA. Good decisions can often be made in a fraction of the time using very simple techniques.

A simple approach to data interpretation may involve a visual side-by-side comparison of different data layers for a field, which makes spatial patterns easier to identify. Patterns that are repeated across data sets give confidence in the conclusions drawn from them. Identifying inconsistencies between data layers is just as important as identifying recurring patterns.

For example, several years of similar yield maps and vegetative images for a field may be followed by a yield map showing dissimilar patterns. Such an inconsistency will naturally raise the question ‘Why?’ and may warrant further investigation and diagnostics, providing an opportunity for the grower to learn something new about their field.

There are also occasions when a more rigorous approach to data analysis is required. Statistical comparisons and spatial correlations between data layers can reveal relationships which may not be visually obvious.

In the following example from PCT Agcloud, coloured lines help visualise a statistical comparison.

Red lines have an r squared below 0.3 and generally indicate a poor correlation.
Orange lines have an r squared between 0.3 and 0.6 and should be considered ‘ok’, but treated with caution.
Green lines have an r squared greater than 0.6 and should be considered highly correlated.
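The colour bands above can be expressed as a short sketch. It is not PCT Agcloud's implementation: it assumes r squared is the squared Pearson correlation of paired cell values from two layers, the function names and sample values are hypothetical, and because the bands above do not say where exactly 0.3 and 0.6 fall, the sketch treats both as orange.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long value lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a layer with no variation has no defined correlation")
    return sxy ** 2 / (sxx * syy)


def line_colour(r2):
    """Colour band for an r squared value (boundary handling is assumed)."""
    if r2 < 0.3:
        return "red"
    if r2 <= 0.6:
        return "orange"
    return "green"


# Made-up paired cell values from two layers of the same field.
em = [4.0, 8.0, 12.0, 16.0, 20.0]
yield_t_ha = [1.9, 2.6, 3.1, 3.4, 4.2]
r2 = r_squared(em, yield_t_ha)
print(round(r2, 2), line_colour(r2))
```

A strong r squared shows that two layers vary together across the field. It does not by itself explain why, so the result still needs agronomic interpretation.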

In any data interpretation, human involvement is vital. A grower or consultant who is actively engaged in the analysis process is more likely to query their data, verify its integrity, and incorporate any indigenous knowledge they possess. Human involvement can also accommodate imperfect data which is often generated in an agricultural environment. Imperfections caused by such common factors as glitches in yield monitors or clouds in aerial images can be corrected or excluded from any analysis.

Building a GIS implementation strategy for a farm is not an overnight process. Although there may be short-term gains such as improved sampling strategies using yield maps or aerial images, the long-term agronomic payback may require years of committed data gathering. Recognising stable spatial patterns and gaining an understanding of the underlying processes is a long-term payoff.

The following general guidelines can be used to interpret the level of variability, as measured by the CV, and the likelihood of payback from investment in VRT.

CV value: General observations

<5%: Generally not enough variability to warrant any action.

5-10%: Investigate to assess the economic benefit of managing the variability, particularly in high-value crops.

10-15%: Sufficient variability to expect to see a financial benefit in managing the variability in most crops.

>15%: Highly suitable for variable rate treatment (VRT) with a high payback expected.
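The guidelines above can be applied to a calculated CV with a simple lookup, sketched below. The function name `cv_guidance` and the shortened wording are illustrative. The bands above overlap at 10% and do not state where exactly 15% falls, so the sketch assumes that 10% belongs to the 10-15% band and that 15% itself stays in that band.

```python
def cv_guidance(cv_pct):
    """Return the general observation for a CV given in percent."""
    if cv_pct < 0:
        raise ValueError("CV cannot be negative")
    if cv_pct < 5:
        return "Generally not enough variability to warrant any action"
    if cv_pct < 10:
        return "Investigate the economic benefit of managing the variability"
    if cv_pct <= 15:
        return "Sufficient variability to expect a financial benefit in most crops"
    return "Highly suitable for VRT with a high payback expected"


# Made-up CV values for three fields.
for field, cv in [("Field A", 3.2), ("Field B", 8.5), ("Field C", 21.0)]:
    print(f"{field}: CV {cv}% -> {cv_guidance(cv)}")
```

These bands are general guidelines only. The economic benefit also depends on the crop, its value and the cost of treating the variability.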