28th August 2020
by Christoph Perger

The second part of this blog series looks at reading the actual data within a File Geodatabase. We will explore how to access both the attribute data and the spatial data of a vector layer, and build some indexes that will be used later.


Walkthrough

This tutorial series is split into three separate posts and gives you some insight into how to handle File Geodatabases in your C# solutions.

Note: As test data for the following examples, we are using the Urban Atlas 2012 for Austria, which is also included in the sample code of this blog post. Any screenshots or output shown here reflect this particular file geodatabase, but the results should look very similar for any other dataset.

Data analysis and indexing

In the following chapters we examine the layer's content and analyze its attribute and geometric properties in more depth. Additionally, we derive aggregated numbers as indexes for later use.

Attribute indexing

Let's take a closer look at the attributes. We know that the Urban Atlas 2012 classification for each polygon can be found in the field CODE2012, so our next task is to find out which values occur in that field. We write a small loop that iterates over the features in the dataset, extracts the value of the CODE2012 field, and counts its occurrences:

// Requires: using System; using System.Collections.Generic; using System.Linq; using OSGeo.OGR;
private static void AttributeIndex(string dataSetPath)
{
    var values = new Dictionary<string, int>();

    // Open the File Geodatabase read-only (update flag 0) and take its first layer.
    var fileGdbDriver = Ogr.GetDriverByName("OpenFileGDB");
    var dataSource = fileGdbDriver.Open(dataSetPath, 0);
    var layer = dataSource.GetLayerByIndex(0);

    var feature = layer.GetNextFeature();
    while (feature != null)
    {
        var classification = feature.GetFieldAsString("CODE2012");
        if (!values.ContainsKey(classification))
        {
            values.Add(classification, 1);
        }
        else
        {
            values[classification]++;
        }

        feature = layer.GetNextFeature();
    }

    var result = values.OrderBy(x => x.Key);
    foreach (var keyValuePair in result)
    {
        Console.WriteLine($"Classification {keyValuePair.Key} occurs {keyValuePair.Value,5} times.");
    }
}

Result:

Classification 11100 occurs  6416 times.
Classification 11210 occurs 24024 times.
Classification 11220 occurs 29536 times.
Classification 11230 occurs 23425 times.
Classification 11240 occurs 10800 times.
Classification 11300 occurs 27735 times.
Classification 12100 occurs 21215 times.
Classification 12210 occurs  1569 times.
Classification 12220 occurs 44160 times.
Classification 12230 occurs  3563 times.
Classification 12300 occurs   176 times.
Classification 12400 occurs    32 times.
Classification 13100 occurs  1355 times.
Classification 13300 occurs   659 times.
Classification 13400 occurs  3568 times.
Classification 14100 occurs  4297 times.
Classification 14200 occurs  3099 times.
Classification 21000 occurs 33946 times.
Classification 22000 occurs  2695 times.
Classification 23000 occurs 39897 times.
Classification 31000 occurs 19154 times.
Classification 32000 occurs   547 times.
Classification 33000 occurs   126 times.
Classification 40000 occurs    53 times.
Classification 50000 occurs  1994 times.
Classification 91000 occurs    81 times.
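
The counting step itself does not depend on OGR, so it can be separated from the reading loop and tried out on plain strings. The following is a minimal sketch of that idea, not part of the original sample code; the names `AttributeStatistics` and `CountValues` are hypothetical. It uses a `SortedDictionary` so the classes come back ordered by key without a separate `OrderBy` call.

```csharp
using System;
using System.Collections.Generic;

public static class AttributeStatistics
{
    // Counts how often each value occurs; the result is ordered by key.
    // Hypothetical helper: pass it the CODE2012 strings read from the layer.
    public static SortedDictionary<string, int> CountValues(IEnumerable<string> values)
    {
        var counts = new SortedDictionary<string, int>(StringComparer.Ordinal);
        foreach (var value in values)
        {
            // TryGetValue leaves 'current' at 0 when the key has not been seen yet.
            counts.TryGetValue(value, out var current);
            counts[value] = current + 1;
        }

        return counts;
    }
}
```

Calling `AttributeStatistics.CountValues(new[] { "11100", "21000", "11100" })` yields a count of 2 for `11100` and 1 for `21000`.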

Spatial indexing

Now that we know how the features are distributed across the classes numerically, we are interested in their spatial coverage. Only a small adjustment is needed to accomplish this: instead of counting the occurrences of a value, we calculate the area of each feature's geometry and add it to the running total for its class. Finally, we divide each class total by the total area of all features to get the percentage for each class.

private static void SpatialIndex(string dataSetPath)
{
    var values = new Dictionary<string, double>();

    // Open the File Geodatabase read-only (update flag 0) and take its first layer.
    var fileGdbDriver = Ogr.GetDriverByName("OpenFileGDB");
    var dataSource = fileGdbDriver.Open(dataSetPath, 0);
    var layer = dataSource.GetLayerByIndex(0);

    var feature = layer.GetNextFeature();
    while (feature != null)
    {
        var classification = feature.GetFieldAsString("CODE2012");
        var geometry = feature.GetGeometryRef();
        var area = geometry.Area();

        if (!values.ContainsKey(classification))
        {
            values.Add(classification, area);
        }
        else
        {
            values[classification] += area;
        }

        feature = layer.GetNextFeature();
    }

    var totalArea = values.Sum(x => x.Value);

    var result = values
        .OrderBy(x => x.Key)
        .ToDictionary(o => o.Key, o => o.Value / totalArea);
    foreach (var keyValuePair in result)
    {
        Console.WriteLine($"Classification {keyValuePair.Key} occupies {keyValuePair.Value,7:P} of the total area.");
    }
}

Result:

Classification 11100 occupies  0,27 % of the total area.
Classification 11210 occupies  1,48 % of the total area.
Classification 11220 occupies  1,96 % of the total area.
Classification 11230 occupies  1,38 % of the total area.
Classification 11240 occupies  0,41 % of the total area.
Classification 11300 occupies  0,67 % of the total area.
Classification 12100 occupies  1,70 % of the total area.
Classification 12210 occupies  0,20 % of the total area.
Classification 12220 occupies  1,71 % of the total area.
Classification 12230 occupies  0,25 % of the total area.
Classification 12300 occupies  0,05 % of the total area.
Classification 12400 occupies  0,12 % of the total area.
Classification 13100 occupies  0,28 % of the total area.
Classification 13300 occupies  0,04 % of the total area.
Classification 13400 occupies  0,13 % of the total area.
Classification 14100 occupies  0,45 % of the total area.
Classification 14200 occupies  0,46 % of the total area.
Classification 21000 occupies 30,65 % of the total area.
Classification 22000 occupies  1,43 % of the total area.
Classification 23000 occupies 15,09 % of the total area.
Classification 31000 occupies 35,08 % of the total area.
Classification 32000 occupies  3,93 % of the total area.
Classification 33000 occupies  0,33 % of the total area.
Classification 40000 occupies  0,25 % of the total area.
Classification 50000 occupies  1,63 % of the total area.
Classification 91000 occupies  0,08 % of the total area.
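
The decimal comma in the output above comes from the culture of the machine the code ran on, because the `P` format specifier is culture-sensitive. The following is a minimal sketch, not part of the original sample code, that isolates the division step and formats the result with the invariant culture so the output looks the same everywhere; the names `AreaStatistics`, `ToShares` and `FormatShare` are hypothetical. It also returns a `SortedDictionary`, since a plain `Dictionary` makes no documented promise about enumeration order.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class AreaStatistics
{
    // Converts summed areas per class into shares of the total area (0..1),
    // ordered by class key. Hypothetical helper, independent of OGR.
    public static SortedDictionary<string, double> ToShares(
        IReadOnlyDictionary<string, double> areaByClass)
    {
        var totalArea = areaByClass.Values.Sum();
        if (totalArea <= 0)
        {
            throw new ArgumentException("Total area must be positive.", nameof(areaByClass));
        }

        var shares = new SortedDictionary<string, double>(StringComparer.Ordinal);
        foreach (var pair in areaByClass)
        {
            shares[pair.Key] = pair.Value / totalArea;
        }

        return shares;
    }

    // Formats a share as a percentage with two decimals, independent of the
    // current machine culture (decimal point instead of decimal comma).
    public static string FormatShare(double share)
    {
        return share.ToString("P2", CultureInfo.InvariantCulture);
    }
}
```

For example, areas of 1 and 3 for two classes give shares of 0.25 and 0.75, and `AreaStatistics.FormatShare(0.0027)` returns `0.27 %`.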

Next steps

By looking at the actual data in the layer, we have gained more insight into the features and derived some statistics about the number of occurrences of each classification and their spatial distribution. In part 3 of our series we are going to create access methods to extract specific features from the data.