ArcGIS & Python

Monday, October 15, 2012

Circular binary structure for SciPy morphological operations

I wanted to try buffering a mask raster without converting to a shapefile and back again, so I used the binary_dilation function from the SciPy multi-dimensional image processing module (ndimage). I then used the following function to generate a circular binary structure array of the desired radius:

import numpy as np

def circular_structure(radius):
    """Circular binary structuring element of the given radius"""
    size = radius * 2 + 1
    i, j = np.mgrid[0:size, 0:size]
    i -= (size // 2)
    j -= (size // 2)
    return np.sqrt(i ** 2 + j ** 2) <= radius
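For example, buffering a made-up binary mask by a radius of 3 cells looks something like this (the mask array here is just for illustration):

import numpy as np
from scipy import ndimage

# Hypothetical binary mask; in my case this came from a mask raster
mask = np.zeros((20, 20), dtype=bool)
mask[10, 10] = True

# Dilate (buffer) the mask by 3 cells using the circular structuring element
buffered = ndimage.binary_dilation(mask, structure=circular_structure(3))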
Labels:
SciPy
Monday, September 10, 2012
ArcGIS 10.0 Implementation of the Canny Edge Detector
The following is my attempt at an almost pure ArcGIS implementation of the Canny Edge Detector. This would be much easier to apply in Python/NumPy (or in one line using the canny function in scikits-image), but I wanted to be able to apply it to very large rasters that exceed the 2 GB limit of the 32-bit NumPy that is installed with ArcGIS.
I found this site and this site both really helpful for actually implementing the Canny Edge Detector. Definitely refer to both of them for more examples and pictures.
I did use Python/NumPy for generating the Gaussian Filter kernel files that were needed for the ArcGIS Focal Statistics function. The following code will generate a properly formatted kernel file based on the kernel size and sigma value.
import os
import numpy as np

def gaussian_kernel(workspace, size=5, sigma=1.4):
    """Write an ArcGIS Focal Statistics kernel file for a gaussian filter"""
    sigma_str = str(round(sigma, 1)).replace('.', 'p')
    file_name = 'gaussian_{0}_{1}.txt'.format(size, sigma_str)
    size = int(size)
    i, j = np.mgrid[0:size, 0:size]
    i -= (size // 2)
    j -= (size // 2)
    g = np.exp(-(i ** 2 + j ** 2) / (2 * sigma ** 2))
    g /= g.sum()
    f = open(os.path.join(workspace, file_name), 'wb')
    f.write('{0} {0}\n'.format(size))
    for row in g:
        print ' {0}'.format(' '.join(['{0:0.6f}'.format(x) for x in row]))
        f.write('{0}\n'.format(' '.join(['{0:0.6f}'.format(x) for x in row])))
    f.close()

I then had to build kernel files for the Sobel Operators.
sobel_x.txt:
3 3
-1 0 1
-2 0 2
-1 0 1

sobel_y.txt:
3 3
1 2 1
0 0 0
-1 -2 -1

(These are the standard Sobel weights, preceded by the kernel dimensions on the first line.)
Finally, I had to build 8 kernel files (d0.txt through d7.txt) for shifting the raster 1 cell in each of the primary directions for the non-maxima suppression calculation. I used this approach instead of the Data Management Shift function so that I would get Spatial Analyst raster objects, but I'm not sure it is actually better. Each file starts with the line "3 3" followed by a 3x3 kernel of zeros with a single 1 marking the neighboring cell in that direction; d4.txt through d7.txt are the first four kernels rotated 180 degrees. A sketch of code that could write the files is shown below.
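As a sketch of how these files could be written; the mapping of file number to neighbor offset below is my assumption and needs to be checked against the angle reclassification in the code that follows:

import os
import numpy as np

def shift_kernels(workspace):
    # Assumed (row, col) offsets from the center cell for d0-d7;
    # d4-d7 are the opposites of d0-d3
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
               (0, -1), (1, -1), (1, 0), (1, 1)]
    for n, (d_row, d_col) in enumerate(offsets):
        kernel = np.zeros((3, 3), dtype=int)
        kernel[1 + d_row, 1 + d_col] = 1
        f = open(os.path.join(workspace, 'd{0}.txt'.format(n)), 'w')
        f.write('3 3\n')
        for row in kernel:
            f.write('{0}\n'.format(' '.join(str(x) for x in row)))
        f.close()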
The following code is my implementation of the Canny Edge Detector in ArcGIS 10.0. There are probably more efficient ways of doing it, but this seemed to work. The intermediate files could be saved by adding a few temp_obj.save(temp_raster) calls. I didn't include all of my input checking, and some names/values are hardwired.

import os
import arcpy
from arcpy import env
import arcpy.sa as sa

def canny_edge(workspace, raster_name, filter_name, t_low, t_high):
    #### Full paths to the input raster and gaussian filter kernel file
    raster_path = os.path.join(workspace, raster_name)
    filter_path = os.path.join(workspace, filter_name)

    #### Set ArcGIS environment parameters
    arcpy.CheckOutExtension("Spatial")
    env.overwriteOutput = True
    env.workspace = workspace
    env.scratchWorkspace = workspace
    env.cellSize = raster_path
    env.extent = raster_path
    env.snapRaster = raster_path

    #### Hardwired file names for kernels
    sobel_x_path = os.path.join(workspace, 'sobel_x.txt')
    sobel_y_path = os.path.join(workspace, 'sobel_y.txt')
    d0_path = os.path.join(workspace, 'd0.txt')
    d1_path = os.path.join(workspace, 'd1.txt')
    d2_path = os.path.join(workspace, 'd2.txt')
    d3_path = os.path.join(workspace, 'd3.txt')
    d4_path = os.path.join(workspace, 'd4.txt')
    d5_path = os.path.join(workspace, 'd5.txt')
    d6_path = os.path.join(workspace, 'd6.txt')
    d7_path = os.path.join(workspace, 'd7.txt')

    #### Canny Edge Detector Workflow
    #### Apply gaussian filter
    filter_obj = sa.FocalStatistics(
        raster_path, sa.NbrWeight(filter_path), "SUM", "NODATA")

    #### Calculate 1st derivative in horizontal and vertical direction
    grad_x_obj = sa.FocalStatistics(
        filter_obj, sa.NbrWeight(sobel_x_path), "SUM", "DATA")
    grad_y_obj = sa.FocalStatistics(
        filter_obj, sa.NbrWeight(sobel_y_path), "SUM", "DATA")
    del filter_obj

    #### Calculate gradient and direction
    grad_obj = sa.SquareRoot(
        sa.Power(grad_x_obj, 2) + sa.Power(grad_y_obj, 2))
    angle_obj = sa.ATan2(grad_y_obj, grad_x_obj)
    del grad_x_obj, grad_y_obj

    #### Reclassify angle into four directions
    #### See http://suraj.lums.edu.pk/~cs436a02/CannyImplementation.htm
    reclass_remap = sa.RemapRange(
        [[-3.141593, -2.748894, 0], [-2.748894, -1.963495, 1],
         [-1.963495, -1.178097, 2], [-1.178097, -0.392699, 3],
         [-0.392699, 0.392699, 0], [0.392699, 1.178097, 1],
         [1.178097, 1.963495, 2], [1.963495, 2.748894, 3],
         [2.748894, 3.141593, 0]])
    zone_obj = sa.Reclassify(
        angle_obj, "Value", reclass_remap, "NODATA")
    del angle_obj, reclass_remap

    #### Non-maxima suppression
    #### Shift raster 1 cell in each of the 8 possible directions
    ####   to get the neighboring cells for non-maxima suppression
    #### Keep cells that are greater than the two neighboring cells
    ####   along the gradient direction
    #### Calculate a separate raster for each primary direction then sum
    d0_obj = sa.FocalStatistics(
        grad_obj, sa.NbrWeight(d0_path), "SUM", "DATA")
    d4_obj = sa.FocalStatistics(
        grad_obj, sa.NbrWeight(d4_path), "SUM", "DATA")
    grad_04_obj = sa.Con(
        (zone_obj == 0) & (grad_obj > d0_obj) & (grad_obj > d4_obj),
        grad_obj, 0)
    del d0_obj, d4_obj
    d1_obj = sa.FocalStatistics(
        grad_obj, sa.NbrWeight(d1_path), "SUM", "DATA")
    d5_obj = sa.FocalStatistics(
        grad_obj, sa.NbrWeight(d5_path), "SUM", "DATA")
    grad_15_obj = sa.Con(
        (zone_obj == 1) & (grad_obj > d1_obj) & (grad_obj > d5_obj),
        grad_obj, 0)
    del d1_obj, d5_obj
    d2_obj = sa.FocalStatistics(
        grad_obj, sa.NbrWeight(d2_path), "SUM", "DATA")
    d6_obj = sa.FocalStatistics(
        grad_obj, sa.NbrWeight(d6_path), "SUM", "DATA")
    grad_26_obj = sa.Con(
        (zone_obj == 2) & (grad_obj > d2_obj) & (grad_obj > d6_obj),
        grad_obj, 0)
    del d2_obj, d6_obj
    d3_obj = sa.FocalStatistics(
        grad_obj, sa.NbrWeight(d3_path), "SUM", "DATA")
    d7_obj = sa.FocalStatistics(
        grad_obj, sa.NbrWeight(d7_path), "SUM", "DATA")
    grad_37_obj = sa.Con(
        (zone_obj == 3) & (grad_obj > d3_obj) & (grad_obj > d7_obj),
        grad_obj, 0)
    del d3_obj, d7_obj
    del grad_obj, zone_obj
    print "Sum Directions"
    nms_obj = (grad_04_obj + grad_15_obj + grad_26_obj + grad_37_obj)
    del grad_04_obj, grad_15_obj, grad_26_obj, grad_37_obj
    nms_obj.save(raster_path.replace('.img', '_nms.img'))

    #### Hysteresis Thresholding
    #### Keep low threshold regions that contain at least one cell
    ####   above the high threshold
    t_low_obj = sa.RegionGroup(
        sa.Con(nms_obj > t_low, 1), "EIGHT", "CROSS", "NO_LINK")
    t_zs_obj = sa.ZonalStatistics(
        t_low_obj, "value", nms_obj, "MAXIMUM", "DATA")
    del t_low_obj
    t_high_obj = sa.Con((t_zs_obj >= t_high), nms_obj)
    del t_zs_obj, nms_obj
    t_high_obj.save(raster_path.replace('.img', '_edges.img'))

There are probably issues at the edges of the rasters because I noticed that the Focal Statistics tool was not properly handling nodata when using the weighting option.
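A call might look something like the following sketch; the workspace, raster name, and threshold values here are made up, and the Sobel and shift kernel files are assumed to already be in the workspace:

workspace = 'D:\\Temp\\canny'

# Write the gaussian filter kernel file first (creates gaussian_5_1p4.txt)
gaussian_kernel(workspace, size=5, sigma=1.4)

# Run the edge detector; the *_nms.img and *_edges.img rasters are
# saved next to the input raster
canny_edge(workspace, 'dem.img', 'gaussian_5_1p4.txt', t_low=10, t_high=40)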
Labels:
ArcGIS 10,
Canny Edge Detector,
Python
Monday, January 23, 2012
Python flood-fill algorithm
Update (2016-09-23): If anyone stumbles upon this, I have made a GitHub repo with working examples of these two functions (http://github.com/cgmorton/flood-fill) and I have updated the code to address the missing variable mentioned in the comments.
I was converting a Matlab script to Python and needed to replicate the Matlab imfill function to fill holes/depressions in floating point rasters.
To determine the algorithm that imfill uses, I followed the reference in the imfill documentation: Soille, P., Morphological Image Analysis: Principles and Applications, Springer-Verlag, 1999, pp. 173-174. The 2nd edition, published in 2004, has a short description of hole filling in section 6.3.7 (pg 208). The key point is that a marker image is built that equals the maximum value of the input image (referred to as the mask image) everywhere except along the border, where it keeps the input values, and this marker is then morphologically eroded. The process is explained in more detail in Soille, P., Vogt, J., and Colombo, R., 2003. Carving and Adaptive Drainage Enforcement of Grid Digital Elevation Models. Water Resources Research, 39(12), 1366. The erosion operation outputs the minimum value in the neighboring cells of a point (defined using a structuring element that is typically a 3x3 square or a cross). The filled image is found by taking the maximum of either the erosion of the marker cell (the minimum value of the neighboring marker cells) or the original mask cell value.
The SciPy ndimage module has morphology tools. There is a binary hole filling function, but no floating point hole fill function. There is a grey_erosion function, and the following Python function mimics imfill using the grey_erosion function:
import numpy as np
from scipy import ndimage

def flood_fill(test_array, four_way=False):
    """Fill holes/depressions in a floating point array (like Matlab's imfill)"""
    input_array = np.copy(test_array)
    # Set h_max to a value larger than the array maximum to ensure
    #   that the while loop will terminate
    h_max = np.max(input_array * 2.0)
    # Build mask of cells with data not on the edge of the image
    # Use 3x3 square structuring element
    # Structuring element could also be built using the SciPy ndimage module
    #   el = ndimage.generate_binary_structure(2, 2).astype(np.int)
    data_mask = np.isfinite(input_array)
    inside_mask = ndimage.binary_erosion(
        data_mask,
        structure=np.array([[1, 1, 1], [1, 1, 1], [1, 1, 1]]).astype(np.bool))
    edge_mask = (data_mask & ~inside_mask)
    # Initialize output array as max value test_array except edges
    output_array = np.copy(input_array)
    output_array[inside_mask] = h_max
    # Array for storing the previous iteration
    output_old_array = np.copy(input_array)
    output_old_array.fill(0)
    # Cross structuring element
    if four_way:
        el = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]]).astype(np.bool)
        # el = ndimage.generate_binary_structure(2, 1).astype(np.int)
    else:
        el = np.array([[1, 1, 1], [1, 1, 1], [1, 1, 1]]).astype(np.bool)
        # el = ndimage.generate_binary_structure(2, 2).astype(np.int)
    # Iterate until the marker array doesn't change
    while not np.array_equal(output_old_array, output_array):
        output_old_array = np.copy(output_array)
        output_array = np.maximum(
            input_array,
            ndimage.grey_erosion(output_array, size=(3, 3), footprint=el))
    return output_array

This function is very slow though, because the calculation starts on the edge of the image and moves in. Only a small number of cells are changing but the entire array is being processed for each iteration. The function was used to fill holes on a 2206x2986 image with numerous large depressions and took 690 seconds to run.
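As a quick check, here is a small made-up array with a one-cell depression and a low outlet of 3.0 on the right edge (the same call also works for the faster version below). After filling, the center cell should be raised to 5.0, the level of its pour point, and everything else left unchanged:

import numpy as np

test_array = np.array([
    [9., 9., 9., 9., 9.],
    [9., 5., 5., 5., 9.],
    [9., 5., 1., 5., 3.],
    [9., 5., 5., 5., 9.],
    [9., 9., 9., 9., 9.]])
filled = flood_fill(test_array)
print(filled[2, 2])  # expect 5.0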
The Soille (2004) text and the Soille et al. (2003) article mentioned that a fast algorithm that uses priority queues is proposed in Soille and Gratin, 1994. An Efficient Algorithm for Drainage Networks Extraction on DEMs. Journal of Visual Communication and Image Representation, 5(2), 181-189. This article was difficult to obtain but it turns out that the article Liu et al., 2009. Another Fast and Simple DEM Depression-Filling Algorithm Based on Priority Queue Structure. Atmospheric and Oceanic Science Letters, 2(4), 214-219 presents an algorithm that is nearly identical (but with no mention or citation of the Soille and Gratin (1994) article!). The article is available with a quick Google search. The basic idea of the method is the same as the previous algorithm, but instead of processing every cell each iteration, the cells with the lowest value are processed first. Conceptually, from the edge the algorithm works upslope until it hits a depression, at which point the depression is filled with the current value (h_crt) until a higher point is reached.
The following Python function implements the priority-queue-based flood-fill algorithm:
import heapq
import numpy as np
from scipy import ndimage

def flood_fill(test_array, four_way=False):
    """Fill holes/depressions in a floating point array using a priority queue"""
    input_array = np.copy(test_array)
    input_rows, input_cols = input_array.shape
    # Set h_max to a value larger than the array maximum to ensure
    #   that the while loop will terminate
    h_max = np.max(input_array * 2.0)
    # Build mask of cells with data not on the edge of the image
    # Use 3x3 square structuring element
    data_mask = np.isfinite(input_array)
    inside_mask = ndimage.binary_erosion(
        data_mask,
        structure=ndimage.generate_binary_structure(2, 2).astype(np.int))
    # Edge cells are data cells that are not inside cells
    edge_mask = (data_mask & ~inside_mask)
    # Initialize output array as max value test_array except edges
    output_array = np.copy(input_array)
    output_array[inside_mask] = h_max
    # Build priority queue and place edge pixels into the priority queue
    # Last value is a flag to indicate if the cell is an edge cell
    put = heapq.heappush
    get = heapq.heappop
    fill_heap = [
        (output_array[t_row, t_col], int(t_row), int(t_col), True)
        for t_row, t_col in np.transpose(np.where(edge_mask))]
    heapq.heapify(fill_heap)

    def neighbors(row, col, four_way=False):
        """Return indices of adjacent cells"""
        if four_way:
            return [
                (row - 1, col), (row, col + 1),
                (row + 1, col), (row, col - 1)]
        else:
            return [
                (row - 1, col), (row - 1, col + 1),
                (row, col + 1), (row + 1, col + 1),
                (row + 1, col), (row + 1, col - 1),
                (row, col - 1), (row - 1, col - 1)]

    # Iterate until the priority queue is empty
    while True:
        try:
            h_crt, t_row, t_col, edge_flag = get(fill_heap)
        except IndexError:
            break
        for n_row, n_col in neighbors(t_row, t_col, four_way):
            # Skip the cell if it is outside the array edges
            if edge_flag:
                try:
                    if not inside_mask[n_row, n_col]:
                        continue
                except IndexError:
                    continue
            if output_array[n_row, n_col] == h_max:
                output_array[n_row, n_col] = max(
                    h_crt, input_array[n_row, n_col])
                put(fill_heap, (output_array[n_row, n_col], n_row, n_col, False))
    return output_array

This function only took 61 seconds to fill the same test image, which is a considerable improvement, but imfill only takes ~3 seconds, so there is still a long way to go. I will probably try implementing the function in C++ next.
A number of things helped speed up the algorithm. The biggest gain was saving 180 seconds by using Python ints instead of NumPy ints in the heap data tuple (cell_value, row, col, edge_flag). Other improvements included using the heapq module instead of Queue.PriorityQueue() (saved 47s), using the Python max function instead of the NumPy max function (saved 25s), creating local copies of the function references heappush and heappop (saved 2s), and only checking whether the neighboring pixels were in the image for edge pixels, using the edge_flag (saved 4s). Last, I set the max value of the marker array to be greater than the max value of the input array so that the while loop would terminate without checking the value each time.
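For reference, the heap-related changes amount to something like the following sketch (not the exact code from the function above):

import heapq
import numpy as np

# Local copies of the heapq functions avoid repeated attribute lookups
put = heapq.heappush
get = heapq.heappop

values = np.array([3.2, 1.5, 2.7])
fill_heap = []
for i, v in enumerate(values):
    # Cast NumPy integer indices to plain Python ints before building the
    # tuple; comparing NumPy scalars inside heapq was the biggest slowdown
    put(fill_heap, (v, int(i), False))
print(get(fill_heap))  # the smallest cell value comes off the heap first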
Labels:
Hole Filling Algorithms,
Python,
SciPy
Friday, September 2, 2011
Simple python issues
This post will list some simple Python issues that I always have to look up.
To build a string showing a number with a percent sign, use:
"%f%%" % some_float
To swap values:
a,b = b,a
The three ways of building a path string:
'D:\\Temp\\temp.txt', r'D:\Temp\temp.txt', 'D:/Temp/temp.txt'
(Note: the first two produce exactly the same string; Python does not convert forward slashes to backslashes, so when splitting paths it is safer to use os.path.split or os.path.normpath than to split on "\\")
To traverse all sub-folders and files of a workspace, use:
for root, subs, files in os.walk(workspace):
To calculate DOY from a datetime object:
from datetime import datetime
dt_obj = datetime.now()
doy = dt_obj.timetuple().tm_yday
Regular expression to "match all":
'.*'
or '.*?'
Regular expression to "match nothing":
'a^'
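A quick check of those last two patterns (just as an illustration):

import re

# '.*' matches any string, including the empty string; 'a^' can never match
print(re.match('.*', 'anything') is not None)   # True
print(re.search('a^', 'anything') is not None)  # False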
Labels:
Python
Wednesday, August 31, 2011
Latitude & Longitude rasters in ArcGIS 10
The ability to calculate latitude and longitude rasters using the GRID commands $$YMap and $$XMap was removed in ArcGIS 10 (along with the other GRID commands such as $$COLMAP, $$ROWMAP, $$NROWS, $$NCOLS, and $$CELLSIZE). To get around this you can still load the ArcGIS 9.3 geoprocessor in ArcGIS 10 and then call the functions using the Single Output Map Algebra tool. Latitude & longitude rasters can also be calculated with NumPy or the FlowAccumulation function, but these aren't great options for really large rasters. Make sure to use a DEM that is in a geographic coordinate system and then project the results.
import os

def lat_long_rasters(ned_raster, lat_raster, lon_raster):
    import arcgisscripting
    gp = arcgisscripting.create(9.3)
    gp.CheckOutExtension("Spatial")
    gp.WorkSpace = os.path.dirname(ned_raster)
    gp.SingleOutputMapAlgebra_sa(
        "Con(IsNull({0}), {1}, $$YMap)".format(ned_raster, ned_raster),
        lat_raster)
    gp.SingleOutputMapAlgebra_sa(
        "Con(IsNull({0}), {1}, $$XMap)".format(ned_raster, ned_raster),
        lon_raster)
    del gp
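For rasters that do fit in memory, the NumPy approach mentioned above might look roughly like the following sketch; the arcpy.Raster properties used here (extent, meanCellWidth, meanCellHeight, height, width) are my choice of a convenient way to get the georeferencing, so check the values against your data:

import arcpy
import numpy as np

def lat_lon_arrays(dem_path):
    """Cell center latitude/longitude arrays for a DEM in a geographic coordinate system"""
    dem = arcpy.Raster(dem_path)
    rows, cols = dem.height, dem.width
    cs_x, cs_y = dem.meanCellWidth, dem.meanCellHeight
    # Cell center coordinates; row 0 of the array is the top (maximum latitude)
    lon = dem.extent.XMin + cs_x * (np.arange(cols) + 0.5)
    lat = dem.extent.YMax - cs_y * (np.arange(rows) + 0.5)
    lon_array = np.tile(lon, (rows, 1))
    lat_array = np.tile(lat.reshape(-1, 1), (1, cols))
    return lat_array, lon_array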
Labels:
arcgisscripting,
latitude,
longitude
Friday, August 19, 2011
Nan in NumPy array to NoData in ArcGIS raster
For smaller rasters, NumPy is a much faster alternative to some of the ArcGIS geoprocessing functions. One issue with doing this is that every raster must have exactly the same dimensions (rows, columns, cellsize, extent, etc.) since NumPy has no spatial component. Later on I will post about some of the techniques we use to make sure they are all identical.
There is one issue with converting NumPy arrays back to rasters that I want to outline. The arcpy function NumPyArrayToRaster does not correctly convert NaN values. The values will display as NoData in the Identify window, but they do not act as NoData in subsequent calculations and the 'Display NoData as' option in the Symbology tab has no impact.
To get around this, convert all NaN values to some unused value, and then set that value as the value_to_nodata parameter. For example:
# test_array is a NumPy float array; pnt is an arcpy.Point for the lower
# left corner and cellsize is the cell size of the original raster
test_array[np.isnan(test_array)] = -9999
test_obj = arcpy.NumPyArrayToRaster(
    test_array, pnt, cellsize, cellsize, -9999)
test_obj.save(test_raster)
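Going the other direction (raster to array), the nodata_to_value parameter of RasterToNumPyArray can be used so that the placeholder value can be converted back to NaN. A minimal sketch:

import arcpy
import numpy as np

# Read the raster back and convert the NoData placeholder back to NaN
test_array = arcpy.RasterToNumPyArray(test_raster, nodata_to_value=-9999)
test_array = test_array.astype(np.float32)
test_array[test_array == -9999] = np.nan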
This is Bug NIM063028 and is not currently reported as fixed in any version.
Labels:
ArcGIS 10,
NumPy,
NumPyArrayToRaster
Pyramids documentation
There is a small issue with the documentation for the pyramids environment parameter. The explanation section for the pyramid_option has a few errors. The option is "PYRAMIDS", not "PYRAMID", and the default is to build full pyramids (-1 level), not NONE. This is Bug NIM062605 and is reported as fixed in 10.1. The example code at the bottom of the help page shows the correct usage. To force no pyramids I prefer to use: env.pyramid = "PYRAMIDS 0".
Also, the CreateRandomRaster function doesn't honor the pyramids setting. This is Bug NIM062606 and is not currently reported as fixed.
Labels:
ArcGIS 10,
CreateRandomRaster,
environment parameter,
pyramids