Based on the explorations up to this point, I wanted to post some initial designs using the Gaussian renderer and blending with the original panorama at a more carefully specified horizon. The following images show the results with varying multipliers applied to the Gaussian functions used to draw SOM cells. If you look really closely, you may be able to see that one of the red benches dissolves into a plume of red. A person sitting on one of the red benches on the left turns into thick black smoke.
The following images show exponential and linear (respectively) increases of neighbourhood size where the maximum neighbourhood size at the top is the same (1600). They don’t seem all that aesthetically different, nor does there appear to be any smoother a transition at the horizon. The linear version is slightly more interesting (and cosmological).
After making the decision to seed the top row with averaged values I executed a few runs using a larger network (20px for each SOM cell) and a range of maximum neighbourhood sizes (10, 50, 100, 500, and 1000). The last is so large that it nearly covers the entire width of the map (~1400 cells). Following are a few rectangle renderings of the result. Note, I used too large a scaler, so each cell is drawn quite large. I think this explains the large rectangles at the bottom of the image, but that needs more investigation.
I wanted to follow up on some of the conceptual aspects I touched on in my previous post. When I started my Masters degree I was really interested in how I could create a computational process that was not random nor a perfect reflection of my intention nor the outside world. In general, I try to use randomness as sparingly as possible. In my Masters project I, as best as I can recall, did not use any randomness at all, making the system deterministic. Since it used images from its environment (that was constantly changing) it was unpredictable, and yet deterministic.
Thanks to Daniel Frenzel, ANNetGPGPU now supports setting different neighbourhood sizes in a single network. This means I will no longer have to generate a different source and data file for each neighbourhood size. Following is an image visualizing the weights (using the old rectangle renderer for performance reasons).
In the following images (raw on top, original overlap on bottom) I used a smaller range of neighbhourhood sizes compared to the previous post (1-30). I was hoping the buildings would be less obliterated, and I actually prefer some of the horizontality of the previous 1-40 version. The reason why I’m unhappy with these results is that the visually interesting structure only begins on the horizon, and thus the abstraction should only start at that point.
The following is a 4k video of a portion of the colour field sequence previously posted.
The following image is constructed by taking a single row from each of the trained networks such that the lowest (bottom) row has the smallest neighbourhood size (1) and the highest (top) row has the largest (170). Note that very little of the image is readable because the image is abstracted very quickly as the neighbourhood size increases in steps of 1. Note some stability in structure (position of colours) near the bottom of the image. Near the top the increasing horizontality indicates large differences in structure in subsequent neighbourhood sizes.
The following images show a sequence where the neighbourhood size increases linearly from 1 to 33 (skipping even neighbourhood sizes). In networks with neighbourhoods larger than 33, the results look about the same structurally and occasionally show large changes in subsequent neighbourhood sizes. The larger the neighbourhood the more instability there seems to be.
I realized that part of why the SOMs in the previous sequences are inconsistent over time is because a time seeded random number is used to rearrange of order of the segments (inputs) for each SOM, which adds significant random variation. I first tried to use serial training in ANNetGPGPU, but found that it is significantly slower than random training (serial training time: 745.579s; random training time: 12.1s). I also rewrote the code so that the next network actually uses the previous network as a starting point, rather than starting with the original training data for each neighbourhood size. The results, a selection of which follows, certainly have more cohesion, but the use of the previous network reduces some of the colour variation.
I managed to batch generate 170 different SOMs with different neighbourhood sizes (default/170 to default). Unfortunately, they are not stable over time; even though they have the same initial conditions, each result has a different structure. I’m not sure if this is due to the change of neighbourhood size, or some indeterminism in the way the algorithm proceeds on the GPU. Following is a selection of the sequence with increasing neighbourhood sizes. These SOMs are rendered using the new code that renders with Gaussianoids rather than rectangles.
The following images are renderings of the SOM structure trained and visualized using the same methods as previously posted. The only difference here is that much smaller neighbourhood sizes are used (top: default/150 and bottom: default/50)
The following images show visualizations of the SOM’s structure. The visualization is composed of rectangles where their colour, width and height correspond to the segments association with that location. The segments themselves are shown underneath the visualizations.
I think I have code working where the initial neighbourhood size (the number of neurons that are updated for each training step) starts off being very small (in this case default/20). The idea is to use the neighbourhood size such that the image becomes increasingly self-organized from the bottom to the top. In the first image below, only 1000 iterations of training are done. There is an interesting deconstruction of the image from the initial conditions (seeded from the original panorama).
Following from my previous post the following image and details show the level of fragmentation using blocks of 36px. This corresponds to the SOM being 36 times smaller than the original pano. I’m currently doing a 20px run (1421×306 lattice), but its proving to be very slow. Even though I’m not doing any training, getting the BMUs for each segment is extremely slow, ~4 hours for the 36px blocks. In the image below I use a gradient alpha mask to fade between the original panorama at the bottom and the lattice arranged segments.
I found a bug in the code such that the initial conditions of the network were not properly corresponding with the original pano. The following image shows a untrained map where the features of segments in each 100 px square block are averaged and then the closest region is presented in that location. The top image shows these segments in each block’s position. The image below is the same, but with the original pano underneath it and a detail. As this is the least fragmented the Cartesian SOM can represent, I’m now running a pass with smaller blocks and we’ll see how that looks.
The following images are the result of a 789×170 unit SOM where the initial weights are determined by the original panorama source. The training was done without modification, thus large initial neighbourhoods obliterate much of that initial structure. The number of units in the SOM make the segments too broadly distributed with large gaps between segments. The idea is to control the learning rate and neighbourhood size in the modified training routine such that the segments are located near where they are in the pano at the bottom and become increasingly self-organized at the top. I’ll try a smaller SOM next so there is hopefully more overlap between neighbouring segments.
The following image shows the initial weights I will use to train the SOM next. It just shows colour information, but the feature vectors also hold width and height of the segments. In grid locations that contain more than one segment, the feature vectors of those segments are averaged. For now, feature vectors for cells without any segments are set to zero (and appear black below).