Based on the explorations up to this point, I wanted to post some initial designs using the Gaussian renderer and blending with the original panorama at a more carefully specified horizon. The following images show the results with varying multipliers applied to the Gaussian functions used to draw SOM cells. If you look really closely, you may be able to see that one of the red benches dissolves into a plume of red. A person sitting on one of the red benches on the left turns into thick black smoke.
The following images show exponential and linear (respectively) increases of neighbourhood size where the maximum neighbourhood size at the top is the same (1600). They don’t seem all that aesthetically different, nor does there appear to be any smoother a transition at the horizon. The linear version is slightly more interesting (and cosmological).
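To make the difference between the two schedules concrete, here is a minimal sketch of how a per-row neighbourhood size could be computed. The function names, the number of rows, and the minimum size of 1 are my assumptions for illustration; only the maximum of 1600 comes from the runs above, and this is not the code used to make the images.

```python
def linear_schedule(rows, n_min, n_max):
    """Neighbourhood size per row, growing by a constant step."""
    if rows == 1:
        return [float(n_max)]
    step = (n_max - n_min) / (rows - 1)
    return [n_min + step * r for r in range(rows)]


def exponential_schedule(rows, n_min, n_max):
    """Neighbourhood size per row, growing by a constant ratio."""
    if rows == 1:
        return [float(n_max)]
    ratio = (n_max / n_min) ** (1.0 / (rows - 1))
    return [n_min * ratio ** r for r in range(rows)]


# Row 0 is the horizon and the last row is the top of the image.
# Five rows and a minimum of 1 are hypothetical values.
lin = linear_schedule(5, 1, 1600)
exp = exponential_schedule(5, 1, 1600)
```

Both schedules meet at the same maximum, but the exponential one stays small for most of the rows and only grows quickly near the top, while the linear one is already large halfway up.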
After making the decision to seed the top row with averaged values I executed a few runs using a larger network (20px for each SOM cell) and a range of maximum neighbourhood sizes (10, 50, 100, 500, and 1000). The last is so large that it nearly covers the entire width of the map (~1400 cells). Following are a few rectangle renderings of the result. Note that I used too large a scale factor, so each cell is drawn quite large. I think this explains the large rectangles at the bottom of the image, but that needs more investigation.
I wanted to follow up on some of the conceptual aspects I touched on in my previous post. When I started my Master’s degree I was really interested in how I could create a computational process that was neither random nor a perfect reflection of my intention or the outside world. In general, I try to use randomness as sparingly as possible. In my Master’s project I, as best as I can recall, did not use any randomness at all, making the system deterministic. Since it used images from its environment (which was constantly changing) it was unpredictable, and yet deterministic.
Thanks to Daniel Frenzel, ANNetGPGPU now supports setting different neighbourhood sizes in a single network. This means I will no longer have to generate a different source and data file for each neighbourhood size. Following is an image visualizing the weights (using the old rectangle renderer for performance reasons).
In the following images (raw on top, original overlap on bottom) I used a smaller range of neighbourhood sizes compared to the previous post (1-30). I was hoping the buildings would be less obliterated, and I actually prefer some of the horizontality of the previous 1-40 version. The reason why I’m unhappy with these results is that the visually interesting structure only begins on the horizon, and thus the abstraction should only start at that point.
The following is a 4k video of a portion of the colour field sequence previously posted.
The following image is constructed by taking a single row from each of the trained networks such that the lowest (bottom) row has the smallest neighbourhood size (1) and the highest (top) row has the largest (170). Note that very little of the image is readable because the image is abstracted very quickly as the neighbourhood size increases in steps of 1. Note some stability in structure (position of colours) near the bottom of the image. Near the top the increasing horizontality indicates large differences in structure in subsequent neighbourhood sizes.
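The row-by-row construction described above could be sketched as follows. This assumes each trained network is available as a grid of cell values with row 0 at the top, and that output row y is taken from row y of the chosen network; both are my assumptions, and `composite_rows` is a hypothetical helper name.

```python
def composite_rows(networks):
    """Build one image from many trained networks.

    networks[i] is a grid (list of rows, row 0 at the top) from the SOM
    trained with the i-th smallest neighbourhood size. There must be as
    many networks as rows. The bottom output row comes from networks[0]
    (smallest neighbourhood) and the top row from networks[-1] (largest).
    """
    height = len(networks)
    out = []
    for y in range(height):
        source = networks[height - 1 - y]  # top rows use the largest sizes
        out.append(list(source[y]))
    return out


# Tiny example: three 3x2 networks whose cells hold the network index.
nets = [[[i] * 2 for _ in range(3)] for i in range(3)]
image = composite_rows(nets)
```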
The following images show a sequence where the neighbourhood size increases linearly from 1 to 33 (skipping even neighbourhood sizes). In networks with neighbourhoods larger than 33, the results look about the same structurally and occasionally show large changes in subsequent neighbourhood sizes. The larger the neighbourhood the more instability there seems to be.
I realized that part of why the SOMs in the previous sequences are inconsistent over time is that a time-seeded random number is used to rearrange the order of the segments (inputs) for each SOM, which adds significant random variation. I first tried to use serial training in ANNetGPGPU, but found that it is significantly slower than random training (serial training time: 745.579s; random training time: 12.1s). I also rewrote the code so that the next network actually uses the previous network as a starting point, rather than starting with the original training data for each neighbourhood size. The results, a selection of which follows, certainly have more cohesion, but the use of the previous network reduces some of the colour variation.
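One way to keep a shuffled presentation order while removing the run-to-run variation is to fix the seed instead of taking it from the clock. The following is a generic sketch in plain Python, not ANNetGPGPU's API; `presentation_order` and the seed value are hypothetical.

```python
import random


def presentation_order(n_segments, seed):
    """Return a shuffled list of segment indices that depends only on seed."""
    rng = random.Random(seed)  # private generator, not seeded from the clock
    order = list(range(n_segments))
    rng.shuffle(order)
    return order


# The same seed gives the same order for every network in the sequence.
order_a = presentation_order(10, seed=1234)
order_b = presentation_order(10, seed=1234)
```

With the order fixed, any remaining difference between two networks comes from the neighbourhood size (or from the GPU), not from the input ordering.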
I managed to batch generate 170 different SOMs with different neighbourhood sizes (default/170 to default). Unfortunately, they are not stable over time; even though they have the same initial conditions, each result has a different structure. I’m not sure if this is due to the change of neighbourhood size, or some indeterminism in the way the algorithm proceeds on the GPU. Following is a selection of the sequence with increasing neighbourhood sizes. These SOMs are rendered using the new code that renders with Gaussianoids rather than rectangles.
The following images are renderings of the SOM structure trained and visualized using the same methods as previously posted. The only difference here is that much smaller neighbourhood sizes are used (top: default/150 and bottom: default/50).
The following images show visualizations of the SOM’s structure. The visualization is composed of rectangles whose colour, width and height correspond to the segments associated with that location. The segments themselves are shown underneath the visualizations.
I think I have code working where the initial neighbourhood size (the number of neurons that are updated for each training step) starts off being very small (in this case default/20). The idea is to use the neighbourhood size such that the image becomes increasingly self-organized from the bottom to the top. In the first image below, only 1000 iterations of training are done. There is an interesting deconstruction of the image from the initial conditions (seeded from the original panorama).
Following from my previous post, the image and details below show the level of fragmentation using blocks of 36px. This corresponds to the SOM being 36 times smaller than the original pano. I’m currently doing a 20px run (1421×306 lattice), but it’s proving to be very slow. Even though I’m not doing any training, getting the BMUs for each segment is extremely slow, ~4 hours for the 36px blocks. In the image below I use a gradient alpha mask to fade between the original panorama at the bottom and the lattice-arranged segments.
I found a bug in the code such that the initial conditions of the network were not properly corresponding with the original pano. The following image shows an untrained map where the features of segments in each 100px square block are averaged and then the closest region is presented in that location. The top image shows these segments in each block’s position. The image below is the same, but with the original pano underneath it and a detail. As this is the least fragmented the Cartesian SOM can represent, I’m now running a pass with smaller blocks and we’ll see how that looks.
The following images are the result of a 789×170 unit SOM where the initial weights are determined by the original panorama source. The training was done without modification, thus large initial neighbourhoods obliterate much of that initial structure. The number of units in the SOM makes the segments too broadly distributed, with large gaps between segments. The idea is to control the learning rate and neighbourhood size in the modified training routine such that the segments are located near where they are in the pano at the bottom and become increasingly self-organized at the top. I’ll try a smaller SOM next so there is hopefully more overlap between neighbouring segments.
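The gradient alpha mask mentioned above can be sketched as a per-row blend weight. The function names and the fade band limits here are hypothetical; the post does not record where the fade starts or ends, and the real compositing was not necessarily done this way.

```python
def alpha_for_row(y, fade_top, fade_bottom):
    """Blend weight for image row y (row 0 is the top of the image).

    0.0 means only the lattice-arranged segments are visible,
    1.0 means only the original panorama is visible.
    """
    if y <= fade_top:
        return 0.0
    if y >= fade_bottom:
        return 1.0
    return (y - fade_top) / (fade_bottom - fade_top)


def blend(pano_px, lattice_px, alpha):
    """Mix two RGB pixels with the given weight on the panorama."""
    return tuple(round(alpha * p + (1.0 - alpha) * l)
                 for p, l in zip(pano_px, lattice_px))


# Hypothetical fade band between rows 100 and 300.
quarter = alpha_for_row(150, 100, 300)
mixed = blend((255, 0, 0), (0, 0, 255), quarter)
```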
The following image shows the initial weights I will use to train the SOM next. It just shows colour information, but the feature vectors also hold width and height of the segments. In grid locations that contain more than one segment, the feature vectors of those segments are averaged. For now, feature vectors for cells without any segments are set to zero (and appear black below).
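The averaging rule above could look roughly like this. It is a sketch under assumptions: segments are given as (x, y, features) tuples with positions in pano pixels, grid cells are square, and `seed_weights` is a hypothetical name rather than a function from the actual code.

```python
def seed_weights(segments, cols, rows, cell_px):
    """Initial SOM weights from segment positions in the panorama.

    segments: list of (x, y, features), x and y in pano pixels.
    Cells holding several segments get the mean feature vector;
    cells holding none get zeros (and so render as black).
    """
    dim = len(segments[0][2]) if segments else 0
    sums = [[[0.0] * dim for _ in range(cols)] for _ in range(rows)]
    counts = [[0] * cols for _ in range(rows)]
    for x, y, features in segments:
        c = min(int(x // cell_px), cols - 1)
        r = min(int(y // cell_px), rows - 1)
        for i, value in enumerate(features):
            sums[r][c][i] += value
        counts[r][c] += 1
    weights = []
    for r in range(rows):
        row = []
        for c in range(cols):
            n = counts[r][c]
            row.append([s / n for s in sums[r][c]] if n else [0.0] * dim)
        weights.append(row)
    return weights


# Two segments share the first cell, the middle cell is empty.
demo = seed_weights(
    [(5, 5, [1.0, 3.0]), (8, 2, [3.0, 5.0]), (25, 5, [10.0, 10.0])],
    cols=3, rows=1, cell_px=10)
```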
I’ve finally got things working such that the arrangement of segments reflects their structure as learned by the SOM; the lack of such structure in previous results was due to a bug in ANNetGPGPU. The following images are all the result of a 100,000 iteration training period with random initial conditions. The last three images are details of the second image.
The following images show some very early work using a SOM (implemented in ANNetGPGPU) to arrange the montages of segmented regions. Rather than sorting according to single parameters, e.g. colour, the composition should reflect the similarity of regions across all parameters. In the first image, I used the wrong segment directory where I did not tweak the overexposed sky. The result is a number of very large white regions at the bottom. While all regions are clearly organized according to their colour parameters, the clusters don’t look right. I think I messed up some of the code mapping the 1D BMU IDs to 2D positions in the SOM lattice.
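For reference, the usual mapping between a 1D unit ID and a 2D lattice position looks like the sketch below. It assumes row-major ordering (IDs run left to right, then top to bottom); I have not confirmed that ANNetGPGPU numbers its units this way, and the helper names are hypothetical.

```python
def bmu_to_xy(bmu_id, width):
    """Convert a 1D unit ID to (x, y), assuming row-major ordering."""
    return bmu_id % width, bmu_id // width


def xy_to_bmu(x, y, width):
    """Convert (x, y) back to the 1D unit ID under the same assumption."""
    return y * width + x


# In a lattice 4 units wide, unit 6 sits in column 2 of row 1.
position = bmu_to_xy(6, 4)
```

Swapping width for height, or the order of the two results, still produces positions that fall inside the lattice for many IDs, which is why this kind of bug tends to show up as clusters that look almost right.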
The following collage explorations use the same code used to generate collages during my Banff Centre Residency, where the segments are extracted from the pano previously posted. These images are composed of over 230,000 individual segments, but the very large regions from the monochromatic sky are filtered out. The images without mattes are details of the collages immediately above them. Note that these are quite low resolution as the original segments are so small. I’ve also extracted segments from a more coarse segmentation and I’ll generate some collages from those this afternoon.
The following image is a crop from the segmented panorama previously posted. The computer is still extracting pixels from the original image using these segmented regions, where the current count exceeds 50,000.
I’m working on a public art commission for the City of Vancouver and I thought it would be a good opportunity to revisit the Self-Organized Landscapes series. For this project, the idea is to combine both a Self-Organized Landscape and a panoramic photograph such that a photo-realistic and readable image dissolves into abstraction as informed by a Self-Organized Map (SOM). The first step was to shoot a ‘straight’ photographic proxy image. The following is a pano constructed using hugin from ~45 exposures where the full resolution is ~30,000 pixels across.
This is the third diptych prototype. I’ve changed the camera mode from auto white-balance and exposure to manual. The SOM in this case is much stronger, I suppose due to the colour constancy between images. Following are some details:
This test is made up of 2401 images. The capture duration was about 2.5 hours, with a 500ms settle between frames. Luckily this worked out from centre, but later I realized that (a) spirals can only do square grids, and (b) that the 0 tilt is not the centre of the tilt range. The auto white-balance and exposure do not work very well in this situation. I could deal with the contrast effects to some degree, but the auto white-balance is problematic. Notice the edge effects around the cabinets. The next step is to use a light meter to do manual exposure and turn off auto white-balance.
One of the most interesting results is the out-of-focus images that happen in areas with little contrast. I’ve chosen a couple of details from the above images:
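The 2401 images correspond to a 49×49 grid. A centre-out square spiral over such a grid can be sketched as follows; this illustrates the visiting order only, it is not the capture code, and `spiral_order` is a hypothetical name.

```python
def spiral_order(n):
    """Cells (col, row) of an n x n grid, spiralling out from the centre.

    n is assumed to be odd so that the grid has a single centre cell.
    """
    x = y = n // 2
    cells = [(x, y)]
    dx, dy = 1, 0
    step = 1
    while len(cells) < n * n:
        for _ in range(2):            # two sides share each step length
            for _ in range(step):
                x += dx
                y += dy
                if 0 <= x < n and 0 <= y < n:
                    cells.append((x, y))
            dx, dy = -dy, dx          # quarter turn
        step += 1
    return cells


order = spiral_order(49)
```

Because each side of the spiral grows by the same amount in both directions, the covered area is always square, which is the limitation noted in (a).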
This is the first post in the new SOL production category. This is because the SOL project has moved beyond the Dreaming Machine projects and I’ve decided to look at it as a different artwork.
I’ve had the idea of making SOL diptychs for some time. The diptych will be two collages made up of the same images. One will be arranged by the location of the images in space, captured systematically. The second will be arranged by the SOM. Below is an early prototype. This one is 4:3 rather than 16:9, because that is how the registration was done.
Images arranged by their captured location:
Images arranged by SOM trained on RGB histogram:
It’s hard to say how successful it is; it’s only a 9×9 map, whereas SOLs are from 70×70 to 150×150 images. I’ll next capture a 50×50 map and see how long that takes, and how the resulting SOM looks.
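For readers unfamiliar with the feature used for the second collage, an RGB histogram reduces each image to a short vector of per-channel bin counts. The sketch below is an assumption about the general form only: the bin count, the normalisation, and the name `rgb_histogram` are hypothetical and not taken from the actual code.

```python
def rgb_histogram(pixels, bins=8):
    """Per-channel histogram of (r, g, b) pixels with values 0..255.

    Returns 3 * bins values (red bins, then green, then blue), each
    divided by the pixel count so images of any size are comparable.
    """
    hist = [0.0] * (3 * bins)
    count = 0
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            index = min(value * bins // 256, bins - 1)
            hist[channel * bins + index] += 1
        count += 1
    return [h / count for h in hist] if count else hist


# One black and one white pixel, two bins per channel.
feature = rgb_histogram([(0, 0, 0), (255, 255, 255)], bins=2)
```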