I need to spend a little more time with this, but I think the balance between the foreground objects’ presence and stability over time is quite good. I’ll do another sequence with a slight change in maximum learning rate to shift a little towards stability / foreground objects (i.e. 10% less reorganization in areas of movement). I’m also going to reset to the start of the day (9am) and see how that looks. It is too bad that, if I end up stuck with an 8-hour version of the work, sunrise and sunset would not be included.
There is a new GPU available that should work in my shuttle, an RTX 3060 Ti, but it will take three weeks to arrive and costs nearly $1k. I’m thinking it may be worthwhile if it means the work can have the scale (though likely not the resolution) of what I envisioned for this piece. The author of ANNetGPGPU kindly offered to do a performance test of my code using his RTX 3070 GPU, which would at least give me a sense of what kind of performance I might get if I upgrade.
The following video is 100 frames of the current work in progress. I was not happy with the motion-blur effect, so I went back to copying the moving objects into the frame and tweaked some other parameters. I think it’s going well, but I’m not happy with the hard edges of the moving objects, especially at 540p. The motion blur is not great either. Maybe the harder edges are ok for the larger objects, but less so for the spots in the trees. Part of the idea of including those moving objects was to anchor the abstraction in the current frame; now that I’m also increasing the effect of the reorganization process in the areas of movement, does that have a similar effect?
I did a quick run without inserting the foreground into the state of the network and I’m certainly much happier with the general aesthetic. There is even a sense of movement away from the camera along Broadway that is still readable. The following gallery shows the same sequence as above, except where foreground objects are not included.
I’m just not sure, from an audience perspective, that the movement of the scene would be readable without some subtle reference to foreground objects. The approach up to this point has been to seed the network state with the moving objects to affect the reorganization process. An alternative could be to present those moving objects visually without them affecting the reorganization process… but this would really miss out on some of the complexity of the disintegration of moving objects, for example in the following frames.
The obvious answer is just to take more time and allow the algorithm to obliterate the seed of the moving objects more, but that means taking more time when there is already a time crunch. Another idea is to use the current frame mask (that shows where the moving objects are) to augment the learning process. This may not do what I imagine though… After some reflection, I don’t think it would do anything more concrete. I’m right now doing a run to get the aesthetic I want, training time be damned! (see below.)
I’ve heard back from a few of my GPU inquiries, and it does not look like upgrading my GPU is possible. I’m now looking at borrowing a faster GPU or potentially buying a “gaming” PC from FreeGeek; the problem with the latter is that it’s certainly a lot more money for a whole new system (even an older one), and the GPUs are much more middle of the road, actually having fewer CUDA cores than my current GPU. It’s unclear to me how this would affect performance, but it strikes me that what I need to speed up training is more parallelism rather than raw clock speed.
The following video is 50 frames of the current work in progress. There is quite an improvement in stability over time, but it could still use a little more. I also find the foreground objects too prominent and have some ideas on how to decrease that dominance. The increase in stability over time is due to a smaller learning rate, which means subsequent frames affect the image less than the first frame, so the image is more stable. The effect is that the foreground objects are also more stable, so there is a bit of a motion-blur effect where previous frames may stay visible because subsequent frames don’t modify them much. I’m thinking I should use the neighbourhood map (that determines the degree of reorganization) in inverse to determine the learning rate of each neuron (it looks like the SOM library I’m using may support that). That way the pixels will change the most in the areas of most movement, which should both give me more control of the reorganization of foreground objects and maintain stability over time in areas where there is little movement.
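The inverse mapping could be sketched like this in numpy (a sketch only; the normalized array layout and the base_rate and floor values are my assumptions, not the SOM library’s API):

```python
import numpy as np

def learning_rate_map(neigh_map, base_rate=0.1, floor=0.01):
    """Per-neuron learning rates from the inverse of the neighbourhood map.

    neigh_map is assumed to be a 2D array normalized to [0, 1] where low
    values mark the areas of most movement (as in the cubic version).
    Inverting it gives high learning rates in moving areas and a small
    floor rate in static areas, so static areas stay stable over time.
    """
    m = np.clip(np.asarray(neigh_map, dtype=float), 0.0, 1.0)
    return floor + (base_rate - floor) * (1.0 - m)
```

With this, a pixel in a fully static area only ever learns at the floor rate, which is what should keep the background from flickering frame to frame.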
The results are quite comparable at 720p, see gallery below, but the process still takes about 40s per frame. Even if I change the length of the work to fit the normal hours of the MPCAS (9am to 10:30pm), that’s still 486,000 frames, and at 40s per frame, 225 days to process. I just did another sketch at 540p and that’s down to 22s per frame, which would mean the 9am to 10:30pm version would only be done a month late. So it seems I need to consider upgrading the GPU, and getting that to fit in this old machine may be troublesome.
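A quick arithmetic check of that budget (the per-frame timings are the ones measured above):

```python
# Processing-time budget for the 9am-10:30pm version: 13.5 hours at 10 fps.
FPS = 10
HOURS = 13.5
frames = int(HOURS * 3600 * FPS)  # 486,000 frames

def days_to_process(seconds_per_frame, frames=frames):
    """Total wall-clock days to render the whole sequence."""
    return frames * seconds_per_frame / 86400  # seconds per day

days_720p = days_to_process(40)  # 225.0 days at 40 s/frame
days_540p = days_to_process(22)  # ~124 days at 22 s/frame
```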
I’ve requested a quote from my shuttle dealer and we’ll see where that goes. I suspect it may be twice as fast, but that still means going with no larger than the 720p version. I suppose I should refine parameters for both 720p and 540p versions. If I can upgrade the GPU, I may be able to do the 720p version for 24 hours. If I can’t upgrade, I’ll end up doing the 540p version and it’ll be as long as it ends up being with the time available. The following galleries are the 720p and 540p versions with no refinement.
After running a few explorations with different parameters, I’m contending with a significant issue: processing time. If I start next week I can spend about 7.5s per frame to get the piece done, but so far I can’t get processing below 90s per frame. I’ve confirmed it is actually the SOM training (reorganization) that is the slow part. Also, the results I have so far are really quite strong, and while I have to tune things to get some stability over time, I’m quite happy with these results as they are. I don’t have a lot of options to decrease processing time; I could:
lower the resolution; I’ve been working at full 1080p resolution up to this point, and dropping to 720p would mean about 56% fewer pixels to process (1280×720 vs 1920×1080). I’ll try this first.
get a faster GPU; the GPU is a GTX 780, so not at all new.
shorten the video; this is not ideal since I’m hoping Grunt will be able to get special permission to have the screen on for 24 hours for this special event; even sticking to the normal hours of the screen would only get me up to 13s per frame.
lower the frame-rate; since the emphasis of this project is the flow of movement with the reorganization of the machine learning, this would be the least favourable option.
The following gallery shows a few explorations to date. Each column is a different exploration and the rows are the sequence over time. The leftmost exploration feels the strongest to me and needs some fine-tuning for stability over time.
I got into the MPCAS control room today to check on the laptop. I saw a stream of DTS STS errors, which was a bad sign. I confirmed that the files were still being written to disk, so the process is still active, but unfortunately the camera had moved. I presume there was some kind of power interruption that both caused the camera to reset to its default position (see first post) and confused the ongoing ffmpeg network connection. I moved the camera back (luckily the camera remembered its pan/tilt/zoom position) and checked that new images were being captured, and they were. I did not get a close look at the scope of the issue (I’ll get a sense when I get the machine back in two months), but it’s likely I’ll have lots of missing data (days? weeks?) and missing variation. I hope it works well for the next two months at least, to capture the fall and winter changes! This is not so much of a problem for most of what I have in mind for those images, but it is too bad I won’t be able to do some explorations now.
I also wanted to make a gallery to show the refinement / bug fixing using the sunrise sketches. The left column was the first attempt and the right column the most recent work. The middle ones strike me as the weakest, so I think things are going in the right direction.
In debugging the sequence work, I noticed yet another bug in the code. I was actually cubing and squaring the max degree of reorganization in the loop that sets the max sigma for each neuron. Since I was already squaring and scaling the matrix of values before that function, the degree of reorganization was becoming very, very large. Now my max neighbourhood numbers actually make sense! After some explorations I ended up with a max neighbourhood size of 1000 and 5 iterations with a cubic increase of reorganization. The following gallery shows my explorations with the most successful at the bottom. Compare these to those posted here. I’m now re-doing the frames spanning sunrise.
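The corrected scaling, applying the power exactly once, might look like this (a numpy sketch; the normalized movement array and the function name are my assumptions, not the actual code):

```python
import numpy as np

def max_sigma_per_neuron(movement, max_sigma=1000.0, power=3):
    """Per-neuron maximum neighbourhood size with a cubic increase.

    The bug was effectively raising the values to a much higher power by
    squaring before cubing; here the power is applied exactly once.
    movement is assumed normalized to [0, 1]; max_sigma=1000 matches the
    value settled on above, power=3 the cubic increase.
    """
    m = np.clip(np.asarray(movement, dtype=float), 0.0, 1.0)
    return max_sigma * m ** power
```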
Indeed, increasing the number of training iterations does increase the abstraction (reorganization) of the elements copied from the current frame. Note the truck on the left of the frame in the following sequence. I also tried playing these back at 10fps, and indeed the changes in the background are too different from frame to frame. It looks more like a frame-by-frame animation, compared to the smoothness I’m aiming for. For the next stage I’ll try some methods of making the effect of subsequent training more subtle, and see if I can speed up the processing rate as well.
I ended up doing something a little simpler than I had in mind to start; I’m using only the currently moving objects to ground the image in the currently processed frame, in order to emphasize flow and movement. I.e., referencing a recent post, I’m multiplying the current frame with the foreground mask and replacing pixels in the SOM using that image data. For example, see the following images.
The pixels that are not black in the right image are copied directly into the SOM state. This way we retain some of the previous training (from the seed image) but also ground the image in the current frame that is augmented by subsequent training (which only uses pixels from the current frame). The following sequence shows the current results.
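The copy step described above can be sketched as (assuming HxWx3 arrays and a boolean mask; the names are mine):

```python
import numpy as np

def seed_with_foreground(som_state, frame, fg_mask):
    """Copy moving-object pixels from the current frame into the SOM state.

    som_state and frame are HxWx3 arrays; fg_mask is an HxW boolean array
    that is True where the masked frame is not black. Background pixels
    keep the previously trained state; foreground pixels are replaced
    with current-frame data.
    """
    out = som_state.copy()
    out[fg_mask] = frame[fg_mask]
    return out
```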
The current frame certainly does ground the image: the delivery truck retains structure. The overall stability of the frame is a little low though, so this may look too flickery in video. Also, the truck may be too literal; this is because there are parts of the truck that are not changed by the training (not all pixels are changed for every frame). I also noticed that even with quite few training samples (from 250 down to 10) there is only a marginal difference in training time. Looking at the way the image changes frame to frame, I think this is due to the size of the neighbourhood (the max degree of reorganization). I’m going to try two things next: first, I’m running the same 10 frames again with more training iterations to see how that affects the degree to which the truck is abstracted and the difference in processing time. Second, another idea is to create a version of the neighbourhood map that has a much smaller max degree of reorganization. I can compute that once and use it for subsequent frames. Of course I’m assuming that will make processing faster, which may not be the case.
Since the code meant to make the degree of reorganization increase exponentially was wrong, I reran the sampling of sunrise frames using the current parameters (now a learning rate of 1, 3 epochs and a max neighbourhood size of 15). These results are an improvement, and in this case only 3000 iterations of training are required.
Following from the previous post, I ran a broader set of explorations where the exponential increase of reorganization (in this case squared) is actually manifest. There still needs to be some fine-tuning, but these are going in the right direction. The images below are arranged from the most reorganization to the least, all using the same squared increase of reorganization. The image on the bottom is the most successful.
Following is the best previous result, without squared reorganization, for reference; clearly there is a significant decrease in abstraction in the areas where there is the most movement.
The squared version may be a little too literal (too little reorganization in the areas of the most movement) so I think I’m going to tweak things a little more. I also think I may want to include an offset so that the least amount of reorganization is not none, but a small amount of reorganization.
Looking at these images I see what I had in mind in terms of the aesthetic and flow, and why the sequence results are disappointing: I’m imagining the high-movement areas of the image being reorganized in the context of the whole image; in other words, I’m imagining what the image would look like if I trained a SOM for every frame. The images above take about 45 mins to calculate, so clearly that is not practical. I’m adding a few explorations with less training, but it’s very unlikely I’d be able to get processing fast enough to get the piece done by the winter solstice; if I can get processing down to 6s per frame, then I’d be generating for two months, which would leave me with one month of research and development before final production.
I’m thinking through how to create the impression of the whole image being retrained without actually having to do it. Right now the sequence code reorganizes the first frame with a lot of training and then the subsequent frames with only a small amount of training. In the most recent sequence post, the cubic increase of reorganization means that there is little reorganization in the areas of the most movement; since the first frame is used as a seed and then refined, those pixels appear frozen. Subsequent frames do change the image, but in a highly abstract way that does not result in any sense of the flow of time. I have some ideas for how to change this.
Since the sense of the flow of time only happens in the areas of the image where there is lots of movement, perhaps the current frame could change the state of the SOM only in those areas. I could multiply the current frame by the neighbourhood map and add that to the current state of the SOM. Even better may be to pixelwise AND the current frame mask with the neighbourhood map, multiply that by the current frame, and add it to the current state of the SOM.
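The second variant could be sketched like so (the threshold, and a straight copy rather than a literal addition that would blow out pixel values, are my assumptions):

```python
import numpy as np

def inject_movement(som_state, frame, fg_mask, neigh_map, threshold=0.5):
    """Bring current-frame pixels into the SOM state only where the
    frame's foreground mask AND the neighbourhood map agree there is
    movement (the neighbourhood map is thresholded to make it boolean).
    """
    both = fg_mask & (np.asarray(neigh_map) > threshold)  # pixelwise AND
    out = som_state.copy()
    out[both] = frame[both]
    return out
```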
The following images show a set of refinements tweaking params and decreasing the number of training iterations. The bottom image is feeling pretty close and includes an offset to the neighbourhood map so that the minimal amount of reorganization is a small amount (rather than none).
In working on having the current object mask (pixels that are moving in the current frame) change the neighbourhood map (the image that specifies the degree of reorganization), I realized some things were not working right; after a debug test I realized that I was not actually using an exponential increase of the degree of reorganization. This was due to a math error on my part, and it explains why the effect of the neighbourhood map was lower than expected. The following images are the sequence using the same parameters, except that the cubic increase of neighbourhood size is actually manifest. As you can see, there is no reorganization happening in much of the roadway.
I’m now running a few of the previous (non-sequential) explorations again using the changed code to re-evaluate what the parameters should be.
The images above show 5 sequential frames from a quick test. The first frame is trained with more iterations and then subsequent frames are trained much less (250 training samples) using only the pixels that had changed since the previous frame. I was hoping this would make the moving objects more concrete and less abstract, but that is not the case at all. The colours that gain dominance are those emphasized in the new frames, but because the network is already trained, they barely affect the locations where those pixels were seen. So the question is what to do about this; the ML algorithm determines the composition, so the placement is always going to be emergent. One thing to try would be to combine the neighbourhood map (the image that specifies the degree of reorganization) with the mask that highlights the moving objects, so that those areas are less organized than other areas. The following image is a mock-up showing the previous neighbourhood map (including a sketch of the exponential increase of neighbourhood size) with black areas showing the moving objects in the current frame. One issue with this approach is that changing the degree of reorganization for every frame is potentially very slow. In this test, frames took 100s to generate; far too slow to be practical.
Following from the previous post, I’ve calculated sketch images using that same method (and params) trained on different images from the 24h set captured through sunrise. The top images are earlier in time (night) and the bottom images are later in time (daylight). In these images you get a sense of how divergent the output can be for similar frames (these images do show emergent properties). I’m going to work on tackling the issue of how these could look over time next.
This is the new enclosure design with the changes from the assembly of the first ZF. I’ve sent this off to the fabricator and hopefully will be able to assemble the second ZF soon. The button-board electronics are giving me some issues, as I’ve been unable to get the two board revisions to behave the same, even when using exactly the same Jetson image! The NVidia forum has not been any help either. I’m hoping changing to pull-down buttons for the GPIO interface (from pull-up) will solve the issue of only being able to use a single pin on one of the two boards. I don’t think this is actually fixable, so unfortunately the two ZFs will need different software to reference the pins. This is not ideal. Who knows, maybe using pull-down buttons will let me use the same pins on both boards (accidentally).
What is interesting about it is the interplay between the different degrees of reorganization that gives the colour regions ephemeral smoke-like shapes. Since then I had in mind different ways of manifesting degrees of reorganization (a continuity between sensation and imagination, realism and abstraction, etc.); at the start of the MPCAS project I considered using a depth camera to provide the degree of reorganization where closer objects would be less re-organized than distant objects (a direct manifestation of what the horizon was a proxy for). Since the view from the MPCAS camera does not have a clear place for a horizon it was unclear what should determine the differences in reorganization. Using the areas of movement is interesting because it’s consistent with the emphasis on flow I was interested in with a moving image. So I wrote some code to use the following image to determine the degree of reorganization and have been making a lot of explorations.
The roadways and trees have the most movement with some movement on the sidewalk as well. The street signs, utility poles, mountains and architecture are static. In my first explorations, below, I could not see much effect of the movement image. The degree of reorganization seemed quite consistent. There are some hints of effect though; note the details around the pink area amongst the trees on the right. This is an area where there is lots of movement (on Broadway) visible through the trees.
I thought perhaps the subtlety was due to the parameters and tried a few more variations, as follows. While the trace of the movement image is still visible in small details, the images are too evenly organized overall.
At this point I was starting to think there was a scaling issue and that the movement information was not being interpreted correctly. So I did a test using one of the foreground masks such that the foreground objects in the frame are not reorganized, but background objects are. Sure enough, the code is working properly, see image below. Some background objects are not reorganized because it was a quick test with few training iterations.
Since I was using code from “As our gaze peers off into the distance, imagination takes over reality…” I realized that I had used an exponential function where the degree of reorganization accelerates from the horizon to the top. So I went through another set of variations where the movement information is squared or cubed, see following images. The effect of the movement information is much clearer now and you can see traces of the architecture, street signs etc. in the degree of reorganization.
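That accentuation step amounts to normalizing the movement image and raising it to a power (a sketch; the min-max normalization is an assumption):

```python
import numpy as np

def accentuate(movement_img, power=2):
    """Normalize a movement image to [0, 1] and raise it to a power
    (squared or cubed, as described above) so that only the strongest
    movement areas keep a high degree of reorganization; mid-level
    movement is pushed down towards zero.
    """
    m = np.asarray(movement_img, dtype=float)
    lo, hi = m.min(), m.max()
    if hi > lo:
        m = (m - lo) / (hi - lo)
    return m ** power
```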
I’m feeling like these are on the right track, but I was still imagining the road areas would be much less organized. The top right is actually under-trained (8000 iterations), so the detail in the road area is due to those pixels not being changed, not the effect of the movement areas. It seems a relatively small number of iterations (50,000) works quite well, along with a quite large maximum degree of reorganization. The top left is way too reorganized, with both too many training iterations (300,000) and too much reorganization. The bottom two images are the most successful; I also did a test with fewer iterations and a greater maximum degree of reorganization, see the following image. It looks like it has slightly too few training iterations, but is the strongest result so far.
I just started a test running these same params for a set of 35 images spanning from night time to early morning, to get a sense of how they look in differing lighting conditions. After that I’ll see about changing the params slightly and then work out how to handle change over time. Each of these images takes about 45 mins to calculate, so the whole 24h sequence at 10fps would take about 70 years! The idea was to optimize since most of the frame is static over time; I’d train a base image that may take 45 mins, but then only slowly refine that image frame by frame with an emphasis on the moving objects. The details of this process will have a huge effect on how the image evolves over time and also on the aesthetic of each frame.
After writing some other code for the MPCAS (using the areas of the most movement to determine the degree of pixel reorganization) I realized I was using a version of the ML code (Self-Organizing Map) that has harder edges in the neighbourhood function. I went back to the code I used for the painting appropriation and recalculated the night + day average-colour-over-time gradients from the last post. These should be smoother now. I’ve included the previous gradients (left) and the new ones (right). I also just realized that I had been using code that time-seeds its random values, so different runs with the same params and data would differ due to the order in which training samples are presented. For consistency, I’d rather these results be deterministic.
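Making the runs deterministic is just a matter of replacing the time seed with a fixed one (a sketch with numpy’s Generator; SEED is an arbitrary value, and the real code draws samples inside the SOM library rather than through a helper like this):

```python
import numpy as np

SEED = 42  # arbitrary fixed value, instead of seeding from the wall clock

def sample_order(n_samples, n_draws, seed=SEED):
    """Return the (repeatable) order in which training samples are drawn,
    so the same params and data always present samples the same way."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, n_samples, size=n_draws)
```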
Following from the last post, I’ve recalculated the gradients for each region using both day and night frames to increase their contrast and emphasize the day⇄night transition:
Wondering why the arch and roadway / sidewalk are flipped even though they have very similar colours? The algorithm creates emergent structures that result from complex interactions between the training data and the algorithm; even small changes in the parameters (such as the initial conditions or number of training iterations) can make significant changes to the final image.
Using the changes in average colour over time I thought I’d explore using ML to reorganize the stripes according to colour similarity. I started with the average for all regions and ended up with the following result where the original average colours are on top and the reorganized version is at the bottom.
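The reorganization can be sketched with a tiny from-scratch 1-D SOM (this is not the library used for the actual work, and the learning-rate and neighbourhood schedules here are assumptions):

```python
import numpy as np

def som_1d_sort(colors, n_iter=2000, seed=0):
    """Reorganize a row of average colours by similarity with a 1-D
    Self-Organizing Map. colors: Nx3 array; returns the trained node
    weights laid out along the map, so similar colours end up adjacent.
    """
    colors = np.asarray(colors, dtype=float)
    n = len(colors)
    rng = np.random.default_rng(seed)
    weights = colors[rng.integers(0, n, size=n)].copy()  # init from data
    positions = np.arange(n, dtype=float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = 0.5 * (1.0 - frac)                   # decaying learning rate
        sigma = max(1.0, (n / 2.0) * (1.0 - frac))  # shrinking neighbourhood
        x = colors[rng.integers(0, n)]            # random training sample
        bmu = int(np.argmin(((weights - x) ** 2).sum(axis=1)))
        h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights
```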
Using these parameters as a starting point, I repeated the process using the region averages separated by night and day. The left column shows the originals and the right column the reorganized versions where the top half are day and the bottom half are night.
These parameters don’t work well in all cases; for example, the warmer peachy tones to the left of the plants (day) image, second from the top in the left column, are lost in the corresponding reorganized version. This is due to the infrequency of those tones and the general lack of colour variation in the source image. A fix for this would be to use fewer training iterations. I think these results are certainly interesting and could use more tweaking for each region and time-window rather than using the same parameters for all the images. The gradients are quite strong and the whole dusk and dawn transition is the most interesting thing about them.
A live version of this would be quite lovely; perhaps the window of time could be one day or one week, and over time the generated gradient would change with the distribution of colours and light. This would be most pronounced in the change of light and dark in sequences where day and night are both included. I think I’ll do a quick exploration of that next. Even without being a live version, I should calculate such a gradient changing over time for a long-term time-lapse and create a video; the result would show the changes of light colour and the length of days. A live version would not be technically very complex; it would be computationally simple and low-power, since the ML would happen over a long time period and the calculation of an image’s average colour is very efficient.
Following from the previous post, I’ve calculated the average colours over time and for the whole sequence for day and night frames. This yields some particularly nice results, in particular the average over time for the sky at night (upper left in the gallery below). Rather than detecting a colour shift over time, I used the street lights in the frame as the reference: the first frame where the first street light turns on is considered the first frame of night. Since the gross colour change towards the warm sodium lights is what I wanted to separate, this approach makes sense. A particular street light turns on early and turns off late, and since I’m working with a fixed frame I should be able to create an automatic detector to read this local change in a particular location. The following images show the same four regions (Sky, Plant-Life, Architecture and Sidewalk / Roadway, listed from top to bottom) with night (left column) and day (right column) versions of each.
The sodium colours really only dominate once the sun has set, so the night images still show significant dusk / dawn colours that may belong better in the day sets. Perhaps it would be better to detect the average brightness (or blueness?) of the sky region as the night / day detector. Weather manifests interestingly in the night sky, where clouds reflect the yellow city light; note how much darker the first and last nights are compared to the middle nights with rain and cloud. On my list of visual explorations is using machine learning to reorder these averages over time according to similarity. I believe I have some code that does this somewhere, so it could be a quick implementation. The following gallery shows the corresponding average colours of the entire sequence separated by night and day. Note how much more significantly those warm sodium lights affect the built environment (Architecture and Roadway / Sidewalk). I think the next exploration will be a reorganization of these sequences of average colour.
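The brightness-based detector could be as simple as this (a sketch; the threshold is an assumed value that would need tuning against the street-light reference frames):

```python
import numpy as np

def is_night(frame, sky_mask, threshold=60.0):
    """Classify a frame as night from the average brightness of the sky
    region. frame is an HxWx3 uint8 image; sky_mask an HxW boolean mask
    for the hand-drawn sky region. Could also weight the blue channel.
    """
    sky = frame[sky_mask].astype(float)
    return sky.mean() < threshold
```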
I spent some time manually drawing boundaries around four main regions of the MPCAS frame: Sky, plant-life, architecture, and sidewalk / roadway. I took out my old Wacom tablet (which I have probably not used in over 10 years) to do the job and it was quite laborious! The extracted regions with a sample frame (left column) and their corresponding masks (right column) follow. The labour involved was worthwhile as the frame is fixed and I could use these masks to filter things like foreground objects and make different collages for each region.
Averages for Each Region Over Time
Averages of Whole Sequence
These results are promising! Interesting how similar the architecture and sidewalk / roadway average colours are and how much they differ from sky and plant life averages. The night frames with their very warm street lights have a much greater effect on the architecture and roadway sidewalk, which makes sense since those are the spaces lit by those lights. It would still be interesting to see these split up by night and day frames… I think I’ll try that next!
The next exploration in my list is the average colour of a whole sequence of frames. This follows a bit from this exploration, except the input images are reduced to a single average colour. Again, the X axis is time, but the Y axis is uniform:
We can see how there are 7 days and 7 nights in this set. Days end up being very grey and nights have an orange tinge due to the street lights. Compare the first 1920 frames (top) with the previous exploration (bottom):
These colours are quite muddy due to both the grey weather at the time of the capture and the dominance of the concrete in the intersection. Again, it seems like these results would be more interesting if I were to mask the image by regions: trees, road, sky, etc. The averages of each masked area would be more interesting, and I’m curious about the change over time. The last exploration is the simple average of the entire sequence, which is unsurprisingly a muddy colour. Another thing I should try is separating the night and day frames (and twilight?) and creating average colours of those, which I expect would be quite different.
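The average-colour stripes above can be sketched as (frames assumed to be HxWx3 arrays; names are mine):

```python
import numpy as np

def average_colour_stripes(frames, height=64):
    """Reduce each frame to a single average colour and lay the averages
    out left to right (X axis = time, Y axis uniform).
    Returns a height x N x 3 uint8 stripe image.
    """
    avgs = np.array([np.asarray(f).reshape(-1, 3).mean(axis=0) for f in frames])
    return np.tile(avgs[None, :, :], (height, 1, 1)).astype(np.uint8)
```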
Following from the previous post, I tweaked the code a little bit and used the 7 Day time-lapse image set since that would be closer to what I had in mind for the collage. This image set has only about 10,000 frames, which resulted in 91,000 fragments (compared to 800,000 for the previous collage). The colour is significantly different because the frames represent one week’s worth of diversity, rather than being dominated by the restrained palette caused by construction in the collage previously posted. The extraction of foreground objects is messier, as the background model takes longer to train when there is so much change between subsequent frames. I leaned into this and also tweaked the code so that the edges are not smoothed out as in the segments previously generated, see image below.
I do like the edge quality, but there is still some fine texture that seems to be missing; I think this is due to the filtering of segments by area. I chose a somewhat arbitrary threshold to keep the number of segments per frame reasonable (5–50 per frame). Another trick is to sort the segments by area so that the larger ones are drawn behind the smaller ones; this leads to a much more complex collage (e.g. this work in process) where the smaller and thinner pieces tend to emphasize flow and segment orientation. All in all this is good progress for collages and I think I’ll leave this exploration aside for now.
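The area filter and back-to-front ordering can be sketched as (the dict-with-an-'area'-key segment representation is an assumption):

```python
def select_and_order(segments, min_area):
    """Filter segments by an area threshold (the somewhat arbitrary cut
    described above) and sort the keepers largest-first, so bigger pieces
    are drawn first and end up behind the smaller, thinner ones.
    """
    kept = [s for s in segments if s["area"] >= min_area]
    return sorted(kept, key=lambda s: s["area"], reverse=True)
```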
I adjusted the ‘features’ (the numbers that define how the machine ‘sees’ each foreground object) so that only the average colour and orientation are used to determine similarity. This has the potential to speed up the process for larger image-sets. I think the overall structure is quite complex considering that simpler features (especially average colour) can lead to collages that become closer to gradients, where there is too much organization. As part of this process I also revisited some code I wrote to visualize what a collage would look like without having to generate it. While rendering the 90,000 segments (on a fast SSD) is not too slow in this case, it was certainly slow for 800,000 segments, and I’m expecting many more for the long-term time-lapse collage. The image below is the fast visualization, where the colour and orientation of each segment is drawn using rectangles with one black edge. As you can see it’s not very organized, and it used far fewer training iterations than the previous collage.
Using the same features (hue histogram and orientation of objects) as I used in “Imagined Field from the Decomposition of an Apparatus”, I trained a Self-Organizing Map (an algorithm that arranges patterns by similarity) resulting in the above collage and following details:
The orientation is a lot less interesting because most vehicles are not very tall compared to their length, and the orientation of the roads in the frame has a real stabilizing effect; many vehicles have the same orientation, e.g. in the lower-left detail above. Even the shadows (due to headlights?) seem to have a similar effect, where areas of asphalt show similar directionality; see the middle-right detail above.
The colours are quite a surprise due to the construction that was happening early in the morning on the day of the 24h capture. Since the foreground extraction process stopped short, due to the disk filling up, the dark period early in the morning has a lot of visual emphasis, resulting in the warm and cool colour palette. The movement of workers in their high-visibility clothing is quite dominant in some areas, e.g. the middle-left detail above.
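For context, the core of a Self-Organizing Map update (find the best-matching unit, then pull it and its neighbours toward the sample) can be sketched as follows (a toy NumPy version; the actual training runs on the GPU via ANNetGPGPU, and the grid size, feature length, and parameter values here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny SOM: a 2D grid of weight vectors, each the length of a segment's
# feature vector (stand-in for a hue histogram plus orientation).
grid_h, grid_w, n_feat = 8, 8, 13
weights = rng.random((grid_h, grid_w, n_feat))
coords = np.dstack(np.meshgrid(np.arange(grid_w), np.arange(grid_h)))

def train_step(weights, feat, lr=0.5, sigma=2.0):
    """Move the best-matching unit and its Gaussian neighbourhood toward feat."""
    dist = np.linalg.norm(weights - feat, axis=-1)
    by, bx = np.unravel_index(dist.argmin(), dist.shape)
    grid_dist = np.linalg.norm(coords - np.array([bx, by]), axis=-1)
    influence = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))
    return weights + lr * influence[..., None] * (feat - weights)

feat = rng.random(n_feat)
before = np.linalg.norm(weights - feat, axis=-1).min()
weights = train_step(weights, feat)
after = np.linalg.norm(weights - feat, axis=-1).min()  # the map moved closer
```

Repeating this over thousands of samples, while shrinking the learning rate and neighbourhood size, is what arranges similar segments (similar colours, similar orientations) into neighbouring regions of the collage.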
The morphology operations really change the edge quality, which looks too soft and rounded to me (see image below). I’ve added a note to myself about the edge quality and to try leaving the noisier edges seen in this post. I think I would like to revisit this collage method using the 7 day time-lapse images; I was intending the collage more for the time-lapse imagery anyhow. With the 24h images we see how similar subsequent frames can be, where the same object shows up in multiple locations without changing much, e.g. in the following image. This would be reduced in a time-lapse, where subsequent frames are more likely to have very different foreground objects.
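The softening effect is inherent to morphological opening: erosion strips off single-pixel detail before dilation grows the blob back. A minimal sketch (a NumPy stand-in for `cv2.morphologyEx` with a 3×3 kernel; the mask data is a toy example):

```python
import numpy as np

def shift_stack(mask):
    """All nine 3x3-neighbourhood shifts of a binary mask, stacked."""
    h, w = mask.shape
    p = np.pad(mask, 1)
    return np.stack([p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)])

def dilate(mask):  # a pixel is set if ANY neighbour is set
    return shift_stack(mask).any(axis=0)

def erode(mask):   # a pixel survives only if ALL neighbours are set
    return shift_stack(mask).all(axis=0)

def opening(mask):
    """Erosion then dilation: removes speckle noise but also rounds off
    fine edge texture, which is the softening described above."""
    return dilate(erode(mask))

# A square plus an isolated pixel of speckle noise.
mask = np.zeros((16, 16), bool)
mask[4:12, 4:12] = True
mask[0, 0] = True

opened = opening(mask)  # speckle gone, square preserved
```

Any edge feature thinner than the kernel is removed the same way the speckle is, which is why skipping (or weakening) these operations preserves the noisier, more textured boundaries.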
In some ways the most interesting parts are those that are the most constrained, e.g. the workers in the middle-left and the cars in the lower-left. Since a fixed frame is used, one way to constrain the collages is to filter according to the position of the segments. I could create masks that would almost certainly filter only cars with a particular orientation (Broadway beyond the intersection and crosswalk), trees (the area in front of Kingsgate Mall above the sidewalk), or clouds (the sky above the mountains).
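Position-based filtering could be as simple as testing each segment’s centroid against a hand-drawn region mask (a hypothetical sketch; the regions here are arbitrary horizontal bands, not the actual Broadway or Kingsgate masks):

```python
import numpy as np

h, w = 120, 160
# Stand-in region masks for a fixed camera frame: sky in the top band,
# a road band lower down. Real regions would be drawn by hand.
sky = np.zeros((h, w), bool);  sky[:40, :] = True
road = np.zeros((h, w), bool); road[80:, :] = True

def region_of(segment_mask, regions):
    """Assign a segment to the named region containing its centroid."""
    ys, xs = np.nonzero(segment_mask)
    cy, cx = int(ys.mean()), int(xs.mean())
    for name, region in regions.items():
        if region[cy, cx]:
            return name
    return None

seg = np.zeros((h, w), bool)
seg[90:100, 20:40] = True  # a car-sized blob low in the frame

label = region_of(seg, {"sky": sky, "road": road})  # lands in the road band
```

Because the camera never moves, a single set of masks drawn once would classify every segment across the whole capture.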
To give you a sense of scale, this collage contains over 800,000 image fragments and took 13 hours to calculate. These results are promising enough that I’d like to continue following this thread with the 7 day time-lapse test footage before moving on to other explorations. I’ll need to tweak the foreground segmentation, since that does not work very well with so much change between frames, and I’m thinking about changing the features to a simpler orientation and average colour (rather than a histogram) for each segment. This should really increase speed and scale better to the long-term time-lapse.
Following from my previous post I’ve written code that extracts foreground objects from the background using the masks previously calculated during foreground segmentation. Since I’m using the tenFPS set, subsequent frames are quite similar to each other. Following is a small subset of extracted foreground objects corresponding to only about 2s of time. The redundancy is due to what we would recognize as the same object appearing in multiple frames. This gives a sense of the edge quality, which is tweaked a little with slightly different filtering operations to clean up edge boundaries and remove noise. In previous works from the Watching and Dreaming series I used a clustering algorithm to group objects deemed similar, but that had a strong ephemeral effect! It may be worth exploring here, though, since I’m working only with foreground objects (not all objects in the frame, as in that previous work) and there is a lot of clear redundancy.
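The extraction step itself amounts to attaching the segmentation mask to the frame as an alpha channel (a minimal sketch with toy data; the real pipeline also filters the mask first):

```python
import numpy as np

def extract_object(frame, mask):
    """Cut a foreground object out of the frame as an RGBA image,
    using the binary segmentation mask as the alpha channel."""
    alpha = mask.astype(np.uint8) * 255
    return np.dstack([frame, alpha])

frame = np.full((10, 10, 3), 200, np.uint8)  # stand-in for a video frame
mask = np.zeros((10, 10), bool)
mask[2:6, 2:6] = True                        # stand-in segmentation mask

rgba = extract_object(frame, mask)  # opaque inside the object, transparent outside
```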
I also did a version where I took all of the segments extracted from about an hour’s worth of time and layered them on top of each other:
This represents only the objects that moved or changed over time during this period. There are some issues with the edge quality here; the isolated segments (e.g. those floating away from other segments) seem to have a dark boundary. The trails of pedestrians and moving cars are quite interesting. I also made a version where I used the averaged background (without moving objects) as the background for the image above:
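Layering the extracted segments over the averaged background is essentially compositing in time order (a sketch with toy frames and masks; the dark boundaries noted above would correspond to mask edges that include partially background-coloured pixels):

```python
import numpy as np

def composite_over(background, layers):
    """Paint each (frame, mask) pair over the background in time order,
    so later moving objects occlude earlier ones."""
    out = background.copy()
    for frame, mask in layers:
        out[mask] = frame[mask]
    return out

h, w = 12, 12
background = np.zeros((h, w, 3), np.uint8)  # stand-in for the averaged background

# Two hypothetical extracted objects from different frames, overlapping slightly.
f1 = np.full((h, w, 3), 50, np.uint8);  m1 = np.zeros((h, w), bool); m1[2:6, 2:6] = True
f2 = np.full((h, w, 3), 250, np.uint8); m2 = np.zeros((h, w), bool); m2[4:8, 4:8] = True

result = composite_over(background, [(f1, m1), (f2, m2)])
```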
This follows from my work using a much less interesting visual data-set from my 2012 residency as part of the New Forms Festival where I showed work in progress on my Dreaming Machine.
In writing my proposal for a Grunt Gallery show, I ended up making a few more explorations. I was curious how the colour in the frame would be manifest so I took the code from the MPCIP project and used that on one of the frames from the 24h set. The results are quite nice and really do show off the intensity of colour during the summer days. I ended up doing 8000 training iterations with a neighbourhood size of 250 (top) and 2000 iterations with a neighbourhood size of 1500 (bottom). At some point I’ll adapt this code so that the size of the neighbourhood is different for each pixel and use the previous average of foreground extraction to determine the relative size. I’m quite excited to see those results!
I also did a quick test using the foreground segmentation masks as an alpha channel to pull out (create) moving objects from the corresponding frame. It’s quite messy, as I did not do any filtering for noise, nor did I use the findContours code to create cleaner object bounds. I’m thinking about my emphasis on boundary-making and whether I should investigate using these raw masks, without findContours, for ‘object’ creation. The aesthetic would certainly be more chaotic and textural… There is also the problem of determining how to group white pixel blobs together, which is what findContours does.
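The blob-grouping that findContours performs implicitly is connected-component labelling; a minimal flood-fill version looks like this (a toy NumPy sketch using 4-connectivity; OpenCV’s `cv2.connectedComponents` does the same thing directly):

```python
import numpy as np

def connected_components(mask):
    """Label 4-connected blobs of True pixels: the grouping step that
    findContours performs implicitly while tracing object boundaries."""
    labels = np.zeros(mask.shape, int)
    current = 0
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx]:
            continue  # pixel already belongs to a blob
        current += 1
        stack = [(sy, sx)]
        while stack:  # flood fill outward from the seed pixel
            y, x = stack.pop()
            if (0 <= y < mask.shape[0] and 0 <= x < mask.shape[1]
                    and mask[y, x] and not labels[y, x]):
                labels[y, x] = current
                stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return labels, current

# Two separate white blobs in a noisy mask.
mask = np.zeros((10, 10), bool)
mask[1:4, 1:4] = True   # blob 1
mask[6:9, 6:9] = True   # blob 2

labels, n = connected_components(mask)  # two distinct labels
```

Using the raw masks without any grouping sidesteps this entirely: every white pixel is treated as texture rather than as part of a bounded ‘object’, which is exactly the boundary-making question above.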