Reduction in the number of sensors

Posted: March 12, 2009 at 12:58 am

I ran a few tests in which I reduced the number of sensors in the SOM by systematically sampling the histograms. The results show that reducing the sensor count makes no significant difference to the number of iterations required to create a good map. A visual inspection of the memory fields shows almost no difference between the maps resulting from 128 and 768 sensors. The following plot shows the number of iterations (x) plotted against the number of associated memory locations (y):
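Systematic sampling of a histogram, as described above, might look like the following minimal sketch; the function name and fixed-stride approach are my assumptions, not the actual code:

```python
import numpy as np

def reduce_sensors(histogram, n_sensors):
    """Systematically sample a histogram down to n_sensors bins.

    Keeps every k-th bin at a fixed stride, analogous to reducing
    768 SOM input sensors to 128. (Illustrative sketch only.)
    """
    histogram = np.asarray(histogram)
    stride = len(histogram) // n_sensors
    return histogram[::stride][:n_sensors]

full = np.arange(768)              # stand-in for a 768-bin histogram
reduced = reduce_sensors(full, 128)
```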


Here are the memory fields of the 768 (top) and 128 (bottom) sensor training sessions:



Degree of Stimulation

Posted: March 10, 2009 at 7:02 pm

After working with these large SOMs (50×50 or 75×75), the grid and uniformity of the images appear perhaps too consistent. One idea to inject some variation is to have the camera motivation keep track of the difference between the histograms of the middle region of the current frame and of the previous frame: the larger the difference, the larger that memory unit could be rendered. A longer-term idea is a histogram analysis that finds clusters of similarity (similar U-matrix values); these U-matrix values could then be mapped to unit size.
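The frame-difference idea could be sketched roughly like this, mapping the histogram distance between consecutive frames' middle regions to a unit's display size (the function name and the size range are illustrative, not from the actual patch):

```python
import numpy as np

def stimulation(prev_hist, curr_hist, min_size=1.0, max_size=3.0):
    """Map the histogram difference between consecutive frames'
    middle regions to a display scale for the matching memory unit.
    (min_size/max_size are illustrative parameters.)
    """
    # L1 distance between normalized histograms, halved to lie in [0, 1]
    p = prev_hist / prev_hist.sum()
    c = curr_hist / curr_hist.sum()
    diff = np.abs(p - c).sum() / 2.0
    return min_size + diff * (max_size - min_size)
```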

Dual-SOM Performance Update

Posted: March 6, 2009 at 7:28 pm

Once I added random codebook initialization to ann_som, I configured oprofile to get a sense of which portions of the patch would need optimization. The results make it very clear that ann_som itself accounts for as much as 80% of the PD CPU usage, while Python uses a measly 10%. My assumption that Python might be a bottleneck is clearly unfounded, and the only way to improve performance would be to limit the number of iterations ann_som goes through. Now that I have a fully populated SOM, I'll see how few iterations are needed to train the second SOM. I wonder what will happen if I use the linear training method multiple times without clearing the SOM. It should optimize much faster the second time, as the majority of the data would not have changed.
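The random codebook initialization could look something like the following sketch; the function name and the data-range strategy are assumptions, since the post does not show the actual ann_som change:

```python
import numpy as np

def random_codebook_init(n_units, dim, data=None, rng=None):
    """Randomly initialize SOM codebook vectors.

    If training data is supplied, draw each component uniformly within
    the data's per-dimension range; otherwise use [0, 1). This mirrors
    a common random-init strategy; the actual ann_som code may differ.
    """
    rng = np.random.default_rng(rng)
    if data is not None:
        lo, hi = data.min(axis=0), data.max(axis=0)
        return rng.uniform(lo, hi, size=(n_units, dim))
    return rng.random((n_units, dim))
```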

Once I get an idea of how that will work, I should integrate the motivated camera and dual-SOM stuff into the current DM system.

Performance of Dual-SOM

Posted: March 4, 2009 at 9:30 pm

I have been able to run a dual SOM in a reasonable amount of time. The more memories are stored in the first SOM, the slower the training appears to be: not in terms of the number of iterations, but in terms of the CPU time spent accessing many more memory locations. Here is the U-matrix of the first SOM when it was only partially populated (3468 of 5625 (75×75) units):
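For background, a U-matrix like this one can be computed by averaging each unit's codebook distance to its grid neighbours. A minimal sketch, with illustrative shapes and names:

```python
import numpy as np

def u_matrix(codebook):
    """Simple U-matrix: each cell is the mean Euclidean distance
    between a unit's codebook vector and its 4-connected neighbours.
    codebook has shape (rows, cols, dim), e.g. (75, 75, dim).
    """
    rows, cols, _ = codebook.shape
    u = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(np.linalg.norm(codebook[r, c] - codebook[nr, nc]))
            u[r, c] = np.mean(dists)
    return u
```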


And the second SOM trained on those memories:



This second SOM was trained in 30s over 15,000 iterations (2ms per iteration). When training this quickly the CPU usage is quite high and does interfere with rendering. To see how feasible this really is, I would need to integrate it into the DM system and see how it performs. One problem is that in the current DM system the pix_buffer is in the parent patch, and it is the second patch that simply describes in which memory location a particular input should be stored. In a dual SOM, the pix_buffer for the initial SOM will need to be in that second patch, and would therefore not be available to the first. A second pix_share could be used to send the data back to the parent patch, but it is unclear whether that would be fast enough. Following is the second SOM trained once the first SOM has associated all of its units with images.



There are still lots of issues with making this approach work, but the quality of these second SOMs is so high that this may be worth it. Ideas to make it optimize faster:

  • Add random codebook initialization to ann_som
  • Try using a numpy data type (rather than a Python list) for faster iteration.
  • Perhaps a C PD external that does the job of
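The second idea, swapping a Python list for a numpy array, amounts to vectorizing the inner loop of training, e.g. the best-matching-unit search. A minimal sketch (the function name and shapes are my own, not from ann_som):

```python
import numpy as np

def bmu_numpy(codebook, x):
    """Find the best-matching unit with a vectorized numpy search,
    instead of looping over a Python list of codebook vectors.
    codebook: (n_units, dim) array; x: (dim,) input vector.
    """
    dists = np.sum((codebook - x) ** 2, axis=1)   # squared distances
    return int(np.argmin(dists))
```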

75×75 unit SOM

Posted: March 3, 2009 at 8:36 am

While testing a faster way to train a SOM, I trained one on some of the motivated-gaze images. I used linear training functions, so it is a pretty good indication of the topology of the data. This feature map took 100,000 iterations to train, and some units have still not been associated with images:
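For context, linear training functions decay the learning rate and neighbourhood radius linearly over the run. A minimal sketch of such a schedule (the function and parameter names are my own, not ann_som's):

```python
def linear_schedule(iteration, n_iterations, start, end=0.0):
    """Linear decay used in SOM training: a value (learning rate or
    neighbourhood radius) shrinks from `start` to `end` over the run.
    """
    frac = iteration / float(n_iterations)
    return start + frac * (end - start)
```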