Forefront Identity Manager (FIM), Connected Data Source Extensions and stopped-extension-dll-load

I was trying to develop a new Connected Data Source Extension for FIM using Visual Studio 2010, but FIM wouldn’t accept it. It kept giving me the error message “stopped-extension-dll-load” – without any further error details.

Turns out the error details are located in the Windows Application event log, where it said;

The management agent "Test MA" failed on run profile "Full Export". The run step stopped because a configured extension for this management agent could not be loaded.

User Action
Verify that the extension is located in the Extensions directory.

Well, the file is located in the correct directory, so that isn’t the problem… after some head scratching, tinkering and asking around, it turns out that I should be using .NET Framework 3.5 instead of .NET Framework 4.0 – which Visual Studio 2010 defaults to. As soon as I switched over, I was able to connect to the MA without issue! Thanks Henrik for suggesting sticking to 3.5.
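For anyone hitting the same wall: in a Visual Studio 2010 C# project the target framework is set under Project Properties, or directly in the .csproj file. This is a sketch of the relevant fragment (your project file will differ); as far as I understand, the reason is that the FIM sync engine hosts extension DLLs in-process on the pre-4.0 CLR, which can’t load assemblies built for .NET 4.0;

```xml
<PropertyGroup>
  <!-- FIM's sync engine loads extension DLLs in-process on the older CLR,
       so the extension must target .NET 3.5 (or lower), not 4.0. -->
  <TargetFrameworkVersion>v3.5</TargetFrameworkVersion>
</PropertyGroup>
```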

Claiming that the file couldn’t be found was extremely unhelpful; that alone cost me a lot of wasted time.

SharpRBM, Restricted Boltzmann Machines in C#/.NET

Years ago I was looking through lots of the lectures on GoogleTechTalks and I came across one by Geoffrey E. Hinton about Restricted Boltzmann Machines (RBMs), a very interesting method for training neural networks to do things like image classification, among other interesting tasks.

Anyway, I wanted my own RBM to play with, but I was unable to find any useful C# code on the internet. I put the idea aside, but recently I stumbled across it and decided to pick it up again.

SharpRBM

I’ve started developing my own implementation of RBMs that I call SharpRBM; it’s hosted at CodePlex. I thought I’d blog about my experiments as I work on my implementation. Right now it can produce image classifiers that work fairly well. I won’t discuss how RBMs work in any detail here – Hinton et al. do a great job of that.

For RBM implementations to be scalable, they should be implemented using something like CUDA, where a GPU is utilized to train the network in parallel, but my initial goal is a proof of concept and something to play around with.

Feature detectors

One funny thing about RBMs is that you don’t train them with labeled data, you train them to create their own feature detectors from the “raw” data. Then you use any other training method (back-prop, GA, whatever really) to match the labels to the constructed feature detectors.

What are feature detectors? Well, one feature detector might recognize a nose when it sees one, others might recognize mouths, eyes and ears. The highest level training, the level using labeled data, might conclude that when the feature detectors “see” eyes, ears, a mouth and a nose together, there’s a face in the picture! Even though the training of the feature detectors had absolutely no knowledge of faces to begin with.
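To make this concrete, here is a minimal Python sketch of the kind of unsupervised update used to train the feature detectors – one step of contrastive divergence (CD-1), per Hinton’s recipe. The names and shapes are my own illustration, not SharpRBM’s API, and biases are left out for brevity;

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, visible, lr=0.1, rng=None):
    """One CD-1 step: nudge W so the RBM's reconstruction of `visible`
    gets closer to `visible` itself. W has shape (n_visible, n_hidden)."""
    rng = rng or np.random.default_rng(0)
    # Positive phase: activate the hidden units (feature detectors) from the data.
    h_prob = sigmoid(visible @ W)
    h_state = (rng.random(h_prob.shape) < h_prob).astype(float)
    # Negative phase: reconstruct the visible layer, then re-activate the hidden units.
    v_recon = sigmoid(h_state @ W.T)
    h_recon = sigmoid(v_recon @ W)
    # The update is the difference between the data and reconstruction correlations.
    W += lr * (visible.T @ h_prob - v_recon.T @ h_recon)
    return W, v_recon
```

Note that no labels appear anywhere – the only training signal is how well the reconstruction matches the input.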

These early experiments will not demonstrate classification using the generated feature detectors.

Reconstructing images from feature detectors

Once the feature detectors are trained, we can show them an image and each feature detector will either come on or not. From this inner state we can reconstruct the image it “imagines” by using the weights in reverse. This gives you a really good sense of what the RBM thinks of the picture space it’s working with, how much of it it has understood.

The number of feature detectors determines how well the RBM is able to reproduce the image it saw: too few and you’ll get an ugly mixed version; too many and it takes forever to train the network.

I want to see…

The RBM can be allowed to “dream” by repeatedly activating the feature detectors, generating reconstructions from the internal state of the feature detectors, and then activating the feature detectors again from the reconstruction. If you do this for a while, you’ll see images that the RBM considers probable/plausible. For numbers, this looks really funny, as it dreams about numbers that look like numbers it has previously seen.
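In code, the dreaming loop is just this alternation; a self-contained Python sketch with illustrative names of my own (W is a trained weight matrix with one row per pixel and one column per feature detector);

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dream(W, v, steps=10, rng=None):
    """Alternate hidden activation and reconstruction; after a few steps the
    visible vector drifts toward patterns the RBM considers probable."""
    rng = rng or np.random.default_rng(0)
    for _ in range(steps):
        h_prob = sigmoid(v @ W)
        # Each feature detector either comes on or not.
        h = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Use the weights "in reverse" to reconstruct the visible layer.
        v = sigmoid(h @ W.T)
    return v
```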

(Embedded video from youtube.com.)

The real reason I started this is that I want to see the kinds of faces an RBM dreams of when it’s been trained on faces… But we’re not there yet.

This is what my version looks like when it’s trying to train an RBM to create feature detectors for numbers;

image

Tests

For this demo run, I’m using 10 input images that I drew myself using Paint – glorious, glorious MS Paint. Each experiment is run for several hundred generations, until it looks (to me) like it’s converged on a solution. These images are very hard to create feature detectors for because they’re all very dissimilar. A small set of handwritten digits would be easier, but these tests are meant to probe the limits of feature detectors, so a harder set is better.

Using 2 feature detectors

Using just 2 feature detectors is not enough for the RBM to be able to reconstruct the images. Note how the reconstructions and the feature detectors are mixes of several of the input images;

image

Using 3 feature detectors

Using three is still not enough, but the images are clearer.

image

Using 4 feature detectors

Again, not particularly well reconstructed, but we’re closer.

image

Using 5 feature detectors

Closer, but still no cigar!

image

Using 6 feature detectors

Pretty close, actually! Note how the feature detectors are very chaotic, but the regenerated images aren’t! This is (probably) because the networks have been forced to re-use the feature detectors to mean many different things. Recombining them makes a fairly clean image, because you can subtract one feature detector from another (by using negative weights) and thus come up with a cleaned-up image.

image

Using 7 feature detectors

This one’s actually worse than the previous attempt – which might be due to the fact that I didn’t run it for as long. But it might also be some random thing that happens when the number of feature detectors is so low.

image

Using 8 feature detectors

Oh, we’re so close now! Feature detectors are still chaotic because they’re storing multiple actual features at once.

image


Using 9 feature detectors

Closer still.

image

Using 10 feature detectors

Very good reconstruction, but the detectors are still chaotic! I was hoping that by now, we’d have nice clear detectors that identified one picture each, but alas. Not yet. Now we’ll pick up the pace and add more detectors for each experiment. Next up, 15 detectors!

image

10 and then 15 detectors

So, it seems my expectation of specialized feature detectors was incorrect;

image image

50 and then 100 detectors

Still no specialization, and note how the detectors are starting to get washed out. The gray areas of the detectors mean that the detector doesn’t care about the region in question. With very few detectors, most pixels in most detectors actually count; with many detectors, some pixels can be left unused. It also makes the detectors duller and duller to look at as you move forward…

image

image

In closing

Very odd indeed – why aren’t my feature detectors specializing? Maybe the first level of RBM nodes doesn’t specialize; maybe that’s for the subsequent levels to do? Also, at the time of this writing, SharpRBM isn’t quite finished: it doesn’t implement biases, which are probably required going forward. That’s something I’ll work on, and some of the results I’m getting might be because of this.

[Edit: Biases have been added and they didn’t do a thing for specialization]

More experimenting to follow.

[Edit: The current thinking is that the lack of specialization is due to the fact that this is the first layer, adding another layer might make things more agreeable! I’m working on a second layer, which should be simple enough, if I can just find the time]

PicoGA: Evolving a 3D Projection Matrix

In my previous blog post PicoGA – a tiny GA for C# I talk about PicoGA and demonstrate a few problems that can be solved using it. One of the examples is evolving a 3D projection matrix from a few pixel-to-world coordinate mappings taken from a given photograph.

The example didn’t use “height” in the picture because it contained no heights that I could be sure of. In this example, I’m using the heights of the goal posts in the picture below to show that it can correctly handle World (x,y,z) to Screen (x,y) mapping.

From this picture I’ve identified 9 points for which I’ve determined the World (x,y,z) coordinates (the football field is 105×68 meters, the goal is 2.44 meters high and 7.32 meters wide). The coordinates are specified below the picture.

new-pitch-with numbers

I simply use MS Paint (yay!) to determine the pixel coordinates, and I use the given dimensions of the football field (105×68) to determine the world coordinates;

    new WorldToScreenCase(688, 349, 0, 0, 0), // 1
    new WorldToScreenCase(454, 155, 52.5, 0, 0), // 2
    new WorldToScreenCase(353, 70, 105, 0, 0), // 3
    new WorldToScreenCase(127, 169, 52.5, 34, 0), // 4
    new WorldToScreenCase(155, 411, 0, 34, 0), // 5
    new WorldToScreenCase(220, 364, 0, 34 - 7.32 * 0.5, 2.44), // 6
    new WorldToScreenCase(80, 378, 0, 34 + 7.32 * 0.5, 2.44), // 7
    new WorldToScreenCase(92, 57, 105, 34 + 7.32 * 0.5, 2.44), // 8
    new WorldToScreenCase(145, 56, 105, 34 - 7.32 * 0.5, 2.44), // 9

It takes a while, here’s a video of evolution in action;

but the program is successful in evolving an appropriate matrix;

image

As you can see, the evolved matrix (the red dots) hits the measured points (blue dots) with quite high accuracy. This matrix can now be used for animating something moving across the football field!

As I stated in my previous post, there are analytical methods for doing this, but I’ve been unable to locate any, and this is a cool use for a GA…

This is the code that I use to map from world coordinates to screen coordinates, using my evolved matrix;

private static Point3D ProjectCase(Point3D point, List<double> l)
{
	// Translate by the evolved camera position (genes 12-14).
	Point3D p = new Point3D(
		point.X - l[12] * 100,
		point.Y - l[13] * 100,
		point.Z - l[14] * 100);

	// Apply the evolved 3x3 linear part plus offset (genes 0-11).
	Point3D result = new Point3D(
		l[0] * p.X + l[1] * p.Y + l[2] * p.Z + l[3],
		l[4] * p.X + l[5] * p.Y + l[6] * p.Z + l[7],
		l[8] * p.X + l[9] * p.Y + l[10] * p.Z + l[11]);

	// Perspective divide: scale X and Y by gene 15 over depth.
	if (result.Z != 0)
	{
		result.X *= l[15] / result.Z;
		result.Y *= l[15] / result.Z;
	}

	return result;
}
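For sanity-checking the arithmetic outside the C# project, here is the same mapping as a standalone Python sketch. The gene list used below is a hand-made placeholder (an identity transform), not an evolved solution;

```python
def project(point, l):
    """Port of ProjectCase: translate by the camera position (genes 12-14),
    apply the 3x3 linear part plus offset (genes 0-11), then do the
    perspective divide by depth using gene 15."""
    x = point[0] - l[12] * 100
    y = point[1] - l[13] * 100
    z = point[2] - l[14] * 100
    rx = l[0] * x + l[1] * y + l[2] * z + l[3]
    ry = l[4] * x + l[5] * y + l[6] * z + l[7]
    rz = l[8] * x + l[9] * y + l[10] * z + l[11]
    if rz != 0:
        rx *= l[15] / rz
        ry *= l[15] / rz
    return (rx, ry, rz)
```

With identity-like placeholder genes (3x3 identity, zero offsets, l[15] = 1), a point at (2, 4, 2) comes out as (1, 2) on screen – the classic divide-by-depth of a perspective projection.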


You can find the source code among the PicoGA demos.

PicoGA – a tiny GA for C#

Have you ever found that you really want an extremely tiny and simple Genetic Algorithm for optimizing a set of doubles (the genotype) given a fitness function? Me too!!! So I wrote one, because it seemed fun.

I’m using it to evolve solutions to two toy problems; later I hope to show it on a slightly less toyish problem.

It’s really small (224 lines of code in a single file). It would be very easy to make it smaller – but then it’d be less readable. It would be equally simple to add new functionality, but then it would be larger and would no longer qualify for the name PicoGA. You can find PicoGA here http://picoga.codeplex.com/

What can you use a GA for?

First, two toy problems;

Find three doubles that sum up to five!

Not very difficult, but as I said, it’s a toy problem. First we must specify;

  • a fitness function
  • a population size – how many individuals (I just picked 20 for this simple problem)
  • a genome size / number of genes (3).

That comes out as a single constructor call;

      GA ga = new GA(
        20, // Individuals
        3, // Genotype length (number of doubles)
        individual => // Fitness function
          Math.Abs(5 - individual.Genotype.Sum(val => val)) 
        );

Simple! Now we need to run that for a couple of generations – but PicoGA takes care of that too. For each generation, we publish the results if they’ve improved since the previous generation;

private static void SumTo5()
{
  Console.WriteLine("Sum to 5:...");
  GA ga = new GA(
    20, // Individuals
    3, // Genotype length (number of doubles)
    individual => // Fitness function
      Math.Abs(5 - individual.Genotype.Sum(val => val))
    );

  ga.RunEpoch(
    500, // Number of generations to run for
    null, // Action to perform for each generation
    () => // Action to perform once fitness has improved
    {
      Console.WriteLine(
        "Gen {2}: Fit={1}, Genotype={0}",
        string.Join(
          " ",
          ga.BestIndividual.Genotype.Select(val => val.ToString("0.00"))),
        ga.BestIndividual.Fitness.ToString("0.00"),
        ga.CurrentEpochGeneration);
    });

  Console.WriteLine("Sum to 5: done!");
  Console.WriteLine("");
}


Voilà, that’s all there is to it! For my test run, a solution is found at generation 45;

PicoGA, demos

Sum to 5:…

Gen 0: Fit=3,39, Genotype=0,84 0,50 0,26

Gen 1: Fit=3,26, Genotype=0,97 0,50 0,26

Gen 43: Fit=0,04, Genotype=2,45 1,38 1,13

Gen 44: Fit=0,01, Genotype=2,48 1,38 1,13

Gen 45: Fit=0,00, Genotype=2,49 1,38 1,13

Sum to 5: done!

Find 1 2 3 4 5

For my second toy problem, I want the GA to find the values 1 2 3 4 5 (in that exact order) for its five genes. Again, not a very useful thing to do, possibly even less useful than my first example, but it’s a toy problem;

private static void Find12345()
{
    Console.WriteLine("Find 1 2 3 4 5:...");
    GA ga = new GA(
        200, // Number of individuals
        5, // Number of genes in the genotype
        individual => // Fitness function
        Math.Abs(individual.Genotype[0] - 1) +
        Math.Abs(individual.Genotype[1] - 2) +
        Math.Abs(individual.Genotype[2] - 3) +
        Math.Abs(individual.Genotype[3] - 4) +
        Math.Abs(individual.Genotype[4] - 5));

    ga.RunEpoch(500, null, () =>
        {
            Console.WriteLine(
                "Gen {2}: Fit={1}, Genotype={0}",
                string.Join(
                " ", 
                ga.BestIndividual.Genotype.Select(
                    val => val.ToString("0.00"))),
                ga.BestIndividual.Fitness.ToString("0.00"),
                ga.CurrentEpochGeneration);
        });

    Console.WriteLine("Find 1 2 3 4 5: done!");
    Console.WriteLine("");
}


At generation 105 a perfect solution is found. The reason it takes so long is that genes mutate only slowly, and reaching 5 took a while. I’ve since upped the mutation rate.

Gen 105: Fit=0,00, Genotype=1,00 2,00 3,00 4,00 5,00
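The mutation operator in question is the usual “perturb each gene with some probability” scheme; here’s a Python sketch with parameter names of my own. Raising either the rate or the noise size speeds up coarse search, at the cost of fine convergence;

```python
import random

def mutate(genotype, rate=0.1, sigma=0.05, rng=None):
    """Return a copy of `genotype` where each gene is perturbed with
    probability `rate` by Gaussian noise of standard deviation `sigma`."""
    rng = rng or random.Random(0)
    return [g + rng.gauss(0, sigma) if rng.random() < rate else g
            for g in genotype]
```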

Ok, but what can I really use it for? No more toy problems!

Well, lots and lots of stuff!

I once had an image of a soccer field and I needed a 3D perspective projection matrix that matched the image. I needed to render football players as they were moving around the pitch, for which I had pitch coordinates. But I needed to convert those pitch coordinates to screen coordinates, which isn’t as simple as it sounds.

The 3D perspective projection matrix contains all the information about the setup of the camera and target as the image was shot: camera field of view, position of the camera relative to the subject, orientation of the camera with regard to the subject, etc. For this I used a genotype of 18 genes (in effect, 18 doubles).

The data we need

The pitch looked something like this;

image

First I pinpointed pixels in the image that I could identify, determined where they were in soccer-field space, and tied those world coordinates to the image-space coordinates.

image

I compiled a list like this (a soccer pitch is 105×68 meters);

  1. Pixel 165, 178 => World 0, 0, 0
  2. Pixel 377, 120 => World 52.5, 0, 0
  3. Pixel 513, 85 => World 105, 0, 0
  4. Pixel 473, 157 => World 52.5, 34, 0
  5. Pixel 611, 208 => World 52.5, 68, 0
  6. Pixel 735, 136 => World 105, 68, 0

There are analytical methods of retrieving the matrix given only that information, but I don’t know them and I couldn’t find them by googling. But I know GA, and using GA I evolved the matrix!

The fitness function

The fitness function basically feeds the six world coordinates through the 3D transformation matrix and sums up the errors between the pixel coordinates it suggests and the pixel coordinates we previously measured. Trivial? Not really, because the formula for transforming from 3D space to 2D screen space isn’t trivial. This is the code I wound up using;

private static Point3D ProjectCase(Point3D point, List<double> l)
{
    Point3D p = new Point3D(
        point.X - l[12] * 100,
        point.Y - l[13] * 100,
        point.Z - l[14] * 100);

    Point3D result = new Point3D(
        l[0] * p.X + l[1] * p.Y + l[2] * p.Z + l[3],
        l[4] * p.X + l[5] * p.Y + l[6] * p.Z + l[7],
        l[8] * p.X + l[9] * p.Y + l[10] * p.Z + l[11]);

    if (result.Z != 0)
    {
        result.X *= l[15] / result.Z;
        result.Y *= l[15] / result.Z;
    }

    result.X += l[16] * 100;
    result.Y += l[17] * 100;

    return result;
}


Sometimes PicoGA quickly (in a matter of minutes) finds a matrix that it likes. I’ve plotted the pixels that the generated matrix produces in the picture below. They’re not perfect, but neither are the measurements that were used for generating the matrix, so they can never be perfect. The red dots are where the evolved matrix landed;

image

Here’s the program for evolving the matrix;

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Windows.Media.Media3D;

namespace PicoGA
{
    public static class FindMatrix
    {
        internal class WorldToScreenCase
        {
            public WorldToScreenCase(double screenX, double screenY, double worldX, double worldY)
            {
                Screen = new Point3D(screenX, screenY, 0);
                World = new Point3D(worldX, worldY, 0);
            }

            internal Point3D Screen { get; set; }
            internal Point3D World { get; set; }
        }

        public static void FindProjectionMatrix()
        {
            Console.WriteLine("Find ProjectionMatrix:...");
            List<WorldToScreenCase> cases =
                new List<WorldToScreenCase>
                {
                    // Old image
                    new WorldToScreenCase(165, 178, 0, 0),
                    new WorldToScreenCase(377, 120, 52.5, 0),
                    new WorldToScreenCase(513, 85, 105, 0),
                    new WorldToScreenCase(473, 157, 52.5, 34),
                    new WorldToScreenCase(611, 208, 52.5, 68),
                    new WorldToScreenCase(735, 136, 105, 68),
                };

            GA ga = new GA(
                2000, // Number of individuals
                18, // Number of genes in the genotype
                individual => // Fitness function
                {
                    double errorSum = 0;
                    foreach (WorldToScreenCase test in cases)
                    {
                        Point3D testScreen = ProjectCase(test.World, individual.Genotype);
                        double sqrError = (testScreen - test.Screen).LengthSquared;
                        errorSum += sqrError;
                    }

                    return errorSum;
                });

            ga.RunEpoch(10000,
                () =>
                {
                    ga.BreakEpochRun = ga.BestIndividual.Fitness <= 1.0 || Console.KeyAvailable;
                },
                () =>
                {
                    Console.WriteLine(
                        "Gen {2}: Fit={1}, Genotype={0}",
                        string.Join(
                        " ",
                        ga.BestIndividual.Genotype.Take(5).Select(
                            val => val.ToString("0.00"))),
                        ga.BestIndividual.Fitness.ToString("0.00"),
                        ga.CurrentEpochGeneration);
                });

            if (Console.KeyAvailable)
            {
                Console.ReadKey();
            }

            Console.WriteLine("Results for training set:");
            foreach (WorldToScreenCase test in cases)
            {
                ShowTestResult(ga, test);
            }

            Console.WriteLine("");
            Console.WriteLine("Additional tests:");
            ShowTestResult(ga, new WorldToScreenCase(120, 73, 105, 34));

            Console.WriteLine("Find ProjectionMatrix: done!");
            Console.WriteLine("");
        }

        private static void ShowTestResult(GA ga, WorldToScreenCase test)
        {
            Point3D p = ProjectCase(test.World, ga.BestIndividual.Genotype);
            Console.WriteLine(
                " World ({0:0.00}, {1:0.00}) => \n" +
                "   Wanted:({2:0.00}, {3:0.00}) \n" +
                "   Received:({4:0.00}, {5:0.00})",
                test.World.X,
                test.World.Y,
                test.Screen.X,
                test.Screen.Y,
                p.X,
                p.Y);
        }

        private static Point3D ProjectCase(Point3D point, List<double> l)
        {
            Point3D p = new Point3D(
                point.X - l[12] * 100,
                point.Y - l[13] * 100,
                point.Z - l[14] * 100);

            Point3D result = new Point3D(
                l[0] * p.X + l[1] * p.Y + l[2] * p.Z + l[3],
                l[4] * p.X + l[5] * p.Y + l[6] * p.Z + l[7],
                l[8] * p.X + l[9] * p.Y + l[10] * p.Z + l[11]);

            if (result.Z != 0)
            {
                result.X *= l[15] / result.Z;
                result.Y *= l[15] / result.Z;
            }

            result.X += l[16] * 100;
            result.Y += l[17] * 100;

            return result;
        }
    }
}


You can find PicoGA here http://picoga.codeplex.com/.