More Fun With LINQ

March 30th, 2009 | 6 comments

So after the mess of my last post (see the comments for how it all plays out) it seems that while LINQ is really efficient for small data sets, ordering appears to happen at an O(n^2) rate (or close) of speed. Try sticking a list with a few million items in it through that and it bogs right down.

I decided to play with LINQ tonight and see if I can make things more efficient, with or without maintaining the readability. First I wrote up this method which I believe is about the most efficient it can get. It can turn through a 375,000 item list of Vector3s in about 0.012 seconds, i.e. fast enough to happen in a 60fps game (though just barely):

float d = float.PositiveInfinity;
Vector3 closest = testPoint;
foreach (var v in vectors)
{
	float dt =
		(testPoint.X - v.X) * (testPoint.X - v.X) +
		(testPoint.Y - v.Y) * (testPoint.Y - v.Y) +
		(testPoint.Z - v.Z) * (testPoint.Z - v.Z);
	if (dt < d)
	{
		closest = v;
		d = dt;
	}
}

With that as my benchmark I decided to compare that with a naive LINQ implementation:

var closest =
	(from v in vectors
	 let d = Vector3.DistanceSquared(testPoint, v)
	 orderby d
	 select v).First();

This LINQ method took .351 seconds to get through the same list of vectors. The problem is that we’re sorting the entire list which gets pretty slow with that many entries.

Next up is a really ugly LINQ query that aims to use a local variable to try and cut down on the time.

float e = float.PositiveInfinity;
var closest =
	(from v in vectors
	 let d = Vector3.DistanceSquared(testPoint, v)
	 where (d < e && ((e = d) != float.NegativeInfinity))
	 orderby d
	 select v).First();

This time we throw in a where clause that compares the distance to our local variable so that we never add an item to our intermediate list unless it is closer to us than the previous point. The ugly hacky stuff in there is one of the methods I’ve found for assigning a local variable inside a LINQ query. I just assign the value to the local variable and then compare that against a value I know it will never be (in this case a negative value since DistanceSquared cannot be negative).

By doing this test, the resulting list we wind up sorting tends to only have around 13 or 14 items in it. This method manages to get me my result in about 0.081 seconds. Not quite fast enough but it’s about 75% faster than our previous method.

I think that’s about the best I can come up with for now. LINQ definitely isn’t something you want to tear through giant lists with, but then when are you really going through 375,000 points comparing them to a position? At that point you’d use some spatial partitioning to narrow that down. As some food for thought, once we have a list with just 375 items in it, our results turn to this:

foreach: 0.0007573 seconds
LINQ (naive): 0.0104185 seconds
LINQ (with where clause hackery): 0.001516 seconds

In this case you can actually get away with the naive LINQ query as long as you don’t have too much else to do (it’s burned up 10 of the 16 milliseconds the frame gets), but the LINQ with the hackery actually is down to a point of being quick enough to use. But then you realize that the foreach loop is probably easier to understand and still kicks its butt in performance.

All in all I’d say use LINQ if you don’t mind the overhead. It’s a great way to simplify some of your code where garbage and performance aren’t issues.


Possibly Related Posts

(Automatically Generated)
Using LINQ To Find A Closest Point
Finding Distances The Extension Way
Extending GamePadState
C# Twitter quiz explained
Borrowing From The iPhone SDK

  1. shawnhargreaves
    March 31st, 2009 at 08:10

    Interesting stuff!

    I wonder how this would play out if you applied it to some kind of more sophisticated spatial data structure? Rather than concentrating on the big-O performance of the query operator (which is being fed a flat list of unsorted vectors) what if the input data was organized in some more useful way, say if the vectors were sorted by X coordinate, or grouped into a quadtree?

    I don’t know LINQ well enough to predict how well it would apply to such a structure, but that would be more representative of using it in a real optimized game collision environment. Would be interesting to see how LINQ compares with manual loops for traversing a quadtree…

  2. Fahllen
    March 31st, 2009 at 10:40

    I am sorry if this is at the wrong place to type this in, but I was following your tile based game tutorial. And even though i noticed it was no problem doing it with a Microsoft Visual Studio 2008 (Even though it was for a 2005), I noticed that something didnt work for me as it should at episode 13 part C.
    I was just wondering if you could help me.. I dont really get what’s the problem. Ive been trying to sort out this for hours now and was just hoping for some help.
    Thanks for your time.

    //Daniel

    P.S.
    Added you on MSN, if you could accept that, it would be alot easier for me to communicate.

  3. March 31st, 2009 at 10:42

    I don’t do one-on-one support, so I probably declined the MSN friend request. I don’t have time to help everyone who has problems out. Nothing personal; I’m just a busy guy. If you watch the videos and read the sample code, I’m fairly confident it will all work in 3.0, so perhaps you just missed a step or made a typo somewhere if something isn’t working as it should.

  4. yokall
    March 31st, 2009 at 12:52

    On the tile engine tutorial query, I have completed it using GS 3.0 and it all worked. Chapter 13 is quite long so it is very easy to miss things. Download the source code and use it to compare against your code.

  5. Darkside
    March 31st, 2009 at 14:42

    Well you compared the speeds of Lists using for each and linq.
    How about a test against an array using a for next?

    How does the speed of Linq (in hacky mode) but up against the simplicity of arrays?

  6. March 31st, 2009 at 14:47

    The for loop is (slightly) slower than foreach, so I just omitted it because I wanted the fastest loop method against LINQ.

    When I was doing my testing, I believe an array is slower than using a list. But it’d be pretty easy for someone to take my code here (and in the comments of the last post) and put together their own tests if they would like. I encourage experimenting with ideas; it’s a great way to learn about C# and .NET.

You must be logged in to post a comment.