Raising the Bar with LINQ

With NuoDB release 1.1 we have added support for Microsoft .Net and Language INtegrated Queries (LINQ).   In this post we’ll take a quick look at accessing the NuoDB quickstart hockey database using LINQ.

What are Language INtegrated Queries?

LINQ is Microsoft’s innovative technology, introduced in 2008, for the development of database applications by writing queries directly in application programming languages.   What’s the big deal about that, you may ask?

As we’ll illustrate in this post, LINQ helps reduce the dreaded impedance mismatchbetween programming languages and database management systems by extending the programming language syntax so that queries can be written directly in C#, VB or another LINQ-supported programming language.

When writing apps with LINQ there is no need for developers to switch context between thinking in the application programming language and thinking in SQL.  We simply use the type system, expressions and other constructs in our application programming language to write our queries.  That makes it a breeze to code selection predicates, finding relationships between different entities, aggregating query result sets and other database access functions.

In the Visual Studio environment, we can also use all the usual developer tools for symbol completion,  setting breakpoints in the debugger and inspecting variable values, including query results.

NuoDB quickstart in LINQ

To illustrate programming with LINQ we are going to use the NuoDB quickstart database of North American professional hockey teams, players and season stats.   Our quickstart database is based on the Hockey Databank[1] project, a data set with information about nearly 5,000 players and about 100 years worth of seasons statistics for these players and the teams that they played on.

In the following diagram, taken from the Visual Studio tool for importing database schemas,  we see the 3 entities that make up the NuoDB quickstart database schema:

  • MASTER table has biographic information on each player, when and where they were born and the range of years that they were active in the hockey league.
  • TEAMS table has info about the season stats for each team and year, including the number of games played, won, lost, or lost in overtime, final standing rank and playoff results.
  • SCORING table has info about the season stats for individual players, the position they played,  number of games, goals scored, goals assisted and penalty infraction minutes.
    Note that SCORING represents the many-to-many relationship between teams and players over time.  As players may change teams mid-season, the STINT attribute indicates when a player played for more than one team during a season.
 
linq-schema

A note for hard-core hockey fans:  we use a subset of the Hockey Databank, the full data set has many additional stats about teams and players, goal tending and coaching stats, awards, post season stats, game splits and team on team stats.

Adding entities and querying

Let’s take a look at some of the data manipulation constructs in LINQ.

Here is a code fragment for adding a new team into our sample database.  We simply instantiate a new C# object, set the values of its attributes using ordinary assignment statements, add the new object to the collection of teams and save our changes:

    testEntities ctx = new testEntities();
    TEAMS team = new TEAMS();
    team.TMID = "BOS";
    team.YEAR = 2013;
    team.CONFID = "EC";
    team.DIVID = "SE";
    team.PLAYOFF = "0";
    team.OTL = "0";
    team.NAME = "Boston Bruins";
    ctx.TEAMS.AddObject(team);
    ctx.SaveChanges();

That was easy, but so far this is not all that different from coding using an Object-Relational Mapping (ORM) framework, such as Active Record in Ruby on Rails or Hibernate in Java.    LINQ really shines when it comes to writing queries, from simple 1 table lookups with a single selection predicate, to complex queries that select from multiple tables and aggregate or sort the results based on several attributes.  In most ORM frameworks, it is not easy to write non-trivial queries, and usually it requires writing the query directly in SQL.

Our first query, go Bruins!

For our first example, we write a simple query to show the ranking for the Boston Bruins, for every season since the year 2003:

 
    var bostonStats = from team in ctx.TEAMS
        where team.TMID == "BOS" && team.YEAR > 2003
        select team;

    foreach ( TEAMS t in bostonStats )
        Console.WriteLine( t.YEAR + " " + t.TMID+” “ + t.RANK );

As you can see with LINQ, writing the query in C# is very natural.  The expression for selecting the teams we want to find is the exact same language construct we would write using an if statement.   This makes the task of writing database apps a lot faster and a lot less error prone.  Now how powerful is that!

Here are the query results.  Looks like the Bruins have been doing well lately.  We’ll see how well they end up against the Penguins this week.

   Year    Team   Rank
    2003   BOS    1
    2005   BOS    5
    2006   BOS    5
    2007   BOS    3
    2008   BOS    1
    2009   BOS    3
    2010   BOS    1
    2011   BOS    1

A second query, players from Slovakia?

A bit more complex query shows us the scoring stats of players who were born in Slovakia:

     var slovakianStats = from player in ctx.MASTER
         where player.BIRTHCOUNTRY == "Slovakia"
         orderby player.LASTNAME
         select new
            {
              player.LASTNAME, player.FIRSTNAME, player.SCORING
            };

     foreach (var t in slovakianStats)
         {
            Console.WriteLine(t.LASTNAME + " " + t.FIRSTNAME);
            foreach(var s in t.SCORING)
                {
                  Console.WriteLine("\t" + s.YEAR + " " + s.TMID 
                      + " " + s.GP + " " + s.G);
                }
         }

In this query we used a complex type to embed the scoring stats within each player as the value of player.SCORING.   Here are the results:

    Baca Jergus
        1990 HAR 9 0
        1991 HAR 1 0
    Balej Jozef
        2003 NYR 13 1
        2005 VAN 1 0
        2003 MTL 4 0
    …

Query 3, top-scoring rookies?

And finally here is an example showing LINQ code for finding rookie players who played for one of the top-ranked teams and who also scored higher than the average score of all the players in a given season.

linq-query

IDE tools and LINQ

We can easily set a debug breakpoint after the first part of this query and inspect the elements of the results set for the top ranked teams in the winners variable.

linq-ide-stack

Summary

In this brief tour we saw that writing apps using Language INtegrated Queries raises the level of abstraction for accessing databases and can help amp up developer productivity to a whole new level.

If you have not looked at the latest Microsoft tools for app development, give them a spin.  We should also note that LINQ is available in other environments, so even if you aren’t writing on Windows you may want to check it out.

Stay tuned for our future posts where we’ll take a look at how relational and extended-relational information models compare to document-oriented JSON databases, patterns and anti-patterns for representing complex data structures and dynamic schemas in NuoDB, and other related topics.

Please let us know what you think and what you’d like us to write about next.


[1] For more information about the Hockey Databank project please visithttp://sports.groups.yahoo.com/group/hockey-databank

Contributing Author: 

Tags: 

Lan Luu
Anonymous's picture
<p>Thank you Alberto for your

<p>Thank you Alberto for your cool blog about LINQ2NuoDB!</p><p>It could be nice, in one of your future blogs,&nbsp;you can share your experience about the performance of LINQ2NuoDB (e.g. benchmarks, tips for improvement ...), since we all know that ORM reduce query performance much.</p>

Add new comment

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.