I thought today would be a good time to share a SQL Server tip I use frequently at work. The data I work with and report on is often presented in a crosstab fashion. This is fairly easy to do in Microsoft Access, as Access has a built in Crosstab Wizard. T-SQL, the query language used in SQL Server, however, has no such functionality. You have to code it yourself.
For this example I’m going to use the data found at the Baseball Archive, which has an extensive collection of Major League Baseball statistics from the 1800’s all the way to last season available for free personal use. The data they provide is contained in an Access data, but I went ahead and imported it to SQL Server. For this example I’m using two tables: the Fielding table to obtain Position information and the Batting table for the number of Home Runs.
Let’s say you wanted to know how many Home Runs were hit per position per year from 2000 to 2003. You could write the query like this (I filtered out the Pitcher statistics because who cares about Pitcher’s hitting stats):
Query 1
select Fielding.POS, sum(hr) as hr, batting.yearid
from dbo.Fielding inner join batting on Fielding.PlayerID = batting.PlayerID
where (batting.yearID > ‘1999′ and batting.yearID < ‘2004′ ) and Fielding.pos <> ‘p’
group by Fielding.pos, batting.yearid
order by batting.yearid
Query 1 Results
This does give you the number of Home Runs hit per year by Position. However, it’s hard to examine two years of numbers without scrolling. Now, let’s look at the results using a crosstab query.
Query 2
/* Cross Tab */
select Fielding.Pos,
sum( case
when batting.yearid = ‘2000′ then (batting.hr)
end) as ‘2000 - Home Runs’,
sum( case
when batting.yearid = ‘2001′ then (batting.hr)
end) as ‘2001 - Home Runs’,
sum( case
when batting.yearid = ‘2002′ then (batting.hr)
end) as ‘2002 - Home Runs’,
sum( case
when batting.yearid = ‘2003′ then (batting.hr)
end) as ‘2003 - Home Runs’
from dbo.Fielding inner join batting on Fielding.PlayerID = batting.PlayerID
where (batting.yearID > ‘1999′ and batting.yearID < ‘2004′ ) and Fielding.pos <> ‘p’
group by Fielding.pos
order by Fielding.pos
Query 2 Results
Much better, right? Here you can see the trends of what is happening at each position per year.
The key to the crosstab query above is the case statement. What the code is going is testing the year and if it is equal to ‘2000′ is summing batting.hr. That’s all there is to it. The last thing I would do is take this code and save it as a stored procedure, so it can be used in you .Net application or wherever.
Improvements:
- Send the year to the query as a parameter so this query can be used for any range of years.
- Number of years could even be a parameter, making this even more “dynamic”
- ?
Any questions? Let me know if you can think of any other improvements you would add to the above code. Also, let me know if you would like to see the above code with the improvements I mentioned.



0 comments ↓
There are no comments yet...Kick things off by filling out the form below.
Leave a Comment