By tim
In building our Research data mart, which includes data on book sales trends, job postings, blog postings, and other data sources, Roger Magoulas has had to deal with a lot of very messy textual data, transforming it into something with enough structure to put into a database. In this entry, he describes some of the problems, the solutions, and the skills needed for dealing with unstructured data.
Roger wrote:
When integrating our research data mart with a legacy sales transaction system, I was asked to help tune a data mart with approximately 3 million rows that joined to a few large dimensions; an aggregate query was not completing. I was able to tune the query down to …