The organization I work for is expanding its Business Intelligence capabilities, introducing a new Enterprise Data Warehouse for management reporting. This is based on the concept of OLAP Data Cubes – permitting greater flexibility for user reporting.
My question is – what’s the best approach for eliciting and documenting requirements for BI/Data Cubes? There’s less emphasis on the structure of reports due to the flexibility of the cube, and more emphasis on the definition of the data and the ways in which it can be manipulated. To what degree should I be focusing on the detail of the data? Should I learn about ‘star schemas’ and how to structure the data in ‘dimensions’? Or should I focus more on the business need – on what metrics are required, what the data/reports are to be used for, by who, how frequently etc.
I know that there is never a cookie cutter approach to eliciting and documenting requirements of any kind, but I’d be eager to know of any specific areas I should be covering to ensure that the specification I handover is comprehensive.
Finally, if there are any recommended books on Business Intelligence & BI requirements, I’d be grateful for any suggestions.
You are asking all the right questions. This reminded me of my first BI project.
My first Business Intelligence Project
The request from the client consisted of an excel sheet with some existing reports that needed some changes. They also wanted some new reports. Since these reports were manually created, they were always concerned about the accuracy of the data. Also the reports were delivered late due to the time taken to generate the reports.
We jumped on it and built the cubes (explained later) and delivered the reports. Everyone was very happy. However, after 3 months; the management wanted new reports and enhancements to existing reports. Though this required substantial changes to the solution, we were able to meet the requirements. However, by the time we delivered these, the management wanted more reports and changes.
This did get us thinking and we got together to figure out a better way of handling the project. By this time, we had understood that requirements for BI solutions had to be handled in a different way. Most of us are used to doing requirements for Transactional systems; i.e. applications that help in day to day operations of the business which are very different from Business Intelligence applications. BI applications leverage the data from the Transactional systems to help management make informed decisions and take calculated risks.
This was a big turning point and led to defining an overall approach for handling BI projects in general. We broke the team into 3 groups which were responsible for OLAP Database, Cubes and Reporting Dashboard. A fourth group was added later to take care of the Data Mining area.
OLAP Team Created the Data Archive
The focus of this group was to create the OLAP (Online Analytical Processing) database from the current OLTP (Online Transaction Processing) database. The key difference between the two is that while OLTP databases are designed to support the day to day functions (transactions); OLAP databases are designed to support analysis of the data. Data is pushed to OLAP databases from OLTP database through the process known as ETL (Extraction, Transformation and Loading). The key responsibility of the team was to define an overall archiving strategy to move data from OLTP DB to OLAP DB so that this data can be used for analysis.
One of the key activities of this team was to identify data elements that can be used for analysis and discard other unnecessary details. For example: We just retained the zip codes of shipping and billing addresses instead of the whole address. Another interesting addition was to have flags to indicate whether these addresses were same or different.
Other activities included performance tuning of this database. Performance tuning of OLAP database is different from OLTP. OLAP databases are generally tuned for bulk inserts and retrieval; while the OLTP databases are tuned for smaller but frequent inserts and updates.
From Business Analysis function perspective, we had data analysts in this team who worked with the management and development team to create the OLAP database. They defined data extraction, standardization and transformation rules to facilitate this process.
Cubes Team Focused on Analyzing Data to Support Reports
In general databases are two dimensional, i.e. rows and columns. Cubes are multi-dimensional databases and store the data with more than two dimensions. Cubes facilitate faster processing of large amount of data. This team focused on developing the cubes that would be used for reporting.
The team started with building dimensions and fact tables. Their key focus was to identify hierarchy for each dimension and how different data elements can be used. The team consumed the OLAP database, and providing feedback to the OLAP team on the changes required. The idea of adding flags for different addresses was one of the inputs from this team. The data analysts on this team defined the dimensions, fact tables and KPIs (Key Performance Indicators) as per the requirements of the management keeping the big picture in mind.
Reporting Dashboard Team Defined Reports and Security
This team worked with the executive team to understand their needs and define requirements. In a way this part of the process was closest to the requirements gathering processes followed for any project. The Business Analysis team here consisted of functional analysts. They defined the reports in detail including report formats and filters. They also detailed the requirements around the delivery and distribution of the reports including security. The team also identified the need for Statistical analysis. This lead to the creation of Data Mining group.
Data Mining Team Leveraged the Data for Business Insights
Data Mining team was added later to the project. This consisted of members focused on applying statistical tools to identify trends and data patterns to provide valuable business insights.
Together, we Defined Supporting Processes to Ensure Success
In addition to identifying the groups, we laid down some other processes. These processes were very instrumental in success of our project.
- Each group worked independently. It was very critical that the OLAP and the Cubes team worked independently and were focused on the bigger picture. Initially they were guiding by the reporting needs which skewed the overall framework.
- Security was a key concern. We were dealing with sensitive and confidential business information that had major repercussions if it landed in wrong hands. The need to understand reports and security together led to the development of the reporting dashboard team which streamlined the delivery of the reports and took care of the security aspects as well.
- One of the key changes was to reset the expectations of the management and other teams. Before, when we received a change request, the teams scrambled to deliver it through the cubes which resulted in a ripple effect of changes across the systems. Now, the reports were delivered from the OLAP database or even OLTP database if they could not be supported from the cubes. Of course, the reports took longer to generate but ensured the overall schema was not compromised. The underlying queries were expected to change as soon as cubes and OLAP databases supported it. To address the performance issue, a feature was developed to deliver the reports through email to the respective managers.
This post lays out about a 20,000 feet view of the entire process of building a BI solution. Depending on your industry and domain, these processes may vary.
What have been your key challenges while implementing BI solutions and how have you addressed them?