01 June 2011

Top tends in BI arena

I thought I’d share a few of my thoughts on the trends and technologies that are part of my daily life these days. From what I see in the market, the expectation from BI has changed significantly over the past few years. Now all size of companies are looking at investing in BI. They expect to spend less, get more ROI, more data accessibility, more insight into different levels of the business, more flexibility, and quicker response to the ever-changing business requirements.

Agile BI

Traditional waterfall design and development methodologies are too slow and too inflexible for BI. Many companies are now considering Agile for their BI implementations.
But don’t misinterpret what I am calling “Agile BI”, it’s not just about the Agile software development methodology, which is nothing new. IMHO Agile development by itself is not enough for BI, I think a different approach, technologies, and mindset are required to make BI more flexible
An agile software development process doesn't necessarily lead to agile BI, though it will speed up some steps in the process. The technology and processes (e.g. change management process) has to be made more flexible to really achieve agility.
Forrester defines Agile business intelligence as: An approach that combines processes, methodologies, organizational structure, tools, and technologies that enable strategic, tactical, and operational decision-makers to be more flexible and more responsive to the fast pace of changes to business and regulatory requirements.

Collaborative/social BI

Intelligence = Information + People

Collaboration and business intelligence have never been separated, but companies are just recently beginning to realize the benefits of the connection.

By introducing collaborative technologies to their business intelligence platform, companies are better positioned to utilize and take advantage of their employee data.

Collaboration is now becoming an essential part of the business intelligence. It enables crowdsourcing and in general the wisdom of the crowds tend to produce better decisions as organizations become increasingly complex.

Self-service BI

Let the business do it by themselves!
IT is always under one step behind from ever-changing business needs. The concept of self-service BI is to provide capabilities to enable business users exploring data and build their own reports. it may also enable analysts to publish and distribute their work to others.
Used in this way, the visual discovery tools become a nice tool for super users to create web-based reports and dashboards for colleagues. These reports are not static; they are “live” and interactive so casual users can analyse data just as business analysts would without the hassle of starting from scratch.

Predictive Analytics

"Predictive Analytics" is to analyse current and historical facts to make predictions about future events.

More and more vendors are attempting to make it more accessible to information workers and others without advanced degrees in statistics.
Data mining used to be mostly used by large enterprises (e.g. Banks) but these days I see companies in different sizes are started getting benefit from that.
There is also a trend in using social network data in data mining. It can be used to determine the trend, behaviour analysis and reduce customer churn.

18 October 2010

Project Phoenix - ANZ

Project Phoenix, started by SQL Server MVP Arnie Rowland, is a program which awards software, tools and training materials to under/unemployed developers who propose a software project for a non-profit agency, school or church.

More info: http://www.rodcolledge.com/rod_colledge/project-phoenix.html

28 September 2010

To PowerPivot, or not to PowerPivot?

We’ve all seen many presentations about PowerPivot and its great features like in-memory processing which enables Excel to handle millions of records. But the question is when to use PowerPivot as opposed to using a SSAS cube?
Here are some points you need to consider when choosing the tool:
Pros:
 
  • PowerPivot is a powerful tool to gather data from different data sources. It can be used to create POCs or demos in a quick (and possibly dirty) way.
  • It can be used to answer some ad-hoc questions or to play with data (data analysis).

Cons:
  • PowerPivot doesn't support data level security. Whoever gets access to the file can see all data inside it. With SSAS you can implement security at data level.
  • Complicated dimension types (like parent-child) are not supported in PowerPivot
  • Using PowerPivot workbooks as data source for reports may cause version management issues.
  • PowerPivot runs in memory, when you use PowerPivot for all your enterprise data you would need a huge RAM
  • Saving PowerPivot files in SharePoint may cause some issues when the size is too large.
  • SSAS allows you to fine tune your cube but with PowerPivot you can’t.
  • Data can easily be out of control as soon as it is saved on workstation. There also might be huge security risks involved. Even with the SharePoint implementation users have to download data to workstation first, and then publish it to SharePoint.
  • PowerPivot for SharePoint requires Enterprise license and SQL Server 2008R2 EE license. The cost (which is very high) may affect the SharePoint implementations of PowerPivot. By using PowerPivot without SharePoint (Excel add-in which is free) we may end up having even bigger Excel hell!
  • Source data is not always in high quality. We may need to validate and do data cleansing and also more often we need to apply some business rules which can be handled better in SSAS or at ETL level.
  • Working with data from different data sources may require high technical knowledge. Some data isn't that easy to join. A key isn't always available that lets you join data from different sources. It will make it hard for a non-technical person to work with the tool without any IT support.

  

From what I've seen so far, PowerPivot is not a replacement for Analysis Services, but it certainly has a role for tactical solutions. It also can be used by power users to quickly analyse data, or perhaps to build prototypes of larger scale SSAS implementations.

So I think PowerPivot is great for certain scenarios, but for Enterprise level data warehouse implementation SSAS is the way to go.

23 August 2010

New joy of my life...

This little man has delayed this blog for 4 months! Hopefully I’ll get more time to update this blog more often.

I started using SharePoint 2010 & SQL Server 2008 R2 in my new project and will share some thoughts on them very soon.

19 April 2010

SQL Server 2008 R2 - free ebook

You can download this free ebook from Microsoft Press to learn about new features of SQL Server 2008 R2.

Introducing Microsoft SQL Server 2008 R2, by Ross Mistry and Stacia Misner

Enjoy it!

08 February 2010

Gartner Magic Quadrant for BI Platforms 2010

Once again, Microsoft has been positioned in the Leaders Quadrant of the Gartner Magic Quadrant for BI Platforms (Gartner RAS Core Research Note G00173700 - 29 January 2010)

The full report can be accessed on Gartner’s web site here:
http://www.gartner.com/technology/media-products/reprints/oracle/article121/article121.html


Figure: Magic Quadrant for Business Intelligence Platforms

05 January 2010

Microsoft BI in 2010 – and my New Year’s resolution!



It’s the first week of 2010 and people have got great intensions to stick to their resolutions (I’ve been to the gym once since 1st Jan which is a good start!)

But what about the “Learning” resolutions?

I have been learning about SQL Server 2008 R2 these days and found that there are lots of new things happening in MB BI space. Whether you are a techie who loves learning new stuff, passionate about MS BI, or just want to add weight to your CV, it would be worthwhile watching out these MS technologies in 2010.

1) Master Data Services
2) PowerPivot

3) PerformancePoint Services for SharePoint Server 2010

4) SQL Azure and Cloud Computing

5) Excel Services, VISIO Services and other BI related components and APIs in MOSS 2010

And in terms of "Methodology" and delivery approach, I believe we will hear more about "Agile BI /Data Warehousing" due to the reduced IT budgets in this economy. I'll talk about it in another post.

21 October 2009

PowerPivot & Visio Services for SharePoint 2010




Gemini has got a new name: PowerPivot

Microsoft revealed that it officially branded Gemini as Microsoft PowerPivot for Excel 2010.

“With PowerPivot, users and IT departments can: Integrate massive amounts of data on the desktop from virtually any source; Perform lightning-fast calculations and analysis on large data volumes; Share and collaborate on user generated analysis through integration with SharePoint; Monitor user generated applications,” the SQL Server team representative noted.

Couple of points to consider:

  • PowerPivot for Excel only works with Excel 2010.
  • PowerPivot for SharePoint will require SQL Server 2008 R2 and SharePoint 2010.

For more information about "Managed Self Service BI" and PowerPivot see:



Visio Services: a great visualization tool which allows real time data to be shared on a SharePoint site for those without the Visio client.

Diagrams can connect to an OBDC source such as a SharePoint list, Excel spreadsheet, or SQL Server database. However it can not be used with SSAS or PowerPivot (which is a bit disappointing for BI developers). That's a great feature and I think we may be able to build fancy looking BI dashboards using Visio in near future.

20 October 2009

Self-Service BI?!

We keep hearing about "SELF-SERVE" functionalities in new versions of BI tools specifically with Microsoft plan to delivering BI to the masses.


SQL Server 2008 R2 introduced some self-service options to allow business units and users to create and share BI solutions. Here is what MS says about them:

  • SQL Server Project "Gemini” Add-in for Excel
Enables Excel power users to create BI solutions by streamlining the integration of data from multiple sources enabling interactive modeling and analysis of massive amount of data and by supporting the seamless sharing of data models and reports through Microsoft Office SharePoint 2010
  • SharePoint 2010 based Operations Dashboard
This SharePoint managed service enables front-line operators and administrators to monitor access and utilization of analyses and reports as well as track patterns of hardware usage to help ensure the right security privileges are applied and user generated solutions are available, up-to-date, and managed in a consistent way.
  • SQL Server Reporting Services Report Builder 3.0
This updated ad-hoc reporting client accelerates report creation, collaboration and consistency by allowing users to create and share report components that can be accessed via the shared component library and by enabling the rapid assembly of comprehensive business reports using these shared components.
  • Rich visualization of geospatial data
New support for geospatial visualization including mapping, routing, and custom shapes can help your end users create customized reports that leverage existing content objects, such as queries, data regions, and charts and graphs. You can also enhance location-based data reports with Bing Maps in Report Builder 3.0.


I know many people are excited about them and especially about project Gemini, here are some quotes I found on the net:

  • Gemini enables end users to work with business data
  • 100 million rows on a laptop with Office 2010 Excel
  • Slicers and spark lines in Excel 2010 are bringing BI to everyone
  • Excel 2010 provides BI for the masses by making it easy to build your own charts/graphs on and offline

I agree with all of them, BUT, the question is: are we missing the main goal of BI: “single version of the truth”. Is Gemini leading us to the next generation of “Spreadmarts”? Does having Excel and access to source data make a BI solution?

From my experience, if you ask business-users to do something which requires more than 1 minute of technical training, most probably they get confused, and in most cases, anything more than 5 clicks away is considered as “too technical” and won’t be adobted by business users. So how do we expect them to get this toy and build a BI solution?!


I believe the idea of "Self Service BI" is like a double edged sword, though it can be useful in the same way as we do self-service check out in supermarkets: do it yourself when you don’t have many items, so you can save time, but you will be responsible for that and if you get confused about something, that’s OK, you may ask a staff to help you out with that, but it may take more time than standing in line in the first place.

However, it’s a quick way to discover the underlying data, and can be handy in some cases such as the following scenarios:

  • If a business analyst needs to see what data is available in data sources
  • To analyse source data and create simple reports when underlying data is clean and in simple structure
  • To create a prototype for a reporting solution
  • BUT, when you are dealing with complicated logics to bring data in, or when there are too many data sources and you are not a BI specialist: forget about it!

I don’t want to be negative; I like Microsoft and their BI toolset, but is it the right way to go? Does it lead us to the “BI for masses” idea or towards “messy BI”?!

I’d like to hear your thoughts.

23 September 2009

Handling Ragged/Unbalanced hierarchies in SSAS

What is it?
Balanced hierarchy: In balanced (standard) hierarchies, branches of the hierarchy all have the same level (depth), with each member's parent being at the level immediately above the member.
Example: time dimension, where the depth of each level (year, quarter, month, etc.) is consistent.




Unbalanced hierarchy: the hierarchy branches can have inconsistent depths.
Example of an unbalanced hierarchy is an organization chart. The levels within the organizational structure are unbalanced, with some branches in the hierarchy having more levels than others.
The following charts illustrate the unbalanced hierarchies:



How to handle it on SSAS?
There are 2 methods to handle ragged hierarchies on SSAS:


Using Parent/child dimension
In this method we need to create parent-child dimension. Parent-child dimension can be created on top of a parent-child table/view where each record has a reference to its parent record (e.g. ParentID column). An example of this type of tables can be found on Adventure Works sample database.


Using ragged hierarchy with "Hide Member If" property
Another approach is to design a regular dimension on top of a ragged or unbalanced table/view. Ragged hierarchy: In a ragged hierarchy, the logical parent member of at least one member is not in the level immediately above the member.
Example: geography dimension which number of levels can be different on different countries. The levels provide a meaningful context to its members, thus, while Washington DC is a child of USA, it is included at the City level with Los Angeles. Therefore there will be nulls on some levels.


Note-1: We can also repeat the parent name instead of null as a placeholder.
Note-2: In the example above, 1st and 2nd record will be required when we need to assign facts to non-leaf members. In this example we may need to assign a fact to CA which is a non-leaf member.

Once we create the normal hierarchy, we can use HideMemberIf property to hide Nulls or repeated members.
HideMemberIf Setting
Description
Never
Level members are never hidden.
OnlyChildWithNoName
A level member is hidden when it is the only child of its parent and its name is null or an empty string.
OnlyChildWithParentName
A level member is hidden when it is the only child of its parent and its name is the same as the name of its parent.
NoName
A level member is hidden when its name is empty.
ParentName
A level member is hidden when its name is identical to that of its parent.



Example of an Account hierachy:




Important notes:
  • For large parent-child dimensions, aggregations are created only for the key attribute and the top attribute, therefore queries returning cells at intermediate levels are calculated at query time and can be slow. If you have a large parent-child hierarchy (more than 250,000 members), you may want to consider using a hierarchy with a fixed number of levels.
  • Microsoft recommends using P-C whenever you need to set unary operators.
  • DataMember property only works in parent-child dimensions
  • You can’t cross-join levels on a parent-child hierarchy. It might cause some complications in report development
References and more info: