
PublicBI ORBIT - On-the-project Role-based Business Intelligence Training program

I have some good news for those finding it difficult to get into the Business Intelligence field. A lot of people want to work in or learn BI, but unfortunately there aren't enough opportunities. In my view, the best way to learn BI is to join a BI team in any company, in any role, where you can learn from the seniors in the team. For those who don't have that opportunity, I have come up with two options (one free and one paid). Introducing PublicBI ORBIT, the first BI training program of its kind, based on the Public Data Warehouse project. ORBIT is like a playground where you play the game of BI in whichever role you want, experienced gamers teach you the rules of the game and guide you with best practices. As you play the game in virtual teams, you experience the true meaning of team spirit, deal with communication and coordination challenges, and much more. Please see PublicBI ORBIT for more details. For free BI learning, see www.publicdw.com

Do AI and ML mean the death of BI?

Business Intelligence is all about efficiently deriving information and insights from data to enable decision making, in order to improve the business and thereby increase profits. As businesses evolve, business intelligence will also evolve. With AI (Artificial Intelligence) and ML (Machine Learning), a lot more automation can happen in the BI space. This is already the trend: you now see more and more vendors coming up with automated insights, automated report creation, and so on. However, businesses are complex. Until AI-powered robots can make better business decisions and strategies than humans and run businesses on their own, humans will depend on BI to make informed decisions.

BI, Data Analysts and Data Scientists - confusing?

There is a lot of confusion about terminologies. Have a look at the definition of Business Intelligence on Wikipedia (Business intelligence - Wikipedia), pasted below for a quick read. “Business Intelligence (BI) comprises the strategies and technologies used by enterprises for the data analysis of business information. BI technologies provide historical, current and predictive views of business operations. Common functions of business intelligence technologies include reporting, online analytical processing, analytics, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics and prescriptive analytics. BI technologies can handle large amounts of structured and sometimes unstructured data to help identify, develop and otherwise create new strategic business opportunities.” Some people wrongly assume that reporting and visualization tools are the BI solution (the frontend) because that is the only part they see.

Extracting meaningful information from social media

Here is a real example of how an e-commerce company that sells furniture makes use of social media data. It creates a Facebook page, gets followers, comes up with ads, and pays Facebook to promote the ads and display them to potential customers. Facebook provides data about how many people viewed, liked, commented on and shared the ads, along with non-personal, summarized demographic data, to the furniture company through APIs. The furniture company has automated jobs that access these APIs to collect the data. Then the story is the same: data is collected, arranged in a way suitable for reporting and analysis, and interesting trends and patterns are found by data analysts, product owners, etc. The company finds out what percentage of those who viewed, liked, commented and shared actually converted into customers. Based on this learning, the next ad is created to target a specific group, and so on. The goal is to increase sales. Comments are also used for sentiment analysis, for example to classify them as positive, negative or neutral.
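As a rough illustration of the conversion metric mentioned above, here is a minimal Python/pandas sketch. The column names and numbers are invented for illustration; they do not come from any real Facebook API.

```python
# Hypothetical sketch of the conversion calculation described above.
# Column names (ad_id, viewed, liked, shared, converted) are assumptions.
import pandas as pd

# Engagement events collected via the ads API, one row per user and ad
events = pd.DataFrame({
    "ad_id":     [1, 1, 1, 2, 2, 2, 2],
    "viewed":    [1, 1, 1, 1, 1, 1, 1],
    "liked":     [1, 0, 1, 0, 1, 0, 0],
    "shared":    [0, 0, 1, 0, 0, 0, 0],
    "converted": [1, 0, 1, 0, 0, 0, 1],   # became a paying customer
})

# Conversion rate per ad: converted users / users who viewed the ad
summary = events.groupby("ad_id").agg(
    viewers=("viewed", "sum"),
    conversions=("converted", "sum"),
)
summary["conversion_pct"] = 100 * summary["conversions"] / summary["viewers"]
print(summary)
```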

Best practices to be considered during report development

Some of the best practices that come to mind are listed below; the main thing is to think from the user's perspective.
- Optimize the time taken for report refresh.
- Layout and arrangement of objects (charts/graphs/grids/tables, logo, header, footer, title and one or two lines of description, report refresh time, prompts answered by users).
- Choose the best chart/graph types.
- Prompts/filters that can be provided: provide a good description for each prompt the user can select or provide values for.
- Identify and set the best values as default values for prompts.
- Test and modify the report, and mention which export format (Excel or PDF) works best (if not mentioned in the functional requirements).
- Keep the number of pages to the minimum possible, but ensure there isn't so much information in the report that the user gets confused.
- Use legends and consistent colour coding for the same values across reports.
- Spell checks.
- Version the report.
- Mention any specific inclusions/exclusions/filters applied.
- Mention from where the…

Business Intelligence Architect

People become BI architects in different ways in different companies. A BI developer working in an in-house BI team for several years may get promoted to BI architect and then start performing that role. Or a BI developer/analyst/tech lead working for an IT company, with experience across multiple client projects and various BI tools and technologies, could become a BI architect with around 10 years of experience. Or a person switches to the BI architect role from a non-technical BI role (BI analyst or BI business analyst) out of interest or for better opportunities. A general set of tasks carried out by a BI architect is given below; it may not be true for all BI architects. A BI architect creates BI solution architecture diagrams and BI solution documents based on the high-level program/project requirements, creates the technical roadmap, is involved in responding to RFPs/RFIs, and is involved in BI tool selection and setting up frameworks (ETL framework, reporting templates, pro…

Ideas for BI Projects

Just a list of ideas for BI projects, irrespective of the domain (industry):
- SLA KPI reporting - Find out which SLAs are in place and see if there is a need to automate KPI reporting (some companies spend weeks of manual effort).
- Fraud prevention - Detect fraud patterns based on data and create rules to identify fraudulent transactions/behavior.
- Regulatory compliance/market compliance - Identify regulatory requirements and, again, see if there is a need to automate regulatory reporting.
- Competitor data analysis - Nielsen data, etc.
- Campaign/promotions analytics - What is the impact of a campaign?
- Revenue assurance - Ensure there is no revenue leakage by collecting and reconciling data from end to end of the business processes (a small reconciliation sketch follows below).
- 360-degree view of the customer (customer analytics).
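To make the revenue assurance idea a bit more concrete, here is a minimal, hypothetical reconciliation sketch in Python; the stage names and amounts are invented for illustration.

```python
# Hypothetical end-to-end revenue reconciliation: compare totals per order
# across two stages of the process (e.g. orders booked vs. amounts billed).
import pandas as pd

orders = pd.DataFrame({"order_id": [1, 2, 3, 4],
                       "booked_amount": [100.0, 250.0, 80.0, 40.0]})
billing = pd.DataFrame({"order_id": [1, 2, 4],
                        "billed_amount": [100.0, 200.0, 40.0]})

# Outer join so missing records on either side also show up
recon = orders.merge(billing, on="order_id", how="outer")
recon["difference"] = recon["booked_amount"].fillna(0) - recon["billed_amount"].fillna(0)

# Any non-zero difference is potential revenue leakage to investigate
leakage = recon[recon["difference"] != 0]
print(leakage)
```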

BI in Human resources domain

Business Intelligence projects I have implemented in the past with HR data cover learning management, job applications, compliance and work planning.
- Learning Management - The goal is to support managers in identifying which trainings are relevant for their team, follow up with team members if trainings are pending, and plan trainings well ahead so that employees have enough time to complete them.
- Job Application - The goal is to reduce the turnaround time between a new job application and the selection/rejection decision, improve the hiring-to-application ratio, and compare success ratios between different hiring channels to identify the best channel for a similar job (see the sketch below).
- Compliance - The goal is to ensure employees are compliant with respect to working hours, special compliance rules for minors, and minimum meal/break timings, and to identify patterns and prevent compliance issues, as these can cause a lot of financial and non-financial damage to the company.
- Work Planning - The goal is to automatically find the right…
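As a rough sketch of the job application turnaround and channel comparison described above, assuming hypothetical column names (applied_date, decision_date, channel, hired):

```python
# Hypothetical turnaround-time and hiring-ratio calculation per channel.
import pandas as pd

apps = pd.DataFrame({
    "application_id": [1, 2, 3, 4],
    "channel":        ["referral", "job_board", "referral", "agency"],
    "applied_date":   pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-05", "2018-01-06"]),
    "decision_date":  pd.to_datetime(["2018-01-10", "2018-01-25", "2018-01-12", "2018-02-01"]),
    "hired":          [1, 0, 1, 0],
})

# Days between application and the selection/rejection decision
apps["turnaround_days"] = (apps["decision_date"] - apps["applied_date"]).dt.days

# Average turnaround and hiring ratio per channel
per_channel = apps.groupby("channel").agg(
    avg_turnaround_days=("turnaround_days", "mean"),
    hiring_ratio=("hired", "mean"),
)
print(per_channel)
```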

Does automation make DWH developers / BI developers redundant?

Question: Does automation make DWH developers/BI developers redundant? Answer: No. I think quite a lot of tasks will be done automatically, but not all. DWBI developers will use tools to automatically create most parts of the deliverables (e.g. ETL jobs, reports, dimensional models, metadata); however, some deliverables will be custom requirements that will have to be done manually, or the automation will have to be enhanced to deal with them. The way things are done could be different, and the profile of a DWBI developer could also change. We already see some examples of automation: if you specify the list of source tables and target tables, a number of ETL jobs are automatically created in seconds by some of the ETL tools. Earlier, an ETL developer had to develop every single job, even if it was a simple one. Automated ETL job creation can be very useful for one-to-one data migration, but as soon as business-specific logic needs to be placed in the transformation, someone needs to specify…
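To illustrate the kind of one-to-one job generation described above, here is a minimal, hypothetical Python sketch; real ETL tools generate jobs in their own proprietary formats, not like this.

```python
# Hypothetical generator for simple one-to-one "copy" jobs from a mapping list.
# Real ETL tools produce their own job artifacts; this only illustrates the idea.
table_mappings = [
    ("src_customers", "stg_customers"),
    ("src_orders", "stg_orders"),
    ("src_products", "stg_products"),
]

def generate_copy_job(source: str, target: str) -> str:
    # A trivial one-to-one load; no business logic, so it can be generated.
    return f"INSERT INTO {target} SELECT * FROM {source};"

jobs = [generate_copy_job(src, tgt) for src, tgt in table_mappings]
for job in jobs:
    print(job)

# As soon as a business-specific transformation is needed (filters, lookups,
# derived columns), a developer still has to specify that logic explicitly.
```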

MOLAP, ROLAP and HOLAP

MOLAP (multi-dimensional online analytical processing) is a type of OLAP. The other type is ROLAP (relational online analytical processing). In the old days, BI reporting and analysis tools implemented either the MOLAP or the ROLAP feature, and the tools were accordingly described as MOLAP tools or ROLAP tools. Now, BI reporting and analysis tools are equipped with both MOLAP and ROLAP features, and hence they are also referred to as HOLAP (hybrid OLAP) tools. As these concepts are already extensively written about in various other places, I will just highlight the major differences. MOLAP - All the possible combinations of calculations are pre-calculated and the calculated data is stored as cubes to enable faster/shorter response times at query time. ROLAP - Data is stored in relational format (RDBMS) but modeled using dimensional modeling techniques to make it look like a cube without a physical cube; there are no pre-calculations, which avoids unnecessary storage of calculated data th…
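As a loose illustration of the difference (not how any real OLAP engine is implemented internally), the sketch below pre-computes an aggregate once (MOLAP-style) versus aggregating at query time (ROLAP-style).

```python
# Loose illustration of MOLAP-style pre-aggregation vs ROLAP-style on-the-fly
# aggregation; data and query are invented for illustration.
import pandas as pd

sales = pd.DataFrame({
    "year":    [2017, 2017, 2018, 2018],
    "country": ["DE", "US", "DE", "US"],
    "revenue": [100, 200, 150, 250],
})

# MOLAP-style: pre-calculate the aggregate once and store it (the "cube")
cube = sales.groupby(["year", "country"], as_index=False)["revenue"].sum()

def molap_query(year: int, country: str) -> int:
    # Query answers come straight from the pre-calculated cube
    row = cube[(cube["year"] == year) & (cube["country"] == country)]
    return int(row["revenue"].iloc[0])

def rolap_query(year: int, country: str) -> int:
    # Aggregate is computed on the fly from the relational (detail) data
    subset = sales[(sales["year"] == year) & (sales["country"] == country)]
    return int(subset["revenue"].sum())

print(molap_query(2018, "DE"), rolap_query(2018, "DE"))
```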

BI - From good to have to must have

Some of the general changes in Business Intelligence that I have observed over the years: In the past, mainly big/large companies invested in BI. BI was a "must have" for big companies, whereas for the SME (small and medium enterprises) sector it was a "good to have". Now that has changed; even SMEs consider BI a "must have". This emphasizes the fact that more and more decisions are now made based on data. In the past, executives had to be convinced that the organization/department needed a BI solution. This is now changing, and BI is becoming a default function (like legal, accounts, HR) within the organization. In the past, mostly strategic decisions were made based on data; now, even operational decisions are mostly based on data. I don't know the percentage increase in operational BI usage, but there is definitely an increasing trend. Operational BI has enabled ground-level staff to perform better. As the number of ground-level staff is usually much higher than…

Data Warehouse characteristics

A data warehouse is usually part of a business intelligence solution. You can also have a data warehouse for regulatory reporting purposes or legal requirements. The data warehouse is where the data is historized (versioned) and centrally stored after cleansing, transforming and unifying the data from one or more data sources. A data warehouse is designed in such a way that it makes reporting and data analysis on large amounts of data easy. Without a data warehouse it would be very difficult and time consuming to create consolidated reports based on data from various sources, and also to produce time-trend reports, as most source systems store only the current snapshot of the data for operational reasons. A data warehouse (DWH) in its simplest form is a data repository/store specifically modeled/designed for high-performance, efficient reporting and analysis of historic, current and calculated data. Usually a good business intelligence solution is backed by a data warehouse. In a da…
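The historization (versioning) mentioned above is often implemented with slowly changing dimensions; here is a minimal, hypothetical Type 2 sketch with invented column names.

```python
# Minimal sketch of Type 2 historization (versioning) in a data warehouse:
# instead of overwriting a changed attribute, close the old row and add a new one.
from datetime import date

# Current warehouse rows: one open version per customer (valid_to = None)
customer_dim = [
    {"customer_id": 42, "city": "Munich", "valid_from": date(2017, 1, 1), "valid_to": None},
]

def apply_change(rows, customer_id, new_city, change_date):
    for row in rows:
        if row["customer_id"] == customer_id and row["valid_to"] is None:
            if row["city"] != new_city:
                row["valid_to"] = change_date          # close the old version
                rows.append({"customer_id": customer_id, "city": new_city,
                             "valid_from": change_date, "valid_to": None})
            return

# Source system now says customer 42 moved to Berlin
apply_change(customer_dim, 42, "Berlin", date(2018, 1, 15))
for row in customer_dim:
    # Both the old and new versions are kept, enabling point-in-time reporting
    print(row)
```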

Working of an ETL Tool

ETL stands for Extract, Transform and Load. ETL is one of the main components of a BI solution backed by a data warehouse. ETL is also used in other projects, such as data migration and data integration projects. ETL flows/jobs can be built by scripting or by using ETL tools. Most companies currently use one of the existing ETL tools to build ETL flows/jobs. In short, ETL tools abstract the technical complexity and thereby enable developers to focus on "what needs to be done" rather than on "how it needs to be done". For example, a developer doesn't have to bother about developing a connector to a database; developers design the ETL flow/job using the drag-and-drop, click-and-configure GUI of the ETL tool, and also run, test, debug and schedule the ETL jobs using the same GUI. ETL tools provide a visual framework for ETL developers to design ETL jobs and a level of abstraction over the code/script. There are many features that an ETL tool provides, but I won't go in t…
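As a bare-bones illustration of what an ETL flow does behind the graphical job an ETL tool generates, here is a tiny hand-written sketch; the file name, columns and transformation rule are invented.

```python
# Tiny hand-written ETL flow, illustrating what a graphical ETL job does behind
# the scenes. File name, columns and the transformation rule are invented.
import csv
import sqlite3

def extract(path):
    with open(path, newline="") as f:
        return list(csv.DictReader(f))          # rows with order_id, amount, currency

def transform(rows):
    for row in rows:
        # Example business rule: convert USD amounts to EUR
        if row["currency"] == "USD":
            row["amount"] = float(row["amount"]) * 0.85
            row["currency"] = "EUR"
        yield row

def load(rows, conn):
    conn.execute("CREATE TABLE IF NOT EXISTS orders (order_id TEXT, amount REAL, currency TEXT)")
    conn.executemany("INSERT INTO orders VALUES (:order_id, :amount, :currency)", list(rows))
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect("dwh.db")
    load(transform(extract("orders.csv")), conn)
```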

Need for BI Tool when Excel exists

Firstly, I would like to clarify that BI tools are not only BI reporting tools. Unfortunately, BI software vendors market their BI reporting tool (which is one part of a BI solution) as the BI solution. A BI reporting tool mainly provides reporting, analysis, visualization and portal capabilities, along with several other smaller features. But this is not always the whole BI solution. Yes, you can just use a BI reporting tool on top of a transactional database and say that you have a BI solution. But this solution will lack data historization/versioning, so some strategic questions cannot be answered by this kind of BI solution, as point-in-time data is missing. Only the operational and tactical BI needs are met. If this is good enough for the business, then it's fine. However, this is not always enough. This is why a BI solution usually includes a data warehouse, and to load this data warehouse there is usually ETL (script or tool). It is not easy, and not a small effort, to set up a go…

Kimball vs Inmon - ROI

In general, the Kimball approach (bottom-up with conformed dimensions) provides faster ROI compared to the Inmon approach (top-down). With the Kimball approach, business users get to start using the BI solution earlier than with the top-down approach. They are able to provide feedback earlier, and hence the solution gets improved and enhanced quicker. In the Kimball approach, as the most important subject area is usually modeled and built first, once the users of the first subject area are happy with the results, it already sends a positive message to the next set of users/departments. On the other hand, several problems (people moving, team changes, projects getting scrapped, lack of ownership, etc.) may come up in the top-down approach, which takes a year or more before end users can start using the solution. Note that there are always exceptions.

Data mining

In data mining, you are looking for hidden information, but without any idea about what type of information you want to find or what you plan to use it for once you find it. As and when you dig into the data and discover interesting information, you start thinking about how to make use of it to improve the business. Example: a data miner starts digging into the call records of a mobile network operator without any specific targets from his boss. The boss probably gives him a quantitative target to find at least 2 new patterns in a month. As he digs into the data he finds a pattern: there are fewer international calls on Tuesdays (remember, it is an example) compared to all other days. He shares this information with management, and they come up with a plan to reduce international call rates on Tuesdays and start a campaign. Call volumes go up, customers are happy with the low call rates, more customers sign up, and the company makes more money as utilization has increased. Watch out for these…
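A tiny sketch of the weekday pattern the example describes, with made-up call records:

```python
# Made-up call records; group international calls by weekday to look for the
# "fewer calls on Tuesday" kind of pattern from the example above.
import pandas as pd

calls = pd.DataFrame({
    "call_date": pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-02",
                                 "2018-01-03", "2018-01-08", "2018-01-09"]),
    "is_international": [1, 1, 0, 1, 1, 0],
})

intl = calls[calls["is_international"] == 1].copy()
intl["weekday"] = intl["call_date"].dt.day_name()

# Count of international calls per weekday; a consistently low count on one
# weekday is the kind of pattern a data miner would report to management.
print(intl.groupby("weekday").size().sort_values())
```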

ETL Testing

Usually in data warehousing/business intelligence projects, ETL (Extract, Transform, Load) tools such as Informatica, DataStage, Talend, etc. are used to design ETL jobs. Some companies still use scripting to develop ETL jobs. ETL testing can be as simple as testing an ETL job manually: running it from a GUI, verifying that the job runs, and validating the loaded data against expected data. Or it can be complex: automating test data creation using scripting, SQL and tools like SoapUI; using tools like Jenkins (or the ETL tools themselves) to trigger the automated test data creation scripts, then automatically running the actual ETL job, automatically comparing the loaded data against baselined data, highlighting anomalies (if any), and checking the data against expected results for every single test case created earlier; and integrating the test scripts and test data creation scripts with the full set of ETL jobs so that they can be run during regression testing of the ETL…
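A minimal sketch of the "compare loaded data against a baseline" step, with invented table contents; in practice both sides would come from database queries.

```python
# Minimal sketch of comparing loaded data against a baseline after an ETL run.
import pandas as pd

baseline = pd.DataFrame({"customer_id": [1, 2, 3], "revenue": [100, 200, 300]})
loaded   = pd.DataFrame({"customer_id": [1, 2, 3], "revenue": [100, 250, 300]})

# Full outer comparison keyed on customer_id; mismatches are test failures
diff = baseline.merge(loaded, on="customer_id", how="outer",
                      suffixes=("_expected", "_actual"))
diff["match"] = diff["revenue_expected"] == diff["revenue_actual"]

failures = diff[~diff["match"]]
if failures.empty:
    print("ETL test passed")
else:
    print("ETL test failed:\n", failures)
```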

Data Wrangling

In the context of business intelligence, data wrangling is converting raw data into a form useful for aggregation/consolidation during data analysis. Before data is analyzed/visualized, we need to ensure that we have unified the data. A simple example: if you want to visualize the number of customers by city, then you need to ensure that there is only one row per city before data visualization. If you have two rows like "Muenchen" and "Munich" representing the same city, this could lead to wrong results. One of the rows has to be changed manually by the data analyst/user; this is done by creating a mapping on the fly in the visualization tool, which is applied to every row of data to detect more such issues, and the process is repeated for other cities. In a BI solution backed by a data warehouse, all of this data transformation, cleaning, mapping, etc. is handled by the ETL/ELT before the data is presented to the user, and hence the end user doesn't have to bother about these data preparatory steps.
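The Muenchen/Munich example above as a tiny pandas sketch; the data and mapping are invented for illustration.

```python
# Unify city spellings before counting customers by city (the Muenchen/Munich
# example above). Data and mapping are invented.
import pandas as pd

customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "city": ["Munich", "Muenchen", "Berlin", "Munich"],
})

# Mapping created by the analyst (on the fly in a viz tool, or in ETL in a DWH)
city_mapping = {"Muenchen": "Munich"}
customers["city"] = customers["city"].replace(city_mapping)

# Without the mapping, Munich would be split across two rows
print(customers.groupby("city").size())
```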

Big data hype - is BI still relevant?

Firstly, data on its own is of no use, no matter whether it is big or not. Secondly, big data doesn't necessarily mean big/high returns. Too much has been said and written about big data. It is time people realize that today's big data is tomorrow's normal data. In the end, it is just data that was earlier not captured, stored, processed and used for analysis, which can now be done thanks to technological advancement. Business can gain insights from data using a business intelligence solution (not referring just to the front-end reporting tool but to the whole solution). The source data for the BI solution could come from Excel files or relational DBs or XML files or log data or sensor data or something else. These are all different types of data generated during the functioning of the business. Some are small in volume, less frequent and have a defined structure; some are huge in volume, more frequent and semi-structured or unstructured; and some are a combination of these. Why give so much importance t…

Consolidating content

In the last one and a half years I have created a lot of content on business intelligence across various sites like Quora, LinkedIn, Facebook, Twitter, company websites, etc. I now think it is best if I consolidate all the BI-related content in one place, on this blog. Compared to all the other sites, this blog is the best for one main reason: users don't have to sign in. So over the next couple of days you will notice many useful posts on this blog from me. Happy reading.

Use of the Certified Business Intelligence Professional (CBIP) certification from TDWI

Based on my own experience, I consider CBIP highly valuable. We can use it to our advantage depending on our situation. There are several uses; some are listed below, hope this helps.
- As part of day-to-day BI/DWH work we gradually forget the theory, i.e. we know why we do something in a specific way but we may forget the terminologies. When we take up CBIP and prepare seriously for it, it builds BI vocabulary and refreshes the theoretical and conceptual parts. It then becomes easy for us to explain to others, with the proper terminology, why we do certain things a certain way.
- If you are the only BI professional in an organization/department, others who probably don't know BI can more easily trust you if you have a certification from a reputed institute. CBIP is from a reputed institute (TDWI), and this makes it easy for you to tell others in the organization/department to trust you on your BI skills, as you are a CBIP.
- If there are several BI professionals in the org/department, CBIP certificat…

Self-Service BI

Self-Service BI - Users are able to serve their day-to-day reporting and analysis requirements without the involvement of a BI team/IT team. The service is not set up by the users themselves; usually the BI team sets up/enables this self-service by creating governed, metadata-based reporting using various tools like Business Objects, MicroStrategy, Cognos, etc. at the frontend (user access) of the BI solution. Usually there are other tools (ETL, RDBMS) at the back end of the BI solution. Without BI team/IT involvement, these tools are only as good as Excel, or even less useful than Excel. If the data is not consolidated (integrated), not cleaned and not governed, you can imagine the outcome of users using this data. I guess the below analogy gives you an idea. Self-service BI in reality - You are hungry, you go to a buffet restaurant, you find multiple dishes well arranged and labelled (name, veg/non-veg, spicy, hot, very hot, etc., contains x, y, z), you pick up what you want, …

Business Intelligence challenges

This is a list of challenges faced by BI teams. Not a comprehensive list, just the points that are right off the top of my head:
- Data quality issues
- Lack of source application/system knowledge
- Lack of executive sponsorship and continuous support
- People at the top not having a longer-term vision
- BI tool selection
- Source system changes with no (or late) communication to BI teams
- Lack of funding for BI teams compared to core development teams
- Silos of BI teams
- The first deliverable (e.g. a report) takes time, as the foundation (data warehouse) is being built
- Inter-department politics
- Lack of skilled team members
- Data governance/ownership issues, master data management issues
- Lack of business understanding
- Not giving importance to data informability

Watching 2D movie with 3D glasses?

What's the point in watching a 2D movie with 3D glasses? Either the movie has to be in 3D, or the 2D movie has to be enhanced. Vendors will try to sell you fancy BI frontend tools (the 3D glasses) loaded with buzzwords, but if you don't have the data (the content in 3D), what's the point? Give importance to capturing data, whether you are a commercial business or an individual.
