What is the difference between a star schema and a snowflake schema, and when would you use each in a data warehouse?

Star schema: a central fact table surrounded by denormalized dimensions. Simpler to query, better analytical query performance because it requires fewer JOINs. Snowflake schema: dimensions are normalized into sub-dimensions (for example, the product dimension is split into product, category, and subcategory as separate tables). Saves storage space and reduces dimension data redundancy, but requires more JOINs and greater query complexity. In practice, the star schema is preferable for most BI use cases because query performance and business usability outweigh the normalization benefit of the snowflake schema.

How would you ensure that the metrics shown in a dashboard are consistent with the numbers reported by other company systems?

Establish a periodic reconciliation process: compare the dashboard's key metrics against the reference system (ERP, billing system) for the same periods and verify they match within a defined tolerance. Document known differences and their justification: the dashboard may exclude returns while the ERP includes them, and that difference must be documented. Automate validations where possible with alerts that notify when the dashboard diverges from the reference system beyond the threshold. Publish the exact definition of each metric alongside the dashboard so users understand what they are viewing.

How would you decide which type of chart to use to communicate different kinds of information in a dashboard?

Comparison between categories: horizontal bar chart sorted from highest to lowest. Metric evolution over time: line chart. Parts of a whole: stacked bar chart or treemap for many categories — never a pie chart with more than five segments. Value distribution: histogram or boxplot. Correlation between two variables: scatter plot. Tables when the user needs exact values and wants to make their own calculations. The rule is to choose the chart type that communicates the message with the greatest clarity for the specific user — not the one that appears most sophisticated or is the designer's favorite.

What is the difference between a calculated measure and a calculated column in Power BI, and when would you use each?

A calculated column is evaluated row by row at model load time and stored in the table. Useful for values needed in filters or slicers, or for enriching dimensions. A calculated measure is evaluated at query time in the context of the active filter. Useful for aggregated metrics that must be calculated dynamically based on user filters. Measures are more memory-efficient because they do not store pre-calculated values. For metrics like total revenue, average per customer, or conversion rate, always use measures. Calculated columns are appropriate for fixed classifications such as customer segments or price ranges.

How would you handle a stakeholder request to add 15 new metrics to an existing dashboard?

Do not add all 15 without evaluation. Ask the right questions first: what decision does each metric enable? Which are the three most important to start with? Frequently, after that conversation, the number of truly necessary metrics reduces to five or six. If there are more legitimate metrics, evaluate whether the current dashboard is the right place or whether an additional, more focused dashboard is needed. A dashboard with 20 metrics produces analysis paralysis; one with five focused metrics produces action. Document the metrics that were proposed but not included, with the reasoning, for future reference.

What are Slowly Changing Dimensions (SCD) and what are the differences between Types 1, 2, and 3?

SCDs manage changes to dimension attributes over time. Type 1: overwrite the previous value without maintaining history. Useful when history has no value (correcting a typo in a customer name). Type 2: create a new row with the new value and mark the previous one as inactive with validity dates. Maintains the complete history and allows analyzing customer behavior under the segment they held at the time of a purchase. Type 3: add an additional column for the previous value. Maintains only the immediately prior change — useful when only comparison with the previous state is needed. In most analytical BI contexts, Type 2 is the most valuable because it preserves the complete history.

How would you handle a situation where two business areas have different definitions of the same metric and their reports do not match?

Do not resolve the discrepancy technically before resolving it with the business. Facilitate a meeting with representatives from both areas to document the two definitions with their exact formulas and their inclusions and exclusions. Evaluate which definition is more appropriate for each area's purpose: both can be valid for different purposes. If a unified definition is required, work with the business to formally agree on it. Implement the agreed definition in the centralized semantic layer and deprecate the reports using the old definitions with a communicated transition period. Metric inconsistency is a governance problem — not just a technical one.

How would you ensure that a BI dashboard remains reliable after changes in the upstream data sources?

Implement automated alerts that detect: missing data for an expected period, volumes significantly different from the historical pattern, values outside the expected range for key metrics. Establish data quality tests in the pipeline feeding the warehouse that block the dashboard refresh if data fails validations. Document the dashboard's dependencies on its sources so you know what can be affected when an upstream system changes. Coordinate with the data team to receive advance notification of pipeline or source schema changes before they happen — not after.

How would you write a DAX measure in Power BI to calculate revenue for the same period last year for a YoY comparison?

Use DAX's SAMEPERIODLASTYEAR function which handles time intelligence automatically: Revenue LY = CALCULATE([Revenue], SAMEPERIODLASTYEAR('Calendar'[Date])). This requires a calendar table marked as a date table in the model. For greater control, use DATEADD: Revenue LY = CALCULATE([Revenue], DATEADD('Calendar'[Date], -1, YEAR)). The YoY variation is calculated as: YoY % = DIVIDE([Revenue] - [Revenue LY], [Revenue LY]). Wrapping in DIVIDE instead of dividing directly automatically handles division by zero when there is no prior year data.

How would you model a many-to-many relationship between orders and promotions in the data warehouse, where an order can have multiple promotions and a promotion can apply to multiple orders?

A many-to-many relationship cannot be modeled directly with a fact table and dimensions in a traditional star schema. Options: create a bridge table containing order-promotion combinations as individual rows, with the impact metrics of each promotion on each order (discount applied, affected lines). Another option in Power BI is using the native many-to-many relationship with an assignments fact table. The bridge table is more explicit and portable across tools. Promotion effectiveness analysis aggregates orders through the bridge table using the promotion key.

How would you optimize the performance of a Power BI dashboard that takes more than 30 seconds to load with the current model?

Diagnose with Power BI's Performance Analyzer to identify which visualizations generate the slowest queries. Review the data model: remove unnecessary columns not used in visualizations or metrics, reduce granularity if the level of detail is not needed for the dashboard's questions. Verify that table relationships use integer columns rather than text for joins. Review DAX measures: replace FILTER with CALCULATETABLE where possible, avoid iterators like SUMX over large tables when a simple measure achieves the same result. Consider materializing the most expensive metrics as calculated columns if they are frequently used in filters and do not require dynamic context.

How would you design the semantic layer of a data model so that business analysts can build their own reports without accessing SQL directly?

The semantic layer abstracts the technical complexity of the data model by exposing metrics and dimensions in business terms. In Power BI: a model published in the service with security roles and pre-calculated DAX metrics that users consume from Excel or Power BI reports without seeing the underlying model. In Looker: LookML that defines explores, dimensions, and measures with pre-defined joins. In dbt: metrics defined in schema.yml consumable from connected BI tools. Key design principles: name everything in business terms (Revenue instead of sum_amount_excl_tax), document each metric with its definition, and hide technical fields that should not be used directly.

How would you implement Row-Level Security in Power BI so that each regional manager only sees data for their region?

Define security roles in Power BI Desktop with DAX filters per role: the 'Regional Manager' role filters the regions table with [Region] = USERPRINCIPALNAME() if there is a table mapping users to regions, or with an expression comparing the region field to the user's Azure AD attribute value. Publish the model with the defined roles. In the Power BI service, assign users or Active Directory groups to each role. Users assigned to a role automatically see only the data the DAX filter allows. For models with complex RLS (a user may have access to multiple regions), use a user-to-region mapping table and a filter that verifies whether the user is in the access list for each row.

How would you build a retention cohort analysis report in Tableau or Power BI?

Cohort analysis requires calculating, for each user group defined by their acquisition date, how many remain active in each subsequent period. In SQL: calculate the cohort (month of first event) and the relative period (difference in months between the activity date and the cohort date) for each active user event. In the data warehouse, this calculation is done before importing into BI. In Power BI, create DAX measures that calculate active users by cohort and period using CALCULATE with relative period filters. The resulting visualization is a matrix with cohorts in rows, periods in columns, and retention rates as values, with conditional formatting coloring from green to red based on the percentage.

How would you manage the governance of a Power BI environment with dozens of workspaces and hundreds of reports with no centralized control?

Implement a layered governance model. Certified data workspace: only the BI team can publish; everyone can read. Departmental workspaces: each area can publish but with a naming convention and review process. Personal workspaces: for exploration only, with no distribution to other users. Implement a report certification process with clear criteria: metric validation, source documentation, data owner approval. Use the Power BI APIs to audit the report inventory, their data sources, and their actual usage (how many users open them). Reports with no usage in the last 90 days are archived or deleted. Governance without automation does not scale; the Power BI administration APIs are the technical enabler of control at scale.

How would you evaluate whether an existing dimensional model in the data warehouse adequately supports a business area's new analytical requirements?

Map the new analytical requirements against the existing model: what analysis dimensions (by which customer, which product, which channel) does the new requirement need? Do those dimensions exist in the model? Can the necessary metrics be calculated from existing fact tables? The identified gaps are of three types: missing dimensions (require a new dimension table or enriching an existing one), insufficient granularity (the fact table lacks the required level of detail), or metrics that require crossing facts from different tables (require a new fact table or a view combining both). With this analysis, the effort estimate for the model extension is accurate before committing to the stakeholder.

How would you design the BI architecture for a fast-growing organization with multiple source systems that needs different teams to access reliable data autonomously?

Layered architecture. Ingestion layer: pipelines that replicate source systems to the data lake with minimal transformation. Transformation layer (dbt on Snowflake, BigQuery, or Redshift): staging models that clean and standardize each source, intermediate models that combine business entities, and mart models that expose BI-ready data by business domain. Semantic layer: certified metrics defined once and consumable from any BI tool. Presentation layer: certified dashboards for common use cases plus self-service capability for teams that want to explore. The key is that each layer has a clear owner and a defined governance process before the organization scales.

How would you measure the real impact of the BI platform on the organization to justify the investment and guide its evolution?

Adoption metrics: weekly active users, number of reports consulted, departments with active users. Quality metrics: data incident rate reported by users, mean time to resolve discrepancies. Efficiency metrics: BI team time spent on ad-hoc reports versus platform improvements (a mature platform frees team time). Business impact metrics: quarterly interviews with leaders about decisions made using dashboards and the estimated value of those decisions. The most difficult but most valuable metric is connecting dashboard usage with improvements in the business metrics they monitor: if the sales team started using the pipeline dashboard and the close rate improved, that correlation is worth documenting.

How would you approach migrating a legacy BI platform (Excel reports or an obsolete tool) to a modern one without interrupting critical business reporting?

Inventory all existing reports with their usage frequency, audience, and criticality before touching anything. Classify into three categories: critical (migration priority 1), used but not critical (priority 2), and reports nobody remembers requesting (candidates for no migration). Start with the least complex reports to develop the methodology and build business confidence. For each migrated report, run both versions in parallel during a validation period where the business confirms the numbers match before deactivating the legacy report. Communicate the plan and timeline to users well in advance. Do not attempt to migrate everything at once: phased migration reduces risk and allows process adjustment with each batch.

How would you manage building a consolidated financial report that integrates data from multiple subsidiaries with different currencies, fiscal calendars, and accounting structures?

Multi-entity financial consolidation is the most complex BI use case. Prerequisites: a time dimension table that supports multiple fiscal calendars simultaneously with equivalent periods between entities. A historical exchange rate table for currency conversion at the required reporting date. A unified accounting model that maps each subsidiary's chart of accounts to the corporate chart of accounts. In the warehouse, transactions from each entity are loaded with their original currency and the conversion to the reporting currency calculated at ingestion time. The BI layer always operates in the consolidated currency. The complexity of intercompany transaction elimination must be managed in the warehouse before exposing data to BI.

How would you design the change management process for data model changes when a warehouse modification affects multiple production dashboards and reports?

A data model change is equivalent to a breaking change in an API: it affects all consumers. The process: document the complete inventory of reports and dashboards with their table and field dependencies (Power BI and Tableau administration APIs can automate part of this). Before any change, run an impact analysis identifying which reports will be affected. Communicate the change well in advance to the owners of affected reports, giving them sufficient time to update their reports before the change is applied to production. Use warehouse versioning (schema versioning) so reports can continue using the previous version during the transition. Automate regression tests that verify reports continue producing the same values after the change.

How would you structure the BI team and processes for a company scaling from 50 to 500 employees with growing analytical needs across multiple business areas?

At 50 employees, a generalist BI analyst covers all needs. At 500, specialization is necessary. Recommended structure: a central BI team (3-5 people) that manages data infrastructure, the dimensional model, metrics governance, and executive dashboards. BI analysts embedded in major business areas (commercial, product, operations) who deeply understand the domain and build area-specific reports using the central team's infrastructure. The central team establishes standards, tools, and processes. Embedded analysts have autonomy within those standards. Without this structure, the central team becomes a bottleneck that cannot serve all needs; with it, self-service becomes possible with governance.

Presenting visually elaborate dashboards without being able to articulate what business decisions they enable

A visually polished dashboard that nobody uses to make decisions is a BI failure, not a success. Interviewers at impact-oriented BI teams ask what decision the stakeholder changed after gaining access to the dashboard. A BI Analyst who cannot answer that question is measuring their work in aesthetic outputs, not in business outcomes.

Not having a clear principle for distinguishing what belongs in the data transformation layer and what belongs in the visualization layer

Complex business metrics calculated in the visual tool (with DAX or Tableau calculated fields directly on raw tables) produce dispersed, hard-to-maintain business logic. Technical interviewers ask where the calculation logic for critical metrics lives and why. A BI Analyst who cannot articulate that distinction will produce models that are difficult to maintain and metrics that become inconsistent.

Not mentioning metrics governance when discussing building dashboards for multiple business areas

When different departments have dashboards with same-named metrics calculated differently, the organization loses trust in its data. A BI Analyst who builds dashboards in silos without a metric definition and standardization process is creating the exact problem that experienced interviewers will immediately identify.

Describing data models without mentioning performance and maintainability considerations at scale

A data model that works well with one million records can become unusable with one hundred million. A BI Analyst who does not consider data volume, update frequency, and query performance when designing a model will produce solutions that scale poorly and require costly redesign as the business grows.

Not being able to explain the differences between BI tools and when each is most appropriate

Power BI, Tableau, and Looker have different strengths and limitations. A candidate who says they can use any of them without being able to articulate the fundamental differences (Power BI's semantic model versus Tableau's server-based approach versus Looker's LookML layer) demonstrates superficial knowledge of the tools they claim to use.

Treating data quality as someone else's responsibility and assuming warehouse data is perfect by definition

Production data has problems: duplicates, unexpected nulls, uncommunicated schema changes. A BI Analyst who does not have a process for validating the quality of data feeding their dashboards will produce reports with errors that users detect before the analyst does — destroying trust in the platform.

Business Analyst

Transforms the organization's data into business intelligence that allows leaders to make strategic decisions with complete visibility.

A BI Analyst (Business Intelligence Analyst) designs, builds, and maintains the business intelligence systems that provide the organization with an integrated view of its performance. Their work spans data modeling in the data warehouse through building executive dashboards and defining the metrics that measure business success. Unlike a Data Analyst who answers ad-hoc questions, the BI Analyst builds reusable analytical infrastructure that enables multiple teams to access reliable data autonomously. They work closely with the data team, business leaders, and operational teams.

Power BITableauSQLData ModelingDAXETL

Recruit the best Business Analyst here

Start now

Main Responsibilities

•Design and build dimensional data models in the data warehouse that support the organization's analytical requirements.
•Develop and maintain dashboards and reports in BI tools that allow teams to monitor their KPIs autonomously.
•Define and standardize business metrics, ensuring the entire organization uses the same definitions and calculations.
•Collaborate with the data team to design the transformations and models that feed BI solutions.
•Manage the report lifecycle: from requirements gathering through publication, maintenance, and user training.
•Ensure the quality and reliability of data presented in dashboards through automated validations and reconciliation processes.

Key Skills

Technical Skills

BI and visualization tools: Power BI with DAX and Power Query, Tableau with calculations and LOD expressions, or Looker with LookML
Dimensional modeling: star and snowflake schema design with fact tables, dimensions, and calculated metrics
Advanced SQL for data extraction and transformation from multiple sources into the data warehouse
DAX (Data Analysis Expressions) or Tableau Calculated Fields for complex calculated metrics in the semantic layer
Basic ETL knowledge and data pipeline understanding to diagnose data quality issues and understand how data arrives in the warehouse
Effective visualization design: visual perception principles, choosing the correct chart type, and building hierarchical dashboards

Soft Skills

Translating business requirements into data models and visualizations that answer exactly the right questions
Communication with business leaders to understand what decisions they need to make and designing dashboards around those decisions
Rigor in metric definition to prevent different business areas from using different definitions of the same KPI and generating inconsistent data
Ability to document data models and metrics clearly so others can use and maintain them
Managing multiple stakeholders with different needs about the same dataset without creating duplicate dashboards that later diverge
Judgment to simplify: most dashboards benefit from showing fewer metrics with more clarity rather than more metrics with more complexity

Real use cases

Context

Executive dashboards are the tool that allows organizational leaders to monitor the state of the business and detect deviations from objectives without depending on manual reports.

Real examples

CEO dashboard with financial, operational, and product KPIs integrated into a single view
Business area scorecards with traffic-light status versus period targets
Commercial performance view with pipeline, conversions, and close forecast
Operations dashboards with efficiency, capacity, and quality metrics in real time

Context

A well-designed data model is the foundation of all BI solutions. Without it, reports are inconsistent, hard to maintain, and slow to build.

Real examples

Star schema design for sales analysis with time, product, customer, and channel dimensions
Slowly Changing Dimension modeling to maintain the history of changes in business entities
Building a semantic layer with reusable calculated metrics shared across multiple reports
Data model documentation with a metrics dictionary for business teams

Context

When different business areas calculate the same KPIs differently, debates about the numbers replace debates about decisions. Metric standardization is the prerequisite for a coherent data culture.

Real examples

Formal definition of critical metrics: revenue, churn, CAC, LTV with the exact formula and documented exclusions
Implementation of a centralized semantic layer where all metrics are calculated once
Metrics governance process so that changes to definitions are approved and communicated
Reconciliation of historical discrepancies between reports from different areas that used different definitions

Context

When all reports go through the BI team, it becomes a bottleneck. Self-service analytics empowers business teams to answer their own questions with reliable data.

Real examples

Semantic layer design in Looker or Power BI that exposes pre-calculated metrics without exposing the underlying SQL
Business user training on BI tool usage with the available datasets
Documentation of available datasets and metrics so users know what they can explore
Report certification process that distinguishes official reports from exploratory ones

Context

Business data lives in multiple systems: ERP, CRM, e-commerce platform, marketing tools. Integrating them into a unified view enables cross-functional analyses that no individual system can provide.

Real examples

ERP sales data integration with CRM behavioral data for customer lifecycle analysis
Unified marketing dashboard combining Google Ads, Meta, email, and SEO data in a single view
360-degree customer view combining transactions, web behavior, support tickets, and NPS
Product profitability analysis combining ERP sales and cost data with logistics data

Business Analyst

Recruit the best Business Analyst here

Main Responsibilities

Key Skills

Technical Skills

Soft Skills

Real use cases

Basic questions

Technical questions

Advanced questions

Common interview mistakes

Other Roles

Other Frameworks

Salary Range LATAM

Solutions for recruiting and job searching

Top talent specialized in Business Analyst