Articles in this section
Category / Section

Understanding Inner Joins and Strategies to Minimize Record Counts

Published:

When tables are joined, it is usual to encounter duplicate or increased record counts. This happens when there are several matching values in either of the tables being joined. This article explains the causes of increased record counts and methods to minimize them.

Understanding Inner Joins

An inner join retrieves only those rows where there are corresponding values in both tables involved in the join. If there are several matching values in one or both tables, duplicate records may occur in the output.

Consider the following tables as an example:

Orders Table:

Orders.png

Customers Table:

Customers.png

When you execute an inner join on the CustomerID column, the outcome will contain only the records that have matching CustomerID values in both tables.

Result:

The outcome of performing an inner join on the CustomerID column would be:

Inner Join.png

Identifying Duplicate Records

Duplicate records frequently arise when there are several matching values in each table. For example, if you perform a join operation on a column such as City, where multiple customers share the same city, the result might contain duplicates.

Example:

Execute an inner join operation on the ‘City’ column.

Orders Table:

Inner Join.png

Customers Table:

Customers table 2.png

Executing an inner join based on the City column will produce the table shown below:

Result:

Inner Join table.png

In this scenario, the resulting table includes duplicate entries for London and Berlin because there are several customers associated with the same city in the Customers table.

How to Reduce Duplicate Records

  1. Use Primary Keys for Joins
    Joining tables on primary keys ensures a one-to-one relationship between rows.

    Example:
    Joining the Orders and Customers tables using the CustomerID column eliminates duplicates:

    Result:

    OrderID OrderDate CustomerName City
    1.00 01 Apr 2022 John Smith London
    2.00 02 Apr 2022 Sarah Johnson Paris
    3.00 03 Apr 2022 Mark Davis Berlin
  2. Apply Filters After Joins
    If only specific data is needed, use a filter condition to reduce duplicates.

    Example:
    Fetching orders for customers only from London:

     SELECT  Orders.OrderID, Orders.OrderDate,  Customers.CustomerName, Customers.City FROM Orders INNER JOIN Customer ON Orders.CustomerID = Customers.CustomerID WHERE Customers.City = 'London';
    

    Result:

    OrderID OrderDate CustomerName City
    1.00 01 Apr 2022 John Smith London

For Extract mode data sources Filters and Joins can be applied only in Bold BI. For SQL live data sources Filters and joins can be applied either on Bold BI or on the database server.

Conclusion

To avoid duplicate rows when joining tables:

  • Use primary keys to establish a unique match between rows.
  • Apply filters to narrow down the results based on specific conditions.

Additional References:

-How to configure data filters
-How to Join tables

Was this article useful?
Like
Dislike
Help us improve this page
Please provide feedback or comments
MB
Written by MunishKumar Balakrishnan
Updated
Comments (0)
Please  to leave a comment
Access denied
Access denied