How to Join Multiple Tables with Equivalent Values in Mapping columns but represented differently?
Joining two or more tables is possible by mapping unique, identical columns in each. However, different data vendors may have the same thing represented in their own way. For example, when taking Jira and GitLab, both of them have a username column as the primary key column in their user’s table. However, the representation of their values differs. Due to this, even if they are identical and unique, we cannot establish a relationship between the two data sources by default. So, we have to create an identical field in any one of the tables.
Let’s discuss this problem and how to resolve it in a detailed manner. Assume you have two different data sources, Connection 1 (JIRA) and Connection 2 (GitLab), created using Bold BI.
Using expressions in Bold BI
- You have to create an expression in any one of the data sources to transform the data to match the values in other data source.
- Connection 1 (JIRA) has a naming format for usernames that is different from the format used for usernames on Connection 2 (GitLab). For instance, “Gayathri Sankarraj” in JIRA and “gayathir.sankarraj” in GitLab.
- To standardize the username we can create expression in GitLab connection to transform the data format to the JIRA display name format. Later, using that expression to join both tables.
The below expression provides the required data transform for SQL Server databases,
CONCAT(
UPPER(SUBSTRING([username], 1, 1)),
SUBSTRING([username], 2, CHARINDEX('.', [username]) - 2),
' ',
UPPER(SUBSTRING([username], CHARINDEX('.', [username]) + 1, 1)),
SUBSTRING([username], CHARINDEX('.', [username]) + 2, LEN([username]))
)
If you are not able to transform data using expressions, an alternate way is to use a lookup table approach as explained below.
Using Lookup table
-
To solve this problem, let’s create a lookup table holding the mapping columns of both Connection 1 (JIRA) and Connection 2 (GitLab). The lookup table can be in Excel, CSV, Google Sheets, or any other connector.
-
Introduce another column to have one standard name for each row value representation.
-
Now, this lookup table should be added as a data source in Bold BI.
-
We can join the lookup table with the Connection 1 (Jira) data source and Connection 2 (GitLab) data source since the mapping table has columns (usernames) existing in both connections, along with the standard name column.
-
Inner joins are done to join the tables as follows.
-
Now, the user can configure widgets from different data sources, even with a different representation of values. For example, the grid widget is configured below.
This solution is possible only for joining data sources created in extract mode.
Related Articles: