While working on an API for my day job last week, I needed to do a SQL Inner Join with Sequelize and Typescript in a web API. Here’s how I achieved it.
Environment:
Things move quickly in the JS ecosystem. Here are the library versions used in the subject API at the time of writing:
"dependencies": {
"@types/express": "~4.0.39",
"@types/sequelize": "^4.27.21",
"express": "~4.16.2",
"sequelize": "4.38.0",
}
Context:
I have an identifier in one system that I need to correlate with an identifier in a second system, but I could only do that via an intermediate identifier, also in the second system. A good case for a SQL Join.
System 1 | System 2 | System 2 |
---|---|---|
Electronic Identifier | Animal Identifier | Legacy Animal Identifier |
ABCD | 555 WXYZ | 1234 |
In System 1, I have enough information to send this request:
{
"items": [
"ABC",
"DEF",
"GHI"
]
}
And in response I want:
{
"links": [
{
"rel": "self",
"href": "https://api.com/123-i-am-a-uuid"
}
],
"reference": "123-i-am-a-uuid",
"items": [
{
"legacyId": 123,
"electronicId": "ABC"
},
{
"legacyId": 456,
"electronicId": "DEF"
},
{
"legacyId": 789,
"electronicId": "GHI"
}
]
}
The initial approach was to make three repository queries to get the data I needed, and then go through three javascript map operations to splice the data into the above format. Using a series of javascript maps is inefficient, particularly as the data set grows.
A better way is to get SQL to do the grunt work, as it’s far more time and memory efficient than the javascript I would have needed to write.
As an aside, I’m not too fond of Javascript on the back end. I don’t feel it’s performant enough and as your codebase grows the lack of a robust type system hurts more and more. Typescript helps address the latter (assuming your libraries support it) but does nothing for the former. A typed language like C# or Java is preferable for web programming in a large enterprise due to better performance, more coherent ecosystem (see:
left-pad
) and robust type system.
I need to make a map of identifiers from an electronicIdEvent
entity
(that deals with electronic identifiers) and an animal
entity that deals with animal information.
electronicId | animalId | legacyAnimalId | |
---|---|---|---|
electronicIdEvent | ✓ | ✓ | |
animal | ✓ | ✓ |
I need the legacyAnimalId
field. To find a legacyAnimalId
given an electronicIdEvent
, I have to:
- Find all
electronicIdEvent
rows that match anelectronicId
. - Find all
animal
rows that match theanimalId
of theelectronicIdEvent
where anelectronicId
is present. - Return all
legacyAnimalId
rows that match theanimalId
of both theanimal
and theelectronicIdEvent
.
Hopefully, you can see why using javascript maps, and SQL selects for this would be tedious and inefficient.
Here’s a SQL query that demonstrates what I want if you’re more familiar with SQL then Javascript:
select event.eid, event.animalId, animal.id, animal.animalKey
from "electronicIdEvent" event
inner join "animal" animal
on event."animalId" = animal.id
1. Create associations for Sequzelize, between the data you need to link.
For Sequelize to generate the right SQL, it needs to know about the relationships between your data.
Here’s the code that creates associations in Sequelize:
this.animal.hasMany(this.electronicIdEvent, {
foreignKeyConstraint: true,
foreignKey: "animalId",
});
this.electronicIdEvent.belongsTo(this.animal, {
foreignKeyConstraint: true,
foreignKey: "animalId",
});
Associations allow the ORM to understand your relations and map One To One
, One To Many
and Many To Many
relationships to appropriate SQL code. The docs describe this here
2. Create a Sequelize repository method that uses this association to form a SQL query that returns the correct data:
There is some stuff in here that’s not great, but it works well enough. If I had more time, I’d figure out
a way of using typescript’s not-null override (!
), and ideally find a way of making event
a typed response.
The pseudocode I want is:
- First return all the rows for the
electronicId
andanimalId
columns inelectronicIdEvent
, where theelectronicId
is present in myelectronicIdList
- Next join the rows for those columns with the rows for the
legacyAnimalId
column, where theanimalId
is the same in both theelectronicIdEvent
andanimal
entities. - When the data comes back, drop the
animalId
column. I don’t need it once it has done the job of being the glue between the other entities and columns.
Or, as Javascript:
// imports omitted for brevity
export class AnimalRepository {
constructor(
private db: AnimalTimelineDbContext,
) {}
getLegacyAnimalIdentifierByElectronicIdentifier(electronicIdList: Array<string>): Promise<Array<IdentifierMap>> {
return this.db.electronicIdEvent
.findAll({
attributes: ["electronicId", "animalId"],
where: {
electronicId: electronicIdList,
},
include: [
{
model: this.db.animal,
required: true,
attributes: ["legacyAnimalId"],
},
],
})
.then((events) =>
events.map((event) => ({
eid: event.electronicId,
legacyAnimalId: event.Animal!.legacyAnimalId,
}))
);
}
}
Note: Setting
required: true
in theinclude
block forces the query to only return records which have an associated model, this converts the query from an OUTER JOIN to an INNER JOIN.
Because we only return complete rows (no partials)
event.Animal
cannot be null, I can override the typescript hint, i.e.event.Animal!.legacyAnimalId
.
I’m done at this point. We’ve submitted a list of electronic identifiers and gotten back an Array<IdentifierMap>
, which is syntactic sugar for Array<[string, integer]>
.
3. Convert the raw data into my map object, and fire the response back.
Because Sequelize is doing most of the grunt work in formatting the data, all I need in my web layer is to add the UUID that the user-submitted and save that map. Once I’ve saved the map and returned the reference to the calling system, the calling system can ask for that map and get the data (this is the most RESTful way of handling the data exchange and I’m omitting that setup for a follow-up post).