The Idea
I have few smart Netatmo cameras, both indoor and outdoor. And it is quite logical to integrate them into my personal Datawarehouse. There is even nice open API supporting such intention.
The goal is to load event data into detail layer and store the snapshot photos in Google photos using the Google Backup and Sync. Then, of course, somehow link the data with the photos.
The Data Model Concept
This time we are going to face bit more complicated interaction with the source system. So it is worth spending some time on robust Data Model design. Our Observation model utilizes the approach of supertype and subtypes. In fact, it offers something like Inheritance in object-oriented programming.
To keep things simple we will always use surrogate keys as PK. Integers generated outside of database in ETL scripts. Never using any internal autoincrement database feature. The surrogate key is generated for the supertype entity and then provided to all its subtypes. By saying so, it is clear that we can never load data into subtype table without having loaded records in its supertype.
The Dimensions
Let’s start with the dimensions. We have supertype table sensor. It contains the surrogate key «Sensor_Id» which is also the PK, then «Sensor_Source_Cd» identifying the sensor in the source, then «Source_Name» identifying the source system and finally «Target_Table» which specifies the subtype table, in our case, it is the table camera. That is the typical pattern for dimension supertype. Similarly we have supertype entity and its subtype person.
Then we have two small lookup tables defined directly in our data warehouse, event_type and observation_label, which are not linked to any source and which keys are hardcoded in table DDL. Such tables, of course, don’t need any supertype.
The Fact tables
Now, let’s move to the fact tables. We store data about the events in table camera_event and we can attach snapshot photos to the event using table camera_snapshot.
Both tables are subtypes of one supertype table observation
There is no source key generated for events in the source system. But we know the logical primary key. It is the combination of the Camera identification and the event timestamp. There cannot be more than one event coming from the particular camera at the same time.
Supertype observation contains «Observation_Id» the surrogate key. Then «Observation_Timestamp» the source timestamp specifying the event or snapshot creation time. And also «Sensor_Id» The Sensor surrogate key, in our case primary key of the camera in the personal data warehouse. Using «Related_Observation_Id» we can relate a camera snapshot to the camera event. The field «Target_Table» specifies target table (camera_event or camera_snapshot).
Subtype camera_event extends its supertype providing more specific details about the event. Classify it using «Event_type». It can store the person identification in «Person_Id» for person events generated by a camera with face recognition. Creation time is available in the supertype as The Unix timestamp The same value, just transformed into Date time field is «Event_Start_Dt».
Subtype camera_snapshot stores snapshot photo file name in «File_Name» and Google Drive File Id in «File_ID».
The last table to mention is observation_label_related. It allows us to assign several labels to one event. The outdoor camera generates events where multiple objects are identified, like a car, person, animal. In such case, we use labeling to add all needed information to one event.