dwh schema is a fairly textbook star schema, with things like jobs and worker usage in fact tables, and the various things that you might use in a SQL
WHERE clause in dimension tables.
The job_fact table, a typical "fact" table.
For example, to get reports about various jobs over time, you'll be querying the dwh.job_fact table:
"*_sk" columns can be used to do INNER JOINs to a similarly named dimension table
Any column that is named with an
_sk suffix is a Synthetic Key that points to a corresponding dimension table, named with the part of the column before the
_sk; the dimension table will have a
_dim suffix in the name. This way, it's easy to write the
JOIN's, the column name is a clue to the dimension table, which will have a column of the same name. Almost every dimension table will consist of a
PRIMARY KEY and a name column.
A typical dimension table, the "user_dim" table
For example, the user_sk column can be used to do a SQL
INNER JOIN to the user_dim table.
Get a count of all jobs for a particular user:
The time dimension table
The dwh.time_dim table is provided so that you don't have to perform date/time operations on every row in a fact table (since they can run into the 100's of millions of rows), instead you do a SQL
INNER JOIN to it and use the values in the time_dim table in your
WHERE clause. The time_sk column in every fact table has an identical value in the
time_dim table which has a single row with a primary key time_sk. The
time_sk value is actually the unix epoch time in seconds:
The "job status" dimension table
The dwh.jobstatus_dim table is one of the few exceptions to the normal dimension table structure; it provides a mapping between the integer and human-readable status values.