Basic DB mappings
Warning
Danger
This tutorial is for opayele-0.6 and earlier. If you try these examples with the latest version of Opaleye they won't work. I am working on updating (and expanding these tutorials), and have already completed Instant Gratification.
If you'd like to support my work please subscribe to my newsletter and let me know at hello@haskelltutorials.com if these tutorials helped you in any way!
Overview
In this chapter we will configure the DB<=>Haskell mapping for the following table:
tenants
- the master table of "tenants" in a typical multi-tenant SaaS app. You can think of a tenant as a "company account", where no two company accounts share any data.
At the end of the mapping process, we would like to have a schema as close to the following, as possible.
-- -- Tenants -- create table tenants( id serial primary key timestamp with time zone not null default current_timestamp ,created_at timestamp with time zone not null default current_timestamp ,updated_at not null ,name text not null ,first_name text not null ,last_name text not null ,email text not null ,phone text not null default 'inactive' ,status text integer ,owner_id not null ,backoffice_domain text constraint ensure_not_null_owner_id check (status!='active' or owner_id is not null) );create unique index idx_index_owner_id on tenants(owner_id); create index idx_status on tenants(status); create index idx_tenants_created_at on tenants(created_at); create index idx_tenants_updated_at on tenants(updated_at); create unique index idx_unique_tenants_backoffice_domain on tenants(lower(backoffice_domain));
Further, we will see how Opaleye deals with the following four cases:
- Non-nullable columns without DB-specified defaults
- Non-nullable columns with DB-specified defaults
- Nullable columns without DB-specified defaults
- Nullable columns with DB-specified defaults - TODO: What's a good use-case for such a column?
Creating the DB --------------
Since Opaleye does not have any support for migrations, setting up the DB schema is done by simply issuing SQL statement directly.
createdb vacationlabs $ psql vacationlabs < includes/db-mappings/schema.sql $
Now, to setup the DB<=>Haskell mapping for the tenants
table, we'll walk down the following code: :
module DB where
import Opalaye
import Data.Text
import Data.Time (UTCTime)
data TenantPoly key createdAt updatedAt name status ownerId backofficeDomain = Tenant {
tenantKey :: key
,tenantCreatedAt :: createdAt
,tenantUpdatedAt :: updatedAt
,tenantName :: name
,tenantStatus :: status
,tenantOwnerId :: ownerId
,tenantBackofficeDomain :: backofficeDomain
} deriving Show
type TenantPGWrite = TenantPoly
(Maybe (Column PGInt8)) -- key
(Maybe (Column PGTimestamptz)) -- createdAt
(Column PGTimestamptz) -- updatedAt
(Column PGText) -- name
(Column PGText) -- status
(Column (Nullable PGInt8)) -- ownerId
(Column PGText) -- backofficeDomain
type TenantPGRead = TenantPoly
(Column PGInt8) -- key
(Column PGTimestamptz) -- createdAt
(Column PGTimestamptz) -- updatedAt
(Column PGText) -- name
(Column PGText) -- status
(Column (Nullable PGInt8)) -- ownerId
(Column PGText) -- backofficeDomain
type Tenant = TenantPoly
Integer -- key
UTCTime -- createdAt
UTCTime -- updatedAt
Text -- name
Text -- status
(Maybe Integer) -- ownerId
Text -- backofficeDomain
$(makeAdaptorAndInstance "pTenant" ''TenantPoly)
$(makeLensesWith abbreviatedFields ''TenantPoly)
tenantTable :: Table TenantPGWrite TenantPGRead
tenantTable = Table "tenants" (pTenant Tenant{
tenantKey = optional "id"
,tenantCreatedAt = optional "created_at"
,tenantUpdatedAt = required "updated_at"
,tenantName = required "name"
,tenantStatus = required "status"
,tenantOwnerId = required "owner_id"
,tenantBackofficeDomain = required "backoffice_domain"
})
That's quite a lot of code to setup mappings for just one table! Most of it is just boilerplate that can easily be abstracted away using type-families or some TemplateHaskell. In fact there are libraries, such as, SilkOpaleye and dbrecord-opaleye which try to give Opaleye an easier-to-use API.
Strange polymorphic records
Firstly, let's tackle the strangely polymorphic TenantPoly
. :
data TenantPoly key createdAt updatedAt name status ownerId backofficeDomain = Tenant {
tenantKey :: key
,tenantCreatedAt :: createdAt
,tenantUpdatedAt :: updatedAt
,tenantName :: name
,tenantStatus :: status
,tenantOwnerId :: ownerId
,tenantBackofficeDomain :: backofficeDomain
} deriving Show
This is a base type which defines the shape of a set of related record-types (namely TenantPGRead
, TenantPGWrite
, and Tenant
). TenantPoly
is polymorphic over every single field of the record. This allows us to easily change the type of each field, while ensuring that the shape of all these related records is always the same. (Why would we want records with similar shapes, but different types, will get clearer in a moment - hang in there!) Generally, TenantPoly
is never used directly in any Opaleye operation. The concrete types - TenantPGRead
TenantPGWrite
and Tenant
- are used instead.
At the the time of writing, Opalaye does not do any reflection on the DB schema whatsoever. This is very different from something like Rails (in the Ruby world) and HRR (in the Haskell world), which generate the DB<=>Haskell mappings on the basis of schema reflection). So, Opaleye does not know what data-types to expect for each column when talking to the DB. Therefore, we have to teach it by essentially duplicating the SQL column definitions in Haskell. This is precisely what TenantPGRead
, TenantPGWrite
, makeAdaptorAndInstance
and tenantTable
do, and this is what we absolutely hate about Opaleye!
Note
We've scratched our own itch here and are working on Opaleye Helpers to help remove this duplication and boilerplate from Opaleye.
type TenantPGWrite = TenantPoly
Maybe (Column PGInt8)) -- key
(Maybe (Column PGTimestamptz)) -- createdAt
(Column PGTimestamptz) -- updatedAt
(Column PGText) -- name
(Column PGText) -- status
(Column (Nullable PGInt8)) -- ownerId
(Column PGText) -- backofficeDomain
(
type TenantPGRead = TenantPoly
Column PGInt8) -- key
(Column PGTimestamptz) -- createdAt
(Column PGTimestamptz) -- updatedAt
(Column PGText) -- name
(Column PGText) -- status
(Column (Nullable PGInt8)) -- ownerId
(Column PGText) -- backofficeDomain
(
$(makeAdaptorAndInstance "pTenant" ''TenantPoly)
tenantTable :: Table TenantPGWrite TenantPGRead
= Table "tenants" (pTenant Tenant{
tenantTable = optional "id"
tenantKey = optional "created_at"
,tenantCreatedAt = optional "updated_at"
,tenantUpdatedAt = required "name"
,tenantName = required "status"
,tenantStatus = required "owner_id"
,tenantOwnerId = required "backoffice_domain"
,tenantBackofficeDomain })
Different types for read & write
With this, we witness another quirk (and power) of Opaleye. It allows us to define different types for the read (SELECT) and write (INSERT/UPDATE) operations. In fact, our guess is that, to achieve type-safety, it is forced to do this. Let us explain. If you're using standard auto-increment integers for the primary key (which most people do), you essentially end-up having two different types for the INSERT
and SELECT
operations. In the INSERT
operation, you should not be specifying the id
field/column. Whereas, in the SELECT
operation, you will always be reading it. (Look at Persistent if you want to see another approach of solving this problem.)
One way to avoid having separate types for read & write operations, is to allow the PK field to be undefined
in Haskell, being careful not to evaluate it when dealing with a record that has not yet been saved to the DB. We haven't tried this approach yet, but we're very sure it would require us to teach Opalaye how to map undefined
values to SQL. Nevertheless, depending upon partially defined records for something as common as INSERT
operations does not bode too well for a language that prides itself on type-safety and correctness.
Therefore, the need for two separate types: TenantPGRead
and TenantPGWrite
, with subtle differences. But, before we discuss the differences, we need to understand how Opaleye deals with NULL
values and "omitted columns".
Handling NULL
and database defaults -------------------------------------
Let's look at the types of a few fields from TenantPGWrite
and how they interact with NULL
values and the DEFAULT
value in the DB:
The (Column a) types
updatedAt
of type(Column PGTimestamptz)
corresponding toupdated_at timestamp with time zone not null default current_timestamp
name
of type(Column PGText)
corresponding toname text not null
status
of type(Column PGText)
corresponding tostatus text not null default 'inactive'
In each of these cases you have to specify the field's value whenever you are inserting or updating via Opaleye. Moreover, the type ensures that you cannot assign a null
value to any of them at the Haskell-level. Please note, null
is NOT the same as Nothing
The (Maybe (Column a)) types
key
of type(Maybe (Column PGInt8))
corresponding toid serial primary key
createdAt
of type(Maybe (Column PGTimestamptz))
corresponding tocreated_at timestamp with time zone not null default current_timestamp
In both these cases, during an INSERT, if the value is a Nothing
, the entire column itself will be omitted from the INSERT statement and its fate will be left to the DB.
The (Column (Nullable a)) types
ownerId
of type(Column (Nullable PGInt8))
corresponding toowner_id integer
In this case, while you have to specify a value at the Haskell level, you can specify a null
as well.
For example, this is a possible INSERT operation:
runInsertMany -- PG Connection
conn -- Opaleye table identifer
userTable
[(TenantPGWrite
{= Nothing -- column will be omitted from query; will use DB's DEFAULT
tenantKey = Just $ pgUTCTime someTime -- column will NOT be omitted from query; will NOT use DB's DEFAULT
, tenantCreatedAt = pgUTCTime someTime
, tenantUpdatedAt = pgText "Saurabh"
, tenantName = pgText "inactive"
, tenantStatus = null -- specfically store a NULL value
, tenantOwnerId = pgText "saurabh.vacationlabs.com"
, tenantBackofficeDomain
} )]
Note
Please make sure you understand the difference between Maybe (Column a)
and Column (Nullable a)
. And possibly Maybe (Column (Nullable a))
- although we're not sure how useful the last one is!
Different types for read & write - again
Now, coming back to the subtle differences in TenantPGWrite
and TenantPGRead
:
- While writing, we may omit the
key
andcreatedAt
columns (because their type is(Maybe (Column x))
inTenantPGWrite
) - However, while reading, there is really no way to omit columns. You can, of course select 2 columns instead of 3, but that would result in completely different data types, eg:
(Column PGText, Column PGInt4)
vs(Column PGText, Column PGInt4, Column PGTimestamptz)
. - If your result-set is obtained from a LEFT JOIN, you can have a PGRead type of
(Column a, Column b, Column (Nullable c), Column (Nullable d))
, with the Nullable columns representing the result-set in a type-safe manner.
Note
Here are two small exercises:
What if ownerId
had the following types. What would it mean? What is a possible use-case for having these types?
TenantPGWrite
: (Maybe (Column (Nullable PGInt8)))TenantPGRead
: (Column (Nullable PGInt8))
And what about the following types for onwerId
?
TenantPGWrite
: (Maybe (Column PGInt8))TenantPGRead
: (Column (Nullable PGInt8))
Making things even more typesafe: If you notice, TenantPGWrite
has the key
field as (Maybe (Column PGInt8))
, which makes it omittable, but it also makes it definable. Is there really any use of sending the primary-key's value from Haskell to the DB? In most cases, we think not. So, if we want to make this interface uber typesafe, Opaleye allows us to do the following as well (notice the type of key
): :
type TenantPGWrite = TenantPoly
() -- key
(Maybe (Column PGTimestamptz)) -- createdAt
(Column PGTimestamptz) -- updatedAt
(Column PGText) -- name
(Column PGText) -- status
(Column (Nullable PGInt8)) -- ownerId
(Column PGText) -- backofficeDomain
You'll need to do some special setup for this to work as described in Making columns read-only
Wrapping-up
Coming to the last part of setting up DB<=>Haskell mapping with Opaleye, we need to issue these magic incantations: :
$(makeAdaptorAndInstance "pTenant" ''TenantPoly)
tenantTable :: Table TenantPGWrite TenantPGRead
tenantTable = Table "tenants" (pTenant Tenant{
tenantKey = optional "id"
,tenantCreatedAt = optional "created_at"
,tenantUpdatedAt = optional "updated_at"
,tenantName = required "name"
,tenantStatus = required "status"
,tenantOwnerId = required "owner_id"
,tenantBackofficeDomain = required "backoffice_domain"
})
The TH splice - makeAdaptorAndInstance
- does two very important things:
- Defines the
pTenant
function, which is subsequently used intenantTable
- Defines the
Default
instance forTenantPoly
(this is notData.Default
, but the poorly named *Data.Profunctor.Product.Default*
Right now, we don't need to be bothered with the internals of pTenant
and Default
, but we will need them when we want to do some advanced DB<=>Haskell mapping. Right now, what we need to be bothered about is tenantTable
. That is what we've been waiting for! This is what represents the tenants
table in the Haskell land. Every SQL operation on the tenants
table will need to reference tenantsTable
. And while defining tenantsTable
we've finally assembled the last piece of the puzzle: field-name <=> column-name mappings AND the name of the table! (did you happen to forget about them?)
Note
We're not really clear why we need to specify optional
and required
in the table definition when TenantPGWrite
has already defined which columns are optional and which are required.
And, one last thing. We've been talking about PGText
, PGTimestamptz
, and PGInt8
till now. These aren't the regular Haskell types that we generally deal with! These are representations of native PG types in Haskell. You would generally not build your app with these types. Instead, you would use something like Tenant
, defined below: :
type Tenant = TenantPoly
Integer -- key
UTCTime -- createdAt
UTCTime -- updatedAt
Text -- name
Text -- status
(Maybe Integer) -- ownerId
Text -- backofficeDomain
Remember these three types and their purpose. We will need them when we're inserting, udpating, and selecting rows.
TenantPGWrite
defines the record-type that can be written to the DB in terms of PG types.TenantPGRead
defines the record-type that can be read from the DB in terms of PG types.Tenant
defines the records that represents rows of thetenants
table, in terms of Haskell types. We haven't yet split this into separate read and write types.
Template Haskell expansion
If you're curious, this is what the TH splice expands to (not literally, but conceptually). You might also want to read the [documentation of Data.Profunctor.Product.TH](https://hackage.haskell.org/package/product-profunctors-0.7.1.0/docs/Data-Profunctor-Product-TH.html) to understand what's going on here. :
pTenant :: ProductProfunctor p =>
TenantPoly
(p key0 key1)
(p createdAt0 createdAt1)
(p updatedAt0 updatedAt1)
(p name0 name1)
(p status0 status1)
(p ownerId0 ownerId1)
(p backofficeDomain0 backofficeDomain1)
-> p (TenantPoly key0 createdAt0 updatedAt0 name0 status0 ownerId0 backofficeDomain0)
(TenantPoly key1 createdAt1 updatedAt1 name1 status ownerId1 backofficeDomain1)
pTenant = (((dimap toTuple fromTuple) . Data.Profunctor.Product.p7). toTuple)
where
toTuple (Tenant key createdAt updatedAt name status ownerId backofficeDomain)
= (key, createdAt, updatedAt, name, status, ownerId, backofficeDomain)
fromTuple (key, createdAt, updatedAt, name, status, ownerId, backofficeDomain)
= Tenant key createdAt updatedAt name status ownerId backofficeDomain
instance (ProductProfunctor p,
Default p key0 key1,
Default p createdAt0 createdAt1,
Default p updatedAt0 updatedAt1,
Default p name0 name1,
Default p status0 status,
Default p ownerId0 ownerId1,
Default p backofficeDomain0 backofficeDomain1) =>
Default p (TenantPoly key0 createdAt0 updatedAt0 name0 status0 ownerId0 backofficeDomain0)
(TenantPoly key1 createdAt1 updatedAt1 name1 status ownerId1 backofficeDomain1) where
def = pTenant (Tenant def def def def def def def)
Using Opaleye with simple Haskell records
TODO
Using Opaleye with polymorphic Haskell records
TODO
Supercharged polymorphic records with type-families
TODO
Simple newtypes over Int
(or Int64
) for your primary keys
TODO
Phantom types for reducing newtype boilerplate for your primary keys
TODO