It's been awhile. I want to share some technical things that I do that “for lack of a better word” sparks joy.
Incremental View Maintenance
Let’s face it, aggregate is slow in OLTP databases. This is why back on the olden days we have an OLAP cube. My beloved postgres has this problem too but then I found this postgres extension (pg_ivm). As the name implies, this extension makes your Materialized View updated everytime the underlying data / row is updated.
-- instead of
CREATE MATERIALIZED VIEW
-- we do
SELECT create_immv('myview', 'SELECT * FROM mytable');
And voila the materialized view will be automagically updated. It also creates an index for fast retrieval ;)
When to use this?
Say in a certain app one of the feature is to track a ledger to know how much money a user has. You can naively do SELECT sum(amount) from transactions were user_id = xxx
but this query grows linearly with how much data you have. Now with pg_ivm
you can trivially solve this by SELECT create_immv(balance_imv, 'SELECT sum(amount), user_id FROM transactions group by user_id
). Now to grab that balance you just need SELECT amount from balance_imv where user_id = {user_id}
DuckDB, sqlite-utils
Lately I’ve been loving the idea of simple tools like sqlite and duckdb; I reach them frequently in my day job to do basic data wrangling. sqlite-utils is a python lib and also a cli to help you query CSV/Parquet/JSON by loading it into sqlite. In the same spirit duckdb does the same thing where it can load static files with a slight twist: it is columnar. The duck fucking quack at aggregation. Another plus is that it provides more or less the same dialect as postgresql including much needed aggregation function. Love it!
How to Profile PHP code
In my dayjob, we inherited a legacy laravel 5.1 code to work with. I am mostly incharge of optimizing the slow queries; by slow i mean 2-3 minutes per page slow. This is also my first foray into PHP. So I started measuring things. Enter xdebug With Xdebug we can profile each endpoint / page view and see where exactly is the slowest part of it. Cute find: Laravel 5.1 Route runtime scales linearly with the amount of routes definition. 2000 routes definition bogs down the runtime 20ms lmao.