The RLS Problem Nobody Talks About Clearly
Row Level Security in Postgres is the kind of thing that sounds simple and isn't. You enable it on a table and you think you're done. You're not done.
Supabase's security advisor flagged several tables as "RLS enabled, no policies." What that means in practice: RLS is on, so the table is locked down, but because there's no policy defined, every query returns nothing. Zero rows. It's not a data exposure problem, it's a silent failure. A user signs in, queries their data, gets nothing back, and you have no idea why.
That's the benign version.
The worse version is what I found on some of my tables: WITH CHECK (true) on the anon role. That's a policy that says anyone, authenticated
or not, can write rows. I put that in during development to get things moving. It never came out. That's the kind of thing that bites you
six months after you shipped.
The actual OWASP classification for this is API1: Broken Object Level Authorization. It's been at the top of their API security list for
years because it keeps happening. A 2021 Parler breach exposed 70 terabytes of user data partly because their API didn't enforce
object-level authorization checks. Users could request other users' data just by changing an ID in the URL. My WITH CHECK (true) policy
isn't the same thing exactly, but it's in the same neighborhood. Writing data without being who you say you are.
My fix was straightforward once I knew what to look for: every table that handles user data needs auth.uid() = user_id in the policy, both
for reads and writes. The Supabase docs make this look easy. It is easy, once you understand that "enable RLS" and "write a policy" are two
separate steps and both are required.
The Supabase security advisor isn't something most
tutorials mention. It sits in the dashboard under a
tab most people walk past, and honestly it should
probably be harder to ignore than it is. I didn't run
it because some article told me to. I ran it because
somewhere in the middle of a refactoring pass on Holy
Scroll, I kept seeing the same category of problem
surface. Not crashes, not broken features.
Architecture that felt wrong the more I looked at it.
Tables I wasn't sure were protected correctly, RPC
functions whose permission model I couldn't
reconstruct from memory, comments in my own code like
"ideally we should use the authenticated client here,
but this operation is naturally tied to the device." I
was tracking everything in Fizzzy.do at the time, and
there was this growing cluster of tasks that kept
getting pushed down because none of them were
user-facing. None of them were broken. They just
weren't right.
That's when I sat down and actually tried to understand the architecture, not just patch what was visibly wrong. The report came back with ten advisory categories across my production project. Ten, on a Bible app over a thousand real users. And looking back at each finding, every single one traced back to a decision I had made, usually under time pressure (sometimes being lazy or feeling smarter than I thought I was), that felt reasonable in the moment and only looked like a problem once I was forced to examine it from the outside.
RLS Is Not Just a Toggle
Row Level Security is the thing Supabase tutorials introduce in the first five minutes and then move past before you really absorb what it does. The pitch is clean: enable it on a table, write a policy, Postgres enforces access at the row level based on who's calling. What the pitch leaves out is that there are at least three distinct ways to implement it and still end up with a misconfigured database.
The first one I hit: I enabled RLS on my
daily_verse_delivery table and never wrote any
policies. What happens in that situation is not that
reads and writes proceed freely. What happens is an
implicit full deny. Every SELECT, INSERT, and UPDATE
gets blocked for every role, including authenticated
users. The table had 13 rows from before I turned RLS
on. After that point, nothing new was being written.
My cron job was silently failing to log delivery
records, catching the exception, printing a warning,
and moving on. Zero user-facing symptoms. I only found
it when I looked at the row count in the advisor
output and wondered why a table I was actively writing
to had stopped growing.
This is how Postgres behaves, and it's the correct behavior. But when you're the only person on the project, there's no one running nightly anomaly checks. Silent failure is expensive.
The second issue was more instructive. I had WITH CHECK (true) on INSERT and UPDATE policies for the
anon role on my devices and notification_events
tables. The reasoning is right there in comments I
left myself: device registration happens before a user
signs in. Someone opens Holy Scroll for the first
time, hasn't created an account, the app needs to
register their device so push notifications work. So I
wrote a policy that lets the anon role insert without
any predicate check.
What I hadn't fully thought through was what "anon"
actually means in Supabase's model. It doesn't mean
"this specific iOS device running my app." It means
any HTTP request that presents the anon key. My anon
key is in Release.xcconfig, which gets compiled into
the app binary, which means it's extractable by anyone
who downloads the app from the App Store. That's not
a secret by design. Supabase is explicit that the anon
key is public-facing and that RLS is what protects
your data. But WITH CHECK (true) means there is no
predicate. Anyone with the key, from anywhere, can
insert rows into my devices table with no validation
that the request came from an iOS device running Holy
Scroll rather than a script running on a laptop.
In January 2026, an AI social platform called Moltbook
had its entire Supabase database publicly exposed
because RLS was disabled on their tables. Email
addresses, authentication tokens, and API keys for 1.5
million accounts were accessible to anyone with the
anon key and a basic HTTP client. The root cause is
the same category of mistake: having RLS on paper
without actually thinking through what the policies
permit. The scale of what happened to them relative to
what my configuration exposed is different. The
mechanism is the same.
The third issue I want to name here is the SECURITY DEFINER views. I had three of them:
user_monthly_stats, user_total_stats, and
admin_user_analytics. A view defined with SECURITY DEFINER runs as the view owner rather than the
querying user, which means RLS on the underlying
tables is bypassed entirely. The owner in Supabase is
typically the postgres superuser. So any authenticated
user who had SELECT on admin_user_analytics could
query aggregated data across all 722 users in my
database, regardless of what RLS said about the tables
underneath it. I'd created that view months earlier
to inspect data in the dashboard and never went back
to audit who could actually reach it from the API
layer.
The 22 That Actually Kept Me Up
The finding that made me sit with it longest was a
category called
anon_security_definer_function_executable. Functions
that are both SECURITY DEFINER and callable by the
anon role without any authentication. I had 22 of
them.
SECURITY DEFINER on a Postgres function means it
executes with the privileges of the function owner,
not the caller. I thought I was making things more
controlled by using it. Combined with being accessible
to the anon role, the actual effect is the opposite:
the function bypasses RLS, runs with elevated
privileges, and requires no session token to invoke.
The anon key alone is sufficient.
The worst one was migrate_anonymous_data. This RPC
handled a real product problem: when someone uses Holy
Scroll anonymously for a while and then signs in with
Apple, their reading history should transfer over.
The function signature was
migrate_anonymous_data(p_user_id uuid, p_device_id text). It accepted any UUID as a parameter and
reassigned anonymous analytics rows to that user's
account.
The vulnerability has a name. OWASP calls it Broken
Object Level Authorization, API1 in the API Security
Top 10. The pattern is simple: an endpoint accepts an
object ID as a parameter without verifying the caller
has any right to act on that object. Here, the object
is a user account, the identifier is a UUID, and the
action is claiming ownership of their data. And
because the function was SECURITY DEFINER and callable
by anon, this required no authentication at all. Just
the public anon key and a valid user UUID.
Parler's 2021 data scrape is the canonical example of what this looks like at scale. Their API accepted sequential post IDs with no authentication and no rate limiting. Someone wrote a script that walked the ID space in order and downloaded 70 terabytes of content, including posts users had deleted, which were only soft-deleted in the database and still fully accessible through the API. The specific vulnerability class in my case is an accepted-parameter IDOR rather than sequential enumeration, but the underlying principle is identical: an endpoint accepted an identifier without checking who was asking.
The correct fix is structural, not just a permission
revoke. The function should never accept p_user_id as
a parameter at all. Derive the caller's identity from
auth.uid() on the server side and run the function as
SECURITY INVOKER so RLS applies:
CREATE OR REPLACE FUNCTION
migrate_anonymous_data(p_device_id text)
SECURITY INVOKER
LANGUAGE plpgsql
AS $$
BEGIN
UPDATE user_daily_stats
SET user_id = auth.uid()
WHERE device_id = p_device_id AND user_id IS NULL;
END;
$$;
auth.uid() returns null for unauthenticated requests.
The WHERE clause matches nothing. The function becomes
useless without a valid session. The same pattern
applies to every function that was previously
accepting a user identifier as a trusted input:
replace the parameter with a server-side context
lookup.
There were also 17 functions flagged for a mutable
search_path. The attack surface here is a schema
injection: if an attacker can plant an object in a
schema that resolves before public in Postgres's
search path, they can intercept your function calls by
shadowing them. The remediation is one line added to
each function definition, SET search_path = public,
which pins the resolution context. It's invisible in a
code review, it's the kind of thing you'd never think
to check, and Supabase's advisor caught all 17.
Sign In With Apple: The Part I Actually Built Correctly
I want to spend some time here because it's easy for a piece like this to read as a list of things that went wrong. The Sign in with Apple implementation is the part of Holy Scroll's auth story I'm most confident in, and understanding why it's correct is as useful as understanding why the database wasn't.
Apple uses an OIDC flow. They act as the identity
provider, issue a signed JWT called an identity token,
and your backend validates that token to confirm the
user is who they say they are. The mechanism that
prevents replay attacks is the nonce.
Before initiating sign-in, the app generates a
cryptographically random 32-byte nonce using
SecRandomCopyBytes, hashes it with SHA-256, and sends
the hash to Apple as part of the authorization
request. Apple embeds that hash in the identity token
it signs. Supabase validates the token server-side and
checks that the nonce matches. An intercepted token
in transit is useless without the original nonce,
which never left the device.
The fallback in my randomNonceString() function, for
the rare case where SecRandomCopyBytes fails,
originally used a trimmed UUID().uuidString. I
eventually changed it. A UUID has around 122 bits of
entropy if generated correctly, which is technically
acceptable, but prefix(32) on a UUID string gives you
the first 32 characters of a hex-dash formatted
string, and the structure of that format is
predictable in ways that reduce the effective entropy.
The change was marginal in practice. The principle
matters: when you're generating values that gate
authentication, every bit of entropy you don't
preserve is entropy you chose to discard.
One edge case that catches a lot of people with Apple Sign In: Apple only returns the user's name on the first authentication. Every sign-in after that returns nil for the name fields. A naive upsert on every sign-in overwrites your stored names with null. I left a comment in the upsert function: "SMART upsert, preserves existing data when Apple doesn't provide names." What that means in practice is fetching the existing profile first, merging new data over only the non-nil fields, and then writing back. More network calls than a straight upsert, but the only way to not silently destroy your users' display names on their second login.
Session Storage: The Bug That Lived in a Comment
Somewhere during a debugging session, I added two lines to the session monitoring loop:
UserDefaults.standard.set(session.accessToken, forKey: "supabaseAccessToken")
UserDefaults.standard.set(session.refreshToken, forKey: "supabaseRefreshToken")
I already had these tokens being written to the
Keychain through SessionKeychainStore. The
UserDefaults write was for quick inspection during
development. It never got cleaned up.
The distinction between those two storage locations
matters more than it looks. UserDefaults on iOS has a
default file protection class of
NSFileProtectionCompleteUntilFirstUserAuthentication,
which means the data is decryptable any time after the
device's first post-boot unlock, including while the
screen is locked and background processes are running.
Keychain items with kSecAttrAccessibleWhenUnlocked
are only accessible while the device is actively in
use. An unencrypted iTunes backup includes
UserDefaults. A refresh token sitting in that backup
is a long-lived credential that doesn't rotate the way
an access token does.
This is iOS's version of the localStorage versus
httpOnly cookie problem on the web. The secure storage
option exists. I built it. I then added a second
storage path for convenience and left it there.
Related: I had unconditional print() calls throughout
the authentication flow logging user IDs, full names,
and emails, with no #if DEBUG guard. On iOS, that
output goes to the unified logging system and is
accessible over USB via Console.app without a
jailbreak. Crash reports submitted through MetricKit
or Xcode Organizer can include recent log output. None
of that belongs in a release build. The fix is
wrapping every print() that touches a name, email, or
identifier in a debug guard, or replacing it with
os_log with the appropriate privacy annotations.
The Dual-Write Bug I Introduced While Trying to Be Careful
This one's embarrassing in a very specific way.
I was worried about session persistence. I wanted to make sure that when Supabase updated the session tokens, they got stored somewhere
durable. So in startSessionMonitoring(), I set up auth state change listeners, and when a new session came in I wrote the tokens to... both
UserDefaults and the Keychain.
The intent was right. The execution created a problem.
UserDefaults is not encrypted. On iOS, it's stored at Library/Preferences/<bundle-id>.plist and it gets backed up to iCloud by default. So
my session refresh tokens, which are equivalent to long-lived credentials, were sitting in a location that iCloud backup could pick up and
that any process with the app's bundle ID entitlement on a compromised device could read.
NSFileProtectionCompleteUntilFirstUserAuthentication is the file protection class that applies after first unlock. It's not the highest
protection level. For tokens, you want kSecAttrAccessibleWhenUnlocked in the Keychain, where data is encrypted at rest and only accessible
when the device is unlocked.
My SessionRestorationService does read from the Keychain correctly. The session restoration path is fine. The problem is the
startSessionMonitoring() dual-write creating a second, less secure copy of the same credentials. Those should be Keychain-only.
The Notification Console
At some point curl commands aren't the right interface
for sending push notifications to real users. I built
a Next.js admin portal that runs locally behind
Tailscale. The access model has three independent
layers. Reaching the portal requires enrollment in my
tailnet, which requires my explicit approval. Once
inside, the portal requires a valid Supabase session.
Even with a valid session, calling the Edge Function
that actually delivers notifications requires an
x-admin-secret header that's separate from the
Supabase credentials entirely. Any one of those three
layers failing blocks the action.
The architecture of the full pipeline:
pg_cron (hourly)
-> send-daily-verse Edge Function [auth:
service_role OR x-admin-secret]
-> send-push-notification Edge Function [auth:
x-admin-secret]
-> APNs
Portal (Next.js, Tailscale)
-> Server Action [auth: Supabase session]
-> send-push-notification Edge Function [auth:
x-admin-secret]
-> APNs
Two paths into the same pipe. The cron path runs automatically on an hourly schedule. The portal path is deliberate, manual sends.
The server action uses supabase.auth.getUser(), not
supabase.auth.session(). The distinction is that
getUser() makes a live API call to validate the token,
while session() reads from local state and trusts
whatever's cached. For an admin operation, live
validation is the only acceptable choice. A replayed
or expired session should not pass. I left a comment
in the code referencing OWASP Broken Access Control
specifically because I knew I'd rationalize the
simpler version the next time I looked at it.
Zod validates every input field before the Edge
Function sees anything: types, value ranges,
cross-field refinements. The schema rejects invalid
image URLs, badge counts below zero, relevance scores
outside 0 to 1, and payloads with neither a title nor
content_available. The Edge Function has its own
validation on top of that. Redundant validation at
multiple layers isn't over-engineering when you're
building tooling that broadcasts to everyone.
Two bugs in the delivery layer I found after shipping:
The first one was that apns-push-type was hardcoded to
"alert" in the Edge Function despite accepting an
apns_push_type parameter in the request body. My daily
verse function sends content_available: 1 to trigger
a background app refresh when the verse arrives.
Apple's APNs spec requires apns-push-type: background
for those payloads. Sending content_available with
apns-push-type: alert doesn't generate an error. Apple
accepts the notification and then delivers it at its
own discretion. In Low Power Mode and certain Focus
states it silently drops it. I spent longer than I'd
like to admit assuming the problem was in my
timezone-aware hour-matching logic. It was a hardcoded
string I forgot to wire to the parameter I was
already parsing.
The second was that I was calling push_enabled = false
on a device for both APNs 410 and 400 responses. A
410 (Gone) means the device token is permanently dead,
the device was reset or the app was uninstalled.
Disabling it is correct. A 400 (Bad Request) means I
sent a malformed payload. The device is fine. I made
an error on my end. Tombstoning the device on a 400
means a single malformed broadcast silently disables
every device it reaches. I caught this after a test
send where a field had the wrong format and several
sandbox devices stopped receiving notifications. The
connection took a while to surface.
The APNs Bugs I Didn't Notice Until I Looked
Two bugs in my send-push-notification Edge Function that I want to document because they're subtle.
First: the apns-push-type header is hardcoded to "alert" regardless of what the caller passes. APNs uses this header to route notifications
correctly. Background notifications need "background". Sending a background notification with apns-push-type: alert violates APNs
requirements and Apple can reject or throttle it. I have a parameter to accept different push types but I never wired it to the actual
header.
Second: I'm marking a device's push_enabled column as false on both 410 and 400 response codes from APNs. These mean very different things.
A 410 is "this device token is permanently invalid and you should stop sending to it." A 400 is a bad request, usually a malformed payload
or wrong push type. My malformed payload shouldn't permanently disable push for a real user, but that's what happens. A user who gets caught
by a 400 response will silently stop receiving notifications until they reinstall.
The APNs JWT is also regenerated on every function invocation. The ES256 token has a 60-minute lifespan. Generating a new one for every push
adds unnecessary latency and burns compute. Module-level caching with a 50-minute TTL is the right pattern.
Account Deletion, and the Part I Shipped Knowing It Was Incomplete
I'll be direct about this one. When I built the
account deletion flow, I knew it wasn't complete. My
deleteAccount() function deletes the user's row from
user_profiles and calls signOut(). There is a comment
in the code that I wrote: "Note: Auth user record may
still exist in auth.users table. To fully delete auth
user, consider setting up a database trigger or using
admin API."
That comment was honest self-documentation from a
moment where I was trying to ship the delete account
screen because Apple's App Store review guidelines
require one, I had a release queued, and the full
implementation wasn't done. It went on the Fizzzy.do
board. It stayed there through several releases
because from the outside, the behavior looked correct.
The sign-out happened, the UI reset, the account
appeared gone from the app. What persisted invisibly
was the auth record in Supabase's auth.users table,
along with the user's email address and Apple subject
identifier.
Under GDPR Article 17, the right to erasure means all of it, not just the profile row. This is exactly the kind of gap that accumulates when you're moving fast without a project manager or a compliance review. I'm not making excuses for it. I'm naming it because every solo developer I know has some version of this: a comment in the code that says "this isn't right yet" that has been there long enough it's started to feel permanent.
The correct implementation is a server-side Edge
Function that calls supabase.auth.admin.deleteUser().
The client can't call the admin API directly because
it requires the service role key. An Edge Function
with the same authenticated gate as everything else
handles both the profile deletion and the auth record
deletion. The iOS client calls the function, the
function handles cleanup completely, and the user is
actually gone.
I I did my undergrad in cybersecurity. I work as a cybersecurity professional. I can tell you the
formal definition of Broken Object Level
Authorization, explain what SECURITY DEFINER does at
the Postgres privilege model level, and describe why
NSFileProtectionCompleteUntilFirstUserAuthentication
is weaker than it sounds.
What school didn't give me was the experience of having 22 functions callable without authentication in a production database with real users in it and having to figure out which ones were actually exploitable and which ones were just noisy. Some of these were things I knew about abstractly but hadn't fixed. Some were things I did intentionally during development that never got cleaned up. One of them (the dual-write) I introduced while actively trying to be more secure, which is humbling.
Building Holy Scroll is where theory stopped being
theory. The gap between knowing what RLS is supposed
to do and shipping WITH CHECK (true) because you're
focused on getting device registration working and
you'll tighten the policy later, that gap only exists
in production.. The
difference between those outcomes and a quiet
afternoon fixing advisor warnings is usually not
sophistication. It's whether someone looked at the
thing from the outside before someone else did.
Holy Scroll is a better app for having looked.
Holy Scroll is available on the App Store.
Sources:
- Moltbook Data Breach: Supabase RLS Security Lessons | Bastion
- Supabase Security Flaw: 170+ Apps Exposed by Missing RLS | byteiota
- Parler Hacked: Analysis of the Parler Data Breach | Salt Security
- API Security Issue 116: Parler API Vulnerabilities | API Security News
- Supabase RLS: Common Mistakes and CVE-2025-48757 | Vibe App Scanner