753

April 8th, 2024 × #caching#webdev#performance

Cache Ruins Everything Around Me

Discussion about cache invalidation issues when caching user-specific data, solutions like using different URLs, partial caching, edge functions, and drawbacks like flash of unstyled content.

Topic 0 00:00

Transcript

Scott Tolinski

Welcome to Syntax.

Scott Tolinski

On this Monday Hasty Treat, we're gonna be talking about how cache ruins everything around me. That's right.

Topic 1 00:03

Talking about how cache ruins everything

Scott Tolinski

It's simultaneously a Wu-Tang reference and a reference to things that are actually going on in our real development life.

Scott Tolinski

Right now, we have been experiencing some crazy, interesting cache issues on the Syntax site, and we thought it would be worthwhile to take some time to break them down in this episode. As always, my name is Scott Tolinski. I'm a developer from Denver. With me, as always, is Wes Bos. What's up, Wes?

Topic 2 00:20

Experiencing crazy cache issues on Syntax site

Wes Bos

Hey.

Wes Bos

Not too much. I was in a crazy rabbit hole with this cache stuff yesterday, and I know you've gone down the rabbit hole a couple times as well. And we finally figured out what's going on. There's the joke that the hardest problem in computer science is cache invalidation.

Wes Bos

And I believe it, because there are so many places where you can cache things. And if you get the wrong data in the cache, it's very, very frustrating. So we thought we would explain the problem and the possible solutions that you can reach for with this type of stuff. Specifically, this is gonna be about CDNs and CDN caching, not the browser cache and not an in-memory cache.

Scott Tolinski

Yeah. Well, we can talk a little bit about those as well. But before we do, let's head on over to the Syntax YouTube. We've been putting out a ton of stuff on the Syntax YouTube. That's youtube.com/@syntaxfm. And very recently, CJ has been doing some deep dives into content that we may have covered on the Syntax podcast already. Things like self-hosting and setting up your own VPS. And he's gonna be diving into more. He's releasing one on Docker very soon.

Scott Tolinski

It's gonna be up by the time you're listening to this, and then he's even getting into things like Coolify or setting up your Mac. CJ has been crushing it on the YouTube channel, and you can get every single episode that we've released now on video as well, if you wanna see both of our fine faces here. So let's get into caching. Real quick, let's talk about the cache problem. What's interesting about this cache problem is that if you look at the GitHub issue where I was talking about this maybe a month ago, you can see me kind of wrestling with it, where I say, "I solved it. It's a cache issue." And then I have a reply immediately after: "No, wait. I don't think it's a cache issue." And then another reply that says, "No, wait. It definitely was a cache issue."

Topic 3 02:42

Issue was intermittently popping up

Scott Tolinski

And the big thing here is that this was one of those issues that was intermittently popping up for us. The type of thing that you think you fix because, hey. It's working. And the next thing you know, not working anymore, then it's working, then it's not working.

Scott Tolinski

So what's the issue? What's the problem? Well, you may have noticed that sometimes when you go to the Syntax website and refresh, it may have served you a different theme than you were expecting, depending on what page you were landing on and when you were refreshing.

Topic 4 02:58

Wrong theme being served sometimes on refresh

Scott Tolinski

The big thing here is simply that that theme was being cached.

Scott Tolinski

And it was being cached at the CDN level.

Scott Tolinski

So therefore, it was serving up a cached version of the HTML that might have been rendered with somebody else's theme, whatever they may have set on the website.

Wes Bos

Yeah. So it's really interesting, because the way that the CDN cache works is that we server render the entire page. And as part of rendering that page, it detects if you have a cookie set for what your theme is. And if there is a cookie set, it renders it out with a class like theme-level-up, and then it makes it purple. Right? And this is possibly, not for us, but for people listening, a security issue, because when you put user-specific data in a cache, there's a possibility that that user-specific data is going to be served up to somebody else. Basically, somebody visits a single page on the Syntax website.

Wes Bos

The CDN says, oh, we don't have this page, so we better go render it. So it just renders it for whoever is trying to visit.

Wes Bos

And if that person has a theme set, and it's not very common because not a lot of people have a theme set, but if somebody goes to that specific page with a theme set, it renders out the HTML, caches that HTML with that user's settings, their theme, and then sticks it in the CDN. And then anybody else who visits that page for the next, I don't know what it is, like, 10 minutes or so, will then get that cached HTML.
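
A minimal sketch of the failure described here, assuming a cache that is keyed by URL path only. The names `renderPage`, `cdnCache`, and `handleRequest`, the path `/show/1`, and the theme value `level-up` are hypothetical stand-ins, not the Syntax site's actual code:

```typescript
// Hypothetical model of a server-rendered page sitting behind a CDN cache.
type Visitor = { themeCookie?: string };

// Server render: the visitor's theme cookie ends up baked into the HTML.
function renderPage(path: string, visitor: Visitor): string {
  const theme = visitor.themeCookie ?? "light";
  return `<html class="theme-${theme}"><body>${path}</body></html>`;
}

// The CDN cache is keyed by URL path only, so it cannot tell visitors apart.
const cdnCache = new Map<string, string>();

function handleRequest(path: string, visitor: Visitor): string {
  const cached = cdnCache.get(path);
  if (cached !== undefined) return cached; // hit: the cookie is never consulted
  const html = renderPage(path, visitor); // miss: render for whoever asked first
  cdnCache.set(path, html);
  return html;
}

const first = handleRequest("/show/1", { themeCookie: "level-up" });
const second = handleRequest("/show/1", {}); // this visitor never picked a theme
console.log(first === second); // true: the second visitor gets the first one's theme
```

Swap the theme for an account name or an email address and the same mechanism turns into the security problem mentioned above.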

Wes Bos

And, again, that could be an issue if you're putting sensitive user data in there, but ours, luckily, is just a cache thing. And I remember this popped up, like, 6 months ago, and I was like, I think it's because we're trying to cache. You can't have your cake and eat it too. You can't have user-specific, dynamically rendered pages and caching at the same time, and we want to be able to serve it up from the cache. Right? So I thought, let's go through some of the possible solutions to this type of thing because, yeah, it can be really frustrating.

Topic 5 05:28

Solutions for caching dynamic content issues

Scott Tolinski

Yeah. Totally. And, you know, part of that frustration too was that I didn't even think about it being cached at the CDN level. The reason why I kept thinking it was solved is because we weren't caching at the CDN level for the index page. So if I was on the index page and I was changing the theme and refreshing on prod, it was working 100% of the time. Now, I didn't think to check the individual show pages, or the pages that we were caching harder at the CDN level, because then I would have noticed that it in fact wasn't solved. So it's one of those things that, you know, hey, man. This is the type of bug that really is the worst type because, hey, it works. No wait, it doesn't work. No wait, it works. No, it doesn't work. What's going on here?

Wes Bos

Yeah. And also, when you're developing locally, it doesn't pop up because we're not running a CDN cache locally. We're just rerendering on every page reload, or not even page reload, every single save. So, where else might this pop up? A lot of the time, you'll hit this if you have A/B testing, and it's gonna be cookie based. Right? So somebody visits a landing page for your website, and half the people get "the best podcast ever," and the other half get "become a better web developer." So you want to cache the page, but you also want to serve different people different content on the same URL. So A/B testing, user-selected features, and themes, which is probably the biggest one.

Wes Bos

Geo-based items, so based on their language or where they're coming from, do you wanna show euros or Canadian dollars? Do you wanna show French or English? It really depends. Images. I found that images are a really good use case for this, because you visit the same URL for an image, scott.jpeg, and a lot of the time, things like Cloudinary and imgix are going to vary the output based on the request. They can see the user agent coming in, and if that user agent supports WebP, they're gonna convert it to WebP and serve it up. Even though it has a .jpeg extension, they can still serve it up with a content type of WebP. JSON or HTML, I've run into this many times. The icanhazdadjoke URL.

Wes Bos

The endpoint is literally just icanhazdadjoke.com, and you visit it in the browser, and it gives you the HTML page. You ping it with a fetch request with an Accept header asking for JSON, and it's gonna give you JSON. Same URL, different content output, but both of those things should be cached.

Wes Bos

The JSON output and the HTML that's rendered out. Right? Different encodings. Some things need to be gzipped. Some clients don't support gzip, so you have to send it over without it. So this whole idea of sending different content via the same URL is called content negotiation.

Wes Bos

And I'll link up the MDN docs on what that is. And it can be a bit of a pain when you start running into caching at a CDN level, because you need to cache more than one thing per URL. You want it to be dynamic, but not too dynamic.
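
To make content negotiation concrete, here is a small sketch, assuming a single endpoint that answers with JSON or HTML depending on the request's Accept header. The `negotiate` function and `Representation` type are hypothetical, and the header check is deliberately naive (it ignores q-values and wildcards):

```typescript
// One URL, two representations. The Vary value tells caches which request
// header the response depends on.
type Representation = { contentType: string; body: string; vary: string };

function negotiate(acceptHeader: string | null, joke: string): Representation {
  // Naive check: a fuller version would parse q-values and wildcards.
  const wantsJson = (acceptHeader ?? "").toLowerCase().includes("application/json");
  return wantsJson
    ? { contentType: "application/json", body: JSON.stringify({ joke }), vary: "Accept" }
    : { contentType: "text/html", body: `<p>${joke}</p>`, vary: "Accept" };
}

const forBrowser = negotiate("text/html,application/xhtml+xml", "I'm a dad joke");
const forFetch = negotiate("application/json", "I'm a dad joke");
console.log(forBrowser.contentType, forFetch.contentType);
```

A cache in front of this endpoint has to store both outputs under the same URL, which is exactly where the trouble discussed in this episode starts.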

Scott Tolinski

What are the solutions that that we have here? The first one might be something that was suggested to us quite a bit on Twitter, especially because this is something that people use often as a solution for internationalization.

Scott Tolinski

You'll often see a query parameter with maybe the location or something, where an intermediary step has been brought in. That way you have a different URL for the different content.

Topic 6 09:05

Using query params for different content versions

Scott Tolinski

Because that way, that URL itself can be cached, especially if you have, again, different options, different internationalization, language things. That works for anything that makes sense to put into a URL.

Scott Tolinski

And unfortunately for us, a CSS theme is not really something that works there. But again, it is an interesting solution for many issues, and the reason you end up seeing URLs being used for different languages is that it works, and it's much easier than what we're trying to do for this theme system.

Wes Bos

Yeah. It's way easier, especially for languages. If you just put the language code, en-CA or fr-CA for French Canadian, in the URL, then it's so much easier to cache that. It's so much easier just to redirect somebody to that specific page based on their language. When you request a page, there's a header that says what your Accept-Language is. And based on the language that's set in your browser, the server is able to just redirect you to different pages, and putting it in the URL bar is better. Also, you can share URLs that have your language in them, so it makes it easier. So in the example of JSON versus HTML, what most people do is they don't use the same URL. They set a query param, or they add .json to the end, or they add something to the URL. So it's an entirely different URL, and you don't have to do anything special to cache it.
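
A rough sketch of the redirect idea, assuming the site supports only the two locales mentioned here and falls back to `en-CA`. The function names and the simplified Accept-Language parsing (q-values are dropped and the header's listed order is trusted) are assumptions for illustration:

```typescript
// Locales the hypothetical site supports; each gets its own cacheable URL.
const SUPPORTED = ["en-CA", "fr-CA"] as const;
type Locale = (typeof SUPPORTED)[number];

function pickLocale(acceptLanguage: string | null): Locale {
  const requested = (acceptLanguage ?? "")
    .split(",")
    .map((part) => (part.split(";")[0] ?? "").trim().toLowerCase()); // drop q-values
  for (const tag of requested) {
    // Match "fr-ca" exactly, or a bare "fr" against "fr-CA".
    const match = SUPPORTED.find(
      (loc) => loc.toLowerCase() === tag || loc.toLowerCase().startsWith(tag + "-"),
    );
    if (match) return match;
  }
  return "en-CA"; // assumed default
}

// The path a server could redirect to; the CDN then caches each URL separately.
function localizedPath(path: string, acceptLanguage: string | null): string {
  return `/${pickLocale(acceptLanguage)}${path}`;
}

console.log(localizedPath("/shows", "fr-FR,fr;q=0.9,en;q=0.8")); // "/fr-CA/shows"
```

Because the language now lives in the URL, a plain URL-keyed cache is enough, and the link can be shared as-is.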

Wes Bos

So that's probably the the most ideal.

Wes Bos

And I asked on Twitter, what are people's strategies for content negotiation? And I said, like, if it's a theme or if it's a language. And most people were like, well, if it's a language, just use different URLs. But that doesn't work for themes, because you're not gonna tell somebody to go to syntax.fm/?theme=dark. No one's gonna type that in. Right? You need to just be able to visit the URL directly.

Topic 7 11:19

Not caching page but cache parts like database query

Wes Bos

So the next solution that you have here is: just don't cache the page.

Wes Bos

You can cache other parts of it. So think about it. Let's take syntax.fm. What could possibly make that slow? Right. The initial request to the server could be slow. The database query, where you pull the latest shows and come back, could be slow. The actual rendering, taking the data and rendering it out via Svelte, could be slow.

Wes Bos

So one of the biggest parts of that is the database query, and you can simply take that out of the equation and say, okay, I'm not gonna cache the actual rendering of the page, but I am going to cache the database query so that whole round trip to the database and back with all the data is not needed. And Scott, you built this, right? You stick the data in Redis and refresh it if it's more than, I don't know, 10 minutes old.

Scott Tolinski

Yeah. It's funny. I have a long-running joke with a friend of mine, who's actually gonna be coming on the show eventually, and we just throw the word "mang" into everything. So we'd always call Superman "Supermang," or any superhero. We'd call them, like, "Spidermang," for no reason whatsoever.

Scott Tolinski

So I made this the CacheMang, and the CacheMang is basically just a function that accepts what the query is and what the query parameters are and all those things, and saves the data. It acts kind of like stale-while-revalidate, just in Redis. But it does so with a really simple key. For us, if it's a show, it's just show, colon, whatever the show number is, in place of that complex query. Because let me tell you, that's a big old query in the database.

Scott Tolinski

That query then gets to be just saved in Redis. We're using Upstash for that, which has been really nice. It's basically just a Redis store.
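
The idea can be sketched roughly like this. It is not the actual CacheMang code: `createCacheMang` is a hypothetical name, a `Map` stands in for Redis, the query is synchronous, and the refresh runs inline where a real version would kick it off in the background:

```typescript
// A cached entry remembers when it was stored so staleness can be checked.
type Entry<T> = { value: T; storedAt: number };

// maxAgeMs: how old an entry may get before it is refreshed.
// now: injectable clock, which keeps the behavior easy to test.
function createCacheMang<T>(maxAgeMs: number, now: () => number = Date.now) {
  const store = new Map<string, Entry<T>>(); // stand-in for Redis

  return function cached(key: string, runQuery: () => T): T {
    const hit = store.get(key);
    if (hit === undefined) {
      // Nothing cached yet: pay for the query once and remember the result.
      const value = runQuery();
      store.set(key, { value, storedAt: now() });
      return value;
    }
    if (now() - hit.storedAt > maxAgeMs) {
      // Stale: still answer with the old value, but refresh for the next caller.
      store.set(key, { value: runQuery(), storedAt: now() });
    }
    return hit.value;
  };
}

// Usage: the key is simple ("show:" plus the show number), the query is not.
const cachedShow = createCacheMang<string>(10 * 60 * 1000);
const show = cachedShow("show:753", () => "result of the big show query");
console.log(show);
```

The simple key is the design choice worth noting: callers never need to know what the underlying query looks like, only which show they want.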

Wes Bos

Yeah. Other other things, lots of key value stores are really handy for this.

Wes Bos

Deno has a key-value store, which I used recently. Cloudflare has a key-value store.

Wes Bos

And those things are really easy. What I like about Redis is that you can set an expiry on it directly, so you don't ever have to worry about deleting it or managing it. You'd say, expire after some amount of time.

Scott Tolinski

So you could just nuke it. And, also, we do have a button where we can just nuke everything, and then it repopulates itself, which is really nice. I made that because of Drupal. In Drupal, there's a lot of caching going on. And one of the first extensions you install in Drupal is the one that puts the dump cache button in your toolbar, because whenever anything's going wonky, you just dump that cache.

Wes Bos

And Redis also has, like, cache tags as well, right, where you could tag it? One kinda cool thing I really like about both TanStack Query and now Next.js, which has added it, is that you can tag your queries with something like show or show 125, and then you can say, alright, let me invalidate all of the things that have a tag of show 125, or let me invalidate everything that has a tag of show. So you can add multiple tags to it, from very broad to very specific.

Wes Bos

And that's really nice as well when you just want to say, alright, anything that has this tag, let me nuke it out.
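
Here is a generic sketch of tag-based invalidation, to show the shape of the idea. It is not the TanStack Query or Next.js API; `createTaggedCache` and its methods are hypothetical names, and the store is a plain in-memory `Map`:

```typescript
// Each cached value can carry several tags, from broad ("show") to specific
// ("show:125"). Invalidating a tag removes every value that carries it.
function createTaggedCache<T>() {
  const values = new Map<string, T>();
  const keysByTag = new Map<string, Set<string>>();

  return {
    set(key: string, value: T, tags: string[]): void {
      values.set(key, value);
      for (const tag of tags) {
        const keys = keysByTag.get(tag) ?? new Set<string>();
        keys.add(key);
        keysByTag.set(tag, keys);
      }
    },
    get(key: string): T | undefined {
      return values.get(key);
    },
    invalidateTag(tag: string): void {
      const keys = keysByTag.get(tag);
      if (keys === undefined) return;
      keys.forEach((key) => values.delete(key));
      keysByTag.delete(tag);
    },
  };
}

const cache = createTaggedCache<string>();
cache.set("show:125", "show 125 data", ["show", "show:125"]);
cache.set("show:126", "show 126 data", ["show", "show:126"]);
cache.invalidateTag("show:125"); // specific: only show 125 is dropped
cache.invalidateTag("show"); // broad: everything tagged "show" is dropped
```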

Wes Bos

Oh, Brooklyn.

Scott Tolinski

I have my daughter here. You can see her. Hey. Brooklyn, do you wanna say hi? She's not feeling so good, which is why she's... oh.

Scott Tolinski

Oh.

Scott Tolinski

Okay. I'm gonna mute my mic here for a sec.

Wes Bos

Alright. The next one we have here is this idea of a cache key, which is kind of what I was looking for when we ran into this, because we have, I don't know, maybe 6 different themes.

Topic 8 15:02

Having cache key variants for each theme

Wes Bos

And I was like, okay. Well, it's not a big deal to render out the home page for light mode, and then render it out for dark mode, and then render it out for the level up theme. And then you just have 3 versions of that cache, or 3 variants of that cache. And then depending on the request that comes in, you serve up one of those 3 versions. So that will significantly increase your cache size and significantly decrease your cache hits, the more options that you have.
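
A sketch of what a cache key with variants could look like, assuming the variant comes from a cookie named `theme`. The theme values, the `readCookie` and `cacheKey` helpers, and the key format are all hypothetical:

```typescript
// The set of themes is fixed, so the number of cached variants per page is too.
const THEMES = ["light", "dark", "level-up"] as const;
type Theme = (typeof THEMES)[number];

// Minimal Cookie header parser: returns the value for one cookie name.
function readCookie(cookieHeader: string | null, name: string): string | undefined {
  for (const pair of (cookieHeader ?? "").split(";")) {
    const [rawKey, ...rest] = pair.split("=");
    if ((rawKey ?? "").trim() === name) return rest.join("=").trim();
  }
  return undefined;
}

// Cache key = path + theme variant instead of path alone.
function cacheKey(path: string, cookieHeader: string | null): string {
  const requested = readCookie(cookieHeader, "theme");
  // Unknown or missing values collapse to the default, so a made-up cookie
  // value cannot create extra cache entries.
  const theme: Theme = THEMES.find((t) => t === requested) ?? "light";
  return `${path}::theme=${theme}`;
}

console.log(cacheKey("/", "session=abc; theme=dark")); // "/::theme=dark"
```

With 3 themes, each page can occupy up to 3 cache slots, which is the size and hit-rate trade-off described above.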

Wes Bos

But it is often something that is is needed. And I looked into this, and it seems to be it's not a standard, so it has to be implemented at every single level of CDN.

Wes Bos

So there is a standard header called Vary, but it doesn't seem very well supported.

Wes Bos

Specifically, Cloudflare and Vercel CDNs do not support this.

Topic 9 16:06

Edge functions to modify cached content

Wes Bos

And then it seems like Netlify has a really good blog post about this. They rolled out their own header called Netlify-Vary.

Wes Bos

And you can specifically say, alright.

Wes Bos

Here are the different variants. It must be based on things that are coming in on the request, because the browser has to send them automatically when somebody visits a URL. So what are things that are automatically sent by the browser? Well, the cookies are, the Accept-Language header is, and the CDN can often resolve the user's country code. So Netlify has this thing called the Netlify-Vary header, where you can say, alright, based on device type or based on this specific cookie, which is what we want. We say, based on the theme cookie, serve up one of these 6 different variants for this specific page.
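
As a hedged illustration, the response headers for a themed page might look something like the sketch below. The `cookie=theme` value follows the syntax as I understand it from Netlify's documentation and should be checked there before use; the `themedPageHeaders` helper and the 10-minute `s-maxage` are assumptions:

```typescript
// Headers a server-rendered page could send so the CDN keeps one cached
// variant per value of the "theme" cookie.
function themedPageHeaders(): Record<string, string> {
  return {
    "Content-Type": "text/html; charset=utf-8",
    // Let shared caches keep the page for 10 minutes (assumed duration).
    "Cache-Control": "public, s-maxage=600",
    // Assumed syntax: vary the cached response on the "theme" cookie.
    "Netlify-Vary": "cookie=theme",
  };
}

console.log(themedPageHeaders());
```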

Wes Bos

Fastly also has their own version of it, sort of like a hash for that specific page.

Wes Bos

Cloudflare does have cache keys, but they're enterprise only.

Wes Bos

It's not even a pro plan. It's not a business plan. It's enterprise, and you gotta get on with sales. I don't want to go to sales. And there's no chance that's gonna be cheap.

Scott Tolinski

Does anybody ever wanna talk to sales? You know, shout out to sales folks, but, man, one of my first gigs, Wes, was for a small website. They sold specialized exercise equipment, not exercise bikes, and they wanted an online store. That was it. "We want an online store, but you can't have any prices, and all of the buy now buttons should take you to a page that just has a phone number." And I'm like, what? Why? And they're like, "Well, we can't have the store cannibalizing our sales department."

Scott Tolinski

We're like, oh. That drives me crazy. Begrudgingly did it. But yeah.

Wes Bos

So the next option is to use an edge function. I think this is a really clear example of a good use case for an edge function, which is that, well, maybe you don't wanna make 6 different variations for every single page. Maybe you only wanna make 1, but you still want it to be dynamic. So an edge function is a function that sits in front of your actual server, and it will intercept the request coming in. Cloudflare Workers is a good example of this. So what you could do is jump in the middle, go and fetch the regular cached page, and then parse the HTML and change the bits that you want. So we could write a little Cloudflare Worker, and I actually started doing this myself. It will simply go and fetch the page.

Wes Bos

It'll check if the theme cookie is set, and if it is set, it will just find that div and swap out the actual class name of theme-light with theme-dark or theme-century, whatever all the other themes are. And part of me is like, that's a cool use case. But part of me is also like, I don't want to add a whole layer of abstraction on top of this just to work with the cache. So that is kind of an interesting use case, and it is a pretty common use case for using an edge function or a worker in front of your application.
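
The worker idea can be sketched in plain TypeScript as below. This is not the actual worker: the names, the theme list, and the `class="theme-light"` marker are assumptions, and `fetchCachedHtml` is a hypothetical callback standing in for the fetch to the cached origin page:

```typescript
// Only themes on this list are ever written into the HTML, so a hand-edited
// cookie value cannot inject markup.
const KNOWN_THEMES = ["light", "dark", "level-up"];

function themeFromCookie(cookieHeader: string | null): string | undefined {
  const match = /(?:^|;\s*)theme=([^;]+)/.exec(cookieHeader ?? "");
  const value = match?.[1];
  return value !== undefined && KNOWN_THEMES.indexOf(value) !== -1 ? value : undefined;
}

// Swap the default theme class in the shared cached HTML for the visitor's one.
function applyTheme(cachedHtml: string, cookieHeader: string | null): string {
  const theme = themeFromCookie(cookieHeader);
  if (theme === undefined) return cachedHtml;
  return cachedHtml.replace('class="theme-light"', `class="theme-${theme}"`);
}

// Shape of the edge handler: fetch the one cached copy, personalize it per request.
async function handleRequest(
  url: string,
  cookieHeader: string | null,
  fetchCachedHtml: (url: string) => Promise<string>,
): Promise<string> {
  const html = await fetchCachedHtml(url); // same cached page for everyone
  return applyTheme(html, cookieHeader); // per-visitor tweak at the edge
}
```

String replacement keeps the sketch short. Cloudflare Workers also provide an `HTMLRewriter` API, which is likely a sturdier way to change an attribute than matching raw text.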

Topic 10 19:52

Flash of unstyled content as client-side solution

Scott Tolinski

Yep. And the easiest solution here, which is also maybe the worst in my mind because I kinda hate it, is the flash of, not unstyled content, but unthemed content. So you would just approach this from a pure client-side perspective, which is what a lot of people told us to do. I don't love that solution. The website loads and it's the normal colors. And then a second later, the class gets added and the colors all switch. I mean, that to me is even more distracting than the fonts changing after the page loads or, you know, content being dropped in while you're potentially scrolling. I don't know. It doesn't seem like a good experience to me. So yeah. Tough problem.

Wes Bos

Yeah. I think that was the part that a lot of people misunderstood: if you only switch it client side, yeah, you get this quick little flash of light mode or flash of dark mode. And, like, maybe at some point, we'll have a header that is sent along.

Wes Bos

I know CSS has light and dark mode, and it's also not that we're loading different CSS files based on that. The themes are just a bunch of variables. Right? You can load every single theme in; it's nothing.

Wes Bos

But the problem is that you want the class on the HTML element so that the very first render that comes through with CSS has the correct theme, and it's not picking it up on the 2nd render and giving you that initial flash. It's not that big of a deal because once you're on the page, you click the toggle, and it goes. And, actually, I was trying to see how bad the flash would be, and I had to turn off JavaScript to get the non-Svelte version just to see it. It was so fast that I wasn't even seeing the flash. Interesting. That's as far as I've got right now. So I'm at the spot now where I've written the worker, and I'm trying to finish up the client-side one. And then we'll see because, oh, sorry, I should say, because we're using Cloudflare, we don't have the ability to do the cache key thing, which I think would probably be ideal. And I don't really care that much to switch any of this infrastructure. You know? So I think we'll try both, and we'll see which one is worth it.
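
For comparison, a sketch of the client-side approach, assuming the theme lives in a cookie named `theme` and the classes follow a `theme-*` pattern (both assumptions). The cached HTML stays identical for everyone, and a small script adds the class in the browser:

```typescript
// Turn the browser's cookie string into a theme class name, if one is set.
function themeClassFromCookie(cookie: string): string | undefined {
  // Restricting the value to lowercase letters and dashes keeps odd cookie
  // values out of the class list.
  const match = /(?:^|;\s*)theme=([a-z-]+)/.exec(cookie);
  return match ? `theme-${match[1]}` : undefined;
}

// Only runs in a browser; in other environments there is no document.
const doc = (globalThis as any).document;
if (doc) {
  const themeClass = themeClassFromCookie(doc.cookie);
  if (themeClass) doc.documentElement.classList.add(themeClass);
}
```

How visible the flash is depends on when this runs. Placing it in an inline script early in the head, before the page paints, generally shrinks or removes it, while running it after the framework hydrates makes it more noticeable.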

Scott Tolinski

Yeah. Yeah. It should be interesting. Either way, you know, we wanna get this done in the best possible way. It's funny. On one of my side projects, Wes, I was having the same issue, and then I realized, wait, SSR is doing absolutely nothing for me on this site. There are no SEO requirements. Guess what? I just turned off SSR.

Wes Bos

Hey, you can do that, and it works really well. I should also say that it will not flash for light and dark mode defaults.

Wes Bos

It's only when you explicitly go in and override your theme. Say, I am light mode on my computer, but I'm turning on dark mode, or I am dark mode and I'm turning on the level up theme. Because CSS has native support for light and dark, or system, it won't flicker. It's only when you explicitly go override it, and that is a bit frustrating.

Scott Tolinski

Word. Well, let us know if you have any novel or interesting ideas that we did not cover in this episode. In fact, you know, we always say let us know, but I wanna make this very clear. You can go to YouTube.

Scott Tolinski

You can leave us a comment. And guess what? We read all those comments. We look at them. So that's, like, the best place, I think, to get in touch with us about any episode or feedback or even questions that you might have. We try to get in there and answer everything. So if you listen to this and you say, wait a second. I have a good idea, or I think you guys didn't think about this, or, hey. I just want a little bit more information.

Scott Tolinski

Head on over to youtube.com/@syntaxfm.

Scott Tolinski

Look for this video. I will also have it available on the site and all that good stuff, and you can just leave us a comment on the video. That'd be great.

Wes Bos

Beautiful. Alright. Thanks for tuning in. Catch you later. Peace.
