Real-timish Elasticsearch

The product I work on uses Elasticsearch heavily. We use it to provide product filtering across many verticals selected by the user and constrained by our business rules. We also use it to show similar and recommended products (via similarity vectors). An uncompressed Elasticsearch response for the more important pages is typically 3-4 MB, so you can guess how complicated it is. Peak traffic is also quite heavy, so we really have to care about Elasticsearch performance. Here are some tips on how to survive.

Tracking snippets under control

You've just finished the cleanest code for a new feature. You're really proud of it, but then the marketing team needs just one tiny snippet to remarket a thing. Or to track its conversion. Or to automate lead generation. Whatever it is, you end up with an ugly JS snippet in the code. Oh, did I forget about GDPR and visitor consent? Yeah, you need that first to actually fire the tracking.

What's new in the frontend world?

I work for one of the biggest ecommerce aggregators on the Polish internet. The most valuable feature is quite simple - you see some products, click them and buy them on the shop's site. So how did we end up with a server-side rendered React + Redux combo? Is it really worth it?

Failsafe mixin

It may not be the most common need, but sometimes you have to provide a main feature in the most bulletproof way possible, so that everything around the feature can fail but the feature itself cannot.
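One way to picture the idea is the minimal Python sketch below. It assumes the main feature is rendering a product and the non-critical part is loading recommendations; `FailsafeMixin`, `failsafe`, `ProductPage` and both loader methods are hypothetical names made up for illustration, not the code from the post.

```python
import logging

logger = logging.getLogger("failsafe")


class FailsafeMixin:
    """Hypothetical mixin: run non-critical steps without breaking the caller."""

    def failsafe(self, step, default=None):
        # Call the step; on any error, log it and fall back to the default.
        try:
            return step()
        except Exception:
            logger.exception("non-critical step failed; continuing")
            return default


class ProductPage(FailsafeMixin):
    def render(self):
        # The main feature is not wrapped: if it fails, the page should fail.
        product = self.load_product()
        # Everything around it is wrapped and may fail quietly.
        recommendations = self.failsafe(self.load_recommendations, default=[])
        return {"product": product, "recommendations": recommendations}

    def load_product(self):
        return {"id": 1, "name": "Example product"}

    def load_recommendations(self):
        # Simulated outage of a secondary service.
        raise RuntimeError("recommendation service is down")


page = ProductPage().render()
```

Here the page still renders with an empty recommendations list even though the secondary call raised.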

Logstash as a messaging platform bridge

I'm still impressed by how powerful Logstash is. It's simple in theory - you set the input, apply some transformations and send the output. The trick is that you can use it with a variety of inputs and outputs, and Logstash will handle them in a performant way.

Too many cookies

What if one of your tracking snippets started to create cookies like crazy? I mean a few hundred for power users (and at least a couple for each visitor). A few things may happen - the browser will start to replace existing cookies with new ones after hitting a limit (180 cookies?). Not very cool, but what about Varnish or nginx rejecting the request because the header is too large?
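To get a feel for the numbers, here is a rough Python sketch that estimates the size of the Cookie request header and compares it with a limit. The 8 KB limit is an assumption (it matches the default buffer size of nginx's `large_client_header_buffers`, but your proxy may be configured differently), and the cookie names and values are made up.

```python
# Assumed per-header limit in bytes; adjust to your proxy configuration.
HEADER_LIMIT_BYTES = 8 * 1024


def cookie_header_size(cookies: dict) -> int:
    """Return the byte size of the Cookie header line for the given cookies."""
    # Browsers send cookies as "name=value" pairs joined with "; ".
    value = "; ".join(f"{name}={val}" for name, val in cookies.items())
    return len(f"Cookie: {value}".encode("utf-8"))


def exceeds_limit(cookies: dict, limit: int = HEADER_LIMIT_BYTES) -> bool:
    """Return True if the Cookie header alone is larger than the limit."""
    return cookie_header_size(cookies) > limit


# Hypothetical power user: 300 tracking cookies with 32-character values.
power_user = {f"_trk_{i}": "x" * 32 for i in range(300)}
print(cookie_header_size(power_user), exceeds_limit(power_user))
```

With a few hundred cookies of this size, the Cookie header alone goes past the assumed 8 KB, before any other request headers are counted.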

Performance gotcha in Rails collection_ids +=

Today I got a Rollbar alert about a timeout from the database connection pool. At first I thought we might have hit the connection limit for our MariaDB instance, but as it turned out, a few Sidekiq jobs had been stuck for a long time - 40 minutes, actually.