Introduction
I am Saurabh, a Tech Lead at Pharmeasy. My team develops and maintains the ML infrastructure for serving customer recommendations. Last year, we deployed our first real-time recommendation engine for non-Rx product category recommendations on our home page. This post explains how Isima’s bi(OS) helped with this initiative and describes the solution architecture for recommendations.
Real-time ML
Like any eCommerce property, we depend on the relevance, freshness, and speed of our recommendations. While our data science teams pioneer new models for relevance, we in ML engineering care deeply about freshness and speed. Hence, we are cautious when we onboard a new recommendation engine and its new feature store. Our app chooses to time out rather than wait for stale or slow recommendations.
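The time-out behavior can be sketched as follows. This is a minimal illustration in Python, not our production code: the function names, the thread pool, and the 100 ms budget are all assumptions made for the example.

```python
import concurrent.futures
import time

# Hypothetical latency budget; the app's real timeout is not stated in this post.
TIMEOUT_SECONDS = 0.1

# Shared pool so a slow call does not block the request thread.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def recommend_with_timeout(fetch, user_id, fallback, timeout=TIMEOUT_SECONDS):
    """Return fetch(user_id) if it finishes within `timeout`, else `fallback`."""
    future = _executor.submit(fetch, user_id)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # Best effort only: a call that is already running is not interrupted.
        future.cancel()
        return fallback


def fast_fetch(user_id):
    """Stand-in for a feature-store lookup that answers quickly."""
    return ["vitamins", "skincare"]


def slow_fetch(user_id):
    """Stand-in for a lookup that exceeds the latency budget."""
    time.sleep(0.3)
    return ["late result"]
```

With this shape, a slow lookup costs the caller at most the timeout, and the home page falls back to a default (for example, an empty or generic list) instead of showing late recommendations.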
Converting eCommerce window shoppers
eCommerce consumers typically window shop – they land on the home page, search, browse various products, and repeat the process. The goal was to use the consumer clicks on products as a proxy for their intent and recommend close-enough products on the home page. The trick was to do it in real time.
Data from users’ devices is sent via a 3rd-party SaaS tool to bi(OS). With zero lines of code, bi(OS) prepares a feature that can support the following query by our recommendation engine –
“select distinct Top N product categories from all productIDs browsed from an unbounded table visits sorted by time over the last 15 days and keep it refreshed every ~1 min.” Note that inserts into visits happen with quorum consistency across three availability zones and five-nines (99.999%) reliability, and that selects complete with a p99 latency under 25 ms and a p99.9 latency under 100 ms.
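To make the semantics of that query concrete, here is a small in-memory sketch in Python. It is not bi(OS) code and does not use the bi(OS) SDK. The `Visit` record, its field names, the per-user filter, and the assumption that each visit already carries its product category are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Visit:
    # Hypothetical record shape; the real schema of `visits` is not shown here.
    user_id: str
    product_id: str
    category: str
    visited_at: datetime


def top_n_recent_categories(visits, user_id, now, n=5, window=timedelta(days=15)):
    """Distinct categories the user browsed within `window`, most recent first."""
    if n <= 0:
        return []
    cutoff = now - window
    # Keep only this user's visits inside the time window, newest first.
    recent = sorted(
        (v for v in visits if v.user_id == user_id and cutoff <= v.visited_at <= now),
        key=lambda v: v.visited_at,
        reverse=True,
    )
    categories = []
    for visit in recent:
        if visit.category not in categories:
            categories.append(visit.category)
            if len(categories) == n:
                break
    return categories
```

In production the table is unbounded and the result is refreshed roughly every minute, so a real implementation would maintain this incrementally rather than rescanning every visit as this sketch does.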
This feature is kept fresh within a 1-minute boundary. The results of this feature go through hygiene checks before being served to the consumer. This solution has been in production for the past year. Setting aside the early integration period, bi(OS) has exceeded our SLA expectations for performance, availability, and reliability.
Highlights
Our use of bi(OS) for real-time recommendations is unique for the following reasons –
- Per-visit personalization – Compared to other recommendations at Pharmeasy, bi(OS) enables features at a per-visit level. This allows us to personalize each visit independently of which device is used within a family.
- Real-time feature refresh – Our other data stores are updated daily using batch pipelines. bi(OS) provides features that are fresh within a minute. Amongst all our ML models, this is the fastest feature refresh for the highest-traffic data stream [1].
- SQL-friendly syntax – bi(OS)’s SDKs expose an SQL-friendly syntax rather than the custom APIs of other NoSQL data stores, which makes them easy to learn.
- QoS guarantees – Analytics teams within Pharmeasy use data stored within bi(OS) to run periodic ETL jobs, perform ad hoc analysis, and power dashboards. In spite of such heavy queries, the SLA delivered to our micro-services hasn’t been compromised. Our other feature stores aren’t used for any other purpose.
- Deployment – bi(OS) is delivered as-a-service and is accessed as a feature store over a VPC peering link by our micro-service. We are responsible for the maintenance and upkeep of our other feature stores.
- Observability – Unlike other data stores, bi(OS) provides observability metrics out of the box.
Conclusion
This real-time recommendation based on bi(OS) was our first implementation with zero code between data scientists and ML engineers. Furthermore, this implementation delivered one of the best CTRs of all recommendations and impeccable SLAs. This bi(OS) capability will be used for future recommendation engine projects.
Read here how bi(OS) helps all analytics use cases at Pharmeasy.
[1] As per TikTok’s Monolith paper, the faster the refresh frequency, the better the efficacy of the ML model. That paper demonstrated that a freshness of 30 minutes is better than one of 5 hours. In other words, with a refresh interval of about 1 minute, our use of bi(OS) refreshes 30x faster than the fastest interval measured by Monolith.