Rotates Amazon Personalize filters on a schedule based on dynamic templates

Overview

Amazon Personalize Filter Rotation

This project contains the source code and supporting files for deploying a serverless application that provides automatic filter rotation capabilities for Amazon Personalize, an AI service from AWS that allows you to create custom ML recommenders based on your data. Highlights include:

  • Creates filters based on a dynamic filter naming template you provide
  • Builds filter expressions based on a dynamic filter expression template you provide
  • Deletes filters based on a dynamic matching expression you provide (optional)
  • Publishes events to Amazon EventBridge when filters are created or deleted (optional)

Why is this important?

Amazon Personalize filters are a great way to have your business rules applied to recommendations before they are returned to your application. They can be used to include or exclude items from being recommended for a user based on a SQL-like syntax that considers the user's interaction history, item metadata, and user metadata. For example, only recommend movies that the user has watched or favorited in the past to populate a "Watch again" widget.

INCLUDE ItemID WHERE Interactions.event_type IN ('watched','favorited')

Or exclude products from being recommended that are currently out of stock.

EXCLUDE ItemID WHERE Items.out_of_stock IN ('yes')

You can even use dynamic filters where the filter expression values are specified at runtime. For example, only recommend movies for a specific genre.

INCLUDE ItemID WHERE Items.genre IN ($GENRES)

To use the filter above, you would pass the appropriate value(s) for the $GENRES variable when retrieving recommendations.
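As a rough illustration of passing values at runtime, the sketch below builds the value string for the $GENRES placeholder. It assumes the quoted, comma-separated format that Personalize expects for string values in an IN clause; the helper name format_filter_values is hypothetical and not part of this application.

```python
import json


def format_filter_values(values):
    """Join values into a quoted, comma-separated string, e.g. "Comedy","Drama"."""
    return ",".join(json.dumps(str(v)) for v in values)


# Value map for the $GENRES placeholder in the dynamic filter expression.
filter_values = {"GENRES": format_filter_values(["Comedy", "Drama"])}
print(filter_values)

# With boto3 this map would typically be passed as the filterValues argument
# of get_recommendations on the personalize-runtime client, together with
# the filter ARN (not executed here because it needs AWS credentials).
```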

You can learn more about filters on the AWS Personalize blog.

Filters are great! However, they do have some limitations. One of those limitations is that you cannot specify a dynamic value for a range query (i.e., <, <=, >, >=). For example, the following filter, which would limit recommendations to new items created since a rolling point in the past, is not supported.

THIS WON'T WORK!

INCLUDE ItemID WHERE Items.creation_timestamp > $NEW_ITEM_THRESHOLD

The solution to this limitation is to use a filter expression with a hard-coded value for range queries.

THIS WORKS!

INCLUDE ItemID WHERE Items.creation_timestamp > 1633240824

However, this is not very flexible or maintainable since time marches on but your static filter expression does not. The workaround is to update your filter expression periodically to maintain a rolling window of time. Unfortunately, filters cannot be updated, so a new filter has to be created, your application has to transition to using the new filter, and then the previous filter can be safely deleted.

The purpose of this serverless application is to make this process easier to maintain by automating the creation and deletion of filters and allowing you to provide a dynamic expression that is resolved to the appropriate hard-coded value when the new filter is created.

Here's how it works

An AWS Lambda function is deployed by this application that is called on a recurring basis. You control the schedule, which can be a cron expression or a rate expression. The function will only create a new filter if a filter does not already exist that matches the current filter name template, and it will only delete existing filters that match the delete template. Therefore, it is fine to have the function run more often than necessary (e.g., if you don't have a predictable and consistent time when filters should be rotated).
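The create-if-missing and delete-if-matching behavior described above can be sketched as a pure function. This is a simplified illustration, not the application's actual code: plan_rotation and its arguments are hypothetical names, and skipping the current filter during deletion is an assumption of this sketch.

```python
def plan_rotation(existing_names, current_name, should_delete):
    """Decide what one rotation run would do.

    Returns (create_current, names_to_delete).
    """
    existing = list(existing_names)
    # Only create the current filter when no filter with that name exists.
    create_current = current_name not in existing
    # Only delete filters that the delete rule matches. Never deleting the
    # current filter is a safety assumption made by this sketch.
    names_to_delete = [
        name for name in existing
        if name != current_name and should_delete(name)
    ]
    return create_current, names_to_delete


prefix = "filter-include-recent-items-"
existing = [prefix + "20211030", prefix + "20211101"]


def older_than_cutoff(name):
    return name.startswith(prefix) and int(name[-8:]) < 20211101


print(plan_rotation(existing, prefix + "20211102", older_than_cutoff))
```

Because the plan depends only on which filters already exist, running it again after the new filter has been created yields no further creation, which is why frequent schedules are harmless.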

The keys to the filter rotation function are the templates used to verify that the current filter exists and to determine whether existing filters are eligible for deletion. Let's look at some examples. You provide these template values as CloudFormation parameters when you deploy this application.

Current filter name template

Let's say you want to use a filter that only recommends recently created items. The CREATION_TIMESTAMP column in the items dataset is a convenient field to use for this. This column name is reserved and is used to support the cold item exploration feature of the aws-user-personalization recipe. Values must be expressed in Unix timestamp format as longs (i.e., the number of seconds since the Epoch). The following expression limits recommendations to items that were created in the last month (1633240824 is the Unix timestamp from 1 month ago as of this writing).

INCLUDE ItemID WHERE Items.creation_timestamp > 1633240824

Alternatively, you can use a custom metadata column for the filter that uses a more coarse and/or human readable format but is still comparable for range queries, like YYYYMMDD.

INCLUDE ItemID WHERE Items.published_date > 20211001

As noted earlier, filters cannot be updated. Therefore, you can't just change the expression of an existing filter. Instead, you have to create a new filter with a new expression, switch your application to use the new filter, and then delete the old filter. This requires a predictable naming standard for filters so applications can automatically switch to the new filter without a coding change. Continuing with the creation timestamp theme, the filter name could be something like:

filter-include-recent-items-20211101

Assuming we want to rotate this filter each day, the next day's filter name would be filter-include-recent-items-20211102, then filter-include-recent-items-20211103, and so on as time passes. Since there are limits on how many active filters you can have at any time, you cannot precreate filters. Instead, this application will dynamically create new filters. What is needed is a template that defines the filter name and can be resolved when the rotation script runs.

filter-include-recent-items-{{datetime_format(now,'%Y%m%d')}}

The above filter name template will resolve and replace the expression within the {{ and }} characters (handlebars or mustaches). In this case, we are taking the current time expressed as now and formatting it using the %Y%m%d date format expression. The result (as of today) is 20211102. If the rotation function finds an existing filter with this name, a new filter does not need to be created. Otherwise, a new filter is created using filter-include-recent-items-20211102 as the name.
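The substitution step can be sketched in a few lines of Python. This is only an illustration of the {{ }} mechanics: the function resolve_template is hypothetical, and it evaluates expressions with Python's eval (builtins disabled) as a stand-in, which is not a safe sandbox. The application itself relies on the Simple Eval library described later.

```python
import re
from datetime import datetime

# Matches the expression between {{ and }}.
TEMPLATE_PATTERN = re.compile(r"\{\{(.*?)\}\}")


def resolve_template(template, names):
    """Replace each {{expression}} with the string form of its value."""
    def _replace(match):
        # Stand-in evaluator for illustration only (assumption of this sketch).
        value = eval(match.group(1).strip(), {"__builtins__": {}}, dict(names))
        return str(value)
    return TEMPLATE_PATTERN.sub(_replace, template)


# Names visible to template expressions; "now" is fixed for reproducibility.
names = {
    "now": datetime(2021, 11, 2, 9, 30),
    "datetime_format": lambda value, pattern: value.strftime(pattern),
}

template = "filter-include-recent-items-{{datetime_format(now,'%Y%m%d')}}"
print(resolve_template(template, names))
```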

The PersonalizeCurrentFilterNameTemplate CloudFormation template parameter is how you specify your own custom filter name template.

The functions and operators available for use in templates are described below.

Current filter expression template

When rotating and creating the new filter, we may also have to dynamically resolve the actual filter expression. The PersonalizeCurrentFilterExpressionTemplate CloudFormation template parameter can be used for this. Some examples:

INCLUDE ItemID WHERE Items.CREATION_TIMESTAMP > {{int(unixtime(now - timedelta_days(30)))}}
INCLUDE ItemID WHERE Items.published_date > {{datetime_format(now - timedelta_days(30),'%Y%m%d')}}

The above templates produce a hard-coded filter expression based on current time when they're resolved. The first produces a Unix timestamp (expressed in seconds as required by Personalize for CREATION_TIMESTAMP) that is 30 days ago. The second template produces an integer representing the date in YYYYMMDD format from 30 days ago.
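To make the resolved values concrete, here is the equivalent date arithmetic in plain Python. The function name recent_items_expressions is hypothetical, and the fixed UTC "now" is chosen only so the result is reproducible; the deployed function uses the current time in its configured timezone.

```python
from datetime import datetime, timedelta, timezone


def recent_items_expressions(now, days=30):
    """Build both example filter expressions for a rolling window of `days`."""
    threshold = now - timedelta(days=days)
    # Unix timestamp in whole seconds, as used with CREATION_TIMESTAMP.
    by_timestamp = (
        "INCLUDE ItemID WHERE Items.CREATION_TIMESTAMP > "
        f"{int(threshold.timestamp())}"
    )
    # Coarser YYYYMMDD value for a custom metadata column.
    by_date = (
        "INCLUDE ItemID WHERE Items.published_date > "
        f"{threshold.strftime('%Y%m%d')}"
    )
    return by_timestamp, by_date


now = datetime(2021, 11, 2, tzinfo=timezone.utc)
for expression in recent_items_expressions(now):
    print(expression)
```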

Delete filter match template

Finally, we need to clean up old filters after we have transitioned to a newer version of the filter. A filter name matching template can be used for this and can be written in such a way to delay the delete for some time after the new filter is created. This gives your application time to transition from the old filter to the new filter. The PersonalizeDeleteFilterMatchTemplate CloudFormation template parameter is where you specify the delete filter match template.

The following delete filter match template will match on filters with a filter name that starts with filter-include-recent-items- and has a suffix that is more than one day older than today. In other words, we have 1 day to transition to the new filter before the old filter is deleted.

starts_with(filter.name,'filter-include-recent-items-') and int(end(filter.name,8)) < int(datetime_format(now - timedelta_days(1),'%Y%m%d'))

Any filters that trigger this template to resolve to true will be deleted. All others will be left alone. Note that all fields available in the FilterSummary of the ListFilters API response are available to this template. For example, the template above matches on filter.name. Other filter summary fields such as filter.status, filter.creationDateTime, and filter.lastUpdatedDateTime can also be inspected in the template.
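The matching rule in the example template can be restated as an ordinary Python predicate, which may help when reasoning about which filters would be deleted on a given day. The function name matches_delete_template is hypothetical; it mirrors only the example above and takes the filter name directly rather than a full filter summary.

```python
from datetime import datetime, timedelta

PREFIX = "filter-include-recent-items-"


def matches_delete_template(filter_name, now):
    """True when the name has the prefix and a date suffix older than yesterday."""
    cutoff = int((now - timedelta(days=1)).strftime("%Y%m%d"))
    # The prefix check runs first, so unrelated names never reach int().
    return filter_name.startswith(PREFIX) and int(filter_name[-8:]) < cutoff


now = datetime(2021, 11, 2)
for name in (PREFIX + "20211031", PREFIX + "20211101", PREFIX + "20211102"):
    print(name, matches_delete_template(name, now))
```

Because the comparison is strict, yesterday's filter is kept on the day its replacement is created and only becomes eligible for deletion on the following day.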

Filter events

If you'd like to synchronize your application's configuration or be notified when a filter is created or deleted, you can optionally configure the rotator function to publish events to Amazon EventBridge. When events are enabled, there are two event detail types published by the rotator function: PersonalizeFilterCreated and PersonalizeFilterDeleted. Both have an event Source of personalize.filter.rotator and include details on the filter created or deleted. This allows you to set up EventBridge rules to process events as you please. For example, when a new filter is created, you can receive the PersonalizeFilterCreated event in a Lambda function to update your application's configuration to switch to using the new filter.
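A receiving Lambda function might dispatch on the detail type roughly as follows. This is a minimal sketch: the source, detail-type, and detail keys are the standard EventBridge envelope, but the contents of detail are treated as opaque here because the exact payload fields are not described in this document, and the returned action labels are hypothetical.

```python
ROTATOR_SOURCE = "personalize.filter.rotator"


def handler(event, context=None):
    """Route rotator events; returns an (action, detail) pair or None."""
    if event.get("source") != ROTATOR_SOURCE:
        return None  # Ignore events from other publishers.
    detail_type = event.get("detail-type")
    detail = event.get("detail", {})
    if detail_type == "PersonalizeFilterCreated":
        # For example, update application configuration to use the new filter.
        return ("switch_to_new_filter", detail)
    if detail_type == "PersonalizeFilterDeleted":
        # For example, drop any cached reference to the deleted filter.
        return ("forget_old_filter", detail)
    return None


sample_event = {
    "source": ROTATOR_SOURCE,
    "detail-type": "PersonalizeFilterCreated",
    "detail": {"example": "payload"},  # Placeholder, not the real schema.
}
print(handler(sample_event))
```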

Filter template syntax

The Simple Eval library is used as the foundation for the template syntax. It provides a safer and more sandboxed alternative to Python's eval function. Check the Simple Eval library documentation for details on the functions available and examples.

The following additional functions were added as part of this application to make writing templates easier for rotating filters.

  • unixtime(value): Returns the Unix timestamp value given a string, datetime, date, or time. If a string is provided, it will be parsed into a datetime first.
  • datetime_format(date, pattern): Formats a datetime, date, or time using the specified pattern.
  • timedelta_days(int): Returns a timedelta for a number of days. Can be used for date math.
  • timedelta_hours(int): Returns a timedelta for a number of hours. Can be used for date math.
  • timedelta_minutes(int): Returns a timedelta for a number of minutes. Can be used for date math.
  • timedelta_seconds(int): Returns a timedelta for a number of seconds. Can be used for date math.
  • starts_with(str, prefix): Returns True if the string value starts with prefix.
  • ends_with(str, suffix): Returns True if the string value ends with suffix.
  • start(str, num): Returns the first num characters of the string value.
  • end(str, num): Returns the last num characters of the string value.
  • now: Current datetime
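To give a feel for how a few of these helpers behave, here is one possible plain-Python rendering. It is a sketch based only on the descriptions above, not the application's implementation: string parsing via datetime.fromisoformat is an assumption (the real parser may accept more formats), and only string and datetime inputs are handled by unixtime here.

```python
from datetime import datetime, timedelta


def unixtime(value):
    """Seconds since the Epoch for an ISO 8601 string or a datetime."""
    if isinstance(value, str):
        value = datetime.fromisoformat(value)  # Assumed parsing strategy.
    return value.timestamp()


def timedelta_days(days):
    """A timedelta of the given number of days, usable in date math."""
    return timedelta(days=days)


def start(value, num):
    """First num characters of value."""
    return value[:num]


def end(value, num):
    """Last num characters of value (empty string when num is 0)."""
    return value[-num:] if num > 0 else ""


name = "filter-include-recent-items-20211102"
print(end(name, 8), start(name, 6))
print(int(unixtime("2021-10-03T00:00:00+00:00")))
```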

Installing the application

IMPORTANT NOTE: Deploying this application in your AWS account will create and consume AWS resources, which will cost money. The Lambda function is called according to the schedule you provide but typically should not need to be called more often than hourly. Personalize does not charge for filters but your account does have a limit on the number of filters that are active at any time. There are also limits on how many filters can be in a pending or in-progress status at any point in time. Therefore, if after installing this application you choose not to use it as part of your solution, be sure to follow the Uninstall instructions in the next section to avoid ongoing charges and to clean up all data.

This application uses the AWS Serverless Application Model (SAM) to build and deploy resources into your AWS account.

You need the SAM CLI installed locally. Because the build command below uses --use-container, you also need Docker.

To build and deploy the application for the first time, run the following in your shell:

sam build --use-container --cached
sam deploy --guided --capabilities CAPABILITY_IAM CAPABILITY_AUTO_EXPAND

If you receive an error from the first command about not being able to download the Docker image from public.ecr.aws, you may need to login. Run the following command and then retry the above two commands.

aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws

The first command will build the source of the application. The second command will package and deploy the application to your AWS account, with a series of prompts:

Prompt/Parameter | Description | Default
--- | --- | ---
Stack Name | The name of the stack to deploy to CloudFormation. This should be unique to your account and region. | personalize-filter-rotator
AWS Region | The AWS region you want to deploy this application to. | Your current region
Parameter PersonalizeDatasetGroupArn | Amazon Personalize dataset group ARN to rotate filters within. |
Parameter PersonalizeCurrentFilterNameTemplate | Template to use when checking and creating the current filter. |
Parameter PersonalizeCurrentFilterExpressionTemplate | Template to use when building the filter expression when creating the current filter. |
Parameter PersonalizeDeleteFilterMatchTemplate (optional) | Template to use to match existing filters that should be deleted. |
Parameter RotationSchedule | Cron or rate expression to control how often the rotation function is called. | rate(1 day)
Parameter Timezone | Set the timezone of the rotator function's Lambda environment to match your own. |
Parameter PublishFilterEvents | Whether to publish events to the default EventBridge bus when filters are created and deleted. | Yes
Confirm changes before deploy | If set to yes, any CloudFormation change sets will be shown to you before execution for manual review. If set to no, the AWS SAM CLI will automatically deploy application changes. |
Allow SAM CLI IAM role creation | Since this application creates IAM roles to allow the Lambda functions to access AWS services, this setting must be Yes. |
Save arguments to samconfig.toml | If set to yes, your choices will be saved to a configuration file inside the application, so that in the future you can just re-run sam deploy without parameters to deploy changes to your application. |

TIP: The SAM command-line tool provides the option to save your parameter values in a local file (samconfig.toml) so they're available as defaults the next time you deploy the app. However, SAM wraps your parameter values in double-quotes. Therefore, if your template parameter values contain embedded string values (like the date format expressions shown in the examples above), be sure to use single-quotes for those embedded values. Otherwise, your parameter values will not be properly preserved.

Uninstalling the application

To remove the resources created by this application in your AWS account, use the AWS CLI. Assuming you used the default application name for the stack name (personalize-filter-rotator), you can run the following:

aws cloudformation delete-stack --stack-name personalize-filter-rotator

Alternatively, you can delete the stack in CloudFormation in the AWS console.

Reporting issues

If you encounter a bug, please create a new issue with as much detail as possible and steps for reproducing the bug. Similarly, if you have an idea for an improvement, please add an issue as well. Pull requests are also welcome! See the Contributing Guidelines for more details.

License summary

This sample code is made available under a modified MIT license. See the LICENSE file.

Owner
James Jory
Applied AI Solutions Architect