Having got preview access to Amazon Fraud Detector, and with not much documentation available yet on the API-based approach, I started exploring the capabilities of this service. Here is a quick view of my understanding.
If you are someone like me, who prefers checking out the complete code over reading the article, you can head straight to the GitHub link :)
Also note that the official AWS documentation has a detailed explanation of how to use this service through the console. This blog complements it with an API-based approach (which I believe will be added to the official AWS Git repository shortly :p)
Now let’s get started !!
Amazon Fraud Detector
Amazon Fraud Detector is a fully managed service that helps you detect suspicious online activities such as the creation of fake accounts and online payment fraud. The service was announced in preview at AWS re:Invent in December 2019. With Amazon Fraud Detector, you can create a fraud detection ML model with just a few clicks and use it to evaluate online activities in milliseconds.
Amazon Fraud Detector uses the uploaded historical data to automatically train, test, and deploy a customized fraud detection model. During this process, a series of models that have learned patterns of fraud from AWS and Amazon’s own fraud expertise are used to boost your model’s performance. To use the model, call the Amazon Fraud Detector GetPrediction API with metadata about an online event and synchronously receive a fraud prediction score and an outcome based on your configuration.
Working with Amazon Fraud Detector
The steps for creating a model, building a detector, and getting fraud predictions include:
1. Gather historical fraud data (training data)
2. Create a model version (trained model)
3. Review model performance and deploy the model
4. Create a detector that includes the deployed model version, decision rules, variables, and outcomes
5. Finally, send events to Amazon Fraud Detector and get a fraud prediction
Getting Started
PS: At the time of writing this article, the default boto3 version in the tensorflow_p36 kernel (in SageMaker) was older than 1.10.39, so to make sure the Fraud Detector methods are included we may need to update the conda packages.
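A quick way to tell whether the kernel needs that update is to compare version numbers. The sketch below is only illustrative: the helper names are mine, and it assumes (as the note above implies) that 1.10.39 is the first boto3 release that includes the Fraud Detector client.

```python
def version_tuple(version):
    """Turn a dotted version string such as '1.10.39' into (1, 10, 39)."""
    return tuple(int(part) for part in version.split(".")[:3])


def has_fraud_detector_support(installed, minimum="1.10.39"):
    # Assumption: 1.10.39 is the first boto3 release that ships the
    # 'frauddetector' client, as the note above implies.
    return version_tuple(installed) >= version_tuple(minimum)


print(has_fraud_detector_support("1.10.38"))  # False
print(has_fraud_detector_support("1.12.0"))   # True
```

In a notebook you would pass `boto3.__version__` as the installed version, and upgrade boto3 if the check comes back False.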
As the first step, let's create a client for Fraud Detector to perform all the above-mentioned operations. Since the preview is available in the US East 1 region only, ensure the region is set to ‘us-east-1’.
import boto3
client = boto3.client('frauddetector', region_name='us-east-1')
For demo purposes, I have reused the sample data provided at the following link: https://docs.aws.amazon.com/frauddetector/latest/ug/samples/training_data.zip
Step 1 : Define Model Details
The put_model method creates a new model or updates an existing one. The following parameters can be configured:
1. Model template to use (currently 'Online Fraud Insights' is the only enabled option)
2. Data location and IAM role to access the data
3. Model variables in the sample data
4. Fraud label schema mapping
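To make the four parameters above a bit more concrete, here is a rough sketch that assembles a put_model request as a plain dictionary. Please treat it as an assumption-heavy illustration: the key names reflect my understanding of the preview-era request shape and should be checked against the boto3 version you are running, and the bucket, role ARN, variable list, and label column are hypothetical placeholders.

```python
def build_put_model_request(model_id, data_location, role_arn,
                            variable_names, label_key,
                            fraud_values, legit_values):
    """Assemble keyword arguments for client.put_model (assumed request shape)."""
    return {
        "modelId": model_id,
        # 1. Model template
        "modelType": "ONLINE_FRAUD_INSIGHTS",
        "description": "Demo model",
        # 2. Data location and IAM role to access the data
        "trainingDataSource": {
            "dataLocation": data_location,
            "dataAccessRoleArn": role_arn,
        },
        # 3. Model variables; the index is assumed to be the column position,
        #    so variable_names must be given in CSV column order.
        "modelVariables": [
            {"name": name, "index": index}
            for index, name in enumerate(variable_names)
        ],
        # 4. Fraud label schema mapping
        "labelSchema": {
            "labelKey": label_key,
            "labelMapper": {"FRAUD": fraud_values, "LEGIT": legit_values},
        },
    }


request = build_put_model_request(
    model_id="aws_frauddetector_demo",
    data_location="s3://my-example-bucket/training_data.csv",            # hypothetical
    role_arn="arn:aws:iam::123456789012:role/ExampleFraudDetectorRole",  # hypothetical
    variable_names=["order_amt", "ip_address", "email_address"],         # hypothetical subset
    label_key="fraud_label",                                             # hypothetical column
    fraud_values=["1"],
    legit_values=["0"],
)
print(request["modelVariables"])
# With a real client this would be sent as: client.put_model(**request)
```

Building the request separately from the call keeps it easy to print and inspect before anything is sent to AWS.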
Step 1a : Create & Activate Model Version
Once the model is created, let's create a new model version. Creating the model version trains the model, and it is ready to be consumed once the status changes to TRAINING_COMPLETE.
response = client.create_model_version(
    modelId=_modelId,
    modelType=_modelType,
    description='creating a new version of the given model'
)
The get_model_version method gives us the status of the model. Once training is complete, let's activate the model with the update_model_version method:
response = client.update_model_version(
    modelId=_modelId,
    modelType=_modelType,
    modelVersionNumber='1.0',
    description='Activate the model',
    status='ACTIVE'
)
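Training takes a while, so the activation call above should only run once get_model_version reports TRAINING_COMPLETE. A minimal polling sketch could look like the following; the function name, the polling interval, and the assumption that the response carries a 'status' key are mine.

```python
import time


def wait_for_training(client, model_id, model_type, version,
                      poll_seconds=60, max_polls=120, sleep=time.sleep):
    """Poll get_model_version until the status is TRAINING_COMPLETE."""
    for _ in range(max_polls):
        response = client.get_model_version(
            modelId=model_id,
            modelType=model_type,
            modelVersionNumber=version,
        )
        status = response.get("status")  # assumed response key
        if status == "TRAINING_COMPLETE":
            return status
        sleep(poll_seconds)
    raise TimeoutError("model version did not finish training in time")

# Usage with the client created earlier:
# wait_for_training(client, _modelId, _modelType, '1.0')
```

The sleep function is passed in as a parameter so the loop can be exercised with a stub client without actually waiting.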
Review the Trained Model’s Performance
Now that the model is trained and activated, let’s quickly check the performance metrics.
Amazon Fraud Detector validates model performance using 15% of your data that was not used to train the model. You can expect your trained Amazon Fraud Detector model to have real-world fraud detection performance that is similar to the validation performance metrics.
As a business, you must balance between detecting more fraud, and adding more friction to legitimate customers. To assist in choosing the right balance, Amazon Fraud Detector provides the following model performance metrics:
True positive rate (TPR) – Percentage of total fraud the model detects. Also known as capture rate.
False positive rate (FPR) – Percentage of total legitimate events that are incorrectly predicted as fraud.
Precision – Percentage of fraud events correctly predicted as fraudulent as compared to all events predicted as fraudulent.
Area under the curve (AUC) – Summarizes TPR and FPR across all possible model score thresholds. A model with no predictive power has an AUC of 0.5, whereas a perfect model has a score of 1.0.
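To see how the first three metrics relate to each other, here is a small worked example that computes them from confusion-matrix counts at a single score threshold. The counts are made up purely for illustration and do not come from the demo model.

```python
def fraud_metrics(tp, fp, tn, fn):
    """Compute TPR, FPR and precision from confusion-matrix counts.

    tp: fraud events predicted as fraud
    fp: legitimate events predicted as fraud
    tn: legitimate events predicted as legitimate
    fn: fraud events predicted as legitimate
    """
    return {
        "tpr": tp / (tp + fn),        # share of all fraud that was caught
        "fpr": fp / (fp + tn),        # share of legitimate events flagged
        "precision": tp / (tp + fp),  # share of flagged events that were fraud
    }


# Hypothetical counts for 1,000 validation events (100 fraud, 900 legitimate)
metrics = fraud_metrics(tp=80, fp=30, tn=870, fn=20)
print(metrics)
```

With these counts the model catches 80 of the 100 fraud events, flags 30 of the 900 legitimate ones, and 80 of its 110 fraud predictions are correct. Lowering the score threshold would typically raise TPR and FPR together, which is exactly the trade-off described above.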
The console gives a nice view of the model's performance !!
Step 2 : Create a Detector
We use a detector to house our fraud prediction configurations:
response = client.put_detector(
    detectorId=_detectorId,
    description='AWS fraud detector - Creating new detector'
)
print(response)
Let us also create the outcomes and rules that we want to attach to this detector. Three outcomes and three rules have been created as an example; the outcomes are created as follows:
client.put_outcome(name='verify_fraudulent_customer', description='Need a verification for this transaction.')
client.put_outcome(name='approve_customer', description='Low risk customer, approve the transaction')
client.put_outcome(name='review_customer', description='Review this transaction')
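The rules themselves are not shown in this post, so here is a hedged sketch of how three score-based rules might be defined with create_rule. The rule IDs match the ones used in the detector version further down, but the thresholds, the mapping of rules to outcomes, and the detector ID are my assumptions for illustration only.

```python
def build_rule_requests(detector_id, score_variable, high=800, medium=500):
    """Return create_rule keyword arguments for three score-based rules.

    The thresholds and the rule-to-outcome mapping are hypothetical.
    """
    score = "$" + score_variable  # model scores are referenced as $<name>
    rules = [
        ("high_risk", f"{score} > {high}", "verify_fraudulent_customer"),
        ("medium_risk", f"{score} <= {high} and {score} > {medium}", "review_customer"),
        ("low_risk", f"{score} <= {medium}", "approve_customer"),
    ]
    return [
        {
            "ruleId": rule_id,
            "detectorId": detector_id,
            "description": f"{rule_id} rule for {detector_id}",
            "expression": expression,
            "language": "DETECTORPL",
            "outcomes": [outcome],
        }
        for rule_id, expression, outcome in rules
    ]


rule_requests = build_rule_requests(
    detector_id="demo_detector",  # hypothetical detector ID
    score_variable="aws_frauddetector_demo_insightscore",
)
for kwargs in rule_requests:
    print(kwargs["ruleId"], "->", kwargs["expression"])
    # With a real client: client.create_rule(**kwargs)
```

Each rule refers to the model score by the name that shows up in the prediction output, prefixed with a dollar sign.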
Now it’s time to create the new detector version and attach the rules. Once the new detector version is created, we will also activate it using the update_detector_version_status method.
import json

response = client.create_detector_version(
    detectorId=_detectorId,
    description='Creating new detector version for ' + _detectorId,
    rules=[
        {
            "detectorId": _detectorId,
            "ruleId": "high_risk",
            "ruleVersion": "1.0"
        },
        {
            "detectorId": _detectorId,
            "ruleId": "medium_risk",
            "ruleVersion": "1.0"
        },
        {
            "detectorId": _detectorId,
            "ruleId": "low_risk",
            "ruleVersion": "1.0"
        }
    ],
    modelVersions=[
        {
            'modelId': _modelId,
            'modelType': _modelType,
            'modelVersionNumber': '1.0'
        },
    ]
)
print(json.dumps(response, indent=2))
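The activation step mentioned above is not shown, so here is a minimal sketch of it. The wrapper function is my own, and I am assuming the create_detector_version response exposes the new version under a 'detectorVersionId' key.

```python
def activate_detector_version(client, detector_id, detector_version_id):
    """Mark a detector version as ACTIVE so it can serve predictions."""
    return client.update_detector_version_status(
        detectorId=detector_id,
        detectorVersionId=detector_version_id,
        status="ACTIVE",
    )

# Usage with the response from create_detector_version
# (the 'detectorVersionId' key is an assumption):
# activate_detector_version(client, _detectorId, response["detectorVersionId"])
```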
Finally, we have the model and the detector active! Time to predict.
To start with, let’s just try a record from the training data to see the outcome!
response = client.get_prediction(
    detectorId=_detectorId,
    detectorVersionId='2.0',
    eventId="b453a710-fcec-4c7d-9ce4-2ef2daccd615",
    eventAttributes={
        "billing_name": "null",
        "billing_address_1": "null",
        "billing_city": "null",
        "shipping_address_1": "null",
        "shipping_city": "null",
        "shipping_state": "null",
        "shipping_postal": "null",
        "order_amt": "585.05",
        "ip_address": "192.3.33.20",
        "email_address": "null",
        "user_agent": "null",
        "avs_code": "K",
        "phone_number": "null",
        "billing_state": "null",
        "billing_postal": "null",
        "event_timestamp": "12/20/2018 20:03",
        "transaction_id": "712-36182",
        "payment_instrument": "null"
    }
)
print(json.dumps(response, indent=2))
And we get the following output:
{
  "outcomes": [
    "verify_fraudulent_customer"
  ],
  "modelScores": [
    {
      "modelVersion": {
        "modelId": "aws_frauddetector_demo",
        "modelType": "ONLINE_FRAUD_INSIGHTS",
        "modelVersionNumber": "1.0"
      },
      "scores": {
        "aws_frauddetector_demo_insightscore": 892.0
      }
    }
  ],
  "ResponseMetadata": {
    "RequestId": "167af50c-c286-449b-ba2a-0783c8a8e0f5",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 07 Jan 2020 04:47:48 GMT",
      "x-amzn-requestid": "167af50c-c286-449b-ba2a-0783c8a8e0f5",
      "content-length": "231",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}
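In practice we usually care about just two things in this response: the outcome and the score. A small helper like the one below (my own naming, written against the response shape printed above) pulls them out.

```python
def summarize_prediction(response):
    """Return (outcomes, scores) from a get_prediction response dictionary."""
    outcomes = response.get("outcomes", [])
    scores = {}
    # Each entry in modelScores carries the scores of one model version
    for entry in response.get("modelScores", []):
        scores.update(entry.get("scores", {}))
    return outcomes, scores


# Trimmed copy of the response shown above
sample = {
    "outcomes": ["verify_fraudulent_customer"],
    "modelScores": [
        {
            "modelVersion": {
                "modelId": "aws_frauddetector_demo",
                "modelType": "ONLINE_FRAUD_INSIGHTS",
                "modelVersionNumber": "1.0",
            },
            "scores": {"aws_frauddetector_demo_insightscore": 892.0},
        }
    ],
}
outcomes, scores = summarize_prediction(sample)
print(outcomes, scores)
```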
Now let us pretend that the following record arrives as live data, and call get_prediction to see the potential outcome.
response = client.get_prediction(
    detectorId=_detectorId,
    detectorVersionId='2.0',
    eventId="b453a710-fcec-4c7d-9ce4-2ef2daccd619",
    eventAttributes={
        "billing_name": "null",
        "billing_address_1": "null",
        "billing_city": "null",
        "shipping_address_1": "null",
        "shipping_city": "null",
        "shipping_state": "null",
        "shipping_postal": "null",
        "order_amt": "100",
        "ip_address": "192.168.1.1",
        "email_address": "null",
        "user_agent": "null",
        "avs_code": "K",
        "phone_number": "null",
        "billing_state": "null",
        "billing_postal": "null",
        "event_timestamp": "null",
        "transaction_id": "2342342",
        "payment_instrument": "null"
    }
)
print(json.dumps(response, indent=2))
And the outcome is:
{
  "outcomes": [
    "verify_fraudulent_customer"
  ],
  "modelScores": [
    {
      "modelVersion": {
        "modelId": "aws_frauddetector_demo",
        "modelType": "ONLINE_FRAUD_INSIGHTS",
        "modelVersionNumber": "1.0"
      },
      "scores": {
        "aws_frauddetector_demo_insightscore": 999.0
      }
    }
  ],
  "ResponseMetadata": {
    "RequestId": "988b2e86-51e5-441c-84b0-08dcda336dbb",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Tue, 07 Jan 2020 04:49:51 GMT",
      "x-amzn-requestid": "988b2e86-51e5-441c-84b0-08dcda336dbb",
      "content-length": "231",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}
Amazon Fraud Detector returns the fraud prediction outcome corresponding to the first rule (that is, the highest priority) that matched (that is, that evaluated to true). If no rules matched, Amazon Fraud Detector returns NO MATCH. Amazon Fraud Detector also returns the model score for any models added to your detector.
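The first-match behaviour is easy to picture with a tiny local simulation. This is not how the service is implemented, just a sketch of the priority logic described above; the two rules and their thresholds are hypothetical.

```python
def first_matching_outcome(rules, score):
    """Return (rule_id, outcome) of the first rule whose predicate is true.

    rules is a list of (rule_id, predicate, outcome) in priority order.
    """
    for rule_id, predicate, outcome in rules:
        if predicate(score):
            return rule_id, outcome
    return None, "NO MATCH"


# Hypothetical rules, highest priority first
rules = [
    ("high_risk", lambda s: s > 800, "verify_fraudulent_customer"),
    ("medium_risk", lambda s: s > 500, "review_customer"),
]

print(first_matching_outcome(rules, 892.0))
print(first_matching_outcome(rules, 100.0))
```

A score of 892 satisfies both rules here, but only the higher-priority high_risk outcome comes back. A score of 100 satisfies neither rule, so the result is NO MATCH.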
At this point, your model and associated detector logic are ready to evaluate online activities for fraud in real-time using the Amazon Fraud Detector GetPrediction API.
There we go! Now that I am done with the ‘Hello World’ of Amazon Fraud Detector, it is time to play around with different datasets to see what more this service can offer.