Data Science Toolkit - Logistic regression custom model service

02/26/2024

The Logistic Regression Custom Model Service allows you to create logistic regression models (sometimes called "logit models") to predict the likelihood of clicks or conversions based on a combination of multiple signals. The logistic regression models can then be associated with a line item using the Line Item Service - ALI (Archived).

Note

This offering is currently in Closed Beta and available to a limited set of clients. For more information about the Advanced Ads Toolset and potential use cases that may apply to your business, reach out to your account representative.

The formula for logistic regression is:

Screenshot of the formula for logistic regression.

For online advertising, the event is a click, a pixel fire, or another online action. The probability is conditional on both the predictors x1 through xn and an implicit set of variables that represent the features in a bid request. The beta coefficients are the weights that the model assigns to the different predictors.
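For reference, the textbook form of the logistic regression relation can be written as follows. This is the standard statement of the model and may use different notation from the screenshot; here p is shorthand (an assumed symbol) for the probability of the event given the predictors x1 through xn and the bid request features.

```latex
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n
\quad\Longleftrightarrow\quad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n)}}
```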

We convert this probability of an event happening to an expected value by multiplying the probability by the event's value (the eCPC goal for a click prediction), adding an additive offset to the estimate, and then applying min/max expected value limits to reduce the impact of mispredictions.

The formula for deriving an expected value for an impression from the probability of an event happening is:

Screenshot of the formula for deriving an expected value for an impression from the probability of an event happening.

The offset is usually 0. However, a negative value may be useful as a safety factor to ensure performance at the expense of delivery on low-performing inventory: it ensures that the advertiser does not bid at all, instead of bidding very little and potentially incurring fixed fees.
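The two steps described above (probability, then expected value) can be sketched in Python. This is a minimal illustration, not the bidder's implementation: the function names are hypothetical, and the order of operations (scale, then offset, then clamping to the min/max limits) is an assumption taken from the prose description.

```python
import math
from typing import Sequence


def event_probability(beta0: float, betas: Sequence[float], xs: Sequence[float]) -> float:
    """Standard logistic function applied to beta0 + sum(beta_i * x_i)."""
    z = beta0 + sum(b * x for b, x in zip(betas, xs))
    # Branch on the sign of z so math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def expected_value(p: float, scale: float, offset: float,
                   min_ev: float, max_ev: float) -> float:
    """Assumed formula: clamp(scale * p + offset) to [min_ev, max_ev]."""
    return max(min_ev, min(max_ev, scale * p + offset))


p = event_probability(0.0, [1.0], [0.0])         # z = 0, so p = 0.5
print(expected_value(p, 2.0, 0.0, 0.0, 10.0))    # no offset
print(expected_value(p, 2.0, -5.0, 0.0, 10.0))   # negative offset, clamped to min
```

With a negative offset, low-probability impressions fall to the minimum limit, which matches the "do not bid rather than bid very little" behavior when the minimum is 0.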

For more information about how logistic regression custom models work, see Logistic Regression Models.

REST API

Add a new logistic regression model

POST https://api.appnexus.com/custom-model-logit
(template JSON)

Modify a logistic regression model

PUT https://api.appnexus.com/custom-model-logit?id=CUSTOM_MODEL_LOGIT_ID
(template JSON)

Delete a logistic regression model

DELETE https://api.appnexus.com/custom-model-logit?id=CUSTOM_MODEL_LOGIT_ID

View all logistic regression models

GET https://api.appnexus.com/custom-model-logit

View a specific logistic regression model

GET https://api.appnexus.com/custom-model-logit?id=CUSTOM_MODEL_LOGIT_ID
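The calls above can be prepared with Python's standard library, as in the sketch below. It only builds the requests and does not send them. The helper name `build_request`, the placeholder token, the use of an `Authorization` header, and the payload shape and values are all assumptions for illustration; the actual template JSON and authentication flow are defined elsewhere in the API documentation.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.appnexus.com/custom-model-logit"


def build_request(method: str, model_id: Optional[int] = None,
                  payload: Optional[dict] = None,
                  token: str = "YOUR_AUTH_TOKEN") -> urllib.request.Request:
    """Build (but do not send) a request for the custom-model-logit service."""
    # PUT, DELETE, and single-model GET carry the model ID in the query string.
    url = BASE_URL if model_id is None else f"{BASE_URL}?id={model_id}"
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Authorization": token}  # assumed auth scheme
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Hypothetical payload using the general fields; values are placeholders.
model = {"name": "example model", "beta0": -3.2, "scale": 1.5,
         "offset": 0, "min": 0, "max": 10, "predictors": []}

create = build_request("POST", payload=model)
update = build_request("PUT", model_id=123, payload={"name": "renamed model"})
delete = build_request("DELETE", model_id=123)
print(update.get_method(), update.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would send the request.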

JSON fields

General

Field	Type	Description
`id`	int	The ID of the logistic regression custom model. Default: auto-generated number. Required on: `PUT` and `DELETE`, in the query string.
`name`	string	The name of the model.
`beta0`	float	The β0 coefficient (intercept) in the logistic regression equation.
`max`	float	Max in the expected value equation.
`min`	float	Min in the expected value equation.
`offset`	float	Offset in the expected value equation.
`scale`	float	Scale in the expected value equation.
`predictors`	array	Array of predictors. See Predictors below for more details.

Predictors

Field	Type	Description
`predictor_type`		`scalar_descriptor`, `custom_model_descriptor`, `freq_rec_descriptor`, `segment_descriptor`, `categorical_descriptor`
`keys`
`hash_seed`
`default_value`
`hash_table_size_log`
`coefficients`

Hashed descriptor

This endpoint is used to submit a pre-hashed table. The bucket_index0 and bucket_index1 fields, each 64 bits long, support hashing algorithms that produce long values as keys. Currently, only one hashing algorithm is supported: MurmurHash3_x64_128. It produces two 64-bit integers, but only the lower 64 bits of the hash are used.

Values in bucket_index0 must always be smaller than 2^hash_table_size_log, or they will be rejected.

Currently, the values in bucket_index1 are ignored as this is to be used for future expansion. If a value is sent for bucket_index1, it must be 0. The parameter is optional.
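The two rules above can be checked client-side before a table is submitted. The following is a minimal sketch; the function name `validate_bucket` is hypothetical, and the service's own validation may differ in detail.

```python
from typing import Optional


def validate_bucket(bucket_index0: int, hash_table_size_log: int,
                    bucket_index1: Optional[int] = None) -> None:
    """Raise ValueError if a bucket entry would break the documented rules."""
    table_size = 1 << hash_table_size_log  # 2 ^ hash_table_size_log
    if not 0 <= bucket_index0 < table_size:
        raise ValueError(
            f"bucket_index0={bucket_index0} must be smaller than {table_size}")
    # bucket_index1 is optional; if it is sent, it must be 0.
    if bucket_index1 not in (None, 0):
        raise ValueError("bucket_index1 must be 0 or omitted")


validate_bucket(5, 3)        # accepted: 5 < 2^3
validate_bucket(5, 3, 0)     # accepted: explicit 0 is allowed
```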

Hash Table keys

For each of your Hash Table keys, you will need a uint32 value. Each value should be the ID of the respective object that you are referencing from our system: domain_id, for instance, rather than the domain string value. These uint32 keys are then transformed into a little-endian byte array and hashed.

Python example

import mmh3
hash_bucket = mmh3.hash64(key_bytes, seed)[0] % table_size  # [0] is the lower 64 bits
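The key-to-bucket steps described above can be sketched end to end as follows. The helper names are hypothetical, and concatenating the keys in order into a single byte array is an assumption about how multi-key entries are packed. The hash function is passed in as a parameter so the sketch has no third-party dependency; with the `mmh3` package installed, `mmh3.hash64` can be supplied directly.

```python
import struct
from typing import Callable, Sequence, Tuple

Hash64 = Callable[[bytes, int], Tuple[int, int]]


def pack_keys(keys: Sequence[int]) -> bytes:
    """Pack uint32 keys into one little-endian byte array (assumed layout)."""
    # "<" selects little-endian; "I" is an unsigned 32-bit integer.
    return struct.pack(f"<{len(keys)}I", *keys)


def bucket_index(keys: Sequence[int], seed: int,
                 hash_table_size_log: int, hash64: Hash64) -> int:
    """Hash the packed keys and reduce the lower 64 bits to a bucket index."""
    table_size = 1 << hash_table_size_log
    low64 = hash64(pack_keys(keys), seed)[0]
    # Python's % always returns a non-negative result, even for signed hashes.
    return low64 % table_size


print(pack_keys([1, 2]).hex())
# With mmh3 installed: bucket_index([1, 2], 42, 20, mmh3.hash64)
```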

Field	Type	Description
`type`
`keys`		Array of one to five descriptors in this list:
`hashed_seeds`		Seeds passed to the `MurmurHash3_x64_128` function. Only the first seed is currently used; the array exists for planned future hash functions that need more than one seed.
`hash_id`		Existing hash table ID.
`default_value`		Value returned by the descriptor if no match is found in your hash table.
`hash_table_size_log`		Base-2 log of the maximum value for a key of your table. Values of `2^hash_table_size_log` or larger will be rejected. The maximum for `hash_table_size_log` is 64 (no bucketing).

Hashed descriptor example

{
    "type": "hashed",
    "keys": [ 
    ],
    "hash_seeds": [42, 42, 42, 42, 42, 42], 
    "hash_id": ,
    "default_value": 0,                     
    "hash_table_size_log": 20                
}

Lookup Table (LUT) descriptor

Field	Type	Description
`type`
`feature_keyword`
`default_value`
`initial_range_log`
`bucket_count_log_per_range`

LUT descriptor example

{
    "type": "lookup",
    "default_value": 0.1,
    "features": [
        {
            "type": "categorical_descriptor",
            "feature_keyword": "advertiser_id"
        },
        {
            "type": "scalar_descriptor",
            "feature_keyword": "user_age",
            "default_value": 0
        }
    ],
    "coefficients": [
        {"weight": 1.1, "key": [1, 1]},
        {"weight": 1.3, "key": [2, 2]},
        {"weight": 1.2, "key": [3, 3]}
    ]
}

Categorical descriptor

Because a categorical descriptor looks for an exact match, the default_value, initial_range_log, and bucket_count_log_per_range parameters are not needed.

Missing values

The key value of -1 can be used as a placeholder for a missing feature, for instance when a domain is not reported by the seller. Otherwise, the default value of the LUT or Hash Model will be used, since no match will be found for that feature value.
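The fallback behavior described above can be illustrated with a small lookup sketch. It models a table as a Python dictionary keyed by tuples of feature values. The names `lookup_weight` and `MISSING` are hypothetical, and the sketch only illustrates the matching semantics; it is not the bidder's implementation.

```python
from typing import Dict, Optional, Sequence, Tuple

MISSING = -1  # placeholder key for a feature absent from the bid request


def lookup_weight(table: Dict[Tuple[int, ...], float],
                  feature_values: Sequence[Optional[int]],
                  default_value: float) -> float:
    """Return the weight for an exact key match, else the default value."""
    # A missing feature (None) is looked up under the -1 placeholder.
    key = tuple(MISSING if v is None else v for v in feature_values)
    return table.get(key, default_value)


table = {(1, 1): 1.1, (2, 2): 1.3, (MISSING, 2): 0.7}
print(lookup_weight(table, [1, 1], 0.1))     # exact match
print(lookup_weight(table, [None, 2], 0.1))  # matches the -1 placeholder entry
print(lookup_weight(table, [None, 9], 0.1))  # no entry, so the default is used
```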

Field	Type	Description
`type`
`feature_keyword`		`country`, `region`, `city`, `dma`, `postal_code`, `user_day`, `user_hour`, `os_family`, `os_extended`, `browser`, `language`, `user_gender`, `domain`, `ip_address`, `position`, `placement`, `placement_group`, `publisher`, `seller_member_id`, `supply_type`, `device_type`, `device_model`, `carrier`, `mobile_app`, `mobile_app_instance`, `mobile_app_bundle`, `appnexus_intended_audience`, `seller_intended_audience`, `spend_protection`, `user_group_id`, `advertiser_id`, `brand_category`, `creative`, `inventory_url_id`, `media_type`
`default_value`
`object_type`		`Advertiser`, `line_item`, `campaign` (not split?)
`object_id`		ID of the referenced advertiser,

Categorical descriptor example

{
    "type": "categorical_descriptor",
    "feature_keyword": "city"
}

Frequency and recency descriptor

Field	Type	Description
`type`
`feature_keyword`		`frequency_life`, `frequency_daily`, `recency`
`object_type`		`Advertiser`, `line_item`, `campaign` (not split?)
`object_id`		ID of the referenced advertiser,
`default_value`		Value returned by the descriptor if no match is found
`initial_range_log`		Used for log bucketing, initial range
`bucket_count_log_per_range`		Used for log bucketing, # of buckets per range

Frequency and recency descriptor example

{
    "type": "frequency_recency_descriptor",
    "feature_keyword": "frequency_life",
    "object_type": "advertiser",
    "object_id": 1,
    "default_value": 0,
    "initial_range_log": 4,
    "bucket_count_log_per_range": 2
}

Scalar descriptor

Field	Type	Description
`type`
`feature_keyword`		`appnexus_audited`, `cookie_age`, `estimated_average_price`, `estimated_clearing_price`, `predicted_iab_view_rate`, `predicted_video_completion_rate`, `self_audited`, `size`, `creative_size`, `spend_protection`, `uniform`, `user_age`. Note: The size descriptor is represented as a string in your models ("300x250", for instance) but is converted to a scalar in our bidder. Any size is technically valid in our system, which is why this feature is treated as a scalar rather than a categorical feature.
`default_value`		Value returned by the descriptor if no match is found
`initial_range_log`		Used for log bucketing, initial range
`bucket_count_log_per_range`		Used for log bucketing, # of buckets per range

Scalar descriptor example

{
    "type": "scalar_descriptor",
    "feature_keyword": "cookie_age",
    "default_value": 0,              
    "initial_range_log": 4,           
    "bucket_count_log_per_range": 2   
}

Segment descriptor

Field	Type	Description
`type`
`feature_keyword`		`segment_value`, `segment_age`, `segment_presence`
`segment_id`		ID of referenced segment
`default_value`		Value returned by the descriptor if no match is found
`initial_range_log`		Used for log bucketing, initial range
`bucket_count_log_per_range`		Used for log bucketing, # of buckets per range

Segment descriptor example

{
    "type": "segment_descriptor",
    "feature_keyword": "segment_age",
    "segment_id": 2,                  
    "default_value": 0,              
    "initial_range_log": 4,           
    "bucket_count_log_per_range": 2   
}
