Bidders - BSS Avro file format
BSS AVRO file format
This document covers how to prepare your audience files using the AVRO file format and onboard the data into the platform. AVRO is required to upload audiences containing extended ID’s and publisher-provided ID’s, and the legacy BSS file format does not support newer user ID types.
- Segments can be created through the segment service.
- Audience files can be uploaded to segments through the batch segment service.
Best practices
- Recommended file size: 100-300mb per file
- Recommended file compression: deflate
- Recommended delivery method: Passive Segment Upload (reach out to your Account Manager for access).
- Updating segments: Instead of sending the full audience memberships again, you can upload only the changes for existing segments. This will reduce the data size and the chance of reaching your daily upload limits.
Overview of steps
- Understanding the User-segments record
- Install the AVRO tools library
- Download the Xandr AVRO schema file
- Generate your AVRO audience file
User-segments record
A user record has two top level elements:
- User ID (uid)
- Array of segments
User ID types
Only one id type can be uploaded per uid record (e.g. Xandr User ID, IFA, Device ID, eid)
User ID Type | Description |
---|---|
AppNexus/Xandr User ID (ANID) |
Xandr ID, also known as user_id_64. |
Device ID |
Similar utility as ifa (Identifier for Advertising). It indicates the Mobile device type being onboarded. The device_id record consists of two fields: - domain (enum): Possible values are idfa, sha1udid, md5udid, openudid, aaid, windowsadid, rida, tifa, vida, and lgudid. - id (string) |
Identifier for Advertising (or IFA) |
Identifier for Advertising - indicates the device type being onboarded. The ifa record consists of two fields: - type (string): Type of ID. - id (string): IFA ID, representing the IFA in UUID format. For supported ifa types, see device extension object. |
External ID |
External ID - indicates Member defined identifier being onboarded. The external_id record consists of two fields: - member_id (int): Member ID of the member who owns the external_id. - id (string): corresponding value of the member_id. |
Extended ID's(eid) or Publisher-provided ID's(PPID) |
Extended ID - indicates the type of universal ID or publisher ID being onboarded. The eid record consists of two fields: - source (string): Source of the ID. - id (string) - Publisher or industry ID. Today these are the only two available for audience onboarding. |
Java library example
AppNexus/Xandr User ID (ANID)
{"uid":
{"long":12345},
"segments":
[{"id":123,
"code":"",
"member_id":0,
"expiration":0,
"timestamp":0,
"value":0}]}
Device ID
{"uid":
{"device_id":
{"id":"958cba26-f338-43f3-8bb0-ed821582daae",
"domain":"idfa"}},
"segments":
[{"id":123,
"code":"",
"member_id":0,
"expiration":0,
"timestamp":0,
"value":0}]}
Identifier for Advertising (or IFA)
{"uid":
{"ifa":
{"id":"99136473264876328",
"type":"atif"}},
"segments":
[{"id":123,
"code":"",
"member_id":0,
"expiration":0,
"timestamp":0,
"value":0}]}
External ID
{"uid":
{"external_id":
{"id":"clientid1",
"member_id":958}},
"segments":
[{"id":123,
"code":"",
"member_id":0,
"expiration":0,
"timestamp":0,
"value":0}]}
Extended ID's(eid) or Publisher-provided ID's(PPID)
{"uid":
{"eid":
{"source":"liveramp.com",
"id":"123123123"}},
"segments":
[{"id":123,
"code":"",
"member_id":0,
"expiration":0,
"timestamp":0,
"value":0}]}
Python library example
Python library example: AppNexus/Xandr User ID (ANID)
{'uid': 64,
'segments':
[seg1]}
Python library example: Device ID
{'uid': {'id':
'qweqeqweq',
'domain': 'idfa'},
'segments': [seg1]}
Python library example: Identifier for Advertising (or IFA)
{'uid': {'id':
'qweqeqweq', 'type':
'atif'}, 'segments':
[seg1]}
Python library example: External ID
{'uid': {'id':
'extid1',
'member_id': 914},
'segments': [seg1]}
Python library example: Extended ID's(eid) or Publisher-provided ID's(PPID)
{'uid': {'id':
'qweqeqweq',
'source':
'liveramp.com'},
'segments': [seg1]}
Segments object
You can upload to multiple segments within the same uid record by creating an array of segment objects.
File | Type | Description |
---|---|---|
id |
int | Xandr segment ID. |
code |
string | Xandr segment code. |
member_id |
int | Member ID of the segment. Required when code is specified. |
expiration |
int | Segment expiration in minutes. Set to: - 0 for maximum expiration (180 days).- -1 for segment removal.- -2 for default member expiration. |
timestamp |
long | Segment activation time in seconds from epoch. It specifies when segment becomes 'live'. Set to 0 to activate the segment immediately. |
value |
int | Segment value. |
Installing the AVRO tools library
Java library
Curl -o http://archive.apache.org/dist/avro/avro-1.10.1/java/avro-tools-1.10.1.jar
Python library
python3 -m pip install avro
Download the Xandr Avro schema
You can download the Xandr Avro Schema from here.
Generate your AVRO audience file
For examples using the Java and Python libraries, see below.
Java example
Create an audience file
{"uid":{"long":12345},"segments":[{"id":123,"code":"","member_id":0,"expiration":0,"timestamp":0,"value":0}]}
{"uid":{"external_id":{"id":"clientid1","member_id":958}},"segments":[{"id":123,"code":"","member_id":0,"expiration":0,"timestamp":0,"value":0}]}
{"uid":{"ifa":{"id":"99136473264876328","type":"atif"}},"segments":[{"id":123,"code":"","member_id":0,"expiration":0,"timestamp":0,"value":0}]}
{"uid":{"device_id":{"id":"958cba26-f338-43f3-8bb0-ed821582daae","domain":"idfa"}},"segments":[{"id":123,"code":"","member_id":0,"expiration":0,"timestamp":0,"value":0}]}
{"uid":{"eid":{"source":"liveramp.com","id":"123123123"}},"segments":[{"id":123,"code":"","member_id":0,"expiration":0,"timestamp":0,"value":0}]}
Convert the audience file into AVRO
Run the following command:
java -jar avro-tools-1.10.1.jar fromjson --codec deflate --schema-file xandr_schema.avsc sample.json > sample.avro
Where:
- xandr_schema.avsc = the supplied Xandr Avro schema file;
- sample.json = your audience file;
- and sample.avro = output AVRO file
Python example
Note
- Our examples are for the Python Avro Library, and are not to be confused with the Fast Avro Library.
Python Avro library does not use uid union type names. Instead, it determines the uid type by full match of field names.
{'uid': {'id': 'qweqeqweq', 'domain': 'idfa'}, 'segments': \[…\]}
The Fast Avro library uses hints to specify the exact type of uid similar to the Java library.
{'uid': ('external_id', {'id':'exitd1', 'member_id': 914}), 'segments': \[{'expiration': 259200, 'id': 25815407}\]}
DataFileWriter.append()
accepts a python dictionary (dict) type, not a JSON.
Creating an AVRO audience file
Sample script using the Python Avro library
import avro.schema
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter
# the supplied xandr schema
schema = avro.schema.parse(open("xandr_schema.avsc", "rb").read())
# output avro file
writer = DataFileWriter(open("sample.avro", "wb"), DatumWriter(), schema, codec=’deflate’)
# segments
seg1 = {'id': 1000, 'code': '', 'member_id': 0, 'expiration': 0, 'timestamp': 0, 'value': 0}
# anid
writer.append({'uid': 64, 'segments': [seg1]})
# external id
writer.append({'uid': {'id': 'exitd1', 'member_id': 914}, 'segments': [seg1]})
# idfa
writer.append({'uid': {'id': 'qweqeqweq', 'domain': 'idfa'}, 'segments': [seg1]})
# eid (or ppid)
writer.append({'uid': {'id': 'qweqeqweq', 'source': 'liveramp.com'}, 'segments': [seg1]})
writer.append({'uid': {'id': 'qweqeqweq', 'source': 'netid.de'}, 'segments': [seg1]})
writer.close()