Data Warehousing using the Katalys API

This page describes how a partner might use the Katalys API to pull their Katalys Partner Account data into their own data warehouse. It assumes the partner has an account in good standing with Katalys and has generated an API Key for their organization.

This document is meant as a high-level proof of concept. We cannot offer specific recommendations for your data infrastructure. Please consult your own development team to create a solution that works for your specific case!

Overview

A basic data warehousing architecture would entail two regularly scheduled tasks (or “cron jobs”):

  1. A task that runs frequently, such as every hour, to query and store “realtime” data. This task will process the previous 2 hours of data.

  2. A task that runs infrequently, such as once per week, to update your datastore with any conversion adjustments. This task will process the previous 60 days of data.

Both tasks will use the Conversion Report API endpoint to download a list of conversions. The downloaded report must include the seq column. Also include dimensions relevant to your use case, such as order_time, payout, and conversion_status.
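As a rough illustration of the two tasks above, the following sketch computes the time window each task would request. It only covers the window arithmetic; the names report_window, REALTIME_LOOKBACK, ADJUSTMENT_LOOKBACK, and REPORT_COLUMNS are hypothetical, and the actual Conversion Report request (URL, parameters, authentication) is deliberately left out because it depends on the API reference for your account.

```python
from datetime import datetime, timedelta, timezone

# Lookback windows taken from the two tasks described above.
REALTIME_LOOKBACK = timedelta(hours=2)    # hourly task
ADJUSTMENT_LOOKBACK = timedelta(days=60)  # weekly task

# Hypothetical column list: seq is required, the others are the
# example dimensions mentioned in the text.
REPORT_COLUMNS = ["seq", "order_time", "payout", "conversion_status"]


def report_window(now, lookback):
    """Return the (start, end) range, in UTC, that a task should request."""
    end = now.astimezone(timezone.utc)
    return end - lookback, end


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for name, lookback in [("realtime", REALTIME_LOOKBACK),
                           ("adjustments", ADJUSTMENT_LOOKBACK)]:
        start, end = report_window(now, lookback)
        print(f"{name}: {start.isoformat()} -> {end.isoformat()}")
```

Because the hourly task looks back 2 hours, consecutive runs overlap. That overlap is harmless as long as rows are de-duplicated on seq when stored.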

Data Schema

When building your schema, we recommend using the Katalys Conversion’s seq value as your primary key. The Katalys seq or “Sequence” value is an acceptable durable key for de-duplicating rows across reports, and for performing row updates.
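A minimal sketch of this idea, using SQLite purely for illustration: seq is the primary key, and each downloaded row is written with an upsert so that repeated or adjusted rows update the existing record instead of creating a duplicate. The table name, the column types, and the choice to store seq as text are assumptions, not part of the Katalys API; adapt them to your own warehouse. The ON CONFLICT clause requires SQLite 3.24 or newer.

```python
import sqlite3


def create_schema(conn):
    """Create a conversions table keyed on seq (column types are assumptions)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS conversions (
            seq               TEXT PRIMARY KEY,
            order_time        TEXT,
            payout            REAL,
            conversion_status TEXT
        )
        """
    )


def upsert_conversions(conn, rows):
    """Insert new rows, or update existing rows that share the same seq."""
    conn.executemany(
        """
        INSERT INTO conversions (seq, order_time, payout, conversion_status)
        VALUES (:seq, :order_time, :payout, :conversion_status)
        ON CONFLICT(seq) DO UPDATE SET
            order_time        = excluded.order_time,
            payout            = excluded.payout,
            conversion_status = excluded.conversion_status
        """,
        rows,
    )
    conn.commit()
```

Both scheduled tasks can share the same upsert: the frequent task mostly inserts new rows, while the infrequent task mostly updates rows whose status or value has been adjusted.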

Under most circumstances, once a seq appears it will remain visible in your dataset forever, even if its status or value changes. However, there are cases where a seq might “disappear” from your view. These include testing data, which is sometimes purged, and merged orders, where data received indicates that a seq was actually a duplicate order. These cases are infrequent, but they can happen.
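One possible way to notice such cases, sketched here under assumptions: after re-downloading a window, compare the seq values already stored for that same window against the seq values in the fresh report. The helper name find_missing_seqs is hypothetical, and what to do with the result (delete, flag, or ignore) is a decision for your own team.

```python
def find_missing_seqs(stored_seqs, report_seqs):
    """Return seqs stored locally for a window that the fresh report no longer lists.

    Both arguments must cover the same time window; otherwise rows that are
    simply outside the report's range would be reported as missing.
    """
    return sorted(set(stored_seqs) - set(report_seqs))
```

Flagging the missing rows rather than deleting them immediately keeps a record of what changed, which can help when reconciling totals later.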
