HubSpot

This page contains the setup guide and reference information for the HubSpot source connector.

Prerequisites

  • HubSpot Account
  • For Airbyte Open Source: Private App with Access Token

Setup guide

For Airbyte Cloud:

We highly recommend you use OAuth rather than Private App authentication, as it significantly simplifies the setup process.

For Airbyte Open Source:

We recommend Private App authentication.

More information on HubSpot authentication methods can be found here.

Step 1: Set up HubSpot

For Airbyte Cloud:

- OAuth (Recommended)

- Private App: If you are using a Private App, you will need to use your Access Token to set up the connector. Please refer to the official HubSpot documentation for a detailed guide.

For Airbyte Open Source:

- Private App setup (Recommended): If you are authenticating via a Private App, you will need to use your Access Token to set up the connector. Please refer to the official HubSpot documentation for a detailed guide.

- OAuth setup: If you are using OAuth to authenticate on Airbyte Open Source, please refer to HubSpot's detailed walkthrough. To set up the connector, you will need to acquire your:

  • Client ID
  • Client Secret
  • Refresh Token

Step 2: Configure the scopes for your streams

Unless you are authenticating via OAuth on Airbyte Cloud, you must manually configure scopes so that Airbyte can sync all available data. To see a breakdown of the specific scopes each stream uses, see our full HubSpot documentation.

The table below lists the scopes required by each stream. Please refer to HubSpot's page on scopes for instructions on configuring them.

| Stream | Required Scope |
| --- | --- |
| campaigns | content |
| companies | crm.objects.companies.read, crm.schemas.companies.read |
| contact_lists | crm.objects.lists.read |
| contacts | crm.objects.contacts.read |
| contacts_list_memberships | crm.objects.contacts.read |
| contacts_form_submissions | crm.objects.contacts.read |
| Custom CRM Objects | crm.objects.custom.read |
| deal_pipelines | crm.objects.contacts.read |
| deals | crm.objects.deals.read, crm.schemas.deals.read |
| deals_archived | crm.objects.deals.read, crm.schemas.deals.read |
| email_events | content |
| email_subscriptions | content |
| engagements | crm.objects.companies.read, crm.objects.contacts.read, crm.objects.deals.read, tickets, e-commerce |
| engagements_emails | sales-email-read |
| forms | forms |
| form_submissions | forms |
| goals | crm.objects.goals.read |
| line_items | e-commerce |
| owners | crm.objects.owners.read |
| products | e-commerce |
| property_history | crm.objects.contacts.read |
| subscription_changes | content |
| tickets | tickets |
| workflows | automation |
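
When you configure scopes by hand, it can help to work out the combined set of scopes for the streams you plan to sync. The following is a minimal sketch, not part of the connector: the mapping is copied from the table above, and the helper names `required_scopes` and `missing_scopes` are hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Required scopes per stream, copied from the table above.
STREAM_SCOPES: Dict[str, List[str]] = {
    "campaigns": ["content"],
    "companies": ["crm.objects.companies.read", "crm.schemas.companies.read"],
    "contact_lists": ["crm.objects.lists.read"],
    "contacts": ["crm.objects.contacts.read"],
    "contacts_list_memberships": ["crm.objects.contacts.read"],
    "contacts_form_submissions": ["crm.objects.contacts.read"],
    "Custom CRM Objects": ["crm.objects.custom.read"],
    "deal_pipelines": ["crm.objects.contacts.read"],
    "deals": ["crm.objects.deals.read", "crm.schemas.deals.read"],
    "deals_archived": ["crm.objects.deals.read", "crm.schemas.deals.read"],
    "email_events": ["content"],
    "email_subscriptions": ["content"],
    "engagements": [
        "crm.objects.companies.read",
        "crm.objects.contacts.read",
        "crm.objects.deals.read",
        "tickets",
        "e-commerce",
    ],
    "engagements_emails": ["sales-email-read"],
    "forms": ["forms"],
    "form_submissions": ["forms"],
    "goals": ["crm.objects.goals.read"],
    "line_items": ["e-commerce"],
    "owners": ["crm.objects.owners.read"],
    "products": ["e-commerce"],
    "property_history": ["crm.objects.contacts.read"],
    "subscription_changes": ["content"],
    "tickets": ["tickets"],
    "workflows": ["automation"],
}


def required_scopes(streams: Iterable[str]) -> Set[str]:
    """Return the union of scopes needed to sync the given streams."""
    scopes: Set[str] = set()
    for stream in streams:
        # A KeyError here means the stream name is not in the table.
        scopes.update(STREAM_SCOPES[stream])
    return scopes


def missing_scopes(streams: Iterable[str], granted: Iterable[str]) -> Set[str]:
    """Return the scopes still to be added to the app for these streams."""
    return required_scopes(streams) - set(granted)
```

For example, `missing_scopes(["forms", "workflows"], ["forms"])` reports that the `automation` scope still needs to be added.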

Step 3: Set up the HubSpot source connector in Airbyte

For Airbyte Cloud:

  1. Log in to your Airbyte Cloud account.
  2. From the Airbyte UI, click Sources, then click on + New Source and select HubSpot from the list of available sources.
  3. Enter a Source name of your choosing.
  4. From the Authentication dropdown, select your chosen authentication method:
    • Recommended: To authenticate using OAuth, select OAuth and click Authenticate your HubSpot account to sign in with HubSpot and authorize your account.
      HubSpot Authentication issues

      You might encounter errors during the connection process in the popup window, such as An invalid scope name was provided. To resolve this, close the window and attempt authentication again.

    • Not Recommended: To authenticate using a Private App, select Private App and enter the Access Token for your HubSpot account.
  5. For Start date, use the provided datepicker or enter the date programmatically in the following format: yyyy-mm-ddThh:mm:ssZ. Data added on or after this date will be replicated. If no start date is set, "2006-06-01T00:00:00Z" (HubSpot's creation date) is used. We recommend setting a start date that is relevant to your data to optimize synchronization.
  6. Click Set up source and wait for the tests to complete.

For Airbyte Open Source:

  1. Navigate to the Airbyte Open Source dashboard.
  2. From the Airbyte UI, click Sources, then click on + New Source and select HubSpot from the list of available sources.
  3. Enter a Source name of your choosing.
  4. From the Authentication dropdown, select your chosen authentication method:
    • Recommended: To authenticate using a Private App, select Private App and enter the Access Token for your HubSpot account.
    • Not Recommended: To authenticate using OAuth, select OAuth and enter your Client ID, Client Secret, and Refresh Token.
  5. For Start date, use the provided datepicker or enter the date programmatically in the following format: yyyy-mm-ddThh:mm:ssZ. Data added on or after this date will be replicated. If no start date is set, "2006-06-01T00:00:00Z" (HubSpot's creation date) is used. We recommend setting a start date that is relevant to your data to optimize synchronization.
  6. Click Set up source and wait for the tests to complete.
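
If you enter the start date programmatically, the value must match the yyyy-mm-ddThh:mm:ssZ format described in step 5. The sketch below shows one way to produce and check such a value with Python's standard library. The helper names are hypothetical, and treating naive datetimes as UTC is an assumption of this example.

```python
from datetime import datetime, timezone

# strftime/strptime pattern for yyyy-mm-ddThh:mm:ssZ
START_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"
# Default used when no start date is set, as stated in step 5.
DEFAULT_START_DATE = "2006-06-01T00:00:00Z"


def format_start_date(moment: datetime) -> str:
    """Render a datetime as a UTC start date string."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes are already expressed in UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime(START_DATE_FORMAT)


def parse_start_date(value: str = DEFAULT_START_DATE) -> datetime:
    """Parse a start date string; raises ValueError if the format is wrong."""
    parsed = datetime.strptime(value, START_DATE_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)
```

A date-only value such as "2023-01-15" fails `parse_start_date` with a `ValueError`, which is a quick way to catch a malformed value before saving the source.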

Experimental streams

Web Analytics streams can be enabled as an experimental feature. Note that they rely on an API that is currently in beta, so they may change or be unstable.

Custom CRM Objects

Custom CRM Objects and Custom Web Analytics will appear as streams available for sync, alongside the standard objects listed above.

If you set up your connections before April 15th, 2023 (on Airbyte Cloud) or before version 0.8.0 (on Airbyte Open Source), you'll need to do some additional work to sync custom CRM objects.

First you need to give the connector some additional permissions:

  • If you are using OAuth on Airbyte Cloud, go to the HubSpot source settings page in the Airbyte UI and re-authenticate via OAuth to grant Airbyte permission to access custom objects.
  • If you are using OAuth on OSS or Private App auth, go into the HubSpot UI where you created your Private App or OAuth application and add the crm.objects.custom.read scope to your app's scopes. See HubSpot's instructions here.

Then, go to the replication settings of your connection and click refresh source schema to pull in those new streams for syncing.

Supported sync modes

The HubSpot source connector supports the following sync modes:

  • Full Refresh
  • Incremental

Note: There are two types of incremental sync:

  1. Incremental (standard server-side: the API returns only the data updated or created since the last sync)
  2. Client-Side Incremental (the API returns all available data, and the connector keeps only the new records)
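
The client-side variant can be illustrated with a short sketch. This is not the connector's actual code. It assumes each record carries an ISO 8601 timestamp string in a cursor field (the name `updatedAt` is used for illustration) and that all timestamps share one format, so that string comparison matches chronological order.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]


def client_side_incremental(
    records: Iterable[Record], cursor_field: str, state: str
) -> Tuple[List[Record], str]:
    """Keep only records newer than the saved state and return the new state.

    The API is assumed to return every record; filtering happens here.
    """
    new_records = [r for r in records if r[cursor_field] > state]
    # The new state is the highest cursor value seen so far.
    new_state = max([state] + [r[cursor_field] for r in new_records])
    return new_records, new_state


# The "API response" contains all data, old and new alike.
all_records = [
    {"id": "1", "updatedAt": "2024-01-01T00:00:00Z"},
    {"id": "2", "updatedAt": "2024-03-01T00:00:00Z"},
    {"id": "3", "updatedAt": "2024-05-01T00:00:00Z"},
]
fresh, state = client_side_incremental(all_records, "updatedAt", "2024-02-01T00:00:00Z")
```

Here `fresh` holds records 2 and 3, and `state` advances to the timestamp of record 3. The full dataset is still transferred on every sync, which is why this mode saves destination writes rather than API calls.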

Supported streams

The HubSpot source connector supports the following streams:

Notes on the property_history streams

Even though the stream is Incremental, some record types are not affected by the last sync timestamp pointer. For example, records of type CALCULATED always have a most recent timestamp equal to the request time, so every sync returns a batch of these records.

Notes on the engagements stream

  1. Objects in the engagements stream can have one of the following types: note, email, task, meeting, call. Depending on the type of engagement, different properties are set for that object in the engagements_metadata table in the destination:
  • A call engagement has a corresponding engagements_metadata object with non-null values in the toNumber, fromNumber, status, externalId, durationMilliseconds, externalAccountId, recordingUrl, body, and disposition columns.
  • An email engagement has a corresponding engagements_metadata object with non-null values in the subject, html, and text columns. In addition, there will be records in four related tables, engagements_metadata_from, engagements_metadata_to, engagements_metadata_cc, engagements_metadata_bcc.
  • A meeting engagement has a corresponding engagements_metadata object with non-null values in the body, startTime, endTime, and title columns.
  • A note engagement has a corresponding engagements_metadata object with non-null values in the body column.
  • A task engagement has a corresponding engagements_metadata object with non-null values in the body, status, and forObjectType columns.
  2. The engagements stream uses one of two APIs, depending on the length of time since the last sync and the number of records that Airbyte hasn't yet synced:
  • EngagementsRecent if the following two criteria are met:
    • The last sync was performed within the last 30 days
    • Fewer than 10,000 records are being synced
  • EngagementsAll if either of these criteria is not met.

Because of this, the engagements stream can be slow to sync if it hasn't synced within the last 30 days and/or is generating large volumes of new data. We therefore recommend scheduling frequent syncs.
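
The selection rule above can be written as a small decision function. This is a sketch of the documented criteria only, not the connector's implementation. The function name is hypothetical, and treating a first sync (no previous sync time) and the exact 30-day boundary as shown are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RECENT_WINDOW = timedelta(days=30)
RECENT_RECORD_LIMIT = 10_000


def choose_engagements_api(
    last_sync: Optional[datetime], pending_records: int, now: datetime
) -> str:
    """Return which API the documented criteria would select."""
    if last_sync is None:
        # Assumption: with no previous sync, the first criterion cannot be met.
        return "EngagementsAll"
    synced_recently = now - last_sync <= RECENT_WINDOW
    few_records = pending_records < RECENT_RECORD_LIMIT
    # Both criteria must hold for the Recent API to be used.
    return "EngagementsRecent" if synced_recently and few_records else "EngagementsAll"


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
daily = choose_engagements_api(now - timedelta(days=1), 500, now)
stale = choose_engagements_api(now - timedelta(days=45), 500, now)
```

With these inputs, `daily` is "EngagementsRecent" and `stale` is "EngagementsAll", which shows why a sync schedule with gaps shorter than 30 days keeps the stream on the faster path.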

Notes on the Forms and Form Submissions streams

These streams sync only marketing forms. If you need other form types, try syncing Contacts Form Submissions.

Limitations & Troubleshooting

Connector limitations

Rate limiting

The connector is subject to HubSpot's standard API rate limits.

| Product tier | Limits |
| --- | --- |
| Free & Starter | Burst: 100/10 seconds, Daily: 250,000 |
| Professional & Enterprise | Burst: 150/10 seconds, Daily: 500,000 |
| API add-on (any tier) | Burst: 200/10 seconds, Daily: 1,000,000 |
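
To get a feel for what a burst limit of N requests per 10 seconds means in practice, here is a minimal sliding-window limiter. It is an illustration only and does not describe how the connector handles rate limits. The class name `BurstLimiter` and the injectable `clock` parameter are assumptions of this sketch, and the burst values are taken from the table above.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict

# Burst limits (requests per 10 seconds), from the table above.
BURST_LIMITS: Dict[str, int] = {
    "Free & Starter": 100,
    "Professional & Enterprise": 150,
    "API add-on (any tier)": 200,
}


class BurstLimiter:
    """Sliding window: allow at most `max_calls` within `window` seconds."""

    def __init__(
        self,
        max_calls: int,
        window: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window = window
        self.clock = clock
        self._calls: Deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Forget calls that have left the window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.window - (now - self._calls[0])

    def record_call(self) -> None:
        """Register that a request was just sent."""
        self._calls.append(self.clock())
```

A caller would sleep for `wait_time()` seconds before each request and call `record_call()` after sending it. The daily quota is a separate budget that this sketch does not track.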

Troubleshooting

  • Consider checking out the following HubSpot tutorial: Build a single customer view with open-source tools.

  • Enabling streams: Some streams, such as workflows, need to be enabled before they can be read using a connector authenticated using an API Key. If you try to read a stream that is not enabled, a log message is returned in the output and the sync operation syncs only the other available streams.

    Example of the output message when trying to read the workflows stream with missing permissions for the API Key:

    {
      "type": "LOG",
      "log": {
        "level": "WARN",
        "message": "Stream `workflows` cannot be proceed. This API Key (EXAMPLE_API_KEY) does not have proper permissions! (requires any of [automation-access])"
      }
    }
  • Unnesting top level properties: Since version 1.5.0, most nested data is duplicated as top-level fields, so that users do not have to query complicated JSON fields in their destinations.

    For instance:

    {
      "id": 1,
      "updatedAt": "2020-01-01",
      "properties": {
        "hs_note_body": "World's best boss",
        "hs_created_by": "Michael Scott"
      }
    }

    becomes

    {
      "id": 1,
      "updatedAt": "2020-01-01",
      "properties": {
        "hs_note_body": "World's best boss",
        "hs_created_by": "Michael Scott"
      },
      "properties_hs_note_body": "World's best boss",
      "properties_hs_created_by": "Michael Scott"
    }
  • 403 Forbidden Error

    • HubSpot has scopes for each API call.

    • Each stream is tied to a scope and will need access to that scope to sync data.

    • Review the HubSpot OAuth scope documentation here.

    • Additional permissions:

      feedback_submissions: Service Hub Professional account

      marketing_emails: Marketing Hub Starter account

      workflows: Sales, Service, and Marketing Hub Professional accounts

  • Check out common troubleshooting issues for the HubSpot source connector on our Airbyte Forum.
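
The transformation described under "Unnesting top level properties" above can be sketched in a few lines. This is an illustration of the documented before/after shape, not the connector's code. The function name `unnest_properties` is hypothetical, and the source says only that most nested data is duplicated, so the real behaviour may skip some fields.

```python
from typing import Any, Dict


def unnest_properties(record: Dict[str, Any], field: str = "properties") -> Dict[str, Any]:
    """Copy each key of record[field] to the top level as '<field>_<key>'.

    The nested object itself is kept, so the data is duplicated, not moved.
    """
    flattened = dict(record)
    for key, value in (record.get(field) or {}).items():
        flattened[f"{field}_{key}"] = value
    return flattened


note = {
    "id": 1,
    "updatedAt": "2020-01-01",
    "properties": {
        "hs_note_body": "World's best boss",
        "hs_created_by": "Michael Scott",
    },
}
unnested = unnest_properties(note)
```

After this call, `unnested` contains the original `properties` object plus the top-level `properties_hs_note_body` and `properties_hs_created_by` fields, which can be selected directly as columns in the destination.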

Reference

Config fields reference

| Type | Property name |
| --- | --- |
| object | credentials |
| string | start_date |
| boolean | enable_experimental_streams |
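
As a quick sanity check before submitting a configuration through an API or a file, the property types listed above can be verified in code. This is a minimal sketch under stated assumptions: the helper name `config_type_errors` is hypothetical, every field is treated as optional because the reference does not say which are required, and the contents of the `credentials` object are not checked because they are not specified here.

```python
from typing import Any, Dict, List

# Property names and types from the reference above.
CONFIG_FIELD_TYPES: Dict[str, type] = {
    "credentials": dict,  # "object"
    "start_date": str,  # "string"
    "enable_experimental_streams": bool,  # "boolean"
}


def config_type_errors(config: Dict[str, Any]) -> List[str]:
    """Return one message per present field whose value has the wrong type."""
    errors: List[str] = []
    for name, expected in CONFIG_FIELD_TYPES.items():
        if name in config and not isinstance(config[name], expected):
            actual = type(config[name]).__name__
            errors.append(f"{name}: expected {expected.__name__}, got {actual}")
    return errors


example_config = {
    "credentials": {},  # placeholder; the inner keys depend on the auth method
    "start_date": "2023-01-01T00:00:00Z",
    "enable_experimental_streams": False,
}
problems = config_type_errors(example_config)
```

An empty `problems` list means the three documented fields have the expected types. It does not guarantee that the connector will accept the configuration.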

Changelog

| Version | Date | Pull Request | Subject |
| --- | --- | --- | --- |
| 4.2.6 | 2024-06-22 | 40126 | Update dependencies |
| 4.2.5 | 2024-06-17 | 39432 | Remove references to deprecated state method |
| 4.2.4 | 2024-06-10 | 38800 | Retry hubspot _parse_and_handle_errors on JSON decode errors |
| 4.2.3 | 2024-06-06 | 39314 | Added missing schema types for the Workflows stream schema |
| 4.2.2 | 2024-06-04 | 38981 | [autopull] Upgrade base image to v1.2.1 |
| 4.2.1 | 2024-05-30 | 38024 | Retry when attempting to get scopes |
| 4.2.0 | 2024-05-24 | 38049 | Add resumable full refresh support to contacts_form_submissions and contacts_merged_audit streams |
| 4.1.5 | 2024-05-17 | 38243 | Replace AirbyteLogger with logging.Logger |
| 4.1.4 | 2024-05-16 | 38286 | Added default schema normalization for the Tickets stream, to ensure the data types |
| 4.1.3 | 2024-05-13 | 38128 | contacts_list_memberships as semi-incremental stream |
| 4.1.2 | 2024-04-24 | 36642 | Schema descriptions and CDK 0.80.0 |
| 4.1.1 | 2024-04-11 | 35945 | Add integration tests |
| 4.1.0 | 2024-03-27 | 36541 | Added test configuration features, fixed type hints |
| 4.0.0 | 2024-03-10 | 35662 | Update Deals Property History and Companies Property History schemas |
| 3.3.0 | 2024-02-16 | 34597 | Make start date not required, sync all data from default value if it's not provided |
| 3.2.0 | 2024-02-15 | 35328 | Add mailingIlsListsIncluded and mailingIlsListsExcluded fields to Marketing emails stream schema |
| 3.1.1 | 2024-02-12 | 35165 | Manage dependencies with Poetry. |
| 3.1.0 | 2024-02-05 | 34829 | Add Contacts Form Submissions stream |
| 3.0.1 | 2024-01-29 | 34635 | Fix pagination for CompaniesPropertyHistory stream |
| 3.0.0 | 2024-01-25 | 34492 | Update marketing_emails stream schema |
| 2.0.2 | 2023-12-15 | 33844 | Make property_history PK combined to support Incremental/Deduped sync type |
| 2.0.1 | 2023-12-15 | 33527 | Make query string calculated correctly for PropertyHistory streams to avoid 414 HTTP Errors |
| 2.0.0 | 2023-12-08 | 33266 | Add ContactsPropertyHistory, CompaniesPropertyHistory, DealsPropertyHistory streams |
| 1.9.0 | 2023-12-04 | 33042 | Add Web Analytics streams |
| 1.8.0 | 2023-11-23 | 32778 | Extend PropertyHistory stream to support incremental sync |
| 1.7.0 | 2023-11-01 | 32035 | Extend the Forms stream schema |
| 1.6.1 | 2023-10-20 | 31644 | Base image migration: remove Dockerfile and use the python-connector-base image |
| 1.6.0 | 2023-10-19 | 31606 | Add new field aifeatures to the marketing emails stream schema |
| 1.5.1 | 2023-10-04 | 31050 | Add type transformer for Engagements stream |
| 1.5.0 | 2023-09-11 | 30322 | Unnest stream schemas |
| 1.4.1 | 2023-08-22 | 29715 | Fix python package configuration stream |
| 1.4.0 | 2023-08-11 | 29249 | Add OwnersArchived stream |
| 1.3.3 | 2023-08-10 | 29248 | Specify threadId in engagements stream to type string |
| 1.3.2 | 2023-08-10 | 29326 | Add primary keys to streams ContactLists and PropertyHistory |
| 1.3.1 | 2023-08-08 | 29211 | Handle 400 and 403 errors without interruption of the sync |
| 1.3.0 | 2023-08-01 | 28909 | Add handling of source connection errors |
| 1.2.0 | 2023-07-27 | 27091 | Add new stream ContactsMergedAudit |
| 1.1.2 | 2023-07-27 | 28558 | Improve error messages during connector setup |
| 1.1.1 | 2023-07-25 | 28705 | Fix retry handler for token expired error |
| 1.1.0 | 2023-07-18 | 28349 | Add unexpected fields in schemas of streams email_events, email_subscriptions, engagements, campaigns |
| 1.0.1 | 2023-06-23 | 27658 | Use fully qualified name to retrieve custom objects |
| 1.0.0 | 2023-06-08 | 27161 | Fix increment sync for engagements stream, 'Recent' API is used for recent syncs of last recent 30 days and less than 10k records, otherwise full sync is performed by 'All' API |
| 0.9.0 | 2023-06-26 | 27726 | License Update: Elv2 |
| 0.8.4 | 2023-05-17 | 25667 | Fixed bug with wrong parsing of boolean encoded like "false" parsed as True |
| 0.8.3 | 2023-05-31 | 26831 | Remove authSpecification from connector specification in favour of advancedAuth |
| 0.8.2 | 2023-05-16 | 26418 | Add custom availability strategy which catches permission errors from parent streams |
| 0.8.1 | 2023-05-29 | 26719 | Handle issue when state value is literally "" (empty str) |
| 0.8.0 | 2023-04-10 | 16032 | Add new stream Custom Object |
| 0.7.0 | 2023-04-10 | 24450 | Add new stream Goals |
| 0.6.2 | 2023-04-28 | 25667 | Fix bug with Invalid Date like 2000-00-00T00:00:00Z while setting up the connector |
| 0.6.1 | 2023-04-10 | 21423 | Update scope for DealPipelines stream to only crm.objects.contacts.read |
| 0.6.0 | 2023-04-07 | 24980 | Add new stream DealsArchived |
| 0.5.2 | 2023-04-07 | 24915 | Fix field key parsing (replace whitespace with underscore) |
| 0.5.1 | 2023-04-05 | 22982 | Specified date formatting in specification |
| 0.5.0 | 2023-03-30 | 24711 | Add incremental sync support for campaigns, deal_pipelines, ticket_pipelines, forms, form_submissions, workflows, owners |
| 0.4.0 | 2023-03-31 | 22910 | Add email_subscriptions stream |
| 0.3.4 | 2023-03-28 | 24641 | Convert to int only numeric values |
| 0.3.3 | 2023-03-27 | 24591 | Fix pagination for marketing emails stream |
| 0.3.2 | 2023-02-07 | 22479 | Turn on default HttpAvailabilityStrategy |
| 0.3.1 | 2023-01-27 | 22009 | Set AvailabilityStrategy for streams explicitly to None |
| 0.3.0 | 2022-10-27 | 18546 | Sunsetting API Key authentication. Quotes stream is no longer available |
| 0.2.2 | 2022-10-03 | 16914 | Fix 403 forbidden error validation |
| 0.2.1 | 2022-09-26 | 17120 | Migrate to per-stream state. |
| 0.2.0 | 2022-09-13 | 16632 | Remove Feedback Submissions stream as the one using unstable (beta) API. |
| 0.1.83 | 2022-09-01 | 16214 | Update Tickets, fix missing properties and change how state is updated. |
| 0.1.82 | 2022-08-18 | 15110 | Check if it has a state on search streams before first sync |
| 0.1.81 | 2022-08-05 | 15354 | Fix Deals stream schema |
| 0.1.80 | 2022-08-01 | 15156 | Fix 401 error while retrieving associations using OAuth |
| 0.1.79 | 2022-07-28 | 15144 | Revert v0.1.78 due to permission issues |
| 0.1.78 | 2022-07-28 | 15099 | Fix to fetch associations when using incremental mode |
| 0.1.77 | 2022-07-26 | 15035 | Make PropertyHistory stream read historic data not limited to 30 days |
| 0.1.76 | 2022-07-25 | 14999 | Partially revert changes made in v0.1.75 |
| 0.1.75 | 2022-07-18 | 14744 | Remove override of private CDK method |
| 0.1.74 | 2022-07-25 | 14412 | Add private app authentication |
| 0.1.73 | 2022-07-13 | 14666 | Decrease number of http requests made, disable Incremental mode for PropertyHistory stream |
| 0.1.72 | 2022-06-24 | 14054 | Extended error logging |
| 0.1.71 | 2022-06-24 | 14102 | Removed legacy AirbyteSentry dependency from the code |
| 0.1.70 | 2022-06-16 | 13837 | Fix the missing data in CRM streams issue |
| 0.1.69 | 2022-06-10 | 13691 | Fix the URI Too Long issue |
| 0.1.68 | 2022-06-08 | 13596 | Fix for the property_history which did not emit records |
| 0.1.67 | 2022-06-07 | 13566 | Report which scopes are missing to the user |
| 0.1.66 | 2022-06-05 | 13475 | Scope crm.objects.feedback_submissions.read added for feedback_submissions stream |
| 0.1.65 | 2022-06-03 | 13455 | Discover only returns streams for which required scopes were granted |
| 0.1.64 | 2022-06-03 | 13218 | Transform contact_lists data to comply with schema |
| 0.1.63 | 2022-06-02 | 13320 | Fix connector incremental state handling |
| 0.1.62 | 2022-06-01 | 13383 | Add line items to deals stream |
| 0.1.61 | 2022-05-25 | 13381 | Requests scopes as optional instead of required |
| 0.1.60 | 2022-05-25 | 13159 | Use RFC3339 datetime |
| 0.1.59 | 2022-05-10 | 12711 | Ensure oauth2.0 token has all needed scopes in "check" command |
| 0.1.58 | 2022-05-04 | 12482 | Update input configuration copy |
| 0.1.57 | 2022-05-04 | 12198 | Add deals associations for quotes |
| 0.1.56 | 2022-05-02 | 12515 | Extra logs for troubleshooting 403 errors |
| 0.1.55 | 2022-04-28 | 12424 | Correct schema for ticket_pipeline stream |
| 0.1.54 | 2022-04-28 | 12335 | Mock time sleep in unit tests |
| 0.1.53 | 2022-04-20 | 12230 | Change spec json to yaml format |
| 0.1.52 | 2022-03-25 | 11423 | Add tickets associations to engagements streams |
| 0.1.51 | 2022-03-24 | 11321 | Fix updated at field non exists issue |
| 0.1.50 | 2022-03-22 | 11266 | Fix Engagements Stream Pagination |
| 0.1.49 | 2022-03-17 | 11218 | Anchor hyperlink in input configuration |
| 0.1.48 | 2022-03-16 | 11105 | Fix float numbers, upd docs |
| 0.1.47 | 2022-03-15 | 11121 | Add partition keys where appropriate |
| 0.1.46 | 2022-03-14 | 10700 | Handle 10k+ records reading in Hubspot streams |
| 0.1.45 | 2022-03-04 | 10707 | Remove stage history from deals stream to increase efficiency |
| 0.1.44 | 2022-02-24 | 9027 | Add associations companies to deals, ticket and contact stream |
| 0.1.43 | 2022-02-24 | 10576 | Cast timestamp to date/datetime |
| 0.1.42 | 2022-02-22 | 10492 | Add date-time format to datetime fields |
| 0.1.41 | 2022-02-21 | 10177 | Migrate to CDK |
| 0.1.40 | 2022-02-10 | 10142 | Add associations to ticket stream |
| 0.1.39 | 2022-02-10 | 10055 | Bug fix: reading not initialized stream |
| 0.1.38 | 2022-02-03 | 9786 | Add new streams for engagements(calls, emails, meetings, notes and tasks) |
| 0.1.37 | 2022-01-27 | 9555 | Getting form_submission for all forms |
| 0.1.36 | 2022-01-22 | 7784 | Add Property History Stream |
| 0.1.35 | 2021-12-24 | 9081 | Add Feedback Submissions stream and update Ticket Pipelines stream |
| 0.1.34 | 2022-01-20 | 9641 | Add more fields for email_events stream |
| 0.1.33 | 2022-01-14 | 8887 | More efficient support for incremental updates on Companies, Contact, Deals and Engagement streams |
| 0.1.32 | 2022-01-13 | 8011 | Add new stream form_submissions |
| 0.1.31 | 2022-01-11 | 9385 | Remove auto-generated properties from Engagements stream |
| 0.1.30 | 2021-01-10 | 9129 | Created Contacts list memberships streams |
| 0.1.29 | 2021-12-17 | 8699 | Add incremental sync support for companies, contact_lists, contacts, deals, line_items, products, quotes, tickets streams |
| 0.1.28 | 2021-12-15 | 8429 | Update fields and descriptions |
| 0.1.27 | 2021-12-09 | 8658 | Fix config backward compatibility issue by allowing additional properties in the spec |
| 0.1.26 | 2021-11-30 | 8329 | Remove 'skip_dynamic_fields' config param |
| 0.1.25 | 2021-11-23 | 8216 | Add skip dynamic fields for testing only |
| 0.1.24 | 2021-11-09 | 7683 | Fix name issue 'Hubspot' -> 'HubSpot' |
| 0.1.23 | 2021-11-08 | 7730 | Fix OAuth flow schema |
| 0.1.22 | 2021-11-03 | 7562 | Migrate Hubspot source to CDK structure |
| 0.1.21 | 2021-10-27 | 7405 | Change of package import from urllib to urllib.parse |
| 0.1.20 | 2021-10-26 | 7393 | Hotfix for split_properties function, add the length of separator symbol ,(%2C in HTTP format) to the checking of the summary URL length |
| 0.1.19 | 2021-10-26 | 6954 | Fix issue with getting 414 HTTP error for streams |
| 0.1.18 | 2021-10-18 | 5840 | Add new marketing emails (with statistics) stream |
| 0.1.17 | 2021-10-14 | 6995 | Update discover method: disable quotes stream when using OAuth config |
| 0.1.16 | 2021-09-27 | 6465 | Implement OAuth support. Use CDK authenticator instead of connector specific authenticator |
| 0.1.15 | 2021-09-23 | 6374 | Use correct schema for owners stream |
| 0.1.14 | 2021-09-08 | 5693 | Include deal_to_contact association when pulling deal stream and include contact ID in contact stream |
| 0.1.13 | 2021-09-08 | 5834 | Fix array fields without items property in schema |
| 0.1.12 | 2021-09-02 | 5798 | Treat empty string values as None for field with format to fix normalization errors |
| 0.1.11 | 2021-08-26 | 5685 | Remove all date-time format from schemas |
| 0.1.10 | 2021-08-17 | 5463 | Fix fail on reading stream using API Key without required permissions |
| 0.1.9 | 2021-08-11 | 5334 | Fix empty strings inside float datatype |
| 0.1.8 | 2021-08-06 | 5250 | Fix issue with printing exceptions |
| 0.1.7 | 2021-07-27 | 4913 | Update fields schema |