How to Sync 100,000+ Products to Shopify

Updated 2026-06 · 9 min read · By the Former CTO and Co-founder

Syncing a catalog of 100,000 or more products to Shopify is a different problem from importing a few hundred items with a CSV. At that scale, you hit API rate limits, file size restrictions, and data consistency problems that call for a purpose-built architecture rather than a plugin. Seven Hills Services has synced catalogs exceeding 150,000 SKUs for clients including enterprise carriers, so the patterns here come from production systems.

This guide covers the tools Shopify provides for large-scale catalog work, the architecture decisions that determine whether your sync stays current or drifts, and the tradeoffs between different approaches. We focus on the Shopify GraphQL Admin API, the BulkOperation API, and the patterns that keep syncs reliable at scale.

Why Standard Import Tools Break at 100,000 SKUs

The Shopify admin CSV import works up to about 1,000 products in a single file before it becomes unreliable. Third-party import apps on the App Store add retry logic and chunking, but most are designed for one-time migrations, not ongoing syncs. They also run on shared infrastructure, which means their throughput is shared across thousands of merchants and can be unpredictable during peak hours.

The core problem at 100,000+ SKUs is that you cannot treat the catalog as a single payload. You need to break it into chunks, track the state of each chunk, handle failures per chunk without retrying the entire catalog, and reconcile the final state to catch drift. A spreadsheet or a no-code tool cannot manage that state reliably. You need a sync service with its own database.
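To make the chunk-and-track idea concrete, here is a minimal in-memory sketch. The type names, the status values, and the `chunksToRetry` helper are hypothetical, and a real sync service would persist this state in its database rather than in an array.

```typescript
// Hypothetical chunk lifecycle; a production service would store this in a database table.
type ChunkStatus = "pending" | "submitted" | "succeeded" | "failed";

interface Chunk<T> {
  id: number;
  records: T[];
  status: ChunkStatus;
  attempts: number;
}

// Split the catalog into fixed-size chunks, each with its own state.
function chunkCatalog<T>(records: T[], chunkSize: number): Chunk<T>[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  const chunks: Chunk<T>[] = [];
  for (let i = 0; i < records.length; i += chunkSize) {
    chunks.push({
      id: chunks.length,
      records: records.slice(i, i + chunkSize),
      status: "pending",
      attempts: 0,
    });
  }
  return chunks;
}

// Only failed chunks that still have attempts left are retried;
// succeeded chunks are never resubmitted.
function chunksToRetry<T>(chunks: Chunk<T>[], maxAttempts: number): Chunk<T>[] {
  return chunks.filter((c) => c.status === "failed" && c.attempts < maxAttempts);
}
```

Because each chunk carries its own status and attempt count, a failure in one chunk never forces a retry of the whole catalog.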

Use the Shopify GraphQL BulkOperation API for Large Syncs

The BulkOperation API is Shopify's intended path for large data operations. You upload a JSONL file of mutation variables through a staged upload, submit a 'bulkOperationRunMutation' call that references the uploaded file, and Shopify processes it asynchronously. You then poll the operation or listen for a webhook when processing completes. For product creation and updates, 'bulkOperationRunMutation' wrapped around the 'productCreate' or 'productUpdate' mutation can process tens of thousands of records per operation without hitting the standard rate limits.
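As a rough sketch of what gets submitted, the snippet below builds the JSONL variables file and the arguments for 'bulkOperationRunMutation'. The `ProductRecord` shape and the helper names are hypothetical, and the inner mutation's variable name and input type (`$input: ProductInput!`) are assumptions that depend on your Admin API version, so check them against the schema you target. Uploading the file (via a staged upload) and sending the request are left out.

```typescript
// Hypothetical, trimmed-down product shape for illustration only.
interface ProductRecord {
  title: string;
  vendor?: string;
}

// Inner mutation run once per JSONL line. Assumption: the variable name and
// input type match the API version in use.
const PRODUCT_CREATE_MUTATION =
  "mutation call($input: ProductInput!) { productCreate(input: $input) " +
  "{ product { id } userErrors { field message } } }";

// Outer mutation that starts the bulk operation against an uploaded file.
const BULK_RUN_MUTATION = `mutation run($mutation: String!, $stagedUploadPath: String!) {
  bulkOperationRunMutation(mutation: $mutation, stagedUploadPath: $stagedUploadPath) {
    bulkOperation { id status }
    userErrors { field message }
  }
}`;

// One JSON object per line; each line holds the variables for one mutation call.
function toJsonl(products: ProductRecord[]): string {
  return products.map((p) => JSON.stringify({ input: p })).join("\n");
}

// Variables for BULK_RUN_MUTATION once the JSONL file has been uploaded.
function buildBulkRunVariables(stagedUploadPath: string) {
  return { mutation: PRODUCT_CREATE_MUTATION, stagedUploadPath };
}
```

Each JSONL line becomes one execution of the inner mutation, which is why per-record errors can be traced back to a specific line later.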

Shopify caps the size of each BulkOperation input file. This guide works from a practical limit of around 100 MB of JSONL data, but confirm the documented cap for your API version before you size batches. For a catalog of 150,000 SKUs with multiple images and metafields per product, you will likely need to split into batches of 20,000 to 40,000 products and queue them sequentially. Build a job queue (Bull or BullMQ in Node.js works well) to manage the batch order and track which batches have completed, failed, or need retry.
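One way to size batches is to cut on whichever limit is hit first: bytes or record count. The sketch below is a hypothetical helper, not part of any Shopify SDK, and both limits are parameters so you can plug in whatever cap applies to your API version.

```typescript
// Group JSONL lines into batches that respect both a byte cap and a record cap.
// Both limits are supplied by the caller; nothing here is a Shopify constant.
function batchJsonlLines(lines: string[], maxBytes: number, maxRecords: number): string[][] {
  const encoder = new TextEncoder();
  const batches: string[][] = [];
  let current: string[] = [];
  let currentBytes = 0;

  for (const line of lines) {
    // +1 accounts for the newline that separates JSONL records.
    const lineBytes = encoder.encode(line).length + 1;
    if (lineBytes > maxBytes) {
      throw new Error("a single record exceeds the byte limit");
    }
    const wouldOverflow =
      currentBytes + lineBytes > maxBytes || current.length >= maxRecords;
    if (current.length > 0 && wouldOverflow) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(line);
    currentBytes += lineBytes;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

Each returned batch can then become one queue job, so batch order and retry state live in the queue rather than in the file-building code.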

Design Your Sync Architecture for Ongoing Updates

A one-time migration is simpler than an ongoing sync. For ongoing syncs, you need to solve three problems: detecting what changed in the source system, sending only the changed records to Shopify, and confirming that Shopify accepted the changes. Full catalog re-syncs at 100,000+ products will take hours even with the BulkOperation API, so a delta sync that only processes changed records is critical for keeping latency under control.

Store a hash or a 'last modified' timestamp for each product in your sync database. When your source system (ERP, PIM, or supplier feed) delivers an updated feed, compare each record against the stored hash. Only queue records where the hash changed. For price and inventory changes specifically, use the 'inventorySetQuantities' and 'productVariantsBulkUpdate' mutations, which are faster than full product updates and have higher rate limits.
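A minimal sketch of the hash comparison follows. The `SourceProduct` fields and the in-memory `Map` standing in for the sync database are assumptions for illustration. Keys are sorted before hashing so that a feed that reorders fields does not register as a change.

```typescript
import { createHash } from "node:crypto";

// Hypothetical source record; real feeds will carry more fields.
interface SourceProduct {
  sku: string;
  title: string;
  price: string;
}

// Serialize with sorted keys so field order never affects the hash.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(stableStringify).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort();
    return "{" + keys.map((k) => JSON.stringify(k) + ":" + stableStringify(obj[k])).join(",") + "}";
  }
  return JSON.stringify(value);
}

function hashRecord(record: SourceProduct): string {
  return createHash("sha256").update(stableStringify(record)).digest("hex");
}

// Return only records that are new or whose hash differs from the stored one.
// storedHashes stands in for a lookup against the sync database.
function selectChanged(feed: SourceProduct[], storedHashes: Map<string, string>): SourceProduct[] {
  return feed.filter((p) => storedHashes.get(p.sku) !== hashRecord(p));
}
```

After Shopify confirms a record, write its new hash back to the store. Updating the hash before confirmation would hide a failed update from the next delta run.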

Handle Images, Metafields, and Variants at Scale

Images are the most expensive part of a large catalog sync. Shopify downloads and processes each image URL you provide, which adds time and can fail if your image CDN is slow. Pre-stage images to a fast CDN (Cloudflare R2 or AWS S3 with CloudFront) before submitting them to Shopify. Use stable image URLs so that unchanged images are not re-submitted on delta syncs. Shopify will skip re-downloading an image if the URL matches what is already stored.

Metafields are useful for storing supplier codes, custom attributes, and data that does not map to standard Shopify fields. Create metafield definitions in the admin first, then include them in your BulkOperation mutations. For products with many variants (more than 100 variants per product), you will need the 'productVariantsBulkCreate' mutation and careful handling of Shopify's 2,048-variant limit per product. A source product that exceeds that limit has to be split into multiple parent products.
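Splitting an oversized product into several parents can be sketched as below. The `Variant` and `ParentProduct` shapes and the "(Part n)" title suffix are hypothetical choices, and the limit is passed in rather than hard-coded so it can follow whatever cap applies to your store.

```typescript
// Hypothetical shapes for illustration; real variants carry prices, inventory, etc.
interface Variant {
  sku: string;
  optionValues: string[];
}

interface ParentProduct {
  title: string;
  variants: Variant[];
}

// Split a source product into as many parent products as needed so that
// none exceeds the per-product variant limit supplied by the caller.
function splitByVariantLimit(title: string, variants: Variant[], limit: number): ParentProduct[] {
  if (limit <= 0) throw new Error("limit must be positive");
  if (variants.length <= limit) return [{ title, variants }];

  const parents: ParentProduct[] = [];
  for (let i = 0; i < variants.length; i += limit) {
    const part = parents.length + 1;
    parents.push({
      title: `${title} (Part ${part})`,
      variants: variants.slice(i, i + limit),
    });
  }
  return parents;
}
```

In practice you would usually split along a meaningful option (for example, one parent per color) rather than at an arbitrary index, so that each parent still makes sense to shoppers.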

Monitor, Alert, and Reconcile Your Catalog Sync

A sync at this scale will have partial failures. Your monitoring should track: total records submitted, total records confirmed by Shopify, records in error state, and time since last successful sync. Alert when error rate exceeds 1% of the batch or when a sync takes more than twice its historical average. Shopify returns detailed error arrays in BulkOperation results that tell you exactly which records failed and why.
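The alert thresholds above translate into a small check that runs after each sync. The `SyncRunStats` shape and the alert names are hypothetical. The "unaccounted" check is an extra assumption on top of the text: it flags records that were submitted but came back neither confirmed nor errored.

```typescript
// Hypothetical per-run metrics recorded by the sync service.
interface SyncRunStats {
  submitted: number;
  confirmed: number;
  errored: number;
  durationMs: number;
}

// Returns the names of the alerts that should fire for this run.
function evaluateAlerts(run: SyncRunStats, historicalAvgDurationMs: number): string[] {
  const alerts: string[] = [];

  // Alert when more than 1% of submitted records ended in error.
  const errorRate = run.submitted === 0 ? 0 : run.errored / run.submitted;
  if (errorRate > 0.01) alerts.push("error_rate");

  // Alert when the run took more than twice its historical average.
  if (historicalAvgDurationMs > 0 && run.durationMs > 2 * historicalAvgDurationMs) {
    alerts.push("slow_sync");
  }

  // Records that are neither confirmed nor errored were silently dropped.
  if (run.submitted - run.confirmed - run.errored > 0) {
    alerts.push("unaccounted_records");
  }
  return alerts;
}
```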

Run a weekly reconciliation job that compares your source record count against the count of active Shopify products to catch records that were silently dropped. For price-sensitive catalogs, compare a random sample of prices between source and Shopify daily. The cost of a sync database and reconciliation job (a few hours of engineering time and $10 to $30 per month in infrastructure) is far cheaper than a pricing error on 10,000 products.
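A reconciliation job is more useful when it compares identifiers rather than only counts, because two equal counts can hide one dropped record and one stray one. The sketch below assumes you have already exported SKU lists and price maps from both systems. All function names are hypothetical, and prices are assumed to be normalized to the same string format on both sides.

```typescript
interface ReconcileResult {
  missingInShopify: string[];
  unexpectedInShopify: string[];
}

// Set difference in both directions between source and Shopify SKUs.
function reconcileSkus(sourceSkus: string[], shopifySkus: string[]): ReconcileResult {
  const source = new Set(sourceSkus);
  const shopify = new Set(shopifySkus);
  return {
    missingInShopify: sourceSkus.filter((s) => !shopify.has(s)),
    unexpectedInShopify: shopifySkus.filter((s) => !source.has(s)),
  };
}

// Uniform random sample without replacement (partial Fisher-Yates shuffle).
function sampleSkus(skus: string[], sampleSize: number): string[] {
  const pool = skus.slice();
  const n = Math.min(sampleSize, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(Math.random() * (pool.length - i));
    const tmp = pool[i];
    pool[i] = pool[j];
    pool[j] = tmp;
  }
  return pool.slice(0, n);
}

// SKUs from the sample whose price differs between the two systems.
// Assumes both maps hold prices in the same normalized string format.
function priceMismatches(
  sample: string[],
  sourcePrices: Map<string, string>,
  shopifyPrices: Map<string, string>
): string[] {
  return sample.filter((sku) => sourcePrices.get(sku) !== shopifyPrices.get(sku));
}
```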

Key takeaways

  • Standard CSV imports and no-code apps are not reliable above a few thousand products. Use the Shopify GraphQL BulkOperation API for large catalogs.
  • Split large syncs into batches of 20,000 to 40,000 products and use a job queue to manage batch state and retries.
  • Implement delta syncs using hashes or timestamps so only changed records are submitted, keeping sync latency manageable.
  • Run a weekly reconciliation job comparing source and Shopify counts to catch drift before it causes customer-facing problems.

Frequently asked questions

Shopify does not publish a hard product limit, but performance degrades for stores with very large catalogs in the admin interface. Merchants with 500,000 or more SKUs often use headless storefront architectures where the storefront queries Shopify via API rather than rendering admin-side pages.

Using the BulkOperation API with properly batched JSONL files, a full initial sync of 100,000 products with images and metafields typically takes four to twelve hours depending on image count and file sizes. Delta syncs of changed records can run in minutes if the change set is small.

Some apps like Trunk, Stocky, or specialized feed management platforms can handle large syncs, but most work best under 50,000 SKUs and charge per SKU or per sync. For catalogs above 100,000 SKUs with custom data requirements, a custom sync service is usually more reliable and less expensive long-term.

Store the full previous feed snapshot in your sync database, diff it against each new feed on arrival, and only submit changed records to Shopify. A diff of 150,000 products completes in seconds on a small server. Most feed updates change fewer than 1% of records, so hourly syncs typically touch only a few hundred to a few thousand products.

Shopify returns a 'userErrors' array in the operation result identifying which records failed and the reason. Your sync service should store which records were in the failed batch, filter out the ones Shopify confirmed as successful, and requeue only the failed records for retry. Do not resubmit the entire batch, as that will create duplicate products for records that already succeeded.
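The requeue step can be sketched as follows. It assumes each line of the result file is a JSON object with a `data` payload containing `userErrors` and a `__lineNumber` field pointing back to the zero-based line of the input file. Verify that shape against a real result file for your API version. The function names are hypothetical.

```typescript
interface UserError {
  field?: string[] | null;
  message: string;
}

interface MutationPayload {
  userErrors?: UserError[];
}

// Assumed shape of one line in the bulk operation result file.
interface BulkResultLine {
  data?: Record<string, MutationPayload | null> | null;
  __lineNumber?: number;
}

// Map each failed input line number to the error messages reported for it.
function failedLines(resultJsonl: string): Map<number, string[]> {
  const failures = new Map<number, string[]>();
  for (const raw of resultJsonl.split("\n")) {
    if (raw.trim() === "") continue;
    const line = JSON.parse(raw) as BulkResultLine;
    if (typeof line.__lineNumber !== "number") {
      throw new Error("result line has no __lineNumber");
    }
    const payloads = Object.values(line.data ?? {});
    const messages: string[] = [];
    if (payloads.length === 0) messages.push("no data returned");
    for (const payload of payloads) {
      if (payload === null) {
        messages.push("mutation returned null");
        continue;
      }
      for (const err of payload.userErrors ?? []) messages.push(err.message);
    }
    if (messages.length > 0) failures.set(line.__lineNumber, messages);
  }
  return failures;
}

// Pick only the failed input lines so succeeded records are not resubmitted.
function linesToRequeue(inputLines: string[], failures: Map<number, string[]>): string[] {
  return Array.from(failures.keys())
    .sort((a, b) => a - b)
    .map((n) => inputLines[n])
    .filter((l): l is string => l !== undefined);
}
```

Storing the error messages alongside the requeued records helps separate transient failures, which are worth retrying, from validation errors that will fail again until the source data is fixed.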

Former CTO and Co-founder, Seven Hills

I started Seven Hills to do the work I am proudest of, directly with the people who depend on it. As a CTO and co-founder I led an 18-engineer team and personally shipped the products behind these case studies, from a Fortune 100 shipping system to a SaaS product we built and sold. You work with that experience, not a sales layer on top of it.
