12,000 Shipping Labels Processed Monthly: AI-Powered Shipping Label Digitization for a Leading US Paper and Packaging Distributor

About the Client

A leading independent paper and packaging distributor in the United States, with a national network of warehouses serving printers, publishers, and commercial enterprises across North America. The company manages high volumes of inbound and outbound shipments daily, operating as a critical intermediary between carriers, suppliers, and end customers across multiple distribution facilities.

Warehouse operations at the client span a distributed network of sites, including locations across Pennsylvania, New Jersey, New York, and the broader Mid-Atlantic region, each processing parcels from major carriers such as UPS, FedEx, USPS, and DHL.

Impact Delivered

12,000

shipments processed monthly

25–30%

workforce redeployment to higher-value operational tasks

90–98%

AI accuracy across carrier label formats

Standing at a Turning Point

The Manual Bottleneck

Every parcel arriving at the client’s warehouses required a warehouse operator to physically inspect the shipping label, manually read and transcribe key data fields – carrier name, tracking number, purchase order reference, and consignee name – and enter this information into an Excel file. With 12,000 shipments processed each month at a single warehouse at the start of this initiative, the cumulative data entry burden was significant.

The workforce tasked with this function was predominantly frontline labor, meaning human error rates were structurally high. Transcription mistakes, inconsistent data formats, duplicate entries, and missed fields were recurring challenges. The Excel files produced through this process fed downstream systems including ERP platforms and partner-facing documentation – meaning errors at the capture stage cascaded into broader operational and financial inaccuracies. Beyond accuracy, the sheer volume of manual effort consumed meaningful workforce capacity that could be redeployed to higher-value operational tasks.

The Risk of Standing Still

The client was not facing a technology problem in isolation. It was actively planning to expand to additional warehouse locations, which meant the manual process, already strained at one site, would multiply in complexity and cost as new warehouses came online. Each additional warehouse would add more operators, more data entry hours, and more opportunities for compounding errors.

Without architectural intervention, the path forward was linear scaling of manual labor: more people doing the same error-prone work at greater volume. The leadership team recognized that digitization was not optional but a prerequisite for sustainable multi-site growth. The business needed a solution capable of replacing manual data capture with an automated, accurate, and auditable system that could scale horizontally across its warehouse network without proportional increases in headcount.

Solutioning

Reframing the Problem: From Manual Entry to Intelligent Capture

When Zimetrics engaged with the client team, the initial framing of the problem was straightforward: automate the data entry step. Zimetrics reframed it as an intelligent document processing challenge – not simply a form-filling problem, but an opportunity to build a vision-capable AI system that could understand and extract structured data from visually complex, unstandardized shipping labels at scale.

Shipping labels from different carriers vary significantly in layout, typography, barcode placement, and field structure. A rules-based OCR approach would require carrier-specific templates and ongoing maintenance as label formats changed. Zimetrics instead proposed building an AI extraction engine powered by a Visual Large Language Model (VLM) – a multimodal AI capable of reading and interpreting images the way a human would, without requiring predefined templates. The chosen architecture was designed around AWS Bedrock, hosting a visual language model selected after evaluating multiple candidates. The team tested OpenAI’s GPT-4o mini alongside Bedrock-hosted VLM options, and evaluated multiple OCR and AI extraction approaches before selecting the final architecture.

Why AWS Bedrock

The decision to build on AWS Bedrock was driven by three factors: extraction accuracy, integration simplicity, and enterprise readiness. Bedrock provided managed model hosting, removing infrastructure complexity from the AI layer entirely. The Visual LLM selected – accessed via Bedrock – demonstrated extraction accuracy consistently in the 90–98% range on the carrier label formats most prevalent in the client’s warehouse environment.

Zimetrics structured the extraction layer with deliberate prompt engineering: each image was submitted to the model with a structured query defining the exact fields to extract – carrier name, tracking number, purchase order number, and consignee name – alongside output formatting instructions to ensure downstream parsing reliability. Business validation rules were embedded into the pipeline to flag exceptions before they reached the reviewer’s dashboard, ensuring that human reviewers’ effort was focused on genuine edge cases rather than routine processing. The full application stack was deployed on AWS, with the frontend Angular application hosted on S3 and delivered via CloudFront, backend logic running on AWS Lambda, authentication managed by Cognito, and data persisted in Amazon Aurora – creating a fully serverless, cloud-native architecture with no fixed infrastructure cost.
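
As a rough illustration of the structured query and output-format instructions described above, the Python sketch below builds a prompt naming the four fields and parses the model's reply into a fixed set of keys. The JSON key names, the prompt wording, and both function names are assumptions made for illustration; the source does not publish the actual prompt or schema.

```python
import json

# Hypothetical JSON keys: the source names the four fields but not their keys.
FIELDS = ("carrier", "tracking_number", "po_number", "consignee_name")


def build_prompt() -> str:
    """Return an instruction asking for exactly the four label fields as JSON."""
    keys = ", ".join(f'"{name}"' for name in FIELDS)
    return (
        "Extract the following fields from the shipping label image and "
        f"respond with a single JSON object containing only the keys {keys}. "
        "Use null for any field that is not visible on the label."
    )


def parse_model_output(text: str) -> dict:
    """Parse the reply, keep only the expected keys, default missing ones to None."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(text[start : end + 1])
    return {name: data.get(name) for name in FIELDS}
```

Restricting the parsed result to a known key set means a chatty or slightly off-format reply still yields a predictable record for the validation step, with absent fields surfacing as None instead of a missing key.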

Engineering the Transformation

The AI extraction engine was engineered as a scheduled batch processing system. Warehouse operators submit label images through the mobile Client App throughout their working shifts; the backend queues submissions and runs batch inference jobs at configured intervals using Amazon SageMaker as the processing orchestration layer. This batch architecture was deliberately chosen over real-time processing: it smoothed compute costs, improved throughput predictability, and matched the operational reality that downstream data consumption happened in cycles rather than requiring instantaneous output.
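
The queue-then-batch pattern can be sketched in a few lines of Python. This is a simplified in-memory stand-in, not the SageMaker orchestration itself; the function name, the batch size, and the use of image keys as queue items are assumptions.

```python
from collections import deque
from typing import Deque, Iterator, List


def drain_in_batches(queue: Deque[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive batches of queued image keys until the queue is empty."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    while queue:
        # Take up to batch_size items, oldest first.
        count = min(batch_size, len(queue))
        yield [queue.popleft() for _ in range(count)]
```

A scheduled job would call this once per interval, so five queued images with a batch size of two would be processed as three inference batches.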

Each batch inference job passes the queued images to the VLM on Bedrock with a carefully engineered prompt. The model returns structured JSON containing the extracted fields – carrier, tracking number, PO number, consignee name – which are validated against pre-defined business rules before being written to the Aurora database. The rule engine checks for PO number format compliance (the 2D-XXXXXXXX pattern required by the client’s systems), tracking number presence, and carrier name validation for freight skids. Cases that pass all rules are automatically marked as completed; exceptions are routed to the reviewer dashboard for human review and correction.
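
The rule checks described above might look something like the following Python sketch. Several details are assumptions: that each X in the 2D-XXXXXXXX pattern is a digit, that the carrier allow-list is the four carriers named earlier, that the carrier check applies to every case and not only to freight skids, and the rule and key names themselves.

```python
import re
from typing import Dict, List, Optional

# Assumption: each X in 2D-XXXXXXXX is a digit; the source does not say.
PO_PATTERN = re.compile(r"2D-\d{8}")
# Carriers named in the case study; the real allow-list may differ.
KNOWN_CARRIERS = {"UPS", "FEDEX", "USPS", "DHL"}


def validate_case(fields: Dict[str, Optional[str]]) -> List[str]:
    """Return the names of failed rules; an empty list means auto-complete."""
    failures: List[str] = []
    po_number = (fields.get("po_number") or "").strip()
    if not PO_PATTERN.fullmatch(po_number):
        failures.append("po_format")
    if not (fields.get("tracking_number") or "").strip():
        failures.append("tracking_missing")
    carrier = (fields.get("carrier") or "").strip().upper()
    if carrier not in KNOWN_CARRIERS:
        failures.append("carrier_unknown")
    return failures
```

Returning the list of failed rule names, and not just a pass/fail flag, is what would let a reviewer dashboard filter cases by validation rule outcome.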

Fault tolerance was built into every layer of the pipeline. A configurable retry mechanism handles transient failures – connectivity issues, model timeouts, or incomplete images – ensuring no submission is silently lost. All extraction events, rule outcomes, and system errors are logged to CloudWatch and CloudTrail, creating a complete audit trail for compliance and root-cause analysis.
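
A configurable retry wrapper of the kind described here could be sketched as follows in Python. The exponential backoff schedule, the default attempt count, and the choice of exception types are assumptions; the source states only that the retry mechanism is configurable.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[Exception], ...] = (TimeoutError, ConnectionError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying transient errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                # Re-raise so the failure is logged rather than silently lost.
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing the sleep function as a parameter keeps the wrapper easy to test without real delays.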

Application Architecture

Zimetrics designed and delivered two purpose-built applications alongside the AI engine.

Client Web App (Mobile Website)

The Client App is a mobile-first single-page Angular application, designed specifically for warehouse operators working on the floor with smartphones. After authenticating via AWS Cognito, operators are automatically assigned to their mapped warehouse and presented with a clean image capture interface. Built-in blur detection and quality validation alert operators to retake images that fall below clarity thresholds – a critical feature given variable warehouse lighting conditions. A barcode scanning capability allows tracking number capture even when label text is unclear or printed at small size. The location field pre-fills with the operator’s last selection, minimizing repetitive input, and the form resets automatically after each successful submission, enabling rapid high-volume capture across a shift.
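
The source does not say how blur is detected. One widely used heuristic is the variance of the Laplacian: sharp images have strong edges and therefore a high variance, while blurry ones score low. The Python sketch below illustrates that heuristic on a grayscale image given as nested lists; the function names and the threshold of 100.0 are illustrative assumptions, and a production client would more likely run an equivalent check in the browser.

```python
from typing import List, Sequence


def laplacian_variance(gray: Sequence[Sequence[float]]) -> float:
    """Variance of the 4-neighbour Laplacian over the image's interior pixels."""
    if len(gray) < 3 or len(gray[0]) < 3:
        raise ValueError("image must be at least 3x3")
    height, width = len(gray), len(gray[0])
    responses: List[float] = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_too_blurry(gray: Sequence[Sequence[float]], threshold: float = 100.0) -> bool:
    """Flag an image for retake when its edge response falls below the threshold."""
    return laplacian_variance(gray) < threshold
```

In practice the threshold would need tuning against real label photos taken under the warehouse's lighting conditions.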

Reviewer App (Web Dashboard)

The Reviewer App is a web-based dashboard providing centralized visibility across all warehouse submissions. Reviewers can filter cases by date range, processing status, validation rule outcome, and location, enabling efficient triage of exception cases. Each case presents the original label image alongside AI-extracted fields, allowing reviewers to visually verify and correct any extraction errors. Once satisfied, reviewers submit the case for completion. Validated data is exportable to Excel with configurable date range and filter parameters – producing the downstream SLR (Shipping Label Report) file that feeds the client’s ERP and logistics coordination workflows. A monthly summary view gives supervisors a real-time snapshot of processing volumes and exception rates across the warehouse network.

User management across the entire system is centrally administered by the Zimetrics team, ensuring access controls remain tightly governed as the warehouse network expands. Role-based access separates warehouse operator permissions (capture and submission only) from reviewer permissions (validation, correction, and export), maintaining data integrity across the workflow. Image data is retained for 60 days by default with the option to extend to 90 days, after which data is automatically purged in line with the client’s data governance requirements. The infrastructure is deployed via CloudFormation IaC templates, ensuring reproducible, auditable deployments as new warehouse locations are onboarded.
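
The retention rule can be expressed as a simple age check. The Python sketch below assumes an image becomes due for purging once its age reaches the retention window; the function and constant names are hypothetical, and the source does not describe how the purge is actually scheduled.

```python
from datetime import datetime, timedelta

DEFAULT_RETENTION_DAYS = 60   # default retention stated in the case study
EXTENDED_RETENTION_DAYS = 90  # optional extended retention


def is_due_for_purge(captured_at: datetime, now: datetime, extended: bool = False) -> bool:
    """Return True once the image's age reaches the applicable retention window."""
    days = EXTENDED_RETENTION_DAYS if extended else DEFAULT_RETENTION_DAYS
    return now - captured_at >= timedelta(days=days)
```

Both timestamps should be either timezone-aware or naive; mixing the two raises a TypeError in Python.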

Future Outlook

With the first warehouse deployment live and processing 12,000 shipments monthly, the client is actively expanding the platform across its broader distribution network. Onboarding each new warehouse is a configuration exercise, because the architecture was purpose-built to scale horizontally without modification. In the future, the client may choose to introduce enhanced observability and predictive monitoring capabilities across the processing pipeline, giving operations leaders real-time visibility into system performance and exception rates across all locations.

The shipping label use case also serves as the first deployment of a broader Intelligent Document Processing platform that can be extended to other logistics and supply chain documents. Adjacent document types, including billing extracts and outbound shipping documentation, can be explored next, which would further expand the scope of automated data capture across the client’s logistics workflows.

Zimetrics Team Perspective

“The most powerful outcome of this project isn’t just the automation – it’s the data foundation it creates. Every shipment that passes through these warehouses now enters the system accurately, consistently, and auditably. That foundation is what enables intelligent operations at scale, and it positions the client to extend AI-driven capabilities across their supply chain as they grow.”