Every parcel arriving at the client’s warehouses required a warehouse operator to physically inspect the shipping label, manually read and transcribe key data fields – carrier name, tracking number, purchase order reference, and consignee name – and enter this information into an Excel file. With 12,000 shipments processed each month across a single warehouse at the start of this initiative, the cumulative data entry burden was significant.
The workforce tasked with this function was predominantly frontline labor, meaning human error rates were structurally high. Transcription mistakes, inconsistent data formats, duplicate entries, and missed fields were recurring challenges. The Excel files produced through this process fed downstream systems including ERP platforms and partner-facing documentation – meaning errors at the capture stage cascaded into broader operational and financial inaccuracies. Beyond accuracy, the sheer volume of manual effort consumed meaningful workforce capacity that could be redeployed to higher-value operational tasks.
The client was not facing a technology problem in isolation. It was actively planning to expand to additional warehouse locations, which meant the manual process, already strained at one site, would multiply in complexity and cost as new warehouses came online. Each additional warehouse would add more operators, more data entry hours, and more opportunities for compounding errors.
Without architectural intervention, the path forward was a linear scaling of manual labor: more people doing the same error-prone work at greater volume. The leadership team recognized that digitization was not optional but a prerequisite for sustainable multi-site growth. The business needed a solution capable of replacing manual data capture with an automated, accurate, and auditable system that could scale horizontally across its warehouse network without proportional increases in headcount.
When Zimetrics engaged with the client team, the initial framing of the problem was straightforward: automate the data entry step. Zimetrics reframed it as an intelligent document processing challenge – not simply a form-filling problem, but an opportunity to build a vision-capable AI system that could understand and extract structured data from visually complex, unstandardized shipping labels at scale.
Shipping labels from different carriers vary significantly in layout, typography, barcode placement, and field structure. A rules-based OCR approach would require carrier-specific templates and ongoing maintenance as label formats changed. Zimetrics instead proposed building an AI extraction engine powered by a Visual Large Language Model (VLM) – a multimodal AI capable of reading and interpreting images the way a human would, without requiring predefined templates. The chosen architecture was designed around AWS Bedrock, hosting a visual language model selected after evaluating multiple candidates. The team evaluated several OCR and AI extraction approaches, testing OpenAI’s GPT-4o mini alongside Bedrock-hosted VLM options, before selecting the final architecture.
The decision to build on AWS Bedrock was driven by three factors: extraction accuracy, integration simplicity, and enterprise readiness. Bedrock provided managed model hosting, removing infrastructure complexity from the AI layer entirely. The Visual LLM selected – accessed via Bedrock – demonstrated extraction accuracy consistently in the 90–98% range on the carrier label formats most prevalent in the client’s warehouse environment.
Zimetrics structured the extraction layer with deliberate prompt engineering: each image was submitted to the model with a structured query defining the exact fields to extract – carrier name, tracking number, purchase order number, and consignee name – alongside output formatting instructions to ensure downstream parsing reliability. Business validation rules were embedded into the pipeline to flag exceptions before they reached the reviewer’s dashboard, ensuring that human reviewers’ effort was focused on genuine edge cases rather than routine processing. The full application stack was deployed on AWS, with the frontend Angular application hosted on S3 and delivered via CloudFront, backend logic running on AWS Lambda, authentication managed by Cognito, and data persisted in Amazon Aurora – creating a fully serverless, cloud-native architecture with no fixed infrastructure cost.
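As an illustration of the structured query and output formatting described above, the following is a minimal sketch and not the project’s actual prompt. The JSON key names, the prompt wording, and the helper names build_prompt and parse_response are all assumptions; the call to the model itself is omitted.

```python
import json

# Assumed key names: the source lists the four fields but not their JSON keys.
FIELDS = ["carrier_name", "tracking_number", "po_number", "consignee_name"]


def build_prompt(fields=FIELDS):
    """Build a structured extraction query with explicit output formatting rules."""
    field_list = "\n".join(f"- {name}" for name in fields)
    return (
        "Extract the following fields from the attached shipping label image:\n"
        f"{field_list}\n"
        "Respond with a single JSON object using exactly these keys. "
        "Use null for any field that is not visible. Do not add commentary."
    )


def parse_response(text, fields=FIELDS):
    """Parse the model's reply into a dict holding exactly the expected keys.

    Tolerates stray text around the JSON object and raises ValueError
    when no parseable object is present.
    """
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in model response")
    data = json.loads(text[start:end + 1])
    # Missing keys become None so that later validation can flag them.
    return {name: data.get(name) for name in fields}
```

Constraining the reply to a fixed set of keys is what makes downstream parsing predictable: anything the model omits surfaces as an explicit null rather than a missing column.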
The AI extraction engine was engineered as a scheduled batch processing system. Warehouse operators submit label images through the mobile Client App throughout their working shifts; the backend queues submissions and runs batch inference jobs at configured intervals using Amazon SageMaker as the processing orchestration layer. This batch architecture was deliberately chosen over real-time processing: it smoothed compute costs, improved throughput predictability, and matched the operational reality that downstream data consumption happened in cycles rather than requiring instantaneous output.
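The queue-then-batch pattern can be sketched in a few lines. This is only an illustration of the batching logic, assuming an in-memory queue and a hypothetical drain_batches helper; it does not represent the SageMaker orchestration or the scheduling used in the actual system.

```python
from collections import deque


def drain_batches(queue, batch_size):
    """Empty the queue, yielding its items in order as lists of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    while queue:
        count = min(batch_size, len(queue))
        yield [queue.popleft() for _ in range(count)]


# Example: five queued image keys (made-up names) processed in batches of two.
pending = deque(["img-1", "img-2", "img-3", "img-4", "img-5"])
batches = list(drain_batches(pending, batch_size=2))
```

A scheduled job would call something like this at each configured interval, so the cost of inference is incurred in predictable bursts rather than per upload.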
Each batch inference job passes the queued images to the VLM on Bedrock with a carefully engineered prompt. The model returns structured JSON containing the extracted fields – carrier, tracking number, PO number, consignee name – which are validated against pre-defined business rules before being written to the Aurora database. The rule engine checks for PO number format compliance (the 2D-XXXXXXXX pattern required by the client’s systems), tracking number presence, and carrier name validation for freight skids. Cases that pass all rules are automatically marked as completed; exceptions are routed to the reviewer dashboard for human review and correction.
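The three checks named above could look roughly like the sketch below. Several details are assumptions made for illustration: that each X in 2D-XXXXXXXX is an uppercase letter or digit, that carrier validation means membership in an approved list and applies only to freight skids, the contents of that list, and the field and status names.

```python
import re

# Assumption: each X is an uppercase letter or digit; the source gives only the shape.
PO_PATTERN = re.compile(r"2D-[A-Z0-9]{8}")

# Hypothetical approved-carrier list; the client's real list is not in the source.
KNOWN_CARRIERS = {"UPS", "FEDEX", "DHL"}


def validate(record, is_freight_skid=False):
    """Return the names of the rules a record fails (empty list means it passes)."""
    failures = []
    if not PO_PATTERN.fullmatch(record.get("po_number") or ""):
        failures.append("po_format")
    if not (record.get("tracking_number") or "").strip():
        failures.append("tracking_missing")
    if is_freight_skid:
        carrier = (record.get("carrier_name") or "").strip().upper()
        if carrier not in KNOWN_CARRIERS:
            failures.append("carrier_invalid")
    return failures


def route(record, is_freight_skid=False):
    """Mark clean records as completed and send the rest to human review."""
    return "needs_review" if validate(record, is_freight_skid) else "completed"
```

Returning the list of failed rules, rather than a single boolean, lets the reviewer dashboard filter cases by validation rule outcome.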
Fault tolerance was built into every layer of the pipeline. A configurable retry mechanism handles transient failures – connectivity issues, model timeouts, or incomplete images – ensuring no submission is silently lost. All extraction events, rule outcomes, and system errors are logged to CloudWatch and CloudTrail, creating a complete audit trail for compliance and root-cause analysis.
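A configurable retry wrapper of the kind described can be sketched as follows. The source does not specify the retry policy, so the exponential backoff, the default values, and the TransientError and with_retry names are assumptions chosen for illustration.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, dropped connections)."""


def with_retry(fn, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn, retrying on TransientError with exponential backoff.

    The final failure is re-raised so the caller can log it and flag the
    submission instead of losing it silently.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, then 4x, and so on between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing the sleep function as a parameter keeps the wrapper easy to test without real delays.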
Zimetrics designed and delivered two purpose-built applications alongside the AI engine.
The Client App is a mobile-first single-page Angular application, designed specifically for warehouse operators working on the floor with smartphones. After authenticating via AWS Cognito, operators are automatically assigned to their mapped warehouse and presented with a clean image capture interface. Built-in blur detection and quality validation alert operators to retake images that fall below clarity thresholds – a critical feature given variable warehouse lighting conditions. A barcode scanning capability allows tracking number capture even when label text is unclear or printed at small size. The location field pre-fills with the operator’s last selection, minimizing repetitive input, and the form resets automatically after each successful submission, enabling rapid high-volume capture across a shift.
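The source does not say how blur is detected. One widely used heuristic is the variance of the Laplacian: sharp images have strong edges and therefore a high variance, while blurred images score low. The sketch below shows that heuristic in plain Python on a grayscale pixel grid; the threshold value is an arbitrary placeholder, and the actual Client App, being an Angular application, would implement any such check in the browser.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior of a grayscale image.

    gray is a list of equal-length rows of pixel intensities, at least 3x3.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3")
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_blurry(gray, threshold=100.0):
    """Flag an image for retake when its edge response falls below the threshold.

    The default threshold is a placeholder and would need tuning on real labels.
    """
    return laplacian_variance(gray) < threshold
```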
The Reviewer App is a web-based dashboard providing centralized visibility across all warehouse submissions. Reviewers can filter cases by date range, processing status, validation rule outcome, and location, enabling efficient triage of exception cases. Each case presents the original label image alongside AI-extracted fields, allowing reviewers to visually verify and correct any extraction errors. Once satisfied, reviewers submit the case for completion. Validated data is exportable to Excel with configurable date range and filter parameters – producing the downstream SLR (Shipping Label Report) file that feeds the client’s ERP and logistics coordination workflows. A monthly summary view gives supervisors a real-time snapshot of processing volumes and exception rates across the warehouse network.
User management across the entire system is centrally administered by the Zimetrics team, ensuring access controls remain tightly governed as the warehouse network expands. Role-based access separates warehouse operator permissions (capture and submission only) from reviewer permissions (validation, correction, and export), maintaining data integrity across the workflow. Image data is retained for 60 days by default with the option to extend to 90 days, after which data is automatically purged in line with the client’s data governance requirements. The infrastructure is deployed via CloudFormation IaC templates, ensuring reproducible, auditable deployments as new warehouse locations are onboarded.
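The retention rule can be expressed as a simple age check, sketched below. The function and constant names are hypothetical, and treating any value between 60 and 90 days as a valid setting is an assumption; the source states only the 60-day default and the 90-day extension.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION_DAYS = 60
MAX_RETENTION_DAYS = 90


def is_expired(captured_at, now, retention_days=DEFAULT_RETENTION_DAYS):
    """Return True when an image is older than its retention window."""
    if not DEFAULT_RETENTION_DAYS <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 60 and 90")
    return now - captured_at > timedelta(days=retention_days)


# Example: an image captured 61 days before the purge job runs.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
captured = now - timedelta(days=61)
expired_default = is_expired(captured, now)
expired_extended = is_expired(captured, now, retention_days=90)
```

A scheduled purge job would apply this check to each stored image and delete those for which it returns True.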