OCR-Powered Mobile App for Expense Tracking
Managing receipts and expenses efficiently is often a challenge, particularly when it comes to digitizing physical receipts. In this project, I developed a mobile application that scans paper receipts, processes them using Optical Character Recognition (OCR), and extracts relevant data points such as the total amount, store name, and date of purchase. The result is a streamlined workflow for tracking expenses with minimal manual input.
Technology Stack
- Frontend Framework: Apache Cordova
- Initial OCR Approach: Tesseract.js (Client-side)
- Final OCR Solution: PaddleOCR (Server-side)
- Backend Languages: PHP (API endpoints) and Python (OCR script)
- Data Format: JSON for communication between the frontend and backend
Frontend: Cross-Platform Mobile App with Cordova
The mobile app was built using Apache Cordova, allowing for cross-platform deployment using a single codebase. The app enables users to scan physical receipts using the device’s camera. The scanned image is then sent to the backend for processing.
Initially, the plan was to handle OCR directly on the device using Tesseract.js. After testing, it became clear that this approach suffered from low recognition accuracy and unacceptable performance, particularly when processing multiple receipts in succession. This led to a shift towards server-side processing.
Backend: Server-Side OCR with PaddleOCR
After exploring several OCR libraries (including Tesseract and GNU Ocrad), I integrated PaddleOCR, an OCR system developed by the PaddlePaddle team. It provides highly accurate text detection, angle classification, and recognition with minimal configuration. I exposed a set of API endpoints via PHP, which let the app submit scanned images to the server, perform CRUD operations on expense entries, and invoke the Python OCR script.
Here’s a simplified example of how PaddleOCR was used:
from paddleocr import PaddleOCR

# Initialize once; angle classification handles rotated text regions
ocr = PaddleOCR(use_angle_cls=True, lang="en")

# Run detection, angle classification, and recognition on the preprocessed image
result = ocr.ocr(processed_image_path, cls=True)

The result is a structured list containing:
- The recognized text
- Bounding box coordinates
- Confidence scores
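In the classic PaddleOCR API, the result comes back as one list per page, with each line represented as a `[box, (text, score)]` pair. The sketch below (the dictionary field names are my own) flattens that nested structure into plain records:

```python
def flatten_result(result):
    """Turn PaddleOCR output into a list of {text, box, confidence} dicts."""
    lines = []
    for page in result:                     # one entry per page/image
        for box, (text, score) in page:     # one entry per detected text line
            lines.append({"text": text, "box": box, "confidence": score})
    return lines

# Example with a faked result in the same shape as PaddleOCR's output:
fake = [[
    [[[10, 10], [120, 10], [120, 30], [10, 30]], ("TOTAL  12.99", 0.97)],
]]
records = flatten_result(fake)
print(records[0]["text"])
```

Flattening early keeps the downstream parsing code independent of PaddleOCR's nesting.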
This structured output made it straightforward to extract key information. I parsed the output to identify data points such as:
- Total amount (by detecting terms like “total,” “sum,” etc.)
- Purchase date
- Store name
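The extraction itself boils down to simple heuristics over the recognized lines. A minimal sketch follows; the keyword list, the date regex, and the first-line-is-store-name rule are all illustrative assumptions rather than the app's exact logic:

```python
import re

# Illustrative heuristics, not the app's exact rules
TOTAL_KEYWORDS = ("total", "sum", "amount due")
DATE_RE = re.compile(r"\b(\d{1,2}[./-]\d{1,2}[./-]\d{2,4})\b")

def extract_fields(lines):
    """lines: recognized text strings in reading order."""
    fields = {"total": None, "date": None, "store": None}
    if lines:
        fields["store"] = lines[0]  # heuristic: store name is printed first
    for line in lines:
        lower = line.lower()
        if fields["total"] is None and any(k in lower for k in TOTAL_KEYWORDS):
            m = re.search(r"(\d+[.,]\d{2})", line)  # first money-like number
            if m:
                fields["total"] = m.group(1)
        if fields["date"] is None:
            m = DATE_RE.search(line)
            if m:
                fields["date"] = m.group(1)
    return fields

print(extract_fields(["ACME Market", "Date: 12.03.2024", "TOTAL 42.50"]))
```

Because the OCR output includes confidence scores, low-confidence matches can additionally be flagged for manual review.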
Data Flow and User Interaction
Once the server finishes processing, the extracted data is returned to the mobile app as a JSON object. The app then:
- Overlays the bounding boxes on the receipt image
- Prefills a form with the extracted information
- Allows the user to verify and edit the data before saving
This hybrid approach balances automation with manual verification, enhancing both usability and data accuracy.
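For illustration, the server's JSON payload could be assembled roughly as follows; the field names here are assumptions, not the app's actual schema:

```python
import json

# Hypothetical response shape: extracted fields plus the bounding
# boxes the app overlays on the receipt image
response = {
    "store": "ACME Market",
    "date": "2024-03-12",
    "total": "42.50",
    "boxes": [
        {"label": "total", "points": [[10, 10], [120, 10], [120, 30], [10, 30]]},
    ],
}

payload = json.dumps(response)   # sent to the app over the PHP endpoint
parsed = json.loads(payload)     # the app decodes it to prefill the form
print(parsed["total"])
```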

Expense Tracking and Reporting
Each expense item in the app has an associated status field, which indicates its current state in the workflow:
- uploaded_unprocessed
- auto_processed
- manually_checked
- approved
This status tracking enables flexible querying and reporting. For instance, users can filter by items needing review or generate monthly reports of approved expenses.
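A sketch of such queries, using an in-memory list in place of the app's real data store:

```python
STATUSES = ("uploaded_unprocessed", "auto_processed",
            "manually_checked", "approved")

# Illustrative sample data
expenses = [
    {"store": "ACME Market", "total": 42.50, "month": "2024-03", "status": "approved"},
    {"store": "Corner Cafe", "total": 7.80,  "month": "2024-03", "status": "auto_processed"},
]

def needs_review(items):
    """Items that were auto-processed but not yet verified by the user."""
    return [e for e in items if e["status"] == "auto_processed"]

def monthly_total(items, month):
    """Sum of approved expenses for a given month."""
    return sum(e["total"] for e in items
               if e["month"] == month and e["status"] == "approved")

print(len(needs_review(expenses)))         # 1
print(monthly_total(expenses, "2024-03"))  # 42.5
```

In the app these filters map onto simple WHERE clauses against the status and date columns.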

Conclusion
This project demonstrates a full-stack solution that combines mobile development, image processing, OCR, and backend integration. By leveraging PaddleOCR’s capabilities and streamlining the data extraction process, the app significantly reduces the overhead of manual expense tracking while remaining transparent and user-controllable.