
CapShot
Bringing Digital Assistance to See the Real World
Case Study Overview
In today's fast-paced world where multitasking is the norm, users constantly encounter unfamiliar objects in their daily lives, whether in the workplace, at home, or on the road. The fundamental design challenge that CapShot addresses is: how can a computer vision-based AR assistant help users notice and react to objects in the physical world while offering a convenient, streamlined experience? CapShot is an AI-powered AR app that takes advantage of the smartphone camera to scan real-world objects and provide users with immediate, contextual, and actionable information.
Problem Statement
Most individuals, whether professionals or casual users of technology, lack an interactive, fast, and intuitive means of detecting and acting on nearby objects. Current AR and AI-based solutions rarely provide usable feedback, accessibility options, or personalized settings, so people are left without the context they need.

Users don't just want to recognize more of what they perceive; they want to understand it quickly, interact with it, and trust the AI that makes it happen.
A lack of knowledgeable AI answers, combined with confusing feedback, makes object-recognition AR apps unintelligible or inaccessible to non-expert users.
Experts (like vets, engineers, or teachers) need a system that can assist real-time decision-making using visual information, but current AR systems are too inflexible or too rudimentary.
Goals
Deliver Instant Recognition
Offer fast, accurate object detection through AI and AR, with real-time feedback for both hobbyist and professional use.
Enhance User Accessibility
Design for inclusiveness, offering voice output, scalable text, and simple navigation to support users of varying abilities and preferences.
Encourage User Interaction & Understanding
Create a responsive interface that not only shows what the AI is looking at but also lets users interact with it by asking questions, getting explanations, or filtering results.
Research Analysis
To design a useful product, user-focused research was conducted by interviewing four individuals with varying backgrounds:
- Sandra, a veterinary technologist, envisioned the application helping identify animal symptoms.
- Keven, an accountant, wanted simplicity and suggested a guided walkthrough.
- Nathan, a software engineer, wanted more interaction, such as selecting which screen areas the AI should address.
- Daniel, a creative professional, enjoyed the light design and proposed automated, chat-like responses.
This research revealed an underlying need: users wanted speedy, interactive, and information-rich AR experiences, but also looked for intuitive interfaces and helpful features like tutorials and output-adjusting capabilities.
Personas

Sandra is a veterinarian who loves animals, has a side interest in technology, and is very tech savvy. She felt that an application like this would be a great help in identifying illnesses or physical anomalies in animals.
Results:
She likes the idea of an AI that can see through the camera
She felt it seemed too simplistic
The text is hard to read
Followed along with the tasks I described to her
Would like more functionality around the AI and how it operates
Likes the side menus
Would want descriptions of what the AI sees when it gives an output
Key Findings:
People like the app concept because, much like ChatGPT, it works with what we see through the device and can provide results in any format; it can also help users understand what things are, how to use them, or what they mean. However, the application's current look is too simplistic and needs visual enhancement.
Refining the AI/AR functionality is necessary to improve the app's usability and yield better results.

Keven is an accountant at an insurance company who always vouches for the latest tech, valuing reliability and longevity for daily needs.
Results:
He enjoys the simplicity
Would want a tutorial or a walkthrough of some sort
Likes how it's similar to other AI applications he uses, which made it easier for him to navigate
Strongly favors the scan history feature
Key Findings:
The app is visually appealing and simple, but there should always be a section that guides the user through its core functionality, which will keep users satisfied throughout the experience. We can never assume people will learn as they go, so it is better to create a tutorial-like feature to speed up their understanding of the application.

Nathan is a software engineer who builds applications for large companies such as Liberty Mutual and Capital One. He likes using technology to better his life in any way, whether for personal projects, health, or gym progression.
Results:
He says it's easy to use
He likes how the app handles the AR functionality for capturing what the camera sees
It would be helpful if the user could tap a part of the screen so the AR can detect what the object in that area is
Likes the responses the AI gives, which can be useful for projects or information about an object
Suggested a press + hold feature to focus on a specific area of what the camera is looking at
Key Findings:
Nathan mentioned some good points, one of them being a focus feature. The application's simplicity is great for the user, but the tutorial could explain its use better.

Daniel is an everyday photographer and rollerblader; if he is not in the streets showcasing his love for rollerblading, he is taking candid photos of urban areas and people. One thing he favors in tech is how it progresses every day to make things easier, such as AI, calendars, and automations. For this application, he was intrigued by how AI and AR can work together to see things from our point of view instead of requiring us to describe what we see.
Results:
Likes the design and color choices in the application
Likes how simple the application is
Suggested adding an automated answer feature where the AI gives its response along with short quick replies such as “Tell me more” or “Keep Scanning”
Likes the tutorial section, which helps him understand how the app and AI work
Would simplify signing up for an account in the app
Key Findings:
Throughout the interview I found input I can act on, such as the automated quick-reply prompt that appears when the AI output is finished. I have a good sense that some of the features I have worked on are solid for now, but there is always room for improvement.
Findings and Implementation
People like the app concept because, much like ChatGPT, it works with what we see through the device and can provide results in any format. The app is visually appealing and simple, but it needs a section that guides the user through its core functionality to keep users satisfied throughout the experience. Key additions: a tutorial section, scan history, and a walkthrough of the application.

Steps for fixing / Takeaways:
- Enhance UI visuals with colors, shapes, pictures, etc.
- Create meaningful buttons
- Improve AI capture within the camera
- Add descriptions of what the AI sees and its results
- Better icons
- Add a focus functionality
- Simplify signing up for an account in the app
Wireframe and Interface Design
The wireframes shown here portray how the application will look: simple screens built around the camera and its AI detection of what it sees. They cover the side bars, the sign-up page, and, most importantly, the camera screen with its buttons and AI overlay.

The main feature here shows how the AI AR works; a rough code sketch of the detection step follows this list.
- Responds back to the user with what the object is
- White lines indicate what the AI is trying to detect
- Recognition can take a few seconds
- Users can also ask further questions about the object
- Buttons on the top left and right provide past scans and account settings
- Press detection helps the AI focus on what the user wants to detect
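To make the scan step concrete, here is a minimal sketch of how such camera-frame detection could work, assuming an iOS build using Apple's Vision framework with a bundled Core ML detection model. The `ObjectDetector` model class and the `Detection` type are illustrative assumptions, not CapShot's actual implementation.

```swift
import Vision
import CoreML
import CoreVideo

// Illustrative result type: one labeled box per detected object.
// The box is what the UI draws as the white detection lines.
struct Detection {
    let label: String      // what the AI thinks the object is
    let confidence: Float  // how sure the model is
    let box: CGRect        // normalized bounding box (Vision coordinates)
}

// Runs a (hypothetical) bundled Core ML detector over one camera frame.
// Detection is asynchronous, which is why recognition can take a moment.
func detectObjects(in frame: CVPixelBuffer,
                   completion: @escaping ([Detection]) -> Void) throws {
    // `ObjectDetector` stands in for whatever .mlmodel CapShot would ship.
    let coreMLModel = try ObjectDetector(configuration: MLModelConfiguration()).model
    let request = VNCoreMLRequest(model: try VNCoreMLModel(for: coreMLModel)) { request, _ in
        let observations = (request.results as? [VNRecognizedObjectObservation]) ?? []
        let detections = observations.map { obs in
            Detection(label: obs.labels.first?.identifier ?? "unknown",
                      confidence: obs.labels.first?.confidence ?? 0,
                      box: obs.boundingBox)
        }
        completion(detections) // UI overlays the boxes and fills the AI output text box
    }
    try VNImageRequestHandler(cvPixelBuffer: frame, options: [:]).perform([request])
}
```

The completion handler is where the white overlay lines and the AI output text box would be updated, and where a follow-up question about a detected object could be sent on to the conversational AI.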

Research Validation and Feedback
The process began with low-fidelity wireframes built around the core features: object scanning, user navigation, and feedback output. The home screen served as the central hub, with quick access to a camera-based recognition feature and side menus for settings and history.
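As a rough illustration of that hub structure, here is a minimal SwiftUI sketch, assuming an iOS build; every view name below is a placeholder for illustration rather than CapShot's real code.

```swift
import SwiftUI

// Central hub: live camera screen with side-menu buttons in the top corners.
struct HomeView: View {
    @State private var showHistory = false   // top-left: past scans
    @State private var showSettings = false  // top-right: account settings

    var body: some View {
        ZStack(alignment: .top) {
            CameraScanView() // camera feed plus AI detection overlay
            HStack {
                Button("History") { showHistory = true }
                Spacer()
                Button("Settings") { showSettings = true }
            }
            .padding()
        }
        .sheet(isPresented: $showHistory) { HistoryView() }
        .sheet(isPresented: $showSettings) { SettingsView() }
    }
}

// Placeholder screens standing in for the wireframe's other pages.
struct CameraScanView: View { var body: some View { Color.black } }
struct HistoryView: View { var body: some View { Text("Past scans") } }
struct SettingsView: View { var body: some View { Text("Account settings") } }
```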
Round 1 Feedback Highlights:
Add clearer visual output on what the AI is detecting.
Improve accessibility (font sizes, onboarding guidance).
Add focus/click interaction to direct the AI's attention.
Round 1 Revisions:
Integrated an AI output text box.
Added top-left and top-right buttons for convenient access to menus.
Began envisioning an interactive "Helper Buddy" assistant.
Round 2 Feedback Highlights:
Users wanted responsive feedback like "Tell me more" or "Keep scanning."
Needed a more detailed but user-friendly interface.
Suggested customizable AI reply and scanning control.
Round 2 Revisions:
Added interactive suggestions in the AI output section.
Streamlined account sign-up flow.
Envisioned a press+hold interaction for object focus (sketched after this list).
Enhanced layout for better scan-and-feedback flow.
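As a sketch of that press+hold focus idea, and assuming the Vision-based pipeline sketched earlier, a long press could pick a point on screen and constrain detection to a region of interest around it. The gesture timing and the 30% box size here are illustrative choices, not settled design.

```swift
import SwiftUI

// Press+hold focus: a long press picks a screen point, which is converted
// into a normalized region the detector should concentrate on.
struct FocusGestureView: View {
    var onFocus: (CGRect) -> Void // receives a region in Vision's normalized coordinates

    var body: some View {
        GeometryReader { geo in
            Color.clear
                .contentShape(Rectangle())
                .gesture(
                    LongPressGesture(minimumDuration: 0.4)
                        .sequenced(before: DragGesture(minimumDistance: 0))
                        .onEnded { value in
                            guard case .second(true, let drag?) = value else { return }
                            let p = drag.location
                            // Center an (arbitrary) 30%-wide box on the touch point;
                            // Vision's y-axis runs bottom-up, hence the flip.
                            let region = CGRect(x: p.x / geo.size.width - 0.15,
                                                y: 1 - p.y / geo.size.height - 0.15,
                                                width: 0.3, height: 0.3)
                            onFocus(region.intersection(CGRect(x: 0, y: 0, width: 1, height: 1)))
                        }
                )
        }
    }
}

// The region then limits where the Vision request looks before detection runs:
// request.regionOfInterest = region  // a VNImageBasedRequest property
```

Restricting `regionOfInterest` this way both answers Nathan's request to point the AI at part of the screen and should make recognition feel faster, since the model processes a smaller area.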
Project Reflection
CapShot is more than a concept; it is a case study in responsive, accessible design. As AR and AI come together, products like this demonstrate how innovation must be balanced with ease of use. This exercise highlighted the power of iterative design, and with further testing and iteration, CapShot could become an invaluable daily tool for professionals, students, and everyday users alike.