
Gaze estimation in the wild. Creating a ground-breaking eye-tracking system for in-context market research

Eye Tracking AI Solution
Developing the successful Spark platform made our partners hungry for deeper market research in the wild. So we partnered again to create something that hadn't been done before: a web eye-tracking solution for customer experience research in a natural environment.


  • Name: eye square GmbH
  • Line of business: market research, human experience
  • Founding year: 1999
  • Size: 50-200 employees
  • Country: Germany (Berlin)


After a successful collaboration on the Spark surveying platform, the client came back to us with a new project to deepen the research with a more hands-on experience: adopting gaze estimation to track the user’s attention while browsing content on their smartphone. The key issue is that such an approach is still highly innovative, uncharted territory.


  • Collecting training and test data using crowd-sourcing platforms and a dedicated web app
  • Creating an AI-powered computer vision processing pipeline to detect the point of gaze
  • Developing a full-stack framework consisting of a set of web services, backend infrastructure, front-end SDK and web app
  • Continuous research & development of algorithms for accuracy improvement

Developers’ insights – the challenge

Review Quote
One of the biggest challenges was estimating the gaze point on the screen in an uncontrolled environment with varying light and camera positions. Contemporary solutions rely on expensive hardware and large datasets of recorded subjects. Few solutions currently available on the market can do what ours does with just the built-in smartphone camera.
Review Quote
The challenge we find most formidable is the complexity of the problem. We must perform several tasks simultaneously to find the point where the gaze meets the screen surface, and each of these tasks is a complicated research challenge in its own right.
Review Quote
The biggest challenge is to achieve high accuracy of the point of gaze estimation, i.e. the point the smartphone user is looking at. Our client wants the margin of error not to exceed 1 cm for a wide range of phones and their users. It’s a difficult task for a variety of reasons. Firstly, every person is different, and their eyes have a unique shape, colour, etc. Secondly, each phone has a different camera, size and screen ratio. Thirdly, each use case has different lighting conditions, and the person holds the phone differently. All of these factors make the task and the ultimate goal so complex.

Technology stack

  • Python
  • PyTorch
  • OpenCV
  • MediaPipe
  • JavaScript
  • FastAPI
  • AWS
  • Docker
  • ClearML
  • DVC
Web Eye Tracking project

The challenges of the innovation in computer vision

The collaboration on the web eye-tracking project evolved organically from the previous project we worked on – the Spark platform for market research. Given that project’s successful results, eye square asked us to start working on a new solution that would deepen the research even further: a system to track customers’ attention while they browse content on their smartphones, enabling more accurate marketing and experience research. They needed an accurate way to track the user’s gaze across the surface of the phone’s screen.

The project’s key challenge was that eye-tracking technology for mobile phones was an innovative concept with no practical market application yet. Therefore, they needed competent experts and engineers to create something entirely from scratch. Moreover, the solution had to run in a real-life context without any special research environment. The aim was to create something that would work in a daily setting, during regular phone use.
Setting the right course for the product

Our initial communication involved the company’s COO, Phillip Reiter, technical project managers – Garrit Güldenpfennig, Frederic Neitzel and Olaf Briese, and the CFO, Friedrich Jakobi. They outlined their expectations and needs for the project. 


The company previously used several third-party solutions for laptop-related use cases, some of which required external hardware. The main goal this time was to create a new solution suitable for mobile phones. The initial agreements took approximately two weeks, after which the team started working on the Proof of Concept to ensure the visions were aligned before starting the subsequent phases of such a complex R&D project.

Building a team of gaze estimation and computer vision experts

Our team comprises a Senior Computer Vision Engineer, Machine Learning Engineers, DevOps specialists, frontend developers and a project manager. 


eye square supported us with three technical Project Managers (one of whom has become a project coordinator), the CFO and a developer to provide extra help. Moreover, an essential part of their contribution was coordinating the crowd-sourcing platform to acquire testers and data sets. The technical project managers Garrit, Frederic and Olaf also provided their expertise and help whenever needed.

Technological aspects of the gaze estimation project

To create a solution that exceeds the state of the art, we had to make the most of the available resources. The crucial part of the process was to build a stable foundation that would support developing the new features needed to bring the system closer to its final, intended form.

  • We applied Python, PyTorch and OpenCV, among other libraries, to create the base algorithm. We later refined it using data from early tests and a larger dataset gathered via the ClickWorker crowd-sourcing platform.
  • JavaScript was used to develop WETSDK, the Training Data Collection App (TDCA) and an example web application illustrating the production use case.
  • FastAPI was used to develop the communication interface between the end user’s web app and the algorithm running on the backend server.
  • AWS allowed us to store the training and validation data in the cloud.
  • Docker made it easier to encapsulate the algorithm in self-contained software images that can run in the cloud.
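At its core, the FastAPI layer shuttles webcam frames from the browser to the backend model and returns an estimated gaze point. As a stdlib-only sketch of what such a request/response exchange might look like (the field names here are illustrative assumptions, not the actual API):

```python
import json

def build_frame_request(session_id, timestamp_ms, frame_b64):
    """Package a captured webcam frame for the backend (field names are illustrative)."""
    return json.dumps({
        "session_id": session_id,
        "timestamp_ms": timestamp_ms,
        "frame": frame_b64,  # e.g. base64-encoded JPEG captured in the browser
    })

def parse_gaze_response(payload):
    """Extract the estimated on-screen gaze point (in pixels) from the backend reply."""
    data = json.loads(payload)
    return data["gaze_x_px"], data["gaze_y_px"]

# Round trip with a mocked backend reply
reply = json.dumps({"gaze_x_px": 412.5, "gaze_y_px": 880.0})
print(parse_gaze_response(reply))  # (412.5, 880.0)
```

In the real system the heavy lifting (decoding the frame, running the neural networks) happens server-side; the web app only needs to send frames and render the returned points.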

Due to its complexity and novelty, the project had to be divided into multiple stages and required extensive research, including a trial-and-error approach. The biggest challenges stem from the dynamic environment, as we wanted to create a solution that works “in the wild”, without any specific hardware and with minimal prerequisites for the user.

This raised several obstacles, including the complexity of calibrating the phone camera. Phone screens and cameras differ from model to model, so it’s hard to find a generic estimation method, especially since phone manufacturers don’t disclose the physical dimensions of their devices.

Since the user needs complete freedom to use their phone, there’s the challenge of making sure the gaze estimation algorithm can be auto-calibrated on different models of smartphones. That is a difficult task given the varying angles, distances, and face detection capabilities. Our neural networks are trained on different faces and angles to approximate the conditions of regular use.
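A common way to handle per-user and per-device variation is a lightweight calibration step: the user looks at a few known on-screen targets, and a correction is fitted between the raw model output and the true positions. A minimal sketch of that idea, assuming a simple per-axis linear (scale + offset) correction rather than the project’s actual calibration method:

```python
def fit_axis(raw, truth):
    """Least-squares fit of truth ≈ scale * raw + offset for one screen axis."""
    n = len(raw)
    mean_r = sum(raw) / n
    mean_t = sum(truth) / n
    var = sum((r - mean_r) ** 2 for r in raw)
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(raw, truth))
    scale = cov / var
    offset = mean_t - scale * mean_r
    return scale, offset

# Calibration: the user fixates known on-screen targets; the raw
# estimates are biased and mis-scaled for this particular user/phone.
raw_x   = [100, 300, 500, 700]   # raw model output (px)
truth_x = [80, 320, 560, 800]    # known target positions (px)
scale, offset = fit_axis(raw_x, truth_x)
corrected = [scale * r + offset for r in raw_x]
```

A real calibration would be two-dimensional and more robust to outliers, but the principle is the same: a few seconds of looking at targets can absorb much of the per-device variation.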

Because of the volume of crowd-sourced data from the ClickWorker platform, the team needs to evaluate data quality automatically. An automatic framework was developed to filter out recordings that don’t satisfy basic quality metrics, such as proper lighting and a lack of blurring.
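Two of the quality metrics mentioned, lighting and blur, can be approximated cheaply: mean pixel intensity for brightness, and the variance of a Laplacian filter for sharpness (a blurred frame has little high-frequency content, so the variance collapses). A pure-Python sketch of such a filter; the thresholds are illustrative, not the framework’s actual values:

```python
def mean_brightness(img):
    """Average pixel intensity of a grayscale image (0-255)."""
    return sum(sum(row) for row in img) / (len(img) * len(img[0]))

def laplacian_variance(img):
    """Variance of a 4-neighbour Laplacian; low values indicate a blurry frame."""
    h, w = len(img), len(img[0])
    vals = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (img[y-1][x] + img[y+1][x] + img[y][x-1] + img[y][x+1]
                   - 4 * img[y][x])
            vals.append(lap)
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

def passes_quality(img, min_brightness=40, min_sharpness=10):
    """Accept a frame only if it is bright enough and sharp enough."""
    return mean_brightness(img) >= min_brightness and laplacian_variance(img) >= min_sharpness
```

In production, such checks would run on OpenCV arrays (e.g. `cv2.Laplacian` followed by `.var()`), but the decision logic is the same: reject a recording before it ever reaches the training set.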


Developers’ Insights – overcoming obstacles

Review Quote
One of the most prominent obstacles we are proud to have overcome is putting the product elements together. Both parts are strongly dependent on each other, and thanks to putting them together, we can quickly transfer the solution into the cloud.
Review Quote
We needed to overcome the barrier of using our own data set to train the neural network, especially in our case, where acquiring the training data was more complicated.
Review Quote
One of the most prominent barriers we overcame was collecting enough data for training the neural network models. Such data exists but isn’t available for commercial use and doesn’t always match the specific use case. That’s why we’ve done a lot of work to collect and analyse a large amount of data from hundreds of people to create our own training set.

Developers’ insights – technologies

Review Quote
The most important technologies we use would be computer vision systems (based on the PyTorch framework), linear algebra, and our skill in carefully reading and applying the solutions we found in various papers.
Review Quote
The technologies I found especially helpful were Python, Numpy, OpenCV, PyTorch, MediaPipe, Scipy, GPU and CUDA.
Review Quote
The WET project wouldn’t be possible without machine learning algorithms. We use deep learning elements at every stage of image processing – from detecting the face as a three-dimensional surface to estimating the gaze point and beyond. We find Python and PyTorch invaluable there.

First steps towards a groundbreaking eye-tracking solution

  • Milestones 1-3: developing a PoC
    • Milestone 1 – achieving a certain level of algorithm accuracy (06.06-31.07.2022)
    • Milestone 2 – creating SDK and an example of application design (05.08-30.09.2022)
    • Milestone 3 – web eye tracking service and further improvement of the algorithm accuracy to meet the acceptance criteria (01.10-31.10.2022)
  • Milestone 4: continuous research, improving the algorithm and preparing the application to collect a large amount of data (TDCA – Training Data Collection Application) via ClickWorker – a crowd-sourcing platform
  • Milestone 5: processing the data from ClickWorker, adding TDCA features, and further working on the algorithm accuracy
  1. The project started as a proof of concept. Initially, we aimed to achieve the required minimum algorithm accuracy.
  2. Upon establishing the essential accuracy, our team worked on the SDK environment and application design ready for testing and data collection.
  3. After establishing these features, we created a web eye-tracking service for further testing.
  4. The next step involved continuous research into improving the algorithm and preparing the application to collect larger amounts of data from the ClickWorker crowd-sourcing platform.
  5. Currently, we are working on the next round of testing to reduce the average point-of-gaze estimation error to at most 1 cm across a wide range of test subjects.
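The 1 cm acceptance criterion boils down to a simple metric: the mean Euclidean distance between predicted and true gaze points across test recordings. A minimal sketch of how such an error could be computed (the sample points below are made up for illustration):

```python
import math

def mean_gaze_error_cm(predicted, ground_truth):
    """Mean Euclidean distance (cm) between predicted and true gaze points."""
    errors = [math.dist(p, t) for p, t in zip(predicted, ground_truth)]
    return sum(errors) / len(errors)

# Toy sample: predicted vs. true gaze points on the screen, in cm
pred  = [(1.0, 2.0), (3.0, 4.5), (5.2, 1.1)]
truth = [(1.3, 2.4), (3.0, 4.0), (5.2, 1.1)]
err = mean_gaze_error_cm(pred, truth)
print(err <= 1.0)  # True: this toy sample is within the 1 cm target
```

In practice the evaluation averages over many subjects, phones and lighting conditions, which is precisely what makes the sub-centimetre target hard.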

Listen to Karol Duzinkiewicz talk about Gaze Estimation.

Watch the video

What were the key metrics of our journey?

  • 80% – the target accuracy for the next phase
  • < 1 cm – the target margin of gaze point detection error

Outcomes and further steps towards reliable eye-tracking experience research

Even though the innovation threshold for this project is set high, both sides are satisfied with the results. The initial aim was to prepare a Proof of Concept, but our partner keeps extending our work, as the results are good and the prospects promising.

We are gathering and processing more training data to improve the algorithm’s accuracy. We already exceeded the technological state-of-the-art and are continuing to work on achieving better accuracy and taking the next steps towards creating a working product.

Several elements of the processing pipeline we developed are currently being considered for patent submission.

Learn more about the Spark market research platform.

View the case study

Words from eye square

Review Quote
It was the synergy between your coordination, efforts on our side, and all of your team effort. We are looking forward to the next project with you guys.
Review Quote
So, on this eye-tracking project in particular, we were very impressed with the competence of the team, the timelines, and how they are met. It’s very good to work together, and you have a great project team on site, and communication, as Michael mentioned, is very good.
Review Quote
So although the project is not yet over, we now have a prototype, and now the prototype has to become real live action and has to be integrated into our technology. So these are the next steps on our journey. And we hope that DAC will continue to support us, as it has before.
Review Quote
(…) we are confident in looking forward that we have a good partner on our side.

Read More of Our Case Studies