
Outline of the problem

The key issue is to reconstruct 3D objects and buildings from unstructured image collections freely available on the internet. The challenge is to identify which parts of two images capture the same physical points of a scene, establish correspondences between pixel coordinates of image locations, and recover the 3D location of points by triangulation. 
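To make the triangulation step concrete, here is a minimal sketch using OpenCV's cv2.triangulatePoints: it recovers a single 3D point from one pair of corresponding image points, assuming the two camera projection matrices are already known. The poses and coordinates are illustrative placeholders, not values from the project.

```python
import numpy as np
import cv2

# Two camera projection matrices in normalized coordinates (intrinsics already
# removed). The relative pose and point coordinates are illustrative placeholders.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])   # first camera at the origin
R = np.eye(3)                                   # assumed relative rotation
t = np.array([[0.5], [0.0], [0.0]])             # assumed relative translation
P2 = np.hstack([R, t])

# One pair of corresponding points, shape 2xN as expected by OpenCV.
x1 = np.array([[0.10], [0.05]])                 # point seen by camera 1
x2 = np.array([[0.15], [0.05]])                 # the same point seen by camera 2

X_h = cv2.triangulatePoints(P1, P2, x1, x2)     # homogeneous 4x1 result
X = (X_h[:3] / X_h[3]).ravel()                  # Euclidean 3D point, roughly (1, 0.5, 10)
print(X)
```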

Proposed solution

The proposed solution is to develop a machine learning algorithm based on computer vision techniques to register two images from different viewpoints. By creating a method to identify key points in the images and establish correspondences between them, we can calculate the fundamental matrix, which provides essential information about where and from which viewpoints the photos were taken. This process will lead to the generation of 3D models of the landmarks.
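As a rough illustration of this registration step (not the project's actual models), a classical OpenCV pipeline could detect keypoints, keep distinctive matches with a ratio test, and estimate the fundamental matrix with RANSAC; SIFT and the file names below are assumptions made for the sketch.

```python
import numpy as np
import cv2

# Illustrative two-view registration: keypoints, tentative matches, and a
# RANSAC estimate of the fundamental matrix. File names are placeholders.
img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test keeps only distinctive matches.
matcher = cv2.BFMatcher()
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]

pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

# RANSAC rejects outlier correspondences while estimating F.
F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.999)
print(F)
print(int(inlier_mask.sum()), "inlier correspondences")
```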

The technologies we applied:

  • Python
  • PyTorch
  • Kornia
  • OpenCV

Connecting Perspectives: Algorithmic Solutions for Cross-View Landmark Recognition in Tourist Imagery.

The dataset consisted of collections of tourist images of 16 landmarks taken from various angles and distances: close up, from below, and sometimes with obstructions such as people in the frame. The challenge was to develop algorithms capable of identifying key points in these images (located on buildings) and then establishing correspondences between them across different viewpoints, even without knowing the exact camera parameters used to capture the images. The difficulty lay in dealing with diverse viewpoints, lighting conditions, occlusions, and user-applied filters, without access to the capture location or device parameters such as camera models and lenses.

 

The people and tech behind our project.  

The team consisted of five developers and researchers with varying levels of experience in computer vision, machine learning, and image processing. 

Each member worked independently in their niche area, running experiments, while also collaborating and discussing progress with the others so that the best approaches could finally be integrated into one system.

 

The project leveraged Python for scripting and building the experiment architecture. PyTorch was used to build and train neural networks for keypoint detection and matching, while Kornia provided state-of-the-art computer vision models. OpenCV handled image preprocessing and manipulation tasks. This cohesive tech stack enabled efficient experimentation and remarkable progress in 3D object reconstruction from diverse image collections.
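Kornia, for example, ships pretrained dense matchers such as LoFTR. The sketch below shows one way such a matcher could be wired up; the file names, the fixed resize, and the choice of the "outdoor" weights are assumptions for illustration rather than the project's actual configuration.

```python
import torch
import kornia as K
import kornia.feature as KF

# Illustrative use of Kornia's pretrained LoFTR matcher on an image pair.
matcher = KF.LoFTR(pretrained="outdoor").eval()

def load_gray(path, size=(480, 640)):
    # Load as a grayscale tensor in [0, 1] and resize so the spatial
    # dimensions are divisible by 8, as the matcher expects.
    img = K.io.load_image(path, K.io.ImageLoadType.GRAY32)[None]  # 1x1xHxW
    return K.geometry.resize(img, size)

with torch.inference_mode():
    out = matcher({
        "image0": load_gray("view1.jpg"),  # placeholder file names
        "image1": load_gray("view2.jpg"),
    })

kpts0, kpts1 = out["keypoints0"], out["keypoints1"]  # matched pixel coordinates
print(kpts0.shape[0], "tentative correspondences")
```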

A Holistic Journey through Landmark Recognition: From Exploration to Validation.

Unveiling Insights Through Literature Exploration

First, we delved into an in-depth literature review to gain a comprehensive understanding of existing solutions and techniques in the realm of stereophotogrammetry and 3D reconstruction from images. This initial phase allowed us to grasp the state-of-the-art approaches and identify potential areas for improvement.

Navigating Algorithms and Models for Key Point Identification

Next, we proceeded with experimentation, exploring various computer vision algorithms and machine learning models. Our primary aim was to identify key points within the images and establish meaningful correspondences across different viewpoints. This experimental stage enabled us to assess the performance and limitations of different approaches, guiding us towards the most promising paths.

From Theory to Reality: Prototyping and Refinement

With valuable insights from the experimentation phase, we moved on to developing prototypes. These prototypes served as crucial testing grounds for implementing diverse algorithms and fine-tuning parameter combinations on our dataset. Through this iterative process, we gained valuable feedback and refined our methods.

Forging Cohesion: Seamlessly Merging Algorithms and Techniques

As the project’s complexity demanded an integration of various algorithms and techniques, we dedicated substantial effort to ensuring a seamless fusion of components. This integration phase required meticulous coordination and harmonization of different modules to ensure they functioned cohesively.

Testing the Waters: Evaluating Performance and Potential

Finally, we put our solution to the test. Through extensive testing on unseen data, we rigorously evaluated its performance, assessing its accuracy and generalizability. This thorough examination allowed us to validate the effectiveness of our approach and ascertain its potential for real-world applications.

Results

The machine learning algorithm developed by the team successfully registered images from different viewpoints and calculated the fundamental matrix. This made it possible to create accurate 3D models of the landmarks from the collections of tourist images.

Key numbers

The project solved this complex computer vision problem in a relatively short time frame of slightly over one month.


The proposed solution showed promising results and had potential applications in virtual and augmented reality, cultural heritage preservation, and other 3D reconstruction tasks where the data is not complete.

Michał Affek
Embedded Machine Learning Researcher

Pushing Limits in Computer Vision: Join Our Journey of Innovation.

Computer vision topics can be both challenging and innovative, as demonstrated by DAC.digital’s remarkable research-science project in reconstructing 3D objects and buildings from images. If you are interested in embarking on a project that involves computer vision and pushing the boundaries of this cutting-edge technology, we invite you to contact us to collaborate and work together!

Let’s join forces to unlock new possibilities in the world of computer vision.

Estimate your project!

Let’s revolutionize your customer experience together. Get in touch today!

Check other case studies:

Client

  • Name: eye square GmbH
  • Line of business: market research, human experience
  • Founding year: 1999
  • Size: 50-200 employees
  • Country: Germany (Berlin)

Challenge

After a successful collaboration on the Spark surveying platform, the client came back to us with a new project to deepen the research with a more hands-on approach, adopting gaze estimation to track users' attention while they browse content on their smartphones. The key issue was that such an approach is still highly innovative and uncharted territory.

Solution

  • Collecting training and test data using crowd-sourcing platforms and a dedicated web app
  • Creating AI-powered computer vision processing pipeline to detect the point of gaze 
  • Developing a full-stack framework consisting of a set of web services, backend infrastructure, front-end SDK and web app
  • Continuous research & development of algorithms for accuracy improvement

Technology stack

  • Python
  • PyTorch
  • OpenCV
  • MediaPipe
  • JavaScript
  • FastAPI
  • AWS
  • Docker
  • ClearML
  • DVC

The challenges of the innovation in computer vision

The collaboration on the web eye-tracking project evolved organically from the previous project we worked on – the Spark platform for market research. Given the successful results of that project, eye square asked us to start working on a new solution that would allow them to deepen the research even more. They wanted to track the customer's attention while browsing content on their smartphones for more accurate marketing and experience research, which required an accurate system for tracking the user's gaze over the surface of the phone's screen. The project's key challenge was that eye-tracking technology for mobile phones was an innovative concept without a practical market application yet. Therefore, they needed competent experts and engineers to create something entirely from scratch. Moreover, the solution had to run in a real-life context, without any special research environment: the aim was to create something that would work in a daily setting and during regular phone use.
Setting the right course for the product

Our initial communication involved the company’s COO, Phillip Reiter, technical project managers – Garrit Güldenpfennig, Frederic Neitzel and Olaf Briese, and the CFO, Friedrich Jakobi. They outlined their expectations and needs for the project. 

 

The company had previously used several third-party solutions for laptop-related use cases, some of which required external hardware. The main goal this time was to create a new solution suitable for mobile phones. The initial agreements took approximately two weeks, after which the team started working on the Proof of Concept to ensure the visions were aligned before starting the subsequent phases of such a complex R&D project.

Building a team of gaze estimation and computer vision experts

Our team comprises a Senior Computer Vision Engineer, Machine Learning Engineers, DevOps specialists, frontend developers and a project manager. 

 

eye square supported us with three technical Project Managers (one of whom has become a project coordinator), the CFO and a developer to provide extra help. Moreover, an essential part of their contribution was coordinating the crowd-sourcing platform to acquire testers and data sets. The technical project managers Garrit, Frederic and Olaf also provided their expertise and help whenever needed.

Technological aspects of the gaze estimation project

To create a solution that exceeds the state of the art, we had to use the available resources to the maximum. The crucial part of the process was to create a stable foundation that would allow new features to be built and developed, bringing the solution ever closer to its final form.

  • We applied Python, PyTorch and OpenCV, among other libraries, to create the base algorithm. We later based the development on data from early tests and on larger datasets gathered via the ClickWorker crowd-sourcing platform.
  • JavaScript was used to develop the WETSDK, the Training Data Collection App (TDCA) and an example web application illustrating the production use case.
  • FastAPI was used to develop a communication interface between the end user (the web app) and the algorithm running on the backend server; a sketch of such an interface follows after this list.
  • AWS allowed us to store the training and validation data in the cloud.
  • Docker made it easier to encapsulate the algorithm in self-contained software images that can run in the cloud.
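A hypothetical sketch of what such a FastAPI interface could look like is shown below; the endpoint name, payload fields and the estimate_gaze stub are illustrative assumptions, not the project's actual API.

```python
# Hypothetical FastAPI endpoint bridging the web client and the backend
# gaze-estimation model. Endpoint name, payload fields and the estimate_gaze
# stub are illustrative, not the project's actual interface.
import base64

import cv2
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class FramePayload(BaseModel):
    session_id: str
    frame_b64: str  # a single camera frame, base64-encoded JPEG

def estimate_gaze(frame: np.ndarray) -> tuple[float, float]:
    # Placeholder for the real model inference; returns a point of gaze
    # in normalized screen coordinates.
    return 0.5, 0.5

@app.post("/gaze")
def gaze_endpoint(payload: FramePayload):
    raw = base64.b64decode(payload.frame_b64)
    frame = cv2.imdecode(np.frombuffer(raw, np.uint8), cv2.IMREAD_COLOR)
    x, y = estimate_gaze(frame)
    return {"session_id": payload.session_id, "gaze": {"x": x, "y": y}}
```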

Due to its complexity and innovation, the project had to be divided into multiple stages and required extensive research, including a "trial and error" approach. The biggest challenges involved the dynamic environment, as we wanted to create a solution that would work "in the wild", without the need for any specific hardware and with minimal prerequisites from the user.

This raised several obstacles, including the complexity of calibrating the phone camera. Phone screens and cameras differ from model to model, so it is hard to find a generic estimation method, especially since phone manufacturers don't disclose the physical dimensions of their devices.

Since the user needs complete freedom to use their phone, there is the challenge of making sure that the gaze estimation algorithm can auto-calibrate on different smartphone models. It is a difficult task given the variety of angles, distances, and face detection capabilities. Our neural networks are trained on different faces and angles to get results representative of regular use.
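MediaPipe, which is part of the stack listed above, is one way to obtain the facial and iris landmarks such a network could consume. The sketch below is only an assumption about how that extraction might look; the file name is a placeholder and the landmark indices follow MediaPipe's Face Mesh convention.

```python
# Illustrative landmark extraction with MediaPipe Face Mesh: with
# refine_landmarks=True the mesh includes iris points (indices 468-477),
# which are useful inputs for gaze estimation. The file name is a placeholder.
import cv2
import mediapipe as mp

face_mesh = mp.solutions.face_mesh.FaceMesh(
    static_image_mode=True, max_num_faces=1, refine_landmarks=True
)

frame = cv2.imread("selfie_frame.jpg")
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)   # MediaPipe expects RGB input
result = face_mesh.process(rgb)

if result.multi_face_landmarks:
    landmarks = result.multi_face_landmarks[0].landmark
    h, w = frame.shape[:2]
    iris_px = [(lm.x * w, lm.y * h) for lm in landmarks[468:478]]
    print(len(landmarks), "landmarks,", len(iris_px), "iris points")
```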

Because of the volume of crowd-sourced data from the ClickWorker platform, the team needs to evaluate data quality. An automatic framework was developed to filter out recordings that don't satisfy basic quality metrics, such as proper lighting and a lack of blur.
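The exact acceptance criteria are not described here, but a filter of this kind could, for example, combine a variance-of-Laplacian blur score with a simple brightness check; the thresholds and function names below are illustrative assumptions.

```python
# Illustrative quality gate for crowd-sourced recordings: variance of the
# Laplacian as a blur score plus mean intensity as a crude exposure check.
# Thresholds are assumptions, not the project's actual acceptance criteria.
import cv2

def frame_ok(frame, blur_threshold=100.0, brightness_range=(40, 220)):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blur_score = cv2.Laplacian(gray, cv2.CV_64F).var()   # low variance => blurry
    brightness = gray.mean()
    return (blur_score >= blur_threshold
            and brightness_range[0] <= brightness <= brightness_range[1])

def recording_ok(path, min_ok_ratio=0.8):
    cap = cv2.VideoCapture(path)
    ok = total = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        total += 1
        ok += frame_ok(frame)
    cap.release()
    return total > 0 and ok / total >= min_ok_ratio
```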

Learn more about Python and its uses

Read the article

First steps towards a groundbreaking eye-tracking solution

  • Milestones 1-3: developing a PoC
    • Milestone 1 – achieving a certain level of algorithm accuracy (06.06-31.07.2022)
    • Milestone 2 – creating SDK and an example of application design (05.08-30.09.2022)
    • Milestone 3 – web eye tracking service and further improvement of the algorithm accuracy to meet the acceptance criteria (01.10-31.10.2022)
  • Milestone 4: continuous research, improving the algorithm and preparing the application to collect a large amount of data (TDCA – Test Data Collection Application) via ClickWorker – a crowd-sourcing platform
  • Milestone 5: processing the data from ClickWorker, adding TDCA features, and further working on the algorithm accuracy
  1. The project started as a proof of concept. Initially, we aimed to achieve the required minimum algorithm accuracy.
  2. Upon establishing the essential accuracy, our team worked on the SDK environment and an application design ready for testing and data collection.
  3. After establishing these features, we created a web eye-tracking service for further testing.
  4. The next step involved continuous research into improving the algorithm and preparing the application for collecting more considerable amounts of data from the ClickWorker crowd-sourcing platform.
  5. Currently, we are working on the next round of testing to improve the algorithm's accuracy to a maximum of 1 cm average point-of-gaze estimation error on a wide range of test subjects; a sketch of how this metric can be computed follows below.
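For reference, the snippet below shows one straightforward way an average point-of-gaze error and the share of samples within a 1 cm margin could be computed. It assumes predictions and ground truth have already been converted to screen-plane centimetres (itself non-trivial, given the device-dimension issues described earlier); the function name and the numbers in the usage example are purely illustrative.

```python
# Illustrative metric computation: mean Euclidean point-of-gaze error (in cm)
# and the share of samples within a given margin. Assumes predictions and
# ground truth are already expressed in screen-plane centimetres.
import numpy as np

def gaze_metrics(pred_cm: np.ndarray, true_cm: np.ndarray, margin_cm: float = 1.0):
    errors = np.linalg.norm(pred_cm - true_cm, axis=1)   # per-sample Euclidean error
    return {
        "mean_error_cm": float(errors.mean()),
        "within_margin": float((errors <= margin_cm).mean()),
    }

# Tiny usage example with made-up numbers.
pred = np.array([[1.0, 2.0], [3.5, 1.0]])
true = np.array([[1.2, 2.1], [3.0, 1.4]])
print(gaze_metrics(pred, true))
```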

Listen to Karol Duzinkiewicz talk about Gaze Estimation.

Watch the video

What were the key metrics of our journey?

  • 80% – the target accuracy for the next phase
  • <1 cm – the target margin of gaze point detection error

Outcomes and further steps towards reliable eye-tracking experience research

Even though the innovation threshold is set high for this project, the results are satisfying for both sides. The initial aim was to prepare the Proof of Concept. However, our partner keeps extending our work, as the results are good and the prospects promising.

We are gathering and processing more training data to improve the algorithm's accuracy. We have already exceeded the technological state of the art and are continuing to work on achieving better accuracy and taking the next steps towards creating a working product.

Several elements of the processing pipeline developed by DAC.digital are currently being considered for patent submission.

Learn more about the Spark market research platform.

View the case study

Estimate your project

Looking to develop your own computer vision solution? Let us help! Contact us now for a free estimate and let's get started.

Customer.

Sports Computing
Sports Computing combines the best of both worlds – a high-tech, AI-based app with motion tracking and football. Changing the way we train, stay active and enjoy the sport, Sports Computing lets you share your love of football no matter where in the world you are. KickerAce – All you need is your phone and a ball.
Experience we shared.
Computer vision processing
Artificial Intelligence & Machine Learning
Mobile application development

Problem.

  • Need to promptly deliver a revamped version of the app based on a new UI design.
  • The software was expected to facilitate a large number of concurrent users, which required full scalability.
  • Lack of internal tech resources on the client’s end.
  • Looking for a team with competencies across a broad spectrum of skills – including mobile development, backend, video and image processing, AI/ML, and the ability to package all these skills together.
  • Having previously chosen a partner that failed to deliver expected results and caused a go-to-market delay.
  • Unmaintainable, messy code with no versioning scheme.

Solution.

  • Initially, performing detective work to find the most recent version of the app, fixing all burning issues, and deploying the app again to the testers to create a baseline.
  • Cleaning up the code and redesigning the application based on the new designs.
  • Bringing the backend into order based on established good practices: decoupling environments, creating separate development and production infrastructure, setting up proper DevOps infrastructure in the Azure context, and setting up CI/CD pipelines for the mobile app.
  • Setting up a dedicated team tackling the image analysis aspects of the app.
  • Developing the product in close cooperation with the Sports Computing Product Owner.

Process.

The services were performed by DAC.digital developers chosen to form an interdisciplinary, independent team. The core areas of support were data science with Python, image analysis, and DevOps; the work was aligned during so-called “Block Planning Sessions” or prioritized and assigned to our team via email. The initial collaboration began with KickerAce mobile app development and continued with the Shot Analyzer software.

Delivered value.

The customer has been provided with fully scalable and functional software, meeting the deadlines, requirements, and specifications presented at the beginning of the project. The collaboration between DAC.digital and the customer's teams has been based on transparency, openness, and honesty, resulting in solid trust. Our problem-solving approach and excellent understanding of both technology and business allowed the Sports Computing team to feel comfortable and confident in the results of our work.

Testimonial.

Most important is that you cover our professional needs, which are pretty extensive and different from traditional projects. We couldn’t get a more ideal partner with extraordinary skills both within AI and application development. Professional and transparent project management is vital. PM and interactions are working exceptionally well. Your ability to work independently and come up with constructive alternative solutions, understandable for a layperson, has reduced the stress and concerns. We appreciate the good chemistry. We see DAC.digital as more than just another developer. We see you as an extension of Sports Computing.
Kjell Heen
CEO of Sports Computing

Used Technologies.

React Native
Azure
Terraform

Are you interested in video processing and mobile app development?

Just leave your email address and we’ll be in touch soon