The Elements of Innovation Discovered
Limitless training data on demand for AI systems with LexSet
Metal Tech News Weekly Edition – May 13, 2020
Burgeoning startup LexSet is taking an innovative approach to artificial intelligence training: generating synthetic data to supply a virtually unlimited amount of training information.
Based in Brooklyn, New York, the company did not start out with plans to become a synthetic data generator.
Originally focused on creating a spatially aware AI for interior design, their vast accumulation of data inevitably led them to discover the potential of synthetic data for powering visual search, object recognition and spatial navigation.
This discovery led LexSet to shift its focus to generating content for training vision AI models.
Fundamentally, an AI does not know what is correct or incorrect with regard to queries posed to its "brain." Like a newborn, it must be taught to differentiate between even things as simple as an orange and an apple.
To help nurture an AI, technicians and specialists spend thousands of hours on labeling images alone.
During the Transformative Technology Applied to Mining webinar, LexSet CEO Francis Bitonti said, "Photography was never meant to train AI, so a lot of the work that we're doing is putting information into these photographs that isn't already there."
Current AI training involves accumulating tens of thousands, if not millions, of photographs, each of which must then be carefully labeled so the AI can accurately determine whether something is correct or incorrect.
According to The Economist, nearly $2 billion is spent on data labeling alone, a figure that is rapidly growing toward $5 billion.
"What we're building is a simulation solution that actually creates training data, not photographs that humans have to go and label so they can train their algorithms."
When a company seeks to build an AI system, a typical training process first involves capturing the data (for simplicity's sake, photographs), which must then be labeled by a person. Bear in mind, this could be tens of thousands of pictures.
The photographs then undergo post-processing before training of the AI algorithm can finally begin.
After the AI has run through the current dataset, its successes and failures are evaluated before the entire process begins again.
One iteration of this cycle can take upwards of three months, according to Bitonti.
LexSet aims to change this by condensing the gathering, labeling and post-processing of data into a single step.
To enable better computer vision development, LexSet has created TDaaS (Training Data as a Service), using 3D content to create photo-realistic synthetic data to train vision AI models.
Basically, with this approach companies can generate limitless amounts of training data on-demand.
TDaaS has been proven specifically in building better AI vision systems for robotic object recognition and navigation, without needing photos in the training set.
With its stored database of prerendered 3D environments, the company provides ideal image data sets with perfect annotations that are free of bias. This allows faster training and yields a more accurate AI at lower cost.
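To illustrate why rendered data can arrive already labeled, here is a minimal Python sketch. It is not LexSet's TDaaS; the function names, the toy 16-by-16 "scene" and the single "box" class are all assumptions made for illustration. The point it shows is that when the scene is generated from known parameters, the annotation is known by construction and no person has to draw it afterward.

```python
import random


def render_scene(width, height, rng):
    """Draw one bright rectangle on a dark background; return (image, label).

    Hypothetical stand-in for a renderer: the scene parameters are chosen
    first, so the annotation is known before a single pixel is drawn.
    """
    box_w = rng.randint(2, width // 2)
    box_h = rng.randint(2, height // 2)
    x0 = rng.randint(0, width - box_w)
    y0 = rng.randint(0, height - box_h)

    image = [[0] * width for _ in range(height)]
    for y in range(y0, y0 + box_h):
        for x in range(x0, x0 + box_w):
            image[y][x] = 255

    # The label is copied straight from the scene parameters.
    label = {"class": "box", "bbox": (x0, y0, box_w, box_h)}
    return image, label


def generate_dataset(n, width=16, height=16, seed=0):
    """Generate n labeled images on demand from a seeded random source."""
    rng = random.Random(seed)
    return [render_scene(width, height, rng) for _ in range(n)]


def bbox_from_pixels(image):
    """Recover the bounding box by scanning pixels, to check the label."""
    ys = [y for y, row in enumerate(image) if any(row)]
    xs = [x for x in range(len(image[0])) if any(row[x] for row in image)]
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


dataset = generate_dataset(100)
```

Calling generate_dataset with a larger n produces more labeled examples with no extra labeling effort, and bbox_from_pixels can be used to confirm that every generated label matches what was actually drawn.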
Synthetic data training is not a new concept, but earlier efforts ran into the so-called "over-fitting problem," in which the AI learns to recognize renderings rather than real life.
LexSet's solution is to build its own "physically accurate" rendering machine, designed to simulate real-world lighting conditions, allowing the company to successfully train AI with computer-generated images.
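One common way to check for the over-fitting problem described above is to score the same model on held-out rendered images and on real photographs, then compare the two. The Python sketch below assumes that setup; the function names, the 10 percentage point threshold and the example predictions are hypothetical and are not drawn from LexSet.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must be the same length")
    if not labels:
        raise ValueError("cannot score an empty set")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


def domain_gap(synthetic_acc, real_acc):
    """How much better the model does on renders than on real photos."""
    return synthetic_acc - real_acc


def looks_overfit_to_renders(synthetic_acc, real_acc, max_gap=0.10):
    """Flag a model whose accuracy drops sharply on real photos.

    max_gap is an illustrative threshold, not an industry standard.
    """
    return domain_gap(synthetic_acc, real_acc) > max_gap


# Made-up predictions for illustration only.
truth = ["apple", "orange", "orange", "apple"]
synthetic_acc = accuracy(["apple", "orange", "orange", "apple"], truth)
real_acc = accuracy(["apple", "apple", "apple", "apple"], truth)
flagged = looks_overfit_to_renders(synthetic_acc, real_acc)
```

A large gap suggests the model has learned the look of the renderer rather than the objects themselves, which is the failure that more realistic lighting is meant to reduce.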
For companies in the mining sector that are beginning to use AI for mapping or to help determine viable drilling sites, all of the above-mentioned data must be accumulated, input and filtered to provide an accurate system.
The data layers involved in geoscience, as in other fields, are vast: geological, geochemical, geophysical, terrain, regional structure, known deposits, and the list goes on.
Generating rendered images with specific, exact parameters should reduce the time and cost of training an AI system being adopted to map underground mine workings or help geologists find the motherlode.
With this approach to AI training, more widespread use of the technology by the mining sector may be that much closer to reality, as the time saved can be spent elsewhere.
LexSet is one of the 15 current members of Prospect Mining Studio, a platform to support startup companies developing frontier technologies that will advance the mining industry. More information on Prospect Mining Studio can be read at Tech startup platform for mining industry in the April 29 edition of Metal Tech News.