The Silicon Review
Many startups would like to incorporate a machine learning component into their products. Most of these products are unique in terms of the business, the data required to train the machine learning models, and the data that can be collected. One of the main challenges these startups face is the availability of data specific to their business problem. Unfortunately, the quality of a machine learning model depends on the quality of the domain-specific data used to train it, and generic data sets are of little use for the unique problems these startups are solving. As a result, they cannot roll out a feature involving machine learning until they have collected enough data. On the other hand, customers ask for the feature before their usage can generate the required data. In such a situation, the machine learning solution needs to be rolled out incrementally. For this to happen, there must be synergy between the data and the algorithms that process it.
Before 2003, startups had only just begun outsourcing product development offshore, and success stories were few. Offshore vendors were applying processes designed for large IT outsourcing projects to their work with startups. Nitin and Manjusha realized that startups needed a way of working designed solely for them, and Talentica Software came into being in April 2003 for exactly this purpose. The “Talentica Way” is designed specifically for startups: with dedicated teams consisting of the brightest talent, Talentica balances process and flexibility to meet the constantly changing needs of startups. Talentica’s passion for technology and focus on execution increase a startup’s chances of a successful outcome, user acquisition, and a path to profitability. Going forward, Talentica aspires to become the number one company building products for startups.
Building a solution involving machine learning takes much more than the model. It is a complex mix of data structures, model training, model integration, and architecture. Talentica engages in end-to-end delivery of machine learning solutions tailored to bring product features to life.
There are many NLP APIs and services available today. Some of these services can deliver 80% accuracy on extraction tasks involving generic data. However, solving really hard problems in natural language understanding, especially with small, proprietary data sets, requires skillfully combining machine learning techniques with traditional NLP algorithms.
Deep learning techniques have given a fillip to computer vision and image processing solutions. However, training models for proprietary and domain-specific data sets is a challenge. Talentica finds innovative ways to transform the domain-specific part of a problem into a generic computational problem in order to deliver practical solutions.
Optimization algorithms are the foundation of modern-day machine learning, yet they have a rich history of their own dating back many decades. Talentica strives to use these fundamental algorithms to deliver solutions to problems involving allocation, balancing, and routing.
Recognizing the shape of cells and detecting cells undergoing mitosis is a challenging problem. Classical image processing was used for cell localization, image transformation, and feature extraction. A trained convolutional neural network was then used to classify each cell into one of several classes. To detect similar regions in an H&E-stained tissue sample, however, unsupervised approaches were used. Here a neural auto-encoder played the role of an image fingerprinting system.
Devices that capture multiple electro-mechanical features generate huge amounts of time-series data. Talentica determined which features contributed significantly to mechanical failure in different parts of the machine. Moving windows of time-series statistics were used to model the conditions that represent the state of the machine in the period leading up to a future breakdown.
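The moving-window idea can be sketched in a few lines of Python. This is only an illustration, not Talentica's implementation: the function name `rolling_stats`, the choice of statistics (mean, standard deviation, minimum, maximum), the window size, and the sample readings are all assumptions made for the example.

```python
from collections import deque
from statistics import mean, pstdev


def rolling_stats(readings, window):
    """Return (mean, std, min, max) for each full moving window over readings."""
    if window < 1:
        raise ValueError("window must be at least 1")
    buf = deque(maxlen=window)  # the oldest reading drops out automatically
    features = []
    for value in readings:
        buf.append(value)
        if len(buf) == window:
            features.append((mean(buf), pstdev(buf), min(buf), max(buf)))
    return features


# Hypothetical vibration readings from one sensor, sampled at a fixed interval.
vibration = [1.0, 1.0, 1.0, 2.0, 4.0, 8.0]
stats = rolling_stats(vibration, window=3)
for row in stats:
    print(row)
```

Each tuple summarizes the most recent readings, so a rising mean or spread across consecutive windows could serve as an input feature to a failure-prediction model.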
Targeted Extraction and Automated Understanding of Text
Processing text inputs can be challenging. The NLP algorithms needed to work across multiple languages, cope with slang, and process short-form text that is only partially grammatical. Talentica used a combination of techniques, including parse tree traversals, overlapping n-gram sequences, language detection APIs, and an auto-encoder-based pattern-matching engine, to extract concepts and questions from chat conversations and short reviews. These techniques work on small data sets without the need for large trained models. Sequence-to-sequence techniques were used to automatically convert a block of text into key-value pairs of attributes and their values. The final solution was an API-based service that was integrated into the main product.
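Of the techniques listed above, overlapping n-gram sequences are the simplest to show. The sketch below is a minimal illustration; the helper name `overlapping_ngrams`, the whitespace tokenization, and the sample message are assumptions for the example and are not taken from Talentica's service.

```python
def overlapping_ngrams(tokens, n):
    """Return every run of n consecutive tokens; neighbours share n-1 tokens."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Short-form chat text; splitting on whitespace is a simplifying assumption.
message = "how do i reset my password"
tokens = message.lower().split()
bigrams = overlapping_ngrams(tokens, 2)
trigrams = overlapping_ngrams(tokens, 3)
print(bigrams)
print(trigrams)
```

Because each n-gram overlaps its neighbour, a phrase such as "reset my password" can still be matched even when the surrounding text has no reliable grammar.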
The rental or sale value of a home depends not only on its structural parameters but also on geographical and chronological factors. Multiple models were created after feature analysis, accounting for regions with partial data and, in some cases, very little data. Boosting-based regression algorithms were employed, and the core implementation was adapted to predict a price range instead of a point estimate. Thousands of models were created and deployed so that a model suited to each region could be used to make its predictions. The final solution was an end-to-end flow adapted to the product architecture.
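The article does not say how the boosting implementation was adapted to produce a range. One common approach, shown here purely as an assumption, is to train against the pinball (quantile) loss at a low and a high quantile. The sketch below skips boosting entirely and only demonstrates the loss itself: the constant that minimizes it is the requested quantile, so two quantiles bracket a range. The function names and rent figures are hypothetical.

```python
def pinball_loss(actuals, prediction, quantile):
    """Average pinball loss of one constant prediction at the given quantile."""
    total = 0.0
    for y in actuals:
        error = y - prediction
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += quantile * error if error >= 0 else (quantile - 1) * error
    return total / len(actuals)


def best_constant(actuals, quantile):
    """Pick the observed value that minimizes the pinball loss."""
    return min(actuals, key=lambda c: pinball_loss(actuals, c, quantile))


# Hypothetical monthly rents for one region (illustrative numbers only).
rents = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200]
low = best_constant(rents, 0.1)
high = best_constant(rents, 0.9)
print(f"predicted range: {low} to {high}")
```

For this sample data the range comes out as 110 to 190. A boosting regressor trained twice with the same loss, once per quantile, would produce such a range per home instead of per region.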
Applied Mathematical Optimization
Higher prediction accuracy was achieved using an ensemble consisting of a trained deep neural network (DNN), a pre-existing application-specific predictor, and a support vector machine (SVM) classifier. While the DNN acted as a new complementary predictor, the SVM determined which of the two predictors was more likely to give the correct result at any given moment. This was achieved by creating a probability surface with a bimodal distribution that peaks at each of the predictors. Such techniques can be used in applications such as indoor positioning, object location prediction in images, and voting-based systems.
In a multiparty contest involving rating by human evaluators, Talentica algorithmically determined which of the many contest ideas should be presented to each human evaluator so as to maximize a likelihood estimate.
In another situation, the objective function of a system of non-linear equations had to be minimized subject to constraints on the parameters. Here Talentica solved the problem using Lagrangian relaxation with gradient descent to determine the optimal parameters.
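The last of these, folding a constraint into the objective with a Lagrange multiplier and then following gradients, can be illustrated on a toy problem. The problem below (minimize x² + y² subject to x + y = 1) is an assumption chosen because its answer is known to be x = y = 0.5; it is not the system Talentica solved, and the step size and iteration count are arbitrary.

```python
def solve_constrained(steps=2000, lr=0.1):
    """Minimize x^2 + y^2 subject to x + y = 1 using a Lagrangian.

    L(x, y, lam) = x^2 + y^2 + lam * (1 - x - y)
    Gradient descent on (x, y), gradient ascent on the multiplier lam.
    """
    x, y, lam = 0.0, 0.0, 0.0
    for _ in range(steps):
        grad_x = 2 * x - lam   # dL/dx
        grad_y = 2 * y - lam   # dL/dy
        violation = 1 - x - y  # dL/dlam, zero when the constraint holds
        x -= lr * grad_x
        y -= lr * grad_y
        lam += lr * violation
    return x, y, lam


x, y, lam = solve_constrained()
print(round(x, 4), round(y, 4), round(lam, 4))
```

The multiplier grows while the constraint is violated, which pushes the parameters back toward the feasible region; at convergence the constraint holds and the objective is at its constrained minimum.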
Meet the leader behind the success of Talentica
Nitin Shimpi is the CEO and Co-Founder of Talentica. He provides organizational leadership and is instrumental in managing the day-to-day operations and expansion of Talentica. A crossword junkie, he is an alumnus of IIT Mumbai and Marquette University.