Machine Learning Q&A Session
We continue our Q&A sessions about Machine Learning. Let’s start with the following questions:
- We fund commodity transactions for our clients. How can we estimate the financial risks of taking part in specific transactions?
- Several machine learning models can be used for this purpose. The choice of model depends on the size of your labeled dataset:
- Logistic regression / LDA / SVM if there are few records (<1000);
- Decision tree if there is a moderate number of records (1000-5000);
- Gradient Boosting Trees (GBT) if there is enough data (10000+ records).
- Other models such as kNN or a Naive Bayes classifier can be used as well. They frame the problem a bit differently, but they can give the business more valuable results.
- If some classification errors are much costlier than others, class weights can reflect the relative importance of each class, and prediction thresholds can be tuned accordingly.
- Some of the aforementioned models can also calculate feature importance.
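To make the point about unequal error costs concrete, here is a minimal sketch of picking a prediction threshold from misclassification costs. It assumes a model that already outputs a probability that a transaction goes bad. The function names, the probabilities, and the 9:1 cost ratio are illustrative assumptions, not values from a real portfolio.

```python
def risk_threshold(cost_false_negative: float, cost_false_positive: float) -> float:
    """Return the probability above which flagging a transaction is cheaper on average.

    Flagging costs (1 - p) * cost_false_positive in expectation; not flagging
    costs p * cost_false_negative. The two are equal at the returned threshold.
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def flag_risky(probabilities, cost_false_negative, cost_false_positive):
    """Flag each transaction whose predicted risk reaches the cost-based threshold."""
    threshold = risk_threshold(cost_false_negative, cost_false_positive)
    return [p >= threshold for p in probabilities]


# Hypothetical model outputs: predicted probability that each transaction goes bad.
predicted_risk = [0.05, 0.12, 0.40, 0.80]

# Hypothetical costs: missing a bad transaction is 9x worse than rejecting a good one.
flags = flag_risky(predicted_risk, cost_false_negative=9.0, cost_false_positive=1.0)
print(flags)
```

With these assumed costs the threshold drops from the default 0.5 to 0.1, so more transactions are sent for review. Many libraries expose a similar idea through a class-weight setting at training time.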
- How can we explore our audience?
- ML lets you see which groups your clients fall into. You can also cluster transactions to see which kinds make up the largest part of your portfolio. Typical building blocks are:
- Text encoding (TF-IDF, one-hot encoding)
- Text embeddings (GloVe, etc.)
- Clustering (DBSCAN, K-Means, K-Medoids, Hierarchical)
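The encode-then-cluster pipeline above can be sketched in a few lines. This is a simplified illustration using only the Python standard library: it uses one common TF-IDF variant and a bare-bones K-Means with a deterministic start, where a production setup would more likely rely on an ML library. The transaction descriptions are made up.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Encode each text as a unit-length TF-IDF vector (one common variant)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({token for tokens in tokenized for token in tokens})
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))
    idf = {token: math.log(len(docs) / doc_freq[token]) + 1.0 for token in vocab}
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vec = [term_freq[t] / max(len(tokens), 1) * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def kmeans(points, k, iterations=20):
    """Plain K-Means; the first k points serve as initial centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its closest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical free-text transaction descriptions.
descriptions = [
    "wheat export contract",
    "crude oil futures",
    "wheat import contract",
    "crude oil swap",
]
labels = kmeans(tfidf_vectors(descriptions), k=2)
print(labels)
```

Counting the labels afterwards shows how large each group is, which is the "biggest part of your portfolio" question in its simplest form.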
- Can we structure the form output when a client fills in a form with free-text fields?
- Sure. Text standardization allows you to extract meaningful data from unstructured text input fields. The following techniques can be used:
- Named Entity Recognition (NER) extracts certain types of data from the input. Basic NER models extract names, phone numbers, locations, prices, etc.
- Custom entity detection models, built with spaCy or DeepPavlov, can be trained to extract specific entities: datacenter name, item description, commodity article, intent to act.
- Extracted structured data is useful both for analytics and as input to other processes or models.
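To show what "structured output from a text field" looks like, here is a small sketch. Trained NER models need a library and a model download, so this example swaps in a much simpler rule-based extractor built on regular expressions. The patterns, field names, and sample text are all illustrative assumptions, and a real NER model would handle far more variation.

```python
import re

# Hypothetical patterns; a trained NER model would replace these rules.
PATTERNS = {
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "price": re.compile(r"[$€£]\s?\d[\d,]*(?:\.\d+)?"),
    "quantity": re.compile(r"\d+(?:\.\d+)?\s*(?:tons?|kg|barrels?)\b"),
}


def extract_fields(text: str) -> dict:
    """Return every match of each pattern, keyed by the entity type."""
    return {
        name: [match.group(0) for match in pattern.finditer(text)]
        for name, pattern in PATTERNS.items()
    }


# Made-up content of a free-text form field.
form_text = (
    "Call me at +1 202-555-0173 about 40 tons of wheat "
    "for $12,500.50 delivered to Hamburg."
)
fields = extract_fields(form_text)
print(fields)
```

The resulting dictionary can be stored as columns for analytics or passed on to another model, which is the use described above.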
- Is it possible to search for similar transactions?
- Yes. ML lets you find similar transactions that have already been executed. This is useful in transaction analysis, because it can highlight the aspects of a specific transaction that matter more than others.
- The kNN (“K nearest neighbors”) model lets you find which historical transactions are similar to the new one. The “similarity” is calculated in the N-dimensional space of transaction features. Techniques such as window size tuning or kernel tricks can be combined with kNN to make the model more flexible and accurate.
- Clustering (DBSCAN, K-Means, K-Medoids, Hierarchical) can be used to divide transactions into large groups that have something in common.
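The kNN lookup described above can be sketched as follows. This is a minimal brute-force version using only the standard library, assuming each transaction is described by two numeric features (amount and duration in days). The feature choice, the min-max scaling, and the sample data are assumptions made for illustration.

```python
import math


def min_max_scaler(rows):
    """Build a function that rescales each feature to the 0..1 range of `rows`."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]  # avoid dividing by 0

    def scale(row):
        return [(value - low) / span for value, low, span in zip(row, lows, spans)]

    return scale


def nearest_transactions(history, query, k=3):
    """Return indices of the k historical transactions closest to `query`."""
    scale = min_max_scaler(history)
    scaled_query = scale(query)
    distances = [
        (math.dist(scale(row), scaled_query), index)
        for index, row in enumerate(history)
    ]
    return [index for _, index in sorted(distances)[:k]]


# Hypothetical history: (amount, duration in days) per executed transaction.
history = [
    (10_000, 30),
    (12_000, 35),
    (250_000, 180),
    (240_000, 170),
    (11_000, 90),
]
new_transaction = (11_500, 32)
similar = nearest_transactions(history, new_transaction, k=2)
print(similar)
```

Scaling matters here: without it, the amount would dominate the distance simply because its numbers are larger, and the duration would have almost no influence on which transactions count as similar.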