BERT: how a Sesame Street character is changing search engines

    At the end of 2019, Google rolled out an algorithm update called Google BERT (hence the connection with the Muppet Bert), which is expected to improve about 10% of search results. Google describes BERT as the biggest change to its search engine since the introduction of RankBrain almost five years ago, and even as one of the biggest changes in the history of search algorithms. The BERT algorithm is currently being rolled out around the world.

    It can be used in product storage, for example in large-area and multi-level warehouses to check whether a given product is already in the offer, and also in the intelligent automation of business processes, for example optical character recognition (OCR). The algorithm can compare product names that look different at first glance, and sometimes does so more effectively than people with years of experience in the industry. Below we look at one practical example of its use.

    Smart product comparison

    When we introduce a new product to the market, we do not yet know how it will sell. That raises the problem of where to place it in the warehouse. This is where smart comparison comes to the rescue: it can significantly reduce the time spent on logistics. It finds products similar to the one under consideration and thereby helps determine the optimal location for the goods. It is built on two machine learning algorithms: k-means clustering and an algorithm called RoBERTa.

    The RoBERTa algorithm: what is it?

    RoBERTa (A Robustly Optimized BERT Pretraining Approach) is a pre-trained neural network used for problems in natural language processing (NLP). It is an optimized version of the BERT algorithm (Bidirectional Encoder Representations from Transformers). Both algorithms are able to extract context from a sentence. This is possible because neither RoBERTa nor BERT interprets words one after another.

    Unlike most language models, these algorithms build the so-called embedding of a word by looking simultaneously at what comes before the word and what comes after it. Thus the algorithm “sees” the difference between the word “zamek” (Polish for both “zipper” and “castle”) used for a zipper and the same word used for a building.

    BERT and RoBERTa are based on a deep neural network called the Transformer. It is used to convert one sequence into another. This is done with so-called encoders and decoders, which, as a rule, are responsible for converting the input sequence into a single vector (encoder) and that vector into the output sequence (decoder).

    In Transformers, encoders and decoders operate as two separate mechanisms. The encoder is responsible for the machine's understanding of the text. The text is first replaced by vectors, and information about the position of each word is added. The result is then processed by the subsequent layers of the network to obtain an embedding of the text that retains this information.
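
    As a rough illustration of "adding information about the position of each word", the sketch below sums toy word vectors with sinusoidal position vectors, the scheme used in the original Transformer design. This is an assumption made for illustration: BERT itself learns its position vectors during training, and the function names and the 4-dimensional toy vectors are hypothetical.

```python
import math

def positional_encoding(position, dim):
    """Sinusoidal position vector (original Transformer scheme, assumed here)."""
    vec = []
    for i in range(dim):
        # Dimensions 2k and 2k+1 share one frequency: 1 / 10000^(2k / dim).
        angle = position / (10000 ** ((i - i % 2) / dim))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec

def encode_input(token_vectors):
    """Add a position vector to each token vector, element by element."""
    dim = len(token_vectors[0])
    return [
        [t + p for t, p in zip(tok, positional_encoding(pos, dim))]
        for pos, tok in enumerate(token_vectors)
    ]

# Hypothetical toy vectors; the first and third tokens are deliberately identical.
toy_vectors = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.1, 0.2, 0.3, 0.4]]
encoded = encode_input(toy_vectors)
```

    Even though the first and third toy vectors are identical, their encoded versions differ, because each carries its own position.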

    The decoder, in turn, is responsible for predicting the probability of occurrence of a given word. Given, for example, the phrase “I am fine”, the decoder processes it word by word. At each step the next word is masked and the decoder tries to predict it. When trying to predict the word “fine”, the decoder sees “I am _”. Its aim is to predict the missing word as accurately as possible.
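
    The word-by-word masking described above can be sketched as a small generator that produces what the decoder "sees" at each step, together with the word it should predict. The function name and the "_" placeholder are illustrative choices, and no actual prediction model is involved.

```python
def next_word_examples(sentence, mask="_"):
    """Yield (visible context, target word) pairs, one per word of the sentence."""
    words = sentence.split()
    for i, target in enumerate(words):
        # Only the words before position i are visible; the rest are hidden.
        context = words[:i] + [mask]
        yield " ".join(context), target

for context, target in next_word_examples("I am fine"):
    print(f"{context!r} -> {target!r}")
```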

    The BERT algorithm uses only the Transformer's encoders. Thanks to them, it is able to read the entire sequence of words at once.

    Building a smart product comparison

    This mechanism uses several aspects of similarity between products. One of them is physical similarity. However, this information alone is not sufficient to conclude that products are similar. For example, a white porcelain mug has physical characteristics very similar to those of a white porcelain bowl, yet the two are not similar products. That is why the names must be compared as well. The point is not that the names should sound the same, but that names which look different at first glance yet refer to the same thing should be considered similar (for example, a jacket and a blazer).

    Limiting the number of products

    As mentioned, the comparison uses physical similarity. It is this similarity that lets us select the products that are potentially similar. We want the BERT-based comparison to operate within certain categories, to avoid comparing, for example, blue tracksuit trousers with yellow roofing sheets. Such a comparison adds nothing: these products differ greatly from each other, so there is no need to compare them.

    By comparing selected physical characteristics of the product under consideration with those of all other products, we obtain a physical similarity on a scale of 0-100 (which can be read as the physical resemblance of the products expressed as a percentage), and we select the 100 most similar products.
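
    The article does not say how the physical similarity is calculated, so the following is only a minimal sketch under stated assumptions: each product is described by a few numeric features already scaled to the range 0-1, and similarity is 100 minus the mean absolute difference expressed as a percentage. The feature values, product names, and function names are all hypothetical.

```python
import heapq

def physical_similarity(a, b):
    """Similarity on a 0-100 scale for two feature vectors scaled to [0, 1]."""
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 100.0 * (1.0 - mean_abs_diff)

def most_similar(target, catalog, limit=100):
    """Return up to `limit` (name, similarity) pairs, most similar first."""
    scored = (
        (name, physical_similarity(target, features))
        for name, features in catalog.items()
    )
    return heapq.nlargest(limit, scored, key=lambda pair: pair[1])

# Hypothetical features: (height, width, weight), each scaled to [0, 1].
catalog = {
    "white porcelain bowl": (0.10, 0.20, 0.30),
    "roofing sheet": (0.90, 0.95, 0.80),
    "white porcelain cup": (0.12, 0.10, 0.25),
}
mug = (0.12, 0.11, 0.26)
shortlist = most_similar(mug, catalog, limit=2)
```

    With `limit=100`, as in the text, the same call would return the 100 physically closest products as candidates for the name comparison.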

    The choice of the 100 most similar products is the result of tests. They showed that this pool of potentially similar products is large enough to contain the products that are in fact similar (if any exist), and small enough that misleading similarity results appear only in exceptional situations.

    The RoBERTa algorithm

    With a limited number of potentially similar products, we can turn to the similarity between product names. In this part, each name is first converted into an embedding using the RoBERTa algorithm. Thus, the name of each product is represented as a vector.

    The second step is to determine the similarity between the resulting vectors. This is done with the BERTScore metric. The results it returns are closer to human judgment than those of other commonly used measures (e.g., cosine distance). It has the drawback, however, that the returned similarity (on a 0-1 scale) tends to be relatively high, which can sometimes be confusing. So the next step first checks whether the BERTScore result for at least one of these 100 products exceeds 0.9; if not, we can assume that there are no similar products. Otherwise, the products are grouped into 2 clusters according to the similarity of the names obtained with the RoBERTa algorithm.
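
    To make the metric more concrete, here is a simplified sketch of the greedy matching at the core of BERTScore: every token vector of one name is paired with its best cosine match in the other name, and the averaged matches give precision, recall, and an F1 score. The 2-dimensional token vectors are hypothetical stand-ins for real RoBERTa embeddings, and optional refinements of the full metric (such as importance weighting and baseline rescaling) are left out.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def greedy_match(source, target):
    """Average, over source tokens, of the best cosine match among target tokens."""
    return sum(max(cosine(s, t) for t in target) for s in source) / len(source)

def bertscore_f1(candidate, reference):
    """F1 of greedy-matching precision and recall (simplified BERTScore)."""
    precision = greedy_match(candidate, reference)
    recall = greedy_match(reference, candidate)
    return 2 * precision * recall / (precision + recall)

# Hypothetical token vectors for two product names with two tokens each.
jacket = [[1.0, 0.0], [0.0, 1.0]]
blazer = [[1.0, 0.0], [0.6, 0.8]]
score = bertscore_f1(jacket, blazer)
```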

    Applying a previously trained model to a different problem in this way is called transfer learning. Thanks to this step, the 100 potentially similar products under consideration can be narrowed down to a much smaller number, usually between 5 and 30 products. However, if more than 30 items remain after this step, there are two possibilities: either the potentially similar products still include some that are not in fact similar, or there are so many similar products that some of them can safely be rejected. In that case, clustering into 2 groups is performed again, this time on the overall similarity, i.e. the average of the physical similarity and the similarity returned by BERTScore.
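
    The filtering steps described above (the 0.9 threshold, the split into two clusters, and the second split on the overall similarity) can be sketched roughly as follows. This is a minimal illustration, not the production implementation: the clustering is a tiny one-dimensional 2-means over the scores, the physical similarity is rescaled from 0-100 to 0-1 before averaging (an assumption), and the product data is invented.

```python
def two_means(values, iterations=20):
    """1-d k-means with k=2; returns True for values in the higher cluster."""
    low, high = min(values), max(values)
    labels = [True] * len(values)
    for _ in range(iterations):
        labels = [abs(v - high) <= abs(v - low) for v in values]
        highs = [v for v, is_high in zip(values, labels) if is_high]
        lows = [v for v, is_high in zip(values, labels) if not is_high]
        if not lows:  # all values identical: nothing to split
            break
        low, high = sum(lows) / len(lows), sum(highs) / len(highs)
    return labels

def filter_candidates(candidates, threshold=0.9, max_size=30):
    """candidates: (name, physical similarity 0-100, name similarity 0-1) tuples."""
    if not candidates or max(c[2] for c in candidates) <= threshold:
        return []  # no name is close enough: assume there are no similar products
    keep = two_means([c[2] for c in candidates])
    kept = [c for c, k in zip(candidates, keep) if k]
    if len(kept) > max_size:
        # Second pass on the overall similarity (mean of both measures, 0-1 scale).
        overall = [(c[1] / 100.0 + c[2]) / 2.0 for c in kept]
        keep = two_means(overall)
        kept = [c for c, k in zip(kept, keep) if k]
    return kept

# Invented scores for products compared against a jacket.
candidates = [
    ("blazer", 95.0, 0.96),
    ("coat", 90.0, 0.95),
    ("tracksuit trousers", 88.0, 0.84),
    ("scarf", 80.0, 0.82),
]
similar = filter_candidates(candidates)
```

    In this toy run only the blazer and the coat remain; the second pass is skipped because fewer than 30 products are left.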

    The above stage is the last one in determining similar products. After it, as a rule, only products that are really similar to the one under consideration remain, and in the vast majority of cases the number of products found is between 0 and 30.


    The comparison presented here can greatly simplify warehouse management, but it also has many other applications wherever we have a lot of text data and want to know how similar the items are to each other. It is also useful wherever there is a need for so-called intelligent automation of business processes.

    Innovative projects of this kind, and many others using the latest technologies, are carried out during summer internships at Comarch. If you are interested in how BERT works and would like to see it in practical examples, our AI/ML internship is just for you.

    During the internship:

    • you will get to know many other interesting solutions
    • you will have the chance to solve problems in practice
    • you will learn the principles behind systems based on artificial intelligence algorithms
    • you will have an impact on how these AI solutions work

    Don’t hesitate if you want to work with the latest technologies and solutions in the future!


    Michael Więtczak, Manager and Solution Architect at the Artificial Intelligence Center, Comarch

    Kinga Głąbińska, Machine Learning Engineer at the Artificial Intelligence Center, Comarch
