Mass-Generating Financial Descriptions and Analysis Using AI

Antanas Baltrušaitis
3 min readFeb 11, 2024

--

tl;dr

I have generated financial overviews and analyses for approximately 70,000 Lithuanian companies in the Lithuanian language using large language models (LLMs).

Here how I did it:

Situation

I run a Lithuanian startup — a website named “Scoris” — which publishes open data about all companies in Lithuania. I have access to ample data, but my site lacks substantial “text” content. As a result, Google’s algorithms rank it as lower relevance due to its heavy reliance on “data” rather than textual information. To address this, I realized the importance of incorporating more relevant text content on my website.

Complication

Employing AI/LLMs seemed like the perfect solution for this task, yet I encountered four major challenges:

  1. Speed: There are numerous companies, and generating descriptions within a reasonable timeframe was essential. Initially, generating and translating one company description took about 30 seconds, translating to roughly one month of continuous generation.
  2. Quality: Our data is reliable, and I aimed to maintain this reliability without introducing inaccuracies or “hallucinations” from LLM outputs.
  3. Cost: The process involves approximately 200 million tokens in total. Considering regular updates, using ChatGPT 3.5 could cost a few hundred euros, while ChatGPT 4 might reach a few thousand euros. Additionally, translation costs via Deepl or Google Translate APIs, which charge 20 EUR per 1 million characters, could add another 3,000 EUR.
  4. Language: Most LLMs primarily operate in English, but I needed descriptions in Lithuanian for my website.

Resolution

Note: After numerous iterations and preliminary work, I developed the following solution, all implemented on a local machine equipped with an RTX 3090, i5–13500, and 64 GB DDR5 RAM.

  1. GENERATION: My objective was to generate high-quality English descriptions based on my data as quickly as possible. Utilizing the oobabooga/text-generation-webui with OpenAI API endpoint, I found that 7B Mistal variants and 10.7B Solar variants using 4bit GPTQ or EXL2 offered the best performance, achieving speeds of 90–100 tokens/s. However, I discarded most of the 10.7B Solar output due to its inability to accurately understand and convert thousands/millions/billions in EUR, along with issues in rounding and consistency. Therefore, approximately 80% of the final output was generated using Mistal 7B variant models. A Python script was employed to fetch financial data from a database, combine it with a prompt, send it to the API, then fetch the response and store it in the database.
  2. TRANSLATION: The next step involved translating the generated English descriptions into Lithuanian. Due to cost considerations, using Deepl or Google Translate APIs was not feasible. I found decent machine translation (MT) LLMs capable of EN to LT translation. Initially, they were adequate but imperfect, especially in handling numbers (e.g. 80,000 “translating” to 80.0). Thus, I performed two rounds of fine-tuning: a) One using a public general EN-LT dataset, primarily consisting of well-translated EU Commission documents with numerous numerical data. b) Two using my specific dataset, for which I spent 500 EUR on Deepl API to translate approximately 100,000 generated English sentences into Lithuanian, and further fine-tuned the model on this dataset. After these adjustments, the machine translation’s accuracy significantly improved. Although CPU inference was 3x slower than GPU inference (7s vs 2s per description), it allowed me to run the generation of English descriptions and translations in parallel.
  3. VALIDATION: After generating and translating the content, validation was necessary to ensure accuracy. This phase was not initially planned, but I observed significant inaccuracies and “hallucinations” from the 10.7B Solar models and occasional issues with the 7B Mistral models. I implemented an initial cleanup based on observed patterns and used an LLM to label each generated description as “Correct/Incorrect.” The Mistral-7B-Instruct-v0.2 model was perfectly suited for this task.

The entire process, as outlined above, took just over one week to complete, and I am highly satisfied with the outcome. I plan to continue generating more descriptions using a similar approach.

https://scoris.lt/be/imone/142174791

--

--