Chainbase open-sources Theia-Llama-3.1-8B, a large language model for the cryptocurrency field
ChainCatcher news, the full-chain data network Chainbase has announced that it has open-sourced Theia-Llama-3.1-8B, its large language model designed specifically for the cryptocurrency field, on Hugging Face.
According to the Chainbase team, it has built the first professional Web3 dataset, which covers information on the top 2,000 projects listed on CoinMarketCap. The dataset was filtered both manually and algorithmically to keep the training data accurate, diverse, and professional. Using this dataset, the team fine-tuned the model with LoRA and accelerated training with tools such as DeepSpeed. The model has also been quantized to the Q8 GGUF format, which significantly reduces memory usage and improves inference speed.
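To give a rough sense of scale for the LoRA and quantization steps described above, the following back-of-the-envelope sketch may help. It is not Chainbase's code, and every figure in it is an assumption made for illustration: the round 8 billion parameter count, the Q8_0 block layout (32 8-bit weights sharing one 16-bit scale), and the example LoRA rank and matrix shape are not stated in the announcement.

```python
# Illustrative arithmetic only; none of these settings come from the announcement.

def weight_bytes(n_params: int, bits_per_weight: float) -> float:
    """Approximate storage for the model weights, in bytes."""
    return n_params * bits_per_weight / 8


def lora_added_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA freezes the original matrix and trains two small factors,
    A (rank x d_in) and B (d_out x rank), so it adds rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


# Assumption: Q8_0 blocks hold 32 int8 weights plus one 16-bit scale,
# which works out to 8.5 bits per weight.
Q8_0_BITS_PER_WEIGHT = (32 * 8 + 16) / 32

# Assumption: a round 8 billion parameters for an "8B" model.
N_PARAMS = 8_000_000_000

fp16_gb = weight_bytes(N_PARAMS, 16) / 1e9
q8_gb = weight_bytes(N_PARAMS, Q8_0_BITS_PER_WEIGHT) / 1e9

# Hypothetical example: one 4096 x 4096 projection adapted at rank 8.
full_matrix = 4096 * 4096
adapter = lora_added_params(4096, 4096, 8)

if __name__ == "__main__":
    print(f"16-bit weights: ~{fp16_gb:.1f} GB")
    print(f"Q8_0 weights:   ~{q8_gb:.1f} GB")
    print(f"LoRA trains {adapter:,} parameters instead of {full_matrix:,} for this matrix")
```

Under these assumptions the quantized weights take roughly half the space of 16-bit weights, and the LoRA adapter for a single matrix is well under 1% of the size of the matrix it adapts. Actual figures depend on the real parameter count, the quantization variant, and the LoRA configuration the team used.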
Theia-Llama-3.1-8B is reported to be Chainbase's first attempt at a large model for the cryptocurrency field. The model is already in use in TheiaChat, Chainbase's interactive demo application, which currently has over 300,000 daily active users.